Selective serotonin reuptake inhibitors (SSRIs) emerged as first-line agents for the treatment of panic disorder not long after initial treatment trials were published (1–3). Indeed, the rapid acceptance of SSRIs for the treatment of panic disorder occurred without comparative treatment trials with older agents.
To address this issue and provide continuity between past and current efficacy studies of treatments for panic disorder, we conducted an effect-size analysis of recent, controlled trials of SSRIs in treating panic disorder. Specifically, we examined the effect size of SSRI treatments for panic disorder from 12 placebo-controlled efficacy trials and compared these results to findings from Gould and associates’ meta-analysis (4) of non-SSRI treatments for panic disorder.
Double-blind, placebo-controlled studies of SSRIs for panic disorder with or without agoraphobia were identified in literature searches with the use of PsychLIT and MEDLINE, discussions with colleagues, and examination of reference sections of related articles. The following key terms were used alone and in combination in our literature search: panic, medication, treatment outcome, selective serotonin reuptake inhibitor, SSRI, paroxetine, fluoxetine, fluvoxamine, sertraline, and citalopram. We excluded from analysis results from uncontrolled medication trials, studies examining the effects of SSRIs on biological challenge procedures (e.g., CO2 challenges), case reports, long-term or follow-up studies, and articles reviewing several studies that were published separately. Twelve studies met the criteria for inclusion in the present effect-size analysis.
By using methods comparable to those of Gould et al. (4), study effect sizes were calculated for all dependent variables at posttreatment, excluding measures of depression symptoms, e.g., the Hamilton Rating Scale for Depression score (5). A separate panic frequency effect size (for full panic attacks only) was also calculated. Table 1 presents these effect-size measures as well as the number of subjects, dropout rates, and the percentage of subjects who were free of panic at endpoint. To provide the fairest estimate of the efficacy of SSRIs in fixed-dose studies, effect sizes were calculated only for doses identified as effective. For example, Ballenger et al. (13) concluded that daily doses of 10 and 20 mg of paroxetine were not significantly better than placebo, and effect sizes were calculated only for the 40-mg dose. In the case of fluoxetine (15), both 10- and 20-mg daily doses were found to be effective, with slightly higher effect sizes for the 10-mg dose; data from both doses were included in the meta-analysis.
An effect size for a particular measure was computed by subtracting the mean score of the posttreatment comparison group from that of the posttreatment active-treatment group and then dividing by the standard deviation of the posttreatment comparison group (or, alternatively, following a similar computation for change scores if these scores were used for data analysis):
$$\text{effect size} = \frac{M_{\text{treatment}} - M_{\text{comparison}}}{SD_{\text{comparison}}}$$
Furthermore, if the results for a particular measure were reported as proportions of individuals who improved or were free of panic (rather than as means, standard deviations, and t scores), then effect sizes were determined by using a table provided by Glass et al. (18, p. 139). Effect sizes were assigned by interpolation from listed values.
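The mean-based computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the example scores are hypothetical, and the sketch assumes the sample standard deviation (n - 1 denominator) of the comparison group is used. For symptom measures on which lower scores indicate improvement, the sign would presumably be reversed so that positive values favor active treatment; that convention is an assumption here.

```python
from statistics import mean, stdev


def effect_size_from_scores(treatment_scores, comparison_scores):
    """(treatment mean - comparison mean) / comparison-group SD."""
    difference = mean(treatment_scores) - mean(comparison_scores)
    return difference / stdev(comparison_scores)


def effect_size_from_summary(mean_treatment, mean_comparison, sd_comparison):
    """Same computation when only published summary statistics are available."""
    return (mean_treatment - mean_comparison) / sd_comparison


# Hypothetical posttreatment scores (not data from any included trial).
treatment = [2, 3, 4]
comparison = [4, 5, 6]
print(effect_size_from_scores(treatment, comparison))
```

The summary-statistic variant is the one most likely to apply in practice, since published trials typically report group means and standard deviations rather than raw scores.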
Both weighted and unweighted (by sample size) overall effect sizes were calculated. Weighted effect sizes have the advantage of providing stronger representations of studies examining large numbers of subjects and helping overcome the potential bias introduced by the tendency for underpowered studies with smaller effect sizes to be excluded from publication. The extent of this potential bias was examined with a funnel plot that plotted the sample size against the effect size.
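To make the distinction concrete, the following sketch contrasts an unweighted mean effect size with one weighted by sample size. The effect sizes and sample sizes are hypothetical values chosen for illustration, and weighting by total N is an assumption about the exact scheme; other schemes, such as inverse-variance weights, are also common.

```python
def unweighted_mean(effect_sizes):
    """Simple average: every study counts equally."""
    return sum(effect_sizes) / len(effect_sizes)


def sample_size_weighted_mean(effect_sizes, sample_sizes):
    """Average in which each study contributes in proportion to its N."""
    total_n = sum(sample_sizes)
    weighted_sum = sum(d * n for d, n in zip(effect_sizes, sample_sizes))
    return weighted_sum / total_n


# Hypothetical studies: the smallest study has the largest effect size.
effect_sizes = [0.8, 0.5, 0.3]
sample_sizes = [20, 60, 120]

print(round(unweighted_mean(effect_sizes), 3))
print(round(sample_size_weighted_mean(effect_sizes, sample_sizes), 3))
```

When small studies report the largest effects, as in this made-up example, the weighted estimate falls below the unweighted one.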
We obtained a mean study effect size of d=0.55 and a panic frequency effect size (based on eight studies) of 0.38. Overall effect sizes were significantly associated with the sample size of the study; larger studies were associated with lower effect sizes (R=0.72, F=10.9, df=1, 10, p<0.009). Overall, the mean sample size of the SSRI studies (N=145) tended to be smaller than that of earlier imipramine studies (N=194). When SSRI effect sizes were weighted by sample size, a lower overall effect size and a similar panic frequency effect size estimate were obtained: 0.47 and 0.37, respectively.
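As a rough consistency check on the reported regression statistics, the F value for a one-predictor regression can be recovered from R and the number of studies. The sketch below assumes a simple linear regression of effect size on sample size across the 12 studies, so that the degrees of freedom are 1 and 10; the function name is hypothetical.

```python
def f_from_r(r, n_studies):
    """F statistic for a one-predictor regression, recovered from R.

    Assumes simple linear regression, so df = (1, n_studies - 2).
    """
    r_squared = r ** 2
    df_error = n_studies - 2
    return r_squared * df_error / (1.0 - r_squared)


# R as reported in the text (rounded to two decimals), 12 studies.
print(round(f_from_r(0.72, 12), 2))
```

With the rounded R of 0.72 this yields a value slightly below the reported F of 10.9, a gap of the size expected from rounding R to two decimals.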
These effect-size estimates were compared, by using unpaired t tests, to both the antidepressant (d=0.55) and imipramine effect sizes (nine studies, d=0.48) obtained by Gould et al. (4). Imipramine was selected as a specific comparison drug because of the use of this agent in a recent multicenter collaborative treatment study (19). No significant differences between SSRI effect sizes and these comparison groups were found. The mean study effect size was identical to the mean non-SSRI effect size for antidepressants reported by Gould et al. (4), and, likewise, the weighted effect size was nearly identical to the effect size for imipramine alone.
The mean study dropout rate for SSRIs was 19.9% (SD=10.9), an estimate almost identical to that in pharmacotherapy studies (19.8%) reported by Gould et al. (4). In the analysis by Gould et al., antidepressants had a mean dropout rate of 25.4%, and imipramine had a mean dropout rate of 22.4%, based on data from seven studies. No significant differences between SSRIs and these comparison conditions were found by t test analyses. When data were pooled across all studies (thereby giving larger studies a larger contribution to the computation of the overall dropout rate), we found the mean dropout rate for SSRIs was 24.6%.
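The two dropout figures differ because they average at different levels: the mean study dropout rate treats each study equally, whereas the pooled rate divides total dropouts by total enrollment. A minimal sketch with hypothetical counts (not taken from the included trials) shows how the two can diverge:

```python
def mean_study_dropout_rate(dropouts, enrolled):
    """Average of per-study dropout rates; each study counts equally."""
    rates = [d / n for d, n in zip(dropouts, enrolled)]
    return sum(rates) / len(rates)


def pooled_dropout_rate(dropouts, enrolled):
    """Total dropouts over total enrollment; larger studies count more."""
    return sum(dropouts) / sum(enrolled)


# Hypothetical counts: a small study with low attrition, a large one with high.
dropouts = [2, 30]
enrolled = [20, 100]

print(round(mean_study_dropout_rate(dropouts, enrolled), 3))
print(round(pooled_dropout_rate(dropouts, enrolled), 3))
```

A pooled rate above the mean study rate, as in the SSRI data, implies that the larger studies tended to have the higher dropout rates.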
Associations Between Sample Size and Effect Size
The significant negative association between study sample size and effect size provides evidence for what has been termed a "publication bias" against smaller, lower-powered studies obtaining lower effect sizes for SSRIs. To further examine this hypothesis, a qualitative, funnel plot analysis was conducted (20, 21). When no sample-related bias is evident, a funnel plot (with the effect size on the abscissa and the sample size or standard error on the ordinate) should produce a symmetrical pyramidal shape with the apex at the top. Under conditions of publication bias, typically, the lower left base of the pyramid (reflecting the lower sample size and lower effect-size studies) is missing. This was the case with the current data set, in which the funnel plot was not only asymmetrical but indicative of a strong negative association between sample size and effect size.
In our data, sample size was also confounded with the year of study publication (R=0.76, F=13.4, df=1, 10, p<0.004); more recent studies used larger sample sizes. Accordingly, we examined the independent association between year of publication and effect size. More recently published studies were associated with smaller SSRI effect sizes (R=0.82, F=20.1, df=1, 10, p<0.002), even when the independent influence of sample size was statistically controlled. Furthermore, this effect was not explained by differences in the medication under study (e.g., the influence of year and sample size continued to be significant when the earlier and smaller fluvoxamine studies were eliminated from analysis). As a result of this association, we investigated whether effect sizes may have been influenced by changes in sample responsivity over time. To examine this hypothesis, we used the panic-free rate for patients in placebo treatment as an index of sample responsivity to nonspecific treatment effects. Using a linear regression analysis of the nine studies providing these data, we found no significant association between year of publication and treatment responsivity (R=0.41, F=1.40, df=1, 8, p<0.28). Hence, changes in treatment responsivity, at least as assessed by panic-free rates, do not appear to be the explanation for findings of lower SSRI effect sizes over time in the panic disorder literature. Likewise, we found no obvious differences between early and later studies of SSRIs for panic disorder in the type of outcome variable used or in the degree of avoidance.
We obtained an overall SSRI effect-size estimate identical to the overall non-SSRI effect size for pharmacotherapy reported by Gould et al. (4), suggesting that results for recent SSRI studies are well within expectations of the kinds of results achieved with older agents. The specific efficacy of SSRIs to reduce panic frequency revealed a smaller effect size, but this estimate was limited to the eight studies that included this outcome variable.
We found a large negative association between the size of the study and the effect-size estimate; larger studies were associated with smaller effect sizes. Consequently, we recalculated effect size by taking into account the greater confidence in an estimate afforded by a larger sample. This smaller, weighted SSRI effect size was nearly identical to the effect size for imipramine treatment alone in the Gould et al. (4) meta-analysis. That is, the present effect-size analysis provides no evidence supporting the hypothesis that SSRIs are more effective than older antidepressants in the treatment of panic disorder. This result agrees in full with analyses of the relative effect sizes of SSRIs and tricyclic antidepressants for major depression (22).
In our limited sample of studies, we did not find that SSRIs were significantly more tolerable than treatment with other antidepressants when dropout rates were examined. However, we were able to analyze only attrition rates as a whole; attrition associated with side effects was not reported consistently enough to allow analysis, as was done for the observed difference in side effect severity between the SSRIs and older antidepressants in depressed samples (23, 24).
We found that, over time, as the investigation of SSRIs intensified and included more agents and larger studies, smaller effect sizes were obtained relative to those of earlier reports. We could not account for this finding by examining the responsivity of the treatment samples, the specific medication under study, differences in sample characteristics, or outcome measures, although we cannot rule out the existence of other cohort effects. Our results are consistent with the notion that the early publication of small but significant studies may have led to an initial overestimation of the effect size of SSRIs for panic disorder; cf. Boyer (25). Moreover, it is clear that our analysis may continue to overestimate the true effect size of SSRIs, given evidence that negative, industry-sponsored trials were not published. For example, Ninan (26) described four negative trials with high rates of placebo response that were not published; in these studies, which found a mean effect size of d=0.16 for categorical outcomes (panic-free rate and Clinical Global Impression improvement score of 2 or more), fluvoxamine did not outperform placebo.
In conclusion, we found no evidence supporting the hypothesis that SSRIs are more effective than older antidepressants for the treatment of panic disorder. One implication of our findings is that earlier estimates of the relative efficacy of cognitive behavior therapy and older antidepressants (4, 10, 27) may well be applicable to SSRIs, as suggested by initial comparisons of open treatment using SSRIs or cognitive behavior therapy (28). Our results are necessarily limited by the biases and challenges encountered with meta-analytic comparisons (29). Direct-comparison (three-arm) outcome studies of SSRIs, older antidepressants, and placebo have the potential of further clarifying the efficacy relationship among these agents.
Presented in part at the 33rd convention of the Association for Advancement of Behavior Therapy, Toronto, Nov. 11–14, 1999. Received March 27, 2000; revision received April 20, 2001; accepted May 10, 2001. From Massachusetts General Hospital, Harvard Medical School. Address reprint requests to Dr. Otto, Massachusetts General Hospital, WACC-812, 15 Parkman St., Boston, MA 02114; firstname.lastname@example.org (e-mail).