The primary aim of an antidepressant efficacy trial is to demonstrate drug-placebo differences. Consistent with this aim, antidepressant efficacy trials routinely exclude subjects believed to have high placebo response rates (e.g., subjects with mild depression) or low drug response rates (e.g., those with long-term depression or with comorbid anxiety, personality, or substance use disorders). The exclusion of these subjects significantly increases recruitment costs (1) and limits the generalizability of antidepressant efficacy trials to a narrow population of "pure" depressed patients.
Few published accounts of antidepressant efficacy trials present the percentage of individuals applying for entry who are screened out. Partonen et al. (2) reported that 381 of 612 subjects (62%) applying to participate in two separate antidepressant efficacy trials were excluded for a variety of reasons. Keitner et al. (3) found that only 60 of 866 antidepressant efficacy trial applicants (7%) screened in telephone interviews were ultimately randomly assigned to treatment groups. A survey of 18 clinical trial investigators found that an estimated 80% of applicants to antidepressant efficacy trials overseen by the investigators were excluded (1). Exact exclusion rates are difficult to determine, however, because eligibility is ascertained through a sequence of screening stages, and many ineligible subjects may not even be referred to such trials. We reported elsewhere that less than 15% of the depressed patients from our outpatient practice would be eligible to participate in an antidepressant efficacy trial due to various exclusion criteria (4).
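The figures quoted above run in two directions: the Partonen et al. figure is an exclusion rate, whereas the Keitner et al. figure is the share of telephone-screened applicants who were ultimately randomized. As a minimal sketch of the arithmetic, using only the counts quoted in this paragraph (the function and variable names are illustrative, not taken from the cited reports), the two can be put on the same footing:

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Counts as quoted in the text; variable names are illustrative only.
partonen_excluded = percent(381, 612)    # share of applicants excluded
keitner_randomized = percent(60, 866)    # share of applicants randomized
# Everyone who was not randomized, for whatever reason (an assumption:
# the text does not break this remainder down further).
keitner_not_randomized = 100.0 - keitner_randomized

print(f"Partonen et al.: {partonen_excluded:.0f}% excluded")
print(f"Keitner et al.: {keitner_randomized:.0f}% randomized, "
      f"{keitner_not_randomized:.0f}% not randomized")
```

The remainder in the second case is not necessarily an exclusion rate in the strict sense, since applicants may drop out of a multistage screening process for reasons other than failing an eligibility criterion.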
Considering the cost and the limits to generalizability associated with the exclusion criteria that have been employed in antidepressant efficacy trials, some researchers have begun to question the wisdom of their use (5, 6). Since most exclusion criteria were implemented before rigorous testing, we wondered whether the current state of knowledge would support their continued use. The goal of the present report was to review the empirical research on the efficacy of antidepressant medications in subjects typically excluded from antidepressant efficacy trials. If comparable drug-placebo differences are found in these individuals, the standard exclusion criteria could perhaps be loosened without jeopardizing the overall aims of these studies.
Review of Antidepressant Efficacy Trials
Details of our review of published antidepressant efficacy trials were presented elsewhere (4). Briefly, we reviewed the inclusion and exclusion criteria used in treatment efficacy studies of depression published from 1994 through 1998 in five psychiatric journals (Archives of General Psychiatry, American Journal of Psychiatry, Journal of Clinical Psychiatry, Journal of Clinical Psychopharmacology, and Psychopharmacology Bulletin). We identified 31 studies of outpatients that were not limited to a particular demographic group such as elderly patients. We evaluated how frequently each exclusion criterion was used in these 31 studies. No standard inclusion and exclusion criteria set exists. Among the 31 studies, no two employed the same inclusion and exclusion criteria set. Fourteen exclusion criteria were identified, and the 10 most common ones (mild depression, short episode duration, long episode duration, comorbid dysthymia, comorbid anxiety disorders, comorbid substance use disorders, comorbid personality disorders, medical comorbidity, prior nonresponse to treatment, and a positive response during the placebo lead-in phase) form the basis of our review (Table 1). Because the present report focused on exclusion criteria designed primarily to maximize drug-placebo differences, exclusion criteria used for other reasons are not considered. Thus, although exclusion criteria such as a diagnosis of bipolar disorder, psychotic features, suicidal ideation, age under 18 or over 65 years, and an inability to speak English are also common in antidepressant efficacy trials, these are not considered in the present review because the rationale for their use is distinct.
A computerized MEDLINE search, independent of the one used to identify the antidepressant efficacy trials, of all studies published from 1966 to December 2000 was performed with each of the 10 exclusion criteria used as key words and cross-referenced with "antidepressants." All relevant articles were obtained, and a manual search was performed after each of these articles was reviewed. Because of the volume and breadth of the studies involved in the review, it is not possible to present the results of each study. Instead, we synthesized our findings and placed the greatest emphasis on double-blind, placebo-controlled studies that focused specifically on the efficacy of somatic therapy in the populations of interest. We also included studies that were not placebo controlled, placing the greatest emphasis on (in descending order) post hoc analyses derived from placebo-controlled studies, open-label trials, and naturalistic follow-up studies. More emphasis was also placed on studies that had adequate sample sizes, employed reasonable controls, and randomly assigned subjects to study groups.
In the present review, "efficacy," as opposed to "effectiveness," refers specifically to outcomes derived from placebo-controlled trials.
Mild Depression
Several early antidepressant efficacy studies found that medications were no more effective than placebo in treating mild depression (7–9). Especially high placebo response rates (10, 11) and high spontaneous remission rates (12) in mild depression were believed to account for the lack of drug-placebo differences. In the National Institute of Mental Health (NIMH) Treatment of Depression Collaborative Research Program, subjects with mild depression who received imipramine plus clinical management fared no better than those who received clinical management alone (13). Largely on the basis of these results, it has generally been concluded that antidepressants are not efficacious for mild depression.
The Treatment of Depression Collaborative Research Program, however, was a 16-week study, and this amount of time could have been sufficient for many depressive episodes to spontaneously remit. Some have also questioned whether subjects in the research program may have derived significant benefit from "clinical management," which would have undermined the ability of this cohort to serve as a control group. Last, Stewart et al. (6) have argued that the Treatment of Depression Collaborative Research Program sample sizes lacked sufficient power to assert that drug-placebo differences were not present. Larger studies focusing specifically on patients with mild depression, i.e., those with Hamilton Rating Scale for Depression scores of 13–17, have shown that antidepressants are efficacious for these patients (6, 14–18). However, there are conflicting data on whether they are efficacious for very mild depression, i.e., conditions associated with Hamilton depression scale scores of 12 or less (6, 15).
Episode Duration of 4 Weeks or Less
The DSM-IV diagnosis of major depressive disorder requires that symptoms last at least 2 weeks. Therefore, this exclusion criterion affects only patients whose episode has lasted 2–4 weeks. No studies have directly examined the impact of this cutoff. Placebo responders have generally been found to have a shorter duration of illness, and an episode duration of 3 months or less has consistently been found to predict high placebo response rates (19–22). Two analyses of separate cohorts of untreated subjects from the NIMH Collaborative Program on the Psychobiology of Depression found that spontaneous remission was also most likely to occur within the first 3 months of a depressive illness (23, 24). Thus, because of the high rates of placebo response and spontaneous remission in these patients, it is likely that drug-placebo differences are significantly less robust in patients with a short episode duration.
Long Episode Duration
A long episode duration at baseline is generally considered to be one of the strongest predictors of nonresponse to both somatic therapy (25, 26) and placebo (8, 19, 22, 27–30). One prospective study found that the duration of illness at baseline accounted for 45% of the variance in the time to eventual recovery (31). On the other hand, several studies have reported good response rates to somatic therapy in patients with chronic depression. For example, 92 of 167 patients with chronic depression (55%) responded to a 12-week open-label trial of nefazodone (32). In a double-blind, placebo-controlled trial, Kocsis et al. (33) found that 45% of patients with chronic depressive symptoms responded to imipramine (N=29), compared to only 12% who responded to placebo (N=25). In a post hoc analysis, Khan et al. (30) found that 56% of patients with chronic depression responded to antidepressant medications, which was only slightly less than the response rate in patients who were ill for less than 1 year (61%). Placebo response rates, on the other hand, were 45% in the latter cohort, compared to 23% in the patients with longer-term depression. In another post hoc analysis, paroxetine was found to be significantly more efficacious than placebo for depressed subjects whose episode had lasted 1 year or more, compared to those whose episode had lasted less than 1 year (34). Thus, while naturalistic studies have found that overall response rates to somatic therapy are lower in patients with chronic depression, compared to those with a shorter duration of depression, several controlled trials suggest that drug-placebo differences may be no less robust.
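The contrast between overall response rates and drug-placebo differences can be made concrete with the percentages quoted above from Khan et al. (30). The following is a minimal sketch of that subtraction only; the dictionary layout and function name are illustrative assumptions, and no sample sizes or significance tests are implied.

```python
# Response rates (percent) as quoted in the text from Khan et al. (30).
# The data structure and names are illustrative, not from the cited study.
rates = {
    "chronic": {"drug": 56, "placebo": 23},
    "under_1_year": {"drug": 61, "placebo": 45},
}

def drug_placebo_difference(group):
    """Absolute difference, in percentage points, between drug and placebo."""
    return group["drug"] - group["placebo"]

differences = {name: drug_placebo_difference(g) for name, g in rates.items()}

for name, diff in differences.items():
    print(f"{name}: {diff} percentage points")
```

Although the drug response rate is lower in the chronic group (56% versus 61%), the placebo response rate is lower still, so the difference is 33 percentage points in the chronic group and 16 in the group ill for less than 1 year.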
Comorbid Dysthymic Disorder
Depression complicated by dysthymia ("double depression") is associated with a lower level of functioning (35, 36) and a worse long-term prognosis than depression alone (36–38). In a study by Kocsis and colleagues (33), all subjects had diagnoses of major depression and comorbid dysthymia. Response rates were significantly higher for subjects treated with imipramine than for those who received placebo. In an 8-week trial involving 102 subjects with "double depression" randomly assigned to receive moclobemide, imipramine, or placebo, both active treatments were superior to placebo (39). In another study involving 89 subjects with "double depression," approximately two-thirds responded to active medication and one-third to placebo; these rates were comparable to those reported in patients with a diagnosis of dysthymia alone or with major depression in partial remission (40).
Comorbid Anxiety Disorders
When a diagnosis of a comorbid anxiety disorder is made in depressed patients, the depressive episode tends to be more severe and to be associated with a lower level of functioning (41–44). Most studies, but not all (45–47), have found that high levels of baseline anxiety are associated with lower response rates to somatic therapy (48–50) and a poorer overall prognosis (41, 42, 44, 51). Four naturalistic follow-up studies have shown that depressed patients with comorbid panic disorder respond less well to antidepressant therapy than those with a diagnosis of major depression alone (52–55).
To our knowledge, no double-blind, placebo-controlled studies have been designed specifically to evaluate the efficacy of antidepressant medications in depressed patients with a comorbid anxiety disorder. One double-blind, placebo-controlled study involving depressed subjects with moderate to severe levels of baseline anxiety, i.e., a Covi Anxiety Scale rating of ≥7, found that fluoxetine and extended-release venlafaxine were both superior to placebo (56). A post hoc analysis of this data set involving a subset of subjects with comorbid generalized anxiety disorder found that extended-release venlafaxine was superior to placebo at the 12-week follow-up visit (57). Antidepressant medications have also repeatedly been shown to ameliorate comorbid anxiety in depressed patients (46, 56, 58–60).
Several treatment studies utilizing antidepressant medications for primary anxiety disorders have provided further evidence that antidepressants may be efficacious for the treatment of depression confounded by anxiety. In a study of 126 subjects with panic disorder and a concurrent mood disorder (major depression, dysthymia, or depressive disorder not otherwise specified) treated for 16 weeks with imipramine, alprazolam, or placebo, both active medications were found to be efficacious in treating the comorbid major depression (61). A reanalysis of this data set revealed that imipramine was specifically beneficial for depressive symptoms, whereas alprazolam produced most of its benefits in the domains of sleep and anxiety (62). A post hoc analysis by Zajecka (63) of data for 55 subjects with panic disorder and comorbid major depression found that nefazodone was superior to placebo in treating the comorbid depressive disorder. Placebo-controlled studies involving subjects with posttraumatic stress disorder (64–66), obsessive-compulsive disorder (67–70), and generalized anxiety disorder (71) have suggested that antidepressants may be efficacious for comorbid major depression in these disorders as well.
In summary, the presence of comorbid anxiety appears to be associated with lower response rates to somatic therapy. The few placebo-controlled studies that have included patients with comorbid anxiety have suggested that antidepressants may be efficacious for depression confounded by anxiety. However, to our knowledge, no studies have been specifically designed to assess the efficacy of antidepressant medications in patients with a comorbid anxiety disorder.
Comorbid Substance Use Disorders
Some studies have found that comorbid substance abuse is associated with a lower recovery rate from depression (72, 73), while others have not found this association (74). Early placebo-controlled studies did not find that antidepressants were helpful for depressive symptoms in subjects with active alcoholism (75, 76). More recently, placebo-controlled studies have shown that antidepressants are efficacious in treating depressive symptoms in recently abstinent alcoholics (77–80) as well as in actively drinking alcohol-dependent patients (81, 82). One double-blind, placebo-controlled study found that fluoxetine was efficacious for depressive symptoms in 22 depressed alcoholic marijuana users (83).
Data on the benefits of antidepressants for active drug abusers are mixed. Three double-blind, placebo-controlled studies have found tricyclic antidepressants to be efficacious for depressed opioid addicts (84–86), while two other studies found no advantage for tricyclics over placebo in methadone-maintained addicts (87, 88). We are not aware of any placebo-controlled studies evaluating the efficacy of antidepressant medications for major depression in recently detoxified (rather than active) drug abusers.
Comorbid Personality Disorders
The overwhelming majority of studies assessing the impact of comorbid personality pathology on antidepressant response rates have found that the presence of a comorbid personality disorder conveys a worse prognosis (50, 89–94). Short-term response rates to somatic therapy in naturalistic follow-up studies have ranged from 9% to 52% (95). No consistent pattern of nonresponse exists among the personality disorder clusters (95), and the presence of several personality disorder diagnoses appears to convey a worse prognosis than any single personality disorder (96). Nevertheless, treatment with antidepressant medications appears to diminish dysfunctional attitudes and decrease rates of posttreatment personality disorder diagnoses (97). We were unable to locate any studies that have directly addressed placebo response rates in depressed personality disorder patients. Weissman et al. (98) reported that high neuroticism scores on the Maudsley Personality Inventory predicted poor outcome in a cohort of comparison (i.e., untreated) subjects with major depression "to a highly significant degree." It was also noted that medications and psychotherapy appeared to ameliorate the negative impact that a high level of neuroticism has on outcome.
We are not aware of any placebo-controlled studies designed specifically to evaluate the efficacy of antidepressant medications in depressed patients with a comorbid personality disorder diagnosis. Soloff and colleagues (99, 100) performed a series of studies comparing the efficacy of antidepressants, haloperidol, and placebo in patients with borderline personality disorder. The patients in these studies did not necessarily suffer from major depression, although depressive symptoms were prominent in many patients. Antidepressant medications were found to be of marginal benefit. These studies are limited because they took place on inpatient units, and the therapeutic effects of hospitalization may have inflated placebo response rates.
In a post hoc analysis, Parsons et al. (101) found extremely robust antidepressant response rates in depressed patients with atypical features and a comorbid borderline personality disorder diagnosis. Depending on the diagnostic threshold used to define borderline personality disorder, approximately 90% of subjects responded to a trial of phenelzine, 40% to imipramine, and only 20% to placebo. A study by Tyrer and colleagues (102) was unable to replicate the efficacy of phenelzine for depressed patients with personality pathology.
In summary, studies have consistently shown that personality pathology predicts inferior rates of response to somatic therapy. However, data on response rates to placebo and drug-placebo differences are insufficient to draw any conclusions.
Comorbid Medical Conditions
Prospective, naturalistic studies have found that medical comorbidity conveys a worse long-term prognosis (72, 103). In an open-label study involving 50 medically ill inpatients treated for major depression, only 40% responded to an adequate antidepressant trial (104). Although no comparison group was included, the low response rate suggests that medical comorbidity may limit the therapeutic benefits of antidepressant medications. Supporting this contention, Hall et al. (105) found that nearly one-half of all depressed patients who did not respond to an antidepressant trial had some underlying medical condition that might have accounted for the lack of response. The authors reported that addressing the medical condition often ameliorated the depressive symptoms.
Nevertheless, at least eight placebo-controlled studies have supported the efficacy of antidepressant medications in treating depression associated with such diverse illnesses as cancer (106), multiple sclerosis (107), chronic obstructive pulmonary disease (108), stroke (109, 110), diabetes (111, 112), and cardiac disease (113). Two other randomized controlled trials involving "physically ill" patients also found antidepressants to be superior to placebo (114, 115). Antidepressants were not found to be efficacious in two studies involving patients with Alzheimer’s disease (116) and epilepsy (117).
Prior Nonresponse to Multiple Antidepressant Trials
Prior nonresponse to treatment is associated with lower drug and placebo response rates (8, 22, 27, 50). Thase and Rush (118) have argued that patients who have failed three or more adequate antidepressant trials have less than a 30% chance of responding to a fourth. Prior exposure to psychotropic medications is associated with improved compliance (119), however. One study found that drug-placebo differences were greatest in patients who had undergone at least two previous medication trials (119). Furthermore, individuals who have had a positive response to somatic therapy in the past may be more likely to subsequently respond to placebo (27, 120). Thus, the exclusion of individuals who have failed multiple antidepressant trials may result in exclusion of individuals who are more likely to comply with the protocol and less likely to respond to placebo. Although these subjects would also be expected to have a lower response rate to somatic therapy, it is unclear whether drug-placebo differences would be affected.
Response to Placebo Lead-In
The majority of antidepressant efficacy trials use a placebo lead-in period (121). Individuals who demonstrate clinical improvement during this period are often excluded from the active phase of treatment despite meeting all other eligibility requirements. The potential value of a placebo lead-in period was depicted in a 1966 study by Jones and Ainslie (122), who showed that the placebo response tends to occur within the first 2 weeks of treatment. By using the week 3 ratings as baseline rather than week 1 ratings, the authors demonstrated that placebo response rates could be significantly reduced without significantly affecting drug response rates. The practice of excluding placebo lead-in responders did not occur until two decades later, and the origins of this practice remain unclear (20, 121). Most likely, it was presumed that individuals who improved on placebo during the lead-in phase would be the same ones to improve during the active phase. If so, the inclusion of such individuals would be expected to obscure drug-placebo differences.
Although this reasoning is logical and intuitively appealing, one study that compared placebo lead-in responders with 6-week placebo responders found that the two cohorts had distinct demographic characteristics, suggesting that placebo lead-in responders may not be isomorphic with subjects who respond to placebo during the active phase of treatment (28). Reimherr and Ward (123) examined the impact of including subjects who demonstrated significant improvement on self-rating scales during the placebo lead-in phase but who were included in the active phase because clinicians’ ratings did not reflect improvement ("hidden placebo responders"). Inclusion of these subjects did not affect drug-placebo differences. In fact, the hidden placebo responders demonstrated larger drug-placebo differences than nonresponders to the placebo lead-in. In a meta-analysis that compared 39 antidepressant efficacy trials that used a placebo lead-in and 33 that did not, Trivedi and Rush (121) found that the response to placebo during the active phase was nearly identical in both cohorts (27.8% versus 28.5%). These findings suggest that the placebo lead-in does not diminish placebo response rates during the active phase. Furthermore, drug-placebo differences were found to be slightly less in studies that used a placebo lead-in. Some investigators have even speculated that improvement during the placebo lead-in may serve to mitigate further improvement in patients who receive placebo during the active phase of the trial (20, 124).
Discussion
For years, investigators have raised concerns over the generalizability of antidepressant efficacy trials. Although some of the exclusion criteria used in antidepressant efficacy trials are clearly necessary, others are implemented primarily to maximize drug-placebo differences. This practice greatly reduces the generalizability of these studies but perhaps can be justified if it decreases the likelihood of committing a type II error, i.e., not finding drug-placebo differences when real differences are present. However, if drug-placebo differences are in fact no less robust in these populations, then the rationale for excluding these subjects becomes less tenable.
Three cohorts of subjects (those with mild depression, those with an episode duration of 4 weeks or less, and those who improve during the placebo lead-in period) are excluded to minimize placebo response rates. Although early studies suggested that subjects with mild depression may not respond any better to antidepressant medication than to placebo, these studies had several shortcomings, including insufficient statistical power, so that their failure to reject the null hypothesis is inconclusive. More recent studies, though few in number, have shown that antidepressants may be efficacious for mild depression. Similarly, the practice of excluding placebo lead-in responders has never been shown to magnify drug-placebo differences, and all available evidence suggests that this exclusion criterion has no discernible impact on differential response rates. Although studies have consistently shown that a short episode duration is associated with high placebo response rates, an episode duration of 3 months or less would appear to be a better empirically validated cutoff than 4 weeks.
The remaining seven exclusion criteria that we reviewed—chronic depression, comorbid dysthymia, comorbid anxiety disorders, comorbid substance use disorders, comorbid personality disorders, comorbid medical conditions, and prior nonresponse to treatment—were originally implemented after naturalistic studies had shown that patients with these features responded less well to somatic therapy and had a worse overall prognosis. Without a placebo comparison group, however, a lack of drug-placebo differences cannot be inferred. Even now, few placebo-controlled studies have been performed to specifically address the efficacy of antidepressant medications in depressed patients with these features. The few studies that have been performed have suggested that placebo response rates are also lower in these patients.
In summary, our review suggests a paucity of empirical support justifying the use of the 10 exclusion criteria considered in the present report (Table 2). For individuals with chronic depression, comorbid dysthymia, or medical comorbidity and for placebo lead-in responders, our review suggests that drug-placebo differences may be no less robust in these subjects than in individuals who qualify for antidepressant efficacy trials. For subjects with mild depression, a comorbid anxiety disorder, a comorbid substance use disorder, a comorbid personality disorder, or a history of nonresponse to treatment, the data are either conflicting or too preliminary to determine whether the magnitude of drug-placebo differences is any less. It remains unknown whether antidepressant medications are superior to placebo in patients with a short episode duration, but the high placebo and spontaneous response rates in these individuals make it likely that drug-placebo differences are significantly less robust.
Further evidence to support the efficacy of antidepressant medications in less rarefied populations is provided by placebo-controlled studies in primary care settings where standard exclusion criteria are not employed. Although only a handful of such studies have been performed with methods comparable to those in antidepressant efficacy trials, these studies have consistently found antidepressants to be efficacious in relatively unselected populations of depressed patients (14, 58, 119, 125). Depressed patients in primary care settings have less psychiatric comorbidity and lower rates of treatment resistance than psychiatric patients, however, and these results cannot necessarily be generalized to psychiatric patients.
In interpreting the results of the present review, several important caveats should be kept in mind. First, the review was limited by the scarcity of controlled studies involving the populations of interest. For example, it was our intention to statistically compare outcomes of subjects who are excluded from antidepressant efficacy trials with the outcomes of subjects who are typically included. However, the limited number of studies precluded any meaningful effect size comparisons. Second, although preliminary evidence has suggested that drug-placebo differences exist in many of the populations excluded from antidepressant efficacy trials, we do not know whether the magnitude of their response is comparable to that of those included in the trials. If the magnitude of drug-placebo differences were less in the excluded individuals, even if drug-placebo differences exist, then the exclusion of these individuals would still decrease the likelihood of committing a type II error. Third, our review includes only published studies and is therefore susceptible to the "file drawer" bias, since studies with negative results are less likely to be published. Because the number of placebo-controlled studies that have been performed in each of the populations reviewed here is small, only a handful of negative, unpublished studies would be needed to undermine our principal conclusions (126). It should also be pointed out that easing the inclusion/exclusion criteria of antidepressant efficacy trials would increase the heterogeneity of the population studied, and, therefore, the variance in base response rates would likely increase. When this occurs, the power to detect differences decreases. At this time, it is difficult to estimate the impact of this variance in base response rates on the results of antidepressant efficacy trials.
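The second caveat, that a smaller drug-placebo difference raises the risk of a type II error, can be illustrated with a textbook normal-approximation power calculation for comparing two response rates. This is a sketch under stated assumptions rather than an analysis of any study reviewed here: the response rates and the sample size of 100 per arm are hypothetical values chosen for illustration, and the function name is our own.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p_drug, p_placebo, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two response rates.

    Uses the usual normal approximation with equal group sizes and ignores
    the negligible probability of rejecting in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (p_drug + p_placebo) / 2
    # Standard error under the null (pooled) and under the alternative.
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = sqrt(p_drug * (1 - p_drug) / n_per_arm
                  + p_placebo * (1 - p_placebo) / n_per_arm)
    diff = abs(p_drug - p_placebo)
    return z.cdf((diff - z_crit * se_null) / se_alt)

# Hypothetical response rates, not taken from any cited trial.
power_large_difference = power_two_proportions(0.60, 0.30, n_per_arm=100)
power_small_difference = power_two_proportions(0.50, 0.35, n_per_arm=100)

print(f"30-point difference: power = {power_large_difference:.2f}")
print(f"15-point difference: power = {power_small_difference:.2f}")
```

With the sample size held fixed, the scenario with the smaller difference has the lower power, which is the sense in which enrolling subjects with attenuated drug-placebo differences could make a false-negative trial more likely. The sketch does not model the separate issue of increased variance in base response rates.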
In conclusion, our review suggests that the rationale for employing many of the exclusion criteria used in standard antidepressant efficacy trials lacks a clear empirical basis. It is somewhat shocking that, 50 years after the introduction of antidepressant medications, there has yet to be a single published study that was designed specifically to evaluate their efficacy in depressed patients with a comorbid anxiety disorder or a comorbid personality disorder, patients who constitute perhaps the majority of those encountered in routine clinical practice. Other populations of patients, such as those with mild depression or chronic depression, have only rarely been the focus of systematic inquiry. Clearly, there is a need to move beyond the traditional model of how antidepressant efficacy trials are conducted.
Received May 7, 2001; revision received Sept. 17, 2001; accepted Oct. 9, 2001. From the Department of Psychiatry and Human Behavior, Brown University School of Medicine. Address reprint requests to Dr. Posternak, Department of Psychiatry and Human Behavior, Brown University School of Medicine, Rhode Island Hospital, 235 Plain St., Suite 501, Providence, RI 02905; email@example.com (e-mail).