The societal cost and morbidity of depressive illness are equivalent to those for chronic physical illnesses, such as diabetes, and the economic cost is measurable annually in tens of billions of dollars (1). In the past decade, access to treatment and outcome have been enhanced by the introduction of antidepressants with prescription ease (efficacy with one pill per day) and low side effect burden (2). In spite of the widespread use of the newer antidepressants, little is known about how long it takes to see their full effect. Acknowledging the absence of empirical data, Schulberg et al. (3), in an Agency for Health Care Policy and Research update, indicated the need for research identifying "the time point at which to augment or change the initial acute treatment."
Current guidelines are primarily based on the experience of groups of experts (4). Several (3, 4) suggest that if a partial response has not occurred after 6 weeks, treatment should be changed. Others (5) suggest that lack of response by 4 weeks bodes poorly for a beneficial outcome.
The most frequently prescribed antidepressants are the selective serotonin reuptake inhibitors (SSRIs) (2). At what point should a trial of these drugs be terminated? Treatment should be changed when switching to a new drug improves the probability of response (prognosis). The probability of response is altered by clinical variables. Thus, for a group of patients unresponsive to 6 weeks of treatment, the anticipated response rate is less than that of an untreated group. In assessing treatment choices, we should consider the anticipated proportion who will improve if treatment with the drug is continued, the proportion expected to improve if the drug is changed, as well as the rate of spontaneous improvement during an analogous time period. The few studies that have addressed the issue of trial length did not assess the unimproved patient’s prognosis at different time points, and so the point at which treatment should be changed remains obscure (5). In the following we discuss the values and limitations of a method for estimating prognosis after different durations of drug treatment.
Our purpose was to examine a large set of data on patients treated with fluoxetine so as to develop criteria permitting clinicians to decide whether a trial of fluoxetine has not succeeded. The term "unsuccessful trial" is used here to indicate the point at which treatment should be changed. Although some data suggest that first-generation antidepressants require at least 6 weeks for assessment of efficacy, there is no reason to assume that SSRIs are similar (6).
A randomized, placebo-substitution design offers a strategy to answer questions about the point at which a clinical trial is no longer likely to succeed (7). We addressed this issue by reanalyzing data from a multicenter, placebo-controlled discontinuation study (8). In that study, patients were treated openly with a fixed dose of fluoxetine (20 mg/day) for 12 weeks, and patients with remissions (to be defined) were randomly assigned to receive drug or placebo for weeks 13–26.
There are two parts to this article. In the first, we examine the relationship of weekly improvement to remission at week 12 so as to identify criteria at different time points warranting treatment change. For example, after 6 weeks, if a group that has not achieved a 25% improvement has an acceptable prognosis, it would be premature to change drugs. However, after 8 weeks, if the group that has not achieved at least a 25% improvement has virtually no chance of responding by week 12, switching treatment would be indicated. The terms "switching" and "changing" treatment are used to simplify the presentation but should include the options of increasing the dose to the maximum, augmentation with lithium, thyroid hormone, or amphetamine, and adding another antidepressant or course of ECT. Discussion of which strategy to follow is beyond the scope of this paper. The data in this paper only help identify when switching would be premature and clinically ill advised.
In the second part of the article, we examine time to relapse during weeks 13–26 in the double-blind placebo-versus-fluoxetine discontinuation phase for patients with remissions. The course of patients who were unimproved or partial responders at weeks 6 and 8 but experienced remission by week 12 is examined in terms of their response to placebo substitution. Weeks 4, 6, 8, and 10 were chosen arbitrarily because these are the points at which data were collected. Weeks 6 and 8 are the focus of attention because remission rates were so high for patients unimproved at week 4 and so low for patients unimproved at week 10 that these data are presented but not analyzed. Confidence in the improvement observed from weeks 6 through 12 would be enhanced if patients with "late remissions" (defined as remission later than week 6) and those with "early remissions" (remission before or at week 6) had the same time to relapse and if this interval was longer than that for patients switched to placebo.
Overall outcomes from this study have been published (8). In this article, the focus is on determining at what point treatment should be changed.
Definitions of Outcomes and Trial Length
The design of this study has been described elsewhere (8). Commonly accepted definitions of remission, response, partial response, and nonresponse were used (9). The patients were stratified by these criteria to determine the ability of different levels of improvement at weeks 4, 6, and 8 to predict outcome. In other words, if a patient is unimproved at week 6, what is the chance of remission by week 12? Definitions of the various outcomes were based on the 17-item Hamilton Depression Rating Scale. Remission was defined as a Hamilton scale score of ≤7. Response was characterized by at least 50% reduction in the Hamilton score but an absolute score of >7. Partial response was defined as a reduction in Hamilton scale score of 25%–49%. Nonresponse was defined as <25% improvement in the Hamilton score (9). There is some evidence (9) of the validity of the definitions used for remission and response. While the Agency for Health Care Policy and Research (4) suggested that the prognoses for partial response and nonresponse differ, it proposed no specific criteria.
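The outcome definitions above can be written as a small classification rule. The following is a minimal sketch, not the study's own procedure: the function name `classify_outcome` and the handling of exact boundary values (for example, treating a reduction of exactly 50% as response) are assumptions made for illustration.

```python
def classify_outcome(baseline: float, current: float) -> str:
    """Classify improvement status from 17-item Hamilton scale scores.

    Thresholds follow the definitions in the text; the function name and
    the treatment of exact boundary values are illustrative assumptions.
    """
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    # Remission is defined by the absolute score alone.
    if current <= 7:
        return "remission"
    # Proportional reduction from the baseline score.
    reduction = (baseline - current) / baseline
    if reduction >= 0.50:
        return "response"          # at least 50% reduction, score still >7
    if reduction >= 0.25:
        return "partial response"  # 25%-49% reduction
    return "nonresponse"           # <25% reduction


# Hypothetical patient with a baseline score of 20.
for score in (6, 10, 14, 18):
    print(score, classify_outcome(20, score))
```

Because study entry required a baseline score of 16 or more, a score of ≤7 always corresponds to a reduction of more than 50% (from 16 to 7 is a 56% reduction), so checking the absolute score first does not conflict with the percentage-based categories.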
In order to develop criteria by which to select the appropriate trial length, we reviewed reports on the rate of response to a second drug following nonresponse to a first medication. We also attempted to estimate rates of spontaneous remission after lack of response to a drug. Unsurprisingly, we found no studies of patients receiving a drug who were unimproved and switched to a trial of placebo. The closest clinical analogy was unimproved depressed patients who continued receiving placebo after 6 weeks of nonresponse. Five relevant placebo-controlled studies were found (10–14). From week 6 to week 12 of treatment, the proportion of responders in the placebo group did not increase in two studies (10, 11) and increased by approximately 10% in three studies (12–14). This suggests that new placebo responses are negligible after 6 weeks of no response (i.e., >0, <10%). Next we attempted to estimate response rates associated with the switch to a second drug after a lack of response to a first drug. In one extensive review examining strategies for patients unresponsive to 4–6 weeks of drug treatment (15), the rates of response after the switch to a second drug varied from 30% to 50% in the next 4–6 weeks. However, in these studies the response was frequently defined as a 50% decrease in the Hamilton depression scale score. In the current study, the criterion for remission was a Hamilton score of ≤7. Since this represents a greater degree of improvement than was specified in prior studies, the lower bound derived from the prior studies, 30%, was considered sufficient to justify continuation of treatment. In this investigation we examined the following questions:
After 4 weeks of treatment with fluoxetine, is there an improvement criterion that suggests treatment change is indicated?
After 6 weeks of treatment with fluoxetine, is there an improvement criterion that suggests treatment change is indicated?
After 8 weeks of treatment with fluoxetine, is there an improvement criterion that suggests treatment change is indicated?
Prognosis at week 10 was not examined as a separate question because remission by week 12 was rare among patients still unimproved at week 10, and only three of the 68 with partial responses at week 10 had remissions by week 12. Therefore, patients who are unimproved or have partial responses at week 10 should have their treatments changed.
The study was conducted at five sites. Patients were referred by health care professionals and recruited by advertising. All of the study subjects were outpatients between 18 and 65 years of age, were physically healthy, and had a diagnosis of major depression according to DSM-III-R and a score on the 17-item Hamilton Depression Rating Scale of 16 or more. The Hamilton scale initially uses 22 items: 12 standard items plus five items that rate positive (classical) vegetative symptoms, i.e., poor appetite, weight loss, and three types of insomnia, and five items that rate reverse (atypical) vegetative symptoms, i.e., overeating, increased weight, and three types of oversleeping. At the first evaluation, if the patient had more positive symptoms, only the 12 standard items plus the five items for positive symptoms were evaluated. For patients whose score on the reverse items was greater, all subsequent evaluations included the 12 standard and five reverse items. The usual criteria for patient exclusion were followed (8).
The study was approved by the New York State Psychiatric Institute’s review board. All of the patients signed informed consent statements.
A DSM-III-R clinical checklist was used to establish diagnoses. Patients whose symptoms persisted during a drug-free observation week were treated openly with fluoxetine hydrochloride, 20 mg/day, for 12 weeks. Scores on the Hamilton Depression Rating Scale and Clinical Global Impression severity measure were obtained at each visit, weekly for weeks 1 to 4 and at weeks 6, 8, 10, 11, and 12.
Remission was defined as having a Hamilton scale score of ≤7 at weeks 10 and 12. After 12 weeks of treatment, patients with remissions were randomly assigned to four groups. One-fourth (25%) were to continue taking fluoxetine for 50 additional weeks, 25% were immediately switched to placebo, 25% were to take fluoxetine an additional 14 weeks and then be switched to placebo, and the final 25% were to take fluoxetine an additional 38 weeks and then be switched to placebo for the final 12 weeks. In this article, all patients taking fluoxetine during weeks 13–26 were combined and compared to the group that was switched immediately to placebo. This was done a priori. At the first randomization, week 13, 395 patients were available. At week 26, when 156 patients were available, some groups had 10 or fewer patients, which is too few to have confidence in survival rates. For example, only 10 patients who were unimproved at week 6 and four patients who were unimproved at week 8 received fluoxetine during weeks 27–50. Therefore, we examined weeks 1–12, the open treatment, and weeks 13–26, the blinded fluoxetine-placebo substitution trial.
After randomization, treatment was double-blind and patients were seen weekly for 2 weeks and then biweekly. Relapse was defined as either 1) having a Hamilton depression score of 14 or higher or 2) having met the DSM-III-R criteria for major depression for 2 consecutive weeks.
The patients were stratified by the standard method of globally characterizing improvement status on the basis of the 17-item Hamilton depression scale: nonresponse (<25% improvement), partial response (25%–49% improvement), response (≥50% improvement), and remission (score ≤7) at the relevant evaluation weeks (i.e., weeks 4, 6, 8, and 10). The proportion of patients in each stratum at each evaluation week who experienced remission by week 12 was determined. Week 4 was not analyzed further because outcome was so favorable for unimproved patients (51%, 63 of 124) (Figure 1) that changing treatment was clearly incorrect. Week 10 was not further analyzed because the rates of response for patients with nonresponse or partial response were so low that it is obvious treatment should be changed at week 10 for unimproved patients or partial responders.
The clinical relevance of the improvement observed at weeks 6 and 8 was tested by examining the time to relapse during weeks 13–26 for patients taking placebo and those taking fluoxetine, stratified by symptom severity at weeks 6 and 8. This was done arbitrarily after we examined the prognosis for all patients at weeks 4, 6, 8, and 10. Ideally, examination of the data for weeks 13–26 would have compared patients by stratum of improvement at weeks 6 and 8 (nonresponse, partial response, etc.), contrasting patients randomly assigned to drug and placebo during weeks 13–26. Because randomization at week 12 was not stratified by status at week 6, of the patients who were unimproved at week 6 but had remissions by week 12, only two patients were assigned to placebo and 28 were assigned to fluoxetine. A contrast of drug and placebo within strata was therefore not possible, and a Cox proportional hazard analysis was done instead of a Mantel-Haenszel survival analysis. A Cox proportional hazard model was fit to the times to relapse as a function of treatment (drug or placebo) and symptom severity at week 6 and week 8. First we modeled time to relapse as a function of treatment, outcome at week 6 (nonresponse, partial response, response, and remission), and their interaction for all patients achieving remission by week 12. The significance of the interaction terms was judged by using alpha=0.2 in order to detect even small effects and minimize the chance of a type II error. We wanted to be certain that severity at week 6 was not a predictor of relapse because if it did predict relapse, patients with that level of severity should not continue taking fluoxetine beyond week 6. When no interaction was detected, we then used a model that tested the main effects of treatment (fluoxetine versus placebo) and status at week 6 (nonresponse, partial response, response, and remission).
An identical Cox proportional hazard analysis was performed to test the relevance of improvement between weeks 8 and 12 to relapse during weeks 13–26, as a function of symptom severity at week 8. The limitations imposed by the use of the Cox proportional hazard model are included in the discussion.
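For readers who prefer notation, the two models described above can be sketched as follows. The symbols are assumptions introduced here for illustration and do not appear in the original analysis: $T$ is a treatment indicator (the coding 1 = fluoxetine, 0 = placebo is assumed), $S$ is the ordinal improvement status at week 6 (or week 8 in the second analysis), and $h_0(t)$ is the unspecified baseline hazard of relapse.

```latex
% Model with interaction (interaction term judged at alpha = 0.2)
h(t \mid T, S) = h_0(t)\,\exp\left(\beta_1 T + \beta_2 S + \beta_3\, T S\right)

% Main-effects model, used when no interaction was detected
h(t \mid T, S) = h_0(t)\,\exp\left(\beta_1 T + \beta_2 S\right)
```

Under this coding, a negative $\beta_1$ would indicate a lower hazard of relapse with continued fluoxetine, and a $\beta_2$ near zero would indicate that status at week 6 (or week 8) carries no information about relapse once remission has been achieved by week 12.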
We wished to assess the prognosis of unimproved patients at different weeks in the study. In order to do this, the proportion of patients with remissions at week 12 was examined by level of severity at weeks 4, 6, and 8. For example, if there were 124 unimproved at week 4, what was the proportion who attained remission at week 12? In order to simplify the exposition, only a graphic presentation of outcome from weeks 13 to 26 stratified by status at week 6 is presented. (A graph stratified by status at week 8 is similar and is available from the authors on request.)
Of the 840 patients entering the study, 607 completed the 12-week open phase, and 424 were in remission at week 12. The relation of outcome at week 12 to outcomes at weeks 4, 6, and 8 is shown in Table 1.
Question 1: After 4 Weeks of Treatment With Fluoxetine, Is There a Minimum Degree of Improvement That Suggests a Treatment Change Is Indicated?
The prognosis of patients unchanged at week 4 was so favorable (51%, or 63 of 124, attained remission status by week 12) that switching any patient’s treatment at this point seemed inadvisable, and the data for week 4 (Figure 1) will not be discussed further.
Question 2: After 6 Weeks of Treatment With Fluoxetine, Is There a Minimum Degree of Improvement That Clinically Suggests a Treatment Change Is Indicated?
Of the patients who completed 12 study weeks, 85 patients were unimproved after 6 weeks of treatment, and 41% of these (N=35) attained remission by week 12. Of the 126 patients who were partially improved at 6 weeks, 48% (N=61) had remissions at week 12 (Figure 1).
The analysis is based on the patients who completed the study. In addition to the 85 completers with nonresponse at 6 weeks, 35 patients who were unimproved at week 6 dropped out before week 12, and two of these 35 had remissions before they dropped out. Therefore, if the analysis includes all patients who were unimproved at week 6 with the last observation carried forward, 31% (37 of 120) were in remission by week 12. It is unclear whether the 41% (35 of 85) or the 31% (37 of 120) more accurately reflects the "true" prognosis if the dropouts had been treated for 12 weeks. It is fair to say the lower bound is 31% and the upper bound is 41%. In other words, we can anticipate that at least 31% will attain remission, but it is unlikely that more than 41% will. Since the lower bound, 31% remission, exceeds the selected criterion (30%), it appears justifiable to extend treatment for patients unimproved after 6 weeks. Because dropouts did not enter the discontinuation phase, weeks 13–26, the Cox proportional hazard model discussed in the following does not change. For responders and patients in remission at week 6, 77% (95 of 124) and 86% (232 of 270), respectively, were in remission by week 12. These data suggest that lack of improvement after 6 weeks of treatment with fluoxetine is not a clinically sound basis for switching treatments.
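The bounds for patients unimproved at week 6 follow directly from the counts given above. The sketch below reproduces that arithmetic; the variable and function names are hypothetical, and the 30% criterion is the one selected in the section on trial length.

```python
def remission_rate(remitted: int, total: int) -> float:
    """Proportion of patients in remission."""
    return remitted / total


CRITERION = 0.30  # minimum remission rate taken to justify continuing treatment

# Completers only: 35 of the 85 week 6 nonresponders remitted by week 12.
completers = remission_rate(35, 85)

# Last observation carried forward: add the 35 dropouts, 2 of whom remitted.
locf = remission_rate(35 + 2, 85 + 35)

lower, upper = min(completers, locf), max(completers, locf)
continue_treatment = lower > CRITERION

print(f"lower bound {lower:.0%}, upper bound {upper:.0%}")
print("extend trial beyond week 6:", continue_treatment)
```

The margin is narrow: the lower bound is 37 of 120, or about 30.8%, which exceeds the 30% criterion by less than one percentage point.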
The second series of analyses involved time to relapse in weeks 13–26 of the discontinuation phase. During weeks 13–26, among the patients receiving fluoxetine, 25% (seven of 28) of the week 6 nonresponders, 21% (nine of 42) of the week 6 partial responders, 18% (11 of 60) of the week 6 responders, and 27% (46 of 168) of the patients in remission at week 6 relapsed. During this same period, 44% (42 of 96) of the patients randomly assigned to placebo relapsed. In order to determine the prognosis of patients who were experiencing remission at week 12, time to relapse during weeks 13–26 was examined with a Cox proportional hazard regression analysis (Figure 2). The relevance of severity at week 6, treatment (drug or placebo), and their interaction were examined. Severity was treated as an ordinal factor with the week 6 status characterized as nonresponse, partial response, response, and remission. Even with an alpha of 0.2, there was no evidence that status at week 6 interacted with treatment in predicting time to relapse during weeks 13–26. The Cox proportional hazard model for time to relapse that included an interaction term yielded an interaction coefficient of 0.03, SE=0.12, and p=0.79. The model without the interaction resulted in a significant treatment effect with a coefficient of –0.44, SE=0.00, and p<0.001. Severity in this model has a coefficient of 0.07, SE=0.71, and p=0.48. Thus, there was a significant main effect of treatment and no effect of status at week 6 on time to relapse. This indicates that time to relapse was different for patients switched to placebo (relapse with continued drug treatment was delayed) and that the level of improvement at week 6 had no bearing on prognosis during weeks 13–26 for patients who achieved remission by week 12. Patients who improved after week 6 had a prognosis in weeks 13–26 indistinguishable from that for patients who improved before week 6.
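A coefficient and its standard error are commonly converted to a p value with a Wald test. The sketch below applies this to the interaction term, assuming (the text does not say so) that the reported p values came from two-sided Wald tests; the function name `wald_test` is hypothetical. Because the published coefficient and standard error are rounded, the result can only approximate the reported p value.

```python
import math


def wald_test(coef: float, se: float) -> tuple[float, float]:
    """Return the Wald z statistic and its two-sided normal p value."""
    z = coef / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Interaction term from the model with interaction (rounded published values).
z, p = wald_test(0.03, 0.12)
interaction_detected = p < 0.2  # alpha = 0.2, as specified for the interaction

print(f"z = {z:.2f}, p = {p:.2f}, interaction detected: {interaction_detected}")
```

With these inputs the p value is close to the reported 0.79 and far above the deliberately lenient alpha of 0.2, which is why the analysis proceeded to the main-effects model.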
Question 3: After 8 Weeks of Treatment With Fluoxetine, Is There a Minimum Degree of Improvement That Clinically Suggests a Treatment Change Is Indicated?
After 8 weeks of treatment, 65 patients were unimproved, and 23% of these (15 of 65) were in remission at week 12. This suggests that, for patients who have not improved, treatment should be changed after 8 weeks.
After 8 weeks, there were 100 patients who were partially improved, and 42% (N=42) were experiencing remission at week 12. Of the patients partially improved at week 8, 10 did not complete the study, and an analysis with the last observation carried forward indicated that 38% (42 of 110) had attained remission by week 12. The analysis with the last observation carried forward also supports continuing the fluoxetine trial for patients who are partially improved at week 8. The corresponding percentages for responders and patients with remissions at week 8 were 72% (78 of 109) and 87% (284 of 328). Outcome data for weeks 13–26 are not presented but are available from the authors.
Our purpose was to determine the point at which a clinician should declare a fluoxetine trial failed and either augment it or switch treatments. The criteria used to determine remission were stringent, essentially identifying a virtually asymptomatic group (with a score of ≤7 on the Hamilton Depression Rating Scale). These data suggest that a minimum of an 8-week trial of fluoxetine is indicated because patients who were unimproved at week 6 (with less than 25% improvement in Hamilton score) had a 31% to 41% chance of attaining remission by week 12. Obviously, this observation requires replication (see following discussion). The proportion of patients unimproved at week 8 who reached remission by week 12 was too small to suggest that treatment for unimproved patients be continued for more than 8 weeks. We will discuss the limitations associated with the fixed 20-mg/day dose and other aspects of this study.
For patients who were partially improved at 8 weeks, the prognosis of a 38% remission rate (42 of 110 patients) by week 12 was good enough that a trial of at least 10 weeks is indicated for patients who are at least partially improved.
What implications do these data have for improving treatment outcome? Of the 840 patients entering the study (607 of whom completed the open phase), 424 achieved remission. Therefore, in an intent-to-treat analysis, 50% (424 of 840) had remissions, and 70% of the completers (424 of 607) had remissions. Several current guidelines (3, 4) suggest switching treatment for patients who are unimproved after 6 weeks of treatment. How would the proportion of ultimate remissions be affected if patients who were unimproved were removed from treatment at week 6? The 37 patients unimproved at week 6 who had remissions by week 12 represent 9% (37 of 424) of all the patients who had remissions. In an intent-to-treat analysis, these patients represent 4% (37 of 840) of those eligible for response. This is a considerable proportion given that, in the usual intent-to-treat analysis, drug is 15% superior to placebo (16).
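The proportions in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only counts stated in the text; the variable names are hypothetical. The final quantity is a worst-case illustration that assumes none of the 37 patients would have remitted on an alternative treatment, which the data cannot confirm.

```python
entered = 840         # patients entering the study
completers = 607      # patients completing the 12-week open phase
all_remitters = 424   # patients in remission at week 12
late_remitters = 37   # unimproved at week 6, in remission by week 12

itt_rate = all_remitters / entered           # intent-to-treat remission rate
completer_rate = all_remitters / completers  # remission rate among completers

share_of_remissions = late_remitters / all_remitters
share_of_entrants = late_remitters / entered

# Worst case (assumption): every late remitter is lost if treatment stops at week 6.
itt_rate_if_stopped = (all_remitters - late_remitters) / entered

print(f"intent-to-treat {itt_rate:.0%}, completers {completer_rate:.0%}")
print(f"late remitters: {share_of_remissions:.0%} of remissions, "
      f"{share_of_entrants:.0%} of entrants")
print(f"worst-case intent-to-treat rate if stopped at week 6: {itt_rate_if_stopped:.0%}")
```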
How can the clinician get patients with very little improvement to persist with treatment for 8–10 weeks? We think that, at the initiation of any drug trial, in addition to a discussion of side effects and dose increments, a discussion of the time limits of a clinical drug trial is indicated. Patients should be informed that it may take 8–10 weeks to determine how helpful fluoxetine will be. Some patients will be disgruntled by this information, but discussing this at treatment initiation rather than midway through a trial of fluoxetine may increase compliance.
There are several limitations to this study, and our findings should be considered preliminary observations requiring replication. The first 12 weeks of the study were not blinded. The original purpose was to identify patients whose illness remitted during fluoxetine treatment to test the long-term efficacy in a double-blind, placebo-controlled discontinuation study. The fact that the rates of maintained remission with drug and with placebo during weeks 13–26 differed supports the validity of the week 1–12 ratings. For example, if the rater had determined every patient to have a remission, those assigned to fluoxetine would not be expected to do better than those assigned to placebo in weeks 13–26, the blinded phase. In another study (8), we demonstrated that patients who had remissions in week 1 or 2 or whose illness had a fluctuating course derived no advantage from drug versus placebo, but for patients with a nonfluctuating course and onset of remission in week 3 or later, there was a significant drug-placebo difference during weeks 13–26. This suggests that the raters were not randomly classifying patients as improved and supports the validity of these evaluations.
Another limitation is the fact that, of the patients unimproved at week 6 who had remissions by week 12, by chance only two were randomly assigned to placebo and 28 were assigned to fluoxetine. This is a result of the fact that randomization at week 12 was not stratified by severity at week 6 and the requirement of the randomization process that three patients be assigned to drug for each patient assigned to placebo. The shorter relapse time for the patients switched to placebo than for those continuing to take fluoxetine suggests that, taken in total, improvement of some patients was a result of a drug benefit, which was lost with the switch to placebo. However, for patients unimproved at week 6 who had remissions by week 12, a direct contrast between the outcomes with drug and placebo after week 12 is not possible because of the small number of patients taking placebo (N=2), and we cannot assert that their improvement resulted from a drug effect. The data support only the assertion that the prognosis of these patients with late remissions who continued taking fluoxetine during weeks 13–26 was equal to that for patients taking fluoxetine in weeks 13–26 who had remissions earlier. We suspect that this late improvement was a drug effect, but we cannot prove it. Support of this assertion is gained from the data suggesting that late improvement is more likely to be a drug effect than a placebo effect (8). From a clinical perspective, the prognosis for later response is similar to that for early response.
Other limitations include the fact that the maximum dose of fluoxetine was 20 mg/day. Many clinicians would raise the dose of an unimproved patient by the fourth to sixth week, and it is unclear whether as much improvement would still be observed after week 6 if the dose had been increased earlier. However, some patients can tolerate only 20 mg/day of fluoxetine, and this is frequently the maximum dose prescribed in primary care settings. Another limitation is the absence of a placebo group during weeks 1–12 of this study, which would have helped establish that the observed improvement was a drug effect. We relied on the difference between the drug and placebo relapse rates during weeks 13–26 to support the likelihood that the improvements observed in weeks 1–12 were true drug effects. However, for reasons noted in the introduction, it is unlikely that a placebo-controlled study with a fixed drug dose and sufficient number of subjects to estimate the relevance of change each week will be conducted.
Another issue is whether findings applicable to fluoxetine, which has a long half-life of 1 to 6 days, are applicable to other SSRIs. Steady state is generally achieved at five times a drug’s half-life (17). Since steady state with fluoxetine would be achieved in less than 5 weeks, pharmacokinetics does not appear to explain further improvement after 6 weeks, and the relevance of fluoxetine’s long half-life to the timing of onset of response is therefore unclear. Data from a study with sertraline (18), which has a shorter half-life, averaging 24 hours, are consistent with the possibility of onset of beneficial effects long after steady state is achieved. In that study, of approximately 307 patients unimproved at week 8, 34% (103 of 307) were improved by week 12. Unfortunately, the data presentation in the published report (18) is such that the exact number of patients taking sertraline is unclear (some were taking imipramine). Most appear to have been taking sertraline, since there were twice as many patients taking sertraline as imipramine. To specifically address this issue, studies examining the ultimate proportion of responders to other SSRIs at each week would be necessary.
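The steady-state reasoning above amounts to multiplying the half-life by five. A minimal sketch follows, using the half-lives quoted in the text; the function name `days_to_steady_state` is hypothetical, and the five-half-lives rule is the approximation cited in the text rather than a precise pharmacokinetic model.

```python
def days_to_steady_state(half_life_days: float, multiples: float = 5.0) -> float:
    """Approximate time to steady state as a multiple of the half-life."""
    return half_life_days * multiples


# Fluoxetine: half-life of 1 to 6 days, as quoted in the text.
fluoxetine_days = (days_to_steady_state(1), days_to_steady_state(6))

# Sertraline: half-life averaging 24 hours (1 day), as quoted in the text.
sertraline_days = days_to_steady_state(1)

print("fluoxetine steady state (days):", fluoxetine_days)
print("fluoxetine upper bound (weeks):", round(fluoxetine_days[1] / 7, 1))
print("sertraline steady state (days):", sertraline_days)
```

Even at the upper end of the quoted half-life range, steady state is reached in about 30 days, which is earlier than the week 6 assessment after which further improvement was observed.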
In summary, these data suggest that it may be clinically sound practice to continue fluoxetine treatment beyond 6 weeks for patients with minimal improvement. Further study of this issue with fluoxetine and other SSRIs is clinically and heuristically relevant. All practitioners should discuss the issue of trial length at the start of a treatment because if the necessity of a longer trial is made clear, compliance may improve.
Received Feb. 20, 2002; revision received Oct. 6, 2002; accepted Nov. 18, 2002. From the Department of Therapeutics, New York State Psychiatric Institute; and the Department of Psychiatry, Columbia University College of Physicians and Surgeons, New York. Address reprint requests to Dr. Quitkin, Department of Therapeutics, New York State Psychiatric Institute, 1051 Riverside Dr., New York, NY 10032.
Figure 1. Rate of Remission at Week 12 for Patients With Major Depression Who Had Not Achieved Remission at Weeks 4, 6, and 8 of an Open Trial of Fluoxetine (a)
(a) Nonresponse: <25% improvement in score on 17-item Hamilton Depression Rating Scale; partial response: 25%–49% improvement in Hamilton score; response: ≥50% improvement in Hamilton score with a total score of >7; remission: ≥50% improvement in Hamilton score with a total score of ≤7.
(b) Each percentage is based on the number of patients with the given outcome at week 4, 6, or 8 who had not dropped out by week 12 (see Table 1). For example, at week 6 there were 85 nonresponders who later completed the 12 study weeks, and of these 85, 41% (N=35) had remissions at week 12.
Figure 2. Relation of Outcome at Week 6 of a 12-Week Open Trial of Fluoxetine to Relapse During Placebo-Controlled Discontinuation in Weeks 13–26 for Patients With Major Depression Who Had Remission by Week 12