
Abstract

Objective:

The authors compared measurement-based care with standard treatment in major depression.

Methods:

Outpatients with moderate to severe major depression were consecutively randomized to 24 weeks of either measurement-based care (guideline- and rating scale-based decisions; N=61), or standard treatment (clinicians’ choice decisions; N=59). Pharmacotherapy was restricted to paroxetine (20–60 mg/day) or mirtazapine (15–45 mg/day) in both groups. Depressive symptoms were measured with the Hamilton Depression Rating Scale (HAM-D) and the Quick Inventory of Depressive Symptomatology–Self-Report (QIDS-SR). Time to response (a decrease of at least 50% in HAM-D score) and remission (a HAM-D score of 7 or less) were the primary endpoints. Outcomes were evaluated by raters blind to study protocol and treatment.

Results:

Significantly more patients in the measurement-based care group than in the standard treatment group achieved response (86.9% compared with 62.7%) and remission (73.8% compared with 28.8%). Similarly, time to response and remission were significantly shorter with measurement-based care (for response, 5.6 weeks compared with 11.6 weeks, and for remission, 10.2 weeks compared with 19.2 weeks). HAM-D scores decreased significantly in both groups, but the reduction was significantly larger for the measurement-based care group (−17.8 compared with −13.6). The measurement-based care group had significantly more treatment adjustments (44 compared with 23) and higher antidepressant dosages from week 2 to week 24. Rates of study discontinuation, adverse effects, and concomitant medications did not differ between groups.

Conclusions:

The results demonstrate the feasibility and effectiveness of measurement-based care for outpatients with moderate to severe major depression, suggesting that this approach can be incorporated in the clinical care of patients with major depression.

Major depression is common, leading to marked suffering for patients and families and causing physical and mental disability, with a substantial economic burden (1). Although major depression is prevalent across different cultures and effective pharmacological and psychosocial interventions are available, low remission rates in clinical practice are discouraging (2). Poor outcomes are related to inadequate dose and duration of pharmacotherapy, poor treatment adherence, high dropout, and frequent as well as unnecessary medication changes (3). In addition, inconsistency of treatment strategies among clinicians is common. Even in current, guideline-driven practice, there are often wide variations in clinicians’ behaviors, resulting in practice bias rather than a tailored and individualized treatment algorithm (4).

The concept of measurement-based care was developed and tested in the Texas Medication Algorithm Project (TMAP) (5–7) and the German Algorithm Project, phases 1–3 (GAP1, GAP2, and GAP3) (8–11). The term “measurement-based care” was coined by Trivedi et al. (12). Recently, measurement-based care has been gaining attention in the treatment of depression because it allows psychiatrists to individualize treatment decisions for each patient based on changes in psychopathology and tolerance of antidepressants (12). The TMAP was the first controlled study to evaluate measurement-based care in the treatment of depression (5–7). Subsequently, several open or randomized controlled studies, such as the GAP1, GAP2, and GAP3 studies (8–11) and the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study (4, 12, 13), evaluated measurement-based care, finding that measurement-based care-informed sequential algorithms can be successfully integrated into clinical practice and improve patient outcomes. However, design weaknesses, including lack of randomization (4, 5) and of blind raters (14, 15), may have biased the findings.

Despite a strong theoretical rationale for measurement-based care and data supporting the ability to implement this approach in clinical practice settings, there has never been a randomized controlled trial with blind raters comparing measurement-based care with usual care in the treatment of depression. Since usual care may involve different medication choices that could influence outcomes in addition to the presence or absence of measurement-based care, an alternative strategy is to compare measurement-based care with “standard treatment” that limits medication choices in order to isolate the effect of measurement-based care. Given the frequency of major depression, and the individual as well as societal burden it imposes, evaluation of the effectiveness and cost-saving potential of measurement-based care in a randomized controlled trial is critical, in order to inform clinical care and guidelines. Moreover, measurement-based care strategies are needed that are easily implemented in clinical practice and are scalable.

The aim of this study was to determine the efficacy and safety of measurement-based care in patients with major depression. We hypothesized that time to response and time to remission would be significantly shorter in the measurement-based care group, without greater dropout rates and side effect burden, compared with the standard treatment group.

Method

Patients and Study Setting

This was a randomized controlled trial, with assessors blind to protocol and treatment group, conducted between December 2011 and November 2012 in Beijing Anding Hospital, a university-affiliated teaching hospital in China. This 800-bed hospital serves a population of approximately 19 million people and has 1,100 outpatient visits daily.

To maximize the generalizability of the findings, only patients seeking psychiatric treatment (as opposed to those enrolled by advertisements) were recruited. Patients had to be outpatients, 18–65 years of age, with a diagnosis of nonpsychotic major depression established by treating psychiatrists and confirmed by a checklist based on DSM-IV criteria at study entry (12), as well as a score ≥17 on the Chinese version of the 17-item Hamilton Depression Rating Scale (HAM-D) (16, 17); all participants had to have the ability to communicate and to provide written consent. Exclusion criteria were a lifetime history of drug or alcohol dependence; bipolar, psychotic, obsessive-compulsive, or eating disorders; history of a lack of response or intolerance to either of the two protocol antidepressants (paroxetine and mirtazapine); pregnancy or breastfeeding; suicide attempts in the current depressive episode; or any major medical condition contraindicating the use of the protocol antidepressants. Paroxetine and mirtazapine doses were converted into amitriptyline-equivalent milligrams (50 mg in amitriptyline equivalents equals 10 mg of paroxetine or 15 mg of mirtazapine) (18).
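The amitriptyline-equivalent conversion stated above can be illustrated with a minimal sketch. The function and constant names are hypothetical and not part of the study; only the conversion ratios (50 mg of amitriptyline equivalents per 10 mg of paroxetine or 15 mg of mirtazapine) come from the text.

```python
# Conversion stated in the text: 50 mg of amitriptyline equivalents
# equals 10 mg of paroxetine or 15 mg of mirtazapine.
# The names below are illustrative assumptions, not study software.
MG_PER_50_MG_EQUIVALENT = {"paroxetine": 10, "mirtazapine": 15}


def to_amitriptyline_equivalents(drug: str, dose_mg: float) -> float:
    """Convert a daily dose (mg) into amitriptyline-equivalent mg."""
    unit = MG_PER_50_MG_EQUIVALENT[drug.lower()]
    return dose_mg * 50 / unit


print(to_amitriptyline_equivalents("paroxetine", 20))   # 100.0
print(to_amitriptyline_equivalents("mirtazapine", 45))  # 150.0
```

Under this conversion, the protocol ranges of 20–60 mg/day of paroxetine and 15–45 mg/day of mirtazapine correspond to 100–300 mg/day and 50–150 mg/day in amitriptyline equivalents, respectively.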

The study protocol was approved by the Human Research and Ethics Committee of Beijing Anding Hospital in accordance with the Declaration of Helsinki and local clinical traditions. All patients provided written informed consent.

Interventions and Measurement-Based Care

Patients who met inclusion criteria and provided written informed consent entered a 1-week washout phase for previously taken psychotropic medications for major depression (the washout phase was stipulated as 1 week by the Human Research and Ethics Committee of Beijing Anding Hospital). After washout, patients were randomly assigned to standard treatment or to measurement-based care according to a table of random numbers, and then followed for 24 weeks.

Patients in both groups received either open-label paroxetine (20–60 mg/day) or open-label mirtazapine (15–45 mg/day), within the therapeutic dosage range recommended by the Chinese Medical Association’s guideline for the prevention and treatment of major depression (19). The study’s therapeutic dosage range for paroxetine was recommended by the Chinese Society of Psychiatry’s Depression Panel. Paroxetine, a selective serotonin reuptake inhibitor, was chosen because it has been one of the most commonly prescribed antidepressants in China during the past decade, and mirtazapine, an alpha-2 antagonist, was chosen because it has a different mechanism of action (20). The treating psychiatrists could decide which of the antidepressants and dosages to prescribe, as long as they were within the study’s recommended dosage ranges. During the study, one medication change between paroxetine and mirtazapine was allowed for intolerability or inefficacy. The only other psychotropic medications allowed in the study were short-acting benzodiazepines, sparingly, for agitation, anxiety, and insomnia. Other medications not affecting the CNS were permitted.

Patients in the standard treatment group were treated by their psychiatrists according to their clinical needs as judged at each outpatient visit.

Following the STAR*D project (www.star-d.org), patients in the measurement-based care group received treatment according to a schedule that included individualized starting dosages, dosage adjustment, and medication changes to minimize side effects, maximize safety, and optimize the therapeutic benefit for each patient. The treating psychiatrists made treatment decisions on the basis of ratings on self-report scales obtained at each treatment visit: the 16-item Quick Inventory of Depressive Symptomatology–Self-Report (QIDS-SR) (21, 22) and the Frequency, Intensity, and Burden of Side Effects Rating scale (23). Paroxetine was started at 20 mg/day and increased to 30 mg/day by week 4, to 40 mg/day by week 6, to 50 mg/day by week 8, and to 60 mg/day (final dosage) by week 10. Mirtazapine was started at 15 mg/day and increased to 30 mg/day by week 1 and to 45 mg/day (final dosage) by week 4. Dosage adjustments were dependent on how long the patient had received a particular dosage, symptom changes, and side effects. The measurement-based care schedule used in this study is presented in Table 1.
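As a rough illustration of the titration steps just described, the sketch below looks up the highest dosage the schedule allows at a given study week. It assumes that "by week X" means the step becomes available at week X; the names are hypothetical, and actual adjustments in the study also depended on symptom changes and side effects.

```python
# (week at which the step becomes available, dose in mg/day),
# taken from the titration steps described in the text.
TITRATION_STEPS = {
    "paroxetine": [(0, 20), (4, 30), (6, 40), (8, 50), (10, 60)],
    "mirtazapine": [(0, 15), (1, 30), (4, 45)],
}


def scheduled_dose_ceiling(drug: str, week: int) -> int:
    """Return the highest scheduled dose (mg/day) reachable by `week`."""
    if week < 0:
        raise ValueError("week must be non-negative")
    ceiling = 0
    for start_week, dose in TITRATION_STEPS[drug.lower()]:
        if week >= start_week:
            ceiling = dose
    return ceiling


print(scheduled_dose_ceiling("paroxetine", 5))   # 30
print(scheduled_dose_ceiling("mirtazapine", 2))  # 30
```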

TABLE 1. The Measurement-Based Care Schedule, Using Assessment Tools in the Treatment of Major Depressive Disorder^a

| Time Point and Measure | Clinical Status | Treatment Plan: Paroxetine | Treatment Plan: Mirtazapine |
|---|---|---|---|
| Week 0 | | Start at 20 mg/day | Start at 15 mg/day, increase to 30 mg/day by week 1 |
| Week 2 | Outpatient visit, no dosage adjustment | | |
| Week 4 | | | |
| QIDS-SR score, ≤5 | Remission | Continue current dose | Continue current dose |
| QIDS-SR score, 6–8 | Partial response | Continue current dose or consider increasing to 30 mg/day | Continue current dose or consider increasing to 45 mg/day |
| | SEs intolerable | Continue current dose and address SEs, or switch to mirtazapine | Continue current dose and address SEs, or switch to paroxetine |
| QIDS-SR score, ≥9 | Nonresponse | Increase to 30 mg/day or switch to mirtazapine | Increase to 45 mg/day or switch to paroxetine |
| | SEs intolerable | Switch to mirtazapine | Switch to paroxetine |
| Week 6 | | | |
| QIDS-SR score, ≤5 | Remission | Continue current dose | Continue current dose |
| QIDS-SR score, 6–8 | Partial response | Continue current dose or consider increasing to 40 mg/day | Continue current dose or consider increasing to 45 mg/day |
| | SEs intolerable | Continue current dose and address SEs, or switch to mirtazapine | Continue current dose and address SEs, or switch to paroxetine |
| QIDS-SR score, ≥9 | Nonresponse | Increase to 40 mg/day | Increase to 45 mg/day or switch to paroxetine |
| | SEs intolerable | Switch to mirtazapine | Switch to paroxetine |
| Week 8 | | | |
| QIDS-SR score, ≤5 | Remission | Continue current dose | Continue current dose |
| QIDS-SR score, 6–8 | Partial response | Continue current dose or consider increasing to 50 mg/day | Continue current dose or consider increasing to 45 mg/day |
| | SEs intolerable | Switch to mirtazapine | Switch to paroxetine |
| QIDS-SR score, ≥9 | Nonresponse | Switch to mirtazapine | Switch to paroxetine |
| | SEs intolerable | Switch to mirtazapine | Switch to paroxetine |
| Week 10 | | | |
| QIDS-SR score, ≤5 | Remission | Continue current dose | Continue current dose |
| QIDS-SR score, 6–8 | Partial response | Continue current dose or consider increasing to 60 mg/day | Continue current dose or consider switch to paroxetine |
| | SEs intolerable | Switch to mirtazapine | Switch to paroxetine |
| QIDS-SR score, ≥9 | Nonresponse | Switch to mirtazapine | Switch to paroxetine |
| | SEs intolerable | Switch to mirtazapine | Switch to paroxetine |
| Week 12 | | | |
| QIDS-SR score, ≤5 | Remission | Continue current dose and follow up | Continue current dose and follow up |
| QIDS-SR score, 6–8 | Partial response | Continue current dose and follow up, or consider switch to mirtazapine | Continue current dose and follow up, or consider switch to paroxetine |
| | SEs intolerable | Switch to mirtazapine | Switch to paroxetine |
| QIDS-SR score, ≥9 | Nonresponse | Switch to mirtazapine | Switch to paroxetine |
| | SEs intolerable | Switch to mirtazapine | Switch to paroxetine |

^a QIDS-SR=Quick Inventory of Depressive Symptomatology–Self-Report; SEs=side effects.
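The decision logic in Table 1 can be sketched as a small lookup. The example below covers only the week 4 rows for patients taking paroxetine; the function names are hypothetical, and the handling of intolerable side effects during remission (for which Table 1 lists no separate row) is an assumption made for illustration.

```python
def qids_status(score: int) -> str:
    """Map a QIDS-SR score to the clinical status bands used in Table 1."""
    if score <= 5:
        return "remission"
    if score <= 8:
        return "partial response"
    return "nonresponse"


# Week 4 plans for the paroxetine column, keyed by (status, SEs intolerable).
WEEK4_PAROXETINE = {
    ("remission", False): "Continue current dose",
    ("partial response", False):
        "Continue current dose or consider increasing to 30 mg/day",
    ("partial response", True):
        "Continue current dose and address SEs, or switch to mirtazapine",
    ("nonresponse", False): "Increase to 30 mg/day or switch to mirtazapine",
    ("nonresponse", True): "Switch to mirtazapine",
}


def week4_paroxetine_plan(score: int, ses_intolerable: bool = False) -> str:
    """Return the Table 1 week 4 plan for a patient taking paroxetine."""
    status = qids_status(score)
    if status == "remission":
        # Assumption: Table 1 has no side-effect row for remission,
        # so this sketch returns the remission row regardless.
        ses_intolerable = False
    return WEEK4_PAROXETINE[(status, ses_intolerable)]


print(week4_paroxetine_plan(7, ses_intolerable=True))
```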


The treatments were delivered in the outpatient department of Beijing Anding Hospital. All treating psychiatrists were regular clinicians. The two treatment groups were cared for by separate treatment teams. Before the study, the clinicians responsible for the measurement-based care group underwent a 2-day training program on using measurement-based care according to the study schedule.

Following the STAR*D study (24), an independent clinical research coordinator monitored psychiatrists’ compliance with the measurement-based care treatment guidelines. A physician feedback form, designed for the study, was used to ensure that the treatment was delivered according to the measurement-based care guidelines. After each clinical visit, the clinical research coordinator completed the physician feedback form on the basis of the clinical visit’s documentation. If the treatment decision deviated from the guidelines, the psychiatrist was alerted shortly after the appointment to make the treatment consistent with the guidelines as soon as possible. With the assistance of the physician feedback form, the rate of nonadherence to the guidelines was <5% throughout the study period, and all violations were corrected promptly.

Outcome Measures

Basic sociodemographic and clinical characteristics were collected through a review of medical records, using a form designed for this study, and then confirmed in a clinical interview. The two primary outcome measures were the estimated time from randomization to response and to remission according to the HAM-D score. Response was defined as a decrease ≥50% from the baseline HAM-D score, and remission as a HAM-D score ≤7 (12). The pill count method was used to measure treatment adherence.
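The response and remission definitions above amount to two simple threshold checks, sketched here with hypothetical function names. Integer arithmetic is used for the 50% criterion to avoid floating-point edge cases.

```python
def is_response(baseline_hamd: int, current_hamd: int) -> bool:
    """Response: decrease of at least 50% from the baseline HAM-D score."""
    return 2 * (baseline_hamd - current_hamd) >= baseline_hamd


def is_remission(current_hamd: int) -> bool:
    """Remission: HAM-D score of 7 or less."""
    return current_hamd <= 7


# A patient entering with a HAM-D score of 22 who scores 11 at follow-up
# has responded (exactly 50% decrease) but has not remitted.
print(is_response(22, 11), is_remission(11))  # True False
```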

Secondary outcome measures included the severity of depressive symptoms according to the HAM-D and the severity of manic or hypomanic symptoms according to the Young Mania Rating Scale (YMRS) (25). An additional checklist with six common side effects (dry mouth, diarrhea or constipation, dizziness or drowsiness, loss of appetite or nausea, headache, and excessive sweating) was used to measure side effects at each treatment visit.

As tools to implement the measurement-based care, the QIDS-SR and the Frequency, Intensity, and Burden of Side Effects Rating scale were used to measure the severity of depressive symptoms within the past week and antidepressant side effects, respectively, in the measurement-based care group only. On the QIDS-SR, higher scores indicate more severe depressive symptoms (21, 22). The Frequency, Intensity, and Burden of Side Effects Rating scale (23) is a self-report instrument assessing three domains of medication side effects within the past week: frequency, intensity, and burden (the degree to which side effects over the past week interfered with day-to-day functions). Each domain is rated on a 7-point (0–6) scale (frequency, ranging from “no side effects” to “present all of the time”; intensity, ranging from “no side effects” to “intolerable”; and burden, ranging from “no impairment” to “unable to function”). A low score (0–2) indicates that current treatment may continue; an intermediate score (3 or 4) suggests that side effects require attention; a high score (5 or 6) means that the current treatment is unacceptable and a decrease in dosage or a medication switch is needed (13).
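The interpretation bands for the Frequency, Intensity, and Burden of Side Effects Rating scale described above can be written as a short helper. This is a sketch with a hypothetical function name; the cutoffs are those given in the text.

```python
def interpret_side_effect_rating(score: int) -> str:
    """Interpret a single 0-6 domain rating using the bands in the text."""
    if not 0 <= score <= 6:
        raise ValueError("domain ratings range from 0 to 6")
    if score <= 2:
        return "continue current treatment"
    if score <= 4:
        return "side effects require attention"
    return "decrease dosage or switch medication"


print(interpret_side_effect_rating(3))  # side effects require attention
```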

Assessment Methods

Two raters with >8 years of experience in clinical practice and research independently assessed patients with the above-described instruments at baseline and at 2, 4, 8, 12, and 24 weeks. The raters were blind to the study protocol and treatment assignment and were not involved in treatment. Before the study, the two raters were trained in the use of the instruments. In the prestudy reliability exercise, interrater reliability (intraclass correlation coefficients for continuous ratings and kappa values for categorical measures) was above 0.8. All patients were instructed by the research coordinators not to disclose their group membership to the raters at any time during the study. Patients were removed from the study if they had a suicide attempt, became pregnant, developed a severe medical condition, or suffered from newly emerging side effects that they found intolerable and that could not be managed. Patients who were removed from the study received antidepressant treatment as appropriate as part of clinical care.

Statistical Analysis

Data were analyzed using SPSS for Windows, version 20.0 (IBM Corp., Armonk, N.Y.). Full intent-to-treat analyses were performed on all-cause and specific-cause discontinuation; all other analyses were conducted in the modified intent-to-treat sample, that is, patients who underwent a baseline assessment and at least one follow-up assessment. Baseline sociodemographic and clinical characteristics, discontinuation, response and remission rates, and side effects were compared between the two groups using independent-sample t tests, Mann-Whitney U tests, and chi-square tests, as appropriate.

Kaplan-Meier survival analyses were used to calculate the estimated time from randomization to response and remission. The Cox proportional hazards regression model was used to compare the estimated time to response and remission between the two groups while controlling for covariates, such as marital status, age, and concomitant medications. The analyses included patients who met the criteria for response or remission and those who were lost to follow-up without a documented response or remission, as well as those who did not meet response or remission criteria at their last assessment. Additionally, Kaplan-Meier survival analysis and Cox proportional hazards regression analysis were performed to compare the estimated time to all-cause discontinuation between the two treatment groups. Differences in the changes in HAM-D and YMRS scores between the two groups from baseline to endpoint were subjected to analysis of covariance with baseline scores, marital status, and age as covariates. Continuous outcomes were analyzed as last-observation-carried-forward data. The significance threshold was set at 0.05 (two-tailed).
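For readers unfamiliar with the Kaplan-Meier (product-limit) estimator, the following is a minimal pure-Python sketch; it is not the SPSS procedure used in the study. Here the "event" is response (or remission), so the curve gives the estimated proportion of patients who have not yet reached that outcome. The input data are hypothetical toy values, not study data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate.

    times  -- follow-up time for each patient (e.g., weeks)
    events -- 1 if the outcome occurred at that time, 0 if censored
    Returns [(t, S(t))] at each distinct time with at least one event.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # outcomes observed at time t
        n_leaving = 0  # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Hypothetical toy data: five patients, two of them censored.
toy_times = [2, 4, 4, 8, 12]
toy_events = [1, 1, 0, 1, 0]
for week, prob in kaplan_meier(toy_times, toy_events):
    print(week, round(prob, 2))
```

Patients lost to follow-up without a documented outcome enter such an analysis as censored observations, which is how the text describes their inclusion.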

Results

Of 164 screened patients, 120 (73.2%) met study criteria and were randomly assigned to standard treatment (N=59) or to measurement-based care (N=61) (see Figure S1 in the data supplement that accompanies the online edition of this article).

Sociodemographic and Clinical Characteristics

All participating patients had medical insurance. There were no significant differences between the two groups in demographic or clinical characteristics, except that patients in the measurement-based care group were younger on average and less likely to be married (Table 2).

TABLE 2. Baseline Demographic and Clinical Characteristics of Participants Receiving Measurement-Based Care or Standard Treatment for Major Depression

| Characteristic | Overall Sample | | Standard Treatment Group (N=59) | | Measurement-Based Care Group (N=61) | |
|---|---|---|---|---|---|---|
| | N | % | N | % | N | % |
| Female | 77 | 64.2 | 35 | 59.3 | 42 | 68.9 |
| Married^a | 94 | 78.3 | 51 | 86.4 | 43 | 70.5 |
| Family history of psychiatric disorders | 17 | 14.2 | 9 | 15.3 | 8 | 13.1 |
| Major medical conditions | 32 | 26.7 | 17 | 28.8 | 15 | 24.6 |
| Initial antidepressant | | | | | | |
| Paroxetine | 55 | 45.8 | 28 | 47.5 | 27 | 44.3 |
| Mirtazapine | 65 | 54.2 | 31 | 52.5 | 34 | 55.7 |
| | Mean | SD | Mean | SD | Mean | SD |
| Age (years)^a | 41.1 | 12.1 | 43.5 | 11.6 | 38.8 | 12.2 |
| Age at onset (years) | 35.6 | 12.3 | 37.3 | 11.7 | 33.9 | 12.7 |
| Number of depressive episodes | 3.2 | 8.4 | 3.9 | 11.9 | 2.4 | 1.7 |
| Duration of illness (years) | 11.0 | 17.8 | 12.9 | 23.3 | 9.2 | 10.0 |
| Hamilton Depression Rating Scale score | 22.4 | 4.1 | 22.2 | 4.1 | 22.5 | 4.2 |
| Young Mania Rating Scale score | 1.0 | 1.6 | 1.0 | 1.7 | 1.0 | 1.6 |
| Quick Inventory of Depressive Symptomatology–Self-Report score | | | | | 13.1 | 6.0 |

^a Significant difference between groups, p<0.05.


Study Discontinuation

All-cause discontinuation did not differ significantly between the measurement-based care and standard treatment groups (27.9% and 37.3%, respectively; for details, see Figure S1 in the data supplement). Likewise, time to all-cause discontinuation was similar between groups (measurement-based care, 14.6 days; standard treatment, 15.0 days; hazard ratio=0.67, 95% CI=0.35–1.29).

Dosage, Medication Adherence, and Treatment Adjustment

Antidepressant dosages in amitriptyline equivalents and clinical visits are summarized in Table 3. Treatment adherence did not differ between groups (99.8% and 99.7%). The mean number of clinical visits was 8.0 (95% CI=7.5–8.4) for the standard treatment group and 8.4 (95% CI=8.0–8.9) for the measurement-based care group over the whole study period. There were no significant differences between groups at 0–2 weeks and 3–4 weeks; however, patients in measurement-based care had more clinical visits at 5–8 weeks and 9–12 weeks (both p values <0.001), but fewer visits at 13–24 weeks (p<0.001) than those in the standard treatment group. The total number of treatment adjustments was 23 for the standard treatment and 44 for the measurement-based care group (χ2=13.4, df=1, p<0.001). In the standard treatment group, there were 22 dosage adjustments and one medication switch (from mirtazapine to paroxetine; the mean dosage was 75 mg in amitriptyline equivalents at the time of the switch). In the measurement-based care group, there were 40 dosage adjustments and four medication switches (two from mirtazapine to paroxetine and two from paroxetine to mirtazapine; the mean dosage was 137.5 mg in amitriptyline equivalents at the time of the switches). The mean antidepressant exposure time was 20.5 weeks (SD=5.3) in the standard treatment group and 20.6 weeks (SD=5.8) in the measurement-based care group. However, antidepressant dosages were significantly higher in the measurement-based care group than in the standard treatment group from week 2 (118 mg/day compared with 106.7 mg/day; p=0.02) to week 24 (122.1 mg/day compared with 106.7 mg/day; p=0.006).

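TABLE 3. Antidepressant Dosages (in Amitriptyline Equivalents) and Hamilton Depression Rating Scale (HAM-D) Scores for Participants Receiving Measurement-Based Care or Standard Treatment for Major Depression^a

| Time Point and Measure | Standard Treatment Group | | | Measurement-Based Care Group | | | |
|---|---|---|---|---|---|---|---|
| | N | Mean | SD | N | Mean | SD | p |
| Baseline | | | | | | | |
| Dosage (mg/day) | 59 | | | 61 | | | |
| HAM-D score | 59 | 22.2 | 4.1 | 61 | 22.5 | 4.2 | 0.74 |
| 2 weeks | | | | | | | |
| Dosage (mg/day) | 59 | 106.7 | 21.6 | 61 | 118.0 | 28.9 | 0.02 |
| HAM-D score | 59 | 15.2 | 4.8 | 61 | 12.4 | 5.0 | 0.002 |
| Visits from baseline to week 2 | 59 | 2.0 | 0.0 | 61 | 2.0 | 0.0 | |
| 4 weeks | | | | | | | |
| Dosage (mg/day) | 58 | 106.7 | 21.6 | 60 | 118.0 | 28.9 | 0.02 |
| HAM-D score | 58 | 12.1 | 3.9 | 60 | 9.3 | 4.1 | <0.001 |
| Visits from week 3 to week 4 | 58 | 1.0 | 0.0 | 60 | 1.0 | 0.0 | |
| 8 weeks | | | | | | | |
| Dosage (mg/day) | 55 | 107.6 | 24.2 | 57 | 119.6 | 35.6 | 0.03 |
| HAM-D score | 55 | 9.9 | 3.3 | 57 | 6.3 | 3.9 | <0.001 |
| Visits from week 5 to week 8 | 55 | 1.1 | 0.3 | 57 | 1.9 | 0.2 | <0.001 |
| 12 weeks | | | | | | | |
| Dosage (mg/day) | 52 | 109.3 | 27.0 | 49 | 122.1 | 35.9 | 0.03 |
| HAM-D score | 52 | 8.9 | 3.5 | 49 | 5.4 | 3.7 | <0.001 |
| Visits from week 9 to week 12 | 52 | 1.2 | 0.5 | 49 | 1.7 | 0.5 | <0.001 |
| 24 weeks | | | | | | | |
| Dosage (mg/day) | 37 | 106.7 | 21.6 | 44 | 122.1 | 35.9 | 0.006 |
| HAM-D score | 37 | 8.6 | 3.6 | 44 | 4.8 | 3.6 | <0.001 |
| Visits from week 13 to week 24 | 37 | 2.5 | 1.1 | 44 | 1.6 | 1.1 | <0.001 |

^a Fifty milligrams of amitriptyline equivalents equals 10 mg of paroxetine or 15 mg of mirtazapine.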


Response and Remission

The response rate was 62.7% in the standard treatment group and 86.9% in the measurement-based care group (χ2=9.3, df=1, p=0.002), for an overall response rate of 75.0%. The remission rate was 28.8% in the standard treatment group and 73.8% in the measurement-based care group (χ2=24.2, df=1, p<0.001), for an overall remission rate of 51.7%.
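The response comparison can be checked with a short worked example. The sketch below applies the uncorrected Pearson chi-square formula for a 2×2 table to the response counts reported in Table 4 (37 of 59 and 53 of 61). Whether the authors applied a continuity correction is not stated, so the choice of the uncorrected statistic is an assumption; it reproduces the reported value of 9.3.

```python
def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: standard treatment, measurement-based care.
# Columns: responders, nonresponders.
chi2 = pearson_chi2_2x2(37, 59 - 37, 53, 61 - 53)
print(round(chi2, 1))                    # 9.3

# Overall response rate across both groups.
print(round(100 * (37 + 53) / 120, 1))   # 75.0
```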

The average time to response was 8.1 weeks (95% CI=6.5–9.6) in the standard treatment group and 4.5 weeks (95% CI=3.3–5.8) in the measurement-based care group (t=3.4, df=118, p=0.001). The corresponding figures for remission were 14.8 weeks (95% CI=12.8–16.7) and 8.4 weeks (95% CI=6.7–10.2) for the two groups (t=4.8, df=118, p<0.001). For responding patients, the average time to response was 5.1 weeks (95% CI=4.1–6.0) for the standard treatment group and 3.1 weeks (95% CI=2.7–3.6) for the measurement-based care group (p<0.001). For remitting patients, the time to remission was 8.4 weeks (95% CI=5.7–11.1) for the standard treatment group and 6.0 weeks (95% CI=4.8–7.2) for the measurement-based care group, which fell short of statistical significance (p=0.054).

The estimated time intervals for response and remission in the two groups in the Kaplan-Meier analysis are illustrated in Figure 1 and Table 4. The difference between the survival curves was significant for both response (log rank=18.6, p<0.001) and remission (log rank=29.1, p<0.001).

FIGURE 1. Estimated Mean Time to Response and Remission, by Kaplan-Meier Analysis^a

^a In panel A, the numbers of patients who achieved treatment response at 2, 4, 8, 12, and 24 weeks, respectively, were 9, 24, 35, 37, and 37 in the standard treatment group and 30, 49, 53, 53, and 53 in the measurement-based care group (p<0.001). In panel B, the numbers of patients who achieved remission at 2, 4, 8, 12, and 24 weeks, respectively, were 2, 5, 12, 16, and 17 in the standard treatment group and 8, 25, 41, 44, and 45 in the measurement-based care group (p<0.001).

In the Cox regression model, controlling for the potentially confounding effects of marital status, age, and concomitant medications, the estimated times to response (hazard ratio=2.2, 95% CI=1.4–3.5; p<0.001) and remission (hazard ratio=4.2, 95% CI=2.3–7.6; p<0.001) were significantly longer with standard treatment than with measurement-based care.

Symptom Ratings

There were no significant differences between the two groups in baseline HAM-D and YMRS scores (Tables 2 and 4). By the end of the study, the HAM-D score decreased significantly in both groups, but the change was significantly larger in the measurement-based care group (p<0.001). The overall low YMRS score, however, did not change significantly from baseline to endpoint in either group.

TABLE 4. Efficacy Outcomes for Participants Receiving Measurement-Based Care or Standard Treatment for Major Depression^a

| Measure | Standard Treatment Group (N=59) | | Measurement-Based Care Group (N=61) | | |
|---|---|---|---|---|---|
| | N | % | N | % | p |
| Response | 37 | 62.7 | 53 | 86.9 | 0.002 |
| Remission | 17 | 28.8 | 45 | 73.8 | <0.001 |
| | Mean | 95% CI | Mean | 95% CI | p |
| Estimated time to response (weeks) | 11.6 | 9.2, 14.1 | 5.6 | 3.9, 7.4 | <0.001 |
| Estimated time to remission (weeks) | 19.2 | 17.2, 21.3 | 10.2 | 8.0, 12.3 | <0.001 |
| Change in HAM-D score | 13.6 | 12.1, 15.0 | 17.8 | 16.3, 19.2 | <0.001 |
| Change in YMRS score | 0.6 | 0.3, 1.1 | 0.6 | 0.2, 1.0 | 0.73 |
| Change in QIDS-SR score | | | 9.7 | 8.2, 11.3 | |

^a HAM-D=Hamilton Rating Scale for Depression; QIDS-SR=Quick Inventory of Depressive Symptomatology-Self-Report; YMRS=Young Mania Rating Scale. Response was defined as a decrease ≥50% from the baseline HAM-D score, and remission as a HAM-D score ≤7. Estimated time to response and remission were based on Kaplan-Meier analysis. The continuous outcome measures were subjected to last-observation-carried-forward analysis. Changes in HAM-D and YMRS scores over the study period were compared by analysis of covariance after controlling for baseline scores, marital status, and age as covariates.


Adverse Events

The proportions of any type and the total number of adverse events did not differ significantly between the two groups (Table 5).

TABLE 5. Adverse Effects Among Participants Receiving Measurement-Based Care or Standard Treatment for Major Depression^a

| Adverse Effect | Standard Treatment Group (N=59) | | Measurement-Based Care Group (N=61) | |
|---|---|---|---|---|
| | N | % | N | % |
| Checklist of common side effects | | | | |
| Dry mouth | 8 | 13.6 | 6 | 9.8 |
| Diarrhea or constipation | 6 | 10.2 | 5 | 8.2 |
| Dizziness or drowsiness | 5 | 8.5 | 5 | 8.2 |
| Loss of appetite or nausea | 4 | 6.8 | 3 | 4.9 |
| Headache | 3 | 5.1 | 3 | 4.9 |
| Excessive sweating | 3 | 5.1 | 2 | 3.3 |
| Total | 29 | 49.2 | 24 | 39.3 |
| Frequency, Intensity, and Burden of Side Effects Rating scale | | | | |
| Frequency domain | | | | |
| Mild (0–2) | | | 32 | 52.5 |
| Moderate (3, 4) | | | 27 | 44.3 |
| Severe (5, 6) | | | 2 | 3.3 |
| Severity domain | | | | |
| Mild (0–2) | | | 35 | 57.4 |
| Moderate (3, 4) | | | 24 | 39.3 |
| Severe (5, 6) | | | 2 | 3.3 |
| Burden domain | | | | |
| Mild (0–2) | | | 40 | 65.6 |
| Moderate (3, 4) | | | 20 | 32.8 |
| Severe (5, 6) | | | 1 | 1.6 |

^a There were no significant differences between groups. The checklist of six common side effects was administered to participants in both groups at each assessment. The Frequency, Intensity, and Burden of Side Effects Rating scale was administered only to participants in the measurement-based care group. For the latter instrument, mean domain ratings for the measurement-based care group were 2.3 (SD=1.3) for the frequency domain, 2.3 (SD=1.2) for the severity domain, and 2.0 (SD=1.2) for the burden domain.


Concomitant Psychotropic Medications

Short-acting benzodiazepines were prescribed for 32.2% (19/59) of patients in the standard treatment group and 47.5% (29/61) in the measurement-based care group, a difference that fell short of statistical significance (χ2=2.9, df=1, p=0.09). Of these patients, all 19 in the standard treatment group took lorazepam, and in the measurement-based care group, 27 patients took lorazepam and two took oxazepam.

Discussion

Evidence is increasing that measurement-based care allows psychiatrists to individualize treatment decisions for major depression based on changes in psychopathology and side effects, which decreases inappropriate variance and improves the implementation of appropriate treatment strategies, thereby enhancing outcomes, reducing treatment resistance, and increasing the quality of care (10). To the best of our knowledge, this was the first randomized controlled trial with blind raters to systematically investigate the effect of measurement-based care, compared with standard treatment, on time to response and remission in patients with major depression, using identical medication options in the two groups in order to isolate the effect of measurement-based care. Similar to findings of the TMAP (5) and subsequent open or randomized studies (8–11), we found that measurement-based care-informed sequential algorithms can be successfully integrated into clinical practice and improve patient outcomes. Our study demonstrated significantly higher response and remission rates at 6 months in the measurement-based care group compared with the standard treatment group, translating into numbers needed to treat of 5 and 3 for response and remission, respectively. Furthermore, compared with the standard treatment group, the measurement-based care group had a significantly higher proportion of responders (86.9% compared with 62.7%) and remitters (73.8% compared with 28.8%). In fact, in the measurement-based care group, 85% of the responders achieved remission, whereas only 46% did so in the standard treatment group. Moreover, for patients receiving measurement-based care, both time to response and time to remission were significantly shorter than for those in the standard treatment group (5.6 weeks compared with 11.6 weeks, and 10.2 weeks compared with 19.2 weeks, respectively). These findings were both statistically significant and clinically meaningful, reducing time of suffering and expense before reaching response by 6 weeks and before reaching remission by 9 weeks. These positive results were obtained within the context of significantly more treatment adjustments in the measurement-based care group, guided by rating scale-based assessments, and without a higher frequency of dropouts, concomitant medications, or adverse effects.
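The numbers needed to treat and the responder-to-remitter proportions quoted above follow from the reported rates by simple arithmetic. The following is a minimal sketch, assuming the usual convention of rounding the reciprocal of the absolute rate difference up to the next whole patient (the text does not state the rounding rule); the helper name `nnt` is hypothetical.

```python
import math

def nnt(rate_mbc, rate_standard):
    """Number needed to treat: reciprocal of the absolute rate difference, rounded up (assumed convention)."""
    return math.ceil(1 / (rate_mbc - rate_standard))

# Reported rates: (measurement-based care, standard treatment).
response = (0.869, 0.627)
remission = (0.738, 0.288)

print("NNT response:", nnt(*response))    # 5
print("NNT remission:", nnt(*remission))  # 3

# Share of responders who also reached remission, in percent, per group.
for label, resp, remi in zip(("measurement-based care", "standard treatment"), response, remission):
    print(label, round(100 * remi / resp))  # 85 and 46
```

The same subtraction applied to the reported times gives the 6-week (11.6 minus 5.6) and 9-week (19.2 minus 10.2) reductions.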

Response was measured because it is a common outcome measure in drug trials (26). However, response that falls short of remission is suboptimal, since it is associated with residual symptoms, high frequency of relapse, impaired social functioning, and high risk for suicide (12, 27). In contrast, the higher rates of and shorter time to remission in the measurement-based care group translate into a lower symptom burden, lower rates of expected relapse and suicide, and normal psychosocial function (28).

The antidepressant dosages in the measurement-based care group were higher than those in the standard treatment group from week 2 to week 24. The higher dosages in the early phase of treatment may have led to faster symptom reduction. Furthermore, patients in the measurement-based care group had more treatment adjustments than those in the standard treatment group (44 compared with 23), which were predominantly dosage adjustments. This rating scale-based and individualized treatment approach that set the clear target of remission was likely responsible for the superior efficacy outcomes in the measurement-based care group without compromising treatment continuation and tolerability, despite higher antidepressant dosages. Our finding of lower dosages in the standard treatment group supports the notion that suboptimal antidepressant dosages and lack of appropriate medication changes in the context of inadequate outcome contributed to the low remission rates (3, 4). Moreover, our results, although they still require replication, also suggest that the period between 1 and 3 months may be the most critical for fine-tuning the antidepressant treatment approach, which may yield faster and better outcomes and reduce the need for frequent visits beyond 3 months. Taken together, these findings support the notion that measurement-based care can maximize therapeutic effects and minimize, or at least not increase, side effects (13).

The remission rate in the measurement-based care group (73.8%) was higher than rates reported in most but not all earlier studies (29). The favorable remission rate in this study compared with earlier efficacy studies (e.g., 22% [30]) may be due to advantages related to measurement-based care. The remission rate in our measurement-based care group was also higher than the rates of 28%–33% in the STAR*D project (12) and 54% in the GAP2 study (14), which also used measurement-based care. Possible reasons for differences between the present study and the STAR*D study (12) include a longer treatment duration in our study (24 weeks compared with 14 weeks) and a higher proportion of married patients (70.5% in the measurement-based care group and 86.4% in the standard treatment group, compared with 41.7% in the STAR*D study), who are known to have higher remission rates (12). Additional possible reasons for lower remission rates in STAR*D than in our study include different sampling methods, the potential failure to distinguish unipolar and bipolar depression, and the sequential phases design, in which the next phases of the study had to be filled with patients who did not meet remission criteria (31). Among possible reasons for differences in remission rates between our study and the GAP2 study (14) are that the latter included patients with psychotic major depression (17.7%–18.3%) and inpatients, who are generally more severely ill, whereas our trial excluded these patient groups. Moreover, because of a lack of community-based psychiatric services in most areas of China (32), outpatients are usually not severely ill. For example, the mean baseline HAM-D score was 22.4 in our study. No remission data are available for the TMAP study (5). In our study, the actual (not estimated) mean times to response (4.5 weeks) and remission (8.4 weeks) for the measurement-based care group were similar to those in the Keller et al. study (5.7 weeks and 6.7 weeks, respectively) (30) and the GAP2 study (mean time to remission, 7.0 weeks) (14). No time-to-event data are available for the STAR*D and TMAP studies.

As mentioned above, measurement-based care consists of individualized treatment, which incorporates self-reported measures that can improve patients’ ability to monitor their own symptoms and side effects and help them understand the nature of their depression and the complexity of its treatment. All these factors are beneficial in improving the acceptability of the illness management (33) and may help in the making of shared decisions with a clear goal of remission. Therefore, we had hypothesized that treatment adherence in the measurement-based care group would be better than in the standard treatment group. However, treatment adherence rates were very high and nearly identical in both groups (99.8% and 99.7%). In STAR*D, self-reported medication adherence monitoring via a web-based system was employed, but adherence rates were not reported (12). Because of a lack of community-based rehabilitation or day-care facilities in most areas of China (32), psychiatric patients frequently live with their families. As a result, treatment monitoring and support from their families may improve medication adherence, which could account for the high adherence rate even in the standard treatment group.

The results of this study should be interpreted within the context of its limitations. First, the open study design may have influenced the outcomes to an unknown degree. However, we sought to conduct a study that would mirror usual practice patterns, except for the measurement-based care component, in order to ensure generalizability. Moreover, we utilized assessors blind to both group assignment and study protocol to provide unbiased ratings. Second, we restricted the study antidepressants to paroxetine and mirtazapine, and our results may not be applicable to other antidepressant drugs. However, efficacy differences among antidepressants are likely small (34), and by restricting the medications to the two commonly used antidepressants in both groups, we were able to isolate our testing to the effectiveness of measurement-based care compared with standard treatment, independent of potential differences in medication choice. Third, our clinical research coordinators’ routine check on psychiatrists’ adherence to the measurement-based care guideline, as was done in STAR*D, may have increased the efficacy of measurement-based care. Fourth, only outpatients at one site in mainland China were involved; therefore, the findings need to be replicated in other treatment settings. In addition, information on family psychiatric history from case notes may be inaccurate because of potential recall bias. Because of the limited number of psychiatrists, the treatment rate for psychiatric disorders is very low in China (32, 35). In many cases, patients or their families can only recall a positive family history of psychiatric disorders, but not the actual diagnoses. Fifth, neither specific reasons for treatment discontinuation nor sexual side effects were assessed, yet rates of discontinuation numerically favored the measurement-based care group and none of the side effects measured differed between the two groups. Sixth, in the measurement-based care group, treatment visits coincided with the research assessment visits at weeks 2, 4, 6, 8, 10, 12, and 24. Additional visits according to clinical need were allowed, especially in the latter 3 months of the study. Conversely, the clinical visits were uncontrolled in the standard treatment group. However, the number of clinical visits did not differ significantly for the two groups over the study period, arguing against a relevant effect on the outcome. Seventh, pill counts were conducted in both groups. Although this could have increased adherence in the standard treatment group beyond usual care standards (there were no pill count differences between the two groups), this is a conservative bias that would have worked against a separation of measurement-based care from standard treatment. Eighth, similar to the STAR*D project, no structured diagnostic schedule for major depression was used, although the clinical diagnosis of major depression was confirmed by a checklist based on DSM-IV criteria at study entry, and major depression symptoms had to be moderate to severe. Ninth, the study lasted only 24 weeks; however, this is a relatively long duration for a randomized trial, and achievement of remission has been associated with positive long-term outcomes (28). Finally, given the different pharmacological action of mirtazapine, paroxetine, and amitriptyline, the conversion of antidepressant doses into amitriptyline-equivalent milligrams is not entirely clear and precise. However, using the same conversion standard when comparing measurement-based care and standard treatment in the multivariate analysis should have mitigated this limitation.

Conclusions

We found measurement-based care to be a feasible and more effective method than standard treatment for patients with moderate to severe major depression. Measurement-based care markedly increased the likelihood and the speed of achieving response and remission without increasing side effect burden, despite higher antidepressant dosages, and it proved to be acceptable to patients. Results from this study should inform clinical care delivery to patients with major depression, and the cost-effectiveness of measurement-based care should be explored further.

From the Mood Disorders Center and Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Capital Medical University, Beijing; the Center of Depression, Beijing Institute for Brain Disorders and China Clinical Research Center for Mental Disorders, Beijing; the Unit of Psychiatry, Faculty of Health Sciences, University of Macau; the Department of Psychiatry, Chinese University of Hong Kong, Hong Kong; the University of Notre Dame Australia, Marian Centre, Perth; the School of Psychiatry and Clinical Neurosciences, University of Western Australia, Perth; and the Division of Psychiatry Research, Zucker Hillside Hospital, North Shore–Long Island Jewish Health System, Glen Oaks, N.Y.
Address correspondence to Dr. Xiang or Dr. Wang.

The first four authors contributed equally to this work.

Supported by the National Science and Technology Major Projects for “Major New Drugs Innovation and Development” (2012ZX09303014-002), the Key Medical Specialties Development Project of Beijing Municipal Administration of Hospitals (ZYLX201403), the Beijing Institute for Brain Disorders (BIBD-PXM2013_014226_07_000084), the Capital Medical Development and Research Fund (2009-1051), and the Start-Up Research Grant (SRG2014-00019-FHS) and Multi-Year Research Grant (MYRG2015-00230-FHS) from University of Macau.

Clinicaltrials.gov identifier: NCT02191124.

Dr. Correll has received grant support from Bristol-Myers Squibb, Janssen/Johnson & Johnson, Novo Nordisk A/S, Otsuka, and Takeda and has served as a consultant or adviser to or received honoraria from AbbVie, Actavis, Actelion, Alexza, Alkermes, Bristol-Myers Squibb, Cephalon, Eli Lilly, Genentech, Gerson Lehrman Group, IntraCellular Therapies, Janssen/Johnson & Johnson, Lundbeck, MedAvante, Medscape, Merck, Otsuka, Pfizer, ProPhase, Reviva, Roche, Sunovion, Supernus, Takeda, Teva, and Vanda. The other authors report no financial relationships with commercial interests.

The authors thank all the clinicians for their contribution to this study.

References

1 Kessler RC, Chiu WT, Demler O, et al.: Prevalence, severity, and comorbidity of 12-month DSM-IV disorders in the National Comorbidity Survey Replication. Arch Gen Psychiatry 2005; 62:617–627

2 Rush AJ, Trivedi MH, Wisniewski SR, et al.: Bupropion-SR, sertraline, or venlafaxine-XR after failure of SSRIs for depression. N Engl J Med 2006; 354:1231–1242

3 Kessler RC, Berglund P, Demler O, et al.: The epidemiology of major depressive disorder: results from the National Comorbidity Survey Replication (NCS-R). JAMA 2003; 289:3095–3105

4 Trivedi MH, Daly EJ: Measurement-based care for refractory depression: a clinical decision support model for clinical research and practice. Drug Alcohol Depend 2007; 88(suppl 2):S61–S71

5 Trivedi MH, Rush AJ, Crismon ML, et al.: Clinical results for patients with major depressive disorder in the Texas Medication Algorithm Project. Arch Gen Psychiatry 2004; 61:669–680

6 Crismon ML, Trivedi M, Pigott TA, et al.: The Texas Medication Algorithm Project: report of the Texas Consensus Conference Panel on Medication Treatment of Major Depressive Disorder. J Clin Psychiatry 1999; 60:142–156

7 Shon SP, Crismon ML, Toprac MG, et al.: Mental health care from the public perspective: the Texas Medication Algorithm Project. J Clin Psychiatry 1999; 60(suppl 3):16–20, discussion 21

8 Adli M, Berghöfer A, Linden M, et al.: Effectiveness and feasibility of a standardized stepwise drug treatment regimen algorithm for inpatients with depressive disorders: results of a 2-year observational algorithm study. J Clin Psychiatry 2002; 63:782–790

9 Adli M, Rush AJ, Möller HJ, et al.: Algorithms for optimizing the treatment of depression: making the right decision at the right time. Pharmacopsychiatry 2003; 36(suppl 3):S222–S229

10 Adli M, Bauer M, Rush AJ: Algorithms and collaborative-care systems for depression: are they effective and why? A systematic review. Biol Psychiatry 2006; 59:1029–1038

11 Ricken R, Wiethoff K, Reinhold T, et al.: Algorithm-guided treatment of depression reduces treatment costs: results from the randomized controlled German Algorithm Project (GAPII). J Affect Disord 2011; 134:249–256

12 Trivedi MH, Rush AJ, Wisniewski SR, et al.: Evaluation of outcomes with citalopram for depression using measurement-based care in STAR*D: implications for clinical practice. Am J Psychiatry 2006; 163:28–40

13 Trivedi MH: Tools and strategies for ongoing assessment of depression: a measurement-based approach to remission. J Clin Psychiatry 2009; 70(suppl 6):26–31

14 Bauer M, Pfennig A, Linden M, et al.: Efficacy of an algorithm-guided treatment compared with treatment as usual: a randomized, controlled study of inpatients with depression. J Clin Psychopharmacol 2009; 29:327–333

15 Wiethoff K, Bauer M, Baghai TC, et al.: Prevalence and treatment outcome in anxious versus nonanxious depression: results from the German Algorithm Project. J Clin Psychiatry 2010; 71:1047–1054

16 Hamilton M: A rating scale for depression. J Neurol Neurosurg Psychiatry 1960; 23:56–62

17 Xie GR, Shen QJ: [Use of the Chinese version of the Hamilton Rating Scale for Depression in general population and patients with major depression.] Chinese Journal of Nervous and Mental Diseases 1984; 10:346 (Chinese)

18 Bezchlibnyk-Butler KZ, Jeffries JJ: Clinical Handbook of Psychotropic Drugs, 11th ed. Toronto, Hogrefe and Huber, 2001

19 Chinese Medical Association: Guideline for the Prevention and Treatment of Psychiatric Disorders in China. Beijing, Chinese Medical Association, 2003

20 Holm KJ, Markham A: Mirtazapine: a review of its use in major depression. Drugs 1999; 57:607–631

21 Rush AJ, Trivedi MH, Ibrahim HM, et al.: The 16-Item Quick Inventory of Depressive Symptomatology (QIDS), Clinician Rating (QIDS-C), and Self-Report (QIDS-SR): a psychometric evaluation in patients with chronic major depression. Biol Psychiatry 2003; 54:573–583

22 Liu J, Xiang YT, Wang G, et al.: Psychometric properties of the Chinese versions of the Quick Inventory of Depressive Symptomatology–Clinician Rating (C-QIDS-C) and Self-Report (C-QIDS-SR). J Affect Disord 2013; 147:421–424

23 Wisniewski SR, Rush AJ, Balasubramani GK, et al.: Self-rated global measure of the frequency, intensity, and burden of side effects. J Psychiatr Pract 2006; 12:71–79

24 Trivedi MH, Rush AJ, Gaynes BN, et al.: Maximizing the adequacy of medication treatment in controlled trials and clinical practice: STAR*D measurement-based care. Neuropsychopharmacology 2007; 32:2479–2489

25 Young RC, Biggs JT, Ziegler VE, et al.: A rating scale for mania: reliability, validity, and sensitivity. Br J Psychiatry 1978; 133:429–435

26 Bridge JA, Iyengar S, Salary CB, et al.: Clinical response and risk for reported suicidal ideation and suicide attempts in pediatric antidepressant treatment: a meta-analysis of randomized controlled trials. JAMA 2007; 297:1683–1696

27 Rush AJ, Kraemer HC, Sackeim HA, et al.: Report by the ACNP Task Force on response and remission in major depressive disorder. Neuropsychopharmacology 2006; 31:1841–1853

28 Judd LL: Major depressive disorder: longitudinal symptomatic structure, relapse, and recovery. Acta Psychiatr Scand 2001; 104:81–83

29 Hollon SD, DeRubeis RJ, Fawcett J, et al.: Effect of cognitive therapy with antidepressant medications vs antidepressants alone on the rate of recovery in major depressive disorder: a randomized clinical trial. JAMA Psychiatry 2014; 71:1157–1164

30 Keller MB, McCullough JP, Klein DN, et al.: A comparison of nefazodone, the cognitive behavioral-analysis system of psychotherapy, and their combination for the treatment of chronic depression. N Engl J Med 2000; 342:1462–1470

31 Sussman N: Translating science into service: lessons learned from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study. Prim Care Companion J Clin Psychiatry 2007; 9:331–337

32 Xiang YT, Yu X, Sartorius N, et al.: Mental health in China: challenges and progress. Lancet 2012; 380:1715–1716

33 Rush AJ, Fava M, Wisniewski SR, et al.: Sequenced Treatment Alternatives to Relieve Depression (STAR*D): rationale and design. Control Clin Trials 2004; 25:119–142

34 Cipriani A, Furukawa TA, Salanti G, et al.: Comparative efficacy and acceptability of 12 new-generation antidepressants: a multiple-treatments meta-analysis. Lancet 2009; 373:746–758

35 Ma X, Xiang YT, Cai ZJ, et al.: Prevalence and socio-demographic correlates of major depressive episode in rural and urban areas of Beijing, China. J Affect Disord 2009; 115:323–330