Psychodynamic Therapy: As Efficacious as Other Empirically Supported Treatments? A Meta-Analysis Testing Equivalence of Outcomes
Abstract
Objective:
Pharmacotherapy, cognitive-behavioral therapy (CBT), and psychodynamic therapy are most frequently applied to treat mental disorders. However, whether psychodynamic therapy is as efficacious as other empirically supported treatments is not yet clear. Thus, for the first time the equivalence of psychodynamic therapy to treatments established in efficacy was formally tested. The authors controlled for researcher allegiance effects by including representatives of psychodynamic therapy and CBT, the main rival psychotherapeutic treatments (adversarial collaboration).
Method:
The authors applied the formal criteria for testing equivalence, implying a particularly strict test: a priori defining a margin compatible with equivalence (g=0.25), using the two one-sided test procedure, and ensuring the efficacy of the comparator. Independent raters assessed effect sizes, study quality, and allegiance. A systematic literature search used the following criteria: randomized controlled trial of manual-guided psychodynamic therapy in adults, testing psychodynamic therapy against a treatment with efficacy established for the disorder under study, and applying reliable and valid outcome measures. The primary outcome was “target symptoms” (e.g., depressive symptoms in depressive disorders).
Results:
Twenty-three randomized controlled trials with 2,751 patients were included. The mean study quality was good as demonstrated by reliable rating methods. Statistical analyses showed equivalence of psychodynamic therapy to comparison conditions for target symptoms at posttreatment (g=−0.153, 90% equivalence CI=−0.227 to −0.079) and at follow-up (g=−0.049, 90% equivalence CI=−0.137 to 0.039) because both CIs were included in the equivalence interval (−0.25 to 0.25).
Conclusions:
Results suggest equivalence of psychodynamic therapy to treatments established in efficacy. Further research should examine who benefits most from which treatment.
Mental disorders are common and represent a significant public health concern (1). They are associated with a high negative impact on all areas of life and cause more burden of disease than other illnesses (2). Up to 45% of primary care patients have been found to have at least one mental disorder (3). Current reviews and practice guidelines regard specific forms of psychotherapy (e.g., cognitive-behavioral therapy [CBT], interpersonal therapy) and specific forms of pharmacotherapy as empirically supported for the treatment of common mental disorders (4, 5). Psychodynamic therapy, another method of psychotherapy, has a long tradition, and a considerable proportion of therapists report a primary psychodynamic orientation (6, 7), with some differences between countries.
Thus, the efficacy of psychodynamic therapy is of high relevance to patients, therapists, and the health care system in general. For common mental disorders, evidence for psychodynamic therapy is available (8). A Cochrane review investigating the efficacy of psychodynamic therapy for common mental disorders found psychodynamic therapy to be superior to control conditions (waiting list, treatment as usual, minimal contact) (9). In addition, several meta-analyses found no statistically significant differences when psychodynamic therapy was compared with other forms of psychotherapy in patients with anxiety or depressive disorders (10, 11). Other meta-analyses, however, reported psychodynamic therapy to be inferior to CBT, which is regarded as an established treatment (12–14). These inconsistent findings and the frequent use of psychodynamic therapy suggest a need to examine whether psychodynamic therapy is as efficacious as treatments with established efficacy.
A comparison with a rival treatment can be considered a particularly strict test because both specific (e.g., techniques, ingredients, and procedures) and nonspecific (e.g., expectation and attention) factors are controlled for (15). Comparisons of this kind are rare in the whole field of medicine (16). Such a test is even more strict if the rival treatment has been established in efficacy. Comparisons for which no differences in outcomes are expected are referred to as equivalence trials (17, 18). eAppendix A, in the data supplement that accompanies the online edition of this article, highlights the differences between equivalence testing and the far more common superiority testing.
Of note, in psychotherapy research, presently no single individual study seems to exist that is sufficiently powered to test for equivalence if a small margin is used as compatible with equivalence (8, 19). In contrast, meta-analyses may yield a higher power than individual studies and are therefore especially suitable to test for equivalence; the logic of equivalence testing as outlined in eAppendix A in the data supplement applies to meta-analyses, as well. Nevertheless, despite available guidelines (20), equivalence testing in meta-analysis is almost nonexistent.
Applying the procedures of equivalence testing, we investigated whether psychodynamic therapy is equivalent in outcome to treatments established in efficacy for the respective disorder (i.e., other forms of psychotherapy and pharmacotherapy).
Method
Study Design and Choice of Equivalence Margin
We conducted the meta-analysis in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (21). A prespecified protocol is registered at PROSPERO (International Prospective Register of Systematic Reviews; registration number: CRD42016038161).
The design, study selection, and statistical analyses follow the logic of equivalence testing; that is, defining a margin, searching for studies with one or more established comparators, and applying the two one-sided test procedure (17, 20).
For defining an equivalence margin (i.e., “the minimum difference between two groups that would be important enough to make the two groups nonequivalent” [20, p. 554]), there are no generally accepted standards. What is considered to be a clinically meaningful minimum difference relative to a clinically irrelevant minimum difference depends on the field of research. If the outcome is a vital event, such as mortality, smaller margins are required than in other fields (18). Small margins make it more difficult to establish equivalence (17). As emphasized by Walker and Nowacki, the equivalence margin not only determines the result of the test but also gives scientific credibility to a study: “The value and impact of a study depend on how well the equivalence margin can be justified in terms of relevant evidence and sound clinical considerations” (17, p. 194).
Several proposals for choosing an equivalence margin in the context of mental disorders have been made (Table 1). Suggestions for the maximum difference in outcomes considered to be clinically irrelevant range from d=0.24 to d=0.60. The smallest margin was suggested by Cuijpers and colleagues (d=0.24) for the treatment of depression (22). Thus, for our study across a range of mental disorders, we decided to use a margin of 0.25 (i.e., an equivalence interval of −0.25 to 0.25), corresponding to a small effect size.
Study Type | Cohen’s d |
---|---|
Proposals or guidelines | |
Chambless and Hollon (15) | 0.5 |
National Institute for Clinical Excellence (42) | 0.5 |
Cuijpers et al. (22) | 0.24 |
Leichsenring et al. (8) | 0.5 |
Trials addressing noninferiority or equivalence | |
Hedman et al. (43) | ≈0.39–0.50 |
Norton (44); Norton and Barrera (45) | 0.6 |
Driessen et al. (46) | 0.3 |
Tyrer et al. (47) | 0.26 |
Herpertz-Dahlmann et al. (48) | 0.52 |
Meuldijk et al. (49) | ≈0.27 |
Richards et al. (50) | 0.35 |
Connolly Gibbons et al. (51) | ≈0.29–0.45 |

Table 1. Cutoffs for a Clinically Irrelevant Effect as Proposed in the Literature or Applied in Psychotherapy Trials
Selection Criteria and Search Strategy
Participants were a sufficiently described adult population treated for a specific mental disorder according to DSM-III or later versions or ICD-10 criteria. Organic mental disorders were excluded.
Interventions were manual-guided forms of psychodynamic therapy, a talking therapy operating on an interpretive-supportive continuum (23). Interpretive interventions focus on conscious and unconscious processes or conflicts and aim at enhancing the patient’s insight in repetitive patterns assumed to sustain his or her problems. Supportive interventions aim to strengthen abilities (“ego functions”) that are (temporarily) not accessible to a patient because of acute stress or because they are not sufficiently developed. Characteristic techniques of psychodynamic psychotherapy include fostering a helpful therapeutic relationship, focusing on affect and expression of emotion, exploring avoidance patterns and resistance to change, identifying recurring themes, discussing past experiences, exploring fantasies and dreams, and focusing on interpersonal issues. Moreover, processes of transference and countertransference are taken into account and interpreted, if suitable (23, 24).
Comparators were bona fide methods of psychotherapy or pharmacotherapy with efficacy demonstrated for the respective disorder according to published criteria and guidelines (4, 5, 15). For specific or new treatments not yet included in available listings, we performed our own searches for evidence. Following current standards for a designation as efficacious (15), we regarded at least two randomized controlled trials carried out in independent research settings as necessary, in which the respective treatment proved to be efficacious.
The primary outcome was “target symptoms,” which included measures specific to the mental disorder under study (e.g., measures of depressive symptoms in depressive disorders or of social anxiety in social anxiety disorder). As secondary outcomes, general psychiatric symptoms and psychosocial functioning (i.e., social, occupational, and personality functioning) were examined. Posttreatment and follow-up assessments were considered.
The meta-analysis included randomized controlled trials in which psychodynamic therapy was compared with a treatment established in efficacy using reliable and valid outcome measures. For intervention and comparison groups, only manual-guided forms of psychotherapy were included. A manual or manual-like guideline is a clear description of a treatment that includes the theoretical background, a set of technical recommendations, and case examples. Concurrent medication was allowed, provided that it was given in all treatment arms. Studies in which psychodynamic therapy was systematically combined with another treatment (e.g., psychodynamic therapy plus pharmacotherapy) were excluded. To ensure effective randomization, a minimum sample size of N=20 patients per treatment group was required for inclusion (25). Treatments must have been terminated (i.e., no ongoing treatments were permitted).
The following search strategy was applied (the complete search strategy can be found in eAppendix C in the online data supplement): systematic searches in the electronic databases PubMed, PsycINFO, and CENTRAL; manual searches in relevant systematic reviews, textbooks, and reference lists of included studies; and communication with experts in the field, which included a search in a comprehensive, published, and regularly updated list (the so-called Lilliengren List) of all previously identified randomized controlled trials on psychodynamic therapy (http://w3.psychology.su.se/staff/peli/RCTs_of_PDT.pdf). No language or date limits were applied. The main electronic search was conducted on March 23, 2016. Updated searches were regularly performed until December 2016.
Study Selection and Data Extraction
After completing literature searches, all hits (N=5,142) were saved in the citation management program EndNote. After removal of duplicates (N=1,216), two authors (C.S., F.L.) independently screened titles and abstracts of the remaining 3,926 articles according to the predefined selection criteria. All potentially relevant articles were then retrieved for full-text review (N=62), which resulted in the inclusion of 23 randomized controlled trials (and a total of 30 articles, of which seven presented follow-up data or additional outcomes; see Table 2 and eAppendixes B and D in the online data supplement). To retrieve study details, a data extraction form was used. Effect sizes included in the main analysis (i.e., target symptoms at posttreatment) were independently extracted and calculated by two authors each. To determine interrater reliability for the calculation of effect sizes, the intraclass correlation coefficient (ICC) was calculated with SPSS, version 23 (SPSS, Chicago), using a two-way mixed model in combination with the absolute agreement type, single measures. Interrater reliability proved to be excellent (ICC=0.99). Disagreements in the search process and effect size calculation were resolved by consensus or by consulting a third expert. Masking of raters regarding authors of primary studies was not done because evidence suggests that such masking is unnecessary for meta-analyses (26).
| Study | Diagnosis | Treatment Conditions | Subjects Included in Analysis at Posttreatment (N) | Sessions (N) | Outcome Measures | Longest Follow-Up | RCT-PQRS Total Score | MARS Total Score |
|---|---|---|---|---|---|---|---|---|
| Depressive disorders | | | | | | | | |
| Barber et al. (52) | MDD (DSM-IV), HAM-D score ≥14 | 1. PDT | 51 | 20 | T | None | 41 | 0 |
| | | 2. ADM | 55 | — | | | | |
| Connolly Gibbons et al. (51) | MDD (DSM-IV), QIDS score ≥11 | 1. PDT | 118 | 16 | T, P | None | 39 | 1 |
| | | 2. CBT | 119 | 16 | | | | |
| Cooper et al. (53) | Postpartum MDD (DSM-III), EPDS score ≥12 | 1. PDT | 48 | 10 | T | 55.5 months | 35 | 0 |
| | | 2. CBT | 42 | 10 | | | | |
| Driessen et al. (46) | MDD (DSM-IV), HAM-D score ≥14 | 1. PDT | 177 | 11 | T | 12 months | 39 | 0 |
| | | 2. CBT | 164 | 11 | | | | |
| Gallagher-Thompson and Steffen (54) | Depressed family caregivers; major, minor, or intermittent depressive disorder (RDC); BDI score ≥10 | 1. PDT | 21 | 20 | T | 3 months | 25.5 | –1 |
| | | 2. CBT | 31 | 20 | | | | |
| Salminen et al. (55) | MDD (DSM-IV), HAM-D score ≥15 | 1. PDT | 26 | 16 | T, P | 8 months | 27.5 | 1 |
| | | 2. ADM | 25 | — | | | | |
| Shapiro et al. (56) | MDD (DSM-III), BDI score >16 | 1. PDT–8 | 30 | 8 | T, G, P | 12 months | 34 | 0 |
| | | 2. PDT–16 | 28 | 16 | | | | |
| | | 3. CBT–8 | 29 | 8 | | | | |
| | | 4. CBT–16 | 30 | 16 | | | | |
| Thompson et al. (57) | Depressed elders, MDD (RDC), HAM-D score ≥14, BDI score ≥17 | 1. PDT | 30 | 16–20 | T, G, P | 24 months | 22 | –1 |
| | | 2. CBT | 31 | 16–20 | | | | |
| | | 3. BT | 30 | 16–20 | | | | |
| Anxiety disorders | | | | | | | | |
| Bögels et al. (58) | Social anxiety disorder (DSM-IV) | 1. PDT | 19 | 31 | T, G | 12 months | 34 | –2 |
| | | 2. CBT | 25 | 20 | | | | |
| Leichsenring et al. (59) | Social anxiety disorder (DSM-IV) | 1. PDT | 207 | 26 | T, G, P | 24 months | 46.5 | 0 |
| | | 2. CBT | 209 | 26 | | | | |
| Leichsenring et al. (60) | Generalized anxiety disorder (DSM-IV) | 1. PDT | 28 | 29 | T, G, P | 12 months | 37 | 0 |
| | | 2. CBT | 29 | 29 | | | | |
| Milrod et al. (61) | Panic disorder (DSM-IV) | 1. PDT | 81 | 19–24 | T | None | 44 | 0 |
| | | 2. CBT | 81 | 19–24 | | | | |
| Posttraumatic stress disorder (PTSD) | | | | | | | | |
| Brom et al. (62) | PTSD (DSM-III) | 1. PDT | 25 | 19 | T, G, P | 3 months | 22 | 0 |
| | | 2. CBT | 27 | 15 | | | | |
| Eating disorders | | | | | | | | |
| Garner et al. (63) | Bulimia nervosa (modified DSM-III criteria and Russell criteria) | 1. PDT | 25 | 18 | T, G, P | None | 29.5 | –1 |
| | | 2. CBT | 25 | 18 | | | | |
| Poulsen et al. (64) | Bulimia nervosa (DSM-IV) | 1. PDT | 34 | 72 | T, G, P | None | 36.5 | 0 |
| | | 2. CBT | 36 | 20 | | | | |
| Tasca et al. (65) | Binge eating disorder (DSM-IV) | 1. G-PIP | 37 | 12 | T, G, P | 12 months | 37 | 1 |
| | | 2. G-CBT | 37 | 12 | | | | |
| Zipfel et al. (66) | Full or subsyndromal anorexia (DSM-IV) | 1. PDT | 80 | 40 | T | 12 months | 39.5 | 0 |
| | | 2. CBT | 80 | 45 | | | | |
| Substance-related disorders | | | | | | | | |
| Crits-Christoph et al. (67) | Cocaine dependence (current or in early partial remission, DSM-IV) | 1. PDT | 91 | 16 | T, G, P | 6 months | 44 | |
| | | 2. CBT | 97 | 15 | | | | 0 (PDT compared with CBT) |
| | | 3. IDC | 92 | 12 | | | | 0 (PDT compared with IDC) |
| Woody et al. (68) | Opiate addiction (DSM-III and RDC) | 1. PDT | 31 | 12 | T, G, P | 6 months | 31 | 0 |
| | | 2. CBT | 34 | 9 | | | | |
| Personality disorders | | | | | | | | |
| Clarkin et al. (69) | Borderline personality disorder (DSM-IV) | 1. TFP | 23 | ≈84 | T, G, P | None | 29 | 1 (TFP compared with DBT) |
| | | 2. PDT | 22 | ≈42 | | | | 0 (PDT compared with DBT) |
| | | 3. DBT | 17 | ≈84 | | | | |
| Emmelkamp et al. (70) | Avoidant personality disorder (DSM-IV) | 1. PDT | 22 | 19 | T, G, P | None | 26 | –1 |
| | | 2. CBT | 18 | 18 | | | | |
| Muran et al. (71) | Cluster C personality disorder or personality disorder not otherwise specified (DSM-IV) | 1. BRT | 33 | 30 | T, G, P | 6 months | 34 | 0 (BRT compared with CBT) |
| | | 2. PDT | 22 | 30 | | | | –1 (PDT compared with CBT) |
| | | 3. CBT | 29 | 30 | | | | |
| Svartberg et al. (72) | One or more cluster C personality disorders (DSM-III-R) | 1. PDT | 25 | 40 | T, G, P | 24 months | 36.5 | 0 |
| | | 2. CBT | 25 | 40 | | | | |

Table 2. Characteristics of Studies Included in a Meta-Analysis Comparing Efficacy of Psychodynamic Therapy With Established Treatments
Assessment of Study Quality
Study quality was assessed by use of the Randomized Controlled Trial Psychotherapy Quality Rating Scale (RCT-PQRS) (27). The RCT-PQRS provides an empirical method for evaluating the quality of published randomized controlled trials. It contains 24 items rated on a scale from 0 to 2, yielding a maximum score of 48. A quality score of 24 or above is considered to represent a cutoff for a “reasonably well done study” (28, p. 24). The RCT-PQRS was found to have good interrater reliability, internal consistency, and validity (27). RCT-PQRS ratings for each study were performed by at least two independent raters. Interrater agreement for the total score was excellent (ICC=0.82). The average total score of the respective independent ratings was used in the statistical analyses.
Assessment of Allegiance
It has been repeatedly shown that results in psychotherapy research might be heavily biased by researchers’ allegiances (29, 30). Despite these findings, allegiance is rarely controlled for, either in primary studies or in meta-analyses (31). We took allegiance into account on both levels.
First, to control for possible allegiance effects and to minimize bias at the level of the meta-analysis itself, we implemented a model of adversarial collaboration by including proponents of both psychodynamic therapy (C.S., F.L., and T.M.) and CBT (J.H. and S.R.), the treatment with which psychodynamic therapy was most often compared in the present meta-analysis (k=21/23). J.H. is a CBT researcher, and S.R. is a specialist in research methods and research synthesis who, although placing special emphasis on research on psychodynamic therapy, has been formally trained in CBT.
Second, researcher allegiances often find expression in design features such as poor implementation of unfavored treatments or uncontrolled therapist allegiance (29, 32). To assess allegiance on the level of included studies, we modified a scale used in a previous study by one of us (T.M.) (29). The scale consists of five items assessing allegiance on four levels (the complete scale can be found in eAppendix E in the online data supplement): researcher allegiance (two items), therapist allegiance, trainer allegiance, and supervisor allegiance.
Items were assessed separately for each treatment condition based on the information provided in the respective articles. For each condition, scores were added, and the difference in scores between the conditions was calculated. The scale yields a score from 0 (balanced allegiance) to 4 or −4 (strong allegiance toward one treatment). Each study was judged by two independent raters. Interrater agreement was excellent (ICC=0.83). Disagreements were resolved by consensus.
Statistical Analyses
Statistical analyses were performed with Comprehensive Meta-Analysis, version 3. We aggregated effect size estimates across studies, adopting a random effects model, using maximum likelihood estimation to estimate between-study variability (tau2). Between-group effect sizes for psychodynamic therapy compared with established comparators were calculated for the primary outcome (target symptoms) as well as for two other outcome areas: general psychiatric symptoms and psychosocial functioning. A complete list of assessed outcomes and assignment of outcomes to outcome areas can be found in eAppendix F in the data supplement. Whenever possible, we used the most basic effect size estimate (i.e., unadjusted values). For continuous outcomes, Hedges’ g correcting for small sample bias was determined by calculating the difference of the mean scores of the respective treatments at posttreatment or at follow-up and dividing it by the pooled standard deviation. If means and standard deviations were not reported or could not be calculated, we used dichotomous data (e.g., remission or response). When continuous and categorical data of the same outcome instrument were provided, only the continuous data were included to avoid redundancies. Whenever a study reported data of more than one outcome instrument for an area of outcome (e.g., target symptoms), we assessed effect sizes separately for each instrument and calculated a combined effect to assess the overall outcome. In case continuous and dichotomous data were available, they were transformed into a common metric (Hedges’ g). When means and standard deviations or dichotomous data to calculate effect sizes were not provided, we contacted the authors of relevant studies (k=1). In case a study included more than two comparison groups, we included pairwise comparisons separately. To avoid “double counts” in the shared intervention group, the shared group N was split in half (33). 
Assessments at the end of treatment and at the latest follow-up were included. Intent-to-treat data were preferred over completer data. All effect sizes were coded in such a way that a positive sign indicated an advantage of psychodynamic therapy.
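The effect size computation described above can be illustrated with a minimal sketch. It assumes the commonly used approximation J = 1 − 3/(4·df − 1) for the small-sample correction; the text does not state which exact correction the meta-analysis software applied. The function name, the `lower_is_better` flag, and the example values are hypothetical and serve only to show the sign convention (positive values favor psychodynamic therapy).

```python
import math

def hedges_g(mean_pdt, sd_pdt, n_pdt, mean_cmp, sd_cmp, n_cmp, lower_is_better=True):
    """Between-group Hedges' g, coded so that positive values favor psychodynamic therapy.

    Assumption: small-sample correction J = 1 - 3 / (4 * df - 1), a standard approximation.
    """
    df = n_pdt + n_cmp - 2
    # Pooled standard deviation of the two treatment groups
    pooled_sd = math.sqrt(((n_pdt - 1) * sd_pdt ** 2 + (n_cmp - 1) * sd_cmp ** 2) / df)
    # For symptom scales (lower score = better), a lower PDT mean yields a positive sign
    diff = (mean_cmp - mean_pdt) if lower_is_better else (mean_pdt - mean_cmp)
    d = diff / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return correction * d

# Hypothetical posttreatment symptom scores (not taken from any included trial)
g_example = hedges_g(mean_pdt=12.0, sd_pdt=6.0, n_pdt=30,
                     mean_cmp=10.0, sd_cmp=6.0, n_cmp=30)
print(round(g_example, 3))
```

In this hypothetical example the psychodynamic group ends with higher symptom scores, so g is negative, that is, in favor of the comparator.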
To test equivalence, we applied the two one-sided test procedure (see also eAppendix A in the online data supplement) (17, 20) using a prespecified equivalence interval of −0.25 to 0.25 at a significance level of 0.05 for each of the two one-sided tests (17). Corresponding to the two one-sided tests, a 90% equivalence confidence interval (CI) was calculated according to ES ± zα × SE, with ES being the mean pooled effect size, SE the standard error of ES, and zα=1.645 (20). If the CI lies entirely within the prespecified equivalence interval, the null hypothesis of nonequivalence is rejected and equivalence is concluded (20). Here, in contrast to superiority testing, a significant result indicates equivalence.
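The two one-sided test procedure can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' actual computation: the function names are hypothetical, a normal approximation is used for the p value, and the standard error in the example (0.045) is back-calculated from the reported posttreatment CI rather than taken from the text.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tost_equivalence(es, se, margin=0.25, z_alpha=1.645):
    """Two one-sided tests for a pooled effect size `es` with standard error `se`.

    Returns the 90% equivalence CI, the larger of the two one-sided p values,
    and whether the CI lies entirely inside (-margin, +margin).
    """
    lower = es - z_alpha * se
    upper = es + z_alpha * se
    # Test 1, H0: true effect <= -margin; test 2, H0: true effect >= +margin
    p_lower = 1.0 - normal_cdf((es + margin) / se)
    p_upper = normal_cdf((es - margin) / se)
    p_value = max(p_lower, p_upper)
    equivalent = (lower > -margin) and (upper < margin)
    return lower, upper, p_value, equivalent

# Target symptoms at posttreatment: g = -0.153; SE = 0.045 is an assumption
# derived from the reported CI, (-0.079 - (-0.227)) / (2 * 1.645).
lower, upper, p_value, equivalent = tost_equivalence(es=-0.153, se=0.045)
print(round(lower, 3), round(upper, 3), round(p_value, 3), equivalent)
```

With these inputs the sketch reproduces the reported interval of −0.227 to −0.079 and a p value of about 0.016. The decisive test is the one against the lower margin, because the pooled effect lies on the side favoring the comparators.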
Heterogeneity was assessed by chi-square heterogeneity tests and I2 statistics. The I2 statistic expresses the ratio of true to observed variance with values of 25%, 50%, and 75%, referred to as low, moderate, or high heterogeneity, respectively. Publication bias was assessed by testing for funnel plot asymmetry and by means of the Duval and Tweedie trim and fill procedure.
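For readers unfamiliar with these statistics, the following minimal sketch shows how Cochran's Q and I2 are conventionally derived from study-level effect sizes and their variances. The function name and the input values are hypothetical; the formula I2 = max(0, (Q − df)/Q) × 100 is the standard definition and is assumed, not quoted from the text.

```python
def heterogeneity(effects, variances):
    """Cochran's Q and the I2 statistic (in percent) from effect sizes and their variances."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the fixed-effect pooled estimate
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # Share of observed variation beyond what sampling error alone would produce
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2

# Hypothetical values for illustration only
q_stat, i2_stat = heterogeneity([0.0, 0.5], [0.01, 0.01])
print(q_stat, i2_stat)
```

When Q does not exceed its degrees of freedom, I2 is truncated at 0, which is how an I2 of 0 can arise even when study estimates are not identical.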
Moderator analyses were performed for a range of variables by means of meta-regressions using maximum likelihood estimation. The following moderators were studied: year of publication, recruitment method (community compared with clinical compared with mixed), intent-to-treat compared with completer analyses, type of diagnosis, study quality (total score of the RCT-PQRS), allegiance, number of sessions in the psychodynamic therapy groups, patient-per-therapist ratio (as an indicator for bias from therapist effects), and average sample size per group to investigate the presence of small study bias (34).
Results
Characteristics of Included Studies
Literature searches yielded 23 randomized controlled trials, published between 1983 and 2016, that fulfilled the a priori set selection criteria (Table 2). These studies included data on 2,751 patients. Twenty-one randomized controlled trials compared one or more forms of psychodynamic therapy with another form of psychotherapy, which in all cases was a method of CBT. Comparisons with other forms of psychotherapy, such as interpersonal therapy, were not identified. The remaining two studies compared psychodynamic therapy with a selective serotonin reuptake inhibitor or with a serotonin-norepinephrine reuptake inhibitor in the treatment of depression. The largest group of studies (k=8) investigated participants with a depressive disorder, followed by anxiety disorders (k=4), eating disorders (k=4), personality disorders (k=4), substance dependence (k=2), and posttraumatic stress disorder (k=1). With one exception (an investigation studying group psychotherapy), all studies employed psychodynamic therapy in an individual face-to-face format.
Equivalence Testing: Psychodynamic Therapy Relative to Established Comparators
The pooled between-group difference in outcome for target symptoms at posttreatment was g=−0.153, indicating a small difference in favor of comparison treatments (Figure 1, Table 3). The 90% equivalence CI for this contrast was −0.227 to −0.079. Because this CI was included in the prespecified equivalence interval (−0.25 to 0.25), the null hypothesis of nonequivalence was rejected, and the alternative hypothesis of equivalence was accepted (p=0.016). Heterogeneity was very low (I2=0, tau2=0.0018). Similar results were found for target symptoms at follow-up (k=16, pooled difference g=−0.049, 90% equivalence CI=−0.137 to 0.039, p=0.0001; I2=7.12, tau2=0).
Symptom and Psychosocial Functioning Measures | Number of Studies (k) | Hedges’ g | 90% Equivalence CI | p | Outcome of Equivalence Test | I2 (%) | tau2 |
---|---|---|---|---|---|---|---|
All studies | |||||||
Target symptoms (posttreatment) | 23 | –0.153 | –0.227 to –0.079 | 0.016 | Equivalent | 0 | 0.0018 |
Target symptoms (follow-up) | 16 | –0.049 | –0.137 to 0.039 | 0.0001 | Equivalent | 7.12 | 0 |
General psychiatric symptoms (posttreatment) | 15 | –0.116 | –0.211 to –0.020 | 0.01 | Equivalent | 0 | 0 |
General psychiatric symptoms (follow-up) | 10 | –0.014 | –0.121 to 0.093 | 0.0001 | Equivalent | 0 | 0 |
Psychosocial functioning (posttreatment) | 16 | –0.088 | –0.192 to 0.012 | 0.005 | Equivalent | 12.51 | 0.0108 |
Psychosocial functioning (follow-up) | 9 | 0.165 | –0.027 to 0.358 | 0.23 | Not equivalent | 57.59 | 0.0614 |
Cognitive-behavioral therapy only | |||||||
Target symptoms (posttreatment) | 21 | –0.158 | –0.236 to –0.080 | 0.026 | Equivalent | 0 | 0.0029 |
Target symptoms (follow-up) | 15 | –0.046 | –0.135 to 0.043 | 0.0001 | Equivalent | 12.67 | 0 |
General psychiatric symptoms (posttreatment) | 15 | –0.116 | –0.211 to –0.020 | 0.01 | Equivalent | 0 | 0 |
General psychiatric symptoms (follow-up) | 10 | –0.014 | –0.121 to 0.093 | 0.0001 | Equivalent | 0 | 0 |
Psychosocial functioning (posttreatment) | 15 | –0.087 | –0.195 to 0.021 | 0.006 | Equivalent | 18.17 | 0.0122 |
Psychosocial functioning (follow-up) | 9 | 0.165 | –0.027 to 0.358 | 0.23 | Not equivalent | 57.59 | 0.0614 |

Table 3. Between-Group Effects, 90% Equivalence CI, and Observed Heterogeneity of Psychodynamic Relative to Established Comparison Treatments for Target Symptoms, General Psychiatric Symptoms, and Psychosocial Functioning at Posttreatment and at Follow-Up
Equivalence was also shown for the other areas of outcome at posttreatment and follow-up (Table 3), except for psychosocial functioning at follow-up. For the latter, psychodynamic therapy was not statistically equivalent to comparison treatments but was nominally better (g=0.165, 90% equivalence CI=−0.027 to 0.358, I2=57.59), suggesting superiority of psychodynamic therapy. However, a post hoc test of superiority did not yield statistical significance (p=0.162). Excluding randomized controlled trials in which the comparison condition consisted of pharmacotherapy (k=2) did not change results, implying equivalence in outcome of psychodynamic therapy and CBT (Table 3).
Study Quality and Allegiance
Results for study quality and allegiance ratings can be found in Table 2. With a mean score of 35.3 (SD=5.7), the vast majority of studies (k=21/23, or 91%) clearly were above the RCT-PQRS cutoff score of 24. For two studies with scores of 22, quality was below the RCT-PQRS cutoff.
Most of the studies achieved a balanced allegiance score of 0 (k=16); that is, no indicators of a preference for one of the tested treatments were found. In k=6 of included studies, we found a minor allegiance toward the comparison treatment (score of −1 [k=5] or −2 [k=1]), while we found a minor allegiance toward psychodynamic therapy in k=4 studies (score of 1). Thus, in cases where some indication of allegiance was found, it was only minor (i.e., only one or two of four indicators were positive).
Moderator Analyses
According to moderator analyses performed for the main analysis (target symptoms at posttreatment), no moderator was significantly related to outcome (p≥0.19, see Table 4), implying, for example, that the results are valid across the various disorders (no effect of diagnosis).
Moderator | Significance of Moderator | Slope | 95% CI Slope |
---|---|---|---|
Year of publication | p=0.87 | –0.0008 | –0.01 to 0.01 |
Recruitment (community, clinical, or mixed) | p=0.28 | 0.062 | –0.05 to 0.17 |
ITT compared with completer data | p=0.77 | 0.027 | –0.16 to 0.21 |
Type of diagnosis | p=0.93 | 0.003 | –0.07 to 0.07 |
Number of sessions in psychodynamic groups | p=0.59 | –0.002 | –0.009 to 0.005 |
Average sample size per group | p=0.19 | –0.0008 | –0.002 to 0.0004 |
Patient-per-therapist ratio | p=0.35 | 0.007 | –0.01 to 0.02 |
Study quality (RCT-PQRS total score) | p=0.38 | –0.006 | –0.02 to 0.01 |
Allegiance | p=0.91 | 0.008 | –0.14 to 0.16 |
Table 4. Results of Moderator Analyses Based on Target Symptoms at Posttreatment
Publication Bias
Egger’s regression test did not indicate funnel plot asymmetry (intercept=0.67, 95% CI=−0.39 to 1.73, p=0.20). Duval and Tweedie’s trim and fill procedure indicated two missing studies on the left of the mean (i.e., in favor of comparisons). Adjusting for publication bias by adding these two imputed studies resulted in an adjusted pooled effect size of g=−0.176. To assess equivalence after correcting for publication bias, the standard error (SE) was obtained from the 95% CI of the adjusted estimate via the following formula: SE=(upper limit−lower limit)/3.92=0.043 (33). The adjustment did not change the main result, as the 90% equivalence CI (−0.246 to −0.106) was still included in the equivalence interval (p=0.04).
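As a worked check of the adjusted interval, the arithmetic can be written out as follows. The sketch assumes that the upper and lower limits in the SE formula are those of the 95% CI of the adjusted estimate (the divisor 3.92 equals 2 × 1.96); these limits are not reported in the text, so only the resulting SE of 0.043 is used.

$$
\mathrm{SE} = \frac{\text{upper limit} - \text{lower limit}}{2 \times 1.96} = \frac{\text{upper limit} - \text{lower limit}}{3.92} = 0.043
$$

$$
g \pm z_{\alpha} \times \mathrm{SE} = -0.176 \pm 1.645 \times 0.043 \approx -0.176 \pm 0.071
$$

This yields an interval of roughly −0.247 to −0.105, which matches the reported −0.246 to −0.106 up to rounding of the SE. Both limits lie inside the equivalence interval of −0.25 to 0.25, although the lower limit comes close to the margin.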
Discussion
To our knowledge, this meta-analysis is the first in psychotherapy research to systematically investigate equivalence of a specific form of psychotherapy to established treatments by formally applying the logic of equivalence testing. Our meta-analysis found psychodynamic therapy to be as efficacious as other treatments with established efficacy, including CBT. Because we used high methodological standards (e.g., controlling for researcher allegiance, applying the logic of equivalence testing, using one of the smallest margins ever suggested as compatible with equivalence, and using treatments established in efficacy as comparators), the results of this meta-analysis can be expected to be robust. However, the number of studies that could be included is still limited, and further research is required.
Several conventional meta-analyses reported no differences in outcome between psychodynamic therapy and other treatments (e.g., 10, 11), whereas other conventional meta-analyses reported CBT to be superior to psychodynamic therapy (12–14). It is of note, however, that these previous meta-analyses did not apply the logic of equivalence testing, did not include only established comparators, and did not adequately control for researcher allegiance, thus allowing only for less definite conclusions. Our results are consistent with the conventional meta-analyses that reported no differences in outcome between psychodynamic therapy and other treatments (10, 11), adding more robust data to support the notion of equivalence between treatments. It is of note that the meta-analyses reporting inferiority of psychodynamic therapy showed both some differences in design and several methodological shortcomings (35). For example, Tolin (13) applied less strict inclusion criteria than our meta-analysis did, which resulted in the inclusion of 11 randomized controlled trials that did not fulfill our inclusion criteria. Thus, the overlap in studies between Tolin’s and our meta-analysis is small (k=7). Furthermore, according to Tolin’s own analysis, most of the results in favor of CBT compared with psychodynamic therapy were not robust against file drawer effects (13). The two further meta-analyses that found CBT to be superior to psychodynamic therapy are both based on only three studies of psychodynamic therapy and are therefore not representative (12, 14). Further shortcomings of these meta-analyses were discussed by Wampold et al. (35).
Our findings are limited with regard to pharmacotherapy because only two studies of this treatment were included. Previous meta-analyses concluded that psychotherapy and pharmacotherapy may be equally efficacious (36), suggesting that this may also be true for psychodynamic therapy regarding the mental disorders studied here. Furthermore, no randomized controlled trials comparing psychodynamic therapy with other forms of psychotherapy, such as interpersonal therapy, were identified. Like all meta-analyses, the present one is limited by the nature of the studies included. To the extent that some of the studies comparing psychodynamic therapy with CBT or with medication may have recruited, at least in part, patients who do not respond well to treatment, the literature may be biased toward the finding of no differences between these treatments. However, the between-studies variance was very low, suggesting no significant effects of low responsiveness.
Although efficacious treatments for mental disorders are available, it is important to note that, in general, rates of response and remission are not yet satisfactory. For anxiety disorders, for example, a recent review found a mean CBT response rate of 49.5% (37). For depressive disorders, response rates are comparable, but remission rates are even lower (38). Thus, at present, none of the available treatments can claim to be a panacea. There clearly is room for improvement. Because therapist effects seem to have a stronger impact on outcome than the treatments being compared, they need to be taken into account; one promising strategy for improving treatments is therefore to enhance therapist training and, eventually, therapist outcomes (39). Furthermore, different patients may benefit from different approaches, which is why a shift from one empirically supported treatment to another may be helpful in case of nonresponse (40, 41).
1 : Trillion-dollar brain drain. Nature 2011; 478:15
2 : Disability and functional burden of disease because of mental in comparison to somatic disorders in general practice patients. Eur Psychiatry 2015; 30:789–792
3 : Prevalence of mental disorders in primary care: results from the diagnosis and treatment of mental disorders in primary care study (DASMAP). Soc Psychiatry Psychiatr Epidemiol 2010; 45:201–210
4 : A Guide to Treatments That Work. New York, Oxford University Press, 2015
5 Society of Clinical Psychology (Division 12 of the American Psychological Association): Research-supported psychological treatments. 2016
6 : Psychotherapists’ personal identities, theoretical orientations, and professional relationships: elective affinity and role adjustment as modes of congruence. Psychother Res 2013; 23:718–731
7 : Psychologists conducting psychotherapy in 2012: current practices and historical trends among Division 29 members. Psychotherapy (Chic) 2013; 50:490–495
8 : Psychodynamic therapy meets evidence-based medicine: a systematic review using updated criteria. Lancet Psychiatry 2015; 2:648–660
9 : Short-term psychodynamic psychotherapies for common mental disorders. Cochrane Database Syst Rev 2014; (7):CD004687
10 : The efficacy of short-term psychodynamic psychotherapy for depression: a meta-analysis update. Clin Psychol Rev 2015; 42:1–15
11 : A meta-analytic review of psychodynamic therapies for anxiety disorders. Clin Psychol Rev 2014; 34:309–323
12 : Is the Dodo bird endangered in the 21st century? A meta-analysis of treatment comparison studies. Clin Psychol Rev 2014; 34:519–530
13 : Is cognitive-behavioral therapy more effective than other therapies? A meta-analytic review. Clin Psychol Rev 2010; 30:710–720
14 : Psychological and pharmacological interventions for social anxiety disorder in adults: a systematic review and network meta-analysis. Lancet Psychiatry 2014; 1:368–376
15 : Defining empirically supported therapies. J Consult Clin Psychol 1998; 66:7–18
16 : Meta-research: the art of getting it wrong. Res Synth Methods 2010; 1:169–184
17 : Understanding equivalence and noninferiority testing. J Gen Intern Med 2011; 26:192–196
18 : Reporting of noninferiority and equivalence randomized trials: extension of the CONSORT 2010 statement. JAMA 2012; 308:2594–2604
19 : Are all psychotherapies equally effective in the treatment of adult depression? The lack of statistical power of comparative outcome studies. Evid Based Ment Health 2016; 19:39–42
20 : Using significance tests to evaluate equivalence between two experimental groups. Psychol Bull 1993; 113:553–565
21 : The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ 2009; 339:b2700
22 : What is the threshold for a clinically relevant effect? The case of major depressive disorders. Depress Anxiety 2014; 31:374–378
23 : Long-Term Psychodynamic Psychotherapy: A Basic Text. Washington, DC, American Psychiatric Association Publishing, 2004
24 : The efficacy of psychodynamic psychotherapy. Am Psychol 2010; 65:98–109
25 : Random sampling, randomization, and equivalence of contrasted groups in psychotherapy outcome research. J Consult Clin Psychol 1989; 57:131–137
26 : Does blinding of readers affect the results of meta-analyses? University of Pennsylvania Meta-analysis Blinding Study Group. Lancet 1997; 350:185–186
27 : A new scale for assessing the quality of randomized clinical trials of psychotherapy. Compr Psychiatry 2010; 51:319–324
28 : A quality-based review of randomized controlled trials of psychodynamic psychotherapy. Am J Psychiatry 2011; 168:19–28
29 : Testing the allegiance bias hypothesis: a meta-analysis. Psychother Res 2011; 21:670–684
30 : Researcher allegiance in psychotherapy outcome research: an overview of reviews. Clin Psychol Rev 2013; 33:501–511
31 : Disclosure of researcher allegiance in meta-analyses and randomised controlled trials of psychotherapy: a systematic appraisal. BMJ Open 2015; 5:e007206
32 : Biases in research: risk factors for non-replicability in psychotherapy and pharmacotherapy research. Psychol Med 2017; 47:1000–1011
33 Higgins JPT, Green S: Cochrane Handbook for Systematic Reviews of Interventions, Version 5.1.0 (updated March 2011). The Cochrane Collaboration, 2011. www.cochrane-handbook.org
34 : Small study effects in meta-analyses of osteoarthritis trials: meta-epidemiological study. BMJ 2010; 341:c3515
35 : In pursuit of truth: a critical examination of meta-analyses of cognitive behavior therapy. Psychother Res 2017; 27:14–32
36 : Efficacy of pharmacotherapy and psychotherapy for adult psychiatric disorders: a systematic overview of meta-analyses. JAMA Psychiatry 2014; 71:706–715
37 : Response rates for CBT for anxiety disorders: need for standardized criteria. Clin Psychol Rev 2015; 42:72–82
38 : The effects of psychotherapies for major depression in adults on remission, recovery and improvement: a meta-analysis. J Affect Disord 2014; 159:118–126
39 : The Great Psychotherapy Debate: The Evidence for What Makes Psychotherapy Work, 2nd ed. New York, Routledge, 2015
40 : Treating treatment-resistant patients with panic disorder and agoraphobia using psychotherapy: a randomized controlled switching trial. Psychother Psychosom 2015; 84:100–109
41 : What to do when a psychotherapy fails. Lancet Psychiatry 2015; 2:186–190
42 : Depression: Management of Depression in Primary and Secondary Care: Clinical Practice Guideline No 23. London, National Institute for Clinical Excellence, 2004
43 : Internet-based cognitive behavior therapy vs. cognitive behavioral group therapy for social anxiety disorder: a randomized controlled non-inferiority trial. PLoS One 2011; 6:e18001
44 : A randomized clinical trial of transdiagnostic cognitive-behavioral treatments for anxiety disorder by comparison to relaxation training. Behav Ther 2012; 43:506–517
45 : Transdiagnostic versus diagnosis-specific CBT for anxiety disorders: a preliminary randomized controlled noninferiority trial. Depress Anxiety 2012; 29:874–882
46 : The efficacy of cognitive-behavioral therapy and psychodynamic therapy in the outpatient treatment of major depression: a randomized clinical trial. Am J Psychiatry 2013; 170:1041–1050
47 : Clinical and cost-effectiveness of cognitive behaviour therapy for health anxiety in medical patients: a multicentre randomised controlled trial. Lancet 2014; 383:219–225
48 : Day-patient treatment after short inpatient care versus continued inpatient treatment in adolescents with anorexia nervosa (ANDI): a multicentre, randomised, open-label, non-inferiority trial. Lancet 2014; 383:1222–1229
49 : The clinical effectiveness of concise cognitive behavioral therapy with or without pharmacotherapy for depressive and anxiety disorders; a pragmatic randomized controlled equivalence trial in clinical practice. Contemp Clin Trials 2016; 47:131–138
50 : Cost and outcome of behavioural activation versus cognitive behavioural therapy for depression (COBRA): a randomised, controlled, non-inferiority trial. Lancet 2016; 388:871–880
51 : Comparative effectiveness of cognitive therapy and dynamic psychotherapy for major depressive disorder in a community mental health setting: a randomized clinical noninferiority trial. JAMA Psychiatry 2016; 73:904–911
52 : Short-term dynamic psychotherapy versus pharmacotherapy for major depressive disorder: a randomized, placebo-controlled trial. J Clin Psychiatry 2012; 73:66–73
53 : Controlled trial of the short- and long-term effect of psychological treatment of post-partum depression, I: impact on maternal mood. Br J Psychiatry 2003; 182:412–419
54 : Comparative effects of cognitive-behavioral and brief psychodynamic psychotherapies for depressed family caregivers. J Consult Clin Psychol 1994; 62:543–549
55 : Short-term psychodynamic psychotherapy and fluoxetine in major depressive disorder: a randomized comparative study. Psychother Psychosom 2008; 77:351–357
56 : Effects of treatment duration and severity of depression on the effectiveness of cognitive-behavioral and psychodynamic-interpersonal psychotherapy. J Consult Clin Psychol 1994; 62:522–534
57 : Comparative effectiveness of psychotherapies for depressed elders. J Consult Clin Psychol 1987; 55:385–390
58 : Psychodynamic psychotherapy versus cognitive behavior therapy for social anxiety disorder: an efficacy and partial effectiveness trial. Depress Anxiety 2014; 31:363–373
59 : Psychodynamic therapy and cognitive-behavioral therapy in social anxiety disorder: a multicenter randomized controlled trial. Am J Psychiatry 2013; 170:759–767
60 : Short-term psychodynamic psychotherapy and cognitive-behavioral therapy in generalized anxiety disorder: a randomized, controlled trial. Am J Psychiatry 2009; 166:875–881
61 : Psychotherapies for panic disorder: a tale of two sites. J Clin Psychiatry 2016; 77:927–935
62 : Brief psychotherapy for posttraumatic stress disorders. J Consult Clin Psychol 1989; 57:607–612
63 : Comparison of cognitive-behavioral and supportive-expressive therapy for bulimia nervosa. Am J Psychiatry 1993; 150:37–46
64 : A randomized controlled trial of psychoanalytic psychotherapy or cognitive-behavioral therapy for bulimia nervosa. Am J Psychiatry 2014; 171:109–116
65 : Attachment scales predict outcome in a randomized clinical trial of group psychotherapy for binge eating disorder: an aptitude by treatment interaction. Psychother Res 2006; 16:106–121
66 : Focal psychodynamic therapy, cognitive behaviour therapy, and optimised treatment as usual in outpatients with anorexia nervosa (ANTOP study): randomised controlled trial. Lancet 2014; 383:127–137
67 : Psychosocial treatments for cocaine dependence: National Institute on Drug Abuse Collaborative Cocaine Treatment Study. Arch Gen Psychiatry 1999; 56:493–502
68 : Psychotherapy for opiate addicts. Does it help? Arch Gen Psychiatry 1983; 40:639–645
69 : Evaluating three treatments for borderline personality disorder: a multiwave study. Am J Psychiatry 2007; 164:922–928
70 : Comparison of brief dynamic and cognitive-behavioural therapies in avoidant personality disorder. Br J Psychiatry 2006; 189:60–64
71 : The relationship of early alliance ruptures and their resolution to process and outcome in three time-limited psychotherapies for personality disorders. Psychotherapy (Chic) 2009; 46:233–248
72 : Randomized, controlled trial of the effectiveness of short-term dynamic psychotherapy and cognitive therapy for cluster C personality disorders. Am J Psychiatry 2004; 161:810–817