Letters to the Editor

The Impact of Underpowered Studies on Clinical Trial Results

To the Editor: The commentary by Marder et al. in the September 2017 issue of the Journal, “Why Are Innovative Drugs Failing in Phase III?” (1), brings up a critical topic: the frequent failure of phase III trials for CNS programs. The authors effectively summarized investigations into this issue. However, we believe that a possible explanation for phase III trial failure was minimized, as noted in the commentary: “that the phase II results were misleading” (1, p. 829).

It is perhaps an underappreciated truism that statistically underpowered studies with small sample sizes (often early-phase studies) are highly likely to both under- and overestimate treatment effect sizes (2, 3). By considering large effects from small studies to be true effects rather than overestimation errors, some researchers have concluded that smaller trials are more advantageous.

Take, for example, the citation of Undurraga and Baldessarini, who suggest limiting phase III antidepressant trials to 30–75 patients (4, p. 860). Testing this suggestion using data from Food and Drug Administration (FDA) antidepressant phase III trials, we found that treatment arms with Ns of ≤75 had both the highest (0.75) and lowest (−0.29) effect sizes of the entire group of 115 arms and that they achieved statistical significance only 50% of the time. In contrast, the largest treatment arms (with Ns of ≥350) had a 100% rate of statistical significance and little variation in effect size (from 0.24 to 0.33).
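The wider spread of effect sizes in small arms can be illustrated with a brief simulation. This is a minimal sketch, not a reanalysis of the FDA data: the true effect size of 0.3, the normally distributed outcomes with unit variance, the number of simulated trials, and the function names are all assumptions chosen for illustration.

```python
import random
import statistics


def observed_effect_size(n_per_arm, true_d, rng):
    """Simulate one two-arm trial and return the observed Cohen's d."""
    # Assumption: outcomes are normal with unit variance in both arms.
    drug = [rng.gauss(true_d, 1.0) for _ in range(n_per_arm)]
    placebo = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    # With equal arm sizes, the pooled variance is the mean of the two variances.
    pooled_sd = ((statistics.variance(drug) + statistics.variance(placebo)) / 2) ** 0.5
    return (statistics.mean(drug) - statistics.mean(placebo)) / pooled_sd


def effect_size_range(n_per_arm, true_d=0.3, n_trials=500, seed=0):
    """Return the lowest and highest observed d across simulated trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    ds = [observed_effect_size(n_per_arm, true_d, rng) for _ in range(n_trials)]
    return min(ds), max(ds)


if __name__ == "__main__":
    for n in (30, 75, 350):
        low, high = effect_size_range(n)
        print(f"n per arm = {n:3d}: observed d from {low:+.2f} to {high:+.2f}")
```

Because the true effect is held fixed across arm sizes, any difference in the printed ranges reflects sampling error alone.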

These smaller trials empirically demonstrate the chance findings of underpowering: they are no more reliable than tossing a coin. These findings are unsurprising, given that a two-arm trial with Ns of ≤75 would be powered only at 50% for an effect size of 0.5 (2). Underreporting of negative studies in the published literature may mask the downside of the underpowering coin, which is visible in the FDA data: the “unlucky” underpowered trials that fail and underestimate the drug effect.
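How power varies with arm size can be explored with a short calculation. The sketch below uses the normal approximation to a two-sided, two-sample comparison at alpha = 0.05 with equal arm sizes; the function name, the alpha level, and the choice of approximation are assumptions for illustration, and an exact t-test calculation would give slightly different values. Power changes sharply across the ≤75 range, so the result for any one trial depends on its actual N per arm.

```python
from statistics import NormalDist


def two_arm_power(n_per_arm, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-sample test with equal arms.

    Normal approximation: the test statistic is shifted by
    effect_size * sqrt(n_per_arm / 2) under the alternative hypothesis.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_arm / 2) ** 0.5
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for n in (30, 50, 75, 350):
        print(f"n per arm = {n:3d}: power = {two_arm_power(n, 0.5):.2f}")
```

The same function can be rerun with a smaller effect size, such as the 0.24 to 0.33 range seen in the largest arms, to see how many patients per arm that magnitude of effect would require.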

The increase in size of phase III trials should not be viewed as a design flaw; to view it as one is to reject the statistical principles of the scientific method. If phase II results reflect the true magnitude of treatment effects, then as a rule they should replicate in a larger sample. If they do not, then the phase II results were likely misleading. Seeking to replicate such lucky results from underpowered studies by underpowering future studies only perpetuates scientifically unsound methods and high failure rates.

From the Northwest Clinical Research Center, Bellevue, Wash.; the Department of Psychiatry, Duke University School of Medicine, Durham, N.C.; and the Department of Psychiatry and Human Behavior, Brown University, Providence, R.I.
Address correspondence to Dr. Khan.

The authors report no financial relationships with commercial interests.

References

1 Marder SR, Laughren T, Romano SJ: Why are innovative drugs failing in phase III? Am J Psychiatry 2017; 174:829–831

2 Kraemer HC, Mintz J, Noda A, et al.: Caution regarding the use of pilot studies to guide power calculations for study proposals. Arch Gen Psychiatry 2006; 63:484–489

3 Gibertini M, Nations KR, Whitaker JA: Obtained effect size as a function of sample size in approved antidepressants: a real-world illustration in support of better trial design. Int Clin Psychopharmacol 2012; 27:100–106

4 Undurraga J, Baldessarini RJ: Randomized, placebo-controlled trials of antidepressants for acute major depression: thirty-year meta-analytic review. Neuropsychopharmacology 2012; 37:851–864