DSM-5 Field Trials in the United States and Canada, Part II: Test-Retest Reliability of Selected Categorical Diagnoses
Darrel A. Regier, M.D., M.P.H.; William E. Narrow, M.D., M.P.H.; Diana E. Clarke, Ph.D., M.Sc.; Helena C. Kraemer, Ph.D.; S. Janet Kuramoto, Ph.D., M.H.S.; Emily A. Kuhl, Ph.D.; David J. Kupfer, M.D.
Am J Psychiatry 2013;170:59-70. doi:10.1176/appi.ajp.2012.12070999

All authors report no financial relationships with commercial interests.

This study was funded by the American Psychiatric Association.

From the American Psychiatric Association, Division of Research and American Psychiatric Institute for Research and Education, Arlington, Va.; the Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, Md.; the Stanford University School of Medicine, Palo Alto, Calif.; and the University of Pittsburgh Medical Center, Pittsburgh, Pa.

Presented in part at the 165th annual meeting of the American Psychiatric Association, Philadelphia, May 5–9, 2012.

Address correspondence to Dr. Regier (dregier@psych.org).

Copyright © 2013 by the American Psychiatric Association

Received July 30, 2012; Revised August 29, 2012; Accepted September 04, 2012.

Abstract

Objective  The DSM-5 Field Trials were designed to obtain precise (standard error <0.1) estimates of the intraclass kappa as a measure of the degree to which two clinicians could independently agree on the presence or absence of selected DSM-5 diagnoses when the same patient was interviewed on separate occasions, in clinical settings, and evaluated with usual clinical interview methods.

Method  Eleven academic centers in the United States and Canada were selected, and each was assigned several target diagnoses frequently treated in that setting. Consecutive patients visiting a site during the study were screened and stratified on the basis of DSM-IV diagnoses or symptomatic presentations. Patients were randomly assigned to two clinicians for a diagnostic interview; clinicians were blind to any previous diagnosis. All data were entered directly via an Internet-based software system to a secure central server. Detailed research design and statistical methods are presented in an accompanying article.

Results  There were a total of 15 adult and eight child/adolescent diagnoses for which adequate sample sizes were obtained to report adequately precise estimates of the intraclass kappa. Overall, five diagnoses were in the very good range (kappa=0.60–0.79), nine in the good range (kappa=0.40–0.59), six in the questionable range (kappa=0.20–0.39), and three in the unacceptable range (kappa values <0.20). Eight diagnoses had insufficient sample sizes to generate precise kappa estimates at any site.

Conclusions  Most diagnoses adequately tested had good to very good reliability with these representative clinical populations assessed with usual clinical interview methods. Some diagnoses that were revised to encompass a broader spectrum of symptom expression or had a more dimensional approach tested in the good to very good range.
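
The agreement bands quoted in the Results can be written as a small lookup. The sketch below is illustrative only: the function name `interpret_kappa` and the optional `ci_lower` argument are assumptions, not part of the study software. The "excellent" band (0.80–1) and the rule that a confidence interval overlapping zero counts as unacceptable are taken from the table footnotes, and the band edges are treated here as half-open intervals applied to the unrounded estimate.

```python
from typing import Optional


def interpret_kappa(kappa: float, ci_lower: Optional[float] = None) -> str:
    """Return the agreement band for an (unrounded) intraclass kappa estimate."""
    # A confidence interval that reaches zero is treated as unacceptable,
    # regardless of the point estimate.
    if ci_lower is not None and ci_lower <= 0:
        return "unacceptable"
    if kappa < 0.20:
        return "unacceptable"
    if kappa < 0.40:
        return "questionable"
    if kappa < 0.60:
        return "good"
    if kappa < 0.80:
        return "very good"
    return "excellent"


# Hypothetical values, not results from the field trials.
for value in (0.15, 0.35, 0.46, 0.795):
    print(value, interpret_kappa(value))
```

Applying the bands to the unrounded value means an estimate such as 0.795 is reported as very good even though it would round to 0.80.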

Figures in this Article

FIGURE 1. Comorbidity of Major Depressive Disorder, Posttraumatic Stress Disorder, Alcohol Use Disorder, and Generalized Anxiety Disorder (a)

a Rates are average weighted percentages from Houston VA/Menninger (N=264).

TABLE 1. Test-Retest Reliability of Categorical DSM-5 Criteria Tested at Houston VA/Menninger (N=264)

a Patients could be assigned to these six strata on the basis of DSM-IV diagnosis or qualifying symptom profile.

b Identified by one or both clinicians as having the target DSM-5 diagnosis; proportion calculated from those with two visits.

c Intraclass kappa estimated for a stratified sample with bootstrap confidence interval.

d Strength of agreement for the kappa coefficient was interpreted as follows: <0.20=unacceptable, 0.20–0.39=questionable, 0.40–0.59=good, 0.60–0.79=very good, and 0.80–1=excellent. Estimates with confidence intervals that overlap zero are considered unacceptable because of insufficient sample size (19).

e Prevalence ($P$) calculated as $P = \sum_i w_i \left(Q_{i2} + Q_{i1}/2\right)$, where $w_i$ = the sample weight (proportion assigned for sampling in a particular stratum), $Q_{i2}$ = proportion of those in stratum $i$ where both clinicians diagnosed the particular diagnosis, and $Q_{i1}$ = proportion of those in stratum $i$ where only one of the two clinicians diagnosed the particular diagnosis.
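
As a worked illustration of the quantities defined in footnote e, the sketch below computes a weighted prevalence and a two-rater intraclass kappa from per-stratum proportions. It rests on two assumptions that should be checked against the accompanying methods article: that prevalence is the weighted sum over strata of Qi2 + Qi1/2 (the share of positive ratings), and that kappa takes the textbook two-rater intraclass form, 1 minus the weighted discordant proportion divided by 2P(1 − P). The function name and the example numbers are hypothetical, and the bootstrap confidence interval mentioned in footnote c is not reproduced.

```python
import math
from typing import Sequence, Tuple


def stratified_prevalence_and_kappa(
    weights: Sequence[float],
    q_both: Sequence[float],
    q_one: Sequence[float],
) -> Tuple[float, float]:
    """Return (prevalence, intraclass kappa) for a stratified two-rater sample.

    weights: sampling weight of each stratum (assumed to sum to 1)
    q_both:  per-stratum proportion diagnosed by both clinicians (Qi2)
    q_one:   per-stratum proportion diagnosed by exactly one clinician (Qi1)
    """
    if not (len(weights) == len(q_both) == len(q_one)):
        raise ValueError("all inputs must have one entry per stratum")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("stratum weights must sum to 1")

    # Each discordant pair contributes one positive rating out of two.
    prevalence = sum(w * (q2 + q1 / 2) for w, q2, q1 in zip(weights, q_both, q_one))
    discordant = sum(w * q1 for w, q1 in zip(weights, q_one))

    if not 0 < prevalence < 1:
        raise ValueError("kappa is undefined when prevalence is 0 or 1")

    # Chance disagreement for two exchangeable ratings is 2 * P * (1 - P).
    kappa = 1 - discordant / (2 * prevalence * (1 - prevalence))
    return prevalence, kappa


# Two hypothetical strata; these are not field trial data.
prevalence, kappa = stratified_prevalence_and_kappa(
    weights=[0.5, 0.5], q_both=[0.4, 0.0], q_one=[0.2, 0.0]
)
print(round(prevalence, 3), round(kappa, 3))
```

With these made-up inputs the prevalence is 0.25 and the discordant proportion is 0.10, so kappa is 1 − 0.10/0.375, roughly 0.73.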

TABLE 2. Test-Retest Reliability of Target DSM-5 Diagnoses at the Adult Field Trial Sites (a)

a Kappa estimates shown are those with standard errors ≤0.1 and 95% CI sizes ≤0.5.

b Since the individual intraclass kappas for the stratified samples and their 95% CIs do not overlap, the pooled intraclass kappa needs to be interpreted with caution.

c Not applicable because the diagnosis is new to DSM.

d Estimated DSM-IV prevalence represents the DSM-IV diagnosis of any somatoform disorder, excluding conversion and body dysmorphic disorders.

e Estimated DSM-IV prevalence represents the DSM-IV diagnosis of alcohol abuse and/or alcohol dependence.

f Estimated DSM-IV prevalence represents the DSM-IV diagnosis of any dementia disorder.

g The kappa interpretation provided here is based on the nonrounded estimate, which was below 0.80.
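
The reporting criterion in footnote a (standard error ≤0.1 and 95% CI size ≤0.5) amounts to a simple filter on each estimate. A minimal sketch, assuming the standard error and interval bounds are already available from the bootstrap; the function name `is_precise` and the example values are hypothetical:

```python
def is_precise(
    standard_error: float,
    ci_lower: float,
    ci_upper: float,
    max_se: float = 0.1,
    max_ci_size: float = 0.5,
) -> bool:
    """Check whether a kappa estimate meets both precision thresholds."""
    ci_size = ci_upper - ci_lower
    return standard_error <= max_se and ci_size <= max_ci_size


# Hypothetical estimates, not values from the tables.
print(is_precise(0.08, 0.35, 0.65))  # narrow interval, small standard error
print(is_precise(0.15, 0.05, 0.75))  # wide interval, large standard error
```

Estimates that fail this check correspond to the unsuccessful field trials listed separately.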

TABLE 3. DSM-5 Adult Field Trials Unsuccessful in Obtaining Accurate Estimates of Kappa (a)

a An unsuccessful field trial refers to one in which the size of the 95% CI around the reliability coefficient was >0.5, which indicates a lack of precision (SE>0.1) in the estimation of the reliability coefficient (20). Narcissistic personality disorder was assessed at Houston/Menninger, but no data are shown because fewer than seven patients were studied.

b Not applicable because the diagnosis is new to DSM.

TABLE 4. Test-Retest Reliability of Target DSM-5 Diagnoses at the Child/Pediatric Field Trial Sites (a)

a Kappa estimates shown are those with standard errors ≤0.1 and 95% CI sizes ≤0.5.

b For autism spectrum disorder, the estimated DSM-IV prevalence represents the DSM-IV diagnosis of autistic disorder, Asperger’s disorder, or pervasive developmental disorder not otherwise specified.

c Not applicable because the diagnosis is new to DSM.

d Since the individual intraclass kappas for the stratified samples and their 95% CIs do not overlap, the pooled intraclass kappa needs to be interpreted with caution.

TABLE 5. DSM-5 Child/Pediatric Field Trials Unsuccessful in Obtaining Accurate Estimates of Kappa (a)

a An unsuccessful field trial refers to one in which the size of the 95% CI around the reliability coefficient was >0.5, which indicates a lack of precision (SE>0.1) in the estimation of the reliability coefficient (20).

b Not applicable because the diagnosis is new to DSM.

References

1. Kupfer DJ; First MB; Regier DA (eds): A Research Agenda for the DSM-V. Washington, DC, American Psychiatric Association, 2002
2. Hyman SE: Can neuroscience be integrated into the DSM-V? Nat Rev Neurosci 2007; 8:725–732
3. Andrews G; Charney DS; Sirovatka PJ; Regier DA (eds): Stress-Induced and Fear Circuitry Disorders: Refining the Research Agenda for DSM-V. Arlington, Va, American Psychiatric Association, 2009
4. Stengel E: Classification of mental disorders. Bull World Health Organ 1959; 21:601–663
5. Kramer M: Some problems for international research suggested by observations on differences in first admission rates to mental hospitals of England and Wales and of the United States, in Proceedings of the Third World Congress of Psychiatry, vol 3. Montreal, University of Toronto Press/McGill University Press, 1961, pp 153–160
6. Cooper JE; Kendell RE; Gurland BJ; Sharpe L; Copeland JRM; Simon R: Psychiatric Diagnosis in New York and London. London, Oxford University Press, 1972
7. Maas JW; Koslow SH; Davis JM; Katz MM; Mendels J; Robins E; Stokes PE; Bowden CL: Biological component of the NIMH Clinical Research Branch Collaborative Program on the Psychobiology of Depression, I: background and theoretical considerations. Psychol Med 1980; 10:759–776
8. Rice J; Andreasen NC; Coryell W; Endicott J; Fawcett J; Hirschfeld RM; Keller MB; Klerman GL; Lavori P; Reich T; Scheftner WA: NIMH Collaborative Program on the Psychobiology of Depression: clinical. Genet Epidemiol 1989; 6:179–182
9. Feighner JP; Robins E; Guze SB; Woodruff RA Jr; Winokur G; Munoz R: Diagnostic criteria for use in psychiatric research. Arch Gen Psychiatry 1972; 26:57–63
10. Spitzer RL; Endicott J; Robins E: Research Diagnostic Criteria: rationale and reliability. Arch Gen Psychiatry 1978; 35:773–782
11. Spitzer RL; Forman JB; Nee J: DSM-III field trials, I: initial interrater diagnostic reliability. Am J Psychiatry 1979; 136:815–817
12. Widiger TA; Frances A; Pincus H (eds): DSM-IV Sourcebook. Washington, DC, American Psychiatric Publishing, 1994
13. Sartorius N; Ustün TB; Korten A; Cooper JE; van Drimmelen J: Progress toward achieving a common language in psychiatry, II: results from the international field trials of the ICD-10 diagnostic criteria for research for mental and behavioral disorders. Am J Psychiatry 1995; 152:1427–1437
14. Regier DA; Kaelber CT; Roper MT; Rae DS; Sartorius N: The ICD-10 clinical field trial for mental and behavioral disorders: results in Canada and the United States. Am J Psychiatry 1994; 151:1340–1350
15. Robins E; Guze SB: Establishment of diagnostic validity in psychiatric illness: its application to schizophrenia. Am J Psychiatry 1970; 126:983–987
16. Andrews G; Goldberg DP; Krueger RF; Carpenter WT Jr; Hyman SE; Sachdev P; Pine DS: Exploring the feasibility of a meta-structure for DSM-V and ICD-11: could it improve utility and validity? Psychol Med 2009; 39:1993–2000
17. American Psychiatric Association, Committee on Nomenclature and Statistics: DSM-III field trials: interrater reliability and listing of participants (Appendix F), in Diagnostic and Statistical Manual of Mental Disorders, Third Edition. Washington, DC, American Psychiatric Association, 1980, pp 467–481
18. Fleiss JL; Spitzer RL; Endicott J; Cohen J: Quantification of agreement in multiple psychiatric diagnosis. Arch Gen Psychiatry 1972; 26:168–171
19. Kraemer HC; Kupfer DJ; Clarke DE; Narrow WE; Regier DA: DSM-5: how reliable is reliable enough? (commentary). Am J Psychiatry 2012; 169:13–15
20. Clarke DE; Narrow WE; Regier DA; Kuramoto SJ; Kupfer DJ; Kuhl EA; Greiner L; Kraemer HC: DSM-5 Field Trials in the United States and Canada, part I: study design, sampling strategy, implementation, and analytic approaches. Am J Psychiatry 2013; 170:43–58
21. Cottler LB; Schuckit MA; Helzer JE; Crowley T; Woody G; Nathan P; Hughes J: The DSM-IV field trial for substance use disorders: major results. Drug Alcohol Depend 1995; 38:59–69, discussion 71–83
22. Lahey BB; Applegate B; Barkley RA; Garfinkel B; McBurnett K; Kerdyk L; Greenhill L; Hynd GW; Frick PJ; Newcorn J; Biederman J; Ollendick T; Hart EL; Perez D; Irwin W; Shaffer D: DSM-IV field trials for oppositional defiant disorder and conduct disorder in children and adolescents. Am J Psychiatry 1994; 151:1163–1171
23. Brown TA; Di Nardo PA; Lehman CL; Campbell LA: Reliability of DSM-IV anxiety and mood disorders: implications for the classification of emotional disorders. J Abnorm Psychol 2001; 110:49–58
24. Holzer CE 3rd; Nguyen HT; Hirschfeld RMA: Reliability of diagnosis in mood disorders. Psychiatr Clin North Am 1996; 19:73–84
25. Kraemer HC; Kupfer DJ; Narrow WE; Clarke DE; Regier DA: Moving toward DSM-5: the field trials (commentary). Am J Psychiatry 2010; 167:1158–1160
26. March JS; Silva SG; Compton S; Shapiro M; Califf R; Krishnan R: The case for practical clinical trials in psychiatry. Am J Psychiatry 2005; 162:836–846
27. Polsky D; Doshi JA; Bauer MS; Glick HA: Clinical trial-based cost-effectiveness analyses of antipsychotic use. Am J Psychiatry 2006; 163:2047–2056
28. Narrow WE; Clarke DE; Kuramoto SJ; Kraemer HC; Kupfer DJ; Greiner L; Regier DA: DSM-5 Field Trials in the United States and Canada, part III: development and reliability testing of a cross-cutting symptom assessment for DSM-5. Am J Psychiatry 2013; 170:71–82
29. Volkmar FR; State M; Klin A: Autism and autism spectrum disorders: diagnostic issues for the coming decade. J Child Psychol Psychiatry 2009; 50:108–115
30. Swedo SE; Baird G; Cook EH Jr; Happé FG; Harris JC; Kaufmann WE; King BH; Lord CE; Piven J; Rogers SJ; Spence SJ; Wetherby A; Wright HH: Commentary from the DSM-5 Workgroup on Neurodevelopmental Disorders. J Am Acad Child Adolesc Psychiatry 2012; 51:347–349
31. Leibenluft E: Severe mood dysregulation, irritability, and the diagnostic boundaries of bipolar disorder in youths. Am J Psychiatry 2011; 168:129–142
32. Boyd JH; Burke JD Jr; Gruenberg E; Holzer CE 3rd; Rae DS; George LK; Karno M; Stoltzman R; McEvoy L; Nestadt G: Exclusion criteria of DSM-III: a study of co-occurrence of hierarchy-free syndromes. Arch Gen Psychiatry 1984; 41:983–989
33. Regier DA; Farmer ME; Rae DS; Locke BZ; Keith SJ; Judd LL; Goodwin FK: Comorbidity of mental disorders with alcohol and other drug abuse: results from the Epidemiologic Catchment Area (ECA) study. JAMA 1990; 264:2511–2518
34. Kessler RC; McGonagle KA; Zhao S; Nelson CB; Hughes M; Eshleman S; Wittchen HU; Kendler KS: Lifetime and 12-month prevalence of DSM-III-R psychiatric disorders in the United States: results from the National Comorbidity Survey. Arch Gen Psychiatry 1994; 51:8–19
35. Löwe B; Spitzer RL; Williams JB; Mussell M; Schellberg D; Kroenke K: Depression, anxiety and somatization in primary care: syndrome overlap and functional impairment. Gen Hosp Psychiatry 2008; 30:191–199
36. Kroenke K; Spitzer RL; Williams JB: The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16:606–613
37. Spitzer RL; Kroenke K; Williams JB; Löwe B: A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med 2006; 166:1092–1097
38. Kroenke K; Spitzer RL; Williams JB: The PHQ-15: validity of a new measure for evaluating the severity of somatic symptoms. Psychosom Med 2002; 64:258–266
39. Regier DA; Narrow WE; Kuhl EA; Kupfer DJ: The conceptual development of DSM-V. Am J Psychiatry 2009; 166:645–650
40. Insel T; Cuthbert B; Garvey M; Heinssen R; Pine DS; Quinn K; Sanislow CA; Wang PW: Research Domain Criteria (RDoC): developing a valid diagnostic framework for research on mental disorders. Am J Psychiatry 2010; 167:748–751
41. Kraepelin E: Diagnose und Prognose der Dementia Praecox: Heidelberger Versammlung 26/27. Dohr Neurol Psychiatry 1898; 56:254–263
42. Kraepelin E: Patterns of Mental Disorder (1920), in Themes and Variations in European Psychiatry. Edited by Hirsch SR; Shepherd M. Charlottesville, Va, University Press of Virginia, 1974, pp 7–30
43. Kendell RL; Jablensky A: Distinguishing between the validity and utility of psychiatric diagnoses. Am J Psychiatry 2003; 160:4–12
44. Hempel C: Introduction to problems of taxonomy, in Field Studies in the Mental Disorders. Edited by Zubin J. New York, Grune & Stratton, 1961, pp 3–32
 