The development of structured diagnostic interviews for psychiatric disorders in the last two decades has been essential to the growth of epidemiologic research in psychiatry (1). This research has yielded important information on the prevalence of psychiatric disorders in the population, features of their natural course and co-occurrence, risk factors, use of services, and burden of disease. The need for shortened versions of these structured interviews has increased in recent years with the incorporation of psychiatric disorders into the spectrum of morbidity monitored in general health surveys, such as the National Health Interview Survey (2), as well as the increased importance of mental health screening by primary health care providers (3). Moreover, surveys that focus on a specific subset of psychiatric disorders, such as major depressive disorders, can no longer ignore the role of co-occurring disorders and need an efficient method to assess disorders of secondary interest. Short screening scales are also useful in surveys with two-phase designs, in which a quick screening test is used to select a subset of respondents who are likely to have a disorder (phase 1) for more intensive diagnostic assessment (phase 2) (4, 5).
Recently, Kessler et al. (2) introduced the World Health Organization (WHO) Composite International Diagnostic Interview Short Form, which comprises screening scales for eight DSM-III-R disorders and is based on data from the National Comorbidity Survey. One disorder that is missing from the Composite International Diagnostic Interview Short Form is posttraumatic stress disorder (PTSD), estimated to affect more than 10% of residents of U.S. communities (6, 7).
A short screening scale for PTSD would have an important use in addition to those described above. Previous epidemiologic studies have typically asked respondents to report their PTSD symptoms only in connection with their worst or most upsetting trauma. This data collection strategy is used because exposure to trauma is common, and many respondents report numerous traumas, precluding a detailed assessment of PTSD for each trauma. However, as we demonstrated elsewhere (6), this strategy leads to overestimation of the conditional risk of PTSD following exposure to trauma and might bias the estimates of the comparative risk of PTSD across different traumas. The availability of a short screening scale could resolve this problem by making it possible to assess PTSD in connection with all reported traumas or assess a random sample of traumas among individuals exposed to a large number.
In this report, we present a short screening scale for PTSD. The seven-symptom scale is a short form of the modified National Institute of Mental Health Diagnostic Interview Schedule (DIS) (8) and the WHO Composite International Diagnostic Interview, version 2.1 (9), developed and used in the Detroit Area Survey of Trauma (6).
The 1996 Detroit Area Survey of Trauma involved a representative sample of 2,181 subjects 18–45 years of age in the Detroit primary metropolitan statistical area, 23% of whom reside in the city of Detroit. Detailed information on the population, sampling, and assessment is presented elsewhere (6). Briefly, a random-digit-dialing method was used to select the sample, and a computer-assisted telephone interview was used to collect the data. Screening for eligibility was completed in 76.2% of households, and 86.8% of screened-eligible households (one respondent per household) completed the interview. The study was approved by the institutional review board of the Henry Ford Health System, and oral informed consent was elicited and recorded at the start of the interview.
The Modified DIS/Composite International Diagnostic Interview for Assessing PTSD
The interview began with an enumeration of traumatic events experienced in the lifetime of the subjects, based on a list of 19 types of trauma that operationalized the DSM-IV definition of stressors. An endorsement of an event type was followed by questions on the number of times an event of that type occurred and the respondent’s age at each time. PTSD was assessed in connection with a computer-selected random trauma from the list of traumas reported by each respondent. PTSD also was assessed in connection with the worst trauma and the earliest trauma. The analysis presented in this report is based on data from the randomly selected traumas. The strategy of randomly selecting one trauma from the list of each respondent was devised to yield an unbiased estimate of the conditional risk of PTSD (that is, the probability that exposure to traumatic events would culminate in PTSD). By the same token, the use of data on the randomly selected traumas would yield a short scale with characteristics that are representative of the entire range of qualifying traumas experienced in the community, rather than the most severe traumas or those associated with the most distressing psychological sequelae.
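The random selection of one trauma per respondent can be illustrated with a minimal sketch. It assumes that each reported occurrence is listed separately and that every occurrence has the same chance of selection; the text does not specify the exact selection mechanism, and the function name and the example events are hypothetical.

```python
import random

def select_random_trauma(events, rng):
    """Return one reported event chosen uniformly at random, or None if the list is empty."""
    if not events:
        return None
    return rng.choice(events)

# Hypothetical respondent: one (event type, age at occurrence) entry per reported occurrence.
reported = [("assault", 19), ("car accident", 24), ("car accident", 31)]
rng = random.Random(0)  # fixed seed only to make the illustration reproducible
selected = select_random_trauma(reported, rng)
print(selected)
```

Because the selection ignores severity, the selected events are representative of all reported events rather than of the most distressing ones.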
The PTSD section of the DIS for DSM-IV (8) and the Composite International Diagnostic Interview, version 2.1 (9), with slight modifications, was used to assess PTSD criterion symptoms, duration of symptoms, and impairment according to DSM-IV. This PTSD instrument, like the DIS and the Composite International Diagnostic Interview, is a fully structured interview, designed to be administered by experienced interviewers without clinical training. A validation study conducted on a subset of the sample found high concordance between the structured interview administered by lay interviewers and independent clinical reinterviews (sensitivity=95%, specificity=71%, and kappa=0.63) (10).
The goal of the analysis was to select a short subset of items (symptoms) from the DIS/Composite International Diagnostic Interview PTSD section that would most efficiently predict PTSD as diagnosed by the full-length interview. The analysis proceeded in four steps. Step 1 was designed to select the five best combinations of symptoms for each of several scale lengths. Ordinary least-squares best-subsets binary linear regressions (11) were used, and models were displayed in descending order according to the magnitude of the coefficient of determination. Seventeen items that measure PTSD criterion symptoms were included in the analysis. Items that measure duration and impairment were not included. Although the 17 symptom items can be treated as distinct variables, each inquiring about a separate experience, the items on duration of symptoms and impairment cannot because they refer to the configurations of symptoms experienced by the respondent. Nonetheless, the utility of the screening scale was tested against the diagnosis of PTSD as elicited by the full interview and classified according to DSM-IV criteria, including the 1-month duration and impairment. The candidate scale lengths selected for the next step were based on the pattern of increases in the coefficient of determination. Table 1 presents the list of PTSD criterion symptoms and the lifetime prevalence of each symptom associated with the randomly selected traumas.
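The best-subsets idea in step 1 can be sketched as follows: fit an ordinary least-squares regression of the binary diagnosis on every combination of items of a given size and keep the combinations with the largest coefficient of determination. This is an illustration under stated assumptions, not the software used in the study; the function names and the tiny data set (four items, eight respondents) are hypothetical.

```python
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(rows, y, cols):
    """R^2 of an OLS fit of y on an intercept plus the chosen item columns."""
    x = [[1.0] + [float(row[c]) for c in cols] for row in rows]
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst

def best_subsets(rows, y, size, keep=5):
    """Return the `keep` item subsets of the given size with the largest R^2."""
    n_items = len(rows[0])
    scored = [(r_squared(rows, y, cols), cols)
              for cols in combinations(range(n_items), size)]
    return sorted(scored, reverse=True)[:keep]

# Hypothetical data: each row is one respondent's yes/no answers to four items.
rows = [[1, 1, 0, 1], [1, 0, 1, 0], [1, 1, 1, 1], [0, 1, 0, 0],
        [0, 0, 1, 1], [0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical full-interview diagnosis
for size in (1, 2):
    print(size, best_subsets(rows, y, size, keep=2))
```

Exhaustive enumeration is feasible here because the number of candidate items is small; with 17 items there are at most 24,310 subsets of any single size.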
Step 2 used estimates of the area under the receiver operating characteristic curve (12) to identify the single best combination of items out of the five combinations selected in step 1 for each of three candidate scale lengths. Step 3 applied receiver operating characteristic analysis to each of the three best models identified in step 2 in order to select the best final scale. The best scale was selected on the basis of a classification rule determined by the predicted probability of having PTSD, as modeled by logistic regression with the selected symptoms as predictors and with the diagnosis of PTSD according to the full-length interview as the outcome. The regression equation for the probability of having PTSD was $1/\{1+\exp[-(\alpha+\beta_1 x_1+\dots+\beta_k x_k)]\}$, where $\alpha$ is the intercept and $\beta_1$ to $\beta_k$ are the estimated coefficients for the symptom items $x_1$ to $x_k$. The set of fitted probabilities was evaluated in terms of sensitivity, specificity, and positive and negative predictive values. Requiring sensitivity greater than 90% and specificity greater than 90%, we chose the best scale based on the predicted probability that resulted in the maximum sum of positive and negative predictive values.
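The classification rule in step 3 can be sketched in a few lines: convert the linear predictor to a probability with the logistic function, classify respondents at a probability threshold, and, among thresholds that meet minimum sensitivity and specificity, keep the one with the largest sum of positive and negative predictive values. The coefficients, fitted probabilities, and minimum values below are hypothetical toy numbers chosen for illustration, not estimates from the study.

```python
import math

def predicted_probability(alpha, betas, x):
    """Logistic model: 1 / (1 + exp(-(alpha + beta_1*x_1 + ... + beta_k*x_k)))."""
    linear = alpha + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-linear))

def ratio(num, den):
    """Safe division: an undefined proportion (zero denominator) is reported as 0.0."""
    return num / den if den else 0.0

def classification_stats(probs, truth, threshold):
    """Accuracy statistics when probability >= threshold counts as screen-positive."""
    tp = sum(1 for p, t in zip(probs, truth) if p >= threshold and t)
    fp = sum(1 for p, t in zip(probs, truth) if p >= threshold and not t)
    fn = sum(1 for p, t in zip(probs, truth) if p < threshold and t)
    tn = sum(1 for p, t in zip(probs, truth) if p < threshold and not t)
    return {"sens": ratio(tp, tp + fn), "spec": ratio(tn, tn + fp),
            "ppv": ratio(tp, tp + fp), "npv": ratio(tn, tn + fn)}

def choose_threshold(probs, truth, min_sens, min_spec):
    """Among thresholds meeting both minimums, return the one maximizing PPV + NPV."""
    best = None
    for thr in sorted(set(probs)):
        s = classification_stats(probs, truth, thr)
        if s["sens"] > min_sens and s["spec"] > min_spec:
            total = s["ppv"] + s["npv"]
            if best is None or total > best[0]:
                best = (total, thr)
    return None if best is None else best[1]

# Hypothetical two-item model and one respondent who endorsed both items.
print(predicted_probability(-3.0, [1.5, 2.0], [1, 1]))

# Hypothetical fitted probabilities and full-interview diagnoses.
probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
truth = [1, 1, 0, 1, 0, 0, 0, 0]
print(choose_threshold(probs, truth, min_sens=0.6, min_spec=0.7))
```

Only observed probabilities are tried as thresholds, because the classification can change only at those values.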
Step 4 simplified the classification rule in step 3, which used the regression coefficients to estimate a probability of having PTSD. The simpler classification rule was based on the total number of symptoms reported by the respondent, with equal weights given to each symptom. Step 4 estimated the sensitivity, specificity, positive predictive value, and negative predictive value for this final scale for varying scale scores, calculated by this rule. These estimates are the final product of the analysis and offer alternative cutoff points for transforming the continuous scale to a dichotomous screening tool for diagnosing PTSD with the full DIS/Composite International Diagnostic Interview as the standard. Additional analyses repeated step 4 in men and women separately.
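The equal-weight rule in step 4 amounts to counting endorsed symptoms and tabulating accuracy at each possible cutoff. The following is a minimal sketch with hypothetical scores and diagnoses; the function names are illustrative, and undefined proportions are reported as NaN by assumption.

```python
def scale_score(item_responses):
    """Equal-weight score: the number of symptom items answered yes."""
    return sum(1 for r in item_responses if r)

def stats_by_cutoff(scores, has_disorder, max_score):
    """For each cutoff c, treat score >= c as screen-positive and tabulate accuracy."""
    def ratio(num, den):
        return num / den if den else float("nan")  # undefined when the denominator is 0

    table = {}
    for c in range(1, max_score + 1):
        tp = sum(1 for s, d in zip(scores, has_disorder) if s >= c and d)
        fp = sum(1 for s, d in zip(scores, has_disorder) if s >= c and not d)
        fn = sum(1 for s, d in zip(scores, has_disorder) if s < c and d)
        tn = sum(1 for s, d in zip(scores, has_disorder) if s < c and not d)
        table[c] = {"sensitivity": ratio(tp, tp + fn),
                    "specificity": ratio(tn, tn + fp),
                    "ppv": ratio(tp, tp + fp),
                    "npv": ratio(tn, tn + fn)}
    return table

# One hypothetical respondent's answers to seven items.
print(scale_score([1, 0, 1, 1, 0, 1, 0]))

# Hypothetical scale scores and full-interview diagnoses for ten respondents.
scores = [7, 5, 4, 4, 3, 2, 1, 0, 0, 6]
has_disorder = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
table = stats_by_cutoff(scores, has_disorder, max_score=7)
for cutoff, row in table.items():
    print(cutoff, row)
```

Raising the cutoff trades sensitivity for specificity, which is why the full table of cutoffs is more useful to a survey designer than a single recommended value.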
Of the 1,957 respondents who reported exposure to traumatic events, 152 met criteria for PTSD in connection with their randomly selected trauma. The analysis was performed on 1,830 subjects with complete data on all 17 criterion symptoms (142 with PTSD and 1,688 with no PTSD). Examination of the array of coefficient of determination values calculated in stepwise multiple regressions for the best five combinations of symptoms for varying scale lengths (step 1) showed increases in the coefficient of determination with increases in scale length reaching a plateau at six to eight symptoms and a coefficient of determination of approximately 0.54. Based on this initial analysis, we selected the five best combinations of symptoms for each of six-, seven-, and eight-symptom scales.
Estimates of the area under the receiver operating characteristic curve (step 2) for models selected in step 1 revealed a tight range of values within the five combinations of six-, seven-, and eight-symptom scales. The selection of the single best combination of symptoms for each of the three scale lengths (six, seven, and eight symptoms), given the tight range of values for the receiver operating characteristic curve, was guided by the goal of selecting nested scales, so that the longer scales would include the symptoms in the shorter scale plus one additional symptom. Table 2 presents the receiver operating characteristic analysis (step 3) for these three scales and shows that the seven-symptom scale had the highest positive predictive value and specificity and the other statistics (sensitivity and negative predictive value) were constant across the three scales.
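For readers unfamiliar with the area under the receiver operating characteristic curve, it equals the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as one-half. The sketch below computes it that way on hypothetical data; it is a standard equivalence shown for illustration and is not necessarily how the estimates in the study were computed.

```python
def auc(scores, truth):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [s for s, t in zip(scores, truth) if t]
    noncases = [s for s, t in zip(scores, truth) if not t]
    wins = 0.0
    for c in cases:
        for n in noncases:
            if c > n:
                wins += 1.0   # case ranked above non-case
            elif c == n:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(noncases))

# Hypothetical predicted probabilities and diagnoses.
scores = [0.9, 0.8, 0.7, 0.4, 0.3]
truth = [1, 1, 0, 1, 0]
print(auc(scores, truth))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, so a tight range of values across candidate models means the models discriminate about equally well.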
Information on the seven-symptom scale for screening PTSD, formed by summing the number of positive replies on the symptoms, is presented in Table 3. A score of 4 or more appears to be the best overall cutoff point in terms of sensitivity, specificity, and positive and negative predictive value. Using this score as a cutoff for screening for PTSD, we found that less than 2% of subjects who screened negative were "true" cases of PTSD (that is, missed cases), whereas 29% of subjects who screened positive did not have PTSD. Table 3 displays the tradeoff associated with using higher or lower cutoffs. Separate analyses of the data for men and women showed similar results for both sexes (data not shown).
We have presented a screening scale for DSM-IV PTSD and documented the analytic procedures applied in the development of the scale. The scale is a short version of a structured diagnostic interview that, with minor modifications, closely follows the PTSD sections of the DIS for DSM-IV and the Composite International Diagnostic Interview, version 2.1, which is modeled after the DIS. The screening scale is designed to measure lifetime history of PTSD in individuals exposed to traumatic events as defined in DSM-IV. History of exposure to traumatic events is not covered by this scale. A score of 4 or more on the seven-symptom screening scale has the following characteristics for diagnosing DSM-IV PTSD: sensitivity=80%, specificity=97%, positive predictive value=71%, and negative predictive value=98% (Table 3). The scale also can be used as a semicontinuous predictor for the probability of having PTSD. Methods for this alternative use of a screening scale have been suggested by Kessler et al. (2).
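The predictive values depend on the prevalence of PTSD among those screened as well as on sensitivity and specificity. The sketch below applies Bayes' rule to the rounded sensitivity (80%) and specificity (97%) and to the proportion of cases in the analysis sample (142 of 1,830). It is an approximate check rather than a reproduction of the reported values, because the inputs are rounded; the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

prevalence = 142 / 1830  # cases among subjects with complete symptom data
ppv, npv = predictive_values(0.80, 0.97, prevalence)
print(round(ppv, 2), round(npv, 2))  # roughly 0.69 and 0.98 with these rounded inputs
```

With the same sensitivity and specificity, a lower prevalence in the screened population would yield a lower positive predictive value, which matters when the scale is applied outside a sample like this one.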
Several limitations to this study should be mentioned. First, although the structured diagnostic interview that served as the standard for evaluating the screening scale has high concordance with blind clinical reinterviews, a stronger test of the utility of the screening scale would consist of a direct comparison with clinical assessments. Second, the data come from one region of the United States and a limited age range (18–45 years). Third, the analysis was based on randomly selected traumas, one from each respondent’s list. We expect that most studies in which a screening scale would be used would focus on the worst or most upsetting trauma, an approach that provides an efficient way of identifying individuals with PTSD. Tests of the psychometric characteristics of the seven-symptom screening scale for data on the worst trauma showed results similar to those described in Table 3. Specifically, based on data on the worst traumas, a score of 4 or higher had the following characteristics: sensitivity=85%, specificity=93%, positive predictive value=68%, and negative predictive value=98%.
We have identified a score of 4 on the seven-symptom screening scale as the optimal cutoff point for separating subjects with and without PTSD. This cutoff point minimized the probability of missing true cases of PTSD at the expense of somewhat raising the probability of classifying subjects without the disorder as having it. This tradeoff is particularly suitable for two-phase surveys in which the first phase is designed to maximize the number of true cases of PTSD and the second phase is expected to reclassify those who were wrongly classified as having the disorder. Other uses of a screening scale for PTSD might find another cutoff more desirable. Clearly, the screening scale is not a substitute for a psychiatric diagnosis.
It comes as no surprise that, of the seven PTSD symptoms selected for their utility in distinguishing patients with PTSD from those without the disorder, five are from the avoidance and numbing symptom group, including all four numbing symptoms. It has been observed that the avoidance and numbing symptom group is the least frequently met criterion (13, 14). This is critically significant to the diagnosis of PTSD because only a small fraction of those who report sufficient symptoms in the other symptom groups meet this criterion. Although the requirement of three symptoms from the avoidance and numbing group has been criticized as too stringent (13, 14), it persisted from DSM-III-R to DSM-IV, and there is no reason to expect that it will be modified soon.
Received Sept. 14, 1998; revision received Dec. 14, 1998; accepted Jan. 20, 1999. From the Department of Psychiatry and the Department of Biostatistics/Research Epidemiology, Henry Ford Health System; the Department of Psychiatry, School of Medicine, Case Western Reserve University, Cleveland; and the Department of Health Care Policy, Harvard Medical School, Boston. Address reprint requests to Dr. Breslau, Department of Psychiatry, Henry Ford Health Sciences Center, One Ford Place 3A, Detroit, MI 48202-3450; email@example.com (e-mail). Supported in part by NIMH grant MH-48802.