Since the publication of DSM-III, research in the area of personality disorders has greatly increased (1). In part, research in this area was facilitated by the development of semistructured diagnostic interviews that increased the reliability of assessing the DSM personality disorders (2, 3). Recently, Westen (4) called into question the validity of personality disorder research that is based on these semistructured diagnostic interviews. He noted that the semistructured research interview method, which relies on direct questions to ascertain the presence or absence of personality disorder criterion symptoms, is at variance with the method clinicians use to diagnose personality disorders. That is, clinicians use a longitudinal perspective to determine the presence or absence of a personality disorder, and their judgments are based on the real-life vignettes patients describe during the course of treatment and the behavior and attitudes patients display during treatment sessions. The results of a national survey asking clinicians to rate the importance of different methods used to diagnose personality disorders supported Westen’s hypothesis that clinicians and researchers use different approaches in making these diagnoses (4).
A comparison of rates of personality disorder in studies using clinical evaluations and research assessments suggests that more diagnoses are made when semistructured diagnostic interviews are used. Oldham and Skodol (5) examined the rate of DSM-III personality disorders in 129,286 inpatients and outpatients seen in the New York State mental health system during a 1-year period and found that 11% were diagnosed with a personality disorder. They concluded that personality disorders were not being systematically diagnosed in these patients. Koenigsberg and colleagues (6) found that in a mixed group of patients from inpatient, outpatient, consultation-liaison, and emergency services, 26% were diagnosed with a specific personality disorder (36% had a personality disorder if mixed and atypical personality disorders were included). In contrast, in the only study to administer a standardized personality disorder interview to an unselected consecutive series of patients (7), 81% of outpatients were diagnosed as having a DSM-III-defined personality disorder. Because of the marked differences in demographic and clinical characteristics between study samples, it is not possible to determine how much of the difference in diagnostic rates is due to the method of assessment and how much is due to true differences in the samples. We are not aware of any studies that have compared the research and clinical approaches to making personality disorder diagnoses in patients drawn from the same population.
In this study from the Rhode Island Methods to Improve Diagnosis and Services project, we examined the influence of assessment method on the occurrence of borderline personality disorder. The rate of borderline personality disorder was compared in two large groups of patients drawn from the same practice setting. In the first group, diagnoses were based on a standard unstructured clinical interview; in the second group, diagnoses were based on a semistructured interview. It was predicted that the rate of borderline personality disorder would be higher in the group of patients who received a semistructured diagnostic interview. We also examined whether clinicians would diagnose the disorder more frequently if they were presented with the results of a standardized interview. We hypothesized that the diagnosis of borderline personality disorder during the initial evaluation is influenced by the amount of information clinicians have available to them at the interview, and if clinicians are provided with information indicative of a diagnosis of borderline personality disorder, then the diagnosis will be made.
One thousand patients were evaluated in the Rhode Island Hospital Department of Psychiatry outpatient practice. This private practice group primarily treats individuals with medical insurance (including Medicare but not Medicaid) on a fee-for-service basis, and it is distinct from the hospital’s outpatient residency training clinic that primarily serves lower-income, uninsured, and medical assistance patients.
Before the initial evaluation, all patients were asked to complete a 116-item, self-administered symptom questionnaire as part of their initial paperwork. The clinical study group consisted of 500 patients who successfully completed this questionnaire. Because the validity of the symptom questionnaire was being investigated as part of another project, the clinicians were blind to the patients’ responses on the questionnaire.
After the completion of the first study, the method of conducting initial diagnostic evaluations was changed. Five hundred patients were interviewed by a diagnostic rater who administered the Structured Clinical Interview for DSM-IV Axis I Disorders, Patient Edition (SCID-P) (8), and the borderline personality disorder section of the Structured Interview for DSM-IV Personality (9). The results of this interview were presented to a psychiatrist who finished the evaluation. The borderline personality disorder section of the Structured Interview for DSM-IV Personality was added to the SCID-P interview after 91 patients had already participated in the project; thus, only 409 patients were interviewed with both measures. These patients are subsequently referred to as the structured interview group. The Rhode Island Hospital institutional review committee approved the research protocol, and all patients provided informed, written consent.
In the clinical group, almost all (96%, N=480) diagnostic evaluations were conducted by board-certified or board-eligible psychiatrists. Clinical nurse specialists or master’s-level social workers under the supervision of a psychiatrist conducted the other evaluations. Diagnoses were based on DSM-IV criteria. Clinicians completed a standardized intake form modeled on the Initial Evaluation Form of Mezzich and colleagues (10). On the last page of the five-page form, the clinician recorded the patient’s DSM-IV multiaxial diagnoses. Research assistants recorded the results of the clinician’s diagnostic evaluation written on the last page of the intake form and collected demographic information from the narrative.
In the study involving the structured interviews, when patients called to schedule their initial appointments, they were told that they would be interviewed by two people—first by a diagnostic rater, who would conduct a comprehensive evaluation, and then by a psychiatrist. After administration of the SCID-P and the borderline personality disorder section of the Structured Interview for DSM-IV Personality, the rater presented the case to a psychiatrist who reviewed the findings of the evaluation with the patient. The clinicians then completed the same standardized intake form, and their diagnoses were abstracted in the same manner as was done for the first 500 patients.
Six diagnostic raters were used to administer the SCID-P and the borderline personality disorder section of the Structured Interview for DSM-IV Personality. The raters included the authors of this paper, each of whom has extensive experience administering research diagnostic interviews. The other four raters were research assistants with college degrees in the social or biological sciences. One of these four raters had more than 6 years’ experience administering the SCID-P and had previously trained other research assistants in its use. The other three received 3 months of training during which they observed at least 20 interviews, and they were observed and supervised in their administration of more than 20 evaluations. At the end of the training period, these three raters were required to demonstrate exact, or nearly exact, agreement with a senior diagnostician on five consecutive evaluations. During the course of the study, joint-interview diagnostic reliability information was collected on 17 patients; the kappa coefficient of agreement for the diagnosis of borderline personality disorder was 1.0. Throughout the Methods to Improve Diagnosis and Services project, ongoing supervision of the raters included weekly diagnostic case conferences involving all members of the team.
Toward the end of the clinical study and throughout the structured interview study, patients were given a booklet of questionnaires to complete at home and return by mail. Forty-nine patients in the clinical group and 275 patients in the structured interview group returned the booklet of questionnaires. To examine the clinical similarity of the study groups, the two groups of patients were compared on self-report symptom measures of bulimia (the Eating Disorder Inventory bulimia subscale), social phobia (a brief version of the Fear of Negative Evaluation Scale and the Fear Questionnaire social phobia subscale), agoraphobic fears (the Fear Questionnaire agoraphobia subscale and the Social Phobia and Anxiety Inventory agoraphobia subscale), posttraumatic stress (the Posttraumatic Stress Disorder Scale), obsessive-compulsive behavior (the Obsessive Compulsive Scale [M. Coles et al., unpublished]), cognitions common in generalized anxiety (the Penn State Worry Questionnaire), anxiety symptoms common in panic attacks (the Beck Anxiety Inventory), alcohol use (the Michigan Alcohol Screening Test), drug use (the Drug Abuse Screening Test), hypochondriasis (the Whiteley Index), and somatization (the Somatic Symptom Index [21, 22]). These scales have been commonly used in research, and their reliability and validity have been well established. The symptom severity measure of depression changed between the two studies. However, the symptom measure that all patients completed before their evaluation included a 21-item depression section, and this subscale was used to compare the two groups.
We used t tests to compare the groups on continuously distributed variables. Categorical variables were compared by means of the chi-square statistic or with the use of Fisher’s exact test if the expected value in any cell of a two-by-two table was less than 5.
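The rule for choosing between the chi-square statistic and Fisher's exact test can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis code: the function names and the small example table are hypothetical, and the two-sided Fisher p value is computed with the common "sum of tables no more probable than the observed one" convention, which the source does not specify.

```python
from math import comb

def expected_counts(table):
    """Expected cell counts of a two-by-two table under independence."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    return [[r * k / n for k in col_totals] for r in row_totals]

def choose_test(table):
    """Return 'fisher' if any expected cell count is below 5, else 'chi-square'."""
    smallest = min(min(row) for row in expected_counts(table))
    return "fisher" if smallest < 5 else "chi-square"

def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p value from the hypergeometric distribution."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Probability of x in the top-left cell with all margins held fixed.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lowest, highest = max(0, c1 - r2), min(r1, c1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical small table (not study data): every expected count is 2.
small = [[3, 1], [1, 3]]
print(choose_test(small), fisher_exact_two_sided(small))
```

With large groups such as the ones in this study, every expected count is well above 5, so the rule selects the chi-square statistic.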
The demographic characteristics of the two groups were quite similar (Table 1). The majority of both groups were white, female, high school graduates, and married or never married. The patients in the two groups were also compared on the self-report symptom severity measures, and there were no significant differences between the groups.
The rate of borderline personality disorder diagnoses was significantly higher in the structured interview group than in the clinical group (14.4%, N=59, versus 0.4%, N=2; χ²=70.69, df=1, p<0.001). The 14.4% frequency of borderline diagnoses in the structured interview group represents the diagnostic rate according to the borderline personality disorder section of the Structured Interview for DSM-IV Personality. The clinicians were not obligated to agree with the diagnoses made by the research interviews. We reviewed the clinical charts of 392 of the 409 patients interviewed with the borderline personality disorder section of the Structured Interview for DSM-IV Personality and abstracted the diagnoses recorded in the patients’ clinical records. Thirty-six (9.2%) of these patients either were diagnosed with borderline personality disorder (N=27) or had borderline personality disorder ruled out (N=9) by the intake clinician. Thus, the frequency of borderline diagnoses assigned by the clinicians was significantly increased when the information from the interview was presented to them before they made their evaluations (9.2% versus 0.4%; χ²=31.97, df=1, p<0.001).
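As a worked check of the first comparison, the sketch below computes the Pearson chi-square statistic for a two-by-two table from the counts reported above (59 of 409 patients in the structured interview group, 2 of 500 in the clinical group). It assumes the uncorrected Pearson formula; the text does not say whether a continuity correction was applied, though the uncorrected value matches the reported statistic. The function name is illustrative.

```python
def pearson_chi_square(a, b, c, d):
    """Uncorrected Pearson chi-square for the two-by-two table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Counts from the text: (diagnosed, not diagnosed) in each group.
structured = (59, 409 - 59)   # structured interview group
clinical = (2, 500 - 2)       # clinical group

chi2 = pearson_chi_square(*structured, *clinical)
print(f"{59 / 409:.1%} vs {2 / 500:.1%}, chi-square = {chi2:.2f}")
# prints: 14.4% vs 0.4%, chi-square = 70.69
```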
As expected, the rate of borderline personality disorder diagnoses was higher when the diagnosis was based on a semistructured diagnostic interview than when it was based on an unstructured clinical interview. In fact, clinicians were very reluctant to diagnose borderline personality disorder during their routine intake diagnostic evaluations. This is consistent with the results of Westen’s survey of clinicians’ diagnostic practices (4), which found that clinicians rely on their longitudinal observations to make personality disorder diagnoses. However, inconsistent with Westen’s hypothesis that clinicians rely on longitudinal observation to make these diagnoses, we found that clinicians more frequently diagnosed borderline personality disorder during the initial evaluation when the results of the research interview were presented to them. If clinicians relied on longitudinal observations and considered information based on the direct-question approach of research interviews to be irrelevant or invalid, then being presented with the results from the borderline personality disorder section of the Structured Interview for DSM-IV Personality should not have influenced the rate at which they diagnosed borderline personality disorder. Perhaps the issue in diagnosing personality disorders is not so much the need for longitudinal observation but the need for more time to conduct a comprehensive intake evaluation. Diagnostic evaluations are time-limited, and initially it is more important to determine the axis I diagnosis than an axis II diagnosis, because axis I pathology will have a more immediate effect on treatment planning.
A potential criticism of the diagnostic approach taken in the present study is that patients were interviewed upon presentation for treatment, when they were rarely euthymic. We are aware of the well-known pathologizing effect of psychiatric state on personality (3); nevertheless, we chose this time for assessment for two reasons. First, we were interested in the clinical significance of personality pathology, and the sooner a clinician was aware of the presence of borderline personality disorder, the more likely this information could be used for treatment planning. Second, if we had waited to assess personality disorders after symptom improvement, we would have disproportionately excluded patients with borderline personality disorder from our study group, because the presence of personality pathology predicts poorer outcome (23). Ultimately, the onus will be on us to demonstrate that our diagnostic approach is valid. However, a large literature examining the treatment, prognostic, familial, and biological correlates of personality disorders already suggests that diagnosis of personality disorder in this manner is valid (24, 25).
Westen (4) critiqued the research literature on personality disorder on several fronts. As already discussed, he noted the marked variance between the clinical and research approaches to diagnosing personality disorders. Moreover, he indicated that there was a marked discrepancy in the product of clinical and research interviews. That is, researchers make multiple axis II diagnoses without prioritizing them, whereas clinicians usually assign only one diagnosis. Westen is correct in noting that in studies using semistructured research interviews to diagnose personality disorder, the majority of patients with a personality disorder are given more than one diagnosis (26), whereas clinicians tend to assign only one diagnosis. However, it is erroneous to conclude that this necessarily reflects a problem with the research method. It is equally plausible to conclude that the clinical evaluation is not sufficiently thorough. A review of the literature on detection of comorbidity of axis I disorders suggests that clinicians underrecognize diagnostic comorbidity. Three studies of axis I diagnostic comorbidity rates based on clinical evaluations (27–29) found that approximately 25% of patients were diagnosed with two or more disorders. This rate is one-half to one-third the rates found in studies of psychiatric patients, as well as general population residents in the community, when diagnoses are based on semistructured interviews (30–41). Thus, Westen’s suggestion that there is a problem with current personality disorder research because of the high rate of multiple diagnoses can be interpreted in either of two ways. As Westen indicated, it might reflect a problem with a method of diagnosis that mindlessly casts too wide a net and fails to distinguish clinically relevant from irrelevant pathology. If true, then the problem of overdiagnosing comorbidity exists for both axis II and axis I assessments. 
Alternatively, it could be that there is a problem with the unstructured routine clinical evaluation and that clinicians do not thoroughly explore the presence of comorbid pathology after documenting the diagnosis responsible for the patient’s chief complaint.
Received Sept. 8, 1998; revisions received Dec. 11, 1998, and March 22, 1999; accepted April 1, 1999. From the Department of Psychiatry and Human Behavior, Brown University School of Medicine, Rhode Island Hospital, Providence. Address reprint requests to Dr. Zimmerman, Bayside Medical Center, 235 Plain St., Providence, RI 02905; email@example.com (e-mail). Supported in part by NIMH grant MH-48732 to Dr. Zimmerman. The authors thank Sharon Hunter, Ava Nepaul, Melissa Torres, and Sharon Younkin for assistance in collecting the data.