Letter to the Editor
Dr. Hilsenroth and Colleagues Reply
MARK J. HILSENROTH, PH.D.; STEVEN J. ACKERMAN, M.A.; MATTHEW D. BLAGYS, M.A.; JENNIFER L. PRICE, M.A.
Am J Psychiatry 2001;158:1936-1937. doi:10.1176/appi.ajp.158.11.1936

To the Editor: We agree in part with Dr. Janca that our very high levels of interrater reliability regarding the DSM-IV axis V clinician rating scales may have been influenced by extensive training, high motivation on the part of the clinicians, and the clinicians’ working within a larger research protocol. Also, it is fairly common that interrater reliability for a variety of clinical conditions or constructs is higher between raters at the same site than for raters across sites (1, 2). However, our results are quite similar to those from a number of other studies involving the Global Assessment of Functioning Scale and its predecessor, the Global Assessment Scale (3–10). This prior research has shown the interrater reliability of the Global Assessment of Functioning Scale to be in the "good" or "excellent" range (ICC/κ=0.60–0.74 and ICC/κ=0.75 or more, respectively [11]). In addition, the WHO Short Disability Assessment Schedule, which possesses subcomponents similar to those in the Global Assessment of Relational Functioning Scale and the Social and Occupational Functioning Assessment Scale, has been shown to possess "good" interrater reliability (ICC=0.62) in at least one multisite field trial (12).
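For readers who want the cutoffs above in concrete form, the following is a minimal sketch of how a reliability coefficient maps onto the two bands quoted from reference 11. The function name `reliability_band` and the label "below good" are illustrative assumptions; the letter itself names only the "good" and "excellent" ranges.

```python
def reliability_band(coefficient: float) -> str:
    """Map an ICC or kappa value to the descriptive band quoted in the letter.

    Only two bands are stated in the text: "good" (0.60-0.74) and
    "excellent" (0.75 or more). Anything lower is labeled "below good"
    here, which is an assumed placeholder, not a term from the letter.
    """
    if coefficient >= 0.75:
        return "excellent"
    if coefficient >= 0.60:
        return "good"
    return "below good"  # assumption: the letter names no band under 0.60


# The multisite ICC of 0.62 reported for the WHO Short Disability
# Assessment Schedule falls in the "good" band.
print(reliability_band(0.62))
```

Under these cutoffs, the 0.62 coefficient from the multisite field trial is classified as "good," consistent with the description in the text.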

Furthermore, we disagree with Dr. Janca’s conclusion that our findings may not represent a true psychometric evaluation of these scales. We base this disagreement on three potentially related issues for further research in the assessment of multiaxial psychiatric functioning. Our discussion of these issues is particularly relevant to the rating of patient-clinician interactions and interview narratives in psychology and psychiatry.

First, the high level of agreement between the two raters in our study of the DSM-IV axis V scales suggests that these measures may be used to reliably rate the general severity of psychopathology and relational, social, and occupational functioning. The specific rating criteria developed for the DSM-IV axis V scales appear sufficiently clear to produce high levels of interrater reliability. The extensive supervised training of raters in the use of this scale likely contributed to the high level of agreement between raters. The low interrater reliability coefficients for the DSM-IV axis V scales found in other studies may not be assumed to reflect poor coding criteria or scale definition but rather may be due to poor or inadequate rater training.

While time constraints may prohibit such extensive training, it provides an optimal level of familiarity with the DSM-IV axis V scales and helps raters make subtle distinctions between scores before rating the patients included in the data analyses. The excellent interrater reliability coefficients achieved in our study suggest that the general severity of psychopathology and relational, social, and occupational functioning can be reliably coded, and they underscore the importance of training judges before coding begins.

Second, we encourage future investigators to examine the differential impact of the time or length of the interview in relation to reliability. The length of interviews used in most reliability field trials usually ranges from approximately 45 minutes up to 2 hours. The ratings from our original study were based on two sessions, each lasting approximately 3 hours. The higher levels of interrater reliability that were found in our work may be related to the clinician’s spending this additional time interacting with the patient. The implications of time or length of interviews on reliability have rarely been discussed in the psychiatric literature, and given the current impingement of third-party payers and the reduced support for more thorough evaluations (13, 14), this seems an especially important issue. If clinicians are unduly limited in the time spent on an assessment, then less reliability, misdiagnosis, and potential problems for treatment may result.

In addition to including extra time spent by the clinicians, both in training and in interacting with their patients, our study also focused parts of the interview on key relational episodes from patients’ lives. This focus on patient narratives during the interview (15), as well as the organization of the interview and feedback session from a therapeutic assessment model (16), may have contributed to the higher reliability of the interview or videotape raters. Rather than focusing simply on the description of psychiatric symptoms or on a structured interview (i.e., the Structured Clinical Interview for DSM), the patients were encouraged to describe and explore relational interactions (thoughts, feelings, and fantasies) associated with the appearance of symptoms. In this manner, the clinicians attempted to enlist the patients to help them clarify and understand the impact of these experiences, both past and present, on their functioning. This relationally based exploration was focused on helping clinicians gain a better understanding of the personal meaning of life experiences related to psychiatric symptoms as well as explore prior successful and unsuccessful ways of coping with problems or symptoms.

The amount of prerequisite training on any scale applied to interview data (or any patient-clinician interaction) will invariably affect the subsequent reliability of that scale or measure. It is also possible that additional time spent and/or the relational focus of an interview can aid clinicians in making more reliable assessments of the general severity of psychopathology and relational, social, and occupational functioning. Perhaps when examining a patient’s general severity of psychopathology and relational, social, and occupational functioning, clinicians should be aided by first training to meet an acceptable criterion for accuracy on a given scale, spending additional time with the patient, and then examining psychiatric symptoms and relational, social, and occupational functioning within an interpersonal and narrative context. In contrast, when adequate prerequisite training, involved patient-clinician interaction, and exploration of functioning within a relational context are not present, the true psychometric properties of any clinician rating scale may be underestimated.

References

1. Keller MB, Klein DN, Hirschfeld RM, Kocsis JH, McCullough JP, Miller I, First MB, Holzer CP III, Keitner GI, Marin DB, Shea T: Results of the DSM-IV Mood Disorders Field Trial. Am J Psychiatry 1995; 152:843-849

2. Perry JC, Hoglend P, Shear K, Vaillant GE, Horowitz M, Kardos ME, Bille H, Kagan D: Field trial of a diagnostic axis for defense mechanisms for DSM-IV. J Personal Disord 1998; 12:56-68

3. Endicott J, Spitzer RL, Fleiss JL, Cohen J: The Global Assessment Scale: a procedure for measuring overall severity of psychiatric disturbance. Arch Gen Psychiatry 1976; 33:766-771

4. Spitzer RL, Forman JB: DSM-III field trials, II: initial experience with the multiaxial system. Am J Psychiatry 1979; 136:818-820

5. Strakowski SM, Keck PE Jr, McElroy SL, West SA, Sax KW, Hawkins JM, Kmetz GF, Upadhyaya VH, Tugrul KC, Bourne ML: Twelve-month outcome after a first hospitalization for affective psychosis. Arch Gen Psychiatry 1998; 55:49-55

6. Hollon SD, DeRubeis RJ, Evans MD, Wiemer MJ, Garvey MJ, Grove WM, Tuason VB: Cognitive therapy and pharmacotherapy for depression: singly and in combination. Arch Gen Psychiatry 1992; 49:774-781

7. Hoglend P: Transference interpretations and long-term change after dynamic psychotherapy of brief to moderate length. Am J Psychother 1993; 47:494-507

8. Hooley JM, Hoffman PD: Expressed emotion and clinical outcome in borderline personality disorder. Am J Psychiatry 1999; 156:1557-1562

9. Durbin CE, Klein DN, Schwartz JE: Predicting the 2½-year outcome of dysthymic disorder: the roles of childhood adversity and family history of psychopathology. J Consult Clin Psychol 2000; 68:57-63

10. Williams JBW, Gibbon M, First MB, Spitzer RL, Davies M, Borus J, Howes MJ, Kane J, Pope HG, Rounsaville B, Wittchen H-U: The Structured Clinical Interview for DSM-III-R (SCID), II: multisite test-retest reliability. Arch Gen Psychiatry 1992; 49:630-636

11. Fleiss J: Statistical Methods for Rates and Proportions, 2nd ed. New York, Wiley, 1981

12. Michels R, Siebel U, Freyberger HJ, Stieglitz RD, Schaub RT, Dilling H: The multiaxial system of ICD-10: evaluation of a preliminary draft in a multicentric field trial. Psychopathology 1996; 29:347-356

13. Eisman E, Dies R, Finn SE, Eyde L, Kay GG, Kubiszyn T, Meyer GJ, Moreland K: Problems and limitations in the use of psychological assessment in contemporary healthcare delivery. Professional Psychol: Res Practice 2000; 31:131-140

14. Piotrowski C: Assessment practices in the era of managed care: current status and future directions. J Clin Psychol 1999; 55:787-796

15. Westen D: Divergences between clinical and research methods for assessing personality disorders: implications for research and the evolution of axis II. Am J Psychiatry 1997; 154:895-903

16. Finn SE, Tonsager M: Information-gathering and therapeutic models of assessment: complementary paradigms. Psychol Assess 1997; 9:374-385
