OBJECTIVE: The goal of this study was to review the reported methods of rater training, assessment of interrater reliability, and assessment of rater drift in clinical trials of treatments for depressive disorders. METHOD: Two psychiatrists independently identified all original reports of clinical trials relevant to depressive disorders published between 1996 and 2000 in the American Journal of Psychiatry and the Archives of General Psychiatry. Reported methods of rater training, assessment of interrater reliability, and assessment of rater drift were systematically summarized. RESULTS: Sixty-three original papers met criteria for inclusion. Only 11 (17%) of the studies reported the number of raters. Only two (9%) of the 22 multicenter trials and four (10%) of the 41 single-center trials documented rater training. Only nine (22%) of the single-center trials and three (14%) of the multicenter trials reported interrater reliability, despite a median of five raters (range=2–20). Only three (5%) of the 63 articles reported rater drift. CONCLUSIONS: Few published reports of clinical trials of treatments for depressive disorders adequately document the number of raters, rater training, assessment of interrater reliability, and rater drift.