An intervention to improve the reliability of manuscript reviews for the Journal of the American Academy of Child and Adolescent Psychiatry

Published Online: https://doi.org/10.1176/ajp.150.6.947

OBJECTIVE: The effects of methods used to improve the interrater reliability of reviewers' ratings of manuscripts submitted to the Journal of the American Academy of Child and Adolescent Psychiatry were studied.

METHOD: Reviewers' ratings of consecutive manuscripts submitted over approximately 1 year were first analyzed; 296 pairs of ratings were studied. Intraclass correlations and confidence intervals for the correlations were computed for the two main ratings by which reviewers quantified the quality of the article: a 1-10 overall quality rating and a recommendation for acceptance or rejection with four possibilities along that continuum. Modifications were then introduced, including a multi-item rating scale and two training manuals to accompany it. Over the next year, 272 more articles were rated, and reliabilities were computed for the new scale and for the scales previously used.

RESULTS: The intraclass correlation of the most reliable rating before the intervention was 0.27; the reliability of the new rating procedure was 0.43, a significant difference. The reliability of the new rating scale was in the fair to good range, and it improved further when the ratings of the two reviewers were averaged and the reliability stepped up by the Spearman-Brown formula. The new rating scale had excellent internal consistency and correlated highly with other quality ratings.

CONCLUSIONS: The data confirm that the reliability of ratings of scientific articles may be improved by increasing the number of rating scale points, eliciting ratings of separate, concrete items rather than a global judgment, using training manuals, and averaging the scores of multiple reviewers.
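As context for the Spearman-Brown step-up mentioned in the results, the standard prophecy formula gives the reliability of an average of k raters from the single-rater reliability r. Plugging in the reported single-rater intraclass correlation of 0.43 with k = 2 reviewers yields (an illustrative calculation; this averaged figure is not reported in the abstract itself):

    r_k = k·r / (1 + (k − 1)·r) = (2 × 0.43) / (1 + 0.43) ≈ 0.60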
