We agree that Pearson’s correlations are less than ideal for assessing the reliability of individual item-to-item comparisons, especially when the scaling differs. That said, all of the studies reviewed used this coefficient, so we had to rely on it. Note that Pearson’s r likely produces inflated estimates of association relative to weighted kappa, which "corrects" for chance agreement; as a result, many of the individual Hamilton depression scale items are likely more problematic than we concluded. We would, however, argue that Pearson’s r is, in fact, appropriate for examining item-to-total correlations, because composite scores, such as Hamilton depression scale total scores, approach interval-level measurement. Indeed, Pearson’s r is widely used to compare individual items with total scores.
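
To make the contrast between the two coefficients concrete, the following minimal sketch computes Pearson’s r and a linearly weighted kappa for the same pair of ratings. The data are hypothetical and are not drawn from any of the studies reviewed: two raters score a single item on a 0 to 4 scale, and the second rater is assumed to score every patient exactly one point higher than the first. The function names are illustrative, and linear disagreement weights are only one of several common weighting schemes.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def linear_weighted_kappa(r1, r2, n_categories):
    """Weighted kappa with linear disagreement weights |i - j|.

    Ratings are assumed to be integers in 0..n_categories-1. The usual
    normalising factor 1/(n_categories - 1) cancels in the ratio below,
    so it is omitted.
    """
    n = len(r1)
    # Mean observed disagreement between the paired ratings.
    observed = sum(abs(a - b) for a, b in zip(r1, r2)) / n
    # Marginal proportion of each category for each rater.
    p1 = [r1.count(c) / n for c in range(n_categories)]
    p2 = [r2.count(c) / n for c in range(n_categories)]
    # Disagreement expected by chance if the raters were independent.
    expected = sum(
        p1[i] * p2[j] * abs(i - j)
        for i in range(n_categories)
        for j in range(n_categories)
    )
    return 1 - observed / expected


# Hypothetical ratings: rater B is always exactly one point above rater A.
rater_a = [0, 1, 2, 3, 0, 1, 2, 3]
rater_b = [1, 2, 3, 4, 1, 2, 3, 4]

print("Pearson r:     ", round(pearson_r(rater_a, rater_b), 2))
print("Weighted kappa:", round(linear_weighted_kappa(rater_a, rater_b, 5), 2))
```

In this constructed case Pearson’s r equals 1.0 because the two sets of ratings are perfectly linearly related, whereas the weighted kappa is about 0.33 because the raters never assign the same score. The example illustrates only the direction of the discrepancy; its size in real data would depend on the actual rating distributions.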