Schizophrenia is a chronic illness that typically begins in early adulthood, rarely leading to a quick and complete recovery. Its course is chronic, with variable levels of symptoms and functions present at any given time throughout the life of the individual. This long-term course leaves open the possibility that an underlying active pathologic brain process continues throughout the life span of individuals with schizophrenia (1). This, however, has not been widely accepted, and many investigators have come to the conclusion that schizophrenia arises from an early developmental brain lesion, with a delay in onset of symptoms until maturation and adult organization of synaptic pathways are in place (2). Nevertheless, no direct evidence exists to implicate a developmental anomaly, only retrospective assumptions based on studies of adults with chronic schizophrenia. Assessment of cognitive function over the course of schizophrenic illness is one method of determining the static versus progressive nature of brain pathology in this illness.
While a few early longitudinal studies suggested cognitive deterioration from baseline levels in some patients with schizophrenia, particularly those hospitalized for several years (3, 4) but also in patients with first-episode schizophrenia (5), other studies found no change or even improvement in some functions over time (5–12). In previous longitudinal studies of neuropsychological function, methodological weaknesses included testing at only two time points (baseline and follow-up) (5–11), testing at nonuniform intervals with no statistical adjustment of duration (9–11), testing mixed groups of first-episode and chronically ill patients (10, 11), the absence of normal comparison subjects (5–9), and testing without medication at baseline and with medication at follow-up (10, 11), thus measuring the effectiveness of medication but not necessarily the natural history of cognition over the course of the schizophrenic illness.
At the State University of New York at Stony Brook, a prospective long-term follow-up study was initiated in 1987 to study brain structural and cognitive change over time in new patients with a recent onset of schizophrenia-like illness (8, 13). Eighty-seven patients satisfying DSM-III-R criteria for schizophrenia, schizoaffective disorder, or schizophreniform disorder and 52 normal comparison subjects continue to be studied on an annual basis. Reports from this cohort showed that patients had a significantly larger left ventricle size at first hospitalization than normal subjects (13), a left temporal lobe size that was correlated with duration of illness (13), and a significant reduction in total left and right hemispheric volume (8, 14) and an increase in left ventricle size over time (14).
At the first hospitalization and after stabilization, these patients scored significantly worse than comparison subjects on measures of executive function, verbal and spatial memory, concentration/speed, and global cognition, with comparable performance to that of patients with chronic schizophrenia (7, 15). Our preliminary analysis of cognition on repeat follow-up evaluations of some patients showed a lack of deterioration and even some improvement (8). However, these data included few time points, extended to 4 years in only a small number of subjects, and were not compared with data from a similarly aged normal comparison group, so the possibility that the improvements were practice effects could not be excluded. The present study is a complete analysis of 42 of these patients and 16 comparison subjects who completed serial neuropsychological evaluations over a mean follow-up of approximately 4 years.
Consecutive patients admitted to a public hospital’s catchment area acute inpatient unit (Kings Park Psychiatric Center, Kings Park, N.Y.) from 1987 to 1992 with their first episode of schizophrenia, schizophreniform disorder, or schizoaffective disorder received a comprehensive neuropsychological test battery and magnetic resonance imaging (MRI) scans after approximately 2 to 4 weeks of stabilization with medication. Baseline interviews included the Brief Psychiatric Rating Scale (16), the Scale for the Assessment of Negative Symptoms (SANS) (17), and the Scale for the Assessment of Positive Symptoms (SAPS) (18). Only patients who met the criteria for schizophrenia at the end of 2 years of illness were included in the present study group. Normal comparison subjects were recruited from the surrounding community and screened by structured interviews, eliminating those with major psychiatric disorders, axis II personality disorders (19), alcohol or drug abuse, or physical illness requiring medication. All subjects gave written informed consent to participate in this study after the procedure was fully explained.
All subjects were interviewed by using a revised version of the Schedule for Affective Disorders and Schizophrenia (20), and diagnoses were made by using DSM-III-R criteria and on the basis of interviews, medical records, and information from family members and treating physicians. Two independent professionals assigned diagnoses to all cases. When disagreements were present, a third physician was consulted (L.E.D.), and a final diagnosis was reached by consensus of all three. In addition, SANS and SAPS ratings were made annually. These ratings were available for 37 patients. The sums of all positive and negative symptoms (excluding global ratings) were calculated for each annual evaluation.
The subjects reported here are only those who agreed to neuropsychological retesting and who had a minimum of three yearly neuropsychological evaluations, including a baseline evaluation. All patients and comparison subjects were actively enrolled in the study at the time of data analysis, so data were missing from later time points only because subjects had not yet reached those time points. Thus, there was no selective case attrition. Follow-up neuropsychological evaluations were completed on 42 patients (31 men and 11 women) who were a mean of 26.3 years of age (SD=7.4) at baseline and 16 comparison subjects (11 men and five women) who were a mean of 26.1 years old (SD=5.3) at baseline. The patients had a mean of 12.5 years (SD=2.1) of education, whereas comparison subjects had completed a mean of 14.9 years of education (SD=2.1) (t=–3.97, df=1, p<0.001). The mean age at first hospitalization for patients was 26.7 years (SD=7.2). The proportion of men did not differ significantly between patients (31 of 42) and comparison subjects (11 of 16) (χ2=0.19, df=1, n.s.).
The mean time of follow-up was 3.6 years (SD=0.7) for patients and 3.8 years (SD=0.9) for the comparison group. Of the 42 patients, 14 had follow-up testing through their fifth year of illness, 15 had testing through their fourth year, nine had testing through their third year, and four had testing through their second year of illness. At the follow-up testing, patients were clinically stable and not in acute episodes of illness. Of the 16 comparison subjects, two had 5-year follow-up testing, six had 4-year follow-up, three had 3-year follow-up, and five had 2-year follow-up testing. The neuropsychological test battery was administered by a trained psychology graduate student (M.S.). These tests and evaluations are detailed in previous publications (7, 15). The unit of data analysis was the rate of change per year for each subject; a regression slope was used.
All medications (neuroleptic and nonneuroleptic drugs) at all yearly evaluations were recorded and converted into chlorpromazine and atropine dose equivalents by using standard formulas (21). Atropine-equivalent doses, which included estimates of neuroleptic and nonneuroleptic medications, were used as an indirect estimate of the amount of anticholinergic activity in the medications. To keep treatment uniform, the treating psychiatrist medicated all patients at baseline with 10 to 20 mg of haloperidol and 1 to 2 mg of benztropine mesylate, if needed. This medication regimen was successful in most, but not all, cases. At baseline, 39 patients were receiving typical neuroleptics (haloperidol: N=27, fluphenazine: N=4, thiothixene: N=4, thioridazine: N=3, and loxapine: N=1), and three patients were taking no medication. At the final follow-up, 30 patients were taking typical neuroleptics (haloperidol: N=20, thiothixene: N=4, fluphenazine: N=2, perphenazine: N=2, thioridazine: N=1, and loxapine: N=1), four patients were taking atypical neuroleptics (clozapine: N=3, and risperidone: N=1), and eight patients were taking no medication. Most patients remained on the same treatment regimen throughout the follow-up period. Average chlorpromazine and atropine dose equivalents at baseline were 546.4 mg (SD=523.6) and 2.8 mg (SD=5.3), respectively, and at follow-up, 301.6 mg (SD=377.2) and 6.3 mg (SD=20.2), respectively.
The MRI methods and results on 4 to 5 years of follow-up were published previously (14). Of the 42 patients and 16 comparison subjects in the present study, 34 patients and 12 comparison subjects completed serial MRI scans. Since details of these procedures and subsequent analyses appeared in a separate publication (14), only a brief description of the MRI procedures will be given here. MRI scans were performed by using a GE Signa scanner with a 1.5-T magnet and spin-echo T1-weighted 5-mm coronal sequences with a 2-mm gap. The MRI specifications determined in 1987 were kept constant over time to determine longitudinal change. Measurements of left and right hemispheres were made on coronal slices including the whole brain, from frontal to occipital poles, and excluding the cerebellum, ventricles, brain stem, and CSF between outer sulci. Measurements of the temporal lobes were also made on coronal slices, beginning posteriorly with the first visualization of the superior and inferior colliculi, tracing medially at the lowest medial end of the lateral sulcus, circumscribing the outer surface and following fissures ending at a line connecting the base of the stem of the lobes. For the lateral ventricles, coronal measurements began posteriorly, where the occipital horns were first visualized, and ended anteriorly, where the frontal horns could no longer be seen. Intrarater reliability was assessed with intraclass correlations on 10 randomly selected scans that were measured twice, several weeks apart; the correlations ranged from 0.90 to 0.99 for all structures analyzed. All measurements were made by one master’s-level research assistant (M.S.) trained in brain anatomy who was blind to diagnostic status and year of follow-up.
At the last follow-up evaluation, diagnoses were as follows: schizophrenia, paranoid type: N=8; schizophrenia, undifferentiated type: N=7; schizophrenia, residual type: N=11; schizoaffective disorder: N=9; and in remission: N=7. Of those in remission, four were taking medication, and three were not.
Because of the large number of neuropsychological dependent measures, neuropsychological summary scale scores were created for each annual evaluation by converting all individual test variables to z scores on the basis of the mean and standard deviations of an independent normal comparison group (N=74; men: N=56, women: N=18), which did not differ from the present study’s normal comparison group (N=16) on age, education, or parental socioeconomic status. This comparison group was tested on one occasion. Summary scales were constructed empirically, on the basis of internal consistency measures, and were the summed average of the individual z scores for each test. They were as follows: language (prorated Verbal IQ; Wide-Range Achievement Test—Revised—Reading, Boston Naming Test, Word Attack, and Controlled Oral Word Association), executive function (Wisconsin Card Sorting Test, abbreviated Booklet Category Test, and Stroop Color-Word Test), verbal memory (California Verbal Learning Test, Associate Learning, and Logical Memory [sum of story elements and immediate and delayed recall from the Wechsler Memory Scale]), spatial memory (Visual Reproduction from the Wechsler Memory Scale and Benton Visual Retention Test), concentration/speed (Trail Making Tests A and B, Symbol Digit Modalities Test, Finger Tapping test, and Cancellation Test), sensory-perceptual (finger gnosis and finger-tip number writing), and global cognition (the summed average of all six scale scores). These details are provided in our earlier report (7). Means and standard deviations of each test and time points for patients and comparison subjects are provided in an appendix, which is available on request (Dr. Hoff).
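The construction of a summary scale score described above can be sketched as follows. This is a minimal illustration only: the test names, normative values, and raw scores are hypothetical placeholders, and "summed average" is assumed here to mean the arithmetic mean of the z scores within a domain.

```python
from statistics import mean

def z_score(raw, norm_mean, norm_sd):
    """Standardize a raw score against a normative group's mean and SD."""
    return (raw - norm_mean) / norm_sd

def summary_scale(z_scores):
    """Average the z scores of the individual tests that make up one domain."""
    return mean(z_scores)

# Hypothetical normative (mean, SD) pairs for three tests in one domain.
norms = {"test_a": (50.0, 10.0), "test_b": (30.0, 5.0), "test_c": (100.0, 15.0)}
# Hypothetical raw scores for one subject at one annual evaluation.
raw = {"test_a": 40.0, "test_b": 25.0, "test_c": 85.0}

z = {name: z_score(raw[name], *norms[name]) for name in raw}
domain_score = summary_scale(list(z.values()))
print(z)
print(domain_score)
```

Under the same assumption, a global score would be the mean of the six domain scores computed in this way.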
The rate of change was calculated for each neuropsychological summary scale score by using all available annual data points, with the requirement that 3 or more years of test data be available. The slope of a line through all data points for each individual per variable was calculated by using linear regression with time (e.g., baseline, year 1, year 2, year 3) as the independent variable. This approach was advantageous because it allowed for fitting a line over three or more time points, which could be interpreted as a weighted average of repeated observations, measuring a consistent directional trend. This approach has considerable advantages over a pre-post design in reducing error variance associated with subject or measurement inconsistency (22). This approach was also used in the analysis of the MRI data to detect the average rate of change in a structure per year (14). Thus, the slope expressed in z scores indicates the average amount of improvement or deterioration in standard deviations in a particular cognitive domain per year.
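The per-subject slope can be illustrated with an ordinary least-squares fit of summary scale score on year. The sketch below assumes years are coded 0 for baseline, 1 for year 1, and so on; the function name and the example scores are hypothetical.

```python
def ols_slope(years, scores):
    """Least-squares slope of scores on years, in z-score units per year."""
    n = len(years)
    if n < 3:
        raise ValueError("at least three annual evaluations are required")
    t_bar = sum(years) / n
    s_bar = sum(scores) / n
    sxx = sum((t - t_bar) ** 2 for t in years)
    sxy = sum((t - t_bar) * (s - s_bar) for t, s in zip(years, scores))
    return sxy / sxx

# Hypothetical subject tested at baseline and at years 1, 2, and 3.
years = [0, 1, 2, 3]
z_scores = [-1.5, -1.4, -1.2, -1.1]
slope = ols_slope(years, z_scores)
print(round(slope, 2))  # 0.14 standard deviations of improvement per year
```

Because the slope equals the sum of the scores weighted by (year minus mean year) divided by the sum of squared year deviations, it can be read as a weighted average of the repeated observations. A subject with a missing year can still be fitted as long as three or more time points remain.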
Both the patient group and the normal comparison group were broken down into two groups: those who had follow-up through years 2 and 3 (shorter follow-up) and those who had follow-up through years 4 and 5 (longer follow-up). Within the patient cohort, there were no statistically significant differences between the shorter (N=13) and longer (N=29) follow-up groups on baseline neuropsychological measures, neuropsychological slopes, age, age at onset, or baseline SANS and SAPS scores. Likewise, within the comparison group, there were no statistically significant differences between the shorter (N=8) and longer (N=8) follow-up groups on baseline neuropsychological measures, slopes, or age. Therefore, we felt that groups of subjects with shorter and longer follow-up times could be combined. In addition, the slope measured the rate of change per year and was independent of the number of years of follow-up.
The brain structures quantified by measurements of MRI scans were left and right hemisphere volumes, left and right lateral ventricular volumes, and left and right temporal lobe volumes. The first two were found to change over time (cerebral volume decreasing and ventricular volume increasing) to a greater extent in patients than in comparison subjects, whereas reduced temporal lobe volume was found to be correlated with duration of illness (8, 13, 14). Slopes were also calculated for SANS and SAPS scores, reflecting the amount of increase or decrease in positive and negative symptoms over the same time period and for chlorpromazine and atropine dose equivalents.
Analyses of covariance (ANCOVAs), with baseline performance as the covariate, were computed for each neuropsychological summary scale score to determine differences in slopes between patients and comparison subjects. Baseline performance was used as a covariate because statistically significant negative relationships were found between baseline neuropsychological performance and slope, with lower baseline associated with greater slope. This was also done to avoid the statistical problem of regression to the mean. Post hoc tests using Tukey’s honestly significant difference (unequal study groups) were performed for statistically significant effects. Spearman r correlations were performed between neuropsychological summary scale slopes and the slopes of left and right brain volumes, temporal lobes, and lateral ventricles; the slopes of SANS and SAPS total scores; and the slopes of chlorpromazine and atropine dose equivalents. Spearman correlations were used to guard against violations of assumptions of the Pearson correlation—i.e., asymmetric distributions, which were seen on visual inspection of some variables. MRI slopes and neuropsychological slopes were adjusted for baseline values by using linear regression.
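A two-group ANCOVA of this kind is equivalent to comparing a regression of slope on baseline alone with a regression of slope on baseline plus a group indicator. The sketch below is one way to compute the adjusted group difference and its F statistic; it is not the authors' software, and the data are hypothetical. With 58 subjects, the degrees of freedom would be 1 and 55.

```python
def residuals(y, x):
    """Residuals of y after simple linear regression on x (with intercept)."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    b = sum((a - x_bar) * (c - y_bar) for a, c in zip(x, y)) / sxx
    return [c - y_bar - b * (a - x_bar) for a, c in zip(x, y)]

def ancova_group_effect(slopes, baseline, group):
    """Adjusted group difference and F test (df = 1, n - 3) for two groups."""
    n = len(slopes)
    ry = residuals(slopes, baseline)  # slopes with baseline partialled out
    rg = residuals(group, baseline)   # group indicator with baseline partialled out
    sgg = sum(g * g for g in rg)
    b_group = sum(g * y for g, y in zip(rg, ry)) / sgg
    sse_reduced = sum(y * y for y in ry)          # model: baseline only
    sse_full = sse_reduced - b_group ** 2 * sgg   # model: baseline + group
    f_stat = (sse_reduced - sse_full) / (sse_full / (n - 3))
    return b_group, f_stat, (1, n - 3)

# Hypothetical data: group 0 = patients, group 1 = comparison subjects.
group = [0, 0, 0, 0, 1, 1, 1, 1]
baseline = [-2.0, -1.0, -1.5, -0.5, -0.5, 0.5, 0.0, -1.0]
noise = [0.01, -0.01, 0.02, -0.02, 0.01, -0.01, 0.02, -0.02]
slopes = [0.2 * g - 0.05 * b + e for g, b, e in zip(group, baseline, noise)]

b_group, f_stat, df = ancova_group_effect(slopes, baseline, group)
print(b_group, f_stat, df)
```

In this toy data set, lower baseline scores go with larger slopes, which is the pattern that motivated the covariate adjustment.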
Figure 1 illustrates the average performance in z scores for each year from baseline to 5-year follow-up for patients and comparison subjects on all neuropsychological domains. Patients scored approximately 1 to 2 standard deviations below normal comparison subjects on all domains throughout the follow-up period. Since it is well documented that patients with schizophrenia perform worse than normal comparison subjects on measures of neuropsychological function, the focus of the present study was to determine if patients and comparison subjects were different in their rate of change over time: that is, do patients with schizophrenia deteriorate in cognitive abilities in the early course of illness?
ANCOVAs adjusting for baseline performance were performed on the seven summary scales. Comparison subjects had significantly greater improvement over time on verbal memory and sensory-perceptual domain scores than patients. Analysis of Figure 2 reveals that comparison subjects had an average annual improvement on these scales of approximately 0.2 standard deviations compared to patients, who had significantly less improvement. That is, patients deteriorated or did not improve on these measures compared to normal subjects; they did not benefit from repeated annual practice. Interestingly, only the memory scale scores indicated deterioration in patients over time (i.e., negative slopes), but only in the verbal memory domain was the difference between patients and comparison subjects statistically significant.
There were no statistically significant differences between male (N=31) and female (N=11) patients in executive function (F=3.12, df=1, 39, p<0.08), concentration/speed (F=3.57, df=1, 39, p<0.07), and global (F=3.71, df=1, 39, p<0.06) cognitive domains.
When diagnosis at last follow-up was considered, diagnostic groups and comparison subjects differed significantly from one another on executive function (F=3.59, df=5, 51, p<0.007), verbal memory (F=2.64, df=5, 51, p<0.03), and concentration/speed (F=2.36, df=5, 51, p<0.05) domains. Post hoc Tukey honestly significant difference test scores revealed that patients who were considered to be in remission had greater improvement than comparison subjects in the executive function domain (p<0.05). Post hoc test scores did not reach significant levels for the verbal memory domain or the concentration/speed domain.
Figure 3 shows average SANS and SAPS ratings over the 5-year period. As shown, positive symptoms dropped steeply from baseline to first-year follow-up and then remained constant, whereas negative symptoms decreased slightly and gradually over time. Statistically significant correlations were found between the slopes of the SAPS scales and the slopes of the executive function (rs=–0.40, N=33, p<0.02), spatial memory (rs=–0.54, N=33, p<0.001), concentration/speed (rs=–0.56, N=33, p<0.0006), and global cognition (rs=–0.65, N=33, p<0.00005) scale scores, indicating that decreases in positive symptoms were related to improvements in these cognitive domains. No statistically significant relationships were found between neuropsychological slopes and the slopes of negative symptoms or of chlorpromazine or atropine dose equivalents.
There were no statistically significant correlations between neuropsychological and MRI change in patients. Within the comparison group, decreases in right temporal lobe size were associated with decreases in spatial memory (rs=0.63, N=12, p<0.05) and concentration/speed (rs=0.61, N=12, p<0.05). Enlargement of the left (rs=–0.85, N=12, p<0.001) and right (rs=–0.61, N=12, p<0.05) lateral ventricle over time was associated with deterioration in concentration/speed.
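The rank correlations reported above can be illustrated with a short sketch of Spearman's coefficient, computed as a Pearson correlation on ranks with tied values given their average rank. The slope values in the example are hypothetical and are not study data.

```python
def average_ranks(values):
    """Rank values from 1 to n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical slopes: symptom change per year and cognitive change per year.
saps_slopes = [-3.0, -1.0, -0.5, 0.0, 2.0]
cognition_slopes = [0.30, 0.20, 0.25, 0.05, -0.10]
rs = spearman(saps_slopes, cognition_slopes)
print(round(rs, 2))  # -0.9
```

A negative coefficient here means that subjects whose symptom scores fell faster tended to show larger cognitive gains, which is the direction of the associations with SAPS slopes described in the text.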
Poor cognitive performance was detected initially by the first hospitalization in the present group of patients with schizophrenia, and when studied serially over the first few years of illness, there was surprisingly little change in cognitive functioning compared to similarly studied normal comparison subjects. On average, the performance of patients with schizophrenia remained 1 to 2 standard deviations below that of comparison subjects throughout the 5-year period. Although many neuropsychological functions actually improved over time in both patients and comparison subjects, verbal memory scores did not improve in patients. This lack of improvement did not appear to be related to the effects of medications, as measured by chlorpromazine and atropine dose equivalents (atropine being an indirect measure of the anticholinergic properties of the medication). Serum anticholinergic assays would have better addressed the issue of lack of memory improvement in patients, but blood samples were not available for study.
Some neuropsychological studies have demonstrated that verbal memory is a selective deficit in patients with schizophrenia (23). These findings are supported by relatively greater volumetric reductions in the left temporal regions of patients with schizophrenia (24) and may suggest an area of lessened functional and anatomical plasticity in patients with schizophrenia. In general, our findings were consistent with those of a recent review of 15 follow-up studies of neuropsychological testing in patients with schizophrenia, in which it was concluded that after the onset of illness, cognitive deficits are relatively stable over long periods (12).
There was strong evidence that improvement in positive symptoms, in particular, was associated with improvement in cognitive performance. Reduction of positive symptoms per se may have had an enhancing effect on cognition, or it is possible that treatment responders have premorbidly healthier brains with greater neural plasticity and greater capacity to improve on cognitive tasks. Changes in cognitive functioning were not correlated with the changes in brain structure previously reported in this cohort. It is possible that the brain changes seen over time are unrelated to the manifestation of the disease or do not have functional consequences at this early stage of illness.
There are several limitations to our study. One limitation is that small group sizes, particularly of the comparison group, may have limited our ability to detect differences between the two groups in rate of improvement or deterioration in cognition. However, effect sizes using group means and pooled standard deviations computed for nonsignificant group differences (mean of patients minus mean of comparison subjects) were as follows: language: –0.13, executive function: 0.29, spatial memory: 0.25, concentration/speed: 0.18, and global cognition: 0.08. The only effect sizes large enough to be of interest—i.e., the executive and spatial memory domains—go in the direction opposite that predicted. That is, patients actually improved more than comparison subjects, although not significantly, on measures of executive function and spatial memory. Thus, we do not feel that lack of power was a major issue in our not being able to detect differences. A second problem with the study has to do with different lengths of follow-up periods within and between patient and comparison groups and that not all subjects were evaluated through their fifth year of study. By comparing patients studied through shorter (through 2 or 3 years) and longer (through 4 or 5 years) follow-up periods, we were able to determine that there were no differences between the groups on demographic or clinical characteristics, nor were there differences in baseline cognitive function or in neuropsychological slopes. Thus, we felt confident that differences in length of follow-up did not affect the results considerably. In addition, our unit of measurement, the slope, reflects the amount of cognitive change per year, regardless of the number of years of follow-up. A final issue concerns the use of a linear slope to characterize changes in neuropsychological function or symptoms that may not be linear. 
Unfortunately, we do not have enough data points to determine the pattern of these changes (e.g., linear, U-shaped, exponential). Rather, we used the slope, which is a weighted average of repeated observations, as an outcome measure because it can be interpreted as a measure of consistent directional trend. We did not assume that the pattern of change is a linear one, but we used a linear slope as the best first approximation of these longitudinal changes.
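The effect sizes cited among the limitations above were computed from group means and pooled standard deviations. A minimal sketch of that calculation follows; it assumes the conventional pooling weighted by degrees of freedom, which the text does not specify, and the means and standard deviations shown are hypothetical.

```python
def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation, weighting each variance by its df (assumed)."""
    return (((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)) ** 0.5

def effect_size(mean_pat, sd_pat, n_pat, mean_cmp, sd_cmp, n_cmp):
    """Standardized difference: patient mean minus comparison mean."""
    return (mean_pat - mean_cmp) / pooled_sd(sd_pat, n_pat, sd_cmp, n_cmp)

# Hypothetical mean annual slopes (z-score units) for 42 patients and
# 16 comparison subjects; these are not the study's values.
d = effect_size(0.15, 0.20, 42, 0.10, 0.20, 16)
print(round(d, 2))  # 0.25
```

With this sign convention, a positive value indicates that patients improved more per year than comparison subjects.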
In conclusion, there is evidence for lack of improvement of verbal memory, despite repeated evaluations, in the first 2 to 5 years of schizophrenic illness, but there is also evidence for stability in other cognitive abilities during the same time period. The performance of patients with schizophrenia remains significantly below that of comparison subjects. Whether further deterioration or improvement occurs beyond the first few years of illness will be answered in more long-term follow-up studies of this cohort and others.
Presented in part at the Sixth International Congress on Schizophrenia Research, Colorado Springs, April 12, 1997, and the 37th annual meeting of the American College of Neuropsychopharmacology, Las Croabas, Puerto Rico, Dec. 18, 1998. Received Aug. 20, 1998; revision received Feb. 25, 1999; accepted March 11, 1999. From the Stony Brook First-Episode Schizophrenia Project, Department of Psychiatry, Health Sciences Center, State University of New York, Stony Brook; and the Department of Psychiatry, University of California Davis School of Medicine, Sacramento. Address reprint requests to Dr. Hoff, University of California Davis-Napa Psychiatric Research Center, Napa State Hospital, 2100 Napa-Vallejo Highway, Napa, CA 94558. Supported by NIMH grant MH-44233. The authors thank Angela Boccio-Smith for administrative assistance, Scott Espinoza for technical support, and Sue Thiemann, M.S., for statistical consultation.
Figure 1. Neuropsychological Summary Scale Scores for the First 5 Years in 42 Patients With First-Episode Schizophrenia and 16 Normal Comparison Subjects
Figure 2. Slopes of Neuropsychological Summary Scale Scores for the 5-Year Period in 42 Patients With First-Episode Schizophrenia and 16 Normal Comparison Subjects (Adjusted for Baseline Performance)
Significant group effect (F=8.54, df=1, 55, p=0.005).
Significant group effect (F=8.03, df=1, 55, p=0.006).
Figure 3. SANS and SAPS Scores for the First 5 Years in 42 Patients With First-Episode Schizophrenia