Schizophrenia is a debilitating disease involving multiple and diverse symptoms, none of which is unique to schizophrenia. There is no biological marker to diagnose schizophrenia, and today the diagnosis is achieved primarily by psychiatric evaluation, which relies on symptoms, medical history, interview, and observation. This procedure is difficult and somewhat unreliable, since each patient manifests a different subset of symptoms, whose evaluation in turn may differ even across expert observers.
The aim of this work was to develop an ecologically valid but fully operationalized testing paradigm that may shed light on the cognitive abnormalities (i.e., cognitive dysmetria) in schizophrenia. Thus, we have developed a testing paradigm that is based on virtual reality technology, which includes real-time interactions and multimodal experience. The objectives of this study were twofold: 1) to assess the construct validity of our paradigm in relation to standard diagnostic criteria and commonly used tools for assessing symptoms and signs in schizophrenia and 2) to illustrate its mechanistic usefulness, by revisiting perseveration in schizophrenia and using our scheme to show that perseverative behavior is sensitive to context.
We believe that virtual reality technology is especially suitable for studying schizophrenia for two main reasons. First, schizophrenia involves primarily high-level brain functions, and therefore some of its symptoms (such as abnormal integration) may be manifested only in an ecologically valid environment with a strong sense of presence. Calling on multiple cognitive and sensorimotor processes within the same testing environment allows abnormal integration or interactions among different cognitive processes to be disclosed and measured. Second, by replacing the traditional "boring" testing procedure with a "fun" game in a virtual environment, we may be able to overcome the notoriously low motivation and lack of concentration exhibited by schizophrenic patients.
This experiment measured aspects of sensory integration within working memory, known to be deficient among schizophrenic patients (1). The main experiment involved a computer game requiring navigation in a virtual maze with "challenge" and "delay" rooms. Each challenge room had three doors, only one of which was the correct choice, while each delay room had a single door. The goal of the game was to reach the end of the maze as fast as possible, and the end was reached only after all the correct doors were opened.
Each door in a challenge room was associated with up to three distinct features—shape (triangle, square, or circle), color (red, green, or blue), and sound (three different sounds) (F1). The sound was played when the subject examined the door. At each point in time, there was a certain door-opening rule, which determined which door should be used to exit a challenge room. For example, the rule might say that only red doors should be opened, in which case any red door, regardless of its shape or sound, could be used. There was always a single such door in each challenge room. The subject had to figure out the correct rule and open only the appropriate door (with the correct combination) in each challenge room. The rule randomly changed after 4–6 correct choices.
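The door-opening rule described above can be sketched as a simple feature match. This is only an illustration: the `Door` class, the `matches` and `correct_doors` helpers, and the sound labels `s1` to `s3` are hypothetical names, not part of the original software.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Door:
    # A door may carry up to three features; None means the feature is absent.
    shape: Optional[str] = None
    color: Optional[str] = None
    sound: Optional[str] = None


def matches(door: Door, rule: Dict[str, str]) -> bool:
    """True if the door has every feature value that the rule requires."""
    return all(getattr(door, feature) == value for feature, value in rule.items())


def correct_doors(room: List[Door], rule: Dict[str, str]) -> List[int]:
    """Indices of the doors in a room that satisfy the rule."""
    return [i for i, door in enumerate(room) if matches(door, rule)]


# Hypothetical challenge room; the sound labels are placeholders.
room = [
    Door(shape="triangle", color="red", sound="s1"),
    Door(shape="square", color="green", sound="s2"),
    Door(shape="circle", color="blue", sound="s3"),
]

print(correct_doors(room, {"color": "red"}))                   # [0]
print(correct_doors(room, {"shape": "square", "sound": "s2"}))  # [1]
```

A one-feature rule ignores the remaining features, which then act as distractors; a two-feature rule requires both values on the same door.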
We created four experimental conditions by manipulating two factors: the number of features that defined the door-opening rule (one or two) and the presence or absence of a distractor feature on the doors (a feature that was not used in the rule) (see top two images in F1). The rule changed over time as indicated by a visual cue. When the correct door was chosen, the subject received a reward (cigarette or chocolate icon) and got encouragement (dancing figure with clapping hands) (lower right image in F1).
Between challenge rooms, the subject passed through a few delay rooms, each of which had only one door. The door in a delay room was also associated with a colored shape and sound, consistently different from those used on doors in challenge rooms (lower left image in F1). The delay rooms masked the target stimulus and imposed an active load on working memory, as the subjects needed to remember the correct rule during navigation. We manipulated the number of delay rooms to achieve a constant 20-second delay between successive challenge rooms.
The maze design was inspired by the Wisconsin Card Sorting Test (2), in which the subject needs to sort a deck of cards into four piles. At any moment the sorting should be done according to one feature (out of three), which changes after 10 consecutive correct placements. In a similar manner, each room in our maze had three doors characterized by two visual features and one auditory feature (instead of three visual features in the Wisconsin Card Sorting Test). While in the Wisconsin Card Sorting Test only one out of the three features displayed is important at any moment, we controlled both the number of features that defined the door-opening rule (one or two) and the number of features displayed (one, two, or three). There were two additional differences: 1) how the rule was defined—in the maze, the subjects needed to remember feature values (e.g., category values such as red rectangle), while in the Wisconsin Card Sorting Test the task requires the subject to remember a category, and 2) explanation—our subjects received detailed explanations of the task, followed by a training session, while no explanation is offered in the standard Wisconsin Card Sorting Test.
The participants were 39 schizophrenic patients and 21 healthy comparison subjects matched by gender (male), age, and education level. The subjects’ mean age was 32.3 years (SD=7.9), and the mean number of years of education was 10.6 (SD=2.6). The patients were diagnosed according to DSM-IV criteria and were rated for symptom severity with the Positive and Negative Syndrome Scale (PANSS) (3) during an interview by a clinical psychiatrist (A.P.). Schizophrenic patients with a history of neurological disorders, comorbidity, or drug abuse were excluded from the study. The patients were medicated with therapeutic doses of risperidone and olanzapine. Five patients also received long-acting medications (three patients received haloperidol decanoate, and two patients received long-acting fluphenazine). In all, the patients received a mean daily dose equivalent to 414 mg of chlorpromazine. All subjects volunteered and received payment. After complete description of the study to the subjects, written informed consent was obtained. The study was approved by the internal review board of Sha’ar Menashe Mental Health Center and the Israeli Ministry of Health, in accordance with the Helsinki Declaration.
The experiment included a training phase intended to bring all subjects up to their best level of performance, followed by the actual game. Training consisted of three stages. First, the subjects learned how to open correct doors without movement; during this stage the subjects experienced all types of door-opening rules. Second, the subjects learned how to navigate in the maze at the desired speed. Finally, they practiced in a game-like session, with emphasis on achieving the fewest errors (rather than speed). During training the experimenter (A.S.) intervened when three or more consecutive errors occurred, in which case the subject was reminded of the goals of the task, was encouraged to verbalize his strategy, and received compliments on correct choices.
The duration of the sessions varied among subjects, since a session ended only after a fixed number of correct doors had been chosen. Upon any incorrect door choice, the subject was presented with another challenge room with the same set of doors, shifted in position. Thus, the session duration was positively correlated with the number of errors. In general, the patients took roughly twice as long as the comparison subjects to complete the training (58.6 and 28.6 minutes, respectively), while the durations of the test sessions were more similar (31.7 and 26.4 minutes, respectively). This difference was reflected in the set of measurements defining a subject’s profile.
A sense of reality was obtained with three-dimensional glasses, a head tracker, and a joystick. The subjects used the joystick to navigate and to open doors. The navigation button enabled movement in four directions: forward, backward, left, and right. A change in the direction of movement could also be made by turning the head.
We collected 26 measurements for each subject based on a variety of continuous physical measures. These included error scores and response time, the position and direction of gaze at any time, and the rate of improvement with time. The 26 measurements defined the subject’s performance profile and can be divided into three groups: working memory and integration, navigation and strategy, and learning.
The variables reflecting working memory and integration included various error scores, a perseveration measure, and the effect of distractors. In calculating error scores we differentiated 1) errors made while the subject was learning the rule (after the rule changed), 2) errors made during use of the rule, and 3) the number of consecutive errors. Perseveration errors occurred in all of these error categories and included any repeated selection of a previous incorrect choice and any erroneous choice that was consistent with a previous door-opening rule that had already changed. Perseveration was measured as the ratio between the number of perseveration errors and the total number of errors. The distractor effect was calculated as the error rate when the distractor was present minus the error rate when the distractor was absent.
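The two derived measures defined in this paragraph reduce to simple arithmetic. The sketch below uses hypothetical function names, and the handling of a subject with zero errors (ratio set to 0) is an assumption, since the text does not say how that case was treated.

```python
def perseveration_ratio(n_perseverative: int, n_errors: int) -> float:
    """Perseverative errors divided by all errors."""
    if n_errors == 0:
        return 0.0  # assumption: no errors means no perseveration
    return n_perseverative / n_errors


def error_rate(n_errors: int, n_choices: int) -> float:
    """Fraction of door choices that were wrong."""
    return n_errors / n_choices


def distractor_effect(rate_with: float, rate_without: float) -> float:
    """Error rate with the distractor present minus the rate with it absent."""
    return rate_with - rate_without


print(perseveration_ratio(3, 12))                                # 0.25
print(distractor_effect(error_rate(10, 20), error_rate(5, 20)))  # 0.25
```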
The measurements of navigation and strategy included response time, navigation profile, and strategy. The navigation profile included a measure combining navigation speed with the number of collisions with walls and a histogram of the subject’s movements (forward, backward, or rotation). Decision strategy was measured by the number of doors inspected in each room and the time spent looking at each door. To assess the subject’s selection strategy, we compared the histogram of the locations of all selected doors with the histogram of the locations of correct doors.
The measurements of learning included the rate of improvement over time in the variables reflecting working memory and integration, in response time, and in navigation speed.
All the data were normalized so that within the comparison group the values for each variable were distributed with a mean value of 0 and a standard deviation of 1. A subject was noted to differ from the expected (normal) value for a given variable if the absolute value of his normalized score exceeded 2.
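A minimal sketch of this normalization follows. It assumes the sample standard deviation (`statistics.stdev`) of the comparison group is used; the text does not say whether the sample or population estimate was applied, and the numbers are invented for illustration.

```python
from statistics import mean, stdev


def normalize(values, comparison_values):
    """Express values in units of the comparison group's mean and SD."""
    mu = mean(comparison_values)
    sigma = stdev(comparison_values)  # assumption: sample SD
    return [(v - mu) / sigma for v in values]


def deviates(z: float, cutoff: float = 2.0) -> bool:
    """True if a normalized score lies more than `cutoff` SDs from 0."""
    return abs(z) > cutoff


# Invented response times (seconds) for five comparison subjects.
comparison = [10, 12, 14, 16, 18]

z_slow, z_typical = normalize([24, 15], comparison)
print(deviates(z_slow), deviates(z_typical))  # True False
```

By construction, the comparison group itself has mean 0 and standard deviation 1 after this transformation.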
Highlights of performance profile
In general, the patients differed from the comparison subjects on most of the measured variables, while individually each patient differed on a unique subset of variables. Specifically, the patients exhibited higher rates of errors on most measurements of working memory and integration. The patients were significantly slower than the comparison subjects, as expressed by worse values on the navigation and strategy measurements. Finally, the patients improved more than the comparison subjects, as manifested in some learning measurements. However, no single variable differentiated between the patients and the comparison group. On any given variable, some patients differed substantially, while others performed like the comparison subjects, resulting in a large variance in all of the measurements. F2 summarizes the distributions of the comparison and patient groups on a number of variables.
The biggest difference between the patients and comparison subjects (involving more than half of the patients) was manifested in a higher error rate when the rule was being used (F2), more consecutive errors (F2), and large head rotations (data not shown). The patients’ higher error rate during use of the rule was maintained throughout both the training and experimental sessions. Some patients, however, showed a large improvement during the training stage. In addition, a noticeable number of patients showed one or more of the following deficits: less ability to ignore irrelevant information (distractor effect), higher error rate during learning of the rule, longer response time, and poorer selection strategy (F2). Overall, the patients were significantly slower than the comparison subjects, as manifested in response time, speed, and time spent looking at doors. However, they also showed a much bigger improvement than the comparison subjects in response time and navigation speed. Finally, there was no marked difference between the groups in decision strategy (F2), movement profile (data not shown), and perseveration (F2).
To illustrate the large variance among the patients, several examples of individual performance plots are shown in F3. We can see that patient 1 performed well within the range of the comparison subjects on all but two measurements, while patients 2, 3, and 4 deviated in a wide range of variables, each displaying his own unique profile. Patient 2, for example, had difficulties on variables concentrated in the upper right corner, most of which are measurements of working memory and integration. Patient 3 showed scattered deviations in all groups of measurements, while patient 4 differed mostly on navigation and strategy variables. Note that patient 5 performed like the comparison subjects on all measurements.
The patient group demonstrated somewhat less ability to ignore irrelevant information. Accordingly, in the distractor conditions they showed higher error rates when using the door-opening rule. The distractor effect varied greatly, with some patients exhibiting a distractor effect only when the rule specified just one feature, some only when the rule specified two features, and some when the rule specified either one or two features. When the distractor was absent, some patients made many errors, while others performed like the comparison subjects. This measure—the number of errors when the distractor was absent—reflects only the errors made after the subject had learned the rule, and therefore it mostly reflects impaired working memory rather than inference ability.
On the basis of these two measures, i.e., the distractor effect and the number of errors when the distractor was absent, the patients could be divided into four subgroups. F2 shows that working memory impairment and the distractor effect exhibited double dissociation in the schizophrenic patients. Some patients had impairment only in working memory, and some patients had impairment only in the presence of a distractor.
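The four-way split can be sketched as follows, using the criterion given in the figure note (a difference of more than 2.5 standard deviations from the comparison mean, with the distractor effect taken as the maximum over the sound rule and the sound-and-shape rule). The function names and subgroup labels are hypothetical, and treating "differing" as an absolute difference is an assumption.

```python
def max_distractor_effect(effect_sound_rule: float,
                          effect_sound_shape_rule: float) -> float:
    """Distractor effect taken as the maximum over the two rule types."""
    return max(effect_sound_rule, effect_sound_shape_rule)


def subgroup(z_min_error: float, z_distractor: float, cutoff: float = 2.5) -> str:
    """Assign one of four subgroups from two normalized scores."""
    wm_impaired = abs(z_min_error) > cutoff          # errors without distractor
    distractor_impaired = abs(z_distractor) > cutoff
    if wm_impaired and distractor_impaired:
        return "both"
    if wm_impaired:
        return "working memory only"
    if distractor_impaired:
        return "distractor only"
    return "neither"


print(subgroup(3.1, 0.4))  # working memory only
print(subgroup(0.2, 4.0))  # distractor only
```

The two "only" groups are what make the dissociation double: each deficit occurs in some patients without the other.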
We designed a classification routine based on the performance profiles. First, we estimated the distribution of performance profiles from the comparison group only. To keep the estimate tractable, we made two assumptions that are known to be false: that the variables were independent and that each variable had normally distributed values. We then estimated the probability of each subject’s performance profile under the estimated distribution. Finally, we fixed a threshold to best discriminate between the comparison subjects and the patients in a leave-one-out paradigm. Specifically, we fixed a probability value that best separated the comparison and patient groups, using 38 out of the 39 patients; we then checked the prediction regarding the remaining patient. The sensitivity of this procedure was 0.85, with 33 out of 39 patients being predicted correctly. (Canonical variate analysis correctly classified 31 patients, for a sensitivity of 0.79. Multivariate analysis of variance indicated that the comparison and patient groups differed significantly with p=0.00002.)
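The routine can be sketched as below. This is a rough illustration under stated assumptions, not the authors' implementation: the criterion for the "best" threshold is taken to be the number of correctly labeled subjects, the function names are hypothetical, and the two-variable toy profiles are invented.

```python
import math
from statistics import mean, stdev


def fit_independent_gaussians(profiles):
    """Per-variable (mean, SD), estimated from comparison profiles only."""
    return [(mean(col), stdev(col)) for col in zip(*profiles)]


def log_likelihood(profile, params):
    """Log-probability density of a profile under independent normals."""
    total = 0.0
    for x, (mu, sigma) in zip(profile, params):
        z = (x - mu) / sigma
        total += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)
    return total


def best_threshold(comparison_scores, patient_scores):
    """Threshold maximizing correct labels (score below threshold = patient)."""
    candidates = sorted(set(comparison_scores) | set(patient_scores))
    candidates.append(candidates[-1] + 1.0)  # option: label everyone a patient

    def n_correct(t):
        return (sum(s >= t for s in comparison_scores)
                + sum(s < t for s in patient_scores))

    return max(candidates, key=n_correct)


def leave_one_patient_out_sensitivity(comparison, patients):
    params = fit_independent_gaussians(comparison)
    comp_scores = [log_likelihood(p, params) for p in comparison]
    pat_scores = [log_likelihood(p, params) for p in patients]
    hits = 0
    for i, held_out in enumerate(pat_scores):
        rest = pat_scores[:i] + pat_scores[i + 1:]
        threshold = best_threshold(comp_scores, rest)
        hits += held_out < threshold  # held-out patient labeled as patient?
    return hits / len(patients)


# Invented two-variable profiles; the third patient looks like a comparison subject.
comparison = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (0.1, 0.2), (-0.2, -0.2)]
patients = [(3.0, 2.5), (2.0, -3.0), (0.05, 0.0), (4.0, 4.0)]

print(leave_one_patient_out_sensitivity(comparison, patients))  # 0.75
```

In the toy data the one misclassified patient is the one whose profile falls inside the comparison range.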
In the preceding procedure we used all 26 measurements defining the performance profiles. However, with only 21 comparison subjects as data points, there is a high risk of overfitting the distribution of the comparison group. We therefore looked for a minimal subset of features that would give the same classification accuracy. We applied the same procedure to all subsets of two to six features. The minimal subset of features that achieved the same accuracy contained four measures: distractor effect (sound and shape rule), error rate when the rule was used during training, consecutive error rate, and response time. This set of four features achieved the same sensitivity as the leave-one-out classification with the full profile, 0.85.
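An exhaustive search of this kind can be sketched with `itertools.combinations`. The `score` callback stands in for the leave-one-out classification accuracy of a feature subset; the toy scoring function and its feature indices are purely hypothetical.

```python
from itertools import combinations
from math import comb


def smallest_sufficient_subset(n_features, score, target, sizes=range(2, 7)):
    """First smallest feature subset whose score reaches target, else None."""
    for k in sizes:  # try smaller subsets first
        for subset in combinations(range(n_features), k):
            if score(subset) >= target:
                return subset
    return None


# Number of subsets of 2 to 6 features drawn from 26 measurements.
n_candidates = sum(comb(26, k) for k in range(2, 7))
print(n_candidates)  # 313885


# Hypothetical stand-in: only subsets containing these four indices reach 0.85.
def toy_score(subset):
    return 0.85 if {3, 7, 11, 20} <= set(subset) else 0.60


print(smallest_sufficient_subset(26, toy_score, 0.85))  # (3, 7, 11, 20)
```

Because subset sizes are tried in increasing order, the first hit is also a minimal one; ties among equally small subsets are broken by enumeration order.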
Finally, we tested the estimation procedure using a similar leave-one-out approach. Specifically, we estimated the distribution of the comparison group based on 20 of the 21 subjects, fixed the threshold on the basis of the same 20 comparison subjects and all of the patients, and checked the prediction regarding the missing comparison subject. As expected from the preceding counting argument, the reduced set of four features was more robust than the full set of 26 measurements, with 100% correct classification of the comparison group (specificity, 1.00). With 26 measurements, only 86% (18 out of 21) of the comparison subjects were classified correctly.
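The reported rates follow directly from the counts given in the text, as this short check shows (the helper name is hypothetical).

```python
def fraction_correct(n_correct: int, n_total: int) -> float:
    """Proportion of subjects in a group that were classified correctly."""
    return n_correct / n_total


# Sensitivity: patients correctly classified, out of 39.
print(round(fraction_correct(33, 39), 2))  # 0.85 (performance profiles)
print(round(fraction_correct(31, 39), 2))  # 0.79 (canonical variate analysis)

# Specificity: comparison subjects correctly classified, out of 21.
print(round(fraction_correct(21, 21), 2))  # 1.0  (four features)
print(round(fraction_correct(18, 21), 2))  # 0.86 (all 26 measurements)
```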
Correlations with symptoms
To study the correlation between our measurements and the subjects’ PANSS scores, we assigned "absent" to the comparison subjects on all symptoms and normalized the PANSS scores to a 0–5 range: 0=absent or minimal, 1=mild, 2=moderate, 3 or 4=severe, 5=extreme. The analysis revealed a number of significant correlations (Spearman’s r≥0.4, t≥3.32, df=58, p<0.01): 1) the error rate during use of the rule (after the rule was learned) was significantly correlated with five positive and four negative symptoms, 2) the consecutive error rate was correlated with six negative symptoms, 3) the distractor effect for the sound-based door-opening rule was correlated with six positive symptoms, while the distractor effect for the sound-and-shape rule was correlated with only one positive symptom (conceptual disorganization), 4) longer response time was correlated with six negative symptoms, and 5) poor selection strategy was correlated with six positive symptoms. None of the variables showed any significant correlation with age. When using canonical correlation analysis to measure correlations between mixtures of variables, we found two significant correlations including the same group of highly correlated measures and symptoms.
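The reported bounds on r and t are mutually consistent under the standard conversion of a correlation coefficient to a t statistic, t = r·sqrt(df / (1 − r²)). Whether the authors used exactly this conversion is an assumption, but it reproduces the stated values for df=58 (60 subjects).

```python
import math


def t_from_correlation(r: float, df: int) -> float:
    """t statistic for a correlation coefficient r with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


print(round(t_from_correlation(0.4, 58), 2))  # 3.32
```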
Perseveration is a common indicator of schizophrenia (4). However, our approach to measuring perseveration differed in two ways from the classical procedure such as used in the Wisconsin Card Sorting Test. First, we measured perseveration by a ratio—the number of perseverative errors divided by the total number of errors. This is because when the number of total errors is high, the number of perseverative errors is expected to be high as well, irrespective of the source of error. Indeed, the numbers of total and perseverative errors showed a high correlation (rs=0.87). Second, in our experiment the subjects received a detailed explanation of the task, in addition to extensive training. This difference might explain the discrepancy between our results and those in the relevant literature. To test this hypothesis, we designed an additional experiment, described in the next section.
We also noted an interesting dissociation between the patients’ ability to learn a new rule and their ability to recover from a mistake. While 23 patients showed high rates of consecutive errors, only 15 patients showed high error rates when they were learning a new rule.
This experiment was designed to investigate the underlying reason for the absence of perseveration in experiment 1 and, specifically, the relation between task understanding and perseveration. We tried to replicate the standard Wisconsin Card Sorting Test as closely as possible in our virtual maze. The experiment was conducted in the same virtual maze used in experiment 1, with the rule defined by just one feature, color, and without a distractor. The goal of the game was to find the correct door by which to exit each room. Two experimental conditions were compared: 1) when the subjects were not told what defines a correct door and 2) when the subjects were told that the correct door is defined by color and that the correct color may change. If perseveration indeed results from the subject’s inability to adapt his behavior to change, the number of perseverative errors should be the same in both conditions.
The participants in this experiment were 21 schizophrenic patients and 19 comparison subjects. They are described in the methods section for experiment 1 (most of them later participated in experiment 1). The subjects received some initial navigation training. The game ended after the opening of 50 doors. The correct color changed after 10 consecutive correct choices.
We collected the same standard measurements as used in the Wisconsin Card Sorting Test, among them the number of colors completed, maintaining set (the number of consecutive correct choices), perseveration to previous color (the number of times that the previous correct color is chosen after the rule changes), perseveration to previous incorrect color (the number of times that some incorrect choice is repeated), the number of steps to learn the first color, the number of steps to find the correct color after the first change, and the average number of steps to learn a new color after a change. In addition, the navigation and strategy variables used in experiment 1 were measured. The data were analyzed by two-way analysis of variance for the effect of two factors: explanation and group type (patient or comparison).
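The two perseveration counts can be sketched as follows. The text leaves the exact counting rules open, so this sketch makes two labeled assumptions: a choice of the previously correct color counts as perseveration to the previous color on any error after a rule change, and a "repeated incorrect choice" means the same wrong color chosen on consecutive errors. The function name and trial format are hypothetical.

```python
def count_perseverations(trials):
    """trials: ordered list of (chosen_color, correct_color) pairs.

    Returns (perseverations to previous color, repeated incorrect choices).
    """
    to_previous_color = 0
    repeated_incorrect = 0
    previous_rule = None      # color that was correct before the latest change
    last_correct = None       # correct color on the preceding trial
    last_wrong_choice = None  # wrong color chosen on the preceding trial, if any

    for chosen, correct in trials:
        if last_correct is not None and correct != last_correct:
            previous_rule = last_correct  # the rule has just changed
            last_wrong_choice = None
        if chosen != correct:
            if chosen == previous_rule:
                to_previous_color += 1
            elif chosen == last_wrong_choice:
                repeated_incorrect += 1
            last_wrong_choice = chosen
        else:
            last_wrong_choice = None
        last_correct = correct
    return to_previous_color, repeated_incorrect


trials = [
    ("red", "red"), ("red", "red"),      # rule: red
    ("red", "green"), ("red", "green"),  # rule changed; old color chosen twice
    ("blue", "green"), ("blue", "green"),  # same wrong color repeated
    ("green", "green"),
]
print(count_perseverations(trials))  # (2, 1)
```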
In the patient group, the scores on a number of variables were affected by explanation (F4). On the other hand, the comparison group performed equally well in the two experimental conditions on all of the measurements. In addition, the patients differed from the comparison subjects on navigation and strategy variables regardless of the experimental condition, in a way similar to that in experiment 1.
Only patients who did not receive the explanation exhibited high perseveration rates of both kinds (F4). The explanation effect in the patient group was also manifested in 1) the number of colors learned, 2) maintaining set (number of consecutive correct choices), 3) the number of steps to learn a new door-opening rule after the initial change, and 4) the average number of steps to learn a new rule after a change. The first three effects are shown in F4. In addition, the patient group had a longer response time (F4) and slower navigation speed, made more collisions, and inspected more doors in the challenge rooms before making a decision.
We used virtual reality technology to design a complex environment for the study of schizophrenia. The technology made it possible to collect multiple measurements during a complex behavior, including multimodal interactions that pose a high load on working memory. In addition, the technology allowed us to conduct the experiment as a game and engage the patients in the task, which improved the subjects’ concentration and motivation.
The most important finding of this study is that schizophrenic patients can be reliably separated from comparison subjects on the basis of the profile of their performance in the virtual maze. The classification procedure correctly predicted 33 of the 39 patients (85%) and all of the comparison subjects, using performance profiles consisting of four measures: distractor effect, error rate during use of the rule during training, consecutive error rate, and response time. A closer look at the performance profiles of the misclassified patients revealed that they fell in the normal range on almost all the variables studied.
This experiment concentrated on working memory, which is not the only known deficiency of schizophrenic patients. Thus, any credible diagnostic routine should evaluate a wider spectrum of cognitive functions. On the positive side, some of the measured variables showed significant correlations with standard measures of schizophrenia (based on personal interviews), leading us to hope that similar tests may be able to replace subjective interviews in future diagnosis of the disease. Our measures were not able to divide the schizophrenic patients into clear subgroups, attesting to the complexity of the syndrome and the need to increase the size of the tested study group. Also, the routine should be evaluated with additional comparison groups consisting of patients with different mental disorders.
Our results indicate the need to clarify the notion of perseveration. In our main experiment, the patients did not differ from the comparison group on the perseveration measure. In a second experiment, we found high numbers of perseveration errors only when the patients did not receive an explanation of the task. This deficiency was correlated with other measures related to task understanding. This finding implies that perseveration as measured in the standard Wisconsin Card Sorting Test may indicate a deficiency in problem solving, rather than the patients’ inability to adjust to changes (as is usually understood). It is consistent with other reports that schizophrenic patients can dramatically improve after training on different tasks (5, 6).
Viewing schizophrenia as a disturbance in integration creates a unified framework (7–9) in which a unique disintegration profile can be created for each subject, reflecting his or her specific cognitive dysfunctions. We believe that our results provide the first step in creating disintegration profiles of schizophrenia. Next we plan to study integration processes on higher levels. Our hope is that such disintegration profiles not only will distinguish schizophrenic patients from healthy subjects but also will reveal subdivisions within the patient population that will result in better diagnostics and treatment.
Received Feb. 2, 2005; revision received April 18, 2005; accepted April 25, 2005. From the Interdisciplinary Center for Neural Computation and the School of Computer Science and Engineering, Hebrew University of Jerusalem; and the Rehabilitation Department, Sha’ar Menashe Mental Health Center, Mobile post Hefer, Israel. Address correspondence and reprint requests to Ms. Sorkin, Interdisciplinary Center for Neural Computation, P.O. Box 1255, Hebrew University of Jerusalem, Jerusalem 91904, Israel; firstname.lastname@example.org or email@example.com (e-mail). The authors thank the staff of the Hesed and Emuna hostel in Jerusalem and its director Hannah Rosenthal for their help and encouragement.
Virtual Maze Used to Study Sensory Integration Within Working Memory (a)
(a) Each door in a challenge room was associated with up to three distinct features: shape, color, and sound. At each point in time, there was a certain door-opening rule, which determined which door should be used to exit. For example, the rule might say that only red doors should be opened, in which case any red door, regardless of its shape or sound, could be used. There was always a single such door. The subject had to figure out the correct rule and open only the appropriate door. The door-opening rule randomly changed after 4–6 correct choices. In this example, the rule is defined by sound and shape, and color serves as a distracting feature.
(b) Between challenge rooms, the subject passed through a few delay rooms, each of which had only one door. This door was also associated with a color, shape, and sound, which were consistently different from those used on doors in challenge rooms. The delay rooms masked the target stimulus and imposed an active load on working memory, as the subject needed to remember the correct rule during navigation.
Performance of Schizophrenic Patients on Virtual Reality Task (a) Measuring Sensory Integration Within Working Memory, in Relation to Performance of Healthy Comparison Subjects (b)
(a) The subject navigated through a virtual maze. Each room was associated with up to three distinct features: shape, color, and sound. At each point in time, there was a certain door-opening rule that determined which door should be used to exit. For example, the rule may have said that only red doors should be opened, in which case any red door, regardless of its shape or sound, could be used. There was always a single such door. A feature not used in the rule was considered a distractor. The subject had to figure out the correct rule and open only the appropriate door. The door-opening rule randomly changed after 4–6 correct choices.
(b) All the data were normalized so that within the comparison group each variable was distributed with a mean value of 0 and a standard deviation of 1. Thus, the scores of the comparison subjects were concentrated between –1 and 1. In contrast, the patients’ scores show a much wider distribution.
(c) Significantly different from the rate for the comparison subjects (F=65.7, df=1, 38, p<0.001).
(d) Significantly different from the rate for the comparison subjects (F=43.9, df=1, 31, p<0.001).
(e) The minimal error rate is based on all conditions in which the distractor was absent. The distractor effect (increase in error rate) was taken as the maximum over two conditions, one when the rule was based on sound and the other when the rule was based on sound and shape. Any subject differing by more than 2.5 standard deviations from the mean value of the comparison subjects was considered impaired on the relevant measure.
Polar Coordinates Profiling Performance of Five Schizophrenic Patients on Virtual Reality Task (a) Measuring Sensory Integration Within Working Memory, in Relation to Performance of Healthy Comparison Subjects (b)
(a) The subject navigated through a virtual maze. Each room was associated with up to three distinct features: shape, color, and sound. At each point in time, there was a certain door-opening rule that determined which door should be used to exit. For example, the rule may have said that only red doors should be opened, in which case any red door, regardless of its shape or sound, could be used. There was always a single such door. The subject had to figure out the correct rule and open only the appropriate door. The door-opening rule randomly changed after 4–6 correct choices. The 26 measurements made were divided into three categories: working memory and integration, navigation and strategy, and learning.
(b) Each variable corresponds to a certain angle φ, and the radius r reflects the subject’s measurement value in the normalized scale for that variable. Thus, a subject’s profile corresponds to a closed curve through 26 pairs of (r, φ) coordinates. The scores were normalized as follows: 0=less than one standard deviation from the mean for the comparison subjects, 1=less than two standard deviations from the mean, 2=less than three standard deviations, 3=less than five standard deviations, 4=less than eight standard deviations, 5=more than eight standard deviations. The performance profiles of the comparison subjects concentrate by definition in the area r≤2.
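The radial scale described in this note can be sketched as a binning of normalized scores. The equal angular spacing of the variables and the treatment of scores that fall exactly on a bin edge (assigned to the higher bin) are assumptions; the function names are hypothetical.

```python
import math
from bisect import bisect_right

# Bin edges in comparison-group standard deviations, taken from the note.
BIN_EDGES = [1, 2, 3, 5, 8]


def radial_score(z: float) -> int:
    """Map a normalized score to the 0-5 radial scale."""
    return bisect_right(BIN_EDGES, abs(z))


def polar_profile(z_scores):
    """(r, angle) pairs, one per variable, spaced evenly around the circle."""
    n = len(z_scores)
    return [(radial_score(z), 2 * math.pi * k / n) for k, z in enumerate(z_scores)]


print([radial_score(z) for z in (0.5, -1.5, 2.9, 4.0, 7.9, 12.0)])
# [0, 1, 2, 3, 4, 5]
```

Joining the 26 points of a `polar_profile` in order, and returning to the first, gives the closed curve that represents one subject.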
Effect of Explanation on Performance of Schizophrenic Patients and Healthy Comparison Subjects on Modified Virtual Reality Task (a) Measuring Sensory Integration Within Working Memory (b)
(a) The subject navigated through a virtual maze. In each room were three doors of different colors. At each point in time, there was a certain door-opening rule that specified the color of the door that should be used to exit. The subject had to figure out the correct color and open only the appropriate door. The correct color changed after 10 consecutive correct choices, and the game ended after the opening of 50 doors.
(b) The explanation consisted of telling the subjects that the correct door was defined by color.
(c) Significant difference between conditions (F=18.75, df=1, 19, p=0.0004).
(d) Significant difference between conditions (F=6.40, df=1, 17, p=0.03).
(e) Significant difference between conditions (F=20.89, df=1, 19, p=0.0002).
(f) Significant difference between conditions (F=17.77, df=1, 38, p=0.0001).