
Ventral Striatum Response During Reward and Punishment Reversal Learning in Unmedicated Major Depressive Disorder

Abstract

Objective:

Affective biases may underlie many of the key symptoms of major depressive disorder, from anhedonia to altered cognitive performance. Understanding the cause of these biases is therefore critical in the quest for improved treatments. Depression is associated, for example, with a negative affective bias in reversal learning. However, despite the fact that reversal learning is associated with striatal response in healthy individuals and depressed individuals exhibit attenuated striatal function on multiple tasks, studies to date have not demonstrated striatal involvement in the negative bias in reversal learning in depression. In this study, the authors sought to determine whether this may be because reversal learning tasks conventionally used to study behavior examine reversals only on the basis of unexpected punishment and therefore do not adequately separate reward- and punishment-based behavior.

Method:

The authors used functional MRI to compare the hemodynamic response to a reversal learning task with mixed reward- and punishment-based reversal stages between individuals with unmedicated major depressive disorder (N=13) and healthy comparison subjects (N=14).

Results:

Impaired reward (but not punishment) reversal accuracy was found alongside attenuated anteroventral striatal response to unexpected reward in depression.

Conclusions:

Attenuated neurophysiological response of the anteroventral striatum may reflect dysfunction in circuits involving afferent projections from the orbitofrontal, limbic, and/or mesostriatal dopaminergic pathways, which conceivably may, together with the ventral striatum, underlie anhedonia in depression. Learning to appreciate and enjoy positive life experiences is critical for recovery from depression. This study pinpoints a neural target for such recovery.

Depression is associated with varied symptoms, from mood changes to cognitive impairment. A large proportion of these symptoms may be driven at least in part by abnormal responses to affective stimuli (1). Specifically, depression is associated with a strong “negative” bias: enhanced sensitivity to negative (punishing) stimuli and a behavioral neglect of positive (rewarding) stimuli (2). This affective bias, which is manifested across many facets of learning, memory, and cognition, putatively serves both to instigate and to uphold the debilitating negative and anhedonic mood state (3, 4). A clearer understanding of the neural basis of affective bias in depression will thus lead to a clearer understanding of the overall pathology.

In this study, we focused on affective biases seen in flexible learning in depression. Adaptive behavior in our daily life, where the consequences of our actions are often uncertain and variable, requires individuals to frequently and flexibly update their behavior. The experimental model most often used to examine such flexible behavior is the probabilistic reversal learning paradigm. In this paradigm, subjects learn by trial and error to choose the most rewarding stimulus and then subsequently reverse their choice when contingencies change and this previously rewarding stimulus is unexpectedly followed by punishment. In this probabilistic task, where around one-fourth of the reward and punishment feedback is misleading, depressed individuals reverse more often than do healthy individuals when they receive misleading negative feedback (5–7). This problem has been interpreted to reflect a negative affective bias and may underlie the tendency of depressed individuals to emphasize negative life experiences at the expense of positive ones.

However, this negative affective bias could be driven by at least two different processes: 1) increased behavioral sensitivity to unexpected punishment in depression (encouraging reversal during misleading negative feedback), and/or 2) reduced behavioral sensitivity to reward in depression (reducing the ability to maintain the correct stimulus-reward association). To elucidate the nature of affective biases in reversal learning, we developed a novel reversal learning paradigm that enabled direct comparison of reversals signaled by unexpected reward with reversals signaled by unexpected punishment (8–11). In this task, subjects do not directly choose the rewarded or punished stimulus but rather predict the outcome of stimuli selected by the computer. Unlike the probabilistic tasks, this task is deterministic and subjects are required to reverse their behavior as soon as they receive unexpected outcomes. Critically, our study design involved Pavlovian rather than instrumental conditioning, which allowed the assessment of reversals on the basis of unexpected reward as well as unexpected punishment (8, 11).

Using this task, we previously demonstrated that both punishment and reward reversals rely on overlapping but distinct regions of the striatum (11). This involvement of the striatum is consistent with imaging studies of the classic probabilistic reversal learning task in healthy individuals, in whom increased striatal response precedes behavioral switching (12), and it concurs with the frequently highlighted role of the striatum in dopamine-mediated prediction error learning (13).

Extrapolating from these findings, it is plausible that the behavioral bias in reversal learning seen in depression (5–7) is driven by altered striatal processing. Indeed, attenuated striatal function is seen in the depressive pathology across multiple cognitive tasks, from higher-order planning to gambling (1, 14, 15). However, previous work using the classic probabilistic reversal learning paradigm in depressed individuals has not found significant differences in striatal response during reversal learning (5, 16, 17). Thus, although the striatum is a key region involved in reversal learning in healthy individuals, reversal learning is impaired in depression, and the striatum is involved in the neuropathology of depression (14), studies to date have not demonstrated striatal involvement in the negative bias in reversal learning in depression.

The negative bias in reversal learning in depression therefore might not directly involve the striatum but rather aberrant function in, for example, the orbitofrontal cortex (18–20) or the amygdala (5). Alternatively, however, previous studies may have failed to reveal the contribution of the striatum because they inadequately disentangled the separate reward and punishment components of reversal learning. In this study, we therefore employed our new deterministic reversal learning task to examine differences in the hemodynamic response during separate punishment and reward reversal trials across unmedicated depressed individuals and healthy comparison subjects. We predicted that depressed individuals would demonstrate a negative bias in reversal learning and that this would be associated with a corresponding attenuation in striatal response during reversal trials. However, given the absence of striatal differences across diagnosis in punishment-based probabilistic reversal learning, we predicted that any alteration in striatal response would be restricted to reward-based reversals.

Method

Volunteers (N=27; 15 Caucasian, one Asian, 11 African American; all right-handed) 18–50 years of age underwent screening evaluations that included a medical history, physical examination, laboratory testing, and structural MRI. Psychiatric assessment was conducted using the Structured Clinical Interview for DSM-IV-TR and an unstructured interview with a psychiatrist; 14 volunteers had no psychiatric disorders (healthy comparison subjects), and 13 had major depressive disorder. Exclusion criteria for all participants included psychotropic drug exposure (including nicotine) within the past 3 weeks; major medical or neurological illness; illicit drug use or alcohol abuse within the past year; lifetime history of alcohol or drug dependence; psychiatric disorders other than major depression (excepting comorbid anxiety disorder and a remote history of substance abuse); current pregnancy or breastfeeding; structural brain abnormalities on MRI; and general MRI exclusions. Additional exclusion criteria for comparison subjects were a history of any psychiatric disorder (except a remote history of substance abuse) and a history of any mood disorder in a first-degree relative. After receiving a complete description of the study, participants provided written informed consent as approved by the National Institutes of Health Combined Neuroscience Institutional Review Board.
Participants were group matched for age (healthy comparison group, mean=31 years [SD=6]; depressed group, mean=36 years [SD=11]), gender (eight male participants in each group), years of education (healthy comparison group, mean=17 years [SD=2]; depressed group, mean=16 years [SD=2]), and IQ (healthy comparison group, mean=120 [SD=15]; depressed group, mean=120 [SD=15]). IQ scores were not available for eight participants (five in the depressed group): four because English was not their first language (one in the depressed group), one (in the depressed group) because he vocationally administered IQ testing, and three because they dropped out of the study after scanning but before neuropsychological testing. The mean score on the 21-item Hamilton Depression Rating Scale (HAM-D) (21) was higher in the depressed than in the comparison group (depressed group, mean score=20 [SD=7]; comparison group, mean score=1 [SD=1]; F=95, df=1, 25, p<0.001).

Behavioral and Functional Neuroimaging Measures

Task.

The behavioral task was adapted from a previously developed paradigm (8, 9, 11) and programmed using E-PRIME (Psychological Software Tools, Inc., Pittsburgh).

On each trial, participants were presented with two vertically adjacent stimuli, one scene and one face (location randomized) on a projector viewed by means of a mirror attached to the head coil in the functional MRI (fMRI) scanner. One of these two stimuli was associated with reward and the other with punishment. Participants were required to learn these deterministic stimulus-outcome associations by trial and error. Unlike standard probabilistic reversal paradigms, however, participants were not required to choose between the two stimuli but were instructed to predict whether a stimulus that was highlighted with a black border (randomized from trial to trial) would lead to reward or to punishment (the task contingencies were thus Pavlovian and expected to be processed more specifically in the ventral striatum [22]). They indicated their outcome prediction for the highlighted stimulus by pressing, with the index or middle finger of their dominant (right) hand, one of two buttons (one for reward, one for punishment; response mappings counterbalanced) on a button box placed on their abdomen. They had up to 1,500 msec to provide a response. Once they responded, the outcome was presented for 500 msec in the center of the screen (between the two stimuli). Reward consisted of a green smiley face and punishment a red sad face. If they failed to make a response, “Too late!” was displayed instead of the outcome. After the outcome, the screen showed only a fixation cross for a reaction time-dependent interval, so that the interstimulus interval was jittered modestly between 2,000 and 4,000 msec.

Each experimental block consisted of one acquisition stage and a variable number of reversal stages. The task proceeded from one stage to the next following a specific number of consecutive correct trials as determined by a preset learning criterion. This criterion varied between stages (four, five, or six correct responses) to prevent predictability of reversals. The task also terminated after 10 consecutive incorrect trials in order to avoid scanning blocks in which participants were not performing the task correctly (e.g., because of having forgotten the outcome-response mappings). Reversals of contingencies were signaled to participants either by an unexpected reward presented after the previously punished stimulus was highlighted or by an unexpected punishment presented after the previously rewarded stimulus was highlighted. Unexpected reward and unexpected punishment events were interspersed within blocks. Consistent with previous versions of this task (8, 11), the same stimulus was highlighted after the unexpected outcome and was presented until participants correctly reversed their predictions.
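The stage logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the E-PRIME implementation used in the study: the function name `run_block`, the fixed cycling order of the criteria, and the boolean encoding of responses are all assumptions made for the example.

```python
from itertools import cycle


def run_block(responses_correct, criteria=(4, 5, 6), max_trials=150, max_errors=10):
    """Return (stage_starts, terminated_early) for one block.

    responses_correct: iterable of booleans, one per trial (hypothetical encoding).
    criteria: consecutive-correct counts needed to advance; the order in which
        the study varied them is not stated, so cycling them is an assumption.
    """
    criterion_iter = cycle(criteria)
    criterion = next(criterion_iter)
    correct_run = 0          # consecutive correct trials in the current stage
    error_run = 0            # consecutive incorrect trials
    stage_starts = [0]       # trial index at which each stage begins
    terminated_early = False

    for trial, correct in enumerate(responses_correct):
        if trial >= max_trials:
            break            # block ends after the trial limit
        if correct:
            correct_run += 1
            error_run = 0
        else:
            error_run += 1
            correct_run = 0
        if error_run >= max_errors:
            terminated_early = True   # participant is not performing the task
            break
        if correct_run >= criterion:
            stage_starts.append(trial + 1)   # next trial signals a reversal
            criterion = next(criterion_iter)
            correct_run = 0
    return stage_starts, terminated_early


if __name__ == "__main__":
    print(run_block([True] * 15))
```

Varying the criterion between stages means that a participant cannot anticipate a reversal simply by counting correct trials.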

During the scan session, participants completed six experimental blocks. The average number of reversal stages per experimental block was eight (four signaled by punishment), although the block terminated automatically after completion of 150 trials (7.4 minutes), so that each participant performed 900 trials (six blocks) per experimental session (approximately 90 minutes, including breaks). A 30-second fixation period was also included at the beginning and end of each block to provide a baseline with which to compare blood-oxygen-level-dependent (BOLD) response during trials.

All participants performed a practice block before entering the fMRI scanner to familiarize them with the task. The practice task was identical to the main task except that the stimuli were presented on a laptop computer.

Behavioral analysis.

Reaction times and accuracy rates were assessed in an analysis of variance with reversal (reversal versus nonreversal trials) and valence (reward versus punishment) as within-subject factors and group (depressed versus healthy comparison group) as the between-subjects factor. Trials on which participants failed to make a response were excluded from reaction time analyses, and the rare trials in which participants coincidentally made a nonreversal error on an unexpected outcome trial were excluded from all analyses (as this meant that they accidentally preempted the reversal, making the expectancy of outcome unclear). Accuracy was determined as a proportion of the total number of trials for the type being examined; nonreversal reward errors were divided by the total number of nonreversal reward trials, and punishment reversal errors were divided by the total number of punishment reversals. As the task was deterministic, reversal errors were defined as errors on the trial immediately following the unexpected outcome (9, 11). Partial eta-squared (ηp²) effect sizes are reported for all significant contrasts, and p values are Bonferroni adjusted.
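As an illustration of the proportional error rates described above, the following Python sketch computes one error rate per trial type. The trial records and their field names (`reversal`, `valence`, `correct`) are hypothetical and are not taken from the study's data files.

```python
from collections import defaultdict


def error_rates(trials):
    """Proportional error rate per (stage type, valence) trial type.

    trials: iterable of dicts with hypothetical keys
        'reversal' (bool), 'valence' ('reward' or 'punishment'), 'correct' (bool).
    Excluded trials are assumed to have been removed beforehand.
    """
    errors = defaultdict(int)
    totals = defaultdict(int)
    for trial in trials:
        key = ("reversal" if trial["reversal"] else "nonreversal", trial["valence"])
        totals[key] += 1
        if not trial["correct"]:
            errors[key] += 1
    # Errors of each type divided by the number of trials of that same type.
    return {key: errors[key] / totals[key] for key in totals}


if __name__ == "__main__":
    example = [
        {"reversal": True, "valence": "reward", "correct": False},
        {"reversal": True, "valence": "reward", "correct": True},
        {"reversal": False, "valence": "punishment", "correct": True},
    ]
    print(error_rates(example))
```

Dividing by the count of trials of the same type keeps the rates comparable even though reversal trials are far less frequent than nonreversal trials.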

Functional Neuroimaging

Image acquisition.

A GE Signa HDxt 3-T scanner (GE Healthcare, Milwaukee) was used to acquire structural and functional MR images. The functional sequence comprised six echo-planar imaging sessions of 255 volume acquisitions (flip angle=90°; repetition time=2,000 msec; echo time=30 msec; field-of-view=24×24 cm; slice thickness=3 mm; slice spacing=0.5 mm; matrix=64×64 sagittal slices with array spatial sensitivity encoding technique). The first 10 volumes from each session were discarded to avoid T1 equilibrium effects. The structural sequence comprised a magnetization-prepared rapid gradient echo anatomical reference image (flip angle=60°; repetition time=7,800 msec; echo time=3,000 msec; field of view=22×22 cm; slice thickness=1.2 mm; slice spacing=0 mm; matrix=246×192 for spatial coregistration and normalization).
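As a rough consistency check, the functional run length implied by these parameters can be compared with the block duration given in the task description. The sketch below uses only figures quoted in the text; the assumption that each echo-planar imaging session corresponds to exactly one task block is ours, and the variable names are illustrative.

```python
# Figures quoted in the task and acquisition descriptions.
TR_SECONDS = 2.0            # repetition time of the functional sequence
VOLUMES_PER_SESSION = 255   # volumes acquired per echo-planar session
DISCARDED_VOLUMES = 10      # initial volumes dropped for T1 equilibrium
TRIALS_PER_BLOCK = 150      # trial limit per block
BLOCK_TASK_SECONDS = 444    # 7.4 minutes of task per block
FIXATION_SECONDS = 2 * 30   # 30-second fixation at start and end of block
N_BLOCKS = 6

# Assumption: one imaging session covers one block plus its fixation periods.
session_seconds = VOLUMES_PER_SESSION * TR_SECONDS
block_seconds = BLOCK_TASK_SECONDS + FIXATION_SECONDS
retained_volumes = VOLUMES_PER_SESSION - DISCARDED_VOLUMES
mean_trial_seconds = BLOCK_TASK_SECONDS / TRIALS_PER_BLOCK
total_trials = TRIALS_PER_BLOCK * N_BLOCKS

if __name__ == "__main__":
    print(f"session length: {session_seconds:.0f} s")
    print(f"block plus fixation: {block_seconds} s")
    print(f"volumes retained per session: {retained_volumes}")
    print(f"mean time per trial: {mean_trial_seconds:.2f} s")
    print(f"trials per participant: {total_trials}")
```

Under these assumptions a session lasts 510 seconds against 504 seconds of task and fixation, and the mean time per trial falls inside the 2,000 to 4,000 msec jitter range reported for the interstimulus interval.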

Image analysis.

Images were preprocessed (see the data supplement that accompanies the online edition of this article) and analyzed using SPM8 (Wellcome Department of Cognitive Neurology, London). We estimated a general linear model, for which parameter estimates were generated at the onsets of all expected and unexpected reward and punishment trials (with zero duration), which co-occurred with the response. Consistent with our previous study, an unexpected outcome was the first outcome of a new stage, presented after learning criterion had been obtained (i.e., the outcome signaling contingency reversal), and all other outcomes were coded as expected outcomes, irrespective of task performance (11).
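The coding rule for the four event types can be illustrated with a short Python sketch. It is not the SPM8 model specification used in the study: the function name, the per-trial stage index, and the label strings are hypothetical, and the first trial of the acquisition stage is assumed to count as an expected outcome because no learning criterion precedes it.

```python
def label_outcomes(stages, outcomes):
    """Label each trial's outcome as expected or unexpected.

    stages: stage index for each trial (hypothetical encoding, 0 = acquisition).
    outcomes: 'reward' or 'punishment' for each trial.
    The first outcome after a stage change is the reversal signal and is
    labeled unexpected; every other outcome is labeled expected, regardless
    of whether the participant's prediction was correct.
    """
    labels = []
    previous_stage = None
    for stage, outcome in zip(stages, outcomes):
        is_reversal_signal = previous_stage is not None and stage != previous_stage
        prefix = "unexpected" if is_reversal_signal else "expected"
        labels.append(f"{prefix}_{outcome}")
        previous_stage = stage
    return labels


if __name__ == "__main__":
    print(label_outcomes([0, 0, 1, 1], ["reward", "punishment", "punishment", "reward"]))
```

Each label would then supply the onsets for one zero-duration regressor in the general linear model.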

Because of strong a priori hypotheses regarding the role of the striatum in this task, a region-of-interest analysis was performed by extracting standardized β values from the anatomically defined (23) left and right caudate and putamen using the MarsBar software package (24) for each trial type. In line with our hypotheses, across-group analyses were performed separately for each trial type.

Next, to localize more specifically the peak differences in responses within the striatum and to investigate the extended functional anatomical network of regions that may interact with the striatum during task performance, a whole brain voxel-wise analysis was performed post hoc for each of the four trial types. For this whole brain analysis, a one-sample t test was created for each trial type (unexpected punishment and unexpected reward) with group as a covariate. Clusters were defined using a voxel-level threshold corresponding to an uncorrected p value <0.001 (labels assigned using the automated anatomical labeling toolbox for SPM [23]), and coordinates (Montreal Neurological Institute [MNI] and Talairach) are reported for the peak voxel t value. Family-wise error voxel-level corrected p values are also reported for the peak voxel t values within small-volume-corrected regions of interest.

Results

Behavioral Analysis

Error rates and reaction times are presented in Table 1. There was a significant three-way interaction of valence, reversal (reversal, nonreversal), and group in error rates (F=10.4, df=1, 25, p=0.004; ηp²=0.29), but not for reaction time. This significant three-way interaction was broken down in simple (interaction) effects analyses for reversal and nonreversal trials separately.
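The reported effect size can be recovered from the F statistic and its degrees of freedom using the standard identity for partial eta-squared, F × df(effect) / (F × df(effect) + df(error)). The short Python sketch below applies it to the three-way interaction reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Three-way interaction in error rates: F=10.4, df=1, 25.
    print(round(partial_eta_squared(10.4, 1, 25), 2))
```

For F=10.4 with df=1, 25 this gives 10.4/35.4, which rounds to the reported value of 0.29.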

TABLE 1. Behavioral Results on a Reversal Learning Task in Depressed Individuals and Healthy Comparison Subjects

                               Error Rate^a       Reaction Time (msec)
Group, Stage, and Valence      Mean      SD       Mean      SD
Major depression group
    Reversal
        Reward                 0.382     0.21     596       73
        Punishment             0.345     0.17     640       83
    Nonreversal
        Reward                 0.215     0.06     636       37
        Punishment             0.267     0.11     659       47
Healthy comparison group
    Reversal
        Reward                 0.165     0.17     588       60
        Punishment             0.277     0.15     636       57
    Nonreversal
        Reward                 0.218     0.07     639       46
        Punishment             0.210     0.05     649       35

^a Proportional error rates are reported as a function of trial type.

Reversal Trials

According to our hypothesis, the main outcome of interest was reward-based reversal learning. Depressed participants made more errors than did comparison subjects on reward reversal trials (F=11.7, df=1, 25, p=0.002; ηp²=0.32; Figure 1A) but made equal numbers of punishment reversal errors, driving a significant group-by-valence interaction in error rates (F=5.2, df=1, 25, p=0.032; ηp²=0.17). This difference was seen despite comparable reaction times during reward and punishment reversals across groups. Thus, the depressed participants demonstrated a negative affective bias in reversal learning as a result of reduced behavioral responsiveness to reward but not punishment.

FIGURE 1. Impaired Reward Reversal Learning and Attenuated Right Putamen Response to Unexpected Reward in Depressed Individuals Relative to Healthy Comparison Subjects^a

^a As shown in panel A, accuracy is lower on reward (F=11.7, df=1, 25, p=0.002) but not punishment reversals in depressed individuals relative to healthy individuals. Panel B shows attenuated right (anatomically defined) putamen response during reward reversal trials in depressed individuals relative to healthy individuals (F=10.5, df=1, 25, p=0.003) but equivalent response during punishment reversal. Error bars indicate standard deviations. In panel C, whole brain analysis confirms that the peak neural response difference between depressed and healthy individuals on reward reversal trials was the right anteroventral putamen (peak voxel x=30, y=3, z=–8; image shows SPM t scores ranging from 2.1 to 4.1).

Nonreversal Trials

By contrast, there was no valence specificity on nonreversal trials. Depressed and healthy individuals responded equally well on nonreversal reward and punishment trials.

Image Analyses

Region-of-interest analyses.

Neural effects in each of the four regions of interest during the key reward- and punishment-based reversal trials are summarized in Table 2. The most striking pattern was observed in the right putamen (23), which showed a significant three-way interaction of valence, reversal, and HAM-D score included as a continuous variable (F=3.1, df=13, 13, p=0.026; ηp²=0.76). Accordingly, we emphasize the data from this region.

TABLE 2. Differences During Reward and Punishment Reversal Trials in Anatomically Defined Regions of Interest in Depressed Individuals and Healthy Comparison Subjects

Healthy Comparison Subjects > Depressed Individuals

                               Unexpected Reward           Unexpected Punishment
Anatomically Defined Region    F        df       p         F        df       p
Right putamen                  10.4     1, 25    0.003     0.8      1, 25    0.4
Left putamen                   3.7      1, 25    0.07      0.2      1, 25    0.7
Right caudate                  0.9      1, 25    0.4       0.06     1, 25    0.8
Left caudate                   0.008    1, 25    0.9       3.5      1, 25    0.07

Reversal trials.

The main trials of interest were the reward reversal trials. Significantly decreased right putamen response was observed in depressed individuals during reward (F=10.5, df=1, 25, p=0.003; ηp²=0.30) but not punishment reversals. These results are shown in Figure 1B and correspond with the accuracy results presented in Figure 1A. Thus, as predicted, the negative affective bias in the behavior of depressed individuals was accompanied by attenuation in striatal response during reward reversals.

Nonreversal trials.

There was, by contrast, no valence specificity in neural responses during the nonreversal trials. Putamen response was significantly higher in the healthy comparison group than in the depressed group during both reward (F=7.4, df=1, 25, p=0.01; ηp²=0.23) and punishment (F=7.8, df=1, 25, p=0.01; ηp²=0.24) nonreversal trials.

In the depressed group, HAM-D score did not correlate with the ventral putamen BOLD response to unexpected reward or the reward reversal errors.

Whole Brain Analyses

Consistent with the region-of-interest analysis, a whole brain analysis of regions that were more active in the healthy comparison group relative to the depressed group during unexpected reward revealed increased response in the right anteroventral putamen in healthy relative to depressed individuals (whole brain peak voxel: MNI coordinates, x=30, y=3, z=–8; Talairach coordinates, x=30, y=2, z=–7 [right anteroventral putamen]; uncorrected p<0.001; small-volume corrected region-of-interest, family-wise error corrected p=0.011; Figure 1C and Table 3). A comparable whole brain analysis for unexpected punishment trials failed to reveal any significant difference between the comparison and depression groups.

TABLE 3. Regions More Active at a More Liberal Statistical Threshold in Healthy Versus Depressed Participants During Unexpected Reward Trials^a

                               MNI Coordinates       Talairach Coordinates
Brain Region                   x      y      z       x      y      z        T       K (Cluster Size)
Right ventral putamen          30     3      –8      30     2      –7       4.05    16
Left mid-cingulate cortex      –15    0      38      –15    –1     32       3.75    2
Left mid-occipital cortex      –24    –57    34      –24    –54    34       3.65    2

^a In this analysis, the whole brain uncorrected p was <0.001 at voxel level. Coordinates represent spatial locations of peak voxels on the Montreal Neurological Institute (MNI) and Talairach brain templates. Cluster size refers to the number of significant voxels adjacent to this peak.

Discussion

Consistent with our hypothesis, a negative bias in reversal learning in depression was accompanied by altered reward-related striatal response. Specifically, we found impaired reward (but not punishment) reversal behavior in depression alongside attenuated ventral striatal response to unexpected reward. Thus, we provide a potential neural basis for the negative bias underlying the flexible-learning impairment in depression.

The attenuated reward-related striatal response in major depressive disorder is consistent with results of several recent studies examining reward processing deficits in different aspects of cognition in depression (1, 2, 15, 25, 26). However, this study is the first to demonstrate valence specificity in the striatal response to reward and punishment in depression and the first to demonstrate that striatal attenuation in depression extends beyond the receipt and anticipation of reward (15) to reward-based reversal learning. This blunted behavioral response to reward and not to punishment also provides an alternative explanation for the previously demonstrated impairment in reversal learning in depression (5–7); it may be driven by attenuated reward responses rather than by elevated punishment responses. Previous studies with the probabilistic reversal learning task failed to reveal differences in striatal function while solely examining reversals based on unexpected punishment (5, 16, 17), and (although the interpretation of this latter negative finding was limited by the low generalizability and statistical sensitivity conferred by the relatively small sample sizes) we saw significant three-way interactions of valence, reversal, and depression and also failed to demonstrate striatum-specific differences between depressed and comparison groups on punishment-based reversals. The group difference in the striatal hemodynamic response was significant only when we compared responses to unexpected reward.

Under a variety of experimental conditions, mood disorders have been associated with abnormal neural processing in structures implicated in appetitive and aversive learning, including the orbitofrontal cortex (18–20) and the amygdala (5), which likely contributes to the overall neurocognitive profile of depression. The locus within the striatum where we observed an attenuated hemodynamic response to unexpected rewards implicated a region of the anterior ventrolateral putamen, which receives projections from both the medial and orbital prefrontal cortical networks (14, 27) as well as the amygdala (28). Thus the attenuated BOLD response in the putamen may have been driven by abnormal afferent transmission from these cortical regions (27, 29) rather than by a specific abnormality within the striatum. Notably, lesions in the ventral striatum, orbitofrontal cortex, pallidum, or mediodorsal nucleus of the thalamus have all been shown to cause perseverative deficits in stimulus-reward reversal tasks in rats and monkeys, such that the animals have difficulty switching away from previously rewarded but not unrewarded stimuli (14). The present study thus extends the sources of altered neural transmission in depression to encompass attenuated reward reversal-related responses in the ventral striatum, but this finding is interpreted within the context of the limbic-prefrontal cortical-striatal-pallidal-thalamic circuits involving this part of the striatum (11, 12, 14).

While the negative bias demonstrated with the reversal learning task used here joins the affective biases demonstrated by a range of cognitive tasks in depression (2), the specific direction of the impairment we observed (attenuated reward processing rather than improved punishment processing) may be related to the impaired ability to derive pleasure from rewarding activities seen in depression. This hypothesis would be compatible with evidence that the functioning of the mesolimbic dopaminergic system, which plays a major modulatory role within the limbic-cortical-striatal-pallidal-thalamic circuitry (30), is reduced in depression (both in general and in response to unpredicted reward) (1, 14, 31, 32) and with evidence for the involvement of dopamine in punishment and reward learning in the striatum in this (8–10, 33) and other (34, 35) tasks. Individuals with higher dopamine synthesis capacity, for instance, demonstrate improved reward-based relative to punishment-based reversal learning on the task we used here (10, 36). Moreover, amphetamine-induced dopamine release within the anteroventral putamen is correlated with subjective feelings of euphoria (or hedonia) in healthy individuals (37, 38). Thus, the attenuated anteroventral putamen response we identified in depression may reflect a reciprocal process: attenuated striatal response associated with reduced dopamine release and anhedonia. It is conceivable, furthermore, that amelioration of the reversal learning impairment and anhedonia in depression would result from enhancement of the mesolimbic dopaminergic system (17, 39). Nevertheless, these hypotheses require testing in future studies, since the present study included neither anhedonia ratings nor assessments of central dopaminergic function.

Finally, our findings do not invalidate the proposition that depression is also associated with hypersensitivity to punishment in other contexts, such as when performance declines after a perceived error (and associated aversive feedback) on planning or mnemonic tasks (2). Indeed, alterations to both reward and punishment processing are seen in depression (1), and while this “catastrophic response to perceived failure” (2, p. 64) is likely due to an enhanced impact of negative (punishing) judgment on performance, the task used in this study does not provide patients with explicit judgment about their performance and may therefore tap into distinct reward and punishment processing mechanisms. Indeed, one key advantage of neurocognitive assessment as a measure of pathology is that it is possible to target distinct neural systems with different cognitive tasks, thereby breaking down the underlying architecture of such multifaceted and subjective behaviors. Recent findings in fact implicate a habenula-rostromedial tegmental circuit in the processing of reward omission and expected punishment (40), but our fMRI parameters were not optimized to detect signal change in a structure of this small size. Whether this circuit therefore underlies altered punishment processing in depression is a question for future research.

Conclusions

These results suggest that altered reversal learning in depression is driven by attenuated striatal function and that this effect depends more specifically on an attenuated response to unexpected reward rather than to unexpected punishment. The region of the striatum critical for this bias corresponds with the anteroventral putamen, which is known to play a key role in hedonic processing and may therefore represent the neural underpinnings of anhedonic mood in depression. Improving the ability of depressed patients to learn about rewarding feedback, including social interactions and positive life experiences, is critical for recovery. The findings from this study provide a neural target for such recovery.

From the Section on Neuroimaging in Mood and Anxiety Disorders, NIMH, Bethesda, Md.; the Department of Psychiatry and MRC/Wellcome Trust Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, U.K.; Addenbrooke's Hospital, Cambridge; Center for Cognitive Neuroimaging, Donders Institute for Brain, Cognition, and Behavior, Department of Psychiatry, Radboud University Nijmegen Medical Center, Nijmegen, the Netherlands; and the Laureate Institute for Brain Research and Department of Psychiatry, Oklahoma University College of Medicine, Tulsa.
Address correspondence and reprint requests to Dr. Robinson.

Presented in part as a poster at the Society for Neuroscience Conference, San Diego, Nov. 13–17, 2010.

Received Jan. 25, 2011; revisions received April 21 and June 1, 2011; accepted June 10, 2011.

Dr. Sahakian has received grant support from Johnson & Johnson, has served as a consultant for Boehringer-Ingelheim, Cambridge Cognition, Eli Lilly, GlaxoSmithKline, Hoffmann-La Roche, Novartis, and Shire, and receives an honorarium from the Journal of Psychological Medicine. Dr. Drevets has served as a consultant for Pfizer, Johnson & Johnson, Eisai, and Rules-Based Medicine. The other authors report no financial relationships with commercial interests.

Supported by the NIMH Intramural Research Program (protocol number 04-M-0002).

References

1. Eshel N, Roiser JP: Reward and punishment processing in depression. Biol Psychiatry 2010; 68:118–124

2. Clark L, Chamberlain SR, Sahakian BJ: Neurocognitive mechanisms in depression: implications for treatment. Ann Rev Neurosci 2009; 32:57–74

3. Elliott R, Rubinsztein JS, Sahakian BJ, Dolan RJ: The neural basis of mood-congruent processing biases in depression. Arch Gen Psychiatry 2002; 59:597–604

4. Robinson OJ, Sahakian BJ: Recurrence in major depressive disorder: a neurocognitive perspective. Psychol Med 2008; 38:315–318

5. Taylor Tavares JV, Clark L, Furey ML, Williams GB, Sahakian BJ, Drevets WC: Neural basis of abnormal response to negative feedback in unmedicated mood disorders. Neuroimage 2008; 42:1118–1126

6. Murphy FC, Michael A, Robbins TW, Sahakian BJ: Neuropsychological impairment in patients with major depressive disorder: the effects of feedback on task performance. Psychol Med 2003; 33:455–467

7. Dombrovski AY, Clark L, Siegle GJ, Butters MA, Ichikawa N, Sahakian BJ, Szanto K: Reward/punishment reversal learning in older suicide attempters. Am J Psychiatry 2010; 167:699–707

8. Cools R, Altamirano L, D'Esposito M: Reversal learning in Parkinson's disease depends on medication status and outcome valence. Neuropsychologia 2006; 44:1663–1673

9. Robinson O, Standing H, DeVito E, Cools R, Sahakian B: Dopamine precursor depletion improves punishment prediction during reversal learning in healthy females but not males. Psychopharmacology (Berl) 2010; 211:187–195

10. Cools R, Frank MJ, Gibbs SE, Miyakawa A, Jagust W, D'Esposito M: Striatal dopamine predicts outcome-specific reversal learning and its sensitivity to dopaminergic drug administration. J Neurosci 2009; 29:1538–1543

11. Robinson OJ, Frank MJ, Sahakian BJ, Cools R: Dissociable responses to punishment in distinct striatal regions during reversal learning. Neuroimage 2010; 51:1459–1467

12. Cools R, Clark L, Owen AM, Robbins TW: Defining the neural mechanisms of probabilistic reversal learning using event-related functional magnetic resonance imaging. J Neurosci 2002; 22:4563–4567

13. Schultz W, Dickinson A: Neuronal coding of prediction errors. Ann Rev Neurosci 2000; 23:473–500

14. Price JL, Drevets WC: Neurocircuitry of mood disorders. Neuropsychopharmacology 2009; 35:192–216

15. Pizzagalli DA, Holmes AJ, Dillon DG, Goetz EL, Birk JL, Bogdan R, Dougherty DD, Iosifescu DV, Rauch SL, Fava M: Reduced caudate and nucleus accumbens response to rewards in unmedicated individuals with major depressive disorder. Am J Psychiatry 2009; 166:702–710

16. Remijnse PL, Nielen MM, van Balkom AJ, Hendriks GJ, Hoogendijk WJ, Uylings HB, Veltman DJ: Differential frontal-striatal and paralimbic activity during reversal learning in major depressive disorder and obsessive-compulsive disorder. Psychol Med 2009; 39:1503–1518

17. Hasler G, Mondillo K, Drevets WC, Blair JR: Impairments of probabilistic response reversal and passive avoidance following catecholamine depletion. Neuropsychopharmacology 2009; 34:2691–2698

18. Remijnse PL, Nielen MMA, Uylings HBM, Veltman DJ: Neural correlates of a reversal learning task with an affectively neutral baseline: an event-related fMRI study. Neuroimage 2005; 26:609–618

19. O'Doherty J, Critchley H, Deichmann R, Dolan RJ: Dissociating valence of outcome from behavioral control in human orbital and ventral prefrontal cortices. J Neurosci 2003; 23:7931–7939

20. Tsuchida A, Doll BB, Fellows LK: Beyond reversal: a critical role for human orbitofrontal cortex in flexible learning from probabilistic feedback. J Neurosci 2010; 30:16868–16875

21. Hamilton M: A rating scale for depression. J Neurol Neurosurg Psychiatry 1960; 23:56–62

22. O'Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ: Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 2004; 304:452–454

23. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer B, Joliot M: Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 2002; 15:273–289

24. Brett M, Anton J-L, Valabregue R, Poline J-B: Region of interest analysis using an SPM toolbox (abstract). Presented at the 8th International Conference on Functional Mapping of the Human Brain, Sendai, Japan. Neuroimage 2002; 16(2):abstract 497

25. Forbes EE, Hariri AR, Martin SL, Silk JS, Moyles DL, Fisher PM, Brown SM, Ryan ND, Birmaher B, Axelson DA, Dahl RE: Altered striatal activation predicting real-world positive affect in adolescent major depressive disorder. Am J Psychiatry 2009; 166:64–73

26. Kumar P, Waiter G, Ahearn T, Milders M, Reid I, Steele JD: Abnormal temporal difference reward-learning signals in major depression. Brain 2008; 131:2084–2093

27. Haber SN, Kim K-S, Mailly P, Calzavara R: Reward-related cortical inputs define a large striatal region in primates that interface with associative cortical connections, providing a substrate for incentive-based learning. J Neurosci 2006; 26:8368–8376

28. Russchen FT, Bakst I, Amaral DG, Price JL: The amygdalostriatal projections in the monkey: an anterograde tracing study. Brain Res 1985; 329:241–257

29. Shulman RG, Rothman DL, Behar KL, Hyder F: Energetic basis of brain activity: implications for neuroimaging. Trends Neurosci 2004; 27:489–495

30. Graybiel AM: Neurotransmitters and neuromodulators in the basal ganglia. Trends Neurosci 1990; 13:244–254

31. Martin-Soelch C: Is depression associated with dysfunction of the central reward system? Biochem Soc Trans 2009; 37:313–317

32. Nestler EJ, Carlezon JWA: The mesolimbic dopamine reward circuit in depression. Biol Psychiatry 2006; 59:1151–1159

33. Cools R, Lewis SJG, Clark L, Barker RA, Robbins TW: l-Dopa disrupts activity in the nucleus accumbens during reversal learning in Parkinson's disease. Neuropsychopharmacology 2006; 32:180–189

34. Frank MJ: Dynamic dopamine modulation in the basal ganglia: a neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism. J Cogn Neurosci 2005; 17:51–72

35. Frank MJ, Seeberger LC, O'Reilly RC: By carrot or by stick: cognitive reinforcement learning in Parkinsonism. Science 2004; 306:1940–1943

36. Cools R, Sheridan M, Jacobs E, D'Esposito M: Impulsive personality predicts dopamine-dependent changes in frontostriatal activity during component processes of working memory. J Neurosci 2007; 27:5506–5514

37. Drevets WC, Gautier C, Price JC, Kupfer DJ, Kinahan PE, Grace AA, Price JL, Mathis CA: Amphetamine-induced dopamine release in human ventral striatum correlates with euphoria. Biol Psychiatry 2001; 49:81–96

38. Martinez D, Slifstein M, Broft A, Mawlawi O, Hwang DR, Huang Y, Cooper T, Kegeles L, Zarahn E, Abi-Dargham A, Haber SN, Laruelle M: Imaging human mesolimbic dopamine transmission with positron emission tomography, part II: amphetamine-induced dopamine release in the functional subdivisions of the striatum. J Cereb Blood Flow Metab 2003; 23:285–300

39. Hasler G, Luckenbaugh DA, Snow J, Meyers N, Waldeck T, Geraci M, Roiser J, Knutson B, Charney DS, Drevets WC: Reward processing after catecholamine depletion in unmedicated, remitted subjects with major depressive disorder. Biol Psychiatry 2009; 66:201–205

40. Hikosaka O: The habenula: from stress evasion to value-based decision-making. Nat Rev Neurosci 2010; 11:503–513