Pleasure and Aversion:
Challenging the Conventional Dichotomy

George Ainslie
Veterans Affairs Medical Center, Coatesville PA, USA
University of Cape Town, South Africa
George.Ainslie@va.gov

Presented at Wanting and Liking
University of Oslo
March 2, 2008

Published in Inquiry 52 (4), 357-377, 2009.

This material is the result of work supported with resources and the use of facilities at the Department of Veterans Affairs Medical Center, Coatesville, PA, USA. The opinions expressed are not those of the Department of Veterans Affairs or of the US Government.

Abstract

Philosophy and its descendants in the behavioral sciences have traditionally divided incentives into those that are sought and those that are avoided. Positive incentives are held to be both attractive and memorable because of the direct effects of pleasure. Negative incentives are held to be unattractive but still memorable (the problem of pain) because they force unpleasant emotions on an individual by an unmotivated process, either a hardwired response (unconditioned response) or one substituted by association (conditioned response). Negative incentives are divided into those that are always avoided and those that are avoided only by higher mental processes, archetypically the passions, which are also thought of as hardwired or conditioned. Newer dichotomies within the negative have been proposed, hinging on whether a negative incentive is nevertheless sought (“wanted but not liked”) or on an incentive’s being negative only because it is confining (the product of “rule worship”). The newer dichotomies have lacked motivational explanations, and there is reason to question conditioning in the motivational mechanism for the older ones.
Both experimental findings and the examination of common experience indicate that even the most aversive experiences, such as pain and panic, do not prevail in reflex fashion but because of an urge to attend to them. The well-established hyperbolic curve in which prospective rewards are discounted implies a mechanism for such an urge, as well as for the “lower” incentives in the other dichotomies. The properties of these lower incentives are predicted by particular durations of temporary preferences on a continuum that stretches from fractions of a second to years.

I. Background of motivational dichotomies

The most fundamental choices are dichotomies: Seek or avoid, approve or disparage, go or not. It has thus been natural that we have organized our perception of the motivational universe into dichotomies: pain or pleasure, and within pleasure, impulsive or rational. However, nature may not have followed this organization plan. Even in situations where the most adaptive response is decisive choice of one alternative or its opposite, the motives by which we generate the relevant preferences may not be polar opposites themselves. In this article I will update evidence for a proposal along this line that I first made many years ago (Ainslie, 1975), and suggest how it relates to recent neurophysiological research and philosophical analysis.

From earliest history people noticed that in manipulating others’ behavior it was helpful to divide their options into those that motivated and those that did not, and to divide the motivators into rewards and punishments. Mindful of a need to escape motivational manipulation, including by one’s own sensations, philosophers soon recognized another dichotomy, between higher goods, the goals of reason, and inferior goods, objects of the passions. Plato famously likened the motives they incite to a pair of chariot horses,

the right hand… a lover of honour and modesty and temperance, and the follower of true glory… the other is… the mate of insolence and pride, shag-eared and deaf (Phaedrus, 253e)

Passion spoiled the efforts of reason, but, unlike pain, the threat of it could not be used as a deterrent. Modern philosophers have made further distinctions among higher goods, but the empiricist tradition from which psychology sprang rested on the basic split of pleasure and pain.

The chief spring or actuating principle of the human mind is pleasure or pain; and when these sensations are remov’d, both from our thought and feeling, we are, in a great measure, incapable of passion or action, or desire or volition (Hume, 1739/1888, p 574)

And the point of pleasure and pain was behavioral—They were not just sensations, but the selective principle of behavior:

The most immediate effects of pleasure and pain are the propense and averse motions of the mind. (ibid.)

Bentham was even more emphatic:

Nature has placed mankind under the governance of two sovereign masters, pain and pleasure. It is for them alone to point out what we ought to do, as well as to determine what we shall do. On the one hand the standard of right and wrong, on the other the chain of causes and effects, are fastened to their throne. They govern us in all we do, in all we say, in all we think: every effort we can make to throw off our subjection, will serve but to demonstrate and confirm it (1823/1962).

The early psychologists inherited this dichotomy: “Without an antecedent of pleasurable or painful feeling—actual or ideal, primary or derivative—the will cannot be stimulated.” (Bain, 1859/1886, p. 352). Others reacted against “the silliness of the old-fashioned pleasure-philosophy” that tried to derive all value from association with physical pleasures and pains (James, 1890, v. 2, p. 551 note). However, when they made models consistent with the neurophysiology that was known at the time, they had trouble getting beyond the same principle:

One great inhibiter of the discharge of K [sensory neuron] into M [motor neuron] seems to be the painful or otherwise displeasing quality of the sensation itself of K; and conversely… (James, 1890, v. 2, p. 583)

Dualistically allowing for the experiential aspect of this physiology did not help:

The drainage currents and discharges of the brain are not purely physical facts. They are psycho-physical facts, and the spiritual quality of them seems a codeterminant of their mechanical effectiveness. If the mechanical activities in a cell, as they increase, give pleasure, they seem to increase all the more rapidly for that fact; if they give displeasure, the displeasure seems to damp the activities. The psychic side of the phenomenon thus seems, somewhat like the applause or hissing at a spectacle, to be an encouraging or adverse comment on what the machinery brings forth. The soul presents nothing herself; creates nothing; is at the mercy of the material forces for all possibilities; but amongst these possibilities she selects; and by reinforcing one and checking others, she figures not as an ‘epiphenomenon,’ but as something from which the play gets moral support [all italics his]. (James, 1890, v. 2, p. 584)

Good things were applauded, bad things hissed. This audience reaction was not all-powerful, but gave “moral support” to an underlying “tendency” that was not goal-directed. The cells of the brain generated experiences that the soul might or might not like, and the soul responded with encouragement or discouragement. This mind/body dualism presaged the distinction between motivated and unmotivated processes that has more recently been brought to bear on the asymmetry of pleasure/pain.

Freud’s several models were all fundamentally based on pleasure and “unpleasure,” or the imagined neural equivalent (1895/1956). He eventually recognized an attractive component to pain in the “death instinct” (1923/1956), but not as something intrinsic, rather an interpretation that the mind could create. Still, he was the first author in a scientific tradition who acknowledged that ostensibly negative experiences can actively attract an individual’s participation, an idea to which I will return. He also distinguished a higher category of choice-making from a lower one, such that the higher served the purposes of the lower: “the substitution of the reality principle for the pleasure principle implies no deposing of the pleasure principle, but only a safeguarding of it” (Freud, 1911/1956, p. 223). This idea foreshadowed a softening of the passion/reason dichotomy.

Psychology evolved toward greater reductionism still holding to the basic pleasure/pain dichotomy. In formulating the precursor of Skinner’s law of effect, E. L. Thorndike depicted learnable pathways leading to “satisfiers” or “annoyers,” which deepened the pathways or made them shallower, respectively (1905). However, as he broke this process into steps, he was obliged to notice an asymmetry between the two seeming opposites—the first time that any behavioral theorist had done so.

The idea of making [the punished] response or the impulse to make it then tends to arouse a memory of the punishment and fear, repulsion, or shame. This is relieved by making no response to the situation... or by making a response that seems opposite to the original responses (1935, p. 80)

This passage implies that punishment induces learning, and not just by selecting for non-punished responses: details of the sequence leading to punishment are affirmatively remembered as "fear, repulsion, or shame." That is, punishment has to reinforce at least some kind of learning, in the most fundamental sense that it causes attention to, and later rehearsal of, the aversive experience. The problem that Thorndike's later theory raised was this: if punishment strengthens learned connections leading to it, how does it come to be avoided? His attempt to specify a substrate for learning in terms of pathways brought the problem of pain into focus for the first time.

Thorndike’s description anticipated O. H. Mowrer’s two factor theory (1947), which became the conventional solution to the problem of pain. In that model, there is a preprogrammed connection between some stimuli and certain emotional responses—fear, disgust, grief—and when the stimulus for one of these responses is predicted by any other stimulus, that stimulus acquires the property of inducing the corresponding response. In the language of classical conditioning, an innately effective, unconditioned stimulus (UCS) elicits the emotion, an unconditioned response (UCR), which is subsequently elicited (as a conditioned, or in Pavlov’s original terms, conditional response, CR) by any other stimulus to which the subject was attending at the time (conditioned stimulus, CS). This is the first factor, the one that forces an aversive experience upon the individual. She is then said to avoid the CS and CR by ordinary goal-directed learning, the second factor.

Two-factor theory has been intuitively attractive because it matches our subjective experience of coercive motives. Pain or anguish do not seem to offer themselves as options but overcome you without being sought, indeed while you are actively trying to avoid them. Reminders of these experiences can occasion a re-living of them, just as conditioned stimuli should. Furthermore, once we adopt this theory of aversive experiences, we can use it also to account for pleasurable experiences that are only transiently preferred: the passions that oppose our true interests, the other big philosophical dichotomy of motives (Hirschman, 1977). A current school of thought holds that impulsive choice is driven by a special class of incentives, “visceral rewards,” those that involve passion or appetite and can be conditioned to surprise the person when triggered by reminders of their past occurrence (Loewenstein, 1996; Laibson, 2001). So we might have two dichotomies defining a hierarchy of three kinds of experience:

Pains—attended to because of reflexive hardwiring or conditioning, but never sought
Lower pleasures—made disproportionately attractive by hardwiring or conditioning, preferred only under the influence of this temporary attraction
Higher (or rational) pleasures—those that are free from the influence of visceral factors

II. Two newer dichotomies

Two newer motivational distinctions have appeared in the literature on choice: rule-based choice versus choice that transcends or otherwise escapes the confines of rules, and pleasures versus events that are wanted without being pleasurable.

It has long been evident that achieving rationality requires active tactics against impulses. That is, achieving higher pleasures cannot occur through spontaneous choice, but only through some kind of discipline. This advice was first recorded in Aristotle’s precept that you should choose according to principle (Nicomachean Ethics 1147a24-28), and reached an extreme in Kant’s imperative that all choice should be made in such a way that it would serve as a precedent for a universal rule (1793/1960, pp. 15-49). In the nineteenth century philosophers became increasingly sensitive to the confining nature of this restriction, and split from it yet another category of experienced incentive. The nature of this split has been harder to specify than the earlier splits, and has consisted mainly of a negation of rule-bound choice as the highest principle. Existentialists and theologians of various faiths have suggested that a change in the focus of awareness can result in a freedom from impulsiveness without becoming rule-bound. These suggestions are hard to characterize in motivational terms, but their ubiquity suggests some robustness of underlying percept: an awareness of being-in-itself, empathic experience of the love of a divinity, or “enlightenment.” Some philosophers have suggested that the higher basis of choice is a progression beyond the discipline of rules that does not negate this discipline: William James said that “the highest function of the will is to rise above a rule that has grown too narrow for the case” (1890, p. 209), and Freud described the harmful potential of the superego (1923/1956). More recently the developmental psychologist Lawrence Kohlberg tried to define a stage of existential integrity beyond his erstwhile highest developmental stage, moral reasoning (1973).
“Motive utilitarians” have recommended being guided by metapreferences rather than rules (Adams, 1976), and “two-level utilitarians” have said that it is necessary occasionally to use a “critical” faculty to transcend rules, judging rather as an “archangel” would (Hare, 1963). The general group of act utilitarians argue against rule utilitarians, but without an explicit theoretical account of how to avoid impulsiveness (e.g. Emmons, 1973). These various writers have in common only a view that rules for choice are an expedient that may restrict you from your most heartfelt wishes. Their dichotomy of selective principles might be called rule-bound versus transcendent.

The other new dichotomy is easier to specify but more disconcerting to motivational theorists. Near-satiated smokers have often been puzzled, and irritated, by the urge to light a cigarette that they know will not be pleasurable. Urges to emit tics or mannerisms, bite fingernails, or aggravate a psychogenic itch have been equally puzzling. Obsessive-compulsive disorder is characterized mainly by such short-term urges, for instance to wash, check for security, or pull out hairs (American Psychiatric Association, 1994). People with eating problems often eat without enjoyment (Volkow et al., 2002), and recent neurophysiologic research has induced laboratory animals to seek rewards for which they evince clear dislike (Berridge, this issue). Berridge and his colleagues have analyzed several such behaviors, which subjects have strong tendencies to repeat despite a lack of pleasure that is evident either from self-reports, or, in the case of nonhumans, facial expressions (Berridge & Robinson, 1998; Berridge, 2003). These authors were suspicious of supposed brain “pleasure centers” that induced avid electrical self-stimulation despite reported sensations that patients said they did not enjoy and that rats would not cross their cages to initiate after periods of interruption. Using increasingly precise physiological mapping techniques they have separated brain centers that subtend pleasurable activities and centers for activities that are “wanted but not liked,” which subjects perform repeatedly despite evident distaste for them (Pecina et al., 2006). The authors interpret the latter findings as evidence of a dissociable component of decision making, response selection by “salience.” This interpretation remains controversial (O’Doherty, 2004), partly because of the difficulty of separating attention from preference (Maunsell, 2004; Schultz, 2006). The motivational properties of salience have been especially difficult to define.
Despite the “wanting” label the Berridge group follows Volkow in calling this kind of selection “nonhedonic” (Berridge, 2003); a wanted, disliked goal is nevertheless “a motivational magnet” (Berridge, 2007) or “a false pleasure” (Pecina et al., 2006). They make it clear that wanting is still motivation, and must interact with hedonic reward to determine which motor behavior a subject will perform. These unwanted behaviors are certainly not conditioned. They use the voluntary (skeletal) muscles, and can be overcome by sufficient incentive, as illustrated dramatically by the case of a surgeon whose Tourette’s syndrome stays quiescent while he operates (Canadian Medical Association Journal, 1991). The motivation for them represents a hedonic category with distinct properties that sits somewhere between pains and pleasures.

Now five qualitatively distinct hedonic experiences have been described, some accompanied by theoretical explanations:

Pains—attended to because of reflexive hardwiring or conditioning, but never sought. These are not wanted and not liked.
Itches—events that are voluntarily sought without being pleasurable—wanted but not liked, a seeming paradox that can be reproduced but not explained by experimental procedures.
Lower pleasures—made disproportionately attractive by hardwiring or conditioning, preferred only under the influence of this temporary attraction. These are liked but not approved of.
Systematic higher pleasures—those that are free of the influence of visceral factors, but constrained by rules, approved of but not necessarily wished for.
Transcendent higher pleasures—those that are free of the distortions imposed by rules, and, presumably, of visceral factors. The wished-for rational ideal.

III. In a one-factor theory valence hinges on the timing of rewards

This list of qualitatively distinct experiences might seem to need a number of different explanations. Such a need would be even more difficult than it seems at first, because of the inadequacy of two factor theory as a basic explanation (Mackintosh, 1983, pp. 99-170). I have elsewhere summarized evidence that the simple response-transfer process called conditioning cannot be what governs emotional processes, negative or positive (Ainslie, 2001, pp. 18-22, 65-69; 2010). I will not review that topic here, but instead proceed to a proposal that makes this kind of connectionistic mechanism unnecessary.

Conditioning has probably remained popular as a mechanism of response selection despite its disadvantages because a purely reward-based theory has always seemed inadequate to the task. Theorists have proposed a unitary selective principle in the past (Hilgard & Marquis, 1940; Donahoe et al., 1993), pointing out the striking coincidence that all stimuli that can induce conditioning (all UCSs) have a motivational valence as well (Hull, 1943; Miller, 1969). But the rewards that have been used in experiments select only for approach, while UCSs select for both approach and avoidance. That is, only UCSs have seemed to be able to both reward and punish. Besides, many emotions do not feel pleasurable. As William James asked,

Who smiles for the pleasure of the smiling, or frowns for the pleasure of the frown? Who blushes to escape the discomfort of not blushing? Or who in anger, grief, or fear is actuated to the movements which he makes by the pleasures which they yield? (1890, v. 2, p. 550)

It would be easy to argue that smiles really are motivated by the pleasure of smiling, but “the pleasure of the frown” or “the discomfort of not blushing” seem to represent contradictions in terms. And yet we will need to find a rationale for them if we are to do without a coercive principle such as conditioning.

The existence of an internal marketplace for positive incentives has long been postulated by utility theorists, economists foremost among them. Neurophysiologists have reiterated the necessity of recognizing such a marketplace (Montague & Berns, 2002; Shizgal & Conover, 1996). This would be to say that many diverse processes compete for a limited channel of expression on the basis of a common dimension of selectability, such that a relative increase in this dimension for an act of game-playing, say, or charity, can lead it to be selected over an act of food consumption, while a relative decrease for the game or charity could lead the consumption to be selected. However, only desirable processes are usually imagined to compete directly with one another. Backed by intuition, two-factor theory has dictated that aversive processes participate only negatively in this marketplace: that they are introduced by a non-market process and have their effect only by making means to escape them rewarding. Conversely, the notion that aversive processes are directly selectable along the same dimension as desirable ones has been counterintuitive because of two problems: a linguistic happenstance, and the assumption that the relative value of prospective events stays constant in the absence of new information. The linguistic problem is easily dealt with. We use the words “reward” or “utility” for a property that is deliberately sought, and different words such as “urgency” or “vividness” for a property that seems to demand attention without deliberation; yet the latter terms also imply positive motivation, that is, motivation that impels you toward the thing, rather than away from it. If we stop equating rewardingness with desirability (the property that lets something be deliberately sought) and define it more basically as the property that makes whatever process it follows tend to be repeated, we can both broaden and simplify our picture of the internal marketplace.
The neurophysiologist Maunsell has made a similar proposal, and has suggested that even attention is governed by reward (2004).
The assumption that the relative value of prospective events stays constant in the absence of new information has been shown to be false. Smaller, sooner (SS) rewards are temporarily preferred to larger, later (LL) ones as the result of a basic discount curve that is more concave than a “rational” exponential curve (Ainslie, 2001, pp. 27-39). The shape of the discount curve has been repeatedly demonstrated by controlled experiment. Given choices between rewards of varying sizes at varying delays, both human and nonhuman subjects express preferences that by least squares tests fit curves of the form,

\[ \mathrm{Value} = \frac{\mathrm{Value}_0}{1 + k \times \mathrm{Delay}} \]

a hyperbola (where Value0 = value if immediate, Delay = delay to the reward, and k is degree of impatience), better than the form,

\[ \mathrm{Value} = \mathrm{Value}_0 \times \delta^{\mathrm{Delay}} \]

an exponential curve (where Value0 = value if immediate and δ = [1 – discount rate]) (Grace, 1996; Green, Fry & Myerson, 1994; Green & Myerson, 2004; Kirby, 1997; Mazur, 2001). The hyperbolic shape has been confirmed by three other approaches as well:

1. Given choices between SS rewards and LL ones available at a constant lag after the SS ones, subjects prefer the LL reward when the delay before both rewards is long, but switch to the SS reward as it becomes imminent, a pattern that would not be seen if the discount curves were exponential (Ainslie & Herrnstein, 1981; Ainslie & Haendel, 1983; Green et al., 1981; Kirby & Herrnstein, 1995). This is true even if the delay to the SS reward is never zero (Green et al., 2005), thus ruling out an alternative explanation that invokes an effect of immediacy per se.

2. Given choices between SS rewards and LL ones, nonhuman subjects will sometimes choose an option available in advance that prevents the SS alternative from becoming available (Ainslie, 1974; Hayes et al., 1981).

3. When a whole series of LL rewards and SS alternatives must be chosen all at once, both human and nonhuman subjects choose the LL rewards more than when each SS vs. LL choice can be made individually (Ainslie & Monterosso, 2003; Kirby & Guastello, 2001).
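
The first and third of these patterns can be reproduced numerically from the two discount formulas. The following is a minimal sketch, not taken from the experiments cited: the amounts, lag, spacing, k, and δ are arbitrary illustrative assumptions, and the function names are hypothetical. It compares an SS reward with an LL reward that follows it at a constant lag, first as a single choice and then as a bundle of five such choices made all at once.

```python
# Illustrative sketch only: all numeric values below are assumptions,
# not parameters reported in the studies cited in the text.

def hyperbolic(amount, delay, k=1.0):
    """Value = Value0 / (1 + k * Delay)."""
    return amount / (1.0 + k * delay)

def exponential(amount, delay, delta=0.7):
    """Value = Value0 * delta ** Delay."""
    return amount * delta ** delay

SS_AMOUNT = 5.0   # smaller, sooner reward (assumed size)
LL_AMOUNT = 10.0  # larger, later reward (assumed size)
LAG = 3.0         # constant lag from SS to LL (assumed)

def preferred(discount, delay_to_ss):
    """Which single reward has the higher discounted value at this delay."""
    ss = discount(SS_AMOUNT, delay_to_ss)
    ll = discount(LL_AMOUNT, delay_to_ss + LAG)
    return "SS" if ss > ll else "LL"

def preferred_bundle(discount, delay_to_first_ss, n_pairs=5, spacing=10.0):
    """Which whole series is preferred when all pairs are chosen at once."""
    ss_total = sum(discount(SS_AMOUNT, delay_to_first_ss + i * spacing)
                   for i in range(n_pairs))
    ll_total = sum(discount(LL_AMOUNT, delay_to_first_ss + LAG + i * spacing)
                   for i in range(n_pairs))
    return "SS" if ss_total > ll_total else "LL"

if __name__ == "__main__":
    for delay in (10.0, 0.5):
        print("delay to SS =", delay,
              "| hyperbolic prefers", preferred(hyperbolic, delay),
              "| exponential prefers", preferred(exponential, delay))
    print("single choice at delay 1:", preferred(hyperbolic, 1.0))
    print("bundle of 5 at delay 1:  ", preferred_bundle(hyperbolic, 1.0))
```

Under these assumed values the hyperbolic chooser prefers LL from a distance and SS when it is near, even though the delay to SS is never zero, while the exponential chooser's preference does not change with delay. The distant preference for LL is also what would motivate the advance commitment described in the second pattern.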

In short, hyperbolic discounting seems to be an elementary property of the reward process. The resulting implication that our choices are intrinsically unstable has an obvious application to addictions and other impulsive behaviors, and is also relevant to the relationship between pain and pleasure. Impulses are motivated not simply by reward or aversion, but by a sequence of reward and obligatory nonreward. The occurrence of a reward that satiates quickly but recovers during a period of inhibited reward can be expected to set up a cycle in which the reward and nonreward alternate as long as the conditions that give rise to the reward are present (figure 1). The most familiar such pattern is the cycle of binge and hangover, in which the rewarding phase may last for hours—or minutes in the cases of nicotine or crack cocaine—and is expandable to some extent by repeating and increasing the dose, but always sets up a dysphoric phase when the substance loses its potency and/or has to be discontinued. Of course, the satiation of any significant pleasure will result in a period of refractoriness to that pleasure by the general principle of opponent processes (Solomon & Wynne, 1954). It is always possible that a person will regret the combination during the dysphoric phase, however great the pleasure and however slight the dysphoric reaction. However, hyperbolic discount curves offer a clear definition of when the choice of the combination is irrational, that is, regrettable from what could be called an objective standpoint: It will be irrational if the combination is familiar and the individual chooses it when it is imminently available, but chooses to avoid it, if given the chance, from some distance in advance. This is because, from the perspective of distance, the summed discount curves from the combination will approach proportionality with the sum of its heights.

Figure 1A

Figure 1B
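
The criterion for irrationality stated above can be illustrated with a small calculation. This is a sketch under assumed numbers, none of which come from the article: a brief reward phase is followed by a longer phase of obligatory nonreward, chosen so that the undiscounted total is negative. The function names and the values of k and δ are hypothetical.

```python
# Illustrative sketch only: the reward stream, K, and DELTA are assumptions.

K = 1.0      # hyperbolic impatience parameter (assumed)
DELTA = 0.9  # exponential per-step discount factor (assumed)

# +10 per step for 2 steps, then -3 per step for 8 steps.
# The undiscounted sum of its heights is 20 - 24 = -4.
STREAM = [10.0] * 2 + [-3.0] * 8

def hyperbolic_value(stream, lead, k=K):
    """Summed hyperbolic value of the stream, seen `lead` steps before it starts."""
    return sum(a / (1.0 + k * (lead + t)) for t, a in enumerate(stream))

def exponential_value(stream, lead, delta=DELTA):
    """Summed exponential value of the stream, seen `lead` steps before it starts."""
    return sum(a * delta ** (lead + t) for t, a in enumerate(stream))

if __name__ == "__main__":
    for lead in (0, 1, 10, 100):
        print("lead", lead,
              "| hyperbolic:", round(hyperbolic_value(STREAM, lead), 4),
              "| exponential:", round(exponential_value(STREAM, lead), 6))
```

With these assumed values the summed hyperbolic value is positive when the combination is imminent and negative at a long lead time, where it approaches proportionality with the undiscounted total. The exponential sum only shrinks with distance and never changes sign, so it could not produce a choice that is made when imminent but avoided in advance.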

IV. Dichotomies can be replaced with a continuum of payoff times

With different reward durations and values, figure 1 can describe the other hedonic experiences on our list, within some constraints. Again because of the opponent process, the durations and depths of obligatory nonreward following a cyclic reward will be correlated with that of the reward it follows. Amounts of experienced reward will probably lie within physiological limits similar to those of direct sensory experiences, thus having a range of no more than a hundredfold or so from least to greatest (see Ainslie, 2006). However, the durations of rewards can fall anywhere on a continuum from fractions of a second to years. Prospects of very long durations will probably be estimates based on shorter periods of actual experience, and will thus be somewhat complex—I will touch on this possibility shortly. Durations shorter than the binge/hangover cycle just described should result in simple motivational patterns; by some coincidence these match the two lower categories of hedonic experience charted above:

When cycles of preference and dysphoria are on the order of seconds to minutes they will cease to be something you seek in advance. The cigarette or salted nut close to the satiation point, or the chance to bite a fingernail, will still have motivating power when imminently available, but at any significant distance your motivation will be to keep them from becoming available—Put them away until the appetite is stronger or trim the nail before you are alone again. The combination could be said to be wanted but not liked. Unfortunately, no one has studied “wanting” as a function of delay to available reward in the wanted/liked paradigms, but I would predict a strong effect. If so, hyperbolic curves can suggest a clear resolution to the problem of how to conceive the hedonic status of a wanted-but-not-liked event: It is rewarding, but for only small parts of its duration. Remember that I am using the term “reward” without any connotation of pleasure, but rather in its most basic functional sense: an event that increases the frequency of (selects for) a process that it follows. Following an urge to emit a tic or pursue an intrusive memory may not be pleasurable, but it can still be governed by the recurring opportunity for reward depicted in figure 1b.
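
How the period of temporary preference might depend on cycle length can be sketched with the same hyperbolic formula. The example below is an assumption-laden illustration, not a model proposed in the text: it treats reward and nonreward as constant rates over continuous time, fixes the nonreward phase at four times the length of the reward phase, and uses arbitrary rates and k = 1 in arbitrary time units. It finds the lead time at which the summed value of the cycle changes from positive to negative.

```python
import math

# Illustrative sketch only: rates, the 4:1 duration ratio, and K are assumptions.
K = 1.0  # hyperbolic impatience parameter per (arbitrary) time unit

def cycle_value(lead, reward_dur, reward_rate=10.0,
                nonreward_rate=3.0, nonreward_ratio=4.0, k=K):
    """Hyperbolic value of a reward phase followed by a nonreward phase,
    seen `lead` time units before the reward starts.

    Uses the closed-form integral of rate / (1 + k * delay) over each phase.
    """
    start = 1.0 + k * lead
    mid = start + k * reward_dur
    end = mid + k * nonreward_ratio * reward_dur
    gain = reward_rate * math.log(mid / start)
    loss = nonreward_rate * math.log(end / mid)
    return (gain - loss) / k

def preference_window(reward_dur):
    """Lead time at which the cycle's value changes sign, found by bisection.

    Returns None if the cycle is not preferred even when immediate.
    """
    lo, hi = 0.0, 1000.0 * reward_dur
    if cycle_value(lo, reward_dur) <= 0 or cycle_value(hi, reward_dur) >= 0:
        return None
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if cycle_value(mid, reward_dur) > 0:
            lo = mid
        else:
            hi = mid
    return lo

if __name__ == "__main__":
    for dur in (1.0, 10.0, 100.0):
        print("reward duration", dur,
              "-> preferred up to a lead time of about",
              round(preference_window(dur), 1))
```

Under these assumptions the window within which the combination is preferred shrinks roughly in proportion to the length of the cycle, which is one way to picture why a brief cycle would motivate only when imminently available. Whether "wanting" actually varies with delay in this way is, as noted above, an untested prediction.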

Shortening the cycle length in figure 1b will let us do without the most basic of the conventional dichotomies, that between pleasure and pain. Protopathic pain has always seemed to be a reflex, and aversive emotions such as panic, anguish, and dread to be conditioned reflexes, because a strong urge to attend to them has to be combined with a strong urge to avoid or escape them. As long as reward implies pleasure some factor beyond reward is needed to account for their vividness—their tendency to attract attention, or, better stated, to occasion the individual’s affective participation in them. However, if we recognize reward as just a retroactive selective factor without regard to pleasure, aversion can be seen as a combination of reward and the consequent obligatory inhibition of reward. From the case of wanted-but-not-liked it is not a great leap to given-in-to-but-not-wanted. The same cyclic mechanism may explain seemingly unmotivated responses generally. If the “wanted” phase of an option is too short to motivate even brief approach behaviors, it might still attract attention and negative appetites/emotions.

Maunsell has suggested that the extensive debate about whether disliked behaviors are rewarded or merely “salient” has arisen because we have too narrow a definition of reward: “If reward is defined to include all motivating factors, then there may be no differences between attention and expectation of reward” (2004, p. 264)—which is also the usage I have suggested. He thus implies that the concept should extend beyond the wanted to the attended-to. Berridge’s definition of incentive salience still excludes magnets for attention per se. However, the urge to pay attention to an aversive process might be how the very truncated “wanting” of this process is experienced—not in a consciously discriminable approach phase, but merged with its alternating avoidance phase as in flicker fusion. The reward that selects for this attention thus cannot last long, but also cannot be slight if the urge is strong; hence it might be represented as recurring tall, thin spikes. The reward must be brief because it must be intense, or the net effect would be pleasure; conversely, the nonreward phase must be long enough to give the whole cycle a negative value. Future theorists might find reason to conceive this combination in other ways, such as simultaneous processes in separate brain centers that control attention/participation and motor behavior; but the simplest mechanism for combining these opposite effects depicts them in sequence in a single comprehensive reward center: If speeding up the binge/hangover cycle can account for behaviors that are wanted but not liked, speeding it up even more should account for frank aversions. When the duration of dominance of the narrow spikes of reward in figure 1b is long enough to govern attention, and the frequency of the cycle is above your flicker fusion threshold, you should experience a strong urge that draws you into an aversive emotion at the same time that you want to get away from it. 
This, then, would explain the lure of panic, and offer a way that James could accept “the discomfort of not blushing” (1890, v. 2, p. 550; see above).

There are sometimes stable behaviors that extend over years but that a person feels imprisoned by. These behaviors often fit the pattern of the overly rule-bound choices about which philosophers have complained. In addition to patients who have named syndromes such as anorexia nervosa and obsessive-compulsive personality disorder (OCPD), there are many people who simply feel stuck with rigid characters. However, unlike the rapid preference cycles of obsessive-compulsive disorder (OCD), for which patients are eager to get treatment, the overcontrol symptoms of OCPD are stable over time and systematically defended by the patient herself, even though it is not infrequently the same patient who has both syndromes (Mataix-Cols et al., 2000). I have argued that such rigidity comes from the overuse or inept use of the process that creates willpower: the interpretation of current choices as test cases for similar choices in the future. Such interpretation gives rise to an intertemporal game of repeated prisoner’s dilemma that is stabilized by establishing personal rules as criteria for what will constitute cooperation; too great a reliance on this interpretation leads to the “rule-worship” that philosophers since Kant have decried (Mintoff, 2004). Such a recursive self-prediction process is made possible, and useful, only by the instability of spontaneous choice over time that comes from hyperbolic discount curves.
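
The effect of treating a current choice as a test case for a whole series of similar choices can be sketched numerically. The example below is an illustration under assumptions, not a model fitted to any data: it again assumes the hyperbolic form value = amount / (1 + k × delay), and the reward sizes, delays, interval between choices, and function names are all hypothetical. It compares a smaller-sooner reward with a larger-later one, first as a single choice and then as a bundle of recurring choices.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Present value of `amount` at `delay` time steps (assumed hyperbolic form)."""
    return amount / (1.0 + k * delay)


def bundle_value(amount, first_delay, n_choices, interval=10.0, k=1.0):
    """Summed present value of the same reward recurring `n_choices` times,
    once every `interval` time steps, starting at `first_delay`."""
    return sum(hyperbolic_value(amount, first_delay + i * interval, k)
               for i in range(n_choices))


def preferred(n_choices, ss=(5.0, 1.0), ll=(10.0, 4.0)):
    """Which option has the higher summed value when the current choice
    is taken to predict `n_choices` similar choices.

    `ss` and `ll` are (amount, delay) pairs for the smaller-sooner and
    larger-later rewards; both are illustrative assumptions.
    """
    ss_total = bundle_value(ss[0], ss[1], n_choices)
    ll_total = bundle_value(ll[0], ll[1], n_choices)
    return "larger-later" if ll_total > ss_total else "smaller-sooner"


if __name__ == "__main__":
    for n in (1, 2, 4, 10):
        print(n, "choice(s) at stake ->", preferred(n))
```

Under these assumed values the isolated choice favors the smaller-sooner reward, while the same choice seen as a precedent for ten like it favors the larger-later one. The sketch says nothing about when reliance on such bundling becomes excessive; it only shows why the interpretation has motivational force.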

The process becomes excessive in some people because of the differential value of unique, concrete criteria in solving repeated prisoner’s dilemmas. I have explored this topic at length elsewhere (Ainslie, 2001, pp. 90-104). Here I suggest only that it completes the table of hedonic experiences which have been paired variously as opposites in the literature of motivational science and philosophy.

Organized as a range of durations, our table of hedonic experiences neatly overlies the continuum that I proposed some years ago as predicted by hyperbolic discount curves (Ainslie, 1992, pp. 96-122; 2001, pp. 48-65):

Table 1

In this table higher processes are higher because they are more foresighted: They seek increasingly distant rewards by forestalling some short-sighted processes and accepting others. Their basic conflict is not between pain and pleasure or viscerality and rationality, but between shorter and longer term efficacy at getting reward. Part of a person’s long term efficacy may involve classifying possible outcomes as good and bad, but in making such dichotomies she ignores the strategic interaction of processes that pay off in different time zones. There could be any number of such zones, but on an experiential basis it is hard to clearly differentiate more than the five in the table. The complexity of their competition is suggested by an example from Jon Elster:

I wish that I didn't wish that I didn't wish to eat cream cake. I wish to eat cream cake because I like it. I wish that I didn't like it, because, as a moderately vain person, I think it is more important to remain slim. But I wish I was less vain (1989, p. 37 note).

If we imagine a person gorging herself with cream cake on a hot day, we can add two more levels and represent the whole table:

I wish I weren’t so vain (highest wish)
My vanity supplants my better self (approved, not wished for)
My craving overcomes my vanity (liked, not approved)
I can’t help worrying about food poisoning, but it spoils my pleasure (wanted but not liked)
The pain of my distended stomach blocks out everything else (given in to but not wanted)

These experiences are hierarchical, from higher to lower mental processes, in the sense that each includes motivation to get rid of the one below it for as long as possible—or perhaps all the ones below it, except that a higher process might accept a lower process in order to get rid of an intermediate one: A desire not to be vain (highest level) may use the appetite for cream cake (addiction level) to defeat the vanity (compulsion level), or a wish to be generous (highest level) may license a wish to gamble at a charity event (addiction level) to overcome avarice (compulsion level). Higher processes can be seen as brokering lower ones, as Francis Bacon once described:

[Sometimes it is wise to] set affection against affection and to master one by another: even as we use to hunt beast with beast... For as in the government of states it is sometimes necessary to bridle one faction with another, so it is in the government within (quoted in Hirschman, 1977, p.22).

Conversely, lower processes can be seen as finding and suggesting rationalizations to higher processes as if to apply for this role. Lower processes replace higher ones when conditions permit them to, but they are not motivated to damage the higher ones, that is, to interfere with them beyond the present moment.

These zones of temporary preference are defined experientially, and do not have sharp borders. The urge to do an addictive activity when nearly satiated becomes wanted-but-not-liked, and a surrender to panic sometimes feels like a conscious choice—which would imply that it is also wanted in the terminology I have been using, rather than subjectively involuntary like pain. Nevertheless, different brain regions might play a part in giving these experiences their properties. Just as the continuum of wavelengths of light is experienced in colors through the interaction of three distinct wavelength receptors in the eye, brain regions that process incentives of different duration may give the different zones their characteristic flavors.

Differential involvement of particular brain regions by type of experience is often reported. The prospect of pleasurable rewards in the relatively near future but not the more distant future excites the medial prefrontal cortex and ventral striatum, for instance (McClure et al., 2004, 2007). Even within the striatum delay sensitivity is not uniform, but can be pinpointed by a graded map with ventroanterior regions tracking more immediate rewards and dorsoposterior regions tracking more delayed ones (Tanaka et al., 2004). Berridge and his collaborators have observed dopaminergic neurons in this area to be active in wanted-but-not-liked behaviors (this issue), while interspersed anatomical “hot spots” of opiate-sensitive neurons respond to “liked” rewards (Peciña et al., 2006). The process of governing behavior by rules seems to depend on the dorsolateral prefrontal cortex and cingulate gyrus (Bechara, 2006). Doubtless the latter structures are active also in unambivalent preferences, but their activity would be suspected in cases of rule-worship.

Surprisingly, the brain centers excited by pain are at least grossly the same as the centers excited by conventional rewards (ventral striatum, sublenticular extended amygdala, ventral tegmentum, and orbital gyrus; Becerra et al., 2001). In all but the striatum the response to pain is in the same direction as that to reward, which “suggests that they may constitute a general circuitry processing both rewarding and aversive information” (ibid., p. 942). Even in the striatum other authors have reported increased activity after both reward and punishment, which contrasts with its decreased activity after the unexpected non-delivery of either (O’Doherty, 2004). Its response to both reward and punishment has been puzzling to authors who are trying to find centers for incentive as it is usually understood. Several have endorsed the suggestion that such activity is in response to “salience” (simple motivational relevance) rather than hedonic valence (e.g. Zink et al., 2003). However, as O’Doherty points out, if the striatum were merely tracking salience its activity should increase for the unexpected omission of reward or punishment as well as their delivery. This area responds specifically to reward or punishment in contrast to nonreward and non-punishment. Hyperbolic discounting theory offers a mechanism that would explain this pattern: The striatum responds to rewardingness, which is a component of the liked, the wanted, and the unwanted-but-given-in-to, but not of their negations.

There are also sites that are differentially active in specific aversive experiences, such as panic and disgust. Not surprisingly in the organ that chooses how an organism will try to survive, many brain sites process diverse kinds of motivational information. But however anatomically separate these sites may be, their activity must be integrated into a single marketplace of choice. There is indeed some evidence that all centers for the rewards that are conventionally used in experiments discount anticipated reward at the same rate, which is correlated with the same subjects’ hyperbolic discount pattern observed behaviorally (Glimcher et al., 2007). Neuroimaging data are still too noisy for a hyperbolic pattern to be directly distinguishable from an exponential one.
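
Why a noisy measurement might fail to separate the two curve shapes, even though they predict different behavior, can be sketched as follows. This is an illustration under assumptions rather than a model of any neuroimaging result: the hyperbolic form 1 / (1 + k × delay), the exponential form exp(−r × delay), the parameter values, the reward sizes and delays, and all function names are hypothetical. The exponential rate is chosen so that both curves fall to one half at a delay of ten steps.

```python
import math

K = 0.1                    # hypothetical hyperbolic discount parameter
R = math.log(2) / 10.0     # exponential rate matched to the hyperbola at delay 10


def hyperbolic(delay, k=K):
    """Fraction of value retained at `delay` under the assumed hyperbolic form."""
    return 1.0 / (1.0 + k * delay)


def exponential(delay, r=R):
    """Fraction of value retained at `delay` under exponential discounting."""
    return math.exp(-r * delay)


def max_gap(max_delay=10):
    """Largest absolute difference between the two curves at integer
    delays from 0 through `max_delay`."""
    return max(abs(hyperbolic(d) - exponential(d))
               for d in range(max_delay + 1))


def prefers_larger_later(discount, added_delay, ss=(5.0, 0.0), ll=(10.0, 15.0)):
    """True if the larger-later reward is valued above the smaller-sooner
    one when both are pushed `added_delay` steps further into the future.

    `ss` and `ll` are (amount, delay) pairs chosen only for illustration.
    """
    ss_value = ss[0] * discount(ss[1] + added_delay)
    ll_value = ll[0] * discount(ll[1] + added_delay)
    return ll_value > ss_value


if __name__ == "__main__":
    print("largest gap between curves:", round(max_gap(), 4))
    for name, curve in (("hyperbolic", hyperbolic), ("exponential", exponential)):
        for added in (0, 30):
            print(name, "added delay", added,
                  "prefers larger-later:", prefers_larger_later(curve, added))
```

With these assumed parameters the two curves never differ by more than a few percent of the undiscounted value over the range sampled, a gap that measurement noise of similar size would obscure. The behavioral prediction nevertheless differs in kind: only the hyperbolic curve yields a reversal of preference as both rewards recede into the future.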

V. Conclusions

Recognition of the hyperbolic discounting of prospective events permits reconciliation of familiar subtleties of choice with a reductionist, bottom-up model of the self. The person emerges as a population of reward-seeking processes that compete for expression on the basis of their payoff times and ability to take each other’s existence into account. Higher processes grow at increasing lengths from choices, somewhat as plants grow taller to capture sunlight before it gets to shorter plants. Higher and lower processes fight, sometimes extremely, but not because any are the opposite of others. This view of motivational conflict may be helpful in the analysis of problems that have defeated more common-sense approaches, such as the attractiveness of pain and negative emotions, the existence of wanted-but-not-liked behaviors, and the pathological side effects of willpower.

Notes

1. This mistranslation is arguably the origin of what is still often supposed to be a response selection process separate from reward, “conditioning” (Dinsmoor, 2004).

2. For instance, Christian theologians have called self-government by universal principles “pagan morality” (Davison, 1888, pp. 156-183).

3. It seems to be a property of psychophysical comparisons generally (Gibbon, 1977), and may indeed even reflect the molecular dynamics of how neurotransmitters saturate the receptors of postsynaptic neurons (Michaelis-Menten equation; Berns et al., 2007).

4. This pattern is typical of abused substances, but also applies to “doses” of purely behavioral thrills, the archetype of which is gambling (Wray & Dickerson, 1981).

5. There may of course be standards by which a choice from a distance is still irrational. In particular, short intense experiences seem to be remembered more than long, less intense experiences that might have the same areas under a plot of their intensity over duration; Kahneman has found that the greatest and the latest values as reported by subjects themselves during an experience are disproportionately effective in the subjects’ choice of whether to repeat the experience (2000). However, as long as this valuation pattern is robust, it will appear irrational only to an outside observer, and it is not clear on what basis this appearance should determine our view of rationality more than the subject’s own stable preference.

6. Even the highly precise distance senses of loudness and brightness have only this dynamic range (Pugh, 1988; Green, 1988). The smaller range of the more visceral senses such as taste and smell would be expected to be closer to the range of effectively different reward strengths (Cain, 1988).

7. When your appetite is strong you may experience these activities as unconflictual pleasures, or as bad habits in the range of preference durations just discussed that require the equivalent self-control measures.

8. The cases of urgency that Elster presents (this issue) are provocative additions to the catalog of perverse motives. I would classify them mostly as impulses rather than itches: they offer gambles on getting genuinely liked outcomes, or options that are hedonically sound, albeit in the short run. The fighter who attacks his enemy too soon and the mountaineer who seeks his lost companion at the expense of reducing the probability of finding him create the prospect, if they are lucky, of immediate payoffs. The lovers genuinely increase their present bliss by creating a prospect of perpetual togetherness. Where urgency comes from a strategic purpose such as blame avoidance it may be simply rational, even when the reason that the avoidance succeeds is not. Where urgency arises because of the expected decay of an emotion, as in the choice of revenge while resentment is hot, it may also make good hedonic sense: The prospect of not having an urge may have less value than the prospect of gratifying it. When I have the urge for a nap I dispel it with coffee only if there is some practical reason to do so; I would rather enjoy the nap.
Elster distinguishes “urgency” from “preference,” but when both are “raw” they seem to differ only in whether you would allow yourself to choose them deliberately. His “urges” are desires, forbidden because they are either irrational or immoral.

9. I have also argued that this instability renders a person’s choice not fully predictable even with a perfect knowledge of her incentives, and even by the person herself (Ainslie, 2001, pp. 129-134). Participation in recursive self-prediction may be the most important way in which “action is experienced as something that the agent instigates, rather than something that just happens to the agent as the result of the state that they were antecedently in” (Holton, this issue). I agree that the (non-deceived) experience of agency is the crux of what people call free will. I agree also that avoiding reconsideration is an effective tactic of impulse control, and is often the “effort of will” of which people are most aware. However, it is a limited tactic, apt to be unstable over time; Loyola’s “negligence in rejecting [a sinful] thought” implies failure of a will that must not itself come from self-blinding but from a systematic, continuous rejection of the thought’s venial pleasures while aware of them. This is William James’ will, in which “both alternatives are steadily held in view, and in the very act of murdering the vanquished possibility the chooser realizes how much in that instant he is making himself lose” (1890, p. 534). The amorous teenager who wants to be sure not to go all the way needs to make some use of will to follow the strategy of avoiding sexual thoughts and play; if she wants to have sex play and still not have intercourse then will is her sole weapon, and accordingly must be stronger.
As Holton points out, defining freedom as efficacy makes responsibility a continuum: “One is not very free if following through is very difficult.” But law and morality need a bright line between what is too difficult and what is not, and the presence or absence of an observable physical cause has often served that purpose (see Monterosso et al., 2005). Holton is right that fatalism has developed soft credentials as such a cause, even though it is useless as a line dividing degrees of difficulty, and even though our increasing ability to identify the physical substrates of the motivation process, which any determinist has always had to believe were there, is rendering the presence of such substrates a decreasingly useful line as well.

References

Adams, R. (1976) Motive utilitarianism. Journal of Philosophy 73, 467-481.

Ainslie, G. (1974) Impulse control in pigeons. Journal of the Experimental Analysis of Behavior, 21, 485-489.

Ainslie, G. (1975) Specious reward: A behavioral theory of impulsiveness and impulse control. Psychological Bulletin 82, 463-496.

Ainslie, G. (2006) Motivation Must Be Momentary. In J. Elster, O. Gjelsvik, A. Hylland, and K. Moene (Eds.), Understanding Choice, Explaining Behaviour (Unipub Forlag).

Ainslie, G. (1992) Picoeconomics: The Strategic Interaction of Successive Motivational States within the Person (Cambridge University).

Ainslie, G. (2001) Breakdown of Will. (Cambridge University).

Ainslie, G. (2010) Hyperbolic discounting versus conditioning and framing as the core process in addictions and other impulses. In D. Ross, H. Kincaid, D. Spurrett, and P. Collins (Eds.), What Is Addiction? (MIT).

Ainslie, G. and Haendel, V. (1983) The motives of the will. In E. Gottheil, K. Druley, T. Skodola, H. Waxman (Eds.), Etiologic Aspects of Alcohol and Drug Abuse, pp. 119-140 (Thomas).

Ainslie, G. and Herrnstein, R. (1981) Preference reversal and delayed reinforcement. Animal Learning and Behavior, 9, 476-482.

Ainslie, G., and Monterosso, J. (2003) Hyperbolic discounting as a factor in addiction: A critical analysis. In R. Vuchinich and N. Heather (Eds.), Choice, Behavioural Economics, and Addiction, pp. 35-62 (Pergamon).

[American Psychiatric Association] (1994) Diagnostic and Statistical Manual of Mental Disorders. Fourth Edition. (APA Press).

Bain, A. (1859/1886) The Emotions and the Will (Appleton).

Becerra, L., Breiter, H. C., Wise, R., Gonzalez, R. G., and Borsook, D. (2001) Reward circuitry activation by noxious thermal stimuli. Neuron, 32, 927-946.

Bechara, A. (2006) Broken willpower: Impaired mechanisms of decision-making and impulse control in substance abusers. In N. Sebanz and W. Prinz (Eds.), Disorders of Volition, pp. 399-418 (MIT).

Berns, G. S., Capra, C. M., and Noussair, C. (2007) Receptor theory and biological constraints on value. Annals of the New York Academy of Sciences, 1104, 301-309.

Berridge, K. C. and Robinson, T. (1998) What is the role of dopamine in reward: Hedonic impact, reward learning, or incentive salience? Brain Research Reviews, 28, 309-369.

Berridge, K. C. (2003) Pleasures of the brain. Brain and Cognition, 52, 106-128.

Berridge, K. C. (2007) The debate over dopamine’s role in reward: The case for incentive salience. Psychopharmacology, 191, 391-431.

Cain, W. S. (1988) Olfaction. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey, and R. D. Luce (Eds.), Stevens’ Handbook of Experimental Psychology, 2d Edition. Vol. 1, pp. 409-459 (Wiley).

[Canadian Medical Association] (1991) Tourette’s syndrome doesn’t keep BC surgeon from operating. Canadian Medical Association Journal 178, 247.

Davison, W. (1888) The Christian Conscience (Woolmer).

Dinsmoor, J. A. (2004) The etymology of basic concepts in the experimental analysis of behavior. Journal of the Experimental Analysis of Behavior, 82, 311-316.

Donahoe, J. W., Burgos, J. E., and Palmer, D. C. (1993) A selectionist approach to reinforcement. Journal of the Experimental Analysis of Behavior, 60, 17-40.

Emmons, D. C. (1973) Act versus rule-utilitarianism. Mind 82, 226-233.

Freud, S. (1895/1956) Project for a Scientific Psychology. In J. Strachey and A. Freud (Eds.), The Standard Edition of the Complete Psychological Works of Sigmund Freud vol. 1 (Hogarth).

Freud, S. (1911) ibid., vol. 12. Formulations on the Two Principles of Mental Functioning.

Freud, S. (1923) ibid., vol. 19. The Ego and the Id.

Gibbon, J. (1977) Scalar expectancy theory and Weber’s law in animal timing. Psychological Review, 84, 279-325.

Grace, R. (1996) Choice between fixed and variable delays to reinforcement in the adjusting-delay procedure and concurrent chains. Journal of Experimental Psychology: Animal Processes, 22, 362-383.

Green, D. M. (1988) Audition: Psychophysics and perception. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey, and R. D. Luce (Eds.), Stevens’ Handbook of Experimental Psychology, 2d Edition. Vol. 1, pp. 327-376 (Wiley).

Green, L., Fisher, E. B. Jr., Perlow, S., and Sherman, L. (1981) Preference reversal and self-control: Choice as a function of reward amount and delay. Behaviour Analysis Letters, 1, 43-51.

Green, L., Fry, A., and Myerson, J. (1994) Discounting of delayed rewards: A life-span comparison. Psychological Science 5, 33-36.

Green, L. and Myerson, J. (2004) A discounting framework for choice with delayed and probabilistic rewards. Psychological Bulletin, 130, 769-792.

Green, L., Myerson, J., & Macaux, E. W. (2005) Temporal discounting when the choice is between two delayed rewards. Journal of Experimental Psychology: Learning, Memory, & Cognition, 31, 1121-1133.

Hare, R. M. (1963) Freedom and Reason. (Oxford).

Hayes, S.C., Kapust, J., Leonard, S.R., and Rosenfarb, I. (1981) Escape from freedom: Choosing not to choose in pigeons. Journal of the Experimental Analysis of Behavior, 36, 1-7.

Hilgard, E. R. and Marquis, D. G. (1940) Conditioning and Learning (Appleton-Century).

James, W. (1890) Principles of Psychology (Holt).

Kahneman, D. (2000) Evaluation by moments: Past and future. In D. Kahneman and A. Tversky (Eds.), Choices, values, and frames, pp. 693-708 (Cambridge University).

Kirby, K. N. (1997) Bidding on the future: Evidence against normative discounting of delayed rewards. Journal of Experimental Psychology: General, 126, 54-70.

Kirby, K.N., and Herrnstein, R. J. (1995) Preference reversals due to myopic discounting of delayed reward. Psychological Science 6, 83-89.

Kirby, K. N., and Guastello, B. (2001) Making choices in anticipation of similar future choices can increase self-control. Journal of Experimental Psychology: Applied 7, 154-164.

Kohlberg, L. (1973) Continuities in childhood and adult moral development revisited. In P. B. Baltes and K. W. Schaie (Eds.), Lifespan Developmental Psychology, pp. 179-204 (Academic).

Laibson, D. (1997) Golden eggs and hyperbolic discounting. Quarterly Journal of Economics, 112, 443-479.

Lerner, J. S., and Tiedens, L. Z. (2006) Portrait of the angry decision maker: How appraisal tendencies shape anger’s influence on cognition. Journal of Behavioral Decision Making, 19, 115-137.

Loewenstein, G. (1996) Out of control: Visceral influences on behavior. Organizational Behavior and Human Decision Processes 65, 272-292.

MacKintosh, N.J. (1983) Conditioning and Associative Learning (Clarendon).

Mataix-Cols, D., Baer, L., Rauch, S. L., Jenike, M. A. (2000) Relation of factor-analyzed symptom dimensions of obsessive-compulsive disorder to personality disorder. Acta Psychiatrica Scandinavica 102, 199-202.

Maunsell, J. H. R. (2004) Neuronal representations of cognitive state: Reward or attention? Trends in Cognitive Sciences, 8, 261-265.

Mazur, J. E. (2001) Hyperbolic value addition and general models of animal choice. Psychological Review, 108, 96-112.

McClure, S. M., Laibson, D. I., Loewenstein, G., and Cohen, J. D. (2004) The grasshopper and the ant: Separate neural systems value immediate and delayed monetary rewards. Science, 306, 503-507.

McClure, S. M., Ericson, K. M., Laibson, D. I., Loewenstein, G., and Cohen, J. D. (2007) Time discounting for primary rewards. The Journal of Neuroscience, 27, 5796-5804.

Montague, P. R. and Berns, G. S. (2002) Neural economics and the biological substrates of valuation. Neuron 36, 265-284.

Mowrer, O. H. (1947) On the dual nature of learning: A re-interpretation of conditioning and problem solving. Harvard Educational Review, 17, 102-148.

O’Doherty, J. P. (2004) Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769-776.

Peciña, S., Smith, K. S., and Berridge, K. C. (2006) Hedonic hot spots in the brain. The Neuroscientist, 12, 500-511.

Pugh, E. N., Jr. (1988) Vision: Physics and retinal physiology. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey, and R. D. Luce (Eds.), Stevens’ Handbook of Experimental Psychology, 2d Edition. Vol. 1, pp. 75-163 (Wiley).

Schultz, W. (2006) Behavioral theories and the neurophysiology of reward. Annual Review of Psychology, 57, 87-115.

Shizgal, P. and Conover, K. (1996) On the neural computation of utility. Current Directions in Psychological Science, 5, 37-43.

Solomon, R. and Wynne, L. (1954) Traumatic avoidance learning: The principles of anxiety conservation and partial irreversibility. Psychological Review 61, 353-385.

Tanaka, S. C., Doya, K., Okada, G., Ueda, K., Okamoto, Y., and Yamawaki, S. (2004) Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nature Neuroscience, 7, 887-893.

Thorndike, E.L. (1905) The Elements of Psychology (Seiler).

Volkow, N. D., Fowler, J. S., and Wang, G. J. (2002) Role of dopamine in drug reinforcement and addiction in humans: Results from imaging studies. Behavioral Pharmacology, 13, 355-366.

Wray, I. and Dickerson, M.G. (1981) Cessation of high frequency gambling and "withdrawal" symptoms. British Journal of Addiction 76, 401-405.

Zink, C. F., Pagnoni, G., Martin, M. E., Dhamala, M., and Berns, G. S. (2003) Human striatal response to salient non-rewarding stimuli. Journal of Neuroscience 23, 8092-8097.