A Selectionist Model of the Ego:
Implications for Self-Control

George Ainslie
Veterans Affairs Medical Center, Coatesville PA, USA
University of Cape Town, South Africa
George.Ainslie@va.gov

Presented at Disorders of Volition,
a conference of
The Max Planck Institute for Psychological Research
Irsee, Germany, December 13, 2003

Published in Disorders of Volition
Natalie Sebanz and Wolfgang Prinz, eds.
MIT Press, 2006

This material is the result of work supported with resources and the use of facilities at the Department of Veterans Affairs Medical Center, Coatesville, PA, USA. The opinions expressed are not those of the Department of Veterans Affairs of the US Government.

Keywords: volition, impulsiveness, compulsiveness, self-control, hyperbolic discounting

Abstract

Parametric experiments on discounting prospective events by animal and human subjects have repeatedly found that a hyperbolic shape (inverse proportionality of value to delay) describes spontaneous choice better than an exponential shape. Three implications of hyperbolic discounting have also been found experimentally: preference reversal toward smaller sooner (SS) rewards as a function of time (impulsiveness), early choice of committing devices to forestall impulsiveness, and decreased impulsiveness when choices are made in whole series rather than singly. Such findings suggest an alternative to the hierarchical model of the self: Behavioral tendencies are selected and shaped by reward in a marketplace of all options that are substitutable for one another. Temporary preferences for SS options like substance abuse and other self-defeating behaviors create a state of limited warfare among successive motivational states. Thus a currently dominant option must include means of forestalling any incompatible options that are likely to become dominant in the future. Consistency of choice is only partially achievable; and the most effective means of achieving it, perception of prisoner’s dilemma-like relationships among successive choices of a similar kind, leads to compulsive side-effects. In this view the ego is not a faculty but an emergent property of the internal marketplace, analogous to Adam Smith’s unseen hand; and the will is a bargaining situation analogous to the “will” of nations.

Behavioral science offers a smorgasbord of principles describing how people make choices (Mellers et al., 1998), but where actual social planning is necessary, as in economics and law, these principles are winnowed down to the refinement of utility theory that was initiated by Samuelson (1937) and has come to be called expected utility theory, or, more generally, rational choice theory (RCT; Boudon, 1996; Korobkin & Ulen, 2000; Sugden, 1991). In this theory a person with enough information and time to assimilate it will arrive at hierarchies of preference that are internally consistent (transitive, commensurable, etc.), that maximize her probability of getting what she prefers, and that do not shift as the perspective of time changes. Lawyers and economists are well aware of evidence from all the behavioral sciences of how people violate RCT. Jolls et al. (1998) summarized these violations as bounded willpower (a failure to follow your own plans), bounded rationality (failure to correctly interpret environmental contingencies) and bounded self-interest (a tendency to invest altruism where it will not bring returns), but the violations have seemed haphazard (Posner, 1998), and RCT offers at least a coherent system. Admired for its “unique attractiveness… [because] we need ask no more questions about it” (Coleman, 1986), it is demonstrably the norm for competitions in marketplaces, where anyone who violates it puts herself at a competitive disadvantage.

The adaptiveness of RCT is so obvious that many authors make the subtle leap from viewing it as a norm to viewing it as descriptive of normal behavior. Its tenets become “assumptions about how people respond to incentives” (Korobkin & Ulen, 2000, p. 1055). Volition is just what passes the incentives through to the responses, and violations are due to either errors in evaluating the incentives or “disorders of volition.” This way of thinking has spread from the policy-making disciplines to individual psychology, where “how people create actions from intentions and desires” and how “they stay on course” are matters of information processing (e.g. Carver & Scheier, 2000, and other authors in Boekaerts et al., 2000). A lower principle such as Plato’s passion or Freud’s id has historically been seen as a competing mechanism of choice, but the lower principle is now seen as mere noise that sometimes obscures the clear signal of RCT. On the contrary, I will argue that the observed deviations from RCT are coherent, that they motivate coherent strategies for dealing with them, and that the competition of these strategies with the deviations that they target generates familiar complexities of choice that RCT has not begun to contemplate.

The Problem of Lower Mental Processes

Outside of RCT, people have always divided mental life into lower and higher processes. Lower processes appear at an early age, are spontaneous and strongly motivated, tend to seek goals that are obviously useful to organisms in evolution, and are often thought of as the animal part of our nature. Higher processes develop later, often seem arbitrary, are less connected with biological need, and are often thought of as transcending our animal nature. They are not refined versions of lower processes, but respond to them and often conflict with them in asymmetrical combats, in which the weapon of the lower processes is superior force and the weapon of the higher processes is superior organization and foresight. Ancient thinkers often held that higher processes should simply replace lower ones, as in the Buddhist and Stoic ideals of escaping from desire, the Zoroastrian end of light replacing darkness, and the Judeo-Christian practice of mortifying the flesh. However, it became evident that the relationship of these processes is not one of good versus evil. As Freud pointed out, “The substitution of the reality principle for the pleasure principle implies no deposing of the pleasure principle, but only a safeguarding of it” (1911, p. 223). Conversely, psychotherapies often attribute patients’ miseries to overgrown higher processes: “cognitive maps” (Gestalt), “conditions of worth” (client-centered), “musturbation” (rational-emotive), and of course the punitive superego (summarized in Corsini, 1984). To be truly higher, a principle must keep lower principles healthy, but it has never been clear how such a relationship works.

Most theories have had what has been called in this conference a top-down approach to the topic. An autonomous faculty (the Vedic tapas, St. Augustine’s temperance, Plato’s reason) imposes logical consistency and stability over time on the lower process. In top-down theories this faculty is not governed by the same determinants as the lower process, which is the slave of reward and, if this is something different, of passion. Perhaps attributing the same determinants would make theorists expect the same results; in any case hypothesized dependence on lawful principles that make “the human person a closed system” is said to reduce people to “powerless victims of mechanism” (Miller, 2003, p. 63). The higher principle is held to be the “you-noun” (ibid.), the ego, that must be and perhaps should be impenetrable.

It is always possible that our higher processes are inexplicable by the interaction of relatively simple mechanisms, that is, by a bottom-up approach; but it is also possible that the right mechanisms simply have not been discerned. Certainly many authors have leapt from the discovery of a new atom of learning or motivation to an encompassing theory in which these atoms are merely multiplied or writ large, making the world into a procrustean Skinner box that fails to fit the subtleties of human experience. However, the science of motivation has finally become a cumulative one, in which the current generation stands on the shoulders of previous generations rather than rediscovering the same phenomena in different frames. I will argue that developments during the last four decades in behavioral research, bargaining theory, and the philosophy of mind permit an explicit explanatory model that comes significantly closer than previous models to fitting the subtleties of human character. In particular, I will show how it improves on a currently dominant atom-writ-large, RCT.

Of the three kinds of deviation from RCT catalogued by Jolls et al., the most attention has been paid to bounded rationality and bounded self-interest. I will not discuss them here. A far more serious problem is bounded willpower, the widespread violation of temporal consistency. People regularly express a preference for one course of action and then take the opposite course when they actually choose. This is sometimes a minor foible, mere fickleness, but it often immerses the person in substance abuse, pathological gambling, destructive rage, and indeed a large part of the psychiatric diagnostic manual (American Psychiatric Association, 1994). An even larger number of “bad habits” never reach the level of diagnosis: smoking, overeating, credit card abuse, rash personal attachments, impatience for pleasant things and procrastination of unpleasant ones. These are all activities that you plan to avoid when you are at a distance from them, and regret after you have done them.

RCT holds, against all intuition, that insight alone should prevent these lapses. Consistent choice implies an exponential discount curve of the value of delayed goals, such that they lose a constant proportion of their remaining value for every additional unit of delay. Financial transactions are universally conducted on the basis of the exponential discount curve, for any curve more bowed than this would cause a good to change its value relative to alternatives simply as it drew closer, an irrational instability. People regularly make their investment choices on the basis of exponential curves, so it makes sense to think that these curves are part of attainable insight. According to RCT, the choice between dessert now and fitness down the road should be reducible to a graph like figure 1a (given that an extended reward like fitness can be represented as an equivalent momentary event; see Mazur, 1986; Ainslie, 1992, pp. 147-152, 375-385).
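The constancy that RCT requires can be checked numerically. The following is a minimal sketch, not part of the original analysis: the amounts, the lag, and the discount rate are arbitrary illustrative assumptions, and the function names are hypothetical. It shows that under exponential discounting the ratio of the LL value to the SS value is the same at every distance from the choice, so no preference reversal can arise from the mere passage of time.

```python
import math


def exponential_value(amount, delay, rate):
    """Exponentially discounted value: amount * exp(-rate * delay)."""
    return amount * math.exp(-rate * delay)


# Illustrative assumptions only; none of these numbers come from the text.
SS_AMOUNT = 5.0   # smaller-sooner reward
LL_AMOUNT = 10.0  # larger-later reward
LAG = 3.0         # LL arrives this many time units after SS
RATE = 0.2        # discount rate per time unit


def value_ratio(delay_to_ss):
    """Ratio of LL value to SS value when SS is `delay_to_ss` units away."""
    ss = exponential_value(SS_AMOUNT, delay_to_ss, RATE)
    ll = exponential_value(LL_AMOUNT, delay_to_ss + LAG, RATE)
    return ll / ss


# The same ratio at every vantage point, near or far.
ratios = [value_ratio(d) for d in (0.0, 1.0, 5.0, 20.0)]
print(ratios)
```

With these assumed numbers the ratio stays above 1 at every delay, so the LL option is preferred throughout; a different rate could put it below 1, but it would then stay below 1 at every delay as well.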

Confronted with the prevalence of temporary preferences, utility theorists have borrowed a mechanism from popular culture, a sudden surge of preference for the less valued alternative when an evocative reminder appears. Spirit possession was popular in more superstitious times, and you can still hear, “the Devil made me do it.” However, since the surge often follows a cue that has been associated with the bad option, psychology has attributed it to classical conditioning: Appetite is assumed to be an unmotivated response transferred from a hardwired stimulus, and its sudden appearance makes the prospective reward from the bad option jump above that of the good option; hence the effect seen in figure 1b.

The problem with this model, aside from serious questions about whether classical conditioning represents a selective principle separate from reward (see below, and Ainslie, 1992, pp. 39-48 and 2001, pp. 19-22), is that most if not all rewards are preceded by predictive cues. Almost all rewards must be “conditioned,” even the rewards that seem to be discounted rationally. Cues merely tell us the likelihood of occurrence and probable delay of whatever rewards they predict. A cue that regularly precedes a reward should become predictable in turn, and if it makes the bad reward more valuable it should soon raise the height of the discount curve for its entire length, causing it to be revalued as a straightforwardly better reward (figure 2a). Thus the conditioning theory of impulses has to assume that you cannot learn the connection between cues and the kind of rewards that get temporarily preferred, or at least that you cannot learn the hedonic implications of this cue/reward pair. The “visceral rewards” that are frequent offenders in impulsive choice (Loewenstein, 1996) must thus stay surprising, and jump out at an unwary person however often she has previously lapsed and chosen the bad reward in the same circumstances (figure 2b).

This would be a somewhat anomalous occurrence, given that animals evaluate the prospect of the same visceral rewards with great accuracy (Herrnstein, 1969), and human addicts often anticipate lapses enough to take precautions against them. The experience of suddenly developing an overwhelming appetite is common, and needs explanation in its own right; but it is not an adequate mechanism for temporary changes of preference in general.

Theoretical Models of the Will

In RCT the person continually maximizes her future prospective reward; higher processes involve only estimating what means will do this (Becker & Murphy, 1988). If we graft unpredictable conditioned appetites and consequent temporary preferences onto this model, we add the task of forestalling these temporary preferences. Most people would say that the tool they use for this task is willpower or some synonym: resolve, intentionality, or the exercise of volition, the topic of this symposium. However, this has not been a robust concept, rather a will-o’-the-wisp, which has eluded definition and study to the point where some authors deny its existence. Part of the problem has been that the term refers to at least three distinct processes: not only (1) the maintenance of long range plans but also (2) the simple initiation of any behavior, the sense in which Ryle found the concept unnecessary (1949/1984), and (3) the integration of specific plans with the whole self, the “ownership” process, the well-described flaws in which led Wegner to call the will illusory (2002; see Ainslie, in press). It is only in the first sense of maintaining long range plans that the concept of willpower is relevant; and there is no generally accepted mechanism for how this happens. Thus the will can serve as an exemplar of the higher processes that are often held to be impenetrable.

Although a mechanism has been lacking, there has been agreement about several properties of willpower. First, gimmicks are excluded. Seeking external means of control, like taking appetite-spoiling drugs, committing your funds to money managers, or joining social groups that will exert pressure, would not be called will. Positive properties were well defined by Victorian psychologists. Willpower was said to:

come into play as "a new force distinct from the impulses primarily engaged" (Sully, 1884, p. 669);
"throw in its strength on the weaker side... to neutralize the preponderance of certain agreeable sensations" (ibid.);
"unite... particular actions... under a common rule," so that "they are viewed as members of a class of actions subserving one comprehensive end" (ibid., p. 631);
be strengthened by repetition (ibid., p. 633);
be exquisitely vulnerable to nonrepetition, so that "every gain on the wrong side undoes the effect of many conquests on the right" (Bain, 1886, p. 440); and
involve no repression or diversion of attention, so that "both alternatives are steadily held in view, and in the very act of murdering the vanquished possibility the chooser realizes how much in that instant he is making himself lose" (James, 1890, vol. 2, p. 534).

Three internal mechanisms have been proposed that are at least roughly compatible with these properties: building “strength,” making “resolute choices,” and deciding according to principle. However, we need to ask each of these hypotheses both whether its mechanism is complete or requires another will-like faculty to guide it, and whether it recruits adequate motivation to govern the decision. If the motivational structure is made up of exponential (consistent) discount curves and conditioned cravings, these models all have problems.

1. Strength. Baumeister and others have proposed an organ of self-control, the main property of which is that, like a muscle, it gets stronger with use in the long run but can be exhausted in the short run (1996; Muraven and Baumeister, 2000). Presumably it adds motivation to what is otherwise the weaker side (figure 3a), pushing it above the temporary surge of motivation (figure 3b). The principal problem with this kind of model is that it has to be guided by some evaluation process outside of motivation, since it has to act counter to the most strongly motivated choice at the time. On what basis does this process choose? What keeps this strength from being co-opted by the bad option? Even granting a homunculus that governs from above, what lets a person’s strength persist in one modality, say, overeating, when it has fallen flat in another such as smoking (figure 3c)? The strength concept merely elevates one of the familiar properties of will into a mechanism in its own right, without grounding it in any robust source of motivation.

2. Resolute Choice. Philosophers of mind favor the idea of “resolute choice” (e.g. McClennen, 1990; Bratman, 1999). When they venture to specify a mechanism it mostly involves not re-examining choices, at least not while you expect the bad choice to be dominant. There have been a number of experiments suggesting how children learn to do this: Mischel and his collaborators sometimes refer to a combination of controlling attention and avoiding emotionally “hot” thoughts as willpower (e.g. Metcalfe & Mischel, 1999), essentially a use of mental blinders (figure 4).

However, I have argued that diverting attention and nipping emotion in the bud are mechanisms distinct from will, and less powerful than will at committing your behavior in advance (Ainslie, 1992, pp. 133-142). The ability to control yourself in such a way that “both alternatives are steadily held in view” requires something more. Metcalfe and Mischel describe a growing interconnectedness of a child’s “cool” processes, which does imply more than just diversion of attention. Mere diversion, after all, is an act of holding your breath, usable, as hypnosis has demonstrated, against very short range urges like panic and the affective component of pain, but not against addictions (McConkey, 1984), the urge for which unavoidably weighs in against your original valuation at some point over the hours or days that the diversion must be maintained. The philosophers, too, sense the need for a more complex mechanism: McClennen refers to “a sense of commitment” to previously made plans (1990, pp. 157-161), which sounds like more than diversion of attention, and Bratman refers to “a planning agent’s concern with how she will see her present decision at plan’s end” (1999, pp. 50-56), which suggests that self-prediction is a factor. They seem to be invoking an additional device, deciding according to principle, which I will now examine.

3. Principle. Since ancient times keeping your attention away from tempting options has been the main folk ingredient of self-control, but a subtler technique is just as venerable: deciding according to principle. Referring to dispositions to choose as "opinions," Aristotle said, "We may also look to the cause of incontinence [akrasia] scientifically in this way: One opinion is universal, the other concerns particulars..." (Nicomachean Ethics 1147a24-28). Deciding according to universals made you more continent. Many authors have repeated this advice (some listed in Ainslie, 2001, pp. 79-81), but mostly without speculating as to how people can maintain their motivation to narrow their range of choice in this way. Simply summing series of exponentially discounted rewards together does nothing per se to change their relative values (figure 5).

However, Howard Rachlin has written extensively about how people come to choose in “molar,” overall patterns rather than making “molecular” decisions, by which he means going case by case (2000). He believes that there comes to be an aesthetic factor in molar choice itself, just as, with learning, a whole symphony comes to be more rewarding than the sum of its parts. Thus a recovering addict might avoid lapses because of the aversiveness of spoiling her pattern of sobriety. In this model the strength or resolve that feels like the active ingredient in willpower is hypothesized to come from a specific mechanism, molar appreciation of an overall pattern, leading to distaste for options that break the pattern. This model has the advantage of specifying the extra motivation to overcome temptations that choosing in categories seems to supply.

However, this aesthetic factor does not seem robust enough; most people would not say that their temptations had become distasteful or irritating, even after they have learned to avoid them. Nevertheless, without this additional motive, there seems to be no way that bundling exponentially discounted options together could be expected to shift the direction of choice.

I have argued that no satisfactory theory of impulsiveness or impulse control can be based on exponential discount curves—that a priori, without data about the actual shape of the curves, there is a need to postulate curves more bowed than exponential ones (Ainslie, 1975, 2001, pp. 117-140). Highly bowed curves can account for both temporary preferences and the motivation to forestall them, as figure 6 demonstrates.

A hyperbolic discounter who faces a choice between smaller-sooner (SS) and larger-later (LL) rewards will evaluate them roughly in proportion to their objective size—their values at zero delay—when both are distant, but value the SS reward disproportionately when it is close (figure 6a). Thus she will have an innate tendency to form temporary preferences for SS rewards, purely as a function of elapsing time. Furthermore, if she makes a whole series of choices at once—for instance a class of choices united by a principle—the slower decline of the curves at long delays will make her aggregate valuation of the LL rewards much higher (figure 6b).
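The reversal in figure 6a can be reproduced with a few lines of arithmetic. This is a minimal sketch under stated assumptions: it takes the hyperbolic form V = A / (1 + kD) with k = 1, and the amounts and lag are arbitrary illustrative values rather than data from any experiment discussed here. The function names are hypothetical.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Hyperbolically discounted value, assuming the form A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


# Illustrative assumptions only.
SS_AMOUNT = 5.0   # smaller-sooner reward
LL_AMOUNT = 10.0  # larger-later reward
LAG = 3.0         # LL arrives this many time units after SS


def preferred(delay_to_ss, k=1.0):
    """Return which option has the higher value at this distance from SS."""
    ss = hyperbolic_value(SS_AMOUNT, delay_to_ss, k)
    ll = hyperbolic_value(LL_AMOUNT, delay_to_ss + LAG, k)
    return "LL" if ll > ss else "SS"


# Walk toward the SS reward from a distance and watch the preference flip.
for delay in (10.0, 3.0, 1.0, 0.0):
    print(delay, preferred(delay))
```

Under these assumptions the chooser prefers LL from a distance and switches to SS once the SS reward is within about two time units, with nothing changing except elapsed time.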

Hyperbolic discount curves are a radical theoretical departure and lead to converse problems with how choice becomes stable, but they are not an outrageous leap. The degree of most psychophysical changes—from one intensity of warmth or brightness or heaviness to another—is experienced proportionately to the original intensity, a relationship expressed by a hyperbolic rather than an exponential curve (Gibbon, 1977). The accepted formula describing how foraging animals make prey and patch choices, Holling’s “disc equation” (1959) is also hyperbolic (see Green & Myerson, 1996). It does not strain our beliefs about nature that amounts of reward might be experienced proportionally to their immediacies.

Empirical Evidence about the Shape of the Discount Curve

Fortunately, the shape of the discount curve can be studied by controlled experiment, with at least four different methods and in both people and nonhuman animals. A large body of such research has occurred in the thirty years since I first proposed the hyperbolic shape (Ainslie, 1974, 1975); this research has found a robust and apparently universal tendency to discount delayed events in a curve more bowed than an exponential curve. Where the method has permitted estimation of the exact shape, the shape that has best fit the data produced by that method has been a hyperbola. I will summarize the findings briefly:

1. Given choices between rewards of varying sizes at varying delays, both human and nonhuman subjects express preferences that by least squares tests fit curves of the form

$$V = \frac{A}{1 + kD},$$

a hyperbola, better than the form

$$V = A e^{-kD},$$

an exponential curve (where V is motivational value, A is amount of reward, D is delay of reward from the moment of choice, and k is a constant expressing impatience; Grace, 1996; Green, Fry & Myerson, 1994; Kirby, 1997; Mazur 2001). It has also been observed that the incentive value of small series of rewards is the sum of hyperbolic discount curves from those rewards (Brunner & Gibbon, 1995; Mazur, 1986; Mitchell & Rosenthal, 2003).

2. Given choices between SS rewards and LL ones available at a constant lag after the SS ones, subjects prefer the LL reward when the delay before both rewards is long, but switch to the SS reward as it becomes imminent, a pattern that would not be seen if the discount curves were exponential (Ainslie & Herrnstein, 1981; Ainslie & Haendel, 1983; Green et al., 1981; Kirby & Herrnstein, 1995). Where anticipatory dread is not a factor (with nonhumans or with minor pains in humans), subjects switch from choosing LL relief from aversive stimuli to SS relief as the availability of the SS relief draws near (Navarick, 1982; Solnick et al., 1980).

3. Given choices between SS rewards and LL ones, nonhuman subjects will sometimes choose an option available in advance that prevents the SS alternative from becoming available (Ainslie, 1974; Hayes et al., 1981). The converse is true of punishments (Deluty et al., 1983). This design has not been run with human subjects, but it has been argued that illiquid savings plans and other choice-reducing devices serve this purpose (Laibson, 1997). Such a pattern is predicted by hyperbolic discount curves, while conventional utility theory holds that a subject has no incentive to reduce her future range of choices (Becker & Murphy, 1988).

4. When a whole series of LL rewards and SS alternatives must be chosen all at once, both human and nonhuman subjects choose the LL rewards more than when each SS vs. LL choice can be made individually. Kirby and Guastello reported that students who faced five weekly choices of a SS amount of money immediately or a LL amount one week later picked the LL amounts substantially more if they had to choose for all five weeks at once than if they chose individually each week (2001). They reported an even greater effect for different amounts of pizza. Ainslie and Monterosso reported that rats made more LL choices when they chose for three trials all at once than when they chose between the same contingencies separately on each trial (2003). The effect of such bundling of choices is predicted by hyperbolic but not exponential curves: As I described above, exponentially discounted prospects do not change their relative values however many are summed together (figure 5); hyperbolically discounted SS rewards, although disproportionately valued as they draw near, lose this differential value insofar as the choices are bundled into series (figure 6).
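The contrast between figures 5 and 6 can be illustrated by summing discounted values over a series of choice pairs. The following is a sketch under loudly stated assumptions: the amounts, lag, spacing between pairs, discount parameters, and function names are all hypothetical, chosen only so that the SS reward is preferred when a single pair is evaluated on its own. It does not model the procedures of the experiments cited above.

```python
import math


def hyperbolic(amount, delay, k=1.0):
    """Hyperbolic discounting, assuming the form A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential(amount, delay, k=0.3):
    """Exponential discounting: A * exp(-k * D)."""
    return amount * math.exp(-k * delay)


def bundle_values(discount, n_pairs, ss=5.0, ll=10.0, lag=3.0,
                  period=10.0, first_delay=1.0):
    """Summed value of n_pairs SS rewards and of n_pairs LL rewards.

    Pair i offers SS at first_delay + i * period and LL `lag` units later.
    All default numbers are illustrative assumptions.
    """
    ss_total = sum(discount(ss, first_delay + i * period)
                   for i in range(n_pairs))
    ll_total = sum(discount(ll, first_delay + i * period + lag)
                   for i in range(n_pairs))
    return ss_total, ll_total


for name, curve in (("hyperbolic", hyperbolic), ("exponential", exponential)):
    for n in (1, 5):
        ss_total, ll_total = bundle_values(curve, n)
        winner = "LL" if ll_total > ss_total else "SS"
        print(name, n, round(ss_total, 3), round(ll_total, 3), winner)
```

With these assumed parameters the hyperbolic chooser prefers SS for a single imminent pair but prefers the LL series when five pairs are chosen at once, whereas the exponential chooser prefers SS in both cases, because the ratio of the two sums is the same as the ratio for a single pair.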

Thus hyperbolic discounting seems to be an elementary property of the reward process. The resulting implication that our choices are intrinsically unstable is obviously disturbing, and requires a fair amount of theoretical re-tooling. Several counter-proposals have attempted to account for temporary preference phenomena as variants of exponential discounting. The simplest possibility is that different kinds of reward are discounted at different rates, so that the prospect of sobriety, say, might be discounted more slowly than that for intoxication. Such an explanation could account for temporary preferences, precommitment, and the effect of summing series of choices, as long as the SS rewards were of a different modality than the LL rewards. However, in all of the above experiments the SS rewards were of the same kind as the LL.

Other proposals have included:

Noise in the valuation process, such that discount curves wobble randomly across one another (Strotz, 1956; Skog, 1999). However, since exponential curves draw further apart as delay decreases (figure 1a), this wobble should create fewer changes of preference, or at least no more, when the SS is near than when it is distant. The opposite is regularly observed.
A step function in which immediate events are valued disproportionately and events at all delays are discounted exponentially (Simon, 1995); the most prominent example is Laibson’s “hyperboloid” discount function (1997). This accounts grossly for the incentive for precommitment; but this function, not seen elsewhere in nature, is contradicted by the smooth curve that describes the available data (Ainslie & Monterosso, 2004).
An exponential discount rate whose exponent itself varies as a function of amount (Green & Myerson, 1993). However, to explain changes of preference as a function of delay, the exponent would have to be determined only by the value at delay zero, the very objection that makes hyperbolic discounting inconvenient for utility-based analysis (Laibson, 1997). Even accepting this convention, Green et al. have found that hyperbolic curves fit the data substantially better than amount-dependent exponential curves (1997).
The summation of separate exponential discount rates for association and valuation (Case, 1997). However, the association component that gives the necessary bowing to the overall curve should affect only new learning, not choice between the familiar alternatives that confronted subjects in most of the above research.

None of these proposals contradict hyperbolic discounting except in the precise fitting of the curve itself, and in this respect, the data for best least squares fit overwhelmingly support the hyperbola.

The finding of evidence for hyperbolic discounting in nonhumans as well as humans is crucial, because social psychology experiments are notoriously vulnerable to unprogrammed incentives, not the least of which is compliance with perceived experimenter demand (Orne, 1973). Phrasing a choice one way or another can reverse the direction of the findings (Tversky & Kahneman, 1981), and subjects are apt to express what they believe to be rational rather than what their spontaneous preference is; thus six-to-ten-year-olds are actually poorer at some kinds of reward-getting tasks than four-year-olds, because they rigidly hold to what they think the right strategy should be (Sonuga-Barke et al., 1989). Furthermore, human subjects learn to compensate for their tendencies to form temporary preferences, and express valuations that have this compensation already factored in; I am still surprised that people reveal hyperbolic preferences for future money to the extent that they do, given its demonstrable irrationality. Of course nonhuman animals have their own behavioral foibles (Breland & Breland, 1961), but we can be sure that these do not include social demand or theoretical notions.

Hyperbolic curves suggest rationales for many phenomena that RCT fails to predict, even with the help of its designated villain, conditioned craving. Hyperbolae can obviously account for reversals of preference as SS rewards become imminently available. At first glance, they do not explain the stimulus-driven quality often reported for these reversals: A switch in preference is often experienced as happening not simply when a reward can be had soon, but when a stimulus induces a “conditioned” surge of appetite for it, much like the surges of emotion that also lead to changes of preference. However, I will argue presently that reward-based hyperbolic curves govern both kinds of surge by the same mechanism that leads to the willpower phenomenon. These curves also repair many other defects of RCT, including but not limited to its inability to account for anomalies of investment (Thaler, 1991), its silence on the value of emotion, and its confusion about the most important occasion for emotion, the vicarious experience of other people. I will discuss willpower and sudden appetite/emotion here, and refer the reader elsewhere for the other topics (Ainslie, 2001, pp. 161-197; 2003).

Will as Intertemporal Bargaining

The most basic consequence of hyperbolic discounting is that we cannot be sure of our own future choices. Neither cognitive theory nor popular imagination has revised the renaissance image of the person as an internal hierarchy, with an ego as king over obedient agents (muscles) and passive support organs (viscera; Tillyard, 1959). At best this image has been modernized to a corporation controlled by a CEO, or an army controlled by a general. By contrast, if our preferences tend to change as one reward and then another get close, we are more like a marketplace in which any plan we make at one moment must be sold to ourselves at future moments if it is to have any chance of succeeding. This, indeed, is what even corporations and armies look like when the motives of the individuals who “serve” in them are examined closely (Brunsson, 1982, chapters 1 and 2; Brennan & Tullock, 1982, p. 226). Memos and orders by leaders have to be supported by a great deal of tacit bargaining in order to motivate followers to follow.

What bargaining within individuals can make a future self obey the plan of the present self? Of course there is sometimes external or physiological commitment, as when the present self takes an appetite-altering medication, makes a promise to a friend, limits the information that will come to future selves, or just starts a behavior that will affect motivation in the immediate future (Ainslie, 2001, pp. 73-78). However, these methods are often unavailable, or too costly or restricting. A more adaptable method is suggested by hyperbolic curves’ property of increasingly favoring LL rewards when they are drawn from whole series of rewards, as demonstrated in the fourth kind of experiment, above. This property may be the basis for what authors from Aristotle to Rachlin have suggested: that self-control increases when you decide according to principle, that is, when you choose whole series of similar options instead of just “particular” or “molecular” cases. But how do you make yourself choose according to principle in the face of individual short-range temptations? To explain this we need to invoke a process that would make no sense for the continual reward maximizers envisioned by RCT: intertemporal bargaining.
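
The bundling property can be sketched numerically. The example below assumes the hyperbolic form value = amount / (1 + k × delay) and, as a further simplifying assumption, that the discounted values of the rewards in a series simply add. All amounts, delays, and spacings are hypothetical.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Hyperbolic discounting (assumed form): value falls as 1 / (1 + k * delay)."""
    return amount / (1.0 + k * delay)

def series_value(amount, first_delay, n_choices, spacing, k=1.0):
    """Summed value of n equally spaced rewards, assuming discounted values add."""
    return sum(
        hyperbolic_value(amount, first_delay + i * spacing, k)
        for i in range(n_choices)
    )

def bundle_prefers_ll(n_choices, delay_to_ss=1.0, ll_lag=3.0, spacing=7.0,
                      ss_amount=50.0, ll_amount=100.0):
    """True if a bundle of n LL rewards outvalues the matching bundle of n SS rewards."""
    ss_total = series_value(ss_amount, delay_to_ss, n_choices, spacing)
    ll_total = series_value(ll_amount, delay_to_ss + ll_lag, n_choices, spacing)
    return ll_total > ss_total

print("single choice favors LL:", bundle_prefers_ll(1))
print("series of ten favors LL:", bundle_prefers_ll(10))
```

Under these assumptions a single choice made close to the SS reward favors SS, but the same choice framed as the first of ten favors LL, because the later pairs in the series are all evaluated from a distance at which the LL curves are higher.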

Future selves partially share the goals of the present self—the LL rewards that it values at a discount—and partially have different goals—the SS rewards that only momentary selves value highly. This defines a relationship of limited warfare (Schelling, 1960, pp. 53-80), the incentives for which, in interpersonal bargaining, form repeated prisoners’ dilemmas (RPDs). Among individuals such dilemmas can be solved by finding clear, albeit often tacit, criteria for what constitutes cooperation or defection, as long as mutual cooperation will benefit each player more than mutual defection will. Within an individual the limited warfare between, say, eating to satiety and staying thin can also be brought to a truce by RPD logic: Classical RPDs cannot occur among successive selves within an individual because a later self can never literally retaliate against an earlier one; however, if your expectation of getting a whole series of LL rewards depends on seeing yourself pick LL rewards in current choices, you have effectively created the outcome matrix of an RPD (Ainslie, 2001, pp. 90-104). If you see yourself violate your diet today you reduce your expectation that your diet will succeed, and tomorrow’s self will have that much less at stake in choosing. Your realistic expectation that tomorrow’s self will violate your diet in turn, and precipitate subsequent violations, in effect constitutes retaliation against today’s defector.

The incentive structure of intertemporal bargaining can replace not only Rachlin’s supplementary reward from love of principle but also faculties like a transcendent self or overriding ego that have long been assumed to be inborn. The process is analogous to the interpersonal bargaining through which small, stable markets come to regulate themselves by “self-enforcing contracts” (Klein & Leffler, 1981)—self-enforcing in that the incentive for cheating in a given transaction is continuously less than the expected gain from continuing mutual trust. By the same logic, an individual has incentives to develop self-enforcing cooperative arrangements with her future selves. Such higher mental functions can develop by trial and error on the basis of the relatively small but stable rewards that foresight attains. A person’s cognitive machinery need not be run by an autonomous part of the person herself, an ego that stands apart from its gears and power trains; the internal factory itself is autonomous, the ultimate bottom-up mechanism that Dennett envisions (this volume).

The contingencies of the intertemporal RPD were illustrated by a demonstration at this conference: I asked the audience to imagine that I was running a game show. I announced that I would go along every row, starting at the front, and give each member a chance to say "cooperate" or "defect." Each time someone said "defect" I would award a euro only to her. Each time someone said "cooperate" I would award ten cents to her and to everyone else in the audience. And I asked that they play this game solely to maximize their individual total score, without worrying about friendship, politeness, the common good, etc. I said that I would stop at an unpredictable point after at least twenty players had played. Like successive motivational states within a person, each successive player had a direct interest in the behavior of each subsequent player; and had to guess her future choices somewhat by noticing the choices already made. If she believed that her move would be the most salient of these choices for the next players right after she made it, she had an incentive to forego a sure euro, but only if she thought that this choice would be both necessary and sufficient to make later players do likewise.
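
The payoff rules of this game can be written out as a short sketch. The function name and the audience size of thirty are hypothetical; the amounts (one euro for a defection, ten cents to everyone for a cooperation) are those announced above, expressed in cents.

```python
def game_show_payoffs(moves, audience_size):
    """Return each audience member's total in cents after the given moves.

    moves[i] is the choice of the i-th player to be called on; members who
    are never called on still collect from other people's cooperations.
    """
    totals = [0] * audience_size
    for player, move in enumerate(moves):
        if move == "cooperate":
            # Ten cents to the cooperator and to everyone else in the audience.
            totals = [t + 10 for t in totals]
        elif move == "defect":
            # One euro to the defector alone.
            totals[player] += 100
        else:
            raise ValueError(f"unknown move: {move!r}")
    return totals

all_cooperate = game_show_payoffs(["cooperate"] * 20, audience_size=30)
all_defect = game_show_payoffs(["defect"] * 20, audience_size=30)
lone_defector = game_show_payoffs(["cooperate"] * 19 + ["defect"], audience_size=30)

print("everyone cooperates:", all_cooperate[0], "cents each")
print("everyone defects:", all_defect[0], "cents for each player")
print("last player defects alone:", lone_defector[19], "cents for her")
```

The sketch shows the dilemma in miniature: any single player does better by defecting, whatever the others do, yet twenty cooperations leave every player with more than twenty defections do.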

In this kind of game, knowing the other players’ thoughts and characters (whether they are greedy, or devious, for instance) will not help you choose, as long as you believe them to be playing to maximize their monetary gains. This is so because the main determinant of their choices will be the pattern of previous members’ play at the moment of these choices. Retaliation for a defection will not occur punitively, since a current player has no reason to reward or punish a player who will not play again, but what amounts to retaliation will happen through the effect of this defection on subsequent players’ estimations of their prospects and their consequent choices. These would seem to be the same considerations that bear on successive motivational states within a person, except that in this interpersonal game the reward for future cooperations is flat (ten cents per cooperation, discounted negligibly), rather than discounted in a hyperbolic curve depending on each reward’s delay.

Perceiving each choice as a test case for the climate of cooperation turns the activity into a positive feedback system—cooperations make further cooperations more likely, and defections make defections more likely. The continuous curve of value is broken into dichotomies, by volitions that either succeed or fail. Proximity to temptation still influences the outcome of choices, but much less so than before choices served as test cases with whole series of expectations riding on them. The interpretation of cases as tests or not, that is, as members or not of this particular RPD, becomes more important in determining whether a temptation is worth resisting. If you ignore your diet on a special day like Thanksgiving, or if a single conspicuous outsider like the only child in the game show audience defects, the next choice-makers will be much less likely to see it as a precedent. The importance of interpretation creates incentive for what Freudians call rationalization, or Sayette calls motivated reasoning (this volume). Making resolutions more explicit forestalls impulsively motivated reasoning and increases their chances of being carried out (Gollwitzer, this volume), but at the risk of compulsive side effects, as we shall see.

The similar incentive structures of interpersonal and intertemporal bargaining might make it seem like a good idea to use the former to study the properties of the latter. In full-blown form, however, this turns out to be a daunting undertaking. John Monterosso, Pamela Toppi Mullen and I have tried out the game show experiment with repeated trials for real money in a roomful of recovering addicts, but it was evident that social pressure was more of a factor than the announced rewards (unpublished data). Practical use of this method would require subjects sitting at thirty or forty separate terminals, enough trials to make them familiar with the logic of choice, and enough payoff to make it worth their time: obvious material for a well-funded internet study. Meanwhile it has been possible to model some of the logic of intertemporal cooperation in a two-person RPD: Subjects at computer terminals given false feedback about their partners’ responses have shown that damage done by defections is greater and more long-lasting than is damage repair following cooperations (Monterosso et al., 2002), the same asymmetry described for lapses of will (Bain, 1886, p. 440).

Experimental analogs are a noisy way to study intertemporal bargaining, but direct experimentation on this recursive, internal process is even less practical. There are suggestive data. For instance, when Kirby and Guastello compared separate and bundled choices in their college subjects they found an intermediate degree of self-control if they suggested to the separate-choice subjects that their current choice might be an indicator of what they would choose on subsequent occasions (2001). However, nothing short of imaging techniques would allow direct observation of the separate steps of recursive choices within individuals, and these techniques are in their infancy. Meanwhile, the most convincing evidence for the dependence of will upon self-observation comes from thought experiments of the kind that have been finely honed by the philosophy of mind (Kavka, 1983; Sorensen, 1992). An example tailored to self-control:

Consider a smoker who is trying to quit, but who craves a cigarette. Suppose that an angel whispers to her that, regardless of whether or not she smokes the desired cigarette, she is destined to smoke a pack a day from tomorrow on. Given this certainty, she would have no incentive to turn down the cigarette— the effort would seem pointless. What if the angel whispers instead that she is destined never to smoke again after today, regardless of her current choice? Here, too, there seems to be little incentive to turn down the cigarette—it would be harmless. Fixing future smoking choices in either direction (or anywhere in between) evidently makes smoking the dominant current choice. Only if future smoking is in doubt does a current abstention seem worth the effort. But the importance of her current choice cannot come from any physical consequences for future choices; hence the conclusion that it matters as a precedent. (Monterosso & Ainslie, 1999)
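
The logic of this thought experiment can be restated as a toy calculation. The function and all of its numbers are hypothetical: the pleasure of the cigarette and the benefit of a smoke-free future are placed on an arbitrary common scale, and the only thing allowed to vary is how strongly the current choice shifts the smoker's expectation of her own future abstinence.

```python
def advantage_of_abstaining(p_future_if_abstain, p_future_if_smoke,
                            pleasure_now=1.0, future_benefit=10.0):
    """Expected value of abstaining now minus expected value of smoking now.

    p_future_if_abstain / p_future_if_smoke: the smoker's expectation of
    abstaining in the future, given each current choice (illustrative values).
    """
    abstain = p_future_if_abstain * future_benefit
    smoke = pleasure_now + p_future_if_smoke * future_benefit
    return abstain - smoke

# Angel's first message: future smoking is certain whatever she does now.
print(advantage_of_abstaining(0.0, 0.0))
# Angel's second message: future abstinence is certain whatever she does now.
print(advantage_of_abstaining(1.0, 1.0))
# No angel: the current choice is read as a precedent and shifts her expectation.
print(advantage_of_abstaining(0.8, 0.3))
```

Whenever the two expectations are equal, smoking now comes out ahead by exactly the pleasure of the cigarette; abstaining pays only when the current choice changes what she expects of herself.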

Recursive Self-Prediction in Will and “Conditioned Craving”

Sometimes resolutions are deliberate, and people monitor cooperation systematically. However, less deliberate resolutions that still depend on recursive self-observations are apt to be more widespread. We intend to donate blood or dive into a cold lake, and notice no loss besides a certain uneasiness if we do not; but if we do not, it will be harder to intend similar acts the next time. Resolutions and intentions shade into the kind of self-predictions that merely forecast the immediate future, are made according to no principle, and may well occur in nonhuman animals. On one end of the scale, Russell’s example of fending off seasickness involves effort:

I suspect that I may be getting seasick so I follow someone’s advice to “keep your eyes on the horizon...” The effort to look at the horizon will fail if it amounts to a token made in a spirit of desperation... I must look at it in the way one would for reasons other than those of getting over nausea... not with the despair of “I must look at the horizon or else I shall be sick!” To become well I must pretend I am well (1978, pp. 27-28).

But this example is continuous with the more spontaneous James-Lange phenomena on the other end that were described in the nineteenth century, actually first by Darwin:

The free expression by outward signs of an emotion intensifies it. On the other hand, the repression, as far as this is possible, of all outward signs softens our emotions. He who gives way to violent gestures will increase his rage; he who does not control the signs of fear will experience fear in greater degree (1872/1979, p. 366).

Anxiously hovering over your own performance is common in behaviors that you recognize to be only marginally under voluntary control: summoning the courage to perform in public (versus what comedians call “flopsweat”) or face the enemy in battle, recalling an elusive memory, sustaining a penile erection, or, for men with enlarged prostates, voiding their bladders. To seem to be succeeding increases the likelihood of actual success. I suspect that it was not just to account for fate, but to describe the tenuous process of succeeding in just such behaviors, that ancient polytheists discerned the sometime interventions of such gods as Mars, Venus, and Aesculapius. Will in the sense of willpower is a refinement of this recursive self-prediction; its targets are behaviors that are more controllable than the above examples in the short run, but that become unreliable when they must be sustained over long periods. People pray to gods for success against temptations, too.

Neuroimaging technology may soon be able to add information about the components of will, perhaps even including intertemporal bargaining. As Bechara discusses (this volume), parts of the prefrontal cortex are clearly involved in foresight and can influence activity in reward centers (see also Davidson et al., 2000; Rolls, 1999, pp. 124-144). It is too early to tell whether the prefrontal area competes against “lower” areas, as a person would wrestle with a bear (McClure et al., 2004), or whether it exploits them strategically from a position of relative weakness, as a person would ride a horse (in Paul MacLean’s classic image; Ainslie & Monterosso, 2004), or arranges “to set affection against affection and to master one by another: even as we use to hunt beast with beast” (Francis Bacon, quoted in Hirschman, 1977, p. 22). The latter models have the advantage of being expandable to multiple ranges of impulsiveness and control, as when drinking alcohol may be a way of controlling an urge to panic, while at the same time representing an addiction that invites controls (Ainslie, 1992, pp. 119-122). The multiple reciprocal connections among the centers involved in appetite/emotion and foresight suggest that at least the process of choice can be recursive (Lewis, in press), a necessary feature of the model of will that I have presented.

Sudden craving. Recursive self-prediction is a likely mechanism for the frequent suddenness of emotions and “conditioned” craving, processes that arise from the appearance of a stimulus associated with a reward, rather than simple proximity to the reward. In the reward-based view I am presenting, craving is an example of appetite, a goal-directed preparatory behavior that increases the effect of relevant rewards, given adequate biological need (or deprivation, or “drive”; Ainslie, 2001, pp. 67-69). That is, an appetite is rewarded by the object consumed just as consumption behavior is; but while consumption itself is a muscle behavior subject to being willed, appetite is one of the many processes, including the direction of thought itself, that occur too rapidly to be controlled by will. Furthermore, since the role of appetite is to increase the effect of a reward, its occurrence also makes choice of the reward more likely. If we intend not to eat dessert, or not to take a drug, we evaluate our future choices without the relevant appetite. The sudden appearance of a dessert cart, or drug works, gives our appetite an occasion to see if it can overturn our intention. Generating an appetite does not take much energy; unless our resolve is such that there is really no chance that we will choose consumption, a trial of appetite may be worth the effort. It is like having a pet that eats when we eat, and will beg under circumstances where we have even occasionally eaten in the past.

“Begging” (an increase in appetite) raises our anticipated reward for consumption, which increases the odds that we will choose it; but this further encourages our appetite, which increases our anticipated reward for consumption. It looks like a Darwin-James-Lange positive feedback cycle. If we never consume the reward in a particular circumstance we do not generate appetite there, just as orthodox Jews are said not to crave cigarettes on the Sabbath (Schachter et al., 1977). At the opposite pole, if we accept that we usually consume the reward in this circumstance, we will develop appetite smoothly as the rewarding event gets closer, and the full effect of reward with appetite will be discounted in a simple hyperbolic curve. But between these extremes, if we intend, without certainty, not to consume the reward, we will be prone to sudden increases in appetite that may change the preference that was based on our previous anticipation (figure 7). The notorious dessert cart phenomenon occurs only in people who intend weakly not to have dessert. And if we add to our resolve and stop ever consuming the reward in this circumstance, it will still take many, many repetitions for our trials of appetite to extinguish there. In this model, emotions are appetites that lack a necessary object of consumption, but are rewarding in their own right (Ainslie, 2001, pp. 65-67, 164-171).

Negative appetites. The question naturally arises as to how this model of appetites/emotions as behaviors will work in the converse situation where there seem to be negative appetites. That is, there is a readiness to have anger, fear, and grief as well as the experiences that we actively seek to have; it is counterintuitive that the experiences that seemingly have to be imposed by conditioning are actually chosen for their rewardingness. The difficulty comes partly from a linguistic tendency to equate “reward,” that which selects for the choices it follows, and “pleasure,” that which is subjectively desirable. Ample evidence that organisms often engage in activities that are not pleasurable, but for which there is no apparent incentive, has led Berridge to distinguish “liking,” finding pleasure in, from “wanting,” having a “nonhedonic,” perhaps conditioned, tendency to choose an unliked activity (2003). His exemplar is the strong tendency of both patients and nonhumans with indwelling electrodes to self-stimulate in certain brain centers while evincing scant pleasure and even irritation, but he lists many other examples as well. However, since modern research on conditioning has shown it to be “not the shifting of a response from one stimulus to another [but] the learning of relations among events” (Rescorla, 1988), the hypothesis that actions like pressing for self-stimulation are nonhedonic leaves them without a principle of selection. An action that is “wanted” really has to have hedonic value, that is, has to trade in the marketplace of goal-directed processes (be rewarded), whether or not it is “liked.” The distinction between pleasure and the kind of nonpleasurable urge that could produce tics, nailbiting, and negative emotions/appetites can permit a quantum leap in the parsimony of motivational theory (Ainslie, 1992, pp. 244-249), but only if it can itself be explained.

Hyperbolic discounting again comes to the rescue. It predicts how activities that are subjectively aversive and are avoided from a distance become almost irresistible at very close range, the experience described as vividness or urgency (Ainslie, 2001, pp. 48-70). Briefly, aversions may be rapid cycles of short, intense reward and relatively longer suppression of reward (the same pattern as recurrent binges followed by hangovers or, more rapidly, repeatedly scratching an itch and being distracted from better activities) but condensed into so short a period that the rewarding and unrewarding components fuse in perception. This model makes it possible to see unconditioned stimuli as selecting for the behaviors they follow in exactly the same way as acknowledged pleasures do. Indeed, a major advantage of hyperbolic discounting theory is that it makes a separate selective principle based on conditioning unnecessary, even to account for the participation of apparently unwilling subjects in aversive or otherwise undesirable experiences. Even when craving or emotions are unwelcome, they can be seen as arising only insofar as they are rewarding in the very short run.

The brain site(s) that reward in the very short run could well be different from sites that subtend pleasure. Berridge implicates the lateral hypothalamus (2003), but the amygdala activity reported by Bechara (this volume) to accompany “primary inducers” would also be a good candidate. He describes primary inducers as “innate or learned stimuli that cause pleasurable or aversive states.” Certainly both positive and negative emotional imagery has been found to elicit amygdala activity (Hamann & Mao, 2002), and an intact amygdala is necessary to a core process common to initiating both positive and negative emotions, although the exact nature of this process is unclear (Berridge, 1999). Bechara describes the “somatic states” (emotions) occasioned by primary inducers as obligatory, but also as subject to selection such that “stronger ones gain selective advantage over weaker ones,” and that they are modified by reflective processes. It seems that even primary inducers have to trade in some kind of marketplace; I am suggesting that instead of being conditioned reflexes they are selected by very short term reward, and are thus experienced as difficult, though not always impossible, to resist.

Implications for Well-Being

What the theories of choice that have evolved into RCT describe is the general application of what is actually one particular solution people have found to intertemporal inconsistency. Intertemporal bargaining is especially suited to the needs of long term planning and the conditions of competitive interpersonal markets, but it has a cost. Because of the inescapable ambiguities in these bargains a person is fated to achieve “rationality” only imperfectly. Furthermore, insofar as these bargains are all that protect her from her own nature, red in tooth and claw, she is their prisoner. I have argued elsewhere that extensive or unskilled reliance on the perception of RPDs for self-control will motivate the development of four side effects (Ainslie, 2001, pp. 143-160):

1. When an option is worth more as a test case than as an event in its own right you are less able to experience it in the here-and-now and your choice-making becomes lawyerly.
2. A lapse that you see as a precedent reduces your hope for self-control in similar situations in the future, a reduction that recursively reduces your power of self-control in those situations. This explains why a successful dieter may be “helpless” against smoking, and how other encapsulated symptoms persist.
3. The incentive not to recognize a lapse may lead to gaps in your awareness of your own behavior, a process that creates a motivated unconscious à la Freud.
4. Explicit criteria for defining lapses will tend to replace subtle ones, so that what might be your richest plans get replaced by the most enforceable ones.

Clinically, these side effects manifest themselves as compulsive symptoms, in the extreme as obsessive-compulsive personality disorder (Pfohl & Blum, 1991). When particular kinds of test cases assume exceptional importance they may produce modality-specific syndromes like anorexia nervosa (Gillberg & Rastam, 1992), or narrow character traits like miserliness. Thus if rationality is maximizing experienced reward over time, strengthening volition by making extensive intertemporal bargains may be rational only up to a point. The limitations of intertemporal bargaining are analogous to the social problems that arise where society uses laws to control interpersonal bargaining (Sunstein, 1995, pp. 991-996).

Conclusions

I have proposed that volition (willpower) involves a pattern of recursive self-prediction that extends an organism’s basic ability to use its own current behaviors as cues. This extended ability would not be important if people evaluated choices with the exponential discount curves that are intrinsic to rational choice theory; it becomes crucial in the limited warfare engendered by hyperbolic discount curves. Recursive self-prediction can account for both the recruitment of willpower when you see current choices as test cases and the sudden evaporation of willpower when there is a discrete occasion for a weakly opposed appetite.

This approach provides a bottom-up rationale for the growth and selection of higher functions. Higher literally means more farsighted, for they will be selected according to how well they can anticipate and influence future urges. They do not depend upon an independent organ of reason. Rather they are selected by long range reward itself, an invisible hand like that of Adam Smith’s marketplace. However, higher does not necessarily mean wiser, since they are prone, like agents in interpersonal marketplaces, to fall into overly rigid patterns through the demands of the bargaining situation itself. Although these emergent higher functions are necessary for achieving the reward-seeking priorities that are defined by rational choice theory, they can only approximate what we would call rational.

Acknowledgments

I thank Lynne Debiak for artwork and John Monterosso and the editors and referees for comments.

Notes

1. Departing from the long Stoic tradition, authors are beginning to equate “emotion” with its cognate, “motivation,” e.g., “It is useful to consider under the umbrella of emotion those neural processes by which an animal judges and represents the value of something in the world, and responds accordingly” (Cardinal et al., 2002, p. 332), or “emotional processes must also always involve an aspect of affect, the psychological quality of being good or bad” (Berridge, 2003, p. 106).

2. Much of bounded rationality seems to arise from pure cognitive error (Kahneman & Tversky, 2000). However, some reported examples probably arise from strategic motives, either serving self-control (as when people pay a premium to keep money in an illiquid account; Harris & Laibson, 2001) or evading it (for instance if the sunk cost fallacy evades a personal rule for recognizing loss; Ainslie, 1992, pp. 291-293). The strategic approach presented here also provides a rationale for vicarious experience as a primary good, which can explain the apparent boundedness of self-interest (Ainslie, 1995, 2001, pp. 179-186).

3. In actual play subjects often sacrifice their ostensible interests to punish others (Thaler, 1988), but in the intertemporal game being modeled the programmed contingencies encompass all incentives.

4. Preliminary evidence suggests that the medial prefrontal (orbitofrontal) cortex “establishes a motivational value based on estimation of potential reward” (London et al., 2000), but is implicated in either temptation or longer range planning, depending on the method of observation (Davidson et al., 2000; McClure et al., 2004; Rolls, 1999, pp. 124-144; Volkow & Fowler, 2000).

5. Other models could also do this, as long as they somehow posited the ability of aversive events to attract attention in a competitive internal marketplace.

6. This disorder is not the same entity as obsessive-compulsive disorder (without the “personality”), which involves rapidly recurring dysphoric urges to wash, check things, or perform other small behaviors, and is associated with low brain serotonin (Thomsen & Mikkelsen, 1994).

References

Ainslie, G. (1974) Impulse control in pigeons. Journal of the Experimental Analysis of Behavior 21, 485-489.

Ainslie, G. (1975) Specious reward: A behavioral theory of impulsiveness and impulse control. Psychological Bulletin 82, 463-496.

Ainslie, G. (1992) Picoeconomics: The Strategic Interaction of Successive Motivational States within the Person. Cambridge U.

Ainslie, G. (1995) A utility-maximizing mechanism for vicarious reward. Rationality and Society 7, 393-403.

Ainslie, G. (2001) Breakdown of Will. Cambridge U.

Ainslie, G. (2003) Uncertainty as wealth. Behavioural Processes 64, 369-385.

Ainslie, G. (in press) The self is virtual, the will is not illusory. Behavioral and Brain Sciences.

Ainslie, G. and Haendel, V. (1983) The motives of the will. In E. Gottheil, K. Druley, T. Skodola, H. Waxman (Eds.), Etiologic Aspects of Alcohol and Drug Abuse. Charles C. Thomas, pp. 119-140.

Ainslie, G. and Herrnstein, R. (1981) Preference reversal and delayed reinforcement. Animal Learning and Behavior 9, 476-482.

Ainslie, G. and Monterosso, J. (2003) Building blocks of self-control: Increased tolerance for delay with bundled rewards. Journal of the Experimental Analysis of Behavior 79, 83-94.

Ainslie, G. and Monterosso, J. (2004) Towards a marketplace in the brain. Science 305 (5695).

American Psychiatric Association (1994) Diagnostic and Statistical Manual of Mental Disorders. Fourth Edition. APA Press.

Bain, A. (1859/1886) The Emotions and the Will. Appleton.

Baumeister, R. F. and Heatherton, T. (1996) Self-regulation failure: An overview. Psychological Inquiry 7, 1-15.

Bechara [Reference from this volume]

Becker, G. and Murphy, K. (1988) A theory of rational addiction. Journal of Political Economy 96, 675-700.

Berridge, K. C. (1999) Pleasure, pain, desire, and dread: Hidden core processes of emotion. In Kahneman, D., Diener, E. and Schwartz, N, Eds., Well-Being: The Foundations of Hedonic Psychology. Sage.

Berridge, K. C. (2003) Pleasures of the brain. Brain and Cognition 52, 106-128.

Boekaerts, M., Pintrich, P. R. and Zeidner, M. (2000) Handbook of Self-Regulation. Academic.

Boudon, R. (1996) The “rational choice model:” A particular case of the “cognitive model.” Rationality and Society 8, 123-150.

Bratman, M. E. (1999) Faces of Intention: Selected Essays on Intention and Agency. Cambridge U.

Breland, K. and Breland, M. (1961) The misbehavior of organisms. American Psychologist 16, 681-684.

Brennan, G. and Tullock, G. (1982) An economic theory of military tactics: Methodological individualism at war. Journal of Economic Behavior and Organization 3, 225-242.

Brunner, D. and Gibbon, J. (1995) Value of food aggregates: parallel versus serial discounting. Animal Behavior 50, 1627-1634.

Brunsson, N. (1982) The Irrational Organization. Stockholm School of Economics.

Cardinal, R. N., Parkinson, J. A., Hall, J., and Everitt, B. J. (2002) Emotion and motivation: the role of the amygdala, ventral striatum, and prefrontal cortex. Neuroscience and Biobehavioral Reviews 26, 321-352.

Carver, C. S. and Scheier, M. F. (2000) On the structure of behavioral self-regulation. In Monique Boekaerts, Paul R. Pintrich, and Moshe Zeidner, Eds., Handbook of Self-Regulation. Academic, pp. 41-84.

Case, D. A. (1997) Why the delay-of-reinforcement gradient is hyperbolic. Paper presented at the 20th Annual Conference of the Society for the Quantitative Analyses of Behavior. Chicago, May 22. www.SQAB.psychology.org/abstracts-1997.

Coleman, J. (1986) Individual Interests and Collective Action: Selected Essays. Cambridge U.

Corsini, R. J. (1984) Current Psychotherapies. Third Edition. Peacock.

Davidson, R. J., Putnam, K. M. and Larson, C. L. (2000) Dysfunction in the neural circuitry of emotional regulation: A possible prelude to violence. Science 289, 591-594.

Darwin, C. (1872/1979) The Expression of the Emotions in Man and Animals. Julian Friedman.

Deluty, M.Z., Whitehouse, W.G., Mellitz, M., and Hineline, P.N. (1983) Self-control and commitment involving aversive events. Behavior Analysis Letters 3, 213-219.

Dennett [Reference from this volume]

Freud, S. (1911/1956) Formulations on the Two Principles of Mental Functioning. In J. Strachey and A. Freud (Eds.), The Standard Edition of the Complete Psychological Works of Sigmund Freud. Hogarth, vol. 12.

Gibbon, J. (1977) Scalar expectancy theory and Weber’s law in animal timing. Psychological Review 84, 279-325.

Gillberg, C. and Rastam, M. (1992) Do some cases of anorexia nervosa reflect underlying autistic-like conditions? Behavioural Neurology 5, 27-32.

Gollwitzer [Reference from this volume]

Grace, R. (1996) Choice between fixed and variable delays to reinforcement in the adjusting-delay procedure and concurrent chains. Journal of Experimental Psychology: Animal Behavior Processes 22, 362-383.

Green, L., Fisher, E.B., Jr., Perlow, S. and Sherman, L. (1981) Preference reversal and self-control: Choice as a function of reward amount and delay. Behaviour Analysis Letters, 43-51.

Green, L., Fry, A., and Myerson, J. (1994) Discounting of delayed rewards: A life-span comparison. Psychological Science 5, 33-36.

Green, L., and Myerson, J. (1993) Alternative frameworks for the analysis of self-control. Behavior and Philosophy 21, 37-47.

Green, L., and Myerson, J. (1996) Exponential versus hyperbolic discounting of delayed outcomes: Risk and waiting time. American Zoologist 36, 496-505.

Green, L., Myerson, J., and McFadden, E. (1997) Rate of temporal discounting decreases with amount of reward. Memory and Cognition 25, 715-723.

Harris, C. and Laibson, D. (2001) Dynamic choices of hyperbolic consumers. Econometrica 69, 535-597.

Hamann, S. and Mao, H. (2002) Positive and negative emotional verbal stimuli elicit activity in the left amygdala. NeuroReport 13, 15-19.

Hayes, S.C., Kapust, J., Leonard, S.R., and Rosenfarb, I. (1981) Escape from freedom: Choosing not to choose in pigeons. Journal of the Experimental Analysis of Behavior 36, 1-7.

Herrnstein, R. J. (1969) Method and theory in the study of avoidance. Psychological Review 76, 49-69.

Hirschman, A. (1977) The Passions and the Interests. Princeton U.

Holling, C. S. (1959) Some characteristics of simple types of predation and parasitism. Canadian Journal of Entomology 91, 385-398.

James, W. (1890) Principles of Psychology. Holt.

Jolls, C., Sunstein, C. R., and Thaler, R. (1998) A behavioral approach to law and economics. Stanford Law Review 50, 1471-1550.

Kahneman, D., and Tversky, A. (Eds) (2000) Choices, values, and frames. Cambridge U.

Kavka, G. (1983) The toxin puzzle. Analysis 43, 33-36.

Kirby, K. N. (1997) Bidding on the future: Evidence against normative discounting of delayed rewards. Journal of Experimental Psychology: General 126, 54-70.

Kirby, K. N., and Guastello, B. (2001) Making choices in anticipation of similar future choices can increase self-control. Journal of Experimental Psychology: Applied 7, 154-164.

Kirby, K. N. and Herrnstein, R. J. (1995) Preference reversals due to myopic discounting of delayed reward. Psychological Science 6, 83-89.

Klein, B. and Leffler, K.B. (1981) The role of market forces in assuring contractual performance. Journal of Political Economy 89, 615-640.

Korobkin, R. and Ulen, T. S. (2000) Law and behavioral science: Removing the rationality assumption from law and economics. California Law Review 88, 1051-1144.

Laibson, D. (1997) Golden eggs and hyperbolic discounting. Quarterly Journal of Economics 112, 443-479.

Lewis, M. D. (in press) Bridging emotion theory and neurobiology through dynamic systems modeling. Behavioral and Brain Sciences.

Loewenstein, G. (1996) Out of control: Visceral influences on behavior. Organizational Behavior and Human Decision Processes 65, 272-292.

London, E.D., Ernst, M., Grant, S., Bonson, K., and Weinstein, A. (2000) Orbitofrontal cortex and human drug abuse: Functional imaging. Cerebral Cortex 10, 334-342.

McClennen, E. F. (1990) Rationality and Dynamic Choice. Cambridge U.

McClure, S. M., Laibson, D. I., Loewenstein, G., and Cohen, J.D. (2004) The grasshopper and the ant: Separate neural systems value immediate and delayed monetary rewards. Science 305 (5695).

McConkey, K. M. (1984) Clinical hypnosis: Differential impact on volitional and nonvolitional disorders. Canadian Psychology 25, 79-83.

Mazur, J.E. (1986) Choice between single and multiple delayed reinforcers. Journal of the Experimental Analysis of Behavior 46, 67-77.

Mazur, J. E. (2001) Hyperbolic value addition and general models of animal choice. Psychological Review 108, 96-112.

Mellers, B. A., Schwartz, A., and Cooke, A. D. J. (1998) Judgment and decision making. Annual Review of Psychology, 49, 447-477.

Metcalfe, J. and Mischel, W. (1999) A hot/cool-system analysis of delay of gratification: Dynamics of willpower. Psychological Review 106, 3-19.

Miller, W. R. (2003) Comments on Ainslie and Monterosso. In R. Vuchinich and N. Heather, Eds., Choice, Behavioural Economics, and Addiction. Pergamon, pp. 62-66.

Mitchell, S. H. and Rosenthal, A. J. (2003) Effects of multiple delayed rewards on delay discounting in an adjusting amount procedure. Behavioural Processes 64, 273-286.

Monterosso, J. and Ainslie, G. (1999) Beyond discounting: Possible experimental models of impulse control. Psychopharmacology 146, 339-347.

Monterosso, J. R., Ainslie, G., Toppi Mullen, P., and Gault, B. (2002) The fragility of cooperation: A false feedback study of a sequential iterated prisoner's dilemma. Journal of Economic Psychology 23(4), 437-448.

Muraven, M. and Baumeister, R. (2000) Self-regulation and depletion of limited resources: Does self-control resemble a muscle? Psychological Bulletin 126, 247-259.

Navarick, D.J. (1982) Negative reinforcement and choice in humans. Learning and Motivation 13, 361-377.

Orne, M. T. (1973) Communication by the total experimental situation: Why it is important, how it is evaluated, and its significance for the ecological validity of findings. In P. Pliner, L. Krames, et al., Eds., Communication and Affect: Language and Thought. Academic.

Pfohl, B. and Blum, N. S. (1991) Obsessive-compulsive personality disorder: A review of available data and recommendations for DSM-IV. Journal of Personality Disorders 5, 363-375.

Posner, R. (1998) Rational choice, behavioral economics, and the law. Stanford Law Review 50, 1555-1556.

Rachlin, H. (2000) The Science of Self-Control. Harvard U.

Rescorla, R. A. (1988) Pavlovian conditioning: It’s not what you think it is. American Psychologist 43, 151-160.

Rolls, E. T. (1999) The Brain and Emotion. Oxford U.

Russell, J.M. (1978) Saying, feeling, and self-deception. Behaviorism 6, 27-43.

Ryle, G. (1949/1984) The Concept of Mind. U. Chicago.

Samuelson, P.A. (1937) A note on measurement of utility. Review of Economic Studies 4, 155-161.

Sayette [Reference from this volume]

Schachter, S., Silverstein, B. and Perlick, D. (1977) Psychological and pharmacological explanations of smoking under stress. Journal of Experimental Psychology: General 106, 31-40.

Schelling, T. C. (1960) The Strategy of Conflict. Harvard U.

Simon, J. L. (1995) Interpersonal allocation continuous with intertemporal allocation: Binding commitments, pledges, and bequests. Rationality and Society 7, 367-430.

Skog, O.-J. (1999) Rationality, irrationality, and addiction. In J. Elster and O.-J. Skog (Eds.) Getting Hooked: Rationality and Addiction. Cambridge U.

Solnick, J., Kannenberg, C., Eckerman, D. and Waller, M. (1980) An experimental analysis of impulsivity and impulse control in humans. Learning and Motivation 11, 61-77.

Sonuga-Barke, E. J., Lea, S. E., and Webley, P. (1989) The development of adaptive choice in a self-control paradigm. Journal of the Experimental Analysis of Behavior 51, 77-85.

Sorensen, R. A. (1992) Thought Experiments. Oxford.

Strotz, R.H. (1956) Myopia and inconsistency in dynamic utility maximization. Review of Economic Studies 23, 166-180.

Sugden, R. (1991) Rational choice: a survey of contributions from economics and philosophy. Economic Journal 101, 751-785.

Sully, J. (1884) Outlines of psychology. Appleton.

Sunstein, C. R. (1995) Problems with rules. California Law Review 83, 953-1030.

Thaler, R. (1991) Quasi Rational Economics. Russell Sage.

Thomsen, P. H., and Mikkelsen, H. U. (1994) Development of personality disorders in children and adolescents with obsessive-compulsive disorder: A 6 to 22 year follow-up study. Acta Psychiatrica Scandinavica 87, 456-462.

Tillyard, E. M. (1959) The Elizabethan World Picture.

Tversky, A. and Kahneman, D. (1981) Framing decisions and the psychology of choice. Science 211, 453-458.

Volkow, N. D. and Fowler, J. S. (2000) Addiction, a disease of compulsion and drive: Involvement of the orbitofrontal cortex.