Final Assignment (PSYC 365)
The assignment instructions and the eBook PDF, which is required along with the other chosen references, are attached.
PSYC 365
Research Paper: Final Submission Assignment Instructions
Overview
Now that you have thoroughly investigated and gathered research through the previous two assignments, you are ready to write a research paper addressing Classical and Instrumental (Operant) Conditioning Theories. This culminating assignment will provide a research foundation for your work in the field as a professional. Understanding foundational theories of learning is imperative to providing the best opportunities for learning in various settings. Completion of this assignment will improve research and communication skills while increasing your knowledge of these theories.
Instructions
Create a formal paper in current APA format using the following specifications and outline.
· The body of the paper must be a minimum of 8 pages (this does not include the title page or reference page).
· Current APA style is required.
· A minimum of 10 scholarly, peer-reviewed journal articles addressing Classical and/or Instrumental (Operant) Conditioning, published within the most recent 5 years, as well as the textbook and the Bible, must be included in both the in-text citations and the reference page.
· Websites are not acceptable sources.
· Do not include direct quotes. Instead, paraphrase information from the scholarly sources (using in-text citations) in order to demonstrate your mastery of each concept.
· Do not use first person. Write in a formal college-level essay style.
· Current APA Level 1 sub-headings must be used throughout the paper. The 6 main sections of the paper will address the following topics:
1. Historical Development of Each Theory—For each theory, discuss prominent persons and their corresponding historic contributions. Include how each theory developed over time.
2. Key Concepts of Each Theory—This section will focus on the major points of each theory. How is new information acquired? What are the goals of learning? What is unique about each theory?
3. Compare and Contrast Research Findings for Instrumental (Operant) Conditioning—This section will compare and contrast findings from the 10 peer-reviewed scholarly research articles. Do not copy your work from the annotated bibliography. Instead, integrate your findings as you compare and contrast research from the textbook and scholarly journal articles.
4. Learning Implications—This section will discuss the implications for how learning takes place in the classroom. How does Instrumental (Operant) Conditioning work to increase learning? How might a teacher encourage learning using concepts of motivational mechanisms and stimulus control?
5. Biblical Worldview—Discuss what the Bible says regarding learning behaviors in humans. How would a biblical worldview impact a learner? Use scripture in context and thoroughly explain application of the scripture.
6. Most Significant Learning—Explain one factor which you found to be the most significant information learned in this course. (Do not use first person.) Substantiate this decision with research support citing scholarly peer-reviewed journal articles and the textbook.
Use the following as an outline for your paper:
Title Page
Introduction
Historical Development
Classical Conditioning
Instrumental (Operant) Conditioning
Key Concepts
Classical Conditioning
Instrumental (Operant) Conditioning
Integration of Research Findings
Education Implications
Biblical Worldview
Most Significant Learning
Conclusion
Reference Page
Note: Your assignment will be checked for originality via the Turnitin plagiarism tool.
The Principles of Learning and Behavior
Seventh Edition
Michael Domjan
University of Texas at Austin
with neuroscience contributions by James W. Grau
Texas A&M University
Australia • Brazil • Mexico • Singapore • United Kingdom • United States
The Principles of Learning and Behavior, Seventh Edition
Michael Domjan
Product Director: Jon-David Hague
Content Developer: Wendy Langerud
Outsource Development Coordinator: Joshua Taylor
Product Assistant: Nicole Richards
Associate Media Developer: Jasmin Tokatlian
Marketing Director: Jennifer Levanduski
Art and Cover Direction, Production Management, and Composition: PreMediaGlobal
Manufacturing Planner: Karen Hunt
Rights Acquisitions Specialist: Don Schlotman
Cover Image: Peter Cade
© 2015, 2010 Cengage Learning
ALL RIGHTS RESERVED. No part of this work covered by the copyright
herein may be reproduced, transmitted, stored, or used in any form or by
any means graphic, electronic, or mechanical, including but not limited
to photocopying, recording, scanning, digitizing, taping, Web distribution,
information networks, or information storage and retrieval systems,
except as permitted under Section 107 or 108 of the 1976 United States
Copyright Act, without the prior written permission of the publisher.
For product information and technology assistance, contact us at
Cengage Learning Customer & Sales Support, 1-800-354-9706.
For permission to use material from this text or product,
submit all requests online at www.cengage.com/permissions.
Further permissions questions can be e-mailed to
[email protected]
Library of Congress Control Number: 2013943623
ISBN-13: 978-1-285-08856-3
ISBN-10: 1-285-08856-5
Cengage Learning
200 First Stamford Place, 4th Floor
Stamford, CT 06902
USA
Cengage Learning is a leading provider of customized learning solutions
with office locations around the globe, including Singapore, the United
Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at
www.cengage.com/global.
Cengage Learning products are represented in Canada by
Nelson Education, Ltd.
To learn more about Cengage Learning Solutions, visit
www.cengage.com.
Purchase any of our products at your local college store or at our
preferred online store www.cengagebrain.com.
Printed in the United States of America
Dedication
to Deborah
Brief Contents
CHAPTER 1 Background and Rationale for the Study of Learning and Behavior 1
CHAPTER 2 Elicited Behavior, Habituation, and Sensitization 29
CHAPTER 3 Classical Conditioning: Foundations 59
CHAPTER 4 Classical Conditioning: Mechanisms 87
CHAPTER 5 Instrumental Conditioning: Foundations 121
CHAPTER 6 Schedules of Reinforcement and Choice Behavior 155
CHAPTER 7 Instrumental Conditioning: Motivational Mechanisms 185
CHAPTER 8 Stimulus Control of Behavior 211
CHAPTER 9 Extinction of Conditioned Behavior 245
CHAPTER 10 Aversive Control: Avoidance and Punishment 273
CHAPTER 11 Comparative Cognition I: Memory Mechanisms 307
CHAPTER 12 Comparative Cognition II: Special Topics 343
Contents
Preface xii
About the Authors xiv
CHAPTER 1
Background and Rationale for the Study of Learning and Behavior 1
Historical Antecedents 4
Historical Developments in the Study of the Mind 5
Historical Developments in the Study of Reflexes 7
The Dawn of the Modern Era 9
Comparative Cognition and the Evolution of Intelligence 9
Functional Neurology 11
Animal Models of Human Behavior 12
Animal Models and Drug Development 13
Animal Models and Machine Learning 14
The Definition of Learning 14
The Learning–Performance Distinction 14
Learning and Other Sources of Behavior Change 15
Learning and Levels of Analysis 15
Methodological Aspects of the Study of Learning 16
Learning as an Experimental Science 16
The General-Process Approach to the Study of Learning 20
Use of Nonhuman Animals in Research on Learning 23
Rationale for the Use of Nonhuman Animals in Research on Learning 23
Laboratory Animals and Normal Behavior 24
Public Debate about Research With Nonhuman Animals 24
Sample Questions 26
Key Terms 27
CHAPTER 2
Elicited Behavior, Habituation, and Sensitization 29
The Nature of Elicited Behavior 30
The Concept of the Reflex 30
Modal Action Patterns 32
Eliciting Stimuli for Modal Action Patterns 33
The Sequential Organization of Behavior 35
Effects of Repeated Stimulation 36
Salivation and Hedonic Ratings of Taste in People 36
Visual Attention in Human Infants 38
The Startle Response 41
Sensitization and the Modulation of Elicited Behavior 42
Adaptiveness and Pervasiveness of Habituation and Sensitization 45
Habituation Versus Sensory Adaptation and Response Fatigue 46
The Dual-Process Theory of Habituation and Sensitization 47
Applications of the Dual-Process Theory 48
Implications of the Dual-Process Theory 48
Habituation and Sensitization of Emotions and Motivated Behavior 51
Emotional Reactions and Their Aftereffects 52
The Opponent Process Theory of Motivation 53
Concluding Comments 55
Sample Questions 56
Key Terms 56
CHAPTER 3
Classical Conditioning: Foundations 59
The Early Years of Classical Conditioning 60
The Discoveries of Vul’fson and Snarskii 61
The Classical Conditioning Paradigm 61
Experimental Situations 62
Fear Conditioning 63
Eyeblink Conditioning 64
Sign Tracking and Goal Tracking 68
Learning Taste Preferences and Aversions 70
Excitatory Pavlovian Conditioning Methods 73
Common Pavlovian Conditioning Procedures 73
Measuring Conditioned Responses 74
Control Procedures for Classical Conditioning 75
Effectiveness of Common Conditioning Procedures 76
Inhibitory Pavlovian Conditioning 77
Procedures for Inhibitory Conditioning 79
Measuring Conditioned Inhibition 81
Prevalence of Classical Conditioning 83
Concluding Comments 85
Sample Questions 85
Key Terms 85
CHAPTER 4
Classical Conditioning: Mechanisms 87
What Makes Effective Conditioned and Unconditioned Stimuli? 88
Initial Responses to the Stimuli 88
Novelty of Conditioned and Unconditioned Stimuli 88
CS and US Intensity and Salience 89
CS–US Relevance, or Belongingness 90
Learning Without an Unconditioned Stimulus 92
What Determines the Nature of the Conditioned Response? 94
The US as a Determining Factor for the CR 94
The CS as a Determining Factor for the CR 95
The CS–US Interval as a Determining Factor for the CR 96
Conditioned Responding and Behavior Systems 97
S–R Versus S–S Learning 98
Pavlovian Conditioning as Modification of Responses to the Unconditioned Stimulus 100
How Do Conditioned and Unconditioned Stimuli Become Associated? 103
The Blocking Effect 104
The Rescorla–Wagner Model 106
Attentional Models of Conditioning 113
Timing and Information Theory Models 114
The Comparator Hypothesis 116
Concluding Comments 118
Sample Questions 119
Key Terms 119
CHAPTER 5
Instrumental Conditioning: Foundations 121
Early Investigations of Instrumental Conditioning 122
Modern Approaches to the Study of Instrumental Conditioning 125
Discrete-Trial Procedures 125
Free-Operant Procedures 126
Instrumental Conditioning Procedures 130
Positive Reinforcement 131
Punishment 131
Negative Reinforcement 132
Omission Training or Negative Punishment 132
Fundamental Elements of Instrumental Conditioning 134
The Instrumental Response 134
The Instrumental Reinforcer 138
The Response–Reinforcer Relation 141
Sample Questions 152
Key Terms 152
CHAPTER 6
Schedules of Reinforcement and Choice Behavior 155
Simple Schedules of Intermittent Reinforcement 156
Ratio Schedules 157
Interval Schedules 160
Comparison of Ratio and Interval Schedules 162
Choice Behavior: Concurrent Schedules 165
Measures of Choice Behavior 166
The Matching Law 167
Mechanisms of the Matching Law 169
Complex Choice and Self-control 172
Concurrent-Chain Schedules 172
Self-Control Choice and Delay Discounting 173
Concluding Comments 181
Sample Questions 182
Key Terms 182
CHAPTER 7
Instrumental Conditioning: Motivational Mechanisms 185
The Associative Structure of Instrumental Conditioning 186
The S–R Association and the Law of Effect 187
Expectancy of Reward and the S–O Association 188
R–O and S(R–O) Relations in Instrumental Conditioning 194
Response Allocation and Behavioral Economics 196
Antecedents of the Response-Allocation Approach 197
The Response Allocation Approach 201
Behavioral Economics 204
Contributions of the Response-Allocation Approach and Behavioral Economics 208
Concluding Comments 209
Sample Questions 209
Key Terms 210
CHAPTER 8
Stimulus Control of Behavior 211
Identification and Measurement of Stimulus Control 212
Differential Responding and Stimulus Discrimination 212
Stimulus Generalization 213
Stimulus and Reinforcement Variables 216
Sensory Capacity and Orientation 217
Relative Ease of Conditioning Various Stimuli 217
Type of Reinforcement 218
Stimulus Elements Versus Configural Cues in Compound Stimuli 219
Learning Factors in Stimulus Control 220
Stimulus Discrimination Training 221
What Is Learned in Discrimination Training? 227
Spence’s Theory of Discrimination Learning 227
Interactions Between S+ and S–: The Peak-Shift Effect 229
Stimulus Equivalence Training 231
Contextual Cues and Conditional Relations 234
Control by Contextual Cues 234
Control by Conditional Relations 240
Concluding Comments 243
Sample Questions 243
Key Terms 243
CHAPTER 9
Extinction of Conditioned Behavior 245
Effects of Extinction Procedures 247
Forms of Recovery From Extinction 249
Spontaneous Recovery 249
Renewal of Conditioned Responding 250
Reinstatement of Conditioned Responding 252
Resurgence of Conditioned Behavior 254
Enhancing Extinction 255
Number and Spacing of Extinction Trials 255
Immediate Versus Delayed Extinction 256
Repetition of Extinction/Test Cycles 256
Conducting Extinction in Multiple Contexts 257
Presenting Extinction Reminder Cues 258
Compounding Extinction Stimuli 259
Priming Extinction to Update Memory for Reconsolidation 260
What Is Learned in Extinction? 264
Paradoxical Reward Effects 266
Mechanisms of the Partial-Reinforcement Extinction Effect 267
Resistance to Change and Behavioral Momentum 269
Concluding Comments 271
Sample Questions 271
Key Terms 272
CHAPTER 10
Aversive Control: Avoidance and Punishment 273
Avoidance Behavior 274
Origins of the Study of Avoidance Behavior 274
The Discriminated Avoidance Procedure 275
Two-Process Theory of Avoidance 276
Experimental Analysis of Avoidance Behavior 277
Alternative Theoretical Accounts of Avoidance Behavior 284
The Avoidance Puzzle: Concluding Comments 289
Punishment 289
Experimental Analysis of Punishment 290
Theories of Punishment 300
Punishment Outside the Laboratory 303
Sample Questions 304
Key Terms 304
CHAPTER 11
Comparative Cognition I: Memory Mechanisms 307
Comparative Cognition, Consciousness, and Anthropomorphism 308
Memory: Basic Concepts 310
Stages of Information Processing 310
Types of Memory 311
Working and Reference Memory 311
Delayed Matching to Sample 312
Spatial Memory in Mazes 316
Memory Mechanisms 323
Acquisition and the Problem of Stimulus Coding 323
Retrospective and Prospective Coding 325
Retention and the Problem of Rehearsal 328
Retrieval 330
Forgetting and Sources of Memory Failure 333
Proactive and Retroactive Interference 334
Retrograde Amnesia 334
Consolidation, Reconsolidation, and Memory Updating 338
Reconsolidation 339
Concluding Comments 340
Sample Questions 341
Key Terms 341
CHAPTER 12
Comparative Cognition II: Special Topics 343
Food Caching and Recovery 344
Spatial Memory in Food Caching and Recovery 345
Episodic Memory in Food Caching and Recovery 346
Timing 349
Techniques for Studying the Temporal Control of Behavior 350
Properties of Temporally Controlled Behavior 351
Models of Timing 352
Serial Order Learning 355
Possible Bases of Serial Order Performance 355
Techniques for the Study of Serial Order Learning 359
Categorization and Concept Learning 363
Perceptual Concept Learning 364
Learning Higher-Level Concepts 368
Learning Abstract Concepts 369
Tool Use in Nonhuman Animals 370
Language Learning in Nonhuman Animals 371
Early Attempts at Language Training 372
Language Training Procedures 372
Components of Linguistic Skill 375
Evidence of “Grammar” in Great Apes 376
Sample Questions 377
Key Terms 377
References 379
Name Index 419
Subject Index 427
Boxes on the Neuroscience of Learning
1.1 The Material Mind 17
2.2 Learning in an Invertebrate 49
3.2 Eyeblink Conditioning and the Search for the Engram 66
4.3 Conditioning and the Amygdala 111
5.5 Learned Helplessness: Role of the Prefrontal Cortex and Dorsal Raphe 150
6.3 Neuroeconomics: Imaging Habit and Executive Control 179
7.1 The Role of Dopamine in Addiction and Reward 191
8.2 Hippocampal Function and Long-Term Potentiation 237
9.1 Consolidating Memories Requires Protein Synthesis 261
10.1 Instrumental Learning Within the Spinal Cord 294
11.1 Genes and Learning 320
12.1 Neurobiology of Time 356
Preface
Originally, I had three basic goals in writing this book. First, I wanted to share with stu-
dents the new ideas and findings that made me excited about conditioning and learning.
Second, I wanted to emphasize that learning procedures do not operate in a vacuum but
are built on behavior systems shaped by evolution. This belief provided the rationale for
including behavior in the title of the book. Third, I wanted to provide an eclectic and
balanced account of the field that was respectful of both the Pavlovian associationist tra-
dition and the Skinnerian behavior-analytic tradition. I have remained faithful to these
goals and sought to satisfy them in the seventh edition while being responsive to the
ever-changing landscape of research on learning mechanisms.
Although the field of conditioning and learning dates back more than a century
(during which some of our technical vocabulary has not changed much), the field con-
tinues to be enriched by numerous new phenomena and new interpretations. Recent
national priorities for the pursuit of translational research have encouraged a great deal
of new research on mechanisms of learning related to drug addiction, fear conditioning,
and extinction. One of the interesting new developments is a greater respect for individual
differences, which is now informing our understanding of some of the fundamental phe-
nomena of Pavlovian conditioning, as well as punishment and choice behavior, among
other topics. Incorporating new developments in the book required judgments about
what was important enough to add and what material could be omitted to make room
for the new information. Adding things is easy. Removing information that was previously
deemed important is more painful but necessary to keep the book to a reasonable length.
A continuing challenge for the book has been how to represent the major advances
that are being made in studies of the neuroscience of learning and memory. Unfortu-
nately, a single course cannot do justice to both the basic behavioral mechanisms of
learning and the neural mechanisms of these behavioral phenomena. I remain com-
mitted to the proposition that one cannot study the neural mechanisms of a behavioral
phenomenon without first understanding that phenomenon at the behavioral level of
analysis. Therefore, the book continues to be primarily concerned with behavioral phe-
nomena. However, the seventh edition includes more information about the neuro-
science of learning and memory than any previous edition.
As in the sixth edition, most of the neuroscience information is presented in boxes
that can be omitted by instructors and students who do not wish to cover this material.
I am grateful to James W. Grau, Professor of Psychology and Neuroscience at Texas
A&M University, for writing the “neuroboxes.” The seventh edition includes a neurobox
in each chapter of the book. Furthermore, for the first time, Professor Grau organized
these neuroboxes so that they tell a coherent and progressively unfolding story across
the 12 chapters. For a bird’s-eye view, a list of the neuroboxes is presented in a separate
section of the table of contents.
xii
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
In addition to advances in the neurosciences, new research on many aspects of basic
learning phenomena dictated numerous changes from the sixth to the seventh edition.
The changes are too numerous to list. Among other things, they include new findings
related to habit formation and automatic processing, epigenetic influences on behavior,
pathological fear and post-traumatic stress disorder, individual differences in sign track-
ing and goal tracking, the relation of the Rescorla–Wagner model to error-correction
mechanisms in robotics, new work on voucher-based programs for the treatment of sub-
stance abuse, new research on self-control, S–O and R–O associations in drug addiction,
expanded and updated discussion of response allocation and behavioral economics, new
research on stimulus equivalence, new work on ways to enhance extinction, new theory
and research on avoidance, and extensive new sections on memory mechanisms and var-
ious special topics in comparative cognition (Chapters 11 and 12).
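As an aside, the Rescorla–Wagner model mentioned above is at its core an error-correction rule, which is what links it to robotics and machine learning. The following minimal sketch is our illustration, not anything from the book; the function and variable names are hypothetical, and only the single-cue case of the standard textbook update, ΔV = αβ(λ − V), is shown.

def rescorla_wagner(n_trials, alpha=0.3, beta=1.0, lam=1.0):
    """Associative strength V after each CS-US pairing (single-cue case)."""
    v = 0.0
    strengths = []
    for _ in range(n_trials):
        error = lam - v            # prediction error: how surprising the US is
        v += alpha * beta * error  # error-correction step
        strengths.append(round(v, 3))
    return strengths

# Strength rises quickly at first and levels off as the prediction error
# shrinks -- the same negative-feedback logic as the delta rule used in
# error-correction learning in machine learning and robotics.
print(rescorla_wagner(8))

In the full model, the error term uses the summed strength of all stimuli present on a trial, which is what lets it explain phenomena such as blocking (see Chapter 4).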
One of the major developments in the field during the past decade is that the basic
behavioral principles that are described in this book are being utilized by an increasingly
broad range of scientists. I first noticed this trend when I was preparing the sixth edition.
The trend has continued since then, with the consequence that the new references that
have been added in the seventh edition were culled from about 85 different journals.
New information on basic learning processes continues to be published in traditional
psychology journals (such as the Journal of the Experimental Analysis of Behavior and
the Journal of Experimental Psychology: Animal Behavior Processes). However, important
new findings are also being published in journals dealing with behavior therapy, brain
research and neuroscience, biological psychiatry, child development, drug and alcohol
dependence, language and cognition, family violence, neuropsychology, pharmacology
and therapeutics, and psychosomatic medicine.
The broadening range of disciplines that are finding basic behavioral principles to be
relevant has also been evident in the range of students who have been signing up for my
learning classes. During the past two years, my graduate course on learning has attracted
students from integrative biology, communications, information science, marketing,
music, special education, and neuroscience, in addition to psychology.
Identifying relevant sources that appear in a diverse range of journals is made pos-
sible by the search engines of the new information age. Early editions of the book pro-
vided extensive citations of research on various topics in conditioning and learning.
Considering how easy it is to find sources using ever-improving search engines, the cita-
tions in the seventh edition are not as extensive and are intended to introduce students
to new lines of research rather than provide a complete list of the relevant research.
I apologize to investigators whose names may have been omitted because of this altered
citation strategy.
I would like to acknowledge the support of numerous instructors and students around the
world who continue to look to this book for authoritative coverage of basic learning
mechanisms. Without their support, successive editions (and translations) of the book
would not be possible. Successive editions of the book also would not have been possible
without the support of the good folks at Cengage Learning, especially Jon-David Hague,
the product director of psychology. I am also grateful to Wendy Langerud (in Iowa) and
Gunjan Chandola (in India) for all of their help in shepherding the seventh edition
through the complexities of the production process. Finally, I would like to thank Professor
Kevin Holloway of Vassar College for agreeing to prepare the Instructor’s Manual and Test
Bank for the book.
Michael Domjan
Austin, Texas
About the Authors
MICHAEL DOMJAN is a professor of Psychology at the University of Texas at Austin,
where he has taught learning to undergraduate and graduate students since 1973. He also
served as department chair from 1999 to 2005 and was the founding director of the
Imaging Research Center from 2005 to 2008. Professor Domjan is noted for his functional
approach to classical conditioning, which he has pursued in studies of sexual conditioning
and taste aversion learning. His research was selected for a MERIT Award by the National
Institute of Mental Health as well as a Golden Fleece Award by U.S. Senator William
Proxmire. He served as editor of the Journal of Experimental Psychology: Animal Behavior
Processes for six years and continues to serve on editorial boards of various journals in the
United States and other countries. He is a past president of the Pavlovian Society and also
served as president of the Division of Behavioral Neuroscience and Comparative Psychology
of the American Psychological Association. His former Ph.D. students hold faculty positions
at various colleges and universities in the United States, Colombia, and Turkey. Domjan also
enjoys playing the viola and teaches a course on Music and Psychology in which he talks
about the role of habituation, sensitization, and Pavlovian and instrumental conditioning in
musical experience and musical performance.
Neuroscience Contributor
JAMES GRAU is a professor at Texas A&M University, with appointments in Psychology
and the Texas A&M Institute for Neuroscience (TAMIN). He received his Ph.D. under the
direction of Dr. R. A. Rescorla and moved to Texas A&M University in 1987, where he is
now the Mary Tucker Currier Professor of Liberal Arts. He is a fellow of both the Associa-
tion for Psychological Science and the American Psychological Association (Divisions 3, 6,
and 28), where he served as president of Division 6 (Behavioral Neuroscience and
Comparative Psychology). His research has examined how learning and memory influence
pain processing, the neurobiological mechanisms involved, and how physiological observa-
tions inform our understanding of learning. His current research focuses on neural
plasticity within the spinal cord, with the aim of detailing its functional properties, how
and when spinal neurons learn, and the implications of this work for recovery after a spinal
cord injury. His work has been funded by the National Institute of Mental Health (NIMH),
the National Institute of Neurological Disorders and Stroke (NINDS), and the National
Institute of Child Health and Human Development (NICHD). Since 1983, he has taught
nearly 50 courses and seminars on learning.
Photos: Courtesy of M. Domjan; Courtesy of James Grau.
CHAPTER 1
Background and Rationale for the Study of Learning and Behavior
Historical Antecedents
Historical Developments in the Study of the Mind
Historical Developments in the Study of Reflexes
The Dawn of the Modern Era
Comparative Cognition and the Evolution of Intelligence
Functional Neurology
Animal Models of Human Behavior
Animal Models and Drug Development
Animal Models and Machine Learning
The Definition of Learning
The Learning–Performance Distinction
Learning and Other Sources of Behavior Change
Learning and Levels of Analysis
Methodological Aspects of the Study of Learning
Learning as an Experimental Science
The General-Process Approach to the Study of Learning
Use of Nonhuman Animals in Research on Learning
Rationale for the Use of Nonhuman Animals in Research on Learning
Laboratory Animals and Normal Behavior
Public Debate about Research With Nonhuman Animals
Sample Questions
Key Terms
CHAPTER PREVIEW
The goal of Chapter 1 is to introduce the reader to contemporary studies of learning and behavior theory.
I begin by characterizing behavioral studies of learning and describing how these are related to cognition
and the conscious control of behavior. I then describe the historical antecedents of key concepts in modern
learning theory. This is followed by a discussion of the origins of contemporary experimental research in
studies of the evolution of intelligence, functional neurology, animal models of human behavior, and the
implications of contemporary research for the development of memory-enhancing drugs and the construction
of artificial intelligent systems or robots. I then provide a detailed definition of learning and discuss how
learning can be examined at different levels of analysis. Methodological features of studies of learning are
described in the next section. Because numerous experiments on learning have been performed with
nonhuman animals, I conclude the chapter by discussing the rationale for the use of nonhuman animals in
research, with some comments on the public debate about animal research.
People have always been interested in understanding behavior, be it their own or the
behavior of others. This interest is more than idle curiosity. Our quality of life depends
on our actions and the actions of others. Any systematic effort to understand behavior
must include consideration of what we learn and how we learn it. Numerous aspects of
the behavior of both human and nonhuman animals are the results of learning. We learn
to read, to write, and to count. We learn to walk downstairs without falling, to open doors,
to ride a bicycle, and to swim. We also learn when to relax and when to become anxious.
We learn what foods we are likely to enjoy and what foods will make us sick. We also
learn the numerous subtle gestures that are involved in effective social interactions. Life is
filled with activities and experiences that are shaped by what we have learned.
Learning is one of the biological processes that facilitates our survival and promotes
our well-being. When we think of survival, we typically think of the importance of
biological functions such as respiration, digestion, and resisting disease. Physiological
systems have evolved to accomplish these tasks. However, for many species finely tuned
physiological processes do not take care of all of the adaptive functions that are required
for successful existence. Learning plays a critical role in improving how organisms adapt
to their environment. Sometimes this takes the form of learning new responses. In other
cases, learning serves to improve how physiological systems operate to accomplish
important biological functions such as digestion and reproduction (Domjan, 2005).
Animals, including people, have to learn to find new food sources as old ones
become unavailable or when they move to a new area. They also have to find new shelter
when storms destroy their old homes, as happens in a hurricane or tornado. Accom-
plishing these tasks obviously requires motor responses, such as walking and manipulat-
ing objects. These tasks also require the ability to predict important events in the
environment, such as when and where food will be available. All these things involve
learning. Animals learn to go to a new water hole when their old one dries up and
learn to anticipate new sources of danger. These learned adjustments to the environment
are as important as physiological processes such as respiration and digestion.
It is common to think about learning as involving the acquisition of new behavior.
Indeed, we learn new responses when we learn to read, ride a bicycle, or play a musical
instrument. However, learning can also consist of the decrease or loss of a previously
performed response. A child, for example, may learn to not cross the street when the
traffic light is red, to not grab food from someone else’s plate, and to not yell and
shout when someone is trying to take a nap. Learning to withhold or inhibit responses
is just as important as learning to make responses, if not more so.
When considering learning, we are likely to think about forms of learning that require
special training—the learning that takes place in schools and colleges, for example. Solving
calculus problems or making a triple somersault when diving requires special instruction and
lots of practice. However, we also learn all kinds of things without an expert teacher or coach
during the course of routine interactions with our social and physical environment. Children
learn how to open doors and windows, what to do when the phone rings, when to avoid a
hot stove, and when to duck so as not to get hit by a flying ball. College students learn how
to find their way around campus, how to avoid heartburn from cafeteria food, and how to
predict when a roommate will stay out late at night, all without special instruction.
In the coming chapters, I will describe research on the basic principles of learning and
behavior. We will focus on basic types of learning and behavior that are fundamental to life
but, like breathing, are often ignored. These pervasive and basic forms of learning are a nor-
mal (and often essential) part of daily life, even though they rarely command our attention. I
will describe the learning of simple relationships between events in the environment, the
learning of simple motor movements, and the learning of emotional reactions to stimuli.
These forms of learning are investigated in experiments that involve conditioning or “train-
ing” procedures of various sorts. However, these forms of learning occur in the lives of
human and nonhuman animals without explicit or organized instruction or schooling.
Much of the research that I will describe is in the behaviorist tradition of psychology
that emphasizes analyzing behavior in terms of its antecedent stimuli and consequences.
Conscious reflection and conscious reasoning are deliberately left out of this analysis.
I will describe automatic procedural learning that does not require awareness (e.g.,
Lieberman, Sunnucks, & Kirk, 1998; Smith et al., 2005) rather than declarative learning
that is more accessible to conscious report.
It is natural for someone to be interested in aspects of his or her behavior that
are accessible to conscious reflection. However, both psychologists and neuroscientists have
become increasingly convinced that most of what we do occurs without conscious awareness.
The capacity of conscious thought is very limited. That is why people have difficulty driving
and talking on the phone at the same time. However, people can walk and talk at the same
time because walking is a much more automatic activity that does not require conscious con-
trol. Because of the limited capacity of conscious thought, we do and learn many things
without awareness. In a recent discussion of neuroscience, Eagleman (2011) noted that
“there is a looming chasm between what your brain knows and what your mind is capable
of accessing” (p. 55). Based on his research on the experience of conscious intent, Wegner
(2002) came to a similar conclusion, which is captured in the title of his book The Illusion of
Conscious Will. The studies of automatic procedural learning that we will discuss serve to
inform us about important aspects of our behavior that we rarely think about otherwise.
The following chapters will describe how features of the environment gain the
capacity to trigger our behavior whether we like it or not. This line of research has its
origins in what has been called behavioral psychology. During the last quarter of the
twentieth century, behavioral psychology was overshadowed by “the cognitive
revolution.” However, the cognitive revolution did not eliminate the taste aversions that
children learn when they get chemotherapy, it did not reduce the cravings that drug
addicts experience when they see their friends getting high, and it did not stop the pro-
verbial Pavlovian dog from salivating when it encountered a signal for food. Cognitive
science did not grow by taking over the basic learning phenomena that are the focus of
this book. Rather, it grew by extending psychology into new areas of research such as
attention, problem solving, and knowledge representation. As important as these new
topics of cognitive psychology have become, they have not solved the problems of how
good or bad habits are learned or how debilitating fears or emotions may be effectively
modified. Those topics remain at the core of studies of learning and behavior.
Basic behavioral processes remain important in the lives of organisms even as we
learn more about other aspects of psychology. In fact, there is a major resurgence of
interest in basic behavioral mechanisms. This is fueled by the growing appreciation of
the limited role of consciousness in behavior and the recognition that much of what
takes us through our daily lives involves habitual responses that we spend little time
thinking about (Gasbarri & Tomaz, 2013; Wood & Neal, 2007). We don’t think about
how we brush our teeth, dry ourselves after a shower, put on our clothes, or chew our
food. All of these are learned responses. Behavioral models of conditioning and learning
are also fundamental to the understanding of recalcitrant clinical problems such as
pathological fears and phobias and drug addiction. As Wiers and Stacy (2006) pointed
out, “The problem, often, is not that substance abusers do not understand that the
disadvantages of continued use outweigh the advantages; rather, they have difficulty
resisting their automatically triggered impulses to use their substance of abuse” (p. 292).
This book deals with how such behavioral impulses are learned.
Historical Antecedents
Theoretical approaches to the study of learning have their roots in the philosophy of
René Descartes (Figure 1.1). Before Descartes, the prevailing view was that human
behavior is entirely determined by conscious intent and free will. People’s actions were
not considered to be automatic or determined by mechanistic natural laws. What some-
one did was presumed to be the result of his or her will or deliberate intent. Descartes
took exception to this view because he recognized that people do many things automati-
cally in response to external stimuli. However, he was not prepared to abandon entirely
the idea of free will and conscious control. He therefore formulated a dualistic view of
human behavior known as Cartesian dualism.
According to Cartesian dualism, there are two classes of human behavior: involun-
tary and voluntary. Involuntary behavior consists of automatic reactions to external
stimuli and is mediated by a special mechanism called a reflex. Voluntary behavior,
by contrast, does not have to be triggered by external stimuli and occurs because of the
person’s conscious intent to act in that particular manner.
The details of Descartes’s dualistic view of human behavior are diagrammed in
Figure 1.2. Let us first consider the mechanisms of involuntary, or reflexive, behavior.
Stimuli in the environment are detected by the person’s sense organs. The sensory
information is then relayed to the brain through nerves. From the brain, the impetus
for action is sent through nerves to the muscles that create the involuntary response.
FIGURE 1.1 René Descartes (1596–1650). (Library of Congress Prints and Photographs Division [LC-USZ62-61365])
Thus, sensory input is reflected in the response output. Hence, Descartes called involun-
tary behavior reflexive.
Several aspects of this system are noteworthy. Stimuli in the external environment
are assumed to be the cause of all involuntary behavior. These stimuli produce involun-
tary responses by way of a neural circuit that includes the brain. However, Descartes
assumed that only one set of nerves was involved. According to Descartes, the same
nerves transmitted information from the sense organs to the brain and from the brain
down to the muscles. This circuit, he believed, permitted rapid reactions to external
stimuli—for example, quick withdrawal of one’s finger from a hot stove.
Descartes assumed that the involuntary mechanism of behavior was the only one
available to animals other than humans. According to this view, all of nonhuman animal
behavior occurs as reflexive behavior to external stimuli. Thus, Descartes believed that
nonhuman animals lacked free will and were incapable of voluntary, conscious action.
He considered free will and voluntary behavior to be uniquely human attributes. These
unique human features existed because only human beings were thought to have a mind
or a soul.
The mind was assumed to be a nonphysical entity. Descartes believed that the mind
was connected to the physical body by way of the pineal gland, at the base of the brain.
Because of this connection, the mind was aware of and could keep track of involuntary
behavior. Through this mechanism, the mind could also initiate voluntary actions.
Because voluntary behavior was initiated in the mind, its occurrence was not automatic
and could occur independently of external stimulation.
The mind–body dualism introduced by Descartes stimulated two intellectual tradi-
tions, mentalism and reflexology. Mentalism was concerned with the contents and work-
ings of the mind. In contrast, reflexology was concerned with the mechanisms of
reflexive behavior. These two intellectual traditions form the foundations of the modern
study of learning.
Historical Developments in the Study of the Mind
Philosophers concerned with the mind pondered questions about the contents of the
mind and how the mind works. These considerations formed the historical foundations
of present-day cognitive psychology. Because Descartes thought the mind was connected
to the brain by way of the pineal gland, he believed that some of the contents of the
mind came from sense experiences. However, he also believed that the mind contained
ideas that were innate and existed in all human beings independent of personal experi-
ence. For example, he believed that all humans were born with the concept of God, the
concept of self, and certain fundamental axioms of geometry (such as the fact that the
shortest distance between two points is a straight line). The philosophical approach that
assumes we are born with innate ideas about certain things is called nativism.
FIGURE 1.2 Diagram of Cartesian dualism. Events in the physical world are detected by sense organs. From here the information is transmitted to the brain. The brain is connected to the mind by way of the pineal gland. Involuntary action is produced by a reflex arc that involves messages sent from the sense organs to the brain and then from the brain to the muscles. Voluntary action is initiated by the mind, with messages sent to the brain and then the muscles. (© Cengage Learning)
Some philosophers after Descartes took issue with the nativist position. In particular,
the British philosopher John Locke (1632–1704) proposed that all of the ideas people
had were acquired directly or indirectly through experiences after birth. He believed
that human beings were born without any preconceptions about the world. According
to Locke, the mind started out as a clean slate (tabula rasa, in Latin), to be gradually
filled with ideas and information as the person encountered various sense experiences.
This philosophical approach to the contents of the mind is called empiricism. Empiri-
cism was accepted by a group of British philosophers who lived from the seventeenth
to the nineteenth century and who came to be known as the British empiricists.
The nativist and empiricist philosophers disagreed not only about the contents of the
mind at birth but also about how the mind worked. Descartes believed that the mind did
not function in a predictable and orderly manner, according to strict rules or laws that
one could identify. One of the first to propose an alternative to this position was the
British philosopher Thomas Hobbes (1588–1679). Hobbes accepted the distinction
between voluntary and involuntary behavior stated by Descartes and also accepted the
notion that voluntary behavior was controlled by the mind. However, unlike Descartes,
he believed that the mind operated just as predictably and lawfully as a reflex. More
specifically, he proposed that voluntary behavior was governed by the principle of
hedonism. According to this principle, people do things in the pursuit of pleasure and
the avoidance of pain. Hobbes was not concerned with whether the pursuit of pleasure
and the avoidance of pain are desirable or justified. For Hobbes, hedonism was simply a
fact of life. As we will see, the notion that behavior is controlled by positive and negative
consequences has remained with us in one form or another to the present day.
According to the British empiricists, another important aspect of how the mind
works involved the concept of association. Recall that the empiricists assumed that all
ideas originated from sense experiences. If that is true, how do our experiences of various
colors, shapes, odors, and sounds allow us to arrive at more complex ideas? Consider, for
example, the concept of a car. If someone says the word car, you have an idea of what
the thing looks like, what it is used for, and how you might feel if you sat in it. Where do
all these ideas come from given just the sound of the letters c, a, and r? The British
empiricists proposed that simple sensations were combined into more complex ideas by
associations. Because you have heard the word car when you saw a car, considered using
one to get to work, or sat in one, connections or associations became established between
the word car and these other attributes of cars. Once the associations became established,
the word car would activate memories of the other aspects of cars that you have experi-
enced. The British empiricists considered such associations to be the building blocks of
mental activity. Therefore, they devoted considerable effort to discovering rules of
associations.
Rules of Associations The British empiricists accepted two sets of rules for the estab-
lishment of associations, one primary and the other secondary. The primary rules were
originally set forth by the ancient Greek philosopher Aristotle. He proposed three prin-
ciples for the establishment of associations: (1) contiguity, (2) similarity, and (3) contrast.
Of these, the contiguity principle has been the most prominent in studies of associations
and continues to play an important role in contemporary work. It states that if two
events repeatedly occur together in space or time, they will become linked or associated.
For example, if you encounter the smell of tomato sauce with spaghetti often enough,
your memory of spaghetti will be activated by just the smell of tomato sauce. The
similarity and contrast principles state that two things will become associated if they
are similar in some respect (e.g., both are red) or have some contrasting characteristics
(e.g., one might be strikingly tall and the other strikingly short). Similarity as a basis for
the formation of associations has been confirmed by modern studies of learning (e.g.,
Cusato & Domjan, 2012; Rescorla & Furrow, 1977). However, there is no contemporary
evidence that contrast (making one stimulus strikingly different from another) facilitates
the formation of an association between them.
Secondary laws of associations were formulated by various empiricist philosophers.
Prominent among these was Thomas Brown (1778–1820), who proposed that the associ-
ation between two stimuli depended on the intensity of those stimuli and how frequently
or recently the stimuli occurred together. In addition, the formation of an association
between two events was considered to depend on the number of other associations in
which each event was already involved and the similarity of these past associations to
the current one being formed.
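These classical principles can be made concrete with a small numerical sketch. The following Python fragment is purely illustrative and is not drawn from the empiricists or from the textbook: the weighting of frequency, recency, and intensity, and the decay constant, are invented assumptions.

import math

def association_strength(pairings, hours_since_last, intensity):
    """Hypothetical association strength between two events.

    pairings: how often the events occurred together (frequency)
    hours_since_last: time since the most recent pairing (recency)
    intensity: salience of the stimuli (Brown's intensity law)
    """
    frequency_term = math.log(1 + pairings)            # more pairings, stronger bond
    recency_term = math.exp(-0.1 * hours_since_last)   # older pairings fade
    return intensity * frequency_term * recency_term

# The smell of tomato sauce and spaghetti, paired often and recently:
print(association_strength(pairings=50, hours_since_last=1, intensity=1.0))
# A rare, long-ago pairing yields a much weaker association:
print(association_strength(pairings=2, hours_since_last=30, intensity=1.0))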
The British empiricists discussed rules of association as a part of their philosophical
discourse. They did not perform experiments to determine whether or not the proposed
rules were valid. Nor did they attempt to determine the circumstances in which one rule
was more important than another. Empirical investigation of the mechanisms of associa-
tions did not begin until the pioneering work of the nineteenth-century German psychol-
ogist Hermann Ebbinghaus (1850–1909).
To study how associations are formed, Ebbinghaus invented nonsense syllables.
Nonsense syllables were three-letter combinations (e.g., “bap”) devoid of any meaning
that might influence how someone might react to them. Ebbinghaus used himself as
the experimental subject. He studied lists of nonsense syllables and measured his ability
to remember them under various experimental conditions. This general method enabled
him to answer such questions as how the strength of an association improved with
increased training, whether nonsense syllables that were close together in a list were
associated more strongly with one another than syllables that were farther apart, and
whether a syllable became more strongly associated with the next one on the list (a for-
ward association) rather than with the preceding one (a backward association). Many of
the issues that were addressed by the British empiricists and Ebbinghaus have their
counterparts in modern studies of learning and memory.
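The logic of Ebbinghaus's method can be sketched in a few lines of Python. The syllable generator mirrors his consonant-vowel-consonant format; the retention function is an assumed decay curve for illustration only, not his actual data.

import math
import random

CONSONANTS = "bcdfghjklmnpqrstvwz"
VOWELS = "aeiou"

def nonsense_syllable():
    """Return a random three-letter consonant-vowel-consonant syllable."""
    return (random.choice(CONSONANTS) + random.choice(VOWELS)
            + random.choice(CONSONANTS))

study_list = [nonsense_syllable() for _ in range(12)]
print("List to study:", study_list)

def proportion_retained(hours_elapsed, decay_rate=0.3):
    """Assumed retention function: savings decline rapidly, then level off."""
    return math.exp(-decay_rate * math.sqrt(hours_elapsed))

for hours in (1, 9, 24, 48):
    print(f"After {hours} h, proportion retained ~ {proportion_retained(hours):.2f}")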
Historical Developments in the Study of Reflexes
Descartes made a very significant contribution to the understanding of behavior when he
formulated the concept of the reflex. The basic idea that behavior can reflect a triggering
stimulus remains an important building block of behavior theory. However, Descartes
was mistaken in his beliefs about the details of reflex action. He believed that sensory
messages going from sense organs to the brain and motor messages going from the
brain to the muscles traveled along the same nerves. He thought that nerves were hollow
tubes, and neural transmission involved the movement of gases called animal spirits. The
animal spirits, released by the pineal gland, were assumed to flow through the neural
tubes and enter the muscles, causing them to swell and create movement. Finally,
Descartes considered all reflexive movements to be innate and to be fixed by the anat-
omy of the nervous system. Over the course of several hundred years since Descartes
passed away, all of these ideas about reflexes have been proven wrong.
Charles Bell (1774–1842) in England and Francois Magendie (1783–1855) in France
showed that separate nerves are involved in the transmission of sensory information
from sense organs to the central nervous system and motor information from the central
nervous system to muscles. If a sensory nerve is cut, the animal remains capable of mus-
cle movements; if a motor nerve is cut, the animal remains capable of registering sensory
information.
The idea that animal spirits are involved in neural transmission was also disproved. In
1669 John Swammerdam (1637–1680) showed that mechanical irritation of a nerve was
sufficient to produce a muscle contraction. Thus, infusion of animal spirits from the pineal
gland was not necessary. In other studies, Francis Glisson (1597–1677) tested whether mus-
cle contractions were produced by the infusion of a gas into the muscle, as Descartes had
postulated. Glisson showed that the volume of a muscle does not increase when it is con-
tracted, demonstrating that a gas does not enter the muscle to produce motor movement.
Descartes and most philosophers after him assumed that reflexes were responsible
only for simple reactions to stimuli. The energy in a stimulus was thought to be translated
directly into the energy of the elicited response by the neural connections from
sensory input to response output. The more intense the stimulus was, the more vigorous
the resulting response would be. This simple view of reflexes is consistent with many
casual observations. If you touch a stove, for example, the hotter the stove, the more
quickly you withdraw your finger. However, some reflexes are much more complicated.
The physiological processes responsible for reflex behavior became better under-
stood in the nineteenth century, and those experiments encouraged broader conceptions
of reflex action. Two Russian physiologists, I. M. Sechenov (1829–1905) and Ivan
Pavlov (1849–1936), were primarily responsible for these developments. Sechenov
(Figure 1.3) proposed that stimuli did not elicit reflex responses directly in all cases.
Rather, in some cases, a stimulus could release a response from inhibition. In instances
where a stimulus released a response from inhibition, the vigor of the response would
not depend on the intensity of the initiating stimulus. This simple idea opened up all
sorts of new ways the concept of a reflex could be used to explain complex behavior.
If the vigor of an elicited response does not depend on the intensity of its triggering
stimulus, a very faint stimulus could produce a large response. A small piece of dust in
the nose, for example, can cause a vigorous sneeze. Sechenov took advantage of this type
FIGURE 1.3 I. M. Sechenov (1829–1905)
of mechanism to provide a reflex model of voluntary behavior. He suggested that actions
or thoughts that occurred in the absence of an obvious eliciting stimulus were in fact
reflexive responses. However, in these cases, the eliciting stimuli are too faint for us to
notice. Thus, according to Sechenov, voluntary behavior and thoughts are actually eli-
cited by inconspicuous, faint stimuli.
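The contrast between direct elicitation and release from inhibition can be expressed schematically. The following sketch uses invented numbers; the functions direct_reflex and release_from_inhibition are hypothetical illustrations, not models from the text.

def direct_reflex(stimulus_intensity):
    """Descartes-style reflex: response vigor scales with the stimulus."""
    return 2.0 * stimulus_intensity

def release_from_inhibition(stimulus_intensity, stored_response_vigor=10.0,
                            threshold=0.1):
    """Sechenov-style release: any stimulus above a small threshold
    unleashes the full inhibited response, regardless of intensity."""
    return stored_response_vigor if stimulus_intensity > threshold else 0.0

faint = 0.2  # e.g., a small piece of dust in the nose
print("Direct reflex to faint stimulus:", direct_reflex(faint))                # weak
print("Released response to faint stimulus:", release_from_inhibition(faint))  # vigorous sneeze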
Sechenov’s ideas about voluntary behavior greatly extended the use of reflex
mechanisms to explain a variety of aspects of behavior. However, his ideas were philo-
sophical extrapolations from the actual research results he obtained. In addition,
Sechenov did not address the question of how reflex mechanisms can account for the
fact that the behavior of organisms is not fixed and invariant throughout an organism’s
lifetime but can be altered by experience. From the time of Descartes, reflex responses
were considered to be innate and fixed by the connections of the nervous system. Reflexes
were assumed to depend on a prewired neural circuit connecting the sense organs to the
relevant muscles. According to this view, a given stimulus could be expected to elicit the
same response throughout an organism’s life. Although this is true in some cases,
there are also many examples in which responses to stimuli change as a result of experi-
ence. Explanation of such reflexive activity had to await the work of Ivan Pavlov.
Pavlov showed experimentally that not all reflexes are innate. New reflexes to stimuli can
be established through mechanisms of association. Thus, Pavlov’s role in the history of the
study of reflexes is comparable to the role of Ebbinghaus in the study of the mind. Both were
concerned with establishing laws of associations through empirical research. However, Pavlov
did this in the physiological tradition of reflexology rather than in the mentalistic tradition.
Much of modern behavior theory has been built on the reflex concept of stimulus-
response or S-R unit and the concept of associations. S-R units and associations continue
to play prominent roles in contemporary behavior theory. However, these basic concepts
have been elaborated and challenged over the years. As I will describe in later chapters, in
addition to S-R units, modern studies of learning have also demonstrated the existence of
stimulus-stimulus (S-S) connections and modulatory or hierarchical associative structures
(for Bayesian approaches, see Fiser, 2009; Kruschke, 2008). Quantitative descriptions of
learned behavior that do not employ associations have gained favor in some quarters (e.g.,
Gallistel & Matzel, 2013; Leslie, 2001) and have also been emphasized by contemporary
scientists working in the Skinnerian tradition of behavioral analysis (e.g., Staddon, 2001;
Lattal, 2013). However, associative analyses continue to dominate behavior theory and provide
the conceptual foundation for much of the research on the neural mechanisms of learning.
The Dawn of the Modern Era
Experimental studies of basic principles of learning are often conducted with nonhuman
animals and in the tradition of reflexology. Research in animal learning came to be pur-
sued with great vigor starting a little more than a hundred years ago. Impetus for the
research came from three primary sources (see Domjan, 1987). The first of these was
interest in comparative cognition and the evolution of the mind. The second was interest
in how the nervous system works (functional neurology), and the third was interest in
developing animal models to study certain aspects of human behavior. As we will see
in the ensuing chapters, comparative cognition, functional neurology, and animal models
of human behavior continue to dominate contemporary research in learning.
Comparative Cognition and the Evolution of Intelligence
Interest in comparative cognition and the evolution of the mind was sparked by the writ-
ings of Charles Darwin (Figure 1.4). Darwin took Descartes’s ideas about human nature
one step further. Descartes started chipping away at the age-old notion that human
beings have a unique and privileged position in the animal kingdom by proposing that
at least some aspects of human behavior (their reflexes) were animal-like. However,
Descartes preserved some privilege for human beings by assuming that humans (and
only humans) have a mind. Darwin attacked this last vestige of privilege.
In his second major work, The Descent of Man and Selection in Relation to Sex,
Darwin argued that “man is descended from some lower form, notwithstanding that
connecting links have not hitherto been discovered” (Darwin, 1897, p. 146). In claiming
continuity from nonhuman to human animals, Darwin sought to characterize not only
the evolution of physical traits but also the evolution of psychological or mental abilities.
He argued that the human mind is a product of evolution. In making this claim, Darwin
did not deny that human beings had mental abilities such as the capacity for wonder,
curiosity, imitation, attention, memory, reasoning, and aesthetic sensibility. Rather, he
suggested that nonhuman animals also had these abilities. For example, he maintained
that nonhuman animals were capable even of belief in spiritual agencies (Darwin,
1897, p. 95).
Darwin collected anecdotal evidence of various forms of intelligent behavior in ani-
mals in an effort to support his claims. Although the evidence was not compelling by
modern standards, the research question was. Ever since, investigators have been capti-
vated by the possibility of tracing the evolution of cognition and behavior by studying
the abilities of various species of animals (Burghardt, 2009).
Before one can investigate the evolution of intelligence in a systematic fashion, one
must have a criterion for identifying intelligent behavior in animals. A highly influential
criterion was offered by George Romanes in his book Animal Intelligence (Romanes,
1882). Romanes proposed that intelligence be identified by whether an animal learns
FIGURE 1.4 Charles Darwin (1809–1882)
“to make new adjustments, or to modify old ones, in accordance with the results of its
own individual experience” (p. 4). Thus, Romanes defined intelligence in terms of the
ability to learn. This definition was widely accepted by early comparative psychologists
and served to make the study of animal learning the key to obtaining information
about the evolution of intelligence.
As the upcoming chapters will show, much research on mechanisms of animal
learning has not been concerned with trying to obtain evidence of the evolution of intel-
ligence. Nevertheless, the cognitive abilities of nonhuman animals continue to fascinate
both the lay public and the scientific community. In contemporary science, these issues
are covered under the topic of “comparative cognition” or “comparative psychology”
(e.g., Papini, 2008; Shettleworth, 2010). Studies of comparative cognition examine topics
such as perception, attention, spatial representation, memory, problem solving, categori-
zation, tool use, and counting in nonhuman animals (Zentall & Wasserman, 2012). We
will discuss the results of contemporary research on comparative cognition in many
chapters of this text, and especially in Chapters 11 and 12.
Functional Neurology
The modern era in the study of learning processes was also greatly stimulated by efforts
to use studies of learning in nonhuman animals to gain insights into how the nervous
system works. This line of research was initiated by the Russian physiologist Ivan Pavlov,
quite independently of the work of Darwin, Romanes, and others interested in compara-
tive cognition.
While still a medical student, Pavlov became committed to the principle of nervism
according to which all key physiological functions are governed by the nervous system.
Armed with this principle, Pavlov devoted his life to documenting how the nervous
system controlled various aspects of physiology. Much of his work was devoted to iden-
tifying the neural mechanisms of digestion.
For many years, Pavlov’s research progressed according to plan. But, in 1902, two
British investigators (Bayliss and Starling) published results showing that the pancreas,
an important digestive organ, was partially under hormonal, rather than neural, control.
Writing some time later, Pavlov’s friend and biographer noted that these novel findings
produced a crisis in the laboratory because they “shook the very foundation of the teach-
ings of the exclusive nervous regulation of the secretory activity of the digestive glands”
(Babkin, 1949, p. 228).
The evidence of hormonal control of the pancreas presented Pavlov with a dilemma.
If he continued his investigations of digestion, he would have to abandon his interest in
the nervous system. On the other hand, if he maintained his commitment to nervism, he
would have to stop studying digestive physiology. Nervism won out. In an effort to con-
tinue studying the nervous system, Pavlov changed from studying digestive physiology to
studying the conditioning of new reflexes. Pavlov regarded his investigations of condi-
tioned or learned reflexes to be studies of the functions of the nervous system—what
the nervous system accomplishes. Pavlov’s claim that studies of learning tell us about
the functions of the nervous system is well accepted by contemporary neuroscientists.
For example, in their comprehensive textbook, Fundamental Neuroscience, Lynch and
colleagues (2003) noted that “neuroscience is a large field founded on the premise that
all of behavior and all of mental life have their origins in the structure and function of
the nervous system” (p. xvii).
The behavioral psychologist is like a driver who examines an experimental car by
taking it out for a test drive instead of first looking under the hood. By driving the car,
the scientist can learn a great deal about how it functions. She can discover its accelera-
tion, its top speed, the quality of its ride, its turning radius, and how quickly it comes to
a stop. Driving the car will not reveal how these various functions are accomplished, but
one can get certain clues. For example, if the car accelerates sluggishly and never reaches
high speeds, chances are it is not powered by a rocket engine. If the car only goes forward
when facing downhill, it is probably propelled by gravity rather than by an engine. On the
other hand, if the car cannot be made to come to a stop quickly, it may not have brakes.
In a similar manner, behavioral studies of learning provide clues about the machin-
ery of the nervous system. Such studies tell us what kinds of plasticity the nervous
system is capable of, the conditions under which learning can take place, how long
learned responses persist, and the circumstances under which learned information
becomes accessible or not. By detailing the functions of the nervous system, behavioral
studies of learning provide the basic facts or behavioral endpoints that neuroscientists
have to explain at more molecular and biological levels of analysis.
Animal Models of Human Behavior
The third major impetus for the modern era in the study of animal learning was the
belief that research with nonhuman animals can provide information that may help us
better understand human behavior. Animal models of human behavior are of more
recent origin than comparative cognition or functional neurology. The approach was
systematized by Dollard and Miller and their collaborators (Dollard et al., 1939;
Miller & Dollard, 1941) and developed further by B. F. Skinner (1953).
Drawing inferences about human behavior on the basis of research with other ani-
mal species can be hazardous and controversial. The inferences are hazardous if they are
unwarranted; they are controversial if the rationale for the model system approach is
poorly understood. Model systems have been developed based on research with a variety
of species, including several species of primates, pigeons, rats, and mice.
In generalizing from research with rats and pigeons to human behavior, one does
not make the assumption that rats and pigeons are like people. Animal models are like
other types of models. Architects, pharmacologists, medical scientists, and designers of
automobiles all rely on models, which are often strikingly different from the real thing.
Architects, for example, make models of buildings they are designing. Obviously, such
models are not the same as a real building. They are much smaller, made of cardboard
and small pieces of wood instead of bricks and mortar, and support little weight.
Models are commonly used because they permit investigation of certain aspects of
what they represent under conditions that are simpler, more easily controlled, and less
expensive. With the use of a model, an architect can study the design of the exterior of
a planned building without the expense of actual construction. The model can be used to
determine what the building will look like from various vantage points and how it will
appear relative to other nearby buildings. Studying a model in a design studio is much
simpler than studying an actual building on a busy street corner. Factors that may get
in the way of getting a good view (e.g., other buildings, traffic, and power lines) can be
controlled and minimized in a model.
In a comparable fashion, a car designer can study the wind resistance of various
design features of a new automobile with the use of a model in the form of a computer
program. The program can be used to determine how the addition of spoilers or changes
in the shape of the car will change its wind resistance. The computer model bears little
resemblance to a real car. It has no tires or engine and cannot be driven. However, the
model permits testing the wind resistance of a car design under conditions that are much
simpler, better controlled, and less expensive than if the actual car were built and driven
down the highway under various conditions to measure wind resistance.
Considering all the differences between a model and the real thing, what makes a
model valid for studying something? To decide whether a model is valid, one first has
to identify what features or functions of the real object one is most interested in. These
are called the relevant features or relevant functions. If the model of a building is used to
study the building’s exterior appearance, then all the exterior dimensions of the model
must be proportional to the corresponding dimensions of the planned building. Other
features of the model, such as its structural elements, are irrelevant. In contrast, if the
model is used to study how well the building would withstand an earthquake, then its
structural elements (beams and how the beams are connected) would be critical.
In a similar manner, the only thing relevant in a computer model of car wind resis-
tance is that the computer program provides calculations for wind resistance that match
the results obtained with real cars driven through real air. No other feature is relevant;
therefore, the fact that the computer program lacks an engine or rubber tires is of no
consequence.
Models of human behavior using other animal species are based on the same logic
as the use of models in other domains. Animal models permit investigating problems
that are difficult, if not impossible, to study directly with people. A model permits the
research to be carried out under circumstances that are simpler, better controlled, and
less expensive. The validity of animal models is based on the same criterion as the valid-
ity of other types of models. The key is similarity between the animal model and human
behavior in features that are relevant to the problem at hand. The relevant feature is a
behavioral trait or function, such as drug addiction in laboratory rats. The fact that the
rats have long tails and walk on four legs instead of two is entirely irrelevant to the issue
of drug addiction.
The critical task in constructing a successful animal model is to identify the relevant
points of similarity between the animal model and the human behavior of interest. The
relevant similarity concerns the causal factors that are responsible for particular forms of
behavior. We can gain insights into human behavior based on the study of nonhuman
animals if the causal relations in the two species are similar. Because animal models are
often used to push back the frontiers of knowledge, the correspondence between the ani-
mal findings and human behavior has to be carefully verified by empirical data. This
interaction between animal and human research continues to make important contribu-
tions to our understanding of human behavior (e.g., Haselgrove & Hogarth, 2012).
Applications of learning principles got a special boost in the 1960s with the acceler-
ated development of behavior therapy during that period. As O’Donohue commented,
“the model of moving from the learning laboratory to the clinic proved to be an extraordi-
narily rich paradigm. In the 1960s, numerous learning principles were shown to be rele-
vant to clinical practice. Learning research quickly proved to be a productive source of
ideas for developing treatments or etiological accounts of many problems” (1998, p. 4).
This fervor was tempered during subsequent developments of cognitive behavior therapy.
However, recent advances in learning theory have encouraged a return to learning-based
treatments for important human problems such as anxiety disorders, autism spectrum
disorders, and drug abuse and treatment (Schachtman & Reilly, 2011).
Animal Models and Drug Development
Whether we visit a doctor because we have a physical or psychiatric illness, we are likely
to go away with a prescription to alleviate our symptoms. Pharmaceutical companies are
eager to bring new drugs to market and to develop drugs for symptoms that were previ-
ously treated in other ways (e.g., erectile dysfunction). Drug development is not possible
without animal models. The animal learning paradigms described in this text are espe-
cially important for developing new drugs to enhance learning and cognition. As people
live longer, cognitive decline with aging is becoming more prevalent, and that is creating
increased demand for drugs to slow the decline. Animal models of learning and memory
are playing a central role in the development of these new drugs. Animal models are also
important for the development of antianxiety medications and drugs that facilitate the
progress of behavior and cognitive therapy (e.g., Otto et al., 2010; Gold, 2008). Another
important area of research is the evaluation of the potential for drug abuse associated
with new medications for pain relief and other medical problems (e.g., Ator & Griffiths,
2003). Many of these experiments employ methods described in this book.
Animal Models and Machine Learning
Animal models of learning and behavior are also of considerable relevance to robotics
and intelligent artificial systems (machine learning). Robots are machines that are able
to perform particular functions or tasks. The goal in robotics is to make the machines
as “smart” as possible. Just as Romanes defined intelligence in terms of the ability to
learn, contemporary roboticists view the ability to remember and learn from experience
to be important features of smart artificial systems. Information about the characteristics
and mechanisms of such learning may be gleaned from studies of learning in nonhuman
animals. Learning mechanisms are frequently used in artificial intelligent systems to enable
the response of those systems to be altered by experience or feedback. A prominent
approach in this area is “reinforcement learning” (e.g., Busoniu, Babuska, De Schutter, &
Ernst, 2010), which originated in behavioral studies of animal learning.
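For readers curious about the computational side, the following is a minimal sketch of tabular Q-learning, one common formulation of reinforcement learning. The two-state environment, reward rule, and parameter values are invented for illustration and do not come from the sources cited above.

import random

n_states, n_actions = 2, 2
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration

def step(state, action):
    """Hypothetical environment: only action 1 in state 1 pays off."""
    reward = 1.0 if (state == 1 and action == 1) else 0.0
    next_state = random.randint(0, n_states - 1)
    return reward, next_state

state = 0
for _ in range(5000):
    # Epsilon-greedy choice: mostly exploit current estimates, sometimes explore.
    if random.random() < epsilon:
        action = random.randint(0, n_actions - 1)
    else:
        action = Q[state].index(max(Q[state]))
    reward, next_state = step(state, action)
    # Move the estimate toward the reward plus discounted future value.
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

print(Q)  # Q[1][1] should emerge as the largest entry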
The Definition of Learning
Learning is such a common human experience that people rarely reflect on exactly what
it means. A universally accepted definition of learning does not exist. However, many
important aspects of learning are captured in the following statement:
Learning is an enduring change in the mechanisms of behavior involving specific
stimuli and/or responses that results from prior experience with those or similar
stimuli and responses.
This definition may seem cumbersome, but each of its components serves to convey an
important feature of learning.
The Learning–Performance Distinction
Whenever we see evidence of learning, we see the emergence of a change in behavior—
the performance of a new response or the suppression of a response that occurred previ-
ously. A child becomes skilled in snapping the buckles of his or her sandals or becomes
more patient in waiting for the popcorn to cook in the microwave oven. Such changes in
behavior are the only way we can tell whether or not learning has occurred. However,
notice that the preceding definition attributes learning to a change in the mechanisms
of behavior, not to a change in behavior directly.
Why should we define learning in terms of a change in the mechanisms of behavior?
The main reason is that behavior is determined by many factors in addition to learning.
Consider, for example, eating. Whether you eat something depends on how hungry you
are, how much effort is required to obtain the food, how much you like the food, and
whether you know where to find food. Only some of these factors involve learning.
Performance refers to all of the actions of an organism at a particular time.
Whether an animal does something or not (its performance) depends on many things,
as in the above example of eating. Therefore, a change in performance cannot be auto-
matically considered to reflect learning. Learning is defined in terms of a change in the
mechanisms of behavior to emphasize the distinction between learning and performance.
Because performance is determined by many factors in addition to learning, one must be
very careful in deciding whether a particular aspect of performance does or does not
reflect learning. Sometimes evidence of learning cannot be obtained until special test
procedures are introduced. Children, for example, learn a great deal about driving a car
just by watching others drive, but this learning is not apparent until they are permitted
behind the steering wheel. In other cases (discussed in the next section), a change in
behavior is readily observed but cannot be attributed to learning because it does not
last long enough or does not result from experience with specific environmental events.
Learning and Other Sources of Behavior Change
Several mechanisms produce changes in behavior that are too short-lasting to be consid-
ered instances of learning. One such process is fatigue. Physical exertion may result in a
gradual reduction in the vigor of a response because the individual becomes tired. This
type of change is produced by experience. However, it is not considered an instance of
learning because the decline in responding disappears if the individual is allowed to rest
for a while.
Behavior also may be temporarily altered by a change in stimulus conditions. If the
house lights in a movie theater suddenly come on in the middle of the show, the behav-
ior of the audience is likely to change dramatically. However, this is not an instance of
learning because the audience is likely to return to watching the movie when the house
lights are turned off again.
Other short-term changes in behavior that are not considered learning involve
alterations in the physiological or motivational state of the organism. Hunger and thirst
induce responses that are not observed at other times. Changes in the level of sex
hormones cause changes in responsiveness to sexual stimuli. Short-lasting behavioral
effects also accompany the administration of psychoactive drugs.
In some cases, persistent changes in behavior occur, but without the type of
experience with environmental events that satisfies the definition of learning. The most
obvious example of this type is maturation. A child cannot get something from a high
shelf until he or she grows tall enough. However, the change in behavior in this case is
not an instance of learning because it occurs with the mere passage of time. The child
does not have to be trained to reach high places as he or she becomes taller.
Generally, the distinction between learning and maturation is based on the impor-
tance of specific experiences in producing the behavior change of interest. Maturation
occurs in the absence of specific training or practice. However, the distinction is blurred
in cases where environmental stimulation is necessary for maturational development.
Experiments with cats, for example, have shown that the visual system will not develop
sufficiently to permit perception of horizontal lines unless the cats have been exposed to
such stimuli early in life (e.g., Blakemore & Cooper, 1970). The appearance of sexual
behavior at puberty also depends on developmental experience, in this case social play
before puberty (e.g., Harlow, 1969).
Learning and Levels of Analysis
Because of its critical importance in everyday life, learning is being studied at many different
levels of analysis (Byrne, 2008). Some of these are illustrated in Figure 1.5. Our emphasis will
be on analyses of learning at the level of behavior. The behavioral level of analysis is rooted
in the conviction that the function of learning is to facilitate an organism’s interactions
with its environment. We interact with our environment primarily through our actions.
Therefore, the behavioral level of analysis occupies a cardinal position.
Much research on learning these days is also being conducted at the level of neural
mechanisms (e.g., Gallistel & Matzel, 2013; Kesner & Martinez, 2007; Rudy, 2008). Interest
in the neural mechanisms of learning has been stimulated by tremendous methodological
and technical advances that permit scientists to directly examine biological processes that
previously were only hypothetical possibilities. The neural mechanisms involved in
learning may be examined at the systems level. This level is concerned with how neural
circuits and neurotransmitter systems are organized to produce learned responses. Neural
mechanisms may also be examined at the level of individual neurons and synapses, with
an emphasis on molecular and cellular mechanisms, including genetic and epigenetic
mechanisms. (Contemporary research on the neurobiology of learning will be presented in
“neuroboxes” in each chapter of the book.)
Periodically, we will describe changes in learning that occur as a function of age.
These are referred to as developmental changes. We will also consider how learning
helps animals adapt to their environment and increase their success in reproducing
and passing along their genes to future generations. These issues involve the adaptive
significance of learning. Most scientists agree that learning mechanisms evolved because
they increase reproductive fitness. The contribution of learning to reproductive fitness is
often indirect. By learning to find food more efficiently, for example, an organism may
live longer and have more offspring. However, learning can also have a direct effect on
reproductive fitness. Studies of sexual conditioning have shown that learning can
increase how many eggs are fertilized and how many offspring are produced as a result
of a sexual encounter (Domjan et al., 2012).
Methodological Aspects of the Study of Learning
There are two prominent methodological issues that are important to keep in mind
when considering behavioral studies of learning. The first of these is a direct conse-
quence of the definition of learning and involves the exclusive use of experimental
research methods. The second methodological feature of studies of learning is reliance
on a general-process approach. Reliance on a general-process approach is a matter of
intellectual preference rather than a matter of necessity.
Learning as an Experimental Science
Studies of learning focus on identifying how prior experience causes long-term changes
in behavior. At the behavioral level, this boils down to identifying the critical compo-
nents of training or conditioning protocols that are required to produce learning.
The emphasis on identifying causal variables necessitates an experimental approach.
FIGURE 1.5 Levels of analysis of learning. Learning mechanisms may be investigated at the organismic level, at the level of neural circuits and transmitter systems, and at the level of nerve cells or neurons. Each level of investigation corresponds to a type of learning mechanism: behavioral (whole organism), neural system or network (neural circuits and neurotransmitters), and molecular, cellular, and genetic (neurons and synapses).
BOX 1.1
The Material Mind
During the last 30 years, a new era has
emerged in the study of learning,
detailing the neurobiological
mechanisms that underlie key
learning phenomena. Some of these
discoveries will be highlighted in
boxes that appear in each chapter.
These boxes will introduce you to
new research that has revealed how
and where learning occurs within
the nervous system. We begin by
discussing the relation between
behavioral and biological studies and
by reviewing some key terms and
concepts needed to understand the
material presented in subsequent
boxes dealing with neural mechan-
isms (for additional details, see Kalat,
2009, and Prus, 2014).
Our aim in this book is to under-
stand learning, to elucidate what it is,
when and why it occurs, and how it is
produced. In short, we want to know
its cause, and this, in turn, requires
that we look at learning from multiple
perspectives. As in any field of
enquiry, the first task involves
describing the phenomenon and the
circumstances under which it is
observed, what Aristotle called the
efficient cause (Killeen, 2001). As we
gain insight into the phenomenon, we
gain the knowledge needed to develop
formal models (the formal cause) of
the process that allow us to predict,
for example, when learning will
occur. At the same time, we seek a
context in which to understand our
observations, a form of explanation
that focuses on why a phenomenon is
observed (the final cause). Finally,
we hope to detail the underlying
biological mechanisms of learning
(the material cause).
The foundation of learning rests
on the elucidation of its efficient
cause, work that has allowed us to
understand how experience can
engage distinct forms of learning, the
types of processes involved, and their
long-term behavioral consequences.
We now understand that learning
does not depend on contiguity alone,
and we have developed formal models
of this process. As we will see,
researchers have also discovered that
whether learning occurs is wedded to
the “why”—the evolutionary benefit
gained from preparing the organism
to learn in particular ways. A century
of work has provided detailed answers
to the efficient, formal, and final
causes of learning. This has provided
the basis for much of the content of
this book and has prepared us to begin
to investigate the material cause of
learning—its neurobiological basis.
Our discussion of the material
cause assumes little background in
neurobiology and will introduce
biological terms as needed. There are,
however, some overarching concepts
that are essential, and, for that reason,
I begin by providing a brief overview
of how the underlying machinery (the
neurons) operates and is organized.
Your nervous system entails both a
central component (the brain and spi-
nal cord) and a peripheral component.
Both of these are composed of neurons
that transmit signals that relay sensory
information, process it, and execute
motor commands. Key components of
a neuron are illustrated in Figure (i) on
the inside front cover. Neural com-
munication begins at the dendrites,
which contain specialized receptors
that transform input from another cell
(a sensory receptor or neuron) into an
electrical impulse. This is accomplished
by allowing sodium ions (Na+) to enter the cell. Under normal circumstances, neurons are impermeable to Na+, and if any leaks in, it is actively pumped out of the cell (into the extracellular space). Because Na+ has a positive charge, the excess of Na+ outside the cell sets up an electrical charge across the cell membrane (Figure 1.6A). Think of this as a miniature battery, with the positive side oriented toward the extracellular space. This battery-like effect establishes a small difference in voltage, with the inside of the cell lying approximately −70 mV below the outside.
Channels positioned in the dendritic cell membrane are designed to allow Na+ into the cell. The circumstances that open these channels depend on the cell's function. For example, the Na+ channels on a sensory neuron may be engaged by a mechanoreceptor, whereas a cell that receives input from another neuron could have receptors that are activated by a neurotransmitter. In either case, opening the channel allows Na+ to rush into the cell (Figure 1.6B). This movement of positive charge depolarizes the interior of the cell, causing the voltage to move from −70 mV toward zero (a process known as depolarization). Other Na+ channels are voltage sensitive and swing open once the cell is sufficiently depolarized, allowing even more Na+ to rush into the cell. Indeed, so much Na+ rushes into the cell (pushed by an osmotic force that acts to equate the concentration gradient) that the interior actually becomes positive (Figure 1.6C). At this point, the Na+ channels close and channels that regulate the ion potassium (K+) open. At rest, the interior of a cell contains an excess of K+, which is repelled by the positively charged Na+ outside the cell. But when Na+ rushes in, the game is switched and now it is the inside of the cell that has a positive charge. This allows K+ to flow down its concentration gradient and exit the cell. As it does so, the inside of the cell becomes negative again. These ionic changes set off a chain reaction that
FIGURE 1.6 (A) At rest, the distribution of electrically charged particles across the cell membrane establishes a charge of approximately −70 mV. (Adapted from Gleitman, 2010.) (B) A sodium channel regulates the flow of Na+. (© Cengage Learning 2015) (C) Allowing Na+ to flow into a neuron reduces (depolarizes) the internal voltage of the cell, which engages electrically sensitive Na+ channels. Na+ rushing into the cell produces a rapid rise in voltage; K+ flowing out of the cell acts to reestablish the resting potential. (Adapted from Carlson, 2012.) (D) The major structures of the central nervous system from a side (lateral) and middle (medial) perspective. The brainstem lies above the spinal cord and includes the medulla, pons, and midbrain. The forebrain lies directly above and includes the diencephalon (thalamus and hypothalamus) and cerebral cortex. The cerebellum lies under the cerebral cortex, behind the pons. (Adapted from Kandel et al., 2013.)
travels across the cell membrane to the cell body (a region that contains the genes and much of the biological machinery needed to maintain the cell). From there, the electrical impulse (the action potential) travels down the axon to its end, where the cell forms a chemical connection (a synapse) with another cell (Figure (ii) on the inside front cover).
When an action potential arrives at a synapse, it causes channels that are permeable to the ion Ca++ to open, allowing Ca++ to flow into the cell. The increase in intracellular Ca++ causes the vesicles that contain the neurotransmitter to migrate over to the presynaptic neural membrane and dump their contents into the space (the synaptic cleft) between the cells. The transmitter then engages receptors on the postsynaptic cell, which could be another neuron or an effector organ (e.g., a muscle).
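The firing sequence just described can be caricatured with a simple integrate-and-fire sketch. All constants below are illustrative placeholders; real neurons follow the richer Na+/K+ dynamics outlined above.

resting, threshold, spike_peak = -70.0, -55.0, 40.0  # millivolts
v = resting
leak = 0.1  # fraction of the deviation from rest that decays per step

inputs = [0, 4, 5, 6, 7, 0, 0, 2, 9, 9, 0]  # depolarizing input per step (mV)
for t, i in enumerate(inputs):
    v += i                     # input depolarizes the membrane
    v -= leak * (v - resting)  # leak pulls the voltage back toward rest
    if v >= threshold:
        print(f"t={t}: action potential (peak ~{spike_peak} mV)")
        v = resting            # K+ efflux restores the resting potential
    else:
        print(f"t={t}: membrane potential {v:.1f} mV")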
Experience can modify how a
neuron operates, providing a kind of
adaptability (plasticity) that allows
learning. As we will see, there are
multiple forms of neural plasticity.
For example, a neural input to the
presynaptic side of a terminal can
augment transmitter release (Box
2.2). In addition, cellular processes
can augment (potentiate) or depress
the postsynaptic response to trans-
mitter release (Box 8.2). In addition,
there are many different kinds of
transmitters, some of which (e.g.,
glutamate) are excitatory and elicit an
action potential while others (e.g.,
GABA) inhibit neural activity.
Another important neurobiologi-
cal principle concerns the structure of
the nervous system: how neurons are
organized to form units and pathways
to perform particular functions
(Figure 1.6D). In creatures without a
spinal cord (invertebrates), neurons
are organized into bundles known as
ganglia. In vertebrates, ganglia are
found in the peripheral nervous sys-
tem (the portion that lies outside the
bony covering of the spine and skull).
The central nervous system (the spinal
cord and brain) lies within the bony
covering. As we will see, the brain is
organized into structures and nuclei
(a set of neurons within a structure)
that mediate particular functions
(e.g., fear conditioning [Box 4.3],
reward learning [Boxes 6.3 and 7.1],
and timing [Box 12.1]).
A wide variety of methods have
been used to explore the neurobiologi-
cal mechanisms that underlie learning,
including genetic techniques that target
particular genes, electrophysiological
and imaging procedures that map
neural activity, treatments that engage
(stimulate) or disrupt (lesion) neurons
in specific regions, and pharmacologi-
cal procedures that target specific
receptors and intracellular processes.
Together, the research has revealed
some surprising commonalities
regarding the material causes of learn-
ing. For example, as we will see, a
specialized postsynaptic receptor (the
N-methyl-D-aspartate [NMDA] receptor)
contributes to nearly every instance
of learning we will discuss. At the same
time, we will also learn that the
organization of behavior is derived
from a neural system that has parti-
tioned the task of learning across
multiple components, each tuned to
perform a particular function. The
research we will describe in the book
provides the map needed to link these
neural mechanisms to behavioral
systems.
J. W. Grau
action potential An electrical impulse
caused by the rapid flow of charged particles
(ions) across the neural membrane. The
nerve impulse conducts an electrical signal
along the axon of a neuron and initiates the
release of neurotransmitter at the synapse.
axon A slender projection of a neuron
that allows electrical impulses to be con-
ducted from the cell body to the terminal
ending.
dendrites The branched projections of a
neuron that receive electrochemical input
from other cells (e.g., sensory receptors or
neurons).
depolarization A reduction in the elec-
trical charge across the neural membrane,
typically caused by the inward flow of the
ion Na+. Depolarization causes the inside
of the neuron to be less negative, which
can initiate an action potential.
neuron A specialized cell that functions to
transmit, and process, information within
the nervous system by means of electrical
and chemical signals.
neurotransmitter A chemical released by
a neuron at a synapse. Neurotransmitters
allow communication across cells and can
have either an excitatory or inhibitory effect.
synapse A structure that allows a neuron
to pass a chemical signal (neurotransmit-
ter) to another cell.
Consider the following example. Mary goes into a dark room. She quickly turns on a switch near the door and the lights in the room go on. Can you conclude that turning on the switch "caused" the lights to go on? Not from the information provided. Perhaps the lights were on an automatic timer and would have come on without Mary's actions. Alternatively, the door may have had a built-in switch that turned on the lights after a slight delay. Another possibility is that there was a motion detector in the room that activated the lights when Mary entered.
How could you determine that manipulation of the wall switch caused the lights to go
on? You would have to evaluate various scenarios to test the causal model. For example,
you might ask Mary to enter the room again but ask her not to turn on the wall switch. If
the lights do not go on under those circumstances, you could conclude that the lights were
not turned on by a motion detector or by a switch built into the door. As this simple
example illustrates, to identify a cause, an experiment has to be conducted in which the
presumed cause is removed. The results obtained with and without the presumed cause are
then compared.
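This with-versus-without logic can be expressed as a short program. The sketch below is purely illustrative: the causal rule stands in for the room's unknown wiring, and the function and variable names are invented for this example.

# Minimal sketch of the causal test described above. The rule below
# stands in for the room's unknown wiring; it is an assumption for
# illustration, not a claim about any real circuit.

def lights_on(switch_flipped: bool, person_enters: bool) -> bool:
    """Hypothetical world in which only the wall switch controls the lights."""
    return switch_flipped

# Observe the outcome with the presumed cause present and with it removed,
# holding everything else (Mary entering the room) constant.
with_cause = lights_on(switch_flipped=True, person_enters=True)
without_cause = lights_on(switch_flipped=False, person_enters=True)

# The causal model survives only if the outcome tracks the presumed cause.
print("Switch causes lights:", with_cause and not without_cause)  # True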
In the study of learning, the behavior of living organisms is of interest, not the
behavior of lights. However, scientists have to proceed in a similar fashion. They have
to conduct experiments in which behavior is observed with and without the presumed
causal factor. The most basic question is to identify whether a training procedure pro-
duces a particular behavior change. To answer this question, individuals who receive
the training procedure have to be compared with individuals who do not receive that
training. This requires experimentally varying the presence versus absence of the training
experience. Because of this, learning can be investigated only with experimental methods.
This makes the study of learning primarily a laboratory science.
The necessity of using experimental techniques to investigate learning is not adequately
appreciated by allied scientists. Many aspects of behavior can be studied with observational
techniques that do not involve experimental manipulations. For example, observational stud-
ies can provide a great deal of information about when and how animals set up territories,
what they do to defend those territories, how they engage in courtship and sexual behaviors,
how they raise their offspring, and how the activities of the offspring change as they mature.
Much fascinating information about animals has been obtained with observational tech-
niques that involve minimal intrusion into their ongoing activities. Unfortunately, learning
cannot be studied that way. To be sure that a change in behavior is due to learning rather
than changes in motivation, sensory development, hormonal fluctuations, or other possible
nonlearning mechanisms, it is necessary to conduct experiments in which the presumed
training experiences are systematically manipulated. The basic learning experiment compares
two groups of participants (Figure 1.7). The experimental group receives the training proce-
dure of interest, and how this procedure changes behavior is measured. The performance of
the experimental group is compared with a control group that does not receive the training
procedure but is otherwise treated in a similar fashion. Learning is presumed to have taken
place if the experimental group responds differently from the control group. A similar ratio-
nale can be used to study learning in a single individual provided that one can be certain
that the behavior is stable in the absence of a training intervention.
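The two-group design can also be summarized in a brief simulation. The following Python sketch is illustrative only; the baseline level, training effect, variability, and group size are hypothetical values, not data from any actual experiment.

import random
import statistics

random.seed(1)  # fixed seed so the illustration is reproducible

# Hypothetical parameters, chosen only for illustration.
BASELINE = 10.0        # assumed mean response level without training
TRAINING_EFFECT = 4.0  # assumed change produced by the training procedure
NOISE_SD = 2.0         # assumed individual variability
N_PER_GROUP = 20

def simulate_participant(trained: bool) -> float:
    """Return one individual's response measure, with or without training."""
    mean = BASELINE + (TRAINING_EFFECT if trained else 0.0)
    return random.gauss(mean, NOISE_SD)

experimental = [simulate_participant(trained=True) for _ in range(N_PER_GROUP)]
control = [simulate_participant(trained=False) for _ in range(N_PER_GROUP)]

# Learning is inferred from the difference between the groups, because the
# two groups are treated alike except for the presence of training.
difference = statistics.mean(experimental) - statistics.mean(control)
print(f"Experimental mean: {statistics.mean(experimental):.2f}")
print(f"Control mean:      {statistics.mean(control):.2f}")
print(f"Difference attributed to training: {difference:.2f}")

The single-subject version shown in the right panel of Figure 1.7 follows the same logic, except that the comparison condition is the individual's assumed behavior without training.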
The General-Process Approach to the Study of Learning
The second prominent methodological feature of studies of learning is the use of
a general-process approach. This is more a matter of preference than of necessity.
FIGURE 1.7 Two versions of the fundamental learning experiment. In the left panel, two groups of individuals are compared: the training procedure is provided for participants in the experimental group but not for those in the control group. In the right panel, a single individual is observed before and during training, and the individual’s behavior during training is compared to what we assume its behavior would have been without training. [Both panels plot Behavior against Training trials; the right panel marks the start of training and the assumed behavior without training.] (© Cengage Learning)
However, in adopting a general-process approach, investigators of animal learning are
following a long-standing tradition in science.
Elements of the General-Process Approach The most obvious feature of nature is
its diversity. Consider, for example, the splendid variety of minerals that exist in the
world. Some are soft, some are hard, some are brilliant in appearance, others are dull,
and so on. Plants and animals also occur in many different shapes and sizes. Dynamic
properties of objects are also diverse. Some things float, whereas others rapidly drop to
the ground; some remain still; others remain in motion.
In studying nature, one can either focus on differences or try to ignore the differ-
ences and search for commonalities. Scientists ranging from physicists to chemists, biol-
ogists, and psychologists have all elected to search for commonalities. Rather than being
overwhelmed by the tremendous variety of things in nature, scientists have opted to look
for uniformities. They have attempted to formulate general laws with which to organize
and explain the diversity of the universe. Investigators of animal learning have followed
this well-established tradition.
Whether or not general laws are discovered often depends on the level of analysis
that is pursued. The diversity of the phenomena scientists try to understand and orga-
nize makes it difficult to formulate general laws at the level of the observed phenomena.
It is difficult, for example, to discover the general laws that govern chemical reactions by
simply documenting the nature of the chemicals involved in various reactions. Similarly,
it is difficult to explain the diversity of species in the world by cataloging the features of
various animals. Major progress in science comes from analyzing phenomena at a more
elemental or molecular level. For example, by the nineteenth century, chemists knew
many specific facts about what would happen when various chemicals were combined.
However, a general account of chemical reactions had to await the development of the
periodic table of the elements, which organized chemical elements in terms of their con-
stituent atomic components.
Investigators of conditioning and learning have been committed to the general-
process approach from the inception of this field of psychology. They have focused on
the commonalities of various instances of learning and have assumed that learning phe-
nomena are products of elemental processes that operate in much the same way in dif-
ferent learning situations.
The commitment to a general-process approach guided Pavlov’s work on functional
neurology and conditioning. Commitment to a general-process approach to the study of
learning is also evident in the writings of early comparative psychologists. For example,
Darwin (1897) emphasized commonalities among species in cognitive functions: “My
object … is to show that there is no fundamental difference between man and the higher
mammals in their mental faculties” (p. 66). At the start of the twentieth century, Jacques
Loeb (1900) pointed out that commonalities occur at the level of elemental processes:
“Psychic phenomena … appear, invariably, as a function of an elemental process, namely
the activity of associative memory” (p. 213). Another prominent comparative psycholo-
gist of the time, C. Lloyd Morgan, stated that elementary laws of association “are, we
believe, universal laws” (Morgan, 1903, p. 219).
The assumption that “universal” elemental laws of association are responsible for
learning phenomena does not deny the diversity of stimuli different animals may learn
about, the diversity of responses they may learn to perform, and the fact that one species
may learn something more slowly than another. The generality is assumed to exist in the
rules or processes of learning—not in the contents or speed of learning. This idea was
clearly expressed nearly a century ago by Edward Thorndike, one of the first prominent
American psychologists who studied learning:
Formally, the crab, fish, turtle, dog, cat, monkey, and baby have very similar intellects
and characters. All are systems of connections subject to change by the laws of exercise
and effect. The differences are: first, in the concrete particular connections, in what sti-
mulates the animal to response, what responses it makes, which stimulus connects with
what response, and second, in the degree of ability to learn. (Thorndike, 1911, p. 280)
What an animal can learn about (the stimuli, responses, and stimulus-response con-
nections it learns) varies from one species to another. Animals also differ in how fast
they learn (“in the degree of ability to learn”). However, Thorndike assumed that the
rules of learning were universal. We no longer share Thorndike’s view that these univer-
sal rules of learning are the “laws of exercise and effect.” However, contemporary scien-
tists continue to embrace the idea that universal rules of learning exist. The job of the
learning psychologist is to discover those universal laws.
Methodological Implications of the General-Process Approach If we assume
that universal rules of learning exist, then we should be able to discover those rules in
any situation in which learning occurs. Thus, an important methodological implication
of the general-process approach is that general rules of learning may be discovered by
studying any species or response system that exhibits learning. This implication has
encouraged scientists to study learning in a small number of experimental situations.
Investigators have converged on a few “standard” or conventional experimental para-
digms. Figure 1.8, for example, shows an example of a pigeon in a standard Skinner
box. I will describe other examples of standard experimental paradigms as I introduce
various learning phenomena in future chapters.
Conventional experimental paradigms have been fine-tuned over the years to fit well
with the behavioral predispositions of the research animals. Because of these improve-
ments, conventional experimental preparations permit laboratory study of reasonably
naturalistic responses (Timberlake, 1990).
FIGURE 1.8 A pigeon in a standard Skinner box. Three circular disks, arranged at eye level, are available for the bird to peck. Access to food is provided in the hopper below. (© Cengage Learning)
Proof of the Generality of Learning Phenomena The generality of learning pro-
cesses is not proven by adopting a general-process approach. Assuming the existence of
common elemental learning processes is not the same as empirically demonstrating those
commonalities. Direct empirical verification of the existence of common learning pro-
cesses in a variety of situations remains necessary in efforts to build a truly general
account of how learning occurs. The generality of learning processes has to be proven
by studying learning in many different species and situations.
The available evidence suggests that elementary principles of learning of the sort
that will be described in this text have considerable generality (Papini, 2008). Most
research on animal learning has been conducted with pigeons, rats, and (to a much lesser
extent) rabbits. Similar forms of learning have been found with fish, hamsters, cats, dogs,
human beings, dolphins, and sea lions. In addition, some of the principles of learning
observed with these vertebrate species also have been demonstrated in newts, fruit flies,
honeybees, terrestrial mollusks, wasps, and various marine mollusks.
Use of Nonhuman Animals in Research on Learning
Although the principles described in this book apply to people, many of the experiments
we will be considering have been conducted with nonhuman animals. The experiments
involved laboratory animals for both theoretical and methodological reasons.
Rationale for the Use of Nonhuman Animals in Research on Learning
As I noted earlier, experimental methods have to be used to study learning phenomena
so that the acquisition of new behaviors can be attributed to particular previous train-
ing experiences. Experimental control of past experience cannot always be achieved
with the same degree of precision in studies with human participants as in studies
with laboratory animals. In addition, with laboratory animals, scientists can study
how strong emotional reactions are learned and how learning is involved in acquiring
food, avoiding pain or distress, or finding potential sexual partners. With human par-
ticipants, investigators can study how maladaptive emotional responses (e.g., fears and
phobias) may be reduced, but they cannot experimentally manipulate how these emo-
tions are learned in the first place.
Knowledge of the evolution and biological bases of learning also cannot be obtained
without the use of nonhuman animals in research. How cognition and intelligence
evolved is one of the fundamental questions about human nature. The answer to this
question will shape our view of what it means to be human, just as knowledge of the
solar system shaped our view of the place of the Earth in the universe. As I have dis-
cussed, investigation of the evolution of cognition and intelligence rests heavily on stud-
ies of learning in nonhuman animals.
Knowledge of the neurobiological bases of learning may not change our views of
human nature, but it is apt to yield important dividends in the treatment of learning and
memory disorders. Many of the detailed investigations that are necessary to unravel how
the nervous system learns and remembers simply cannot be conducted with people. Study-
ing the neurobiological bases of learning first requires documenting the nature of learning
processes at the behavioral level. Therefore, behavioral studies of learning in animals are a
necessary prerequisite to any animal research on the biological bases of learning.
Laboratory animals also provide important conceptual advantages over people for
studying learning processes. The processes of learning may be simpler in animals reared
under controlled laboratory conditions than in people, whose backgrounds are more
varied and often poorly documented. The behavior of nonhuman animals is not compli-
cated by linguistic processes that have a prominent role in certain kinds of human
behavior. In research with people, one has to make sure that the actions of the partici-
pants are not governed by their efforts to please (or displease) the experimenter. Such
factors are not likely to complicate what rats and pigeons do in an experiment.
Laboratory Animals and Normal Behavior
Some have suggested that domesticated strains of laboratory animals may not provide
useful information because such animals have degenerated as a result of many gen-
erations of inbreeding and long periods of captivity (e.g., Lockard, 1968). However,
this notion is probably mistaken. In an interesting test, Boice (1977) took five male
and five female albino rats of a highly inbred laboratory stock and housed them in
an outdoor pen in Missouri without artificial shelter. All 10 rats survived the first
winter with temperatures as low as −22°F. The animals reproduced normally and
reached a stable population of about 50 members. Only three of the rats died before
showing signs of old age during the two-year study period. Given the extreme climatic
conditions, this level of survival is remarkable. Furthermore, the behavior of these domesti-
cated rats in the outdoors was very similar to the behavior of wild rats observed in similar
circumstances.
Domesticated rats act similarly to wild rats in other tests as well, and there is some
indication that they perform better than wild rats in learning experiments (see, e.g.,
Boice, 1973, 1981; Kaufman & Collier, 1983). Therefore, the results I will describe in
this text should not be discounted because many of the experiments were conducted
with domesticated animals. In fact, it may be suggested that laboratory animals are pref-
erable in research to their wild counterparts. Human beings live in what are largely arti-
ficial environments. Therefore, research may prove most relevant to human behavior if it is carried out with domesticated animals that live in artificial laboratory
situations. As Boice (1973) commented, “The domesticated rat may be a good model for
domestic man” (p. 227).
Public Debate about Research With Nonhuman Animals
There has been much public debate about the pros and cons of research with non-
human animals (see Perry & Dess, 2012, for a recent review). Part of the debate has
centered on the humane treatment of animals. Other aspects of the debate have
centered on what constitutes ethical treatment of animals, whether human beings
have the right to benefit at the expense of animals, and possible alternatives to research
with nonhuman animals.
The Humane Treatment of Laboratory Animals Concern for the welfare of labo-
ratory animals has resulted in the adoption of strict federal standards for animal housing
and for the supervision of animal research. Some argue that these rules are needed because
without them scientists would disregard the welfare of the animals in their zeal to obtain
research data. However, this argument ignores the fact that good science requires good
animal care. Scientists, especially those studying behavior, must be concerned about the
welfare of their research subjects. Information about normal learning and behavior cannot
be obtained from diseased or disturbed animals. Investigators of animal learning must
ensure the welfare of their subjects if they are to obtain useful scientific data.
Learning experiments sometimes involve discomfort. Discomfort is an inevitable aspect
of life for all species, including people. Scientists make every effort to minimize the degree of
discomfort for their research participants. In studies of food reinforcement, for example,
animals are food deprived before each experimental session to ensure their interest in food.
However, the hunger imposed in the laboratory is no more severe than the hunger animals
encounter in the wild, and often it is less severe (Poling, Nickel, & Alling, 1990).
The investigation of certain forms of learning and behavior requires the administra-
tion of aversive stimulation. Important topics, such as punishment or the learning of fear
and anxiety, cannot be studied without some discomfort to the participants. However,
even in such cases, efforts are made to keep the discomfort to a minimum for the
research question at hand.
What Constitutes the Ethical Treatment of Animals? Although making sure that
animals serving in experiments are comfortable is in the best interests of the animals as
well as the research, formulating general ethical principles is difficult. Animal “rights”
cannot be identified in the way we identify human rights (Lansdell, 1988), and animals
seem to have different “rights” under different circumstances.
Currently, substantial efforts are made to house laboratory animals in conditions
that promote their health and comfort. However, a laboratory mouse or rat loses the
protection afforded by federal regulations when it escapes from the laboratory and
takes up residence in the walls of the laboratory building (Herzog, 1988). The trapping
and extermination of rodents in buildings is a common practice that has not been the
subject of either public debate or restrictive federal regulation. Mites, fleas, and ticks are
also animals, but we do not tolerate them in our hair or our homes. Which species have
the right to life, and under what circumstances do they have that right? Such questions
defy simple answers.
Assuming that a species deserves treatment that meets government-mandated stan-
dards, what should those standards be? Appropriate treatment of laboratory animals is
sometimes described as “humane treatment.” However, we have to be careful not to
take this term literally. “Humane treatment” means treating someone as we would treat
a human being. It is important to keep in mind that rats (and other laboratory animals)
are not human beings. Rats prefer to live in dark burrows made of dirt that they never
clean. People, in contrast, prefer to live in well-illuminated and frequently cleaned rooms. Laboratories typically house rats in well-lit rooms that are frequently cleaned. One cannot help but wonder whether these housing standards were dictated more by considerations of human comfort than by rat comfort.
Should Human Beings Benefit From the Use of Animals? Part of the public debate
about animal rights has been fueled by the argument that human beings have no right to
benefit at the expense of animals, that humans have no right to “exploit” animals. This
argument goes far beyond issues concerning the use of animals in research. Therefore,
I will not discuss the argument in detail here, except to point out that far fewer animals
are used in research than are used in the food industry, for clothing, and in recreational
hunting and fishing. In addition, a comprehensive count of human exploitation of
animals has to include disruptions of habitats that occur whenever we build roads, hous-
ing developments, and office buildings. We should also add the millions of animals that
are killed by insecticides and other pest-control efforts in agriculture and elsewhere.
Alternatives to Research With Animals Increased awareness of ethical issues
involved in the use of nonhuman animals in research has encouraged a search for alter-
native techniques. Some years ago, Russell and Burch (1959) formulated the “three Rs”
for animal research: replacement of animals with other testing techniques, reduction of the number of animals used through statistical techniques, and refinement of experimental procedures to cause less suffering. Replacement strategies have been successful in the cosmetic industry and in the manufacture of certain vaccines and hormones (e.g., Mukerjee,
1997). Motivated by this strategy, a recent task force of the National Institutes of Health
on the use of chimpanzees in research (Altevogt et al., 2011) recommended that most
forms of medical research with chimpanzees be terminated because alternative method-
ologies and species have been developed in recent years. Interestingly, however, the task
force identified comparative cognition and behavior as one of only three research areas
that should be continued because no good alternatives to chimpanzees are available for
those investigations. Indeed, many common alternatives to the use of animals in research
are not suitable to study learning processes (Gallup & Suarez, 1985). Some of these alter-
natives are the following.
1. Observational techniques. As I discussed earlier, learning processes cannot be investi-
gated with unobtrusive observational techniques. Experimental manipulation of past
experience is necessary in studies of learning. Therefore, field observations of undis-
turbed animals cannot yield information about the mechanisms of learning.
2. Plants. Learning cannot be investigated in plants because plants lack a nervous sys-
tem, which is required for learning.
3. Tissue cultures. Although tissue cultures may reveal the operation of cellular pro-
cesses, how these cellular processes work in an intact organism can be discovered
only by studying the intact organism. Furthermore, the relevance of a cellular pro-
cess for learning has to be demonstrated by showing how that cellular process oper-
ates to generate learned behavior at the organismic level.
4. Computer simulations. Writing a computer program to simulate a natural phenome-
non requires a great deal of knowledge about that phenomenon. To simulate a par-
ticular form of learning, programmers would first have to obtain detailed knowledge
of the circumstances under which that type of learning occurs and the factors that
influence the rate of that learning. The absence of such knowledge necessitates
experimental research with live organisms. Thus, experimental research with live
organisms is a prerequisite for effective computer simulations. For that reason, com-
puter simulations cannot be used in place of experimental research.
Computer simulations can serve many useful functions in science. Simulations are
effective in showing us the implications of the experimental observations that were pre-
viously obtained or in showing the implications of various theoretical assumptions. They
can be used to identify gaps in knowledge, and they can be used to suggest important
future lines of research. However, they cannot be used to generate new, previously
unknown facts about behavior. As Conn and Parker (1998) pointed out, “Scientists
depend on computers for processing data that we already possess, but can’t use them to
explore the unknown in the quest for new information.”
Sample Questions
1. Describe how historical developments in the
study of the mind contributed to the contempo-
rary study of learning.
2. Describe Descartes’s conception of the reflex and
how the concept of the reflex has changed since his
time.
3. Describe the rationale for using animal models to
study human behavior.
4. Describe the definition of learning and how
learning is distinguished from other forms of
behavior change.
5. Describe the different levels of analysis that can
be employed in studies of learning and how they
are related.
6. Describe why learning requires the use of exper-
imental methods of inquiry.
7. Describe alternatives to the use of animals
in research and their advantages and
disadvantages.
Key Terms
association A connection between the representa-
tions of two events (two stimuli or a stimulus and a
response) such that the occurrence of one of the events
activates the representation of the other.
dualism The view of behavior according to which
actions can be separated into two categories: voluntary
behavior controlled by the mind and involuntary
behavior controlled by reflex mechanisms.
empiricism A philosophy according to which all
ideas in the mind arise from experience.
fatigue A temporary decrease in behavior caused by
repeated or excessive use of the muscles involved in the
behavior.
hedonism The philosophy proposed by Hobbes
according to which the actions of organisms are deter-
mined by the pursuit of pleasure and the avoidance of
pain.
learning An enduring change in the mechanisms of
behavior involving specific stimuli and/or responses
that results from prior experience with similar stimuli
and responses.
maturation A change in behavior caused by physical
or physiological development of the organism in the
absence of experience with particular environmental
events.
nativism A philosophy according to which human
beings are born with innate ideas.
nervism The philosophical position adopted by
Pavlov that all behavioral and physiological processes
are regulated by the nervous system.
nonsense syllable A three-letter combination (two
consonants separated by a vowel) that has no meaning.
performance An organism’s activities at a particular
time.
reflex A mechanism that enables a specific environ-
mental event to elicit a specific response.
CHAPTER 2
Elicited Behavior, Habituation,
and Sensitization
The Nature of Elicited Behavior
The Concept of the Reflex
Modal Action Patterns
Eliciting Stimuli for Modal Action Patterns
The Sequential Organization of Behavior
Effects of Repeated Stimulation
Salivation and Hedonic Ratings of Taste in People
Visual Attention in Human Infants
The Startle Response
Sensitization and the Modulation of Elicited
Behavior
Adaptiveness and Pervasiveness of Habituation
and Sensitization
Habituation Versus Sensory Adaptation and
Response Fatigue
The Dual-Process Theory of Habituation and
Sensitization
Applications of the Dual-Process Theory
Implications of the Dual-Process Theory
Habituation and Sensitization of Emotions and
Motivated Behavior
Emotional Reactions and Their Aftereffects
The Opponent Process Theory of
Motivation
Concluding Comments
Sample Questions
Key Terms
CHAPTER PREVIEW
Chapter 2 begins the discussion of contemporary principles of learning and behavior with a description of
modern research on elicited behavior, behavior that occurs in reaction to specific environmental stimuli.
Many of the things we do are elicited by discrete stimuli, including some of the most extensively
investigated forms of behavior. Elicited responses range from simple reflexes (an eyeblink in response to
a puff of air) to more complex behavior sequences (courtship and sexual behavior) and complex
emotional responses and goal-directed behavior (drug seeking and drug abuse). Interestingly, simple
reflexive responses can be involved in the coordination of elaborate social interactions. Elicited
responses are also involved in two of the most basic and widespread forms of behavioral change:
habituation and sensitization. Habituation and sensitization are important because they are potentially
involved in all learning procedures. They modulate simple elicited responses like the startle response
and are also involved in the regulation of complex emotions and motivated behavior like drug seeking.
We typically think about learning as the result of deliberate instruction and practice. We
spend relatively little time trying to train our pet goldfish or cat. However, we devote
considerable effort to training our children and ourselves to do all sorts of things, such
as driving a car, playing tennis, or operating a new smartphone. On the face of it, people
seem capable of learning a much wider range of skills than goldfish or cats. What is
rarely appreciated, however, is that in all species what and how learning takes place
depends on the preexisting behavioral organization of the organism.
Behavior is not infinitely flexible, easily moved in any direction. Rather, organisms are
born with preexisting behavior systems and tendencies that constrain how learning occurs
and what changes one may expect from a training procedure. These limitations were
described elegantly in an analogy by Rachlin (1976), who compared learning to sculpting
a wooden statue. The sculptor begins with a piece of wood that has little resemblance to a
statue. As the carving proceeds, the piece of wood comes to look more and more like the
final product. But the process is not without limitation since the sculptor has to take into
account the direction and density of the wood grain and any knots the wood may have.
Wood carving is most successful if it is in harmony with the pre-existing grain and knots
of the wood. In a similar fashion, learning is most successful if it takes into account the
preexisting behavior structures of the organism. In this chapter, I describe the most prom-
inent of these behavioral starting points for learning as I describe the fundamentals of
elicited behavior. I will then describe how elicited behavior can be modified by experience
through the processes of habituation and sensitization. These processes are important
to understand before we consider more complex forms of learning because they are poten-
tially involved in all learning procedures.
The Nature of Elicited Behavior
All animals, whether they are single-celled paramecia or complex human beings, react to
events in their environment. If something moves in the periphery of your vision, you are
likely to turn your head in that direction. A particle of food in the mouth elicits saliva-
tion. Exposure to a bright light causes the pupils of the eyes to constrict. Touching a hot
stove elicits a quick withdrawal response. Irritation of the respiratory passages causes
sneezing and coughing. These and similar examples illustrate that much behavior occurs
in response to stimuli. Much of behavior is elicited. In considering the nature of elicited
behavior, we begin by describing its simplest form: reflexive behavior.
The Concept of the Reflex
A light puff of air directed at the cornea makes the eye blink. A tap just below the knee
causes the leg to kick. A loud noise causes a startle reaction. These are all examples of
reflexes. A reflex involves two closely related events: an eliciting stimulus and a corre-
sponding response. Furthermore, the stimulus and response are linked. Presentation of
the stimulus is followed by the response, and the response rarely occurs in the absence
of the stimulus. For example, dust in the nasal passages elicits sneezing, which does not
occur in the absence of nasal irritation.
The specificity of the relation between a stimulus and its accompanying reflex
response is a consequence of the organization of the nervous system. In vertebrates
(including humans), simple reflexes are typically mediated by three neurons, as
illustrated in Figure 2.1. The environmental stimulus for a reflex activates a sensory
neuron (also called afferent neuron), which transmits the sensory message to the
spinal cord. Here, the neural impulses are relayed to the motor neuron (also called
efferent neuron), which activates the muscles involved in the reflex response.
However, sensory and motor neurons rarely communicate directly. Rather, the
impulses from one to the other are relayed through at least one interneuron. The neu-
ral circuitry ensures that particular sensory neurons are connected to a corresponding
set of motor neurons. Because of this restricted “wiring,” a particular reflex response is
elicited only by a restricted set of stimuli. The afferent neuron, interneuron, and effer-
ent neuron together constitute the reflex arc.
The reflex arc in vertebrates represents the fewest neural connections necessary for
reflex action. However, additional neural structures also may be involved in the elicita-
tion of reflexes. For example, sensory messages may be relayed to the brain, which in
turn may modify the reflex reaction in various ways. I will discuss such effects later in
the chapter. For now, it is sufficient to keep in mind that the occurrence of even simple
reflexes can be influenced by higher nervous system activity.
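To summarize the basic arc just described, the three-neuron pathway can be written as a simple chain of functions. This is a didactic sketch under invented assumptions (a single shared threshold, signals passed along unchanged), not a physiological model.

# Didactic sketch of the three-neuron reflex arc described above. The
# threshold and signal values are invented for illustration; real neurons
# integrate graded potentials rather than passing values along unchanged.

THRESHOLD = 0.5  # assumed activation threshold at each link in the arc

def sensory_neuron(stimulus_intensity: float) -> float:
    """Afferent neuron: transduces the environmental stimulus into a neural signal."""
    return stimulus_intensity if stimulus_intensity >= THRESHOLD else 0.0

def interneuron(afferent_signal: float) -> float:
    """Spinal interneuron: relays the message from the sensory to the motor side."""
    return afferent_signal if afferent_signal >= THRESHOLD else 0.0

def motor_neuron(relayed_signal: float) -> bool:
    """Efferent neuron: activates the muscle if the relayed signal is strong enough."""
    return relayed_signal >= THRESHOLD

def reflex_arc(stimulus_intensity: float) -> bool:
    """Restricted 'wiring': only this pathway's stimulus can produce this response."""
    return motor_neuron(interneuron(sensory_neuron(stimulus_intensity)))

print(reflex_arc(0.9))  # True: an adequate eliciting stimulus produces the response
print(reflex_arc(0.2))  # False: the response rarely occurs without its stimulus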
Reflexes rarely command much attention in psychology, but they are very important
because they contribute to the well-being of the organism in many ways. Reflexes keep
us alive. In newborn infants, for example, reflexes are essential for feeding. If you touch
an infant’s cheek with your finger, the baby will reflexively turn his or her head in that
direction, with the result that your finger will fall in the baby’s mouth. This head-turning
reflex no doubt evolved to facilitate finding the nipple. Once your finger has fallen in the
newborn’s mouth, the baby will begin to suckle. The sensation of an object in the mouth
causes reflexive sucking (Figure 2.2). The more closely the object resembles a nipple, the
more vigorous will be the suckling response.
Interestingly, successful nursing involves reflex responses not only on the part of the
infant but also on the part of the mother. The availability of milk in the breast is deter-
mined by the milk-letdown reflex. During early stages of nursing, the milk-letdown reflex
is triggered by the infant’s suckling behavior. However, after extensive nursing experi-
ence, the milk-letdown reflex can be also stimulated by cues that reliably predict the
infant’s suckling, such as the time of day or the infant’s crying when he or she is hungry.
Thus, successful nursing involves an exquisite coordination of reflex activity on the part
of both the infant and the mother.
Another important reflex, the respiratory occlusion reflex, is stimulated by a reduc-
tion of air flow to the baby, which can be caused by a cloth covering the baby’s face or
by the accumulation of mucus in the nasal passages. In response to the reduced air flow,
FIGURE 2.1 Neural
organization of simple
reflexes. The environ-
mental stimulus for a
reflex activates a sensory
neuron, which transmits
the sensory message to
the spinal cord. Here, the
neural impulses are re-
layed to an interneuron,
which in turn relays the
impulses to the motor
neuron. The motor
neuron activates muscles
involved in movement.
©
Ce
ng
ag
e
Le
ar
ni
ng
the baby’s first reaction is to pull his or her head back. If this does not remove the elicit-
ing stimulus, the baby will move his or her hands in a face-wiping motion. If this also
fails to remove the eliciting stimulus, the baby will begin to cry. Crying involves vigorous
expulsion of air, which may be sufficient to remove whatever was obstructing the air
passages.
Modal Action Patterns
Simple reflex responses, such as pupillary constriction to a bright light and startle reac-
tions to a brief loud noise, are evident in many species. By contrast, other forms of eli-
cited behavior occur in just one species or in a small group of related species. Sucking in
response to an object placed in the mouth is a characteristic of mammalian infants. Her-
ring gull chicks are just as dependent on parental feeding as newborn mammals, but
their feeding behavior is very different. When a parent gull returns to the nest from a
foraging trip, the chicks peck at the tip of the parent’s bill (Figure 2.3). This causes the
parent to regurgitate. As the chicks continue to peck, they manage to get the parent’s
regurgitated food, and this provides their nourishment.
Response sequences, such as those involved in infant feeding, that are typical of a
particular species are referred to as modal action patterns (MAPs) (Baerends, 1988).
Species-typical MAPs have been identified in many aspects of animal behavior, including
sexual behavior, territorial defense, aggression, and prey capture. Ring doves, for exam-
ple, begin their sexual behavior with a courtship interaction that culminates in the selec-
tion of a nest site and the cooperative construction of the nest by the male and female.
By contrast, in the three-spined stickleback, a small fish, the male first establishes a terri-
tory and constructs a nest. Females that enter the territory after the nest has been built
are then courted and induced to lay their eggs in the nest. Once a female has deposited
her eggs, she is chased away, leaving the male stickleback to care for and defend the eggs
until the offspring hatch.
FIGURE 2.2 Suckling in infants. Suckling is one of the most prominent reflexes in infants. (Photo credits: Svetlana Fedoseyeva/Shutterstock.com; Thomas D. McAvoy/Time Life Pictures/Getty Images)
[Margin photo: G. P. Baerends]
An important feature of MAPs is that the threshold for eliciting such activities varies
(Camhi, 1984; Baerends, 1988). The same stimulus can have widely different effects
depending on the physiological state of the animal and its recent actions. A male stickle-
back, for example, will not court a female who is ready to lay eggs until he has com-
pleted building his nest. And, after the female has deposited her eggs, the male will
chase her away rather than court her as he did earlier. Furthermore, these sexual and
territorial responses will only occur when environmental cues induce physiological
changes that are characteristic of the breeding season in both males and females.
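This state dependence can be sketched as a response rule whose trigger varies with the animal's current condition. The following sketch paraphrases the stickleback example; the condition names are descriptive labels invented for illustration, not a formal ethological model.

# Illustrative sketch of a state-dependent MAP, loosely following the
# stickleback example above. The condition names are invented labels.

def male_courts_female(breeding_season: bool,
                       nest_complete: bool,
                       female_has_laid_eggs: bool) -> bool:
    """Same eliciting stimulus (a gravid female), different response by state."""
    return breeding_season and nest_complete and not female_has_laid_eggs

print(male_courts_female(True, False, False))  # False: nest not yet built
print(male_courts_female(True, True, False))   # True: courtship is elicited
print(male_courts_female(True, True, True))    # False: female is chased away instead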
Eliciting Stimuli for Modal Action Patterns
The eliciting stimulus is fairly easy to identify in the case of a simple reflex, such as
infant suckling. The stimulus responsible for an MAP can be more difficult to isolate if
the response occurs in the course of complex social interactions. For example, let us con-
sider again the feeding of a herring gull chick. To get fed, the chick has to peck the par-
ent’s beak to stimulate the parent to regurgitate food. But exactly what stimulates the
chick’s pecking response?
Pecking by the chicks may be elicited by the color, shape, or length of the parent’s
bill, the noises the parent makes, the head movements of the parent, or all of these in
combination. To isolate which of these stimuli elicits pecking, Tinbergen and Perdeck
(1950) tested chicks with various artificial models instead of live adult gulls. From this
research, they concluded that the eliciting stimulus had to be a long, thin, moving object
that was pointed downward and had a contrasting red patch near the tip. The yellow
color of the parent’s bill, the shape and coloration of the parent’s head, and the noises
the parent made were all not required for eliciting pecking in the gull chicks. The few
essential features are called, collectively, the sign stimulus, or releasing stimulus, for
pecking on the part of the chicks. Once a sign stimulus has been identified, it can be
exaggerated to elicit an especially vigorous response. Such an exaggerated sign stimulus
is called a supernormal stimulus. Eating behavior, for example, is elicited by the taste of
the food in the mouth. Something that tastes sweet and is high in fat content is especially
effective in encouraging eating. Thus, one can create a supernormal stimulus for eating
by adding sugar and fat. This is well known in the fast-food industry. Although sign stimuli were originally identified in studies with nonhuman subjects, sign stimuli also play a major role in the control of human behavior (Barrett, 2010).
FIGURE 2.3 Feeding of herring gull chicks. The chicks peck a red patch near the tip of the parent’s bill, causing the parent to regurgitate food for them. (© Cengage Learning)
With troops returning from Iraq, post-traumatic stress disorder (PTSD) and fear
and anxiety attendant to trauma are frequently in the news. Better understanding of
PTSD requires knowledge about how people react to danger and how they learn from
those experiences (Kirmayer, Lemelson, & Barad, 2007). Responding effectively to danger
has been critical in the evolutionary history of all animals, including human beings.
Individuals who did not respond effectively to danger succumbed to the assault and did
not pass their genes on to future generations. Therefore, traumatic events have come to
elicit strong defensive MAPs. Vestiges of this evolutionary history are evident in labora-
tory studies showing that both children and adults detect snakes faster than flowers,
frogs, or other nonthreatening stimuli (e.g., LoBue & DeLoache, 2010, 2011). Early com-
ponents of the defensive action pattern include the eyeblink reflex and the startle
response. Because of their importance in defensive behavior, we will discuss these
reflexes later in this chapter and subsequent chapters.
Sign stimuli and supernormal stimuli also have a major role in social and sexual behav-
ior. Copulatory behavior involves a complex sequence of motor responses that have to be
elaborately coordinated with the behavior of one’s sexual partner. The MAPs involved in
sexual arousal and copulation are elicited by visual, olfactory, tactile, and other types of
sign stimuli that are specific to each species. Visual, tactile, and olfactory stimuli are impor-
tant in human social and sexual interactions as well. The cosmetic and perfume industries
are successful because they take advantage of the sign stimuli that elicit human social attrac-
tion and affiliation and enhance these stimuli. Women put rouge on their lips rather than
on their ears because only rouge on the lips enhances the natural sign stimulus for human
social attraction. Plastic surgery to enhance the breasts and lips is also effective because it enhances naturally occurring sign stimuli for human social behavior.
The studies of learning that we will be describing in this book are based primarily
on MAPs involved in eating, drinking, sexual behavior, and defensive behavior.
BOX 2.1
Learning versus Instinct
Because MAPs occur in a similar
fashion among members of a given
species, they include activities that
are informally characterized as
instinctive. Historically, instinctive
behaviors were assumed to be
determined primarily by the genetic
and evolutionary history of a species,
whereas learned behaviors were
assumed to be acquired during the
lifetime of the organism through its
interactions with its environment.
This distinction is similar to the
distinction between nativism and
empiricism pursued by Descartes
and the British empiricists. The
innate versus learned distinction also
remains in current folk biology
(Bateson & Mameli, 2007), but it is
no longer tenable scientifically.
Scientists no longer categorize
behavior as instinctive versus
learned for two major reasons
(Domjan, 2012). First, the fact that
all members of a species exhibit the
same sexual or feeding behaviors
does not mean that these behaviors
are inherited rather than learned.
Similar behaviors among all mem-
bers of a species may reflect similar
learning experiences. As the etholo-
gist G. P. Baerends (1988) wrote,
“Learning processes in many varia-
tions are tools, so to speak, that can
be used in the building of some
segments in the species-specific
behavior organization” (p. 801).
Second, the historical distinction
between learning and instinct was
based on an antiquated conception
of how genes determine behavior.
Genes do not produce behavioral
end points directly in the absence of
environmental or experiential input.
Rather, recent research has unveiled
numerous epigenetic processes that
determine the circumstances under
which DNA is transcribed and
expressed. It is only through these
epigenetic processes, which often
involve experiential and environ-
mental inputs, that DNA can pro-
duce particular behavioral traits
(e.g., Champagne, 2010).
The Sequential Organization of Behavior
Responses do not occur in isolation of one another. Rather, individual actions are orga-
nized into functionally effective behavior sequences. To obtain food, for example, a
squirrel first has to look around for potential food sources, such as a pecan tree with
nuts. It then has to climb the tree and reach one of the nuts. After obtaining the nut, it
has to crack the shell, extract the meat, and chew and swallow it. All motivated behavior,
whether it is foraging for food, finding a potential mate, defending a territory, or feeding
one’s young, involves systematically organized sequences of actions. Ethologists called
early components of a behavior sequence appetitive behavior and the end components
consummatory behavior (Craig, 1918). The term consummatory was meant to convey
the idea of consummation or completion of a species’ typical response sequence. In con-
trast, appetitive responses occur early in a behavior sequence and serve to bring the
organism into contact with the stimuli that will release the consummatory behavior.
Chewing and swallowing are responses that complete activities involved in foraging
for food. Hitting and biting an opponent are actions that consummate defensive behavior.
Copulatory responses serve to complete the courtship and sexual behavior sequence. In
general, consummatory responses are highly stereotyped species-typical behaviors that
have specific eliciting or releasing stimuli. In contrast, appetitive behaviors are more vari-
able and can take a variety of different forms depending on the situation (Tinbergen,
1951). In getting to a pecan tree, for example, a squirrel can run up one side or the other
or jump from a neighboring tree. These are all possible appetitive responses leading up to
actually eating the pecan nut. However, once the squirrel is ready to put the pecan meat in
its mouth, the chewing and swallowing responses that it makes are fairly stereotyped.
As is evident from the varieties of ethnic cuisine, people of different cultures have many
different ways of preparing food (appetitive behavior), but they all pretty much chew and
swallow the same way (consummatory behavior). Actions that are considered to be rude
and threatening (appetitive defensive responses) also differ from one culture to another.
But people hit and hurt one another (consummatory defensive behavior) in much the
same way regardless of culture. Consummatory responses tend to be species-typical MAPs.
In contrast, appetitive behaviors are more variable and more apt to be shaped by learning.
The sequential organization of naturally occurring behavior is of considerable
importance to scientists interested in learning because learning effects often depend on
which component of the behavior sequence is being modified. As I will describe in later
chapters, the outcomes of Pavlovian and instrumental conditioning depend on how these
learning procedures modify the natural sequence of an organism’s behavior. Learning
theorists are becoming increasingly aware of the importance of considering natural
behavior sequences and have expanded on the appetitive/consummatory distinction
made by early ethologists.
In considering how animals obtain food, for example, it is now common to charac-
terize the foraging response sequence as starting with a general search mode, followed by
a focal search mode, and ending with a food handling and ingestion mode. Thus, in
modern learning theory, the appetitive response category has been subdivided into general
search and focal search categories (e.g., Timberlake, 2001). General search responses occur
when the animal does not yet know where to look for food. Before a squirrel has identified a
pecan tree with ripe nuts, it will move around looking for potential sources of food. General
search responses are not spatially localized. Once the squirrel has found a pecan tree, how-
ever, it will switch to the focal search mode and begin to search for ripe nuts only in that
tree. Thus, focal search behavior is characterized by greater spatial specificity than general
search. Once focal search behavior has led to a pecan ripe for picking, the squirrel’s behavior
changes to the food handling and ingestion mode (consummatory behavior).
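This sequence can be made concrete with a minimal state-machine sketch. The sketch below is written in Python; the mode names and transition events are illustrative labels invented here, not terms from Timberlake's (2001) model.

```python
from enum import Enum, auto

class ForagingMode(Enum):
    GENERAL_SEARCH = auto()      # no food source identified; not spatially localized
    FOCAL_SEARCH = auto()        # search confined to an identified source (e.g., one tree)
    HANDLING_INGESTION = auto()  # consummatory phase: stereotyped chewing and swallowing

# Hypothetical transition events gating movement through the sequence.
TRANSITIONS = {
    (ForagingMode.GENERAL_SEARCH, "source_found"): ForagingMode.FOCAL_SEARCH,
    (ForagingMode.FOCAL_SEARCH, "item_obtained"): ForagingMode.HANDLING_INGESTION,
    (ForagingMode.HANDLING_INGESTION, "item_consumed"): ForagingMode.GENERAL_SEARCH,
}

def step(mode: ForagingMode, event: str) -> ForagingMode:
    """Advance the behavior sequence; unrecognized events leave the mode unchanged."""
    return TRANSITIONS.get((mode, event), mode)

mode = ForagingMode.GENERAL_SEARCH
for event in ["source_found", "item_obtained", "item_consumed"]:
    mode = step(mode, event)
    print(event, "->", mode.name)
```

The point of the sketch is simply that later modes are gated by the outcomes of earlier ones, which is one reason conditioning procedures can have different effects depending on which response mode they engage.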
[Photo: Nikolaas Tinbergen (Photoshot)]
Effects of Repeated Stimulation
A common assumption is that an elicited response, particularly a simple reflex response,
will automatically occur the same way each time the eliciting stimulus is presented. This
is exactly what Descartes thought. In his view, reflexive behavior was unintelligent in the
sense that it was automatic and invariant. According to Descartes, each occurrence of
the eliciting stimulus will produce the same reflex reaction because in his conception
the energy of the eliciting stimulus was transferred to the motor response through a
direct physical connection. If the eliciting stimulus remained the same, the elicited
response would also be the same.
Contrary to Descartes, elicited behavior is not invariant. In fact, one of the most
impressive features of elicited behavior (and one reason we are spending so much time
discussing it) is that elicited behavior is readily subject to modification through experi-
ence. Elicited behavior can either decrease or increase through the activation of habitua-
tion and sensitization mechanisms. Because these are some of the simplest and most
basic forms of learning, we will consider them next.
Salivation and Hedonic Ratings of Taste in People
Habituation plays a major role in how we respond to the foods we eat (Epstein et al.,
2009). The taste of food elicits salivation as a reflex response. This occurs as readily in
people as in Pavlov’s dogs. In one study, salivation was measured in eight women in
response to the taste of either lemon juice or lime juice (Epstein et al., 1992). A small
amount of one of the flavors (0.03 ml) was placed on the participant’s tongue on each of
10 trials. The participant was asked to rate how much she liked the taste on each trial,
and salivation to each taste presentation was also measured. The results are summarized
in Figure 2.4.
FIGURE 2.4 Salivation (in grams) and hedonic ratings of pleasantness in response to a taste stimulus (lime or lemon) repeatedly presented to women on Trials 1 through 10. The alternate taste was presented on Trial 11, causing a substantial recovery in responding. (Based on Epstein, Rodefer, Wisniewski, & Caggiula, 1992.)
Salivation in response to the taste increased slightly from Trial 1 to Trial 2, but
from Trial 2 to Trial 10, responding systematically decreased. Interestingly, a similar
decrease was observed in hedonic ratings of the taste. The flavor became less pleasant
as it was repeatedly encountered. On Trial 11, the taste was changed (to lime for par-
ticipants who had been exposed to lemon, and to lemon for participants who had
been previously exposed to lime). This produced a dramatic recovery in both the sali-
vary reflex and the hedonic rating. (For similar results in a study with children, see
Epstein et al., 2003.)
The results presented in Figure 2.4 are relatively simple but tell us a number of
important things about the plasticity of elicited behavior. First, and most obviously,
they tell us that elicited behavior is not invariant across repetitions of the eliciting stim-
ulus. Both salivation and hedonic ratings decreased with repeated trials. In the case of
salivation, the ultimate decline in responding was preceded by a brief increase from
Trial 1 to Trial 2. The decline in responding that occurs with repeated presentation of a
stimulus is called a habituation effect. Habituation is a prominent feature of elicited
behavior that is evident in virtually all species and situations (Rankin et al., 2009).
Another prominent feature of the results presented in Figure 2.4 is that the decrease
in responding was specific to the habituated stimulus. Individuals habituated to the taste
of lemon showed invigorated responding when tested with the taste of lime at the end of
the experiment (and vice versa). This recovery occurred in both the salivary response to
the taste as well as the hedonic response and illustrates one of the cardinal properties of
habituation, namely, that habituation is stimulus specific.
The stimulus specificity feature means that habituation can be easily reversed by
changing the stimulus. Consider what this means for eating. As you take repeated
bites of the same food, your interest in the food declines, and at some point this will
cause you to stop eating. However, if the flavor of the food is changed, the habituation
effect will be gone and your interest in eating will return. Thus, you are likely to eat
more if you are eating food with varied flavors than if your meal consists of one flavor.
It is hard to resist going back to a buffet table given the variety of flavors that are
offered, but rejecting a second helping of mashed potatoes is easy if the second helping
tastes the same as the first.
Another major variable that influences the rate of taste habituation is attention to
the taste stimulus. In a fascinating study, children were tested for habituation to a taste
stimulus while they were working on a problem that required their close attention. In
another condition, either no distracting task was given or the task was so easy that it
did not require much attention. Interestingly, if the children’s attention was diverted
from the taste presentations, they showed much less habituation to the flavor (Epstein
et al., 2005). This is a very important finding because it helps us understand why food
tastes better and why people eat more if they are having dinner with friends or are eating
while watching TV. Having one’s attention directed to nonfood cues keeps the food from
becoming uninteresting through habituation.
The above examples illustrate some of the ways in which habituation can influence
food intake and weight gain. As it turns out, obesity itself may influence taste habitua-
tion. In an interesting study, habituation to the taste of lemon yogurt was examined in
women who were either obese or of normal weight (Epstein, Paluch, & Coleman, 1996).
Salivary responding showed the usual habituation effect in women of normal weight. In
contrast, overweight women did not show the standard habituation effect but continued
their vigorous response to the yogurt across all taste trials (Figure 2.5). This is a remark-
able finding and suggests that obesity may be at least in part a disorder of habituation.
(For similar findings with children, see Epstein et al., 2008.)
Visual Attention in Human Infants
Human infants have a lot to learn about the world. One way they obtain information is
by looking at things. Visual cues elicit a looking response, which can be measured by
how long the infant keeps his or her eyes on an object before shifting gaze elsewhere
(Figure 2.6).
In one study of visual attention (Bashinski, Werner, & Rudy, 1985), four-month-old
infants were assigned to one of two groups, and each group was tested with a different visual
stimulus. The stimuli are shown in the right panel of Figure 2.7. Both were checkerboard
patterns, but one had 4 squares on each side (the 4 × 4 stimulus), whereas the other had
12 squares on each side (the 12 × 12 stimulus). Each stimulus presentation lasted 10 sec-
onds, and the stimuli were presented eight times with a 10-second interval between trials.
Both stimuli elicited visual attention initially, with the babies spending an average of
about 5.5 seconds looking at the stimuli. With repeated presentations of the 4 × 4 stim-
ulus, visual attention progressively decreased, showing a habituation effect. By contrast,
the 12 × 12 stimulus produced an initial sensitization effect, evident in increased looking
during the second trial as compared to the first. But, after that, visual attention to the
12 × 12 stimulus also habituated.
FIGURE 2.5 Change in salivation from baseline levels (in grams) in response to the taste of lemon yogurt in obese and nonobese (normal-weight) women in blocks of 2 trials.
FIGURE 2.6 Experimental setup for the study of visual attention in infants. The infant is seated in front of a screen that is used to present various visual stimuli. How long the infant looks at the display before diverting his or her gaze elsewhere is measured on each trial.
This relatively simple experiment tells us a great deal both about visual attention
and about habituation and sensitization. The results show that visual attention elicited
by a novel stimulus changes as babies gain familiarity with the stimulus. The nature of
the change is determined by the nature of the stimulus. With a relatively simple 4 × 4
pattern, only a progressive habituation effect occurs. With a more complex 12 × 12 pat-
tern, a transient sensitization occurs, followed by habituation. Thus, whether or not sen-
sitization is observed depends on the complexity of the stimulus. With both stimuli, the
infants eventually showed less interest as they became more familiar with the stimulus.
Other studies have shown that interest in what appears on the screen recovers if a new or different stimulus is presented after the familiarization phase, just as in the taste habituation study described earlier.
Infants cannot tell us in words how they view or think about things. Scientists are
therefore forced to use behavioral techniques to study infant perception and cognition.
The visual attention task can provide information about visual acuity. For example,
from the data in Figure 2.7, we may conclude that these infants were able to distinguish
the two different checkerboard patterns. This type of habituation procedure has also
been used to study a wide range of other, more complicated questions about infant cog-
nition and perception (see Colombo & Mitchell, 2009, for a review). One recent study,
for example, examined how newborn infants perceive human faces.
Faces provide a great deal of information that is critical in interpersonal interactions.
People are experts at recognizing and remembering faces. How about newborn infants
1–3 days of age? Recognizing a face requires not only remembering its features but also
recognizing the face when it is turned a bit to one side or the other. Do newborns
know a face is the same even if it is turned a bit to one side? How could we answer
this question considering that newborn infants cannot speak or do much?
FIGURE 2.7 Fixation time (in seconds) showing how long infants spent looking at a visual stimulus during successive trials. For one group, the stimulus consisted of a 4 × 4 checkerboard pattern; for a second group, a 12 × 12 checkerboard pattern. The stimuli are illustrated to the right of the results. (Based on “Determinants of Infant Visual Attention: Evidence for a Two-Process Theory,” by H. Bashinski, J. Werner, and J. Rudy, 1985, Journal of Experimental Child Psychology, 39, pp. 580–598.)
Turati, Bulf, and Simion (2008) adapted the visual attention task to study face per-
ception. Newborn infants less than 3 days of age were first familiarized with a photograph
of a face presented either in a full-face pose or turned slightly to one side (Figure 2.8). To
avoid having the infants use cues related to the model’s hair, the model’s hair was blocked
out with Photoshop. After the infants became habituated to looking at the training stim-
ulus, they were tested with two different photos. Both the test faces were in a different
orientation from the training stimulus. One of the test faces was of the same person as
the infant saw during the habituation phase; the second face was of a different person. If
the infant recognized the original face, the baby was expected to spend less time looking at that face than at the new one. When presented with the two test faces, the infant spent less
time looking at the face of the familiar person than the face of the novel person. This
shows that the newborns could tell that the two faces were different. In addition, the
results show that the newborns could tell which face they had seen before, even though
the face appeared in a new orientation during the test trial. This is a truly remarkable
feat of learning and memory considering that faces are complex stimuli and the infants
were less than 3 days old. This is just one example of how the behavioral techniques
described in this book can be used to study cognition in nonverbal organisms.
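Looking-time data of this kind are commonly summarized with a novelty-preference score: the proportion of total test looking directed at the novel stimulus, where values reliably above .5 indicate that the familiar stimulus was recognized. The following sketch illustrates the computation with hypothetical numbers; they are not data from Turati, Bulf, and Simion (2008).

```python
def novelty_preference(look_novel_s: float, look_familiar_s: float) -> float:
    """Proportion of total test looking time directed at the novel stimulus.
    Values reliably above 0.5 suggest the familiar stimulus was recognized."""
    total = look_novel_s + look_familiar_s
    if total == 0:
        raise ValueError("no looking was recorded on the test trial")
    return look_novel_s / total

# Hypothetical looking times (in seconds) for one infant's test trial.
print(round(novelty_preference(look_novel_s=7.2, look_familiar_s=4.1), 2))  # 0.64
```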
The visual attention paradigm has become a prominent tool in the study of infant
perception as well as more complex forms of cognition. For example, it has been used to
study whether infants are capable of rudimentary mathematical operations, reasoning
about the laws of the physical world, discriminating between drawings of objects that
are physically possible versus ones that are physically not possible, and discriminating
between the properties of liquids and solids (Baillargeon, 2008; Hespos, Ferry, & Rips,
2009; McCrink & Wynn, 2007; Shuwairi, Albert, & Johnson, 2007). Some of this type
of research has stimulated spirited debate about the extent to which the perceptual prop-
erties of the stimuli rather than their meaning within the knowledge structure of the
infant controls visual attention (Schöner & Thelen, 2006). Regardless of how this debate
is resolved, there is no doubt that the visual attention paradigm has provided a wealth of
information about infant cognition at ages that long precede the acquisition of language.
FIGURE 2.8 Photographs of faces used by Turati and colleagues (2008) in tests of visual attention in newborn infants. Infants were habituated to photos of a face in one of two orientations. They were then tested with photos of faces in a different orientation, but one of the test faces was of the same person as the photo used during habituation. The novel face reliably elicited greater visual attention during the test. (Courtesy of Elsevier.)
The Startle Response
As I mentioned earlier, the startle response is part of an organism’s defensive reaction to
potential or actual attack. If someone unexpectedly blows a whistle behind your back,
you are likely to jump. This is the startle response. It consists of a sudden jump and
tensing of the muscles of the upper part of the body, usually involving the raising of
the shoulders and pulling the head into the shoulders. It also includes blinking of the
eyes. The startle reaction can be measured by placing the organism on a surface with a
pressure sensor. The startle reaction briefly increases pressure against the floor.
The startle response has been investigated extensively because of its role in fear and
defensive behavior. Scientists interested in the neurobiology of fear and in the develop-
ment of drugs that help alleviate fear have found the startle response to be a highly effec-
tive measure. Many of the original studies of the startle response were done with
laboratory rats and mice (e.g., Halberstadt & Geyer, 2009). However, in recent years
the technique has also been developed for use with rhesus monkeys (Davis et al., 2008).
Figure 2.9 shows a diagram of a stabilimeter chamber used to measure the startle
response in rats. When startled, the rat jumps, and when it comes down, it puts extra
pressure on the floor of the chamber. This activates the pressure sensor under the cham-
ber. Changes in pressure are used as indicators of the vigor of the startle reaction.
FIGURE 2.9 Stabilimeter apparatus used to measure the startle response of rats. A small chamber rests on a pressure sensor that is connected by cable to a computer. Sudden movements of the rat are detected by the pressure sensor and recorded on the computer.
The startle reaction is usually elicited in experiments with laboratory rats by a brief
loud sound. In one experiment (Leaton, 1976), the startle stimulus was a loud, high-
pitched tone presented for 2 seconds. The animals were first allowed to get used to the
experimental chamber without any tone presentations. Each rat then received a single
tone presentation once a day for 11 days. In the next phase of the experiment, the
tones were presented much more frequently (every 3 seconds) for a total of 300 trials.
Finally, the animals were given a single tone presentation on each of the next three
days, as in the beginning of the experiment.
Figure 2.10 shows the results. The most intense startle reaction was observed the
first time the tone was presented. Progressively less intense reactions occurred during
the next 10 days. Because the animals received only one tone presentation every
24 hours in this phase, the progressive decrements in responding indicated that the
habituating effects of the stimulus presentations persisted throughout the 11-day period.
FIGURE 2.10 Startle magnitude of rats in response to a tone presented once a day in Phase 1, every 3 seconds (in blocks of 30 tones) in Phase 2, and once a day in Phase 3. (Based on “Long-Term Retention of the Habituation of Lick Suppression and Startle Response Produced by a Single Auditory Stimulus,” by R. N. Leaton, 1976, Journal of Experimental Psychology: Animal Behavior Processes, 2, pp. 248–259.)
It is worth noting, though, that this long-term habituation did not result in complete loss
of the startle reflex. Even on the 11th day, the animals still reacted a little.
By contrast, startle reactions quickly ceased when the tone presentations occurred
every 3 seconds in Phase 2 of the experiment. However, this dramatic loss of responsive-
ness was only temporary. In Phase 3 of the experiment, when trials were again adminis-
tered just once each day, the startle response recovered to the level of the 11th day of the
experiment. This recovery, known as spontaneous recovery, occurred simply because the
tone had not been presented for a long time (24 hours).
This experiment illustrates that two different forms of habituation occur depending on
the timing of the stimulus presentations. If the stimuli are presented widely spaced in time,
a long-term habituation effect occurs, which persists for 24 hours or longer. In contrast, if
the stimuli are presented very close together in time (every 3 seconds in this experiment),
a short-term habituation effect occurs. The short-term habituation effect is identified by
spontaneous recovery of responding following a period without stimulation. Spontaneous
recovery is one of the defining features of habituation (Rankin et al., 2009).
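The role of stimulus timing can be illustrated with a toy decay model: assume each presentation adds a fixed increment to a short-term habituation process that decays exponentially during the interval between stimuli. The Python sketch below uses arbitrary parameter values and deliberately includes only the short-term process; capturing Leaton's long-term decrement would require adding a second, far more slowly decaying process.

```python
import math

def simulate_responses(n_trials: int, isi_s: float, strength0: float = 50.0,
                       increment: float = 10.0, tau_s: float = 600.0) -> list[float]:
    """Toy short-term habituation: each trial adds `increment` to a habituation
    process h, which decays exponentially (time constant tau_s) between trials.
    Observed response = max(0, strength0 - h). All parameters are arbitrary."""
    h = 0.0
    responses = []
    for _ in range(n_trials):
        responses.append(max(0.0, strength0 - h))
        h += increment
        h *= math.exp(-isi_s / tau_s)  # decay during the inter-stimulus interval
    return responses

print([round(r) for r in simulate_responses(10, isi_s=3.0)])        # massed: rapid loss
print([round(r) for r in simulate_responses(10, isi_s=24 * 3600)])  # spaced: h decays fully
```

With massed stimulation, h accumulates and responding collapses; with 24-hour spacing, h decays completely between trials, so responding returns each day, which is the signature of spontaneous recovery.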
Repeated presentations of a stimulus do not always result in both long-term and
short-term habituation effects. With the spinal leg-flexion reflex in cats, for example,
only the short-term habituation effect is observed (Thompson & Spencer, 1966). In
such cases, spontaneous recovery completely restores the animal’s reaction to the elicit-
ing stimulus if a long enough period of rest is provided after habituation. By contrast,
spontaneous recovery is never complete in situations that involve long-term habituation,
as in Leaton’s experiment. As Figure 2.10 indicates, the startle response was restored to
some extent in the last phase of the experiment, but even then the rats did not react as
vigorously to the tone as they had the first time it was presented.
Sensitization and the Modulation of Elicited Behavior
Consider your reaction when someone walks up behind you and taps you on the shoul-
der. If you are in a supermarket, you will be mildly startled and will turn toward the side
where you were tapped. Orienting toward a tactile stimulus is a common elicited
response. Being tapped on the shoulder is not a big deal if you are in a supermarket.
However, if you are walking in a dark alley at night in a dangerous part of town, being
tapped on the shoulder could be a very scary experience and will no doubt elicit a much
more vigorous reaction. In a scary place, being touched could mean that you are about to
be attacked. Generally speaking, if you are already aroused, the same eliciting stimulus
will trigger a much stronger reaction. This is called a sensitization effect.
It is easier to study sensitization of the startle response in the laboratory than in a
dark alley. In a classic study, Davis (1974) examined sensitization of the startle response
of laboratory rats to a brief (90 millisecond) loud tone (110 decibels [dB], 4,000 cycles
per second [cps]). Two groups of rats were tested. Each group received 100 trials pre-
sented at 30-second intervals. In addition, a noise generator provided background noise
that sounded something like a waterfall. For one group, the background noise was rela-
tively quiet (60 dB); for the other, the background noise was rather loud (80 dB) but of
lower intensity than the brief, startle-eliciting tone.
The results of the experiment are shown in Figure 2.11. Repeated presentations of
the eliciting stimulus (the 4,000 cps tone) did not always produce the same response.
For rats tested with soft background noise (60 dB), repetitions of the tone resulted in
habituation of the startle reaction. By contrast, when the background noise was loud
(80 dB), repetitions of the tone elicited progressively more vigorous startle responses.
This reflects a gradual buildup of sensitization created by the loud background.
FIGURE 2.11 Magnitude of the startle response of rats to successive presentations of a tone (in blocks of 10 tones) with a background noise of either 60 or 80 dB. (Based on “Sensitization of the Rat Startle Response by Noise,” by M. Davis, 1974, Journal of Comparative and Physiological Psychology, 87, pp. 571–581.)
Reflex responses are sensitized when the organism becomes aroused for some rea-
son. Arousal intensifies our experiences, whether those experiences are pleasant or
unpleasant. As is well known in the entertainment industry, introducing loud noise is a
relatively simple way to create arousal. Live performances of rock bands are so loud that
band members suffer hearing loss if they don’t wear earplugs. The music does not have
to be so loud for everyone to hear it. The main purpose of the high volume is to create
arousal and excitement. Turning a knob on an amplifier is a simple way to increase
excitement. Making something loud is also a common device for increasing the enjoy-
ment of movies, circus acts, car races, and football games and is effective because of the
phenomenon of sensitization.
Sensitization also plays a major role in sexual behavior. A major component of sexual
behavior involves reacting to tactile cues. Consider the tactile cues of a caress or a kiss.
The reaction to the same physical caress or kiss is totally different if you are touching
your grandmother than if you are touching your romantic partner. The difference reflects
sensitization and arousal. In a recent study of this issue, heterosexual males were tested
for their sensitivity to a tactile stimulus presented to the right index finger (Jiao, Knight,
Weerakoon, & Turman, 2007) before and after watching an erotic movie that was
intended to increase their sexual arousal. Tactile sensitivity was significantly increased by
the erotic movie. Watching a nonerotic movie did not produce the same effect.
Sensitization has been examined most extensively in the defensive behavior system.
Numerous studies have shown that fear potentiates the startle response (Davis et al., 2008).
Startle can be measured using a stabilimeter similar to that shown in Figure 2.9, which mea-
sures the reaction of the entire body. A simpler procedure, particularly with human partici-
pants, is to measure the eyeblink response. The eyeblink is an early component of the startle
response and can be elicited in people by directing a brief puff of air toward the eye.
In one study, using the eyeblink startle measure (Bradley, Moulder, & Lang, 2005),
college students served as participants and were shown examples of pleasant and
unpleasant pictures. To induce fear, one group of students was told that they could get
shocked at some point when they saw the pleasant pictures but not when they saw the
unpleasant pictures. The second group of participants received a shock threat associated
with the unpleasant pictures but not the pleasant pictures. Shock was never delivered to
any of the participants, but to make the threat credible, they were fitted with shock elec-
trodes. To measure fear-potentiated startle, the magnitude of the eyeblink response to a
puff of air was measured during presentation of the pictures.
The results are shown in Figure 2.12. Let us first consider the startle reaction during
presentations of the pleasant pictures. If the pleasant pictures were associated with shock
threat, the eyeblink response was substantially greater than if the pictures were safe. This
represents the fear-potentiated startle effect. The results with the unpleasant pictures
were a bit different. With the unpleasant pictures, the startle response was elevated
whether or not the pictures were associated with the threat of shock. This suggests that
the unpleasant pictures were sufficiently discomforting to sensitize the defensive blink
response independent of any shock threat.
FIGURE 2.12 Magnitude of the eyeblink response (in μV) of college students to pleasant and unpleasant pictures that signaled shock threat or were safe. (Based on Bradley, Moulder, & Lang, 2005.)
Fear-potentiated startle is just one example of a broader category of findings show-
ing that the magnitude of the startle reaction can be altered by emotional states. Because
of this, the startle response is a useful technique for studying psychological disorders that
have a strong emotional component, such as panic and anxiety disorders and depression
(Vaidyanathan, Patrick, & Cuthbert, 2009). In fact, fear-potentiated startle is a better
measure of fear in individuals with PTSD than the more familiar galvanic skin response
(GSR) measure (Glover et al., 2011).
Adaptiveness and Pervasiveness of Habituation and Sensitization
Organisms are constantly experiencing a host of stimuli. Consider the act of sitting at
your desk. Even such a simple situation involves a myriad of sensations. You are exposed
to the color, texture, and brightness of the paint on the walls; the sounds of the air-
conditioning system; noises from other rooms; odors in the air; the color and texture of
the desk; the tactile sensations of the chair against your legs, seat, and back; and so on. If
you were to respond to all of these stimuli, your behavior would be disorganized and
chaotic. Habituation and sensitization effects help sort out what stimuli to ignore and
what to respond to. Habituation and sensitization effects are the end products of pro-
cesses that help prioritize and focus behavior in the buzzing and booming world of sti-
muli that organisms live in.
There are numerous instances of habituation and sensitization in common human
experience (Simons, 1996). Consider a grandfather clock. Most people who own such a
clock do not notice each time it chimes. They have completely habituated to the clock’s
sounds. In fact, they are more likely to notice when the clock misses a scheduled chime.
This is unfortunate because they may have purchased the clock because they liked its
sound. People who live on a busy street or near a railroad track often become entirely
habituated to the noises that frequently intrude into their homes. Visitors who have not
heard the sounds as often are much more likely to be bothered by them.
Driving a car involves exposure to a large array of complex visual and auditory sti-
muli. In becoming an experienced driver, a person habituates to the numerous stimuli
that are irrelevant to driving, such as details of the color and texture of the road, the
kind of telephone poles that line the sides of the highway, tactile sensations of the steer-
ing wheel, and routine noises from the engine. Habituation to irrelevant cues is particu-
larly prominent during long driving trips. On a long drive, you are likely to become
oblivious to all kinds of stimuli on the road, which may make you drowsy and inatten-
tive. If you come across an accident or arrive in a new town, you are likely to “wake up”
and again pay attention to various things that you had been ignoring. Passing a bad acci-
dent or coming to a new town is arousing and sensitizes orienting responses that were
previously habituated.
Habituation also determines how much we enjoy something. In his popular book,
Stumbling on Happiness, Daniel Gilbert (2006) noted that “Among life’s cruelest truths
is this one: Wonderful things are especially wonderful the first time they happen, but
their wonderfulness wanes with repetition” (p. 130). He went on to write, “When we
have an experience—hearing a particular sonata, making love with a particular person,
watching the sun set from a particular window with a particular person—on successive
occasions, we quickly begin to adapt to it, and the experience yields less pleasure each
time” (p. 130).
Habituation and sensitization effects can occur in any situation that involves
repeated exposures to a stimulus. Therefore, an appreciation of habituation and
sensitization effects is critical for studies of learning. As I will describe in Chapter 3,
habituation and sensitization are of primary concern in the design of control procedures
for Pavlovian conditioning. Habituation and sensitization also play a role in operant con-
ditioning (McSweeney & Murphy, 2009).
Habituation Versus Sensory Adaptation and Response Fatigue
The key characteristic of habituation effects is a decline in the response that was initially
elicited by a stimulus. However, not all instances in which repetitions of a stimulus result
in a response decline represent habituation. To understand alternative sources of
response decrement, we need to return to the concept of a reflex. A reflex consists of
three components. First, a stimulus activates one of the sense organs, such as the eyes
or ears. This generates sensory neural impulses that are relayed to the central nervous
system (spinal cord and brain). The second component involves relay of the sensory
messages through interneurons to motor nerves. Finally, the neural impulses in motor
nerves, in turn, activate the muscles that create the observed response.
Given the three components of a reflex, there are several reasons why an elicited
response may fail to occur (Figure 2.13). The response will not be observed if, for some
reason, the sense organs become disabled. A person may be temporarily blinded by a
bright light, for example, or suffer a temporary hearing loss because of exposure to
loud noise. Such decreases in sensitivity are called sensory adaptation and are different
from habituation. The response also will not occur if the muscles involved become inca-
pacitated by fatigue. Sensory adaptation and response fatigue are impediments to
responding that are produced outside the nervous system in sense organs and muscles.
Therefore, they do not represent habituation.
The terms habituation and sensitization are limited to neurophysiological changes
that hinder or facilitate the transmission of neural impulses from sensory to motor neu-
rons. In habituation, the organism ceases to respond, even though it remains fully capa-
ble of sensing the eliciting stimulus and making the muscle movements required for the
response. The response fails because of changes that disrupt neurotransmission involving
the interneurons.
In studies of habituation, sensory adaptation is ruled out by evidence that habitua-
tion is response specific. An organism may stop responding to a stimulus in one aspect of
its behavior while continuing to respond to the stimulus in other ways. When a teacher
makes an announcement while you are concentrating on taking a test, you may look up
from your test at first, but only briefly. However, you will continue to listen to the
announcement until it is over. Thus, your orienting response habituates quickly, but
other attentional responses to the stimulus persist.
FIGURE 2.13 Diagram of a simple reflex, consisting of a sense organ, a sensory neuron, the central nervous system, a motor neuron, and a muscle. Sensory adaptation occurs in the sense organs, and response fatigue occurs in effector muscles. In contrast, habituation and sensitization occur in the nervous system.
Response fatigue as a cause of habituation is ruled out by evidence that habituation
is stimulus specific. A habituated response will quickly recover when a new stimulus is
introduced. This was illustrated in the taste habituation study summarized in Figure 2.4.
After the salivary and hedonic responses habituated during the first 10 trials, presentation
of the alternate taste in Trial 11 resulted in a recovery of both response measures. The
stimulus specificity of habituation is also key to using the visual attention task to study
perception and cognition in preverbal infants. By examining how a stimulus has to be
altered to produce recovery of the visual attention response, investigators can figure out
what aspects of the habituated stimulus the infants learned about (Figure 2.8).
The Dual-Process Theory of Habituation and Sensitization
Habituation and sensitization effects are changes in behavior or performance. These are
outward behavioral manifestations of stimulus presentations. What factors are responsi-
ble for such changes? To answer this question, we have to shift our level of analysis from
behavior to presumed underlying process or theory. Habituation effects can be satisfac-
torily explained by a single-factor theory that characterizes how repetitions of a stimulus
change the efficacy of that stimulus (e.g., Schöner & Thelen, 2006). However, a second
factor has to be introduced to explain why responding is enhanced under conditions
of arousal. The dominant explanation of habituation and sensitization remains the
dual-process theory of Groves and Thompson (1970; Thompson, 2009).
The dual-process theory assumes that different types of underlying neural processes are
responsible for increases and decreases in responsiveness to stimulation. One neural process
produces decreases in responsiveness. This is called the habituation process. Another pro-
duces increases in responsiveness. This is called the sensitization process. The habituation
and sensitization processes are not mutually exclusive. In fact, often both are activated at the
same time. The behavioral outcome or end result that is observed reflects the net effect of
the two processes. It is unfortunate that the underlying processes that suppress and facilitate
responding have the same names (habituation and sensitization) as the resulting behavioral
changes that are observed. One may be tempted to think, for example, that decreased
responding, or a habituation effect, is a direct reflection of the habituation process. However,
decreased responding may occur under conditions that also involve increased arousal or
sensitization, but the arousal may only slow the rate of response decline. In fact, all habitua-
tion and sensitization effects are the sum, or net, result of both habituation and sensitization
processes. Whether the net result is an increase or a decrease in behavior depends on which
underlying process is stronger in a particular situation. The distinction between effects
and processes in habituation and sensitization is analogous to the distinction between
performance and learning discussed in Chapter 1. Effects refer to observable behavior and
processes refer to underlying mechanisms.
On the basis of neurophysiological research, Groves and Thompson (1970) sug-
gested that habituation and sensitization processes occur in different parts of the nervous
system (see also Thompson, 2009). Habituation processes are assumed to occur in what
is called the S-R system. This system consists of the shortest neural path that connects
the sense organs activated by the eliciting stimulus and the muscles involved in making
the elicited response. The S-R system may be viewed as the reflex arc. Each presentation
of an eliciting stimulus activates the S-R system and causes some buildup of habituation.
Sensitization processes are assumed to occur in what is called the state system. This
system consists of parts of the nervous system that determine the organism’s general
level of responsiveness or readiness to respond. In contrast to the S-R system, which is
activated every time an eliciting stimulus occurs, only arousing events activate the state
system. The state system is relatively quiescent during sleep, for example. Drugs, such as
stimulants or depressants, may alter the functioning of the state system and thereby
change responsiveness. The state system is also altered by emotional experiences. For
example, the heightened reactivity that accompanies fear is caused by activation of the
state system and is the basis for the fear-potentiated startle response (Davis et al., 2008).
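The logic of the theory can be stated in a few lines of code: the observed response is the net of a habituation process that builds on every trial and a sensitization process that builds only to the extent that the situation is arousing, with both processes decaying between trials. The sketch below is a minimal illustration with arbitrary parameter values, not a quantitative version of Groves and Thompson's model.

```python
import math

def dual_process(n_trials: int, isi_s: float, arousal: float,
                 base: float = 30.0, h_inc: float = 6.0, s_inc: float = 12.0,
                 h_tau: float = 300.0, s_tau: float = 120.0) -> list[float]:
    """Observed response = base - h + s, floored at zero. The habituation
    process h (S-R system) grows on every trial; the sensitization process s
    (state system) grows only in proportion to `arousal` (e.g., loud background
    noise). Both decay exponentially between trials. All values are arbitrary."""
    h = s = 0.0
    out = []
    for _ in range(n_trials):
        out.append(max(0.0, base - h + s))
        h += h_inc
        s += s_inc * arousal
        h *= math.exp(-isi_s / h_tau)
        s *= math.exp(-isi_s / s_tau)
    return out

print([round(r) for r in dual_process(10, isi_s=30.0, arousal=0.0)])  # quiet background
print([round(r) for r in dual_process(10, isi_s=30.0, arousal=1.0)])  # loud background
```

With arousal set to zero, only the habituation process operates and responding declines, as with the 60-dB background in Figure 2.11; with high arousal, the state system is engaged and the net result is an early increase in responding, even though the habituation process is building at the same time.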
Applications of the Dual-Process Theory
The examples of habituation and sensitization (illustrated in the experimental evidence
I previously reviewed) can be easily interpreted in terms of the dual-process theory.
Repeated exposure to the 4 × 4 checkerboard pattern produced a decrement in visual
orientation in infants (Figure 2.7). This presumably occurred because the 4 × 4 stimulus
did not create much arousal. Rather, the 4 × 4 stimulus activated primarily the S-R sys-
tem and, hence, activated primarily the habituation process. The more complex 12 × 12
checkerboard pattern produced a greater level of arousal, activating the state system, and
this resulted in the increment in visual attention that occurred after the first presentation
of the 12 × 12 pattern. However, the arousal or sensitization process was not strong
enough to entirely counteract the effects of habituation. As a result, after a few trials,
visual attention also declined in response to the 12 × 12 stimulus.
A different type of application of the dual-process theory is required for the habitu-
ation and sensitization effects we noted in the startle reaction of rats (Figure 2.11). When
the rats were tested with a relatively quiet background noise (60 dB), there was little to
arouse them. Therefore, we can assume that the experimental procedures did not activate
the state system. Repeated presentations of the startle-eliciting tone merely activated the
S-R system, which resulted in habituation of the startle response.
The opposite outcome occurred when the animals were tested in the presence of a
loud background noise (80 dB). In this case, stronger startle reactions occurred to succes-
sive presentations of the tone. Because the identical tone was used for both groups, the
difference in the results cannot be attributed to the tone. Rather, one must assume that
the loud background noise increased arousal or readiness to respond in the second
group. This sensitization of the state system was presumably responsible for increasing
the startle reaction to the tone in the second group. Activation of the state system was
no doubt also responsible for the increased startle responding that occurred when parti-
cipants were tested under conditions of threat and in the presence of an unpleasant pic-
ture (Figure 2.12).
Implications of the Dual-Process Theory
The preceding interpretations of habituation and sensitization effects illustrate several
important features of the dual-process theory. Because the habituation process resides
in the S-R system, which is activated every time a stimulus elicits a response, habituation
is a universal feature of elicited behavior. By contrast, the state system becomes involved
only in special circumstances. Some extraneous event, such as intense background noise,
may increase the individual’s alertness and sensitize the state system. Alternatively, the
state system may be sensitized by the repeated presentations of the test stimulus itself if
that stimulus is sufficiently intense or excitatory (as occurred with the 12 × 12 checker-
board pattern, as compared with the 4 × 4 pattern). If the arousing stimulus is repeated
soon enough so that the second presentation occurs while the organism remains sensi-
tized from the preceding trial, an increase in responding will be observed.
Both the habituation process and the sensitization process are assumed to decay
with the passage of time without stimulation. Thus, spontaneous recovery occurs with
both processes. Spontaneous recovery from both habituation and sensitization serves to
return responding to baseline levels (hence the term recovery).
Because habituation resides in the S-R circuit, the dual-process theory predicts that
habituation will be stimulus specific. If after habituation training the eliciting stimulus is
changed, the new stimulus will elicit a nonhabituated response because it activates a differ-
ent S-R circuit. We saw this outcome in the experiment on habituation of salivation and
hedonic ratings to a taste (Figure 2.4). After the salivary and affective responses to one
taste stimulus (e.g., lime) had substantially habituated (Trials 1–10), the responses showed
total recovery when a different taste (lemon) was presented (Trial 11). The stimulus speci-
ficity of habituation was also evident in the study of face perception in newborn infants
(Figure 2.8). Similar effects occur in common experience. For example, after you have
become completely habituated to the sounds of your car engine, your attention to the
engine is likely to return if it malfunctions and begins to make new noises.
Unlike habituation, sensitization is not highly stimulus specific. As the fear-
potentiated startle phenomenon illustrates, if you become aroused by fear, this will
increase your startle response to a puff of air (Figure 2.12) or a burst of noise (Davis
et al., 2008). In laboratory rats, pain induced by foot-shock increases the reactivity of
the rats to both auditory and visual cues. In contrast, feelings of illness or malaise make
rats more reactive or suspicious of eating novel foods. Interestingly, however, shock-
induced sensitization appears to be limited to exteroceptive cues, and illness-induced
sensitization is limited to gustatory stimuli (Miller & Domjan, 1981). Thus, cutaneous
pain and internal malaise seem to activate separate sensitization systems.
BOX 2.2
Learning in an Invertebrate
How does the brain acquire, store,
and retrieve information? To answer
this question, we need to know how
neurons operate and how neural cir-
cuits are modified by experience.
Studying these issues requires that we
delve into the neural machinery to
record and manipulate its operations.
Naturally, people are not keen on
volunteering for such experiments.
Therefore, such research has to be
conducted using other species.
Much can be learned from the
vertebrates (rats, mice) that are typi-
cally used in behavioral studies of
learning. Yet, at a neural level, even a
rat poses technical challenges for a
neurobiologist. Therefore, neurobiol-
ogists have focused on creatures with
simpler nervous systems. Inverte-
brates are attractive because some of
their neurons are very large, and they
have far simpler nervous systems.
Using this approach, Eric Kandel and
his colleagues have uncovered the
mechanisms that mediate some basic
learning processes in the marine snail,
Aplysia. Here, I provide an overview
of the mechanisms that underlie
habituation and sensitization (for
recent reviews, see Hawkins, Kandel,
& Bailey, 2006; Kandel et al., 2013).
Aplysia have two wing-like flaps
(the parapodia) on their back (dor-
sal) surface. These flaps cover the gill
and other components of the respira-
tory apparatus (Figure 2.14A). The gill
lies under a mantle shelf, and a siphon
helps to circulate water across the gill.
In the relaxed state, the gill is extended,
maximizing chemical exchange across
its surface. It is a fragile organ that
must be protected. For this reason,
nature has given Aplysia a protective
gill-withdrawal reflex. This reflex can
be elicited by a light touch applied to
either the siphon or mantle. In the
laboratory, the reflex is often elicited by
a jet of water produced from a Water
Pik. While the mechanisms that
underlie this reflex can be studied in
the intact Aplysia, it is often easier to
study the underlying system after the
essential components have been
removed and placed in a nutrient bath
that sustains the tissue (an in vitro
preparation, Latin from “in glass”).
With this simple preparation, it is an
easy matter to demonstrate both habit-
uation and sensitization. Habituation
can be produced by repeatedly applying
the tactile stimulus to the siphon. With
continued exposure, the magnitude of
the gill-withdrawal reflex becomes
smaller (habituates). Interestingly, this
experience has no effect on the magni-
tude of the gill-withdrawal elicited by
touching the mantle shelf. Conversely,
if we repeatedly touch the mantle, the
withdrawal response observed habitu-
ates without affecting the response
elicited by touching the siphon.
A modification in one stimulus-
response (S-R) pathway has no effect
on the response vigor in the other.
In vertebrates, a painful shock
engages a mechanism that generally
sensitizes behavior, augmenting a
variety of response systems including
those that generate a startle response
(Davis, 1997). A similar effect can be
demonstrated in Aplysia. If a shock
stimulus is applied to the tail, it
sensitizes the gill-withdrawal
response elicited by touching the
mantle or siphon (Walters, 1994).
This is a general effect that augments
behavioral reactivity in both the
mantle and siphon circuits.
The essential neural components
that underlie gill-withdrawal in
response to a siphon touch are illus-
trated in Figure 2.14B. A similar dia-
gram could be drawn for the neurons
that underlie the gill-withdrawal eli-
cited by touching the mantle.
Touching the siphon skin engages
a mechanical receptor that is coupled
to a sensory neuron. Just one receptor
is illustrated here, but additional
receptors and neurons innervate
adjoining regions of the siphon skin.
The degree to which a particular
receptor is engaged will depend on its
proximity to the locus of stimulation,
being greatest at the center of stimu-
lation and weakening as distance
increases.

FIGURE 2.14 (A) The gill-withdrawal reflex in Aplysia. A touch applied to the siphon or mantle causes the gill to retract (adapted from Kandel et al., 2013). (B) The neural circuit that mediates habituation and sensitization. A touch engages a sensory neuron (SN), which synapses onto a motor neuron (MN) that contributes to gill withdrawal. A shock to the tail activates a facilitatory interneuron (FI) that presynaptically innervates the sensory neuron. The shaded box indicates the region depicted in Panel C (adapted from Dudai, 1989). (C) A neurochemical pathway that contributes to neural sensitization. The release of serotonin (5-HT) from the facilitating interneuron engages serotonin receptors on the sensory neuron. This activates a G-protein and adenylyl cyclase, leading to the production of cAMP and engaging a cAMP-dependent kinase (protein kinase A [PKA]). PKA alters a subset of the K+ channels by adding a phosphate group (phosphorylation), which reduces the outward flow of K+. This prolongs the action potential, augmenting the flow of Ca++ into the cell and transmitter release (adapted from Kandel et al., 2013).
equivalent to a generalization gradi-
ent, with the maximum activity being
produced by the neuron that provides
the primary innervation for the
receptive field stimulated.
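This spatial profile is easy to visualize with a short simulation. The sketch below is purely illustrative: the Gaussian falloff, the receptor spacing, and the width parameter are assumptions chosen for clarity, not values measured in Aplysia.

import numpy as np

# Hypothetical receptor positions along the siphon skin (arbitrary units)
receptor_positions = np.linspace(0.0, 10.0, 21)

def receptor_activation(stim_locus, width=1.5):
    # Activation is assumed to fall off as a Gaussian function of the
    # distance between each receptor and the locus of stimulation.
    distance = np.abs(receptor_positions - stim_locus)
    return np.exp(-(distance ** 2) / (2.0 * width ** 2))

activity = receptor_activation(stim_locus=5.0)
# Activity peaks at the receptor nearest the touch and weakens with
# distance, the neural analog of a generalization gradient.
print(receptor_positions[np.argmax(activity)])  # 5.0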
The mechanical receptors that
detect a touch engage a response
within the dendrites of the sensory
neuron. This neural response is con-
veyed to the cell body (soma) and
down a neural projection, the axon, to
the motor neuron. The sensory neuron
is the presynaptic cell. The motor neu-
ron is the postsynaptic cell. The motor
neuron is engaged by the release of a
chemical (neurotransmitter) from the
sensory neuron. The motor neuron, in
turn, carries the signal to the muscles
that produce the gill-withdrawal
response. Here, the release of the neu-
rotransmitter activates muscle fibers
that cause the gill to retract.
A sensitizing tail-shock engages
neurons that activate a type of neuron
known as the facilitatory interneuron.
As shown in the figure, the facilitatory
interneuron impinges upon the end of
the presynaptic sensory neuron. In
more technical terms, the facilitatory
interneuron presynaptically innervates
the sensory neuron. Because of this,
the facilitatory interneuron can alter
the operation of the sensory neuron.
The magnitude of the gill-
withdrawal response depends on the
amount of neurotransmitter released
from the motor neurons. The more
that is released, the stronger is the
response. Similarly, the probability
that a response will be engaged in
the motor neuron, and the number
of motor neurons that are engaged,
depends on the amount of
neurotransmitter released from the
sensory neuron. Increasing the
amount released will usually enhance
the motor neuron response and the
gill-withdrawal response.
Research has shown that with
repeated stimulations of the sensory
neuron there is no change in the action
potential generated within the sensory
neuron, but less transmitter is released,
producing the behavioral phenomenon
of habituation (Kandel et al., 2013).
When an action potential arrives at the
synapse, it causes calcium channels on
the membrane surface to open, which
allows the positively charged ion
calcium (Ca++) to flow into the cell
(Figure 1.6C). This in turn triggers
intracellular events that initiate trans-
mitter release. Habituation inactivates
some of the Ca++ channels, which
reduces the amount of Ca++ that enters
the cell. As a result, less transmitter is
released and a less vigorous motor
response occurs.
Sensitization engages the facilita-
tory interneuron, which produces a
change within the sensory neuron that
causes it to release more neurotrans-
mitter. Because more transmitter is
released, the motor neurons are
engaged to a greater extent, and the
gill-withdrawal response is more vig-
orous. The facilitatory interneuron
releases a neurotransmitter (serotonin)
that activates receptors on the surface
of the sensory neuron. These receptors
engage a biochemical cascade that
inactivates potassium (K+) channels.
The duration of an action potential is
determined by the rate at which K+ is
allowed to flow out of the cell. Inacti-
vating some of the K+ channels slows
this process, and, as a result, the action
potential lasts a little longer. This is
important because the duration of the
action potential determines how long
the Ca++ channels remain open.
Reducing the flow of K+ out of the
sensory neuron causes an increase in
the duration of the action potential,
which in turn increases the amount of
Ca++ that enters the cell. Allowing
more Ca++ to enter the cell increases
transmitter release and increases the
vigor of the motor response, produc-
ing behavioral sensitization. Addi-
tional processes also contribute,
including a form of long-term poten-
tiation (see Box 8.2; Glanzman, 2008).
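The opposing synaptic changes just described can be summarized in a toy model. In the sketch below, only the direction of each effect follows the account above (habituation inactivates Ca++ channels and lowers transmitter release; sensitization inactivates K+ channels, prolongs the action potential, and raises Ca++ influx and release); the numbers and the simple proportional relationships are assumptions for illustration.

def transmitter_release(ca_channels_active=1.0, k_channels_active=1.0):
    # Fewer active K+ channels -> slower repolarization -> longer
    # action potential (assumed inversely proportional for simplicity).
    spike_duration = 1.0 / k_channels_active
    # Ca++ influx is assumed to scale with the fraction of available
    # Ca++ channels and with how long they stay open.
    ca_influx = ca_channels_active * spike_duration
    # Release is assumed proportional to Ca++ influx.
    return ca_influx

baseline = transmitter_release()                           # 1.0
habituated = transmitter_release(ca_channels_active=0.5)   # 0.5: weaker response
sensitized = transmitter_release(k_channels_active=0.7)    # ~1.43: stronger response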
Preexisting proteins underlie the
synaptic modifications that produce
short-term habituation and sensitiza-
tion. If stimulation is continued,
intracellular signals initiate the
expression of genes that yield protein
products that bring about structural
modifications (Box 9.1; Bailey &
Kandel, 2008). For example, long-
term habituation is associated with a
reduction in the number of synaptic
connections. Conversely, repeated
exposure to a sensitizing stimulus
produces an increase in synaptic
connections, which promotes the
initiation of a response in the post-
synaptic cell.
J. W. Grau
Aplysia An invertebrate sea slug, about
the size of a rat, that lives in the tidal
zones of tropical waters. Aplysia have
been used to study the neurobiology of
learning because they have a simple, and
relatively invariant, nervous system with
large neurons.
Habituation and Sensitization of Emotions and Motivated Behavior

To this point, our discussion of changes produced by repetitions of an eliciting stimulus has been limited to relatively simple responses. However, stimuli may also evoke complex emotions such as love, fear, euphoria, terror, or satisfaction. I have already described habituation of an emotional response to repeated presentations of a taste (Figure 2.5).
The concepts of habituation and sensitization also have been extended to changes in
more complex emotions (Solomon & Corbit, 1974) and various forms of motivated
behavior, including feeding, drinking, exploration, aggression, courtship, and sexual
behavior (McSweeney & Swindell, 1999). An area of special interest is drug addiction
(e.g., Baker et al., 2004; Koob, 2009; Koob & Le Moal, 2008).
Emotional Reactions and Their Aftereffects
In their landmark review of examples of emotional responses to various stimuli, includ-
ing drugs, Solomon and Corbit (1974) noticed a couple of striking features. First, intense
emotional reactions are often biphasic. One emotion occurs during the eliciting stimulus,
and the opposite emotion is observed when the stimulus is terminated. Consider, for
example, the psychoactive effects of alcohol. Someone who is sipping vodka becomes
mellow and relaxed as they are drinking. These feelings, which are generally pleasant,
reflect the primary sedative effects of alcohol. In contrast, something quite different
occurs after a night of drinking. Once the sedative effects of alcohol have dissipated, the
person is likely to become irritable and may experience headaches and nausea. The
pleasant sedative effects of alcohol give way to the unpleasant sensations of a hangover.
Both effects depend on dosage. The more you drink, the more sedated or drunk you
become, and the more intense will be the hangover afterward. Other drugs produce simi-
lar biphasic responses. With amphetamine, for example, the presence of the drug makes you feel alert, energetic, and self-confident. After the drug has worn off, you are likely to feel tired, depressed, and drowsy.
Another common characteristic of emotional reactions is that they change with
experience. The primary reaction becomes weaker and the after-reaction becomes stron-
ger. Habitual drinkers are not as debilitated by a few beers as someone drinking for the
first time. However, habitual drinkers experience more severe withdrawal symptoms if
they quit drinking.
Habituation of a primary drug reaction is called drug tolerance. Drug tolerance refers
to a decline in the effectiveness of a drug with repeated exposures. Habitual users of all
psychoactive drugs (e.g., alcohol, nicotine, heroin, caffeine, sleeping pills, antianxiety drugs)
are not as greatly affected when taking the drug as first-time users. A strong vodka tonic
that would make a casual drinker a bit tipsy is not likely to have any effect on a frequent
drinker. (We will revisit the role of opponent processes in drug tolerance in Chapter 4.)
Because of the development of tolerance, habitual drug users sometimes do not enjoy
taking the drug as much as naive users. People who smoke frequently, for example, do not
derive much enjoyment from doing so (e.g., Hogarth, Dickinson, & Duka, 2010). Accompa-
nying this decline in the primary drug reaction is a growth in the opponent after-reaction.
Accordingly, habitual drug users experience much more severe hangovers when the drug
wears off than do naive users. A habitual smoker who has gone a long time without a ciga-
rette will experience headaches, irritability, anxiety, tension, and general dissatisfaction.
A heavy drinker who stops consuming alcohol is likely to experience hallucinations, mem-
ory loss, psychomotor agitation, delirium tremens, and other physiological disturbances. For
a habitual user of amphetamine, the fatigue and depression that characterize the opponent
aftereffect may be severe enough to produce suicidal thoughts.
Solomon and Corbit (1974) noted that similar patterns of emotional reaction occur
with all emotion-arousing stimuli. Consider, for example, love and attachment. Newly-
weds are usually very excited about each other and are very affectionate whenever they
are together. This primary emotional reaction habituates as the years go by. Gradually,
the couple settles into a comfortable mode of interaction that lacks the excitement
of the honeymoon. However, this habituation of the primary emotional reaction is
accompanied by a strengthening of the affective after-reaction. Couples who have been
together for many years become more intensely unhappy if they are separated by death
or disease. After partners have been together for several decades, the death of one will
cause an intense grief reaction in the survivor. This strong affective after-reaction is
remarkable, considering that by this stage in their relationship the couple may have
entirely ceased to show any overt signs of affection.
The Opponent Process Theory of Motivation
The above examples illustrate three common characteristics of emotional reactions:
(1) Emotional reactions are biphasic; a primary reaction is followed by an opposite
after-reaction. (2) The primary reaction becomes weaker or habituates with repeated sti-
mulations. (3) The weakening of the primary reaction with repetition is accompanied by
a strengthening of the after-reaction. These characteristics are at the core of the opponent
process theory of motivation (Solomon & Corbit, 1974). More recently, investigators have
made significant progress in delineating the neural mechanisms of opponent processes
(e.g., Koob, 2009; Radke, Rothwell, & Gewirtz, 2011).
The opponent process theory assumes that neurophysiological mechanisms involved
in emotional behavior serve to maintain emotional stability. Thus, the opponent process
theory is a homeostatic theory. It is built on the premise that an important function of
mechanisms that control emotions is to keep us on an even keel and minimize the highs
and the lows. The concept of homeostasis was originally introduced to explain the stabil-
ity of our internal physiology, such as body temperature. Since then, the concept has also
become important in the analysis of behavior. (I will discuss other types of homeostatic
theories in later chapters.)
How might physiological mechanisms maintain emotional stability and keep us
from getting too excited? Maintaining any system in a stable state requires that a distur-
bance that moves the system in one direction is met by an opposing force that counter-
acts the disturbance. Consider, for example, trying to keep a seesaw level. If something
pushes one end of the seesaw down, the other end will go up. To keep the seesaw level, a
force pushing one end down has to be met by an opposing force on the other side.
The idea of opponent forces serving to maintain a stable state is central to the oppo-
nent process theory of motivation. The theory assumes that an emotion-arousing stimu-
lus pushes a person’s emotional state away from neutrality. This shift away from
emotional neutrality triggers an opponent process that counteracts the shift. The patterns
of emotional behavior observed initially and after extensive experience with a stimulus
are the net result of the direct effects of an emotion-arousing stimulus and the opponent
process that is activated to counteract this direct effect.
The presentation of an emotion-arousing stimulus initially elicits what is called the
primary process, or a process, which is responsible for the quality of the emotional state
(e.g., happiness) that occurs in the presence of the stimulus. The primary, or a process, is
assumed to elicit, in turn, an opponent process, or b process, that generates the opposite
emotional reaction (e.g., irritability and melancholia). Because the opponent process is
activated by the primary reaction, it lags behind the primary emotional disturbance.
Opponent Mechanisms During Initial Stimulus Exposure Figure 2.15 shows how
the primary and opponent processes determine the initial responses of an organism to an
emotion-arousing stimulus. The underlying primary and opponent processes are repre-
sented in the bottom of the figure. The net effects of these processes (the observed emo-
tional reactions) are represented in the top panel. When the stimulus is first presented,
the a process occurs unopposed by the b process. This permits the primary emotional
reaction to reach its peak quickly. The b process then becomes activated and begins to
oppose the a process. However, the b process is not strong enough to entirely counteract
the primary emotional response, and the primary emotional response persists during the
eliciting stimulus. When the stimulus is withdrawn, the a process quickly returns to
baseline, but the b process lingers for a while. At this point, the b process has nothing
to oppose. Therefore, emotional responses characteristic of the opponent process become
evident for the first time at this point.
FIGURE 2.15 Opponent process mechanism during the initial presentation of an emotion-arousing stimulus. The observed emotional reactions are represented in the top panel. The underlying opponent processes are represented in the bottom panel. Notice that the b process starts a bit after the onset of the a process. In addition, the b process ends much later than the a process. This last feature allows the opponent emotions to dominate after the end of the stimulus. (From “An Opponent Process Theory of Motivation: I. The Temporal Dynamics of Affect,” by R. L. Solomon and J. D. Corbit, 1974, Psychological Review, 81, pp. 119–145.)
Opponent Mechanisms After Extensive Stimulus Exposure Figure 2.16 shows
how the primary and opponent processes operate after numerous previous exposures
to a stimulus. As I noted earlier, a highly familiar stimulus does not elicit strong emo-
tional reactions when it is presented, but the affective after-reaction tends to be much
stronger. The opponent process theory explains this outcome by assuming that the
b process becomes strengthened with repeated use. As a result of this strengthening,
the b process becomes activated sooner after the onset of the stimulus, its maximum
intensity becomes greater, and it becomes slower to decay when the stimulus ceases.
Because of these changes, the primary emotional responses are more effectively coun-
teracted by the opponent process. An associated consequence of the growth of the
opponent process is that the affective after-reaction becomes stronger when the stimulus
is withdrawn (Figure 2.16).
FIGURE 2.16 Opponent process mechanism that produces the affective changes to a habituated stimulus. The observed emotional reactions are represented in the top panel. The underlying opponent processes are represented in the bottom panel. Notice that the b process starts promptly after the onset of the a process and is much stronger than in Figure 2.15. In addition, the b process ends much later than the a process. Because of these changes in the b process, the primary emotional response is nearly invisible during the stimulus, but the affective after-reaction is very strong. (From “An Opponent Process Theory of Motivation: I. The Temporal Dynamics of Affect,” by R. L. Solomon and J. D. Corbit, 1974, Psychological Review, 81, pp. 119–145.)
Opponent Aftereffects and Motivation If the primary pleasurable effects of a psy-
choactive drug are gone for habitual users, why do they continue taking the drug? Why
are they addicted? The opponent process theory suggests that drug addiction is mainly
an attempt to reduce the aversiveness of the affective after-reaction to the drug, such as bad hangovers, amphetamine “crashes,” and the irritability that comes from
not having the usual cigarette. Based on their extensive review of research on emotion
and cognition, Baker and colleagues (2004) proposed an affective processing model
of drug addiction that is built on opponent process concepts and concludes that
“addicted drug users sustain their drug use largely to manage their misery” (p. 34) (see
also Ettenberg, 2004).
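The temporal dynamics sketched in Figures 2.15 and 2.16 can be reproduced qualitatively in a few lines of code. In the sketch below, the a process tracks the stimulus with a fast time constant and the b process is a slower, lagged copy of the a process; the manifest response is their difference. All gains and time constants are illustrative assumptions, not parameters from Solomon and Corbit (1974).

import numpy as np

def opponent_process(n_steps=400, stim_on=50, stim_off=200, a_tau=5.0,
                     b_gain=0.4, b_rise=40.0, b_decay=60.0):
    stimulus = np.zeros(n_steps)
    stimulus[stim_on:stim_off] = 1.0
    a = np.zeros(n_steps)
    b = np.zeros(n_steps)
    for t in range(1, n_steps):
        # The a process rises and decays quickly with the stimulus.
        a[t] = a[t - 1] + (stimulus[t] - a[t - 1]) / a_tau
        # The b process chases a fraction of the a process, rising and
        # decaying with its own (slower) time constants.
        target = b_gain * a[t]
        tau = b_rise if target > b[t - 1] else b_decay
        b[t] = b[t - 1] + (target - b[t - 1]) / tau
    return a - b  # the manifest affective response

# Initial exposures (Figure 2.15 pattern): strong primary reaction,
# modest after-reaction once the stimulus ends.
net_first = opponent_process()

# After extensive exposure (Figure 2.16 pattern): the strengthened b
# process is assumed to start sooner, peak higher, and decay more
# slowly, so the primary reaction is blunted and the after-reaction
# is pronounced.
net_later = opponent_process(b_gain=0.95, b_rise=10.0, b_decay=120.0)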
The opponent process interpretation of drug addiction as escape from the misery of
withdrawal is also supported by a large body of neuroscience evidence. In their review of
this evidence, Koob and Le Moal (2008) concluded that extensive drug use results in
reduced activity in brain circuits associated with reward and strengthening of opponent
neural mechanisms referred to as the antireward circuit. Drug-seeking behavior is rein-
forced in part by the fact that drug intake reduces activity in the antireward circuit. As
Koob and Le Moal pointed out, “The combination of decreases in reward neurotransmit-
ter function and recruitment of antireward systems provides a powerful source of nega-
tive reinforcement that contributes to compulsive drug-seeking behavior and addiction”
(p. 38). Thus, drug addicts are not “trapped” by the pleasure they derive from the drug.
Rather, they take the drug to reduce withdrawal pains. (Other factors involved in drug
addiction will be considered in subsequent chapters.)
Concluding Comments
The quality of life, and survival itself, depend on how behavior is coordinated with the
complexities of the environment. Elicited behavior represents one of the fundamental
ways in which the behavior of all animals, from single-celled organisms to people, is
adjusted to environmental events.
Elicited behavior takes many forms, ranging from simple reflexes mediated by just
three neurons to complex emotional reactions. Although elicited behavior occurs as a
reaction to a stimulus, it is not rigid and invariant. In fact, one of its hallmark features
is that elicited behavior is altered by experience. If an eliciting stimulus does not arouse
the organism, repeated presentations of the stimulus will evoke progressively weaker
responses (a habituation effect). If the organism is in a state of arousal, the elicited
response will be enhanced (a sensitization effect).
Repeated presentations of an eliciting stimulus produce changes in simple responses
as well as in more complex emotional reactions. Organisms tend to minimize changes in
emotional state caused by external stimuli. According to the opponent process theory of
motivation, emotional responses elicited by an environmental event are counteracted by
an opposing process in the organism. If the original elicited emotion is rewarding, the
opponent process will activate antireward circuits and create an aversive state. The com-
pensatory, or opponent, process is assumed to become stronger each time it is activated.
Drug addiction involves efforts to minimize the aversive nature of the opponent or anti-
reward processes attendant to repeated drug intake.
Habituation, sensitization, and changes in the strength of opponent processes are the simplest mechanisms whereby organisms adjust their reactions to environmental events on the basis of past experience.
Sample Questions
1. Describe how elicited behavior can be involved
in complex social interactions, like breast
feeding.
2. Describe sign stimuli involved in the control of
human behavior.
3. Compare and contrast appetitive and consum-
matory behavior and describe how these are
related to general search, focal search, and food
handling.
4. Describe components of the startle response and
how the startle response may undergo sensitization.
5. Describe the distinction between habituation,
sensory adaptation, and response fatigue.
6. Describe the two processes of the dual-process
theory of habituation and sensitization and the
differences between these processes.
7. Describe how habituation and sensitization are
involved in emotion regulation and drug addiction.
Key Terms
a process Same as primary process in the opponent
process theory of motivation.
afferent neuron A neuron that transmits messages
from sense organs to the central nervous system. Also
called sensory neuron.
appetitive behavior Behavior that occurs early in a
natural behavior sequence and serves to bring the
organism in contact with a releasing stimulus. (See
also general search mode and focal search mode.)
b process Same as opponent process in the opponent
process theory of motivation.
consummatory behavior Behavior that serves to
bring a natural sequence of behavior to consummation
or completion. Consummatory responses are usually
species-typical modal action patterns. (See also food
handling mode.)
drug tolerance Reduction in the effectiveness of a
drug as a result of repeated use of the drug.
efferent neuron A neuron that transmits impulses to
muscles. Also called a motor neuron.
fatigue A temporary decrease in behavior caused by
repeated or excessive use of the muscles involved in the
behavior.
focal search mode The second component of the feed-
ing behavior sequence following general search, in which
the organism engages in behavior focused on a particular
location or stimulus that is indicative of the presence of
food. Focal search is a form of appetitive behavior that is
more closely related to food than general search.
food handling and ingestion mode The last compo-
nent of the feeding behavior sequence, in which the
organism handles and consumes the food. This is similar
to what ethologists referred to as consummatory behavior.
general search mode The earliest component of the
feeding behavior sequence, in which the organism
engages in nondirected locomotor behavior. General
search is a form of appetitive behavior.
habituation effect A progressive decrease in the vigor
of elicited behavior that may occur with repeated pre-
sentations of the eliciting stimulus.
habituation process A neural mechanism activated
by repetitions of a stimulus that reduces the magnitude
of responses elicited by that stimulus.
interneuron A neuron in the spinal cord that trans-
mits impulses from afferent (or sensory) to efferent (or
motor) neurons.
modal action pattern (MAP) A response pattern
exhibited by most, if not all, members of a species in
much the same way. Modal action patterns are used as
basic units of behavior in ethological investigations of
behavior.
motor neuron Same as efferent neuron.
opponent process A compensatory mechanism that
occurs in response to the primary process elicited by
biologically significant events. The opponent process
causes physiological and behavioral changes that are
the opposite of those caused by the primary process.
Also called the b process.
primary process The first process in the opponent
process theory of motivation that is elicited by a
biologically significant stimulus. Also called the a
process.
reflex A close relation between an eliciting stimulus
and a resulting response that is mediated by a neural
circuit (the reflex arc) that links afferent neurons acti-
vated by the stimulus with efferent neurons that trigger
response output. As a consequence, the eliciting stimu-
lus usually produces the reflex response, which rarely
occurs otherwise.
releasing stimulus Same as sign stimulus.
sensitization effect An increase in the vigor of eli-
cited behavior that may result from repeated presenta-
tions of the eliciting stimulus or from exposure to a
strong extraneous stimulus.
sensitization process A neural mechanism that
increases the magnitude of responses elicited by a
stimulus.
sensory adaptation A temporary reduction in the
sensitivity of sense organs caused by repeated or exces-
sive stimulation.
sensory neuron Same as afferent neuron.
sign stimulus A specific feature of an object or ani-
mal that elicits a modal action pattern. Also called
releasing stimulus.
spontaneous recovery Return of responding to base-
line levels produced by a period of rest after habituation
or sensitization.
S-R system The shortest neural pathway that connects
the sense organs stimulated by an eliciting stimulus and
the muscles involved in making the elicited response.
state system Neural structures that determine the gen-
eral level of responsiveness, or arousal of the organism.
supernormal stimulus A sign stimulus whose fea-
tures have been artificially enhanced or exaggerated to
produce an abnormally large modal action pattern.
CHAPTER 3
Classical Conditioning:
Foundations
The Early Years of Classical Conditioning
The Discoveries of Vul’fson and Snarskii
The Classical Conditioning Paradigm
Experimental Situations
Fear Conditioning
Eyeblink Conditioning
Sign Tracking and Goal Tracking
Learning Taste Preferences and Aversions
Excitatory Pavlovian Conditioning Methods
Common Pavlovian Conditioning Procedures
Measuring Conditioned Responses
Control Procedures for Classical
Conditioning
Effectiveness of Common Conditioning
Procedures
Inhibitory Pavlovian Conditioning
Procedures for Inhibitory Conditioning
Measuring Conditioned Inhibition
Prevalence of Classical Conditioning
Concluding Comments
Sample Questions
Key Terms
CHAPTER PREVIEW
Chapter 3 provides an introduction to another basic form of learning, namely classical conditioning.
Investigations of classical conditioning began with the work of Pavlov, who studied how dogs learn to
anticipate food. Since then, the research has been extended to a variety of other organisms and
response systems. Some classical conditioning procedures establish an excitatory association between
two stimuli and serve to activate behavior. Other procedures promote learning to inhibit the operation of
excitatory associations. I will describe both excitatory and inhibitory conditioning procedures and discuss
how these are involved in various important life experiences.
In Chapter 2, I described how environmental events can elicit behavior and how such
elicited behavior can be modified by sensitization and habituation. These relatively sim-
ple processes help to bring the behavior of organisms in tune with their environment.
However, if human and nonhuman animals only had the behavioral mechanisms
described in Chapter 2, they would remain rather limited in the kinds of things they
could do. For the most part, habituation and sensitization involve learning about just
one stimulus. However, events in the world do not occur in isolation. Rather, much of
our experience consists of predictable and organized sequences of stimuli. Every signifi-
cant event (e.g., a hug from a friend) is preceded by other events (your friend approach-
ing with extended arms) that are part of what leads to the target outcome.
Cause-and-effect relationships in the world ensure that certain things occur in com-
bination with others. Your car’s engine does not run unless the ignition has been turned
on; you cannot walk through a doorway unless the door was first opened; it does not
rain unless there are clouds in the sky. Social institutions and customs also ensure that
events occur in a predictable order. Classes are scheduled at predictable times; people are
better dressed at church than at a picnic; a person who smiles is more likely to act in a
friendly manner than one who frowns. Learning to predict events in the environment
and learning what stimuli tend to occur together help us interact more effectively with
our environment. Imagine how much trouble you would have if you could not predict
how long it takes to make coffee, when stores are likely to be open, or whether your
key will work to unlock your apartment.
The simplest mechanism whereby organisms learn about relations between one
event and another is classical conditioning. Classical conditioning enables human and
nonhuman animals to take advantage of the orderly sequence of events in their world
to take appropriate action in anticipation of what is about to happen. Classical condi-
tioning is the process whereby we learn to predict when and what we might eat, when
we are likely to face danger, and when we are likely to be safe. It is also integrally
involved in the learning of new emotional reactions (e.g., fear or pleasure) to stimuli
that have become associated with a significant event.
The Early Years of Classical Conditioning
Systematic studies of classical conditioning began with the work of the great Russian
physiologist Pavlov (Box 3.1). Classical conditioning was also independently discovered
by Edwin Twitmyer in a Ph.D. dissertation submitted to the University of Pennsylvania
in 1902 (see Twitmyer, 1974). Twitmyer repeatedly tested the knee-jerk reflex of college
students by sounding a bell .5 seconds before hitting the patellar tendon just below the
knee cap. After several trials of this sort, the bell was sufficient to elicit the knee-jerk
reflex in some of the students. However, Twitmyer did not explore the broader implica-
tions of his discoveries, and his findings did not attract much attention initially.
Pavlov’s studies of classical conditioning were an extension of his research on the
processes of digestion. Pavlov made major advances in the study of digestion by develop-
ing surgical techniques that enabled dogs to survive for many years with artificial fistulae
that permitted the collection of various digestive juices. With the use of a stomach fistula,
for example, Pavlov was able to collect stomach secretions in dogs that otherwise lived
normally. Technicians in the laboratory soon discovered that the dogs secreted stomach
juices in response to the sight of food, or even just upon seeing the person who usually
fed them. The laboratory produced considerable quantities of stomach juice in this manner
and sold the excess to the general public. The popularity of this juice as a remedy for
various stomach ailments helped to supplement the income of the laboratory.
Assistants in the laboratory referred to stomach secretions elicited by food-related
stimuli as psychic secretions because they seemed to be a response to the expectation or
thought of food. However, the phenomenon of psychic secretions generated little scien-
tific interest until Pavlov recognized that it could be used to study the mechanisms of
association learning and could inform us about the functions of the nervous system
(Pavlov, 1927). Thus, as with many great scientists, Pavlov’s contributions were important
not just because he discovered something new but because he figured out how to place
the discovery into a compelling conceptual framework.
The Discoveries of Vul’fson and Snarskii
The first systematic studies of classical conditioning were performed by S. G. Vul’fson
and A. T. Snarskii in Pavlov’s laboratory (Boakes, 1984; Todes, 1997). Both these stu-
dents focused on the salivary glands, which are the first digestive glands involved in the
breakdown of food. Some of the salivary glands are rather large and have ducts that are
accessible and can be easily externalized with a fistula (Figure 3.1). Vul’fson studied sali-
vary responses to various substances placed in the mouth: dry food, wet food, sour water,
and sand. After the dogs had experienced these things placed in the mouth, the mere
sight of the substances was enough to make the dogs salivate.
Whereas Vul’fson used naturally occurring materials in his studies, Snarskii
extended these observations to artificial substances. In one experiment, Snarskii gave his
dogs sour water (such as strong lemon juice) that was artificially colored black. After
several encounters with the black sour water, the dogs also salivated to plain black
water or to the sight of a bottle containing a black liquid.
The substances tested by Vul’fson and Snarskii could be identified at a distance by
sight. They also produced distinctive texture and taste sensations in the mouth. Such
sensations are called orosensory stimuli. The first time that sand was placed in a dog’s
mouth, only the feeling of the sand in the mouth elicited salivation. However, after
sand had been placed in the mouth several times, the sight of sand (its visual features)
also came to elicit salivation. The dog learned to associate the visual features of the sand
with its orosensory features. The association of one feature of an object with another is
called object learning.
To study the mechanisms of associative learning, the stimuli to be associated have
to be manipulated independently of one another. This is difficult to do when the two
stimuli are properties of the same object. Therefore, in later studies of conditioning,
Pavlov used procedures in which the stimuli to be associated came from different
sources. This led to the experimental methods that continue to dominate studies of
classical conditioning to the present day. However, contemporary studies are no longer
conducted with dogs.
The Classical Conditioning Paradigm
Pavlov’s basic procedure for the study of conditioned salivation is familiar to many. The
procedure involves two stimuli. One of these is a tone or a light that does not elicit
salivation at the outset of the experiment. The other stimulus is food or the taste of a
sour solution placed in the mouth. In contrast to the light or tone, the food or sour
taste elicits vigorous salivation even the first time it is presented.
FIGURE 3.1 Diagram of the Pavlovian salivary conditioning preparation. A cannula attached to the animal’s salivary duct sends drops of saliva to a data-recording device. (From “The Method of Pavlov in Animal Psychology,” by R. M. Yerkes and S. Morgulis, 1909, Psychological Bulletin, 6, pp. 257–273.)
Pavlov referred to the tone or light as the conditional stimulus because the effec-
tiveness of this stimulus in eliciting salivation depended on (or was conditional on) pair-
ing it several times with the presentation of food. By contrast, the food or sour taste was
called the unconditional stimulus because its effectiveness in eliciting salivation did not
depend on any prior training. The salivation that eventually came to be elicited by the
tone or light was called the conditional response, and the salivation that was always eli-
cited by the food or sour taste was called the unconditional response. Thus, stimuli and
responses whose properties did not depend on prior training were called unconditional,
and stimuli and responses whose properties emerged only after training were called
conditional.
In the first English translation of Pavlov’s writings, the term unconditional was erro-
neously translated as unconditioned, and the term conditional was translated as condi-
tioned. The -ed suffix was used exclusively in English writings for many years, and our
usage will follow that tradition. However, we should not forget that the term conditioned
does not capture Pavlov’s original meaning of “dependent on” as accurately as the term
conditional (Gantt, 1966).
Because the terms conditioned and unconditioned stimulus and conditioned and
unconditioned response are used frequently in discussions of classical conditioning, they
are often abbreviated. Conditioned stimulus and conditioned response are abbreviated
CS and CR, respectively. Unconditioned stimulus and unconditioned response are abbre-
viated US and UR, respectively.
Experimental Situations
Classical conditioning has been investigated in a variety of situations and species (e.g.,
Domjan, 2005; Turkkan, 1989). Pavlov did his experiments with dogs, often using the
salivary fistula technique. Contemporary experiments on Pavlovian conditioning are
carried out with many different species (rats, mice, rabbits, pigeons, quail, and human
participants) using procedures developed primarily by North American scientists during
the second half of the twentieth century.
BOX 3.1
Ivan P. Pavlov: Biographical Sketch
Born in 1849 into the family of a
priest in Russia, Pavlov dedicated his
life to scholarship and discovery. He
received his early education in a local
theological seminary and planned a
career of religious service. However,
his interests soon changed, and when
he was 21, he entered the University
of St. Petersburg, where his studies
focused on chemistry and animal
physiology. After obtaining the
equivalent of a bachelor’s degree,
he went to the Imperial Medico-
Surgical Academy in 1875 to further
his education in physiology. Eight
years later, he received his doctoral
degree for his research on the effer-
ent nerves of the heart and then
began investigating various aspects
of digestive physiology. In 1888 he
discovered the nerves that stimulate
the digestive secretions of the pan-
creas—a finding that initiated a
series of experiments for which
Pavlov was awarded the Nobel Prize in Physiology or Medicine in 1904.
Pavlov did a great deal of original
research while a graduate student, as
well as after obtaining his doctoral
degree. However, he did not have a
faculty position or his own laboratory
until 1890, when, at the age of 41, he
was appointed professor of pharma-
cology at the St. Petersburg Military
Medical Academy. Five years later, he
became professor of physiology at the
same institution. Pavlov remained
active in the laboratory until close to
his death in 1936. In fact, much of the
research for which he is famous today
was performed after he received the
Nobel Prize.
Fear Conditioning
Following the early work of Watson and Rayner (1920/2000), a major focus of investiga-
tors of Pavlovian conditioning has been the conditioning of emotional reactions. Watson
and Rayner believed that infants are at first limited in their emotional reactivity. They
assumed that “there must be some simple method by means of which the range of sti-
muli which can call out these emotions and their compounds is greatly increased.” That
simple method was Pavlovian conditioning. In a famous demonstration, Watson and
Rayner conditioned a fear response to the presence of a docile white laboratory rat in a
nine-month-old infant named Albert.
There was hardly anything that Albert was afraid of. However, after testing a variety
of stimuli, Watson and Rayner found that little Albert reacted with alarm when he heard
the loud noise of a steel bar being hit by a hammer behind his head. Watson and Rayner
used this unconditioned alarming noise to condition fear to a white rat. Each condition-
ing trial consisted of presenting the rat to Albert and then striking the steel bar. At first
Albert reached out to the rat when it was presented to him. But, after just two condition-
ing trials, he became reluctant to touch the rat. After five additional conditioning trials,
Albert showed strong fear responses. He whimpered or cried, leaned as far away from
the rat as he could, and sometimes fell over and moved away on all fours. Significantly,
these fear responses were not evident when Albert was presented with his toy blocks.
However, the conditioned fear did generalize to other furry things (a rabbit, a fur coat,
cotton wool, a dog, and a Santa Claus mask).
Fear and anxiety are sources of considerable human discomfort, and if sufficiently
severe, they can lead to serious psychological and behavioral problems. To better under-
stand and treat these disorders, scientists are working to figure out how fear and anxiety
are acquired, what are the neural mechanisms of fear, and how fear may be attenuated
with behavioral and pharmacological techniques (e.g., Craske, Hermans, & Vansteenwegen,
2006; Oehlberg & Mineka, 2011). Many of these questions cannot be addressed experimen-
tally using human subjects. Therefore, much of the research on fear conditioning has been
conducted with laboratory rats and mice. The aversive US in these studies is a brief electric
shock delivered through a metal grid floor. Shock is used because it can be regulated with
great precision and its intensity can be adjusted to avoid any physical harm. It is aversive
primarily because it is startling, unlike anything the animal has encountered before. The
CS may be a discrete stimulus (like a tone or a light) or the contextual cues of the place
where the aversive stimulus is encountered.
Unlike little Albert, who showed signs of fear by whimpering and crying, rats show
their fear by freezing. Freezing is a common defense response that occurs in a variety of
species in anticipation of aversive stimulation (see Chapter 10). Freezing probably
evolved as a defensive behavior because animals that are motionless are not easily seen
by their predators. For example, a deer standing still in the woods is difficult to see
because its coloration blends well with the colors of bark and leaves. However, as soon
as the deer starts moving, you can tell where it is.
Freezing is defined as immobility of the body (except for breathing) and the absence
of movement of the whiskers associated with sniffing (Bouton & Bolles, 1980). Measure-
ment of freezing as an index of conditioned fear has become popular in recent years,
especially in neurobiological studies of fear (e.g., Jacobs, Cushman, & Fanselow, 2010).
Automated systems are now available to quantify the degree of freezing exhibited by
rats and mice. These systems identify freezing by how much movement is detected
across successive frames of a video recording.
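The logic of such automated scoring can be illustrated with a simple frame-differencing rule: a frame counts as freezing when almost no pixels have changed since the previous frame. The sketch below is a toy version of this idea; the two thresholds are arbitrary illustrative values, not settings from any commercial system.

import numpy as np

def freezing_frames(frames, pixel_delta=10, moving_pixels=50):
    # frames: grayscale video as an array of shape (n_frames, height, width).
    flags = []
    for prev, curr in zip(frames[:-1], frames[1:]):
        # Count pixels whose intensity changed appreciably between frames.
        changed = np.abs(curr.astype(int) - prev.astype(int)) > pixel_delta
        # Few changed pixels -> the animal is classified as freezing.
        flags.append(int(changed.sum()) < moving_pixels)
    return np.array(flags)

# Percentage of time spent freezing during a CS presentation:
# 100.0 * freezing_frames(cs_frames).mean()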
Figure 3.2 shows an example of the acquisition of freezing in response to an audi-
tory CS (white noise) in laboratory rats. The rats received 10 conditioning trials. On each
trial, the CS was presented for 46 seconds, ending in a 2-second mild (.5 milliamp) foot
shock. The conditioning trials were separated by an intertrial interval of 210 seconds.
Notice that there was little freezing during the first presentation of the noise CS, before
the rats received their first shock. The percentage of time the rats spent freezing during
subsequent conditioning trials increased fairly rapidly. By the third conditioning trial, the
rats were freezing about 70% of the time, which was the asymptote or limit of learning
evident in this experiment.
FIGURE 3.2 Acquisition of conditioned freezing to an auditory CS (white noise) in laboratory rats during noise-shock conditioning trials. Each data point shows the percentage of time the rats were observed freezing during each CS presentation (based on Reger et al., 2012).
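The negatively accelerated shape of this curve can be captured by a simple linear-operator learning rule in which each CS-shock pairing closes a fixed fraction of the gap between the current response level and the asymptote. The sketch below is a textbook idealization; the learning rate and the 70% asymptote are assumptions chosen to echo Figure 3.2, not parameters estimated from the data.

def freezing_acquisition(n_trials=10, rate=0.6, asymptote=70.0):
    # Each trial moves freezing a fixed fraction of the remaining
    # distance toward the asymptote (a linear-operator rule).
    freezing, curve = 0.0, []
    for _ in range(n_trials):
        freezing += rate * (asymptote - freezing)
        curve.append(freezing)
    return curve

# Rises steeply over the first few trials and then levels off near
# the 70% asymptote, as in Figure 3.2.
print([round(v, 1) for v in freezing_acquisition()])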
In addition to freezing, two other indirect measures of fear-induced immobility are
also used in investigations of fear conditioning. Both involve the suppression of ongoing
behavior and are therefore referred to as conditioned suppression procedures. In one
case, the ongoing behavior is licking a drinking spout that contains water. The rats and
mice in these experiments are slightly water deprived and therefore lick readily when
placed in an experimental chamber. If a fear CS (e.g., tone) is presented, their licking
behavior is suppressed, and they take longer to make a specified number of licks.
Hence, this technique is called the lick-suppression procedure.
In another variation of the conditioned suppression technique, rats are first trained
to press a response lever for food reward in a small experimental chamber. This lever-
press activity provides the behavioral baseline for the measurement of fear. Once the
rats are lever pressing at a steady rate, fear conditioning is conducted by pairing a tone
or light with a brief shock. As the participants acquire the conditioned fear, they come to
suppress their lever pressing during the CS (Ayres, 2012).
The conditioned suppression procedure has also been adapted for experiments with
human participants. In that case, the behavioral baseline is provided by an ongoing activ-
ity such as playing a video game (e.g., Nelson & del Carmen Sanjuan, 2006).
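A common way to quantify conditioned suppression in this literature is a suppression ratio that compares responding during the CS with responding during an equal period immediately before the CS: ratio = CS responses / (CS responses + pre-CS responses). A value of 0 indicates complete suppression and 0.5 indicates no effect of the CS. The helper below is a minimal sketch of that computation; the example counts are invented for illustration.

def suppression_ratio(cs_responses, pre_cs_responses):
    # 0.0 = responding stopped entirely during the CS (strong fear);
    # 0.5 = the CS had no effect on the behavioral baseline.
    total = cs_responses + pre_cs_responses
    if total == 0:
        raise ValueError("no responses in either observation period")
    return cs_responses / total

# e.g., 5 lever presses during a 60-s CS vs. 45 presses in the
# preceding 60 s (hypothetical counts):
print(suppression_ratio(5, 45))  # 0.1 -> substantial suppression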
Eyeblink Conditioning
As I mentioned in Chapter 2, the eyeblink reflex is an early component of the startle
response and occurs in a variety of species. To get someone to blink, all you have to do
is clap your hands close to the person’s head or blow a puff of air toward the eyes. If
each air puff is preceded by a brief tone, the blink response will become conditioned
and the person will blink in response to the tone.
Because of its simplicity, eyeblink conditioning was extensively investigated in studies
with human participants early in the development of learning theory (see Kimble, 1961).
Eyeblink conditioning continues to be a very active area of research because it provides
a powerful tool for the study of problems in development, aging, Alzheimer’s disease,
fetal alcohol syndrome, and other disorders (Brown & Woodruff-Pak, 2011; Freeman &
Nicholson, 2004; Steinmetz, Tracey, & Green, 2001). Thus, eyeblink conditioning is a
prominent technique for translational research involving classical conditioning. Eyeblink
conditioning also continues to attract a great deal of interest because it has been used
extensively in studies of the neurobiology of learning (Freeman & Steinmetz, 2011).
A study of eyeblink conditioning in five-month-old infants (Ivkovich et al., 1999)
illustrates the technique. Each infant sat on a parent’s lap facing a platform with
brightly colored objects that maintained the infant’s attention during the experimental
sessions. Eyeblinks were recorded by a video camera. The CS was a 1,000 cps tone pre-
sented for 750 milliseconds, and the US was a gentle puff of air delivered to the right
eye through a plastic tube. For one group of infants, the CS always ended with the puff
of air, and these conditioning trials occurred an average of 12 seconds apart. The sec-
ond group received the same number and distribution of CS and US presentations, but
for them, the CSs and USs were spaced 4–8 seconds apart in an explicitly unpaired
fashion. Thus, the second group served as a control. Each participant received two
training sessions, one week apart.
FIGURE 3.3 Eyeblink conditioning in five-month-old infants. For the infants in the paired group, a tone CS ended in a gentle puff of air to the eye. For the infants in the unpaired group, the tone and air puff never occurred together. (Based on D. Ivkovich, K. L. Collins, C. O. Eckerman, N. A. Krasnegor, and M. E. Stanton (1999). Classical delay eyeblink conditioning in four and five month old human infants. Psychological Science, 10, Figure 1, p. 6.)
The results of the experiment are presented in Figure 3.3 in terms of the percent-
age of trials on which the infants blinked during the CS. The rate of eyeblinks for the
two groups did not differ statistically during the first experimental session. However,
the paired group responded to the CS at a significantly higher rate from the beginning
of the second session. This experiment illustrates a number of important points about
learning. First, it shows that classical conditioning requires the pairing of a CS and US.
Responding to the CS did not develop in the unpaired control group. Second, the
learning was not observable at first. The infants in the paired group did not respond
much in the first session but showed a dramatic increase when they were returned to
the experimental situation for the second session. This provides an example of the
learning/performance distinction we discussed in Chapter 1. The babies started to learn
that the CS was related to the US during the first session, but their learning was not
evident until the beginning of the second session.
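The operational difference between the paired and explicitly unpaired treatments can be made concrete in code. The sketch below generates illustrative event timelines for both groups; the durations follow the description above (a 750-millisecond CS ending in the US, trials about 12 seconds apart, versus CSs and USs spaced 4-8 seconds apart and never together), but the scheduling logic itself is a simplified assumption for illustration.

import random

def paired_schedule(n_trials=12, iti=12.0, cs_duration=0.75):
    # Paired group: every CS ends with the air-puff US.
    events, t = [], 0.0
    for _ in range(n_trials):
        events.append(("CS", t))
        events.append(("US", t + cs_duration))  # US at CS offset
        t += cs_duration + iti
    return events

def unpaired_schedule(n_trials=12):
    # Explicitly unpaired control: the same number of CSs and USs,
    # each separated by 4-8 s, so the CS never signals the US.
    events, t = [], 0.0
    for _ in range(n_trials):
        for label in ("CS", "US"):
            events.append((label, t))
            t += random.uniform(4.0, 8.0)
    return events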
Contemporary interest in eyeblink conditioning in humans stems from the fact that
substantial progress has been made in understanding the neurobiological substrates of
this type of learning (Box 3.2). Neurobiological investigations of eyeblink conditioning
have been conducted primarily in studies with domesticated rabbits. The rabbit eyeblink
preparation was developed by Gormezano (see Gormezano, Kehoe, & Marshall, 1983).
Domesticated rabbits are ideal for this type of research as they are sedentary and rarely
blink in the absence of an air puff or irritation of the eye. Therefore, conditioned
increases in responding can be readily detected.
BOX 3.2
Eyeblink Conditioning and the Search for the Engram
When an organism learns something,
the results of this learning must be
stored in the brain. Somehow, the
network of neurons that make up
your central nervous system is able to
encode the relationship between bio-
logically significant events and use
this information to guide the selection
of CRs. This biological memory is
known as an engram. The traditional
view is that the engram for a discrete
CR is stored in a localized region of
the brain. This raises a basic question
in neurobiology: How and where is
the engram stored?
As a student, your task would be
easier if nature provided just one
answer. This, however, is not the case;
how an experience is encoded and
stored by the nervous system varies
across learning tasks. As a rule, more
difficult problems require the most
advanced portions of your brain and,
not surprisingly, unraveling how
these problems are solved at a neural
level has proven to be a challenging
task. Given this, neurobiologists have
gravitated toward simpler learning
tasks, where both the conditions that
generate learning and its behavioral
effect are well defined.
Richard Thompson and his col-
leagues recognized that the Pavlovian
conditioning of an eyeblink response
provides an attractive paradigm for
unraveling how and where an engram
is stored (Fanselow & Poulos, 2005;
Steinmetz, Gluck, & Solomon, 2001;
Thompson, 2005). Prior work had
shown that a CS (e.g., a tone) that is
repeatedly paired with an air puff to
the eye (the US) acquires the ability to
elicit a defensive eyeblink response.
Decades of work has defined the cir-
cumstances under which this learning
occurs, and the motor output has
been precisely specified.
The search for the engram began
with the hippocampus. Studies of
humans with damage to this region
revealed that the ability to consciously
remember a recent event depends on
the hippocampus (Box 8.2). Small
electrodes were lowered into the hip-
pocampus of laboratory animals, and
neural activity was recorded during
eyeblink conditioning. These studies
revealed that cells in this region
reflect the learning of a CS–US asso-
ciation. However, to the surprise of
many investigators, removing the
hippocampus did not eliminate the
animal’s ability to acquire and retain
a conditioned eyeblink response. In
fact, removing all of the brain
structures above the midbrain
(Figure 1.6D) had little effect on
eyeblink conditioning with a
delayed conditioning procedure.
This suggests that the essential
circuitry for eyeblink conditioning
lies within the lower neural
structures of the brainstem and
cerebellum. Subsequent experi-
ments clearly showed that the
acquisition of a well-timed condi-
tioned eyeblink response depends
on a neural circuit that lies within
the cerebellum (Ohyama et al.,
2003; Mauk & Buonomano, 2004).
The UR elicited by an air puff to
the eye is mediated by neurons that
project to a region of the brainstem
known as the trigeminal nucleus
(Figure 3.4A). From there, neurons
travel along two routes, either directly
or through the reticular formation to
the cranial motor nucleus, where the
behavioral output is organized. Three
basic techniques were used to define
this pathway. The first involved
electrophysiological recordings to
verify that neurons in this neural
circuit are engaged in response to the
US. The second technique involved
inactivating nuclei within the circuit,
either permanently (by killing the
cells) or temporarily (by means of a
drug or cooling), to show that the
region plays an essential (necessary)
role in the eyeblink UR. Finally, spe-
cific nuclei were artificially stimulated
to show that activity in these areas is
sufficient to produce the behavioral
response.
The same techniques (electrical
recording, inactivation, and stimu-
lation) have been used to define the
neural pathway that mediates the
acquisition and performance of the
CR. As illustrated in Figure 3.4A,
the CS input travels to a region of
the brainstem known as the pontine
nucleus. From there, it is carried by
mossy fibers that convey the signal
to the cerebellum. The US signal is
relayed to the cerebellum through
the climbing fibers. These two sig-
nals meet in the cerebellar cortex
where coincident activity brings
about a synaptic modification that
alters the neural output from the
cerebellum. In essence, the climbing
fibers act as teachers, selecting a
subset of connections to be modified
(Figure 3.4B). This change defines
the stimulus properties (the
characteristics of the CS) that
engage a discrete motor output. The
output is mediated by neurons that
project from the interpositus nucleus
to the red nucleus and finally to the
cranial motor nucleus.
As an eyeblink CR is acquired,
conditioned activity develops within
the interpositus nucleus. Neurons
from this nucleus project back to the
US pathway and inhibit the US signal
within the inferior olive. This provides
a form of negative feedback that
decreases the effectiveness of the US.
FIGURE 3.4 (A) A block diagram of the brain circuitry required for eyelid conditioning: the tone CS reaches the cerebellum through the auditory and pontine nuclei via mossy fibers, the corneal air-puff US arrives through the trigeminal nucleus and inferior olive via climbing fibers, and output passes from the interpositus nucleus to the red nucleus and cranial motor nuclei (adapted from Thompson, 1993 and 2005). (B) Structural plasticity within the cerebellar cortex. Mossy fiber input from the CS synapses onto parallel fibers that project across the cortical surface. Each Purkinje cell receives input from many parallel fibers (CSs), but just one climbing fiber (the US). The US input provides a form of instruction that selects the appropriate pattern of parallel fiber (CS) activity to drive the CR (adapted from Dudai, 1989).
Sign Tracking and Goal Tracking
Pavlov’s research concentrated on salivation and other highly reflexive responses. This
encouraged the belief that classical conditioning occurs only in reflex response systems.
In recent years, however, such a restrictive view of Pavlovian conditioning has been
abandoned. One experimental paradigm that has contributed significantly to this change
in thinking is the sign tracking, or autoshaping, paradigm.
Animals often approach and contact stimuli that signal the availability of food. In
the natural environment, food can be predicted on the basis of cues that originate from
the food source but are detectable at a distance. For a hawk, for example, the sight and
noises of a mouse some distance away are cues indicating the possibility of a meal. By
approaching and contacting these stimuli, the hawk is likely to end up catching the
mouse. Similarly, a squirrel can predict the availability of acorns on the basis of the
leaves and shape of the oak trees that grow acorns.
Sign tracking is often investigated in the laboratory by presenting a discrete, local-
ized visual stimulus just before each delivery of a small amount of food. The first experi-
ment of this sort was performed by Brown and Jenkins (1968) with pigeons. The pigeons
were placed in an experimental chamber that had a small circular key that could be illu-
minated and that the pigeons could peck (similar to what is shown in Figure 1.8). Peri-
odically, the birds were given access to food for a short period (4 seconds). The key light
was illuminated for 8 seconds immediately before each food delivery.
The birds did not have to do anything for the food to be delivered. Because they
were hungry, one might predict that when they saw the key light, they would go to the
food dish and wait for the food that was coming. Interestingly, that is not what hap-
pened. Instead of using the key light to tell them when they should go to the food dish,
the pigeons started pecking the key itself. This behavior was remarkable because it was
not required to gain access to the food. Presenting the key light at random times or
BOX 3.2 (continued)
Many researchers believe that phe-
nomena such as blocking and over-
shadowing occur because a predicted
CS is less effective. In the eyeblink
paradigm, this might occur because
the US input is inhibited within the
inferior olive. Consistent with that
prediction, Kim and colleagues (1998)
showed that eliminating this source
of inhibition eliminated the blocking
effect.
Earlier I noted that the hippo-
campus is not needed for simple
delayed conditioning. It is, however,
required for more complex forms of
learning. An example is provided
by trace conditioning, in which a
temporal delay is inserted between
the end of the CS and the start of
the US. A normal animal can readily
acquire a conditioned eyeblink to a
CS that ends 0.5 seconds before the
US. However, it cannot span this
gap if the hippocampus is removed.
A similar pattern of results is
observed in amnesic patients who
have damage to the hippocampus
(Clark & Squire, 1998). These
patients cannot consciously remem-
ber the CS–US relation. In the
absence of this explicit memory,
they fail to learn with a trace-
conditioning procedure. Learning
in the delayed procedure is not
affected, even though the patient
cannot consciously remember the
CS–US relation from one session
to the next. Interestingly, disrupting
conscious awareness in a normal
person undermines the appreciation
of the CS–US relation with the trace
procedure. Again, individuals who
cannot explicitly report the relation
fail to learn.
J. W. Grau
cerebellum A neural structure that lies
at the bottom of the brain, behind the
brainstem and under the cerebral hemi-
spheres. The cerebellum plays a role in
motor coordination and motor learning
(e.g., eyeblink conditioning).
electrophysiology An experimental
technique that uses probes (electrodes) to
monitor the electrical properties of neurons.
engram The neurobiological represen-
tation and storage of learned information
in the brain.
hippocampus A subcortical region of
the limbic system that plays an important
role in spatial learning, trace conditioning,
and episodic memory. Degeneration of
this area contributes to Alzheimer’s
disease.
unpaired with food did not lead to pecking (e.g., Gamzu & Williams, 1973), indicating
that for the conditioned pecking to occur the key light had to be paired with food.
The tracking of signals for food is dramatically illustrated by instances in which the
signal is located far away from the food cup. In the first such experiment (see Hearst &
Jenkins, 1974), the food cup was located about 3 feet (90 cm) from the key light. Never-
theless, the pigeons went to the key light rather than the food cup when the CS was pre-
sented. Burns and Domjan (2000) extended this “long-box” procedure in studies of
sexual conditioning. Male domesticated quail, which copulate readily in captivity,
served in the experiment. The CS was a wood block lowered from the ceiling 30 seconds
before a female copulation partner was introduced. The unusual feature of the experi-
ment was that the CS and the female were presented at opposite ends of an 8-foot long
chamber (Figure 3.5). Despite this long distance, when the CS was presented the birds
approached the CS rather than the door where the female was to be released. Pairing
the CS with sexual reinforcement made the CS such an attractive stimulus that the
birds were drawn to it nearly 8 feet away from the female door.
Although sign tracking is a frequent outcome in Pavlovian conditioning, it is not
always observed. Under certain experimental conditions, there can be considerable varia-
tion in which animals develop sign tracking and the degree of sign tracking they exhibit.
Historically, individual differences in conditioned responding were ignored in studies of
conditioning because they were attributed to poor experimental control. However, that is
no longer the case because individual differences in sign tracking are correlated with
individual differences in impulsivity and vulnerability to drug abuse (Tomie, Grimes, &
Pohorecky, 2008). This has made sign tracking a valuable model system for studying
learning processes and neural mechanisms that contribute to the development of drug
addiction.
Individual differences in sign tracking are typically examined using laboratory rats.
A common experimental situation is illustrated in Figure 3.6. Rats are placed in a small
experimental chamber that has a food cup in the middle of one of the walls. There is a
slot on either side of the food cup through which a response lever can be inserted. The
presentation of the lever serves as the CS, and a pellet of food delivered to the food cup
serves as the US. Each conditioning trial consists of inserting one of the levers for a brief
duration (e.g., 8 seconds). The lever is then withdrawn and the pellet of food is delivered.
Conditioning trials in which a response lever is paired with food do not produce the
same result in all rats (see bottom of Figure 3.6). About one third of the rats become
FIGURE 3.5 Test of
sign tracking in sexual
conditioning of male
domesticated quail. The
CS was presented at one
end of an 8-foot long
chamber before the re-
lease of a female from
the other end. In spite of
this distance, the male
birds went to the CS
when it appeared (based
on Burns & Domjan,
2000).
conditioned to track the CS. Sign trackers approach, touch, and sometimes gnaw the
response lever that serves as the CS. Another third of the rats ignore the lever but approach
and poke their heads into the food cup when the CS is presented. This type of conditioned
behavior is called goal tracking because it tracks the goal object, which is food. The remain-
ing rats show a combination of sign tracking and goal tracking responses.
These individual differences in sign tracking and goal tracking are of considerable
interest because the two subsets of rats also differ in other respects that are associated
with susceptibility to drug abuse (Flagel, Akil, & Robinson, 2009). Sign trackers show
greater psychomotor sensitization to cocaine, greater activation of the dopamine reward
circuit, and elevated plasma corticosterone levels. These individual differences are genet-
ically based. Rats selectively bred for high locomotor responsivity in a novel environment
show sign tracking, whereas those bred for low locomotor responsivity show goal track-
ing. Furthermore, these genetic differences are accompanied by differences in dopamine
release in the reward circuit of the brain in response to the CS that signals food (Flagel
et al., 2011). Studies of individual differences in sign tracking and goal tracking are excit-
ing because they may some day tell us how learning and experience regulate gene
expression to produce impulsive behavior and drug abuse.
Learning Taste Preferences and Aversions
The normal course of eating provides numerous opportunities for the learning of asso-
ciations. Essentially, each eating episode is a conditioning trial. Whenever we eat, the
sight, taste, and smell of the food are experienced before the food is swallowed and
[Figure 3.6 data panels (A–D) plot number of responses across blocks of trials.]
FIGURE 3.6 Rats in
an experimental cham-
ber with a recessed food
cup in the middle and a
slot on either side of the
food cup through which
a response lever can be
inserted. Presentation of
the left or right lever was
the CS. A pellet of food
in the food cup was the
US. Sign tracking was
measured by contacts
with the CS lever (Panel A).
Goal tracking was
measured by contacts
with the food cup (Panel B).
About one third of
the rats (Panel C) de-
veloped sign tracking as
the CR (black circles)
and one third of the rats
(Panel D) developed
goal tracking as the CR
(white circles). The re-
maining rats showed
some of each behavior
(grey circles). Results are
shown in blocks of 50
conditioning trials
(based on Flagel, Akil, &
Robinson, 2009).
digested. The sensory aspects of the food serve as CSs that become associated with the
postingestional consequences of eating, which are USs. Through this process, food cues
come to signal what, when, and how much we eat (Polivy, Herman, & Girz, 2011).
Pavlovian conditioning can lead to the learning of food preferences and aversions.
A taste preference is learned if a flavor is paired with nutritional repletion or other posi-
tive consequences (e.g., Capaldi, Hunter, & Lyn, 1997). In contrast, a conditioned taste
aversion is learned if ingestion of a novel flavor is followed by an aversive consequence
such as indigestion or food poisoning (Reilly & Schachtman, 2009). The learning of taste
aversions and taste preferences has been investigated extensively in various animal spe-
cies. A growing body of evidence indicates that many human taste aversions are also the
result of Pavlovian conditioning (Scalera, 2002). Much of this evidence has been pro-
vided by questionnaire studies (e.g., Logue, 1985, 1988). People report having acquired
at least one food aversion during their lifetime. The typical aversion learning experience
involves eating a distinctively flavored food and then getting sick. Such a flavor–illness
experience can produce a conditioned food aversion in just one trial, and the learning
can occur even if the illness is delayed several hours after ingestion of the food. Another
interesting finding is that in about 20% of the cases, the individuals were certain that
their illness was not caused by the food they ate. Nevertheless, they learned an aversion
to the food. This indicates that food aversion learning can be independent of rational
thought processes and can go against a person’s own conclusions about the causes of
the illness.
Questionnaire studies can provide thought-provoking data, but systematic experi-
mental research is required to isolate the mechanism of food-aversion learning. Experi-
mental studies have been conducted with people in situations where they encounter
illness during the course of medical treatment. Chemotherapy for cancer is one such sit-
uation. Chemotherapy often causes nausea as a side effect. Both child and adult cancer
patients have been shown to acquire aversions to foods eaten before a chemotherapy ses-
sion (Bernstein & Webster, 1980; Scalera & Bavieri, 2009). Such conditioned aversions
may contribute to the lack of appetite that is a common side effect of chemotherapy.
Conditioned food aversions also may contribute to anorexia nervosa, a disorder
characterized by severe and chronic weight loss (Bernstein & Borson, 1986). Suggestive
evidence indicates that people suffering from anorexia nervosa experience digestive dis-
orders that may increase their likelihood of learning food aversions. Increased suscepti-
bility to food-aversion learning may also contribute to loss of appetite seen in people
suffering from severe depression.
Many of our ideas about food-aversion learning in people have their roots in
research with laboratory animals. In the typical procedure, the participants receive a dis-
tinctively flavored food or drink and are then made to feel sick by the injection of a drug
or exposure to radiation. As a result of the taste–illness pairing, the animals acquire an
aversion to the taste and suppress their subsequent intake of that flavor. Although taste-
aversion learning is similar to other forms of classical conditioning in many respects
(e.g., Domjan, 1983), it also has some special features. First, strong taste aversions can
be learned with just one pairing of the flavor and illness. Although one-trial learning
also occurs in fear conditioning, such rapid learning is rarely observed in eyeblink con-
ditioning, salivary conditioning, or sign tracking.
The second unique feature of taste-aversion learning is that it occurs even if the
illness does not occur until several hours after exposure to the novel taste (Garcia,
Ervin, & Koelling, 1966; Revusky & Garcia, 1970). Dangerous substances in food often
do not produce illness effects until the food has been digested, absorbed in the blood
stream, and distributed to various body tissues. This process takes time. Long-delay
learning of taste aversions probably evolved to enable human and other animals to avoid
poisonous foods that have delayed ill effects.
Long-delay taste-aversion learning was reported in an early study by Smith and Roll
(1967). Laboratory rats were first adapted to a water deprivation schedule so that they
would readily drink when a water bottle was placed on their cage. On the conditioning
day, the water was flavored with the artificial sweetener saccharin (to make a 0.1% saccha-
rin solution). At various times after the saccharin presentation ranging from 0 to 24
hours, different groups of rats were exposed to radiation from an X-ray machine to
induce illness. Control groups of rats were also taken to the X-ray machine but were
not irradiated. They were called the sham-irradiated groups. Starting a day after the radi-
ation or sham treatment, each rat was given a choice of the saccharin solution or plain
water to drink for 2 days.
The preference of each group of rats for the saccharin solution is shown in
Figure 3.7. Animals exposed to radiation within 6 hours after tasting the saccharin solu-
tion showed a profound aversion to the saccharin flavor in the postconditioning test.
They drank less than 20% of their total fluid intake from the saccharin drinking tube.
Much less of an aversion was evident in animals irradiated 12 hours after the saccharin
exposure, and hardly any aversion was observed in rats irradiated 24 hours after the taste
exposure. In contrast to this gradient of saccharin avoidance in the irradiated rats, all the
sham-irradiated groups strongly preferred the saccharin solution. They drank more than
70% of their total fluid intake from the saccharin drinking tube.
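The percent-preference measure used in this experiment is straightforward to compute. The following minimal Python sketch illustrates the calculation; the intake values are hypothetical and are not taken from Smith and Roll (1967).

```python
def percent_preference(saccharin_ml, water_ml):
    """Percentage of total fluid intake that came from the saccharin tube."""
    return 100.0 * saccharin_ml / (saccharin_ml + water_ml)

# Hypothetical intakes: an irradiated rat avoids the saccharin solution,
# whereas a sham-irradiated rat prefers it.
print(percent_preference(3.0, 17.0))   # 15.0 -> strong aversion (< 20%)
print(percent_preference(17.0, 3.0))   # 85.0 -> strong preference (> 70%)
```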
A flavor can also be made unpalatable by pairing it with another taste that is already
disliked. In an analogous fashion, the pairing of a neutral flavor with a taste that is
already liked will increase preference for that flavor. For example, in a study with under-
graduate students, Dickinson and Brown (2007) used banana and vanilla as neutral fla-
vors. To induce a flavor aversion or preference, the students received these flavors mixed
with a bitter substance (to condition an aversion) or sugar (to condition a preference). In
subsequent tests with the CS flavors, the undergraduates reported increased liking of the
flavor that had been paired with sugar and decreased liking of the flavor that had been
paired with the bitter taste.
These examples of how people learn to like or dislike initially neutral flavors are part
of the general phenomenon of evaluative conditioning (De Houwer, 2011; Hoffmann
et al., 2010). In evaluative conditioning, our evaluation or liking of a stimulus is changed
by having that stimulus associated with something we already like or dislike. Evaluative
conditioning is responsible for many of our likes and dislikes. It is the basis of much of
what is done in the advertising industry. The product the advertiser is promoting is
FIGURE 3.7 Mean
percent preference for
the saccharin CS flavor
during a test session
conducted after the CS
flavor was paired with X
irradiation (the US) or
sham exposure. Percent
preference is the per-
centage of the partici-
pant’s total fluid intake
(saccharin solution plus
water) that consisted of
the saccharin solution.
During conditioning, the
interval between expo-
sure to the CS and the
US ranged from 0 to 24
hours for different
groups of rats. (Based on
“Trace Conditioning
with X-rays as an Aver-
sive Stimulus,” by J. C.
Smith and D. L. Roll,
Psychonomic Science,
1967, 9, pp. 11–12.)
paired with things people already like in an effort to induce a preference for the product
(Schachtman, Walker, & Fowler, 2011). Evaluative conditioning may also be involved in
how we come to like somebody. If we participate in activities we enjoy with a particular
person, we will come to like that person through association of features of the person
with the enjoyable activities.
Excitatory Pavlovian Conditioning Methods
What we have been discussing so far are instances of excitatory Pavlovian conditioning. In
excitatory conditioning, organisms learn a relation between a CS and US. As a result of this
learning, presentation of the CS activates behavioral and neural activity related to the US in
the absence of the actual presentation of that US. Thus, pigeons learn to approach and peck
a key light that had been paired with food, rats learn to freeze to a sound that previously
preceded foot shock, babies learn to blink in response to a tone that preceded a puff of air,
and people learn to avoid a flavor that was followed by illness on an earlier occasion.
Common Pavlovian Conditioning Procedures
One of the major factors that determines the course of classical conditioning is the rela-
tive timing of the CS and the US. Often small and seemingly trivial variations in how a
CS is paired with a US can have profound effects on how vigorously the participant exhi-
bits a conditioned response and when the CR occurs.
Five common classical conditioning procedures are illustrated in Figure 3.8. The
horizontal distance in each diagram represents the passage of time; vertical displace-
ments represent when a stimulus begins and ends. Each configuration of CS and US
represents a single conditioning trial.
FIGURE 3.8 Five common classical conditioning procedures (short-delayed, trace, long-delayed, simultaneous, and backward conditioning), each diagrammed as the on/off timing of the CS and US within a single conditioning trial.
In a typical classical conditioning experiment, CS–US episodes are repeated a num-
ber of times during an experimental session. The time from the end of one conditioning
trial to the start of the next trial is called the intertrial interval. By contrast, the time
from the start of the CS to the start of the US within a conditioning trial is called the
interstimulus interval or CS-US interval. For conditioned responding to develop, it is
advisable to make the interstimulus interval much shorter than the intertrial interval
(e.g., Sunsay & Bouton, 2008). In many experiments, the interstimulus interval is a few
seconds, whereas the intertrial interval may be 2–3 minutes or more.
1. Short-delayed conditioning. The most frequently used procedure for Pavlovian condi-
tioning involves delaying the start of the US slightly after the start of the CS on each
trial. This procedure is called short-delayed conditioning. The critical feature of
short-delayed conditioning is that the CS starts each trial, and the US is presented
after a brief (less than 1 minute) delay. The CS may continue during the US or end
when the US begins.
2. Trace conditioning. The trace-conditioning procedure is similar to the short-delayed
procedure in that the CS is presented first and is followed by the US. However, in
trace conditioning, the US is not presented until some time after the CS has ended.
This leaves a gap between the CS and US. The gap is called the trace interval.
3. Long-delayed conditioning. The long-delayed conditioning procedure is also similar
to the short-delayed conditioning in that the CS starts before the US. However, in
this case the US is delayed much longer (5–10 minutes or more) than in the short-
delayed procedure. Importantly, the long-delayed procedure does not include a
trace interval. The CS lasts until the US begins.
4. Simultaneous conditioning. Perhaps the most obvious way to expose subjects to a CS
and a US is to present the two stimuli at the same time. This procedure is called
simultaneous conditioning. The critical feature of simultaneous conditioning is
that the CS and US are presented concurrently.
5. Backward conditioning. The last procedure depicted in Figure 3.8 differs from the
others in that the US occurs shortly before, rather than after, the CS. This technique
is called backward conditioning because the CS and US are presented in a “back-
ward” order compared to the other procedures.
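The five procedures differ only in the relative timing of the CS and US within a trial, so they can be summarized compactly. The following Python sketch encodes each procedure as onset and offset times; the specific durations are assumptions chosen for the example, not prescribed values, and a real session would separate trials by a much longer intertrial interval.

```python
# Each procedure as (cs_onset, cs_offset, us_onset) in seconds within a trial.
# Durations are illustrative; the intertrial interval (e.g., 2-3 minutes)
# would be much longer than any CS-US interval shown here.
procedures = {
    "short_delayed": (0.0, 9.0, 8.0),     # US starts shortly after CS onset; no gap
    "trace":         (0.0, 8.0, 10.0),    # US starts after CS offset (trace interval)
    "long_delayed":  (0.0, 600.0, 600.0), # US delayed minutes; CS lasts until US begins
    "simultaneous":  (0.0, 8.0, 0.0),     # CS and US begin together
    "backward":      (5.0, 13.0, 0.0),    # US occurs before the CS
}

for name, (cs_on, cs_off, us_on) in procedures.items():
    isi = us_on - cs_on                   # interstimulus (CS-US) interval
    trace_gap = max(0.0, us_on - cs_off)  # gap between CS offset and US onset
    print(f"{name:14s} CS-US interval {isi:+7.1f} s, trace interval {trace_gap:.1f} s")
```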
Measuring Conditioned Responses
Pavlov and others after him have conducted systematic investigations of procedures such
as those depicted in Figure 3.8 to find out how the conditioning of a CS depends on the
temporal relation between CS and US presentations. To make comparisons among the
various procedures, one has to use a method for measuring conditioning that is equally
applicable to all procedures. This is typically done with the use of a test trial. A test trial
consists of presenting the CS by itself (without the US). Responses elicited by the CS
can then be observed without contamination from responses elicited by the US. Such
CS-alone test trials can be introduced periodically during the course of training to track
the progress of learning.
Behavior during the CS can be quantified in several ways. One aspect of conditioned
behavior is how much of it occurs. This is called the magnitude of the CR. Pavlov, for
example, measured the number of drops of saliva that were elicited by a CS. Other exam-
ples of the magnitude of CRs are the amount of freezing or response suppression that
occurs in fear conditioning (Figure 3.2) and the degree of depressed flavor preference
that is observed in taste-aversion learning (Figure 3.7).
The vigor of responding can also be measured by how often the CS elicits a CR. For
example, we can measure the percentage of trials on which a CR is elicited by the CS. This
measure is frequently used in studies of eyeblink conditioning (Figure 3.3) and reflects the
likelihood, or probability, of responding. Sometimes investigators also measure how soon
the CR occurs after the onset of the CS. This is called the latency of the CR.
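These three measures can be computed from a set of test-trial records. The following minimal Python sketch uses hypothetical values for illustration.

```python
# Hypothetical test-trial records: (cr_occurred, magnitude, latency_seconds).
# Latency is None on trials without a CR.
trials = [
    (True, 4.0, 1.2),   # CR occurred: e.g., 4 drops of saliva, 1.2 s after CS onset
    (False, 0.0, None), # no CR on this trial
    (True, 6.0, 0.8),
    (True, 5.0, 1.0),
]

mean_magnitude = sum(mag for _, mag, _ in trials) / len(trials)    # magnitude of the CR
percent_with_cr = 100.0 * sum(cr for cr, _, _ in trials) / len(trials)  # probability
latencies = [lat for cr, _, lat in trials if cr]
mean_latency = sum(latencies) / len(latencies)                     # latency of the CR

print(mean_magnitude, percent_with_cr, mean_latency)  # 3.75 75.0 1.0
```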
In the delayed and trace-conditioning procedures, the CS occurs by itself at the start
of each trial (Figure 3.8). Any conditioned behavior that occurs during this initial CS-
alone period is uncontaminated by behavior elicited by the US and therefore can be
used as a measure of learning. In contrast, responding during the CS in simultaneous
and backward conditioning procedures is bound to be contaminated by responding to
the US or the recent presentation of the US. Therefore, test trials are critical for assessing
learning in simultaneous and backward conditioning.
Control Procedures for Classical Conditioning
Devising an effective test trial is not enough to obtain conclusive evidence of classical
conditioning. As I noted in Chapter 1, learning is an inference about the causes of
behavior based on a comparison of at least two conditions. Participants who receive a
conditioning procedure have to be compared with participants in a control group who
do not receive training. If the control group does not receive the conditioning procedure,
what treatment should it receive? In studies of habituation and sensitization, we were
interested only in the effects of prior exposure to a stimulus. Therefore, the comparison
or control procedure was rather simple: it consisted of no prior stimulus exposures. In
studies of classical conditioning, our interest is in how the CS and US become associated.
Concluding that an association has been established requires more carefully designed
control procedures.
An association between a CS and a US implies that the two events have become
connected in some way. An association requires more than just familiarity with the CS
and US. It presumably depends on having the two stimuli presented in a special way that
leads to a connection between them. Therefore, to conclude that an association has been
established, one has to make sure that the observed change in behavior could not have
been produced by prior separate presentations of the CS or the US.
As I described in Chapter 2, increased responding to a stimulus can be a result of sensi-
tization, which is not an associative process. Presentations of an arousing stimulus, such as
food to a hungry animal, can increase the behavior elicited by a more innocuous stimulus,
such as a visual cue, without an association having been established between the two stimuli.
Increases in responding observed with repeated CS-US pairings can sometimes result from
exposure to just the US. If exposure to just the US produces increased responding to a pre-
viously ineffective stimulus, this is called pseudo-conditioning. Control procedures are
required to determine whether responses that develop to a CS represent a genuine CS–US
association rather than just pseudo-conditioning.
Investigators have debated at length about what is the proper control procedure for
classical conditioning. Ideally, a control procedure should involve the same number and
distribution of CS and US presentations as the experimental procedure, but with the CSs
and USs arranged so that they do not become associated. One possibility is to present
the US at random times during both the CS and the intertrial interval, making sure
that the probability of the US is the same during the intertrial interval as it is during
the CS. Such a procedure is called a random control procedure. The random control
procedure was promising when it was first proposed (Rescorla, 1967). However, it has
not turned out to be a useful method because it does not prevent the development of
conditioned responding (e.g., Kirkpatrick & Church, 2004; Williams et al., 2008).
A more successful control procedure involves presenting the CS and US on separate
trials. Such a procedure is called the explicitly unpaired control. In the explicitly unpaired
control, the CS and US are presented far enough apart to prevent their association, but the
total number of CS and US presentations is the same as in the conditioned or paired
group. How much time separates CS and US presentations depends on the response sys-
tem. In taste-aversion learning, much longer separation is necessary between the CS and
US than in other forms of conditioning. In one variation of the explicitly unpaired control,
only CSs are presented during one session and only USs are presented during a second
session.
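The logic of the explicitly unpaired control can be sketched as a schedule that matches the paired procedure in the number of CS and US presentations while keeping the two stimuli far apart in time. All timing values in the following Python sketch are illustrative assumptions.

```python
def paired_session(n_trials, isi=5.0, iti=180.0):
    """Each CS is followed by the US after a short interstimulus interval."""
    events = []
    for i in range(n_trials):
        t = i * iti
        events += [("CS", t), ("US", t + isi)]
    return events

def explicitly_unpaired_session(n_trials, iti=180.0):
    """Same numbers of CSs and USs, but each US falls near the middle of the
    intertrial interval, far from any CS."""
    events = []
    for i in range(n_trials):
        t = i * iti
        events += [("CS", t), ("US", t + iti / 2)]
    return events

print(paired_session(2))              # CS at 0 s, US at 5 s, ...
print(explicitly_unpaired_session(2)) # CS at 0 s, US at 90 s, ...
```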
Effectiveness of Common Conditioning Procedures
There has been considerable interest in determining which of the procedures depicted in
Figure 3.8 produces the strongest evidence of learning. This interest was motivated by a
search to find the “best” procedure for producing Pavlovian conditioning. However, that
has turned out to be the wrong question to ask. Research has shown that delayed, simul-
taneous, trace, and backward conditioning can all produce strong learning and vigorous
conditioned responding (e.g., Albert & Ayres, 1997; Akins & Domjan, 1996; Marchand &
Kamper, 2000; Romaniuk & Williams, 2000; Schreurs, 1998; Williams & Hurlburt, 2000).
However, different behavioral and neural mechanisms are engaged by these different
procedures. For example, in fear conditioning the CS elicits conditioned freezing if a
short-delayed procedure is used, but with a simultaneous conditioning procedure, the
CR is movement away from the CS, or escape (Esmorís-Arranz, Pardo-Vázquez, &
Vázquez-Garcia, 2003). (We will revisit these issues in Chapter 4.)
Trace-conditioning procedures have been of special interest because they can have
the same CS–US interval as delayed conditioning procedures. However, in trace proce-
dures the CS is turned off a short time before the US occurs, resulting in a trace interval.
This temporal gap between the CS and the US activates different behavioral and neural
mechanisms. The trace interval makes termination of the CS a better predictor of the US
than the onset of the CS. As a consequence, CRs that reflect anticipation of the US are
more likely to occur during the trace interval than during the CS. In addition, trace con-
ditioning involves medial forebrain cortical neurons that are not involved in delayed
conditioning (Kalmbach et al., 2009; Woodruff-Pak & Disterhoft, 2008). Because of this
difference, lesions of the prefrontal cortex that disrupt trace conditioning do not disrupt
delayed conditioning. Interestingly, whereas both delayed and trace conditioning show
a decline with aging, the decline is less severe with delayed conditioning (Woodruff-Pak
et al., 2007).
Another procedure that has attracted special interest is backward conditioning.
Backward conditioning produces mixed results. Some investigators observed excitatory
responding with backward pairings of a CS and US (e.g., Spetch, Wilkie, & Pinel,
1981). Others reported inhibition of conditioned responding with backward conditioning
(e.g., Maier, Rapaport, & Wheatley, 1976; Siegel & Domjan, 1971). To make matters even
more confusing, in a rather remarkable experiment, Tait and Saladin (1986) found both
excitatory and inhibitory conditioning effects resulting from the same backward condi-
tioning procedure (see also, McNish et al., 1997).
One reason why there is no “best” procedure for Pavlovian conditioning is that
instead of learning just a CS–US association, participants also learn when the US occurs
in relation to the CS (Balsam, Drew, & Yang, 2001; Ohyama & Mauk, 2001). In fact,
some have suggested that learning when the US occurs may be more important than learn-
ing that a CS is paired with a US (Balsam & Gallistel, 2009; Balsam, Drew, & Gallistel,
2010). The view that classical conditioning involves not only learning what to expect but
when to expect it is called the temporal coding hypothesis (Amundson & Miller, 2008).
A particularly elegant demonstration of learning when the US occurs is provided by
a study in which the CS was a 1-second tone in conditioning the nictitating membrane
response of rabbits (Dudeney, Olsen, & Kehoe, 2007). (The nictitating membrane is
an “extra” eyelid in rabbits.) During each session, the rabbits got 20 trials in which the
US occurred 150 milliseconds after the onset of the CS and 20 trials in which the US
occurred 500 milliseconds after the onset of the CS. Additional trials without the
US were also included to measure responding to the CS by itself. The investigators
were interested in whether the timing of the CR on the test trials would match the two
possible time points when the US occurred during the CS.
Measures of the magnitude of the CR at different points during the CS are presented
in Figure 3.9. No conditioned responding was observed early in training, but responding
developed after that. Notice that when the CR emerged, it occurred at the points during
the CS when the US had been delivered. Responding around 150 milliseconds developed
first. By day 6, responding was also evident around 500 milliseconds. The end result was
that there were two peaks in the CR, corresponding to the two times when the US could
occur during the 1,000-millisecond tone CS. This clearly shows that the rabbits learned
not just that the US would occur but also exactly when the US would happen. Further-
more, these two forms of learning developed at the same rate.
Inhibitory Pavlovian Conditioning
So far I have been discussing Pavlovian conditioning in terms of learning to predict
when a significant event or US will occur. But there is another type of Pavlovian condi-
tioning, inhibitory conditioning, in which you learn to predict the absence of the US.
Why would you want to predict the absence of something?
Consider being in an environment where bad things happen to you without warn-
ing. Civilians in war can encounter roadside bombs or suicide bombers without much
warning. A child in an abusive home experiences unpredictable bouts of yelling, slam-
ming doors, and getting hit for no particular reason. Getting pushed and shoved in a
crowd also involves danger that arises without much warning and independent of what
you might be doing. Research with laboratory animals has shown that exposure to
unpredictable aversive stimulation is highly aversive and results in stomach ulcers and
other physiological symptoms of stress. If one has to be exposed to aversive stimulation,
predictable or signaled aversive stimuli are much preferred to unpredictable aversive
FIGURE 3.9 Timing of the conditioned response during test trials with a 1000-millisecond conditioned tone stimulus in nictitating membrane conditioning, plotted as CR magnitude against time since CS onset (in milliseconds) for training days 2, 4, 6, 8, and 10. During conditioning trials, the US could occur 150 or 500 milliseconds after CS onset. (Based on Dudeney, Olsen, & Kehoe, 2007.)
stimulation (Mineka & Henderson, 1985), especially among anxiety-prone individuals
(Lejuez et al., 2000).
The benefit of predictability is evident even in the case of a panic attack. A panic
attack is a sudden sense of fear or discomfort, accompanied by physical symptoms (e.g.,
heart palpitations) and a sense of impending doom. If such attacks are fairly frequent
and become the source of considerable anxiety, the individual is said to suffer from panic
disorder. Sometimes, individuals with panic disorder are able to predict the onset of a
panic attack. At other times, they may experience an attack without warning. In a study
of individuals who experienced both predictable and unpredictable panic attacks, Craske,
Glover, and DeCola (1995) measured the general anxiety of the participants before and
after each type of attack. The results are summarized in Figure 3.10. Before the attack, anx-
iety ratings were similar whether the attack was predictable or not. Interestingly, however,
anxiety significantly increased after an unpredicted panic attack and decreased after a pre-
dicted attack. Such results indicate that the distress generated by the experience of a panic
attack occurs primarily because of the unpredictability of the attack.
The ability to predict bad things is very helpful because it also enables you to predict
when bad things will not happen. Consistent with this reasoning, many effective stress-
reduction techniques, such as relaxation training or meditation, involve creating a pre-
dictable period of safety or a time when you can be certain that nothing bad will happen.
Stress management consultants recognize that it is impossible to eliminate aversive
events from one’s life altogether. For example, a teacher supervising a playground with
preschool children is bound to encounter the unexpected stress of a child falling or hit-
ting another child. One cannot prevent accidents or make sure that children won’t hit
each other. However, introducing even short periods of predictable safety (e.g., by allow-
ing the teacher to take a break) can substantially reduce stress. That is where conditioned
inhibition comes in. A conditioned inhibitor is a signal for the absence of the US.
Although Pavlov discovered inhibitory conditioning early in the twentieth century,
this type of learning did not command the serious attention of psychologists until dec-
ades later (Rescorla, 1969b; Williams, Overmier, & LoLordo, 1992). I will describe two
major procedures used to produce conditioned inhibition and the special tests that are
necessary to detect and measure conditioned inhibition.
FIGURE 3.10 Ratings of daily general anxiety (on a 0–8 scale) in individuals with panic disorder before and after predicted and unpredicted panic attacks. (From M. G. Craske, D. Glover, and J. DeCola (1995). Predicted versus unpredicted panic attacks: Acute versus general distress. Journal of Abnormal Psychology, 104, Figure 1, p. 219.)
Procedures for Inhibitory Conditioning
Unlike excitatory conditioning, which can proceed without special preconditions, condi-
tioned inhibition has an important prerequisite. For the absence of a US to be a signifi-
cant event, the US has to occur periodically in the situation. There are many signals for
the absence of events in our daily lives. Signs such as “Closed,” “Out of Order,” and “No
Entry” are all of this type. However, these signs provide meaningful information and
influence what we do only if they indicate the absence of something we otherwise expect
to see. For example, if we encounter the sign “Out of Gas” at a service station, we may
become frustrated and disappointed. The sign “Out of Gas” provides important informa-
tion here because we expect service stations to have fuel. The same sign does not tell us
anything of interest if it is in the window of a lumberyard, and it is not likely to discour-
age us from going to buy lumber. This illustrates the general rule that inhibitory condi-
tioning and inhibitory control of behavior occur only if there is an excitatory context for
the US in question (e.g., LoLordo & Fairless, 1985). This principle makes inhibitory con-
ditioning very different from excitatory conditioning, which has no such prerequisites.
Pavlov’s Procedure for Conditioned Inhibition Pavlov recognized the importance
of an excitatory context for the conditioning of inhibition and was careful to provide
such a context in his standard inhibitory training procedure (Pavlov, 1927). The proce-
dure he used, diagrammed in Figure 3.11, involves two CSs and two kinds of condition-
ing trials, one for excitatory conditioning and the other for inhibitory conditioning. The
US is presented on excitatory conditioning trials (Trial Type A in Figure 3.11), and
whenever the US occurs, it is announced by a stimulus labeled CS+ (e.g., a tone).
Because of its pairings with the US, the CS+ becomes a signal for the US and can then
provide the excitatory context for the development of conditioned inhibition.
During inhibitory conditioning trials (Trial Type B in Figure 3.11), the CS+ is pre-
sented together with the second stimulus called the CS– (e.g., a light), and the US does
not occur. Thus, the CS– is presented in the excitatory context provided by the CS+, but
the CS– is not paired with the US. This makes the CS– a conditioned inhibitor or signal
for the absence of the US. During the course of training, A-type and B-type trials are
alternated randomly. As the participant receives repeated trials of CS+ followed by the
US and CS+/CS– followed by no US, the CS– gradually acquires inhibitory properties
(e.g., Campolattaro, Schnitker, & Freeman, 2008).
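The trial structure of Pavlov's procedure can be expressed as a simple schedule of randomly alternating trial types. The following Python sketch is illustrative only; the trial count and the equal mixture of trial types are assumptions.

```python
import random

def conditioned_inhibition_session(n_trials, seed=0):
    """Randomly alternate Type A (CS+ -> US) and Type B (CS+/CS- -> no US) trials."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        if rng.random() < 0.5:
            trials.append((["CS+"], True))          # Type A: US follows the CS+
        else:
            trials.append((["CS+", "CS-"], False))  # Type B: compound, US omitted
    return trials

for stimuli, us in conditioned_inhibition_session(6):
    print("+".join(stimuli), "-> US" if us else "-> no US")
```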
Pavlov’s conditioned inhibition procedure is analogous to a situation in which some-
thing is introduced that prevents an outcome that would occur otherwise. A red traffic
light at a busy intersection is a signal for potential danger because running the light
could get you into an accident. However, if a police officer indicates that you should
FIGURE 3.11 Pavlov's procedure for conditioned inhibition. On some trials (Type A), the CS+ is paired with the US. On other trials (Type B), the CS+ is presented with the CS– and the US is omitted. Type A and Type B trials are presented repeatedly in random alternation. The procedure is effective in conditioning inhibitory properties to the CS–.
cross the intersection despite the red light (perhaps because the traffic light is malfunc-
tioning), you will probably not have an accident. Here the red light is the CS+ and the
gestures of the officer constitute the CS–. The gestures inhibit, or block, your hesitation
to cross the intersection because of the red light.
A CS– acts as a safety signal in the context of danger. Children who are afraid will take
refuge in the arms of a parent because the parent serves as a safety signal. Adults who are
anxious also use safety signals to reduce or inhibit their fear or anxiety. People rely on
prayer, a friend, a therapist, or a comforting food at times of stress (Barlow, 1988). These
work in part because we have learned that bad things don’t happen in their presence.
Negative CS-US Contingency or Correlation Another common procedure for pro-
ducing conditioned inhibition does not involve an explicit excitatory stimulus or CS+.
Rather, it involves just a CS– that is negatively correlated with the US. A negative corre-
lation or contingency means that the US is less likely to occur after the CS than at other
times. Thus, the CS signals a reduction in the probability that the US will occur. A sam-
ple arrangement that meets this requirement is diagrammed in Figure 3.12. The US is
periodically presented by itself. However, each occurrence of the CS is followed by the
predictable absence of the US for a while.
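The defining feature of this procedure can be stated as a comparison of two conditional probabilities: P(US | CS) must be lower than P(US | no CS). The following Python sketch computes both probabilities from hypothetical observation bins.

```python
# Hypothetical 1-minute observation bins: (cs_present, us_occurred).
bins = [
    (True, False), (True, False), (True, False), (True, True),
    (False, True), (False, True), (False, False), (False, True),
]

p_us_given_cs = sum(us for cs, us in bins if cs) / sum(cs for cs, _ in bins)
p_us_given_no_cs = (sum(us for cs, us in bins if not cs)
                    / sum(not cs for cs, _ in bins))

# Negative contingency: P(US | CS) < P(US | no CS),
# so the CS signals a reduced likelihood of the US.
print(p_us_given_cs, p_us_given_no_cs)  # 0.25 0.75
```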
Consider a child who periodically gets picked on or bullied by his or her classmates
when the teacher is out of the room. This is like periodically receiving an aversive stim-
ulus or US. When the teacher returns, the child can be sure he or she will not be both-
ered. Thus, the teacher serves as a CS– that signals a period free from harassment, or the
absence of the US.
Conditioned inhibition is reliably observed in procedures in which the only explicit
CS is negatively correlated with the US (Rescorla, 1969a). What provides the excitatory
context for this inhibition? In this case, the environmental cues of the experimental
chamber provide the excitatory context (Dweck & Wagner, 1970). Because the US occurs
periodically in the experimental situation, the contextual cues of the experimental cham-
ber acquire excitatory properties. This in turn permits the acquisition of inhibitory prop-
erties by the CS. (For a study of the role of context in inhibitory conditioning, see Chang,
Blaisdell, & Miller, 2003.)
In a negative CS–US contingency procedure, the aversive US may occur shortly after
the CS occasionally but it is much more likely to occur in the absence of the CS; that is
what defines the negative CS–US contingency. However, even in the absence of the CS,
the exact timing of the US cannot be predicted exactly because the US occurs at various
times probabilistically. This is in contrast to Pavlov’s procedure for conditioned inhibi-
tion. In Pavlov’s procedure, the US always occurs at the end of the CS+ but never occurs
when the CS– is presented together with the CS+. Because Pavlov’s procedure permits
predicting the exact timing of the US, it also permits predicting exactly when the US
will not occur. The US will not occur at the end of CS+ if the CS+ is presented with
the CS–. Tests of temporal learning have shown that in Pavlov’s procedure for condi-
tioned inhibition participants learn exactly when the US will be omitted (Denniston,
Blaisdell, & Miller, 2004; Williams, Johns, & Brindas, 2008).
FIGURE 3.12 (timeline diagram showing CS and US occurrences): A negative CS–US contingency procedure for conditioning inhibitory properties to the CS. Notice that the CS is always followed by a period without the US.
Measuring Conditioned Inhibition
How are conditioned inhibitory processes manifested in behavior? For conditioned exci-
tation, the answer to this type of question is straightforward. Excitatory stimuli elicit new
CRs such as salivation, approach, or eye blinking. One might expect that conditioned
inhibitory stimuli would elicit the opposites of these reactions—namely, suppression of
salivation, approach, or eye blinking—but how are we to measure such response
opposites?
Bidirectional Response Systems Identification of an opposing response tendency is
easy with response systems that can change in opposite directions from baseline or nor-
mal performance. Heart rate, respiration, and temperature can all increase or decrease
from a baseline level. Certain behavioral responses are also bidirectional. For example,
animals can either approach or withdraw from a stimulus or drink more or less of a fla-
vored solution. In these cases, conditioned excitation results in a change in behavior in
one direction, and conditioned inhibition results in a change in behavior in the opposite
direction.
Unfortunately, many responses are not bidirectional. Consider freezing or response
suppression as a measure of conditioned fear. A conditioned excitatory stimulus will
elicit freezing, but a conditioned inhibitor will not produce activity above normal levels.
A similar problem arises in eyeblink conditioning. A CS+ will elicit increased blinking,
but the inhibitory effects of a CS– are difficult to detect because the baseline rate of
blinking is low to begin with. It is hard to see inhibition of blinking below an already
low baseline. Because of these limitations, conditioned inhibition is typically measured
indirectly using the compound-stimulus test and the retardation of acquisition test.
The Compound-Stimulus, or Summation, Test The compound-stimulus test (or
summation test) was particularly popular with Pavlov and remains one of the most
widely accepted procedures for the measurement of conditioned inhibition. The test is
based on the simple idea that conditioned inhibition counteracts or inhibits conditioned
excitation. Therefore, to observe conditioned inhibition, one has to measure how the pre-
sentation of a CS– disrupts or suppresses responding that would normally be elicited
by a CS+.
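The logic of the test can be illustrated with a toy additive model in which each stimulus carries an associative strength V (positive for excitors, negative for inhibitors) and responding to a compound reflects the summed strengths. The stimulus labels anticipate the experiment described next, but the numerical values are assumptions for illustration only.

```python
# Toy additive model of the summation test. The values are
# hypothetical: A and B stand for the trained excitors, X for the
# conditioned inhibitor, and Y for a neutral control stimulus.

V = {"A": 0.8, "B": 0.9, "X": -0.6, "Y": 0.0}

def predicted_fear(*stimuli):
    # Net excitation of a compound, floored at zero because
    # responding cannot fall below a zero baseline.
    return max(0.0, sum(V[s] for s in stimuli))

print(round(predicted_fear("B"), 2))       # 0.9: strong suppression
print(round(predicted_fear("B", "X"), 2))  # 0.3: inhibitor counteracts B
print(round(predicted_fear("B", "Y"), 2))  # 0.9: mere novelty, no change
```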
A particularly well-controlled demonstration of conditioned inhibition using the
compound-stimulus or summation test was reported by Cole, Barnet, and Miller
(1997). The experiment was conducted using the lick-suppression procedure with labora-
tory rats. The rats received inhibitory conditioning in which the presentation of a flash-
ing light by itself always ended in a brief shock (A+), and the presentation of an auditory
cue (X) together with the light ended without shock (AX–). Thus, Pavlov’s procedure for
conditioned inhibition was used, and X was expected to become an inhibitor of fear.
A total of 28 A+ trials and 56 AX– trials were conducted over seven sessions. The rats
also received training with another auditory stimulus (B) in a different experimental cham-
ber, and this stimulus always ended in the brief shock (B+). The intent of these procedures
was to establish conditioned excitation to A and B and conditioned inhibition to X.
Cole and colleagues then asked whether the presumed inhibitor X would suppress
responding to the excitatory stimuli A and B. The results of those tests are summarized
in Figure 3.13. How long the rats took to accumulate 5 seconds of uninterrupted drink-
ing was measured. Conditioned fear was expected to slow the rate of drinking. Notice
that when the excitatory stimuli, A and B, were presented by themselves, the rats
required substantial amounts of time to complete the 5-second drinking criterion. In
contrast, when the excitatory stimuli were presented together with the conditioned inhib-
itor (AX and BX tests), the drinking requirement was completed much faster. Thus,
presenting stimulus X with A and B reduced the drinking suppression that occurred
when A and B were presented alone. X inhibited conditioned fear elicited by A and B.
Figure 3.13 includes another test condition: stimulus B tested with another auditory
cue, Y. Stimulus Y was not previously conditioned as an inhibitor and was presented to
be sure that introducing a new stimulus with stimulus B would not cause disruption of
the conditioned fear response just because of novelty. As Figure 3.13 illustrates, no such
disruption occurred with stimulus Y. Thus, the inhibition of conditioned fear was limited
to the stimulus (X) that received conditioned inhibition training. Another important
aspect of these results is that X was able to inhibit conditioned fear not only to the
exciter with which it was trained (A) but also to another exciter (B) that had never
been presented with X during training. Thus, X became a general safety signal.
FIGURE 3.13 (bar graph; vertical axis: mean time in log seconds; test conditions: B, BX, BY, A, AX): Compound-stimulus test of inhibition in a lick-suppression experiment. Stimuli A and B were conditioned as excitatory stimuli by being presented alone with shock (A+ and B+). Stimulus X was conditioned as an inhibitor by being presented with stimulus A without shock (AX–). Stimulus Y was a control stimulus that had not participated in either excitatory or inhibitory conditioning. A was a flashing light; B, X, and Y were auditory cues (a clicker, white noise, and a buzzer, counterbalanced across participants). A and AX were tested in the original training context; B, BX, and BY were tested in a different context. (Based on Cole, Barnet, & Miller, 1997, Learning and Motivation, 28, p. 333.)
The compound-stimulus test for conditioned inhibition indicates that the presenta-
tion of a conditioned inhibitor or safety signal can reduce the stressful effects of an aver-
sive experience. This prediction has been tested with patients who were prone to
experience panic attacks (Carter et al., 1995). Panic attack patients were invited to the
laboratory and accompanied by someone with whom they felt safe. Panic was experi-
mentally induced by having the participants inhale a mixture of gas containing elevated
levels of carbon dioxide. The participants were then asked to report on their perceived
levels of anxiety and catastrophic ideation triggered by the carbon dioxide exposure.
The experimental manipulation was the presence of another person with whom the par-
ticipants felt safe (the conditioned inhibitor). The presence of a safe acquaintance
reduced the anxiety and catastrophic ideation associated with the panic attack. These
results explain why children are less fearful during a medical examination if they are
accompanied by a trusted parent or allowed to carry a favorite toy or blanket.
The Retardation of Acquisition Test Another frequently used indirect test of con-
ditioned inhibition is the retardation of acquisition test (Rescorla, 1969b). The rationale
for this test is straightforward. If a stimulus actively inhibits a particular response, then it
should be especially difficult to turn that stimulus into a conditioned excitatory CS. In
other words, the rate of excitatory conditioning should be retarded if the CS was previ-
ously established as a conditioned inhibitor. This prediction was tested by Cole and col-
leagues (1997) in an experiment very similar to their summation test study described
earlier.
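This rationale can be sketched with the kind of simple error-correction update that is formalized in Chapter 4 as the Rescorla–Wagner model. The starting strengths and parameters below are illustrative assumptions, not values estimated from any experiment.

```python
# Hedged sketch of the retardation test logic under a simple
# delta-rule update: associative strength v moves toward the
# asymptote lam on every CS-shock pairing. A former inhibitor
# starts with negative v; a novel control starts at zero.

def acquire(v, n_trials, alpha=0.3, lam=1.0):
    history = []
    for _ in range(n_trials):
        v += alpha * (lam - v)  # one CS-shock pairing
        history.append(round(v, 2))
    return history

print(acquire(v=-0.6, n_trials=3))  # X, former inhibitor: [-0.12, 0.22, 0.45]
print(acquire(v=0.0, n_trials=3))   # Y, control: [0.3, 0.51, 0.66]
# X lags behind Y after the same three pairings.
```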
After the same kind of inhibitory conditioning that produced the results summa-
rized in Figure 3.13, Cole and colleagues took stimulus X (which had been conditioned
as an inhibitor) and stimulus Y (which had not been used in a conditioning procedure
before) and conducted a retardation of acquisition test by pairing each stimulus with
shock on three occasions. (Three acquisition trials were sufficient because conditioned
fear is learned faster than the inhibition of fear.) Stimuli X and Y were then tested to
see which would cause greater suppression of drinking. The results are presented in
Figure 3.14. The time to complete 5 seconds of drinking took much longer in the pres-
ence of the control stimulus Y than in the presence of stimulus X, which had previously
been trained as a conditioned inhibitor. This indicates that Y elicited greater conditioned
fear than stimulus X. Evidently, the initial inhibitory training of X retarded its acquisi-
tion of excitatory conditioned fear.
FIGURE 3.14 (bar graph; vertical axis: mean time in log seconds; test stimuli: X and Y): Effects of a retardation of acquisition test of inhibition in a lick-suppression experiment after the same kind of inhibitory conditioning as was conducted to produce the results presented in Figure 3.13. Stimulus X was previously conditioned as an inhibitory stimulus, and stimulus Y previously received no training. (Based on Cole, Barnet, & Miller, 1997, Learning and Motivation, 28, p. 333.)
Conditioned inhibition can be difficult to distinguish from other behavioral pro-
cesses. Therefore, the best strategy is to use more than one test and be sure that all of
the results point to the same conclusion. Rescorla (1969b) advocated using both the
compound-stimulus test and the retardation of acquisition test. This dual-test strategy
has remained popular ever since.
Prevalence of Classical Conditioning
Classical conditioning is typically investigated in laboratory situations. However, we do
not have to know much about classical conditioning to realize that it also occurs in a
wide range of situations outside the laboratory. Classical conditioning is most likely to
develop when one event reliably precedes another in a short-delayed CS-US pairing.
This occurs in many aspects of life. As I mentioned at the beginning of the chapter, sti-
muli in the environment occur in an orderly temporal sequence, largely because of the
physical constraints of causation. Some events simply cannot happen before other things
have occurred. Eggs won’t be hard boiled until they have been put in boiling water.
Social institutions and customs also ensure that things happen in a predictable order.
Whenever one stimulus reliably precedes another, classical conditioning may occur.
One area of research that has been of particular interest is how people come to judge
one event as the cause of another. In studies of human causal judgment, participants are
exposed to repeated occurrences of two events (pictures of a blooming flower and a
watering can are briefly presented on a computer screen) in various temporal arrange-
ments. In one condition, for example, the watering can may always occur before the
flower; in another it may occur at random times relative to the flower. After observing
numerous appearances of both objects, the participants are asked to indicate their
judgment about the strength of causal relation between them. Studies of human causal
judgment are analogous to studies of Pavlovian conditioning in that both involve
repeated experiences with two events and responses based on the extent to which those
two events become linked to each other. Given this correspondence, one might suspect
that there is considerable commonality in the outcomes of causal judgment and Pavlovian
conditioning experiments. That prediction has been supported in numerous studies,
suggesting that Pavlovian associative mechanisms are not limited to Pavlov’s dogs but
may play a role in the numerous judgments of causality we all make during the course
of our daily lives (see Allan, 2005).
As I described earlier in the chapter, Pavlovian conditioning can result in the condi-
tioning of food preferences and aversions. It can also result in the acquisition of fear.
Conditioned fear responses have been of special interest because they may contribute
significantly to anxiety disorders, phobias, and panic disorder (Craske, Hermans, &
Vansteenwegen, 2006; Oehlberg & Mineka, 2011). As I will discuss further in Chapter 4,
Pavlovian conditioning is also involved in drug tolerance and addiction. Cues that reliably
accompany drug administration can elicit drug-related responses through conditioning. In
discussing this type of learning among crack addicts, Dr. Scott Lukas of McLean Hospital
in Massachusetts described the effects of drug-conditioned stimuli by saying that “these
cues turn on crack-related memories, and addicts respond like Pavlov’s dogs” (Newsweek,
February 12, 2001, p. 40).
Pavlovian conditioning is also involved in infant and maternal responses in nurs-
ing. Suckling involves mutual stimulation for the infant and the mother. To success-
fully nurse, the mother has to hold the baby in a particular position that provides
special tactile and olfactory cues for both the infant and the mother. The tactile
stimuli experienced by the infant may become conditioned to elicit orientation
and suckling responses on the part of the baby (Blass, Ganchrow, & Steiner, 1984).
Olfactory cues experienced by the infant during suckling can also become condi-
tioned. In one study (Allam et al., 2010), infants preferred to play with objects that had the odor of camomile if their mothers had previously used a camomile-scented lotion on the breast. Interestingly, this preference was evident more than a year after the
mothers had stopped using the lotion.
Tactile stimuli provided by the infant to the mother may also become conditioned,
in this case to elicit the milk let-down response in anticipation of having the infant
suckle. Mothers who nurse their babies experience the milk let-down reflex when the
baby cries or when the usual time for breast-feeding arrives. All these stimuli (special
tactile cues, the baby’s crying, and the time of normal feedings) reliably precede suckling
by the infant. Therefore, they can become conditioned by the suckling stimulation to
elicit milk let-down as a CR. The anticipatory conditioned orientation and suckling
responses and the anticipatory conditioned milk let-down response make the nursing
experience more successful for both the baby and the mother.
Pavlovian conditioning is also important in sexual situations. Studies have shown
that sexual behavior can be shaped by learning experiences both in people and in var-
ious animal species (Hoffmann, 2011; Woodson, 2002). In these studies, males typi-
cally serve as the research participants, and the US is provided either by the sight of a
sexually receptive female or by physical access to a female. Conditioned males
approach stimuli that signal the availability of a sexual partner (Burns & Domjan,
1996; Hollis, Cadieux, & Colbert, 1989). As we will describe in Chapter 4, a sexually conditioned
CS also facilitates various aspects of reproductive behavior. Most importantly, the
presentation of a Pavlovian CS+ before a sexual encounter greatly increases the num-
ber of offspring that result from the reproductive behavior. This Pavlovian condi-
tioned fertility effect was originally demonstrated in a fish species (Hollis, 1997) but
has since been also found in studies with domesticated quail (Domjan, Mahometa, &
Matthews, 2012). In one of the more dramatic experiments, Pavlovian conditioning
determined the outcome of sperm competition in domesticated quail (Matthews
et al., 2007). To observe sperm competition, two male quail were permitted to copu-
late with the same female. A copulatory interaction in quail can fertilize as many as
10 of the eggs the female produces after the sexual encounter. If two males copulate
with the same female in succession, the male whose copulation is signaled by a
Pavlovian CS+ sires significantly more of the resulting offspring. By influencing
which male’s genes are represented in the next generation, Pavlovian conditioning
can bias the evolutionary changes that result from sexual competition.
Concluding Comments
This chapter continued our discussion of elicited behavior by turning attention from
habituation and sensitization to classical conditioning. Classical conditioning is a bit
more complex in that it involves associatively mediated elicited behavior. In fact, classical
conditioning is one of the major techniques for investigating how associations are
learned. As we have seen, classical conditioning may be involved in many important
aspects of behavior. Depending on the procedure used, the learning may occur quickly
or slowly. With some procedures, excitatory responses are learned; with other proce-
dures, the organism learns to inhibit an excitatory response tendency. Excitatory and
inhibitory conditioning occur in many aspects of common experience and serve to help
us interact more effectively with significant biological events (USs).
Sample Questions
1. Describe the similarities and differences among
habituation, sensitization, and classical
conditioning.
2. What is object learning, and how is it similar to
or different from conventional classical
conditioning?
3. Why is it difficult to identify the type of condi-
tioning procedure that produces the best
conditioning?
4. What is a control procedure for excitatory
conditioning, and what processes is the control
procedure intended to rule out?
5. Are conditioned excitation and conditioned
inhibition related? If so, how?
6. Describe procedures for conditioning and mea-
suring conditioned inhibition.
7. Describe four reasons why classical conditioning
is of interest to psychologists.
Key Terms
autoshaping Same as sign tracking.
backward conditioning A procedure in which the CS
is presented shortly after the US on each trial.
compound-stimulus test A test procedure that identi-
fies a stimulus as a conditioned inhibitor if that stimulus
reduces the responding elicited by a conditioned excit-
atory stimulus. Also called summation test.
conditional or conditioned response (CR) The
response that comes to be made to the CS as a result
of classical conditioning.
conditional or conditioned stimulus (CS) A stimu-
lus that does not elicit a particular response initially,
but comes to do so as a result of becoming associated
with a US.
conditioned suppression Suppression of ongoing
behavior (e.g., drinking or lever pressing for food)
produced by the presentation of a CS that has been
conditioned to elicit fear through association with an
aversive US.
conditioning trial A training episode involving pre-
sentation of a CS with (or without) a US.
CS-US interval Same as interstimulus interval.
evaluative conditioning Changing the hedonic value
or liking of an initially neutral stimulus by having that
stimulus associated with something that is already liked
or disliked.
explicitly unpaired control A procedure in which
both CS and US are presented, but with sufficient
time between them so that they do not become associ-
ated with each other.
goal tracking Conditioned behavior elicited by a CS
that consists of approaching the location where the US
is usually presented.
inhibitory conditioning A type of classical condi-
tioning in which the CS becomes a signal for the
absence of the US.
interstimulus interval The amount of time that
elapses between the start of the CS and the start of
the US during a classical conditioning trial. Also called
the CS-US interval.
intertrial interval The amount of time that elapses
between two successive trials.
latency The time elapsed between a stimulus (or the
start of a trial) and the response that is made to the
stimulus.
lick-suppression procedure A procedure for testing
fear conditioning in which presentation of a fear-
conditioned CS slows down the rate of drinking.
long-delayed conditioning A conditioning procedure
in which the US occurs more than several minutes after
the start of the CS, as in taste-aversion learning.
magnitude of a response A measure of the size,
vigor, or extent of a response.
object learning Learning associations between differ-
ent stimulus features of an object, such as what it looks
like and how it tastes.
probability of a response The likelihood of making
the response, usually represented in terms of the per-
centage of trials on which the response occurs.
pseudo-conditioning Increased responding that may
occur to a stimulus whose presentations are intermixed
with presentations of a US in the absence of the establish-
ment of an association between the stimulus and the US.
random control procedure A procedure in which the
CS and US are presented at random times with respect
to each other.
retardation of acquisition test A test procedure that
identifies a stimulus as a conditioned inhibitor if that
stimulus is slower to acquire excitatory properties than
a comparison stimulus.
short-delayed conditioning A classical conditioning
procedure in which the CS is initiated shortly before
the US on each conditioning trial.
sign tracking Movement toward and possibly contact
with a stimulus that signals the availability of a positive
reinforcer, such as food. Also called autoshaping.
simultaneous conditioning A classical conditioning
procedure in which the CS and the US are presented
at the same time on each conditioning trial.
summation test Same as compound-stimulus test.
temporal coding hypothesis The idea that Pavlovian
conditioning procedures lead not only to learning that the
US happens but exactly when it occurs in relation to the
CS. The CS represents (or codes) the timing of the US.
test trial A trial in which the CS is presented without
the US. This allows measurement of the CR in the
absence of the UR.
trace conditioning A classical conditioning proce-
dure in which the US is presented after the CS has
been terminated for a short period.
trace interval The interval between the end of the CS
and the start of the US in trace-conditioning trials.
unconditional or unconditioned response (UR) A
response that occurs to a stimulus without the necessity
of prior training.
unconditional or unconditioned stimulus (US) A
stimulus that elicits a particular response without the
necessity of prior training.
CHAPTER 4
Classical Conditioning: Mechanisms
What Makes Effective Conditioned and
Unconditioned Stimuli?
Initial Responses to the Stimuli
Novelty of Conditioned and Unconditioned
Stimuli
CS and US Intensity and Salience
CS–US Relevance, or Belongingness
Learning Without an Unconditioned
Stimulus
What Determines the Nature of the
Conditioned Response?
The US as a Determining Factor for the CR
The CS as a Determining Factor for the CR
The CS–US Interval as a Determining Factor
for the CR
Conditioned Responding and Behavior
Systems
S–R Versus S–S Learning
Pavlovian Conditioning as Modification
of Responses to the Unconditioned
Stimulus
How Do Conditioned and Unconditioned
Stimuli Become Associated?
The Blocking Effect
The Rescorla–Wagner Model
Attentional Models of Conditioning
Timing and Information Theory Models
The Comparator Hypothesis
Concluding Comments
Sample Questions
Key Terms
CHAPTER PREVIEW
Chapter 4 continues the discussion of classical conditioning, focusing on the mechanisms and
outcomes of this type of learning. The discussion is organized around three key issues. First, I will
describe features of stimuli that determine their effectiveness as conditioned and unconditioned
stimuli. Then, I will discuss factors that determine the types of responses that come to be made to
conditioned stimuli and how conditioning alters how organisms respond to the unconditioned
stimulus. In the third and final section of the chapter, I will discuss the mechanisms of learning
involved in the development of conditioned responding. Much of the discussion will deal with how
associations are established and expressed in behavior. However, I will also describe a
nonassociative model of learning based on information theory.
What Makes Effective Conditioned
and Unconditioned Stimuli?
This is perhaps the most basic question one can ask about classical conditioning. The
question was first posed by Pavlov but continues to attract the attention of contemporary
researchers.
Initial Responses to the Stimuli
Pavlov’s answer to what makes effective conditioned and unconditioned stimuli was
implied by his definitions of the terms conditioned and unconditioned. According to these
definitions, the CS does not elicit the conditioned response initially but comes to do so as a
result of becoming associated with the US. By contrast, the US is effective in eliciting the
target response from the outset (unconditionally) without any special training.
Pavlov’s definitions were stated in terms of the elicitation of the response to be con-
ditioned. Because of this, identifying potential CSs and USs requires comparing the
responses elicited by each stimulus before conditioning. Such a comparison makes the
identification of CSs and USs relative. A particular event may serve as a CS relative to
one stimulus and as a US relative to another.
Consider, for example, food pellets flavored with sucrose. The taste of the sucrose
pellets may serve as a CS in a taste-aversion conditioning procedure, in which condition-
ing trials consist of pairing the sucrose pellets with illness. As a result of such pairings,
the participants will acquire an aversion to eating the sucrose pellets.
In a different experiment, Franklin and Hall (2011) used sucrose pellets as a US in a
sign-tracking experiment with rats. The conditioning trials in this case involved inserting
a response lever (the CS) before each delivery of the sucrose pellet (the US). After a
number of trials of this sort, the rats began to approach and press the response lever.
Thus, sucrose pellets could be either a US or a CS depending on how presentations of
the pellets are related to other stimuli in the situation.
Novelty of Conditioned and Unconditioned Stimuli
As we saw in studies of habituation, the behavioral impact of a stimulus depends on its
novelty. Highly familiar stimuli elicit less vigorous reactions than do novel stimuli.
Novelty is also important in classical conditioning. If either the conditioned or the
unconditioned stimulus is highly familiar, learning occurs more slowly than if the CS
and US are novel.
The Latent-Inhibition or CS-Preexposure Effect Numerous studies have shown that a highly familiar stimulus is not as effective a CS as a novel stimulus. This phenomenon is called the latent-inhibition effect or CS-preexposure effect (Hall, 1991; Lubow & Weiner, 2010). Experiments on the latent-inhibition effect involve
two phases. Participants are first given repeated presentations of the CS by itself. This is
called the preexposure phase because it comes before the Pavlovian conditioning trials.
CS preexposure makes the CS highly familiar and of no particular significance because
at this point the CS is presented alone. After the preexposure phase, the CS is paired
with a US using conventional classical conditioning procedures. The common result is
that participants are slower to acquire responding because of the CS preexposure. Thus,
CS preexposure disrupts or retards learning (e.g., De la Casa, Marquez, & Lubow, 2009).
The effect is called latent inhibition to distinguish it from the phenomenon of condi-
tioned inhibition I described in Chapter 3.
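A common way of modeling this retardation (anticipating the attentional account discussed below) is to assume that preexposure lowers the associability, or learning rate, of the CS. The sketch below is a minimal illustration under that assumption; the parameter values are hypothetical.

```python
# Hypothetical sketch of latent inhibition as reduced associability:
# the preexposed CS learns with a smaller alpha, so the same number
# of CS-US pairings produces weaker conditioned responding.

def condition(alpha, n_pairings, lam=1.0):
    v = 0.0
    for _ in range(n_pairings):
        v += alpha * (lam - v)  # one CS-US pairing
    return round(v, 2)

print(condition(alpha=0.30, n_pairings=5))  # novel CS: 0.83
print(condition(alpha=0.10, n_pairings=5))  # preexposed CS: 0.41
```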
Latent inhibition is similar to habituation. Both phenomena serve to limit processing
and attention to stimuli that are presented without a US and are therefore inconsequential.
As Lubow (2011) noted, latent inhibition “appears to protect the organism from informa-
tion overload by attenuating the processing of previously irrelevant stimuli” (p. 154).
Although it was originally discovered in studies with sheep (Lubow & Moore, 1959),
the latent-inhibition effect has become of great interest in analyses of human behavior,
especially as it relates to schizophrenia. The primary theoretical explanation of latent
inhibition is that CS preexposure reduces attention to the CS (e.g., Schmajuk, 2010). A
major symptom of schizophrenia is the inability to suppress attention to irrelevant sti-
muli, and individuals with schizophrenia show a disruption of the latent-inhibition
effect. Latent inhibition and schizophrenia also share some of the same dopaminergic
neurobiological mechanisms. Because of these commonalities, latent inhibition has
become a major tool for studying the cognitive dysfunctions that accompany schizophre-
nia (Lubow, 2011).
The US-Preexposure Effect Experiments on the importance of US novelty are simi-
lar in design to CS-preexposure experiments. In one study, for example, rats first
received 80 presentations of one of two types of flavored food pellets (bacon or choco-
late) on each of 5 days, for a total of 400 US preexposure trials (Franklin & Hall, 2011).
The pellets were delivered into a food cup that was located between two retractable levers
(as in Figure 3.6). Following the US preexposure phase, 20 Pavlovian conditioning trials
were conducted. At the start of a trial, one of the response levers was inserted into the
experimental chamber for 10 seconds, followed by the bacon pellet. On other trials, the
other response lever was presented paired with the chocolate pellet.
The results of the Pavlovian conditioning phase are presented in Figure 4.1. Condi-
tioning proceeded faster for the lever paired with the novel food than for the lever paired
with the familiar food. This is called the US-preexposure effect. The US-preexposure
effect has been observed not only in appetitive conditioning but also in a variety of
other situations, including fear conditioning, taste aversion learning, and drug condition-
ing (e.g., Randich & LoLordo, 1979; Hall, 2009).
FIGURE 4.1 (line graph; horizontal axis: trials 1–10; vertical axis: responses per trial; separate curves for novel and familiar food): Rate of contacts with a response lever whose presentations were paired with a familiar-flavored food pellet versus a response lever whose presentations were paired with a novel-flavored food pellet. Faster acquisition of this sign-tracking response with the novel pellet illustrates the US-preexposure effect (based on Franklin & Hall, 2011).
CS and US Intensity and Salience
Another important stimulus variable for classical conditioning is the intensity of the condi-
tioned and unconditioned stimuli. Most biological and physiological effects of stimulation
are related to the intensity of the stimulus input. This is also true of Pavlovian conditioning.
More vigorous conditioned responding occurs when more intense conditioned and uncon-
ditioned stimuli are used (e.g., Bevins et al., 1997; Kamin, 1965; Scavio & Gormezano, 1974).
Stimulus intensity is one factor that contributes to what is more generally called
stimulus salience. The term salience is not well defined, but it roughly corresponds to
significance or noticeability. Theories of learning typically assume that learning will
occur more rapidly with more salient stimuli (e.g., McLaren & Mackintosh, 2000; Pearce
& Hall, 1980). One can make a stimulus more salient or significant by making it more
intense and, hence, more attention-getting. One can also make a stimulus more salient
by making it more relevant to the biological needs of the organism. For example, animals
become more attentive to the taste of salt if they suffer a nutritional salt deficiency
(Krieckhaus & Wolf, 1968). Consistent with this outcome, Sawa, Nakajima, and Imada
(1999) found that sodium-deficient rats learn stronger aversions to the taste of salt than
nondeficient control subjects.
Another way to increase the salience of a CS is to make it more similar to the kinds
of stimuli an animal is likely to encounter in its natural environment. Studies of sexual
conditioning with domesticated quail illustrate this principle. In the typical experiment,
access to a female quail serves as the sexual reinforcer, or US, for a male bird, and this
sexual opportunity is signaled by the presentation of a CS. The CS can be an arbitrary
cue such as a light or a block of wood. Alternatively, the CS can be made more natural
or salient by adding partial cues of a female (Figure 4.2). Studies have shown that if a
naturalistic CS is used in sexual conditioning, the learning proceeds more rapidly, more
components of sexual behavior become conditioned, and the learning is not as easily dis-
rupted by increasing the CS–US interval (Cusato & Domjan, 2012; Domjan et al., 2004).
FIGURE 4.2 (photographs of two CS objects): CS objects used as signals for copulatory opportunity in studies of sexual conditioning with male quail. The object on the left is arbitrary and made entirely of terrycloth. The object on the right is more naturalistic because it includes limited female cues provided by the head and some neck feathers from a taxidermically prepared female bird (from Cusato & Domjan, 1998). Sexual conditioning is more robust with the naturalistic CS.
CS–US Relevance, or Belongingness
Another variable that governs the rate of classical conditioning is the extent to which the
CS is relevant to or belongs with the US. The importance of CS–US relevance was first
clearly demonstrated in a classic experiment by Garcia and Koelling (1966). Working
with laboratory rats, the investigators compared learning about peripheral pain (induced
by foot-shock) and learning about illness (induced by irradiation or a drug injection).
In their natural environment, rats are likely to get sick after eating a poisonous food.
In contrast, they are likely to encounter peripheral pain after being chased and bitten
by a predator that they can hear and see. To represent food-related cues, Garcia and
Koelling used a flavored solution of water as the CS; to represent predator-related cues,
they used an audiovisual CS.
The experiment, diagrammed in Figure 4.3, involved having the rats drink from
a drinking tube before administration of either the shock or illness US. The drinking
tube was filled with water flavored either salty or sweet. In addition, each lick on the tube
activated a brief audiovisual stimulus (a click and a flash of light). Thus, the rats encoun-
tered the taste and audiovisual stimuli at the same time. After exposure to these CSs, the
animals either received a brief shock through the grid floor or were made sick.
Because both USs were aversive, the rats were expected to learn an aversion of some
kind. To observe these aversions, the taste and audiovisual CSs were presented individu-
ally after conditioning. During tests of the taste CS, the water was flavored as before, but
now licks did not activate the audiovisual cue. During tests of the audiovisual CS, the
water was unflavored, but the audiovisual cue was briefly turned on each time the animal
licked the spout. Conditioned aversions were inferred from the suppression of drinking.
The results of the experiment are summarized in Figure 4.4. Animals conditioned
with shock subsequently suppressed their drinking much more when tested with the
audiovisual stimulus than when tested with the taste CS. The opposite result occurred
for animals that had been conditioned with sickness. These rats suppressed their drink-
ing much more when the taste CS was presented than when drinking produced the
audiovisual stimulus.
Garcia and Koelling’s experiment demonstrates the principle of CS–US relevance, or
belongingness. Learning depended on the relevance of the CS to the US. Taste became
readily associated with illness, and audiovisual cues became readily associated with periph-
eral pain. Rapid learning occurred only if the CS was combined with the appropriate US.
The audiovisual CS was not generally more effective than the taste CS. Rather, the audio-
visual CS was more effective only when shock served as the US. Correspondingly, the
shock US was not generally more effective than the sickness US. Rather, shock conditioned
stronger aversions than sickness only when the audiovisual cue served as the CS.
FIGURE 4.3 (design diagram; conditioning phase: a taste + audiovisual compound paired with shock or with sickness; test phase: taste and audiovisual cues tested separately): Diagram of Garcia and Koelling's (1966) experiment. A compound taste–audiovisual stimulus was first paired with either shock or sickness for separate groups of laboratory rats. The subjects were then tested with the taste and audiovisual stimuli separately.

FIGURE 4.4 (bar graph; vertical axis: licks per minute; horizontal axis: type of US, sickness versus shock; separate bars for taste and audiovisual cues): Results from Garcia and Koelling's experiment. Rats conditioned with sickness learned a stronger aversion to taste than to audiovisual cues. By contrast, rats conditioned with shock learned a stronger aversion to audiovisual than to taste cues (adapted from Garcia and Koelling, 1966).
The CS–US relevance effect obtained by Garcia and Koelling was not readily
accepted at first. However, numerous subsequent studies have confirmed the original
findings (e.g., Domjan, 1983; Rescorla, 2008). The selective-association effect occurs
even in rats one day after birth (Gemberling & Domjan, 1982). This observation indi-
cates that extensive experience with tastes and sickness (or audiovisual cues and periph-
eral pain) is not necessary for the stimulus-relevance effect. Rather, the phenomenon
appears to reflect a genetic predisposition for the selective learning of certain combina-
tions of conditioned and unconditioned stimuli. (For evidence of stimulus relevance in
human food-aversion learning, see Logue et al., 1981; Pelchat & Rozin, 1982.)
Stimulus-relevance effects have been documented in other situations as well. In Chapter 8 we will encounter a prominent example of stimulus relevance that contrasts learning about appetitive events versus aversive events. Stimulus-relevance effects are also promi-
nent in the acquisition of fear in primates (Öhman & Mineka, 2001; Mineka & Öhman,
2002). Experiments with both rhesus monkeys and people have shown that fear condi-
tioning progresses more rapidly with fear-relevant cues (the sight of a snake) than with
fear-irrelevant cues (the sight of a flower or mushroom). However, this difference is not
observed if an appetitive US is used. This selective advantage of snake stimuli in fear
conditioning does not require conscious awareness (e.g., Öhman et al., 2007) and seems
to reflect an evolutionary adaptation to rapidly detect biologically dangerous stimuli and
acquire fear to such cues. Consistent with this conclusion, infants as young as 8–14
months orient more quickly to pictures of snakes than to pictures of flowers (LoBue &
DeLoache, 2010). As Mineka and Öhman (2002) pointed out, “Fear conditioning
occurs most readily in situations that provide recurrent survival threats in mammalian
evolution” (p. 928).
Learning Without an Unconditioned Stimulus
So far, we have been discussing classical conditioning in situations that include a US: a
stimulus that has a large behavioral impact unconditionally, without prior training. If
Pavlovian conditioning were only applicable to situations that involve a US, it would be
somewhat limited. It would occur only when you received food, got shocked, or had sex. What about the rest of the time, when you are not eating or having sex? As it turns out, Pavlovian
conditioning can also take place in situations where you do not encounter a US. There
are two different forms of classical conditioning without a US. One is higher-order con-
ditioning and the other is sensory preconditioning.
Higher-Order Conditioning Irrational fears are often learned through higher-order
conditioning. For example, Wolpe (1990) described the case of a lady who developed a
fear of crowds. Thus, for her, being in a crowd was a CS that elicited conditioned fear.
How this fear was originally learned is unknown. Perhaps she was pushed and shoved
in a crowd (CS) and suffered an injury (US). To avoid arousing her fear, the lady
would go to the movies only in the daytime when few people were in the theater. On
one such visit, the theater suddenly became crowded with students. The lady became
extremely upset by this and came to associate cues of the movie theater with crowds.
Thus, one CS (crowds) had conditioned fear to another (the movie theater) that previ-
ously elicited no fear. The remarkable aspect of this transfer of fear is that the lady never
experienced bodily injury or an aversive US in the movie theater. In that sense, her new
fear of movie theaters was irrational.
As this case study illustrates, higher-order conditioning occurs in two phases. Dur-
ing the first phase, a cue (call it CS1) is paired with a US often enough to condition a
strong response to CS1. In the above case study, the stimuli of crowds constituted CS1.
Once CS1 elicited the conditioned response, pairing CS1 with a new stimulus CS2
(cues of the movie theater) was able to condition CS2 to also elicit the conditioned
response. The conditioning of CS2 occurred in the absence of the US. Figure 4.5 sum-
marizes these stages of learning that result in higher-order conditioning.
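The two phases can be caricatured in a few lines of code. The following is an assumed toy model, not the author's: in phase 2, CS1 itself plays the role of the reinforcer for CS2.

```python
# Assumed toy model of second-order conditioning. pair() moves the
# target's associative strength toward that of whatever reinforces it.

def pair(v_target, v_reinforcer, alpha=0.3):
    return v_target + alpha * (v_reinforcer - v_target)

v_us, v_cs1, v_cs2 = 1.0, 0.0, 0.0
for _ in range(10):      # Phase 1: CS1-US pairings
    v_cs1 = pair(v_cs1, v_us)
for _ in range(5):       # Phase 2: CS2-CS1 pairings, no US presented
    v_cs2 = pair(v_cs2, v_cs1)

print(round(v_cs1, 2), round(v_cs2, 2))  # CS2 responds despite no US
```

A fuller model would also let the strength of CS1 extinguish across the no-US trials of phase 2, which is one reason, as noted below, that extended training can shift the outcome from second-order excitation toward conditioned inhibition.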
As the term “higher order” implies, conditioning may be considered to operate at
different levels. In the preceding example, the experience of crowds (CS1) paired with
injury (US) is first-order conditioning. Pairing of CS2 (movie theaters) with CS1 (crowds)
is second-order conditioning. If after becoming conditioned, CS2 were used to condition
yet another stimulus, CS3, that would be third-order conditioning.
FIGURE 4.5 (diagram of the two training phases): Procedure for higher-order conditioning. CS1 is first paired with the US and comes to elicit the conditioned response. A new stimulus (CS2) is then paired with CS1. Subsequent tests show that CS2 also comes to elicit the conditioned response.
The procedure for second-order conditioning shown in Figure 4.5 is similar to the standard procedure for inhibitory conditioning that was described in Chapter 3 (Figure 3.11). In both cases, one conditioned stimulus (CS1 or CS+) is paired with the US (CS1 → US or CS+ → US), and a second CS (CS2 or CS−) is paired with the first one without the unconditioned stimulus (CS1/CS2 → no US or CS+/CS− → no US). Why does such a procedure produce conditioned inhibition in some cases and excitatory second-order conditioning under other circumstances? One important factor appears to be the number of no-US, or nonreinforced, trials. With relatively few no-US trials, second-order excitatory conditioning occurs. With extensive training, conditioned inhibition develops (Yin, Barnet, & Miller, 1994). Another important variable is whether the first- and second-order stimuli are presented simultaneously or sequentially (one after the other). Simultaneous presentations of CS1 and CS2 on nonreinforced trials favor the development of conditioned inhibition to CS2 (Stout, Escobar, & Miller, 2004; see also Wheeler, Sherwood, & Holland, 2008).
Although there is no doubt that second-order conditioning is a robust phenomenon
(e.g., Rescorla, 1980; Witnauer & Miller, 2011), little research has been done to evaluate
the mechanisms of third and higher orders of conditioning. However, even the existence
of second-order conditioning is of considerable significance because it greatly increases
the range of situations in which classical conditioning can take place. With second-
order conditioning, classical conditioning can occur without a primary US. The only
requirement is the availability of a previously conditioned stimulus.
Many instances of conditioning in human experience involve higher-order condi-
tioning. For example, money is a powerful conditioned stimulus (CS1) for human behav-
ior because of its association with candy, toys, movies, and other things money can buy
(USs). A child may become fond of his or her uncle (CS2) if the uncle gives the child
some money on each visit. The positive conditioned emotional response to the uncle
develops because the child comes to associate the uncle with money, in a case of
second-order conditioning.
Advertising campaigns also make use of higher-order conditioning. A new product
(CS2) is paired with something we have already learned to like (CS1) to create a prefer-
ence for the new product (Schachtman, Walker, & Fowler, 2011).
Sensory Preconditioning Associations can also be learned between two stimuli, each
of which elicits only a mild orienting response before conditioning. Consider, for example,
two flavors (say, vanilla and cinnamon) that you often encounter together in pastries with-
out ill effects. Because of these pairings, the vanilla and cinnamon flavors may become
associated with one another. What would happen if you then acquired an aversion to cin-
namon through food poisoning or illness? Chances are your acquired aversion to cinna-
mon would lead you to also reject things with the taste of vanilla because of the prior
association of vanilla with cinnamon. This is an example of sensory preconditioning.
As with higher-order conditioning, sensory preconditioning involves a two-stage
process (Figure 4.6). The cinnamon and vanilla flavors become associated with one
another in the first phase when there is no illness or US. Let’s call these stimuli CS1
and CS2. The association between CS1 and CS2 that is established during the sensory pre-
conditioning phase is usually not evident in any behavioral responses because neither CS
has been paired with a US yet, and, therefore, there is no reason to respond.
During the second phase, the cinnamon flavor (CS1) is paired with illness (US), and
a conditioned aversion (CR) develops to CS1. Once this first-order conditioning has been
completed, the participants are tested with CS2 and now show an aversion to CS2 for the
first time. The response to CS2 is noteworthy because CS2 was never directly paired with
a US. (For recent examples of sensory preconditioning, see Dunsmoor, White, & LaBar,
2011; Leising, Sawa, & Blaisdell, 2007; Rovee-Collier & Giles, 2010.)
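The two-stage logic can be expressed as a simple mediated chain. In the hypothetical sketch below, phase 1 stores a stimulus–stimulus link with no US present, and the value later conditioned to CS1 transfers to CS2 through that link at test; the link strength is an assumed constant.

```python
# Hypothetical sketch of sensory preconditioning as mediated transfer.

link = {("vanilla", "cinnamon"): 0.7}  # Phase 1: flavors paired, no US
value = {"cinnamon": 0.0, "vanilla": 0.0}

value["cinnamon"] = -1.0   # Phase 2: cinnamon paired with illness

# Test: vanilla was never paired with illness, yet it inherits an
# aversion through the stored link to cinnamon.
value["vanilla"] = link[("vanilla", "cinnamon")] * value["cinnamon"]
print(value["vanilla"])    # -0.7: conditioned aversion to vanilla
```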
Sensory preconditioning and higher-order conditioning help us make sense of things
we seem to like or dislike for no apparent reason. What we mean by “no apparent rea-
son” is that these stimuli were not directly associated with a positive or negative US. In
such cases, the conditioned preference or aversion probably developed through sensory
preconditioning or higher-order conditioning.
FIGURE 4.6 (diagram of the two training phases): Procedure for sensory preconditioning. First, CS2 is paired with CS1 without a US in the situation. Then, CS1 is paired with a US and comes to elicit a conditioned response (CR). In a later test session, CS2 is also found to elicit the CR, even though CS2 was never paired with the US. Notice that the procedure for sensory preconditioning is similar to the procedure for second-order conditioning, but the first two phases are presented in the opposite order.
What Determines the Nature of the
Conditioned Response?
Classical conditioning is usually identified by the development of a new response to the
conditioned stimulus. As we have seen, a large variety of responses can become condi-
tioned, including salivation, eye blinking, fear, locomotor approach and withdrawal, and
aversion responses. Why does one set of responses become conditioned in one situation
but other responses develop to the CS in other circumstances?
The US as a Determining Factor for the CR
The most obvious factor that determines the nature of the conditioned response is the
unconditioned stimulus that is used. Animals learn to salivate when conditioned with
food and to blink when conditioned with a puff of air to the eye. Salivation is not condi-
tioned in eyeblink experiments, and eyeblink responses are not conditioned in salivary-
conditioning experiments.
Interestingly, even small variations in the nature of the US can produce changes in
the nature of the CR. In a famous experiment, Jenkins and Moore (1973) compared
Pavlovian conditioning in pigeons with food versus water as the US. When presented
with grain, pigeons make rapid and hard pecking movements directed at the grain with
their beak open. By contrast, when presented with water, pigeons lower their beak
into the water with their beak mostly closed. Once the beak is under water, it opens
periodically to permit the bird to suck up the water (Klein, LaMon, & Zeigler, 1983). Thus,
the unconditioned responses of eating and drinking differ in both speed and form.
Jenkins and Moore found that responses conditioned to a key light CS paired with
food and water differ in a similar fashion. When grain was the US, the pigeons pecked
the key light as if eating: The pecks were rapid with the beak open at the moment of con-
tact. When water was the US, the conditioned pecking movements were slower, made
with the beak closed, and were often accompanied by swallowing. Thus, the form of the
conditioned responses resembled the form of the unconditioned responses to food and
water (see also Allan & Zeigler, 1994; Ploog & Zeigler, 1996; Spetch, Wilkie, & Skelton,
1981). Similar findings have been obtained with food pellets and milk as unconditioned
stimuli with laboratory rats (Davey & Cleland, 1982; Davey, Phillips, & Cleland, 1981).
The fact that the form of the conditioned response is determined by the US encour-
aged Pavlov to propose the stimulus substitution model. According to this model, the
association of a CS with a US turns the conditioned stimulus into a surrogate US. The
conditioned stimulus comes to function much like the US did previously. Thus, the CS is
assumed to activate neural circuits previously activated only by the US and elicit
responses similar to those elicited by the US (Figure 4.7).
The stimulus substitution model correctly emphasizes that the nature of the CR
depends a great deal on the US that is used in a conditioning procedure. However, in
many situations the CR does not resemble the UR. For example, foot shock causes rats
to leap into the air, but the conditioned response to a tone paired with foot shock is
freezing and immobility. In addition, as we will see, in many situations the biologically
important consequence of Pavlovian conditioning is not a change in responding to the
CS but a change in how the organism responds to the US.
The CS as a Determining Factor for the CR
Another important factor that determines the form of the CR is the nature of the condi-
tioned stimulus. This was first demonstrated in a striking experiment by Timberlake and
Grant (1975), who investigated classical conditioning in laboratory rats with food as the
US. However, instead of a conventional light or tone, Timberlake and Grant presented
another rat just before food delivery as the CS. One side of the experimental chamber
was equipped with a sliding platform that could be moved in and out of the chamber
through a flap door (Figure 4.8). A live rat was gently restrained on the platform. Ten
seconds before each delivery of food, the platform was moved into the experimental
chamber, thereby transporting the stimulus rat through the flap door.
The stimulus-substitution model predicts that CS–US pairings will generate
responses to the CS that are similar to responses elicited by the food US. Because food
elicits gnawing and biting, these responses were also expected to be elicited by the CS.
FIGURE 4.7 Diagram of Pavlov's stimulus substitution model. The solid arrow indicates preexisting neural connections. The dashed arrow indicates neural connections established by conditioning. Because of these new functional connections, the CS comes to elicit responses previously elicited by the US.
Contrary to this prediction, as the CS rat was repeatedly paired with food, it came to
elicit social affiliative responses (orientation, approach, sniffing, and social contacts).
Such responses did not develop if the CS rat was presented at times unrelated to food.
The outcome of this experiment does not support any model that explains the form
of the conditioned response solely in terms of the US that is used. The conditioned social
responses that were elicited by the CS rat were no doubt determined by having another
rat serve as the CS. As we saw in Chapter 3, if the extension of a response lever into the
experimental chamber serves as the CS for food, rats either contact and gnaw the lever
(showing sign tracking) or go to the food cup (showing goal tracking). They do not
respond to the CS lever with social affiliative responses. The nature of the CS is also
important in sexual conditioning with quail. A CS object that male quail can mount
and copulate with is more likely to elicit conditioned copulatory responses than a light
or wood block CS (see photo of a male quail showing a conditioned sexual fetish in the
inside front cover). (For other investigations of how the CS determines the nature of the
conditioned response, see Holland, 1984; Kim et al., 1996; Sigmundi & Bolles, 1983).
The CS–US Interval as a Determining Factor for the CR
Another variable that is a major factor in what kinds of responses become conditioned is the
interval between the conditioned stimulus and the unconditioned stimulus. Seeing a car
coming toward you is a CS for potential injury. How you react to this danger will depend
on how quickly the car will reach you (the CS–US interval). If the car is within 1–2 seconds
of reaching you, you will panic and jump out of the way. In contrast, if the car is within 15–
20 seconds of reaching you, you will be concerned and take evasive action, but you will not
panic and jump. Generally, conditioning with a short CS–US interval activates responses
that are appropriate for immediately dealing with the US. In contrast, conditioning with a
long CS–US interval activates responses that prepare the organism for the US over a longer
time horizon. Consistent with this view, laboratory studies with rats have shown that condi-
tioned fear or panic is more likely with a short CS–US interval, whereas conditioned anxiety
is more likely with a long CS–US interval (Waddell, Morris, & Bouton, 2006; see also
Esmorís-Arranz, Pardo-Vázquez, & Vázquez-Garcia, 2003).
Analogous results have been obtained in appetitive conditioning situations. In the
sexual conditioning of male quail, for example, access to a female copulation partner
is the US. Most studies of sexual conditioning employ a rather short CS–US interval
(1 minute or less) and measure approach to the CS or sign tracking as the conditioned response (Chapter 3).

FIGURE 4.8 Diagram of the experiment by Timberlake and Grant (1975). The CS for food was presentation of a stimulus rat on a movable platform through a flap door on one side of the experimental chamber.

In a study of the effects of the CS–US interval on sexual condition-
ing, Chana Akins (2000) conditioned different groups of quail with either a 1-minute or
a 20-minute CS–US interval and measured not only CS approach responses but also gen-
eral locomotor behavior (pacing between one half of the experimental chamber and the
other). Control groups were exposed to the CS and US in an unpaired fashion.
The results of the experiment are presented in Figure 4.9. With a 1-minute CS–US
interval, the conditioning procedure resulted in CS approach behavior but not increased
locomotion. In contrast, with the 20-minute CS–US interval, the predominant condi-
tioned response was increased locomotor behavior rather than CS approach.
Conditioned Responding and Behavior Systems
How are we to make sense of the fact that conditioned responding depends not only on
the US but also on the CS and the CS–US interval? Pavlov’s stimulus substitution model
clearly cannot handle such a rich pattern of findings. To understand these types of
results, we have to step out of the restricted physiological framework in which Pavlov
worked and consider how Pavlovian conditioning might function in the natural history
of organisms. The most successful framework for addressing these issues so far has been
behavior systems theory (Domjan, 1997; Timberlake, 2001; Rau & Fanselow, 2007).
Different systems of behavior have evolved to enable animals to accomplish various
critical tasks such as procuring and eating food, defending their territory, avoiding pre-
dation, producing and raising offspring, and so on. As I discussed in Chapter 2, a behav-
ior system consists of a series of response modes, each with its own controlling stimuli
and responses, arranged spatially and/or temporally. Consider, for example, the sexual
behavior of male quail. When sexually motivated, the male will engage in a general
search response that brings it into an area where a female may be located. Once he is
in the female’s territory, the male will engage in a more focal search response to actually
locate the female. Finally, once he finds her, the male will engage in courtship and cop-
ulatory responses. This sequence is illustrated in Figure 4.10.
FIGURE 4.9 Effects of the CS–US interval in the sexual conditioning of male quail. CS approach (left panel) and general locomotor behavior (right panel) were measured in response to the CS in groups that received paired or unpaired CS–US presentations (based on Akins, 2000).
Behavior systems theory assumes that the presentation of a US in a Pavlovian condi-
tioning procedure activates the behavior system relevant to that US. Food unconditioned
stimuli activate the foraging and feeding system. A sexual US, by contrast, activates
the sexual behavior system. Classical conditioning procedures involve superimposing a
CS–US relationship on the behavioral system activated by the US. As a CS becomes associ-
ated with the US, it becomes integrated into the behavioral system and elicits components of
that system. Thus, food-conditioned stimuli elicit components of the feeding system, and
sexual-conditioned stimuli elicit components of the sexual behavior system.
Behavior systems theory readily explains why the US is an important determinant of
the CR. The theory is also consistent with the fact that the nature of the CR depends on
the type of CS that is employed. As we saw in our discussion of CS-US relevance
(see pages 90–92), different CSs vary in terms of how readily they can become incorpo-
rated into a behavior system. In addition, the nature of the CS will determine what kinds
of conditioned responses can develop. CS approach responses, for example, will only
occur if the CS is highly localized. A diffuse stimulus (change in overall noise level) can-
not generate CS approach as a conditioned response.
The most innovative prediction of behavior systems theory is that the form of the
CR will also depend on the CS–US interval that is used. The CS–US interval is assumed
to determine where the CS becomes incorporated into the sequence of responses that
makes up the behavior system. Consider, for example, the sexual conditioning experi-
ment by Akins (2000) in which different groups were trained with a CS–US interval
that was either short (1 minute) or long (20 minutes). As illustrated in Figure 4.10, with
a short CS–US interval, the CS occurs just before the female is available and is therefore
incorporated into the behavior system at the focal search stage. Therefore, the CS is pre-
dicted to elicit focal search behavior: The male should approach and remain near the CS.
In contrast, with a long CS–US interval, the CS becomes incorporated into an earlier
portion of the behavior system and elicits general search behavior. General search behav-
ior is manifest in increased nondirected locomotion.
The results obtained by Akins (Figure 4.9) confirm these predictions. The 1-minute
CS–US interval conditioned CS approach but not general locomotor behavior, whereas the
20-minute CS–US interval conditioned locomotor behavior but not CS approach. Similar
evidence in support of behavior system theory has been obtained in appetitive condition-
ing with food (Silva & Timberlake, 1997) and aversive conditioning with shock (Esmorís-
Arranz, Pardo-Vázquez, & Vázquez-Garcia, 2003; Waddell, Morris, & Bouton, 2006). In all
of these cases, the conditioned response that reflects anticipation of the US depends on
how long one has to wait before the US is presented. Therefore, as Balsam and colleagues
(2009) put it, “No single response represents a pure measure of anticipation” (p. 1755).
S–R Versus S–S Learning
So far we have been discussing various accounts of the nature of conditioned behavior with-
out saying much about how a CS produces responding. Let’s turn to that question next.
Historically, conditioned behavior was viewed as a response elicited directly by the
CS.

FIGURE 4.10 Sequence of responses, starting with general search and ending with copulatory behavior, that characterize the sexual behavior system. A conditioning procedure is superimposed on the behavior system. The CS–US interval determines where the CS becomes incorporated into the behavioral sequence.

According to this idea, conditioning establishes a new stimulus–response (S–R)
connection between the CS and the CR. An important alternative view is that subjects
learn a new stimulus–stimulus (S–S) connection between the CS and the US. According
to this interpretation, participants respond to the CS not because it elicits a CR directly
but because the CS activates a representation or memory of the US. Conditioned
responding is assumed to reflect the status of the activated US representation.
How might we decide between S–R learning and S–S learning mechanisms? A pop-
ular research method that has been used to decide between these alternatives involves the
technique of US devaluation. This technique has been used to answer many important
questions in behavior theory. (I will describe applications of it in instrumental condition-
ing in Chapter 7.) Therefore, it is important to understand its rationale.
The basic strategy of a US devaluation experiment is illustrated in Figure 4.11. The
strategy was employed in a classic experiment by Holland and Rescorla (1975). Two
groups of mildly food-deprived rats received conditioning in which a tone was repeatedly
paired with pellets of food. This initial phase of the experiment was assumed to establish
an association between the tone CS and the food US, as well as to get the rats to form a
representation of the food that was used. Conditioned responding was evident in
increased activity elicited by the tone.
In the next phase, the experimental group received a treatment designed to make the
US less valuable to them. This US devaluation was accomplished by giving the rats suffi-
cient free food to completely satisfy their hunger. Presumably satiation reduced the value
of food and thus devalued the US representation. The deprivation state of the control
group was not changed in Phase 2, and, therefore, the US representation remained intact
for those rats (Figure 4.11). Both groups then received a series of test trials with the tone
CS. During these tests, the experimental group showed significantly less conditioned
responding than the control group.
If conditioning had established a new S–R connection between the CS and CR, the
CR would have been elicited whenever the CS occurred, regardless of the value of the
food. That did not happen. Rather, US devaluation reduced responding to the CS. This
outcome suggests that conditioning resulted in an association between the CS and a
representation of the US (S–S learning). Presentation of the CS activated the US repre-
sentation, and the CR was determined by the current status of that US representation.
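The logic of this test can be summarized in a brief illustrative sketch. The following Python fragment is not from the textbook; the class names, trial counts, and values are hypothetical and serve only to contrast a learner that stores a fixed CS–response link with one that routes responding through a modifiable US representation.

class SRLearner:
    """S-R account: stores a direct CS -> response link."""
    def __init__(self):
        self.cr_strength = 0.0
    def condition(self, trials=10, increment=0.1):
        # Each CS-US pairing strengthens the CS-CR connection directly.
        self.cr_strength = min(1.0, self.cr_strength + increment * trials)
    def respond_to_cs(self, us_value):
        # Responding never consults the current value of the US.
        return self.cr_strength

class SSLearner:
    """S-S account: stores a CS -> US association; the CR reflects
    the current value of the US representation."""
    def __init__(self):
        self.cs_us_association = 0.0
    def condition(self, trials=10, increment=0.1):
        self.cs_us_association = min(1.0, self.cs_us_association + increment * trials)
    def respond_to_cs(self, us_value):
        # The CS activates the US representation; the CR tracks its value.
        return self.cs_us_association * us_value

for learner in (SRLearner(), SSLearner()):
    learner.condition()
    hungry = learner.respond_to_cs(us_value=1.0)    # Phase 1 deprivation state
    satiated = learner.respond_to_cs(us_value=0.2)  # after US devaluation
    print(type(learner).__name__, "CR before devaluation:", hungry,
          "after devaluation:", satiated)

Only the S–S learner's conditioned response declines after devaluation, which is the pattern Holland and Rescorla (1975) observed.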
FIGURE 4.11 Basic strategy and rationale involved in US-devaluation experiments. In Phase 1, the experimental and control groups receive conventional conditioning to establish an association between the CS and the US and to lead the participants to form a representation of the US. In Phase 2, the US representation is devalued for the experimental group but remains unchanged for the control group. If the CR is elicited by way of the US representation, devaluation of the US representation should reduce responding to the CS.
US devaluation experiments have provided evidence of S–S learning in a wide range
of classical conditioning situations (e.g., Colwill & Motzkin, 1994; Delamater et al., 2006;
Hilliard et al., 1998; Storsve, McNally, & Richardson, 2012). However, not all instances of
classical conditioning involve S–S learning. In some cases, the participants appear to
learn a direct S–R association between the CS and the CR. I will have more to say
about S–R learning in Chapter 7.
Pavlovian Conditioning as Modification of Responses
to the Unconditioned Stimulus
So far we have followed the convention of focusing on how new responses come to be
elicited by a Pavlovian conditioned stimulus. Indeed, standard definitions of Pavlovian
conditioning emphasize that the CS comes to elicit a new response as a result of being
paired with a US. However, learning to respond to a CS is useful to an organism only if
the CR helps it cope with the US. Be it food, a predator, or a sexual partner, the US is the
biologically significant event that the organism has to deal with effectively. To be of biolog-
ical benefit, Pavlovian conditioning should enable the organism to interact with the US
more effectively. The conditioned salivation that Pavlov observed helped his dogs prepare
for the food that was coming and enabled them to digest the food more efficiently. Indeed,
if the food was not delivered, the conditioned salivation was a useless false start.
Two different experimental designs that are used to demonstrate conditioned modifica-
tion of the UR are outlined in Figure 4.12. In the common testing design, two groups are com-
pared. During training, one group receives a conditioned stimulus (A) paired with the US
while the other group gets stimulus A and the US unpaired. Following these contrasting his-
tories, both groups receive stimulus A followed by the US during a test trial. However, instead
of focusing on how the organism responds to A, investigators measure responding during the
US. In the common training design, all of the participants receive a procedure in which stim-
ulus A is paired with the US and stimulus B is presented unpaired. Following this common
training, responding to the US is evaluated following presentations of stimuli A and B.
Research has shown that Pavlovian conditioning modifies responding to the US in a
wide range of situations (Domjan, 2005). The effect was reported in an early eyeblink condi-
tioning experiment by Kimble and Ost (1961) with human participants. Conditioning trials
consisted of the presentation of the CS light just before the delivery of a gentle puff of air to
the eyes. Once the CS became conditioned, presentation of the CS reduced how vigorously
the participants blinked when they received the air puff. This phenomenon has come to be
called conditioned diminution of the UR. (For a study of the brain mechanisms of the con-
ditioned diminution effect, see Box 3.2 and Knight et al., 2010.)
FIGURE 4.12 Experimental designs used to demonstrate conditioned modification of the unconditioned response. In the common testing design, the experimental group receives stimulus A paired with the US during conditioning while the control group receives A and the US unpaired; both groups are then tested with A followed by the US. In the common training design, all participants receive A paired with the US and B unpaired; responding to the US is then tested following A and following B.
Conditioned Analgesia Conditioned diminution of the UR is a prominent phenom-
enon in aversive conditioning and in conditioning experiments where pharmacological
agents serve as unconditioned stimuli. In both these cases, a conditioned stimulus elicits
physiological processes that serve to counteract the effects of the US.
We previously discussed how an aversive stimulus activates the defensive behavior
system. Defensive responses like fleeing or striking back at a predator can be effective
in coping with an attack. However, to engage in active defensive responses, the organism
cannot be debilitated by pain. Interestingly, exposure to an aversive stimulus or physical
injury results in the release of endogenous opiates that counteract the pain induced by
the injury. Endogenous opiates are produced internally by the body and function like
morphine or heroin to reduce pain sensitivity and provide analgesia.
The release of endogenous opiates can become conditioned to cues associated with an
aversive stimulus or injury. That is, a CS that has been paired with foot-shock will stimulate
the release of endogenous opiates, resulting in conditioned analgesia. Through this process,
foot-shock becomes less and less painful with successive conditioning trials (Zelikowsky &
Fanselow, 2011). This is a prominent example of conditioned diminution of the UR.
Conditioned Drug Tolerance Another prominent example of conditioned diminu-
tion of the UR comes from studies of conditioned drug tolerance. Tolerance to a drug
is said to develop when repeated administrations of the drug have progressively less
effect. Because of this, increasing doses become necessary to produce the same results.
Tolerance develops with nearly all psychoactive drugs. A beer or two can have a substan-
tial effect on first-time drinkers but not on habitual drinkers, who may require four to
six beers to feel the same level of intoxication. People who take pain pills or sleeping
pills have similar experiences. The pills are highly effective at first, but with repeated
use higher doses are required to produce the same effects.
There is now substantial evidence that drug tolerance can result from Pavlovian
conditioning. Pavlovian conditioning is involved because each administration of a
drug constitutes a conditioning trial in which cues that accompany administration of
the drug are paired with the pharmacological effects of the drug. Thus, drug adminis-
tration cues constitute the CS and the pharmacological effects are the US. Caffeine, for
example, is a commonly used drug whose pharmacological effects are typically preceded
by the smell and taste of coffee. Thus, the taste and smell of coffee can serve as a con-
ditioned stimulus that is predictive of the physiological effects of caffeine (e.g., Flaten &
Blumenthal, 1999).
A Pavlovian perspective views the development of drug tolerance as another exam-
ple of conditioned diminution of the UR (Siegel, 1999). Given that each drug-taking epi-
sode is a conditioning trial, the conditioned stimuli that precede each drug
administration become associated with the physiological effects of the drug. One conse-
quence of this learning is that the CS elicits physiological processes that counteract the
drug effect. The process is illustrated in Figure 4.13 and is related to the opponent-
process theory of motivation we discussed in Chapter 2.
As we noted in Chapter 2, emotion-arousing events (including drugs that induce
emotions) trigger a sequence of two processes. The first of these is the primary a process,
which represents the initial effects of the drug. For alcohol, these are the symptoms of
intoxication and sedation. The a process is followed by the opponent b process, which is
opposite to and counteracts the a process. For alcohol, the b process causes the irritability
and malaise of a hangover. According to the mechanisms of conditioned drug tolerance,
the opponent or compensatory process comes to be elicited by a drug-conditioned stim-
ulus, and that is why the drug effect is substantially reduced if the drug is administered
following the CS.
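As a rough numerical illustration of this compensatory-response logic (a sketch with made-up numbers, not data from any study cited here), suppose the primary drug effect has a fixed magnitude and the conditioned compensatory response grows with repeated cue–drug pairings:

# Illustrative numbers only: 'primary_effect' stands for the a process,
# and the conditioned compensatory response stands for the b process.
primary_effect = 10.0

def compensatory_cr(n_pairings, growth=0.25, ceiling=8.0):
    # The compensatory CR grows toward a ceiling with repeated
    # cue-drug pairings.
    return ceiling * (1 - (1 - growth) ** n_pairings)

for n in (0, 2, 5, 10):
    net_with_cues = primary_effect - compensatory_cr(n)  # usual setting
    net_without_cues = primary_effect                    # novel setting
    print(f"pairings={n:2d}  net effect with cues={net_with_cues:5.2f}  "
          f"without cues={net_without_cues:5.2f}")

In this toy calculation, the net drug effect in the presence of the usual cues shrinks across pairings (tolerance), while the same dose delivered without those cues retains its full impact, which is the logic behind the "overdose" cases described in Box 4.1.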
A key prediction of these mechanisms is that drug tolerance will be reduced if par-
ticipants receive the drug under novel circumstances or in the absence of the usual drug-
predictive cues. The model also suggests that various factors (such as CS preexposure
and Pavlovian extinction) that attenuate the development of conditioned responding
will also attenuate the development of drug tolerance. A large body of research has con-
firmed these and other predictions of the conditioning model in laboratory studies with
numerous drugs including opiates (i.e., morphine and heroin), alcohol, scopolamine,
benzodiazepines, and amphetamine (see reviews by Siegel, 2005, 2008). Conditioned
drug tolerance is one of the ways in which learning processes are involved in drug addic-
tion (McCarthy et al., 2011).
FIGURE 4.13 Illustration of the conditioning model of drug tolerance. The magnitude of a drug reaction is illustrated by deviation from the horizontal level. (A) Primary reaction to the drug before conditioning, illustrating the initial effects of the drug (without any homeostatic adjustments). (B) The homeostatic compensatory drug reaction that becomes conditioned to the drug-predictive CS after repeated drug administrations. (C) The net attenuated drug response that is observed when the drug is administered with the drug-conditioned CS. This net attenuated drug response illustrates the phenomenon of drug tolerance.
BOX 4.1
Drug “Overdose” Caused by the Absence of Drug-Conditioned Stimuli
According to the conditioning model
of drug tolerance, the impact of a
drug will be reduced if the drug is
consumed in the presence of cues
that were previously conditioned to
elicit conditioned compensatory
responses. Consider a heroin addict
who usually shoots up in the same
place, perhaps with the same friends.
That place and company will become
conditioned to elicit physiological
reactions that reduce the effects of
the heroin, forcing the addict to
inject higher doses to get the same
effect. As long as the addict shoots
up in the usual place and with the
usual friends, he or she is protected
from the full impact of the increased
heroin dosage by the conditioned
compensatory responses. But what if
the addict visits a new part of town
and shoots up with new acquain-
tances? In that case, the familiar CSs
will be absent, as will the protective
conditioned compensatory
responses. Therefore, the addict will
get the full impact of the heroin he or
she is using, and may suffer an
“overdose.” The word “overdose” is
in quotation marks because the
problem is not that too high a dose of
heroin was consumed but that the
drug was taken in the absence of the
usual CS. Without the CS, a dose of heroin that the addict had previously tolerated may prove fatal. Evidence for this interpre-
tation has been obtained both in
experimental research with labora-
tory animals and in human cases of
drug overdose (Siegel, Baptista, Kim,
McDonald, & Weise-Kelly, 2000).
Conditioned Reproduction and Fertility Pavlovian conditioning also results in
changes in responding to the US in appetitive conditioning situations. These effects
have been examined most extensively in studies of sexual conditioning. In a landmark
experiment with a fish species, the blue gourami, Hollis and colleagues (1997) conducted
conditioning trials in which a CS light was presented for 10 seconds to male gourami,
followed by visual exposure to a female for 5 minutes behind a barrier as the uncondi-
tioned stimulus. For a control group, the CS and US were presented unpaired. After 18
conditioning trials, all of the fish received a test trial in which exposure to the CS for 10
seconds was followed by removal of the barrier separating the male from the female so
that the two could interact. Notice that this was the first time the barrier between the
male and female compartments was removed. The fish were permitted to interact for
six days during which they courted, built a nest, and copulated.
How the males interacted with the female (the US in this experiment) during the
test trial is summarized in Figure 4.14. Basically, males that received access to a female
after exposure to the Pavlovian CS showed far more effective courtship and copulatory
interactions with the female than males in the unpaired control group. Pavlovian males
showed less aggression toward the female, they spent more time building a nest, they
clasped the female more often, and most importantly, they produced far more offspring.
Keep in mind that none of these were responses to the CS, which lasted just 10 seconds
before the barrier between the male and female was removed. Rather, all these changes in
behavior were altered responses to the US, the female gourami.
Results similar to those obtained by Hollis and colleagues (1997) have been observed
in numerous experiments on sexual conditioning with domesticated quail. As in the gou-
rami, exposure to a sexually conditioned CS produces widespread changes in the sexual
behavior of quail. These changes include increased receptivity of females to being
mounted by a male, more rapid and more efficient copulation, increased sperm release
during copulation, increased fertilization of eggs, and the production of greater numbers
of offspring (e.g., Domjan & Akins, 2011; Domjan, Mahometa, & Matthews, 2012).
FIGURE 4.14 Interactions between male and female gourami following exposure to a Pavlovian CS for males that previously had the CS paired or unpaired with visual access to a female. The four panels show biting frequency, nest building bouts, clasp frequency, and number of offspring produced (based on Hollis et al., 1997).

How Do Conditioned and Unconditioned
Stimuli Become Associated?
I have described numerous situations in which classical conditioning occurs, and I have
discussed various factors that determine how behavior (to both the CS and the US)
changes as a result of this learning. However, I have yet to address in detail the critical
issue of how conditioned and unconditioned stimuli become associated. What are the
mechanisms of learning, or the underlying processes that are activated by conditioning
procedures to produce learning? This question has been the subject of intense scholarly
work. The evolution of theories of classical conditioning continues today, as investigators
strive to formulate comprehensive theories that can embrace all of the diverse findings of
research in Pavlovian conditioning. (For reviews, see Pearce & Bouton, 2001; Mowrer &
Klein, 2001; Vogel, Castro, & Saavedra, 2004.)
The Blocking Effect
The modern era in theories of Pavlovian conditioning got underway about 45 years ago
with the discovery of several provocative phenomena that stimulated the application of
information processing ideas to the analysis of classical conditioning (e.g., Rescorla,
1967b, 1969a; Wagner, Logan, Haberlandt, & Price, 1968). One of the most prominent
of these was the blocking effect.
To get an intuitive sense of the blocking effect, consider the following scenario. Each
Sunday afternoon, you visit your grandmother who always serves bread pudding that
slightly disagrees with you. Not wanting to upset her, you politely eat the pudding during
each visit and, consequently, acquire an aversion to bread pudding. One of the visits falls
on a holiday, and to make the occasion a bit more festive, your grandmother makes a
special sauce to serve with the bread pudding. You politely eat the bread pudding with
the sauce, and as usual you get a bit sick to your stomach. Will you now develop an
aversion to the sauce? Probably not. Knowing that bread pudding disagrees with you,
you probably will attribute your illness to the proven culprit and not learn to dislike
the newly added sauce.
The above example illustrates the basic sequence of events that produces the block-
ing effect (Figure 4.15). Two conditioned stimuli are employed (in the above example
these were the taste of the bread pudding and the taste of the special sauce). In Phase 1,
the experimental group receives repeated pairings of one of the stimuli (A) with the US.
This phase of training is continued until a strong CR develops to stimulus A. In the next
phase of the experiment, stimulus B is presented together with stimulus A and paired
with the US. After several such conditioning trials, stimulus B is presented alone in a
test trial to see if it also elicits the CR. Interestingly, very little responding occurs to stim-
ulus B even though B was repeatedly paired with the US during Phase 2.
In Phase 2 of the blocking design, the control group receives the same kind of con-
ditioning trials (A+B paired with the US) as the experimental group (Figure 4.15).
However, for the control group, stimulus A is not conditioned prior to these
compound-stimulus trials. Rather, during Phase 1, the control group receives presenta-
tions of stimulus A and the US in an unpaired fashion. In many replications of this
design, stimulus B invariably produces less conditioned responding in the experimental
group than in the control group. (For a more detailed discussion of controls for block-
ing, see Taylor et al., 2008.)
The blocking effect was initially investigated in fear conditioning using the condi-
tioned suppression technique with rats (Kamin, 1968, 1969). Subsequently, the phenom-
enon has been demonstrated in various other conditioning preparations with both
human participants and laboratory animals (e.g., Bradfield & McNally, 2008; Holland &
Kenmuir, 2005; Mitchell et al., 2006). One area of considerable contemporary interest is
blocking in learning about geometric cues (Miller & Shettleworth, 2007). In one recent
study with college students, the conditioned stimuli were geometric cues provided by
two different triangles that I will refer to as A and B. The cover story was that the trian-
gles depicted the floor plan of a room that had food in one of the corners that the parti-
cipants had to find.
The experiment is outlined in Figure 4.16. In Phase I, the blocking group received
training with triangle A (an isosceles triangle). One of the internal angles of this triangle
(indicated by + in Figure 4.16) was designated as “correct.” Clicking on this angle produced
a clapping sound. Clicking on either of the other corners produced the sound of an explo-
sion, indicating that the response was incorrect. The triangle was presented in different
orientations so that a particular location on the screen was not correlated with the correct
response. During Phase II, the blocking group received the previously conditioned stimulus
(A) in combination with a new triangle B, with the correct corner as indicated in Figure 4.16.
The blocking procedure was compared to two control procedures. Control group 1 received
training only with the combination of triangles A and B, without any prior training with
triangle A. Control group 2 received training only with triangle B in Phase 2. As in Phase 1,
the orientation of the triangles used in Phase 2 varied across trials.
At the end of the experiment, all three groups were tested to see if they learned which
was the correct corner for finding food in triangle B. As expected, control group 2, which
was trained only with triangle B, responded very well (92% correct) when tested with tri-
angle B. Control group 1 responded nearly as well (79% correct). The high performance of
control group 1 indicates that learning the correct location in triangle B was not disrupted
much by presenting triangle B in combination with triangle A during training. However, a
major disruption in performance occurred if triangle A was pretrained before being pre-
sented with B. Participants in the blocking group responded correctly during the test trials
with triangle B only 20% of the time. This illustrates the blocking effect.
Since the time of Aristotle, temporal contiguity has been considered the primary
means by which stimuli become associated. The blocking effect is a landmark phenome-
non in classical conditioning because it calls into question the assumption that temporal
contiguity is sufficient for learning.

FIGURE 4.15 Diagram of the blocking procedure. During Phase 1, stimulus A is conditioned with the US in the experimental group, while the control group receives stimulus A presented unpaired with the US. During Phase 2, both experimental and control groups receive conditioning trials in which stimulus A is presented simultaneously with stimulus B and paired with the US. A later test of stimulus B alone shows less conditioned responding to stimulus B in the experimental group than in the control group.

FIGURE 4.16 Blocking in human learning about geometric cues. Human participants had to move a cursor to the correct geometric location (indicated by + in each triangle) on conditioning trials in Phase 1 (for the blocking group) and Phase 2 (for all groups). Percent correct during the test trials with triangle B: blocking group, 20%; control group 1, 79%; control group 2, 92% (based on Prados, 2011).

The blocking effect clearly shows that pairings of a
CS with a US are not enough for conditioned responding to develop. During Phase 2 of
the blocking experiment, CSB is paired with the US in an identical fashion for the exper-
imental and the control group. Nevertheless, CSB comes to elicit vigorous conditioned
responding only in the control group.
Why does the presence of the previously conditioned stimulus A block the acquisi-
tion of responding to the added cue B? Kamin, who discovered the blocking effect,
explained the phenomenon by proposing that a US has to be surprising to be effective
in producing learning. If the US is signaled by a previously conditioned stimulus (A), it
will not be surprising. Kamin reasoned that if the US is not surprising, it will not activate
the “mental effort” required for learning.
If something is not surprising, we already know a lot about it and therefore have
little to learn. Learning is necessary with unexpected events because what makes some-
thing unexpected is that we don’t know enough to make good predictions about it. The
basic idea that learning occurs when something is surprising is a fundamental concept in
learning theory. For example, in their discussion of a Bayesian approach to learning,
Courville and colleagues noted that “change increases uncertainty, and speeds subse-
quent learning, by making old evidence less relevant to the present circumstances”
(Courville, Daw, & Touretzky, 2006).
The Rescorla–Wagner Model
The idea that the effectiveness of a US is determined by how surprising it is forms the
basis of a formal mathematical model of conditioning proposed by Robert Rescorla and
Allan Wagner (Rescorla & Wagner, 1972; Wagner & Rescorla, 1972). With the use of
this model, investigators have extended the implications of the concept of US surprise
to a wide variety of conditioning phenomena. The Rescorla–Wagner model has become
a reference point for all subsequent learning theories (Siegel & Allen, 1996), and its basic
BOX 4.2
The Picture–Word Problem in Teaching Reading: A Form of Blocking
Early instruction in reading often
involves showing children a written
word, along with a picture of what that
word represents. Thus, two stimuli are
presented together. The children have
already learned what the picture is
called (e.g., a horse). Therefore, the
two stimuli in the picture–word com-
pound include one that is already
learned (the picture) and one that is
not (the word). This makes the
picture–word compound much like
the compound stimulus in a blocking
experiment: A previously trained
stimulus is presented along with a new
one the child does not know yet.
Research on the blocking effect pre-
dicts that the presence of the previ-
ously trained picture should disrupt
learning about the word. Singh and
Solman (1990) found that this is
indeed the case in a study of reading
with students who had mild intellec-
tual disabilities.
The children were taught to read
words such as knife, lemon, radio,
stamp, and chalk. Some of the words
were taught using a variation of the
blocking design in which the picture of
the object was presented first and the
child was asked to name it. The picture
was then presented together with its
written word, and the child was asked,
“What is that word?” In other condi-
tions, the words were presented with-
out their corresponding pictures. All
eight participants showed the slowest
learning for the words that were
taught with the corresponding pictures
present. By contrast, six of the eight
children showed the fastest learning of
the words that were taught without
their corresponding pictures. (The
remaining two participants learned
most rapidly with a modified
procedure.) These results suggest that
processes akin to blocking may occur
in learning to read. The results also
suggest that pictorial prompts should
be used with caution in reading
instruction because they may disrupt
rather than facilitate learning (see also
Didden, Prinsen, & Sigafoos, 2000;
Dittlinger & Lerman, 2011).
assumptions are being identified in studies of the neural mechanisms of learning (e.g.,
Spoormaker et al., 2011; Zelikowsky & Fanselow, 2011).
What does it mean to say that something is surprising? How might we measure the
level of surprise of a US? By definition, an event is surprising if it is different from what is
expected. If you expect a small gift for your birthday and get a car, you will be very sur-
prised. This is analogous to an unexpectedly large US. Correspondingly, if you expect a
car and receive a box of candy, you will also be surprised. This is analogous to an unex-
pectedly small US. According to the Rescorla–Wagner model, an unexpectedly large US
is the basis for excitatory conditioning and an unexpectedly small US (or the absence of
the US) is the basis for inhibitory conditioning.
Rescorla and Wagner assumed that the level of surprise, and hence the effectiveness
of a US, depends on how different the US is from what the individual expects. Further-
more, they assumed that expectation of the US is related to the conditioned or associa-
tive properties of the stimuli that precede the US. Strong conditioned responding
indicates strong expectation of the US; weak conditioned responding indicates a low
expectation of the US.
These ideas can be expressed mathematically by using λ to represent the US that is delivered on a given trial and V to represent the associative value of the stimuli that precede the US. The level of US surprise will then be (λ − V), or the difference between what occurs (λ) and what is expected (V). On the first conditioning trial, what occurs (λ) is much larger than what is expected (V), and the surprise factor (λ − V) will be large (Figure 4.17). As learning proceeds, expectations (V) will come in line with what occurs (λ), and the surprise term (λ − V) will get smaller and smaller. Eventually, V will grow to match λ. At the limit or asymptote of learning, V = λ and the surprise term (λ − V) is equal to zero.

Learning on a given conditioning trial is the change in the associative value of a stimulus. That change can be represented as ΔV. The idea that learning depends on the level of surprise of the US can be expressed as follows:

ΔV = k(λ − V),    (4.1)

where k is a constant related to the salience of the CS and US. This is the fundamental equation of the Rescorla–Wagner model. It is also known as the delta rule, in reference to the Greek symbol delta (Δ), which refers to the change in the associative value of a stimulus as a result of a conditioning trial.

FIGURE 4.17 Growth of associative value (V) during the course of conditioning until the asymptote of learning (λ) is reached. Note that the measure of surprise (λ − V) is much larger early in training than late in training.
The delta rule indicates that the amount of learning (ΔV) is proportional to how far predictions of the US (V) differ from what actually occurs (λ), or how big the error is in predicting the US. The prediction error (λ − V) is large at first but is gradually eliminated as learning proceeds. Thus, the Rescorla–Wagner equation is an error-correction mechanism. Some form of the delta rule is common in theories of learning and is also used extensively in robotics, where error corrections are required to bring a system (V) in line with a target (λ).
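Because Equation 4.1 is fully explicit, it is easy to simulate. The short Python sketch below applies the delta rule across a series of CS–US pairings; the parameter values (k = 0.3, λ = 1.0) are illustrative assumptions, not values from the text.

def rescorla_wagner(trials=10, k=0.3, lam=1.0):
    # Apply dV = k * (lam - V) once per CS-US pairing and record V.
    v = 0.0
    history = []
    for _ in range(trials):
        v += k * (lam - v)  # the prediction error (lam - v) drives learning
        history.append(v)
    return history

for trial, v in enumerate(rescorla_wagner(), start=1):
    print(f"trial {trial:2d}: V = {v:.3f}")

V rises steeply at first, when the prediction error (λ − V) is large, and levels off as the error approaches zero, reproducing the negatively accelerated learning curve in Figure 4.17.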
Application of the Rescorla–Wagner Equation to the Blocking Effect The Rescorla–Wagner model clearly predicts the blocking effect. In applying the model, it is important to keep in mind that expectations of the US are based on all of the cues available during the conditioning trial. As was presented in Figure 4.15, the experimental group in the blocking design first receives extensive conditioning of stimulus A so that it acquires a perfect expectation that the US will occur whenever it encounters stimulus A. Therefore, by the end of Phase 1, VA will be equal to the asymptote of learning, or VA = λ.

In Phase 2, stimulus B is presented together with stimulus A, and the two CSs are followed by the US. To predict what will be learned about stimulus B, the basic Rescorla–Wagner equation has to be applied to stimulus B: ΔVB = k(λ − V). In carrying out this calculation, keep in mind that V is based on all of the stimuli present on a trial. In Phase 2, there are two cues: A and B. Therefore, V = VA + VB. Because of its Phase 1 training, VA = λ at the start of Phase 2. In contrast, VB starts out at zero. Therefore, at the start of Phase 2, VA + VB is equal to λ + 0, or λ. Substituting this value into the equation for ΔVB gives a value for ΔVB of k(λ − λ), or k(0), which is equal to zero. This indicates that stimulus B will not acquire associative value in Phase 2. Thus, the conditioning of stimulus B will be blocked.
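The same delta rule can be extended to compound stimuli by letting the prediction on each trial equal the summed associative values of all cues present. The following sketch runs the blocking design of Figure 4.15 with illustrative, assumed parameter values.

k, lam = 0.3, 1.0

def compound_trial(values, present):
    # The US prediction is the sum of the values of all CSs present;
    # every present CS is updated by the same prediction error.
    total = sum(values[cs] for cs in present)
    for cs in present:
        values[cs] += k * (lam - total)

experimental = {"A": 0.0, "B": 0.0}
control = {"A": 0.0, "B": 0.0}

# Phase 1: only the experimental group gets A conditioned to asymptote.
for _ in range(30):
    compound_trial(experimental, ["A"])

# Phase 2: both groups get the A+B compound paired with the US.
for _ in range(10):
    compound_trial(experimental, ["A", "B"])
    compound_trial(control, ["A", "B"])

print("V_B experimental:", round(experimental["B"], 3))  # near zero: blocked
print("V_B control:    ", round(control["B"], 3))        # substantial value

In this simulation, VB remains near zero in the experimental group because stimulus A already predicts the US perfectly, whereas VB acquires substantial value in the control group.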
Loss of Associative Value Despite Pairings with the US The Rescorla–Wagner
model has become a prominent theory of learning because it makes some unusual pre-
dictions. One unusual prediction is that the associative value of a CS can decrease despite
continued pairings with the US. How might this happen? Stimuli are predicted to lose
associative value if they are presented together on a conditioning trial after having been
trained separately. Such an experiment is outlined in Figure 4.18.
Figure 4.18 shows a three-phase experiment. In Phase 1, stimuli A and B are paired
with the same US (e.g., one pellet of food) on separate trials. This continues until both A
and B predict the one food pellet US perfectly. Thus, at the end of Phase 1, both VA and
VB will equal λ. In Phase 2, stimuli A and B are presented simultaneously for the first
time, and this stimulus compound is followed by the usual single food pellet. What hap-
pens to the conditioned properties of A and B as a result of the Phase 2 training?
Note that the same US that was used in Phase 1 continues to be presented in Phase 2.
Given that there is no change in the US, informal reflection suggests that the conditioned
properties of A and B should also remain unchanged. In contrast to this common-sense
prediction, the Rescorla–Wagner model predicts that the conditioned properties of the
individual cues A and B will decrease in Phase 2.
As a result of training in Phase 1, A and B both predict the one food pellet US (VA = λ; VB = λ). When A and B are presented simultaneously for the first time in Phase 2, the expectations based on the individual stimuli are added together, with the result that two food pellets are predicted rather than one (VA+B = VA + VB = 2λ). This is an overexpectation because the US remains only one food pellet. The US in Phase 2 is surprisingly
small. To bring US expectancy in line with what actually occurs in Phase 2, the par-
ticipants have to decrease their expectancy of the US based on stimuli A and B. Thus,
A and B are predicted to lose associative value despite continued presentations of the
same US. The loss in associative value will continue until the sum of the expectancies
based on A and B equals one food pellet. The predicted loss of the CR to the individ-
ual cues in this type of procedure is highly counterintuitive but has been verified in a
number of experiments (e.g., Kehoe & White, 2004; Lattal & Nakajima, 1998; see also
Sissons & Miller, 2009).
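The overexpectation prediction can be verified with the same kind of simulation (illustrative parameters only; one food pellet is represented as λ = 1.0):

k, lam = 0.3, 1.0
v = {"A": 0.0, "B": 0.0}

def trial(present):
    total = sum(v[cs] for cs in present)
    for cs in present:
        v[cs] += k * (lam - total)

# Phase 1: A and B are conditioned separately to asymptote.
for _ in range(30):
    trial(["A"])
    trial(["B"])
print("after Phase 1: V_A = %.2f, V_B = %.2f" % (v["A"], v["B"]))

# Phase 2: the A+B compound is still followed by the same one-pellet US.
# The summed prediction (about 2.0) overshoots lam, so both cues lose value.
for _ in range(20):
    trial(["A", "B"])
print("after Phase 2: V_A = %.2f, V_B = %.2f" % (v["A"], v["B"]))

Both cues begin Phase 2 at asymptote, so the compound prediction of about 2λ overshoots the single pellet, the error term turns negative, and VA and VB each decline until their sum again equals λ.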
Conditioned Inhibition How does the Rescorla–Wagner model explain the develop-
ment of conditioned inhibition? Consider, for example, Pavlov’s procedure for inhibitory
conditioning (Figure 3.11). This procedure involves two kinds of trials: one in which the
US is presented (reinforced trials), and one in which the US is omitted (nonreinforced
trials). On reinforced trials, a conditioned excitatory stimulus (CS+) is presented and
paired with the US. On nonreinforced trials, the CS+ is presented together with the condi-
tioned inhibitory stimulus CS–, and the compound is followed by the absence of the US.
Application of the Rescorla–Wagner model to such a procedure requires considering
reinforced and nonreinforced trials separately. To accurately anticipate the US on rein-
forced trials, the CS+ has to gain excitatory properties. The development of such condi-
tioned excitation is illustrated in the panel on the left of Figure 4.19. Excitatory
conditioning involves the acquisition of positive associative value and ceases once the
organism predicts the US perfectly on each reinforced trial.
On nonreinforced trials, the CS+ and CS– are presented together. Once the CS+ has
acquired some degree of conditioned excitation (because of its presentation on reinforced
FIGURE 4.18 Diagram of the overexpectation experiment. In Phase 1, stimuli A and B are separately conditioned to asymptote with a one-pellet US. In Phase 2, an overexpectation is created by presenting A and B simultaneously and pairing the compound stimulus with a one-pellet US. In Phase 3, A and B are tested individually and found to have lost associative value because of the overexpectation in Phase 2.
FIGURE 4.19 Left panel: Acquisition of conditioned excitation to CS+ and conditioned inhibition to CS–. The Net curve is the associative value of the CS+ and CS– presented simultaneously. Right panel: Predicted extinction of excitation to CS+ and inhibition to CS– when these cues are presented repeatedly without the US, according to the Rescorla–Wagner model.
trials), the organism will expect the US whenever the CS+ occurs, including on nonrein-
forced trials. However, the US does not happen on nonreinforced trials. Therefore, this is
a case of overexpectation, similar to the example in Figure 4.18. To accurately predict the
absence of the US on nonreinforced trials, the associative value of the CS+ and the value of the CS– have to sum to zero (the value of λ on trials when no US occurs). How can this be
achieved? Given the positive associative value of the CS+, the only way to achieve a net
zero expectation of the US on nonreinforced trials is to make the associative value of the
CS– negative. Hence, the Rescorla–Wagner model explains conditioned inhibition by
assuming that the CS– acquires negative associative value (see the panel on the left of
Figure 4.19).
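The same error-correction arithmetic shows why the CS– is driven negative. In the sketch below, reinforced CS+ trials alternate with nonreinforced CS+/CS– compound trials; the parameter values and the strict alternation of trial types are simplifying assumptions.

```python
# A minimal Rescorla-Wagner sketch of Pavlov's conditioned inhibition
# procedure. Parameter values and strict trial alternation are
# simplifying assumptions.

K = 0.2             # learning-rate parameter
LAMBDA_US = 1.0     # asymptote on reinforced trials
LAMBDA_NO_US = 0.0  # asymptote on nonreinforced trials

V_plus, V_minus = 0.0, 0.0

for trial in range(200):
    if trial % 2 == 0:
        # Reinforced trial: CS+ alone, followed by the US.
        V_plus += K * (LAMBDA_US - V_plus)
    else:
        # Nonreinforced trial: CS+/CS- compound, US omitted.
        # The summed prediction must be driven toward zero, which
        # is only possible if V_minus becomes negative.
        error = LAMBDA_NO_US - (V_plus + V_minus)
        V_plus += K * error
        V_minus += K * error

print(f"V(CS+) = {V_plus:.2f}, V(CS-) = {V_minus:.2f}")
# V(CS+) ends up positive and V(CS-) negative, so their sum on
# nonreinforced trials is close to zero.
```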
Extinction of Excitation and Inhibition In an extinction procedure, the CS is pre-
sented repeatedly without the US (Chapter 9). Predictions of the Rescorla–Wagner
model for extinction are illustrated in the panel on the right of Figure 4.19. If a CS has
acquired excitatory properties, there will be an overexpectation of the US the first time
the CS+ is presented without the US in extinction. Repeated nonreinforced presentations
of the CS+ will result in a progressive reduction of the associative value of the CS+ until
VCS+ reaches zero.
The Rescorla–Wagner model predicts an analogous scenario for extinction of condi-
tioned inhibition (panel on the right of Figure 4.19). At the start of extinction, the CS– has
negative associative value. This may be thought of as creating an underprediction of the
US: The organism predicts less than the zero US that occurs on extinction trials. To align
expectations with the absence of the US, the negative associative value of the CS– is grad-
ually reduced and the CS– ends up with zero associative strength.
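Extending the same simulation to extinction makes these predictions concrete: with λ = 0 on every trial, both excitatory and inhibitory values decay toward zero. As the next paragraph notes, the second of these predictions turns out to be empirically wrong, so the sketch below (with illustrative starting values) shows what the model claims, not what animals actually do.

```python
# Rescorla-Wagner predictions for extinction: the CS is presented alone,
# so lambda = 0 on every trial. Starting values and the learning rate
# are illustrative assumptions.

K = 0.2
V_cs_plus = 1.0     # an excitatory CS entering extinction
V_cs_minus = -1.0   # an inhibitory CS entering extinction

for _ in range(30):
    V_cs_plus += K * (0.0 - V_cs_plus)     # decays toward zero
    V_cs_minus += K * (0.0 - V_cs_minus)   # predicted to decay as well

print(f"V(CS+) = {V_cs_plus:.3f}, V(CS-) = {V_cs_minus:.3f}")  # both near 0
```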
Problems with the Rescorla–Wagner Model The Rescorla–Wagner model stimu-
lated a great deal of research and led to the discovery of many new and important phenom-
ena in classical conditioning (Siegel & Allen, 1996). Not unexpectedly, however, the model
has also encountered a number of difficulties (see Miller, Barnet, & Grahame, 1995).
One of the difficulties with the model is that its analysis of the extinction of condi-
tioned inhibition is not correct. As indicated in Figure 4.19, the model predicts that
repeated presentations of a conditioned inhibitor (CS–) by itself will lead to loss of con-
ditioned inhibition. However, this does not happen (Zimmer-Hart & Rescorla, 1974;
Witcher & Ayres, 1984). In fact, some investigators have found that repeated nonreinfor-
cement of a CS– can enhance its conditioned inhibitory properties (e.g., DeVito &
Fowler, 1987; Hallam et al., 1992). Curiously, an effective procedure for reducing the
conditioned inhibitory properties of a CS– does not involve presenting the CS– at all.
Rather, it involves extinguishing the excitatory properties of the CS+ with which the
CS– was presented during inhibitory training (Best et al., 1985; Lysle & Fowler, 1985).
Another difficulty is that the Rescorla–Wagner model views extinction as the reverse of
acquisition, or the return of the associative value of a CS to zero. However, as we will see in
Chapter 9, a growing body of evidence indicates that extinction should not be viewed as sim-
ply the reverse of acquisition. Rather extinction appears to involve the learning of a new rela-
tionship between the CS and the US (namely, that the US no longer follows the CS).
Devising a comprehensive theory of classical conditioning is a formidable chal-
lenge. Given that classical conditioning has been studied for more than a century, a
comprehensive theory must account for many diverse findings. No theory available
today has been entirely successful in accomplishing that. Nevertheless, interesting new
ideas about classical conditioning continue to be proposed and examined. Some of
these proposals supplement the Rescorla–Wagner model. Others are incompatible
with the Rescorla–Wagner model and move the theoretical debate in new directions.
BOX 4.3
Conditioning and the Amygdala
Our emotional reaction to stimuli is
modulated by a group of structures
known as the limbic system, which
lies under the cerebral cortex and
encircles the upper brain stem (Figure (iii) on the inside back cover). Two
components of this system are espe-
cially important for learning and
memory: the amygdala and the hip-
pocampus. The hippocampus is
involved in learning about complex
relations, the stimulus configurations
that encode particular locations and
episodes in time (Wang & Morris,
2010). The nature of this learning, and
the underlying neurobiological
mechanisms, will be discussed in Box
8.2. Here I focus on the amygdala
(Latin for almond), a structure that
plays a key role in linking new affective
responses to previously neutral stimuli
(Fanselow & Poulos, 2005; Johansen,
Cain, Ostroff, & LeDoux, 2011).
Interest in the amygdala stemmed
from early observations implicating
this structure in the regulation of fear.
Although rare, people who have
experienced damage to this neural
structure exhibit a peculiar lack of
fear to stimuli that signal danger.
Brain scans have revealed that pro-
cessing fear-related stimuli (e.g., pictures of fearful faces) activates the
amygdala, and electrically stimulating
this region produces feelings of fear
and apprehension.
What makes the amygdala espe-
cially interesting is that it does more
than organize our behavioral
response to stimuli that innately elicit
fear; it also mediates the conditioning
of fear to signals of danger. How this
occurs has been tuned through our
biological history, predisposing us to
exhibit heightened fear to particular
kinds of stimuli (e.g., snakes, heights).
Ordinarily, this type of learning pro-
vides an adaptive function, motivat-
ing the organism to avoid dangerous
situations. Sometimes, however, the
level of fear elicited can grow out of
proportion to the true level of danger,
producing a phobic response that can
interfere with everyday function.
Laboratory studies have shown
that discrete regions of the amygdala
serve distinct functions. For our pur-
poses, three regions are of particular
interest: the lateral (side), basal
(lower), and central nuclei. As in the
cerebellum, these nuclei can be dis-
tinguished on both anatomical and
functional criteria. Further, their role
in learning has been studied using
similar methods (stimulation, inacti-
vation or lesioning, and recording).
For example, electrical stimulation of
the central nucleus produces a range of
behavioral and physiological
responses indicative of fear, including
freezing, enhanced startle to a loud
acoustic stimulus, and a change in
heart rate (Figure 4.20A). Conversely,
lesioning the amygdala produces a
fearless creature that no longer avoids
dangerous situations. Rats normally
show signs of fear in the presence of a
predator (e.g., a cat). After having the
amygdala lesioned, a rat will approach
a cat as if the cat were a long lost friend.
Lesioning the amygdala also dis-
rupts learning about cues (CSs) that
have been paired with an aversive
event (e.g., a shock US) in a Pavlovian
paradigm. As you have learned, ani-
mals can associate many different types
of stimuli with shock. In some cases,
the cue may be relatively simple, such
as a discrete light or tone. In other
cases, a constellation of cues, such as
the environmental context in which
shock occurs, may be associated with
shock. In both cases, pairing the stim-
ulus with shock produces conditioned
fear, as indicated by a CS-induced
increase in freezing and startling.
In fear conditioning, the neural
signals elicited by the CS and US
converge within the lateral amygdala
(Figure 4.20A). Information about the
US is provided by a number of dis-
tinct neural circuits, each of which is
sufficient to support conditioning
(Lanuza, Nader, & LeDoux, 2004).
Likewise, multiple pathways can
transmit information about the CS. A
relatively direct path from the sensory
thalamus provides a coarse input that
sacrifices stimulus detail for speed.
Additional CS inputs arrive from the
cortex and likely provide a slower, but
more precise, representation of the
features of the CS. Even further
downstream, the hippocampus can
provide a cue based on the current
configuration of sensory stimuli,
allowing the organism to learn that a
particular environmental context or
spatial location signals danger (Box 8.2).
Evidence suggests that neurons
within the lateral and basal region of
the amygdala (collectively known as
the basolateral amygdala) provide a
biological link that endows a CS with
the capacity to elicit fear. Throughout
the nervous system, many forms of
learning depend on a kind of gated
channel known as the NMDA recep-
tor. As I will discuss in a subsequent
section (Box 8.2), engaging this
receptor can initiate a cellular process
that enhances synaptic connectivity,
allowing a previously neutral cue
(the CS) to elicit a new response.
Unlocking the NMDA gate requires a
strong input, such as that provided by
a shock US. Under these conditions,
the cue-elicited input to the basolat-
eral amygdala (the CS) may gain the
capacity to drive the neural machin-
ery that generates fear (the CR).
Evidence for this comes from studies
demonstrating that conditioning
endows a CS with the capacity to
engage cellular activity within the
basolateral amygdala. Further, chem-
ically inactivating this region, or
microinjecting a drug that disrupts
NMDA receptor function, blocks fear
conditioning.
The output of the fear circuit is
channeled through the central nucleus
of the amygdala, which organizes the
expression of fear. Fear can be man-
ifested in a variety of ways, depending
upon factors such as US intensity,
relative expectation, and the envi-
ronmental context. The diversity of
outputs from the central nucleus
allows the fear response to be tuned
to the context and level of danger.
One output pathway projects to a
region of the midbrain known as the
periaqueductal gray (PAG). Here
too, different regions have been
linked to distinct functions. The por-
tion that lies along the upper sides
(the dorsolateral PAG) organizes
active defensive behaviors needed for
fight and flight. These circa-strike
behaviors are engaged by direct
contact with a noxious, or life-
threatening, stimulus. The lower
(ventral) portion of the PAG mediates
CS-elicited freezing behavior. Rats
that have lesions limited to the ventral
PAG exhibit other fear-elicited beha-
viors but do not freeze.
A key premise of the Rescorla–
Wagner model (Rescorla & Wagner,
1972) is that unexpected (surprising)
events engender more processing
than those that are expected. Human
imaging and animal studies have
shown that unexpected events induce
greater neural activity within the
amygdala (McNally et al., 2011). This
suggests that the input to the amyg-
dala is regulated by a form of pre-
diction error that determines whether
learning occurs. This instructive input
may be provided by the PAG, which
receives both sensory input regarding
the magnitude of the US and a CS-
dependent signal from the amygdala
that is proportional to the current expectation of the US (Figure 4.20B). The CS signal to the ventral PAG may dampen the US input by engaging a conditioned analgesia that reduces pain through the release of an internally manufactured (endogenous) opioid. This conditioned opioid release reduces the painfulness of the shock and thereby its effectiveness in generating new learning within the amygdala. Supporting this, microinjecting an opioid into the PAG interferes with learning. Conversely, administration of a drug that prevents an opioid from acting (an opioid antagonist) can reinstate learning about an expected US in a blocking paradigm (McNally et al., 2011).

The amygdala is also involved in learning about appetitive events, but its role in appetitive conditioning differs in some fundamental ways. For example, lesioning the basolateral amygdala eliminates the CR elicited by a CS that has been paired with an aversive US but does not affect the response elicited by a cue that predicts an appetitive US (Holland & Gallagher, 1999). The lesion does, however, prevent an appetitive CS from acting as a reinforcer in both second-order conditioning and secondary reinforcement with a food or drug reward (Box 7.1). The function of the central nucleus also differs. Rather than organizing the behavioral output, for appetitive CSs, the region appears to modulate behaviors related to attentional processing. For example, rats tend to orient toward a cue that predicts food. Also, an unexpected event can increase attention to a CS, yielding an enhancement in associability that augments learning (Pearce & Hall, 1980). Lesioning the central nucleus eliminates both these effects.

J. W. Grau

amygdala An almond-shaped structure that is part of the limbic system. Nuclei (clusters of neurons) within the amygdala play a role in emotional conditioning, regulating fear-related behaviors and attention.

limbic system A set of forebrain structures, which includes the amygdala and hippocampus, involved in the regulation of emotion and motivated behavior.

periaqueductal gray (PAG) A region of the midbrain that plays a role in regulating fear-related behavior (e.g., freezing) and pain.

FIGURE 4.20 (A) A block diagram illustrating some of the neural components that mediate fear and defensive behavior. An aversive US engages parallel pathways that project to the lateral/basolateral amygdala. Information about the CS is conveyed from the sensory thalamus, the cortex, or by means of a hippocampal-dependent process. Output is channeled through the central nucleus of the amygdala, which organizes the expression of fear-mediated behavior. Distinct behavioral outcomes are produced by projections to various brain structures (adapted from Fendt & Fanselow, 1999). (B) A hypothetical model for computing an error signal in fear conditioning. The vlPAG receives sensory input from the US through ascending fibers from the spinal cord. It also receives input regarding the expectation of the US from the central amygdala (CeA) through descending fibers. Conditioned expectation of the US drives both freezing behavior and a conditioned analgesia that can reduce the incoming pain signal. Ascending output from the vlPAG is proposed to reflect the difference between the US input (λ) and what was expected (Vt), providing an error signal (ΔV) that is sent to portions of the forebrain (dmPFC and BLA) through the thalamus (adapted from McNally, Johansen, & Blair, 2011).

Attentional Models of Conditioning

In the Rescorla–Wagner model, how much is learned on a conditioning trial depends on the effectiveness of the US on that trial. North American psychologists have favored theories of learning that focus on changes in the impact of the US. In contrast, British psychologists have approached phenomena such as the blocking effect by postulating changes in how well the CS commands attention. The assumption is that increased attention facilitates learning about a stimulus, and procedures that disrupt attention to a CS disrupt learning (Mitchell & Le Pelley, 2010).

Attentional theories differ in their assumptions about what determines how much attention a CS commands on a conditioning trial. Early theories postulated a single attentional mechanism. For example, Pearce and Hall (1980) proposed that the amount of attention an animal devotes to a CS is determined by how surprising the US was on the preceding trial (see also Hall, Kaye, & Pearce, 1985; McLaren & Mackintosh, 2000). Animals have a lot to learn if the US was surprising, and that increases attention to the CS on the next trial. In contrast, if a CS was followed by an expected US, not much learning is necessary and the CS commands less attention on the next trial. (For related neural mechanisms, see Roesch et al., 2012.)

In contrast to classic single-category theories, more recent attention theories assume that there are several different forms of attention relevant to learning and conditioned behavior. For example, Hogarth, Dickinson, and Duka (2011) suggested that there are three types of attention. The first category, “looking for action,” is the attention a stimulus commands after it has become a good predictor of the US and can generate a CR with minimal cognitive effort. Looking for action is similar to the attentional mechanism of Mackintosh (1975) and reflects the behavioral control by well-trained cues. The second category, called “looking for learning,” is the type of
attention that is involved in processing cues that are not yet good predictors of the
US and therefore have much to be learned about. Thus, looking for learning is similar
to the Pearce and Hall (1980) attentional mechanism I described earlier. The third
category, “looking for liking,” refers to the attention that stimuli command because
of their emotional value (how much they are liked or disliked). In addition to speci-
fying different categories of attention, investigators are starting to identify the neural
circuits responsible for these differences (e.g., Holland & Maddux, 2010).
An important feature of attention theories is that they assume that the outcome
of a given trial alters the degree of attention commanded by the CS on future trials.
For example, if Trial 10 ends in a surprising US, it will increase the looking for learn-
ing form of attention on Trial 11. Thus, US surprise is assumed to have a prospective,
or proactive, influence on attention and conditioning. This is an important contrast to
US-reduction models such as the Rescorla–Wagner model, in which the “surprising-
ness” of the US on a given trial determines what is learned on that same trial.
The assumption that the outcome of a given trial influences what is learned on the
next trial has made attention models unique in explaining a number of interesting find-
ings (e.g., Mackintosh, Bygrave, & Picton, 1977). However, attention models cannot
explain one-trial blocking. The presence of the previously conditioned CSA in Phase 2
of the blocking design makes the US fully expected. Any reduction in attention to CSB
that results from this would only be manifest on subsequent trials of Phase 2. CSB should
command full attention on the first trial of Phase 2, and learning about CSB should pro-
ceed normally. However, that does not occur. The conditioning of CSB can be blocked by
CSA even on the first trial of Phase 2 (e.g., Azorlosa & Cicala, 1986; Dickinson, Nicholas, &
Mackintosh, 1983; Gillan & Domjan, 1977).
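The one-trial blocking problem can be made concrete with a small sketch of a Pearce–Hall-style attention rule, in which attention on a trial is set by the surprisingness of the US on the preceding trial. The update rule shown is a simplified rendering of the model, and all values (including the high initial attention assumed for a novel cue) are illustrative assumptions.

```python
# A simplified Pearce-Hall-style sketch of attention in the blocking
# design. All parameter values are illustrative assumptions.

S = 0.1         # salience (learning-rate) parameter
LAMBDA = 1.0    # US magnitude
V_A = 1.0       # CSA was conditioned to asymptote in Phase 1
V_B = 0.0       # CSB is novel at the start of Phase 2
alpha_B = 1.0   # assumption: a novel cue commands full attention

# First compound (A+B) trial of Phase 2: learning about B proceeds at
# B's current (still high) attention level.
V_B += S * alpha_B * LAMBDA
print(f"After compound trial 1: V_B = {V_B:.2f}")

# Attention is recomputed only AFTER the trial, from how surprising the
# US was given the summed prediction of all cues present.
alpha_B = abs(LAMBDA - (V_A + V_B))
print(f"Attention to B entering trial 2: alpha_B = {alpha_B:.2f}")

# Because the attentional change is prospective, B gains value normally
# on trial 1; yet blocking is observed even on the first compound trial,
# which this mechanism cannot explain.
```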
Timing and Information Theory Models
Neither the Rescorla–Wagner model nor attentional models were designed to explain the
effects of time in conditioning. However, time is obviously a critical factor. One impor-
tant temporal variable is the CS–US interval. As illustrated in Figure 4.9, focal search
responses become conditioned with a relatively short CS–US interval, whereas general
search responses become conditioned with a long CS–US interval. Most studies of classi-
cal conditioning measure responses closely related to the US. Therefore, less conditioned
behavior is evident with increases in the CS–US interval.
The generally accepted view now is that in a Pavlovian procedure, participants learn
not only that a CS is paired with a US, but when that US will occur (e.g., Balsam, Drew,
& Yang, 2001; Balsam, Drew, & Gallistel, 2010). Based on their findings, Williams and
colleagues (2008) went even further to claim that learning when the US occurs trumps
learning whether it occurs.
The idea that participants learn about the point in time when the US occurs is called
temporal coding. The temporal coding hypothesis states that participants learn when the
US occurs in relation to a CS and use this information in blocking, second-order condi-
tioning, and other paradigms in which what is learned in one phase of training influ-
ences what is learned in a subsequent phase. Numerous studies have upheld interesting
predictions of the temporal coding hypothesis (e.g., Amundson & Miller, 2008; Cole,
Barnet, & Miller, 1997; Savastano & Miller, 1998).
Another important temporal variable is the interval between successive trials. Generally,
more conditioned responding is observed with longer intertrial intervals (e.g., Sunsay &
Bouton, 2008). In addition, the intertrial interval and the CS duration (or CS–US interval)
act in combination to determine responding. Numerous studies have shown that the critical
factor is the relative duration of these two temporal intervals rather than the absolute value
of either one by itself (Gallistel & Gibbon, 2000; Balsam & Gallistel, 2009). A particularly
clear example of this relationship was reported by Holland (2000) based on an experiment
with laboratory rats.
Conditioning trials consisted of an auditory cue (white noise) presented just before
delivery of food into a cup. The conditioned response that developed to the CS was nos-
ing of the food cup (goal tracking). Each group was conditioned with one of two CS
durations, either 10 seconds or 20 seconds, and one of six intertrial intervals (ranging
from 15 seconds to 960 seconds). (Intertrial intervals were measured from one US deliv-
ery to the next.) Each procedure could be characterized in terms of the ratio of the inter-
trial interval (I) and the CS duration, which Holland called the trial duration (T). The
results of the experiment are summarized in Figure 4.21. Time spent nosing the food
cup during the CS is shown as a function of the relative value of the intertrial interval
(I) and the trial duration (T) for each group. Notice that conditioned responding was
directly related to the I/T ratio. At each I/T ratio, the groups that received the 10-
second CS responded similarly to those that received the 20-second CS. (For other
types of results involving the I/T ratio, see Burns & Domjan, 2001; Kirkpatrick &
Church, 2000; Lattal, 1999.)
Why is conditioned responding determined by the I/T ratio? A ratio suggests a com-
parison, in this case between events during the intertrial interval (I) and the conditioning
trial (T). What is being compared has been expressed in various ways over the years
(Gibbon & Balsam, 1981; Gallistel & Gibbon, 2000). According to the relative-
waiting-time hypothesis (Jenkins, Barnes, & Barrera, 1981), the comparison is between
how long one has to wait for the US during the CS versus how long one has to wait for
the US during the intertrial interval (the interval from one US presentation to the next).
When the US waiting time during the CS is much shorter than during the intertrial
interval, the I/T ratio is high. Under these circumstances, the CS is highly informative
about the next occurrence of the US and high levels of responding occur. In contrast,
with a low I/T ratio the US waiting time during the intertrial interval is similar to the
US waiting time during the CS. In this case, the CS provides little new information
about the next US, and not much conditioned responding develops.
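Because the I/T ratio is a simple quotient, the equivalence Holland observed is easy to illustrate numerically. The helper function below is hypothetical and the specific interval values are illustrative; the point is that very different absolute durations can yield identical I/T ratios.

```python
# Computing I/T ratios for the relative-waiting-time comparison.
# The function name and the specific values are illustrative assumptions.

def i_t_ratio(intertrial_interval: float, trial_duration: float) -> float:
    """Ratio of US waiting time in the situation at large (I) to US
    waiting time during the CS (T)."""
    return intertrial_interval / trial_duration

# Longer intertrial intervals make the CS more informative:
for interval in (30, 120, 960):
    print(f"I = {interval:3d} s, T = 20 s -> I/T = {i_t_ratio(interval, 20):.1f}")

# Different absolute durations can produce the same ratio. A 10-second CS
# with a 240-second intertrial interval matches a 20-second CS with a
# 480-second interval, and Holland (2000) found similar responding at
# equal I/T ratios.
print(i_t_ratio(240, 10), i_t_ratio(480, 20))  # 24.0 24.0
```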
The idea that conditioned responding depends on the information value of the CS
has been developed in greater mathematical detail by Balsam and Gallistel (2009). Based
on these calculations, Balsam and Gallistel concluded that “the CS is associable with a
US … only to the extent that it reduces the expected time to the next US” (p. 77).
FIGURE 4.21 Percentage of time rats spent nosing the food cup during an auditory CS in conditioning with either a 10-second or a 20-second trial duration (T) and various intertrial intervals (I) that created I/T ratios ranging from 1.5 to 48.0. Data are shown in relation to responding during baseline periods when the CS was absent. (Based on “Trial and Intertrial Durations in Appetitive Conditioning in Rats,” by P. C. Holland, 2000, Animal Learning & Behavior, Vol. 28, Figure 2, p. 125.)
Although this conclusion helps us understand effects of the I/T ratio on conditioned
responding, it is primarily applicable to situations that involve multiple conditioning
trials (where there is a “next US”). These and related ideas (e.g., Gallistel & Gibbon,
2000) are difficult to apply to situations in which learning occurs in a single trial (e.g.,
taste-aversion learning or fear conditioning) and there is no “next US” (see also Domjan,
2003).
The Comparator Hypothesis
The relative-waiting-time hypothesis and related theories were developed to explain the
effects of temporal factors in excitatory conditioning. One of their important contribu-
tions was to emphasize that conditioned responding depends not only on what happens
during the CS but also on what happens in other aspects of the experimental situation.
The idea that both these factors influence learned performance is also central to the
comparator hypothesis and its successors developed by Ralph Miller and his collabora-
tors (Denniston, Savastano, & Miller, 2001; Miller & Matzel, 1988; Stout & Miller, 2007).
The comparator hypothesis was motivated by an interesting set of findings known as
revaluation effects. Consider, for example, the blocking phenomenon (Figure 4.15). Parti-
cipants first receive a phase of training in which CSA is paired with the US. CSA is then
presented simultaneously with CSB, and this stimulus compound is paired with the US.
Subsequent tests of CSB by itself show little responding to CSB. As we discussed, the
Rescorla–Wagner model interprets the blocking effect as a failure of learning to CSB.
The presence of CSA blocks the conditioning of CSB.
The comparator hypothesis takes a different approach. It assumes that what is
blocked is responding to CSB, not learning about CSB. If that is true, then responding
to CSB should become evident if the block is removed somehow. How might that be
accomplished? As it turns out, one way to remove the block to CSB is to eliminate
responding to CSA after compound conditioning by presenting CSA repeatedly without
the US. A number of studies have shown that such extinction of CSA following the
blocking procedure unmasks conditioned responding to CSB (e.g., Blaisdell, Gunther,
& Miller, 1999; Boddez et al., 2011). This is called a revaluation effect because it
involves changing the conditioned value of a stimulus (CSA) that was present during
the training of the target stimulus CSB. The unmasking of responding to CSB shows
that blocking did not prevent the conditioning of CSB but disrupted performance of
the response to CSB.
Inspired by revaluation effects, the comparator hypothesis is a theory of perfor-
mance rather than learning. It assumes that conditioned responding depends not only
on associations between a target CS and the US but also on associations that may be
learned between the US and other stimuli that were present when the target CS was
being conditioned. These other stimuli are called the comparator cues and include the
experimental context and other discrete CSs. In the blocking experiment, the target stim-
ulus is CSB and the primary comparator cue is the previously trained CSA that is present
during the conditioning of CSB.
Another key assumption of the comparator hypothesis is that it only allows for the
formation of excitatory associations with the US. Whether conditioned responding
reflects excitation or inhibition is assumed to be determined by the relative strengths of
excitation conditioned to the target CS as compared to the excitatory value of the com-
parator stimuli that were present with the target CS during training.
The comparator process is represented by the balance in Figure 4.22. As Figure 4.22
illustrates, a comparison is made between the excitatory value of the target CS and the
excitatory value of the comparator cues that are present during the training of the target
CS. If the excitatory value of the target CS exceeds the excitatory value of the comparator
cues, the balance of the comparison will be tipped in favor of excitatory responding. In
contrast, if the excitatory value of the comparator cues exceeds the excitatory value of the
target CS, the balance will be tipped in favor of inhibitory responding to the target CS.
Unlike the relative-waiting-time hypothesis, the comparator hypothesis emphasizes
associations rather than time. In its simpler form, the theory assumes that organisms
learn three associations during the course of conditioning. These are illustrated in
Figure 4.23. The first association (Link 1 in Figure 4.23) is between the target CS (X)
and the US. The second association (Link 2) is between the target CS (X) and the com-
parator cues. Finally, there is an association between the comparator stimuli and the US
(Link 3). With all three of these links in place, once the CS is presented, it activates the
US representation directly (through Link 1) and indirectly (through Links 2 and 3). A
comparison of the direct and indirect activations determines the degree of excitatory or
inhibitory responding that occurs (for further elaboration, see Stout & Miller, 2007).
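One way to see how the comparator hypothesis works as a performance rule is to compute the direct and indirect activations explicitly. In the sketch below, the indirect pathway is approximated as the product of Links 2 and 3, and the response is scored as the difference between the pathways; this particular arithmetic, and all link strengths, are simplifying assumptions rather than the formal model of Stout and Miller (2007).

```python
# A sketch of the comparator hypothesis as a performance rule. The
# product rule for the indirect pathway and the final subtraction are
# simplifying assumptions; link strengths are illustrative.

def comparator_response(v_target_us, v_target_comp, v_comp_us):
    """Positive output -> excitatory responding to the target CS;
    negative output -> inhibitory responding."""
    direct = v_target_us                   # Link 1: target -> US
    indirect = v_target_comp * v_comp_us   # Links 2 and 3: target -> comparator -> US
    return direct - indirect

# Blocking: B acquired some B-US association, but its comparator (A)
# strongly predicts the US, so little responding to B is expressed.
print(comparator_response(v_target_us=0.6, v_target_comp=0.6, v_comp_us=1.0))

# Revaluation: extinguishing A after compound training drives the
# comparator-US link toward zero and unmasks responding to B, with no
# new learning about B itself.
print(comparator_response(v_target_us=0.6, v_target_comp=0.6, v_comp_us=0.0))
```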
FIGURE 4.22 Illustration of the comparator hypothesis. Whether the target CS elicits inhibitory or excitatory responding depends on whether the balance tips to the left or the right. If the excitatory value of the target CS is greater than the excitatory value of the comparator cues present during training of the target, the balance tips to the right, in favor of excitatory responding. As the associative value of the comparator stimuli increases, the balance becomes less favorable for excitatory responding and may tip to the left, in favor of inhibitory responding.
FIGURE 4.23 The associative structure of the comparator hypothesis. The target CS is represented as X. Excitatory associations result in activation of the US representation, either directly by the target (Link 1) or indirectly (through Links 2 and 3). (Based on Friedman et al., 1998, Journal of Experimental Psychology: Animal Behavior Processes, 2, p. 454.)
An important corollary to the theory is that the comparison that determines
responding is made when participants are tested for their conditioned behavior. Because
of this assumption, the comparator hypothesis makes the unusual prediction that extinc-
tion of comparator–US associations following training of a target CS will enhance
responding to that target CS. It is through this mechanism that the comparator hypoth-
esis is able to predict that the blocking effect can be reversed by extinguishing the block-
ing stimulus (CSA). (For additional examples of such revaluation effects, see McConnell,
Urushihara, & Miller, 2010; Miguez, Witnauer, & Miller, 2012.)
The comparator hypothesis has also been tested in studies of conditioned inhibi-
tion. In a conditioned inhibition procedure (e.g., see Figure 3.11), the target is the
CS–. During conditioned inhibition training, the CS– is presented together with a CS+
that provides the excitatory context for the learning of inhibition. Thus, the comparator
stimulus is the CS+. Consider the comparator balance presented in Figure 4.22.
According to this balance, inhibitory responding will occur to the target (CS–) because
it has less excitatory power than its comparator (the CS+). Thus, the comparator
hypothesis predicts inhibitory responding in situations where the association of the tar-
get CS with the US is weaker than the association of the comparator cues with the US.
Conditioned inhibition is not viewed as the result of negative associative value but as
the result of the balance of the comparison tipping away from the target and in favor
of the comparator stimulus. An interesting implication of the theory is that extinction
of the comparator CS+ following inhibitory conditioning will reduce inhibitory
responding. As I noted earlier in the discussion of the extinction of conditioned inhibi-
tion, this unusual prediction has been confirmed (e.g., Best et al., 1985; Lysle &
Fowler, 1985).
Concluding Comments
Initially, some psychologists regarded classical conditioning as a relatively simple and
primitive type of learning that is involved in the regulation only of glandular and visceral
responses, such as salivation. The establishment of CS–US associations was assumed to
occur fairly automatically with the pairing of a CS and a US. Given the simple and auto-
matic nature of the conditioning, it was not viewed as important in explaining the com-
plexity and richness of human experience. Clearly, this view of classical conditioning is
no longer tenable.
The research reviewed in Chapters 3 and 4 has shown that classical conditioning
involves numerous complex processes and is involved in the control of a wide variety
of responses, including not only responses to the CS but also modifications in how the
organism responds to the unconditioned stimulus. Classical conditioning does not occur
automatically with the pairing of a CS with a US. Rather, it depends on the organism’s
prior experience with each of these stimuli, the presence of other stimuli during the
conditioning trial, and the extent to which the CS and US are relevant to each other.
Furthermore, the processes of classical conditioning are not limited to CS–US pairings.
Learned associations can occur between two biologically weak stimuli (sensory precondi-
tioning) or in the absence of a US (higher-order conditioning).
These and other complexities of classical conditioning have created significant chal-
lenges for theories of learning. The last 40 years have yielded a rich variety of theoretical
approaches. No theory provides a comprehensive account of all of the data, but each has
served to highlight the importance of various factors in how conditioning alters the
properties of conditioned and unconditioned stimuli. These factors include error-
correction mechanisms, attentional mechanisms, temporal and informational variables, and
memories of the US activated by the target CS and its comparators. The richness of classical
conditioning mechanisms makes them highly relevant to understanding the richness and
complexity of human experience.
Sample Questions
1. What, if any, limits are there on the kinds of
stimuli that can serve as conditioned and
unconditioned stimuli in Pavlovian
conditioning?
2. Describe several examples of how Pavlovian
conditioning can modify how one responds to the
unconditioned stimulus. What is the adaptive
significance of this type of learning?
3. Describe an experimental design that allows
investigators to distinguish between S–R and S–S
learning.
4. Describe the basic idea of the Rescorla–Wagner
model. What aspect of the model allows it to
explain the blocking effect and make some
unusual predictions?
5. Describe three different types of attention that are
relevant to learned behavior.
6. In what respects are attentional theories of
learning different from other theories?
7. How does the intertrial interval influence learning?
8. How does the comparator hypothesis explain the
blocking effect?
Key Terms
blocking effect Interference with the conditioning of
a novel stimulus because of the presence of a previously
conditioned stimulus.
comparator hypothesis The idea that conditioned
responding depends on a comparison between the asso-
ciative strength of the conditioned stimulus (CS) and
the associative strength of other cues present during
training of the target CS.
conditioned compensatory response A conditioned
response opposite in form to the reaction elicited by
the US and that therefore compensates for this reaction.
conditioned diminution of the UR A reduction in
the magnitude of the response to an unconditioned
stimulus caused by presentation of a CS that had been
conditioned with that US.
CS-preexposure effect Interference with conditioning
produced by repeated exposures to the CS before the
conditioning trials. Also called latent-inhibition effect.
drug tolerance Reduction in the effectiveness of a
drug as a result of repeated use of the drug.
higher-order conditioning A procedure in which a
previously conditioned stimulus (CS1) is used to condi-
tion a new stimulus (CS2).
latent-inhibition effect Same as CS-preexposure effect.
relative-waiting-time hypothesis The idea that con-
ditioned responding depends on how long the organ-
ism has to wait for the US in the presence of the CS, as
compared to how long the organism has to wait for the
US in the experimental situation irrespective of the CS.
stimulus–response (S–R) learning The learning of
an association between a stimulus and a response,
with the result that the stimulus comes to elicit the
response directly.
stimulus–stimulus (S–S) learning The learning of an
association between two stimuli, with the result that
exposure to one of the stimuli comes to activate a repre-
sentation, or “mental image,” of the other stimulus.
sensory preconditioning A procedure in which one
biologically weak stimulus (CS2) is repeatedly paired
with another biologically weak stimulus (CS1). Then,
CS1 is conditioned with an unconditioned stimulus. In
a later test trial, CS2 also will elicit the conditioned
response, even though CS2 was never directly paired
with the US.
stimulus salience The significance or noticeability of
a stimulus. Generally, conditioning proceeds more rap-
idly with more salient conditioned and unconditioned
stimuli.
stimulus substitution The theoretical idea that as a
result of classical conditioning participants come to
respond to the CS in much the same way that they
respond to the US.
US-preexposure effect Interference with condition-
ing produced by repeated exposures to the uncondi-
tioned stimulus before the conditioning trials.
US devaluation Reduction in the attractiveness of an
unconditioned stimulus, usually achieved by aversion
conditioning or satiation.
CHAPTER 5
Instrumental Conditioning: Foundations
Early Investigations of Instrumental
Conditioning
Modern Approaches to the Study of
Instrumental Conditioning
Discrete-Trial Procedures
Free-Operant Procedures
Instrumental Conditioning Procedures
Positive Reinforcement
Punishment
Negative Reinforcement
Omission Training or Negative
Punishment
Fundamental Elements of Instrumental
Conditioning
The Instrumental Response
The Instrumental Reinforcer
The Response–Reinforcer Relation
Sample Questions
Key Terms
CHAPTER PREVIEW
This chapter begins our discussion of instrumental conditioning and goal-directed behavior. This is the
type of conditioning that is involved in training a quarterback to throw a touchdown or a child to skip
rope. In this type of conditioning, obtaining a goal or reinforcer depends on the prior occurrence of a
designated response. I will first describe the origins of research on instrumental conditioning and the
investigative methods used in contemporary research. This discussion lays the groundwork for the
following section in which the four basic types of instrumental conditioning procedures are described. I
will conclude the chapter with discussions of three fundamental elements of the instrumental
conditioning paradigm: the instrumental response, the reinforcer or goal event, and the relation between
the instrumental response and the goal event.
In the preceding chapters, I discussed various aspects of how responses are elicited by
discrete stimuli. Studies of habituation, sensitization, and classical conditioning are all
concerned with the mechanisms of elicited behavior. Because of this emphasis, the
procedures used in experiments on habituation, sensitization, and classical condition-
ing do not require the participant to make a particular response to obtain food or
other unconditioned or conditioned stimuli. Classical conditioning reflects how
organisms adjust to events in their environment that they do not directly control. In
this chapter, we turn to the analysis of learning situations in which the stimuli an
organism encounters are a result or consequence of its behavior. Such behavior is
commonly referred to as goal-directed or instrumental because responding is neces-
sary to produce a desired environmental outcome.
By studying hard, a student can earn a better grade; by turning the car key in the
ignition, a driver can start the engine; by putting a coin in a vending machine, a child
can obtain a piece of candy. In all these cases, some aspect of the individual’s behavior
is instrumental in producing a significant stimulus or outcome. Furthermore, the
behavior occurs because similar actions produced the same type of outcome in the past.
Students would not study if doing so did not yield better grades; drivers would not turn
the ignition key if this did not start the engine; and children would not put coins in a
vending machine if they did not get a candy in return. Behavior that occurs because it
was previously effective in producing certain consequences is called instrumental
behavior.
The fact that the consequences of an action can determine whether you make that
response again is obvious to everyone. If you happen to find a dollar bill when you
glance down, you will keep looking at the ground as you walk. How such a consequence
influences future behavior is not so readily apparent. Many of the upcoming chapters of
this book are devoted to the mechanisms responsible for the control of behavior by its
consequences. In this chapter, I will describe some of the history, basic techniques, pro-
cedures, and issues in the experimental analysis of instrumental, or goal-directed,
behavior.
How might one investigate instrumental behavior? One way would be to go
to the natural environment and look for examples. However, this approach is not
likely to lead to definitive results because factors responsible for goal-directed behav-
ior are difficult to isolate without experimental manipulation. Consider, for example,
a dog sitting comfortably in its yard. When an intruder approaches, the dog starts to
bark vigorously, and the intruder goes away. Because the dog’s barking is followed
by the departure of the intruder, we may conclude that the dog barked to produce
this outcome—that barking was goal-directed. However, an equally likely possibility
is that barking was elicited by the novelty of the intruder and persisted as long as
this eliciting stimulus was present. The departure of the intruder may have been inci-
dental to the dog’s barking. Deciding between such alternatives is difficult without
experimental manipulations of the relation between barking and its consequences.
(For an experimental analysis of a similar situation in a fish species, see Losey &
Sevenster, 1995.)
Early Investigations of Instrumental
Conditioning
Laboratory and theoretical analyses of instrumental conditioning began in earnest with
the work of the American psychologist E. L. Thorndike. Thorndike’s original intent was
to study animal intelligence (Thorndike, 1898, 1911; for a more recent commentary, see
Lattal, 1998). As I noted in Chapter 1, the publication of Darwin’s theory of evolution
encouraged scientists to think about the extent to which human intellectual capacities
were present in animals. Thorndike pursued this question through empirical research.
He devised a series of puzzle boxes for his experiments. His training procedure consisted
of placing a hungry animal (often a young cat) in the puzzle box with some food left
outside in plain view of the animal. The task for the animal was to learn how to get
out of the box and get the food.
Different puzzle boxes required different responses to get out. Some were easier
than others. Figure 5.1 illustrates two of the easier puzzle boxes. In Box A, the
required response was to pull a ring to release a latch that blocked the door on the
outside. In Box I, the required response was to push down a lever, which released a
latch. Initially, the cats were slow to make the correct response, but with continued
practice on the task, their latencies became shorter and shorter. Figure 5.2 shows
the latencies of a cat to get out of Box A on successive trials. The cat took 160 sec-
onds to get out of Box A on the first trial. Its shortest latency later on was 6 seconds
(Chance, 1999).
Thorndike’s careful empirical approach was a significant advance in the study of
animal intelligence. Another important contribution was Thorndike’s strict avoidance of
anthropomorphic interpretations of the behavior he observed. Although he titled his
treatise Animal Intelligence, to Thorndike many aspects of behavior seemed rather unin-
telligent. He did not think that his animals got faster in escaping from a puzzle box
because they gained insight into the task or figured out how the release mechanism was
FIGURE 5.1 Two of Thorndike’s puzzle boxes, A and I. In Box A, the participant had to pull a loop to release the door. In Box I, pressing down on a lever released a latch on the other side. (Left: Based on “Thorndike’s Puzzle Boxes and the Origins of the Experimental Analysis of Behavior,” by P. Chance, 1999, Journal of the Experimental Analysis of Behavior, 72, pp. 433–440. Right: Based on Thorndike, Animal Intelligence: Experimental Studies, 1898.)
designed. Rather, he interpreted the results of his studies as reflecting the learning of a
new S–R association.
When a cat was initially placed in a box, it displayed a variety of responses typical of
a confined animal. Eventually, some of these responses resulted in opening the door.
Thorndike believed that such successful escapes led to the learning of an association
between the stimuli of being in the puzzle box (S) and the effective escape response
(R). As the association, or connection, between the box cues and the successful response
became stronger, the animal came to make that response more quickly. The consequence
of a successful escape response strengthened the association between the box stimuli and
that response.
On the basis of his research, Thorndike formulated the law of effect. The law of
effect states that if a response R in the presence of a stimulus S is followed by a satisfying
event, the association between the stimulus S and the response R becomes strengthened.
If the response is followed by an annoying event, the S–R association is weakened. It is
important to stress here that, according to the law of effect, what is learned is an associ-
ation between the response and the stimuli present at the time of the response. Notice
that the consequence of the response is not one of the elements in the association. The
satisfying or annoying consequence simply serves to strengthen or weaken the associa-
tion between the preceding stimulus and response. Thus, Thorndike’s law of effect
involves S–R learning.
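The mechanics of the law of effect can be summarized in a short computational sketch. The following Python fragment is purely illustrative (the data structure, function names, and learning-rate value are assumptions for the example, not anything from Thorndike): it stores a strength for each S–R pair and adjusts that strength according to the consequence, while the consequence itself is never part of what is stored.

```python
# Minimal sketch of Thorndike's law of effect (illustrative only).
# The association is between stimulus and response; the outcome merely
# adjusts that S-R strength and is not itself part of the association.

strengths = {}          # (stimulus, response) -> association strength
LEARNING_RATE = 0.1     # assumed step size, not from the source

def update(stimulus, response, satisfying):
    """Strengthen the S-R bond after a satisfier, weaken it after an annoyer."""
    key = (stimulus, response)
    delta = LEARNING_RATE if satisfying else -LEARNING_RATE
    strengths[key] = max(0.0, strengths.get(key, 0.0) + delta)

# A cat in puzzle Box A: pulling the ring opens the door (satisfying),
# so that response becomes more likely the next time the box cues occur.
update("box_A_cues", "pull_ring", satisfying=True)
update("box_A_cues", "scratch_at_bars", satisfying=False)
print(strengths)
```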
Thorndike’s law of effect and S–R learning continue to be of considerable interest
more than 100 years since these ideas were first proposed. A key feature of Thorndike’s
S–R mechanism is that it compels the organism to make response R whenever stimulus
S occurs. This feature has made the law of effect an attractive mechanism to explain
compulsive habits that are difficult to break, such as biting one’s nails, snacking, or
smoking cigarettes. Once you start on a bucket of popcorn while watching a movie,
you cannot stop eating because the sight and smell of the popcorn (S) compels you to
grab some more popcorn and eat it (R). The compulsive nature of eating popcorn is
such that you continue to eat beyond the point of enjoying the taste. Once learned,
habitual responses occur because they are triggered by an antecedent stimulus and
not because they result in a desired consequence (Everitt & Robbins, 2005; Wood &
Neal, 2007). A habitual smoker who knows that smoking is harmful will continue to
smoke because S–R mechanisms compel lighting a cigarette independent of the conse-
quences of the response.
FIGURE 5.2 Latencies to escape from Box A during successive trials. The longest latency was 160 seconds; the shortest was 6 seconds. (Notice that the axes are not labeled, as in Thorndike’s original report.)
Modern Approaches to the Study of Instrumental Conditioning
Thorndike used 15 different puzzle boxes in his investigations. Each box required differ-
ent manipulations for the cat to get out. As more scientists became involved in studying
instrumental learning, the range of tasks they used became smaller. A few of these
became “standard” and have been used repeatedly to facilitate comparison of results
obtained in different experiments and laboratories.
Discrete-Trial Procedures
Discrete-trial procedures are similar to the method Thorndike used in that each train-
ing trial begins with putting the animal in the apparatus and ends with removal of the
animal after the instrumental response has been performed. Discrete-trial procedures
these days usually involve the use of some type of maze. The use of mazes in investiga-
tions of learning was introduced at the turn of the twentieth century by the American
psychologist W. S. Small (1899, 1900). Small was interested in studying rats and was
encouraged to use a maze by an article he read in Scientific American describing the
complex system of underground burrows that kangaroo rats build in their natural habi-
tat. Small reasoned that a maze would take advantage of the rats’ “propensity for small
winding passages.”
Figure 5.3 shows two mazes frequently used in contemporary research. The runway,
or straight-alley, maze contains a start box at one end and a goal box at the other. The
rat is placed in the start box at the beginning of each trial. The barrier separating the
start box from the main section of the runway is then raised. The rat is allowed to
make its way down the runway until it reaches the goal box, which usually contains a
reinforcer, such as food or water.
Behavior in a runway can be quantified by measuring how fast the animal gets from
the start box to the goal box. This is called the running speed.
BOX 5.1
E. L. Thorndike: Biographical Sketch
Edward Lee Thorndike was born
in 1874 and died in 1949. As an
undergraduate at Wesleyan Univer-
sity, he became interested in the
work of William James, who was
then at Harvard. Thorndike himself
entered Harvard as a graduate stu-
dent in 1895. During his stay, he
began his research on instrumental
behavior, at first using chicks.
Because there was no laboratory
space in the psychology department
at the university, he set up his
project in William James’s cellar.
Soon after that, he was offered a
fellowship at Columbia University.
This time, his laboratory was located
in the attic of psychologist James
Cattell.
Thorndike received his Ph.D.
from Columbia in 1898, for his work
entitled Animal Intelligence: An
Experimental Study of the Associative
Processes in Animals. This included
the famous puzzle-box experiments.
Thorndike’s dissertation has turned
out to be one of the most famous
dissertations in more than a century
of modern psychology. After obtain-
ing his Ph.D., Thorndike spent a
short stint at Western Reserve
University in Cleveland and then
returned to Columbia, where he
served as professor of educational
psychology in the Teachers College
for many years. Among other things,
he worked to apply to children the
principles of trial-and-error learning
he had uncovered with animals. He
also became interested in psycholog-
ical testing and became a leader in
that newly formed field. By his
retirement, he had written 507
scholarly works (without a computer
or word processor), including about
50 books (Cumming, 1999). Several
years before his death, Thorndike
returned to Harvard as the William
James Lecturer, a fitting honor con-
sidering the origins of his interests in
psychology.
The running speed typically increases with repeated training trials. Another common measure of behavior in
runways is response latency. The latency is the time it takes the animal to leave the start
box and begin running down the alley. Typically, latencies become shorter as training
progresses.
Another maze that has been used in many experiments is the T maze, shown on
the right in Figure 5.3. The T maze consists of a start box and alleys arranged in the
shape of a T. A goal box is located at the end of each arm of the T. Because the T maze
has two choice arms, it can be used to study more complex questions. For example,
Panagiotaropoulos and colleagues (2009) were interested in whether rats that are less
than two weeks old (and still nursing) could learn where their mother is located, as distinct
from another female. To answer this question, they placed the mother rat in the goal box
on the right arm of a T maze and a virgin female rat in the goal box on the left arm of the
T. The rat pups learned to turn to the right rather than the left arm of the maze with suc-
cessive trials. Furthermore, this conditioned preference persisted when the pups were
tested at the end of training without a female in either goal box. The results show that
nursing rat pups can distinguish their mother from a virgin female and can learn to go
where their mother is located.
Free-Operant Procedures
In a runway or a T maze, after reaching the goal box, the animal is removed from the
apparatus for a while before being returned to the start box for its next trial. Thus, the
animal has limited opportunities to respond, and those opportunities are scheduled by
the experimenter. By contrast, free-operant procedures allow the animal to repeat the
instrumental response without constraint over and over again without being taken out
of the apparatus until the end of an experimental session. The free-operant method was
invented by B. F. Skinner (1938) to study behavior in a more continuous manner than is
possible with mazes.
FIGURE 5.3 Top view of a runway and a T maze. S is the start box; G is the goal box.

Skinner (Figure 5.4) was interested in analyzing in the laboratory a form of behavior
that would be representative of all naturally occurring ongoing activity. However, he
recognized that before behavior can be experimentally analyzed, a measurable unit of
behavior must be defined. Casual observation suggests that ongoing behavior is contin-
uous; one activity leads to another. Behavior does not fall neatly into units, as do mole-
cules of a chemical solution or bricks on a sidewalk. Skinner proposed the concept of
the operant as a way of dividing behavior into meaningful measurable units.
FIGURE 5.4 B. F. Skinner (1904–1990)

FIGURE 5.5 A Skinner box equipped with a response lever and food-delivery device. Electronic equipment is used to program procedures and record responses automatically.

Figure 5.5 shows a typical Skinner box used to study free-operant behavior in rats.
(A Skinner box used to study pecking in pigeons was presented in Figure 1.8.) The box is
a small chamber that contains a lever that the rat can push down repeatedly. The
chamber also has a mechanism that can deliver a reinforcer, such as food or water, into a
cup. The lever is electronically connected to the food-delivery system so that when the
rat presses the lever, a pellet of food automatically falls into the food cup.
An operant response, such as the lever press, is defined in terms of the effect that
the behavior has on the environment. Activities that have the same environmental effect
are considered to be instances of the same operant response. Behavior is not defined in
terms of particular muscle movements but in terms of how the behavior operates on the
environment. The lever-press operant is typically defined as sufficient depression of the
lever to activate a recording sensor. The rat may press the lever with its right paw, its left
paw, or its tail. These different muscle movements constitute the same operant if they all
depress the lever sufficiently to trigger the sensor and produce a food pellet. Various
ways of pressing the lever are assumed to be functionally equivalent because they all
have the same effect on the environment.
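The functional definition of the operant can be restated as a hedged sketch in code. Everything here, from the threshold value to the event names, is a hypothetical illustration of the idea that an operant is defined by its environmental effect rather than by the movements that produce it.

```python
# Illustrative sketch: an operant is defined by its effect on the
# environment, not by the muscle movements that produce it.

SENSOR_THRESHOLD_MM = 3.0   # assumed depression needed to trip the sensor

def is_lever_press(depression_mm):
    """True if the movement counts as the lever-press operant."""
    return depression_mm >= SENSOR_THRESHOLD_MM

# Right paw, left paw, and tail presses are functionally equivalent
# whenever each one depresses the lever enough to trigger the sensor.
events = [("right_paw", 4.2), ("left_paw", 3.5), ("tail", 3.1), ("sniff", 0.4)]
presses = [effector for effector, mm in events if is_lever_press(mm)]
print(presses)   # ['right_paw', 'left_paw', 'tail'] -- one operant, three forms
```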
We perform numerous operants during the course of our daily lives. In opening a
door, it does not matter whether we use our right hand or left hand to turn the door
knob. The operational outcome (opening the door) is the critical measure of success.
Similarly, in basketball or baseball, it’s the operational outcome that counts—getting the
ball in the basket or hitting the ball into the outfield—rather than the way the task is
accomplished. With an operational definition of behavioral success, one does not need
a sophisticated judge to determine whether the behavior has been successfully accom-
plished. The environmental outcome keeps the score. This contrasts with behaviors
such as figure skating or gymnastics. In those cases, the way something is performed is
just as important as is the environmental impact of the behavior. Getting a ball into the
basket is an operant behavior. Performing a graceful dismount from the parallel bars is
not. However, any response that is required to produce a desired consequence is an
instrumental response because it is “instrumental” in producing a particular outcome.
Magazine Training and Shaping When children first attempt to toss a basketball
in the basket, they are not very successful. Many attempts end with the ball bouncing
off the backboard or not even landing near the basket. Similarly, a rat placed in a Skinner
box will not press the lever that produces a pellet of food right away. Successful training
of an operant or instrumental response often requires carefully designed training steps
that move the student from the status of a novice to that of an expert. This is clearly
the case with something like championship figure skating that requires hours of daily
practice under the careful supervision of an expert coach. Most parents do not spend
money hiring an expert coach to teach a child basketball. However, even there, the
child moves through a series of training steps that may start with a small ball and a
Fisher-Price basketball set that is much lower than the standard one and is easier to
reach. The training basket is also adjustable so that it can be gradually raised as the
child becomes more proficient.
There are also preliminary steps for establishing lever-press responding in a labora-
tory rat. First, the rat has to learn when food is available in the food cup. This involves
classical conditioning: The sound of the food-delivery device is repeatedly paired with the
release of a food pellet into the cup. The food-delivery device is called the food magazine.
After enough pairings of the sound of the food magazine with food delivery, the sound
elicits a classically conditioned approach response: The animal goes to the food cup and
picks up the pellet. This preliminary phase of conditioning is called magazine training.
After magazine training, the rat is ready to learn the required operant response.
At this point, food is given if the rat does anything remotely related to pressing the
lever. For example, at first the rat may be given a food pellet each time it gets up on
its hind legs anywhere in the experimental chamber. Once the rearing response has
been established, the food pellet may be given only if the rat makes the rearing
response over the response lever. Rearing in other parts of the chamber would no
longer be reinforced. Once rearing over the lever has been established, the food pellet
may be given only if the rat touches and depresses the lever. Such a sequence of
training steps is called response shaping.
As the preceding examples show, the shaping of a new operant response requires
training components or approximations to the final behavior. Whether you are trying
to teach a child to throw a ball into a basket, or a rat to press a response lever, at first
any response that remotely approximates the final performance can be reinforced. Once
the child becomes proficient at throwing the ball into a basket placed at shoulder height,
the height of the basket can be gradually raised. As the shaping process continues, more
and more is required, until the reinforcer is given only if the final target response is
made.
Successful shaping of behavior involves three components. First, you have to clearly
define the final response you want the trainee to perform. Second, you have to clearly
assess the starting level of performance, no matter how far it is from the final response
you are interested in. Third, you have to divide the progression from the starting point
to the final target behavior into appropriate training steps or successive approximations.
The successive approximations make up your training plan. The execution of the train-
ing plan involves two complementary tactics: reinforcement of successive approximations
to the final behavior and withholding reinforcement for earlier response forms.
Although the principles involved in shaping behavior are not difficult to understand,
their application can be tricky. If the shaping steps are too far apart, or you spend too
much time on one particular shaping step, progress may not be satisfactory. Sports coa-
ches, piano teachers, and driver education instructors are all aware of how tricky it can
be to design the most effective training steps or successive approximations. The same
principles of shaping are involved in training a child to put on his or her socks or to
drink from a cup without spilling, but the training in those cases is less formally orga-
nized. (For a study of shaping drug-abstinence behavior in cocaine users, see Preston,
Umbricht, Wong, & Epstein, 2001.)
Shaping and New Behavior Shaping procedures are often used to generate new
behavior, but exactly how new are those responses? Consider, for example, a rat’s lever-
press response. To press the bar, the rat has to approach the bar, stop in front of it, raise
its front paws, and then bring the paws down on the bar with sufficient force to push it
down. All of these response components are things the rat is likely to have done at one
time or another in other situations (while exploring its cage, interacting with another rat,
or handling materials to build a nest). In teaching the rat to press the bar, we are not
teaching new response components. Rather, we are teaching the rat how to combine
familiar responses into a new activity. Instrumental conditioning often involves the con-
struction, or synthesis, of a new behavioral unit from preexisting response components
that already occur in the organism’s repertoire (Balsam et al., 1998).
Instrumental conditioning can also be used to produce responses unlike anything the
trainee ever did before. Consider, for example, throwing a football 60 yards down the field.
It takes more than putting familiar behavioral components together to achieve such a feat.
The force, speed, and coordination involved in throwing a football 60 yards is unlike any-
thing an untrained individual might do. It is an entirely new response. Expert perfor-
mances in sports, in playing a musical instrument, or in ballet all involve such novel
response forms. Such novel responses are also created by shaping (Figure 5.6).
The creation of new responses by shaping depends on the inherent variability of
behavior. If a new shaping step requires a trainee to throw a football 30 yards, each
throw is likely to be somewhat different. The trainee may throw the ball 25, 32, 29, or
34 yards on successive attempts. This variability permits the coach to set the next succes-
sive approximation at 33 yards. With that new target, the trainee will start to make
longer throws. Each throw will again be different, but more of the throws will now be
33 yards and longer. The shift of the distribution to longer throws will permit the
coach to again raise the response criterion, perhaps to 36 yards this time. With gradual
iterations of this process, the trainee will make longer and longer throws, achieving dis-
tances that he or she would have never performed otherwise. The shaping process takes
advantage of the variability of behavior to gradually move the distribution of responses
away from the trainee’s starting point and toward responses that are entirely new in the
trainee’s repertoire. Through this process, spectacular new feats of performance are
learned in sports, dancing, or the visual arts. (For laboratory studies of shaping, see
Deich, Allan, & Zeigler, 1988; and Stokes, Mechner, & Balsam, 1999.)
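The logic of shaping through variability can be captured in a small illustrative simulation. The sketch below assumes Gaussian throw-to-throw variability and a simple rule for raising the criterion as the distribution of throws shifts upward; the parameter values are arbitrary assumptions, not estimates from any study.

```python
import random

# Illustrative shaping simulation: reinforce throws that meet the current
# criterion; raise the criterion as the distribution of throws drifts upward.
random.seed(1)

mean_distance = 30.0      # trainee starts around 30-yard throws (assumed)
SPREAD = 3.0              # throw-to-throw variability in yards (assumed)
criterion = 33.0          # first successive approximation

for trial in range(200):
    throw = random.gauss(mean_distance, SPREAD)
    if throw >= criterion:                 # reinforce the approximation
        mean_distance += 0.2               # assumed effect: distribution shifts up
        criterion = mean_distance + 3.0    # next approximation set slightly above

print(f"mean throw after shaping: {mean_distance:.1f} yards")
```

Because the criterion is kept just above the current distribution, some throws always meet it, and each reinforced throw nudges the whole distribution toward distances the trainee never produced at the start.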
Response Rate as a Measure of Operant Behavior In contrast to discrete-trial
techniques for studying instrumental behavior, free-operant methods permit continuous
observation of behavior over long periods. With continuous opportunity to respond, the
organism, rather than the experimenter, determines the frequency of its instrumental
response. Hence, free-operant techniques provide a special opportunity to observe
changes in the likelihood of behavior over time.
How might we take advantage of this opportunity and measure the probability of an
operant response? Measures of response latency and speed that are commonly used in
discrete-trial procedures do not characterize the likelihood of repetitions of a response.
Skinner proposed that the rate of occurrence of operant behavior (e.g., frequency of the
response per minute) be used as a measure of response probability. Highly likely
responses occur often and have a high rate. In contrast, unlikely responses occur seldom
and have a low rate. Response rate has become the primary measure in studies that
employ free-operant procedures.
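Computing response rate from a session record is straightforward, as the minimal sketch below shows with hypothetical data.

```python
# Minimal sketch: response rate (responses per minute) as Skinner's
# measure of response probability in a free-operant session.
press_times_sec = [5, 9, 14, 30, 31, 47, 80, 95, 110, 118]  # hypothetical data
session_length_min = 2.0

rate_per_min = len(press_times_sec) / session_length_min
print(rate_per_min)   # 5.0 responses per minute
```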
Instrumental Conditioning Procedures
FIGURE 5.6 Shaping is required to learn special skills, such as the pole vault.

In all instrumental conditioning situations, the participant makes a response and thereby
produces an outcome or consequence. Paying the boy next door for mowing the lawn,
yelling at a cat for getting on the kitchen counter, closing a window to prevent the rain
from coming in, and revoking a teenager’s driving privileges for staying out late are all
forms of instrumental conditioning. Two of these examples involve pleasant events (get-
ting paid, driving a car), whereas the other two involve unpleasant stimuli (the sound of
yelling and rain coming in the window). A pleasant event is technically called an appeti-
tive stimulus. An unpleasant stimulus is technically called an aversive stimulus. The
instrumental response may produce the stimulus, as when mowing the lawn results in
getting paid. Alternatively, the instrumental response may turn off or eliminate a stimu-
lus, as in closing a window to stop the incoming rain. Whether the result of a condition-
ing procedure is an increase or a decrease in the rate of responding depends on whether
an appetitive or aversive stimulus is involved and whether the response produces or
eliminates the stimulus. Four basic instrumental conditioning procedures are described
in Table 5.1.
Positive Reinforcement
A father gives his daughter a cookie when she puts her toys away; a teacher praises a
student for handing in a good report; an employee receives a bonus check for performing
well on the job. These are all examples of positive reinforcement. Positive reinforcement
is a procedure in which the instrumental response produces an appetitive stimulus. If the
response occurs, the appetitive stimulus is presented; if the response does not occur, the
appetitive stimulus is not presented. Thus, there is a positive contingency between the
instrumental response and the appetitive stimulus. Positive reinforcement procedures
produce an increase in the rate of responding. Requiring a hungry rat to press a response
lever to obtain a food pellet is a common laboratory example of positive reinforcement.
Punishment
A mother reprimands her child for running into the street; your boss criticizes you for
being late to a meeting; a teacher gives you a failing grade for answering too many test
questions incorrectly. These are examples of punishment (also called positive punish-
ment). In a punishment procedure, the instrumental response produces an unpleasant,
or aversive, stimulus. There is a positive contingency between the instrumental response
and the stimulus outcome (the response produces the outcome), but the outcome is aver-
sive. Effective punishment procedures produce a decrease in the rate of instrumental
responding.
TABLE 5.1 TYPES OF INSTRUMENTAL CONDITIONING PROCEDURES

Positive Reinforcement: positive contingency (the response produces an appetitive stimulus); result is reinforcement, an increase in response rate.
Punishment (Positive Punishment): positive contingency (the response produces an aversive stimulus); result is punishment, a decrease in response rate.
Negative Reinforcement (Escape or Avoidance): negative contingency (the response eliminates or prevents the occurrence of an aversive stimulus); result is reinforcement, an increase in response rate.
Omission Training (DRO) or Negative Punishment: negative contingency (the response eliminates or prevents the occurrence of an appetitive stimulus); result is punishment, a decrease in response rate.
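Because the table reduces to two binary distinctions, it can be encoded directly, as in the illustrative sketch below (the function and its labels are assumptions made for the example).

```python
# Illustrative encoding of Table 5.1: procedure and expected effect follow
# from (stimulus valence) x (response-outcome contingency).

def classify(outcome, contingency):
    """outcome: 'appetitive' | 'aversive'; contingency: 'produces' | 'eliminates'."""
    table = {
        ("appetitive", "produces"):   ("positive reinforcement", "response rate increases"),
        ("aversive",   "produces"):   ("punishment (positive punishment)", "response rate decreases"),
        ("aversive",   "eliminates"): ("negative reinforcement", "response rate increases"),
        ("appetitive", "eliminates"): ("omission training / negative punishment", "response rate decreases"),
    }
    return table[(outcome, contingency)]

print(classify("aversive", "eliminates"))  # opening an umbrella in the rain
```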
Negative Reinforcement
Opening an umbrella to stop the rain from getting you wet, putting on a seatbelt to
silence the chimes in your car, and putting on your sunglasses to shield you from bright
sunlight are examples of negative reinforcement. In all of these cases, the instrumental
response turns off an aversive stimulus. Hence, there is a negative contingency between
the instrumental response and the aversive stimulus. Negative reinforcement procedures
increase instrumental responding. You are more likely to open an umbrella because it
stops you from getting wet when it is raining.
People tend to confuse negative reinforcement and punishment. An aversive stimu-
lus is used in both procedures. However, the relation of the instrumental response to the
aversive stimulus is drastically different. In punishment procedures, the instrumental
response produces the aversive event, whereas in negative reinforcement, the response
terminates the aversive event. This difference in the response–outcome contingency pro-
duces very different results. Instrumental behavior is decreased by punishment and
increased by negative reinforcement.
Omission Training or Negative Punishment
In omission training or negative punishment, the instrumental response results in the
removal of a pleasant or appetitive stimulus (Sanabria, Sitomer, & Killeen, 2006). Omis-
sion training is being used when a child is given a time-out (e.g., Donaldson & Vollmer,
2011) or told to go to his or her room after doing something bad. There is nothing aver-
sive about the child’s room. Rather, by sending the child to the room, the parent is with-
drawing sources of positive reinforcement, such as playing with friends or watching
television. Suspending someone’s driver’s license for drunken driving also constitutes
omission training or negative punishment (withdrawal of the pleasure and privilege
of driving). Omission training or negative punishment involves a negative contingency
between the response and an environmental event (hence the term “negative”) and
results in a decrease in instrumental responding (hence the term “punishment”). Nega-
tive punishment is often preferred over positive punishment as a method of discouraging
human behavior because it does not involve delivering an aversive stimulus.
BOX 5.2
DRO as Treatment for Self-Injurious Behavior and Other Behavior Problems
Self-injurious behavior is a prob-
lematic habit that is evident in some
individuals with developmental dis-
abilities. Bridget was a 50-year-old
woman with profound mental retar-
dation whose self-injurious behavior
was hitting her body and head and
banging her head against furniture,
walls, and floors. Preliminary
assessments indicated that her head
banging was maintained by the
attention she received from others
when she banged her head against
a hard surface. To discourage the
self-injurious behavior, an omission
training procedure, or DRO, was put
into place (Lindberg, Iwata, Kahng, &
DeLeon, 1999). The training proce-
dure was implemented in 15-minute
sessions. During the omission train-
ing phase, Bridget was ignored when
she banged her head against a hard
surface but received attention peri-
odically if she was not head banging.
The attention consisted of the
therapist talking to Bridget for 3–5
seconds and occasionally stroking
her arm or back.
The results of the study are pre-
sented in Figure 5.7. During the first
19 sessions, when Bridget received
attention for her self-injurious
behavior, the rate of head banging
fluctuated around six responses per
minute. The first phase of DRO
training (sessions 20–24) resulted in
a rapid decline in head banging. The
self-injurious behavior returned dur-
ing sessions 25–31, when the baseline
condition was reintroduced. DRO
training was resumed in session 32
and remained in effect for the
remainder of the study. These results show that self-injurious behavior was decreased by the DRO procedure.
The study with Bridget illustrates several general principles. One is that attention is a very powerful reinforcer for human behavior. People do all sorts of things for attention. As with Bridget, even responses that are injurious to the individual can develop if these responses result in attention. Unfortunately, some responses are difficult to ignore, but in attending to them, one may be actually encouraging them. A child misbehaving in a store or restaurant is difficult to ignore, but paying attention to the child will serve to encourage the misbehavior. As with Bridget, the best approach is to ignore the disruptive behavior and pay attention when the child is doing something else. Deliberately reinforcing other behavior is not easy and requires conscious effort on the part of the parent.
No one questions the need for such conscious effort in training complex responses in animals. As Amy Sutherland (2008) pointed out, animal “trainers did not get a sea lion to salute by nagging. Nor did they teach a baboon to flip by carping, nor an elephant to paint by pointing out everything the elephant did wrong.… Progressive animal trainers reward the behavior they want and, equally importantly, ignore the behavior they don’t” (p. 59). In her engaging book, What Shamu Taught Me About Life, Love, and Marriage, Amy Sutherland argues that one can profitably use the same principles to achieve better results with one’s spouse by not nagging them about leaving their dirty socks on the floor but by providing attention and social reinforcement for responses other than the offending habits.

Omission-training procedures are also called differential reinforcement of other behavior (DRO). This term highlights the fact that in omission training, the individual periodically receives the appetitive stimulus provided he or she is engaged in behavior other than the response specified by the procedure. Making the target response results in omission of the reinforcer that would have been delivered had the individual performed some other behavior. Thus, omission training involves the reinforcement of other behavior.
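The DRO contingency used with Bridget can be stated procedurally: a reinforcer is delivered whenever a full interval elapses without the target response, and each target response resets the interval clock. The sketch below is a hypothetical rendering of that rule; the interval length and session values are assumptions, not parameters from the Lindberg et al. (1999) study.

```python
# Illustrative DRO (differential reinforcement of other behavior) logic:
# attention is delivered only if a full interval passes with no target
# response; each target response resets the interval clock.

def run_dro(event_times, session_end, interval=30.0):
    """event_times: times (sec) of the target response, e.g., head banging."""
    reinforcers = []
    clock_start = 0.0
    events = iter(sorted(event_times) + [float("inf")])
    next_event = next(events)
    t = clock_start + interval
    while t <= session_end:
        if next_event < t:           # a target response occurred first: reset the clock
            clock_start = next_event
            next_event = next(events)
        else:                        # interval completed response-free: reinforce
            reinforcers.append(t)
            clock_start = t
        t = clock_start + interval
    return reinforcers

print(run_dro(event_times=[12.0, 100.0], session_end=180.0))
# [42.0, 72.0, 130.0, 160.0] -- attention only after response-free intervals
```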
FIGURE 5.7 Rate of Bridget’s self-injurious behavior during baseline sessions (1–19 and 25–31) and during sessions in which a DRO contingency was in effect (20–24 and 32–72) (based on Lindberg et al., 1999). (Axes: responses per minute of self-injurious behavior versus sessions.)
Fundamental Elements of Instrumental Conditioning
As we will see in the following chapters, analysis of instrumental conditioning involves
numerous factors and variables. However, the essence of instrumental behavior is that it
is controlled by its consequences. Thus, instrumental conditioning fundamentally
involves three elements: the instrumental response, the outcome of the response (the
reinforcer), and the relation or contingency between the response and the outcome. In
the remainder of this chapter, I will describe how each of these elements influences the
course of instrumental conditioning.
The Instrumental Response
The outcome of instrumental conditioning procedures depends in part on the nature of
the response being conditioned. Some responses are more easily modified than others. In
Chapter 10, I will describe how the nature of the response influences the outcome of
negative reinforcement (avoidance) and punishment procedures. This section describes
how the nature of the response determines the results of positive reinforcement
procedures.
Behavioral Variability Versus Stereotypy Thorndike described instrumental
behavior as involving the stamping in of an S–R association, while Skinner wrote
about behavior being strengthened or reinforced. Both of these pioneers emphasized
that reinforcement increases the likelihood that the instrumental response will be
repeated in the future. This emphasis encouraged the belief that instrumental condi-
tioning produces repetitions of the same response—that it produces uniformity or ste-
reotypy in behavior. Stereotypy in responding does develop if that is allowed or
required by the instrumental conditioning procedure (e.g., Schwartz, 1988). However,
that does not mean that instrumental conditioning cannot be used to produce creative
or variable responses.
We are accustomed to thinking about the requirement for reinforcement being an
observable action, such as pressing a lever or hitting a baseball. Interestingly, however,
the criteria for reinforcement can also be defined in terms of more abstract dimensions
of behavior, such as its novelty. The behavior required for reinforcement can be
defined as doing something unlike what the participant did on the preceding four or
five trials. To satisfy this requirement, the participant has to perform differently on
each trial. In such a procedure, response variability is the basis for instrumental
reinforcement.
Numerous experiments with laboratory rats, pigeons, and human participants have
shown that response variability increases if variability is the response dimension required
to earn reinforcement (Neuringer, 2004; Neuringer & Jensen, 2010). In one study, college
students were asked to draw rectangles on a computer screen (Ross & Neuringer, 2002).
They were told they had to draw rectangles to obtain points but were not told what kind
of rectangles they should draw. For one group of participants, a point was dispensed if
the rectangle drawn on a given trial differed from other rectangles the student previously
drew. The new rectangle had to be novel in size, shape, and location on the screen. This
group was designated VAR for the variability requirement. Students in another group
were paired up or yoked to students in group VAR and received a point on each trial
that their partners in group VAR were reinforced. However, the YOKED participants
had no requirements about the size, shape, or location of their rectangles.
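The variability contingency for group VAR amounts to a novelty check against recent trials. The sketch below is a schematic stand-in for the actual experimental program; the lookback window, tolerance, and rectangle encoding are all assumptions.

```python
# Illustrative variability contingency (group VAR): reinforce a rectangle
# only if it differs from recently drawn ones in size, shape, and location.

LOOKBACK = 5        # compare against this many preceding trials (assumed)
TOLERANCE = 0.10    # proportional difference needed to count as novel (assumed)

def differs(a, b):
    """Rectangles as (width, height, x, y); require every dimension to differ."""
    return all(abs(p - q) > TOLERANCE * max(abs(q), 1e-9) for p, q in zip(a, b))

def reinforce(history, new_rect, lookback=LOOKBACK):
    recent = history[-lookback:]
    return all(differs(new_rect, old) for old in recent)

history = [(50, 20, 100, 100), (52, 21, 104, 103)]
print(reinforce(history, (90, 60, 300, 250)))   # True: novel on every dimension
print(reinforce(history, (51, 20, 101, 99)))    # False: too similar to recent ones
```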
The results of the experiment are shown in Figure 5.8. Students in group VAR
showed considerably greater variability in the rectangles they drew than participants
in group YOKED. This shows that response variability can be increased if the instrumental
reinforcement procedure requires variable behavior. Another experiment by Ross and
Neuringer (2002) demonstrated that different aspects of drawing a rectangle (the size,
shape, and location of the rectangle) can be controlled independently of one another by
contingencies of reinforcement. For example, participants who are required to draw rec-
tangles of the same size will learn to do that but will vary the location and shape of the
rectangles they draw. In contrast, participants required to draw the same shape rectangle
will learn to do that while they vary the size and location of their rectangles.
These experiments show that response variability can be increased with instrumental
conditioning. Such experiments have also shown that in the absence of explicit reinforce-
ment of variability, responding becomes more stereotyped with continued instrumental
conditioning (e.g., Page & Neuringer, 1985). Thus, Thorndike and Skinner were partially
correct in saying that responding becomes more stereotyped with continued instrumen-
tal conditioning. However, they were wrong to suggest that this is an inevitable outcome.
FIGURE 5.8 Degree of response variability along three dimensions of drawing a rectangle (size, shape, and location) for human participants who were reinforced for varying the type of rectangles they drew (VARY) or received reinforcement on the same trials but without any requirement to vary the nature of their drawings (YOKED). Higher values of U indicate greater variability in responding (based on Ross & Neuringer, 2002). (Axes: U-value versus response dimension.)
BOX 5.3
Detrimental Effects of Reward: More Myth Than Reality
Reinforcement procedures have
become commonplace in educational
settings as a way to encourage students
to read and do their assignments.
However, some have been concerned
that reinforcement may actually
undermine a child’s intrinsic interest
and willingness to perform a task once
the reinforcement procedure is
removed. Similar concerns have been
expressed about possible detrimental
effects of reinforcement on creativity
and originality. Extensive research on
these questions has produced incon-
sistent results. However, more recent
meta-analyses of the results of
numerous studies indicate that under
most circumstances reinforcement
increases creative responses without
reducing intrinsic motivation (Akin-
Little et al., 2004; Byron & Khazanchi,
2012; Cameron, Banko, & Pierce,
2001). Research with children also
indicates that reinforcement makes
children respond with less originality
only under limited circumstances
(e.g., Eisenberger & Shanock, 2003).
As in experiments with pigeons and
laboratory rats, reinforcement can
increase or decrease response vari-
ability, depending on the criterion
for reinforcement. If highly original
responding is required to obtain
reinforcement, originality increases,
provided that the reinforcer is not
so salient as to distract the participant
from the task. (For a more
general discussion of creativity,
see Stokes, 2006.)
Novel response forms can be readily produced by instrumental conditioning if response
variation is a requirement for reinforcement.
Relevance or Belongingness in Instrumental Conditioning As the preceding
section showed, instrumental conditioning can act on abstract dimensions of behav-
ior, such as its variability. How far do these principles extend? Are there any limita-
tions on the types of new behavioral units or response dimensions that may be
modified by instrumental conditioning? A growing body of evidence indicates that
there are important limitations.
In Chapter 4, I described how classical conditioning occurs at different rates
depending on the combination of conditioned and unconditioned stimuli used. For
example, rats readily learn to associate tastes with sickness, but associations between
tastes and shock are not so easily learned. For conditioning to occur rapidly, the CS has
to belong with the US or be relevant to the US. Analogous belongingness and relevance
relations occur in instrumental conditioning. As Jozefowiez and Staddon (2008) com-
mented, “A behavior cannot be reinforced by a reinforcer if it is not naturally linked to
that reinforcer in the repertoire of the animal” (p. 78).
This type of natural linkage was first observed by Thorndike. In many of his puzzle-
box experiments, the cat had to manipulate a latch or string to escape from the box.
However, Thorndike also tried to get cats to scratch or yawn to be let out of a puzzle
box. The cats could learn to make these responses. However, the form of the responses
changed as training proceeded. At first, the cat would scratch itself vigorously to be let
out of the box. On later trials, it would only make aborted scratching movements. It
might put its hind leg to its body but would not make a true scratch response. Similar
results were obtained in attempts to condition yawning. As training progressed, the ani-
mal would open its mouth, but it would not give a bona fide yawn.
Thorndike used the term belongingness to explain his failures to train scratching
and yawning as instrumental responses. According to this concept, certain responses nat-
urally belong with the reinforcer because of the animal’s evolutionary history. Operating
a latch and pulling a string are manipulatory responses that are naturally related to
release from confinement. By contrast, scratching and yawning characteristically do not
help animals escape from confinement and therefore do not belong with release from a
puzzle box.
The concept of belongingness in instrumental conditioning is nicely illustrated by a
more recent study involving a small fish species, the three-spined stickleback (Gasterosteus
aculeatus). During the mating season each spring, male sticklebacks establish territories
in which they court females but chase away and fight other males. Sevenster (1973)
used the presentation of another male or a female as a reinforcer in instrumental con-
ditioning of male sticklebacks. One group of fish was required to bite a rod to obtain
access to the reinforcer. When the reinforcer was another male, biting behavior
increased; access to another male was an effective reinforcer for the biting response.
By contrast, biting did not increase when it was reinforced with the presentation of a
female fish. However, the presentation of a female was an effective reinforcer for other
responses, such as swimming through a ring. Biting “belongs with” territorial defense
and can be reinforced by the presentation of a potentially rival male. By contrast, biting
does not belong with the presentation of a female, which typically elicits courtship
rather than aggression.
Thorndike’s difficulties in conditioning scratching and yawning did not have much
impact on behavior theory until additional examples of misbehavior were documented
by Breland and Breland (1961). The Brelands set up a business to train animals to per-
form entertaining response chains for displays in amusement parks and zoos. During the
course of this work, they observed dramatic behavior changes that were not consistent
with the reinforcement procedures they were using. For example, they described a rac-
coon that was reinforced for picking up a coin and depositing it in a coin bank:
We started out by reinforcing him for picking up a single coin. Then the metal con-
tainer was introduced, with the requirement that he drop the coin into the container.
Here we ran into the first bit of difficulty: he seemed to have a great deal of trouble
letting go of the coin. He would rub it up against the inside of the container, pull it
back out, and clutch it firmly for several seconds. However, he would finally turn it
loose and receive his food reinforcement. Then the final contingency: we [required]
that he pick up [two] coins and put them in the container.
Now the raccoon really had problems (and so did we). Not only could he not let go
of the coins, but he spent seconds, even minutes, rubbing them together (in a most
miserly fashion), and dipping them into the container. He carried on this behavior to
such an extent that the practical application we had in mind—a display featuring a
raccoon putting money in a piggy bank—simply was not feasible. The rubbing behavior
became worse and worse as time went on, in spite of nonreinforcement (p. 682).
From “The Misbehavior of Organisms,” by K. Breland
and M. Breland, 1961. In American Psychologist, 16, 682.
The Brelands had similar difficulties with other species. Pigs, for example, also could
not learn to put coins in a piggy bank. After initial training, they began rooting the coins
along the ground. The Brelands called the development of such responses instinctive drift.
As the term implies, the extra responses that developed in these food reinforcement situa-
tions were activities the animals instinctively perform when obtaining food. Pigs root along
the ground in connection with feeding, and raccoons rub and dunk food-related objects.
These natural food-related responses were apparently very strong and competed with the
responses required by the training procedures. The Brelands emphasized that such instinc-
tive response tendencies have to be taken into account in the analysis of behavior.
Behavior Systems and Constraints on Instrumental Conditioning The response
limitations on instrumental conditioning described above are consistent with behavior
systems theory. I previously described this theory in Chapter 4, in discussions about the
nature of the conditioned response (see Timberlake, 2001; Timberlake & Lucas, 1989).
According to behavior systems theory, when an animal is food deprived and is in a situ-
ation where it might encounter food, its feeding system becomes activated, and it begins
to engage in foraging and other food-related activities. An instrumental conditioning
procedure is superimposed on this behavior system. The effectiveness of the procedure in
increasing an instrumental response will depend on the compatibility of that response
with the preexisting organization of the feeding system. Furthermore, the nature of
other responses that emerge during the course of training (or instinctive drift) will
depend on the behavioral components of the feeding system that become activated by
the instrumental conditioning procedure.

[Photo: It is difficult to condition raccoons with food reinforcement to drop a coin into a slot.]
According to the behavior systems approach, we should be able to predict which
responses will increase with food reinforcement by studying what animals do when
their feeding system is activated in the absence of instrumental conditioning. This pre-
diction has been confirmed in several ways. For example, in a study of the effects of food
deprivation in hamsters, Shettleworth (1975) found that responses that become more
likely when the animal is hungry are readily reinforced with food, whereas responses
that become less likely when the animal is hungry are difficult to train as instrumental
responses.
Another way to diagnose whether a response is a part of a behavior system is to
perform a classical conditioning experiment. Through classical conditioning, a CS elicits
components of the behavior system activated by the US. If instinctive drift reflects
responses of the behavior system, responses akin to instinctive drift should be evident
in a classical conditioning experiment. Timberlake and his associates (see Timberlake,
1983; Timberlake, Wahl, & King, 1982) confirmed this prediction in studies with rats.
The Instrumental Reinforcer
Several aspects of a reinforcer determine its effects on the learning and performance of
instrumental behavior. I will first consider the direct effects of the quantity and quality of
a reinforcer on instrumental behavior. I will then discuss how responding to a particular
reward amount and type depends on the organism’s past experience with other reinforcers.
Quantity and Quality of the Reinforcer The quantity and quality of a reinforcer
are obvious variables that would be expected to determine the effectiveness of positive
reinforcement. This is certainly true at the extreme. If a reinforcer is very small and of
poor quality, it will not increase instrumental responding. Indeed, studies conducted in
straight alley runways generally show faster running with larger and more palatable rein-
forcers (see Mackintosh, 1974, for a review).
The magnitude of the reinforcer also influences the rate of free-operant responding.
Chad, a 5-year-old boy diagnosed with autism, was a participant in a study of the effects
of amount of reinforcement on free-operant responding (Trosclair-Lasserre et al., 2008).
Preliminary assessment indicated that social attention was an effective reinforcer for
Chad. Attention consisted of praise, tickles, hugs, songs, stories, and interactive games.
If Chad pressed a button long enough to produce an audible click, he received social
attention for 10, 105, or 120 seconds. Chad preferred reinforcers of 120 seconds over
reinforcers of just 10 seconds.
A progressive ratio schedule of reinforcement was used to evaluate the effects of
reinforcer magnitude. I will describe schedules of reinforcement in greater detail in
Chapter 6. For now, it is sufficient to note that in a progressive ratio schedule the partici-
pant has to make increasing numbers of responses to obtain the reinforcer. At the start
of each session, Chad had to make just one button press to get reinforced, but as the
session went on, the number of button presses required for each reinforcer progressively
increased (hence the name progressive ratio schedule). The response requirement was
raised from 1 press to 2, 5, 10, 20, 30, and finally 40 presses per reinforcer.
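A progressive ratio schedule is easy to express procedurally. In the illustrative sketch below, the step values follow the sequence just described; the bookkeeping around them is assumed for the example.

```python
# Illustrative progressive ratio (PR) schedule: each successive reinforcer
# costs more responses than the last. Steps follow the sequence in the text.

PR_STEPS = [1, 2, 5, 10, 20, 30, 40]

def run_session(presses_emitted):
    """Return how many reinforcers a given number of button presses earns."""
    earned, spent = 0, 0
    for requirement in PR_STEPS:
        if spent + requirement > presses_emitted:
            break
        spent += requirement
        earned += 1
    return earned

print(run_session(38))   # 5 reinforcers (1 + 2 + 5 + 10 + 20 = 38 presses)
print(run_session(108))  # 7 reinforcers (all steps completed: 108 presses total)
```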
The results of the experiment are presented in Figure 5.9 in terms of the number of
times Chad obtained each reinforcer as a function of how many times he had to press
the button. As expected, increasing the number of required presses resulted in fewer
reinforcers earned for all three reinforcer magnitudes. Increasing the response require-
ment from 1 to 20 responses produced a rapid drop in the number of reinforcers earned
if the reinforcer was 10 seconds. Less of a drop was evident if the reinforcer was 105
seconds. When the reinforcer was 120 seconds, not much of a decrease was evident
until the response requirement was raised to 30 or 40 button presses for each reinforcer.
Thus, the longer reinforcer was much more effective in maintaining instrumental
responding.
The magnitude of the reinforcer has also been found to be a major factor in voucher
programs for the reinforcement of abstinence in the treatment of substance-use disorder.
Individuals who are addicted to cocaine, methamphetamine, opiates, or other drugs have
been treated successfully in programs based on the principles of instrumental condition-
ing (Higgins, Heil, & Sigmon, 2012). The target response in these programs is abstinence
from drug use as verified by drug tests conducted two or three times per week. Rein-
forcement is provided in the form of vouchers that can be exchanged for money.
A meta-analysis of studies of the success of voucher reinforcement programs indicated
that the magnitude of the reinforcer contributed significantly to abstinence (Lussier
et al., 2006). Studies in which individuals could earn upwards of $10 per day for remain-
ing drug free showed greater success in encouraging abstinence than those in which
smaller payments were earned. Providing reinforcement soon after the evidence of absti-
nence was also important. Getting paid right after the drug test was more effective than
getting paid one or two days later. I will have more to say about the importance of
immediate reinforcement later in this chapter.
Shifts in Reinforcer Quality or Quantity The effectiveness of a reinforcer depends
not only on its quality and quantity but also on what the subject received previously. If a
teenager receives an allowance of $25 per week, a decrease to $10 may be a great disap-
pointment. But if he or she never got used to receiving $25 per week, an allowance of
$10 might seem OK. As this example suggests, the effectiveness of a reinforcer depends
not only on its own properties but also on how that reinforcer compares with others the
individual received in the recent past.
We saw in Chapter 4 that the effectiveness of a US in classical conditioning depends
on how the US compares with the individual’s expectations based on prior experience.
This idea served as the foundation of the Rescorla–Wagner model. If the US is larger
(or more intense) than expected, it will support excitatory conditioning. By contrast, if
it is smaller (or weaker) than expected, the US will support inhibitory conditioning.
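Because the discussion leans on the Rescorla–Wagner model at this point, a minimal numerical sketch of its update rule may be helpful. The rule itself is the standard form of the model described in Chapter 4; the learning rate of 0.3 is an arbitrary illustrative value.

    # Rescorla-Wagner update: associative strength V changes in proportion to
    # the discrepancy between the obtained US and the expected US (V).
    # A positive discrepancy supports excitatory conditioning; a negative
    # discrepancy supports inhibitory conditioning.
    def rescorla_wagner_step(v, us_magnitude, learning_rate=0.3):
        return v + learning_rate * (us_magnitude - v)

    v = 0.0
    for trial in range(5):
        v = rescorla_wagner_step(v, us_magnitude=1.0)
    print(round(v, 3))  # -> 0.832; expectation approaches the US magnitude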
Analogous effects occur in instrumental conditioning. Numerous studies have shown that the effects of a particular amount and type of reinforcer depend on the quantity and quality of the reinforcers the individual experienced previously. Speaking loosely, a large reward is treated as especially good after reinforcement with a small reward, and a small reward is treated as especially poor after reinforcement with a large reward (for a comprehensive review, see Flaherty, 1996). These phenomena are called positive and negative behavioral contrast effects.

[Figure 5.9: Average number of reinforcers earned by Chad per session as the response requirement was increased from 1 to 40, plotted separately for the 10-, 105-, and 120-second reinforcers. (The maximum possible was two reinforcers per session at each response requirement.) Responding was maintained much more effectively in the face of increasing response requirements when the reinforcer was 120 seconds long. From Trosclair-Lasserre et al. (2008), Figure 3, p. 215.]
Behavioral contrast effects were first described by Crespi (1942) and have been
documented in a variety of situations since then. A clear example of negative behavioral
contrast was recently reported by Ortega and colleagues (2011). Laboratory rats were
given a sucrose solution to drink for 5 minutes each day. For one group of rats, the
sucrose solution was always 4% throughout the experiment. For a second group, the
sucrose solution was much more tasty (32%) on the first 10 trials and was then
decreased to 4% for the remaining four trials.
How long the rats spent licking the sucrose solution on each trial is summarized in
Figure 5.10. Notice that during the first 10 trials, the rats spent a bit more time licking
the more tasty 32% sucrose solution than the 4% solution. However, when the 32% solu-
tion was changed to 4%, these rats showed a dramatic decrease in licking time. In fact,
the shifted group licked significantly less of the 4% sucrose on trials 11 and 12 than the
nonshifted group that received 4% sucrose all along. This illustrates the phenomenon of
negative behavioral contrast.
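One informal way to see how contrast effects of this kind could arise is to let responding depend on how the current reward compares with a running expectation built from past rewards. The toy sketch below is an invented illustration of that idea, not a model fitted to the Ortega et al. (2011) data; the comparison rule, the update rate, and all numbers are assumptions.

    # Toy model of negative behavioral contrast: responding reflects the
    # current reward relative to a running expectation of reward.
    def respond(reward, expectation):
        return max(0.0, reward + (reward - expectation))

    expectation = 0.0
    for trial in range(1, 15):
        reward = 32 if trial <= 10 else 4             # the shifted (32-4) group
        print(trial, round(respond(reward, expectation), 1))
        expectation += 0.5 * (reward - expectation)   # update the expectation

After the shift on trial 11, the model's responding falls below the level of a group that received the 4% solution all along (whose expectation would already match the reward), mirroring the contrast effect seen in Figure 5.10.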
Behavioral-contrast effects can occur either because of a shift from a prior reward
magnitude (as in Figure 5.10) or because of an anticipated reward. Behavioral contrast
due to an anticipated large reward may explain a long-standing paradox in the drug-
abuse literature. The paradox arises from two seemingly conflicting findings. The first is
that drugs of abuse, such as cocaine, will support the conditioning of a place preference
in laboratory animals. Rats given cocaine in a distinctive chamber will choose that area
over a place where they did not get cocaine. This suggests that cocaine is reinforcing.
The conflicting finding is that rats given a saccharin solution to drink before receiving
cocaine suppress their saccharin intake. Thus, cocaine can condition a taste aversion
even though it appears to be reinforcing in place preference conditioning. Grigson and
her colleagues have conducted a series of studies that suggest that the saccharin aversion
conditioned by cocaine reflects an anticipatory contrast effect (Grigson et al., 2008).
Because cocaine is so highly reinforcing and occurs after exposure to saccharin, the sac-
charin flavor loses its hedonic value in anticipation of the much greater hedonic value of
cocaine. This type of anticipatory negative contrast may explain why individuals addicted
to cocaine derive little satisfaction from conventional reinforcers (a tasty meal) that
others enjoy on a daily basis.
[Figure 5.10: Time rats spent drinking (in seconds) across trials when the solution was either 32% sucrose or 4% sucrose. Rats in the 32-4 group were shifted to 4% sucrose after trial 10 (based on Ortega et al., 2011).]
The Response–Reinforcer Relation
The hallmark of instrumental behavior is that it produces and is controlled by its conse-
quences. In some cases, there is a strong relation between what a person does and the
consequences of that action. If you put a dollar into a soda machine, you will get a can
of soda. As long as the machine is working, you will get your can of soda every time you
put in the required money. In other cases, there is no relation between behavior and an
outcome. You may wear your lucky hat to a test and get a good grade, but the grade
would not be caused by the hat you were wearing. The relation between behavior and
its consequences can also be probabilistic. For example, you might have to call several
times before you get to talk to your friend on the phone.
Humans and other animals perform a continual stream of responses and encounter
all kinds of environmental events. You are always doing something, even if it is just
sitting around, and things are continually happening in your environment. Some of
the things you do have consequences; others don’t. It makes no sense to work hard to
make the sun rise each morning because that will happen anyway. Instead, you should
devote your energy to fixing breakfast or working for a paycheck: things that do not
happen without your effort. To be efficient, you have to know when you have to do
something to obtain a reinforcer and when the reinforcer is likely to be delivered inde-
pendent of your actions. Efficient instrumental behavior requires sensitivity to the
response–reinforcer relation.
There are actually two types of relationships between a response and a reinforcer
(Williams, 2001). One is the temporal relation. The temporal relation refers to the
time between the response and the reinforcer. A special case of the temporal relation
is temporal contiguity. Temporal contiguity refers to the delivery of the reinforcer
immediately after the response. The second type of relation between a response and the
reinforcer is the causal relation or response–reinforcer contingency. The response–
reinforcer contingency refers to the extent to which the instrumental response is necessary
and sufficient to produce the reinforcer.
Temporal and causal factors are independent of each other. A strong temporal rela-
tion does not require a strong causal relation, and vice versa. For example, there is a strong causal relation between taking your clothes to the cleaners and getting clean clothes back.
However, the temporal delay may be a day or two.
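The response–reinforcer contingency can be quantified as the difference between two conditional probabilities, often written as delta P. This measure is not developed further in this chapter, so the sketch below is a supplementary illustration.

    # Contingency as delta-P:
    #   P(reinforcer | response) - P(reinforcer | no response)
    # +1.0 means the response is necessary and sufficient for the reinforcer;
    # 0.0 means the reinforcer is delivered independently of behavior.
    def delta_p(p_given_response, p_given_no_response):
        return p_given_response - p_given_no_response

    print(delta_p(1.0, 0.0))  # perfect contingency (the soda machine)
    print(delta_p(0.2, 0.2))  # zero contingency (the lucky hat)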
Effects of the Temporal Relation Since the early work of Grice (1948), learning psychologists have emphasized that instrumental conditioning requires providing the reinforcer immediately after the occurrence of the instrumental response. Grice reported that instrumental learning can be disrupted by delays as short as 0.5 seconds.
More recent research has indicated that instrumental conditioning is possible with delays
as long as 30 seconds (Critchfield & Lattal, 1993; Okouchi, 2009). However, the fact
remains that immediate reinforcement is much more effective.
The effects of delayed reinforcement on learning to press a response lever in laboratory rats are shown in Figure 5.11 (Dickinson, Watt, & Griffiths, 1992). Each time the rats
pressed the lever, a food pellet was set up to be delivered after a fixed delay. For some
rats, the delay was short (2–4 seconds). For others, the delay was considerable (64 sec-
onds). If the rat pressed the lever again during the delay interval, the new response
resulted in another food pellet after the specified delay. (In other studies, such extra
responses were programmed to reset the delay interval.) Figure 5.11 shows response
rates as a function of the mean delay of reinforcement experienced by each group.
Responding decreased fairly rapidly with increases in the delay of reinforcement. No
learning was evident with a 64-second delay of reinforcement in this experiment.
Why is instrumental conditioning so sensitive to a delay of reinforcement? A major
culprit is the credit-assignment problem. With delayed reinforcement, it is difficult to
figure out which response deserves the credit for the delivery of the reinforcer. As I
pointed out earlier, behavior is an ongoing, continual stream of activities. When rein-
forcement is delayed after performance of a specified response, R1, the participant does
not stop doing things. After performing R1, the participant may perform R2, R3, R4, and
so on. If the reinforcer is set up by R1 but not delivered until sometime later, the rein-
forcer may occur immediately after some other response, let’s say R6. To associate R1
with the reinforcer, the participant has to have some way to distinguish R1 from the
other responses it performs during the delay interval.
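A small sketch makes the credit-assignment problem concrete. The stream of responses, the earning response, and the length of the delay below are all invented for illustration.

    # With delayed reinforcement, behavior continues during the delay, so the
    # reinforcer arrives right after some later response rather than after the
    # response that actually produced it.
    behavior_stream = ["R1", "R2", "R3", "R4", "R5", "R6", "R7"]
    earning_response = "R1"       # R1 sets up the reinforcer
    delay_steps = 5               # reinforcer is delivered 5 responses later

    delivered_after = behavior_stream[behavior_stream.index(earning_response) + delay_steps]
    print(delivered_after)  # -> "R6": temporal contiguity points at the wrong response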
There are a couple of ways to overcome the credit-assignment problem. The first
technique, used by animal trainers and coaches for centuries, is to provide a secondary
or conditioned reinforcer immediately after the instrumental response, even if the pri-
mary reinforcer cannot occur until sometime later (Cronin, 1980; Winter & Perkins,
1982). A secondary, or conditioned, reinforcer is a conditioned stimulus that was pre-
viously associated with the reinforcer. Verbal prompts in coaching, such as “good,” “keep
going,” and “that’s the way” are conditioned reinforcers that can provide immediate rein-
forcement for appropriate behavior. In the clicker-training methodology for animal
training, the sound of a clicker is first paired with the delivery of food to make the
clicker an effective conditioned reinforcer. The clicker then can be delivered immediately
after a desired response even if the primary food reinforcer is delayed. Effective coaches
and animal trainers are constantly providing immediate verbal feedback as conditioned
reinforcers.
Another technique that facilitates learning with delayed reinforcement is to mark the
target instrumental response in some way to make it distinguishable from the other
activities of the organism. Marking can be accomplished by introducing a brief light or
noise after the target response or by picking up the animal and moving it to a holding
box for the delay interval. The effectiveness of a marking procedure was first demon-
strated by David Lieberman and his colleagues (Lieberman, McIntosh, & Thomas,
1979) and has since been replicated in various other studies (e.g., Thomas & Lieberman,
1990; Urcuioli & Kasprow, 1988).
In an interesting variation of the marking procedure, Williams (1999) compared the
learning of a lever-press response in three groups of rats. For each group, the food rein-
forcer was delayed 30 seconds after a press of the response lever. (Any additional lever
presses during the delay interval were ignored.) The no-signal group received this proce-
dure without a marking stimulus. For the marking group, a light was presented for
5 seconds right after each lever press. For a third group of subjects (called the blocking group), the 5-second light was presented at the end of the delay interval, just before food delivery.

[Figure 5.11: Effects of delay of reinforcement on acquisition of lever pressing in rats; lever presses per minute (y) as a function of the experienced delay of reinforcement in seconds (x). Based on "Free-Operant Acquisition with Delayed Reinforcement," by A. Dickinson, A. Watt, and W. J. H. Griffiths, 1992, The Quarterly Journal of Experimental Psychology, 45B, pp. 241-258.]
Results of the experiment are shown in Figure 5.12. Rats in the no-signal group
showed little responding during the first three blocks of two trials and only achieved
modest levels of lever pressing after that. In contrast, the marking group showed much
more robust learning. Clearly, introducing a brief light to mark each lever-press response
substantially facilitated learning with the 30-second delay of reinforcement. However,
placing the light at the end of the delay interval, just before food, had the opposite effect.
Rats in the blocking group never learned the lever-press response. For those animals, the
light became associated with the food, and this blocked the conditioning of the instru-
mental response. (For a further discussion of delay of reinforcement, see Lattal, 2010.)
The Response–Reinforcer Contingency As I noted earlier, the response–reinforcer
contingency refers to the extent to which the delivery of the reinforcer depends on the
prior occurrence of the instrumental response. In studies of delayed reinforcement, there
is a perfect causal relation between the response and the reinforcer, but learning is dis-
rupted. This shows that a perfect causal relation between the response and the reinforcer
is not sufficient to produce vigorous instrumental responding. Even with a perfect causal
relation, conditioning does not occur if reinforcement is delayed too long. Such data
encouraged early investigators to conclude that response–reinforcer contiguity, rather
than contingency, was the critical factor producing instrumental learning. However, this
view has turned out to be incorrect. The response–reinforcer contingency is also important.

[Figure 5.12: Acquisition of lever pressing in rats with a 30-second delay of reinforcement, plotted as reinforcers per hour across blocks of two sessions for the no-signal, marking, and blocking groups. For the marking group, a light was presented for 5 seconds at the beginning of the delay interval, right after the instrumental response. For the blocking group, the light was introduced at the end of the delay interval, just before the delivery of food (based on Williams, 1999).]
Skinner’s Superstition Experiment The role of contiguity versus contingency in
instrumental learning became a major issue with Skinner’s superstition experiment
(Skinner, 1948). Skinner placed pigeons in separate experimental chambers and set the
equipment to deliver a bit of food every 15 seconds irrespective of what the pigeons
were doing. The birds were not required to peck a key or perform any other response
to get the food. After some time, Skinner returned to see what his birds were doing. He
described some of what he saw as follows:
In six out of eight cases the resulting responses were so clearly defined that two
observers could agree perfectly in counting instances. One bird was conditioned
to turn counterclockwise about the cage, making two or three turns between rein-
forcements. Another repeatedly thrust its head into one of the upper corners of
the cage. A third developed a “tossing” response, as if placing its head beneath an
invisible bar and lifting it repeatedly. (p. 168)
The pigeons appeared to be responding as if their behavior controlled the delivery of
the reinforcer when, in fact, the food was provided irrespective of what the pigeons were
doing. Accordingly, Skinner called this superstitious behavior.
Skinner’s explanation of superstitious behavior rests on the idea of accidental, or
adventitious, reinforcement. Adventitious reinforcement refers to the accidental pairing
of a response with delivery of the reinforcer. Animals are always doing something, even
if no particular responses are required to obtain food. Skinner suggested that whatever
response a pigeon happened to make just before it got free food became strengthened
and subsequently increased in frequency because of adventitious reinforcement. One
accidental pairing of a response with food increased the chance that the same response
would occur just before the next delivery of the food. As this process was repeated, the
response came to be performed often enough to be identified as superstitious behavior.
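Skinner's proposed mechanism lends itself to a simple simulation. In the sketch below, food arrives on a fixed schedule regardless of behavior, and whatever response happens to precede food is strengthened; the response names, starting weights, and increment are arbitrary illustrative assumptions.

    import random

    # Adventitious reinforcement: responses are emitted in proportion to their
    # current strength, and the response that happens to precede each free food
    # delivery is strengthened. Chance pairings can snowball into one dominant,
    # "superstitious" response.
    weights = {"turn": 1.0, "head_thrust": 1.0, "toss": 1.0}

    for food_delivery in range(100):
        responses = list(weights)
        emitted = random.choices(responses, weights=[weights[r] for r in responses])[0]
        weights[emitted] += 0.5   # accidental response-food pairing

    print(max(weights, key=weights.get))  # one response typically comes to dominate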
Skinner’s interpretation was appealing and consistent with views of reinforcement
that were widely held at the time. Impressed by studies of delay of reinforcement, beha-
viorists thought that temporal contiguity was the main factor responsible for learning.
Skinner’s experiment appeared to support this view and suggested that a positive
response–reinforcer contingency is not necessary for instrumental conditioning.
Reinterpretation of the Superstition Experiment Skinner’s bold claim that tempo-
ral contiguity, rather than response-reinforcer contingency, is most important for instru-
mental conditioning was challenged by subsequent empirical evidence. In a landmark
study, Staddon and Simmelhag (1971) repeated Skinner’s experiment. However, Staddon
and Simmelhag made more extensive and systematic observations. They defined a variety
of responses, such as orienting to the food hopper, pecking the response key, wing flap-
ping, turning in quarter circles, and preening. They then recorded the frequency of each
response according to when it occurred during the interval between successive free deliv-
eries of food.
Figure 5.13 shows the data obtained by Staddon and Simmelhag for several
responses for one pigeon. Clearly, some of the responses occurred predominantly toward
the end of the interval between successive reinforcers. For example, R1 and R7 (orienting
to the food magazine and pecking at something on the magazine wall) occurred more
often at the end of the food–food interval than at other times. Staddon and Simmelhag
called these terminal responses. Other activities increased in frequency after the delivery
of food and then decreased as the time for the next food delivery drew closer. The
pigeons were most likely to engage in R8 and R4 (moving along the magazine wall and
making a quarter turn) somewhere near the middle of the interval between food deliver-
ies. These activities were called interim responses.
Which actions were terminal responses and which were interim responses did not
vary much from one pigeon to another. Furthermore, Staddon and Simmelhag failed to
find evidence for accidental reinforcement effects. Responses did not always increase in
frequency merely because they occurred coincidentally with food delivery. Food delivery
appeared to influence only the strength of terminal responses, even in the initial phases
of training.
Subsequent research has provided much additional evidence that presentations of a
reinforcer at fixed intervals produce behavioral regularities, with certain responses predo-
minating late in the interval between successive food presentations and other responses
predominating earlier in the food–food interval (Anderson & Shettleworth, 1977;
Innis, Simmelhag-Grant, & Staddon, 1983; Silva & Timberlake, 1998). It is not clear
why Skinner failed to observe such regularities in his experiment. One possibility is that
he focused on different aspects of the behavior of different birds in an effort to document
that each bird responded in a unique fashion. For example, he may have focused on
the terminal response of one bird and interim responses in other birds. Subsequent
investigators have also noted some variations in behavior between individuals but
have emphasized the more striking similarities among individuals in their interim and
terminal responses.
Explanation of the Periodicity of Interim and Terminal Responses What is
responsible for the development of similar terminal and interim responses in animals
exposed to the same schedule of response-independent food presentations? Staddon and
Simmelhag (1971) suggested that terminal responses are species-typical responses that reflect the anticipation of food as time draws closer to the next food presentation. By contrast, they viewed interim responses as reflecting other sources of motivation that are prominent early in the interfood interval, when food presentation is unlikely.

[Figure 5.13: Probability of several responses as a function of time between successive deliveries of a food reinforcer. R1 (orienting toward the food magazine wall) and R7 (pecking at something on the magazine wall) are terminal responses, having their highest probabilities at the end of the interval between food deliveries. R3 (pecking at something on the floor), R4 (a quarter turn), and R8 (moving along the magazine wall) are interim responses, having their highest probabilities somewhere near the middle of the interval between food deliveries. From "The 'Superstition' Experiment: A Reexamination of Its Implications for the Principles of Adaptive Behavior," by J. E. R. Staddon and V. L. Simmelhag, 1971, Psychological Review, 78, pp. 3-43.]
Subsequent investigators have expanded on Staddon and Simmelhag’s ideas in the
more comprehensive framework of behavior systems theory (Timberlake & Lucas, 1985;
Silva & Timberlake, 1998). The critical idea is that periodic deliveries of food activate the
feeding system and its preorganized species-typical foraging and feeding responses. Dif-
ferent behaviors occur depending on when food was last delivered and when food is
going to occur again. Just after the delivery of food, the organism is assumed to display
post-food focal search responses that involve activities near the food cup. In the middle of
the interval between food deliveries (when the subjects are least likely to get food), gen-
eral search responses are evident that take the animal away from the food cup. As the
time for the next food delivery approaches, the subject exhibits focal search responses
that are again concentrated near the food cup. In Figure 5.13, the terminal responses,
R1 and R7, were distributed in time in the manner expected of focal search behavior,
and R4 and R8 were distributed in the manner expected of general search responses.
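The timing logic of the behavior systems account can be summarized in a short sketch. The three modes follow the description above; the phase boundaries (expressed as fractions of the interfood interval) are illustrative assumptions, not values taken from the studies cited.

    # Behavior systems view: the dominant response mode depends on where the
    # animal is within the interval between food deliveries.
    def search_mode(t, interfood_interval):
        phase = t / interfood_interval
        if phase < 0.2:
            return "post-food focal search"   # near the food cup, just after food
        elif phase < 0.7:
            return "general search"           # away from the cup (interim responses)
        else:
            return "focal search"             # back near the cup (terminal responses)

    print([search_mode(t, 12) for t in (1, 6, 11)])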
Effects of the Controllability of Reinforcers A strong contingency between an
instrumental response and a reinforcer essentially means that the response controls the
reinforcer. With a strong contingency, whether the reinforcer occurs depends on whether
the instrumental response has occurred. Studies of the effects of control over reinforcers
have provided the most extensive body of evidence on the sensitivity of behavior to
response–reinforcer contingencies. Some of these studies have involved positive rein-
forcement (e.g., Job, 2002). However, most of the research has focused on the effects of
control over aversive stimulation.
Contemporary research on this problem originated with the pioneering studies of
Overmier and Seligman (1967) and Seligman and Maier (1967), who investigated the
effects of exposure to uncontrollable shock on subsequent escape-avoidance learning in
dogs. The major finding was that exposure to uncontrollable shock disrupted subsequent
learning. This phenomenon has been called the learned-helplessness effect.
The learned-helplessness effect continues to be the focus of a great deal of research
(LoLordo & Overmier, 2011), but dogs are no longer used in the experiments. Instead,
most of the research is conducted with laboratory rats, mice, and human participants.
The research requires exposing animals to stressful events, and some may find the
research objectionable because of that. However, this line of work has turned out to be
highly informative about the mechanisms of stress and coping at the behavioral, hor-
monal, and neurophysiological levels.
The Triadic Design Learned-helplessness experiments are usually conducted using
the triadic design presented in Table 5.2. The design involves two phases: an exposure
phase and a conditioning phase. During the exposure phase, one group of rats (E, for
escape) is exposed to periodic shocks that can be terminated by performing an escape response (e.g., rotating a small wheel or tumbler). Each subject in the second group (Y, for yoked) is assigned a partner in Group E and receives the same duration and distribution of shocks as its Group E partner. However, animals in Group Y cannot turn off the shocks. For them, the shocks are inescapable. The third group (R, for restricted) receives no shocks during the exposure phase but is restricted to the apparatus for as long as groups E and Y. During the conditioning phase, all three groups receive escape-avoidance training. This is usually conducted in a shuttle apparatus that has two adjacent compartments (see Figure 10.3). The animals have to go back and forth between the two compartments to avoid shock (or escape any shocks that they failed to avoid).

TABLE 5.2  THE TRIADIC DESIGN USED IN STUDIES OF THE LEARNED-HELPLESSNESS EFFECT

    Group     Exposure phase            Conditioning phase          Result
    Group E   Escapable shock           Escape-avoidance training   Rapid avoidance learning
    Group Y   Yoked inescapable shock   Escape-avoidance training   Slow avoidance learning
    Group R   Restricted to apparatus   Escape-avoidance training   Rapid avoidance learning

(© Cengage Learning)
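The yoking logic that equates shock exposure across Groups E and Y can be sketched directly. The escape latencies below are invented for illustration.

    # In the triadic design, each Group Y subject receives exactly the shocks
    # its Group E partner produced, so the two groups differ only in control.
    def run_group_e(escape_latencies):
        # For Group E, each shock lasts until the escape response is made.
        return list(escape_latencies)       # shock duration = escape latency

    def run_group_y(partner_shock_record):
        # Group Y replays the partner's shocks; its own responses change nothing.
        return list(partner_shock_record)   # identical exposure, zero contingency

    group_e_shocks = run_group_e([3.0, 2.5, 1.8, 1.2])   # seconds, hypothetical
    group_y_shocks = run_group_y(group_e_shocks)
    assert group_e_shocks == group_y_shocks  # equal exposure; only control differs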
The remarkable finding in experiments on the learned-helplessness effect is that the
impact of aversive stimulation during the exposure phase depends on whether or not the
shock is escapable. Exposure to uncontrollable shock (Group Y) produces a severe dis-
ruption in subsequent escape-avoidance learning. In the conditioning phase of the exper-
iment, Group Y typically shows much poorer escape-avoidance performance than both
Group E and Group R. By contrast, little or no deleterious effects are observed after
exposure to escapable shock. Group E often learns the subsequent escape-avoidance
task as rapidly as Group R, which received no shock during the exposure phase. Similar
detrimental effects of exposure to yoked inescapable shock have been reported on subse-
quent responding for food reinforcement (e.g., Rosellini & DeCola, 1981).
The fact that Group Y shows a deficit in subsequent learning in comparison to
Group E indicates that the animals are sensitive to the procedural differences between
escapable and yoked inescapable shocks. The primary difference between Groups E and
Y is the presence of a response–reinforcer contingency for Group E but not for Group Y
during the exposure phase. Therefore, the difference in the rate of learning between these
two groups shows that the animals are sensitive to the response–reinforcer contingency.
The Learned-Helplessness Hypothesis The learned-helplessness hypothesis was
the first major explanation of the results of studies employing the triadic design (Maier &
Seligman, 1976; Maier, Seligman, & Solomon, 1969). The learned-helplessness hypoth-
esis assumes that during exposure to uncontrollable shocks, animals learn that the
shocks are independent of their behavior—that there is nothing they can do to control
the shocks. Furthermore, they come to expect that reinforcers will continue to be inde-
pendent of their behavior in the future. This expectation of future lack of control
undermines their ability to learn a new instrumental response. The learning deficit
occurs for two reasons. First, the expectation of lack of control reduces the motivation
to perform an instrumental response. Second, even if they make the response and get
reinforced in the conditioning phase, the previously learned expectation of lack of con-
trol makes it more difficult for the subjects to learn that their behavior is now effective
in producing reinforcement.
It is important to distinguish the learned-helplessness hypothesis from the learned-
helplessness effect. The effect is the pattern of results obtained with the triadic design
(disruption of instrumental conditioning caused by prior exposure to inescapable
shock). The learned-helplessness effect has been replicated in numerous studies and is a
firmly established finding. By contrast, the learned-helplessness hypothesis is an explana-
tion or interpretation of the effect, which has been provocative and controversial since its
introduction (LoLordo & Overmier, 2011).
Alternatives to the Helplessness Hypothesis The activity deficit hypothesis. Accord-
ing to the activity deficit hypothesis, animals in Group Y show a learning deficit following
exposure to inescapable shock because inescapable shocks encourage animals to become
inactive or freeze. As we discussed in Chapter 3, freezing is a common response to fear.
The activity deficit hypothesis received some empirical support early in the history
of research on the learned-helplessness effect. However, it cannot explain instances in
which exposure to inescapable shock disrupts choice learning. For example, Jackson,
Alexander, and Maier (1980) found that following exposure to inescapable shock, rats
were deficient in learning an escape response that consisted of selecting the correct arm
of a Y maze. Failure to learn this choice response was not due to lack of activity but to
choice of the incorrect arm of the maze.
The attention deficit hypothesis. According to the attention deficit hypothesis, expo-
sure to inescapable shock reduces the extent to which animals pay attention to their own
behavior, and that is why these animals show a learning deficit. The attention deficit
hypothesis has been considerably more successful as an alternative to the learned help-
lessness hypothesis than the activity deficit hypothesis. For example, manipulations that
increase attention to response-generated cues have been found to reduce the deleterious
effects of prior exposure to inescapable shock (Maier, Jackson, & Tomie, 1987).
Stimulus relations in escape conditioning. In another line of research that has chal-
lenged the helplessness hypothesis, investigators reframed the basic issue addressed by
the triadic design. Instead of focusing on why inescapable shock disrupts subsequent
learning, they asked why exposure to escapable shock is not nearly as bad (Minor, Dess, &
Overmier, 1991). What is it about the ability to make an escape response that makes
exposure to shock less debilitating? This question has stimulated a closer look at stimulus
relations in escape conditioning.
The defining feature of escape behavior is that the instrumental response results in
the termination of an aversive stimulus. Termination of shock is an external stimulus
event. However, the act of performing any skeletal response also provides internal sen-
sory feedback cues. For example, you can feel that you are raising your hand even if your
eyes are closed. Because of these response-produced internal cues, you don’t have to see
your arm go up to know that you are raising your arm.
BOX 5.4
Helplessness, Depression, and Post-traumatic Stress Disorder
The fact that a history of lack of
control over reinforcers can severely
disrupt subsequent instrumental per-
formance has important implications
for human behavior. The concept of
helplessness has been extended and
elaborated to a variety of areas of
human concern, including aging,
athletic performance, chronic pain,
academic achievement, susceptibility
to heart attacks, and victimization
and bereavement (e.g., Overmier,
2002; Peterson, Maier, & Seligman,
1993). Perhaps the most prominent
area to which the concept of
helplessness has been applied is
depression (Abramson, Metalsky, &
Alloy, 1989; Pryce et al., 2011).
Animal research on uncontrolla-
bility and unpredictability of aversive
stimuli is also becoming important
for the understanding of human post-
traumatic stress disorder or PTSD
(Foa, Zinbarg, & Rothbaum, 1992).
Victims of assault or combat stress
have symptoms that correspond
to the effects of chronic uncontrolla-
ble and unpredictable shock in
animals. For example, exposure to
inescapable shock greatly facilitates
the subsequent acquisition of fear and
makes it more difficult to extinguish
the conditioned fear (e.g., Baratta et al.,
2007). Enhanced fear reactivity is one
of the major symptoms of PTSD.
Recognition of these similarities
between animal models and human
symptoms of PTSD promises to pro-
vide new insights into the origin and
treatment of PTSD. Animal models of
helplessness have also contributed to
the understanding of the long-term
effects of sexual abuse and revictimi-
zation (Bargai, Ben-Shakhar, & Shalev,
2007; Marx, Heidt, & Gold, 2005).
Making an escape response such as pressing a lever similarly results in internal sen-
sations or response feedback cues. These are illustrated in Figure 5.14. Some of the
response-produced stimuli are experienced at the start of the escape response, just before
the shock is turned off. These are called shock-cessation feedback cues. Other response-
produced stimuli are experienced as the animal completes the response, just after the
shock has been turned off at the start of the intertrial interval. These are called safety-
signal feedback cues.
At first, investigations of stimulus factors involved with escapable shock centered on
the possible significance of safety-signal feedback cues. Safety-signal feedback cues are
reliably followed by the intertrial interval and hence by the absence of shock. Therefore,
such feedback cues can become conditioned inhibitors of fear and limit or inhibit fear
elicited by contextual cues of the experimental chamber. (For a discussion of conditioned
inhibition, see Chapter 3.) No such safety signals exist for animals given yoked, inescap-
able shock because for them, shocks and shock-free periods are not predictable. There-
fore, contextual cues of the chamber in which shocks are delivered are more likely to
become conditioned to elicit fear with inescapable shock.
These considerations have encouraged analyzing the triadic design in terms of group
differences in signals for safety rather than in terms of differences in whether shock is
escapable or not. In an experiment conducted by Jackson and Minor (1988), for exam-
ple, one group of rats received the usual inescapable shocks in the exposure phase of the
triadic design. However, at the end of each shock presentation, the houselights were
turned off for 5 seconds as a safety signal. The introduction of this safety signal entirely
eliminated the disruptive effects of shock exposure on subsequent shuttle-escape learn-
ing. Another study (Minor, Trauner, Lee, & Dess, 1990) also employed inescapable
shocks, but this time an audiovisual cue was introduced during the last 3 seconds of
each shock presentation. This was intended to mimic shock cessation cues. The intro-
duction of these shock cessation cues also largely eliminated the helplessness effect (see
also Christianson et al., 2008).
The aforementioned studies indicate that significant differences in how animals cope
with aversive stimulation can result from differences in the ability to predict when
shocks will end and when a safe intertrial interval without shocks will begin. Learning
to predict shock termination and shock absence can be just as important as being able
to escape from shock. This is good news. We encounter many aversive events in life
that we cannot control (e.g., the rising price of gas or a new demanding boss). Fortu-
nately, controlling a stressful event need not be our only coping strategy. Learning to
predict when we will encounter the stressful event (and when we will not encounter it) can be just as effective in reducing the harmful effects of stress.

[Figure 5.14: Stimulus relations in an escape-conditioning trial. Shock-cessation feedback cues are experienced at the start of the escape response, just before the termination of shock. Safety-signal feedback cues are experienced just after the termination of shock, at the start of the intertrial interval. © Cengage Learning.]
Contiguity and Contingency: Concluding Comments As we have seen, organisms
are sensitive to the contiguity as well as the contingency between an instrumental
response and a reinforcer. Typically, these two aspects of the relation between response
and reinforcer act jointly to produce learning (Williams, 2001). Both factors serve to
focus the effects of reinforcement on the instrumental response. The causal relation, or
contingency, ensures that the reinforcer is delivered only after occurrence of the specified
instrumental response. The contiguity relation ensures that other activities do not
intrude between the specified response and the reinforcer to interfere with conditioning
of the target response.
BOX 5.5
Learned Helplessness: Role of the Prefrontal Cortex and Dorsal Raphe
We have seen that aversive
stimulation can impact the organism
in different ways, depending upon
whether the stimulation is given in
a controllable or uncontrollable
manner. In general, exposure to
uncontrollable stimulation induces a
constellation of behavioral effects that
undermine the subject’s ability to
cope, a behavioral phenomenon
known as learned helplessness (Maier
& Seligman, 1976). Importantly, these
adverse effects are not observed in
individuals who can control (e.g.,
terminate) the aversive stimulus.
Moreover, exposure to controllable
stimulation has a lasting protective
effect that blocks the induction of
helplessness if the organism later
encounters uncontrollable
stimulation. Behavioral control and
helplessness are of clinical interest
because these phenomena can
contribute to conditions such
as PTSD and depression (Forgeard
et al., 2011; Hammack, Cooper, &
Lezak, 2012).
At a neural level, research suggests
that the long-term consequences of
uncontrollable stimulation depend on
a region of the midbrain known as the
dorsal raphe nucleus (DRN). The
DRN lies just ventral to another key
region, the periaqueductal gray
(PAG), discussed in Box 4.3. Like the
PAG, the DRN can regulate neural
activity in other regions of the central
nervous system (Figure 5.15). It does
so through neurons that release the
neurotransmitter serotonin (5-HT).
Because these 5-HT neurons project
to regions implicated in stress and
fear (Graeff, Viana, & Mora, 1997),
Maier and his colleagues hypothe-
sized that the DRN is involved in
helplessness (Maier & Watkins, 1998,
2005). Supporting this, they showed
that exposure to uncontrollable shock
activates 5-HT neurons within the
DRN and has a sensitizing effect,
enhancing the amount of 5-HT
released at distant sites. Importantly,
these effects are not observed after an
equivalent exposure to controllable
stimulation. They further showed that
pharmacologically inhibiting the
DRN during uncontrollable stimula-
tion blocked the induction of learned
helplessness. Conversely, pharmaco-
logically activating the DRN had a
behavioral effect similar to that
observed after uncontrollable stimu-
lation. Together, these observations
suggest that the activation of the DRN
is both necessary and sufficient to
produce learned helplessness.
Maier and Watkins (2010)
hypothesized that behavioral control
regulates DRN activity through neu-
rons that project from the prelimbic
and infralimbic regions of the ventral
medial prefrontal cortex (vmPFC)
(Figure 5.15). These excitatory neu-
rons engage GABAergic interneurons
within the DRN that have an inhibi-
tory effect. As a result, pharmacolog-
ical activation of the vmPFC inhibits
5-HT neural activity within the DRN
and blocks the development of
learned helplessness. If behavioral
control inhibits the DRN through the
vmPFC, then pharmacologically dis-
rupting the vmPFC function should
eliminate the protective effect of
instrumental control. Minus the
dampening action of the vmPFC
input, controllable aversive stimula-
tion should engage 5-HT neurons
within the DRN and paradoxically
produce a helplessness-like effect.
Research has shown that disrupting
vmPFC function has exactly this
effect.
One of the most interesting and
clinically relevant features of behav-
ioral control is that it has a lasting
effect that behaviorally immunizes
the organism from becoming helpless
when later exposed to uncontrollable
stimulation. Here too, the vmPFC
plays a key role. This was shown by
first exposing animals to controllable
shock. A week later, the animals were
exposed to uncontrollable shock,
which alone induced helplessness.
As expected, prior exposure to con-
trollable stimulation blocked the
induction of helplessness. Most
importantly, this protective effect was
not observed when the vmPFC was
inhibited during uncontrollable stim-
ulation. Minus the vmPFC, uncon-
trollable shock induced helplessness
in rats that had previously received
controllable stimulation.
If the vmPFC is critical, then
pharmacologically activating it
should substitute for behavioral
control and transform how an
uncontrollable aversive input affects
the nervous system. We already
know that activating the vmPFC will
prevent uncontrollable shock from
inducing helplessness. The stronger
claim is that activating the vmPFC
will induce a long-term effect anal-
ogous to that produced by instru-
mental control. As predicted,
combining vmPFC activation with
uncontrollable shock yielded an
effect equivalent to controllable
stimulation, engaging a protective
effect that blocked the subsequent
induction of helplessness.
Controllable and uncontrollable
stimulation also have divergent
effects on Pavlovian fear condition-
ing. Uncontrollable stimulation
generally enhances conditioning,
whereas controllable stimulation has
an inhibitory effect. As we saw in
Box 4.3, fear conditioning depends
on neurons within the amygdala.
Input from the CS and US appear to
be associated within the basolateral
region of the amygdala, while the
performance of the CR is orches-
trated by the central nucleus. Evi-
dence suggests that uncontrollable
stimulation enhances fear-related
CRs through 5-HT neurons that
project to the basolateral amygdala
(Amat, Matus-Amat, Watkins, &
Maier, 1998). The calming effect of
controllable stimulation has been
linked again to the vmPFC and,
more specifically, to the infralimbic
region which sends a projection to
the intercalated cell region of the
amygdala (Maier & Watkins, 2010).
This portion of the amygdala is
composed of inhibitory (GABAer-
gic) cells that project to the central
nucleus. Consequently, engaging the
infralimbic region inhibits the out-
put from the central nucleus and
fear behaviors.
Given the above observations,
Maier and his colleagues hypothe-
sized that output from the infralim-
bic area of the vmPFC acts to inhibit
the performance of fear-elicited CRs
(within the central nucleus) rather
than learning (within the basolateral
nucleus). To explore this idea, rats
were given controllable or uncon-
trollable shock. A third group
remained untreated. Rats were conditioned a week later by administering shock in a novel context. The next day, they were tested to see if the context elicited conditioned fear (freezing). To explore the role of the infralimbic area, the investigators inactivated this region in half the subjects by administering the GABA agonist muscimol, either before conditioning or before testing. In rats that received the drug vehicle alone, the usual pattern of results was obtained: controllable shock reduced behavioral signs of fear, whereas prior exposure to uncontrollable shock enhanced conditioned freezing. Inhibiting the infralimbic region prior to conditioning had no effect, which suggests that behavioral control does not affect learning. Turning off the infralimbic region prior to testing had little effect on rats that had received uncontrollable shock, but eliminated the calming (antifear) effect of controllable shock. As hypothesized, it appears that a history of behavioral control reduces the expression of fear, but not its acquisition.

J. W. Grau

dorsal raphe nucleus (DRN) A region of the midbrain that regulates neural activity in structures (e.g., the periaqueductal gray, amygdala) related to stress and emotion.

prefrontal cortex (PFC) The most anterior (forward) region of the frontal lobes. The PFC has been implicated in executive control, working memory, and planning.

[Figure 5.15: A model of how aversive stimulation and behavioral control regulate physiological responses to stress. Aversive stimulation engages structures such as the lateral habenula (L Habenula), locus coeruleus (LC), and bed nucleus of the stria terminalis (BNST), which project to the dorsal raphe nucleus (DRN). This engages serotonergic (5-HT) fibers that activate neural regions implicated in defensive responding (periaqueductal gray [PAG] and amygdala) and reward (nucleus accumbens [N. Acc]). Behavioral control activates excitatory glutamatergic (Glut) neurons within the ventral medial prefrontal cortex (vmPFC) that project to the DRN, where they engage inhibitory (GABA) neurons that reduce 5-HT activity (adapted from Maier and Watkins, 2010).]
Sample Questions
1. Compare and contrast free-operant and discrete-
trial methods for the study of instrumental behavior.
2. What are the similarities and differences between
positive and negative reinforcement?
3. What is the current thinking about instrumental
reinforcement and creativity, and what is the
relevant experimental evidence?
4. How does the current status of a reinforcer depend
on prior experience with that or other reinforcers?
5. What are the effects of a delay of reinforcement
on instrumental learning, and what causes these
effects?
6. What was the purpose of Skinner’s superstition
experiment? What were the results, and how have
those results been reinterpreted?
7. Describe alternative explanations of the learned-
helplessness effect.
Key Terms
accidental reinforcement An instance in which the
delivery of a reinforcer happens to coincide with a particu-
lar response, even though that response was not responsi-
ble for the reinforcer presentation. Also called adventitious
reinforcement. This type of reinforcement was considered
to be responsible for “superstitious” behavior.
adventitious reinforcement Same as accidental
reinforcement.
appetitive stimulus A pleasant or satisfying stimulus
that can be used to positively reinforce an instrumental
response.
aversive stimulus An unpleasant or annoying stimu-
lus that can be used to punish an instrumental
response.
avoidance An instrumental conditioning procedure
in which the instrumental response prevents the deliv-
ery of an aversive stimulus.
behavioral contrast Change in the value of a reinforcer
produced by prior experience with a reinforcer of a higher
or lower value. Prior experience with a lower valued rein-
forcer increases reinforcer value (positive behavioral con-
trast), and prior experience with a higher valued reinforcer
reduces reinforcer value (negative behavioral contrast).
belongingness The idea, originally proposed by
Thorndike, that an organism’s evolutionary history
makes certain responses fit or belong with certain rein-
forcers. Belongingness facilitates learning.
conditioned reinforcer A stimulus that becomes an
effective reinforcer because of its association with a pri-
mary or unconditioned reinforcer. Also called secondary
reinforcer.
contiguity The occurrence of two events, such as a
response and a reinforcer, at the same time or very
close together in time. Also called temporal contiguity.
differential reinforcement of other behavior (DRO)
An instrumental conditioning procedure in which a
positive reinforcer is periodically delivered only if the
participant does something other than the target
response.
discrete-trial procedure A method of instrumental
conditioning in which the participant can perform the
instrumental response only during specified periods,
usually determined either by placement of the partici-
pant in an experimental chamber or by the presentation
of a stimulus.
escape An instrumental conditioning procedure in
which the instrumental response terminates an aversive
stimulus. (See also negative reinforcement.)
free-operant procedure A method of instrumental
conditioning that permits repeated performance of the
instrumental response without intervention by the
experimenter. (Compare with discrete-trial procedure.)
instinctive drift A gradual drift of instrumental
behavior away from the responses required for rein-
forcement to species-typical, or instinctive, responses
related to the reinforcer and to other stimuli in the
experimental situation.
instrumental behavior An activity that occurs
because it is effective in producing a particular conse-
quence or reinforcer.
interim response A response that has its highest
probability in the middle of the interval between suc-
cessive presentations of a reinforcer, when the rein-
forcer is not likely to occur.
latency The time between the start of a trial (or the
start of a stimulus) and the instrumental response.
law of effect A mechanism of instrumental behavior,
proposed by Thorndike, which states that if a response
(R) is followed by a satisfying event in the presence of a
stimulus (S), the association between the stimulus and
the response (S-R) will be strengthened; if the response
is followed by an annoying event, the S-R association
will be weakened.
learned-helplessness effect Interference with the
learning of new instrumental responses as a result of
exposure to inescapable and unavoidable aversive
stimulation.
learned-helplessness hypothesis The proposal that
exposure to inescapable and unavoidable aversive stimu-
lation reduces motivation to respond and disrupts subse-
quent instrumental conditioning because participants
learn that their behavior does not control outcomes.
magazine training A preliminary stage of instru-
mental conditioning in which a stimulus is repeatedly
paired with the reinforcer to enable the participant to
learn to go and get the reinforcer when it is presented.
The sound of the food-delivery device, for example,
may be repeatedly paired with food so that the animal
will learn to go to the food cup when food is
delivered.
marking procedure A procedure in which the instru-
mental response is immediately followed by a distinc-
tive event (the participant is picked up or a flash of
light is presented) that makes the instrumental response
more memorable and helps overcome the deleterious
effects of delayed reinforcement.
negative punishment Same as omission training or
differential reinforcement of other behavior.
negative reinforcement An instrumental condition-
ing procedure in which there is a negative contingency
between the instrumental response and an aversive
stimulus. If the instrumental response is performed,
the aversive stimulus is terminated or canceled; if the
instrumental response is not performed, the aversive
stimulus is presented.
omission training An instrumental conditioning
procedure in which the instrumental response prevents
the delivery of a reinforcing stimulus. (See also differen-
tial reinforcement of other behavior.)
operant response A response that is defined by the
effect it produces in the environment. Examples
include pressing a lever and opening a door. Any
sequence of movements that depresses the lever or
opens the door constitutes an instance of that partic-
ular operant.
positive reinforcement An instrumental condition-
ing procedure in which there is a positive contingency
between the instrumental response and an appetitive
stimulus or reinforcer. If the participant performs the
response, it receives the reinforcer; if the participant
does not perform the response, it does not receive the
reinforcer.
positive punishment Same as punishment.
punishment An instrumental conditioning procedure
in which there is a positive contingency between the
instrumental response and an aversive stimulus. If the
participant performs the instrumental response, it
receives the aversive stimulus; if the participant does
not perform the instrumental response, it does not
receive the aversive stimulus.
response–reinforcer contingency The relation of a
response to a reinforcer defined in terms of the proba-
bility of getting reinforced for making the response as
compared to the probability of getting reinforced in the
absence of the response.
response shaping Reinforcement of successive
approximations to a desired instrumental response.
running speed How fast (e.g., in feet per second) an
animal moves down a runway.
secondary reinforcer Same as conditioned reinforcer.
superstitious behavior Behavior that increases in fre-
quency because of accidental pairings of the delivery of
a reinforcer with occurrences of the behavior.
temporal contiguity Same as contiguity.
temporal relation The time interval between an
instrumental response and the reinforcer.
terminal response A response that is most likely at
the end of the interval between successive reinforce-
ments that are presented at fixed intervals.
CHAPTER 6
Schedules of Reinforcement
and Choice Behavior
Simple Schedules of Intermittent Reinforcement
Ratio Schedules
Interval Schedules
Comparison of Ratio and Interval Schedules
Choice Behavior: Concurrent Schedules
Measures of Choice Behavior
The Matching Law
Mechanisms of the Matching Law
Complex Choice and Self-control
Concurrent-Chain Schedules
Self-Control Choice and Delay Discounting
Concluding Comments
Sample Questions
Key Terms
CHAPTER PREVIEW
Instrumental responses rarely get reinforced each time they occur. This chapter continues our discussion
of the importance of the response–reinforcer relation in instrumental behavior by describing the effects of
intermittent schedules of reinforcement. A schedule of reinforcement is a program or rule that determines
which occurrence of the instrumental response is followed by delivery of the reinforcer. Schedules of
reinforcement are important because they determine the rate, pattern, and persistence of instrumental
behavior. To begin, I will describe simple fixed-ratio, variable-ratio, fixed-interval, and variable-interval
schedules and the patterns of instrumental responding that are produced by these schedules. Then, I
will describe how schedules of reinforcement determine the choices organisms make between different
response alternatives. Concurrent and concurrent-chain schedules of reinforcement are techniques that
have been widely used to examine the mechanisms of choice in laboratory experiments. A particularly
interesting form of choice is between modest short-term gains versus larger long-term gains because
these alternatives represent the dilemma of self-control.
In describing various instrumental conditioning procedures in Chapter 5, I may have
given the impression that every occurrence of the instrumental response invariably
results in delivery of the reinforcer. Casual reflection suggests that such a perfect contin-
gency between response and reinforcement is rare in the real world. You do not get a
high grade on a test each time you study. You don’t get an immediate response from a
friend every time you send a text message, and going on a date with someone does not
always result in a good time. In fact, in most cases the relation between instrumental
responses and consequent reinforcement is rather complex. Laboratory investigations
have been examining how these complex relations determine the rate and pattern of
instrumental behavior.
A schedule of reinforcement is a program or rule that determines which occurrence
of a response is followed by the reinforcer. There are an infinite number of ways that
such a program could be set up. The delivery of a reinforcer could depend on the occur-
rence of a certain number of responses, the passage of time, the presence of certain sti-
muli, the occurrence of other responses, or any number or combination of other factors.
One might expect that cataloging the behavioral effects produced by various possible
schedules of reinforcement would be difficult. However, research so far has shown that
the job is quite manageable. Reinforcement schedules that involve similar relations
between responses and reinforcers usually produce similar patterns of behavior. The
exact rate of responding may differ from one situation to another, but the pattern of
behavior is highly predictable. This regularity has made the study of reinforcement sche-
dules both interesting and very useful.
Schedules of reinforcement influence both how an instrumental response is learned
and how it is then maintained by reinforcement. Traditionally, however, investigators of
schedule effects have been concerned primarily with the maintenance of behavior. Thus,
schedule effects are highly relevant to the motivation of behavior. Whether someone
works hard or is lazy depends less on their personality than on the schedule of reinforce-
ment that is in effect.
Schedules of reinforcement are important for managers who have to make sure their
employees continue to perform a job after having learned it. Even public school teachers
are often concerned with encouraging the occurrence of already learned responses rather
than teaching new ones. Many students who do poorly in school know how to do their
homework and how to study but simply choose not to. Schedules of reinforcement can
be used to motivate more frequent studying behavior.
Studies that focus on schedules of reinforcement have provided important informa-
tion about the reinforcement process and have also provided “useful baselines for the
analysis of other behavioral phenomena” (Lattal, 2013). The behavioral effects of drugs,
brain lesions, or manipulation of neurotransmitter systems often depend on the schedule
of reinforcement that is in effect during the behavioral testing. This makes the under-
standing of schedule performance critical to the study of a variety of other issues in
behavior theory and behavioral neuroscience.
Laboratory studies of schedules of reinforcement are typically conducted using a
Skinner box that has a clearly defined response that can occur repeatedly, so that changes
in the rate of responding can be readily observed and analyzed (Ferster & Skinner, 1957).
The manner in which a rat’s lever-press or pigeon’s key-peck response is initially shaped and
conditioned is usually of little interest. Rather, the focus is on schedule factors that control
the timing and repetition of the operant response (see Morgan, 2010, for a recent review).
Simple Schedules of Intermittent
Reinforcement
In simple schedules, a single factor determines which occurrence of the instrumental
response is reinforced. The single factor can be how many responses have occurred or
how much time has passed before the target response can be reinforced.
156 Chapter 6: Schedules of Reinforcement and Choice Behavior
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Ratio Schedules
The defining characteristic of a ratio schedule is that reinforcement depends only on the
number of responses the organism has to perform. A ratio schedule requires merely
counting the number of responses that have occurred and delivering the reinforcer each
time the required number is reached. If the required number is one, every response
results in delivery of the reinforcer. Such a schedule is technically called continuous
reinforcement (CRF).
Contingency management programs used in the treatment of drug abuse often
employ a continuous reinforcement schedule. The clients are required to come to the clinic
several times a week to be tested for drug use. If the test indicates that they have not used
drugs since the last visit, they receive a voucher, which can be exchanged for money. In an
effective variation of this procedure, the amount of money paid is increased with succes-
sive drug-free tests and is reset to zero if the participant relapses (Roll & Newton, 2008).
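Stated as an algorithm, this escalating-with-reset contingency is simple. The following minimal Python sketch is illustrative only; the dollar amounts and function names are assumptions, not details from Roll and Newton (2008).

    def voucher_value(test_results, base=2.50, step=1.25):
        """Payment for the latest drug test under an escalating
        contingency: each consecutive drug-free test raises the
        voucher by `step`; a positive test resets earnings to zero."""
        streak = 0
        for drug_free in test_results:
            streak = streak + 1 if drug_free else 0
        if streak == 0:
            return 0.0                     # relapse on the latest test
        return base + step * (streak - 1)

    # Three consecutive drug-free tests earn 2.50, 3.75, then 5.00.
    print(voucher_value([True, True, True]))   # 5.0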
Continuous reinforcement also occurs outside the laboratory, wherever there is a direct
causal link between the instrumental response and the outcome. Unlocking your car
door enables you to get in the car, entering the correct code using your ATM card
enables you to withdraw money from the ATM, and turning on the hot water
in the shower enables you to take a comfortable shower. Barring any malfunctions, all of
these are examples of continuous reinforcement. However, if the lock on your car door
malfunctions, or if you don’t have enough money in your ATM account, your instru-
mental behavior will not be reinforced every time. Situations in which responding is
reinforced only some of the time are said to involve partial reinforcement or intermit-
tent reinforcement. The following are simple schedules of intermittent reinforcement.
Fixed-Ratio Schedule Consider, for example, delivering the reinforcer after every
10th lever-press response in a study with laboratory rats. In such a schedule, there
would be a fixed ratio between the number of responses the rat made and the number
of reinforcers it got (10 responses per reinforcer). This makes the procedure a fixed-ratio
schedule (FR). More specifically, the procedure would be called a fixed-ratio 10 or FR 10.
FR schedules are found in daily life wherever a fixed number of responses or a fixed
amount of effort is required for reinforcement. People who distribute flyers are typically
paid a certain amount for every batch of 50 flyers that they place on apartment doors.
This is an FR 50 schedule of reinforcement. Checking class attendance by reading the
roll is on an FR schedule, set by the number of students on the class roster. Making a
phone call also involves an FR schedule, as each phone number includes a predetermined
number of digits.
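Because an FR schedule is simply a counting rule, it can be written in a few lines of code. Here is a minimal Python sketch that assumes nothing beyond the definition of FR given above (the class and method names are illustrative):

    class FixedRatio:
        """FR n: deliver the reinforcer on every nth response.
        FixedRatio(1) is continuous reinforcement (CRF)."""
        def __init__(self, n):
            self.n = n
            self.count = 0

        def respond(self):
            """Register one response; True means it was reinforced."""
            self.count += 1
            if self.count == self.n:
                self.count = 0
                return True
            return False

    fr10 = FixedRatio(10)
    outcomes = [fr10.respond() for _ in range(20)]
    # Only the 10th and 20th responses are reinforced.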
A continuous reinforcement schedule is also an FR schedule. Continuous reinforce-
ment involves a fixed ratio of one response per reinforcer. On a continuous reinforce-
ment schedule, organisms typically respond at a steady and moderate rate. Only brief
and unpredictable pauses occur. A very different pattern of responding occurs when an
FR schedule is in effect that requires a larger number of responses. You are not likely to
pause in the middle of dialing a phone number. However, you may take a while to start
making the call. This is the typical pattern for FR schedules. There is a steady and high
rate of responding once the behavior gets under way. But there may be a pause before
the start of the required number of responses. These features of responding are clearly
evident in a cumulative record of the behavior.
A cumulative record is a special way of representing how a response is repeated over
time. It shows the total (or cumulative) number of responses that have occurred up to a
particular point in time. When Ferster and Skinner (1957) did their research on sche-
dules of reinforcement, cumulative records were obtained with the use of a chart
recorder (Figure 6.1). The recorder consisted of a rotating drum that pulled paper out
Simple Schedules of Intermittent Reinforcement 157
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
of the recorder at a constant speed. A pen rested on the surface of the paper. If no
responses occurred, the pen remained at the same level and made a horizontal line
as the paper came out of the machine. If the pigeon performed a key-peck response,
the pen moved one step vertically on the paper. Because each key-peck response caused
the pen to move one step up the paper, the total vertical distance traveled by the pen
represented the cumulative (or total) number of responses the participant made. Because
the paper came out of the recorder at a constant speed, the horizontal distance on the
cumulative record provided a measure of how much time had elapsed in the session.
The slope of the line made by the cumulative recorder represents the participant’s rate
of responding (number of responses per unit of time).
The cumulative record provides a complete visual representation of when and how
frequently the participant responds during a session. In the record of Figure 6.1, for
example, the participant did not perform the response between Points A and B, and a
slow rate of responding occurred between Points B and C. Responses occurred more fre-
quently between Points C and D, but the participant paused at D. After responding
resumed, the pen reached the top of the page (at Point E) and reset to the bottom for
additional responses.
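In digital form, a cumulative record is just a running response count plotted against session time; the slope of the resulting curve is the response rate. The following is a minimal sketch, assuming responses have been logged as timestamps in seconds (the function name and binning scheme are illustrative):

    def cumulative_record(response_times, session_end, bin_size=1.0):
        """Return (time, cumulative-responses) points, one per bin.
        Flat stretches are pauses; steep stretches are high rates."""
        points, total, i = [], 0, 0
        times = sorted(response_times)
        t = 0.0
        while t <= session_end:
            while i < len(times) and times[i] <= t:
                total += 1
                i += 1
            points.append((t, total))
            t += bin_size
        return points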
Figure 6.2 shows the cumulative record of a pigeon whose responding had stabilized
on a reinforcement schedule that required 120 pecks for each delivery of the reinforcer
(an FR 120 schedule). Each food delivery is indicated by the small downward deflections
of the recorder pen. The bird stopped responding after each food delivery, as would be
expected. However, when it resumed pecking, it responded at a high and steady rate. The
zero rate of responding that typically occurs just after reinforcement on a fixed-ratio
schedule is called the post-reinforcement pause. The high and steady rate of responding
that completes each ratio requirement is called the ratio run.
If the ratio requirement is increased a little (e.g., from FR 120 to FR 150), the rate of
responding during the ratio run may remain the same. However, with higher ratio
requirements, longer post-reinforcement pauses occur (e.g., Felton & Lyon, 1966;
Williams, Saunders, & Perone, 2008). If the ratio requirement is suddenly increased a
great deal (e.g., from FR 120 to FR 500), the animal is likely to pause periodically before
the completion of the ratio requirement (e.g., Stafford & Branch, 1998). This effect is called
ratio strain. In extreme cases, ratio strain may be so great that the animal stops respond-
ing altogether. To avoid ratio strain during training, one must be careful not to raise the
ratio requirement too quickly in approaching the desired FR response requirement.
[Figure 6.1: The plotting of a cumulative record by a cumulative recorder for the continuous recording of behavior. The paper moves out of the machine toward the left at a constant speed; each response moves the pen one step up the paper. No responses occurred between Points A and B. A moderate rate of responding occurred between Points B and C, and a rapid rate occurred between Points C and D. At Point E, the pen reset to the bottom of the page. (© Cengage Learning)]

Although the pause that occurs before a ratio run in FR schedules is historically
called the post-reinforcement pause, research has shown that the length of the pause is
controlled by the upcoming ratio requirement (e.g., Baron & Herpolsheimer, 1999; see
also Wade-Galuska, Perone, & Wirth, 2005). Consider, for example, washing your car
by hand rather than driving through a car wash. Washing a car by hand is an FR task
since it requires a set number of responses and a set amount of effort each time, as deter-
mined by the size of your car. If you procrastinate before starting to wash your car, it is
because you are not quite ready to tackle the job, not because you are resting from the
previous time you did the work. Thus, the post-reinforcement pause would be more
correctly labeled the pre-ratio pause.
Variable-Ratio Schedule In an FR schedule, a predictable number of responses or
amount of effort is required for each reinforcer. This predictability can be disrupted by
varying the number of responses required for reinforcement from one occasion to the
next, which would be the case if you worked at a car wash where you had to work on
cars of different sizes. Such a situation is still a ratio schedule because washing each car
still depends on a set number of responses or effort. However, now a different number of
responses is required to obtain successive reinforcers. Such a procedure is called a
variable-ratio schedule (VR). We may, for example, require a pigeon to make
10 responses to earn the first reinforcer, 13 to earn the second, 7 for the next one, and
so on. Such a schedule requires on average 10 responses per reinforcer and would be a
variable-ratio 10 schedule (VR 10).
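In practice, VR schedules are often programmed as random-ratio rules, in which each response is reinforced with a fixed probability so that the requirement varies around the programmed mean. The following minimal sketch uses that approximation as a programming convenience; it is an assumption here, not a claim about any particular experiment:

    import random

    class VariableRatio:
        """Approximate VR n as a random-ratio schedule: each response
        is reinforced with probability 1/n, so the number of responses
        per reinforcer varies but averages n."""
        def __init__(self, n):
            self.p = 1.0 / n

        def respond(self):
            return random.random() < self.p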
VR schedules are found in daily life whenever an unpredictable amount of effort is
required to obtain a reinforcer. For example, each time a custodian goes into a room on
his or her rounds, he or she knows that some amount of cleaning will be necessary but
does not know exactly how dirty the room will be. Gamblers playing a slot machine are
also responding on a VR schedule. They have to play the machine to win. However, they
never know how many plays will produce the winning combination. VR schedules are
also common in sports. A certain number of strokes are always required to finish a
hole in golf. But, most players cannot be sure how many strokes they will need when
they begin a hole.
Because the number of responses required for reinforcement is not predictable, pre-
dictable pauses in the rate of responding are less likely with VR schedules than with FR
schedules. Rather, organisms respond at a fairly steady rate on VR schedules. Figure 6.2
shows a cumulative record for a pigeon whose pecking behavior was maintained on a VR
360 schedule of reinforcement. Notice that even though on average the VR 360 schedule
required many more pecks for each reinforcer than the FR 120 schedule shown in
Figure 6.2, the VR 360 schedule maintained a much steadier pattern of responding.
Although post-reinforcement pauses can occur on VR schedules (e.g., Schlinger,
Blakely, & Kaczor, 1990), such pauses are longer and more prominent with FR schedules.
[Figure 6.2: Sample cumulative records of different pigeons pecking a response key on four simple schedules of food reinforcement: fixed ratio 120, variable ratio 360, fixed interval 4 minutes, and variable interval 2 minutes. (Based on Schedules of Reinforcement, by C. B. Ferster and B. F. Skinner, 1957, Appleton-Century-Crofts.)]
The overall response rate on FR and VR schedules is similar provided that, on average,
similar numbers of responses are required. However, the overall response rate tends to
be distributed in a pause–run pattern with FR schedules, whereas a steady pattern of
responding is observed with VR schedules (e.g., Crossman, Bonem, & Phelps, 1987).
Interval Schedules
In ratio schedules, reinforcement depends only on the number of responses the partici-
pant has performed. Time is irrelevant. In other situations, a response is reinforced only
if the response occurs after a certain amount of time has passed. This is the case for
interval schedules.
Fixed-Interval Schedule In a simple interval schedule, a response is reinforced only
if it occurs more than a set amount of time after a reference point, the last delivery of the
reinforcer or the start of the trial. In a fixed-interval schedule (FI), the amount of time
that has to pass before a response is reinforced is constant from one trial to the next.
A washing machine, for example, operates on a fixed interval schedule. A fixed amount
of time is required to complete the wash cycle. No matter how many times you open the
washing machine before the required time has passed, you will not be reinforced with
clean clothes. Once the cycle is finished, the clothes are clean, and you can take them
out any time after that.
Similar contingencies can be set up in the laboratory. Consider, for example, a fixed-
interval 4-minute schedule (FI 4 min) for pecking in pigeons. In this case, 4 minutes
would be required to set up the reinforcer. A pigeon would get reinforced for the first
peck it made after completion of the 4-minute setup time. Because pecks made less
than 4 minutes into the trial are never reinforced, the pigeons would learn to wait to
respond until the end of the fixed interval (Figure 6.2). As the time for the availability
of the next reinforcer draws closer, the response rate increases. This increase in response
rate is evident as an acceleration in the cumulative record toward the end of each fixed
interval and is called the fixed-interval scallop.
BOX 6.1
The Post-Reinforcement Pause and Procrastination
The post-reinforcement pause that
occurs in FR schedules in the labora-
tory is also evident in common human
experience. As I noted earlier, the
pause occurs because a predictably
large number of responses are
required to produce the next reward.
Such procrastination is legendary in
human behavior. Consider, for exam-
ple, a semester in which you have
several term papers to write. You are
likely to work on one term paper at a
time. However, when you have com-
pleted one paper, you probably will
not start working on the next one
right away. Rather, there will be a
post-reinforcement pause. After com-
pleting a large project, people find it
difficult to jump right into the next
one. In fact, procrastination between
tasks or before the start of a new job is
the rule rather than the exception.
FR-schedule performance in the
laboratory indicates that once animals
begin to respond on a ratio run, they
respond at a high and steady rate
until they complete the ratio
requirement. This suggests that if
somehow you got yourself to start on
a task, chances are you will not find it
difficult to keep going. Only the
beginning is hard. One technique that
works pretty well is to tell yourself
that you will start by just doing a little
bit of the job. If you are trying to
write a paper, tell yourself that you
will write only one paragraph to start
with. You may find that once you
have completed the first paragraph, it
will be easier to write the second one,
then the one after that, and so on. If
you are procrastinating about spring
cleaning, instead of thinking about
doing the entire job, start with a small
part of it, such as washing the kitchen
floor. The rest will then come more
easily. (For a broader discussion of
procrastination, see Steel, 2007.)
Performance on an FI schedule reflects the participant’s accuracy in telling time.
(I will have more to say about the psychology of timing in Chapter 12.) If the participant
were entirely incapable of telling time, it would be equally likely to respond at any point
in the FI cycle. The post-reinforcement pause and the subsequent acceleration toward
the end of the interval reflect a rudimentary ability to tell time. How could this ability
be improved? Common experience suggests that having a watch or clock of some sort
makes it much easier to judge time intervals. The same thing happens with pigeons on
an FI schedule. In one study, the clock consisted of a spot of light that grew as time
passed during the FI cycle. Introduction of this clock stimulus increased the duration of
the post-reinforcement pause and caused responding to shift closer to the end of the FI
cycle (Ferster & Skinner, 1957).
It is important to realize that an FI schedule does not guarantee that the reinforcer
will be provided at a certain point in time. Pigeons on an FI 4-min schedule do not auto-
matically receive access to grain every four minutes. The interval determines only when
the reinforcer becomes available, not when it is delivered. To receive the reinforcer after
it has become available, the participant still has to make the instrumental response. (For
reviews of FI timing and operant behavior, see Staddon & Cerutti, 2003; Jozefowiez &
Staddon, 2008.)
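The distinction between a reinforcer being set up and being delivered is easy to capture in code. Here is a minimal FI sketch in Python, assuming the caller supplies the current session time in seconds (the names are illustrative):

    class FixedInterval:
        """FI t: the reinforcer is set up `t` seconds after the last
        delivery (or session start); the first response after set-up
        collects it. Earlier responses go unreinforced."""
        def __init__(self, t):
            self.t = t
            self.last_delivery = 0.0

        def respond(self, now):
            if now - self.last_delivery >= self.t:
                self.last_delivery = now    # reinforcer collected now
                return True
            return False                    # too early in the interval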
The scheduling of tests in college courses has major similarities to the basic FI
schedule. Usually there are only two or three tests, and the tests are evenly distributed
during the term. The pattern of studying that such a schedule encourages is very similar
to what is observed with an FI schedule in the laboratory. Students spend little effort
studying at the beginning of the semester or just after the midterm exam. Rather, they
begin to study a week or two before each exam, and the rate of studying rapidly increases
as the day of the exam approaches. Interestingly, members of the U.S. Congress behave
the same way, writing bills at much higher rates as the end of the congressional session
approaches (Critchfield et al., 2003).
Variable-Interval Schedule In FI schedules, responses are reinforced if they occur
after a fixed amount of time has passed after the start of the trial or schedule cycle. Inter-
val schedules also can be unpredictable. With a variable-interval schedule (VI), the time
required to set up the reinforcer varies from one trial to the next. The subject has to
respond to obtain the reinforcer that has been set up, but now the set-up time is not as
predictable.
VI schedules are found in situations where an unpredictable amount of time is
required to prepare the reinforcer. A mechanic who cannot tell you how long it will
take to fix your car has imposed a VI schedule on you. The car will not be ready for
some time, during which attempts to get it will not be reinforced. How much time has
to pass before the car will be ready is unpredictable. A sales clerk at a bakery is also on a
VI schedule of reinforcement. Some time has to pass after waiting on a customer before
another will enter the store to buy something. However, the interval between customers
is unpredictable.
In a laboratory study, a VI schedule could be set up in which the first food pellet
will be available when at least 1 minute has passed since the beginning of the session,
the second food pellet will be available when at least 3 minutes have passed since the
previous pellet, and the third reinforcer will be available when at least 2 minutes have
passed since the previous pellet. In this procedure, the average set-up time for the rein-
forcer is 2 minutes. Therefore, the procedure would be called a VI two-minute schedule,
or VI 2 min.
As in FI schedules, the participant has to perform the instrumental response to
obtain the reinforcer. Reinforcers are not given just because a certain amount of time
has passed. Rather, they are given if the individual responds after the variable interval
has timed out. Like VR schedules, VI schedules maintain steady and stable rates of
responding without regular pauses (Figure 6.2).
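A VI schedule can be sketched like the FI rule above, except that each set-up time is drawn at random around the programmed mean. Exponentially distributed set-up times are one standard way to program a constant-probability VI schedule; that choice is an assumption here, not something specified in the text.

    import random

    class VariableInterval:
        """VI t: the reinforcer is set up after a variable delay
        averaging `t` seconds; the first response after set-up
        collects it."""
        def __init__(self, t):
            self.t = t
            self.available_at = random.expovariate(1.0 / t)

        def respond(self, now):
            if now >= self.available_at:
                # Collected; schedule the next set-up from now.
                self.available_at = now + random.expovariate(1.0 / self.t)
                return True
            return False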
Interval Schedules and Limited Hold In simple interval schedules, once the rein-
forcer becomes available, it remains available until the required response is made, no
matter how long that may take. For example, on an FI 2-min schedule, the reinforcer
becomes available 2 minutes after the start of the schedule cycle. If the animal responds
at exactly this time, it will be reinforced. If it waits and responds 90 minutes later, it will
still get reinforced. Once the reinforcer has been set up, it remains available until the
response occurs.
With interval schedules outside the laboratory, it is more common for reinforcers to
become available for only limited periods. Consider, for example, a dormitory cafeteria.
Meals are served at fixed times of day. Therefore, going to the cafeteria is reinforced only
after a certain amount of time has passed since the last meal (the set-up time). However,
once a meal becomes available, you have a limited amount of time in which to get it.
This kind of restriction on how long a reinforcer remains available is called a limited
hold. Limited-hold restrictions can be added to either FI or VI schedules.
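A limited hold adds one clause to the interval sketches above: the reinforcer expires if it is not collected quickly enough. As a hypothetical extension, the following method could be added to the VariableInterval sketch (the 10-second `hold` value is arbitrary):

        def respond_with_hold(self, now, hold=10.0):
            """Reinforce only if the response occurs within `hold`
            seconds of the reinforcer becoming available."""
            expired = now > self.available_at + hold
            if now >= self.available_at and not expired:
                self.available_at = now + random.expovariate(1.0 / self.t)
                return True
            if expired:
                # The reinforcer lapsed uncollected; set up the next one.
                self.available_at = now + random.expovariate(1.0 / self.t)
            return False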
Comparison of Ratio and Interval Schedules
There are striking similarities between the patterns of responding maintained by simple
ratio and interval schedules. As we have seen, with both FR and FI schedules, there is a
post-reinforcement pause after each delivery of the reinforcer. In addition, both FR and
FI schedules produce high rates of responding just before the delivery of the next rein-
forcer. By contrast, VR and VI schedules both maintain steady rates of responding with-
out predictable pauses. Does this mean that interval and ratio schedules motivate
behavior in the same way? Not at all! The surface similarities hide fundamental differ-
ences in the underlying motivational mechanisms of interval and ratio schedules.
Early evidence of fundamental differences between ratio and interval schedules was
provided in an important experiment by Reynolds (1975). Reynolds compared the rate of
key pecking in pigeons reinforced on VR and VI schedules. Two pigeons were trained to
peck the response key for food reinforcement. One of the birds was reinforced on a VR
schedule. Therefore, for this bird the frequency of reinforcement was entirely determined
by how many responses it made. The other bird was reinforced on a VI schedule. To
make sure that the opportunities for reinforcement would be identical for the two
birds, the VI schedule was controlled by the behavior of the bird reinforced on the VR
schedule. Each time the VR pigeon was just one response short of the requirement for
reinforcement on that trial, the experimenter set up the reinforcer for the VI bird. With
this arrangement, the next response made by each bird was reinforced. Thus, the fre-
quency of reinforcement was virtually identical for the two animals.
Figure 6.3 shows the cumulative record of pecking exhibited by each bird. Even
though the two pigeons received the same frequency and distribution of reinforcers,
they behaved very differently. The pigeon reinforced on the VR schedule responded at
a much higher rate than the pigeon reinforced on the VI schedule. The VR schedule
motivated much more vigorous instrumental behavior. This basic finding has since
been replicated in numerous studies and has stimulated lively theoretical analysis.
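The yoking arrangement itself is mechanical and can be expressed directly. Here is a minimal sketch (all names illustrative) that, given the VR bird's response stream and its sequence of ratio requirements, returns the moments at which the yoked VI bird's reinforcer should be set up:

    def yoke_vi_to_vr(vr_response_times, requirements):
        """The VI reinforcer is set up whenever the VR bird is one
        response short of its current ratio requirement, so that the
        next response of each bird is reinforced."""
        setups, count, req_i = [], 0, 0
        for t in vr_response_times:
            count += 1
            if count == requirements[req_i] - 1:
                setups.append(t)             # set up the VI reinforcer
            elif count >= requirements[req_i]:
                count = 0                    # VR reinforcer delivered
                req_i = (req_i + 1) % len(requirements)
        return setups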
Results similar to those Reynolds observed with pigeons also have been found with
undergraduate students (e.g., Raia et al., 2000). The task was akin to a video game.
A target appeared on a computer screen and the students had to maneuver a spaceship
and “fire” at the target with a joystick as the instrumental response. Following a direct
hit of the target, the participants received 5¢. However, not every “hit” was reinforced.
Which occurrence of the instrumental response was reinforced depended on the sched-
ule of reinforcement programmed into the software. The students were assigned to
pairs but each worked in a separate cubicle and didn’t know that he or she had a part-
ner. One member of each pair received reinforcement on a VR schedule. The other
member of the pair was reinforced on a VI schedule that was yoked to the VR sched-
ule. Thus, as in the pigeon experiment, reinforcers became available to both participants at
the same time, but one controlled access to the reinforcer through a VR schedule and the other
did not.
Raia and colleagues (2000) studied the effects of response shaping, instructions, and
the presence of a consummatory response on performance on the VR–VI yoking proce-
dure. (The consummatory response was picking up the 5¢ reinforcer each time it was
delivered and putting it into a piggy bank.) One set of conditions was quite similar to
the pigeon studies: The students were shaped to make the instrumental response, they
received minimal instructions, and they were required to make the consummatory
response. Interestingly, under these conditions, the college students performed just like
the pigeons. Higher rates of responding occurred for the student of each pair who was
reinforced on the VR schedule.
The higher response rates that occur on ratio as compared to interval schedules
powerfully illustrate how schedules can alter the motivation for instrumental behavior.
A simplistic theory might assume that the rate of responding is just a function of how
many reinforcers the participant earns. But, in these experiments, the rates of reinforce-
ment were identical with the ratio and interval schedules. Nevertheless, the ratio sche-
dules produced much more behavior. This is important news if you are a manager
trying to get the most effort from your employees. The reinforcer in an employment
situation is provided by the wages individuals earn. The Reynolds experiment tells you
that you can get employees to work harder for the same pay if the wages are provided
on a ratio rather than an interval schedule.
[Figure 6.3: Cumulative records for two pigeons, one reinforced on a VR schedule (about five responses per second) and the other yoked to it on a VI schedule (about one response per second). Although the two pigeons received the same rate of reinforcement, the VR bird responded five times as fast as the VI bird. (Based on A Primer of Operant Conditioning, 2nd ed., by G. S. Reynolds.)]

Reinforcement of Inter-Response Times Why might ratio schedules produce higher
rates of responding than interval schedules? According to one explanation, the critical
factor is the reinforcement of short inter-response times. The inter-response time
(IRT) is the interval between successive responses. I noted in Chapter 5 that various
features of behavior can be increased by reinforcement. The IRT is one such behavioral
feature. If the participant is reinforced for a response that occurs shortly after the preced-
ing one, then a short IRT is reinforced and short IRTs become more likely in the future.
On the other hand, if the participant is reinforced for a response that ends a long IRT,
then a long IRT is reinforced and long IRTs become more likely in the future. A partici-
pant who has mostly short IRTs is responding at a high rate. By contrast, a participant
who has mostly long IRTs is responding at a low rate.
How do ratio and interval schedules determine the reinforcement of IRTs? Consider
a ratio schedule. With a ratio schedule there are no time constraints, and the faster the
participant completes the ratio requirement, the faster he or she will receive the rein-
forcer. Thus, a ratio schedule favors not waiting long between responses. It favors short
IRTs. In fact, ratio schedules differentially reinforce short IRTs.
In contrast, interval schedules provide little advantage for short IRTs. Rather, inter-
val schedules favor waiting longer between responses. Consider, for example, an FI
2-min schedule of food reinforcement. Each food pellet becomes available 2 minutes
after the last one was delivered. If the participant responds frequently before the food
pellet is set up, those responses and short IRTs will not be reinforced. On the other
hand, if the participant waits a long time between responses (emitting long IRTs), those
responses are more likely to occur after the 2 minutes has timed out and are more likely
to be reinforced. Thus, interval schedules differentially reinforce long IRTs and, thus,
result in lower rates of responding than ratio schedules (Baum, 1993; Cole, 1994, 1999;
Tanno & Sakagami, 2008).
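The asymmetry can be stated quantitatively. On a random-ratio schedule, the probability that a response is reinforced is the same whatever IRT preceded it; on a constant-probability VI schedule with mean set-up time T, the probability that the reinforcer became available during an IRT of t seconds is 1 − e^(−t/T), which grows with t. A small illustrative computation under those assumptions:

    import math

    def p_reinforced_interval(irt, mean_setup):
        """Chance that a response ending an IRT of `irt` seconds is
        reinforced on a constant-probability VI schedule (assuming
        the reinforcer was not already set up when the IRT began)."""
        return 1.0 - math.exp(-irt / mean_setup)

    def p_reinforced_ratio(n):
        """On a random-ratio n schedule, the chance is 1/n no matter
        how long the preceding IRT was."""
        return 1.0 / n

    # VI with a 120-s mean set-up time:
    print(p_reinforced_interval(2, 120))    # ~0.017 for a 2-s IRT
    print(p_reinforced_interval(60, 120))   # ~0.39 for a 60-s IRT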
Feedback Functions The second major explanation of the higher response rates on
ratio schedules focuses on the relationship between response rates and reinforcement
rates calculated over an entire experimental session or an extended period of time (e.g.,
Reed, 2007a, b). This relationship is called the feedback function because reinforcement is
considered to be the feedback or consequence of responding.
In the long run, what is the relationship between response rate and reinforcement
rate on ratio schedules? The answer is pretty straightforward. Because the only require-
ment for reinforcement on a ratio schedule is making a certain number of responses,
the faster the participant completes the ratio requirement, the faster it obtains the
next reinforcer. Thus, response rate is directly related to reinforcement rate. The higher
the response rate, the more reinforcers the participant will earn and the higher will
be its reinforcement rate. Furthermore, there is no limit to this increasing function.
No matter how rapidly the participant responds, if it can increase its response
rate even further, it will enjoy a corresponding increase in the rate of reinforcement.
Thus, the feedback function for a ratio schedule is an increasing linear function with
no limit.
How about the feedback function for an interval schedule? Interval schedules have
an upper limit on the number of reinforcers a participant can earn. On a VI 2-min
schedule, for example, if the participant obtains each reinforcer as soon as it becomes
available, it can earn a maximum of 30 reinforcers per hour. Because each reinforcer
requires a certain amount of time to be set up, there is an upper limit on the number
of reinforcers a participant can earn. A participant cannot increase its reinforcement
rate above this limit no matter how much he or she increases the rate of responding.
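Both feedback functions can be written down directly. The interval version below is a crude idealization (it ignores the time consumed by collecting reinforcers), but it captures the key contrast: the ratio function is linear with no ceiling, whereas the interval function flattens at one reinforcer per mean set-up time.

    def ratio_feedback(response_rate, n):
        """Reinforcers per unit time on a ratio-n schedule:
        proportional to response rate, with no upper limit."""
        return response_rate / n

    def interval_feedback(response_rate, mean_setup):
        """Approximate interval feedback: at low rates nearly every
        response is reinforced; at high rates the reinforcement rate
        saturates at 1/mean_setup (e.g., 30/hour on a VI 2-min)."""
        return min(response_rate, 1.0 / mean_setup)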
Doctors, lawyers, and hair dressers in private practice are all paid on a ratio
schedule with a linearly increasing feedback function. Their earnings depend on the
number of clients they see or procedures they perform each day. The more clients
they see, the more money they make, and there is no limit to this function. No matter
how much money they are making, if they can squeeze in another client, they can earn
another fee. This is in contrast to salaried employees in a supermarket or the post
office, who cannot increase their income as readily by increasing their efforts. Their
only hope is that their diligence is recognized when employees are considered for a
raise or promotion. The wage scale for salaried employees has strong interval-schedule
components.
Choice Behavior: Concurrent Schedules
The reinforcement schedules I described thus far were focused on a single response and
reinforcement of that response. The simplicity of single-response situations facilitates sci-
entific discovery, but experiments in which only one response is being measured ignore
some of the richness and complexity of the real world. Even in a simple situation like a
Skinner box, organisms engage in a variety of activities and are continually choosing
among possible alternatives. A pigeon can peck the only response key in the box, or
preen, or move about the chamber. People are also constantly having to make choices
about what to do. Should you go to the movies or stay at home and watch TV? If you
stay at home, which show should you watch and should you watch it to the end or
change the channel before the end of a show? Understanding the mechanisms of choice
is fundamental to understanding behavior because much of what we do is the result of
choosing one activity over another.
Choice situations can be rather complicated. For example, a person may have a
choice of 12 different activities (playing a video game, watching television, texting a
friend, playing with the dog, and the like), each of which produces a different type of
reinforcer according to a different reinforcement schedule. Analyzing all the factors that
control someone’s choices can be a formidable task, if not an impossible one. Therefore,
psychologists have begun experimental investigations of the mechanisms of choice by
studying simpler situations. The simplest choice situation is one that has two response
alternatives, and each response is followed by a reinforcer according to its own schedule
of reinforcement.
Numerous studies of choice have been conducted in Skinner boxes equipped with
two keys that a pigeon can peck. In the typical experiment, responding on each key
is reinforced on some schedule of reinforcement. The two schedules are in effect at the
same time (or concurrently), and the pigeon is free to switch from one key to the other.
This type of procedure is called a concurrent schedule. Concurrent schedules allow for
continuous measurement of choice because the organism is free to change back and forth
between the response alternatives at any time.
Playing slot machines in a casino is on a concurrent schedule, with lots of response
options. Each type of slot machine operates on a different schedule of reinforcement, and
you can play any of the machines. Furthermore, you are at liberty to switch from one
machine to another at any time. Closer to home, operating the remote control for your
TV is also on a concurrent schedule. You can select any one of a number of channels to
watch. Some channels are more interesting than others, which indicates that your watch-
ing behavior is reinforced on different schedules of reinforcement. As with slot machines,
you can change your selection at any time. Talking to various people at a party involves
similar contingencies. You can talk to whomever you want and move to someone else if
a conversation gets boring, indicating a reduced rate of reinforcement.
Figure 6.4 shows a laboratory example of a concurrent schedule. If the pigeon pecks
the key on the left, it receives food according to a VI 60-second schedule. Pecks on the
right key produce food according to an FR 10 schedule. The pigeon is free to peck either
side at any time. The point of the experiment is to see how the pigeon distributes
its pecks on the two keys and how the schedule of reinforcement on each key influences
its choices.
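Using the schedule sketches from earlier in the chapter, the procedure in Figure 6.4 can be expressed in a few lines. This assumes the illustrative FixedRatio and VariableInterval classes defined above; the harness itself is hypothetical.

    # Concurrent VI 60-s (left key), FR 10 (right key).
    left = VariableInterval(60.0)
    right = FixedRatio(10)

    def peck(key, now):
        """Register a peck at session time `now` (seconds) and
        return True if that peck is reinforced."""
        return left.respond(now) if key == "left" else right.respond()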
Measures of Choice Behavior
The individual’s choice in a concurrent schedule is reflected in the distribution of its
behavior between the two response alternatives. This can be measured in several ways.
One common technique is to calculate the relative rate of responding on each alternative.
The relative rate of responding on the left key, for example, is calculated by dividing the
rate of responding on the left by the total rate of responding (left key plus right key). To
express this mathematically, let’s designate BL as pecking or behavior on the left, and BR
as behavior on the right. Then, the relative rate of responding on the left is
\[
\frac{B_L}{B_L + B_R} \tag{6.1}
\]
If the pigeon pecks equally often on the two response keys, this ratio will be .5. If
the rate of responding on the left is greater than the rate of responding on the right, the
ratio will be greater than .5. On the other hand, if the rate of responding on the left is
less than the rate of responding on the right, the ratio will be less than .5. The relative
rate of responding on the right (BR) can be calculated in a comparable manner.
As you might suspect, how an organism distributes its behavior between the two
response alternatives is greatly influenced by the reinforcement schedule in effect for
each response. For example, if the same VI reinforcement schedule is available for each
response alternative, as in a concurrent VI 60-second, VI 60-second procedure, the
pigeon will peck the two keys equally often. The relative rate of responding for pecks
on each side will be .5. This result is intuitively reasonable. Because the VI schedule
available on each side is the same, there is no advantage in responding more on one
side than on the other.
By responding equally often on each side of a concurrent VI 60-second, VI 60-second
schedule, the pigeon will also earn reinforcers equally often on each side. The rela-
tive rate of reinforcement earned for each response alternative can be calculated in a
manner comparable to the relative rate of response. Let’s designate rL as the rate of rein-
forcement on the left and rR as the rate of reinforcement on the right. Then, the relative
rate of reinforcement on the left will be rL divided by the total rate of reinforcement (the
sum of the rate of reward earned on the left and the rate of reward earned on the right).
This is expressed in the formula
\[
\frac{r_L}{r_L + r_R} \tag{6.2}
\]
On a concurrent VI 60-second, VI 60-second schedule, the relative rate of reinforce-
ment for each response alternative will be .5 because the participant earns reinforcers
equally often on each side.
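As a concrete check of Equations 6.1 and 6.2 (the numbers here are invented for illustration): a pigeon making 75 pecks per minute on the left and 25 on the right has a relative rate on the left of 75/(75 + 25) = .75. In code:

    def relative_rate(left, right):
        """Equations 6.1 and 6.2: the rate on one alternative as a
        proportion of the total rate across both alternatives."""
        return left / (left + right)

    print(relative_rate(75, 25))   # 0.75
    print(relative_rate(20, 20))   # 0.5, as on concurrent VI 60, VI 60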
[Figure 6.4: Diagram of a concurrent schedule for pigeons. Pecks at the left key are reinforced according to a VI 60-second schedule of reinforcement; pecks on the right key are reinforced according to an FR 10 schedule of reinforcement. (© Cengage Learning)]
The Matching Law
As we have seen, with a concurrent VI 60-second, VI 60-second schedule, both the relative
rate of responding and the relative rate of reinforcement for each response alternative are
.5. Thus, the relative rate of responding is equal to the relative rate of reinforcement. Will
this equality also occur if the two response alternatives are not reinforced according to the
same schedule? This important question was asked by Herrnstein (1961).
Herrnstein studied the distribution of responses on various concurrent VI–VI sche-
dules in which the maximum total rate of reinforcement the pigeons could earn was
fixed at 40 per hour. Depending on the exact value of each VI schedule, different propor-
tions of the 40 reinforcers could be obtained by pecking the left and right keys. Consider,
for example, a concurrent VI 6-min, VI 2-min schedule. With such a schedule, a maxi-
mum of 10 reinforcers per hour could be obtained by responding on the VI 6-min alter-
native, and a maximum of 30 reinforcers per hour could be obtained by responding on
the VI 2-min alternative.
There was no constraint on which side the pigeons could peck on the various con-
current VI–VI schedules Herrnstein tested. The pigeons could respond exclusively on
one side or the other, or they could split their pecks between the two sides in various
proportions. As it turned out, the pigeons distributed their responses in a highly predict-
able fashion. The results, summarized in Figure 6.5, indicate that the relative rate of
responding on a given alternative was always close to the relative rate of reinforcement
earned on that alternative. If the pigeons earned a greater proportion of their reinforcers
on the left, they made a correspondingly greater proportion of their responses on that
side. The relative rate of responding on an alternative matched the relative rate of rein-
forcement on that alternative. Similar findings have been obtained in numerous other
experiments, which encouraged Herrnstein to call the relation the matching law. (For
recent reviews, see Dallery & Soto, 2013; Grace & Hucks, 2013.)
[Figure 6.5: Results from various concurrent VI–VI schedules tested with two pigeons (#055 and #231). The relative rate of responding on the left key, B_L/(B_L + B_R), is plotted against the relative rate of reinforcement on the left key, r_L/(r_L + r_R). Throughout the range of schedules, the relative rate of responding nearly equals (matches) the relative rate of reinforcement. (Based on “Relative and Absolute Strength of Response as a Function of Frequency of Reinforcement,” by R. J. Herrnstein, 1961, Journal of the Experimental Analysis of Behavior, 4, pp. 267–272.)]

There are two common mathematical expressions of the matching law. In one
formulation, rate of responding or behavior (B) and rate of reinforcement (r) on one
Choice Behavior: Concurrent Schedules 167
Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
choice alternative are expressed as a proportion of total response and reinforcement
rates, as follows:
\[ \frac{B_L}{B_L + B_R} = \frac{r_L}{r_L + r_R} \tag{6.3} \]
The second form of the matching law is simpler but mathematically equivalent to
Equation 6.3. In the second version, the rates of responding and reinforcement on one
alternative are expressed as a proportion of the rates of responding and reinforcement on
the other alternative, as follows:
\[ \frac{B_L}{B_R} = \frac{r_L}{r_R} \tag{6.4} \]
Both mathematical expressions of the matching law represent the same basic principle,
namely that relative rates of responding match relative rates of reinforcement.
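As a concrete illustration of Equation 6.3, the two proportions can be computed directly from session totals. A minimal sketch in Python; the counts are invented for illustration and are not data from Herrnstein (1961):

```python
# Minimal check of the matching law (Equation 6.3) from session totals.
# All counts below are illustrative, not data from Herrnstein (1961).
B_L, B_R = 1800, 600   # responses on the left and right keys
r_L, r_R = 30, 10      # reinforcers earned on each key

relative_responding = B_L / (B_L + B_R)      # 0.75
relative_reinforcement = r_L / (r_L + r_R)   # 0.75

# Matching holds to the extent that these two proportions are equal.
print(relative_responding, relative_reinforcement)
```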
The matching law has had a profound impact on the way in which scientists think
about instrumental behavior. The major insight provided by the matching law is that the
rate of a particular response does not depend on the rate of reinforcement of that
response alone. Whether a behavior occurs frequently or infrequently depends not only
on its own schedule of reinforcement but also on the rates of reinforcement of other
activities the individual may perform. A given simple reinforcement schedule that is
highly effective in a reward-impoverished environment may have little impact if there
are numerous alternative sources of reinforcement. Therefore, how we go about training
and motivating a particular response (e.g., studying among high school students) has to
take into account other activities and sources of reinforcement the individuals have at
their disposal.
The importance of alternative sources of reinforcement has provided useful
insights into problematic behaviors such as unprotected sex among teenagers, which
results in unwanted pregnancies, abortions, and sexually transmitted diseases. Based
on the concepts of the matching law, Bulow and Meller (1998) predicted that “adoles-
cent girls who live in a reinforcement-barren environment are more likely to engage in
sexual behaviors than those girls whose environments offer them a fuller array of rein-
forcement opportunities” (p. 586). To test this prediction, they administered a survey
to adolescent girls that asked them about the things they found reinforcing. From these
data, the investigators estimated the rates of sexual activity and contraceptive use and
the rates of reinforcement derived from sexual and other activities. These data were
then entered into the equations of the matching law. The results were impressive. The
matching law predicted the frequency of sexual activity with an accuracy of 60% and
predicted contraceptive use with 67% accuracy. These findings suggest that efforts
to reduce unprotected sex among teenagers have to consider not only their sexual
activities but other things they may learn to enjoy (such as playing a sport or musical
instrument). (For a review of translational research involving the matching law, see
Jacobs, Borrero, & Vollmer, 2013.)
Undermatching, Overmatching, and Response Bias The matching law clearly
indicates that choices are not made capriciously. Rather, choice is an orderly function
of rates of reinforcement. Although the matching law has enjoyed considerable success
and has guided much research over the past 50 years, relative rates of responding do
not always match relative rates of reinforcement exactly.
Most instances in which choice behavior does not correspond perfectly to the
matching relation can be accommodated by the generalized form of the matching law
(Baum, 1974). The generalized matching law has two parameters, b and s, added to
Equation 6.4 and is expressed as follows:
\[ \frac{B_L}{B_R} = b \left( \frac{r_L}{r_R} \right)^{s} \tag{6.5} \]
The exponent s represents sensitivity of the choice behavior to the relative rates of
reinforcement for the response alternatives or the discriminability of the alternatives.
When perfect matching occurs, s is equal to 1. The most common deviation from perfect
matching involves reduced sensitivity of the choice behavior to the relative rates of rein-
forcement. Such results are referred to as undermatching and can be accommodated by
Equation 6.5 by making the exponent s less than one. Notice that if the exponent s is less
than 1, the term representing relative reinforcer rates, (rL/rR), is raised to a power that pulls its value toward 1, indicating reduced sensitivity to the relative rate of reinforcement.
Numerous variables have been found to influence the sensitivity parameter, includ-
ing the species tested, the effort or difficulty involved in switching from one alternative
to the other, and the details of how the schedule alternatives are constructed.
The parameter b in Equation 6.5 represents response bias. In Herrnstein’s original
experiment (and in most others that have followed), animals chose between two responses
of the same type (pecking one or another response key), and each response was reinforced
by the same type of reinforcer (brief access to food). Response bias occurs when the
response alternatives require different amounts of effort or if the reinforcer provided for
one response is much more attractive than the reinforcer for the other response. A prefer-
ence (or bias) for one response or one reinforcer over the other results in more responding
on the preferred side and is represented by higher values of the bias parameter b.
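Because taking logarithms of Equation 6.5 yields a straight line, log(BL/BR) = s·log(rL/rR) + log(b), the sensitivity and bias parameters can be estimated by ordinary linear regression across schedule conditions. A minimal sketch, with invented condition totals:

```python
import numpy as np

# Fit the generalized matching law (Equation 6.5) in log form:
# log(B_L/B_R) = s * log(r_L/r_R) + log(b).
# Each array entry is one schedule condition; the numbers are illustrative.
B_L = np.array([1500., 1100., 800., 500., 300.])   # left-key responses
B_R = np.array([ 300.,  500., 800., 1100., 1500.]) # right-key responses
r_L = np.array([  35.,   25.,  20.,   12.,    6.]) # left reinforcers
r_R = np.array([   5.,   15.,  20.,   28.,   34.]) # right reinforcers

x = np.log(r_L / r_R)
y = np.log(B_L / B_R)
s, log_b = np.polyfit(x, y, 1)  # slope = sensitivity s, intercept = log(b)

# s < 1 indicates undermatching; b different from 1 indicates response bias.
print(f"sensitivity s = {s:.2f}, bias b = {np.exp(log_b):.2f}")
```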
Mechanisms of the Matching Law
The matching law describes how organisms distribute their responses in a choice situa-
tion, but it does not explain what mechanisms are responsible for these choices. It is a
descriptive law of nature rather than a mechanistic law. Factors that may be responsible
for matching in choice situations have been the subject of continuing experimentation
and theoretical debate (see Davison & McCarthy, 1988; Grace & Hucks, 2013).
The matching law is stated in terms of rates of responding and reinforcement aver-
aged over the entire duration of experimental sessions. It ignores when individual
responses are made. Some theories of matching are similar in that they ignore what
might occur at the level of individual responses. Such explanations are called molar the-
ories. Molar theories explain aggregates of responses. They deal with the distribution of
responses and reinforcers in choice situations during an entire experimental session.
In contrast to molar theories, other explanations of the matching relation operate on
a shorter time frame and focus on what happens at the level of individual responses.
Such explanations are called molecular theories and view the matching relation as the
net result of these individual choices. I previously described molecular and molar expla-
nations of why ratio schedules produce higher response rates than interval schedules.
The explanation that emphasized the reinforcement of inter-response times was a molec-
ular or local account. In contrast, the explanation that emphasized feedback functions of
ratio and interval schedules was a molar theory. (For a detailed discussion of molecular
versus molar approaches to the analysis of behavior, see Baum, 2002.)
Maximizing Rates of Reinforcement The most extensively investigated explana-
tions of choice behavior are based on the intuitively reasonable idea that organisms dis-
tribute their actions among response alternatives so as to receive the maximum amount
of reinforcement possible. According to this idea, animals switch back and forth between
response alternatives so as to receive as many reinforcers as they possibly can. The idea
that organisms maximize reinforcement has been used to explain choice behavior at both
molecular and molar levels of analysis.
Molecular Maximizing According to molecular theories of maximizing, organisms
always choose whichever response alternative is most likely to be reinforced at a given
moment in time. An early version of molecular maximizing (e.g., Shimp, 1969) stated that
when two schedules (A and B) are in effect simultaneously, the participant will switch
from Schedule A to Schedule B when the probability of reinforcement on Schedule B
becomes greater than on Schedule A. The participant will switch back to A when the
probability of reinforcement on A becomes greater than on B. Thus, this model claims
that the matching relation is a byproduct of prudent switching behavior that tracks
momentary changes in the probability of reinforcement. Detailed studies of the patterns
of switching from one response to another have not always supported this type of molec-
ular maximizing mechanism. However, scientists have remained interested in molecular
explanations of matching.
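One way to see how a momentary switching rule can produce matching is to simulate it. The sketch below is a toy version of this idea, not a model taken from the studies cited: it assumes reinforcers are set up on each key as a Poisson process, so the probability that a reinforcer is waiting grows with the time since that key was last pecked, and the simulated bird always pecks the key that is currently more likely to pay off.

```python
import math

# Toy momentary-maximizing simulation on a concurrent VI 60-s, VI 180-s
# schedule. Assumption: with exponential intervals, the probability that
# a reinforcer has been set up on a key after t seconds of neglect is
# p = 1 - exp(-t / VI_mean), and pecking a key restarts its interval.
def p_armed(elapsed_sec, vi_mean_sec):
    return 1.0 - math.exp(-elapsed_sec / vi_mean_sec)

vi_left, vi_right = 60.0, 180.0   # mean programmed intervals
t_left = t_right = 0.0            # seconds since each key was last pecked
pecks = {"left": 0, "right": 0}

for second in range(1000):        # one peck per simulated second
    t_left += 1.0
    t_right += 1.0
    # Momentary maximizing: peck whichever key is more likely to pay off.
    if p_armed(t_left, vi_left) >= p_armed(t_right, vi_right):
        pecks["left"] += 1
        t_left = 0.0
    else:
        pecks["right"] += 1
        t_right = 0.0

print(pecks)  # roughly 750 left, 250 right
```

With these values the rule settles into a repeating pattern of about three left pecks per right peck, which is also what matching predicts, because the programmed reinforcement rates stand in a 3:1 ratio.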
According to a more recent molecular account, a situation involving two response
Alternatives A and B actually involves four different behavioral options: staying with Alter-
native A, switching from A to B, staying with Alternative B, and switching from B to A. Each
of these four behavioral options gets reinforced at various times. The relative distribution
of responses on A and B is presumed to depend on the relative rate of reinforcement for staying on each side versus switching from one side to the other (MacDonall, 2000, 2005). (For other analyses of local reinforcement effects in choice, see Davison & Baum, 2003; Krägeloh, Davison, & Elliffe, 2005.)
BOX 6.2
The Matching Law and Complex Human Behavior
The matching law and its implica-
tions have been found to apply to a
wide range of human behaviors, in-
cluding social conversation (Borrero
et al., 2007), courtship and mate
selection (Takeuchi, 2006), and the
choices that lead to substance abuse
(e.g., Frisher & Beckett, 2006;
Vuchinich & Tucker, 2006). In an
interesting recent study, Vollmer and
Bourret (2000) examined the choices
that college basketball players made
during the course of intercollegiate
games. A basketball player can elect
to shoot at the basket from an area
close to the basket and thereby get
two points or shoot from an area
farther away and thereby get three
points. Teams compile statistics on
the number of two- and three-point
shots attempted by individual players.
These data provide information about
the relative rates of selecting each
response alternative. The team statis-
tics also include information about
the success of each attempt, and these
data can be used to calculate the rate
of reinforcement for each response
alternative. Vollmer and Bourret
examined the data for 13 players on
the men’s team and 13 players on the
women’s team of a large university
and found that the relative choice of
the different types of shots was pro-
portional to the relative rates of
reinforcement for those shots. Thus,
the choice behavior of these athletes
during regular games followed the
matching law.
The matching law has also been
used to analyze the choice of plays in
professional football games of the
American National Football League
(Reed, Critchfield, & Martins, 2006).
Data on running plays versus passing
plays were analyzed in terms of the
number of yards that were gained as a
consequence of each play. This way of
looking at the game provided
response rates (frequency of one or
the other type of play) and rein-
forcement rates (yards gained). The
generalized matching law accounted
for 75% of the choice of plays. The
sensitivity parameter showed that the
relative frequency of passing versus
running plays undermatched the rel-
ative yardage gained by these plays.
Thus, the choice of plays did not take
full advantage of the yardage gains
that could have been obtained. The
response bias parameter in the gen-
eralized matching law indicated that
there was a significant bias in favor of
running plays. Interestingly, teams
whose play calling followed the
matching law more closely had better
win records than teams that signifi-
cantly deviated from matching.
Molar Maximizing Molar theories of maximizing assume that organisms distribute
their responses among various alternatives so as to maximize the amount of reinforce-
ment they earn over the long run. What is long enough to be considered a long run is
not clearly specified. However, in contrast to molecular theories, molar theories focus on
aggregates of behavior over some period of time, usually the total duration of an experi-
mental session, rather than on individual choice responses.
Molar maximizing theory was originally formulated to explain choice on concurrent
schedules made up of ratio components. In concurrent ratio schedules, animals rarely
switch back and forth between response alternatives. Rather, they respond exclusively
on the ratio component that requires the fewest responses. On a concurrent FR 20, FR 10
schedule, for example, the pigeon is likely to respond only on the FR 10 alternative. In
this way, it maximizes its rate of reinforcement with the least effort.
In many situations, molar maximizing accurately predicts the results of choice pro-
cedures. However, certain findings present difficulties. One difficulty arises from the
results of concurrent VI–VI schedules. On a concurrent VI–VI schedule, participants
can obtain close to all of the available reinforcers on both VI options provided they occa-
sionally sample each alternative. Therefore, the total amount of reinforcement obtained
on a concurrent VI–VI schedule can be close to the same despite wide variations in how
responding is distributed between the two alternatives. The matching relation is only one
of the many different possibilities that yield close to maximal rates of reinforcement on
concurrent VI–VI schedules.
Another challenge for molar maximizing is provided by studies involving a choice
between a VR and a VI schedule. On a VR schedule, the participant can obtain rein-
forcement at any time by making the required number of responses. By contrast, on a
VI schedule, the participant only has to respond occasionally to obtain close to the maxi-
mum number of reinforcers possible. For maximum return on a concurrent VR–VI
schedule, participants should concentrate their responses on the VR alternative and
respond only occasionally on the VI component. Evidence shows that both pigeons and
college students favor the VR component but not always as strongly as predicted by
molar maximizing (e.g., Heyman & Herrnstein, 1986; Savastano & Fantino, 1994).
Melioration The third major mechanism of choice, melioration, operates on a time
scale between molar and molecular mechanisms. Instead of aggregating data over an
entire session or focusing on individual responses, melioration theory focuses on local
rates of responding and reinforcement.
Local rates are calculated only over the time period that a participant devotes to a
particular choice alternative. With two options (A and B), for example, the local rate of
responding on A is calculated by dividing the frequency of responses on A by the time
the participant spends on side A. This contrasts with the overall rate, which is calculated
over the entire duration of an experimental session.
Local rates are always higher than overall rates. For example, if you obtain 10 reinfor-
cers in a 60-minute session by responding on the left response key, the overall rate of rein-
forcement on the left will be 10 per hour. However, if you only spend 15 minutes on the left
side, the local rate of reinforcement on the left will be 10 per 15 minutes or 40 per hour.
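The distinction is easy to express in code; a direct transcription of the worked example above:

```python
# Overall versus local reinforcement rates, per the example in the text:
# 10 reinforcers earned on the left key during a 60-minute session, with
# only 15 of those minutes actually spent on the left side.
reinforcers_left = 10
session_hours = 60 / 60       # total session time
time_on_left_hours = 15 / 60  # time devoted to the left alternative

overall_rate = reinforcers_left / session_hours      # 10 per hour
local_rate = reinforcers_left / time_on_left_hours   # 40 per hour
print(overall_rate, local_rate)
```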
The term melioration means making something better. Melioration theory predicts
that participants will shift their behavior toward whichever choice alternative provides the
higher (or better) local rate of reinforcement. However, any change in time spent on a
choice alternative will probably change the local rate of reinforcement for that alternative.
Melioration assumes that adjustments in the distribution of behavior between choice alter-
natives will continue until the participant is obtaining the same local rate of reinforcement
on each alternative (Herrnstein, 1997; Vaughan, 1981). Once this is achieved, there is no
incentive for any further changes in response allocation. It can be shown mathematically
that when participants distribute their responses so as to obtain the same local rate of rein-
forcement on each alternative, they are behaving in accordance with the matching law.
Therefore, the mechanism of melioration results in matching. (For a human study of
choice consistent with melioration, see Madden, Peden, & Yamaguchi, 2002.)
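That final step can be made explicit. If tL and tR denote the times spent on each alternative and responding proceeds at roughly the same pace on either side (so that BL/BR = tL/tR), then equal local rates of reinforcement imply

\[ \frac{r_L}{t_L} = \frac{r_R}{t_R} \quad\Longrightarrow\quad \frac{B_L}{B_R} = \frac{t_L}{t_R} = \frac{r_L}{r_R}, \]

which is Equation 6.4, the matching law.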
Complex Choice and Self-Control
In a standard concurrent schedule of reinforcement, two (or more) response alternatives
are available at the same time, and switching from one to the other can occur at any
time. At a potluck dinner, for example, if you don’t like what you are eating, you can
switch at any time to something else. Similarly, you can visit one or another booth at a
county fair and make a new selection at any time. That is not the case when you choose
a movie at a multiplex. Once you have paid for your ticket and started watching the movie,
you cannot change your mind and go see another one at any time. Choosing one movie
makes the others unavailable until you buy another ticket.
Many complex human decisions limit your options once you have made a choice.
When you are finishing high school and contemplating where to go to college, you may
have a number of options available. However, after you have selected and enrolled in a
particular college, the other schools are no longer available until the next semester or
next year. Choosing where to go on vacation or which car to buy similarly involves
choice with commitment. Once the selection is made, the other alternatives are no longer
available for a while.
Concurrent-Chain Schedules
To study how organisms make choices that involve commitment to one alternative or
the other, investigators developed the concurrent-chain schedule of reinforcement
(Kyonka & Grace, 2010; Mazur, 2006).
A concurrent-chain schedule of reinforcement involves two stages or links
(Figure 6.6). The first is called the choice link. In this link, the participant is allowed to
choose between two schedule alternatives by making one of two responses. In the exam-
ple diagrammed in Figure 6.6, the pigeon makes its choice by pecking either the left or
the right response key. Pecking the left key produces Alternative A, the opportunity to
peck the left key on a VI 3-min schedule of reinforcement. If the pigeon pecks the right
key in the choice link, it produces Alternative B, which is the opportunity to peck the
right key on an FI 3-min schedule.

FIGURE 6.6 Diagram of a concurrent-chain schedule. Pecking the left key in the choice link activates reinforcement Schedule A (VI 3 min) in the terminal link; pecking the right key in the choice link activates reinforcement Schedule B (FI 3 min) in the terminal link.
Responding on either key during the choice link does not yield food. The opportu-
nity for reinforcement occurs only after the initial choice has been made and the pigeon
has entered the terminal link. Another important feature of the concurrent-chain sched-
ule is that once the participant has made a choice, it is stuck with that alternative until
the end of the terminal link of the schedule or the end of the trial. Thus, concurrent-
chain schedules involve choice with commitment.
The pattern of responding that occurs in the terminal component of a concurrent-
chain schedule is characteristic of whatever schedule of reinforcement is in effect during
that component. In our example, if the pigeon selected Alternative A, its pattern of peck-
ing during the terminal component will be similar to the usual response pattern for a VI
3-min schedule. If the pigeon selected Alternative B, its pattern of pecking during the
terminal component will be characteristic of an FI 3-min schedule.
We have all heard that variety is the spice of life. Studies of concurrent-chain sche-
dules can tell us whether such a claim is supported by empirical evidence. If variety is
the spice of life, then participants should prefer a variable schedule over a fixed schedule
that yields the same overall rate of reinforcement. Studies of concurrent-chain schedules
with VI and FI terminal components have shown that participants prefer the variable-
schedule alternative. In fact, pigeons favor the VI alternative even if the VI schedule
requires more time on average for the reinforcer to become available than the FI alterna-
tive (e.g., Andrzejewski et al., 2005). This indicates that variety is indeed the spice of life
in concurrent-chain schedules.
As I noted, the consequence of responding during the choice link of a concurrent-chain schedule is not the primary reinforcer (food). Rather, it is entry into one of the terminal
links, each of which is typically designated by a particular color on the pecking key.
Thus, the immediate consequence of an initial-link response is a stimulus that is associ-
ated with the terminal link that was chosen. Because that stimulus is present when the
primary reinforcer is provided, the terminal link stimulus becomes a conditioned rein-
forcer. Thus, one may regard a concurrent-chain schedule as one in which the initial-link
responses are reinforced by the presentation of a conditioned reinforcer. Differences in
the value of the conditioned reinforcer will then determine the relative rate of each
choice response in the initial link. Because of this, concurrent-chain schedules provide
an important tool for the study of conditioned reinforcement (Jimenez-Gomez & Shahan,
2012; Savastano & Fantino, 1996).
Although many studies of concurrent-chain schedules represent efforts to determine
how organisms select between different situations represented by the terminal links, the
consensus of opinion is that choice behavior is governed by both the terminal link sche-
dules and whatever schedule is in effect in the initial link. Several different models have
been proposed to explain how variables related to the initial and terminal links act in
concert to determine concurrent-choice performance (e.g., Christensen & Grace, 2010).
Self-Control Choice and Delay Discounting
Self-control is an especially important form of complex choice. Self-control is a matter of
choosing a large delayed reward over an immediate small reward. Should you get out of
bed when your alarm rings and go to class, or turn off the alarm and sleep an extra
hour? Going to class will help your grade point average and help you obtain a college
degree. Those are very significant benefits, but you cannot enjoy them until some time
in the future. Staying in bed for an extra hour provides a much smaller benefit but one
that you can enjoy immediately. Similarly, self-control in eating involves selecting the
large delayed reward of maintaining a healthy weight over the immediate small reward
of eating a piece of cake. Many choices involved in a healthy lifestyle require selecting a
larger delayed reward (being healthy) over a smaller, more immediate reward (getting a
cup of coffee with friends instead of going to the gym).
Why is it so difficult to be motivated to work for large but delayed rewards? That is
the crux of the problem of self-control. The answer, which originated in early studies of
concurrent-chain schedules (Rachlin & Green, 1972), is based on the concept of delay
discounting. Delay discounting is one of the major contemporary advances in our think-
ing about reinforcement and refers to the idea that the value of a reinforcer declines as a
function of how long you have to wait to obtain it.
If I ask you whether you would prefer $25 today or $25 next week, there is no doubt
that you will choose to get the money today. This shows that the value of $25 is less if
you have to wait a week to get it. How much less? We can determine that by posing a
series of choices that pit $25 next week against various smaller amounts today. Given
such a series of choices, we might determine that for you getting $10 today is equivalent
to getting $25 next week. Such a result would show that for you one week’s delay results
in the value of $25 being reduced to $10.
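Anticipating the hyperbolic decay function introduced below as Equation 6.6, V = M/(1 + kD), such an indifference point pins down the discounting rate parameter k. With the delay D measured in days,

\[ 10 = \frac{25}{1 + 7k} \quad\Longrightarrow\quad 1 + 7k = 2.5 \quad\Longrightarrow\quad k \approx 0.21\ \text{per day.} \]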
Delay discounting functions have been examined in numerous studies with human
participants as well as various animal species (e.g., Calvert, Green, & Myerson, 2011;
Madden & Bickel, 2010). One cannot ask laboratory animals hypothetical questions
about their choice between monetary reinforcers that differ in amount and delay. Rather,
ingestible reinforcers have to be used. When ingestible reinforcers are also tested with
human participants, similar discounting functions are obtained.
Figure 6.7 shows the results of a study with undergraduate students given choices
between different amounts and delays of their preferred juice (Jimura et al., 2011).
Notice that the subjective value of both 16 ml and 8 ml of juice declined with increas-
ing delays. This illustrates the basic phenomenon of delay discounting. Figure 6.7 also
shows that the smaller (8 ml) reward lost its value faster than the larger (16 ml)
reward.
FIGURE 6.7 The subjective value of 16 ml and 8 ml of juice as a function of delay (in seconds) in college students. Curves represent best-fitting hyperboloid functions (based on Jimura et al., 2009).

Delay discounting is a well-established phenomenon. There is no longer any doubt
that reinforcers lose their value the longer one has to wait for them. However, the exact
mathematical form of the value discounting function has taken a bit of empirical effort to
pin down. The current consensus is that the value of a reinforcer (V) is directly related to
reward magnitude (M) and inversely related to reward delay (D), according to the formula
\[ V = \frac{M}{1 + kD} \tag{6.6} \]
where k is the discounting rate parameter (Mazur, 1987). Equation 6.6 is called the
hyperbolic decay function. (For a generalized version of the hyperbolic decay function,
see Grace, 1999.) According to this equation, if the reinforcer is delivered with no delay
(D = 0), the value of the reinforcer is directly related to its magnitude (larger reinforcers
have larger values). The longer the reinforcer is delayed, the smaller is its value.
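Equation 6.6 is simple enough to evaluate directly. A minimal sketch in Python; the discounting rate k = 0.2 is an illustrative value, not an empirical estimate:

```python
# Hyperbolic decay (Equation 6.6): V = M / (1 + k * D).
def discounted_value(magnitude, delay, k=0.2):
    """Subjective value of a reward of the given magnitude after a delay."""
    return magnitude / (1.0 + k * delay)

for delay in (0, 1, 5, 10, 30):
    print(delay, round(discounted_value(25.0, delay), 2))
# 25.0, 20.83, 12.5, 8.33, 3.57: the full magnitude at D = 0, a steep
# initial drop, and an ever more gradual decline -- the hyperbolic signature.
```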
I noted earlier that the concept of delay discounting provides the key to understand-
ing the problem of self-control, which involves choice between a small reward available
soon versus a much larger reward available after a long delay. But, how can that be?
Given that reinforcers rapidly lose their value with longer delays, won’t participants
always select the small, more immediate reward? Not necessarily. Different results occur
depending on the size of the reinforcer and how rapidly its value is discounted.
In analyzing the problem of self-control, it is useful to plot delay discounting func-
tions backwards, as shown in Figure 6.8. In this figure, the vertical axis again shows the
perceived value of the reinforcer, and time is represented by the horizontal axis. The fig-
ure represents the value of a large and a small reward as a function of how long you have
to wait to receive the reward. The bar for the large reward is to the right of the bar for
the small reward because you have to wait longer to receive the large reward. T1 and T2
identify different points in time when you might make your choice response.
The usual self-control dilemma occurs if your choice is made at T1. At T1 there is a
very short wait for the small reward and a longer wait for the large reward. Waiting for
each reward reduces its value. Because reward value decreases rapidly at first, given the
delays involved at T1, the value of the large reward is smaller than the value of the small
reward. Hence, the model predicts that if the choice occurs at T1, you will select the
small reward (the impulsive option). However, the discounting functions cross over
with further delays. The value of both rewards is less at T2 than at T1 because T2 involves
longer delays. However, notice that at T2 the value of the large reward is now greater
than that of the small reward. Therefore, a choice at T2 would have you select the large
reward (the self-control option).
FIGURE 6.8 Hypothetical relations between reward value and waiting time to reward delivery for a small reward and a large reward presented sometime later. Time is represented on the horizontal axis and reward value on the vertical axis; T1 and T2 mark two possible choice points.

The delay discounting functions illustrated in Figure 6.8 predict the results of
numerous studies of self-control. Most importantly, the functions show that increasing
the delay to both the small and large reward (by moving from T1 to T2) makes it easier
to exhibit self-control. Because the delay discounting functions cross over with longer
delays, the larger delayed reward becomes more attractive with longer delays. (For a
broader discussion of these issues, see Logue, 1995; Rachlin, 2000.)
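The crossover in Figure 6.8 follows directly from Equation 6.6 and can be demonstrated numerically. In the sketch below, the magnitudes, delays, and discounting rate are all illustrative; adding a common front-end delay to both options (moving the choice point from T1 back to T2) reverses the preference:

```python
# Preference reversal under hyperbolic discounting (cf. Figure 6.8).
def value(magnitude, delay, k=0.5):
    return magnitude / (1.0 + k * delay)

small, small_delay = 5.0, 1.0    # small reward, available soon
large, large_delay = 10.0, 5.0   # large reward, available later

for front_end in (0.0, 10.0):    # choice at T1 (0 s) versus T2 (10 s added)
    v_small = value(small, small_delay + front_end)
    v_large = value(large, large_delay + front_end)
    pick = "small (impulsive)" if v_small > v_large else "large (self-control)"
    print(f"added delay {front_end:4.0f}: small={v_small:.2f}, "
          f"large={v_large:.2f} -> choose {pick}")
```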
Delay Discounting in Human Affairs As I noted above, the parameter k in Equa-
tion 6.6 indicates how rapidly reward value declines as function of delay. The steeper a
person’s delay discounting function is, the more difficulty that person will have in exhi-
biting self-control because the larger, more remote reward will seem much less valuable.
Lack of self-control is potentially associated with a wide range of human problems.
Engaging in unprotected sex, drinking too much at a party, driving while intoxicated,
and throwing a punch instead of walking away from an argument are all examples of
lack of self-control. Are people who engage in such problematic behaviors more apt to
discount delayed rewards? This question has stimulated a great deal of research on
human delay discounting (e.g., Madden & Bickel, 2010).
A critical issue in this area is the stability of delay discounting functions. Only if
such functions are highly stable could they be used to better understand repeated pat-
terns of risky behavior. Kirby (2009) tested delay discounting for hypothetical monetary
reinforcers in college students on two occasions. For one group, the two assessments
were separated by 5 weeks. For another group, the two assessments were more than a
year apart (57 weeks). Test–retest reliability dropped a bit from 5 to 57 weeks. However,
the test–retest reliability of delay discounting rates was about the same as the test–retest
reliability of standard personality traits (Kirby, 2009). Thus, individual differences in
reward discounting can be treated as a personality variable.
Other studies have examined changes in delay discounting across different ages. In
one study, for example, the discounting of hypothetical monetary rewards was measured
in college students and older adults (mean 71 years of age). The results are presented in
Figure 6.9. Notice that the value of monetary rewards declines over a scale of months. This contrasts with the much faster discounting, over seconds, that we encountered for ingest-
ible rewards (Figure 6.7). Figure 6.9 also shows that the rate of discounting is substantially
slower among older adults than among young adults. Interestingly, no differences were
found in the rate of discounting of a consumable reinforcer (juice) at these ages.
FIGURE 6.9 Delay discounting for hypothetical monetary rewards among college students and senior adults (mean age 71). Relative subjective value is plotted against delay in months, in separate panels for young adults and senior adults. Notice that the rate of discounting is significantly slower among the older participants (based on Jimura et al., 2011).

In addition to age, investigators have examined numerous variables to see how they
may be related to the rate of reward discounting (see Odum & Baumann, 2010, for a
review). These studies have shown that individuals with higher IQ, higher educational
level, and higher income tend to show slower reward discounting. Interestingly, grade
point average and grades in specific courses are also negatively correlated with the rate
of reward discounting (e.g., Kirby, Winston, & Santiesteban, 2005). Students who are less
apt to discount delayed rewards do better in their course work. In another study, college
students who engaged in unprotected sex were found to have steeper discounting func-
tions than those who used condoms. These and related studies show that reward discounting reflects an important feature of behavior that is relevant to self-control in a broad range of situations (Critchfield & Kollins, 2001).
Perhaps the most widely investigated aspect of human behavior that has been
examined from the perspective of delay discounting is drug addiction and drug abuse
(Yi, Mitchell, & Bickel, 2010). For example, a recent meta-analysis of drug users con-
cluded that individuals with addictive behaviors involving alcohol, tobacco, stimulants,
and opiates all showed significantly steeper discounting functions than control partici-
pants who were not using these drugs (MacKillop et al., 2011). Although provocative,
such evidence does not reveal the direction of causation. It may be that individuals who
steeply discount delayed rewards are more apt to consume drugs of abuse. Alternatively,
drug addiction may cause more rapid discounting of rewards.
Whether delay discounting contributes to the development of drug abuse or is a
symptom of it may be determined by longitudinal studies or experiments with laboratory
animals. Both strategies have implicated rapid discounting as causal to drug abuse. For
example, Audrain-McGovern and colleagues (2009) studied a sample of 947 teenagers
from ages 15 to 20 and found that those who showed steeper delay discounting
functions at the age of 15 were more likely to take up smoking. In another recent
study, students and their parents and teachers were interviewed each year from the 6th
to the 11th grade to obtain information about self-control, attentional difficulties, and
drug use. Greater difficulties with self-control in grade 6 were predictive of alcohol, mari-
juana, and cigarette use in high school (King et al., 2011). Interestingly, in this study
attentional difficulties were also predictive of later drug use.
In what is probably the most ambitious and comprehensive study of the relationship
between self-control early in life and subsequent behavior, nearly 1,000 children were
tracked from birth to the age of 32 in New Zealand (Moffitt et al., 2011). A number of
different measures of self-control were obtained during the first 10 years of life and
related to various life outcomes during the ensuing 22 years. Higher levels of self-
control in childhood were predictive of better health, lower rates of drug use, higher
income levels, lower rates of single parenting, and lower rates of criminal behavior.
A sample of these results is presented in Figure 6.10.
Experiments with laboratory animals have confirmed that rate of reward discounting
is related to drug intake and possible drug abuse (see Carroll et al., 2010, for a review).
Typically rats serve as participants in these experiments, although other species have
also been studied. Delay discounting is first assessed using food as the reinforcer. Based
on these tests, the rats are categorized as showing a steep or shallow discounting rate. The
two groups are then tested using various drug self-administration procedures. Rats that
show steep delay discounting have been found to subsequently take in more alcohol or
cocaine, show greater escalation of drug intake when given the opportunity, and are
more likely to relapse in their drug consumption following extinction. Thus, delay dis-
counting is predictive of many aspects of drug abuse in these laboratory models.
Can Self-Control be Trained? As we have seen, lack of self-control is associated
with serious negative life outcomes. How might these be avoided? One possibility is to
establish self-control through training. In fact, some have suggested that self-control is
a critical component of socialization and emotional adjustment.
Evidence suggests that self-control can be trained. In one study (Eisenberger &
Adornetto, 1986), for example, second- and third-grade students in a public elementary
school were first tested for self-control by being asked whether they wanted to get 2¢ imme-
diately or 3¢ at the end of the day. Children who elected the immediate reward were given 2¢.
For those who elected the delayed reward, 3¢ was placed in a cup to be given to the child later.
The procedure was repeated eight times to complete the pretest. The children then received
three sessions of training with either immediate or delayed reward.
During each training session, various problems were presented (counting objects on
a card, memorizing pictures, and matching shapes). For half the students, correct
responding was reinforced immediately with 2¢. For the remaining students, correct
responses resulted in 3¢ being placed in a cup that was given to the child at the end of
the day. After the third training session, preference for a small immediate reward versus
a larger delayed reward was measured as in the pretest. Provided that the training tasks
involved low effort, training with delayed reward increased the subsequent preference of
the children for the larger delayed reward.
Another strategy for training self-control involves a shaping procedure in which the
large reward is initially presented without a delay, and the delay is then gradually increased
across trials (e.g., Schweitzer & Sulzer-Azaroff, 1988). When there is no delay in a choice
between a small and a large reward, the participant invariably selects the larger reward.
This choice can be sustained if the delay to the large reward is increased in small steps.
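The logic of such a fading procedure can be written as a simple rule: lengthen the delay to the large reward by a small step only after the learner again chooses the large reward. The step size and the rule itself are illustrative assumptions, not the published procedure:

```python
# Sketch of delay fading for self-control training. Assumption: the delay
# to the large reward grows by a fixed step after each trial on which the
# learner still chooses the large reward, and otherwise holds steady.
def next_delay(current_delay_sec, chose_large, step_sec=2.0):
    return current_delay_sec + step_sec if chose_large else current_delay_sec

delay = 0.0
for chose_large in (True, True, True, False, True):
    delay = next_delay(delay, chose_large)
print(delay)  # 8.0 seconds after four successful choices
```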
Another technique that facilitates self-control is to introduce a distracting task dur-
ing the delay to the large reward or distract attention from the large reward during the
delay period (e.g., Mischel, Ebbesen, & Zeiss, 1972). In some examples of successful
training, providing an intervening response is combined with gradually increasing the
delay to the large reward (e.g., Dixon et al., 1998; Dixon & Holcomb, 2000).
Although investigators have identified some effective procedures for training self-
control, much work remains to be done in this area. Future research needs to identify
what factors are critical for the learning of self-control and how to maximize the effec-
tiveness of those variables. In addition, we need to better understand what factors are
responsible for the generalization of self-control skills and how to promote generalization
of self-control from one situation to another.
FIGURE 6.10 Relationship between childhood self-control (in quintiles, from low to high) and adult health outcomes, expressed as Z-scores on a poor physical health index and a substance dependence index (based on Moffitt et al., 2011).
BOX 6.3
Neuroeconomics: Imaging Habit and Executive Control
In its simplest form, instrumental
behavior is shaped by rewards and
stimuli that signal reward. This
requires a mechanism to represent
reward (its magnitude and valence)
and a system for encoding reward
signals. By monitoring how these
events are related to behavior, these
neural mechanisms can shape what
we do, fostering adaptive responses
(go) and inhibiting maladaptive
behavior (no-go). These processes
alone seem sufficient to guide the
development of simple habits, where
reward is immediate and response
options are constrained.
Natural situations, though, are
typically far more complex than this,
allowing for a range of possible
responses and outcomes, both imme-
diate and delayed. To organize
behavior in such complex situations,
the organism must represent not just
reward (the immediate advantage
accrued by the outcome of a behavior)
but also its value (an estimate of how
much reward, or punishment, will be
gained from the choice, both now and
in the future) (Montague, King-Casas,
& Cohen, 2006). To weigh these
alternatives requires a form of working
memory to select among choice
options and update expectancies in the
face of new information. Such a view
extends instrumental behavior to
problems related to goal setting and
planning, to help us understand how
humans make choices that balance
long-term gains with short-term costs.
Here I consider these issues within the
framework of neuroeconomics, a dis-
cipline that draws from neuroscience,
economics, and psychology, to explore
the brain mechanisms that underlie
decision-making and choice (Bickel
et al., 2007).
The field of neuroeconomics
builds upon animal research to
explain choice behavior in humans.
The aim is to couple behavioral data
with neurobiological observations. In
nonhuman species, these issues can
be explored using techniques that
disrupt function in a particular region
or involve recording from neurons
using electrodes that have been low-
ered into the animal’s brain. Unless
warranted by medical concerns, such
procedures cannot be used with
humans and, as a result, progress in
this area has been slow. A major
turning point was the development of
a noninvasive method to image the
brain using functional magnetic
resonance imaging (fMRI). fMRI
takes advantage of the fact that brain
activity requires oxygen, which is
transported in the blood by hemo-
globin. When hemoglobin binds
oxygen to form oxyhemoglobin, it
alters the magnetic properties of the
molecule. It is this change that is
detected by an MRI scanner, allowing
researchers to monitor the flow of
oxyhemoglobin within the brain. As
neural activity increases, more oxy-
hemoglobin is directed to the region,
producing a blood oxygenation level
dependent (BOLD) signal (Bickel
et al., 2007).
Using fMRI, researchers have
shown that the presentation of reward
consistently engages neural activity
within a common set of neural
structures that includes the orbito-
frontal cortex (OFC), the amygdala,
the striatum, and the nucleus accum-
bens (McClure, York, & Montague,
2004). The striatum and nucleus
accumbens are part of the basal
ganglia, a subcortical cluster of
structures that also includes the glo-
bus pallidus, adjoining components of
the thalamus (the subthalamic nuclei),
and a region of the midbrain (the
substantia nigra) (EP-7 at the back of
the book). The orbitofrontal cortex
lies directly above the eye sockets
(orbits) and represents the ventral
(lower) portion of the prefrontal cor-
tex (PFC) (EP-6 at the front of the
book). Here, I will focus on just three
components: the amygdala, striatum,
and the OFC.
Earlier, I discussed how the
amygdala plays an important role in
processing biologically significant
stimuli, both appetitive and aversive
(Box 4.3). We also saw that the
basolateral amygdala (BLA) contri-
butes to learning about Pavlovian
relations. With regard to instrumental
learning, the amygdala appears to
play two important roles. First, neural
activity within this region provides an
index of reward magnitude and
valence. Second, processing within
the amygdala can endow neutral cues
with an affective code that can moti-
vate behavior and reinforce new
learning. We will see later that drug-
paired cues can facilitate drug-taking
behavior, and this secondary rein-
forcement is eliminated by lesioning
the BLA. Likewise, lesioning the BLA
disrupts instrumental behavior moti-
vated by escape from a fear-eliciting
cue. In an appetitive task, devaluing a
food reward (e.g., by pairing it with
an illness-inducing agent) normally
reduces instrumental responding.
Devaluation has no effect on perfor-
mance in BLA-lesioned participants.
The striatum and its associated
nuclei provide a system for integrat-
ing positive and negative outcomes
over multiple trials to modify behav-
ioral habits. Research suggests that it
does so through two pathways that
project to the thalamus (Figure 6.11),
a direct path that sends a “go” signal
to facilitate the execution of a
response and an indirect path that
sends a “no-go” signal to suppress
competing responses (Frank & Claus,
2006). These differential effects are
mediated, in part, by the action of the
neurotransmitter dopamine (DA),
which can engage either the excit-
atory D1 receptor (go) or the inhibi-
tory D2 receptor (no-go). As we will
see later (Box 7.1), DA activity is
modulated by predictability, provid-
ing an error signal that shapes beha-
viors. Evidence that the striatum is
involved in reward comes from
studies demonstrating that rats find
highly rewarding both electrical
stimulation and DA microinjection
within this region. Conversely, striatal
lesions undermine habitual respond-
ing and increase sensitivity to rein-
forcer devaluation (McClure et al.,
2004). In humans, a loss of DA input
from the substantia nigra (an outcome
of Parkinson’s disease) causes a dis-
ruption in motor behavior.
The OFC lies within the PFC, an
evolutionarily younger brain region
found in humans and higher mam-
mals (Bickel et al., 2007). Research
suggests that the PFC plays a role in
higher brain functions, such as plan-
ning and decision making. The OFC
corresponds to the ventral region of
the PFC and is anatomically con-
nected with structures implicated in
reward, such as the amygdala and
striatum. The OFC appears to provide
a form of executive control that
allows the organism to weigh the
relative value of alternative choices.
Interestingly, there is some evidence
for a subdivision of labor within the
OFC, with rewarded actions
(approach) eliciting greater neural
activity within the medial regions and
punished actions (response inhibi-
tion) engaging more lateral areas
(McClure et al., 2004). Damage to the
OFC interferes with learning when
reward contingencies no longer apply
(e.g., when reinforcer contingencies
are reversed). Likewise, humans with
damage to the OFC cannot use the
value of a predicted outcome to guide
their behavior in a gambling task
(Saddoris, Gallagher, & Schoenbaum,
2005).
The work reviewed above suggests
that instrumental learning is guided
by two competing systems. One
reflects a kind of habit learning. It
relies on lower-level structures, such
as the amygdala and striatum, and
learns in an incremental fashion,
guided by predictability and signal
error. The other relies on the OFC
and provides a form of executive
control that can rapidly bias behavior
in the face of new information. It is
the executive system that evaluates
potential outcomes, weighs the bene-
fit of delayed reward, and sets goals
(Bickel et al., 2007). Individuals who
have disrupted executive function
(from damage to the OFC) have
trouble incorporating negative feed-
back from previous behavior to guide
future behavior. As a result, their
behavior is governed by the impulsive
amygdala.
I have characterized the function-
ing of these systems using the kinds of
reinforcers typically used in labora-
tory studies. The system is not, how-
ever, limited in this way. A wide range
of stimuli, including smells and sexual
cues, engage the reward system. So
too does money and social reward
(e.g., from positive feedback). Like-
wise, arbitrary stimuli (CSs) that
predict reward engage a similar pat-
tern of neural activity (McClure et al.,
2004). Even the administration of
punishment to a deserving defector
can elicit reward-related neural
activity (Montague et al., 2006).
As McClure and colleagues (2004)
suggested, representing reward in
terms of a common pattern of neural
activity may facilitate the comparison
of alternative outcomes that differ on
multiple dimensions (quality, imme-
diacy, magnitude, and valence).
The competition between the
impulsive habit-based system and
executive oversight can help us
understand how organisms weigh
the relative value of a delayed
reward. As discussed in the text,
organisms will often choose a
smaller immediate reward over a
delayed larger reward, a phenomenon known as delay discounting
(Rachlin, 2006). Researchers have
suggested that it is the OFC that
allows us to delay gratification—
to select a delayed larger reward.
Supporting this, lesioning the OFC
biases behavior toward immediate
reward (Roesch, Calu, Burke, &
Schoenbaum, 2007). Conversely,
damage to the nucleus accumbens
increases delay discounting
(Peters & Büchel, 2011). Imaging
studies have revealed that the choice
of an immediate reward elicits
greater activity in limbic areas,
whereas delayed reward engenders
more activity in the frontal cortex.
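The trade-off can be made concrete with a short Python sketch (an illustration added for clarity, not material from the chapter; the hyperbolic form follows common usage in the discounting literature, and the discount-rate value k = 0.1 is an arbitrary assumption):

import math  # not strictly needed; shown for completeness

def discounted_value(amount, delay, k=0.1):
    # Hyperbolic discounting: subjective value declines with delay.
    # A steeper k (as reported for opioid addicts) devalues delayed
    # rewards more sharply; k = 0.1 here is purely illustrative.
    return amount / (1.0 + k * delay)

now = discounted_value(5, 0)      # 5.00: five units available immediately
later = discounted_value(10, 20)  # 3.33: ten units after a 20-unit delay
print(now > later)                # True: the smaller-sooner reward wins,
# even though the delayed reward is objectively twice as large.

Raising k in the sketch reproduces the steeper discounting attributed to a hyperreactive impulsive system.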
In addictive behavior, indivi-
duals repeatedly choose an imme-
diate outcome in the face of
knowledge that doing so will likely
entail long-term negative conse-
quences. Such a pattern suggests a
deficiency in weighing the value of
outcomes over time to mentally
project that choosing the delayed
alternative will yield greater reward.
Supporting this, research has shown
that opioid addicts discount more
(that is, undervalue a delayed
reward) than nonaddicted indivi-
duals. Moreover, if the outcome is
heroin, addicts (surprisingly) dis-
count even more. It has been pro-
posed that addiction arises because
the hyperreactive impulsive system
overcomes the influence of the
executive system, with a corre-
sponding emphasis on immediate
reward (Bickel et al., 2007). As we
will learn in Chapter 7 (Box 7.1),
drugs of abuse artificially engage
reward systems and, with experi-
ence, sensitize the reward circuit,
further biasing it toward an impul-
sive choice. This view suggests that
treatment for addiction will require
a multifaceted approach to both
dampen drug reactivity and
strengthen executive control.
Research has revealed a striking
similarity in how the brain processes
a diverse set of rewards, from food to
social rewards. Across these domains,
instrumental learning is reinforced by
common neurochemical systems and
regulated by an error signal, the
occurrence of which is well predicted
by formal models (e.g., Rescorla &
Wagner, 1972). These observations
suggest a remarkable degree of con-
servation in structure and function
with regard to reinforcement learning
(Montague et al., 2006).
J. W. Grau
basal ganglia A subcortical cluster of
structures implicated in instrumental
behavior and the assessment of time.
Degeneration of this area contributes
to Parkinson's disease.
functional magnetic resonance imaging
(fMRI) A noninvasive procedure that
can be used to measure brain activity
based on changes in blood flow.
neuroeconomics An interdisciplinary
approach to the study of choice and deci-
sion making that relies on both behavioral
and neurobiological observations.
Concluding Comments

The basic principle of instrumental conditioning is very simple: reinforcement increases (and punishment decreases) the future probability of an instrumental response. However, as we have seen, the experimental analysis of instrumental behavior can be rather intricate. Many important aspects of instrumental behavior are determined by the schedule of reinforcement. There are numerous schedules that can be used to reinforce behavior. Reinforcement can depend on how many responses have occurred, how much time has passed, or a combination of these factors. Furthermore, more than one reinforcement schedule may be available to the organism at the same time. The pattern of instrumental behavior, as well as choices between various response alternatives, is strongly determined by the schedules of reinforcement that are in effect. These various findings have told us a great deal about how reinforcement controls behavior in a variety of circumstances and have encouraged numerous powerful applications of reinforcement principles to important aspects of human behavior such as self-control.
Sample Questions
1. Compare and contrast ratio and interval schedules
in terms of how the contingencies of reinforce-
ment are set up and the effects they have on the
instrumental response.
2. Describe how concurrent schedules of reinforcement are designed, and summarize the typical findings obtained with such schedules.
3. Describe the generalized matching law equation
and explain each of its parameters.
4. Describe various theoretical explanations of the
matching law.
5. How are concurrent-chain schedules different
from concurrent schedules, and what kinds of
research questions require the use of concurrent-
chain schedules?
6. What is a reward discounting function, and how
is it related to the problem of self-control?
7. How have studies of self-control informed us
about other important aspects of human
behavior?
Key Terms
concurrent-chain schedule of reinforcement A com-
plex reinforcement procedure in which the participant
is permitted to choose during the first link which of
several simple reinforcement schedules will be in effect
in the second link. Once a choice has been made, the
rejected alternatives become unavailable until the start
of the next trial. Concurrent-chain schedules allow for
the study of choice with commitment.
concurrent schedule A complex reinforcement pro-
cedure in which the participant can choose any one of
two or more simple reinforcement schedules that are
available simultaneously. Concurrent schedules allow
for the measurement of direct choice between simple
schedule alternatives.
continuous reinforcement (CRF) A schedule of rein-
forcement in which every occurrence of the instrumen-
tal response produces the reinforcer.
cumulative record A graphical representation of how
a response is repeated over time, with the passage of
time represented by the horizontal distance (or x
axis), and the total or cumulative number of responses
that have occurred up to a particular point in time
represented by the vertical distance (or y axis).
delay discounting Decrease in the value of a rein-
forcer as a function of how long one has to wait to
obtain it.
fixed-interval scallop The gradually increasing rate
of responding that occurs between successive reinforce-
ments on a fixed-interval schedule.
fixed-interval schedule (FI) A reinforcement sched-
ule in which the reinforcer is delivered for the first
response that occurs after a fixed amount of time fol-
lowing the last reinforcer or the beginning of the trial.
fixed-ratio schedule (FR) A reinforcement schedule
in which a fixed number of responses must occur in
order for the next response to be reinforced.
intermittent reinforcement A schedule of reinforce-
ment in which only some of the occurrences of the
instrumental response are reinforced. The instrumental
response is reinforced occasionally, or intermittently.
Also called partial reinforcement.
inter-response time (IRT) The interval between one
response and the next. IRTs can be differentially rein-
forced in the same fashion as other aspects of behavior,
such as response force or response variability.
interval schedule A reinforcement schedule in which
a certain amount of time is required to set up the rein-
forcer. A response is reinforced only if it occurs after
the reinforcer has been set up.
limited hold A restriction on how long a reinforcer
remains available. In order for a response to be reinforced,
it must occur before the end of the limited-hold period.
matching law A rule for instrumental behavior, pro-
posed by R. J. Herrnstein, which states that the relative
rate of responding on a particular response alternative
equals the relative rate of reinforcement for that
response alternative.
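Stated as equations (a standard rendering supplied here for convenience rather than a quotation from the chapter; B1 and B2 denote response rates on two alternatives, and r1 and r2 the reinforcement rates earned on them), the matching law and its generalized form are:

\[
\frac{B_1}{B_1 + B_2} = \frac{r_1}{r_1 + r_2}
\qquad\text{and}\qquad
\log\!\left(\frac{B_1}{B_2}\right) = s \log\!\left(\frac{r_1}{r_2}\right) + \log b
\]

In the generalized form, s is the sensitivity parameter and b the bias parameter (the parameters referred to in Sample Question 3); undermatching, defined below, corresponds to s < 1.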
melioration A mechanism for achieving matching by
responding so as to improve the local rates of reinforce-
ment for response alternatives.
partial reinforcement Same as intermittent
reinforcement.
post-reinforcement pause A pause in responding
that typically occurs after the delivery of the reinforcer
on FR and FI schedules of reinforcement.
ratio run The high and invariant rate of responding
observed after the post-reinforcement pause on FR
schedules. The ratio run ends when the ratio require-
ment has been completed and the participant is
reinforced.
ratio schedule A schedule in which reinforcement depends only on the number of responses the participant performs, irrespective of when those responses occur (contrast interval schedule; see the sketch following this list).
ratio strain Disruption of responding that occurs on
ratio schedules when the response requirement is
increased too rapidly.
schedule of reinforcement A program, or rule, that
determines how and when the occurrence of a response
will be followed by the delivery of the reinforcer.
undermatching Less sensitivity to the relative rate of
reinforcement than predicted by the matching law.
variable-interval schedule (VI) A reinforcement
schedule in which reinforcement is provided for the
first response that occurs after a variable amount
of time from the last reinforcer or the start of the
trial.
variable-ratio schedule (VR) A reinforcement sched-
ule in which the number of responses necessary to pro-
duce reinforcement varies from trial to trial. The value
of the schedule refers to the average number of
responses required for reinforcement.
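Because the ratio and interval definitions above amount to two different decision rules for delivering a reinforcer, the contrast can be summarized in a brief Python sketch (an illustration added for clarity rather than material from the chapter; the class names and parameter values are invented for the example):

import random

class FixedRatio:
    # FR n: every nth response produces the reinforcer.
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True   # reinforcer delivered
        return False

class VariableInterval:
    # VI t: a reinforcer is "set up" after a variable interval
    # (mean t) and delivered to the first response after setup.
    def __init__(self, mean_interval):
        self.mean = mean_interval
        self.setup_at = random.expovariate(1.0 / mean_interval)

    def respond(self, t):
        # t = session time at which this response occurs
        if t >= self.setup_at:
            self.setup_at = t + random.expovariate(1.0 / self.mean)
            return True   # reinforcer delivered
        return False      # responding faster earns nothing extra

In the sketch, FixedRatio pays off in direct proportion to the number of responses, whereas VariableInterval pays off at most once per set-up interval no matter how rapidly responses occur; this asymmetry is one reason ratio schedules typically sustain higher response rates than interval schedules.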
CHAPTER 7
Instrumental Conditioning:
Motivational Mechanisms
The Associative Structure of Instrumental
Conditioning
The S–R Association and the Law of Effect
Expectancy of Reward and the S–O Association
R–O and S(R–O) Relations in Instrumental
Conditioning
Response Allocation and Behavioral Economics
Antecedents of the Response-Allocation Approach
The Response-Allocation Approach
Behavioral Economics
Contributions of the Response-Allocation
Approach and Behavioral Economics
Concluding Comments
Sample Questions
Key Terms
CHAPTER PREVIEW
This chapter is devoted to a discussion of the processes that motivate and direct instrumental behavior.
Two distinctively different approaches have been pursued in efforts to understand why instrumental
behavior occurs. The first of these is in the tradition of Thorndike and Pavlov and focuses on identifying
the associative structure of instrumental conditioning. The associative approach considers molecular
mechanisms rather than the long-range goal or function of instrumental behavior. The second strategy is
in the Skinnerian tradition and considers instrumental behavior in the broader context of how organisms
distribute or allocate their behavior among various response options. The response-allocation approach
considers reinforcement effects to be a consequence of constraints on response options imposed by an
instrumental conditioning procedure. How behavior is reallocated in the face of these constraints is
analyzed using concepts from behavioral ecology and behavioral economics. The associative and
response-allocation approaches provide an exciting illustration of the sometimes turbulent course of
scientific inquiry. Investigators studying the motivational substrates of instrumental behavior have moved
boldly to explore radical new conceptions when older ideas did not meet the challenges posed by new
empirical findings.
In Chapters 5 and 6, I defined instrumental behavior, pointed out how this type of learn-
ing is investigated, and described how instrumental behavior is influenced by various
experimental manipulations, including schedules of reinforcement. Along the way, I did
not say much about what motivates instrumental responding, perhaps because the
answer seemed obvious. Casual reflection suggests that individuals perform instrumental
responses because they are motivated to obtain the goal or reinforcer that results from
the behavior. Is this true, and what does it mean to be motivated to obtain the rein-
forcer? Furthermore, what is the full impact of setting up a situation in which the rein-
forcer can be obtained only by making the required instrumental response? Answers to
these questions have occupied scientists for more than a century and have encompassed
some of the most important and interesting research in the analysis of behavior.
The motivation of instrumental behavior has been considered from two radically
different perspectives. The first originated with Thorndike and involves analysis of the
associative structure of instrumental conditioning. As this label implies, this approach
relies heavily on the concept of associations and hence is compatible with the theoretical
tradition of Pavlovian conditioning. In fact, much of the research relevant to the associa-
tive structure of instrumental conditioning was stimulated by efforts to identify the role
of Pavlovian mechanisms in instrumental learning. In addition, experiments on the asso-
ciative structure of instrumental conditioning have often employed methods that were
developed to study Pavlovian conditioning.
The associative approach takes a molecular perspective. It focuses on individual
responses and the specific stimulus antecedents and outcomes of those responses. To
achieve this level of detail, the associative approach examines instrumental learning in
isolated behavioral preparations, not unlike studying something in a test tube or a Petri
dish. Because associations can be instantiated in the nervous system, the associative
approach also provides a convenient framework for studying the neural mechanisms of
instrumental conditioning (e.g., Balleine & Ostlund, 2007).
The second strategy for analyzing motivational processes in instrumental learning is
the response-allocation approach. This approach was developed in the Skinnerian tradition
and involves considering instrumental conditioning within the broader context of the
numerous activities that organisms are constantly doing. In particular, the response-allocation
approach is concerned with how an instrumental conditioning procedure limits an organ-
ism’s free flow of activities and the consequences of this limitation. Unlike the associative
approach, response allocation considers the motivation of instrumental behavior from a
more molar perspective. It considers long-term goals and how organisms manage to achieve
those goals within the context of all of their behavioral options. Thus, the response allocation
approach views instrumental behavior from a more functional perspective.
To date, the associative and response-allocation approaches have proceeded largely
independently of one another. Each approach has identified important issues, but
it has become clear that neither can stand alone. The hope is that at some point, the
molecular analyses of the associative approach will make sufficient contact with the
more molar functional analyses of response allocation to provide a comprehensive inte-
grated account of the motivation of instrumental behavior.
The Associative Structure of
Instrumental Conditioning
Edward Thorndike was the first to recognize that instrumental conditioning involves more
than just a response and a reinforcer. The instrumental response occurs in the context of
specific environmental stimuli. The instrumental behavior of sending a text message occurs
in the context of tactile stimuli provided by holding your cell phone and visual cues pro-
vided by looking at the keyboard. Turning the key in the ignition of your car occurs in the
context of your sitting in the driver’s seat and holding the key between your fingers. One
can identify such environmental stimuli in any instrumental situation. Hence, there are
three events to consider in an analysis of instrumental learning: the stimulus context (S),
the instrumental response (R), and the response outcome (O), or reinforcer. Skinner also
subscribed to the idea that there are three events to consider in an analysis of instrumental
or operant conditioning. He described instrumental conditioning in terms of a three-term
contingency involving S, R, and O (Davison & Nevin, 1999). The relation among these
three terms is presented in Figure 7.1.
FIGURE 7.1 Diagram of instrumental conditioning. The instrumental response (R) occurs in the presence of distinctive stimuli (S) and results in delivery of the reinforcer outcome (O). This allows for the establishment of several different types of associations.
The S–R Association and the Law of Effect
The basic structure of an instrumental conditioning procedure permits the development
of several different types of associations. The first of these was postulated by Thorndike
and is an association between the contextual stimuli (S) and the instrumental response
(R): the S–R association. Thorndike considered the S–R association to be the key to
instrumental learning and central to his law of effect. According to the law of effect,
instrumental conditioning involves the establishment of an S–R association between the
instrumental response (R) and the contextual stimuli (S) that are present when
the response is reinforced. The role of the reinforcer is to “stamp in” or strengthen this
S–R association. Thorndike thought that, once established, this S–R association was
solely responsible for the occurrence of the instrumental behavior. Thus, the basic impe-
tus, or motivation, for the instrumental behavior was the activation of the S–R associa-
tion by exposing the participant to the contextual stimuli (S), in the presence of which
the response was previously reinforced.
An important implication of the law of effect is that instrumental conditioning does
not involve learning about the reinforcer or response outcome (O) or the relation
between the response and the reinforcing outcome (the R–O association). The law of
effect assumes that the only role of the reinforcer is to strengthen the S–R association.
The reinforcer itself is not a party or participant in this association.
The S–R mechanism of the law of effect played a dominant role in behavior theory
for many years but fell into disfavor during the cognitive revolution that swept over psy-
chology during the latter part of the twentieth century. Interestingly, however, there has
been a resurgence of interest in S–R mechanisms in recent efforts to characterize habitual
behavior in people. Habits are things we do automatically and in the same way each time
without thinking. Estimates are that habits constitute about 45% of human behavior.
Wood and Neal (2007) proposed a comprehensive model of human habits, which
assumes that habits “arise when people repeatedly use a particular behavioral means in
particular contexts to pursue their goals. However, once acquired, habits are performed
without mediation of a goal” (p. 844). Rather, the habitual response is an automatic
reaction to the stimulus context in which the goal was previously obtained, similar to
Thorndike’s S–R association.
Thorndike’s S–R association is also being seriously entertained as one of the
mechanisms that is responsible for the habitual nature of drug addiction (e.g., Everitt &
Robbins, 2005; Belin et al., 2009; Zapata, Minney, & Shippenberg, 2010). In this model,
procuring and taking a drug of abuse is viewed as instrumental behavior that is initially
reinforced by the positive aspects of the drug experience. However, with repetitive use,
taking the drug becomes habitual in the sense that it becomes an automatic reaction to
environmental cues that elicit drug seeking and drug consumption without regard to its
consequences. Compulsive eating, gambling, or sexual behavior can be thought of in the
same way. What makes these behaviors compulsive is that the person “cannot help”
doing them given the triggering contextual cues, even though the activities can have
serious negative consequences. According to the S–R mechanism, those consequences
are not relevant. To borrow terminology from Wood and Neal (2007), the S–R associa-
tion “stipulates an outsourcing of behavioral control to contextual cues that were, in the
past, contiguous with performance” (p. 844).
Expectancy of Reward and the S–O Association
The idea that reward expectancy might motivate instrumental behavior was not consid-
ered seriously until about 40 years after the formulation of the law of effect. How might
we capture the notion that individuals learn to expect the reinforcer during the course of
instrumental conditioning? You come to expect that something important will happen
when you encounter a stimulus that activates the memory of the significant event or
allows you to predict that the event will occur. Pavlovian conditioning is the basic pro-
cess of signal learning. Hence, one way to look for reward expectancy is to consider how
Pavlovian processes may be involved in instrumental learning.
As Figure 7.1 illustrates, specification of an instrumental response ensures that the
participant will always experience certain distinctive stimuli (S) in connection with mak-
ing the response. These stimuli may involve the place where the response is performed,
the texture of the object the participant manipulates, or distinctive olfactory or visual
cues. Whatever the stimuli may be, reinforcement of the instrumental response will in-
evitably result in pairing these stimuli (S) with the reinforcer or response outcome (O).
Such pairings provide the potential for classical conditioning and the establishment of an
association between S and O. This S–O association is represented by the dashed line in
Figure 7.1 and is one of the mechanisms of reward expectancy in instrumental
conditioning.
One of the earliest and most influential accounts of the role of classical conditioning
in instrumental behavior was offered by Clark Hull (1930, 1931) and later elaborated by
Kenneth Spence (1956). Their proposal was that the instrumental response increases
during the course of instrumental conditioning for two reasons. First, the presence of S
comes to evoke the instrumental response directly through Thorndike’s S–R association.
Second, the instrumental response also comes to be made in response to an S–O associ-
ation that creates the expectancy of reward. Exactly how the S–O association comes to
motivate instrumental behavior has been the subject of considerable debate and experi-
mental investigation. A particularly influential formulation was the two-process theory of
Rescorla and Solomon (1967).
Two-Process Theory Two-process theory assumes that there are two distinct types
of learning: Pavlovian and instrumental conditioning—nothing too radical there. The
theory further assumes that these two learning processes are related in a special way. In
particular, during the course of instrumental conditioning, the stimuli (S) in the presence
of which the instrumental response is reinforced become associated with the response
outcome (O) through Pavlovian conditioning, and this results in an S–O association.
Rescorla and Solomon assumed that the S–O association activates an emotional state
that motivates the instrumental behavior. The emotional state was assumed to be either
positive or negative, depending on whether the reinforcer was an appetitive or an aver-
sive stimulus (e.g., food or shock). Thus, various appetitive reinforcers (e.g., food and
water) were assumed to lead to a common positive emotional state and various aversive
stimuli were assumed to lead to a common negative emotion.
How could we test the idea that an S–O association (and the expectancies or
emotions that such an association activates) can motivate instrumental behavior? The
basic experimental design for evaluating this hypothesis has come to be called the
Pavlovian instrumental transfer experiment. The experiment involves three separate
phases (Table 7.1). In one phase, participants receive standard instrumental conditioning
(e.g., lever pressing is reinforced with food). In the next phase, they receive a pure
Pavlovian conditioning procedure (the response lever is removed from the experimental
chamber and a tone is paired with food). The critical transfer phase occurs in Phase 3,
where the participants are again permitted to perform the instrumental lever-press
response, but now the Pavlovian CS is presented periodically. If a Pavlovian S–O associ-
ation motivates instrumental behavior, then the rate of lever pressing should increase
when the tone CS is presented. The experiment is called the Pavlovian instrumental
transfer test because it determines how an independently established Pavlovian CS trans-
fers to influence or motivate instrumental responding.
Phase 1 can precede or follow Phase 2 in a Pavlovian instrumental transfer experi-
ment. The two phases of training can also be conducted in different experimental cham-
bers. In fact, that is often the case. The basic requirement is to establish a Pavlovian CS
to activate an S–O association and then see how this CS influences the performance of
an instrumental response during the transfer test.
The two-process theory has stimulated a great deal of research using the Pavlovian
instrumental transfer design. As predicted, the presentation of a Pavlovian CS for food
increases the rate of instrumental responding for food (e.g., Estes, 1948; Lovibond,
1983). This presumably occurs because the positive emotion elicited by the CS+ for
food summates with the appetitive motivation that is involved in lever pressing for
food. The opposite outcome (a suppression of responding) is predicted if the Pavlovian
CS elicits a negative emotion. I described such a result in Chapter 3 where I described
the conditioned suppression procedure. In that case, the Pavlovian CS was paired with
shock (and came to elicit conditioned fear). The CS+ for shock was then presented
while the subjects were lever pressing for food. The result was that the Pavlovian CS
suppressed the instrumental lever-press behavior (Ayres, 2012; Blackman, 1977). Accord-
ing to the two-process theory, conditioned suppression occurs because the CS+ for shock
elicits an emotional state (fear) that is contrary to the positive emotion or expectancy
(hope) that is established in instrumental conditioning with food. (For a more detailed
discussion of other predictions of the two-process theory, see Domjan, 1993.)
TABLE 7.1 EXPERIMENTAL DESIGN TO TEST PAVLOVIAN INSTRUMENTAL TRANSFER

PHASE 1                      PHASE 2                   TRANSFER TEST
Instrumental conditioning    Pavlovian conditioning    Present Pavlovian CS during performance of the instrumental response
Lever press → Food           Tone → Food               Lever press → Food, tone versus no tone
Response Interactions in Pavlovian Instrumental Transfer Classically condi-
tioned stimuli elicit not only emotional states but also overt responses such as sign track-
ing (Chapter 3). Consequently, the overt responses elicited by a Pavlovian CS may
influence the results in a Pavlovian instrumental transfer experiment. This is nicely illus-
trated by a recent study of the effects of a Pavlovian CS for alcohol on instrumental
responding reinforced by alcohol (Krank et al., 2008). The experiment was done with
laboratory rats in a chamber that had two response levers, one on either side of a water
well. The rats were first trained to press either response lever reinforced by a drop of
artificially sweetened water. Once the rats were pressing both response levers, the sweet-
ener was gradually replaced by ethanol, which then served as the reinforcer. A concur-
rent VI 20-second, VI 20-second schedule established stable responding for ethanol on
each response lever.
Pavlovian conditioning was conducted during the next eight sessions. The CS was
a light presented above each of the response levers. However, during the Pavlovian condi-
tioning sessions the response levers were removed from the chambers. On a given trial, the
CS appeared for 10 seconds either on the right or the left side and was paired with presenta-
tion of 0.2 ml of ethanol. For the unpaired control group, the CS and ethanol presentations
were separated by 10 seconds. As is common with this type of Pavlovian conditioning, the
CS light came to elicit a sign-tracking response if it was paired with the reinforcer. The rats
approached and sniffed the light, whether it was on the right or the left.
Following the Pavlovian conditioning phase, the response levers were placed back
into the chambers and the rats were again permitted to lever press for ethanol reinforce-
ment. For the Pavlovian instrumental transfer tests, the CS light was periodically pre-
sented while the rats were responding for ethanol. On some test trials, the CS was
presented above the right lever; on other trials, the CS appeared above the left lever.
The results of the transfer tests are presented in Figure 7.2. The rats pressed each
response lever about twice per minute before the CS was presented. For the unpaired
group, lever pressing did not change much when the CS was presented either on the
right or the left. In contrast, the paired group showed a significant increase in lever
pressing during the CS period if the CS was presented on the same side as the lever the
rat was pressing. These results show that a Pavlovian CS for ethanol will increase instru-
mental responding reinforced by ethanol. The increased lever pressing during the CS
shows that an independently established S–O association can facilitate instrumental
responding reinforced by that outcome. Because the response levers were removed from
the chambers during the Pavlovian phase, no S–R associations could have been learned
during that phase.
FIGURE 7.2 Rate of lever pressing for ethanol in a Pavlovian instrumental transfer test. Responding is shown during the pre-CS period and during the CS when the CS was on the same side as the lever being pressed or on the alternate (different) side. (Based on Krank et al., 2008.)
Another important aspect of the results was that the facilitation of instrumental
responding occurred only if the Pavlovian CS was presented on the same side as the
lever the rat was pressing. If the rat was pressing the lever on the right and the CS
appeared on the left, the rat approached and sniffed the light on the left, and this pre-
vented it from increasing its lever pressing on the right side. Thus, the results of the
transfer test depended on the compatibility of the Pavlovian CR and the instrumental
response. (For a more detailed discussion of these issues and their relevance to studying
the neural basis of Pavlovian instrumental transfer effects, see Holmes, Marchand, &
Coutureau, 2010.)
BOX 7.1
The Role of Dopamine in Addiction and Reward
Drug addiction is a long-standing
societal problem. What underlies
compulsive drug use, and why is it
that individuals with a history of drug
use are so prone to relapse? Answers
to these questions require an under-
standing of how learning influences
drug-taking behavior. It is now widely
recognized that drugs of abuse usurp
control over the neural circuitry that
mediates learning about natural
rewards, producing an artificial high
that tricks the brain into following a
path that leads to maladaptive con-
sequences (for reviews, see Berridge &
Kringelbach, 2008; Hyman, Malenka, &
Nestler, 2006; Koob & Le Moal,
2008; Lee, Seo, & Jung, 2012).
Understanding how drugs exert
their effects at a neurobiological
level should help address the problem
of drug addiction and shed light on
the mechanisms that underlie learn-
ing about natural rewards.
Understanding addiction requires
some background in psychopharma-
cology, the study of how drugs
impact the nervous system to influ-
ence psychological and behavioral
states. There are many ways that this
can occur, but for present purposes
we can focus on how drugs influence
neural communication at the synapse
(EP-2 at the front of the book). Drugs
can influence synaptic communica-
tion at multiple sites. For example, an
agonist can substitute for the endog-
enous (internally manufactured)
chemical, binding to the receptor on
the postsynaptic cell and producing a
similar cellular effect. Conversely,
drug antagonists bind to the receptor
but do not engage the same cellular
consequences. Instead, the antagonist
acts as a kind of roadblock that
effectively prevents an agonist from
having its usual effect on the post-
synaptic cell. Drugs can also influence
function in a less direct manner. For
example, some drugs increase neuro-
transmitter availability by enhancing
release or by blocking their reabsorption (reuptake) into the presynaptic neuron.
In general, drugs of abuse impact
the nervous system by promoting the
release of a particular neurotrans-
mitter or by emulating its action. For
example, psychostimulants influence
the neurotransmitter dopamine (DA)
by blocking its reuptake (e.g., cocaine)
or promoting its release (e.g.,
amphetamine). Opiates, such as
morphine and heroin, have their
effect by emulating endogenous
opioids (endorphins) that engage the
µ-opioid receptor. Another common
addictive substance, nicotine, engages
acetylcholine receptors while seda-
tives (alcohol, Valium) act, in part,
through their impact on GABAergic
neurons.
Drugs of abuse appear to promote
addiction by influencing neurons
within particular brain regions, such
as the nucleus accumbens (NA)
(Figure 7.3A). Many of the neurons
within this region have spiny den-
dritic fields that allow for multiple
synaptic contacts (Hyman et al.,
2006). These medium spiny neurons
receive input from neurons that
release an endogenous opioid that
engages the µ-receptor. In addition,
dopaminergic neurons project from a
region of the midbrain (the ventral
tegmental area [VTA]) and innervate
the spiny neurons as they pass en
route to other regions (e.g., the pre-
frontal cortex). Other psychoactive
drugs influence the activity of neu-
rons within the nucleus accumbens
(NA) by modulating opioid or dopa-
mine release, or by influencing the
inhibitory action of GABAergic
neurons that regulate neural activity
(Figure 7.3B).
FIGURE 7.3 (A) Dopaminergic neurons from the ventral tegmental area (VTA) project through the nucleus accumbens (NA) and synapse onto the dendrites of medium spiny neurons. These neurons also receive input from cortical neurons. Neurons from the nucleus accumbens project to the ventral pallidum (VP) (adapted from Hyman et al., 2006). (B) Neurons that release an opioid or dopamine directly impact neurons within the nucleus accumbens. The release of these neurochemicals is influenced by other psychoactive drugs, such as alcohol and nicotine (adapted from Hyman et al., 2006).
Neurons within the nucleus
accumbens also receive input from
other regions, such as the cortex.
These neurons release the excitatory
neurotransmitter glutamate. As dis-
cussed in Box 8.2, changes in how a
postsynaptic cell responds to gluta-
mate can produce a lasting change
(e.g., a long-term potentiation) in
how a neural circuit operates, a
physiological alteration that has been
linked to learning and memory.
Within the nucleus accumbens,
cortical neurons that release gluta-
mate provide a rich input to the
nucleus accumbens, an input that is
thought to carry information about
the specific details of the sensory
systems engaged. At the same time,
dopaminergic input on to these neu-
rons provides a diffuse input that can
signal the motivational state of the
organism. When paired, this dopa-
minergic input may help select the
relevant pattern of glutamatergic
input, acting as a kind of teacher that
binds sensory attributes with reward
value, thereby enhancing the motiva-
tional significance of these cues
(Hyman et al., 2006).
When does the dopaminergic
teacher instruct the nucleus accum-
bens to learn? To answer this ques-
tion, researchers have examined
neural activity in monkeys while they
work for reward (e.g., a sip of fruit
juice). Electrodes were lowered into
the source of the dopaminergic input,
neurons within the ventral tegmental
area (Schultz, 2007). These neurons
exhibit a low level of tonic activity.
When the animal received an unex-
pected reward, the neurons showed a
burst of firing. If the animal was then
trained with signaled reward, the
signal began to elicit a burst of
activity. The expected reward itself
produced no effect. If, however, the
expected reward was omitted, there
was an inhibition of neural activity at
the time of reward. What these
observations suggest is that dopamine
activity does not simply report
whether or not a reward has occurred.
Instead, dopamine activity seems to
code the “reward prediction error”—
the deviation between what the ani-
mal received and what it expected
(Schultz, 2006):
Dopamine response = Reward occurred − Reward predicted
The notion that learning is a
function of the discrepancy between
what the animal received and what
it expected parallels the learning rule
proposed by Rescorla and Wagner
(1972). As discussed in Chapter 4,
learning appears to occur when an
event is unexpected. The best
example of this is observed in the
blocking paradigm, where prior
learning that one cue (e.g., a tone)
predicts the US blocks learning
about an added cue (e.g., a light).
Interestingly, dopaminergic neurons
within the ventral tegmentum
exhibit this phenomenon, producing
a burst of activity to the originally
paired cue but not the added one.
Notice too that this represents
another instance in which a portion
of the midbrain (the ventral teg-
mental area) acts as an informed
instructor to promote learning
within the forebrain. An analogous
function was ascribed earlier to the
periaqueductal gray (Box 4.3) and
dorsal raphe nucleus (Box 5.5).
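The error-driven logic just described can be captured in a few lines of Python (an illustration added for clarity, not code from the chapter; the learning-rate value of 0.2 is an arbitrary assumption, and the update shown is the single-cue form of the Rescorla–Wagner rule):

def prediction_error_learning(rewards, alpha=0.2):
    # V is the current reward prediction; the dopamine-like error
    # signal on each trial is (reward occurred - reward predicted).
    V = 0.0
    for r in rewards:
        error = r - V        # large and positive when reward is unexpected
        V += alpha * error   # the prediction is nudged toward the outcome
        print(f"reward={r}  error={error:+.2f}  V={V:.2f}")

# Early pairings produce large positive errors (a burst of firing); as
# the reward becomes well predicted, the error shrinks toward zero; and
# an omitted reward yields a negative error (inhibition of firing).
prediction_error_learning([1, 1, 1, 1, 0])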
Abused drugs may encourage a
cycle of dependency because they
have a pharmacological advantage.
For example, psychostimulants artifi-
cially drive dopaminergic activity,
and in this way act as a kind of Trojan
horse that fools the nervous system,
producing a spike in dopamine
activity that the brain interprets as a
positive prediction error (Hyman
et al., 2006). This reinforces new
learning and links the sensory cues
associated with drug administration
to reward, giving them a motivational
value that fuels the acquired drug
craving and predisposes an addict to
relapse. Anyone who has quit smok-
ing can tell you that they are weakest,
and most prone to relapse, when they
are reexposed to the cues associated
with smoking. Individuals who have
not smoked for months may experi-
ence an irresistible urge to smoke
again if they enter a bar where other
people are smoking. These observa-
tions suggest that cues associated with
drug consumption acquire an incen-
tive value that can fuel drug craving.
Interestingly, the craving that fuels
drug taking appears to be physiolog-
ically and psychologically distinct
from the process that underlies how
much we consciously “like” an
addictive substance (Berridge,
Robinson, & Aldridge, 2009; Robinson
& Berridge, 2003). Liking is related
to the hedonic state elicited by
reward and can be inferred behav-
iorally from facial expressions. For
example, both humans and rats
exhibit a stereotyped pattern of oral
activity (a “yum” response) when a
sweet substance is placed on the
tongue. Interestingly, microinjecting
an opioid into small regions (hedonic
hot spots) of the nucleus accumbens
enhances signs of liking. A second
hedonic hot spot has been identified
within an adjoining region, the ven-
tral pallidum (VP). Here too, local
infusion of an opioid agonist
enhances the liking response to a
sweet solution. Observations such as
these have led researchers to suggest
that the pleasurable component of
reward is linked to opioid activity
within the nucleus accumbens
and ventral pallidum (Berridge &
Kringelbach, 2008).
For many years, researchers have
assumed that dopamine release plays
a key role in mediating pleasure.
Given this, it was surprising that the
complete destruction of dopaminergic
neurons innervating the nucleus
accumbens had no effect on opioid-
induced liking (Robinson & Berridge,
2003). Conversely, liking reactions to
sweet tastes are not elicited by
manipulations that engage dopami-
nergic neurons. These observations
suggest that dopamine activity is neither necessary nor sufficient to generate liking. Yet it was well
known that manipulations that impact
dopaminergic neurons can dramati-
cally affect drug-taking behavior
(Koob, 1999; Hyman et al., 2006).
For example, self-administration of a
psychostimulant is blocked by pre-
treatment with a dopamine antagonist
or a physiological manipulation that
destroys dopaminergic neurons in this
region. Across a range of tasks, in the
absence of dopamine, rats cannot
use information about rewards to
motivate goal-directed behavior; they
cannot act on their preferences.
Robinson and Berridge have
suggested that manipulations of the
dopamine system affect motivation
because they impact a distinct qual-
ity of reward (Berridge, Robinson, &
Aldridge, 2009; Robinson & Berridge,
2003). Rather than influencing how
much the animal consciously likes
the reward, they propose that dopa-
mine activity is coupled to an
unconscious process that they call
wanting. They see wanting as related
to the underlying motivational value
of the reward, encoding the degree to
which the organism is driven to
obtain and consume the reward
independent of whether consump-
tion engenders pleasure. From this
perspective, cues paired with reward
gain an incentive salience that drives
a form of wanting, transforming
sensory signals of reward into attractive, desired goals. These cues act as motivational magnets that unconsciously pull the animal to approach the reward. More formally, a positive prediction error engages dopamine activity and acts as a teacher, fostering the association of sensory cues with reward. From this view, dopamine activity within the nucleus accumbens binds the hedonic properties of a goal to motivation, driving the wanting that can fuel drug craving. Supporting this, research has shown that drug-paired cues acquire conditioned value and will support new instrumental learning in a Pavlovian-to-instrumental transfer test. This effect depends on dopamine activity and learning within the basolateral amygdala (Box 4.3).

J. W. Grau

nucleus accumbens A portion of the basal ganglia involved in reward, addiction, and reinforcement learning.

psychopharmacology The study of how drugs affect psychological processes and behavior.
Conditioned Emotional States or Reward-Specific Expectancies? The two-
process theory assumes that classical conditioning mediates instrumental behavior
through the conditioning of positive or negative emotions depending on the emotional
valence of the reinforcer. However, organisms also acquire specific reward expectancies
instead of just categorical positive or negative emotions during instrumental and classical
conditioning (Peterson & Trapold, 1980). Furthermore, in some cases reward-specific
expectancies appear to determine the outcome of Pavlovian instrumental transfer experi-
ments (e.g., Urcuioli, 2005).
In one study, for example, solid food pellets and a sugar solution were used as USs
in a Pavlovian instrumental transfer test with rats (Kruse et al., 1983). During the trans-
fer phase, the CS+ for food pellets facilitated instrumental responding reinforced with
pellets much more than instrumental behavior reinforced with the sugar solution. Corre-
spondingly, a CS+ for sugar increased instrumental behavior reinforced with sugar more
than instrumental behavior reinforced with food pellets. Thus, expectancies for specific
rewards rather than a general positive emotional state determined the results in the
transfer test. (For reward-specific expectancies in human drug-seeking behavior, see
Hogarth et al., 2007.)
R–O and S(R–O) Relations in Instrumental Conditioning
So far we have considered two different associations that can motivate instrumental
behavior, Thorndike’s S–R association and the S–O association, which activates a
reward-specific expectancy or emotional state. However, the instigation of instrumental
behavior involves more than just these two associations. Notice that neither the S–R
nor the S–O association involves a direct link between the response (R) and the rein-
forcer or outcome (O). This is counterintuitive. If you asked your roommate why she
was combing her hair, she would reply that she expected that combing her hair (R)
would improve her appearance (O). Similarly, you turn on a movie because you expect
that watching the movie will be entertaining, and you open the refrigerator because you
anticipate that doing so will enable you to get something to eat. All of these accounts are
descriptions of R–O associations between the instrumental response and the reinforcing
outcome. Although our informal explanations of instrumental behavior emphasize R–O
associations, such associations do not exist in two-process models.
Evidence of R–O Associations The most common technique used to demonstrate
the existence of R–O associations involves devaluing the reinforcer after conditioning.
Reinforcer devaluation involves making the reinforcer less attractive. If the reinforcer
is food, for example, one can make the food less attractive by conditioning a taste aver-
sion to the food. If the instrumental response occurs because of an R–O association,
devaluation of the reinforcer should reduce the rate of the instrumental response.
The reinforcer devaluation procedure used in instrumental conditioning is similar to
the US devaluation procedure I previously described in studies of Pavlovian conditioning
(Chapter 4). There, US devaluation was used to determine whether the conditioned
response is mediated by the memory of the US. If US devaluation disrupts the ability
of the CS to elicit a CR, one may conclude that the CS activated the memory of the
US and responding declined because the US memory was no longer as attractive. Follow-
ing a similar rationale, if reinforcer devaluation disrupts instrumental behavior, this
shows that the memory of the outcome (O) was involved in motivating the instrumental
behavior.
For many years, studies of the role of R–O associations in instrumental conditioning
were conducted primarily with laboratory rats (for reviews, see Colwill & Rescorla, 1986;
Ostlund, Winterbauer, & Balleine, 2008). More recently, however, R–O associations have
also been examined in experiments with human participants. Many of these experiments
have been conducted as a part of efforts to better understand the learning mechanisms
that contribute to drug-seeking behavior.
One recent study, for example, was conducted with students at the University
of Nottingham who smoked at least several times a week (Hogarth & Chase, 2011).
A two-choice concurrent schedule of reinforcement was used. The two responses were
pressing two different keys on a computer keyboard. Pressing one of the keys was rein-
forced with the picture of one-fourth of a cigarette on the screen, whereas pressing the
other key was reinforced with the picture of one-fourth of a chocolate bar. Each response
had a 50% chance of being reinforced on any given trial. The cigarettes and chocolate
bars earned were summed across trials, and the corresponding number of each item
was placed in a basket on the participant’s desk.
After 60 acquisition trials, the participants were assigned to one of two outcome
devaluation groups. For one group, the value of the cigarette outcome was reduced.
For the other, the value of the chocolate bar outcome was devalued. Devaluation was
accomplished by satiating the participants with the corresponding reinforcer. Partici-
pants in the chocolate devaluation group were allowed to eat up to eight chocolate
bars in 10 minutes. Participants in the cigarette devaluation group were allowed to
smoke an entire cigarette.
Right after the devaluation procedure, the participants were again tested on the con-
current schedule but this time they were told that although they would continue to earn
cigarettes and chocolate bars, they would not find out how many of each they obtained
until the end of the session. This was intended to maintain responding on the basis of
the current status of the memory of each reinforcer.
Figure 7.4 shows how often the participants elected to press the response that was
reinforced with cigarettes during the training phase and after the devaluation procedure.
During training, about 50% of the responses were made on the cigarette key, with the
remaining responses performed for chocolate bars. This indicates that the two outcomes
were equally preferred before devaluation. When the tobacco outcome was devalued,
responding on the cigarette key significantly declined. In contrast, when the chocolate
outcome was devalued, responding on the cigarette key increased, indicating a decline
in the chocolate response. Thus, devaluation produced a decline in behavior specific to
the response whose reinforcer had been devalued.
FIGURE 7.4 Percentage of responses on the cigarette key during original training and following devaluation of either the tobacco or chocolate response outcome. (Based on Hogarth & Chase, 2011.)
The results of the devaluation tests indicate that training established an R–O associa-
tion linking each response with its specific reinforcer. The results cannot be explained by
S–R associations because S–R associations are not influenced by reinforcer devaluation.
The results also cannot be explained by S–O associations because S–O associations could
not explain the response specificity of the devaluation effects that were observed.
The present results indicate that R–O associations are involved in instrumental drug-
seeking behavior. This claim appears to be at odds with my previous description of drug
addiction as driven by S–R mechanisms. As it turns out, both views are correct and have
been incorporated into a theory that suggests that R–O mechanisms predominate in free-
operant situations, whereas S–R mechanisms are activated when drug taking is a response
to drug-related cues (Hogarth & Chase, 2011; Hogarth, Dickinson, & Duka, 2010).
Hierarchical S(R–O) Relations The evidence cited above clearly shows that organisms
learn to associate an instrumental response with its outcome. However, R–O associations
cannot act alone to produce instrumental behavior. As Mackintosh and Dickinson (1979)
pointed out, the fact that the instrumental response activates an expectancy of the reinforcer
is not sufficient to tell us what caused the response in the first place. An additional factor
is required to activate the R–O association. One possibility is that the R–O association is
activated by the stimuli (S) that are present when the response is reinforced. According to
this view, in addition to activating R directly, S also activates the R–O association. Stated
informally, the subject comes to think of the R–O association when it encounters S and
that motivates it to make the instrumental response.
Skinner (1938) proposed many years ago that S, R, and O in instrumental condition-
ing are connected through a conditional S(R–O) relation, which he called the three-term
contingency. The idea that instrumental behavior is mediated by the S(R–O) relation
was vigorously pursued at the end of the twentieth century by investigators working
in the associationist tradition. The accumulated body of evidence, with both laboratory
animals and human participants, has firmly established that S(R–O) associations are
learned during the course of instrumental conditioning (e.g., Colwill & Rescorla, 1990;
Gámez & Rosas, 2007). Experiments on S(R–O) associations typically use complicated
discrimination training procedures that are beyond the scope of the present discussion.
I will describe discrimination training procedures in Chapter 8.
Response Allocation and Behavioral Economics
Although contemporary associative analyses of instrumental motivation go far beyond
Thorndike’s law of effect, they are a part of the Thorndikeian and Pavlovian tradition
that views the world of behavior in terms of stimuli, responses, and associations. The
response-allocation approach is based on a radically different worldview.
FIGURE 7.4 Percentage of responses on the cigarette key during original training and following devaluation of either the tobacco or chocolate response outcome. (Based on Hogarth & Chase, 2011.)
Instead
of considering instrumental conditioning in terms of the reinforcement of a response in
the presence of certain stimuli, response allocation is a molar approach that focuses on
how instrumental conditioning procedures put limitations on an organism’s activities and
cause redistributions of behavior among available response options.
Antecedents of the Response-Allocation Approach
Reinforcers were initially considered to be special kinds of stimuli. Thorndike, for exam-
ple, characterized a reinforcer as a stimulus that produces a satisfying state of affairs.
Various proposals were made about the special characteristics a stimulus must have to
serve as a reinforcer. Although there were differences of opinion, for about half a cen-
tury after Thorndike’s law of effect, theoreticians agreed that reinforcers were special sti-
muli that strengthened instrumental behavior.
Consummatory-Response Theory The first challenge to the idea that reinforcers are
stimuli came from Fred Sheffield and his colleagues, who formulated the consummatory-
response theory. Many reinforcers, such as food and water, elicit species-typical uncondi-
tioned responses, such as chewing, licking, and swallowing. The consummatory-response
theory attributes reinforcement to these species-typical behaviors. It asserts that species-
typical consummatory responses (eating, drinking, and the like) are themselves the critical
feature of reinforcers. In support of this idea, Sheffield, Roby, and Campbell (1954)
showed that saccharin, an artificial sweetener, can serve as an effective reinforcer, even
though it has no nutritive value and hence cannot satisfy a biological need. The reinfor-
cing properties of artificial sweeteners now provide the foundations of a flourishing diet
food industry. Apart from their commercial value, however, artificial sweeteners were
important in advancing our thinking about instrumental motivation.
The consummatory-response theory was a radical innovation because it moved the
search for reinforcers from special kinds of stimuli to special types of responses. Rein-
forcer responses were assumed to be special because they involved the consummation,
or completion, of an instinctive behavior sequence. (See discussion of consummatory
behavior in Chapter 2.) The theory assumed that consummatory responses (e.g., chewing
and swallowing) are fundamentally different from various potential instrumental
responses, such as running, opening a latch, or pressing a lever. David Premack took
issue with this and suggested that reinforcer responses are special only because they are
more likely to occur than the instrumental responses they follow.
The Premack Principle Premack pointed out that responses that accompany com-
monly used reinforcers involve activities that individuals are highly likely to perform. In
a food reinforcement experiment, participants are typically food-deprived and therefore
are highly likely to eat. By contrast, instrumental responses are typically low-probability
activities. An experimentally naive rat, for example, is much less likely to press a
response lever than it is to eat. Premack (1965) proposed that this difference in response
probabilities is critical for reinforcement. Formally, the Premack principle can be stated
as follows:
Given two responses of different likelihood, H and L, the opportunity to perform the
higher probability response (H) after the lower probability response (L) will result in
reinforcement of response L. (L → H reinforces L.) The opportunity to perform the
lower probability response (L) after the higher probability response (H) will not result
in reinforcement of response H. (H → L does not reinforce H.)
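Stated as a simple decision rule, the principle can be sketched in code. The following is a minimal illustration only; the activity names and baseline probabilities are hypothetical and are not taken from the text:

```python
def premack_reinforces(p_contingent, p_instrumental):
    """Differential probability (Premack) rule: access to the contingent
    activity reinforces the instrumental activity only if the contingent
    activity has the higher baseline probability."""
    return p_contingent > p_instrumental

# Hypothetical baseline probabilities for a food-deprived rat
baseline = {"eating": 0.60, "lever_pressing": 0.05}

# L -> H: eating after lever pressing reinforces lever pressing
print(premack_reinforces(baseline["eating"], baseline["lever_pressing"]))  # True
# H -> L: lever pressing after eating does not reinforce eating
print(premack_reinforces(baseline["lever_pressing"], baseline["eating"]))  # False
```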
The Premack principle focuses on the difference in the likelihood of the instru-
mental and reinforcer responses. Therefore, it is also called the differential probability
principle. Eating will reinforce bar pressing because eating is typically more likely than
bar pressing. Beyond that, Premack’s theory denied that there is anything special about
food or eating behavior.
The Premack principle has been repeatedly demonstrated in studies with both
human participants and laboratory animals. The power of the Premack principle is that
potentially any high-probability activity can be an effective reinforcer for a response that
the individual is not inclined to perform. In laboratory rats, for example, drinking a drop
of sucrose is a high-probability response, and as one might predict, sucrose is effective in
reinforcing lever pressing. Running in a running wheel is also a high-probability
response in rats. Thus, one might predict that running would also effectively reinforce
lever pressing. Numerous studies have confirmed this prediction. Belke and Hancock
(2003), for example, compared lever pressing on a fixed-interval 30-second schedule,
reinforced by either sucrose or the opportunity to run in a wheel for 15 seconds. In dif-
ferent phases of the experiment, the rats were tested with different concentrations of the
sucrose reinforcer.
Lever pressing on the FI 30-second schedule is summarized in Figure 7.5 for the
wheel-running reinforcer and for sucrose concentrations ranging from 0 to 10%. The
data are presented in terms of the rate of lever pressing in successive 5-second periods
of the FI 30-second schedule. As expected with a fixed-interval schedule, response rates
increased closer to the end of the 30-second period. Wheel running as the reinforcer was
just as effective as 2.5% sucrose. Wheel running was more effective than 0% sucrose, but
at a sucrose concentration of 10%, responding for sucrose exceeded responding for
running.
Applications of the Premack Principle The Premack principle had an enduring
impact in the design of reinforcement procedures used to help various clinical popula-
tions and remains the basis for various point systems and voucher systems used in resi-
dential treatment settings. In an early application, Mitchell and Stoffelmayr (1973)
studied two hospitalized patients with chronic schizophrenia who refused all tangible
reinforcers that were offered (candy, cigarettes, fruit, and biscuits). The other patients
on the ward participated in a work project that involved removing tightly wound copper wire from coils.
FIGURE 7.5 Rate of lever pressing during successive 5-second periods of a fixed-interval 30-second schedule reinforced with access to a running wheel or access to various concentrations of sucrose. (Based on Belke & Hancock, 2003.)
The two participants in this study did not take part in the coil-stripping
project and spent most of their time just sitting. Given this limited behavioral repertoire,
what could be an effective reinforcer? The Premack principle suggests that the opportu-
nity to sit should be a good reinforcer for these patients considering how much time they
spent sitting. To test this idea, the investigators gave the patients a chance to sit down
only if they worked a bit on the coil-stripping task.
Each participant was trained separately. At the start of each trial, they were asked or
coaxed into standing. A piece of cable was then handed to them. If they made the
required coil-stripping responses, they were permitted to sit for about 90 seconds, and
then the next trial started. This procedure was highly successful. As long as the instru-
mental contingency was in effect, the two patients worked at a much higher rate than
when they were simply told to participate in the coil-stripping project. Normal instruc-
tions and admonitions to participate in coil stripping were entirely ineffective, but taking
advantage of the high-probability sitting response worked very well.
Other interesting studies have been conducted with children with autism who
engaged in unusual repetitive or stereotyped behaviors. One such behavior, called
delayed echolalia, involves repeating words. For example, one autistic child was heard
to say over and over again, “Ding! ding! ding! You win again,” and “Match Game 83.”
Another form of stereotyped behavior, perseverative behavior, involves persistent manipula-
tion of an object. For example, the child may repeatedly handle only certain plastic toys.
The high probability of echolalia and perseverative behavior in children with autism
suggests that these responses may be effectively used as reinforcers in treatment proce-
dures. Charlop, Kurtz, and Casey (1990) compared the effectiveness of different forms of
reinforcement in training various academic-related skills in several children with autism
(see also Hanley et al., 2000). The tasks included identifying which of several objects was
the same or different from the one held up by the teacher, adding up coins, and correctly
responding to sentences designed to teach receptive pronouns or prepositions. In one
experimental condition, a preferred food (e.g., a small piece of chocolate, cereal, or a
cookie) served as the reinforcer, in the absence of programmed food deprivation. In
another condition, the opportunity to perform a stereotyped response for 3–5 seconds
served as the reinforcer.
Some of the results of the study are illustrated in Figure 7.6. Each panel represents
the data for a different student. Notice that in each case, the opportunity to engage in a
prevalent stereotyped response resulted in better performance on the training tasks than
food reinforcement. Delayed echolalia and perseverative behavior both served to increase
task performance above what was observed with food reinforcement. These results indi-
cate that high-probability responses can serve to reinforce lower probability responses,
even if the reinforcer responses are not characteristic of normal behavior.
The Premack principle advanced our thinking about reinforcement in significant
ways. It encouraged thinking about reinforcers as responses rather than as stimuli, and
it greatly expanded the range of activities investigators started to use as reinforcers. With
the Premack principle, any behavior could serve as a reinforcer provided that it was
more likely than the instrumental response. Differential probability as the key to rein-
forcement paved the way for applications of reinforcement procedures to all sorts of
human problems. However, problems with the measurement of response probability
and a closer look at instrumental conditioning procedures moved subsequent theoretical
developments past the Premack principle.
The Response-Deprivation Hypothesis In most instrumental conditioning proce-
dures, the probability of the reinforcer activity is kept at a high level by restricting access
to the reinforcer. Laboratory rats reinforced with food are typically not given food before
the experimental session and receive a small pellet of food for each lever-press response.
These limitations on access to food (and eating) are very important. If we were to give
the rat a full meal for one lever press, chances are it would not respond more than once
or twice a day. Generally, restrictions on the opportunity to engage in the reinforcing
response increase its effectiveness as a reinforcer.
Premack (1965) recognized the importance of restricting access to the reinforcer, but
that was not the main idea behind his theory. By contrast, Timberlake and Allison (1974;
see also Allison, 1993) abandoned the differential probability principle altogether and
argued that restriction of the reinforcer activity was the critical factor for instrumental
reinforcement. This proposal is called the response-deprivation hypothesis.
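Although the formal statement is not given here, the hypothesis is commonly written as an inequality (following Timberlake & Allison, 1974). If the schedule requires $I$ units of the instrumental response for access to $C$ units of the contingent (reinforcer) response, and $O_I$ and $O_C$ are the baseline levels of the two activities, the contingent response is deprived, and should function as a reinforcer, whenever

$$\frac{I}{C} > \frac{O_I}{O_C},$$

that is, whenever performing the instrumental response at its baseline level would earn less of the contingent response than the organism performed at baseline.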
In particularly decisive tests of the response-deprivation hypothesis, several investi-
gators found that even a low-probability response can serve as a reinforcer provided that
participants are restricted from making this response (Timberlake & Allison, 1974;
Eisenberger et al., 1967). Johnson and colleagues (2003) tested this prediction in a class-
room setting with students who had moderate-to-severe mental retardation. For each
student, teachers identified things the students were not likely to do. For example, filing
cards and tracing letters were both low-probability responses for Edgar, but tracing was
the less likely of the two responses. Nevertheless, the opportunity to trace became an
effective reinforcer for filing behavior if access to tracing was restricted below baseline
levels. This result is contrary to the Premack principle and shows that response depriva-
tion is more basic to reinforcement effects than differential response probability.
The response-deprivation hypothesis provided a simple new strategy for creating
reinforcers, namely restricting access to the reinforcer activity. It is interesting to note
that some restriction is inherent to all instrumental conditioning procedures. All instru-
mental conditioning procedures require withholding the reinforcer until the specified
instrumental response has been performed. The response-deprivation hypothesis points
out that this defining feature of instrumental conditioning is critical for producing a
reinforcement effect.
Traditional views of reinforcement assume that a reinforcer is something that exists
independent of an instrumental conditioning procedure. Food, for example, was thought
to be a reinforcer whether or not it was used to reinforce lever pressing. The response-
deprivation hypothesis makes explicit the radically different idea that a reinforcer is pro-
duced by the instrumental contingency itself. How instrumental contingencies create
reinforcers and increases in the instrumental response has remained a topic of great
interest to behavioral scientists (e.g., Baum, 2012).
FIGURE 7.6 Task performance for two children with autism. One student's behavior was reinforced with food or the opportunity to engage in delayed echolalia. Another student's behavior was reinforced with food or the opportunity to engage in perseverative responding. (Responding during baseline periods was also reinforced with food.) (Based on "Using Aberrant Behaviors as Reinforcers for Autistic Children," by M. H. Charlop, P. F. Kurtz, and F. G. Casey, Journal of Applied Behavior Analysis, 23, pp. 163–181.)
The Response Allocation Approach
The response-allocation approach considers the problem of reinforcement and instru-
mental conditioning from a broader perspective than the Premack principle or the
response-deprivation hypothesis. Instead of just focusing on the instrumental and rein-
forcer responses, the response-allocation approach considers the broad range of activities
that are always available to an individual. During the course of a day, you spend time
getting dressed, eating, walking, driving, listening to music, talking to friends, going to
class, studying, taking a nap, and so forth. Even while you are sitting in a class, you can
listen to the professor, look at the slides on the screen, daydream about what you will do
Saturday night, take notes, or sneak a peek at your text messages. Response allocation
refers to how an individual distributes his or her responses among the various options
that are available.
Scientists who employ the response-allocation approach examine how the distribution
of responses is altered when an instrumental conditioning procedure is introduced and what
factors determine the nature of the response reallocation (e.g., Allison, 1993; Timberlake,
1980, 1995). The starting point for these analyses is the unconstrained baseline. The
unconstrained baseline is how the individual allocates his or her responses to various
behavioral options when there are no restrictions and presumably reflects the individual’s
unique preferences. Think about how you might spend your time when you are on
summer vacation and don’t have to go to school. You may stay in bed, sleep later in the
morning, play video games, visit friends, go fishing, or spend time getting a tan, but you prob-
ably will not spend much time reading textbooks or taking notes. That is the unconstrained
baseline.
The unconstrained baseline becomes seriously disrupted when the new school year
starts and you have to start attending classes again. Now you can no longer afford to
sleep as late in the morning or spend as much time visiting friends. In addition, you
are likely to devote more effort to studying and taking notes. In an analogous fashion,
the introduction of an instrumental conditioning procedure disrupts an organism’s
unconstrained baseline and causes a redistribution of responses.
Consider, for example, how a high school student may distribute his or her activities
between studying and interacting with friends on Facebook. Figure 7.7 represents time
spent on Facebook on the vertical axis and time spent studying on the horizontal axis.
In the absence of restrictions, the student will probably spend a lot more time on Face-
book than studying. This is represented by the open circle in Figure 7.7 and is the
unconstrained baseline in this situation. Without restrictions, the student spends 60 min-
utes on Facebook for every 15 minutes of studying. The unrestricted baseline is some-
times also called the behavioral bliss point. The term bliss point is borrowed from
economics and refers to a preferred response allocation in the absence of restrictions.
Imposing an Instrumental Contingency How would the introduction of an instru-
mental contingency between studying and being on Facebook disrupt the student’s
response allocation? The outcome depends on the nature of the contingency. Figure 7.7
shows a schedule line starting at the origin and increasing at a 45° angle. This line
defines a schedule of reinforcement, according to which the student is allowed to be on
Facebook for as long as he or she spends studying. If the student studies for 10 minutes,
he or she will get 10 minutes on Facebook; if he or she studies for an hour, the student
will get to be on Facebook for an hour. What might be the consequences of such a
schedule constraint?
Individuals will generally defend their response allocations against challenges
to the unrestricted baseline or bliss point condition. However, the interesting thing
is that the baseline response allocation usually cannot be reestablished after an
instrumental contingency has been introduced. In our example, the unrestricted base-
line point was 60 minutes on Facebook for every 15 minutes of studying. Once the
instrumental contingency is imposed, there is no way the student can be on Facebook
for 60 minutes and only study for 15 minutes. If he or she insists on being on Face-
book for 60 minutes, the student will have to tolerate adding 45 minutes to his or her
studying time. On the other hand, if the student insists on spending only the 15 min-
utes on his or her studies (as in the baseline condition), he or she will have to make do
with 45 minutes less on Facebook. Defending the baseline study time or defending the
baseline Facebook time both have their disadvantages. That is often the dilemma posed
by an instrumental contingency. How this dilemma is resolved is the central issue of
the response-allocation approach.
Given that the instrumental contingency shown in Figure 7.7 makes it impossible to
return to the unrestricted baseline, the redistribution of responses between the instru-
mental and contingent behaviors becomes a matter of compromise. The rate of one
response is brought as close as possible to its preferred level without moving the other
response too far away from its preferred level.
Staddon, for example, proposed a minimum-deviation model of behavioral regula-
tion to solve the dilemma of schedule constraints (Staddon, 1983/2003). According to
this model, introduction of a response-reinforcer contingency causes organisms to redis-
tribute their behavior between the instrumental and contingent responses in a way that
minimizes the total deviation of the two responses from the unrestricted baseline or bliss
point. The minimum deviation point is shown by the dark symbol on the schedule line
in Figure 7.7. For situations in which the free baseline cannot be achieved in the face of a
schedule constraint, the minimum-deviation model provides one view of how organisms
settle for the next best thing.
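A worked example, using the hypothetical numbers from the Facebook illustration and squared deviation as one simple measure of total deviation, shows how such a compromise point can be computed. With the bliss point at $x = 15$ minutes of studying and $y = 60$ minutes of Facebook, and the schedule constraint $y = x$, minimizing the deviation

$$D(x) = (x - 15)^2 + (x - 60)^2, \qquad D'(x) = 2(x - 15) + 2(x - 60) = 0 \;\Rightarrow\; x = \frac{15 + 60}{2} = 37.5$$

yields a compromise allocation of 37.5 minutes of each activity: studying rises above its preferred 15 minutes, and Facebook time falls below its preferred 60 minutes. (Staddon's model can be formulated with other distance measures; squared deviation is used here only to make the computation concrete.)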
Explanation of Reinforcement Effects How are reinforcement effects produced
according to the response allocation perspective? A reinforcement effect is identified by
an increase in the occurrence of an instrumental response above the level of that behav-
ior in the absence of the response–reinforcer contingency. The schedule line shown in
Figure 7.7 involves restricting time on Facebook below the level specified by the baseline
point. To move toward the preferred baseline level, the student has to increase his or her
studying so as to gain more Facebook time. This is precisely what occurs in typical
instrumental conditioning procedures. Access to the reinforcer is restricted; to gain
more opportunity to engage in the reinforcer response, the individual has to perform
more of the instrumental response. Thus, increased performance of the instrumental
response (a reinforcement effect) results from a reallocation of responses that minimize
deviations from the free baseline or bliss point.
FIGURE 7.7 Allocation of behavior between spending time on Facebook and studying. The open circle shows the optimal allocation, or behavioral bliss point, obtained when there are no constraints on either activity. The schedule line represents a schedule of reinforcement in which the student is required to study for the same amount of time that he or she spends on Facebook. Notice that once this schedule of reinforcement is imposed, it is no longer possible for the student to achieve the behavioral bliss point. The schedule deprives the student of time on Facebook and forces or motivates an increase in studying.
BOX 7.2
The Response-Allocation Approach and Behavior Therapy
Considering instrumental condition-
ing as a matter of response allocation
not only provides new insights into
age-old theoretical issues concerning
reinforcement but also suggests alter-
native approaches to behavior therapy
(Farmer-Dougan, 1998; Timberlake &
Farmer-Dougan, 1991). For example,
it forces us to consider the broader
behavioral context in which an
instrumental contingency is intro-
duced. Depending on that behavioral
context, a reinforcement procedure
may increase or decrease the target
response. Thus, the response-
allocation approach can provide
insights into situations in which
introducing a reinforcement proce-
dure produces an unexpected decrease
in the instrumental response.
One area of behavior therapy in
which reinforcement procedures are
surprisingly ineffective is the use of
parental social reinforcement to
increase a child’s pro-social behavior.
A parent whose child frequently
misbehaves is encouraged to provide
more social approval for positive
behavior on the assumption that low
rates of parental reinforcement are
responsible for the child’s misbehav-
ior. Viken and McFall (1994) pointed
out that the common failure of such
reinforcement procedures is predict-
able if we consider the unconstrained
baseline or bliss point for the child.
Figure 7.8 shows the behavioral
space for parental social reinforcement
and positive child behavior. The open
circle represents the child’s presumed
unconstrained or preferred baseline.
Left to his or her own devices, the
child prefers a lot of social reinforce-
ment while emitting few positive
behaviors. The dashed line represents
the low rate of parental reinforcement
in effect before a therapeutic inter-
vention. According to this schedule
line, the child has to perform two
positive responses to receive each
social reinforcer from the parent. The
solid point on the line indicates the
equilibrium point, where positive
responses by the child and social
reinforcers earned are equally far from
their respective bliss point values.
The therapeutic procedure involves
increasing the rate of social reinforce-
ment, let’s say to a ratio of 1:1. This
is illustrated by the solid line in
Figure 7.8. Now the child receives one
social reinforcer for each positive
behavior. The equilibrium point is
again illustrated by the filled data
point. Notice that with the increased
social reinforcement, the child can get
more social reinforcers without having
to make more positive responses. In
fact, the child can increase his or her
rate of social reinforcement while
performing fewer positive responses.
No wonder, then, that the therapeutic
reinforcement procedure does not
increase the rate of positive responses.
The unexpected result of increased
social reinforcement illustrated in
Figure 7.8 suggests that solutions to
behavior problems require careful
consideration of the relation between
the new instrumental contingency and
prior reinforcement conditions.
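A small numerical sketch of the equilibrium rule described in this box (the point on the schedule line where the child's positive behaviors and earned reinforcers are equally far from their bliss-point values) reproduces the paradox. The bliss-point values below are hypothetical:

```python
def equilibrium(b0, r0, m):
    """Point on the schedule line r = m * b at which positive behaviors (b)
    and social reinforcers (r) deviate equally from the bliss point (b0, r0),
    i.e., b - b0 = r0 - m * b."""
    b = (b0 + r0) / (1 + m)
    return b, m * b

b0, r0 = 5, 40  # hypothetical bliss point: few positive behaviors, many reinforcers
print(equilibrium(b0, r0, 0.5))  # pre-therapy 2:1 schedule -> (30.0, 15.0)
print(equilibrium(b0, r0, 1.0))  # therapeutic 1:1 schedule -> (22.5, 22.5)
```

Under the richer 1:1 schedule, equilibrium positive behaviors drop from 30 to 22.5 even though reinforcers earned increase, which is exactly the unexpected decrease described above.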
FIGURE 7.8 Hypothetical data on parental social reinforcement and positive child behavior. The behavioral bliss point for the child is indicated by the open circle. The dashed line represents the rate of social reinforcement for positive behavior in effect prior to introduction of a treatment procedure. The solid line represents the rate of social reinforcement for positive behavior set up by the behavior therapy procedure. The solid point on each line represents the equilibrium point for each schedule.
Viewing Reinforcement Contingencies in a Broader Behavioral Context The
above explanation of how schedule constraints produce reinforcement effects considers
only the instrumental and reinforcer responses (studying and being on Facebook). How-
ever, a student’s environment most likely provides a much greater range of options.
Instrumental contingencies do not occur in a behavioral vacuum. They occur in the con-
text of all of the responses and reinforcers the participant has available. Furthermore,
that broader behavioral context can significantly influence how the person adjusts to a
schedule constraint. For example, if the student enjoys listening to his or her iPod as
much as being on Facebook, restrictions of Facebook time may not increase studying
behavior. Rather, the student may switch to listening to the iPod, playing a video game,
or hanging out with friends. Any of these options will undermine the instrumental con-
tingency. The student could listen to his or her iPod or hang out with friends in place of
being on Facebook without increasing studying behavior.
This example illustrates that accurate prediction of the effects of an instrumental
conditioning procedure requires considering the broader context of the organism’s
response options. Focusing on just the instrumental response and its antecedent and
consequent stimuli (i.e., the associative structure of instrumental behavior) is not
enough. The effect of a particular instrumental conditioning procedure depends on
what alternative sources of reinforcement are available, how those other reinforcers are
related to the particular one involved in the instrumental contingency, and the cost of
obtaining those alternatives. These issues have been systematically examined with the
application of economic concepts to the problem of response allocation.
Behavioral Economics
The response-allocation approach redefined the fundamental issue in reinforcement. It
shifted attention away from the idea that reinforcers are special stimuli that enter into
special associative relations with the instrumental response and its antecedents. With the
response-allocation approach, the fundamental question is: How is the allocation of
behavior among an individual’s response options altered by the constraints imposed by
an instrumental conditioning procedure?
Students who have studied economics may recognize a similarity here to problems
addressed by economists. Economists, similar to psychologists, strive to understand
changes in behavior in terms of preexisting preferences and restrictions on fulfilling those
preferences. As Bickel, Green, and Vuchinich (1995) noted, “Economics is the study of the
allocation of behavior within a system of constraint” (p. 258). In the economic arena, the
restrictions on behavior are imposed by our income and the price of the goods that we
want to purchase. In instrumental conditioning situations, the restrictions are provided
by the number of responses an organism is able to make (its “income”) and the number
of responses required to obtain each reinforcer (the “price” of the reinforcer).
Psychologists have become interested in the similarities between economic restrictions
in the marketplace and schedule constraints in instrumental conditioning. The analysis
of response allocation in terms of economic concepts can be a bit complex. For the sake
of simplicity, I will concentrate on the basic ideas that have had the most impact on
understanding reinforcement. (For a more complete discussion see Hursh et al., 2013.)
Consumer Demand Fundamental to the application of economic concepts to the
problem of reinforcement is the relation between the price of a commodity and how
much of it is purchased. This relation is called the demand curve. Figure 7.9 shows
three examples of demand curves. Curve A illustrates a situation in which the consump-
tion of a commodity is very easily influenced by its price. This is the case with candy.
If the price of candy increases substantially, the amount purchased quickly drops. Other
commodities are less responsive to price changes (Curve C in Figure 7.9). The purchase
of gasoline, for example, is not as easily discouraged by increases in price. People con-
tinue to purchase gas for their cars even if the price increases, showing a small decline
only at the highest prices.
The degree to which price influences consumption is called elasticity of demand.
Demand for candy is highly elastic. The more candy costs, the less you will buy. In con-
trast, demand for gasoline is much less elastic. People continue to purchase gas even if
the price increases a great deal.
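In standard economic notation, which the text does not spell out, elasticity of demand is the proportional change in consumption per proportional change in price:

$$\varepsilon = \frac{\Delta Q / Q}{\Delta P / P} = \frac{d \ln Q}{d \ln P}.$$

Demand is called elastic when $|\varepsilon| > 1$ (candy) and inelastic when $|\varepsilon| < 1$ (gasoline). On the log–log coordinates used later in Figures 7.10 and 7.11, $\varepsilon$ is simply the local slope of the demand curve.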
The concept of consumer demand has been used to analyze a variety of major
behavior problems including eating and drug abuse (e.g., Epstein, Leddy, Temple, &
Faith, 2007). In a recent laboratory study, for example, 10- to 12-year-old children
increased their purchases of healthy foods as the price of unhealthy alternatives was
increased (Epstein et al., 2006). The selection of healthy food also increased in a study of
food choices in a restaurant when the healthy alternatives were reduced in price (Horgen
& Brownell, 2002). Interestingly, a decrease in price was more effective in encouraging the
selection of healthy foods than messages encouraging patrons to eat healthy.
The concept of consumer demand has been used to analyze instrumental behavior
by considering the number of responses performed (or time spent responding) to be
analogous to money and the reinforcer obtained to be analogous to the commodity that
is purchased. The price of a reinforcer then is the time or number of responses required
to obtain the reinforcer. Thus, the price of the reinforcer is determined by the schedule
of reinforcement. The goal is to understand how instrumental responding (spending) is
controlled by instrumental contingencies (prices).
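On this analogy, a demand curve for a reinforcer can be traced by recording consumption at each fixed-ratio "price" and estimating the log–log slope between successive prices. The sketch below uses hypothetical consumption values, not the data of any study cited here:

```python
import math

# Hypothetical demand data: fixed-ratio requirement (price) -> reinforcers obtained
fr_price = [3, 30, 100, 300, 600]
obtained = [100, 95, 80, 40, 10]

# Point elasticity between successive prices = slope in log-log coordinates
for (p1, q1), (p2, q2) in zip(zip(fr_price, obtained),
                              zip(fr_price[1:], obtained[1:])):
    elasticity = (math.log(q2) - math.log(q1)) / (math.log(p2) - math.log(p1))
    print(f"FR {p1} -> FR {p2}: elasticity = {elasticity:.2f}")
```

The output runs from about -0.02 at the lowest prices to -2.0 at the highest, the mixed pattern (inelastic at low prices, elastic at high prices) that characterizes the demand curves discussed below.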
Johnson and Bickel (2006), for example, investigated the elasticity of demand for
cigarettes and money in smokers with a mean age of 40 years who were not trying to
quit. The apparatus had three plungers the participants could pull, each for a different
reinforcer. The reinforcers were three puffs on a cigarette, 5¢, or 25¢. Only one of the
plungers (and its assigned reinforcer) was available in a particular session. The response
requirement for obtaining the reinforcer was gradually increased during each session.
The ratio requirement started at an FR 3 and was then raised to FR 30, 60, 100, 300,
600, and eventually 6,000.
FIGURE 7.9 Hypothetical consumer demand curves illustrating high sensitivity to price (Curve A), intermediate sensitivity (Curve B), and low sensitivity (Curve C).
The investigators wanted to determine at what point the
participants would quit responding because the response requirement, or price, was too
high. (None of the reinforcers could support responding on the FR 6,000 schedule.)
The results of the experiment are summarized in Figure 7.10. Data for the 5¢ rein-
forcer and the 25¢ reinforcer are presented in separate panels. Data for the cigarette rein-
forcer are reproduced in both panels for comparison. The greatest elasticity of demand
occurred for the 5¢ monetary reinforcer. Here, the number of reinforcers obtained started
decreasing as soon as more than three responses were required to obtain the 5¢ and
dropped quickly when 100 or more responses were required. With the 25¢ reinforcer, the
demand curve did not start to decline until the response requirement exceeded FR 300. As
might be expected, the participants were most resistant to increases in the price of the cig-
arette reinforcer. With cigarette puffs, the number of reinforcers obtained did not start
to decline until the response requirement was raised above an FR 600. These results show
that the participants were willing to make many more responses for puffs at a cigarette
than they were for the monetary rewards. No doubt the results would have been different
if the experiment had been conducted with nonsmokers. (For a review of behavioral eco-
nomic approaches to drug abuse, see Higgins, Heil, & Sigmon, 2013.)
Determinants of the Elasticity of Demand The application of economic concepts
to the analysis of instrumental conditioning would be of little value if it did not provide
new insights into the mechanisms of reinforcement. As it turns out, economic concepts
have helped to identify four major factors that influence how schedule constraints shape
the reallocation of behavior. Each of these factors determines the degree of elasticity of
demand, or the extent to which increases in price cause a decrease in consumption of
that reinforcer.
(1) Availability of Substitutes Perhaps the most important factor that influences the
elasticity of demand is the availability of alternative reinforcers that can serve as substi-
tutes for the reinforcer of interest. Whether increases in the price of one item cause a
decline in consumption depends on the availability (and price) of other goods that can
be used in place of the original item. The availability of substitutes increases the sensitiv-
ity of the original item to higher prices.
Newspaper subscriptions in the United States have seen a steep decline since news has
become readily available on 24-hour cable channels and the Internet. This reflects the fact
that cable and Internet sources are good substitutes for news obtained from newspapers.
The availability of substitutes also determines how often people go to the movies.
Watching a movie on a rented or downloaded DVD is a reasonable substitute for going to
the theater, especially now that surround sound is readily available in home-entertainment
systems. Increases in the price of movie tickets at the theater encourage cost-conscious
movie goers to wait for the release of the movie on DVD.
FIGURE 7.10 Demand curves for cigarettes (solid circles) and money (open circles) with progressively larger fixed-ratio requirements. The number of reinforcers obtained and the fixed-ratio requirements are both presented on logarithmic scales. (Based on Johnson & Bickel, 2006.)
In contrast, the amount of
gasoline people buy is not as much influenced by price (especially in areas without mass
transit) because there are few readily available substitutes for gasoline to fuel a car.
Contemporary analyses of drug abuse also recognize the importance of substitute
reinforcers. Murphy, Correia, and Barnett (2007), for example, considered how one
might reduce excessive alcohol intake among college students and concluded that
“behavioral economic theory predicts that college students’ decisions about drinking are
related to the relative availability and price of alcohol, the relative availability and price
of substance-free alternative activities, and the extent to which reinforcement from
delayed substance-free outcomes is devalued relative to the immediate reinforcement
associated with drinking” (p. 2573).
In an experimental study of individuals on a methadone maintenance program
(Spiga, 2006, as cited in Hursh et al., 2013), the participants could obtain methadone
by pressing a response lever on a progressively increasing FR schedule. In one condition,
methadone was the only available drug. In another condition, the participants could
obtain another opiate, hydromorphone, at a constant price (FR 32) by pressing a second
lever. The investigators were interested in whether the availability of hydromorphone
would influence the demand for methadone.
The results are presented in Figure 7.11. When methadone was the only reinforcer,
increases in its price resulted in decreased consumption at the highest FR values. Less
methadone was consumed if the other opiate, hydromorphone, was also available. Fur-
thermore, responding for hydromorphone increased as the price of methadone increased.
These results show that hydromorphone served as a substitute for methadone and
because of that the availability of hydromorphone decreased the demand for methadone.
(2) Price Range Another important determinant of the elasticity of demand is the
price range of the commodity. Generally, an increase in price has less of an effect at
low prices than at high prices. Consider, for example, the cost of candy. A 10% increase
in the price from 50¢ to 55¢ is not likely to discourage consumption. But if the candy
costs $5.00, a 10% increase to $5.50 might well discourage purchases.
Price effects on elasticity of demand are evident in Figure 7.10. Notice that at low
prices there is little change in the number of reinforcers obtained as the price increases
a bit. However, dramatic declines occur in the number of reinforcers obtained at the
high end of the price range.
FIGURE 7.11 Number of doses of methadone obtained when only methadone was available at increasing prices (solid circles) and when methadone was available at increasing prices along with hydromorphone at a fixed low price (open circles). The number of doses of hydromorphone obtained is also shown as a function of the price of methadone (solid squares). Notice that consumption of hydromorphone increased as the price of methadone increased. (Based on Hursh et al., 2013.)
(3) Income Level A third factor that determines elasticity of demand is the level of
income. In general, the higher your income, the less deterred you will be by increases in
price. This is also true for reinforcers obtained on schedules of reinforcement. In studies of
instrumental conditioning, the number of responses or amount of time available for
responding corresponds to income. These are resources an organism can use to respond
to a schedule constraint. The more responses or time animals have available, the less
their behavior is influenced by increases in the cost of the reinforcer (e.g., Silberberg,
Warren-Boulton, & Asano, 1987; see also DeGrandpre, Bickel, Rizvi, & Hughes, 1993).
Income level also influences the choice of substitutes. In an interesting study of
choice between healthy and unhealthy foods (Epstein et al., 2006), children aged 10–14
years were tested at three different income levels ($1, $3, and $5). At the low-income
level, increases in the price of unhealthy foods (potato chips, cookies, pudding, and
cola) led to increased choice of the healthy alternatives (apples, pretzels, yogurt, and
milk). In contrast, at the high-income level, the children continued to purchase the
unhealthy but preferred foods as the price of these foods went up. This left them with
less money to buy the lower priced, healthier substitutes. Thus, at the high-income
level, increases in the price of the junk food reduced the choice of healthy alternatives.
(4) Link to Complementary Commodity The fourth factor that influences price
sensitivity is the link of a reinforcer to a complementary commodity. Hot dogs and hot
dog buns are complementary. People do not eat one without the other. Therefore, if an
increase in the price of hot dogs drives down the number of hot dogs purchased, this will also
decrease the purchase of hot dog buns. Numerous reinforcers are linked to complemen-
tary commodities. For individuals who both smoke and consume alcohol, these two rein-
forcers are typically linked, such that smoking increases with drinking and vice versa.
Evidence suggests that among individuals on a methadone maintenance program, meth-
adone and cigarettes are complementary commodities (Spiga et al., 2005). For rats, eating
dry food and drinking water are complementary (Madden et al., 2007). The more food
that is consumed, the more water is purchased.
Contributions of the Response-Allocation Approach and
Behavioral Economics
Thinking about instrumental behavior as a problem of response allocation originated in
considerations of the Premack principle and the response-deprivation hypothesis.
Although challenges remain, this line of theorizing has made major contributions to
how we think about the motivation of instrumental behavior. It is instructive to review
some of these contributions.
• The response-allocation approach has moved us away from thinking about reinforcers
as special kinds of stimuli or as special kinds of responses. We are now encouraged to
look for the causes of reinforcement in how instrumental contingencies constrain the
free flow of behavior. Reinforcement effects are regarded as the consequence of sched-
ule constraints on an organism’s ongoing activities.
• Instrumental conditioning procedures are no longer considered to “stamp in” or to
strengthen instrumental behavior. Rather, instrumental conditioning is seen as creat-
ing a new distribution, or allocation, of responses. The resultant reallocation
depends on tradeoffs between various options that are usefully characterized by
behavioral economics.
• The response-allocation approach and behavioral economics provide new and precise
ways of describing constraints that various instrumental conditioning procedures
impose on an organism’s behavioral repertoire. Most importantly, they emphasize
that instrumental behavior cannot be studied in a vacuum or behavioral test tube.
Rather, all of the organism’s response options at a given time must be considered as
a system. Changes in one part of the system determine how other parts of the system
can be altered. Constraints imposed by instrumental procedures are more or less effec-
tive depending on the nature of the constraint, the availability of substitutes, the
organism’s level of income, and linkages to complementary commodities. Given
these complexities, response allocation and economic analyses are probably our best
hope of understanding how instrumental behavior occurs in complex real-world
environments.
Concluding Comments
Motivational processes in instrumental behavior have been addressed from two radically
different perspectives and intellectual traditions, the associationist perspective rooted in
Thorndike’s law of effect and Pavlovian conditioning, and the response-allocation per-
spective rooted in Skinner’s behavioral analysis. These two approaches differ in more
ways than they are similar, making it difficult to imagine how they might be integrated.
The fundamental concept in the associationist approach (the concept of an association)
is entirely ignored in the response-allocation approach. Also, the mechanism of response
allocation characterized by behavioral economics has no corresponding structure in the
associationist approach. Both approaches have contributed significantly to our under-
standing of the motivation of instrumental behavior. Therefore, neither approach can
be ignored in favor of the other.
One way to think about the two approaches is that they involve different levels of
analysis. The associationist approach involves the molecular level and focuses on individ-
ual stimuli, responses, and their connections. In contrast, response allocation and behav-
ioral economics operate at a molar level, considering the broader behavioral context in
which an instrumental contingency is introduced. Thus, the response-allocation
approach makes better contact with the complexities of an organism’s ecology.
These alternative perspectives provide an exciting illustration of the nature of scien-
tific inquiry. The inquiry has spanned intellectual developments from simple stimulus–
response formulations to comprehensive considerations of how an organism’s repertoire
is constrained by instrumental contingencies and how organisms solve complex ecologi-
cal problems. This area in the study of conditioning and learning, perhaps more than
any other, has moved boldly to explore radically new conceptions when older ideas did
not meet the challenges posed by new empirical findings.
Sample Questions
1. Describe what an S–R association is and what provides the best evidence for it.
2. Describe what an S–O association is and what research tactic provides the best evidence for it.
3. What investigative techniques are used to provide
evidence of R–O associations? Why is it not
possible to explain instrumental behavior by
assuming only R–O association learning?
4. How do studies of the associative structure of
instrumental conditioning help in understanding
the nature of drug addiction?
5. Describe similarities and differences between the
Premack principle and subsequent response allo-
cation models.
6. What are the primary contributions of economic
concepts to the understanding of the motiva-
tional bases of instrumental behavior?
7. Describe implications of modern concepts of
reinforcement for behavior therapy.
Key Terms
behavioral bliss point The preferred distribution of an organism’s activities before an instrumental conditioning procedure is introduced that sets constraints and limitations on response allocation.

consummatory-response theory A theory that assumes that species-typical consummatory responses (eating, drinking, and the like) are the critical features of reinforcers.

demand curve The relation between how much of a commodity is purchased and the price of the commodity.

differential probability principle A principle that assumes that reinforcement depends on how much more likely the organism is to perform the reinforcer response than the instrumental response before an instrumental conditioning procedure is introduced. The greater the differential probability of the reinforcer and instrumental responses during baseline conditions, the greater is the reinforcement effect of providing opportunity to engage in the reinforcer response after performance of the instrumental response. Also known as the Premack principle.

elasticity of demand The degree to which price influences the consumption or purchase of a commodity. If price has a large effect on consumption, elasticity of demand is high. If price has a small effect on consumption, elasticity of demand is low.

minimum-deviation model A model of instrumental behavior, according to which participants respond to a response–reinforcer contingency in a manner that gets them as close as possible to their behavioral bliss point.

Premack principle The same as the differential probability principle.

response-deprivation hypothesis An explanation of reinforcement according to which restricting access to a response below its baseline rate of occurrence (response deprivation) is sufficient to make the opportunity to perform that response an effective positive reinforcer.
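The minimum-deviation idea lends itself to a small numerical illustration. The sketch below is only a rough illustration of the concept as defined above, not a published model fit: it assumes a fixed-ratio constraint linking the instrumental response to the reinforcer response, an unweighted Euclidean measure of deviation, and entirely hypothetical bliss-point numbers.

```python
# Rough sketch of the minimum-deviation idea. The FR constraint,
# the unweighted distance measure, and all numbers are assumptions
# made for illustration.

def closest_allocation(bliss_x, bliss_y, ratio, max_x=2000):
    """Search the FR constraint line y = x / ratio for the allocation
    of instrumental responses (x) and reinforcer responses (y) that
    deviates least from the behavioral bliss point."""
    best, best_dist = None, float("inf")
    for x in range(max_x + 1):
        y = x / ratio
        dist = (x - bliss_x) ** 2 + (y - bliss_y) ** 2
        if dist < best_dist:
            best, best_dist = (x, y), dist
    return best

# Hypothetical bliss point: 10 lever presses and 400 licks per session.
# Under an FR 4 contingency, the attainable point nearest the bliss
# point involves far more pressing and far less licking than baseline.
print(closest_allocation(bliss_x=10, bliss_y=400, ratio=4))
```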
CHAPTER 8
Stimulus Control of Behavior
Identification and Measurement of Stimulus Control
Differential Responding and Stimulus Discrimination
Stimulus Generalization
Stimulus and Reinforcement Variables
Sensory Capacity and Orientation
Relative Ease of Conditioning Various Stimuli
Type of Reinforcement
Stimulus Elements Versus Configural Cues in Compound Stimuli
Learning Factors in Stimulus Control
Stimulus Discrimination Training
What Is Learned in Discrimination Training?
Spence’s Theory of Discrimination Learning
Interactions Between S+ and S–: The Peak-Shift Effect
Stimulus Equivalence Training
Contextual Cues and Conditional Relations
Control by Contextual Cues
Control by Conditional Relations
Concluding Comments
Sample Questions
Key Terms
CHAPTER PREVIEW
This chapter focuses on the topic of stimulus control. Although most of the chapter deals with the ways in
which instrumental behavior comes under the control of particular stimuli, the concepts are equally
applicable to classical conditioning. The chapter begins with a definition of stimulus control and the basic
concepts of stimulus discrimination and generalization. I then go on to discuss factors that determine the
extent to which behavior comes to be restricted to particular stimuli. Along the way, I will describe
special forms of stimulus control (intradimensional discrimination) and control by special categories of
stimuli (interoceptive stimuli, configural stimuli, and contextual cues). The chapter concludes with a
discussion of the learning of conditional relations in both instrumental and classical conditioning.
Both Thorndike and Skinner recognized that instrumental responses and reinforcers occur
in the presence of particular stimuli which come to control those responses. As I described
in Chapter 7, research on the associative structure of instrumental conditioning deals with
how these stimuli come to determine whether or not the instrumental response is per-
formed. The importance of antecedent stimuli has been examined further in studies of
the stimulus control of instrumental behavior, which is the topic of this chapter.
The stimulus control of instrumental behavior is evident in many aspects of life.
Studying, for example, is under the strong control of school-related stimuli. College stu-
dents who fall behind in their course work may make determined resolutions to study a
lot when they go home during a semester break. However, such good intentions are
rarely carried out. The stimuli of semester breaks are very different from the stimuli stu-
dents experience when classes are in session. Because of that, semester breaks do not
engender effective studying behavior.
The proper fit between an instrumental response and the stimulus context in which
the response is performed is so important that the failure of appropriate stimulus control
is often considered abnormal. Getting undressed, for example, is acceptable instrumental
behavior in the privacy of your bedroom. The same behavior on a public street will get
you arrested. Staring at a computer screen is considered appropriate if the computer is
turned on but not if it is blank. Talking is appropriate if someone is there to listen. Talk-
ing in the absence of an audience is considered strange and possibly evidence of
psychopathology.
Identification and Measurement of
Stimulus Control
To investigate the stimulus control of behavior, one first has to figure out how to identify
and measure it. How can a researcher tell that an instrumental response has come under
the control of certain stimuli?
Differential Responding and Stimulus Discrimination
Consider, for example, a classic experiment by Reynolds (1961). Two pigeons were
reinforced on a variable-interval schedule for pecking a circular response key. Rein-
forcement for pecking was available whenever the response key was illuminated
by a visual pattern consisting of a white triangle on a red background (Figure 8.1).
Reynolds was interested in which of these stimulus components gained control over
the pecking behavior. Were the pigeons pecking because they saw the white triangle or because they saw the red background?

FIGURE 8.1 Summary of the procedure and results of an experiment by Reynolds (1961). Two pigeons were first reinforced for pecking whenever a compound stimulus consisting of a white triangle on a red background was projected on the response key. The rate of pecking was then observed with each pigeon when the white triangle and the red background stimuli were presented separately.
After the pigeons learned to peck steadily at the triangle on the red background,
Reynolds measured the amount of pecking that occurred when only one of the stimuli
was presented. On some of the test trials, the white triangle was projected on the
response key without the red background. On other test trials, the red background
color was projected on the response key without the white triangle.
The results are summarized in Figure 8.1. One of the pigeons pecked more fre-
quently when the response key was illuminated with the red light than when it was illu-
minated with the white triangle. This outcome shows that its pecking behavior was much
more strongly controlled by the red color than by the white triangle. By contrast, the
other pigeon pecked more frequently when the white triangle was projected on the
response key than when the key was illuminated by the red light. Thus, for the second
bird, the pecking behavior was more strongly controlled by the triangle. (For a similar
effect in pigeon search behavior, see Cheng & Spetch, 1995.)
This experiment illustrates several important ideas. First, it shows how to experimen-
tally determine whether instrumental behavior has come under the control of a particular
stimulus. The stimulus control of instrumental behavior is demonstrated by variations in
responding (differential responding) related to variations in stimuli. If an organism
responds one way in the presence of one stimulus and in a different way in the presence
of another stimulus, its behavior has come under the control of those stimuli. Such differ-
ential responding was evident in the behavior of both pigeons Reynolds tested.
Differential responding to two stimuli also indicates that the pigeons were treating
each stimulus as different from the other. This is called stimulus discrimination. An
organism is said to exhibit stimulus discrimination if it responds differently to two or
more stimuli. Stimulus discrimination and stimulus control are two ways of considering
the same phenomenon. One cannot have one without the other. If an organism does not
discriminate between two stimuli, its behavior is not under the control of those cues.
Another interesting aspect of the results of Reynolds’s experiment was that the peck-
ing behavior of each bird came under the control of a different stimulus. The behavior of
bird 107 came under the control of the red color, whereas the behavior of bird 105 came
under the control of the triangle. The procedure used by Reynolds did not direct atten-
tion to one of the stimuli at the expense of the other. Therefore, it is not surprising that
each bird came to respond to a different aspect of the situation. The experiment is com-
parable to showing a group of children a picture of a cowboy grooming a horse. Some of
the children may focus on the cowboy; others may find the horse more interesting. In
the absence of special procedures, one cannot always predict which of the various stimuli
an organism experiences will gain control over its instrumental behavior.
Stimulus Generalization
Psychologists and physiologists have long been concerned with how organisms identify
and distinguish different stimuli. In fact, some have suggested that this is the single
most important question in psychology (Stevens, 1951). The problem is central to the
analysis of stimulus control. As you will see, numerous factors are involved in the iden-
tification and differentiation of stimuli. Experimental analyses of the problem have relied
heavily on the phenomenon of stimulus generalization. Stimulus generalization is the
opposite of differential responding, or stimulus discrimination. An organism is said to
show stimulus generalization if it responds in a similar fashion to two or more stimuli.
The phenomenon of stimulus generalization was first observed by Pavlov. He found
that after one stimulus was used as a CS, his dogs would also make the conditioned
response to other, similar stimuli. That is, they failed to respond differentially to stimuli
that were similar to the original CS. Since then, stimulus generalization has been exam-
ined in a wide range of situations and species. As Ghirlanda and Enquist (2003) noted,
“Empirical data gathered in about 100 years of research establish generalization as a fun-
damental behavioral phenomenon, whose basic characteristics appear universal” (p. 27).
In a landmark study of stimulus generalization in instrumental conditioning,
Guttman and Kalish (1956) first reinforced pigeons on a variable-interval schedule for
pecking a response key illuminated by a yellow light with a wavelength of 580 nano-
meters (nm). After training, the birds were tested with a variety of other colors presented
in a random order without reinforcement, and the rate of responding in the presence of
each color was recorded.
The results of the experiment are summarized in Figure 8.2. The highest rate of
pecking occurred in response to the original 580-nm color. But the birds also made sub-
stantial numbers of pecks when lights of 570-nm and 590-nm wavelength were tested.
This indicates that responding generalized to the 570-nm and 590-nm stimuli. However,
as the color of the test stimuli became increasingly different from the color of the origi-
nal training stimulus, progressively fewer responses occurred. The results showed a gra-
dient of responding as a function of how similar each test stimulus was to the original
training stimulus. This is an example of a stimulus generalization gradient.
Stimulus generalization gradients are an excellent way to measure stimulus control
because they provide precise information about how sensitive the organism’s behavior
is to systematic variations in a stimulus (Honig & Urcuioli, 1981; Kehoe, 2008). Con-
sider, for example, the gradient in Figure 8.2. When the original 580-nm training stimu-
lus was changed 10 nm (to 570 or 590 nm), responding did not change. However, when
the 580 nm was changed 40 nm or more (to 520, 540, 620, or 640 nm) responding
dropped off significantly. This aspect of the stimulus generalization gradient provides
precise information about how much of a change in a stimulus is required for the
pigeons to respond differently.

FIGURE 8.2 Stimulus generalization gradient for pigeons that were trained to peck in the presence of a colored light of 580-nm wavelength and were then tested in the presence of other colors. (From “Discriminability and Stimulus Generalization,” by N. Guttman and H. I. Kalish, 1956, Journal of Experimental Psychology, 51, pp. 79–88.)
How do you suppose the pigeons would have responded if they had been color-
blind? In that case, they could not have distinguished lights on the basis of color or
wavelength. Therefore, they would have responded in much the same way regardless of
what color was projected on the response key. Figure 8.3 presents hypothetical results of
an experiment of this sort. If the pigeons did not respond on the basis of the color of the
key light, similar high rates of responding would have occurred as different colors were
projected on the key. Thus, the stimulus generalization gradient would have been flat.
A comparison of the results obtained by Guttman and Kalish and our hypothetical
experiment with color-blind pigeons indicates that the steepness of a stimulus generali-
zation gradient provides a precise measure of the degree of stimulus control. A steep
generalization gradient (Figure 8.2) indicates strong control of behavior by the stimulus
dimension that is tested. In contrast, a flat generalization gradient (Figure 8.3) indicates
weak or nonexistent stimulus control. The primary question in this area of behavior the-
ory is what determines the degree of stimulus control that is obtained. The remainder of
this chapter is devoted to answering that question.
FIGURE 8.3 Hypothetical stimulus generalization gradient for color-blind pigeons trained to peck in the presence of a colored light of 580-nm wavelength and then tested in the presence of other colors.
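The contrast between Figures 8.2 and 8.3 can also be expressed numerically. The short sketch below is purely illustrative and assumes a Gaussian similarity rule (an assumption for this example, not the analysis used by Guttman and Kalish): a narrow gradient width produces the steep, wavelength-controlled gradient of Figure 8.2, whereas a very wide one produces the flat gradient expected from a color-blind bird, as in Figure 8.3.

```python
import math

# Illustrative only: a Gaussian generalization gradient is assumed.
TRAINED_NM = 580   # training wavelength
PEAK_RATE = 300    # responses to the training stimulus
TEST_NM = [530, 550, 570, 580, 590, 610, 630]

def gradient(width_nm):
    """Predicted responding at each test wavelength. A small width
    gives a steep gradient (strong control by wavelength); a huge
    width gives a nearly flat gradient (weak or no control)."""
    return [PEAK_RATE * math.exp(-((nm - TRAINED_NM) ** 2)
                                 / (2 * width_nm ** 2))
            for nm in TEST_NM]

steep = gradient(width_nm=15)    # hypothetical color-sensitive bird
flat = gradient(width_nm=1000)   # hypothetical color-blind bird
for nm, s, f in zip(TEST_NM, steep, flat):
    print(f"{nm} nm: steep {s:6.1f}   flat {f:6.1f}")
```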
BOX 8.1
Generalization of Treatment Outcomes
Stimulus generalization is critical to the success of behavior therapy. Like other forms of therapy, behavior therapy is typically conducted in a distinctive environment (e.g., in a therapist’s office). For the treatment to be maximally useful, what is learned during the treatment should generalize to other situations. An autistic child, for example, who is taught certain communicative responses in interactions with a particular therapist should also exhibit those responses in interactions with other people. The following techniques have been proposed to facilitate generalization of treatment outcomes (e.g., Schreibman, Koegel, Charlop, & Egel, 1990; Stokes & Baer, 1977):

1. The treatment situation should be made as similar as possible to the natural environment of the client. If the natural environment provides reinforcement only intermittently, it is a good idea to reduce the frequency of reinforcement during treatment sessions as well. Another way to increase the similarity of the treatment procedure to the natural environment is to use the same reinforcers the client is likely to encounter in the natural environment.

2. Generalization also may be increased by conducting the treatment procedure in new settings. This strategy is called sequential modification. After a behavior has been modified or conditioned in one situation (a classroom), training is conducted in a new situation (the playground). If that does not result in sufficient generalization, training can be extended to a third environment (e.g., the school cafeteria).

3. Using numerous examples during training also facilitates generalization. In trying to extinguish fear of elevators, for example, training should be conducted in many different types of elevators.

4. Generalization may also be encouraged by conditioning the new responses to stimuli that are common to various situations. Language provides effective mediating stimuli. Responses conditioned to verbal or instructional cues are likely to generalize to new situations in which those instructional stimuli are encountered.

5. Another approach is to make the training procedure indiscriminable or incidental to other activities. In one study (McGee, Krantz, & McClannahan, 1986), the investigators took advantage of the interest that autistic children showed in specific toys during a play session to teach the children how to read the names of the toys.

6. Finally, generalization outside a training situation is achieved if the training helps to bring the individual in contact with contingencies of reinforcement available in the natural environment (Baer & Wolf, 1970). Once a response is acquired through special training, the behavior often can be maintained by naturally available reinforcers. Reading, doing simple arithmetic, and riding a bicycle are all responses that are maintained by natural reinforcers once the responses have been acquired through special training.

An interesting study involved teaching 4- and 5-year-old children safety skills to prevent playing with firearms (Jostad et al., 2008). During the training sessions, a disabled handgun was deliberately left in places where the children would find it. If a child found the firearm, he or she was instructed not to touch it and to report it to an adult. Praise and corrective feedback served as reinforcers. The unusual aspect of the study was that the training was conducted by children who were just a bit older (6 and 7 years old) than the research participants. This required training the peer trainers first.

The results were very encouraging. With many (but not all) of the participants, the safety behaviors generalized to new situations and were maintained as long as a year. The experiment was not designed to prove that peer trainers were critical in producing the generalized responding. However, accidents often occur when two or more children find and play with a firearm together. The fact that the safety training was conducted between one child and another should facilitate generalization of the safety behaviors to other situations in which two or more children find a gun.
Stimulus and Reinforcement Variables
In the experiment by Reynolds (1961) described at the beginning of the chapter, pigeons
pecked a response key that had a white triangle on a red background. Such a stimulus
obviously has two features, the color of the background and the shape of the triangle.
Perhaps less obvious is the fact that all stimulus situations can be analyzed in terms of
multiple features. Even if the response key only had the red background, one could char-
acterize it in terms of its brightness, shape, or location in the experimental chamber in
addition to its color.
Situations outside the laboratory are even more complex. During a football game,
for example, cheering is reinforced by social approval if the people near you are all
rooting for the same team as you are and if your team is doing well. The cues that
accompany appropriate cheering include your team making a good play on the field,
the announcer describing the play, cheerleaders dancing exuberantly, and the people
around you cheering.
The central issue in the analysis of stimulus control is what determines which of the
numerous features of a stimulus situation gains control over the instrumental behavior.
Stimuli as complex as those found at a football game are difficult to analyze experimen-
tally. Laboratory studies are typically conducted with stimuli that consist of more easily
identifiable features. In this section, we will consider stimulus and reinforcement
variables that determine which cues come to control behavior. In the following section,
we will consider learning factors.
Sensory Capacity and Orientation
The most obvious variable that determines whether a particular stimulus feature con-
trols responding is the organism’s sensory capacity and orientation. Sensory capacity
and orientation determine which stimuli are included in an organism’s sensory world.
People cannot see behind their back and cannot hear sounds whose pitch is above
about 20,000 cycles per second (cps). Dogs, on the other hand, can hear whistles
outside the range of human hearing and are also much more sensitive to odors. These
differences make the sensory world of dogs very different from the sensory world of
human beings.
Because sensory capacity sets a limit on what stimuli can control behavior, studies of
stimulus control are often used to determine what an organism is, or is not, able to per-
ceive (Heffner, 1998; Kelber, Vorobyev, & Osorio, 2003). Consider, for example, the
question: Can horses see color? To answer that question, investigators used a training
procedure in which horses had to select a colored stimulus over a gray one to obtain
food reinforcement (Blackmore et al., 2008). The colored and gray stimuli were projected
on separate stimulus panels placed side by side on a table in front of the horse. There
was a response lever in front of each stimulus panel that the horse could push with its
head to register its choice on that trial. Several shades of gray were tested with several
shades of red, green, yellow, and blue.
If the horses could not detect color, they could not consistently select the colored
stimulus in such a choice task. However, all four of the horses in the experiment chose
blue and yellow over gray more than 85% of the time. Three of the horses also did well
on choices between green and gray. However, only one of the horses consistently selected
the color when red was tested against gray. These results indicate that horses have good
color vision over a large range of colors but have some difficulty detecting red. (For a
similar experiment with giant pandas, see Kelling et al., 2006.)
Relative Ease of Conditioning Various Stimuli
Having the necessary sense organs to detect the stimulus being presented does not guar-
antee that the organism’s behavior will come under the control of that stimulus. Stimulus
control also depends on the presence of other cues in the situation. In particular, how
strongly organisms learn about one stimulus depends on how easily other cues in the
situation can become conditioned. This phenomenon is called overshadowing. Over-
shadowing illustrates competition among stimuli for access to the processes of learning.
Consider, for example, trying to teach a child to read by having him or her follow
along as you read a children’s book that has a big picture and a short sentence on each
page. Learning about pictures is easier than learning words. Therefore, the pictures may
well overshadow the words. The child will quickly memorize the story based on the pic-
tures rather than the words and will not learn much about the words.
Pavlov (1927) was the first to observe that if two stimuli are presented at the same
time, the presence of the more easily trained stimulus may hinder learning about the
other one. In many of Pavlov’s experiments, the two stimuli differed in intensity. The
basic experimental design is illustrated in Table 8.1. During training, a relatively weak
stimulus (designated as “a” in Table 8.1) is conditioned either by itself (in the control
group) or in the presence of a more intense stimulus (designated as “B”). Subsequent
tests reveal weaker conditioned responding to stimulus a in the overshadowed group
than in the control group.
TABLE 8.1 EXPERIMENTAL DESIGN FOR OVERSHADOWING

Group                 Training stimuli   Test stimulus   Generalization from training to test
Overshadowing group   aB                 a               Decrement
Control group         a                  a               No decrement
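By way of illustration, the design in Table 8.1 can be simulated with the Rescorla–Wagner learning rule, a standard elemental model in which cues presented together share a limited amount of associative strength. The sketch below is a minimal illustration under arbitrary salience and learning-rate values, not a fit to any experiment; the point is qualitative: the more salient B captures most of the available strength, leaving a weaker than when a is trained alone.

```python
# Minimal Rescorla-Wagner sketch of overshadowing; all parameter
# values are arbitrary choices for illustration.
LAMBDA = 1.0     # asymptote of conditioning supported by the US
ALPHA_A = 0.1    # salience of the weak stimulus a
ALPHA_B = 0.4    # salience of the intense stimulus B
TRIALS = 100

def train_compound():
    """Group trained with the aB compound paired with the US."""
    v_a = v_b = 0.0
    for _ in range(TRIALS):
        error = LAMBDA - (v_a + v_b)   # present cues share the error
        v_a += ALPHA_A * error
        v_b += ALPHA_B * error
    return v_a

def train_element():
    """Control group trained with a alone paired with the US."""
    v_a = 0.0
    for _ in range(TRIALS):
        v_a += ALPHA_A * (LAMBDA - v_a)
    return v_a

print(f"V(a), overshadowing group: {train_compound():.3f}")  # well below 1.0
print(f"V(a), control group:       {train_element():.3f}")   # near 1.0
```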
Overshadowing has been of considerable interest in contemporary studies of spatial
navigation. People and other animals use a variety of different stimuli to find their way
around (beacons, landmarks, and spatial or geographical cues). The availability of one
type of cue (e.g., a prominent landmark) can sometimes overshadow learning about
other types of spatial information (Horne & Pearce, 2011). Interestingly, whether land-
mark cues overshadow spatial cues differs between males and females (e.g., Rodríguez,
Chamizo, & Mackintosh, 2011). (We will discuss spatial navigation in greater detail in
Chapter 11. For other studies of overshadowing, see Dwyer, Haselgrove, & Jones, 2011;
Jennings, Bonardi, & Kirkpatrick, 2007.)
Type of Reinforcement
The development of stimulus control also depends on the type of reinforcement that is
used (Weiss, 2012). Certain types of stimuli are more likely to gain control over the
instrumental behavior in appetitive than in aversive situations. This phenomenon was
originally discovered in experiments with pigeons (see LoLordo, 1979).
In one study (Foree & LoLordo, 1973), two groups of pigeons were trained to press
a foot treadle in the presence of a compound stimulus consisting of a red light and a
tone whose pitch was 440 cps. When the light–tone compound was absent, responses
were not reinforced. For one group of pigeons, reinforcement for treadle pressing was
provided by food. For the other group, treadle pressing was reinforced by the avoidance
of shock. If the avoidance group pressed the treadle in the presence of the light–tone
compound stimulus, no shock was delivered on that trial; if they failed to respond during
the light–tone stimulus, a brief shock was periodically applied until a response occurred.
Both groups of pigeons learned to respond during the light–tone compound. Foree
and LoLordo then sought to determine which of the two elements of the compound
stimulus was primarily responsible for the treadle-press behavior. Test trials were con-
ducted during which the light and tone stimuli were presented one at a time. The results
are summarized in Figure 8.4.

FIGURE 8.4 Effects of the type of reinforcement on stimulus control. A treadle-press response in pigeons was reinforced in the presence of a compound stimulus consisting of a tone and red light. With food reinforcement, the light gained much more control over the behavior than the tone. With shock-avoidance reinforcement, the tone gained more control over behavior than the light (adapted from Foree & LoLordo, 1973).
Pigeons conditioned with food reinforcement responded much more when tested
with the light stimulus alone than when tested with the tone alone. In fact, their rate of
treadle pressing in response to the isolated presentation of the red light was nearly as
high as when the light was presented simultaneously with the tone. Therefore, the behav-
ior of these birds was nearly exclusively controlled by the red light.
A contrasting pattern of results occurred with the pigeons that had been trained
with shock-avoidance reinforcement. Those birds responded much more when tested
with the tone alone than when tested with the light alone. Thus, with shock-avoidance
reinforcement, the tone acquired more control over the treadle response than the red
light (see also Schindler & Weiss, 1982).
The above findings indicate that stimulus control of instrumental behavior is deter-
mined in part by the type of reinforcement that is used. Subsequent research showed that
the critical factor is whether the compound tone+light CS acquires positive or aversive
properties (Weiss, Panlilio, & Schindler, 1993a, 1993b). Visual control predominates
when the CS acquires positive or appetitive properties, and auditory control predomi-
nates when the CS acquires negative or aversive properties.
The dominance of visual control in appetitive situations and auditory control in
aversive situations is probably related to the behavior systems that are activated in the
two cases. A signal for food activates the feeding system. Food eaten by pigeons and
rats is more likely to be identified by visual cues than by auditory cues. Therefore, acti-
vation of the feeding system is accompanied by increased attention to visual rather than
auditory stimuli. In contrast, a signal for an aversive outcome activates the defensive
behavior system. Responding to auditory cues may be particularly adaptive in avoiding
danger.
Unfortunately, we do not know enough about the evolutionary history of pigeons or
rats to be able to calculate the adaptive value of different types of stimulus control in
feeding versus defensive behavior. We also do not know much about how stimulus con-
trol varies as a function of type of reinforcement in other species. One might predict, for
example, that bats that forage for food using echolocation would show stronger control
by auditory cues in a feeding situation than is observed with pigeons. Such questions
remain fertile areas for future research.
Stimulus Elements Versus Configural Cues in Compound Stimuli
So far I have assumed that organisms treat the various components of a complex stimu-
lus as distinct and separate elements. Thus, I treated the simultaneous presentation of a
light and tone as consisting of separate visual and auditory cues. This way of thinking
about a compound stimulus is known as the stimulus-element approach and has been
dominant in learning theory going back about 80 years. An important alternative
assumes that organisms treat a compound stimulus as an integral whole that is not
divided into parts or elements. This is called the configural-cue approach. Although the
configural-cue approach also has deep roots (in Gestalt psychology), its prominence in
behavior theory is of more recent vintage.
According to the configural-cue approach, individuals respond to a compound stim-
ulus in terms of the unique configuration of its elements. It is assumed that the elements
are not treated as separate entities. In fact, they may not even be identifiable when the
stimulus compound is presented. In the configural-cue approach, stimulus elements are
important not because of their individuality but because of the way they contribute to
the entire configuration of stimulation provided by the compound.
The concept of a configural cue may be illustrated by considering the sound of a
symphony orchestra. The orchestral sound originates from the sounds of the individual
instruments. However, the sound of the entire orchestra is very different from the sound
of any of the individual instruments, some of which are difficult to identify when the
entire orchestra is playing. We primarily hear the configuration of the sounds created
by all the instruments that are playing.
In contemporary behavior theory, the configural-cue approach has been cham-
pioned by John Pearce (Pearce, 1987, 1994, 2002), who showed that many learning phe-
nomena are consistent with this framework. Let us consider, for example, the
overshadowing effect (Table 8.1). As I noted earlier, an overshadowing experiment
involves two groups of subjects and two stimulus elements, one of low intensity (a) and
the other of high intensity (B). For the overshadowing group, the two stimuli are pre-
sented together (aB) as a compound cue and paired with reinforcement during condi-
tioning. For the control group, only the low intensity stimulus (a) is presented during
conditioning. Tests are then conducted for each group with the weaker stimulus element
(a) presented alone. These tests show less responding to a in the overshadowing group
than in the control group. Thus, the presence of B during conditioning disrupts control
of behavior by the weaker stimulus a.
According to the configural-cue approach, overshadowing reflects different degrees
of generalization decrement from training to testing (Pearce, 1987). There is no generali-
zation decrement for the control group when it is tested with the weak stimulus a
because that is the same as the stimulus it received during conditioning. In contrast, con-
siderable generalization decrement occurs when the overshadowing group is tested with
stimulus a after conditioning with the compound aB. For the overshadowing group,
responding becomes conditioned to the aB compound, which is very different from a
presented alone during testing. Therefore, responding conditioned to aB suffers consid-
erable generalization decrement. According to the configural-cue approach, this greater
generalization decrement is responsible for the overshadowing effect.
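The generalization-decrement account lends itself to a simple calculation. In the sketch below, the similarity rule (the product of the proportions of elements two stimuli share) is one common formalization of Pearce’s (1987) model, and equal element saliences are assumed for simplicity; predicted responding to a is the strength of the trained configuration scaled by its similarity to a.

```python
# Sketch of the configural generalization-decrement account of
# overshadowing. The similarity rule used here (product of the
# proportions of shared elements) is one common formalization of
# Pearce (1987); equal element saliences are assumed.

def similarity(trained, tested):
    common = len(trained & tested)
    return (common / len(trained)) * (common / len(tested))

V_TRAINED = 1.0   # strength conditioned to the training configuration

aB = {"a", "B"}   # compound trained in the overshadowing group
a = {"a"}         # element trained in the control group

# Overshadowing group: trained on aB, tested on a -> 50% decrement.
print("overshadowing group:", V_TRAINED * similarity(aB, a))
# Control group: trained on a, tested on a -> no decrement.
print("control group:      ", V_TRAINED * similarity(a, a))
```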
The configural-cue approach has enjoyed considerable success in generating new
experiments and explaining the results of those experiments. However, other findings
have favored analyses of stimulus control in terms of stimulus elements. At this point
what is needed is a comprehensive theory that deals successfully with both types of
results. Whether such a theory requires abandoning the fundamental concept of stimulus
elements remains a heatedly debated theoretical issue (Harris et al., 2009; McLaren &
Mackintosh, 2002; Pearce, 2002; Wagner & Brandon, 2001).
Learning Factors in Stimulus Control
The factors described in the preceding section set the preconditions for how human and
nonhuman animals learn about the environmental stimuli they encounter. However, the
fact that certain stimuli can be perceived does not ensure that those stimuli will come to
control behavior. A young child, for example, may correctly identify a car as different
from a bus but may not be able to distinguish between Hondas and Toyotas. A novice
chess player may be able to look at two different patterns on a chess board without being
able to identify which represents the more favorable configuration. Whether or not cer-
tain stimuli come to control behavior depends on what the individual has learned about
those stimuli, not just whether the stimuli can be detected.
The suggestion that experience with stimuli may determine the extent to which
those stimuli come to control behavior originated in efforts to explain the phenomenon
of stimulus generalization. As I noted earlier, stimulus generalization refers to the fact
that a response conditioned to one stimulus will also occur when other stimuli similar
to the original cue are presented. Pavlov suggested that stimulus generalization occurs
because learning about a CS gets transferred to other stimuli on the basis of the physical
similarity of those test stimuli to the original CS.
In a spirited attack, Lashley and Wade (1946) took exception to Pavlov’s proposal.
Lashley and Wade argued that stimulus generalization reflects the absence of learning
rather than the transfer of learning. More specifically, they proposed that stimulus gen-
eralization occurs if organisms have not learned to distinguish differences among the sti-
muli. Thus, in contrast to Pavlov, Lashley and Wade considered the shape of a stimulus
generalization gradient to be determined primarily by the organism’s previous learning
experiences rather than by the physical properties of the stimuli tested.
Stimulus Discrimination Training
As it has turned out, Lashley and Wade were closer to the truth than Pavlov. Numerous
studies have shown that stimulus control can be dramatically altered by learning experi-
ences. Perhaps the most powerful procedure for bringing behavior under the control of a
stimulus is stimulus discrimination training (Kehoe, 2008). Stimulus discrimination
training can be conducted using either classical or instrumental conditioning procedures.
For example, Campolattaro, Schnitker, and Freeman (2008, Experiment 3) used a dis-
crimination training procedure in eyeblink conditioning with laboratory rats. A low-
pitched tone (2,000 cps) and a high-pitched tone (8,000 cps) served as the CSs. Each ses-
sion consisted of 100 trials. On half of the trials, one of the tones (A+) was paired with
the US. On the remaining trials, the other tone (B–) was presented without the US. The
results are presented in Figure 8.5. The rats showed progressive increases in eyeblink
responding to the A+ tone that was paired with the US. By the 15th session, the rats
responded to A+ more than 85% of the time. Responding to the B– also increased at
first, but not as rapidly. Furthermore, after the 10th session, responding to the B– tone
gradually declined. By the end of the experiment, the rats showed clear differential responding to the two tones.

FIGURE 8.5 Eyeblink conditioning in rats to a tone (A+) paired with the US and a different tone (B–) presented without the US (based on Campolattaro, Schnitker, & Freeman, 2008).
The results presented in Figure 8.5 are typical for discrimination training in which
the reinforced (A+) and nonreinforced (B–) stimuli are of the same modality. The con-
ditioned responding that develops to A+ generalizes to B– at first, but with further train-
ing responding to B– declines and a clear discrimination becomes evident. It is as if the
participants confuse A+ and B– at first but come to tell them apart with continued train-
ing. The same kind of thing happens when children are taught the names of different
types of fruit. They may confuse oranges and tangerines at first, but with continued
training they learn the distinction.
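The rise-and-fall pattern of responding to B– can be reproduced with a small elemental simulation. The sketch below is illustrative rather than a model of the Campolattaro et al. data: it assumes that the two tones share a common element in addition to a unique element each, and it uses arbitrary learning-rate values. Early on, the shared element gains strength from A+ trials, so responding to B– rises; with continued nonreinforcement, B–’s unique element becomes inhibitory and responding to B– declines.

```python
# Illustrative elemental sketch of A+/B- discrimination training.
# The shared "common" element and all parameter values are
# assumptions made for this example.
ALPHA, LAMBDA = 0.01, 1.0
v = {"common": 0.0, "uniqueA": 0.0, "uniqueB": 0.0}

def trial(cues, outcome):
    error = outcome - sum(v[c] for c in cues)   # shared prediction error
    for c in cues:
        v[c] += ALPHA * error

for block in range(1, 16):
    for _ in range(20):                          # 20 trials of each type
        trial(["common", "uniqueA"], LAMBDA)     # A+ paired with the US
        trial(["common", "uniqueB"], 0.0)        # B- presented alone
    resp_a = v["common"] + v["uniqueA"]
    resp_b = max(0.0, v["common"] + v["uniqueB"])
    print(f"block {block:2d}: A+ {resp_a:.2f}   B- {resp_b:.2f}")
```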
Stimulus discrimination training can also be conducted with instrumental condition-
ing procedures. This is the case when children are taught what to do at an intersection
controlled by a traffic light. Crossing the street is reinforced with praise and encourage-
ment when the traffic light is green but not when the light is red. The stimulus (the
green light) that signals the availability of reinforcement for the instrumental response
is technically called the S+ or SD (pronounced “ess dee”). By contrast, the stimulus (the
red light) that signals the lack of reinforcement for responding is called the S– or SΔ
(pronounced “ess delta”).
As in Figure 8.5, initially a child may attempt to cross the street during both the S+
(green) and S– (red) lights. However, as training progresses, responding in the presence
of the S+ persists and responding in the presence of the S– declines. The emergence of
greater responding to the S+ than to the S– indicates differential responding to these sti-
muli. Thus, a stimulus discrimination procedure establishes control by the stimuli that
signal when reinforcement is and is not available. Once the S+ and S– have gained con-
trol over the individual’s behavior, they are called discriminative stimuli. The S+ is a
discriminative stimulus for performing the instrumental response, and the S– is a dis-
criminative stimulus for not performing the response. (For a laboratory example of dis-
crimination training in instrumental conditioning, see Andrzejewski et al., 2007.)
In the discrimination procedures I described so far, the reinforced and nonrein-
forced stimuli (S+ and S–) were presented on separate trials. (Green and red traffic lights
are never turned on simultaneously at a street crossing.) Discrimination training can also
be conducted with the S+ and S– stimuli presented at the same time next to each other,
with responses to S+ reinforced and responses to S– nonreinforced. Such a simultaneous
discrimination procedure allows the participants to directly compare S+ and S– and
makes discrimination training easier. For example, Huber, Apfalter, Steurer, and
Prossinger (2005) examined whether pigeons can learn to tell the difference between male
and female faces that were presented with the people’s hair masked out. As you might
imagine, this is not an easy discrimination. However, the pigeons learned the discrimination
in a few sessions if the male and female faces were presented at the same time, and the
birds were reinforced for pecking one of the face categories. If the faces were presented on
successive trials, the pigeons had a great deal more difficulty with the task.
An instrumental conditioning procedure in which responding is reinforced in the
presence of one stimulus (the S+) and not reinforced in the presence of another cue
(the S–) is a special case of a multiple schedule of reinforcement. In a multiple sched-
ule, a different schedule of reinforcement is in effect during different stimuli. For exam-
ple, a VI schedule of reinforcement may be in effect when a light is turned on, and an FR
schedule may be in effect when a tone is presented. With sufficient training with such a
procedure, the pattern of responding during each stimulus will correspond to the sched-
ule of reinforcement in effect during that stimulus. The participants will show a steady
rate of responding during the VI stimulus and a stop-run pattern during the FR stimu-
lus. (For a recent study of multiple-schedule performance of individuals with mild intel-
lectual disabilities, see Williams, Saunders, & Perone, 2011.)
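The logic of a multiple schedule is easy to express in code. The sketch below is illustrative and uses hypothetical components (a VI 60-second schedule during a light and an FR 10 schedule during a tone); whichever stimulus is present determines which rule decides whether a given response is reinforced.

```python
import random

# Illustrative multiple schedule: VI 60 s during a light, FR 10
# during a tone. Component stimuli and values are hypothetical.

def vi_schedule(mean_interval_s):
    """Variable interval: the first response after an unpredictable
    interval (averaging mean_interval_s) is reinforced."""
    armed_at = random.expovariate(1.0 / mean_interval_s)
    def check(time_s):
        nonlocal armed_at
        if time_s >= armed_at:
            armed_at = time_s + random.expovariate(1.0 / mean_interval_s)
            return True
        return False
    return check

def fr_schedule(ratio):
    """Fixed ratio: every nth response is reinforced."""
    count = 0
    def check(time_s):
        nonlocal count
        count += 1
        if count >= ratio:
            count = 0
            return True
        return False
    return check

components = {"light": vi_schedule(60.0), "tone": fr_schedule(10)}

def respond(stimulus, time_s):
    """Apply the schedule in effect for the current stimulus."""
    return components[stimulus](time_s)

print(respond("tone", time_s=5.0))  # reinforced only on every 10th response
```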
Stimulus discrimination and multiple schedules are common outside the laboratory.
Nearly all reinforcement schedules that exist outside the laboratory are in effect only in
the presence of particular stimuli. Playing a game yields reinforcement only in the pres-
ence of enjoyable or challenging partners. Driving rapidly is reinforced when you are on
a freeway but not when you are on a crowded city street. Loud and boisterous discussion
with your friends is reinforced at a party. The same type of behavior is frowned upon
during a church service. Eating with your fingers is reinforced at a picnic but not when
you are in a fine restaurant. Daily activities typically consist of going from one situation
to another, each associated with its own schedule of reinforcement.
Effects of Discrimination Training on Stimulus Control Discrimination training
brings the instrumental response under the control of the S+ and S–. How precise is the
control that S+ acquires over the instrumental behavior, and what factors determine the
precision of the stimulus control that is achieved? To answer these questions, it is not
enough to observe differential responding to S+ versus S–. One must also find out how
steep the generalization gradient is when the participants are tested with stimuli that sys-
tematically vary from the S+. Another important question is which aspect of the discrim-
ination training procedure is responsible for the type of stimulus generalization gradient
that is obtained. These issues were first addressed in classic experiments by Jenkins and
Harrison (1960, 1962).
Jenkins and Harrison examined how auditory stimuli that differ in pitch can come
to control the pecking behavior of pigeons reinforced with food. As I discussed earlier,
when pigeons are reinforced with food, visual cues exert stronger stimulus control than
auditory cues (Figure 8.4). However, as Jenkins and Harrison found out, with the proper
training procedures, the behavior of pigeons can come under the control of auditory cues
as well. They evaluated the effects of three different training procedures. In all three pro-
cedures, a 1,000-cps tone was present when pecking a response key was reinforced with
access to food on a variable interval schedule.
One group of pigeons received a discrimination training procedure in which the
1,000-cps tone served as the S+ and the absence of the tone served as the S–. Pecking
was reinforced on trials when the S+ was present but was not reinforced on trials when
the tone was off (S–). A second group also received discrimination training. The 1,000-
cps tone again served as the S+. However, this time the S– was a 950-cps tone. The third
group of pigeons served as a control group and did not receive discrimination training.
For them the 1,000-cps tone was continuously turned on, and they could always receive
reinforcement for pecking during the experimental sessions.
Upon completion of the three different training procedures, each group was tested
for pecking in the presence of tones of various frequencies to see how precisely pecking
was controlled by pitch. Figure 8.6 shows the generalization gradients that were obtained.
The control group, which did not receive discrimination training, responded nearly
equally in the presence of all of the test stimuli. The pitch of the tones did not control
their behavior; they acted tone deaf. Each of the other two training procedures produced
more stimulus control by pitch. The steepest generalization gradient, and hence the
strongest stimulus control, was observed in birds that were trained with the 1,000-cps
tone as S+ and the 950-cps tone as S–. Pigeons that previously received discrimination
training between the 1,000-cps tone (S+) and the absence of tones (S–) showed an inter-
mediate degree of stimulus control by tonal frequency.
The Jenkins and Harrison experiment provides two important conclusions. First, it
shows