Instrumental or Operant Conditioning

 

Early work began with Edward L. Thorndike and his Law of Effect

§Cats in a puzzle-box

§Trial-and-error learning

 

Thorndike’s Puzzle-Box

 

Thorndike's Description of a Cat in the Puzzle-Box

"When put into the box the cat would show evident signs of discomfort and of an impulse to escape from confinement. .. tries to squeeze through any opening. . . claws and bites the bars or wire. . . claws at everything it reaches. . . For eight or ten minutes it will claw and bite and squeeze incessantly. . . . Whether the impulse to struggle be due to an instinctive reaction to confinement or to an association, it is likely to succeed in letting the cat out of the box. The cat that is clawing all over the box in her impulsive struggle will probably claw the string or loop . . . so as to open the door. And gradually all the other non-successful impulses will be stamped out and the particular impulse leading to the successful act will be stamped in by the resulting pleasure, until, after many trials, the cat will, when put in the box, immediately claw the button or loop in a definite way."

 

Thorndike's Law of Effect

"Of several responses made to the same situation those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections to the situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond."

 

What Was Being Learned?

§What the animal learned was an association between a stimulus (total situation) and a response (motor reaction), i.e., an S-R bond

§Note that Thorndike inferred internal states (drives, impulses) to account for cat’s behavior

§The selection, narrowing down, ‘stamping in’ of responses that were initially internally actuated, or perhaps even ‘wired in,’ was guided by external stimuli
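
The 'stamping in' and 'stamping out' of S-R bonds can be illustrated with a small simulation. This is only a sketch under stated assumptions: the response names, the numeric bond strengths, the outcome values, and the learning rate are all hypothetical, and Thorndike gave no such formula. It shows bonds being strengthened after satisfaction and weakened after discomfort, in proportion to the size of the outcome.

```python
import random

def apply_law_of_effect(strengths, response, outcome, rate=0.2):
    """Return updated S-R bond strengths after one trial.

    outcome > 0 stands for satisfaction, outcome < 0 for discomfort.
    The size of the outcome scales the change (assumed linear rule).
    """
    updated = dict(strengths)
    # Only the bond for the response just made is changed; a small
    # positive floor keeps every response selectable.
    updated[response] = max(0.01, updated[response] + rate * outcome)
    return updated

def choose_response(strengths, rng):
    """Pick a response with probability proportional to bond strength."""
    responses = list(strengths)
    weights = [strengths[r] for r in responses]
    return rng.choices(responses, weights=weights, k=1)[0]

def run_puzzle_box(trials=200, seed=0):
    """Simulate repeated trials in a hypothetical puzzle-box."""
    rng = random.Random(seed)
    strengths = {"claw_bars": 1.0, "squeeze": 1.0,
                 "bite_wire": 1.0, "pull_loop": 1.0}
    for _ in range(trials):
        response = choose_response(strengths, rng)
        # Assumed outcomes: pulling the loop opens the door (+1.0),
        # any other response leaves the cat confined (-0.5).
        outcome = 1.0 if response == "pull_loop" else -0.5
        strengths = apply_law_of_effect(strengths, response, outcome)
    return strengths

if __name__ == "__main__":
    for response, strength in run_puzzle_box().items():
        print(response, round(strength, 2))
```

In this sketch the unsuccessful responses drift toward the floor while the successful one grows, which mirrors the gradual narrowing down described above.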

 

Law of Effect and Survival of the Fittest

§Note the similarity between Thorndike’s Law of Effect

§“Any act which in a given situation produces satisfaction becomes associated with that situation, so that when the situation recurs, the act is more likely than ever to recur also.”

§and the law of survival of the fittest in evolutionary theory

§Only the fittest organisms survive

§Only the fittest responses shall survive, i.e., those leading to pleasure or satisfaction

 

Primary and Secondary Reinforcers

§According to Thorndike and others who followed his lead, some reinforcers are learned

§Such secondary (or conditioned) reinforcers acquire their reinforcing power by virtue of their association with primary (unlearned) reinforcers (e.g., food, water, termination of electric shock)

§Example: Chimps learned to work for tokens that they could exchange for food at a ‘Chimp-O-Mat’ (Cowles, 1937)

 

Types of Reinforcement and Punishment

 

Time and the Law of Effect

§The cat performs many responses before the correct one, so the reward follows all of them, including the incorrect ones. That is, every response is followed by reinforcement.

§According to the Law of Effect as originally stated, all of these responses should then be strengthened, not just the correct one

§Addition to Law of Effect

§A reinforcer becomes less effective the longer it is delayed after the response
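
The delay principle explains why the correct response gains the most: it is the one closest in time to the reward. A minimal sketch follows. The exponential form, the decay value, and the response times are assumptions made for illustration, since the notes say only that effectiveness falls as delay grows.

```python
def delayed_credit(response_times, reward_time, reward=1.0, decay=0.5):
    """Share out a reward among earlier responses by delay.

    response_times maps each response to the second at which it occurred.
    Each response receives reward * decay ** delay, so credit shrinks as
    the delay between response and reward grows (assumed exponential form).
    """
    credit = {}
    for response, t in response_times.items():
        delay = reward_time - t
        credit[response] = reward * (decay ** delay)
    return credit

# Hypothetical trial: the door opens at second 10, right after the loop pull.
trial = {"claw_bars": 0, "squeeze": 5, "bite_wire": 8, "pull_loop": 10}
credit = delayed_credit(trial, reward_time=10)

for response, value in credit.items():
    print(response, value)
```

Any function that decreases with delay would make the same point; the exponential is used here only because it is simple.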

 

Conditioning and Behaviorism

§Out of Pavlov’s and Thorndike’s work arose behaviorism or ‘behavior theory’

§B.F. Skinner called instrumental conditioning ‘operant conditioning’

§His form of behaviorism focused solely on observable behavior (no internal states)

§Attempted to analyze factors that ‘control’ (determine) behavior

§Focused on ‘rate of responding’ and attempted to see how this is influenced by environmental and reinforcement contingencies

 

Skinner’s Approach

§Skinner’s focus on purely observable behavior leads him to see

§behavior as simply emitted (operant) rather than elicited

§reinforcement as something that simply increases the probability of a behavior rather than as something pleasurable or satisfying (unobservable)

 

Findings from Operant Conditioning Studies

§Main findings come from studies on rats and pigeons placed in Skinner Boxes

§Animals are trained to perform behaviors by a process called shaping

§Shaping is the differential reinforcement of successive approximations to the desired behavior

§Much research in operant conditioning looks at effects of different schedules of reinforcement on rate of responding
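
Shaping can be sketched as a loop in which the criterion for reinforcement is raised step by step. Everything numeric here is a hypothetical choice for illustration: behavior is reduced to a single number, responses vary normally around the animal's typical response, and reinforcement shifts the typical response a fixed fraction of the way toward the reinforced response.

```python
import random

def shape(target=10.0, start=0.0, step=1.0, rate=0.3,
          max_trials=2000, seed=0):
    """Simulate shaping toward a target behavior.

    Returns (trials_needed, history of the typical response). The first
    element is None if the target is not reached within max_trials.
    """
    rng = random.Random(seed)
    typical = start            # the animal's current typical response
    criterion = start + step   # first approximation that earns reinforcement
    history = []
    for trial in range(1, max_trials + 1):
        response = rng.gauss(typical, 1.0)  # emitted behavior varies
        if response >= criterion:
            # Differential reinforcement: only responses that meet the
            # current criterion are reinforced, pulling typical behavior
            # toward them.
            typical += rate * (response - typical)
            if typical >= criterion:
                # Demand a closer approximation to the target next time.
                criterion = min(target, criterion + step)
        history.append(typical)
        if typical >= target:
            return trial, history
    return None, history

if __name__ == "__main__":
    trials_needed, history = shape()
    print("trials needed:", trials_needed)
```

Starting with the criterion already at the target would leave almost no responses to reinforce, which is why the criterion begins close to what the animal already does.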

 

Skinner Box

 

Reinforcement Schedules

      Four main types

§Fixed ratio (FR2, FR50, etc.): Reinforce every 2nd or every 50th response, etc.

§Variable ratio: Reinforcement comes after a certain number of responses but the number varies irregularly around some mean, e.g., VR 50

§Fixed interval (e.g., FI 10): Based on time, say, every 10 seconds. No amount of responding before the 10 sec are up will lead to reinforcement (Rf), but the first response after 10 sec is reinforced

§Variable interval: Time interval varies irregularly around some average, say, VI 25
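
The four schedules can be written as small decision rules that answer one question: is this response reinforced? The sketch below is illustrative only. The class names, the `respond` method, and the uniform distributions used for the variable schedules are assumptions. Intervals are timed from the last reinforcement.

```python
import random

class FixedRatio:
    """FR n: reinforce every n-th response."""
    def __init__(self, n):
        self.n, self.count = n, 0
    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False

class VariableRatio:
    """VR mean: required number of responses varies around a mean."""
    def __init__(self, mean, rng):
        self.mean, self.rng, self.count = mean, rng, 0
        self.required = self._draw()
    def _draw(self):
        # Uniform on 1 .. 2*mean - 1 averages `mean` (assumed distribution).
        return self.rng.randint(1, 2 * self.mean - 1)
    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count, self.required = 0, self._draw()
            return True
        return False

class FixedInterval:
    """FI s: first response at least s seconds after the last Rf."""
    def __init__(self, seconds):
        self.seconds, self.available_at = seconds, seconds
    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.seconds
            return True
        return False

class VariableInterval:
    """VI mean: like FI, but the interval varies around a mean."""
    def __init__(self, mean, rng):
        self.mean, self.rng = mean, rng
        self.available_at = self._draw()
    def _draw(self):
        # Uniform on 0 .. 2*mean averages `mean` (assumed distribution).
        return self.rng.uniform(0, 2 * self.mean)
    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False

def reinforced_times(schedule, response_times):
    """Return the response times at which reinforcement was delivered."""
    return [t for t in response_times if schedule.respond(t)]

if __name__ == "__main__":
    # One response per second for 26 seconds (times 0 through 25).
    print("FR2  :", reinforced_times(FixedRatio(2), range(26)))
    print("FI 10:", reinforced_times(FixedInterval(10), range(26)))
```

On the ratio schedules the number of reinforcers depends only on how many responses are made, whereas on the interval schedules responding faster cannot bring reinforcement any sooner than the interval allows.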

 

Typical Reinforcement Schedule Patterns

 

Additional Reinforcement Effects

§Continuous reinforcement (= FR1) leads to fastest acquisition of learned response, but response is easily extinguished

§Partial reinforcement (not every response is reinforced) leads to learned responses highly resistant to extinction

§Variable schedules lead to learned responses more resistant to extinction than fixed schedules

 

Applications of Classical and Instrumental Conditioning

§Behavior Therapy often uses classical conditioning to treat phobias, anxiety, and neuroses associated with anxiety, e.g., test anxiety

§Behavior Modification usually uses principles of instrumental conditioning to either eliminate undesirable behaviors or strengthen desirable ones

 

Behavior Therapy and Classical Conditioning

      Counterconditioning and systematic desensitization

§Test anxiety example

§Try to associate the old CS (test situation) with a new response that’s incompatible with fear so that the new CR will displace fear. Usually the new CR is muscular relaxation. Subjects are trained to relax on command; then an anxiety hierarchy (a graded list of feared situations) is developed, and each step is gradually associated with relaxation

 

Avoidance Learning and Behavior Therapy

§Avoidance learning is a phenomenon therapists often need to treat

§It is associated with the persistence of many phobias and anxiety disorders

 

Avoidance Learning and Two Factor Theory

§Avoidance learning is usually understood to result from a combination of two factors:

§classical conditioning

§instrumental conditioning

§Begin with analysis of a simpler form of learning: Escape learning

 

Apparatus Used to Study Escape and Avoidance Learning

 

Escape Learning

§Animal quickly learns to escape shock by jumping over the barrier as soon as it is shocked, thus terminating the shock

§What kind of conditioning is this?

 

Avoidance Learning

§When light or sound signal precedes shock, animal learns to jump as soon as the signal occurs, thus entirely avoiding any shock

§This avoidance behavior is extremely resistant to extinction

§How is it that, even though the animal is never subsequently shocked, it continues to jump over the barrier when the signal appears?

 

Mowrer’s Two Factor Theory

§Asserts that both classical and instrumental conditioning are at work

§How?

§Light (CS) -> Fear (CER, conditioned emotional response)   (classical conditioning)

§Reduction of fear is the reinforcement (Rf) for jumping (instrumental conditioning)

§How could persistence of phobias be treated?

 

 

Behavior Modification and Instrumental Conditioning

§Eliminating undesirable behaviors: Find out what’s reinforcing the behavior and eliminate the reinforcement (or perhaps satiate individual with Rf so it’s no longer reinforcing)

§Strengthening desirable behaviors: Find reinforcers that will influence the behavior and make them contingent on the desired behavior (often with ‘token economies’)

 

Cognitive Approaches to Learning

§According to behaviorist school of psychology, learning is best described as a change in what organisms do, i.e., their behavior

§Cognitive theorists want to describe learning as a change in what an organism knows, i.e., cognitions

§Cognitive theorists point to inadequacies in the behaviorist account of learning and try to go beyond what behaviorists can successfully describe with their account of learning

 

Philosophical Influences

§Note the influence of philosophical presuppositions in behaviorist (at least Skinner’s) approach

§Internal states, drives, knowledge, etc. are not directly observable, hence we can’t really know about them or talk about them convincingly

§Thus, a truly scientific account must rely only on directly observable factors such as behavior (Positivism)

§Cognitive theorists generally reject this philosophy of science and argue that although an adequate scientific theory must be tied to observable events, a good theory may often make use of hypothetical (not directly observable) structures and mechanisms that must be inferred to account for observable behavior

 

Köhler’s Critique

§One of the earliest criticisms of behavior theory came from Köhler’s work with chimpanzees

§He argued that Thorndike ‘loaded the dice’ in favor of trial-and-error learning: There was no other way to solve the problem

§Köhler did a classic series of experiments showing that chimps can create solutions to problems, e.g., build tools for reaching food

§He denied that behavior theory could account for this kind of learning

 

Köhler’s Experiments

§Köhler's basic experiment was to place a chimp in an enclosed play area. Somewhere out of reach he placed a prize, such as a bunch of bananas.

§To get the bananas, the chimp would have to use an object as a tool. The objects in the play area included sticks of different lengths and wooden boxes. He discovered that chimpanzees were very good at using tools. They used sticks as rakes to pull in bananas placed out of reach.

§The chimps also learned to use boxes as step ladders, dragging them under the hanging bananas and even stacking several boxes on top of one another.

§In one experiment, Köhler placed a bunch of bananas outside the cage of a chimp named Sultan and two bamboo sticks inside the cage. Neither stick was long enough to reach the bananas. Sultan pushed the thinner stick into the hollow of the thicker one, creating a stick long enough to pull in the bananas.

 

Köhler’s Chimps

 

 

 

Köhler and Insight Learning

§Köhler described this kind of learning as ‘insight learning,’ i.e., gaining insight into the relevant relations among elements of a problem rather than acquiring a set of S-R bonds

§He argued that what is acquired is a new set of internal representations of the situation, i.e., knowledge

§He also pointed to transfer of learning to new situations

 

Tolman’s Work

§Evidence for a cognitive interpretation of learning, i.e., that knowledge is what is acquired

§Organisms acquire ‘cognitive maps’

§Latent learning, i.e., learning without any obvious reinforcement

§Rats learned the maze just by exploring it, without reinforcement (Rf)

§Learning without active responding

§Rats ferried around maze in wire basket learned as well as those actively running the maze (McNamara, Long & Wike, 1956)

§Evidence for animals developing expectancies about instrumental contingencies

§Rats trained with a large reward and then switched to a smaller one perform more poorly than rats trained all along with the smaller reward. This contrast effect indicates that the animals had developed expectancies.