Theories of Learning

I. Pavlov--Classical Conditioning (1849-1936)

A. Involves the study of reflexes - natural biological reactions to a stimulus

B. Definition of terms:

1. UCS = A naturally occurring stimulus which causes a natural response in the organism

2. UR (also abbreviated UCR) = The natural reflex reaction to a UCS

3. CS = A stimulus that initially may or may not elicit a reflex or response, but is conditioned to produce the same reflex as the UCS

4. CR = Newly learned response to the CS

C. Learning

1. Begins with acquisition of the response through repeated reinforcement (pairing the CS with the UCS)

2. Extinction = When the CS is given alone, w/o the UCS, the response gradually disappears

3. Spontaneous recovery = After extinction and a period of elapsed time w/o reinforcement by the UCS, the response returns (e.g., basketball skill retained after an absence)

4. Pavlov thought the CS & UCS overlapped in time in the brain. He tried varying the time between the CS & UCS, with the CS given before the UCS

5. Backward conditioning- The CS does not begin to act until after the cessation of the UCS

D. Higher Order Conditioning- a well learned CS is used to reinforce learning with another neutral stimulus, making a new or second CS
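
The acquisition and extinction phases described under C can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the update rule, the `rate` parameter, and the name `run_trials` are invented for illustration and do not come from Pavlov or from these notes. The sketch also does not model spontaneous recovery.

```python
def run_trials(strength, n_trials, paired, rate=0.3):
    """Return the CR strength after each trial (0.0 = absent, 1.0 = full).

    paired=True  -> CS presented together with the UCS (acquisition)
    paired=False -> CS presented alone, w/o the UCS (extinction)

    Assumption: strength moves a fixed fraction (`rate`) of the way
    toward its target on every trial. This is a hypothetical rule.
    """
    target = 1.0 if paired else 0.0
    history = []
    for _ in range(n_trials):
        strength += rate * (target - strength)
        history.append(strength)
    return history


# Ten paired trials, then ten CS-alone trials starting from the learned level.
acquisition = run_trials(0.0, 10, paired=True)
extinction = run_trials(acquisition[-1], 10, paired=False)

print([round(s, 2) for s in acquisition])
print([round(s, 2) for s in extinction])
```

In this sketch the response climbs during the paired trials and fades when the CS is given alone, which is the qualitative pattern the outline describes.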

II. Watson & Skinner -Behaviorism and Operant Learning

A. Watson -learning is by association (CER- conditioned emotional responses)

1. Stimulus Generalization- response broadened to other areas or to other stimuli

2. Stimulus Discrimination- Learning to respond to only one stimulus (S) and to inhibit the response to all other stimuli.

3. Response Discrimination - Giving a response that is somewhat different from the response to all other stimuli.

4. Everything is externally measurable; we are naturally rewarded by the correct behavior.

B. Skinner-- Operant Conditioning

1. Two classes of responses

a. Elicited responses- classified as respondents, such as the knee jerk or pupil constriction; related to known, specific stimuli. A REFLEX action, as in classical conditioning

b. Emitted responses - called operants, not correlated with any known stimuli; measured by the rate of response because they cannot be linked to a reflex.

2. When an operant acquires a relation to prior stimulation, it becomes a discriminated operant.

3. Reinforcement is contingent upon a response: if the occurrence of an operant is followed by presentation of a reinforcing stimulus, the strength and/or probability of the response is increased (definition of a reinforcer)

C. Extinction- Is unconditioning for Skinner

1. measured by rate of responding and total number of responses

D. Single reinforcements - (rat gets food in one press)

E. Superstitious behavior - Any behavior that is followed by a reward will be learned, along with the belief that the behavior produced the reward, even if the pairing was accidental. The reward results in learning.

F. Successive Approximations or Shaping - training of animals, works well with the use of secondary reinforcers.

G. Reinforcers

1. Positive- Strengthens probability of response by adding something good, e.g. food, water, attention etc.

2. Negative- when removed from a situation, it strengthens the probability of an operant response by taking something bad away.

3. Primary - Naturally occurring reinforcement value, universal to all species, meets some basic or primary need.

4. Secondary - A stimulus that is not originally a reinforcing one can become reinforcing through repeated association with one that is. (Money for example can be used to purchase food, food being a primary reinforcer)

H. Thorndike's Law Of Effect - behavior which is consistently reinforced is stamped in; other behavior without reward is stamped out. Good consequences lead to an increase in behavior, bad consequences lead to a reduction in behavior.

I. Schedules of reinforcement (all are forms of Partial Reinforcement; Skinner calls it Intermittent)

1. Interval reinforcement - reinforcement given at intervals of time

a. Fixed interval- given at standard intervals, produces moderate response rates

b. Variable interval - (aperiodic reinforcement) the intervals vary around an average instead of being fixed. Produces slow, steady rates of response; very resistant to extinction (for example, how often you check your mailbox, regardless of whether you recently received mail or not)

2. Ratio reinforcement - reinforcement given after a # of responses

a. Fixed ratio- reinforcement given after a standard # of responses, very high rates of response

b. Variable ratio- uses a range of ratios around a mean value, better resistance to extinction than fixed ratio

J. Extinction Ratio- the ratio of unreinforced to reinforced responses
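
The four schedules in item I and the extinction ratio in item J can be sketched as small decision rules. This is a minimal illustration, not Skinner's procedure: all function names, the uniform `respond(now)` interface, and the choice of how the variable schedules draw their next requirement are assumptions made for this example.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response. `now` is ignored (kept for a uniform interface)."""
    count = 0

    def respond(now):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after a varying number of responses that averages mean_n."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)  # assumed spread around the mean

    def respond(now):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """Reinforce the first response made once `seconds` have elapsed."""
    available_at = seconds

    def respond(now):
        nonlocal available_at
        if now >= available_at:
            available_at = now + seconds
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the wait varies around mean_seconds."""
    available_at = rng.uniform(0, 2 * mean_seconds)  # assumed spread

    def respond(now):
        nonlocal available_at
        if now >= available_at:
            available_at = now + rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


def extinction_ratio(outcomes):
    """Unreinforced responses divided by reinforced responses.

    `outcomes` is a list of booleans, one per response. Raises
    ZeroDivisionError if no response was reinforced.
    """
    reinforced = sum(outcomes)
    unreinforced = len(outcomes) - reinforced
    return unreinforced / reinforced


# One response per second for 20 seconds on a fixed-ratio 5 schedule.
fr5 = fixed_ratio(5)
fr_outcomes = [fr5(t) for t in range(1, 21)]
print(sum(fr_outcomes), extinction_ratio(fr_outcomes))

# The same response pattern for 30 seconds on a fixed-interval 10 s schedule.
fi10 = fixed_interval(10)
fi_outcomes = [fi10(t) for t in range(1, 31)]
print(sum(fi_outcomes), extinction_ratio(fi_outcomes))
```

Ratio schedules count responses while interval schedules watch the clock, so under an interval schedule extra responses before the interval has elapsed earn nothing.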

III. Punishment

A. Suggestions- swift, certain, sufficient

1. Timing is key

a. Must immediately follow the behavior

b. No delays

2. Severity must be adequate

3. Do not increase severity over time

B. Disadvantages

1. Often Arbitrary

2. Poor modeling

3. No reinforcement of desired behavior.

4. Poor self image results

5. Power and escalation

6. Not shown to be as effective as positive reinforcers

7. Harmful to subjects

C. Alternatives

1. Time Out

a. Sent to room, without rewards, in response to negative behavior

b. 5-20 minutes duration

c. avoid negative attention

2. Natural & Logical Consequences

3. Behavior Modification

a. Token economy

(1) Start with a baseline

(2) Target Behavior

(3) Use token as secondary reinforcer

4. Aversion Training

a. Use aversive stimuli to reduce behavior.

5. Systematic Desensitization

a. Create a SUDS (Subjective Units of Distress Scale) hierarchy

b. Progressive relaxation is practiced with each item in the SUDS hierarchy until relaxation is successful

c. Association and pairing of relaxation with each item in hierarchy until fear response is no longer present

6. Modeling

a. Role model for appropriate behavior
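
The token economy steps listed under Behavior Modification (baseline, target behavior, token as secondary reinforcer) could be sketched as a small bookkeeping class. Everything here is hypothetical: the class name `TokenEconomy`, its methods, the one-token-per-occurrence rule, the `exchange` step, and the sample behaviors are assumptions for illustration, not a clinical protocol.

```python
class TokenEconomy:
    """Minimal bookkeeping for a token economy (illustrative only)."""

    def __init__(self, target_behavior, tokens_per_occurrence=1):
        self.target_behavior = target_behavior
        self.tokens_per_occurrence = tokens_per_occurrence
        self.baseline_count = 0
        self.treatment_count = 0
        self.tokens = 0

    def observe_baseline(self, behavior):
        """Step 1: count the target behavior before any tokens are given."""
        if behavior == self.target_behavior:
            self.baseline_count += 1

    def observe_treatment(self, behavior):
        """Steps 2-3: each occurrence of the target behavior earns tokens."""
        if behavior == self.target_behavior:
            self.treatment_count += 1
            self.tokens += self.tokens_per_occurrence

    def exchange(self, cost):
        """Trade tokens for a backup reward; return True if affordable."""
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical observations: a baseline period, then a token period.
economy = TokenEconomy("homework done")
for behavior in ["homework done", "tv", "tv"]:
    economy.observe_baseline(behavior)
for behavior in ["homework done", "homework done", "tv", "homework done"]:
    economy.observe_treatment(behavior)

print(economy.baseline_count, economy.treatment_count, economy.tokens)
```

Comparing `treatment_count` with `baseline_count` is what makes the baseline step useful: without it there is no way to tell whether the tokens changed the behavior.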

IV. Avoidance Training- Learning a desirable behavior in order to prevent an unpleasant condition such as punishment from happening.

V. Social Learning

A. Observational Learning (Vicarious Learning)- Modeling, learning by "watching"

1. Pay attention to model

2. Remember what model does

3. Convert what was learned into action

VI. Cognitive Learning

A. Cognitive learning refers to "understanding"

B. Cognitive Maps - internal representations of relationships

C. Tolman's Latent Learning- Learning that is stored up over time but not immediately reflected in behavior change.

Key Terms

learning: Learning is any relatively permanent change in an organism's behavior due to experience.
behaviorism: Behaviorism is the view that psychology should be an objective science based on observable behaviors and avoid references to mental processes. Because he was an early advocate of the study of overt behavior, John Watson is often called the father of behaviorism.
classical conditioning: Also known as Pavlovian conditioning, classical conditioning is a type of learning in which a neutral stimulus becomes capable of eliciting a conditioned response after having become associated with an unconditioned stimulus.
unconditioned response (UCR): In classical conditioning, the unconditioned response (UCR) is the unlearned, naturally occurring response to the unconditioned stimulus. (Eye blink in response to a puff of air to the eye)
unconditioned stimulus (UCS): In classical conditioning, the unconditioned stimulus (UCS) is the stimulus that naturally and automatically triggers the reflexive unconditioned response. (Puff of air at the eye)
conditioned response (CR): In classical conditioning, the conditioned response (CR) is the learned response to a conditioned stimulus, which results from the acquired association between the CS and UCS.
conditioned stimulus (CS): In classical conditioning, the conditioned stimulus (CS) is an originally neutral stimulus that comes to trigger a CR after association with an unconditioned stimulus. (Tone or bell)
acquisition: In a learning experiment, acquisition refers to the initial stage of conditioning in which the new response is established and gradually strengthened.
extinction: In classical conditioning, extinction refers to the weakening of a CR when the CS is no longer followed by the UCS. In operant conditioning, extinction occurs when a response is no longer reinforced.
spontaneous recovery: Spontaneous recovery is the reappearance of an extinguished CR after a rest period.
generalization: Generalization refers to the tendency, once a response has been conditioned, for stimuli similar to the original CS to evoke similar responses. (Albert's response to fur coat or rabbit)
operant conditioning: Operant conditioning is a type of learning in which behavior is strengthened if followed by reinforcement or diminished if followed by punishment. Unlike classical conditioning, which works on reflexive behaviors, operant conditioning works on behaviors that are willfully emitted by an organism.
operant behavior: Operant behavior is behavior that operates on the environment to produce reinforcing or punishing stimuli.
shaping: Shaping is the operant conditioning procedure for establishing a new response by reinforcing successive approximations of the desired behavior. (Skinner's pigeon turning in a circle)
reinforcer: In operant conditioning, a reinforcer is any stimulus or event that strengthens the behavior it follows.
primary reinforcers: The powers of primary reinforcers are automatic, inborn, and do not depend on learning.
secondary reinforcers: A secondary reinforcer is a stimulus that acquires its reinforcing power through its association with a primary reinforcer. (Money)
continuous reinforcement: Continuous reinforcement is the operant procedure of reinforcing the desired response every time it occurs. In promoting the acquisition of a new response it is best to use continuous reinforcement.
partial reinforcement: Partial reinforcement is the operant procedure of reinforcing a response intermittently. A response that has been partially reinforced is much more resistant to extinction than one that has been continuously reinforced.
fixed-ratio schedule: In operant conditioning, a fixed-ratio schedule is one in which reinforcement is presented after a set number of responses.
variable-ratio schedule: In operant conditioning, a variable-ratio schedule is one in which reinforcement is presented after a varying number of responses.
fixed-interval schedule: In operant conditioning, a fixed-interval schedule is one in which a response is reinforced after a specified time has elapsed.
variable-interval schedule: In operant conditioning, a variable-interval schedule is one in which responses are reinforced after varying intervals of time.
punishment: In operant conditioning, punishment is the presentation of an aversive stimulus, such as shock, which decreases the behavior it follows. People often confuse negative reinforcement and punishment. The former strengthens behavior, while the latter weakens it.
observational learning: Observational learning is learning by watching and imitating others.
modeling: Modeling is the process of watching and then imitating a specific behavior and is thus an important means through which observational learning occurs.