Learning theories Part 1: Behaviorism
With an emphasis on producing observable and measurable outcomes, behaviorism excludes the role of emotions, motives and even thinking. Learning is said to be accomplished when a proper response is demonstrated following the presentation of a specific environmental stimulus. Learning is equated with changes in the form or frequency of observable performance (Ertmer & Newby, 1993). The behaviorist doesn't deny that mental activity occurs, but the theory is not interested in what happens between stimulus and response. In other words, it treats the mind as something of a black box. Some later behaviorists, under the label "radical behaviorism," attempted to account for internal processes (Moore, 2013). These issues muddle the picture, so we won't deal with them here.
Environmental factors are considered dominant; the arrangement of stimuli and consequences within the environment shapes behavior. The learner is characterized as being reactive to conditions in the environment as opposed to taking an active role in discovering the environment (Nalliah, 2014). The connection between the stimulus and the response occurs as a result of the reinforcement received following the behavior. Thus learning is defined as a change in behavior. Rewards are "in the eye of the beholder" and might include a sense of well-being, praise, or changed conditions (e.g., thirst is quenched, others respond) – any consequence that increases the likelihood of repeating the behavior. While there are many common reinforcers, we can only determine what is reinforcing for a given individual by observing the change in that individual's behavior. To be reinforcing in this way, the positive consequence must occur within a very limited time span following the behavior. Neuroscientists attribute the fascination with our electronic gadgets (e.g., instant messaging, video games) to their near-term rewarding consequences.
"Behaviorists prescribe strategies that are predicted to be most useful for building and strengthening stimulus-response associations, including the use of instructional cues, practice and reinforcement" (Ertmer & Newby, 1993). Behaviorism has proven reliable in facilitating learning that involves discrimination, generalization, and chaining. It has not been as useful for higher-level skills or those that require a greater depth of processing. Keep in mind, though, memorizing facts and procedures is often a first step toward more in-depth knowledge and skill building.
The following assumptions of behaviorism have direct relevance to instructional design (Ertmer & Newby, 1993).
- An emphasis on producing measurable outcomes in learners using behavioral objectives, task analysis, and criterion-referenced assessment.
- Learner analysis to determine where instruction should begin using pre-assessment.
- Emphasis on mastering early steps before progressing to more complex performance using sequencing, presentation, and mastery learning.
- Use of reinforcement to impact performance using tangible and intangible rewards, and informative feedback.
- Use of cues, shaping, and practice to ensure a strong stimulus-response association using prompts and simple to complex sequencing of practice.
All behaviorist practices arise from two developments in physiological research: classical conditioning and operant conditioning. Conditioning is a behavioral process whereby a response becomes more frequent or more predictable in a given environment as a result of reinforcement.
Classical conditioning pairs a formerly neutral stimulus with an unconditioned stimulus so that the formerly neutral stimulus comes to elicit the same response as the unconditioned stimulus. For example, pairing an anxiety-provoking situation, such as performing in front of a group, with pleasant surroundings helps the student learn new associations. Instead of feeling anxious and tense in these situations, the student can learn to stay relaxed and calm. Key concepts of classical conditioning are reinforcement, extinction, generalization, and discrimination:
- Reinforcement is any consequence of a response that increases the likelihood of further responding. Humans respond to a wide variety of tangible and intangible rewards, including candy, praise, rest breaks, and especially internal satisfaction. External rewards are most effective in establishing simple behaviors and must be externally administered while internal rewards are longer lasting and are self-administered.
- Extinction is the weakening of a response through removal of the reward (non-reinforcement): for example, ignoring inappropriate comments after having previously paid attention to those who made them. Individuals often initially increase the behavior before decreasing and eventually stopping it. Spontaneous recovery may occur after the passing of time, but the behavior quickly returns to zero if it fails to bring the desired reinforcement.
- Generalization occurs when other similar stimuli elicit the conditioned response. The performance-anxious student who has learned to stay calm in one setting may generalize that calm to other similar settings. Stories or case examples can be separated into surface and structural features. Surface features might include names, places and the like while structural features may include problem structure and patterns of analysis.
- Discrimination occurs when the individual learns to respond only to the conditioned stimulus and not to similar stimuli. When teaching students to discriminate tones, for example, a machine might play a pleasant ding only when the correct response is given. Pattern recognition is a form of discrimination involving sensory discrimination and also patterns of information. Examples include diagnosing diseases from symptoms, recognizing patterns of light and sound, and the regression, sequencing, and labeling inherent in all classification systems.
Operant conditioning does not elicit a response, but rather waits for the individually initiated (operant) behavior. Operant behaviors act on their environment and become more or less likely to be repeated because of reinforcement. Reinforcement is responsible for behavior strengthening: increasing the rate of responding or making responses more likely to occur. Key concepts in operant conditioning are positive and negative reinforcement, positive and negative punishment, reinforcement schedules, shaping, and chaining:
- Positive reinforcement is providing something pleasant after a behavior, increasing the probability that the behavior will continue. Negative reinforcement is taking away something unpleasant as a result of the behavior, increasing the behavior.
- Positive punishment is used to decrease a behavior by presenting something unpleasant after the behavior. Negative punishment decreases a behavior by removing something pleasant after the behavior. It's important to understand that while punishment reduces or suppresses the likelihood of a behavior, it does not eliminate it. This is due to the fact that there is no alternative behavior introduced to replace the undesired behavior. Punishment also increases the likelihood that the behavior will re-emerge in an alternate form.
- Reinforcement schedules, or intervals, have differing impacts on the targeted behavior, as we see in Figure 2. The schedules:
  - Continuous = reinforcing every desired response.
  - Intermittent = reinforcing some but not all desired responses. There are two types of intermittent schedule, each in fixed and variable forms.
    - Interval schedules = reinforcing the first correct response after a set amount of time has elapsed (fixed), or after variable amounts of time centered around an average (variable).
    - Ratio schedules = reinforcing every Nth correct response (fixed), or after variable numbers of correct responses centered around an average value (variable).
- As we see in Figure 2, the variable ratio schedule of reinforcement results in the highest rate of correct responding, followed by fixed ratio, with the interval schedules trailing. Note how the fixed interval schedule results in a "scalloped" responding rate. Though behaviorism is not interested in the mind, we can predict that learners track the passage of time and begin responding more rapidly as the fixed interval approaches.
- So why not rely on ratio schedules only, since they produce faster learning? A limiting factor in ratio schedules is fatigue due to rapid responding. As a consequence, practice is effective for a much shorter time than with interval schedules. "The variable-interval schedule produces a steady rate of responding. Unannounced quizzes operate on variable-interval schedules and typically keep students studying regularly. Intermittent schedules are more resistant to extinction than continuous schedules: When reinforcement is discontinued, responding continues for a longer time if reinforcement has been intermittent (variable) rather than continuous. The durability of intermittent schedules can be seen in people’s persistence at such events as playing slot machines, video games, fishing, and shopping for bargains" (Schunk, 2012). Note that reinforcement schedules are an integral component in games, including learning games.
- Shaping (using successive approximation) is learning by doing accompanied by corrective feedback. The basic procedure for shaping involves five steps.
  - Identify what the student can do (the initial behavior): current skill at writing an essay, for example.
  - Identify the desired behavior: create a rubric of essential elements.
  - Identify potential reinforcers: praise, grades.
  - Break the desired behavior into substeps to be mastered sequentially: the student works on one essential element at a time.
  - Move the learner from the initial behavior to the desired behavior by successively reinforcing each step (approximation) toward the desired behavior: a well-written essay.
- In practice, adults can often work on multiple elements at a time, the schedule (the number of successive approximations) can be reduced, and shaping may end before the desired behavior is accomplished (i.e., students are allowed a limited number of revisions).
- Chaining consists of a series of behaviors, each of which sets the stage for the next behavior. Most human behaviors consist of a sequence of discrete actions chained together to form a complete process. In behavioral terms, each behavior changes the environment and this altered condition serves as the stimulus for the next. For example, shooting a basketball requires dribbling, turning, getting set in position, jumping, and releasing the ball. Chaining can be seen in procedural learning, such as operating a microscope, baking a cake, and conducting an investigation. "Some chains acquire a functional unity; the chain becomes an integrated sequence such that successful implementation defines a skill. When skills are well honed, execution of the chain occurs automatically" (Schunk, 2012).
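The reinforcement schedules described above can be made concrete with a small simulation. The following is a minimal sketch, not a standard implementation: the function name `make_schedule`, the schedule labels, and the choice to draw variable requirements uniformly between 0.5 and 1.5 times the average are all assumptions made for illustration.

```python
import random

KINDS = {"continuous", "fixed_ratio", "variable_ratio",
         "fixed_interval", "variable_interval"}


def make_schedule(kind, n, rng=None):
    """Return a function that decides whether a correct response is reinforced.

    kind: one of KINDS (hypothetical labels for this sketch)
    n:    responses (ratio schedules) or seconds (interval schedules);
          for variable schedules it is the average requirement.
    """
    if kind not in KINDS:
        raise ValueError(f"unknown schedule: {kind}")
    rng = rng or random.Random(0)

    def draw():
        # Assumption: variable schedules vary uniformly around the average n.
        if kind.startswith("variable"):
            return rng.uniform(0.5 * n, 1.5 * n)
        return n

    state = {"count": 0, "last": 0.0, "need": draw()}

    def reinforce(now):
        """Call once per correct response made at time `now` (seconds)."""
        if kind == "continuous":
            return True
        state["count"] += 1
        if kind.endswith("ratio"):
            met = state["count"] >= state["need"]   # enough responses made
        else:
            met = now - state["last"] >= state["need"]  # enough time elapsed
        if met:
            # Reset the counters and draw the next requirement.
            state["count"] = 0
            state["last"] = now
            state["need"] = draw()
        return met

    return reinforce


# One correct response per second on a fixed-ratio 3 schedule.
fr = make_schedule("fixed_ratio", 3)
print([fr(t) for t in range(1, 7)])
# → [False, False, True, False, False, True]
```

On the fixed-interval schedule only the first correct response after the interval has elapsed is reinforced, which is why responding early in the interval goes unrewarded and the "scalloped" pattern appears.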
Other lessons of behaviorism
Behaviorism's influence on teaching and learning can also be seen in the following lasting contributions. Note how many "new" ideas arise out of this oldest learning theory.
- Behavioral responses (i.e., active learning) to stimuli strengthen their connection. Lack of behavioral response to stimuli (passive learning) weakens their connection (forgetting).
- When one is prepared (ready) to act, doing so is rewarding and not to do so is punishing. Consider that only one student with raised hand gets called on in one-to-many settings (instructor up front facing rows of students).
- Practice or training in one setting in a specific context does not improve one's ability to execute that skill generally (context specificity). Thus, skills need to be practiced in as authentic settings as possible and/or in a number of settings.
- Skills should be taught across the curricula, at the time when the learner is conscious of the need for them as a means of satisfying a useful purpose.
- Humans are capable of self-regulation through the administration of their own reinforcers.
- Behavioral learning objectives expose internal processes as well as behavior to examination by others.
- Individuals learn at different rates (pointing to the need for individualized instruction).
- Too much repetitive practice negatively affects motivation, thus failing to promote learning.
- Contingency contracts, which are agreements between instructor and learner specifying the work the learner will accomplish and the expected outcome (reinforcement) for successful performance, are useful for individualized instruction.
Applying behaviorism in instruction essentially involves three tasks for the instructor and designer (in Ertmer & Newby, 1993):
- Determine which cues can elicit the desired responses.
- Arrange practice situations in which prompts are paired with the target stimuli that initially have no eliciting power but which will be expected to elicit the responses in the "natural" (performance) setting.
- Arrange environmental conditions so that students can make the correct responses in the presence of those target stimuli and receive reinforcement for those responses.
Programmed instruction, or the automation of instruction, first came about with teaching machines in the 1920s. Benjamin (1988) describes its history succinctly: "Programmed instruction emerged in the 1920's with Sidney Pressey at Ohio State University with very little notice, then reappeared in the 1950s with B.F. Skinner at Harvard and enjoyed considerable popularity in the early 1960s but was gone by the late 1960s, only to reemerge in the 1980s as computer-based training." Now we use the internet, but the principles remain much the same. The principles (Silvern, 1962):
- Instruction is provided without the intervention of a human instructor.
- The learner receives immediate knowledge of her progress in the form of feedback.
- There is a participative, overt interaction between learner and machine (program).
- The subject matter, identified as a sequence of teaching points which are synthesized to form a whole lesson, is carefully controlled and consistent.
- Reinforcement is used to strengthen learning.
The essence of programmed instruction can be seen in Figure 3. For more, watch the video "B.F. Skinner. Teaching machine and programmed learning."
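The principles listed above can also be expressed as a short program loop. This is a minimal sketch under stated assumptions, not a reconstruction of any actual teaching machine: the names `run_program`, `frames`, and `answer_fn` are hypothetical, and stopping when a frame is not mastered within `max_attempts` is one simple way to model mastery before progression.

```python
def run_program(frames, answer_fn, max_attempts=3):
    """Present frames in order and give feedback after every response.

    frames:    list of (prompt, correct_answer) pairs, ordered simple to complex
    answer_fn: callable(prompt, attempt_number) -> the learner's response
    Returns a log of (prompt, attempts_used, mastered) tuples.
    """
    log = []
    for prompt, correct in frames:
        mastered = False
        attempts = 0
        while attempts < max_attempts and not mastered:
            attempts += 1
            response = answer_fn(prompt, attempts)  # overt learner response
            mastered = response.strip().lower() == correct.lower()
            # Immediate knowledge of results serves as the reinforcer.
            print(f"{prompt} -> {response!r}: "
                  f"{'correct' if mastered else 'try again'}")
        log.append((prompt, attempts, mastered))
        if not mastered:
            break  # assumption: no progression until the step is mastered
    return log


# A scripted learner who misses the first frame once, then succeeds.
frames = [("2 + 2 = ?", "4"), ("2 x 3 = ?", "6")]
scripted = {"2 + 2 = ?": ["5", "4"], "2 x 3 = ?": ["6"]}
log = run_program(frames, lambda prompt, attempt: scripted[prompt][attempt - 1])
print(log)
# → [('2 + 2 = ?', 2, True), ('2 x 3 = ?', 1, True)]
```

No human instructor intervenes, the sequence of teaching points is fixed in advance, and each response is followed at once by feedback, which mirrors the first four principles in the list.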
While behaviorism has important limitations on its own, it is often embedded within applications of other theories, and is a crucial aspect of learning in many highly specialized professions, and so its contribution to learning should not be minimized. Much of what we call expertise arises directly from behavioral learning.
The Axonify eLearning platform is an example of a learning system built upon behavioral principles using interval reinforcement schedules. From the website:
"The Axonify platform is built on principles of behavioral learning, and specifically the concept of spaced repetition, also known as interval reinforcement. Researchers have found that this type of evidence-based learning has a positive impact on knowledge retention. We are the first to take the concept of interval reinforcement and apply it effectively to employee awareness. Axonify closes the gap between what employees know and what they need to know to effectively do their jobs."
Axonify’s approach to interval reinforcement training is founded on two facts:
Tests greatly enhance learning and promote long-term retention—they’re more than a neutral assessment of what an employee knows, but can also produce learning. When an employee takes an initial test, they retrieve information stored in their memory. Practicing this skill on initial tests enhances performance on future tests. It also increases retention.
Gamified Learning and Rewards. Rewards can be instant or build over time, allowing employees to redeem points for bigger rewards. With this module, you have the choice of leveraging your existing incentive program or creating a new one around reinforcement. Examples of prizes include cash, gift cards or merchandise. You can also incorporate a leaders’ board for gaming modules, where “bragging rights” become the reward all on their own. When you separate learning into daily chunks, rather than one long, time-intensive session, retention is increased further.
No word on its success in the workplace. Note how the description avoids any discussion of the internal processes of individuals subject to the training.
These people contributed significantly to the development of behavioral learning theory:
- Ivan Pavlov (1849-1936)
- Edward Thorndike (1874-1949)
- John B. Watson (1878-1958)
- Edwin R. Guthrie (1886-1959)
- B. F. Skinner (1904-1990)