Learning and Instruction
Abstract
When using modern educational technology, some forms of instruction are inherently transient in that previous information usually disappears to be replaced by current information. Instructional animations and spoken text provide examples. The effects of transience due to the use of animation-based instructions (Experiment 1) and spoken information under audio-visual conditions (Experiment 2) were explored in a cognitive load theory framework. It was hypothesized that for transient information presented in short sections, animations would be superior to static graphics, due to our innate ability to learn by observing. For transient information in long sections, animations should lose their superiority over static graphics, due to working memory overload associated with large amounts of transient information. Similarly, the modality effect under which audio-visual information is superior to visual only information should be obtainable using short segments but disappear or reverse using longer segments due to the working memory consequences of long, transient, auditory information. Results supported the hypotheses. The use of educational technology that results in the transformation of permanent into transitory information needs to be carefully assessed.
1. Introduction
Instructional technology is becoming increasingly sophisticated and increasingly ubiquitous. While there can be little doubt that the introduction of technology allows novel and beneficial forms of teaching and learning, those novel forms sometimes have unintended, incidental, and negative consequences. In this paper, we are concerned with the transient information effect (Leahy & Sweller, 2011; Sweller, Ayres, & Kalyuga, 2011). It occurs when instructional procedures present information in a form that is transient and difficult to retrieve rapidly and when required. Animations and spoken information both provide examples. While technology permits the ready use of animations and spoken text, both incidentally transform the permanent information associated with hard copy into transient information that rapidly disappears to be replaced by new information. Transient information has negative cognitive load consequences that are explored in the two experiments of this paper. We will begin by outlining cognitive load theory.
Cognitive load theory is a framework of instructional design principles based on the characteristics and relations between the structures that constitute human cognitive architecture, particularly working memory and long-term memory. The theory assumes that human cognitive architecture is a natural information processing system, analogous to other systems such as evolution by natural selection (Sweller, 2011, 2012; Sweller et al., 2011; Sweller & Sweller, 2006). It can be specified by five principles:
1. Long-term memory and the information store principle. Most human cognition is driven by the contents of an enormous information store (De Groot, 1965). In human cognition, this structure is long-term memory.
2. Schema theory and the borrowing and reorganizing principle. This principle assumes we learn primarily through borrowing schemas from other people’s long-term memories (e.g., through listening or reading what others have written). These schemas are constructively reorganized through the lens of our own long-term memory. This reorganization process is inexact, resulting in some random changes.
3. Problem solving and the randomness as genesis principle. The borrowing principle does not create new information except insofar as borrowing is inexact. If knowledge is unavailable either through our own or other people’s long-term memories, we must problem solve by randomly generating moves and testing their effectiveness. This process generates new knowledge.
4. Working memory and the narrow limits of change principle. Working memory processes must deal with the novel information generated by the randomness as genesis principle. The randomness inherent in reorganization and random problem solving generates extremely large problem-solving spaces. To reduce the problem-solving space, working memory is limited in both capacity (Miller, 1956) and duration (Peterson & Peterson, 1959).
5. Long-term working memory and the environmental organizing and linking principle. Working memory is only limited when processing novel information. It is able to deal with vast amounts of previously organized information brought in from long-term memory (Ericsson & Kintsch, 1995), reducing the burden on working memory and thus lowering cognitive load.
The structure and characteristics of this cognitive architecture indicate the primary purpose of instruction is to construct schemas in long-term memory. Instructional designs that do not aim to alter long-term memory and which ignore working memory limitations when processing novel information are unlikely to be effective.
As well as human cognitive architecture, cognitive load theory includes a framework of instructional design principles and postulates the existence of two distinct types of cognitive load (Sweller, 2010).
Intrinsic cognitive load is the cognitive load inherent within the information to be learnt. Intrinsic cognitive load is dependent on the degree to which individual elements of information must be processed simultaneously in working memory to be understandable (see Marcus, Cooper, & Sweller, 1996; Sweller, 1994, for more information). Intrinsic cognitive load is akin to the intellectual complexity of the materials and cannot be modified. The instructional content of Experiments 1 and 2 used technical material that was high in element interactivity (Sweller, 2010).
Cognitive load effects will not apply to highly automated information, or information specified without learning as the main goal. For example, everyday conversation or film and television dialogue can include quite extended spoken text; however, it can be easily processed. This text is quite different from the unfamiliar, technical, higher element interactivity material used in our experiments. The hypothesised results should be assumed only to be applicable to material that entails a higher working memory load from higher element interactivity material. Cognitive load theory becomes less relevant as levels of intrinsic cognitive load are reduced.
Extraneous cognitive load is the cognitive load that arises from instructional design practice and is within the control of the instructional designer. It is caused by an unnecessary increase in the number of elements that must be processed simultaneously in working memory due to instructional design factors. The majority of research in cognitive load theory has traditionally concentrated on techniques to reduce extraneous cognitive load in instructional materials (for more information, see van Merriënboer & Ayres, 2005; Sweller, 2010; and Sweller et al., 2011).
All cognitive load associated with instruction can be divided into intrinsic and extraneous cognitive load. In addition, the term “germane cognitive load” is frequently used, often as an independent source of cognitive load. An alternative interpretation is that because germane cognitive load is closely related to and dependent on intrinsic cognitive load, it is appropriate to define it in terms of intrinsic cognitive load (Sweller, 2010). Accordingly, germane cognitive load refers to working memory resources required to deal with intrinsic cognitive load resulting in learning. Similarly, working memory resources are required to deal with extraneous cognitive load and are sometimes referred to as extraneous resources. Reducing extraneous cognitive load can increase germane cognitive load, by releasing working memory capacity for learning (Sweller, 2010).
1.1. The present study
Many instructional design guidelines have been generated from cognitive load theory (Sweller, 2011, 2012; Sweller et al., 2011). Effective instructional designs should aim to keep total levels of intrinsic and extraneous cognitive load to within the learner’s working memory limits. Transient information is one of the factors that may cause cognitive load to exceed working memory limits. Information is transient when elements of information that must be processed by a learner disappear to be replaced by new elements. If the new and old elements interact then the old elements must be held in working memory while the new elements are processed. Modern educational technology frequently and incidentally transforms permanent information into transient information. The effect can result in an overwhelming working memory load.
As an example, when a series of static graphics used to depict motion is replaced by an animation depicting the same motion, not only is the visual information rendered more realistic as is the intention, but a permanent depiction is replaced by a transient depiction. The consequence may be a considerable increase in working memory load. Static graphics allow learners to rapidly and easily refer back to previous information as needed. Depending on the technology, it may be difficult or impossible to switch rapidly and easily between various aspects of an animated depiction. The natural way to view an animation is serially. Exactly the same argument applies when written text is replaced by spoken text. Permanent, written text allows us to easily and rapidly refer back to text previously read. It can be much more difficult to refer back to transient, spoken text and that difficulty may negate any advantages associated with a more natural mode of presentation.
One way in which the potential problems associated with transient information may be overcome is to present the potentially transient information in much shorter segments. A short segment of information should impose a reduced cognitive load compared with a longer segment. By reducing the cognitive load associated with long segments of information containing lots of interacting elements, the natural advantages of animations and speech may manifest themselves. For animations this means allowing us to tap into our innate ability to learn by observing (Ayres, Marcus, Chan, & Qian, 2009; Wong et al., 2009); and for speech this means harnessing the highly practiced skill of learning by listening within a dual modality context (Mousavi, Low, & Sweller, 1995).
Experiment 1 of this paper applied cognitive load theory to some of the consequences of presenting animations during instruction while Experiment 2 applied the theory to some of the consequences of using audio-visual instructions. In both cases it is suggested that as the length and complexity of transient instructional information associated with animations and audio-visual presentations increases, extraneous working memory load also increases resulting in the benefits of animations and audio-visual instructions decreasing or even reversing compared to alternatives.
1.2. Aim-Hypothesis
In the two experiments, transient vs. permanent information was compared using longer and shorter segments of information. In both experiments, we hypothesised a condition by segment length interaction. In Experiment 1, there should be a greater advantage for animations over static graphics using short segments of information compared to long segments of information (Hypothesis 1a), and in Experiment 2, there should be a greater advantage for audio-visual presentation over a visual only presentation using short segments of information compared to long segments of information (Hypothesis 1b). We did not predict whether the interaction would be ordinal or disordinal because that factor is affected by incidental experimental factors. A disordinal interaction can be turned into an ordinal one by choosing shorter or longer segment lengths that do not overlap with the cross-over point. Consider the graph of a symmetrical, disordinal interaction with segment length on the X axis and test scores on the Y axis graphing both a transient information presentation and a permanent information presentation. If only the left or right side of that disordinal interaction is represented due to the segment lengths chosen, the interaction will be ordinal rather than disordinal.
2. Experiment 1
The shift towards delivering increasing amounts of instructional materials via electronic and online devices in educational environments has resulted in the use of instructional animations in classrooms becoming increasingly commonplace. Although the use of animations has become more common in modern classrooms, the benefits of instructional animations have still not been clearly established. Much of the existing literature has found no inherent benefit of animations over static graphics (e.g., Hegarty, Kriz, & Cate, 2003; Mayer, DeLeeuw, & Ayres, 2007; Mayer, Hegarty, Mayer, & Campbell, 2005; Schnotz, Böckheler, & Grzondziel, 1999; Tversky, Morrison, & Betrancourt, 2002). The effectiveness of animations increases when supporting techniques are added, such as segmentation (Mayer & Chandler, 2001; Moreno, 2007) or user control (Hasler, Kersten, & Sweller, 2007; Schwan & Riempp, 2004). A meta-analysis by Höffler and Leutner (2007) found animation to be superior to static graphics for several conditions, e.g., realism and procedural-motor knowledge.
We suggest that the failure of instructional animations can be explained by cognitive load theory in general and the transient information effect in particular. The transience inherent in most animation means learners need to simultaneously remember and process both previously presented information as well as currently presented information to understand the learning material. However, previously learnt information may have already been lost from working memory before the current information has been processed. Static graphics, on the other hand, can be revisited on demand in a way that is possible but far more difficult using animations.
An instructional design technique to alleviate transience in animations is to use segmentation. Previous studies examining segmentation have found a benefit to segmenting animations into smaller sections (Mayer & Chandler, 2001). Moreno (2007) previously found some benefit to segmenting animations when modelling teaching behaviours to students, although the nature of the learning materials for the control condition is unclear. Spanjers, Wouters, van Gog, and van Merriënboer (2011) found novice learners benefited more from segmented animated worked examples, whereas expert learners benefited more from continuous animated worked examples. Lusk et al. (2009) found that learners with a reduced working memory span benefitted from segmented instruction while those with a high working memory span did not. However, although there are studies examining the effects of segmentation, there is little literature focussing on the underlying reasons as to when and why segmentation is effective. The current analysis suggests that segmentation may be effective not because segments divide material but because shorter segments reduce working memory load compared to longer segments.
Previous studies by Wong et al. (2009) and Ayres et al. (2009) found that for materials based around human movement, animation was consistently superior to static graphics because instructional animations tap into our innate ability to learn through observation. The present study tested the hypothesis that sufficiently large amounts of transient information would eventually attenuate the effectiveness of human movement-based instructional animations. It examined the effects of transience in animated and static graphics-based learning materials, for a human movement task. A human movement based task was selected as these types of tasks have been found to lead to superior learning using animated instructions when compared to statics (see Ayres et al., 2009; Wong et al., 2009). It was predicted that when presenting learning materials in relatively short sections, the amount of transient information would remain within working memory limits.
Consequently, animation should be superior to static graphics because it realistically indicates to learners all relevant movements. Conversely, the amount of transient information could exceed working memory limits for learning materials presented in long sections, imposing an extraneous cognitive load. In this case, animation would lose its advantage over static graphics. Experimentally, this pattern was expected to translate into a segment length by mode of presentation interaction effect.
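Whether such an interaction appears ordinal or disordinal depends on where the chosen segment lengths fall relative to the cross-over point. The following minimal Python sketch illustrates this with invented linear score curves. The functions `transient_score` and `permanent_score`, and every number in them, are hypothetical and serve only as an illustration; they are not data or models from this study.

```python
def classify_interaction(transient, permanent, short_len, long_len):
    """Classify a 2 x 2 interaction from two score curves sampled at two lengths."""
    # Advantage of the transient format at each sampled segment length
    adv_short = transient(short_len) - permanent(short_len)
    adv_long = transient(long_len) - permanent(long_len)
    if adv_short == adv_long:
        return "none"        # parallel lines: no interaction
    if adv_short * adv_long < 0:
        return "disordinal"  # advantage changes sign: the lines cross
    return "ordinal"         # advantage changes size but not sign


# Hypothetical curves (assumed for illustration): scores fall with segment
# length, and fall faster for the transient presentation. They cross at 75.
def transient_score(length):
    return 80 - 0.5 * length


def permanent_score(length):
    return 50 - 0.1 * length


# Both lengths left of the cross-over, spanning it, and both right of it
left_only = classify_interaction(transient_score, permanent_score, 20, 60)
spanning = classify_interaction(transient_score, permanent_score, 20, 120)
right_only = classify_interaction(transient_score, permanent_score, 90, 120)
print(left_only, spanning, right_only)
```

Under these assumed curves, only the pair of segment lengths that spans the cross-over point yields a disordinal pattern, which is why no prediction was made about the form of the interaction.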
2.1. Method
2.1.1. Participants and design
Participants were 66 children (28 male, 38 female) in Year 5, 10–11 years old, from two co-educational primary schools in Sydney. Parental consent for participation was obtained. Participants were allocated to one of four experimental groups: short section animation (n = 16), short section static graphics (n = 16), long section animation (n = 17) or long section static graphics (n = 17).
2.1.2. Materials
The task domain for the experiment was origami. The animation for the origami task was filmed as a video using a Canon IXUS 80IS digital camera. The learning material showed a geometric figure being folded from a single sheet of paper (Fig. 1), consisting of 24 steps in total. Each step consisted of a single movement containing on average 3–4 screenshots, depending on the complexity of the fold being made. Steps with more complex folds contained up to 5 screenshots. There was no narration, or other audio components. The same content was given to all four groups. All videos for the animation conditions were resized so that a video was the same size as a single picture from the static graphics.
Fig. 1. A step from Experiment 1 static graphics learning materials.
A. Wong et al. / Learning and Instruction 22 (2012) 449–457
The long section conditions consisted of a single long section of 24 steps that ran for 250 s. At the end of the section was a slide instructing participants that they would have 180 s to carefully consider the section that had just been shown.
The short section conditions consisted of the long section materials, split into 6 groups of 4 steps each. Between each group of steps was a slide instructing participants that they would have 30 s to carefully consider the section that had just been shown, for a total of 180 s of break time across all sections.
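The equivalence of the two schedules can be checked with a few lines of arithmetic. The sketch below uses only the durations stated above and assumes that the six short sections together run for the same 250 s as the single long section and that a 30 s slide follows each of the six sections. Both assumptions are inferred from the stated 180 s total and are labelled in the comments.

```python
# Durations taken from the materials description (seconds)
VIEWING_TIME_S = 250      # running time of all 24 steps in the long section
LONG_PAUSE_S = 180        # single reflection slide after the long section
SHORT_PAUSE_S = 30        # reflection slide after each short section
N_SHORT_SECTIONS = 6
STEPS_TOTAL = 24

steps_per_section = STEPS_TOTAL // N_SHORT_SECTIONS

# Assumption: one 30 s slide per short section, giving the stated 180 s total
short_pause_total = N_SHORT_SECTIONS * SHORT_PAUSE_S

# Assumption: the short sections sum to the same 250 s of viewing time
long_total = VIEWING_TIME_S + LONG_PAUSE_S
short_total = VIEWING_TIME_S + short_pause_total

print(steps_per_section, short_pause_total, long_total, short_total)
```

On these assumptions, the long and short section conditions differ only in how the viewing and reflection time is distributed, not in its total.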
The static graphics for the paper-folding task were screenshots of individual frames taken directly from the video. Screenshots were chosen for inclusion on the basis of the clarity of the image and how well an individual screenshot conveyed movement clearly when placed in the sequence of other screenshots that had been chosen for inclusion. The aim was to create a set of static graphics that were informationally equivalent to the animation. Depending on the perceptual complexity of a fold, between 3 and 5 screenshots were used for each step. The aim was to represent each fold with the minimum number of screenshots needed for clarity. The static graphics were then pilot tested to ensure clarity of information. The selected images were imported into the graphic design program Adobe Illustrator and compiled into the static graphics versions of the materials.
Video clips for the short section animation condition were created from the long section video using Apple QuickTime Pro. Videos for both the long and short section conditions were shown at the same size as an individual picture in the static graphics conditions. Each section ran for between 23 s and 66 s, depending on how long it took to perform the 4 steps within the section. The videos and slides were timed to automatically advance through each section.
The long section static graphics were presented as a single scrollable file for 250 s. The short section static graphics were presented for the same amount of time as the short section animation. The short section static graphics were presented as a series of 6 separate scrollable files, each containing the same 4 steps as in the short section animations. The amount of time each section was presented matched the corresponding section in the short section animation condition. In this manner, the short section static graphics were identical in all respects to the short section animation except that each section was static rather than animated. The difference in transience between the two conditions was limited to within each segment. Between segments, both the static graphics and the animation were equally transient as in neither condition could learners return to a previous segment. Files for both the long and short section static graphics materials were timed to advance through each section automatically. Learning materials were presented on 17-inch G4 Apple iMacs. The test materials consisted of a single square sheet of paper. The task tested how much of the figure participants could fold from start to finish in 4 min. The test was scored out of 16. Rotations of the origami figure that needed to be depicted in the learning materials for clarity, but which were not necessary for successfully reaching the next fold, were not given a mark. One mark was given for each fold correctly made. While this measure is objective and so highly reliable, it does suffer from an error location problem. If a learner makes an error early in the exercise, the subsequent folds are likely to be impossible to make while an error later in the exercise may have only minor consequences. This issue is inherent in any transformation problem in which later moves are dependent on earlier moves. 
Cognitive load ratings using subjective rating scales were not administered in the current experiment, as there is evidence that Likert scales may be unreliable for use by young children (Chambers & Johnston, 2002).
2.1.3. Procedure
The experiment was run in groups of 4 participants, using consecutive learning and test phases. Participants were randomly allocated to one of the four conditions, then given a pretest asking for demographic information (age, year level and gender) as well as being asked about their previous knowledge of origami and how frequently they had folded origami figures in the previous 6 months. Immediately after completion of the pretest, those with advanced knowledge of origami were switched between groups as required, to balance levels of experience between experimental conditions.
The learning phase then began. All participants were told they would learn how to fold a geometric figure from either a series of pictures or a video and then they would be asked to fold the figure afterwards. Participants were then given written instructions about the learning tasks commensurate with the condition they had been allocated to.
Participants in the animation groups were instructed they would be shown a series of videos (short section group) or a single video (long section group) that would show them how to fold a geometric figure and how to start the video presentation. Participants in the static graphics groups were instructed they would be shown a series of pictures on how to fold a geometric figure and how to start the presentation. After the learning phase, participants then entered the test phase. All participants were told they would have 4 min to fold as much of the origami figure as possible.
2.2. Results and discussion
The variable under analysis was test score. Scores for each task were expressed as a percentage of the task performed correctly. Means and standard deviations are presented in Table 1. A 2 (animation vs. static graphics) × 2 (short section vs. long section) ANOVA run on percentage correct across the four groups demonstrated a significant interaction effect, F(1, 62) = 5.39, MSE = 534.08, p = .024, ηp² = .080, in line with Hypothesis 1a. The main effect of format was significant, showing animation was superior to static graphics, F(1, 62) = 8.71, MSE = 534.08, p = .004, ηp² = .123. The main effect of section size was not significant, F(1, 62) = .069, MSE = 534.080, p = .794, ηp² = .001. Simple effects tests following the significant interaction showed animation was superior to static graphics for the short section conditions, F(1, 30) = 18.89, MSE = 424.07, p < .001, ηp² = .386. For the long section conditions, animations were not significantly better than static graphics, F(1, 30) = .12, MSE = 653.65, p = .732, ηp² = .004. A Bonferroni correction controlling the family-wise error rate at .05 was used, requiring individual contrasts to be compared to an alpha of .025.
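The reported effect sizes can be cross-checked from the F ratios alone. Partial eta squared is SS_effect / (SS_effect + SS_error), which is algebraically equal to (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies that identity and the Bonferroni division to the values reported above. The function names are illustrative, and one numerator degree of freedom is assumed for every effect, as is standard for each effect in a 2 × 2 design.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def bonferroni_alpha(family_alpha, n_contrasts):
    """Per-contrast alpha that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_contrasts


# (F, df_effect, df_error) as reported; df_effect = 1 is assumed throughout
reported = {
    "interaction": (5.39, 1, 62),
    "format main effect": (8.71, 1, 62),
    "section size main effect": (0.069, 1, 62),
    "short sections simple effect": (18.89, 1, 30),
    "long sections simple effect": (0.12, 1, 30),
}

effect_sizes = {
    label: round(partial_eta_squared(f, df1, df2), 3)
    for label, (f, df1, df2) in reported.items()
}

for label, value in effect_sizes.items():
    print(f"{label}: {value:.3f}")

# Two simple-effects contrasts share the family-wise alpha of .05
print(f"per-contrast alpha: {bonferroni_alpha(0.05, 2):.3f}")
```

A check of this kind is useful mainly for catching transcription errors, because it depends only on the printed F values and degrees of freedom.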
Table 1
Means and standard deviations (in parentheses) of scores on test expressed as percentage correct in Experiment 1.
Condition          Short section    Long section
Animation          66.41 (18.94)    54.69 (17.90)
Static graphics    36.40 (22.42)    51.10 (30.48)
The presentation by length interaction can be explained by assuming that animations provide learners with more realistic information that could be assimilated more easily than static graphics. We have an innate ability to learn by observing that instructional animations demonstrating movement are able to exploit (see van Gog, Paas, Marcus, Ayres, & Sweller, 2009; Wong et al., 2009 for further discussion). When demonstrating a paper folding task, there are fewer inferences required and so a reduced working memory load if all of the relevant information indicating which folds need to be made is presented explicitly. For this reason, the animated version of the instructions may have been superior to the static graphics version for short sections. In contrast, static graphics do not gain this advantage of animation. Inferences must be made with their attendant working memory load.
While the animated version of the instructions should reduce cognitive load by reducing the number of required inferences, if the animation is lengthy then its transient character may become relevant and the advantage of animations over static graphics may be reduced. Having to hold several previous moves in working memory while attending to the current move may place an additional load on working memory and prevent the transfer of knowledge to long-term memory. For this reason, the advantages of animation over static graphics may be countered by the disadvantages of long, complex transient information.
The short graphic presentation not only does not gain the advantages of animation, there is a working memory cost associated with having to hold information in working memory while waiting for the next segment. As a consequence, the normal advantage of static graphics over animation, that they are permanent, is reduced, increasing the relative advantage of animations.
The same disadvantage applies to the short, animation condition but animation usually has a problem of transience and so the segmentation of animations may have only a minor effect. In contrast, for the long static graphic presentation, there are no advantages accruing that can be derived from animation but there is an advantage to having access to all of the information without unnecessary gaps imposing an unnecessary working memory load. That information can be accessed at any time without transience, contributing to the elimination of the animation advantage. The interaction obtained was ordinal rather than disordinal. We might expect that had the long segments been even longer or if the material had been more complex, a full, disordinal interaction might have been obtained with static graphics being superior to animations.
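The ordinal form of the interaction can be read directly from the cell means in Table 1. As a minimal sketch (the variable names are illustrative, and the calculation is descriptive rather than a significance test), the animation advantage at each section length and the difference between those two advantages can be computed as follows:

```python
# Cell means (percentage correct) from Table 1
means = {
    ("animation", "short"): 66.41,
    ("animation", "long"): 54.69,
    ("static", "short"): 36.40,
    ("static", "long"): 51.10,
}

# Advantage of animation over static graphics at each section length
advantage_short = means[("animation", "short")] - means[("static", "short")]
advantage_long = means[("animation", "long")] - means[("static", "long")]

# Difference of differences: the descriptive size of the interaction
interaction_contrast = advantage_short - advantage_long

# Same sign at both lengths means the lines do not cross within the sampled range
pattern = "ordinal" if advantage_short * advantage_long > 0 else "disordinal"

print(f"short: {advantage_short:.2f}, long: {advantage_long:.2f}")
print(f"contrast: {interaction_contrast:.2f} ({pattern})")
```

The animation advantage shrinks from about 30 percentage points with short sections to under 4 with long sections but does not change sign, which is the sense in which the obtained interaction was ordinal.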
While these relations are complex, they do flow from the basic assumptions of cognitive load theory with its emphasis on working memory and its limitations. Support for the transience by segment length hypothesis would be obtained if similar results could be generated comparing transient with permanent information in an unrelated area. While similar results in two different areas can always be due to unrelated factors, the ability to predict results using cognitive load theory and to integrate otherwise disparate findings is an advantage. Experiment 2 again tested for an interaction (Hypothesis 1b) between presentation mode and length and complexity of information but in this case, using audio-visual modalities rather than animation and static graphics.
3. Experiment 2
The modality effect (Sweller et al., 2011) occurs when audio-visual presentations are superior to visual only presentations. For example, a diagram may have its associated text presented in written or spoken form. The modality effect is demonstrated if learning is enhanced by presenting the text in spoken rather than written form in conjunction with the diagram which is presented visually (Mousavi et al., 1995). The effect can be readily demonstrated with many examples in the literature (see Ginns, 2005 for a meta-analysis) and is of considerable current interest (e.g., see Mayrath, Nihalani, & Robinson, 2011). It needs to be noted that the effect only is obtainable when the auditory and visual components are unintelligible in isolation and when the information is complex (Tindall-Ford, Chandler, & Sweller, 1997).
The effect can be explained within a cognitive load theory framework using Baddeley’s (1992) working memory theory. The theory assumes a visual processor, the visual-spatial sketchpad, to deal with visual information and an auditory processor, the phonological loop, to deal with language. Language is processed by the phonological loop whether it is presented in written or spoken form but, of course, if presented in written form, it initially must be processed by the visual-spatial sketchpad and must be translated into auditory form. For this reason, cognitive load theory assumes that a written presentation is more likely to overload the visual-spatial sketchpad than an auditory presentation that has no need to use the visual-spatial sketchpad and no need to translate written information into auditory form. As a consequence, more working memory resources are available to deal with information presented in a dual, audio-visual mode than a single, visual only mode, leading to the modality effect.
The modality effect is an established effect and dual modality presentations can be readily used with modern educational technology.
Nevertheless, the use of auditory presentations introduces transient information. Spoken information is transient in exactly the same way as animations. In order to process speech, we must hold previous information in working memory while processing current information and it may be difficult or impossible to re-access the earlier information. Of course, as was the case for animation, transience may not constitute a working memory issue if the information is brief or low in complexity. Spoken information that is brief or low in complexity may be easy to hold and process in working memory. In contrast, if transient, spoken information is long and complex, it may be difficult or even impossible to process adequately. Accordingly, dual mode presentation may only be effective and the modality effect may only be obtainable using relatively short, simple, verbal materials. As the length and/or complexity of the verbal materials increase, we can hypothesize that the modality effect should first disappear and then possibly reverse with visual only material being superior to audio-visual material. In other words, the same interaction can be predicted when testing for the modality effect as was obtained when using animations in Experiment 1.
There is some evidence for this hypothesis. Tabbers, Martens, and van Merriënboer (2004) obtained a reverse modality effect using verbal statements that appeared longer and more complex than those used in most other studies. Leahy and Sweller (2011) also obtained a reverse modality effect by increasing the length of the verbal statements.
Experiment 2 tested the hypothesis that length of verbal statements may be critical to the modality effect by comparing audio-visual presentations with visual only presentations in which the verbal components were varied in length. Four groups of upper primary school students were presented with either longer or shorter audio text or longer or shorter visual text.
The learning material consisted of instructions on how to read a temper atureetime graph.
3.1. Method
3.1.1. Participants
The participants were 42 primary level students from two Year 6 classes randomly assigned to four groups. Their age range was 11–12 years. Parental consent to participate was obtained. All students attended a Sydney private school. The primary school mathematics and science curriculum for this age level requires the reading of material similar to the material used in the experiment. The students’ teachers indicated that the students had received some limited tuition in reading temperature–time graphs before the experiment. The experiment was conducted in the final term of the school year and during the first lesson periods of the day.
A. Wong et al. / Learning and Instruction 22 (2012) 449–457
3.1.2. Materials
The materials used during the learning phase consisted of introductions to temperature–time graphs and worked examples of their interpretation. These were given to each group in a presentation as a series of PowerPoint slides. The experiment used a 2 (modality) × 2 (length of verbal segments) design resulting in four instructional groups:
1. A longer audio text instruction group;
2. A longer visual text instruction group;
3. A shorter audio text instruction group;
4. A shorter visual text instruction group.
All four groups were provided with an introduction to the use of temperature–time graphs and five worked examples. The first slides introduced a single day temperature line graph similar to Fig. 2 except it only represented one, not three days. These slides showed various basic components, for example, the horizontal time axis, the vertical temperature axis, the location points, the temperature progress line and grids. Subsequent slides used one worked example to demonstrate how to find the temperature on one day from the previous single day graph. The question asked, “What is the temperature at 10 am?”, was written on top of the slide. The steps to solve the question were also displayed on this slide within textboxes. Further slides provided another introduction and showed the components of a 3 day graph. The final slides showed various worked examples on how to find: one temperature, two temperatures, a change of temperature between set hours, the lowest temperature between set hours and the time of a set temperature.
The content and general procedures of the PowerPoint instructions and the total presentation times were identical for the four groups. The total presentation time for the slides was 330 s. The four groups differed on the two variables of modality and length of text.
To establish differences in modality, the written textbox information shown to the longer and shorter visual text only groups was eliminated and provided in an audio format for the longer and shorter audio text groups. Questions were also presented in audio format. If a textbox used no arrow, one arrow or two arrows to point to relevant features within the diagram for the visual groups, the slides for the audio text groups were presented identically with no arrows or the same number of arrows. The textbox was removed and only the arrow component (where included) remained. Differences in length of text slides can be described as follows: The longer audio text and longer visual text groups had 9 slides displayed for a total duration of 330 s, timed to advance automatically after 17 to 55 s each, depending on the amount of material in the slide. (Prior to this experiment, a small pilot study was completed to gauge workable presentation times.) Equivalent slides for both groups were of identical duration. The total word count for all nine slides was 421 words giving an average of 46.77 words per content slide. Therefore, the average reading or listening time allowable for the content in the longer text slides was .78 s per word. In contrast to the longer text groups, the shorter audio text and shorter visual text groups had 47 slides in total, timed to advance automatically after 4 to 12 s each. More slides were needed than for the longer text groups because the material was broken down into more sections. Equivalent slides for these two shorter text groups were of identical duration. The total word count was 395 words giving an average of 8.4 words per content slide. The average listening or reading time allowable for the content in the shorter text slides was .83 s per word.
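The pacing figures above follow from simple division, and a short sketch can make the arithmetic easy to check. The slide counts, word counts and 330 s duration are taken from the text; the function name `pacing` and the dictionary layout are illustrative assumptions only. The computed values should agree with the reported ones up to rounding or truncation in the last digit.

```python
# Values reported in the Materials section; names below are illustrative only.
TOTAL_SECONDS = 330

CONDITIONS = {
    "longer text": {"slides": 9, "words": 421},
    "shorter text": {"slides": 47, "words": 395},
}


def pacing(slides, words, total_seconds=TOTAL_SECONDS):
    """Return (average words per slide, average seconds available per word)."""
    return words / slides, total_seconds / words


if __name__ == "__main__":
    for name, c in CONDITIONS.items():
        words_per_slide, seconds_per_word = pacing(c["slides"], c["words"])
        print(f"{name}: {words_per_slide:.2f} words/slide, "
              f"{seconds_per_word:.2f} s/word")
```

Both conditions leave a similar amount of time per word, so the manipulation changes the size of each segment rather than the overall pace of presentation.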
Note that even though the number of words on the slides varied between the longer and shorter text groups, with the longer text slides containing more words due to connectors, the content was identical. There were no pauses between slides viewed by the four groups.
There were two sheets of paper for the test material. Sheet 1 displayed a temperature graph that was identical to the graph in Fig. 2 with the textbox and arrow removed. Sheet 2 had seven test questions (see Appendix A). Students used Sheet 1 to answer the questions on Sheet 2. The questions were transfer questions that were quite complex (tapping higher element interactivity knowledge), composed to test whether learners had understood the information during the learning phase adequately to demonstrate transfer. The maximum time allowed to answer all of the test questions was 20 min. All students completed their questions within this time limit; however, they were not individually timed. They were also not allowed to review information from the learning phase during the test.
3.1.3. Procedure
The experiment consisted of a pre-instruction, an instruction and a test phase. In the pre-instruction phase all the students were informed by the researcher, using a memorized script, that they were going to be taught how to read a temperature–time graph by being shown worked examples contained in a PowerPoint presentation. They were further told that during the entire instruction phase they would have to concentrate carefully by watching the slides. As outlined previously, Year 6 students in New South Wales have experience in interpreting a variety of graphs due to the curriculum content. The two classes were then randomly divided into the four distinct instructional groups. There were 11 participants in the longer audio text instruction group, 10 in the longer visual text instruction group, 10 in the shorter audio text instruction group, and 11 in the shorter visual text instruction group. The 20 min test phase proceeded immediately after the 330 s presentation had finished. In this phase, the researcher distributed the test graph and test sheet.
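For readers wishing to replicate the allocation step, the following is a minimal sketch of one way to assign participants at random to groups of the sizes reported above. It is not the procedure used by the authors, which the text does not describe in detail; the function `assign_groups`, the participant identifiers and the seed are all hypothetical.

```python
import random

# Group sizes as reported in the Procedure section.
GROUP_SIZES = {
    "longer audio": 11,
    "longer visual": 10,
    "shorter audio": 10,
    "shorter visual": 11,
}


def assign_groups(participants, group_sizes, seed=None):
    """Shuffle participants and cut the shuffled list into groups.

    `seed` is only there to make the illustration reproducible.
    """
    shuffled = list(participants)
    if sum(group_sizes.values()) != len(shuffled):
        raise ValueError("group sizes must sum to the number of participants")
    random.Random(seed).shuffle(shuffled)

    groups, start = {}, 0
    for name, size in group_sizes.items():
        groups[name] = shuffled[start:start + size]
        start += size
    return groups


if __name__ == "__main__":
    # Hypothetical participant IDs 1..42.
    for name, members in assign_groups(range(1, 43), GROUP_SIZES, seed=0).items():
        print(name, len(members), sorted(members))
```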
Fig. 2. Example of one of the PowerPoint slides used in Experiment 2 with longer visual text instructions. Note: The graphs were presented to the participants in colour.
3.2. Results and discussion
A 2 (length of text) × 2 (modality) ANOVA was conducted on the number of correct answers to the 7 questions. Means and standard deviations are provided in Table 2. The test scores indicated no significant differences between the longer and shorter text groups, F(1, 38) = .63, MSe = 1.36, p = .432, ηp² = .016, nor between the audio and visual groups, F(1, 38) = .001, MSe = 1.36, p = .977, ηp² < .001. There was a significant interaction, F(1, 38) = 14.65, MSe = 1.36, p < .001, ηp² = .26.
Because of the significant interaction, a simple effects comparison of individual groups was completed. There was a significant difference between the longer audio text and the longer visual text groups favouring the longer visual text group, F(1, 38) = 7.18, MSe = 1.36, p = .011, ηp² = .158, demonstrating a reverse modality effect. There was a significant difference between the shorter audio text and the shorter visual text groups favouring the shorter audio text group, F(1, 38) = 7.47, MSe = 1.36, p = .009, ηp² = .164, demonstrating a conventional modality effect. A Bonferroni correction controlling the family-wise error rate at .05 was used, requiring individual contrasts to be compared to an alpha of .025.
As was the case for Experiment 1, a presentation mode by length of textual statements interaction was obtained (Hypothesis 1b). In this case, the interaction was disordinal with a very large effect. Simple effects tests indicated a routine, statistically significant modality effect for the short verbal statements with a statistically significant reverse modality effect for the long verbal statements.
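As an aid to checking the simple effects reported above, the sketch below computes partial eta squared from an F ratio and its degrees of freedom, using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error), and the Bonferroni-adjusted alpha for two contrasts. The function names are illustrative assumptions, and the paper does not state how its effect sizes were calculated, so agreement should be expected only to within rounding.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def bonferroni_alpha(family_alpha, n_contrasts):
    """Per-contrast alpha that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_contrasts


if __name__ == "__main__":
    # F ratios for the two simple effects, each with (1, 38) degrees of freedom.
    simple_effects = {
        "longer text: visual vs. audio": 7.18,
        "shorter text: audio vs. visual": 7.47,
    }
    for label, f_value in simple_effects.items():
        print(f"{label}: partial eta squared ~ "
              f"{partial_eta_squared(f_value, 1, 38):.3f}")

    print("per-contrast alpha:", bonferroni_alpha(0.05, 2))
```

Both simple effects have p values (.011 and .009) below the adjusted alpha of .025, so each remains significant after the correction.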
These results can be explained by cognitive load theory principles that in combination lead to the transient information effect. Short spoken statements can be readily held and processed in auditory working memory leaving visual working memory to process visually presented information without an additional memory load. In contrast, written information itself consists of visual images that can be expected to impose a visual working memory load. Furthermore, those visual images must be translated into speech for further processing. The resultant cognitive load is extraneous and likely to interfere with learning resulting in the modality effect when compared to auditory statements that do not need to be processed visually.
To adequately process long, complex, technical auditory statements in working memory, we may need to return to them repeatedly, ignoring some parts while attending to other parts. That process is straightforward when the statements are presented in permanent, written form. It may be difficult or impossible if the statements are presented in transient, auditory form. The consequence is a reverse modality effect with written information proving superior to spoken information, a result obtained in Experiment 2.
4. General discussion
The experiments described in this paper were generated from cognitive load theory. It was assumed that while the cognitive load associated with the presentation of complex information could be ameliorated by the use of animations or dual mode presentations, both forms of presentation incidentally introduce transience that also can impose a heavy cognitive load. The cognitive load associated with transience is likely to be higher with increased segment length. As a consequence, in combination, we hypothesised a transient information by segment length interaction.
The pattern of results from two experiments supported this hypothesis. In Experiment 1, testing Hypothesis 1a, for animations presented in short sections, the amount of transient information did not exceed working memory limitations. Additionally, learners given static graphics in short sections could not refer forwards or backwards to any part of the materials on demand, thereby disallowing mental integration of the information. These characteristics resulted in animation being superior to static graphics, replicating the results found in Wong et al. (2009) and Ayres et al. (2009), where animations teaching relatively short human movement tasks were found to be superior to equivalent statics. In contrast, for long section conditions, the amount of transient information may have exceeded working memory limitations and thus animation was no longer advantageous compared to static graphics. For long sections of static graphics, mental integration was possible. In addition, learners could choose to devote more or less time to particular sections of information, a choice that was not available under animation conditions. For long sections of animation however, the large quantity of transient information meant that working memory limits were likely to be exceeded, and the benefits associated with instructional animations were no longer available. Mental integration was not possible nor could learners attend variably to different parts of the material. In Experiment 2, testing Hypothesis 1b, using transient auditory information rather than transient animations, the same effects of transience were observed, in this case leading not just to the elimination of a significant difference but to a reverse effect. The effects of transience were sufficiently large to overwhelm any advantage associated with presenting the information in a dual modality format.
It may be argued that dividing related material into segments prevents learners from seeing the interactions between segments and so could interfere with learning. In the case of both static graphics in Experiment 1 and visual only presentations in Experiment 2, this argument may be valid. In both cases, the long section presentation was superior to the segmented presentation. Nevertheless, if this effect does occur, it is overwhelmed by the effect of transience. Both the short animated presentation of Experiment 1 and the short audio-visual presentation of Experiment 2 were superior to the longer presentations. Any deleterious effects of segmentation were overwhelmed by the effects of transience on the long section presentations.
Based on the current set of experimental results, long, complex animations or long, complex auditory statements need to be segmented into shorter sections, as excessively long sections can overwhelm working memory, attenuating any benefits of presenting materials using either animations or spoken information. If complex information is presented in transitory form, working memory may be overwhelmed. Segmenting the information should reduce the working memory load. The similar interactions between presentation modality and length of segments for two quite different categories for presenting information support this interpretation.
Neither animations nor verbal material should be presented using lengthy, complex, transient information. Of course, what is considered long and complex will depend on the learners’ prior knowledge and expertise within the particular instructional domain.
The rationale of the current work is based on predicting a particular pattern of results, in this case an interaction. That interaction was hypothesised from the suggested pattern of working memory loads. The fact that the hypothesised pattern was obtained in two experiments using vastly different materials and studying two different cognitive load effects provides support for the suggested theoretical constructs used to make the prediction. Nevertheless, while the results support the hypothesised patterns based on cognitive load considerations, direct measures of cognitive load were not obtained because it is difficult to obtain reliable subjective ratings of cognitive load from young children. Alternatives such as the use of secondary tasks are difficult or impossible to use within a classroom environment. For these reasons, it would be desirable to replicate these experiments using adult participants from whom subjective measures of cognitive load can be readily obtained. While both animations and auditory information yielded a presentation mode by segment length interaction, that interaction was ordinal in the case of the Experiment 1 animations vs. static graphics but disordinal in the case of the Experiment 2 written vs. spoken text. We doubt this difference indicates different causal factors. Whether an interaction is ordinal or disordinal most commonly is determined by the distance between conditions. In the case of Experiment 1, the decrease in the scores of the long section compared to short section animations will depend on the length of those animations with increasing length leading to decreasing scores. The long section animation scores did not fall below the long section static graphic scores. Had the long section animations been longer, we might expect to obtain that disordinal pattern. The short section animation scores were high because of the advantages of short animations. 
The short section static graphics scores were relatively low because of the disadvantages of not having all of the static graphics present simultaneously. In the case of Experiment 2, the long section auditory information was sufficiently long to not only negate the advantages of an audio-visual presentation, but also to reduce scores below that of a visual only presentation. Thus, whether an ordinal or disordinal interaction is obtained can be seen to be due simply to the relative lengths and complexity of the short and long sections rather than due to different causes. At present, while we can predict an interaction, we do not have a metric that will inform us prior to running an experiment whether the difference between short and long sections of transient information will yield an ordinal or disordinal interaction. There are, of course, many other differences between the two experiments. For example, Experiment 1 used a reproduction test while Experiment 2 used a transfer test. Nevertheless, both experiments yielded a similar, interpretable interaction.
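The argument that the same underlying pattern can appear as either an ordinal or a disordinal interaction, depending only on the segment lengths sampled, can be illustrated with a toy model. Everything in the sketch below is hypothetical: the two linear score functions, their coefficients and the segment-length units are invented for illustration and are not fitted to the data of either experiment.

```python
def transient_score(length):
    """Hypothetical test score for a transient presentation; falls with length."""
    return 10.0 - 1.0 * length


def permanent_score(length):
    """Hypothetical test score for a permanent presentation; rises with length."""
    return 4.0 + 0.5 * length


def classify_interaction(short_length, long_length):
    """Classify the 2 x 2 pattern produced by two sampled segment lengths."""
    diff_short = transient_score(short_length) - permanent_score(short_length)
    diff_long = transient_score(long_length) - permanent_score(long_length)
    if diff_short == diff_long:
        return "no interaction"
    # Opposite signs mean the advantage reverses between the two lengths.
    if diff_short * diff_long < 0:
        return "disordinal"
    return "ordinal"


if __name__ == "__main__":
    # The hypothetical lines cross at length 4 (10 - L = 4 + 0.5 L).
    for short, long in [(1, 3), (1, 7), (5, 8)]:
        print((short, long), classify_interaction(short, long))
```

With these invented functions, sampling both lengths on one side of the cross-over point gives an ordinal pattern, while sampling lengths that straddle it gives a disordinal one, even though the underlying functions never change.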
The current results have been explained by cognitive load theory but recently, an alternative explanation for failures to obtain the modality effect using longer textual information has been provided by Rummer, Schweppe, Fürstenberg, Scheiter, and Zindler (2011). They suggested that the modality effect is due to characteristics of sensory memory rather than working memory. Visual sensory memory span is about 1 s while the equivalent auditory span is 4–5 s. Accordingly, presenting information in spoken (auditory) form should be superior to presenting it in written (visual) form providing that the information takes no more than about 5 s. Brief verbal information should result in a modality effect while longer information should not.
Rummer et al. used longer information and only obtained a modality effect for more recently presented information because more information was retained in the longer span, auditory sensory memory than the visual sensory memory. Earlier information disappeared from auditory sensory memory as well as from visual sensory memory because the 5 s limit was exceeded and so there was no advantage in using an auditory over a visual presentation. The modality effect obtained using shorter textual information in Experiment 2 could be explained by Rummer et al.’s sensory memory explanation. There are, nevertheless, some necessary caveats.
1. The finding that the modality effect only occurs for more recent items can readily be explained by cognitive load theory with its emphasis on working memory limitations. If the load on the visual-spatial sketchpad is decreased by using auditory text compared to visual text, more of the text is likely to be recalled. The advantage may only occur on more recent text because remembering earlier, lengthy text overloads working memory irrespective of presentation mode, resulting in no difference between conditions.
2. The reverse modality effect using longer texts requires a different explanation not relying on sensory memory. The explanation provided in this paper is an example.
3. A sensory memory explanation applies only to the modality effect. It cannot be used to explain the interaction found using animations in Experiment 1.
4. If the modality effect is due primarily to sensory rather than working memory, it should be equally obtainable irrespective of the intrinsic cognitive load imposed by the materials. Tindall-Ford et al. (1997) obtained the effect with high intrinsic cognitive load due to high element interactivity material but failed to obtain the effect with low intrinsic cognitive load, low element interactivity material.
Notwithstanding these caveats, Rummer et al.’s hypothesis and results are important and interesting. They deserve further consideration and testing.
In conclusion, new instructional technology permits us to present information using formats that are difficult or impossible using older methods. Some procedures, such as enhanced use of animations or spoken text, can be of considerable benefit. Nevertheless, there can be incidental and unintentional consequences, some of which have negative effects on learning. We need to be aware that there are cognitive consequences when switching from the permanent information found in books and other forms of hard copy to the frequently transient information presented using modern technology. Those cognitive consequences can have considerable instructional consequences, not all of which are positive. Cognitive load theory can be used to provide hypotheses concerning those instructional consequences, both positive and negative.
Appendix A
Test questions used in Experiment 2
(1) Between which 2 hours, and on what day, did the temperature drop by 3 °C?
(2) At what time did a day have a temperature of 34 °C before falling to 29 °C in 1 hour?
(3) At what time after 12 pm were the 3 days closest to each other in temperatures?
(4) Which temperature that was over 26 °C only occurred once during three days?
(5) How many times was there a difference in temperature of exactly 1 °C between Monday and Wednesday?
(6) Which day had the largest change in temperature in a single hour between 12 pm and 4 pm and what time did this change occur?
(7) Which day had the smallest change in temperature in 1 hour between 10 am and 1 pm and what time did this change occur?
Cognitive load theory is a framework of instructional design principles based on the characteristics and relations between the structures that constitute human cognitive architecture, particularly working memory and long-term memory. The theory assumes that human cognitive architecture is a natural information processing system, analogous to other systems such as evolution by natural selection (Sweller, 2011, 2012; Sweller et al., 2011; Sweller & Sweller, 2006). It can be specified by five principles:
1. Long-term memory and the information store principle. Most human cognition is driven by the contents of an enormous information store (De Groot, 1965). In human cognition, this structure is long-term memory.
2. Schema theory and the borrowing and reorganizing principle. This principle assumes we learn primarily through borrowing schemas from other people’s long-term memories (e.g., through listening or reading what others have written). These schemas are constructively reorganized through the lens of our own long-term memory. This reorganization process is inexact, resulting in some random changes.
3. Problem solving and the randomness as genesis principle. The borrowing principle does not create new information except insofar as borrowing is inexact. If knowledge is unavailable either through our own or other people’s long-term memories, we must problem solve by randomly generating moves and testing their effectiveness. This process generates new knowledge.
4. Working memory and the narrow limits of change principle. Working memory processes must deal with the novel information generated by the randomness as genesis principle. The randomness inherent in reorganization and random problem solving generates extremely large problem-solving spaces. To reduce the problem-solving space, working memory is limited in both capacity (Miller, 1956) and duration (Peterson & Peterson, 1959).
5. Long-term working memory and the environmental organizing and linking principle. Working memory is only limited when processing novel information. It is able to deal with vast amounts of previously organized information brought in from long-term memory (Ericsson & Kintsch, 1995), reducing the burden on working memory and thus lowering cognitive load.
The structure and characteristics of this cognitive architecture indicate the primary purpose of instruction is to construct schemas in long-term memory. Instructional designs that do not aim to alter long-term memory and which ignore working memory limitations when processing novel information are unlikely to be effective.
As well as human cognitive architecture, cognitive load theory includes a framework of instructional design principles and postulates the existence of two distinct types of cognitive load (Sweller, 2010):
Intrinsic cognitive load is the cognitive load inherent within the information to be learnt. Intrinsic cognitive load is dependent on the degree to which individual elements of information must be processed simultaneously in working memory to be understandable (see Marcus, Cooper, & Sweller, 1996; Sweller, 1994, for more information). Intrinsic cognitive load is akin to the intellectual complexity of the materials and cannot be modified. The instructional content of Experiments 1 and 2 used technical material that was high in element interactivity (Sweller, 2010).
Cognitive load effects will not apply to highly automated information, or information specified without learning as the main goal. For example, everyday conversation or film and television dialogue can include quite extended spoken text; however, it can be easily processed. This text is quite different from the unfamiliar, technical, higher element interactivity material used in our experiments. The hypothesised results should be assumed only to be applicable to material that entails a higher working memory load from higher element interactivity material. Cognitive load theory becomes less relevant as levels of intrinsic cognitive load are reduced.
Extraneous cognitive load is the cognitive load that arises from instructional design practice and is within the control of the instructional designer. It is caused by an unnecessary increase in the number of elements that must be processed simultaneously in working memory due to instructional design factors. The majority of research in cognitive load theory has traditionally concentrated on techniques to reduce extraneous cognitive load in instructional materials (for more information, see van Merriënboer & Ayres, 2005; Sweller, 2010; and Sweller et al., 2011).
All cognitive load associated with instruction can be divided into intrinsic and extraneous cognitive load. In addition, the term “germane cognitive load” is frequently used, often as an independent source of cognitive load. An alternative interpretation is that because germane cognitive load is closely related to and dependent on intrinsic cognitive load, it is appropriate to define it in terms of intrinsic cognitive load (Sweller, 2010). Accordingly, germane cognitive load refers to working memory resources required to deal with intrinsic cognitive load resulting in learning. Similarly, working memory resources are required to deal with extraneous cognitive load and are sometimes referred to as extraneous resources. Reducing extraneous cognitive load can increase germane cognitive load, by releasing working memory capacity for learning (Sweller, 2010).
1.1. The present study
Many instructional design guidelines have been generated from cognitive load theory (Sweller, 2011, 2012; Sweller et al., 2011). Effective instructional designs should aim to keep total levels of intrinsic and extraneous cognitive load to within the learner’s working memory limits. Transient information is one of the factors that may cause cognitive load to exceed working memory limits. Information is transient when elements of information that must be processed by a learner disappear to be replaced by new elements. If the new and old elements interact then the old elements must be held in working memory while the new elements are processed. Modern educational technology frequently and incidentally transforms permanent information into transient information. The effect can result in an overwhelming working memory load.
As an example, when a series of static graphics used to depict motion is replaced by an animation depicting the same motion, not only is the visual information rendered more realistic as is the intention, but a permanent depiction is replaced by a transient depiction. The consequence may be a considerable increase in working memory load. Static graphics allow learners to rapidly and easily refer back to previous information as needed. Depending on the technology, it may be difficult or impossible to switch rapidly and easily between various aspects of an animated depiction. The natural way to view an animation is serially. Exactly the same argument applies when written text is replaced by spoken text. Permanent, written text allows us to easily and rapidly refer back to text previously read. It can be much more difficult to refer back to transient, spoken text and that difficulty may negate any advantages associated with a more natural mode of presentation.
One way in which the potential problems associated with transient information may be overcome is to present the potentially transient information in much shorter segments. A short segment of information should impose a reduced cognitive load compared with a longer segment. By reducing the cognitive load associated with long segments of information containing many interacting elements, the natural advantages of animations and speech may manifest themselves. For animations this means allowing us to tap into our innate ability to learn by observing (Ayres, Marcus, Chan, & Qian, 2009; Wong et al., 2009); and for speech this means harnessing the highly practiced skill of learning by listening within a dual modality context (Mousavi, Low, & Sweller, 1995).
Experiment 1 of this paper applied cognitive load theory to some of the consequences of presenting animations during instruction while Experiment 2 applied the theory to some of the consequences of using audio-visual instructions. In both cases it is suggested that as the length and complexity of transient instructional information associated with animations and audio-visual presentations increases, extraneous working memory load also increases, resulting in the benefits of animations and audio-visual instructions decreasing or even reversing compared to alternatives.
1.2. Aim-Hypothesis
In the two experiments, transient vs. permanent information was compared using longer and shorter segments of information. In both experiments, we hypothesised a condition by segment length interaction. In Experiment 1, there should be a greater advantage for animations over static graphics using short segments of information compared to long segments of information (Hypothesis 1a), and in Experiment 2, there should be a greater advantage for audio-visual presentation over a visual only presentation using short segments of information compared to long segments of information (Hypothesis 1b). We did not predict whether the interaction would be ordinal or disordinal because that factor is affected by incidental experimental factors. A disordinal interaction can be turned into an ordinal one by choosing shorter or longer segment lengths that do not overlap with the cross-over point. Consider the graph of a symmetrical, disordinal interaction with segment length on the X axis and test scores on the Y axis graphing both a transient information presentation and a permanent information presentation. If only the left or right side of that disordinal interaction is represented due to the segment lengths chosen, the interaction will be ordinal rather than disordinal.
2. Experiment 1
The shift towards delivering increasing amounts of instructional materials via electronic and online devices in educational environments has resulted in the use of instructional animations in classrooms becoming increasingly commonplace. Although the use of animations has become more common in modern classrooms, the benefits of instructional animations have still not been clearly established. Much of the existing literature has found no inherent benefit of animations over static graphics (e.g., Hegarty, Kriz, & Cate, 2003; Mayer, DeLeeuw, & Ayres, 2007; Mayer, Hegarty, Mayer, & Campbell, 2005; Schnotz, Böckheler, & Grzondziel, 1999; Tversky, Morrison, & Betrancourt, 2002). The effectiveness of animations increases when supporting techniques are added, such as segmentation (Mayer & Chandler, 2001; Moreno, 2007) or user control (Hasler, Kersten, & Sweller, 2007; Schwan & Riempp, 2004). A meta-analysis by Höffler and Leutner (2007) found animation to be superior to static graphics for several conditions, e.g., realism and procedural-motor knowledge. We suggest that the failure of instructional animations can be explained by cognitive load theory in general and the transient information effect in particular. The transience inherent in most animation means learners need to simultaneously remember and process both previously presented information as well as currently presented information to understand the learning material. However, previously learnt information may have already been lost from working memory before the current information has been processed. Static graphics, on the other hand, can be revisited on demand in a way that is possible but far more difficult using animations. An instructional design technique to alleviate transience in animations is to use segmentation. Previous studies examining segmentation have found a benefit to segmenting animations into smaller sections (Mayer & Chandler, 2001).
Moreno (2007) previously found some benefit to segmenting animations when modelling teaching behaviours to students, although the nature of the learning materials for the control condition is unclear. Spanjers, Wouters, van Gog, and van Merriënboer (2011) found novice learners benefited more from segmented animated worked examples, whereas expert learners benefited more from continuous animated worked examples. Lusk et al. (2009) found that learners with a reduced working memory span benefited from segmented instruction while those with a high working memory span did not. However, although there are studies examining the effects of segmentation, there is little literature focussing on the underlying reasons as to when and why segmentation is effective. The current analysis suggests that segmentation may be effective not because segments divide material but because shorter segments reduce working memory load compared to longer segments.

Previous studies by Wong et al. (2009) and Ayres et al. (2009) found that for materials based around human movement, animation was consistently superior to static graphics because instructional animations tap into our innate ability to learn through observation. The present study tested the hypothesis that sufficiently large amounts of transient information would eventually attenuate the effectiveness of human movement-based instructional animations. It examined the effects of transience in animated and static graphics-based learning materials, for a human movement task. A human movement-based task was selected as these types of tasks have been found to lead to superior learning using animated instructions when compared to statics (see Ayres et al., 2009; Wong et al., 2009). It was predicted that when presenting learning materials in relatively short sections, the amount of transient information would remain within working memory limits.
Consequently, animation should be superior to static graphics because it realistically indicates to learners all relevant movements. Conversely, the amount of transient information could exceed working memory limits for learning materials presented in long sections, imposing an extraneous cognitive load. In this case, animation would lose its advantage over static graphics. Experimentally, this pattern was expected to translate into a segment length by mode of presentation interaction effect.
2.1. Method
2.1.1. Participants and design

Participants were 66 children (28 male, 38 female) in Year 5, 10–11 years old, from two co-educational primary schools in Sydney. Parental consent for participation was obtained. Participants were allocated to one of four experimental groups: short section animation (n = 16), short section static graphics (n = 16), long section animation (n = 17) or long section static graphics (n = 17).

2.1.2. Materials

The task domain for the experiment was origami. The animation for the origami task was filmed as a video using a Canon IXUS 80IS digital camera. The learning material showed a geometric figure being folded from a single sheet of paper (Fig. 1), consisting of 24 steps in total. Each step consisted of a single movement containing on average 3–4 screenshots, depending on the complexity of the fold being made. Steps with more complex folds contained up to 5 screenshots. There was no narration or other audio component. The same content was given to all four groups. All videos for the animation conditions were resized so that a video was the same size as a single picture from the static graphics.
Fig. 1. A step from Experiment 1 static graphics learning materials.
A. Wong et al. / Learning and Instruction 22 (2012) 449–457
The long section conditions consisted of a single long section of 24 steps that ran for 250 s. At the end of the section was a slide instructing participants that they would have 180 s to carefully consider the section that had just been shown.
The short section conditions consisted of the long section materials, split into 6 groups of 4 steps each. Between each group of steps was a slide instructing participants that they would have 30 s to carefully consider the section that had just been shown, for a total of 180 s of break time across all sections.
The static graphics for the paper-folding task were screenshots of individual frames taken directly from the video. Screenshots were chosen for inclusion on the basis of the clarity of the image and how well an individual screenshot conveyed movement clearly when placed in the sequence of other screenshots that had been chosen for inclusion. The aim was to create a set of static graphics that were informationally equivalent to the animation. Depending on the perceptual complexity of a fold, between 3 and 5 screenshots were used for each step. The aim was to represent each fold with the minimum number of screenshots needed for clarity. The static graphics were then pilot tested to ensure clarity of information. The selected images were imported into the graphic design program Adobe Illustrator and compiled into the static graphics versions of the materials.
Video clips for the short section animation condition were created from the long section video using Apple QuickTime Pro. Videos for both the long and short section conditions were shown at the same size as an individual picture in the static graphics conditions. Each section ran for between 23 s and 66 s, depending on how long it took to perform the 4 steps within the section. The videos and slides were timed to automatically advance through each section.
The long section static graphics were presented as a single scrollable file for 250 s. The short section static graphics were presented for the same amount of time as the short section animation. The short section static graphics were presented as a series of 6 separate scrollable files, each containing the same 4 steps as in the short section animations. The amount of time each section was presented matched the corresponding section in the short section animation condition. In this manner, the short section static graphics were identical in all respects to the short section animation except that each section was static rather than animated. The difference in transience between the two conditions was limited to within each segment. Between segments, both the static graphics and the animation were equally transient as in neither condition could learners return to a previous segment. Files for both the long and short section static graphics materials were timed to advance through each section automatically. Learning materials were presented on 17-inch G4 Apple iMacs. The test materials consisted of a single square sheet of paper. The task tested how much of the figure participants could fold from start to finish in 4 min. The test was scored out of 16. Rotations of the origami figure that needed to be depicted in the learning materials for clarity, but which were not necessary for successfully reaching the next fold, were not given a mark. One mark was given for each fold correctly made. While this measure is objective and so highly reliable, it does suffer from an error location problem. If a learner makes an error early in the exercise, the subsequent folds are likely to be impossible to make while an error later in the exercise may have only minor consequences. This issue is inherent in any transformation problem in which later moves are dependent on earlier moves. 
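The error location problem described above can be illustrated with a short sketch. It assumes, purely for illustration, that a single incorrect fold makes every later fold impossible; the function names, the zero-based fold index and the single-error scenario are hypothetical and are not part of the experimental materials.

```python
TOTAL_FOLDS = 16  # the test was scored out of 16, one mark per correct fold


def attempt_with_single_error(error_position, total=TOTAL_FOLDS,
                              blocks_later_folds=True):
    """Return a list of booleans, one per scorable fold, for a hypothetical
    attempt containing exactly one incorrect fold at `error_position`
    (zero-based). If `blocks_later_folds` is True, every fold after the
    error is treated as impossible and therefore also unscored."""
    outcome = []
    for fold in range(total):
        if fold < error_position:
            outcome.append(True)
        elif fold == error_position:
            outcome.append(False)
        else:
            outcome.append(not blocks_later_folds)
    return outcome


def percent_score(folds_correct):
    """Express the number of correct folds as a percentage of all folds."""
    return 100.0 * sum(folds_correct) / len(folds_correct)


# The same single mistake, made early or late in the folding sequence.
early_error = percent_score(attempt_with_single_error(2))
late_error = percent_score(attempt_with_single_error(14))
# The late mistake again, if later folds were independent of earlier ones.
late_error_independent = percent_score(
    attempt_with_single_error(14, blocks_later_folds=False))
```

Under this assumption one early mistake costs far more marks than the same mistake made late, which is why the measure, although objective, is sensitive to where an error occurs.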
Cognitive load ratings using subjective rating scales were not administered in the current experiment, as there is evidence that Likert scales may be unreliable for use by young children (Chambers & Johnston, 2002).
2.1.3. Procedure
The experiment was run in groups of 4 participants, using consecutive learning and test phases. Participants were randomly allocated to one of the four conditions, then given a pretest asking for demographic information (age, year level and gender) as well as being asked about their previous knowledge of origami and how frequently they had folded origami figures in the previous 6 months. Immediately after completion of the pretest, those with advanced knowledge of origami were switched between groups as required, to balance levels of experience between experimental conditions.
The learning phase then began. All participants were told they would learn how to fold a geometric figure from either a series of pictures or a video and then they would be asked to fold the figure afterwards. Participants were then given written instructions about the learning tasks commensurate with the condition they had been allocated to.
Participants in the animation groups were instructed they would be shown a series of videos (short section group) or a single video (long section group) that would show them how to fold a geometric figure and how to start the video presentation. Participants in the static graphics groups were instructed they would be shown a series of pictures on how to fold a geometric figure and how to start the presentation. After the learning phase, participants then entered the test phase. All participants were told they would have 4 min to fold as much of the origami figure as possible.
2.2. Results and discussion
The variable under analysis was test score. Scores for each task were expressed as a percentage of the task performed correctly. Means and standard deviations are presented in Table 1. A 2 (animation vs. static graphics) × 2 (short section vs. long section) ANOVA run on percentage correct across the four groups demonstrated a significant interaction effect, F(1, 62) = 5.39, MSE = 534.08, p = .024, ηp² = .080, in line with Hypothesis 1a. The main effect of format was significant, showing animation was superior to static graphics, F(1, 62) = 8.71, MSE = 534.08, p = .004, ηp² = .123. The main effect of section size was not significant, F(1, 62) = .069, MSE = 534.08, p = .794, ηp² = .001. Simple effects tests following the significant interaction showed animation was superior to static graphics for the short section conditions, F(1, 30) = 18.89, MSE = 424.07, p < .001, ηp² = .386. For the long section conditions, animations were not significantly better than static graphics, F(1, 30) = .12, MSE = 653.65, p = .732, ηp² = .004. A Bonferroni correction controlling the family-wise error rate at .05 was used, requiring individual contrasts to be compared to an alpha of .025.
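As a check on how these statistics fit together, the following sketch recomputes the three omnibus F ratios from summary values alone. It assumes that each effect in the 2 × 2 design is a single degree of freedom contrast on the cell means of Table 1, tested against the reported MSE with 62 error degrees of freedom, and that partial eta squared follows from F and the degrees of freedom. The helper names are hypothetical and this is not the authors' analysis script.

```python
# Cell means from Table 1 and group sizes from Section 2.1.1.
cells = {
    ("animation", "short"): (66.41, 16),
    ("animation", "long"): (54.69, 17),
    ("static", "short"): (36.40, 16),
    ("static", "long"): (51.10, 17),
}
MSE = 534.08   # reported error mean square
DF_ERROR = 62  # 66 participants minus 4 cells


def contrast_f(weights):
    """F ratio (1 numerator df) for a contrast on the cell means."""
    estimate = sum(w * cells[cell][0] for cell, w in weights.items())
    variance = MSE * sum(w ** 2 / cells[cell][1] for cell, w in weights.items())
    return estimate ** 2 / variance


def partial_eta_squared(f_value, df_effect=1, df_error=DF_ERROR):
    """Partial eta squared recovered from F and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Contrast weights for the interaction and the two main effects.
effects = {
    "interaction": {("animation", "short"): 1, ("animation", "long"): -1,
                    ("static", "short"): -1, ("static", "long"): 1},
    "format": {("animation", "short"): 1, ("animation", "long"): 1,
               ("static", "short"): -1, ("static", "long"): -1},
    "section size": {("animation", "short"): 1, ("animation", "long"): -1,
                     ("static", "short"): 1, ("static", "long"): -1},
}

f_values = {name: contrast_f(w) for name, w in effects.items()}
eta_values = {name: partial_eta_squared(f) for name, f in f_values.items()}

# Bonferroni-adjusted alpha for the two simple effects contrasts.
adjusted_alpha = 0.05 / 2

for name in effects:
    print(f"{name}: F(1, {DF_ERROR}) = {f_values[name]:.2f}, "
          f"partial eta squared = {eta_values[name]:.3f}")
```

Under these assumptions the recomputed values agree with the reported F ratios and effect sizes to the precision given in the text, which is what would be expected if each omnibus test has one numerator degree of freedom.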
Table 1
Means and standard deviations (in parentheses) of scores on test expressed as percentage correct in Experiment 1.

Condition          Short section    Long section
Animation          66.41 (18.94)    54.69 (17.90)
Static graphics    36.40 (22.42)    51.10 (30.48)

The presentation by length interaction can be explained by assuming that animations provide learners with more realistic information that can be assimilated more easily than static graphics. We have an innate ability to learn by observing that instructional animations demonstrating movement are able to exploit (see van Gog, Paas, Marcus, Ayres, & Sweller, 2009; Wong et al., 2009 for further discussion). When demonstrating a paper folding task, there are fewer inferences required and so a reduced working memory load if all of the relevant information indicating which folds need to be made is presented explicitly. For this reason, the animated version of the instructions may have been superior to the static graphics version for short sections. In contrast, static graphics do not gain this advantage of animation. Inferences must be made with their attendant working memory load.

While the animated version of the instructions should reduce cognitive load by reducing the number of required inferences, if the animation is lengthy then its transient character may become relevant and the advantage of animations over static graphics may be reduced. Having to hold several previous moves in working memory while attending to the current move may place an additional load on working memory and prevent the transfer of knowledge to long-term memory. For this reason, the advantages of animation over static graphics may be countered by the disadvantages of long, complex transient information.

Not only does the short static graphic presentation fail to gain the advantages of animation, but there is also a working memory cost associated with having to hold information in working memory while waiting for the next segment. As a consequence, the normal advantage of static graphics over animation, that they are permanent, is reduced, increasing the relative advantage of animations.
The same disadvantage applies to the short, animation condition but animation usually has a problem of transience and so the segmentation of animations may have only a minor effect. In contrast, for the long static graphic presentation, there are no advantages accruing that can be derived from animation but there is an advantage to having access to all of the information without unnecessary gaps imposing an unnecessary working memory load. That information can be accessed at any time without transience, contributing to the elimination of the animation advantage. The interaction obtained was ordinal rather than disordinal. We might expect that had the long segments been even longer or if the material had been more complex, a full, disordinal interaction might have been obtained with static graphics being superior to animations.
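The distinction between ordinal and disordinal interactions, and its dependence on the segment lengths sampled, can be sketched as follows. The classification uses the convention adopted in this paper, with segment length on the X axis and one line per presentation format, and is applied first to the Table 1 means. The two linear score functions further down are entirely hypothetical and serve only to show how the choice of segment lengths relative to a cross-over point changes the label.

```python
def classify_interaction(transient_short, transient_long,
                         permanent_short, permanent_long):
    """Classify a 2 x 2 pattern of means, with segment length on the X axis.

    The interaction is called disordinal when the transient-minus-permanent
    difference changes sign between short and long segments (the lines
    cross), and ordinal when the difference changes size but not sign."""
    gap_short = transient_short - permanent_short
    gap_long = transient_long - permanent_long
    if gap_short == gap_long:
        return "no interaction"
    if gap_short * gap_long < 0:
        return "disordinal"
    return "ordinal"


# Experiment 1 means from Table 1 (animation = transient format).
experiment_1 = classify_interaction(66.41, 54.69, 36.40, 51.10)


# Hypothetical score functions, NOT data: a transient format whose scores
# fall with segment length and a permanent format whose scores rise.
# With these made-up slopes the two lines cross at a length of 50.
def transient_score(length):
    return 80.0 - 0.5 * length


def permanent_score(length):
    return 40.0 + 0.3 * length


def classify_window(short_length, long_length):
    """Classify the pattern seen when only two segment lengths are sampled."""
    return classify_interaction(
        transient_score(short_length), transient_score(long_length),
        permanent_score(short_length), permanent_score(long_length))


left_of_crossover = classify_window(10, 40)
spanning_crossover = classify_window(10, 90)
right_of_crossover = classify_window(60, 90)
```

In this sketch the same underlying pair of lines is labelled ordinal when both sampled lengths fall on one side of the cross-over point and disordinal only when they straddle it.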
While these relations are complex, they do flow from the basic assumptions of cognitive load theory with its emphasis on working memory and its limitations. Support for the transience by segment length hypothesis would be obtained if similar results could be generated comparing transient with permanent information in an unrelated area. While similar results in two different areas can always be due to unrelated factors, the ability to predict results using cognitive load theory and to integrate otherwise disparate findings is an advantage. Experiment 2 again tested for an interaction (Hypothesis 1b) between presentation mode and length and complexity of information but in this case, using audio-visual modalities rather than animation and static graphics.
3. Experiment 2
The modality effect (Sweller et al., 2011) occurs when audio-visual presentations are superior to visual only presentations. For example, a diagram may have its associated text presented in written or spoken form. The modality effect is demonstrated if learning is enhanced by presenting the text in spoken rather than written form in conjunction with the diagram, which is presented visually (Mousavi et al., 1995). The effect can be readily demonstrated with many examples in the literature (see Ginns, 2005 for a meta-analysis) and is of considerable current interest (e.g., see Mayrath, Nihalani, & Robinson, 2011). It needs to be noted that the effect is obtainable only when the auditory and visual components are unintelligible in isolation and when the information is complex (Tindall-Ford, Chandler, & Sweller, 1997).

The effect can be explained within a cognitive load theory framework using Baddeley’s (1992) working memory theory. The theory assumes a visual processor, the visual-spatial sketchpad, to deal with visual information and an auditory processor, the phonological loop, to deal with language. Language is processed by the phonological loop whether it is presented in written or spoken form but, of course, if presented in written form, it initially must be processed by the visual-spatial sketchpad and must be translated into auditory form. For this reason, cognitive load theory assumes that a written presentation is more likely to overload the visual-spatial sketchpad than an auditory presentation that has no need to use the visual-spatial sketchpad and no need to translate written information into auditory form. As a consequence, more working memory resources are available to deal with information presented in a dual, audio-visual mode than a single, visual only mode, leading to the modality effect.

The modality effect is an established effect and dual modality presentations can be readily used with modern educational technology.
Nevertheless, the use of auditory presentations introduces transient information. Spoken information is transient in exactly the same way as animations. In order to process speech, we must hold previous information in working memory while processing current information and it may be difficult or impossible to re-access the earlier information. Of course, as was the case for animation, transience may not constitute a working memory issue if the information is brief or low in complexity. Spoken information that is brief or low in complexity may be easy to hold and process in working memory. In contrast, if transient, spoken information is long and complex, it may be difficult or even impossible to process adequately. Accordingly, dual mode presentation may only be effective and the modality effect may only be obtainable using relatively short, simple, verbal materials. As the length and/or complexity of the verbal materials increase, we can hypothesize that the modality effect should first disappear and then possibly reverse with visual only material being superior to audio-visual material. In other words, the same interaction can be predicted when testing for the modality effect as was obtained when using animations in Experiment 1.

There is some evidence for this hypothesis. Tabbers, Martens, and van Merriënboer (2004) obtained a reverse modality effect using verbal statements that appeared longer and more complex than those used in most other studies. Leahy and Sweller (2011) also obtained a reverse modality effect by increasing the length of the verbal statements.

Experiment 2 tested the hypothesis that length of verbal statements may be critical to the modality effect by comparing audio-visual presentations with visual only presentations in which the verbal components were varied in length. Four groups of upper primary school students were presented with either longer or shorter audio text or longer or shorter visual text. The learning material consisted of instructions on how to read a temperature–time graph.
3.1. Method
3.1.1. Participants

The participants were 42 primary level students from two Year 6 classes randomly assigned to four groups. Their age range was 11–12 years. Parental consent to participate was obtained. All students attended a Sydney private school. The primary school mathematics and science curriculum for this age level requires the reading of material similar to the material used in the experiment. The students’ teachers indicated that their students had received some limited tuition in reading temperature–time graphs before the experiment. The experiment was conducted in the final term of the school year and during the first lesson periods of the day.
3.1.2. Materials

The materials used during the learning phase consisted of introductions to temperature–time graphs and worked examples of their interpretation. These were given to each group in a presentation as a series of PowerPoint slides. The experiment used a 2 (modality) × 2 (length of verbal segments) design resulting in four instructional groups: (1) a longer audio text instruction group; (2) a longer visual text instruction group; (3) a shorter audio text instruction group; (4) a shorter visual text instruction group.

All four groups were provided with an introduction to the use of temperature–time graphs and five worked examples. The first slides introduced a single day temperature line graph similar to Fig. 2 except it only represented one, not three days. These slides showed various basic components, for example, the horizontal time axis, the vertical temperature axis, the location points, the temperature progress line and grids. Subsequent slides used one worked example to demonstrate how to find the temperature on one day from the previous single day graph. The question asked, “What is the temperature at 10 am?”, was written on top of the slide. The steps to solve the question were also displayed on this slide within textboxes. Further slides provided another introduction and showed the components of a 3 day graph. The final slides showed various worked examples on how to find: one temperature, two temperatures, a change of temperature between set hours, the lowest temperature between set hours and the time of a set temperature.

The content and general procedures of the PowerPoint instructions and the total presentation times were identical for the four groups. The total presentation time for the slides was 330 s. The four groups differed on the two variables of modality and length of text.
To establish differences in modality, the written textbox information shown to the longer and shorter visual text only groups was eliminated and provided in an audio format for the longer and shorter audio text groups. Questions were also presented in audio format. If a textbox used no arrow, one arrow or two arrows to point to relevant features within the diagram for the visual groups, the slides for the audio text groups were presented identically with no arrows or a similar number of arrows. The textbox was removed and only the arrow component (where included) remained.

Differences in length of text slides can be described as follows. The longer audio text and longer visual text groups had 9 slides displayed for a total duration of 330 s, timed to advance automatically after 17 to 55 s each, depending on the amount of material in the slide. (Prior to this experiment, a small pilot study was completed to gauge workable presentation times.) Equivalent slides for both groups were of identical duration. The total word count for all nine slides was 421 words, giving an average of 46.77 words per content slide. Therefore, the average reading or listening time allowable for the content in the longer text slides was .78 s per word.

In contrast to the longer text groups, the shorter audio text and shorter visual text groups had 47 slides in total, timed to advance automatically after 4 to 12 s each. More slides were needed than for the longer text groups because the material was broken down into more sections. Equivalent slides for these two shorter text groups were of identical duration. The total word count was 395 words, giving an average of 8.4 words per content slide. The average listening or reading time allowable for the content in the shorter text slides was .83 s per word.
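The pacing figures for the two text lengths follow from the slide counts, word counts and the 330 s total presentation time. The short sketch below recomputes them; the `pacing` helper and its dictionary keys are hypothetical names introduced only for this illustration, and it assumes the presentation time was spread over all words and slides with no pauses.

```python
def pacing(total_seconds, total_words, n_slides):
    """Average words per slide, seconds per word and seconds per slide."""
    return {
        "words_per_slide": total_words / n_slides,
        "seconds_per_word": total_seconds / total_words,
        "seconds_per_slide": total_seconds / n_slides,
    }


TOTAL_SECONDS = 330  # identical for all four groups

longer_text = pacing(TOTAL_SECONDS, total_words=421, n_slides=9)
shorter_text = pacing(TOTAL_SECONDS, total_words=395, n_slides=47)

for label, values in (("longer", longer_text), ("shorter", shorter_text)):
    print(label, {key: round(value, 3) for key, value in values.items()})
```

The recomputed averages match the reported values up to rounding or truncation in the last digit, and the average time per slide falls inside the stated 17 to 55 s and 4 to 12 s ranges. The two text lengths therefore allowed similar time per word while differing mainly in how many words had to be processed before the slide changed.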
Note that even though the number of words on slides varied between the longer and shorter text groups, with the longer text slides containing more words due to connectors, the content was identical. There were no pauses between slides viewed by the four groups.

There were two sheets of paper for the test material. Sheet 1 displayed a temperature graph that was identical to the graph in Fig. 2 with the textbox and arrow removed. Sheet 2 had seven test questions (see Appendix A). Students used Sheet 1 to answer the questions on Sheet 2. The questions were quite complex transfer questions (tapping higher element interactivity knowledge) composed to test whether learners had understood the information during the learning phase adequately to demonstrate transfer. The maximum time allowed to answer all of the test questions was 20 min. All students completed their questions within this time limit; however, they were not individually timed. They were also not allowed to review information from the learning phase during the test.
3.1.3. Procedure
The experiment consisted of a pre-instruction, an instruction and a test phase. In the pre-instruction phase all the students were informed from a memorized script presented by the researcher that they were going to be taught how to read a temperature–time graph by being shown worked examples contained in a PowerPoint presentation. They were further told that during the entire instruction phase they would have to concentrate carefully by watching the slides. As outlined previously, Year 6 students in New South Wales have experience in interpreting a variety of graphs due to the curriculum content. The two classes were then randomly divided into the four distinct instructional groups. There were 11 participants in the longer audio text instruction group, 10 in the longer visual text instruction group, 10 in the shorter audio text instruction group, and 11 in the shorter visual text instruction group. The 20 min test phase proceeded immediately after the 330 s presentation had finished. In this phase, the researcher distributed the test graph and test sheet.
Fig. 2. Example of one of the PowerPoint slides used in Experiment 2 with longer visual text instructions. Note: The graphs were presented to the participants in colour.

3.2. Results and discussion

A 2 (length of text) × 2 (modality) ANOVA was conducted on the number of correct answers to the 7 questions. Means and standard deviations are provided in Table 2. The test scores indicated no significant differences between the longer and shorter text groups, F(1, 38) = .63, MSE = 1.36, p = .432, ηp² = .016, nor between the audio and visual groups, F(1, 38) = .001, MSE = 1.36, p = .977, ηp² < .001. There was a significant interaction, F(1, 38) = 14.65, MSE = 1.36, p < .001, ηp² = .26.

Because of the significant interaction, a simple effects comparison of individual groups was completed. There was a significant difference between the longer audio text and the longer visual text groups favouring the longer visual text group, F(1, 38) = 7.18, MSE = 1.36, p = .011, ηp² = .158, demonstrating a reverse modality effect. There was a significant difference between the shorter audio text and the shorter visual text groups favouring the shorter audio text group, F(1, 38) = 7.47, MSE = 1.36, p = .009, ηp² = .164, demonstrating a conventional modality effect. A Bonferroni correction controlling the family-wise error rate at .05 was used, requiring individual contrasts to be compared to an alpha of .025.

As was the case for Experiment 1, a presentation mode by length of textual statements interaction was obtained (Hypothesis 1b). In this case, the interaction was disordinal with a very large effect. Simple effects tests indicated a routine, statistically significant modality effect for the short verbal statements with a statistically significant reverse modality effect for the long verbal statements.
These results can be explained by cognitive load theory principles that in combination lead to the transient information effect. Short spoken statements can be readily held and processed in auditory working memory leaving visual working memory to process visually presented information without an additional memory load. In contrast, written information itself consists of visual images that can be expected to impose a visual working memory load. Furthermore, those visual images must be translated into speech for further processing. The resultant cognitive load is extraneous and likely to interfere with learning resulting in the modality effect when compared to auditory statements that do not need to be processed visually.
To adequately process long, complex, technical auditory statements in working memory, we may need to return to them repeatedly, ignoring some parts while attending to other parts. That process is straightforward when the statements are presented in permanent, written form. It may be difficult or impossible if the statements are presented in transient, auditory form. The consequence is a reverse modality effect with written information proving superior to spoken information, a result obtained in Experiment 2.
4. General discussion
The experiments described in this paper were generated from cognitive load theory. It was assumed that while the cognitive load associated with the presentation of complex information could be ameliorated by the use of animations or dual mode presentations, both forms of presentation incidentally introduce transience that also can impose a heavy cognitive load. The cognitive load associated with transience is likely to be higher with increased segment length. As a consequence, in combination, we hypothesised a transient information by segment length interaction.
The pattern of results from two experiments supported this hypothesis. In Experiment 1, testing Hypothesis 1a, for animations presented in short sections, the amount of transient information did not exceed working memory limitations. Additionally, learners given static graphics in short sections could not refer forwards or backwards to any part of the materials on demand, thereby disallowing mental integration of the information. These characteristics resulted in animation being superior to static graphics, replicating the results found in Wong et al. (2009) and Ayres et al. (2009), where animations teaching relatively short human movement tasks were found to be superior to equivalent statics. In contrast, for long section conditions, the amount of transient information may have exceeded working memory limitations and thus animation was no longer advantageous compared to static graphics. For long sections of static graphics, mental integration was possible. In addition, learners could choose to devote more or less time to particular sections of information, a choice that was not available under animation conditions. For long sections of animation however, the large quantity of transient information meant that working memory limits were likely to be exceeded, and the benefits associated with instructional animations were no longer available. Mental integration was not possible nor could learners attend variably to different parts of the material. In Experiment 2, testing Hypothesis 1b, using transient auditory information rather than transient animations, the same effects of transience were observed, in this case leading not just to the elimination of a significant difference but to a reverse effect. The effects of transience were sufficiently large to overwhelm any advantage associated with presenting the information in a dual modality format.
It may be argued that dividing related material into segments prevents learners from seeing the interactions between segments and so could interfere with learning. In the case of both static graphics in Experiment 1 and visual only presentations in Experiment 2, this argument may be valid. In both cases, the long section presentation was superior to the segmented presentation. Nevertheless, if this effect does occur, it is overwhelmed by the effect of transience. Both the short animated presentation of Experiment 1 and the short audio-visual presentation of Experiment 2 were superior to the longer presentations. Any deleterious effects of segmentation were overwhelmed by the effects of transience on the long section presentations.
Based on the current set of experimental results, long, complex animations or long, complex auditory statements need to be segmented into shorter sections, as excessively long sections can overwhelm working memory, attenuating any benefits of presenting materials using either animations or spoken information. If complex information is presented in transitory form, working memory may be overwhelmed. Segmenting the information should reduce the working memory load. The similar interactions between presentation modality and length of segments for two quite different categories for presenting information support this interpretation.
Neither animations nor verbal material should be presented using lengthy, complex, transient information. Of course, what is considered long and complex will depend on the learners’ prior knowledge and expertise within the particular instructional domain.
The rationale of the current work is based on predicting a particular pattern of results, in this case an interaction. That interaction was hypothesised from the suggested pattern of working memory loads. The fact that the hypothesised pattern was obtained in two experiments using vastly different materials and studying two different cognitive load effects provides support for the suggested theoretical constructs used to make the prediction. Nevertheless, while the results support the hypothesised patterns based on cognitive load considerations, direct measures of cognitive load were not obtained because it is difficult to obtain reliable subjective ratings of cognitive load from young children. Alternatives such as the use of secondary tasks are difficult or impossible to use within a classroom environment. For these reasons, it would be desirable to replicate these experiments using adult participants from whom subjective measures of cognitive load can be readily obtained.
While both animations and auditory information yielded a presentation mode by segment length interaction, that interaction was ordinal in the case of the Experiment 1 animations vs. static graphics but disordinal in the case of the Experiment 2 written vs. spoken text. We doubt this difference indicates different causal factors. Whether an interaction is ordinal or disordinal most commonly is determined by the distance between conditions. In the case of Experiment 1, the decrease in the scores of the long section compared to short section animations will depend on the length of those animations with increasing length leading to decreasing scores. The long section animation scores did not fall below the long section static graphic scores. Had the long section animations been longer, we might expect to obtain that disordinal pattern. The short section animation scores were high because of the advantages of short animations. The short section static graphics scores were relatively low because of the disadvantages of not having all of the static graphics present simultaneously. In the case of Experiment 2, the long section auditory information was sufficiently long to not only negate the advantages of an audio-visual presentation, but also to reduce scores below that of a visual only presentation. Thus, whether an ordinal or disordinal interaction is obtained can be seen to be due simply to the relative lengths and complexity of the short and long sections rather than due to different causes. At present, while we can predict an interaction, we do not have a metric that will inform us prior to running an experiment whether the difference between short and long sections of transient information will yield an ordinal or disordinal interaction. There are, of course, many other differences between the two experiments. For example, Experiment 1 used a reproduction test while Experiment 2 used a transfer test. Nevertheless, both experiments yielded a similar, interpretable interaction.
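The ordinal versus disordinal distinction drawn above can be illustrated with a minimal sketch. The cell means below are hypothetical numbers chosen only to show the two patterns; they are not the results of Experiment 1 or Experiment 2. The function name `classify_interaction` and the condition labels are likewise assumptions made for illustration. The sketch follows the usage in the text: the interaction is treated as disordinal when the sign of the transient minus permanent difference reverses between short and long sections, and as ordinal when the difference changes in size without reversing.

```python
def classify_interaction(means):
    """Classify a 2 x 2 presentation mode by segment length pattern.

    `means` maps (mode, length) to a mean test score, where mode is
    "transient" or "permanent" and length is "short" or "long".
    Classification is made with respect to the mode difference at each
    segment length (an assumption matching the usage in the text).
    """
    short_diff = means[("transient", "short")] - means[("permanent", "short")]
    long_diff = means[("transient", "long")] - means[("permanent", "long")]

    if short_diff == long_diff:
        return "none"        # parallel lines: no interaction in the means
    if short_diff * long_diff < 0:
        return "disordinal"  # the advantage reverses (crossover)
    return "ordinal"         # the advantage shrinks or vanishes, no reversal


# Hypothetical pattern: the transient advantage disappears at long
# lengths but transient scores do not fall below permanent scores.
pattern_a = {
    ("transient", "short"): 8, ("permanent", "short"): 5,
    ("transient", "long"): 6, ("permanent", "long"): 6,
}

# Hypothetical pattern: the transient condition falls below the
# permanent condition at long lengths.
pattern_b = {
    ("transient", "short"): 8, ("permanent", "short"): 5,
    ("transient", "long"): 4, ("permanent", "long"): 6,
}

print(classify_interaction(pattern_a))
print(classify_interaction(pattern_b))
```

In this sketch, lowering only the long transient mean is enough to move from the first pattern to the second, which is consistent with the argument that the form of the interaction may depend on the relative length of the sections rather than on different causal factors.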
The current results have been explained by cognitive load theory but recently, an alternative explanation for failures to obtain the modality effect using longer textual information has been provided by Rummer, Schweppe, Fürstenberg, Scheiter, and Zindler (2011). They suggested that the modality effect is due to characteristics of sensory memory rather than working memory. Visual sensory memory span is about 1 s while the equivalent auditory span is 4–5 s. Accordingly, presenting information in spoken (auditory) form should be superior to presenting it in written (visual) form, provided that the information takes no more than about 5 s. Brief verbal information should result in a modality effect while longer information should not.
Rummer et al. used longer information and only obtained a modality effect for more recently presented information because more information was retained in the longer span auditory sensory memory than in the visual sensory memory. Earlier information disappeared from auditory sensory memory as well as from visual sensory memory because the 5 s limit was exceeded and so there was no advantage in using an auditory over a visual presentation. The modality effect obtained using shorter textual information in Experiment 2 could be explained by Rummer et al.’s sensory memory explanation. There are, nevertheless, some necessary caveats.
1. The finding that the modality effect only occurs for more recent items can readily be explained by cognitive load theory with its emphasis on working memory limitations. If the load on the visual-spatial sketchpad is decreased by using auditory text compared to visual text, more of the text is likely to be recalled. The advantage may only occur on more recent text because remembering earlier, lengthy text overloads working memory irrespective of presentation mode, resulting in no difference between conditions.
2. The reverse modality effect using longer texts requires a different explanation not relying on sensory memory. The explanation provided in this paper is an example.
3. A sensory memory explanation applies only to the modality effect. It cannot be used to explain the interaction found using animations in Experiment 1.
4. If the modality effect is due primarily to sensory rather than working memory, it should be equally obtainable irrespective of the intrinsic cognitive load imposed by the materials. Tindall-Ford et al. (1997) obtained the effect with high intrinsic cognitive load due to high element interactivity material but failed to obtain the effect with low intrinsic cognitive load, low element interactivity material.
Notwithstanding these caveats, Rummer et al.’s hypothesis and results are important and interesting. They deserve further consideration and testing.
In conclusion, new instructional technology permits us to present information using formats that are difficult or impossible using older methods. Some procedures, such as enhanced use of animations or spoken text, can be of considerable benefit. Nevertheless, there can be incidental and unintentional consequences, some of which have negative effects on learning. We need to be aware that there are cognitive consequences when switching from the permanent information found in books and other forms of hard copy to the frequently transient information presented using modern technology. Those cognitive consequences can have considerable instructional consequences, not all of which are positive. Cognitive load theory can be used to provide hypotheses concerning those instructional consequences, both positive and negative.
Appendix A
Test questions used in Experiment 2
(1) Between which 2 hours, and on what day, did the temperature drop by 3 °C?
(2) At what time did a day have a temperature of 34 °C before falling to 29 °C in 1 hour?
(3) At what time after 12 pm were the 3 days closest to each other in temperatures?
(4) Which temperature that was over 26 °C only occurred once during three days?
(5) How many times was there a difference in temperature of exactly 1 °C between Monday and Wednesday?
(6) Which day had the largest change in temperature in a single hour between 12 pm and 4 pm and what time did this change occur?
(7) Which day had the smallest change in temperature in 1 hour between 10 am and 1 pm and what time did this change occur?