Introduction: The Time Course of Phonological Activation in Visual Word Recognition: An ERP Investigation
What follows is the abstract and introduction to my thesis, submitted March 10, 2008.
Abstract
The present study used event-related potentials (ERPs) to investigate the role and time course of phonology in visual word recognition during silent reading. Participants (N = 15) performed a lexical decision on target words (e.g., animal) that were preceded by one of three masked primes: exemplars of the target word (e.g., bear), words that were homophonic to the exemplar, referred to as homophone foils (e.g., bare); or words that were orthographically similar to the target word, referred to as spelling controls (e.g., beer). Three ERP components were of particular interest: the N250, the classic N400, and the P600. No differences between homophones and controls were found, although varying effects emerged in each of the time windows examined. The results suggest that the selection of a word’s meaning does not depend on its phonological form, but rather that phonology plays a role after its selection.
The Time Course of Phonological Activation in Visual Word Recognition: An ERP Investigation
Language is an ability that evolved in humans over time and remains unique to humans to this day. The ability to read is crucial to success in modern society; from the relatively simple task of following directions or instructions, to gaining complex knowledge from books, journals and the internet, reading is a tool we use every day. Few can achieve an education without this ability; an education that, in turn, can lead to a well-paying job and numerous opportunities one could not otherwise afford. It seems, then, that the development of reading progresses in our lifetime from learning to read, to reading to learn.
Learning to read involves associating the visual form of a word (i.e., its orthography, symbolized as O) with its meaning (i.e., semantics, symbolized as S). According to the dual-route theory of reading (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001), this process can happen along one of two pathways: the direct route, in which a reader achieves meaning directly from the orthography (O –> S), and the indirect or phonological route, in which phonology (i.e., the way a word sounds, symbolized as P) acts as a mediator between orthography and meaning (O –> P –> S). The triangle model, based on connectionist principles, proposes a single mechanism for visual word recognition in which a distributed pattern of activation across phonological, orthographic, and semantic units determines word meaning (Harm & Seidenberg, 2004; Seidenberg & McClelland, 1989). Computational simulations show that the triangle model is more accurate when the direct and phonological routes act cooperatively, with the division of labour between the routes depending on factors such as word frequency and homophony (Newman, Haigh, & Jared, in press).
The goal of visual word recognition research has traditionally been to determine the extent to which a reader relies on orthography and phonology in order to activate a word’s meaning. Understanding how meaning is activated during reading has strong implications for early reading education and for therapy for reading disabilities. For instance, differing views as to whether we rely more on the direct or the phonological route when reading led to in-depth debates about whether children should be taught to read via a whole-word approach or a phonetic approach (Castle, Riach, & Nicholson, 1994; Gathercole & Baddeley, 1989; Jeynes & Littell, 2000; Manning & Kamii, 2000; Vellutino, 1991). A general consensus has emerged that the phonological route is pivotal to beginning readers (see Adams, 1990). Ehri (1992) suggests that one logical function of the phonological route is to strengthen direct connections between print and meaning, which would consequently diminish one’s reliance on phonology as reading skill improves. Several studies have offered support for this view, finding that adult readers most commonly rely on the direct route. This view is often referred to as a weak phonological theory because it argues that the extent of phonology’s influence on visual word recognition is dependent on various mediating factors, such as reading skill, homophony, and word frequency (Coltheart et al., 1991; Jared, Levy, & Rayner, 1999). Having said that, there are still proponents of so-called strong phonological theories of reading, which contend that phonology is activated during reading regardless of skill level and psycholinguistic factors, such as word frequency and homophony (Frost, 1998; Lukatela & Turvey, 1994). The current study will use electrophysiological measures during a visual word recognition task in order to distinguish between these competing theories.
Before describing the goals of the current study, however, it is important to summarize previous work examining the role of phonology in reading.
Questions surrounding the role of phonology in reading have been traditionally studied through the use of homophones and/or pseudohomophones. Homophones are words with identical pronunciations, but which differ in spelling, derivation and meaning (e.g., bear/bare). Pseudohomophones are nonword stimuli that share their phonological representation, but not their orthographic representation with real words (e.g., brane). The rationale for using homophonous items is as follows: if phonology mediates word recognition, then presentation of a homophone (e.g., bare) or a pseudohomophone (e.g., brane) will result in the semantic activation of the unseen homophone mate (e.g., bear) or the pseudohomophone’s base word (e.g., brain), respectively. If it can be shown that the meaning of the unseen homophone mate (or the pseudohomophone’s base word) has been activated, then we can infer that phonology has mediated the computation of meaning.
Evidence for homophone effects has been observed in studies employing the homophone error paradigm. In this paradigm, the member of a homophone pair that is appropriate to the experimental context (e.g., bear) is replaced by its partner, the homophone error word (or homophone foil; e.g., bare). If participants fail to notice the homophone foil, then the inference is made that the phonological representation of the homophone foil resulted in activation of the unseen correct homophone. Due to the high degree of orthographic overlap between many homophones, it is essential for a spelling control word to be present in experiments employing the homophone error paradigm. A well-developed spelling control word should be as orthographically similar to the appropriate homophone as the homophone error word, and should also approximate the homophone error in word frequency. For example, for the homophone pair, bear-bare, an appropriate spelling control word would be the word beer. Thus, if participants fail to notice the homophone error, then the effect can be attributed to the shared phonological representations between the correct homophone and the homophone foil, and not to their orthographic similarity.
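The matching requirement for spelling controls can be made concrete with a simple string measure. The sketch below uses Levenshtein edit distance as a stand-in for orthographic similarity; this measure, the function name edit_distance, and the adequacy check are assumptions made for illustration, and are not the matching procedure used in the studies cited here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-letter insertions,
    deletions, or substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


# Example items from the text: correct homophone, homophone foil, spelling control.
exemplar, foil, control = "bear", "bare", "beer"

foil_distance = edit_distance(exemplar, foil)        # 2
control_distance = edit_distance(exemplar, control)  # 1

# Hypothetical adequacy check: the control should be at least as close in
# spelling to the correct homophone as the foil is.
control_is_adequate = control_distance <= foil_distance
```

By this measure, beer is at least as close in spelling to bear as bare is, so a failure to notice bare that does not extend to beer cannot easily be put down to visual overlap alone.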
Van Orden (1987) manipulated orthographic similarity and homophony (homophone errors or ‘foils’ and spelling controls, which represented the key trials) in a series of semantic categorization experiments in which participants were asked to determine whether a presented word was part of a specified category. He found that participants made significantly more false-positive errors to the phonemically similar homophone foils (e.g., rows) than to their spelling controls (e.g., robs); similarly, there were more errors to the nonword foils (e.g., roze) than to their spelling controls (e.g., rone). This finding was referred to as a phonological interference effect because the sound of the foil interfered with participants’ decision-making processes, leading them to falsely judge a homophone foil as being an exemplar of the prompted category (e.g., rose in the category Flowers). Furthermore, Van Orden, Pennington, and Stone (1990) determined that the size of the homophone effect was dependent on the frequency of the unseen exemplar, such that the effect was larger if the exemplar was lower in frequency than its presented homophone mate. More importantly, however, Van Orden et al. still observed a homophone effect for high frequency words. Their findings led to the development of the spell check theory. According to this theory, when more than one meaning is activated by presentation of a homophone, we must retrieve the appropriate orthographic representation of the presented word (i.e., its spelling) in order to resolve the competition. When the appropriate homophone is a low frequency word, however, its spelling is more difficult to retrieve, leading to an increase in false-positive errors. The homophone effect can therefore be defined as an increase in processing or reaction time, and/or in false-positive errors, associated with the presentation of a homophone error.
Van Orden’s (1987) results were called into question by Jared and Seidenberg (1991), who claimed that his categories were overly specific, allowing participants to predict category exemplars even before they were presented. Thus, phonological activation could occur due to contextual priming, and not to the generation of a phonological representation directly from the printed word form. By using general categories (e.g., living thing and object) in their semantic categorization study, Jared and Seidenberg (1991) examined whether the observed homophone effects in Van Orden (1987) and Van Orden et al. (1990) were exaggerated. They hypothesized that broad categories would reduce the probability that participants could generate semantic candidates prior to stimulus presentation. This reduction of phonological effects was precisely what the authors found; although homophone effects were still observed for low frequency homophone foils with low frequency exemplars, they failed to observe homophone effects for high frequency words when broad semantic categories were employed. On the basis of these results, Jared and Seidenberg (1991) argued that phonological activation of meaning occurs only for low frequency words, and that the direct route is favoured for high frequency words.
The impact of phonology on reading for meaning has also been studied in tasks that closely resemble reading, such as proofreading and sentence verification tasks. Jared, Levy and Rayner (1999) took into account predictability and frequency within their proofreading task, which they administered to both high- and low-skilled readers. Their study involved monitoring eye movements while participants read passages or sentences in which an appropriate homophone was replaced by a homophone foil or a spelling control. The only reliable evidence for phonological activation of meaning was found for low frequency homophone foils replacing low frequency correct exemplars; these findings were consistent with those of Jared & Seidenberg (1991). Furthermore, Jared et al. (1999) found that reading skill modulated phonological influences on reading for meaning. Good readers produced an orthographic pattern of eye movements; they had longer gaze durations for homophone foils compared to correct homophones and similar gaze durations for homophone foils and spelling controls. In contrast, the gaze durations for poor readers followed a phonological pattern; they had shorter gaze durations for homophone foils compared to spelling controls, and their gaze durations for homophone foils and correct homophones were similar.
A lexical decision task (LDT) involving homophonous items is often used to study the role of phonology in reading. In this task, the time it takes for a person to decide the lexical status of each item presented (whether or not it is a real word) is measured. It has been argued that this shallower level of processing, compared to reading for meaning, does not require the reader to understand the word in its context, allowing for an early phase in the word-recognition process to be revealed (Pexman, Lupker, & Jared, 2001). If phonology actively contributes to word recognition, then one would predict that low-frequency homophones would delay ‘word’ responses, assuming a spelling check has been employed to select between alternative meanings. Similarly, pseudohomophones (e.g., brane) would be expected to delay ‘nonword’ responses, since the spell check must operate to discriminate between the real word and the nonword foil. Pexman et al. conducted an LDT involving both homophones varying in frequency, and pseudohomophones varying in orthographic familiarity. In their study, they matched the word frequency of homophones to spelling control words (e.g., pain/pear), and matched the bodies of pseudohomophones and pseudowords (nonwords, e.g., RADE, CADE) for orthographic similarity. In addition, they varied the orthographic familiarity of the nonword (pseudohomophone and pseudoword) stimuli. In the wordlike context, the nonwords had orthographic representations that were similar to real English words (e.g., rade, cade). However, in the non-wordlike context, the nonwords had orthographic representations that were atypical of English words (e.g., golph, tolph). The authors expected that the atypicality of the nonwords in the non-wordlike context would allow participants to make their lexical decision based on a superficial analysis of orthographic factors, and thereby reduce any influence of phonology on word recognition.
The results revealed two main factors modulating homophone effects. First, adding support to past research, homophone frequency influenced the speed and accuracy of lexical decisions. More specifically, homophone effects were largest for low frequency homophones with higher frequency mates. Second, homophone and pseudohomophone effects were larger when nonwords had familiar orthographic representations. When pseudohomophones and pseudowords had atypical orthographic representations, no evidence for homophone or pseudohomophone effects was observed. In summary, Pexman et al. concluded that readers do activate phonology during visual word recognition, but that the extent of phonological activation is influenced by word frequency and orthographic familiarity.
While the homophone error paradigm has provided support for a weak phonological theory of word recognition, findings from studies using fast priming procedures have been cited as evidence in support of the strong phonological theory of word recognition (Lukatela & Turvey, 1994). The fast priming procedure involves presenting a prime for a duration that is too brief to permit its overt identification, followed immediately by presentation of a related or unrelated target. In this task, participants are required to respond only to the target; however, their processing of the target is assumed to be covertly affected by a related prime. So-called phonological priming effects occur when reaction times to targets (e.g., clip) are faster following phonologically related primes (e.g., klip) than following orthographically related primes (e.g., clep). Evidence for phonological priming has been found with brief prime exposures (e.g., 14 ms), leading to the conclusion that phonology is generated automatically and provides the primary code by which meaning is accessed (Lukatela & Turvey, 1994). However, based on findings in which phonological priming lagged behind orthographic priming, the former occurring only with longer prime exposures, some theorists argue that orthography provides the primary constraint on word recognition (Ferrand & Grainger, 1994). Since the current study employs a fast priming procedure, a brief review of the literature in this area is in order.
Lukatela and Turvey (1994) conducted a seminal study in which participants were required to name out loud a target word (e.g., frog) that was preceded by an associate (e.g., TOAD), a word (e.g., TOWED) or nonword (e.g., TODE) homophonic with the associate, or an orthographic control (e.g., TOLD) of the associate. At brief (e.g., 50 ms) stimulus onset asynchronies (SOAs), the associate prime and the homophonic primes (e.g., TOWED and TODE) produced equal associative priming; however, no priming was observed for the orthographic controls. At longer SOAs, homophonic words (e.g., TOWED) led to associative priming whereas homophonic nonwords did not. In addition, at longer SOAs, associative priming in orthographic controls was observed for the words (e.g., TOLD) but not for the nonwords. Based on these results, Lukatela and Turvey (1994) concluded that phonology provides the initial code by which a word’s lexical representation is accessed. They acknowledged the role of orthographic structure in reducing lexical competition following activation by the word’s phonological code. In other words, the way a word sounds is important in activating its corresponding lexical representation, whereas the way a word looks is important in selecting the word’s activated representation from other, simultaneously activated representations (e.g., choosing between TOWED and TOAD).
Lukatela and Turvey’s (1994) choice of a naming task has been criticized on the grounds that naming invariably activates phonological output codes, which could have produced exaggerated phonological priming effects (Holyk & Pexman, 2004). Ferrand and Grainger (1994) conducted a similar study, but within the context of a lexical decision task. These authors manipulated prime duration (the length of time for which a prime word is presented to the participant) from 14 to 57 ms, to examine the possibility that orthographic and phonological processes follow different time courses, with orthographic information being accessed earlier than phonological information. They employed a masked form priming procedure, which involves the presentation of a fully visible word before (forward masking) and/or after (backward masking) the prime word. The purpose of masking is to overwrite the sensory representation of the prime and block elaborate processing. The results of their study showed clear evidence of orthographic priming, but no evidence of phonological priming, with 29 ms prime exposures. Phonological priming did develop, but only with longer prime exposures (> 67 ms). These results conflict with those of Lukatela and Turvey (1994) and suggest that the difference in task, naming versus LDT, may play a role in the timing of phonological activation.
Ziegler, Ferrand, Jacobs, Rey, and Grainger (2000) employed an incremental priming technique, which adds a parametric manipulation of prime duration to the traditional design of a fast masked priming study. In the two previous studies mentioned, prime duration was manipulated, but not within the same list of stimuli. The advantage of the parametric manipulation is that additional information on the time course and nature of priming effects can be obtained. Ziegler et al.’s (2000) findings replicated those of Ferrand and Grainger (1994); priming in the orthographic condition occurred with brief prime durations (29 ms), while phonological priming did not occur until prime durations were increased. Interestingly, as prime duration increased, orthographic priming did not continue to increase, whereas phonological priming increased steadily. The researchers concluded that orthographic information is activated at the earliest stages of word recognition, while phonological priming emerges at a later stage.
These conflicting results suggest that the experimental methods employed in past studies may have lacked the sensitivity needed to establish when phonology is activated during word recognition. Evidence for rapid activation of phonological codes has been obtained repeatedly with the masked priming paradigm at brief prime durations (Lukatela & Turvey, 1994; Rastle & Brysbaert, 2006), but it is possible that longer prime durations cannot be properly studied without the more sensitive measures provided by event-related brain potentials (ERPs). The fine temporal resolution and continuous index of neural activity provided by ERP measures can be used to clarify the sequencing and timing of phonological effects, overcoming the limitations of the ‘end-point’ explanatory approach necessarily used in behavioural studies. The use of ERP measures to examine the impact of phonology in reading is a relatively new approach.
Historically, ERP research on reading has concentrated on semantic processing, focusing on one ERP response in particular, the N400. This component is defined as a negative-going brain wave peaking at approximately 400 ms after stimulus presentation, and the response was first demonstrated by manipulating the terminal words of visually presented, contextually constrained sentences (Kutas & Hillyard, 1980). It was found that the N400 was larger to words that violated semantic expectations (e.g., He spread the warm bread with socks) than to semantically appropriate words (e.g., He spread the warm bread with butter). These early results have been supported by numerous studies (Bentin, 1987; Holcomb & Neville, 1990; Bentin, Kutas, & Hillyard, 1993; Connolly, Phillips & Forbes, 1995), with the general consensus being that the N400 is an index of semantic congruity.
A few ERP studies have investigated the impact of phonology in the activation of word meanings using the logic of the homophone error paradigm. The rationale of these studies was that if participants detect a semantically inappropriate word, then an N400 would be elicited. Spelling control words (e.g., beer) would be expected to produce a larger N400 response than would correct target words (e.g., bear), since the former are semantically inappropriate to the context (e.g., animal). The critical question was whether the N400 to homophone foils (e.g., bare) would be smaller than that observed for spelling controls, indicating that meanings were activated by phonology, or whether the N400 to homophone foils would be similar in amplitude to that observed for spelling controls, indicating that meanings were activated directly by orthography.
Ziegler, Benraiss, and Besson (1999) used ERPs to investigate the role of phonology in a semantic categorization task modeled after Van Orden (1987). They expected that a direct O –> S pathway to meaning would result in a larger N400 to homophones (e.g., meet) than to exemplars (e.g., meat) of a semantic category (e.g., food). Orthographic control words (e.g., maid) were also used, and these stimuli were matched to the homophones for word frequency and number of letters. Behaviourally, it was found that participants made significantly more errors in rejecting homophone foils that sounded like members of the category than in rejecting orthographic control words; they also took longer to correctly reject the homophones. In the ERP data, however, results were more difficult to interpret. Ziegler et al. (1999) analyzed three latency bands: 0-300 ms, 300-800 ms, and 800-1600 ms. The results were similar across the two earlier latency periods: homophone foils and orthographic controls did not differ in N400 amplitude, suggesting that readers relied on the direct route in accessing the meaning of the presented word. However, in the later latency period (800-1600 ms), homophone foils and orthographic controls diverged, and the homophone amplitudes closely resembled those of category exemplars (more positive-going). Ziegler et al. (1999) interpreted their results as providing evidence for a direct orthographic path to meaning, with phonology exerting an influence on meaning selection at a later stage, well after the direct activation of the appropriate meaning.
The first ERP study to examine the impact of phonology using a masked priming paradigm was conducted by Grainger, Kiyonaga, and Holcomb (2006), who kept prime duration constant at 50 ms. Based on previous ERP studies, they expected their manipulation of phonology and orthography to modulate the N250 component, which is hypothesized to reflect sublexical processing during visual word recognition, and the N400 component, which, as discussed, is thought to reflect semantic processing. Their central prediction was that the N250 component should be modified by both orthographic primes and phonological primes, and that the onsets of these two effects should be different. Results were analyzed in three latency bands: 150-250 ms, 250-350 ms, and 350-550 ms. In the earliest epoch, there was a significant effect of priming for the orthographic control condition at posterior sites, but no significant phonological effects. In the second epoch, a significant priming effect of pseudohomophones was seen reliably only at anterior sites, but none was observed for the orthographic condition. Finally, in the latest epoch (where the bulk of activity resembling the classic N400 component was observed), priming effects were significant for pseudohomophones across the scalp, while once again there was no significant effect for the orthographic condition. With respect to the time course of phonology, these results placed orthographic effects as emerging about 50 ms earlier than phonological effects. Grainger et al. concluded that their data provided clear evidence for fast phonological priming in conditions in which all possible contamination factors are eliminated, and provided an upper boundary for the time at which phonology starts to have an influence (approximately 250 ms post-target onset).
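As a rough illustration of how effects in latency bands like these are commonly quantified, the sketch below averages voltage within each window and subtracts the related-prime condition from the unrelated-prime condition. It is not the analysis pipeline of Grainger et al. (2006) or of the present study; the function names, the 1000 Hz sampling rate, the epoch limits, and the synthetic waveforms are all assumptions chosen for the example. Only the three window boundaries come from the text.

```python
def mean_amplitude(samples, sfreq_hz, epoch_start_ms, window_ms):
    """Mean voltage of the samples whose time falls in [start, end) ms."""
    start_ms, end_ms = window_ms
    step_ms = 1000.0 / sfreq_hz
    in_window = [
        value
        for index, value in enumerate(samples)
        if start_ms <= epoch_start_ms + index * step_ms < end_ms
    ]
    if not in_window:
        raise ValueError("window lies outside the epoch")
    return sum(in_window) / len(in_window)


def priming_effects(unrelated, related, sfreq_hz, epoch_start_ms, windows):
    """Unrelated-minus-related mean amplitude for each latency window."""
    return {
        window: mean_amplitude(unrelated, sfreq_hz, epoch_start_ms, window)
        - mean_amplitude(related, sfreq_hz, epoch_start_ms, window)
        for window in windows
    }


# Latency bands reported in the text (ms after target onset).
WINDOWS = [(150, 250), (250, 350), (350, 550)]

# Synthetic single-channel averages (assumed): 1000 Hz, epoch from -100 to 599 ms,
# flat at 0 microvolts except for a negative deflection between 350 and 550 ms.
SFREQ_HZ = 1000
EPOCH_START_MS = -100


def synthetic_waveform(n400_microvolts):
    return [
        n400_microvolts if 350 <= EPOCH_START_MS + i < 550 else 0.0
        for i in range(700)
    ]


unrelated = synthetic_waveform(-3.0)  # larger (more negative) N400
related = synthetic_waveform(-1.0)    # reduced N400 after a related prime

effects = priming_effects(unrelated, related, SFREQ_HZ, EPOCH_START_MS, WINDOWS)
```

With these made-up waveforms, the difference is confined to the 350-550 ms band, which is the pattern one would describe as a priming effect on the N400 with no effect in the earlier windows.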
One of the objectives of the current research was to clarify the role of phonology in skilled reading using ERP measures. Grainger et al. (2006) failed to include an appropriate homophone in their study to compare results with those of the homophonic foil condition; the present study made use of both appropriate homophones and homophone foils, as well as an orthographic control condition. A masked priming paradigm was used in conjunction with a lexical decision task in order to examine the time course of phonology, with prime duration kept constant at 67 ms (see Ferrand & Grainger, 1993). Only low frequency homophones were used in the present study, based on work showing that homophone effects are largest when optimal selection competition is provided (a low frequency or unfamiliar exemplar is harder to identify as correct when put next to a low frequency error word than is a high frequency exemplar). It was expected that a homophone effect would be observed through a decrease in the N400 component in the homophone error primes relative to the spelling control words. Also of interest to the present study were the N250 component and the P600 component, said to be triggered when the brain encounters an unexpected linguistic item instead of the one that was expected (a post-lexical effect; Vissers, Chwilla, & Kolk, 2006). It was expected that no homophone effects would be observed in the N250 component, consistent with Grainger et al.’s (2006) observations.


