Andics, A., McQueen, J. M., & Petersson, K. M. (2013). Mean-based neural coding of voices. NeuroImage, 79, 351-360. doi:10.1016/j.neuroimage.2013.05.002.
Abstract
The social significance of recognizing the person who talks to us is obvious, but the neural mechanisms that mediate talker identification are unclear. Regions along the bilateral superior temporal sulcus (STS) and the inferior frontal cortex (IFC) of the human brain are selective for voices, and they are sensitive to rapid voice changes. Although it has been proposed that voice recognition is supported by prototype-centered voice representations, the involvement of these category-selective cortical regions in the neural coding of such "mean voices" has not previously been demonstrated. Using fMRI in combination with a voice identity learning paradigm, we show that voice-selective regions are involved in the mean-based coding of voice identities. Voice typicality is encoded on a supra-individual level in the right STS along a stimulus-dependent, identity-independent (i.e., voice-acoustic) dimension, and on an intra-individual level in the right IFC along a stimulus-independent, identity-dependent (i.e., voice identity) dimension. Voice recognition therefore entails at least two anatomically separable stages, each characterized by neural mechanisms that reference the central tendencies of voice categories. -
Asaridou, S. S., & McQueen, J. M. (2013). Speech and music shape the listening brain: Evidence for shared domain-general mechanisms. Frontiers in Psychology, 4: 321. doi:10.3389/fpsyg.2013.00321.
Abstract
Are there bi-directional influences between speech perception and music perception? An answer to this question is essential for understanding the extent to which the speech and music that we hear are processed by domain-general auditory processes and/or by distinct neural auditory mechanisms. This review summarizes a large body of behavioral and neuroscientific findings which suggest that the musical experience of trained musicians does modulate speech processing, and a sparser set of data, largely on pitch processing, which suggest in addition that linguistic experience, in particular learning a tone language, modulates music processing. Although research has focused mostly on music on speech effects, we argue that both directions of influence need to be studied, and conclude that the picture which thus emerges is one of mutual interaction across domains. In particular, it is not simply that experience with spoken language has some effects on music perception, and vice versa, but that because of shared domain-general subcortical and cortical networks, experiences in both domains influence behavior in both domains. -
Brandmeyer, A., Sadakata, M., Spyrou, L., McQueen, J. M., & Desain, P. (2013). Decoding of single-trial auditory mismatch responses for online perceptual monitoring and neurofeedback. Frontiers in Neuroscience, 7: 265. doi:10.3389/fnins.2013.00265.
Abstract
Multivariate pattern classification methods are increasingly applied to neuroimaging data in the context of both fundamental research and in brain-computer interfacing approaches. Such methods provide a framework for interpreting measurements made at the single-trial level with respect to a set of two or more distinct mental states. Here, we define an approach in which the output of a binary classifier trained on data from an auditory mismatch paradigm can be used for online tracking of perception and as a neurofeedback signal. The auditory mismatch paradigm is known to induce distinct perceptual states related to the presentation of high- and low-probability stimuli, which are reflected in event-related potential (ERP) components such as the mismatch negativity (MMN). The first part of this paper illustrates how pattern classification methods can be applied to data collected in an MMN paradigm, including discussion of the optimization of preprocessing steps, the interpretation of features and how the performance of these methods generalizes across individual participants and measurement sessions. We then go on to show that the output of these decoding methods can be used in online settings as a continuous index of single-trial brain activation underlying perceptual discrimination. We conclude by discussing several potential domains of application, including neurofeedback, cognitive monitoring and passive brain-computer interfaces. -
Brandmeyer, A., Farquhar, J., McQueen, J. M., & Desain, P. (2013). Decoding speech perception by native and non-native speakers using single-trial electrophysiological data. PLoS One, 8: e68261. doi:10.1371/journal.pone.0068261.
Abstract
Brain-computer interfaces (BCIs) are systems that use real-time analysis of neuroimaging data to determine the mental state of their user for purposes such as providing neurofeedback. Here, we investigate the feasibility of a BCI based on speech perception. Multivariate pattern classification methods were applied to single-trial EEG data collected during speech perception by native and non-native speakers. Two principal questions were asked: 1) Can differences in the perceived categories of pairs of phonemes be decoded at the single-trial level? 2) Can these same categorical differences be decoded across participants, within or between native-language groups? Results indicated that classification performance progressively increased with respect to the categorical status (within, boundary or across) of the stimulus contrast, and was also influenced by the native language of individual participants. Classifier performance showed strong relationships with traditional event-related potential measures and behavioral responses. The results of the cross-participant analysis indicated an overall increase in average classifier performance when trained on data from all participants (native and non-native). A second cross-participant classifier trained only on data from native speakers led to an overall improvement in performance for native speakers, but a reduction in performance for non-native speakers. We also found that the native language of a given participant could be decoded on the basis of EEG data with accuracy above 80%. These results indicate that electrophysiological responses underlying speech perception can be decoded at the single-trial level, and that decoding performance systematically reflects graded changes in the responses related to the phonological status of the stimuli. This approach could be used in extensions of the BCI paradigm to support perceptual learning during second language acquisition. -
Mani, N., Johnson, E., McQueen, J. M., & Huettig, F. (2013). How yellow is your banana? Toddlers' language-mediated visual search in referent-present tasks. Developmental Psychology, 49, 1036-1044. doi:10.1037/a0029382.
Abstract
What is the relative salience of different aspects of word meaning in the developing lexicon? The current study examines the time-course of retrieval of semantic and color knowledge associated with words during toddler word recognition: at what point do toddlers orient towards an image of a yellow cup upon hearing color-matching words such as “banana” (typically yellow) relative to unrelated words (e.g., “house”)? Do children orient faster to semantic matching images relative to color matching images, e.g., orient faster to an image of a cookie relative to a yellow cup upon hearing the word “banana”? The results strongly suggest a prioritization of semantic information over color information in children’s word-referent mappings. This indicates that, even for natural objects (e.g., food, animals that are more likely to have a prototypical color), semantic knowledge is a more salient aspect of toddler's word meaning than color knowledge. For 24-month-old Dutch toddlers, bananas are thus more edible than they are yellow. -
Mitterer, H., Scharenborg, O., & McQueen, J. M. (2013). Phonological abstraction without phonemes in speech perception. Cognition, 129, 356-361. doi:10.1016/j.cognition.2013.07.011.
Abstract
Recent evidence shows that listeners use abstract prelexical units in speech perception. Using the phenomenon of lexical retuning in speech processing, we ask whether those units are necessarily phonemic. Dutch listeners were exposed to a Dutch speaker producing ambiguous phones between the Dutch syllable-final allophones approximant [r] and dark [l]. These ambiguous phones replaced either final /r/ or final /l/ in words in a lexical-decision task. This differential exposure affected perception of ambiguous stimuli on the same allophone continuum in a subsequent phonetic-categorization test: Listeners exposed to ambiguous phones in /r/-final words were more likely to perceive test stimuli as /r/ than listeners with exposure in /l/-final words. This effect was not found for test stimuli on continua using other allophones of /r/ and /l/. These results confirm that listeners use phonological abstraction in speech perception. They also show that context-sensitive allophones can play a role in this process, and hence that context-insensitive phonemes are not necessary. We suggest there may be no one unit of perception. -
Sadakata, M., & McQueen, J. M. (2013). High stimulus variability in nonnative speech learning supports formation of abstract categories: Evidence from Japanese geminates. Journal of the Acoustical Society of America, 134(2), 1324-1335. doi:10.1121/1.4812767.
Abstract
This study reports effects of a high-variability training procedure on nonnative learning of a Japanese geminate-singleton fricative contrast. Thirty native speakers of Dutch took part in a 5-day training procedure in which they identified geminate and singleton variants of the Japanese fricative /s/. Participants were trained with either many repetitions of a limited set of words recorded by a single speaker (low-variability training) or with fewer repetitions of a more variable set of words recorded by multiple speakers (high-variability training). Both types of training enhanced identification of speech but not of nonspeech materials, indicating that learning was domain specific. High-variability training led to superior performance in identification but not in discrimination tests, and supported better generalization of learning as shown by transfer from the trained fricatives to the identification of untrained stops and affricates. Variability thus helps nonnative listeners to form abstract categories rather than to enhance early acoustic analysis. -
Sjerps, M. J., McQueen, J. M., & Mitterer, H. (2013). Evidence for precategorical extrinsic vowel normalization. Attention, Perception & Psychophysics, 75, 576-587. doi:10.3758/s13414-012-0408-7.
Abstract
Three experiments investigated whether extrinsic vowel normalization takes place largely at a categorical or a precategorical level of processing. Traditional vowel normalization effects in categorization were replicated in Experiment 1: Vowels taken from an [ɪ]-[ε] continuum were more often interpreted as /ɪ/ (which has a low first formant, F1) when the vowels were heard in contexts that had a raised F1 than when the contexts had a lowered F1. This was established with contexts that consisted of only two syllables. These short contexts were necessary for Experiment 2, a discrimination task that encouraged listeners to focus on the perceptual properties of vowels at a precategorical level. Vowel normalization was again found: Ambiguous vowels were more easily discriminated from an endpoint [ε] than from an endpoint [ɪ] in a high-F1 context, whereas the opposite was true in a low-F1 context. Experiment 3 measured discriminability between pairs of steps along the [ɪ]-[ε] continuum. Contextual influences were again found, but without discrimination peaks, contrary to what was predicted from the same participants' categorization behavior. Extrinsic vowel normalization therefore appears to be a process that takes place at least in part at a precategorical processing level. -
Witteman, M. J., Weber, A., & McQueen, J. M. (2013). Foreign accent strength and listener familiarity with an accent co-determine speed of perceptual adaptation. Attention, Perception & Psychophysics, 75, 537-556. doi:10.3758/s13414-012-0404-y.
Abstract
We investigated how the strength of a foreign accent and varying types of experience with foreign-accented speech influence the recognition of accented words. In Experiment 1, native Dutch listeners with limited or extensive prior experience with German-accented Dutch completed a cross-modal priming experiment with strongly, medium, and weakly accented words. Participants with limited experience were primed by the medium and weakly accented words, but not by the strongly accented words. Participants with extensive experience were primed by all accent types. In Experiments 2 and 3, Dutch listeners with limited experience listened to a short story before doing the cross-modal priming task. In Experiment 2, the story was spoken by the priming task speaker and either contained strongly accented words or did not. Strongly accented exposure led to immediate priming by novel strongly accented words, while exposure to the speaker without strongly accented tokens led to priming only in the experiment’s second half. In Experiment 3, listeners listened to the story with strongly accented words spoken by a different German-accented speaker. Listeners were primed by the strongly accented words, but again only in the experiment’s second half. Together, these results show that adaptation to foreign-accented speech is rapid but depends on accent strength and on listener familiarity with those strongly accented words. -
Adank, P., & McQueen, J. M. (2007). The effect of an unfamiliar regional accent on spoken-word comprehension. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1925-1928). Dudweiler: Pirrot.
Abstract
This study aimed first to determine whether there is a delay associated with processing words in an unfamiliar regional accent compared to words in a familiar regional accent, and second to establish whether short-term exposure to an unfamiliar accent affects the speed and accuracy of comprehension of words spoken in that accent. Listeners performed an animacy decision task for words spoken in their own and in an unfamiliar accent. Next, they were exposed to approximately 20 minutes of speech in one of these two accents. After exposure, they repeated the animacy decision task. Results showed a considerable delay in word processing for the unfamiliar accent, but no effect of short-term exposure. -
Andics, A., McQueen, J. M., & Van Turennout, M. (2007). Phonetic content influences voice discriminability. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1829-1832). Dudweiler: Pirrot.
Abstract
We present results from an experiment which shows that voice perception is influenced by the phonetic content of speech. Dutch listeners were presented with thirteen speakers pronouncing CVC words with systematically varying segmental content, and they had to discriminate the speakers’ voices. Results show that certain segments help listeners discriminate voices more than other segments do. Voice information can be extracted from every segmental position of a monosyllabic word and is processed rapidly. We also show that although relative discriminability within a closed set of voices appears to be a stable property of a voice, it is also influenced by segmental cues – that is, perceived uniqueness of a voice depends on what that voice says. -
Cho, T., McQueen, J. M., & Cox, E. A. (2007). Prosodically driven phonetic detail in speech processing: The case of domain-initial strengthening in English. Journal of Phonetics, 35(2), 210-243. doi:10.1016/j.wocn.2006.03.003.
Abstract
We explore the role of the acoustic consequences of domain-initial strengthening in spoken-word recognition. In two cross-modal identity-priming experiments, listeners heard sentences and made lexical decisions to visual targets, presented at the onset of the second word in two-word sequences containing lexical ambiguities (e.g., bus tickets, with the competitor bust). These sequences contained Intonational Phrase (IP) or Prosodic Word (Wd) boundaries, and the second word's initial Consonant and Vowel (CV, e.g., [tI]) was spliced from another token of the sequence in IP- or Wd-initial position. Acoustic analyses showed that IP-initial consonants were articulated more strongly than Wd-initial consonants. In Experiment 1, related targets were post-boundary words (e.g., tickets). No strengthening effect was observed (i.e., identity priming effects did not vary across splicing conditions). In Experiment 2, related targets were pre-boundary words (e.g., bus). There was a strengthening effect (stronger priming when the post-boundary CVs were spliced from IP-initial than from Wd-initial position), but only in Wd-boundary contexts. These were the conditions where phonetic detail associated with domain-initial strengthening could assist listeners most in lexical disambiguation. We discuss how speakers may strengthen domain-initial segments during production and how listeners may use the resulting acoustic correlates of prosodic strengthening during word recognition. -
Huettig, F., & McQueen, J. M. (2007). The tug of war between phonological, semantic and shape information in language-mediated visual search. Journal of Memory and Language, 57(4), 460-482. doi:10.1016/j.jml.2007.02.001.
Abstract
Experiments 1 and 2 examined the time-course of retrieval of phonological, visual-shape and semantic knowledge as Dutch participants listened to sentences and looked at displays of four pictures. Given a sentence with beker, `beaker', for example, the display contained phonological (a beaver, bever), shape (a bobbin, klos), and semantic (a fork, vork) competitors. When the display appeared at sentence onset, fixations to phonological competitors preceded fixations to shape and semantic competitors. When display onset was 200 ms before (e.g.) beker, fixations were directed to shape and then semantic competitors, but not phonological competitors. In Experiments 3 and 4, displays contained the printed names of the previously-pictured entities; only phonological competitors were fixated preferentially. These findings suggest that retrieval of phonological, shape and semantic knowledge in the spoken-word and picture-recognition systems is cascaded, and that visual attention shifts are co-determined by the time-course of retrieval of all three knowledge types and by the nature of the information in the visual environment. -
Jesse, A., & McQueen, J. M. (2007). Prelexical adjustments to speaker idiosyncracies: Are they position-specific? In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1597-1600). Adelaide: Causal Productions.
Abstract
Listeners use lexical knowledge to adjust their prelexical representations of speech sounds in response to the idiosyncratic pronunciations of particular speakers. We used an exposure-test paradigm to investigate whether this type of perceptual learning transfers across syllabic positions. No significant learning effect was found in Experiment 1, where exposure sounds were onsets and test sounds were codas. Experiments 2-4 showed that there was no learning even when both exposure and test sounds were onsets. But a trend was found when exposure sounds were codas and test sounds were onsets (Experiment 5). This trend was smaller than the robust effect previously found for the coda-to-coda case. These findings suggest that knowledge about idiosyncratic pronunciations may be position specific: Knowledge about how a speaker produces sounds in one position, if it can be acquired at all, influences perception of sounds in that position more strongly than of sounds in another position. -
Jesse, A., McQueen, J. M., & Page, M. (2007). The locus of talker-specific effects in spoken-word recognition. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1921-1924). Dudweiler: Pirrot.
Abstract
Words repeated in the same voice are better recognized than when they are repeated in a different voice. Such findings have been taken as evidence for the storage of talker-specific lexical episodes. But results on perceptual learning suggest that talker-specific adjustments concern sublexical representations. This study thus investigates whether voice-specific repetition effects in auditory lexical decision are lexical or sublexical. The same critical set of items in Block 2 were, depending on materials in Block 1, either same-voice or different-voice word repetitions, new words comprising re-orderings of phonemes used in the same voice in Block 1, or new words with previously unused phonemes. Results show a benefit for words repeated by the same talker, and a smaller benefit for words consisting of phonemes repeated by the same talker. Talker-specific information thus appears to influence word recognition at multiple representational levels. -
Jesse, A., & McQueen, J. M. (2007). Visual lexical stress information in audiovisual spoken-word recognition. In J. Vroomen, M. Swerts, & E. Krahmer (Eds.), Proceedings of the International Conference on Auditory-Visual Speech Processing 2007 (pp. 162-166). Tilburg: University of Tilburg.
Abstract
Listeners use suprasegmental auditory lexical stress information to resolve the competition words engage in during spoken-word recognition. The present study investigated whether (a) visual speech provides lexical stress information, and, more importantly, (b) whether this visual lexical stress information is used to resolve lexical competition. Dutch word pairs that differ in the lexical stress realization of their first two syllables, but not segmentally (e.g., 'OCtopus' and 'okTOber'; capitals marking primary stress) served as auditory-only, visual-only, and audiovisual speech primes. These primes either matched (e.g., 'OCto-'), mismatched (e.g., 'okTO-'), or were unrelated to (e.g., 'maCHI-') a subsequent printed target (octopus), which participants had to make a lexical decision to. To the degree that visual speech contains lexical stress information, lexical decisions to printed targets should be modulated through the addition of visual speech. Results show, however, no evidence for a role of visual lexical stress information in audiovisual spoken-word recognition. -
McQueen, J. M., & Viebahn, M. C. (2007). Tracking recognition of spoken words by tracking looks to printed words. Quarterly Journal of Experimental Psychology, 60(5), 661-671. doi:10.1080/17470210601183890.
Abstract
Eye movements of Dutch participants were tracked as they looked at arrays of four words on a computer screen and followed spoken instructions (e.g., "Klik op het woord buffel": Click on the word buffalo). The arrays included the target (e.g., buffel), a phonological competitor (e.g., buffer, buffer), and two unrelated distractors. Targets were monosyllabic or bisyllabic, and competitors mismatched targets only on either their onset or offset phoneme and only by one distinctive feature. Participants looked at competitors more than at distractors, but this effect was much stronger for offset-mismatch than onset-mismatch competitors. Fixations to competitors started to decrease as soon as phonetic evidence disfavouring those competitors could influence behaviour. These results confirm that listeners continuously update their interpretation of words as the evidence in the speech signal unfolds and hence establish the viability of the methodology of using eye movements to arrays of printed words to track spoken-word recognition. -
McQueen, J. M. (2007). Eight questions about spoken-word recognition. In M. G. Gaskell (Ed.), The Oxford handbook of psycholinguistics (pp. 37-53). Oxford: Oxford University Press.
Abstract
This chapter is a review of the literature in experimental psycholinguistics on spoken word recognition. It is organized around eight questions. 1. Why are psycholinguists interested in spoken word recognition? 2. What information in the speech signal is used in word recognition? 3. Where are the words in the continuous speech stream? 4. Which words did the speaker intend? 5. When, as the speech signal unfolds over time, are the phonological forms of words recognized? 6. How are words recognized? 7. Whither spoken word recognition? 8. Who are the researchers in the field? -
Mitterer, H., & McQueen, J. M. (2007). Tracking perception of pronunciation variation by tracking looks to printed words: The case of word-final /t/. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1929-1932). Dudweiler: Pirrot.
Abstract
We investigated perception of words with reduced word-final /t/ using an adapted eyetracking paradigm. Dutch listeners followed spoken instructions to click on printed words which were accompanied on a computer screen by simple shapes (e.g., a circle). Targets were either above or next to their shapes, and the shapes uniquely identified the targets when the spoken forms were ambiguous between words with or without final /t/ (e.g., bult, bump, vs. bul, diploma). Analysis of listeners’ eye-movements revealed, in contrast to earlier results, that listeners use the following segmental context when compensating for /t/-reduction. Reflecting that /t/-reduction is more likely to occur before bilabials, listeners were more likely to look at the /t/-final words if the next word’s first segment was bilabial. This result supports models of speech perception in which prelexical phonological processes use segmental context to modulate word recognition. -
Stevens, M. A., McQueen, J. M., & Hartsuiker, R. J. (2007). No lexically-driven perceptual adjustments of the [x]-[h] boundary. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1897-1900). Dudweiler: Pirrot.
Abstract
Listeners can make perceptual adjustments to phoneme categories in response to a talker who consistently produces a specific phoneme ambiguously. We investigate here whether this type of perceptual learning is also used to adapt to regional accent differences. Listeners were exposed to words produced by a Flemish talker whose realization of [x] or [h] was ambiguous (producing [x] like [h] is a property of the West-Flanders regional accent). Before and after exposure they categorized a [x]-[h] continuum. For both Dutch and Flemish listeners there was no shift of the categorization boundary after exposure to ambiguous sounds in [x]- or [h]-biasing contexts. The absence of a lexically-driven learning effect for this contrast may be because [h] is strongly influenced by coarticulation. As [h] is not stable across contexts, it may be futile to adapt its representation when new realizations are heard. -
Cutler, A., McQueen, J. M., & Zondervan, R. (2000). Proceedings of SWAP (Workshop on Spoken Word Access Processes). Nijmegen: MPI for Psycholinguistics.
-
Cutler, A., Norris, D., & McQueen, J. M. (2000). Tracking TRACE’s troubles. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 63-66). Nijmegen: Max-Planck-Institute for Psycholinguistics.
Abstract
Simulations explored the inability of the TRACE model of spoken-word recognition to model the effects on human listening of acoustic-phonetic mismatches in word forms. The source of TRACE's failure lay not in its interactive connectivity, not in the presence of interword competition, and not in the use of phonemic representations, but in the need for continuously optimised interpretation of the input. When an analogue of TRACE was allowed to cycle to asymptote on every slice of input, an acceptable simulation of the subcategorical mismatch data was achieved. Even then, however, the simulation was not as close as that produced by the Merge model. -
McQueen, J. M., Cutler, A., & Norris, D. (2000). Positive and negative influences of the lexicon on phonemic decision-making. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the Sixth International Conference on Spoken Language Processing: Vol. 3 (pp. 778-781). Beijing: China Military Friendship Publish.
Abstract
Lexical knowledge influences how human listeners make decisions about speech sounds. Positive lexical effects (faster responses to target sounds in words than in nonwords) are robust across several laboratory tasks, while negative effects (slower responses to targets in more word-like nonwords than in less word-like nonwords) have been found in phonetic decision tasks but not phoneme monitoring tasks. The present experiments tested whether negative lexical effects are therefore a task-specific consequence of the forced choice required in phonetic decision. We compared phoneme monitoring and phonetic decision performance using the same Dutch materials in each task. In both experiments there were positive lexical effects, but no negative lexical effects. We observe that in all studies showing negative lexical effects, the materials were made by cross-splicing, which meant that they contained perceptual evidence supporting the lexically-consistent phonemes. Lexical knowledge seems to influence phonemic decision-making only when there is evidence for the lexically-consistent phoneme in the speech signal. -
McQueen, J. M., Cutler, A., & Norris, D. (2000). Why Merge really is autonomous and parsimonious. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 47-50). Nijmegen: Max-Planck-Institute for Psycholinguistics.
Abstract
We briefly describe the Merge model of phonemic decision-making, and, in the light of general arguments about the possible role of feedback in spoken-word recognition, defend Merge's feedforward structure. Merge not only accounts adequately for the data, without invoking feedback connections, but does so in a parsimonious manner. -
Norris, D., McQueen, J. M., & Cutler, A. (2000). Feedback on feedback on feedback: It’s feedforward. (Response to commentators). Behavioral and Brain Sciences, 23, 352-370.
Abstract
The central thesis of the target article was that feedback is never necessary in spoken word recognition. The commentaries present no new data and no new theoretical arguments which lead us to revise this position. In this response we begin by clarifying some terminological issues which have led to a number of significant misunderstandings. We provide some new arguments to support our case that the feedforward model Merge is indeed more parsimonious than the interactive alternatives, and that it provides a more convincing account of the data than alternative models. Finally, we extend the arguments to deal with new issues raised by the commentators such as infant speech perception and neural architecture. -
Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23, 299-325.
Abstract
Top-down feedback does not benefit speech recognition; on the contrary, it can hinder it. No experimental data imply that feedback loops are required for speech recognition. Feedback is accordingly unnecessary and spoken word recognition is modular. To defend this thesis, we analyse lexical involvement in phonemic decision making. TRACE (McClelland & Elman 1986), a model with feedback from the lexicon to prelexical processes, is unable to account for all the available data on phonemic decision making. The modular Race model (Cutler & Norris 1979) is likewise challenged by some recent results, however. We therefore present a new modular model of phonemic decision making, the Merge model. In Merge, information flows from prelexical processes to the lexicon without feedback. Because phonemic decisions are based on the merging of prelexical and lexical information, Merge correctly predicts lexical involvement in phonemic decisions in both words and nonwords. Computer simulations show how Merge is able to account for the data through a process of competition between lexical hypotheses. We discuss the issue of feedback in other areas of language processing and conclude that modular models are particularly well suited to the problems and constraints of speech recognition. -
Norris, D., Cutler, A., McQueen, J. M., Butterfield, S., & Kearns, R. K. (2000). Language-universal constraints on the segmentation of English. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 43-46). Nijmegen: Max-Planck-Institute for Psycholinguistics.
Abstract
Two word-spotting experiments are reported that examine whether the Possible-Word Constraint (PWC) [1] is a language-specific or language-universal strategy for the segmentation of continuous speech. The PWC disfavours parses which leave an impossible residue between the end of a candidate word and a known boundary. The experiments examined cases where the residue was either a CV syllable with a lax vowel, or a CVC syllable with a schwa. Although neither syllable context is a possible word in English, word-spotting in both contexts was easier than with a context consisting of a single consonant. The PWC appears to be language-universal rather than language-specific. -
Norris, D., Cutler, A., & McQueen, J. M. (2000). The optimal architecture for simulating spoken-word recognition. In C. Davis, T. Van Gelder, & R. Wales (Eds.), Cognitive Science in Australia, 2000: Proceedings of the Fifth Biennial Conference of the Australasian Cognitive Science Society. Adelaide: Causal Productions.
Abstract
Simulations explored the inability of the TRACE model of spoken-word recognition to model the effects on human listening of subcategorical mismatch in word forms. The source of TRACE's failure lay not in interactive connectivity, not in the presence of inter-word competition, and not in the use of phonemic representations, but in the need for continuously optimised interpretation of the input. When an analogue of TRACE was allowed to cycle to asymptote on every slice of input, an acceptable simulation of the subcategorical mismatch data was achieved. Even then, however, the simulation was not as close as that produced by the Merge model, which has inter-word competition, phonemic representations and continuous optimisation (but no interactive connectivity).