James McQueen

Publications

Displaying 1 - 34 of 34
  • El Aissati, A., McQueen, J. M., & Cutler, A. (2012). Finding words in a language that allows words without vowels. Cognition, 124, 79-84. doi:10.1016/j.cognition.2012.03.006.

    Abstract

    Across many languages from unrelated families, spoken-word recognition is subject to a constraint whereby potential word candidates must contain a vowel. This constraint minimizes competition from embedded words (e.g., in English, disfavoring win in twin because t cannot be a word). However, the constraint would be counter-productive in certain languages that allow stand-alone vowelless open-class words. One such language is Berber (where t is indeed a word). Berber listeners here detected words affixed to nonsense contexts with or without vowels. Length effects seen in other languages replicated in Berber, but in contrast to prior findings, word detection was not hindered by vowelless contexts. When words can be vowelless, otherwise universal constraints disfavoring vowelless words do not feature in spoken-word recognition.

    Additional information

    mmc1.pdf
  • Brandmeyer, A., Desain, P. W., & McQueen, J. M. (2012). Effects of native language on perceptual sensitivity to phonetic cues. Neuroreport, 23, 653-657. doi:10.1097/WNR.0b013e32835542cd.

    Abstract

    The present study used electrophysiological and behavioral measures to investigate the perception of an English stop consonant contrast by native English listeners and by native Dutch listeners who were highly proficient in English. A /ba/-/pa/ continuum was created from a naturally produced /pa/ token by removing successive periods of aspiration, thus reducing the voice onset time. Although aspiration is a relevant cue for distinguishing voiced and unvoiced labial stop consonants (/b/ and /p/) in English, prevoicing is the primary cue used to distinguish between these categories in Dutch. In the electrophysiological experiment, participants listened to oddball sequences containing the standard /pa/ stimulus and one of three deviant stimuli while the mismatch-negativity response was measured. Participants then completed an identification task on the same stimuli. The results showed that native English participants were more sensitive to reductions in aspiration than native Dutch participants, as indicated by shifts in the category boundary, by differing within-group patterns of mismatch-negativity responses, and by larger mean evoked potential amplitudes in the native English group for two of the three deviant stimuli. This between-group difference in the sensorineural processing of aspiration cues indicates that native language experience alters the way in which the acoustic features of speech are processed in the auditory brain, even following extensive second-language training.

    Files private

    Request files
  • Kim, S., Cho, T., & McQueen, J. M. (2012). Phonetic richness can outweigh prosodically-driven phonological knowledge when learning words in an artificial language. Journal of Phonetics, 40, 443-452. doi:10.1016/j.wocn.2012.02.005.

    Abstract

    How do Dutch and Korean listeners use acoustic–phonetic information when learning words in an artificial language? Dutch has a voiceless ‘unaspirated’ stop, produced with shortened Voice Onset Time (VOT) in prosodic strengthening environments (e.g., in domain-initial position and under prominence), enhancing the feature {−spread glottis}; Korean has a voiceless ‘aspirated’ stop produced with lengthened VOT in similar environments, enhancing the feature {+spread glottis}. Given this cross-linguistic difference, two competing hypotheses were tested. The phonological-superiority hypothesis predicts that Dutch and Korean listeners should utilize shortened and lengthened VOTs, respectively, as cues in artificial-language segmentation. The phonetic-superiority hypothesis predicts that both groups should take advantage of the phonetic richness of longer VOTs (i.e., their enhanced auditory–perceptual robustness). Dutch and Korean listeners learned the words of an artificial language better when word-initial stops had longer VOTs than when they had shorter VOTs. It appears that language-specific phonological knowledge can be overridden by phonetic richness in processing an unfamiliar language. Listeners nonetheless performed better when the stimuli were based on the speech of their native languages, suggesting that the use of richer phonetic information was modulated by listeners' familiarity with the stimuli.
  • McQueen, J. M., & Huettig, F. (2012). Changing only the probability that spoken words will be distorted changes how they are recognized. Journal of the Acoustical Society of America, 131(1), 509-517. doi:10.1121/1.3664087.

    Abstract

    An eye-tracking experiment examined contextual flexibility in speech processing in response to distortions in spoken input. Dutch participants heard Dutch sentences containing critical words and saw four-picture displays. The name of one picture either had the same onset phonemes as the critical word or had a different first phoneme and rhymed. Participants fixated onset-overlap more than rhyme-overlap pictures, but this tendency varied with speech quality. Relative to a baseline with noise-free sentences, participants looked less at onset-overlap and more at rhyme-overlap pictures when phonemes in the sentences (but not in the critical words) were replaced by noises like those heard on a badly-tuned AM radio. The position of the noises (word-initial or word-medial) had no effect. Noises elsewhere in the sentences apparently made evidence about the critical word less reliable: Listeners became less confident of having heard the onset-overlap name but also less sure of having not heard the rhyme-overlap name. The same acoustic information has different effects on spoken-word recognition as the probability of distortion changes.
  • McQueen, J. M., Tyler, M., & Cutler, A. (2012). Lexical retuning of children’s speech perception: Evidence for knowledge about words’ component sounds. Language Learning and Development, 8, 317-339. doi:10.1080/15475441.2011.641887.

    Abstract

    Children hear new words from many different talkers; to learn words most efficiently, they should be able to represent them independently of talker-specific pronunciation detail. However, do children know what the component sounds of words should be, and can they use that knowledge to deal with different talkers' phonetic realizations? Experiment 1 replicated prior studies on lexically guided retuning of speech perception in adults, with a picture-verification methodology suitable for children. One participant group heard an ambiguous fricative ([s/f]) replacing /f/ (e.g., in words like giraffe); another group heard [s/f] replacing /s/ (e.g., in platypus). The first group subsequently identified more tokens on a Simpie-[s/f]impie-Fimpie toy-name continuum as Fimpie. Experiments 2 and 3 found equivalent lexically guided retuning effects in 12- and 6-year-olds. Children aged 6 have all that is needed for adjusting to talker variation in speech: detailed and abstract phonological representations and the ability to apply them during spoken-word recognition.

    Files private

    Request files
  • Poellmann, K., McQueen, J. M., & Mitterer, H. (2012). How talker-adaptation helps listeners recognize reduced word-forms [Abstract]. Program abstracts from the 164th Meeting of the Acoustical Society of America published in the Journal of the Acoustical Society of America, 132(3), 2053.

    Abstract

    Two eye-tracking experiments tested whether native listeners can adapt
    to reductions in casual Dutch speech. Listeners were exposed to segmental
    ([b] > [m]), syllabic (full-vowel-deletion), or no reductions. In a subsequent
    test phase, all three listener groups were tested on how efficiently they could
    recognize both types of reduced words. In the first Experiment’s exposure
    phase, the (un)reduced target words were predictable. The segmental reductions
    were completely consistent (i.e., involved the same input sequences).
    Learning about them was found to be pattern-specific and generalized in the
    test phase to new reduced /b/-words. The syllabic reductions were not consistent
    (i.e., involved variable input sequences). Learning about them was
    weak and not pattern-specific. Experiment 2 examined effects of word repetition
    and predictability. The (un-)reduced test words appeared in the exposure
    phase and were not predictable. There was no evidence of learning for
    the segmental reductions, probably because they were not predictable during
    exposure. But there was word-specific learning for the vowel-deleted words.
    The results suggest that learning about reductions is pattern-specific and
    generalizes to new words if the input is consistent and predictable. With
    variable input, there is more likely to be adaptation to a general speaking
    style and word-specific learning.
  • Sjerps, M. J., Mitterer, H., & McQueen, J. M. (2012). Hemispheric differences in the effects of context on vowel perception. Brain and Language, 120, 401-405. doi:10.1016/j.bandl.2011.12.012.

    Abstract

    Listeners perceive speech sounds relative to context. Contextual influences might differ over hemispheres if different types of auditory processing are lateralized. Hemispheric differences in contextual influences on vowel perception were investigated by presenting speech targets and both speech and non-speech contexts to listeners’ right or left ears (contexts and targets either to the same or to opposite ears). Listeners performed a discrimination task. Vowel perception was influenced by acoustic properties of the context signals. The strength of this influence depended on laterality of target presentation, and on the speech/non-speech status of the context signal. We conclude that contrastive contextual influences on vowel perception are stronger when targets are processed predominately by the right hemisphere. In the left hemisphere, contrastive effects are smaller and largely restricted to speech contexts.
  • Sjerps, M. J., McQueen, J. M., & Mitterer, H. (2012). Extrinsic normalization for vocal tracts depends on the signal, not on attention. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 394-397).

    Abstract

    When perceiving vowels, listeners adjust to speaker-specific vocal-tract characteristics (such as F1) through "extrinsic vowel normalization". This effect is observed as a shift in the location of categorization boundaries of vowel continua. Similar effects have been found with non-speech. Non-speech materials, however, have consistently led to smaller effect-sizes, perhaps because of a lack of attention to non-speech. The present study investigated this possibility. Non-speech materials that had previously been shown to elicit reduced normalization effects were tested again, with the addition of an attention manipulation. The results show that increased attention does not lead to increased normalization effects, suggesting that vowel normalization is mainly determined by bottom-up signal characteristics.
  • Sulpizio, S., & McQueen, J. M. (2012). Italians use abstract knowledge about lexical stress during spoken-word recognition. Journal of Memory and Language, 66, 177-193. doi:10.1016/j.jml.2011.08.001.

    Abstract

    In two eye-tracking experiments in Italian, we investigated how acoustic information and stored knowledge about lexical stress are used during the recognition of tri-syllabic spoken words. Experiment 1 showed that Italians use acoustic cues to a word’s stress pattern rapidly in word recognition, but only for words with antepenultimate stress. Words with penultimate stress – the most common pattern – appeared to be recognized by default. In Experiment 2, listeners had to learn new words from which some stress cues had been removed, and then recognize reduced- and full-cue versions of those words. The acoustic manipulation affected recognition only of newly-learnt words with antepenultimate stress: Full-cue versions, even though they were never heard during training, were recognized earlier than reduced-cue versions. Newly-learnt words with penultimate stress were recognized earlier overall, but recognition of the two versions of these words did not differ. Abstract knowledge (i.e., knowledge generalized over the lexicon) about lexical stress – which pattern is the default and which cues signal the non-default pattern – appears to be used during the recognition of known and newly-learnt Italian words.
  • Viebahn, M. C., Ernestus, M., & McQueen, J. M. (2012). Co-occurrence of reduced word forms in natural speech. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 2019-2022).

    Abstract

    This paper presents a corpus study that investigates the co-occurrence of reduced word forms in natural speech. We extracted Dutch past participles from three different speech registers and investigated the influence of several predictor variables on the presence and duration of schwas in prefixes and /t/s in suffixes. Our results suggest that reduced word forms tend to co-occur even if we partial out the effect of speech rate. The implications of our findings for episodic and abstractionist models of lexical representation are discussed.
  • Warner, N. L., McQueen, J. M., Liu, P. Z., Hoffmann, M., & Cutler, A. (2012). Timing of perception for all English diphones [Abstract]. Program abstracts from the 164th Meeting of the Acoustical Society of America published in the Journal of the Acoustical Society of America, 132(3), 1967.

    Abstract

    Information in speech does not unfold discretely over time; perceptual cues are gradient and overlapped. However, this varies greatly across segments and environments: listeners cannot identify the affricate in /ptS/ until the frication, but information about the vowel in /li/ begins early. Unlike most prior studies, which have concentrated on subsets of language sounds, this study tests perception of every English segment in every phonetic environment, sampling perceptual identification at six points in time (13,470 stimuli/listener; 20 listeners). Results show that information about consonants after another segment is most localized for affricates (almost entirely in the release), and most gradual for voiced stops. In comparison to stressed vowels, unstressed vowels have less information spreading to
    neighboring segments and are less well identified. Indeed, many vowels,
    especially lax ones, are poorly identified even by the end of the following segment. This may partly reflect listeners’ familiarity with English vowels’ dialectal variability. Diphthongs and diphthongal tense vowels show the most sudden improvement in identification, similar to affricates among the consonants, suggesting that information about segments defined by acoustic change is highly localized. This large dataset provides insights into speech perception and data for probabilistic modeling of spoken word recognition.
  • Cho, T., & McQueen, J. M. (2006). Phonological versus phonetic cues in native and non-native listening: Korean and Dutch listeners' perception of Dutch and English consonants. Journal of the Acoustical Society of America, 119(5), 3085-3096. doi:10.1121/1.2188917.

    Abstract

    We investigated how listeners of two unrelated languages, Korean and Dutch, process phonologically viable and nonviable consonants spoken in Dutch and American English. To Korean listeners, released final stops are nonviable because word-final stops in Korean are never released in words spoken in isolation, but to Dutch listeners, unreleased word-final stops are nonviable because word-final stops in Dutch are generally released in words spoken in isolation. Two phoneme monitoring experiments showed a phonological effect on both Dutch and English stimuli: Korean listeners detected the unreleased stops more rapidly whereas Dutch listeners detected the released stops more rapidly and/or more accurately. The Koreans, however, detected released stops more accurately than unreleased stops, but only in the non-native language they were familiar with (English). The results suggest that, in non-native speech perception, phonological legitimacy in the native language can be more important than the richness of phonetic information, though familiarity with phonetic detail in the non-native language can also improve listening performance.
  • Cutler, A., Eisner, F., McQueen, J. M., & Norris, D. (2006). Coping with speaker-related variation via abstract phonemic categories. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 31-32).
  • Eisner, F., & McQueen, J. M. (2006). Perceptual learning in speech: Stability over time (L). Journal of the Acoustical Society of America, 119(4), 1950-1953. doi:10.1121/1.2178721.

    Abstract

    Perceptual representations of phonemes are flexible and adapt rapidly to accommodate idiosyncratic articulation in the speech of a particular talker. This letter addresses whether such adjustments remain stable over time and under exposure to other talkers. During exposure to a story, listeners learned to interpret an ambiguous sound as [f] or [s]. Perceptual adjustments measured after 12 h were as robust as those measured immediately after learning. Equivalent effects were found when listeners heard speech from other talkers in the 12 h interval, and when they had the opportunity to consolidate learning during sleep.
  • McQueen, J. M., Cutler, A., & Norris, D. (2006). Phonological abstraction in the mental lexicon. Cognitive Science, 30(6), 1113-1126. doi:10.1207/s15516709cog0000_79.

    Abstract

    A perceptual learning experiment provides evidence that the mental lexicon cannot consist solely of detailed acoustic traces of recognition episodes. In a training lexical decision phase, listeners heard an ambiguous [f–s] fricative sound, replacing either [f] or [s] in words. In a test phase, listeners then made lexical decisions to visual targets following auditory primes. Critical materials were minimal pairs that could be a word with either [f] or [s] (cf. English knife–nice), none of which had been heard in training. Listeners interpreted the minimal pair words differently in the second phase according to the training received in the first phase. Therefore, lexically mediated retuning of phoneme perception not only influences categorical decisions about fricatives (Norris, McQueen, & Cutler, 2003), but also benefits recognition of words outside the training set. The observed generalization across words suggests that this retuning occurs prelexically. Therefore, lexical processing involves sublexical phonological abstraction, not only accumulation of acoustic episodes.
  • McQueen, J. M., Norris, D., & Cutler, A. (2006). The dynamic nature of speech perception. Language and Speech, 49(1), 101-112.

    Abstract

    The speech perception system must be flexible in responding to the variability in speech sounds caused by differences among speakers and by language change over the lifespan of the listener. Indeed, listeners use lexical knowledge to retune perception of novel speech (Norris, McQueen, & Cutler, 2003). In that study, Dutch listeners made lexical decisions to spoken stimuli, including words with an ambiguous fricative (between [f] and [s]), in either [f]- or [s]-biased lexical contexts. In a subsequent categorization test, the former group of listeners identified more sounds on an [εf] - [εs] continuum as [f] than the latter group. In the present experiment, listeners received the same exposure and test stimuli, but did not make lexical decisions to the exposure items. Instead, they counted them. Categorization results were indistinguishable from those obtained earlier. These adjustments in fricative perception therefore do not depend on explicit judgments during exposure. This learning effect thus reflects automatic retuning of the interpretation of acoustic-phonetic information.
  • McQueen, J. M., Norris, D., & Cutler, A. (2006). Are there really interactive processes in speech perception? Trends in Cognitive Sciences, 10(12), 533-533. doi:10.1016/j.tics.2006.10.004.
  • Norris, D., Cutler, A., McQueen, J. M., & Butterfield, S. (2006). Phonological and conceptual activation in speech comprehension. Cognitive Psychology, 53(2), 146-193. doi:10.1016/j.cogpsych.2006.03.001.

    Abstract

    We propose that speech comprehension involves the activation of token representations of the phonological forms of current lexical hypotheses, separately from the ongoing construction of a conceptual interpretation of the current utterance. In a series of cross-modal priming experiments, facilitation of lexical decision responses to visual target words (e.g., time) was found for targets that were semantic associates of auditory prime words (e.g., date) when the primes were isolated words, but not when the same primes appeared in sentence contexts. Identity priming (e.g., faster lexical decisions to visual date after spoken date than after an unrelated prime) appeared, however, both with isolated primes and with primes in prosodically neutral sentences. Associative priming in sentence contexts only emerged when sentence prosody involved contrastive accents, or when sentences were terminated immediately after the prime. Associative priming is therefore not an automatic consequence of speech processing. In no experiment was there associative priming from embedded words (e.g., sedate-time), but there was inhibitory identity priming (e.g., sedate-date) from embedded primes in sentence contexts. Speech comprehension therefore appears to involve separate distinct activation both of token phonological word representations and of conceptual word representations. Furthermore, both of these types of representation are distinct from the long-term memory representations of word form and meaning.
  • Norris, D., Butterfield, S., McQueen, J. M., & Cutler, A. (2006). Lexically guided retuning of letter perception. Quarterly Journal of Experimental Psychology, 59(9), 1505-1515. doi:10.1080/17470210600739494.

    Abstract

    Participants made visual lexical decisions to upper-case words and nonwords, and then categorized an ambiguous N–H letter continuum. The lexical decision phase included different exposure conditions: Some participants saw an ambiguous letter “?”, midway between N and H, in N-biased lexical contexts (e.g., REIG?), plus words with unambiguousH(e.g., WEIGH); others saw the reverse (e.g., WEIG?, REIGN). The first group categorized more of the test continuum as N than did the second group. Control groups, who saw “?” in nonword contexts (e.g., SMIG?), plus either of the unambiguous word sets (e.g., WEIGH or REIGN), showed no such subsequent effects. Perceptual learning about ambiguous letters therefore appears to be based on lexical knowledge, just as in an analogous speech experiment (Norris, McQueen, & Cutler, 2003) which showed similar lexical influence in learning about ambiguous phonemes. We argue that lexically guided learning is an efficient general strategy available for exploitation by different specific perceptual tasks.
  • Shatzman, K. B., & McQueen, J. M. (2006). Segment duration as a cue to word boundaries in spoken-word recognition. Perception & Psychophysics, 68(1), 1-16.

    Abstract

    In two eye-tracking experiments, we examined the degree to which listeners use acoustic cues to word boundaries. Dutch participants listened to ambiguous sentences in which stop-initial words (e.g., pot, jar) were preceded by eens (once); the sentences could thus also refer to cluster-initial words (e.g., een spot, a spotlight). The participants made fewer fixations to target pictures (e.g., a jar) when the target and the preceding [s] were replaced by a recording of the cluster-initial word than when they were spliced from another token of the target-bearing sentence (Experiment 1). Although acoustic analyses revealed several differences between the two recordings, only [s] duration correlated with the participants’ fixations (more target fixations for shorter [s]s). Thus, we found that listeners apparently do not use all available acoustic differences equally. In Experiment 2, the participants made more fixations to target pictures when the [s] was shortened than when it was lengthened. Utterance interpretation can therefore be influenced by individual segment duration alone.
  • Shatzman, K. B., & McQueen, J. M. (2006). Prosodic knowledge affects the recognition of newly acquired words. Psychological Science, 17(5), 372-377. doi:10.1111/j.1467-9280.2006.01714.x.

    Abstract

    An eye-tracking study examined the involvement of prosodic knowledge—specifically, the knowledge that monosyllabic words tend to have longer durations than the first syllables of polysyllabic words—in the recognition of newly learned words. Participants learned new spoken words (by associating them to novel shapes): bisyllables and onset-embedded monosyllabic competitors (e.g., baptoe and bap). In the learning phase, the duration of the ambiguous sequence (e.g., bap) was held constant. In the test phase, its duration was longer than, shorter than, or equal to its learning-phase duration. Listeners’ fixations indicated that short syllables tended to be interpreted as the first syllables of the bisyllables, whereas long syllables generated more monosyllabic-word interpretations. Recognition of newly acquired words is influenced by prior prosodic knowledge and is therefore not determined solely on the basis of stored episodes of those words.
  • Shatzman, K. B., & McQueen, J. M. (2006). The modulation of lexical competition by segment duration. Psychonomic Bulletin & Review, 13(6), 966-971.

    Abstract

    In an eye-tracking study, we examined how fine-grained phonetic detail, such as segment duration, influences the lexical competition process during spoken word recognition. Dutch listeners’ eye movements to pictures of four objects were monitored as they heard sentences in which a stop-initial target word (e.g., pijp “pipe”) was preceded by an [s]. The participants made more fixations to pictures of cluster-initial words (e.g., spijker “nail”) when they heard a long [s] (mean duration, 103 msec) than when they heard a short [s] (mean duration, 73 msec). Conversely, the participants made more fixations to pictures of the stop-initial words when they heard a short [s] than when they heard a long [s]. Lexical competition between stop- and cluster-initial words, therefore, is modulated by segment duration differences of only 30 msec.
  • Van Alphen, P. M., & McQueen, J. M. (2006). The effect of voice onset time differences on lexical access in Dutch. Journal of Experimental Psychology: Human Perception and Performance, 32(1), 178-196. doi:10.1037/0096-1523.32.1.178.

    Abstract

    Effects on spoken-word recognition of prevoicing differences in Dutch initial voiced plosives were examined. In 2 cross-modal identity-priming experiments, participants heard prime words and nonwords beginning with voiced plosives with 12, 6, or 0 periods of prevoicing or matched items beginning with voiceless plosives and made lexical decisions to visual tokens of those items. Six-period primes had the same effect as 12-period primes. Zero-period primes had a different effect, but only when their voiceless counterparts were real words. Listeners could nevertheless discriminate the 6-period primes from the 12- and 0-period primes. Phonetic detail appears to influence lexical access only to the extent that it is useful: In Dutch, presence versus absence of prevoicing is more informative than amount of prevoicing.
  • Baayen, R. H., McQueen, J. M., Dijkstra, T., & Schreuder, R. (2003). Frequency effects in regular inflectional morphology: Revisiting Dutch plurals. In R. H. Baayen, & R. Schreuder (Eds.), Morphological structure in language processing (pp. 355-390). Berlin: Mouton de Gruyter.
  • Baayen, R. H., McQueen, J. M., Dijkstra, T., & Schreuder, R. (2003). Frequency effects in regular inflectional morphology: Revisiting Dutch plurals. In R. H. Baayen, & R. Schreuder (Eds.), Morphological Structure in Language Processing (pp. 355-390). Berlin, Germany: Mouton De Gruyter.
  • McQueen, J. M. (2003). The ghost of Christmas future: Didn't Scrooge learn to be good? Commentary on Magnuson, McMurray, Tanenhaus and Aslin (2003). Cognitive Science, 27(5), 795-799. doi:10.1207/s15516709cog2705_6.

    Abstract

    Magnuson, McMurray, Tanenhaus, and Aslin [Cogn. Sci. 27 (2003) 285] suggest that they have evidence of lexical feedback in speech perception, and that this evidence thus challenges the purely feedforward Merge model [Behav. Brain Sci. 23 (2000) 299]. This evidence is open to an alternative explanation, however, one which preserves the assumption in Merge that there is no lexical-prelexical feedback during on-line speech processing. This explanation invokes the distinction between perceptual processing that occurs in the short term, as an utterance is heard, and processing that occurs over the longer term, for perceptual learning.
  • McQueen, J. M., & Cho, T. (2003). The use of domain-initial strengthening in segmentation of continuous English speech. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2993-2996). Adelaide: Causal Productions.
  • McQueen, J. M., Dahan, D., & Cutler, A. (2003). Continuity and gradedness in speech processing. In N. O. Schiller, & A. S. Meyer (Eds.), Phonetics and phonology in language comprehension and production: Differences and similarities (pp. 39-78). Berlin: Mouton de Gruyter.
  • McQueen, J. M., Cutler, A., & Norris, D. (2003). Flow of information in the spoken word recognition system. Speech Communication, 41(1), 257-270. doi:10.1016/S0167-6393(02)00108-5.

    Abstract

    Spoken word recognition consists of two major component processes. First, at the prelexical stage, an abstract description of the utterance is generated from the information in the speech signal. Second, at the lexical stage, this description is used to activate all the words stored in the mental lexicon which match the input. These multiple candidate words then compete with each other. We review evidence which suggests that positive (match) and negative (mismatch) information of both a segmental and a suprasegmental nature is used to constrain this activation and competition process. We then ask whether, in addition to the necessary influence of the prelexical stage on the lexical stage, there is also feedback from the lexicon to the prelexical level. In two phonetic categorization experiments, Dutch listeners were asked to label both syllable-initial and syllable-final ambiguous fricatives (e.g., sounds ranging from [f] to [s]) in the word–nonword series maf–mas, and the nonword–word series jaf–jas. They tended to label the sounds in a lexically consistent manner (i.e., consistent with the word endpoints of the series). These lexical effects became smaller in listeners’ slower responses, even when the listeners were put under pressure to respond as fast as possible. Our results challenge models of spoken word recognition in which feedback modulates the prelexical analysis of the component sounds of a word whenever that word is heard
  • Norris, D., McQueen, J. M., & Cutler, A. (2003). Perceptual learning in speech. Cognitive Psychology, 47(2), 204-238. doi:10.1016/S0010-0285(03)00006-9.

    Abstract

    This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listeners heard ambiguous [f]-final words (e.g., [WI tlo?], from witlof, chicory) and unambiguous [s]-final words (e.g., naaldbos, pine forest). Another group heard the reverse (e.g., ambiguous [na:ldbo?], unambiguous witlof). Listeners who had heard [?] in [f]-final words were subsequently more likely to categorize ambiguous sounds on an [f]–[s] continuum as [f] than those who heard [?] in [s]-final words. Control conditions ruled out alternative explanations based on selective adaptation and contrast. Lexical information can thus be used to train categorization of speech. This use of lexical information differs from the on-line lexical feedback embodied in interactive models of speech perception. In contrast to on-line feedback, lexical feedback for learning is of benefit to spoken word recognition (e.g., in adapting to a newly encountered dialect).
  • Salverda, A. P., Dahan, D., & McQueen, J. M. (2003). The role of prosodic boundaries in the resolution of lexical embedding in speech comprehension. Cognition, 90(1), 51-89. doi:10.1016/S0010-0277(03)00139-2.

    Abstract

    Participants' eye movements were monitored as they heard sentences and saw four pictured objects on a computer screen. Participants were instructed to click on the object mentioned in the sentence. There were more transitory fixations to pictures representing monosyllabic words (e.g. ham) when the first syllable of the target word (e.g. hamster) had been replaced by a recording of the monosyllabic word than when it came from a different recording of the target word. This demonstrates that a phonemically identical sequence can contain cues that modulate its lexical interpretation. This effect was governed by the duration of the sequence, rather than by its origin (i.e. which type of word it came from). The longer the sequence, the more monosyllabic-word interpretations it generated. We argue that cues to lexical-embedding disambiguation, such as segmental lengthening, result from the realization of a prosodic boundary that often but not always follows monosyllabic words, and that lexical candidates whose word boundaries are aligned with prosodic boundaries are favored in the word-recognition process.
  • Scharenborg, O., McQueen, J. M., Ten Bosch, L., & Norris, D. (2003). Modelling human speech recognition using automatic speech recognition paradigms in SpeM. In Proceedings of Eurospeech 2003 (pp. 2097-2100). Adelaide: Causal Productions.

    Abstract

    We have recently developed a new model of human speech recognition, based on automatic speech recognition techniques [1]. The present paper has two goals. First, we show that the new model performs well in the recognition of lexically ambiguous input. These demonstrations suggest that the model is able to operate in the same optimal way as human listeners. Second, we discuss how to relate the behaviour of a recogniser, designed to discover the optimum path through a word lattice, to data from human listening experiments. We argue that this requires a metric that combines both path-based and word-based measures of recognition performance. The combined metric varies continuously as the input speech signal unfolds over time.
  • Smits, R., Warner, N., McQueen, J. M., & Cutler, A. (2003). Unfolding of phonetic information over time: A database of Dutch diphone perception. Journal of the Acoustical Society of America, 113(1), 563-574. doi:10.1121/1.1525287.

    Abstract

    We present the results of a large-scale study on speech perception, assessing the number and type of perceptual hypotheses which listeners entertain about possible phoneme sequences in their language. Dutch listeners were asked to identify gated fragments of all 1179 diphones of Dutch, providing a total of 488 520 phoneme categorizations. The results manifest orderly uptake of acoustic information in the signal. Differences across phonemes in the rate at which fully correct recognition was achieved arose as a result of whether or not potential confusions could occur with other phonemes of the language ~long with short vowels, affricates with their initial components, etc.!. These data can be used to improve models of how acoustic phonetic information is mapped onto the mental lexicon during speech comprehension.
  • Spinelli, E., McQueen, J. M., & Cutler, A. (2003). Processing resyllabified words in French. Journal of Memory and Language, 48(2), 233-254. doi:10.1016/S0749-596X(02)00513-2.

Share this page