Displaying 1 - 23 of 23
-
El Aissati, A., McQueen, J. M., & Cutler, A. (2012). Finding words in a language that allows words without vowels. Cognition, 124, 79-84. doi:10.1016/j.cognition.2012.03.006.
Abstract
Across many languages from unrelated families, spoken-word recognition is subject to a constraint whereby potential word candidates must contain a vowel. This constraint minimizes competition from embedded words (e.g., in English, disfavoring win in twin because t cannot be a word). However, the constraint would be counter-productive in certain languages that allow stand-alone vowelless open-class words. One such language is Berber (where t is indeed a word). Berber listeners here detected words affixed to nonsense contexts with or without vowels. Length effects seen in other languages replicated in Berber, but in contrast to prior findings, word detection was not hindered by vowelless contexts. When words can be vowelless, otherwise universal constraints disfavoring vowelless words do not feature in spoken-word recognition.Additional information
mmc1.pdf -
Brandmeyer, A., Desain, P. W., & McQueen, J. M. (2012). Effects of native language on perceptual sensitivity to phonetic cues. Neuroreport, 23, 653-657. doi:10.1097/WNR.0b013e32835542cd.
Abstract
The present study used electrophysiological and behavioral measures to investigate the perception of an English stop consonant contrast by native English listeners and by native Dutch listeners who were highly proficient in English. A /ba/-/pa/ continuum was created from a naturally produced /pa/ token by removing successive periods of aspiration, thus reducing the voice onset time. Although aspiration is a relevant cue for distinguishing voiced and unvoiced labial stop consonants (/b/ and /p/) in English, prevoicing is the primary cue used to distinguish between these categories in Dutch. In the electrophysiological experiment, participants listened to oddball sequences containing the standard /pa/ stimulus and one of three deviant stimuli while the mismatch-negativity response was measured. Participants then completed an identification task on the same stimuli. The results showed that native English participants were more sensitive to reductions in aspiration than native Dutch participants, as indicated by shifts in the category boundary, by differing within-group patterns of mismatch-negativity responses, and by larger mean evoked potential amplitudes in the native English group for two of the three deviant stimuli. This between-group difference in the sensorineural processing of aspiration cues indicates that native language experience alters the way in which the acoustic features of speech are processed in the auditory brain, even following extensive second-language training.Files private
Request files -
Kim, S., Cho, T., & McQueen, J. M. (2012). Phonetic richness can outweigh prosodically-driven phonological knowledge when learning words in an artificial language. Journal of Phonetics, 40, 443-452. doi:10.1016/j.wocn.2012.02.005.
Abstract
How do Dutch and Korean listeners use acoustic–phonetic information when learning words in an artificial language? Dutch has a voiceless ‘unaspirated’ stop, produced with shortened Voice Onset Time (VOT) in prosodic strengthening environments (e.g., in domain-initial position and under prominence), enhancing the feature {−spread glottis}; Korean has a voiceless ‘aspirated’ stop produced with lengthened VOT in similar environments, enhancing the feature {+spread glottis}. Given this cross-linguistic difference, two competing hypotheses were tested. The phonological-superiority hypothesis predicts that Dutch and Korean listeners should utilize shortened and lengthened VOTs, respectively, as cues in artificial-language segmentation. The phonetic-superiority hypothesis predicts that both groups should take advantage of the phonetic richness of longer VOTs (i.e., their enhanced auditory–perceptual robustness). Dutch and Korean listeners learned the words of an artificial language better when word-initial stops had longer VOTs than when they had shorter VOTs. It appears that language-specific phonological knowledge can be overridden by phonetic richness in processing an unfamiliar language. Listeners nonetheless performed better when the stimuli were based on the speech of their native languages, suggesting that the use of richer phonetic information was modulated by listeners' familiarity with the stimuli.Additional information
kim_2012_Speech File 1.mp4 Kim_2012_Speech File 2.mp4 Kim_2012_Speech File 3.mp4 Kim_2012_Speech File 4.mp4 -
McQueen, J. M., & Huettig, F. (2012). Changing only the probability that spoken words will be distorted changes how they are recognized. Journal of the Acoustical Society of America, 131(1), 509-517. doi:10.1121/1.3664087.
Abstract
An eye-tracking experiment examined contextual flexibility in speech processing in response to distortions in spoken input. Dutch participants heard Dutch sentences containing critical words and saw four-picture displays. The name of one picture either had the same onset phonemes as the critical word or had a different first phoneme and rhymed. Participants fixated onset-overlap more than rhyme-overlap pictures, but this tendency varied with speech quality. Relative to a baseline with noise-free sentences, participants looked less at onset-overlap and more at rhyme-overlap pictures when phonemes in the sentences (but not in the critical words) were replaced by noises like those heard on a badly-tuned AM radio. The position of the noises (word-initial or word-medial) had no effect. Noises elsewhere in the sentences apparently made evidence about the critical word less reliable: Listeners became less confident of having heard the onset-overlap name but also less sure of having not heard the rhyme-overlap name. The same acoustic information has different effects on spoken-word recognition as the probability of distortion changes. -
McQueen, J. M., Tyler, M., & Cutler, A. (2012). Lexical retuning of children’s speech perception: Evidence for knowledge about words’ component sounds. Language Learning and Development, 8, 317-339. doi:10.1080/15475441.2011.641887.
Abstract
Children hear new words from many different talkers; to learn words most efficiently, they should be able to represent them independently of talker-specific pronunciation detail. However, do children know what the component sounds of words should be, and can they use that knowledge to deal with different talkers' phonetic realizations? Experiment 1 replicated prior studies on lexically guided retuning of speech perception in adults, with a picture-verification methodology suitable for children. One participant group heard an ambiguous fricative ([s/f]) replacing /f/ (e.g., in words like giraffe); another group heard [s/f] replacing /s/ (e.g., in platypus). The first group subsequently identified more tokens on a Simpie-[s/f]impie-Fimpie toy-name continuum as Fimpie. Experiments 2 and 3 found equivalent lexically guided retuning effects in 12- and 6-year-olds. Children aged 6 have all that is needed for adjusting to talker variation in speech: detailed and abstract phonological representations and the ability to apply them during spoken-word recognition.Files private
Request files -
Poellmann, K., McQueen, J. M., & Mitterer, H. (2012). How talker-adaptation helps listeners recognize reduced word-forms [Abstract]. Program abstracts from the 164th Meeting of the Acoustical Society of America published in the Journal of the Acoustical Society of America, 132(3), 2053.
Abstract
Two eye-tracking experiments tested whether native listeners can adapt
to reductions in casual Dutch speech. Listeners were exposed to segmental
([b] > [m]), syllabic (full-vowel-deletion), or no reductions. In a subsequent
test phase, all three listener groups were tested on how efficiently they could
recognize both types of reduced words. In the first Experiment’s exposure
phase, the (un)reduced target words were predictable. The segmental reductions
were completely consistent (i.e., involved the same input sequences).
Learning about them was found to be pattern-specific and generalized in the
test phase to new reduced /b/-words. The syllabic reductions were not consistent
(i.e., involved variable input sequences). Learning about them was
weak and not pattern-specific. Experiment 2 examined effects of word repetition
and predictability. The (un-)reduced test words appeared in the exposure
phase and were not predictable. There was no evidence of learning for
the segmental reductions, probably because they were not predictable during
exposure. But there was word-specific learning for the vowel-deleted words.
The results suggest that learning about reductions is pattern-specific and
generalizes to new words if the input is consistent and predictable. With
variable input, there is more likely to be adaptation to a general speaking
style and word-specific learning. -
Sjerps, M. J., Mitterer, H., & McQueen, J. M. (2012). Hemispheric differences in the effects of context on vowel perception. Brain and Language, 120, 401-405. doi:10.1016/j.bandl.2011.12.012.
Abstract
Listeners perceive speech sounds relative to context. Contextual influences might differ over hemispheres if different types of auditory processing are lateralized. Hemispheric differences in contextual influences on vowel perception were investigated by presenting speech targets and both speech and non-speech contexts to listeners’ right or left ears (contexts and targets either to the same or to opposite ears). Listeners performed a discrimination task. Vowel perception was influenced by acoustic properties of the context signals. The strength of this influence depended on laterality of target presentation, and on the speech/non-speech status of the context signal. We conclude that contrastive contextual influences on vowel perception are stronger when targets are processed predominately by the right hemisphere. In the left hemisphere, contrastive effects are smaller and largely restricted to speech contexts. -
Sjerps, M. J., McQueen, J. M., & Mitterer, H. (2012). Extrinsic normalization for vocal tracts depends on the signal, not on attention. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 394-397).
Abstract
When perceiving vowels, listeners adjust to speaker-specific vocal-tract characteristics (such as F1) through "extrinsic vowel normalization". This effect is observed as a shift in the location of categorization boundaries of vowel continua. Similar effects have been found with non-speech. Non-speech materials, however, have consistently led to smaller effect-sizes, perhaps because of a lack of attention to non-speech. The present study investigated this possibility. Non-speech materials that had previously been shown to elicit reduced normalization effects were tested again, with the addition of an attention manipulation. The results show that increased attention does not lead to increased normalization effects, suggesting that vowel normalization is mainly determined by bottom-up signal characteristics. -
Sulpizio, S., & McQueen, J. M. (2012). Italians use abstract knowledge about lexical stress during spoken-word recognition. Journal of Memory and Language, 66, 177-193. doi:10.1016/j.jml.2011.08.001.
Abstract
In two eye-tracking experiments in Italian, we investigated how acoustic information and stored knowledge about lexical stress are used during the recognition of tri-syllabic spoken words. Experiment 1 showed that Italians use acoustic cues to a word’s stress pattern rapidly in word recognition, but only for words with antepenultimate stress. Words with penultimate stress – the most common pattern – appeared to be recognized by default. In Experiment 2, listeners had to learn new words from which some stress cues had been removed, and then recognize reduced- and full-cue versions of those words. The acoustic manipulation affected recognition only of newly-learnt words with antepenultimate stress: Full-cue versions, even though they were never heard during training, were recognized earlier than reduced-cue versions. Newly-learnt words with penultimate stress were recognized earlier overall, but recognition of the two versions of these words did not differ. Abstract knowledge (i.e., knowledge generalized over the lexicon) about lexical stress – which pattern is the default and which cues signal the non-default pattern – appears to be used during the recognition of known and newly-learnt Italian words. -
Viebahn, M. C., Ernestus, M., & McQueen, J. M. (2012). Co-occurrence of reduced word forms in natural speech. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 2019-2022).
Abstract
This paper presents a corpus study that investigates the co-occurrence of reduced word forms in natural speech. We extracted Dutch past participles from three different speech registers and investigated the influence of several predictor variables on the presence and duration of schwas in prefixes and /t/s in suffixes. Our results suggest that reduced word forms tend to co-occur even if we partial out the effect of speech rate. The implications of our findings for episodic and abstractionist models of lexical representation are discussed. -
Warner, N. L., McQueen, J. M., Liu, P. Z., Hoffmann, M., & Cutler, A. (2012). Timing of perception for all English diphones [Abstract]. Program abstracts from the 164th Meeting of the Acoustical Society of America published in the Journal of the Acoustical Society of America, 132(3), 1967.
Abstract
Information in speech does not unfold discretely over time; perceptual cues are gradient and overlapped. However, this varies greatly across segments and environments: listeners cannot identify the affricate in /ptS/ until the frication, but information about the vowel in /li/ begins early. Unlike most prior studies, which have concentrated on subsets of language sounds, this study tests perception of every English segment in every phonetic environment, sampling perceptual identification at six points in time (13,470 stimuli/listener; 20 listeners). Results show that information about consonants after another segment is most localized for affricates (almost entirely in the release), and most gradual for voiced stops. In comparison to stressed vowels, unstressed vowels have less information spreading to
neighboring segments and are less well identified. Indeed, many vowels,
especially lax ones, are poorly identified even by the end of the following segment. This may partly reflect listeners’ familiarity with English vowels’ dialectal variability. Diphthongs and diphthongal tense vowels show the most sudden improvement in identification, similar to affricates among the consonants, suggesting that information about segments defined by acoustic change is highly localized. This large dataset provides insights into speech perception and data for probabilistic modeling of spoken word recognition. -
Cho, T., & McQueen, J. M. (2004). Phonotactics vs. phonetic cues in native and non-native listening: Dutch and Korean listeners' perception of Dutch and English. In S. Kin, & M. J. Bae (
Eds. ), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 1301-1304). Seoul: Sunjijn Printing Co.Abstract
We investigated how listeners of two unrelated languages, Dutch and Korean, process phonotactically legitimate and illegitimate sounds spoken in Dutch and American English. To Dutch listeners, unreleased word-final stops are phonotactically illegal because word-final stops in Dutch are generally released in isolation, but to Korean listeners, released final stops are illegal because word-final stops are never released in Korean. Two phoneme monitoring experiments showed a phonotactic effect: Dutch listeners detected released stops more rapidly than unreleased stops whereas the reverse was true for Korean listeners. Korean listeners with English stimuli detected released stops more accurately than unreleased stops, however, suggesting that acoustic-phonetic cues associated with released stops improve detection accuracy. We propose that in non-native speech perception, phonotactic legitimacy in the native language speeds up phoneme recognition, the richness of acousticphonetic cues improves listening accuracy, and familiarity with the non-native language modulates the relative influence of these two factors. -
Baayen, R. H., McQueen, J. M., Dijkstra, T., & Schreuder, R. (2003). Frequency effects in regular inflectional morphology: Revisiting Dutch plurals. In R. H. Baayen, & R. Schreuder (
Eds. ), Morphological structure in language processing (pp. 355-390). Berlin: Mouton de Gruyter. -
Baayen, R. H., McQueen, J. M., Dijkstra, T., & Schreuder, R. (2003). Frequency effects in regular inflectional morphology: Revisiting Dutch plurals. In R. H. Baayen, & R. Schreuder (
Eds. ), Morphological Structure in Language Processing (pp. 355-390). Berlin, Germany: Mouton De Gruyter.Files private
Request files -
McQueen, J. M. (2003). The ghost of Christmas future: Didn't Scrooge learn to be good? Commentary on Magnuson, McMurray, Tanenhaus and Aslin (2003). Cognitive Science, 27(5), 795-799. doi:10.1207/s15516709cog2705_6.
Abstract
Magnuson, McMurray, Tanenhaus, and Aslin [Cogn. Sci. 27 (2003) 285] suggest that they have evidence of lexical feedback in speech perception, and that this evidence thus challenges the purely feedforward Merge model [Behav. Brain Sci. 23 (2000) 299]. This evidence is open to an alternative explanation, however, one which preserves the assumption in Merge that there is no lexical-prelexical feedback during on-line speech processing. This explanation invokes the distinction between perceptual processing that occurs in the short term, as an utterance is heard, and processing that occurs over the longer term, for perceptual learning. -
McQueen, J. M., & Cho, T. (2003). The use of domain-initial strengthening in segmentation of continuous English speech. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2993-2996). Adelaide: Causal Productions.
-
McQueen, J. M., Dahan, D., & Cutler, A. (2003). Continuity and gradedness in speech processing. In N. O. Schiller, & A. S. Meyer (
Eds. ), Phonetics and phonology in language comprehension and production: Differences and similarities (pp. 39-78). Berlin: Mouton de Gruyter. -
McQueen, J. M., Cutler, A., & Norris, D. (2003). Flow of information in the spoken word recognition system. Speech Communication, 41(1), 257-270. doi:10.1016/S0167-6393(02)00108-5.
Abstract
Spoken word recognition consists of two major component processes. First, at the prelexical stage, an abstract description of the utterance is generated from the information in the speech signal. Second, at the lexical stage, this description is used to activate all the words stored in the mental lexicon which match the input. These multiple candidate words then compete with each other. We review evidence which suggests that positive (match) and negative (mismatch) information of both a segmental and a suprasegmental nature is used to constrain this activation and competition process. We then ask whether, in addition to the necessary influence of the prelexical stage on the lexical stage, there is also feedback from the lexicon to the prelexical level. In two phonetic categorization experiments, Dutch listeners were asked to label both syllable-initial and syllable-final ambiguous fricatives (e.g., sounds ranging from [f] to [s]) in the word–nonword series maf–mas, and the nonword–word series jaf–jas. They tended to label the sounds in a lexically consistent manner (i.e., consistent with the word endpoints of the series). These lexical effects became smaller in listeners’ slower responses, even when the listeners were put under pressure to respond as fast as possible. Our results challenge models of spoken word recognition in which feedback modulates the prelexical analysis of the component sounds of a word whenever that word is heard -
Norris, D., McQueen, J. M., & Cutler, A. (2003). Perceptual learning in speech. Cognitive Psychology, 47(2), 204-238. doi:10.1016/S0010-0285(03)00006-9.
Abstract
This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listeners heard ambiguous [f]-final words (e.g., [WI tlo?], from witlof, chicory) and unambiguous [s]-final words (e.g., naaldbos, pine forest). Another group heard the reverse (e.g., ambiguous [na:ldbo?], unambiguous witlof). Listeners who had heard [?] in [f]-final words were subsequently more likely to categorize ambiguous sounds on an [f]–[s] continuum as [f] than those who heard [?] in [s]-final words. Control conditions ruled out alternative explanations based on selective adaptation and contrast. Lexical information can thus be used to train categorization of speech. This use of lexical information differs from the on-line lexical feedback embodied in interactive models of speech perception. In contrast to on-line feedback, lexical feedback for learning is of benefit to spoken word recognition (e.g., in adapting to a newly encountered dialect). -
Salverda, A. P., Dahan, D., & McQueen, J. M. (2003). The role of prosodic boundaries in the resolution of lexical embedding in speech comprehension. Cognition, 90(1), 51-89. doi:10.1016/S0010-0277(03)00139-2.
Abstract
Participants' eye movements were monitored as they heard sentences and saw four pictured objects on a computer screen. Participants were instructed to click on the object mentioned in the sentence. There were more transitory fixations to pictures representing monosyllabic words (e.g. ham) when the first syllable of the target word (e.g. hamster) had been replaced by a recording of the monosyllabic word than when it came from a different recording of the target word. This demonstrates that a phonemically identical sequence can contain cues that modulate its lexical interpretation. This effect was governed by the duration of the sequence, rather than by its origin (i.e. which type of word it came from). The longer the sequence, the more monosyllabic-word interpretations it generated. We argue that cues to lexical-embedding disambiguation, such as segmental lengthening, result from the realization of a prosodic boundary that often but not always follows monosyllabic words, and that lexical candidates whose word boundaries are aligned with prosodic boundaries are favored in the word-recognition process. -
Scharenborg, O., McQueen, J. M., Ten Bosch, L., & Norris, D. (2003). Modelling human speech recognition using automatic speech recognition paradigms in SpeM. In Proceedings of Eurospeech 2003 (pp. 2097-2100). Adelaide: Causal Productions.
Abstract
We have recently developed a new model of human speech recognition, based on automatic speech recognition techniques [1]. The present paper has two goals. First, we show that the new model performs well in the recognition of lexically ambiguous input. These demonstrations suggest that the model is able to operate in the same optimal way as human listeners. Second, we discuss how to relate the behaviour of a recogniser, designed to discover the optimum path through a word lattice, to data from human listening experiments. We argue that this requires a metric that combines both path-based and word-based measures of recognition performance. The combined metric varies continuously as the input speech signal unfolds over time. -
Smits, R., Warner, N., McQueen, J. M., & Cutler, A. (2003). Unfolding of phonetic information over time: A database of Dutch diphone perception. Journal of the Acoustical Society of America, 113(1), 563-574. doi:10.1121/1.1525287.
Abstract
We present the results of a large-scale study on speech perception, assessing the number and type of perceptual hypotheses which listeners entertain about possible phoneme sequences in their language. Dutch listeners were asked to identify gated fragments of all 1179 diphones of Dutch, providing a total of 488 520 phoneme categorizations. The results manifest orderly uptake of acoustic information in the signal. Differences across phonemes in the rate at which fully correct recognition was achieved arose as a result of whether or not potential confusions could occur with other phonemes of the language ~long with short vowels, affricates with their initial components, etc.!. These data can be used to improve models of how acoustic phonetic information is mapped onto the mental lexicon during speech comprehension.Additional information
https://www.mpi.nl/world/dcsp/diphones/index.html -
Spinelli, E., McQueen, J. M., & Cutler, A. (2003). Processing resyllabified words in French. Journal of Memory and Language, 48(2), 233-254. doi:10.1016/S0749-596X(02)00513-2.
Share this page