Displaying 1 - 27 of 27
-
Hintz, F., Dijkhuis, M., Van 't Hoff, V., Huijsmans, M., Kievit, R. A., McQueen, J. M., & Meyer, A. S. (2025). Evaluating the factor structure of the Dutch Individual Differences in Language Skills (IDLaS-NL) test battery. Brain Research, 1852: 149502. doi:10.1016/j.brainres.2025.149502.
Abstract
Individual differences in using language are prevalent in our daily lives. Language skills are often assessed in vocational (predominantly written language) and diagnostic contexts. Not much is known, however, about individual differences in spoken language skills. The lack of research is in part due to the lack of suitable test instruments. We introduce the Individual Differences in Language Skills (IDLaS-NL) test battery, a set of 31 behavioural tests that can be used to capture variability in language and relevant general cognitive skills in adult speakers of Dutch. The battery was designed to measure word and sentence production and comprehension skills, linguistic knowledge, nonverbal processing speed, working memory, and nonverbal reasoning. The present article outlines the structure of the battery, describes the materials and procedure of each test, and evaluates the battery’s factor structure based on the results of a sample of 748 Dutch adults, aged between 18 and 30 years, most of them students. The analyses demonstrate that the battery has good construct validity and can be reliably administered both in the lab and via the internet. We therefore recommend the battery as a valuable new tool to assess individual differences in language knowledge and skills; this future work may include linking language skills to other aspects of human cognition and life outcomes. -
Norris, D., & McQueen, J. M. (2025). Why might there be lexical-prelexical feedback in speech recognition? Cognition, 255: 106025. doi:10.1016/j.cognition.2024.106025.
Abstract
In reply to Magnuson, Crinnion, Luthra, Gaston, and Grubb (2023), we challenge their conclusion that on-line activation feedback improves word recognition. This type of feedback is instantiated in the TRACE model (McClelland & Elman, 1986) as top-down spread of activation from lexical to phoneme nodes. We give two main reasons why Magnuson et al.'s demonstration that activation feedback speeds up word recognition in TRACE is not informative about whether activation feedback helps humans recognize words. First, the same speed-up could be achieved by changing other parameters in TRACE. Second, more fundamentally, there is room for improvement in TRACE's performance only because the model, unlike Bayesian models, is suboptimal. We also challenge Magnuson et al.'s claim that the available empirical data support activation feedback. The data they base this claim on are open to alternative explanations and there are data against activation feedback that they do not discuss. We argue, therefore, that there are no computational or empirical grounds to conclude that activation feedback benefits human spoken-word recognition and indeed no theoretical grounds why activation feedback would exist. Other types of feedback, for example feedback to support perceptual learning, likely do exist, precisely because they can help listeners recognize words. -
Andics, A., McQueen, J. M., & Petersson, K. M. (2013). Mean-based neural coding of voices. NeuroImage, 79, 351-360. doi:10.1016/j.neuroimage.2013.05.002.
Abstract
The social significance of recognizing the person who talks to us is obvious, but the neural mechanisms that mediate talker identification are unclear. Regions along the bilateral superior temporal sulcus (STS) and the inferior frontal cortex (IFC) of the human brain are selective for voices, and they are sensitive to rapid voice changes. Although it has been proposed that voice recognition is supported by prototype-centered voice representations, the involvement of these category-selective cortical regions in the neural coding of such "mean voices" has not previously been demonstrated. Using fMRI in combination with a voice identity learning paradigm, we show that voice-selective regions are involved in the mean-based coding of voice identities. Voice typicality is encoded on a supra-individual level in the right STS along a stimulus-dependent, identity-independent (i.e., voice-acoustic) dimension, and on an intra-individual level in the right IFC along a stimulus-independent, identity-dependent (i.e., voice identity) dimension. Voice recognition therefore entails at least two anatomically separable stages, each characterized by neural mechanisms that reference the central tendencies of voice categories. -
Asaridou, S. S., & McQueen, J. M. (2013). Speech and music shape the listening brain: Evidence for shared domain-general mechanisms. Frontiers in Psychology, 4: 321. doi:10.3389/fpsyg.2013.00321.
Abstract
Are there bi-directional influences between speech perception and music perception? An answer to this question is essential for understanding the extent to which the speech and music that we hear are processed by domain-general auditory processes and/or by distinct neural auditory mechanisms. This review summarizes a large body of behavioral and neuroscientific findings which suggest that the musical experience of trained musicians does modulate speech processing, and a sparser set of data, largely on pitch processing, which suggest in addition that linguistic experience, in particular learning a tone language, modulates music processing. Although research has focused mostly on music on speech effects, we argue that both directions of influence need to be studied, and conclude that the picture which thus emerges is one of mutual interaction across domains. In particular, it is not simply that experience with spoken language has some effects on music perception, and vice versa, but that because of shared domain-general subcortical and cortical networks, experiences in both domains influence behavior in both domains. -
Brandmeyer, A., Sadakata, M., Spyrou, L., McQueen, J. M., & Desain, P. (2013). Decoding of single-trial auditory mismatch responses for online perceptual monitoring and neurofeedback. Frontiers in Neuroscience, 7: 265. doi:10.3389/fnins.2013.00265.
Abstract
Multivariate pattern classification methods are increasingly applied to neuroimaging data in the context of both fundamental research and in brain-computer interfacing approaches. Such methods provide a framework for interpreting measurements made at the single-trial level with respect to a set of two or more distinct mental states. Here, we define an approach in which the output of a binary classifier trained on data from an auditory mismatch paradigm can be used for online tracking of perception and as a neurofeedback signal. The auditory mismatch paradigm is known to induce distinct perceptual states related to the presentation of high- and low-probability stimuli, which are reflected in event-related potential (ERP) components such as the mismatch negativity (MMN). The first part of this paper illustrates how pattern classification methods can be applied to data collected in an MMN paradigm, including discussion of the optimization of preprocessing steps, the interpretation of features and how the performance of these methods generalizes across individual participants and measurement sessions. We then go on to show that the output of these decoding methods can be used in online settings as a continuous index of single-trial brain activation underlying perceptual discrimination. We conclude by discussing several potential domains of application, including neurofeedback, cognitive monitoring and passive brain-computer interfacesAdditional information
Brandmeyer_etal_2013a.pdf -
Brandmeyer, A., Farquhar, J., McQueen, J. M., & Desain, P. (2013). Decoding speech perception by native and non-native speakers using single-trial electrophysiological data. PLoS One, 8: e68261. doi:10.1371/journal.pone.0068261.
Abstract
Brain-computer interfaces (BCIs) are systems that use real-time analysis of neuroimaging data to determine the mental state of their user for purposes such as providing neurofeedback. Here, we investigate the feasibility of a BCI based on speech perception. Multivariate pattern classification methods were applied to single-trial EEG data collected during speech perception by native and non-native speakers. Two principal questions were asked: 1) Can differences in the perceived categories of pairs of phonemes be decoded at the single-trial level? 2) Can these same categorical differences be decoded across participants, within or between native-language groups? Results indicated that classification performance progressively increased with respect to the categorical status (within, boundary or across) of the stimulus contrast, and was also influenced by the native language of individual participants. Classifier performance showed strong relationships with traditional event-related potential measures and behavioral responses. The results of the cross-participant analysis indicated an overall increase in average classifier performance when trained on data from all participants (native and non-native). A second cross-participant classifier trained only on data from native speakers led to an overall improvement in performance for native speakers, but a reduction in performance for non-native speakers. We also found that the native language of a given participant could be decoded on the basis of EEG data with accuracy above 80%. These results indicate that electrophysiological responses underlying speech perception can be decoded at the single-trial level, and that decoding performance systematically reflects graded changes in the responses related to the phonological status of the stimuli. This approach could be used in extensions of the BCI paradigm to support perceptual learning during second language acquisition -
Mani, N., Johnson, E., McQueen, J. M., & Huettig, F. (2013). How yellow is your banana? Toddlers' language-mediated visual search in referent-present tasks. Developmental Psychology, 49, 1036-1044. doi:10.1037/a0029382.
Abstract
What is the relative salience of different aspects of word meaning in the developing lexicon? The current study examines the time-course of retrieval of semantic and color knowledge associated with words during toddler word recognition: at what point do toddlers orient towards an image of a yellow cup upon hearing color-matching words such as “banana” (typically yellow) relative to unrelated words (e.g., “house”)? Do children orient faster to semantic matching images relative to color matching images, e.g., orient faster to an image of a cookie relative to a yellow cup upon hearing the word “banana”? The results strongly suggest a prioritization of semantic information over color information in children’s word-referent mappings. This indicates that, even for natural objects (e.g., food, animals that are more likely to have a prototypical color), semantic knowledge is a more salient aspect of toddler's word meaning than color knowledge. For 24-month-old Dutch toddlers, bananas are thus more edible than they are yellow. -
Mitterer, H., Scharenborg, O., & McQueen, J. M. (2013). Phonological abstraction without phonemes in speech perception. Cognition, 129, 356-361. doi:10.1016/j.cognition.2013.07.011.
Abstract
Recent evidence shows that listeners use abstract prelexical units in speech perception. Using the phenomenon of lexical retuning in speech processing, we ask whether those units are necessarily phonemic. Dutch listeners were exposed to a Dutch speaker producing ambiguous phones between the Dutch syllable-final allophones approximant [r] and dark [l]. These ambiguous phones replaced either final /r/ or final /l/ in words in a lexical-decision task. This differential exposure affected perception of ambiguous stimuli on the same allophone continuum in a subsequent phonetic-categorization test: Listeners exposed to ambiguous phones in /r/-final words were more likely to perceive test stimuli as /r/ than listeners with exposure in /l/-final words. This effect was not found for test stimuli on continua using other allophones of /r/ and /l/. These results confirm that listeners use phonological abstraction in speech perception. They also show that context-sensitive allophones can play a role in this process, and hence that context-insensitive phonemes are not necessary. We suggest there may be no one unit of perception -
Sadakata, M., & McQueen, J. M. (2013). High stimulus variability in nonnative speech learning supports formation of abstract categories: Evidence from Japanese geminates. Journal of the Acoustical Society of America, 134(2), 1324-1335. doi:10.1121/1.4812767.
Abstract
This study reports effects of a high-variability training procedure on nonnative learning of a Japanese geminate-singleton fricative contrast. Thirty native speakers of Dutch took part in a 5-day training procedure in which they identified geminate and singleton variants of the Japanese fricative /s/. Participants were trained with either many repetitions of a limited set of words recorded by a single speaker (low-variability training) or with fewer repetitions of a more variable set of words recorded by multiple speakers (high-variability training). Both types of training enhanced identification of speech but not of nonspeech materials, indicating that learning was domain specific. High-variability training led to superior performance in identification but not in discrimination tests, and supported better generalization of learning as shown by transfer from the trained fricatives to the identification of untrained stops and affricates. Variability thus helps nonnative listeners to form abstract categories rather than to enhance early acoustic analysis. -
Sjerps, M. J., McQueen, J. M., & Mitterer, H. (2013). Evidence for precategorical extrinsic vowel normalization. Attention, Perception & Psychophysics, 75, 576-587. doi:10.3758/s13414-012-0408-7.
Abstract
Three experiments investigated whether extrinsic vowel normalization takes place largely at a categorical or a precategorical level of processing. Traditional vowel normalization effects in categorization were replicated in Experiment 1: Vowels taken from an [ɪ]-[ε] continuum were more often interpreted as /ɪ/ (which has a low first formant, F (1)) when the vowels were heard in contexts that had a raised F (1) than when the contexts had a lowered F (1). This was established with contexts that consisted of only two syllables. These short contexts were necessary for Experiment 2, a discrimination task that encouraged listeners to focus on the perceptual properties of vowels at a precategorical level. Vowel normalization was again found: Ambiguous vowels were more easily discriminated from an endpoint [ε] than from an endpoint [ɪ] in a high-F (1) context, whereas the opposite was true in a low-F (1) context. Experiment 3 measured discriminability between pairs of steps along the [ɪ]-[ε] continuum. Contextual influences were again found, but without discrimination peaks, contrary to what was predicted from the same participants' categorization behavior. Extrinsic vowel normalization therefore appears to be a process that takes place at least in part at a precategorical processing level. -
Witteman, M. J., Weber, A., & McQueen, J. M. (2013). Foreign accent strength and listener familiarity with an accent co-determine speed of perceptual adaptation. Attention, Perception & Psychophysics, 75, 537-556. doi:10.3758/s13414-012-0404-y.
Abstract
We investigated how the strength of a foreign accent and varying types of experience with foreign-accented speech influence the recognition of accented words. In Experiment 1, native Dutch listeners with limited or extensive prior experience with German-accented Dutch completed a cross-modal priming experiment with strongly, medium, and weakly accented words. Participants with limited experience were primed by the medium and weakly accented words, but not by the strongly accented words. Participants with extensive experience were primed by all accent types. In Experiments 2 and 3, Dutch listeners with limited experience listened to a short story before doing the cross-modal priming task. In Experiment 2, the story was spoken by the priming task speaker and either contained strongly accented words or did not. Strongly accented exposure led to immediate priming by novel strongly accented words, while exposure to the speaker without strongly accented tokens led to priming only in the experiment’s second half. In Experiment 3, listeners listened to the story with strongly accented words spoken by a different German-accented speaker. Listeners were primed by the strongly accented words, but again only in the experiment’s second half. Together, these results show that adaptation to foreign-accented speech is rapid but depends on accent strength and on listener familiarity with those strongly accented words. -
Adam, R., Orfanidou, E., McQueen, J. M., & Morgan, G. (2011). Sign language comprehension: Insights from misperceptions of different phonological parameters. In R. Channon, & H. Van der Hulst (
Eds. ), Formational units in sign languages (pp. 87-106). Berlin: Mouton de Gruyter and Ishara Press. -
Cho, T., & McQueen, J. M. (2011). Perceptual recovery from consonant-cluster simplification using language-specific phonological knowledge. Journal of Psycholinguistic Research, 40, 253-274. doi:10.1007/s10936-011-9168-0.
Abstract
Two experiments examined whether perceptual recovery from Korean consonant-cluster simplification is based on language-specific phonological knowledge. In tri-consonantal C1C2C3 sequences such as /lkt/ and /lpt/ in Seoul Korean, either C1 or C2 can be completely deleted. Seoul Koreans monitored for C2 targets (/p/ or / k/, deleted or preserved) in the second word of a two-word phrase with an underlying /l/-C2-/t/ sequence. In Experiment 1 the target-bearing words had contextual lexical-semantic support. Listeners recovered deleted targets as fast and as accurately as preserved targets with both Word and Intonational Phrase (IP) boundaries between the two words. In Experiment 2, contexts were low-pass filtered. Listeners were still able to recover deleted targets as well as preserved targets in IP-boundary contexts, but better with physically-present targets than with deleted targets in Word-boundary contexts. This suggests that the benefit of having target acoustic-phonetic information emerges only when higher-order (contextual and phrase-boundary) information is not available. The strikingly efficient recovery of deleted phonemes with neither acoustic-phonetic cues nor contextual support demonstrates that language-specific phonological knowledge, rather than language-universal perceptual processes which rely on fine-grained phonetic details, is employed when the listener perceives the results of a continuous-speech process in which reduction is phonetically complete. -
Hanulikova, A., Mitterer, H., & McQueen, J. M. (2011). Effects of first and second language on segmentation of non-native speech. Bilingualism: Language and Cognition, 14, 506-521. doi:10.1017/S1366728910000428.
Abstract
We examined whether Slovak-German bilinguals apply native Slovak phonological and lexical knowledge when segmenting German speech. When Slovaks listen to their native language (Hanulíková, McQueen, & Mitterer, 2010), segmentation is impaired when fixed-stress cues are absent, and, following the Possible-Word Constraint (PWC; Norris, McQueen, Cutler, & Butterfield, 1997), lexical candidates are disfavored if segmentation leads to vowelless residues, unless those residues are existing Slovak words. In the present study, fixed-stress cues on German target words were again absent. Nevertheless, in support of the PWC, both German and Slovak listeners recognized German words (e.g., Rose "rose") faster in syllable contexts (suckrose) than in single- onsonant contexts (krose, trose). But only the Slovak listeners recognized Rose, for example, faster in krose than in trose (k is a Slovak word, t is not). It appears that non-native listeners can suppress native stress segmentation procedures, but that they suffer from prevailing interference from native lexical knowledge -
Huettig, F., & McQueen, J. M. (2011). The nature of the visual environment induces implicit biases during language-mediated visual search. Memory & Cognition, 39, 1068-1084. doi:10.3758/s13421-011-0086-z.
Abstract
Four eye-tracking experiments examined whether semantic and visual-shape representations are routinely retrieved from printed-word displays and used during language-mediated visual search. Participants listened to sentences containing target words which were similar semantically or in shape to concepts invoked by concurrently-displayed printed words. In Experiment 1 the displays contained semantic and shape competitors of the targets, and two unrelated words. There were significant shifts in eye gaze as targets were heard towards semantic but not shape competitors. In Experiments 2-4, semantic competitors were replaced with unrelated words, semantically richer sentences were presented to encourage visual imagery, or participants rated the shape similarity of the stimuli before doing the eye-tracking task. In all cases there were no immediate shifts in eye gaze to shape competitors, even though, in response to the Experiment 1 spoken materials, participants looked to these competitors when they were presented as pictures (Huettig & McQueen, 2007). There was a late shape-competitor bias (more than 2500 ms after target onset) in all experiments. These data show that shape information is not used in online search of printed-word displays (whereas it is used with picture displays). The nature of the visual environment appears to induce implicit biases towards particular modes of processing during language-mediated visual search. -
Jesse, A., & McQueen, J. M. (2011). Positional effects in the lexical retuning of speech perception. Psychonomic Bulletin & Review, 18, 943-950. doi:10.3758/s13423-011-0129-2.
Abstract
Listeners use lexical knowledge to adjust to speakers’ idiosyncratic pronunciations. Dutch listeners learn to interpret an ambiguous sound between /s/ and /f/ as /f/ if they hear it word-finally in Dutch words normally ending in /f/, but as /s/ if they hear it in normally /s/-final words. Here, we examined two positional effects in lexically guided retuning. In Experiment 1, ambiguous sounds during exposure always appeared in word-initial position (replacing the first sounds of /f/- or /s/-initial words). No retuning was found. In Experiment 2, the same ambiguous sounds always appeared word-finally during exposure. Here, retuning was found. Lexically guided perceptual learning thus appears to emerge reliably only when lexical knowledge is available as the to-be-tuned segment is initially being processed. Under these conditions, however, lexically guided retuning was position independent: It generalized across syllabic positions. Lexical retuning can thus benefit future recognition of particular sounds wherever they appear in words. -
Johnson, E., McQueen, J. M., & Huettig, F. (2011). Toddlers’ language-mediated visual search: They need not have the words for it. The Quarterly Journal of Experimental Psychology, 64, 1672-1682. doi:10.1080/17470218.2011.594165.
Abstract
Eye movements made by listeners during language-mediated visual search reveal a strong link between
visual processing and conceptual processing. For example, upon hearing the word for a missing referent
with a characteristic colour (e.g., “strawberry”), listeners tend to fixate a colour-matched distractor (e.g.,
a red plane) more than a colour-mismatched distractor (e.g., a yellow plane). We ask whether these
shifts in visual attention are mediated by the retrieval of lexically stored colour labels. Do children
who do not yet possess verbal labels for the colour attribute that spoken and viewed objects have in
common exhibit language-mediated eye movements like those made by older children and adults?
That is, do toddlers look at a red plane when hearing “strawberry”? We observed that 24-montholds
lacking colour term knowledge nonetheless recognized the perceptual–conceptual commonality
between named and seen objects. This indicates that language-mediated visual search need not
depend on stored labels for concepts. -
Poellmann, K., McQueen, J. M., & Mitterer, H. (2011). The time course of perceptual learning. In W.-S. Lee, & E. Zee (
Eds. ), Proceedings of the 17th International Congress of Phonetic Sciences 2011 [ICPhS XVII] (pp. 1618-1621). Hong Kong: Department of Chinese, Translation and Linguistics, City University of Hong Kong.Abstract
Two groups of participants were trained to perceive an ambiguous sound [s/f] as either /s/ or /f/ based on lexical bias: One group heard the ambiguous fricative in /s/-final words, the other in /f/-final words. This kind of exposure leads to a recalibration of the /s/-/f/ contrast [e.g., 4]. In order to investigate when and how this recalibration emerges, test trials were interspersed among training and filler trials. The learning effect needed at least 10 clear training items to arise. Its emergence seemed to occur in a rather step-wise fashion. Learning did not improve much after it first appeared. It is likely, however, that the early test trials attracted participants' attention and therefore may have interfered with the learning process. -
Reinisch, E., Jesse, A., & McQueen, J. M. (2011). Speaking rate affects the perception of duration as a suprasegmental lexical-stress cue. Language and Speech, 54(2), 147-165. doi:10.1177/0023830910397489.
Abstract
Three categorization experiments investigated whether the speaking rate of a preceding sentence influences durational cues to the perception of suprasegmental lexical-stress patterns. Dutch two-syllable word fragments had to be judged as coming from one of two longer words that matched the fragment segmentally but differed in lexical stress placement. Word pairs contrasted primary stress on either the first versus the second syllable or the first versus the third syllable. Duration of the initial or the second syllable of the fragments and rate of the preceding context (fast vs. slow) were manipulated. Listeners used speaking rate to decide about the degree of stress on initial syllables whether the syllables' absolute durations were informative about stress (Experiment 1a) or not (Experiment 1b). Rate effects on the second syllable were visible only when the initial syllable was ambiguous in duration with respect to the preceding rate context (Experiment 2). Absolute second syllable durations contributed little to stress perception (Experiment 3). These results suggest that speaking rate is used to disambiguate words and that rate-modulated stress cues are more important on initial than non-initial syllables. Speaking rate affects perception of suprasegmental information. -
Reinisch, E., Jesse, A., & McQueen, J. M. (2011). Speaking rate from proximal and distal contexts is used during word segmentation. Journal of Experimental Psychology: Human Perception and Performance, 37, 978-996. doi:10.1037/a0021923.
Abstract
A series of eye-tracking and categorization experiments investigated the use of speaking-rate information in the segmentation of Dutch ambiguous-word sequences. Juncture phonemes with ambiguous durations (e.g., [s] in 'eens (s)peer,' “once (s)pear,” [t] in 'nooit (t)rap,' “never staircase/quick”) were perceived as longer and hence more often as word-initial when following a fast than a slow context sentence. Listeners used speaking-rate information as soon as it became available. Rate information from a context proximal to the juncture phoneme and from a more distal context was used during on-line word recognition, as reflected in listeners' eye movements. Stronger effects of distal context, however, were observed in the categorization task, which measures the off-line results of the word-recognition process. In categorization, the amount of rate context had the greatest influence on the use of rate information, but in eye tracking, the rate information's proximal location was the most important. These findings constrain accounts of how speaking rate modulates the interpretation of durational cues during word recognition by suggesting that rate estimates are used to evaluate upcoming phonetic information continuously during prelexical speech processing. -
Sadakata, M., & McQueen, J. M. (2011). The role of variability in non-native perceptual learning of a Japanese geminate-singleton fricative contrast. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy (pp. 873-876).
Abstract
The current study reports the enhancing effect of a high variability training procedure in the learning of a Japanese geminate-singleton fricative contrast. Dutch natives took part in a five-day training procedure in which they identified geminate and singleton variants of the Japanese fricative /s/. They heard either many repetitions of a limited set of words recorded by a single speaker (simple training) or fewer repetitions of a more variable set of words recorded by multiple speakers (variable training). Pre-post identification evaluations and a transfer test indicated clear benefits of the variable training. -
Scharenborg, O., Mitterer, H., & McQueen, J. M. (2011). Perceptual learning of liquids. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy (pp. 149-152).
Abstract
Previous research on lexically-guided perceptual learning has focussed on contrasts that differ primarily in local cues, such as plosive and fricative contrasts. The present research had two aims: to investigate whether perceptual learning occurs for a contrast with non-local cues, the /l/-/r/ contrast, and to establish whether STRAIGHT can be used to create ambiguous sounds on an /l/-/r/ continuum. Listening experiments showed lexically-guided learning about the /l/-/r/ contrast. Listeners can thus tune in to unusual speech sounds characterised by non-local cues. Moreover, STRAIGHT can be used to create stimuli for perceptual learning experiments, opening up new research possibilities. Index Terms: perceptual learning, morphing, liquids, human word recognition, STRAIGHT. -
Sjerps, M. J., Mitterer, H., & McQueen, J. M. (2011). Constraints on the processes responsible for the extrinsic normalization of vowels. Attention, Perception & Psychophysics, 73, 1195-1215. doi:10.3758/s13414-011-0096-8.
Abstract
Listeners tune in to talkers’ vowels through extrinsic normalization. We asked here whether this process could be based on compensation for the Long Term Average Spectrum (LTAS) of preceding sounds and whether the mechanisms responsible for normalization are indifferent to the nature of those sounds. If so, normalization should apply to nonspeech stimuli. Previous findings were replicated with first formant (F1) manipulations of speech. Targets on a [pIt]-[pEt] (low-high F1) continuum were labeled as [pIt] more after high-F1 than after low-F1 precursors. Spectrally-rotated nonspeech versions of these materials produced similar normalization. None occurred, however, with nonspeech stimuli that were less speech-like, even though precursor-target LTAS relations were equivalent to those used earlier. Additional experiments investigated the roles of pitch movement, amplitude variation, formant location, and the stimuli's perceived similarity to speech. It appears that normalization is not restricted to speech, but that the nature of the preceding sounds does matter. Extrinsic normalization of vowels is due at least in part to an auditory process which may require familiarity with the spectro-temporal characteristics of speech. -
Sjerps, M. J., Mitterer, H., & McQueen, J. M. (2011). Listening to different speakers: On the time-course of perceptual compensation for vocal-tract characteristics. Neuropsychologia, 49, 3831-3846. doi:10.1016/j.neuropsychologia.2011.09.044.
Abstract
This study used an active multiple-deviant oddball design to investigate the time-course of normalization processes that help listeners deal with between-speaker variability. Electroencephalograms were recorded while Dutch listeners heard sequences of non-words (standards and occasional deviants). Deviants were [ɪ papu] or [ɛ papu], and the standard was [ɪɛpapu], where [ɪɛ] was a vowel that was ambiguous between [ɛ] and [ɪ]. These sequences were presented in two conditions, which differed with respect to the vocal-tract characteristics (i.e., the average 1st formant frequency) of the [papu] part, but not of the initial vowels [ɪ], [ɛ] or [ɪɛ] (these vowels were thus identical across conditions). Listeners more often detected a shift from [ɪɛpapu] to [ɛ papu] than from [ɪɛpapu] to [ɪ papu] in the high F1 context condition; the reverse was true in the low F1 context condition. This shows that listeners’ perception of vowels differs depending on the speaker‘s vocal-tract characteristics, as revealed in the speech surrounding those vowels. Cortical electrophysiological responses reflected this normalization process as early as about 120 ms after vowel onset, which suggests that shifts in perception precede influences due to conscious biases or decision strategies. Listeners’ abilities to normalize for speaker-vocal-tract properties are for an important part the result of a process that influences representations of speech sounds early in the speech processing stream. -
Sulpizio, S., & McQueen, J. M. (2011). When two newly-acquired words are one: New words differing in stress alone are not automatically represented differently. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy (pp. 1385-1388).
Abstract
Do listeners use lexical stress at an early stage in word learning? Artificial-lexicon studies have shown that listeners can learn new spoken words easily. These studies used non-words differing in consonants and/or vowels, but not differing only in stress. If listeners use stress information in word learning, they should be able to learn new words that differ only in stress (e.g., BInulo-biNUlo). We investigated this issue here. When learning new words, Italian listeners relied on segmental information; they did not take stress information into account. Newly-acquired words differing in stress alone are not automatically represented as different words. -
Witteman, M. J., Bardhan, N. P., Weber, A., & McQueen, J. M. (2011). Adapting to foreign-accented speech: The role of delay in testing. Journal of the Acoustical Society of America. Program abstracts of the 162nd Meeting of the Acoustical Society of America, 130(4), 2443.
Abstract
Understanding speech usually seems easy, but it can become noticeably harder when the speaker has a foreign accent. This is because foreign accents add considerable variation to speech. Research on foreign-accented speech shows that participants are able to adapt quickly to this type of variation. Less is known, however, about longer-term maintenance of adaptation. The current study focused on long-term adaptation by exposing native listeners to foreign-accented speech on Day 1, and testing them on comprehension of the accent one day later. Comprehension was thus not tested immediately, but only after a 24 hour period. On Day 1, native Dutch listeners listened to the speech of a Hebrew learner of Dutch while performing a phoneme monitoring task that did not depend on the talker’s accent. In particular, shortening of the long vowel /i/ into /ɪ/ (e.g., lief [li:f], ‘sweet’, pronounced as [lɪf]) was examined. These mispronunciations did not create lexical ambiguities in Dutch. On Day 2, listeners participated in a cross-modal priming task to test their comprehension of the accent. The results will be contrasted with results from an experiment without delayed testing and related to accounts of how listeners maintain adaptation to foreign-accented speech. -
Witteman, M. J., Weber, A., & McQueen, J. M. (2011). On the relationship between perceived accentedness, acoustic similarity, and processing difficulty in foreign-accented speech. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy (pp. 2229-2232).
Abstract
Foreign-accented speech is often perceived as more difficult to understand than native speech. What causes this potential difficulty, however, remains unknown. In the present study, we compared acoustic similarity and accent ratings of American-accented Dutch with a cross-modal priming task designed to measure online speech processing. We focused on two Dutch diphthongs: ui and ij. Though both diphthongs deviated from standard Dutch to varying degrees and perceptually varied in accent strength, native Dutch listeners recognized words containing the diphthongs easily. Thus, not all foreign-accented speech hinders comprehension, and acoustic similarity and perceived accentedness are not always predictive of processing difficulties.
Share this page