James McQueen

Publications

Displaying 1 - 24 of 24
  • Andics, A., McQueen, J. M., & Petersson, K. M. (2013). Mean-based neural coding of voices. NeuroImage, 79, 351-360. doi:10.1016/j.neuroimage.2013.05.002.

    Abstract

    The social significance of recognizing the person who talks to us is obvious, but the neural mechanisms that mediate talker identification are unclear. Regions along the bilateral superior temporal sulcus (STS) and the inferior frontal cortex (IFC) of the human brain are selective for voices, and they are sensitive to rapid voice changes. Although it has been proposed that voice recognition is supported by prototype-centered voice representations, the involvement of these category-selective cortical regions in the neural coding of such "mean voices" has not previously been demonstrated. Using fMRI in combination with a voice identity learning paradigm, we show that voice-selective regions are involved in the mean-based coding of voice identities. Voice typicality is encoded on a supra-individual level in the right STS along a stimulus-dependent, identity-independent (i.e., voice-acoustic) dimension, and on an intra-individual level in the right IFC along a stimulus-independent, identity-dependent (i.e., voice identity) dimension. Voice recognition therefore entails at least two anatomically separable stages, each characterized by neural mechanisms that reference the central tendencies of voice categories.
  • Asaridou, S. S., & McQueen, J. M. (2013). Speech and music shape the listening brain: Evidence for shared domain-general mechanisms. Frontiers in Psychology, 4: 321. doi:10.3389/fpsyg.2013.00321.

    Abstract

    Are there bi-directional influences between speech perception and music perception? An answer to this question is essential for understanding the extent to which the speech and music that we hear are processed by domain-general auditory processes and/or by distinct neural auditory mechanisms. This review summarizes a large body of behavioral and neuroscientific findings which suggest that the musical experience of trained musicians does modulate speech processing, and a sparser set of data, largely on pitch processing, which suggest in addition that linguistic experience, in particular learning a tone language, modulates music processing. Although research has focused mostly on music on speech effects, we argue that both directions of influence need to be studied, and conclude that the picture which thus emerges is one of mutual interaction across domains. In particular, it is not simply that experience with spoken language has some effects on music perception, and vice versa, but that because of shared domain-general subcortical and cortical networks, experiences in both domains influence behavior in both domains.
  • Brandmeyer, A., Sadakata, M., Spyrou, L., McQueen, J. M., & Desain, P. (2013). Decoding of single-trial auditory mismatch responses for online perceptual monitoring and neurofeedback. Frontiers in Neuroscience, 7: 265. doi:10.3389/fnins.2013.00265.

    Abstract

    Multivariate pattern classification methods are increasingly applied to neuroimaging data in the context of both fundamental research and in brain-computer interfacing approaches. Such methods provide a framework for interpreting measurements made at the single-trial level with respect to a set of two or more distinct mental states. Here, we define an approach in which the output of a binary classifier trained on data from an auditory mismatch paradigm can be used for online tracking of perception and as a neurofeedback signal. The auditory mismatch paradigm is known to induce distinct perceptual states related to the presentation of high- and low-probability stimuli, which are reflected in event-related potential (ERP) components such as the mismatch negativity (MMN). The first part of this paper illustrates how pattern classification methods can be applied to data collected in an MMN paradigm, including discussion of the optimization of preprocessing steps, the interpretation of features and how the performance of these methods generalizes across individual participants and measurement sessions. We then go on to show that the output of these decoding methods can be used in online settings as a continuous index of single-trial brain activation underlying perceptual discrimination. We conclude by discussing several potential domains of application, including neurofeedback, cognitive monitoring and passive brain-computer interfaces

    Additional information

    Brandmeyer_etal_2013a.pdf
  • Brandmeyer, A., Farquhar, J., McQueen, J. M., & Desain, P. (2013). Decoding speech perception by native and non-native speakers using single-trial electrophysiological data. PLoS One, 8: e68261. doi:10.1371/journal.pone.0068261.

    Abstract

    Brain-computer interfaces (BCIs) are systems that use real-time analysis of neuroimaging data to determine the mental state of their user for purposes such as providing neurofeedback. Here, we investigate the feasibility of a BCI based on speech perception. Multivariate pattern classification methods were applied to single-trial EEG data collected during speech perception by native and non-native speakers. Two principal questions were asked: 1) Can differences in the perceived categories of pairs of phonemes be decoded at the single-trial level? 2) Can these same categorical differences be decoded across participants, within or between native-language groups? Results indicated that classification performance progressively increased with respect to the categorical status (within, boundary or across) of the stimulus contrast, and was also influenced by the native language of individual participants. Classifier performance showed strong relationships with traditional event-related potential measures and behavioral responses. The results of the cross-participant analysis indicated an overall increase in average classifier performance when trained on data from all participants (native and non-native). A second cross-participant classifier trained only on data from native speakers led to an overall improvement in performance for native speakers, but a reduction in performance for non-native speakers. We also found that the native language of a given participant could be decoded on the basis of EEG data with accuracy above 80%. These results indicate that electrophysiological responses underlying speech perception can be decoded at the single-trial level, and that decoding performance systematically reflects graded changes in the responses related to the phonological status of the stimuli. This approach could be used in extensions of the BCI paradigm to support perceptual learning during second language acquisition
  • Mani, N., Johnson, E., McQueen, J. M., & Huettig, F. (2013). How yellow is your banana? Toddlers' language-mediated visual search in referent-present tasks. Developmental Psychology, 49, 1036-1044. doi:10.1037/a0029382.

    Abstract

    What is the relative salience of different aspects of word meaning in the developing lexicon? The current study examines the time-course of retrieval of semantic and color knowledge associated with words during toddler word recognition: at what point do toddlers orient towards an image of a yellow cup upon hearing color-matching words such as “banana” (typically yellow) relative to unrelated words (e.g., “house”)? Do children orient faster to semantic matching images relative to color matching images, e.g., orient faster to an image of a cookie relative to a yellow cup upon hearing the word “banana”? The results strongly suggest a prioritization of semantic information over color information in children’s word-referent mappings. This indicates that, even for natural objects (e.g., food, animals that are more likely to have a prototypical color), semantic knowledge is a more salient aspect of toddler's word meaning than color knowledge. For 24-month-old Dutch toddlers, bananas are thus more edible than they are yellow.
  • Mitterer, H., Scharenborg, O., & McQueen, J. M. (2013). Phonological abstraction without phonemes in speech perception. Cognition, 129, 356-361. doi:10.1016/j.cognition.2013.07.011.

    Abstract

    Recent evidence shows that listeners use abstract prelexical units in speech perception. Using the phenomenon of lexical retuning in speech processing, we ask whether those units are necessarily phonemic. Dutch listeners were exposed to a Dutch speaker producing ambiguous phones between the Dutch syllable-final allophones approximant [r] and dark [l]. These ambiguous phones replaced either final /r/ or final /l/ in words in a lexical-decision task. This differential exposure affected perception of ambiguous stimuli on the same allophone continuum in a subsequent phonetic-categorization test: Listeners exposed to ambiguous phones in /r/-final words were more likely to perceive test stimuli as /r/ than listeners with exposure in /l/-final words. This effect was not found for test stimuli on continua using other allophones of /r/ and /l/. These results confirm that listeners use phonological abstraction in speech perception. They also show that context-sensitive allophones can play a role in this process, and hence that context-insensitive phonemes are not necessary. We suggest there may be no one unit of perception
  • Sadakata, M., & McQueen, J. M. (2013). High stimulus variability in nonnative speech learning supports formation of abstract categories: Evidence from Japanese geminates. Journal of the Acoustical Society of America, 134(2), 1324-1335. doi:10.1121/1.4812767.

    Abstract

    This study reports effects of a high-variability training procedure on nonnative learning of a Japanese geminate-singleton fricative contrast. Thirty native speakers of Dutch took part in a 5-day training procedure in which they identified geminate and singleton variants of the Japanese fricative /s/. Participants were trained with either many repetitions of a limited set of words recorded by a single speaker (low-variability training) or with fewer repetitions of a more variable set of words recorded by multiple speakers (high-variability training). Both types of training enhanced identification of speech but not of nonspeech materials, indicating that learning was domain specific. High-variability training led to superior performance in identification but not in discrimination tests, and supported better generalization of learning as shown by transfer from the trained fricatives to the identification of untrained stops and affricates. Variability thus helps nonnative listeners to form abstract categories rather than to enhance early acoustic analysis.
  • Sjerps, M. J., McQueen, J. M., & Mitterer, H. (2013). Evidence for precategorical extrinsic vowel normalization. Attention, Perception & Psychophysics, 75, 576-587. doi:10.3758/s13414-012-0408-7.

    Abstract

    Three experiments investigated whether extrinsic vowel normalization takes place largely at a categorical or a precategorical level of processing. Traditional vowel normalization effects in categorization were replicated in Experiment 1: Vowels taken from an [ɪ]-[ε] continuum were more often interpreted as /ɪ/ (which has a low first formant, F (1)) when the vowels were heard in contexts that had a raised F (1) than when the contexts had a lowered F (1). This was established with contexts that consisted of only two syllables. These short contexts were necessary for Experiment 2, a discrimination task that encouraged listeners to focus on the perceptual properties of vowels at a precategorical level. Vowel normalization was again found: Ambiguous vowels were more easily discriminated from an endpoint [ε] than from an endpoint [ɪ] in a high-F (1) context, whereas the opposite was true in a low-F (1) context. Experiment 3 measured discriminability between pairs of steps along the [ɪ]-[ε] continuum. Contextual influences were again found, but without discrimination peaks, contrary to what was predicted from the same participants' categorization behavior. Extrinsic vowel normalization therefore appears to be a process that takes place at least in part at a precategorical processing level.
  • Witteman, M. J., Weber, A., & McQueen, J. M. (2013). Foreign accent strength and listener familiarity with an accent co-determine speed of perceptual adaptation. Attention, Perception & Psychophysics, 75, 537-556. doi:10.3758/s13414-012-0404-y.

    Abstract

    We investigated how the strength of a foreign accent and varying types of experience with foreign-accented speech influence the recognition of accented words. In Experiment 1, native Dutch listeners with limited or extensive prior experience with German-accented Dutch completed a cross-modal priming experiment with strongly, medium, and weakly accented words. Participants with limited experience were primed by the medium and weakly accented words, but not by the strongly accented words. Participants with extensive experience were primed by all accent types. In Experiments 2 and 3, Dutch listeners with limited experience listened to a short story before doing the cross-modal priming task. In Experiment 2, the story was spoken by the priming task speaker and either contained strongly accented words or did not. Strongly accented exposure led to immediate priming by novel strongly accented words, while exposure to the speaker without strongly accented tokens led to priming only in the experiment’s second half. In Experiment 3, listeners listened to the story with strongly accented words spoken by a different German-accented speaker. Listeners were primed by the strongly accented words, but again only in the experiment’s second half. Together, these results show that adaptation to foreign-accented speech is rapid but depends on accent strength and on listener familiarity with those strongly accented words.
  • Cutler, A., Otake, T., & McQueen, J. M. (2009). Vowel devoicing and the perception of spoken Japanese words. Journal of the Acoustical Society of America, 125(3), 1693-1703. doi:10.1121/1.3075556.

    Abstract

    Three experiments, in which Japanese listeners detected Japanese words embedded in nonsense sequences, examined the perceptual consequences of vowel devoicing in that language. Since vowelless sequences disrupt speech segmentation [Norris et al. (1997). Cognit. Psychol. 34, 191– 243], devoicing is potentially problematic for perception. Words in initial position in nonsense sequences were detected more easily when followed by a sequence containing a vowel than by a vowelless segment (with or without further context), and vowelless segments that were potential devoicing environments were no easier than those not allowing devoicing. Thus asa, “morning,” was easier in asau or asazu than in all of asap, asapdo, asaf, or asafte, despite the fact that the /f/ in the latter two is a possible realization of fu, with devoiced [u]. Japanese listeners thus do not treat devoicing contexts as if they always contain vowels. Words in final position in nonsense sequences, however, produced a different pattern: here, preceding vowelless contexts allowing devoicing impeded word detection less strongly (so, sake was detected less accurately, but not less rapidly, in nyaksake—possibly arising from nyakusake—than in nyagusake). This is consistent with listeners treating consonant sequences as potential realizations of parts of existing lexical candidates wherever possible.
  • McQueen, J. M. (2009). Al sprekende leert men [Inaugural lecture]. Arnhem: Drukkerij Roos en Roos.

    Abstract

    Rede uitgesproken bij de aanvaarding van het ambt van hoogleraar Leren en plasticiteit aan de Faculteit der Sociale Wetenschappen van de Radboud Universiteit Nijmegen op donderdag 1 oktober 2009
  • McQueen, J. M., Jesse, A., & Norris, D. (2009). No lexical–prelexical feedback during speech perception or: Is it time to stop playing those Christmas tapes? Journal of Memory and Language, 61, 1-18. doi:10.1016/j.jml.2009.03.002.

    Abstract

    The strongest support for feedback in speech perception comes from evidence of apparent lexical influence on prelexical fricative-stop compensation for coarticulation. Lexical knowledge (e.g., that the ambiguous final fricative of Christma? should be [s]) apparently influences perception of following stops. We argue that all such previous demonstrations can be explained without invoking lexical feedback. In particular, we show that one demonstration [Magnuson, J. S., McMurray, B., Tanenhaus, M. K., & Aslin, R. N. (2003). Lexical effects on compensation for coarticulation: The ghost of Christmash past. Cognitive Science, 27, 285–298] involved experimentally-induced biases (from 16 practice trials) rather than feedback. We found that the direction of the compensation effect depended on whether practice stimuli were words or nonwords. When both were used, there was no lexically-mediated compensation. Across experiments, however, there were lexical effects on fricative identification. This dissociation (lexical involvement in the fricative decisions but not in the following stop decisions made on the same trials) challenges interactive models in which feedback should cause both effects. We conclude that the prelexical level is sensitive to experimentally-induced phoneme-sequence biases, but that there is no feedback during speech perception.
  • Mitterer, H., & McQueen, J. M. (2009). Foreign subtitles help but native-language subtitles harm foreign speech perception. PLoS ONE, 4(11), e7785. doi:10.1371/journal.pone.0007785.

    Abstract

    Understanding foreign speech is difficult, in part because of unusual mappings between sounds and words. It is known that listeners in their native language can use lexical knowledge (about how words ought to sound) to learn how to interpret unusual speech-sounds. We therefore investigated whether subtitles, which provide lexical information, support perceptual learning about foreign speech. Dutch participants, unfamiliar with Scottish and Australian regional accents of English, watched Scottish or Australian English videos with Dutch, English or no subtitles, and then repeated audio fragments of both accents. Repetition of novel fragments was worse after Dutch-subtitle exposure but better after English-subtitle exposure. Native-language subtitles appear to create lexical interference, but foreign-language subtitles assist speech learning by indicating which words (and hence sounds) are being spoken.
  • Mitterer, H., & McQueen, J. M. (2009). Processing reduced word-forms in speech perception using probabilistic knowledge about speech production. Journal of Experimental Psychology: Human Perception and Performance, 35(1), 244-263. doi:10.1037/a0012730.

    Abstract

    Two experiments examined how Dutch listeners deal with the effects of connected-speech processes, specifically those arising from word-final /t/ reduction (e.g., whether Dutch [tas] is tas, bag, or a reduced-/t/ version of tast, touch). Eye movements of Dutch participants were tracked as they looked at arrays containing 4 printed words, each associated with a geometrical shape. Minimal pairs (e.g., tas/tast) were either both above (boven) or both next to (naast) different shapes. Spoken instructions (e.g., “Klik op het woordje tas boven de ster,” [Click on the word bag above the star]) thus became unambiguous only on their final words. Prior to disambiguation, listeners' fixations were drawn to /t/-final words more when boven than when naast followed the ambiguous sequences. This behavior reflects Dutch speech-production data: /t/ is reduced more before /b/ than before /n/. We thus argue that probabilistic knowledge about the effect of following context in speech production is used prelexically in perception to help resolve lexical ambiguities caused by continuous-speech processes.
  • Orfanidou, E., Adam, R., McQueen, J. M., & Morgan, G. (2009). Making sense of nonsense in British Sign Language (BSL): The contribution of different phonological parameters to sign recognition. Memory & Cognition, 37(3), 302-315. doi:10.3758/MC.37.3.302.

    Abstract

    Do all components of a sign contribute equally to its recognition? In the present study, misperceptions in the sign-spotting task (based on the word-spotting task; Cutler & Norris, 1988) were analyzed to address this question. Three groups of deaf signers of British Sign Language (BSL) with different ages of acquisition (AoA) saw BSL signs combined with nonsense signs, along with combinations of two nonsense signs. They were asked to spot real signs and report what they had spotted. We will present an analysis of false alarms to the nonsense-sign combinations—that is, misperceptions of nonsense signs as real signs (cf. van Ooijen, 1996). Participants modified the movement and handshape parameters more than the location parameter. Within this pattern, however, there were differences as a function of AoA. These results show that the theoretical distinctions between form-based parameters in sign-language models have consequences for online processing. Vowels and consonants have different roles in speech recognition; similarly, it appears that movement, handshape, and location parameters contribute differentially to sign recognition.
  • Cho, T., & McQueen, J. M. (2005). Prosodic influences on consonant production in Dutch: Effects of prosodic boundaries, phrasal accent and lexical stress. Journal of Phonetics, 33(2), 121-157. doi:10.1016/j.wocn.2005.01.001.

    Abstract

    Prosodic influences on phonetic realizations of four Dutch consonants (/t d s z/) were examined. Sentences were constructed containing these consonants in word-initial position; the factors lexical stress, phrasal accent and prosodic boundary were manipulated between sentences. Eleven Dutch speakers read these sentences aloud. The patterns found in acoustic measurements of these utterances (e.g., voice onset time (VOT), consonant duration, voicing during closure, spectral center of gravity, burst energy) indicate that the low-level phonetic implementation of all four consonants is modulated by prosodic structure. Boundary effects on domain-initial segments were observed in stressed and unstressed syllables, extending previous findings which have been on stressed syllables alone. Three aspects of the data are highlighted. First, shorter VOTs were found for /t/ in prosodically stronger locations (stressed, accented and domain-initial), as opposed to longer VOTs in these positions in English. This suggests that prosodically driven phonetic realization is bounded by language-specific constraints on how phonetic features are specified with phonetic content: Shortened VOT in Dutch reflects enhancement of the phonetic feature {−spread glottis}, while lengthened VOT in English reflects enhancement of {+spread glottis}. Prosodic strengthening therefore appears to operate primarily at the phonetic level, such that prosodically driven enhancement of phonological contrast is determined by phonetic implementation of these (language-specific) phonetic features. Second, an accent effect was observed in stressed and unstressed syllables, and was independent of prosodic boundary size. The domain of accentuation in Dutch is thus larger than the foot. Third, within a prosodic category consisting of those utterances with a boundary tone but no pause, tokens with syntactically defined Phonological Phrase boundaries could be differentiated from the other tokens. This syntactic influence on prosodic phrasing implies the existence of an intermediate-level phrase in the prosodic hierarchy of Dutch.
  • Cutler, A., McQueen, J. M., & Norris, D. (2005). The lexical utility of phoneme-category plasticity. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 103-107).
  • Eisner, F., & McQueen, J. M. (2005). The specificity of perceptual learning in speech processing. Perception & Psychophysics, 67(2), 224-238.

    Abstract

    We conducted four experiments to investigate the specificity of perceptual adjustments made to unusual speech sounds. Dutch listeners heard a female talker produce an ambiguous fricative [?] (between [f] and [s]) in [f]- or [s]-biased lexical contexts. Listeners with [f]-biased exposure (e.g., [witlo?]; from witlof, “chicory”; witlos is meaningless) subsequently categorized more sounds on an [εf]–[εs] continuum as [f] than did listeners with [s]-biased exposure. This occurred when the continuum was based on the exposure talker's speech (Experiment 1), and when the same test fricatives appeared after vowels spoken by novel female and male talkers (Experiments 1 and 2). When the continuum was made entirely from a novel talker's speech, there was no exposure effect (Experiment 3) unless fricatives from that talker had been spliced into the exposure talker's speech during exposure (Experiment 4). We conclude that perceptual learning about idiosyncratic speech is applied at a segmental level and is, under these exposure conditions, talker specific.
  • McQueen, J. M. (2005). Speech perception. In K. Lamberts, & R. Goldstone (Eds.), The Handbook of Cognition (pp. 255-275). London: Sage Publications.
  • McQueen, J. M. (2005). Spoken word recognition and production: Regular but not inseparable bedfellows. In A. Cutler (Ed.), Twenty-first century psycholinguistics: Four cornerstones (pp. 229-244). Mahwah, NJ: Erlbaum.
  • McQueen, J. M., & Sereno, J. (2005). Cleaving automatic processes from strategic biases in phonological priming. Memory & Cognition, 33(7), 1185-1209.

    Abstract

    In a phonological priming experiment using spoken Dutch words, Dutch listeners were taught varying expectancies and relatedness relations about the phonological form of target words, given particular primes. They learned to expect that, after a particular prime, if the target was a word, it would be from a specific phonological category. The expectancy either involved phonological overlap (e.g., honk-vonk, “base-spark”; expected related) or did not (e.g., nest-galm, “nest-boom”; expected unrelated, where the learned expectation after hearing nest was a word rhyming in -alm). Targets were occasionally inconsistent with expectations. In these inconsistent expectancy trials, targets were either unrelated (e.g., honk-mest, “base-manure”; unexpected unrelated), where the listener was expecting a related target, or related (e.g., nest-pest, “nest-plague”; unexpected related), where the listener was expecting an unrelated target. Participant expectations and phonological relatedness were thus manipulated factorially for three types of phonological overlap (rhyme, one onset phoneme, and three onset phonemes) at three interstimulus intervals (ISIs; 50, 500, and 2,000 msec). Lexical decisions to targets revealed evidence of expectancy-based strategies for all three types of overlap (e.g., faster responses to expected than to unexpected targets, irrespective of phonological relatedness) and evidence of automatic phonological processes, but only for the rhyme and three-phoneme onset overlap conditions and, most strongly, at the shortest ISI (e.g., faster responses to related than to unrelated targets, irrespective of expectations). Although phonological priming thus has both automatic and strategic components, it is possible to cleave them apart.
  • McQueen, J. M., & Mitterer, H. (2005). Lexically-driven perceptual adjustments of vowel categories. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 233-236).
  • Scharenborg, O., Norris, D., Ten Bosch, L., & McQueen, J. M. (2005). How should a speech recognizer work? Cognitive Science, 29(6), 867-918. doi:10.1207/s15516709cog0000_37.

    Abstract

    Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that research in these related fields has focused on the mechanics of how speech can be recognized. In Marr's (1982) terms, emphasis has been on the algorithmic and implementational levels rather than on the computational level. In this article, we provide a computational-level analysis of the task of speech recognition, which reveals the close parallels between research concerned with HSR and ASR. We illustrate this relation by presenting a new computational model of human spoken-word recognition, built using techniques from the field of ASR that, in contrast to current existing models of HSR, recognizes words from real speech input.
  • Warner, N., Smits, R., McQueen, J. M., & Cutler, A. (2005). Phonological and statistical effects on timing of speech perception: Insights from a database of Dutch diphone perception. Speech Communication, 46(1), 53-72. doi:10.1016/j.specom.2005.01.003.

    Abstract

    We report detailed analyses of a very large database on timing of speech perception collected by Smits et al. (Smits, R., Warner, N., McQueen, J.M., Cutler, A., 2003. Unfolding of phonetic information over time: A database of Dutch diphone perception. J. Acoust. Soc. Am. 113, 563–574). Eighteen listeners heard all possible diphones of Dutch, gated in portions of varying size and presented without background noise. The present report analyzes listeners’ responses across gates in terms of phonological features (voicing, place, and manner for consonants; height, backness, and length for vowels). The resulting patterns for feature perception differ from patterns reported when speech is presented in noise. The data are also analyzed for effects of stress and of phonological context (neighboring vowel vs. consonant); effects of these factors are observed to be surprisingly limited. Finally, statistical effects, such as overall phoneme frequency and transitional probabilities, along with response biases, are examined; these too exercise only limited effects on response patterns. The results suggest highly accurate speech perception on the basis of acoustic information alone.

Share this page