James McQueen

Publications

  • Andics, A., McQueen, J. M., & Petersson, K. M. (2013). Mean-based neural coding of voices. NeuroImage, 79, 351-360. doi:10.1016/j.neuroimage.2013.05.002.

    Abstract

    The social significance of recognizing the person who talks to us is obvious, but the neural mechanisms that mediate talker identification are unclear. Regions along the bilateral superior temporal sulcus (STS) and the inferior frontal cortex (IFC) of the human brain are selective for voices, and they are sensitive to rapid voice changes. Although it has been proposed that voice recognition is supported by prototype-centered voice representations, the involvement of these category-selective cortical regions in the neural coding of such "mean voices" has not previously been demonstrated. Using fMRI in combination with a voice identity learning paradigm, we show that voice-selective regions are involved in the mean-based coding of voice identities. Voice typicality is encoded on a supra-individual level in the right STS along a stimulus-dependent, identity-independent (i.e., voice-acoustic) dimension, and on an intra-individual level in the right IFC along a stimulus-independent, identity-dependent (i.e., voice identity) dimension. Voice recognition therefore entails at least two anatomically separable stages, each characterized by neural mechanisms that reference the central tendencies of voice categories.
  • Asaridou, S. S., & McQueen, J. M. (2013). Speech and music shape the listening brain: Evidence for shared domain-general mechanisms. Frontiers in Psychology, 4: 321. doi:10.3389/fpsyg.2013.00321.

    Abstract

    Are there bi-directional influences between speech perception and music perception? An answer to this question is essential for understanding the extent to which the speech and music that we hear are processed by domain-general auditory processes and/or by distinct neural auditory mechanisms. This review summarizes a large body of behavioral and neuroscientific findings which suggest that the musical experience of trained musicians does modulate speech processing, and a sparser set of data, largely on pitch processing, which suggest in addition that linguistic experience, in particular learning a tone language, modulates music processing. Although research has focused mostly on the effects of music on speech, we argue that both directions of influence need to be studied, and conclude that the picture which thus emerges is one of mutual interaction across domains. In particular, it is not simply that experience with spoken language has some effects on music perception, and vice versa, but that because of shared domain-general subcortical and cortical networks, experiences in both domains influence behavior in both domains.
  • Brandmeyer, A., Sadakata, M., Spyrou, L., McQueen, J. M., & Desain, P. (2013). Decoding of single-trial auditory mismatch responses for online perceptual monitoring and neurofeedback. Frontiers in Neuroscience, 7: 265. doi:10.3389/fnins.2013.00265.

    Abstract

    Multivariate pattern classification methods are increasingly applied to neuroimaging data in the context of both fundamental research and brain-computer interfacing approaches. Such methods provide a framework for interpreting measurements made at the single-trial level with respect to a set of two or more distinct mental states. Here, we define an approach in which the output of a binary classifier trained on data from an auditory mismatch paradigm can be used for online tracking of perception and as a neurofeedback signal. The auditory mismatch paradigm is known to induce distinct perceptual states related to the presentation of high- and low-probability stimuli, which are reflected in event-related potential (ERP) components such as the mismatch negativity (MMN). The first part of this paper illustrates how pattern classification methods can be applied to data collected in an MMN paradigm, including discussion of the optimization of preprocessing steps, the interpretation of features and how the performance of these methods generalizes across individual participants and measurement sessions. We then go on to show that the output of these decoding methods can be used in online settings as a continuous index of single-trial brain activation underlying perceptual discrimination. We conclude by discussing several potential domains of application, including neurofeedback, cognitive monitoring and passive brain-computer interfaces.

    (A minimal illustrative sketch of this kind of single-trial decoding appears at the end of this list.)

    Additional information

    Brandmeyer_etal_2013a.pdf
  • Brandmeyer, A., Farquhar, J., McQueen, J. M., & Desain, P. (2013). Decoding speech perception by native and non-native speakers using single-trial electrophysiological data. PLoS One, 8: e68261. doi:10.1371/journal.pone.0068261.

    Abstract

    Brain-computer interfaces (BCIs) are systems that use real-time analysis of neuroimaging data to determine the mental state of their user for purposes such as providing neurofeedback. Here, we investigate the feasibility of a BCI based on speech perception. Multivariate pattern classification methods were applied to single-trial EEG data collected during speech perception by native and non-native speakers. Two principal questions were asked: 1) Can differences in the perceived categories of pairs of phonemes be decoded at the single-trial level? 2) Can these same categorical differences be decoded across participants, within or between native-language groups? Results indicated that classification performance progressively increased with respect to the categorical status (within, boundary or across) of the stimulus contrast, and was also influenced by the native language of individual participants. Classifier performance showed strong relationships with traditional event-related potential measures and behavioral responses. The results of the cross-participant analysis indicated an overall increase in average classifier performance when trained on data from all participants (native and non-native). A second cross-participant classifier trained only on data from native speakers led to an overall improvement in performance for native speakers, but a reduction in performance for non-native speakers. We also found that the native language of a given participant could be decoded on the basis of EEG data with accuracy above 80%. These results indicate that electrophysiological responses underlying speech perception can be decoded at the single-trial level, and that decoding performance systematically reflects graded changes in the responses related to the phonological status of the stimuli. This approach could be used in extensions of the BCI paradigm to support perceptual learning during second language acquisition.
  • Mani, N., Johnson, E., McQueen, J. M., & Huettig, F. (2013). How yellow is your banana? Toddlers' language-mediated visual search in referent-present tasks. Developmental Psychology, 49, 1036-1044. doi:10.1037/a0029382.

    Abstract

    What is the relative salience of different aspects of word meaning in the developing lexicon? The current study examines the time-course of retrieval of semantic and color knowledge associated with words during toddler word recognition: at what point do toddlers orient towards an image of a yellow cup upon hearing color-matching words such as “banana” (typically yellow) relative to unrelated words (e.g., “house”)? Do children orient faster to semantically matching images relative to color-matching images, e.g., orient faster to an image of a cookie relative to a yellow cup upon hearing the word “banana”? The results strongly suggest a prioritization of semantic information over color information in children’s word-referent mappings. This indicates that, even for natural objects (e.g., food, animals that are more likely to have a prototypical color), semantic knowledge is a more salient aspect of toddlers’ word meaning than color knowledge. For 24-month-old Dutch toddlers, bananas are thus more edible than they are yellow.
  • Mitterer, H., Scharenborg, O., & McQueen, J. M. (2013). Phonological abstraction without phonemes in speech perception. Cognition, 129, 356-361. doi:10.1016/j.cognition.2013.07.011.

    Abstract

    Recent evidence shows that listeners use abstract prelexical units in speech perception. Using the phenomenon of lexical retuning in speech processing, we ask whether those units are necessarily phonemic. Dutch listeners were exposed to a Dutch speaker producing ambiguous phones between the Dutch syllable-final allophones approximant [r] and dark [l]. These ambiguous phones replaced either final /r/ or final /l/ in words in a lexical-decision task. This differential exposure affected perception of ambiguous stimuli on the same allophone continuum in a subsequent phonetic-categorization test: Listeners exposed to ambiguous phones in /r/-final words were more likely to perceive test stimuli as /r/ than listeners with exposure in /l/-final words. This effect was not found for test stimuli on continua using other allophones of /r/ and /l/. These results confirm that listeners use phonological abstraction in speech perception. They also show that context-sensitive allophones can play a role in this process, and hence that context-insensitive phonemes are not necessary. We suggest there may be no one unit of perception.
  • Sadakata, M., & McQueen, J. M. (2013). High stimulus variability in nonnative speech learning supports formation of abstract categories: Evidence from Japanese geminates. Journal of the Acoustical Society of America, 134(2), 1324-1335. doi:10.1121/1.4812767.

    Abstract

    This study reports effects of a high-variability training procedure on nonnative learning of a Japanese geminate-singleton fricative contrast. Thirty native speakers of Dutch took part in a 5-day training procedure in which they identified geminate and singleton variants of the Japanese fricative /s/. Participants were trained with either many repetitions of a limited set of words recorded by a single speaker (low-variability training) or with fewer repetitions of a more variable set of words recorded by multiple speakers (high-variability training). Both types of training enhanced identification of speech but not of nonspeech materials, indicating that learning was domain specific. High-variability training led to superior performance in identification but not in discrimination tests, and supported better generalization of learning as shown by transfer from the trained fricatives to the identification of untrained stops and affricates. Variability thus helps nonnative listeners to form abstract categories rather than to enhance early acoustic analysis.
  • Sjerps, M. J., McQueen, J. M., & Mitterer, H. (2013). Evidence for precategorical extrinsic vowel normalization. Attention, Perception & Psychophysics, 75, 576-587. doi:10.3758/s13414-012-0408-7.

    Abstract

    Three experiments investigated whether extrinsic vowel normalization takes place largely at a categorical or a precategorical level of processing. Traditional vowel normalization effects in categorization were replicated in Experiment 1: Vowels taken from an [ɪ]-[ɛ] continuum were more often interpreted as /ɪ/ (which has a low first formant, F1) when the vowels were heard in contexts that had a raised F1 than when the contexts had a lowered F1. This was established with contexts that consisted of only two syllables. These short contexts were necessary for Experiment 2, a discrimination task that encouraged listeners to focus on the perceptual properties of vowels at a precategorical level. Vowel normalization was again found: Ambiguous vowels were more easily discriminated from an endpoint [ɛ] than from an endpoint [ɪ] in a high-F1 context, whereas the opposite was true in a low-F1 context. Experiment 3 measured discriminability between pairs of steps along the [ɪ]-[ɛ] continuum. Contextual influences were again found, but without discrimination peaks, contrary to what was predicted from the same participants' categorization behavior. Extrinsic vowel normalization therefore appears to be a process that takes place at least in part at a precategorical processing level.
  • Witteman, M. J., Weber, A., & McQueen, J. M. (2013). Foreign accent strength and listener familiarity with an accent co-determine speed of perceptual adaptation. Attention, Perception & Psychophysics, 75, 537-556. doi:10.3758/s13414-012-0404-y.

    Abstract

    We investigated how the strength of a foreign accent and varying types of experience with foreign-accented speech influence the recognition of accented words. In Experiment 1, native Dutch listeners with limited or extensive prior experience with German-accented Dutch completed a cross-modal priming experiment with strongly, moderately, and weakly accented words. Participants with limited experience were primed by the moderately and weakly accented words, but not by the strongly accented words. Participants with extensive experience were primed by all accent types. In Experiments 2 and 3, Dutch listeners with limited experience listened to a short story before doing the cross-modal priming task. In Experiment 2, the story was spoken by the priming task speaker and either contained strongly accented words or did not. Strongly accented exposure led to immediate priming by novel strongly accented words, while exposure to the speaker without strongly accented tokens led to priming only in the experiment’s second half. In Experiment 3, listeners heard the story with strongly accented words spoken by a different German-accented speaker. Listeners were primed by the strongly accented words, but again only in the experiment’s second half. Together, these results show that adaptation to foreign-accented speech is rapid but depends on accent strength and on listener familiarity with those strongly accented words.
  • Cho, T., & McQueen, J. M. (2005). Prosodic influences on consonant production in Dutch: Effects of prosodic boundaries, phrasal accent and lexical stress. Journal of Phonetics, 33(2), 121-157. doi:10.1016/j.wocn.2005.01.001.

    Abstract

    Prosodic influences on phonetic realizations of four Dutch consonants (/t d s z/) were examined. Sentences were constructed containing these consonants in word-initial position; the factors lexical stress, phrasal accent and prosodic boundary were manipulated between sentences. Eleven Dutch speakers read these sentences aloud. The patterns found in acoustic measurements of these utterances (e.g., voice onset time (VOT), consonant duration, voicing during closure, spectral center of gravity, burst energy) indicate that the low-level phonetic implementation of all four consonants is modulated by prosodic structure. Boundary effects on domain-initial segments were observed in stressed and unstressed syllables, extending previous findings, which were based on stressed syllables alone. Three aspects of the data are highlighted. First, shorter VOTs were found for /t/ in prosodically stronger locations (stressed, accented and domain-initial), as opposed to longer VOTs in these positions in English. This suggests that prosodically driven phonetic realization is bounded by language-specific constraints on how phonetic features are specified with phonetic content: Shortened VOT in Dutch reflects enhancement of the phonetic feature [−spread glottis], while lengthened VOT in English reflects enhancement of [+spread glottis]. Prosodic strengthening therefore appears to operate primarily at the phonetic level, such that prosodically driven enhancement of phonological contrast is determined by phonetic implementation of these (language-specific) phonetic features. Second, an accent effect was observed in stressed and unstressed syllables, and was independent of prosodic boundary size. The domain of accentuation in Dutch is thus larger than the foot. Third, within a prosodic category consisting of those utterances with a boundary tone but no pause, tokens with syntactically defined Phonological Phrase boundaries could be differentiated from the other tokens. This syntactic influence on prosodic phrasing implies the existence of an intermediate-level phrase in the prosodic hierarchy of Dutch.
  • Cutler, A., McQueen, J. M., & Norris, D. (2005). The lexical utility of phoneme-category plasticity. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 103-107).
  • Eisner, F., & McQueen, J. M. (2005). The specificity of perceptual learning in speech processing. Perception & Psychophysics, 67(2), 224-238.

    Abstract

    We conducted four experiments to investigate the specificity of perceptual adjustments made to unusual speech sounds. Dutch listeners heard a female talker produce an ambiguous fricative [?] (between [f] and [s]) in [f]- or [s]-biased lexical contexts. Listeners with [f]-biased exposure (e.g., [witlo?]; from witlof, “chicory”; witlos is meaningless) subsequently categorized more sounds on an [ɛf]–[ɛs] continuum as [f] than did listeners with [s]-biased exposure. This occurred when the continuum was based on the exposure talker's speech (Experiment 1), and when the same test fricatives appeared after vowels spoken by novel female and male talkers (Experiments 1 and 2). When the continuum was made entirely from a novel talker's speech, there was no exposure effect (Experiment 3) unless fricatives from that talker had been spliced into the exposure talker's speech during exposure (Experiment 4). We conclude that perceptual learning about idiosyncratic speech is applied at a segmental level and is, under these exposure conditions, talker specific.
  • McQueen, J. M. (2005). Speech perception. In K. Lamberts, & R. Goldstone (Eds.), The Handbook of Cognition (pp. 255-275). London: Sage Publications.
  • McQueen, J. M. (2005). Spoken word recognition and production: Regular but not inseparable bedfellows. In A. Cutler (Ed.), Twenty-first century psycholinguistics: Four cornerstones (pp. 229-244). Mahwah, NJ: Erlbaum.
  • McQueen, J. M., & Sereno, J. (2005). Cleaving automatic processes from strategic biases in phonological priming. Memory & Cognition, 33(7), 1185-1209.

    Abstract

    In a phonological priming experiment using spoken Dutch words, Dutch listeners were taught varying expectancies and relatedness relations about the phonological form of target words, given particular primes. They learned to expect that, after a particular prime, if the target was a word, it would be from a specific phonological category. The expectancy either involved phonological overlap (e.g., honk-vonk, “base-spark”; expected related) or did not (e.g., nest-galm, “nest-boom”; expected unrelated, where the learned expectation after hearing nest was a word rhyming in -alm). Targets were occasionally inconsistent with expectations. In these inconsistent expectancy trials, targets were either unrelated (e.g., honk-mest, “base-manure”; unexpected unrelated), where the listener was expecting a related target, or related (e.g., nest-pest, “nest-plague”; unexpected related), where the listener was expecting an unrelated target. Participant expectations and phonological relatedness were thus manipulated factorially for three types of phonological overlap (rhyme, one onset phoneme, and three onset phonemes) at three interstimulus intervals (ISIs; 50, 500, and 2,000 msec). Lexical decisions to targets revealed evidence of expectancy-based strategies for all three types of overlap (e.g., faster responses to expected than to unexpected targets, irrespective of phonological relatedness) and evidence of automatic phonological processes, but only for the rhyme and three-phoneme onset overlap conditions and, most strongly, at the shortest ISI (e.g., faster responses to related than to unrelated targets, irrespective of expectations). Although phonological priming thus has both automatic and strategic components, it is possible to cleave them apart.
  • McQueen, J. M., & Mitterer, H. (2005). Lexically-driven perceptual adjustments of vowel categories. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 233-236).
  • Scharenborg, O., Norris, D., Ten Bosch, L., & McQueen, J. M. (2005). How should a speech recognizer work? Cognitive Science, 29(6), 867-918. doi:10.1207/s15516709cog0000_37.

    Abstract

    Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that research in these related fields has focused on the mechanics of how speech can be recognized. In Marr's (1982) terms, emphasis has been on the algorithmic and implementational levels rather than on the computational level. In this article, we provide a computational-level analysis of the task of speech recognition, which reveals the close parallels between research concerned with HSR and ASR. We illustrate this relation by presenting a new computational model of human spoken-word recognition, built using techniques from the field of ASR that, in contrast to existing models of HSR, recognizes words from real speech input.
  • Warner, N., Smits, R., McQueen, J. M., & Cutler, A. (2005). Phonological and statistical effects on timing of speech perception: Insights from a database of Dutch diphone perception. Speech Communication, 46(1), 53-72. doi:10.1016/j.specom.2005.01.003.

    Abstract

    We report detailed analyses of a very large database on timing of speech perception collected by Smits et al. (Smits, R., Warner, N., McQueen, J.M., Cutler, A., 2003. Unfolding of phonetic information over time: A database of Dutch diphone perception. J. Acoust. Soc. Am. 113, 563–574). Eighteen listeners heard all possible diphones of Dutch, gated in portions of varying size and presented without background noise. The present report analyzes listeners’ responses across gates in terms of phonological features (voicing, place, and manner for consonants; height, backness, and length for vowels). The resulting patterns for feature perception differ from patterns reported when speech is presented in noise. The data are also analyzed for effects of stress and of phonological context (neighboring vowel vs. consonant); effects of these factors are observed to be surprisingly limited. Finally, statistical effects, such as overall phoneme frequency and transitional probabilities, along with response biases, are examined; these too exercise only limited effects on response patterns. The results suggest highly accurate speech perception on the basis of acoustic information alone.
  • Clifton, Jr., C., Cutler, A., McQueen, J. M., & Van Ooijen, B. (1999). The processing of inflected forms [Commentary on H. Clahsen: Lexical entries and rules of language]. Behavioral and Brain Sciences, 22, 1018-1019.

    Abstract

    Clahsen proposes two distinct processing routes, for regularly and irregularly inflected forms, respectively, and thus is apparently making a psychological claim. We argue that his position, which embodies a strictly linguistic perspective, does not constitute a psychological processing model.
  • McQueen, J. M., Norris, D., & Cutler, A. (1999). Lexical influence in phonetic decision-making: Evidence from subcategorical mismatches. Journal of Experimental Psychology: Human Perception and Performance, 25, 1363-1389. doi:10.1037/0096-1523.25.5.1363.

    Abstract

    In 5 experiments, listeners heard words and nonwords, some cross-spliced so that they contained acoustic-phonetic mismatches. Performance was worse on mismatching than on matching items. Words cross-spliced with words and words cross-spliced with nonwords produced parallel results. However, in lexical decision and 1 of 3 phonetic decision experiments, performance on nonwords cross-spliced with words was poorer than on nonwords cross-spliced with nonwords. A gating study confirmed that there were misleading coarticulatory cues in the cross-spliced items; a sixth experiment showed that the earlier results were not due to interitem differences in the strength of these cues. Three models of phonetic decision making (the Race model, the TRACE model, and a postlexical model) did not explain the data. A new bottom-up model is outlined that accounts for the findings in terms of lexical involvement at a dedicated decision-making stage.
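
Illustrative sketch: single-trial EEG decoding

The two Brandmeyer et al. (2013) decoding papers above rest on the same core technique: a linear classifier is trained on labeled single-trial EEG epochs and evaluated with cross-validation, and its per-trial output can then serve as an online index of perception. The sketch below is a minimal illustration of that general approach only, not the authors' actual pipeline; it uses synthetic data, and all dimensions, channel choices and effect sizes are hypothetical.

    # Minimal sketch of single-trial decoding (illustrative only; synthetic
    # data stand in for EEG epochs from an auditory mismatch paradigm).
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    n_trials, n_channels, n_samples = 400, 32, 100   # hypothetical sizes
    X = rng.normal(size=(n_trials, n_channels, n_samples))
    y = rng.integers(0, 2, size=n_trials)            # 0 = standard, 1 = deviant

    # Inject a small class-dependent deflection into the first 8 channels in
    # a mid-epoch window, so the two classes are weakly separable (a crude
    # stand-in for an MMN-like difference).
    X[y == 1, :8, 40:70] -= 0.3

    # Flatten each epoch into a feature vector, z-score the features, and fit
    # a regularized linear classifier; accuracy is estimated with 5-fold
    # stratified cross-validation.
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    scores = cross_val_score(clf, X.reshape(n_trials, -1), y, cv=5)
    print(f"cross-validated accuracy: {scores.mean():.2f}")

In an online neurofeedback or BCI setting of the kind these papers describe, the quantity of interest would be the fitted classifier's graded per-trial output (e.g., its decision value) rather than aggregate cross-validated accuracy.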
