Peter Hagoort

Publications

  • Coopmans, C. W., De Hoop, H., Tezcan, F., Hagoort, P., & Martin, A. E. (2025). Language-specific neural dynamics extend syntax into the time domain. PLOS Biology, 23: e3002968. doi:10.1371/journal.pbio.3002968.

    Abstract

    Studies of perception have long shown that the brain adds information to its sensory analysis of the physical environment. A touchstone example for humans is language use: to comprehend a physical signal like speech, the brain must add linguistic knowledge, including syntax. Yet, syntactic rules and representations are widely assumed to be atemporal (i.e., abstract and not bound by time), so they must be translated into time-varying signals for speech comprehension and production. Here, we test 3 different models of the temporal spell-out of syntactic structure against brain activity of people listening to Dutch stories: an integratory bottom-up parser, a predictive top-down parser, and a mildly predictive left-corner parser. These models build exactly the same structure but differ in when syntactic information is added by the brain—this difference is captured in the (temporal distribution of the) complexity metric “incremental node count.” Using temporal response function models with both acoustic and information-theoretic control predictors, node counts were regressed against source-reconstructed delta-band activity acquired with magnetoencephalography. Neural dynamics in left frontal and temporal regions most strongly reflect node counts derived by the top-down method, which postulates syntax early in time, suggesting that predictive structure building is an important component of Dutch sentence comprehension. The absence of strong effects of the left-corner model further suggests that its mildly predictive strategy does not represent Dutch language comprehension well, in contrast to what has been found for English. Understanding when the brain projects its knowledge of syntax onto speech, and whether this is done in language-specific ways, will inform and constrain the development of mechanistic models of syntactic structure building in the brain.
  • Ferrari, A., & Hagoort, P. (2025). Beat gestures and prosodic prominence interactively influence language comprehension. Cognition, 256: 106049. doi:10.1016/j.cognition.2024.106049.

    Abstract

    Face-to-face communication is not only about ‘what’ is said but also ‘how’ it is said, both in speech and bodily signals. Beat gestures are rhythmic hand movements that typically accompany prosodic prominence in conversation. Yet, it is still unclear how beat gestures influence language comprehension. On the one hand, beat gestures may share the same functional role of focus markers as prosodic prominence. Accordingly, they would drive attention towards the concurrent speech and highlight its content. On the other hand, beat gestures may trigger inferences of high speaker confidence, generate the expectation that the sentence content is correct and thereby elicit the commitment to the truth of the statement. This study directly disentangled the two hypotheses by evaluating additive and interactive effects of prosodic prominence and beat gestures on language comprehension. Participants watched videos of a speaker uttering sentences and judged whether each sentence was true or false. Sentences sometimes contained a world knowledge violation that may go unnoticed (‘semantic illusion’). Combining beat gestures with prosodic prominence led to a higher degree of semantic illusion, making more world knowledge violations go unnoticed during language comprehension. These results challenge current theories proposing that beat gestures are visual focus markers. To the contrary, they suggest that beat gestures automatically trigger inferences of high speaker confidence and thereby elicit the commitment to the truth of the statement, in line with Grice’s cooperative principle in conversation. More broadly, our findings also highlight the influence of metacognition on language comprehension in face-to-face communication.
  • Mishra, C., Skantze, G., Hagoort, P., & Verdonschot, R. G. (2025). Perception of emotions in human and robot faces: Is the eye region enough? In O. Palinko, L. Bodenhagen, J.-J. Cabibihan, K. Fischer, S. Šabanović, K. Winkle, L. Behera, S. S. Ge, D. Chrysostomou, W. Jiang, & H. He (Eds.), Social Robotics: 16th International Conference, ICSR + AI 2024, Odense, Denmark, October 23–26, 2024, Proceedings (pp. 290-303). Singapore: Springer.

    Abstract

    The increased interest in developing next-gen social robots has raised questions about the factors affecting the perception of robot emotions. This study investigates the impact of robot appearances (human-like, mechanical) and face regions (full-face, eye-region) on human perception of robot emotions. A between-subjects user study (N = 305) was conducted where participants were asked to identify the emotions being displayed in videos of robot faces, as well as a human baseline. Our findings reveal three important insights for effective social robot face design in Human-Robot Interaction (HRI): Firstly, robots equipped with a back-projected, fully animated face – regardless of whether they are more human-like or more mechanical-looking – demonstrate a capacity for emotional expression comparable to that of humans. Secondly, the recognition accuracy of emotional expressions in both humans and robots declines when only the eye region is visible. Lastly, within the constraint of only the eye region being visible, robots with more human-like features significantly enhance emotion recognition.
  • Slivac, K., Hagoort, P., & Flecken, M. (2025). Cognitive and neural mechanisms of linguistic influence on perception. Psychological Review. Advance online publication. doi:10.1037/rev0000546.

    Abstract

    To date, research has reliably shown that language can engage and modify perceptual processes in a top-down manner. However, our understanding of the cognitive and neural mechanisms underlying such top-down influences is still under debate. In this review, we provide an overview of findings from literature investigating the organization of semantic networks in the brain (spontaneous engagement of the visual system while processing linguistic information), and linguistic cueing studies (looking at the immediate effects of language on the perception of a visual target), in an effort to isolate such mechanisms. Additionally, we connect the findings from linguistic cueing studies to those reported in (nonlinguistic) literature on priors in perception, in order to find commonalities in neural processes allowing for top-down influences on perception. In doing so, we discuss the effects of language on perception in the context of broader, general cognitive and neural principles. Finally, we propose a way forward in the study of linguistic influences on perception.
  • Zora, H., Kabak, B., & Hagoort, P. (2025). Relevance of prosodic focus and lexical stress for discourse comprehension in Turkish: Evidence from psychometric and electrophysiological data. Journal of Cognitive Neuroscience, 37(3), 693-736. doi:10.1162/jocn_a_02262.

    Abstract

    Prosody underpins various linguistic domains ranging from semantics and syntax to discourse. For instance, prosodic information in the form of lexical stress modifies meanings and, as such, syntactic contexts of words as in Turkish kaz-má "pickaxe" (noun) versus káz-ma "do not dig" (imperative). Likewise, prosody indicates the focused constituent of an utterance as the noun phrase filling the wh-spot in a dialogue like What did you eat? I ate----. In the present study, we investigated the relevance of such prosodic variations for discourse comprehension in Turkish. We aimed at answering how lexical stress and prosodic focus mismatches on critical noun phrases (resulting in grammatical anomalies involving both semantics and syntax and discourse-level anomalies, respectively) affect the perceived correctness of an answer to a question in a given context. To that end, 80 native speakers of Turkish, 40 participating in a psychometric experiment and 40 participating in an EEG experiment, were asked to judge the acceptability of prosodic mismatches that occur either separately or concurrently. Psychometric results indicated that lexical stress mismatch led to a lower correctness score than prosodic focus mismatch, and combined mismatch received the lowest score. Consistent with the psychometric data, EEG results revealed an N400 effect to combined mismatch, and this effect was followed by a P600 response to lexical stress mismatch. Conjointly, these results suggest that every source of prosodic information is immediately available and codetermines the interpretation of an utterance; however, semantically and syntactically relevant lexical stress information is assigned more significance by the language comprehension system compared with prosodic focus information.
  • Arana, S., Hagoort, P., Schoffelen, J.-M., & Rabovsky, M. (2024). Perceived similarity as a window into representations of integrated sentence meaning. Behavior Research Methods, 56(3), 2675-2691. doi:10.3758/s13428-023-02129-x.

    Abstract

    When perceiving the world around us, we are constantly integrating pieces of information. The integrated experience consists of more than just the sum of its parts. For example, visual scenes are defined by a collection of objects as well as the spatial relations amongst them and sentence meaning is computed based on individual word semantics but also syntactic configuration. Having quantitative models of such integrated representations can help evaluate cognitive models of both language and scene perception. Here, we focus on language, and use a behavioral measure of perceived similarity as an approximation of integrated meaning representations. We collected similarity judgments of 200 subjects rating nouns or transitive sentences through an online multiple arrangement task. We find that perceived similarity between sentences is most strongly modulated by the semantic action category of the main verb. In addition, we show how non-negative matrix factorization of similarity judgment data can reveal multiple underlying dimensions reflecting both semantic as well as relational role information. Finally, we provide an example of how similarity judgments on sentence stimuli can serve as a point of comparison for artificial neural network models (ANNs) by comparing our behavioral data against sentence similarity extracted from three state-of-the-art ANNs. Overall, our method combining the multiple arrangement task on sentence stimuli with matrix factorization can capture relational information emerging from integration of multiple words in a sentence even in the presence of strong focus on the verb.
  • Arana, S., Pesnot Lerousseau, J., & Hagoort, P. (2024). Deep learning models to study sentence comprehension in the human brain. Language, Cognition and Neuroscience, 39(8), 972-990. doi:10.1080/23273798.2023.2198245.

    Abstract

    Recent artificial neural networks that process natural language achieve unprecedented performance in tasks requiring sentence-level understanding. As such, they could be interesting models of the integration of linguistic information in the human brain. We review works that compare these artificial language models with human brain activity and we assess the extent to which this approach has improved our understanding of the neural processes involved in natural language comprehension. Two main results emerge. First, the neural representation of word meaning aligns with the context-dependent, dense word vectors used by the artificial neural networks. Second, the processing hierarchy that emerges within artificial neural networks broadly matches the brain, but is surprisingly inconsistent across studies. We discuss current challenges in establishing artificial neural networks as process models of natural language comprehension. We suggest exploiting the highly structured representational geometry of artificial neural networks when mapping representations to brain data.

    Additional information

    link to preprint
  • Bulut, T., & Hagoort, P. (2024). Contributions of the left and right thalami to language: A meta-analytic approach. Brain Structure & Function, 229, 2149-2166. doi:10.1007/s00429-024-02795-3.

    Abstract

    Background: Despite a pervasive cortico-centric view in cognitive neuroscience, subcortical structures including the thalamus have been shown to be increasingly involved in higher cognitive functions. Previous structural and functional imaging studies demonstrated cortico-thalamo-cortical loops which may support various cognitive functions including language. However, large-scale functional connectivity of the thalamus during language tasks has not been examined before. Methods: The present study employed meta-analytic connectivity modeling to identify language-related coactivation patterns of the left and right thalami. The left and right thalami were used as regions of interest to search the BrainMap functional database for neuroimaging experiments with healthy participants reporting language-related activations in each region of interest. Activation likelihood estimation analyses were then carried out on the foci extracted from the identified studies to estimate functional convergence for each thalamus. A functional decoding analysis based on the same database was conducted to characterize thalamic contributions to different language functions. Results: The results revealed bilateral frontotemporal and bilateral subcortical (basal ganglia) coactivation patterns for both the left and right thalami, and also right cerebellar coactivations for the left thalamus, during language processing. In light of previous empirical studies and theoretical frameworks, the present connectivity and functional decoding findings suggest that cortico-subcortical-cerebellar-cortical loops modulate and fine-tune information transfer within the bilateral frontotemporal cortices during language processing, especially during production and semantic operations, but also other language (e.g., syntax, phonology) and cognitive operations (e.g., attention, cognitive control). 
    Conclusion: The current findings show that the language-relevant network extends beyond the classical left perisylvian cortices and spans bilateral cortical, bilateral subcortical (bilateral thalamus, bilateral basal ganglia) and right cerebellar regions.

    Additional information

    supplementary information
  • Fitz, H., Hagoort, P., & Petersson, K. M. (2024). Neurobiological causal models of language processing. Neurobiology of Language, 5(1), 225-247. doi:10.1162/nol_a_00133.

    Abstract

    The language faculty is physically realized in the neurobiological infrastructure of the human brain. Despite significant efforts, an integrated understanding of this system remains a formidable challenge. What is missing from most theoretical accounts is a specification of the neural mechanisms that implement language function. Computational models that have been put forward generally lack an explicit neurobiological foundation. We propose a neurobiologically informed causal modeling approach which offers a framework for how to bridge this gap. A neurobiological causal model is a mechanistic description of language processing that is grounded in, and constrained by, the characteristics of the neurobiological substrate. It intends to model the generators of language behavior at the level of implementational causality. We describe key features and neurobiological component parts from which causal models can be built and provide guidelines on how to implement them in model simulations. Then we outline how this approach can shed new light on the core computational machinery for language, the long-term storage of words in the mental lexicon and combinatorial processing in sentence comprehension. In contrast to cognitive theories of behavior, causal models are formulated in the “machine language” of neurobiology which is universal to human cognition. We argue that neurobiological causal modeling should be pursued in addition to existing approaches. Eventually, this approach will allow us to develop an explicit computational neurobiology of language.
  • Forkel, S. J., & Hagoort, P. (2024). Redefining language networks: Connectivity beyond localised regions. Brain Structure & Function, 229, 2073-2078. doi:10.1007/s00429-024-02859-4.
  • Giglio, L., Ostarek, M., Sharoh, D., & Hagoort, P. (2024). Diverging neural dynamics for syntactic structure building in naturalistic speaking and listening. Proceedings of the National Academy of Sciences of the United States of America, 121(11): e2310766121. doi:10.1073/pnas.2310766121.

    Abstract

    The neural correlates of sentence production have been mostly studied with constraining task paradigms that introduce artificial task effects. In this study, we aimed to gain a better understanding of syntactic processing in spontaneous production vs. naturalistic comprehension. We extracted word-by-word metrics of phrase-structure building with top-down and bottom-up parsers that make different hypotheses about the timing of structure building. In comprehension, structure building proceeded in an integratory fashion and led to an increase in activity in posterior temporal and inferior frontal areas. In production, structure building was anticipatory and predicted an increase in activity in the inferior frontal gyrus. Newly developed production-specific parsers highlighted the anticipatory and incremental nature of structure building in production, which was confirmed by a converging analysis of the pausing patterns in speech. Overall, the results showed that the unfolding of syntactic processing diverges between speaking and listening.
  • Giglio, L., Sharoh, D., Ostarek, M., & Hagoort, P. (2024). Connectivity of fronto-temporal regions in syntactic structure building during speaking and listening. Neurobiology of Language, 5(4), 922-941. doi:10.1162/nol_a_00154.

    Abstract

    The neural infrastructure for sentence production and comprehension has been found to be mostly shared. The same regions are engaged during speaking and listening, with some differences in how strongly they activate depending on modality. In this study, we investigated how modality affects the connectivity between regions previously found to be involved in syntactic processing across modalities. We determined how constituent size and modality affected the connectivity of the pars triangularis of the left inferior frontal gyrus (LIFG) and of the left posterior temporal lobe (LPTL) with the pars opercularis of the LIFG, the anterior temporal lobe (LATL) and the rest of the brain. We found that constituent size reliably increased the connectivity across these frontal and temporal ROIs. Connectivity between the two LIFG regions and the LPTL was enhanced as a function of constituent size in both modalities, and it was upregulated in production possibly because of linearization and motor planning in the frontal cortex. The connectivity of both ROIs with the LATL was lower and only enhanced for larger constituent sizes, suggesting a contributing role of the LATL in sentence processing in both modalities. These results thus show that the connectivity among fronto-temporal regions is upregulated for syntactic structure building in both sentence production and comprehension, providing further evidence for accounts of shared neural resources for sentence-level processing across modalities.

    Additional information

    supplementary information
  • Giglio, L., Hagoort, P., & Ostarek, M. (2024). Neural encoding of semantic structures during sentence production. Cerebral Cortex, 34(12): bhae482. doi:10.1093/cercor/bhae482.

    Abstract

    The neural representations for compositional processing have so far been mostly studied during sentence comprehension. In an fMRI study of sentence production, we investigated the brain representations for compositional processing during speaking. We used a rapid serial visual presentation sentence recall paradigm to elicit sentence production from the conceptual memory of an event. With voxel-wise encoding models, we probed the specificity of the compositional structure built during the production of each sentence, comparing an unstructured model of word meaning without relational information with a model that encodes abstract thematic relations and a model encoding event-specific relational structure. Whole-brain analyses revealed that sentence meaning at different levels of specificity was encoded in a large left frontal-parietal-temporal network. A comparison with semantic structures composed during the comprehension of the same sentences showed similarly distributed brain activity patterns. An ROI analysis over left fronto-temporal language parcels showed that event-specific relational structure above word-specific information was encoded in the left inferior frontal gyrus. Overall, we found evidence for the encoding of sentence meaning during sentence production in a distributed brain network and for the encoding of event-specific semantic structures in the left inferior frontal gyrus.

    Additional information

    supplementary information
  • Hagoort, P., & Özyürek, A. (2024). Extending the architecture of language from a multimodal perspective. Topics in Cognitive Science. Advance online publication. doi:10.1111/tops.12728.

    Abstract

    Language is inherently multimodal. In spoken languages, combined spoken and visual signals (e.g., co-speech gestures) are an integral part of linguistic structure and language representation. This requires an extension of the parallel architecture, which needs to include the visual signals concomitant to speech. We present the evidence for the multimodality of language. In addition, we propose that distributional semantics might provide a format for integrating speech and co-speech gestures in a common semantic representation.
  • Murphy, E., Rollo, P. S., Segaert, K., Hagoort, P., & Tandon, N. (2024). Multiple dimensions of syntactic structure are resolved earliest in posterior temporal cortex. Progress in Neurobiology, 241: 102669. doi:10.1016/j.pneurobio.2024.102669.

    Abstract

    How we combine minimal linguistic units into larger structures remains an unresolved topic in neuroscience. Language processing involves the abstract construction of ‘vertical’ and ‘horizontal’ information simultaneously (e.g., phrase structure, morphological agreement), but previous paradigms have been constrained in isolating only one type of composition and have utilized poor spatiotemporal resolution. Using intracranial recordings, we report multiple experiments designed to separate phrase structure from morphosyntactic agreement. Epilepsy patients (n = 10) were presented with auditory two-word phrases grouped into pseudoword-verb (‘trab run’) and pronoun-verb either with or without Person agreement (‘they run’ vs. ‘they runs’). Phrase composition and Person violations both resulted in significant increases in broadband high gamma activity approximately 300 ms after verb onset in posterior middle temporal gyrus (pMTG) and posterior superior temporal sulcus (pSTS), followed by inferior frontal cortex (IFC) at 500 ms. While sites sensitive to only morphosyntactic violations were distributed, those sensitive to both composition types were generally confined to pSTS/pMTG and IFC. These results indicate that posterior temporal cortex shows the earliest sensitivity for hierarchical linguistic structure across multiple dimensions, providing neural resources for distinct windows of composition. This region is comprised of sparsely interwoven heterogeneous constituents that afford cortical search spaces for dissociable syntactic relations.
  • Seijdel, N., Schoffelen, J.-M., Hagoort, P., & Drijvers, L. (2024). Attention drives visual processing and audiovisual integration during multimodal communication. The Journal of Neuroscience, 44(10): e0870232023. doi:10.1523/JNEUROSCI.0870-23.2023.

    Abstract

    During communication in real-life settings, our brain often needs to integrate auditory and visual information, and at the same time actively focus on the relevant sources of information, while ignoring interference from irrelevant events. The interaction between integration and attention processes remains poorly understood. Here, we use rapid invisible frequency tagging (RIFT) and magnetoencephalography (MEG) to investigate how attention affects auditory and visual information processing and integration, during multimodal communication. We presented human participants (male and female) with videos of an actress uttering action verbs (auditory; tagged at 58 Hz) accompanied by two movie clips of hand gestures on both sides of fixation (attended stimulus tagged at 65 Hz; unattended stimulus tagged at 63 Hz). Integration difficulty was manipulated by a lower-order auditory factor (clear/degraded speech) and a higher-order visual semantic factor (matching/mismatching gesture). We observed an enhanced neural response to the attended visual information during degraded speech compared to clear speech. For the unattended information, the neural response to mismatching gestures was enhanced compared to matching gestures. Furthermore, signal power at the intermodulation frequencies of the frequency tags, indexing non-linear signal interactions, was enhanced in left frontotemporal and frontal regions. Focusing on LIFG (Left Inferior Frontal Gyrus), this enhancement was specific for the attended information, for those trials that benefitted from integration with a matching gesture. Together, our results suggest that attention modulates audiovisual processing and interaction, depending on the congruence and quality of the sensory input.

    Additional information

    link to preprint
  • Terporten, R., Huizeling, E., Heidlmayr, K., Hagoort, P., & Kösem, A. (2024). The interaction of context constraints and predictive validity during sentence reading. Journal of Cognitive Neuroscience, 36(2), 225-238. doi:10.1162/jocn_a_02082.

    Abstract

    Words are not processed in isolation; instead, they are commonly embedded in phrases and sentences. The sentential context influences the perception and processing of a word. However, how this is achieved by brain processes and whether predictive mechanisms underlie this process remain a debated topic. Here, we employed an experimental paradigm in which we orthogonalized sentence context constraints and predictive validity, which was defined as the ratio of congruent to incongruent sentence endings within the experiment. While recording electroencephalography, participants read sentences with three levels of sentential context constraints (high, medium, and low). Participants were also separated into two groups that differed in their ratio of valid congruent to incongruent target words that could be predicted from the sentential context. For both groups, we investigated modulations of alpha power before, and N400 amplitude modulations after target word onset. The results reveal that the N400 amplitude gradually decreased with higher context constraints and cloze probability. In contrast, alpha power was not significantly affected by context constraint. Neither the N400 nor alpha power was significantly affected by changes in predictive validity.
  • Verdonschot, R. G., Van der Wal, J., Lewis, A. G., Knudsen, B., Von Grebmer zu Wolfsthurn, S., Schiller, N. O., & Hagoort, P. (2024). Information structure in Makhuwa: Electrophysiological evidence for a universal processing account. Proceedings of the National Academy of Sciences of the United States of America, 121(30): e2315438121. doi:10.1073/pnas.2315438121.

    Abstract

    There is evidence from both behavior and brain activity that the way information is structured, through the use of focus, can up-regulate processing of focused constituents, likely to give prominence to the relevant aspects of the input. This is hypothesized to be universal, regardless of the different ways in which languages encode focus. In order to test this universalist hypothesis, we need to go beyond the more familiar linguistic strategies for marking focus, such as by means of intonation or specific syntactic structures (e.g., it-clefts). Therefore, in this study, we examine Makhuwa-Enahara, a Bantu language spoken in northern Mozambique, which uniquely marks focus through verbal conjugation. The participants were presented with sentences that consisted of either a semantically anomalous constituent or a semantically nonanomalous constituent. Moreover, focus on this particular constituent could be either present or absent. We observed a consistent pattern: Focused information generated a more negative N400 response than the same information in nonfocus position. This demonstrates that regardless of how focus is marked, its consequence seems to result in an upregulation of processing of information that is in focus.

    Additional information

    supplementary materials
  • Zora, H., Bowin, H., Heldner, M., Riad, T., & Hagoort, P. (2024). The role of pitch accent in discourse comprehension and the markedness of Accent 2 in Central Swedish. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 921-925). doi:10.21437/SpeechProsody.2024-186.

    Abstract

    In Swedish, words are associated with either of two pitch contours known as Accent 1 and Accent 2. Using a psychometric test, we investigated how listeners judge pitch accent violations while interpreting discourse. Forty native speakers of Central Swedish were presented with auditory dialogues, where test words were appropriately or inappropriately accented in a given context, and asked to judge the correctness of sentences containing the test words. Data indicated a statistically significant effect of wrong accent pattern on the correctness judgment. Both Accent 1 and Accent 2 violations interfered with the coherent interpretation of discourse and were judged as incorrect by the listeners. Moreover, there was a statistically significant difference in the perceived correctness between the accent patterns. Accent 2 violations led to a lower correctness score compared to Accent 1 violations, indicating that the listeners were more sensitive to pitch accent violations in Accent 2 words than in Accent 1 words. This result is in line with the notion that Accent 2 is marked and lexically represented in Central Swedish. Taken together, these findings indicate that listeners use both Accent 1 and Accent 2 to arrive at the correct interpretation of the linguistic input, while assigning varying degrees of relevance to them depending on their markedness.
  • Coopmans, C. W., De Hoop, H., Kaushik, K., Hagoort, P., & Martin, A. E. (2022). Hierarchy in language interpretation: Evidence from behavioural experiments and computational modelling. Language, Cognition and Neuroscience, 37(4), 420-439. doi:10.1080/23273798.2021.1980595.

    Abstract

    It has long been recognised that phrases and sentences are organised hierarchically, but many computational models of language treat them as sequences of words without computing constituent structure. Against this background, we conducted two experiments which showed that participants interpret ambiguous noun phrases, such as second blue ball, in terms of their abstract hierarchical structure rather than their linear surface order. When a neural network model was tested on this task, it could simulate such “hierarchical” behaviour. However, when we changed the training data such that they were not entirely unambiguous anymore, the model stopped generalising in a human-like way. It did not systematically generalise to novel items, and when it was trained on ambiguous trials, it strongly favoured the linear interpretation. We argue that these models should be endowed with a bias to make generalisations over hierarchical structure in order to be cognitively adequate models of human language.
  • Coopmans, C. W., De Hoop, H., Hagoort, P., & Martin, A. E. (2022). Effects of structure and meaning on cortical tracking of linguistic units in naturalistic speech. Neurobiology of Language, 3(3), 386-412. doi:10.1162/nol_a_00070.

    Abstract

    Recent research has established that cortical activity “tracks” the presentation rate of syntactic phrases in continuous speech, even though phrases are abstract units that do not have direct correlates in the acoustic signal. We investigated whether cortical tracking of phrase structures is modulated by the extent to which these structures compositionally determine meaning. To this end, we recorded electroencephalography (EEG) of 38 native speakers who listened to naturally spoken Dutch stimuli in different conditions, which parametrically modulated the degree to which syntactic structure and lexical semantics determine sentence meaning. Tracking was quantified through mutual information between the EEG data and either the speech envelopes or abstract annotations of syntax, all of which were filtered in the frequency band corresponding to the presentation rate of phrases (1.1–2.1 Hz). Overall, these mutual information analyses showed stronger tracking of phrases in regular sentences than in stimuli whose lexical-syntactic content is reduced, but no consistent differences in tracking between sentences and stimuli that contain a combination of syntactic structure and lexical content. While there were no effects of compositional meaning on the degree of phrase-structure tracking, analyses of event-related potentials elicited by sentence-final words did reveal meaning-induced differences between conditions. Our findings suggest that cortical tracking of structure in sentences indexes the internal generation of this structure, a process that is modulated by the properties of its input, but not by the compositional interpretation of its output.

    Additional information

    supplementary information
  • Dai, B., McQueen, J. M., Terporten, R., Hagoort, P., & Kösem, A. (2022). Distracting Linguistic Information Impairs Neural Tracking of Attended Speech. Current Research in Neurobiology, 3: 100043. doi:10.1016/j.crneur.2022.100043.

    Abstract

    Listening to speech is difficult in noisy environments, and is even harder when the interfering noise consists of intelligible speech as compared to unintelligible sounds. This suggests that the competing linguistic information interferes with the neural processing of target speech. Interference could either arise from a degradation of the neural representation of the target speech, or from increased representation of distracting speech that enters in competition with the target speech. We tested these alternative hypotheses using magnetoencephalography (MEG) while participants listened to a target clear speech in the presence of distracting noise-vocoded speech. Crucially, the distractors were initially unintelligible but became more intelligible after a short training session. Results showed that the comprehension of the target speech was poorer after training than before training. The neural tracking of target speech in the delta range (1–4 Hz) reduced in strength in the presence of a more intelligible distractor. In contrast, the neural tracking of distracting signals was not significantly modulated by intelligibility. These results suggest that the presence of distracting speech signals degrades the linguistic representation of target speech carried by delta oscillations.
  • Giglio, L., Ostarek, M., Weber, K., & Hagoort, P. (2022). Commonalities and asymmetries in the neurobiological infrastructure for language production and comprehension. Cerebral Cortex, 32(7), 1405-1418. doi:10.1093/cercor/bhab287.

    Abstract

    The neurobiology of sentence production has been largely understudied compared to the neurobiology of sentence comprehension, due to difficulties with experimental control and motion-related artifacts in neuroimaging. We studied the neural response to constituents of increasing size and specifically focused on the similarities and differences in the production and comprehension of the same stimuli. Participants had to either produce or listen to stimuli in a gradient of constituent size based on a visual prompt. Larger constituent sizes engaged the left inferior frontal gyrus (LIFG) and middle temporal gyrus (LMTG) extending to inferior parietal areas in both production and comprehension, confirming that the neural resources for syntactic encoding and decoding are largely overlapping. An ROI analysis in LIFG and LMTG also showed that production elicited larger responses to constituent size than comprehension and that the LMTG was more engaged in comprehension than production, while the LIFG was more engaged in production than comprehension. Finally, increasing constituent size was characterized by later BOLD peaks in comprehension but earlier peaks in production. These results show that syntactic encoding and parsing engage overlapping areas, but there are asymmetries in the engagement of the language network due to the specific requirements of production and comprehension.

    Additional information

    supplementary material
  • Hagoort, P. (2022). Reasoning and the brain. In M. Stokhof, & K. Stenning (Eds.), Rules, regularities, randomness. Festschrift for Michiel van Lambalgen (pp. 83-85). Amsterdam: Institute for Logic, Language and Computation.
  • Heilbron, M., Armeni, K., Schoffelen, J.-M., Hagoort, P., & De Lange, F. P. (2022). A hierarchy of linguistic predictions during natural language comprehension. Proceedings of the National Academy of Sciences of the United States of America, 119(32): e2201968119. doi:10.1073/pnas.2201968119.

    Abstract

    Understanding spoken language requires transforming ambiguous acoustic streams into a hierarchy of representations, from phonemes to meaning. It has been suggested that the brain uses prediction to guide the interpretation of incoming input. However, the role of prediction in language processing remains disputed, with disagreement about both the ubiquity and representational nature of predictions. Here, we address both issues by analyzing brain recordings of participants listening to audiobooks, and using a deep neural network (GPT-2) to precisely quantify contextual predictions. First, we establish that brain responses to words are modulated by ubiquitous predictions. Next, we disentangle model-based predictions into distinct dimensions, revealing dissociable neural signatures of predictions about syntactic category (parts of speech), phonemes, and semantics. Finally, we show that high-level (word) predictions inform low-level (phoneme) predictions, supporting hierarchical predictive processing. Together, these results underscore the ubiquity of prediction in language processing, showing that the brain spontaneously predicts upcoming language at multiple levels of abstraction.

    Additional information

    supporting information
  • Hoeksema, N., Hagoort, P., & Vernes, S. C. (2022). Piecing together the building blocks of the vocal learning bat brain. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 294-296). Nijmegen: Joint Conference on Language Evolution (JCoLE).
  • Huizeling, E., Arana, S., Hagoort, P., & Schoffelen, J.-M. (2022). Lexical frequency and sentence context influence the brain’s response to single words. Neurobiology of Language, 3(1), 149-179. doi:10.1162/nol_a_00054.

    Abstract

    Typical adults read remarkably quickly. Such fast reading is facilitated by brain processes that are sensitive to both word frequency and contextual constraints. It is debated as to whether these attributes have additive or interactive effects on language processing in the brain. We investigated this issue by analysing existing magnetoencephalography data from 99 participants reading intact and scrambled sentences. Using a cross-validated model comparison scheme, we found that lexical frequency predicted the word-by-word elicited MEG signal in a widespread cortical network, irrespective of sentential context. In contrast, index (ordinal word position) was more strongly encoded in sentence words, in left fronto-temporal areas. This confirms that frequency influences word processing independently of predictability, and that contextual constraints affect word-by-word brain responses. With a conservative multiple comparisons correction, only the interaction between lexical frequency and surprisal survived, in anterior temporal and frontal cortex, and not between lexical frequency and entropy, nor between lexical frequency and index. However, interestingly, the uncorrected index*frequency interaction revealed an effect in left frontal and temporal cortex that reversed in time and space for intact compared to scrambled sentences. Finally, we provide evidence to suggest that, in sentences, lexical frequency and predictability may independently influence early (<150ms) and late stages of word processing, but interact during later stages of word processing (>150-250ms), thus helping to converge previous contradictory eye-tracking and electrophysiological literature. Current neuro-cognitive models of reading would benefit from accounting for these differing effects of lexical frequency and predictability on different stages of word processing.
  • Huizeling, E., Peeters, D., & Hagoort, P. (2022). Prediction of upcoming speech under fluent and disfluent conditions: Eye tracking evidence from immersive virtual reality. Language, Cognition and Neuroscience, 37(4), 481-508. doi:10.1080/23273798.2021.1994621.

    Abstract

    Traditional experiments indicate that prediction is important for efficient speech processing. In three virtual reality visual world paradigm experiments, we tested whether such findings hold in naturalistic settings (Experiment 1) and provided novel insights into whether disfluencies in speech (repairs/hesitations) inform one’s predictions in rich environments (Experiments 2–3). Experiment 1 supports that listeners predict upcoming speech in naturalistic environments, with higher proportions of anticipatory target fixations in predictable compared to unpredictable trials. In Experiments 2–3, disfluencies reduced anticipatory fixations towards predicted referents, compared to conjunction (Experiment 2) and fluent (Experiment 3) sentences. Unexpectedly, Experiment 2 provided no evidence that participants made new predictions from a repaired verb. Experiment 3 provided novel findings that fixations towards the speaker increase upon hearing a hesitation, supporting current theories of how hesitations influence sentence processing. Together, these findings unpack listeners’ use of visual (objects/speaker) and auditory (speech/disfluencies) information when predicting upcoming words.
  • Lai, V. T., Van Berkum, J. J. A., & Hagoort, P. (2022). Negative affect increases reanalysis of conflicts between discourse context and world knowledge. Frontiers in Communication, 7: 910482. doi:10.3389/fcomm.2022.910482.

    Abstract

    Introduction: Mood is a constant in our daily life and can permeate all levels of cognition. We examined whether and how mood influences the processing of discourse content that is relatively neutral and not loaded with emotion. During discourse processing, readers have to constantly strike a balance between what they know in long term memory and what the current discourse is about. Our general hypothesis is that mood states would affect this balance. We hypothesized that readers in a positive mood would rely more on default world knowledge, whereas readers in a negative mood would be more inclined to analyze the details in the current discourse.

    Methods: Participants were put in a positive and a negative mood via film clips, one week apart. In each session, after mood manipulation, they were presented with sentences in discourse materials. We created sentences such as “With the lights on you can see...” that end with critical words (CWs) “more” or “less”, where general knowledge supports “more”, not “less”. We then embedded each of these sentences in a wider discourse that does/does not support the CWs (a story about driving in the night vs. stargazing). EEG was recorded throughout.

    Results: The results showed that first, mood manipulation was successful in that there was a significant mood difference between sessions. Second, mood did not modulate the N400 effects. Participants in both moods detected outright semantic violations and allowed world knowledge to be overridden by discourse context. Third, mood modulated the LPC (Late Positive Component) effects, distributed in the frontal region. In negative moods, the LPC was sensitive to one-level violation. That is, CWs that were supported by only world knowledge, only discourse, and neither, elicited larger frontal LPCs, in comparison to the condition where CWs were supported by both world knowledge and discourse.

    Discussion: These results suggest that mood does not influence all processes involved in discourse processing. Specifically, mood does not influence lexical-semantic retrieval (N400), but it does influence elaborative processes for sensemaking (P600) during discourse processing. These results advance our understanding of the impact and time course of mood on discourse.

    Additional information

    Table 1.XLSX
  • Murphy, E., Woolnough, O., Rollo, P. S., Roccaforte, Z., Segaert, K., Hagoort, P., & Tandon, N. (2022). Minimal phrase composition revealed by intracranial recordings. The Journal of Neuroscience, 42(15), 3216-3227. doi:10.1523/JNEUROSCI.1575-21.2022.

    Abstract

    The ability to comprehend phrases is an essential integrative property of the brain. Here we evaluate the neural processes that enable the transition from single word processing to a minimal compositional scheme. Previous research has reported conflicting timing effects of composition, and disagreement persists with respect to inferior frontal and posterior temporal contributions. To address these issues, 19 patients (10 male, 9 female) implanted with penetrating depth or surface subdural intracranial electrodes heard auditory recordings of adjective-noun, pseudoword-noun and adjective-pseudoword phrases and judged whether the phrase matched a picture. Stimulus-dependent alterations in broadband gamma activity, low frequency power and phase-locking values across the language-dominant left hemisphere were derived. This revealed a mosaic located on the lower bank of the posterior superior temporal sulcus (pSTS), in which closely neighboring cortical sites displayed exclusive sensitivity to either lexicality or phrase structure, but not both. Distinct timings were found for effects of phrase composition (210–300 ms) and pseudoword processing (approximately 300–700 ms), and these were localized to neighboring electrodes in pSTS. The pars triangularis and temporal pole encoded anticipation of composition in broadband low frequencies, and both regions exhibited greater functional connectivity with pSTS during phrase composition. Our results suggest that the pSTS is a highly specialized region comprised of sparsely interwoven heterogeneous constituents that encodes both lower and higher level linguistic features. This hub in pSTS for minimal phrase processing may form the neural basis for the human-specific computational capacity for forming hierarchically organized linguistic structures.
  • Udden, J., Hulten, A., Schoffelen, J.-M., Lam, N. H. L., Harbusch, K., Van den Bosch, A., Kempen, G., Petersson, K. M., & Hagoort, P. (2022). Supramodal sentence processing in the human brain: fMRI evidence for the influence of syntactic complexity in more than 200 participants. Neurobiology of Language, 3(4), 575-598. doi:10.1162/nol_a_00076.

    Abstract

    This study investigated two questions. One is: To what degree is sentence processing beyond single words independent of the input modality (speech vs. reading)? The second question is: Which parts of the network recruited by both modalities are sensitive to syntactic complexity? These questions were investigated by having more than 200 participants read or listen to well-formed sentences or series of unconnected words. A largely left-hemisphere frontotemporoparietal network was found to be supramodal in nature, i.e., independent of input modality. In addition, the left inferior frontal gyrus (LIFG) and the left posterior middle temporal gyrus (LpMTG) were most clearly associated with left-branching complexity. The left anterior temporal lobe (LaTL) showed the greatest sensitivity to sentences that differed in right-branching complexity. Moreover, activity in LIFG and LpMTG increased from sentence onset to end, in parallel with an increase of the left-branching complexity. While LIFG, bilateral anterior temporal lobe, posterior MTG, and left inferior parietal lobe (LIPL) all contribute to the supramodal unification processes, the results suggest that these regions differ in their respective contributions to syntactic complexity related processing. The consequences of these findings for neurobiological models of language processing are discussed.

    Additional information

    supporting information
  • Vernes, S. C., Devanna, P., Hörpel, S. G., Alvarez van Tussenbroek, I., Firzlaff, U., Hagoort, P., Hiller, M., Hoeksema, N., Hughes, G. M., Lavrichenko, K., Mengede, J., Morales, A. E., & Wiesmann, M. (2022). The pale spear‐nosed bat: A neuromolecular and transgenic model for vocal learning. Annals of the New York Academy of Sciences, 1517, 125-142. doi:10.1111/nyas.14884.

    Abstract

    Vocal learning, the ability to produce modified vocalizations via learning from acoustic signals, is a key trait in the evolution of speech. While extensively studied in songbirds, mammalian models for vocal learning are rare. Bats present a promising study system given their gregarious natures, small size, and the ability of some species to be maintained in captive colonies. We utilize the pale spear-nosed bat (Phyllostomus discolor) and report advances in establishing this species as a tractable model for understanding vocal learning. We have taken an interdisciplinary approach, aiming to provide an integrated understanding across genomics (Part I), neurobiology (Part II), and transgenics (Part III). In Part I, we generated new, high-quality genome annotations of coding genes and noncoding microRNAs to facilitate functional and evolutionary studies. In Part II, we traced connections between auditory-related brain regions and reported neuroimaging to explore the structure of the brain and gene expression patterns to highlight brain regions. In Part III, we created the first successful transgenic bats by manipulating the expression of FoxP2, a speech-related gene. These interdisciplinary approaches are facilitating a mechanistic and evolutionary understanding of mammalian vocal learning and can also contribute to other areas of investigation that utilize P. discolor or bats as study species.

    Additional information

    supplementary materials
