Coopmans, C. W., De Hoop, H., Tezcan, F., Hagoort, P., & Martin, A. E. (2025). Language-specific neural dynamics extend syntax into the time domain. PLOS Biology, 23: e3002968. doi:10.1371/journal.pbio.3002968.
Studies of perception have long shown that the brain adds information to its sensory analysis of the physical environment. A touchstone example for humans is language use: to comprehend a physical signal like speech, the brain must add linguistic knowledge, including syntax. Yet, syntactic rules and representations are widely assumed to be atemporal (i.e., abstract and not bound by time), so they must be translated into time-varying signals for speech comprehension and production. Here, we test 3 different models of the temporal spell-out of syntactic structure against brain activity of people listening to Dutch stories: an integratory bottom-up parser, a predictive top-down parser, and a mildly predictive left-corner parser. These models build exactly the same structure but differ in when syntactic information is added by the brain—this difference is captured in the (temporal distribution of the) complexity metric “incremental node count.” Using temporal response function models with both acoustic and information-theoretic control predictors, node counts were regressed against source-reconstructed delta-band activity acquired with magnetoencephalography. Neural dynamics in left frontal and temporal regions most strongly reflect node counts derived by the top-down method, which postulates syntax early in time, suggesting that predictive structure building is an important component of Dutch sentence comprehension. The absence of strong effects of the left-corner model further suggests that its mildly predictive strategy does not represent Dutch language comprehension well, in contrast to what has been found for English. Understanding when the brain projects its knowledge of syntax onto speech, and whether this is done in language-specific ways, will inform and constrain the development of mechanistic models of syntactic structure building in the brain. -
Van Geert, E., Ding, R., & Wagemans, J. (2025). A cross-cultural comparison of aesthetic preferences for neatly organized compositions: Native Chinese- versus Native Dutch-speaking samples. Empirical Studies of the Arts, 43(1), 250-275. doi:10.1177/02762374241245917.
Do aesthetic preferences for images of neatly organized compositions (e.g., images collected on blogs like Things Organized Neatly©) generalize across cultures? In an earlier study, focusing on stimulus and personal properties related to order and complexity, Western participants indicated their preference for one of two simultaneously presented images (100 pairs). In the current study, we compared the data of the native Dutch-speaking participants from this earlier sample (N = 356) to newly collected data from a native Chinese-speaking sample (N = 220). Overall, aesthetic preferences were quite similar across cultures. When relating preferences for each sample to ratings of order, complexity, soothingness, and fascination collected from a Western, mainly Dutch-speaking sample, the results hint at a cross-culturally consistent preference for images that Western participants rate as more ordered, but a cross-culturally diverse relation between preferences and complexity.
Additional information
VanGeert_Ding_Wagemans_2024suppl_cross cultural comparison of....pdf -
Bonandrini, R., Gornetti, E., & Paulesu, E. (2024). A meta-analytical account of the functional lateralization of the reading network. Cortex, 177, 363-384. doi:10.1016/j.cortex.2024.05.015.
The observation that the neural correlates of reading are left-lateralized is ubiquitous in the cognitive neuroscience and neuropsychological literature. Still, reading is served by a constellation of neural units, and the extent to which these units are consistently left-lateralized is unclear. In this regard, the functional lateralization of the fusiform gyrus is of particular interest, by virtue of its hypothesized role as a “visual word form area”. A quantitative Activation Likelihood Estimation meta-analysis was conducted on activation foci from 35 experiments investigating silent reading, and both a whole-brain and a Bayesian ROI-based approach were used to assess the lateralization of the data submitted to meta-analysis. Perirolandic areas showed the highest level of left-lateralization, the fusiform cortex and the parietal cortex exhibited only a moderate pattern of left-lateralization, while in the occipital and insular cortices and in the cerebellum the lateralization turned out to be the lowest observed. The relatively limited functional lateralization of the fusiform gyrus was further explored in a regression analysis on the lateralization profile of each study. The functional lateralization of the fusiform gyrus during reading was positively associated with the lateralization of the precentral and inferior occipital gyri and negatively associated with the lateralization of the triangular portion of the inferior frontal gyrus and of the temporal pole. Overall, the present data highlight how lateralization patterns differ within the reading network. Furthermore, the present data highlight how the functional lateralization of the fusiform gyrus during reading is related to the degree of functional lateralization of other language brain areas. -
Coopmans, C. W., Mai, A., & Martin, A. E. (2024). “Not” in the brain and behavior. PLOS Biology, 22: e3002656. doi:10.1371/journal.pbio.3002656.
Ding, R., Ten Oever, S., & Martin, A. E. (2024). Delta-band activity underlies referential meaning representation during pronoun resolution. Journal of Cognitive Neuroscience, 36(7), 1472-1492. doi:10.1162/jocn_a_02163.
Human language offers a variety of ways to create meaning, one of which is referring to entities, objects, or events in the world. One such meaning maker is understanding to whom or to what a pronoun in a discourse refers. To understand a pronoun, the brain must access matching entities or concepts that have been encoded in memory from previous linguistic context. Models of language processing propose that internally stored linguistic concepts, accessed via exogenous cues such as phonological input of a word, are represented as (a)synchronous activities across a population of neurons active at specific frequency bands. Converging evidence suggests that delta band activity (1–3 Hz) is involved in temporal and representational integration during sentence processing. Moreover, recent advances in the neurobiology of memory suggest that recollection engages neural dynamics similar to those which occurred during memory encoding. Integrating these two research lines, we here tested the hypothesis that neural dynamic patterns, especially in the delta frequency range, underlying referential meaning representation, would be reinstated during pronoun resolution. By leveraging neural decoding techniques (i.e., representational similarity analysis) on a magnetoencephalogram data set acquired during a naturalistic story-listening task, we provide evidence that delta-band activity underlies referential meaning representation. Our findings suggest that, during spoken language comprehension, endogenous linguistic representations such as referential concepts may be proactively retrieved and represented via activation of their underlying dynamic neural patterns. -
Mai, A., Riès, S., Ben-Haim, S., Shih, J. J., & Gentner, T. Q. (2024). Acoustic and language-specific sources for phonemic abstraction from speech. Nature Communications, 15: 677. doi:10.1038/s41467-024-44844-9.
Spoken language comprehension requires abstraction of linguistic information from speech, but the interaction between auditory and linguistic processing of speech remains poorly understood. Here, we investigate the nature of this abstraction using neural responses recorded intracranially while participants listened to conversational English speech. Capitalizing on multiple, language-specific patterns where phonological and acoustic information diverge, we demonstrate the causal efficacy of the phoneme as a unit of analysis and dissociate the unique contributions of phonemic and spectrographic information to neural responses. Quantitative higher-order response models also reveal that unique contributions of phonological information are carried in the covariance structure of the stimulus-response relationship. This suggests that linguistic abstraction is shaped by neurobiological mechanisms that involve integration across multiple spectro-temporal features and prior phonological information. These results link speech acoustics to phonology and morphosyntax, substantiating predictions about abstractness in linguistic theory and providing evidence for the acoustic features that support that abstraction.
Additional information
supplementary information -
Meinhardt, E., Mai, A., Baković, E., & McCollum, A. (2024). Weak determinism and the computational consequences of interaction. Natural Language & Linguistic Theory, 42, 1191-1232. doi:10.1007/s11049-023-09578-1.
Recent work has claimed that (non-tonal) phonological patterns are subregular (Heinz 2011a,b, 2018; Heinz and Idsardi 2013), occupying a delimited proper subregion of the regular functions—the weakly deterministic (WD) functions (Heinz and Lai 2013; Jardine 2016). Whether or not it is correct (McCollum et al. 2020a), this claim can only be properly assessed given a complete and accurate definition of WD functions. We propose such a definition in this article, patching unintended holes in Heinz and Lai’s (2013) original definition that we argue have led to the incorrect classification of some phonological patterns as WD. We start from the observation that WD patterns share a property that we call unbounded semiambience, modeled after the analogous observation by Jardine (2016) about non-deterministic (ND) patterns and their unbounded circumambience. Both ND and WD functions can be broken down into compositions of deterministic (subsequential) functions (Elgot and Mezei 1965; Heinz and Lai 2013) that read an input string from opposite directions; we show that WD functions are those for which these deterministic composands do not interact in a way that is familiar from the theoretical phonology literature. To underscore how this concept of interaction neatly separates the WD class of functions from the strictly more expressive ND class, we provide analyses of the vowel harmony patterns of two Eastern Nilotic languages, Maasai and Turkana, using bimachines, an automaton type that represents unbounded bidirectional dependencies explicitly. These analyses make clear that there is interaction between deterministic composands when (and only when) the output of a given input element of a string is simultaneously dependent on information from both the left and the right: ND functions are those that involve interaction, while WD functions are those that do not. -
Slaats, S., Meyer, A. S., & Martin, A. E. (2024). Lexical surprisal shapes the time course of syntactic structure building. Neurobiology of Language, 5(4), 942-980. doi:10.1162/nol_a_00155.
When we understand language, we recognize words and combine them into sentences. In this article, we explore the hypothesis that listeners use probabilistic information about words to build syntactic structure. Recent work has shown that lexical probability and syntactic structure both modulate the delta-band (<4 Hz) neural signal. Here, we investigated whether the neural encoding of syntactic structure changes as a function of the distributional properties of a word. To this end, we analyzed MEG data of 24 native speakers of Dutch who listened to three fairytales with a total duration of 49 min. Using temporal response functions and a cumulative model-comparison approach, we evaluated the contributions of syntactic and distributional features to the variance in the delta-band neural signal. This revealed that lexical surprisal values (a distributional feature), as well as bottom-up node counts (a syntactic feature), positively contributed to the model of the delta-band neural signal. Subsequently, we compared responses to the syntactic feature between words with high- and low-surprisal values. This revealed a delay in the response to the syntactic feature as a consequence of the surprisal value of the word: high-surprisal values were associated with a delayed response to the syntactic feature by 150–190 ms. The delay was not affected by word duration, and did not have a lexical origin. These findings suggest that the brain uses probabilistic information to infer syntactic structure, and highlight the importance of time in this process.
Additional information
supplementary data -
Ten Oever, S., & Martin, A. E. (2024). Interdependence of “what” and “when” in the brain. Journal of Cognitive Neuroscience, 36(1), 167-186. doi:10.1162/jocn_a_02067.
From a brain's-eye-view, when a stimulus occurs and what it is are interrelated aspects of interpreting the perceptual world. Yet in practice, the putative perceptual inferences about sensory content and timing are often dichotomized and not investigated as an integrated process. We here argue that neural temporal dynamics can influence what is perceived, and in turn, stimulus content can influence the time at which perception is achieved. This computational principle results from the highly interdependent relationship of what and when in the environment. Both brain processes and perceptual events display strong temporal variability that is not always modeled; we argue that understanding—and, minimally, modeling—this temporal variability is key for theories of how the brain generates unified and consistent neural representations and that we ignore temporal variability in our analysis practice at the peril of both data interpretation and theory-building. Here, we review what and when interactions in the brain, demonstrate via simulations how temporal variability can result in misguided interpretations and conclusions, and outline how to integrate and synthesize what and when in theories and models of brain computation. -
Ten Oever, S., Titone, L., te Rietmolen, N., & Martin, A. E. (2024). Phase-dependent word perception emerges from region-specific sensitivity to the statistics of language. Proceedings of the National Academy of Sciences of the United States of America, 121(3): e2320489121. doi:10.1073/pnas.2320489121.
Neural oscillations reflect fluctuations in excitability, which biases the percept of ambiguous sensory input. Why this bias occurs is still not fully understood. We hypothesized that neural populations representing likely events are more sensitive, and thereby become active on earlier oscillatory phases, when the ensemble itself is less excitable. Perception of ambiguous input presented during less-excitable phases should therefore be biased toward frequent or predictable stimuli that have lower activation thresholds. Here, we show such a frequency bias in spoken word recognition using psychophysics, magnetoencephalography (MEG), and computational modelling. With MEG, we found a double dissociation, where the phase of oscillations in the superior temporal gyrus and medial temporal gyrus biased word-identification behavior based on phoneme and lexical frequencies, respectively. This finding was reproduced in a computational model. These results demonstrate that oscillations provide a temporal ordering of neural activity based on the sensitivity of separable neural populations. -
Weissbart, H., & Martin, A. E. (2024). The structure and statistics of language jointly shape cross-frequency neural dynamics during spoken language comprehension. Nature Communications, 15: 8850. doi:10.1038/s41467-024-53128-1.
Humans excel at extracting structurally-determined meaning from speech despite inherent physical variability. This study explores the brain’s ability to predict and understand spoken language robustly. It investigates the relationship between structural and statistical language knowledge in brain dynamics, focusing on phase and amplitude modulation. Using syntactic features from constituent hierarchies and surface statistics from a transformer model as predictors of forward encoding models, we reconstructed cross-frequency neural dynamics from MEG data during audiobook listening. Our findings challenge a strict separation of linguistic structure and statistics in the brain, with both aiding neural signal reconstruction. Syntactic features have a more temporally spread impact, and both word entropy and the number of closing syntactic constituents are linked to the phase-amplitude coupling of neural dynamics, implying a role in temporal prediction and cortical oscillation alignment during speech processing. Our results indicate that structured and statistical information jointly shape neural dynamics during spoken language comprehension and suggest an integration process via a cross-frequency coupling mechanism. -
Yang, J. (2024). Rethinking tokenization: Crafting better tokenizers for large language models. International Journal of Chinese Linguistics, 11(1), 94-109. doi:10.1075/ijchl.00023.yan.
Tokenization significantly influences the performance of language models (LMs). This paper traces the evolution of tokenizers from word-level to subword-level, analyzing how they balance tokens and types to enhance model adaptability while controlling complexity. Although subword tokenizers like Byte Pair Encoding (BPE) overcome many word tokenizer limitations, they encounter difficulties in handling non-Latin languages and depend heavily on extensive training data and computational resources to grasp the nuances of multiword expressions (MWEs). This article argues that tokenizers, more than mere technical tools, should draw inspiration from cognitive science research on human language processing. This study then introduces the “Principle of Least Effort” from cognitive science, which holds that humans naturally seek to reduce cognitive effort, and discusses the benefits of this principle for tokenizer development. Based on this principle, the paper proposes that the Less-is-Better (LiB) model could be a new approach to LLM tokenizers. The LiB model can autonomously learn an integrated vocabulary consisting of subwords, words, and MWEs, which effectively reduces both the numbers of tokens and types. Comparative evaluations show that the LiB tokenizer outperforms existing word and BPE tokenizers, presenting an innovative method for tokenizer development, and hinting at the possibility of future cognitive science-based tokenizers being more efficient. -
Zhao, J., Martin, A. E., & Coopmans, C. W. (2024). Structural and sequential regularities modulate phrase-rate neural tracking. Scientific Reports, 14: 16603. doi:10.1038/s41598-024-67153-z.
Electrophysiological brain activity has been shown to synchronize with the quasi-regular repetition of grammatical phrases in connected speech, so-called phrase-rate neural tracking. Current debate centers around whether this phenomenon is best explained in terms of the syntactic properties of phrases or in terms of syntax-external information, such as the sequential repetition of parts of speech. As these two factors were confounded in previous studies, much of the literature is compatible with both accounts. Here, we used electroencephalography (EEG) to determine if and when the brain is sensitive to both types of information. Twenty native speakers of Mandarin Chinese listened to isochronously presented streams of monosyllabic words, which contained either grammatical two-word phrases (e.g., catch fish, sell house) or non-grammatical word combinations (e.g., full lend, bread far). Within the grammatical conditions, we varied two structural factors: the position of the head of each phrase and the type of attachment. Within the non-grammatical conditions, we varied the consistency with which parts of speech were repeated. Tracking was quantified through evoked power and inter-trial phase coherence, both derived from the frequency-domain representation of EEG responses. As expected, neural tracking at the phrase rate was stronger in grammatical sequences than in non-grammatical sequences without syntactic structure. Moreover, it was modulated by both attachment type and head position, revealing the structure-sensitivity of phrase-rate tracking. We additionally found that the brain tracks the repetition of parts of speech in non-grammatical sequences. These data provide an integrative perspective on the current debate about neural tracking effects, revealing that the brain utilizes regularities computed over multiple levels of linguistic representation in guiding rhythmic computation.
Additional information
full stimulus list, the raw EEG data, and the analysis scripts -
Zioga, I., Zhou, Y. J., Weissbart, H., Martin, A. E., & Haegens, S. (2024). Alpha and beta oscillations differentially support word production in a rule-switching task. eNeuro, 11(4): ENEURO.0312-23.2024. doi:10.1523/ENEURO.0312-23.2024.
Research into the role of brain oscillations in basic perceptual and cognitive functions has suggested that the alpha rhythm reflects functional inhibition while the beta rhythm reflects neural ensemble (re)activation. However, little is known regarding the generalization of these proposed fundamental operations to linguistic processes, such as speech comprehension and production. Here, we recorded magnetoencephalography in participants performing a novel rule-switching paradigm. Specifically, Dutch native speakers had to produce an alternative exemplar from the same category or a feature of a given target word embedded in spoken sentences (e.g., for the word “tuna”, an exemplar from the same category—“seafood”—would be “shrimp”, and a feature would be “pink”). A cue indicated the task rule—exemplar or feature—either before (pre-cue) or after (retro-cue) listening to the sentence. Alpha power during the working memory delay was lower for retro-cue compared with that for pre-cue in the left hemispheric language-related regions. Critically, alpha power negatively correlated with reaction times, suggestive of alpha facilitating task performance by regulating inhibition in regions linked to lexical retrieval. Furthermore, we observed a different spatiotemporal pattern of beta activity for exemplars versus features in the right temporoparietal regions, in line with the proposed role of beta in recruiting neural networks for the encoding of distinct categories. Overall, our study provides evidence for the generalizability of the role of alpha and beta oscillations from perceptual to more “complex, linguistic processes” and offers a novel task to investigate links between rule-switching, working memory, and word production. -
Coopmans, C. W., Mai, A., Slaats, S., Weissbart, H., & Martin, A. E. (2023). What oscillations can do for syntax depends on your theory of structure building. Nature Reviews Neuroscience, 24, 723. doi:10.1038/s41583-023-00734-5.
Coopmans, C. W., Kaushik, K., & Martin, A. E. (2023). Hierarchical structure in language and action: A formal comparison. Psychological Review, 130(4), 935-952. doi:10.1037/rev0000429.
Since the cognitive revolution, language and action have been compared as cognitive systems, with cross-domain convergent views recently gaining renewed interest in biology, neuroscience, and cognitive science. Language and action are both combinatorial systems whose mode of combination has been argued to be hierarchical, combining elements into constituents of increasingly larger size. This structural similarity has led to the suggestion that they rely on shared cognitive and neural resources. In this article, we compare the conceptual and formal properties of hierarchy in language and action using set theory. We show that the strong compositionality of language requires a particular formalism, a magma, to describe the algebraic structure corresponding to the set of hierarchical structures underlying sentences. When this formalism is applied to actions, it appears to be both too strong and too weak. To overcome these limitations, which are related to the weak compositionality and sequential nature of action structures, we formalize the algebraic structure corresponding to the set of actions as a trace monoid. We aim to capture the different system properties of language and action in terms of the distinction between hierarchical sets and hierarchical sequences and discuss the implications for the way both systems could be represented in the brain. -
Guest, O., & Martin, A. E. (2023). On logical inference over brains, behaviour, and artificial neural networks. Computational Brain & Behavior, 6, 213-227. doi:10.1007/s42113-022-00166-x.
In the cognitive, computational, and neuro-sciences, practitioners often reason about what computational models represent or learn, as well as what algorithm is instantiated. The putative goal of such reasoning is to generalize claims about the model in question, to claims about the mind and brain, and the neurocognitive capacities of those systems. Such inference is often based on a model’s performance on a task, and whether that performance approximates human behavior or brain activity. Here we demonstrate how such argumentation problematizes the relationship between models and their targets; we place emphasis on artificial neural networks (ANNs), though any theory-brain relationship that falls into the same schema of reasoning is at risk. In this paper, we model inferences from ANNs to brains and back within a formal framework — metatheoretical calculus — in order to initiate a dialogue on both how models are broadly understood and used, and on how to best formally characterize them and their functions. To these ends, we express claims from the published record about models’ successes and failures in first-order logic. Our proposed formalization describes the decision-making processes enacted by scientists to adjudicate over theories. We demonstrate that formalizing the argumentation in the literature can uncover potential deep issues about how theory is related to phenomena. We discuss what this means broadly for research in cognitive science, neuroscience, and psychology; what it means for models when they lose the ability to mediate between theory and data in a meaningful way; and what this means for the metatheoretical calculus our fields deploy when performing high-level scientific inference. -
Jin, H., Wang, Q., Yang, Y.-F., Zhang, H., Gao, M., Jin, S., Chen, Y., Xu, T., Zheng, Y.-R., Chen, J., Xiao, Q., Yang, J., Wang, X., Geng, H., Ge, J., Wang, W.-W., Chen, X., Zhang, L., Zuo, X.-N., & Chuan-Peng, H. (2023). The Chinese Open Science Network (COSN): Building an open science community from scratch. Advances in Methods and Practices in Psychological Science, 6(1): 10.1177/25152459221144986. doi:10.1177/25152459221144986.
Open Science is becoming a mainstream scientific ideology in psychology and related fields. However, researchers, especially early-career researchers (ECRs) in developing countries, are facing significant hurdles in engaging in Open Science and moving it forward. In China, various societal and cultural factors discourage ECRs from participating in Open Science, such as the lack of dedicated communication channels and the norm of modesty. To make the voice of Open Science heard by Chinese-speaking ECRs and scholars at large, the Chinese Open Science Network (COSN) was initiated in 2016. With its core values being grassroots-oriented, diversity, and inclusivity, COSN has grown from a small Open Science interest group to a recognized network both in the Chinese-speaking research community and the international Open Science community. So far, COSN has organized three in-person workshops, 12 tutorials, 48 talks, and 55 journal club sessions and translated 15 Open Science-related articles and blogs from English to Chinese. Currently, the main social media account of COSN (i.e., the WeChat Official Account) has more than 23,000 subscribers, and more than 1,000 researchers/students actively participate in the discussions on Open Science. In this article, we share our experience in building such a network to encourage ECRs in developing countries to start their own Open Science initiatives and engage in the global Open Science movement. We foresee great collaborative efforts of COSN together with all other local and international networks to further accelerate the Open Science movement. -
Slaats, S., Weissbart, H., Schoffelen, J.-M., Meyer, A. S., & Martin, A. E. (2023). Delta-band neural responses to individual words are modulated by sentence processing. The Journal of Neuroscience, 43(26), 4867-4883. doi:10.1523/JNEUROSCI.0964-22.2023.
To understand language, we need to recognize words and combine them into phrases and sentences. During this process, responses to the words themselves are changed. In a step towards understanding how the brain builds sentence structure, the present study concerns the neural readout of this adaptation. We ask whether low-frequency neural readouts associated with words change as a function of being in a sentence. To this end, we analyzed an MEG dataset by Schoffelen et al. (2019) of 102 human participants (51 women) listening to sentences and word lists, the latter lacking any syntactic structure and combinatorial meaning. Using temporal response functions and a cumulative model-fitting approach, we disentangled delta- and theta-band responses to lexical information (word frequency), from responses to sensory- and distributional variables. The results suggest that delta-band responses to words are affected by sentence context in time and space, over and above entropy and surprisal. In both conditions, the word frequency response spanned left temporal and posterior frontal areas; however, the response appeared later in word lists than in sentences. In addition, sentence context determined whether inferior frontal areas were responsive to lexical information. In the theta band, the amplitude was larger in the word list condition around 100 milliseconds in right frontal areas. We conclude that low-frequency responses to words are changed by sentential context. The results of this study speak to how the neural representation of words is affected by structural context, and as such provide insight into how the brain instantiates compositionality in language. -
Tezcan, F., Weissbart, H., & Martin, A. E. (2023). A tradeoff between acoustic and linguistic feature encoding in spoken language comprehension. eLife, 12: e82386. doi:10.7554/eLife.82386.
When we comprehend language from speech, the phase of the neural response aligns with particular features of the speech input, resulting in a phenomenon referred to as neural tracking. In recent years, a large body of work has demonstrated the tracking of the acoustic envelope and abstract linguistic units at the phoneme and word levels, and beyond. However, the degree to which speech tracking is driven by acoustic edges of the signal, or by internally-generated linguistic units, or by the interplay of both, remains contentious. In this study, we used naturalistic story-listening to investigate (1) whether phoneme-level features are tracked over and above acoustic edges, (2) whether word entropy, which can reflect sentence- and discourse-level constraints, impacted the encoding of acoustic and phoneme-level features, and (3) whether the tracking of acoustic edges was enhanced or suppressed during comprehension of a first language (Dutch) compared to a statistically familiar but uncomprehended language (French). We first show that encoding models with phoneme-level linguistic features, in addition to acoustic features, uncovered an increased neural tracking response; this signal was further amplified in a comprehended language, putatively reflecting the transformation of acoustic features into internally generated phoneme-level representations. Phonemes were tracked more strongly in a comprehended language, suggesting that language comprehension functions as a neural filter over acoustic edges of the speech signal as it transforms sensory signals into abstract linguistic units. We then show that word entropy enhances neural tracking of both acoustic and phonemic features when sentence- and discourse-context are less constraining. When language was not comprehended, acoustic features, but not phonemic ones, were more strongly modulated, but in contrast, when a native language is comprehended, phoneme features are more strongly modulated. 
Taken together, our findings highlight the flexible modulation of acoustic and phonemic features by sentence- and discourse-level constraint in language comprehension, and document the neural transformation from speech perception to language comprehension, consistent with an account of language processing as a neural filter from sensory to abstract representations. -
Van der Werf, O. J., Schuhmann, T., De Graaf, T., Ten Oever, S., & Sack, A. T. (2023). Investigating the role of task relevance during rhythmic sampling of spatial locations. Scientific Reports, 13: 12707. doi:10.1038/s41598-023-38968-z.
Recently it has been discovered that visuospatial attention operates rhythmically, rather than being stably employed over time. A low-frequency 7–8 Hz rhythmic mechanism coordinates periodic windows to sample relevant locations and to shift towards other, less relevant locations in a visual scene. Rhythmic sampling theories would predict that when two locations are relevant 8 Hz sampling mechanisms split into two, effectively resulting in a 4 Hz sampling frequency at each location. Therefore, it is expected that rhythmic sampling is influenced by the relative importance of locations for the task at hand. To test this, we employed an orienting task with an arrow cue, where participants were asked to respond to a target presented in one visual field. The cue-to-target interval was systematically varied, allowing us to assess whether performance follows a rhythmic pattern across cue-to-target delays. We manipulated a location’s task relevance by altering the validity of the cue, thereby predicting the correct location in 60%, 80% or 100% of trials. Results revealed significant 4 Hz performance fluctuations at cued right visual field targets with low cue validity (60%), suggesting regular sampling of both locations. With high cue validity (80%), we observed a peak at 8 Hz towards non-cued targets, although not significant. These results were in line with our hypothesis suggesting a goal-directed balancing of attentional sampling (cued location) and shifting (non-cued location) depending on the relevance of locations in a visual scene. However, considering the hemifield specificity of the effect together with the absence of expected effects for cued trials in the high valid conditions we further discuss the interpretation of the data.
Additional information
supplementary information -
Zhang, Y., Ding, R., Frassinelli, D., Tuomainen, J., Klavinskis-Whiting, S., & Vigliocco, G. (2023). The role of multimodal cues in second language comprehension. Scientific Reports, 13: 20824. doi:10.1038/s41598-023-47643-2.
In face-to-face communication, multimodal cues such as prosody, gestures, and mouth movements can play a crucial role in language processing. While several studies have addressed how these cues contribute to native (L1) language processing, their impact on non-native (L2) comprehension is largely unknown. Comprehension of naturalistic language by L2 comprehenders may be supported by the presence of (at least some) multimodal cues, as these provide correlated and convergent information that may aid linguistic processing. However, it is also the case that multimodal cues may be less used by L2 comprehenders because linguistic processing is more demanding than for L1 comprehenders, leaving more limited resources for the processing of multimodal cues. In this study, we investigated how L2 comprehenders use multimodal cues in naturalistic stimuli (while participants watched videos of a speaker), as measured by electrophysiological responses (N400) to words, and whether there are differences between L1 and L2 comprehenders. We found that prosody, gestures, and informative mouth movements each reduced the N400 in L2, indexing easier comprehension. Nevertheless, L2 participants showed weaker effects for each cue compared to L1 comprehenders, with the exception of meaningful gestures and informative mouth movements. These results show that L2 comprehenders focus on specific multimodal cues – meaningful gestures that support meaningful interpretation and mouth movements that enhance the acoustic signal – while using multimodal cues to a lesser extent than L1 comprehenders overall.
Additional information
supplementary materials -
Zioga, I., Weissbart, H., Lewis, A. G., Haegens, S., & Martin, A. E. (2023). Naturalistic spoken language comprehension is supported by alpha and beta oscillations. The Journal of Neuroscience, 43(20), 3718-3732. doi:10.1523/JNEUROSCI.1500-22.2023.
Brain oscillations are prevalent in all species and are involved in numerous perceptual operations. α oscillations are thought to facilitate processing through the inhibition of task-irrelevant networks, while β oscillations are linked to the putative reactivation of content representations. Can the proposed functional role of α and β oscillations be generalized from low-level operations to higher-level cognitive processes? Here we address this question focusing on naturalistic spoken language comprehension. Twenty-two (18 female) Dutch native speakers listened to stories in Dutch and French while MEG was recorded. We used dependency parsing to identify three dependency states at each word: the number of (1) newly opened dependencies, (2) dependencies that remained open, and (3) resolved dependencies. We then constructed forward models to predict α and β power from the dependency features. Results showed that dependency features predict α and β power in language-related regions beyond low-level linguistic features. Left temporal, fundamental language regions are involved in language comprehension in α, while frontal and parietal, higher-order language regions, and motor regions are involved in β. Critically, α- and β-band dynamics seem to subserve language comprehension tapping into syntactic structure building and semantic composition by providing low-level mechanistic operations for inhibition and reactivation processes. Because of the temporal similarity of the α-β responses, their potential functional dissociation remains to be elucidated. Overall, this study sheds light on the role of α and β oscillations during naturalistic spoken language comprehension, providing evidence for the generalizability of these dynamics from perceptual to complex linguistic processes. -
Bai, F., Meyer, A. S., & Martin, A. E. (2022). Neural dynamics differentially encode phrases and sentences during spoken language comprehension. PLoS Biology, 20(7): e3001713. doi:10.1371/journal.pbio.3001713.
Human language stands out in the natural world as a biological signal that uses a structured system to combine the meanings of small linguistic units (e.g., words) into larger constituents (e.g., phrases and sentences). However, the physical dynamics of speech (or sign) do not stand in a one-to-one relationship with the meanings listeners perceive. Instead, listeners infer meaning based on their knowledge of the language. The neural readouts of the perceptual and cognitive processes underlying these inferences are still poorly understood. In the present study, we used scalp electroencephalography (EEG) to compare the neural response to phrases (e.g., the red vase) and sentences (e.g., the vase is red), which were close in semantic meaning and had been synthesized to be physically indistinguishable. Differences in structure were well captured in the reorganization of neural phase responses in delta (approximately <2 Hz) and theta bands (approximately 2 to 7 Hz), and in power and power connectivity changes in the alpha band (approximately 7.5 to 13.5 Hz). Consistent with predictions from a computational model, sentences showed more power, more power connectivity, and more phase synchronization than phrases did. Theta–gamma phase–amplitude coupling occurred, but did not differ between the syntactic structures. Spectral–temporal response function (STRF) modeling revealed different encoding states for phrases and sentences, over and above the acoustically driven neural response. Our findings provide a comprehensive description of how the brain encodes and separates linguistic structures in the dynamics of neural responses. They imply that phase synchronization and strength of connectivity are readouts for the constituent structure of language.
The results provide a novel basis for future neurophysiological research on linguistic structure representation in the brain, and, together with our simulations, support time-based binding as a mechanism of structure encoding in neural dynamics. -
Bai, F. (2022). Neural representation of speech segmentation and syntactic structure discrimination. PhD Thesis, Radboud University Nijmegen, Nijmegen.
Additional information
full text via Radboud Repository -
Coopmans, C. W., De Hoop, H., Kaushik, K., Hagoort, P., & Martin, A. E. (2022). Hierarchy in language interpretation: Evidence from behavioural experiments and computational modelling. Language, Cognition and Neuroscience, 37(4), 420-439. doi:10.1080/23273798.2021.1980595.
It has long been recognised that phrases and sentences are organised hierarchically, but many computational models of language treat them as sequences of words without computing constituent structure. Against this background, we conducted two experiments which showed that participants interpret ambiguous noun phrases, such as second blue ball, in terms of their abstract hierarchical structure rather than their linear surface order. When a neural network model was tested on this task, it could simulate such “hierarchical” behaviour. However, when we changed the training data such that they were not entirely unambiguous anymore, the model stopped generalising in a human-like way. It did not systematically generalise to novel items, and when it was trained on ambiguous trials, it strongly favoured the linear interpretation. We argue that these models should be endowed with a bias to make generalisations over hierarchical structure in order to be cognitively adequate models of human language. -
Coopmans, C. W., De Hoop, H., Hagoort, P., & Martin, A. E. (2022). Effects of structure and meaning on cortical tracking of linguistic units in naturalistic speech. Neurobiology of Language, 3(3), 386-412. doi:10.1162/nol_a_00070.
Recent research has established that cortical activity “tracks” the presentation rate of syntactic phrases in continuous speech, even though phrases are abstract units that do not have direct correlates in the acoustic signal. We investigated whether cortical tracking of phrase structures is modulated by the extent to which these structures compositionally determine meaning. To this end, we recorded electroencephalography (EEG) of 38 native speakers who listened to naturally spoken Dutch stimuli in different conditions, which parametrically modulated the degree to which syntactic structure and lexical semantics determine sentence meaning. Tracking was quantified through mutual information between the EEG data and either the speech envelopes or abstract annotations of syntax, all of which were filtered in the frequency band corresponding to the presentation rate of phrases (1.1–2.1 Hz). Overall, these mutual information analyses showed stronger tracking of phrases in regular sentences than in stimuli whose lexical-syntactic content is reduced, but no consistent differences in tracking between sentences and stimuli that contain a combination of syntactic structure and lexical content. While there were no effects of compositional meaning on the degree of phrase-structure tracking, analyses of event-related potentials elicited by sentence-final words did reveal meaning-induced differences between conditions. Our findings suggest that cortical tracking of structure in sentences indexes the internal generation of this structure, a process that is modulated by the properties of its input, but not by the compositional interpretation of its output.
Additional information
supplementary information -
Dingemanse, M., Liesenfeld, A., & Woensdregt, M. (2022). Convergent cultural evolution of continuers (mhmm). In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 160-167). Nijmegen: Joint Conference on Language Evolution (JCoLE). doi:10.31234/osf.io/65c79.
Continuers —words like mm, mmhm, uhum and the like— are among the most frequent types of responses in conversation. They play a key role in joint action coordination by showing positive evidence of understanding and scaffolding narrative delivery. Here we investigate the hypothesis that their functional importance along with their conversational ecology places selective pressures on their form and may lead to cross-linguistic similarities through convergent cultural evolution. We compare continuer tokens in linguistically diverse conversational corpora and find languages make available highly similar forms. We then approach the causal mechanism of convergent cultural evolution using exemplar modelling, simulating the process by which a combination of effort minimization and functional specialization may push continuers to a particular region of phonological possibility space. By combining comparative linguistics and computational modelling we shed new light on the question of how language structure is shaped by and for social interaction. -
Doumas, L. A. A., Puebla, G., Martin, A. E., & Hummel, J. E. (2022). A theory of relation learning and cross-domain generalization. Psychological Review, 129(5), 999-1041. doi:10.1037/rev0000346.
People readily generalize knowledge to novel domains and stimuli. We present a theory, instantiated in a computational model, based on the idea that cross-domain generalization in humans is a case of analogical inference over structured (i.e., symbolic) relational representations. The model is an extension of the Learning and Inference with Schemas and Analogy (LISA; Hummel & Holyoak, 1997, 2003) and Discovery of Relations by Analogy (DORA; Doumas et al., 2008) models of relational inference and learning. The resulting model learns both the content and format (i.e., structure) of relational representations from nonrelational inputs without supervision. When augmented with the capacity for reinforcement learning, it leverages these representations to learn about individual domains, and then generalizes to new domains on the first exposure (i.e., zero-shot learning) via analogical inference. We demonstrate the capacity of the model to learn structured relational representations from a variety of simple visual stimuli, and to perform cross-domain generalization between video games (Breakout and Pong) and between several psychological tasks. We demonstrate that the model’s trajectory closely mirrors the trajectory of children as they learn about relations, accounting for phenomena from the literature on the development of children’s reasoning and analogy making. The model’s ability to generalize between domains demonstrates the flexibility afforded by representing domains in terms of their underlying relational structure, rather than simply in terms of the statistical relations between their inputs and outputs. -
Heesen, R., Fröhlich, M., Sievers, C., Woensdregt, M., & Dingemanse, M. (2022). Coordinating social action: A primer for the cross-species investigation of communicative repair. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 377(1859): 20210110. doi:10.1098/rstb.2021.0110.
Human joint action is inherently cooperative, manifested in the collaborative efforts of participants to minimize communicative trouble through interactive repair. Although interactive repair requires sophisticated cognitive abilities, it can be dissected into basic building blocks shared with non-human animal species. A review of the primate literature shows that interactionally contingent signal sequences are at least common among species of nonhuman great apes, suggesting a gradual evolution of repair. To pioneer a cross-species assessment of repair this paper aims at (i) identifying necessary precursors of human interactive repair; (ii) proposing a coding framework for its comparative study in humans and non-human species; and (iii) using this framework to analyse examples of interactions of humans (adults/children) and non-human great apes. We hope this paper will serve as a primer for cross-species comparisons of communicative breakdowns and how they are repaired. -
Janssens, S. E. W., Sack, A. T., Ten Oever, S., & De Graaf, T. A. (2022). Calibrating rhythmic stimulation parameters to individual electroencephalography markers: The consistency of individual alpha frequency in practical lab settings. European Journal of Neuroscience, 55(11/12), 3418-3437. doi:10.1111/ejn.15418.
Rhythmic stimulation can be applied to modulate neuronal oscillations. Such ‘entrainment’ is optimized when stimulation frequency is individually calibrated based on magneto/encephalography markers. It remains unknown how consistent such individual markers are across days/sessions, within a session, or across cognitive states, hemispheres and estimation methods, especially in a realistic, practical, lab setting. We here estimated individual alpha frequency (IAF) repeatedly from short electroencephalography (EEG) measurements at rest or during an attention task (cognitive state), using single parieto-occipital electrodes in 24 participants on 4 days (between-sessions), with multiple measurements over an hour on 1 day (within-session). First, we introduce an algorithm to automatically reject power spectra without a sufficiently clear peak to ensure unbiased IAF estimations. Then we estimated IAF via the traditional ‘maximum’ method and a ‘Gaussian fit’ method. IAF was reliable within- and between-sessions for both cognitive states and hemispheres, though task-IAF estimates tended to be more variable. Overall, the ‘Gaussian fit’ method was more reliable than the ‘maximum’ method. Furthermore, we evaluated how far from an approximated ‘true’ task-related IAF the selected ‘stimulation frequency’ was, when calibrating this frequency based on a short rest-EEG, a short task-EEG, or simply selecting 10 Hz for all participants. For the ‘maximum’ method, rest-EEG calibration was best, followed by task-EEG, and then 10 Hz. For the ‘Gaussian fit’ method, rest-EEG and task-EEG-based calibration were similarly accurate, and better than 10 Hz. These results lead to concrete recommendations about valid, and automated, estimation of individual oscillation markers in experimental and clinical settings. -
Janssens, S. E., Ten Oever, S., Sack, A. T., & de Graaf, T. A. (2022). “Broadband Alpha Transcranial Alternating Current Stimulation”: Exploring a new biologically calibrated brain stimulation protocol. NeuroImage, 253: 119109. doi:10.1016/j.neuroimage.2022.119109.
Transcranial alternating current stimulation (tACS) can be used to study causal contributions of oscillatory brain mechanisms to cognition and behavior. For instance, individual alpha frequency (IAF) tACS was reported to enhance alpha power and impact visuospatial attention performance. Unfortunately, such results have been inconsistent and difficult to replicate. In tACS, stimulation generally involves one frequency, sometimes individually calibrated to a peak value observed in an M/EEG power spectrum. Yet, the ‘peak’ actually observed in such power spectra often contains a broader range of frequencies, raising the question whether a biologically calibrated tACS protocol containing this fuller range of alpha-band frequencies might be more effective. Here, we introduce ‘Broadband-alpha-tACS’, a complex individually calibrated electrical stimulation protocol. We band-pass filtered left posterior resting-state EEG data around the IAF (+/- 2 Hz), and converted that time series into an electrical waveform for tACS stimulation of that same left posterior parietal cortex location. In other words, we stimulated a brain region with a ‘replay’ of its own alpha-band frequency content, based on spontaneous activity. Within-subjects (N=24), we compared to a sham tACS session the effects of broadband-alpha tACS, power-matched spectral inverse (‘alpha-removed’) control tACS, and individual alpha frequency tACS, on EEG alpha power and performance in an endogenous attention task previously reported to be affected by alpha tACS. Broadband-alpha-tACS significantly modulated attention task performance (i.e., reduced the rightward visuospatial attention bias in trials without distractors, and reduced attention benefits). Alpha-removed tACS also reduced the rightward visuospatial attention bias. IAF-tACS did not significantly modulate attention task performance compared to sham tACS, but also did not statistically significantly differ from broadband-alpha-tACS. 
This new broadband-alpha tACS approach seems promising, but should be further explored and validated in future studies.
Additional information
supplementary materials -
Kemmerer, S. K., Sack, A. T., de Graaf, T. A., Ten Oever, S., De Weerd, P., & Schuhmann, T. (2022). Frequency-specific transcranial neuromodulation of alpha power alters visuospatial attention performance. Brain Research, 1782: 147834. doi:10.1016/j.brainres.2022.147834.
Transcranial alternating current stimulation (tACS) at 10 Hz has been shown to modulate spatial attention. However, the frequency-specificity and the oscillatory changes underlying this tACS effect are still largely unclear. Here, we applied high-definition tACS at individual alpha frequency (IAF), two control frequencies (IAF +/- 2 Hz) and sham to the left posterior parietal cortex and measured its effects on visuospatial attention performance and offline alpha power (using electroencephalography, EEG). We revealed a behavioural and electrophysiological stimulation effect relative to sham for IAF but not control frequency stimulation conditions: there was a leftward lateralization of alpha power for IAF tACS, which differed from sham for the first out of three minutes following tACS. At a high value of this EEG effect (moderation effect), we observed a leftward attention bias relative to sham. This effect was task-specific, i.e., it could be found in an endogenous attention but not in a detection task. Only in the IAF tACS condition, we also found a correlation between the magnitude of the alpha lateralization and the attentional bias effect. Our results support a functional role of alpha oscillations in visuospatial attention and the potential of tACS to modulate it. The frequency-specificity of the effects suggests that an individualization of the stimulation frequency is necessary in heterogeneous target groups with a large variation in IAF.
Additional information
supplementary data -
Kemmerer, S. K., De Graaf, T. A., Ten Oever, S., Erkens, M., De Weerd, P., & Sack, A. T. (2022). Parietal but not temporoparietal alpha-tACS modulates endogenous visuospatial attention. Cortex, 154, 149-166. doi:10.1016/j.cortex.2022.01.021.
Visuospatial attention can either be voluntarily directed (endogenous/top-down attention) or automatically triggered (exogenous/bottom-up attention). Recent research showed that dorsal parietal transcranial alternating current stimulation (tACS) at alpha frequency modulates the spatial attentional bias in an endogenous but not in an exogenous visuospatial attention task. Yet, the reason for this task-specificity remains unexplored. Here, we tested whether this dissociation relates to the proposed differential role of the dorsal attention network (DAN) and ventral attention network (VAN) in endogenous and exogenous attention processes respectively. To that aim, we targeted the left and right dorsal parietal node of the DAN, as well as the left and right ventral temporoparietal node of the VAN using tACS at the individual alpha frequency. Every participant completed all four stimulation conditions and a sham condition in five separate sessions. During tACS, we assessed the behavioral visuospatial attention bias via an endogenous and exogenous visuospatial attention task. Additionally, we measured offline alpha power immediately before and after tACS using electroencephalography (EEG). The behavioral data revealed an effect of tACS on the endogenous but not exogenous attention bias, with a greater leftward bias during (sham-corrected) left than right hemispheric stimulation. In line with our hypothesis, this effect was brain area-specific, i.e., present for dorsal parietal but not ventral temporoparietal tACS. However, contrary to our expectations, there was no effect of ventral temporoparietal tACS on the exogenous visuospatial attention bias. Hence, no double dissociation between the two targeted attention networks. There was no effect of either tACS condition on offline alpha power. Our behavioral data reveal that dorsal parietal but not ventral temporoparietal alpha oscillations steer endogenous visuospatial attention. 
This brain-area specific tACS effect matches the previously proposed dissociation between the DAN and VAN and, by showing that the spatial attention bias effect does not generalize to any lateral posterior tACS montage, renders lateral cutaneous and retinal effects for the spatial attention bias in the dorsal parietal condition unlikely. Yet the absence of tACS effects on the exogenous attention task suggests that ventral temporoparietal alpha oscillations are not functionally relevant for exogenous visuospatial attention. We discuss the potential implications of this finding in the context of an emerging theory on the role of the ventral temporoparietal node.
Additional information
supplementary material -
Ten Oever, S., Carta, S., Kaufeld, G., & Martin, A. E. (2022). Neural tracking of phrases in spoken language comprehension is automatic and task-dependent. eLife, 11: e77468. doi:10.7554/eLife.77468.
Linguistic phrases are tracked in sentences even though there is no one-to-one acoustic phrase marker in the physical signal. This phenomenon suggests an automatic tracking of abstract linguistic structure that is endogenously generated by the brain. However, all studies investigating linguistic tracking compare conditions where either relevant information at linguistic timescales is available, or where this information is absent altogether (e.g., sentences versus word lists during passive listening). It is therefore unclear whether tracking at phrasal timescales is related to the content of language, or rather, results as a consequence of attending to the timescales that happen to match behaviourally relevant information. To investigate this question, we presented participants with sentences and word lists while recording their brain activity with magnetoencephalography (MEG). Participants performed passive, syllable, word, and word-combination tasks corresponding to attending to four different rates: one they would naturally attend to, syllable-rates, word-rates, and phrasal-rates, respectively. We replicated overall findings of stronger phrasal-rate tracking measured with mutual information for sentences compared to word lists across the classical language network. However, in the inferior frontal gyrus (IFG) we found a task effect suggesting stronger phrasal-rate tracking during the word-combination task independent of the presence of linguistic structure, as well as stronger delta-band connectivity during this task. These results suggest that extracting linguistic information at phrasal rates occurs automatically with or without the presence of an additional task, but also that IFG might be important for temporal integration across various perceptual domains. -
Ten Oever, S., Kaushik, K., & Martin, A. E. (2022). Inferring the nature of linguistic computations in the brain. PLoS Computational Biology, 18(7): e1010269. doi:10.1371/journal.pcbi.1010269.
Sentences contain structure that determines their meaning beyond that of individual words. An influential study by Ding and colleagues (2016) used frequency tagging of phrases and sentences to show that the human brain is sensitive to structure by finding peaks of neural power at the rate at which structures were presented. Since then, there has been a rich debate on how to best explain this pattern of results with profound impact on the language sciences. Models that use hierarchical structure building, as well as models based on associative sequence processing, can predict the neural response, creating an inferential impasse as to which class of models explains the nature of the linguistic computations reflected in the neural readout. In the current manuscript, we discuss pitfalls and common fallacies seen in the conclusions drawn in the literature illustrated by various simulations. We conclude that inferring the neural operations of sentence processing based on these neural data, and any like it, alone, is insufficient. We discuss how to best evaluate models and how to approach the modeling of neural readouts to sentence processing in a manner that remains faithful to cognitive, neural, and linguistic principles. -
Van der Werf, O. J., Ten Oever, S., Schuhmann, T., & Sack, A. T. (2022). No evidence of rhythmic visuospatial attention at cued locations in a spatial cuing paradigm, regardless of their behavioural relevance. European Journal of Neuroscience, 55(11-12), 3100-3116. doi:10.1111/ejn.15353.
Recent evidence suggests that visuospatial attentional performance is not stable over time but fluctuates in a rhythmic fashion. These attentional rhythms allow for sampling of different visuospatial locations in each cycle of this rhythm. However, it is still unclear in which paradigmatic circumstances rhythmic attention becomes evident. First, it is unclear at what spatial locations rhythmic attention occurs. Second, it is unclear how the behavioural relevance of each spatial location determines the rhythmic sampling patterns. Here, we aim to elucidate these two issues. Firstly, we aim to find evidence of rhythmic attention at the predicted (i.e. cued) location under moderately informative predictor value, replicating earlier studies. Secondly, we hypothesise that rhythmic attentional sampling behaviour will be affected by the behavioural relevance of the sampled location, ranging from non-informative to fully informative. To these aims, we used a modified Egly-Driver task with three conditions: a fully informative cue, a moderately informative cue (replication condition), and a non-informative cue. We did not find evidence of rhythmic sampling at cued locations, failing to replicate earlier studies. Nor did we find differences in rhythmic sampling under different predictive values of the cue. The current data does not allow for robust conclusions regarding the non-cued locations due to the absence of a priori hypotheses. Post-hoc explorative data analyses, however, clearly indicate that attention samples non-cued locations in a theta-rhythmic manner, specifically when the cued location bears higher behavioural relevance than the non-cued locations. -
Woensdregt, M., Jara-Ettinger, J., & Rubio-Fernandez, P. (2022). Language universals rely on social cognition: Computational models of the use of this and that to redirect the receiver’s attention. In J. Culbertson, A. Perfors, H. Rabagliati, & V. Ramenzoni (Eds.), Proceedings of the 44th Annual Conference of the Cognitive Science Society (CogSci 2022) (pp. 1382-1388). Toronto, Canada: Cognitive Science Society.
Demonstratives, simple referential devices like this and that, are linguistic universals, but their meaning varies cross-linguistically. In languages like English and Italian, demonstratives are thought to encode the referent’s distance from the producer (e.g., that one means “the one far away from me”), while in others, like Portuguese and Spanish, they encode relative distance from both producer and receiver (e.g., aquel means “the one far away from both of us”). Here we propose that demonstratives are also sensitive to the receiver’s focus of attention, hence requiring a deeper form of social cognition than previously thought. We provide initial empirical and computational evidence for this idea, suggesting that producers use demonstratives to redirect the receiver’s attention towards the intended referent, rather than only to indicate its physical distance. -
Coopmans, C. W., De Hoop, H., Kaushik, K., Hagoort, P., & Martin, A. E. (2021). Structure-(in)dependent interpretation of phrases in humans and LSTMs. In Proceedings of the Society for Computation in Linguistics (SCiL 2021) (pp. 459-463).
In this study, we compared the performance of a long short-term memory (LSTM) neural network to the behavior of human participants on a language task that requires hierarchically structured knowledge. We show that humans interpret ambiguous noun phrases, such as second blue ball, in line with their hierarchical constituent structure. LSTMs, instead, only do so after unambiguous training, and they do not systematically generalize to novel items. Overall, the results of our simulations indicate that a model can behave hierarchically without relying on hierarchical constituent structure.
Additional information: full text via ScholarWorks@UMass Amherst -
Doumas, L. A. A., & Martin, A. E. (2021). A model for learning structured representations of similarity and relative magnitude from experience. Current Opinion in Behavioral Sciences, 37, 158-166. doi:10.1016/j.cobeha.2021.01.001.
How a system represents information tightly constrains the kinds of problems it can solve. Humans routinely solve problems that appear to require abstract representations of stimulus properties and relations. How we acquire such representations has central importance in an account of human cognition. We briefly describe a theory of how a system can learn invariant responses to instances of similarity and relative magnitude, and how structured, relational representations can be learned from initially unstructured inputs. Two operations, comparing distributed representations and learning from the concomitant network dynamics in time, underpin the ability to learn these representations and to respond to invariance in the environment. Comparing analog representations of absolute magnitude produces invariant signals that carry information about similarity and relative magnitude. We describe how a system can then use this information to bootstrap learning structured (i.e., symbolic) concepts of relative magnitude from experience without assuming such representations a priori. -
Guest, O., & Martin, A. E. (2021). How computational modeling can force theory building in psychological science. Perspectives on Psychological Science, 16(4), 789-802. doi:10.1177/1745691620970585.
Psychology endeavors to develop theories of human capacities and behaviors on the basis of a variety of methodologies and dependent measures. We argue that one of the most divisive factors in psychological science is whether researchers choose to use computational modeling of theories (over and above data) during the scientific-inference process. Modeling is undervalued yet holds promise for advancing psychological science. The inherent demands of computational modeling guide us toward better science by forcing us to conceptually analyze, specify, and formalize intuitions that otherwise remain unexamined—what we dub open theory. Constraining our inference process through modeling enables us to build explanatory and predictive theories. Here, we present scientific inference in psychology as a path function in which each step shapes the next. Computational modeling can constrain these steps, thus advancing scientific inference over and above the stewardship of experimental practice (e.g., preregistration). If psychology continues to eschew computational modeling, we predict more replicability crises and persistent failure at coherent theory building. This is because without formal modeling we lack open and transparent theorizing. We also explain how to formalize, specify, and implement a computational model, emphasizing that the advantages of modeling can be achieved by anyone with benefit to all. -
Petras, K., Ten Oever, S., Dalal, S. S., & Goffaux, V. (2021). Information redundancy across spatial scales modulates early visual cortical processing. NeuroImage, 244: 118613. doi:10.1016/j.neuroimage.2021.118613.
Visual images contain redundant information across spatial scales where low spatial frequency contrast is informative towards the location and likely content of high spatial frequency detail. Previous research suggests that the visual system makes use of those redundancies to facilitate efficient processing. In this framework, a fast, initial analysis of low-spatial frequency (LSF) information guides the slower and later processing of high spatial frequency (HSF) detail. Here, we used multivariate classification as well as time-frequency analysis of MEG responses to the viewing of intact and phase scrambled images of human faces to demonstrate that the availability of redundant LSF information, as found in broadband intact images, correlates with a reduction in HSF representational dominance in both early and higher-level visual areas as well as a reduction of gamma-band power in early visual cortex. Our results indicate that the cross spatial frequency information redundancy that can be found in all natural images might be a driving factor in the efficient integration of fine image details.
Additional information: supplementary materials -
Puebla, G., Martin, A. E., & Doumas, L. A. A. (2021). The relational processing limits of classic and contemporary neural network models of language processing. Language, Cognition and Neuroscience, 36(2), 240-254. doi:10.1080/23273798.2020.1821906.
Whether neural networks can capture relational knowledge is a matter of long-standing controversy. Recently, some researchers have argued that (1) classic connectionist models can handle relational structure and (2) the success of deep learning approaches to natural language processing suggests that structured representations are unnecessary to model human language. We tested the Story Gestalt model, a classic connectionist model of text comprehension, and a Sequence-to-Sequence with Attention model, a modern deep learning architecture for natural language processing. Both models were trained to answer questions about stories based on abstract thematic roles. Two simulations varied the statistical structure of new stories while keeping their relational structure intact. The performance of each model fell below chance under at least one manipulation. We argue that both models fail our tests because they can't perform dynamic binding. These results cast doubts on the suitability of traditional neural networks for explaining relational reasoning and language processing phenomena.
Additional information: supplementary material -
Schilberg, L., Ten Oever, S., Schuhmann, T., & Sack, A. T. (2021). Phase and power modulations on the amplitude of TMS-induced motor evoked potentials. PLoS One, 16(9): e0255815. doi:10.1371/journal.pone.0255815.
The evaluation of transcranial magnetic stimulation (TMS)-induced motor evoked potentials (MEPs) promises valuable information about fundamental brain related mechanisms and may serve as a diagnostic tool for clinical monitoring of therapeutic progress or surgery procedures. However, reports about spontaneous fluctuations of MEP amplitudes causing high intra-individual variability have led to increased concerns about the reliability of this measure. One possible cause for high variability of MEPs could be neuronal oscillatory activity, which reflects fluctuations of membrane potentials that systematically increase and decrease the excitability of neuronal networks. Here, we investigate the dependence of MEP amplitude on oscillation power and phase by combining the application of single pulse TMS over the primary motor cortex with concurrent recordings of electromyography and electroencephalography. Our results show that MEP amplitude is correlated to alpha phase, alpha power as well as beta phase. These findings may help explain corticospinal excitability fluctuations by highlighting the modulatory effect of alpha and beta phase on MEPs. In the future, controlling for such a causal relationship may allow for the development of new protocols, improve this method as a (diagnostic) tool and increase the specificity and efficacy of general TMS applications.
Additional information: data and supporting information -
Ten Oever, S., & Martin, A. E. (2021). An oscillating computational model can track pseudo-rhythmic speech by using linguistic predictions. eLife, 10: e68066. doi:10.7554/eLife.68066.
Neuronal oscillations putatively track speech in order to optimize sensory processing. However, it is unclear how isochronous brain oscillations can track pseudo-rhythmic speech input. Here we propose that oscillations can track pseudo-rhythmic speech when considering that speech time is dependent on content-based predictions flowing from internal language models. We show that temporal dynamics of speech are dependent on the predictability of words in a sentence. A computational model including oscillations, feedback, and inhibition is able to track pseudo-rhythmic speech input. As the model processes, it generates temporal phase codes, which are a candidate mechanism for carrying information forward in time. The model is optimally sensitive to the natural temporal speech dynamics and can explain empirical data on temporal speech illusions. Our results suggest that speech tracking does not have to rely only on the acoustics but could also exploit ongoing interactions between oscillations and constraints flowing from internal language models. -
Ten Oever, S., Sack, A. T., Oehrn, C. R., & Axmacher, N. (2021). An engram of intentionally forgotten information. Nature Communications, 12: 6443. doi:10.1038/s41467-021-26713-x.
Successful forgetting of unwanted memories is crucial for goal-directed behavior and mental wellbeing. While memory retention strengthens memory traces, it is unclear what happens to memory traces of events that are actively forgotten. Using intracranial EEG recordings from lateral temporal cortex, we find that memory traces for actively forgotten information are partially preserved and exhibit unique neural signatures. Memory traces of successfully remembered items show stronger encoding-retrieval similarity in gamma frequency patterns. By contrast, encoding-retrieval similarity of item-specific memory traces of actively forgotten items depends on activity at alpha/beta frequencies commonly associated with functional inhibition. Additional analyses revealed selective modification of item-specific patterns of connectivity and top-down information flow from dorsolateral prefrontal cortex to lateral temporal cortex in memory traces of intentionally forgotten items. These results suggest that intentional forgetting relies more on inhibitory top-down connections than intentional remembering, resulting in inhibitory memory traces with unique neural signatures and representational formats.
Additional information: supplementary figures -
Zhang, Y., Ding, R., Frassinelli, D., Tuomainen, J., Klavinskis-Whiting, S., & Vigliocco, G. (2021). Electrophysiological signatures of second language multimodal comprehension. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 2971-2977). Vienna: Cognitive Science Society.
Language is multimodal: non-linguistic cues, such as prosody, gestures and mouth movements, are always present in face-to-face communication and interact to support processing. In this paper, we ask whether and how multimodal cues affect L2 processing by recording EEG for highly proficient bilinguals when watching naturalistic materials. For each word, we quantified surprisal and the informativeness of prosody, gestures, and mouth movements. We found that each cue modulates the N400: prosodic accentuation, meaningful gestures, and informative mouth movements all reduce N400. Further, effects of meaningful gestures but not mouth informativeness are enhanced by prosodic accentuation, whereas effects of mouth are enhanced by meaningful gestures but reduced by beat gestures. Compared with L1, L2 participants benefit less from cues and their interactions, except for meaningful gestures and mouth movements. Thus, in real-world language comprehension, L2 comprehenders use multimodal cues just as L1 speakers albeit to a lesser extent. -
Coopmans, C. W., & Schoenmakers, G.-J. (2020). Incremental structure building of preverbal PPs in Dutch. Linguistics in the Netherlands, 37(1), 38-52. doi:10.1075/avt.00036.coo.
Incremental comprehension of head-final constructions can reveal structural attachment preferences for ambiguous phrases. This study investigates how temporarily ambiguous PPs are processed in Dutch verb-final constructions. In De aannemer heeft op het dakterras bespaard/gewerkt ‘The contractor has on the roof terrace saved/worked’, the PP is locally ambiguous between attachment as argument and as adjunct. This ambiguity is resolved by the sentence-final verb. In a self-paced reading task, we manipulated the argument/adjunct status of the PP, and its position relative to the verb. While we found no reading-time differences between argument and adjunct PPs, we did find that transitive verbs, for which the PP is an argument, were read more slowly than intransitive verbs, for which the PP is an adjunct. We suggest that Dutch parsers have a preference for adjunct attachment of preverbal PPs, and discuss our findings in terms of incremental parsing models that aim to minimize costly reanalysis. -
Cutter, M. G., Martin, A. E., & Sturt, P. (2020). Capitalization interacts with syntactic complexity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 46(6), 1146-1164. doi:10.1037/xlm0000780.
We investigated whether readers use the low-level cue of proper noun capitalization in the parafovea to infer syntactic category, and whether this results in an early update of the representation of a sentence’s syntactic structure. Participants read sentences containing either a subject relative or object relative clause, in which the relative clause’s overt argument was a proper noun (e.g., The tall lanky guard who alerted Charlie/Charlie alerted to the danger was young) across three experiments. In Experiment 1 these sentences were presented in normal sentence casing or entirely in upper case. In Experiment 2 participants received either valid or invalid parafoveal previews of the relative clause. In Experiment 3 participants viewed relative clauses in only normal conditions. We hypothesized that we would observe relative clause effects (i.e., inflated fixation times for object relative clauses) while readers were still fixated on the word who, if readers use capitalization to infer a parafoveal word’s syntactic class. This would constitute a syntactic parafoveal-on-foveal effect. Furthermore, we hypothesized that this effect should be influenced by sentence casing in Experiment 1 (with no cue for syntactic category being available in upper case sentences) but not by parafoveal preview validity of the target words. We observed syntactic parafoveal-on-foveal effects in Experiments 1 and 3, and in a Bayesian analysis of the combined data from all three experiments. These effects seemed to be influenced more by noun capitalization than lexical processing. We discuss our findings in relation to models of eye movement control and sentence processing theories. -
Cutter, M. G., Martin, A. E., & Sturt, P. (2020). Readers detect an low-level phonological violation between two parafoveal words. Cognition, 204: 104395. doi:10.1016/j.cognition.2020.104395.
In two eye-tracking studies we investigated whether readers can detect a violation of the phonological-grammatical convention for the indefinite article an to be followed by a word beginning with a vowel when these two words appear in the parafovea. Across two experiments participants read sentences in which the word an was followed by a parafoveal preview that was either correct (e.g. Icelandic), incorrect and represented a phonological violation (e.g. Mongolian), or incorrect without representing a phonological violation (e.g. Ethiopian), with this parafoveal preview changing to the target word as participants made a saccade into the space preceding an. Our data suggests that participants detected the phonological violation while the target word was still two words to the right of fixation, with participants making more regressions from the previewed word and having longer go-past times on this word when they received a violation preview as opposed to a non-violation preview. We argue that participants were attempting to perform aspects of sentence integration on the basis of low-level orthographic information from the previewed word.
Additional information: Data files and R Scripts -
Cutter, M. G., Martin, A. E., & Sturt, P. (2020). The activation of contextually predictable words in syntactically illegal positions. Quarterly Journal of Experimental Psychology, 73(9), 1423-1430. doi:10.1177/1747021820911021.
We present an eye-tracking study testing a hypothesis emerging from several theories of prediction during language processing, whereby predictable words should be skipped more than unpredictable words even in syntactically illegal positions. Participants read sentences in which a target word became predictable by a certain point (e.g., “bone” is 92% predictable given, “The dog buried his. . .”), with the next word actually being an intensifier (e.g., “really”), which a noun cannot follow. The target noun remained predictable to appear later in the sentence. We used the boundary paradigm to present the predictable noun or an alternative unpredictable noun (e.g., “food”) directly after the intensifier, until participants moved beyond the intensifier, at which point the noun changed to a syntactically legal word. Participants also read sentences in which predictable or unpredictable nouns appeared in syntactically legal positions. A Bayesian linear-mixed model suggested a 5.7% predictability effect on skipping of nouns in syntactically legal positions, and a 3.1% predictability effect on skipping of nouns in illegal positions. We discuss our findings in relation to theories of lexical prediction during reading.
Additional information: OSF data -
Doumas, L. A. A., Martin, A. E., & Hummel, J. E. (2020). Relation learning in a neurocomputational architecture supports cross-domain transfer. In S. Denison, M. Mack, Y. Xu, & B. C. Armstrong (Eds.), Proceedings of the 42nd Annual Virtual Meeting of the Cognitive Science Society (CogSci 2020) (pp. 932-937). Montreal, QB: Cognitive Science Society.
Humans readily generalize, applying prior knowledge to novel situations and stimuli. Advances in machine learning have begun to approximate and even surpass human performance, but these systems struggle to generalize what they have learned to untrained situations. We present a model based on well-established neurocomputational principles that demonstrates human-level generalisation. This model is trained to play one video game (Breakout) and performs one-shot generalisation to a new game (Pong) with different characteristics. The model generalizes because it learns structured representations that are functionally symbolic (viz., a role-filler binding calculus) from unstructured training data. It does so without feedback, and without requiring that structured representations are specified a priori. Specifically, the model uses neural co-activation to discover which characteristics of the input are invariant and to learn relational predicates, and oscillatory regularities in network firing to bind predicates to arguments. To our knowledge, this is the first demonstration of human-like generalisation in a machine system that does not assume structured representations to begin with. -
Hashemzadeh, M., Kaufeld, G., White, M., Martin, A. E., & Fyshe, A. (2020). From language to language-ish: How brain-like is an LSTM representation of nonsensical language stimuli? In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 645-655). Association for Computational Linguistics.
The representations generated by many models of language (word embeddings, recurrent neural networks and transformers) correlate to brain activity recorded while people read. However, these decoding results are usually based on the brain’s reaction to syntactically and semantically sound language stimuli. In this study, we asked: how does an LSTM (long short term memory) language model, trained (by and large) on semantically and syntactically intact language, represent a language sample with degraded semantic or syntactic information? Does the LSTM representation still resemble the brain’s reaction? We found that, even for some kinds of nonsensical language, there is a statistically significant relationship between the brain’s activity and the representations of an LSTM. This indicates that, at least in some instances, LSTMs and the human brain handle nonsensical data similarly. -
Kaufeld, G., Naumann, W., Meyer, A. S., Bosker, H. R., & Martin, A. E. (2020). Contextual speech rate influences morphosyntactic prediction and integration. Language, Cognition and Neuroscience, 35(7), 933-948. doi:10.1080/23273798.2019.1701691.
Understanding spoken language requires the integration and weighting of multiple cues, and may call on cue integration mechanisms that have been studied in other areas of perception. In the current study, we used eye-tracking (visual-world paradigm) to examine how contextual speech rate (a lower-level, perceptual cue) and morphosyntactic knowledge (a higher-level, linguistic cue) are iteratively combined and integrated. Results indicate that participants used contextual rate information immediately, which we interpret as evidence of perceptual inference and the generation of predictions about upcoming morphosyntactic information. Additionally, we observed that early rate effects remained active in the presence of later conflicting lexical information. This result demonstrates that (1) contextual speech rate functions as a cue to morphosyntactic inferences, even in the presence of subsequent disambiguating information; and (2) listeners iteratively use multiple sources of information to draw inferences and generate predictions during speech comprehension. We discuss the implication of these demonstrations for theories of language processing. -
Kaufeld, G., Ravenschlag, A., Meyer, A. S., Martin, A. E., & Bosker, H. R. (2020). Knowledge-based and signal-based cues are weighted flexibly during spoken language comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 46(3), 549-562. doi:10.1037/xlm0000744.
During spoken language comprehension, listeners make use of both knowledge-based and signal-based sources of information, but little is known about how cues from these distinct levels of representational hierarchy are weighted and integrated online. In an eye-tracking experiment using the visual world paradigm, we investigated the flexible weighting and integration of morphosyntactic gender marking (a knowledge-based cue) and contextual speech rate (a signal-based cue). We observed that participants used the morphosyntactic cue immediately to make predictions about upcoming referents, even in the presence of uncertainty about the cue’s reliability. Moreover, we found speech rate normalization effects in participants’ gaze patterns even in the presence of preceding morphosyntactic information. These results demonstrate that cues are weighted and integrated flexibly online, rather than adhering to a strict hierarchy. We further found rate normalization effects in the looking behavior of participants who showed a strong behavioral preference for the morphosyntactic gender cue. This indicates that rate normalization effects are robust and potentially automatic. We discuss these results in light of theories of cue integration and the two-stage model of acoustic context effects. -
Kaufeld, G., Bosker, H. R., Ten Oever, S., Alday, P. M., Meyer, A. S., & Martin, A. E. (2020). Linguistic structure and meaning organize neural oscillations into a content-specific hierarchy. The Journal of Neuroscience, 40(49), 9467-9475. doi:10.1523/JNEUROSCI.0302-20.2020.
Neural oscillations track linguistic information during speech comprehension (e.g., Ding et al., 2016; Keitel et al., 2018), and are known to be modulated by acoustic landmarks and speech intelligibility (e.g., Doelling et al., 2014; Zoefel & VanRullen, 2015). However, studies investigating linguistic tracking have either relied on non-naturalistic isochronous stimuli or failed to fully control for prosody. Therefore, it is still unclear whether low frequency activity tracks linguistic structure during natural speech, where linguistic structure does not follow such a palpable temporal pattern. Here, we measured electroencephalography (EEG) and manipulated the presence of semantic and syntactic information apart from the timescale of their occurrence, while carefully controlling for the acoustic-prosodic and lexical-semantic information in the signal. EEG was recorded while 29 adult native speakers (22 women, 7 men) listened to naturally-spoken Dutch sentences, jabberwocky controls with morphemes and sentential prosody, word lists with lexical content but no phrase structure, and backwards acoustically-matched controls. Mutual information (MI) analysis revealed sensitivity to linguistic content: MI was highest for sentences at the phrasal (0.8-1.1 Hz) and lexical timescale (1.9-2.8 Hz), suggesting that the delta-band is modulated by lexically-driven combinatorial processing beyond prosody, and that linguistic content (i.e., structure and meaning) organizes neural oscillations beyond the timescale and rhythmicity of the stimulus. This pattern is consistent with neurophysiologically inspired models of language comprehension (Martin, 2016, 2020; Martin & Doumas, 2017) where oscillations encode endogenously generated linguistic content over and above exogenous or stimulus-driven timing and rhythm information. -
Martin, A. E. (2020). A compositional neural architecture for language. Journal of Cognitive Neuroscience, 32(8), 1407-1427. doi:10.1162/jocn_a_01552.
Hierarchical structure and compositionality imbue human language with unparalleled expressive power and set it apart from other perception–action systems. However, neither formal nor neurobiological models account for how these defining computational properties might arise in a physiological system. I attempt to reconcile hierarchy and compositionality with principles from cell assembly computation in neuroscience; the result is an emerging theory of how the brain could convert distributed perceptual representations into hierarchical structures across multiple timescales while representing interpretable incremental stages of (de) compositional meaning. The model's architecture—a multidimensional coordinate system based on neurophysiological models of sensory processing—proposes that a manifold of neural trajectories encodes sensory, motor, and abstract linguistic states. Gain modulation, including inhibition, tunes the path in the manifold in accordance with behavior and is how latent structure is inferred. As a consequence, predictive information about upcoming sensory input during production and comprehension is available without a separate operation. The proposed processing mechanism is synthesized from current models of neural entrainment to speech, concepts from systems neuroscience and category theory, and a symbolic-connectionist computational model that uses time and rhythm to structure information. I build on evidence from cognitive neuroscience and computational modeling that suggests a formal and mechanistic alignment between structure building and neural oscillations and moves toward unifying basic insights from linguistics and psycholinguistics with the currency of neural computation. -
Meyer, L., Sun, Y., & Martin, A. E. (2020). Synchronous, but not entrained: Exogenous and endogenous cortical rhythms of speech and language processing. Language, Cognition and Neuroscience, 35(9), 1089-1099. doi:10.1080/23273798.2019.1693050.
Research on speech processing is often focused on a phenomenon termed “entrainment”, whereby the cortex shadows rhythmic acoustic information with oscillatory activity. Entrainment has been observed to a range of rhythms present in speech; in addition, synchronicity with abstract information (e.g. syntactic structures) has been observed. Entrainment accounts face two challenges: First, speech is not exactly rhythmic; second, synchronicity with representations that lack a clear acoustic counterpart has been described. We propose that apparent entrainment does not always result from acoustic information. Rather, internal rhythms may have functionalities in the generation of abstract representations and predictions. While acoustics may often provide punctate opportunities for entrainment, internal rhythms may also live a life of their own to infer and predict information, leading to intrinsic synchronicity – not to be counted as entrainment. This possibility may open up new research avenues in the psycho– and neurolinguistic study of language processing and language development. -
Meyer, L., Sun, Y., & Martin, A. E. (2020). “Entraining” to speech, generating language? Language, Cognition and Neuroscience, 35(9), 1138-1148. doi:10.1080/23273798.2020.1827155.
Could meaning be read from acoustics, or from the refraction rate of pyramidal cells innervated by the cochlea, everyone would be an omniglot. Speech does not contain sufficient acoustic cues to identify linguistic units such as morphemes, words, and phrases without prior knowledge. Our target article (Meyer, L., Sun, Y., & Martin, A. E. (2019). Synchronous, but not entrained: Exogenous and endogenous cortical rhythms of speech and language processing. Language, Cognition and Neuroscience, 1–11. https://doi.org/10.1080/23273798.2019.1693050) thus questioned the concept of “entrainment” of neural oscillations to such units. We suggested that synchronicity with these points to the existence of endogenous functional “oscillators”—or population rhythmic activity in Giraud’s (2020) terms—that underlie the inference, generation, and prediction of linguistic units. Here, we address a series of inspirational commentaries by our colleagues. As apparent from these, some issues raised by our target article have already been raised in the literature. Psycho– and neurolinguists might still benefit from our reply, as “oscillations are an old concept in vision and motor functions, but a new one in linguistics” (Giraud, A.-L. 2020. Oscillations for all A commentary on Meyer, Sun & Martin (2020). Language, Cognition and Neuroscience, 1–8). -
Ten Oever, S., Meierdierks, T., Duecker, F., De Graaf, T., & Sack, A. (2020). Phase-coded oscillatory ordering promotes the separation of closely matched representations to optimize perceptual discrimination. iScience, 23(7): 101282. doi:10.1016/j.isci.2020.101282.
Low-frequency oscillations are proposed to be involved in separating neuronal representations belonging to different items. Although item-specific neuronal activity was found to cluster on different oscillatory phases, the influence of this mechanism on perception is unknown. Here, we investigated the perceptual consequences of neuronal item separation through oscillatory clustering. In an electroencephalographic experiment, participants categorized sounds parametrically varying in pitch, relative to an arbitrary pitch boundary. Pre-stimulus theta and alpha phase biased near-boundary sound categorization to one category or the other. Phase also modulated whether evoked neuronal responses contributed more strongly to the fit of the sound envelope of one or another category. Intriguingly, participants with stronger oscillatory clustering (phase strongly biasing sound categorization) in the theta, but not alpha, range had steeper perceptual psychometric slopes (sharper sound category discrimination). These results indicate that neuronal sorting by phase directly influences subsequent perception and has a positive impact on discrimination performance.
Additional information: Supplemental Information -
Ten Oever, S., De Weerd, P., & Sack, A. T. (2020). Phase-dependent amplification of working memory content and performance. Nature Communications, 11: 1832. doi:10.1038/s41467-020-15629-7.
Successful working memory performance has been related to oscillatory mechanisms operating in low-frequency ranges. Yet, their mechanistic interaction with the distributed neural activity patterns representing the content of the memorized information remains unclear. Here, we record EEG during a working memory retention interval, while a task-irrelevant, high-intensity visual impulse stimulus is presented to boost the read-out of distributed neural activity related to the content held in working memory. Decoding of this activity with a linear classifier reveals significant modulations of classification accuracy by oscillatory phase in the theta/alpha ranges at the moment of impulse presentation. Additionally, behavioral accuracy is highest at the phases showing maximized decoding accuracy. At those phases, behavioral accuracy is higher in trials with the impulse compared to no-impulse trials. This constitutes the first evidence in humans that working memory information is maximized within limited phase ranges, and that phase-selective, sensory impulse stimulation can improve working memory. -
Brennan, J. R., & Martin, A. E. (2019). Phase synchronization varies systematically with linguistic structure composition. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 375(1791): 20190305. doi:10.1098/rstb.2019.0305.
Computation in neuronal assemblies is putatively reflected in the excitatory and inhibitory cycles of activation distributed throughout the brain. In speech and language processing, coordination of these cycles resulting in phase synchronization has been argued to reflect the integration of information on different timescales (e.g. segmenting acoustic signals to phonemic and syllabic representations; Giraud and Poeppel 2012 Nat. Neurosci. 15, 511 (doi:10.1038/nn.3063)). A natural extension of this claim is that phase synchronization functions similarly to support the inference of more abstract higher-level linguistic structures (Martin 2016 Front. Psychol. 7, 120; Martin and Doumas 2017 PLoS Biol. 15, e2000663 (doi:10.1371/journal.pbio.2000663); Martin and Doumas 2019 Curr. Opin. Behav. Sci. 29, 77–83 (doi:10.1016/j.cobeha.2019.04.008)). Hale et al. (Hale et al. 2018 Finding syntax in human encephalography with beam search. arXiv 1806.04127 (http://arxiv.org/abs/1806.04127)) showed that syntactically driven parsing decisions predict electroencephalography (EEG) responses in the time domain; here we ask whether phase synchronization in the form of either inter-trial phase coherence or cross-frequency coupling (CFC) between high-frequency (i.e. gamma) bursts and lower-frequency carrier signals (i.e. delta, theta) changes as the linguistic structures of compositional meaning (viz., bracket completions, as denoted by the onset of words that complete phrases) accrue. We use a naturalistic story-listening EEG dataset from Hale et al. to assess the relationship between linguistic structure and phase alignment. We observe increased phase synchronization as a function of phrase counts in the delta, theta, and gamma bands, especially for function words. A more complex pattern emerged for CFC as phrase count changed, possibly related to the lack of a one-to-one mapping between ‘size’ of linguistic structure and frequency band, an assumption that is tacit in recent frameworks. These results emphasize the important role that phase synchronization, desynchronization, and thus, inhibition, play in the construction of compositional meaning by distributed neural networks in the brain. -
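For readers unfamiliar with the measure, inter-trial phase coherence is commonly defined as the length of the average unit phasor across trials at a given time and frequency. The sketch below uses that common definition with made-up phase values; the function name `itpc` and the toy inputs are illustrative assumptions, not the analysis pipeline of the paper.

```python
import cmath
import math

def itpc(phases):
    """Length of the mean unit phasor across trials.

    Returns a value near 1 when all trials share the same phase and
    near 0 when phases are spread evenly around the circle.
    """
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)

# Toy phase angles in radians (hypothetical, one value per trial).
aligned = [0.3] * 8                                # identical phase on every trial
spread = [2 * math.pi * k / 8 for k in range(8)]   # evenly spaced around the circle

itpc_aligned = itpc(aligned)
itpc_spread = itpc(spread)
```

In practice the phases would come from a time-frequency decomposition of the EEG, with one phase value per trial at each time-frequency point.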
Martin, A. E., & Baggio, G. (2019). Modeling meaning composition from formalism to mechanism. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 375: 20190298. doi:10.1098/rstb.2019.0298.
Human thought and language have extraordinary expressive power because meaningful parts can be assembled into more complex semantic structures. This partly underlies our ability to compose meanings into endlessly novel configurations, and sets us apart from other species and current computing devices. Crucially, human behaviour, including language use and linguistic data, indicates that composing parts into complex structures does not threaten the existence of constituent parts as independent units in the system: parts and wholes exist simultaneously yet independently from one another in the mind and brain. This independence is evident in human behaviour, but it seems at odds with what is known about the brain's exquisite sensitivity to statistical patterns: everyday language use is productive and expressive precisely because it can go beyond statistical regularities. Formal theories in philosophy and linguistics explain this fact by assuming that language and thought are compositional: systems of representations that separate a variable (or role) from its values (fillers), such that the meaning of a complex expression is a function of the values assigned to the variables. The debate on whether and how compositional systems could be implemented in minds, brains and machines remains vigorous. However, it has not yet resulted in mechanistic models of semantic composition: how, then, are the constituents of thoughts and sentences put and held together? We review and discuss current efforts at understanding this problem, and we chart possible routes for future research. -
Martin, A. E., & Doumas, L. A. A. (2019). Tensors and compositionality in neural systems. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 375(1791): 20190306. doi:10.1098/rstb.2019.0306.
Neither neurobiological nor process models of meaning composition specify the operator through which constituent parts are bound together into compositional structures. In this paper, we argue that a neurophysiological computation system cannot achieve the compositionality exhibited in human thought and language if it were to rely on a multiplicative operator to perform binding, as the tensor product (TP)-based systems that have been widely adopted in cognitive science, neuroscience and artificial intelligence do. We show via simulation and two behavioural experiments that TPs violate variable-value independence, but human behaviour does not. Specifically, TPs fail to capture that in the statements fuzzy cactus and fuzzy penguin, both cactus and penguin are predicated by fuzzy(x) and belong to the set of fuzzy things, rendering these arguments similar to each other. Consistent with that thesis, people judged arguments that shared the same role to be similar, even when those arguments themselves (e.g., cacti and penguins) were judged to be dissimilar when in isolation. By contrast, the similarity of the TPs representing fuzzy(cactus) and fuzzy(penguin) was determined by the similarity of the arguments, which in this case approaches zero. Based on these results, we argue that neural systems that use TPs for binding cannot approximate how the human mind and brain represent compositional information during processing. We describe a contrasting binding mechanism that any physiological or artificial neural system could use to maintain independence between a role and its argument, a prerequisite for compositionality and, thus, for instantiating the expressive power of human thought and language in a neural system. -
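The claim that TP similarity is determined by the arguments follows from a property of the outer product: when two bindings share a role vector, their cosine similarity equals the cosine similarity of the fillers alone. The sketch below checks this numerically. The random vectors named `fuzzy`, `cactus` and `penguin`, the dimensionality, and the helper functions are placeholders chosen for illustration, not the representations or simulation used in the paper.

```python
import math
import random

def outer(role, filler):
    """Tensor (outer) product binding, flattened into one vector."""
    return [r * f for r in role for f in filler]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical random codes for one role and two fillers.
rng = random.Random(0)
dim = 50
fuzzy = [rng.gauss(0, 1) for _ in range(dim)]
cactus = [rng.gauss(0, 1) for _ in range(dim)]
penguin = [rng.gauss(0, 1) for _ in range(dim)]

filler_similarity = cosine(cactus, penguin)
bound_similarity = cosine(outer(fuzzy, cactus), outer(fuzzy, penguin))
# The shared role cancels out, so the two values agree up to rounding.
```

Because the shared role contributes the same factor to the dot product and to both norms, binding to fuzzy adds no similarity between the two fillers, which is the violation of role-based similarity the abstract describes.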
Martin, A. E., & Doumas, L. A. A. (2019). Predicate learning in neural systems: Using oscillations to discover latent structure. Current Opinion in Behavioral Sciences, 29, 77-83. doi:10.1016/j.cobeha.2019.04.008.
Humans learn to represent complex structures (e.g. natural language, music, mathematics) from experience with their environments. Often such structures are latent, hidden, or not encoded in statistics about sensory representations alone. Accounts of human cognition have long emphasized the importance of structured representations, yet the majority of contemporary neural networks do not learn structure from experience. Here, we describe one way that structured, functionally symbolic representations can be instantiated in an artificial neural network. Then, we describe how such latent structures (viz. predicates) can be learned from experience with unstructured data. Our approach exploits two principles from psychology and neuroscience: comparison of representations, and the naturally occurring dynamic properties of distributed computing across neuronal assemblies (viz. neural oscillations). We discuss how the ability to learn predicates from experience, to represent information compositionally, and to extrapolate knowledge to unseen data is core to understanding and modeling the most complex human behaviors (e.g. relational reasoning, analogy, language processing, game play). -
Martin, A. E., & Doumas, L. A. A. (2017). A mechanism for the cortical computation of hierarchical linguistic structure. PLoS Biology, 15(3): e2000663. doi:10.1371/journal.pbio.2000663.
Biological systems often detect species-specific signals in the environment. In humans, speech and language are species-specific signals of fundamental biological importance. To detect the linguistic signal, human brains must form hierarchical representations from a sequence of perceptual inputs distributed in time. What mechanism underlies this ability? One hypothesis is that the brain repurposed an available neurobiological mechanism when hierarchical linguistic representation became an efficient solution to a computational problem posed to the organism. Under such an account, a single mechanism must have the capacity to perform multiple, functionally related computations, e.g., detect the linguistic signal and perform other cognitive functions, while, ideally, oscillating like the human brain. We show that a computational model of analogy, built for an entirely different purpose (learning relational reasoning), processes sentences, represents their meaning, and, crucially, exhibits oscillatory activation patterns resembling cortical signals elicited by the same stimuli. Such redundancy in the cortical and machine signals is indicative of formal and mechanistic alignment between representational structure building and “cortical” oscillations. By inductive inference, this synergy suggests that the cortical signal reflects structure generation, just as the machine signal does. A single mechanism, using time to encode information across a layered network, generates the kind of (de)compositional representational hierarchy that is crucial for human language and offers a mechanistic linking hypothesis between linguistic representation and cortical computation. -
Martin, A. E. (2016). Language processing as cue integration: Grounding the psychology of language in perception and neurophysiology. Frontiers in Psychology, 7: 120. doi:10.3389/fpsyg.2016.00120.
I argue that cue integration, a psychophysiological mechanism from vision and multisensory perception, offers a computational linking hypothesis between psycholinguistic theory and neurobiological models of language. I propose that this mechanism, which incorporates probabilistic estimates of a cue's reliability, might function in language processing from the perception of a phoneme to the comprehension of a phrase structure. I briefly consider the implications of the cue integration hypothesis for an integrated theory of language that includes acquisition, production, dialogue and bilingualism, while grounding the hypothesis in canonical neural computation.