Displaying 1 - 51 of 51
-
Özyürek, A. (in press). Multimodal language, diversity and neuro-cognition. In D. Bradley, K. Dziubalska-Kołaczyk, C. Hamans, I.-H. Lee, & F. Steurs (
Eds. ), Contemporary Linguistics Integrating Languages, Communities, and Technologies: Special edition prepared for the participants of the 21st International Congress of Linguists (ICL). BRILL Press. -
Lokhesh, N. N., Swaminathan, K., Shravan, G., Menon, D., Mishra, S., Nandanwar, A., & Mishra, C. (2025). Welcome to the library: Integrating social robots in Indian libraries. In O. Palinko, L. Bodenhagen, J.-J. Cabibihan, K. Fischer, S. Šabanović, K. Winkle, L. Behera, S. S. Ge, D. Chrysostomou, W. Jiang, & H. He (
Eds. ), Social Robotics: 16th International Conference, ICSR + AI 2024, Odense, Denmark, October 23–26, 2024, Proceedings (pp. 239-246). Singapore: Springer. doi:10.1007/978-981-96-3525-2_20.Abstract
Libraries are very often considered the hallway to developing knowledge. However, the lack of adequate staff within Indian libraries makes catering to the visitors’ needs difficult. Previous systems that have sought to address libraries’ needs through automation have mostly been limited to storage and fetching aspects while lacking in their interaction aspect. We propose to address this issue by incorporating social robots within Indian libraries that can communicate and address the visitors’ queries in a multi-modal fashion attempting to make the experience more natural and appealing while helping reduce the burden on the librarians. In this paper, we propose and deploy a Furhat robot as a robot librarian by programming it on certain core librarian functionalities. We evaluate our system with a physical robot librarian (N = 26). The results show that the robot librarian was found to be very informative and overall left with a positive impression and preference. -
Mishra, C., Skantze, G., Hagoort, P., & Verdonschot, R. G. (2025). Perception of emotions in human and robot faces: Is the eye region enough? In O. Palinko, L. Bodenhagen, J.-J. Cabihihan, K. Fischer, S. Šabanović, K. Winkle, L. Behera, S. S. Ge, D. Chrysostomou, W. Jiang, & H. He (
Eds. ), Social Robotics: 116th International Conference, ICSR + AI 2024, Odense, Denmark, October 23–26, 2024, Proceedings (pp. 290-303). Singapore: Springer.Abstract
The increased interest in developing next-gen social robots has raised questions about the factors affecting the perception of robot emotions. This study investigates the impact of robot appearances (human-like, mechanical) and face regions (full-face, eye-region) on human perception of robot emotions. A between-subjects user study (N = 305) was conducted where participants were asked to identify the emotions being displayed in videos of robot faces, as well as a human baseline. Our findings reveal three important insights for effective social robot face design in Human-Robot Interaction (HRI): Firstly, robots equipped with a back-projected, fully animated face – regardless of whether they are more human-like or more mechanical-looking – demonstrate a capacity for emotional expression comparable to that of humans. Secondly, the recognition accuracy of emotional expressions in both humans and robots declines when only the eye region is visible. Lastly, within the constraint of only the eye region being visible, robots with more human-like features significantly enhance emotion recognition. -
Akamine, S., Ghaleb, E., Rasenberg, M., Fernandez, R., Meyer, A. S., & Özyürek, A. (2024). Speakers align both their gestures and words not only to establish but also to maintain reference to create shared labels for novel objects in interaction. In L. K. Samuelson, S. L. Frank, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 2435-2442).Abstract
When we communicate with others, we often repeat aspects of each other's communicative behavior such as sentence structures and words. Such behavioral alignment has been mostly studied for speech or text. Yet, language use is mostly multimodal, flexibly using speech and gestures to convey messages. Here, we explore the use of alignment in speech (words) and co-speech gestures (iconic gestures) in a referential communication task aimed at finding labels for novel objects in interaction. In particular, we investigate how people flexibly use lexical and gestural alignment to create shared labels for novel objects and whether alignment in speech and gesture are related over time. The present study shows that interlocutors establish shared labels multimodally, and alignment in words and iconic gestures are used throughout the interaction. We also show that the amount of lexical alignment positively associates with the amount of gestural alignment over time, suggesting a close relationship between alignment in the vocal and manual modalities.Additional information
link to eScholarship -
Ben-Ami, S., Shukla, Vishakha, V., Gupta, P., Shah, P., Ralekar, C., Ganesh, S., Gilad-Gutnick, S., Rubio-Fernández, P., & Sinha, P. (2024). Form perception as a bridge to real-world functional proficiency. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 6094-6102).Abstract
Recognizing the limitations of standard vision assessments in capturing the real-world capabilities of individuals with low vision, we investigated the potential of the Seguin Form Board Test (SFBT), a widely-used intelligence assessment employing a visuo-haptic shape-fitting task, as an estimator of vision's practical utility. We present findings from 23 children from India, who underwent treatment for congenital bilateral dense cataracts, and 21 control participants. To assess the development of functional visual ability, we conducted the SFBT and the standard measure of visual acuity, before and longitudinally after treatment. We observed a dissociation in the development of shape-fitting and visual acuity. Improvements of patients' shape-fitting preceded enhancements in their visual acuity after surgery and emerged even with acuity worse than that of control participants. Our findings highlight the importance of incorporating multi-modal and cognitive aspects into evaluations of visual proficiency in low-vision conditions, to better reflect vision's impact on daily activities.Additional information
link to eScholarship -
Dona, L., & Schouwstra, M. (2024). Balancing regularization and variation: The roles of priming and motivatedness. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (
Eds. ), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 130-133). Nijmegen: The Evolution of Language Conferences. -
Ghaleb, E., Rasenberg, M., Pouw, W., Toni, I., Holler, J., Özyürek, A., & Fernandez, R. (2024). Analysing cross-speaker convergence through the lens of automatically detected shared linguistic constructions. In L. K. Samuelson, S. L. Frank, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 1717-1723).Abstract
Conversation requires a substantial amount of coordination between dialogue participants, from managing turn taking to negotiating mutual understanding. Part of this coordination effort surfaces as the reuse of linguistic behaviour across speakers, a process often referred to as alignment. While the presence of linguistic alignment is well documented in the literature, several questions remain open, including the extent to which patterns of reuse across speakers have an impact on the emergence of labelling conventions for novel referents. In this study, we put forward a methodology for automatically detecting shared lemmatised constructions---expressions with a common lexical core used by both speakers within a dialogue---and apply it to a referential communication corpus where participants aim to identify novel objects for which no established labels exist. Our analyses uncover the usage patterns of shared constructions in interaction and reveal that features such as their frequency and the amount of different constructions used for a referent are associated with the degree of object labelling convergence the participants exhibit after social interaction. More generally, the present study shows that automatically detected shared constructions offer a useful level of analysis to investigate the dynamics of reference negotiation in dialogue.Additional information
link to eScholarship -
Ghaleb, E., Burenko, I., Rasenberg, M., Pouw, W., Uhrig, P., Holler, J., Toni, I., Ozyurek, A., & Fernandez, R. (2024). Cospeech gesture detection through multi-phase sequence labeling. In Proceedings of IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024) (pp. 4007-4015).
Abstract
Gestures are integral components of face-to-face communication. They unfold over time, often following predictable movement phases of preparation, stroke, and re-
traction. Yet, the prevalent approach to automatic gesture detection treats the problem as binary classification, classifying a segment as either containing a gesture or not, thus failing to capture its inherently sequential and contextual nature. To address this, we introduce a novel framework that reframes the task as a multi-phase sequence labeling problem rather than binary classification. Our model processes sequences of skeletal movements over time windows, uses Transformer encoders to learn contextual embeddings, and leverages Conditional Random Fields to perform sequence labeling. We evaluate our proposal on a large dataset of diverse co-speech gestures in task-oriented face-to-face dialogues. The results consistently demonstrate that our method significantly outperforms strong baseline models in detecting gesture strokes. Furthermore, applying Transformer encoders to learn contextual embeddings from movement sequences substantially improves gesture unit detection. These results highlight our framework’s capacity to capture the fine-grained dynamics of co-speech gesture phases, paving the way for more nuanced and accurate gesture detection and analysis. -
Joshi, A., Mohanty, R., Kanakanti, M., Mangla, A., Choudhary, S., Barbate, M., & Modi, A. (2024). iSign: A benchmark for Indian Sign Language processing. In L.-W. Ku, A. Martins, & V. Srikumar (
Eds. ), Findings of the Association for Computational Linguistics ACL 2024 (pp. 10827-10844). Bangkok, Thailand: Association for Computational Linguistics.Abstract
Indian Sign Language has limited resources for developing machine learning and data-driven approaches for automated language processing. Though text/audio-based language processing techniques have shown colossal research interest and tremendous improvements in the last few years, Sign Languages still need to catch up due to the need for more resources. To bridge this gap, in this work, we propose iSign: a benchmark for Indian Sign Language (ISL) Processing. We make three primary contributions to this work. First, we release one of the largest ISL-English datasets with more than video-sentence/phrase pairs. To the best of our knowledge, it is the largest sign language dataset available for ISL. Second, we propose multiple NLP-specific tasks (including SignVideo2Text, SignPose2Text, Text2Pose, Word Prediction, and Sign Semantics) and benchmark them with the baseline models for easier access to the research community. Third, we provide detailed insights into the proposed benchmarks with a few linguistic insights into the working of ISL. We streamline the evaluation of Sign Language processing, addressing the gaps in the NLP research community for Sign Languages. We release the dataset, tasks and models via the following website: https://exploration-lab.github.io/iSign/
Additional information
dataset, tasks, models -
Long, M., & Rubio-Fernandez, P. (2024). Beyond typicality: Lexical category affects the use and processing of color words. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 4925-4930).Abstract
Speakers and listeners show an informativity bias in the use and interpretation of color modifiers. For example, speakers use color more often when referring to objects that vary in color than to objects with a prototypical color. Likewise, listeners look away from objects with prototypical colors upon hearing that color mentioned. Here we test whether speakers and listeners account for another factor related to informativity: the strength of the association between lexical categories and color. Our results demonstrate that speakers and listeners' choices are indeed influenced by this factor; as such, it should be integrated into current pragmatic theories of informativity and computational models of color reference.Additional information
link to eScholarship -
Mishra, C., Nandanwar, A., & Mishra, S. (2024). HRI in Indian education: Challenges opportunities. In H. Admoni, D. Szafir, W. Johal, & A. Sandygulova (
Eds. ), Designing an introductory HRI course (workshop at HRI 2024). ArXiv. doi:10.48550/arXiv.2403.12223.Abstract
With the recent advancements in the field of robotics and the increased focus on having general-purpose robots widely available to the general public, it has become increasingly necessary to pursue research into Human-robot interaction (HRI). While there have been a lot of works discussing frameworks for teaching HRI in educational institutions with a few institutions already offering courses to students, a consensus on the course content still eludes the field. In this work, we highlight a few challenges and opportunities while designing an HRI course from an Indian perspective. These topics warrant further deliberations as they have a direct impact on the design of HRI courses and wider implications for the entire field. -
Motiekaitytė, K., Grosseck, O., Wolf, L., Bosker, H. R., Peeters, D., Perlman, M., Ortega, G., & Raviv, L. (2024). Iconicity and compositionality in emerging vocal communication systems: a Virtual Reality approach. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (
Eds. ), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 387-389). Nijmegen: The Evolution of Language Conferences. -
Ronderos, C. R., Zhang, Y., & Rubio-Fernandez, P. (2024). Weighted parameters in demonstrative use: The case of Spanish teens and adults. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 3279-3286).Additional information
link to eScholarship -
Rubio-Fernandez, P., Long, M., Shukla, V., Bhatia, V., Mahapatra, A., Ralekar, C., Ben-Ami, S., & Sinha, P. (2024). Multimodal communication in newly sighted children: An investigation of the relation between visual experience and pragmatic development. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 2560-2567).Abstract
We investigated the relationship between visual experience and pragmatic development by testing the socio-communicative skills of a unique population: the Prakash children of India, who received treatment for congenital cataracts after years of visual deprivation. Using two different referential communication tasks, our study investigated Prakash' children ability to produce sufficiently informative referential expressions (e.g., ‘the green pear' or ‘the small plate') and pay attention to their interlocutor's face during the task (Experiment 1), as well as their ability to recognize a speaker's referential intent through non-verbal cues such as head turning and pointing (Experiment 2). Our results show that Prakash children have strong pragmatic skills, but do not look at their interlocutor's face as often as neurotypical children do. However, longitudinal analyses revealed an increase in face fixations, suggesting that over time, Prakash children come to utilize their improved visual skills for efficient referential communication.Additional information
link to eScholarship -
Dong, T., & Toneva, M. (2023). Modeling brain responses to video stimuli using multimodal video transformers. In Proceedings of the Conference on Cognitive Computational Neuroscience (CCN 2023) (pp. 194-197).
Abstract
Prior work has shown that internal representations of artificial neural networks can significantly predict brain responses elicited by unimodal stimuli (i.e. reading a book chapter or viewing static images). However, the computational modeling of brain representations of naturalistic video stimuli, such as movies or TV shows, still remains underexplored. In this work, we present a promising approach for modeling vision-language brain representations of video stimuli by a transformer-based model that represents videos jointly through audio, text, and vision. We show that the joint representations of vision and text information are better aligned with brain representations of subjects watching a popular TV show. We further show that the incorporation of visual information improves brain alignment across several regions that support language processing. -
Kanakanti, M., Singh, S., & Shrivastava, M. (2023). MultiFacet: A multi-tasking framework for speech-to-sign language generation. In E. André, M. Chetouani, D. Vaufreydaz, G. Lucas, T. Schultz, L.-P. Morency, & A. Vinciarelli (
Eds. ), ICMI '23 Companion: Companion Publication of the 25th International Conference on Multimodal Interaction (pp. 205-213). New York: ACM. doi:10.1145/3610661.3616550.Abstract
Sign language is a rich form of communication, uniquely conveying meaning through a combination of gestures, facial expressions, and body movements. Existing research in sign language generation has predominantly focused on text-to-sign pose generation, while speech-to-sign pose generation remains relatively underexplored. Speech-to-sign language generation models can facilitate effective communication between the deaf and hearing communities. In this paper, we propose an architecture that utilises prosodic information from speech audio and semantic context from text to generate sign pose sequences. In our approach, we adopt a multi-tasking strategy that involves an additional task of predicting Facial Action Units (FAUs). FAUs capture the intricate facial muscle movements that play a crucial role in conveying specific facial expressions during sign language generation. We train our models on an existing Indian Sign language dataset that contains sign language videos with audio and text translations. To evaluate our models, we report Dynamic Time Warping (DTW) and Probability of Correct Keypoints (PCK) scores. We find that combining prosody and text as input, along with incorporating facial action unit prediction as an additional task, outperforms previous models in both DTW and PCK scores. We also discuss the challenges and limitations of speech-to-sign pose generation models to encourage future research in this domain. We release our models, results and code to foster reproducibility and encourage future research1. -
Dingemanse, M., Liesenfeld, A., & Woensdregt, M. (2022). Convergent cultural evolution of continuers (mhmm). In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (
Eds. ), The Evolution of Language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 160-167). Nijmegen: Joint Conference on Language Evolution (JCoLE). doi:10.31234/osf.io/65c79.Abstract
Continuers —words like mm, mmhm, uhum and the like— are among the most frequent types of responses in conversation. They play a key role in joint action coordination by showing positive evidence of understanding and scaffolding narrative delivery. Here we investigate the hypothesis that their functional importance along with their conversational ecology places selective pressures on their form and may lead to cross-linguistic similarities through convergent cultural evolution. We compare continuer tokens in linguistically diverse conversational corpora and find languages make available highly similar forms. We then approach the causal mechanism of convergent cultural evolution using exemplar modelling, simulating the process by which a combination of effort minimization and functional specialization may push continuers to a particular region of phonological possibility space. By combining comparative linguistics and computational modelling we shed new light on the question of how language structure is shaped by and for social interaction. -
Dona, L., & Schouwstra, M. (2022). The Role of Structural Priming, Semantics and Population Structure in Word Order Conventionalization: A Computational Model. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (
Eds. ), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 171-173). Nijmegen: Joint Conference on Language Evolution (JCoLE). -
Kan, U., Gökgöz, K., Sumer, B., Tamyürek, E., & Özyürek, A. (2022). Emergence of negation in a Turkish homesign system: Insights from the family context. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (
Eds. ), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 387-389). Nijmegen: Joint Conference on Language Evolution (JCoLE). -
Slonimska, A., Özyürek, A., & Capirci, O. (2022). Simultaneity as an emergent property of sign languages. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (
Eds. ), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 678-680). Nijmegen: Joint Conference on Language Evolution (JCoLE). -
Karadöller, D. Z., Sumer, B., Ünal, E., & Ozyurek, A. (2021). Spatial language use predicts spatial memory of children: Evidence from sign, speech, and speech-plus-gesture. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (
Eds. ), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 672-678). Vienna: Cognitive Science Society.Abstract
There is a strong relation between children’s exposure to
spatial terms and their later memory accuracy. In the current
study, we tested whether the production of spatial terms by
children themselves predicts memory accuracy and whether
and how language modality of these encodings modulates
memory accuracy differently. Hearing child speakers of
Turkish and deaf child signers of Turkish Sign Language
described pictures of objects in various spatial relations to each
other and later tested for their memory accuracy of these
pictures in a surprise memory task. We found that having
described the spatial relation between the objects predicted
better memory accuracy. However, the modality of these
descriptions in sign, speech, or speech-plus-gesture did not
reveal differences in memory accuracy. We discuss the
implications of these findings for the relation between spatial
language, memory, and the modality of encoding. -
Mamus, E., Speed, L. J., Ozyurek, A., & Majid, A. (2021). Sensory modality of input influences encoding of motion events in speech but not co-speech gestures. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (
Eds. ), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 376-382). Vienna: Cognitive Science Society.Abstract
Visual and auditory channels have different affordances and
this is mirrored in what information is available for linguistic
encoding. The visual channel has high spatial acuity, whereas
the auditory channel has better temporal acuity. These
differences may lead to different conceptualizations of events
and affect multimodal language production. Previous studies of
motion events typically present visual input to elicit speech and
gesture. The present study compared events presented as audio-
only, visual-only, or multimodal (visual+audio) input and
assessed speech and co-speech gesture for path and manner of
motion in Turkish. Speakers with audio-only input mentioned
path more and manner less in verbal descriptions, compared to
speakers who had visual input. There was no difference in the
type or frequency of gestures across conditions, and gestures
were dominated by path-only gestures. This suggests that input
modality influences speakers’ encoding of path and manner of
motion events in speech, but not in co-speech gestures. -
Pouw, W., Wit, J., Bögels, S., Rasenberg, M., Milivojevic, B., & Ozyurek, A. (2021). Semantically related gestures move alike: Towards a distributional semantics of gesture kinematics. In V. G. Duffy (
Ed. ), Digital human modeling and applications in health, safety, ergonomics and risk management. human body, motion and behavior:12th International Conference, DHM 2021, Held as Part of the 23rd HCI International Conference, HCII 2021 (pp. 269-287). Berlin: Springer. doi:10.1007/978-3-030-77817-0_20. -
Ozyurek, A. (2020). From hands to brains: How does human body talk, think and interact in face-to-face language use? In K. Truong, D. Heylen, & M. Czerwinski (
Eds. ), ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction (pp. 1-2). New York, NY, USA: Association for Computing Machinery. doi:10.1145/3382507.3419442. -
Rasenberg, M., Dingemanse, M., & Ozyurek, A. (2020). Lexical and gestural alignment in interaction and the emergence of novel shared symbols. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (
Eds. ), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 356-358). Nijmegen: The Evolution of Language Conferences. -
Van Arkel, J., Woensdregt, M., Dingemanse, M., & Blokpoel, M. (2020). A simple repair mechanism can alleviate computational demands of pragmatic reasoning: simulations and complexity analysis. In R. Fernández, & T. Linzen (
Eds. ), Proceedings of the 24th Conference on Computational Natural Language Learning (CoNLL 2020) (pp. 177-194). Stroudsburg, PA, USA: The Association for Computational Linguistics. doi:10.18653/v1/2020.conll-1.14.Abstract
How can people communicate successfully while keeping resource costs low in the face of ambiguity? We present a principled theoretical analysis comparing two strategies for disambiguation in communication: (i) pragmatic reasoning, where communicators reason about each other, and (ii) other-initiated repair, where communicators signal and resolve trouble interactively. Using agent-based simulations and computational complexity analyses, we compare the efficiency of these strategies in terms of communicative success, computation cost and interaction cost. We show that agents with a simple repair mechanism can increase efficiency, compared to pragmatic agents, by reducing their computational burden at the cost of longer interactions. We also find that efficiency is highly contingent on the mechanism, highlighting the importance of explicit formalisation and computational rigour. -
Dideriksen, C., Fusaroli, R., Tylén, K., Dingemanse, M., & Christiansen, M. H. (2019). Contextualizing Conversational Strategies: Backchannel, Repair and Linguistic Alignment in Spontaneous and Task-Oriented Conversations. In A. K. Goel, C. M. Seifert, & C. Freksa (
Eds. ), Proceedings of the 41st Annual Conference of the Cognitive Science Society (CogSci 2019) (pp. 261-267). Montreal, QB: Cognitive Science Society.Abstract
Do interlocutors adjust their conversational strategies to the specific contextual demands of a given situation? Prior studies have yielded conflicting results, making it unclear how strategies vary with demands. We combine insights from qualitative and quantitative approaches in a within-participant experimental design involving two different contexts: spontaneously occurring conversations (SOC) and task-oriented conversations (TOC). We systematically assess backchanneling, other-repair and linguistic alignment. We find that SOC exhibit a higher number of backchannels, a reduced and more generic repair format and higher rates of lexical and syntactic alignment. TOC are characterized by a high number of specific repairs and a lower rate of lexical and syntactic alignment. However, when alignment occurs, more linguistic forms are aligned. The findings show that conversational strategies adapt to specific contextual demands. -
Mamus, E., Rissman, L., Majid, A., & Ozyurek, A. (2019). Effects of blindfolding on verbal and gestural expression of path in auditory motion events. In A. K. Goel, C. M. Seifert, & C. C. Freksa (
Eds. ), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2275-2281). Montreal, QB: Cognitive Science Society.Abstract
Studies have claimed that blind people’s spatial representations are different from sighted people, and blind people display superior auditory processing. Due to the nature of auditory and haptic information, it has been proposed that blind people have spatial representations that are more sequential than sighted people. Even the temporary loss of sight—such as through blindfolding—can affect spatial representations, but not much research has been done on this topic. We compared blindfolded and sighted people’s linguistic spatial expressions and non-linguistic localization accuracy to test how blindfolding affects the representation of path in auditory motion events. We found that blindfolded people were as good as sighted people when localizing simple sounds, but they outperformed sighted people when localizing auditory motion events. Blindfolded people’s path related speech also included more sequential, and less holistic elements. Our results indicate that even temporary loss of sight influences spatial representations of auditory motion eventsAdditional information
https://mindmodeling.org/cogsci2019/papers/0395/index.html -
Rissman, L., & Majid, A. (2019). Agency drives category structure in instrumental events. In A. K. Goel, C. M. Seifert, & C. Freksa (
Eds. ), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2661-2667). Montreal, QB: Cognitive Science Society.Abstract
Thematic roles such as Agent and Instrument have a long-standing place in theories of event representation. Nonetheless, the structure of these categories has been difficult to determine. We investigated how instrumental events, such as someone slicing bread with a knife, are categorized in English. Speakers described a variety of typical and atypical instrumental events, and we determined the similarity structure of their descriptions using correspondence analysis. We found that events where the instrument is an extension of an intentional agent were most likely to elicit similar language, highlighting the importance of agency in structuring instrumental categories.Additional information
https://mindmodeling.org/cogsci2019/papers/0455/0455.pdf -
Ter Bekke, M., Ozyurek, A., & Ünal, E. (2019). Speaking but not gesturing predicts motion event memory within and across languages. In A. Goel, C. Seifert, & C. Freksa (
Eds. ), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2940-2946). Montreal, QB: Cognitive Science Society.Abstract
In everyday life, people see, describe and remember motion events. We tested whether the type of motion event information (path or manner) encoded in speech and gesture predicts which information is remembered and if this varies across speakers of typologically different languages. We focus on intransitive motion events (e.g., a woman running to a tree) that are described differently in speech and co-speech gesture across languages, based on how these languages typologically encode manner and path information (Kita & Özyürek, 2003; Talmy, 1985). Speakers of Dutch (n = 19) and Turkish (n = 22) watched and described motion events. With a surprise (i.e. unexpected) recognition memory task, memory for manner and path components of these events was measured. Neither Dutch nor Turkish speakers’ memory for manner went above chance levels. However, we found a positive relation between path speech and path change detection: participants who described the path during encoding were more accurate at detecting changes to the path of an event during the memory task. In addition, the relation between path speech and path memory changed with native language: for Dutch speakers encoding path in speech was related to improved path memory, but for Turkish speakers no such relation existed. For both languages, co-speech gesture did not predict memory speakers. We discuss the implications of these findings for our understanding of the relations between speech, gesture, type of encoding in language and memory.Additional information
https://mindmodeling.org/cogsci2019/papers/0496/0496.pdf -
Azar, Z., Backus, A., & Ozyurek, A. (2017). Highly proficient bilinguals maintain language-specific pragmatic constraints on pronouns: Evidence from speech and gesture. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (
Eds. ), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 81-86). Austin, TX: Cognitive Science Society.Abstract
The use of subject pronouns by bilingual speakers using both a pro-drop and a non-pro-drop language (e.g. Spanish heritage speakers in the USA) is a well-studied topic in research on cross-linguistic influence in language contact situations. Previous studies looking at bilinguals with different proficiency levels have yielded conflicting results on whether there is transfer from the non-pro-drop patterns to the pro-drop language. Additionally, previous research has focused on speech patterns only. In this paper, we study the two modalities of language, speech and gesture, and ask whether and how they reveal cross-linguistic influence on the use of subject pronouns in discourse. We focus on elicited narratives from heritage speakers of Turkish in the Netherlands, in both Turkish (pro-drop) and Dutch (non-pro-drop), as well as from monolingual control groups. The use of pronouns was not very common in monolingual Turkish narratives and was constrained by the pragmatic contexts, unlike in Dutch. Furthermore, Turkish pronouns were more likely to be accompanied by localized gestures than Dutch pronouns, presumably because pronouns in Turkish are pragmatically marked forms. We did not find any cross-linguistic influence in bilingual speech or gesture patterns, in line with studies (speech only) of highly proficient bilinguals. We therefore suggest that speech and gesture parallel each other not only in monolingual but also in bilingual production. Highly proficient heritage speakers who have been exposed to diverse linguistic and gestural patterns of each language from early on maintain monolingual patterns of pragmatic constraints on the use of pronouns multimodally.Additional information
https://mindmodeling.org/cogsci2017/papers/0027/paper0027.pdf -
Karadöller, D. Z., Sumer, B., & Ozyurek, A. (2017). Effects of delayed language exposure on spatial language acquisition by signing children and adults. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (
Eds. ), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 2372-2376). Austin, TX: Cognitive Science Society.Abstract
Deaf children born to hearing parents are exposed to language input quite late, which has long-lasting effects on language production. Previous studies with deaf individuals mostly focused on linguistic expressions of motion events, which have several event components. We do not know if similar effects emerge in simple events such as descriptions of spatial configurations of objects. Moreover, previous data mainly come from late adult signers. There is not much known about language development of late signing children soon after learning sign language. We compared simple event descriptions of late signers of Turkish Sign Language (adults, children) to age-matched native signers. Our results indicate that while late signers in both age groups are native-like in frequency of expressing a relational encoding, they lag behind native signers in using morphologically complex linguistic forms compared to other simple forms. Late signing children perform similar to adults and thus showed no development over time. -
Ortega, G., Schiefner, A., & Ozyurek, A. (2017). Speakers’ gestures predict the meaning and perception of iconicity in signs. In G. Gunzelmann, A. Howe, & T. Tenbrink (
Eds. ), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 889-894). Austin, TX: Cognitive Science Society.Abstract
Sign languages stand out in that there is high prevalence of
conventionalised linguistic forms that map directly to their
referent (i.e., iconic). Hearing adults show low performance
when asked to guess the meaning of iconic signs suggesting
that their iconic features are largely inaccessible to them.
However, it has not been investigated whether speakers’
gestures, which also share the property of iconicity, may
assist non-signers in guessing the meaning of signs. Results
from a pantomime generation task (Study 1) show that
speakers’ gestures exhibit a high degree of systematicity, and
share different degrees of form overlap with signs (full,
partial, and no overlap). Study 2 shows that signs with full
and partial overlap are more accurately guessed and are
assigned higher iconicity ratings than signs with no overlap.
Deaf and hearing adults converge in their iconic depictions
for some concepts due to the shared conceptual knowledge
and manual-visual modality.Additional information
https://mindmodeling.org/cogsci2017/papers/0176/paper0176.pdf -
Azar, Z., Backus, A., & Ozyurek, A. (2016). Pragmatic relativity: Gender and context affect the use of personal pronouns in discourse differentially across languages. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (
Eds. ), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 1295-1300). Austin, TX: Cognitive Science Society.Abstract
Speakers use differential referring expressions in pragmatically appropriate ways to produce coherent narratives. Languages, however, differ in a) whether REs as arguments can be dropped and b) whether personal pronouns encode gender. We examine two languages that differ from each other in these two aspects and ask whether the co-reference context and the gender encoding options affect the use of REs differentially. We elicited narratives from Dutch and Turkish speakers about two types of three-person events, one including people of the same and the other of mixed-gender. Speakers re-introduced referents into the discourse with fuller forms (NPs) and maintained them with reduced forms (overt or null pronoun). Turkish speakers used pronouns mainly to mark emphasis and only Dutch speakers used pronouns differentially across the two types of videos. We argue that linguistic possibilities available in languages tune speakers into taking different principles into account to produce pragmatically coherent narrativesAdditional information
https://mindmodeling.org/cogsci2016/papers/0231/index.html -
Ortega, G., & Ozyurek, A. (2016). Generalisable patterns of gesture distinguish semantic categories in communication without language. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (
Eds. ), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 1182-1187). Austin, TX: Cognitive Science Society.Abstract
There is a long-standing assumption that gestural forms are geared by a set of modes of representation (acting, representing, drawing, moulding) with each technique expressing speakers’ focus of attention on specific aspects of referents (Müller, 2013). Beyond different taxonomies describing the modes of representation, it remains unclear what factors motivate certain depicting techniques over others. Results from a pantomime generation task show that pantomimes are not entirely idiosyncratic but rather follow generalisable patterns constrained by their semantic category. We show that a) specific modes of representations are preferred for certain objects (acting for manipulable objects and drawing for non-manipulable objects); and b) that use and ordering of deictics and modes of representation operate in tandem to distinguish between semantically related concepts (e.g., “to drink” vs “mug”). This study provides yet more evidence that our ability to communicate through silent gesture reveals systematic ways to describe events and objects around usAdditional information
https://mindmodeling.org/cogsci2016/papers/0212/index.html -
Sumer, B., Perniss, P. M., & Ozyurek, A. (2016). Viewpoint preferences in signing children's spatial descriptions. In J. Scott, & D. Waughtal (
Eds. ), Proceedings of the 40th Annual Boston University Conference on Language Development (BUCLD 40) (pp. 360-374). Boston, MA: Cascadilla Press. -
Peeters, D., Snijders, T. M., Hagoort, P., & Ozyurek, A. (2015). The role of left inferior frontal Gyrus in the integration of point- ing gestures and speech. In G. Ferré, & M. Tutton (
Eds. ), Proceedings of the4th GESPIN - Gesture & Speech in Interaction Conference. Nantes: Université de Nantes.Abstract
Comprehension of pointing gestures is fundamental to human communication. However, the neural mechanisms
that subserve the integration of pointing gestures and speech in visual contexts in comprehension
are unclear. Here we present the results of an fMRI study in which participants watched images of an
actor pointing at an object while they listened to her referential speech. The use of a mismatch paradigm
revealed that the semantic unication of pointing gesture and speech in a triadic context recruits left
inferior frontal gyrus. Complementing previous ndings, this suggests that left inferior frontal gyrus
semantically integrates information across modalities and semiotic domains. -
Schubotz, L., Holler, J., & Ozyurek, A. (2015). Age-related differences in multi-modal audience design: Young, but not old speakers, adapt speech and gestures to their addressee's knowledge. In G. Ferré, & M. Tutton (
Eds. ), Proceedings of the 4th GESPIN - Gesture & Speech in Interaction Conference (pp. 211-216). Nantes: Université of Nantes.Abstract
Speakers can adapt their speech and co-speech gestures for
addressees. Here, we investigate whether this ability is
modulated by age. Younger and older adults participated in a
comic narration task in which one participant (the speaker)
narrated six short comic stories to another participant (the
addressee). One half of each story was known to both participants, the other half only to the speaker. Younger but
not older speakers used more words and gestures when narrating novel story content as opposed to known content.
We discuss cognitive and pragmatic explanations of these findings and relate them to theories of gesture production. -
Ortega, G., Sumer, B., & Ozyurek, A. (2014). Type of iconicity matters: Bias for action-based signs in sign language acquisition. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (
Eds. ), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 1114-1119). Austin, Tx: Cognitive Science Society.Abstract
Early studies investigating sign language acquisition claimed
that signs whose structures are motivated by the form of their
referent (iconic) are not favoured in language development.
However, recent work has shown that the first signs in deaf
children’s lexicon are iconic. In this paper we go a step
further and ask whether different types of iconicity modulate
learning sign-referent links. Results from a picture description
task indicate that children and adults used signs with two
possible variants differentially. While children signing to
adults favoured variants that map onto actions associated with
a referent (action signs), adults signing to another adult
produced variants that map onto objects’ perceptual features
(perceptual signs). Parents interacting with children used
more action variants than signers in adult-adult interactions.
These results are in line with claims that language
development is tightly linked to motor experience and that
iconicity can be a communicative strategy in parental input. -
Peeters, D., Azar, Z., & Ozyurek, A. (2014). The interplay between joint attention, physical proximity, and pointing gesture in demonstrative choice. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (
Eds. ), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 1144-1149). Austin, Tx: Cognitive Science Society. -
Sumer, B., Perniss, P., Zwitserlood, I., & Ozyurek, A. (2014). Learning to express "left-right" & "front-behind" in a sign versus spoken language. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (
Eds. ), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 1550-1555). Austin, Tx: Cognitive Science Society.Abstract
Developmental studies show that it takes longer for
children learning spoken languages to acquire viewpointdependent
spatial relations (e.g., left-right, front-behind),
compared to ones that are not viewpoint-dependent (e.g.,
in, on, under). The current study investigates how
children learn to express viewpoint-dependent relations
in a sign language where depicted spatial relations can be
communicated in an analogue manner in the space in
front of the body or by using body-anchored signs (e.g.,
tapping the right and left hand/arm to mean left and
right). Our results indicate that the visual-spatial
modality might have a facilitating effect on learning to
express these spatial relations (especially in encoding of
left-right) in a sign language (i.e., Turkish Sign
Language) compared to a spoken language (i.e.,
Turkish). -
Holler, J., Schubotz, L., Kelly, S., Schuetze, M., Hagoort, P., & Ozyurek, A. (2013). Here's not looking at you, kid! Unaddressed recipients benefit from co-speech gestures when speech processing suffers. In M. Knauff, M. Pauen, I. Sebanz, & I. Wachsmuth (
Eds. ), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 2560-2565). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0463/index.html.Abstract
In human face-to-face communication, language comprehension is a multi-modal, situated activity. However, little is known about how we combine information from these different modalities, and how perceived communicative intentions, often signaled through visual signals, such as eye
gaze, may influence this processing. We address this question by simulating a triadic communication context in which a
speaker alternated her gaze between two different recipients. Participants thus viewed speech-only or speech+gesture
object-related utterances when being addressed (direct gaze) or unaddressed (averted gaze). Two object images followed
each message and participants’ task was to choose the object that matched the message. Unaddressed recipients responded significantly slower than addressees for speech-only
utterances. However, perceiving the same speech accompanied by gestures sped them up to a level identical to
that of addressees. That is, when speech processing suffers due to not being addressed, gesture processing remains intact and enhances the comprehension of a speaker’s message -
Ortega, G., & Ozyurek, A. (2013). Gesture-sign interface in hearing non-signers' first exposure to sign. In Proceedings of the Tilburg Gesture Research Meeting [TiGeR 2013].
Abstract
Natural sign languages and gestures are complex communicative systems that allow the incorporation of features of a referent into their structure. They differ, however, in that signs are more conventionalised because they consist of meaningless phonological parameters. There is some evidence that despite non-signers finding iconic signs more memorable they can have more difficulty at articulating their exact phonological components. In the present study, hearing non-signers took part in a sign repetition task in which they had to imitate as accurately as possible a set of iconic and arbitrary signs. Their renditions showed that iconic signs were articulated significantly less accurately than arbitrary signs. Participants were recalled six months later to take part in a sign generation task. In this task, participants were shown the English translation of the iconic signs they imitated six months prior. For each word, participants were asked to generate a sign (i.e., an iconic gesture). The handshapes produced in the sign repetition and sign generation tasks were compared to detect instances in which both renditions presented the same configuration. There was a significant correlation between articulation accuracy in the sign repetition task and handshape overlap. These results suggest some form of gestural interference in the production of iconic signs by hearing non-signers. We also suggest that in some instances non-signers may deploy their own conventionalised gesture when producing some iconic signs. These findings are interpreted as evidence that non-signers process iconic signs as gestures and that in production, only when sign and gesture have overlapping features will they be capable of producing the phonological components of signs accurately. -
Peeters, D., Chu, M., Holler, J., Ozyurek, A., & Hagoort, P. (2013). Getting to the point: The influence of communicative intent on the kinematics of pointing gestures. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (
Eds. ), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 1127-1132). Austin, TX: Cognitive Science Society.Abstract
In everyday communication, people not only use speech but
also hand gestures to convey information. One intriguing
question in gesture research has been why gestures take the
specific form they do. Previous research has identified the
speaker-gesturer’s communicative intent as one factor
shaping the form of iconic gestures. Here we investigate
whether communicative intent also shapes the form of
pointing gestures. In an experimental setting, twenty-four
participants produced pointing gestures identifying a referent
for an addressee. The communicative intent of the speakergesturer
was manipulated by varying the informativeness of
the pointing gesture. A second independent variable was the
presence or absence of concurrent speech. As a function of their communicative intent and irrespective of the presence of speech, participants varied the durations of the stroke and the post-stroke hold-phase of their gesture. These findings add to our understanding of how the communicative context influences the form that a gesture takes.Additional information
http://mindmodeling.org/cogsci2013/papers/0219/index.html -
Holler, J., Kelly, S., Hagoort, P., & Ozyurek, A. (2012). When gestures catch the eye: The influence of gaze direction on co-speech gesture comprehension in triadic communication. In N. Miyake, D. Peebles, & R. P. Cooper (
Eds. ), Proceedings of the 34th Annual Meeting of the Cognitive Science Society (CogSci 2012) (pp. 467-472). Austin, TX: Cognitive Society. Retrieved from http://mindmodeling.org/cogsci2012/papers/0092/index.html.Abstract
Co-speech gestures are an integral part of human face-to-face communication, but little is known about how pragmatic factors influence our comprehension of those gestures. The present study investigates how different types of recipients process iconic gestures in a triadic communicative situation. Participants (N = 32) took on the role of one of two recipients in a triad and were presented with 160 video clips of an actor speaking, or speaking and gesturing. Crucially, the actor’s eye gaze was manipulated in that she alternated her gaze between the two recipients. Participants thus perceived some messages in the role of addressed recipient and some in the role of unaddressed recipient. In these roles, participants were asked to make judgements concerning the speaker’s messages. Their reaction times showed that unaddressed recipients did comprehend speaker’s gestures differently to addressees. The findings are discussed with respect to automatic and controlled processes involved in gesture comprehension. -
Sumer, B., Zwitserlood, I., Perniss, P. M., & Ozyurek, A. (2012). Development of locative expressions by Turkish deaf and hearing children: Are there modality effects? In A. K. Biller, E. Y. Chung, & A. E. Kimball (
Eds. ), Proceedings of the 36th Annual Boston University Conference on Language Development (BUCLD 36) (pp. 568-580). Boston: Cascadilla Press. -
Perniss, P. M., Zwitserlood, I., & Ozyurek, A. (2011). Does space structure spatial language? Linguistic encoding of space in sign languages. In L. Carlson, C. Holscher, & T. Shipley (
Eds. ), Proceedings of the 33rd Annual Meeting of the Cognitive Science Society (pp. 1595-1600). Austin, TX: Cognitive Science Society. -
Furman, R., Ozyurek, A., & Küntay, A. C. (2010). Early language-specificity in Turkish children's caused motion event expressions in speech and gesture. In K. Franich, K. M. Iserman, & L. L. Keil (
Eds. ), Proceedings of the 34th Boston University Conference on Language Development. Volume 1 (pp. 126-137). Somerville, MA: Cascadilla Press. -
Kita, S., Ozyurek, A., Allen, S., & Ishizuka, T. (2010). Early links between iconic gestures and sound symbolic words: Evidence for multimodal protolanguage. In A. D. Smith, M. Schouwstra, B. de Boer, & K. Smith (
Eds. ), Proceedings of the 8th International conference on the Evolution of Language (EVOLANG 8) (pp. 429-430). Singapore: World Scientific. -
Ozyurek, A. (2010). The role of iconic gestures in production and comprehension of language: Evidence from brain and behavior. In S. Kopp, & I. Wachsmuth (
Eds. ), Gesture in embodied communication and human-computer interaction: 8th International Gesture Workshop, GW 2009, Bielefeld, Germany, February 25-27 2009. Revised selected papers (pp. 1-10). Berlin: Springer. -
Senghas, A., Ozyurek, A., & Goldin-Meadow, S. (2010). The evolution of segmentation and sequencing: Evidence from homesign and Nicaraguan Sign Language. In A. D. Smith, M. Schouwstra, B. de Boer, & K. Smith (
Eds. ), Proceedings of the 8th International conference on the Evolution of Language (EVOLANG 8) (pp. 279-289). Singapore: World Scientific.
Share this page