Publications

Dona, L., & Schouwstra, M. (2024). Balancing regularization and variation: The roles of priming and motivatedness. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 130-133). Nijmegen: The Evolution of Language Conferences.

Dona, L., & Schouwstra, M. (2022). The Role of Structural Priming, Semantics and Population Structure in Word Order Conventionalization: A Computational Model. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 171-173). Nijmegen: Joint Conference on Language Evolution (JCoLE).

Haagen, T., Dona, L., Bosscha, S., Zamith, B., Koetschruyter, R., & Wijnholds, G. (2022). Noun Phrase and Verb Phrase Ellipsis in Dutch: Identifying Subject-Verb Dependencies with BERTje. Computational Linguistics in the Netherlands Journal, 12, 49-63.

Abstract
Previous research has set out to quantify the syntactic capacity of BERTje (the Dutch equivalent of BERT) in the context of phenomena such as control verb nesting and verb raising in Dutch. Another complex language phenomenon is ellipsis, where a constituent is omitted from a sentence and can be recovered using context. Like verb raising and control verb nesting, ellipsis is suitable for evaluating BERTje’s linguistic capacity since it requires the processing of syntactic and lexical cues to recover the elided phrases. This work outlines an approach to identify subject-verb dependencies in Dutch sentences with verb phrase and noun phrase ellipsis using BERTje. Results will inform about BERTje’s capability of capturing syntactic information and its ability to capture ellipsis in particular. Understanding more about how computational models process ellipsis and how it can be improved is crucial for boosting the performance of language models, as natural language contains many instances of ellipsis. Using training data from Lassy, converted to contextualized embeddings using BERTje, a probe model is trained to identify subject-verb dependencies. The model is tested on sentences generated using a Context Free Grammar (CFG), which is designed to generate sentences containing ellipsis. These sentences are also converted to contextualized representations using BERTje. Results show that BERTje’s syntactic abilities are lacking, shown by accuracy drops compared to baseline measures.

Loïs Dona