Gerard Kempen

Publications

  • Kempen, G., & Harbusch, K. (2018). A competitive mechanism selecting verb-second versus verb-final word order in causative and argumentative clauses of spoken Dutch: A corpus-linguistic study. Language Sciences, 69, 30-42. doi:10.1016/j.langsci.2018.05.005.

    Abstract

    In Dutch and German, the canonical order of subject, object(s) and finite verb is ‘verb-second’ (V2) in main but ‘verb-final’ (VF) in subordinate clauses. This occasionally leads to the production of noncanonical word orders. Familiar examples are causative and argumentative clauses introduced by a subordinating conjunction (Du. omdat, Ger. weil ‘because’): the omdat/weil-V2 phenomenon. Such clauses may also be introduced by coordinating conjunctions (Du. want, Ger. denn), which license V2 exclusively. However, want/denn-VF structures are unknown. We present the results of a corpus study on the incidence of omdat-V2 in spoken Dutch, and compare them to published data on weil-V2 in spoken German. Basic findings: omdat-V2 is much less frequent than weil-V2 (ratio almost 1:8); and the frequency relations between coordinating and subordinating conjunctions are opposite (want >> omdat; denn << weil). We propose that conjunction selection and V2/VF selection proceed partly independently, and sometimes miscommunicate—e.g. yielding omdat/weil paired with V2. Want/denn-VF pairs do not occur because want/denn clauses are planned as autonomous sentences, which take V2 by default. We sketch a simple feedforward neural network with two layers of nodes (representing conjunctions and word orders, respectively) that can simulate the observed data pattern through inhibition-based competition of the alternative choices within the node layers.
  • Harbusch, K., Kempen, G., & Vosse, T. (2008). A natural-language paraphrase generator for on-line monitoring and commenting incremental sentence construction by L2 learners of German. In Proceedings of WorldCALL 2008.

    Abstract

    Certain categories of language learners need feedback on the grammatical structure of sentences they wish to produce. In contrast with the usual NLP approach to this problem (parsing student-generated texts), we propose a generation-based approach aiming at preventing errors (“scaffolding”). In our ICALL system, students construct sentences by composing syntactic trees out of lexically anchored “treelets” via a graphical drag-and-drop user interface. A natural-language generator computes all possible grammatically well-formed sentences entailed by the student-composed tree, and intervenes immediately when the latter tree does not belong to the set of well-formed alternatives. Feedback is based on comparisons between the student-composed tree and the well-formed set. Frequently occurring errors are handled in terms of “malrules.” The system (implemented in Java and C++) currently focuses on constituent order in German as L2.
  • Kempen, G., & Harbusch, K. (2008). Comparing linguistic judgments and corpus frequencies as windows on grammatical competence: A study of argument linearization in German clauses. In A. Steube (Ed.), The discourse potential of underspecified structures (pp. 179-192). Berlin: Walter de Gruyter.

    Abstract

    We present an overview of several corpus studies we carried out into the frequencies of argument NP orderings in the midfield of subordinate and main clauses of German. Comparing the corpus frequencies with grammaticality ratings published by Keller (2000), we observe a “grammaticality–frequency gap”: Quite a few argument orderings with zero corpus frequency are nevertheless assigned medium-range grammaticality ratings. We propose an explanation in terms of a two-factor theory. First, we hypothesize that the grammatical induction component needs a sufficient number of exposures to a syntactic pattern to incorporate it into its repertoire of more or less stable rules of grammar. Moderately to highly frequent argument NP orderings are likely to have attained this status, but not their zero-frequency counterparts. This is why the latter argument sequences cannot be produced by the grammatical encoder and are absent from the corpora. Second, we assume that an extraneous (nonlinguistic) judgment process biases the ratings of moderately grammatical linear order patterns: Confronted with such structures, the informants produce their own “ideal delivery” variant of the to-be-rated target sentence and evaluate the similarity between the two versions. A high similarity score yielded by this judgment then exerts a positive bias on the grammaticality rating (a score that should not be mistaken for an authentic grammaticality rating). We conclude that, at least in the linearization domain studied here, the goal of gaining a clear view of the internal grammar of language users is best served by a combined strategy in which grammar rules are founded on structures that elicit moderate to high grammaticality ratings and attain at least moderate usage frequencies.
  • Vosse, T. G., & Kempen, G. (2008). Parsing verb-final clauses in German: Garden-path and ERP effects modeled by a parallel dynamic parser. In B. Love, K. McRae, & V. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 261-266). Washington: Cognitive Science Society.

    Abstract

    Experimental sentence comprehension studies have shown that superficially similar German clauses with verb-final word order elicit very different garden-path and ERP effects. We show that a computer implementation of the Unification Space parser (Vosse & Kempen, 2000) in the form of a localist-connectionist network can model the observed differences, at least qualitatively. The model embodies a parallel dynamic parser that, in contrast with existing models, does not distinguish between consecutive first-pass and reanalysis stages, and does not use semantic or thematic roles. It does use structural frequency data and animacy information.
