Gerard Kempen

Publications

  • Harbusch, K., Kempen, G., & Vosse, T. (2008). A natural-language paraphrase generator for on-line monitoring and commenting incremental sentence construction by L2 learners of German. In Proceedings of WorldCALL 2008.

    Abstract

    Certain categories of language learners need feedback on the grammatical structure of sentences they wish to produce. In contrast with the usual NLP approach to this problem (parsing student-generated texts), we propose a generation-based approach aimed at preventing errors (“scaffolding”). In our ICALL system, students construct sentences by composing syntactic trees out of lexically anchored “treelets” via a graphical drag-and-drop user interface. A natural-language generator computes all possible grammatically well-formed sentences entailed by the student-composed tree, and intervenes immediately when that tree does not belong to the set of well-formed alternatives. Feedback is based on comparisons between the student-composed tree and the well-formed set. Frequently occurring errors are handled in terms of “malrules.” The system (implemented in Java and C++) currently focuses on constituent order in German as L2.
  • Kempen, G., & Harbusch, K. (2008). Comparing linguistic judgments and corpus frequencies as windows on grammatical competence: A study of argument linearization in German clauses. In A. Steube (Ed.), The discourse potential of underspecified structures (pp. 179-192). Berlin: Walter de Gruyter.

    Abstract

    We present an overview of several corpus studies we carried out into the frequencies of argument NP orderings in the midfield of subordinate and main clauses of German. Comparing the corpus frequencies with grammaticality ratings published by Keller (2000), we observe a “grammaticality–frequency gap”: Quite a few argument orderings with zero corpus frequency are nevertheless assigned medium-range grammaticality ratings. We propose an explanation in terms of a two-factor theory. First, we hypothesize that the grammatical induction component needs a sufficient number of exposures to a syntactic pattern to incorporate it into its repertoire of more or less stable rules of grammar. Moderately to highly frequent argument NP orderings are likely to have attained this status, but not their zero-frequency counterparts. This is why the latter argument sequences cannot be produced by the grammatical encoder and are absent from the corpora. Second, we assume that an extraneous (nonlinguistic) judgment process biases the ratings of moderately grammatical linear order patterns: Confronted with such structures, the informants produce their own “ideal delivery” variant of the to-be-rated target sentence and evaluate the similarity between the two versions. A high similarity score yielded by this judgment then exerts a positive bias on the grammaticality rating, which should therefore not be mistaken for an authentic grammaticality rating. We conclude that, at least in the linearization domain studied here, the goal of gaining a clear view of the internal grammar of language users is best served by a combined strategy in which grammar rules are founded on structures that elicit moderate to high grammaticality ratings and attain at least moderate usage frequencies.
  • Vosse, T. G., & Kempen, G. (2008). Parsing verb-final clauses in German: Garden-path and ERP effects modeled by a parallel dynamic parser. In B. Love, K. McRae, & V. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 261-266). Washington: Cognitive Science Society.

    Abstract

    Experimental sentence comprehension studies have shown that superficially similar German clauses with verb-final word order elicit very different garden-path and ERP effects. We show that a computer implementation of the Unification Space parser (Vosse & Kempen, 2000) in the form of a localist-connectionist network can model the observed differences, at least qualitatively. The model embodies a parallel dynamic parser that, in contrast with existing models, does not distinguish between consecutive first-pass and reanalysis stages, and does not use semantic or thematic roles. It does use structural frequency data and animacy information.
  • Kempen, G., Anbeek, G., Desain, P., Konst, L., & De Smedt, K. (1987). Auteursomgevingen: Vijfde-generatie tekstverwerkers. Informatie, 29, 988-993.
  • Kempen, G., Anbeek, G., Desain, P., Konst, L., & De Smedt, K. (1987). Author environments: Fifth generation text processors. In Commission of the European Communities. Directorate-General for Telecommunications, Information Industries, and Innovation (Ed.), Esprit'86: Results and achievements (pp. 365-372). Amsterdam: Elsevier Science Publishers.
  • Kempen, G., & Hoenkamp, E. (1987). An incremental procedural grammar for sentence formulation. Cognitive Science, 11(2), 201-258.

    Abstract

    This paper presents a theory of the syntactic aspects of human sentence production. An important characteristic of unprepared speech is that overt pronunciation of a sentence can be initiated before the speaker has completely worked out the meaning content he or she is going to express in that sentence. Apparently, the speaker is able to build up a syntactically coherent utterance out of a series of syntactic fragments, each rendering a new part of the meaning content. This incremental, left-to-right mode of sentence production is the central capability of the proposed Incremental Procedural Grammar (IPG). Certain other properties of spontaneous speech, as derivable from speech errors, hesitations, self-repairs, and language pathology, are accounted for as well. The psychological plausibility thus gained by the grammar appears compatible with a satisfactory level of linguistic plausibility in that sentences receive structural descriptions which are in line with current theories of grammar. More importantly, an explanation for the existence of configurational conditions on transformations and other linguistic rules is proposed. The basic design feature of IPG which gives rise to these psychologically and linguistically desirable properties is the “Procedures + Stack” concept. Sentences are built not by a central constructing agency which oversees the whole process but by a team of syntactic procedures (modules) which work, in parallel, on small parts of the sentence, have only a limited overview, and whose sole communication channel is a stack. IPG covers object complement constructions, interrogatives, and word order in main and subordinate clauses. It handles unbounded dependencies, cross-serial dependencies, and coordination phenomena such as gapping and conjunction reduction. It is also capable of generating self-repairs and elliptical answers to questions. IPG has been implemented as an incremental Dutch sentence generator written in LISP.
  • Kempen, G. (Ed.). (1987). Natural language generation: New results in artificial intelligence, psychology and linguistics. Dordrecht: Nijhoff.
  • Kempen, G. (Ed.). (1987). Natuurlijke taal en kunstmatige intelligentie: Taal tussen mens en machine. Groningen: Wolters-Noordhoff.
  • Kempen, G. (1987). Tekstverwerking: De vijfde generatie. Informatie, 29, 402-406.
  • Pijls, F., Daelemans, W., & Kempen, G. (1987). Artificial intelligence tools for grammar and spelling instruction. Instructional Science, 16(4), 319-336. doi:10.1007/BF00117750.

    Abstract

    In the Netherlands, grammar teaching is an especially important subject in the curriculum of children aged 10-15 for several reasons. However, in spite of all the attention and time invested, the results are poor. This article describes the problems and our attempt to overcome them by developing an intelligent computational instructional environment consisting of: a linguistic expert system, containing a module representing grammar and spelling rules and a number of modules to manipulate these rules; a didactic module; and a student interface with special facilities for grammar and spelling. Three prototypes of the functionality are discussed: BOUWSTEEN and COGO, which are programs for constructing and analyzing Dutch sentences; and TDTDT, a program for the conjugation of Dutch verbs.
  • Pijls, F., & Kempen, G. (1987). Kennistechnologische leermiddelen in het grammatica- en spellingonderwijs. Nederlands Tijdschrift voor de Psychologie, 42, 354-363.
  • De Smedt, K., & Kempen, G. (1987). Incremental sentence production, self-correction, and coordination. In G. Kempen (Ed.), Natural language generation: New results in artificial intelligence, psychology and linguistics (pp. 365-376). Dordrecht: Nijhoff.
  • Van Wijk, C., & Kempen, G. (1987). A dual system for producing self-repairs in spontaneous speech: Evidence from experimentally elicited corrections. Cognitive Psychology, 19, 403-440. doi:10.1016/0010-0285(87)90014-4.

    Abstract

    This paper presents a cognitive theory on the production and shaping of self-repairs during speaking. In an extensive experimental study, a new technique is tried out: artificial elicitation of self-repairs. The data clearly indicate that two mechanisms for computing the shape of self-repairs should be distinguished. One is based on the repair strategy called reformulation, the second one on lemma substitution. W. Levelt’s (1983, Cognition, 14, 41-104) well-formedness rule, which connects self-repairs to coordinate structures, is shown to apply only to reformulations. In the case of lemma substitution, a totally different set of rules is at work. The linguistic unit of central importance in reformulations is the major syntactic constituent; in lemma substitutions it is a prosodic unit, the phonological phrase. A parametrization of the model yielded a very satisfactory fit between observed and reconstructed scores.