Gerard Kempen

Publications

  • Harbusch, K., Kempen, G., & Vosse, T. (2008). A natural-language paraphrase generator for on-line monitoring and commenting incremental sentence construction by L2 learners of German. In Proceedings of WorldCALL 2008.

    Abstract

    Certain categories of language learners need feedback on the grammatical structure of sentences they wish to produce. In contrast with the usual NLP approach to this problem—parsing student-generated texts—we propose a generation-based approach aiming at preventing errors (“scaffolding”). In our ICALL system, students construct sentences by composing syntactic trees out of lexically anchored “treelets” via a graphical drag&drop user interface. A natural-language generator computes all possible grammatically well-formed sentences entailed by the student-composed tree, and intervenes immediately when the latter tree does not belong to the set of well-formed alternatives. Feedback is based on comparisons between the student-composed tree and the well-formed set. Frequently occurring errors are handled in terms of “malrules.” The system (implemented in JAVA and C++) currently focuses on constituent order in German as L2.
  • Kempen, G., & Harbusch, K. (2008). Comparing linguistic judgments and corpus frequencies as windows on grammatical competence: A study of argument linearization in German clauses. In A. Steube (Ed.), The discourse potential of underspecified structures (pp. 179-192). Berlin: Walter de Gruyter.

    Abstract

    We present an overview of several corpus studies we carried out into the frequencies of argument NP orderings in the midfield of subordinate and main clauses of German. Comparing the corpus frequencies with grammaticality ratings published by Keller (2000), we observe a “grammaticality–frequency gap”: Quite a few argument orderings with zero corpus frequency are nevertheless assigned medium-range grammaticality ratings. We propose an explanation in terms of a two-factor theory. First, we hypothesize that the grammatical induction component needs a sufficient number of exposures to a syntactic pattern to incorporate it into its repertoire of more or less stable rules of grammar. Moderately to highly frequent argument NP orderings are likely to have attained this status, but not their zero-frequency counterparts. This is why the latter argument sequences cannot be produced by the grammatical encoder and are absent from the corpora. Second, we assume that an extraneous (nonlinguistic) judgment process biases the ratings of moderately grammatical linear order patterns: Confronted with such structures, the informants produce their own “ideal delivery” variant of the to-be-rated target sentence and evaluate the similarity between the two versions. A high similarity score yielded by this judgment then exerts a positive bias on the grammaticality rating—a score that should not be mistaken for an authentic grammaticality rating. We conclude that, at least in the linearization domain studied here, the goal of gaining a clear view of the internal grammar of language users is best served by a combined strategy in which grammar rules are founded on structures that elicit moderate to high grammaticality ratings and attain at least moderate usage frequencies.
  • Vosse, T. G., & Kempen, G. (2008). Parsing verb-final clauses in German: Garden-path and ERP effects modeled by a parallel dynamic parser. In B. Love, K. McRae, & V. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 261-266). Washington: Cognitive Science Society.

    Abstract

    Experimental sentence comprehension studies have shown that superficially similar German clauses with verb-final word order elicit very different garden-path and ERP effects. We show that a computer implementation of the Unification Space parser (Vosse & Kempen, 2000) in the form of a localist-connectionist network can model the observed differences, at least qualitatively. The model embodies a parallel dynamic parser that, in contrast with existing models, does not distinguish between consecutive first-pass and reanalysis stages, and does not use semantic or thematic roles. It does use structural frequency data and animacy information.
  • Kempen, G., Anbeek, G., Desain, P., Konst, L., & De Smedt, K. (1987). Auteursomgevingen: Vijfde-generatie tekstverwerkers. Informatie, 29, 988-993.
  • Kempen, G., Anbeek, G., Desain, P., Konst, L., & De Smedt, K. (1987). Author environments: Fifth generation text processors. In Commission of the European Communities. Directorate-General for Telecommunications, Information Industries, and Innovation (Ed.), Esprit'86: Results and achievements (pp. 365-372). Amsterdam: Elsevier Science Publishers.
  • Kempen, G., & Hoenkamp, E. (1987). An incremental procedural grammar for sentence formulation. Cognitive Science, 11(2), 201-258.

    Abstract

    This paper presents a theory of the syntactic aspects of human sentence production. An important characteristic of unprepared speech is that overt pronunciation of a sentence can be initiated before the speaker has completely worked out the meaning content he or she is going to express in that sentence. Apparently, the speaker is able to build up a syntactically coherent utterance out of a series of syntactic fragments each rendering a new part of the meaning content. This incremental, left-to-right mode of sentence production is the central capability of the proposed Incremental Procedural Grammar (IPG). Certain other properties of spontaneous speech, as derivable from speech errors, hesitations, self-repairs, and language pathology, are accounted for as well. The psychological plausibility thus gained by the grammar appears compatible with a satisfactory level of linguistic plausibility in that sentences receive structural descriptions which are in line with current theories of grammar. More importantly, an explanation for the existence of configurational conditions on transformations and other linguistic rules is proposed. The basic design feature of IPG which gives rise to these psychologically and linguistically desirable properties is the “Procedures + Stack” concept. Sentences are built not by a central constructing agency which overlooks the whole process but by a team of syntactic procedures (modules) which work, in parallel, on small parts of the sentence, have only a limited overview, and whose sole communication channel is a stack. IPG covers object complement constructions, interrogatives, and word order in main and subordinate clauses. It handles unbounded dependencies, cross-serial dependencies and coordination phenomena such as gapping and conjunction reduction. It is also capable of generating self-repairs and elliptical answers to questions. IPG has been implemented as an incremental Dutch sentence generator written in LISP.
  • Kempen, G. (Ed.). (1987). Natural language generation: New results in artificial intelligence, psychology and linguistics. Dordrecht: Nijhoff.
  • Kempen, G. (Ed.). (1987). Natuurlijke taal en kunstmatige intelligentie: Taal tussen mens en machine. Groningen: Wolters-Noordhoff.
  • Kempen, G. (1987). Tekstverwerking: De vijfde generatie. Informatie, 29, 402-406.
  • Pijls, F., Daelemans, W., & Kempen, G. (1987). Artificial intelligence tools for grammar and spelling instruction. Instructional Science, 16(4), 319-336. doi:10.1007/BF00117750.

    Abstract

    In the Netherlands, grammar teaching is an especially important subject in the curriculum of children aged 10-15 for several reasons. However, in spite of all the attention and time invested, the results are poor. This article describes the problems and our attempt to overcome them by developing an intelligent computational instructional environment consisting of: a linguistic expert system, containing a module representing grammar and spelling rules and a number of modules to manipulate these rules; a didactic module; and a student interface with special facilities for grammar and spelling. Three prototypes of the functionality are discussed: BOUWSTEEN and COGO, which are programs for constructing and analyzing Dutch sentences; and TDTDT, a program for the conjugation of Dutch verbs.
  • Pijls, F., & Kempen, G. (1987). Kennistechnologische leermiddelen in het grammatica- en spellingonderwijs. Nederlands Tijdschrift voor de Psychologie, 42, 354-363.
  • De Smedt, K., & Kempen, G. (1987). Incremental sentence production, self-correction, and coordination. In G. Kempen (Ed.), Natural language generation: New results in artificial intelligence, psychology and linguistics (pp. 365-376). Dordrecht: Nijhoff.
  • Van Wijk, C., & Kempen, G. (1987). A dual system for producing self-repairs in spontaneous speech: Evidence from experimentally elicited corrections. Cognitive Psychology, 19, 403-440. doi:10.1016/0010-0285(87)90014-4.

    Abstract

    This paper presents a cognitive theory on the production and shaping of self-repairs during speaking. In an extensive experimental study, a new technique is tried out: artificial elicitation of self-repairs. The data clearly indicate that two mechanisms for computing the shape of self-repairs should be distinguished. One is based on the repair strategy called reformulation, the second one on lemma substitution. W. Levelt’s (1983, Cognition, 14, 41-104) well-formedness rule, which connects self-repairs to coordinate structures, is shown to apply only to reformulations. In case of lemma substitution, a totally different set of rules is at work. The linguistic unit of central importance in reformulations is the major syntactic constituent; in lemma substitutions it is a prosodic unit, the phonological phrase. A parametrization of the model yielded a very satisfactory fit between observed and reconstructed scores.
  • Kempen, G. (1979). A study of syntactic bookkeeping during sentence production. In H. Ueckert, & D. Rhenius (Eds.), Komplexe menschliche Informationsverarbeitung (pp. 361-368). Bern: Hans Huber.

    Abstract

    It is an important feature of the human sentence production system that semantic and syntactic processes may overlap in time and do not proceed strictly serially. That is, the process of building the syntactic form of an utterance does not always wait until the complete semantic content for that utterance has been decided upon. On the contrary, speakers will often start pronouncing the first words of a sentence while still working on further details of its semantic content. An important advantage is memory economy. Semantic and syntactic fragments do not have to occupy working memory until complete semantic and syntactic structures for an utterance have been computed. Instead, each semantic and syntactic fragment is processed as soon as possible and is kept in working memory for a minimum period of time. This raises the question of how the sentence production system can maintain syntactic coherence across syntactic fragments. Presumably there are processes of "syntactic bookkeeping" which (1) store in working memory those syntactic properties of a fragmentary sentence which are needed to eliminate ungrammatical continuations, and (2) check whether a prospective continuation is indeed compatible with the sentence constructed so far. In reaction time experiments where subjects described, under time pressure, simple static pictures of an action performed by an actor, the second aspect of syntactic bookkeeping could be demonstrated. This evidence is used for modelling bookkeeping processes as part of a computational sentence generator which aims at simulating the syntactic operations people carry out during spontaneous speech.
  • Kempen, G. (1979). La mise en paroles, aspects psychologiques de l'expression orale. Études de Linguistique Appliquée, 33, 19-28.

    Abstract

    Remarks on the factors involved in the process of formulating utterances.
  • Kempen, G. (1979). Psychologie van de zinsbouw: Een Wundtiaanse inleiding. Nederlands Tijdschrift voor de Psychologie, 34, 533-551.

    Abstract

    The psychology of language as developed by Wilhelm Wundt in his fundamental work Die Sprache (1900) has a strongly mentalistic character. The dominating positions held by behaviorism in psychology and structuralism in linguistics have overruled Wundt’s language theory to the effect that it has remained relatively unknown. This situation has changed recently under the influence of transformational linguistics and cognitive psychology. The paper discusses how Wundt applied the basic psychological concepts of apperception and association to language behavior, in particular to the construction and production of sentences during unprepared speech. The final part of the paper is devoted to the work, published in 1917, of the Dutch linguistic scholar Jacques van Ginneken, who elaborated Wundt’s ideas towards an explanation of some syntactic phenomena during the language acquisition of children.
  • Kempen, G. (1979). Woordwaarde. De Psycholoog, 14, 577.
  • Levelt, W. J. M., & Kempen, G. (1979). Language. In J. A. Michon, E. G. J. Eijkman, & L. F. W. De Klerk (Eds.), Handbook of psychonomics (Vol. 2) (pp. 347-407). Amsterdam: North Holland.
  • Thomassen, A. J., & Kempen, G. (1979). Memory. In J. A. Michon, E. G. J. Eijkman, & L. F. W. De Klerk (Eds.), Handbook of psychonomics (pp. 75-137). Amsterdam: North-Holland Publishing Company.
