-
Harbusch, K., Kempen, G., & Vosse, T. (2008). A natural-language paraphrase generator for on-line monitoring and commenting incremental sentence construction by L2 learners of German. In Proceedings of WorldCALL 2008.
Abstract
Certain categories of language learners need feedback on the grammatical structure of sentences they wish to produce. In contrast with the usual NLP approach to this problem (parsing student-generated texts), we propose a generation-based approach aimed at preventing errors (“scaffolding”). In our ICALL system, students construct sentences by composing syntactic trees out of lexically anchored “treelets” via a graphical drag-and-drop user interface. A natural-language generator computes all possible grammatically well-formed sentences entailed by the student-composed tree, and intervenes immediately when that tree does not belong to the set of well-formed alternatives. Feedback is based on comparisons between the student-composed tree and the well-formed set. Frequently occurring errors are handled in terms of “malrules.” The system (implemented in Java and C++) currently focuses on constituent order in German as L2. -
Kempen, G., & Harbusch, K. (2008). Comparing linguistic judgments and corpus frequencies as windows on grammatical competence: A study of argument linearization in German clauses. In A. Steube (Ed.), The discourse potential of underspecified structures (pp. 179-192). Berlin: Walter de Gruyter.
Abstract
We present an overview of several corpus studies we carried out into the frequencies of argument NP orderings in the midfield of subordinate and main clauses of German. Comparing the corpus frequencies with the grammaticality ratings published by Keller (2000), we observe a “grammaticality–frequency gap”: Quite a few argument orderings with zero corpus frequency are nevertheless assigned medium-range grammaticality ratings. We propose an explanation in terms of a two-factor theory. First, we hypothesize that the grammatical induction component needs a sufficient number of exposures to a syntactic pattern to incorporate it into its repertoire of more or less stable rules of grammar. Moderately to highly frequent argument NP orderings are likely to have attained this status, but not their zero-frequency counterparts. This is why the latter argument sequences cannot be produced by the grammatical encoder and are absent from the corpora. Second, we assume that an extraneous (nonlinguistic) judgment process biases the ratings of moderately grammatical linear order patterns: Confronted with such structures, the informants produce their own “ideal delivery” variant of the to-be-rated target sentence and evaluate the similarity between the two versions. A high similarity score yielded by this judgment then exerts a positive bias on the grammaticality rating, a score that should not be mistaken for an authentic grammaticality rating. We conclude that, at least in the linearization domain studied here, the goal of gaining a clear view of the internal grammar of language users is best served by a combined strategy in which grammar rules are founded on structures that elicit moderate to high grammaticality ratings and attain at least moderate usage frequencies. -
Vosse, T. G., & Kempen, G. (2008). Parsing verb-final clauses in German: Garden-path and ERP effects modeled by a parallel dynamic parser. In B. Love, K. McRae, & V. Sloutsky (Eds.), Proceedings of the 30th Annual Conference on the Cognitive Science Society (pp. 261-266). Washington: Cognitive Science Society.
Abstract
Experimental sentence comprehension studies have shown that superficially similar German clauses with verb-final word order elicit very different garden-path and ERP effects. We show that a computer implementation of the Unification Space parser (Vosse & Kempen, 2000) in the form of a localist-connectionist network can model the observed differences, at least qualitatively. The model embodies a parallel dynamic parser that, in contrast with existing models, does not distinguish between consecutive first-pass and reanalysis stages, and does not use semantic or thematic roles. It does use structural frequency data and animacy information. -
Harbusch, K., & Kempen, G. (2000). Complexity of linear order computation in Performance Grammar, TAG and HPSG. In Proceedings of Fifth International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+5) (pp. 101-106).
Abstract
This paper investigates the time and space complexity of word order computation in the psycholinguistically motivated grammar formalism of Performance Grammar (PG). In PG, the first stage of syntax assembly yields an unordered tree ('mobile') consisting of a hierarchy of lexical frames (lexically anchored elementary trees). Associated with each lexical frame is a linearizer, a Finite-State Automaton that locally computes the left-to-right order of the branches of the frame. Linearization takes place after the promotion component may have raised certain constituents (e.g. Wh- or focused phrases) into the domain of lexical frames higher up in the syntactic mobile. We show that the worst-case time and space complexity of analyzing input strings of length n is O(n^5) and O(n^4), respectively. This result compares favorably with the time complexity of word-order computations in Tree Adjoining Grammar (TAG). A comparison with Head-Driven Phrase Structure Grammar (HPSG) reveals that PG yields a more declarative linearization method, provided that the FSA is rewritten as an equivalent regular expression. -
Kempen, G. (2000). Could grammatical encoding and grammatical decoding be subserved by the same processing module? Behavioral and Brain Sciences, 23, 38-39.
-
Vosse, T., & Kempen, G. (2000). Syntactic structure assembly in human parsing: A computational model based on competitive inhibition and a lexicalist grammar. Cognition, 75, 105-143.
Abstract
We present the design, implementation and simulation results of a psycholinguistic model of human syntactic processing that meets major empirical criteria. The parser operates in conjunction with a lexicalist grammar and is driven by syntactic information associated with heads of phrases. The dynamics of the model are based on competition by lateral inhibition ('competitive inhibition'). Input words activate lexical frames (i.e. elementary trees anchored to input words) in the mental lexicon, and a network of candidate 'unification links' is set up between frame nodes. These links represent tentative attachments that are graded rather than all-or-none. Candidate links that, due to grammatical or 'treehood' constraints, are incompatible, compete for inclusion in the final syntactic tree by sending each other inhibitory signals that reduce the competitor's attachment strength. The outcome of these local and simultaneous competitions is controlled by dynamic parameters, in particular by the Entry Activation and the Activation Decay rate of syntactic nodes, and by the Strength and Strength Build-up rate of Unification links. In case of a successful parse, a single syntactic tree is returned that covers the whole input string and consists of lexical frames connected by winning Unification links. 
Simulations are reported of a significant range of psycholinguistic parsing phenomena in both normal and aphasic speakers of English: (i) various effects of linguistic complexity (single versus double, center versus right-hand self-embeddings of relative clauses; the difference between relative clauses with subject and object extraction; the contrast between a complement clause embedded within a relative clause versus a relative clause embedded within a complement clause); (ii) effects of local and global ambiguity, and of word-class and syntactic ambiguity (including recency and length effects); (iii) certain difficulty-of-reanalysis effects (contrasts between local ambiguities that are easy to resolve versus ones that lead to serious garden-path effects); (iv) effects of agrammatism on parsing performance, in particular the performance of various groups of aphasic patients on several sentence types. -
Kempen, G. (1979). A study of syntactic bookkeeping during sentence production. In H. Ueckert, & D. Rhenius (Eds.), Komplexe menschliche Informationsverarbeitung (pp. 361-368). Bern: Hans Huber.
Abstract
It is an important feature of the human sentence production system that semantic and syntactic processes may overlap in time and do not proceed strictly serially. That is, the process of building the syntactic form of an utterance does not always wait until the complete semantic content for that utterance has been decided upon. On the contrary, speakers will often start pronouncing the first words of a sentence while still working on further details of its semantic content. An important advantage is memory economy. Semantic and syntactic fragments do not have to occupy working memory until complete semantic and syntactic structures for an utterance have been computed. Instead, each semantic and syntactic fragment is processed as soon as possible and is kept in working memory for a minimum period of time. This raises the question of how the sentence production system can maintain syntactic coherence across syntactic fragments. Presumably there are processes of "syntactic bookkeeping" which (1) store in working memory those syntactic properties of a fragmentary sentence which are needed to eliminate ungrammatical continuations, and (2) check whether a prospective continuation is indeed compatible with the sentence constructed so far. In reaction time experiments where subjects described, under time pressure, simple static pictures of an action performed by an actor, the second aspect of syntactic bookkeeping could be demonstrated. This evidence is used for modelling bookkeeping processes as part of a computational sentence generator which aims at simulating the syntactic operations people carry out during spontaneous speech. -
Kempen, G. (1979). La mise en paroles, aspects psychologiques de l'expression orale. Études de Linguistique Appliquée, 33, 19-28.
Abstract
Remarks on the factors involved in the process of formulating utterances. -
Kempen, G. (1979). Psychologie van de zinsbouw: Een Wundtiaanse inleiding. Nederlands Tijdschrift voor de Psychologie, 34, 533-551.
Abstract
The psychology of language as developed by Wilhelm Wundt in his fundamental work Die Sprache (1900) has a strongly mentalistic character. The dominant positions held by behaviorism in psychology and structuralism in linguistics overshadowed Wundt’s language theory, with the result that it has remained relatively unknown. This situation has changed recently under the influence of transformational linguistics and cognitive psychology. The paper discusses how Wundt applied the basic psychological concepts of apperception and association to language behavior, in particular to the construction and production of sentences during unprepared speech. The final part of the paper is devoted to the work, published in 1917, of the Dutch linguistic scholar Jacques van Ginneken, who elaborated Wundt’s ideas towards an explanation of some syntactic phenomena during the language acquisition of children. -
Kempen, G. (1979). Woordwaarde. De Psycholoog, 14, 577.
-
Levelt, W. J. M., & Kempen, G. (1979). Language. In J. A. Michon, E. G. J. Eijkman, & L. F. W. De Klerk (Eds.), Handbook of psychonomics (Vol. 2) (pp. 347-407). Amsterdam: North Holland. -
Thomassen, A. J., & Kempen, G. (1979). Memory. In J. A. Michon, E. Eijkman, & L. Klerk (Eds.), Handbook of psychonomics (pp. 75-137). Amsterdam: North-Holland Publishing Company.