Computational Model for Processing Lexical Information
Zoltán Bánréti
Copyright 2000 Zoltán Bánréti, Katalin Kiss, Gábor Rádai, Péter Rebrus, Péter Szigetvári, Miklós Törkenczy, Beáta Gyuris, Csaba Oravecz, Gabriella Tóth, Viktor Trón

This research report was downloaded from the Research Support Scheme Electronic Library. The work on the report was made possible by a grant from, and was published by, the Research Support Scheme of the Open Society Support Foundation, which is a part of the Open Society Institute-Budapest. The digitisation of the report was supported by the publisher.

Research Support Scheme, Bartolomějská, Praha 1, Czech Republic

The digitisation and conversion of the report to PDF was completed by Virtus.

Virtus, Libínská, Praha 5, Czech Republic

The information published in this work is the sole responsibility of the author and should not be construed as representing the views of the Research Support Scheme/Open Society Support Foundation. The RSS/OSSF takes no responsibility for the accuracy and correctness of this work. Any comments related to the contents of this work should be directed to the author.

All rights reserved. No part of this work may be reproduced, in any form or by any means, without permission in writing from the author.
Contents

Abstract
Objectives
Findings
List of publications
Detailed summary of the results of the research
  Neurolinguistics submodule
    Open and closed class lexical items in the mental lexicon and their role in sentence processing
    Representational complexity of verbs in the mental lexicon and its effect on aphasics' sentence production
    Ellipsis in sentence processing
  Syntax-semantics submodule
    Ellipsis and the structure of the lexicon
    The interpretation of certain logical vocabulary items in adult and child usage
    Aspect and Argument Structure
  Phonology submodule
    Degrees of phonotactic grammaticality: partitioning the lexicon
    Phonotactics and morphological complexity
  Computational linguistics submodule
    Representation of linguistic knowledge
    Extension of the GIN formalism
    Non-Complin References in Section
Closed class lexical items in sentence processing
  Abstract
  A time-based parser
  Analysis of the repetition test
  References
Egyeztetés agrammatikus afáziában [Agreement in agrammatic aphasia]
  Abstract
  References
Nyelvtan és mentális elemző neurolingvisztikai megközelítésben [Grammar and mental parser in a neurolinguistic approach]
  MTA Nyelvtudományi Intézete [Research Institute for Linguistics, Hungarian Academy of Sciences]
  References
Semantic Bridging Effects in VP-Ellipsis
  Introduction
  A preliminary overview of the data
  Semantic and pragmatic accounts of VP-ellipsis interpretation
  A lexical semantic account of VP-ellipsis
  Some relevant syntactic claims
  Ellipsis licensing with meaning postulates
  On the interaction of semantic parallelism and the organization of the grammar
  Some further claims on the structure of the Lexicon
  Conclusion
  References
The Interpretation of Universal Quantification in Child Language
  Aims and theoretical background
  Preliminary overview of the data
  Philip's (1995) experiments
  Experiments with Hungarian children
  Experiment
  Experiment
  Discussion of the Hungarian experiments and their implications
  Conclusion
  References
On the Semantic Interpretation of amikor 'when' and ha 'if' Clauses in Hungarian
  Introduction
  Comparing Hungarian ha 'if' and amikor 'when' clauses
  On the lexical meaning of akkor 'when' and ha 'if'
  Implicit quantifiers
  Formalizing the semantic interpretation
  Basic ingredients
  Formalizing the interpretation of amikor 'when' clauses
  Formalizing the interpretation of ha 'if' clauses
  Conclusion
  References
Coordinate Ellipsis as Phonological non-insertion
  Forward and backward ellipsis
  Ellipsis as deletion vs. reconstruction vs. anaphora
  BWE as phonological deletion or morphological non-insertion
  FWE as morphological non-insertion: The "meaning postulate" cases
  Beyond meaning postulates? A further type
  The true nature of the difference between BWE and FWE
  Summary
  References
Effect of verb complexity on agrammatic aphasic's sentence production
  Introduction
  Agrammatic sentence production
  Patterns of verb production
  The present study
  Experiment I
  The structure of the verbs used in Experiment I
  Method
  Results
  Effects of morphological complexity
  Types of answers
  Argument assignment and thematic hierarchy (Agent (Experiencer (Goal/Source/Location (Theme))))
  Case assignment in isolated arguments
  The clausal answers (Type B and C)
  Word order in the clausal answers
  Summary
  Experiment II
  Test material
  Method
  Results of Experiment II
  Types of errors
  Summary
  References
How to Cope with "Free Word Order": An Efficient Part-of-Speech Tagging Method for Hungarian
  Abstract
Aspect and Argument Structure
  Introduction
  On Aspect
  Preliminaries
  The theory
  Classification of verbs
  Stative verbs
  Process verbs
  Accomplishment verbs
  Achievement verbs
  Classification of Verbs
  External arguments and aspectual verb classes
  Conclusion
  Footnotes
  References
Deconstructing syllable structure
  Empty positions in the skeleton
  The skeleton-melody relationship
  Empty skeletal positions and the null hypothesis
  Syllable structure
  Why have syllable structure?
  Problems with the standard view
  Empty nuclei in the skeleton
  Does the coda exist?
  Without codas
  Heavy versus light syllables
  Compensatory lengthening
  Against constituency
  Conclusion
  References
Phonotactic grammaticality and the lexicon
  Introduction
  The SPE-a algorithm
  The SPE-b algorithm
  The Greenberg and Jenkins algorithm
  Summary
  Notes
  Bibliography
Aspect and Argument Structure
  Introduction
  On Aspect
  Preliminaries
  The theory
  Classification of verbs
  Stative verbs
  Process verbs
  Accomplishment verbs
  Achievement verbs
  Classification of Verbs
  External arguments and aspectual verb classes
  Conclusion
  Footnotes
  References
Értékek azonossága-e az egyeztetés? [Is Agreement Value Sharing?]
  Introduction
  Problems with unification
  Agreement
  Subcategorization
  Generalization and type resolution
  Attribute/value pairs as properties
  Grammatical relations and co-ordinate structures
  Class-based analysis of agreement
  Definiteness in co-ordinate NPs
  Recursive definition of grammatical relations
  Summary and further perspectives
  Main statements of the paper
  Further perspectives
  References
A magyar igekötő egyeztetése [Agreement of the Hungarian verbal prefix]
  Introduction
  Basic concepts
  Is the verbal prefix an expletive?
  What syntactic relations are involved?
  The lack of person and number agreement
  An attempt at a description
  References
Representation of Linguistic Knowledge in GIN
  Introduction
  The GIN language
  Motivation
  Attribute/value structures
  AVSs and relations
  Representing relations in an AVS format
  Yet another fragment
  Multi-AVSs and types
  Multi-AVS type hierarchies
  The type resolution process
  References
Is Agreement Value Sharing?
  Abstract
  Problems with unification
  Agreement
  Subcategorization
  Generalization and type resolution
  Attribute/value pairs as properties
  Grammatical relations and co-ordinate structures
  Class-based analysis of agreement
  Definiteness in co-ordinate NPs
  Recursive definition of grammatical relations
  Summary and further perspectives
  Main statements
  Further perspectives
  References
Constructional CV phonology
  Abstract
  Construction phonology
  CV Phonology in a constraint-based setting
  Domain-final empty nuclei
  Intervocalic consonant clusters
  Domain-final consonant clusters
  Long vowels
  Domain initial clusters
  Exceptional domain final licensing
  Constructions and the hierarchical lexicon
  Phonotactics and morphophonology
  Types of suffixation
  General constraints on monomorphemic stems
  Epenthetic stems
  Lowering and exceptional licensing
  Synthetic suffixation
  Verbal stems and synthetic suffixation
  Pseudo-analytic suffixes
  References
Kormányzás-fonológia kormányzás nélkül [Government phonology without government]
  Licensing of consonants
  True consonant clusters
  Licensing of the coda
  Licensing of the empty nucleus
  Exceptional licensing
  Trochaic licensing
  Licensing domains
  Acknowledgements
  References
A helyelemek egyeztetése a CV-fonológiában [Agreement of place elements in CV phonology]
  Introduction
  Government domains
  True consonant clusters
  Structural parallels
  Assimilations
  Place assimilation between consonants
  Vowel harmony
  Summary
  Appendix
  Generalizations
  Acknowledgements
  References
Abstract

We analyzed some syntactic, semantic and phonological phenomena that presuppose the existence of interrelated components within the lexicon; these motivate the assumption that the global lexicon of a speaker contains several sublexicons. This result is confirmed by experimental findings in neurolinguistics. Hungarian-speaking agrammatic aphasics were tested in several ways. The results showed that the sublexicon of closed class lexical items provides a highly automated complex device for processing surface sentence structure.

Analysing Hungarian ellipsis data from a semantic-syntactic point of view, we established that the lexicon is best conceived of as split into at least two main sublexicons: a store of semantic-syntactic feature bundles and a separate store of sound forms. We also proposed a format for representing open class lexical items whose meanings are connected via certain semantic relations. We proposed a new classification of verbs to account for their contribution to the aspectual reading of the sentence, depending on the referential type of their arguments, and a new account of the syntactic and semantic behaviour of aspectual prefixes.

The partitioned sets of lexical items form sublexicons on phonological grounds. These sublexicons differ in terms of phonotactic grammaticality. The degrees of phonotactic grammaticality are tied up with the problem of psychological reality: how many degrees of phonological grammaticality are native speakers sensitive to?

We implemented a hierarchical construction network as an extension of the original Generalized Inheritance Network (GIN) formalism. This framework was used as a platform for the implementation of the grammar fragments.

Keywords: mental lexicon, sublexicon, neurolinguistics, syntax, semantics, morphology, phonological grammaticality, construction network, inheritance, GIN.
Objectives

The aim of the research was to give a description and analysis of the mental lexicon, with particular reference to sentence processing, in an interdisciplinary framework consisting of three submodules. The main tasks were the following:

Neurolinguistics submodule: Collection of spontaneous speech samples of agrammatic aphasics, construction of off-line test materials, testing aphasics and control subjects. Evaluation of the test results. Construction of models for lexical access and retrieval. Answering questions on the organisation of the mental lexicon, and on the nature of the most economical type of lexical access in an agglutinative language like Hungarian. Two articles.

Theoretical submodule:
Semantics: Describing what semantic principles, if any, determine the storage of certain lexical items, and what semantically based strategies speakers use to substitute for inaccessible lexical items under normal conditions and in aphasia. Collecting child language data on logical vocabulary.
Morphophonology: Answering questions on how we process forms showing stem and suffix alternations, how the phonological-morphological subsystems are organised, whether there are phenomena in language processing which support the concept of analytic and synthetic suffixation, and whether phonological, morphological and lexical information has a role in selecting the grammatical form. Two articles.

Computational linguistics submodule: Discussing the feasibility of implementing experimental results and theoretical analyses in Generalized Inheritance Networks. Computational and formal aspects of content-addressable inheritance systems developed for the encoding of hierarchical construction networks with relations. Two articles.

The objectives have been successfully achieved. Responding to the need of other submodules for investigations in the domain of syntax, the scope of research in the original Semantics submodule was extended in the course of the project to include topics in syntax as well. Thus, the four submodules working in the framework of the project were the following: neurolinguistics submodule, syntax-semantics submodule, phonology submodule, and computational linguistics submodule. (This change in the organisation of the project was indicated in the Interim Report.)
Findings

Importance: Due to the structural properties of Hungarian (agglutinative morphology, relatively free word order), certain basic assumptions of the traditional models of the mental lexicon may well be called into question. We have elicited data from normal native and non-native speakers of Hungarian as well as from agrammatic aphasic patients.

Scientific significance, innovative character: We analyzed some syntactic, semantic and phonological phenomena that presuppose the existence of interrelated components within the lexicon, which motivate the assumption of the existence of sublexicons within the global lexicon of a speaker.

1. This finding was confirmed by experimental results in neurolinguistics. We demonstrated that the sublexicon of closed class lexical items (the store of grammatical formatives, inflectional endings and suffixes in the mental lexicon) is critical for Hungarian-speaking agrammatic aphasics. Speakers access open class words (content words) and closed class items by two distinct access systems. The interaction of the two access systems provides a highly automated complex device for processing sentence structure. We characterised the representational complexity of verbs in the mental lexicon and its effect on aphasics' sentence production. The internal temporal structure of verbs and argument selection from the lexicon were also analysed.

2. We found that the two directional types of co-ordinate ellipsis, although displaying certain different properties, can be treated by the same syntactic mechanism: the non-insertion (rather than deletion or reconstruction) of phonological shapes into the terminal nodes of the structure. This presupposes a model of syntax with a split lexicon and late vocabulary insertion. We also isolated a subclass of cases of forward VP-ellipsis which is licensed by the semantic relations of meaning equivalence and entailment between propositions, the conditions of which are encoded in the representation of the individual lexical items in the lexicon. We propose a classification based on three aspectual classes of verbs, which gives the right predictions for the obligatory presence of certain arguments and for the interaction of the verbs and the different referential types of the arguments in the event structure.

3. The partitioned sets of lexical items form sublexicons on phonological grounds. These sublexicons differ in terms of phonotactic grammaticality. The phonotactic grammaticality of a string of segments is a measure of the extent to which a given string is a potential or actual lexical item. The question of how many degrees of phonotactic grammaticality are to be recognised phonologically is tied up with two issues: the problem of psychological reality (how many degrees of phonological grammaticality native speakers are sensitive to) and the possible partitioning of the lexicon into sublexicons on phonological grounds.

4. The primary goal was to model a hierarchical construction network enriched with the representation of relations that are the formal counterparts of correspondence relations in construction grammar. The implementation is an extended and updated version of the original GIN (Generalized Inheritance Network) framework.
List of publications

Neurolinguistics submodule

Bánréti, Zoltán 2000a. Closed class lexical items in sentence processing. A neurolinguistic approach. Ms., 38 pp. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Bánréti, Zoltán 2000b. Nyelvtan és mentális elemző neurolingvisztikai megközelítésben (Grammar and Parser from the point of view of neurolinguistics). Ms., 11 pp. Accepted for publication: 50 éves a Nyelvtudományi Intézet (50th anniversary of the Research Institute for Linguistics). Ed: Mária Gósy, Budapest.

Bánréti, Zoltán 2000c. Egyeztetés agrammatikus afáziában. A szintaktikai fa metszése (Agreement in agrammatic Broca's aphasia. The syntactic tree pruning). Ms., 15 pp. Submitted to: Néprajz és nyelvtudomány (Ethnography and Linguistics), Eds: M. Maleczki and L. Büky. Szeged.

Kiss, Katalin. Representational complexity of verbs and sentence production in agrammatic aphasics. Ms. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Syntax-semantics submodule

Bartos, Huba and Beáta Gyuris. Coordinate ellipsis as phonological non-insertion. Ms. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Gyuris, Beáta. 2000a. Semantic Bridging Effects in VP-Ellipsis. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Gyuris, Beáta. 2000b. The Interpretation of Universal Quantification in Child Language. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Gyuris, Beáta. 2000c. On the semantic interpretation of WHEN and IF clauses in Hungarian. To appear in Approaches to Hungarian VII. Ed: István Kenesei, JPTE: Pécs.

Tóth, Gabriella. Aspect and Argument Structure. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Phonology submodule

Rebrus, Péter. 2000a. Kormányzás fonológia kormányzás nélkül. [Government phonology without government]. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Rebrus, Péter. 2000b. A helyelemek egyeztetése a CV-fonológiában. [Agreement of place elements in CV phonology]. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Rebrus, Péter and Viktor Trón. Constructional CV Phonology. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Szigetvári, Péter. Deconstructing syllable structure. Ms., Eötvös Loránd University. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Törkenczy, Miklós. Phonotactic grammaticality and the lexicon. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Computational submodule

Rádai, Gábor. Implementing Construction Grammars in GIN. Submitted to Huba Bartos (ed.) Papers on the mental lexicon, Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest.

Csaba Oravecz, Péter Dienes, Zoltán Alexin and Tibor Gyimóthy. "How to Cope with 'Free Word Order': An Efficient Part-of-Speech Tagging Method for Hungarian". Abstract accepted for poster session at the Second International Conference on Language Resources and Evaluation, Athens, Greece, 31 May-2 June. Final paper due on Apr 2 and will appear in the proceedings.

László Kálmán and Viktor Trón. Is Agreement Value Sharing? Submitted to Natural Language and Linguistic Theory. English version of the paper presented at the conference titled "A Magyar Nyelv leírásának újabb módszerei IV." [Modern methods in the description of Hungarian IV.], Version of January,

László Kálmán and Viktor Trón. 1999a. Értékek azonossága-e az egyeztetés? [Is Agreement Value Sharing?] (co-authored with László Kálmán) To appear in: Proceedings of the Conference "A Magyar Nyelv leírásának újabb módszerei IV." [Modern methods in the description of Hungarian IV.]

László Kálmán and Viktor Trón. 1999b. A magyar igevivő egyeztetése. [Agreement of the Hungarian Verb Carrier.] (co-authored with László Kálmán) To appear in: Proceedings of the Conference "A Magyar Nyelv leírásának újabb módszerei IV." [Modern methods in the description of Hungarian IV.]

László Kálmán and Viktor Trón. 1999c. Linguistic Representations in GIN. (Unpublished manuscript)
Detailed summary of the results of the research

The main research findings are presented separately for the four research groups: the Neurolinguistics submodule, the Syntax-semantics submodule, the Phonology submodule, and the Computational linguistics submodule.

Neurolinguistics submodule

Participants: Zoltán Bánréti and Katalin Kiss.

Types of sublexicons in the mental lexicon

Open and closed class lexical items in the mental lexicon and their role in sentence processing

Sentence repetition tests

In the course of sentence repetition tests our agrammatic aphasic patient gave answers that were suggestive of initial structure building operations. With respect to stress patterns, each target sentence in the test was neutral. Hungarian is an inflectional language where the verb assigns case to noun phrases by means of case endings that mark theta roles in surface structure. Our patient was tested with the help of the strategy of monitored repetition. The patient processed the sentence both syntactically and semantically, then attempted to produce an utterance which matched the phonological, syntactic, and semantic properties of the original utterance. The performance of our patient's parser can be characterised as follows. In comparison with the target sentence,

1. it is possible for the parser to approximate the class of the target predicate, and its case frame is retrievable;
2. if a different predicate is retrieved, then the suffixes are those appropriate to the case frame of the "original" predicate;
3. if the predicate is missing, the parser stops; for instance, it cannot list only the NPs from the target sentence;
4. it is possible to fill one slot from the argument frame of the predicate with selectional restrictions that are the same as (or very much like) the original;
5. knowledge about missing, lexically or phonologically null arguments is manifest in further search attempts that either mention case endings without a content word or link them to pronouns or neologisms, in the repetition of case endings, or in compensatory speech.

Grammaticality judgement tests

We tested a total of five Hungarian Broca's aphasics. Our grammaticality judgement tests covered some relevant features of Hungarian syntax and the lexicon. Three cases are worth attention in this respect: (i) there were some easy tasks, where the acceptability judgements of the patients coincided with the expected answers in 100 per cent of the cases; (ii) in some cases we witnessed guessing, since judgements turned out to be essentially random and chaotic from a statistical point of view; (iii) in some other cases we faced systematic misjudgement of the data, which means that acceptable sentences were judged as good in 100 per cent of the cases, but their unacceptable counterparts were also judged as good in 100 per cent of the cases, or at least close to 100 per cent.

The distribution of judgements supports a time-based approach to the parser. The plausibility of an account based on asynchrony between syntactic and lexical processes is motivated in the following way. The parser produces a structural frame for all possible sentences. This syntactic frame contains categorised slots. When the configuration of surface case endings assigned by the category Verb to its complements and the configuration of other closed class items are in their active phase in working memory, they define and open up syntactic slots for the content word fillers. Content words are generated by the lexicon and inserted into their slots in the syntactic frame. Impairments of the accessibility of closed class morphemes create syntactic difficulties. Normal activation happens at the expense of fast decay and, vice versa, normal decay happens at the expense of slow activation.

Applying this theory to our data, we find the following. Specific features of syntactic subcategories and closed class morphemes can be activated at a normal rate, but then they decay very fast and are lost from working memory too early; or they can be retained at a normal rate, at the expense of their slow activation in working memory. In the fast decay case, other specific lexical information has not yet been activated when it is needed. In the slow activation case, other specific lexical information in working memory is already gone when needed. The fast decay or slow activation of grammatical features and subfeatures causes a desynchronization in the building of syntactic structure. Syntactic slots are opened up too late or too early for the content word filler: the specific lexical information in working memory has either not been activated yet or is already gone when needed. Therefore patients are not able to complete the analysis of stimuli, and processing operations result in a merely sketchy and unfinished structure. Patients were aware of their unfinished analysis and often commented on it. This could lead to guessing responses on complex, non-local relations. Patients were able to use the initial structure building operations involved in the first-pass parse for correct judgements in easy tasks.
In unimpaired speakers, the first-pass parse must be tightly synchronised with a second major parsing module which extracts detailed and specific features of the category of the arguments and the predicate. But fast decay or slow activation of specific, unprotected information in working memory can cause desynchronization between processing modules. The consequences are systematic misjudgements or guessing responses, depending on the type of grammatical error and the complexity of the sentence to be judged. Since closed class items have to be integrated with their categorised slots in the syntactic frame, and open class (content) words have to be inserted into their categorised slots in the syntactic frame as well, these two kinds of integration require synchronisation, that is, the synchronised activation of structure building elements in working memory for language. The slow activation or fast decay of closed class items leads to a desynchronization between the syntactic slots opened up by closed class items and the active phase of content word fillers.

Theoretical results

Differences in memory time between open class and closed class items are important for accessing items in the mental lexicon. Closed class items may fade from memory so fast that the construction of a proper NP or Sentence (for instance) is doubtful. Temporal deficits do not affect the initial structure building operations. Our tests present empirical evidence for the fact that syntactic and lexical processes are partially autonomous routines. This becomes apparent in the case of a working memory deficit. The type of element affected by the temporal deficit does make a difference, however. When function word nodes are affected, the required pattern does not emerge. It appears only when phrasal category nodes are impaired.
Although the patient's restricted working memory time may not be sufficient to produce a full sentence representation, it is nevertheless sufficient for the judgement of a verb and a string of inflectional endings (related to that verb). This is compatible with the assumption that the patient has to trade the processing of surface form against lexical access. (Inflection is part of the surface parser module, but we do not claim that this (sub)module would not be impaired.) Therefore we apply the first-pass parse hypothesis. The hypothesis of initial structure building operations has been proposed by a number of psycholinguists. In accordance with this hypothesis, we assume that in the sentence repetition test an initial structural analysis is computed and is subsequently interpreted. This is followed by later processing operations involving constraints on the indexing of structures that contain content lexical items. The first-pass parser protects some of the processed lexical and syntactic information during the first-pass parse, and a working memory deficit restricts further processing operations.

The patients' performance in the repetition tasks showed that the verb is the starting point for the surface syntactic analysis. The patients never made both inflectional errors and errors in the choice of the main verb in the same sentence. Patients made inflectional errors when they were unable to retrieve any verb. If the patients approximated the class of the target verb, however, then its surface case frame was retrievable for them. The case frame assigns suffixes to the associated nouns, and it does so even if the nouns to which the endings are to be attached cannot be correctly reproduced by the patients. It can be assumed that in the first phase of processing the parser selects surface syntactic information (subcategorization frame of the verb, surface case frame, word order). Closed class elements provide a syntactic frame into which open class items are inserted in the course of sentence processing. In non-fluent aphasia the surface syntactic parser is too slow in processing closed class lexical items, so lexical information in working memory is already gone when needed. The subjects are unable to integrate the output of the syntactic parser with the segments of the lexical process.
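The time-based account described above can be illustrated with a small sketch. It is not the authors' implementation: the class `ActiveWindow`, the function names and all time values are hypothetical and chosen only to show how fast decay or slow activation of a closed class item can leave its syntactic slot out of step with the content word filler. The only assumption taken from the text is that a filler can be integrated only while both the slot and the filler are active in working memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveWindow:
    """Interval (arbitrary time units) in which an item is active in working memory."""
    onset: float
    offset: float


def active_window(activation_delay: float, retention: float) -> ActiveWindow:
    """Hypothetical helper: an item becomes active after `activation_delay`
    and stays active for `retention` time units."""
    return ActiveWindow(activation_delay, activation_delay + retention)


def classify(slot: ActiveWindow, filler: ActiveWindow) -> str:
    """Compare the slot opened by a closed class item with a content word filler."""
    # Integration is possible only if the two active phases overlap.
    if max(slot.onset, filler.onset) < min(slot.offset, filler.offset):
        return "synchronised"
    # The slot has already decayed before the filler becomes active.
    if slot.offset <= filler.onset:
        return "slot too early"
    # The slot opens only after the filler has already decayed.
    return "slot too late"


# Illustrative values only (assumptions, not measured data).
FILLER = active_window(activation_delay=3.0, retention=3.0)
NORMAL_SLOT = active_window(activation_delay=1.0, retention=4.0)
FAST_DECAY_SLOT = active_window(activation_delay=1.0, retention=1.0)
SLOW_ACTIVATION_SLOT = active_window(activation_delay=7.0, retention=4.0)

if __name__ == "__main__":
    for name, slot in [
        ("normal", NORMAL_SLOT),
        ("fast decay", FAST_DECAY_SLOT),
        ("slow activation", SLOW_ACTIVATION_SLOT),
    ]:
        print(name, "->", classify(slot, FILLER))
```

With these illustrative values, the fast decay case corresponds to a slot that is opened too early for the content word filler, and the slow activation case to a slot that is opened too late.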
Publications resulting from the research: Bánréti (2000a,b,c) in the List of publications.

Representational complexity of verbs in the mental lexicon and its effect on aphasics' sentence production

"Picture description/action naming" tests

We investigated how Hungarian Broca's aphasic patients can lexically select and retrieve verbs which differ in their representational complexity, and how they are able to construct Verb Phrases and simple sentences using the target predicate. An off-line method, a picture description/action naming test, was used as the elicitation task; the data were interpreted in the theoretical framework of Government and Binding Theory.

The structure of the verbs used in the tests

Based on the complexity of their argument structure, the verbs tested in the present study formed three main groups: one-place intransitive predicates, which take only one Agent or Experiencer argument; two-place verbs; and three-place predicates. The verbal performance of two agrammatic Broca's aphasic patients was analysed. Both patients are native speakers of Hungarian. Our elicitation method was an action naming / picture description test. The pictures represented the target verbs/actions. We regarded an answer as complete if the patient was able to build the whole Verb Phrase or sentence. This means that the verb and its complements were lexically accessible, the argument NPs were supplied with the appropriate overt case marker, noun-verb agreement was intact and nonterminal node deletion did not occur.

Analysis of experimental results

Comparing the distribution of the retrievable target verbs within each verb group, we found the following order of verb retrieval, from the most accessible to the most difficult group:

simple 1-place > morphologically complex 1-place = transitive (2-place) > 3-place (with locative and dative complement) > 2-place with locative complement
Access to the simple one-place verbs was outstandingly successful. The lexical selection of the two-place verbs with a locative complement proved to be the most difficult for the patients (they could not retrieve any verb in this group). These predicates were directional motion verbs. The lexical representations of these verbs integrate mental knowledge related to the cognitive representation of space or spatial relations. These verbs include such contents as the direction of the motion, place coordinates, starting point and end point. This information is encoded in the semantic representation and thematic roles of the predicate. Processing this information seemed to be more difficult for our patients: they showed a marked selection disorder when attempting to produce these verbs. The ratio of the three-place verbs was lower in the sample than the proportion of the one-place and two-place verbs.

The empirical results show that the representational complexity of the predicate has a direct effect on the lexical accessibility of the verb for agrammatic aphasics. The complexity of the argument structure of the verb (the number of obligatory arguments) plays an important role in verb retrieval, but it is not the only factor. The morphological and semantic representational complexity of the one-place derived verbs and the semantic representational complexity of the two-place locative verbs also had an effect on the lexical-semantic selection of the predicates.

According to the thematic hierarchy hypothesis, the argument structure of the verb is not only a set of arguments. It has its own internal structure, which represents prominence relations that are determined by the thematic information of the predicate.
Grimshaw suggested a proto-argument structure, a structured representation of arguments based on the thematic hierarchy:

(Agent (Experiencer (Goal/Source/Location (Theme))))

The subjects were able to produce arguments of every type (Agent, Theme, Goal, Benefactive), but a difference was found in the distribution of the type of argument activated first. Arguments lower in the thematic hierarchy were activated more frequently than the more prominent arguments of a given predicate (e.g. Theme > Benefactive > Agent; Theme > Agent; Goal > Agent). Two exceptions to this tendency were found, namely the Agent > Goal order in the three-place locative group and the Agent > Theme order in the transitive [+animate] group. Comparing the proportions of arguments, an outstanding contrast was found between the activation of Agent and Theme arguments in the transitive [-animate] and 3-place dative verb groups. In the 3-place locative group the Agent vs. Theme contrast was not so sharp; rather, the Goal/Source vs. Theme and the Agent vs. Goal/Source contrasts were considerable. The contrast between the Agent and Goal arguments was also less sharp in the two-place locative type of verbs. The data show that the less prominent Theme argument was activated faster than the other arguments if the predicate assigned the Theme role to an object NP specified as [-animate]. Activation of the Theme argument fell behind that of the Agent only if the verb was reversible (that is, if the Theme role was mapped onto an object specified as [+animate, +human]). The [-animate] Theme argument seems to be a preferred one for agrammatic aphasics.

Theoretical results

The representational complexity of the verbs had a direct effect on the accessibility of the predicates. Morphologically simple one-place predicates were produced in the highest number.
Much lower proportions were found for the morphologically complex one-place predicates and the transitive verbs, and only a few verbs were activated in the 3-place verb groups. Production of the directional motion verbs proved to be the most difficult for the patients. These data show that the argument structure complexity of the verb is an important factor in the lexical selection of predicates, but not the only one. The semantic representational and morphological complexity of the predicate is also relevant in the lexical-semantic selection of the verbs. Dysfunction of the syntactic structure building mechanisms was connected with the lexical accessibility of the formatives and the nominal elements of the phrase structure. The reduced capacity to preserve previously activated argument Ns or NPs played a role in the unsuccessful structure building operations.
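As an illustration of the activation-order finding reported above, the hierarchy (Agent (Experiencer (Goal/Source/Location (Theme)))) can be read as a prominence ranking, and each observed order can be checked against it. The following is a minimal sketch, not part of the original study: the numeric ranks and the function name `is_lower_first` are assumptions introduced here, and Benefactive is left out because the hierarchy as quoted does not assign it a position.

```python
# Prominence ranking read off the hierarchy quoted in the text:
# (Agent (Experiencer (Goal/Source/Location (Theme))))
# Lower number = more prominent. The numeric encoding is an assumption of this sketch.
PROMINENCE = {
    "Agent": 0,
    "Experiencer": 1,
    "Goal": 2,
    "Source": 2,
    "Location": 2,
    "Theme": 3,
}


def is_lower_first(activation_order):
    """Return True if every role is less prominent than the role activated after it."""
    ranks = [PROMINENCE[role] for role in activation_order]
    return all(earlier > later for earlier, later in zip(ranks, ranks[1:]))


# Activation orders mentioned in the text (first-activated role on the left).
observed = {
    "Theme > Agent": ["Theme", "Agent"],
    "Goal > Agent": ["Goal", "Agent"],
    "Agent > Goal": ["Agent", "Goal"],
    "Agent > Theme": ["Agent", "Theme"],
}

for label, order in observed.items():
    verdict = "lower-first tendency" if is_lower_first(order) else "exception"
    print(f"{label}: {verdict}")
```

Under this encoding the two orders that begin with Agent come out as exceptions, matching the two exceptional groups named in the text (three-place locative and transitive [+animate]).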
Agrammatic performance can be interpreted as the result of asynchrony between the level of semantic operations (activation of argument structure and thematic information) and the level of syntactic processing (procedures that construct the syntactic phrase structure and map the arguments/thematic roles into the syntactic frame): the mechanisms at the two levels cannot operate simultaneously.

Publication resulting from the research: Kiss (2000), in the List of Publications

Ellipsis in sentence processing

Agrammatic aphasics overuse the linguistic option of ellipsis in free conversation. Ellipsis is understood here as the omission of the projection of V after a focused DP or a quantified DP in a surface syntactic string. The use of elliptical sentences in spontaneous speech can be considered a kind of adaptation or reaction. Broca's aphasic patients are aware of their reduced linguistic capacity, and that is why they employ elliptical constructions. The role of this strategy is to prevent computational overload in the linguistic system. Employment of this strategy (among others) is optional rather than obligatory.

Hungarian grammar allows both forward and backward Verb Phrase ellipsis. Forward VP Ellipsis (FVPE) is interpretively dependent on its antecedent. The syntactic tree is complete: the syntactic and semantic features of the lexical items are present in the ellipsis site, and it is only the phonological form that is not inserted. There is no need to assume the deletion of lexical items. Backward VP Ellipsis (BVPE) sites, by contrast, result from a deletion of lexical forms. There are thus identification asymmetries between forward and backward ellipsis. Forward ellipsis sites contain lexical content throughout the derivation but fail to undergo phonological form-insertion. Backward ellipsis sites result from deletion after form-insertion.
Sentence repetition tests

We tested the neurolinguistic reality of the identification asymmetries with respect to the direction of VPE mentioned above. Our subjects were agrammatic Broca's aphasics. The test material consisted of co-ordinated sentences with VP ellipsis sites. Each test contained 15 sentences with forward VP ellipsis and 15 sentences with backward VP ellipsis. Two subjects were given the test on three different occasions. The sentence patterns were filled with different (though equally frequent) words in each test, but the sentence structures themselves were not changed. To repeat the sentences, the patients pursued a strategy of monitored repetition involving two basic operations: (1) processing the heard utterance both syntactically and semantically, then storing it; (2) attempting to produce an utterance which matches the phonological, syntactic, and semantic properties of the original utterance.

Analysis of data

Identification asymmetries between FVPE and BVPE are relevant for real sentence processing operations as well. Repetition of BVPE imposed syntax/phonology interface requirements that exceeded the impaired capacity of the language processor in agrammatic aphasics. Producing co-ordinated sentences with FVPE in a repetition test requires the patient to store a content-based representation of the heard co-ordinated sentence, then convert it into a surface syntactic and phonological form. Supposing the processor builds structures from left to right, the direction of ellipsis introduces no built-in delay in these processes. Patients often produced the elided VP in overt phonological form at its correct position in the second conjunct. It was easy for them to reconstruct FVPE in overt phonological form. To produce a co-ordinated sentence with BVPE in a repetition test, it is necessary to recover the deleted lexical forms in the first conjunct.
If structures are built from left to right, there is a built-in delay in the operations because of the direction and identification level of the backward VPE. Recovery is delayed because the deletion is located in the first conjunct, while the phonologically realised licensing string is found in the second conjunct. Patients were able to repeat only the second conjunct in its correct grammatical form. The first, elliptical conjuncts were often fragmented and
ungrammatical. The elided VP in the first conjunct was rarely produced in its overt phonological form. Patients did not reconstruct the BVPE overtly in the first conjunct. This is because the demands of the task increased to an extent that exceeded the capacity of the impaired processor.

Theoretical results

Properties of sentence repetition in aphasia result from the operation of normal processing principles under exceptional circumstances, in the face of impaired computational resources. The data characterised above are relevant from the point of view of a time-based parser. The differences are related to timing. Generally speaking, when a non-empty Verb appears, most of the pieces needed to assemble the clause properly are available. In the case of backward ellipsis, the question is how to build a sketchy structure for the FIRST clause without the lexical material of a non-empty V-bar. In the case of forward ellipsis, the parser first processes the lexical material of the antecedent V-bar, then tries to analyse an empty category later. The parser is able to identify an empty (elided) category with the help of surface structural parallelism. Forward ellipsis is easier for impaired speech, because the same category is used twice in two parallel structures but an overt lexical form is mentioned only once. Backward ellipsis is harder for a time-based parser. Backward ellipsis means that an empty category is detected first. At the very moment when the parser detects an empty V-bar, there is no information about its subtype and no information about the lexical material of the V-bar. The necessary decision is postponed. The parser must put that empty category into the memory buffer and wait for a later lexical item, namely the lexical material of the V-bar in the second clause.
After processing the later V-bar, the parser tries to establish the identity of the phonological form of that V-bar and the elided V-bar, and to copy back the semantic/syntactic features. Backward ellipsis must therefore cause a delay in the structure building operations for the first co-ordinated clause.

A hypothesis on the structure of the mental parser

Suppose the following structure-building operations. The mental parser must produce a structural frame for all possible sentences. This syntactic frame contains categorised slots. When the configuration of surface case endings assigned by the Verb to its complements and the configuration of other closed class items are in their active phase in working memory, they define and open up syntactic slots for the content word fillers. Content words would be generated by the lexicon and inserted into their slots in the syntactic frame. Because closed class items have to be integrated with their categorised slots in the syntactic frame, and open class (content) words have to be inserted into their categorised slots as well, these two kinds of integration require synchronisation, that is, a synchronised activation of structure building elements in working memory for language. Slow activation or fast decay of closed class items leads to a desynchronisation between the syntactic slots opened up by closed class items and the active phase of the content word fillers.

We define the mental parser as an automaton which becomes specialised in the processing of the categories and features involved in the grammatical representation of sentences. Under this view the parser is a device which transfers information between a grammatical representation and a message level representation. The parser computes the grammatical representations of sentences and transforms them into a message level representation (at which "what is to be said" is represented).
The category and feature system in the grammatical representation is hierarchical. It has various levels of sub- and sub-subcategories, from the bare category to the individual lexical item, and from the closed class category to the fully specific features of a given closed class item. It is then a question of capacity and synchronisation how far down this hierarchy the parser goes in its search for information. The distribution of the patients' performance in the tests reflects the limitations on the interface between the impaired parser and the grammatical representation containing a hierarchy of categories and their features.
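The timing difference between forward and backward ellipsis described above can be illustrated with a toy left-to-right processor. This is only a sketch under simplifying assumptions, not the parser developed in the project: the `Slot` class, the `resolve_ellipsis` function, the placeholder forms ("DP-1", "V-bar-form") and the idea of measuring delay as the number of positions an empty V-bar waits in the buffer are all introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Slot:
    category: str          # e.g. "DP" or "V-bar"
    form: Optional[str]    # overt phonological form, or None if elided


def resolve_ellipsis(slots: List[Slot]) -> Tuple[List[Optional[str]], int]:
    """Scan slots left to right and fill empty V-bars from an overt V-bar.

    Returns the resolved V-bar forms in order, plus the total number of
    positions that empty V-bars spent waiting in the buffer.
    """
    last_overt: Optional[str] = None
    buffer: List[Tuple[int, int]] = []   # (index in `resolved`, position detected)
    resolved: List[Optional[str]] = []
    delay = 0

    for position, slot in enumerate(slots):
        if slot.category != "V-bar":
            continue
        if slot.form is not None:
            # Overt V-bar: record it and release any buffered empty V-bars.
            last_overt = slot.form
            resolved.append(slot.form)
            for index, detected_at in buffer:
                resolved[index] = slot.form
                delay += position - detected_at
            buffer.clear()
        elif last_overt is not None:
            # Forward ellipsis: the antecedent is already available, no waiting.
            resolved.append(last_overt)
        else:
            # Backward ellipsis: nothing to copy yet, so the decision is postponed.
            resolved.append(None)
            buffer.append((len(resolved) - 1, position))

    return resolved, delay


forward = [Slot("DP", "DP-1"), Slot("V-bar", "V-bar-form"),
           Slot("DP", "DP-2"), Slot("V-bar", None)]
backward = [Slot("DP", "DP-1"), Slot("V-bar", None),
            Slot("DP", "DP-2"), Slot("V-bar", "V-bar-form")]

print(resolve_ellipsis(forward))
print(resolve_ellipsis(backward))
```

In this toy model both patterns end up with the same resolved V-bar forms, but only the backward pattern accumulates a non-zero delay, which is the asymmetry the text attributes to a time-based parser.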
For publications resulting from the research see Bánréti (2000a,b) in the List of publications

Syntax-semantics submodule

Participants: Beáta Gyuris, Gabriella Tóth

Ellipsis and the structure of the lexicon

The aim of the syntax-semantics research group was to contribute to the general research aim of the project, the investigation of the internal structuring of the mental lexicon, from a syntactic and semantic perspective. More particularly, this meant isolating some of the sub-lexicons which are distinguished from the others in terms of syntactic and semantic behaviour. Structures containing ellipsis have always been at the forefront of the interests of syntacticians and psycholinguists, since the production and understanding of structures which contain a missing element can shed a lot of light on the principles of mental computation and the knowledge of grammar. The neurolinguistics subteam of our project group decided to investigate the mental lexicon of aphasic speakers by testing their comprehension and production of sentences containing ellipsis. In order to foster internal communication and co-operation within the project, instead of dealing with the three individual research topics specified in the proposal, we therefore concentrated from the beginning on the semantic analysis of Hungarian sentences containing ellipsis as a means of fulfilling our original research objectives. (The change of perspective was already indicated in the Interim Report.)

In the domain of syntax, we examined a wide range of ellipsis types in order to explore the mechanisms underlying ellipsis phenomena. In particular, we wished to test a grammar model based on the theory of Distributed Morphology (Halle & Marantz 1993), in which ellipsis is implemented as the non-insertion of phonological shapes for certain (strings of) lexical elements.
This approach, stemming from work by Wilder (1997) and Bartos (to appear), abandons the traditional analyses of ellipsis, which were set in terms of either phonological deletion or the content reconstruction/interpretation of syntactically null elements. It places ellipsis in the context of a model in which what used to be seen as a unitary lexicon is split up into two (or even three) separate lexical modules: one containing the lexical items as semantic-syntactic feature bundles, and another containing the sound forms, which are associated with the formal featural lexical items after the syntactic derivation. (A third possible lexical module would contain idiomatic meaning units, to be associated with the lexical items postsyntactically.)

In the first phase, we carefully studied the relevant literature within the generative grammatical tradition in order to have a firm grasp of existing analyses of ellipsis (study trip by B. Gyuris to Edinburgh). On the basis of a summary of this literature, we examined data from Hungarian to establish what language-specific properties are involved, so as to be able to abstract away from such particularities. These data also served as the primary area of our own analyses. The results and findings of this stage were presented at a meeting of the whole project group. We analysed VP-ellipsis data from Hungarian and English (taking this type of ellipsis to be representative) and found that functional items do not behave uniformly under ellipsis: formal agreement items show much wider variability between ellipsis targets and licensers than tense/mood markers, which essentially pattern with open-class items, i.e. content words.

In the second phase, we focused our attention on the comparison of backward and forward ellipsis (i.e. whether the elided material is in the first or the last conjunct in a co-ordinate structure).
These two subtypes have often been given different analyses because they display characteristically different properties: while in backward ellipsis the elided part must be fully form-identical to the parallel part of the final conjunct, for forward ellipsis the constraints are less strict, e.g. full form-identity is not (always) required. Instead of following the mainstream, which attributes this difference to different executions of elision (deletion under identity vs. reconstruction under