Department of Linguistics and English Language, Lancaster University, LA1 4YT, United Kingdom
Tel: +44 (0) 1524 593045 Fax: +44 (0) 1524 843085 E-mail: linguistics@lancaster.ac.uk


CORGRAM: Corpus-based grammar in contrast


Summary: The CORGRAM project is a quantitative investigation into the distributional properties of grammatical categories associated with nouns and verbs in three Indo-European languages. The project will explore the application of novel corpus-based methods to a set of issues in grammatical analysis, in the context of a language, Nepali, for which corpus linguistics is in its infancy. It will also extend this analysis to a cross-linguistic comparison with English and Russian.

Project Description

Previous work in the field of Nepali grammar has catalogued the combinations of grammatical and lexical elements that can occur. For example, Acharya (1991:78, 153, 157) lists 13 combinations of nouns and case-marking postpositions, and 360 different inflected forms of the Nepali lexical verb. Schmidt et al. (1993:xxi-xxvi) give similar catalogues of possibilities. However, to date little or no work on this topic, or on Nepali grammar in general, has been based on the large-scale analysis of grammar in usage that corpus-based methods afford.

The grammatical categories of case (on nouns) and tense, aspect and mood (on verbs) are realised in Nepali as partially-bound elements which typically occur in close proximity to the nouns and verbs they relate to. Case, as well as the plural-collective marker, is indicated by post-nominal elements described variously as suffixes, clitics and postpositions. Tense, aspect and mood are largely marked by compounded auxiliary verbs, which however can also occur independently.

The semi-independence of these grammatical markers implies a degree of variety in their possible positions in the sentence structure. This raises the possibility of studying these markers, and the grammatical patterns in whose formation they participate, via quantitative analysis of their co-occurrence patterns in textual data. As outlined below, this may be accomplished by searching a corpus for statistically significant collocations. Collocation-based methods have been applied to the grammar of English, but not widely in a cross-linguistic context.
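As a rough illustration of what a collocation search of this kind involves, the sketch below scores adjacent word pairs with the log-likelihood statistic (G2), a measure commonly used in corpus linguistics. It is a minimal sketch under stated assumptions, not the project's own method: the function names, the restriction to adjacent pairs, and the placeholder tokens (noun_a, post_x and so on) are all invented for illustration and do not come from any real corpus.

```python
import math
from collections import Counter


def log_likelihood(o11, o12, o21, o22):
    """G2 statistic for a 2x2 contingency table of observed counts.

    o11: pairs (w1, w2); o12: pairs (w1, not w2);
    o21: pairs (not w1, w2); o22: pairs (not w1, not w2).
    """
    n = o11 + o12 + o21 + o22
    row1, row2 = o11 + o12, o21 + o22
    col1, col2 = o11 + o21, o12 + o22
    total = 0.0
    for observed, row, col in (
        (o11, row1, col1),
        (o12, row1, col2),
        (o21, row2, col1),
        (o22, row2, col2),
    ):
        if observed > 0:  # a zero cell contributes nothing to the sum
            expected = row * col / n
            total += observed * math.log(observed / expected)
    return 2.0 * total


def bigram_scores(tokens):
    """Score every adjacent token pair in a list of tokens with G2."""
    pairs = list(zip(tokens, tokens[1:]))
    n = len(pairs)
    pair_counts = Counter(pairs)
    first_counts = Counter(w1 for w1, _ in pairs)
    second_counts = Counter(w2 for _, w2 in pairs)
    scores = {}
    for (w1, w2), o11 in pair_counts.items():
        o12 = first_counts[w1] - o11
        o21 = second_counts[w2] - o11
        o22 = n - o11 - o12 - o21
        scores[(w1, w2)] = log_likelihood(o11, o12, o21, o22)
    return scores


# Hypothetical toy input: placeholder tokens, not real Nepali data.
toy_tokens = [
    "noun_a", "post_x", "verb_a",
    "noun_b", "post_x", "verb_b",
    "noun_a", "post_y", "verb_a",
    "noun_a", "post_x", "verb_b",
]

if __name__ == "__main__":
    ranked = sorted(bigram_scores(toy_tokens).items(),
                    key=lambda item: item[1], reverse=True)
    for pair, score in ranked:
        print(pair, round(score, 2))
```

Note that G2 measures how far a pair departs from chance co-occurrence in either direction, so a high score alone does not distinguish attraction from avoidance; comparing the observed and expected counts settles that. A real study would also work over a much larger, annotated corpus and would likely look beyond strictly adjacent pairs.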

The questions to be addressed are in summary:

References

Acharya, J (1991) A descriptive grammar of Nepali. Washington, D.C.: Georgetown University Press.

Schmidt, RL, Dahal, BM, Pradham, KB and Vajracharya, G (eds.) (1993) A practical dictionary of Modern Nepali. Delhi: Ratna Sagar.

Purpose of Research

Academic Research - Externally Funded

Project Funder

AHRC - £117,886

Associated Events

Workshop: Introduction to CQPweb

Date: 18 September 2008 Time: 14.00-16.00

Linguistics and English Language workshop: Introduction to CQPweb. CQPweb is a new corpus analysis tool. It is designed as a clone of the ...

 

 
