Analysis of spoken London English using corpus tools
Summary: The project uses anonymised transcripts of sociolinguistic interviews prepared as part of the ESRC project Linguistic Innovators: the English of adolescents in London (RES 000-23-0680, PI Paul Kerswill, CI Jenny Cheshire) to perform semi-automated corpus analyses of grammatical and discourse features.
Key Facts
Funder: British Academy
Type of Activity: Academic Research - Externally Funded
Principal Investigator: Eivind Torgersen
Co-investigator: Paul Kerswill
Research Associate: Costas Gabrielatos
Dept/Research Groups: Linguistics and English Language, Language Variation and Linguistic Theory (LVLT)
Keywords: Corpus linguistics, Corpus tools, Sociolinguistics, Grammar, English grammar, English language, Computerised corpora, Corpus linguistic methodology, Language variation and change
Project Description
The project will undertake a corpus analysis of one grammatical and one discourse feature.
The grammatical feature to be analysed is the distribution and form of the indefinite article 'a'/'an' in front of vowel sounds. The analysis will look for correlations between the choice of 'a' or 'an' and the semantic features of the noun phrase following the indefinite article. We will examine both the linguistic and sociolinguistic contexts in which the indefinite article occurs, including possible effects of word frequency, spelling, the quality of the following vowel, and word stress, as well as sociolinguistic information.
The discourse feature to be analysed is the tag. The London interview data contain a very large number of discourse markers, which would be very time-consuming to analyse manually. We will examine the use of fillers and tag questions as well as lexis and formulaic phrases used as discourse markers (eh, okay, right, yeah, innit). Previous research based on the COLT corpus has suggested an increase in the use of tags, in particular 'innit' and 'right', as teenagers get older, but the results may be unreliable because the dataset was small. Our dataset is large and covers exactly that age group. The COLT data are from 1993 while ours are from 2005, so it is possible to observe change and also to identify new tags.
We will also examine effects of gender and ethnicity. In COLT there is a significant difference between males and females in the distribution of 'yeah' and 'innit', and a difference between ethnic groups in the distribution of 'right' and 'innit': the ethnic minority speakers use more tags, but the manner and contexts in which the tags are used have not been investigated. We will look at phonetic context, word frequency, word length and location in the utterance in addition to the sociolinguistic variables.
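As a rough illustration of what a semi-automated pass over the transcripts might look like, the following Python sketch counts 'a' versus 'an' before vowel-initial words and computes tag frequencies per 1,000 words. It is not the project's actual tooling: the function names, the tag list (taken from the examples mentioned above) and the use of spelling as a stand-in for vowel sounds are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical tag inventory, based only on the examples named in the description.
TAGS = {"eh", "okay", "right", "yeah", "innit"}

def tokenize(utterance):
    """Lowercase an utterance and split it into word tokens."""
    return re.findall(r"[a-z']+", utterance.lower())

def article_before_vowel(tokens):
    """Count 'a' vs 'an' where the next word starts with a vowel letter.

    Assumption: spelling is only a rough proxy for a vowel sound
    ('a university', 'an hour'), so a real analysis would need
    pronunciation information rather than the first letter.
    """
    counts = Counter()
    for current, following in zip(tokens, tokens[1:]):
        if current in ("a", "an") and following[0] in "aeiou":
            counts[current] += 1
    return counts

def tag_rate_per_1000(tokens):
    """Return occurrences of each candidate tag per 1,000 word tokens.

    This counts every occurrence of the word form, so non-tag uses
    (e.g. 'right' as an adjective) are included and would need
    manual checking.
    """
    total = len(tokens)
    if total == 0:
        return {}
    hits = Counter(t for t in tokens if t in TAGS)
    return {tag: 1000 * n / total for tag, n in hits.items()}

# Invented example utterance, not taken from the interview data.
tokens = tokenize("I saw a apple and an orange, innit")
article_counts = article_before_vowel(tokens)
tag_rates = tag_rate_per_1000(tokens)
```

Counts of this kind could then be cross-tabulated against speaker metadata such as gender, ethnicity and age, with the automatically retrieved hits checked by hand before any conclusions are drawn.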
Purpose of Research
Academic Research - Externally Funded
Project Funder
British Academy - £6877
