Sociolinguistics and Corpus Linguistics

Baker, P. (2010) Sociolinguistics and Corpus Linguistics. Edinburgh: Edinburgh University Press.

Book description

Sociolinguistics and Corpus Linguistics is the first book to focus on the ways that corpus linguistics approaches can be used in order to aid sociolinguistic research. Corpus linguistics and sociolinguistics have a great deal in common in terms of their basic approaches to language enquiry, particularly in terms of providing representative samples from a population and analysing quantitative information in order to study variation or differences between populations. The book covers a range of different topics within sociolinguistics: analysing demographic variation, comparing language use across different cultures and examining language change over time, studying transcripts of spoken interactions and identifying attitudes or discourses. The book references many key and recent studies in the field as well as featuring original analyses of a number of corpora including the British National Corpus, the corpus of Spoken English Dialects and the Brown family of corpora. In addition, a new corpus of written British English, containing texts from around 2006, was compiled for the purposes of writing the book. Techniques of analysis like concordancing, keywords and collocations are discussed, along with corpus annotation and statistical procedures such as chi-squared tests and clustering. The book takes a critical approach to using corpora in sociolinguistics, attempting to outline the limitations of the approach as well as its advantages.
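As a rough illustration of the kind of statistical procedure mentioned in the description, the sketch below computes a Pearson chi-squared statistic for the frequency of one word in two corpora. It is not taken from the book: the function name `chi_squared_2x2` and the counts in the usage example are invented for illustration, and the calculation assumes a simple 2x2 table (word versus all other tokens, corpus A versus corpus B) with no continuity correction.

```python
def chi_squared_2x2(freq_a, size_a, freq_b, size_b):
    """Pearson chi-squared statistic for a 2x2 table (no continuity correction).

    freq_a, freq_b: occurrences of the word in corpus A and corpus B.
    size_a, size_b: total number of tokens in corpus A and corpus B.
    """
    word_total = freq_a + freq_b
    total = size_a + size_b
    if word_total == 0 or word_total == total:
        raise ValueError("the word must occur in some, but not all, tokens")

    # Observed counts: rows are corpora, columns are (word, all other tokens).
    observed = [
        [freq_a, size_a - freq_a],
        [freq_b, size_b - freq_b],
    ]
    row_totals = [size_a, size_b]
    col_totals = [word_total, total - word_total]

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the word were equally frequent in both corpora.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


if __name__ == "__main__":
    # Hypothetical counts, invented purely for illustration.
    stat = chi_squared_2x2(freq_a=300, size_a=1_000_000,
                           freq_b=100, size_b=1_000_000)
    # With one degree of freedom, a value above 3.84 corresponds to p < 0.05.
    print(f"chi-squared = {stat:.2f}")
```

A statistic of this kind says only that a frequency difference is unlikely to be due to chance; it does not by itself explain the difference, which is one reason a critical approach to corpus findings matters.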

Chapter outline

Chapter 1 Introduction
Chapter 2 Sociolinguistic variation
Chapter 3 Diachronic variation
Chapter 4 Synchronic variation
Chapter 5 Corpora and interpersonal communication
Chapter 6 Uncovering discourses
Chapter 7 Conclusion

Excerpt - Chapter 1

Over the past twenty or so years, an approach to the study of language referred to as corpus linguistics has largely become accepted as an important and useful mode of linguistic enquiry. While corpora (or large collections of computerised texts, usually carefully sampled in order to be representative of a particular language variety) were first mainly used as aids to lexicography and pedagogy, they have more recently been deployed for a wider range of purposes. To illustrate, a sample of recent publications in linguistics includes Words and Phrases: Corpus Studies of Lexical Semantics (Stubbs 2001), Corpora in Applied Linguistics (Hunston 2002), Corpus Stylistics (Semino and Short 2004), Introducing Corpora in Translation Studies (Olohan 2004), Using Corpora in Discourse Analysis (Baker 2006), Corpora in Cognitive Linguistics (Gries 2006), Corpus-based Approaches to Metaphor and Metonymy (Stefanowitsch and Gries 2006) and Corpus Linguistics Beyond the Word: Corpus Research from Phrase to Discourse (Fitzpatrick 2007). What readers might note from this list is the absence of a book to date which details a corpus-based approach to sociolinguistics. Such a pairing has not been completely ignored. In their early overview of the field, McEnery and Wilson (1996) have a short section on corpora and sociolinguistics, which mainly discusses what is possible, rather than what has been done (at that point there was little to report), while Hunston (2002: 159-161) discusses how corpora can be used in order to describe sociolinguistic, diachronic and register variation. Additionally, Beeching (2006) has a short chapter on the 'how' and 'why' of sociolinguistic corpora in an edited collection by Archer et al. These sections of books point to the fact that some form of 'corpus sociolinguistics' is possible, although it might appear that corpus linguistics has only made a relatively small impact on sociolinguistics.

The main question that this book seeks to answer is: how can corpus linguistics methods be used gainfully in order to aid sociolinguistic research? This book is therefore written for sociolinguists who would like to know more about corpus techniques, and for corpus linguists who want to investigate sociolinguistic problems. Occurring somewhere between these two imaginary researchers are readers who may have little experience of either corpora or sociolinguistics, or readers who may know quite a bit about both. The challenge when writing a book that combines two fields is to try to keep a potentially diverse audience interested without making too many assumptions about what readers already know. Some readers may therefore want to focus more on some chapters than others. In the following sections I first provide some background about sociolinguistics before moving on to corpus linguistics and the relationship between the two.

Sociolinguistics: variation and change

As Bloome and Green (2002: 396) point out, sociolinguists have tended to avoid giving explicit definitions of the term sociolinguistics, an observation that at first glance might seem curious. However, Labov (1972a: 183) provides a sensible explanation, noting that the term is 'oddly redundant' because language and linguistics are always social. Still, not all linguists place emphasis on the social aspects of language, so perhaps the term could be said to refer to a set of inter-related fields which do emphasise the study of language in social contexts. Wardhaugh (2005: 1) uses the phrase 'the relationship between language and society… the various functions of language in society' while Bloome and Green (ibid.) stress the dialectical nature of sociolinguistics by noting that 'A sociolinguistic perspective requires exploring how language is used to establish a social context while simultaneously exploring how the social context influences language use and the communication of meaning.' Sociolinguists are therefore often interested in identifying how the identity of a person or social group relates to the way that they use language. They attempt to answer questions such as: what linguistic differences (and similarities) are there between (and within) certain types or groups of people, and in what ways do social variables such as age, sex, social class, geographic region, level of education etc. (either alone or in combination with other variables) impact on language use?

Sociolinguists may ask how and why certain varieties or forms of language are taken up (consciously or not) while others are discarded, by either carrying out a 'micro' study of a small group or community, looking at social networks and focussing on the role of 'language innovators' or by examining a much larger population, relating aspects of language uptake (or decline) to various social contexts. In order to reliably differentiate between the language use of a range of social groups, sociolinguists may try to elicit speech from a representative set of subjects or informants. Some sociolinguists attempt to collect such data by asking informants to read from a word list, or by carrying out interviews with them in the hope of obtaining less self-conscious uses of language. However, others have tried to acquire data in more naturalistic settings. Such studies may be referred to as 'traditional sociolinguistics' in that they have a long history, stretching back to early variationist studies by the pioneering American sociolinguist William Labov and others. Other sociolinguists carry out research on the use of spoken language in particular contexts, e.g. doctor-patient interaction, private conversations between partners, political speeches, radio phone-ins etc., in order to examine how phenomena such as conflict and co-operation are negotiated and how meaning is created. An approach known as interactional sociolinguistics, which combines anthropology, ethnography, linguistics, pragmatics and conversation analysis, is used to examine how speakers create and interpret meaning in social interaction. Such an approach focuses on a close discourse analysis of recorded conversations.

Other sociolinguists examine spoken, written or computer-mediated texts in contexts such as advertising and the media, politics, the workplace or private settings in order to carry out Discourse Analysis (or Critical Discourse Analysis), which focuses on identifying the ways that language is used to construct a particular representation of the world in relation to ideologies, attitudes or power relations. A range of linguistic features (lexical choice, representation of agency, implicature etc.) might be examined. Some researchers in this field utilise argumentation theory, examining how various topoi (strategies used to construct an argument) or fallacies (flawed components of an argument) are used in order to argue a position. Some analysts of discourse take into account intertextuality - the ways that authors of texts make reference to other texts - as well as considering how the conditions under which the text was produced and received impact on the text's meaning and significance. These findings can then be related to the wider social, historical, cultural and political contexts within which the text occurs in order to provide an explanation for the findings made.

A related area of sociolinguistics involves an examination of attitudes towards language or debates on language itself (meta-language) - why are some forms of language viewed as 'better' or 'worse' than others and what impacts do such views have on different types of people and language use itself? Why does it matter if some languages, or forms of language, 'die' and why is there such a divided range of opinion about phenomena like 'political correctness', text message language or formal teaching of grammar in schools? Related to this field is sociolinguistic research that is concerned with multilingualism. At the micro level this could involve research which looks at how participants who use multiple languages interact with each other, for example, by considering phenomena like code switching. At the macro level it could include work which considers the impact of globalisation on different languages as well as applied research connected to language policy and planning.

Some sociolinguists combine linguistic analysis with a wider analysis of social and literacy practices, for example, carrying out an ethnographic study of a particular linguistic 'community of practice' or conducting interviews or focus groups to find out discourses or attitudes about language. Thus, sociolinguistics is an increasingly expanding field comprising a wide range of theoretical perspectives and analytical techniques. In this book Chapters 2-4 focus on quantitative approaches to sociolinguistics that add to our understanding of language variation and change, while Chapters 5 and 6 consider how corpus linguistics can benefit research that takes interactional sociolinguistics and (critical) discourse analysis perspectives respectively.

