CRG Timetable - Term 2: 17th January-21st March 2011

All meetings for Term 2 are in Meeting Room 1, FASS Building, at 3 pm, unless otherwise stated.

wk 11. (17 January 2011) - Will Simm (School of Computing and Communications, Lancaster University) ViewKi: Exploratory Interactive Navigation of Short Comments

wk 12. (24 January 2011) - Nicholas Groom (University of Birmingham) Corpus Perspective on Turn-Taking in University Seminars

wk 13. (31 January 2011, 3-5 pm, Faraday Building A205) - Edward J. L. Bell (Lancaster University) R: More than just a letter of the alphabet (Workshop). Please note the change of time and venue.

wk 14. (7 February 2011) no meeting

wk 15. (14 February 2011) - Cathy Lonngren-Sampaio (University of Hertfordshire) The construction and analysis of a computerised corpus of child bilingual language

wk 16. (Thursday 24 February 2011) - Richard Xiao (Edge Hill University) Contrastive Corpus Linguistics: Cross-linguistic contrast of English and Chinese presentation. Please note the change of day.

wk 17. (28 February 2011) - Jana Tereick (Wissenschaftliche Mitarbeiterin, Universität Hamburg) TBA

wk 18. (7 March 2011) - Ghada Mohamed (Lancaster University) Text Classification of the BNC using Corpus and Statistical Methods.

wk 19. (14 March 2011) - Mazura Muhammad (Lancaster University) TBA

wk 20. (21 March 2011) - Scott Piao (Lancaster University) TBA

 

wk 11 Monday 17 January 2011

ViewKi: Exploratory Interactive Navigation of Short Comments

Will Simm
(School of Computing and Communications, Lancaster University)

The Voice Your View project aims to mobilise the tacit knowledge of a community to transform public spaces to be safer and more inclusive.

Voice Your View will collect real-time information that can then be structured, stored in an online repository, and exchanged with appropriate stakeholders: other users, local community groups, local authorities, etc. Voice Your View has operated a number of trials using emerging technologies to summarise live public commentary.

The ViewKi extends the Voice Your View concept by allowing users to explore the comments received by the system. Before, users were presented with a non-interactive summary of comments received; now users can interact with the data and see similar comments that have been left by others.


wk 12 Monday 24 January 2011

Corpus Perspective on Turn-Taking in University Seminars

Nicholas Groom
(University of Birmingham)

The fundamental aim of seminars and other forms of small-group interaction in higher education is to get learners to talk, and the underlying assumption shared by educational theorists and university teachers alike is that the more learners talk, the more they will learn, and thus the more successful the seminar will be. But what does ‘more talk’ mean? Is it to be measured in terms of the number of words spoken by learners, or by the number of turns that learners take, or by the average length of learners’ turns, or perhaps by some composite of these (and perhaps other) measures? In this talk I will present a ‘work in progress’ report on a study of the British Academic Spoken English Corpus (BASE), in which Oliver Mason and I are investigating learner and teacher contributions to seminars according to each of these three measures. BASE is particularly well suited to our research interests not only because it includes a seminars subcorpus, but also because it is divided into four different ‘knowledge domains’: humanities, social sciences, life sciences and physical sciences. This allows us to ask whether turn-taking patterns in university seminars are subject to any form of systematic disciplinary variation.

Our main finding so far is that different knowledge domains perform better according to different measures. Specifically, if we define talk in terms of total words spoken, we find that students talk the most in seminars in the humanities and social sciences; if on the other hand we quantify talk in terms of number of turns, then students in the physical sciences are found to talk the most; and if we measure talk in terms of average turn length, then students in life sciences disciplines come to the fore.

I will then offer some possible explanations for these trends by taking a closer qualitative look at some examples of seminar interactions in each of these four knowledge domains. Following this, I will argue against the idea that any one of these measures might be inherently better or more desirable than the others. I will suggest instead that each of these different versions of ‘talking more’ carries with it a different set of affordances, each of which is more or less well attuned to the particular epistemological and pedagogic goals of different academic disciplines. I will conclude by considering the implications of this argument for staff development and training programmes in higher education.
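
The three measures of 'talk' named in this abstract (total words spoken, number of turns, and average turn length) can be made concrete with a minimal sketch. The `Turn` record, the role labels and the toy exchange are assumptions made up for illustration; they are not the BASE corpus format or the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker_role: str  # hypothetical labels: "student" or "teacher"
    text: str


def talk_measures(turns, role="student"):
    """Return total words, number of turns and mean turn length for one role."""
    # Word counts use naive whitespace tokenisation, one count per turn.
    lengths = [len(t.text.split()) for t in turns if t.speaker_role == role]
    total_words = sum(lengths)
    n_turns = len(lengths)
    mean_length = total_words / n_turns if n_turns else 0.0
    return {
        "total_words": total_words,
        "turns": n_turns,
        "mean_turn_length": mean_length,
    }


# Invented four-turn exchange, not real seminar data.
seminar = [
    Turn("teacher", "What do you make of the second reading"),
    Turn("student", "I thought the argument was circular"),
    Turn("teacher", "Why"),
    Turn("student", "Because it assumes what it sets out to prove"),
]

student_measures = talk_measures(seminar, role="student")
teacher_measures = talk_measures(seminar, role="teacher")
```

Because the three figures are computed independently, a group can rank high on one and low on another, which is the situation the abstract reports across knowledge domains.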


wk 13 Monday 31 January 2011

R: More than just a letter of the alphabet

Edward J. L. Bell
(Lancaster University)

I will demonstrate how the statistical software R can be used in corpus linguistics. We will go over the basics of R initially and then proceed to explore topics of interest to linguists such as:
* frequency distributions
* graphs and plotting
* hypothesis testing
* modelling/classification (if we have time)

The presentation will take the form of a tutorial with practical exercises. I will provide data, but members of the audience are welcome to bring their own. The best format for data is a plain text file without annotation.

Most examples will be taken from Baayen's 'Analyzing Linguistic Data' and from Gries's books on R for linguists.
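
As a rough illustration of two of the topics listed in this abstract (frequency distributions and hypothesis testing), here is a minimal sketch. It is written in Python rather than R and is not taken from the workshop materials; the sample sentence, the counts and the function names are all invented for the example.

```python
import re
from collections import Counter


def frequency_distribution(text):
    """Count lower-cased word tokens in a plain text string."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


def per_thousand(count, total):
    """Normalise a raw count to a rate per 1,000 tokens."""
    return 1000 * count / total


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs a non-zero total")
    return n * (a * d - b * c) ** 2 / denominator


# Invented sample text: 11 tokens in total.
freq = frequency_distribution("The cat sat on the mat and the dog sat too")
total_tokens = sum(freq.values())
rate_the = per_thousand(freq["the"], total_tokens)

# Invented counts: rows are two corpora, columns are
# (occurrences of a word, all other tokens).
statistic = chi_squared_2x2(10, 20, 20, 10)
```

With one degree of freedom, a statistic above roughly 3.84 is conventionally read as significant at the 5% level; the invented table above gives about 6.67.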

 


wk 14 Monday 7 February 2011

no meeting


wk 15 Monday 14 February 2011

The construction and analysis of a computerised corpus of child bilingual language

Cathy Lonngren-Sampaio
(University of Hertfordshire)

This paper describes the construction and analysis of a computerised corpus of child bilingual language, following the transcription and analysis system of the CHILDES (Child Language Data Exchange System) project (MacWhinney, 1991). The corpus is composed of transcriptions of the spoken language of two Brazilian bilingual siblings (M and J), exposed to Portuguese and English from birth. The data comprise recordings of diverse family situations occurring over three years, which were transcribed using the conventions set out by CHAT (Codes for the Human Analysis of Transcripts) (MacWhinney, 2010a). Specific codes were designed and inserted in the corpus to permit the electronic investigation of both grammatical and sociolinguistic aspects of Code-Switching (CS) through the use of the CLAN (Computerized Language Analysis) tool (MacWhinney, 2010b). The effectiveness of the codes was tested through analyses of a small number of files, and the output, original CS data for the language pair Portuguese/English, was analysed both quantitatively and qualitatively (Lonngren, 2004). Methodological considerations relating to both the construction of the corpus and its analysis will be the focus of this presentation.

References:

LONNGREN, C. (2004). A Investigacao da Alternancia de Codigo em um Corpus Eletronico de Linguagem Bilingue Infantil. CROP (Revista da Área de Língua e Literatura Inglesa e Norte-Americana). Departamento de Letras Modernas, USP: São Paulo, Brazil.

MACWHINNEY, B. (1991). The CHILDES Project: Tools for Analyzing Talk. Hillsdale, NJ: Lawrence Erlbaum Associates.

MACWHINNEY, B. (2010a). The CHILDES Project: Tools for Analyzing Talk - Electronic Edition. Part 1: The CHAT Transcription Format. Carnegie Mellon University. Available online: http://childes.psy.cmu.edu/manuals/chat/pdf

MACWHINNEY, B. (2010b). The CHILDES Project: Tools for Analyzing Talk - Electronic Edition. Part 2: The CLAN Programs. Carnegie Mellon University. Available online: http://childes.psy.cmu.edu/manuals/clan/pdf


wk 16 Thursday 24 February 2011

Contrastive Corpus Linguistics: Cross-linguistic contrast of English and Chinese presentation

Richard Xiao
(Edge Hill University)

The corpus-based approach is inherently comparative in nature. In this presentation, I will introduce a new model of Contrastive Corpus Linguistics proposed in Xiao and McEnery's new book Corpus-Based Contrastive Studies of English and Chinese (Routledge, 2010), which provides a common research platform for areas including corpus linguistics, contrastive linguistics, translation studies and second language acquisition research. I will also present the major research findings, and discuss the challenges and promise of corpus-based contrastive studies of two distinctly different languages such as English and Chinese.


wk 17 Monday 28 February 2011

Jana Tereick
(Wissenschaftliche Mitarbeiterin, Universität Hamburg)

TBA


wk 18 Monday 7 March 2011

Text Classification of the BNC using Corpus and Statistical Methods

Ghada Mohamed
(Lancaster University)

TBA


wk 19 Monday 14 March 2011

Mazura Muhammad
(Lancaster University)

TBA


wk 20 Monday 21 March 2011

Scott Piao
(Lancaster University)

TBA

UCREL Corpus Research Seminar, Lancaster University
(website created and maintained by Neil Millar)