Dr Andrew Hardie

Lecturer in Corpus Linguistics
Degree: BA (Lancaster), PhD (Lancaster)
Associated research centres and groups: University Centre for Computer Corpus Research on Language (UCREL)
Current Teaching
This year, my undergraduate teaching includes the following (NB links are accessible to current students only):
- Director of Studies for Linguistics and the joint English Language and Linguistics degree schemes
- Course convenor and lecturer on LING 101
- Course convenor and lecturer on LING 208
- Course convenor and lecturer on LING 206
- Lecturer on LING 130
- Lecturer on LING 151
- Seminar tutor and occasional lecturer on LING 202
In previous years I have also taught on LING 152 and LING 203.
My postgraduate teaching includes the MA module in Corpus Linguistics and the New Route PhD module in Child Language Acquisition. I also supervise a number of PhD students.
Research Interests
Since September 2005 I have held the post of Lecturer in Corpus Linguistics in the department. I am currently doing research in a range of areas relating to corpus annotation, multilingual corpus linguistics and corpus-driven grammar and textual studies.
My particular research interests currently include:
- quantitative approaches to grammar in English and beyond;
- historical text-mining, with particular regard to the journalism of the Early Modern English period;
- part-of-speech tagging and the theory of morphosyntactic categories;
- keyness and frequency phenomena in texts;
- the languages and writing systems of South Asia, in particular the grammar of Nepali;
- text and corpus encoding and processing (with particular reference to Unicode).
My research into South Asian languages is currently focussed on Nepali. In co-operation with scholars at a number of institutions including the Open University, the University of Gothenburg, and Tribhuvan University, Kathmandu, I am working on the EU-funded Nelralec project, contributing expertise on corpus construction, encoding and annotation to the creation of the Nepali National Corpus. My assorted other work on Nepali, including a project to build a corpus of spoken Nepali funded by the British Academy, is grouped together as the Nepali Grammar Project.
Previously I have worked on the EMILLE corpus, which consists of 93 million words of text resources for fifteen South Asian languages. I was also involved in constructing the Lancaster Newsbooks Corpus.
My work on corpus encoding and annotation has at times involved the creation of software tools. As part of my work on EMILLE, I created the Unicodify software. While working on part-of-speech tagging for South Asian languages including Urdu and Nepali, I developed the Unitag framework.
A list of my research publications is available on this website.
Potential Doctoral Proposals
I would be especially interested in supervising PhD candidates working in the following areas:
- corpora of languages other than English - and in alphabets other than Latin;
- development and applications of corpus annotation ("tagging" at various levels);
- corpus-based grammatical analysis, especially cross-linguistic and/or quantitative approaches to grammar;
- investigating language using statistical collocation;
- the exploitation of corpus methods and resources in the other fields of the humanities and social sciences (e.g. history);
- or, more generally, in any area coherent with my research interests.
Other professional activities
I am the Project Development Officer of UCREL, the corpus research centre which brings together researchers from the Linguistics and Computing departments.
I am on the editorial board of Corpora, and was formerly on the boards of Glottometrics and the Journal of Quantitative Linguistics.
Eprints Publications Repository and Bibliographic Database
Andrew Hardie has 7 selected publication records listed on this webpage. For all ePrints records go to http://eprints.lancs.ac.uk
Hardie, Andrew (2007) From legacy encodings to Unicode: the graphical and logical principles in the scripts of South Asia. Language Resources and Evaluation, 41 (1). pp. 1-25. ISSN 1574-020X
Hardie, Andrew (2007) Part-of-speech ratios in English corpora. International Journal of Corpus Linguistics, 12 (1). pp. 55-81. ISSN 1384-6655
Baker, Paul and Hardie, Andrew and McEnery, Tony (2006) A glossary of corpus linguistics. Edinburgh University Press, Edinburgh. ISBN 978 0 7486 2018 0
Hardie, A. (2005) Automated part-of-speech analysis of Urdu: conceptual and technical issues. In: Contemporary issues in Nepalese linguistics. Linguistic Society of Nepal, Kathmandu, pp. 49-72.
Baker, Paul and Hardie, Andrew and McEnery, Tony and Xiao, Richard Z. and Bontcheva, Kalina and Cunningham, Hamish and Gaizauskas, Robert and Hamza, Oana and Maynard, Diana and Tablan, Valentin and Ursu, Cristian and Jayaram, B. D. and Leisher, Mark (2004) Corpus linguistics and South Asian languages: corpus creation and tool development. Literary and Linguistic Computing, 19 (4). pp. 509-524. ISSN 0268-1145
Hardie, Andrew and McEnery, Tony (2003) The ‘were’ subjunctive in British rural dialects: marrying corpus and questionnaire data. Computers and the Humanities, 37 (2). pp. 205-228. ISSN 0010-4817
McEnery, A. M. and Baker, J. P. and Hardie, A. (2000) Assessing claims about language use with corpus data – swearing and abuse. In: Corpora galore. Language and computers: studies in practical linguistics (30). Rodopi, Amsterdam, pp. 45-55. ISBN 9042004193
Associated Keywords: Corpus linguistic methodology, Corpus linguistics, Corpus tools, Digital humanities, Early modern English, Early modern writing, English, English grammar, English language, European languages, Grammar, Grammatical theory and description, Historical and diachronic corpora, Historical GIS, Humanities computing, India, Language, Linguistics, Metaphor, Multilingual corpora, Quantitative linguistics, Semantics, Seventeenth century, South Asia, Statistics, Swearing, Syntax
