Research Project

Towards an Online Conceptual Database of the Latin Vulgate Bible

Project Director:  Dr. Andrew Wilson

Consultant:  Dr. Deborah Sawyer

Researchers:  Dr. Paul Rayson; Dr. Nicholas Smith


Since the mid-1990s, I have been working on the concept-based annotation of the Latin Vulgate Bible, with the aim of enhancing access to it at the level of word meanings rather than just word forms. This project will develop a sophisticated online search-and-retrieval interface for analysed Vulgate material, together with a tool for enhancing the process of text annotation. The interface will form an important strategic basis for dissemination amongst end-users.


The Latin Vulgate Bible has played a unique role in the development of Western culture and Christianity. For many centuries, it has been the official Bible translation of the Roman Catholic Church. In this role, it lies at the heart of much patristic, medieval, and later theological writing. As the main Bible translation circulating at the time of the Reformation, it also had a strong influence (both direct and indirect) in Protestant circles and on Protestant Bible translations. Furthermore, secular medieval literature draws heavily on the Vulgate, making it a key text in the development of culture in general as well as of religion.

However, despite the huge increase in the number of digital resources available for the humanities, many textual resources - including all of those related to the Vulgate Bible - continue to encode, at most, the headword forms of word occurrences. Scholars can therefore look up individual words in their contexts, but they cannot search for occurrences of concepts, even though concepts are what many scholars mainly need to find. This is a serious gap in resource provision, which hinders innovative work in areas such as intertextuality in Latin religious and secular literature (compare, for example, Bucher-Gillmayr's 1996 work on Biblical echoes in modern poetry).

The longer-term aim is to fill this gap by undertaking an exhaustive conceptual analysis of the Vulgate texts, so that any scholar looking for concepts will be able to go to them immediately and accurately. This is to be achieved by developing an online conceptual database for the Vulgate. This means that, alongside each occurrence of a word (and its corresponding headword) in the electronic text, we will also place one or more codes indicating the conceptual field into which that word occurrence falls - for example, body and body parts, joy/sadness, permission/prohibition, law, etc. So far, the gospels according to Mark and John, together with the Petrine epistles, have been analysed in depth, and all four gospels have received a broader form of conceptual annotation.
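The layered annotation described above (word form, headword, and one or more conceptual-field codes per word occurrence) can be pictured with a small data sketch. This is an illustration only: the `Token` class, its field names, and the concept labels such as `TIME` or `SPEECH` are hypothetical and do not reproduce the project's actual coding scheme or file format. The Latin words are the opening of John 1:1.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One word occurrence with its layers of annotation (hypothetical layout)."""
    ref: str                   # verse reference
    form: str                  # word form as it appears in the text
    headword: str              # dictionary headword (lemma)
    concepts: tuple[str, ...]  # one or more conceptual-field codes


# Opening words of John 1:1 in the Vulgate.
# The concept codes below are invented for illustration only.
SAMPLE = [
    Token("Jn 1:1", "in", "in", ("GRAMMATICAL",)),
    Token("Jn 1:1", "principio", "principium", ("TIME", "ORDER")),
    Token("Jn 1:1", "erat", "sum", ("EXISTENCE",)),
    Token("Jn 1:1", "Verbum", "verbum", ("SPEECH", "DEITY")),
]

# Print one line per word occurrence: reference, form, headword, concept codes.
for tok in SAMPLE:
    print(f"{tok.ref}\t{tok.form}\t{tok.headword}\t{'/'.join(tok.concepts)}")
```

Storing the concept codes as a tuple reflects the statement that a single word occurrence may carry more than one code.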

The project will enhance access to the Vulgate text by undertaking the necessary linguistic pre-analysis, without which content-oriented research on later Latin texts is much more complex and time-consuming. The user of the database will be able to call up immediately all the words and textual contexts in which a given concept is discussed, and make comparisons within and across texts.

Previous funding and resulting outputs

This work represents part of a long-term commitment to the linguistic analysis of Christian Latin. It originated in my PhD thesis (Wilson 1996) and has subsequently attracted external funding in the form of a British Academy Small Research Grant. The work has so far been reported in a key journal article (Wilson 2003), at a major computational linguistics conference (Wilson & Worth 2003), and in an invited talk at the Humboldt University in Berlin (Wilson 2004). The analysed material is also being disseminated in book form by the major publisher of reference works on classical languages, Olms-Weidmann (Wilson 2000, Wilson 2006, Wilson & Worth fc.). In terms of more general esteem indicators, this work on the boundaries of corpus linguistics, theology, and humanities computing has resulted in two invitations to act as external examiner for PhD theses on computer-supported biblical linguistics (2001, 2005).

Future plans

This phase of work is linked to the further exploitation and dissemination of the material among end-users. Currently, the material is available only as printed reference books and stand-alone databases. To make it more widely known and accessible to end-users, it is important to put the material online, together with a sophisticated user interface. This phase has two objectives:


  1. To complete the checking of the conceptually annotated Vulgate gospels, while developing a basic reusable conceptual tagger for the analysis of further Vulgate (and other Latin) texts. The tagger will also undertake some in-context disambiguation of high-frequency items.
  2. To develop the web-based search-and-retrieval interface to the annotated data, which will include:
    • retrieval at the level of concept, headword, or word form
    • key-item-in-context concordances (including Boolean searches)
    • collocational statistics (significant co-occurrences)
    • distributions within and across texts
    • statistical comparisons between texts
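To make the planned retrieval functions more concrete, the following is a minimal sketch of lookup at the three levels (concept, headword, word form), a key-item-in-context display, and raw co-occurrence counts. It is not the project's interface: the function names `find`, `kwic`, and `cooccurrences`, the tuple layout, and the concept labels are all assumptions made for illustration, and the co-occurrence function only counts neighbours without testing them for statistical significance. The sample text is John 1:1.

```python
from collections import Counter

# Each token: (word form, headword, set of concept codes).
# The concept codes are hypothetical labels, not the project's scheme.
TOKENS = [
    ("in", "in", {"GRAMMATICAL"}),
    ("principio", "principium", {"TIME"}),
    ("erat", "sum", {"EXISTENCE"}),
    ("Verbum", "verbum", {"SPEECH"}),
    ("et", "et", {"GRAMMATICAL"}),
    ("Verbum", "verbum", {"SPEECH"}),
    ("erat", "sum", {"EXISTENCE"}),
    ("apud", "apud", {"GRAMMATICAL"}),
    ("Deum", "deus", {"DEITY"}),
    ("et", "et", {"GRAMMATICAL"}),
    ("Deus", "deus", {"DEITY"}),
    ("erat", "sum", {"EXISTENCE"}),
    ("Verbum", "verbum", {"SPEECH"}),
]


def find(tokens, level, query):
    """Return the positions of tokens matching `query` at the given level."""
    if level not in ("form", "headword", "concept"):
        raise ValueError(f"unknown level: {level}")
    hits = []
    for i, (form, headword, concepts) in enumerate(tokens):
        if ((level == "form" and form == query)
                or (level == "headword" and headword == query)
                or (level == "concept" and query in concepts)):
            hits.append(i)
    return hits


def kwic(tokens, positions, width=2):
    """Build key-item-in-context lines with `width` words on each side."""
    lines = []
    for i in positions:
        left = " ".join(t[0] for t in tokens[max(0, i - width):i])
        right = " ".join(t[0] for t in tokens[i + 1:i + 1 + width])
        lines.append(f"{left} [{tokens[i][0]}] {right}".strip())
    return lines


def cooccurrences(tokens, positions, width=2):
    """Count headwords found within `width` words of each hit (raw counts only)."""
    counts = Counter()
    for i in positions:
        for j in range(max(0, i - width), min(len(tokens), i + width + 1)):
            if j != i:
                counts[tokens[j][1]] += 1
    return counts


deity_hits = find(TOKENS, "concept", "DEITY")
for line in kwic(TOKENS, deity_hits):
    print(line)
print(cooccurrences(TOKENS, deity_hits).most_common(3))
```

A search at the concept level returns both `Deum` and `Deus`, whereas a search for the word form `Deum` alone would return only one of them, which is the kind of gain in access the interface is meant to provide.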


References

Bucher-Gillmayr S 1996 A Computer-aided Quest for Allusions to Biblical Texts within Lyric Poetry. Literary and Linguistic Computing 11(1): 1-8.

Wilson A 1996 Conceptual Analysis of Later Latin Texts: A Conceptual Glossary and Index to the Latin Vulgate Translation of the Gospel of John. 2 Volumes. Ph.D. Dissertation, Lancaster University.

Wilson A 2000 Conceptual Glossary and Index to the Vulgate Translation of the Gospel according to John. Alpha-Omega, Reihe A, Bd. ccxi. Olms-Weidmann, Hildesheim.

Wilson A 2003 Developing Conceptual Glossaries for the Latin Vulgate Bible. Literary and Linguistic Computing 17(4): 413-426.

Wilson A 2004 Computer-Aided Concept Analysis of Latin Biblical Texts. Invited talk, Forschungskolloquium Korpuslinguistik, Humboldt University of Berlin, 26.10.2004.

Wilson A 2006 Conceptual Glossary and Index to the Vulgate Translation of the Petrine Epistles. Alpha-Omega, Reihe A, Bd. ccxlvii. Olms-Weidmann, Hildesheim. xxxi+339 pages.

Wilson A, Worth CA 2003 Developing Conceptual Glossaries for the Latin Vulgate Bible. Poster presentation, Corpus Linguistics 2003, Lancaster.

Wilson A, Worth CA fc. Conceptual Glossary and Index to the Vulgate Translation of the Gospel according to Mark. To appear in the series Alpha-Omega, Reihe A. Olms-Weidmann, Hildesheim.