The data were collected from a community of Austrian Jewish refugees from Nazi-occupied Austria (approx. 30,000 Austrians fled to the UK) who settled in Northwest London in the late 1930s. We are therefore dealing with a community in which German and English have been in close contact for over sixty years. The L1 of the informants is close to Standard German, although occasionally interspersed with Yiddish lexical items and phonetically influenced by the Viennese variety. A peculiarity of the linguistic profile of this community is that its members do not speak Yiddish. The age of onset of L2 (English) was the late teens or early twenties for most speakers. At the time the audio-recordings were made (1993) all informants were in their late sixties or early seventies.
Patterns of language use in this bilingual community have changed over the last half-century: up to the 1970s mainly English was used in both public and private domains. Once the second generation had left the parents’ household, and especially after retirement, both languages started being used in the private domain. A close-knit network among a subset of the community facilitated the development of a bilingual mode of interaction, sometimes called 'Emigranto'. This mode of interaction is only used in in-group situations, is regarded as the 'we-code' (Gumperz 1982) and has covert prestige. Linguistically it is characterised by intra-sentential code-switching and frequent switching at speaker turn boundaries.
Further biographical (age, gender, schooling, social class of informants etc.) and situational information, where available, is provided under the relevant headers in the .cha files.
The goal of the project was to
a) provide a linguistic profile of the Jewish refugee community in London and
b) study patterns of code-mixing.
Eva Eppler, School of English and Modern Languages, University of Surrey Roehampton, Roehampton Lane, London SW15 5PH, UK, e-mail: email@example.com
Maggie Brueckner, Language Centre, University of Rostock, Germany
The collection and transcription of the data were funded by various research grants from the University of Vienna and the University of Surrey Roehampton. The research based on these data was funded by the Austrian Ministry of Science.
Many thanks to the media team at Roehampton for their technical support, to LIPPS and to Brian.
When using this corpus, please reference:
Eppler, Eva. 1999. ‘Word order in German-English mixed discourse’, UCL Working Papers in Linguistics 11, 285-309.
Sampling and Data Collection
A random sample of 70 members of the target community was selected from a list of clients of an Austrian solicitor specialising in pension claims for refugees. Of these, 27 were audio-recorded for approx. 90 minutes each in one-to-one or one-to-two sociolinguistic interviews/oral history collections. Other informants were added to this body of subjects by referral (snowball sampling). All audio-recordings were made by the researcher in the informants’ homes. Informants were encouraged to choose as the language of interaction the one they normally use at home. An additional 400 minutes of group recordings with three informants and the researcher were collected through participant observation during informal gatherings. Another 540 minutes of audio-data collected in the Day-Centre of a Refugee Organisation are almost impossible to transcribe due to the low quality of the recordings and the amount of overlap.
Full transcripts of the sound files were made using the CHAT/LIDES transcription systems. LIDES (Language Interaction Data Exchange System) is based on CHAT but was extended to deal with code-mixed data. For this purpose language tags (@2 for English and @4 for German) are added to each word/morpheme to indicate its language. Where it was impossible to determine the language in which a word was produced, @u was attached, e.g. in@u preceding English or German place-names. Morphologically mixed words only display the full language tag on the suffix, as CHECK does not pass sequences such as ge@4#bother@2-t@. The comma was used to indicate syntactic juncture, as one of the research aims concerns co- and subordination. The CHAT symbol for tag questions was also used to delimit discourse markers (Schiffrin 1987). Due to the nature of some of the data (group recordings), overlaps are only indicated when the beginning and end points of the overlap were clearly recognisable.
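The word-level language tags described above lend themselves to simple automatic counts. The following Python lines are a minimal sketch, not part of the corpus tools: the function names and the sample utterance are invented for illustration, and the sketch assumes that every tagged token is separated by whitespace and ends in @2, @4 or @u. Morphologically mixed words and other CHAT symbols are not handled and are simply skipped.

```python
from collections import Counter

# Tag meanings as given in the documentation; @u marks an undetermined language.
LANGUAGE_TAGS = {"2": "English", "4": "German", "u": "undetermined"}

# Hypothetical utterance, invented for illustration (not taken from the corpus).
SAMPLE = "ich@4 war@4 in@u London@2 , you@2 know@2 ."


def tag_words(utterance):
    """Return (word, language) pairs for each whitespace-separated tagged token."""
    pairs = []
    for token in utterance.split():
        word, sep, tag = token.rpartition("@")
        # Skip punctuation, untagged tokens and anything with an unknown tag.
        if sep and word and tag in LANGUAGE_TAGS:
            pairs.append((word, LANGUAGE_TAGS[tag]))
    return pairs


def language_counts(utterance):
    """Count tagged words per language."""
    return Counter(language for _, language in tag_words(utterance))


def count_switches(utterance):
    """Count changes of language between consecutive words, ignoring @u words."""
    languages = [lang for _, lang in tag_words(utterance) if lang != "undetermined"]
    return sum(1 for prev, cur in zip(languages, languages[1:]) if prev != cur)
```

Calling language_counts(SAMPLE) gives per-language word totals for the utterance, and count_switches(SAMPLE) gives a rough measure of intra-utterance switching. Ignoring @u words when counting switches is a design choice of this sketch, not a convention of the corpus.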
All transcripts produced by the transcriber were checked by the researcher.
The project-specific codes are not included in the files on the web.
Table of Contents
At present the following .cha files are available on the LIPPS web-site.
Their accompanying .wav files are available on the CHILDES site.
IBron.cha: 46 minutes of the first meeting between the researcher and the central informant DOR; 36 minutes with DOR, her daughter (2nd generation) and her grandson (3rd generation).
Ibrona.mp3 corresponds to side A of the original tape recording, Ibronb.mp3 to side B.
IAriel.cha: a one-to-two sociolinguistic interview/oral history with a male and a female informant.
IHog.cha: a one-to-two sociolinguistic interview/oral history with a married couple.