Linguistic Community
The data was collected from a community of Austrian
Jewish refugees from Nazi-occupied Austria (approx. 30,000 Austrians fled to the
UK) who settled in Northwest London in the late 1930s. We are therefore dealing
with a community in which German and English have been in close contact for
over sixty years. The L1 of the informants is close to Standard German,
although occasionally interspersed with Yiddish lexical items and phonetically
influenced by the Viennese variety. A peculiarity of the linguistic profile of
this community is that they do not speak Yiddish. The age of onset of L2
(English) was during the late teens and
early twenties for most speakers. At the time the audio-recordings were made
(1993) all informants were in their late sixties or early seventies. Patterns
of language use in this bilingual community have changed over the last
half-century: up to the 1970s, mainly English was used in both public and private
domains. Once the second generation had left the parents’ household, and
especially after retirement, both languages started being used in the private
domain. A close-knit network between a subset of the community facilitated the
development of a bilingual mode of interaction, sometimes called 'Emigranto'.
This mode of interaction is only used in in-group situations, is regarded as
the 'we-code' (Gumperz 1982) and has covert prestige. Linguistically it is
characterised by intra-sentential code-switching and frequent switching at
speaker turn boundaries. Further biographical (age, gender, schooling, social
class of informants etc.) and situational information, where available, is
provided under the relevant headers in the .cha files.
The goal of the project was to
a) provide a linguistic profile of the Jewish refugee community in London and
b) study patterns of code-mixing.
Researcher
Eva Eppler, School of English and Modern Languages, University of Surrey
Roehampton, Roehampton Lane, London SW15 5PH, UK, e-mail: e.eppler@roehamton.ac.uk
Second Transcriber
Maggie Brueckner, Language Centre, University of Rostock, Germany
Acknowledgements
The collection and transcription of the data were
funded by various research grants from the University of Vienna and the
University of Surrey Roehampton. The research based on these data was funded by
the Austrian Ministry of Science.
Many thanks to the media team at Roehampton for technical support,
to LIPPS and to Brian.
When using this corpus, please reference:
Eppler, Eva. 1999. ‘Word order in German-English mixed
discourse’, UCL Working Papers in Linguistics 11, 285-309.
Sampling and Data Collection
A random sample of 70 members of the target community
was selected from a list of clients of an Austrian solicitor specialising in
pension claims for refugees. 27 of them were audio-recorded for approx. 90
minutes in one-to-one or one-to-two sociolinguistic interviews/oral history
collections. To this body of subjects other informants were added by referral
(snowball sampling). All audio-recordings were made by the researcher in the
informants’ homes. Informants were encouraged to choose as the language of
interaction the one they normally use at home. An additional 400 minutes
of group recordings with three informants and the researcher were collected
through participant observation during informal gatherings. Another 540
minutes of audio-data collected in the Day-Centre of a Refugee Organisation are
almost impossible to transcribe due to the low quality of the recordings and
the amount of overlap.
Data Transcription
Full transcripts were made of sound files using the
CHAT/LIDES transcription systems. LIDES (Language Interaction Data Exchange
System) is based on CHAT but was extended to deal with code-mixed data. For
this purpose language tags (@2 English and @4 German) are added to each
word/morpheme to indicate its language. In cases where it was impossible to
determine the language in which words were being produced, @u was attached,
e.g. in@u preceding English or German place-names. Morphologically mixed words
only display the full language tag on the suffix, as CHECK does not pass
sequences such as ge@4#bother@2-t@. The comma was used to indicate syntactic
juncture, as one of the research aims is the study of co- and subordination.
The CHAT symbol for tag questions was also used to delimit discourse markers
(Schiffrin 1987). Due to the nature of some of the data (group recordings),
overlaps are only indicated when the beginning and end points of the overlap
were clearly recognisable.
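The language tags described above lend themselves to simple automatic counts. The following is a minimal Python sketch, not part of the corpus tooling, that counts tagged words per language in one utterance and flags utterances containing both English and German material. The function names, the label strings and the example utterance are invented for illustration. The sketch assumes each tag closes its token and is followed by whitespace, basic punctuation or the end of the line; it does not handle morphologically mixed words specially.

```python
import re
from collections import Counter

# Tag values as given in the transcription notes: @2 English, @4 German,
# @u language undetermined. The label strings are illustrative only.
TAG_LABELS = {"2": "English", "4": "German", "u": "undetermined"}

# Assumption: a tagged token is a run of non-space characters ending in
# @2, @4 or @u, followed by whitespace, basic punctuation or end of string.
TAGGED_TOKEN = re.compile(r"(\S+?)@([24u])(?=[\s,.?!]|$)")


def tag_counts(utterance):
    """Count tagged tokens per language label in one utterance string."""
    return Counter(TAG_LABELS[tag] for _, tag in TAGGED_TOKEN.findall(utterance))


def has_intrasentential_switch(utterance):
    """Return True if the utterance contains both English and German tokens."""
    counts = tag_counts(utterance)
    return counts["English"] > 0 and counts["German"] > 0


if __name__ == "__main__":
    # Invented example utterance, not taken from the corpus.
    example = "ich@4 war@4 in@u London@2 very@2 happy@2 ."
    print(tag_counts(example))
    print(has_intrasentential_switch(example))
```

Tokens tagged @u are counted separately and do not by themselves count as a switch, which matches their use for items whose language could not be determined.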
All transcripts produced by the transcriber were
checked by the researcher.
Codes
The project-specific codes are not included in the
files on the web.
Table of Contents
At present the following .cha files are available on
the LIPPS web-site.
Their accompanying .wav files are available on the
CHILDES site: http://talkbank.org/data/LIDES/
IBron.cha: 46 minutes of the first meeting between the researcher and the
central informant DOR; 36 minutes with DOR, her daughter (2nd generation) and
her grandson (3rd generation).
Ibrona.mp3 corresponds to side A of the original tape
recording, Ibronb.mp3 to side B.
Jen1.cha, Jen2.cha, Jen3.cha: group recordings of DOR and three of
her friends of the same generation (TRU, MEL and LIL) and the researcher.
IAriel.cha: a one-to-two sociolinguistic
interview/oral history with a male and a female informant.
IHog.cha: a one-to-two sociolinguistic
interview/oral history with a married couple.