Easy introduction to LIDES transcription and encoding
LIDES transcription step by step
Here we give you a step-by-step outline of the basic process of transcription
in accordance with the LIDES recommendations. If you follow these steps,
your data file will conform to LIDES and will contain enough information
to enable you and other researchers to analyse switches between languages.
Details of each step can be found in the sections of the LIDES Coding Manual
referred to at that step. There are eight steps (data from Moyer, 1992):
1 Basic transcription (orthographic)
At this stage, you make a basic transcription. Usually you will use
the normal orthography of the language, as long as it uses the Roman alphabet.
Currently LIDES does not support other writing systems, so if any of the
languages concerned uses a writing system other than the Roman alphabet,
you will need to make a transcription using ASCII characters.
Y: Excuse me , could we have two coffees and some scones , please ?
N: Yvonne , para mí no vayas a pedir scones de esos que ahora me estoy tratando de controlar un poquito antes de Pascua .
Y: Si Christmas ya está round the corner , mujer . Yo ya no hago dieta hasta por lo menos enero , febrero y eso con suerte .
2 Obligatory File Format
For the LIDES/CLAN programs to work, you need to include some obligatory
headers. There are other headers which you are recommended to include as
they provide useful information, but only the three listed here are obligatory.
There are some special requirements for utterance delimiters (the punctuation marks at the end of a line). These should normally be preceded by a space.
You need to indicate the speaker of each utterance using an asterisk
followed by three capital letters.
@Begin
@Participants:	YVO housewife1, NAT housewife2
*YVO:	Excuse me , could we have two coffees and some scones , please ?
*NAT:	Yvonne , para mí no vayas a pedir scones de esos que ahora me estoy tratando de controlar un poquito antes de Pascua .
*YVO:	Si Christmas ya está round the corner , mujer .
*YVO:	Yo ya no hago dieta hasta por lo menos enero , febrero y eso con suerte .
@End
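The format requirements of this step can be sketched as a small pre-check in Python. This is only an illustration of the three rules described above (obligatory headers, speaker codes, spaced delimiters). It is not the CHECK program and does not reproduce its behaviour. The names find_format_problems and main_tier_utterances and both regular expressions are assumptions made for this sketch.

```python
import re

# Assumed patterns, derived from the description in this step (not from CHECK).
SPEAKER_CODE = re.compile(r"^\*[A-Z]{3}:")    # asterisk + three capitals + colon
UNSPACED_DELIMITER = re.compile(r"\w[.?!]$")  # delimiter glued to the last word


def main_tier_utterances(lines):
    """Yield each speaker utterance with its continuation lines joined."""
    current = None
    for line in lines:
        if line.startswith(("@", "*", "%")):
            if current is not None:
                yield current
            current = line.strip() if line.startswith("*") else None
        elif current is not None and line.strip():
            current += " " + line.strip()
    if current is not None:
        yield current


def find_format_problems(text):
    """Return a list of human-readable problems found in a transcript."""
    lines = text.splitlines()
    content = [line.strip() for line in lines if line.strip()]
    problems = []
    if not content or content[0] != "@Begin":
        problems.append("first line is not @Begin")
    if not content or content[-1] != "@End":
        problems.append("last line is not @End")
    if not any(line.startswith("@Participants:") for line in content):
        problems.append("missing @Participants: header")
    for number, line in enumerate(lines, start=1):
        if line.startswith("*") and not SPEAKER_CODE.match(line):
            problems.append(f"line {number}: speaker code is not * plus three capitals")
    for utterance in main_tier_utterances(lines):
        if UNSPACED_DELIMITER.search(utterance):
            problems.append(f"delimiter not preceded by a space: {utterance}")
    return problems


sample = (
    "@Begin\n"
    "@Participants:\tYVO housewife1, NAT housewife2\n"
    "*YVO:\tExcuse me , could we have two coffees and\n"
    "\tsome scones , please ?\n"
    "*YVO:\tSi Christmas ya está round the corner , mujer .\n"
    "@End\n"
)
print(find_format_problems(sample))  # prints []
```

A pre-check like this can catch obvious slips before you run CHECK, but CHECK remains the reference for whether a file is well formed.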
3 Run the CHECK program
The CLAN software tools include a program called CHECK, which verifies the
format of a transcript. Run CHECK once you have completed step 2 to confirm
that the basic format is correct.
Where can I get the CLAN tools and CHECK program?
LIDES/CHAT provides conventions for marking a large number of phenomena
which you may want to indicate in your transcription. You are not obliged
to use any of these.
Where do I find the LIDES transcription conventions?
@Begin
@Participants:	YVO housewife1, NAT housewife2
@Languages:	English (1), Spanish (2)
*YVO:	Excuse@1 me@1 could@1 we@1 have@1 two@1 coffees@1 and@1 some@1 scones@1 please@1 ?
*NAT:	Yvonne@1 para@2 mí@2 no@2 vayas@2 a@2 pedir@2 scones@1 de@2 esos@2 que@2 ahora@2 me@2 estoy@2 tratando@2 de@2 controlar@2 un@2 poquito@2 antes@2 de@2 Pascua@2 .
*YVO:	Si@2 Christmas@1 ya@2 está@2 round@1 the@1 corner@1 mujer@2 .
*YVO:	Yo@2 ya@2 no@2 hago@2 dieta@2 hasta@2 por@2 lo@2 menos@2 enero@2 febrero@2 y@2 eso@2 con@2 suerte@2 .
@End
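Once every word carries a language tag, simple counts and switch points can be extracted mechanically. The following Python sketch illustrates the idea on one utterance from the example. It is not part of the CLAN tools, and the function names and the regular expression are assumptions made for this illustration.

```python
import re
from collections import Counter

# A word followed by "@" and a numeric language tag, e.g. "scones@1".
TAGGED_WORD = re.compile(r"(\S+)@(\d+)")


def language_counts(utterance):
    """Count the words carrying each language tag in one utterance."""
    return Counter(tag for _word, tag in TAGGED_WORD.findall(utterance))


def switch_points(utterance):
    """Return (previous_word, next_word) pairs where the language tag changes."""
    tagged = TAGGED_WORD.findall(utterance)
    return [
        (prev_word, word)
        for (prev_word, prev_tag), (word, tag) in zip(tagged, tagged[1:])
        if prev_tag != tag
    ]


utterance = "*YVO: Si@2 Christmas@1 ya@2 está@2 round@1 the@1 corner@1 mujer@2 ."
print(language_counts(utterance))
print(switch_points(utterance))
```

For this utterance the sketch finds four words tagged 1 (English), four tagged 2 (Spanish), and four switch points: Si/Christmas, Christmas/ya, está/round and corner/mujer.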
5 Insert dependent tiers
LIDES recommends very strongly that you add glosses (%glo) and a translation
into a widely known language (%tra) to make your transcription as useful as
possible to other researchers. You can also add any number of dependent tiers
for research purposes of your own, for example to annotate actions, semantics,
pragmatics or syntax. You choose your own name for the tier, but it must
consist of a per cent sign followed by three letters.
What kinds of dependent tiers could I have?
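The naming rule for dependent tiers (a per cent sign followed by three letters) is easy to test automatically. The Python sketch below is an illustration only: the function name invalid_tier_lines is hypothetical, the restriction to lowercase letters is an assumption based on the %glo and %tra examples, and the English translation in the sample is supplied here purely for demonstration.

```python
import re

# "%" + exactly three letters + colon. Lowercase is an assumption based on
# the %glo and %tra examples; adjust the pattern if your tiers differ.
DEPENDENT_TIER = re.compile(r"^%[a-z]{3}:")


def invalid_tier_lines(text):
    """Return every dependent-tier line whose name breaks the naming rule."""
    return [
        line
        for line in text.splitlines()
        if line.startswith("%") and not DEPENDENT_TIER.match(line)
    ]


sample = (
    "*YVO:\tYo@2 ya@2 no@2 hago@2 dieta@2 .\n"
    "%tra:\tI am not dieting any more .\n"   # illustrative translation
    "%gloss:\ttier name is too long\n"       # deliberately invalid
)
print(invalid_tier_lines(sample))
```

Here only the %gloss line is reported, because its name has five letters instead of three.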
7 Make changes in the depaddfile
You may eventually need to add symbols to the LIDES depaddfile for your own
set of data files. In that case, edit the depaddfile with an ASCII editor
and then run the CHECK program once again.
8 Create a readme document
LIDES recommends very strongly that you create a readme document to
accompany the transcription file. In this document you include information
about the data and the circumstances in which it was recorded. This will
enable other users to know the background to the data.
Now you can try out LIDES by downloading the CLAN tools and using some
sample files from the LIDES corpora.

Symbol    Description
@         special form markers
xxx       unintelligible speech, not treated as a word
xx        unintelligible speech, treated as a word
yyy       unintelligible speech transcribed on %pho line, not treated as a word
yy        unintelligible speech transcribed on %pho line, treated as a word
www       untranscribed material
0         actions without speech
&         phonological element
[?]       best guess (see also chapter 8)
()        noncompletion of a word
0word     omitted word
0*word    ungrammatical omission
00word    (grammatical) ellipsis

Morpheme Symbols - Chapter 5
Symbol    Description
-         suffix marker
#         prefix marker
+         compound or rote form marker
~         clitic marker
&         fusion marker
-0        omitted affix
-0*       incorrectly omitted affix

Utterance and Tone Unit Terminators - Chapter 6
Symbol    Description
.         period
?         question
!         exclamation
-?        rising final contour
-'.       rise-fall contour
-,.       fall-rise contour
-,        level final contour
-_        low level final contour
,         syntactic juncture
,,        tag question
#         pause between words
-:        previous word lengthened

Terminators and Linkers - Chapter 6
Symbol    Description
+...      trailing off
+/.       interruption
+//.      self-interruption
+/?       interruption of question
+"/.      quotation follows on next line
+".       quotation precedes
+"        quoted utterance follows
+^        quick uptake
+,        self-completion
++        other-completion

Prosody within Words - Chapter 7
Symbol    Description
/         stress
//        accented nucleus
///       contrastive stress
:         lengthened syllable
::        pause between syllables

Scoped Symbols - Chapter 8
Symbol        Description
[=! text]     paralinguistics, prosodics
[!]           stressing
[!!]          contrastive stressing
["]           quotation marks
[= text]      explanation
[: text]      replacement
[0 text]      omission
[:=x text]    translation
[=? text]     alternative transcription
[%xxx text]   dependent tier on main line
[% text]      comment on main line
[$text]       code on main tier
[?]           best guess (see also chapter 4)
[>]           overlap follows
[<]           overlap precedes
[<>]          overlap follows and precedes
[>number]     overlap follows and overlaps are numbered
[<number]     overlap precedes and overlaps are numbered
[/]           retracing without correction
[//]          retracing with correction
[/-]          false start without retracing
[/?]          unclear retrace type
[*]           error marking
[+ text]      postcode

B. HEADERS (required by the CLAN programs)

Obligatory Headers - Chapter 3
Symbol            Description
@Begin            marks the beginning of a file
@End              marks the end of a file
@Participants:    lists actors in a file

Constant Headers - Chapter 3
Symbol              Description
@Age of XXX:        marks a speaker's age
@Birth of XXX:      shows date of birth of speaker
@Coder:             people doing transcription and coding
@Educ of XXX:       indicates educational level of speaker
@Filename:          shows name of file
@ID:                code for STATFREQ analyses
@Language:          the principal language of the transcript
@Language of XXX:   language(s) spoken by a given participant
@SES of XXX:        indicates socioeconomic status of speaker
@Sex of XXX:        indicates gender of speaker
@Warning:           marks defects in file

Changeable Headers - Chapter 3
Symbol            Description
@Activities:      component activities in the situation
@Bg and @Bg:      begin gem
@Bck:             background information
@Comment:         comments
@Date:            date of the interaction
@Eg and @Eg:      end gem
@G:               simple gems
@Location:        geographical location of the interaction
@New Episode:     point at which a new episode begins and an old ends
@Room Layout:     configuration of furniture in room
@Situation:       general atmosphere of the interaction
@Tape Location:   footage markers from tape
@Time Duration:   beginning and end times
@Time Start:      beginning time

C. OTHER

Symbol    Description
%act:     actions
%add:     addressee
%alt:     alternative transcription
%cod:     general purpose coding
%com:     comments by investigator
%def:     codes from SALT
%eng:     English translation
%err:     error coding
%exp:     explanation
%fac:     facial actions
%flo:     flowing conversation
%gls:     target language gloss for unclear utterance
%gpx:     gestural and proxemic activity
%int:     intonation
%mod:     model or target phonology
%mor:     morphemic semantics
%par:     paralinguistics
%pho:     phonetic transcription
%sit:     situation
%spa:     speech act coding
%syn:     syntactic structure notation
%tim:     time stamp coding

Symbols for Coding on Dependent Tiers - Chapter 9
Symbol    Description
/.../     delimiters for phonetic notation
$         indicates codes
<$=N>     occurs for N following utterances
<bef>     occurrence before an utterance
<aft>     occurrence after an utterance

Error Coding - Chapter 12
Symbol    Description
$=        source of an error in the %err line
=         placed between error and target
;         separates errors on %err line

Coding on the %mor tier - Chapter 14
Symbol      Description
|           follows part-of-speech on %mor line
&           nonconcatenated morpheme in %mor line
+ (Plus)    compound delimiter on %mor line
- (Dash)    suffix delimiter on %mor line
:           feature fusion on %mor line
~ (Tilde)   clitic delimiter on %mor line
0           precedes omitted element
0*          precedes incorrectly omitted element
00          precedes (grammatically) ellipsed element