The LIPPS/LIDES project


Easy introduction to LIDES transcription and encoding


LIDES transcription step by step

 
Here we give a step-by-step outline of the basic transcription process, following the LIDES recommendations. If you follow these steps, your data file will conform to LIDES and will contain enough information for you and other researchers to analyse switches between languages. Details of each step can be found in the sections of the LIDES Coding Manual referred to at that step. There are eight steps (data from Moyer, 1992):

 
1 Basic transcription (orthographic)

 
At this stage, you make a basic transcription. Usually you will use the normal orthography of the language, as long as it uses the Roman alphabet. Currently LIDES does not support other writing systems, so if any of the languages concerned uses a writing system other than the Roman alphabet, you will need to make a transcription using ASCII characters.

  
     Y:Excuse me , could we have two coffees and some scones, please ?
     N:Yvonne , para mí no vayas a pedir scones de esos que ahora me estoy tratando de controlar un
            poquito antes de Pascua .
     Y:Si Christmas ya está round the corner , mujer . Yo ya no hago dieta hasta por lo menos enero ,
            febrero y eso con suerte .
             
2 Obligatory File Format
 
For the LIDES/CLAN programs to work, you need to include some obligatory headers. Other headers are recommended because they provide useful information, but only the three shown here (@Begin, @Participants: and @End) are obligatory.

There are some special requirements for utterance delimiters (the punctuation marks at the end of a line). These should normally be preceded by a space.

You need to indicate the speaker of each utterance using an asterisk followed by three capital letters.
 
     @Begin
     @Participants:YVO housewife1, NAT housewife2
     *YVO:Excuse me , could we have two coffees and some scones , please ?
     *NAT:Yvonne , para mí no vayas a pedir scones de esos que ahora me estoy tratando de controlar un
               poquito antes de Pascua .
     *YVO:Si Christmas ya está round the corner , mujer .
     *YVO:Yo ya no hago dieta hasta por lo menos enero , febrero y eso con suerte .
     @End
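
The requirements in this step can be sketched as a small pre-check in Python. This is not the CLAN CHECK program and does not reproduce its behaviour; it is a minimal illustration that assumes only what is stated above: the three obligatory headers, speaker codes made of an asterisk plus three capital letters, and a delimiter preceded by a space. The function names (`logical_lines`, `check_basic_format`) and the restriction to the delimiters `.`, `?` and `!` are assumptions made for the example.

```python
import re

# Assumption: a speaker code is "*" + three capital letters + ":".
SPEAKER_RE = re.compile(r"^\*[A-Z]{3}:")
# Assumption: only the basic delimiters used in the example are checked.
TERMINATORS = (".", "?", "!")


def logical_lines(text):
    """Join continuation lines onto the header or tier line they belong to."""
    lines = []
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue
        # A line starting with @, * or % opens a new header or tier;
        # anything else is treated as a continuation of the previous line.
        if stripped[0] in "@*%" or not lines:
            lines.append(stripped)
        else:
            lines[-1] += " " + stripped
    return lines


def check_basic_format(text):
    """Return a list of problems found; an empty list means none were found."""
    lines = logical_lines(text)
    problems = []
    if not lines or lines[0] != "@Begin":
        problems.append("file does not start with @Begin")
    if not lines or lines[-1] != "@End":
        problems.append("file does not end with @End")
    if not any(line.startswith("@Participants:") for line in lines):
        problems.append("missing @Participants: header")
    spaced = tuple(" " + mark for mark in TERMINATORS)
    for line in lines:
        if not line.startswith("*"):
            continue
        if not SPEAKER_RE.match(line):
            problems.append("bad speaker code: " + line[:12])
        elif not line.endswith(spaced):
            problems.append("no space-separated delimiter: " + line[:12])
    return problems


sample = """\
@Begin
@Participants:YVO housewife1, NAT housewife2
*YVO:Excuse me , could we have two coffees and some scones , please ?
*NAT:Yvonne , para mí no vayas a pedir scones de esos que ahora me estoy tratando de controlar un
     poquito antes de Pascua .
@End
"""
print(check_basic_format(sample))
```

A check like this only catches the most basic slips before you run CHECK itself, which remains the authoritative test of the format.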
 

3 Run the CHECK program
 
The CLAN software tools include a program called CHECK, which checks the format of your file. Run CHECK
at this point to confirm that the basic format is correct.
   

Where can I get the CLAN tools and CHECK program?
 
4 Detailed transcription and Language tagging
 
Now you can insert the details of your transcription on the main tier (the line which begins with *).
Every word (or in some cases, each morpheme) is marked for its language using a tag consisting of @ followed by a number. You can choose which number you assign to which language. You may need some numbers to represent ‘unknown’, ‘undecidable’, ‘either Catalan or Spanish’ etc.

LIDES/CHAT provides conventions for marking a large number of phenomena which you may want to indicate in your transcription. You are not obliged to use any of these.
 

Where do I find the LIDES transcription conventions?


     @Begin
     @Participants:YVO housewife1, NAT housewife2
     @Languages:English (1), Spanish (2)
     *YVO:    Excuse@1 me@1 could@1 we@1 have@1 two@1 coffees@1 and@1 some@1 scones@1
                    please@1 ?
     *NAT:    Yvonne@1 para@2 mí@2 no@2 vayas@2 a@2 pedir@2 scones@1 de@2 esos@2 que@2
                   ahora@2 me@2 estoy@2 tratando@2 de@2 controlar@2 un@2 poquito@2 antes@2
                   de@2 Pascua@2 .
     *YVO:   Si@2 Christmas@1 ya@2 está@2 round@1 the@1 corner@1 mujer@2 .
     *YVO:   Yo@2 ya@2 no@2 hago@2 dieta@2 hasta@2 por@2 lo@2 menos@2 enero@2 febrero@2
                   y@2 eso@2 con@2 suerte@2 .
     @End
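
Once every word carries a language tag, the tags can be read mechanically. The following Python sketch shows one simple way to do this, assuming the `word@number` convention described above. The helper names (`tagged_words`, `switch_points`) are hypothetical and are not part of CLAN. A "switch" is taken here to be any change of tag between two adjacent tagged words in an utterance, which is a deliberately naive definition; numbers reserved for 'unknown' or 'undecidable' would need separate treatment.

```python
import re

# A tagged token is some text, then "@", then a language number.
TAG_RE = re.compile(r"^(?P<word>.+)@(?P<lang>\d+)$")


def tagged_words(utterance):
    """Return (word, language_number) pairs; untagged tokens are skipped."""
    pairs = []
    for token in utterance.split():
        match = TAG_RE.match(token)
        if match:
            pairs.append((match.group("word"), int(match.group("lang"))))
    return pairs


def switch_points(utterance):
    """Return (previous_word, next_word) pairs where the language tag changes."""
    pairs = tagged_words(utterance)
    return [
        (prev[0], cur[0])
        for prev, cur in zip(pairs, pairs[1:])
        if prev[1] != cur[1]
    ]


utterance = "Si@2 Christmas@1 ya@2 está@2 round@1 the@1 corner@1 mujer@2 ."
for before, after in switch_points(utterance):
    print(before, "->", after)
```

The input is the text of a main tier after the speaker code has been removed; the final delimiter carries no tag and is ignored.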
 
 

5 Insert dependent tiers
 
LIDES recommends very strongly that you add glosses (%glo) and a translation into a widely known language
(%tra) to make your transcription as useful as possible to other researchers. You can also add any number of
dependent tiers for any research purposes of your own: for example, to annotate actions, semantics, pragmatics,
syntax… You choose your own name for the tier but it must consist of a per cent sign followed by three letters.
 

What kinds of dependent tiers could I have?
 
     @Begin
     @Participants:YVO housewife1, NAT housewife2
     @Languages:English (1), Spanish (2)
     *YVO:    excuse@1 me@1 could@1 we@1 have@1 two@1 coffees@1 and@1 some@1 scones@1
                    please@1 ?
     *NAT:    Yvonne@1 para@2 mí@2 no@2 vayas@2 a@2 pedir@2 scones@1 de@2 esos@2 que@2
                   ahora@2 me@2 estoy@2 tratando@2 de@2 controlar@2 un@2 poquito@2 antes@2
                   de@2 Pascua@2 .
     %glo:      Yvonne for me not go to ask scones of these that now me are trying of control a little.bit
                   before of Christmas
     %tra:      Yvonne, don't order these scones for me because now I am trying not to put on weight before
                   Christmas
     *YVO:  si@2 Christmas@1 ya@2 está@2 round@1 the@1 corner@1 mujer@2 .
     %glo:     if Christmas already is round the corner woman
     %tra:     mind you, Christmas is already round the corner
     *YVO:  yo@2 ya@2 no@2 hago@2 dieta@2 hasta@2 por@2 lo@2 menos@2 enero@2 o@2
                   febrero@2 y@2 eso@2 con@2 suerte@2 .
     %glo:     I already not make diet until for the less January or February and that with luck
     %tra:     I am not going on a diet until at least January or February and even then with a bit of luck
     @End
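
The relationship between a main tier and its dependent tiers can be made explicit in code. The Python sketch below groups each `%` line under the `*` line that precedes it and rejects tier names that are not a per cent sign followed by three letters. It is an illustration under stated assumptions, not part of CLAN: the function name `group_tiers` is hypothetical, and continuation lines are assumed to be any lines that do not begin with `@`, `*` or `%`.

```python
import re

# A dependent tier name is "%" + three letters + ":".
DEP_TIER_RE = re.compile(r"^%[A-Za-z]{3}:")


def group_tiers(text):
    """Return a list of dicts with keys "speaker", "main" and "tiers"."""
    utterances = []
    target = None  # (container, key) that continuation lines are appended to
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            speaker, _, main = line[1:].partition(":")
            utterances.append({"speaker": speaker, "main": main.strip(), "tiers": {}})
            target = (utterances[-1], "main")
        elif line.startswith("%"):
            if not DEP_TIER_RE.match(line):
                raise ValueError("tier name must be % plus three letters: " + line)
            if not utterances:
                raise ValueError("dependent tier before any main tier: " + line)
            name, _, body = line.partition(":")
            utterances[-1]["tiers"][name] = body.strip()
            target = (utterances[-1]["tiers"], name)
        elif line.startswith("@"):
            target = None  # headers are ignored in this sketch
        elif target is not None:
            container, key = target
            container[key] += " " + line
    return utterances


sample = """\
*YVO:  si@2 Christmas@1 ya@2 está@2 round@1 the@1 corner@1 mujer@2 .
%glo:     if Christmas already is round the corner woman
%tra:     mind you, Christmas is already round the corner
"""
for utt in group_tiers(sample):
    print(utt["speaker"], utt["tiers"]["%tra"])
```

Keeping the tiers keyed by name makes it easy to pull out, say, every `%tra` line for readers who do not know one of the languages.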
 
 
6 Run the CHECK program again
 
Once transcription and tagging are complete, the next stage is to check again the overall structure of the files, together with the symbols and codes used in the main and dependent tiers, as declared in the CHILDES depfile and the LIDES depaddfile (see section 3.6 of the Coding Manual).

 
7 Make changes in the depaddfile
 
It may be necessary to add symbols to the LIDES depaddfile for your own particular set of data files. In that case, edit the depaddfile with an ASCII editor and then run the CHECK program once again.

 
8 Create a readme document
 
 
LIDES recommends very strongly that you create a readme document to accompany the transcription file. In this document you include information about the data and the circumstances in which it was recorded. This will enable other users to know the background to the data.
 
Now you can try out LIDES by downloading the CLAN tools and using some sample files from the LIDES corpora.



Brief guide to LIDES/CHAT Symbols 

(for more details, see Chapter 17 of MacWhinney, B. (1995, 2nd ed.): The CHILDES Project: Tools for Analyzing Talk. Hillsdale NJ: Erlbaum.)
Note: references are to chapters of the CHILDES manual, not the LIDES manual.

A. TRANSCRIPTION SYMBOLS

 
Word Symbols - Chapter 4

 
Symbol        Description
@             special form markers
xxx           unintelligible speech, not treated as a word
xx            unintelligible speech, treated as a word
yyy           unintelligible speech transcribed on %pho line, not treated as a word
yy            unintelligible speech transcribed on %pho line, treated as a word
www           untranscribed material
0             actions without speech
&             phonological element
[?]           best guess (see also chapter 8)
()            noncompletion of a word
0word         omitted word
0*word        ungrammatical omission
00word        (grammatical) ellipsis
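
When counting words in a transcript, it can help to separate ordinary words from the special word symbols listed above. The Python sketch below applies that list literally to a single token. It is only an illustration: `classify_token` is a hypothetical helper, not a CLAN function, and the sample tokens are invented for the example.

```python
# Symbols for unintelligible speech, taken from the list above.
UNINTELLIGIBLE = {"xxx", "xx", "yyy", "yy"}


def classify_token(token):
    """Return a rough category for one main-tier token."""
    if token in UNINTELLIGIBLE:
        return "unintelligible"
    if token == "www":
        return "untranscribed"
    if token == "0":
        return "action without speech"
    # The longer prefixes are tested first so that "00" and "0*"
    # are not mistaken for the plain "0" prefix.
    if token.startswith("00"):
        return "ellipsis"
    if token.startswith("0*"):
        return "ungrammatical omission"
    if token.startswith("0"):
        return "omitted word"
    if token.startswith("&"):
        return "phonological element"
    if "(" in token and ")" in token:
        return "incomplete word"
    return "word"


# Invented tokens, for illustration only.
for token in ["scones", "xxx", "0the", "00is", "&sco", "runn(ing)"]:
    print(token, "->", classify_token(token))
```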


 

Morpheme Symbols - Chapter 5
 


 
-             suffix marker
#             prefix marker
+             compound or rote form marker
~             clitic marker
&             fusion marker
-0            omitted affix
-0*           incorrectly omitted affix


 

Utterance and Tone Unit Terminators - Chapter 6
 


 
.             period
?             question
!             exclamation
-?            rising final contour
-'.           rise-fall contour
-,.           fall-rise contour
-,            level final contour
-_            low level final contour
,             syntactic juncture
,,            tag question
#             pause between words
-:            previous word lengthened


 

Symbol Terminators and Linkers - Chapter 6
 


 
+...          trailing off
+/.           interruption
+//.          self-interruption
+/?           interruption of question
+"/.          quotation follows on next line
+".           quotation precedes
+"            quoted utterance follows
+^            quick uptake
+,            self-completion
++            other-completion


 

Prosody within Words - Chapter 7
 


 
/             stress
//            accented nucleus
///           contrastive stress
:             lengthened syllable
::            pause between syllables


 

Scoped Symbols - Chapter 8
 


 
Symbol        Description
[=! text]     paralinguistics, prosodics
[!]           stressing
[!!]          contrastive stressing
["]           quotation marks
[= text]      explanation
[: text]      replacement
[0 text]      omission
[:=x text]    translation
[=? text]     alternative transcription
[%xxx text]   dependent tier on main line
[% text]      comment on main line
[$text]       code on main tier
[?]           best guess (also see chapter 4)
[>]           overlap follows
[<]           overlap precedes
[<>]          overlap follows and precedes
[>number]     overlap follows and overlaps are numbered
[<number]     overlap precedes and overlaps are numbered
[/]           retracing without correction
[//]          retracing with correction
[/-]          false start without retracing
[/?]          unclear retrace type
[*]           error marking
[+ text]      postcode


 

B. HEADERS (required by the CLAN programs)
 


 

Obligatory Headers - Chapter 3
 


 
Symbol            Description
@Begin            marks the beginning of a file
@End              marks the end of a file
@Participants:    lists actors in a file


 

Constant Headers - Chapter 3
 


 
@Age of XXX:        marks a speaker's age
@Birth of XXX:      shows date of birth of speaker
@Coder:             people doing transcription and coding
@Educ of XXX:       indicates educational level of speaker
@Filename:          shows name of file
@ID:                code for STATFREQ analyses
@Language:          the principal language of the transcript
@Language of XXX:   language(s) spoken by a given participant
@SES of XXX:        indicates socioeconomic status of speaker
@Sex of XXX:        indicates gender of speaker
@Warning:           marks defects in file


 

Changeable Headers - Chapter 3
 


 
@Activities:        component activities in the situation
@Bg and @Bg:        begin gem
@Bck:               background information
@Comment:           comments
@Date:              date of the interaction
@Eg and @Eg:        end gem
@G:                 simple gems
@Location:          geographical location of the interaction
@New Episode:       point at which a new episode begins and an old one ends
@Room Layout:       configuration of furniture in room
@Situation:         general atmosphere of the interaction
@Tape Location:     footage markers from tape
@Time Duration:     beginning and end times
@Time Start:        beginning time


 

C. OTHER
 

Dependent Tiers - Chapter 9
 


 
%act:         actions
%add:         addressee
%alt:         alternative transcription
%cod:         general purpose coding
%com:         comments by investigator
%def:         codes from SALT
%eng:         English translation
%err:         error coding
%exp:         explanation
%fac:         facial actions
%flo:         flowing conversation
%gls:         target language gloss for unclear utterance
%gpx:         gestural and proxemic activity
%int:         intonation
%mod:         model or target phonology
%mor:         morphemic semantics
%par:         paralinguistics
%pho:         phonetic transcription
%sit:         situation
%spa:         speech act coding
%syn:         syntactic structure notation
%tim:         time stamp coding


 

Symbols for Coding on Dependent Tiers - Chapter 9
 


 
Symbols       Description
/.../         delimiters for phonetic notation
$             indicates codes
<$=N>         occurs for N following utterances
<bef>         occurrence before an utterance
<aft>         occurrence after an utterance


 

Error Coding - Chapter 12
 


 
$=            source of an error in the %err line
=             placed between error and target
;             separates errors on %err line


 

Coding on the %mor tier - Chapter 14
 


 
|             follows part-of-speech on %mor line
&             nonconcatenated morpheme in %mor line
+ (Plus)      compound delimiter on %mor line
- (Dash)      suffix delimiter on %mor line
:             feature fusion on %mor line
~ (Tilde)     clitic delimiter on %mor line
0             precedes omitted element
0*            precedes incorrectly omitted element
00            precedes (grammatically) ellipsed element
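
To show how the delimiters above combine, here is a minimal Python sketch that splits one %mor token into its parts. It handles only three of the symbols listed: `~` between clitics, `|` after the part of speech, and `-` before each suffix. The other symbols are left untouched inside the stem. The function name `parse_mor_token` and the sample tokens are hypothetical, chosen for illustration and not taken from the CHILDES manual.

```python
def parse_mor_token(token):
    """Split one %mor token into a list of {"pos", "stem", "suffixes"} dicts.

    One dict is returned per clitic-separated piece of the token.
    """
    parts = []
    for piece in token.split("~"):          # "~" separates clitics
        pos, sep, rest = piece.partition("|")  # "|" follows the part of speech
        if not sep:
            raise ValueError("no part-of-speech delimiter in: " + piece)
        stem, *suffixes = rest.split("-")   # "-" introduces each suffix
        parts.append({"pos": pos, "stem": stem, "suffixes": suffixes})
    return parts


# Invented tokens, for illustration only.
print(parse_mor_token("n|scone-PL"))
print(parse_mor_token("pro|it~v|be"))
```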
