TY - GEN
T1 - Securing interpretability
T2 - 4th International Conference on Language Resources and Evaluation, LREC 2004
AU - Gibbon, Dafydd
AU - Bow, Catherine
AU - Bird, Steven
AU - Hughes, Baden
PY - 2004
Y1 - 2004
AB - The prime consideration in designing sustainable language resources is to ensure that they remain interpretable for coming generations of users. In this paper we adopt a new perspective on resource creation: securing the interpretability of data, using a case study of Ega, an endangered African language for which a small amount of legacy data is available. The basic steps towards securing interpretability are to transfer files to durable media and, where possible, to convert all legacy data into XML files with Unicode character encodings. In the absence of agreed 'best practice' standards, we propose a methodology of 'better practice' to assist in the transition process towards this goal. We discuss a number of issues involved in securing interpretability of the lexicon, character encodings, interlinear glossed text, annotated recordings and nomenclature in linguistic descriptions, and describe our solutions.
UR - http://www.scopus.com/inward/record.url?scp=85037142331&partnerID=8YFLogxK
M3 - Conference Paper published in Proceedings
AN - SCOPUS:85037142331
T3 - Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004
SP - 1369
EP - 1372
BT - Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004
A2 - Xavier, Maria Francisca
A2 - Costa, Rute
A2 - Ferreira, Fatima
A2 - Lino, Maria Teresa
A2 - Silva, Raquel
PB - European Language Resources Association (ELRA)
Y2 - 26 May 2004 through 28 May 2004
ER -