Securing interpretability: The case of Ega language documentation

Dafydd Gibbon, Catherine Bow, Steven Bird, Baden Hughes

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper published in Proceedings, peer-review

2 Citations (Scopus)

Abstract

The prime consideration in designing sustainable language resources is to ensure that they remain interpretable for coming generations of users. In this paper we adopt a new perspective on resource creation - securing the interpretability of data, using a case study of Ega, an endangered African language for which a small amount of legacy data is available. Basic steps to securing interpretability are to transfer files to durable media, and where possible, to convert all legacy data into XML files with Unicode character encodings. In the absence of agreed 'best practice' standards, we propose a methodology of 'better practice' to assist in the transition process towards this goal. We discuss a number of issues involved in securing interpretability of the lexicon, character encodings, interlinear glossed text, annotated recordings and nomenclature in linguistic descriptions, and describe our solutions.

Original language: English
Title of host publication: Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004
Editors: Maria Francisca Xavier, Rute Costa, Fatima Ferreira, Maria Teresa Lino, Raquel Silva
Publisher: European Language Resources Association (ELRA)
Pages: 1369-1372
Number of pages: 4
ISBN (Electronic): 2951740816, 9782951740815
Publication status: Published - 2004
Externally published: Yes
Event: 4th International Conference on Language Resources and Evaluation, LREC 2004 - Lisbon, Portugal
Duration: 26 May 2004 to 28 May 2004

Publication series

Name: Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004

Conference

Conference: 4th International Conference on Language Resources and Evaluation, LREC 2004
Country/Territory: Portugal
City: Lisbon
Period: 26/05/04 to 28/05/04
