Towards a formal framework for linguistic annotations

Steven Bird, Mark Liberman

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper published in Proceedings › peer-review


‘Linguistic annotation’ is a term covering any transcription, translation or annotation of textual data or recorded linguistic signals. While there are several ongoing efforts to provide formats and tools for such annotations and to publish annotated linguistic databases, the lack of widely accepted standards is becoming a critical problem. Proposed standards, to the extent they exist, have focussed on file formats. This paper focuses instead on the logical structure of linguistic annotations. We survey a wide variety of annotation formats and demonstrate a common conceptual core. This provides the foundation for an algebraic framework which encompasses the representation, archiving and query of linguistic annotations, while remaining consistent with many alternative file formats.
Original language: English
Title of host publication: Proceedings of the 5th International Conference on Spoken Language Processing
Number of pages: 12
Publication status: Published - 1998
Externally published: Yes
Event: 5th International Conference on Spoken Language Processing - Sydney, Australia
Duration: 30 Nov 1998 to 4 Dec 1998


Conference: 5th International Conference on Spoken Language Processing

