Towards a formal framework for linguistic annotations

Steven Bird, Mark Liberman

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper published in Proceedings

Abstract

‘Linguistic annotation’ is a term covering any transcription, translation or annotation of textual data or recorded linguistic signals. While there are several ongoing efforts to provide formats and tools for such annotations and to publish annotated linguistic databases, the lack of widely accepted standards is becoming a critical problem. Proposed standards, to the extent they exist, have focussed on file formats. This paper focuses instead on the logical structure of linguistic annotations. We survey a wide variety of annotation formats and demonstrate a common conceptual core. This provides the foundation for an algebraic framework which encompasses the representation, archiving and query of linguistic annotations, while remaining consistent with many alternative file formats.

Original language: English
Title of host publication: Proceedings of the 5th International Conference on Spoken Language Processing
Number of pages: 12
Publication status: Published - 1998
Externally published: Yes
Event: 5th International Conference on Spoken Language Processing - Sydney, Australia
Duration: 30 Nov 1998 to 4 Dec 1998

Conference

Conference: 5th International Conference on Spoken Language Processing
Country: Australia
City: Sydney
Period: 30/11/98 to 4/12/98

  • Cite this

    Bird, S., & Liberman, M. (1998). Towards a formal framework for linguistic annotations. In Proceedings of the 5th International Conference on Spoken Language Processing.