Abstract
‘Linguistic annotation’ is a term covering any transcription, translation or annotation of textual data or recorded linguistic signals. While there are several ongoing efforts to provide formats and tools for such annotations and to publish annotated linguistic databases, the lack of widely accepted standards is becoming a critical problem. Proposed standards, to the extent they exist, have focussed on file formats. This paper focuses instead on the logical structure of linguistic annotations. We survey a wide variety of annotation formats and demonstrate a common conceptual core. This provides the foundation for an algebraic framework which encompasses the representation, archiving and query of linguistic annotations, while remaining consistent with many alternative file formats.
Original language | English |
---|---|
Title of host publication | Proceedings of the 5th International Conference on Spoken Language Processing |
Number of pages | 12 |
Publication status | Published - 1998 |
Externally published | Yes |
Event | 5th International Conference on Spoken Language Processing, Sydney, Australia (30 Nov 1998 → 4 Dec 1998) |
Conference
Conference | 5th International Conference on Spoken Language Processing |
---|---|
Country/Territory | Australia |
City | Sydney |
Period | 30/11/98 → 4/12/98 |