Annotation Graphs: A Foundation for Integrating Tools, Formats and Corpora

Steven Bird, Mark Liberman

Research output: Working paper


In recent work we have presented a formal framework for linguistic annotations using labeled acyclic digraphs. These 'annotation graphs' offer a simple yet powerful method for representing complex annotation structures incorporating hierarchy and overlap. We illustrate some applications to existing discourse-level annotations of text and speech data. Annotation graphs are capable of representing the structure and content of a diverse range of formats, and this opens the door to wide-ranging integration of tools and corpora. We show how the approach facilitates substantive comparison of annotations expressed in different formats and how it permits queries on corpora which have been annotated at multiple levels using different coding standards and tools. Finally, we describe our philosophy on tool development.
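
As an informal illustration of the idea summarized above, the following is a minimal sketch of a labeled acyclic digraph holding annotations with both hierarchy and overlap. It is not the authors' formalism or any released toolkit: the class names, the `tier` field, the `spanned` helper, and the simplifying assumption that node ids increase along the timeline are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Arc:
    src: int    # start node id
    dst: int    # end node id
    tier: str   # annotation type, e.g. "word" or "phrase" (hypothetical field name)
    label: str  # annotation content


@dataclass
class AnnotationGraph:
    # node id -> optional time offset into the signal
    times: Dict[int, Optional[float]] = field(default_factory=dict)
    arcs: List[Arc] = field(default_factory=list)

    def add_node(self, node: int, time: Optional[float] = None) -> None:
        self.times[node] = time

    def add_arc(self, src: int, dst: int, tier: str, label: str) -> Arc:
        # Assumption: node ids follow timeline order, so requiring
        # src < dst is enough to keep the graph acyclic.
        if src >= dst:
            raise ValueError("arc must point forward along the timeline")
        arc = Arc(src, dst, tier, label)
        self.arcs.append(arc)
        return arc

    def spanned(self, outer: Arc, tier: str) -> List[Arc]:
        # Arcs of the given tier whose span lies inside `outer`'s span.
        return [a for a in self.arcs
                if a.tier == tier and outer.src <= a.src and a.dst <= outer.dst]


g = AnnotationGraph()
for node, t in enumerate([0.0, 0.4, 0.9, 1.5]):
    g.add_node(node, t)

g.add_arc(0, 1, "word", "the")
g.add_arc(1, 2, "word", "cat")
g.add_arc(2, 3, "word", "sat")
np_arc = g.add_arc(0, 2, "phrase", "NP")      # hierarchy: spans two word arcs
turn_arc = g.add_arc(1, 3, "turn", "speakerB")  # overlap: crosses the NP boundary

words_in_np = [a.label for a in g.spanned(np_arc, "word")]
words_in_turn = [a.label for a in g.spanned(turn_arc, "word")]
```

Because every annotation is just a labeled arc between shared nodes, the phrase arc and the turn arc can coexist even though neither contains the other, and a query such as "which words fall inside this span" works the same way across tiers.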
Original language: English
Number of pages: 6
Publication status: Published - 1999
Externally published: Yes

