Querying databases of annotated speech

S. Cassidy, S. Bird

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper published in Proceedings > peer-review

10 Citations (Scopus)


Annotated speech corpora are databases consisting of signal data along with time-aligned symbolic 'transcriptions'. Such databases are typically multidimensional, heterogeneous and dynamic. These properties present a number of tough challenges for representation and query. The temporal nature of the data adds a further layer of complexity. This paper presents and harmonises two independent efforts to model annotated speech databases, one at Macquarie University and one at the University of Pennsylvania. Various query languages are described along with illustrative applications to a variety of analytical problems. The research reported here forms part of several ongoing projects to develop platform-independent open-source tools for creating, browsing, searching, querying and transforming linguistic databases, and to disseminate large linguistic databases over the Internet.

Original language: English
Title of host publication: Proceedings - 11th Australasian Database Conference, ADC 2000
Editors: Maria E. Orlowska
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Number of pages: 9
ISBN (Electronic): 0769505287, 9780769505282
Publication status: Published - 2000
Externally published: Yes
Event: 11th Australasian Database Conference, ADC 2000 - Canberra, Australia
Duration: 31 Jan 2000 to 3 Feb 2000

Publication series

Name: Proceedings - 11th Australasian Database Conference, ADC 2000


Conference: 11th Australasian Database Conference, ADC 2000

