Learning a lexicon and translation model from phoneme lattices

Oliver Adams, Graham Neubig, Trevor Cohn, Steven Bird, Quoc Truong Do, Satoshi Nakamura

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper published in Proceedings

Abstract

Language documentation begins by gathering speech. Manual or automatic transcription at the word level is typically not possible because of the absence of an orthography or prior lexicon, and though manual phonemic transcription is possible, it is prohibitively slow. On the other hand, translations of the minority language into a major language are more easily acquired. We propose a method to harness such translations to improve automatic phoneme recognition. The method assumes no prior lexicon or translation model, instead learning them from phoneme lattices and translations of the speech being transcribed. Experiments demonstrate phoneme error rate improvements against two baselines and the model's ability to learn useful bilingual lexical entries.

Original language: English
Title of host publication: EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 2377-2382
Number of pages: 6
ISBN (Electronic): 9781945626258
DOIs: https://doi.org/10.18653/v1/D16-1263
Publication status: Published - 1 Jan 2016
Externally published: Yes
Event: 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016 - Austin, United States
Duration: 1 Nov 2016 to 5 Nov 2016

Publication series

Name: EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference: 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016
Country: United States
City: Austin
Period: 1/11/16 to 5/11/16

  • Cite this

    Adams, O., Neubig, G., Cohn, T., Bird, S., Do, Q. T., & Nakamura, S. (2016). Learning a lexicon and translation model from phoneme lattices. In EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 2377-2382). (EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/D16-1263