An attentional model for speech translation without transcription

Long Duong, Antonios Anastasopoulos, David Chiang, Steven Bird, Trevor Cohn

Research output: Chapter in Book/Report/Conference proceeding; Conference Paper published in Proceedings; peer-review

Abstract

For many low-resource languages, spoken language resources are more likely to be annotated with translations than transcriptions. This bilingual speech data can be used for word-spotting, spoken document retrieval, and even for documentation of endangered languages. We experiment with the neural, attentional model applied to this data. On phone-to-word alignment and translation reranking tasks, we achieve large improvements relative to several baselines. On the more challenging speech-to-word alignment task, our model nearly matches GIZA++'s performance on gold transcriptions, but without recourse to transcriptions or to a lexicon.

Original language: English
Title of host publication: 2016 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference
Editors: Kevin Knight, Ani Nenkova, Owen Rambow
Place of Publication: Pennsylvania
Publisher: Association for Computational Linguistics (ACL)
Pages: 949-959
Number of pages: 11
Volume: 1
ISBN (Electronic): 9781941643914
DOIs
Publication status: Published - Jun 2016
Externally published: Yes
Event: 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - San Diego, United States
Duration: 12 Jun 2016 to 17 Jun 2016

Conference

Conference: 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016
Country/Territory: United States
City: San Diego
Period: 12/06/16 to 17/06/16
