TY - GEN
T1 - Enabling Interactive Transcription in an Indigenous Community
AU - Le Ferrand, Eric
AU - Bird, Steven
AU - Besacier, Laurent
N1 - Funding Information:
We are grateful to the Bininj people of Northern Australia for the opportunity to work in their community, and particularly to artists at Injalak Arts and Craft (Gunbalanya) and to the Warddeken Rangers (Kabulwarnamyo). Our thanks to several anonymous reviewers for helpful feedback on earlier versions of this paper. The lexical confirmation app presented in this paper was designed by Mat Bettinson at Charles Darwin University. This research was covered by a research permit from the Northern Land Council and ethics approval from CDU, and was supported by the Australian government through a PhD scholarship and by grants from the Australian Research Council and the Indigenous Language and Arts Program.
Publisher Copyright:
© 2020 COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference. All rights reserved.
PY - 2020
Y1 - 2020
AB - We propose a novel transcription workflow which combines spoken term detection and human-in-the-loop, together with a pilot experiment. This work is grounded in an almost zero-resource scenario where only a few terms have so far been identified, involving two endangered languages. We show that in the early stages of transcription, when the available data is insufficient to train a robust ASR system, it is possible to take advantage of the transcription of a small number of isolated words in order to bootstrap the transcription of a speech collection.
UR - http://www.scopus.com/inward/record.url?scp=85104283717&partnerID=8YFLogxK
U2 - 10.18653/v1/2020.coling-main.303
DO - 10.18653/v1/2020.coling-main.303
M3 - Conference Paper published in Proceedings
AN - SCOPUS:85104283717
VL - 1
T3 - COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference
SP - 3422
EP - 3428
BT - COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference
A2 - Scott, Donia
A2 - Bel, Nuria
A2 - Zong, Chengqing
PB - Association for Computational Linguistics (ACL)
CY - Czech Republic
T2 - 28th International Conference on Computational Linguistics, COLING 2020
Y2 - 8 December 2020 through 13 December 2020
ER -