Abstract
Text input technologies for low-resource languages support literacy, content authoring, and language learning. However, tasks such as word completion pose a challenge for morphologically complex languages because of the combinatorial explosion of possible words. We have developed a method for morphologically aware text input in Kunwinjku, a polysynthetic language of northern Australia. We modify an existing finite state recognizer to map input morph prefixes to morph completions, respecting the morphosyntax and morphophonology of the language. We demonstrate the portability of the method by applying it to Turkish. We show that the space of proximal morph completions is many orders of magnitude smaller than the space of full word completions for Kunwinjku. We provide a visualization of the morph completion space to enable the text completion parameters to be fine-tuned. Finally, we report on a web services deployment, along with a web interface which helps users enter morphologically complex words and which retrieves corresponding entries from the lexicon.
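To make the contrast between morph completion and full word completion concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy automaton whose arcs are labelled with whole morphs; the state names, the English-like placeholder morphs, and the helper functions `complete_morph`, `advance`, and `count_words` are all hypothetical. It models only slot order (a stand-in for morphosyntax) and ignores morphophonology.

```python
from typing import Dict, List, Tuple

# Toy automaton (hypothetical): each state is a slot in the word template,
# and each arc is (morph, next_state). The morphs are invented placeholders,
# not Kunwinjku or Turkish data.
ARCS: Dict[str, List[Tuple[str, str]]] = {
    "prefix": [("re", "stem"), ("un", "stem")],
    "stem": [("lock", "suffix"), ("load", "suffix"), ("link", "suffix")],
    "suffix": [("ed", "end"), ("ing", "end"), ("s", "end")],
    "end": [],
}
FINAL = "end"


def complete_morph(state: str, typed: str) -> List[str]:
    """Return the morphs allowed in this slot that start with the typed prefix."""
    return sorted(m for m, _ in ARCS[state] if m.startswith(typed))


def advance(state: str, morph: str) -> str:
    """Follow the arc labelled with the chosen morph and return the next state."""
    for m, nxt in ARCS[state]:
        if m == morph:
            return nxt
    raise ValueError(f"morph {morph!r} is not allowed in state {state!r}")


def count_words(state: str) -> int:
    """Count every full word (path to the final state) reachable from a state."""
    if state == FINAL:
        return 1
    return sum(count_words(nxt) for _, nxt in ARCS[state])


if __name__ == "__main__":
    state = advance("prefix", "un")           # user accepted the morph "un"
    print(complete_morph(state, "lo"))        # candidates for the next morph only
    print(count_words("prefix"))              # size of the full word space
```

Even in this toy setting, each slot offers at most three candidates, while the number of full words is the product of the slot sizes; with many more slots and morphs per slot the gap widens accordingly.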
Original language | English |
---|---|
Title of host publication | COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference |
Editors | Donia Scott, Nuria Bel, Chengqing Zong |
Place of Publication | Czech Republic |
Publisher | International Committee on Computational Linguistics |
Pages | 4600-4611 |
Number of pages | 12 |
Volume | 1 |
ISBN (Electronic) | 978-1-952148-27-9 |
Publication status | Published - 2020 |
Event | 28th International Conference on Computational Linguistics, COLING 2020 - Virtual, Online, Spain; Duration: 8 Dec 2020 → 13 Dec 2020 |
Publication series
Name | COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference |
---|---|
Conference
Conference | 28th International Conference on Computational Linguistics, COLING 2020 |
---|---|
Country/Territory | Spain |
City | Virtual, Online |
Period | 8/12/20 → 13/12/20 |
Bibliographical note
Funding Information: We are grateful to the Bininj people of Northern Australia for the opportunity to work in their community. Our thanks to Antti Arppe and several anonymous reviewers for helpful feedback on earlier versions of this paper. This research was covered by a research permit from the Northern Land Council, and was supported by the Australian government through a PhD scholarship, and grants from the Australian Research Council and the Indigenous Language and Arts Program.
Publisher Copyright:
© 2020 COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference. All rights reserved.