Natural Language Processing for Low Resource Languages

Project: HDR Project (PhD)

Project Details

Description

A low-resource language is one for which there are few, if any, documentary resources such as lexicons, grammars, or written texts. For communities that speak rare and underserved languages, these materials are important for preserving and promoting their culture, linguistic heritage, and identity. In the Northern Territory, many Indigenous communities speak such languages and would like to develop and benefit from these kinds of linguistic resources.
In line with the goal of supporting local communities in their language documentation and promotion efforts, my research will focus on using technology to model various aspects of human language in a low-resource setting. Specifically, I will work on modeling speech, transcriptions, lexicons, and grammars, drawing on symbolic, statistical, and neural network approaches to natural language processing.
This research will build on elements from several academic fields, including linguistics, computer science, and software engineering. The goal is to use this multidisciplinary foundation to develop practical and principled methods for building natural language systems in a low-resource setting, such as systems that support translation, language learning, and information access. The participation and guidance of local Indigenous communities is indispensable to this effort. Together, we can work towards a future where technology is available to help these communities document and promote their linguistic identities.
Status: Active
Effective start/end date: 1/07/19 → …