Effective Machine Learning and Graph Search Methods in Language Transliteration

  • Mohamed Mahmoud Abd El-Wahab

Student thesis: Doctor of Philosophy (PhD) - CDU

Abstract

Transliteration is the process of transferring a word from the alphabet of one language to the alphabet of another. The objective is to obtain a mapping from one writing system into another, which helps people pronounce foreign words and names by presenting them in a familiar alphabet. In this thesis, recent trends in transliteration using deep learning models have been explored, along with a number of recursive backtracking methods and a modified version of Dijkstra’s shortest path algorithm. The convolutional sequence-to-sequence (seq2seq) model developed by Facebook was adapted and applied to the Arabic-English transliteration problem. This approach allowed us to build on recent work by Google and Amazon researchers and to improve on previous methods in both the training and prediction steps. In addition, two novel backtracking techniques have been introduced. The first is based on bidirectional search and the second is a flexible semi-exact method. Both methods have been applied to transliteration for the first time. The proposed methods have been tested on one of the most widely used Arabic-English transliteration datasets, mined from Wikipedia by Google. The experimental results show that the presented methods are highly effective and considerably more efficient than previously known techniques.
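As an illustration of how a shortest-path search can be applied to transliteration, the following is a minimal sketch only. It uses textbook Dijkstra, not the modified version developed in the thesis, and it assumes a lattice whose nodes are positions in the source word and whose edges are source segments with candidate outputs. The table `TABLE`, its integer costs (standing in for scaled negative log-probabilities), and the function name `transliterate` are all hypothetical and are not taken from the thesis or its dataset.

```python
import heapq

# Hypothetical segment table: source segment -> [(target string, cost)].
# Costs are invented for illustration; lower means more plausible.
TABLE = {
    "م": [("m", 10)],
    "ح": [("h", 10)],
    "د": [("d", 10)],
    "مح": [("moh", 12), ("muh", 15)],
    "مد": [("amed", 11), ("ammad", 13)],
}


def transliterate(word, table=TABLE, max_seg=2):
    """Return (cost, output) for the cheapest lattice path, or None."""
    n = len(word)
    # Heap entries: (cost so far, position in source word, output so far).
    heap = [(0, 0, "")]
    settled = set()  # positions whose cheapest cost is already final
    while heap:
        cost, pos, out = heapq.heappop(heap)
        if pos == n:
            # First time the end position is popped, its cost is minimal.
            return cost, out
        if pos in settled:
            continue
        settled.add(pos)
        # Try every source segment of length 1..max_seg starting at pos.
        for end in range(pos + 1, min(n, pos + max_seg) + 1):
            for target, step in table.get(word[pos:end], []):
                heapq.heappush(heap, (cost + step, end, out + target))
    return None  # no segmentation of the word is covered by the table


if __name__ == "__main__":
    print(transliterate("محمد"))
```

Because all edge costs are non-negative, the first time the end position is popped from the heap its cost is guaranteed to be minimal, which is the property that makes Dijkstra's algorithm applicable to this kind of lattice.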
Date of Award: Jul 2022
Original language: English
Supervisor: Jamal El-Den
