The Natural Language Toolkit is a suite of program modules, data sets, tutorials and exercises covering symbolic and statistical natural language processing. NLTK is popular in teaching and research, and has been adopted in dozens of NLP courses. NLTK is written in Python and distributed under the GPL open source license. Over the past year the toolkit has been completely rewritten, simplifying many linguistic data structures and taking advantage of recent enhancements in the Python language. This paper reports on the resulting, simplified toolkit, NLTK-Lite, and shows how it is used to support efficient scripting for natural language processing.
|Title of host publication|Proceedings of the 4th International Conference on Natural Language Processing (ICON)|
|Publisher|Allied Publishers Private Limited|
|Number of pages|8|
|Publication status|Published - 2005|