Learning About NLTK (Python Natural Language Toolkit)

Python Natural Language Toolkit "is a suite of open source Python modules, data and documentation for research and development in natural language processing."

Install on OS X

easy_install didn't work for me. After installing PyYAML I was able to download the source and run setup.py install, but then I discovered that the home page actually links to a .dmg file, so I installed that over the top.

Details on installing the corpora without the GUI installer: http://www.nltk.org/data
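
One non-GUI route is to fetch data packages from Python with nltk.download(). The helper below is only a sketch: the name fetch_nltk_data and its download parameter are made up for illustration, and the package name "punkt" (the tokenizer data used by the example further down) is an assumption; newer NLTK releases may ask for "punkt_tab" instead.

```python
def fetch_nltk_data(packages=("punkt",), download=None):
    """Download each named NLTK data package; return the names that failed.

    fetch_nltk_data is a hypothetical helper, not part of NLTK.
    """
    if download is None:
        # Imported lazily so the helper can be tried out without NLTK installed.
        import nltk
        download = nltk.download  # returns True on success, False on failure
    return [name for name in packages if not download(name)]
```

Calling fetch_nltk_data() with no arguments needs NLTK installed and a network connection; an empty list back means everything downloaded.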

Documentation

Links

Examples

>>> import nltk  # the tokenizers also need the Punkt data, see http://www.nltk.org/data
>>> for sentence in nltk.sent_tokenize("Wow, that was quick. Target is so much less painful than Wal-Mart."):
...     print(nltk.word_tokenize(sentence))
['Wow', ',', 'that', 'was', 'quick', '.']
['Target', 'is', 'so', 'much', 'less', 'painful', 'than', 'Wal-Mart', '.']
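
To get a feel for what the two calls above are doing, here is a rough regex-only imitation. It is not how NLTK works internally (NLTK uses a trained Punkt model for sentences and a more careful word tokenizer), and the names rough_sent_tokenize and rough_word_tokenize are made up for this sketch. It gives the same token lists for this particular input, but will go wrong on abbreviations such as "Dr." and on many other cases.

```python
import re


def rough_sent_tokenize(text):
    # Split on whitespace that follows '.', '!' or '?'; the punctuation stays
    # attached to the sentence it ends.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def rough_word_tokenize(sentence):
    # A token is either a word (internal hyphens and apostrophes allowed,
    # so "Wal-Mart" stays whole) or a single punctuation character.
    return re.findall(r"\w+(?:[-']\w+)*|[^\w\s]", sentence)


text = "Wow, that was quick. Target is so much less painful than Wal-Mart."
for sentence in rough_sent_tokenize(text):
    print(rough_word_tokenize(sentence))
```
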
code@rancidbacon.com