Learning About NLTK (Python Natural Language Toolkit)

The Python Natural Language Toolkit (NLTK) "is a suite of open source Python modules, data and documentation for research and development in natural language processing."

Install on OS X

easy_install didn't work for me, although after installing PyYAML I was able to download the source and run setup.py install. I then discovered that the home page actually links to a .dmg file, so I installed that over the top.

Details on installing the corpora without the GUI installer: http://www.nltk.org/data
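
As a rough sketch of the non-GUI route, the data packages can also be fetched from Python by passing a package id to nltk.download(). The helper name fetch_nltk_data is my own (hypothetical), and "punkt" is only an assumed example id: the package that the tokenizers need differs between NLTK versions, so check the data page above for the right one.

```python
def fetch_nltk_data(packages, download_dir=None):
    """Download the named NLTK data packages without opening the GUI.

    fetch_nltk_data is a hypothetical helper, not part of NLTK.
    Returns a dict mapping each package id to True (success) or False.
    """
    # Imported lazily so that defining the helper does not require NLTK.
    import nltk

    results = {}
    for package in packages:
        # Passing an explicit id skips the interactive chooser;
        # download_dir=None lets NLTK pick its default data directory.
        ok = nltk.download(package, download_dir=download_dir, quiet=True)
        results[package] = bool(ok)
    return results
```

Calling fetch_nltk_data(["punkt"]) should then return {"punkt": True} if the download worked. It needs network access the first time.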




import nltk
# Split the text into sentences, then split each sentence into word tokens.
for sentence in nltk.sent_tokenize("Wow, that was quick. Target is so much less painful than Wal-Mart."):
    print(nltk.word_tokenize(sentence))
# Output:
# ['Wow', ',', 'that', 'was', 'quick', '.']
# ['Target', 'is', 'so', 'much', 'less', 'painful', 'than', 'Wal-Mart', '.']
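
To get a feel for what the two tokenizing steps are doing, here is a deliberately naive approximation using only the standard re module. This is not how NLTK does it (its tokenizers are far more careful), and the function names simple_sent_tokenize and simple_word_tokenize are made up for this sketch. It happens to give the same tokens for the example sentence, but it would break on things like "Dr. Smith".

```python
import re


def simple_sent_tokenize(text):
    """Split after '.', '!' or '?' whenever whitespace follows."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def simple_word_tokenize(sentence):
    """Return words (keeping internal hyphens) and punctuation as separate tokens."""
    return re.findall(r"\w+(?:-\w+)*|[^\w\s]", sentence)


text = "Wow, that was quick. Target is so much less painful than Wal-Mart."
for sentence in simple_sent_tokenize(text):
    print(simple_word_tokenize(sentence))
```

The word pattern tries a hyphenated word first, which is why Wal-Mart stays in one piece, and then falls back to any single non-space, non-word character so that commas and full stops become tokens of their own.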