Learning About NLTK (Python Natural Language Toolkit)
Python Natural Language Toolkit "is a suite of open source Python modules, data and documentation for research and development in natural language processing."
Install on OS X
easy_install didn't work for me. After installing PyYAML I could download the source and run setup.py install, but then I discovered that the home page actually links to a .dmg file, so I installed that over the top.
Details on installing corpora without the GUI installer: http://www.nltk.org/data
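As a rough sketch of what a non-GUI install can look like from Python: nltk.data.find raises LookupError when a resource is missing, and nltk.download fetches a package by name. The helper name ensure_nltk_data is made up for this note, and the "punkt" resource is an assumption (it is the tokenizer model the sentence splitter has traditionally needed; newer NLTK releases may ask for "punkt_tab" instead).

```python
def ensure_nltk_data(resource_path, package):
    """Download an NLTK data package only if it is not already installed.

    Hypothetical helper for illustration, not part of NLTK itself.
    Returns False if the resource was already present, otherwise the
    boolean result of nltk.download().
    """
    # Imported here so the helper can be defined before NLTK is installed.
    import nltk

    try:
        # Raises LookupError if the resource is not on NLTK's data path.
        nltk.data.find(resource_path)
        return False
    except LookupError:
        # quiet=True suppresses the progress output; no GUI is opened
        # when a package name is passed explicitly.
        return nltk.download(package, quiet=True)


if __name__ == "__main__":
    # Assumed resource name; adjust to whatever your NLTK version asks for.
    ensure_nltk_data("tokenizers/punkt", "punkt")
```

The same thing should also work from a shell with "python -m nltk.downloader punkt", which is handy on a machine with no display.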
- ((NLTK) (is (fun))) -- Non-linguist's introductory blog post
    import nltk

    # Split the text into sentences, then each sentence into word tokens.
    for sentence in nltk.sent_tokenize("Wow, that was quick. Target is so much less painful than Wal-Mart."):
        print(nltk.word_tokenize(sentence))

    # Output:
    # ['Wow', ',', 'that', 'was', 'quick', '.']
    # ['Target', 'is', 'so', 'much', 'less', 'painful', 'than', 'Wal-Mart', '.']