Setup
Supporting Libraries / Dependencies
Before running the steps below, verify that the Python environment has the supporting libraries installed, either via pip or through the Anaconda distribution.
Required Libraries:
- regex
- tqdm
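One way to check for these libraries from within Python is sketched below. It relies on the standard-library importlib.util.find_spec. The helper name missing_packages is hypothetical, and including nltk in the list is an assumption based on the NLTK steps later in these notes.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# "regex" and "tqdm" come from the list above; "nltk" is an assumption.
required = ["regex", "tqdm", "nltk"]
missing = missing_packages(required)

if missing:
    # Anything reported here can be installed with pip or conda.
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```

Any name reported as missing can then be installed with pip or through Anaconda before continuing.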
Documentation
Natural Language Toolkit - NLTK 3.5 documentation
NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum.
www.nltk.org
NLTK Downloader Shell
Process
Running the following in a Jupyter notebook cell opens the NLTK Downloader Shell.
import nltk
nltk.download_shell()
The shell is driven by the single-letter commands shown in its menu. Enter l (List) to see the available packages and collections. To download a specific package, enter d (Download) and then the name of the package, stopwords in this case. Once the package has been downloaded and installed to the directory the shell reports, it appears in the list marked with an *, indicating that it is already installed.
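For notebooks that need to run without keyboard input, the same download can probably be done non-interactively. The sketch below is a minimal example, not part of the original notes: the helper name ensure_resource is hypothetical, and the resource path corpora/stopwords assumes the stopwords package is stored under NLTK's corpora directory. It uses nltk.data.find to check for an existing copy and nltk.download to fetch the package otherwise.

```python
def ensure_resource(package, resource_path):
    """Return True if the NLTK resource is available, downloading it if needed."""
    # Imported here so the helper can be defined before NLTK is installed.
    import nltk

    try:
        # Raises LookupError when the resource is not in any NLTK data directory.
        nltk.data.find(resource_path)
        return True
    except LookupError:
        # nltk.download returns True on success and False on failure.
        return bool(nltk.download(package, quiet=True))
```

Calling ensure_resource("stopwords", "corpora/stopwords") once at the top of the notebook would then make the stopwords corpus available without opening the downloader shell. The download itself requires network access.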