Natural Language Processing
552 learners
Feature Engineering for Text Classification
Learn how to transform raw text into features that machine learning models can use. Through a practical, hands-on approach, you'll cover tokenization, Bag-of-Words and TF-IDF representations, sparse feature handling, and dimensionality reduction.
NLTK
Python
Scikit-learn
5 lessons
25 practices
5 hours
Feature Engineering and Text Representation
Lessons and practices
Filter Punctuation from Tokenized Review
Filtering Word Tokens from a Sentence
Completing Code for Data Loading and Tokenizing
Tokenizing and Filtering a Movie Review
Tokenizing First Review and Printing Tokens
Customizing Bag-of-Words Representation
Applying CountVectorizer on Sentences
Bag-of-Words Transformation on IMDB Reviews Dataset
Creating Bag-of-Words Representation Yourself
Turn Rich Text into Bag-of-Words Representation
Change TF-IDF Vector for Different Sentence
Implementing TF-IDF Vectorizer on Provided Text
Understanding Sparse Matrix Components
Applying TF-IDF Vectorizer On Reviews Dataset
Implementing TF-IDF Vectorizer from Scratch
Switching from CSC to CSR Representation
Creating a Coordinate Format Matrix with Duplicates
Performing Vectorized Operations on Sparse Matrices
Creating CSR Matrix from Larger Array
Performing Subtraction Operation on Sparse Matrix
Change TruncatedSVD Components Number
Implement Dimensionality Reduction with TruncatedSVD
Applying TruncatedSVD on Bag-of-Words Matrix
Implement TruncatedSVD on Bag-of-Words Matrix
Implementing TruncatedSVD on IMDB Movie Reviews Dataset
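
As a rough illustration of the first steps the practices above walk through (tokenizing, building a Bag-of-Words matrix, and weighting it with TF-IDF), here is a minimal from-scratch sketch using only the Python standard library. The function names (`tokenize`, `bag_of_words`, `tf_idf`) and the two sample reviews are hypothetical and not taken from the course material. The regex tokenizer is a simplification (it splits contractions such as "don't"), and the weighting uses the smoothed IDF `ln((1 + n) / (1 + df)) + 1` with L2 row normalization, which is the default behaviour of scikit-learn's `TfidfVectorizer`; the course itself may present these steps differently.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, count matrix) with one row of term counts per document."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    rows = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for tok, count in Counter(toks).items():
            row[index[tok]] = count
        rows.append(row)
    return vocab, rows


def tf_idf(rows):
    """Weight raw counts by smoothed IDF, then L2-normalize each row."""
    n_docs = len(rows)
    n_terms = len(rows[0]) if rows else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    weighted = []
    for row in rows:
        vec = [count * weight for count, weight in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
        weighted.append([v / norm for v in vec])
    return weighted


# Hypothetical sample data, for illustration only.
reviews = [
    "A great movie, great acting!",
    "A dull movie.",
]
vocab, counts = bag_of_words(reviews)
weights = tf_idf(counts)

print(vocab)
for row in weights:
    print([round(v, 3) for v in row])
```

Most entries in such a matrix are zero once the vocabulary grows, which is why the later practices move to sparse formats (CSR, CSC, coordinate) and to TruncatedSVD for reducing the number of columns.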