Naive Bayes for Sentiment Analysis

In this lesson of the Exploring Topics data science course, we'll work with a CSV file of movie reviews and use the Naive Bayes classification algorithm to predict whether a review is positive or negative based on its text alone.

The Naive Bayes classifier is based on Bayes' theorem, which describes the probability of an event based on prior knowledge of conditions that might be related to the event. The classifier works by estimating how likely a set of data attributes is to be associated with each class, then choosing the most likely class.
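To make the idea concrete, here is a minimal sketch of a from-scratch Naive Bayes classifier in plain Python. The four toy reviews, the 1 / -1 label convention, the helper names (`fit`, `log_score`, `predict`), and the use of add-one smoothing are all assumptions made for illustration; they are not taken from the lesson's data set or code.

```python
import math
from collections import Counter

# Made-up toy reviews for illustration only; 1 = positive, -1 = negative.
train = [
    ("a great fun movie", 1),
    ("great acting and a fun plot", 1),
    ("a boring slow movie", -1),
    ("boring plot and bad acting", -1),
]

def fit(examples):
    """Count word frequencies per class and the number of reviews per class."""
    word_counts = {}          # label -> Counter of word frequencies
    class_counts = Counter()  # label -> number of reviews
    for text, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab

def log_score(text, label, word_counts, class_counts, vocab):
    """Return log P(label) + sum of log P(word | label), with add-one smoothing."""
    total_reviews = sum(class_counts.values())
    total_words = sum(word_counts[label].values())
    score = math.log(class_counts[label] / total_reviews)
    for word in text.split():
        # Add-one smoothing keeps unseen words from producing log(0).
        score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
    return score

def predict(text, model):
    """Pick the class with the highest log score for the given text."""
    _, class_counts, _ = model
    return max(class_counts, key=lambda label: log_score(text, label, *model))

model = fit(train)
prediction = predict("a fun plot", model)
```

The "naive" part is the assumption that words are independent given the class, which is why the per-word probabilities can simply be multiplied (summed in log space). Working with logarithms avoids numeric underflow when reviews are long.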

While going through the Naive Bayes lesson, you will not only code the algorithm from scratch but also learn to use the `MultinomialNB` implementation in scikit-learn. Scikit-learn is a Python machine learning library that contains implementations of many common machine learning algorithms.
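As a rough illustration of the scikit-learn route, the sketch below pairs `CountVectorizer` with `MultinomialNB`. The toy reviews and the 1 / -1 labels are made up for this example, and the lesson's own data loading and preprocessing may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up toy reviews for illustration only; 1 = positive, -1 = negative.
reviews = [
    "a great fun movie",
    "great acting and a fun plot",
    "a boring slow movie",
    "boring plot and bad acting",
]
labels = [1, 1, -1, -1]

# Turn each review into a vector of word counts.
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(reviews)

# Fit the classifier; the default alpha=1.0 applies add-one smoothing.
classifier = MultinomialNB()
classifier.fit(features, labels)

# New text must be transformed with the same fitted vectorizer.
predicted = classifier.predict(vectorizer.transform(["a fun plot"]))
```

Note that `CountVectorizer` drops single-character tokens such as "a" by default, so its word counts can differ slightly from a plain whitespace split.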

As you work through each concept, you’ll get to apply what you’ve learned from within your browser, so there's no need to use your own machine to do the exercises. The Python environment inside this course includes answer checking so you can ensure that you've fully mastered each concept before learning the next.


In this lesson, you'll learn:

  • How to implement a Naive Bayes classifier.
  • How to make predictions about review classifications.

Lesson Outline

1. Introduction
2. Overview of Naive Bayes
3. Finding Word Counts
4. Making Predictions About Review Classifications
5. Predicting the Test Set
6. Computing Prediction Error
7. A Faster Way to Make Predictions
8. Takeaways
