Tag: Data Cleaning

NLP Project Part 2: How to Clean and Prepare Data for Analysis

This is the second in a series of posts describing my natural language processing (NLP) project. To really benefit from this NLP article, you should read the first post, understand how to use pandas to work with text data, and be aware of list comprehensions and lambda functions. We’re also going to write a few […]

21 Places to Find Free Datasets for Data Science Projects

A collection of the best places to find free datasets for data visualization, data cleaning, machine learning, and data processing projects.

Six Reasons Why You Should Learn R for Data Science

Why should you learn R programming when you’re aiming to learn data science? Here are six reasons why R is the right language for you.

New Course: Learn Advanced Data Cleaning in R

Mastered the basics of data cleaning in R? Take your data cleansing skills to the next level with this advanced data cleaning course for R coders.

Data Cleaning and Preparation for Machine Learning

Learn data cleaning for a machine learning project by cleaning and preparing loan data from LendingClub for a predictive analytics project.

Master Data Cleaning with Our New Python Data Cleaning Advanced Course

Learn to clean data and replace missing values in Python using advanced skills like regular expressions (regex), list comprehensions, lambda functions, etc.

New Course: Learn Data Cleaning with Python and Pandas

Data cleaning might not be the reason you got interested in data science, but if you’re going to be a data scientist, no skill is more crucial. Learn how to clean data with Python and pandas in our new course.

Visualizing Women’s Marches: Part 1

In celebration of Women’s History Month, I wanted to better understand the scale of the Women’s Marches that occurred in January 2017. Shortly after the marches, Vox published a map visualizing the estimated turnout across the entire country. This map is excellent at displaying: locations with the highest relative turnouts; hubs and clusters of where […]

Tutorial: Cleaning CSV Data Using the Command Line and csvkit

Learn how to clean data on the command line, a key skill for doing data analysis and data science, using Python and csvkit.

Tutorial: Data Cleaning MoMA’s Art Collection with Python

A step-by-step tutorial on cleaning a dataset from MoMA with Python, using the pandas library. Data cleaning, also called data munging, is a core data science skill.
