Creating An Efficient Data Analysis Workflow (Part 2)
In our specialized data processing course, we’ve learned a lot of different programming concepts, including working with text data, the map function, and times and dates.
These concepts are useful to know in isolation, but they are best thought of as tools in a toolkit. These programming “tools” are what we will use to clean and manipulate data whenever we encounter a new dataset. It is extremely rare to receive a dataset that needs no processing before analysis, so we must always be prepared to clean it to suit our needs.
More tools in our programming toolkit means that we can take on different, perhaps harder, problems. As in the last guided project, we are taking on the role of an analyst for a book company. The company has provided us with more data on some of its 2019 book sales, and it wants us to extract some usable knowledge from it. On July 1st, 2019, it launched a new program encouraging customers to buy more books, and it wants to know whether this program was successful at increasing sales and improving review quality. As the analyst, it will be our job to figure this out in the guided project.
By the end, you’ll have completed an end-to-end data analysis project!
- Synthesize what you’ve learned in our Specialized Data Processing course.
- Apply your new skills to a real data analysis problem.
- Build a complete data science project!
- Data Exploration
- Handling Missing Data
- Processing Review Data
- Comparing Book Sales Between Pre- and Post-Program Sales
- Comparing Book Sales Within Customer Type
- Comparing Review Sentiment Between Pre- and Post-Program Sales
- Further Steps
- Next Steps
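The pre- and post-program comparison named in the outline can be sketched in a few lines. This is a minimal illustration in Python, not the project's own solution: the sample rows, the `split_by_program` helper, and the choice to count the launch day itself as "post" are all assumptions made for the example. Only the July 1st, 2019 launch date comes from the project description.

```python
from datetime import date

# Launch date of the program, taken from the project description.
PROGRAM_START = date(2019, 7, 1)

# Hypothetical sample rows: (purchase date, number of books purchased).
# These values are made up for illustration and are not the company's data.
sales = [
    (date(2019, 5, 22), 7),
    (date(2019, 6, 30), 3),
    (date(2019, 7, 1), 4),
    (date(2019, 11, 25), 5),
]


def split_by_program(rows, start=PROGRAM_START):
    """Total the books purchased before and after the program start date.

    Assumption: purchases made on the start date itself count as "post".
    """
    totals = {"pre": 0, "post": 0}
    for purchase_date, quantity in rows:
        period = "post" if purchase_date >= start else "pre"
        totals[period] += quantity
    return totals


totals = split_by_program(sales)
print(totals)
```

Labeling each sale as "pre" or "post" first keeps the comparison itself simple: the same grouping could later be combined with customer type or review sentiment, as the outline suggests.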