I barely graduated college, and that's okay

I didn’t do very well in high school. My grade point average was around a 2.5 out of 4. I did well in some subjects that I was interested in, like math, computer science, and history, but everything else was a wash. The less homework a class required me to do, the better my grade ended up being. In most classes I ended up watching the wall clock slowly tick towards the time when we... »
Vik Paruchuri in updates

What's New in Dataquest v1.9: Console, hotkeys and more!

Whenever you send us feedback or an idea for a feature, we read and catalogue your suggestions. We then use this to help plan features and improvements for Dataquest. Today we’re excited to launch two of our most-requested features: Hotkeys and a Python Console. Introducing the Python console: Many of you have told us that you’d like to be able to explore the datasets while you work through the missions. To help you do this,... »
Josh Devlin in updates

Pandas Tutorial: Data analysis with Python: Part 1

Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier. Pandas builds on packages like NumPy and matplotlib to give you a single, convenient place to do most of your data analysis and visualization work. In this introduction, we’ll use Pandas to analyze data on video game reviews from IGN, a popular... »
Vik Paruchuri in tutorials and python

NumPy Tutorial: Data analysis with Python

Don’t miss our FREE NumPy cheat sheet at the bottom of this post. NumPy is a commonly used Python data analysis package. By using NumPy, you can speed up your workflow and interface with other packages in the Python ecosystem, like scikit-learn, that use NumPy under the hood. NumPy was originally developed in the mid-2000s, and arose from an even older package called Numeric. This longevity means that almost every data analysis or machine... »
Vik Paruchuri in tutorials, python, and numpy

28 Jupyter Notebook tips, tricks and shortcuts

This post is based on a post that originally appeared on Alex Rogozhnikov’s blog, ‘Brilliantly Wrong’. We have expanded the post and will continue to do so over time. If you have a suggestion, please let us know in the comments. Thanks to Alex for graciously letting us republish his work here. Jupyter Notebook: Jupyter Notebook, formerly known as the IPython notebook, is a flexible tool that helps you create readable analyses, as you... »
Josh Devlin in resources and guides

Working with SQLite Databases using Python and Pandas

SQLite is a database engine that makes it simple to store and work with relational data. Much like the CSV format, SQLite stores data in a single file that can be easily shared with others. Most programming languages and environments have good support for working with SQLite databases. Python is no exception, and a library to access SQLite databases, called sqlite3, has been included with Python since version 2.5. In this post, we’ll walk through... »
Vik Paruchuri in tutorials, python, sqlite, and sql

Learn Python the right way in 5 steps

Python is an amazingly versatile programming language. You can use it to build websites, machine learning algorithms, and even autonomous drones. A huge percentage of programmers in the world use Python, and for good reason. It gives you the power to create almost anything. But – and this is a big but – you have to learn it first. Learning any programming language can be intimidating. I personally think that Python is better to learn... »
Vik Paruchuri in tutorials and python

18 places to find data sets for data science projects

This is the fifth post in a series of posts on how to build a Data Science Portfolio. You can find links to the others in this series at the bottom of the post. If you’ve ever worked on a personal data science project, you’ve probably spent a lot of time browsing the internet looking for interesting data sets to analyze. It can be fun to sift through dozens of data sets to find the... »
Vik Paruchuri in tutorials, python, portfolio, and project

Working with streaming data: Using the Twitter API to capture tweets

If you’ve done any data science or data analysis work, you’ve probably read in a CSV file or connected to a database and queried rows. A typical data analysis workflow involves retrieving stored data, loading it into an analysis tool, and then exploring it. This works well when you’re dealing with historical data, such as when analyzing which products a customer at your online store is most likely to purchase, or whether people’s diets changed in... »
Vik Paruchuri in tutorials, python, and data

The key to building a data science portfolio that will get you a job

This is the fourth post in a series of posts on how to build a Data Science Portfolio. You can find links to the other posts in this series at the bottom of the post. In the past few posts in this series, we’ve talked about how to build a data science project that tells a story, how to build an end-to-end machine learning project, and how to set up a data science blog.... »
Vik Paruchuri in tutorials, python, portfolio, and project

How I built a Slack bot to help me find an apartment in San Francisco

I moved from Boston to the Bay Area a few months ago. Priya (my girlfriend) and I heard all sorts of horror stories about the rental market. The fact that searching for “How to find an apartment in San Francisco” on Google yields dozens of pages of advice is a good indicator that apartment hunting is a painful process. Boston is cold, but finding an apartment in SF is scary: We read that landlords hold... »
Vik Paruchuri in tutorials, python, portfolio, and project

Building a data science portfolio: Machine learning project

This is the third in a series of posts on how to build a Data Science Portfolio. You can find links to the other posts in this series at the bottom of the post. Data science companies are increasingly looking at portfolios when making hiring decisions. One of the reasons for this is that a portfolio is the best way to judge someone’s real-world skills. The good news for you is that a portfolio is... »
Vik Paruchuri in tutorials, python, data, pandas, portfolio, and scikit

7 awesome data science newsletters to keep you informed

In a fast-paced and rapidly growing industry like data science, keeping up is essential. Knowing what is trending helps you decide which new tools to learn, helps you get a job, and much more. At the same time, there is so much content out there that it can be hard to know what to read and easy to be overwhelmed. The solution is to turn to email newsletters, which can help... »
Josh Devlin in guides, news, and resources

Building a data science portfolio: Making a data science blog

This is the second in a series of posts on how to build a Data Science Portfolio. You can find links to the other posts in this series at the bottom of the post. Blogging can be a fantastic way to demonstrate your skills, learn topics in more depth, and build an audience. There are quite a few examples of data science and programming blogs that have helped their authors land jobs or make important... »
Vik Paruchuri in tutorials, python, matplotlib, blog, data, pandas, and portfolio

Building a data science portfolio: Storytelling with data

This is the first in a series of posts on how to build a Data Science Portfolio. You can find links to the other posts in this series at the bottom of the post. Data science companies are increasingly looking at portfolios when making hiring decisions. One of the reasons for this is that a portfolio is the best way to judge someone’s real-world skills. The good news for you is that a portfolio is... »
Vik Paruchuri in tutorials, python, matplotlib, folium, data, pandas, and portfolio