Category Archives for "Data Science Tutorials"

Tutorial: Poisson Regression in R

Poisson Regression can be a really useful tool if you know how and when to use it. In this tutorial we’re going to take a long look at Poisson Regression, what it is, and how R programmers can use it in the real world. Specifically, we’re going to cover: What Poisson Regression actually is and […]

Tutorial: Time Series Analysis with Pandas

In this tutorial, we will learn about the powerful time series tools in the pandas library. Originally developed for financial time series such as daily stock market prices, the robust and flexible data structures in pandas can be applied to time series data in any domain, including business, science, engineering, public health, and many others. […]

Historic Wildfire Data: Exploratory Visualization in R

In recent weeks, the devastating wildfires sweeping parts of the US state of California have featured prominently in the news. While most wildfires are started accidentally by humans, weather conditions like wind and drought can exacerbate fires’ spread and intensity. Improved understanding of historical wildfire trends and causes can inform fire management and […]

Math in Data Science

Math is like an octopus: it has tentacles that can reach out and touch just about every subject. And while some subjects only get a light brush, others get wrapped up like a clam in the tentacles’ vice-like grip. Data science falls into the latter category. If you want to do data science, you’re going […]

Scikit-learn Tutorial: Machine Learning in Python

Scikit-learn is a free machine learning library for Python. It features various algorithms like support vector machines, random forests, and k-nearest neighbors, and it also works with Python numerical and scientific libraries like NumPy and SciPy. In this tutorial we will learn to code in Python and apply machine learning with the help of the scikit-learn library, which […]

Linear Regression in Real Life

This post was written by Carolina Bento. She leads Data Analytics teams that empower companies to make data-driven decisions, and currently manages the Product Analytics team at eero. This article was originally posted on Medium, and has been reposted with permission. We learn a lot of interesting and useful concepts in school but sometimes it’s not […]

Python Generators

Python generators are a powerful but misunderstood tool. They’re often treated as too difficult a concept for beginning programmers to learn, creating the illusion that beginners should hold off on learning generators until they are ready. I think this assessment is unfair, and that you can use generators sooner than you think. In this […]

Programming Best Practices For Data Science

The data science life cycle is generally comprised of the following components: data retrieval, data cleaning, data exploration and visualization, and statistical or predictive modeling. While these components are helpful for understanding the different phases, they don’t help us think about our programming workflow. Often, the entire data science life cycle ends up as an arbitrary […]

Data Retrieval and Cleaning: Tracking Migratory Patterns

Advancing your skills is an important part of being a data scientist. When starting out, you mostly focus on learning a programming language, using third-party tools properly, displaying visualizations, and understanding the theory behind statistical algorithms. The next step is to test your skills on more difficult data sets. Sometimes these data sets […]