Learn to do some text analysis in this Python tutorial, and test hypotheses using confidence intervals to ensure your conclusions are significant.
Learn text classification with linear regression in Python using the spaCy package in this free machine learning tutorial.
In this beginner Python tutorial, we’ll take a look at mutable and immutable data types, and learn how to keep dictionaries and lists from being modified by our functions.
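As a minimal sketch of the idea in this teaser (not the tutorial's own code), the hypothetical functions below contrast a function that mutates the list it receives with one that works on a copy. The function names and sample values are assumptions made for illustration.

```python
def add_item_in_place(items, value):
    # Lists are mutable, so this changes the caller's list as well
    items.append(value)
    return items


def add_item_safely(items, value):
    # Copy first, so the caller's list is left untouched
    new_items = list(items)
    new_items.append(value)
    return new_items


original = [1, 2, 3]

safe_result = add_item_safely(original, 4)
after_safe = list(original)  # snapshot of the list after the safe call

add_item_in_place(original, 4)  # this call does modify `original`
```

Copying with `list(items)` is a shallow copy, which is enough when the list holds immutable values such as numbers or strings.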
Poisson Regression can be a really useful tool if you know how and when to use it. In this tutorial we’re going to take a long look at Poisson Regression, what it is, and how R programmers can use it in the real world. Specifically, we’re going to cover: What Poisson Regression actually is and when we […]
Learn to do a complete data analysis project using only basic Python to find out what genre of apps an app developer should focus on.
In recent weeks, the devastating wildfires sweeping parts of the US state of California have featured prominently in the news. While most wildfires are started accidentally by humans, weather conditions like wind and drought can exacerbate fires’ spread and intensity. Improved understanding of historical wildfire trends and causes can inform fire management and […]
In this post, we’ll learn to create an online survey and how to prevent some common mistakes made in surveys. We’ll cover all steps of the survey process, including: selecting a population, sampling methods, making a data analysis plan, writing good questions, and distribution options. Data Scientists know that even the slickest code, the best data […]
Here’s an important fact that’s easy to forget: our data is only as helpful as it is understandable. Most of the time, that means creating some kind of data visualization. And while a simple bar graph might cut it for internal work, making your data both visually understandable and visually attractive can help it get […]
Learn to use Python dictionaries to store, sort, and access data in this in-depth tutorial analyzing craft beer data to master dictionary techniques.
Error metrics are short and useful summaries of the quality of our data. We dive into four common regression metrics and discuss their use cases.
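The teaser does not name its four metrics, so the sketch below assumes four widely used ones: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and the coefficient of determination (R²). The function name and sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    # Assumed metric set: MAE, MSE, RMSE, and R-squared
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    rmse = math.sqrt(mse)

    # R-squared compares the model's squared error to that of
    # always predicting the mean of the observed values
    mean_true = sum(y_true) / n
    ss_total = sum((t - mean_true) ** 2 for t in y_true)
    ss_residual = sum(e ** 2 for e in errors)
    r2 = 1 - ss_residual / ss_total

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up observations and predictions, for illustration only
metrics = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

MAE and RMSE are in the same units as the target, which often makes them easier to interpret than MSE.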
Explore statistics for data science by learning what probability is, how normal distributions work, and what the z-score measures, all within the context of analyzing wine data.
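As a rough illustration of the z-score mentioned here, the sketch below standardizes a small list of made-up values. It assumes the population standard deviation; the tutorial itself may use a different convention, and the function name is hypothetical.

```python
import statistics


def z_scores(values):
    # z = (value - mean) / standard deviation
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation (assumed)
    return [(v - mean) / stdev for v in values]


# Hypothetical values, not taken from the wine dataset
scores = z_scores([2.0, 4.0, 6.0])
```

A z-score of 0 marks a value equal to the mean, and the sign shows whether a value lies above or below it.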
Learn how to do descriptive statistics in Python with this in-depth tutorial that covers the basics (mean, median, and mode) and more advanced topics.
Python generators are a powerful but misunderstood tool. They’re often treated as too difficult a concept for beginning programmers to learn, creating the illusion that beginners should hold off on learning generators until they are ready. I think this assessment is unfair, and that you can use generators sooner than you think. In this […]
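For readers who have not met generators yet, here is a minimal sketch (not taken from the tutorial) of a generator function. The name `countdown` is a hypothetical example.

```python
def countdown(start):
    # Produces start, start - 1, ..., 1, one value at a time
    while start > 0:
        yield start
        start -= 1


gen = countdown(3)
first = next(gen)      # runs the function body up to the first yield
remaining = list(gen)  # consumes whatever values are left
```

Each value is computed only when it is requested, so a generator never needs to hold the whole sequence in memory.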
When learning the R language, predictive models are extremely useful for forecasting future outcomes and estimating metrics that are impractical to measure. For example, data scientists could use predictive models to forecast crop yields based on rainfall and temperature, or to determine whether patients with certain traits are more likely to react badly to a new […]
The Jupyter Notebook is an incredibly powerful tool for interactively developing and presenting data science projects. A notebook integrates code and its output into a single document that combines visualisations, narrative text, mathematical equations, and other rich media. The intuitive workflow promotes iterative and rapid development, making notebooks an increasingly popular choice at the heart […]
R is one of the most popular languages for statistical analysis, data science, and reporting. At Dataquest, we have been adding R courses (you can learn more in our recent update). For a comparison of R and Python, check out our analysis here. In this tutorial, we’ll teach you the basics of R by building […]
Docker Swarm is a clustering tool that turns a group of Docker hosts into a single virtual server. Docker Swarm ensures availability and high performance for your application by distributing it across the Docker hosts inside a cluster. Docker Swarm also allows you to increase the number of container instances for the same […]
Learning R for data science requires control structures. Control structures allow you to control the flow of execution of your code. They are extremely useful if you want to run a piece of code multiple times, or if you want to run a piece of code only if a certain condition is met. This tutorial is based on […]