Sitemap
Career Paths
Skill Paths
- SQL Fundamentals
- Python Basics for Data Analysis
- R Basics for Data Analysis
- Data Analysis and Visualization with Python
- Data Visualization with R
- APIs and Web Scraping with Python
- APIs and Web Scraping with R
- Machine Learning Introduction with Python
- Machine Learning Intermediate with Python
- Probability and Statistics with Python
- Probability and Statistics with R
- Data Cleaning with Python
Programming Language Courses
Helpful Information
Dataquest General Information
Courses Related to Python
- Python for Data Science: Fundamentals Part I
- Python for Data Science: Fundamentals Part II
- Python for Data Science: Intermediate
- Pandas and NumPy Fundamentals
- Data Visualization Fundamentals
- Storytelling Data Visualization and Information Design
- APIs and Web Scraping in Python
- Data Cleaning and Analysis
- Statistics Fundamentals
- Intermediate Statistics: Averages and Variability
- Probability: Fundamentals
- Conditional Probability
- Machine Learning Fundamentals
- Calculus For Machine Learning
- Linear Algebra For Machine Learning
- Linear Regression For Machine Learning
- Decision Trees
- Deep Learning Fundamentals
- Machine Learning Project
- Kaggle Fundamentals
- Machine Learning in Python: Intermediate
- Hypothesis Testing: Fundamentals
- Data Cleaning in Python: Advanced
- Data Cleaning Project Walkthrough
- Python for Data Engineering: Fundamentals Part I
- Python for Data Engineering: Fundamentals Part II
- Python Intermediate for Data Engineering
- Programming Concepts with Python
- Elements of the Command Line
- Text Processing in the Command Line
- Data Analysis in Business
- Functions: Advanced
- Command Line: Intermediate
- Git and Version Control
- Spark and Map-Reduce
Courses Related to R
- Introduction to Data Analysis in R
- Data Structures in R
- Control Flow, Iteration and Functions in R
- Specialized Data Processing in R: Strings and Dates
- Data Visualization in R
- APIs in R
- Web Scraping in R
- Statistics Fundamentals in R
- Statistics Intermediate in R: Averages and Variability
- Probability: Fundamentals in R
- Conditional Probability in R
- Hypothesis Testing in R
- Data Cleaning in R
- Data Cleaning in R: Advanced
- Linear Regression Modeling in R
- Machine Learning Fundamentals in R
- Introduction To Shiny
Courses Related to SQL
- SQL Fundamentals
- Introduction to SQL and Databases
- Intermediate SQL
- Intermediate SQL in R
- Filtering and Sorting Data in SQL
- Summarizing Data in SQL
- Combining Tables in SQL
Courses Related to Data Engineering
Lessons
- Multi-category chi-squared tests
- Challenge: Cleaning Data
- Significance Testing
- Challenge: Working with the Reddit API
- Challenge: Working with the Command Line
- Git Remotes
- Git Branches
- Merge Conflicts
- Challenge: Data Munging Using The Command Line
- Data Cleaning and Exploration Using Csvkit
- Project: Spark Installation and Jupyter Notebook Integration
- Guided Project: Git Installation and GitHub Integration
- Overfitting
- Machine Learning Project Walkthrough: Data Cleaning
- Machine Learning Project Walkthrough: Preparing the features
- Machine Learning Project Walkthrough: Making Predictions
- Data Cleaning Walkthrough
- Data Cleaning Walkthrough: Combining the Data
- Data Cleaning Walkthrough: Analyzing and Visualizing the Data
- Introduction to K-Nearest Neighbors
- Multivariate K-Nearest Neighbors
- Hyperparameter Optimization
- Cross Validation
- Guided Project: Predicting Car Prices
- Evaluating Model Performance
- Understanding Linear and Nonlinear Functions
- Understanding Limits
- Finding Extreme Points
- Linear Systems
- Vectors
- Matrix Algebra
- Optimizing Dataframe Memory Footprint
- Processing Dataframes in Chunks
- Guided Project: Practice Optimizing Dataframes and Processing in Chunks
- Augmenting Pandas With SQLite
- Guided Project: Analyzing Startup Fundraising Deals from Crunchbase
- Guided Project: Analyzing Stock Prices
- Solution Sets
- Joining Data in SQL
- Getting Started with Kaggle
- Feature Preparation, Selection, and Engineering
- Model Selection and Tuning
- Guided Project: Creating a Kaggle Workflow
- Intermediate Joins in SQL
- Building and Organizing Complex Queries
- Guided Project: Answering Business Questions using SQL
- Table Relations and Normalization
- Variables and Data Types
- Guided Project: Star Wars Survey
- Guided Project: Winning Jeopardy
- Guided Project: Predicting Bike Rentals
- Guided Project: Analyzing NYC High School Data
- The Linear Regression Model
- Feature Selection
- Gradient Descent
- Ordinary Least Squares
- Processing And Transforming Features
- Guided Project: Predicting House Sale Prices
- Representing Neural Networks
- Nonlinear Activation Functions
- Hidden Layers
- Guided Project: Building A Handwritten Digits Classifier
- Intro to Postgres
- Creating Tables
- Prepared Statements and SQL Injections
- Loading and Extracting Data with Tables
- User and Database Management
- Project: PostgreSQL Installation
- Guided Project: Storing Storm Data
- Introduction to SQL
- Summary Statistics
- Group Summary Statistics
- Subqueries
- Querying SQLite from Python
- Guided Project: Analyzing CIA Factbook Data Using SQL
- Exploring Postgres Internals
- Debugging Postgres Queries
- Using an Index
- Vacuuming Postgres Databases
- Advanced Indexing
- Functional Programming
- Pipeline Tasks
- Building a Pipeline Class
- Multiple Dependency Pipeline
- Guided Project: Hacker News Pipeline
- Creating Line Graphs
- Creating Multiple Line Graphs
- Bar Charts, Histograms, and Box Plots
- Scatter Plots for Exploratory Analysis
- Guided Project: Analyzing Forest Fire Data
- Guided Project: Answering Business Questions using SQL
- Sampling
- Variables in Statistics
- Frequency Distributions
- Visualizing Frequency Distributions
- Comparing Frequency Distributions
- Guided Project: Investigating Fandango Movie Ratings
- Introduction to NumPy
- Boolean Indexing with NumPy
- Introduction to pandas
- Exploring Data with Pandas Intermediate
- Data Cleaning Basics
- Guided Project: Exploring eBay Car Sales Data
- The Mean
- The Weighted Mean and the Median
- The Mode
- Measures of Variability
- Z-scores
- Guided Project: Finding the Best Markets to Advertise In
- Programming in Python
- Lists and For Loops
- Conditional Statements
- Dictionaries and Frequency Tables
- Functions – Fundamentals
- Functions – Intermediate
- Data Cleaning With R
- String Manipulation and Relational Data
- Correlations and Reshaping Data
- Dealing With Missing Data
- Guided Project: NYC Schools Perceptions
- Python Data Analysis Basics
- Data Aggregation
- Combining Data With Pandas
- Transforming Data With Pandas
- Working With Strings In Pandas
- Working With Missing And Duplicate Data
- Guided Project: Clean And Analyze Employee Exit Surveys
- Learn and Install Jupyter Notebook
- Guided Project: Profitable App Profiles for the App Store and Google Play Markets
- Cleaning and Preparing Data in Python
- Object-Oriented Python
- Working with Dates and Times in Python
- Regular Expressions Basics
- List Comprehensions and Lambda Functions
- Guided Project: Exploring Hacker News Posts
- Table Relations and Normalization
- Querying SQLite from R
- Advanced Regular Expressions
- Working with Missing Data
- Joining Data in SQL
- Intermediate Joins in SQL
- Building and Organizing Complex Queries
- Guided Project: Answering Business Questions using SQL
- Table Relations and Normalization
- Guided Project: Designing and Creating a Database
- Estimating Probabilities
- Probability Rules
- Solving Complex Probability Problems
- Permutations and Combinations
- Exploring Data With pandas: Fundamentals
- Mobile App for Lottery Addiction
- Introduction to the Command Line
- The Filesystem
- Glob Patterns and Wildcards
- Users and Permissions
- Getting Help and Reading Documentation
- File Inspection
- Text Processing
- Redirection and Pipelines
- Standard Streams and File Descriptors
- Simple Random Sampling
- Stratified Sampling
- Variables in Statistics
- Frequency Distributions
- Visualizing Frequency Distributions
- Comparing Frequency Distributions
- Regular Expressions Basics
- Advanced Regular Expressions
- Map and Anonymous Functions
- Working with Missing Data
- Working with Dates and Times in Python
- Guided Project: Exploring Hacker News Posts
- Estimating Probabilities
- Probability Rules
- Probabilities of Multiple Random Experiments
- Permutations and Combinations
- Mobile App for Lottery Addiction
- Guided Project: Investigating Fandango Movie Ratings
- Best Practices for Writing Functions
- Context Managers
- Introduction to Decorators
- Decorators: Advanced
- Programming in Python
- Variables and Data Types
- Lists and For Loops
- Conditional Statements
- Dictionaries
- Functions: Fundamentals
- Functions: Intermediate
- Project: Learn and Install Jupyter Notebook
- Guided Project: Profitable App Profiles for the App Store and Google Play Markets
- Conditional Probability: Fundamentals
- Conditional Probability: Intermediate
- Bayes’ Theorem
- The Naive Bayes Algorithm
- Guided Project: Building A Spam Filter With Naive Bayes
- Python Data Analysis Basics
- Object-Oriented Python
- Probability Distributions
- Hypothesis Testing
- Categorical Data and The Chi-Squared Test
- Multi-category chi-squared tests
- Guided Project: Winning Jeopardy
- The Mean
- The Weighted Mean and the Median
- The Mode
- Measures of Variability
- Z-scores
- Guided Project: Finding the Best Markets to Advertise In
- Binary And Positional Number Systems
- Encodings and Representing Text In A Computer
- Reading And Writing To Files
- Memory and Disk Usage
- Fundamentals of Modeling in R
- Bivariate Relationships — Correlation and Scatterplots
- Estimating the Coefficients and Fitting Linear Models
- Assessing the Accuracy of the Model
- Fitting Many Linear Models
- Guided Project: Predicting Condominium Sale Prices
- Joining Data in SQL
- Intermediate Joins in SQL
- Building and Organizing Complex Queries
- Fuzzy Language in Data Science
- Communicating Results
- Business Metrics
- Guided Project: Popular Data Science Questions
- Conditional Probability: Fundamentals
- Conditional Probability Continued
- Bayes’ Theorem
- The Naive Bayes Algorithm
- Guided Project: Building A Spam Filter With Naive Bayes in R
- Time Complexity of Algorithms
- Constant Time Complexity
- Logarithmic Time Complexity
- Sorting Algorithms
- Space Complexity
- Building Fast Queries on a CSV
- Introduction to Machine Learning Concepts
- Evaluating Model Performance
- Multivariate K-Nearest Neighbors in R
- Cross Validation in R
- Hyperparameter Optimization in R
- Guided Project: Predicting Car Prices
- Dataframes in R
- Control Flow in R
- Functions in R
- String Manipulation in R: Fundamentals
- Date and Time Manipulation in R: Fundamentals
- Guided Project: Creating An Efficient Data Analysis Workflow
- Introduction to Programming in R
- Data Manipulation with R: Basics
- Guided Project: Installing RStudio
- Vectors in R
- Matrices in R
- Lists in R
- Guided Project: Investigating COVID-19 Virus Trends
- Introduction to NumPy
- Arithmetic with NumPy Arrays
- Broadcasting NumPy Arrays
- Datasets and Boolean Indexing
- NumPy Datatypes
- Arithmetic Expressions and Variables in R
- Iterations in R
- Map Function in R
- Guided Project: Creating An Efficient Data Analysis Workflow (Part 2)
- Logical Expressions in R
- Working with APIs
- Line Charts
- Multiple plots
- Bar Plots And Scatter Plots
- Guided Project: Visualizing Earnings Based On College Majors
- Improving Plot Aesthetics
- Color, Layout, and Annotations
- Guided Project: Visualizing The Gender Gap In College Degrees
- Conditional Plots
- Intermediate APIs
- Guided Project: Implementing a Key-Value Database
- Web Scraping
- Processing Tasks With Stacks and Queues
- Effectively Using Arrays and Lists
- Sorting Arrays And Lists
- Searching Arrays And Lists
- Hash Tables
- CPU Bound Programs
- I/O Bound Programs
- Overcoming The Limitation of Threads
- Quickly Analyzing Data With Parallel Processing
- Guided Project: Analyzing Wikipedia Pages
- Histograms And Box Plots
- Overview of Recursion
- Introduction to Binary Trees
- Working with Binary Search Trees
- Implementing a Binary Heap
- Performance Boosts of Using a B-Tree
- Performance Boosts of Using a B-Tree II
- Introduction to Spark
- Transformations and Actions
- Challenge: Transforming Hamlet into a Data Set
- Guided Project: Predicting the stock market
- Working with Jupyter console
- Piping and redirecting output
- Introduction to Decision Trees
- Building a Decision Tree
- Spark DataFrames
- Applying Decision Trees
- Spark SQL
- Introduction to Random Forests
- Working with Programs
- Command Line Python Scripting
- Introduction to Git
- Chi-squared tests