Data Analyst Projects for Beginners: 20 Guided Projects to Build Your Portfolio (2026)
Here are 20 beginner data analyst projects to build a portfolio that gets you hired. They cover Python, SQL, R, Excel, Tableau, and Power BI, with step-by-step instructions for each.
Breaking into data analytics can be frustrating if you’re a complete beginner. Job postings want experience, and you can’t get experience without a job.
A well-built portfolio of data analyst projects, however, can help you close that gap. Many analysts land their first role without a single line of professional experience on their resume because their projects do the talking instead.
What you’ll find in this guide:
- 20 guided data analyst projects for beginners, including several you can complete in a browser with no setup required
- Projects covering six tools: Python, SQL, R, Excel, Tableau, and Power BI
- A recommended starting order, with the three best first projects clearly marked
- A framework for turning completed projects into a portfolio that gets noticed
- Free dataset sources for building your own independent projects
The entry-level market is competitive, and candidates who get interviews typically have 3–5 polished, well-documented projects. That range is a common sweet spot: enough to signal breadth, not so many that quality slips. At Dataquest, we’ve helped thousands of learners build exactly that through our Data Analyst in Python career path.
Before You Start: Tool Setup
Most projects in this list run inside Jupyter Notebook (for Python) or RStudio (for R). Neither requires a paid subscription, but both need a one-time local install before you can get started.
For Jupyter, the fastest route is installing Anaconda. It bundles Python and Jupyter together and takes about ten minutes. If you'd rather skip the install entirely, the Dataquest platform lets you run Python code directly in your browser for all hosted guided projects. For RStudio, download R from CRAN first, then install RStudio Desktop from Posit. Our guides on getting started with Jupyter and getting started with R and RStudio walk you through both installs step by step.
Table of Contents
- What makes a good data analyst portfolio project?
- Your quick start: The best first projects to build
- 20 Data Analyst Projects for Beginners
- How to make your projects portfolio-ready
- How to choose the right projects
- Free datasets for your own projects
- Your next steps
- Frequently Asked Questions
What makes a good data analyst portfolio project?

Not all projects are equal in the eyes of a recruiter. The ones that get attention share four qualities:
- They answer a real question. "I analyzed Airbnb data" is weak. "I analyzed Airbnb listings in three cities to find which property features predict higher nightly rates, then built a dashboard for hosts to benchmark their pricing" is strong. A project title that sounds like a question a manager would ask (*What drives customer churn at a telecom provider?*) signals business thinking, not just technical execution.
- They show the full workflow. Recruiters want to see that you can move from messy raw data to a clean, interpretable result. A project that only shows a finished dashboard, with no evidence of the cleaning and analysis behind it, leaves too many questions unanswered.
- They’re documented. At minimum, your project needs a clear README on GitHub that explains what you did, what data you used, and what you found. Many hiring managers check GitHub before an interview. A repository with no documentation is almost as bad as no repository at all.
- They demonstrate the tools employers actually use. SQL and Python appear in the vast majority of data analyst job postings. Excel and a BI tool (Power BI or Tableau) round out the core toolkit. A balanced portfolio covers all of these.
The best first data analyst projects to build
If you’re new to data analytics and not sure where to begin, start with these three projects. Together they cover the core workflow across two different tools, and they’re all completable in a browser without any local setup.
Exploratory Data Analysis
Profitable App Profiles: You're working with real app store data, asking a genuine business question, and practicing the fundamentals of exploratory data analysis. It produces a clear, explainable result you can talk through in an interview.
Data Cleaning (Python/pandas)
Clean and Analyze Employee Exit Surveys: Data cleaning is often a big chunk of a real analyst's day-to-day work, and this project gives you an honest introduction to it. You'll combine messy datasets from two sources and surface insights about why employees resign — the kind of HR analysis that appears in real businesses constantly.
SQL
Analyzing Kickstarter Projects: SQL is the most universally required skill in data analyst job postings, and this project gives you a focused, beginner-friendly context to practice it on real data.
Complete these three first. Then use the full list below to add depth and tool variety to your portfolio.
20 Data Analyst Projects for Beginners
Each project below includes a difficulty level, estimated completion time, and a note on its portfolio value, so you can plan which ones to tackle and in what order.
The projects cover six tools: Python, R, Excel, Tableau, Power BI, and SQL. You don’t need to complete all 20. A strong portfolio typically has 3–5 well-documented projects that together demonstrate range. Use the tool context in each project header to build a balanced set.
1. eBay Car Sales
Difficulty: Beginner | Time: 5–7 hours | Tools: Python, pandas, Jupyter Notebook
Portfolio value: Real scraped data is messier than anything you'd find in a textbook, and cleaning it is exactly what employers want to see you do. This project gives you a genuine data wrangling story to tell in interviews.
Overview
You're working with a dataset scraped from eBay Kleinanzeigen, the classifieds section of the German eBay site. The data is raw, inconsistent, and full of the kinds of issues you'd encounter in a real analyst role: text in numeric columns, outliers, and fields that need parsing before they're usable. Your job is to clean it and surface insights about used car pricing patterns across makes, models, and mileage bands.
Start the eBay Car Sales project →
Tools and Technologies
- Python
- pandas
- Jupyter Notebook
Prerequisites
- Pandas fundamentals: selecting, filtering, and transforming data
- Basic data cleaning: handling nulls, fixing data types, removing outliers
Step-by-Step Instructions
- Load and explore the raw eBay dataset
- Clean column names and fix data types (price, odometer, registration year)
- Remove outliers and irrelevant listings
- Analyze price distributions across top car brands
- Explore the relationship between mileage, age, and price
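The cleaning steps above can be sketched in pandas. This is a minimal illustration with invented sample rows; the column names and formats are assumptions modeled on typical scraped listings data, not the exact dataset schema.

```python
import pandas as pd

# Toy stand-in for the scraped listings; real data has many more columns
autos = pd.DataFrame({
    "price": ["$5,000", "$1,200", "$99,999,999"],
    "odometer": ["150,000km", "70,000km", "5,000km"],
    "registration_year": [2004, 2012, 1800],
})

# Strip currency/unit symbols so text columns become numeric
autos["price"] = (autos["price"]
                  .str.replace("$", "", regex=False)
                  .str.replace(",", "", regex=False)
                  .astype(int))
autos["odometer"] = (autos["odometer"]
                     .str.replace("km", "", regex=False)
                     .str.replace(",", "", regex=False)
                     .astype(int))

# Drop outliers: implausible prices and registration years
autos = autos[autos["price"].between(500, 100_000)]
autos = autos[autos["registration_year"].between(1950, 2024)]
print(autos)
```

The same pattern (strip, cast, filter) covers most of the cleaning work in this project.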
Expected Outcomes
- Experience cleaning a real-world, scraped dataset from scratch
- Practice identifying and handling outliers and inconsistent formatting
- Ability to derive pricing insights from exploratory analysis
2. Profitable App Profiles for the App Store and Google Play Markets
Difficulty: Beginner | Time: 4–6 hours | Tools: Python, Jupyter Notebook
Portfolio value: A strong first portfolio project. You’re working with real data, answering a business question, and practicing EDA fundamentals—the combination recruiters most want to see from a junior candidate.
Overview
You’re a data analyst for a company building ad-supported mobile apps. Using historical data from the Apple App Store and Google Play Store, you’ll identify which app categories attract the most users. You’ll clean the data, use frequency tables to find patterns, and make concrete recommendations. This is the kind of output a real product team would actually use.
Start the App Profiles project →
Tools and Technologies
- Python
- Jupyter Notebook
Prerequisites
- Writing Python functions with arguments, return statements, and control flow
- Using conditional logic and loops
- Working in Jupyter Notebook
Step-by-Step Instructions
- Open and explore the App Store and Google Play datasets
- Clean the datasets by removing non-English apps and duplicate entries
- Isolate the free apps for further analysis
- Determine the most common app genres using frequency tables
- Make recommendations on the ideal app profiles to maximize users and revenue
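The frequency-table step can be sketched in plain Python, which is what this project uses (no pandas). The sample rows and genre labels below are made up for illustration.

```python
# Each row is [app_name, genre, price] — a simplified stand-in
# for the real App Store / Google Play rows
rows = [
    ["Facebook", "Social Networking", "0"],
    ["Notes", "Productivity", "0"],
    ["Slack", "Productivity", "0"],
    ["Candy Crush", "Games", "0"],
]

def freq_table(dataset, index):
    """Return {value: percentage} for one column of a list-of-lists."""
    counts = {}
    for row in dataset:
        value = row[index]
        counts[value] = counts.get(value, 0) + 1
    total = len(dataset)
    return {value: 100 * n / total for value, n in counts.items()}

genres = freq_table(rows, 1)
print(genres)  # Productivity accounts for 50% of this tiny sample
```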
Expected Outcomes
- Cleaning real-world data to prepare it for analysis
- Analyzing app market data to identify trends and success factors
- Using data insights to inform business strategy and decision-making
- Communicating findings and recommendations to stakeholders
3. Exploring Hacker News Posts
Difficulty: Beginner | Time: 4–6 hours | Tools: Python, Jupyter Notebook
Portfolio value: Shows you can extract meaningful patterns from unstructured, real-world data—a skill that comes up constantly in analyst roles involving user behavior or content performance.
Overview
Hacker News is one of tech’s most active communities. This project has you analyze a dataset of posts to find out what drives engagement: which post types get the most comments, and what time of day is best to post. You’ll use Python string manipulation, object-oriented programming, and the datetime library — skills that transfer directly to any role involving behavioral data.
Start the Hacker News project →
Tools and Technologies
- Python
- Jupyter Notebook
Prerequisites
- Python data cleaning skills: loops, string methods, the datetime library
- Formatting dates and times with strftime
Step-by-Step Instructions
- Remove headers from a list of lists
- Extract ‘Ask HN’ and ‘Show HN’ posts
- Calculate the average number of comments for each post type
- Find the number of ‘Ask HN’ posts and average comments by hour created
- Sort and print values from a list of lists
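The comments-by-hour step can be sketched like this. The timestamps and comment counts below are invented; the format string follows the "M/D/YYYY H:MM" style the dataset uses.

```python
from datetime import datetime

# (created_at, num_comments) pairs — invented sample data
ask_posts = [
    ("8/16/2016 9:55", 6),
    ("11/22/2015 13:43", 29),
    ("5/2/2016 13:14", 1),
]

counts_by_hour = {}
comments_by_hour = {}
for created_at, n_comments in ask_posts:
    # Parse the timestamp, then keep only the hour of day
    hour = datetime.strptime(created_at, "%m/%d/%Y %H:%M").strftime("%H")
    counts_by_hour[hour] = counts_by_hour.get(hour, 0) + 1
    comments_by_hour[hour] = comments_by_hour.get(hour, 0) + n_comments

# Average comments per post for each posting hour
avg_by_hour = {h: comments_by_hour[h] / counts_by_hour[h]
               for h in counts_by_hour}
print(avg_by_hour)
```

Sorting `avg_by_hour` by value then gives you the best hours to post.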
Expected Outcomes
- Applying Python string manipulation and date handling to real-world data
- Analyzing trends in user submissions on a live platform
- Identifying factors that drive post popularity and engagement
- Communicating insights derived from behavioral data
Resources
- Data Analyst in Python career path
- Original Hacker News dataset on Kaggle
- Hacker News
- Example Solution Code
4. Clean and Analyze Employee Exit Surveys
Difficulty: Beginner–Intermediate | Time: 6–8 hours | Tools: Python, pandas, Jupyter Notebook
Portfolio value: Data cleaning is the skill most underrepresented in beginner portfolios and the one most valued by hiring managers. This project gives you a legitimate data cleaning story to tell in interviews.
Overview
About 80% of real analyst work is preparing data for analysis, not running it. This project puts you in that reality. You’re working with employee exit survey data from two Australian government institutes—messy, inconsistent, and spread across multiple files. You’ll combine them, clean them, and surface insights about resignation patterns. It’s unglamorous work that separates serious candidates from those who’ve only done textbook exercises.
Start the Employee Exit Surveys project →
Tools and Technologies
- Python
- pandas
- Jupyter Notebook
Prerequisites
- Pandas fundamentals: groupby, concat, merge
- Handling missing data and manipulating strings in pandas
Step-by-Step Instructions
- Load and explore the DETE and TAFE exit survey data
- Identify missing values and drop unnecessary columns
- Clean and standardize column names across both datasets
- Filter the data to include only resignation reasons
- Verify data quality and create new columns for analysis
- Combine the cleaned datasets for further analysis
- Analyze the combined data to identify resignation trends
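The standardize-and-combine steps above can be sketched in pandas. The column names here are simplified stand-ins for the real DETE and TAFE survey fields.

```python
import pandas as pd

# Two toy frames with mismatched schemas, mimicking the two surveys
dete = pd.DataFrame({
    "separationtype": ["Resignation-Other reasons", "Age Retirement"],
    "cease_date": ["05/2012", "2013"],
})
tafe = pd.DataFrame({
    "Reason for ceasing employment": ["Resignation", "Retirement"],
    "CESSATION YEAR": [2011.0, 2012.0],
})

# Standardize column names so both frames share one schema
dete = dete.rename(columns={"separationtype": "separation_type"})
tafe = tafe.rename(columns={"Reason for ceasing employment": "separation_type",
                            "CESSATION YEAR": "cease_date"})

# Keep only resignations, tag the source, then stack the two frames
dete_res = dete[dete["separation_type"].str.startswith("Resignation")].copy()
tafe_res = tafe[tafe["separation_type"] == "Resignation"].copy()
dete_res["institute"] = "DETE"
tafe_res["institute"] = "TAFE"

combined = pd.concat([dete_res, tafe_res], ignore_index=True)
print(combined)
```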
Expected Outcomes
- Cleaning real-world, messy HR data from multiple sources
- Applying core data cleaning techniques in Python and pandas
- Conducting exploratory analysis on combined datasets
- Summarizing findings and sharing data-driven recommendations
5. Star Wars Survey
Difficulty: Beginner | Time: 4–6 hours | Tools: Python, pandas, Jupyter Notebook
Portfolio value: Survey analysis is one of the most common real-world analyst tasks. This project shows you can work with opinion data, handle inconsistent formats, and slice results by demographics—skills that translate directly to customer research and product analytics roles.
Overview
FiveThirtyEight surveyed over 1,000 Star Wars fans on their film preferences. Your job is to clean the raw survey data (which is messy by design) and uncover how preferences vary by gender, age, and region. It’s a genuinely fun dataset that also teaches the kind of data handling you’ll do in real market research and user research contexts.
Start the Star Wars Survey project →
Tools and Technologies
- Python
- pandas
- Jupyter Notebook
Prerequisites
- Pandas data cleaning skills: joins, applying functions over columns, handling missing values
- Analyzing survey data with pandas
Step-by-Step Instructions
- Map Yes/No columns to Boolean values to standardize the data
- Convert checkbox columns to lists and standardize their format
- Clean and rename the ranking columns
- Identify the highest-ranked and most-viewed films
- Analyze preferences by gender, age, and location
- Summarize findings on fan preference differences between groups
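The Yes/No standardization step looks like this in pandas. The column name and sample responses are illustrative, not the exact survey fields.

```python
import pandas as pd

# Toy survey column with the messiness you'd expect: Yes/No plus a blank
survey = pd.DataFrame({
    "seen_any_film": ["Yes", "No", "Yes", None],
})

# Map text responses to Booleans; unmapped values (None) become NaN
yes_no = {"Yes": True, "No": False}
survey["seen_any_film"] = survey["seen_any_film"].map(yes_no)
print(survey["seen_any_film"].tolist())
```

Boolean columns make the later demographic slicing simple: `survey[survey["seen_any_film"] == True]` filters to respondents who have seen a film.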
Expected Outcomes
- Experience cleaning and analyzing a real-world, messy survey dataset
- Hands-on practice with pandas data manipulation
- Understanding of how to analyze survey data for business insights
Resources
- FiveThirtyEight’s Star Wars Survey Data
- Example Solution Code
- Dataquest Community: Star Wars Survey Project
- Introduction to Python Programming
- Step-by-step project tutorial
6. Word Raider
Difficulty: Beginner | Time: 3–4 hours | Tools: Python, Jupyter Notebook
Portfolio value: While not a data analysis project, this is worth completing early in your Python learning. Building an interactive application from scratch using loops, conditionals, and file handling gives you a feel for Python logic that makes later data projects much easier to approach.
Overview
You’re building an interactive word-guessing game using only core Python concepts: loops, conditionals, and file handling. No data analysis happens here, but the programming fluency you develop does. Think of this as your Python confidence-builder before tackling the heavier cleaning and analysis projects.
Start the Word Raider project →
Tools and Technologies
- Python
- Jupyter Notebook
Prerequisites
- Python basics: variables, data types, functions
- Loops, conditional statements, string manipulation
- Basic understanding of object-oriented programming
Step-by-Step Instructions
- Build the word bank by reading words from a text file into a Python list
- Set up variables to track the game state
- Implement functions to receive and validate user input
- Create the game loop, checking guesses and providing feedback
- Update the game state after each guess and check for win/loss
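The game loop described above can be sketched like this. Real play would read guesses with input(); here they come from a list so the logic is easy to follow, and the target word and feedback categories are illustrative choices, not the project's exact design.

```python
target_word = "python"
guesses = ["parrot", "python"]  # simulated user input
max_turns = 5

misplaced, incorrect = set(), set()
won = False
for turn, guess in enumerate(guesses[:max_turns], start=1):
    if guess == target_word:
        won = True
        break
    # Feedback: letters in the right spot are skipped; others are
    # sorted into "right letter, wrong spot" vs. "not in the word"
    for i, letter in enumerate(guess):
        if i < len(target_word) and target_word[i] == letter:
            continue
        elif letter in target_word:
            misplaced.add(letter)
        else:
            incorrect.add(letter)

print("won" if won else "lost", "after", turn, "turn(s)")
```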
Expected Outcomes
- Stronger proficiency in fundamental Python programming concepts
- Experience building an interactive application from scratch
- Practice with file I/O, data structures, and basic OOP
Resources
- Junior Data Analyst career path
- Example Solution Code
- How to Become a Junior Data Analyst
- Dataquest Community: My Take on the Word Raider Game
- Selecting and Installing an IDE
- Step-by-step project tutorial
7. Credit Card Customer Segmentation
Difficulty: Intermediate | Time: 6–8 hours | Tools: Python, pandas, scikit-learn, Jupyter Notebook
Portfolio value: Segmentation is one of the most commonly assigned tasks in marketing and product analytics. This project shows you can go beyond descriptive analysis to produce something a business would actually act on.
Overview
You're a data scientist at a credit card company. Using behavioral data on 8,000+ customers, including purchase frequency, credit limits, and payment patterns, you'll segment customers into distinct groups using K-means clustering. The goal isn't just to run the algorithm; it's to interpret each segment and recommend how the business should engage with them differently. That analysis-to-recommendation workflow is exactly what separates strong analyst candidates from the pack.
Start the Credit Card Segmentation project →
Tools and Technologies
- Python
- pandas
- scikit-learn
- Jupyter Notebook
Prerequisites
- pandas fundamentals: groupby, aggregation, data cleaning
- Basic understanding of machine learning concepts
- Intro to supervised and unsupervised learning
Step-by-Step Instructions
- Load and explore the credit card customer dataset
- Clean and preprocess the data for clustering
- Apply K-means clustering to segment customers
- Visualize the segments using dimensionality reduction
- Interpret each cluster and develop tailored business recommendations
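The clustering steps can be sketched with scikit-learn. The synthetic data below stands in for the real customer features (purchase frequency, credit limit, and so on); the feature names are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Two synthetic customer groups: [purchase_frequency, credit_limit]
rng = np.random.default_rng(42)
low_spenders = rng.normal([5, 1000], [2, 200], size=(50, 2))
high_spenders = rng.normal([40, 9000], [5, 500], size=(50, 2))
X = np.vstack([low_spenders, high_spenders])

# Scale first: K-means is distance-based, so an unscaled feature with
# a large range (credit limit) would dominate the clustering
X_scaled = StandardScaler().fit_transform(X)
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)

# Interpret each cluster by its average raw (unscaled) feature values
for label in (0, 1):
    center = X[model.labels_ == label].mean(axis=0)
    print(f"cluster {label}: avg frequency {center[0]:.1f}, "
          f"avg limit {center[1]:.0f}")
```

In the real project you'd also use the elbow method to choose the number of clusters rather than fixing it at two.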
Expected Outcomes
- Hands-on experience applying K-means clustering to real business data
- Ability to interpret and communicate cluster results to a non-technical audience
- Practice translating analysis findings into actionable business strategy
8. Creating an Efficient Data Analysis Workflow
Difficulty: Beginner–Intermediate | Time: 5–7 hours | Tools: R, RStudio
Portfolio value: Shows R fluency and the ability to structure analysis into a reproducible workflow—a skill valued in academic, healthcare, and research-adjacent analyst roles.
Overview
You’re hired by a programming book company to analyze their sales data. Which titles are most profitable? You’ll use R’s control flow, loops, and custom functions to build a clean, repeatable analysis workflow and write up your findings in a structured report.
Start the Data Analysis Workflow project →
Tools and Technologies
- R
- RStudio
Step-by-Step Instructions
- Explore the book sales dataset
- Transform and prepare the data for analysis
- Analyze the cleaned data to identify top-performing titles
- Summarize findings in a structured report
- Provide data-driven recommendations to stakeholders
Expected Outcomes
- Applying R programming to real-world sales data
- Developing a reproducible data analysis workflow
- Translating analysis findings into actionable business insights
9. Creating an Efficient Data Analysis Workflow, Part 2
Difficulty: Intermediate | Time: 6–8 hours | Tools: R, RStudio, dplyr, stringr, lubridate
Portfolio value: This project introduces the R tidyverse ecosystem—the package suite most used in production R analysis. It also covers time-based analysis, which is a common real-world requirement in sales, finance, and operations analytics.
Overview
A book company launched a new sales initiative on July 1, 2019. Did it work? Your job is to find out. Using the tidyverse packages dplyr, stringr, and lubridate, you’ll clean and process 2019 sales data, analyze sentiment in customer reviews, and compare key metrics before and after the program launch.
Start the Data Analysis Workflow Part 2 project →
Tools and Technologies
- R, RStudio
- dplyr, stringr, lubridate
Prerequisites
- Specialized data processing in R: stringr, lubridate, the map function, regular expressions
Step-by-Step Instructions
- Load and explore the 2019 sales data
- Clean the data by handling missing values and inconsistencies
- Process text reviews to determine positive/negative sentiment
- Compare key metrics before and after the program launch
- Analyze differences in sales between customer segments
Expected Outcomes
- Applying tidyverse packages to manipulate and process real-world data
- Analyzing sales data to quantify the impact of a business initiative
- Translating findings into meaningful business insights
10. Preparing Data with Excel
Difficulty: Beginner | Time: 3–5 hours | Tools: Excel
Portfolio value: Excel proficiency is expected in nearly every analyst role, particularly in non-tech industries. This project demonstrates the skill that gets overlooked most in beginner portfolios: preparing data before analysis, not just analyzing it.
Overview
You’re a data professional at a marine biology research organization. Your task: prepare a raw dataset on shark attacks so that the analysis team can start working with it. Using Excel, you’ll import the data, organize it into tables, remove duplicates, handle missing values, and fix inconsistencies. This is the exact workflow you’d repeat dozens of times a year in a real analyst role.
Start the Preparing Data with Excel project →
Tools and Technologies
- Excel
Prerequisites
- Preparing data in Excel: importing data, organizing worksheets, removing duplicates, handling missing values
Step-by-Step Instructions
- Import the raw shark attack data into an Excel workbook
- Organize the data into worksheets and tables
- Clean the data by removing duplicates and fixing inconsistencies
- Consolidate shark attack data from multiple sources into a single table
Expected Outcomes
- Hands-on experience in data preparation and cleaning using Excel
- Skills for importing, organizing, and cleaning data for analysis
- Ability to consolidate data from multiple sources into an analysis-ready format
11. Visualizing the Answer to Stock Questions Using Spreadsheet Charts
Difficulty: Beginner | Time: 4–6 hours | Tools: Excel
Portfolio value: Data visualization in Excel is the most portable skill in the analyst toolkit; it works in every industry and doesn’t require software licenses. This project demonstrates that you can communicate findings visually, not just analyze data numerically.
Overview
Using historical stock data from Microsoft, Amazon, Intel, and Bitcoin, you’ll build a report that answers specific business questions through charts. You’ll apply information design principles, including Gestalt theory and pre-attentive attributes, to make your visualizations genuinely clear and persuasive, not just technically correct.
Start the Stock Visualization project →
Tools and Technologies
- Excel
- Data visualization
- Information design principles
Prerequisites
- Visualizing data in Excel: creating chart types, applying design principles, using Gestalt principles
Step-by-Step Instructions
- Import the dataset to an Excel spreadsheet
- Create a report using data visualizations and tabular data
- Represent the data using effective chart types
- Apply Gestalt principles and pre-attentive attributes
- Maximize data-ink ratio in all visualizations
Expected Outcomes
- Analyzing real-world stock market data in Excel
- Applying information design principles to create effective visualizations
- Developing in-demand data visualization and communication skills
Resources
- Microsoft Stock Data - Kaggle
- Amazon Stock Data - Kaggle
- INTEL Stock Data - Kaggle
- Bitcoin Historical Data - Kaggle
- Introduction to Data Analysis with Excel Skill Path
12. Identifying Customers Likely to Churn for a Telecommunications Provider
Difficulty: Beginner–Intermediate | Time: 5–7 hours | Tools: Excel
Portfolio value: Customer churn analysis is one of the most commonly assigned tasks in business analytics roles. This project gives you a concrete example to reference in interviews when asked about your experience with business impact analysis.
Overview
Customer churn, when users cancel a service, costs telecom companies significant revenue each year. You’re tasked with exploring customer data in Excel to profile who’s most likely to churn. You’ll calculate descriptive statistics, build PivotTables, and create histograms and boxplots that tell a data story clear enough to hand directly to a retention team.
Start the Customer Churn project →
Tools and Technologies
- Excel
Prerequisites
- Exploring data in Excel: descriptive statistics, PivotTables, histograms, boxplots
Step-by-Step Instructions
- Import the customer dataset into Excel
- Calculate descriptive statistics for key metrics
- Create PivotTables, histograms, and boxplots to explore data differences
- Analyze and identify profiles of likely churners
- Compile a report with your data visualizations
Expected Outcomes
- Practice analyzing a real-world customer dataset in Excel
- Experience building PivotTables and charts to uncover insights
- Skill in translating analysis findings into an actionable report
13. Data Prep in Tableau
Difficulty: Beginner | Time: 3–5 hours | Tools: Tableau
Portfolio value: Tableau proficiency is a differentiator in many business analyst and BI roles. This project gives you a solid foundation in the most important step of any Tableau workflow: getting your data ready before you build a single chart.
Overview
Your client is Dataquest itself. Using real online learning platform data, you’ll connect to Excel files, import tables into Tableau, and define relationships between tables to build a data model that can answer questions about student engagement and course performance. Clean, well-modeled data is what makes great dashboards possible. This project teaches you to get that foundation right.
Start the Tableau Data Prep project →
Tools and Technologies
- Tableau
Prerequisites
- Preparing data in Tableau: connecting to data sources, importing tables, defining relationships, cleaning and filtering
Step-by-Step Instructions
- Connect to the provided Excel file with student engagement data
- Import tables into Tableau and define relationships between them
- Clean and filter the data to handle missing values and inconsistencies
- Save the prepared data source for visualization
- Reflect on how data preparation affects the quality of analysis
Expected Outcomes
- Practice with core data preparation techniques in Tableau
- Ability to connect to, import, and combine data from multiple tables
- Readiness to build visualizations and dashboards from a clean data model
Resources
- Download the dataset
- Data Visualization with Tableau Skill Path
- Tableau Documentation
- Tableau Community Forums
14. Business Intelligence Plots
Difficulty: Intermediate | Time: 5–7 hours | Tools: Tableau
Portfolio value: Interactive dashboards are what most BI and analyst roles produce as their primary deliverable. This project gives you a realistic simulation of that work, complete with calculated fields and dual-axis charts.
Overview
You’re a data visualization consultant for Adventure Works. Leadership wants to understand the performance gap between their online and offline sales channels. You’ll build interactive Tableau dashboards using calculated fields, filters, dual-axis charts, and embedded visualizations in tooltips: the advanced techniques that separate polished BI work from basic charts.
Start the Business Intelligence Plots project →
Tools and Technologies
- Tableau
Prerequisites
- Data visualization fundamentals in Tableau: chart types, calculated fields, filters
Step-by-Step Instructions
- Compare online vs offline orders using visualizations
- Analyze products across channels with scatter plots
- Embed visualizations in tooltips for added insight
- Summarize findings and identify next steps
Expected Outcomes
- Experience building interactive BI dashboards in Tableau
- Ability to create calculated fields for tailored analysis
- Skill in developing visualizations that surface actionable insights
15. Data Presentation
Difficulty: Intermediate | Time: 5–7 hours | Tools: Tableau
Portfolio value: The ability to present data findings to leadership, not just analyze them, is consistently cited as a gap in junior analysts. This project produces a dashboard you could genuinely show in an interview as evidence of stakeholder communication skills.
Overview
Using conversion funnel data, you’ll build an interactive Tableau dashboard for a company’s leadership team that answers: which marketing channels, locations, and customer personas drive the most value? You’ll add filters, dashboard actions, and annotations to make the dashboard genuinely usable, not just visually appealing.
Start the Data Presentation project →
Tools and Technologies
- Tableau
Prerequisites
- Sharing insights in Tableau: bar charts, line graphs, color and size encoding, dashboard filters and actions
Step-by-Step Instructions
- Import and clean the conversion funnel data in Tableau
- Build basic charts to visualize key metrics
- Create interactive dashboards with filters and actions
- Add annotations and highlights to emphasize key insights
- Compile a professional dashboard to present findings
Expected Outcomes
- Analyzing conversion funnel data to surface actionable insights
- Applying data visualization best practices to create impactful dashboards
- Communicating data-driven findings to stakeholders
16. Modeling Data in Power BI
Difficulty: Beginner–Intermediate | Time: 5–7 hours | Tools: Power BI
Portfolio value: Power BI is the dominant BI tool in enterprise environments, particularly in companies already using Microsoft 365. This project covers the foundational workflow (connecting data, building a model, and writing DAX) that appears in most Power BI analyst roles.
Overview
You’re an analyst at a company that sells scale model cars. Using their sales records database in Power BI, you’ll clean and transform the data, design a data model, and create calculated columns and measures with DAX (Data Analysis Expressions). The goal is a dashboard that helps leadership understand sales performance and make better decisions.
Start the Modeling Data in Power BI project →
Tools and Technologies
- Power BI
Prerequisites
- Modeling data in Power BI: data models, table properties, calculated columns and measures using DAX
Step-by-Step Instructions
- Import the sales data into Power BI
- Clean and transform the data for analysis
- Design a basic data model
- Create calculated columns and measures using DAX
- Build visualizations to extract insights from the data
Expected Outcomes
- Hands-on experience modeling and analyzing real-world sales data in Power BI
- Ability to clean, transform, and prepare data for analysis
- Experience creating DAX measures and extracting business insights
Resources
- Analyzing Data with Microsoft Power BI Skill Path
- Official Power BI documentation
- A CEO Explains How to Learn Power BI
17. Visualization of Life Expectancy and GDP Variation Over Time
Difficulty: Intermediate | Time: 5–7 hours | Tools: Power BI
Portfolio value: This project produces a compelling dashboard using the Gapminder dataset—recognizable to most data professionals and a strong conversation starter in interviews. Publishing it demonstrates you can take work to completion.
Overview
How have life expectancy and GDP changed across countries over decades? Using the Gapminder dataset in Power BI, you’ll build interactive scatter plots and stacked column charts to answer that question. You’ll walk through the full report development lifecycle: loading data, cleaning it, building visualizations, and publishing your work. You’ll end with a dashboard ready to add to your portfolio.
Start the Life Expectancy and GDP project →
Tools and Technologies
- Power BI
Prerequisites
- Visualizing data in Power BI: basic visuals, accessible report layouts, themes, publishing
Step-by-Step Instructions
- Import the life expectancy and GDP data into Power BI
- Clean and transform the data for analysis
- Create interactive scatter plots and stacked column charts
- Design an accessible report layout
- Customize visual markers and themes, then publish the report
Expected Outcomes
- Applying visualization techniques in Power BI to global data
- Creating a dashboard that demonstrates the full report development lifecycle
- Publishing work to showcase your Power BI skills
Resources
- Analyzing Data with Microsoft Power BI Skill Path
- Official Power BI Support
- Official Power BI documentation
- Why Business Analysts Need to Learn Power BI
18. Building a BI App
Difficulty: Intermediate | Time: 6–8 hours | Tools: Power BI
Portfolio value: This is one of the strongest Power BI portfolio pieces in this list. You’re working with NPS (Net Promoter Score) data, a metric used in virtually every customer-facing business, and producing recommendations, not just charts.
Overview
You’re a BI analyst at Dataquest. Using Power BI, you’ll import and model data on course completion rates and Net Promoter Scores, then build KPI metrics, line charts, and scatter plots to analyze course quality trends. The deliverable isn’t just a dashboard; it’s a recommendation for which courses need improvement, backed by the data. That combination of analysis and business recommendation is exactly what hiring managers want to see.
Start the Building a BI App project →
Tools and Technologies
- Power BI
Prerequisites
- Managing workspaces and datasets in Power BI: creating workspaces, dynamic reports with parameters, row-level security
Step-by-Step Instructions
- Import and explore the course completion and NPS data
- Create a data model relating the fact and dimension tables
- Write calculations for key metrics and validate results
- Design and build visualizations to analyze course performance
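The key metric in step three is NPS itself, which is defined as the percentage of promoters (ratings of 9–10) minus the percentage of detractors (ratings of 0–6). Before building the measure in Power BI, it can help to sanity-check the arithmetic on a small sample. A minimal Python sketch with made-up ratings (not real course data):

```python
# NPS = % promoters (ratings 9-10) minus % detractors (ratings 0-6).
# The ratings below are illustrative, not actual survey responses.
ratings = [10, 9, 9, 8, 7, 10, 6, 3, 9, 10]

promoters = sum(1 for r in ratings if r >= 9)
detractors = sum(1 for r in ratings if r <= 6)
nps = (promoters - detractors) / len(ratings) * 100

print(f"NPS: {nps:.0f}")  # 6 promoters, 2 detractors out of 10 -> NPS: 40
```

The same logic translates directly into a Power BI measure once your data model is in place.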
Expected Outcomes
- Importing, modeling, and analyzing data in Power BI to drive decisions
- Designing and building insightful visualizations to convey trends
- Producing actionable recommendations from BI analysis
Resources
- Analyzing Data with Microsoft Power BI Skill Path
- Official Power BI documentation
- What’s the best way to learn Microsoft Power BI?
- Sample datasets from Power BI
- Watch the step-by-step tutorial
19. Analyzing Kickstarter Projects
Difficulty: Beginner | Time: 4–6 hours | Tools: SQL
Portfolio value: SQL is the most universally required skill in data analyst job postings. This is your cleanest, most focused SQL portfolio piece: it uses a well-known dataset and produces findings you can speak to confidently in an interview.
Overview
Kickstarter has funded over 200,000 creative projects. Using SQL, you’ll explore the platform’s project data to find what separates funded from failed campaigns: success rates by category, the relationship between funding goals and outcomes, and trends over time. This is a pure SQL analysis project (no Python, no Excel), which makes it a clear, standalone demonstration of your querying skills.
Start the Kickstarter project →
Tools and Technologies
- SQL
Prerequisites
- SQL and databases fundamentals: SELECT, WHERE, JOINs, aggregations, handling missing values
Step-by-Step Instructions
- Import and explore the Kickstarter dataset
- Clean the data to handle missing values and ensure consistency
- Write SQL queries to uncover funding trends by category
- Analyze success rates relative to funding goals and timeframes
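The category query in step three is the heart of the project. As a sketch of what it might look like, here is a grouped aggregation run against a toy table via Python's built-in sqlite3 module. The column names (category, state) are assumptions for illustration; check the actual Kickstarter schema before querying:

```python
import sqlite3

# In-memory database with a toy table; columns (category, state, goal)
# are illustrative, not the real Kickstarter schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (category TEXT, state TEXT, goal REAL)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [("Games", "successful", 5000), ("Games", "failed", 90000),
     ("Music", "successful", 2000), ("Music", "successful", 3500),
     ("Film", "failed", 150000)],
)

# Success rate by category: SQLite treats the comparison as 1/0,
# so AVG(state = 'successful') is the share of successful projects.
query = """
SELECT category,
       COUNT(*) AS total,
       ROUND(AVG(state = 'successful') * 100, 1) AS success_rate_pct
FROM projects
GROUP BY category
ORDER BY success_rate_pct DESC;
"""
rows = list(conn.execute(query))
for row in rows:
    print(row)
```

The same query shape, swapped onto the real table and columns, answers the category question directly.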
Expected Outcomes
- Proficiency in using SQL for real-world data analysis
- Experience cleaning and querying a large, real-world dataset
- Ability to derive and communicate business insights from SQL results
Resources
- SQL Fundamentals skill path
- SQL Commands: The Complete List (w/ Examples)
20. Analyzing Startup Fundraising Deals from Crunchbase
Difficulty: Intermediate | Time: 6–8 hours | Tools: Python, pandas, SQLite, Jupyter Notebook
Portfolio value: This is your most technically impressive beginner project. It combines Python, pandas, and SQL in a single workflow and works with a dataset large enough to require memory management. It’s a strong capstone project for the Python track in this list.
Overview
Crunchbase tracks billions of dollars in startup investment. This project has you exploring that investment landscape: which verticals attract the most funding, who the most active investors are, and what patterns emerge across funding rounds. The technical challenge is that the dataset is large enough that you can’t load it all into memory at once, so you’ll practice chunking data and working with SQLite—techniques that apply directly to any analyst role handling production-scale data.
Start the Crunchbase project →
Tools and Technologies
- Python, pandas, SQLite, Jupyter Notebook
Prerequisites
- Python and pandas fundamentals: working with DataFrames, reading data in chunks, basic SQL with SQLite
Step-by-Step Instructions
- Explore the structure and contents of the Crunchbase dataset
- Process the dataset in chunks and load into an SQLite database
- Analyze fundraising rounds to identify trends
- Examine the most successful startup verticals by total funding
- Identify the most active investors by deal count and amount invested
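The chunking step is the core technique here: pandas reads the file a slice at a time and appends each slice to SQLite, so the full dataset never has to fit in memory. A minimal sketch of that pattern, using a small in-memory CSV with made-up column names (the real Crunchbase file and schema may differ):

```python
import io
import sqlite3

import pandas as pd

# A tiny in-memory CSV stands in for the (much larger) Crunchbase export;
# the column names here are illustrative, not the real schema.
csv_data = io.StringIO(
    "company,raised_amount_usd\n"
    "Acme,1000000\n"
    "Globex,250000\n"
    "Initech,5000000\n"
    "Umbrella,750000\n"
)

conn = sqlite3.connect(":memory:")

# Read in chunks so the file never has to fit in memory at once,
# appending each chunk to an SQLite table as it arrives.
for chunk in pd.read_csv(csv_data, chunksize=2):
    chunk.to_sql("investments", conn, if_exists="append", index=False)

# Analysis then runs as SQL against the database,
# not against an in-memory DataFrame.
total = conn.execute("SELECT SUM(raised_amount_usd) FROM investments").fetchone()[0]
print(total)  # 7000000
```

On the real dataset you would point read_csv at the file path and pick a chunksize that fits comfortably in your machine's memory.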
Expected Outcomes
- Applying pandas and SQLite to real-world investment data
- Handling large datasets through chunking and efficient data types
- Integrating DataFrames with SQL databases for scalable analysis
- Building a portfolio-ready capstone project showing Python + SQL fluency
Resources
- Data Analyst in Python career path
- SQL Fundamentals skill path
- SQL Commands: The Complete List (w/ Examples)
How to make your projects portfolio-ready
Completing a project is step one. Making it visible and hirable is step two, and this is where most beginners stop short.
- Set up a GitHub repository for every project. GitHub is where recruiters go to verify the skills on your resume. A project that lives only on your laptop doesn’t exist in their eyes. Creating a free account takes five minutes; creating a repository for each project takes two more. This is non-negotiable.
- Write a README that answers three questions. When a recruiter opens your repository, they have thirty seconds before they move on. Your README needs to answer: What problem did this project address? What data did you use and where did it come from? What did you find, and what would you recommend? A clear, concise README, even at 150 words, is what separates a professional-looking portfolio from a folder of code.
- Write your resume bullet points before you close the project. While the work is still fresh, draft two bullet points you could put on a resume: one describing what you did technically, one describing the outcome or insight you produced. Something like: “Cleaned and analyzed HR exit survey data from 1,400+ employees using Python and pandas; identified role tenure as the strongest predictor of resignation risk.” That specificity is what earns callbacks.
- How many projects do you need? Three to five well-documented projects that cover different skill areas will serve you better than ten half-finished ones. Aim for at least: one Python/EDA project, one data cleaning project, one SQL project, and one visualization project (Tableau or Power BI). The Crunchbase project (Project 20) works well as a capstone that combines Python and SQL in a single piece.
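A minimal README skeleton that answers those three questions, with placeholder text to replace for each project:

```markdown
# Project Title

## Problem
One or two sentences on the question this analysis set out to answer.

## Data
Where the dataset came from (with a link), its size, and any notable quirks.

## Findings & Recommendation
The two or three headline results, and what you'd recommend doing about them.

## How to Run
Tools used (e.g. Python 3, pandas) and how to open or reproduce the analysis.
```

Adapt the headings to your own voice; what matters is that all three questions are answered near the top of the file.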
How to choose the right data analyst projects
With 20 projects to choose from, and dozens more across the internet, the decision can feel paralyzing. Here’s a practical framework for narrowing it down.

What recruiters actually look for
When hiring managers review a beginner portfolio, they’re not looking for perfection. They’re looking for four things:
- Evidence that you can handle messy data
- Ability to translate findings into a recommendation
- Some familiarity with the tools on the job description
- Proof that you can document your work clearly
A project that checks all four boxes will outperform one that’s technically sophisticated but hard to understand.
One thing worth knowing: the entry-level market is genuinely competitive right now. Building 3–5 projects won’t guarantee a job, but it will get you past the first filter. The candidates who struggle the most are the ones who complete courses but never build anything they can show.
Passions, interests, and in-demand skills
The best project is one that sits at the intersection of three things: your passions, your interests, and the skills employers are actually hiring for.
- Passions are the areas you’re deeply committed to: healthcare, finance, sports, sustainability. Working on something you care about keeps you going when the project gets difficult, and it gives you something genuine to talk about in interviews.
- Interests are the topics that spark your curiosity even without deep commitment. Sports analytics, music data, food trends: if it makes you want to ask questions, it’s worth exploring.
- In-demand data skills are what the market wants. SQL, Python, pandas, and a BI tool appear in the overwhelming majority of analyst job postings. Make sure your portfolio demonstrates these tools, even if the topic is something more personal.
Steps to picking the right project
Step 1: Assess where you are. If you’re completely new to data analysis, start with Projects 2, 4, and 19. If you already have Python skills, jump to the more complex projects like Project 20.
Step 2: Plan before you code. Define your question, identify your data source, and sketch your analysis approach before opening Jupyter. Projects that start without a clear question tend to trail off.
Step 3: Focus on fundamentals first. Clean data thoroughly. Use techniques that match your objective. Create clear visualizations. Document your process. These four things matter more than whether you used a sophisticated model.
Step 4: Start small, then scale. Complete one project fully before starting another. A finished, documented project is worth more than three projects stuck at 60%.
Step 5: Get feedback. Share your completed projects in the Dataquest Community or on Reddit’s r/dataanalysis. Peer feedback makes your work better and builds visibility with people in the field.
Free datasets for your own data analyst projects
Once you’ve completed the guided projects, you’ll likely want to build independent projects using your own datasets. Here are reliable places to find data worth analyzing.
Kaggle: The largest free dataset repository. Covers every domain from finance to sports to healthcare, and many datasets include community notebooks showing how others have approached the analysis.
Data.gov: US government data across health, education, transportation, and economics. Well-documented and generally clean.
NYC Open Data: Hundreds of datasets on the city: Citi Bike trips, restaurant inspections, school performance, traffic patterns.
World Happiness Report: Annual happiness index data by country, useful for time-series and comparative analysis.
Our World in Data: Long-run historical datasets on everything from CO₂ emissions to life expectancy. Strong for trend analysis.
FiveThirtyEight: Datasets behind their journalism. Politics, sports, culture — all real, all interesting.
World Bank Open Data: Economic and development indicators across 200+ countries.
For the most realistic practice, look for datasets with multiple files from different sources, missing values, and inconsistent formatting. That messiness is what makes cleaning projects feel real.
Your next steps
Building a portfolio takes time, and the biggest obstacle is usually getting started rather than the work itself. We've put together a free checklist that walks you from your first project to a portfolio ready to show employers. Download it below, then come back to this guide whenever you need a project to tackle next.
Here's what you should focus on:
Today: Create a GitHub account if you don't have one. It takes five minutes and you'll need it before your first project is done.
This week: Complete Project 2 (Profitable App Profiles). It's the right blend of beginner-friendly and genuinely portfolio-worthy, and finishing it gives you the momentum to keep going.
This month: Complete the three-project starter pack (Projects 2, 4, and 19). Push each one to GitHub with a README. You'll have a basic but real portfolio before the month is out.
Frequently Asked Questions
How many projects should I include in my data analyst portfolio?
Three to five well-documented projects is the standard advice from hiring managers, and it holds.
Quantity rarely impresses. A recruiter who sees 12 GitHub repositories with no READMEs and no clear findings will be less impressed than one who sees four clean, well-explained projects.
Aim for range: at least one Python project, one SQL project, and one visualization project (Tableau or Power BI).
Can I get a data analyst job with only personal projects and no work experience?
Yes, and it’s more common than you might think.
Many analysts land their first role with nothing but portfolio projects on their resume.
The key is documentation and the ability to explain your process clearly in an interview. A well-documented project on GitHub with a clear README, backed by your ability to walk through your thinking, is meaningful evidence of skill.
Certificates alone are not.
What is the best first data analyst project for a beginner?
Start with an exploratory data analysis (EDA) on a dataset that genuinely interests you.
If you want a structured starting point, Project 2 (Profitable App Profiles) is ideal: it uses real data, asks a genuine business question, and teaches the core Python/EDA workflow in a beginner-friendly context.
Do I need to know Python to build a data analyst portfolio?
No, you can start with Excel or SQL projects and add Python later as your skills develop.
However, Python fluency significantly expands what you can analyze and how efficiently you can work, so it’s worth investing in early.
If you’re not sure where to start, our Introduction to Python Programming course is a good first step.
What datasets can I use for data analyst projects?
Kaggle is the most accessible starting point. It has thousands of free datasets across every domain, and many include community notebooks for reference.
Data.gov, NYC Open Data, Our World in Data, and the FiveThirtyEight GitHub repository are all excellent for independently motivated projects.
For realistic practice, prioritize “dirty” datasets with missing values and multiple source files.
Should I put my data analyst projects on GitHub?
Yes, without exception.
Recruiters increasingly check GitHub to verify the skills on your resume. A project without a repository is much harder to validate.
At minimum, each project needs a README that explains what you did, what data you used, and what you found.
Clean, organized repositories signal professionalism even before someone reads a line of your code.
How long does it take to complete a data analyst portfolio?
Most learners complete their first three portfolio projects within 4–8 weeks, assuming 5–10 hours of study per week.
The guided projects in this list range from 1–8 hours each.
Starting with Projects 2, 4, and 19 gives you a complete, balanced data project portfolio in roughly a month with dedicated, consistent effort.
What data analyst projects should I avoid putting in my portfolio?
Avoid projects everyone else is submitting. The Titanic survival dataset and Iris flower classification appear in hundreds of portfolios and signal that you followed a tutorial rather than applied independent thinking.
The same goes for projects with no clear business question or outcome.
Choose datasets that genuinely interest you and frame your analysis around a question that a real manager would ask.

