The Dataquest Download

Level up your data and AI skills, one newsletter at a time.

Each week, the Dataquest Download brings the latest behind-the-scenes developments at Dataquest directly to your inbox. Discover our top tutorial of the week to boost your data skills, get the scoop on any course changes, and pick up a useful tip to apply in your projects. We also spotlight standout projects from our students and share their personal learning journeys.

Hello, Dataquesters!

Here’s what we have in store for you in this edition:

Top read: Learn how to automate your ETL workflows using Apache Airflow. Learn more

Community highlights: SQL to Excel conversions, regex in AI, and how to earn extra free time on Dataquest. Join the discussion

Resource spotlight: Compare cloud providers with a hands-on 30-day roadmap. Read more

Tired of fixing broken scripts and restarting failed data tasks? In this hands-on tutorial, you’ll learn how to set up Apache Airflow using Docker to automate and scale your ETL workflows—starting right on your local machine. It’s part one of a real-world project that takes you from local development to full cloud deployment on AWS:

    • Set up a local Airflow pipeline with Docker Compose

    • Extract, transform, and load data into Amazon S3

    • Get a preview of how to scale your pipeline to the cloud in the next parts
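
For a feel of the extract, transform, and load steps listed above, here is a minimal plain-Python sketch. It is not the tutorial's code and it does not use Airflow or S3: the sample records, the function names, and the CSV output are all assumptions made for illustration. In an Airflow pipeline, each function would roughly correspond to one task, and the load step would upload to a bucket instead of returning text.

```python
import csv
import io

# Hypothetical raw records; a real pipeline would pull these from an API or a file.
RAW_ROWS = [
    {"city": " Lisbon ", "temp_c": "21.5"},
    {"city": "Oslo", "temp_c": ""},  # missing reading, dropped in transform
    {"city": "cairo", "temp_c": "34.0"},
]


def extract():
    """Return raw records (stand-in for an API call or file read)."""
    return list(RAW_ROWS)


def transform(rows):
    """Drop rows without a temperature, tidy city names, cast readings to float."""
    cleaned = []
    for row in rows:
        if not row["temp_c"]:
            continue
        cleaned.append(
            {"city": row["city"].strip().title(), "temp_c": float(row["temp_c"])}
        )
    return cleaned


def load(rows):
    """Serialize to CSV text (assumption: a real load step would upload this to S3)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["city", "temp_c"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def run_pipeline():
    """Chain the three steps in order, as a scheduler would."""
    return load(transform(extract()))


print(run_pipeline())
```

Keeping each step as its own function is what makes it easy to hand the steps to an orchestrator later, since each one can be retried or rerun on its own.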

From the Community

Exploring Hacker News Posts: Nurten analyzed post and comment engagement on Hacker News and kept the code brief and efficient with well-chosen for-loops. The final loop is impressive for someone just starting out: in just 8 lines, Nurten computed the optimal posting time in Columbia.
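
If you are curious what a loop like that can look like, here is a short sketch of one way to find the hour with the most comments per post. It is not Nurten's code: the sample posts, the date format, and the function name best_hour are assumptions, and the time zone conversion is left out.

```python
from datetime import datetime

# Hypothetical sample of (created_at, num_comments) pairs.
posts = [
    ("8/16/2016 9:55", 6),
    ("11/22/2015 13:43", 29),
    ("5/2/2016 10:14", 1),
    ("8/2/2016 9:06", 10),
    ("10/15/2015 13:10", 3),
]


def best_hour(rows):
    """Return the hour with the highest average comments per post, plus all averages."""
    totals = {}  # hour -> (number of posts, total comments)
    for created_at, num_comments in rows:
        hour = datetime.strptime(created_at, "%m/%d/%Y %H:%M").hour
        count, comments = totals.get(hour, (0, 0))
        totals[hour] = (count + 1, comments + num_comments)
    averages = {hour: comments / count for hour, (count, comments) in totals.items()}
    return max(averages, key=averages.get), averages


hour, averages = best_hour(posts)
print(hour, averages)
```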

Using Regex in Generative AI: Raisa provided an interesting resource demonstrating how GPT-4 stands out by using regex in its tokenization.

Converting SQL to Excel: A helpful online tool and a step-by-step tutorial on how to convert SQL files into Excel format.

How to Get Free, Extra Time on Dataquest: Learn about current Community initiatives that can earn you extra free time on Dataquest and other rewards, while helping you expand your professional network and connect with like-minded people.

DQ Resources

How to Choose the Right Cloud Service Provider: Not sure whether to go with AWS, Azure, or GCP? This 30-day roadmap helps you test all three using their free tiers so you can make an informed choice based on hands-on experience. Perfect for beginners! Learn more

Answering Business Questions Using SQL: Learn how to analyze a digital music store’s data using SQL. This hands-on project with the Chinook database walks through real-world SQL tasks like tracking sales, evaluating employees, and spotting growth opportunities using CTEs, subqueries, and more. Learn more
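
As a taste of the CTE style mentioned above, here is a small sketch that runs with Python's built-in sqlite3 module. The two tables are a tiny hypothetical stand-in, not the real Chinook schema, and the query is only one way such a sales question could be asked.

```python
import sqlite3

# In-memory database with made-up rows (assumption: simplified table layout).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'USA'), (2, 'Canada'), (3, 'USA');
    INSERT INTO invoice VALUES (1, 1, 10.0), (2, 2, 5.0), (3, 3, 7.5), (4, 1, 2.5);
    """
)

# The CTE aggregates sales per country; the outer query derives a per-customer figure.
QUERY = """
WITH country_sales AS (
    SELECT c.country,
           SUM(i.total) AS total_sales,
           COUNT(DISTINCT c.customer_id) AS customers
    FROM invoice AS i
    JOIN customer AS c ON c.customer_id = i.customer_id
    GROUP BY c.country
)
SELECT country, total_sales, total_sales / customers AS sales_per_customer
FROM country_sales
ORDER BY total_sales DESC;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Naming the intermediate result in a CTE keeps the aggregation and the final calculation separate, which tends to be easier to read than one nested subquery.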

Customer Segmentation Using K-Means Clustering: Discover how to group customers by behavior and demographics using K-means clustering. Learn to identify key segments, unlock actionable insights, and target your marketing more effectively. Learn more
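
To show the idea behind K-means without any libraries, here is a bare-bones sketch in pure Python. It is not the project's code: the customer data (age, monthly spend) is made up, the function names are hypothetical, and the centroids are simply initialized from the first k points. A real analysis would usually scale the features first and rely on a tested library implementation.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, k, max_iterations=20):
    """Alternate between assigning points and recomputing centroids until stable."""
    # Naive initialization (assumption): use the first k points as starting centroids.
    centroids = [tuple(float(v) for v in p) for p in points[:k]]
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # New centroid = mean of its cluster; an empty cluster keeps its old centroid.
        updated = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical customers as (age, monthly spend).
customers = [(25, 40), (27, 45), (60, 200), (62, 210), (26, 42), (61, 205)]
centroids, labels = kmeans(customers, k=2)
print(labels)
print(centroids)
```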

What We're Reading

Why You Shouldn’t Specialize Too Early in Your Dev Career: A former Meta engineer argues that specializing too early in your dev career can backfire, especially with AI reshaping the field. This article makes a compelling case for staying flexible and building a broad foundation first.

A GDPR Survival Guide for Data Engineers: Learn how to build GDPR compliance into your data stack—from PII tagging and encryption to Airflow DAGs for automated deletions and data-residency routing. This guide offers actionable patterns for protecting user data while keeping analytics flowing.

Give 20%, Get $20: Time to Refer a Friend!

Now is the perfect time to share Dataquest with a friend. Gift a 20% discount, and for every friend who subscribes, earn a $20 bonus. Redeem your bonuses for digital gift cards or prepaid cards, or donate them to charity. Your choice! Click here

High-fives from Vik, Celeste, Anna P, Anna S, Anishta, Bruno, Elena, Mike, Daniel, and Brayan.
