The Dataquest Download

Level up your data and AI skills, one newsletter at a time.

Each week, the Dataquest Download brings the latest behind-the-scenes developments at Dataquest directly to your inbox. Discover our top tutorial of the week to boost your data skills, get the scoop on any course changes, and pick up a useful tip to apply in your projects. We also spotlight standout projects from our students and share their personal learning journeys.

Hello, Dataquesters!

Here’s what we have in store for you in this edition:

Top tutorial: Say goodbye to “it worked on my machine” problems. Learn to set up PostgreSQL with Docker. Read the blog

Community highlights: Peak traffic analysis, predicting heart disease, and Python plotting tips. Join the discussion

Resource spotlight: Learn how to transform and analyze large-scale data with core RDD operations. Read the blog

Say Goodbye to “It Worked on My Machine” Problems

Tired of environment issues derailing your projects? This tutorial shows how Docker can help you build consistent, portable setups for your data tools. Learn how to spin up a PostgreSQL database inside a container, connect to it, persist data, and manage everything with ease. No permanent installs required.
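
As a rough sketch of what that setup involves, the Python snippet below assembles a `docker run` command for the official `postgres` image. The container name, password, volume name, and image tag are placeholder assumptions rather than values from the tutorial, and the snippet only prints the command so you can review it before running anything.

```python
import shlex


def postgres_run_command(
    name="dq-postgres",      # hypothetical container name
    password="change-me",    # placeholder password; choose your own
    volume="dq_pgdata",      # hypothetical named volume used for persistence
    host_port=5432,          # port exposed on your machine
    image="postgres:16",     # assumed tag of the official image
):
    """Build the argument list for starting PostgreSQL in a detached container."""
    return [
        "docker", "run",
        "--detach",
        "--name", name,
        "--env", f"POSTGRES_PASSWORD={password}",
        "--publish", f"{host_port}:5432",
        # Mounting a named volume on the data directory keeps the data
        # around after the container itself is removed.
        "--volume", f"{volume}:/var/lib/postgresql/data",
        image,
    ]


command = postgres_run_command()
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would start the container, assuming Docker is installed and running. You could then connect on `localhost` at the published port with any PostgreSQL client.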

From the Community

Peak Patterns in Urban Traffic: Ifeoma’s project features a clear title, focused analysis, clean code, and concise conclusions that make it both insightful and easy to follow.

Predicting Heart Disease: Steve’s well-structured project combines strong EDA, visualizations, and thoughtful reflections on model results and limitations.

New Community Moderator Intern: Linky has been promoted to Community Moderator Intern. Learn more about the internship program and how you can get involved.

Estimating Memory Usage by Data: Anna shares a smart way to estimate memory needs before loading large datasets, helping you manage resources more efficiently.
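
One possible way to do this, which is not necessarily the approach Anna describes, is to parse only the first rows of a CSV file and extrapolate from the file size. The sketch below uses only the standard library; the function name and the `sample_rows` parameter are hypothetical, and it assumes a single header line, UTF-8 text, and no line breaks inside quoted fields.

```python
import csv
import os
import sys


def estimate_csv_memory(path, sample_rows=1000):
    """Return (estimated_row_count, estimated_bytes) for a CSV file.

    The estimate covers rows parsed into lists of Python strings; a
    DataFrame or typed array would use a different amount of memory.
    """
    file_bytes = os.path.getsize(path)
    with open(path, "rb") as handle:
        header_bytes = len(handle.readline())
        # Read at most `sample_rows` raw lines after the header.
        sample = [line for _, line in zip(range(sample_rows), handle)]
    if not sample:
        return 0, 0

    # Average size of one row on disk, used to guess the total row count.
    disk_per_row = sum(len(line) for line in sample) / len(sample)

    # Average size of one parsed row in memory (list plus its strings).
    parsed = csv.reader(line.decode("utf-8") for line in sample)
    mem_per_row = sum(
        sys.getsizeof(row) + sum(sys.getsizeof(cell) for cell in row)
        for row in parsed
    ) / len(sample)

    estimated_rows = round((file_bytes - header_bytes) / disk_per_row)
    return estimated_rows, round(estimated_rows * mem_per_row)
```

Calling `estimate_csv_memory("trips.csv")` (a hypothetical file name) returns a row count and a byte total you can compare against available RAM before deciding whether to load the file in chunks.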

Removing Horizontal Grid Lines in Python: Linky explains how and when to remove horizontal grid lines in matplotlib bar plots for cleaner visuals.

DQ Resources

Work with RDDs in PySpark: Learn how to transform and analyze large-scale data with core RDD operations, understand DAGs for optimized processing, and discover when RDDs still make sense in modern workflows. Learn more

Automate and Monitor ETL Pipelines Locally (Part I): Build a fully functional ETL pipeline running locally with Apache Airflow and Docker. Automate data tasks, monitor them through a visual UI, and quickly identify and fix any issues. No more manual runs or missed jobs. Learn more

Launch a Scalable, Cloud-Hosted ETL Pipeline (Part II): Deploy your ETL workflow to the cloud using AWS. The production-ready Airflow setup includes cloud storage (S3), a relational database (RDS), IAM roles, and secure infrastructure, and it is built to scale and run reliably. Learn more

What We're Reading

Data Science on Google Cloud: This high-level guide walks you through the entire data science workflow—ingestion, processing, modeling, and activation. Learn how tools like Dataflow, Vertex AI, and Looker support each phase and drive real-world results.

Which Python Vowel Check Is Fastest: Think checking for vowels is simple? Think again. Professor Austin Henley puts 11 different Python strategies to the test, from loops to regex to recursion, to find the fastest approach. The results might surprise you.
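
To get a feel for this kind of comparison, here is a minimal sketch that times four ways of checking whether a string contains a vowel. These four functions are illustrative assumptions and may not match any of the 11 strategies in the article; timings will also vary by machine and Python version, so no results are shown.

```python
import re
import timeit

VOWELS = frozenset("aeiouAEIOU")
VOWEL_RE = re.compile(r"[aeiouAEIOU]")


def has_vowel_loop(text):
    """Explicit loop that returns at the first vowel."""
    for char in text:
        if char in VOWELS:
            return True
    return False


def has_vowel_any(text):
    """Generator expression with any()."""
    return any(char in VOWELS for char in text)


def has_vowel_regex(text):
    """Precompiled regular expression search."""
    return VOWEL_RE.search(text) is not None


def has_vowel_set(text):
    """Set operation: the text has a vowel if it is not disjoint from VOWELS."""
    return not VOWELS.isdisjoint(text)


CHECKS = [has_vowel_loop, has_vowel_any, has_vowel_regex, has_vowel_set]

if __name__ == "__main__":
    # Worst case for early-exit strategies: the only vowel is at the end.
    sample = "rhythm" * 50 + "a"
    for check in CHECKS:
        seconds = timeit.timeit(lambda: check(sample), number=10_000)
        print(f"{check.__name__:16s} {seconds:.4f}s")
```

Trying several inputs (a vowel at the start, at the end, or not at all) is worthwhile, since the ranking can change with where the first vowel appears.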

Claude 4’s Leaked Prompt Reveals AI’s Guardrails: A leaked system prompt from Claude 4 reveals how tightly these models are controlled. From formatting rules to behavior limits, this post uncovers how AI personalities are engineered behind the scenes.

Give 20%, Get $20: Time to Refer a Friend!

Now is the perfect time to share Dataquest with a friend. Gift a 20% discount, and earn a $20 bonus for every friend who subscribes. Redeem your bonuses for digital gift cards or prepaid cards, or donate them to charity. Your choice! Click here

High-fives from Vik, Celeste, Anna P, Anna S, Anishta, Bruno, Elena, Mike, Daniel, and Brayan.

2025-07-02

Learn to Set Up PostgreSQL with Docker (No Installation Needed)

Set up PostgreSQL with Docker, analyze I-94 traffic, predict heart disease, improve Python plots, and explore large-scale data with RDDs. Read More

2025-06-25

Struggling with Slow Python Scripts and Crashing Excel Files?

Explore PySpark locally, build your first Spark app, master ETL pipelines with Airflow on AWS, and learn from impressive community projects. Read More

2025-06-19

Build a Linear Regression Model Using Python

Forecast gym visits, explore traffic patterns, test cloud providers hands-on, and build machine learning skills with real healthcare data. Read More
