The Dataquest Download
Level up your data and AI skills, one newsletter at a time.
Hello, Dataquesters!
Here’s what we have in store for you in this edition:
Top Read: Learn how to deploy your Airflow pipeline to the cloud using Amazon ECS with Fargate for production-ready workflows.
Community Highlights: Explore standout projects on financial modeling, credit card customer segmentation, and creative applications of LLMs.
What We’re Reading: How C3’s Enterprise RAG builds trustworthy AI, how to handle tricky time data in code, and what the future holds for open source in the AI era.
In the final part of our Airflow series, you’ll take the pipeline you built locally and extended with AWS services and deploy it fully to the cloud using Amazon ECS with Fargate. Learn how to containerize your DAGs, launch a production-ready Airflow environment, and run workflows without relying on your local machine. This tutorial wraps up the series with real-world DevOps skills that scale.
New to the series? Start with Part I to build and test an ETL pipeline locally using Airflow and Docker, and Part II to extend that pipeline into the cloud with AWS services.
From the Community
S&P 500 Financial Analysis: Israel’s Excel and Power BI project explores market data from multiple angles, combining strong business insights, financial modeling, and standout visualizations.
Credit Card Customer Segmentation: Fakhriddin’s project features structured EDA, clear code, detailed cluster descriptions, and data-driven recommendations.
LLMs as a Playground: Raisa’s Medium article introduces fun and practical use cases for Large Language Models, making NLP more accessible for beginners.
DQ Resources
Analyzing Startup Fundraising Deals from Crunchbase: Learn how to process large CSV files efficiently by chunking, optimizing memory, and using encoding strategies, then turn the data into a fast, searchable SQLite database.
Answering Business Questions Using SQL: Learn how to analyze a digital music store’s data using SQL. This hands-on project with the Chinook database walks through real-world SQL tasks like tracking sales, evaluating employees, and spotting growth opportunities using CTEs, subqueries, and more.
Customer Segmentation Using K-Means Clustering: Discover how to group customers by behavior and demographics using K-means clustering. Learn to identify key segments, unlock actionable insights, and target your marketing more effectively.
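Curious what the chunking idea from the Crunchbase project looks like in practice? Here is a minimal sketch, not the project's actual solution. It uses only Python's standard library, and the function name load_csv_in_chunks, the table name deals, and the three sample rows are all made up for illustration. The point is simply that reading a fixed number of rows at a time keeps memory use flat no matter how large the file is.

```python
import csv
import io
import sqlite3
from itertools import islice


def load_csv_in_chunks(csv_file, conn, table, chunk_size=1000):
    """Stream rows from an open CSV file into an SQLite table, one chunk at a time.

    Hypothetical helper for illustration; assumes the first row is a header
    and that column names contain no double quotes.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')

    total = 0
    while True:
        # Pull at most chunk_size rows; only this chunk is held in memory.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            break
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', chunk)
        conn.commit()
        total += len(chunk)
    return total


# Made-up sample data standing in for a large CSV file on disk.
sample = io.StringIO(
    "company,stage,raised_usd\n"
    "acme,seed,500000\n"
    "globex,series_a,2000000\n"
    "initech,seed,750000\n"
)

conn = sqlite3.connect(":memory:")
rows_loaded = load_csv_in_chunks(sample, conn, "deals", chunk_size=2)

# Once the data is in SQLite, it can be queried like any other table.
seed_total = conn.execute(
    "SELECT SUM(CAST(raised_usd AS INTEGER)) FROM deals WHERE stage = 'seed'"
).fetchone()[0]
print(rows_loaded, seed_total)
```

For a real file, you would pass open("your_file.csv", newline="", encoding=...) in place of the in-memory sample and choose the encoding that matches your data.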
What We’re Reading
Enterprise RAG with C3 AI: A Carnegie Mellon professor partnered with C3 AI to develop a Retrieval-Augmented Generation (RAG) system that delivers reliable answers from complex enterprise data without hallucinations and with built-in access controls.
How to Think About Time in Programming: Time is messy. This deep dive explores the quirks of working with time in code, covering civil time, leap seconds, time zones, and more with clarity and humor. A must-read for data engineers and analysts.
Future of AI and Open Source: As AI tools trained on open source code proliferate, this article examines the legal, ethical, and community tensions that developers and companies must now navigate.
Give 20%, Get $20: Time to Refer a Friend!
Now is the perfect time to share Dataquest with a friend. Gift a 20% discount, and for every friend who subscribes, earn a $20 bonus. Redeem your bonuses for digital gift cards or prepaid cards, or donate them to charity. Your choice!
High-fives from Vik, Celeste, Anna P, Anna S, Anishta, Bruno, Elena, Mike, Daniel, and Brayan.