The Dataquest Download
Level up your data and AI skills, one newsletter at a time.
Hello, Dataquesters!
Here’s what we have for you in this edition:
Top Read: How to chunk long documents for high-quality semantic search. Compare strategies, measure recall vs. relevance, and pick the right approach for your content. Read now
Webinar Recording: Build a TensorFlow model to predict IPO listing gains, from data exploration to preprocessing and modeling. Watch now
From the Community: An interactive Excel e-commerce dashboard, practical fixes for pandas’ SettingWithCopyWarning, a meteorology tracking question in Python, and a smart note-taking workflow using VS Code Jupyter notebooks. Join the discussion
What We’re Reading: Three lessons that turn metrics into action, leadership takeaways on AI and observability on Google Cloud, and how agentic AI is reshaping enterprise work. Learn more

Embeddings worked smoothly when each document was a short abstract. But full research papers, technical docs, and long-form guides are too large to embed as a single vector, and that’s where search quality starts to break unless you chunk intelligently.
In this tutorial, you’ll learn how to split long documents into chunks that work well for vector search. You’ll implement multiple chunking strategies, evaluate them systematically, and understand the tradeoffs between recall, relevance, and performance. By the end, you’ll know how to choose a chunking approach that fits your content and your search goals.
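As a rough illustration of the simplest strategy of this kind, here is a minimal sketch of fixed-size chunking with overlap. It is not taken from the tutorial: the function name `chunk_words`, its parameters, and the word-based splitting are assumptions made for this example, and a real pipeline would more likely count tokens with the embedding model's tokenizer.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks that share `overlap` words with the previous chunk.

    Hypothetical helper for illustration only; sizes are counted in
    whitespace-separated words, not model tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is made up entirely of overlap.
        if start + chunk_size >= len(words):
            break
    return chunks


# Example: a toy 10-word "document" split into 4-word chunks with 1 word of overlap.
sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Larger chunks keep more context in each vector but make matches less precise, while more overlap lowers the chance of splitting an idea across a boundary at the cost of storing and searching more vectors. That is one form of the recall, relevance, and performance tradeoff described above.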
Webinar Recording
Watch now and learn how to build a deep learning model using TensorFlow to predict IPO listing gains, applying skills in data exploration, visualization, preprocessing, and modeling.
From the Community
E-commerce Analytics Interactive Excel Dashboard: Israel’s individual Excel project stands out for its impressive variety of visualizations that effectively present global product sales performance and allow analysis from multiple perspectives.
Dealing with SettingWithCopyWarning in Pandas: Alla shared two useful resources on how to resolve the SettingWithCopyWarning issue when working on Python projects and modifying data in pandas DataFrames or Series.
Python-based Mesoscale Convective System Tracker Application: Femi, a Python beginner, asks a domain-specific question in meteorology about how to employ a Python tool for tracking mesoscale convective systems.
Using a VSCode Jupyter Notebook to Retain Learned Concepts: Tomaz shared his approach to learning data science by taking notes in a Jupyter Notebook within Visual Studio Code, helping him keep track of what he has learned and document his daily work.
What We're Reading
Three Machine Learning Lessons: Many teams sit on mountains of data yet still feel unsure about what to do next. This piece reveals how small shifts in mindset and practice can turn ordinary metrics into insights that actually move people to act.
The Future of AI, LLMs, and Observability on Google Cloud: Discover 7 key insights for leaders from our discussion with Google’s Director of AI, Dr. Ali Arsanjani, and Datadog’s VP of Engineering, Sajid Mehmood.
Inside the Agentic AI shift: A new report by Thoughtworks and WIRED explores how enterprises are using AI agents to drive real results, manage risks, and stay ahead in the next wave of AI disruption.
Give 20%, Get $20: Time to Refer a Friend!
Now is the perfect time to share Dataquest with a friend. Gift a 20% discount, and for every friend who subscribes, earn a $20 bonus. Use your bonuses for digital gift cards or prepaid cards, or donate them to charity. Your choice! Click here
High-fives from Vik, Celeste, Anna P, Anna S, Anishta, Bruno, Elena, Mike, Daniel, and Brayan.