Implement a B-Tree Data Structure

In the last lesson, we wrote the class for the b-tree that will be the basis for the index we will now be writing. Once again, an index is a map between every value in a column and its corresponding row location. We will be building an index on the CSV file that we have been working with throughout this recursion and trees course.

Throughout this lesson, we will be building a BTreeIndex class that will store values in their line numbers in the CSV file we'll be working with. Then, we will write a range query similar to BST that will return the line numbers of a given range of times. Finally, we will enhance our BTreeIndex with the ability to save the object to a file using Python serialization and we will discuss why this saves calculation overhead.

At the end of this lesson, you will be comfortable with b-trees and be able to understand the trade offs between using brute force loops and the speed granted by these tree structures. Also, you will have learned how to save these b-tree models using pickle files.

Understanding b-trees and their performance can be difficult at first, once you can command them effectively, they offer serious performance gains. Because they can be difficult to grasp without any hands-on practice, you’ll get to apply what you’ve learned from within your browser so that there's no need to use your own machine to do the exercises. The Python environment inside of this course includes answer checking so you can ensure that you've fully mastered each concept before learning the next concept.


  • Learn how an index can be used to speed up search queries.
  • Learn how to serialize Python objects to increase algorithm loading speed.
  • Learn how to measure B-Tree algorithm speeds.

Lesson Outline

1. Introduction
2. Enhance the Node
3. Parsing the Data
4. Reworking the Search Method
5. Implementing the Range Query
6. Add an Additional Range Query
7. Saving our Model
8. Index vs. Brute Search
9. Next Steps
10. Takeaways

Take a Look Inside