The data science life cycle generally comprises the following components: data retrieval, data cleaning, data exploration and visualization, and statistical or predictive modeling. While these components are helpful for understanding the different phases, they don’t help us think about our programming workflow. Often, the entire data science life cycle ends up as an arbitrary […]
In previous blog posts, we have described the Postgres database and ways to interact with it using Python. Those posts provided the basics, but if you want to work with databases in production systems, then it is necessary to know how to make your queries faster and more efficient. To understand what efficiency means in […]
In the fast-growing field of data, three main roles have emerged: data engineer, data analyst, and data scientist. Understanding the job duties of each of these roles can help you determine whether one might be a good fit for your career path.
If you’ve ever wanted to learn Python online with streaming data, or data that changes quickly, you may be familiar with the concept of a data pipeline. Data pipelines allow you to transform data from one representation to another through a series of steps. Data pipelines are a key part of data engineering, which we teach […]
Our latest Dataquest release has over 20 new features, including many major performance improvements and the launch of our much-anticipated data engineering path. New Path: Data Engineering. The first course in our Data Engineering Path is here! Data engineering is a broad field which includes working with big data, architecting distributed systems, creating reliable pipelines […]