To highlight how Dataquest has changed people's lives, we've started a new blog series called User Stories where we interview our users to learn more about their personal journeys.
For the first post in this series, we interviewed Patrick Nelli, VP of Corporate Analytics at Health Catalyst. Health Catalyst is a data warehousing and analytics company based in Salt Lake City, Utah, whose mission is to help health organizations "use analytics, best practices, and adoption services to improve outcomes". Health Catalyst has worked with prestigious organizations like Stanford Health Care, Kaiser Permanente, and Texas Children's Hospital.
What’s your story and how did you get to where you are today?
Since college, I’ve been interested in improving the healthcare sector through technology and innovation. As a physics major with a focus on biophysics and biochemistry, I was drawn to the healthcare space, specifically through laboratory research and exposure to early-stage biotechnology companies. I was put off by the fact that academic research took so long to affect patients, so I wanted to get closer to the ‘business’ side of the industry. I spent a few years in healthcare investment banking and private equity to learn that side of the healthcare industry. While researching the healthcare industry from an investing perspective, I became very interested in the healthcare data and analytics space. Electronic transactional systems were being adopted by healthcare providers (most notably Electronic Medical Records), and the growing amount of electronic data they collected presented an opportunity for new insights and improvements. After learning more about the industry and meeting several investors and companies in the space, I was fortunate enough to meet and join Health Catalyst, a healthcare data and analytics startup based out of Salt Lake City. I have been at Health Catalyst for a couple of years and run our internal analytics group. We use our own data warehousing and analytics products to aggregate our internal company data (e.g. data from marketing, sales, operations, product development, and finance) and drive internal operational improvements based on the data.
What drove you to get interested in learning data science?
My interest in data science has grown as a result of both practical use cases at Health Catalyst and a personal interest based on where the healthcare space is going.
Practically, we have a need to use data science methodologies internally at Health Catalyst. When Health Catalyst started collecting and analyzing our internal datasets, we primarily performed exploratory analyses. This is a logical first step, and these analyses generate a lot of value. As Health Catalyst has grown, we have aggregated larger and more diverse datasets internally, and we realized that we can generate additional insights through more rigorous statistical analyses of the data. Health Catalyst’s customer-facing products incorporate predictive analytics models, and we want to use similar techniques on our internal datasets.
Personally, I believe the healthcare space will benefit from data science for decades to come. While most providers are just starting to build data analytics competencies, the health systems further along the maturity spectrum are building predictive models ingrained in care processes. I want to personally understand how to develop these models and be able to use this knowledge to develop additional predictive model use cases.
How did you hear about Dataquest and what resources were you using to learn data science before Dataquest?
I started the data science learning journey by leveraging all of the amazing free content available online. I explored numerous MOOCs as well as the O’Reilly books on the topic. After working through this content for approximately a year, I wanted to dive deeper into using Python for data science. This led me to General Assembly’s Data Science course. I heard about Dataquest while taking this course.
How has Dataquest helped in your current job?
The best part of Dataquest is the balance between theory and pragmatism. The lessons walk step by step through how specific analyses are performed before showing the easier scikit-learn methods for performing these analyses.
This has helped me
- understand the underlying theory behind analyses and
- leverage the pragmatic code examples of how to perform the analyses.
It has been hard to find a resource with this combination of theory and coded examples, which is why I was so excited when I was introduced to Dataquest. While I am still early in the learning process, Dataquest has exposed me to a variety of models. This is helping me brainstorm the use cases of these analyses and providing me with example code to kick off a project.
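Editor's note: the pattern Patrick describes, working through an analysis by hand before reaching for scikit-learn, might look roughly like the sketch below. This is our own illustration, not an excerpt from a Dataquest lesson. The function names and the tiny dataset are made up for the example, and simple linear regression is just one analysis we picked to show the idea.

```python
def fit_line_by_hand(xs, ys):
    """Ordinary least squares for one feature, written out step by step."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_line_with_sklearn(xs, ys):
    """The same fit using scikit-learn (requires scikit-learn to be installed)."""
    from sklearn.linear_model import LinearRegression

    # scikit-learn expects a 2D feature matrix: one row per observation.
    model = LinearRegression().fit([[x] for x in xs], ys)
    return float(model.coef_[0]), float(model.intercept_)


# Made-up data that lies exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line_by_hand(xs, ys)
```

With scikit-learn installed, `fit_line_with_sklearn(xs, ys)` should agree with the by-hand result up to floating-point rounding.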
What are some of the interesting ways that data science is used in the health sector?
Exploratory data analyses are starting to be used across healthcare. On the provider side, health systems are exploring operational, clinical, and financial data to identify areas for improvement. Predictive analyses are starting to be applied across these datasets as well (although they are in the earlier stages of adoption). Specific examples include predicting clinical events (e.g. heart failure readmissions), operational events (e.g. hospital departmental volume), and financial events (e.g. patient expenditures based on their clinical profile). We actually have several freely accessible white papers on the topic.
You’ve worked in a lot of different industries, from finance to healthcare. What are some mistakes you’ve routinely seen people in different industries make when it comes to data?
Data does not get generated automatically and perfectly, but is instead generated based on human-designed processes. Even machine log files are generated based on specifically defined parameters. No matter what industry one is in, it is critical to understand very granularly how the data is generated. This time investment will pay off in spades
- during the initial data analysis process (e.g. understanding why there is missing or dirty data),
- when trying to draw conclusions based on the data, and
- when trying to implement interventions based on the results.
The result of an analysis is ideally an action or intervention. Without knowing the process by which the data is generated, it is difficult to understand how to implement that action accurately.
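Editor's note: one small, practical way to start on the first point above is to count how often each field is missing before doing any analysis, then ask the people who own the process why. The sketch below is our own illustration using only Python's standard library. The sample records, the column names, and the list of tokens treated as "missing" are all made up for the example and would need to be adapted to a real dataset.

```python
import csv
import io
from collections import Counter

# Made-up sample records; a real project would read these from a file or database.
SAMPLE_CSV = """record_id,department,discharge_date
1,cardiology,2016-01-03
2,,2016-01-05
3,oncology,
4,N/A,2016-01-09
"""


def missing_counts(rows, missing_tokens=("", "N/A", "NULL")):
    """Count, per field, how many rows hold a value we treat as missing."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            # DictReader yields None for fields absent from a short row.
            if value is None or value.strip() in missing_tokens:
                counts[field] += 1
    return dict(counts)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = missing_counts(rows)
```

Here `report` flags `department` and `discharge_date` as having gaps. The counts alone do not say why, which is where knowing how the data was generated comes in: a blank department might mean a data entry step was skipped, while `N/A` might be a deliberate code.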
Is Health Catalyst hiring and what kinds of people are you guys looking for?
We are! Check out our job openings.
Editor's note: Want to begin your own journey? Start learning now.