Five Real-World Tasks You’ll Need to be Hired as a Data Analyst

I was reading a bunch of questions this morning on the r/datascience subreddit and I kept referencing the following internal research we conducted for the answers. We hope sharing it will help you on your data science journey.

Background

A Dataquest learning path contains several courses.  The goal of a path is to help students solve a set of several real-world data analysis tasks and gain the knowledge they need to start data science careers.  These may eventually be represented as some sort of capstone projects.

The brief was to describe five real-world tasks that someone needs to be able to accomplish to be hired as a data analyst.  These should cover the primary skills that 80% of data analysts need to succeed.

Process

There were two main sources used as research for this project:

  • Interviews our marketing team had conducted with data scientists, data analysts, and people who hire for both roles.
  • Online articles about the skills required for data analysts and data scientists, "day in the life" articles, etc.

Method

To keep within time constraints, a more rigorous research process involving conducting interviews and/or analyzing job listings was avoided in favor of some intuition and judgment. First, a list summarizing the different skills was produced:

Hard Skills
  • Data Cleaning
  • SQL
  • Python
  • Pandas
  • NumPy
  • Jupyter
  • Creating Data Visualizations
  • Working with missing data
  • Ensuring your data/analysis has integrity
  • Reading Excel files
  • Analyzing data
  • A/B Testing
  • R
  • Statistics
  • Probability
  • Automation (e.g. reporting)
  • Creating dashboards
  • Using Excel
  • Tableau
  • Forecasting
  • Root cause analysis
  • APIs
  • WYSIWYG analytics tools (e.g. Google Analytics)
Soft Skills
  • Explain things to a non-technical audience
  • Extract requirements from business question
  • Understand a specific business domain
  • Curiosity
  • Problem Solving
  • Business Acumen
  • How to effectively communicate with data viz
  • Converting a business question into a data question
  • Teamwork
  • Critical Thinking
  • Writing

Deliverable One: Five Data Analyst Tasks

1. Answer a business question using SQL

SQL is perhaps the most important skill in the toolbox of a data analyst, and this task is focused on using SQL to answer a business question.  In some instances, there will be a tool that facilitates connecting to a database, writing queries, and creating charts from the data (e.g. Mode), but in other cases, this will require connecting to the database and making charts using Python. The deliverable for this task is a more formal report.

  • Meet with business stakeholder and understand their needs.
  • Convert the business question into a data question.
  • Write SQL queries to answer the question, including advanced techniques like multiple joins, aggregation, window functions, etc.
  • Analyze the data, ensuring that the analysis is statistically valid, creating recommendations as necessary.
  • Create a report summarizing the findings that includes text, data, and visualizations.
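
As a rough illustration of the query-writing step, here is a minimal sketch using Python's built-in sqlite3 module. The tables, rows, and the business question are all invented for this example, and window functions assume a reasonably recent SQLite (3.25 or later). It combines a join, an aggregation, and a window function to rank customers by spend within each region.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'North'), (3, 'South');
    INSERT INTO orders VALUES
        (1, 1, 100.0), (2, 1, 50.0), (3, 2, 200.0), (4, 3, 75.0);
""")

# Business question: who are the top-spending customers in each region?
# Data question: total order amount per customer, ranked within region.
query = """
    WITH totals AS (
        SELECT c.region, c.id AS customer_id, SUM(o.amount) AS total_spent
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.region, c.id
    )
    SELECT region,
           customer_id,
           total_spent,
           RANK() OVER (
               PARTITION BY region ORDER BY total_spent DESC
           ) AS rank_in_region
    FROM totals
    ORDER BY region, rank_in_region
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In a real project the connection would point at the company database rather than an in-memory one, and the resulting rows would feed the charts and written report.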

2. Use Python to clean and analyze data exported from an existing system

A common task is to analyze data from a system that either doesn't include analysis features or where the analysis required is more complex than the features allow.  In this case, the data analyst will need to either export data from the system or access the data via an API, and then perform the analysis in Python. The deliverable for this task is less formal, requiring a brief written answer with supporting data.

  • Meet with business stakeholder to understand requirements.
  • Load the data from CSV or JSON files, or via an API.
  • Use Python + Pandas to clean and convert data into the required format for analysis.
  • Analyze data, ensuring analysis is statistically valid.
  • Summarize findings in writing, supporting with data as required.
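
The cleaning and analysis steps above might look something like the following sketch. The export contents and column names are made up for illustration, and the choice to leave out rows with a missing amount is just one possible way to handle missing data, not a recommendation for every case.

```python
import io

import pandas as pd

# Hypothetical export, invented for illustration: inconsistent casing,
# stray whitespace, a missing value, and a duplicated row.
raw_csv = io.StringIO(
    "order_id,customer,amount,order_date\n"
    "1, Alice ,100.50,2023-01-05\n"
    "2,BOB,,2023-01-06\n"
    "2,BOB,,2023-01-06\n"
    "3,alice,20.00,2023-01-07\n"
)

df = pd.read_csv(raw_csv)

# Clean: drop exact duplicates, normalize text, parse dates.
df = df.drop_duplicates()
df["customer"] = df["customer"].str.strip().str.lower()
df["order_date"] = pd.to_datetime(df["order_date"])

# Record how much data is missing before deciding how to handle it.
missing_amounts = int(df["amount"].isna().sum())

# Analyze: total spend per customer, leaving out rows with no amount.
spend = df.dropna(subset=["amount"]).groupby("customer")["amount"].sum()

print(f"Rows with missing amount: {missing_amounts}")
print(spend)
```

Reporting the count of missing values alongside the result keeps the written answer honest about what the numbers do and do not cover.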

3. Automate Reporting

Many companies will prepare reports on a regular basis using a tool like Excel, which can be extremely time-consuming.  Data Analysts will often be tasked with creating scripts which can automate this reporting. In this instance, the deliverable will be an Excel spreadsheet so that the report can be easily read by non-technical members of the company.

  • Start with an existing report or series of reports.  The reports are often in spreadsheet form, but could also be a PDF or similar.
  • Analyze the work in the reports, and identify the sources of the data, which may be spreadsheets, CSVs, or third-party tools.
  • Create a Python script that can be used to automate the report creation, saving the final deliverable as an Excel spreadsheet.
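
A script like the one described in the last step could be sketched as below. The data, the column names, and the function names build_report and save_report are hypothetical. Writing the Excel file with pandas also assumes an Excel engine such as openpyxl is installed, so the save step is defined here but not run.

```python
import pandas as pd


def build_report(sales: pd.DataFrame) -> pd.DataFrame:
    """Summarize raw sales rows into one line per region."""
    summary = (
        sales.groupby("region", as_index=False)
        .agg(orders=("amount", "size"), revenue=("amount", "sum"))
        .sort_values("revenue", ascending=False)
        .reset_index(drop=True)
    )
    return summary


def save_report(summary: pd.DataFrame, path: str) -> None:
    """Write the summary to an Excel file (needs an engine such as openpyxl)."""
    summary.to_excel(path, sheet_name="Summary", index=False)


# Hypothetical source data standing in for a CSV or spreadsheet export.
sales = pd.DataFrame(
    {
        "region": ["North", "South", "North", "West"],
        "amount": [120.0, 80.0, 30.0, 200.0],
    }
)
report = build_report(sales)
print(report)
```

Calling save_report(report, "weekly_report.xlsx") would then produce the spreadsheet, and the whole script could be scheduled to run whenever the report is due.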

4. Transform and prepare data to move from one system to another

There will often be situations where data needs to be transferred from one system to another where one or both systems don't have APIs.  An example might be an internal customer database containing email addresses that need to be imported into Facebook for targeted advertising.  In these cases, the data is exported from the first system, cleaned and transformed into a format supported by the second system, and then imported into the second system.

  • Get data exported from system A as a flat file.
  • Perform cleaning, aggregation or other transformation using Python.
  • Export the transformed data as a flat file in a format suitable for system B, and import it into system B.
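
Here is a minimal sketch of that export, transform, import flow using only the standard library. The source column names and the single-column target layout are assumptions made up for this example; a real system B will document the format it actually expects.

```python
import csv
import io


def prepare_for_import(export_text: str) -> str:
    """Turn a system A export into the flat file assumed for system B."""
    reader = csv.DictReader(io.StringIO(export_text))
    seen = set()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["email"])  # assumed target layout: one column
    for row in reader:
        email = (row.get("Email Address") or "").strip().lower()
        # Skip blanks, obviously malformed values, and repeats.
        if "@" not in email or email in seen:
            continue
        seen.add(email)
        writer.writerow([email])
    return out.getvalue()


# Hypothetical export from system A.
export = (
    "Customer Name,Email Address\n"
    "Ann Lee, Ann.Lee@Example.com \n"
    "Bo Chen,bo@example.com\n"
    "Ann L.,ann.lee@example.com\n"
    "No Email,\n"
)
print(prepare_for_import(export))
```

The "@" check is deliberately crude. It only filters out values that cannot possibly be email addresses, which is often enough for a one-off transfer.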

5. Analyze data drawn from multiple systems

In order to answer certain business questions, often data from multiple systems will need to be combined, or data from one system will need to be augmented with a second source.  In these instances, the data analyst will need to prepare the data and find the appropriate attributes to join the two sources, as well as the best joining technique. The analyst may also need to deal with multiple sources of truth, and account for duplicate or missing data.

  • Meet with business stakeholder to understand requirements.
  • Explore the data available in different systems in order to facilitate the analysis.
  • Retrieve data from the systems by exporting files, running SQL queries, or calling APIs.
  • Analyze where data can be joined, and what data integrity issues may be present.
  • If required, liaise with the stakeholder at this stage to help with any decisions on sources of truth.
  • Use Python and Pandas to clean, prepare, and join the data.
  • Analyze data, ensuring analysis is statistically valid.
  • Summarize findings in writing, supporting with data as required.
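
The joining steps above can be sketched with pandas as follows. The two extracts, their columns, and the choice of email as the join key are all hypothetical. The merge uses an outer join with indicator=True so that rows found in only one system stay visible instead of silently disappearing.

```python
import pandas as pd

# Hypothetical extracts from two systems that share a customer email.
crm = pd.DataFrame(
    {
        "email": [
            "ann@example.com",
            "BO@example.com ",
            "cy@example.com",
            "ann@example.com",
        ],
        "plan": ["pro", "free", "pro", "pro"],
    }
)
billing = pd.DataFrame(
    {
        "email": ["ann@example.com", "bo@example.com", "di@example.com"],
        "revenue": [120.0, 0.0, 45.0],
    }
)

# Normalize the join key, then drop duplicates so the join does not
# multiply rows.
for frame in (crm, billing):
    frame["email"] = frame["email"].str.strip().str.lower()
crm = crm.drop_duplicates(subset="email")

# An outer join with indicator=True shows which rows matched and which
# exist in only one system. validate raises if a key is still duplicated.
merged = crm.merge(
    billing, on="email", how="outer", indicator=True, validate="one_to_one"
)
print(merged["_merge"].value_counts())

# Analyze only the customers present in both systems.
matched = merged[merged["_merge"] == "both"]
revenue_by_plan = matched.groupby("plan")["revenue"].sum()
print(revenue_by_plan)
```

The unmatched rows are exactly the cases worth raising with the stakeholder when deciding which system is the source of truth.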

Omitted Skills

The definition of a data analyst is broad, the variety of work performed is wide, and only five tasks could be selected, so some skills were omitted.  Deciding which skills and tasks matter most involves a certain amount of intuition and subjectivity, so some notable omissions are listed below, along with brief thoughts on each:

Using Excel/spreadsheets for data analysis: This is an important and common skill, especially where data analyst roles lean more towards business analyst roles.

Tableau:  This came up frequently in online articles for my research.  Again, this might lean towards business analyst roles, but there would be significant benefit to know some basics here.

Team collaboration and related tools: Understanding how to work on a team analysis project is a vital skill, especially at larger companies.  There is some spirit of this in tasks that involve liaising with a business stakeholder, but working with other analysts is important too.

Forecasting: Making projections, for instance around production numbers or sales, is sometimes included as part of a data analyst's role.

Conducting and analyzing A/B tests: For larger companies and more complex cases, this is a task that leans more into the domain of a data scientist.  That said, where the cases are simple and easy tooling is available, this could definitely be an area that a data analyst would contribute to. 
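
For the simple cases mentioned here, one common approach is a two-proportion z-test on conversion rates. The sketch below uses only the standard library. The visitor and conversion counts are invented, and the function name two_proportion_z_test is hypothetical. It assumes samples large enough for the normal approximation to be reasonable.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical results: 120 of 1,000 visitors converted on A, 150 of 1,000 on B.
z, p_value = two_proportion_z_test(120, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```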

Dashboards: Depending on the company, there may be access to a tool like Mode Analytics, which allows dashboards to be created using the skills covered in the other tasks.

Good luck!

If you'd like to try a free course you can sign up here.

