In this interactive Data Structures in R course, we have learned a good deal about data structures such as vectors, matrices, lists, and data frames. To keep the learning smooth and efficient, we covered each of these topics in isolation. In this guided project, we go one step further and combine all of these skills to perform a practical data analysis.
A pneumonia of unknown cause detected in Wuhan, China, was first reported internationally on 31 December 2019. Today we know this disease as COVID-19, caused by the virus SARS-CoV-2 and more casually referred to as the coronavirus. Since then, the world has been engaged in the fight against this pandemic, and several measures have been taken to "flatten the curve". As a result, we have experienced social distancing, and many people have lost their lives.
In solidarity against this unprecedented global crisis, several organizations did not hesitate to share datasets that make it possible to carry out many kinds of analysis in order to understand the pandemic.
As data scientists, it is natural for us to analyze these datasets ourselves to answer our own questions, since we cannot always rely on the news.
In this guided project, you will build your skills and your understanding of the data analysis workflow by evaluating the COVID-19 situation using one such dataset. At the end of the project, feel free to download the updated version of the dataset and follow the same steps to analyze it. The project is organized as follows:
- Guided Project Introduction
- Understanding the Data
- Isolating the Rows We Need
- Isolating the Columns We Need
- Extracting the Top Ten Countries by Tested Cases
- Identifying the Highest Ratios of Positive to Tested Cases
- Keeping Relevant Information
- Putting it All Together
- Next Steps