Joining Data

Preface

Who wouldn't want to skip the nitty-gritty of data cleaning altogether? Yet real-world data, while containing a massive amount of information, are quite often messy and demand a lot of work.

However, this seemingly mundane task can be interesting and informative, especially if you lack the domain knowledge: you get to learn which features are essential for answering your burning questions. In this post, let's explore the different approaches in R and Python for bringing separate datasets together and joining the dots.

The data used in this post come from the Scottish Heart Disease Statistics dataset and the Scottish Stroke Statistics dataset, both provided by Public Health Scotland.