topic: Data Visualisation
Data visualisation is a quick way to see what is going on in your data. There are many ways to visualise data, but a few basic questions always apply:
- Is the graph easily readable?
- Is this the appropriate graph for the type of data / information that I am trying to display?
- Is the graph accurately representing the data?
- What story am I trying to tell? Is the story clear from the graph?
- Have I given the reader all the information they need to interpret this graph at a glance? Are the axes labelled and, where relevant, is there a legend and a title?
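As a rough illustration of the checklist above, here is a minimal sketch of a line graph with labelled axes, a legend and a title. It assumes Matplotlib is installed; the sales figures, region names and output filename are made up for the example and do not come from any real dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical data, invented purely for illustration
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales_north = [12, 15, 14, 18, 21, 24]
sales_south = [10, 11, 13, 13, 16, 17]

fig, ax = plt.subplots(figsize=(6, 4))

# One line per region, with a label so the legend can identify it
ax.plot(months, sales_north, marker="o", label="North")
ax.plot(months, sales_south, marker="s", label="South")

# Give the reader what they need to interpret the graph at a glance
ax.set_title("Monthly sales by region (made-up data)")
ax.set_xlabel("Month")
ax.set_ylabel("Sales (thousands of units)")
ax.legend(title="Region")

# Start the y-axis at zero so differences are not visually exaggerated
ax.set_ylim(bottom=0)

fig.tight_layout()
fig.savefig("labelled_line_graph.png", dpi=150)  # hypothetical output filename
```

Starting the y-axis at zero is one possible way to keep the graph an accurate representation of the data; whether it is the right choice depends on what you are plotting.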
This website shows some of the most frequently used types of graphs and the kinds of data each is most suitable for. Below, we give a quick summary:
- Show the distribution of a single continuous variable: histograms
- Show the number of observations in a category: donut plots, tree plots, bar graphs
- Show averages or counts by category: bar graph
- Show average and distribution by category: box plot
- Show the relationship between two variables: scatterplots, line graphs
- Show changes over time: line graph (or a bubble chart if you have several variables)
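The summary above can be sketched in code. The example below draws five of the listed graph types with Matplotlib (assumed to be installed), using randomly generated, hypothetical data. The variable names, labels and output filename are placeholders chosen for this illustration only.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt

random.seed(0)  # make the made-up data reproducible

# Hypothetical data, invented purely for illustration
heights = [random.gauss(170, 10) for _ in range(200)]           # one continuous variable
categories = ["A", "B", "C"]
counts = [23, 45, 12]                                           # observations per category
groups = [[random.gauss(mu, 5) for _ in range(50)] for mu in (50, 60, 55)]
x = [random.uniform(0, 10) for _ in range(100)]
y = [2 * xi + random.gauss(0, 2) for xi in x]                   # two related variables
years = list(range(2015, 2025))
values = [100 + 5 * i + random.gauss(0, 3) for i in range(len(years))]

fig, axes = plt.subplots(2, 3, figsize=(12, 7))
(ax_hist, ax_bar, ax_box), (ax_scatter, ax_line, ax_unused) = axes

# Distribution of a single continuous variable: histogram
ax_hist.hist(heights, bins=20)
ax_hist.set(title="Histogram", xlabel="Height (cm)", ylabel="Number of people")

# Counts by category: bar graph
ax_bar.bar(categories, counts)
ax_bar.set(title="Bar graph", xlabel="Category", ylabel="Count")

# Average and distribution by category: box plot
ax_box.boxplot(groups)
ax_box.set_xticks([1, 2, 3])
ax_box.set_xticklabels(categories)
ax_box.set(title="Box plot", xlabel="Category", ylabel="Score")

# Relationship between two variables: scatterplot
ax_scatter.scatter(x, y, s=10)
ax_scatter.set(title="Scatterplot", xlabel="x", ylabel="y")

# Changes over time: line graph
ax_line.plot(years, values, marker="o")
ax_line.set(title="Line graph", xlabel="Year", ylabel="Value")

ax_unused.axis("off")  # the sixth panel is not needed

fig.tight_layout()
fig.savefig("graph_types.png", dpi=150)  # hypothetical output filename
```

Donut plots, tree plots and bubble charts are left out to keep the sketch short.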
Tutorials
To get you introduced to data visualisation, complete the following tutorial(s):
Compulsory: Go through the DataCamp course Introduction to Data Visualisation with Python.
Optional: Complete Kaggle’s Data Visualisation: From Non-Coder to Coder.
Advanced: Making interactive graphs with Plotly
Here is a great walk-through of different types of plots in Plotly with Cufflinks.