Emilia Kalliri’s research is focused on the role of visualisation in data journalism and the link between data visualisation and psychology. The impact of colour and visual perception on cognitive functions such as memory and attention is a key part of her research. She developed a best practice guideline aimed at helping data journalists and editors create better data visualisations.
Her best practice guideline aims to inform journalists about how different colours can affect different people, depending on their cultures. It gives a brief summary of the Basic Colour Theory known as the HSL model. She also explains how different colour palettes suit different data visualisations depending on the type of data (qualitative or quantitative). The examples she provides illustrate her points and make the information and the scientific theories more digestible for the non-academic readership of the guideline.
How to choose the best colours for effective data visualisation
Data visualisation aims to help the audience better understand and remember the most crucial information of a story. Colour is a useful tool that has been found to impact our mood, memory and attention. Therefore, proper colour selection can determine the way we perceive and understand a visualisation.
Colour is more easily recognised and retained than other visualisation attributes such as the shapes of items. However, to use it effectively in data visualisations, we must understand the basic theory around colour and follow best practices. So, let’s look at the steps through which colour can strategically make our visualisations more effective for a viewer.
1. Understand the basic attributes of colour
Colour can affect each person differently since each culture understands colours uniquely. First, however, we will focus on a basic colour theory known as the HSL model (see Figure 1), in which colour consists of three different channels: hue, saturation and luminance.
- Hue is what we usually mean by the name of a colour: red, orange, yellow, green, blue, indigo, violet and so on. Each hue can be placed on a scale from 0° to 359° to form a colour wheel.
- Saturation is the intensity of a colour, that is, how vivid it is compared with a grey of the same lightness. Vivid colours such as pure red or blue are highly saturated, while earth tones are desaturated and black, grey and white have no saturation at all.
- Luminance describes how light or dark a colour is, ranging from black at the lowest value to white at the highest.
Figure 1: The HSL model – colour wheel
Luminance and saturation are most effective for ordinal and continuous data since they have an inherent ordering, whereas hue is excellent for categorical data, where the aim is to compare individual items.
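To make the three channels concrete, here is a minimal Python sketch that converts an HSL colour into a hex code using the standard library's `colorsys` module. The helper name `hsl_to_hex` is an illustrative assumption, not part of the guideline; note that `colorsys` takes its arguments in hue, lightness, saturation order and expects each as a fraction between 0 and 1.

```python
import colorsys

def hsl_to_hex(hue_deg, saturation, lightness):
    """Convert an HSL colour to a '#rrggbb' string (hypothetical helper).

    hue_deg: angle on the colour wheel in degrees
    saturation, lightness: fractions between 0.0 and 1.0
    """
    # colorsys wants hue as a fraction of the wheel, in H, L, S argument order
    r, g, b = colorsys.hls_to_rgb((hue_deg % 360) / 360.0, lightness, saturation)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

print(hsl_to_hex(0, 1.0, 0.5))     # fully saturated red
print(hsl_to_hex(0, 0.0, 0.5))     # same hue with no saturation: a mid grey
print(hsl_to_hex(240, 1.0, 0.25))  # blue hue at low lightness: a dark blue
```

Changing only one argument at a time shows how each channel behaves on its own: the hue moves around the wheel, the saturation fades towards grey, and the lightness moves between black and white.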
Click here for an interactive link to understand the HSL model.
2. Identify the colour meanings
Another aspect of colour is its relationship with the way we feel and experience our feelings. Green creates positive emotions such as calm, comfort, peace, hope, and environmental friendliness, while deep green can represent profit. In marketing, green is used in visualisations to represent gain. Red inspires positive emotions such as excitement and desire; however, it can also be related to negative feelings such as danger or alarm. In data visualisations, red is used to represent loss.
Figure 2: Basic colours and emotions
As mentioned above, it is also important to note that different cultures give different meanings to each colour. For example, red can be related to passion or danger in Western cultures, but success and good luck in Eastern ones. Therefore, this parameter should be kept in mind if the visualisation is going to be presented to a broader audience.
3. Select the best colour palette
Three main types of colour palette help to make data visualisations more effective. The ones most commonly used in the media world are:
- Sequential palette
- Diverging palette
- Categorical/Qualitative palette
The sequential palette creates a gradient effect based on lightness and/or hue and is mainly used to depict ordinal, ratio or interval variables. The central aspect of a sequential palette is that it uses monochromatic colours, with luminance and saturation defining the scale. Sequential palettes can also use analogous colours.
The visualisation below demonstrates how the sequential palette can be applied to a choropleth map and how hue and lightness are used. It offers a smooth gradient of a single colour (blue) to show the total vaccinations per 100 people in the total population. Lower numbers of vaccinations are shown in lighter blue, while higher numbers are shown in darker blue.
Table 1 Sequential colour palette and example of data visualisation
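As an illustration of the idea, the following Python sketch builds a single-hue sequential palette by keeping hue and saturation fixed and stepping the lightness from light to dark. The function names, the default lightness range and the equal-width binning are assumptions made for this example, not values taken from the guideline or from the map described above.

```python
import colorsys

def sequential_palette(hue_deg, steps, lightest=0.9, darkest=0.25, saturation=0.6):
    """Return `steps` hex colours of one hue, ordered from light to dark."""
    if steps < 2:
        raise ValueError("a sequential palette needs at least two steps")
    colours = []
    for i in range(steps):
        # Linear interpolation of lightness: first colour is lightest
        lightness = lightest + (darkest - lightest) * i / (steps - 1)
        r, g, b = colorsys.hls_to_rgb(hue_deg / 360.0, lightness, saturation)
        colours.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colours

def colour_for(value, vmin, vmax, palette):
    """Pick the palette colour for `value` using equal-width bins on [vmin, vmax]."""
    fraction = (value - vmin) / (vmax - vmin)
    index = int(fraction * len(palette))
    # Clamp so that vmax (and out-of-range values) still map to a valid bin
    return palette[max(0, min(index, len(palette) - 1))]

blues = sequential_palette(210, 5)   # hue 210 is a blue; an arbitrary choice
print(blues)
print(colour_for(12, 0, 100, blues))  # low value: a light blue
print(colour_for(95, 0, 100, blues))  # high value: a dark blue
```

Equal-width bins are only one option; quantile-based bins may suit skewed data better.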
The diverging palette is ideal for numerical data that has a meaningful central value, such as zero, which acts as a neutral midpoint. A distinct hue is used for each of the two component sequential palettes, making it easier to distinguish values above and below the centre.
A diverging palette works well with two contrasting hues, such as complementary colours. In other words, it is a combination of two sequential palettes that share a common endpoint at the central value.
For example, in the visualisation below, a diverging palette was used in the choropleth map. It uses two sequential palettes on a scale representing the percentage of votes during the 2020 United States elections. Two main contrasting hues – red and blue – were used in the diverging palette to illustrate each political side. Lighter colours represent states with a lower share of votes, while darker colours represent states with a higher share.
Table 2 Divergent colour palette and example of data visualisation
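The description of a diverging palette as two sequential palettes meeting at a neutral centre can be sketched in Python as follows. The function names, the near-white midpoint and the default lightness and saturation values are illustrative assumptions rather than settings from the guideline or from the election map.

```python
import colorsys

def _to_hex(hue_deg, lightness, saturation):
    """Format an HSL colour as '#rrggbb' (colorsys uses H, L, S order)."""
    r, g, b = colorsys.hls_to_rgb(hue_deg / 360.0, lightness, saturation)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

def diverging_palette(low_hue, high_hue, steps_per_side, darkest=0.3, saturation=0.65):
    """Two single-hue ramps that share a neutral, near-white midpoint."""
    neutral = 0.95  # lightness of the shared central colour (assumed value)
    # Lightness values running from just below neutral down to `darkest`
    ramp = [neutral + (darkest - neutral) * i / steps_per_side
            for i in range(1, steps_per_side + 1)]
    low_side = [_to_hex(low_hue, l, saturation) for l in reversed(ramp)]  # dark to light
    high_side = [_to_hex(high_hue, l, saturation) for l in ramp]          # light to dark
    midpoint = _to_hex(0, neutral, 0.0)  # zero saturation gives a neutral grey-white
    return low_side + [midpoint] + high_side

# Hue 0 is red and hue 220 is a blue; three steps on each side of the centre
palette = diverging_palette(0, 220, 3)
print(palette)
```

The result always has an odd number of colours, so the central value gets its own neutral colour and the darkest colours sit at the two extremes.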
The categorical palette is excellent for highlighting categorical data to make individual comparisons. Categorical data represents different labels and, unlike ordinal and continuous data, has no inherent ordering. Therefore, this palette is built from colours of different hues but uniform saturation and lightness, and can be used to show dissimilar data points of different causes or unrelated values. Moreover, a categorical palette includes as many hues as the variable has values. It should contain between three and seven colours, since the brain finds it harder to retain more than this amount of information. If one needs to limit the number of categories, grouping the remainder into an “other” category is a good option.
This last data visualisation is an example of how a categorical palette can be used in a line graph. Here the graph contains seven distinct colours, each representing a different country. This allows a more effective comparison of the number of deaths per day in these countries.
Table 3 Categorical colour palette and example of data visualisation
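A minimal Python sketch of these two recommendations follows: it generates hues with uniform saturation and lightness, and folds less frequent labels into an “Other” category once the seven-colour limit is exceeded. Spacing the hues evenly around the colour wheel and ranking labels by frequency are assumptions of this example, as are the function names; the guideline itself does not prescribe how the hues or the kept categories are chosen.

```python
import colorsys
from collections import Counter

MAX_HUES = 7  # upper bound on distinct colours suggested in the text

def categorical_palette(n, saturation=0.6, lightness=0.5):
    """Return n hex colours with different hues but equal saturation and lightness."""
    if not 1 <= n <= MAX_HUES:
        raise ValueError("use between 1 and %d categories" % MAX_HUES)
    colours = []
    for i in range(n):
        hue = i / n  # evenly spaced around the colour wheel (assumed strategy)
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        colours.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colours

def limit_categories(labels, max_categories=MAX_HUES):
    """Keep the most frequent labels and fold the rest into 'Other'."""
    counts = Counter(labels)
    if len(counts) <= max_categories:
        return list(labels)
    # Reserve one slot for the 'Other' bucket itself
    keep = {label for label, _ in counts.most_common(max_categories - 1)}
    return [label if label in keep else "Other" for label in labels]

# Hypothetical labels: nine distinct categories, reduced to seven including 'Other'
labels = ["A", "A", "A", "B", "B", "C", "C", "D", "D", "E", "E", "F", "F", "G", "H", "I"]
reduced = limit_categories(labels)
palette = categorical_palette(len(set(reduced)))
print(sorted(set(reduced)))
print(palette)
```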
Click here for an interactive understanding of the palettes.
The type of data in data visualisations determines which colour palette is the most appropriate. Therefore, it is essential to consider that when deciding which colour palette to use.
The categorical palette is used mostly for qualitative data, which consists of categorical variables. For quantitative data (such as ordinal and continuous attributes), which involves values with an inherent ordering, the sequential and diverging palettes are more effective. Even though colour is a powerful tool in data visualisation, visual designers should use it carefully, only where needed, and in line with the purpose of the visualisation.
Analogous colours are three colours that sit beside each other on the colour wheel, for example red, red-orange and orange.
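The selection rule summarised above can be written as a small Python function. This is only a sketch of the guideline's reasoning: the function name, the string labels and the `has_meaningful_midpoint` flag are hypothetical names introduced for illustration.

```python
def recommend_palette(data_type, has_meaningful_midpoint=False):
    """Suggest a palette type for a data type, following the rule of thumb in the text.

    data_type: "categorical", "ordinal" or "continuous"
    has_meaningful_midpoint: True when values diverge around a central value such as zero
    """
    kind = data_type.strip().lower()
    if kind == "categorical":
        # No inherent ordering: distinguish items by hue
        return "categorical"
    if kind in ("ordinal", "continuous"):
        # Ordered data: use two ramps only when there is a neutral centre
        return "diverging" if has_meaningful_midpoint else "sequential"
    raise ValueError("unknown data type: %r" % data_type)

print(recommend_palette("categorical"))
print(recommend_palette("continuous"))
print(recommend_palette("continuous", has_meaningful_midpoint=True))
```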
Attention is the brain’s procedure of concentrating and keeping the mind focused on a task.
Categorical data is a statistical type of qualitative data whose values are labels. The labels may be coded as numbers (for example, “1” indicating male and “2” indicating female), but those numbers have no mathematical meaning.
Complementary colours sit opposite each other on the colour wheel and are often combined with a neutral colour (e.g. white, black or grey).
Continuous data is a statistical type of quantitative data that can take any value within a range, such as height or weight.
Memory is the ability of the brain to encode, store and retrieve information or past experiences.
Monochromatic colours are all colours that have shades or tones of a single hue.
Mood is the temporary state of mind or feeling.
Ordinal data is a statistical type of quantitative data in which values are in a natural order.
Eysenck, M.W. (2009). London (UK): Psychology Press.
Farnsworth, A. (2021). Covid map: Coronavirus cases, deaths, vaccinations by country. Retrieved 5 June 2021, from https://www.bbc.com/news/world-51235105
Gutiérrez, P., Clarke, S., & Kirk, A. (2021). Covid world map: which countries have the most coronavirus vaccinations, cases and deaths?. Retrieved 5 June 2021, from https://www.theguardian.com/world/2021/mar/19/covid-world-map-which-countries-have-the-most-coronavirus-vaccinations-cases-and-deaths
Munzner, T. (2014). Visualization Analysis and Design. A K Peters Visualization Series, CRC Press.
Pan, Y. (2010). Attentional capture by working memory contents. Canadian Journal of Experimental Psychology, 64(2), 124-128.