
Introduction to Treemaps in Plotly Express (Python)



The Jupyter Notebook 'Treemaps.ipynb', which is available on GitHub, contains the code used in this article.



What is Plotly Express

Plotly Express is built on top of the Plotly library and can create figures with just a single line of code. Plotly is free and open source.

Plotly is a great visualization library mainly because it combines interactivity, attractive default styling and a wide range of chart types.


What is a Treemap

Treemaps allow you to visualize hierarchical data using nested rectangles. You can click on a specific sector to drill down and roll up the hierarchies that you specify.


Treemaps can be used within a limited space to display a large number of items simultaneously.
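
To make the idea of sizing rectangles by value concrete, here is a minimal sketch of the simplest possible layout: one rectangle cut into vertical strips whose areas are proportional to their values. The function name slice_layout and the sample values are my own, made up for illustration. Plotly's own tiling algorithm is more elaborate, so treat this as a picture of the principle, not of what Plotly does internally.

```python
def slice_layout(items, x, y, width, height):
    """Split one rectangle into side-by-side strips, one per item.

    items: list of (label, value) pairs with positive values.
    Returns a list of (label, x, y, width, height) tuples.
    """
    total = sum(value for _, value in items)
    rects = []
    offset = x
    for label, value in items:
        # Each strip's share of the width equals its share of the total value.
        strip_width = width * value / total
        rects.append((label, offset, y, strip_width, height))
        offset += strip_width
    return rects


# Hypothetical values, chosen only to keep the arithmetic easy to follow.
items = [("A", 50), ("B", 30), ("C", 20)]
rects = slice_layout(items, x=0, y=0, width=100, height=10)

for label, rx, ry, rw, rh in rects:
    print(label, "area =", rw * rh)
```

Applying the same function again inside each strip, with that strip's children as the items, is what produces the nested rectangles of a hierarchy.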


When color and size are both mapped to data, you can spot patterns that would be difficult to see in other ways, for example when the largest rectangles all share a similar color.


Installation

Plotly Express ships as a module inside the Plotly library, so to use Plotly Express you’ll need to install Plotly.


To install Plotly you can type either of the following in your terminal or command prompt:

pip install plotly

If you are using the Anaconda environment you can type

conda install plotly

You can import Plotly Express using the following line of code:

import plotly.express as px

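The general pattern for any Plotly Express chart is shown below, where chart is a placeholder for a chart function such as scatter, bar or treemap: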

px.chart(data_frame, parameters)

For more detail, refer to Plotly's official installation guide: https://plotly.com/python/getting-started/#installation


Note: I am using Jupyter Notebook as my preferred environment, but you can use others such as Google Colaboratory (Colab), Visual Studio Code or even the Python shell. Some of these may need a compatible renderer to display your charts, so check Plotly's guidance on renderers if a chart does not appear.


Dataset Overview

As with other libraries in Python like Seaborn, Plotly has a number of built-in sample datasets.


I’ve selected the gapminder dataset which returns a pandas DataFrame consisting of Country names, Continents and key indicators such as GDP per Capita and Life Expectancy across various years.


The dataset is returned as a pandas DataFrame, so import pandas first in case you want to do any further data cleansing.

import pandas as pd

You can load these sample datasets from the px.data module. The syntax is as follows:

px.data.gapminder()

I’ll assign the DataFrame to the variable df and, at the same time, filter it to include only the records for the year 2007:

df = px.data.gapminder().query('year==2007')

The data looks like this:

General Syntax for Plotly Express Treemap Charts

px.treemap(data_frame, path, values, parameters)

The essential parameters are:


data_frame: the pandas DataFrame that holds the data

path: a list of columns that defines the hierarchy of the rectangles, ordered from the outermost level to the innermost

values: the column (or array-like) whose values determine the size of each sector


There are other useful parameters that we will also cover in this article. For the full list, refer to the official API reference listed under Sources.


Let's create a treemap from the df DataFrame with one sector per country, where the size of each sector is determined by the country's population.


To do this, we pass the column 'country' to the path argument and the column 'pop' to the values argument, as follows:

px.treemap(data_frame=df, path=['country'], values='pop')

Adding additional columns to your hierarchy

You can add more columns to the hierarchy via the path argument, which determines how the sectors are nested. Let's add 'continent' to the list passed to path, placing it before 'country'.

px.treemap(data_frame=df, path=['continent','country'], values='pop')

Now the outermost sectors are separated by continent, and within each continent sector are the countries that belong to it.


The chart above is interactive. Try selecting a continent and then a country.


Coloring each sector

You can change the color of each sector by using the color argument. Let's color the sectors by population by passing 'pop' to the color argument.

px.treemap(data_frame=df, path=['continent','country'], values='pop', color='pop')

Now a color bar appears alongside the chart to show how colors map to values. The brighter yellow rectangles represent the highest populations and the purple rectangles represent the lowest.


You can also use the color_continuous_scale argument to apply a specific color scale. The list of options can be found on the built-in color scales page listed under Sources. Let's change the color scale to 'orrd':

px.treemap(data_frame=df, path=['continent','country'], values='pop', color='pop', color_continuous_scale='orrd')

Let's change it once more, this time to 'rdbu':

px.treemap(data_frame=df, path=['continent','country'], values='pop', color='pop', color_continuous_scale='rdbu')

Conclusion

Treemap charts are great for visualizing hierarchical data and spotting trends.


However, they do have certain limitations: they cannot display negative values, and they are less useful when the values vary so widely that the smallest sectors become too small to read.
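
One way to work around the first limitation is to separate out the rows that a treemap cannot size before plotting. The sketch below uses a small, entirely hypothetical DataFrame of profit figures. The column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical profit figures, including a loss and a zero.
profits = pd.DataFrame(
    {"product": ["A", "B", "C", "D"], "profit": [120, -30, 45, 0]}
)

# Keep only the strictly positive values for the treemap.
plottable = profits[profits["profit"] > 0]

# Keep track of what was left out so it can be reported separately.
dropped = profits[profits["profit"] <= 0]

print(plottable)
print("Left out of the treemap:", dropped["product"].tolist())
```

Whether dropping these rows is acceptable depends on the story the chart needs to tell. For data where losses matter, a different chart type may be a better fit.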


When used with the right data set they can be a valuable tool in your arsenal.


Sources

https://plotly.com/python/treemaps/

https://plotly.com/python-api-reference/generated/plotly.express.treemap.html

https://plotly.com/python/builtin-colorscales/
