Note
Several examples in this chapter use Pandas, for ease of presentation and because it is a common tool for data manipulation. However, Pandas is not required to create anything shown here.
Bokeh makes it simple to create basic bar charts using the hbar() and vbar() glyph methods. In the example below, we have the following sequence of simple 1-level factors:
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
To inform Bokeh that the x-axis is categorical, we pass this list of factors as the x_range argument to figure():
p = figure(x_range=fruits, ... )
Note that passing the list of factors is a convenient shorthand notation for creating a FactorRange. The equivalent explicit notation is:
p = figure(x_range=FactorRange(factors=fruits), ... )
This more explicit form is useful when you want to customize the FactorRange, e.g. by changing the range or category padding.
Next we can call vbar with the list of fruit name factors as the x coordinate, the bar height as the top coordinate, and optionally any width or other properties that we would like to set:
p.vbar(x=fruits, top=[5, 3, 4, 2, 4, 6], width=0.9)
All put together, we see the output:
from bokeh.io import output_file, show
from bokeh.plotting import figure

output_file("bars.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]

p = figure(x_range=fruits, plot_height=250, title="Fruit Counts",
           toolbar_location=None, tools="")

p.vbar(x=fruits, top=counts, width=0.9)

p.xgrid.grid_line_color = None
p.y_range.start = 0

show(p)
As usual, the data could also be put into a ColumnDataSource supplied as the source parameter to vbar instead of passing the data directly as parameters. Later examples will demonstrate this.
Since Bokeh displays bars in the order the factors are given for the range, “sorting” bars in a bar plot is identical to sorting the factors for the range.
In the example below the fruit factors are sorted in increasing order according to their corresponding counts, causing the bars to be sorted:
from bokeh.io import output_file, show
from bokeh.plotting import figure

output_file("bar_sorted.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]

# sorting the bars means sorting the range factors
sorted_fruits = sorted(fruits, key=lambda x: counts[fruits.index(x)])

p = figure(x_range=sorted_fruits, plot_height=350, title="Fruit Counts",
           toolbar_location=None, tools="")

p.vbar(x=fruits, top=counts, width=0.9)

p.xgrid.grid_line_color = None
p.y_range.start = 0

show(p)
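The key function in the example looks up each fruit's count with fruits.index, which rescans the list for every element and assumes the factors are unique. As a minimal sketch of an alternative in plain Python (no Bokeh needed for this step), the factors and counts can be paired with zip and sorted together. The names pairs, sorted_fruits, and descending_fruits are illustrative only.

```python
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]

# Pair each factor with its count, then sort the pairs by count.
# sorted() is stable, so ties (Nectarines and Grapes) keep their input order.
pairs = sorted(zip(fruits, counts), key=lambda pair: pair[1])
sorted_fruits = [fruit for fruit, _ in pairs]

# For the tallest bar first, reverse the sort instead.
descending_pairs = sorted(zip(fruits, counts), key=lambda pair: pair[1], reverse=True)
descending_fruits = [fruit for fruit, _ in descending_pairs]

print(sorted_fruits)
print(descending_fruits)
```

Either list can then be passed as the x_range argument to figure(), exactly as sorted_fruits is in the example.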
Often we may want to have bars that are shaded some color. This can be accomplished in different ways. One way is to supply all the colors up front. This can be done by putting all the data, including the colors for each bar, in a ColumnDataSource. Then the name of the column containing the colors is passed to vbar as the color (or line_color/fill_color) argument. This is shown below:
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource
from bokeh.palettes import Spectral6
from bokeh.plotting import figure

output_file("colormapped_bars.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]

source = ColumnDataSource(data=dict(fruits=fruits, counts=counts, color=Spectral6))

p = figure(x_range=fruits, y_range=(0,9), plot_height=250, title="Fruit Counts",
           toolbar_location=None, tools="")

p.vbar(x='fruits', top='counts', width=0.9, color='color',
       legend_field="fruits", source=source)

p.xgrid.grid_line_color = None
p.legend.orientation = "horizontal"
p.legend.location = "top_center"

show(p)
Another way to shade the bars is to use a CategoricalColorMapper that colormaps the bars inside the browser. There is a function factor_cmap() that makes this simple to do:
factor_cmap('fruits', palette=Spectral6, factors=fruits)
This can be passed to vbar in the same way as the column name in the previous example. Putting everything together we obtain the same plot in a different way:
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource
from bokeh.palettes import Spectral6
from bokeh.plotting import figure
from bokeh.transform import factor_cmap

output_file("colormapped_bars.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]

source = ColumnDataSource(data=dict(fruits=fruits, counts=counts))

p = figure(x_range=fruits, plot_height=250, toolbar_location=None, title="Fruit Counts")

p.vbar(x='fruits', top='counts', width=0.9, source=source, legend_field="fruits",
       line_color='white', fill_color=factor_cmap('fruits', palette=Spectral6, factors=fruits))

p.xgrid.grid_line_color = None
p.y_range.start = 0
p.y_range.end = 9
p.legend.orientation = "horizontal"
p.legend.location = "top_center"

show(p)
Another common operation on bar charts is to stack bars on top of one another. Bokeh makes this easy to do with the specialized hbar_stack() and vbar_stack() functions. The example below extends the fruits data from above with counts for three years, stacking the bars for each fruit type:
from bokeh.io import output_file, show
from bokeh.plotting import figure

output_file("stacked.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ["2015", "2016", "2017"]
colors = ["#c9d9d3", "#718dbf", "#e84d60"]

data = {'fruits' : fruits,
        '2015'   : [2, 1, 4, 3, 2, 4],
        '2016'   : [5, 3, 4, 2, 4, 6],
        '2017'   : [3, 2, 4, 4, 5, 3]}

p = figure(x_range=fruits, plot_height=250, title="Fruit Counts by Year",
           toolbar_location=None, tools="")

p.vbar_stack(years, x='fruits', width=0.9, color=colors, source=data,
             legend_label=years)

p.y_range.start = 0
p.x_range.range_padding = 0.1
p.xgrid.grid_line_color = None
p.axis.minor_tick_line_color = None
p.outline_line_color = None
p.legend.location = "top_left"
p.legend.orientation = "horizontal"

show(p)
Note that behind the scenes, these functions work by stacking up the successive columns in separate calls to vbar or hbar. This kind of operation is akin to the dodge example later in this chapter (i.e. the data in this case is not in a “tidy” data format).
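To make the stacking arithmetic concrete, the sketch below computes, in plain Python, the bottom and top of every layer for the fruit data used in the stacked example. The helper stack_extents is hypothetical and only illustrates the running sums; it is not how Bokeh implements stacking internally.

```python
years = ["2015", "2016", "2017"]
data = {'2015': [2, 1, 4, 3, 2, 4],
        '2016': [5, 3, 4, 2, 4, 6],
        '2017': [3, 2, 4, 4, 5, 3]}

def stack_extents(data, stackers):
    """Return {stacker: (bottoms, tops)} by accumulating the columns in order."""
    running = [0] * len(data[stackers[0]])
    extents = {}
    for name in stackers:
        bottoms = list(running)                      # layer starts where the last one ended
        tops = [b + v for b, v in zip(bottoms, data[name])]
        extents[name] = (bottoms, tops)
        running = tops
    return extents

extents = stack_extents(data, years)
for year in years:
    bottoms, tops = extents[year]
    print(year, "bottom:", bottoms, "top:", tops)
```

Each (bottoms, tops) pair corresponds to one layer of bars, with the layers drawn in the order the stack columns are listed.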
Sometimes we may want to stack bars that have both positive and negative extents. The example below shows how it is possible to create such a stacked bar chart that is split by positive and negative values:
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource
from bokeh.palettes import GnBu3, OrRd3
from bokeh.plotting import figure

output_file("stacked_split.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ["2015", "2016", "2017"]

exports = {'fruits' : fruits,
           '2015'   : [2, 1, 4, 3, 2, 4],
           '2016'   : [5, 3, 4, 2, 4, 6],
           '2017'   : [3, 2, 4, 4, 5, 3]}
imports = {'fruits' : fruits,
           '2015'   : [-1, 0, -1, -3, -2, -1],
           '2016'   : [-2, -1, -3, -1, -2, -2],
           '2017'   : [-1, -2, -1, 0, -2, -2]}

p = figure(y_range=fruits, plot_height=250, x_range=(-16, 16),
           title="Fruit import/export, by year", toolbar_location=None)

p.hbar_stack(years, y='fruits', height=0.9, color=GnBu3, source=ColumnDataSource(exports),
             legend_label=["%s exports" % x for x in years])

p.hbar_stack(years, y='fruits', height=0.9, color=OrRd3, source=ColumnDataSource(imports),
             legend_label=["%s imports" % x for x in years])

p.y_range.range_padding = 0.1
p.ygrid.grid_line_color = None
p.legend.location = "top_left"
p.axis.minor_tick_line_color = None
p.outline_line_color = None

show(p)
For stacked bar plots, Bokeh provides some special hover variables that are useful for common cases.
When stacking bars, Bokeh automatically sets the name property for each layer in the stack to be the value of the stack column for that layer. This name value is accessible to hover tools via the $name special variable.
Additionally, the hover variable @$name can be used to look up values from the stack column for each layer. For instance, if a user hovers over a stack glyph with the name "US East", then @$name is equivalent to @{US East}.
The example below demonstrates both of these hover variables:
from bokeh.io import output_file, show
from bokeh.plotting import figure

output_file("stacked.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ["2015", "2016", "2017"]
colors = ["#c9d9d3", "#718dbf", "#e84d60"]

data = {'fruits' : fruits,
        '2015'   : [2, 1, 4, 3, 2, 4],
        '2016'   : [5, 3, 4, 2, 4, 6],
        '2017'   : [3, 2, 4, 4, 5, 3]}

p = figure(x_range=fruits, plot_height=250, title="Fruit Counts by Year",
           toolbar_location=None, tools="hover", tooltips="$name @fruits: @$name")

p.vbar_stack(years, x='fruits', width=0.9, color=colors, source=data,
             legend_label=years)

p.y_range.start = 0
p.x_range.range_padding = 0.1
p.xgrid.grid_line_color = None
p.axis.minor_tick_line_color = None
p.outline_line_color = None
p.legend.location = "top_left"
p.legend.orientation = "horizontal"

show(p)
Note that it is also possible to override the value of name by passing it manually to vbar_stack and hbar_stack. In this case, @$name will look up the column names provided by the user.
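As a rough illustration of how the two hover variables relate, the sketch below expands a tooltip template for a single hovered bar in plain Python. The function resolve_tooltip is hypothetical and written only for this explanation; Bokeh performs the real substitution in the browser, and this sketch handles nothing beyond $name, @$name, @column, and @{column}.

```python
import re

def resolve_tooltip(template, layer_name, row):
    """Hypothetical helper: expand $name, @$name, @col and @{col} for one bar."""
    # @$name is shorthand for a lookup in the column named after the layer.
    text = template.replace("@$name", "@{" + layer_name + "}")
    # $name is the renderer's name, i.e. the stack column for this layer.
    text = text.replace("$name", layer_name)

    def lookup(match):
        column = match.group(1) or match.group(2)
        return str(row[column])

    # @{column name} allows names with spaces; @column covers simple names.
    return re.sub(r"@\{([^}]+)\}|@(\w+)", lookup, text)

# Values for the "Apples" row of the stacked fruit data.
row = {"fruits": "Apples", "2015": 2, "2016": 5, "2017": 3}
print(resolve_tooltip("$name @fruits: @$name", "2016", row))
```

With the layer name "US East", the same substitution turns the template "@$name" into a lookup of the column "US East", which is the equivalence described in the text.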
It may also sometimes be desirable to have a different hover tool for each layer in the stack. For such cases, the hbar_stack and vbar_stack functions return a list of all the renderers created (one for each stack). These can be used to customize different hover tools for each layer:
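from bokeh.models import HoverTool

# p, years, colors, and data are defined as in the previous example
renderers = p.vbar_stack(years, x='fruits', width=0.9, color=colors, source=data,
                         legend_label=years, name=years)

for r in renderers:
    year = r.name
    # one hover tool per layer, restricted to that layer's renderer
    hover = HoverTool(tooltips=[
        ("%s total" % year, "@%s" % year),
        ("index", "$index")
    ], renderers=[r])
    p.add_tools(hover)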
When creating bar charts, it is often desirable to visually display the data according to sub-groups. There are two basic methods that can be used, depending on your use case: using nested categorical coordinates, or applying visual dodges.
If the coordinates of a plot range and data have two or three levels, then Bokeh will automatically group the factors on the axis, including a hierarchical tick labeling with separators between the groups. In the case of bar charts, this results in bars grouped together by the top-level factors. This is probably the most common way to achieve grouped bars, especially if you are starting from “tidy” data.
The example below shows this approach by creating a single column of coordinates that are each 2-tuples of the form (fruit, year). Accordingly, the plot groups the axes by fruit type, with a single call to vbar:
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource, FactorRange
from bokeh.plotting import figure

output_file("bars.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ['2015', '2016', '2017']

data = {'fruits' : fruits,
        '2015'   : [2, 1, 4, 3, 2, 4],
        '2016'   : [5, 3, 3, 2, 4, 6],
        '2017'   : [3, 2, 4, 4, 5, 3]}

# this creates [ ("Apples", "2015"), ("Apples", "2016"), ("Apples", "2017"), ("Pears", "2015"), ... ]
x = [ (fruit, year) for fruit in fruits for year in years ]
counts = sum(zip(data['2015'], data['2016'], data['2017']), ()) # like an hstack

source = ColumnDataSource(data=dict(x=x, counts=counts))

p = figure(x_range=FactorRange(*x), plot_height=250, title="Fruit Counts by Year",
           toolbar_location=None, tools="")

p.vbar(x='x', top='counts', width=0.9, source=source)

p.y_range.start = 0
p.x_range.range_padding = 0.1
p.xaxis.major_label_orientation = 1
p.xgrid.grid_line_color = None

show(p)
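The sum(zip(...), ()) idiom in the example works, but it names every year column by hand. A minimal sketch of the same data preparation using itertools.chain, which follows the years list instead, is shown here in plain Python. This is one possible alternative, not the form the example itself uses.

```python
from itertools import chain

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ['2015', '2016', '2017']
data = {'2015': [2, 1, 4, 3, 2, 4],
        '2016': [5, 3, 3, 2, 4, 6],
        '2017': [3, 2, 4, 4, 5, 3]}

# One (fruit, year) factor per bar, with the year varying fastest.
x = [(fruit, year) for fruit in fruits for year in years]

# Interleave the per-year columns so the counts line up with the factors.
counts = list(chain.from_iterable(zip(*(data[year] for year in years))))

for factor, count in list(zip(x, counts))[:4]:
    print(factor, count)
```

The important invariant is that x and counts have the same length and the same ordering, since each count is drawn at the factor in the same position.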
We can also apply a color mapping, similar to the earlier example. To obtain the same grouped bar plot of fruits data as above, except with the bars shaded by year, change the vbar function call to use factor_cmap for the fill_color:
p.vbar(x='x', top='counts', width=0.9, source=source, line_color="white",
       # use the palette to colormap based on the x[1:2] values
       fill_color=factor_cmap('x', palette=palette, factors=years, start=1, end=2))
Recall that the factors are of the form (fruit, year). The start=1 and end=2 in the call to factor_cmap select the second part of each data factor, the year, to use when color mapping.
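The effect of start and end can be pictured as an ordinary slice of each factor tuple. The sketch below mimics that selection in plain Python; color_for is a hypothetical helper and the palette values are illustrative, so treat it as an approximation of the behavior rather than Bokeh's actual mapper.

```python
years = ['2015', '2016', '2017']
palette = ['#c9d9d3', '#718dbf', '#e84d60']  # illustrative colors, one per year

def color_for(factor, factors, palette, start=0, end=None):
    """Hypothetical helper: choose a color from the factor[start:end] slice."""
    key = factor[start:end]
    # A one-element slice is compared as a plain string: ('2016',) -> '2016'.
    if len(key) == 1:
        key = key[0]
    return palette[factors.index(key)]

print(color_for(("Apples", "2016"), years, palette, start=1, end=2))
print(color_for(("Pears", "2016"), years, palette, start=1, end=2))
```

Both calls select the same color because only the year part of the factor takes part in the lookup.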
Another method for achieving grouped bars is to explicitly specify a visual displacement for the bars. Such a visual offset is also referred to as a dodge.
In this scenario, our data is not “tidy”. Instead of a single table with rows indexed by factors (fruit, year), we have separate series for each year. We can plot all the year series using separate calls to vbar, but since every bar in each group has the same fruit factor, the bars would overlap visually. We can prevent this overlap and distinguish the bars visually by using the dodge() function to provide an offset for each different call to vbar:
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.transform import dodge

output_file("dodged_bars.html")

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
years = ['2015', '2016', '2017']

data = {'fruits' : fruits,
        '2015'   : [2, 1, 4, 3, 2, 4],
        '2016'   : [5, 3, 3, 2, 4, 6],
        '2017'   : [3, 2, 4, 4, 5, 3]}

source = ColumnDataSource(data=data)

p = figure(x_range=fruits, y_range=(0, 10), plot_height=250, title="Fruit Counts by Year",
           toolbar_location=None, tools="")

p.vbar(x=dodge('fruits', -0.25, range=p.x_range), top='2015', width=0.2, source=source,
       color="#c9d9d3", legend_label="2015")

p.vbar(x=dodge('fruits', 0.0, range=p.x_range), top='2016', width=0.2, source=source,
       color="#718dbf", legend_label="2016")

p.vbar(x=dodge('fruits', 0.25, range=p.x_range), top='2017', width=0.2, source=source,
       color="#e84d60", legend_label="2017")

p.x_range.range_padding = 0.1
p.xgrid.grid_line_color = None
p.legend.location = "top_left"
p.legend.orientation = "horizontal"

show(p)
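The three dodge values in the example (-0.25, 0.0, 0.25) are evenly spaced and centered on zero. For a different number of series, the offsets could be computed rather than typed out. The sketch below is plain Python, and dodge_offsets is a hypothetical helper, not part of Bokeh.

```python
def dodge_offsets(n_series, spacing):
    """Hypothetical helper: n evenly spaced offsets centered on zero."""
    middle = (n_series - 1) / 2
    return [(i - middle) * spacing for i in range(n_series)]

years = ['2015', '2016', '2017']
offsets = dict(zip(years, dodge_offsets(len(years), 0.25)))
print(offsets)
```

Each offset could then be passed to dodge('fruits', offset, range=p.x_range) in a loop over the years. To keep neighboring groups from touching, the bar width should stay at or below the spacing, and the spacing times the number of series should stay below the width of one category.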
The above techniques for stacking and grouping may also be used together to create a stacked, grouped bar plot.
Taking an example with bars grouped by quarter, we might stack each individual bar by region:
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource, FactorRange
from bokeh.plotting import figure

output_file("bar_stacked_grouped.html")

factors = [
    ("Q1", "jan"), ("Q1", "feb"), ("Q1", "mar"),
    ("Q2", "apr"), ("Q2", "may"), ("Q2", "jun"),
    ("Q3", "jul"), ("Q3", "aug"), ("Q3", "sep"),
    ("Q4", "oct"), ("Q4", "nov"), ("Q4", "dec"),
]

regions = ['east', 'west']

source = ColumnDataSource(data=dict(
    x=factors,
    east=[ 5, 5, 6, 5, 5, 4, 5, 6, 7, 8, 6, 9 ],
    west=[ 5, 7, 9, 4, 5, 4, 7, 7, 7, 6, 6, 7 ],
))

p = figure(x_range=FactorRange(*factors), plot_height=250,
           toolbar_location=None, tools="")

p.vbar_stack(regions, x='x', width=0.9, alpha=0.5, color=["blue", "red"], source=source,
             legend_label=regions)

p.y_range.start = 0
p.y_range.end = 18
p.x_range.range_padding = 0.1
p.xaxis.major_label_orientation = 1
p.xgrid.grid_line_color = None
p.legend.location = "top_center"
p.legend.orientation = "horizontal"

show(p)
When dealing with hierarchical categories of two or three levels, it’s possible to use just the “higher level” portion of a coordinate to position glyphs. For example, if you have a range with the hierarchical factors:
factors = [
    ("East", "Sales"), ("East", "Marketing"), ("East", "Dev"),
    ("West", "Sales"), ("West", "Marketing"), ("West", "Dev"),
]
Then it is possible to use just the top-level factors “East” and “West” as positions for glyphs. In this case the position is the center of the entire group. The example below shows bars for each month, grouped by financial quarter, and also adds a line (perhaps for a quarterly average) at the coordinates for Q1, Q2, etc.:
from bokeh.io import output_file, show
from bokeh.models import FactorRange
from bokeh.plotting import figure

output_file("mixed.html")

factors = [
    ("Q1", "jan"), ("Q1", "feb"), ("Q1", "mar"),
    ("Q2", "apr"), ("Q2", "may"), ("Q2", "jun"),
    ("Q3", "jul"), ("Q3", "aug"), ("Q3", "sep"),
    ("Q4", "oct"), ("Q4", "nov"), ("Q4", "dec"),
]

p = figure(x_range=FactorRange(*factors), plot_height=250,
           toolbar_location=None, tools="")

x = [ 10, 12, 16, 9, 10, 8, 12, 13, 14, 14, 12, 16 ]
p.vbar(x=factors, top=x, width=0.9, alpha=0.5)

p.line(x=["Q1", "Q2", "Q3", "Q4"], y=[12, 9, 13, 14], color="red", line_width=2)

p.y_range.start = 0
p.x_range.range_padding = 0.1
p.xaxis.major_label_orientation = 1
p.xgrid.grid_line_color = None

show(p)
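The line in the example uses hand-written y values. If the line is meant to be a quarterly average, those values could be derived from the monthly data by grouping on the top-level part of each factor. The following is a minimal sketch in plain Python; the names quarterly, line_x, and line_y are illustrative.

```python
from itertools import groupby
from statistics import mean

factors = [
    ("Q1", "jan"), ("Q1", "feb"), ("Q1", "mar"),
    ("Q2", "apr"), ("Q2", "may"), ("Q2", "jun"),
    ("Q3", "jul"), ("Q3", "aug"), ("Q3", "sep"),
    ("Q4", "oct"), ("Q4", "nov"), ("Q4", "dec"),
]
values = [10, 12, 16, 9, 10, 8, 12, 13, 14, 14, 12, 16]

# Group consecutive (quarter, month) factors by their top-level part.
# groupby only merges adjacent items, so the factors must already be ordered.
quarterly = {
    quarter: mean(value for _, value in group)
    for quarter, group in groupby(zip(factors, values), key=lambda pair: pair[0][0])
}

line_x = list(quarterly)           # top-level factors, usable as x coordinates
line_y = list(quarterly.values())  # one average per quarter
print(line_x, line_y)
```

For this data, the averages for Q2 through Q4 match the y values used in the example exactly, while Q1 comes out at about 12.7 rather than 12.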
This example also demonstrates that other glyphs such as lines also function with categorical coordinates.
Pandas is a powerful and common tool for doing data analysis on tabular and timeseries data in Python. Although it is not required by Bokeh, Bokeh tries to make life easier when you do use it.
Below is a plot that demonstrates some advantages when using Pandas with Bokeh:
Pandas GroupBy objects can be used to initialize a ColumnDataSource, automatically creating columns for many statistical measures such as the group mean or count.
GroupBy objects may also be passed directly as a range argument to figure.
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource
from bokeh.palettes import Spectral5
from bokeh.plotting import figure
from bokeh.sampledata.autompg import autompg as df
from bokeh.transform import factor_cmap

output_file("groupby.html")

df.cyl = df.cyl.astype(str)
group = df.groupby('cyl')

source = ColumnDataSource(group)

cyl_cmap = factor_cmap('cyl', palette=Spectral5, factors=sorted(df.cyl.unique()))

p = figure(plot_height=350, x_range=group, title="MPG by # Cylinders",
           toolbar_location=None, tools="")

p.vbar(x='cyl', top='mpg_mean', width=1, source=source,
       line_color=cyl_cmap, fill_color=cyl_cmap)

p.y_range.start = 0
p.xgrid.grid_line_color = None
p.xaxis.axis_label = "Number of cylinders"
p.xaxis.major_label_orientation = 1.2
p.outline_line_color = None

show(p)
Note that in the example above, we grouped by the column 'cyl', so our CDS has a column 'cyl' for this index. Additionally, other non-grouped columns like 'mpg' have had associated columns such as 'mpg_mean' added, which give the mean MPG value for each group.
This usage also works when the grouping is multi-level. The example below shows how grouping the same data by ('cyl', 'mfr') results in a hierarchical nested axis. In this case, the index column name 'cyl_mfr' is made by joining the names of the grouped columns together.
from bokeh.io import output_file, show
from bokeh.palettes import Spectral5
from bokeh.plotting import figure
from bokeh.sampledata.autompg import autompg_clean as df
from bokeh.transform import factor_cmap

output_file("bar_pandas_groupby_nested.html")

df.cyl = df.cyl.astype(str)
df.yr = df.yr.astype(str)

group = df.groupby(by=['cyl', 'mfr'])

index_cmap = factor_cmap('cyl_mfr', palette=Spectral5, factors=sorted(df.cyl.unique()), end=1)

p = figure(plot_width=800, plot_height=300, title="Mean MPG by # Cylinders and Manufacturer",
           x_range=group, toolbar_location=None,
           tooltips=[("MPG", "@mpg_mean"), ("Cyl, Mfr", "@cyl_mfr")])

p.vbar(x='cyl_mfr', top='mpg_mean', width=1, source=group,
       line_color="white", fill_color=index_cmap)

p.y_range.start = 0
p.x_range.range_padding = 0.05
p.xgrid.grid_line_color = None
p.xaxis.axis_label = "Manufacturer grouped by # Cylinders"
p.xaxis.major_label_orientation = 1.2
p.outline_line_color = None

show(p)
So far we have seen the bar glyphs used to create bar charts, which imply bars drawn from a common baseline. However, the bar glyphs can also be used to represent arbitrary intervals across a range.
The example below uses hbar with both left and right properties supplied, to show the spread in times between bronze and gold medalists in Olympic sprinting over many years:
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.sampledata.sprint import sprint

output_file("sprint.html")

sprint.Year = sprint.Year.astype(str)
group = sprint.groupby('Year')
source = ColumnDataSource(group)

p = figure(y_range=group, x_range=(9.5,12.7), plot_width=400, plot_height=550,
           toolbar_location=None, title="Time Spreads for Sprint Medalists (by Year)")

p.hbar(y="Year", left='Time_min', right='Time_max', height=0.4, source=source)

p.ygrid.grid_line_color = None
p.xaxis.axis_label = "Time (seconds)"
p.outline_line_color = None

show(p)
When plotting many scatter points in a single category, it is common for points to start to visually overlap. In this case, Bokeh provides a jitter() function that can automatically apply a random dodge to every point.
The example below shows a scatter plot of every commit time for a GitHub user between 2012 and 2016, grouped by day of the week. A naive plot of this data would result in thousands of points overlapping in a narrow line for each day. By using jitter we can differentiate the points to obtain a useful plot:
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.sampledata.commits import data
from bokeh.transform import jitter

output_file("bars.html")

DAYS = ['Sun', 'Sat', 'Fri', 'Thu', 'Wed', 'Tue', 'Mon']

source = ColumnDataSource(data)

p = figure(plot_width=800, plot_height=300, y_range=DAYS, x_axis_type='datetime',
           title="Commits by Time of Day (US/Central) 2012-2016")

p.circle(x='time', y=jitter('day', width=0.6, range=p.y_range), source=source, alpha=0.3)

p.xaxis[0].formatter.days = ['%Hh']
p.x_range.range_padding = 0
p.ygrid.grid_line_color = None

show(p)
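To give a feel for what a "random dodge" means, the sketch below attaches a random offset to each category value in plain Python. It assumes the offsets are drawn uniformly from a band of the given total width centered on the category; jittered is a hypothetical helper, and Bokeh itself computes the jitter in the browser rather than in Python.

```python
import random

def jittered(categories, width=0.6, seed=None):
    """Hypothetical helper: pair each category with a uniform random offset."""
    rng = random.Random(seed)
    half = width / 2
    # Every point stays within +/- width/2 of the center of its category.
    return [(category, rng.uniform(-half, half)) for category in categories]

days = ['Mon', 'Mon', 'Tue', 'Mon', 'Tue']
points = jittered(days, width=0.6, seed=1)
for day, offset in points:
    print(day, round(offset, 3))
```

Points that share a category receive different offsets, so they spread across a band instead of piling up on a single line.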
We’ve seen above how categorical locations can be modified by operations like dodge and jitter. It is also possible to supply an offset to a categorical location explicitly. This is done by adding a numeric value to the end of a category, e.g. ["Jan", 0.2] is the category “Jan” offset by a value of 0.2. For hierarchical categories, the value is added at the end of the existing list, e.g. ["West", "Sales", -0.2]. Any numeric value at the end of a list of categories is always interpreted as an offset.
As an example, suppose we took our first example from the beginning and modified it like this:
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']

offsets = [-0.5, -0.2, 0.0, 0.3, 0.1, 0.3]

# This results in [ ['Apples', -0.5], ['Pears', -0.2], ... ]
x = list(zip(fruits, offsets))

p.vbar(x=x, top=[5, 3, 4, 2, 4, 6], width=0.8)
Then the resulting plot has bars that are horizontally shifted by the amount of each corresponding offset.
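The same pattern extends to hierarchical factors, where the offset goes after the last level. As a small sketch in plain Python, the hypothetical helper with_offset below (not a Bokeh function) builds such coordinates for both simple and nested factors.

```python
def with_offset(factor, offset):
    """Hypothetical helper: append a numeric offset to a 1- to 3-level factor."""
    # A plain string is a 1-level factor; tuples and lists are nested factors.
    levels = (factor,) if isinstance(factor, str) else tuple(factor)
    return levels + (offset,)

print(with_offset("Jan", 0.2))
print(with_offset(("West", "Sales"), -0.2))

fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
offsets = [-0.5, -0.2, 0.0, 0.3, 0.1, 0.3]
x = [with_offset(fruit, offset) for fruit, offset in zip(fruits, offsets)]
print(x[:2])
```

For 1-level factors this produces the same coordinates as list(zip(fruits, offsets)).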
Below is a more sophisticated example of a Ridge Plot that displays timeseries associated with different categories. It uses categorical offsets to specify patch coordinates for the timeseries inside each category.
import colorcet as cc
from numpy import linspace
from scipy.stats import gaussian_kde

from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource, FixedTicker, PrintfTickFormatter
from bokeh.plotting import figure
from bokeh.sampledata.perceptions import probly

output_file("ridgeplot.html")

def ridge(category, data, scale=20):
    # pair every scaled density value with its category: (category, offset)
    return list(zip([category]*len(data), scale*data))

cats = list(reversed(probly.keys()))

palette = [cc.rainbow[i*15] for i in range(17)]

x = linspace(-20,110, 500)

source = ColumnDataSource(data=dict(x=x))

p = figure(y_range=cats, plot_width=700, x_range=(-5, 105), toolbar_location=None)

for i, cat in enumerate(reversed(cats)):
    pdf = gaussian_kde(probly[cat])
    y = ridge(cat, pdf(x))
    source.add(y, cat)
    p.patch('x', cat, color=palette[i], alpha=0.6, line_color="black", source=source)

p.outline_line_color = None
p.background_fill_color = "#efefef"

p.xaxis.ticker = FixedTicker(ticks=list(range(0, 101, 10)))
p.xaxis.formatter = PrintfTickFormatter(format="%d%%")

p.ygrid.grid_line_color = None
p.xgrid.grid_line_color = "#dddddd"
p.xgrid.ticker = p.xaxis[0].ticker

p.axis.minor_tick_line_color = None
p.axis.major_tick_line_color = None
p.axis.axis_line_color = None

p.y_range.range_padding = 0.12

show(p)
In all of the cases above, we have had one categorical axis, and one continuous axis. It is possible to have plots with two categorical axes. If we shade the rectangle that defines each pair of categories, we end up with a Categorical Heatmap.
The plot below shows such a plot, where the x-axis categories are a list of years from 1948 to 2016, and the y-axis categories are the months of the year. Each rectangle corresponding to a (year, month) combination is color mapped by the unemployment rate for that month and year. Since the unemployment rate is a continuous variable, a LinearColorMapper is used to colormap the plot, and is also passed to a color bar to provide a visual legend on the right:
import pandas as pd

from bokeh.io import output_file, show
from bokeh.models import (BasicTicker, ColorBar, ColumnDataSource,
                          LinearColorMapper, PrintfTickFormatter,)
from bokeh.plotting import figure
from bokeh.sampledata.unemployment1948 import data
from bokeh.transform import transform

output_file("unemployment.html")

data.Year = data.Year.astype(str)
data = data.set_index('Year')
data.drop('Annual', axis=1, inplace=True)
data.columns.name = 'Month'

# reshape to 1D array of rates with a month and year for each row.
df = pd.DataFrame(data.stack(), columns=['rate']).reset_index()

source = ColumnDataSource(df)

# this is the colormap from the original NYTimes plot
colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce", "#ddb7b1", "#cc7878", "#933b41", "#550b1d"]
mapper = LinearColorMapper(palette=colors, low=df.rate.min(), high=df.rate.max())

p = figure(plot_width=800, plot_height=300, title="US Unemployment 1948-2016",
           x_range=list(data.index), y_range=list(reversed(data.columns)),
           toolbar_location=None, tools="", x_axis_location="above")

p.rect(x="Year", y="Month", width=1, height=1, source=source,
       line_color=None, fill_color=transform('rate', mapper))

color_bar = ColorBar(color_mapper=mapper, location=(0, 0),
                     ticker=BasicTicker(desired_num_ticks=len(colors)),
                     formatter=PrintfTickFormatter(format="%d%%"))

p.add_layout(color_bar, 'right')

p.axis.axis_line_color = None
p.axis.major_tick_line_color = None
p.axis.major_label_text_font_size = "7px"
p.axis.major_label_standoff = 0
p.xaxis.major_label_orientation = 1.0

show(p)
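A linear color mapping of this kind divides the interval between low and high into equal bins, one per palette entry. The sketch below reproduces that idea in plain Python. The helper linear_color is hypothetical, the low and high bounds are illustrative rather than the real data range, and the exact edge handling in Bokeh may differ, so read it as an approximation.

```python
import math

colors = ["#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
          "#ddb7b1", "#cc7878", "#933b41", "#550b1d"]

def linear_color(value, palette, low, high):
    """Hypothetical helper: map value to one of len(palette) equal-width bins."""
    if value <= low:
        return palette[0]       # clamp values at or below the range
    if value >= high:
        return palette[-1]      # clamp values at or above the range
    index = math.floor((value - low) / (high - low) * len(palette))
    return palette[min(index, len(palette) - 1)]

low, high = 2.0, 11.0           # illustrative bounds, not the real data range
for rate in (2.0, 3.5, 6.5, 11.0):
    print(rate, linear_color(rate, colors, low, high))
```

With nine colors and these bounds, each bin spans one percentage point, so a rate of 6.5 falls in the fifth bin.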
A final example combines many of the techniques in this chapter: color mappers, visual dodges, and Pandas DataFrames. These are used to create a different sort of “heatmap” that results in a periodic table of the elements. A hover tool has also been added so that additional information about each element can be inspected:
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.sampledata.periodic_table import elements
from bokeh.transform import dodge, factor_cmap

output_file("periodic.html")

periods = ["I", "II", "III", "IV", "V", "VI", "VII"]
groups = [str(x) for x in range(1, 19)]

df = elements.copy()
df["atomic mass"] = df["atomic mass"].astype(str)
df["group"] = df["group"].astype(str)
df["period"] = [periods[x-1] for x in df.period]
df = df[df.group != "-"]
df = df[df.symbol != "Lr"]
df = df[df.symbol != "Lu"]

cmap = {
    "alkali metal"         : "#a6cee3",
    "alkaline earth metal" : "#1f78b4",
    "metal"                : "#d93b43",
    "halogen"              : "#999d9a",
    "metalloid"            : "#e08d49",
    "noble gas"            : "#eaeaea",
    "nonmetal"             : "#f1d4Af",
    "transition metal"     : "#599d7A",
}

source = ColumnDataSource(df)

p = figure(plot_width=900, plot_height=500, title="Periodic Table (omitting LA and AC Series)",
           x_range=groups, y_range=list(reversed(periods)), toolbar_location=None, tools="hover")

p.rect("group", "period", 0.95, 0.95, source=source, fill_alpha=0.6, legend_field="metal",
       color=factor_cmap('metal', palette=list(cmap.values()), factors=list(cmap.keys())))

text_props = {"source": source, "text_align": "left", "text_baseline": "middle"}

x = dodge("group", -0.4, range=p.x_range)

r = p.text(x=x, y="period", text="symbol", **text_props)
r.glyph.text_font_style="bold"

r = p.text(x=x, y=dodge("period", 0.3, range=p.y_range), text="atomic number", **text_props)
r.glyph.text_font_size="11px"

r = p.text(x=x, y=dodge("period", -0.35, range=p.y_range), text="name", **text_props)
r.glyph.text_font_size="7px"

r = p.text(x=x, y=dodge("period", -0.2, range=p.y_range), text="atomic mass", **text_props)
r.glyph.text_font_size="7px"

p.text(x=["3", "3"], y=["VI", "VII"], text=["LA", "AC"], text_align="center", text_baseline="middle")

p.hover.tooltips = [
    ("Name", "@name"),
    ("Atomic number", "@{atomic number}"),
    ("Atomic mass", "@{atomic mass}"),
    ("Type", "@metal"),
    ("CPK color", "$color[hex, swatch]:CPK"),
    ("Electronic configuration", "@{electronic configuration}"),
]

p.outline_line_color = None
p.grid.grid_line_color = None
p.axis.axis_line_color = None
p.axis.major_tick_line_color = None
p.axis.major_label_standoff = 0
p.legend.orientation = "horizontal"
p.legend.location ="top_center"

show(p)