First steps 8: Providing and filtering data

In the previous first steps guides, you used different methods to display and export your visualizations.

In this section, you will use various sources and structures to import and filter data.

Using ColumnDataSource

The ColumnDataSource is Bokeh’s own data structure. It maps column names (strings) to sequences of values. For details about the ColumnDataSource, see ColumnDataSource in the user guide.

So far, you have used data sequences like Python lists and NumPy arrays to pass data to Bokeh. Bokeh has automatically converted these sequences into ColumnDataSource objects for you.
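As a minimal sketch of this automatic conversion, the snippet below passes plain lists to a renderer and then inspects the source that Bokeh created. The names `renderer` and `auto_source` are only illustrative, and the assumption that the generated columns are called `x` and `y` reflects current Bokeh behavior rather than anything stated in this guide.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

p = figure(height=250)

# pass plain Python lists; Bokeh wraps them in a ColumnDataSource
renderer = p.scatter([1, 2, 3, 4, 5], [6, 7, 2, 3, 6], size=20)

# the renderer's data_source attribute holds the automatically created source
auto_source = renderer.data_source
print(type(auto_source).__name__)
print(auto_source.column_names)
```

Inspecting `data_source` this way can help when you want to check which columns a renderer actually uses.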

Follow these steps to create a ColumnDataSource directly:

  • First, import ColumnDataSource.

  • Next, create a dict with your data. The dict’s keys are the column names (strings). The dict’s values are lists or arrays of data.

  • Then, pass your dict as the data argument to ColumnDataSource.

  • Finally, pass your ColumnDataSource as the source argument of your renderer and refer to the data by column name.

from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource

# create dict as basis for ColumnDataSource
data = {'x_values': [1, 2, 3, 4, 5],
        'y_values': [6, 7, 2, 3, 6]}

# create ColumnDataSource based on dict
source = ColumnDataSource(data=data)

# create a plot and renderer with ColumnDataSource data
p = figure(height=250)
p.scatter(x='x_values', y='y_values', size=20, source=source)

# show the results
show(p)

See also

For more information on Bokeh’s ColumnDataSource, see ColumnDataSource in the user guide and ColumnDataSource in the reference guide.

For information about adding data to a ColumnDataSource, see Appending data to a ColumnDataSource. For information about replacing the data of a ColumnDataSource, see Replacing data in a ColumnDataSource in the user guide.
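As a rough sketch of both operations, the snippet below appends a row with the `stream` method and replaces all columns by assigning a new dict to the `data` property. The variable names and values are made up for illustration. The user guide remains the place to look for details such as rollover limits.

```python
from bokeh.models import ColumnDataSource

source = ColumnDataSource(data={"x_values": [1, 2, 3], "y_values": [6, 7, 2]})

# append one new row; the dict passed to stream() must contain every existing column
source.stream({"x_values": [4], "y_values": [3]})

# replace all data at once by assigning a new dict to the data property
other = ColumnDataSource(data={"x_values": [1, 2, 3], "y_values": [6, 7, 2]})
other.data = {"x_values": [10, 20], "y_values": [5, 8]}
```

Assigning a complete dict in one step keeps all columns the same length, which updating columns one at a time does not guarantee.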

For more information on using Python lists, see Providing data with Python lists. For more information on using NumPy data with Bokeh, see Providing NumPy data.

Converting pandas data

To use data from a pandas DataFrame, pass your pandas data to a ColumnDataSource:

source = ColumnDataSource(df)
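The line above assumes that a DataFrame named df already exists. A self-contained sketch might look like the following, where the DataFrame contents and the names `df_source` and `p_df` are placeholders chosen for illustration:

```python
import pandas as pd

from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# a small example DataFrame; the column names become ColumnDataSource column names
df = pd.DataFrame({"x_values": [1, 2, 3, 4, 5],
                   "y_values": [6, 7, 2, 3, 6]})

# create a ColumnDataSource from the DataFrame
df_source = ColumnDataSource(df)

# use the DataFrame's column names in the renderer
p_df = figure(height=250)
p_df.scatter(x="x_values", y="y_values", size=20, source=df_source)

print(df_source.column_names)
```

With a default, unnamed index, the resulting source should also contain an "index" column holding the DataFrame's index values.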

See also

For more information on using pandas data in Bokeh, see Using a pandas DataFrame in the user guide. This includes information on using pandas DataFrame, MultiIndex, and GroupBy data.

Filtering data

Bokeh comes with various filtering methods. Use these filters if you want to create a specific subset of the data contained in your ColumnDataSource.

In Bokeh, these filtered subsets are called “views”. Views are represented by Bokeh’s CDSView class.

To plot with a filtered subset of data, pass a CDSView object to the view argument of your renderer.

A CDSView object has one property:

  • filter: an instance of a Filter model that determines which rows of the ColumnDataSource the view includes

The simplest filter is the IndexFilter. An IndexFilter uses a list of index positions and creates a view that contains nothing but the data points located at those index positions.

For example, if your ColumnDataSource contains a list of five values and you apply an IndexFilter with [0, 2, 4], the resulting view contains only the first, the third, and the fifth value of your original list:

from bokeh.layouts import gridplot
from bokeh.models import CDSView, ColumnDataSource, IndexFilter
from bokeh.plotting import figure, show

# create ColumnDataSource from a dict
source = ColumnDataSource(data=dict(x=[1, 2, 3, 4, 5], y=[1, 2, 3, 4, 5]))

# create a view using an IndexFilter with the index positions [0, 2, 4]
view = CDSView(filter=IndexFilter([0, 2, 4]))

# set up tools
tools = ["box_select", "hover", "reset"]

# create a first plot with all data in the ColumnDataSource
p = figure(height=300, width=300, tools=tools)
p.scatter(x="x", y="y", size=10, hover_color="red", source=source)

# create a second plot with a subset of ColumnDataSource, based on view
p_filtered = figure(height=300, width=300, tools=tools)
p_filtered.scatter(x="x", y="y", size=10, hover_color="red", source=source, view=view)

# show both plots next to each other in a gridplot layout
show(gridplot([[p, p_filtered]]))

See also

For more information on the various filters in Bokeh, see Filtering data in the user guide. More information is also available in the entries for CDSView and Filter in the reference guide.