First steps 8: Providing and filtering data
In the previous first steps guides, you used different methods to display and export your visualizations.
In this section, you will use various sources and structures to import and filter data.
Using ColumnDataSource
The ColumnDataSource is Bokeh’s own data structure. For details about the ColumnDataSource, see ColumnDataSource in the user guide.
So far, you have used data sequences like Python lists and NumPy arrays to pass data to Bokeh. Bokeh has automatically converted these lists into ColumnDataSource objects for you.
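As a quick check of this automatic conversion, the following sketch passes plain lists to a line renderer and then inspects the renderer's data_source attribute. The sample values are made up for illustration, and the column names x and y are the ones Bokeh assigns for this particular glyph.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# pass plain Python lists to a renderer (illustrative values)
p = figure()
renderer = p.line([1, 2, 3], [4, 5, 6])

# the renderer holds the ColumnDataSource that Bokeh created from the lists
auto_source = renderer.data_source
print(type(auto_source).__name__)
print(sorted(auto_source.data.keys()))
```

The lists end up as columns of a ColumnDataSource, which is the same kind of object you create by hand in the steps that follow.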
Follow these steps to create a ColumnDataSource directly:
First, import ColumnDataSource.
Next, create a dict with your data: The dict’s keys are the column names (strings). The dict’s values are lists or arrays of data.
Then, pass your dict as the data argument to ColumnDataSource.
You can then use your ColumnDataSource as the source for your renderer.
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
# create dict as basis for ColumnDataSource
data = {'x_values': [1, 2, 3, 4, 5],
        'y_values': [6, 7, 2, 3, 6]}
# create ColumnDataSource based on dict
source = ColumnDataSource(data=data)
# create a plot and renderer with ColumnDataSource data
p = figure()
p.circle(x='x_values', y='y_values', source=source)
See also
For more information on Bokeh’s ColumnDataSource, see ColumnDataSource in the user guide and ColumnDataSource in the reference guide.
For information about adding data to a ColumnDataSource, see Appending data to a ColumnDataSource. Information about replacing data of a ColumnDataSource is available at Replacing data in a ColumnDataSource in the user guide.
For more information on using Python lists, see Providing data with Python lists. For more information on using NumPy data with Bokeh, see Providing NumPy data.
Converting pandas data
To use data from a pandas DataFrame, pass your pandas data to a ColumnDataSource:
source = ColumnDataSource(df)
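The one-line call above assumes that a DataFrame named df already exists. A minimal self-contained sketch, using made-up sample values and the same column names as the dict example earlier, might look like this:

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# build a small DataFrame (illustrative values, not from a real dataset)
df = pd.DataFrame({
    "x_values": [1, 2, 3, 4, 5],
    "y_values": [6, 7, 2, 3, 6],
})

# convert the DataFrame to a ColumnDataSource
source = ColumnDataSource(df)

# each DataFrame column becomes a column of the ColumnDataSource;
# the DataFrame's index is carried along as an extra column
print(sorted(source.data.keys()))
```

Because this DataFrame's index has no name, the extra column is called index. Keep that in mind when you refer to columns by name in a renderer or a hover tool.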
See also
For more information on using pandas data in Bokeh, see Using a pandas DataFrame in the user guide. This includes information on using pandas DataFrame, MultiIndex, and GroupBy data.
Filtering data
Bokeh comes with various filtering methods. Use these filters if you want to create a specific subset of the data contained in your ColumnDataSource.
In Bokeh, these filtered subsets are called “views”. Views are represented by Bokeh’s CDSView class.
To plot with a filtered subset of data, pass a CDSView object to the view argument of your renderer.
A CDSView object has two properties:
source: the ColumnDataSource that you want to apply the filters to
filters: a list of Filter objects
The simplest filter is the IndexFilter. An IndexFilter uses a list of index positions and creates a view that contains nothing but the data points located at those index positions.
For example, if your ColumnDataSource contains a list of five values and you apply an IndexFilter with [0, 2, 4], the resulting view contains only the first, the third, and the fifth value of your original list:
from bokeh.layouts import gridplot
from bokeh.models import CDSView, ColumnDataSource, IndexFilter
from bokeh.plotting import figure, show
# create ColumnDataSource from a dict
source = ColumnDataSource(data=dict(x=[1, 2, 3, 4, 5], y=[1, 2, 3, 4, 5]))
# create a view using an IndexFilter with the index positions [0, 2, 4]
view = CDSView(source=source, filters=[IndexFilter([0, 2, 4])])
# setup tools
tools = ["box_select", "hover", "reset"]
# create a first plot with all data in the ColumnDataSource
p = figure(height=300, width=300, tools=tools)
p.circle(x="x", y="y", size=10, hover_color="red", source=source)
# create a second plot with a subset of ColumnDataSource, based on view
p_filtered = figure(height=300, width=300, tools=tools)
p_filtered.circle(x="x", y="y", size=10, hover_color="red", source=source, view=view)
# show both plots next to each other in a gridplot layout
show(gridplot([[p, p_filtered]]))
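IndexFilter is only one of the filters Bokeh provides. As a hedged sketch of two others, the following code builds a BooleanFilter (one True or False per row) and a GroupFilter (rows where a column equals a given value). The group column and all sample values are assumptions added for illustration; they are not part of the example above.

```python
from bokeh.models import BooleanFilter, ColumnDataSource, GroupFilter

# illustrative data with an extra categorical column named "group"
source = ColumnDataSource(data=dict(
    x=[1, 2, 3, 4, 5],
    y=[1, 2, 3, 4, 5],
    group=["a", "b", "a", "b", "a"],
))

# BooleanFilter: keep the rows whose y value is greater than 2
bool_filter = BooleanFilter(booleans=[y > 2 for y in source.data["y"]])

# GroupFilter: keep the rows whose "group" column equals "a"
group_filter = GroupFilter(column_name="group", group="a")
```

Either filter object can take the place of the IndexFilter when you create the view in the example above.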
See also
For more information on the various filters in Bokeh, see Filtering data in the user guide. More information is also available in the entries for CDSView and Filter in the reference guide.