bokeh.sampledata#
The sampledata module can be used to download data sets used in Bokeh examples.
The simplest way to download the data is to execute the command line program:
bokeh sampledata
Alternatively, the download function described below may be called programmatically:
>>> import bokeh.sampledata
>>> bokeh.sampledata.download()
By default, data is downloaded and stored in the directory $HOME/.bokeh/data.
This directory will be created if it does not already exist.
Bokeh also looks for a YAML configuration file at $HOME/.bokeh/config. The YAML key sampledata_dir can be set to the absolute path of a directory where the data should be stored. For example, add the following line to the config file:
sampledata_dir: /tmp/bokeh_data
This will cause the sample data to be stored in /tmp/bokeh_data.
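As a rough illustration of the configuration step, the sketch below writes the sampledata_dir key to a config file. It targets a temporary directory rather than the real $HOME, and the helper name write_sampledata_config is hypothetical, not part of Bokeh. It also overwrites any existing file at that location, so treat it as a starting point only.

```python
import tempfile
from pathlib import Path


def write_sampledata_config(home: Path, data_dir: str) -> Path:
    """Write a one-line YAML config that sets sampledata_dir.

    Hypothetical helper: it replaces any existing config file.
    """
    config_path = home / ".bokeh" / "config"
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(f"sampledata_dir: {data_dir}\n")
    return config_path


# Demonstrate against a throwaway directory instead of the real $HOME.
with tempfile.TemporaryDirectory() as tmp:
    path = write_sampledata_config(Path(tmp), "/tmp/bokeh_data")
    config_text = path.read_text()

print(config_text)
```

To apply this to a real installation, pass Path.home() as the first argument.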
- download(progress: bool = True) -> None
Download larger data sets for various Bokeh examples.
anscombe#
The four data series that comprise Anscombe’s Quartet.
This module contains one pandas DataFrame: data.
data
| | Ix | Iy | IIx | IIy | IIIx | IIIy | IVx | IVy |
---|---|---|---|---|---|---|---|---|
0 | 10.0 | 8.04 | 10.0 | 9.14 | 10.0 | 7.46 | 8.0 | 6.58 |
1 | 8.0 | 6.95 | 8.0 | 8.14 | 8.0 | 6.77 | 8.0 | 5.76 |
2 | 13.0 | 7.58 | 13.0 | 8.74 | 13.0 | 12.74 | 8.0 | 7.71 |
3 | 9.0 | 8.81 | 9.0 | 8.77 | 9.0 | 7.11 | 8.0 | 8.84 |
4 | 11.0 | 8.33 | 11.0 | 9.26 | 11.0 | 7.81 | 8.0 | 8.47 |
antibiotics#
A table of Will Burtin’s historical data regarding antibiotic efficacies.
This module contains one pandas DataFrame: data.
data
| | bacteria | penicillin | streptomycin | neomycin | gram |
---|---|---|---|---|---|
0 | Mycobacterium tuberculosis | 800.0 | 5.0 | 2.00 | negative |
1 | Salmonella schottmuelleri | 10.0 | 0.8 | 0.09 | negative |
2 | Proteus vulgaris | 3.0 | 0.1 | 0.10 | negative |
3 | Klebsiella pneumoniae | 850.0 | 1.2 | 1.00 | negative |
4 | Brucella abortus | 1.0 | 2.0 | 0.02 | negative |
airport_routes#
Airport routes data from OpenFlights.org.
Sourced from https://openflights.org/data.html on September 07, 2017.
This module contains two pandas DataFrames: airports and routes.
airports
| | AirportID | Name | City | Country | IATA | ICAO | Latitude | Longitude | Altitude | Timezone | DST | TZ | Type | source |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3411 | Barter Island LRRS Airport | Barter Island | United States | BTI | PABA | 70.134003 | -143.582001 | 2 | -9 | A | America/Anchorage | airport | OurAirports |
1 | 3413 | Cape Lisburne LRRS Airport | Cape Lisburne | United States | LUR | PALU | 68.875099 | -166.110001 | 16 | -9 | A | America/Anchorage | airport | OurAirports |
2 | 3414 | Point Lay LRRS Airport | Point Lay | United States | PIZ | PPIZ | 69.732903 | -163.005005 | 22 | -9 | A | America/Anchorage | airport | OurAirports |
3 | 3415 | Hilo International Airport | Hilo | United States | ITO | PHTO | 19.721399 | -155.048004 | 38 | -10 | N | Pacific/Honolulu | airport | OurAirports |
4 | 3416 | Orlando Executive Airport | Orlando | United States | ORL | KORL | 28.545500 | -81.332901 | 113 | -5 | A | America/New_York | airport | OurAirports |
routes
| | Airline | AirlineID | Source | SourceID | Destination | DestinationID | Codeshare | Stops | Equipment |
---|---|---|---|---|---|---|---|---|---|
0 | 2O | 146 | ADQ | 3531 | KLN | 7162 | NaN | 0 | BNI |
1 | 2O | 146 | KLN | 7162 | KYK | 7161 | NaN | 0 | BNI |
2 | 3E | 10739 | BRL | 5726 | ORD | 3830 | NaN | 0 | CNC |
3 | 3E | 10739 | BRL | 5726 | STL | 3678 | NaN | 0 | CNC |
4 | 3E | 10739 | DEC | 4042 | ORD | 3830 | NaN | 0 | CNC |
airport#
US airports with field elevations > 1500 meters.
Sourced from http://services.nationalmap.gov on October 15, 2015.
This module contains one pandas DataFrame: data.
data
| | name | elevation | x | y |
---|---|---|---|---|
0 | CHINLE MUNICIPAL AIRPORT | 1691 | -1.219788e+07 | 4.315889e+06 |
1 | ELY AIRPORT /YELLAND FIELD/ AIRPORT | 1908 | -1.278414e+07 | 4.764692e+06 |
2 | TRUCKEE-TAHOE AIRPORT | 1798 | -1.337387e+07 | 4.767619e+06 |
3 | GARFIELD COUNTY REGIONAL AIRPORT | 1691 | -1.199211e+07 | 4.797343e+06 |
4 | SANTA FE MUNICIPAL AIRPORT | 1935 | -1.180982e+07 | 4.248063e+06 |
autompg#
A version of the Auto MPG data set.
Derived from https://archive.ics.uci.edu/ml/datasets/auto+mpg
This module contains two pandas DataFrames: autompg and autompg_clean.
The “clean” version has cleaned up the "mfr" and "origin" fields.
autompg
| | mpg | cyl | displ | hp | weight | accel | yr | origin | name |
---|---|---|---|---|---|---|---|---|---|
0 | 18.0 | 8 | 307.0 | 130 | 3504 | 12.0 | 70 | 1 | chevrolet chevelle malibu |
1 | 15.0 | 8 | 350.0 | 165 | 3693 | 11.5 | 70 | 1 | buick skylark 320 |
2 | 18.0 | 8 | 318.0 | 150 | 3436 | 11.0 | 70 | 1 | plymouth satellite |
3 | 16.0 | 8 | 304.0 | 150 | 3433 | 12.0 | 70 | 1 | amc rebel sst |
4 | 17.0 | 8 | 302.0 | 140 | 3449 | 10.5 | 70 | 1 | ford torino |
autompg_clean
| | mpg | cyl | displ | hp | weight | accel | yr | origin | name | mfr |
---|---|---|---|---|---|---|---|---|---|---|
0 | 18.0 | 8 | 307.0 | 130 | 3504 | 12.0 | 70 | North America | chevrolet chevelle malibu | chevrolet |
1 | 15.0 | 8 | 350.0 | 165 | 3693 | 11.5 | 70 | North America | buick skylark 320 | buick |
2 | 18.0 | 8 | 318.0 | 150 | 3436 | 11.0 | 70 | North America | plymouth satellite | plymouth |
3 | 16.0 | 8 | 304.0 | 150 | 3433 | 12.0 | 70 | North America | amc rebel sst | amc |
4 | 17.0 | 8 | 302.0 | 140 | 3449 | 10.5 | 70 | North America | ford torino | ford |
autompg2#
A version of the Auto MPG data set.
Derived from https://archive.ics.uci.edu/ml/datasets/auto+mpg
This module contains one pandas DataFrame: autompg2.
autompg2
| | Unnamed: 0 | manufacturer | model | displ | year | cyl | trans | drv | cty | hwy | fl | class |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Audi | A4 | 1.8 | 1999 | 4 | auto(l5) | front | 18 | 29 | p | compact |
1 | 2 | Audi | A4 | 1.8 | 1999 | 4 | manual(m5) | front | 21 | 29 | p | compact |
2 | 3 | Audi | A4 | 2.0 | 2008 | 4 | manual(m6) | front | 20 | 31 | p | compact |
3 | 4 | Audi | A4 | 2.0 | 2008 | 4 | auto(av) | front | 21 | 30 | p | compact |
4 | 5 | Audi | A4 | 2.8 | 1999 | 6 | auto(l5) | front | 16 | 26 | p | compact |
browsers#
Browser market share by version from November 2013.
Data sourced from http://gs.statcounter.com/#browser_version-ww-monthly-201311-201311-bar
Icon images sourced from https://github.com/alrra/browser-logos
This module contains one pandas DataFrame: browsers_nov_2013.
browsers_nov_2013
| | Version | Share | Browser | VersionNumber |
---|---|---|---|---|
0 | Chrome 30.0 | 18.51 | Chrome | 30.0 |
1 | Chrome 31.0 | 17.31 | Chrome | 31.0 |
2 | Firefox 25.0 | 11.21 | Firefox | 25.0 |
3 | IE 10.0 | 11.10 | IE | 10.0 |
4 | IE 8.0 | 8.65 | IE | 8.0 |
The module also contains a dictionary icons with base64-encoded PNGs of the logos for Chrome, Firefox, Safari, Opera, and IE.
commits#
Time series of commits for a GitHub user between 2012 and 2016.
This module contains one pandas DataFrame: data.
data
datetime | day | time |
---|---|---|
2017-04-22 15:11:58-05:00 | Sat | 15:11:58 |
2017-04-21 14:20:57-05:00 | Fri | 14:20:57 |
2017-04-20 14:35:08-05:00 | Thu | 14:35:08 |
2017-04-20 10:34:29-05:00 | Thu | 10:34:29 |
2017-04-20 09:17:23-05:00 | Thu | 09:17:23 |
daylight#
Provide 2013 Warsaw daylight hours.
Sourced from http://www.sunrisesunset.com
This module contains one pandas DataFrame: daylight_warsaw_2013.
daylight_warsaw_2013
| | Date | Sunrise | Sunset | Summer |
---|---|---|---|---|
0 | 2013-01-01 | 07:45:00 | 15:34:00 | 0 |
1 | 2013-01-02 | 07:45:00 | 15:35:00 | 0 |
2 | 2013-01-03 | 07:45:00 | 15:36:00 | 0 |
3 | 2013-01-04 | 07:45:00 | 15:37:00 | 0 |
4 | 2013-01-05 | 07:44:00 | 15:38:00 | 0 |
degrees#
Provide a table of data regarding bachelor’s degrees earned by women.
The data is broken down by field for any given year.
This module contains one pandas DataFrame: data.
data
| | Year | Agriculture | Architecture | Art and Performance | Biology | Business | Communications and Journalism | Computer Science | Education | Engineering | English | Foreign Languages | Health Professions | Math and Statistics | Physical Sciences | Psychology | Public Administration | Social Sciences and History |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1970 | 4.229798 | 11.921005 | 59.7 | 29.088363 | 9.064439 | 35.3 | 13.6 | 74.535328 | 0.8 | 65.570923 | 73.8 | 77.1 | 38.0 | 13.8 | 44.4 | 68.4 | 36.8 |
1 | 1971 | 5.452797 | 12.003106 | 59.9 | 29.394403 | 9.503187 | 35.5 | 13.6 | 74.149204 | 1.0 | 64.556485 | 73.9 | 75.5 | 39.0 | 14.9 | 46.2 | 65.5 | 36.2 |
2 | 1972 | 7.420710 | 13.214594 | 60.4 | 29.810221 | 10.558962 | 36.6 | 14.9 | 73.554520 | 1.2 | 63.664263 | 74.6 | 76.9 | 40.2 | 14.8 | 47.6 | 62.6 | 36.1 |
3 | 1973 | 9.653602 | 14.791613 | 60.2 | 31.147915 | 12.804602 | 38.4 | 16.4 | 73.501814 | 1.6 | 62.941502 | 74.9 | 77.4 | 40.9 | 16.5 | 50.4 | 64.3 | 36.4 |
4 | 1974 | 14.074623 | 17.444688 | 61.9 | 32.996183 | 16.204850 | 40.5 | 18.9 | 73.336811 | 2.2 | 62.413412 | 75.3 | 77.9 | 41.8 | 18.2 | 52.6 | 66.1 | 37.3 |
gapminder#
Four of the datasets from Gapminder.
Sourced from https://www.gapminder.org/data/
Licensed under CC-BY.
This module contains four pandas DataFrames: fertility, life_expectancy, population, and regions.
fertility
Country | 1964 | 1965 | 1966 | 1967 | 1968 | 1969 | 1970 | 1971 | 1972 | 1973 | 1974 | 1975 | 1976 | 1977 | 1978 | 1979 | 1980 | 1981 | 1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 | 1989 | 1990 | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Afghanistan | 7.671 | 7.671 | 7.671 | 7.671 | 7.671 | 7.671 | 7.671 | 7.671 | 7.671 | 7.671 | 7.671 | 7.671 | 7.670 | 7.670 | 7.670 | 7.669 | 7.669 | 7.670 | 7.671 | 7.673 | 7.676 | 7.679 | 7.681 | 7.682 | 7.682 | 7.682 | 7.687 | 7.700 | 7.725 | 7.758 | 7.796 | 7.832 | 7.859 | 7.869 | 7.854 | 7.809 | 7.733 | 7.623 | 7.484 | 7.321 | 7.136 | 6.930 | 6.702 | 6.456 | 6.196 | 5.928 | 5.659 | 5.395 | 5.141 | 4.900 |
Albania | 5.711 | 5.594 | 5.483 | 5.376 | 5.268 | 5.160 | 5.050 | 4.933 | 4.809 | 4.677 | 4.538 | 4.393 | 4.244 | 4.094 | 3.947 | 3.807 | 3.678 | 3.562 | 3.460 | 3.372 | 3.297 | 3.233 | 3.177 | 3.126 | 3.075 | 3.023 | 2.970 | 2.917 | 2.867 | 2.819 | 2.772 | 2.723 | 2.670 | 2.611 | 2.543 | 2.467 | 2.383 | 2.291 | 2.195 | 2.097 | 2.004 | 1.919 | 1.849 | 1.796 | 1.761 | 1.744 | 1.741 | 1.748 | 1.760 | 1.771 |
Algeria | 7.653 | 7.655 | 7.657 | 7.658 | 7.657 | 7.652 | 7.641 | 7.622 | 7.591 | 7.548 | 7.492 | 7.422 | 7.339 | 7.244 | 7.138 | 7.021 | 6.889 | 6.741 | 6.576 | 6.392 | 6.192 | 5.976 | 5.747 | 5.508 | 5.263 | 5.014 | 4.761 | 4.503 | 4.238 | 3.971 | 3.705 | 3.449 | 3.207 | 2.987 | 2.794 | 2.634 | 2.514 | 2.439 | 2.407 | 2.412 | 2.448 | 2.507 | 2.580 | 2.656 | 2.725 | 2.781 | 2.817 | 2.829 | 2.820 | 2.795 |
American Samoa | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
Andorra | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
life_expectancy
Country | 1964 | 1965 | 1966 | 1967 | 1968 | 1969 | 1970 | 1971 | 1972 | 1973 | 1974 | 1975 | 1976 | 1977 | 1978 | 1979 | 1980 | 1981 | 1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 | 1989 | 1990 | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Afghanistan | 33.639 | 34.152 | 34.662 | 35.170 | 35.674 | 36.172 | 36.663 | 37.143 | 37.614 | 38.075 | 38.529 | 38.977 | 39.417 | 39.855 | 40.298 | 40.756 | 41.242 | 41.770 | 42.347 | 42.977 | 43.661 | 44.400 | 45.192 | 46.024 | 46.880 | 47.744 | 48.601 | 49.439 | 50.247 | 51.017 | 51.738 | 52.400 | 52.995 | 53.527 | 54.009 | 54.449 | 54.863 | 55.271 | 55.687 | 56.122 | 56.583 | 57.071 | 57.582 | 58.102 | 58.618 | 59.124 | 59.612 | 60.079 | 60.524 | 60.947 |
Albania | 65.475 | 65.863 | 66.122 | 66.316 | 66.500 | 66.702 | 66.948 | 67.251 | 67.595 | 67.966 | 68.356 | 68.748 | 69.121 | 69.459 | 69.753 | 70.001 | 70.218 | 70.426 | 70.646 | 70.886 | 71.144 | 71.398 | 71.615 | 71.770 | 71.853 | 71.870 | 71.842 | 71.799 | 71.779 | 71.813 | 71.920 | 72.117 | 72.415 | 72.796 | 73.235 | 73.713 | 74.200 | 74.664 | 75.081 | 75.437 | 75.725 | 75.949 | 76.124 | 76.278 | 76.433 | 76.598 | 76.780 | 76.979 | 77.185 | 77.392 |
Algeria | 47.953 | 48.389 | 48.806 | 49.205 | 49.592 | 49.976 | 50.366 | 50.767 | 51.195 | 51.670 | 52.213 | 52.861 | 53.656 | 54.605 | 55.697 | 56.907 | 58.198 | 59.524 | 60.826 | 62.051 | 63.160 | 64.120 | 64.911 | 65.554 | 66.072 | 66.479 | 66.796 | 67.049 | 67.265 | 67.468 | 67.674 | 67.893 | 68.123 | 68.350 | 68.565 | 68.769 | 68.963 | 69.149 | 69.330 | 69.508 | 69.682 | 69.854 | 70.020 | 70.180 | 70.332 | 70.477 | 70.615 | 70.747 | 70.874 | 71.000 |
American Samoa | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
Andorra | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
population
Country | 1964 | 1965 | 1966 | 1967 | 1968 | 1969 | 1970 | 1971 | 1972 | 1973 | 1974 | 1975 | 1976 | 1977 | 1978 | 1979 | 1980 | 1981 | 1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 | 1989 | 1990 | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Afghanistan | 10474903.0 | 10697983.0 | 10927724.0 | 11163656.0 | 11411022.0 | 11676990.0 | 11964906.0 | 12273101.0 | 12593688.0 | 12915499.0 | 13223928.0 | 13505544.0 | 13766792.0 | 14003408.0 | 14179656.0 | 14249493.0 | 14185729.0 | 13984092.0 | 13672870.0 | 13300056.0 | 12931791.0 | 12625292.0 | 12372113.0 | 12183387.0 | 12156685.0 | 12414686.0 | 13032161.0 | 14069854.0 | 15472076.0 | 17053213.0 | 18553819.0 | 19789880.0 | 20684982.0 | 21299350.0 | 21752257.0 | 22227543.0 | 22856302.0 | 23677385.0 | 24639841.0 | 25678639.0 | 26693486.0 | 27614718.0 | 28420974.0 | 29145841.0 | 29839994.0 | 30577756.0 | 31411743.0 | 32358260.0 | 33397058.0 | 34499915.0 |
Albania | 1817098.0 | 1869942.0 | 1922993.0 | 1976140.0 | 2029314.0 | 2082474.0 | 2135599.0 | 2188650.0 | 2241623.0 | 2294578.0 | 2347607.0 | 2400801.0 | 2454255.0 | 2508026.0 | 2562121.0 | 2616530.0 | 2671300.0 | 2725029.0 | 2777592.0 | 2831682.0 | 2891004.0 | 2957390.0 | 3033393.0 | 3116009.0 | 3194854.0 | 3255859.0 | 3289483.0 | 3291695.0 | 3266983.0 | 3224901.0 | 3179442.0 | 3141102.0 | 3112597.0 | 3091902.0 | 3079037.0 | 3072725.0 | 3071856.0 | 3077378.0 | 3089778.0 | 3106701.0 | 3124861.0 | 3141800.0 | 3156607.0 | 3169665.0 | 3181397.0 | 3192723.0 | 3204284.0 | 3215988.0 | 3227373.0 | 3238316.0 |
Algeria | 11654905.0 | 11923002.0 | 12229853.0 | 12572629.0 | 12945462.0 | 13338918.0 | 13746185.0 | 14165889.0 | 14600659.0 | 15052371.0 | 15524137.0 | 16018195.0 | 16533323.0 | 17068212.0 | 17624756.0 | 18205468.0 | 18811199.0 | 19442423.0 | 20095648.0 | 20762767.0 | 21433070.0 | 22098298.0 | 22753511.0 | 23398470.0 | 24035237.0 | 24668100.0 | 25299182.0 | 25930560.0 | 26557969.0 | 27169903.0 | 27751086.0 | 28291591.0 | 28786855.0 | 29242917.0 | 29673694.0 | 30099010.0 | 30533827.0 | 30982214.0 | 31441848.0 | 31913462.0 | 32396048.0 | 32888449.0 | 33391954.0 | 33906605.0 | 34428028.0 | 34950168.0 | 35468208.0 | 35980193.0 | 36485828.0 | 36983924.0 |
American Samoa | 22672.0 | 23480.0 | 24283.0 | 25087.0 | 25869.0 | 26608.0 | 27288.0 | 27907.0 | 28470.0 | 28983.0 | 29453.0 | 29897.0 | 30305.0 | 30696.0 | 31139.0 | 31727.0 | 32526.0 | 33557.0 | 34797.0 | 36203.0 | 37706.0 | 39253.0 | 40834.0 | 42446.0 | 44048.0 | 45595.0 | 47052.0 | 48402.0 | 49648.0 | 50801.0 | 51885.0 | 52919.0 | 53901.0 | 54834.0 | 55745.0 | 56667.0 | 57625.0 | 58633.0 | 59687.0 | 60774.0 | 61871.0 | 62962.0 | 64045.0 | 65130.0 | 66217.0 | 67312.0 | 68420.0 | 69543.0 | 70680.0 | 71834.0 |
Andorra | 17438.0 | 18529.0 | 19640.0 | 20772.0 | 21931.0 | 23127.0 | 24364.0 | 25656.0 | 26997.0 | 28357.0 | 29688.0 | 30967.0 | 32156.0 | 33279.0 | 34432.0 | 35753.0 | 37328.0 | 39226.0 | 41390.0 | 43636.0 | 45702.0 | 47414.0 | 48653.0 | 49504.0 | 50236.0 | 51241.0 | 52773.0 | 54996.0 | 57767.0 | 60670.0 | 63111.0 | 64699.0 | 65227.0 | 64905.0 | 64246.0 | 63985.0 | 64634.0 | 66390.0 | 69043.0 | 72203.0 | 75292.0 | 77888.0 | 79874.0 | 81390.0 | 82577.0 | 83677.0 | 84864.0 | 86165.0 | 87518.0 | 88909.0 |
regions
Country | Group | ID |
---|---|---|
Angola | Sub-Saharan Africa | AO |
Benin | Sub-Saharan Africa | BJ |
Botswana | Sub-Saharan Africa | BW |
Burkina Faso | Sub-Saharan Africa | BF |
Burundi | Sub-Saharan Africa | BI |
glucose#
A CSV timeseries of blood glucose measurements.
This module contains one pandas DataFrame: data.
data
datetime | isig | glucose |
---|---|---|
2010-03-24 09:51:00 | 22.59 | 258 |
2010-03-24 09:56:00 | 22.52 | 260 |
2010-03-24 10:01:00 | 22.23 | 258 |
2010-03-24 10:06:00 | 21.56 | 254 |
2010-03-24 10:11:00 | 20.79 | 246 |
haar_cascade#
Provide a Haar cascade file for face recognition.
This module contains an attribute frontalface_default_path. Use this attribute to obtain the path to a Haar cascade file for frontal face recognition that can be used by OpenCV.
iris#
Provide Fisher’s Iris dataset.
This module contains one pandas DataFrame: flowers.
Note
This sampledata is maintained for historical compatibility. Please consider alternatives to Iris such as penguins.
flowers
| | sepal_length | sepal_width | petal_length | petal_width | species |
---|---|---|---|---|---|
0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
les_mis#
Provide JSON data for co-occurrence of characters in Les Miserables.
Derived from http://ftp.cs.stanford.edu/pub/sgb/jean.dat
This module contains one dictionary: data.
data
{
'nodes': [
{'name': 'Myriel', 'group': 1},
...
{'name': 'Mme.Hucheloup', 'group': 8}
],
'links': [
{'source': 1, 'target': 0, 'value': 1},
...
{'source': 76, 'target': 58, 'value': 1}
]
}
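The structure above can be traversed with a few lines of plain Python. The sketch below assumes that each link's source and target are zero-based positions in the nodes list, which the excerpt suggests but does not state. The helper link_names and the toy graph are hypothetical stand-ins, not the real dataset.

```python
def link_names(graph):
    """Return (source_name, target_name, value) for every link.

    Assumes 'source' and 'target' index into graph['nodes'].
    """
    names = [node["name"] for node in graph["nodes"]]
    return [
        (names[link["source"]], names[link["target"]], link["value"])
        for link in graph["links"]
    ]


# Toy graph with the same shape as data; node names are placeholders.
toy = {
    "nodes": [
        {"name": "A", "group": 1},
        {"name": "B", "group": 1},
        {"name": "C", "group": 2},
    ],
    "links": [
        {"source": 1, "target": 0, "value": 1},
        {"source": 2, "target": 0, "value": 8},
    ],
}

pairs = link_names(toy)
print(pairs)
```

With Bokeh's sample data installed, the same function could be called on the data dictionary imported from bokeh.sampledata.les_mis.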
movies_data#
A small subset of data from the Open Movie Database.
Data is licensed CC BY-NC 4.0.
This module has an attribute movie_path. This attribute contains the path to a SQLite database with the data.
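Because the attribute is just a filesystem path, the database can be inspected with Python's built-in sqlite3 module. The sketch below lists the tables in a SQLite file. The helper list_tables is hypothetical, and the demo runs against a throwaway database with a made-up table, since the schema of the movie database is not described here.

```python
import os
import sqlite3
import tempfile


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Demo on a temporary database; 'demo_table' is a placeholder name.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE demo_table (id INTEGER)")
    conn.commit()
    conn.close()
    tables = list_tables(demo_path)

print(tables)
```

Passing movie_path from bokeh.sampledata.movies_data instead of demo_path should list the tables in the sample database.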
mtb#
Route data (including altitude) for a bike race in Eastern Europe.
This module contains one pandas DataFrame: obiszow_mtb_xcm.
obiszow_mtb_xcm
| | lon | lat | alt |
---|---|---|---|
0 | 16.116775 | 51.578265 | 118.0 |
1 | 16.116741 | 51.578265 | 118.0 |
2 | 16.116776 | 51.578253 | 118.0 |
3 | 16.116792 | 51.578223 | 119.0 |
4 | 16.116584 | 51.578058 | 119.0 |
olympics2014#
Provide medal counts by country for the 2014 Olympics.
This module contains a single dict: data.
The dictionary has a key "data" that lists sub-dictionaries, one for each country:
{
'abbr': 'DEU',
'medals': {'total': 15, 'bronze': 4, 'gold': 8, 'silver': 3},
'name': 'Germany'
}
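As one possible use of this structure, the sketch below ranks countries by total medal count. The Germany entry is copied from the example above. The second entry and the helper rank_by_total are hypothetical and only illustrate the shape of the data.

```python
def rank_by_total(countries):
    """Sort country records by total medals, highest first."""
    return sorted(countries, key=lambda c: c["medals"]["total"], reverse=True)


# Stand-in for the module's data dict; the 'XXX' record is a placeholder.
sample = {
    "data": [
        {
            "abbr": "XXX",
            "medals": {"total": 2, "bronze": 1, "gold": 0, "silver": 1},
            "name": "Placeholder",
        },
        {
            "abbr": "DEU",
            "medals": {"total": 15, "bronze": 4, "gold": 8, "silver": 3},
            "name": "Germany",
        },
    ]
}

ranked = [country["name"] for country in rank_by_total(sample["data"])]
print(ranked)
```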
penguins#
Provide data from the Palmer Archipelago (Antarctica) penguin dataset.
Derived from https://github.com/mwaskom/seaborn-data/blob/master/penguins.csv
Data distributed under the CC-0 license.
This module contains one pandas DataFrame: data.
data
| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
---|---|---|---|---|---|---|---|
0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | MALE |
1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | FEMALE |
2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | FEMALE |
3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN |
4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | FEMALE |
perceptions#
Provides access to probly.csv and numberly.csv.
Sourced from: https://github.com/zonination/perceptions
Data distributed under the MIT license.
This module contains two pandas DataFrames: probly and numberly.
probly
| | Almost Certainly | Highly Likely | Very Good Chance | Probable | Likely | Probably | We Believe | Better Than Even | About Even | We Doubt | Improbable | Unlikely | Probably Not | Little Chance | Almost No Chance | Highly Unlikely | Chances Are Slight |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 95.0 | 80 | 85 | 75 | 66 | 75 | 66 | 55.0 | 50 | 40 | 20.0 | 30 | 15.0 | 20 | 5.0 | 25 | 25 |
1 | 95.0 | 75 | 75 | 51 | 75 | 51 | 51 | 51.0 | 50 | 20 | 49.0 | 25 | 49.0 | 5 | 5.0 | 10 | 5 |
2 | 95.0 | 85 | 85 | 70 | 75 | 70 | 80 | 60.0 | 50 | 30 | 10.0 | 25 | 25.0 | 20 | 1.0 | 5 | 15 |
3 | 95.0 | 85 | 85 | 70 | 75 | 70 | 80 | 60.0 | 50 | 30 | 10.0 | 25 | 25.0 | 20 | 1.0 | 5 | 15 |
4 | 98.0 | 95 | 80 | 70 | 70 | 75 | 65 | 60.0 | 50 | 10 | 50.0 | 5 | 20.0 | 5 | 1.0 | 2 | 10 |
numberly
| | A couple | A few | Dozens | A lot | Some | Several | Many | Fractions of | Scores of | Hundreds of |
---|---|---|---|---|---|---|---|---|---|---|
0 | 2 | 3 | 30 | 20 | 4 | 7 | 12 | 0.15 | 80 | 250 |
1 | 2 | 3 | 24 | 12 | 6 | 10 | 50 | 0.50 | 40 | 200 |
2 | 2 | 5 | 30 | 15 | 5 | 4 | 25 | 0.25 | 500 | 500 |
3 | 2 | 5 | 30 | 15 | 5 | 4 | 25 | 0.25 | 500 | 500 |
4 | 2 | 3 | 48 | 50 | 3 | 5 | 5 | 0.01 | 100000 | 599 |
periodic_table#
Provide a periodic table data set.
This module contains one pandas DataFrame: elements.
elements
| | atomic number | symbol | name | atomic mass | CPK | electronic configuration | electronegativity | atomic radius | ion radius | van der Waals radius | IE-1 | EA | standard state | bonding type | melting point | boiling point | density | metal | year discovered | group | period |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | H | Hydrogen | 1.00794 | #FFFFFF | 1s1 | 2.20 | 37.0 | NaN | 120.0 | 1312.0 | -73.0 | gas | diatomic | 14.0 | 20.0 | 0.00009 | nonmetal | 1766 | 1 | 1 |
1 | 2 | He | Helium | 4.002602 | #D9FFFF | 1s2 | NaN | 32.0 | NaN | 140.0 | 2372.0 | 0.0 | gas | atomic | NaN | 4.0 | 0.00000 | noble gas | 1868 | 18 | 1 |
2 | 3 | Li | Lithium | 6.941 | #CC80FF | [He] 2s1 | 0.98 | 134.0 | 76 (+1) | 182.0 | 520.0 | -60.0 | solid | metallic | 454.0 | 1615.0 | 0.54000 | alkali metal | 1817 | 1 | 2 |
3 | 4 | Be | Beryllium | 9.012182 | #C2FF00 | [He] 2s2 | 1.57 | 90.0 | 45 (+2) | NaN | 900.0 | 0.0 | solid | metallic | 1560.0 | 2743.0 | 1.85000 | alkaline earth metal | 1798 | 2 | 2 |
4 | 5 | B | Boron | 10.811 | #FFB5B5 | [He] 2s2 2p1 | 2.04 | 82.0 | 27 (+3) | NaN | 801.0 | -27.0 | solid | covalent network | 2348.0 | 4273.0 | 2.46000 | metalloid | 1807 | 13 | 2 |
population#
Historical and projected population data by age, gender, and country.
Sourced from: https://population.un.org/wpp/Download/Standard/Population/
Data is licensed under CC BY 3.0 IGO.
This module contains one pandas DataFrame: data.
data
| | LocID | Location | Year | Sex | AgeGrp | AgeGrpStart | Value |
---|---|---|---|---|---|---|---|
0 | 4 | Afghanistan | 1950 | Male | 0-4 | 0 | 662064.0 |
1 | 4 | Afghanistan | 1950 | Male | 5-9 | 5 | 508166.0 |
2 | 4 | Afghanistan | 1950 | Male | 10-14 | 10 | 444396.0 |
3 | 4 | Afghanistan | 1950 | Male | 15-19 | 15 | 390480.0 |
4 | 4 | Afghanistan | 1950 | Male | 20-24 | 20 | 337318.0 |
sample_geojson#
Provide geojson data for the UK NHS England area teams.
Sourced from https://github.com/JeniT/nhs-choices with data licensed under the Open Government Licence.
A snapshot of data available from NHS Choices on November 14th, 2015.
sea_surface_temperature#
Time series of historical average sea surface temperatures.
This module contains one pandas DataFrame: sea_surface_temperature.
sea_surface_temperature
time | temperature |
---|---|
2016-02-15 00:00:00+00:00 | 4.929 |
2016-02-15 00:30:00+00:00 | 4.887 |
2016-02-15 01:00:00+00:00 | 4.821 |
2016-02-15 01:30:00+00:00 | 4.837 |
2016-02-15 02:00:00+00:00 | 4.830 |
sprint#
Historical results for Olympic sprints by year.
This module contains one pandas DataFrame: sprint.
sprint
| | Name | Country | Medal | Time | Year |
---|---|---|---|---|---|
0 | Usain Bolt | JAM | GOLD | 9.63 | 2012 |
1 | Yohan Blake | JAM | SILVER | 9.75 | 2012 |
2 | Justin Gatlin | USA | BRONZE | 9.79 | 2012 |
3 | Usain Bolt | JAM | GOLD | 9.69 | 2008 |
4 | Richard Thompson | TRI | SILVER | 9.89 | 2008 |
stocks#
Provide historical ticker data for selected stocks.
This module contains five dicts: AAPL, FB, GOOG, IBM, and MSFT.
Each dictionary has the structure:
AAPL['date'] # list of date strings
AAPL['open'] # list of float
AAPL['high'] # list of float
AAPL['low'] # list of float
AAPL['close'] # list of float
AAPL['volume'] # list of int
AAPL['adj_close'] # list of float
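Since each ticker is a dict of parallel lists, values at the same position belong to the same trading day. The sketch below pairs dates with closing prices and computes the daily high-low range. The numbers are made up, the "YYYY-MM-DD" date format is an assumption, and the helper names are hypothetical.

```python
def closes_by_date(ticker):
    """Map each date string to that day's closing price."""
    return dict(zip(ticker["date"], ticker["close"]))


def daily_ranges(ticker):
    """Return high minus low for each day, rounded to two decimals."""
    return [round(high - low, 2) for high, low in zip(ticker["high"], ticker["low"])]


# Placeholder values with the same keys as the real dicts (subset shown).
toy_ticker = {
    "date": ["2000-01-03", "2000-01-04"],
    "high": [10.5, 11.0],
    "low": [9.5, 10.25],
    "close": [10.0, 10.75],
}

closes = closes_by_date(toy_ticker)
ranges = daily_ranges(toy_ticker)
print(closes, ranges)
```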
unemployment#
Per-county unemployment data for the United States in 2009.
This module contains one dict: data.
The dict is indexed by two-tuples of (state_id, county_id) and has the unemployment rate (2009) as the value.
{
(1, 1): 9.7,
(1, 3): 9.1,
...
}
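Tuple keys make it straightforward to slice the dict by state. The sketch below uses only the two entries shown above. The helper rates_for_state is hypothetical.

```python
def rates_for_state(data, state_id):
    """Return {county_id: rate} for all counties in one state."""
    return {
        county: rate
        for (state, county), rate in data.items()
        if state == state_id
    }


# The two entries from the excerpt above.
sample = {(1, 1): 9.7, (1, 3): 9.1}

state_rates = rates_for_state(sample, 1)
highest = max(sample, key=sample.get)  # key with the largest rate
print(state_rates, highest)
```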
unemployment1948#
US Unemployment rate data by month and year, from 1948 to 2013.
This module contains one pandas DataFrame: data.
data
| | Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Annual |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1948 | 4.0 | 4.7 | 4.5 | 4.0 | 3.4 | 3.9 | 3.9 | 3.6 | 3.4 | 2.9 | 3.3 | 3.6 | 3.8 |
1 | 1949 | 5.0 | 5.8 | 5.6 | 5.4 | 5.7 | 6.4 | 7.0 | 6.3 | 5.9 | 6.1 | 5.7 | 6.0 | 5.9 |
2 | 1950 | 7.6 | 7.9 | 7.1 | 6.0 | 5.3 | 5.6 | 5.3 | 4.1 | 4.0 | 3.3 | 3.8 | 3.9 | 5.3 |
3 | 1951 | 4.4 | 4.2 | 3.8 | 3.2 | 2.9 | 3.4 | 3.3 | 2.9 | 3.0 | 2.8 | 3.2 | 2.9 | 3.3 |
4 | 1952 | 3.7 | 3.8 | 3.3 | 3.0 | 2.9 | 3.2 | 3.3 | 3.1 | 2.7 | 2.4 | 2.5 | 2.5 | 3.0 |
us_cities#
Locations of US cities with more than 5000 residents.
This module contains one dict: data.
data['lat'] # list of float
data['lon'] # list of float
us_counties#
This module exposes geometry data for the United States.
This module contains one dict: data.
The data is indexed by two-tuples of (state_id, county_id) that have the following dictionaries as values:
In [25]: data[(1,1)]
Out[25]:
{
'name': 'Autauga',
'detailed name': 'Autauga County, Alabama',
'state': 'al',
'lats': [32.4757, ..., 32.48112],
'lons': [-86.41182, ..., -86.41187]
}
Entries for 'name' can have duplicates for certain states (e.g. Virginia).
The combination of 'detailed name' and 'state' will always be unique.
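A short sketch of how the county records might be filtered by state follows. The Autauga entry is taken from the example above, with its coordinate lists cut down to the two endpoints shown. The second entry and the helper counties_in_state are hypothetical.

```python
def counties_in_state(data, state):
    """Return sorted 'detailed name' values for one state code."""
    return sorted(
        entry["detailed name"]
        for entry in data.values()
        if entry["state"] == state
    )


sample = {
    (1, 1): {
        "name": "Autauga",
        "detailed name": "Autauga County, Alabama",
        "state": "al",
        "lats": [32.4757, 32.48112],      # truncated to the endpoints shown
        "lons": [-86.41182, -86.41187],   # truncated to the endpoints shown
    },
    # Placeholder record in a different, made-up state code.
    (99, 1): {
        "name": "Example",
        "detailed name": "Example County, Placeholder",
        "state": "zz",
        "lats": [0.0, 1.0],
        "lons": [0.0, 1.0],
    },
}

alabama = counties_in_state(sample, "al")
print(alabama)
```

Using 'detailed name' rather than 'name' avoids the duplicate-name issue noted above.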
us_holidays#
Calendar file of US Holidays from Mozilla provided by icalendar.
Sourced from: https://www.mozilla.org/en-US/projects/calendar/holidays/
This module contains one list: us_holidays.
us_holidays
[
(datetime.date(1966, 12, 26), 'Kwanzaa'),
(datetime.date(2000, 1, 1), "New Year's Day"),
...
(datetime.date(2020, 12, 25), 'Christmas Day (US-OPM)')
]
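Because each item is a (date, name) tuple, the list can be filtered with ordinary date attributes. The sketch below uses only the three entries visible in the excerpt. The helper holidays_in_year is hypothetical.

```python
import datetime


def holidays_in_year(holidays, year):
    """Return the names of all holidays that fall in the given year."""
    return [name for day, name in holidays if day.year == year]


# The three entries shown in the excerpt above.
sample = [
    (datetime.date(1966, 12, 26), "Kwanzaa"),
    (datetime.date(2000, 1, 1), "New Year's Day"),
    (datetime.date(2020, 12, 25), "Christmas Day (US-OPM)"),
]

names_2000 = holidays_in_year(sample, 2000)
print(names_2000)
```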
us_marriages_divorces#
Provide U.S. marriage and divorce statistics between 1867 and 2014.
Data from the CDC’s National Center for Health Statistics (NCHS) database (http://www.cdc.gov/nchs/).
Data organized by Randal S. Olson (http://www.randalolson.com).
This module contains one pandas DataFrame: data.
data
| | Year | Marriages | Divorces | Population | Marriages_per_1000 | Divorces_per_1000 |
---|---|---|---|---|---|---|
0 | 1867 | 357000.0 | 10000.0 | 36970000 | 9.7 | 0.3 |
1 | 1868 | 345000.0 | 10000.0 | 37885000 | 9.1 | 0.3 |
2 | 1869 | 348000.0 | 11000.0 | 38870000 | 9.0 | 0.3 |
3 | 1870 | 352000.0 | 11000.0 | 39905000 | 8.8 | 0.3 |
4 | 1871 | 359000.0 | 12000.0 | 41010000 | 8.8 | 0.3 |
us_states#
Geometry data for US States.
This module contains one dict: data.
The data is indexed by the two letter state code (e.g., ‘CA’, ‘TX’) and has the following structure:
In [4]: data["OR"]
Out[4]:
{
'name': 'Oregon',
'region': 'Northwest',
'lats': [46.29443, ..., 46.26068],
'lons': [-124.03622, ..., -124.15935]
}
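One simple thing to derive from the 'lats' and 'lons' lists is a bounding box. The sketch below uses the Oregon entry with its coordinate lists reduced to the two endpoints shown above, so the resulting box is illustrative only. The helper bounding_box is hypothetical.

```python
def bounding_box(state):
    """Return (min_lon, min_lat, max_lon, max_lat) for one state record."""
    return (
        min(state["lons"]),
        min(state["lats"]),
        max(state["lons"]),
        max(state["lats"]),
    )


# Oregon record truncated to the endpoint coordinates shown in the excerpt.
oregon = {
    "name": "Oregon",
    "region": "Northwest",
    "lats": [46.29443, 46.26068],
    "lons": [-124.03622, -124.15935],
}

box = bounding_box(oregon)
print(box)
```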
world_cities#
Names and locations of world cities with at least 5000 inhabitants.
Derived from the cities5000.zip file downloaded from http://www.geonames.org/export/
Licensed under CC-BY.
This module contains one pandas DataFrame: data.
data
| | name | lat | lng |
---|---|---|---|
0 | Ordino | 42.55623 | 1.53319 |
1 | les Escaldes | 42.50729 | 1.53414 |
2 | la Massana | 42.54499 | 1.51483 |
3 | Encamp | 42.53474 | 1.58014 |
4 | Canillo | 42.56760 | 1.59756 |