bokeh.sampledata#

The sampledata module can be used to download data sets used in Bokeh examples.

The simplest way to download the data is to execute the command line program:

bokeh sampledata

Alternatively, the download function described below may be called programmatically.

>>> import bokeh.sampledata
>>> bokeh.sampledata.download()

By default, data is downloaded and stored in the directory $HOME/.bokeh/data. This directory is created if it does not already exist.

Bokeh also looks for a YAML configuration file at $HOME/.bokeh/config. The YAML key sampledata_dir can be set to the absolute path of a directory where the data should be stored. For example, add the following line to the config file:

sampledata_dir: /tmp/bokeh_data

This will cause the sample data to be stored in /tmp/bokeh_data.
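
The lookup order described above can be sketched in plain Python. This is only an illustration of the documented behavior, not Bokeh's implementation: the function name resolve_sampledata_dir is hypothetical, and the one-line key parser is a stand-in for a real YAML loader.

```python
from pathlib import Path


def resolve_sampledata_dir(home, config_text=None):
    """Return the sample data directory for a given home directory.

    Hypothetical helper that mirrors the documented rules, not Bokeh's code.
    """
    default = Path(home) / ".bokeh" / "data"
    if not config_text:
        return default
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        # Simplification: only a plain "sampledata_dir: <path>" line is handled.
        if sep and key.strip() == "sampledata_dir" and value.strip():
            return Path(value.strip())
    return default


print(resolve_sampledata_dir("/home/user"))
print(resolve_sampledata_dir("/home/user", "sampledata_dir: /tmp/bokeh_data"))
```

Without a sampledata_dir entry the sketch falls back to the default location, which matches the behavior described for a missing or empty config file.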

download(progress: bool = True) → None#

Download larger data sets for various Bokeh examples.


anscombe#

The four data series that comprise Anscombe’s Quartet.

This module contains one pandas DataFrame: data.

data

Ix Iy IIx IIy IIIx IIIy IVx IVy
0 10.0 8.04 10.0 9.14 10.0 7.46 8.0 6.58
1 8.0 6.95 8.0 8.14 8.0 6.77 8.0 5.76
2 13.0 7.58 13.0 8.74 13.0 12.74 8.0 7.71
3 9.0 8.81 9.0 8.77 9.0 7.11 8.0 8.84
4 11.0 8.33 11.0 9.26 11.0 7.81 8.0 8.47

antibiotics#

A table of Will Burtin’s historical data regarding antibiotic efficacies.

This module contains one pandas DataFrame: data.

data

bacteria penicillin streptomycin neomycin gram
0 Mycobacterium tuberculosis 800.0 5.0 2.00 negative
1 Salmonella schottmuelleri 10.0 0.8 0.09 negative
2 Proteus vulgaris 3.0 0.1 0.10 negative
3 Klebsiella pneumoniae 850.0 1.2 1.00 negative
4 Brucella abortus 1.0 2.0 0.02 negative

airport_routes#

Airport routes data from OpenFlights.org.

Sourced from https://openflights.org/data.html on September 07, 2017.

This module contains two pandas DataFrames: airports and routes.

airports

AirportID Name City Country IATA ICAO Latitude Longitude Altitude Timezone DST TZ Type source
0 3411 Barter Island LRRS Airport Barter Island United States BTI PABA 70.134003 -143.582001 2 -9 A America/Anchorage airport OurAirports
1 3413 Cape Lisburne LRRS Airport Cape Lisburne United States LUR PALU 68.875099 -166.110001 16 -9 A America/Anchorage airport OurAirports
2 3414 Point Lay LRRS Airport Point Lay United States PIZ PPIZ 69.732903 -163.005005 22 -9 A America/Anchorage airport OurAirports
3 3415 Hilo International Airport Hilo United States ITO PHTO 19.721399 -155.048004 38 -10 N Pacific/Honolulu airport OurAirports
4 3416 Orlando Executive Airport Orlando United States ORL KORL 28.545500 -81.332901 113 -5 A America/New_York airport OurAirports

routes

Airline AirlineID Source SourceID Destination DestinationID Codeshare Stops Equipment
0 2O 146 ADQ 3531 KLN 7162 NaN 0 BNI
1 2O 146 KLN 7162 KYK 7161 NaN 0 BNI
2 3E 10739 BRL 5726 ORD 3830 NaN 0 CNC
3 3E 10739 BRL 5726 STL 3678 NaN 0 CNC
4 3E 10739 DEC 4042 ORD 3830 NaN 0 CNC

airport#

US airports with field elevations > 1500 meters.

Sourced from http://services.nationalmap.gov on October 15, 2015.

This module contains one pandas DataFrame: data.

data

name elevation x y
0 CHINLE MUNICIPAL AIRPORT 1691 -1.219788e+07 4.315889e+06
1 ELY AIRPORT /YELLAND FIELD/ AIRPORT 1908 -1.278414e+07 4.764692e+06
2 TRUCKEE-TAHOE AIRPORT 1798 -1.337387e+07 4.767619e+06
3 GARFIELD COUNTY REGIONAL AIRPORT 1691 -1.199211e+07 4.797343e+06
4 SANTA FE MUNICIPAL AIRPORT 1935 -1.180982e+07 4.248063e+06

autompg#

A version of the Auto MPG data set.

Derived from https://archive.ics.uci.edu/ml/datasets/auto+mpg

This module contains two pandas DataFrames: autompg and autompg_clean. The “clean” version has cleaned-up "mfr" and "origin" fields.

autompg

mpg cyl displ hp weight accel yr origin name
0 18.0 8 307.0 130 3504 12.0 70 1 chevrolet chevelle malibu
1 15.0 8 350.0 165 3693 11.5 70 1 buick skylark 320
2 18.0 8 318.0 150 3436 11.0 70 1 plymouth satellite
3 16.0 8 304.0 150 3433 12.0 70 1 amc rebel sst
4 17.0 8 302.0 140 3449 10.5 70 1 ford torino

autompg_clean

mpg cyl displ hp weight accel yr origin name mfr
0 18.0 8 307.0 130 3504 12.0 70 North America chevrolet chevelle malibu chevrolet
1 15.0 8 350.0 165 3693 11.5 70 North America buick skylark 320 buick
2 18.0 8 318.0 150 3436 11.0 70 North America plymouth satellite plymouth
3 16.0 8 304.0 150 3433 12.0 70 North America amc rebel sst amc
4 17.0 8 302.0 140 3449 10.5 70 North America ford torino ford

autompg2#

A version of the Auto MPG data set.

Derived from https://archive.ics.uci.edu/ml/datasets/auto+mpg

This module contains one pandas DataFrame: autompg2.

autompg2

Unnamed: 0 manufacturer model displ year cyl trans drv cty hwy fl class
0 1 Audi A4 1.8 1999 4 auto(l5) front 18 29 p compact
1 2 Audi A4 1.8 1999 4 manual(m5) front 21 29 p compact
2 3 Audi A4 2.0 2008 4 manual(m6) front 20 31 p compact
3 4 Audi A4 2.0 2008 4 auto(av) front 21 30 p compact
4 5 Audi A4 2.8 1999 6 auto(l5) front 16 26 p compact

browsers#

Browser market share by version from November 2013.

Data sourced from http://gs.statcounter.com/#browser_version-ww-monthly-201311-201311-bar

Icon images sourced from https://github.com/alrra/browser-logos

This module contains one pandas DataFrame: browsers_nov_2013.

browsers_nov_2013

Version Share Browser VersionNumber
0 Chrome 30.0 18.51 Chrome 30.0
1 Chrome 31.0 17.31 Chrome 31.0
2 Firefox 25.0 11.21 Firefox 25.0
3 IE 10.0 11.10 IE 10.0
4 IE 8.0 8.65 IE 8.0

The module also contains a dictionary icons with base64-encoded PNGs of the logos for Chrome, Firefox, Safari, Opera, and IE.

commits#

Time series of commits for a GitHub user between 2012 and 2016.

This module contains one pandas DataFrame: data.

data

day time
datetime
2017-04-22 15:11:58-05:00 Sat 15:11:58
2017-04-21 14:20:57-05:00 Fri 14:20:57
2017-04-20 14:35:08-05:00 Thu 14:35:08
2017-04-20 10:34:29-05:00 Thu 10:34:29
2017-04-20 09:17:23-05:00 Thu 09:17:23

daylight#

Provide 2013 Warsaw daylight hours.

Sourced from http://www.sunrisesunset.com

This module contains one pandas DataFrame: daylight_warsaw_2013.

daylight_warsaw_2013

Date Sunrise Sunset Summer
0 2013-01-01 07:45:00 15:34:00 0
1 2013-01-02 07:45:00 15:35:00 0
2 2013-01-03 07:45:00 15:36:00 0
3 2013-01-04 07:45:00 15:37:00 0
4 2013-01-05 07:44:00 15:38:00 0

degrees#

Provide a table of data regarding bachelor’s degrees earned by women.

The data is broken down by field for any given year.

This module contains one pandas DataFrame: data.

data

Year Agriculture Architecture Art and Performance Biology Business Communications and Journalism Computer Science Education Engineering English Foreign Languages Health Professions Math and Statistics Physical Sciences Psychology Public Administration Social Sciences and History
0 1970 4.229798 11.921005 59.7 29.088363 9.064439 35.3 13.6 74.535328 0.8 65.570923 73.8 77.1 38.0 13.8 44.4 68.4 36.8
1 1971 5.452797 12.003106 59.9 29.394403 9.503187 35.5 13.6 74.149204 1.0 64.556485 73.9 75.5 39.0 14.9 46.2 65.5 36.2
2 1972 7.420710 13.214594 60.4 29.810221 10.558962 36.6 14.9 73.554520 1.2 63.664263 74.6 76.9 40.2 14.8 47.6 62.6 36.1
3 1973 9.653602 14.791613 60.2 31.147915 12.804602 38.4 16.4 73.501814 1.6 62.941502 74.9 77.4 40.9 16.5 50.4 64.3 36.4
4 1974 14.074623 17.444688 61.9 32.996183 16.204850 40.5 18.9 73.336811 2.2 62.413412 75.3 77.9 41.8 18.2 52.6 66.1 37.3

gapminder#

Four of the datasets from Gapminder.

Sourced from https://www.gapminder.org/data/

Licensed under CC-BY.

This module contains four pandas DataFrames: fertility, life_expectancy, population, and regions.

fertility

1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013
Country
Afghanistan 7.671 7.671 7.671 7.671 7.671 7.671 7.671 7.671 7.671 7.671 7.671 7.671 7.670 7.670 7.670 7.669 7.669 7.670 7.671 7.673 7.676 7.679 7.681 7.682 7.682 7.682 7.687 7.700 7.725 7.758 7.796 7.832 7.859 7.869 7.854 7.809 7.733 7.623 7.484 7.321 7.136 6.930 6.702 6.456 6.196 5.928 5.659 5.395 5.141 4.900
Albania 5.711 5.594 5.483 5.376 5.268 5.160 5.050 4.933 4.809 4.677 4.538 4.393 4.244 4.094 3.947 3.807 3.678 3.562 3.460 3.372 3.297 3.233 3.177 3.126 3.075 3.023 2.970 2.917 2.867 2.819 2.772 2.723 2.670 2.611 2.543 2.467 2.383 2.291 2.195 2.097 2.004 1.919 1.849 1.796 1.761 1.744 1.741 1.748 1.760 1.771
Algeria 7.653 7.655 7.657 7.658 7.657 7.652 7.641 7.622 7.591 7.548 7.492 7.422 7.339 7.244 7.138 7.021 6.889 6.741 6.576 6.392 6.192 5.976 5.747 5.508 5.263 5.014 4.761 4.503 4.238 3.971 3.705 3.449 3.207 2.987 2.794 2.634 2.514 2.439 2.407 2.412 2.448 2.507 2.580 2.656 2.725 2.781 2.817 2.829 2.820 2.795
American Samoa NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Andorra NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

life_expectancy

1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013
Country
Afghanistan 33.639 34.152 34.662 35.170 35.674 36.172 36.663 37.143 37.614 38.075 38.529 38.977 39.417 39.855 40.298 40.756 41.242 41.770 42.347 42.977 43.661 44.400 45.192 46.024 46.880 47.744 48.601 49.439 50.247 51.017 51.738 52.400 52.995 53.527 54.009 54.449 54.863 55.271 55.687 56.122 56.583 57.071 57.582 58.102 58.618 59.124 59.612 60.079 60.524 60.947
Albania 65.475 65.863 66.122 66.316 66.500 66.702 66.948 67.251 67.595 67.966 68.356 68.748 69.121 69.459 69.753 70.001 70.218 70.426 70.646 70.886 71.144 71.398 71.615 71.770 71.853 71.870 71.842 71.799 71.779 71.813 71.920 72.117 72.415 72.796 73.235 73.713 74.200 74.664 75.081 75.437 75.725 75.949 76.124 76.278 76.433 76.598 76.780 76.979 77.185 77.392
Algeria 47.953 48.389 48.806 49.205 49.592 49.976 50.366 50.767 51.195 51.670 52.213 52.861 53.656 54.605 55.697 56.907 58.198 59.524 60.826 62.051 63.160 64.120 64.911 65.554 66.072 66.479 66.796 67.049 67.265 67.468 67.674 67.893 68.123 68.350 68.565 68.769 68.963 69.149 69.330 69.508 69.682 69.854 70.020 70.180 70.332 70.477 70.615 70.747 70.874 71.000
American Samoa NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Andorra NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

population

1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013
Country
Afghanistan 10474903.0 10697983.0 10927724.0 11163656.0 11411022.0 11676990.0 11964906.0 12273101.0 12593688.0 12915499.0 13223928.0 13505544.0 13766792.0 14003408.0 14179656.0 14249493.0 14185729.0 13984092.0 13672870.0 13300056.0 12931791.0 12625292.0 12372113.0 12183387.0 12156685.0 12414686.0 13032161.0 14069854.0 15472076.0 17053213.0 18553819.0 19789880.0 20684982.0 21299350.0 21752257.0 22227543.0 22856302.0 23677385.0 24639841.0 25678639.0 26693486.0 27614718.0 28420974.0 29145841.0 29839994.0 30577756.0 31411743.0 32358260.0 33397058.0 34499915.0
Albania 1817098.0 1869942.0 1922993.0 1976140.0 2029314.0 2082474.0 2135599.0 2188650.0 2241623.0 2294578.0 2347607.0 2400801.0 2454255.0 2508026.0 2562121.0 2616530.0 2671300.0 2725029.0 2777592.0 2831682.0 2891004.0 2957390.0 3033393.0 3116009.0 3194854.0 3255859.0 3289483.0 3291695.0 3266983.0 3224901.0 3179442.0 3141102.0 3112597.0 3091902.0 3079037.0 3072725.0 3071856.0 3077378.0 3089778.0 3106701.0 3124861.0 3141800.0 3156607.0 3169665.0 3181397.0 3192723.0 3204284.0 3215988.0 3227373.0 3238316.0
Algeria 11654905.0 11923002.0 12229853.0 12572629.0 12945462.0 13338918.0 13746185.0 14165889.0 14600659.0 15052371.0 15524137.0 16018195.0 16533323.0 17068212.0 17624756.0 18205468.0 18811199.0 19442423.0 20095648.0 20762767.0 21433070.0 22098298.0 22753511.0 23398470.0 24035237.0 24668100.0 25299182.0 25930560.0 26557969.0 27169903.0 27751086.0 28291591.0 28786855.0 29242917.0 29673694.0 30099010.0 30533827.0 30982214.0 31441848.0 31913462.0 32396048.0 32888449.0 33391954.0 33906605.0 34428028.0 34950168.0 35468208.0 35980193.0 36485828.0 36983924.0
American Samoa 22672.0 23480.0 24283.0 25087.0 25869.0 26608.0 27288.0 27907.0 28470.0 28983.0 29453.0 29897.0 30305.0 30696.0 31139.0 31727.0 32526.0 33557.0 34797.0 36203.0 37706.0 39253.0 40834.0 42446.0 44048.0 45595.0 47052.0 48402.0 49648.0 50801.0 51885.0 52919.0 53901.0 54834.0 55745.0 56667.0 57625.0 58633.0 59687.0 60774.0 61871.0 62962.0 64045.0 65130.0 66217.0 67312.0 68420.0 69543.0 70680.0 71834.0
Andorra 17438.0 18529.0 19640.0 20772.0 21931.0 23127.0 24364.0 25656.0 26997.0 28357.0 29688.0 30967.0 32156.0 33279.0 34432.0 35753.0 37328.0 39226.0 41390.0 43636.0 45702.0 47414.0 48653.0 49504.0 50236.0 51241.0 52773.0 54996.0 57767.0 60670.0 63111.0 64699.0 65227.0 64905.0 64246.0 63985.0 64634.0 66390.0 69043.0 72203.0 75292.0 77888.0 79874.0 81390.0 82577.0 83677.0 84864.0 86165.0 87518.0 88909.0

regions

Group ID
Country
Angola Sub-Saharan Africa AO
Benin Sub-Saharan Africa BJ
Botswana Sub-Saharan Africa BW
Burkina Faso Sub-Saharan Africa BF
Burundi Sub-Saharan Africa BI

glucose#

A CSV timeseries of blood glucose measurements.

This module contains one pandas DataFrame: data.

data

isig glucose
datetime
2010-03-24 09:51:00 22.59 258
2010-03-24 09:56:00 22.52 260
2010-03-24 10:01:00 22.23 258
2010-03-24 10:06:00 21.56 254
2010-03-24 10:11:00 20.79 246

haar_cascade#

Provide a Haar cascade file for face recognition.

This module contains an attribute frontalface_default_path. Use this attribute to obtain the path to a Haar cascade file for frontal face recognition that can be used by OpenCV.

iris#

Provide Fisher’s Iris dataset.

This module contains one pandas DataFrame: flowers.

Note

This sampledata is maintained for historical compatibility. Please consider alternatives to Iris such as penguins.

flowers

sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa

les_mis#

Provide JSON data for co-occurrence of characters in Les Miserables.

Derived from http://ftp.cs.stanford.edu/pub/sgb/jean.dat

This module contains one dictionary: data.

data

{
    'nodes': [
        {'name': 'Myriel', 'group': 1},
        ...
        {'name': 'Mme.Hucheloup', 'group': 8}
    ],
    'links': [
        {'source': 1, 'target': 0, 'value': 1},
        ...
        {'source': 76, 'target': 58, 'value': 1}
    ]
}
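
As a rough illustration of working with this structure, the sketch below turns the nodes and links into a name-keyed neighbor map. The toy dictionary and its names are made up, and the helper weighted_neighbors is hypothetical. It also assumes that "source" and "target" are positions in the "nodes" list, which the sample above suggests but does not state.

```python
from collections import defaultdict

# Toy dictionary with the same shape as les_mis.data; the names are made up.
toy = {
    "nodes": [
        {"name": "A", "group": 1},
        {"name": "B", "group": 1},
        {"name": "C", "group": 2},
    ],
    "links": [
        {"source": 1, "target": 0, "value": 1},
        {"source": 2, "target": 0, "value": 8},
    ],
}


def weighted_neighbors(graph):
    """Map each character name to {neighbor name: co-occurrence value}."""
    names = [node["name"] for node in graph["nodes"]]
    neighbors = defaultdict(dict)
    for link in graph["links"]:
        # Assumption: "source" and "target" index into the "nodes" list.
        a, b = names[link["source"]], names[link["target"]]
        neighbors[a][b] = link["value"]
        neighbors[b][a] = link["value"]
    return dict(neighbors)


adjacency = weighted_neighbors(toy)
print(adjacency["A"])
```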

movies_data#

A small subset of data from the Open Movie Database.

Data is licensed CC BY-NC 4.0.

This module has an attribute movie_path, which contains the path to a SQLite database with the data.
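
Because the table names in that database are not listed here, a first step might be to inspect it with the standard sqlite3 module. The sketch below uses a hypothetical helper, list_tables, and runs against a throwaway stand-in database. With the sample data downloaded, movie_path would be passed instead.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    with closing(sqlite3.connect(db_path)) as conn:
        return [name for (name,) in conn.execute(query).fetchall()]


# Stand-in database so the sketch runs without the sample data installed.
# The table name "demo" is made up and says nothing about the real schema.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
with closing(sqlite3.connect(demo_path)) as conn:
    conn.execute("CREATE TABLE demo (id INTEGER)")
    conn.commit()

print(list_tables(demo_path))
```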

mtb#

Route data (including altitude) for a bike race in Eastern Europe.

This module contains one pandas DataFrame: obiszow_mtb_xcm.

obiszow_mtb_xcm

lon lat alt
0 16.116775 51.578265 118.0
1 16.116741 51.578265 118.0
2 16.116776 51.578253 118.0
3 16.116792 51.578223 119.0
4 16.116584 51.578058 119.0

olympics2014#

Provide medal counts by country for the 2014 Olympics.

This module contains a single dict: data.

The dictionary has a key "data" that lists sub-dictionaries, one for each country:

{
    'abbr': 'DEU',
    'medals': {'total': 15, 'bronze': 4, 'gold': 8, 'silver': 3},
    'name': 'Germany'
}
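
A small sketch of how such entries might be ranked follows. The Germany entry is copied from above. The second entry ("Exampleland") is made up so the sort has something to order, and rank_by_gold is a hypothetical helper rather than part of the module.

```python
# Same shape as olympics2014.data: a dict whose "data" key lists countries.
sample = {
    "data": [
        {
            "abbr": "XXX",  # made-up entry for illustration only
            "medals": {"total": 3, "bronze": 1, "gold": 0, "silver": 2},
            "name": "Exampleland",
        },
        {
            "abbr": "DEU",
            "medals": {"total": 15, "bronze": 4, "gold": 8, "silver": 3},
            "name": "Germany",
        },
    ]
}


def rank_by_gold(medal_data):
    """Sort countries by gold medals, breaking ties on the total count."""
    return sorted(
        medal_data["data"],
        key=lambda c: (c["medals"]["gold"], c["medals"]["total"]),
        reverse=True,
    )


ranked = rank_by_gold(sample)
print([country["name"] for country in ranked])
```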

penguins#

Provide data from the Palmer Archipelago (Antarctica) penguin dataset.

Derived from https://github.com/mwaskom/seaborn-data/blob/master/penguins.csv

Data distributed under the CC-0 license.

This module contains one pandas DataFrame: data.

data

species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex
0 Adelie Torgersen 39.1 18.7 181.0 3750.0 MALE
1 Adelie Torgersen 39.5 17.4 186.0 3800.0 FEMALE
2 Adelie Torgersen 40.3 18.0 195.0 3250.0 FEMALE
3 Adelie Torgersen NaN NaN NaN NaN NaN
4 Adelie Torgersen 36.7 19.3 193.0 3450.0 FEMALE

perceptions#

Provides access to probly.csv and numberly.csv.

Sourced from: https://github.com/zonination/perceptions

Data distributed under the MIT license.

This module contains two pandas DataFrames: probly and numberly.

probly

Almost Certainly Highly Likely Very Good Chance Probable Likely Probably We Believe Better Than Even About Even We Doubt Improbable Unlikely Probably Not Little Chance Almost No Chance Highly Unlikely Chances Are Slight
0 95.0 80 85 75 66 75 66 55.0 50 40 20.0 30 15.0 20 5.0 25 25
1 95.0 75 75 51 75 51 51 51.0 50 20 49.0 25 49.0 5 5.0 10 5
2 95.0 85 85 70 75 70 80 60.0 50 30 10.0 25 25.0 20 1.0 5 15
3 95.0 85 85 70 75 70 80 60.0 50 30 10.0 25 25.0 20 1.0 5 15
4 98.0 95 80 70 70 75 65 60.0 50 10 50.0 5 20.0 5 1.0 2 10

numberly

A couple A few Dozens A lot Some Several Many Fractions of Scores of Hundreds of
0 2 3 30 20 4 7 12 0.15 80 250
1 2 3 24 12 6 10 50 0.50 40 200
2 2 5 30 15 5 4 25 0.25 500 500
3 2 5 30 15 5 4 25 0.25 500 500
4 2 3 48 50 3 5 5 0.01 100000 599

periodic_table#

Provide a periodic table data set.

This module contains one pandas DataFrame: elements.

elements

atomic number symbol name atomic mass CPK electronic configuration electronegativity atomic radius ion radius van der Waals radius IE-1 EA standard state bonding type melting point boiling point density metal year discovered group period
0 1 H Hydrogen 1.00794 #FFFFFF 1s1 2.20 37.0 NaN 120.0 1312.0 -73.0 gas diatomic 14.0 20.0 0.00009 nonmetal 1766 1 1
1 2 He Helium 4.002602 #D9FFFF 1s2 NaN 32.0 NaN 140.0 2372.0 0.0 gas atomic NaN 4.0 0.00000 noble gas 1868 18 1
2 3 Li Lithium 6.941 #CC80FF [He] 2s1 0.98 134.0 76 (+1) 182.0 520.0 -60.0 solid metallic 454.0 1615.0 0.54000 alkali metal 1817 1 2
3 4 Be Beryllium 9.012182 #C2FF00 [He] 2s2 1.57 90.0 45 (+2) NaN 900.0 0.0 solid metallic 1560.0 2743.0 1.85000 alkaline earth metal 1798 2 2
4 5 B Boron 10.811 #FFB5B5 [He] 2s2 2p1 2.04 82.0 27 (+3) NaN 801.0 -27.0 solid covalent network 2348.0 4273.0 2.46000 metalloid 1807 13 2

population#

Historical and projected population data by age, gender, and country.

Sourced from: https://population.un.org/wpp/Download/Standard/Population/

Data is licensed CC BY 3.0 IGO.

This module contains one pandas DataFrame: data.

data

LocID Location Year Sex AgeGrp AgeGrpStart Value
0 4 Afghanistan 1950 Male 0-4 0 662064.0
1 4 Afghanistan 1950 Male 5-9 5 508166.0
2 4 Afghanistan 1950 Male 10-14 10 444396.0
3 4 Afghanistan 1950 Male 15-19 15 390480.0
4 4 Afghanistan 1950 Male 20-24 20 337318.0

sample_geojson#

Provide geojson data for the UK NHS England area teams.

Sourced from https://github.com/JeniT/nhs-choices with data licensed under the Open Government Licence.

A snapshot of data available from NHS Choices on November 14th, 2015.

sea_surface_temperature#

Time series of historical average sea surface temperatures.

This module contains one pandas DataFrame: sea_surface_temperature.

sea_surface_temperature

temperature
time
2016-02-15 00:00:00+00:00 4.929
2016-02-15 00:30:00+00:00 4.887
2016-02-15 01:00:00+00:00 4.821
2016-02-15 01:30:00+00:00 4.837
2016-02-15 02:00:00+00:00 4.830

sprint#

Historical results for Olympic sprints by year.

This module contains one pandas DataFrame: sprint.

sprint

Name Country Medal Time Year
0 Usain Bolt JAM GOLD 9.63 2012
1 Yohan Blake JAM SILVER 9.75 2012
2 Justin Gatlin USA BRONZE 9.79 2012
3 Usain Bolt JAM GOLD 9.69 2008
4 Richard Thompson TRI SILVER 9.89 2008

stocks#

Provide historical ticker data for selected stocks.

This module contains five dicts: AAPL, FB, GOOG, IBM, and MSFT.

Each dictionary has the structure:

AAPL['date']       # list of date string
AAPL['open']       # list of float
AAPL['high']       # list of float
AAPL['low']        # list of float
AAPL['close']      # list of float
AAPL['volume']     # list of int
AAPL['adj_close']  # list of float
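
Since every key maps to a list of the same length, values at the same index belong to the same trading day. As a minimal sketch, the hypothetical helper below derives day-over-day changes from the adjusted close. The toy prices and date strings are made up, and the code assumes the lists are in chronological order.

```python
def daily_returns(ticker):
    """Return (date, fractional change in adjusted close) pairs.

    Assumes ticker["date"] and ticker["adj_close"] are parallel lists
    in chronological order.
    """
    dates, prices = ticker["date"], ticker["adj_close"]
    return [
        (dates[i], (prices[i] - prices[i - 1]) / prices[i - 1])
        for i in range(1, len(prices))
    ]


# Toy stand-in with two of the keys described above; all values are made up.
toy = {
    "date": ["day-1", "day-2", "day-3"],
    "adj_close": [100.0, 110.0, 99.0],
}

returns = daily_returns(toy)
print(returns)
```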

unemployment#

Per-county unemployment data for the United States in 2009.

This module contains one dict: data.

The dict is indexed by two-tuples of (state_id, county_id) and has the 2009 unemployment rate as the value.

{
    (1, 1): 9.7,
    (1, 3): 9.1,
    ...
}
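
Tuple keys make it straightforward to select all counties of one state. The sketch below uses the two entries shown above plus one made-up entry for a second state. The helper state_rates is hypothetical.

```python
from statistics import mean

# First two entries are copied from above; the third is made up.
sample = {
    (1, 1): 9.7,
    (1, 3): 9.1,
    (2, 13): 15.0,
}


def state_rates(rates, state_id):
    """Return {county_id: rate} for every county in the given state."""
    return {
        county: rate
        for (state, county), rate in rates.items()
        if state == state_id
    }


state_one = state_rates(sample, 1)
print(state_one)
print(mean(state_one.values()))
```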

unemployment1948#

US Unemployment rate data by month and year, from 1948 to 2013.

This module contains one pandas DataFrame: data.

data

Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Annual
0 1948 4.0 4.7 4.5 4.0 3.4 3.9 3.9 3.6 3.4 2.9 3.3 3.6 3.8
1 1949 5.0 5.8 5.6 5.4 5.7 6.4 7.0 6.3 5.9 6.1 5.7 6.0 5.9
2 1950 7.6 7.9 7.1 6.0 5.3 5.6 5.3 4.1 4.0 3.3 3.8 3.9 5.3
3 1951 4.4 4.2 3.8 3.2 2.9 3.4 3.3 2.9 3.0 2.8 3.2 2.9 3.3
4 1952 3.7 3.8 3.3 3.0 2.9 3.2 3.3 3.1 2.7 2.4 2.5 2.5 3.0

us_cities#

Locations of US cities with more than 5000 residents.

This module contains one dict: data.

data['lat']  # list of float
data['lon']  # list of float
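
Assuming the two lists are parallel (the same index refers to the same city), a bounding-box filter might look like the sketch below. The coordinates are made up, and cities_in_box is a hypothetical helper.

```python
def cities_in_box(cities, south, north, west, east):
    """Return (lat, lon) pairs that fall inside the given bounding box."""
    return [
        (lat, lon)
        for lat, lon in zip(cities["lat"], cities["lon"])
        if south <= lat <= north and west <= lon <= east
    ]


# Toy stand-in with the same two keys; the coordinates are made up.
toy = {
    "lat": [40.0, 25.0, 47.5],
    "lon": [-105.0, -80.0, -122.3],
}

inside = cities_in_box(toy, south=39.0, north=48.0, west=-125.0, east=-100.0)
print(inside)
```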

us_counties#

This module exposes geometry data for the United States.

This module contains one dict: data.

The data is indexed by two-tuples of (state_id, county_id), and each value is a dictionary of the following form:

In [25]: data[(1,1)]
Out[25]:
{
    'name': 'Autauga',
    'detailed name': 'Autauga County, Alabama',
    'state': 'al',
    'lats': [32.4757, ..., 32.48112],
    'lons': [-86.41182, ..., -86.41187]
}

Entries for 'name' can have duplicates for certain states (e.g. Virginia). The combination of 'detailed name' and 'state' will always be unique.
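
The unemployment module described earlier is keyed by (state_id, county_id) as well, so the two dictionaries can be combined by key. The sketch below assumes both modules use the same id scheme. It labels each rate with the ('detailed name', 'state') pair, because 'name' alone may repeat. The coordinate lists are shortened to the two endpoints shown above, and labeled_rates is a hypothetical helper.

```python
counties = {
    (1, 1): {
        "name": "Autauga",
        "detailed name": "Autauga County, Alabama",
        "state": "al",
        "lats": [32.4757, 32.48112],     # shortened for illustration
        "lons": [-86.41182, -86.41187],  # shortened for illustration
    },
}

# Entries copied from the unemployment section of this page.
rates = {(1, 1): 9.7, (1, 3): 9.1}


def labeled_rates(counties, rates):
    """Map (detailed name, state) to the rate for counties present in both."""
    return {
        (county["detailed name"], county["state"]): rates[key]
        for key, county in counties.items()
        if key in rates
    }


joined = labeled_rates(counties, rates)
print(joined)
```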

us_holidays#

Calendar file of US Holidays from Mozilla provided by icalendar.

Sourced from: https://www.mozilla.org/en-US/projects/calendar/holidays/

This module contains one list: us_holidays.

us_holidays

[
    (datetime.date(1966, 12, 26), 'Kwanzaa'),
    (datetime.date(2000, 1, 1), "New Year's Day"),
    ...
    (datetime.date(2020, 12, 25), 'Christmas Day (US-OPM)')
]
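
Because each entry is a (date, name) tuple, the list can be filtered with ordinary date attributes. The sketch below uses only the three entries shown above. The helper holidays_in_year is hypothetical.

```python
import datetime

# Entries copied from the listing above; the real list is much longer.
sample = [
    (datetime.date(1966, 12, 26), 'Kwanzaa'),
    (datetime.date(2000, 1, 1), "New Year's Day"),
    (datetime.date(2020, 12, 25), 'Christmas Day (US-OPM)'),
]


def holidays_in_year(holidays, year):
    """Return the (date, name) entries that fall in the given year."""
    return [(day, name) for day, name in holidays if day.year == year]


in_2000 = holidays_in_year(sample, 2000)
print(in_2000)
```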

us_marriages_divorces#

Provide U.S. marriage and divorce statistics between 1867 and 2014.

Data from the CDC’s National Center for Health Statistics (NCHS) database (http://www.cdc.gov/nchs/).

Data organized by Randal S. Olson (http://www.randalolson.com)

This module contains one pandas DataFrame: data.

data

Year Marriages Divorces Population Marriages_per_1000 Divorces_per_1000
0 1867 357000.0 10000.0 36970000 9.7 0.3
1 1868 345000.0 10000.0 37885000 9.1 0.3
2 1869 348000.0 11000.0 38870000 9.0 0.3
3 1870 352000.0 11000.0 39905000 8.8 0.3
4 1871 359000.0 12000.0 41010000 8.8 0.3

us_states#

Geometry data for US States.

This module contains one dict: data.

The data is indexed by the two letter state code (e.g., ‘CA’, ‘TX’) and has the following structure:

In [4]: data["OR"]
Out[4]:
{
    'name': 'Oregon',
    'region': 'Northwest',
    'lats': [46.29443, ..., 46.26068],
    'lons': [-124.03622, ..., -124.15935]
}
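
One simple use of the 'lats' and 'lons' lists is computing a bounding box for a state outline. The sketch below uses a shortened Oregon entry containing only the endpoint coordinates shown above. The helper bounding_box is hypothetical, and the NaN filtering is a defensive assumption in case an outline contains NaN breaks between polygons.

```python
import math


def bounding_box(state):
    """Return (min_lat, min_lon, max_lat, max_lon) for a state entry."""
    # Defensive assumption: skip NaN values if any are present.
    lats = [v for v in state["lats"] if not math.isnan(v)]
    lons = [v for v in state["lons"] if not math.isnan(v)]
    return min(lats), min(lons), max(lats), max(lons)


# Coordinate lists shortened to the two endpoints shown above.
oregon = {
    "name": "Oregon",
    "region": "Northwest",
    "lats": [46.29443, 46.26068],
    "lons": [-124.03622, -124.15935],
}

box = bounding_box(oregon)
print(box)
```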

world_cities#

Names and locations of world cities with at least 5000 inhabitants.

Derived from cities5000.zip file downloaded from http://www.geonames.org/export/

Licensed under CC-BY.

This module contains one pandas DataFrame: data.

data

name lat lng
0 Ordino 42.55623 1.53319
1 les Escaldes 42.50729 1.53414
2 la Massana 42.54499 1.51483
3 Encamp 42.53474 1.58014
4 Canillo 42.56760 1.59756