├── images ├── favicon.ico └── altair-logo.png ├── requirements.txt ├── notebooks ├── altair-gallery.png ├── split-apply-combine.png ├── Index.ipynb ├── 05-Exploring.ipynb └── 01-alt-Iris-Demo.ipynb ├── _config.yml ├── environment.yml ├── _toc.yml ├── .github └── workflows │ └── book.yml ├── README.md ├── LICENSE ├── tools └── remove_vega_outputs.py └── .gitignore /images/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/altair-viz/altair-tutorial/HEAD/images/favicon.ico -------------------------------------------------------------------------------- /images/altair-logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/altair-viz/altair-tutorial/HEAD/images/altair-logo.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | jupyter-book>=0.7.0b 2 | altair 3 | numpy 4 | pandas 5 | rise 6 | vega 7 | vega_datasets 8 | -------------------------------------------------------------------------------- /notebooks/altair-gallery.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/altair-viz/altair-tutorial/HEAD/notebooks/altair-gallery.png -------------------------------------------------------------------------------- /notebooks/split-apply-combine.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/altair-viz/altair-tutorial/HEAD/notebooks/split-apply-combine.png -------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | # Book settings 2 | title: Altair Tutorial 3 | author: Jake Vanderplas and Eitan Lees 4 | logo: 'images/altair-logo.png' 5 | html: 6 | favicon: "images/favicon.ico" 7 | repository: 8 | url: "https://github.com/altair-viz/altair-tutorial" 9 | -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: altair-tutorial 2 | channels: 3 | - conda-forge 4 | dependencies: 5 | - python 6 | - altair 7 | - numpy 8 | - pandas 9 | - rise 10 | - vega 11 | - vega_datasets 12 | -------------------------------------------------------------------------------- /_toc.yml: -------------------------------------------------------------------------------- 1 | - file: README 2 | - file: notebooks/Index 3 | - file: notebooks/01-Cars-Demo 4 | - file: notebooks/02-Simple-Charts 5 | - file: notebooks/03-Binning-and-aggregation 6 | - file: notebooks/04-Compound-charts 7 | - file: notebooks/05-Exploring 8 | - file: notebooks/06-Selections 9 | - file: notebooks/07-Transformations 10 | - file: notebooks/08-Configuration 11 | - file: notebooks/09-Geographic-plots 12 | -------------------------------------------------------------------------------- /.github/workflows/book.yml: -------------------------------------------------------------------------------- 1 | name: deploy-book 2 | 3 | # Only run this when the master branch changes 4 | on: 5 | push: 6 | branches: 7 | - master 8 | 9 | # This job installs dependencies, build the book, and pushes it to `gh-pages` 10 | jobs: 11 | deploy-book: 12 | runs-on: ubuntu-latest 13 | steps: 14 | - uses: 
actions/checkout@v2 15 | 16 | # Install dependencies 17 | - name: Set up Python 3.7 18 | uses: actions/setup-python@v1 19 | with: 20 | python-version: 3.7 21 | 22 | - name: Install dependencies 23 | run: | 24 | pip install -r requirements.txt 25 | 26 | 27 | # Build the book 28 | - name: Build the book 29 | run: | 30 | jupyter-book build . 31 | 32 | # Push the book's HTML to github-pages 33 | - name: GitHub Pages action 34 | uses: peaceiris/actions-gh-pages@v3.5.9 35 | with: 36 | github_token: ${{ secrets.GITHUB_TOKEN }} 37 | publish_dir: ./_build/html 38 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Exploratory Data Visualization with Altair 2 | 3 | ![altair gallery](notebooks/altair-gallery.png) 4 | 5 | 6 | Welcome! Here you will find the source material for the Altair tutorial, given at PyCon 2018. 7 | 8 | An overview of the content can be found at [notebooks/Index.ipynb](notebooks/Index.ipynb) 9 | 10 | The intro slides are on [SpeakerDeck](https://speakerdeck.com/jakevdp/altair-tutorial-intro-pycon-2018) 11 | 12 | The workshop was recorded and is available [on YouTube](https://www.youtube.com/watch?v=ms29ZPUKxbU) 13 | 14 | This tutorial has been made into a Jupyter Book and is hosted at https://altair-viz.github.io/altair-tutorial 15 | 16 | See the Altair documentation at http://altair-viz.github.io/ 17 | 18 | ## Run the Tutorial 19 | 20 | If you have a Google/Gmail account, you can run this tutorial from your browser using Colab: [Open in Colab](https://colab.research.google.com/github/altair-viz/altair-tutorial/blob/master/notebooks/Index.ipynb). 21 | 22 | You can also run it in Binder: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/altair-viz/altair-tutorial.git/master) 23 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Jake Vanderplas 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /tools/remove_vega_outputs.py: -------------------------------------------------------------------------------- 1 | """ 2 | Tool to remove vegalite outputs from a notebook. 
3 | 4 | For notebooks with a lot of embedded data, this helps keep the notebook size 5 | manageable. 6 | """ 7 | 8 | import nbformat 9 | 10 | def _strip(output, ignore_mimetypes=('application/vnd.vegalite.v2+json',)): 11 | if 'data' in output: 12 | output = output.copy() 13 | output['data'] = {k: v for k, v in output['data'].items() 14 | if k not in ignore_mimetypes} 15 | return output 16 | 17 | 18 | def strip_vega_outputs(notebook): 19 | for cell in notebook['cells']: 20 | if 'outputs' not in cell: 21 | continue 22 | cell['outputs'] = [_strip(output) for output in cell['outputs']] 23 | return notebook 24 | 25 | 26 | def main(input_filename, output_filename): 27 | notebook = nbformat.read(input_filename, as_version=nbformat.NO_CONVERT) 28 | stripped = strip_vega_outputs(notebook) 29 | with open(output_filename, 'w') as f: 30 | nbformat.write(stripped, f) 31 | return output_filename 32 | 33 | 34 | if __name__ == '__main__': 35 | import sys 36 | filename = sys.argv[1] 37 | if len(sys.argv) > 2: 38 | output_filename = sys.argv[2] 39 | else: 40 | output_filename = filename + '-stripped' 41 | main(filename, output_filename) 42 | 43 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | env/ 12 | build/ 13 | develop-eggs/ 14 | dist/ 15 | downloads/ 16 | eggs/ 17 | .eggs/ 18 | lib/ 19 | lib64/ 20 | parts/ 21 | sdist/ 22 | var/ 23 | wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *.cover 47 | .hypothesis/ 48 | 49 | # Translations 50 | *.mo 51 | *.pot 52 | 53 | # Django stuff: 54 | *.log 55 | local_settings.py 56 | 57 | # Flask stuff: 58 | instance/ 59 | .webassets-cache 60 | 61 | # Scrapy stuff: 62 | .scrapy 63 | 64 | # Sphinx documentation 65 | docs/_build/ 66 | 67 | # PyBuilder 68 | target/ 69 | 70 | # Jupyter Notebook 71 | .ipynb_checkpoints 72 | 73 | # pyenv 74 | .python-version 75 | 76 | # celery beat schedule file 77 | celerybeat-schedule 78 | 79 | # SageMath parsed files 80 | *.sage.py 81 | 82 | # dotenv 83 | .env 84 | 85 | # virtualenv 86 | .venv 87 | venv/ 88 | ENV/ 89 | 90 | # Spyder project settings 91 | .spyderproject 92 | .spyproject 93 | 94 | # Rope project settings 95 | .ropeproject 96 | 97 | # mkdocs documentation 98 | /site 99 | 100 | # mypy 101 | .mypy_cache/ 102 | -------------------------------------------------------------------------------- /notebooks/Index.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "source": [ 6 | "# Outline\n", 7 | "\n", 8 | "These notebooks contain an introduction to exploratory data analysis with [Altair](http://altair-viz.github.io).\n", 9 | "\n", 10 | "Materials can be found online at http://github.com/altair-viz/altair-tutorial\n", 11 | "\n", 12 | "To get Altair and its dependencies installed, please follow the [Installation Instructions](https://altair-viz.github.io/getting_started/installation.html) on Altair's website.\n", 13 | "\n", 14 | "## 1. Motivating Altair: Live Demo\n", 15 | "\n", 16 | "We'll start with a set of intro slides, followed by a live-coded demo of what it looks like to explore data with Altair.\n", 17 | "\n", 18 | "Here's the rough content of the live demo:\n", 19 | "\n", 20 | "- [01-Cars-Demo.ipynb](01-Cars-Demo.ipynb)\n", 21 | "\n", 22 | "This will cover a lot of ground very quickly. Don't worry if you feel a bit shaky on some of the concepts; the goal here is a birds-eye overview, and we'll walk-through important concepts in more detail later.\n", 23 | "\n", 24 | "## 2. Simple Charts: Core Concepts\n", 25 | "\n", 26 | "Digging into the basic features of an Altair chart\n", 27 | "\n", 28 | "- [02-Simple-Charts.ipynb](02-Simple-Charts.ipynb)\n", 29 | "\n", 30 | "## 3. Binning and Aggregation\n", 31 | "\n", 32 | "Altair lets you build binning and aggregation (i.e. basic data flows) into your chart spec.\n", 33 | "We'll cover that here.\n", 34 | "\n", 35 | "- [03-Binning-and-aggregation.ipynb](03-Binning-and-aggregation.ipynb)\n", 36 | "\n", 37 | "## 4. Layering and Concatenation\n", 38 | "\n", 39 | "From the simple charts we have seen already, you can begin layering and concatenating charts to build up more complicated visualizations.\n", 40 | "\n", 41 | "- [04-Compound-charts.ipynb](04-Compound-charts.ipynb)\n", 42 | " \n", 43 | "## 5. 
Exploring a Dataset!\n", 44 | "\n", 45 | "Here's a chance for you to try this out on some data!\n", 46 | "\n", 47 | "- [05-Exploring.ipynb](05-Exploring.ipynb)\n", 48 | " " 49 | ], 50 | "metadata": {} 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "source": [ 55 | "---\n", 56 | "\n", 57 | "*Afternoon Break*\n", 58 | "\n", 59 | "---" 60 | ], 61 | "metadata": {} 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "source": [ 66 | "## 6. Selections: making things interactive\n", 67 | "\n", 68 | "A killer feature of Altair is the ability to build interactions from basic building blocks.\n", 69 | "We'll dig into that here.\n", 70 | "\n", 71 | "- [06-Selections.ipynb](06-Selections.ipynb)\n", 72 | "\n", 73 | "## 7. Transformations: Data streams within the chart spec\n", 74 | "\n", 75 | "We saw binning and aggregation previously, but there are a number of other data transformations that Altair makes available.\n", 76 | "\n", 77 | "- [07-Transformations.ipynb](07-Transformations.ipynb)\n", 78 | "\n", 79 | "## 8. Config: Adjusting your Charts\n", 80 | "\n", 81 | "Once you're happy with your chart, it's nice to be able to tweak things like axis labels, titles, color scales, etc.\n", 82 | "We'll look here at how that can be done with Altair\n", 83 | "\n", 84 | "- [08-Configuration.ipynb](08-Configuration.ipynb)\n", 85 | "\n", 86 | "## 9. Geographic Charts: Maps\n", 87 | "\n", 88 | "Altair recently added support for geographic visualizations. We'll show a few examples of the basic principles behind these.\n", 89 | "\n", 90 | "- [09-Geographic-plots.ipynb](09-Geographic-plots.ipynb)\n", 91 | "\n", 92 | "## 10. Try it out!\n", 93 | "\n", 94 | "- Return to your visualization in [05-Exploring.ipynb](05-Exploring.ipynb) and begin adding some interactions & transformations. What can you come up with?" 
95 | ], 96 | "metadata": {} 97 | } 98 | ], 99 | "metadata": { 100 | "kernelspec": { 101 | "display_name": "Python 3", 102 | "language": "python", 103 | "name": "python3" 104 | }, 105 | "language_info": { 106 | "name": "python", 107 | "version": "3.7.6", 108 | "mimetype": "text/x-python", 109 | "codemirror_mode": { 110 | "name": "ipython", 111 | "version": 3 112 | }, 113 | "pygments_lexer": "ipython3", 114 | "nbconvert_exporter": "python", 115 | "file_extension": ".py" 116 | }, 117 | "nteract": { 118 | "version": "0.23.1" 119 | } 120 | }, 121 | "nbformat": 4, 122 | "nbformat_minor": 4 123 | } -------------------------------------------------------------------------------- /notebooks/05-Exploring.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Exploring a DataSet\n", 8 | "\n", 9 | "Now that we have seen the basic pieces of Altair's API, it's time to practice using it to explore a new dataset.\n", 10 | "With your partner, choose one of the following four datasets, detailed below.\n", 11 | "\n", 12 | "As you explore the data, recall the building blocks we've discussed:\n", 13 | "\n", 14 | "- various marks: ``mark_point()``, ``mark_line()``, ``mark_tick()``, ``mark_bar()``, ``mark_area()``, ``mark_rect()``, etc.\n", 15 | "- various encodings: ``x``, ``y``, ``color``, ``shape``, ``size``, ``row``, ``column``, ``text``, ``tooltip``, etc.\n", 16 | "- binning and aggregations: a [List of available aggregations](https://altair-viz.github.io/user_guide/encoding.html#binning-and-aggregation) can be found in Altair's documentation\n", 17 | "- stacking and layering (``alt.layer`` <-> ``+``, ``alt.hconcat`` <-> ``|``, ``alt.vconcat`` <-> ``&``)\n", 18 | "\n", 19 | "Start simple and build from there. Which encodings work best with quantitative data? With categorical data?\n", 20 | "What can you learn about your dataset using these tools?\n", 21 | "\n", 22 | "We'll set aside about 20 minutes for you to work on this with your partner." 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 1, 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "from vega_datasets import data" 32 | ] 33 | }, 34 | { 35 | "cell_type": "markdown", 36 | "metadata": {}, 37 | "source": [ 38 | "## Seattle Weather\n", 39 | "\n", 40 | "This data includes daily precipitation, temperature range, wind speed, and weather type as a function of date between 2012 and 2015 in Seattle." 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 2, 46 | "metadata": {}, 47 | "outputs": [ 48 | { 49 | "data": { 50 | "text/html": [ 51 | "
\n", 52 | "\n", 65 | "\n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | "
dateprecipitationtemp_maxtemp_minwindweather
02012-01-010.012.85.04.7drizzle
12012-01-0210.910.62.84.5rain
22012-01-030.811.77.22.3rain
32012-01-0420.312.25.64.7rain
42012-01-051.38.92.86.1rain
\n", 125 | "
" 126 | ], 127 | "text/plain": [ 128 | " date precipitation temp_max temp_min wind weather\n", 129 | "0 2012-01-01 0.0 12.8 5.0 4.7 drizzle\n", 130 | "1 2012-01-02 10.9 10.6 2.8 4.5 rain\n", 131 | "2 2012-01-03 0.8 11.7 7.2 2.3 rain\n", 132 | "3 2012-01-04 20.3 12.2 5.6 4.7 rain\n", 133 | "4 2012-01-05 1.3 8.9 2.8 6.1 rain" 134 | ] 135 | }, 136 | "execution_count": 2, 137 | "metadata": {}, 138 | "output_type": "execute_result" 139 | } 140 | ], 141 | "source": [ 142 | "weather = data.seattle_weather()\n", 143 | "weather.head()" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | "## Gapminder\n", 151 | "\n", 152 | "This data consists of population, fertility, and life expectancy over time in a number of countries around the world.\n", 153 | "\n", 154 | "Note that, while you may be tempted to use a temporal encoding for the year, here the year is simply a number, not a date stamp, and so temporal encoding is not the best choice here." 155 | ] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": 3, 160 | "metadata": {}, 161 | "outputs": [ 162 | { 163 | "data": { 164 | "text/html": [ 165 | "
\n", 166 | "\n", 179 | "\n", 180 | " \n", 181 | " \n", 182 | " \n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | " \n", 192 | " \n", 193 | " \n", 194 | " \n", 195 | " \n", 196 | " \n", 197 | " \n", 198 | " \n", 199 | " \n", 200 | " \n", 201 | " \n", 202 | " \n", 203 | " \n", 204 | " \n", 205 | " \n", 206 | " \n", 207 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 213 | " \n", 214 | " \n", 215 | " \n", 216 | " \n", 217 | " \n", 218 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 225 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 229 | " \n", 230 | " \n", 231 | " \n", 232 | " \n", 233 | " \n", 234 | " \n", 235 | " \n", 236 | " \n", 237 | " \n", 238 | "
yearcountryclusterpoplife_expectfertility
01955Afghanistan0889120930.3327.7
11960Afghanistan0982945031.9977.7
21965Afghanistan01099788534.0207.7
31970Afghanistan01243062336.0887.7
41975Afghanistan01413201938.4387.7
\n", 239 | "
" 240 | ], 241 | "text/plain": [ 242 | " year country cluster pop life_expect fertility\n", 243 | "0 1955 Afghanistan 0 8891209 30.332 7.7\n", 244 | "1 1960 Afghanistan 0 9829450 31.997 7.7\n", 245 | "2 1965 Afghanistan 0 10997885 34.020 7.7\n", 246 | "3 1970 Afghanistan 0 12430623 36.088 7.7\n", 247 | "4 1975 Afghanistan 0 14132019 38.438 7.7" 248 | ] 249 | }, 250 | "execution_count": 3, 251 | "metadata": {}, 252 | "output_type": "execute_result" 253 | } 254 | ], 255 | "source": [ 256 | "gapminder = data.gapminder()\n", 257 | "gapminder.head()" 258 | ] 259 | }, 260 | { 261 | "cell_type": "markdown", 262 | "metadata": {}, 263 | "source": [ 264 | "## Population\n", 265 | "\n", 266 | "This data contains the US population sub-divided by age and sex every decade from 1850 to near the present.\n", 267 | "\n", 268 | "Note that, while you may be tempted to use a temporal encoding for the year, here the year is simply a number, not a date stamp, and so temporal encoding is not the best choice." 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": 4, 274 | "metadata": {}, 275 | "outputs": [ 276 | { 277 | "data": { 278 | "text/html": [ 279 | "
\n", 280 | "\n", 293 | "\n", 294 | " \n", 295 | " \n", 296 | " \n", 297 | " \n", 298 | " \n", 299 | " \n", 300 | " \n", 301 | " \n", 302 | " \n", 303 | " \n", 304 | " \n", 305 | " \n", 306 | " \n", 307 | " \n", 308 | " \n", 309 | " \n", 310 | " \n", 311 | " \n", 312 | " \n", 313 | " \n", 314 | " \n", 315 | " \n", 316 | " \n", 317 | " \n", 318 | " \n", 319 | " \n", 320 | " \n", 321 | " \n", 322 | " \n", 323 | " \n", 324 | " \n", 325 | " \n", 326 | " \n", 327 | " \n", 328 | " \n", 329 | " \n", 330 | " \n", 331 | " \n", 332 | " \n", 333 | " \n", 334 | " \n", 335 | " \n", 336 | " \n", 337 | " \n", 338 | " \n", 339 | " \n", 340 | "
yearagesexpeople
01850011483789
11850021450376
21850511411067
31850521359668
418501011260099
\n", 341 | "
" 342 | ], 343 | "text/plain": [ 344 | " year age sex people\n", 345 | "0 1850 0 1 1483789\n", 346 | "1 1850 0 2 1450376\n", 347 | "2 1850 5 1 1411067\n", 348 | "3 1850 5 2 1359668\n", 349 | "4 1850 10 1 1260099" 350 | ] 351 | }, 352 | "execution_count": 4, 353 | "metadata": {}, 354 | "output_type": "execute_result" 355 | } 356 | ], 357 | "source": [ 358 | "population = data.population()\n", 359 | "population.head()" 360 | ] 361 | }, 362 | { 363 | "cell_type": "markdown", 364 | "metadata": {}, 365 | "source": [ 366 | "## Movies\n", 367 | "\n", 368 | "The movies dataset has data on 3200 movies, including release date, budget, and ratings on IMDB and Rotten Tomatoes." 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": 5, 374 | "metadata": {}, 375 | "outputs": [ 376 | { 377 | "data": { 378 | "text/html": [ 379 | "
\n", 380 | "\n", 393 | "\n", 394 | " \n", 395 | " \n", 396 | " \n", 397 | " \n", 398 | " \n", 399 | " \n", 400 | " \n", 401 | " \n", 402 | " \n", 403 | " \n", 404 | " \n", 405 | " \n", 406 | " \n", 407 | " \n", 408 | " \n", 409 | " \n", 410 | " \n", 411 | " \n", 412 | " \n", 413 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | " \n", 435 | " \n", 436 | " \n", 437 | " \n", 438 | " \n", 439 | " \n", 440 | " \n", 441 | " \n", 442 | " \n", 443 | " \n", 444 | " \n", 445 | " \n", 446 | " \n", 447 | " \n", 448 | " \n", 449 | " \n", 450 | " \n", 451 | " \n", 452 | " \n", 453 | " \n", 454 | " \n", 455 | " \n", 456 | " \n", 457 | " \n", 458 | " \n", 459 | " \n", 460 | " \n", 461 | " \n", 462 | " \n", 463 | " \n", 464 | " \n", 465 | " \n", 466 | " \n", 467 | " \n", 468 | " \n", 469 | " \n", 470 | " \n", 471 | " \n", 472 | " \n", 473 | " \n", 474 | " \n", 475 | " \n", 476 | " \n", 477 | " \n", 478 | " \n", 479 | " \n", 480 | " \n", 481 | " \n", 482 | " \n", 483 | " \n", 484 | " \n", 485 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 502 | " \n", 503 | " \n", 504 | " \n", 505 | " \n", 506 | " \n", 507 | " \n", 508 | " \n", 509 | " \n", 510 | " \n", 511 | " \n", 512 | "
TitleUS_GrossWorldwide_GrossUS_DVD_SalesProduction_BudgetRelease_DateMPAA_RatingRunning_Time_minDistributorSourceMajor_GenreCreative_TypeDirectorRotten_Tomatoes_RatingIMDB_RatingIMDB_Votes
0The Land Girls146083.0146083.0NaN8000000.0Jun 12 1998RNaNGramercyNoneNoneNoneNoneNaN6.11071.0
1First Love, Last Rites10876.010876.0NaN300000.0Aug 07 1998RNaNStrandNoneDramaNoneNoneNaN6.9207.0
2I Married a Strange Person203134.0203134.0NaN250000.0Aug 28 1998NoneNaNLionsgateNoneComedyNoneNoneNaN6.8865.0
3Let's Talk About Sex373615.0373615.0NaN300000.0Sep 11 1998NoneNaNFine LineNoneComedyNoneNone13.0NaNNaN
4Slam1009819.01087521.0NaN1000000.0Oct 09 1998RNaNTrimarkOriginal ScreenplayDramaContemporary FictionNone62.03.4165.0
\n", 513 | "
" 514 | ], 515 | "text/plain": [ 516 | " Title US_Gross Worldwide_Gross US_DVD_Sales \\\n", 517 | "0 The Land Girls 146083.0 146083.0 NaN \n", 518 | "1 First Love, Last Rites 10876.0 10876.0 NaN \n", 519 | "2 I Married a Strange Person 203134.0 203134.0 NaN \n", 520 | "3 Let's Talk About Sex 373615.0 373615.0 NaN \n", 521 | "4 Slam 1009819.0 1087521.0 NaN \n", 522 | "\n", 523 | " Production_Budget Release_Date MPAA_Rating Running_Time_min Distributor \\\n", 524 | "0 8000000.0 Jun 12 1998 R NaN Gramercy \n", 525 | "1 300000.0 Aug 07 1998 R NaN Strand \n", 526 | "2 250000.0 Aug 28 1998 None NaN Lionsgate \n", 527 | "3 300000.0 Sep 11 1998 None NaN Fine Line \n", 528 | "4 1000000.0 Oct 09 1998 R NaN Trimark \n", 529 | "\n", 530 | " Source Major_Genre Creative_Type Director \\\n", 531 | "0 None None None None \n", 532 | "1 None Drama None None \n", 533 | "2 None Comedy None None \n", 534 | "3 None Comedy None None \n", 535 | "4 Original Screenplay Drama Contemporary Fiction None \n", 536 | "\n", 537 | " Rotten_Tomatoes_Rating IMDB_Rating IMDB_Votes \n", 538 | "0 NaN 6.1 1071.0 \n", 539 | "1 NaN 6.9 207.0 \n", 540 | "2 NaN 6.8 865.0 \n", 541 | "3 13.0 NaN NaN \n", 542 | "4 62.0 3.4 165.0 " 543 | ] 544 | }, 545 | "execution_count": 5, 546 | "metadata": {}, 547 | "output_type": "execute_result" 548 | } 549 | ], 550 | "source": [ 551 | "movies = data.movies()\n", 552 | "movies.head()" 553 | ] 554 | } 555 | ], 556 | "metadata": { 557 | "kernelspec": { 558 | "display_name": "Python 3", 559 | "language": "python", 560 | "name": "python3" 561 | }, 562 | "language_info": { 563 | "codemirror_mode": { 564 | "name": "ipython", 565 | "version": 3 566 | }, 567 | "file_extension": ".py", 568 | "mimetype": "text/x-python", 569 | "name": "python", 570 | "nbconvert_exporter": "python", 571 | "pygments_lexer": "ipython3", 572 | "version": "3.7.6" 573 | } 574 | }, 575 | "nbformat": 4, 576 | "nbformat_minor": 4 577 | } 578 | -------------------------------------------------------------------------------- /notebooks/01-alt-Iris-Demo.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Demo: Exploring the Iris Dataset\n", 8 | "\n", 9 | "We'll start this tutorial with a demo to whet your appetite for learning more. This section purposely moves quickly through many of the concepts (e.g. data, marks, encodings, aggregation, data types, selections, etc.)\n", 10 | "We will return to treat each of these in more depth later in the tutorial, so don't worry if it all seems to go a bit quickly!\n", 11 | "\n", 12 | "In the live tutorial, this will be done from scratch in a blank notebook.\n", 13 | "However, for the sake of people who want to look back on what we did live, I'll do my best to reproduce the examples and the discussion here." 14 | ] 15 | }, 16 | { 17 | "cell_type": "markdown", 18 | "metadata": {}, 19 | "source": [ 20 | "## 1. 
Imports and Data\n", 21 | "\n", 22 | "We'll start with importing the [Altair package](http://altair-viz.github.io/) and enabling the appropriate renderer (if necessary):" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": null, 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "import altair as alt\n", 32 | "\n", 33 | "# Altair plots render by default in JupyterLab and nteract\n", 34 | "\n", 35 | "# Uncomment/run this line to enable Altair in the classic notebook (not in JupyterLab)\n", 36 | "# alt.renderers.enable('notebook')\n", 37 | "\n", 38 | "# Uncomment/run this line to enable Altair in Colab\n", 39 | "# alt.renderers.enable('colab')" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "Now we'll use the [vega_datasets package](https://github.com/altair-viz/vega_datasets), to load an example dataset:" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": 30, 52 | "metadata": {}, 53 | "outputs": [ 54 | { 55 | "data": { 56 | "text/html": [ 57 | "
\n", 58 | "\n", 71 | "\n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | "
petalLengthpetalWidthsepalLengthsepalWidthspecies
01.40.25.13.5setosa
11.40.24.93.0setosa
21.30.24.73.2setosa
31.50.24.63.1setosa
41.40.25.03.6setosa
\n", 125 | "
" 126 | ], 127 | "text/plain": [ 128 | " petalLength petalWidth sepalLength sepalWidth species\n", 129 | "0 1.4 0.2 5.1 3.5 setosa\n", 130 | "1 1.4 0.2 4.9 3.0 setosa\n", 131 | "2 1.3 0.2 4.7 3.2 setosa\n", 132 | "3 1.5 0.2 4.6 3.1 setosa\n", 133 | "4 1.4 0.2 5.0 3.6 setosa" 134 | ] 135 | }, 136 | "execution_count": 30, 137 | "metadata": {}, 138 | "output_type": "execute_result" 139 | } 140 | ], 141 | "source": [ 142 | "from vega_datasets import data\n", 143 | "\n", 144 | "iris = data.iris()\n", 145 | "iris.head()" 146 | ] 147 | }, 148 | { 149 | "cell_type": "markdown", 150 | "metadata": {}, 151 | "source": [ 152 | "Notice that this data is in columnar format: that is, each column contains an attribute of a data point, and each row contains a single instance of the data (here, the measurament of a single flower)." 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "metadata": {}, 158 | "source": [ 159 | "## 2. Zero, One, and Two-dimensional Charts\n", 160 | "\n", 161 | "Using Altair, we can begin to explore this data.\n", 162 | "\n", 163 | "The most basic chart contains the dataset, along with a mark to represent each row:" 164 | ] 165 | }, 166 | { 167 | "cell_type": "code", 168 | "execution_count": null, 169 | "metadata": {}, 170 | "outputs": [], 171 | "source": [ 172 | "alt.Chart(iris).mark_point()" 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "metadata": {}, 178 | "source": [ 179 | "This is a pretty silly chart, because it consists of 150 points, all laid-out on top of each other. But this *is* technically a representation of the data!\n", 180 | "\n", 181 | "To make it more interesting, we need to *encode* columns of the data into visual features of the plot (e.g. x-position, y-position, size, color, etc.)\n", 182 | "\n", 183 | "Let's encode petal length on the x-axis using the ``encode()`` method:" 184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": null, 189 | "metadata": {}, 190 | "outputs": [], 191 | "source": [ 192 | "alt.Chart(iris).mark_point().encode(\n", 193 | " x='petalLength'\n", 194 | ")" 195 | ] 196 | }, 197 | { 198 | "cell_type": "markdown", 199 | "metadata": {}, 200 | "source": [ 201 | "This is a bit better, but the ``point`` mark is probably not the best for a 1D chart like this.\n", 202 | "\n", 203 | "Let's try the ``tick`` mark instead:" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [ 212 | "alt.Chart(iris).mark_tick().encode(\n", 213 | " x='petalLength'\n", 214 | ")" 215 | ] 216 | }, 217 | { 218 | "cell_type": "markdown", 219 | "metadata": {}, 220 | "source": [ 221 | "Another thing we might do with the points is to expand them into a 2D chart by also encoding the y value. 
We'll return to using ``point`` markers, and put ``petalWidth`` on the y-axis" 222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": null, 227 | "metadata": {}, 228 | "outputs": [], 229 | "source": [ 230 | "alt.Chart(iris).mark_point().encode(\n", 231 | " x='petalLength',\n", 232 | " y='petalWidth'\n", 233 | ")" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": {}, 239 | "source": [ 240 | "## 3. Simple Interactions\n", 241 | "\n", 242 | "One of the nicest features of Altair is the grammar of interaction that it provides.\n", 243 | "The simplest kind of interaction is the ability to pan and zoom along charts; Altair contains a shortcut to enable this via the ``interactive()`` method:" 244 | ] 245 | }, 246 | { 247 | "cell_type": "code", 248 | "execution_count": null, 249 | "metadata": {}, 250 | "outputs": [], 251 | "source": [ 252 | "alt.Chart(iris).mark_point().encode(\n", 253 | " x='petalLength',\n", 254 | " y='petalWidth'\n", 255 | ").interactive()" 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": {}, 261 | "source": [ 262 | "This lets you click and drag, as well as use your computer's scroll/zoom behavior to zoom in and out on the chart.\n", 263 | "\n", 264 | "We'll see other interactions later." 265 | ] 266 | }, 267 | { 268 | "cell_type": "markdown", 269 | "metadata": {}, 270 | "source": [ 271 | "## 4. A Third Dimension: Color" 272 | ] 273 | }, 274 | { 275 | "cell_type": "markdown", 276 | "metadata": {}, 277 | "source": [ 278 | "A 2D plot allows us to encode two dimensions of the data. Let's look at using color to encode a third:" 279 | ] 280 | }, 281 | { 282 | "cell_type": "code", 283 | "execution_count": null, 284 | "metadata": {}, 285 | "outputs": [], 286 | "source": [ 287 | "alt.Chart(iris).mark_point().encode(\n", 288 | " x='petalLength',\n", 289 | " y='petalWidth',\n", 290 | " color='species'\n", 291 | ")" 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "metadata": {}, 297 | "source": [ 298 | "Notice that when we use a categorical value for color, it chooses an appropriate color map for categorical data.\n", 299 | "\n", 300 | "Let's see what happens when we use a continuous color value:" 301 | ] 302 | }, 303 | { 304 | "cell_type": "code", 305 | "execution_count": null, 306 | "metadata": {}, 307 | "outputs": [], 308 | "source": [ 309 | "alt.Chart(iris).mark_point().encode(\n", 310 | " x='petalLength',\n", 311 | " y='petalWidth',\n", 312 | " color='sepalLength'\n", 313 | ")" 314 | ] 315 | }, 316 | { 317 | "cell_type": "markdown", 318 | "metadata": {}, 319 | "source": [ 320 | "A continuous color results in a color scale that is appropriate for continuous data.\n", 321 | "\n", 322 | "This is one key feature of Altair: it chooses appropriate scales for the data type.\n", 323 | "\n", 324 | "Similarly, if we put the categorical value on the x or y axis, it will also create discrete scales:" 325 | ] 326 | }, 327 | { 328 | "cell_type": "code", 329 | "execution_count": null, 330 | "metadata": {}, 331 | "outputs": [], 332 | "source": [ 333 | "alt.Chart(iris).mark_tick().encode(\n", 334 | " x='petalLength',\n", 335 | " y='species',\n", 336 | " color='species'\n", 337 | ")" 338 | ] 339 | }, 340 | { 341 | "cell_type": "markdown", 342 | "metadata": {}, 343 | "source": [ 344 | "Altair can also choose the orientation of the tick mark appropriately given the types of the x and y encodings:" 345 | ] 346 | }, 347 | { 348 | "cell_type": "code", 349 | "execution_count": null, 350 | "metadata": {}, 351 | "outputs": [], 
352 | "source": [ 353 | "alt.Chart(iris).mark_tick().encode(\n", 354 | " y='petalLength',\n", 355 | " x='species',\n", 356 | " color='species'\n", 357 | ")" 358 | ] 359 | }, 360 | { 361 | "cell_type": "markdown", 362 | "metadata": {}, 363 | "source": [ 364 | "## 5. Binning and aggregation\n", 365 | "\n", 366 | "Let's return quickly to our 1D chart of petal length:" 367 | ] 368 | }, 369 | { 370 | "cell_type": "code", 371 | "execution_count": null, 372 | "metadata": {}, 373 | "outputs": [], 374 | "source": [ 375 | "alt.Chart(iris).mark_tick().encode(\n", 376 | " x='petalLength',\n", 377 | ")" 378 | ] 379 | }, 380 | { 381 | "cell_type": "markdown", 382 | "metadata": {}, 383 | "source": [ 384 | "Another way we might represent this data is to creat a histogram: to bin the x data and show the count on the y axis.\n", 385 | "In many plotting libraries this is done with a special method like ``hist()``. In Altair, such binning and aggregation is part of the declarative API.\n", 386 | "\n", 387 | "To move beyond a simple field name, we use ``alt.X()`` for the x encoding, and we use ``'count()'`` for the y encoding:" 388 | ] 389 | }, 390 | { 391 | "cell_type": "code", 392 | "execution_count": null, 393 | "metadata": {}, 394 | "outputs": [], 395 | "source": [ 396 | "alt.Chart(iris).mark_bar().encode(\n", 397 | " x=alt.X('petalLength', bin=True),\n", 398 | " y='count()'\n", 399 | ")" 400 | ] 401 | }, 402 | { 403 | "cell_type": "markdown", 404 | "metadata": {}, 405 | "source": [ 406 | "If we want more control over the bins, we can use ``alt.Bin`` to adjust bin parameters" 407 | ] 408 | }, 409 | { 410 | "cell_type": "code", 411 | "execution_count": null, 412 | "metadata": {}, 413 | "outputs": [], 414 | "source": [ 415 | "alt.Chart(iris).mark_bar().encode(\n", 416 | " x=alt.X('petalLength', bin=alt.Bin(maxbins=50)),\n", 417 | " y='count()'\n", 418 | ")" 419 | ] 420 | }, 421 | { 422 | "cell_type": "markdown", 423 | "metadata": {}, 424 | "source": [ 425 | "If we apply another encoding (such as ``color``), the data will be automatically grouped within each bin:" 426 | ] 427 | }, 428 | { 429 | "cell_type": "code", 430 | "execution_count": null, 431 | "metadata": {}, 432 | "outputs": [], 433 | "source": [ 434 | "alt.Chart(iris).mark_bar().encode(\n", 435 | " x=alt.X('petalLength', bin=alt.Bin(maxbins=50)),\n", 436 | " y='count()',\n", 437 | " color='species'\n", 438 | ")" 439 | ] 440 | }, 441 | { 442 | "cell_type": "markdown", 443 | "metadata": {}, 444 | "source": [ 445 | "If you prefer a separate plot for each category, the ``column`` encoding can help:" 446 | ] 447 | }, 448 | { 449 | "cell_type": "code", 450 | "execution_count": null, 451 | "metadata": {}, 452 | "outputs": [], 453 | "source": [ 454 | "alt.Chart(iris).mark_bar().encode(\n", 455 | " x=alt.X('petalLength', bin=alt.Bin(maxbins=50)),\n", 456 | " y='count()',\n", 457 | " color='species',\n", 458 | " column='species'\n", 459 | ")" 460 | ] 461 | }, 462 | { 463 | "cell_type": "markdown", 464 | "metadata": {}, 465 | "source": [ 466 | "Binning and aggregation works in two dimensions as well; we can use the ``rect`` marker and visualize the count using the color:" 467 | ] 468 | }, 469 | { 470 | "cell_type": "code", 471 | "execution_count": null, 472 | "metadata": {}, 473 | "outputs": [], 474 | "source": [ 475 | "alt.Chart(iris).mark_rect().encode(\n", 476 | " x=alt.X('petalLength', bin=True),\n", 477 | " y=alt.Y('sepalLength', bin=True),\n", 478 | " color='count()'\n", 479 | ")" 480 | ] 481 | }, 482 | { 483 | "cell_type": "markdown", 484 | "metadata": 
{}, 485 | "source": [ 486 | "Aggregations can be more than simple counts; we can also aggregate and compute the mean of a third quantity within each bin" 487 | ] 488 | }, 489 | { 490 | "cell_type": "code", 491 | "execution_count": null, 492 | "metadata": {}, 493 | "outputs": [], 494 | "source": [ 495 | "alt.Chart(iris).mark_rect().encode(\n", 496 | " x=alt.X('petalLength', bin=True),\n", 497 | " y=alt.Y('sepalLength', bin=True),\n", 498 | " color='mean(petalWidth)'\n", 499 | ")" 500 | ] 501 | }, 502 | { 503 | "cell_type": "markdown", 504 | "metadata": {}, 505 | "source": [ 506 | "## 6. Stacking and Layering\n", 507 | "\n", 508 | "Sometimes a multi-panel view is helpful in order to see relationships between points.\n", 509 | "Altair provides the ``hconcat()`` and ``vconcat()`` functions, and the associated ``|`` and ``&`` operators to concatenate charts together (note this is different than the column-wise faceting we saw above, because each panel contains the *entire* dataset)" 510 | ] 511 | }, 512 | { 513 | "cell_type": "code", 514 | "execution_count": null, 515 | "metadata": {}, 516 | "outputs": [], 517 | "source": [ 518 | "chart1 = alt.Chart(iris).mark_point().encode(\n", 519 | " x='petalWidth',\n", 520 | " y='sepalWidth',\n", 521 | " color='species'\n", 522 | ")\n", 523 | "\n", 524 | "chart2 = alt.Chart(iris).mark_point().encode(\n", 525 | " x='sepalLength',\n", 526 | " y='sepalWidth',\n", 527 | " color='species'\n", 528 | ")\n", 529 | "\n", 530 | "chart1 | chart2" 531 | ] 532 | }, 533 | { 534 | "cell_type": "markdown", 535 | "metadata": {}, 536 | "source": [ 537 | "A useful trick for this sort of situation is to create a base chart, and concatenate two slightly different copies of it:" 538 | ] 539 | }, 540 | { 541 | "cell_type": "code", 542 | "execution_count": null, 543 | "metadata": {}, 544 | "outputs": [], 545 | "source": [ 546 | "base = alt.Chart(iris).mark_point().encode(\n", 547 | " y='sepalWidth',\n", 548 | " color='species'\n", 549 | ")\n", 550 | "\n", 551 | "base.encode(x='petalWidth') | base.encode(x='sepalLength')" 552 | ] 553 | }, 554 | { 555 | "cell_type": "markdown", 556 | "metadata": {}, 557 | "source": [ 558 | "Another type of compound chart is a layered chart, built using ``alt.layer()`` or equivalently the ``+`` operator.\n", 559 | "For example, we could layer points on top of the binned counts from before:" 560 | ] 561 | }, 562 | { 563 | "cell_type": "code", 564 | "execution_count": null, 565 | "metadata": {}, 566 | "outputs": [], 567 | "source": [ 568 | "counts = alt.Chart(iris).mark_rect(opacity=0.3).encode(\n", 569 | " x=alt.X('petalLength', bin=True),\n", 570 | " y=alt.Y('sepalLength', bin=True),\n", 571 | " color='count()'\n", 572 | ")\n", 573 | "\n", 574 | "points = alt.Chart(iris).mark_point().encode(\n", 575 | " x='petalLength',\n", 576 | " y='sepalLength',\n", 577 | " color='species'\n", 578 | ")\n", 579 | "\n", 580 | "counts + points" 581 | ] 582 | }, 583 | { 584 | "cell_type": "markdown", 585 | "metadata": {}, 586 | "source": [ 587 | "We'll see a use of this kind of chart stacking in the interactivity section, below." 588 | ] 589 | }, 590 | { 591 | "cell_type": "markdown", 592 | "metadata": {}, 593 | "source": [ 594 | "## 7. 
Interactivity: Selections\n", 595 | "\n", 596 | "Let's return to our scatter plot, and take a look at the other types of interactivity that Altair offers:" 597 | ] 598 | }, 599 | { 600 | "cell_type": "code", 601 | "execution_count": null, 602 | "metadata": {}, 603 | "outputs": [], 604 | "source": [ 605 | "alt.Chart(iris).mark_point().encode(\n", 606 | " x='petalLength',\n", 607 | " y='petalWidth',\n", 608 | " color='species'\n", 609 | ")" 610 | ] 611 | }, 612 | { 613 | "cell_type": "markdown", 614 | "metadata": {}, 615 | "source": [ 616 | "Recall that you can add ``interactive()`` to the end of a chart to enable the most basic interactive scales:" 617 | ] 618 | }, 619 | { 620 | "cell_type": "code", 621 | "execution_count": null, 622 | "metadata": {}, 623 | "outputs": [], 624 | "source": [ 625 | "alt.Chart(iris).mark_point().encode(\n", 626 | " x='petalLength',\n", 627 | " y='petalWidth',\n", 628 | " color='species'\n", 629 | ").interactive()" 630 | ] 631 | }, 632 | { 633 | "cell_type": "markdown", 634 | "metadata": {}, 635 | "source": [ 636 | "Altair provides a general ``selection`` API for creating interactive plots; for example, here we create an interval selection:" 637 | ] 638 | }, 639 | { 640 | "cell_type": "code", 641 | "execution_count": null, 642 | "metadata": {}, 643 | "outputs": [], 644 | "source": [ 645 | "interval = alt.selection_interval()\n", 646 | "\n", 647 | "alt.Chart(iris).mark_point().encode(\n", 648 | " x='petalLength',\n", 649 | " y='petalWidth',\n", 650 | " color='species'\n", 651 | ").properties(\n", 652 | " selection=interval\n", 653 | ")" 654 | ] 655 | }, 656 | { 657 | "cell_type": "markdown", 658 | "metadata": {}, 659 | "source": [ 660 | "Currently this selection doesn't actually do anything, but we can change that by conditioning the color on this selection:" 661 | ] 662 | }, 663 | { 664 | "cell_type": "code", 665 | "execution_count": null, 666 | "metadata": {}, 667 | "outputs": [], 668 | "source": [ 669 | "interval = alt.selection_interval()\n", 670 | "\n", 671 | "alt.Chart(iris).mark_point().encode(\n", 672 | " x='petalLength',\n", 673 | " y='petalWidth',\n", 674 | " color=alt.condition(interval, 'species', alt.value('lightgray'))\n", 675 | ").properties(\n", 676 | " selection=interval\n", 677 | ")" 678 | ] 679 | }, 680 | { 681 | "cell_type": "markdown", 682 | "metadata": {}, 683 | "source": [ 684 | "The nice thing about this selection API is that it *automatically* applies across any compound charts; for example, here we can horizontally concatenate two charts, and since they both have the same selection they both respond appropriately:" 685 | ] 686 | }, 687 | { 688 | "cell_type": "code", 689 | "execution_count": null, 690 | "metadata": {}, 691 | "outputs": [], 692 | "source": [ 693 | "interval = alt.selection_interval()\n", 694 | "\n", 695 | "base = alt.Chart(iris).mark_point().encode(\n", 696 | " y='petalWidth',\n", 697 | " color=alt.condition(interval, 'species', alt.value('lightgray'))\n", 698 | ").properties(\n", 699 | " selection=interval\n", 700 | ")\n", 701 | "\n", 702 | "base.encode(x='petalLength') | base.encode(x='sepalLength')" 703 | ] 704 | }, 705 | { 706 | "cell_type": "markdown", 707 | "metadata": {}, 708 | "source": [ 709 | "The properties of the interval selection can be easily modified; for example, we can specify that it only applies to the x encoding:" 710 | ] 711 | }, 712 | { 713 | "cell_type": "code", 714 | "execution_count": null, 715 | "metadata": {}, 716 | "outputs": [], 717 | "source": [ 718 | "interval = 
alt.selection_interval(encodings=['x'])\n", 719 | "\n", 720 | "base = alt.Chart(iris).mark_point().encode(\n", 721 | " y='petalWidth',\n", 722 | " color=alt.condition(interval, 'species', alt.value('lightgray'))\n", 723 | ").properties(\n", 724 | " selection=interval\n", 725 | ")\n", 726 | "\n", 727 | "base.encode(x='petalLength') | base.encode(x='sepalLength')" 728 | ] 729 | }, 730 | { 731 | "cell_type": "markdown", 732 | "metadata": {}, 733 | "source": [ 734 | "We can do even more sophisticated things with selections as well.\n", 735 | "For example, let's make a histogram of the number of records by species, and stack it on our scatter plot:" 736 | ] 737 | }, 738 | { 739 | "cell_type": "code", 740 | "execution_count": null, 741 | "metadata": {}, 742 | "outputs": [], 743 | "source": [ 744 | "interval = alt.selection_interval()\n", 745 | "\n", 746 | "base = alt.Chart(iris).mark_point().encode(\n", 747 | " y='petalLength',\n", 748 | " color=alt.condition(interval, 'species', alt.value('lightgray'))\n", 749 | ").properties(\n", 750 | " selection=interval\n", 751 | ")\n", 752 | "\n", 753 | "hist = alt.Chart(iris).mark_bar().encode(\n", 754 | " x='count()',\n", 755 | " y='species',\n", 756 | " color='species'\n", 757 | ").properties(\n", 758 | " width=800,\n", 759 | " height=80\n", 760 | ").transform_filter(\n", 761 | " interval\n", 762 | ")\n", 763 | "\n", 764 | "scatter = base.encode(x='petalWidth') | base.encode(x='sepalWidth')\n", 765 | "\n", 766 | "scatter & hist" 767 | ] 768 | }, 769 | { 770 | "cell_type": "markdown", 771 | "metadata": {}, 772 | "source": [ 773 | "This demo has covered a number of the available components of Altair.\n", 774 | "In the following sections, we'll look into each of these a bit more systematically." 775 | ] 776 | } 777 | ], 778 | "metadata": { 779 | "kernelspec": { 780 | "display_name": "Python 3", 781 | "language": "python", 782 | "name": "python3" 783 | }, 784 | "language_info": { 785 | "codemirror_mode": { 786 | "name": "ipython", 787 | "version": 3 788 | }, 789 | "file_extension": ".py", 790 | "mimetype": "text/x-python", 791 | "name": "python", 792 | "nbconvert_exporter": "python", 793 | "pygments_lexer": "ipython3", 794 | "version": "3.7.0" 795 | } 796 | }, 797 | "nbformat": 4, 798 | "nbformat_minor": 2 799 | } 800 | --------------------------------------------------------------------------------