├── images
│   ├── favicon.ico
│   └── altair-logo.png
├── requirements.txt
├── notebooks
│   ├── altair-gallery.png
│   ├── split-apply-combine.png
│   ├── Index.ipynb
│   ├── 05-Exploring.ipynb
│   └── 01-alt-Iris-Demo.ipynb
├── _config.yml
├── environment.yml
├── _toc.yml
├── .github
│   └── workflows
│       └── book.yml
├── README.md
├── LICENSE
├── tools
│   └── remove_vega_outputs.py
└── .gitignore
/images/favicon.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/altair-viz/altair-tutorial/HEAD/images/favicon.ico
--------------------------------------------------------------------------------
/images/altair-logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/altair-viz/altair-tutorial/HEAD/images/altair-logo.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | jupyter-book>=0.7.0b
2 | altair
3 | numpy
4 | pandas
5 | rise
6 | vega
7 | vega_datasets
8 |
--------------------------------------------------------------------------------
/notebooks/altair-gallery.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/altair-viz/altair-tutorial/HEAD/notebooks/altair-gallery.png
--------------------------------------------------------------------------------
/notebooks/split-apply-combine.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/altair-viz/altair-tutorial/HEAD/notebooks/split-apply-combine.png
--------------------------------------------------------------------------------
/_config.yml:
--------------------------------------------------------------------------------
1 | # Book settings
2 | title: Altair Tutorial
3 | author: Jake Vanderplas and Eitan Lees
4 | logo: 'images/altair-logo.png'
5 | html:
6 | favicon: "images/favicon.ico"
7 | repository:
8 | url: "https://github.com/altair-viz/altair-tutorial"
9 |
--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
1 | name: altair-tutorial
2 | channels:
3 | - conda-forge
4 | dependencies:
5 | - python
6 | - altair
7 | - numpy
8 | - pandas
9 | - rise
10 | - vega
11 | - vega_datasets
12 |
--------------------------------------------------------------------------------
/_toc.yml:
--------------------------------------------------------------------------------
1 | - file: README
2 | - file: notebooks/Index
3 | - file: notebooks/01-Cars-Demo
4 | - file: notebooks/02-Simple-Charts
5 | - file: notebooks/03-Binning-and-aggregation
6 | - file: notebooks/04-Compound-charts
7 | - file: notebooks/05-Exploring
8 | - file: notebooks/06-Selections
9 | - file: notebooks/07-Transformations
10 | - file: notebooks/08-Configuration
11 | - file: notebooks/09-Geographic-plots
12 |
--------------------------------------------------------------------------------
/.github/workflows/book.yml:
--------------------------------------------------------------------------------
1 | name: deploy-book
2 |
3 | # Only run this when the master branch changes
4 | on:
5 | push:
6 | branches:
7 | - master
8 |
 9 | # This job installs dependencies, builds the book, and pushes it to `gh-pages`
10 | jobs:
11 | deploy-book:
12 | runs-on: ubuntu-latest
13 | steps:
14 | - uses: actions/checkout@v2
15 |
16 | # Install dependencies
17 | - name: Set up Python 3.7
18 | uses: actions/setup-python@v1
19 | with:
20 | python-version: 3.7
21 |
22 | - name: Install dependencies
23 | run: |
24 | pip install -r requirements.txt
25 |
26 |
27 | # Build the book
28 | - name: Build the book
29 | run: |
30 | jupyter-book build .
31 |
32 | # Push the book's HTML to github-pages
33 | - name: GitHub Pages action
34 | uses: peaceiris/actions-gh-pages@v3.5.9
35 | with:
36 | github_token: ${{ secrets.GITHUB_TOKEN }}
37 | publish_dir: ./_build/html
38 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Exploratory Data Visualization with Altair
2 |
3 | 
4 |
5 |
6 | Welcome! Here you will find the source material for the Altair tutorial, given at PyCon 2018.
7 |
8 | An overview of the content can be found at [notebooks/Index.ipynb](notebooks/Index.ipynb)
9 |
10 | The intro slides are on [SpeakerDeck](https://speakerdeck.com/jakevdp/altair-tutorial-intro-pycon-2018)
11 |
12 | The workshop was recorded and is available [on YouTube](https://www.youtube.com/watch?v=ms29ZPUKxbU)
13 |
14 | This tutorial has been made into a jupyter book and is hosted at https://altair-viz.github.io/altair-tutorial
15 |
16 | See the Altair documentation at http://altair-viz.github.io/
17 |
18 | ## Run the Tutorial
19 |
20 | If you have a Google/Gmail account, you can run this tutorial from your browser using Colab: [Open in Colab](https://colab.research.google.com/github/altair-viz/altair-tutorial/blob/master/notebooks/Index.ipynb).
21 |
22 | You can also run the tutorial in Binder: [Launch Binder](https://mybinder.org/v2/gh/altair-viz/altair-tutorial.git/master)
23 |
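To run the notebooks locally, one option (assuming you have conda installed) is to create the bundled environment with `conda env create -f environment.yml`, activate it with `conda activate altair-tutorial`, and then open `notebooks/Index.ipynb` in Jupyter.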
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 Jake Vanderplas
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/tools/remove_vega_outputs.py:
--------------------------------------------------------------------------------
1 | """
2 | Tool to remove vegalite outputs from a notebook.
3 |
4 | For notebooks with a lot of embedded data, this helps keep the notebook size
5 | manageable.
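
Usage: python remove_vega_outputs.py INPUT_NOTEBOOK [OUTPUT_NOTEBOOK]
(the output defaults to the input filename with '-stripped' appended)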
6 | """
7 |
8 | import nbformat
9 |
10 | def _strip(output, ignore_mimetypes=('application/vnd.vegalite.v2+json',)):
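    # If the output has a data bundle, drop the ignored MIME types (vega-lite JSON by default).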
11 | if 'data' in output:
12 | output = output.copy()
13 | output['data'] = {k: v for k, v in output['data'].items()
14 | if k not in ignore_mimetypes}
15 | return output
16 |
17 |
18 | def strip_vega_outputs(notebook):
19 | for cell in notebook['cells']:
20 | if 'outputs' not in cell:
21 | continue
22 | cell['outputs'] = [_strip(output) for output in cell['outputs']]
23 | return notebook
24 |
25 |
26 | def main(input_filename, output_filename):
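    # Read the notebook, strip vega-lite outputs from its cells, and write the stripped copy.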
27 | notebook = nbformat.read(input_filename, as_version=nbformat.NO_CONVERT)
28 | stripped = strip_vega_outputs(notebook)
29 | with open(output_filename, 'w') as f:
30 | nbformat.write(stripped, f)
31 | return output_filename
32 |
33 |
34 | if __name__ == '__main__':
35 | import sys
36 | filename = sys.argv[1]
37 | if len(sys.argv) > 2:
38 | output_filename = sys.argv[2]
39 | else:
40 | output_filename = filename + '-stripped'
41 | main(filename, output_filename)
42 |
43 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | env/
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 |
28 | # PyInstaller
29 | # Usually these files are written by a python script from a template
30 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
31 | *.manifest
32 | *.spec
33 |
34 | # Installer logs
35 | pip-log.txt
36 | pip-delete-this-directory.txt
37 |
38 | # Unit test / coverage reports
39 | htmlcov/
40 | .tox/
41 | .coverage
42 | .coverage.*
43 | .cache
44 | nosetests.xml
45 | coverage.xml
46 | *.cover
47 | .hypothesis/
48 |
49 | # Translations
50 | *.mo
51 | *.pot
52 |
53 | # Django stuff:
54 | *.log
55 | local_settings.py
56 |
57 | # Flask stuff:
58 | instance/
59 | .webassets-cache
60 |
61 | # Scrapy stuff:
62 | .scrapy
63 |
64 | # Sphinx documentation
65 | docs/_build/
66 |
67 | # PyBuilder
68 | target/
69 |
70 | # Jupyter Notebook
71 | .ipynb_checkpoints
72 |
73 | # pyenv
74 | .python-version
75 |
76 | # celery beat schedule file
77 | celerybeat-schedule
78 |
79 | # SageMath parsed files
80 | *.sage.py
81 |
82 | # dotenv
83 | .env
84 |
85 | # virtualenv
86 | .venv
87 | venv/
88 | ENV/
89 |
90 | # Spyder project settings
91 | .spyderproject
92 | .spyproject
93 |
94 | # Rope project settings
95 | .ropeproject
96 |
97 | # mkdocs documentation
98 | /site
99 |
100 | # mypy
101 | .mypy_cache/
102 |
--------------------------------------------------------------------------------
/notebooks/Index.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "source": [
6 | "# Outline\n",
7 | "\n",
8 | "These notebooks contain an introduction to exploratory data analysis with [Altair](http://altair-viz.github.io).\n",
9 | "\n",
10 | "Materials can be found online at http://github.com/altair-viz/altair-tutorial\n",
11 | "\n",
12 | "To get Altair and its dependencies installed, please follow the [Installation Instructions](https://altair-viz.github.io/getting_started/installation.html) on Altair's website.\n",
13 | "\n",
14 | "## 1. Motivating Altair: Live Demo\n",
15 | "\n",
16 | "We'll start with a set of intro slides, followed by a live-coded demo of what it looks like to explore data with Altair.\n",
17 | "\n",
18 | "Here's the rough content of the live demo:\n",
19 | "\n",
20 | "- [01-Cars-Demo.ipynb](01-Cars-Demo.ipynb)\n",
21 | "\n",
22 | "This will cover a lot of ground very quickly. Don't worry if you feel a bit shaky on some of the concepts; the goal here is a birds-eye overview, and we'll walk-through important concepts in more detail later.\n",
23 | "\n",
24 | "## 2. Simple Charts: Core Concepts\n",
25 | "\n",
26 | "Digging into the basic features of an Altair chart\n",
27 | "\n",
28 | "- [02-Simple-Charts.ipynb](02-Simple-Charts.ipynb)\n",
29 | "\n",
30 | "## 3. Binning and Aggregation\n",
31 | "\n",
32 | "Altair lets you build binning and aggregation (i.e. basic data flows) into your chart spec.\n",
33 | "We'll cover that here.\n",
34 | "\n",
35 | "- [03-Binning-and-aggregation.ipynb](03-Binning-and-aggregation.ipynb)\n",
36 | "\n",
37 | "## 4. Layering and Concatenation\n",
38 | "\n",
39 | "From the simple charts we have seen already, you can begin layering and concatenating charts to build up more complicated visualizations.\n",
40 | "\n",
41 | "- [04-Compound-charts.ipynb](04-Compound-charts.ipynb)\n",
42 | " \n",
43 | "## 5. Exploring a Dataset!\n",
44 | "\n",
45 | "Here's a chance for you to try this out on some data!\n",
46 | "\n",
47 | "- [05-Exploring.ipynb](05-Exploring.ipynb)\n",
48 | " "
49 | ],
50 | "metadata": {}
51 | },
52 | {
53 | "cell_type": "markdown",
54 | "source": [
55 | "---\n",
56 | "\n",
57 | "*Afternoon Break*\n",
58 | "\n",
59 | "---"
60 | ],
61 | "metadata": {}
62 | },
63 | {
64 | "cell_type": "markdown",
65 | "source": [
66 | "## 6. Selections: making things interactive\n",
67 | "\n",
68 | "A killer feature of Altair is the ability to build interactions from basic building blocks.\n",
69 | "We'll dig into that here.\n",
70 | "\n",
71 | "- [06-Selections.ipynb](06-Selections.ipynb)\n",
72 | "\n",
73 | "## 7. Transformations: Data streams within the chart spec\n",
74 | "\n",
75 | "We saw binning and aggregation previously, but there are a number of other data transformations that Altair makes available.\n",
76 | "\n",
77 | "- [07-Transformations.ipynb](07-Transformations.ipynb)\n",
78 | "\n",
79 | "## 8. Config: Adjusting your Charts\n",
80 | "\n",
81 | "Once you're happy with your chart, it's nice to be able to tweak things like axis labels, titles, color scales, etc.\n",
82 | "We'll look here at how that can be done with Altair\n",
83 | "\n",
84 | "- [08-Configuration.ipynb](08-Configuration.ipynb)\n",
85 | "\n",
86 | "## 9. Geographic Charts: Maps\n",
87 | "\n",
88 | "Altair recently added support for geographic visualizations. We'll show a few examples of the basic principles behind these.\n",
89 | "\n",
90 | "- [09-Geographic-plots.ipynb](09-Geographic-plots.ipynb)\n",
91 | "\n",
92 | "## 10. Try it out!\n",
93 | "\n",
94 | "- Return to your visualization in [05-Exploring.ipynb](05-Exploring.ipynb) and begin adding some interactions & transformations. What can you come up with?"
95 | ],
96 | "metadata": {}
97 | }
98 | ],
99 | "metadata": {
100 | "kernelspec": {
101 | "display_name": "Python 3",
102 | "language": "python",
103 | "name": "python3"
104 | },
105 | "language_info": {
106 | "name": "python",
107 | "version": "3.7.6",
108 | "mimetype": "text/x-python",
109 | "codemirror_mode": {
110 | "name": "ipython",
111 | "version": 3
112 | },
113 | "pygments_lexer": "ipython3",
114 | "nbconvert_exporter": "python",
115 | "file_extension": ".py"
116 | },
117 | "nteract": {
118 | "version": "0.23.1"
119 | }
120 | },
121 | "nbformat": 4,
122 | "nbformat_minor": 4
123 | }
--------------------------------------------------------------------------------
/notebooks/05-Exploring.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Exploring a DataSet\n",
8 | "\n",
9 | "Now that we have seen the basic pieces of Altair's API, it's time to practice using it to explore a new dataset.\n",
10 | "With your partner, choose one of the following four datasets, detailed below.\n",
11 | "\n",
12 | "As you explore the data, recall the building blocks we've discussed:\n",
13 | "\n",
14 | "- various marks: ``mark_point()``, ``mark_line()``, ``mark_tick()``, ``mark_bar()``, ``mark_area()``, ``mark_rect()``, etc.\n",
15 | "- various encodings: ``x``, ``y``, ``color``, ``shape``, ``size``, ``row``, ``column``, ``text``, ``tooltip``, etc.\n",
16 | "- binning and aggregations: a [List of available aggregations](https://altair-viz.github.io/user_guide/encoding.html#binning-and-aggregation) can be found in Altair's documentation\n",
17 | "- stacking and layering (``alt.layer`` <-> ``+``, ``alt.hconcat`` <-> ``|``, ``alt.vconcat`` <-> ``&``)\n",
18 | "\n",
19 | "Start simple and build from there. Which encodings work best with quantitative data? With categorical data?\n",
20 | "What can you learn about your dataset using these tools?\n",
21 | "\n",
22 | "We'll set aside about 20 minutes for you to work on this with your partner."
23 | ]
24 | },
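  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For example, a minimal starting point (just a sketch, using the Seattle weather dataframe loaded below) might be:\n",
    "\n",
    "```python\n",
    "import altair as alt\n",
    "\n",
    "alt.Chart(weather).mark_point().encode(\n",
    "    x='temp_max',\n",
    "    y='precipitation',\n",
    "    color='weather'\n",
    ")\n",
    "```"
   ]
  },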
25 | {
26 | "cell_type": "code",
27 | "execution_count": 1,
28 | "metadata": {},
29 | "outputs": [],
30 | "source": [
31 | "from vega_datasets import data"
32 | ]
33 | },
34 | {
35 | "cell_type": "markdown",
36 | "metadata": {},
37 | "source": [
38 | "## Seattle Weather\n",
39 | "\n",
40 | "This data includes daily precipitation, temperature range, wind speed, and weather type as a function of date between 2012 and 2015 in Seattle."
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": 2,
46 | "metadata": {},
47 | "outputs": [
48 | {
49 | "data": {
50 | "text/html": [
51 | "
\n",
52 | "\n",
65 | "
\n",
66 | " \n",
67 | " \n",
68 | " | \n",
69 | " date | \n",
70 | " precipitation | \n",
71 | " temp_max | \n",
72 | " temp_min | \n",
73 | " wind | \n",
74 | " weather | \n",
75 | "
\n",
76 | " \n",
77 | " \n",
78 | " \n",
79 | " | 0 | \n",
80 | " 2012-01-01 | \n",
81 | " 0.0 | \n",
82 | " 12.8 | \n",
83 | " 5.0 | \n",
84 | " 4.7 | \n",
85 | " drizzle | \n",
86 | "
\n",
87 | " \n",
88 | " | 1 | \n",
89 | " 2012-01-02 | \n",
90 | " 10.9 | \n",
91 | " 10.6 | \n",
92 | " 2.8 | \n",
93 | " 4.5 | \n",
94 | " rain | \n",
95 | "
\n",
96 | " \n",
97 | " | 2 | \n",
98 | " 2012-01-03 | \n",
99 | " 0.8 | \n",
100 | " 11.7 | \n",
101 | " 7.2 | \n",
102 | " 2.3 | \n",
103 | " rain | \n",
104 | "
\n",
105 | " \n",
106 | " | 3 | \n",
107 | " 2012-01-04 | \n",
108 | " 20.3 | \n",
109 | " 12.2 | \n",
110 | " 5.6 | \n",
111 | " 4.7 | \n",
112 | " rain | \n",
113 | "
\n",
114 | " \n",
115 | " | 4 | \n",
116 | " 2012-01-05 | \n",
117 | " 1.3 | \n",
118 | " 8.9 | \n",
119 | " 2.8 | \n",
120 | " 6.1 | \n",
121 | " rain | \n",
122 | "
\n",
123 | " \n",
124 | "
\n",
125 | "
"
126 | ],
127 | "text/plain": [
128 | " date precipitation temp_max temp_min wind weather\n",
129 | "0 2012-01-01 0.0 12.8 5.0 4.7 drizzle\n",
130 | "1 2012-01-02 10.9 10.6 2.8 4.5 rain\n",
131 | "2 2012-01-03 0.8 11.7 7.2 2.3 rain\n",
132 | "3 2012-01-04 20.3 12.2 5.6 4.7 rain\n",
133 | "4 2012-01-05 1.3 8.9 2.8 6.1 rain"
134 | ]
135 | },
136 | "execution_count": 2,
137 | "metadata": {},
138 | "output_type": "execute_result"
139 | }
140 | ],
141 | "source": [
142 | "weather = data.seattle_weather()\n",
143 | "weather.head()"
144 | ]
145 | },
146 | {
147 | "cell_type": "markdown",
148 | "metadata": {},
149 | "source": [
150 | "## Gapminder\n",
151 | "\n",
152 | "This data consists of population, fertility, and life expectancy over time in a number of countries around the world.\n",
153 | "\n",
154 | "Note that, while you may be tempted to use a temporal encoding for the year, here the year is simply a number, not a date stamp, and so temporal encoding is not the best choice here."
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": 3,
160 | "metadata": {},
161 | "outputs": [
162 | {
163 | "data": {
164 | "text/html": [
165 | "\n",
166 | "\n",
179 | "
\n",
180 | " \n",
181 | " \n",
182 | " | \n",
183 | " year | \n",
184 | " country | \n",
185 | " cluster | \n",
186 | " pop | \n",
187 | " life_expect | \n",
188 | " fertility | \n",
189 | "
\n",
190 | " \n",
191 | " \n",
192 | " \n",
193 | " | 0 | \n",
194 | " 1955 | \n",
195 | " Afghanistan | \n",
196 | " 0 | \n",
197 | " 8891209 | \n",
198 | " 30.332 | \n",
199 | " 7.7 | \n",
200 | "
\n",
201 | " \n",
202 | " | 1 | \n",
203 | " 1960 | \n",
204 | " Afghanistan | \n",
205 | " 0 | \n",
206 | " 9829450 | \n",
207 | " 31.997 | \n",
208 | " 7.7 | \n",
209 | "
\n",
210 | " \n",
211 | " | 2 | \n",
212 | " 1965 | \n",
213 | " Afghanistan | \n",
214 | " 0 | \n",
215 | " 10997885 | \n",
216 | " 34.020 | \n",
217 | " 7.7 | \n",
218 | "
\n",
219 | " \n",
220 | " | 3 | \n",
221 | " 1970 | \n",
222 | " Afghanistan | \n",
223 | " 0 | \n",
224 | " 12430623 | \n",
225 | " 36.088 | \n",
226 | " 7.7 | \n",
227 | "
\n",
228 | " \n",
229 | " | 4 | \n",
230 | " 1975 | \n",
231 | " Afghanistan | \n",
232 | " 0 | \n",
233 | " 14132019 | \n",
234 | " 38.438 | \n",
235 | " 7.7 | \n",
236 | "
\n",
237 | " \n",
238 | "
\n",
239 | "
"
240 | ],
241 | "text/plain": [
242 | " year country cluster pop life_expect fertility\n",
243 | "0 1955 Afghanistan 0 8891209 30.332 7.7\n",
244 | "1 1960 Afghanistan 0 9829450 31.997 7.7\n",
245 | "2 1965 Afghanistan 0 10997885 34.020 7.7\n",
246 | "3 1970 Afghanistan 0 12430623 36.088 7.7\n",
247 | "4 1975 Afghanistan 0 14132019 38.438 7.7"
248 | ]
249 | },
250 | "execution_count": 3,
251 | "metadata": {},
252 | "output_type": "execute_result"
253 | }
254 | ],
255 | "source": [
256 | "gapminder = data.gapminder()\n",
257 | "gapminder.head()"
258 | ]
259 | },
260 | {
261 | "cell_type": "markdown",
262 | "metadata": {},
263 | "source": [
264 | "## Population\n",
265 | "\n",
266 | "This data contains the US population sub-divided by age and sex every decade from 1850 to near the present.\n",
267 | "\n",
268 | "Note that, while you may be tempted to use a temporal encoding for the year, here the year is simply a number, not a date stamp, and so temporal encoding is not the best choice."
269 | ]
270 | },
271 | {
272 | "cell_type": "code",
273 | "execution_count": 4,
274 | "metadata": {},
275 | "outputs": [
276 | {
277 | "data": {
278 | "text/html": [
279 | "\n",
280 | "\n",
293 | "
\n",
294 | " \n",
295 | " \n",
296 | " | \n",
297 | " year | \n",
298 | " age | \n",
299 | " sex | \n",
300 | " people | \n",
301 | "
\n",
302 | " \n",
303 | " \n",
304 | " \n",
305 | " | 0 | \n",
306 | " 1850 | \n",
307 | " 0 | \n",
308 | " 1 | \n",
309 | " 1483789 | \n",
310 | "
\n",
311 | " \n",
312 | " | 1 | \n",
313 | " 1850 | \n",
314 | " 0 | \n",
315 | " 2 | \n",
316 | " 1450376 | \n",
317 | "
\n",
318 | " \n",
319 | " | 2 | \n",
320 | " 1850 | \n",
321 | " 5 | \n",
322 | " 1 | \n",
323 | " 1411067 | \n",
324 | "
\n",
325 | " \n",
326 | " | 3 | \n",
327 | " 1850 | \n",
328 | " 5 | \n",
329 | " 2 | \n",
330 | " 1359668 | \n",
331 | "
\n",
332 | " \n",
333 | " | 4 | \n",
334 | " 1850 | \n",
335 | " 10 | \n",
336 | " 1 | \n",
337 | " 1260099 | \n",
338 | "
\n",
339 | " \n",
340 | "
\n",
341 | "
"
342 | ],
343 | "text/plain": [
344 | " year age sex people\n",
345 | "0 1850 0 1 1483789\n",
346 | "1 1850 0 2 1450376\n",
347 | "2 1850 5 1 1411067\n",
348 | "3 1850 5 2 1359668\n",
349 | "4 1850 10 1 1260099"
350 | ]
351 | },
352 | "execution_count": 4,
353 | "metadata": {},
354 | "output_type": "execute_result"
355 | }
356 | ],
357 | "source": [
358 | "population = data.population()\n",
359 | "population.head()"
360 | ]
361 | },
362 | {
363 | "cell_type": "markdown",
364 | "metadata": {},
365 | "source": [
366 | "## Movies\n",
367 | "\n",
368 | "The movies dataset has data on 3200 movies, including release date, budget, and ratings on IMDB and Rotten Tomatoes."
369 | ]
370 | },
371 | {
372 | "cell_type": "code",
373 | "execution_count": 5,
374 | "metadata": {},
375 | "outputs": [
376 | {
377 | "data": {
378 | "text/html": [
379 | "\n",
380 | "\n",
393 | "
\n",
394 | " \n",
395 | " \n",
396 | " | \n",
397 | " Title | \n",
398 | " US_Gross | \n",
399 | " Worldwide_Gross | \n",
400 | " US_DVD_Sales | \n",
401 | " Production_Budget | \n",
402 | " Release_Date | \n",
403 | " MPAA_Rating | \n",
404 | " Running_Time_min | \n",
405 | " Distributor | \n",
406 | " Source | \n",
407 | " Major_Genre | \n",
408 | " Creative_Type | \n",
409 | " Director | \n",
410 | " Rotten_Tomatoes_Rating | \n",
411 | " IMDB_Rating | \n",
412 | " IMDB_Votes | \n",
413 | "
\n",
414 | " \n",
415 | " \n",
416 | " \n",
417 | " | 0 | \n",
418 | " The Land Girls | \n",
419 | " 146083.0 | \n",
420 | " 146083.0 | \n",
421 | " NaN | \n",
422 | " 8000000.0 | \n",
423 | " Jun 12 1998 | \n",
424 | " R | \n",
425 | " NaN | \n",
426 | " Gramercy | \n",
427 | " None | \n",
428 | " None | \n",
429 | " None | \n",
430 | " None | \n",
431 | " NaN | \n",
432 | " 6.1 | \n",
433 | " 1071.0 | \n",
434 | "
\n",
435 | " \n",
436 | " | 1 | \n",
437 | " First Love, Last Rites | \n",
438 | " 10876.0 | \n",
439 | " 10876.0 | \n",
440 | " NaN | \n",
441 | " 300000.0 | \n",
442 | " Aug 07 1998 | \n",
443 | " R | \n",
444 | " NaN | \n",
445 | " Strand | \n",
446 | " None | \n",
447 | " Drama | \n",
448 | " None | \n",
449 | " None | \n",
450 | " NaN | \n",
451 | " 6.9 | \n",
452 | " 207.0 | \n",
453 | "
\n",
454 | " \n",
455 | " | 2 | \n",
456 | " I Married a Strange Person | \n",
457 | " 203134.0 | \n",
458 | " 203134.0 | \n",
459 | " NaN | \n",
460 | " 250000.0 | \n",
461 | " Aug 28 1998 | \n",
462 | " None | \n",
463 | " NaN | \n",
464 | " Lionsgate | \n",
465 | " None | \n",
466 | " Comedy | \n",
467 | " None | \n",
468 | " None | \n",
469 | " NaN | \n",
470 | " 6.8 | \n",
471 | " 865.0 | \n",
472 | "
\n",
473 | " \n",
474 | " | 3 | \n",
475 | " Let's Talk About Sex | \n",
476 | " 373615.0 | \n",
477 | " 373615.0 | \n",
478 | " NaN | \n",
479 | " 300000.0 | \n",
480 | " Sep 11 1998 | \n",
481 | " None | \n",
482 | " NaN | \n",
483 | " Fine Line | \n",
484 | " None | \n",
485 | " Comedy | \n",
486 | " None | \n",
487 | " None | \n",
488 | " 13.0 | \n",
489 | " NaN | \n",
490 | " NaN | \n",
491 | "
\n",
492 | " \n",
493 | " | 4 | \n",
494 | " Slam | \n",
495 | " 1009819.0 | \n",
496 | " 1087521.0 | \n",
497 | " NaN | \n",
498 | " 1000000.0 | \n",
499 | " Oct 09 1998 | \n",
500 | " R | \n",
501 | " NaN | \n",
502 | " Trimark | \n",
503 | " Original Screenplay | \n",
504 | " Drama | \n",
505 | " Contemporary Fiction | \n",
506 | " None | \n",
507 | " 62.0 | \n",
508 | " 3.4 | \n",
509 | " 165.0 | \n",
510 | "
\n",
511 | " \n",
512 | "
\n",
513 | "
"
514 | ],
515 | "text/plain": [
516 | " Title US_Gross Worldwide_Gross US_DVD_Sales \\\n",
517 | "0 The Land Girls 146083.0 146083.0 NaN \n",
518 | "1 First Love, Last Rites 10876.0 10876.0 NaN \n",
519 | "2 I Married a Strange Person 203134.0 203134.0 NaN \n",
520 | "3 Let's Talk About Sex 373615.0 373615.0 NaN \n",
521 | "4 Slam 1009819.0 1087521.0 NaN \n",
522 | "\n",
523 | " Production_Budget Release_Date MPAA_Rating Running_Time_min Distributor \\\n",
524 | "0 8000000.0 Jun 12 1998 R NaN Gramercy \n",
525 | "1 300000.0 Aug 07 1998 R NaN Strand \n",
526 | "2 250000.0 Aug 28 1998 None NaN Lionsgate \n",
527 | "3 300000.0 Sep 11 1998 None NaN Fine Line \n",
528 | "4 1000000.0 Oct 09 1998 R NaN Trimark \n",
529 | "\n",
530 | " Source Major_Genre Creative_Type Director \\\n",
531 | "0 None None None None \n",
532 | "1 None Drama None None \n",
533 | "2 None Comedy None None \n",
534 | "3 None Comedy None None \n",
535 | "4 Original Screenplay Drama Contemporary Fiction None \n",
536 | "\n",
537 | " Rotten_Tomatoes_Rating IMDB_Rating IMDB_Votes \n",
538 | "0 NaN 6.1 1071.0 \n",
539 | "1 NaN 6.9 207.0 \n",
540 | "2 NaN 6.8 865.0 \n",
541 | "3 13.0 NaN NaN \n",
542 | "4 62.0 3.4 165.0 "
543 | ]
544 | },
545 | "execution_count": 5,
546 | "metadata": {},
547 | "output_type": "execute_result"
548 | }
549 | ],
550 | "source": [
551 | "movies = data.movies()\n",
552 | "movies.head()"
553 | ]
554 | }
555 | ],
556 | "metadata": {
557 | "kernelspec": {
558 | "display_name": "Python 3",
559 | "language": "python",
560 | "name": "python3"
561 | },
562 | "language_info": {
563 | "codemirror_mode": {
564 | "name": "ipython",
565 | "version": 3
566 | },
567 | "file_extension": ".py",
568 | "mimetype": "text/x-python",
569 | "name": "python",
570 | "nbconvert_exporter": "python",
571 | "pygments_lexer": "ipython3",
572 | "version": "3.7.6"
573 | }
574 | },
575 | "nbformat": 4,
576 | "nbformat_minor": 4
577 | }
578 |
--------------------------------------------------------------------------------
/notebooks/01-alt-Iris-Demo.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Demo: Exploring the Iris Dataset\n",
8 | "\n",
9 | "We'll start this tutorial with a demo to whet your appetite for learning more. This section purposely moves quickly through many of the concepts (e.g. data, marks, encodings, aggregation, data types, selections, etc.)\n",
10 | "We will return to treat each of these in more depth later in the tutorial, so don't worry if it all seems to go a bit quickly!\n",
11 | "\n",
12 | "In the live tutorial, this will be done from scratch in a blank notebook.\n",
13 | "However, for the sake of people who want to look back on what we did live, I'll do my best to reproduce the examples and the discussion here."
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 | "## 1. Imports and Data\n",
21 | "\n",
22 | "We'll start with importing the [Altair package](http://altair-viz.github.io/) and enabling the appropriate renderer (if necessary):"
23 | ]
24 | },
25 | {
26 | "cell_type": "code",
27 | "execution_count": null,
28 | "metadata": {},
29 | "outputs": [],
30 | "source": [
31 | "import altair as alt\n",
32 | "\n",
33 | "# Altair plots render by default in JupyterLab and nteract\n",
34 | "\n",
35 | "# Uncomment/run this line to enable Altair in the classic notebook (not in JupyterLab)\n",
36 | "# alt.renderers.enable('notebook')\n",
37 | "\n",
38 | "# Uncomment/run this line to enable Altair in Colab\n",
39 | "# alt.renderers.enable('colab')"
40 | ]
41 | },
42 | {
43 | "cell_type": "markdown",
44 | "metadata": {},
45 | "source": [
46 | "Now we'll use the [vega_datasets package](https://github.com/altair-viz/vega_datasets), to load an example dataset:"
47 | ]
48 | },
49 | {
50 | "cell_type": "code",
51 | "execution_count": 30,
52 | "metadata": {},
53 | "outputs": [
54 | {
55 | "data": {
56 | "text/html": [
57 | "\n",
58 | "\n",
71 | "
\n",
72 | " \n",
73 | " \n",
74 | " | \n",
75 | " petalLength | \n",
76 | " petalWidth | \n",
77 | " sepalLength | \n",
78 | " sepalWidth | \n",
79 | " species | \n",
80 | "
\n",
81 | " \n",
82 | " \n",
83 | " \n",
84 | " | 0 | \n",
85 | " 1.4 | \n",
86 | " 0.2 | \n",
87 | " 5.1 | \n",
88 | " 3.5 | \n",
89 | " setosa | \n",
90 | "
\n",
91 | " \n",
92 | " | 1 | \n",
93 | " 1.4 | \n",
94 | " 0.2 | \n",
95 | " 4.9 | \n",
96 | " 3.0 | \n",
97 | " setosa | \n",
98 | "
\n",
99 | " \n",
100 | " | 2 | \n",
101 | " 1.3 | \n",
102 | " 0.2 | \n",
103 | " 4.7 | \n",
104 | " 3.2 | \n",
105 | " setosa | \n",
106 | "
\n",
107 | " \n",
108 | " | 3 | \n",
109 | " 1.5 | \n",
110 | " 0.2 | \n",
111 | " 4.6 | \n",
112 | " 3.1 | \n",
113 | " setosa | \n",
114 | "
\n",
115 | " \n",
116 | " | 4 | \n",
117 | " 1.4 | \n",
118 | " 0.2 | \n",
119 | " 5.0 | \n",
120 | " 3.6 | \n",
121 | " setosa | \n",
122 | "
\n",
123 | " \n",
124 | "
\n",
125 | "
"
126 | ],
127 | "text/plain": [
128 | " petalLength petalWidth sepalLength sepalWidth species\n",
129 | "0 1.4 0.2 5.1 3.5 setosa\n",
130 | "1 1.4 0.2 4.9 3.0 setosa\n",
131 | "2 1.3 0.2 4.7 3.2 setosa\n",
132 | "3 1.5 0.2 4.6 3.1 setosa\n",
133 | "4 1.4 0.2 5.0 3.6 setosa"
134 | ]
135 | },
136 | "execution_count": 30,
137 | "metadata": {},
138 | "output_type": "execute_result"
139 | }
140 | ],
141 | "source": [
142 | "from vega_datasets import data\n",
143 | "\n",
144 | "iris = data.iris()\n",
145 | "iris.head()"
146 | ]
147 | },
148 | {
149 | "cell_type": "markdown",
150 | "metadata": {},
151 | "source": [
152 | "Notice that this data is in columnar format: that is, each column contains an attribute of a data point, and each row contains a single instance of the data (here, the measurament of a single flower)."
153 | ]
154 | },
155 | {
156 | "cell_type": "markdown",
157 | "metadata": {},
158 | "source": [
159 | "## 2. Zero, One, and Two-dimensional Charts\n",
160 | "\n",
161 | "Using Altair, we can begin to explore this data.\n",
162 | "\n",
163 | "The most basic chart contains the dataset, along with a mark to represent each row:"
164 | ]
165 | },
166 | {
167 | "cell_type": "code",
168 | "execution_count": null,
169 | "metadata": {},
170 | "outputs": [],
171 | "source": [
172 | "alt.Chart(iris).mark_point()"
173 | ]
174 | },
175 | {
176 | "cell_type": "markdown",
177 | "metadata": {},
178 | "source": [
179 | "This is a pretty silly chart, because it consists of 150 points, all laid-out on top of each other. But this *is* technically a representation of the data!\n",
180 | "\n",
181 | "To make it more interesting, we need to *encode* columns of the data into visual features of the plot (e.g. x-position, y-position, size, color, etc.)\n",
182 | "\n",
183 | "Let's encode petal length on the x-axis using the ``encode()`` method:"
184 | ]
185 | },
186 | {
187 | "cell_type": "code",
188 | "execution_count": null,
189 | "metadata": {},
190 | "outputs": [],
191 | "source": [
192 | "alt.Chart(iris).mark_point().encode(\n",
193 | " x='petalLength'\n",
194 | ")"
195 | ]
196 | },
197 | {
198 | "cell_type": "markdown",
199 | "metadata": {},
200 | "source": [
201 | "This is a bit better, but the ``point`` mark is probably not the best for a 1D chart like this.\n",
202 | "\n",
203 | "Let's try the ``tick`` mark instead:"
204 | ]
205 | },
206 | {
207 | "cell_type": "code",
208 | "execution_count": null,
209 | "metadata": {},
210 | "outputs": [],
211 | "source": [
212 | "alt.Chart(iris).mark_tick().encode(\n",
213 | " x='petalLength'\n",
214 | ")"
215 | ]
216 | },
217 | {
218 | "cell_type": "markdown",
219 | "metadata": {},
220 | "source": [
221 | "Another thing we might do with the points is to expand them into a 2D chart by also encoding the y value. We'll return to using ``point`` markers, and put ``petalWidth`` on the y-axis"
222 | ]
223 | },
224 | {
225 | "cell_type": "code",
226 | "execution_count": null,
227 | "metadata": {},
228 | "outputs": [],
229 | "source": [
230 | "alt.Chart(iris).mark_point().encode(\n",
231 | " x='petalLength',\n",
232 | " y='petalWidth'\n",
233 | ")"
234 | ]
235 | },
236 | {
237 | "cell_type": "markdown",
238 | "metadata": {},
239 | "source": [
240 | "## 3 Simple Interactions\n",
241 | "\n",
242 | "One of the nicest features of Altair is the grammar of interaction that it provides.\n",
243 | "The simplest kind of interaction is the ability to pan and zoom along charts; Altair contains a shortcut to enable this via the ``interactive()`` method:"
244 | ]
245 | },
246 | {
247 | "cell_type": "code",
248 | "execution_count": null,
249 | "metadata": {},
250 | "outputs": [],
251 | "source": [
252 | "alt.Chart(iris).mark_point().encode(\n",
253 | " x='petalLength',\n",
254 | " y='petalWidth'\n",
255 | ").interactive()"
256 | ]
257 | },
258 | {
259 | "cell_type": "markdown",
260 | "metadata": {},
261 | "source": [
262 | "This lets you click and drag, as well as use your computer's scroll/zoom behavior to zoom in and out on the chart.\n",
263 | "\n",
264 | "We'll see other interactions later."
265 | ]
266 | },
267 | {
268 | "cell_type": "markdown",
269 | "metadata": {},
270 | "source": [
271 | "## 4. A Third Dimension: Color"
272 | ]
273 | },
274 | {
275 | "cell_type": "markdown",
276 | "metadata": {},
277 | "source": [
278 | "A 2D plot allows us to encode two dimensions of the data. Let's look at using color to encode a third:"
279 | ]
280 | },
281 | {
282 | "cell_type": "code",
283 | "execution_count": null,
284 | "metadata": {},
285 | "outputs": [],
286 | "source": [
287 | "alt.Chart(iris).mark_point().encode(\n",
288 | " x='petalLength',\n",
289 | " y='petalWidth',\n",
290 | " color='species'\n",
291 | ")"
292 | ]
293 | },
294 | {
295 | "cell_type": "markdown",
296 | "metadata": {},
297 | "source": [
298 | "Notice that when we use a categorical value for color, it chooses an appropriate color map for categorical data.\n",
299 | "\n",
300 | "Let's see what happens when we use a continuous color value:"
301 | ]
302 | },
303 | {
304 | "cell_type": "code",
305 | "execution_count": null,
306 | "metadata": {},
307 | "outputs": [],
308 | "source": [
309 | "alt.Chart(iris).mark_point().encode(\n",
310 | " x='petalLength',\n",
311 | " y='petalWidth',\n",
312 | " color='sepalLength'\n",
313 | ")"
314 | ]
315 | },
316 | {
317 | "cell_type": "markdown",
318 | "metadata": {},
319 | "source": [
320 | "A continuous color results in a color scale that is appropriate for continuous data.\n",
321 | "\n",
322 | "This is one key feature of Altair: it chooses appropriate scales for the data type.\n",
323 | "\n",
324 | "Similar, if we put the categorical value on the x or y axis, it will also create discrete scales:"
325 | ]
326 | },
327 | {
328 | "cell_type": "code",
329 | "execution_count": null,
330 | "metadata": {},
331 | "outputs": [],
332 | "source": [
333 | "alt.Chart(iris).mark_tick().encode(\n",
334 | " x='petalLength',\n",
335 | " y='species',\n",
336 | " color='species'\n",
337 | ")"
338 | ]
339 | },
340 | {
341 | "cell_type": "markdown",
342 | "metadata": {},
343 | "source": [
344 | "Altair also can choose the orientation of the tick mark appropriately given the types of the x and y encodings:"
345 | ]
346 | },
347 | {
348 | "cell_type": "code",
349 | "execution_count": null,
350 | "metadata": {},
351 | "outputs": [],
352 | "source": [
353 | "alt.Chart(iris).mark_tick().encode(\n",
354 | " y='petalLength',\n",
355 | " x='species',\n",
356 | " color='species'\n",
357 | ")"
358 | ]
359 | },
360 | {
361 | "cell_type": "markdown",
362 | "metadata": {},
363 | "source": [
364 | "## 5. Binning and aggregation\n",
365 | "\n",
366 | "Let's return quickly to our 1D chart of petal length:"
367 | ]
368 | },
369 | {
370 | "cell_type": "code",
371 | "execution_count": null,
372 | "metadata": {},
373 | "outputs": [],
374 | "source": [
375 | "alt.Chart(iris).mark_tick().encode(\n",
376 | " x='petalLength',\n",
377 | ")"
378 | ]
379 | },
380 | {
381 | "cell_type": "markdown",
382 | "metadata": {},
383 | "source": [
384 | "Another way we might represent this data is to creat a histogram: to bin the x data and show the count on the y axis.\n",
385 | "In many plotting libraries this is done with a special method like ``hist()``. In Altair, such binning and aggregation is part of the declarative API.\n",
386 | "\n",
387 | "To move beyond a simple field name, we use ``alt.X()`` for the x encoding, and we use ``'count()'`` for the y encoding:"
388 | ]
389 | },
390 | {
391 | "cell_type": "code",
392 | "execution_count": null,
393 | "metadata": {},
394 | "outputs": [],
395 | "source": [
396 | "alt.Chart(iris).mark_bar().encode(\n",
397 | " x=alt.X('petalLength', bin=True),\n",
398 | " y='count()'\n",
399 | ")"
400 | ]
401 | },
402 | {
403 | "cell_type": "markdown",
404 | "metadata": {},
405 | "source": [
406 | "If we want more control over the bins, we can use ``alt.Bin`` to adjust bin parameters"
407 | ]
408 | },
409 | {
410 | "cell_type": "code",
411 | "execution_count": null,
412 | "metadata": {},
413 | "outputs": [],
414 | "source": [
415 | "alt.Chart(iris).mark_bar().encode(\n",
416 | " x=alt.X('petalLength', bin=alt.Bin(maxbins=50)),\n",
417 | " y='count()'\n",
418 | ")"
419 | ]
420 | },
421 | {
422 | "cell_type": "markdown",
423 | "metadata": {},
424 | "source": [
425 | "If we apply another encoding (such as ``color``), the data will be automatically grouped within each bin:"
426 | ]
427 | },
428 | {
429 | "cell_type": "code",
430 | "execution_count": null,
431 | "metadata": {},
432 | "outputs": [],
433 | "source": [
434 | "alt.Chart(iris).mark_bar().encode(\n",
435 | " x=alt.X('petalLength', bin=alt.Bin(maxbins=50)),\n",
436 | " y='count()',\n",
437 | " color='species'\n",
438 | ")"
439 | ]
440 | },
441 | {
442 | "cell_type": "markdown",
443 | "metadata": {},
444 | "source": [
445 | "If you prefer a separate plot for each category, the ``column`` encoding can help:"
446 | ]
447 | },
448 | {
449 | "cell_type": "code",
450 | "execution_count": null,
451 | "metadata": {},
452 | "outputs": [],
453 | "source": [
454 | "alt.Chart(iris).mark_bar().encode(\n",
455 | " x=alt.X('petalLength', bin=alt.Bin(maxbins=50)),\n",
456 | " y='count()',\n",
457 | " color='species',\n",
458 | " column='species'\n",
459 | ")"
460 | ]
461 | },
462 | {
463 | "cell_type": "markdown",
464 | "metadata": {},
465 | "source": [
466 | "Binning and aggregation works in two dimensions as well; we can use the ``rect`` marker and visualize the count using the color:"
467 | ]
468 | },
469 | {
470 | "cell_type": "code",
471 | "execution_count": null,
472 | "metadata": {},
473 | "outputs": [],
474 | "source": [
475 | "alt.Chart(iris).mark_rect().encode(\n",
476 | " x=alt.X('petalLength', bin=True),\n",
477 | " y=alt.Y('sepalLength', bin=True),\n",
478 | " color='count()'\n",
479 | ")"
480 | ]
481 | },
482 | {
483 | "cell_type": "markdown",
484 | "metadata": {},
485 | "source": [
486 | "Aggregations can be more than simple counts; we can also aggregate and compute the mean of a third quantity within each bin"
487 | ]
488 | },
489 | {
490 | "cell_type": "code",
491 | "execution_count": null,
492 | "metadata": {},
493 | "outputs": [],
494 | "source": [
495 | "alt.Chart(iris).mark_rect().encode(\n",
496 | " x=alt.X('petalLength', bin=True),\n",
497 | " y=alt.Y('sepalLength', bin=True),\n",
498 | " color='mean(petalWidth)'\n",
499 | ")"
500 | ]
501 | },
502 | {
503 | "cell_type": "markdown",
504 | "metadata": {},
505 | "source": [
506 | "## 6. Stacking and Layering\n",
507 | "\n",
508 | "Sometimes a multi-panel view is helpful in order to see relationships between points.\n",
509 | "Altair provides the ``hconcat()`` and ``vconcat()`` functions, and the associated ``|`` and ``&`` operators to concatenate charts together (note this is different than the column-wise faceting we saw above, because each panel contains the *entire* dataset)"
510 | ]
511 | },
512 | {
513 | "cell_type": "code",
514 | "execution_count": null,
515 | "metadata": {},
516 | "outputs": [],
517 | "source": [
518 | "chart1 = alt.Chart(iris).mark_point().encode(\n",
519 | " x='petalWidth',\n",
520 | " y='sepalWidth',\n",
521 | " color='species'\n",
522 | ")\n",
523 | "\n",
524 | "chart2 = alt.Chart(iris).mark_point().encode(\n",
525 | " x='sepalLength',\n",
526 | " y='sepalWidth',\n",
527 | " color='species'\n",
528 | ")\n",
529 | "\n",
530 | "chart1 | chart2"
531 | ]
532 | },
533 | {
534 | "cell_type": "markdown",
535 | "metadata": {},
536 | "source": [
537 | "A useful trick for this sort of situation is to create a base chart, and concatenate two slightly different copies of it:"
538 | ]
539 | },
540 | {
541 | "cell_type": "code",
542 | "execution_count": null,
543 | "metadata": {},
544 | "outputs": [],
545 | "source": [
546 | "base = alt.Chart(iris).mark_point().encode(\n",
547 | " y='sepalWidth',\n",
548 | " color='species'\n",
549 | ")\n",
550 | "\n",
551 | "base.encode(x='petalWidth') | base.encode(x='sepalLength')"
552 | ]
553 | },
554 | {
555 | "cell_type": "markdown",
556 | "metadata": {},
557 | "source": [
558 | "Another type of compound chart is a layered chart, built using ``alt.layer()`` or equivalently the ``+`` operator.\n",
559 | "For example, we could layer points on top of the binned counts from before:"
560 | ]
561 | },
562 | {
563 | "cell_type": "code",
564 | "execution_count": null,
565 | "metadata": {},
566 | "outputs": [],
567 | "source": [
568 | "counts = alt.Chart(iris).mark_rect(opacity=0.3).encode(\n",
569 | " x=alt.X('petalLength', bin=True),\n",
570 | " y=alt.Y('sepalLength', bin=True),\n",
571 | " color='count()'\n",
572 | ")\n",
573 | "\n",
574 | "points = alt.Chart(iris).mark_point().encode(\n",
575 | " x='petalLength',\n",
576 | " y='sepalLength',\n",
577 | " color='species'\n",
578 | ")\n",
579 | "\n",
580 | "counts + points"
581 | ]
582 | },
583 | {
584 | "cell_type": "markdown",
585 | "metadata": {},
586 | "source": [
587 | "We'll see a use of this kind of chart stacking in the interactivity section, below."
588 | ]
589 | },
590 | {
591 | "cell_type": "markdown",
592 | "metadata": {},
593 | "source": [
594 | "## 7. Interactivity: Selections\n",
595 | "\n",
596 | "Let's return to our scatter plot, and take a look at the other types of interactivity that Altair offers:"
597 | ]
598 | },
599 | {
600 | "cell_type": "code",
601 | "execution_count": null,
602 | "metadata": {},
603 | "outputs": [],
604 | "source": [
605 | "alt.Chart(iris).mark_point().encode(\n",
606 | " x='petalLength',\n",
607 | " y='petalWidth',\n",
608 | " color='species'\n",
609 | ")"
610 | ]
611 | },
612 | {
613 | "cell_type": "markdown",
614 | "metadata": {},
615 | "source": [
616 | "Recall that you can add ``interactive()`` to the end of a chart to enable the most basic interactive scales:"
617 | ]
618 | },
619 | {
620 | "cell_type": "code",
621 | "execution_count": null,
622 | "metadata": {},
623 | "outputs": [],
624 | "source": [
625 | "alt.Chart(iris).mark_point().encode(\n",
626 | " x='petalLength',\n",
627 | " y='petalWidth',\n",
628 | " color='species'\n",
629 | ").interactive()"
630 | ]
631 | },
632 | {
633 | "cell_type": "markdown",
634 | "metadata": {},
635 | "source": [
636 | "Altair provides a general ``selection`` API for creating interactive plots; for example, here we create an interval selection:"
637 | ]
638 | },
639 | {
640 | "cell_type": "code",
641 | "execution_count": null,
642 | "metadata": {},
643 | "outputs": [],
644 | "source": [
645 | "interval = alt.selection_interval()\n",
646 | "\n",
647 | "alt.Chart(iris).mark_point().encode(\n",
648 | " x='petalLength',\n",
649 | " y='petalWidth',\n",
650 | " color='species'\n",
651 | ").properties(\n",
652 | " selection=interval\n",
653 | ")"
654 | ]
655 | },
656 | {
657 | "cell_type": "markdown",
658 | "metadata": {},
659 | "source": [
660 | "Currently this selection doesn't actually do anything, but we can change that by conditioning the color on this selection:"
661 | ]
662 | },
663 | {
664 | "cell_type": "code",
665 | "execution_count": null,
666 | "metadata": {},
667 | "outputs": [],
668 | "source": [
669 | "interval = alt.selection_interval()\n",
670 | "\n",
671 | "alt.Chart(iris).mark_point().encode(\n",
672 | " x='petalLength',\n",
673 | " y='petalWidth',\n",
674 | " color=alt.condition(interval, 'species', alt.value('lightgray'))\n",
675 | ").properties(\n",
676 | " selection=interval\n",
677 | ")"
678 | ]
679 | },
680 | {
681 | "cell_type": "markdown",
682 | "metadata": {},
683 | "source": [
684 | "The nice thing about this selection API is that it *automatically* applies across any compound charts; for example, here we can horizontally concatenate two charts, and since they both have the same selection they both respond appropriately:"
685 | ]
686 | },
687 | {
688 | "cell_type": "code",
689 | "execution_count": null,
690 | "metadata": {},
691 | "outputs": [],
692 | "source": [
693 | "interval = alt.selection_interval()\n",
694 | "\n",
695 | "base = alt.Chart(iris).mark_point().encode(\n",
696 | " y='petalWidth',\n",
697 | " color=alt.condition(interval, 'species', alt.value('lightgray'))\n",
698 | ").properties(\n",
699 | " selection=interval\n",
700 | ")\n",
701 | "\n",
702 | "base.encode(x='petalLength') | base.encode(x='sepalLength')"
703 | ]
704 | },
705 | {
706 | "cell_type": "markdown",
707 | "metadata": {},
708 | "source": [
709 | "The properties of the interval selection can be easily modified; for example, we can specify that it only applies to the x encoding:"
710 | ]
711 | },
712 | {
713 | "cell_type": "code",
714 | "execution_count": null,
715 | "metadata": {},
716 | "outputs": [],
717 | "source": [
718 | "interval = alt.selection_interval(encodings=['x'])\n",
719 | "\n",
720 | "base = alt.Chart(iris).mark_point().encode(\n",
721 | " y='petalWidth',\n",
722 | " color=alt.condition(interval, 'species', alt.value('lightgray'))\n",
723 | ").properties(\n",
724 | " selection=interval\n",
725 | ")\n",
726 | "\n",
727 | "base.encode(x='petalLength') | base.encode(x='sepalLength')"
728 | ]
729 | },
730 | {
731 | "cell_type": "markdown",
732 | "metadata": {},
733 | "source": [
734 | "We can do even more sophisticated things with selections as well.\n",
735 | "For example, let's make a histogram of the number of records by species, and stack it on our scatter plot:"
736 | ]
737 | },
738 | {
739 | "cell_type": "code",
740 | "execution_count": null,
741 | "metadata": {},
742 | "outputs": [],
743 | "source": [
744 | "interval = alt.selection_interval()\n",
745 | "\n",
746 | "base = alt.Chart(iris).mark_point().encode(\n",
747 | " y='petalLength',\n",
748 | " color=alt.condition(interval, 'species', alt.value('lightgray'))\n",
749 | ").properties(\n",
750 | " selection=interval\n",
751 | ")\n",
752 | "\n",
753 | "hist = alt.Chart(iris).mark_bar().encode(\n",
754 | " x='count()',\n",
755 | " y='species',\n",
756 | " color='species'\n",
757 | ").properties(\n",
758 | " width=800,\n",
759 | " height=80\n",
760 | ").transform_filter(\n",
761 | " interval\n",
762 | ")\n",
763 | "\n",
764 | "scatter = base.encode(x='petalWidth') | base.encode(x='sepalWidth')\n",
765 | "\n",
766 | "scatter & hist"
767 | ]
768 | },
769 | {
770 | "cell_type": "markdown",
771 | "metadata": {},
772 | "source": [
773 | "This demo has covered a number of the available components of Altair.\n",
774 | "In the following sections, we'll look into each of these a bit more systematically."
775 | ]
776 | }
777 | ],
778 | "metadata": {
779 | "kernelspec": {
780 | "display_name": "Python 3",
781 | "language": "python",
782 | "name": "python3"
783 | },
784 | "language_info": {
785 | "codemirror_mode": {
786 | "name": "ipython",
787 | "version": 3
788 | },
789 | "file_extension": ".py",
790 | "mimetype": "text/x-python",
791 | "name": "python",
792 | "nbconvert_exporter": "python",
793 | "pygments_lexer": "ipython3",
794 | "version": "3.7.0"
795 | }
796 | },
797 | "nbformat": 4,
798 | "nbformat_minor": 2
799 | }
800 |
--------------------------------------------------------------------------------