├── images
│   ├── logn.png
│   ├── ergo.webp
│   ├── million.mp4
│   ├── piece.png
│   ├── med-head.jpg
│   ├── complex-icon.png
│   └── Energy_landscape.png
├── binder
│   └── environment.yml
├── 00-Intro.ipynb
├── 05-Intro-Session-2.ipynb
├── .gitignore
├── 09-Wrapup.ipynb
├── 03-Numpy-Lab-CA.ipynb
├── 03a-Solution-Numpy-Lab-CA.ipynb
├── 07-Examples-Problem-Solving.ipynb
├── 08-Modeling-Intervention.ipynb
├── 04-Applying-C-S-Modeling.ipynb
├── 04a-Solution-Applying.ipynb
├── 02-Networks-Automata.ipynb
├── 01-Complex-Adaptive.ipynb
└── 06-Agents-PathDep.ipynb
/images/logn.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/adbreind/complex/main/images/logn.png
--------------------------------------------------------------------------------
/images/ergo.webp:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/adbreind/complex/main/images/ergo.webp
--------------------------------------------------------------------------------
/images/million.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/adbreind/complex/main/images/million.mp4
--------------------------------------------------------------------------------
/images/piece.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/adbreind/complex/main/images/piece.png
--------------------------------------------------------------------------------
/images/med-head.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/adbreind/complex/main/images/med-head.jpg
--------------------------------------------------------------------------------
/images/complex-icon.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/adbreind/complex/main/images/complex-icon.png
--------------------------------------------------------------------------------
/images/Energy_landscape.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/adbreind/complex/main/images/Energy_landscape.png
--------------------------------------------------------------------------------
/binder/environment.yml:
--------------------------------------------------------------------------------
1 | name: complex
2 | channels:
3 | - conda-forge
4 | dependencies:
5 | - python<3.10
6 | - jupyterlab
7 | - numpy
8 | - networkx
9 | - seaborn
10 |
--------------------------------------------------------------------------------
/00-Intro.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Instructor and Administrative Details\n",
8 | "\n",
9 | ""
10 | ]
11 | },
12 | {
13 | "cell_type": "markdown",
14 | "metadata": {},
15 | "source": [
16 | "## Adam Breindel\n",
17 | "\n",
18 | "__LinkedIn__ - https://www.linkedin.com/in/adbreind\n",
19 | "\n",
20 | "__Email__ - adbreind@gmail.com\n",
21 | "\n",
22 | "__Twitter__ - @adbreind\n",
23 | "\n",
24 | "* 20+ years building systems for startups and large enterprises\n",
25 | "* 10+ years teaching front- and back-end technology\n",
26 | "\n",
27 | "__Fun large-scale data projects__\n",
28 | "* Streaming neural net + decision tree fraud scoring\n",
29 | "* Realtime & offline analytics for banking\n",
30 | "* Music synchronization and licensing for networked jukeboxes\n",
31 | "\n",
32 | "__Industries__\n",
33 | "* Finance / Insurance\n",
34 | "* Travel, Media / Entertainment\n",
35 | "* Energy, Government\n",
36 | "* Advertising/Social Media, & more\n",
37 | "\n",
38 | "__Currently (Besides Training)__\n",
39 | "* Consulting on data engineering and machine learning\n",
40 | "* Development\n",
41 | "* Various advisory roles"
42 | ]
43 | },
44 | {
45 | "cell_type": "markdown",
46 | "metadata": {},
47 | "source": [
48 | "## Class Schedule: 2 Sessions x 3 Hours\n",
49 | "\n",
50 | "* Interrupt any time with questions or comments\n",
51 | "* Breaks of about 10 minutes every hour or so"
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": null,
57 | "metadata": {},
58 | "outputs": [],
59 | "source": []
60 | }
61 | ],
62 | "metadata": {
63 | "kernelspec": {
64 | "display_name": "Python 3 (ipykernel)",
65 | "language": "python",
66 | "name": "python3"
67 | },
68 | "language_info": {
69 | "codemirror_mode": {
70 | "name": "ipython",
71 | "version": 3
72 | },
73 | "file_extension": ".py",
74 | "mimetype": "text/x-python",
75 | "name": "python",
76 | "nbconvert_exporter": "python",
77 | "pygments_lexer": "ipython3",
78 | "version": "3.9.16"
79 | }
80 | },
81 | "nbformat": 4,
82 | "nbformat_minor": 4
83 | }
84 |
--------------------------------------------------------------------------------
/05-Intro-Session-2.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "6ddc98d2-2317-4189-8077-b22021792379",
6 | "metadata": {},
7 | "source": [
8 | "# Data-Driven Analysis and Modeling of Complex Adaptive Systems\n",
9 | "## Improve Your Understanding of Systems and Emergent Behaviors\n",
10 | "\n",
11 | ""
12 | ]
13 | },
14 | {
15 | "cell_type": "markdown",
16 | "id": "d37d6f5c-65a9-4a9c-8dff-b53baa5171f8",
17 | "metadata": {
18 | "tags": []
19 | },
20 | "source": [
21 | "### Session 2 Plan\n",
22 | "\n",
23 | "Models for Thinking: Agents\n",
24 | "- Agent-based models\n",
25 | "- Simulating Schelling’s segregation model\n",
26 | "- Interpreting the results of an ABM\n",
27 | "- More recent research\n",
28 | "\n",
29 | "Models for Thinking: Path-dependence\n",
30 | "- Part-vs-whole\n",
31 | "- Ensemble averages vs. time averages\n",
32 | "- Stateful agents\n",
33 | "\n",
34 | "Example Systems\n",
35 | "- Financial risks\n",
36 | "- Tech platform success and failure\n",
37 | "- Hiring and promotion\n",
38 | "\n",
39 | "Counterintuitive Problem-Solving Approaches\n",
40 | "- Synthesize linearity\n",
41 | "- Weaken links\n",
42 | "- Make delays work for you\n",
43 | "- Lower the signal-to-noise ratio\n",
44 | "- Distribute decision-making\n",
45 | "- Reset paths\n",
46 | "\n",
47 | "Modeling a System and an Intervention in the System\n",
48 | "- Geographical (location-based) model: crop threats\n",
49 | "- The forest-fire model: a cellular automaton implementation\n",
50 | "- What are the model parameters? How would we calibrate them?\n",
51 | "- How might we intervene to get better crops?\n",
52 | "- Lab experiments\n",
53 | "- Drawing conclusions\n",
54 | "\n",
55 | "Q&A and Wrap Up"
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "execution_count": null,
61 | "id": "0917918a-c9c2-4752-aa52-9dac18b9e666",
62 | "metadata": {},
63 | "outputs": [],
64 | "source": []
65 | }
66 | ],
67 | "metadata": {
68 | "kernelspec": {
69 | "display_name": "Python 3 (ipykernel)",
70 | "language": "python",
71 | "name": "python3"
72 | },
73 | "language_info": {
74 | "codemirror_mode": {
75 | "name": "ipython",
76 | "version": 3
77 | },
78 | "file_extension": ".py",
79 | "mimetype": "text/x-python",
80 | "name": "python",
81 | "nbconvert_exporter": "python",
82 | "pygments_lexer": "ipython3",
83 | "version": "3.9.16"
84 | }
85 | },
86 | "nbformat": 4,
87 | "nbformat_minor": 5
88 | }
89 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | share/python-wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 | MANIFEST
28 |
29 | # PyInstaller
30 | # Usually these files are written by a python script from a template
31 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
32 | *.manifest
33 | *.spec
34 |
35 | # Installer logs
36 | pip-log.txt
37 | pip-delete-this-directory.txt
38 |
39 | # Unit test / coverage reports
40 | htmlcov/
41 | .tox/
42 | .nox/
43 | .coverage
44 | .coverage.*
45 | .cache
46 | nosetests.xml
47 | coverage.xml
48 | *.cover
49 | *.py,cover
50 | .hypothesis/
51 | .pytest_cache/
52 | cover/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | .pybuilder/
76 | target/
77 |
78 | # Jupyter Notebook
79 | .ipynb_checkpoints
80 |
81 | # IPython
82 | profile_default/
83 | ipython_config.py
84 |
85 | # pyenv
86 | # For a library or package, you might want to ignore these files since the code is
87 | # intended to run in multiple environments; otherwise, check them in:
88 | # .python-version
89 |
90 | # pipenv
91 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
92 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
93 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
94 | # install all needed dependencies.
95 | #Pipfile.lock
96 |
97 | # poetry
98 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
99 | # This is especially recommended for binary packages to ensure reproducibility, and is more
100 | # commonly ignored for libraries.
101 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
102 | #poetry.lock
103 |
104 | # pdm
105 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
106 | #pdm.lock
107 | # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
108 | # in version control.
109 | # https://pdm.fming.dev/#use-with-ide
110 | .pdm.toml
111 |
112 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
113 | __pypackages__/
114 |
115 | # Celery stuff
116 | celerybeat-schedule
117 | celerybeat.pid
118 |
119 | # SageMath parsed files
120 | *.sage.py
121 |
122 | # Environments
123 | .env
124 | .venv
125 | env/
126 | venv/
127 | ENV/
128 | env.bak/
129 | venv.bak/
130 |
131 | # Spyder project settings
132 | .spyderproject
133 | .spyproject
134 |
135 | # Rope project settings
136 | .ropeproject
137 |
138 | # mkdocs documentation
139 | /site
140 |
141 | # mypy
142 | .mypy_cache/
143 | .dmypy.json
144 | dmypy.json
145 |
146 | # Pyre type checker
147 | .pyre/
148 |
149 | # pytype static type analyzer
150 | .pytype/
151 |
152 | # Cython debug symbols
153 | cython_debug/
154 |
155 | # PyCharm
156 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
157 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
158 | # and can be added to the global gitignore or merged into this file. For a more nuclear
159 | # option (not recommended) you can uncomment the following to ignore the entire idea folder.
160 | #.idea/
161 |
--------------------------------------------------------------------------------
/09-Wrapup.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "a861fa46-c868-422f-9e41-9f02a8e47783",
6 | "metadata": {},
7 | "source": [
8 | "# Wrapup\n",
9 | "\n",
10 | "Over the past two sessions and numerous interactive activities, we've embarked on a fascinating journey into the realm of complexity science, an interdisciplinary field that explores the behaviors and properties of complex systems and networks. From the outset, we laid the groundwork with foundational principles such as emergence, adaptation, and self-organization. We learned how these principles contribute to the behavior of complex systems, fostering patterns and structures that are more than the sum of their individual components. This set the stage for us to understand why complex systems, often termed 'systems of systems', exhibit properties that differ markedly from, and cannot be predicted from, those of their constituent parts.\n",
11 | "\n",
12 | "As we progressed, we discussed various complex systems existing in nature, society, and technology -- these included ecosystems, financial markets, neural networks, social networks, and even the internet itself, examining how complexity science offers unique insights into their behaviors. \n",
13 | "\n",
14 | "We delved into computational modeling and simulation techniques such as cellular automata, agent-based models, and network analysis, which enabled us to investigate and visualize the dynamics of these systems, including phenomena like phase transitions, power laws, and the edge of chaos.\n",
15 | "\n",
16 | "We've even touched (if lightly) on the principles and applications of complexity science to challenges facing humanity such as climate change, public health, and societal resilience. We learned that by understanding the interconnected, adaptive, and emergent nature of these issues, complexity science can inform more effective, sustainable, and equitable solutions.\n",
17 | "\n",
18 | "### Some questions to consider moving forward\n",
19 | "\n",
20 | "* How can complexity science reshape our approach towards solving global challenges like climate change or pandemics?\n",
21 | "* In what ways can our understanding of emergent behaviors inform the design of more resilient and sustainable systems, whether they be cities, economies, or ecosystems?\n",
22 | "* How might new advancements in computation and data science further expand our capabilities to model and understand complex systems?\n",
23 | "\n",
24 | "### Final thoughts\n",
25 | "\n",
26 | "If you have gained from our work together, I invite you to explore this expanding field of knowledge, which continually intersects with other disciplines, from physics to sociology, biology to computer science. The principles and insights of complexity can be applied to almost any realm we find ourselves in, so it is likely worthwhile to continue exploring, questioning, and applying what you've learned. \n",
27 | "\n",
28 | "Regardless of your principal interests and focus -- personal and professional -- consider that complexity science provides a powerful lens through which we can make sense of our interconnected world."
29 | ]
30 | },
31 | {
32 | "cell_type": "markdown",
33 | "id": "8a9193ec-26ab-4c00-98b9-75b0a697a882",
34 | "metadata": {},
35 | "source": [
36 | "### Some resources\n",
37 | "\n",
38 | "#### Books\n",
39 | "\n",
40 | "These books are generally accessible (i.e., not scientific publications)\n",
41 | "\n",
42 | "* \"Complexity: A Guided Tour\" by Melanie Mitchell\n",
43 | "* \"Scale: The Universal Laws of Life, Growth, and Death in Organisms, Cities, and Companies\" by Geoffrey West\n",
44 | "* \"Complexity and the Economy\" by W. Brian Arthur\n",
45 | "* \"The Quark and the Jaguar: Adventures in the Simple and the Complex\" by Murray Gell-Mann\n",
46 | "* \"Complex Adaptive Systems: An Introduction to Computational Models of Social Life\" by John H. Miller and Scott E. Page\n",
47 | "* \"What Is a Complex System?\" by James Ladyman and Karoline Wiesner\n",
48 | "\n",
49 | "#### Videos\n",
50 | "\n",
51 | "* Complexity Explorer Lecture: David Krakauer • What is Complexity? https://www.youtube.com/watch?v=FBkFu1g5PlE\n",
52 | "* Could One Physics Theory Unlock the Mysteries of the Brain? https://www.youtube.com/watch?v=hjGFp7lMi9A\n",
53 | "* Complex systems & networks: A historical overview + 4 key ideas https://videopress.com/v/oLdj0PQX\n",
54 | "* Ergodicity TV https://www.youtube.com/channel/UCJG8N5P1RFX29JTKt_6R21A\n",
55 | "\n",
56 | "#### Interactive Sites\n",
57 | "\n",
58 | "* Complexity explained https://complexityexplained.github.io/\n",
59 | "* Complexity explorer https://www.complexityexplorer.org/\n",
60 | "\n",
61 | "#### Podcasts\n",
62 | "\n",
63 | "* Complexity (SFI) https://complexity.simplecast.com/episodes\n",
64 | "* Simplifying Complexity (Brady Heywood) https://podcasts.apple.com/us/podcast/simplifying-complexity/id1651582236\n",
65 | "\n",
66 | "#### Institutions\n",
67 | "\n",
68 | "*The Santa Fe Institute is likely the premier institution worldwide with programs and faculty to match: https://www.santafe.edu/*\n",
69 | "\n",
70 | "Beyond SFI, some additional major institutions include\n",
71 | "\n",
72 | "* Complexity Science Hub Vienna, Austria: This research institution is dedicated to uncovering, describing, and solving complex problems in society. Their work includes an interdisciplinary blend of network science, data science, social sciences, and complexity theory.\n",
73 | "\n",
74 | "* Max Planck Institute for Dynamics and Self-Organization, Germany: This institute focuses on the study of dynamic and self-organizing systems across a range of research areas, from fluid dynamics to biological physics, making it a key player in the field of complexity science.\n",
75 | "\n",
76 | "There are numerous additional programs and institutions beyond these, which all have some great work to offer."
77 | ]
78 | },
79 | {
80 | "cell_type": "code",
81 | "execution_count": null,
82 | "id": "0bed8ff8-d735-41ab-ad29-dbf6b064715b",
83 | "metadata": {},
84 | "outputs": [],
85 | "source": []
86 | }
87 | ],
88 | "metadata": {
89 | "kernelspec": {
90 | "display_name": "Python 3 (ipykernel)",
91 | "language": "python",
92 | "name": "python3"
93 | },
94 | "language_info": {
95 | "codemirror_mode": {
96 | "name": "ipython",
97 | "version": 3
98 | },
99 | "file_extension": ".py",
100 | "mimetype": "text/x-python",
101 | "name": "python",
102 | "nbconvert_exporter": "python",
103 | "pygments_lexer": "ipython3",
104 | "version": "3.9.16"
105 | }
106 | },
107 | "nbformat": 4,
108 | "nbformat_minor": 5
109 | }
110 |
--------------------------------------------------------------------------------
/03-Numpy-Lab-CA.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "colab_type": "text",
7 | "id": "26VI4drphOwP"
8 | },
9 | "source": [
10 | "# Cellular Automata\n",
11 | "\n",
12 | "In this lab, we'll use NumPy to implement Conway's classic \"Life\" automaton.\n",
13 | "\n",
14 | "In Conway's game of life, we consider the Moore Neighborhood (https://en.wikipedia.org/wiki/Moore_neighborhood) of the 8 cells surrounding each cell, and apply the following rules:\n",
15 | "* Any live cell with fewer than two live neighbours dies (referred to as underpopulation or exposure).\n",
16 | "* Any live cell with more than three live neighbours dies (referred to as overpopulation or overcrowding).\n",
17 | "* Any live cell with two or three live neighbours lives, unchanged, to the next generation.\n",
18 | "* Any dead cell with exactly three live neighbours will come to life.\n",
19 | "\n",
20 | "Our grid will be a 2-dimensional NumPy array, with zeros representing empty/dead cells, and ones representing living cells.\n",
21 | "\n",
22 | "E.g.,"
23 | ]
24 | },
25 | {
26 | "cell_type": "code",
27 | "execution_count": null,
28 | "metadata": {
29 | "colab": {},
30 | "colab_type": "code",
31 | "id": "7NMxLSTz9zZg"
32 | },
33 | "outputs": [],
34 | "source": [
35 | "import numpy as np\n",
36 | "\n",
37 | "a = np.random.randint(2, size=(15, 15), dtype=np.uint8)\n",
38 | "print(a)"
39 | ]
40 | },
41 | {
42 | "cell_type": "markdown",
43 | "metadata": {
44 | "colab_type": "text",
45 | "id": "QCRlZlMyidRA"
46 | },
47 | "source": [
48 | "The first thing to do is create an update routine. Given an existing grid, `a`, generate the next step as a matrix `b`."
49 | ]
50 | },
51 | {
52 | "cell_type": "code",
53 | "execution_count": null,
54 | "metadata": {
55 | "colab": {},
56 | "colab_type": "code",
57 | "id": "X5tyZoh494EW"
58 | },
59 | "outputs": [],
60 | "source": [
61 | "###"
62 | ]
63 | },
64 | {
65 | "cell_type": "markdown",
66 | "metadata": {
67 | "colab_type": "text",
68 | "id": "7GJr6PcwjACf"
69 | },
70 | "source": [
71 | "Wrap that code in a function called `step` and we can use it to produce graphical output (albeit not very nicely animated) in the notebook."
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": null,
77 | "metadata": {
78 | "colab": {},
79 | "colab_type": "code",
80 | "id": "Dv2Kh5pq-e-O"
81 | },
82 | "outputs": [],
83 | "source": [
84 | "###"
85 | ]
86 | },
87 | {
88 | "cell_type": "code",
89 | "execution_count": null,
90 | "metadata": {
91 | "colab": {},
92 | "colab_type": "code",
93 | "id": "D-bypqMX-sbb"
94 | },
95 | "outputs": [],
96 | "source": [
97 | "step(a)"
98 | ]
99 | },
100 | {
101 | "cell_type": "markdown",
102 | "metadata": {
103 | "colab_type": "text",
104 | "id": "H84o-SzbjI4K"
105 | },
106 | "source": [
107 | "This code starts with an initial grid `a` and iterates 10 times, printing the output as text:"
108 | ]
109 | },
110 | {
111 | "cell_type": "code",
112 | "execution_count": null,
113 | "metadata": {
114 | "colab": {},
115 | "colab_type": "code",
116 | "id": "NzF7mkbI-t8c"
117 | },
118 | "outputs": [],
119 | "source": [
120 | "from IPython import display\n",
121 | "import time\n",
122 | "\n",
123 | "data = a.copy()\n",
124 | "for i in range(10):\n",
125 | " data = step(data)\n",
126 | " display.clear_output(wait=True)\n",
127 | " display.display(data)\n",
128 | " time.sleep(1.0)"
129 | ]
130 | },
131 | {
132 | "cell_type": "markdown",
133 | "metadata": {
134 | "colab_type": "text",
135 | "id": "CgzlTym7jSzK"
136 | },
137 | "source": [
138 | "Since Matplotlib will render our matrix as a graphical grid, by default..."
139 | ]
140 | },
141 | {
142 | "cell_type": "code",
143 | "execution_count": null,
144 | "metadata": {
145 | "colab": {},
146 | "colab_type": "code",
147 | "id": "o9irmfJT_cpF"
148 | },
149 | "outputs": [],
150 | "source": [
151 | "import matplotlib.pyplot as plt\n",
152 | "\n",
153 | "plt.matshow(a)"
154 | ]
155 | },
156 | {
157 | "cell_type": "markdown",
158 | "metadata": {
159 | "colab_type": "text",
160 | "id": "5_0bhxt5jZJc"
161 | },
162 | "source": [
163 | "... we can plug that in to render multiple steps in the notebook:"
164 | ]
165 | },
166 | {
167 | "cell_type": "code",
168 | "execution_count": null,
169 | "metadata": {
170 | "colab": {},
171 | "colab_type": "code",
172 | "id": "tyybrSF4_oOz"
173 | },
174 | "outputs": [],
175 | "source": [
176 | "data = a.copy()\n",
177 | "for i in range(20):\n",
178 | " data = step(data)\n",
179 | " display.clear_output(wait=True)\n",
180 | " plt.matshow(data)\n",
181 | " plt.show()\n",
182 | " time.sleep(0.5)"
183 | ]
184 | },
185 | {
186 | "cell_type": "markdown",
187 | "metadata": {
188 | "colab_type": "text",
189 | "id": "VS9AH8Bkje_5"
190 | },
191 | "source": [
192 | "You may have previously come across this operation of multiplying and summing a local grid of values relative to a center point. Although it carries the misnomer \"convolution\" in deep neural networks, it's a simpler operation called \"discrete cross-correlation\", and the SciPy `signal` package has a built-in implementation that works with NumPy.\n",
193 | "\n",
194 | "See if you can use that to simplify your update (`step`) code."
195 | ]
196 | },
197 | {
198 | "cell_type": "code",
199 | "execution_count": null,
200 | "metadata": {
201 | "colab": {},
202 | "colab_type": "code",
203 | "id": "7Q75A9qP96z1"
204 | },
205 | "outputs": [],
206 | "source": [
207 | "###"
208 | ]
209 | },
210 | {
211 | "cell_type": "markdown",
212 | "metadata": {
213 | "colab_type": "text",
214 | "id": "PVPU7UQK932a"
215 | },
216 | "source": [
217 | "Some very simple patterns can generate extremely long-lived processes in the game of life model. One of them is called Rabbits: http://www.conwaylife.com/wiki/Rabbits\n",
218 | "\n",
219 | "For fun, you can try generating the game from the rabbit pattern."
220 | ]
221 | },
222 | {
223 | "cell_type": "code",
224 | "execution_count": null,
225 | "metadata": {
226 | "colab": {},
227 | "colab_type": "code",
228 | "id": "LNclI6cD994k"
229 | },
230 | "outputs": [],
231 | "source": []
232 | }
233 | ],
234 | "metadata": {
235 | "colab": {
236 | "collapsed_sections": [],
237 | "name": "03-Numpy-Lab-CA",
238 | "provenance": [],
239 | "version": "0.3.2"
240 | },
241 | "kernelspec": {
242 | "display_name": "Python 3 (ipykernel)",
243 | "language": "python",
244 | "name": "python3"
245 | },
246 | "language_info": {
247 | "codemirror_mode": {
248 | "name": "ipython",
249 | "version": 3
250 | },
251 | "file_extension": ".py",
252 | "mimetype": "text/x-python",
253 | "name": "python",
254 | "nbconvert_exporter": "python",
255 | "pygments_lexer": "ipython3",
256 | "version": "3.9.16"
257 | }
258 | },
259 | "nbformat": 4,
260 | "nbformat_minor": 4
261 | }
262 |
--------------------------------------------------------------------------------
/03a-Solution-Numpy-Lab-CA.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "colab_type": "text",
7 | "id": "26VI4drphOwP"
8 | },
9 | "source": [
10 | "# Cellular Automata\n",
11 | "\n",
12 | "In this lab, we'll use NumPy to implement Conway's classic \"Life\" automaton.\n",
13 | "\n",
14 | "In Conway's game of life, we consider the Moore Neighborhood (https://en.wikipedia.org/wiki/Moore_neighborhood) of the 8 cells surrounding each cell, and apply the following rules:\n",
15 | "* Any live cell with fewer than two live neighbours dies (referred to as underpopulation or exposure).\n",
16 | "* Any live cell with more than three live neighbours dies (referred to as overpopulation or overcrowding).\n",
17 | "* Any live cell with two or three live neighbours lives, unchanged, to the next generation.\n",
18 | "* Any dead cell with exactly three live neighbours will come to life.\n",
19 | "\n",
20 | "Our grid will be a 2-dimensional NumPy array, with zeros representing empty/dead cells, and ones representing living cells.\n",
21 | "\n",
22 | "E.g.,"
23 | ]
24 | },
25 | {
26 | "cell_type": "code",
27 | "execution_count": null,
28 | "metadata": {
29 | "colab": {},
30 | "colab_type": "code",
31 | "id": "7NMxLSTz9zZg",
32 | "tags": []
33 | },
34 | "outputs": [],
35 | "source": [
36 | "import numpy as np\n",
37 | "\n",
38 | "a = np.random.randint(2, size=(15, 15), dtype=np.uint8)\n",
39 | "print(a)"
40 | ]
41 | },
42 | {
43 | "cell_type": "markdown",
44 | "metadata": {
45 | "colab_type": "text",
46 | "id": "QCRlZlMyidRA"
47 | },
48 | "source": [
49 | "The first thing to do is create an update routine. Given an existing grid, `a`, generate the next step as a matrix `b`."
50 | ]
51 | },
52 | {
53 | "cell_type": "code",
54 | "execution_count": null,
55 | "metadata": {
56 | "colab": {},
57 | "colab_type": "code",
58 | "id": "X5tyZoh494EW",
59 | "tags": []
60 | },
61 | "outputs": [],
62 | "source": [
63 | "b = np.zeros_like(a)\n",
64 | "rows, cols = a.shape\n",
65 | "for i in range(1, rows-1):\n",
66 | " for j in range(1, cols-1):\n",
67 | " state = a[i, j]\n",
68 | " neighbors = a[i-1:i+2, j-1:j+2]\n",
69 | " k = np.sum(neighbors) - state\n",
70 | " if state:\n",
71 | " if k==2 or k==3:\n",
72 | " b[i, j] = 1\n",
73 | " else:\n",
74 | " if k == 3:\n",
75 | " b[i, j] = 1\n",
76 | "\n",
77 | "print(b)"
78 | ]
79 | },
80 | {
81 | "cell_type": "markdown",
82 | "metadata": {
83 | "colab_type": "text",
84 | "id": "7GJr6PcwjACf"
85 | },
86 | "source": [
87 | "Wrap that code in a function called `step` and we can use it to produce graphical output (albeit not very nicely animated) in the notebook."
88 | ]
89 | },
90 | {
91 | "cell_type": "code",
92 | "execution_count": null,
93 | "metadata": {
94 | "colab": {},
95 | "colab_type": "code",
96 | "id": "Dv2Kh5pq-e-O",
97 | "tags": []
98 | },
99 | "outputs": [],
100 | "source": [
101 | "def step(old):\n",
102 | " a = old\n",
103 | " b = np.zeros_like(a)\n",
104 | " rows, cols = a.shape\n",
105 | " for i in range(1, rows-1):\n",
106 | " for j in range(1, cols-1):\n",
107 | " state = a[i, j]\n",
108 | " neighbors = a[i-1:i+2, j-1:j+2]\n",
109 | " k = np.sum(neighbors) - state\n",
110 | " if state:\n",
111 | " if k==2 or k==3:\n",
112 | " b[i, j] = 1\n",
113 | " else:\n",
114 | " if k == 3:\n",
115 | " b[i, j] = 1\n",
116 | " return b"
117 | ]
118 | },
119 | {
120 | "cell_type": "code",
121 | "execution_count": null,
122 | "metadata": {
123 | "colab": {},
124 | "colab_type": "code",
125 | "id": "D-bypqMX-sbb",
126 | "tags": []
127 | },
128 | "outputs": [],
129 | "source": [
130 | "step(a)"
131 | ]
132 | },
133 | {
134 | "cell_type": "markdown",
135 | "metadata": {
136 | "colab_type": "text",
137 | "id": "H84o-SzbjI4K"
138 | },
139 | "source": [
140 | "This code starts with an initial grid `a` and iterates 10 times, printing the output as text:"
141 | ]
142 | },
143 | {
144 | "cell_type": "code",
145 | "execution_count": null,
146 | "metadata": {
147 | "colab": {},
148 | "colab_type": "code",
149 | "id": "NzF7mkbI-t8c",
150 | "tags": []
151 | },
152 | "outputs": [],
153 | "source": [
154 | "from IPython import display\n",
155 | "import time\n",
156 | "\n",
157 | "data = a.copy()\n",
158 | "for i in range(10):\n",
159 | " data = step(data)\n",
160 | " display.clear_output(wait=True)\n",
161 | " display.display(data)\n",
162 | " time.sleep(1.0)"
163 | ]
164 | },
165 | {
166 | "cell_type": "markdown",
167 | "metadata": {
168 | "colab_type": "text",
169 | "id": "CgzlTym7jSzK"
170 | },
171 | "source": [
172 | "Since Matplotlib will render our matrix as a graphical grid, by default..."
173 | ]
174 | },
175 | {
176 | "cell_type": "code",
177 | "execution_count": null,
178 | "metadata": {
179 | "colab": {},
180 | "colab_type": "code",
181 | "id": "o9irmfJT_cpF",
182 | "tags": []
183 | },
184 | "outputs": [],
185 | "source": [
186 | "import matplotlib.pyplot as plt\n",
187 | "\n",
188 | "plt.matshow(a)"
189 | ]
190 | },
191 | {
192 | "cell_type": "markdown",
193 | "metadata": {
194 | "colab_type": "text",
195 | "id": "5_0bhxt5jZJc"
196 | },
197 | "source": [
198 | "... we can plug that in to render multiple steps in the notebook:"
199 | ]
200 | },
201 | {
202 | "cell_type": "code",
203 | "execution_count": null,
204 | "metadata": {
205 | "colab": {},
206 | "colab_type": "code",
207 | "id": "tyybrSF4_oOz",
208 | "tags": []
209 | },
210 | "outputs": [],
211 | "source": [
212 | "data = a.copy()\n",
213 | "for i in range(20):\n",
214 | " data = step(data)\n",
215 | " display.clear_output(wait=True)\n",
216 | " plt.matshow(data)\n",
217 | " plt.show()\n",
218 | " time.sleep(0.5)"
219 | ]
220 | },
221 | {
222 | "cell_type": "markdown",
223 | "metadata": {
224 | "colab_type": "text",
225 | "id": "VS9AH8Bkje_5"
226 | },
227 | "source": [
228 | "You may have previously come across this operation of multiplying and summing a local grid of values relative to a center point. Although it carries the misnomer \"convolution\" in deep neural networks, it's a simpler operation called \"discrete cross-correlation\", and the SciPy `signal` package has a built-in implementation that works with NumPy.\n",
229 | "\n",
230 | "See if you can use that to simplify your update (`step`) code."
231 | ]
232 | },
233 | {
234 | "cell_type": "code",
235 | "execution_count": null,
236 | "metadata": {
237 | "colab": {},
238 | "colab_type": "code",
239 | "id": "7Q75A9qP96z1",
240 | "tags": []
241 | },
242 | "outputs": [],
243 | "source": [
244 | "from scipy.signal import correlate2d\n",
245 | "\n",
246 | "kernel = np.array([[1, 1, 1],\n",
247 | " [1, 0, 1],\n",
248 | " [1, 1, 1]])\n",
249 | "\n",
250 | "c = correlate2d(a, kernel, mode='same')\n",
    251 |     "b = (c==3) | ((c==2) & a)  # alive next step: exactly 3 neighbors, or 2 neighbors and currently alive\n",
252 | "b = b.astype(np.uint8)\n",
253 | "print(b)"
254 | ]
255 | },
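Putting the pieces together, one possible simplified `step` is sketched below (an illustrative solution, not the only one; it assumes the grid is a NumPy array of 0s and 1s, and uses `boundary='fill'` so cells beyond the edge count as dead):

```python
import numpy as np
from scipy.signal import correlate2d

KERNEL = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])

def step(grid):
    # Count every cell's live neighbors in one vectorized pass
    neighbors = correlate2d(grid, KERNEL, mode='same', boundary='fill')
    # A cell is alive next step if it has exactly 3 live neighbors,
    # or 2 live neighbors and is itself currently alive
    alive = (neighbors == 3) | ((neighbors == 2) & (grid == 1))
    return alive.astype(np.uint8)
```

A quick sanity check is the "blinker" oscillator: a vertical bar of three live cells flips to a horizontal bar after one step, and back again after two.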
256 | {
257 | "cell_type": "markdown",
258 | "metadata": {
259 | "colab_type": "text",
260 | "id": "PVPU7UQK932a"
261 | },
262 | "source": [
263 | "Some very simple patterns can generate extremely long-lived processes in the game of life model. One of them is called Rabbits: http://www.conwaylife.com/wiki/Rabbits\n",
264 | "\n",
265 | "For fun, you can try generating the game from the rabbit pattern."
266 | ]
267 | },
268 | {
269 | "cell_type": "code",
270 | "execution_count": null,
271 | "metadata": {
272 | "colab": {},
273 | "colab_type": "code",
274 | "id": "LNclI6cD994k"
275 | },
276 | "outputs": [],
277 | "source": []
278 | }
279 | ],
280 | "metadata": {
281 | "colab": {
282 | "collapsed_sections": [],
283 | "name": "03-Numpy-Lab-CA",
284 | "provenance": [],
285 | "version": "0.3.2"
286 | },
287 | "kernelspec": {
288 | "display_name": "Python 3 (ipykernel)",
289 | "language": "python",
290 | "name": "python3"
291 | },
292 | "language_info": {
293 | "codemirror_mode": {
294 | "name": "ipython",
295 | "version": 3
296 | },
297 | "file_extension": ".py",
298 | "mimetype": "text/x-python",
299 | "name": "python",
300 | "nbconvert_exporter": "python",
301 | "pygments_lexer": "ipython3",
302 | "version": "3.9.16"
303 | }
304 | },
305 | "nbformat": 4,
306 | "nbformat_minor": 4
307 | }
308 |
--------------------------------------------------------------------------------
/07-Examples-Problem-Solving.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Example systems and approaches to problem solving under complexity\n",
8 | "\n",
9 | "We should now have a vocabulary around systems which lets us better see and articulate difficult characteristics of some familiar phenomena...\n",
10 | "\n",
11 | "Let's take a few minutes and try to use some of our modeling approaches to think and talk about...\n",
12 | "\n",
13 | "* Risks in the financial system, such as the collapse of Silicon Valley Bank\n",
14 | "\n",
15 | "* Tech platform success and failure, such as the success of iOS or of Python\n",
16 | "\n",
    17 |     "* Hiring and promotion, such as the dynamics of tech jobs in the last decade, especially at big firms\n",
18 | "\n"
19 | ]
20 | },
21 | {
22 | "cell_type": "markdown",
23 | "metadata": {},
24 | "source": [
25 | "## Problem solving approaches\n",
26 | "\n",
27 | "As in many situations, the initial challenge is *learning to look for complexity and becoming able to see and articulate it*\n",
28 | "\n",
29 | "Once we've identified the potential risks and challenges, we may look for mitigation strategies.\n",
30 | "\n",
31 | "We'll explore a number of mitigation strategies here -- some of which are quite counterintuitive."
32 | ]
33 | },
34 | {
35 | "cell_type": "markdown",
36 | "metadata": {},
37 | "source": [
38 | "### Weaken Links in Highly Coupled Systems\n",
39 | "\n",
    40 |     "Multiplicative network effects thrive on tight coupling, which encourages propagation and acceleration.\n",
41 | "\n",
42 | "If we weaken links in coupled systems, we make events more independent, lowering the likelihood of producing extreme outcomes.\n",
43 | "\n",
44 | "__Common link-weakening approach: add delay__\n",
45 | "\n",
    46 |     "Adding delays or friction can lower the coupling of a system and reduce path dependence.\n",
47 | "\n",
48 | "* Traffic metering lights on the freeway\n",
49 | "* Waiting periods for gun purchases\n",
50 | "* Cooling off periods for major purchases \n",
51 | "* Small transaction tax (friction) to attenuate high-frequency trading impacts on the market"
52 | ]
53 | },
54 | {
55 | "cell_type": "markdown",
56 | "metadata": {},
57 | "source": [
58 | "### Synthetic linearity\n",
59 | "\n",
60 | "We've seen that non-linear systems are harder to reason about, respond to, and control.\n",
61 | "\n",
62 | "Some non-linear patterns (like the interactions that lead to civil unrest) are, by definition, hard or impossible to see, control, and re-shape earlier in their history.\n",
63 | "\n",
64 | "But many man-made systems can be \"linearized\" with suitable planning.\n",
65 | "* Rate of turn for roadways\n",
    66 |     "  * Roadways that are easier and safer to navigate change direction at a more linear rate (by angle)\n",
67 | " * Highly nonlinear curve rates are fun -- and may keep you awake for a while -- but are fatiguing and more prone to driver error\n",
68 | " * Good road engineering strives to linearize the actions a driver has to take\n",
69 | "* Vehicle controls\n",
70 | " * Using gearing, rigging, or drive/fly-by-wire approaches, we can make a car or aircraft respond \"as if\" its attitude is a linear function of control inputs -- even when the underlying physics is nonlinear\n",
71 | " * Modern jet fighter aircraft implement extreme versions of this linearization to allow a human to control an unstable aircraft moving faster than the speed of sound\n",
72 | "\n",
73 | "__What if the ultimate response function needs to be nonlinear (e.g., to accommodate a wider scale)?__\n",
74 | "\n",
75 | "Even if \"synthetic\" linearity is not possible, a piecewise linear system with known breakpoints may be possible.\n",
76 | "\n",
    77 |     "<img src='images/piece.png'>\n",
78 | "\n",
79 | "Breakpoints would ideally align with human mode or context switches, like shifting gears, or lowering the flaps in an aircraft.\n",
80 | "\n",
    81 |     "This way, the controls are linear within each regime, and the breaks are learnable and tractable."
82 | ]
83 | },
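To make "piecewise linear with known breakpoints" concrete, here is a tiny sketch using NumPy's `np.interp`; the breakpoint and response values are invented purely for illustration:

```python
import numpy as np

# Hypothetical piecewise-linear control law: the response is linear
# within each regime, with one known breakpoint at input 0.6
breakpoints = np.array([0.0, 0.6, 1.0])  # regime boundaries (assumed)
responses   = np.array([0.0, 0.3, 1.0])  # response at each boundary (assumed)

def control_response(x):
    """Linearly interpolate within the regime containing x."""
    return np.interp(x, breakpoints, responses)
```

Below the breakpoint the slope is 0.5; above it, 1.75. The breakpoint itself is the learnable "gear shift" between regimes.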
84 | {
85 | "cell_type": "markdown",
86 | "metadata": {},
87 | "source": [
88 | "### Lower the Signal-to-Noise Ratio (Really!)\n",
89 | "\n",
90 | "Adding noise is one of the more counterintuitive mechanisms for reducing coupling.\n",
91 | "\n",
92 | "But, sometimes, a little entropy goes a long way.\n",
93 | "\n",
94 | "* The key insight in Robert Metcalfe's *Ethernet* was having agents _randomly delay_ packet resends to avoid collisions\n",
95 | " * In this case, the best way to avoid a traffic jam is to randomize, not coordinate\n",
96 | "* Adding randomness to the display algorithm on a social media feed \n",
97 | " * Makes it harder to game/manipulate the algorithm and spread misinformation through the network\n",
98 | "* Random delays on trade execution\n",
99 | " * Inhibit cascades of high-frequency trades\n",
    100 |     "    * Limit the ability of HF traders to use winnings to lock in even more winnings by buying ever-faster access to markets (e.g., better-located buildings, new fiber)\n",
101 | " \n",
102 | "> __Activity:__ This is an easy mitigation to experiment with! Choose any of the examples in the material, \"add noise,\" and see whether the interesting complex systems behavior (e.g., emergence, tipping points, etc.) becomes weaker.\n",
103 | ">\n",
104 | "> *Hint:* one easy way to add noise is to locate where the conditional logic is in the code, and then for a small but random fraction of iterations choose one of the conditional branches randomly instead of using the existing logic. This undermines the reinforcing behavior of the system. Investigate whether the fraction of noise that you're adding gives rise to another tipping point, where -- at a certain noise level -- the interesting complexity behavior seems to disappear."
105 | ]
106 | },
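The hint above can be sketched concretely. The model below is purely illustrative (a simple 1-D majority-rule automaton, not one of the course examples); the `noise` parameter is the probability that a cell ignores the rule and picks a branch at random:

```python
import numpy as np

rng = np.random.default_rng(42)

def majority_step(state, noise=0.0):
    """One sweep of an illustrative 1-D majority-rule automaton.

    With probability `noise`, a cell ignores its neighbors and picks
    a state at random -- undermining the reinforcing logic.
    """
    n = len(state)
    new = state.copy()
    for i in range(n):
        if rng.random() < noise:
            new[i] = rng.integers(0, 2)  # random branch instead of the rule
        else:
            # majority of self + two neighbors (ring / wrap-around)
            total = state[i - 1] + state[i] + state[(i + 1) % n]
            new[i] = 1 if total >= 2 else 0
    return new
```

Sweeping `noise` upward from 0 and watching for the level at which consensus blocks stop forming is exactly the tipping-point investigation suggested above.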
107 | {
108 | "cell_type": "markdown",
109 | "metadata": {},
110 | "source": [
111 | "### Distribute Decision Making\n",
112 | "\n",
    113 |     "Centralized decision making -- whether by a single person or an office/bureau/group -- inherently produces coupled results.\n",
114 | "\n",
115 | "These results are often desirable for their efficiency and coordination, but can be vulnerable as well.\n",
116 | "\n",
117 | "Widely distributed decision making leads to less efficient and more organic solutions which can be slower and more expensive, but are often more resilient.\n",
118 | "\n",
119 | "Taking fewer interconnected conditions for granted limits the risk in the overall system.\n",
120 | "\n",
    121 |     "* We see this phenomenon throughout biological systems, where organelles, cells, tissues, and organs show both local response and central coordination\n",
    122 |     "* Local, non-centrally-planned economies have survived across human history, despite some interesting exceptions\n",
123 | "* Certain car companies famously pioneered production techniques where ideas and direction are sourced from front-line employees, not just executives\n",
124 | " * https://hbr.org/2011/06/how-toyota-pulls-improvement-f"
125 | ]
126 | },
127 | {
128 | "cell_type": "markdown",
129 | "metadata": {},
130 | "source": [
131 | "### The Pre-Mortem\n",
132 | "\n",
133 | "> A pre-mortem ... is a managerial strategy in which a project team imagines that a project or organization has failed, and then works backward to determine what potentially could lead to the failure of the project or organization.\n",
134 | "> https://en.wikipedia.org/wiki/Pre-mortem\n",
135 | "\n",
    136 |     "Surprisingly, it turns out that people are much better at imagining what caused a failure when they take that failure as a given than they are at imagining future \"possible\" failures. A variety of psychological phenomena may be implicated, but the conclusion is clear: we can elicit better responses by skipping the \"hope for success\" mentality and performing -- at least as an exercise -- a failure analysis.\n",
137 | "\n",
138 | "* Just as we graphed out the interrelated challenges of 2020, a team can speculatively consider connections that lead to unintended consequences, runaway second-order effects, etc. while it's still early enough to change plans.\n",
139 | "\n",
140 | "### Learning systematically from near misses\n",
141 | "\n",
142 | "Another, related, approach for avoiding the risk of compounded failure is to take a positive attitude toward near misses and to systematically learn from them.\n",
143 | "\n",
144 | "* NASA's Aviation Safety Reporting System (ASRS) -- https://asrs.arc.nasa.gov/ -- has been incredibly effective at improving air safety.\n",
145 | "\n",
146 | "Most aviation failures are considered to result not from one isolated occurrence but from an \"accident chain\" of at least 3 mistakes. \"Breaking the chain\" by avoiding any of those mistakes leads to a safe, successful mission instead of an accident.\n",
147 | "\n",
    148 |     "Rather than shaming or threatening pilots, controllers, and mechanics when mistakes are made, the ASRS system offers partial immunity to participants in the aviation system who report their own near misses. Many ASRS reports have led to changes in the aviation system, saving thousands of lives. Some are even published (anonymously) in NASA's ASRS *Callback* newsletter: https://asrs.arc.nasa.gov/publications/callback.html\n",
149 | "\n",
150 | "Find the links. Break the links."
151 | ]
152 | },
153 | {
154 | "cell_type": "markdown",
155 | "metadata": {},
156 | "source": [
157 | "### Resetting Paths\n",
158 | "\n",
159 | "Sometimes path-dependent systems can be restored to (or moved toward) the ensemble average by intervening to reset paths.\n",
160 | "\n",
161 | "Concretely, the state of one or more agents is either \n",
162 | "* reset to a value\n",
163 | "* prevented from exceeding a range\n",
164 | "\n",
165 | "We've seen one example: configuring deployed tech devices like servers, security cameras, or printers to restart with clean state, so that previous anomalies don't make future ones more likely.\n",
166 | "\n",
167 | "Other examples of path resetting include:\n",
168 | "* inheritance taxes to limit the compounding inequality of wealth across generations\n",
    169 |     "* welfare benefits which kick in at specific levels to limit the downward spiral of poverty\n",
170 | "* accelerating penalties for offenses like DUI to slow/stop the harm of habitual drunk drivers\n",
171 | "* hard salary ranges for roles to limit accumulating pay disparities among co-workers in similar jobs\n",
172 | "* breakup of monopoly firms (or prevention of mergers) to limit market concentration\n",
173 | "\n"
174 | ]
175 | },
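As a minimal illustrative sketch (not from the course code): agents grow multiplicatively, and an optional `cap` clamps every path back into a fixed range after each step, a crude version of the "reset to a value / prevent exceeding a range" interventions above:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_paths(n_agents=1000, n_steps=100, cap=None):
    """Multiplicative random growth per agent; if `cap` is given,
    clamp every path into [1/cap, cap] after each step (a crude
    'path reset' that keeps agents near the ensemble average)."""
    wealth = np.ones(n_agents)
    for _ in range(n_steps):
        wealth *= rng.choice([0.9, 1.1], size=n_agents)
        if cap is not None:
            wealth = np.clip(wealth, 1 / cap, cap)
    return wealth
```

Comparing the spread of `simulate_paths()` against `simulate_paths(cap=2)` shows how the reset suppresses the extreme diverging paths while leaving typical paths largely alone.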
176 | {
177 | "cell_type": "code",
178 | "execution_count": null,
179 | "metadata": {},
180 | "outputs": [],
181 | "source": []
182 | }
183 | ],
184 | "metadata": {
185 | "kernelspec": {
186 | "display_name": "Python 3 (ipykernel)",
187 | "language": "python",
188 | "name": "python3"
189 | },
190 | "language_info": {
191 | "codemirror_mode": {
192 | "name": "ipython",
193 | "version": 3
194 | },
195 | "file_extension": ".py",
196 | "mimetype": "text/x-python",
197 | "name": "python",
198 | "nbconvert_exporter": "python",
199 | "pygments_lexer": "ipython3",
200 | "version": "3.9.16"
201 | }
202 | },
203 | "nbformat": 4,
204 | "nbformat_minor": 4
205 | }
206 |
--------------------------------------------------------------------------------
/08-Modeling-Intervention.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "d0f65e88-b886-45f5-8aa1-1e091570971d",
6 | "metadata": {},
7 | "source": [
8 | "# Modeling a System and an Intervention\n",
9 | "\n",
10 | "In this module, we'll go deeper with a single problem and we'll try to...\n",
11 | "* consider modeling strategies\n",
12 | "* identify an existing model as a starting point\n",
13 | "* implement the model and discuss calibration\n",
14 | "* propose an intervention\n",
15 | "* plan experiments to model the costs and benefits of the intervention"
16 | ]
17 | },
18 | {
19 | "cell_type": "markdown",
20 | "id": "c3a0ab63-3244-4660-aac0-2596482a4207",
21 | "metadata": {},
22 | "source": [
23 | "## Crop disease\n",
24 | "\n",
25 | "Crop disease is a common agricultural challenge. In the absence of targeted (e.g., pharmaceutical or chemical) solutions, we might take a step back and see if there are some systems characteristics we can identify and use to propose additional options.\n",
26 | "\n",
27 | "### Modeling strategies\n",
28 | "\n",
29 | "There are lots of ways we might model the crop disease phenomenon.\n",
30 | "\n",
31 | "Let's start with some abstractions:\n",
32 | "* consider a square plot of land\n",
33 | "* subdivided into many smaller, uniform sections\n",
34 | "* where each section is sufficiently small that\n",
35 | " * it either only includes one plant, or else is small enough that all of the plants in the cell are likely to either be healthy or diseased\n",
36 | "* disease can spread locally but not directly over large distances\n",
37 | "\n",
38 | "This is a *spatial* model with local, short-term interaction patterns. We might model this with a 2-D cellular automaton.\n",
39 | "\n",
40 | "Of course this is oversimplified. But remember\n",
41 | "* we know that extremely large, complicated models can produce phenomena of interest\n",
    42 |     "  * but our ability to work with those models -- especially for causal interventions -- is limited\n",
43 | "* the goal here is to come up with a parsimonious generative model that might produce the outcome we're addressing\n",
44 | "\n"
45 | ]
46 | },
47 | {
48 | "cell_type": "markdown",
49 | "id": "3e9434f8-8365-400e-8cf8-0adfc865a76e",
50 | "metadata": {},
51 | "source": [
52 | "### Forest fire model"
53 | ]
54 | },
55 | {
56 | "cell_type": "markdown",
57 | "id": "1d96395f-854b-4fc8-9438-443dffe37258",
58 | "metadata": {},
59 | "source": [
60 | "Our crop damage model is similar to an existing well-known model: the Forest Fire Cellular Automaton, which is a simple computational model used to study the dynamics of forest fires.\n",
61 | "\n",
62 | "The Forest Fire model was introduced by Drossel and Schwabl in 1992.\n",
63 | "* the model is interesting because it can generate clusters of trees that are power-law distributed\n",
64 | " * meaning that there are many small clusters and a few large ones\n",
65 | " * this feature is observed in many natural and social systems\n",
66 | "* the Forest Fire model has been used to study not only the dynamics of forest fires\n",
67 | " * but also other phenomena that exhibit similar patterns, such as the spread of diseases or information through a network\n",
68 | "\n",
69 | "The Forest Fire Cellular Automaton model typically has three states for each cell: empty, tree, and burning. At each time step, the model updates the state of each cell based on the following rules:\n",
70 | "\n",
71 | "1. A burning cell turns into an empty cell.\n",
72 | "2. A tree will become a burning cell if at least one neighbor is burning.\n",
73 | "3. An empty space becomes a tree with probability `p`.\n",
74 | "4. A tree ignites, turning into a burning cell with probability `f`, regardless of its neighboring cells.\n",
75 | "\n",
76 | "In this model, `p` is the tree growth probability and `f` is the lightning strike probability, representing the chance of a tree spontaneously catching fire. These probabilities are typically small.\n",
77 | "\n",
78 | "Let's implement a basic model:"
79 | ]
80 | },
81 | {
82 | "cell_type": "code",
83 | "execution_count": null,
84 | "id": "2f44e0ae-f64e-49e1-b082-82f88f878be8",
85 | "metadata": {
86 | "tags": []
87 | },
88 | "outputs": [],
89 | "source": [
90 | "import numpy as np\n",
91 | "import matplotlib.pyplot as plt\n",
92 | "\n",
93 | "# Define the states\n",
94 | "EMPTY, TREE, BURNING = 0, 1, 2\n",
95 | "\n",
96 | "# Define the probabilities\n",
97 | "p, f = 0.03, 0.001 # probabilities of tree growth and fire\n",
98 | "\n",
99 | "# Grid size\n",
100 | "size = (100, 100)\n",
101 | "\n",
102 | "# Initialize grid: all cells start as empty\n",
103 | "grid = np.zeros(size, dtype=int)"
104 | ]
105 | },
106 | {
107 | "cell_type": "markdown",
108 | "id": "87c10380-76e3-4f9b-9679-ab53c738563f",
109 | "metadata": {},
110 | "source": [
111 | "The update function implements the CA rules"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "execution_count": null,
117 | "id": "4abb4cb5-5966-49f3-bcac-f4f29bfe38ec",
118 | "metadata": {
119 | "tags": []
120 | },
121 | "outputs": [],
122 | "source": [
123 | "def update(grid):\n",
124 | " new_grid = grid.copy()\n",
125 | " for i in range(size[0]):\n",
126 | " for j in range(size[1]):\n",
127 | " if grid[i, j] == EMPTY and np.random.rand() < p:\n",
128 | " new_grid[i, j] = TREE\n",
129 | " elif grid[i, j] == TREE:\n",
130 | " if np.random.rand() < f:\n",
131 | " new_grid[i, j] = BURNING\n",
132 | " elif any(grid[ii, jj] == BURNING for ii in range(max(i-1, 0), min(i+2, size[0]))\n",
133 | " for jj in range(max(j-1, 0), min(j+2, size[1]))):\n",
134 | " new_grid[i, j] = BURNING\n",
135 | " elif grid[i, j] == BURNING:\n",
136 | " new_grid[i, j] = EMPTY\n",
137 | " return new_grid"
138 | ]
139 | },
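As an aside, the nested loops can be replaced with a vectorized sketch using the `correlate2d` trick from the NumPy lab. This is an equivalent-in-spirit rewrite, not the canonical implementation; `boundary='fill'` treats out-of-grid cells as non-burning, mirroring the clipped index ranges above:

```python
import numpy as np
from scipy.signal import correlate2d

EMPTY, TREE, BURNING = 0, 1, 2  # same encoding as above

NEIGHBORS = np.ones((3, 3), dtype=int)
NEIGHBORS[1, 1] = 0  # Moore neighborhood, excluding the cell itself

def update_vectorized(grid, p=0.03, f=0.001, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    # Count each cell's burning neighbors in a single pass
    burning_neighbors = correlate2d((grid == BURNING).astype(int), NEIGHBORS,
                                    mode='same', boundary='fill')
    rand = rng.random(grid.shape)
    new_grid = grid.copy()
    new_grid[grid == BURNING] = EMPTY                          # rule 1
    ignite = (grid == TREE) & ((burning_neighbors > 0) | (rand < f))
    new_grid[ignite] = BURNING                                 # rules 2 and 4
    new_grid[(grid == EMPTY) & (rand < p)] = TREE              # rule 3
    return new_grid
```

On a 100x100 grid this runs orders of magnitude faster than the Python loops, which matters once we start sweeping parameters.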
140 | {
141 | "cell_type": "markdown",
142 | "id": "35f77f4e-8960-4965-8321-7ee806d79292",
143 | "metadata": {},
144 | "source": [
145 | "To make the output easier to interpret, we can define a custom color mapping where\n",
146 | "* black = empty\n",
    147 |     "* green = tree\n",
148 | "* orange = fire\n",
149 | "\n",
150 | "\n",
151 | "Note that for implementation simplicity, this mapping is only accurate once all 3 states are present in the space, which typically happens after the first 2-3 iterations."
152 | ]
153 | },
154 | {
155 | "cell_type": "code",
156 | "execution_count": null,
157 | "id": "f26abf87-0ceb-463a-b41c-0812948bd00e",
158 | "metadata": {
159 | "tags": []
160 | },
161 | "outputs": [],
162 | "source": [
163 | "from matplotlib.colors import ListedColormap\n",
164 | "\n",
165 | "cmap = ListedColormap(['black', 'green', 'orange'])"
166 | ]
167 | },
168 | {
169 | "cell_type": "code",
170 | "execution_count": null,
171 | "id": "126c261c-8000-4be9-a48c-6c6d531cb100",
172 | "metadata": {
173 | "tags": []
174 | },
175 | "outputs": [],
176 | "source": [
177 | "from IPython import display\n",
178 | "import time\n",
179 | "\n",
180 | "n_steps = 100\n",
181 | "\n",
182 | "for step in range(n_steps):\n",
183 | " grid = update(grid)\n",
184 | " display.clear_output(wait=True)\n",
185 | " plt.matshow(grid, cmap=cmap)\n",
186 | " plt.title(f\"Step {step} of {n_steps}\")\n",
187 | " plt.show()"
188 | ]
189 | },
190 | {
191 | "cell_type": "markdown",
192 | "id": "b7ef2e9c-1427-40bf-9d43-301ad09b1db8",
193 | "metadata": {},
194 | "source": [
    195 |     "If the tree growth (or crop replacement) is much faster, fuel accumulates more quickly, and we get a different dynamic."
196 | ]
197 | },
198 | {
199 | "cell_type": "code",
200 | "execution_count": null,
201 | "id": "51a2ee8b-99cc-4583-93fb-11e5a52eacfb",
202 | "metadata": {
203 | "tags": []
204 | },
205 | "outputs": [],
206 | "source": [
207 | "p, f = 0.35, 0.001 # probabilities of tree growth and fire\n",
208 | "\n",
209 | "grid = np.zeros(size, dtype=int)"
210 | ]
211 | },
212 | {
213 | "cell_type": "code",
214 | "execution_count": null,
215 | "id": "0ee687e2-0bf0-44eb-bd6e-f67dcf148c15",
216 | "metadata": {
217 | "tags": []
218 | },
219 | "outputs": [],
220 | "source": [
221 | "for step in range(n_steps):\n",
222 | " grid = update(grid)\n",
223 | " display.clear_output(wait=True)\n",
224 | " plt.matshow(grid, cmap=cmap)\n",
225 | " plt.title(f\"Step {step} of {n_steps}\")\n",
226 | " plt.show()"
227 | ]
228 | },
229 | {
230 | "cell_type": "markdown",
231 | "id": "07cc68d5-9f52-4240-b116-150f3c860568",
232 | "metadata": {},
233 | "source": [
234 | "This result looks a lot more like pure noise ... perhaps there is a tipping point probability that represents a boundary between these outcomes. \n",
235 | "\n",
236 | "That would be an interesting investigation, but today we'll return to the original configuration and look at how we might limit the disease spread in a traditional and practical way.\n",
237 | "\n",
238 | "### Intervention\n",
239 | "\n",
240 | "An imperfect intervention that is often applied to forest fires -- and might be suitable for a crop disease experiment -- is the creation of \"fire breaks.\"\n",
241 | "\n",
242 | "Fire breaks are trenches, roads, or other areas of bare ground cut through the forest in order to make it harder for fire to spread. These fire breaks are imperfect because wind can blow burning material across the break. In our model, though, the CA rules don't allow for failures of breaks, because they only address neighboring cells.\n",
243 | "\n",
244 | "* Could we model stochastic fire break failure? What would that look like?\n",
245 | "* What are some advantages and disadvantages to adding that additional realistic element to our model?\n",
246 | "\n",
247 | "For now, we'll keep the model as it is and consider the effects of breaks:\n",
248 | "\n",
249 | "How effective are they? Before we can even answer that, we need to think about how to measure effectiveness.\n",
    250 |     "1. We can look at pre- and post-intervention damage ratios\n",
251 | "1. But we also need to consider the cost: each break represents an area with no crops\n",
252 | "1. Consider the limit case: breaks everywhere, disease nowhere ... but no crops either. So we need to consider the cost of the intervention.\n",
253 | "\n",
254 | "### Lab Project\n",
255 | "\n",
256 | "1. Choose an initial configuration of the model -- to keep things simple, you may want to fix the step count -- and calculate the healthy yield at the end of the iterations.\n",
257 | "2. Choose a fire break design or pattern to experiment with -- it should be something easily scalable, such as a single north-south break in the center (which could be iterated by subdividing the remaining areas and adding identical breaks)\n",
258 | "3. For each level of intervention, measure\n",
259 | " 1. healthy yield at the end of the sequence\n",
260 | " 2. cost of the intervention (measured in cells or land area)\n",
261 | "4. Plot the yield against the intervention level\n",
262 | "\n",
263 | "When we're done, we'll discuss the results."
264 | ]
265 | },
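One possible starting point for steps 2 and 3 of the lab is sketched below. The helper names (`apply_breaks`, `healthy_yield`) and the evenly spaced vertical-break layout are our illustrative choices, not a prescribed design:

```python
import numpy as np

EMPTY, TREE, BURNING = 0, 1, 2  # same encoding as the model above

def apply_breaks(grid, n_breaks):
    """Carve `n_breaks` evenly spaced vertical breaks of bare ground.

    Break columns are forced EMPTY; since fire only spreads to
    neighboring cells, an empty column blocks east-west spread
    (as long as regrowth on the break is also suppressed)."""
    grid = grid.copy()
    cols = np.linspace(0, grid.shape[1] - 1, n_breaks + 2)[1:-1].astype(int)
    grid[:, cols] = EMPTY
    return grid, cols

def healthy_yield(grid):
    """Fraction of all cells holding healthy crop (TREE)."""
    return (grid == TREE).mean()
```

In the simulation loop, re-apply the breaks after each `update` call so nothing regrows on the break, then plot `healthy_yield` against the number of breaks, remembering that the break cells themselves are lost growing area.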
266 | {
267 | "cell_type": "code",
268 | "execution_count": null,
269 | "id": "08cf4d1a-9b30-46ad-bd4a-c105a6927cf0",
270 | "metadata": {},
271 | "outputs": [],
272 | "source": []
273 | }
274 | ],
275 | "metadata": {
276 | "kernelspec": {
277 | "display_name": "Python 3 (ipykernel)",
278 | "language": "python",
279 | "name": "python3"
280 | },
281 | "language_info": {
282 | "codemirror_mode": {
283 | "name": "ipython",
284 | "version": 3
285 | },
286 | "file_extension": ".py",
287 | "mimetype": "text/x-python",
288 | "name": "python",
289 | "nbconvert_exporter": "python",
290 | "pygments_lexer": "ipython3",
291 | "version": "3.9.16"
292 | }
293 | },
294 | "nbformat": 4,
295 | "nbformat_minor": 5
296 | }
297 |
--------------------------------------------------------------------------------
/04-Applying-C-S-Modeling.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "ebff0126-531a-4a05-a142-b71395ece701",
6 | "metadata": {},
7 | "source": [
8 | "# Applying Complex Systems Modeling\n",
9 | "\n",
    10 |     "Let's consider a practical scenario that can exhibit complex system dynamics and which we often need to address in business contexts (as well as social, political, and other spheres): __product introduction and viral adoption__.\n",
11 | "\n",
12 | "Naturally, there are many elements that can influence the viral success (or lack thereof) for a product and viral success is notoriously hard to craft or predict, notwithstanding all of those influencer courses that make promises.\n",
13 | "\n",
14 | "## General approach: network analysis\n",
15 | "\n",
16 | "We can imagine the various users, influencers, and potential users of our product as nodes in a graph or network which captures their relationships.\n",
17 | "\n",
18 | "If we understand the characteristics of the network, we can learn about how our product might spread.\n",
19 | "\n",
    20 |     "For example, if the network space has attractor states representing little connection vs. massive connection, we would want to understand\n",
21 | "* if our product is likely to enter the system in the \"massive connection\" basin of attraction (which we'd like)\n",
22 | "* whether the system exhibits tipping point behavior at the \"edge\" between the low and high connection attractor regions\n",
23 | " * what parameters the system is more or less sensitive to and which might allow us to manipulate or at least plan for better odds\n",
24 | "\n",
    25 |     "<img src='images/Energy_landscape.png'>\n",
26 | "\n",
27 | "In this notebook, we'll focus first on investigating a flavor of network that is closer to real-world social connections than the E-R graphs we looked at earlier.\n",
28 | "\n",
29 | "We'll aim to learn the answers to the above questions through experiments.\n",
30 | "\n",
31 | "## Small world graphs"
32 | ]
33 | },
34 | {
35 | "cell_type": "markdown",
36 | "id": "9bb50d11-e928-4c2a-99ae-9b54a9def284",
37 | "metadata": {},
38 | "source": [
    39 |     "Small world graphs, or small world networks, are a type of graph in which most nodes can be reached from every other node in a small number of steps. This type of network was first described in the 1960s by social psychologist __Stanley Milgram__ in his \"small world experiment\". The experiment highlighted the concept of \"six degrees of separation,\" suggesting that any two people on Earth could be connected through a chain of six acquaintances or fewer.\n",
40 | "\n",
    41 |     "In Milgram's experiment, he sent packages to 160 people in Omaha, Nebraska, asking each to forward the package to a friend or acquaintance who they thought would bring it closer to a designated final individual: a stockbroker living in Boston. Surprisingly, the packages that completed the journey reached the stockbroker in an average of roughly six steps, hence the term \"six degrees of separation\".\n",
42 | "\n",
43 | "https://en.wikipedia.org/wiki/Small-world_experiment"
44 | ]
45 | },
46 | {
47 | "cell_type": "markdown",
48 | "id": "59cc760b-aadc-41c3-b15f-41787f93919c",
49 | "metadata": {},
50 | "source": [
51 | "This discovery has had far-reaching implications, influencing several fields from sociology to computer science. \n",
52 | "\n",
53 | "In the late 1990s, mathematicians __Duncan Watts__ and __Steven Strogatz__ formalized the concept of small world networks in a mathematical context. \n",
54 | "\n",
55 | "They proposed a simple model for generating such networks, starting with a regular lattice and then rewiring some of its edges at random. This model revealed that even a small amount of rewiring could give rise to a network with both high clustering (like a regular lattice) and short average path lengths (like a random graph), a hallmark of small-world networks."
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "execution_count": null,
61 | "id": "7cccf94a-7d96-40ba-9faa-5d8e1847831c",
62 | "metadata": {
63 | "tags": []
64 | },
65 | "outputs": [],
66 | "source": [
67 | "import networkx as nx\n",
68 | "import matplotlib.pyplot as plt\n",
69 | "\n",
70 | "# Create a Watts-Strogatz small world graph\n",
71 | "# n = number of nodes\n",
72 | "# k = each node is connected to k nearest neighbors in ring topology\n",
73 | "# p = the probability of rewiring each edge\n",
74 | "n, k, p = 20, 4, 0.5\n",
75 | "G = nx.watts_strogatz_graph(n, k, p)\n",
76 | "\n",
77 | "nx.draw(G, with_labels=True)\n",
78 | "plt.show()"
79 | ]
80 | },
81 | {
82 | "cell_type": "markdown",
83 | "id": "4c72a66d-d800-42c4-9685-0d39f8c74526",
84 | "metadata": {},
85 | "source": [
86 | "Read more about the Watts-Strogatz model at https://en.wikipedia.org/wiki/Watts%E2%80%93Strogatz_model\n",
87 | "\n",
    88 |     "> Note that although this model has some statistics and topological characteristics similar to organic social networks, it also differs in significant ways. Simple generative processes for realistic social networks are an ongoing area of research, and we've chosen the simplest model from this family for this introductory topic."
89 | ]
90 | },
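We can check the small-world hallmark (high clustering plus short paths) directly with NetworkX's built-in statistics. This sketch uses `connected_watts_strogatz_graph`, which resamples until the graph is connected, so the average path length is always defined:

```python
import networkx as nx

# Compare a pure lattice (p = 0), a lightly rewired graph, and a
# heavily rewired one: clustering falls slowly, path length falls fast
for p in (0.0, 0.1, 1.0):
    G = nx.connected_watts_strogatz_graph(200, 6, p, seed=1)
    print(f"p={p}: clustering={nx.average_clustering(G):.3f}, "
          f"avg path length={nx.average_shortest_path_length(G):.2f}")
```

The interesting regime is small `p`: clustering stays near the lattice value while the average path length collapses toward that of a random graph.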
91 | {
92 | "cell_type": "code",
93 | "execution_count": null,
94 | "id": "fac531cb-361f-4e47-9f0a-05c5880a0eba",
95 | "metadata": {
96 | "tags": []
97 | },
98 | "outputs": [],
99 | "source": [
100 | "nx.is_connected(G)"
101 | ]
102 | },
103 | {
104 | "cell_type": "markdown",
105 | "id": "d9f51fac-a332-4537-9318-8ba7188716f2",
106 | "metadata": {},
107 | "source": [
    108 |     "It doesn't seem surprising that the network is connected, given the process that generated it.\n",
109 | "\n",
110 | "Let's try another, bigger graph with different parameters."
111 | ]
112 | },
113 | {
114 | "cell_type": "code",
115 | "execution_count": null,
116 | "id": "c4ee68c7-5867-46f7-90d8-0fcb748c5481",
117 | "metadata": {
118 | "tags": []
119 | },
120 | "outputs": [],
121 | "source": [
122 | "n, k, p = 100, 3, 0.01\n",
123 | "G = nx.watts_strogatz_graph(n, k, p)\n",
124 | "\n",
125 | "nx.draw(G, with_labels=True)\n",
126 | "plt.show()"
127 | ]
128 | },
129 | {
130 | "cell_type": "code",
131 | "execution_count": null,
132 | "id": "1d2f390c-1280-4edd-9487-c066f1903790",
133 | "metadata": {
134 | "tags": []
135 | },
136 | "outputs": [],
137 | "source": [
138 | "nx.is_connected(G)"
139 | ]
140 | },
141 | {
142 | "cell_type": "markdown",
143 | "id": "ba73e44b-d048-40b2-ad62-9405f08d4c5e",
144 | "metadata": {},
145 | "source": [
146 | "Maybe all of these graphs -- or nearly all -- are connected...\n",
147 | "\n",
148 | "Let's try one with a larger \"population\""
149 | ]
150 | },
151 | {
152 | "cell_type": "code",
153 | "execution_count": null,
154 | "id": "46f64d05-9225-42cd-96d5-542187eab03d",
155 | "metadata": {
156 | "tags": []
157 | },
158 | "outputs": [],
159 | "source": [
160 | "n, k, p = 10000, 3, 0.01\n",
161 | "G = nx.watts_strogatz_graph(n, k, p)\n",
162 | "nx.is_connected(G)"
163 | ]
164 | },
165 | {
166 | "cell_type": "markdown",
167 | "id": "c1cfda20-6a41-47b3-ab7a-3444fa2802bf",
168 | "metadata": {},
169 | "source": [
170 | "We could experiment for a few minutes with different combinations of parameters but it's not obvious what's going on. \n",
171 | "\n",
172 | "We can be more systematic by running a large number of simulations and counting the outcomes.\n",
173 | "\n",
174 | "With `n, k, p = 10000, 3, 0.01`, run 100 simulations and look at the proportion that are connected."
175 | ]
176 | },
177 | {
178 | "cell_type": "code",
179 | "execution_count": null,
180 | "id": "18d7bef2-eb89-4b07-aeff-20816f0ac43b",
181 | "metadata": {
182 | "tags": []
183 | },
184 | "outputs": [],
185 | "source": []
186 | },
187 | {
188 | "cell_type": "markdown",
189 | "id": "eaa986e8-201f-4cd4-9dd8-563344f406f5",
190 | "metadata": {},
191 | "source": [
192 | "Now try `n, k, p = 10000, 3, 0.5` and repeat the experiment"
193 | ]
194 | },
195 | {
196 | "cell_type": "code",
197 | "execution_count": null,
198 | "id": "5bc38b71-f4ec-4686-88c5-7f7d6fae71ed",
199 | "metadata": {
200 | "tags": []
201 | },
202 | "outputs": [],
203 | "source": []
204 | },
205 | {
206 | "cell_type": "markdown",
207 | "id": "3e904ea1-20e7-4533-bff3-15e6b39e4473",
208 | "metadata": {},
209 | "source": [
210 | "This is better than one-off sampling, but it's not very systematic.\n",
211 | "\n",
212 | "Let's fix the graph size at 10,000 and experiment with `k` and `p`\n",
213 | "\n",
214 | "To keep it simple, we'll experiment with `k` first. Leave `p` at 0.5 and calculate the connectedness proportion for values of `k` from 2 up through 6.\n",
215 | "\n",
216 | "Plot the results"
217 | ]
218 | },
219 | {
220 | "cell_type": "code",
221 | "execution_count": null,
222 | "id": "53f324b8-6c8a-4939-9902-8c555e996764",
223 | "metadata": {
224 | "tags": []
225 | },
226 | "outputs": [],
227 | "source": []
228 | },
229 | {
230 | "cell_type": "markdown",
231 | "id": "a6b8ef51-8952-4f4d-a535-ffdbc9692449",
232 | "metadata": {},
233 | "source": [
234 | "What do you notice about the results?\n",
235 | "\n",
236 | "Since we have 2 parameters we're interested in ($k$ and $p$), if we had more time it would make sense to plot a 3-D graph (connectedness probability as a function of $k$ and $p$). \n",
237 | "\n",
238 | "But we can take a shortcut that will save some time (both coding and running).\n",
239 | "\n",
240 | "If there is an interesting boundary in your previous plot, pick the integer value on either side (since the parameter $k$ represents a whole number of neighbor nodes)\n",
241 | "\n",
242 | "For each of those two values, calculate the connected proportion when varying the $p$ parameter (probability of rewiring) across this set of possible values:"
243 | ]
244 | },
245 | {
246 | "cell_type": "code",
247 | "execution_count": null,
248 | "id": "411c6d12-2920-4281-8167-13f545ac28c5",
249 | "metadata": {
250 | "tags": []
251 | },
252 | "outputs": [],
253 | "source": [
254 | "probs = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]"
255 | ]
256 | },
257 | {
258 | "cell_type": "markdown",
259 | "id": "856eefcb-c2d8-431f-b7ef-95bdea200a11",
260 | "metadata": {},
261 | "source": [
262 | "Plot the results."
263 | ]
264 | },
265 | {
266 | "cell_type": "code",
267 | "execution_count": null,
268 | "id": "13a2bd29-02bd-450e-bac8-2da2aaee3868",
269 | "metadata": {
270 | "tags": []
271 | },
272 | "outputs": [],
273 | "source": []
274 | },
275 | {
276 | "cell_type": "markdown",
277 | "id": "2ee6ce68-5e3e-43f5-bbe1-206fba576acb",
278 | "metadata": {},
279 | "source": [
280 | "* What does this tell us about the sensitivity of this graph family to the two parameters?\n",
281 | "\n",
282 | "* Can you think of realistic scenarios where the $k$ (neighbor connection) might take on values between 2 and 6?\n",
283 | "\n",
284 | "* If this graph were sufficiently similar to our customer and influencer graph, would this be \"good news\" or \"bad news\"?\n",
285 | "\n",
286 | "* What could we do to increase our chances of success?"
287 | ]
288 | },
289 | {
290 | "cell_type": "markdown",
291 | "id": "51922dad-5801-4fa4-97fb-cd77ff7c3015",
292 | "metadata": {},
293 | "source": [
294 | "### Going further\n",
295 | "\n",
296 | "Next steps could include simulating\n",
297 | "* the spread of the product through the network, measuring spread as a function of time\n",
298 | "* the entry of a competing product, spreading elsewhere in the network, to see\n",
299 | " * how the relative timing of product launch affects final market share in a \"first-in wins\" scenario\n",
300 | " * long-term dynamics of a multiproduct market with low or moderate switching costs\n",
301 | "* different types of people (nodes) and relationships (edges) with different and probabilistic behaviors\n",
302 | "\n",
303 | "> In some ways, modeling this product spread may remind you of SEIR models used in epidemiology and other population-spread problems. Those are great tools too -- what are the pros and cons of the SEIR approach vs. a network simulation approach?\n",
304 | "\n",
305 | "And of course we could try other graph-building algorithms that might have better similarity to our target population.\n"
306 | ]
307 | },
308 | {
309 | "cell_type": "code",
310 | "execution_count": null,
311 | "id": "86bf8fb9-109b-44db-8929-aff0f3adf84f",
312 | "metadata": {},
313 | "outputs": [],
314 | "source": []
315 | }
316 | ],
317 | "metadata": {
318 | "kernelspec": {
319 | "display_name": "Python 3 (ipykernel)",
320 | "language": "python",
321 | "name": "python3"
322 | },
323 | "language_info": {
324 | "codemirror_mode": {
325 | "name": "ipython",
326 | "version": 3
327 | },
328 | "file_extension": ".py",
329 | "mimetype": "text/x-python",
330 | "name": "python",
331 | "nbconvert_exporter": "python",
332 | "pygments_lexer": "ipython3",
333 | "version": "3.9.16"
334 | }
335 | },
336 | "nbformat": 4,
337 | "nbformat_minor": 5
338 | }
339 |
--------------------------------------------------------------------------------
/04a-Solution-Applying.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "ebff0126-531a-4a05-a142-b71395ece701",
6 | "metadata": {},
7 | "source": [
8 | "# Applying Complex Systems Modeling\n",
9 | "\n",
10 | "Let's consider a practical scenario that can exhibit complex system dynamics and which we often need to address in business contexts (as well as social, political, and other spheres): __product introduction and viral adoption__.\n",
11 | "\n",
12 | "Naturally, many elements can influence the viral success (or lack thereof) of a product, and viral success is notoriously hard to craft or predict, notwithstanding all of those influencer courses that make promises.\n",
13 | "\n",
14 | "## General approach: network analysis\n",
15 | "\n",
16 | "We can imagine the various users, influencers, and potential users of our product as nodes in a graph or network which captures their relationships.\n",
17 | "\n",
18 | "If we understand the characteristics of the network, we can learn about how our product might spread.\n",
19 | "\n",
20 | "For example, if the network space has attractor states representing little connection vs. massive connection, we would want to understand\n",
21 | "* if our product is likely to enter the system in the \"massive connection\" basin of attraction (which we'd like)\n",
22 | "* whether the system exhibits tipping point behavior at the \"edge\" between the low and high connection attractor regions\n",
23 | " * what parameters the system is more or less sensitive to and which might allow us to manipulate or at least plan for better odds\n",
24 | "\n",
25 | "\n",
26 | "\n",
27 | "In this notebook, we'll focus first on investigating a flavor of network that is closer to real-world social connections than the E-R graphs we looked at earlier.\n",
28 | "\n",
29 | "We'll aim to learn the answers to the above questions through experiments.\n",
30 | "\n",
31 | "## Small world graphs"
32 | ]
33 | },
34 | {
35 | "cell_type": "markdown",
36 | "id": "9bb50d11-e928-4c2a-99ae-9b54a9def284",
37 | "metadata": {},
38 | "source": [
39 | "Small world graphs, or small world networks, are a type of graph in which most nodes can be reached from every other node by a small number of steps. This type of network was famously explored in the 1960s by social psychologist __Stanley Milgram__ in his \"small world experiment\". The experiment highlighted the concept of \"six degrees of separation,\" suggesting that any two people on Earth could be connected through a chain of six acquaintances or fewer.\n",
40 | "\n",
41 | "In Milgram's experiment, he sent packages to 160 random people living in Omaha, Nebraska, asking them to forward the package to a friend or acquaintance who they thought would bring the package closer to a designated final individual, a stockbroker living in Boston. The chains that completed the journey reached the stockbroker in an average of about six steps, hence the term \"six degrees of separation\".\n",
42 | "\n",
43 | "https://en.wikipedia.org/wiki/Small-world_experiment"
44 | ]
45 | },
46 | {
47 | "cell_type": "markdown",
48 | "id": "59cc760b-aadc-41c3-b15f-41787f93919c",
49 | "metadata": {},
50 | "source": [
51 | "This discovery has had far-reaching implications, influencing several fields from sociology to computer science. \n",
52 | "\n",
53 | "In the late 1990s, mathematicians __Duncan Watts__ and __Steven Strogatz__ formalized the concept of small world networks in a mathematical context. \n",
54 | "\n",
55 | "They proposed a simple model for generating such networks, starting with a regular lattice and then rewiring some of its edges at random. This model revealed that even a small amount of rewiring could give rise to a network with both high clustering (like a regular lattice) and short average path lengths (like a random graph), a hallmark of small-world networks."
56 | ]
57 | },
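The following sketch (an illustration added here, not part of the original lab) demonstrates that hallmark directly by comparing a pure ring lattice (`p = 0`) with a lightly rewired graph. It uses networkx's `connected_watts_strogatz_graph`, a variant that retries until the result is connected, so the average-path-length measure is well defined.

```python
import networkx as nx

n, k = 1000, 10

# p = 0: a regular ring lattice -- high clustering, long paths
lattice = nx.watts_strogatz_graph(n, k, 0)

# p = 0.1: a small amount of rewiring; the "connected" variant retries
# graph generation until the result is connected
small_world = nx.connected_watts_strogatz_graph(n, k, 0.1)

for name, g in [("ring lattice", lattice), ("small world", small_world)]:
    print(f"{name}: clustering={nx.average_clustering(g):.3f}, "
          f"avg path length={nx.average_shortest_path_length(g):.2f}")
```

With these parameters, the rewired graph's average path length drops by roughly an order of magnitude relative to the lattice, while its clustering remains comparatively high.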
58 | {
59 | "cell_type": "code",
60 | "execution_count": null,
61 | "id": "7cccf94a-7d96-40ba-9faa-5d8e1847831c",
62 | "metadata": {
63 | "tags": []
64 | },
65 | "outputs": [],
66 | "source": [
67 | "import networkx as nx\n",
68 | "import matplotlib.pyplot as plt\n",
69 | "\n",
70 | "# Create a Watts-Strogatz small world graph\n",
71 | "# n = number of nodes\n",
72 | "# k = each node is connected to k nearest neighbors in ring topology\n",
73 | "# p = the probability of rewiring each edge\n",
74 | "n, k, p = 20, 4, 0.5\n",
75 | "G = nx.watts_strogatz_graph(n, k, p)\n",
76 | "\n",
77 | "nx.draw(G, with_labels=True)\n",
78 | "plt.show()"
79 | ]
80 | },
81 | {
82 | "cell_type": "markdown",
83 | "id": "4c72a66d-d800-42c4-9685-0d39f8c74526",
84 | "metadata": {},
85 | "source": [
86 | "Read more about the Watts-Strogatz model at https://en.wikipedia.org/wiki/Watts%E2%80%93Strogatz_model\n",
87 | "\n",
88 | "> Note that although this model has some statistical and topological characteristics similar to organic social networks, it also differs in significant ways. Simple generative processes for organic-like networks are an ongoing area of research, and we've chosen the simplest model from this family for this introductory topic."
89 | ]
90 | },
91 | {
92 | "cell_type": "code",
93 | "execution_count": null,
94 | "id": "fac531cb-361f-4e47-9f0a-05c5880a0eba",
95 | "metadata": {
96 | "tags": []
97 | },
98 | "outputs": [],
99 | "source": [
100 | "nx.is_connected(G)"
101 | ]
102 | },
103 | {
104 | "cell_type": "markdown",
105 | "id": "d9f51fac-a332-4537-9318-8ba7188716f2",
106 | "metadata": {},
107 | "source": [
108 | "It doesn't seem surprising that the network is connected, given the process that generated it.\n",
109 | "\n",
110 | "Let's try another, bigger graph with different parameters."
111 | ]
112 | },
113 | {
114 | "cell_type": "code",
115 | "execution_count": null,
116 | "id": "c4ee68c7-5867-46f7-90d8-0fcb748c5481",
117 | "metadata": {
118 | "tags": []
119 | },
120 | "outputs": [],
121 | "source": [
122 | "n, k, p = 100, 3, 0.01\n",
123 | "G = nx.watts_strogatz_graph(n, k, p)\n",
124 | "\n",
125 | "nx.draw(G, with_labels=True)\n",
126 | "plt.show()"
127 | ]
128 | },
129 | {
130 | "cell_type": "code",
131 | "execution_count": null,
132 | "id": "1d2f390c-1280-4edd-9487-c066f1903790",
133 | "metadata": {
134 | "tags": []
135 | },
136 | "outputs": [],
137 | "source": [
138 | "nx.is_connected(G)"
139 | ]
140 | },
141 | {
142 | "cell_type": "markdown",
143 | "id": "ba73e44b-d048-40b2-ad62-9405f08d4c5e",
144 | "metadata": {},
145 | "source": [
146 | "Maybe all of these graphs -- or nearly all -- are connected...\n",
147 | "\n",
148 | "Let's try one with a larger \"population\"."
149 | ]
150 | },
151 | {
152 | "cell_type": "code",
153 | "execution_count": null,
154 | "id": "46f64d05-9225-42cd-96d5-542187eab03d",
155 | "metadata": {
156 | "tags": []
157 | },
158 | "outputs": [],
159 | "source": [
160 | "n, k, p = 10000, 3, 0.01\n",
161 | "G = nx.watts_strogatz_graph(n, k, p)\n",
162 | "nx.is_connected(G)"
163 | ]
164 | },
165 | {
166 | "cell_type": "markdown",
167 | "id": "c1cfda20-6a41-47b3-ab7a-3444fa2802bf",
168 | "metadata": {},
169 | "source": [
170 | "We could experiment for a few minutes with different combinations of parameters but it's not obvious what's going on. \n",
171 | "\n",
172 | "We can be more systematic by running a large number of simulations and counting the outcomes.\n",
173 | "\n",
174 | "With `n, k, p = 10000, 3, 0.01`, run 100 simulations and look at the proportion that are connected."
175 | ]
176 | },
177 | {
178 | "cell_type": "code",
179 | "execution_count": null,
180 | "id": "18d7bef2-eb89-4b07-aeff-20816f0ac43b",
181 | "metadata": {
182 | "tags": []
183 | },
184 | "outputs": [],
185 | "source": [
186 | "sum(nx.is_connected(nx.watts_strogatz_graph(n, k, p)) for _ in range(100)) / 100"
187 | ]
188 | },
189 | {
190 | "cell_type": "markdown",
191 | "id": "eaa986e8-201f-4cd4-9dd8-563344f406f5",
192 | "metadata": {},
193 | "source": [
194 | "Now try `n, k, p = 10000, 3, 0.5` and repeat the experiment"
195 | ]
196 | },
197 | {
198 | "cell_type": "code",
199 | "execution_count": null,
200 | "id": "5bc38b71-f4ec-4686-88c5-7f7d6fae71ed",
201 | "metadata": {
202 | "tags": []
203 | },
204 | "outputs": [],
205 | "source": [
206 | "n, k, p = 10000, 3, 0.5\n",
207 | "sum(nx.is_connected(nx.watts_strogatz_graph(n, k, p)) for _ in range(100)) / 100"
208 | ]
209 | },
210 | {
211 | "cell_type": "markdown",
212 | "id": "3e904ea1-20e7-4533-bff3-15e6b39e4473",
213 | "metadata": {},
214 | "source": [
215 | "This is better than one-off sampling, but it's not very systematic.\n",
216 | "\n",
217 | "Let's fix the graph size at 10,000 and experiment with `k` and `p`\n",
218 | "\n",
219 | "To keep it simple, we'll experiment with `k` first. Leave `p` at 0.5 and calculate the connectedness proportion for values of `k` from 2 up through 6.\n",
220 | "\n",
221 | "Plot the results"
222 | ]
223 | },
224 | {
225 | "cell_type": "code",
226 | "execution_count": null,
227 | "id": "53f324b8-6c8a-4939-9902-8c555e996764",
228 | "metadata": {
229 | "tags": []
230 | },
231 | "outputs": [],
232 | "source": [
233 | "x = range(2,7)"
234 | ]
235 | },
236 | {
237 | "cell_type": "code",
238 | "execution_count": null,
239 | "id": "e292e154-f9fa-4191-9f28-541ea49dc917",
240 | "metadata": {
241 | "tags": []
242 | },
243 | "outputs": [],
244 | "source": [
245 | "conn = [0.01 * sum(nx.is_connected(nx.watts_strogatz_graph(n,k,p)) for _ in range(100)) for k in x]"
246 | ]
247 | },
248 | {
249 | "cell_type": "code",
250 | "execution_count": null,
251 | "id": "e3bcee73-19c1-4877-b1f0-73740350f1f1",
252 | "metadata": {
253 | "tags": []
254 | },
255 | "outputs": [],
256 | "source": [
257 | "plt.plot(x, conn)"
258 | ]
259 | },
260 | {
261 | "cell_type": "markdown",
262 | "id": "a6b8ef51-8952-4f4d-a535-ffdbc9692449",
263 | "metadata": {},
264 | "source": [
265 | "What do you notice about the results?\n",
266 | "\n",
267 | "Since we have 2 parameters we're interested in ($k$ and $p$), if we had more time it would make sense to plot a 3-D graph (connectedness probability as a function of $k$ and $p$). \n",
268 | "\n",
269 | "But we can take a shortcut that will save some time (both coding and running).\n",
270 | "\n",
271 | "If there is an interesting boundary in your previous plot, pick the integer value on either side (since the parameter $k$ represents a whole number of neighbor nodes)\n",
272 | "\n",
273 | "For each of those two values, calculate the connected proportion when varying the $p$ parameter (probability of rewiring) across this set of possible values:"
274 | ]
275 | },
276 | {
277 | "cell_type": "code",
278 | "execution_count": null,
279 | "id": "411c6d12-2920-4281-8167-13f545ac28c5",
280 | "metadata": {
281 | "tags": []
282 | },
283 | "outputs": [],
284 | "source": [
285 | "probs = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]"
286 | ]
287 | },
288 | {
289 | "cell_type": "markdown",
290 | "id": "856eefcb-c2d8-431f-b7ef-95bdea200a11",
291 | "metadata": {},
292 | "source": [
293 | "Plot the results."
294 | ]
295 | },
296 | {
297 | "cell_type": "code",
298 | "execution_count": null,
299 | "id": "13a2bd29-02bd-450e-bac8-2da2aaee3868",
300 | "metadata": {
301 | "tags": []
302 | },
303 | "outputs": [],
304 | "source": [
305 | "conn_k3 = [0.01 * sum(nx.is_connected(nx.watts_strogatz_graph(n,3,p)) for _ in range(100)) for p in probs]\n",
306 | "conn_k4 = [0.01 * sum(nx.is_connected(nx.watts_strogatz_graph(n,4,p)) for _ in range(100)) for p in probs]"
307 | ]
308 | },
309 | {
310 | "cell_type": "code",
311 | "execution_count": null,
312 | "id": "e193b20e-7725-4f4a-8ce4-7c08090ab381",
313 | "metadata": {
314 | "tags": []
315 | },
316 | "outputs": [],
317 | "source": [
318 | "plt.plot(probs, conn_k3, label='k=3')\n",
319 | "plt.plot(probs, conn_k4, label='k=4')\n",
320 | "plt.legend()"
321 | ]
322 | },
323 | {
324 | "cell_type": "markdown",
325 | "id": "2ee6ce68-5e3e-43f5-bbe1-206fba576acb",
326 | "metadata": {},
327 | "source": [
328 | "* What does this tell us about the sensitivity of this graph family to the two parameters?\n",
329 | "\n",
330 | "* Can you think of realistic scenarios where the $k$ (neighbor connection) might take on values between 2 and 6?\n",
331 | "\n",
332 | "* If this graph were sufficiently similar to our customer and influencer graph, would this be \"good news\" or \"bad news\"?\n",
333 | "\n",
334 | "* What could we do to increase our chances of success?"
335 | ]
336 | },
337 | {
338 | "cell_type": "markdown",
339 | "id": "51922dad-5801-4fa4-97fb-cd77ff7c3015",
340 | "metadata": {},
341 | "source": [
342 | "### Going further\n",
343 | "\n",
344 | "Next steps could include simulating\n",
345 | "* the spread of the product through the network, measuring spread as a function of time\n",
346 | "* the entry of a competing product, spreading elsewhere in the network, to see\n",
347 | " * how the relative timing of product launch affects final market share in a \"first-in wins\" scenario\n",
348 | " * long-term dynamics of a multiproduct market with low or moderate switching costs\n",
349 | "* different types of people (nodes) and relationships (edges) with different and probabilistic behaviors\n",
350 | "\n",
351 | "> In some ways, modeling this product spread may remind you of SEIR models used in epidemiology and other population-spread problems. Those are great tools too -- what are the pros and cons of the SEIR approach vs. a network simulation approach?\n",
352 | "\n",
353 | "And of course we could try other graph-building algorithms that might have better similarity to our target population.\n"
354 | ]
355 | },
356 | {
357 | "cell_type": "code",
358 | "execution_count": null,
359 | "id": "86bf8fb9-109b-44db-8929-aff0f3adf84f",
360 | "metadata": {},
361 | "outputs": [],
362 | "source": []
363 | }
364 | ],
365 | "metadata": {
366 | "kernelspec": {
367 | "display_name": "Python 3 (ipykernel)",
368 | "language": "python",
369 | "name": "python3"
370 | },
371 | "language_info": {
372 | "codemirror_mode": {
373 | "name": "ipython",
374 | "version": 3
375 | },
376 | "file_extension": ".py",
377 | "mimetype": "text/x-python",
378 | "name": "python",
379 | "nbconvert_exporter": "python",
380 | "pygments_lexer": "ipython3",
381 | "version": "3.9.16"
382 | }
383 | },
384 | "nbformat": 4,
385 | "nbformat_minor": 5
386 | }
387 |
--------------------------------------------------------------------------------
/02-Networks-Automata.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Exploring Some Complex System Dynamics\n",
8 | "\n",
9 | "We want to understand and draw actionable conclusions about some of these systems. Moreover, we want to do so \n",
10 | "* beforehand (e.g., knowledge of an impending financial crisis is better than observations while it's happening)\n",
11 | "* in an actionable way (a plan to avoid or mitigate the crisis is more useful than a mere forecast of its occurrence)\n",
12 | "\n",
13 | "In many cases, we observe phase-changes or other specific outcomes in similar systems, and we want to inspect the conditions under which we can avoid (or encourage) these outcomes in our own systems.\n",
14 | "\n",
15 | "## Generative models and simulations\n",
16 | "\n",
17 | "A key mechanism for investigating complex dynamics is a __generative model__.\n",
18 | "\n",
19 | "Generative models are mathematical models, typically implemented via computation, which can yield information about some aspects of a system. They are, in a way, machine learning models, but they differ from the most common, predictive ML models, which look at a set of outcomes (data) and try to predict future results (next quarter's sales) or more general results (transcribing speech).\n",
20 | "\n",
21 | "> Generative models posit some __data generating process__ which may be similar to the dynamics in our systems.\n",
22 | "\n",
23 | "The models -- also called simulations -- can then produce lots of outcomes based on some initial parameters and the data generating process. \n",
24 | "\n",
25 | "By looking at the process, the parameters, and the outcomes, we can learn about the behavior of such a system as parameters are changed.\n",
26 | "\n",
27 | "We can then become aware of risks or opportunities that might otherwise be hidden."
28 | ]
29 | },
30 | {
31 | "cell_type": "markdown",
32 | "metadata": {},
33 | "source": [
34 | "## Some key systems models\n",
35 | "\n",
36 | "Today, we'll look at a few classes of models that can reveal some of the challenging dynamics of complexity.\n",
37 | "\n",
38 | "We'll take a look at ...\n",
39 | "* network models\n",
40 | "* automata\n",
41 | "\n",
42 | "Next time we'll look at\n",
43 | "* agent-based models\n",
44 | "* path dependence\n",
45 | "\n",
46 | "along with examples where they might apply."
47 | ]
48 | },
49 | {
50 | "cell_type": "markdown",
51 | "metadata": {},
52 | "source": [
53 | "## Network models\n",
54 | "\n",
55 | "Many processes exhibit the consequences of network effects. These include both natural and social systems, including business systems.\n",
56 | "\n",
57 | "We may want to exploit network effects to\n",
58 | "* generate sales\n",
59 | "* dominate/control a product category\n",
60 | "* spread positive sentiment around our product or firm\n",
61 | "\n",
62 | "Conversely we may want to avoid or disrupt network effects to\n",
63 | "* lower systemic risks (\"domino effect\" cascading failures)\n",
64 | "* protect intellectual property (limiting the spread of illicit use)\n",
65 | "* protect secrets and first-mover advantage\n",
66 | "\n",
67 | "But not every network exhibits the massive spread we may want to encourage or avoid.\n",
68 | "\n",
69 | "__We can experiment on synthetic networks and learn critical dynamics__"
70 | ]
71 | },
72 | {
73 | "cell_type": "markdown",
74 | "metadata": {},
75 | "source": [
76 | "### Erdos-Renyi Graphs\n",
77 | "\n",
78 | "We can think of our scenario of interest as a graph:\n",
79 | "* a set of nodes, which might represent individuals or firms,\n",
80 | "* and a set of edges connecting pairs of nodes, which might represent communications, transactions, or business relationships.\n",
81 | "\n",
82 | "The Erdos-Renyi model (Paul Erdős, Alfréd Rényi, Edgar Gilbert, separate work ~1959) considers a family of graphs that contain some fixed number of nodes __n__ and some fixed probability __p__ that any two nodes are connected.\n",
83 | "\n",
84 | "This is a simplistic model, but it produces interesting behavior.\n",
85 | "\n",
86 | "If __p__ is close to zero, we can easily imagine that the graph is unlikely to be __connected__ -- i.e., to provide some route between every pair of nodes.\n",
87 | "\n",
88 | "On the other hand, if __p__ is close to one, it is not surprising that the graph ends up connected.\n",
89 | "\n",
90 | "In between, things get interesting. Let's take a quick look at how the probability of an E-R graph being connected varies as __p__ changes.\n",
91 | "\n",
92 | "How do we do this?\n",
93 | "\n",
94 | "The easiest way to explore this -- and a mechanism I recommend because it works even when other methods are intractable or deeply complicated -- is to simulate the system and count outcomes."
95 | ]
96 | },
97 | {
98 | "cell_type": "code",
99 | "execution_count": null,
100 | "metadata": {},
101 | "outputs": [],
102 | "source": [
103 | "import networkx as nx\n",
104 | "import numpy as np\n",
105 | "\n",
106 | "def make_er_graph(n, p):\n",
107 | " G = nx.Graph()\n",
108 | " nodes = range(n)\n",
109 | " G.add_nodes_from(nodes)\n",
110 | " G.add_edges_from( (i, j) for i in nodes for j in nodes if i > j and np.random.random() < p )\n",
111 | " return G\n",
112 | "\n",
113 | "g = make_er_graph(20, 0.1)\n",
114 | "nx.draw(g)"
115 | ]
116 | },
117 | {
118 | "cell_type": "code",
119 | "execution_count": null,
120 | "metadata": {},
121 | "outputs": [],
122 | "source": [
123 | "nx.is_connected(g)"
124 | ]
125 | },
126 | {
127 | "cell_type": "code",
128 | "execution_count": null,
129 | "metadata": {},
130 | "outputs": [],
131 | "source": [
132 | "g1 = make_er_graph(20, 0.5)\n",
133 | "nx.draw(g1)"
134 | ]
135 | },
136 | {
137 | "cell_type": "code",
138 | "execution_count": null,
139 | "metadata": {},
140 | "outputs": [],
141 | "source": [
142 | "nx.is_connected(g1)"
143 | ]
144 | },
145 | {
146 | "cell_type": "code",
147 | "execution_count": null,
148 | "metadata": {},
149 | "outputs": [],
150 | "source": [
151 | "def test_connectivity_for_random_graph(n, p):\n",
152 | " return nx.is_connected(make_er_graph(n, p))\n",
153 | "\n",
154 | "def prob_connected(n, p, samples):\n",
155 | " return sum( (test_connectivity_for_random_graph(n, p) for i in range(samples)) ) / samples\n",
156 | "\n",
157 | "prob_connected(20, 0.2, 100)"
158 | ]
159 | },
160 | {
161 | "cell_type": "code",
162 | "execution_count": null,
163 | "metadata": {},
164 | "outputs": [],
165 | "source": [
166 | "from matplotlib import pyplot as plt\n",
167 | "\n",
168 | "n = 8\n",
169 | "samples = 200\n",
170 | "edge_probs = np.linspace(0, 1, 20)\n",
171 | "connectivity_probs = [prob_connected(n, p, samples) for p in edge_probs]\n",
172 | "\n",
173 | "plt.plot(edge_probs, connectivity_probs)"
174 | ]
175 | },
176 | {
177 | "cell_type": "markdown",
178 | "metadata": {},
179 | "source": [
180 | "One interesting phenomenon is that this transition point becomes more sudden as the graph grows"
181 | ]
182 | },
183 | {
184 | "cell_type": "code",
185 | "execution_count": null,
186 | "metadata": {},
187 | "outputs": [],
188 | "source": [
189 | "n = 20\n",
190 | "connectivity_probs = [prob_connected(n, p, samples) for p in edge_probs]\n",
191 | "\n",
192 | "plt.plot(edge_probs, connectivity_probs)"
193 | ]
194 | },
195 | {
196 | "cell_type": "code",
197 | "execution_count": null,
198 | "metadata": {},
199 | "outputs": [],
200 | "source": [
201 | "n = 40\n",
202 | "connectivity_probs = [prob_connected(n, p, samples) for p in edge_probs]\n",
203 | "\n",
204 | "plt.plot(edge_probs, connectivity_probs)"
205 | ]
206 | },
207 | {
208 | "cell_type": "markdown",
209 | "metadata": {},
210 | "source": [
211 | "Erdos and Renyi discovered that the critical value, at which the connectivity probability quickly transitions from 0 to 1, is $\\frac{\\ln(n)}{n}$.\n",
212 | "\n",
213 | "That number gets close to zero for large n."
214 | ]
215 | },
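As a quick check of that threshold (a sketch added here; it uses networkx's built-in E-R generator `gnp_random_graph` rather than the `make_er_graph` helper above, so the block stands alone), we can estimate the connectivity probability well below and well above ln(n)/n:

```python
import numpy as np
import networkx as nx

def prob_connected_er(n, p, samples=200):
    # Monte Carlo estimate of the probability that G(n, p) is connected
    return sum(nx.is_connected(nx.gnp_random_graph(n, p))
               for _ in range(samples)) / samples

n = 50
threshold = np.log(n) / n   # ~0.078 for n = 50

print(prob_connected_er(n, threshold / 4))  # well below threshold: rarely connected
print(prob_connected_er(n, threshold * 4))  # well above threshold: almost always connected
```

The two estimates land near 0 and near 1 respectively, matching the sharp transition the plots above suggest.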
216 | {
217 | "cell_type": "code",
218 | "execution_count": null,
219 | "metadata": {},
220 | "outputs": [],
221 | "source": [
222 | "np.log(100)/100"
223 | ]
224 | },
225 | {
226 | "cell_type": "markdown",
227 | "metadata": {},
228 | "source": [
229 | "### Takeaway\n",
230 | "\n",
231 | "What are some takeaways from this experiment?\n",
232 | "\n",
233 | "For large graphs with random connections, even a tiny probability of connection will likely connect the whole graph.\n",
234 | "\n",
235 | "* This could be a good thing -- if you're spreading the word about your new product, or signing up folks to transact on your new money platform.\n",
236 | "\n",
237 | "* But it could also be a terrible thing if the \"message\" being passed is a new virus or a pro-genocide meme.\n",
238 | "\n",
239 | "In our experiment, we looked at the effect of connectivity. But we can just as easily fix a connectivity probability and ask about the effect of scaling the graph. \n",
240 | "\n",
241 | "Because $\\frac{\\ln(n)}{n}$ gets small as n gets big, we can say that for *any* connectivity probability (above zero), there will be a graph size large enough that it is ~100% likely to be connected.\n",
242 | "\n",
243 | "In plain language: we've discovered the math behind the assertion that adding more people (or more anything) to a graph makes it inevitable that anything and everything can spread everywhere.\n",
244 | "\n",
245 | "If this is the spread of ...\n",
246 | "* a lifesaving product or knowledge, we have reason to rejoice\n",
247 | "* our tech platform product, we had better be careful about unintended consequences of that product, because they can be everywhere\n",
248 | "* financial instability due to propagated risk (as in 2008), we can see that failure was inevitable with scale\n",
249 | "\n",
250 | "__How does this connect to the distributions we talked about earlier?__\n",
251 | "\n",
252 | "Highly connected networks mean that when signals (info, memes, viruses, etc.) spread, the transmission is multiplicative (as we've seen with Covid and $R_0$) and, of course, the events are not independent -- they are linked by the relations in the graph. So spread through a network can lead to power-law distributions. Depending on the exponent, these may have fat tails and hide a large number of dramatic surprises that would not be expected from thin-tailed distributions.\n",
253 | "\n",
254 | "A great overview of network effects in financial risk is https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3651864\n",
255 | "\n",
256 | "#### More is Different\n",
257 | "\n",
258 | "This phenomenon -- that \"more is different\" or quantity can become quality -- is the idea behind, and title of, a 1972 paper by Philip Anderson often regarded as inaugurating the study of complex systems: https://science.sciencemag.org/content/177/4047/393\n",
259 | "\n",
260 | "Since humans and human institutions are used to linear change, this discovery of mechanisms behind sudden, non-linear change is a key tool in designing the outcomes we want in the world."
261 | ]
262 | },
263 | {
264 | "cell_type": "markdown",
265 | "metadata": {},
266 | "source": [
267 | "## Cellular automata\n",
268 | "\n",
269 | "Another form of generative model, cellular automata are discrete systems with a small set of rules governing their evolution over (traditionally discrete) time.\n",
270 | "\n",
271 | "CAs are interesting because even very simple ones can exhibit a range of behavior from highly regular to chaotic.\n",
272 | "\n",
273 | "### One-dimensional (elementary) automata\n",
274 | "\n",
275 | "We can imagine a 1-D CA state as a sequence of discrete \"cells\" each with its own state -- in the simplest case they are either active (\"alive\") or inactive (\"dead\")\n",
276 | "\n",
277 | "In the subsequent timestep, a cell's state is determined by its own previous state and its neighbors' previous states.\n",
278 | "\n",
279 | "Since each cell can be either 0 or 1, there are 2^3 = 8 possible configurations for a cell and its neighbors:\n",
280 | "\n",
281 | "`111, 110, 101, 100, 011, 010, 001, 000`\n",
282 | "\n",
283 | "An automaton rule maps each of these configurations to a new state (either 0 or 1) for the center cell.\n",
284 | "\n",
285 | "The binary representation of a rule is an 8-bit number, where each bit determines the new state of the center cell for one of these configurations. The leftmost (most significant) bit corresponds to the first configuration (111), the next bit to the second configuration (110), and so on, down to the rightmost bit for 000.\n",
286 | "\n",
287 | "So, a rule's binary representation is simply a compact way to express how it maps each of the 8 possible configurations to a new state for the center cell. There are 256 possible elementary cellular automaton rules, each with a unique binary representation that can be converted to a decimal number for naming purposes (e.g., Rule 30, Rule 90).\n",
288 | "\nFor example, consider Rule 30. Its binary representation is 00011110. We can map the bits to the corresponding configurations like this:\n",
289 | "\n",
290 | "```\n",
291 | "111 110 101 100 011 010 001 000\n",
292 | " 0   0   0   1   1   1   1   0\n",
293 | "```\n",
294 | "\n",
295 | "According to this mapping, Rule 30 dictates that:\n",
296 | "\n",
297 | "* If a cell and its neighbors are in configuration 111, the new state of the center cell will be 0.\n",
298 | "* If a cell and its neighbors are in configuration 110, the new state of the center cell will be 0.\n",
299 | "* If a cell and its neighbors are in configuration 101, the new state of the center cell will be 0.\n",
300 | "* If a cell and its neighbors are in configuration 100, the new state of the center cell will be 1.\n",
301 | "* If a cell and its neighbors are in configuration 011, the new state of the center cell will be 1.\n",
302 | "* If a cell and its neighbors are in configuration 010, the new state of the center cell will be 1.\n",
303 | "* If a cell and its neighbors are in configuration 001, the new state of the center cell will be 1.\n",
304 | "* If a cell and its neighbors are in configuration 000, the new state of the center cell will be 0."
305 | ]
306 | },
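{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can check this encoding in code. As a minimal sketch (the helper name `rule_table` is our own, not a standard API), the following unpacks any rule number from 0 to 255 into its configuration-to-state table:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"def rule_table(rule_number):\n",
"    # bit 7 of the rule number gives the new state for 111, bit 6 for 110, ..., bit 0 for 000\n",
"    configs = ['111', '110', '101', '100', '011', '010', '001', '000']\n",
"    bits = format(rule_number, '08b')\n",
"    return dict(zip(configs, (int(b) for b in bits)))\n",
"\n",
"rule_table(30)  # 100, 011, 010, and 001 map to 1; the others map to 0"
]
},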
307 | {
308 | "cell_type": "markdown",
309 | "metadata": {},
310 | "source": [
311 | "Three well-known elementary cellular automata have gained significant attention due to their distinct behaviors and properties:\n",
312 | "\n",
313 | "* Rule 90 generates a fractal-like pattern known as the Sierpinski triangle. It exhibits less chaotic behavior compared to Rule 30, and its simple and symmetric patterns make it visually appealing. Rule 90 is represented by the binary number 01011010, which is 90 in decimal.\n",
314 | "\n",
315 | "* Rule 30 is known for its chaotic and unpredictable behavior. It generates complex patterns from simple initial conditions, and parts of its output are even used as random number generators in some software. Rule 30 is represented by the binary number 00011110, which is 30 in decimal.\n",
316 | "\n",
317 | "* Rule 110 is particularly interesting because it exhibits complex behavior and is known to be Turing complete, meaning it can simulate any Turing machine, given an appropriate initial condition. This property implies that Rule 110 can perform any computation, given enough time and space. Rule 110 is represented by the binary number 01101110, which is 110 in decimal.\n",
318 | "\n",
319 | "These rules are fascinating because they demonstrate how simple rules can give rise to a wide range of behaviors, from simple and repetitive patterns to complex and unpredictable structures."
320 | ]
321 | },
322 | {
323 | "cell_type": "code",
324 | "execution_count": null,
325 | "metadata": {
326 | "tags": []
327 | },
328 | "outputs": [],
329 | "source": [
330 | "def rule90(left, center, right):\n",
331 | " return left ^ right\n",
332 | "\n",
333 | "def generate_automata(size, steps, rule):\n",
334 | " initial_state = [0] * size\n",
335 | " initial_state[size // 2] = 1\n",
336 | "\n",
337 | " automata = [initial_state]\n",
338 | "\n",
339 | " for _ in range(steps - 1):\n",
340 | " current_state = automata[-1]\n",
341 | " next_state = [0] * size\n",
342 | "\n",
343 | " for i in range(size):\n",
344 | " left = current_state[(i - 1) % size]\n",
345 | " center = current_state[i]\n",
346 | " right = current_state[(i + 1) % size]\n",
347 | "\n",
348 | " next_state[i] = rule(left, center, right)\n",
349 | "\n",
350 | " automata.append(next_state)\n",
351 | "\n",
352 | " return automata\n",
353 | "\n",
354 | "def print_automata(automata):\n",
355 | " for state in automata:\n",
356 | " print(\"\".join(\"#\" if cell else \" \" for cell in state))\n",
357 | "\n",
358 | "size = 80\n",
359 | "steps = 40\n",
360 | "\n",
361 | "automata = generate_automata(size, steps, rule90)\n",
362 | "print_automata(automata)"
363 | ]
364 | },
365 | {
366 | "cell_type": "code",
367 | "execution_count": null,
368 | "metadata": {
369 | "tags": []
370 | },
371 | "outputs": [],
372 | "source": [
373 | "def rule30(left, center, right):\n",
374 | " return left ^ (center or right)\n",
375 | "\n",
376 | "automata = generate_automata(size, steps, rule30)\n",
377 | "print_automata(automata)"
378 | ]
379 | },
380 | {
381 | "cell_type": "code",
382 | "execution_count": null,
383 | "metadata": {
384 | "tags": []
385 | },
386 | "outputs": [],
387 | "source": [
388 | "def rule110(left, center, right):\n",
389 | " return (left and center and not right) or (left and not center and right) or (not left and center and right) or \\\n",
390 | " (not left and center and not right) or (not left and not center and right)\n",
391 | "\n",
392 | "automata = generate_automata(size, steps, rule110)\n",
393 | "\n",
394 | "\n",
395 | "print_automata(automata)"
396 | ]
397 | },
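{
"cell_type": "markdown",
"metadata": {},
"source": [
"Rather than hand-coding the boolean logic for each rule, we can build the update function for any rule number directly from its bits. This is a sketch (`make_rule` is our own helper, not a standard API): the configuration (left, center, right), read as a 3-bit number, selects one bit of the rule number."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"def make_rule(rule_number):\n",
"    def rule(left, center, right):\n",
"        # (left, center, right) read as a 3-bit number selects a bit of the rule number\n",
"        index = 4 * left + 2 * center + right\n",
"        return (rule_number >> index) & 1\n",
"    return rule\n",
"\n",
"automata = generate_automata(size, steps, make_rule(110))\n",
"print_automata(automata)"
]
},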
398 | {
399 | "cell_type": "markdown",
400 | "metadata": {},
401 | "source": [
402 | "### Optional exercise: Conway's Game of Life\n",
403 | "\n",
404 | "Possibly the most famous CA of all time is the 2-D CA known as Conway's Game of Life.\n",
405 | "\n",
406 | "Let's take a quick look: https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life\n",
407 | "\n",
408 | "The Game of Life is fascinating for numerous reasons, but the one we're focusing on today is another key phenomenon of complex systems, __emergence__.\n",
409 | "\n",
410 | "__Emergence__ is another way of describing the manifestation of interesting, potentially surprising, nonlinear outcomes from a simple set of mechanistic rules.\n",
411 | "\n",
412 | "The Game of Life has a trivial ruleset and a passive environment, yet it gives rise to astonishing complexity."
413 | ]
414 | },
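{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you'd like a head start, here is one minimal sketch of a single Game of Life update step using NumPy (`life_step` is our own helper; edges wrap around via `np.roll`). Feel free to ignore it and write your own version in the empty cell below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def life_step(grid):\n",
"    # count live neighbors by summing the 8 shifted copies of the grid (toroidal edges)\n",
"    neighbors = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)\n",
"                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)\n",
"                    if (dy, dx) != (0, 0))\n",
"    # a cell lives with exactly 3 neighbors, or with 2 if it is already alive\n",
"    return ((neighbors == 3) | ((grid == 1) & (neighbors == 2))).astype(int)\n",
"\n",
"glider = np.zeros((8, 8), dtype=int)\n",
"glider[1, 2] = glider[2, 3] = 1\n",
"glider[3, 1] = glider[3, 2] = glider[3, 3] = 1\n",
"\n",
"for _ in range(4):\n",
"    for row in glider:\n",
"        print(''.join('#' if cell else '.' for cell in row))\n",
"    print()\n",
"    glider = life_step(glider)"
]
},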
415 | {
416 | "cell_type": "code",
417 | "execution_count": null,
418 | "metadata": {},
419 | "outputs": [],
420 | "source": []
421 | }
422 | ],
423 | "metadata": {
424 | "kernelspec": {
425 | "display_name": "Python 3 (ipykernel)",
426 | "language": "python",
427 | "name": "python3"
428 | },
429 | "language_info": {
430 | "codemirror_mode": {
431 | "name": "ipython",
432 | "version": 3
433 | },
434 | "file_extension": ".py",
435 | "mimetype": "text/x-python",
436 | "name": "python",
437 | "nbconvert_exporter": "python",
438 | "pygments_lexer": "ipython3",
439 | "version": "3.9.16"
440 | }
441 | },
442 | "nbformat": 4,
443 | "nbformat_minor": 4
444 | }
445 |
--------------------------------------------------------------------------------
/01-Complex-Adaptive.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "slideshow": {
7 | "slide_type": "slide"
8 | }
9 | },
10 | "source": [
11 | "# Data-Driven Analysis and Modeling of Complex Adaptive Systems\n",
12 | "## Improve Your Understanding of Systems and Emergent Behaviors\n",
13 | "\n",
14 | "<img src='images/complex-icon.png'>"
15 | ]
16 | },
17 | {
18 | "cell_type": "markdown",
19 | "metadata": {},
20 | "source": [
21 | "### Session 1 Plan\n",
22 | "\n",
23 | "Intro, schedule, class ops, key topics\n",
24 | "* What is a complex adaptive system; examples\n",
25 | "* Demos: How simple composite systems rapidly get hard to predict\n",
26 | "* Linear and non-linear systems, measurement and the limits of data science predictive methods\n",
27 | "\n",
28 | "Rough, Practical Taxonomy of Interacting Elements in a System\n",
29 | "* Independent items, accumulating independent items\n",
30 | "* Connected items, accumulating connected items with addition vs. multiplication\n",
31 | "* Power-law distribution and challenging aspects of life and decision-making with heavy tails\n",
32 | "\n",
33 | "Models for Thinking: Networks\n",
34 | "* Representing connectivity and contagion for data analysis purposes\n",
35 | "* Tipping-point behavior\n",
36 | "* Exercise: Simulating network connectivity\n",
37 | "\n",
38 | "Models for Thinking: Automata\n",
39 | "* Simple automata\n",
40 | "* Exercise: Conway’s Game of Life\n",
41 | "* Discussion: What can we learn?\n",
42 | "\n",
43 | "Applying Complex Systems Modeling\n",
44 | "* A network model for data analysis of new product adoption\n",
45 | "* Small-world graphs\n",
46 | "* Exercise: Real social network data\n",
47 | "* Bootstrapping a network model\n",
48 | "* Exercise: Calibrating the model\n",
49 | "* Exercise: Introducing a competitor's product\n",
50 | "* Discussion: What can we learn? What can we report to our firm?"
51 | ]
52 | },
53 | {
54 | "cell_type": "markdown",
55 | "metadata": {},
56 | "source": [
57 | "### What is a complex adaptive system?\n",
58 | "\n",
59 | "A complex adaptive system is a system made up of many partially independent elements. These elements can be \"agents\" (such as people or animals) or entirely inanimate (grains of sand).\n",
60 | "\n",
61 | "__Let's look at a few examples to make this clearer__"
62 | ]
63 | },
64 | {
65 | "cell_type": "markdown",
66 | "metadata": {
67 | "slideshow": {
68 | "slide_type": "slide"
69 | }
70 | },
71 | "source": [
72 | "> COVID didn’t just impact our health in 2020. COVID brought us toilet paper shortages, free cheese, and, for a while, no Diet Coke. It seemed wild and impossible to predict – but with the right techniques, we could have solved this sooner. For example, data science techniques like network modeling could have warned us: when we stopped driving, carbon dioxide – a fuel production byproduct – would be scarce and cause soft drink shortages. We can test a variety of data and process models that let us experiment with highly-coupled systems, where problems evince “contagion” similar to the virus itself.\n",
73 | "\n",
74 | "__Why can we learn from these systems?__\n",
75 | "\n",
76 | "Because the surprising and non-linear responses that emerged at all scales (from individuals to geopolitics) are present throughout natural and man-made systems. So many of the dynamics -- and even many of the specific mathematical patterns -- present in one of these systems are also present in many others.\n",
77 | "\n",
78 | "More examples include...\n",
79 | "* Living organisms\n",
80 | "* Social groups at all scales (e.g., families, clubs, firms, social movements, town/city/province governments, etc.)\n",
81 | "* General biological collectives (e.g., ant colonies)\n",
82 | "* Avalanches, earthquakes, and traffic jams\n",
83 | "* Economies and financial markets\n",
84 | "* Trade and conflict networks\n",
85 | "* Sociotechnical constructs (e.g., power grid, transportation or communication infrastructure)"
86 | ]
87 | },
88 | {
89 | "cell_type": "markdown",
90 | "metadata": {},
91 | "source": [
92 | "__A quick exercise/demo on the emergence of complex patterns from simple interactions__\n",
93 | "\n",
94 | "1. Click the __up__ (^) affordance to add a bit to the rabbit population\n",
95 | "1. Observe the regular oscillation of rabbit and fox populations"
96 | ]
97 | },
98 | {
99 | "cell_type": "code",
100 | "execution_count": null,
101 | "metadata": {
102 | "tags": []
103 | },
104 | "outputs": [],
105 | "source": [
106 | "import IPython\n",
107 | "url = \"https://ncase.me/loopy/v1.1/?embed=1&data=[[[1,274,356,0.66,%22rabbits%22,0],[2,710,357,0.66,%22foxes%22,1]],[[2,1,153,-1,0],[1,2,160,1,0]],[[489,369,%22A%2520basic%2520ecological%250Afeedback%2520loop.%250A%250ATry%2520adding%2520extra%250Acreatures%2520to%2520this%250Aecosystem!%22],[489,162,%22more%2520rabbits%2520means%2520MORE%2520foxes%253A%250Ait's%2520a%2520positive%2520(%252B)%2520relationship%22],[498,566,%22more%2520foxes%2520means%2520FEWER%2520rabbits%253A%250Ait's%2520a%2520negative%2520(%25E2%2580%2593)%2520relationship%22]],2%5D\"\n",
108 | "IPython.display.IFrame(url, 800, 600)"
109 | ]
110 | },
111 | {
112 | "cell_type": "markdown",
113 | "metadata": {},
114 | "source": [
115 | "Now... let's make this a tiny bit more complex\n",
116 | "1. Click the \"Remix\" button to open a new tab with this system\n",
117 | "1. Add a circle above the rabbits and label it \"grass\"\n",
118 | "1. Create a link from rabbits to grass and under \"relationship\" select \"more -> less\" since more rabbits means less grass\n",
119 | "1. Create a link from grass back to rabbits and use the \"more -> more\" relationship (since more grass means more rabbits)\n",
120 | "1. Click __play__ and then __up__ once on the rabbits\n",
121 | "1. Observe that the population behavior quickly becomes more chaotic\n",
122 | "\n",
123 | "Bonus activity: reset the game and this time add a \"coyote\" circle where more rabbits mean more coyotes but more coyotes mean fewer foxes. Note that this -- like most equally simple dynamics -- quickly becomes chaotic and unpredictable.\n",
124 | "\n",
125 | "Feel free to play around with some of the more complicated scenarios on the LOOPY home page or to create your own."
126 | ]
127 | },
128 | {
129 | "cell_type": "markdown",
130 | "metadata": {},
131 | "source": [
132 | "### What's going on here?\n",
133 | "\n",
134 | "Even when individual elements of a system exhibit straightforward or predictable behavior, complex and unpredictable behavior can arise from the aggregated interactions of those components.\n",
135 | "\n",
136 | "We'll look more precisely at the math in a bit, but the basic idea is that the interactions create an exponential diversity of possible outcomes, and -- even if the system seems to be deterministic -- the limits of our knowledge (or control) of initial conditions quickly make the system hard to predict or manage."
137 | ]
138 | },
139 | {
140 | "cell_type": "markdown",
141 | "metadata": {},
142 | "source": [
143 | "The double pendulum system is deterministic -- we know all of the equations that govern its motion. \n",
144 | "\n",
145 | "So, in a system like this, can we exert control by setting initial conditions? Often, no. Here are 50 double pendulums whose initial velocities differ by only 1 part in 1 million (*credit to Dillon Berger @InertialObservr!*)\n",
146 | "\n",
147 | "Can you control your inputs to within 1 part in 1 million? Even if you could, a simple system like this diverges to a nearly uniform distribution (i.e., every angle has equal probability) in under 20 seconds!\n",
148 | "\n",
149 | "<video src='images/million.mp4' controls loop></video>"
150 | ]
151 | },
152 | {
153 | "cell_type": "markdown",
154 | "metadata": {},
155 | "source": [
156 | "__The Takeaway: We cannot predict and control these sorts of systems through \"classical\" planning and techniques.__\n",
157 | "\n",
158 | "This is why, for example, we understand earthquakes, avalanches, and financial crashes quite well. We can even predict them probabilistically (i.e., identify their frequency-magnitude patterns). But we can't predict any individual occurrence.\n",
159 | "\n",
160 | "__The behavior of complex systems lies on the boundary between (simple) order and chaos.__\n",
161 | "\n",
162 | "One way to make sense of that phenomenon is to think about how strongly the elements in a system are coupled."
163 | ]
164 | },
165 | {
166 | "cell_type": "markdown",
167 | "metadata": {},
168 | "source": [
169 | "### A rough, practical taxonomy for thinking about interactions in a system\n",
170 | "\n",
171 | "The simplest compound systems we deal with include __independent items__\n",
172 | "\n",
173 | "This independence assumption, we'll see, is key to knowing where the complexity may be lurking.\n",
174 | "\n",
175 | "#### Accumulating independent items by addition: the Gaussian (normal) distribution\n",
176 | "\n",
177 | "The Gaussian distribution underlies many of our tools and techniques. And we have great ability to manage systems that follow Gaussian distributions. For example, the Gaussian underlies modern statistical process control, six-sigma, and other quality and risk management techniques.\n",
178 | "\n",
179 | "We'd like it to apply everywhere, but it doesn't. So when *does* it work?\n",
180 | "\n",
181 | "__The Gaussian describes accumulation of independent items__"
182 | ]
183 | },
184 | {
185 | "cell_type": "code",
186 | "execution_count": null,
187 | "metadata": {},
188 | "outputs": [],
189 | "source": [
190 | "import numpy as np\n",
191 | "import seaborn as sns"
192 | ]
193 | },
194 | {
195 | "cell_type": "code",
196 | "execution_count": null,
197 | "metadata": {},
198 | "outputs": [],
199 | "source": [
200 | "samples = np.random.uniform(low=0, high=2, size=(1000, 1000))\n",
201 | "\n",
202 | "sums = samples.sum(axis=0)\n",
203 | "\n",
204 | "sns.displot(sums)"
205 | ]
206 | },
207 | {
208 | "cell_type": "markdown",
209 | "metadata": {},
210 | "source": [
211 | "__Examples for the Gaussian__\n",
212 | "\n",
213 | "Primary\n",
214 | "* Human height is a result of dozens of genes affecting different aspects of growth\n",
215 | "* Most of these genes can be inherited and operate independently\n",
216 | "* Heights show a Gaussian distribution\n",
217 | "\n",
218 | "Secondary\n",
219 | "* Grocery stores like to stock popular products at eye level (\"eye level is buy level\")\n",
220 | "* Since heights are normally distributed, the stocking of shelves follows that normal distribution\n",
221 | "* Of course, the other shelves are not empty -- they just have items lower on the merchant's \"sale priority\"\n",
222 | "\n",
223 | "Workers in many disciplines have assumed the Gaussian holds where it does not, with disastrous consequences. \n",
224 | "\n",
225 | "Why do they use it? One reason is because it's easy to work with and well known.\n",
226 | "\n",
227 | "Why does it fail? One reason is because the tails of the Gaussian are extremely thin -- i.e., events far away from the mean \"should never\" occur. If they do occur, that's a sign we're using the wrong distribution!\n",
228 | "\n",
229 | "* We've had multiple \"hundred-year\" floods or other climate events within a decade...\n",
230 | " * That's a hint that the distribution used to model the weather (the one that expects one event per hundred years) is not the proper distribution we're dealing with\n",
231 | "* Financial crises that \"should never happen\" (SVB collapse, mortgage [\"great financial '08\"] crisis, Long-Term Capital Management collapse, ruble crisis, peso crisis, etc.) keep happening...\n",
232 | " * That's a sign that the financial risk modeling folks are not using the right distribution\n",
233 | "\n",
234 | "__Extreme events live in the \"tails\" of the distribution, far away from the expectation. But that doesn't mean they are unlikely. It all depends on the thickness of those tails.__\n",
235 | "\n",
236 | "#### Accumulating independent items by multiplication: the Log-normal distribution\n",
237 | "\n",
238 | "Because multiplying items is the same as adding them in log-space, when many independent items are multiplied, we get a distribution whose log is normal. It's called the log-normal distribution and looks like this:\n",
239 | "\n",
240 | "<img src='images/logn.png'>\n",
241 | "\n",
242 | "This distribution requires a little more __caution__ ... it looks a bit like the Gaussian but can have a long, thick right tail."
243 | ]
244 | },
245 | {
246 | "cell_type": "code",
247 | "execution_count": null,
248 | "metadata": {},
249 | "outputs": [],
250 | "source": [
251 | "samples = np.random.uniform(low=0.95, high=1.05, size=(200, 1000))\n",
252 | "\n",
253 | "prod = samples.prod(axis=0)\n",
254 | "\n",
255 | "sns.displot(prod, bins=100)"
256 | ]
257 | },
258 | {
259 | "cell_type": "markdown",
260 | "metadata": {},
261 | "source": [
262 | "__Examples for the log-normal__\n",
263 | "\n",
264 | "Log-normal distributions describe \n",
265 | "* the sizes of British farms https://academic.oup.com/bioscience/article/51/5/341/243981\n",
266 | "* milk production by cows https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5567198/\n",
267 | "* worker pay, when sequences of multiplicative (e.g., +x%) raises are applied\n",
268 | "\n",
269 | "Not all log-normal distributions have thick tails (you can play with them interactively at https://distribution-explorer.github.io/continuous/lognormal.html)\n",
270 | "\n",
271 | "But be aware of the consequences of a thick tail: large amounts of probability mass far away from the expected values.\n",
272 | "\n",
273 | "### What about non-independent items, the ones we see in complex systems?\n",
274 | "\n",
275 | "#### Accumulating non-independent items by multiplication: the power-law distribution\n",
276 | "\n",
277 | "In many systems, effects are multiplied as they pass (or cycle) throughout the system. \n",
278 | "\n",
279 | "When these effects are not independent, they give rise to a dramatically different distribution of values. They yield extremes of inequality, and, in some cases, significant numbers of very extreme events.\n",
280 | "\n",
281 | "Box office receipts, book sales, wealth in some countries, frequencies of words, and sizes of power outages all follow power-law distributions for different, but related reasons.\n",
282 | "\n",
283 | "These systems often result from feedback loops or network effects giving rise to the \"Matthew effect\" (or \"the rich-get-richer\") https://en.wikipedia.org/wiki/Matthew_effect -- of course, in technology, entrepreneurs seek out exactly these sorts of dynamics in order to get big returns for investors, venture capitalists, and themselves.\n",
284 | "\n",
285 | "Let's simulate some app-store sales data to demonstrate this. In this example we'll collect 1000 sets of 50 samples.\n",
286 | "\n",
287 | "Within each set, though, the samples won't be independent.\n",
288 | "\n",
289 | "* We create a \"window\" (initially 0.03 wide) representing a range of possible sales rates\n",
290 | "* Start by sampling from a uniform distribution around 1.0, with a width equal to that window\n",
291 | "* For each subsequent sample within each set, we move the window so that it's centered on the previous sample\n",
292 | " * This simulates a network effect where a higher draw in one round allows for a slightly higher range to sample in the next round\n",
293 | " * We adjust to prohibit the window from going below 0"
294 | ]
295 | },
296 | {
297 | "cell_type": "code",
298 | "execution_count": null,
299 | "metadata": {},
300 | "outputs": [],
301 | "source": [
302 | "import random\n",
303 | "\n",
304 | "samples = np.zeros((50, 1000)) # rows are (50) sequence steps of aggregate return; columns will be independent samples (1000)\n",
305 | "\n",
306 | "half_range_width = 0.015 \n",
307 | "# all values will start within this distance of 1.0, and at each time step experience returns within this range of 1.0\n",
308 | "\n",
309 | "samples[0, :] = np.random.uniform(low=1-half_range_width, high=1+half_range_width, size=(1, 1000))\n",
310 | "\n",
311 | "for sample in range(1000):\n",
312 | " for step in range(1, 50):\n",
313 | " prev = samples[step-1, sample] \n",
314 | " low = max(prev - half_range_width, 0) # new possibility range adjusted up/down based on previous value\n",
315 | " high = 2*half_range_width if low == 0 else prev + half_range_width \n",
316 | " samples[step, sample] = random.random() * (high - low) + low\n",
317 | "\n",
318 | "# that loop could be vectorized with numpy, but I wanted to make the algorithm extra explicit\n",
319 | " \n",
320 | "result = samples.prod(axis=0)\n",
321 | "sns.displot(result, bins=100)"
322 | ]
323 | },
324 | {
325 | "cell_type": "markdown",
326 | "metadata": {},
327 | "source": [
328 | "Notice some of the extreme results"
329 | ]
330 | },
331 | {
332 | "cell_type": "code",
333 | "execution_count": null,
334 | "metadata": {},
335 | "outputs": [],
336 | "source": [
337 | "(result < .25).sum()"
338 | ]
339 | },
340 | {
341 | "cell_type": "code",
342 | "execution_count": null,
343 | "metadata": {},
344 | "outputs": [],
345 | "source": [
346 | "(result > 5).sum()"
347 | ]
348 | },
349 | {
350 | "cell_type": "code",
351 | "execution_count": null,
352 | "metadata": {},
353 | "outputs": [],
354 | "source": [
355 | "(result > 25).sum()"
356 | ]
357 | },
358 | {
359 | "cell_type": "code",
360 | "execution_count": null,
361 | "metadata": {},
362 | "outputs": [],
363 | "source": [
364 | "result.max()"
365 | ]
366 | },
367 | {
368 | "cell_type": "markdown",
369 | "metadata": {},
370 | "source": [
371 | "The luckiest folks are *really* doing well"
372 | ]
373 | },
374 | {
375 | "cell_type": "code",
376 | "execution_count": null,
377 | "metadata": {},
378 | "outputs": [],
379 | "source": [
380 | "result.mean()"
381 | ]
382 | },
383 | {
384 | "cell_type": "code",
385 | "execution_count": null,
386 | "metadata": {},
387 | "outputs": [],
388 | "source": [
389 | "(result < result.mean()).sum()"
390 | ]
391 | },
392 | {
393 | "cell_type": "markdown",
394 | "metadata": {},
395 | "source": [
396 | "Around 80% end up worse than the average.\n",
397 | "\n",
398 | "Some fat-tailed distributions are even more extreme -- so extreme in fact that they have no expected value (mean) at all (https://rviews.rstudio.com/2017/02/15/some-notes-on-the-cauchy-distribution/)\n",
399 | "\n",
400 | "__Nassim Nicholas Taleb__, of \"Black Swan\" fame, has made a career of explaining and investing in these extreme occurrences. He calls systems that have substantial mass in the tail \"Extremistan\" and cautions against making assumptions based on the events we frequently see or have seen historically.\n",
401 | "\n",
402 | "In many of these distributions, there will always be more extreme (in degree) events (and more of them in quantity) than we have ever seen in the past. We simply can't predict anything except that, sooner or later, they do occur.\n",
403 | "\n",
404 | "In the political and cultural domain, many researchers believe these distributions characterize probabilities and severities of\n",
405 | "* pandemics\n",
406 | "* stock market crashes\n",
407 | "* riots and civil unrest\n",
408 | "* wars and revolutions\n",
409 | "\n",
410 | "while in the business world we see\n",
411 | "* stock market crashes\n",
412 | "* banking/currency crises\n",
413 | "* industry sector disruptions and transformations\n",
414 | " * these can be positive (e.g., Internet, mobile) as well as negative\n",
415 | " \n",
416 | "These can sometimes be described as tipping points or phase changes. Systems that tend to organize themselves into states close to tipping points are said to exhibit *self-organized criticality*.\n",
417 | "\n",
418 | "__Whenever we see a system characterized by interrelated and interacting (not independent) multiplicative effects, we need to pay close attention__. For some systems, under some limited conditions, we can model the distribution based on information we have (https://en.wikipedia.org/wiki/Extreme_value_theory) but in many cases, there is no way to get a confident description of the tails based on the data we have."
419 | ]
420 | },
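{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what \"no expected value\" means in practice, here is a small sketch using NumPy's standard Cauchy sampler: the running average of Gaussian samples settles toward the mean, while the running average of Cauchy samples keeps jumping no matter how much data we collect."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"n = 100000\n",
"# running averages of the first k samples, for k = 1..n\n",
"gauss_avg = np.random.standard_normal(n).cumsum() / np.arange(1, n + 1)\n",
"cauchy_avg = np.random.standard_cauchy(n).cumsum() / np.arange(1, n + 1)\n",
"\n",
"for k in (100, 1000, 10000, 100000):\n",
"    print(k, gauss_avg[k - 1], cauchy_avg[k - 1])"
]
},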
421 | {
422 | "cell_type": "markdown",
423 | "metadata": {},
424 | "source": [
425 | "__Checkpoint__\n",
426 | "\n",
427 | "At this point, we've gotten a high-level feel for one way of thinking about the emergence of complex or sometimes chaotic behavior.\n",
428 | "\n",
429 | "You should now have a sense of the statistical logic that underlies less predictable, as opposed to more predictable, systems and events."
430 | ]
431 | },
432 | {
433 | "cell_type": "code",
434 | "execution_count": null,
435 | "metadata": {},
436 | "outputs": [],
437 | "source": []
438 | }
439 | ],
440 | "metadata": {
441 | "celltoolbar": "Slideshow",
442 | "kernelspec": {
443 | "display_name": "Python 3 (ipykernel)",
444 | "language": "python",
445 | "name": "python3"
446 | },
447 | "language_info": {
448 | "codemirror_mode": {
449 | "name": "ipython",
450 | "version": 3
451 | },
452 | "file_extension": ".py",
453 | "mimetype": "text/x-python",
454 | "name": "python",
455 | "nbconvert_exporter": "python",
456 | "pygments_lexer": "ipython3",
457 | "version": "3.9.16"
458 | }
459 | },
460 | "nbformat": 4,
461 | "nbformat_minor": 4
462 | }
463 |
--------------------------------------------------------------------------------
/06-Agents-PathDep.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "5523a44b-08c4-4758-bf86-640032f062a7",
6 | "metadata": {},
7 | "source": [
8 | "# Agent-based models and path dependency\n",
9 | "\n",
10 | "## Agent-based models\n",
11 | "\n",
12 | "Another kind of generative or simulation-based model which can offer insights into the dynamics of complexity is the agent-based model.\n",
13 | "\n",
14 | "* An agent-based model is just a simulation of a number of agents (a bit like imaginary characters) who act according to some rule, within an environment that can have rules of its own.\n",
15 | "\n",
16 | "* Simple agent-based models are sometimes associated with automata because they are often implemented in simplified \"worlds\" akin to those of classic cellular automata like Conway's Game of Life.\n",
17 | " * But there are a variety of more complex approaches to agent-based models ... they need not be simple, deterministic, nor discrete and they can have sophisticated rules."
18 | ]
19 | },
20 | {
21 | "cell_type": "markdown",
22 | "id": "3a65f8be-b888-4d9f-9d7c-44375d6bd9a4",
23 | "metadata": {},
24 | "source": [
25 | "### Using agent-based models\n",
26 | "\n",
27 | "Although it might be tempting to jump into complex agent-based models, there are good reasons to work with a minimal model, such as explainability and calibration.\n",
28 | "\n",
29 | "We can borrow the automaton idea -- and substitute rules that we would like to investigate -- to see if our rules lead to favorable outcomes. With an experimental platform like this, we can then adjust our rules in the hope of creating different outcomes."
30 | ]
31 | },
32 | {
33 | "cell_type": "markdown",
34 | "id": "ea4a7316-30d2-40e7-8601-e5db24db11cd",
35 | "metadata": {},
36 | "source": [
37 | "### Schelling Segregation Model\n",
38 | "\n",
39 | "To get a feel for how this approach might be informative, we'll look at one of the earliest and most famous agent-based models: the __Schelling Segregation Model__\n",
40 | "\n",
41 | "Allen Downey summarizes this model/thought experiment brilliantly in his book *Think Complexity*:\n",
42 | "\n",
43 | "> The Schelling model of the world is a grid where each cell represents a house. The houses are occupied by two kinds of agents, labeled red and blue, in roughly equal numbers. About 10% of the houses are empty.\n",
44 | ">\n",
45 | "> At any point in time, an agent might be happy or unhappy, depending on the other agents in the neighborhood, where the “neighborhood\" of each house is the set of eight adjacent cells. In one version of the model, agents are happy if they have at least two neighbors like themselves, and unhappy if they have one or zero.\n",
46 | ">\n",
47 | "> The simulation proceeds by choosing an agent at random and checking to see whether they are happy. If so, nothing happens; if not, the agent chooses one of the unoccupied cells at random and moves.\n",
48 | ">\n",
49 | "> You will not be surprised to hear that this model leads to some segregation, but you might be surprised by the degree. From a random starting point, clusters of similar agents form almost immediately. The clusters grow and coalesce over time until there are a small number of large clusters and most agents live in homogeneous neighborhoods.\n",
50 | ">\n",
51 | "> If you did not know the process and only saw the result, you might assume that the agents were racist, but in fact all of them would be perfectly happy in a mixed neighborhood.\n",
52 | "\n",
53 | "Let's implement this model to see\n",
54 | "* what a simple agent-based model implementation looks like\n",
55 | "* how the \"homophily index,\" or fraction of similar neighbors required for happiness, affects the overall segregation of the grid"
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "execution_count": null,
61 | "id": "9cf7c303-fd83-48bd-a322-cc0f54b990a1",
62 | "metadata": {},
63 | "outputs": [],
64 | "source": [
65 | "size = 15\n",
66 | "homophily_index = 0.3"
67 | ]
68 | },
69 | {
70 | "cell_type": "code",
71 | "execution_count": null,
72 | "id": "47ad213c-0cff-4038-8505-1c5f9b3d0c20",
73 | "metadata": {},
74 | "outputs": [],
75 | "source": [
76 | "def make_grid(size):\n",
77 | " grid = np.random.uniform(0, 1, (size,size))\n",
78 | " grid[grid>0.55] = 1\n",
79 | " grid[grid<0.45] = 2\n",
80 | " grid[grid<1]=0\n",
81 | " return grid\n",
82 | "\n",
83 | "grid = make_grid(size)\n",
84 | "grid"
85 | ]
86 | },
87 | {
88 | "cell_type": "code",
89 | "execution_count": null,
90 | "id": "6269549f-be86-4145-81c8-198d45ac569d",
91 | "metadata": {},
92 | "outputs": [],
93 | "source": [
94 | "plt.magma()\n",
95 | "plt.imshow(grid)\n",
96 | "plt.colorbar()"
97 | ]
98 | },
99 | {
100 | "cell_type": "code",
101 | "execution_count": null,
102 | "id": "76598599-068b-4f27-807c-c80ec22c7313",
103 | "metadata": {},
104 | "outputs": [],
105 | "source": [
106 | "np.argwhere(grid==0)"
107 | ]
108 | },
109 | {
110 | "cell_type": "code",
111 | "execution_count": null,
112 | "id": "7b8ec2fa-90cc-4a48-846d-926ee5c458d6",
113 | "metadata": {},
114 | "outputs": [],
115 | "source": [
116 | "import random\n",
117 | "\n",
118 | "def pick_random_agent(grid):\n",
119 | " agent_locations = np.argwhere(grid != 0)\n",
120 | " loc_index = random.randint(0, agent_locations.shape[0]-1)\n",
121 | " return (agent_locations[loc_index][0], agent_locations[loc_index][1])\n",
122 | "\n",
123 | "def pick_empty_loc(grid):\n",
124 | " empty_locations = np.argwhere(grid == 0)\n",
125 | " loc_index = random.randint(0, empty_locations.shape[0]-1)\n",
126 | " return (empty_locations[loc_index][0], empty_locations[loc_index][1])"
127 | ]
128 | },
129 | {
130 | "cell_type": "code",
131 | "execution_count": null,
132 | "id": "2bf9efe2-bee9-4520-9f10-fe579d430dd5",
133 | "metadata": {},
134 | "outputs": [],
135 | "source": [
136 | "agent = pick_random_agent(grid)\n",
137 | "agent"
138 | ]
139 | },
140 | {
141 | "cell_type": "code",
142 | "execution_count": null,
143 | "id": "1978d449-bfa6-4605-8b3d-18b397fa1c03",
144 | "metadata": {},
145 | "outputs": [],
146 | "source": [
147 | "agent_group = grid[agent[0], agent[1]]\n",
148 | "agent_group"
149 | ]
150 | },
151 | {
152 | "cell_type": "code",
153 | "execution_count": null,
154 | "id": "90b6b6ef-7e18-4104-ba34-243c3f3b9933",
155 | "metadata": {},
156 | "outputs": [],
157 | "source": [
158 | "neighborhood = grid[max(0, agent[0]-1):agent[0]+2, max(0, agent[1]-1):agent[1]+2]  # clamp at 0: a negative slice start would wrap around and return an empty neighborhood\n",
159 | "neighborhood"
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": null,
165 | "id": "df8d68aa-a043-46da-89d8-b4f8c49ef971",
166 | "metadata": {},
167 | "outputs": [],
168 | "source": [
169 | "similar_neighbors_locs = (neighborhood == agent_group)\n",
170 | "similar_neighbors_locs"
171 | ]
172 | },
173 | {
174 | "cell_type": "code",
175 | "execution_count": null,
176 | "id": "50734ec2-d171-44e0-964a-f50999c76556",
177 | "metadata": {},
178 | "outputs": [],
179 | "source": [
180 | "similar_neighbors = similar_neighbors_locs.sum() - 1\n",
181 | "similar_neighbors"
182 | ]
183 | },
184 | {
185 | "cell_type": "code",
186 | "execution_count": null,
187 | "id": "0002064b-3e5e-4b2f-bc73-0e71b48d782b",
188 | "metadata": {},
189 | "outputs": [],
190 | "source": [
191 | "def do_update(grid):\n",
192 | " agent = pick_random_agent(grid)\n",
193 | " agent_group = grid[agent[0], agent[1]]\n",
194 | "    neighborhood = grid[max(0, agent[0]-1):agent[0]+2, max(0, agent[1]-1):agent[1]+2]  # clamp at 0 so edge agents get a valid (smaller) neighborhood\n",
195 | " similar_neighbors = (neighborhood == agent_group).sum() - 1\n",
196 | " is_happy = (similar_neighbors / 8) > homophily_index\n",
197 | " if not is_happy:\n",
198 | " new_loc = pick_empty_loc(grid)\n",
199 | " grid[agent[0], agent[1]] = 0\n",
200 | " grid[new_loc[0], new_loc[1]] = agent_group"
201 | ]
202 | },
203 | {
204 | "cell_type": "code",
205 | "execution_count": null,
206 | "id": "d1775367-0539-424c-8f0f-c0a0fe0e5574",
207 | "metadata": {},
208 | "outputs": [],
209 | "source": [
210 | "plt.imshow(grid)"
211 | ]
212 | },
213 | {
214 | "cell_type": "code",
215 | "execution_count": null,
216 | "id": "e555176c-82b0-4d16-b465-ce73ed6c915f",
217 | "metadata": {},
218 | "outputs": [],
219 | "source": [
220 | "for i in range(10 * size**2):\n",
221 | " do_update(grid)\n",
222 | " \n",
223 | "plt.imshow(grid)"
224 | ]
225 | },
226 | {
227 | "cell_type": "code",
228 | "execution_count": null,
229 | "id": "5c892e30-9e19-428d-96f1-a16d0e6e981e",
230 | "metadata": {},
231 | "outputs": [],
232 | "source": [
233 | "size = 100\n",
234 | "\n",
235 | "grid = make_grid(size)\n",
236 | "plt.imshow(grid)"
237 | ]
238 | },
239 | {
240 | "cell_type": "code",
241 | "execution_count": null,
242 | "id": "edc3ef31-59e5-48f7-a25b-b3ab8f7d1052",
243 | "metadata": {},
244 | "outputs": [],
245 | "source": [
246 | "for i in range(2 * size**2):\n",
247 | " do_update(grid)\n",
248 | " \n",
249 | "plt.imshow(grid)"
250 | ]
251 | },
252 | {
253 | "cell_type": "code",
254 | "execution_count": null,
255 | "id": "7bde5547-a4da-43bb-adc3-47fc2817744f",
256 | "metadata": {},
257 | "outputs": [],
258 | "source": [
259 | "for i in range(2 * size**2):\n",
260 | " do_update(grid)\n",
261 | " \n",
262 | "plt.imshow(grid)"
263 | ]
264 | },
265 | {
266 | "cell_type": "code",
267 | "execution_count": null,
268 | "id": "40a38e9e-d89c-4ad2-81b3-d558dbd0df6b",
269 | "metadata": {},
270 | "outputs": [],
271 | "source": [
272 | "for i in range(2 * size**2):\n",
273 | " do_update(grid)\n",
274 | " \n",
275 | "plt.imshow(grid)"
276 | ]
277 | },
278 | {
279 | "cell_type": "code",
280 | "execution_count": null,
281 | "id": "10e237f9-2a79-44e2-af86-e94efe6d13c5",
282 | "metadata": {},
283 | "outputs": [],
284 | "source": [
285 | "homophily_index = 0.4\n",
286 | "\n",
287 | "size = 100\n",
288 | "\n",
289 | "grid = make_grid(size)\n",
290 | "plt.imshow(grid)"
291 | ]
292 | },
293 | {
294 | "cell_type": "code",
295 | "execution_count": null,
296 | "id": "f40f9c4d-04de-444f-962d-a6926e146d18",
297 | "metadata": {},
298 | "outputs": [],
299 | "source": [
300 | "for i in range(4 * size**2):\n",
301 | " do_update(grid)\n",
302 | " \n",
303 | "plt.imshow(grid)"
304 | ]
305 | },
306 | {
307 | "cell_type": "markdown",
308 | "id": "5614af34-d684-4b49-ba02-b8f30c6c939b",
309 | "metadata": {},
310 | "source": [
311 | "It would be interesting to plot the homophily index vs. the number of iterations before a particular segregation level is reached, but that is a bit beyond what we have time for today."
312 | ]
313 | },
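The experiment above needs a quantitative notion of "segregation level." As a minimal sketch (this metric and the function name are illustrative choices, not from the notebook), we can average, over occupied cells, the fraction of occupied neighbors belonging to the same group:

```python
import numpy as np

def mean_similar_fraction(grid):
    """Average over occupied cells of (same-group neighbors / occupied neighbors)."""
    fractions = []
    for r, c in np.argwhere(grid != 0):
        # clamp the slice start at 0 so edge cells get a valid (smaller) block
        block = grid[max(0, r - 1):r + 2, max(0, c - 1):c + 2]
        occupied = (block != 0).sum() - 1          # occupied neighbors, excluding self
        similar = (block == grid[r, c]).sum() - 1  # same-group neighbors, excluding self
        if occupied > 0:
            fractions.append(similar / occupied)
    return float(np.mean(fractions)) if fractions else 0.0
```

With a helper like this, one could repeatedly call `do_update` until `mean_similar_fraction(grid)` crosses a chosen threshold, and record the iteration count for each homophily value.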
314 | {
315 | "cell_type": "markdown",
316 | "id": "0e3e330a-ae44-4e49-8d9e-43a24b3836f2",
317 | "metadata": {},
318 | "source": [
319 | "### Takeaway\n",
320 | "\n",
321 | "What are some takeaways from this experiment?\n",
322 | "\n",
323 | "We can test hypotheses about real-world phenomena within a highly artificial \"small world\" and still gain critical insights.\n",
324 | "\n",
325 | "* For example, we might want to test the following hypothesis:\n",
326 | " * \"Modest homophily values like 30% are insufficient to generate segregation -- something else is necessary.\"\n",
327 | " * *We can see that the hypothesis is clearly false.*\n",
328 | "\n",
329 | "Of course, the model cannot tell you how to manage your society, business, or project. But it can provide indicators you can use when designing for target outcomes."
330 | ]
331 | },
332 | {
333 | "cell_type": "markdown",
334 | "id": "3318f39f-0b17-47ee-9b57-c67f235b3b1d",
335 | "metadata": {},
336 | "source": [
337 | "### More recent research\n",
338 | "\n",
339 | "__Joshua Epstein and Agent Zero__\n",
340 | "\n",
341 | "Since then, numerous researchers have used ABMs to explore a wide range of phenomena.\n",
342 | "\n",
343 | "In *Agent_Zero: Toward Neurocognitive Foundations for Generative Social Science*, Joshua Epstein introduces a new theoretical agent called Agent Zero, which is an attempt to ground social science in neurocognitive processes. Epstein's work focuses on creating computational models that can generate a wide array of social phenomena. Some of the phenomena that Epstein generates and discusses in the book include how\n",
344 | "* a jury can unanimously vote to convict when only a minority of participants believe the defendant is guilty\n",
345 | "* diversity in \"trigger points\" can make a mob more likely to turn violent\n",
346 | "* soldiers can become susceptible to committing mass killings and other atrocities\n",
347 | "\n",
348 | "Book: https://press.princeton.edu/books/hardcover/9780691158884/agentzero\n",
349 | "\n",
350 | "Online models (will likely not make sense until after reading the book): http://modelingcommons.org/browse/one_model/5982#model_tabs_browse_info\n",
351 | "\n",
352 | "__Stefani Crabtree__\n",
353 | "\n",
354 | "\n",
355 | "\n",
356 | "https://stefanicrabtree.com/agent-based-modeling-work/"
357 | ]
358 | },
359 | {
360 | "cell_type": "markdown",
361 | "id": "1219f8e7-b331-4ef4-b467-d1bf12b22d82",
362 | "metadata": {},
363 | "source": [
364 | "#### How does this connect to the distributions we talked about earlier?\n",
365 | "\n",
366 | "Although it is convenient to formulate and render cellular automata like these in a \"grid world,\" they can also be interpreted as models on graphs, so they are not as far from networks as they might first appear. And as areas of the \"grid\" are assimilated to one group or the other, there are multiplicative effects: in this model, for example, the colored areas (~ 2-dimensional) become overwhelmingly larger than the frontiers (~ 1-dimensional).\n",
367 | "\n",
368 | "The homophily model we've looked at here might just as easily describe users of one or another mobile phone or communications platform (in non-geographical cases, the dimensions may represent other aspects of a social or product space) -- so there are definitely applications in business."
369 | ]
370 | },
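The 2-dimensional vs. 1-dimensional point can be made concrete with a toy calculation (illustrative numbers only, not from the notebook): for a roughly square cluster of side k, the interior area scales like k², while the frontier scales like 4k, so the frontier's share of cells shrinks as clusters grow and coalesce.

```python
# Toy scaling check: area ~ k**2 grows much faster than frontier ~ 4*k
for k in [5, 10, 50]:
    area, frontier = k * k, 4 * k
    print(f"k={k}: area={area}, frontier={frontier}, frontier share={frontier / area:.2f}")
```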
371 | {
372 | "cell_type": "markdown",
373 | "id": "9b3c3e7f-c01e-468b-bb3a-84fd3af48ce7",
374 | "metadata": {},
375 | "source": [
376 | "## Exploring path dependence\n",
377 | "\n",
378 | "Simple models may evaluate a distribution of outcomes for an individual, team, firm, or other group over a series of choices.\n",
379 | "\n",
380 | "For example, when choosing what product to prioritize for the next quarter, projections might assign probabilities and expected profits to different market scenarios and product choices. A business unit might choose to focus on the product with highest expected profit across the projected business scenarios.\n",
381 | "\n",
382 | "However -- as anyone familiar with the consequences of technical debt can tell you -- your next choice is rarely made with a blank-slate starting point. We all have to live with the consequences of our previous choices, and that can change the expected outcome dramatically. \n",
383 | "\n",
384 | "* In other words, our outcomes are not purely dependent on a current decision. They are dependent on the path of prior steps in the outcome space.\n",
385 | "\n",
386 | "### Simple investment model\n",
387 | "\n",
388 | "We'll take a look at a simple investment (or gambling) model which produces reliable positive returns when viewed from an average (or expectation) perspective, but yields ruinous losses when viewed from the path-dependent perspective of any actual investor (or gambler).\n",
389 | "\n",
390 | "__The business proposition__\n",
391 | "\n",
392 | "* 50/50 risk of success or failure\n",
393 | "* Success returns 50 cents on the dollar (i.e., \\\\$1 invested returns \\\\$1.50)\n",
394 | "* Failure produces a loss of 40 cents (i.e., in the failure scenario, one recoups \\\\$0.60 from each \\\\$1 invested)\n",
395 | "\n",
396 | "Traditional expectation:"
397 | ]
398 | },
399 | {
400 | "cell_type": "code",
401 | "execution_count": null,
402 | "id": "4466ed89-9ef8-4af8-b881-c8c4a87c008d",
403 | "metadata": {},
404 | "outputs": [],
405 | "source": [
406 | "0.5 * 1.5 + 0.5 * 0.60"
407 | ]
408 | },
409 | {
410 | "cell_type": "markdown",
411 | "id": "8a8d2718-d04c-41a1-ab36-7ef0298e3f02",
412 | "metadata": {},
413 | "source": [
414 | "We can simulate that to get a better idea of the deviation from the ideal average:"
415 | ]
416 | },
417 | {
418 | "cell_type": "code",
419 | "execution_count": null,
420 | "id": "1100027e-28ee-4ba2-a1fb-4f1761376e80",
421 | "metadata": {},
422 | "outputs": [],
423 | "source": [
424 | "sample_size = range(100, 10000, 100)\n",
425 | "\n",
426 | "outcomes = []\n",
427 | "\n",
428 | "for i in sample_size:\n",
429 | " draws = np.random.uniform(0, 1, (i))\n",
430 | " draws[draws > 0.5] = 1.50\n",
431 | " draws[draws < 1] = 0.6\n",
432 | " outcomes.append(draws.mean())\n",
433 | " \n",
434 | "plt.plot(sample_size, outcomes)"
435 | ]
436 | },
437 | {
438 | "cell_type": "markdown",
439 | "id": "b3e533cc-1286-4931-a390-48345dde555d",
440 | "metadata": {},
441 | "source": [
442 | "So it looks like, even with small samples or \"bad luck\", we should do pretty well with this sort of investment.\n",
443 | "\n",
444 | "__Ensemble average vs. time average__\n",
445 | "\n",
446 | "But this form of average assumes that we start in the same position prior to each investment or bet.\n",
447 | "\n",
448 | "* It's a bit like looking at hundreds or thousands of individuals or firms each making one bet. On average, they will (collectively) do well!\n",
449 | "\n",
450 | "But let's change our perspective for a moment and look at one individual or firm making a sequence of small bets/investments.\n",
451 | "\n",
452 | "* If they make $2n$ investments, reinvesting their full stake each time, we would expect about $n$ to yield the \\$1.50 and the other $n$ to yield the \\$0.60\n",
453 | "* So the end result would be $(1.5)^n \\cdot (0.6)^n = [(1.5)(0.6)]^n = 0.9^n$\n",
454 | "\n",
455 | "Wait ... $0.9^n$ doesn't look very good. In fact, it will go very quickly to zero for any significant $n$.\n",
456 | "\n",
457 | "Just to be sure, let's simulate this as well:"
458 | ]
459 | },
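One way to see the tension before simulating (a sketch using the payoffs above): compare the arithmetic, ensemble-average multiplier per step with the geometric, time-average multiplier experienced along a single path.

```python
# Arithmetic (ensemble) average multiplier: what a crowd of one-shot bettors sees
ensemble_avg = 0.5 * 1.5 + 0.5 * 0.6   # = 1.05, i.e., +5% per step in expectation

# Geometric (time) average multiplier: what one bettor compounds per step
time_avg = (1.5 * 0.6) ** 0.5          # = 0.9 ** 0.5, about 0.95, i.e., roughly -5% per step

print(ensemble_avg, time_avg)
```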
460 | {
461 | "cell_type": "code",
462 | "execution_count": null,
463 | "id": "0a65dee7-db47-41d6-a594-e2085a38c18c",
464 | "metadata": {},
465 | "outputs": [],
466 | "source": [
467 | "steps=200\n",
468 | "simulations=10000\n",
469 | "draws = np.random.uniform(0, 1, (simulations, steps))\n",
470 | "draws[draws > 0.5] = 1.50\n",
471 | "draws[draws < 1] = 0.6\n",
472 | "outcomes = draws.prod(axis=1)\n",
473 | "plt.hist(outcomes, bins=100)"
474 | ]
475 | },
476 | {
477 | "cell_type": "markdown",
478 | "id": "cc2d7de6-aae5-4b3e-939c-dacf33411785",
479 | "metadata": {},
480 | "source": [
481 | "Just for comparison, our expected value after `steps` investments:"
482 | ]
483 | },
484 | {
485 | "cell_type": "code",
486 | "execution_count": null,
487 | "id": "561995be-9885-4881-b4ea-bd3dadc4c2bc",
488 | "metadata": {},
489 | "outputs": [],
490 | "source": [
491 | "expected = 1.05 ** steps\n",
492 | "expected"
493 | ]
494 | },
495 | {
496 | "cell_type": "code",
497 | "execution_count": null,
498 | "id": "0c9818f1-eacf-4016-bc87-06c4de8f65be",
499 | "metadata": {},
500 | "outputs": [],
501 | "source": [
502 | "outcomes[outcomes < 0.1].size / simulations"
503 | ]
504 | },
505 | {
506 | "cell_type": "code",
507 | "execution_count": null,
508 | "id": "a9153ad5-a318-419e-8617-e48198bdb0d0",
509 | "metadata": {},
510 | "outputs": [],
511 | "source": [
512 | "outcomes[outcomes < 1].size / simulations"
513 | ]
514 | },
515 | {
516 | "cell_type": "code",
517 | "execution_count": null,
518 | "id": "41cd7301-8549-4d6d-bf74-83f31446dad3",
519 | "metadata": {},
520 | "outputs": [],
521 | "source": [
522 | "outcomes[outcomes > 2].size / simulations"
523 | ]
524 | },
525 | {
526 | "cell_type": "code",
527 | "execution_count": null,
528 | "id": "a0b34b1d-08db-4248-b868-cf2e30884e42",
529 | "metadata": {},
530 | "outputs": [],
531 | "source": [
532 | "outcomes[outcomes >= expected].size / simulations"
533 | ]
534 | },
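These counts show why the ensemble mean is a poor guide to a typical path: a few enormous lucky paths dominate it. A quick sketch using the closed forms above (`steps = 200` mirrors the simulation):

```python
steps = 200
mean_outcome = 1.05 ** steps           # ensemble mean: dominated by rare lucky paths
median_outcome = 0.9 ** (steps // 2)   # typical path: ~100 wins and ~100 losses
print(mean_outcome, median_outcome)
```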
535 | {
536 | "cell_type": "markdown",
537 | "id": "8f63863d-78b9-44fb-a353-a9f2dfc760a4",
538 | "metadata": {},
539 | "source": [
540 | "__A dramatic view of the \"lifelines\" of a number of agents facing a similar set of options__\n",
541 | "\n",
542 | "\n",
543 | "\n",
544 | "From: https://www.nature.com/articles/s41567-019-0732-0\n",
545 | "\n",
546 | "#### Takeaway\n",
547 | "\n",
548 | "When does this occur in real life?\n",
549 | "\n",
550 | "Although our specific numbers in the present example are contrived, path dependence is a critical factor in many real-world systems:\n",
551 | "* economic actors\n",
552 | "* health outcomes\n",
553 | "* hiring and promotion\n",
554 | "* education\n",
555 | "* criminal justice\n",
556 | "* participation in risk-taking and investment activities\n",
557 | "\n",
558 | "__How does this connect to the distributions and patterns we've been talking about?__\n",
559 | "\n",
560 | "Notice that, in the path-dependent case,\n",
561 | "* we have a *series of multiplied values which are not independent*\n",
562 | " * (since each multiplication is dependent on prior state) \n",
563 | "* where, in the ensemble expectation, we *assumed* that all of the events (values being multiplied) are independent\n",
564 | " * (they only depend on the \"rules of the game\" -- every trial starts with 1 dollar)\n",
565 | " \n",
566 | "Once again, we see a compounding effect leading to drastically large (or small) numbers. \n",
567 | "\n",
568 | "A concrete example is insurance pools. A sufficiently large and diverse business can \"self-insure\" anything from employee health costs to its own fleet of vehicles. Such self-insurance can work, provided the losses are independent enough that the ensemble average holds.\n",
569 | "\n",
570 | "If a company's employees were all concentrated in an area with common health hazards (say, contaminated air or ground water), then the sequence of repeated health-cost losses would not be independent -- risk would be magnified as health losses compound over time.\n",
571 | "\n",
572 | "__How do we use this knowledge?__\n",
573 | "\n",
574 | "Any time we are looking to achieve an \"average\" result over time, we can ask whether the steps are truly independent. As a technology example, we may have a device deployed in the field that features high uptime (i.e., a long mean time between failures).\n",
575 | "* To achieve long-term reliability, we want to ensure that the device is as stateless as possible when it recovers\n",
576 | "* If a device retains state (e.g., internal storage or config) that affects its future success after recovering from a failure, then the sequence of failures becomes path-dependent"
577 | ]
578 | }
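The device example can be sketched as a tiny simulation (hypothetical numbers and function name; only the qualitative contrast matters): a stateless device recovers to the same failure probability every time, while a stateful one degrades a little with each failure, so failures compound.

```python
import random

def steps_until_10_failures(degrade_per_failure):
    """Run a device until its 10th failure; stateful recovery raises p_fail each time."""
    p_fail, steps, failures = 0.01, 0, 0
    while failures < 10 and steps < 1_000_000:
        steps += 1
        if random.random() < p_fail:
            failures += 1
            # path dependence: each failure leaves the device a bit less reliable
            p_fail = min(1.0, p_fail + degrade_per_failure)
    return steps

random.seed(0)
print(steps_until_10_failures(0.0))    # stateless: failures stay independent
random.seed(0)
print(steps_until_10_failures(0.05))   # stateful: failures arrive faster and faster
```

With the same random stream, the stateful device can never outlast the stateless one, since its failure probability is always at least as high at every step.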
579 | ],
580 | "metadata": {
581 | "kernelspec": {
582 | "display_name": "Python 3 (ipykernel)",
583 | "language": "python",
584 | "name": "python3"
585 | },
586 | "language_info": {
587 | "codemirror_mode": {
588 | "name": "ipython",
589 | "version": 3
590 | },
591 | "file_extension": ".py",
592 | "mimetype": "text/x-python",
593 | "name": "python",
594 | "nbconvert_exporter": "python",
595 | "pygments_lexer": "ipython3",
596 | "version": "3.9.16"
597 | }
598 | },
599 | "nbformat": 4,
600 | "nbformat_minor": 5
601 | }
602 |
--------------------------------------------------------------------------------