├── PythonTDA_Intro.ipynb
├── README.md
├── figures
│   ├── OrdinalPartitionNetworkExample_AudunMyers.mp4
│   └── WeightedGraphCliqueExample.png
├── requirements.in
└── requirements.txt
/PythonTDA_Intro.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "slideshow": {
7 | "slide_type": "slide"
8 | }
9 | },
10 | "source": [
11 | "# Python Tutorial on Topological Data Analysis\n",
12 | "\n",
13 | "## Elizabeth Munch\n",
14 | " Dept of Computational Mathematics, Science and Engineering\n",
15 | " Dept of Mathematics\n",
16 | " Michigan State University"
17 | ]
18 | },
19 | {
20 | "cell_type": "markdown",
21 | "metadata": {
22 | "slideshow": {
23 | "slide_type": "notes"
24 | }
25 | },
26 | "source": [
27 | "# Welcome\n",
28 | "\n",
29 | "This notebook is meant to be a first-step introduction to some of the tools available for computing TDA signatures in Python. This repository includes the Jupyter notebooks for the 2021 workshop [\"Mathematical and Computational Methods for Complex Social Systems\"](https://www.google.com/url?q=https://meetings.ams.org/math/sc2021/meetingapp.cgi) to be held at the virtual JMM 2021, led by [Elizabeth Munch](http://elizabethmunch.com/). This introduction is tailored for a network science audience, so the focus is on the tools that are relevant when the input is a network.\n",
30 | "\n",
31 | "These slides are written to be presented as [RISE slides](https://rise.readthedocs.io/en/stable/index.html); however, the notebook should be self-contained and does not need RISE installed. If you see a lot of weird cell toolbars in the notebook (which are used for controlling the slideshow version), you can remove them from your view of the Jupyter notebook by going to View -> Cell Toolbar -> None.\n",
32 | "\n",
33 | "\n",
34 | "\n",
35 | "\n",
36 | "\n",
37 | "\n",
38 | "\n",
39 | "\n",
40 | "\n",
41 | "\n",
42 | "\n"
43 | ]
44 | },
45 | {
46 | "cell_type": "markdown",
47 | "metadata": {
48 | "slideshow": {
49 | "slide_type": "subslide"
50 | }
51 | },
52 | "source": [
53 | "# Goals\n",
54 | "\n",
55 | "- Give a brief overview of available packages\n",
56 | "- Provide pipelines for computing persistent homology for input data such as a discrete metric space or a weighted graph.\n",
57 | "- Give you a place to start....\n",
58 | "\n"
59 | ]
60 | },
61 | {
62 | "cell_type": "markdown",
63 | "metadata": {
64 | "slideshow": {
65 | "slide_type": "subslide"
66 | }
67 | },
68 | "source": [
69 | "\n",
70 | "# Things I won't get to\n",
71 | "\n",
72 | "- Every possible filtration\n",
73 | "- Graphical signatures of data\n",
74 | " - Reeb graphs\n",
75 | " - Mapper graphs \n",
76 | " - Merge trees\n",
77 | " - Contour trees"
78 | ]
79 | },
80 | {
81 | "cell_type": "markdown",
82 | "metadata": {
83 | "slideshow": {
84 | "slide_type": "subslide"
85 | }
86 | },
87 | "source": [
88 | "# Options to follow along\n",
89 | "\n",
90 | "\n",
91 | "\n",
92 | "- Download from the github repo: [github.com/lizliz/TDA-Python-Workshop-JMM21](https://github.com/lizliz/TDA-Python-Workshop-JMM21)\n",
93 | "- Run directly from binder: [tinyurl.com/jmm-tda](https://tinyurl.com/jmm-tda)\n",
94 | "\n"
95 | ]
96 | },
97 | {
98 | "cell_type": "markdown",
99 | "metadata": {
100 | "slideshow": {
101 | "slide_type": "slide"
102 | }
103 | },
104 | "source": [
105 | "# An incomplete list of available software\n",
106 | "\n",
107 | "There are so many.....\n"
108 | ]
109 | },
110 | {
111 | "cell_type": "markdown",
112 | "metadata": {
113 | "slideshow": {
114 | "slide_type": "notes"
115 | }
116 | },
117 | "source": [
118 | "New packages are being developed incredibly quickly, and I'm so happy to see the field taking off so fast. That being said, I am choosing to use some of the packages I am most familiar with, which does not mean they will be the best option for your task. An incomplete and almost immediately outdated list of available options is below. Any omissions are unintentional."
119 | ]
120 | },
121 | {
122 | "cell_type": "markdown",
123 | "metadata": {
124 | "slideshow": {
125 | "slide_type": "subslide"
126 | }
127 | },
128 | "source": [
129 | "- [SciKitTDA](https://scikit-tda.org/) by Nathaniel Saul and Chris Tralie\n",
130 | "- [Teaspoon](http://elizabethmunch.com/code/teaspoon/index.html) by Liz Munch and Firas Khasawneh\n",
131 | "\n",
132 | "\n",
133 | "- [Ripser](https://github.com/Ripser/ripser) by Ulrich Bauer (C++)\n",
134 | "- [GUDHI](http://gudhi.gforge.inria.fr/) developed at INRIA\n",
135 | "- [Giotto-tda](https://giotto-ai.github.io/) developed at EPFL\n",
136 | "- [Cubicle](https://bitbucket.org/hubwag/cubicle/src/master/) by Hubert Wagner\n",
137 | "- [HomcCube](https://i-obayashi.info/software.html) by Ippei Obayashi\n",
138 | "- [DIPHA](https://github.com/DIPHA/dipha) by Ulrich Bauer and Michael Kerber\n",
139 | "- [diamorse](https://github.com/AppliedMathematicsANU/diamorse) developed at The Australian National University.\n",
140 | "- [Perseus](http://people.maths.ox.ac.uk/nanda/perseus/) by Vidit Nanda\n",
141 | "- [Dionysus2](https://www.mrzv.org/software/dionysus2/) by Dmitriy Morozov (C++, Python)\n",
142 | "- [CliqueTop](https://github.com/nebneuron/clique-top) by Chad Giusti (Matlab)\n",
143 | "- [Eirene](http://gregoryhenselman.org/eirene/index.html) by Greg Henselman (Julia)\n",
144 | "- [Ripser-live](http://live.ripser.org/) by Ulrich Bauer (browser)\n",
145 | "- [CHomP](https://github.com/shaunharker/CHomP) by Shaun Harker (C++)\n",
146 | "- [Hera](https://bitbucket.org/grey_narn/hera) by Michael Kerber, Dmitriy Morozov, and Arnur Nigmetov\n",
147 | "- [JavaPlex](https://github.com/appliedtopology) by Andrew Tausz, Mikael Vejdemo-Johansson and Henry Adams\n",
148 | "- [PHAT](https://bitbucket.org/phat-code/phat) by Ulrich Bauer, Michael Kerber, Jan Reininghaus, Hubert Wagner, and Bryn Keller\n",
149 | "- Topology ToolKit (C++) by Julien Tierny, Guillaume Favelier, Joshua Levine, Charles Gueunet, and Michaël Michaux (I think?)\n",
150 | "- TDA (R) by Brittany T. Fasy, Jisu Kim, Fabrizio Lecci, and Clément Maria\n",
151 | "- TDAMapper (R) by Paul Pearson, Daniel Müllner, and Gurjeet Singh\n",
152 | "- R scripts for TDA by Peter Bubenik\n",
153 | "- Simplicial complexes for Julia by Alex Kunin and Vladimir Itskov\n",
154 | "- SimBa and SimPer (C++) by Tamal K Dey, Fengtao Fan, Dayu Shi, and Yusu Wang\n",
155 | "- Python Mapper (Python) by Daniel Müllner and Aravindakshan Babu\n",
156 | "- Persistence Landscape Toolbox (C++) by Pawel Dlotko\n",
157 | " "
158 | ]
159 | },
160 | {
161 | "cell_type": "code",
162 | "execution_count": null,
163 | "metadata": {
164 | "slideshow": {
165 | "slide_type": "subslide"
166 | }
167 | },
168 | "outputs": [],
169 | "source": [
170 | "# Basic imports \n",
171 | "import numpy as np\n",
172 | "import matplotlib.pyplot as plt\n",
173 | "import matplotlib.gridspec as gridspec\n",
174 | "import networkx as nx\n",
175 | "from IPython.display import Video\n",
176 | "\n",
177 | "# scikit-tda imports..... Install all with -> pip install scikit-tda\n",
178 | "#--- this is the main persistence computation workhorse\n",
179 | "import ripser\n",
180 | "# from persim import plot_diagrams\n",
181 | "import persim\n",
182 | "# import persim.plot\n",
183 | "\n",
184 | "# teaspoon imports...... Install with -> pip install teaspoon\n",
185 | "#---these are for generating data and some drawing tools \n",
186 | "import teaspoon.MakeData.PointCloud as makePtCloud\n",
187 | "import teaspoon.TDA.Draw as Draw\n",
188 | "\n",
189 | "#---these are for generating time series network examples\n",
190 | "from teaspoon.SP.network import ordinal_partition_graph\n",
191 | "from teaspoon.TDA.PHN import PH_network\n",
192 | "from teaspoon.SP.network_tools import make_network\n",
193 | "from teaspoon.parameter_selection.MsPE import MsPE_tau\n",
194 | "import teaspoon.MakeData.DynSysLib.DynSysLib as DSL\n"
195 | ]
196 | },
197 | {
198 | "cell_type": "markdown",
199 | "metadata": {
200 | "slideshow": {
201 | "slide_type": "slide"
202 | }
203 | },
204 | "source": [
205 | "# Computing persistence on a point cloud\n",
206 | "\n"
207 | ]
208 | },
209 | {
210 | "cell_type": "markdown",
211 | "metadata": {
212 | "slideshow": {
213 | "slide_type": "fragment"
214 | }
215 | },
216 | "source": [
217 | "Basic version: point clouds in $\\mathbb{R}^n$ inheriting Euclidean metric"
218 | ]
219 | },
220 | {
221 | "cell_type": "markdown",
222 | "metadata": {
223 | "slideshow": {
224 | "slide_type": "subslide"
225 | }
226 | },
227 | "source": [
228 | "## Annulus example"
229 | ]
230 | },
231 | {
232 | "cell_type": "code",
233 | "execution_count": null,
234 | "metadata": {
235 | "slideshow": {
236 | "slide_type": "-"
237 | }
238 | },
239 | "outputs": [],
240 | "source": [
241 | "r = 1\n",
242 | "R = 2\n",
243 | "P = makePtCloud.Annulus(N=200, r=r, R=R, seed=None) # teaspoon data generation\n",
244 | "plt.scatter(P[:,0],P[:,1])\n",
245 | "# print(P)\n",
246 | "# print(type(P))\n",
247 | "# print(P.shape)"
248 | ]
249 | },
250 | {
251 | "cell_type": "markdown",
252 | "metadata": {
253 | "slideshow": {
254 | "slide_type": "subslide"
255 | }
256 | },
257 | "source": [
258 | ".... run me for some nice drawings in a bit ...."
259 | ]
260 | },
261 | {
262 | "cell_type": "code",
263 | "execution_count": null,
264 | "metadata": {
265 | "slideshow": {
266 | "slide_type": "-"
267 | }
268 | },
269 | "outputs": [],
270 | "source": [
271 | "# Some quick code to draw stuff without showing all the matplotlib junk in the slides every time. \n",
272 | "\n",
273 | "def drawTDAtutorial(P,diagrams, R = 2):\n",
274 | " fig, axes = plt.subplots(nrows=1, ncols=3, figsize = (20,5))\n",
275 | "\n",
276 | " # Draw point cloud \n",
277 | " plt.sca(axes[0])\n",
278 | " plt.title('Point Cloud')\n",
279 | " plt.scatter(P[:,0],P[:,1])\n",
280 | "\n",
281 | " # Draw diagrams\n",
282 | " plt.sca(axes[1])\n",
283 | " plt.title('0-dim Diagram')\n",
284 | " Draw.drawDgm(diagrams[0])\n",
285 | "\n",
286 | " plt.sca(axes[2])\n",
287 | " plt.title('1-dim Diagram')\n",
288 | " Draw.drawDgm(diagrams[1])\n",
289 | " plt.axis([0,R,0,R])"
290 | ]
291 | },
292 | {
293 | "cell_type": "code",
294 | "execution_count": null,
295 | "metadata": {
296 | "scrolled": true,
297 | "slideshow": {
298 | "slide_type": "subslide"
299 | }
300 | },
301 | "outputs": [],
302 | "source": [
303 | "diagrams = ripser.ripser(P)['dgms']\n",
304 | "\n",
305 | "# Draw stuff\n",
306 | "drawTDAtutorial(P,diagrams) # Script included in notebook for drawing"
307 | ]
308 | },
309 | {
310 | "cell_type": "markdown",
311 | "metadata": {
312 | "slideshow": {
313 | "slide_type": "subslide"
314 | }
315 | },
316 | "source": [
317 | "### Storage of diagrams"
318 | ]
319 | },
320 | {
321 | "cell_type": "code",
322 | "execution_count": null,
323 | "metadata": {
324 | "scrolled": true,
325 | "slideshow": {
326 | "slide_type": "-"
327 | }
328 | },
329 | "outputs": [],
330 | "source": [
331 | "# Some discussion of how diagrams are stored \n",
332 | "data = ripser.ripser(P)\n",
333 | "# print(data.keys())\n",
334 | "# print(data['dgms'])\n",
335 | "data['dgms'][1]\n",
336 | "# len(data['dgms'])"
337 | ]
338 | },
339 | {
340 | "cell_type": "markdown",
341 | "metadata": {
342 | "slideshow": {
343 | "slide_type": "subslide"
344 | }
345 | },
346 | "source": [
347 | "### Cube example"
348 | ]
349 | },
350 | {
351 | "cell_type": "code",
352 | "execution_count": null,
353 | "metadata": {
354 | "scrolled": true,
355 | "slideshow": {
356 | "slide_type": "-"
357 | }
358 | },
359 | "outputs": [],
360 | "source": [
361 | "P = makePtCloud.Cube()\n",
362 | "diagrams = ripser.ripser(P)['dgms']\n",
363 | "\n",
364 | "# Draw stuff\n",
365 | "drawTDAtutorial(P,diagrams,R=0.8) # Script for drawing everything, code included in notebook\n"
366 | ]
367 | },
368 | {
369 | "cell_type": "markdown",
370 | "metadata": {
371 | "slideshow": {
372 | "slide_type": "subslide"
373 | }
374 | },
375 | "source": [
376 | "### Double annulus example"
377 | ]
378 | },
379 | {
380 | "cell_type": "code",
381 | "execution_count": null,
382 | "metadata": {
383 | "slideshow": {
384 | "slide_type": "-"
385 | }
386 | },
387 | "outputs": [],
388 | "source": [
389 | "# Make a quick double annulus\n",
390 | "\n",
391 | "def DoubleAnnulus(r1 = 1, R1 = 2, r2 = .8, R2 = 1.3, xshift = 3):\n",
392 | " P = makePtCloud.Annulus(r = r1, R = R1)\n",
393 | " Q = makePtCloud.Annulus(r = r2, R = R2)\n",
394 | " Q[:,0] = Q[:,0] + xshift\n",
395 | " P = np.concatenate((P, Q) )\n",
396 | " return(P)\n",
397 | "\n",
398 | "P = DoubleAnnulus(r1 = 1, R1 = 2, r2 = .5, R2 = 1.3, xshift = 3) \n",
399 | "plt.scatter(P[:,0], P[:,1])"
400 | ]
401 | },
402 | {
403 | "cell_type": "code",
404 | "execution_count": null,
405 | "metadata": {
406 | "slideshow": {
407 | "slide_type": "subslide"
408 | }
409 | },
410 | "outputs": [],
411 | "source": [
412 | "P = DoubleAnnulus(r1 = 1, R1 = 2, r2 = .5, R2 = 1.3, xshift = 3) # Code included in notebook\n",
413 | "diagrams = ripser.ripser(P)['dgms']\n",
414 | "\n",
415 | "# Draw stuff\n",
416 | "drawTDAtutorial(P,diagrams,R=2.5) # Script for drawing everything, code included in notebook\n"
417 | ]
418 | },
419 | {
420 | "cell_type": "markdown",
421 | "metadata": {
422 | "slideshow": {
423 | "slide_type": "slide"
424 | }
425 | },
426 | "source": [
427 | "## Computing Persistence on a Pairwise Distance/Similarity Matrix \n",
428 | "\n",
429 | "For this tutorial, we will always use the clique complex, but there are other options available.\n",
430 | "\n",
431 | "Some examples of when we might want to compute persistence in this way:\n",
432 | "\n",
433 | "- Input data with a distance/similarity matrix\n",
434 | "- Weighted graph where we set the distance between non-adjacent vertices to be `np.inf`"
435 | ]
436 | },
437 | {
438 | "cell_type": "markdown",
439 | "metadata": {
440 | "slideshow": {
441 | "slide_type": "subslide"
442 | }
443 | },
444 | "source": [
445 | "### Computing persistence for a weighted graph as the 1-skeleton\n",
446 | "\n",
447 | "- Given a weighted graph $G$, get a filtration by keeping all edges with weight $\\leq a$, then computing the clique complex.\n",
448 | "\n",
449 | "- Most useful/interesting when we have a decently dense graph"
450 | ]
451 | },
452 | {
453 | "cell_type": "markdown",
454 | "metadata": {
455 | "slideshow": {
456 | "slide_type": "subslide"
457 | }
458 | },
459 | "source": [
460 | "### An overly simple example\n",
461 | "\n",
462 | "Given a pairwise similarity matrix $D$.\n",
463 | "\n",
464 | "Build the clique complex of the filtration induced by including edges in increasing order of weight.\n",
465 | "\n",
466 | ""
467 | ]
468 | },
469 | {
470 | "cell_type": "markdown",
471 | "metadata": {
472 | "slideshow": {
473 | "slide_type": "subslide"
474 | }
475 | },
476 | "source": [
477 | ""
478 | ]
479 | },
480 | {
481 | "cell_type": "code",
482 | "execution_count": null,
483 | "metadata": {
484 | "slideshow": {
485 | "slide_type": "fragment"
486 | }
487 | },
488 | "outputs": [],
489 | "source": [
490 | "# Generate the distance matrix from the previous example\n",
491 | "D = np.array([[0, 1, np.inf, np.inf, 6], [0, 0, 5, np.inf, np.inf], [0, 0, 0, 2, 4], [0, 0, 0, 0, 3], [0, 0, 0, 0, 0]])\n",
492 | "D = D+D.T\n",
493 | "print(D)\n"
494 | ]
495 | },
496 | {
497 | "cell_type": "markdown",
498 | "metadata": {
499 | "slideshow": {
500 | "slide_type": "subslide"
501 | }
502 | },
503 | "source": [
504 | "Compute the persistence diagram. The key here is the `distance_matrix=True` argument; without it, ripser would treat `D` as a point cloud of $n$ points in $n$-dimensional space."
505 | ]
506 | },
507 | {
508 | "cell_type": "code",
509 | "execution_count": null,
510 | "metadata": {
511 | "scrolled": true,
512 | "slideshow": {
513 | "slide_type": "-"
514 | }
515 | },
516 | "outputs": [],
517 | "source": [
518 | "diagrams = ripser.ripser(D, distance_matrix=True, maxdim=1)['dgms']\n",
519 | "print('0-Dim Diagram')\n",
520 | "print(diagrams[0])\n",
521 | "print('1-Dim Diagram')\n",
522 | "print(diagrams[1])"
523 | ]
524 | },
525 | {
526 | "cell_type": "markdown",
527 | "metadata": {
528 | "slideshow": {
529 | "slide_type": "subslide"
530 | }
531 | },
532 | "source": [
533 | "### A bigger example with an Erdős-Rényi random graph"
534 | ]
535 | },
536 | {
537 | "cell_type": "markdown",
538 | "metadata": {
539 | "slideshow": {
540 | "slide_type": "subslide"
541 | }
542 | },
543 | "source": [
544 | "... run me for drawing nicely later ...."
545 | ]
546 | },
547 | {
548 | "cell_type": "code",
549 | "execution_count": null,
550 | "metadata": {
551 | "slideshow": {
552 | "slide_type": "-"
553 | }
554 | },
555 | "outputs": [],
556 | "source": [
557 | "# Drawing script for weighted graph\n",
558 | "def drawGraphEx(G):\n",
559 | " #draw it!\n",
560 | "\n",
561 | " pos = nx.spring_layout(G) # positions for all nodes; pass seed=<int> for a reproducible layout\n",
562 | "\n",
563 | " # nodes\n",
564 | " nx.draw_networkx_nodes(G, pos, node_size=70)\n",
565 | "\n",
566 | " # edges\n",
567 | " nx.draw_networkx_edges(G, pos, width=2)\n",
568 | " # nx.draw_networkx_edges(\n",
569 | " # G, pos, edgelist=esmall, width=6, alpha=0.5, edge_color=\"b\", style=\"dashed\"\n",
570 | " # )\n",
571 | "\n",
572 | " # labels\n",
573 | " # nx.draw_networkx_labels(G, pos, font_size=20, font_family=\"sans-serif\")\n",
574 | " edge_labels=nx.draw_networkx_edge_labels(G,pos,edge_labels=nx.get_edge_attributes(G, 'weight'))"
575 | ]
576 | },
577 | {
578 | "cell_type": "code",
579 | "execution_count": null,
580 | "metadata": {
581 | "scrolled": false,
582 | "slideshow": {
583 | "slide_type": "subslide"
584 | }
585 | },
586 | "outputs": [],
587 | "source": [
588 | "n = 10\n",
589 | "p = .3\n",
590 | "\n",
591 | "# Generate random graph \n",
592 | "G = nx.erdos_renyi_graph(n, p, seed=None, directed=False)\n",
593 | "\n",
594 | "m = len(G.edges)\n",
595 | "print('There are', m,'edges.')\n",
596 | "\n",
597 | "# Generate random integer edge weights in [1, maxWeight). Weights start at 1 because a 0 entry in the adjacency matrix is later treated as a missing edge\n",
598 | "maxWeight = 100\n",
599 | "weights = np.random.randint(1, maxWeight, size = m)\n",
600 | "\n",
601 | "for i, e in enumerate(G.edges()):\n",
602 | " G[e[0]][e[1]] ['weight'] = weights[i]\n",
603 | " \n",
604 | "drawGraphEx(G)"
605 | ]
606 | },
607 | {
608 | "cell_type": "code",
609 | "execution_count": null,
610 | "metadata": {
611 | "slideshow": {
612 | "slide_type": "subslide"
613 | }
614 | },
615 | "outputs": [],
616 | "source": [
617 | "A = nx.adjacency_matrix(G, weight = 'weight')\n",
618 | "A = A.todense() # Turn into dense matrix for ease of messing with it\n",
619 | "A = np.array(A) # Apparently I need to hand scikit-tda an array instead of a matrix, don't know why\n",
620 | "A = A.astype('float64') # Needed to let me put in np.inf\n",
621 | "A[ np.where(A == 0)] = np.inf\n",
622 | "np.fill_diagonal(A,0)\n",
623 | "\n",
624 | "im = plt.matshow(A, vmax = 100) # Note the np.inf values show up as white\n",
625 | "plt.colorbar(im)"
626 | ]
627 | },
628 | {
629 | "cell_type": "code",
630 | "execution_count": null,
631 | "metadata": {
632 | "slideshow": {
633 | "slide_type": "subslide"
634 | }
635 | },
636 | "outputs": [],
637 | "source": [
638 | "diagrams = ripser.ripser(A, distance_matrix=True)['dgms']\n",
639 | "persim.plot_diagrams(diagrams)\n",
641 | "# print(diagrams)"
642 | ]
643 | },
644 | {
645 | "cell_type": "markdown",
646 | "metadata": {
647 | "slideshow": {
648 | "slide_type": "subslide"
649 | }
650 | },
651 | "source": [
652 | "### An example from networks computed from time series embeddings \n",
653 | "\n",
654 | "- [Persistent Homology of Complex Networks for Dynamic State Detection. *Audun Myers, Elizabeth Munch, and Firas A. Khasawneh*. Physical Review E, 2019](https://doi.org/10.1103/PhysRevE.100.022314)"
655 | ]
656 | },
657 | {
658 | "cell_type": "code",
659 | "execution_count": null,
660 | "metadata": {
661 | "scrolled": true,
662 | "slideshow": {
663 | "slide_type": "subslide"
664 | }
665 | },
666 | "outputs": [],
667 | "source": [
668 | "Video(\"figures/OrdinalPartitionNetworkExample_AudunMyers.mp4\", width = 1000)"
669 | ]
670 | },
671 | {
672 | "cell_type": "markdown",
673 | "metadata": {
674 | "slideshow": {
675 | "slide_type": "subslide"
676 | }
677 | },
678 | "source": [
679 | "... run me for drawing stuff later..."
680 | ]
681 | },
682 | {
683 | "cell_type": "code",
684 | "execution_count": null,
685 | "metadata": {
686 | "slideshow": {
687 | "slide_type": "-"
688 | }
689 | },
690 | "outputs": [],
691 | "source": [
692 | "# Code to draw the next example later, here so it doesn't end up in a slide\n",
693 | "def drawNetworkExample(ts,G,diagram): # ts is the time series; node positions come from the global pos\n",
694 | " TextSize = 14\n",
695 | " plt.figure(2)\n",
696 | " plt.figure(figsize=(8,8))\n",
697 | " gs = gridspec.GridSpec(4, 2)\n",
698 | "\n",
699 | " ax = plt.subplot(gs[0:2, 0:2]) #plot time series\n",
700 | " plt.title('Time Series', size = TextSize)\n",
701 | " plt.plot(ts, 'k')\n",
702 | " plt.xticks(size = TextSize)\n",
703 | " plt.yticks(size = TextSize)\n",
704 | " plt.xlabel('$t$', size = TextSize)\n",
705 | " plt.ylabel('$x(t)$', size = TextSize)\n",
706 | " plt.xlim(0,len(ts))\n",
707 | "\n",
708 | " ax = plt.subplot(gs[2:4, 0])\n",
709 | " plt.title('Network', size = TextSize)\n",
710 | " nx.draw(G, pos, with_labels=False, font_weight='bold', node_color='blue',\n",
711 | " width=1, font_size = 10, node_size = 30)\n",
712 | "\n",
713 | " ax = plt.subplot(gs[2:4, 1])\n",
714 | " plt.title('Persistence Diagram', size = TextSize)\n",
715 | " MS = 3\n",
716 | " top = max(diagram[1].T[1])\n",
717 | " plt.plot([0,top*1.25],[0,top*1.25],'k--')\n",
718 | " plt.yticks( size = TextSize)\n",
719 | " plt.xticks(size = TextSize)\n",
720 | " plt.xlabel('Birth', size = TextSize)\n",
721 | " plt.ylabel('Death', size = TextSize)\n",
722 | " plt.plot(diagram[1].T[0],diagram[1].T[1] ,'go', markersize = MS+2)\n",
723 | " plt.xlim(0,top*1.25)\n",
724 | " plt.ylim(0,top*1.25)\n",
725 | "\n",
726 | " plt.subplots_adjust(hspace= 0.8)\n",
727 | " plt.subplots_adjust(wspace= 0.35)\n",
728 | " plt.show()\n",
729 | " \n",
730 | "def drawThisDiagram(diagram):\n",
731 | " TextSize = 14\n",
732 | " plt.title('Persistence Diagram', size = TextSize)\n",
733 | " MS = 3\n",
734 | " top = max(diagram[1].T[1])\n",
735 | " plt.plot([0,top*1.25],[0,top*1.25],'k--')\n",
736 | " plt.yticks( size = TextSize)\n",
737 | " plt.xticks(size = TextSize)\n",
738 | " plt.xlabel('Birth', size = TextSize)\n",
739 | " plt.ylabel('Death', size = TextSize)\n",
740 | " plt.plot(diagram[1].T[0],diagram[1].T[1] ,'go', markersize = MS+2)\n",
741 | " plt.xlim(0,top*1.25)\n",
742 | " plt.ylim(0,top*1.25)"
743 | ]
744 | },
745 | {
746 | "cell_type": "code",
747 | "execution_count": null,
748 | "metadata": {
749 | "slideshow": {
750 | "slide_type": "subslide"
751 | }
752 | },
753 | "outputs": [],
754 | "source": [
755 | "#generate time series\n",
756 | "system = 'rossler'\n",
757 | "dynamic_state = 'periodic'\n",
758 | "t, solution = DSL.DynamicSystems(system, dynamic_state)\n",
759 | "ts = solution[1]\n",
760 | "\n",
761 | "plt.plot(t,ts)"
762 | ]
763 | },
764 | {
765 | "cell_type": "code",
766 | "execution_count": null,
767 | "metadata": {
768 | "slideshow": {
769 | "slide_type": "subslide"
770 | }
771 | },
772 | "outputs": [],
773 | "source": [
774 | "#Get appropriate dimension and delay parameters for permutations\n",
775 | "tau = int(MsPE_tau(ts))\n",
776 | "n = 5\n",
777 | "\n",
778 | "#create adjacency matrix\n",
779 | "A = ordinal_partition_graph(ts, n, tau)\n",
780 | "\n",
781 | "#get networkx representation of network for plotting\n",
782 | "G, pos = make_network(A, position_iterations = 2000, remove_deg_zero_nodes = True)\n",
783 | "\n",
784 | "nx.draw(G, pos, with_labels=False, font_weight='bold', node_color='blue',\n",
785 | " width=1, font_size = 10, node_size = 30)"
786 | ]
787 | },
788 | {
789 | "cell_type": "code",
790 | "execution_count": null,
791 | "metadata": {
792 | "slideshow": {
793 | "slide_type": "subslide"
794 | }
795 | },
796 | "outputs": [],
797 | "source": [
798 | "#create distance matrix and calculate persistence diagram\n",
799 | "D, diagram = PH_network(A, method = 'unweighted', distance = 'shortest_path')\n",
800 | "# print('1-D Persistent Homology (loops): \\n', diagram[1])\n",
801 | "drawNetworkExample(ts,G,diagram)"
802 | ]
803 | },
804 | {
805 | "cell_type": "markdown",
806 | "metadata": {
807 | "slideshow": {
808 | "slide_type": "subslide"
809 | }
810 | },
811 | "source": [
812 | "### The same example but with a chaotic time series "
813 | ]
814 | },
815 | {
816 | "cell_type": "code",
817 | "execution_count": null,
818 | "metadata": {
819 | "slideshow": {
820 | "slide_type": "-"
821 | }
822 | },
823 | "outputs": [],
824 | "source": [
825 | "#generate time series\n",
826 | "system = 'rossler'\n",
827 | "dynamic_state = 'chaotic'\n",
828 | "t, solution = DSL.DynamicSystems(system, dynamic_state)\n",
829 | "ts = solution[1]\n",
830 | "\n",
831 | "#Get appropriate dimension and delay parameters for permutations\n",
832 | "tau = int(MsPE_tau(ts))\n",
833 | "n = 5\n",
834 | "\n",
835 | "#create adjacency matrix\n",
836 | "A = ordinal_partition_graph(ts, n, tau)\n",
837 | "\n",
838 | "#get networkx representation of network for plotting\n",
839 | "G, pos = make_network(A, position_iterations = 2000, remove_deg_zero_nodes = True)\n",
840 | "\n",
841 | "#create distance matrix and calculate persistence diagram\n",
842 | "D, diagram = PH_network(A, method = 'unweighted', distance = 'shortest_path')\n",
843 | "# print('1-D Persistent Homology (loops): \\n', diagram[1])\n"
844 | ]
845 | },
846 | {
847 | "cell_type": "code",
848 | "execution_count": null,
849 | "metadata": {
850 | "slideshow": {
851 | "slide_type": "subslide"
852 | }
853 | },
854 | "outputs": [],
855 | "source": [
856 | "drawNetworkExample(ts,G,diagram)"
857 | ]
858 | },
859 | {
860 | "cell_type": "markdown",
861 | "metadata": {
862 | "slideshow": {
863 | "slide_type": "subslide"
864 | }
865 | },
866 | "source": [
867 | "# Warning\n",
868 | "\n",
869 | "Persistence diagrams can have multiplicity!"
870 | ]
871 | },
872 | {
873 | "cell_type": "code",
874 | "execution_count": null,
875 | "metadata": {
876 | "slideshow": {
877 | "slide_type": "subslide"
878 | }
879 | },
880 | "outputs": [],
881 | "source": [
882 | "# print('1-D Persistent Homology (loops): \\n', diagram[1]) # Uncomment me!\n",
883 | "drawThisDiagram(diagram)"
884 | ]
885 | },
886 | {
887 | "cell_type": "markdown",
888 | "metadata": {
889 | "slideshow": {
890 | "slide_type": "slide"
891 | }
892 | },
893 | "source": [
894 | "# Distances between persistence diagrams "
895 | ]
896 | },
897 | {
898 | "cell_type": "code",
899 | "execution_count": null,
900 | "metadata": {
901 | "slideshow": {
902 | "slide_type": "subslide"
903 | }
904 | },
905 | "outputs": [],
906 | "source": [
907 | "# Make three example point clouds \n",
908 | "r = 1\n",
909 | "R = 2\n",
910 | "P1 = makePtCloud.Annulus(N=200, r=r, R=R, seed=None) # teaspoon data generation\n",
911 | "P2 = makePtCloud.Annulus(N=200, r=r, R=R, seed=None)\n",
912 | "P2[:,1] += 6\n",
913 | "P3 = DoubleAnnulus()\n",
914 | "P3 *= 1.1\n",
915 | "P3[:,0] += 6\n",
916 | "P3[:,1] += 3"
917 | ]
918 | },
919 | {
920 | "cell_type": "code",
921 | "execution_count": null,
922 | "metadata": {
923 | "scrolled": false,
924 | "slideshow": {
925 | "slide_type": "subslide"
926 | }
927 | },
928 | "outputs": [],
929 | "source": [
930 | "# plt.figure(figsize = (15,5))\n",
931 | "plt.scatter(P1[:,0],P1[:,1], label = 'P1')\n",
932 | "plt.scatter(P2[:,0],P2[:,1], label = 'P2')\n",
933 | "plt.scatter(P3[:,0],P3[:,1], label = 'P3')\n",
934 | "plt.axis('equal')\n",
935 | "plt.legend()"
936 | ]
937 | },
938 | {
939 | "cell_type": "code",
940 | "execution_count": null,
941 | "metadata": {
942 | "slideshow": {
943 | "slide_type": "subslide"
944 | }
945 | },
946 | "outputs": [],
947 | "source": [
948 | "# Compute their diagrams \n",
949 | "diagrams1 = ripser.ripser(P1)['dgms']\n",
950 | "diagrams2 = ripser.ripser(P2)['dgms']\n",
951 | "diagrams3 = ripser.ripser(P3)['dgms']\n",
952 | "\n",
953 | "Draw.drawDgm(diagrams1[1])\n",
954 | "Draw.drawDgm(diagrams2[1])\n",
955 | "Draw.drawDgm(diagrams3[1])\n"
956 | ]
957 | },
958 | {
959 | "cell_type": "markdown",
960 | "metadata": {
961 | "slideshow": {
962 | "slide_type": "subslide"
963 | }
964 | },
965 | "source": [
966 | "### Bottleneck Distance "
967 | ]
968 | },
969 | {
970 | "cell_type": "code",
971 | "execution_count": null,
972 | "metadata": {
973 | "slideshow": {
974 | "slide_type": "subslide"
975 | }
976 | },
977 | "outputs": [],
978 | "source": [
979 | "# Compute bottleneck distance using scikit-tda\n",
980 | "distance_bottleneck, (matching, D) = persim.bottleneck(diagrams1[1], diagrams2[1], matching=True)\n",
981 | "persim.visuals.bottleneck_matching(diagrams1[1], diagrams2[1], matching, D, labels=['P1 $H_1$', 'P2 $H_1$'])\n",
982 | "print('The bottleneck distance is', distance_bottleneck)\n",
983 | "# print(matching)\n",
984 | "# print(D)"
985 | ]
986 | },
987 | {
988 | "cell_type": "code",
989 | "execution_count": null,
990 | "metadata": {
991 | "scrolled": true,
992 | "slideshow": {
993 | "slide_type": "subslide"
994 | }
995 | },
996 | "outputs": [],
997 | "source": [
998 | "# Compute bottleneck of P1 and P3\n",
999 | "distance_bottleneck, (matching, D) = persim.bottleneck(diagrams1[1], diagrams3[1], matching=True)\n",
1000 | "persim.visuals.bottleneck_matching(diagrams1[1], diagrams3[1], matching, D, labels=['P1 $H_1$', 'P3 $H_1$'])\n",
1001 | "print('The bottleneck distance is', distance_bottleneck)"
1002 | ]
1003 | },
1004 | {
1005 | "cell_type": "markdown",
1006 | "metadata": {
1007 | "slideshow": {
1008 | "slide_type": "slide"
1009 | }
1010 | },
1011 | "source": [
1012 | "# But Liz, what should I do next? \n"
1013 | ]
1014 | },
1015 | {
1016 | "cell_type": "markdown",
1017 | "metadata": {
1018 | "slideshow": {
1019 | "slide_type": "fragment"
1020 | }
1021 | },
1022 | "source": [
1023 | "\n",
1024 | "Ha I have no idea! But here are some next steps to read about/try. Happy to discuss more during the small group session! \n"
1025 | ]
1026 | },
1027 | {
1028 | "cell_type": "markdown",
1029 | "metadata": {
1030 | "slideshow": {
1031 | "slide_type": "subslide"
1032 | }
1033 | },
1034 | "source": [
1035 | "- Different input data and/or filtrations\n",
1036 | " - Something other than clique complexes\n",
1037 | " - Directed complexes \n",
1038 | " - Image data \n",
1039 | "- ML and statistics interfaces: featurizations \n",
1040 | " - [Persistence images](https://www.jmlr.org/papers/v18/16-337.html)\n",
1041 | " - [Persistence landscapes](https://www.jmlr.org/papers/volume16/bubenik15a/bubenik15a.pdf)\n",
1042 | " - [Template functions](https://arxiv.org/abs/1902.07190)\n",
1043 | " - Lots and lots more......\n",
1044 | " "
1045 | ]
1046 | },
1047 | {
1048 | "cell_type": "markdown",
1049 | "metadata": {
1050 | "slideshow": {
1051 | "slide_type": "subslide"
1052 | }
1053 | },
1054 | "source": [
1055 | " \n",
1056 | "- Other TDA signatures \n",
1057 | " - Reeb graphs\n",
1058 | " - [Mapper graphs ](https://research.math.osu.edu/tgda/mapperPBG.pdf)\n",
1059 | " - Merge trees\n",
1060 | " - Contour trees \n",
1061 | " - Morse-Smale complexes"
1062 | ]
1063 | },
1064 | {
1065 | "cell_type": "markdown",
1066 | "metadata": {
1067 | "slideshow": {
1068 | "slide_type": "slide"
1069 | }
1070 | },
1071 | "source": [
1072 | "\n",
1073 | "\n",
1082 | "\n",
1083 | "# Thank you!!!\n",
1084 | "\n",
1085 | "- Content adapted from tutorials by [Audun Myers](https://www.audunmyers.com/) and [Chris Tralie](http://www.ctralie.com/)\n",
1086 | "- Get connected: [WinCompTop](https://awmadvance.org/research-networks/wincomptop-women-in-computational-topology/)\n",
1087 | "- My survey papers: [User's guide](https://learning-analytics.info/index.php/JLA/article/view/5196), [Bio focus](https://anatomypubs.onlinelibrary.wiley.com/doi/full/10.1002/dvdy.175)\n",
1088 | "\n",
1112 | "