├── README.md
├── environment.yml
└── notebook.ipynb


/README.md:
--------------------------------------------------------------------------------
# Leveraging Python for Spatial Data Science

[CARTO Spatial Data Science Bootcamps](https://spatial-data-science-conference.com/bootcamps/2023/) | March 23, 2023

Instructor: [Will Geary](https://www.linkedin.com/in/willgeary/), Senior Data Scientist at [Revel Transit](https://gorevel.com/)

In this workshop, we will perform spatial analysis on ridehail trips in NYC. We will cover the following:

1) Perform some necessary cleaning on the data
2) Illustrate the Modifiable Areal Unit Problem (MAUP) and why we shouldn't simply analyze the raw count of pickups
3) Calculate pickup density per zone
4) Make some simple choropleth maps
5) Perform a statistical Cluster and Outlier analysis. This requires a few steps:
    a) Create a Spatial Weights matrix
    b) Introduce the concept of spatial autocorrelation
    c) Introduce the Local Moran's I statistic (a local measurement of spatial autocorrelation)
    d) Detect statistically significant clusters (hotspots & coldspots) and outliers (diamonds & doughnuts)

See [notebook.ipynb](https://github.com/willgeary/PythonSpatialDataScience/blob/main/notebook.ipynb) for the complete code behind this workshop.

## [Optional] Setup Instructions

This workshop requires Python 3 and several packages, including `geopandas`, `contextily`, `seaborn`, `libpysal`, `esda`, `splot`, and `osmnx`.

You can install these packages in your own existing Python environment, or you can optionally create a conda environment that exactly matches mine with the following steps.

1) Clone the repository:

`git clone https://github.com/willgeary/PythonSpatialDataScience`

2) Change directory into the repository:

`cd PythonSpatialDataScience`

3) Create the conda environment:

`conda env create --file environment.yml --force`

4) Activate the conda environment:

`conda activate spatialstats`

5) Add the environment as a kernel to Jupyter Lab:

`python -m ipykernel install --sys-prefix --name spatialstats`

6) Launch Jupyter Lab:

`jupyter lab`

Proceed to launch `notebook.ipynb` using the `spatialstats` kernel.

--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
name: spatialstats
channels:
  - conda-forge
  - defaults
dependencies:
  - affine=2.4.0
  - anyio=3.6.2
  - appnope=0.1.3
  - argon2-cffi=21.3.0
  - argon2-cffi-bindings=21.2.0
  - arrow-cpp=11.0.0
  - asttokens=2.2.1
  - attrs=22.2.0
  - aws-c-auth=0.6.26
  - aws-c-cal=0.5.21
  - aws-c-common=0.8.14
  - aws-c-compression=0.2.16
  - aws-c-event-stream=0.2.20
  - aws-c-http=0.7.5
  - aws-c-io=0.13.19
  - aws-c-mqtt=0.8.6
  - aws-c-s3=0.2.7
  - aws-c-sdkutils=0.1.8
  - aws-checksums=0.1.14
  - aws-crt-cpp=0.19.8
  - aws-sdk-cpp=1.10.57
  - babel=2.12.1
  - backcall=0.2.0
  - backports=1.0
  - backports.functools_lru_cache=1.6.4
  - beautifulsoup4=4.11.2
  - bleach=6.0.0
  - blosc=1.21.2
  - boost-cpp=1.78.0
  - branca=0.6.0
  - brotli=1.0.9
  - brotli-bin=1.0.9
  - brotlipy=0.7.0
  - bzip2=1.0.8
  - c-ares=1.18.1
  - ca-certificates=2022.12.7
  - cairo=1.16.0
  - certifi=2022.12.7
  - cffi=1.15.1
  - cfitsio=4.2.0
  - charset-normalizer=2.1.1
  - click=8.1.3
  - click-plugins=1.1.1
  - cligj=0.7.2
  - contextily=1.3.0
  - contourpy=1.0.7
  - cryptography=39.0.2
  - curl=7.88.1
  - cycler=0.11.0
  - debugpy=1.6.6
  - decorator=5.1.1
  - defusedxml=0.7.1
  - descartes=1.1.0
  - entrypoints=0.4
  - esda=2.4.3
  - executing=1.2.0
  - expat=2.5.0
  - fiona=1.9.1
  - flit-core=3.8.0
  - folium=0.14.0
  - font-ttf-dejavu-sans-mono=2.37
  - font-ttf-inconsolata=3.000
  - font-ttf-source-code-pro=2.038
  - font-ttf-ubuntu=0.83
  - fontconfig=2.14.2
  - fonts-conda-ecosystem=1
  - fonts-conda-forge=1
  - fonttools=4.39.2
  - freetype=2.12.1
  - freexl=1.0.6
  - gdal=3.6.3
  - geographiclib=1.52
  - geopandas=0.12.2
  - geopandas-base=0.12.2
  - geopy=2.3.0
  - geos=3.11.1
  - geotiff=1.7.1
  - gettext=0.21.1
  - gflags=2.2.2
  - giddy=2.2.2
  - giflib=5.2.1
  - glog=0.6.0
  - hdf4=4.2.15
  - hdf5=1.12.2
  - icu=70.1
  - idna=3.4
  - importlib-metadata=6.0.0
  - importlib_metadata=6.0.0
  - importlib_resources=5.12.0
  - ipykernel=6.15.0
  - ipython=8.11.0
  - ipython_genutils=0.2.0
  - ipywidgets=8.0.4
  - jedi=0.18.2
  - jinja2=3.1.2
  - joblib=1.2.0
  - json-c=0.16
  - json5=0.9.5
  - jsonschema=4.17.3
  - jupyter_client=8.0.3
  - jupyter_core=5.3.0
  - jupyter_events=0.6.3
  - jupyter_server=2.4.0
  - jupyter_server_terminals=0.4.4
  - jupyterlab=3.5.0
  - jupyterlab_pygments=0.2.2
  - jupyterlab_server=2.20.0
  - jupyterlab_widgets=3.0.5
  - kealib=1.5.0
  - kiwisolver=1.4.4
  - krb5=1.20.1
  - lcms2=2.15
  - lerc=4.0.0
  - libabseil=20230125.0
  - libaec=1.0.6
  - libarrow=11.0.0
  - libblas=3.9.0
  - libbrotlicommon=1.0.9
  - libbrotlidec=1.0.9
  - libbrotlienc=1.0.9
  - libcblas=3.9.0
  - libcrc32c=1.1.2
  - libcurl=7.88.1
  - libcxx=15.0.7
  - libdeflate=1.17
  - libedit=3.1.20191231
  - libev=4.33
  - libevent=2.1.10
  - libffi=3.4.2
  - libgdal=3.6.3
  - libgfortran=5.0.0
  - libgfortran5=12.2.0
  - libglib=2.74.1
  - libgoogle-cloud=2.8.0
  - libgrpc=1.52.1
  - libiconv=1.17
  - libjpeg-turbo=2.1.5.1
  - libkml=1.3.0
  - liblapack=3.9.0
  - libnetcdf=4.9.1
  - libnghttp2=1.52.0
  - libopenblas=0.3.21
  - libpng=1.6.39
  - libpq=15.2
  - libprotobuf=3.21.12
  - libpysal=4.7.0
  - librttopo=1.1.0
  - libsodium=1.0.18
  - libspatialindex=1.9.3
  - libspatialite=5.0.1
  - libsqlite=3.40.0
  - libssh2=1.10.0
  - libthrift=0.18.1
  - libtiff=4.5.0
  - libutf8proc=2.8.0
  - libwebp-base=1.3.0
  - libxcb=1.13
  - libxml2=2.10.3
  - libzip=1.9.2
  - libzlib=1.2.13
  - llvm-openmp=15.0.7
  - lz4-c=1.9.4
  - mapclassify=2.5.0
  - markupsafe=2.1.2
  - matplotlib-base=3.7.1
  - matplotlib-inline=0.1.6
  - mercantile=1.2.1
  - mistune=2.0.5
  - munch=2.5.0
  - munkres=1.1.4
  - nbclassic=0.5.3
  - nbclient=0.7.2
  - nbconvert=7.2.9
  - nbconvert-core=7.2.9
  - nbconvert-pandoc=7.2.9
  - nbformat=5.7.3
  - ncurses=6.3
  - nest-asyncio=1.5.6
  - networkx=3.0
  - notebook=6.5.3
  - notebook-shim=0.2.2
  - nspr=4.35
  - nss=3.89
  - numpy=1.24.2
  - openjpeg=2.5.0
  - openssl=3.1.0
  - orc=1.8.3
  - packaging=23.0
  - pandas=1.5.3
  - pandoc=3.1.1
  - pandocfilters=1.5.0
  - parquet-cpp=1.5.1
  - parso=0.8.3
  - patsy=0.5.3
  - pcre2=10.40
  - pexpect=4.8.0
  - pickleshare=0.7.5
  - pillow=9.4.0
  - pip=23.0.1
  - pixman=0.40.0
  - pkgutil-resolve-name=1.3.10
  - platformdirs=3.1.1
  - pooch=1.7.0
  - poppler=23.03.0
  - poppler-data=0.4.12
  - postgresql=15.2
  - proj=9.1.1
  - prometheus_client=0.16.0
  - prompt-toolkit=3.0.38
  - prompt_toolkit=3.0.38
  - psutil=5.9.4
  - pthread-stubs=0.4
  - ptyprocess=0.7.0
  - pure_eval=0.2.2
  - pyarrow=11.0.0
  - pycparser=2.21
  - pygments=2.14.0
  - pyopenssl=23.0.0
  - pyparsing=3.0.9
  - pyproj=3.4.1
  - pyrsistent=0.19.3
  - pysocks=1.7.1
  - python=3.11.0
  - python-dateutil=2.8.2
  - python-fastjsonschema=2.16.3
  - python-json-logger=2.0.7
  - python_abi=3.11
  - pytz=2022.7.1
  - pyyaml=6.0
  - pyzmq=25.0.1
  - rasterio=1.3.6
  - re2=2023.02.02
  - readline=8.1.2
  - requests=2.28.2
  - rfc3339-validator=0.1.4
  - rfc3986-validator=0.1.1
  - rtree=1.0.1
  - scikit-learn=1.2.2
  - scipy=1.10.1
  - seaborn=0.12.2
  - seaborn-base=0.12.2
  - send2trash=1.8.0
  - setuptools=67.6.0
  - shapely=2.0.1
  - six=1.16.0
  - snappy=1.1.10
  - sniffio=1.3.0
  - snuggs=1.4.7
  - soupsieve=2.3.2.post1
  - splot=1.1.5.post1
  - spreg=1.3.0
  - sqlite=3.40.0
  - stack_data=0.6.2
  - statsmodels=0.13.5
  - terminado=0.17.1
  - threadpoolctl=3.1.0
  - tiledb=2.13.2
  - tinycss2=1.2.1
  - tk=8.6.12
  - tomli=2.0.1
  - tornado=6.2
  - traitlets=5.9.0
  - typing-extensions=4.5.0
  - typing_extensions=4.5.0
  - tzcode=2022g
  - tzdata=2022g
  - urllib3=1.26.15
  - wcwidth=0.2.6
  - webencodings=0.5.1
  - websocket-client=1.5.1
  - wheel=0.40.0
  - widgetsnbextension=4.0.5
  - xerces-c=3.2.4
  - xorg-libxau=1.0.9
  - xorg-libxdmcp=1.1.3
  - xyzservices=2023.2.0
  - xz=5.2.6
  - yaml=0.2.5
  - zeromq=4.3.4
  - zipp=3.15.0
  - zlib=1.2.13
  - zstd=1.5.2
  - pip:
    - osmnx==1.3.0

--------------------------------------------------------------------------------