├── .gitignore
├── README.md
└── Makefile

/.gitignore:
--------------------------------------------------------------------------------
1 | pluto_*
2 | summaries
3 | *.mk
4 |

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | get-pluto
2 | ---------
3 |
4 | This `Makefile` contains tasks to download and create block-level summaries of New York City cadastral data.
5 |
6 | The NYC Department of City Planning has released the [PLUTO](http://www1.nyc.gov/site/planning/data-maps/open-data/dwn-pluto-mappluto.page) dataset going back to 2002, which is great. Unfortunately, the data is released in five borough-level files, which means doing city-level analysis requires doing some merging off the bat.
7 |
8 | This Makefile solves that problem, and deals with some annoying features of changed formatting and missing data along the way.
9 |
10 | Save this Makefile to the folder into which you want to download PLUTO data, open up the command line and run `make`. All ~19 releases will (slowly) download, and the borough files for each will be merged into one citywide file.
11 |
12 | Also included are tasks to create block- and community-district-level summaries.
13 |
14 | ## Requires
15 |
16 | * `curl` (downloading data)
17 | * [GDAL](http://www.gdal.org) v 2.2+ (creating citywide files and block-level summaries)
18 |
19 | ## Basics
20 | ```
21 | make
22 | ```
23 | This will download the PLUTO data and merge borough files into citywide files. It puts each release into a separate folder (e.g. `pluto_15v1`).
24 |
25 | To download only some releases, use the `versions` variable, e.g.:
26 | ```
27 | make versions=16v1
28 | make versions="12v2 02b"
29 | ```
30 |
31 | Available versions on the DCP website: 16v1, 15v1, 14v2, 14v1, 13v2, 13v1, 12v2, 12v1, 11v2, 11v1, 10v2, 10v1, 09v2, 09v1, 07c, 06c, 05d, 04c, 02b.
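As a rough sketch of the per-release folders described above, the shell function below prints the citywide shapefile path one would expect for each requested version. It assumes the `pluto_<version>/mappluto_<version>` naming that the Makefile's `PLUTO` variable builds; the function name `pluto_outputs` is hypothetical and is not part of the Makefile.

```shell
# Hypothetical helper (not part of the Makefile): print the citywide
# shapefile path expected for each requested release, assuming the
# pluto_<version>/mappluto_<version> naming pattern.
pluto_outputs() {
    for v in "$@"; do
        printf 'pluto_%s/mappluto_%s.shp\n' "$v" "$v"
    done
}

# Mirrors: make versions="12v2 02b"
pluto_outputs 12v2 02b
```

Comparing this list with the files actually on disk is one quick way to check that a `make versions=...` run finished.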
32 |
33 | ## Summaries
34 |
35 | The following data fields are summarized:
36 | * Sums of area fields (e.g. `BldgArea`)
37 | * Counts of properties by residential unit, building class, proximity code and land use code
38 | * Minimum and maximum year built
39 | * Average lot depth and width
40 |
41 | Not all data fields are available in all releases.
42 |
43 | ### Block summaries
44 | ````
45 | make blocks
46 | ````
47 |
48 | This will join and summarize lots by tax block, which almost always matches the city block.
49 |
50 | ### Community district summaries
51 | ````
52 | make cds
53 | ````
54 |
55 | This will join and summarize lots by [community district](http://www1.nyc.gov/site/planning/community/jias-sources.page).
56 |
57 | ## License
58 |
59 | Copyright 2016 Neil Freeman, published under the MIT License.
60 |

--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | # get-pluto
2 | # Copyright 2016 Neil Freeman
3 | # contact@fakeisthenewreal.org
4 |
5 | # Permission is hereby granted, free of charge, to any person obtaining a copy of
6 | # this software and associated documentation files (the "Software"), to deal in
7 | # the Software without restriction, including without limitation the rights to use,
8 | # copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the
9 | # Software, and to permit persons to whom the Software is furnished to do so,
10 | # subject to the following conditions:
11 |
12 | # The above copyright notice and this permission notice shall be included in all
13 | # copies or substantial portions of the Software.
14 |
15 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
16 | # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR
17 | # A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR
18 | # COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
19 | # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20 | # WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21 |
22 | SHELL = bash
23 |
24 | DATA = http://www1.nyc.gov/assets/planning/download/zip/data-maps/open-data
25 |
26 | area_summary = SUM(LotArea) BlockArea, \
27 |     SUM(BldgArea) BldgArea, \
28 |     SUM(ResArea) ResArea, \
29 |     SUM(ComArea) ComArea, \
30 |     SUM(RetailArea) RetailArea, \
31 |     SUM(OfficeArea) OfficeArea, \
32 |     SUM(FactryArea) FactryArea
33 |
34 | area_summary_limited = SUM(LotArea) BlockArea, \
35 |     SUM(floorArea) BldgArea, \
36 |     SUM(resArea) ResArea, \
37 |     SUM(comArea) ComArea
38 |
39 | Units_summary = COUNT(*) Lots, \
40 |     SUM(UnitsRes) unitsRes, \
41 |     SUM(UnitsTotal) UnitsTotal
42 |
43 | UnitsRes_summary = SUM(CASE WHEN UnitsRes = 1 THEN 1 ELSE 0 END) unit_1_cnt, \
44 |     SUM(CASE WHEN UnitsRes = 2 THEN 2 ELSE 0 END) unit_2_cnt, \
45 |     SUM(CASE WHEN UnitsRes = 3 THEN 3 ELSE 0 END) unit_3_cnt, \
46 |     SUM(CASE WHEN UnitsRes = 4 THEN 4 ELSE 0 END) unit_4_cnt, \
47 |     SUM(CASE WHEN UnitsRes = 5 THEN 5 ELSE 0 END) unit_5_cnt, \
48 |     SUM(CASE WHEN UnitsRes = 6 THEN 6 ELSE 0 END) unit_6_cnt, \
49 |     SUM(CASE WHEN UnitsRes > 6 AND UnitsRes <= 10 THEN UnitsRes ELSE 0 END) unt6_10cnt, \
50 |     SUM(CASE WHEN UnitsRes > 10 THEN UnitsRes ELSE 0 END) unit11_cnt
51 |
52 | BuildingClass_summary = COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'A' THEN 1 END) cls_A_cnt, \
53 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'B' THEN 1 END) clas_B_cnt, \
54 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'C' THEN 1 END) clas_C_cnt, \
55 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'D' THEN 1 END) clas_D_cnt, \
56 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'E' THEN 1 END) clas_E_cnt, \
57 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'F' THEN 1 WHEN SUBSTR(BldgClass, 1, 1) = 'L' THEN 1 END) clas_FL_cnt, \
58 |
COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'H' THEN 1 END) clas_H_cnt, \
59 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'R' THEN 1 END) clas_R_cnt, \
60 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'S' THEN 1 END) clas_S_cnt, \
61 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'T' THEN 1 WHEN SUBSTR(BldgClass, 1, 1) = 'U' THEN 1 END) clas_TU_cnt, \
62 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'V' THEN 1 END) clas_V_cnt, \
63 |     COUNT(CASE WHEN SUBSTR(BldgClass, 1, 1) = 'W' THEN 1 END) clas_W_cnt
64 |
65 | LandUse_summary = COUNT(CASE WHEN LandUse = 1 THEN 1 END) lu_1_cnt, \
66 |     COUNT(CASE WHEN LandUse = 2 THEN 1 END) lu_2_cnt, \
67 |     COUNT(CASE WHEN LandUse = 3 THEN 1 END) lu_3_cnt, \
68 |     COUNT(CASE WHEN LandUse = 4 THEN 1 END) lu_4_cnt, \
69 |     COUNT(CASE WHEN LandUse = 5 THEN 1 END) lu_5_cnt, \
70 |     COUNT(CASE WHEN LandUse = 6 THEN 1 END) lu_6_cnt, \
71 |     COUNT(CASE WHEN LandUse = 7 THEN 1 END) lu_7_cnt, \
72 |     COUNT(CASE WHEN LandUse = 8 THEN 1 END) lu_8_cnt, \
73 |     COUNT(CASE WHEN LandUse = 9 THEN 1 END) lu_9_cnt, \
74 |     COUNT(CASE WHEN LandUse = 10 THEN 1 END) lu_10_cnt, \
75 |     COUNT(CASE WHEN LandUse = 11 THEN 1 END) lu_11_cnt
76 |
77 | # DCP changed the method for potential FAR fields circa 2010
78 | FAR_summary = SUM(BuiltFAR * LotArea) / SUM(LotArea) BuiltFAR, \
79 |     SUM(ResidFAR * LotArea) / SUM(LotArea) ResidFAR, \
80 |     SUM(CommFAR * LotArea) / SUM(LotArea) CommFAR, \
81 |     SUM(FacilFAR * LotArea) / SUM(LotArea) FacilFAR
82 |
83 | BsmtCode_summary = COUNT(CASE BsmtCode WHEN 0 THEN 1 END) Bsmt_0_cnt, \
84 |     COUNT(CASE BsmtCode WHEN 1 THEN 1 END) Bsmt_1_cnt, \
85 |     COUNT(CASE BsmtCode WHEN 2 THEN 1 END) Bsmt_2_cnt, \
86 |     COUNT(CASE BsmtCode WHEN 3 THEN 1 END) Bsmt_3_cnt, \
87 |     COUNT(CASE BsmtCode WHEN 4 THEN 1 END) Bsmt_4_cnt
88 |
89 | ProxCode_summary = COUNT(CASE ProxCode WHEN 1 THEN 1 END) DetacPCcnt, \
90 |     COUNT(CASE ProxCode WHEN 2 THEN 1 END) SemAtPCcnt, \
91 |     COUNT(CASE ProxCode WHEN 3 THEN 1 END) AttatPCcnt
92 |
93 | YearBuilt_summary = SUM(CASE WHEN
LENGTH(TRIM(HistDist)) > 0 THEN LotArea ELSE 0 END) / SUM(LotArea) HistDstPct, \
94 |     MIN(CASE WHEN YearBuilt = 0 THEN NULL ELSE YearBuilt END) MinYearBlt, \
95 |     MAX(CASE WHEN 2016 < YearBuilt THEN NULL \
96 |         WHEN YearBuilt = 0 THEN NULL ELSE YearBuilt END) MaxYrBlt, \
97 |     ROUND(AVG(CASE WHEN YearBuilt = 0 THEN NULL ELSE YearBuilt END), 0) AvgYearBlt
98 |
99 | Dims_summary = ROUND(AVG(LotDepth), 3) AvgLotDpth, \
100 |     ROUND(AVG(LotFront), 3) AvgLotFrnt
101 |
102 | pluto_summary = CD,\
103 |     $(Units_summary), \
104 |     $(area_summary), \
105 |     $(ProxCode_summary), \
106 |     $(BsmtCode_summary), \
107 |     $(FAR_summary), \
108 |     $(UnitsRes_summary), \
109 |     $(LandUse_summary), \
110 |     $(BuildingClass_summary), \
111 |     $(YearBuilt_summary), \
112 |     $(Dims_summary)
113 |
114 | pluto_summary_limited_far = CD,\
115 |     $(Units_summary), \
116 |     $(area_summary), \
117 |     $(ProxCode_summary), \
118 |     $(BsmtCode_summary), \
119 |     SUM(BuiltFAR * LotArea) / SUM(LotArea) BuiltFAR, \
120 |     SUM(MaxAllwFAR * LotArea) / SUM(LotArea) MaxAllwFAR, \
121 |     $(UnitsRes_summary), \
122 |     $(LandUse_summary), \
123 |     $(BuildingClass_summary), \
124 |     $(YearBuilt_summary), \
125 |     $(Dims_summary)
126 |
127 | pluto_summary_03 = CD,\
128 |     $(Units_summary), \
129 |     $(area_summary_limited), \
130 |     SUM(BLDGAREA) / SUM(LotArea) BuiltFAR, \
131 |     SUM(MaxAllwFAR * LotArea) / SUM(LotArea) MaxAllwFAR, \
132 |     $(UnitsRes_summary), \
133 |     $(BuildingClass_summary), \
134 |     $(YearBuilt_summary), \
135 |     $(Dims_summary)
136 |
137 | pluto_summary_02 = CAST(CAST(BoroCode as INTEGER) AS TEXT) || substr('00' || CAST(ccDist as TEXT), -2, 2) as CD, \
138 |     $(Units_summary), \
139 |     $(area_summary_limited), \
140 |     SUM(floorArea) / SUM(LotArea) BuiltFAR, \
141 |     SUM(MaxAllwFAR * LotArea) / SUM(LotArea) MaxAllwFAR, \
142 |     $(UnitsRes_summary), \
143 |     $(BuildingClass_summary), \
144 |     $(YearBuilt_summary), \
145 |     $(Dims_summary)
146 |
147 | versions = 16v2 16v1 \
148 |     15v1 \
149 |     14v2 14v1 \
150 |     13v2 13v1 \
151
| 12v2 12v1 \
152 |     11v2 11v1 \
153 |     10v2 10v1 \
154 |     09v2 09v1 \
155 |     07c \
156 |     06c \
157 |     05d \
158 |     04c \
159 |     03c \
160 |     02b
161 |
162 | PLUTO = $(foreach x,$1,pluto_$x/mappluto_$x)
163 |
164 | mapplutos = $(call PLUTO,$(versions))
165 |
166 | .PHONY: all zips mappluto cds blocks mysql mysql-%
167 | mappluto all: $(addsuffix .ind,$(mapplutos)) $(addsuffix .shp,$(mapplutos))
168 | cds: $(addsuffix _community_district.dbf,$(mapplutos))
169 | blocks: $(addsuffix _blocks.ind,$(mapplutos)) $(addsuffix _blocks.shp,$(mapplutos))
170 | changes: summaries/pluto_05_15_change.dbf summaries/pluto_05_10_change.dbf
171 | mysql: $(addprefix mysql-,$(versions))
172 | zips: $(addsuffix .zip,$(mapplutos))
173 |
174 | # summaries
175 | limited_far = 12v2 12v1 11v2 11v1 10v2 10v1 \
176 |     09v2 09v1 \
177 |     07c 06c 05d 04c
178 | area_03 = 03c
179 | area_02 = 02b
180 | standard = $(filter-out $(limited_far) $(area_03) $(area_02),$(versions))
181 |
182 | CD = ogr2ogr $@ $< -f 'ESRI Shapefile' -overwrite -dialect sqlite \
183 |     -sql "SELECT $(1) \
184 |     FROM $(basename $(