├── .gitignore
├── BLOG_POST.mkd
├── README
├── blockr.py
├── build_neighborhood.sh
├── dump
│   ├── LICENSE.txt
│   └── README.txt
├── geocrawlr.py
├── junk
│   ├── OsmApi.py
│   ├── README
│   ├── fetch_osm.py
│   ├── pull_photos.py
│   ├── samplr.py
│   └── up_one_level.py
├── leaves_from_woeid.py
├── mapnik_render.py
├── outliers.py
└── util
    ├── consolidate_geojson.py
    ├── geoplanet.py
    └── upload_photos.py
/.gitignore: -------------------------------------------------------------------------------- 1 | data 2 | *.pyc 3 | -------------------------------------------------------------------------------- /BLOG_POST.mkd: -------------------------------------------------------------------------------- 1 | # It's a beautiful day in the neighborhood! 2 | #####by Schuyler Erle 3 | ######8/5/2011 1:00pm 4 | ######San Francisco, CA 5 | 6 | A large part of our job at SimpleGeo consists of listening closely to our users, and trying to understand what kinds of geo-related tools will make their lives easier and their apps more awesome. One thing we hear about pretty regularly is the lack of freely available neighborhood boundaries for international cities. 7 | 8 | Now, SimpleGeo Context has had neighborhood boundaries for most major US cities ever since we launched the product. We’ve been asked about neighborhoods in cities outside the US, but, when we started looking, we didn’t immediately find a source that was available under a license that we could encourage you to freely reuse. So we decided to make our own! 9 | 10 | We’re pleased to announce the availability in SimpleGeo Context of neighborhood boundaries for the following twelve cities: 11 | 12 | * Amsterdam 13 | * Barcelona 14 | * Beijing 15 | * Berlin 16 | * Florence 17 | * London 18 | * Paris 19 | * Rome 20 | * Shanghai 21 | * Sydney 22 | * Tokyo 23 | * Vienna 24 | 25 | Additionally, we now have approximate boundaries for Paris’s arrondissements and Berlin’s ortsteils. Check out the Eiffel Tower in our Context demo – scroll down to see the map, and click “Features” on the right – or perhaps Westminster Abbey. You can also see some visualizations in [our Flickr stream](http://www.flickr.com/photos/simplegeo/sets/72157627358066594/). 26 | 27 | Now, neighborhoods are, in many ways, a unique form of geography. Some geographies are physical by nature: A park has boundaries, a road has a center line, et cetera. Most non-physical geographies have some legal existence, like a post code or a city or a province, where a statute or a treaty defines the boundaries of the geography. As an informal division of a city, a neighborhood’s boundaries are often both invisible and lacking in precise definition. Often, the conventionally accepted boundaries of a neighborhood ebb and flow over time, as the economics or demographics of the region change. Neighborhood boundaries are usually fuzzy, and frequently overlap in practice, in ways that other kinds of geography do not. 28 | 29 | So, we’ll be totally candid – our new international neighborhood dataset is definitely a work in progress. There are some evident issues with the new dataset, but we thought it better to release and then iterate, rather than wait indefinitely on impossible perfection. We hope to continue to refine and improve the data, as well as add lots of new cities. 30 | Due to the data sources we combined to produce them, all of the new neighborhood data in Context is licensed under the [Open Database License (ODbL)](http://opendatacommons.org/licenses/odbl/).
You can find the new neighborhoods in SimpleGeo Context, and you can also download the [whole data set](http://s3.amazonaws.com/simplegeo-public/neighborhoods_dump_20110804.zip). We hope you do awesome things with it! 31 | 32 | Read on for the technical details! 33 | 34 | Generating neighborhood boundaries for new cities actually turned out to be a pretty good trick. We didn’t have any source for boundaries themselves, but Flickr’s body of geotagged photos represents a pool of samples of neighborhood locations, because photos taken in cities often have a machine tag containing the Where On Earth ID (or “WoE ID”) for the corresponding neighborhood. We used the freely available [Yahoo! GeoPlanet data dumps](http://developer.yahoo.com/geo/geoplanet/data/) to identify the WoE IDs of neighborhoods — “Suburb” or “LocalAdmin” in the parlance of GeoPlanet — in the cities in which we were interested. We then used the Flickr API to draw a sample of geotagged photo locations for that WoE ID to establish a kind of “cloud” of points that roughly represent that neighborhood. 35 | 36 | At first, we tried generating a [Voronoi diagram](http://en.wikipedia.org/wiki/Voronoi_diagram) over the entire area of the city, and then merging the resulting shapes by WoE ID. This yielded “boundaries” that were very organic, and kind of weird looking. They didn’t correspond to our intuitions about how neighborhoods are structured in the minds of residents and visitors. In our experience, neighborhood boundaries in large cities often conform to the physical geography, such as the lines of roads and waterways, rather than cutting across city blocks, and even buildings. 37 | 38 | We turned to [OpenStreetMap](http://openstreetmap.org/) as a source for the physical geography of roads, railroads, and waterways, because OSM turns out to be a pretty good source for this sort of data in most of the world’s largest cities. After loading the [entire world of OSM](http://wiki.openstreetmap.org/wiki/Planet.osm) into a [PostGIS](http://postgis.refractions.net/) database, we take the linework for each city, and, treating it as a set of polygon boundaries, use [GRASS](http://osgeo.org/grass/) to clean up the data and generate a polygon for each “city block” in our area of interest. Using OSM, of course, means that the results need to be licensed ODbL, in order to respect the desire of the community that derivative works be shared alike. 39 | 40 | The rest of the work gets done in Python, using the excellent [Shapely](http://trac.gispython.org/lab/wiki/Shapely) library. First, we group the geotagged photo locations by neighborhood, and then filter them by [median absolute deviation](http://en.wikipedia.org/wiki/Median_absolute_deviation) to remove mistagged outliers. Next, we iterate over each city block, and tally up the weighted inverse distances of the n nearest geotagged photos to decide which neighborhood the block “belongs” to. After all the blocks are assigned, we extract the largest polygon for each neighborhood as its “core”, reassign blocks that are detached from their core to other nearby neighborhoods, and do a bit of cleanup. This surprisingly simple local-then-global approach yields pretty convincing results. 41 | 42 | We’ve considered two possible improvements for the future. We’ve experimented with an additional step that focuses on swapping blocks at the edges of neighborhoods to improve “compactness”, which intuitively feels like an important property of neighborhood boundaries, and we hope to revisit this soon.
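To make the outlier-filtering step mentioned above a bit more concrete, here is a minimal sketch of a median-absolute-deviation filter. It is purely illustrative: the function name, the `k` cutoff, and the sample coordinates are invented for this example, and the repository's `outliers.py` (which reads the tab-separated points files produced by the crawler) may differ in the details.

    # Illustrative sketch only: keep points within k MADs of the median location.
    def discard_outliers_simple(points, k=5.0):
        """points is a list of (lon, lat) tuples; returns the plausible ones."""
        lons = sorted(lon for lon, lat in points)
        lats = sorted(lat for lon, lat in points)
        median = (lons[len(lons) // 2], lats[len(lats) // 2])
        dev = [abs(lon - median[0]) + abs(lat - median[1]) for lon, lat in points]
        mad = sorted(dev)[len(dev) // 2] or 1e-9   # guard against a zero MAD
        return [p for p, d in zip(points, dev) if d <= k * mad]

    # One mistagged photo (Tokyo) in a cluster near the Eiffel Tower gets dropped:
    pts = [(2.2945, 48.8584), (2.2950, 48.8581), (2.2939, 48.8590),
           (2.2960, 48.8575), (139.6917, 35.6895)]
    print(discard_outliers_simple(pts))

Only the points that survive a filter like this are fed into the block-scoring step.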
43 | 44 | The other possible “improvement” has to do with the fact that, for the time being, the boundaries we’re providing are sharply defined. We felt that this might be easier for developers to work with, versus a set of neighborhood boundaries with variable overlap, but we’d really like your feedback about how this works out for you in practice. 45 | 46 | You can find the code on Github, if you’re interested: [http://github.com/simplegeo/betashapes/](http://github.com/simplegeo/betashapes/). The name of the code repository is left as an exercise for the reader. -------------------------------------------------------------------------------- /README: -------------------------------------------------------------------------------- 1 | Betashapes 2 | ========== 3 | created by Melissa Santos and Schuyler Erle 4 | (c) 2011 SimpleGeo, Inc. 5 | 6 | What is this? 7 | ------------- 8 | 9 | It's the code used by SimpleGeo to generate its international neighborhood 10 | dataset. 11 | 12 | See the blog post for an explanation: 13 | 14 | http://blog.simplegeo.com/2011/08/05/its-a-beautiful-day-in-the-neighborhood/ 15 | (The blog post content is now in the BLOG_POST.mkd file of this repo) 16 | 17 | Why's it here? 18 | -------------- 19 | 20 | We had fun writing it. We like giving stuff away. Maybe you'll find it useful. 21 | Maybe you'll improve it and send us a pull request! We provide no warranty, and 22 | no support. If it breaks, you get to keep the pieces. 23 | 24 | How's it work? 25 | -------------- 26 | 27 | Well, it helps if you download Yahoo's GeoPlanet dump, and load both it and all 28 | or some subset of Planet.osm into PostGIS. 29 | 30 | You'll need to create a data/ directory, and dump a mapping of WoE ID -> Name 31 | into a file called `data/names.txt`, and another mapping of Parent ID, Name, 32 | Type -> WoE ID into another file called `data/suburbs.txt`. This is stupid and 33 | could be done a lot more cleanly. 34 | 35 | Here is a sample of the names.txt we're using: 36 | 37 | 29372661 San Francisco Javier 38 | 772864 San Francisco de Paula 39 | 108040 Villa de San Francisco 40 | 142610 San Francisco Culhuacán 41 | 349422 San Francisco de Limache 42 | 12521721 San Francisco International Airport 43 | 44 | Here's a sample of the suburbs.txt: 45 | 46 | 44418 Streatham Common Suburb 20089509 47 | 44418 Upper Walthamstow Suburb 20089365 48 | 44418 Castelnau Suburb 20089570 49 | 44418 Harold Hill Suburb 22483 50 | 44418 Blackfriars Road Suburb 20094299 51 | 44418 Lampton Suburb 44314 52 | 44418 Lower Place Suburb 20089447 53 | 44418 Furzedown Suburb 20089510 54 | 44418 Crofton Suburb 20089334 55 | 44418 Collier's Wood Suburb 20089517 56 | 57 | Running build_neighborhood.sh takes over from there. 58 | 59 | What's in it? 60 | ------------- 61 | 62 | build_neighborhood.sh 63 | 64 | This shell script makes the magic happen. Depends on PostgreSQL and GRASS, 65 | in addition to all the other stuff in here. 66 | 67 | blockr.py 68 | 69 | The main neighborhood generation script. Takes a name file 70 | (tab-separated, mapping WoE ID to name), a GeoJSON FeatureCollection 71 | containing the block polygons to be assigned, and a points file (as 72 | generated by geocrawlr.py). 73 | 74 | Requires Shapely. 75 | 76 | outliers.py 77 | 78 | A module for reading points.txt files and discarding outlying points based 79 | on median absolute distance. If run as a script, prints the bounding box of 80 | the points after outliers are discarded. 81 | 82 | geocrawlr.py <woe_id> [<woe_id> ...]
83 | 84 | A script that crawls the Flickr API looking for geotagged photo records 85 | associated with the given woe_ids. Writes line-by-line, tab-separated 86 | values to stdout consisting of: Photo ID, WoE ID, Longitude, Latitude. 87 | Uses Flickr.API. You must have your FLICKR_KEY and FLICKR_SECRET set in the 88 | environment. 89 | 90 | geoplanet.py 91 | 92 | A utility script to query Y! GeoPlanet. Takes names, one per line, on stdin, 93 | queries GeoPlanet, and outputs the first WoE ID and name returned on stdout. 94 | Set YAHOO_APPID in your environment. 95 | 96 | mapnik_render.py 97 | 98 | A Mapnik script to visualize the neighborhood.json and blocks.json data 99 | together. 100 | 101 | leaves_from_woeid.py 102 | 103 | Walks a table of GeoPlanet data in PostgreSQL and fetches all the leaves 104 | descending from a given WoE ID. 105 | 106 | What's a "betashape"? 107 | --------------------- 108 | 109 | See: 110 | 111 | http://code.flickr.com/blog/2008/10/30/the-shape-of-alpha/ 112 | 113 | also see: 114 | 115 | http://code.flickr.com/blog/2009/01/12/living-in-the-donut-hole/ 116 | 117 | and for good measure: 118 | 119 | http://code.flickr.com/blog/2011/01/08/flickr-shapefiles-public-dataset-2-0/ 120 | 121 | Propers to Aaron Straup Cope for his ideas and encouragement. 122 | 123 | License 124 | ------- 125 | 126 | Copyright (c) 2011, SimpleGeo, Inc. 127 | All rights reserved. 128 | 129 | Redistribution and use in source and binary forms, with or without 130 | modification, are permitted provided that the following conditions are met: 131 | 132 | * Redistributions of source code must retain the above copyright 133 | notice, this list of conditions and the following disclaimer. 134 | * Redistributions in binary form must reproduce the above copyright 135 | notice, this list of conditions and the following disclaimer in the 136 | documentation and/or other materials provided with the distribution. 137 | * Neither the name of the SimpleGeo, Inc. nor the 138 | names of its contributors may be used to endorse or promote products 139 | derived from this software without specific prior written permission. 140 | 141 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 142 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 143 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 144 | DISCLAIMED. IN NO EVENT SHALL SIMPLEGEO, INC. BE LIABLE FOR ANY 145 | DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES 146 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 147 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND 148 | ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 149 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 150 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
151 | -------------------------------------------------------------------------------- /blockr.py: -------------------------------------------------------------------------------- 1 | from shapely.geometry import Point, Polygon, MultiPolygon, asShape 2 | from shapely.geometry.polygon import LinearRing 3 | from shapely.ops import cascaded_union, polygonize 4 | from shapely.prepared import prep 5 | from rtree import Rtree 6 | from outliers import load_points, discard_outliers 7 | import sys, json, math, pickle, os, geojson 8 | 9 | SAMPLE_SIZE = 20 10 | SCALE_FACTOR = 111111.0 # meters per degree latitude 11 | #ACTION_THRESHOLD = 2.0/math.sqrt(1000.0) # 1 point closer than 1km 12 | ACTION_THRESHOLD = 20.0/math.sqrt(1000.0) # 1 point closer than 1km 13 | AREA_BOUND = 0.001 14 | TARGET_ASSIGN_LEVEL = 0.75 15 | 16 | name_file, line_file, point_file = sys.argv[1:4] 17 | 18 | places = {} 19 | names = {} 20 | blocks = {} 21 | if os.path.exists(point_file + '.cache'): 22 | print >>sys.stderr, "Reading from %s cache..." % point_file 23 | names, blocks, places = pickle.load(file(point_file + ".cache")) 24 | blocks = map(asShape, blocks) 25 | else: 26 | all_names = {} 27 | count = 0 28 | for line in file(name_file): 29 | place_id, name = line.strip().split(None, 1) 30 | all_names[int(place_id)] = name 31 | count += 1 32 | if count % 1000 == 0: 33 | print >>sys.stderr, "\rRead %d names from %s." % (count, name_file), 34 | print >>sys.stderr, "\rRead %d names from %s." % (count, name_file) 35 | 36 | places = load_points(point_file) 37 | for place_id in places: 38 | names[place_id] = all_names.get(place_id, "") 39 | places = discard_outliers(places) 40 | 41 | lines = [] 42 | do_polygonize = False 43 | print >>sys.stderr, "Reading lines from %s..." % line_file, 44 | for feature in geojson.loads(file(line_file).read()): 45 | if feature.geometry.type in ('LineString', 'MultiLineString'): 46 | do_polygonize = True 47 | lines.append(asShape(feature.geometry.to_dict())) 48 | print >>sys.stderr, "%d lines read." % len(lines) 49 | if do_polygonize: 50 | print >>sys.stderr, "Polygonizing %d lines..." % (len(lines)), 51 | blocks = [poly.__geo_interface__ for poly in polygonize(lines)] 52 | print >>sys.stderr, "%d blocks formed." % len(blocks) 53 | else: 54 | blocks = [poly.__geo_interface__ for poly in lines] 55 | 56 | if not os.path.exists(point_file + '.cache'): 57 | print >>sys.stderr, "Caching points, blocks, and names ..." 58 | pickle.dump((names, blocks, places), file(point_file + ".cache", "w"), -1) 59 | blocks = map(asShape, blocks) 60 | 61 | points = [] 62 | place_list = set() 63 | count = 0 64 | for place_id, pts in places.items(): 65 | count += 1 66 | print >>sys.stderr, "\rPreparing %d of %d places..." % (count, len(places)), 67 | for pt in pts: 68 | place_list.add((len(points), pt+pt, None)) 69 | points.append((place_id, Point(pt))) 70 | print >>sys.stderr, "Indexing...", 71 | index = Rtree(place_list) 72 | print >>sys.stderr, "Done." 
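# score_block: rank candidate neighborhoods for one block polygon. For the
# SAMPLE_SIZE photo points nearest the block's centroid, the photo's place
# earns 1/sqrt(distance from the block to the photo), with degrees converted
# to metres via SCALE_FACTOR and floored at 1 m. Returns (score, place_id)
# pairs sorted best-first.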
73 | 74 | def score_block(polygon): 75 | centroid = polygon.centroid 76 | #prepared = prep(polygon) 77 | score = {} 78 | outside_samples = 0 79 | for item in index.nearest((centroid.x, centroid.y), num_results=SAMPLE_SIZE): 80 | place_id, point = points[item] 81 | score.setdefault(place_id, 0.0) 82 | #if prepared.contains(point): 83 | # score[place_id] += 1.0 84 | #else: 85 | score[place_id] += 1.0 / math.sqrt(max(polygon.distance(point)*SCALE_FACTOR, 1.0)) 86 | outside_samples += 1 87 | return list(reversed(sorted((sc, place_id) for place_id, sc in score.items()))) 88 | 89 | count = 0 90 | assigned_blocks = {} 91 | assigned_ct = 0 92 | unassigned = {} #keyed on the polygon's index in blocks 93 | for count in range(len(blocks)): 94 | polygon = blocks[count] 95 | print >>sys.stderr, "\rScoring %d of %d blocks..." % ((count+1), len(blocks)), 96 | if not polygon.is_valid: 97 | try: 98 | polygon = polygon.buffer(0) 99 | blocks[count] = polygon 100 | except: 101 | pass 102 | if not polygon.is_valid: 103 | continue 104 | if polygon.is_empty: continue 105 | if polygon.area > AREA_BOUND: continue 106 | 107 | scores = score_block(polygon) 108 | best, winner = scores[0] 109 | if best > ACTION_THRESHOLD: 110 | assigned_ct += 1 111 | assigned_blocks.setdefault(winner, []) 112 | assigned_blocks[winner].append(polygon) 113 | else: 114 | # if the block wasn't assigned hang onto the info about the winning nbhd 115 | unassigned[count] = (best, winner) 116 | print >>sys.stderr, "Done, assigned %d of %d blocks" % (assigned_ct, len(blocks)) 117 | 118 | new_threshold = ACTION_THRESHOLD 119 | while float(assigned_ct)/len(blocks) < TARGET_ASSIGN_LEVEL and len(unassigned) > 0: 120 | new_threshold -= 0.1 121 | print >>sys.stderr, "\rDropping threshold to %f1.3... " % new_threshold 122 | for blockindex in unassigned.keys(): 123 | best, winner = unassigned[blockindex] 124 | #if blocks[blockindex].is_empty: del(unassigned[blockindex]) 125 | if best > new_threshold: 126 | assigned_ct += 1 127 | assigned_blocks.setdefault(winner, []) 128 | assigned_blocks[winner].append(blocks[blockindex]) 129 | del unassigned[blockindex] 130 | print >>sys.stderr, "Done, assigned %d of %d blocks" % (assigned_ct, len(blocks)) 131 | 132 | 133 | polygons = {} 134 | count = 0 135 | for place_id in places.keys(): 136 | count += 1 137 | print >>sys.stderr, "\rMerging %d of %d boundaries..." % (count, len(places)), 138 | if place_id not in assigned_blocks: continue 139 | polygons[place_id] = cascaded_union(assigned_blocks[place_id]) 140 | print >>sys.stderr, "Done." 141 | 142 | count = 0 143 | orphans = [] 144 | for place_id, multipolygon in polygons.items(): 145 | count += 1 146 | print >>sys.stderr, "\rRemoving %d orphans from %d of %d polygons..." % (len(orphans), count, len(polygons)), 147 | if type(multipolygon) is not MultiPolygon: continue 148 | polygon_count = [0] * len(multipolygon) 149 | for i, polygon in enumerate(multipolygon.geoms): 150 | prepared = prep(polygon) 151 | for item in index.intersection(polygon.bounds): 152 | item_id, point = points[item] 153 | if item_id == place_id and prepared.intersects(point): 154 | polygon_count[i] += 1 155 | winner = max((c, i) for (i, c) in enumerate(polygon_count))[1] 156 | polygons[place_id] = multipolygon.geoms[winner] 157 | orphans.extend((place_id, p) for i, p in enumerate(multipolygon.geoms) if i != winner) 158 | print >>sys.stderr, "Done." 
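# Reassign the orphaned fragments: walk each orphan's score_block() ranking
# and merge it into the first already-built neighborhood (other than the one
# it came from) that it actually touches; sweep repeatedly until a pass makes
# no progress. Any leftovers are handed back to their original place below.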
159 | 160 | count = 0 161 | total = len(orphans) 162 | retries = 0 163 | unassigned = None 164 | while orphans: 165 | unassigned = [] 166 | for origin_id, orphan in orphans: 167 | count += 1 168 | changed = False 169 | print >>sys.stderr, "\rReassigning %d of %d orphans..." % (count-retries, total), 170 | for score, place_id in score_block(orphan): 171 | if place_id not in polygons: 172 | # Turns out we just wind up assigning tiny, inappropriate places 173 | #polygons[place_id] = orphan 174 | #changed = True 175 | continue 176 | elif place_id != origin_id and orphan.intersects(polygons[place_id]): 177 | polygons[place_id] = polygons[place_id].union(orphan) 178 | changed = True 179 | if changed: 180 | break 181 | if not changed: 182 | unassigned.append((origin_id, orphan)) 183 | retries += 1 184 | if len(unassigned) == len(orphans): 185 | # give up 186 | break 187 | orphans = unassigned 188 | print >>sys.stderr, "%d retried, %d unassigned." % (retries, len(unassigned)) 189 | 190 | print >>sys.stderr, "Returning remaining orphans to original places." 191 | for origin_id, orphan in orphans: 192 | if orphan.intersects(polygons[origin_id]): 193 | polygons[origin_id] = polygons[origin_id].union(orphan) 194 | 195 | print >>sys.stderr, "Try to assign the holes to neighboring neighborhoods." 196 | #merge the nbhds 197 | city = cascaded_union(polygons.values()) 198 | 199 | #pull out any holes in the resulting Polygon/Multipolygon 200 | if type(city) is Polygon: 201 | over = [city] 202 | elif type(city) is MultiPolygon: 203 | over = city.geoms 204 | else: 205 | print >>sys.stderr, "\rcity is of type %s, wtf." % (type(city)) 206 | 207 | holes = [] 208 | for poly in over: 209 | holes.extend((Polygon(LinearRing(interior.coords)) for interior in poly.interiors)) 210 | 211 | count = 0 212 | total = len(holes) 213 | retries = 0 214 | unassigned = None 215 | while holes: 216 | unassigned = [] 217 | for hole in holes: 218 | count += 1 219 | changed = False 220 | print >>sys.stderr, "\rReassigning %d of %d holes..." % (count-retries, total), 221 | for score, place_id in score_block(hole): 222 | if place_id not in polygons: 223 | # Turns out we just wind up assigning tiny, inappropriate places 224 | #nbhds[place_id] = hole 225 | #changed = True 226 | continue 227 | elif hole.intersects(polygons[place_id]): 228 | polygons[place_id] = polygons[place_id].union(hole) 229 | changed = True 230 | if changed: 231 | break 232 | if not changed: 233 | unassigned.append(hole) 234 | retries += 1 235 | if len(unassigned) == len(holes): 236 | # give up 237 | break 238 | holes = unassigned 239 | print >>sys.stderr, "%d retried, %d unassigned." % (retries, len(unassigned)) 240 | 241 | print >>sys.stderr, "Buffering polygons." 242 | for place_id, polygon in polygons.items(): 243 | if type(polygon) is Polygon: 244 | polygon = Polygon(polygon.exterior.coords) 245 | else: 246 | bits = [] 247 | for p in polygon.geoms: 248 | if type(p) is Polygon: 249 | bits.append(Polygon(p.exterior.coords)) 250 | polygon = MultiPolygon(bits) 251 | polygons[place_id] = polygon.buffer(0) 252 | 253 | 254 | print >>sys.stderr, "Writing output." 
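# Emit the result as a GeoJSON FeatureCollection on stdout: one Feature per
# place, carrying its WoE ID and name in the properties.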
255 | features = [] 256 | for place_id, poly in polygons.items(): 257 | features.append({ 258 | "type": "Feature", 259 | "id": place_id, 260 | "geometry": poly.__geo_interface__, 261 | "properties": {"woe_id": place_id, "name": names.get(place_id, "")} 262 | }) 263 | 264 | collection = { 265 | "type": "FeatureCollection", 266 | "features": features 267 | } 268 | 269 | print json.dumps(collection) 270 | 271 | -------------------------------------------------------------------------------- /build_neighborhood.sh: -------------------------------------------------------------------------------- 1 | NAME=$1 2 | WOEID=$2 3 | DBNAME=osm # you need to have planet.osm (or some relevant portion) imported 4 | DBPORT=5433 5 | #GRASS_LOCATION=/home/sderle/grass/Global/PERMANENT 6 | GRASS_LOCATION=/mnt/places/melissa/grass/Global/PERMANENT 7 | export GRASS_BATCH_JOB=$GRASS_LOCATION/neighborhood.$$ 8 | 9 | if [ ! -r data/names.txt -o ! -r data/suburbs.txt ]; then 10 | echo "data/names.txt (tab separated file mapping woe_id to name) is missing, or" 11 | echo "data/suburbs.txt (tab separated file mapping parent_id, name, type, woe_id) is missing" 12 | exit 1 13 | fi 14 | 15 | if [ ! -r data/photos_$WOEID.txt ]; then 16 | grep ^$WOEID data/suburbs.txt | cut -f4 | xargs python geocrawlr.py >data/photos_$WOEID.txt 17 | fi 18 | 19 | BBOX=`python outliers.py data/photos_$WOEID.txt` 20 | 21 | if [ ! -r data/blocks_$WOEID.json ]; then 22 | pgsql2shp -f tmp$WOEID.shp -p $DBPORT $DBNAME \ 23 | "select osm_id, way from planet_osm_line where way && 'BOX($BBOX)'::box2d and (highway is not null or waterway is not null)" \ 24 | || exit 1 25 | 26 | sed -e "s/WOEID/$WOEID/g" >$GRASS_BATCH_JOB < data/$NAME.json 37 | -------------------------------------------------------------------------------- /dump/LICENSE.txt: -------------------------------------------------------------------------------- 1 | ## ODC Open Database License (ODbL) 2 | 3 | ### Preamble 4 | 5 | The Open Database License (ODbL) is a license agreement intended to 6 | allow users to freely share, modify, and use this Database while 7 | maintaining this same freedom for others. Many databases are covered by 8 | copyright, and therefore this document licenses these rights. Some 9 | jurisdictions, mainly in the European Union, have specific rights that 10 | cover databases, and so the ODbL addresses these rights, too. Finally, 11 | the ODbL is also an agreement in contract for users of this Database to 12 | act in certain ways in return for accessing this Database. 13 | 14 | Databases can contain a wide variety of types of content (images, 15 | audiovisual material, and sounds all in the same database, for example), 16 | and so the ODbL only governs the rights over the Database, and not the 17 | contents of the Database individually. Licensors should use the ODbL 18 | together with another license for the contents, if the contents have a 19 | single set of rights that uniformly covers all of the contents. If the 20 | contents have multiple sets of different rights, Licensors should 21 | describe what rights govern what contents together in the individual 22 | record or in some other way that clarifies what rights apply. 
23 | 24 | Sometimes the contents of a database, or the database itself, can be 25 | covered by other rights not addressed here (such as private contracts, 26 | trade mark over the name, or privacy rights / data protection rights 27 | over information in the contents), and so you are advised that you may 28 | have to consult other documents or clear other rights before doing 29 | activities not covered by this License. 30 | 31 | ------ 32 | 33 | The Licensor (as defined below) 34 | 35 | and 36 | 37 | You (as defined below) 38 | 39 | agree as follows: 40 | 41 | ### 1.0 Definitions of Capitalised Words 42 | 43 | "Collective Database" – Means this Database in unmodified form as part 44 | of a collection of independent databases in themselves that together are 45 | assembled into a collective whole. A work that constitutes a Collective 46 | Database will not be considered a Derivative Database. 47 | 48 | "Convey" – As a verb, means Using the Database, a Derivative Database, 49 | or the Database as part of a Collective Database in any way that enables 50 | a Person to make or receive copies of the Database or a Derivative 51 | Database. Conveying does not include interaction with a user through a 52 | computer network, or creating and Using a Produced Work, where no 53 | transfer of a copy of the Database or a Derivative Database occurs. 54 | "Contents" – The contents of this Database, which includes the 55 | information, independent works, or other material collected into the 56 | Database. For example, the contents of the Database could be factual 57 | data or works such as images, audiovisual material, text, or sounds. 58 | 59 | "Database" – A collection of material (the Contents) arranged in a 60 | systematic or methodical way and individually accessible by electronic 61 | or other means offered under the terms of this License. 62 | 63 | "Database Directive" – Means Directive 96/9/EC of the European 64 | Parliament and of the Council of 11 March 1996 on the legal protection 65 | of databases, as amended or succeeded. 66 | 67 | "Database Right" – Means rights resulting from the Chapter III ("sui 68 | generis") rights in the Database Directive (as amended and as transposed 69 | by member states), which includes the Extraction and Re-utilisation of 70 | the whole or a Substantial part of the Contents, as well as any similar 71 | rights available in the relevant jurisdiction under Section 10.4. 72 | 73 | "Derivative Database" – Means a database based upon the Database, and 74 | includes any translation, adaptation, arrangement, modification, or any 75 | other alteration of the Database or of a Substantial part of the 76 | Contents. This includes, but is not limited to, Extracting or 77 | Re-utilising the whole or a Substantial part of the Contents in a new 78 | Database. 79 | 80 | "Extraction" – Means the permanent or temporary transfer of all or a 81 | Substantial part of the Contents to another medium by any means or in 82 | any form. 83 | 84 | "License" – Means this license agreement and is both a license of rights 85 | such as copyright and Database Rights and an agreement in contract. 86 | 87 | "Licensor" – Means the Person that offers the Database under the terms 88 | of this License. 89 | 90 | "Person" – Means a natural or legal person or a body of persons 91 | corporate or incorporate. 
92 | 93 | "Produced Work" – a work (such as an image, audiovisual material, text, 94 | or sounds) resulting from using the whole or a Substantial part of the 95 | Contents (via a search or other query) from this Database, a Derivative 96 | Database, or this Database as part of a Collective Database. 97 | 98 | "Publicly" – means to Persons other than You or under Your control by 99 | either more than 50% ownership or by the power to direct their 100 | activities (such as contracting with an independent consultant). 101 | 102 | "Re-utilisation" – means any form of making available to the public all 103 | or a Substantial part of the Contents by the distribution of copies, by 104 | renting, by online or other forms of transmission. 105 | 106 | "Substantial" – Means substantial in terms of quantity or quality or a 107 | combination of both. The repeated and systematic Extraction or 108 | Re-utilisation of insubstantial parts of the Contents may amount to the 109 | Extraction or Re-utilisation of a Substantial part of the Contents. 110 | 111 | "Use" – As a verb, means doing any act that is restricted by copyright 112 | or Database Rights whether in the original medium or any other; and 113 | includes without limitation distributing, copying, publicly performing, 114 | publicly displaying, and preparing derivative works of the Database, as 115 | well as modifying the Database as may be technically necessary to use it 116 | in a different mode or format. 117 | 118 | "You" – Means a Person exercising rights under this License who has not 119 | previously violated the terms of this License with respect to the 120 | Database, or who has received express permission from the Licensor to 121 | exercise rights under this License despite a previous violation. 122 | 123 | Words in the singular include the plural and vice versa. 124 | 125 | ### 2.0 What this License covers 126 | 127 | 2.1. Legal effect of this document. This License is: 128 | 129 | a. A license of applicable copyright and neighbouring rights; 130 | 131 | b. A license of the Database Right; and 132 | 133 | c. An agreement in contract between You and the Licensor. 134 | 135 | 2.2 Legal rights covered. This License covers the legal rights in the 136 | Database, including: 137 | 138 | a. Copyright. Any copyright or neighbouring rights in the Database. 139 | The copyright licensed includes any individual elements of the 140 | Database, but does not cover the copyright over the Contents 141 | independent of this Database. See Section 2.4 for details. Copyright 142 | law varies between jurisdictions, but is likely to cover: the Database 143 | model or schema, which is the structure, arrangement, and organisation 144 | of the Database, and can also include the Database tables and table 145 | indexes; the data entry and output sheets; and the Field names of 146 | Contents stored in the Database; 147 | 148 | b. Database Rights. Database Rights only extend to the Extraction and 149 | Re-utilisation of the whole or a Substantial part of the Contents. 150 | Database Rights can apply even when there is no copyright over the 151 | Database. Database Rights can also apply when the Contents are removed 152 | from the Database and are selected and arranged in a way that would 153 | not infringe any applicable copyright; and 154 | 155 | c. Contract. This is an agreement between You and the Licensor for 156 | access to the Database. In return you agree to certain conditions of 157 | use on this access as outlined in this License. 
158 | 159 | 2.3 Rights not covered. 160 | 161 | a. This License does not apply to computer programs used in the making 162 | or operation of the Database; 163 | 164 | b. This License does not cover any patents over the Contents or the 165 | Database; and 166 | 167 | c. This License does not cover any trademarks associated with the 168 | Database. 169 | 170 | 2.4 Relationship to Contents in the Database. The individual items of 171 | the Contents contained in this Database may be covered by other rights, 172 | including copyright, patent, data protection, privacy, or personality 173 | rights, and this License does not cover any rights (other than Database 174 | Rights or in contract) in individual Contents contained in the Database. 175 | For example, if used on a Database of images (the Contents), this 176 | License would not apply to copyright over individual images, which could 177 | have their own separate licenses, or one single license covering all of 178 | the rights over the images. 179 | 180 | ### 3.0 Rights granted 181 | 182 | 3.1 Subject to the terms and conditions of this License, the Licensor 183 | grants to You a worldwide, royalty-free, non-exclusive, terminable (but 184 | only under Section 9) license to Use the Database for the duration of 185 | any applicable copyright and Database Rights. These rights explicitly 186 | include commercial use, and do not exclude any field of endeavour. To 187 | the extent possible in the relevant jurisdiction, these rights may be 188 | exercised in all media and formats whether now known or created in the 189 | future. 190 | 191 | The rights granted cover, for example: 192 | 193 | a. Extraction and Re-utilisation of the whole or a Substantial part of 194 | the Contents; 195 | 196 | b. Creation of Derivative Databases; 197 | 198 | c. Creation of Collective Databases; 199 | 200 | d. Creation of temporary or permanent reproductions by any means and 201 | in any form, in whole or in part, including of any Derivative 202 | Databases or as a part of Collective Databases; and 203 | 204 | e. Distribution, communication, display, lending, making available, or 205 | performance to the public by any means and in any form, in whole or in 206 | part, including of any Derivative Database or as a part of Collective 207 | Databases. 208 | 209 | 3.2 Compulsory license schemes. For the avoidance of doubt: 210 | 211 | a. Non-waivable compulsory license schemes. In those jurisdictions in 212 | which the right to collect royalties through any statutory or 213 | compulsory licensing scheme cannot be waived, the Licensor reserves 214 | the exclusive right to collect such royalties for any exercise by You 215 | of the rights granted under this License; 216 | 217 | b. Waivable compulsory license schemes. In those jurisdictions in 218 | which the right to collect royalties through any statutory or 219 | compulsory licensing scheme can be waived, the Licensor waives the 220 | exclusive right to collect such royalties for any exercise by You of 221 | the rights granted under this License; and, 222 | 223 | c. Voluntary license schemes. The Licensor waives the right to collect 224 | royalties, whether individually or, in the event that the Licensor is 225 | a member of a collecting society that administers voluntary licensing 226 | schemes, via that society, from any exercise by You of the rights 227 | granted under this License. 
228 | 229 | 3.3 The right to release the Database under different terms, or to stop 230 | distributing or making available the Database, is reserved. Note that 231 | this Database may be multiple-licensed, and so You may have the choice 232 | of using alternative licenses for this Database. Subject to Section 233 | 10.4, all other rights not expressly granted by Licensor are reserved. 234 | 235 | ### 4.0 Conditions of Use 236 | 237 | 4.1 The rights granted in Section 3 above are expressly made subject to 238 | Your complying with the following conditions of use. These are important 239 | conditions of this License, and if You fail to follow them, You will be 240 | in material breach of its terms. 241 | 242 | 4.2 Notices. If You Publicly Convey this Database, any Derivative 243 | Database, or the Database as part of a Collective Database, then You 244 | must: 245 | 246 | a. Do so only under the terms of this License or another license 247 | permitted under Section 4.4; 248 | 249 | b. Include a copy of this License (or, as applicable, a license 250 | permitted under Section 4.4) or its Uniform Resource Identifier (URI) 251 | with the Database or Derivative Database, including both in the 252 | Database or Derivative Database and in any relevant documentation; and 253 | 254 | c. Keep intact any copyright or Database Right notices and notices 255 | that refer to this License. 256 | 257 | d. If it is not possible to put the required notices in a particular 258 | file due to its structure, then You must include the notices in a 259 | location (such as a relevant directory) where users would be likely to 260 | look for it. 261 | 262 | 4.3 Notice for using output (Contents). Creating and Using a Produced 263 | Work does not require the notice in Section 4.2. However, if you 264 | Publicly Use a Produced Work, You must include a notice associated with 265 | the Produced Work reasonably calculated to make any Person that uses, 266 | views, accesses, interacts with, or is otherwise exposed to the Produced 267 | Work aware that Content was obtained from the Database, Derivative 268 | Database, or the Database as part of a Collective Database, and that it 269 | is available under this License. 270 | 271 | a. Example notice. The following text will satisfy notice under 272 | Section 4.3: 273 | 274 | Contains information from DATABASE NAME, which is made available 275 | here under the Open Database License (ODbL). 276 | 277 | DATABASE NAME should be replaced with the name of the Database and a 278 | hyperlink to the URI of the Database. "Open Database License" should 279 | contain a hyperlink to the URI of the text of this License. If 280 | hyperlinks are not possible, You should include the plain text of the 281 | required URI's with the above notice. 282 | 283 | 4.4 Share alike. 284 | 285 | a. Any Derivative Database that You Publicly Use must be only under 286 | the terms of: 287 | 288 | i. This License; 289 | 290 | ii. A later version of this License similar in spirit to this 291 | License; or 292 | 293 | iii. A compatible license. 294 | 295 | If You license the Derivative Database under one of the licenses 296 | mentioned in (iii), You must comply with the terms of that license. 297 | 298 | b. For the avoidance of doubt, Extraction or Re-utilisation of the 299 | whole or a Substantial part of the Contents into a new database is a 300 | Derivative Database and must comply with Section 4.4. 301 | 302 | c. Derivative Databases and Produced Works. 
A Derivative Database is 303 | Publicly Used and so must comply with Section 4.4. if a Produced Work 304 | created from the Derivative Database is Publicly Used. 305 | 306 | d. Share Alike and additional Contents. For the avoidance of doubt, 307 | You must not add Contents to Derivative Databases under Section 4.4 a 308 | that are incompatible with the rights granted under this License. 309 | 310 | e. Compatible licenses. Licensors may authorise a proxy to determine 311 | compatible licenses under Section 4.4 a iii. If they do so, the 312 | authorised proxy's public statement of acceptance of a compatible 313 | license grants You permission to use the compatible license. 314 | 315 | 316 | 4.5 Limits of Share Alike. The requirements of Section 4.4 do not apply 317 | in the following: 318 | 319 | a. For the avoidance of doubt, You are not required to license 320 | Collective Databases under this License if You incorporate this 321 | Database or a Derivative Database in the collection, but this License 322 | still applies to this Database or a Derivative Database as a part of 323 | the Collective Database; 324 | 325 | b. Using this Database, a Derivative Database, or this Database as 326 | part of a Collective Database to create a Produced Work does not 327 | create a Derivative Database for purposes of Section 4.4; and 328 | 329 | c. Use of a Derivative Database internally within an organisation is 330 | not to the public and therefore does not fall under the requirements 331 | of Section 4.4. 332 | 333 | 4.6 Access to Derivative Databases. If You Publicly Use a Derivative 334 | Database or a Produced Work from a Derivative Database, You must also 335 | offer to recipients of the Derivative Database or Produced Work a copy 336 | in a machine readable form of: 337 | 338 | a. The entire Derivative Database; or 339 | 340 | b. A file containing all of the alterations made to the Database or 341 | the method of making the alterations to the Database (such as an 342 | algorithm), including any additional Contents, that make up all the 343 | differences between the Database and the Derivative Database. 344 | 345 | The Derivative Database (under a.) or alteration file (under b.) must be 346 | available at no more than a reasonable production cost for physical 347 | distributions and free of charge if distributed over the internet. 348 | 349 | 4.7 Technological measures and additional terms 350 | 351 | a. This License does not allow You to impose (except subject to 352 | Section 4.7 b.) any terms or any technological measures on the 353 | Database, a Derivative Database, or the whole or a Substantial part of 354 | the Contents that alter or restrict the terms of this License, or any 355 | rights granted under it, or have the effect or intent of restricting 356 | the ability of any person to exercise those rights. 357 | 358 | b. Parallel distribution. You may impose terms or technological 359 | measures on the Database, a Derivative Database, or the whole or a 360 | Substantial part of the Contents (a "Restricted Database") in 361 | contravention of Section 4.74 a. only if You also make a copy of the 362 | Database or a Derivative Database available to the recipient of the 363 | Restricted Database: 364 | 365 | i. That is available without additional fee; 366 | 367 | ii. 
That is available in a medium that does not alter or restrict 368 | the terms of this License, or any rights granted under it, or have 369 | the effect or intent of restricting the ability of any person to 370 | exercise those rights (an "Unrestricted Database"); and 371 | 372 | iii. The Unrestricted Database is at least as accessible to the 373 | recipient as a practical matter as the Restricted Database. 374 | 375 | c. For the avoidance of doubt, You may place this Database or a 376 | Derivative Database in an authenticated environment, behind a 377 | password, or within a similar access control scheme provided that You 378 | do not alter or restrict the terms of this License or any rights 379 | granted under it or have the effect or intent of restricting the 380 | ability of any person to exercise those rights. 381 | 382 | 4.8 Licensing of others. You may not sublicense the Database. Each time 383 | You communicate the Database, the whole or Substantial part of the 384 | Contents, or any Derivative Database to anyone else in any way, the 385 | Licensor offers to the recipient a license to the Database on the same 386 | terms and conditions as this License. You are not responsible for 387 | enforcing compliance by third parties with this License, but You may 388 | enforce any rights that You have over a Derivative Database. You are 389 | solely responsible for any modifications of a Derivative Database made 390 | by You or another Person at Your direction. You may not impose any 391 | further restrictions on the exercise of the rights granted or affirmed 392 | under this License. 393 | 394 | ### 5.0 Moral rights 395 | 396 | 5.1 Moral rights. This section covers moral rights, including any rights 397 | to be identified as the author of the Database or to object to treatment 398 | that would otherwise prejudice the author's honour and reputation, or 399 | any other derogatory treatment: 400 | 401 | a. For jurisdictions allowing waiver of moral rights, Licensor waives 402 | all moral rights that Licensor may have in the Database to the fullest 403 | extent possible by the law of the relevant jurisdiction under Section 404 | 10.4; 405 | 406 | b. If waiver of moral rights under Section 5.1 a in the relevant 407 | jurisdiction is not possible, Licensor agrees not to assert any moral 408 | rights over the Database and waives all claims in moral rights to the 409 | fullest extent possible by the law of the relevant jurisdiction under 410 | Section 10.4; and 411 | 412 | c. For jurisdictions not allowing waiver or an agreement not to assert 413 | moral rights under Section 5.1 a and b, the author may retain their 414 | moral rights over certain aspects of the Database. 415 | 416 | Please note that some jurisdictions do not allow for the waiver of moral 417 | rights, and so moral rights may still subsist over the Database in some 418 | jurisdictions. 419 | 420 | ### 6.0 Fair dealing, Database exceptions, and other rights not affected 421 | 422 | 6.1 This License does not affect any rights that You or anyone else may 423 | independently have under any applicable law to make any use of this 424 | Database, including without limitation: 425 | 426 | a. Exceptions to the Database Right including: Extraction of Contents 427 | from non-electronic Databases for private purposes, Extraction for 428 | purposes of illustration for teaching or scientific research, and 429 | Extraction or Re-utilisation for public security or an administrative 430 | or judicial procedure. 431 | 432 | b. 
Fair dealing, fair use, or any other legally recognised limitation 433 | or exception to infringement of copyright or other applicable laws. 434 | 435 | 6.2 This License does not affect any rights of lawful users to Extract 436 | and Re-utilise insubstantial parts of the Contents, evaluated 437 | quantitatively or qualitatively, for any purposes whatsoever, including 438 | creating a Derivative Database (subject to other rights over the 439 | Contents, see Section 2.4). The repeated and systematic Extraction or 440 | Re-utilisation of insubstantial parts of the Contents may however amount 441 | to the Extraction or Re-utilisation of a Substantial part of the 442 | Contents. 443 | 444 | ### 7.0 Warranties and Disclaimer 445 | 446 | 7.1 The Database is licensed by the Licensor "as is" and without any 447 | warranty of any kind, either express, implied, or arising by statute, 448 | custom, course of dealing, or trade usage. Licensor specifically 449 | disclaims any and all implied warranties or conditions of title, 450 | non-infringement, accuracy or completeness, the presence or absence of 451 | errors, fitness for a particular purpose, merchantability, or otherwise. 452 | Some jurisdictions do not allow the exclusion of implied warranties, so 453 | this exclusion may not apply to You. 454 | 455 | ### 8.0 Limitation of liability 456 | 457 | 8.1 Subject to any liability that may not be excluded or limited by law, 458 | the Licensor is not liable for, and expressly excludes, all liability 459 | for loss or damage however and whenever caused to anyone by any use 460 | under this License, whether by You or by anyone else, and whether caused 461 | by any fault on the part of the Licensor or not. This exclusion of 462 | liability includes, but is not limited to, any special, incidental, 463 | consequential, punitive, or exemplary damages such as loss of revenue, 464 | data, anticipated profits, and lost business. This exclusion applies 465 | even if the Licensor has been advised of the possibility of such 466 | damages. 467 | 468 | 8.2 If liability may not be excluded by law, it is limited to actual and 469 | direct financial loss to the extent it is caused by proved negligence on 470 | the part of the Licensor. 471 | 472 | ### 9.0 Termination of Your rights under this License 473 | 474 | 9.1 Any breach by You of the terms and conditions of this License 475 | automatically terminates this License with immediate effect and without 476 | notice to You. For the avoidance of doubt, Persons who have received the 477 | Database, the whole or a Substantial part of the Contents, Derivative 478 | Databases, or the Database as part of a Collective Database from You 479 | under this License will not have their licenses terminated provided 480 | their use is in full compliance with this License or a license granted 481 | under Section 4.8 of this License. Sections 1, 2, 7, 8, 9 and 10 will 482 | survive any termination of this License. 483 | 484 | 9.2 If You are not in breach of the terms of this License, the Licensor 485 | will not terminate Your rights under it. 486 | 487 | 9.3 Unless terminated under Section 9.1, this License is granted to You 488 | for the duration of applicable rights in the Database. 489 | 490 | 9.4 Reinstatement of rights. If you cease any breach of the terms and 491 | conditions of this License, then your full rights under this License 492 | will be reinstated: 493 | 494 | a. 
Provisionally and subject to permanent termination until the 60th 495 | day after cessation of breach; 496 | 497 | b. Permanently on the 60th day after cessation of breach unless 498 | otherwise reasonably notified by the Licensor; or 499 | 500 | c. Permanently if reasonably notified by the Licensor of the 501 | violation, this is the first time You have received notice of 502 | violation of this License from the Licensor, and You cure the 503 | violation prior to 30 days after your receipt of the notice. 504 | 505 | Persons subject to permanent termination of rights are not eligible to 506 | be a recipient and receive a license under Section 4.8. 507 | 508 | 9.5 Notwithstanding the above, Licensor reserves the right to release 509 | the Database under different license terms or to stop distributing or 510 | making available the Database. Releasing the Database under different 511 | license terms or stopping the distribution of the Database will not 512 | withdraw this License (or any other license that has been, or is 513 | required to be, granted under the terms of this License), and this 514 | License will continue in full force and effect unless terminated as 515 | stated above. 516 | 517 | ### 10.0 General 518 | 519 | 10.1 If any provision of this License is held to be invalid or 520 | unenforceable, that must not affect the validity or enforceability of 521 | the remainder of the terms and conditions of this License and each 522 | remaining provision of this License shall be valid and enforced to the 523 | fullest extent permitted by law. 524 | 525 | 10.2 This License is the entire agreement between the parties with 526 | respect to the rights granted here over the Database. It replaces any 527 | earlier understandings, agreements or representations with respect to 528 | the Database. 529 | 530 | 10.3 If You are in breach of the terms of this License, You will not be 531 | entitled to rely on the terms of this License or to complain of any 532 | breach by the Licensor. 533 | 534 | 10.4 Choice of law. This License takes effect in and will be governed by 535 | the laws of the relevant jurisdiction in which the License terms are 536 | sought to be enforced. If the standard suite of rights granted under 537 | applicable copyright law and Database Rights in the relevant 538 | jurisdiction includes additional rights not granted under this License, 539 | these additional rights are granted in this License in order to meet the 540 | terms of this License. 541 | 542 | -------------------------------------------------------------------------------- /dump/README.txt: -------------------------------------------------------------------------------- 1 | -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 2 | SimpleGeo International Neighborhoods Dump 3 | -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 4 | 5 | This archive contains the SimpleGeo International Neighborhoods dataset from 04 6 | August 2011 in various formats. 7 | 8 | See the blog post for more details on the dataset: 9 | 10 | http://blog.simplegeo.com/2011/08/05/its-a-beautiful-day-in-the-neighborhood/ 11 | 12 | ------------- 13 | What's Inside 14 | ------------- 15 | 16 | geojson/ GeoJSON, one per city 17 | kml/ KML, one per city 18 | shp/ A single ESRI Shapefile of the entire dataset 19 | 20 | Each record in the Shapefile also includes the name of the city, the WoE ID of 21 | the city (parent_id), and the WoE feature type (either "Suburb" in the case of 22 | informal neighborhoods, or "LocalAdmin" in the case of formal city divisions). 
23 | 24 | Otherwise, the three formats provided in this archive are functionally 25 | identical. 26 | 27 | ------------------- 28 | Where It Comes From 29 | ------------------- 30 | 31 | SimpleGeo produced this dataset using Open Source software published at 32 | http://github.com/simplegeo/betashapes/. 33 | 34 | This dataset is derived from: 35 | 36 | * Yahoo! GeoPlanet (http://developer.yahoo.com/geo/geoplanet/data/) 37 | * OpenStreetMap (http://openstreetmap.org/) 38 | * the public Flickr API (http://www.flickr.com/services/api/) 39 | 40 | Because the dataset is based on OSM, we make it available to you under the ODC 41 | Open Database License 1.0. Please see the license summary and disclaimer below, 42 | and the full license text as given in LICENSE.txt. 43 | 44 | SimpleGeo makes it easy for developers to build location-aware applications. 45 | Find out more at http://simplegeo.com/! 46 | 47 | --------------------------------- 48 | What You're Welcome to Do With It 49 | --------------------------------- 50 | 51 | You are free: 52 | 53 | To Share: To copy, distribute and use the database. 54 | To Create: To produce works from the database. 55 | To Adapt: To modify, transform and build upon the database. 56 | 57 | As long as you: 58 | 59 | Attribute: You must attribute any public use of the database, or works 60 | produced from the database, in the manner specified in the ODbL. For any 61 | use or redistribution of the database, or works produced from it, you must 62 | make clear to others the license of the database and keep intact any 63 | notices on the original database. 64 | 65 | Share-Alike: If you publicly use any adapted version of this database, or 66 | works produced from an adapted database, you must also offer that adapted 67 | database under the ODbL. 68 | 69 | Keep open: If you redistribute the database, or an adapted version of it, 70 | then you may use technological measures that restrict the work (such as 71 | DRM) as long as you also redistribute a version without such measures. 72 | 73 | Disclaimer: 74 | 75 | The above summary is not the license text. It is simply a handy reference 76 | for understanding the ODbL 1.0 — it is a human-readable expression of some 77 | of its key terms. This summary has no legal value, and its contents do not 78 | appear in the actual license. Read the full ODbL 1.0 license text for the 79 | exact terms that apply. 80 | 81 | THIS DATABASE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 82 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 83 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 84 | DISCLAIMED. IN NO EVENT SHALL SIMPLEGEO, INC. BE LIABLE FOR ANY DIRECT, 85 | INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 86 | BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 87 | DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF 88 | LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE 89 | OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DATABASE, EVEN IF 90 | ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 91 | 92 | The full text of the license is given in the LICENSE.txt file. It can also be found at 93 | http://opendatacommons.org/licenses/odbl/1-0/. 
94 | 95 | =30= 96 | -------------------------------------------------------------------------------- /geocrawlr.py: -------------------------------------------------------------------------------- 1 | import Flickr.API 2 | import json, time, sys, os 3 | 4 | FLICKR_KEY = os.environ["FLICKR_KEY"] 5 | FLICKR_SECRET = os.environ["FLICKR_SECRET"] 6 | 7 | START_PAGE = 1 8 | END_PAGE = 10 9 | 10 | api = Flickr.API.API(FLICKR_KEY, FLICKR_SECRET) 11 | 12 | for woe_id in map(int, sys.argv[1:]): 13 | print >>sys.stderr, "WOEID:", woe_id 14 | page = total_pages = START_PAGE 15 | 16 | while page <= total_pages: 17 | print >>sys.stderr, ">>> Reading %d of %d... " % (page, total_pages), 18 | request = Flickr.API.Request( 19 | method="flickr.photos.search", 20 | format="json", 21 | nojsoncallback=1, 22 | sort="interestingness-desc", 23 | page=page, 24 | woe_id=woe_id, 25 | extras="geo", 26 | min_date_taken="2007-01-01 00:00:00" 27 | ) 28 | start = time.time() 29 | response = None 30 | while response is None: 31 | try: 32 | response = api.execute_request(request).read() 33 | except Exception, e: 34 | print >>sys.stderr, "Retrying due to:", e 35 | try: 36 | result = json.loads(response) 37 | result = result["photos"] 38 | print >>sys.stderr, "%d results, %.1fs elapsed." % (len(result["photo"]),time.time()-start) 39 | for item in result["photo"]: 40 | try: 41 | print "\t".join(str(item[k]) for k in ("id","woeid","longitude","latitude")) 42 | except Exception, e: 43 | print >>sys.stderr, e 44 | total_pages = min(int(result["pages"]), END_PAGE) 45 | #time.sleep(1.0) 46 | except Exception, e: 47 | print >>sys.stderr, e 48 | page += 1 49 | -------------------------------------------------------------------------------- /junk/OsmApi.py: -------------------------------------------------------------------------------- 1 | #-*- coding: utf-8 -*- 2 | 3 | ########################################################################### 4 | ## ## 5 | ## Copyrights Etienne Chové 2009-2010 ## 6 | ## ## 7 | ## This program is free software: you can redistribute it and/or modify ## 8 | ## it under the terms of the GNU General Public License as published by ## 9 | ## the Free Software Foundation, either version 3 of the License, or ## 10 | ## (at your option) any later version. ## 11 | ## ## 12 | ## This program is distributed in the hope that it will be useful, ## 13 | ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## 14 | ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## 15 | ## GNU General Public License for more details. ## 16 | ## ## 17 | ## You should have received a copy of the GNU General Public License ## 18 | ## along with this program. If not, see . 
## 19 | ## ## 20 | ########################################################################### 21 | 22 | ## HomePage : http://wiki.openstreetmap.org/wiki/PythonOsmApi 23 | 24 | ########################################################################### 25 | ## History ## 26 | ########################################################################### 27 | ## 0.2.19 2010-05-24 Add debug message on ApiError ## 28 | ## 0.2.18 2010-04-20 Fix ChangesetClose and _http_request ## 29 | ## 0.2.17 2010-01-02 Capabilities implementation ## 30 | ## 0.2.16 2010-01-02 ChangesetsGet by Alexander Rampp ## 31 | ## 0.2.15 2009-12-16 xml encoding error for < and > ## 32 | ## 0.2.14 2009-11-20 changesetautomulti parameter ## 33 | ## 0.2.13 2009-11-16 modify instead update for osc ## 34 | ## 0.2.12 2009-11-14 raise ApiError on 4xx errors -- Xoff ## 35 | ## 0.2.11 2009-10-14 unicode error on ChangesetUpload ## 36 | ## 0.2.10 2009-10-14 RelationFullRecur definition ## 37 | ## 0.2.9 2009-10-13 automatic changeset management ## 38 | ## ChangesetUpload implementation ## 39 | ## 0.2.8 2009-10-13 *(Create|Update|Delete) use not unique _do method ## 40 | ## 0.2.7 2009-10-09 implement all missing fonctions except ## 41 | ## ChangesetsGet and GetCapabilities ## 42 | ## 0.2.6 2009-10-09 encoding clean-up ## 43 | ## 0.2.5 2009-10-09 implements NodesGet, WaysGet, RelationsGet ## 44 | ## ParseOsm, ParseOsc ## 45 | ## 0.2.4 2009-10-06 clean-up ## 46 | ## 0.2.3 2009-09-09 keep http connection alive for multiple request ## 47 | ## (Node|Way|Relation)Get return None when object ## 48 | ## have been deleted (raising error before) ## 49 | ## 0.2.2 2009-07-13 can identify applications built on top of the lib ## 50 | ## 0.2.1 2009-05-05 some changes in constructor -- chove@crans.org ## 51 | ## 0.2 2009-05-01 initial import ## 52 | ########################################################################### 53 | 54 | __version__ = '0.2.19' 55 | 56 | import httplib, base64, xml.dom.minidom, time, sys, urllib 57 | 58 | class ApiError(Exception): 59 | 60 | def __init__(self, status, reason, payload): 61 | self.status = status 62 | self.reason = reason 63 | self.payload = payload 64 | 65 | def __str__(self): 66 | return "Request failed: " + str(self.status) + " - " + self.reason + " - " + self.payload 67 | 68 | ########################################################################### 69 | ## Main class ## 70 | 71 | class OsmApi: 72 | 73 | def __init__(self, 74 | username = None, 75 | password = None, 76 | passwordfile = None, 77 | appid = "", 78 | created_by = "PythonOsmApi/"+__version__, 79 | api = "www.openstreetmap.org", 80 | changesetauto = False, 81 | changesetautotags = {}, 82 | changesetautosize = 500, 83 | changesetautomulti = 1, 84 | debug = False 85 | ): 86 | 87 | # debug 88 | self._debug = debug 89 | 90 | # Get username 91 | if username: 92 | self._username = username 93 | elif passwordfile: 94 | self._username = open(passwordfile).readline().split(":")[0].strip() 95 | 96 | # Get password 97 | if password: 98 | self._password = password 99 | elif passwordfile: 100 | for l in open(passwordfile).readlines(): 101 | l = l.strip().split(":") 102 | if l[0] == self._username: 103 | self._password = l[1] 104 | 105 | # Changest informations 106 | self._changesetauto = changesetauto # auto create and close changesets 107 | self._changesetautotags = changesetautotags # tags for automatic created changesets 108 | self._changesetautosize = changesetautosize # change count for auto changeset 109 | self._changesetautosize = 
changesetautosize # change count for auto changeset 110 | self._changesetautomulti = changesetautomulti # close a changeset every # upload 111 | self._changesetautocpt = 0 112 | self._changesetautodata = [] # data to upload for auto group 113 | 114 | # Get API 115 | self._api = api 116 | self._xapi = True if "xapi" in api else False 117 | 118 | # Get created_by 119 | if not appid: 120 | self._created_by = created_by 121 | else: 122 | self._created_by = appid + " (" + created_by + ")" 123 | 124 | # Initialisation 125 | self._CurrentChangesetId = 0 126 | 127 | # Http connection 128 | self._conn = httplib.HTTPConnection(self._api, 80) 129 | 130 | def __del__(self): 131 | if self._changesetauto: 132 | self._changesetautoflush(True) 133 | return None 134 | 135 | ####################################################################### 136 | # Capabilities # 137 | ####################################################################### 138 | 139 | def Capabilities(self): 140 | """ Returns ApiCapabilities. """ 141 | uri = "/api/capabilities" 142 | data = self._get(uri) 143 | data = xml.dom.minidom.parseString(data) 144 | print data.getElementsByTagName("osm") 145 | data = data.getElementsByTagName("osm")[0].getElementsByTagName("api")[0] 146 | result = {} 147 | for elem in data.childNodes: 148 | if elem.nodeType <> elem.ELEMENT_NODE: 149 | continue 150 | result[elem.nodeName] = {} 151 | print elem.nodeName 152 | for k, v in elem.attributes.items(): 153 | try: 154 | result[elem.nodeName][k] = float(v) 155 | except: 156 | result[elem.nodeName][k] = v 157 | return result 158 | 159 | ####################################################################### 160 | # Node # 161 | ####################################################################### 162 | 163 | def NodeGet(self, NodeId, NodeVersion = -1): 164 | """ Returns NodeData for node #NodeId. """ 165 | uri = "/api/0.6/node/"+str(NodeId) 166 | if NodeVersion <> -1: uri += "/"+str(NodeVersion) 167 | data = self._get(uri) 168 | if not data: return data 169 | data = xml.dom.minidom.parseString(data) 170 | data = data.getElementsByTagName("osm")[0].getElementsByTagName("node")[0] 171 | return self._DomParseNode(data) 172 | 173 | def NodeCreate(self, NodeData): 174 | """ Creates a node. Returns updated NodeData (without timestamp). """ 175 | return self._do("create", "node", NodeData) 176 | 177 | def NodeUpdate(self, NodeData): 178 | """ Updates node with NodeData. Returns updated NodeData (without timestamp). """ 179 | return self._do("modify", "node", NodeData) 180 | 181 | def NodeDelete(self, NodeData): 182 | """ Delete node with NodeData. Returns updated NodeData (without timestamp). """ 183 | return self._do("delete", "node", NodeData) 184 | 185 | def NodeHistory(self, NodeId): 186 | """ Returns dict(NodeVerrsion: NodeData). """ 187 | uri = "/api/0.6/node/"+str(NodeId)+"/history" 188 | data = self._get(uri) 189 | data = xml.dom.minidom.parseString(data) 190 | result = {} 191 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("node"): 192 | data = self._DomParseNode(data) 193 | result[data[u"version"]] = data 194 | return result 195 | 196 | def NodeWays(self, NodeId): 197 | """ Returns [WayData, ... ] containing node #NodeId. 
""" 198 | uri = "/api/0.6/node/%d/ways"%NodeId 199 | data = self._get(uri) 200 | data = xml.dom.minidom.parseString(data) 201 | result = [] 202 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("way"): 203 | data = self._DomParseRelation(data) 204 | result.append(data) 205 | return result 206 | 207 | def NodeRelations(self, NodeId): 208 | """ Returns [RelationData, ... ] containing node #NodeId. """ 209 | uri = "/api/0.6/node/%d/relations"%NodeId 210 | data = self._get(uri) 211 | data = xml.dom.minidom.parseString(data) 212 | result = [] 213 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("relation"): 214 | data = self._DomParseRelation(data) 215 | result.append(data) 216 | return result 217 | 218 | def NodesGet(self, NodeIdList): 219 | """ Returns dict(NodeId: NodeData) for each node in NodeIdList """ 220 | uri = "/api/0.6/nodes?nodes=" + ",".join([str(x) for x in NodeIdList]) 221 | data = self._get(uri) 222 | data = xml.dom.minidom.parseString(data) 223 | result = {} 224 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("node"): 225 | data = self._DomParseNode(data) 226 | result[data[u"id"]] = data 227 | return result 228 | 229 | ####################################################################### 230 | # Way # 231 | ####################################################################### 232 | 233 | def WayGet(self, WayId, WayVersion = -1): 234 | """ Returns WayData for way #WayId. """ 235 | uri = "/api/0.6/way/"+str(WayId) 236 | if WayVersion <> -1: uri += "/"+str(WayVersion) 237 | data = self._get(uri) 238 | if not data: return data 239 | data = xml.dom.minidom.parseString(data) 240 | data = data.getElementsByTagName("osm")[0].getElementsByTagName("way")[0] 241 | return self._DomParseWay(data) 242 | 243 | def WayCreate(self, WayData): 244 | """ Creates a way. Returns updated WayData (without timestamp). """ 245 | return self._do("create", "way", WayData) 246 | 247 | def WayUpdate(self, WayData): 248 | """ Updates way with WayData. Returns updated WayData (without timestamp). """ 249 | return self._do("modify", "way", WayData) 250 | 251 | def WayDelete(self, WayData): 252 | """ Delete way with WayData. Returns updated WayData (without timestamp). """ 253 | return self._do("delete", "way", WayData) 254 | 255 | def WayHistory(self, WayId): 256 | """ Returns dict(WayVerrsion: WayData). """ 257 | uri = "/api/0.6/way/"+str(WayId)+"/history" 258 | data = self._get(uri) 259 | data = xml.dom.minidom.parseString(data) 260 | result = {} 261 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("way"): 262 | data = self._DomParseWay(data) 263 | result[data[u"version"]] = data 264 | return result 265 | 266 | def WayRelations(self, WayId): 267 | """ Returns [RelationData, ...] containing way #WayId. """ 268 | uri = "/api/0.6/way/%d/relations"%WayId 269 | data = self._get(uri) 270 | data = xml.dom.minidom.parseString(data) 271 | result = [] 272 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("relation"): 273 | data = self._DomParseRelation(data) 274 | result.append(data) 275 | return result 276 | 277 | def WayFull(self, WayId): 278 | """ Return full data for way WayId as list of {type: node|way|relation, data: {}}. 
""" 279 | uri = "/api/0.6/way/"+str(WayId)+"/full" 280 | data = self._get(uri) 281 | return self.ParseOsm(data) 282 | 283 | def WaysGet(self, WayIdList): 284 | """ Returns dict(WayId: WayData) for each way in WayIdList """ 285 | uri = "/api/0.6/ways?ways=" + ",".join([str(x) for x in WayIdList]) 286 | data = self._get(uri) 287 | data = xml.dom.minidom.parseString(data) 288 | result = {} 289 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("way"): 290 | data = self._DomParseWay(data) 291 | result[data[u"id"]] = data 292 | return result 293 | 294 | ####################################################################### 295 | # Relation # 296 | ####################################################################### 297 | 298 | def RelationGet(self, RelationId, RelationVersion = -1): 299 | """ Returns RelationData for relation #RelationId. """ 300 | uri = "/api/0.6/relation/"+str(RelationId) 301 | if RelationVersion <> -1: uri += "/"+str(RelationVersion) 302 | data = self._get(uri) 303 | if not data: return data 304 | data = xml.dom.minidom.parseString(data) 305 | data = data.getElementsByTagName("osm")[0].getElementsByTagName("relation")[0] 306 | return self._DomParseRelation(data) 307 | 308 | def RelationCreate(self, RelationData): 309 | """ Creates a relation. Returns updated RelationData (without timestamp). """ 310 | return self._do("create", "relation", RelationData) 311 | 312 | def RelationUpdate(self, RelationData): 313 | """ Updates relation with RelationData. Returns updated RelationData (without timestamp). """ 314 | return self._do("modify", "relation", RelationData) 315 | 316 | def RelationDelete(self, RelationData): 317 | """ Delete relation with RelationData. Returns updated RelationData (without timestamp). """ 318 | return self._do("delete", "relation", RelationData) 319 | 320 | def RelationHistory(self, RelationId): 321 | """ Returns dict(RelationVerrsion: RelationData). """ 322 | uri = "/api/0.6/relation/"+str(RelationId)+"/history" 323 | data = self._get(uri) 324 | data = xml.dom.minidom.parseString(data) 325 | result = {} 326 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("relation"): 327 | data = self._DomParseRelation(data) 328 | result[data[u"version"]] = data 329 | return result 330 | 331 | def RelationRelations(self, RelationId): 332 | """ Returns list of RelationData containing relation #RelationId. """ 333 | uri = "/api/0.6/relation/%d/relations"%RelationId 334 | data = self._get(uri) 335 | data = xml.dom.minidom.parseString(data) 336 | result = [] 337 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("relation"): 338 | data = self._DomParseRelation(data) 339 | result.append(data) 340 | return result 341 | 342 | def RelationFullRecur(self, RelationId): 343 | """ Return full data for relation RelationId. Recurisve version relation of relations. """ 344 | data = [] 345 | todo = [RelationId] 346 | done = [] 347 | while todo: 348 | rid = todo.pop(0) 349 | done.append(rid) 350 | temp = self.RelationFull(rid) 351 | for item in temp: 352 | if item["type"] <> "relation": 353 | continue 354 | if item["data"]["id"] in done: 355 | continue 356 | todo.append(item["data"]["id"]) 357 | data += temp 358 | return data 359 | 360 | def RelationFull(self, RelationId): 361 | """ Return full data for relation RelationId as list of {type: node|way|relation, data: {}}. 
""" 362 | uri = "/api/0.6/relation/"+str(RelationId)+"/full" 363 | data = self._get(uri) 364 | return self.ParseOsm(data) 365 | 366 | def RelationsGet(self, RelationIdList): 367 | """ Returns dict(RelationId: RelationData) for each relation in RelationIdList """ 368 | uri = "/api/0.6/relations?relations=" + ",".join([str(x) for x in RelationIdList]) 369 | data = self._get(uri) 370 | data = xml.dom.minidom.parseString(data) 371 | result = {} 372 | for data in data.getElementsByTagName("osm")[0].getElementsByTagName("relation"): 373 | data = self._DomParseRelation(data) 374 | result[data[u"id"]] = data 375 | return result 376 | 377 | ####################################################################### 378 | # Changeset # 379 | ####################################################################### 380 | 381 | def ChangesetGet(self, ChangesetId): 382 | """ Returns ChangesetData for changeset #ChangesetId. """ 383 | data = self._get("/api/0.6/changeset/"+str(ChangesetId)) 384 | data = xml.dom.minidom.parseString(data) 385 | data = data.getElementsByTagName("osm")[0].getElementsByTagName("changeset")[0] 386 | return self._DomParseChangeset(data) 387 | 388 | def ChangesetUpdate(self, ChangesetTags = {}): 389 | """ Updates current changeset with ChangesetTags. """ 390 | if self._CurrentChangesetId == -1: 391 | raise Exception, "No changeset currently opened" 392 | if u"created_by" not in ChangesetTags: 393 | ChangesetTags[u"created_by"] = self._created_by 394 | result = self._put("/api/0.6/changeset/"+str(self._CurrentChangesetId), self._XmlBuild("changeset", {u"tag": ChangesetTags})) 395 | return self._CurrentChangesetId 396 | 397 | def ChangesetCreate(self, ChangesetTags = {}): 398 | """ Opens a changeset. Returns #ChangesetId. """ 399 | if self._CurrentChangesetId: 400 | raise Exception, "Changeset alreadey opened" 401 | if u"created_by" not in ChangesetTags: 402 | ChangesetTags[u"created_by"] = self._created_by 403 | result = self._put("/api/0.6/changeset/create", self._XmlBuild("changeset", {u"tag": ChangesetTags})) 404 | self._CurrentChangesetId = int(result) 405 | return self._CurrentChangesetId 406 | 407 | def ChangesetClose(self): 408 | """ Closes current changeset. Returns #ChangesetId. """ 409 | if not self._CurrentChangesetId: 410 | raise Exception, "No changeset currently opened" 411 | result = self._put("/api/0.6/changeset/"+str(self._CurrentChangesetId)+"/close", u"") 412 | CurrentChangesetId = self._CurrentChangesetId 413 | self._CurrentChangesetId = 0 414 | return CurrentChangesetId 415 | 416 | def ChangesetUpload(self, ChangesData): 417 | """ Upload data. ChangesData is a list of dict {type: node|way|relation, action: create|delete|modify, data: {}}. Returns list with updated ids. 
""" 418 | data = "" 419 | data += u"\n" 420 | data += u"\n" 421 | for change in ChangesData: 422 | data += u"<"+change["action"]+">\n" 423 | change["data"]["changeset"] = self._CurrentChangesetId 424 | data += self._XmlBuild(change["type"], change["data"], False).decode("utf-8") 425 | data += u"\n" 426 | data += u"" 427 | data = self._http("POST", "/api/0.6/changeset/"+str(self._CurrentChangesetId)+"/upload", True, data.encode("utf-8")) 428 | data = xml.dom.minidom.parseString(data) 429 | data = data.getElementsByTagName("diffResult")[0] 430 | data = [x for x in data.childNodes if x.nodeType == x.ELEMENT_NODE] 431 | for i in range(len(ChangesData)): 432 | if ChangesData[i]["action"] == "delete": 433 | ChangesData[i]["data"].pop("version") 434 | else: 435 | ChangesData[i]["data"]["version"] = int(data[i].getAttribute("new_id")) 436 | return ChangesData 437 | 438 | def ChangesetDownload(self, ChangesetId): 439 | """ Download data from a changeset. Returns list of dict {type: node|way|relation, action: create|delete|modify, data: {}}. """ 440 | uri = "/api/0.6/changeset/"+str(ChangesetId)+"/download" 441 | data = self._get(uri) 442 | return self.ParseOsc(data) 443 | 444 | def ChangesetsGet(self, min_lon=None, min_lat=None, max_lon=None, max_lat=None, 445 | userid=None, username=None, 446 | closed_after=None, created_before=None, 447 | only_open=False, only_closed=False): 448 | """ Returns dict(ChangsetId: ChangesetData) matching all criteria. """ 449 | 450 | uri = "/api/0.6/changesets" 451 | params = {} 452 | if min_lon or min_lat or max_lon or max_lat: 453 | params["bbox"] = ",".join([str(min_lon),str(min_lat),str(max_lon),str(max_lat)]) 454 | if userid: 455 | params["user"] = userid 456 | if username: 457 | params["display_name"] = username 458 | if closed_after and not created_before: 459 | params["time"] = closed_after 460 | if created_before: 461 | if not closed_after: 462 | closed_after = "1970-01-01T00:00:00Z" 463 | params["time"] = closed_after + "," + created_before 464 | if only_open: 465 | params["open"] = 1 466 | if only_closed: 467 | params["closed"] = 1 468 | 469 | if params: 470 | uri += "?" + urllib.urlencode(params) 471 | 472 | data = self._get(uri) 473 | data = xml.dom.minidom.parseString(data) 474 | data = data.getElementsByTagName("osm")[0].getElementsByTagName("changeset") 475 | result = {} 476 | for curChangeset in data: 477 | tmpCS = self._DomParseChangeset(curChangeset) 478 | result[tmpCS["id"]] = tmpCS 479 | return result 480 | 481 | ####################################################################### 482 | # Other # 483 | ####################################################################### 484 | 485 | def Map(self, min_lon, min_lat, max_lon, max_lat, **kwargs): 486 | """ Download data in bounding box. Returns list of dict {type: node|way|relation, data: {}}. """ 487 | if False: #self._xapi: 488 | kwargs["bbox"] = "bbox=%f,%f,%f,%f" % (min_lon, min_lat, max_lon, max_lat) 489 | args = ["[%s=%s]" % item for item in kwargs.items()] 490 | uri = "/api/0.6/*" + "".join(args) 491 | else: 492 | uri = "/api/0.6/map?bbox=%f,%f,%f,%f"%(min_lon, min_lat, max_lon, max_lat) 493 | 494 | data = self._get(uri) 495 | return self.ParseOsm(data) 496 | 497 | ####################################################################### 498 | # Data parser # 499 | ####################################################################### 500 | 501 | def ParseOsm(self, data): 502 | """ Parse osm data. Returns list of dict {type: node|way|relation, data: {}}. 
""" 503 | data = xml.dom.minidom.parseString(data) 504 | data = data.getElementsByTagName("osm")[0] 505 | result = [] 506 | for elem in data.childNodes: 507 | if elem.nodeName == u"node": 508 | result.append({u"type": elem.nodeName, u"data": self._DomParseNode(elem)}) 509 | elif elem.nodeName == u"way": 510 | result.append({u"type": elem.nodeName, u"data": self._DomParseWay(elem)}) 511 | elif elem.nodeName == u"relation": 512 | result.append({u"type": elem.nodeName, u"data": self._DomParseRelation(elem)}) 513 | return result 514 | 515 | def ParseOsc(self, data): 516 | """ Parse osc data. Returns list of dict {type: node|way|relation, action: create|delete|modify, data: {}}. """ 517 | data = xml.dom.minidom.parseString(data) 518 | data = data.getElementsByTagName("osmChange")[0] 519 | result = [] 520 | for action in data.childNodes: 521 | if action.nodeName == u"#text": continue 522 | for elem in action.childNodes: 523 | if elem.nodeName == u"node": 524 | result.append({u"action":action.nodeName, u"type": elem.nodeName, u"data": self._DomParseNode(elem)}) 525 | elif elem.nodeName == u"way": 526 | result.append({u"action":action.nodeName, u"type": elem.nodeName, u"data": self._DomParseWay(elem)}) 527 | elif elem.nodeName == u"relation": 528 | result.append({u"action":action.nodeName, u"type": elem.nodeName, u"data": self._DomParseRelation(elem)}) 529 | return result 530 | 531 | ####################################################################### 532 | # Internal http function # 533 | ####################################################################### 534 | 535 | def _do(self, action, OsmType, OsmData): 536 | if self._changesetauto: 537 | self._changesetautodata.append({"action":action, "type":OsmType, "data":OsmData}) 538 | self._changesetautoflush() 539 | return None 540 | else: 541 | return self._do_manu(action, OsmType, OsmData) 542 | 543 | def _do_manu(self, action, OsmType, OsmData): 544 | if not self._CurrentChangesetId: 545 | raise Exception, "You need to open a changeset before uploading data" 546 | if u"timestamp" in OsmData: 547 | OsmData.pop(u"timestamp") 548 | OsmData[u"changeset"] = self._CurrentChangesetId 549 | if action == "create": 550 | if OsmData.get(u"id", -1) > 0: 551 | raise Exception, "This "+OsmType+" already exists" 552 | result = self._put("/api/0.6/"+OsmType+"/create", self._XmlBuild(OsmType, OsmData)) 553 | OsmData[u"id"] = int(result.strip()) 554 | OsmData[u"version"] = 1 555 | return OsmData 556 | elif action == "modify": 557 | result = self._put("/api/0.6/"+OsmType+"/"+str(OsmData[u"id"]), self._XmlBuild(OsmType, OsmData)) 558 | OsmData[u"version"] = int(result.strip()) 559 | return OsmData 560 | elif action =="delete": 561 | result = self._delete("/api/0.6/"+OsmType+"/"+str(OsmData[u"id"]), self._XmlBuild(OsmType, OsmData)) 562 | OsmData[u"version"] = int(result.strip()) 563 | OsmData[u"visible"] = False 564 | return OsmData 565 | 566 | def flush(self): 567 | return self._changesetautoflush(True) 568 | 569 | def _changesetautoflush(self, force = False): 570 | while (len(self._changesetautodata) >= self._changesetautosize) or (force and self._changesetautodata): 571 | if self._changesetautocpt == 0: 572 | self.ChangesetCreate(self._changesetautotags) 573 | self.ChangesetUpload(self._changesetautodata[:self._changesetautosize]) 574 | self._changesetautodata = self._changesetautodata[self._changesetautosize:] 575 | self._changesetautocpt += 1 576 | if self._changesetautocpt == self._changesetautomulti: 577 | self.ChangesetClose() 578 | 
self._changesetautocpt = 0 579 | if self._changesetautocpt and force: 580 | self.ChangesetClose() 581 | self._changesetautocpt = 0 582 | return None 583 | 584 | def _http_request(self, cmd, path, auth, send): 585 | if self._debug: 586 | path2 = path 587 | if len(path2) > 50: 588 | path2 = path2[:50]+"[...]" 589 | print >>sys.stderr, "%s %s %s"%(time.strftime("%Y-%m-%d %H:%M:%S"),cmd,path2) 590 | self._conn.putrequest(cmd, path) 591 | self._conn.putheader('User-Agent', self._created_by) 592 | if auth: 593 | self._conn.putheader('Authorization', 'Basic ' + base64.encodestring(self._username + ':' + self._password).strip()) 594 | if send <> None: 595 | self._conn.putheader('Content-Length', len(send)) 596 | self._conn.endheaders() 597 | if send: 598 | self._conn.send(send) 599 | response = self._conn.getresponse() 600 | if response.status <> 200: 601 | payload = response.read().strip() 602 | if response.status == 410: 603 | return None 604 | raise ApiError(response.status, response.reason, payload) 605 | if self._debug: 606 | print >>sys.stderr, "%s %s %s done"%(time.strftime("%Y-%m-%d %H:%M:%S"),cmd,path2) 607 | return response.read() 608 | 609 | def _http(self, cmd, path, auth, send): 610 | i = 0 611 | while True: 612 | i += 1 613 | try: 614 | return self._http_request(cmd, path, auth, send) 615 | except ApiError, e: 616 | if e.status >= 500: 617 | if i == 5: raise 618 | if i <> 1: time.sleep(5) 619 | self._conn = httplib.HTTPConnection(self._api, 80) 620 | else: raise 621 | except Exception: 622 | if i == 5: raise 623 | if i <> 1: time.sleep(5) 624 | self._conn = httplib.HTTPConnection(self._api, 80) 625 | 626 | def _get(self, path): 627 | return self._http('GET', path, False, None) 628 | 629 | def _put(self, path, data): 630 | return self._http('PUT', path, True, data) 631 | 632 | def _delete(self, path, data): 633 | return self._http('DELETE', path, True, data) 634 | 635 | ####################################################################### 636 | # Internal dom function # 637 | ####################################################################### 638 | 639 | def _DomGetAttributes(self, DomElement): 640 | """ Returns a formated dictionnary of attributes of a DomElement. """ 641 | result = {} 642 | for k, v in DomElement.attributes.items(): 643 | if k == u"uid" : v = int(v) 644 | elif k == u"changeset" : v = int(v) 645 | elif k == u"version" : v = int(v) 646 | elif k == u"id" : v = int(v) 647 | elif k == u"lat" : v = float(v) 648 | elif k == u"lon" : v = float(v) 649 | elif k == u"open" : v = v=="true" 650 | elif k == u"visible" : v = v=="true" 651 | elif k == u"ref" : v = int(v) 652 | result[k] = v 653 | return result 654 | 655 | def _DomGetTag(self, DomElement): 656 | """ Returns the dictionnary of tags of a DomElement. """ 657 | result = {} 658 | for t in DomElement.getElementsByTagName("tag"): 659 | k = t.attributes["k"].value 660 | v = t.attributes["v"].value 661 | result[k] = v 662 | return result 663 | 664 | def _DomGetNd(self, DomElement): 665 | """ Returns the list of nodes of a DomElement. """ 666 | result = [] 667 | for t in DomElement.getElementsByTagName("nd"): 668 | result.append(int(int(t.attributes["ref"].value))) 669 | return result 670 | 671 | def _DomGetMember(self, DomElement): 672 | """ Returns a list of relation members. """ 673 | result = [] 674 | for m in DomElement.getElementsByTagName("member"): 675 | result.append(self._DomGetAttributes(m)) 676 | return result 677 | 678 | def _DomParseNode(self, DomElement): 679 | """ Returns NodeData for the node. 
""" 680 | result = self._DomGetAttributes(DomElement) 681 | result[u"tag"] = self._DomGetTag(DomElement) 682 | return result 683 | 684 | def _DomParseWay(self, DomElement): 685 | """ Returns WayData for the way. """ 686 | result = self._DomGetAttributes(DomElement) 687 | result[u"tag"] = self._DomGetTag(DomElement) 688 | result[u"nd"] = self._DomGetNd(DomElement) 689 | return result 690 | 691 | def _DomParseRelation(self, DomElement): 692 | """ Returns RelationData for the relation. """ 693 | result = self._DomGetAttributes(DomElement) 694 | result[u"tag"] = self._DomGetTag(DomElement) 695 | result[u"member"] = self._DomGetMember(DomElement) 696 | return result 697 | 698 | def _DomParseChangeset(self, DomElement): 699 | """ Returns ChangesetData for the changeset. """ 700 | result = self._DomGetAttributes(DomElement) 701 | result[u"tag"] = self._DomGetTag(DomElement) 702 | return result 703 | 704 | ####################################################################### 705 | # Internal xml builder # 706 | ####################################################################### 707 | 708 | def _XmlBuild(self, ElementType, ElementData, WithHeaders = True): 709 | 710 | xml = u"" 711 | if WithHeaders: 712 | xml += u"\n" 713 | xml += u"\n" 714 | 715 | # 716 | xml += u" <" + ElementType 717 | if u"id" in ElementData: 718 | xml += u" id=\"" + str(ElementData[u"id"]) + u"\"" 719 | if u"lat" in ElementData: 720 | xml += u" lat=\"" + str(ElementData[u"lat"]) + u"\"" 721 | if u"lon" in ElementData: 722 | xml += u" lon=\"" + str(ElementData[u"lon"]) + u"\"" 723 | if u"version" in ElementData: 724 | xml += u" version=\"" + str(ElementData[u"version"]) + u"\"" 725 | xml += u" visible=\"" + str(ElementData.get(u"visible", True)).lower() + u"\"" 726 | if ElementType in [u"node", u"way", u"relation"]: 727 | xml += u" changeset=\"" + str(self._CurrentChangesetId) + u"\"" 728 | xml += u">\n" 729 | 730 | # 731 | for k, v in ElementData.get(u"tag", {}).items(): 732 | xml += u" \n" 733 | 734 | # 735 | for member in ElementData.get(u"member", []): 736 | xml += u" \n" 737 | 738 | # 739 | for ref in ElementData.get(u"nd", []): 740 | xml += u" \n" 741 | 742 | # 743 | xml += u" \n" 744 | 745 | if WithHeaders: 746 | xml += u"\n" 747 | 748 | return xml.encode("utf8") 749 | 750 | def _XmlEncode(self, text): 751 | return text.replace("&", "&").replace("\"", """).replace("<","<").replace(">",">") 752 | 753 | ## End of main class ## 754 | ########################################################################### 755 | -------------------------------------------------------------------------------- /junk/README: -------------------------------------------------------------------------------- 1 | This directory contains the detritus of our efforts. Fortunately, bits are cheap! 
2 | -------------------------------------------------------------------------------- /junk/fetch_osm.py: -------------------------------------------------------------------------------- 1 | from OsmApi import OsmApi 2 | from outliers import load_points, discard_outliers, get_bbox_for_points 3 | import sys, geojson, time 4 | 5 | DEFAULT_TAGS = ("highway", "waterway") 6 | 7 | osm = OsmApi() #api="http://open.mapquestapi.com/xapi" 8 | 9 | def get_osm_ways(bbox, wanted_tags=DEFAULT_TAGS): 10 | nodes = {} 11 | left, bottom, right, top = bbox 12 | step = .025 13 | scale = 100000.0 14 | count = 0 15 | iterations = int(((right-left)/step+1)*((top-bottom)/step+1)) 16 | for x in range(int(left*scale), int(right*scale), int(step*scale)): 17 | for y in range(int(bottom*scale), int(top*scale), int(step*scale)): 18 | count += 1 19 | ways = {} 20 | request = (x/scale, y/scale, min(x/scale+step,right), min(y/scale+step, top)) 21 | start= time.time() 22 | print >>sys.stderr, "\rRequesting %.4f,%.4f,%.4f,%.4f from OSM (%d of %d)..." % (request+(count, iterations)), 23 | for item in osm.Map(*request): 24 | data = item["data"] 25 | if item["type"] == "node": 26 | nodes[int(data["id"])] = map(float, (data["lon"],data["lat"])) 27 | elif item["type"] == "way" and any(t for t in wanted_tags if t in data["tag"]): 28 | ways[int(data["id"])] = ( dict((k, data["tag"].get(k, "")) for k in wanted_tags), data["nd"] ) 29 | print >>sys.stderr, "%d nodes, %d ways found (%.2f elapsed)" % (len(nodes),len(ways),time.time()-start) 30 | for way_id, (tags, node_ids) in ways.items(): 31 | feature = geojson.Feature() 32 | feature.geometry = geojson.LineString(coordinates=(nodes[ref] for ref in node_ids if ref in nodes)) 33 | feature.properties = tags 34 | feature.id = way_id 35 | yield feature 36 | time.sleep(0.5) 37 | 38 | def main(points_file): 39 | places = load_points(points_file) 40 | #random_place = dict([places.popitem()]) 41 | random_place = discard_outliers(places) 42 | bbox = get_bbox_for_points(places) 43 | for obj in get_osm_ways(bbox): 44 | print obj.to_dict() 45 | 46 | if __name__ == "__main__": 47 | main(sys.argv[1]) 48 | 49 | -------------------------------------------------------------------------------- /junk/pull_photos.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | import sys 3 | import csv 4 | 5 | #first arg: input file, csv. column woe_id should be the list of woe_ids we want to pull out of photos.txt 6 | #second arg: output file, txt subset of photos.txt (also remove photoid. 
samplr not expecting it) 7 | 8 | def main(): 9 | infile = sys.argv[1] 10 | 11 | outfile = sys.argv[2] 12 | 13 | photofile = "photos.txt" 14 | 15 | woes = [] 16 | ireader = csv.DictReader(open(infile, 'r')) 17 | for line in ireader: 18 | woes.append(line['woe_id']) 19 | 20 | 21 | pfh = open(photofile, 'r') 22 | ofh = open(outfile, 'w') 23 | 24 | outstr = "%s\t%s\t%s\n" 25 | 26 | for row in pfh: 27 | photoid, placeid, lon, lat = row.strip().split() 28 | if placeid in woes: 29 | out = outstr % (placeid, lon, lat) 30 | ofh.write(out) 31 | 32 | if __name__ == "__main__": 33 | sys.exit(main()) 34 | 35 | -------------------------------------------------------------------------------- /junk/samplr.py: -------------------------------------------------------------------------------- 1 | from shapely.geometry import Point, MultiPoint, Polygon, MultiPolygon 2 | from shapely.ops import cascaded_union, polygonize 3 | from shapely.prepared import prep 4 | from rtree import Rtree 5 | import sys, random, json, numpy, math, pickle, os 6 | 7 | SAMPLE_ITERATIONS = 200 8 | SAMPLE_SIZE = 5 9 | MEDIAN_THRESHOLD = 5.0 10 | 11 | median_distance_cache = {} 12 | def median_distances(pts, aggregate=numpy.median): 13 | key = tuple(sorted(pts)) 14 | if key in median_distance_cache: return median_distance_cache[key] 15 | median = (numpy.median([pt[0] for pt in pts]), 16 | numpy.median([pt[1] for pt in pts])) 17 | distances = [] 18 | for pt in pts: 19 | dist = math.sqrt(((median[0]-pt[0])*math.cos(median[1]*math.pi/180.0))**2+(median[1]-pt[1])**2) 20 | distances.append((dist, pt)) 21 | 22 | median_dist = aggregate([dist for dist, pt in distances]) 23 | median_distance_cache[key] = (median_dist, distances) 24 | return (median_dist, distances) 25 | 26 | def mean_distances(pts): 27 | return median_distances(pts, numpy.mean) 28 | 29 | name_file, point_file = sys.argv[1:3] 30 | 31 | places = {} 32 | names = {} 33 | if os.path.exists(point_file + '.cache'): 34 | print >>sys.stderr, "Reading from %s cache..." % point_file 35 | names, places = pickle.load(file(point_file + ".cache")) 36 | else: 37 | all_names = {} 38 | count = 0 39 | for line in file(name_file): 40 | place_id, name = line.strip().split(None, 1) 41 | all_names[int(place_id)] = name 42 | count += 1 43 | if count % 1000 == 0: 44 | print >>sys.stderr, "\rRead %d names from %s." % (count, name_file), 45 | print >>sys.stderr, "\rRead %d names from %s." % (count, name_file) 46 | 47 | count = 0 48 | for line in file(point_file): 49 | place_id, lon, lat = line.strip().split() 50 | place_id = int(place_id) 51 | names[place_id] = all_names.get(place_id, "") 52 | point = (float(lon), float(lat)) 53 | pts = places.setdefault(place_id, set()) 54 | pts.add(point) 55 | count += 1 56 | if count % 1000 == 0: 57 | print >>sys.stderr, "\rRead %d points in %d places." % (count, len(places)), 58 | print >>sys.stderr, "\rRead %d points in %d places." % (count, len(places)) 59 | 60 | count = 0 61 | discarded = 0 62 | for place_id, pts in places.items(): 63 | count += 1 64 | print >>sys.stderr, "\rComputing outliers for %d of %d places..." % (count, len(places)), 65 | median_dist, distances = median_distances(pts) 66 | keep = [pt for dist, pt in distances if dist < median_dist * MEDIAN_THRESHOLD] 67 | discarded += len(pts) - len(keep) 68 | places[place_id] = keep 69 | 70 | print >>sys.stderr, "%d points discarded." % discarded 71 | 72 | if not os.path.exists(point_file + '.cache'): 73 | print >>sys.stderr, "Caching points..." 
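    # Persist the parsed names and the outlier-filtered points next to the
    # input file, so later runs can load them from the ".cache" pickle
    # instead of re-reading the raw point file.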
74 | pickle.dump((names, places), file(point_file + ".cache", "w"), -1) 75 | 76 | print >>sys.stderr, "Indexing..." 77 | points = [] 78 | place_list = set() 79 | for place_id, pts in places.items(): 80 | for pt in pts: 81 | place_list.add((len(points), pt+pt, None)) 82 | points.append((place_id, Point(pt))) 83 | index = Rtree(place_list) 84 | 85 | """ 86 | 87 | REASSIGNMENT_PASSES = 10 88 | iterations = 0 89 | count = 0 90 | queue = places.keys() + [None] 91 | while len(queue) > 1: 92 | place_id = queue.pop(0) 93 | if place_id is None: 94 | count = 0 95 | iterations += 1 96 | queue.append(None) 97 | place_id = queue.pop(0) 98 | if not places[place_id]: 99 | del places[place_id] 100 | continue 101 | pts = places[place_id] 102 | count += 1 103 | print >>sys.stderr, "\rIteration #%d of reassignment: %d of %d places..." % (iterations, count, len(queue)), 104 | if iterations > len(pts) / 10.0: continue 105 | old_source_mean, distances = mean_distances(pts) 106 | _, outlier = max(distances) 107 | best = (None, 0.0) 108 | print >>sys.stderr, "" 109 | for nearest in index.nearest(outlier.bounds, 3, objects=True): 110 | #print >>sys.stderr, " -> %s (%d) versus %s (%d)" % (outlier, place_id, Point(nearest.bbox[0:2]), nearest.id) 111 | if nearest.id == place_id: continue 112 | old_target_mean, _ = mean_distances(places[nearest.id]) 113 | source = list(pts) 114 | source.remove(outlier) 115 | target = list(places[nearest.id]) + [outlier] 116 | #print >>sys.stderr, " source: new=%d items, old=%d items" % (len(source), len(pts)) 117 | new_source_mean, _ = mean_distances(source) 118 | new_target_mean, _ = mean_distances(target) 119 | print >>sys.stderr, " source mean: new=%.6f, old=%.6f" % (old_source_mean, new_source_mean) 120 | print >>sys.stderr, " target mean: new=%.6f, old=%.6f" % (old_target_mean, new_target_mean) 121 | if new_source_mean < old_source_mean and \ 122 | new_target_mean < old_target_mean: 123 | improvement = (old_source_mean - new_source_mean) \ 124 | + (old_target_mean - new_target_mean) 125 | if improvement > best[1]: 126 | best = (nearest.id, improvement) 127 | if best[1] > 0: 128 | pts.remove(outlier) 129 | places[best[0]].append(outlier) 130 | queue.append(place_id) 131 | print >>sys.stderr, "%s moved from %d to %d." % (outlier, place_id, best[0]) 132 | 133 | print >>sys.stderr, "Done." 134 | 135 | """ 136 | 137 | sample_hulls = {} 138 | count = 0 139 | for place_id, pts in places.items(): 140 | hulls = [] 141 | if len(pts) < 3: 142 | print >>sys.stderr, "\n ... discarding place #%d" % place_id 143 | continue 144 | for i in range(min(pts,SAMPLE_ITERATIONS)): 145 | multipoint = MultiPoint(random.sample(pts, min(SAMPLE_SIZE, len(pts)))) 146 | hull = multipoint.convex_hull 147 | if isinstance(hull, Polygon) and not hull.is_empty: hulls.append(hull) 148 | try: 149 | sample_hulls[place_id] = cascaded_union(hulls) 150 | except: 151 | print >>sys.stderr, hulls 152 | sys.exit() 153 | if hasattr(sample_hulls[place_id], "geoms"): 154 | sample_hulls[place_id] = cascaded_union([hull for hull in sample_hulls[place_id] if type(hull) is Polygon]) 155 | count += SAMPLE_ITERATIONS 156 | print >>sys.stderr, "\rComputing %d of %d hulls..." % (count, (len(places) * SAMPLE_ITERATIONS)), 157 | 158 | print >>sys.stderr, "\nCombining hull boundaries..." 159 | boundaries = cascaded_union([hull.boundary for hull in sample_hulls.values()]) 160 | 161 | print >>sys.stderr, "Polygonizing %d boundaries..." 
% len(boundaries) 162 | rings = list(polygonize(boundaries)) 163 | 164 | for i, ring in enumerate(rings): 165 | print >>sys.stderr, "\rBuffering %d of %d polygons..." % (i, len(rings)), 166 | size = math.sqrt(ring.area)*0.1 167 | rings[i] = ring.buffer(size) 168 | print >>sys.stderr, "Done." 169 | 170 | polygons = {} 171 | count = 0 172 | for polygon in rings: 173 | if polygon.is_empty: continue 174 | place_count = dict((place_id, 0) for place_id in places) 175 | prepared = prep(polygon) 176 | for item in index.intersection(polygon.bounds): 177 | place_id, point = points[item] 178 | if prepared.intersects(point): 179 | place_count[place_id] += 1 180 | pt_count, place_id = max((c, i) for (i, c) in place_count.items()) 181 | polys = polygons.setdefault(place_id, []) 182 | polys.append(polygon) 183 | count += 1 184 | print >>sys.stderr, "\rAssigning %d of %d polygons..." % (count,len(rings)), 185 | print >>sys.stderr, "Done." 186 | 187 | count = 0 188 | for place_id, polys in polygons.items(): 189 | polygons[place_id] = cascaded_union(polys) 190 | count += 1 191 | print >>sys.stderr, "\rUnifying %d of %d polygons..." % (count,len(polygons)), 192 | print >>sys.stderr, "Done." 193 | 194 | count = 0 195 | orphans = [] 196 | for place_id, multipolygon in polygons.items(): 197 | count += 1 198 | print >>sys.stderr, "\rRemoving %d orphans from %d of %d polygons..." % (len(orphans), count, len(polygons)), 199 | if type(multipolygon) is not MultiPolygon: continue 200 | polygon_count = [0] * len(multipolygon) 201 | for i, polygon in enumerate(multipolygon.geoms): 202 | prepared = prep(polygon) 203 | for item in index.intersection(polygon.bounds): 204 | item_id, point = points[item] 205 | if item_id == place_id and prepared.intersects(point): 206 | polygon_count[i] += 1 207 | winner = max((c, i) for (i, c) in enumerate(polygon_count))[1] 208 | polygons[place_id] = multipolygon.geoms[winner] 209 | orphans.extend(p for i, p in enumerate(multipolygon.geoms) if i != winner) 210 | print >>sys.stderr, "Done." 211 | 212 | orphans = [] 213 | count = 0 214 | changed = True 215 | while changed and orphans: 216 | orphan = orphans.pop(0) 217 | changed = False 218 | count += 1 219 | print >>sys.stderr, "\rReassigning %d of %d orphans..." % (count, len(orphans)), 220 | place_count = dict((place_id, 0) for place_id in places) 221 | total_count = 0.0 222 | prepared = prep(orphan) 223 | for item in index.intersection(orphan.bounds): 224 | item_id, point = points[item] 225 | if prepared.intersects(point): 226 | place_count[item_id] += 1 227 | total_count += 1 228 | for place_id, ct in place_count.items(): 229 | if total_count > 0 and float(ct)/total_count > 1/3.0: 230 | polygons[place_id] = polygons[place_id].union(orphan) 231 | changed = True 232 | if not changed: 233 | orphans.append(orphan) 234 | 235 | print >>sys.stderr, "Done." 236 | 237 | print >>sys.stderr, "\nWriting output." 
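# Assemble the surviving polygons into a GeoJSON FeatureCollection, one
# Feature per place with its woe_id and name as properties, and print the
# result to stdout.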
238 | features = [] 239 | for place_id, poly in polygons.items(): 240 | features.append({ 241 | "type": "Feature", 242 | "id": place_id, 243 | "geometry": poly.__geo_interface__, 244 | "properties": {"woe_id": place_id, "name": names.get(place_id, "")} 245 | }) 246 | 247 | collection = { 248 | "type": "FeatureCollection", 249 | "features": features 250 | } 251 | 252 | print json.dumps(collection) 253 | 254 | -------------------------------------------------------------------------------- /junk/up_one_level.py: -------------------------------------------------------------------------------- 1 | from shapely.geometry import Polygon, MultiPolygon, shape 2 | from shapely.ops import cascaded_union 3 | import sys, json 4 | import psycopg2 5 | import psycopg2.extras 6 | 7 | #read in a GeoJSON featurecollection 8 | #translate those geoms into shapely land 9 | #lookup the parent woe ids for each geom 10 | #cascaded union the geoms to make geoms for the parents. 11 | #output new GeoJSON featurecollection 12 | 13 | town, townwoeid = sys.argv[1:3] 14 | 15 | json_file = "data/%s.json" % town 16 | 17 | infh = open(json_file, 'r') 18 | injson = json.loads(infh.next()) 19 | nbhds = {} 20 | 21 | print >>sys.stderr, "Reading in nbhds." 22 | 23 | for feature in injson['features']: 24 | nbhd = shape(feature['geometry']) 25 | nbhds[feature['id']] = nbhd 26 | 27 | print >>sys.stderr, "Looking up parents." 28 | family = {} 29 | 30 | pquery = """select parent_id 31 | from woe_places 32 | where woe_id = %s""" 33 | 34 | iquery = """select name 35 | from woe_places 36 | where woe_id = %s""" 37 | 38 | conn_string = "dbname='hood'" 39 | conn = psycopg2.connect(conn_string) 40 | cursor = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) 41 | 42 | for woeid, nbhd in nbhds.items(): 43 | pq = pquery % woeid 44 | cursor.execute(pq) 45 | rs = cursor.fetchone() 46 | parent = rs["parent_id"] 47 | if parent == townwoeid: 48 | print >>sys.stderr, "Nbhd %s has the town for a parent" % woeid 49 | continue 50 | if parent not in family: 51 | iq = iquery % parent 52 | cursor.execute(iq) 53 | rs = cursor.fetchone() 54 | family[parent] = {} 55 | family[parent]["name"] = rs["name"] 56 | family[parent]["children"] = [woeid] 57 | else: 58 | family[parent]["children"].append(woeid) 59 | 60 | print >>sys.stderr, "Merging %s stems" % len(family.keys()) 61 | for parent in family.keys(): 62 | family[parent]['geom'] = cascaded_union([nbhds[child] for child in family[parent]['children']]) 63 | 64 | 65 | print >>sys.stderr, "Buffering stems." 66 | for parent, feature in family.items(): 67 | polygon = feature['geom'] 68 | #print >>sys.stderr, "\r%s has shape of type %s" %(place_id, type(polygon)) 69 | if type(polygon) is Polygon: 70 | polygon = Polygon(polygon.exterior.coords) 71 | else: 72 | polygon = MultiPolygon([Polygon(p.exterior.coords)for p in polygon.geoms]) 73 | family[parent]['geom'] = polygon.buffer(0) 74 | 75 | print >>sys.stderr, "Writing output." 
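# Build the output FeatureCollection: one Feature per parent WoE ID, carrying
# the merged, buffered geometry of its child neighborhoods and the parent name
# looked up from the woe_places table.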
76 | features = [] 77 | for place_id, feature in family.items(): 78 | features.append({ 79 | "type": "Feature", 80 | "id": place_id, 81 | "geometry": feature['geom'].__geo_interface__, 82 | "properties": {"woe_id": place_id, "name": feature['name']} 83 | }) 84 | 85 | collection = { 86 | "type": "FeatureCollection", 87 | "features": features 88 | } 89 | 90 | print json.dumps(collection) 91 | 92 | -------------------------------------------------------------------------------- /leaves_from_woeid.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | import sys 3 | import psycopg2 4 | import psycopg2.extras 5 | import csv 6 | import copy 7 | #take in a woe_id 8 | #find all the children of that woe_id that are local_admins or suburbs 9 | #for each of the children that are local admins, get all their children that are local admins or suburbs 10 | #repeat until have list of descendents that have no children 11 | #print list as name, woe_id 12 | leaftypes = ('LocalAdmin',"Suburb") 13 | 14 | #owriter = csv.writer(sys.stdout) 15 | #owriter.writerow(["parent_id","name","type","woe_id"]) 16 | 17 | def main(): 18 | for woeid in sys.argv[1:]: 19 | print >>sys.stderr, woeid, 20 | 21 | childq = """select * from woe_places 22 | where parent_id = %s 23 | and placetype in ('County','LocalAdmin','Suburb')""" 24 | 25 | conn_string = "dbname='hood'" 26 | # get a connection, if a connect cannot be made an exception will be raised here 27 | conn = psycopg2.connect(conn_string) 28 | cursor = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) 29 | 30 | search = set([woeid]) 31 | leaves = set() 32 | names = {} 33 | types = {} 34 | while len(search) > 0: 35 | print >>sys.stderr, ".", 36 | curr_search = copy.copy(search) 37 | for woe in curr_search: 38 | search.remove(woe) 39 | qry = childq % woe 40 | cursor.execute(qry) 41 | if cursor.rowcount == 0: 42 | if woe not in types: 43 | break 44 | if types[woe] in leaftypes: 45 | leaves.add((woeid,names[woe],types[woe],woe)) 46 | for line in cursor: 47 | names[line['woe_id']] = line['name'] 48 | types[line['woe_id']] = line['placetype'] 49 | search.add(line['woe_id']) 50 | 51 | conn.close() 52 | print >>sys.stderr, "" 53 | 54 | for leaf in leaves: 55 | #owriter.writerow(leaf) 56 | print "\t".join(map(str,leaf)) 57 | 58 | if __name__ == "__main__": 59 | sys.exit(main()) 60 | 61 | -------------------------------------------------------------------------------- /mapnik_render.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | from mapnik import * 4 | import sys, random 5 | 6 | width, height = 2048, 2048 7 | rgbs = ["80", "a2", "ab"] 8 | base = "data/results" 9 | city = sys.argv[1] 10 | 11 | woe_id = None 12 | # intl_cities.txt is a tab-separated file mapping woe_id -> name 13 | for line in file(base+"/intl_cities.txt"): 14 | woe_id, name = line.strip().split(None,1) 15 | if city == name: break 16 | if woe_id is None: raise Exception("Couldn't find the city '%s'" % city) 17 | 18 | m = Map(width, height, "+proj=latlong +datum=WGS84") 19 | m.background = Color('white') 20 | 21 | if city == "Tokyo": 22 | register_fonts("/usr/share/fonts/truetype/takao") 23 | font = "TakaoMincho Regular" 24 | else: 25 | font = "DejaVu Sans Bold" 26 | 27 | def append_style(name, *symbols): 28 | s = Style() 29 | r = Rule() 30 | for symbol in symbols: 31 | r.symbols.append(symbol) 32 | s.rules.append(r) 33 | m.append_style(name,s) 34 | 35 | random.shuffle(rgbs) 36 | fill 
= Color('#%s%s%s' % tuple(rgbs)) 37 | hood = Layer('hood', "+proj=latlong +datum=WGS84") 38 | hood.datasource = Ogr(base=base,file=city+".json",layer="OGRGeoJSON") 39 | append_style("hood", PolygonSymbolizer(fill)) 40 | hood.styles.append("hood") 41 | m.layers.append(hood) 42 | 43 | blocks = Layer('blocks',"+proj=latlong +datum=WGS84") 44 | blocks.datasource = Ogr(base=base,file="blocks_"+woe_id+".json",layer='OGRGeoJSON') 45 | append_style('blocks', LineSymbolizer(Color('rgb(50%,50%,50%)'),1.0)) 46 | blocks.styles.append('blocks') 47 | m.layers.append(blocks) 48 | 49 | bounds = Layer('bounds', "+proj=latlong +datum=WGS84") 50 | bounds.datasource = Ogr(base=base,file=city+".json",layer="OGRGeoJSON") 51 | append_style("bounds", LineSymbolizer(Color('#222222'), 2.0)) 52 | text = TextSymbolizer("name", font, 12, Color("black")) 53 | text.allow_overlap = False 54 | text.avoid_edges = True 55 | text.wrap_width = 15 56 | halo_fill = [min(x+32, 255) for x in (fill.r, fill.g, fill.b)] 57 | text.halo_fill = Color(*halo_fill) 58 | text.halo_radius = 1 59 | append_style("bounds_label", text) 60 | # 61 | bounds.styles.append("bounds") 62 | bounds.styles.append("bounds_label") 63 | m.layers.append(bounds) 64 | 65 | m.zoom_to_box(hood.envelope()) 66 | render_to_file(m,city+'.png', 'png') 67 | -------------------------------------------------------------------------------- /outliers.py: -------------------------------------------------------------------------------- 1 | import numpy 2 | import sys 3 | import math 4 | 5 | MEDIAN_THRESHOLD = 5.0 6 | 7 | median_distance_cache = {} 8 | def median_distances(pts, aggregate=numpy.median): 9 | key = tuple(sorted(pts)) 10 | if key in median_distance_cache: return median_distance_cache[key] 11 | median = (numpy.median([pt[0] for pt in pts]), 12 | numpy.median([pt[1] for pt in pts])) 13 | distances = [] 14 | for pt in pts: 15 | dist = math.sqrt(((median[0]-pt[0])*math.cos(median[1]*math.pi/180.0))**2+(median[1]-pt[1])**2) 16 | distances.append((dist, pt)) 17 | 18 | median_dist = aggregate([dist for dist, pt in distances]) 19 | median_distance_cache[key] = (median_dist, distances) 20 | return (median_dist, distances) 21 | 22 | def mean_distances(pts): 23 | return median_distances(pts, numpy.mean) 24 | 25 | def load_points(point_file): 26 | places = {} 27 | count = 0 28 | for line in file(point_file): 29 | data = line.strip().split() 30 | place_id, lon, lat = data if len(data) == 3 else data[1:] 31 | place_id = int(place_id) 32 | point = (float(lon), float(lat)) 33 | pts = places.setdefault(place_id, set()) 34 | pts.add(point) 35 | count += 1 36 | if count % 1000 == 0: 37 | print >>sys.stderr, "\rRead %d points in %d places." % (count, len(places)), 38 | print >>sys.stderr, "\rRead %d points in %d places." % (count, len(places)) 39 | return places 40 | 41 | def discard_outliers(places, threshold=MEDIAN_THRESHOLD): 42 | count = 0 43 | discarded = 0 44 | result = {} 45 | for place_id, pts in places.items(): 46 | count += 1 47 | print >>sys.stderr, "\rComputing outliers for %d of %d places..." % (count, len(places)), 48 | median_dist, distances = median_distances(pts) 49 | keep = [pt for dist, pt in distances if dist < median_dist * threshold] 50 | discarded += len(pts) - len(keep) 51 | result[place_id] = keep 52 | print >>sys.stderr, "%d points discarded." 
% discarded 53 | return result 54 | 55 | def get_bbox_for_points(places): 56 | bbox = [180, 90, -180, -90] 57 | for pid, pts in places.items(): 58 | for pt in pts: 59 | for i in range(4): 60 | bbox[i] = min(bbox[i], pt[i%2]) if i<2 else max(bbox[i], pt[i%2]) 61 | return bbox 62 | 63 | def main(filename): 64 | places = load_points(filename) 65 | places = discard_outliers(places) 66 | bbox = get_bbox_for_points(places) 67 | #print ",".join(map(str, bbox)) 68 | print "%s %s, %s %s" % (bbox[0], bbox[1], bbox[2], bbox[3]) 69 | 70 | if __name__ == "__main__": 71 | main(sys.argv[1]) 72 | 73 | -------------------------------------------------------------------------------- /util/consolidate_geojson.py: -------------------------------------------------------------------------------- 1 | import json, sys, os.path 2 | 3 | woe = {} 4 | for line in file(sys.argv[1]): 5 | woe_id, name = line.strip().split(None,1) 6 | name = name.split("_")[0] 7 | woe[name] = int(woe_id) 8 | 9 | features = [] 10 | for fname in sys.argv[2:]: 11 | print >>sys.stderr, "-", fname 12 | name = os.path.basename(fname).split(".")[0].split("_")[0] 13 | collection = json.load(file(fname)) 14 | for record in collection["features"]: 15 | record["properties"]["city"] = name 16 | record["properties"]["parent_id"] = woe[name] 17 | record["properties"]["woe_type"] = "Suburb" if "_" not in fname else "LocalAdmin" 18 | features.append(record) 19 | 20 | json.dump( 21 | { "type": "FeatureCollection", "features": features }, 22 | sys.stdout) 23 | 24 | -------------------------------------------------------------------------------- /util/geoplanet.py: -------------------------------------------------------------------------------- 1 | 2 | import urllib, json, sys 3 | 4 | APPID = os.environ["YAHOO_APPID"] 5 | url = 'http://where.yahooapis.com/v1/places.q(%s)?select=long&format=json&appid=' 6 | 7 | for line in sys.stdin: 8 | query = url % line.strip() 9 | result = urllib.urlopen(query).read() 10 | result = json.loads(result) 11 | place = result['places']['place'][0] 12 | print place['woeid'], "\t", place["name"] 13 | -------------------------------------------------------------------------------- /util/upload_photos.py: -------------------------------------------------------------------------------- 1 | import Flickr.API 2 | import os, os.path, json, sys 3 | import xml.etree.ElementTree 4 | 5 | key, secret = os.environ["FLICKR_KEY"], os.environ["FLICKR_SECRET"] 6 | 7 | # flickr.test.echo: 8 | api = Flickr.API.API(key, secret) 9 | token = None 10 | 11 | # flickr.auth.getFrob: 12 | frob_request = Flickr.API.Request(method='flickr.auth.getFrob') 13 | frob_rsp = api.execute_request(frob_request) 14 | if frob_rsp.code == 200: 15 | frob_rsp_et = xml.etree.ElementTree.parse(frob_rsp) 16 | if frob_rsp_et.getroot().get('stat') == 'ok': 17 | frob = frob_rsp_et.findtext('frob') 18 | 19 | # get the desktop authentication url 20 | auth_url = api.get_authurl('write', frob=frob) 21 | 22 | # ask the user to authorize your app now using that url 23 | print "auth me: %s" % (auth_url,) 24 | input = raw_input("done [y]: ") 25 | if input.lower() not in ('', 'y', 'yes'): 26 | sys.exit() 27 | 28 | # flickr.auth.getToken: 29 | token_rsp = api.execute_request(Flickr.API.Request(method='flickr.auth.getToken', frob=frob, format='json', nojsoncallback=1)) 30 | if token_rsp.code == 200: 31 | token_rsp_json = json.load(token_rsp) 32 | if token_rsp_json['stat'] == 'ok': 33 | token = str(token_rsp_json['auth']['token']['_content']) 34 | 35 | for filename in sys.argv[1:]: 36 | 
photo = file(filename, "rb") 37 | filename = os.path.basename(filename) 38 | #upload_response = api.execute_upload(filename=filename, args={'auth_token':token, 'title':title, 'photo':photo}) 39 | 40 | upload_request = Flickr.API.Request("http://api.flickr.com/services/upload", auth_token=token, title=filename, photo=photo) 41 | upload_response = api.execute_request(upload_request, sign=True, encode=Flickr.API.encode_multipart_formdata) 42 | print upload_response 43 | --------------------------------------------------------------------------------