├── .github └── workflows │ └── python-package.yml ├── .gitignore ├── MANIFEST.in ├── README.md ├── requirements.txt ├── reverse_geocode ├── __init__.py ├── countries.csv └── geocode.gz ├── setup.py └── test_reverse_geocode.py /.github/workflows/python-package.yml: -------------------------------------------------------------------------------- 1 | # This workflow will install Python dependencies, run tests and lint with a variety of Python versions 2 | # For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python 3 | 4 | name: Python package 5 | 6 | on: 7 | push: 8 | branches: [ "master" ] 9 | pull_request: 10 | branches: [ "master" ] 11 | 12 | jobs: 13 | build: 14 | 15 | runs-on: ubuntu-latest 16 | strategy: 17 | fail-fast: false 18 | matrix: 19 | python-version: ["3.9", "3.10", "3.11"] 20 | 21 | steps: 22 | - uses: actions/checkout@v3 23 | - name: Set up Python ${{ matrix.python-version }} 24 | uses: actions/setup-python@v3 25 | with: 26 | python-version: ${{ matrix.python-version }} 27 | - name: Install dependencies 28 | run: | 29 | python -m pip install --upgrade pip 30 | python -m pip install flake8 pytest 31 | if [ -f requirements.txt ]; then pip install -r requirements.txt; fi 32 | - name: Lint with flake8 33 | run: | 34 | # stop the build if there are Python syntax errors or undefined names 35 | flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics 36 | # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide 37 | flake8 . 
--count --exit-zero --max-complexity=10 --max-line-length=127 --statistics 38 | - name: Test with pytest 39 | run: | 40 | python -m pytest 41 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /.vscode/ 2 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include README.md reverse_geocode/countries.csv reverse_geocode/geocode.gz 2 | exclude test_reverse_geocode.py 3 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Reverse Geocode 2 | 3 | Reverse Geocode takes a latitude / longitude coordinate and returns the nearest known country, state, and city. 4 | This can be useful when you need to reverse geocode a large number of coordinates and a web API is not practical. 5 | 6 | The geocoded locations are from [geonames](http://download.geonames.org/export/dump/). This data is then structured into a [k-d tree](http://en.wikipedia.org/wiki/K-d_tree) for efficiently finding the nearest neighbour. 7 | 8 | Note that because this is a point-based rather than a polygon-based lookup, it will only give a rough idea of the location/city. 
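The nearest-neighbour idea can be sketched in a few lines of pure Python. This is an illustration only, not the package's implementation: the package builds a `scipy.spatial.cKDTree` over `(latitude, longitude)` pairs, whereas `build_kdtree`, `squared_distance`, and `nearest` below are hypothetical helper names, and the three sample cities are copied from the examples in this README.

```python
# Illustrative 2-d tree; the helper names here are assumptions, not package API.

def build_kdtree(points, depth=0):
    """Recursively split points on alternating axes (0 = latitude, 1 = longitude)."""
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def squared_distance(a, b):
    """Plain Euclidean distance (squared) measured in degrees."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    if best is None or squared_distance(node["point"], target) < squared_distance(best, target):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    # Visit the far side only if the splitting line is closer than the best match so far.
    if diff ** 2 < squared_distance(best, target):
        best = nearest(far, target, best)
    return best


cities = {
    (-37.814, 144.96332): "Melbourne",
    (40.71427, -74.00597): "New York City",
    (40.70906, -74.00317): "Seaport",
}
tree = build_kdtree(list(cities))
closest = nearest(tree, (40.71, -74.00))
print(cities[closest])
```

Because the lookup compares point distances only, a query just outside a large city can match a small neighbouring place whose stored point happens to be closer.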
9 | 10 | 11 | ## Example usage 12 | 13 | Example reverse geocoding a coordinate: 14 | 15 | ``` 16 | >>> import reverse_geocode 17 | >>> melbourne_coord = -37.81, 144.96 18 | >>> reverse_geocode.get(melbourne_coord) 19 | {'country_code': 'AU', 'city': 'Melbourne', 'latitude': -37.814, 'longitude': 144.96332, 'population': 4917750, 'state': 'Victoria', 'country': 'Australia'} 20 | ``` 21 | 22 | Example reverse geocoding a list of coordinates: 23 | ``` 24 | >>> nyc_coord = 40.71427000, -74.00597000 25 | >>> reverse_geocode.search((melbourne_coord, nyc_coord)) 26 | [{'country_code': 'AU', 'city': 'Melbourne', 'latitude': -37.814, 'longitude': 144.96332, 'population': 4917750, 'state': 'Victoria', 'country': 'Australia'}, 27 | {'country_code': 'US', 'city': 'New York City', 'latitude': 40.71427, 'longitude': -74.00597, 'population': 8804190, 'state': 'New York', 'country': 'United States'}] 28 | ``` 29 | 30 | By default the nearest known location is returned, which may not be as expected when there is a much larger city nearby. 31 | For example querying for the following coordinate near NYC will return Seaport: 32 | 33 | ``` 34 | >>> nyc_coordinate = 40.71, -74.00 35 | >>> reverse_geocode.get(nyc_coordinate) 36 | {'country_code': 'US', 'city': 'Seaport', 'latitude': 40.70906, 'longitude': -74.00317, 'population': 8385, 'state': 'New York', 'county': 'New York County', 'country': 'United States'} 37 | ``` 38 | 39 | To filter for larger cities a minimum population can be set. 
Using a minimum population of `100000` with the above coordinate now returns NYC: 40 | 41 | ``` 42 | >>> reverse_geocode.get(nyc_coordinate, min_population=100000) 43 | {'country_code': 'US', 'city': 'New York City', 'latitude': 40.71427, 'longitude': -74.00597, 'population': 8804190, 'state': 'New York', 'country': 'United States'} 44 | ``` 45 | 46 | 47 | ## Install 48 | 49 | ``` 50 | pip install reverse-geocode 51 | ``` 52 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | numpy==2.0.2 2 | scipy==1.13.1 3 | -------------------------------------------------------------------------------- /reverse_geocode/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | 3 | import csv 4 | import gzip 5 | import io 6 | import json 7 | import logging 8 | import os 9 | from scipy.spatial import cKDTree as KDTree 10 | import sys 11 | import zipfile 12 | from urllib.request import urlopen 13 | 14 | if sys.platform == "win32": 15 | csv.field_size_limit(2**31 - 1) 16 | else: 17 | csv.field_size_limit(sys.maxsize) 18 | 19 | # location of geocode data to download 20 | GEOCODE_URL = "http://download.geonames.org/export/dump/cities1000.zip" 21 | GEOCODE_FILENAME = "cities1000.txt" 22 | STATE_CODE_URL = "http://download.geonames.org/export/dump/admin1CodesASCII.txt" 23 | COUNTY_CODE_URL = "https://download.geonames.org/export/dump/admin2Codes.txt" 24 | 25 | 26 | class Singleton(type): 27 | _instances = {} 28 | 29 | def __call__(cls, *args, **kwargs): 30 | key = cls, args 31 | if key not in cls._instances: 32 | Singleton._instances[key] = super(Singleton, cls).__call__(*args, **kwargs) 33 | return Singleton._instances[key] 34 | 35 | 36 | class GeocodeData(metaclass=Singleton): 37 | def __init__( 38 | self, 39 | min_population=0, 40 | geocode_filename="geocode.gz", 41 | 
country_filename="countries.csv", 42 | ): 43 | def rel_path(filename): 44 | return os.path.join(os.getcwd(), os.path.dirname(__file__), filename) 45 | 46 | # note: remove geocode_filename to get updated data 47 | self._locations = self._extract( 48 | rel_path(geocode_filename), min_population 49 | ) 50 | coordinates = [(loc["latitude"], loc["longitude"]) for loc in self._locations] 51 | self._tree = KDTree(coordinates) 52 | self._load_countries(rel_path(country_filename)) 53 | 54 | def _load_countries(self, country_filename): 55 | """Load a map of country code to name""" 56 | self._countries = {} 57 | with open(country_filename, "r", encoding="utf-8") as handler: 58 | for code, name in csv.reader(handler): 59 | self._countries[code] = name 60 | 61 | def query(self, coordinates): 62 | """Find closest match to this list of coordinates""" 63 | try: 64 | distances, indices = self._tree.query(coordinates, k=1) 65 | except ValueError as e: 66 | logging.info("Unable to parse coordinates: {}".format(coordinates)) 67 | raise e 68 | else: 69 | results = [self._locations[index] for index in indices] 70 | for result in results: 71 | result["country"] = self._countries.get(result["country_code"], "") 72 | return results 73 | 74 | def _download_geocode(self): 75 | """Download geocode data from http://download.geonames.org/export/dump/""" 76 | 77 | def geocode_csv_reader(data): 78 | return csv.reader(data.decode("utf-8").splitlines(), delimiter="\t") 79 | 80 | #with zipfile.ZipFile(open('cities1000.zip', 'rb')) as geocode_zipfile: 81 | with zipfile.ZipFile( 82 | io.BytesIO(urlopen(GEOCODE_URL).read()) 83 | ) as geocode_zipfile: 84 | geocode_reader = geocode_csv_reader(geocode_zipfile.read(GEOCODE_FILENAME)) 85 | 86 | state_reader = geocode_csv_reader(urlopen(STATE_CODE_URL).read()) 87 | county_reader = geocode_csv_reader(urlopen(COUNTY_CODE_URL).read()) 88 | return ( 89 | geocode_reader, 90 | self._gen_code_map(state_reader), 91 | self._gen_code_map(county_reader), 92 | ) 93 | 94 
| def _gen_code_map(self, state_reader): 95 | """Build a map of code data from geonames""" 96 | state_code_map = {} 97 | for row in state_reader: 98 | state_code_map[row[0]] = row[1] 99 | return state_code_map 100 | 101 | def _extract(self, local_filename, min_population): 102 | """Extract locations from geonames and store locally""" 103 | if os.path.exists(local_filename): 104 | with gzip.open(local_filename) as gz: 105 | locations = json.loads(gz.read()) 106 | else: 107 | print('Downloading geocode data') 108 | geocode_reader, state_code_map, county_code_map = self._download_geocode() 109 | 110 | # extract coordinates into more compact JSON for faster loading 111 | locations = [] 112 | for row in geocode_reader: 113 | latitude = float(row[4]) 114 | longitude = float(row[5]) 115 | country_code = row[8] 116 | if latitude and longitude and country_code: 117 | city = row[1] 118 | state_code = row[8] + "." + row[10] 119 | state = state_code_map.get(state_code) 120 | county_code = state_code + "." 
+ row[11] 121 | county = county_code_map.get(county_code) 122 | population = int(row[14]) 123 | loc = { 124 | "country_code": country_code, 125 | "city": city, 126 | "latitude": latitude, 127 | "longitude": longitude, 128 | "population": population, 129 | } 130 | if state: 131 | loc["state"] = state 132 | if county and county != city: 133 | loc["county"] = county 134 | locations.append(loc) 135 | 136 | with gzip.open(local_filename, 'w') as gz: 137 | gz.write(json.dumps(locations, separators=(',', ':')).encode('utf-8')) 138 | 139 | if min_population > 0: 140 | locations = [ 141 | loc for loc in locations if loc["population"] >= min_population 142 | ] 143 | 144 | return locations 145 | 146 | 147 | def get(coordinate, min_population=0): 148 | """Search for closest known location to this lat/lng coordinate""" 149 | return GeocodeData(min_population).query([coordinate])[0] 150 | 151 | 152 | def search(coordinates, min_population=0): 153 | """Search for closest known location at each of given lat/lng coordinates""" 154 | return GeocodeData(min_population).query(coordinates) 155 | 156 | 157 | if __name__ == "__main__": 158 | # test some coordinate lookups 159 | city1 = -37.81, 144.96 160 | city2 = -38.3401, 144.7365 161 | city3 = 40.71, -74.00 162 | print(get(city1)) 163 | print(get(city2)) 164 | print(search([city1, city3], 100000)) 165 | -------------------------------------------------------------------------------- /reverse_geocode/countries.csv: -------------------------------------------------------------------------------- 1 | AD,Andorra 2 | AE,United Arab Emirates 3 | AF,Afghanistan 4 | AG,Antigua and Barbuda 5 | AI,Anguilla 6 | AL,Albania 7 | AM,Armenia 8 | AO,Angola 9 | AP,Asia/Pacific Region 10 | AQ,Antarctica 11 | AR,Argentina 12 | AS,American Samoa 13 | AT,Austria 14 | AU,Australia 15 | AW,Aruba 16 | AX,Aland Islands 17 | AZ,Azerbaijan 18 | BA,Bosnia and Herzegovina 19 | BB,Barbados 20 | BD,Bangladesh 21 | BE,Belgium 22 | BF,Burkina Faso 23 | BG,Bulgaria 24 
| BH,Bahrain 25 | BI,Burundi 26 | BJ,Benin 27 | BL,Saint Bartelemey 28 | BM,Bermuda 29 | BN,Brunei Darussalam 30 | BO,Bolivia 31 | BQ,"Bonaire, Saint Eustatius and Saba" 32 | BR,Brazil 33 | BS,Bahamas 34 | BT,Bhutan 35 | BV,Bouvet Island 36 | BW,Botswana 37 | BY,Belarus 38 | BZ,Belize 39 | CA,Canada 40 | CC,Cocos (Keeling) Islands 41 | CD,"Congo, The Democratic Republic of the" 42 | CF,Central African Republic 43 | CG,Congo 44 | CH,Switzerland 45 | CI,Cote d'Ivoire 46 | CK,Cook Islands 47 | CL,Chile 48 | CM,Cameroon 49 | CN,China 50 | CO,Colombia 51 | CR,Costa Rica 52 | CU,Cuba 53 | CV,Cape Verde 54 | CW,Curacao 55 | CX,Christmas Island 56 | CY,Cyprus 57 | CZ,Czech Republic 58 | DE,Germany 59 | DJ,Djibouti 60 | DK,Denmark 61 | DM,Dominica 62 | DO,Dominican Republic 63 | DZ,Algeria 64 | EC,Ecuador 65 | EE,Estonia 66 | EG,Egypt 67 | EH,Western Sahara 68 | ER,Eritrea 69 | ES,Spain 70 | ET,Ethiopia 71 | EU,Europe 72 | FI,Finland 73 | FJ,Fiji 74 | FK,Falkland Islands (Malvinas) 75 | FM,"Micronesia, Federated States of" 76 | FO,Faroe Islands 77 | FR,France 78 | GA,Gabon 79 | GB,United Kingdom 80 | GD,Grenada 81 | GE,Georgia 82 | GF,French Guiana 83 | GG,Guernsey 84 | GH,Ghana 85 | GI,Gibraltar 86 | GL,Greenland 87 | GM,Gambia 88 | GN,Guinea 89 | GP,Guadeloupe 90 | GQ,Equatorial Guinea 91 | GR,Greece 92 | GS,South Georgia and the South Sandwich Islands 93 | GT,Guatemala 94 | GU,Guam 95 | GW,Guinea-Bissau 96 | GY,Guyana 97 | HK,Hong Kong 98 | HM,Heard Island and McDonald Islands 99 | HN,Honduras 100 | HR,Croatia 101 | HT,Haiti 102 | HU,Hungary 103 | ID,Indonesia 104 | IE,Ireland 105 | IL,Israel 106 | IM,Isle of Man 107 | IN,India 108 | IO,British Indian Ocean Territory 109 | IQ,Iraq 110 | IR,"Iran, Islamic Republic of" 111 | IS,Iceland 112 | IT,Italy 113 | JE,Jersey 114 | JM,Jamaica 115 | JO,Jordan 116 | JP,Japan 117 | KE,Kenya 118 | KG,Kyrgyzstan 119 | KH,Cambodia 120 | KI,Kiribati 121 | KM,Comoros 122 | KN,Saint Kitts and Nevis 123 | KP,"Korea, Democratic People's 
Republic of" 124 | KR,"Korea, Republic of" 125 | KW,Kuwait 126 | KY,Cayman Islands 127 | KZ,Kazakhstan 128 | LA,Lao People's Democratic Republic 129 | LB,Lebanon 130 | LC,Saint Lucia 131 | LI,Liechtenstein 132 | LK,Sri Lanka 133 | LR,Liberia 134 | LS,Lesotho 135 | LT,Lithuania 136 | LU,Luxembourg 137 | LV,Latvia 138 | LY,Libyan Arab Jamahiriya 139 | MA,Morocco 140 | MC,Monaco 141 | MD,"Moldova, Republic of" 142 | ME,Montenegro 143 | MF,Saint Martin 144 | MG,Madagascar 145 | MH,Marshall Islands 146 | MK,Macedonia 147 | ML,Mali 148 | MM,Myanmar 149 | MN,Mongolia 150 | MO,Macao 151 | MP,Northern Mariana Islands 152 | MQ,Martinique 153 | MR,Mauritania 154 | MS,Montserrat 155 | MT,Malta 156 | MU,Mauritius 157 | MV,Maldives 158 | MW,Malawi 159 | MX,Mexico 160 | MY,Malaysia 161 | MZ,Mozambique 162 | NA,Namibia 163 | NC,New Caledonia 164 | NE,Niger 165 | NF,Norfolk Island 166 | NG,Nigeria 167 | NI,Nicaragua 168 | NL,Netherlands 169 | NO,Norway 170 | NP,Nepal 171 | NR,Nauru 172 | NU,Niue 173 | NZ,New Zealand 174 | OM,Oman 175 | PA,Panama 176 | PE,Peru 177 | PF,French Polynesia 178 | PG,Papua New Guinea 179 | PH,Philippines 180 | PK,Pakistan 181 | PL,Poland 182 | PM,Saint Pierre and Miquelon 183 | PN,Pitcairn 184 | PR,Puerto Rico 185 | PS,Palestinian Territory 186 | PT,Portugal 187 | PW,Palau 188 | PY,Paraguay 189 | QA,Qatar 190 | RE,Reunion 191 | RO,Romania 192 | RS,Serbia 193 | RU,Russian Federation 194 | RW,Rwanda 195 | SA,Saudi Arabia 196 | SB,Solomon Islands 197 | SC,Seychelles 198 | SD,Sudan 199 | SE,Sweden 200 | SG,Singapore 201 | SH,Saint Helena 202 | SI,Slovenia 203 | SJ,Svalbard and Jan Mayen 204 | SK,Slovakia 205 | SL,Sierra Leone 206 | SM,San Marino 207 | SN,Senegal 208 | SO,Somalia 209 | SR,Suriname 210 | SS,South Sudan 211 | ST,Sao Tome and Principe 212 | SV,El Salvador 213 | SX,Sint Maarten 214 | SY,Syrian Arab Republic 215 | SZ,Swaziland 216 | TC,Turks and Caicos Islands 217 | TD,Chad 218 | TF,French Southern Territories 219 | TG,Togo 220 | TH,Thailand 221 | 
TJ,Tajikistan 222 | TK,Tokelau 223 | TL,Timor-Leste 224 | TM,Turkmenistan 225 | TN,Tunisia 226 | TO,Tonga 227 | TR,Turkey 228 | TT,Trinidad and Tobago 229 | TV,Tuvalu 230 | TW,Taiwan 231 | TZ,"Tanzania, United Republic of" 232 | UA,Ukraine 233 | UG,Uganda 234 | UM,United States Minor Outlying Islands 235 | US,United States 236 | UY,Uruguay 237 | UZ,Uzbekistan 238 | VA,Holy See (Vatican City State) 239 | VC,Saint Vincent and the Grenadines 240 | VE,Venezuela 241 | VG,"Virgin Islands, British" 242 | VI,"Virgin Islands, U.S." 243 | VN,Vietnam 244 | VU,Vanuatu 245 | WF,Wallis and Futuna 246 | WS,Samoa 247 | XK,Kosovo 248 | YE,Yemen 249 | YT,Mayotte 250 | ZA,South Africa 251 | ZM,Zambia 252 | ZW,Zimbabwe 253 | -------------------------------------------------------------------------------- /reverse_geocode/geocode.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/richardpenman/reverse_geocode/07fc5edc9458f57e7f6503ae1c01668cd97f2a79/reverse_geocode/geocode.gz -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import os 2 | from setuptools import setup 3 | 4 | 5 | def read(filename): 6 | return open(os.path.join(os.path.dirname(__file__), filename)).read() 7 | 8 | 9 | setup( 10 | name="reverse_geocode", 11 | version="1.6.5", 12 | packages=["reverse_geocode"], 13 | package_dir={"reverse_geocode": "reverse_geocode"}, 14 | package_data={"reverse_geocode": ["countries.csv", "geocode.gz"]}, 15 | author="Richard Penman", 16 | author_email="richard.penman@gmail.com", 17 | description="Reverse geocode the given latitude / longitude", 18 | long_description=read("README.md"), 19 | long_description_content_type='text/markdown', 20 | url="https://github.com/richardpenman/reverse_geocode/", 21 | classifiers=[ 22 | "Environment :: Web Environment", 23 | "Intended Audience :: Developers", 
24 | "License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)", 25 | "Operating System :: OS Independent", 26 | "Programming Language :: Python", 27 | "Programming Language :: Python :: 3", 28 | ], 29 | license="lgpl", 30 | install_requires=["numpy", "scipy"], 31 | ) 32 | -------------------------------------------------------------------------------- /test_reverse_geocode.py: -------------------------------------------------------------------------------- 1 | import reverse_geocode 2 | import unittest 3 | 4 | 5 | class TestReverseGeocode(unittest.TestCase): 6 | def test_get(self): 7 | coordinate = -37.81, 144.96 8 | results = reverse_geocode.get(coordinate) 9 | self.assertEqual( 10 | results, 11 | {'country_code': 'AU', 'city': 'Melbourne', 'latitude': -37.814, 'longitude': 144.96332, 'population': 4917750, 'state': 'Victoria', 'country': 'Australia'} 12 | ) 13 | 14 | def test_search(self): 15 | coordinates = (-37.81, 144.96), (40.71427000, -74.00597000) 16 | results = reverse_geocode.search(coordinates) 17 | self.assertEqual( 18 | results, 19 | [ 20 | {'country_code': 'AU', 'city': 'Melbourne', 'latitude': -37.814, 'longitude': 144.96332, 'population': 4917750, 'state': 'Victoria', 'country': 'Australia'}, 21 | {'country_code': 'US', 'city': 'New York City', 'latitude': 40.71427, 'longitude': -74.00597, 'population': 8804190, 'state': 'New York', 'country': 'United States'}, 22 | ], 23 | ) 24 | 25 | def test_population(self): 26 | # a coordinate near NYC 27 | nyc_coordinate = 40.706322, -74.002640 28 | # try searching for NYC with all data and get a smaller area called Financial District 29 | all_cities_result = reverse_geocode.get(nyc_coordinate, 0) 30 | self.assertEqual( 31 | all_cities_result, 32 | {'country_code': 'US', 'city': 'Financial District', 'latitude': 40.70789, 'longitude': -74.00857, 'population': 60976, 'state': 'New York', 'county': 'New York County', 'country': 'United States'} 33 | ) 34 | 35 | # when restricted to big cities, get the 
expected match 36 | big_cities_result = reverse_geocode.get(nyc_coordinate, 100000) 37 | self.assertEqual( 38 | big_cities_result, 39 | {'country_code': 'US', 'city': 'New York City', 'latitude': 40.71427, 'longitude': -74.00597, 'population': 8804190, 'state': 'New York', 'country': 'United States'} 40 | ) 41 | 42 | 43 | 44 | if __name__ == "__main__": 45 | unittest.main() 46 | --------------------------------------------------------------------------------