├── CHANGELOG.md
├── LICENSE
├── README.md
├── pyproject.toml
├── setup.cfg
├── setup.py
├── src
│   └── gjf
│       ├── __init__.py
│       ├── __main__.py
│       ├── cli.py
│       └── geojson_fixer.py
└── upload_to_pypi.sh


/CHANGELOG.md:
--------------------------------------------------------------------------------
# 0.1.2 (2021-06-22)
- Fixed bug where any field other than geometry is removed from Feature
- Raise an exception in case `gjf` is not able to fix the object

# 0.1.1 (2021-06-20)
- Fixed terminal not detecting `gjf` package

# 0.1.0 (2021-06-20)
- Fix all types of GeoJSON objects, including FeatureCollection and GeometryCollection. If there is nothing to fix, the object will be returned as is.
- Can validate GeoJSON objects, and print explanations if the object is not valid.
- Can be used within Python or from the command line

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright (c) 2021 Yazeed Almuqwishi

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# gjf: A tool for fixing invalid GeoJSON objects

The goal of this tool is to make it as easy as possible to fix invalid GeoJSON objects through Python or the command line.
## Installation
```shell
pip install gjf
```
Verify the installation by running
```shell
gjf --version
```
### Features
- Fix all types of GeoJSON objects, including FeatureCollection and GeometryCollection. If there is nothing to fix, the object will be returned as is.
- Can validate GeoJSON objects, and print explanations if the object is not valid.
- Can be used within Python or from the command line
## Usage
### Python
Say you have a GeoJSON object defined as follows:
```python
obj = {"type":"Polygon","coordinates":[[[45.892166,25.697688],[45.894522,25.696483],[45.897131,25.695144],[45.898814,25.694268],[45.900496,25.693394],[45.901284,25.692983],[45.903946,25.697312],[45.894791,25.701933],[45.894621,25.701657],[45.892593,25.698379],[45.892166,25.697688]],[[45.892086,25.697729],[45.892166,25.697688],[45.892086,25.697729]]]}
```
You can simply call `apply_fixes_if_needed`:
```python
from gjf.geojson_fixer import apply_fixes_if_needed

fixed_obj = apply_fixes_if_needed(obj)
```
You can also flip the coordinate order by setting `flip_coords`:
```python
from gjf.geojson_fixer import apply_fixes_if_needed

fixed_obj_with_flipped_coordinates = apply_fixes_if_needed(obj, flip_coords=True)
```

You can also check whether a GeoJSON object is valid by calling `validity`:
```python
from gjf.geojson_fixer import validity
validity(obj)
```
This returns `('invalid', ['Too few points in geometry component[45.892086 25.697729]', ''])`
### CLI
```shell
gjf invalid.geojson
```
`gjf` will fix the file and write the result to `invalid_fixed.geojson` by default. If you need the output directed elsewhere, use `--output-method` as described below. It is also possible to fix multiple files at once:
```shell
gjf invalid_1.geojson invalid_2.geojson
```
The command above writes the fixed GeoJSON objects to `invalid_1_fixed.geojson` and `invalid_2_fixed.geojson`.
#### CLI Arguments
- `--version` Print the version and exit
- `--validate` Validate the GeoJSON file and print the error message if it is not valid, without attempting to fix it.
- `--flip` Flip the coordinate order
- `-o, --output-method [overwrite|new_file|print]`
    - Default is `new_file`, where `gjf` writes the fixed GeoJSON object to a file with the postfix `_fixed`. `overwrite` writes the fixed GeoJSON object to the source file, overwriting the original file in the process. Lastly, `print` outputs the fixed GeoJSON object on the terminal.

```shell
gjf --output-method print invalid.geojson
```
This prints the fixed contents of `invalid.geojson` on the terminal.

### Issues
Feel free to open an issue if you have any problems.

### Special Thanks
- [Shapely](https://github.com/Toblerity/Shapely)
- [geojson-rewind](https://github.com/chris48s/geojson-rewind)

--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
[metadata]
name = gjf
version = attr: gjf.__init__.__version__
author = Yazeed Almuqwishi
author_email = yazeed.almuqwishi@gmail.com
keywords = geojson, fix, python, cli, validation
description = A tool to fix invalid GeoJSON objects
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/yazeed44/gjf
project_urls =
    Bug Tracker = https://github.com/yazeed44/gjf/issues
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent
license_files = LICENSE

[options]
include_package_data = True
package_dir =
    = src
packages = find:
install_requires =
    geojson
    geojson-rewind
    click
    Shapely >= 1.8a1

[options.packages.find]
where = src

[options.entry_points]
console_scripts =
    gjf = gjf.cli:main
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
import setuptools

setuptools.setup()
--------------------------------------------------------------------------------
/src/gjf/__init__.py:
--------------------------------------------------------------------------------
import logging
logging.basicConfig()
logger = logging.getLogger(__name__)
__version__ = "0.1.2"

--------------------------------------------------------------------------------
/src/gjf/__main__.py:
--------------------------------------------------------------------------------
from gjf.cli import main

main()

--------------------------------------------------------------------------------
/src/gjf/cli.py:
--------------------------------------------------------------------------------
import logging
import os
import click
import geojson

from gjf import logger, __version__
from gjf.geojson_fixer import validity, apply_fixes_if_needed


def handle_overwrite(geojson_files, fixed_geometries):
    # Write each fixed object back to the file it was read from
    file_paths = [file.name for file in geojson_files]
    for path, geometry in zip(file_paths, fixed_geometries):
        with open(path, 'w') as f:
            geojson.dump(geometry, f, ensure_ascii=False)
    click.echo(os.linesep.join([f"Wrote fixes to {path}" for path in file_paths]))


def handle_new_file(geojson_files, fixed_geometries, postfix="_fixed"):
    # Insert the postfix between the file name and its extension
    new_file_paths = [os.path.splitext(file.name)[0] + postfix + os.path.splitext(file.name)[-1] for file in
                      geojson_files]
    for path, geometry in zip(new_file_paths, fixed_geometries):
        with open(path, 'w') as f:
            geojson.dump(geometry, f, ensure_ascii=False)
    click.echo(os.linesep.join([f"Wrote fixes to {path}" for path in new_file_paths]))


@click.command()
@click.version_option(version=__version__)
@click.argument("geojson-files", nargs=-1, type=click.File())
@click.option("--validate/--fix", default=False,
              help="If --validate is set, the validity of the file(s) will be printed without fixing. Otherwise, gjf will attempt to fix the file(s)")
@click.option("-v", "--verbosity",
              type=click.Choice(["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], case_sensitive=False))
@click.option("-o", "--output-method", default="new_file", type=click.Choice(["overwrite", "new_file", "print"]),
              help="Choose how to output the fixed geometry: overwrite the source file, create a new file, or print it on the screen")
@click.option("--flip/--no-flip", default=False,
              help="Choose whether to flip coordinates order. For example, from [25, 50] to [50, 25]")
def main(geojson_files, validate, verbosity, output_method, flip):
    if verbosity:
        logger.setLevel(level=getattr(logging, verbosity.upper(), "NOTSET"))
    logger.debug("Started CLI with following parameters: validate: %s, verbosity: %s, output_method: %s, flip: %s", validate, verbosity, output_method, flip)
    if validate:
        click.echo(os.linesep.join([str(validity(geojson.load(file))) for file in geojson_files]))
    else:
        output_method = output_method.lower()
        fixed_geometries = [apply_fixes_if_needed(geojson.load(file), flip_coords=flip) for file in
                            geojson_files]
        if output_method == "overwrite":
            handle_overwrite(geojson_files, fixed_geometries)
        elif output_method == "new_file":
            handle_new_file(geojson_files, fixed_geometries)
        elif output_method == "print":
            # Serialize with geojson.dumps: str() on a plain dict prints a Python repr, not JSON
            click.echo(os.linesep.join([geojson.dumps(geometry, ensure_ascii=False) for geometry in fixed_geometries]))


if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/src/gjf/geojson_fixer.py:
--------------------------------------------------------------------------------
from geojson_rewind import rewind
from shapely.geometry import shape, mapping
from shapely.validation import make_valid, explain_validity
from gjf import logger


# TODO update to include all types of numbers, including numpy's
def __is_vertex(array):
    return len(array) == 2 and isinstance(array[0], (float, int))


# Convert from (latitude, longitude) to (longitude, latitude) format
# Supports any level of nesting
def flip_coordinates_order(geometry):
    if isinstance(geometry, dict):
        return {k: flip_coordinates_order(v) for k, v in geometry.items()}
    elif isinstance(geometry, list):
        return [geometry[1], geometry[0]] if __is_vertex(geometry) else [flip_coordinates_order(nested_arr) for
                                                                        nested_arr in geometry]
    else:
        return geometry


def __convert_tuples_of_tuples_to_list_of_lists(x):
    return list(map(__convert_tuples_of_tuples_to_list_of_lists, x)) if isinstance(x, (list, tuple)) else x


def __convert_tuples_to_lists_dict_recursive(x):
    if isinstance(x, tuple):
        return __convert_tuples_of_tuples_to_list_of_lists(x)
    elif isinstance(x, dict):
        return {k: __convert_tuples_to_lists_dict_recursive(v) for k, v in x.items()}
    elif isinstance(x, list):
        return [__convert_tuples_to_lists_dict_recursive(y) for y in x]
    else:
        return x


def __to_geojson(shapely_obj):
    # Shapely's mapping() returns tuples; GeoJSON consumers expect lists
    geojson_mapping = __convert_tuples_to_lists_dict_recursive(mapping(shapely_obj))
    return geojson_mapping


def __to_shapely(geojson_obj):
    return shape(geojson_obj)


def need_rewind(geojson_obj):
    # If rewind is generating a different object, that means we need to rewind
    return rewind(geojson_obj) != geojson_obj


def validity(geojson_obj):
    if geojson_obj["type"] == "FeatureCollection":
        collection_validity = [validity(feature) for feature in geojson_obj["features"]]
        final_txt = "valid" if all(
            [validity_tuple[0] == "valid" for validity_tuple in collection_validity]) else "invalid"
        final_explain = "" if final_txt == "valid" else [validity_tuple[1] for validity_tuple in collection_validity if
                                                         len(validity_tuple[1]) > 0]
        return final_txt, final_explain

    elif geojson_obj["type"] == "Feature":
        return validity(geojson_obj["geometry"])

    shapely_obj = __to_shapely(geojson_obj)
    valid_rewind_txt = "Polygons and MultiPolygons should follow the right-hand rule" if need_rewind(
        __to_geojson(shapely_obj)) else ""
    valid_txt = "valid" if (shapely_obj.is_valid and len(valid_rewind_txt) == 0) else "invalid"
    valid_explain = [explain_validity(shapely_obj), valid_rewind_txt] if valid_txt == "invalid" else []
    return valid_txt, valid_explain


def apply_fixes_if_needed(geojson_obj, flip_coords=False):
    # Handling Feature collection and Feature since they are not handled by Shapely
    if geojson_obj["type"] == "FeatureCollection":
        return {**geojson_obj,
                "features": [apply_fixes_if_needed(feature, flip_coords) for feature in geojson_obj["features"]]}
    elif geojson_obj["type"] == "Feature":
        # Pass flip_coords through so that flipping also applies to Feature geometries
        return {**geojson_obj, "geometry": apply_fixes_if_needed(geojson_obj["geometry"], flip_coords)}
    if flip_coords:
        geojson_obj = flip_coordinates_order(geojson_obj)
    valid_shapely = __to_shapely(geojson_obj)
    if not valid_shapely.is_valid:
        logger.info("Geometry is invalid. Will attempt to fix with make_valid")
        valid_shapely = make_valid(valid_shapely)
    if need_rewind(__to_geojson(valid_shapely)):
        logger.info("Polygons within the geometry are not following the right-hand rule. Will attempt to fix with rewind")
        valid_shapely = __to_shapely(rewind(__to_geojson(valid_shapely)))
    if not valid_shapely.is_valid or need_rewind(__to_geojson(valid_shapely)):
        raise NotImplementedError("gjf is unable to fix this object. Please open a GitHub issue to investigate")
    return __to_geojson(valid_shapely)

--------------------------------------------------------------------------------
/upload_to_pypi.sh:
--------------------------------------------------------------------------------
rm -rf dist/ src/gjf.egg-info/ && python -m build \
 && python3 -m twine upload --repository "$1" dist/*
--------------------------------------------------------------------------------
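The `flip_coords` option in `/src/gjf/geojson_fixer.py` rests on a recursive walk over nested coordinate lists. The block below is a minimal sketch of that recursion, assuming 2D positions only; the names `is_vertex` and `flip_positions` are hypothetical stand-ins and are not part of the package. It needs neither Shapely nor geojson-rewind, so it can be tried on its own.

```python
def is_vertex(value):
    """Return True for a two-element [x, y] position (hypothetical helper)."""
    # bool is excluded explicitly because it is a subclass of int.
    return (
        isinstance(value, list)
        and len(value) == 2
        and all(isinstance(n, (int, float)) and not isinstance(n, bool) for n in value)
    )


def flip_positions(node):
    """Swap the two numbers of every position found anywhere under node."""
    if isinstance(node, dict):
        return {key: flip_positions(val) for key, val in node.items()}
    if isinstance(node, list):
        if is_vertex(node):
            return [node[1], node[0]]
        return [flip_positions(item) for item in node]
    return node


feature = {
    "type": "Feature",
    "properties": {"name": "example"},
    "geometry": {"type": "LineString", "coordinates": [[25.0, 50.0], [26.0, 51.0]]},
}

# Only the geometry is walked; the other members of the Feature are left alone.
flipped = flip_positions(feature["geometry"])
print(flipped["coordinates"])
```

Flipping twice restores the input. Positions carrying a third value, such as elevation, are left unchanged by the two-element check, which mirrors the length test in the package's `__is_vertex`.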