├── tests
│   ├── __init__.py
│   ├── test_api.py
│   ├── test_auth.py
│   ├── test_request_helpers.py
│   └── test_models.py
├── modis_tools
│   ├── __init__.py
│   ├── constants
│   │   ├── __init__.py
│   │   ├── mimetypes.py
│   │   └── urls.py
│   ├── decorators.py
│   ├── api.py
│   ├── models.py
│   ├── auth.py
│   ├── resources.py
│   ├── granule_handler.py
│   └── request_helpers.py
├── MANIFEST.in
├── requirements.txt
├── pytest.ini
├── .pre-commit-config.yaml
├── .github
│   ├── CODEOWNERS
│   ├── ISSUE_TEMPLATE
│   │   ├── feature_request.md
│   │   └── bug_report.md
│   ├── workflows
│   │   └── pull_request.yml
│   └── pull_request_template.md
├── setup.py
├── .gitignore
├── CONTRIBUTING.md
├── LICENSE
└── README.md

/tests/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/modis_tools/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/modis_tools/constants/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include requirements.txt
2 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | requests==2.*
2 | pydantic==2.*
3 | python-dateutil==2.*
4 | shapely==2.*
5 | tqdm>=4.42.0
6 | setuptools==69.2.0
--------------------------------------------------------------------------------
/pytest.ini:
--------------------------------------------------------------------------------
1 | [pytest]
2 | markers =
3 |     int: integration tests skipped by default, (only evaluate with 'pytest --int')
4 |     e2e: end to end tests skipped by default, (only evaluate with 'pytest --e2e')
5 |
6 | testpaths =
7 |     tests
8 |
--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
1 | - repo: https://github.com/astral-sh/ruff-pre-commit
2 |   # Ruff version.
3 |   rev: v0.5.1
4 |   hooks:
5 |     # Run the linter.
6 |     - id: ruff
7 |       args: [ --fix ]
8 |     # Run the formatter.
9 |     - id: ruff-format
--------------------------------------------------------------------------------
/.github/CODEOWNERS:
--------------------------------------------------------------------------------
1 | # These owners will be the default owners for everything in
2 | # the repo. Unless a later match takes precedence,
3 | # e.g. @jamie-sgro will be requested for review when
4 | # someone opens a pull request.
5 | * @cearlefraym @jamie-sgro @jtanwk @lmcindewar @ShengpeiWang
--------------------------------------------------------------------------------
/modis_tools/constants/mimetypes.py:
--------------------------------------------------------------------------------
1 | from enum import Enum
2 |
3 |
4 | class MimeTypes(Enum):
5 |     html = "text/html"
6 |     json = "application/json"
7 |     xml = "application/xml"
8 |     echo10 = "application/echo10+xml"
9 |     iso = "application/iso19115+xml"
10 |     iso19115 = "application/iso19115+xml"
11 |     dif = "application/dif+xml"
12 |     dif10 = "application/dif10+xml"
13 |     csv = "text/csv"
14 |     atom = "application/atom+xml"
15 |     opendata = "application/opendata+json"
16 |     kml = "application/vnd.google-earth.kml+xml"
17 |     native = "application/metadata+xml"
18 |     umm_json = "application/vnd.nasa.cmr.umm_results+json"
19 |
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/feature_request.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Feature request
3 | about: Suggest an idea for this project
4 | title: ''
5 | labels: ''
6 | assignees: ''
7 |
8 | ---
9 |
10 | **Is your feature request related to a problem? Please describe.**
11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
12 |
13 | **Describe the solution you'd like**
14 | A clear and concise description of what you want to happen.
15 |
16 | **Describe alternatives you've considered**
17 | A clear and concise description of any alternative solutions or features you've considered.
18 |
19 | **Additional context**
20 | Add any other context or screenshots about the feature request here.
21 |
--------------------------------------------------------------------------------
/modis_tools/constants/urls.py:
--------------------------------------------------------------------------------
1 | """URLs for the API"""
2 |
3 | from enum import Enum
4 |
5 |
6 | class URLs(Enum):
7 |     """URLs"""
8 |
9 |     API: str = "cmr.earthdata.nasa.gov"
10 |     URS: str = "urs.earthdata.nasa.gov"
11 |     RESOURCE: str = "e4ftl01.cr.usgs.gov"
12 |     MOD11A2_V061_RESOURCE: str = "data.lpdaac.earthdatacloud.nasa.gov"
13 |     NSIDC_RESOURCE: str = "n5eil01u.ecs.nsidc.org"
14 |     EARTHDATA: str = ".earthdata.nasa.gov"
15 |     LAADS_RESOURCE: str = "ladsweb.modaps.eosdis.nasa.gov"
16 |     LAADS_CLOUD_RESOURCE: str = "data.laadsdaac.earthdatacloud.nasa.gov"
17 |     MODISA_L3b_CHL_V061_RESOURCE: str = "oceancolor.gsfc.nasa.gov"
18 |     MODISA_L3b_CHL_V061_RESOURCE_SCI: str = "oceandata.sci.gsfc.nasa.gov"
19 |
--------------------------------------------------------------------------------
/.github/workflows/pull_request.yml:
--------------------------------------------------------------------------------
1 | name: Pull Request
2 |
3 | on:
4 |   pull_request:
5 |     branches: [ main ]
6 |     types: [opened, reopened, ready_for_review]
7 |
8 | jobs:
9 |   build_and_test:
10 |
11 |     runs-on: ubuntu-latest
12 |     strategy:
13 |       matrix:
14 |         python-version: [ "3.8", "3.9", "3.10" ]
15 |     steps:
16 |       - uses: actions/checkout@v2
17 |       - name: Set up Python ${{ matrix.python-version }}
18 |         uses: actions/setup-python@v2
19 |         with:
20 |           python-version: ${{ matrix.python-version }}
21 |       - name: Install dependencies
22 |         run: |
23 |           sudo apt-get update && \
24 |           sudo apt-get install -y \
25 |             libgeos-dev
26 |           python -m pip install --upgrade pip
27 |           python -m pip install -e .[test]
28 |       - name: Test with pytest
29 |         run: |
30 |           pytest
31 |
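
Note: pytest.ini above registers "int" and "e2e" markers that are skipped unless pytest is invoked with --int or --e2e, and the workflow above runs plain pytest, so those suites do not run on pull requests. The conftest.py that defines those command-line options is not included in this listing; the following is a minimal, hypothetical sketch of how such flags are typically wired up, not the repository's actual conftest.

# conftest.py -- illustrative sketch only (assumed, not part of the repository listing)
import pytest


def pytest_addoption(parser):
    # Opt-in flags matching the markers declared in pytest.ini.
    parser.addoption("--int", action="store_true", default=False, help="run integration tests")
    parser.addoption("--e2e", action="store_true", default=False, help="run end-to-end tests")


def pytest_collection_modifyitems(config, items):
    # Skip marked tests unless the corresponding flag was passed on the command line.
    for flag, marker in (("--int", "int"), ("--e2e", "e2e")):
        if config.getoption(flag):
            continue
        skip = pytest.mark.skip(reason=f"needs {flag} option to run")
        for item in items:
            if marker in item.keywords:
                item.add_marker(skip)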
-------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import setuptools 2 | 3 | with open("README.md", "r", encoding="utf-8") as fh: 4 | long_description = fh.read() 5 | 6 | with open("requirements.txt", "r", encoding="utf-8") as fh: 7 | install_requires = fh.read().split() 8 | 9 | setuptools.setup( 10 | name="modis_tools", 11 | version="2.0.0", 12 | author="fraym", 13 | author_email="datascience@fraym.io", 14 | description="Tools for working with the MODIS API and MODIS data.", 15 | long_description=long_description, 16 | long_description_content_type="text/markdown", 17 | url="https://github.com/fraymio/modis-tools.git", 18 | packages=setuptools.find_packages(), 19 | classifiers=[ 20 | "Development Status :: 5 - Production/Stable", 21 | "Programming Language :: Python :: 3.7", 22 | "Operating System :: OS Independent", 23 | ], 24 | python_requires=">=3.7", 25 | install_requires=install_requires, 26 | extras_require={ 27 | "test": ["pytest"], 28 | "gdal": ["gdal"], 29 | }, 30 | ) 31 | -------------------------------------------------------------------------------- /modis_tools/decorators.py: -------------------------------------------------------------------------------- 1 | """Decorator for query parameters.""" 2 | 3 | from inspect import signature 4 | from functools import wraps 5 | 6 | 7 | def _process_requests_args(*, reqarg_name, reqargs): 8 | def decorator(func): 9 | @wraps(func) 10 | def wrapped(*fargs, **fkwargs): 11 | processed = {} 12 | for reqarg in reqargs: 13 | arg_sig = signature(reqarg.__init__) 14 | init_args = { 15 | p: fkwargs.pop(p) for p in arg_sig.parameters if p in fkwargs 16 | } 17 | if init_args: 18 | reqarg_inst = reqarg(**init_args) 19 | processed.update(reqarg_inst.to_dict()) 20 | result = func(*fargs, **fkwargs, **{reqarg_name: processed}) 21 | return result 22 | 23 | return wrapped 24 | 25 | return decorator 26 | 27 | 28 | def file_args(*args): 29 | """Process arguments for `files` parameter of requests""" 30 | return _process_requests_args(reqarg_name="files", reqargs=args) 31 | 32 | 33 | def params_args(*args): 34 | """Process arguments for `params` parameter of requests""" 35 | return _process_requests_args(reqarg_name="params", reqargs=args) 36 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. See error 19 | 20 | [Here](https://stackoverflow.com/help/minimal-reproducible-example) is a link on how to make a good reproducible example. 21 | 22 | **Expected behavior and actual behavior** 23 | Please tell us what results you expected, as well as the output you got instead. If providing an error message, please make sure it's readable by formatting it with a code block. 24 | 25 | For example: I expected to download Nigeria MOD13A2 tiles. However, the following exception occurred: 26 | 27 | **Screenshots** 28 | If applicable, add screenshots to help explain your problem. 
29 | 30 | **Desktop (please complete the following information):** 31 | - OS: [e.g. iOS] 32 | - Browser [e.g. chrome, safari] 33 | - Version [e.g. 22] 34 | 35 | **Your Environment:** 36 | - `modis_tools` version used: 37 | - Other dependencies and packages installed in environment you are using (e.g. output of `pip freeze` or `conda list` or `poetry show`): 38 | 39 | **Any Additional context** 40 | Add any other context about the problem here. 41 | -------------------------------------------------------------------------------- /.github/pull_request_template.md: -------------------------------------------------------------------------------- 1 | ## Description 2 | 3 | 4 | 5 | Closes # (issue) 6 | 7 | ## Type of change 8 | 9 | Please delete options that are not relevant. 10 | 11 | - [ ] Bug fix (non-breaking change which fixes an issue) 12 | - [ ] New feature (non-breaking change which adds functionality) 13 | - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) 14 | - [ ] This change requires a documentation update 15 | 16 | ## How Has This Been Tested? 17 | 18 | 19 | 20 | - [ ] Test A 21 | - [ ] Test B 22 | 23 | ## Checklist: 24 | 25 | - [ ] I have performed a self-review of my own code 26 | - [ ] I have commented my code, particularly in hard-to-understand areas 27 | - [ ] I have made corresponding changes to the documentation 28 | - [ ] My changes generate no new warnings 29 | - [ ] I have added tests that prove my fix is effective or that my feature works 30 | - [ ] New and existing unit tests pass locally with my changes 31 | - [ ] Any dependent changes have been merged and published in downstream modules 32 | - [ ] I have checked my code and corrected any misspellings 33 | 34 | ## Next Steps 35 | - [ ] Assign a reviewer based on the [code owner document](https://github.com/fraymio/modis_tools/blob/main/.github/CODEOWNERS). 36 | 37 | - [ ] Once your review is approved, merge and delete the feature branch 38 | 39 | On behalf of the Modis Tools Dev Team, thank you for your hard work! 
✨ -------------------------------------------------------------------------------- /tests/test_api.py: -------------------------------------------------------------------------------- 1 | from functools import partial 2 | 3 | from requests.sessions import Session 4 | import pytest 5 | 6 | from modis_tools.api import ModisApi 7 | from modis_tools.auth import ModisSession 8 | from modis_tools.constants.urls import URLs 9 | 10 | URL = URLs.API.value 11 | 12 | 13 | class TestModisApi: 14 | def test_can_init_with_username_and_password(self): 15 | api = ModisApi(username="", password="") 16 | assert isinstance(api.session, Session) 17 | 18 | def test_can_init_with_session(self): 19 | api = ModisApi(session=ModisSession("", "")) 20 | assert isinstance(api.session, Session) 21 | 22 | def test_mime_type_defaults_to_json(self): 23 | api = ModisApi(session=ModisSession("", "")) 24 | assert api.mime_type == "json" 25 | 26 | def test_url_derives_from_resource(self): 27 | api = ModisApi(session=ModisSession("", "")) 28 | api.resource = "test" 29 | assert api.resource_url == f"https://{URL}/search/test" 30 | 31 | def test_changes_in_resource_also_update_url(self): 32 | api = ModisApi(session=ModisSession("", "")) 33 | api.resource = "test" 34 | assert api.resource_url == f"https://{URL}/search/test" 35 | api = ModisApi(session=ModisSession("", "")) 36 | api.resource = "new_test" 37 | assert api.resource_url == f"https://{URL}/search/new_test" 38 | 39 | def test_bad_mime_type_raises_exception(self): 40 | api = ModisApi(session=ModisSession("", "")) 41 | with pytest.raises(Exception): 42 | api.mime_type = "bad_type" 43 | 44 | def test_get_is_a_partial_method(self): 45 | api = ModisApi(session=ModisSession("", "")) 46 | api.resource = "test" 47 | get = api.get 48 | 49 | assert callable(get) 50 | assert isinstance(get, partial) 51 | assert get.keywords["url"] == f"https://{URL}/search/test" 52 | 53 | def test_post_is_a_partial_method(self): 54 | api = ModisApi(session=ModisSession("", "")) 55 | api.resource = "test" 56 | post = api.post 57 | 58 | assert callable(post) 59 | assert isinstance(post, partial) 60 | assert post.keywords["url"] == f"https://{URL}/search/test" 61 | 62 | def test_get_is_a_delegate_of_session_dot_get(self): 63 | api = ModisApi(session=ModisSession("", "")) 64 | api.resource = "test" 65 | get = api.get 66 | assert get.func == api.session.get 67 | 68 | def test_post_is_a_delegate_of_session_dot_post(self): 69 | api = ModisApi(session=ModisSession("", "")) 70 | api.resource = "test" 71 | post = api.post 72 | assert post.func == api.session.post 73 | -------------------------------------------------------------------------------- /modis_tools/api.py: -------------------------------------------------------------------------------- 1 | """Base API to use with MODIS.""" 2 | 3 | import copy 4 | from functools import partial 5 | from typing import Callable, Optional, Union 6 | from requests.models import Response 7 | 8 | from requests.sessions import Session 9 | 10 | from .auth import ModisSession 11 | from .constants.mimetypes import MimeTypes 12 | from .constants.urls import URLs 13 | 14 | Sessions = Union[ModisSession, Session] 15 | 16 | 17 | class ModisApi: 18 | """General class for MODIS CMR API 19 | 20 | Parameters set on the object are included in all requests, and 21 | overridden by those specificed in function call 22 | """ 23 | 24 | resource: str 25 | params: Optional[dict] = None 26 | _mime_type: str = "json" 27 | 28 | def __init__( 29 | self, 30 | session: Optional[Sessions] = 
None, 31 | username: Optional[str] = None, 32 | password: Optional[str] = None, 33 | ): 34 | if isinstance(session, ModisSession): 35 | self.session = session.session 36 | elif isinstance(session, Session): 37 | if session.auth is not None: 38 | self.session = session 39 | else: 40 | modis_session = ModisSession(username=username, password=password) 41 | self.session = modis_session.session 42 | self.session.headers["Accept"] = MimeTypes[self._mime_type].value 43 | 44 | @property 45 | def resource_url(self) -> str: 46 | """Resource URL.""" 47 | return "/".join(["https:/", URLs.API.value, "search", self.resource]) 48 | 49 | @property 50 | def get(self) -> Callable[..., Response]: 51 | """Handle get requests.""" 52 | return partial(self.session.get, url=self.resource_url) 53 | 54 | @property 55 | def post(self) -> Callable[..., Response]: 56 | """Handle post requests.""" 57 | return partial(self.session.post, url=self.resource_url) 58 | 59 | @property 60 | def mime_type(self) -> str: 61 | """Mime type for data returned from API.""" 62 | return self._mime_type 63 | 64 | @mime_type.setter 65 | def mime_type(self, mime_type): 66 | """Validate the availaility of mime type before setting""" 67 | try: 68 | self.session.headers["Accept"] = MimeTypes[mime_type].value 69 | self._mime_type = mime_type 70 | except KeyError: 71 | raise Exception("Invalid mimetype") 72 | 73 | @property 74 | def no_auth(self): 75 | """Create a copy of the current ModisAPI without auth. Needed for some CMR 76 | resources. 77 | """ 78 | no_auth_session = copy.deepcopy(self) 79 | no_auth_session.session.auth = None 80 | return no_auth_session 81 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | share/python-wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | MANIFEST 28 | 29 | # PyInstaller 30 | # Usually these files are written by a python script from a template 31 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 32 | *.manifest 33 | *.spec 34 | 35 | # Installer logs 36 | pip-log.txt 37 | pip-delete-this-directory.txt 38 | 39 | # Unit test / coverage reports 40 | htmlcov/ 41 | .tox/ 42 | .nox/ 43 | .coverage 44 | .coverage.* 45 | .cache 46 | nosetests.xml 47 | coverage.xml 48 | *.cover 49 | *.py,cover 50 | .hypothesis/ 51 | .pytest_cache/ 52 | cover/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | .pybuilder/ 76 | target/ 77 | 78 | # Jupyter Notebook 79 | .ipynb_checkpoints 80 | 81 | # IPython 82 | profile_default/ 83 | ipython_config.py 84 | 85 | # pyenv 86 | # For a library or package, you might want to ignore these files since the code is 87 | # intended to run in multiple environments; otherwise, check them in: 88 | # .python-version 89 | 90 | # pipenv 91 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 
92 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
93 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
94 | # install all needed dependencies.
95 | #Pipfile.lock
96 |
97 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
98 | __pypackages__/
99 |
100 | # Celery stuff
101 | celerybeat-schedule
102 | celerybeat.pid
103 |
104 | # SageMath parsed files
105 | *.sage.py
106 |
107 | # Environments
108 | .env
109 | .venv
110 | env/
111 | venv/
112 | ENV/
113 | env.bak/
114 | venv.bak/
115 |
116 | # Spyder project settings
117 | .spyderproject
118 | .spyproject
119 |
120 | # Rope project settings
121 | .ropeproject
122 |
123 | # mkdocs documentation
124 | /site
125 |
126 | # mypy
127 | .mypy_cache/
128 | .dmypy.json
129 | dmypy.json
130 |
131 | # Pyre type checker
132 | .pyre/
133 |
134 | # pytype static type analyzer
135 | .pytype/
136 |
137 | # Cython debug symbols
138 | cython_debug/
139 |
140 | # Any files that may have been downloaded in project folder
141 | *.hdf
142 | *.xml
143 |
144 | .vscode/*
145 |
146 | **/.DS_Store
147 | *.nc
--------------------------------------------------------------------------------
/modis_tools/models.py:
--------------------------------------------------------------------------------
1 | """Return classes from API requests."""
2 |
3 | from datetime import datetime
4 | from typing import Any, Dict, List, Optional
5 |
6 | from pydantic import field_validator, AnyUrl, BaseModel, HttpUrl
7 |
8 |
9 | # Shared structure
10 | class ApiLink(BaseModel):
11 |     rel: HttpUrl
12 |     hreflang: str
13 |     href: AnyUrl
14 |     type: Optional[str] = None
15 |
16 |     @field_validator("rel", mode="before")
17 |     @classmethod
18 |     def strip_trailing_hash(cls, val: str) -> str:
19 |         return val.rstrip("#")  # remove trailing hashes before validating
20 |
21 |     @field_validator("href", mode="before")
22 |     @classmethod
23 |     def convert_spaces(cls, v: AnyUrl) -> str:
24 |         """Spaces in links are problematic; this validator encodes them."""
25 |         return v.replace(" ", "%20")
26 |
27 |
28 | class ApiEntry(BaseModel):
29 |     """Shared core fields for API entries."""
30 |
31 |     id: str
32 |     title: str
33 |     dataset_id: str
34 |     coordinate_system: str
35 |     time_start: str
36 |     updated: Optional[datetime] = None
37 |     links: List[Any]
38 |
39 |
40 | class ApiEntryExtended(ApiEntry):
41 |     """
42 |     Extends base ApiEntry to include all information returned for entries from
43 |     the MODIS API
44 |     """
45 |
46 |     browse_flag: bool
47 |     data_center: str
48 |     online_access_flag: bool
49 |     original_format: str
50 |
51 |
52 | class ApiFeed(BaseModel):
53 |     # id: HttpUrl - probably not useful and raises error for very long queries
54 |     updated: datetime
55 |     title: str
56 |     entry: List[Any]
57 |
58 |
59 | # Resource links
60 | class CollectionLink(ApiLink):
61 |     pass
62 |
63 |
64 | class GranuleLink(ApiLink):
65 |     inherited: Optional[bool] = None
66 |     type: Optional[str] = None
67 |
68 |
69 | # Resource entries
70 | class Collection(ApiEntry):
71 |     """Core fields for collections."""
72 |
73 |     processing_level_id: str
74 |     short_name: str
75 |     summary: str
76 |     version_id: str
77 |     links: List[CollectionLink]
78 |
79 |
80 | class CollectionExtended(ApiEntryExtended, Collection):
81 |     """
82 |     Extends base Collection to include all information returned for collections from
83 |     the MODIS API.
84 |     """
85 |
86 |     archive_center: str
87 |     boxes: List[str]
88 |     has_formats: bool
89 |     has_spatial_subsetting: bool
90 |     has_temporal_subsetting: bool
91 |     has_transforms: bool
92 |     has_variables: bool
93 |     orbit_parameters: Dict[Any, Any]
94 |     organizations: List[str]
95 |
96 |
97 | class Granule(ApiEntry):
98 |     """Core fields for granules."""
99 |
100 |     cloud_cover: Optional[str] = None
101 |     collection_concept_id: str
102 |     day_night_flag: Optional[str] = None
103 |     granule_size: Optional[float] = None
104 |     polygons: Optional[List[Any]] = None
105 |     producer_granule_id: Optional[str] = None
106 |     time_end: datetime
107 |     links: List[GranuleLink]
108 |
109 |
110 | class GranuleExtended(ApiEntryExtended, Granule):
111 |     """
112 |     Extends base Granule to include all information returned for granules from
113 |     the MODIS API.
114 |     """
115 |
116 |     pass
117 |
118 |
119 | # Resource feeds
120 | class CollectionFeed(ApiFeed):
121 |     entry: List[Collection]
122 |
123 |
124 | class GranuleFeed(ApiFeed):
125 |     entry: List[Granule]
126 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing
2 |
3 | When contributing to this repository, please first discuss the change you wish to make via issues
4 | with the owners of this repository before making a change.
5 |
6 | ## Issue Reporting Process
7 |
8 | 1. Use the provided issue reporting template to report your issue, taking care to complete each step.
9 | 2. If you would like to propose a feature or have a question regarding documentation, you may submit an issue in the format you please.
10 |
11 | ## Pull Request Process
12 |
13 | 1. Ensure any install or build dependencies are removed before the end of the layer when doing a
14 |    build.
15 | 2. Update the README.md with details of changes to the interface, this includes new environment
16 |    variables, exposed ports, useful file locations and container parameters.
17 | 3. Increase the version numbers in any example files and the README.md to the new version that this
18 |    Pull Request would represent. The versioning scheme we use is [SemVer](http://semver.org/).
19 | 4. You may merge the Pull Request in once you have the sign-off from one of the code-owners, or if you
20 |    do not have permission to do that, you may request the second reviewer to merge it for you.
21 |
22 | ## Code of Conduct
23 |
24 | ### Our Pledge
25 |
26 | In the interest of fostering an open and welcoming environment, we as
27 | contributors and maintainers pledge to making participation in our project and
28 | our community a harassment-free experience for everyone, regardless of age, body
29 | size, disability, ethnicity, gender identity and expression, level of experience,
30 | nationality, personal appearance, race, religion, or sexual identity and
31 | orientation.
32 | 33 | ### Our Standards 34 | 35 | Examples of behavior that contributes to creating a positive environment 36 | include: 37 | 38 | * Using welcoming and inclusive language 39 | * Being respectful of differing viewpoints and experiences 40 | * Gracefully accepting constructive criticism 41 | * Focusing on what is best for the community 42 | * Showing empathy towards other community members 43 | 44 | ### Our Responsibilities 45 | 46 | Project maintainers are responsible for clarifying the standards of acceptable 47 | behavior and are expected to take appropriate and fair corrective action in 48 | response to any instances of unacceptable behavior. 49 | 50 | Project maintainers have the right and responsibility to remove, edit, or 51 | reject comments, commits, code, wiki edits, issues, and other contributions 52 | that are not aligned to this Code of Conduct, or to ban temporarily or 53 | permanently any contributor for other behaviors that they deem inappropriate, 54 | threatening, offensive, or harmful. 55 | 56 | ### Scope 57 | 58 | This Code of Conduct applies both within project spaces and in public spaces 59 | when an individual is representing the project or its community. Examples of 60 | representing a project or community include using an official project e-mail 61 | address, posting via an official social media account, or acting as an appointed 62 | representative at an online or offline event. Representation of a project may be 63 | further defined and clarified by project maintainers. 64 | 65 | ### Enforcement 66 | 67 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 68 | reported by contacting the project team at datascience@fraym.io. All 69 | complaints will be reviewed and investigated and will result in a response that 70 | is deemed necessary and appropriate to the circumstances. The project team is 71 | obligated to maintain confidentiality with regard to the reporter of an incident. 72 | Further details of specific enforcement policies may be posted separately. 73 | 74 | Project maintainers who do not follow or enforce the Code of Conduct in good 75 | faith may face temporary or permanent repercussions as determined by other 76 | members of the project's leadership. 
77 | 78 | ### Attribution 79 | 80 | This Code of Conduct is adapted from the Contributor Covenant, version 1.4, 81 | available at [contributor convenant](http://contributor-covenant.org/version/1/4) 82 | 83 | [version]: 84 | 85 | -------------------------------------------------------------------------------- /tests/test_auth.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from datetime import datetime, timedelta 4 | 5 | from modis_tools.auth import ModisSession, has_download_cookies 6 | 7 | 8 | class TestModisSession: 9 | def test_creates_session(self): 10 | modis = ModisSession("test user", "test password") 11 | assert modis.session 12 | 13 | def test_no_credentials_raises_exception(self): 14 | expected = Exception 15 | 16 | with pytest.raises(expected): 17 | ModisSession() 18 | 19 | 20 | class TestDownloadCookies: 21 | @pytest.fixture() 22 | def session_with_cookies(self): 23 | modis = ModisSession("test user", "test password") 24 | 25 | time = datetime.now() + timedelta(hours=9) 26 | 27 | modis.session.cookies.set( 28 | "urs_user_already_logged", 29 | value="yes", 30 | domain=".earthdata.nasa.gov", 31 | expires=datetime.timestamp(time), 32 | ) 33 | modis.session.cookies.set( 34 | "DATA", value="fake value,", domain="e4ftl01.cr.usgs.gov" 35 | ) 36 | modis.session.cookies.set( 37 | "_urs-gui_session", 38 | value="fake value", 39 | domain="urs.earthdata.nasa.gov", 40 | expires=datetime.timestamp(time), 41 | ) 42 | 43 | return modis.session 44 | 45 | def test_no_cookies_returns_false(self): 46 | modis = ModisSession("test user", "test password") 47 | 48 | expected = False 49 | 50 | assert has_download_cookies(modis.session) == expected 51 | 52 | def test_correct_cookies_return_true(self, session_with_cookies): 53 | expected = True 54 | 55 | assert has_download_cookies(session_with_cookies) == expected 56 | 57 | def test_expired_first_cookie_return_false(self, session_with_cookies): 58 | time = datetime.now() + timedelta(hours=-9) 59 | 60 | session_with_cookies.cookies.set( 61 | "urs_user_already_logged", 62 | value="yes", 63 | domain=".earthdata.nasa.gov", 64 | expires=datetime.timestamp(time), 65 | ) 66 | 67 | expected = False 68 | 69 | assert has_download_cookies(session_with_cookies) == expected 70 | 71 | def test_expired_gui_cookie_return_false(self, session_with_cookies): 72 | time = datetime.now() + timedelta(hours=-9) 73 | 74 | session_with_cookies.cookies.set( 75 | "_urs-gui_session", 76 | value="fake value", 77 | domain="urs.earthdata.nasa.gov", 78 | expires=datetime.timestamp(time), 79 | ) 80 | 81 | expected = False 82 | 83 | assert has_download_cookies(session_with_cookies) == expected 84 | 85 | def test_incorrect_earthdata_domain_return_false(self, session_with_cookies): 86 | time = datetime.now() + timedelta(hours=9) 87 | 88 | session_with_cookies.cookies.set( 89 | "urs_user_already_logged", 90 | value="yes", 91 | domain="wrong.url", 92 | expires=datetime.timestamp(time), 93 | ) 94 | 95 | expected = False 96 | 97 | assert has_download_cookies(session_with_cookies) == expected 98 | 99 | def test_logged_in_value_no_returns_false(self, session_with_cookies): 100 | time = datetime.now() + timedelta(hours=9) 101 | 102 | session_with_cookies.cookies.set( 103 | "urs_user_already_logged", 104 | value="no", 105 | domain=".earthdata.nasa.gov", 106 | expires=datetime.timestamp(time), 107 | ) 108 | 109 | expected = False 110 | 111 | assert has_download_cookies(session_with_cookies) == expected 112 | 113 | def 
test_incorrect_data_domain_returns_false(self, session_with_cookies): 114 | session_with_cookies.cookies.set( 115 | "DATA", value="fake value,", domain="wrong.url" 116 | ) 117 | 118 | expected = False 119 | 120 | assert has_download_cookies(session_with_cookies) == expected 121 | 122 | def test_incorrect_gui_domain_returns_false(self, session_with_cookies): 123 | time = datetime.now() + timedelta(hours=9) 124 | 125 | session_with_cookies.cookies.set( 126 | "_urs-gui_session", 127 | value="fake value", 128 | domain="wrong.url", 129 | expires=datetime.timestamp(time), 130 | ) 131 | 132 | expected = False 133 | 134 | assert has_download_cookies(session_with_cookies) == expected 135 | -------------------------------------------------------------------------------- /modis_tools/auth.py: -------------------------------------------------------------------------------- 1 | """Modis authentication functions.""" 2 | 3 | import stat 4 | from datetime import datetime 5 | from netrc import netrc 6 | from pathlib import Path 7 | from typing import Optional, Union 8 | 9 | from requests import sessions 10 | from requests.auth import HTTPBasicAuth 11 | 12 | from .constants.urls import URLs 13 | 14 | 15 | class ModisSession: 16 | """Auth session for querying and downloading MODIS data.""" 17 | 18 | session: sessions.Session 19 | 20 | def __init__( 21 | self, 22 | username: Optional[str] = None, 23 | password: Optional[str] = None, 24 | auth: Optional[HTTPBasicAuth] = None, 25 | ): 26 | self.username = username 27 | self.password = password 28 | self.session = sessions.Session() 29 | if auth: 30 | self.session.auth = auth 31 | elif username is not None and password is not None: 32 | self.session.auth = HTTPBasicAuth(username, password) 33 | else: 34 | try: 35 | username, _, password = netrc().authenticators(URLs.URS.value) 36 | except FileNotFoundError: 37 | raise Exception( 38 | ( 39 | "Unable to create authenticated session." 40 | "Likely that username and password not found" 41 | ) 42 | ) 43 | 44 | self.session.auth = HTTPBasicAuth(username, password) 45 | 46 | def __enter__(self): 47 | return self.session 48 | 49 | def __exit__(self, exc_type, exc_value, exc_traceback): 50 | self.session.close() 51 | 52 | 53 | def has_download_cookies(session): 54 | """Check if session has valid cookies for CMR API using assertions.""" 55 | cookies = {cookie.name: cookie for cookie in session.cookies} 56 | try: 57 | logged_in = cookies["urs_user_already_logged"] 58 | assert logged_in.domain == URLs.EARTHDATA.value 59 | assert datetime.fromtimestamp(logged_in.expires) > datetime.now() 60 | assert logged_in.value == "yes" 61 | if "DATA" in cookies: 62 | # specific to the cookies for LP DAAC source 63 | data = cookies["DATA"] 64 | assert data.domain == URLs.RESOURCE.value 65 | elif "CIsForCookie_OPS" in cookies: 66 | data = cookies["CIsForCookie_OPS"] 67 | # specific to the cookies for NSIDC DAAC source 68 | assert data.domain == URLs.NSIDC_RESOURCE.value 69 | else: 70 | raise KeyError( 71 | "Data source not recognized. Please open an issue informing us of the desired data source." 
72 | ) 73 | 74 | gui = cookies["_urs-gui_session"] 75 | assert gui.domain == URLs.URS.value 76 | assert datetime.fromtimestamp(gui.expires) > datetime.now() 77 | 78 | return True 79 | except (KeyError, AssertionError): 80 | return False 81 | 82 | 83 | def _write_netrc(file: Union[str, Path], permissions: dict): 84 | with open(file, "w") as handle: 85 | for host, (un, _, pw) in permissions.items(): 86 | domain = "default" if host == "default" else f"machine {host}" 87 | handle.write(f"{domain}\nlogin {un}\npassword {pw}\n") 88 | 89 | 90 | def add_earthdata_netrc(username: str, password: str, update: bool = True): 91 | """Write Modis permissions to netrc file. 92 | 93 | :param username Earthdata account username 94 | :type str 95 | 96 | :param password Earthdata account password 97 | :type str 98 | 99 | :param update whether to update if Earthdata entry already exists in netrc 100 | :type bool, default True 101 | """ 102 | netrc_file = Path.home() / ".netrc" 103 | permissions = {} 104 | if netrc_file.exists(): 105 | existing = netrc().hosts 106 | if URLs.URS.value in existing and not update: 107 | return 108 | permissions = existing 109 | permissions[URLs.URS.value] = (username, None, password) 110 | _write_netrc(netrc_file, permissions) 111 | 112 | owner_read_write = stat.S_IFREG | stat.S_IRUSR | stat.S_IWUSR 113 | netrc_file.chmod(owner_read_write) # Limit permissions to owner 114 | 115 | 116 | def remove_earthdata_netrc(): 117 | """ 118 | Remove the netrc entry for Earthdata, if it exists. 119 | """ 120 | netrc_file = Path.home() / ".netrc" 121 | if not netrc_file.exists(): 122 | return 123 | existing = netrc().hosts 124 | existing.pop(URLs.URS.value, None) 125 | if not existing: 126 | netrc_file.unlink() 127 | else: 128 | _write_netrc(netrc_file, existing) 129 | -------------------------------------------------------------------------------- /modis_tools/resources.py: -------------------------------------------------------------------------------- 1 | """Classes to use MODIS API to download satellite data.""" 2 | 3 | import json 4 | from typing import Any, Iterator, List, Optional 5 | 6 | from .api import ModisApi, Sessions 7 | from .decorators import params_args 8 | from .models import Collection, CollectionFeed, Granule, GranuleFeed 9 | from .request_helpers import DateParams, SpatialQuery 10 | 11 | 12 | class CollectionApi(ModisApi): 13 | """API for MODIS's 'collections' resource""" 14 | 15 | resource: str = "collections" 16 | 17 | def __init__( 18 | self, 19 | session: Optional[Sessions] = None, 20 | username: Optional[str] = None, 21 | password: Optional[str] = None, 22 | ): 23 | super().__init__(session=session, username=username, password=password) 24 | 25 | def query(self, **kwargs) -> List[Collection]: 26 | resp = self.no_auth.get(params=kwargs) 27 | try: 28 | collection_feed = CollectionFeed(**resp.json()["feed"]) 29 | except (json.JSONDecodeError, KeyError, IndexError) as err: 30 | raise Exception("Error in querying collections") from err 31 | return collection_feed.entry 32 | 33 | 34 | DEFAULT_GRANULE_PARAMS = { 35 | "downloadable": "true", 36 | "page_size": 2000, 37 | "sort_key": "-start_date", 38 | } 39 | 40 | 41 | class GranuleApi(ModisApi): 42 | """API for MODIS's 'granules' resource""" 43 | 44 | resource: str = "granules" 45 | params: dict = DEFAULT_GRANULE_PARAMS.copy() 46 | 47 | def __init__( 48 | self, 49 | session: Optional[Sessions] = None, 50 | username: Optional[str] = None, 51 | password: Optional[str] = None, 52 | ): 53 | super().__init__(session=session, 
username=username, password=password) 54 | 55 | @classmethod 56 | def from_collection( 57 | cls, collection: Collection, session: Optional[Sessions] = None 58 | ) -> "GranuleApi": 59 | """Create granule client from Collection using concept_id.""" 60 | granule = cls(session=session) 61 | granule.params["concept_id"] = collection.id 62 | return granule 63 | 64 | @params_args(DateParams, SpatialQuery) 65 | def query( 66 | self, 67 | start_date: Any = None, 68 | end_date: Any = None, 69 | time_delta: Any = None, 70 | spatial: Any = None, 71 | bounding_box: Any = None, 72 | limit: Optional[int] = None, 73 | **kwargs, 74 | ) -> Iterator[Granule]: 75 | """Query granules. Yields a generator of matching granules 76 | Default parameters can be overridden: 77 | downloadable: true 78 | page_size: 2000 79 | 80 | :param start_date start od date query 81 | :type Any will attempt to parse to datetime 82 | 83 | :param end_date end of date query 84 | :type Any will attempt to parse to datetime 85 | 86 | :param time_delta time difference, if one of start or end are defined 87 | :type Any will attempt to parse to timedelta 88 | 89 | :param spatial spatial intersection query with granules. Will parse 90 | ogr.Geometry's, shapely objects, GeoJSON features/geometries. 91 | Geometries must be in longitude, latitude 92 | :type Any will attempt to parse to coordinate string 93 | 94 | :param bbox spatial query using bounding box. Will parse ogr.Geometry's, 95 | shapely objects, GeoJSON features/geometries, or list of coordinates 96 | in the format (xmin, ymin, xmax, ymax). Geometries must be in longitude, 97 | latitude 98 | :type Any 99 | 100 | :param limit maximum number of results to return. If None, all results are 101 | returned 102 | :type int 103 | 104 | :param kwargs any additional arguments will be passed as query params. For 105 | additional options see: 106 | https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html 107 | 108 | :rtype generator of granules 109 | """ 110 | params = kwargs.pop("params", {}) 111 | params = {**(self.params or {}), **kwargs, **params} 112 | yielded = 0 113 | while not limit or yielded < limit: 114 | try: 115 | resp = self.no_auth.get(params=params, auth=None) 116 | feed = resp.json()["feed"] 117 | granule_feed = GranuleFeed(**feed) 118 | except (json.JSONDecodeError, KeyError, IndexError) as err: 119 | raise Exception(f"{resp},{resp.json()}") from err 120 | granules = granule_feed.entry 121 | if limit: 122 | granules = granules[: limit - yielded] 123 | for granule in granules: 124 | yield granule 125 | yielded += len(granules) 126 | 127 | # Empty "CMR-Search-After" means the end of the query 128 | # https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html#search-after 129 | if "CMR-Search-After" not in resp.headers: 130 | break 131 | self.session.headers["CMR-Search-After"] = resp.headers["CMR-Search-After"] 132 | self.session.headers.pop("CMR-Search-After", None) 133 | -------------------------------------------------------------------------------- /tests/test_request_helpers.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from datetime import timedelta 3 | 4 | import pytest 5 | 6 | try: 7 | import shapely 8 | import shapely.geometry 9 | 10 | SHAPELY_TYPES = ( 11 | shapely.geometry.GeometryCollection, 12 | shapely.geometry.LineString, 13 | shapely.geometry.LinearRing, 14 | shapely.geometry.MultiPolygon, 15 | shapely.geometry.Point, 16 | shapely.geometry.Polygon, 17 | ) 18 | except ImportError: 19 | ... 
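
# Note (editorial sketch, not part of the original test module): GranuleApi.query in
# modis_tools/resources.py above is decorated with @params_args(DateParams, SpatialQuery).
# The decorator (modis_tools/decorators.py) pops any keyword arguments that match the
# __init__ signatures of those helpers, instantiates them, and merges their to_dict()
# output into the single `params` dict sent to the CMR search endpoint. The function
# below mirrors that flow; it assumes both helpers implement RequestsArg.to_dict(), and
# the argument values are hypothetical examples.
def _illustrate_params_args_flow():
    from inspect import signature

    from modis_tools.request_helpers import DateParams, SpatialQuery

    query_kwargs = {
        "start_date": "2020-01-01",
        "end_date": "2020-02-01",
        "spatial": None,
        "bounding_box": (2.1, 4.0, 15.3, 14.3),
        "short_name": "MOD13A2",  # unmatched kwargs pass through to the request untouched
    }
    params = {}
    for helper in (DateParams, SpatialQuery):
        init_args = {
            name: query_kwargs.pop(name)
            for name in signature(helper.__init__).parameters
            if name in query_kwargs
        }
        if init_args:
            params.update(helper(**init_args).to_dict())
    return {**params, **query_kwargs}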
20 | 21 | try: 22 | from osgeo import ogr 23 | except ImportError: 24 | ... 25 | 26 | from modis_tools.request_helpers import DateParams, SpatialQuery 27 | 28 | NEED_OSGEO = "Requires installation of osgeo" 29 | NEED_SHAPELY = "Requires installation of shapely" 30 | 31 | 32 | class TestDateParams: 33 | def test_with_one_date_raises_error(self): 34 | with pytest.raises(Exception): 35 | DateParams() 36 | 37 | def test_time_delta_is_correct(self): 38 | DateParams("2005-01-01", "2005-01-02", timedelta(days=1)) 39 | 40 | def test_start_after_end_raises_error(self): 41 | with pytest.raises(Exception): 42 | DateParams("2005-01-01", "2005-01-02", timedelta(days=2)) 43 | 44 | def test_can_infer_time_delta(self): 45 | d = DateParams("2005-01-01", "2005-01-02") 46 | assert d.time_delta == timedelta(days=1) 47 | 48 | 49 | class TestSpatialQuery: 50 | @pytest.mark.skipif("shapely" not in sys.modules, reason=NEED_SHAPELY) 51 | def test_can_update_shapely_types(self): 52 | for t in SHAPELY_TYPES: 53 | assert issubclass( 54 | t, shapely.geometry.base.BaseGeometry 55 | ), f"Shapely no longer considers {t} as a `BaseGeometry`. `SpatialQuery()` must be updated accordingly" 56 | 57 | def test_can_parse_nga_list_bbox(self): 58 | bbox = [2.1448863675, 4.002583177, 15.289420717, 14.275061098] 59 | query = SpatialQuery(spatial=None, bounding_box=bbox) 60 | assert query.coordinates == "2.1448863675,4.002583177,15.289420717,14.275061098" 61 | assert query.geom_type == "bounding_box" 62 | 63 | def test_parse_mock_list_bbox(self): 64 | bbox = [1.2, 3.4, 5.6, 7.8] 65 | query = SpatialQuery(spatial=None, bounding_box=bbox) 66 | assert query.coordinates == "1.2,3.4,5.6,7.8" 67 | assert query.geom_type == "bounding_box" 68 | 69 | def test_parse_mock_tuple_bbox(self): 70 | bbox = (1.2, 3.4, 5.6, 7.8) 71 | query = SpatialQuery(spatial=None, bounding_box=bbox) 72 | assert query.coordinates == "1.2,3.4,5.6,7.8" 73 | assert query.geom_type == "bounding_box" 74 | 75 | def test_parse_inverted_mock_list_bbox_fails(self): 76 | bbox = [10, 10, 0, 0] 77 | with pytest.raises(AssertionError): 78 | SpatialQuery(spatial=None, bounding_box=bbox) 79 | 80 | @pytest.mark.skipif("osgeo" not in sys.modules, reason=NEED_OSGEO) 81 | def test_can_parse_bbox_from_gdal_bbox(self): 82 | bbox = ogr.Geometry(ogr.wkbLinearRing) 83 | bbox.AddPoint(1, 1) 84 | bbox.AddPoint(1, 2) 85 | bbox.AddPoint(2, 2) 86 | bbox.AddPoint(2, 1) 87 | query = SpatialQuery(spatial=None, bounding_box=bbox) 88 | assert query.coordinates == "1.0,1.0,2.0,2.0" 89 | assert query.geom_type == "bounding_box" 90 | 91 | @pytest.mark.skipif("osgeo" not in sys.modules, reason=NEED_OSGEO) 92 | def test_can_parse_spatial_from_gdal_bbox(self): 93 | wkt = "POLYGON ((5 30, 5 33, 2 33, 2 30, 5 30))" 94 | spatial = ogr.CreateGeometryFromWkt(wkt) 95 | query = SpatialQuery(spatial=spatial, bounding_box=None) 96 | assert ( 97 | query.coordinates 98 | == "5.0000,30.0000,5.0000,33.0000,2.0000,33.0000,2.0000,30.0000,5.0000,30.0000" 99 | ) 100 | assert query.geom_type == "polygon" 101 | 102 | @pytest.mark.skipif("osgeo" not in sys.modules, reason=NEED_OSGEO) 103 | def test_can_parse_bbox_from_gdal_diamond_polygon(self): 104 | # Create a pentagon 105 | bbox = ogr.Geometry(ogr.wkbLinearRing) 106 | bbox.AddPoint(1, 2) 107 | bbox.AddPoint(2, 3) 108 | bbox.AddPoint(3, 2) 109 | bbox.AddPoint(2, 1) 110 | query = SpatialQuery(spatial=None, bounding_box=bbox) 111 | assert query.coordinates == "1.0,1.0,3.0,3.0" 112 | assert query.geom_type == "bounding_box" 113 | 114 | @pytest.mark.skipif("osgeo" not in 
sys.modules, reason=NEED_OSGEO) 115 | def test_can_parse_spatial_from_gdal_multipoligon(self): 116 | multipolygon = ogr.Geometry(ogr.wkbMultiPolygon) 117 | 118 | # Create ring #1 119 | ring1 = ogr.Geometry(ogr.wkbLinearRing) 120 | ring1.AddPoint(12.0, 63.0) 121 | ring1.AddPoint(12.0, 62.0) 122 | ring1.AddPoint(13.0, 62.0) 123 | ring1.AddPoint(13.0, 63.0) 124 | ring1.AddPoint(12.0, 63.0) 125 | 126 | # Create polygon #1 127 | poly1 = ogr.Geometry(ogr.wkbPolygon) 128 | poly1.AddGeometry(ring1) 129 | multipolygon.AddGeometry(poly1) 130 | 131 | # Create ring #2 132 | ring2 = ogr.Geometry(ogr.wkbLinearRing) 133 | ring2.AddPoint(11.0, 64.0) 134 | ring2.AddPoint(11.0, 62.0) 135 | ring2.AddPoint(12.5, 62.0) 136 | ring2.AddPoint(12.5, 64.0) 137 | ring2.AddPoint(11.0, 64.0) 138 | 139 | # Create polygon #2 140 | poly2 = ogr.Geometry(ogr.wkbPolygon) 141 | poly2.AddGeometry(ring2) 142 | multipolygon.AddGeometry(poly2) 143 | 144 | query = SpatialQuery(spatial=multipolygon, bounding_box=None) 145 | assert ( 146 | query.coordinates 147 | == "11.0000,62.0000,0.0000,13.0000,62.0000,0.0000,13.0000,63.0000,0.0000,12.5000,64.0000,0.0000,11.0000,64.0000,0.0000,11.0000,62.0000,0.0000" 148 | ) 149 | assert query.geom_type == "polygon" 150 | 151 | @pytest.mark.skipif("shapely" not in sys.modules, reason=NEED_SHAPELY) 152 | def test_can_parse_bbox_from_shapely_box(self): 153 | bbox = shapely.geometry.box(2, 30, 5, 33) 154 | assert isinstance(bbox, shapely.geometry.base.BaseGeometry) 155 | 156 | query = SpatialQuery(spatial=None, bounding_box=bbox) 157 | assert query.coordinates == "2.0,30.0,5.0,33.0" 158 | assert query.geom_type == "bounding_box" 159 | 160 | @pytest.mark.skipif("shapely" not in sys.modules, reason=NEED_SHAPELY) 161 | def test_can_parse_bbox_from_shapely_wkt(self): 162 | bbox = shapely.wkt.loads("POLYGON ((5 30, 5 33, 2 33, 2 30, 5 30))") 163 | assert isinstance(bbox, shapely.geometry.base.BaseGeometry) 164 | 165 | query = SpatialQuery(spatial=None, bounding_box=bbox) 166 | assert query.coordinates == "2.0,30.0,5.0,33.0" 167 | assert query.geom_type == "bounding_box" 168 | 169 | @pytest.mark.skipif("shapely" not in sys.modules, reason=NEED_SHAPELY) 170 | def test_can_parse_spatial_from_shapely_wkt(self): 171 | spatial = shapely.wkt.loads("POLYGON ((5 30, 5 33, 2 33, 2 30, 5 30))") 172 | assert isinstance(spatial, shapely.geometry.base.BaseGeometry) 173 | 174 | query = SpatialQuery(spatial=spatial, bounding_box=None) 175 | assert ( 176 | query.coordinates 177 | == "5.0000,30.0000,5.0000,33.0000,2.0000,33.0000,2.0000,30.0000,5.0000,30.0000" 178 | ) 179 | assert query.geom_type == "polygon" 180 | -------------------------------------------------------------------------------- /tests/test_models.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from modis_tools.models import CollectionFeed, ApiLink 4 | 5 | 6 | class TestApiLink: 7 | @pytest.fixture 8 | def link_with_space(self) -> dict: 9 | return { 10 | "inherited": True, 11 | "rel": "http://esipfed.org/ns/fedsearch/1.1/data#", 12 | "hreflang": "en-US", 13 | "href": "https://oceandata.sci.gsfc.nasa.gov/directdataaccess/Level-3 Binned/Aqua-MODIS/", 14 | } 15 | 16 | def test_space_is_removed_from_link(self, link_with_space: dict): 17 | validated = ApiLink(**link_with_space) 18 | assert isinstance(validated.href.path, str) 19 | assert " " not in validated.href.path 20 | assert "%20" in validated.href.path 21 | 22 | 23 | class TestCollectionFeed: 24 | @pytest.fixture 25 | def 
example_json_response(self) -> dict: 26 | """ 27 | This is a response from a query with the CollectionApi class. See the 28 | `NOTE` below indicating a problematic link. In this test class, we 29 | use this example response to ensure that we can handle spaces in links. 30 | """ 31 | return { 32 | "feed": { 33 | "entry": [ 34 | { 35 | "archive_center": "NASA/GSFC/SED/ESD/GCDC/OB.DAAC", 36 | "boxes": ["-90 -180 90 180"], 37 | "browse_flag": False, 38 | "cloud_hosted": False, 39 | "collection_data_type": "SCIENCE_QUALITY", 40 | "consortiums": ["GEOSS", "EOSDIS"], 41 | "coordinate_system": "CARTESIAN", 42 | "data_center": "OB_DAAC", 43 | "dataset_id": "Aqua MODIS Global Binned Chlorophyll (CHL) " 44 | "Data, version R2022.0", 45 | "has_formats": False, 46 | "has_spatial_subsetting": False, 47 | "has_temporal_subsetting": False, 48 | "has_transforms": False, 49 | "has_variables": False, 50 | "id": "C2330511478-OB_DAAC", 51 | "links": [ 52 | { 53 | # NOTE the space in the href is problematic 54 | "href": "https://oceandata.sci.gsfc.nasa.gov/directdataaccess/Level-3 " 55 | "Binned/Aqua-MODIS/", 56 | "hreflang": "en-US", 57 | "rel": "http://esipfed.org/ns/fedsearch/1.1/data#", 58 | }, 59 | { 60 | "href": "https://oceancolor.gsfc.nasa.gov/atbd/", 61 | "hreflang": "en-US", 62 | "rel": "http://esipfed.org/ns/fedsearch/1.1/documentation#", 63 | }, 64 | { 65 | "href": "https://oceancolor.gsfc.nasa.gov/reprocessing/", 66 | "hreflang": "en-US", 67 | "rel": "http://esipfed.org/ns/fedsearch/1.1/documentation#", 68 | }, 69 | { 70 | "href": "https://oceancolor.gsfc.nasa.gov/citations/", 71 | "hreflang": "en-US", 72 | "rel": "http://esipfed.org/ns/fedsearch/1.1/documentation#", 73 | }, 74 | { 75 | "href": "https://oceancolor.gsfc.nasa.gov/data/10.5067/AQUA/MODIS/L3B/CHL/2022", 76 | "hreflang": "en-US", 77 | "rel": "http://esipfed.org/ns/fedsearch/1.1/metadata#", 78 | }, 79 | ], 80 | "online_access_flag": True, 81 | "orbit_parameters": {}, 82 | "organizations": ["NASA/GSFC/SED/ESD/GCDC/OB.DAAC", "OBPG"], 83 | "original_format": "UMM_JSON", 84 | "platforms": ["Aqua"], 85 | "processing_level_id": "3", 86 | "service_features": { 87 | "esi": { 88 | "has_formats": False, 89 | "has_spatial_subsetting": False, 90 | "has_temporal_subsetting": False, 91 | "has_transforms": False, 92 | "has_variables": False, 93 | }, 94 | "harmony": { 95 | "has_formats": False, 96 | "has_spatial_subsetting": False, 97 | "has_temporal_subsetting": False, 98 | "has_transforms": False, 99 | "has_variables": False, 100 | }, 101 | "opendap": { 102 | "has_formats": False, 103 | "has_spatial_subsetting": False, 104 | "has_temporal_subsetting": False, 105 | "has_transforms": False, 106 | "has_variables": False, 107 | }, 108 | }, 109 | "short_name": "MODISA_L3b_CHL", 110 | "summary": "MODIS (or Moderate-Resolution Imaging " 111 | "Spectroradiometer) is a key instrument aboard " 112 | "the Terra (EOS AM) and Aqua (EOS PM) " 113 | "satellites. Terra's orbit around the Earth is " 114 | "timed so that it passes from north to south " 115 | "across the equator in the morning, while Aqua " 116 | "passes south to north over the equator in the " 117 | "afternoon. Terra MODIS and Aqua MODIS are " 118 | "viewing the entire Earth's surface every 1 to " 119 | "2 days, acquiring data in 36 spectral bands, " 120 | "or groups of wavelengths (see MODIS Technical " 121 | "Specifications). These data will improve our " 122 | "understanding of global dynamics and " 123 | "processes occurring on the land, in the " 124 | "oceans, and in the lower atmosphere. 
MODIS is " 125 | "playing a vital role in the development of " 126 | "validated, global, interactive Earth system " 127 | "models able to predict global change " 128 | "accurately enough to assist policy makers in " 129 | "making sound decisions concerning the " 130 | "protection of our environment.", 131 | "time_start": "2002-07-04T00:00:00.000Z", 132 | "title": "Aqua MODIS Global Binned Chlorophyll (CHL) " 133 | "Data, version R2022.0", 134 | "updated": "2019-10-01T00:00:00.000Z", 135 | "version_id": "R2022.0", 136 | } 137 | ], 138 | "id": "https://cmr.earthdata.nasa.gov:443/search/collections.json?short_name=MODISA_L3b_CHL&version=R2022.0", 139 | "title": "ECHO dataset metadata", 140 | "updated": "2023-11-27T17:15:07.456Z", 141 | } 142 | } 143 | 144 | def test_collection_feed_can_handle_links_with_spaces(self, example_json_response): 145 | feed = example_json_response["feed"] 146 | CollectionFeed(**feed) 147 | -------------------------------------------------------------------------------- /modis_tools/granule_handler.py: -------------------------------------------------------------------------------- 1 | from multiprocessing import cpu_count 2 | from pathlib import Path 3 | from typing import Any, Iterable, List, Optional, Tuple, Type, TypeVar, Union 4 | from urllib.parse import urlsplit 5 | 6 | from pydantic.networks import AnyUrl, HttpUrl 7 | from requests.auth import HTTPProxyAuth 8 | from requests.models import Response 9 | from tqdm import tqdm 10 | from tqdm.contrib.concurrent import process_map 11 | 12 | from modis_tools.auth import ModisSession, has_download_cookies 13 | from modis_tools.constants.urls import URLs 14 | from modis_tools.models import Granule 15 | 16 | T = TypeVar("T") 17 | 18 | 19 | class GranuleHandler: 20 | @classmethod 21 | def download_from_granules( 22 | cls, 23 | one_or_many_granules: Union[Iterable[Granule], Granule], 24 | modis_session: ModisSession, 25 | ext: Union[str, Tuple] = ("hdf", "h5", "nc", "xml"), 26 | threads: int = 1, 27 | path: Optional[str] = None, 28 | force: bool = False, 29 | ) -> List[Path]: 30 | """Download the corresponding raster file for each item in `granules` 31 | 32 | Args: 33 | one_or_many_granules (Iterator[Granule]): Either a single `Granule` 34 | object, or Several `Granule` objects as an `Iterable` that have 35 | a `.link` property 36 | modis_session (ModisSession): A logged in `ModisSession` object 37 | ext (Union[str, Tuple]): Specify the permitted file extensions. If nothing is passed 38 | defaults to all of ("hdf", "h5", "nc", "xml"). 39 | threads (int, optional): Specify how many concurrent processes or 40 | threads should be used while downloading. s an integer, 41 | specifying the maximum number of concurrently running workers. 42 | If 1 is given, no joblib parallelism is used at all, which is 43 | useful for debugging. If set to -1, all CPUs are used. For 44 | `threads` below -1, (n_cpus + 1 + n_jobs) are used. For example 45 | with `threads=-2`, all CPUs but one are used. 46 | path (Optional[str], optional): The directory to save the file. If set 47 | None, defaults to current directory. 48 | force (bool, optional): download file regardless if it exists and 49 | matches remote content size. Defaults to False. 
50 | 51 | Returns: 52 | List[str]: Path to the newly downloaded file(s) 53 | """ 54 | granules = cls._coerce_to_list(one_or_many_granules, Granule) 55 | urls = [cls.get_url_from_granule(x, ext) for x in granules] 56 | if threads in (None, 0, 1): 57 | return cls.download_from_urls( 58 | urls, modis_session=modis_session, path=path, force=force, disable=False 59 | ) 60 | n_threads = threads if threads > 1 else cpu_count() + 1 + threads 61 | result = process_map( 62 | cls.wrapper_download_from_urls, 63 | ((u, modis_session, path, force) for u in urls), 64 | max_workers=n_threads, 65 | total=len(urls), 66 | desc="Downloading", 67 | position=0, 68 | unit="file", 69 | ) 70 | if isinstance(result[0], list): 71 | result = [item for sublist in result for item in sublist] 72 | return result 73 | 74 | @classmethod 75 | def wrapper_download_from_urls(cls, args: Iterable[Any]) -> List[Path]: 76 | """wrapper to unpack arguments for `download_from_urls` 77 | 78 | Args: 79 | args (Iterable[Any]): Iterator to unpack 80 | """ 81 | return cls.download_from_urls(*args) 82 | 83 | @staticmethod 84 | def _coerce_to_list( 85 | possible_list: Union[Iterable[Any], Any], obj_type: Type[T] 86 | ) -> Iterable[T]: 87 | """Cast possible single object into list 88 | 89 | Args: 90 | possible_list (Union[Iterable[Any], Any]): Variable to be converted to a list 91 | obj_type (Type): The type of the item within the returned list. 92 | Even though `obj_type` isn't used in implementation, it's key in 93 | determining return type 94 | """ 95 | if isinstance(possible_list, obj_type): 96 | possible_list = [possible_list] 97 | return possible_list 98 | 99 | @staticmethod 100 | def get_url_from_granule(granule: Granule, ext: Union[str, Tuple]) -> HttpUrl: 101 | """Return link for file extension from Earthdata resource.""" 102 | for link in granule.links: 103 | if link.href.host in [ 104 | URLs.RESOURCE.value, 105 | URLs.NSIDC_RESOURCE.value, 106 | URLs.MOD11A2_V061_RESOURCE.value, 107 | URLs.LAADS_RESOURCE.value, 108 | URLs.LAADS_CLOUD_RESOURCE.value, 109 | URLs.MODISA_L3b_CHL_V061_RESOURCE.value, 110 | URLs.MODISA_L3b_CHL_V061_RESOURCE_SCI.value, 111 | ] and link.href.path.endswith(ext): 112 | return link.href 113 | raise Exception("No matching link found") 114 | 115 | @classmethod 116 | def download_from_urls( 117 | cls, 118 | one_or_many_urls: Union[Iterable[HttpUrl], HttpUrl], 119 | modis_session: ModisSession, 120 | path: Optional[str] = None, 121 | force: bool = False, 122 | disable: bool = True, 123 | ) -> List[Path]: 124 | """Save file locally using remote name. 125 | 126 | :param path directory to save file, defaults to current directory 127 | :type str or Path 128 | 129 | :param force download file regardless if it exists and matches remote 130 | content size 131 | :type bool, default False 132 | 133 | :param disable tqdm progress bar. 
Disable if using multiprocessing 134 | :type bool, default True 135 | 136 | :returns Path to the newly downloaded file 137 | :rtype List[Path] 138 | """ 139 | urls = cls._coerce_to_list(one_or_many_urls, AnyUrl) 140 | file_paths = [] 141 | for url in tqdm(urls, disable=disable, desc="Downloading", unit="file"): 142 | req = cls._get(url, modis_session) 143 | file_path = Path(path or "") / Path(url).name 144 | content_size = int(req.headers.get("Content-Length", -1)) 145 | if ( 146 | force 147 | or not file_path.exists() 148 | or file_path.stat().st_size != content_size 149 | ): 150 | with open(file_path, "wb") as handle: 151 | for chunk in req.iter_content(chunk_size=2**20): 152 | handle.write(chunk) 153 | file_paths.append(file_path) 154 | return file_paths 155 | 156 | @classmethod 157 | def _get( 158 | cls, 159 | url: HttpUrl, 160 | modis_session: ModisSession, 161 | stream: Optional[bool] = True, 162 | ) -> Response: 163 | """ 164 | Get request for MODIS file url. Raise an error if no file content. 165 | 166 | :param stream return content as chunked stream of data 167 | :type bool 168 | 169 | :rtype request 170 | """ 171 | session = modis_session.session 172 | if not has_download_cookies(session): 173 | location = cls._get_location(url, modis_session) 174 | else: 175 | location = url 176 | req = session.get(location, stream=stream) 177 | content_size = int(req.headers.get("Content-Length", -1)) 178 | if content_size <= 1: 179 | raise FileNotFoundError("No file content found") 180 | return req 181 | 182 | @staticmethod 183 | def _get_location(url: HttpUrl, modis_session: ModisSession) -> str: 184 | """Make initial request to fetch file location from header.""" 185 | session = modis_session.session 186 | split_result = urlsplit(url) 187 | https_url = split_result._replace(scheme="https").geturl() 188 | if url.host == URLs.LAADS_RESOURCE.value: 189 | location_resp = session.get(https_url, allow_redirects=True) 190 | location = location_resp.url # ends up being the same as https_url 191 | elif url.host == URLs.MODISA_L3b_CHL_V061_RESOURCE_SCI.value: 192 | location_resp = session.get(https_url, allow_redirects=True) 193 | # go to last re-direct location 194 | location = location_resp.history[-1].headers.get("Location") 195 | else: 196 | location_resp = session.get(https_url, allow_redirects=False) 197 | if location_resp.status_code == 401: 198 | # try using ProxyAuth if BasicAuth returns 401 (unauthorized) 199 | location_resp = session.get( 200 | https_url, 201 | allow_redirects=False, 202 | auth=HTTPProxyAuth(modis_session.username, modis_session.password), 203 | ) 204 | location = location_resp.headers.get("Location") 205 | if not location: 206 | raise FileNotFoundError("No file location found") 207 | return location 208 | -------------------------------------------------------------------------------- /modis_tools/request_helpers.py: -------------------------------------------------------------------------------- 1 | """Classes and wrapper for grouped or preprocessed parameters.""" 2 | 3 | import re 4 | from datetime import datetime, timedelta 5 | from typing import Any, Optional 6 | 7 | from dateutil import parser 8 | 9 | try: 10 | from osgeo.ogr import Geometry 11 | except ImportError: 12 | Geometry = type(None) 13 | 14 | try: 15 | _HAS_SHAPELY = True 16 | import shapely.geometry 17 | from shapely import wkt 18 | 19 | SHAPELY_TYPES = shapely.geometry.base.BaseGeometry 20 | except ImportError: 21 | _HAS_SHAPELY = False 22 | SHAPELY_TYPES = (type(None),) 23 | 24 | 25 | class RequestsArg: 
26 | """Base for processing arguments to requests. Require args to implement 27 | a `to_dict` method. 28 | """ 29 | 30 | def to_dict(self): 31 | """Convert args to dictionary.""" 32 | raise NotImplementedError 33 | 34 | 35 | class CollectionDoiParams(RequestsArg): 36 | """Formated product short name and version.""" 37 | 38 | modis_doi: str = "10.5067/MODIS" 39 | 40 | def __init__(self, short_name: Optional[Any] = None, version: Optional[Any] = None): 41 | self.short_name = short_name 42 | self.version = version 43 | 44 | def to_dict(self): 45 | return {"doi": f"{self.modis_doi}/{self.short_name}.{self.version}"} 46 | 47 | 48 | class DateParams(RequestsArg): 49 | """Parsed date range with start, end, and difference.""" 50 | 51 | stftime_format = "%Y-%m-%dT%H:%M:%SZ" 52 | start_date: Optional[datetime] = None 53 | end_date: Optional[datetime] = None 54 | 55 | def __init__( 56 | self, 57 | start_date: Optional[Any] = None, 58 | end_date: Optional[Any] = None, 59 | time_delta: Optional[Any] = None, 60 | ): 61 | if all([start_date, end_date, time_delta]): 62 | self.start_date = self._parse_datetime(start_date) 63 | self.end_date = self._parse_datetime(end_date) 64 | delta = self._parse_timedelta(time_delta) 65 | if self.start_date + abs(delta) != self.end_date: 66 | raise Exception( 67 | "If all start, end, and time delta are used they must " 68 | "add up (start + delta = end)." 69 | ) 70 | elif not any([start_date, end_date]): 71 | raise Exception("One end of date range needed") 72 | else: 73 | if start_date is not None: 74 | self.start_date = self._parse_datetime(start_date) 75 | if time_delta is not None: 76 | delta = self._parse_timedelta(time_delta) 77 | self.end_date = self.start_date + abs(delta) 78 | if end_date is not None: 79 | self.end_date = self._parse_datetime(end_date) 80 | if time_delta is not None: 81 | delta = self._parse_timedelta(time_delta) 82 | self.start_date = self.end_date - abs(delta) 83 | 84 | @property 85 | def time_delta(self): 86 | """Difference between start and end dates. 
None if open ended.""" 87 | if all([self.end_date, self.start_date]): 88 | return self.end_date - self.start_date 89 | 90 | def _parse_datetime(self, obj: Any) -> datetime: 91 | """Try to parse data to datetime.""" 92 | try: 93 | if issubclass(type(obj), datetime): 94 | parsed = obj 95 | elif isinstance(obj, str): 96 | parsed = parser.parse(obj) 97 | elif isinstance(obj, (tuple, list)): 98 | parsed = datetime(*obj) 99 | else: 100 | parsed = datetime(**obj) 101 | return parsed 102 | except (ValueError, TypeError, parser.ParserError) as err: 103 | raise Exception("Could not convert date(s) to datetime") from err 104 | 105 | def _parse_timedelta(self, obj: Any) -> timedelta: 106 | """Try to parse data to timedelta.""" 107 | if issubclass(type(obj), timedelta): 108 | parsed = obj 109 | elif isinstance(obj, int): 110 | parsed = timedelta(obj) 111 | else: 112 | try: 113 | parsed = timedelta(**obj) 114 | except TypeError as err: 115 | raise Exception("Could not convert time_delta") from err 116 | return parsed 117 | 118 | def to_dict(self): 119 | """Return date range as `temporal` argument.""" 120 | start = ( 121 | "" 122 | if self.start_date is None 123 | else self.start_date.strftime(self.stftime_format) 124 | ) 125 | end = ( 126 | "" if self.end_date is None else self.end_date.strftime(self.stftime_format) 127 | ) 128 | 129 | temporal = re.sub(r"^,|,$", "/", start + "," + end) 130 | 131 | return {"temporal": temporal} 132 | 133 | 134 | class FileQuery(RequestsArg): 135 | """Format file post query.""" 136 | 137 | def to_dict(self): 138 | pass 139 | 140 | 141 | class SpatialQuery(RequestsArg): 142 | """Format spatial search query.""" 143 | 144 | geom_type: str 145 | coordinates: str 146 | 147 | def __init__(self, spatial: Any = None, bounding_box: Any = None): 148 | """Parse geometry type and coordinates from spatial query. 149 | 150 | Args: 151 | spatial (Any, optional): spatial intersection query with 152 | granules. Will parse ogr.Geometry objects, shapely objects, GeoJSON 153 | features/geometries. Geometries must be in longitude, latitude. 154 | Defaults to None. 155 | bounding_box (Any, optional): spatial query using bounding 156 | box. Will parse ogr.Geometry objects, shapely objects, GeoJSON 157 | features/geometries, or a list of coordinates in the format 158 | (xmin, ymin, xmax, ymax). Geometries must be in longitude, 159 | latitude. Defaults to None.
160 | """ 161 | if spatial: 162 | self._parse_spatial(spatial) 163 | else: 164 | self._parse_bounding_box(bounding_box) 165 | 166 | def _parse_spatial(self, spatial): 167 | # Convert everything to shapely 168 | if isinstance(spatial, dict) and _HAS_SHAPELY: 169 | if "geometry" in spatial: 170 | geom = shapely.geometry.shape(spatial["geometry"]) 171 | else: 172 | geom = shapely.geometry.shape(spatial) 173 | elif isinstance(spatial, Geometry): 174 | geom = wkt.loads(spatial.ExportToWkt()) 175 | elif isinstance(spatial, SHAPELY_TYPES): 176 | geom = spatial 177 | else: 178 | raise ValueError( 179 | f"Can't create spatial query based on provided spatial input of type {type(spatial)}; it should be a dict, ogr.Geometry or shapely.geometry type" 180 | ) 181 | if geom.geom_type in ("MultiPolygon", "GeometryCollection"): 182 | # For complex polygons/geometries use the convex hull 183 | geom = geom.convex_hull 184 | if geom.geom_type.startswith("Line"): 185 | self.geom_type = "line" 186 | else: 187 | self.geom_type = geom.geom_type.lower() 188 | 189 | self.coordinates = self._coordinate_string(geom) 190 | 191 | def _coordinate_string(self, geom): 192 | """Format coordinates to string. Regex to find pieces of coordinate sequences. 193 | For polygons with inner rings, only the outer boundary is used by taking the first 194 | sequence. 195 | """ 196 | if geom.geom_type == "Polygon": 197 | # Ensure coordinates are counter-clockwise 198 | geom = shapely.geometry.polygon.orient(geom) 199 | wkt_string = wkt.dumps(geom, rounding_precision=4) 200 | pieces = re.findall(r"\(([\d\.\-, ]+)\)", wkt_string) 201 | return pieces[0].replace(", ", ",").replace(" ", ",") 202 | 203 | def _parse_bounding_box(self, bounding_box): 204 | """ 205 | Parse coordinates from bounding box argument. 206 | """ 207 | self.geom_type = "bounding_box" 208 | if isinstance(bounding_box, (list, tuple)): 209 | assertion = "Bounding box should be (xmin, ymin, xmax, ymax)" 210 | assert ( 211 | bounding_box[0] <= bounding_box[2] and bounding_box[1] <= bounding_box[3] 212 | ), assertion 213 | coordinates = bounding_box 214 | elif isinstance(bounding_box, dict) and _HAS_SHAPELY: 215 | if "geometry" in bounding_box: 216 | coordinates = shapely.geometry.shape(bounding_box["geometry"]).bounds 217 | else: 218 | coordinates = shapely.geometry.shape(bounding_box).bounds 219 | elif isinstance(bounding_box, Geometry): 220 | xmin, xmax, ymin, ymax = bounding_box.GetEnvelope() 221 | coordinates = [xmin, ymin, xmax, ymax] 222 | elif isinstance(bounding_box, SHAPELY_TYPES): 223 | coordinates = bounding_box.bounds 224 | else: 225 | raise ValueError( 226 | f"Can't create bounding box query based on spatial input of type {type(bounding_box)}; it should be a list, tuple, dict, ogr.Geometry or shapely.geometry type" 227 | ) 228 | self.coordinates = ",".join([str(c) for c in coordinates]) 229 | 230 | def to_dict(self): 231 | return {self.geom_type: self.coordinates} 232 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document.
11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 
134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # MODIS Tools 2 | 3 | MODIS Tools is a Python library to easily (and quickly) download MODIS imagery from the NASA Earthdata platform. 4 | 5 | NASA’s Earthdata portal organizes MODIS data into collections, products, and granules. MODIS Tools provides a series of classes to search MODIS collection metadata for products, select the tiles you want, and download granules from the results of those queries. All you need are Earthdata account credentials and the desired MODIS product’s short name and version. 6 | 7 | ## Example 8 | 9 | After adding your username and password, the snippet below will download MOD13A1 granules for Nigeria for 2016, 2017, and 2018 to the current directory. 
10 | 11 | ```python 12 | from modis_tools.auth import ModisSession 13 | from modis_tools.resources import CollectionApi, GranuleApi 14 | from modis_tools.granule_handler import GranuleHandler 15 | 16 | username = "" # Update this line 17 | password = "" # Update this line 18 | 19 | # Authenticate a session 20 | session = ModisSession(username=username, password=password) 21 | 22 | # Query the MODIS catalog for collections 23 | collection_client = CollectionApi(session=session) 24 | collections = collection_client.query(short_name="MOD13A1", version="061") 25 | 26 | # Query the selected collection for granules 27 | granule_client = GranuleApi.from_collection(collections[0], session=session) 28 | 29 | # Filter the selected granules via spatial and temporal parameters 30 | nigeria_bbox = [2.1448863675, 4.002583177, 15.289420717, 14.275061098] 31 | nigeria_granules = granule_client.query(start_date="2016-01-01", end_date="2018-12-31", bounding_box=nigeria_bbox) 32 | 33 | # Download the granules 34 | GranuleHandler.download_from_granules(nigeria_granules, session) 35 | ``` 36 | 37 | ## Further Details and Options 38 | 39 | ### Authentication 40 | 41 | With username and password: 42 | 43 | ```python 44 | from modis_tools.auth import ModisSession 45 | from modis_tools.resources import CollectionApi 46 | 47 | username = "" 48 | password = "" 49 | 50 | # Reusable session 51 | session = ModisSession(username=username, password=password) 52 | collection_client = CollectionApi(session=session) 53 | # - or - 54 | collection_client = CollectionApi(username=username, password=password) 55 | ``` 56 | 57 | With session as context manager 58 | 59 | ```python 60 | ... 61 | with ModisSession(username=username, password=password) as session: 62 | collection_client = CollectionApi(session=session) 63 | ... 64 | ``` 65 | 66 | Using a netrc file, you can create clients without authentication: 67 | 68 | ```python 69 | from modis_tools.auth import add_earthdata_netrc, remove_earthdata_netrc 70 | 71 | username = "" 72 | password = "" 73 | # Create an entry for Earthdata in the ~/.netrc file, only needs to be run once 74 | add_earthdata_netrc(username, password) 75 | 76 | ... 77 | # Now sessions can be created without passing username and password explicitly 78 | session = ModisSession() 79 | granule_client = GranuleApi() 80 | 81 | # You can remove the credentials if necessary. It will only remove 82 | # the Earthdata entry 83 | remove_earthdata_netrc() 84 | ... 85 | ``` 86 | 87 | ### Query Parameters 88 | 89 | You can interact with the Earthdata Search API to browse collections and granules via the `CollectionApi` and `GranuleApi` classes respectively. Most query parameters for collections and granules listed in the [Earthdata documentation](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html) can be passed directly to either class's `query()` method. 90 | 91 | *Note: To specify a modifier for a parameter (eg. 
`parameter[option]`), you'll need to unpack it from a dictionary: `**{"parameter[option]": "value"}` rather than passing it directly as a keyword argument.* 92 | 93 | *Note: Response models for both classes' `query()` methods can be found in `modis_tools/models.py`.* 94 | 95 | ```python 96 | # Collections query returns a list of matching collections 97 | collections = collection_client.query(short_name="MOD13A1", version="061") 98 | 99 | # Create a GranuleApi from a Collection; the `concept_id` search parameter is set 100 | # to the collection 101 | granule_client = GranuleApi.from_collection(collections[0]) 102 | # Granule queries return a generator of matching granules 103 | granules = granule_client.query(start_date="2019-02-02", limit=50) 104 | ``` 105 | 106 | Some parameters will be preprocessed and formatted. You can also use the raw parameters shown in the Earthdata documentation, but you'll have to make sure the format is correct. 107 | 108 | #### Time parameters 109 | 110 | Time ranges can be defined by at least one of `start_date` and `end_date`, which can be passed as `datetime.datetime` objects, or as strings/dicts/tuples that can be parsed to `datetime` objects. `time_delta` can be a `datetime.timedelta` object or something that can be parsed to one. 111 | 112 | ```python 113 | from datetime import datetime, timedelta 114 | 115 | # Any of the following definitions work for both `start_date` and `end_date` 116 | start_date = datetime(2017, 12, 31) 117 | start_date = {"year": 2017, "month": 12, "day": 31} 118 | start_date = "2017-12-31" 119 | start_date = (2017, 12, 31) 120 | 121 | # Any of the following definitions for time_delta will create a one-year time range. 122 | # The sign of the time delta doesn't matter; the direction is determined by whether start or end 123 | # is provided 124 | time_delta = timedelta(365) 125 | time_delta = 365 # Days is the default unit for time_delta 126 | time_delta = {"weeks": 52, "days": 1} 127 | 128 | end_date = datetime(2018, 12, 31) 129 | 130 | # With the above parameters, the following three will query the same time range 131 | granules = granule_client.query(start_date=start_date, end_date=end_date) 132 | granules = granule_client.query(time_delta=time_delta, start_date=start_date) 133 | granules = granule_client.query(time_delta=time_delta, end_date=end_date) 134 | 135 | # If only one of start or end is provided, the date query is open ended 136 | granules = granule_client.query(start_date=start_date) 137 | ``` 138 | 139 | #### Spatial parameters 140 | 141 | The `spatial` and `bounding_box` parameters for collections and granules will parse ogr `Geometry` objects, shapely geometries (used by geopandas), or 142 | GeoJSON features/geometries. Multipolygons and geometry collections are converted to convex hulls for simpler queries. All spatial queries should 143 | be in `(longitude, latitude)` order. 144 | 145 | If the bounding box is a geometry object, its envelope will be calculated. As a list or tuple, the bounding box should be in the order `(xmin, ymin, xmax, ymax)`. 146 | 147 | ```python 148 | import geopandas as gpd 149 | 150 | df = gpd.read_file("/Users/leith/Desktop/dhs_mwi.geojson") 151 | geom = df.geometry[0] 152 | 153 | malawi_granules = granule_client.query(start_date="2017-01-01", end_date="2018-12-31", spatial=geom) 154 | 155 | ...
156 | from osgeo import ogr 157 | 158 | ds = ogr.GetDriverByName("GeoJSON").Open("drc.geojson") 159 | layer = ds.GetLayer() 160 | feat = layer.GetNextFeature() 161 | 162 | drc_granules = granule_client.query(start_date="2015-09-01", bounding_box=feat.GetGeometryRef()) 163 | ``` 164 | 165 | ### Downloading 166 | 167 | The return value of a query with the `GranuleApi` is a generator. This avoids calling the API more than is immediately needed if more than one page of results is found. 168 | 169 | Iterating through a generator consumes it. If you need to reuse the values, convert it to a list with `list(granules)`. 170 | 171 | ```python 172 | GranuleHandler.download_from_granules(granules, session=session) 173 | 174 | # File paths of the downloaded files are returned 175 | file_paths = GranuleHandler.download_from_granules(granules, session=session) 176 | 177 | # Saves to the current directory; use `path` to save somewhere else 178 | GranuleHandler.download_from_granules(granules, session=session, path="../Desktop") 179 | 180 | # Restrict downloads to the permitted file extensions; the first matching link is used 181 | # Priority is given in the order of returned links, not file types 182 | file_paths = GranuleHandler.download_from_granules(granules, session, ext=("hdf", "h5", "nc", "xml")) 183 | 184 | ``` 185 | 186 | #### Multithreaded Downloads 187 | 188 | The `threads` parameter in `GranuleHandler.download_from_granules()` specifies how many concurrent processes or threads should be used while downloading. 189 | 190 | `threads` is an integer specifying the maximum number of concurrently running workers. 191 | 192 | * If 1 is given, no parallelism is used at all, which is useful for debugging. 193 | * If set to -1, all CPUs are used. 194 | * For `threads` below -1, (n_cpus + 1 + threads) workers are used. For example, with `threads=-2`, all CPUs but one are used. 195 | 196 | ```python 197 | GranuleHandler.download_from_granules(nigeria_granules, modis_session=session, threads=-1) 198 | ``` 199 | 200 | #### MODIS Data Types 201 | 202 | Currently modis_tools supports downloading files with the "hdf", "h5", "nc", and "xml" extensions (the defaults for the `ext` parameter). 203 | 204 | ## Development and Testing 205 | 206 | ### Pre-commit hooks 207 | 208 | - The developers use [pre-commit](https://pre-commit.com/) hooks to format code before committing. pre-commit can be installed with: 209 | ```bash 210 | pip install pre-commit 211 | ``` 212 | and should be run before committing with: 213 | ```bash 214 | pre-commit run --all-files 215 | ``` 216 | 217 | ### Setting up a development environment 218 | 219 | To quickly set up a virtual environment locally using venv, run the following from the repo root: 220 | 221 | ```bash 222 | python -m venv .venv 223 | source .venv/bin/activate 224 | pip install -r requirements.txt 225 | ``` 226 | 227 | - Alternatively, to install just the production dependencies, run: 228 | 229 | ```bash 230 | pip install -r requirements.txt 231 | ``` 232 | 233 | ### Dev-Dependencies 234 | 235 | - To install the dev dependencies needed to run tests, run: 236 | 237 | ```bash 238 | pip install -e .[test] 239 | ``` 240 | 241 | - Note that `gdal` is optionally supported as an extra dependency. This is for users who wish to use `ogr.Geometry` objects to spatially query the MODIS data to be retrieved. Assuming you have all the libraries installed to run gdal, you can install this dependency with: 242 | 243 | ```bash 244 | pip install -e .[gdal] 245 | ``` 246 | 247 | - To install more than one extra dependency set, separate them with a comma as seen in the below example.
The full list of supported dependency sets is listed under `extras_require` in setup.py: 248 | 249 | ```bash 250 | pip install -e .[test,gdal] 251 | ``` 252 | 253 | ### Testing 254 | 255 | 1. All tests can be found in `./tests`, with a directory structure mirroring that of the files being tested 256 | 2. To run tests, navigate your terminal to the root of this repo, and: 257 | 1. To run only the unit tests (faster), run: 258 | `pytest -m "not integration_test"` 259 | 2. To run only the integration tests (slower), run: 260 | `pytest -m integration_test` 261 | 3. To run the whole test suite, run: 262 | `pytest` 263 | 264 | ### Release Instructions 265 | 266 | For project maintainers: 267 | 268 | * Once all changes have been merged to main for a release, make a branch 269 | to upgrade the version, e.g. `upgrade-1.13` 270 | * `pip install build twine` 271 | * Update the version in `setup.py` 272 | * Create the source archive and wheel with `python -m build` 273 | * `twine check dist/*` to check the files you've just built 274 | 275 | *The final steps assume you've set up your PyPI and TestPyPI accounts* 276 | * Test upload to TestPyPI with `twine upload -r testpypi dist/*` 277 | * If you haven't set up MFA for PyPI/TestPyPI, use your normal login username 278 | and password 279 | * If you have, use `__token__` as the username and an [API 280 | token](https://pypi.org/help/#apitoken) as your password 281 | * Assuming the test upload goes smoothly, upload to PyPI with `twine upload dist/*` 282 | * Merge the version update branch to main 283 | 284 | 285 | ## Issues and Contributing 286 | 287 | We welcome any feedback and contributions from the community! 288 | - To report an issue or to request support, please use [GitHub issues](https://github.com/fraymio/modis-tools/issues). 289 | - To contribute, please check out our [contribution guideline](./CONTRIBUTING.md). --------------------------------------------------------------------------------