├── requirements.txt
├── NOTICE
├── requirements_dev.txt
├── .gitignore
├── .github
│   ├── pull_request_template.md
│   └── workflows
│       └── unit-test.yml
├── CODE_OF_CONDUCT.md
├── gdk-config.json
├── src
│   ├── __init__.py
│   └── DirectoryUploader.py
├── tests
│   ├── __init__.py
│   └── test_directoryuploader.py
├── recipe.yaml
├── main.py
├── CONTRIBUTING.md
├── README.md
└── LICENSE

/requirements.txt:
--------------------------------------------------------------------------------
stream-manager==1.1.1

--------------------------------------------------------------------------------
/NOTICE:
--------------------------------------------------------------------------------
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

--------------------------------------------------------------------------------
/requirements_dev.txt:
--------------------------------------------------------------------------------
-r requirements.txt
pytest==7.0.1
git+https://github.com/aws-greengrass/aws-greengrass-gdk-cli.git@v1.1.0#egg=gdk

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
*build/
build
*dist/
*.egg-info
*__pycache__
*htmlcov/
*.coverage
*.iml
*.DS_Store
*.eggs
*venv
*vscode
src/stream-manager/

--------------------------------------------------------------------------------
/.github/pull_request_template.md:
--------------------------------------------------------------------------------
Issue #, if available:

Description of changes:

Mandatory license notice:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@cyril-lagrange please review this change.

--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.

--------------------------------------------------------------------------------
/gdk-config.json:
--------------------------------------------------------------------------------
{
  "component": {
    "aws.greengrass.labs.s3.file.uploader": {
      "author": "Amazon",
      "version": "NEXT_PATCH",
      "build": {
        "build_system": "zip"
      },
      "publish": {
        "bucket": "",
        "region": ""
      }
    }
  },
  "gdk_version": "1.0.0"
}

--------------------------------------------------------------------------------
/src/__init__.py:
--------------------------------------------------------------------------------
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

--------------------------------------------------------------------------------
/recipe.yaml:
--------------------------------------------------------------------------------
---
RecipeFormatVersion: "2020-01-25"
ComponentName: "{COMPONENT_NAME}"
ComponentVersion: "{COMPONENT_VERSION}"
ComponentDescription: "This is a simple file uploader component written in Python."
ComponentPublisher: "{COMPONENT_AUTHOR}"
ComponentDependencies:
  aws.greengrass.StreamManager:
    VersionRequirement: "^2.0.0"
    DependencyType: "HARD"
ComponentConfiguration:
  DefaultConfiguration:
    PathName: ""
    BucketName: ""
    ObjectKeyPrefix: ""
    Interval: "1"
    LogLevel: "INFO"
Manifests:
  - Platform:
      os: linux
    Artifacts:
      - URI: "s3://BUCKET_NAME/COMPONENT_NAME/COMPONENT_VERSION/aws-greengrass-labs-s3-file-uploader.zip"
        Unarchive: ZIP
    Lifecycle:
      Run: "python3 -u {artifacts:decompressedPath}/aws-greengrass-labs-s3-file-uploader/main.py \"{configuration:/PathName}\" \"{configuration:/BucketName}\" \"{configuration:/ObjectKeyPrefix}\" \"{configuration:/Interval}\" \"{configuration:/LogLevel}\""
      Install: "pip3 install --user -r {artifacts:decompressedPath}/aws-greengrass-labs-s3-file-uploader/requirements.txt"

--------------------------------------------------------------------------------
/.github/workflows/unit-test.yml:
--------------------------------------------------------------------------------
# This is a basic workflow to help you get started with Actions

name: CI

# Controls when the action will run. Triggers the workflow on push or pull request
# events but only for the main branch
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  build:
    strategy:
      matrix:
        python-version: [3.9.19]
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      - name: Checkout
        uses: actions/checkout@v2
        with:
          fetch-depth: 0

      - name: Switch to Current Branch
        run: git checkout ${{ env.BRANCH }}

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v1
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements_dev.txt
      - name: Run unit tests
        run: python -m pytest --import-mode=append tests/

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


import sys
import time
import asyncio
import logging
from urllib.parse import urlparse
from src.DirectoryUploader import DirectoryUploader

# This example scans a folder for a file pattern and uploads the files that match to S3.
# The program monitors the completion of each S3 operation and deletes the file upon successful upload.


async def main(logger: logging.Logger, pathname, bucket_name, bucket_path, interval):

    logger.info("==== main ====")

    while True:
        du = None
        try:
            du = DirectoryUploader(pathname=pathname, bucket_name=bucket_name, bucket_path=bucket_path, interval=interval, logger=logger)
            await du.Run()
        except Exception:
            logger.exception("Exception while running")
        finally:
            if du is not None:
                du.Close()
        # Something very wrong happened. Let's pause for 1 minute and start again.
        time.sleep(60)


# Start up this sample code

if __name__ == "__main__":
    # args: pathname, bucket_name, bucket_path, interval, log_level
    if len(sys.argv) == 6:
        # TODO: validate arguments.

        print("PRINTING INCOMING ARGUMENTS")
        print(sys.argv)
        pathname = sys.argv[1]
        bucket_name = sys.argv[2]
        bucket_path = sys.argv[3]
        interval = sys.argv[4]
        log_level = sys.argv[5]

        logging.basicConfig(level=log_level)
        logger = logging.getLogger()

        logger.info(f'File uploader started with: pathname={pathname}, bucket_name={bucket_name}, bucket_path={bucket_path}, interval={interval}')
        asyncio.run(main(logger, pathname, bucket_name, bucket_path, int(interval)))
    else:
        logging.basicConfig(level=logging.INFO)
        logger = logging.getLogger()
        logger.error(f'6 arguments required (including the script name), only {len(sys.argv)} provided.')

--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
# Contributing Guidelines

Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
documentation, we greatly value feedback and contributions from our community.

Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
information to effectively respond to your bug report or contribution.


## Reporting Bugs/Feature Requests

We welcome you to use the GitHub issue tracker to report bugs or suggest features.

When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:

* A reproducible test case or series of steps
* The version of our code being used
* Any modifications you've made relevant to the bug
* Anything unusual about your environment or deployment


## Contributing via Pull Requests
Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:

1. You are working against the latest source on the *main* branch.
2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
3. You open an issue to discuss any significant work - we would hate for your time to be wasted.

To send us a pull request, please:

1. Fork the repository.
2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
3. Ensure local tests pass.
4. Commit to your fork using clear commit messages.
5. Send us a pull request, answering any default questions in the pull request interface.
6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.

GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
[creating a pull request](https://help.github.com/articles/creating-a-pull-request/).


## Finding contributions to work on
Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.

## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
opensource-codeofconduct@amazon.com with any additional questions or comments.


## Security issue notifications
If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue.


## Licensing

See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
## aws-greengrass-labs-s3-file-uploader

This component monitors a directory for new files, uploads them to S3, and deletes them upon successful upload.
For the upload to S3, aws-greengrass-labs-s3-file-uploader uses the Greengrass stream manager.
Asyncio is used to concurrently monitor the directory and the stream manager status stream.

The logic to scan the folder is to list all of the files that match a pattern, sort them by last modified date, remove the most recent file, and send the remaining files to stream manager for upload.

The most recent file is considered the active file, and the producer might still be writing to it.
The caveat of this approach is that if there is only one file in the folder it will not be sent to S3.
The delivery guarantee is at least once, meaning that in case of transmission errors and retry, the same file might be uploaded multiple times.
The user under which this component runs needs to have rwx permissions on the directory where the files are located.
Write and execute permissions are required so that files can be deleted after transfer.
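
For illustration only, the scan-and-select step described above amounts to something like the following sketch. It is a simplified, standalone approximation using just the Python standard library, not the component's actual implementation, which lives in `src/DirectoryUploader.py`:

```python
import glob
import os


def files_ready_for_upload(pathname: str) -> list:
    """Return files matching the glob pattern, oldest first, excluding the
    most recent file, which is assumed to still be written to."""
    matches = glob.glob(pathname)        # e.g. "/data/logs/*.csv"
    matches.sort(key=os.path.getmtime)   # oldest first
    return matches[:-1]                  # drop the newest (active) file
```

Each file selected by such a scan is handed to stream manager for upload to S3 and deleted once the upload is reported as successful.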

## Downloading this component
You can either clone this repo with git, or download this component as a zip file.

NOTE THAT THE FOLDER IN WHICH YOU CLONE OR UNZIP THIS COMPONENT NEEDS TO BE CALLED ```aws-greengrass-labs-s3-file-uploader``` FOR GDK TO WORK PROPERLY

## Installing dependencies
It is recommended to create a virtual environment to install dependencies. To do so, run the following commands in the root folder of this repo:
```bash
python3 -m venv .venv
source ./.venv/bin/activate
```

This component can be built with [GDK](https://docs.aws.amazon.com/greengrass/v2/developerguide/gdk-cli-configuration-file.html), and uses pytest for unit testing.
To install those dev dependencies as well as the runtime dependencies so that tools like Pylance work properly, run the following command in the virtual environment you just created:

```bash
pip3 install -r requirements_dev.txt
```
## Build

Before building the component, you will need to update the gdk-config.json file, replacing the bucket and region placeholders.
```
"publish": {
    "bucket": "",
    "region": ""
}
```
The bucket is where the component artefact will be uploaded when you publish the component, and region is the region where the Greengrass component will be created.

Once this is done, you can build the component with the following command:
```
gdk component build
```
## Publish
Before you can deploy aws-greengrass-labs-s3-file-uploader to your device, you first need to publish your component.
This can be done with the following command:
```
gdk component publish
```

You should now be able to see the aws-greengrass-labs-s3-file-uploader component in the *Greengrass -> Components* section of the AWS console.
It can now be included in a Greengrass deployment and pushed to your device.

Note: for the component to run successfully you need to update the default configuration when you deploy the component, see the next section.

## Configuration
This component provides the following configuration parameters when it is deployed:

PathName: "/local/path/to/monitor/*.ext"
BucketName: "bucket-name-where-to-upload-files"
ObjectKeyPrefix: "a prefix to add to the object name in S3"
Interval: