├── config.ini ├── src ├── utils.py ├── project.py ├── model │ ├── test_model.py │ ├── predict_model.py │ └── train_model.py ├── weights │ └── utils.py ├── network │ ├── approach_01.py │ └── approach_02.py ├── dataset │ └── download_dataset.py └── visualization │ └── visaulization_model.py ├── tasks ├── lint.sh ├── download.sh └── test_api.sh ├── api └── __init__.py ├── application.py ├── requirement.txt ├── sqs └── SQSSender.py ├── .pre-commit-config.yaml ├── aws └── download_files.py ├── data ├── raw │ └── metadata.toml ├── external │ └── metadata.toml ├── interim │ └── metadata.toml └── processed │ └── metadata.toml ├── examples ├── feature_01.md └── feature_02.md ├── notebooks └── test.ipynb ├── project_cli └── train_cli.py ├── training ├── experiment │ └── utils.py ├── run_experiment.py ├── update_metadata.py └── prepare_experiment.py ├── evaluation ├── evaluate_model_01.py └── evaluate_model_02.py ├── Dockerfile ├── LICENSE ├── .gitignore └── README.md /config.ini: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/utils.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tasks/lint.sh: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /api/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /application.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /requirement.txt: -------------------------------------------------------------------------------- 1 | 
-------------------------------------------------------------------------------- /sqs/SQSSender.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/project.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tasks/download.sh: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /tasks/test_api.sh: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /aws/download_files.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/raw/metadata.toml: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /examples/feature_01.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /examples/feature_02.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /notebooks/test.ipynb: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/model/test_model.py: 
-------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/weights/utils.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/external/metadata.toml: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/interim/metadata.toml: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /data/processed/metadata.toml: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /project_cli/train_cli.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/model/predict_model.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/model/train_model.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/network/approach_01.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/network/approach_02.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /training/experiment/utils.py: 
-------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /training/run_experiment.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /training/update_metadata.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /evaluation/evaluate_model_01.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /evaluation/evaluate_model_02.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/dataset/download_dataset.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /training/prepare_experiment.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/visualization/visaulization_model.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | FROM ubuntu:20.04 2 | 3 | RUN apt-get update 4 | RUN apt-get install -y python3-pip 5 | RUN pip3 install --upgrade pip 6 | RUN pip3 install pipenv 7 | COPY requirements.txt ./requirements.txt 8 | RUN pip3 install -r requirements.txt 9 | COPY . 
./ 10 | 11 | RUN pipenv sync -d 12 | CMD ["application.handler"] -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Sunil Ghimire 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 
92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | 131 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Machine Learning Project Structure 2 | 3 | A well-organized, general Machine Learning project structure makes a project easy to understand and change. Moreover, the same structure can be reused across multiple projects, which avoids confusion. 4 | 5 | ## Steps involved in making a project structure 6 | 7 | Step 01: Make sure that you have the latest Python and pip installed on your system. 
8 | 9 | Step 02: Create a sample repository on github.com (for example: Machine-Learning-Project) 10 | 11 | Step 03: Clone the repo to your local system `git clone ` 12 | 13 | Step 04: Change to the new directory 'Machine-Learning-Project' `cd Machine-Learning-Project` 14 | 15 | Step 05: Create and activate a virtual environment 16 | ``` 17 | Example 01: 18 | 19 | - Create Virtual Environment 20 | python -m venv venv_machine_learning_project 21 | 22 | - Activate Created Virtual Environment 23 | For Unix-based systems -> source ./venv_machine_learning_project/bin/activate 24 | For Windows -> ./venv_machine_learning_project/Scripts/activate 25 | 26 | Example 02: 27 | 28 | - Create Virtual Environment 29 | conda create -n venv_machine_learning_project 30 | 31 | - Activate Created Virtual Environment 32 | conda activate venv_machine_learning_project 33 | ``` 34 | Step 06: Follow the directory structure below for your project 35 | 36 | ## Project Directory Structure 37 | 38 | ``` 39 | ├── Machine Learning Project Structure <- Project Main Directory 40 | | ├── api <- Scripts that serialize the API calls and act as endpoints for the project functions. 41 | │ ├── data <- Data in different formats 42 | | | ├── external <- Data from third-party sources 43 | | | ├── interim <- Intermediate data that has been transformed 44 | | | ├── processed <- The final, canonical data sets for modeling 45 | | | ├── raw <- The original, immutable data dump 46 | | ├── evaluation 47 | | | ├── evaluate_model_01.py <- Different metrics used to evaluate the model 48 | | | ├── evaluate_model_02.py <- Different metrics used to evaluate the model 49 | │ ├── examples 50 | | | ├── feature_01.md <- Docs and examples showing how to use the project, its different functions, etc. 51 | | | ├── feature_02.md <- Docs and examples showing how to use the project, its different functions, etc. 
52 | │ ├── notebooks <- All the IPython notebooks used for EDA, visualization and proof of concept (POC). 53 | │ ├── src 54 | | | ├── dataset 55 | | | | ├── download_dataset.py <- Scripts to download the dataset or access the dataset from the data folder 56 | | | ├── model 57 | | | | ├── train_model.py <- Scripts to train the model 58 | | | | ├── test_model.py <- Scripts to test the model 59 | | | | ├── predict_model.py <- Scripts to make predictions with the model 60 | | | ├── network 61 | | | | ├── approach_01.py <- Neural network schema 62 | | | ├── weights 63 | | | | ├── utils.py <- folder to save weights 64 | | | ├── visualization 65 | | | | ├── visaulization_model.py <- Scripts to visualize the model 66 | | | ├── utils.py <- Different utility functions 67 | | | ├── project.py <- Project pipeline 68 | │ ├── project_cli <- Scripts that provide a command-line interface for training, testing and other features. 69 | | | ├── train_cli.py 70 | | | ├── test_cli.py 71 | │ ├── tasks <- Contains batch scripts for downloading files from the web and for automatically testing and linting the project. 72 | | | ├── download.sh 73 | | | ├── lint.sh 74 | | | ├── test_api.sh 75 | │ ├── training <- Contains everything for preparing experiments, running them automatically and updating metadata. 76 | | | ├── experiment 77 | | | | ├── utils.py 78 | | | ├── prepare_experiment.py 79 | | | ├── run_experiment.py 80 | | | ├── update_metadata.py 81 | | ├── sqs 82 | | | ├── SQSSender.py <- Sends messages to Amazon SQS 83 | | ├── aws 84 | | | ├── download_files.py <- Uploads and downloads files from an Amazon S3 bucket 85 | │ ├── config.ini <- Contains configuration information for the project 86 | │ ├── .pre-commit-config.yaml <- Identifies simple issues before submission to code review 87 | │ ├── .gitignore <- Tells Git which files to ignore when committing your project to the GitHub repository 88 | │ ├── .env <- Used to hide confidential data like the AWS Secret Key, AWS Access Key, S3 Bucket Name, etc. 
89 | │ ├── Dockerfile <- Helps in dockerizing the whole system 90 | │ ├── requirements.txt <- Lists all the modules used while building the project. 91 | │ ├── application.py <- Python module that processes events: when the function is invoked, Lambda runs the handler method. 92 | │ ├── README.md <- The top-level README for developers using this project 93 | ``` 94 | 95 | **Note**: The `data` folder and `.env` file won’t appear on GitHub. They stay in your local folder and are not pushed to GitHub because they are in the ignore list (`.gitignore` file). If you want to check them in as well, comment out the corresponding entries in the `.gitignore` file and add the data folder to GitHub. 96 | 97 | ## Thank you for reading! Share your ❤️ by starring this repo, as it encourages me to write more! 98 | --------------------------------------------------------------------------------
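The directory structure described in the README can also be created programmatically rather than by hand. The following is a minimal sketch, not part of the original repository: the `scaffold` function and the `LAYOUT` list are hypothetical names introduced here for illustration, and `LAYOUT` covers only a subset of the paths named in the README's directory listing (with `requirements.txt` spelled as in that listing).

```python
from pathlib import Path

# Hypothetical layout list: a subset of the paths named in the README's
# directory listing. Entries ending in "/" are directories; all other
# entries are created as empty files.
LAYOUT = [
    "api/__init__.py",
    "data/external/",
    "data/interim/",
    "data/processed/",
    "data/raw/",
    "evaluation/evaluate_model_01.py",
    "examples/feature_01.md",
    "notebooks/",
    "src/dataset/download_dataset.py",
    "src/model/train_model.py",
    "src/model/test_model.py",
    "src/model/predict_model.py",
    "src/network/approach_01.py",
    "src/weights/utils.py",
    "src/visualization/visaulization_model.py",
    "src/utils.py",
    "src/project.py",
    "project_cli/train_cli.py",
    "tasks/download.sh",
    "tasks/lint.sh",
    "tasks/test_api.sh",
    "training/experiment/utils.py",
    "training/prepare_experiment.py",
    "training/run_experiment.py",
    "training/update_metadata.py",
    "sqs/SQSSender.py",
    "aws/download_files.py",
    "config.ini",
    ".pre-commit-config.yaml",
    ".gitignore",
    "Dockerfile",
    "requirements.txt",
    "application.py",
    "README.md",
]


def scaffold(root, layout=LAYOUT):
    """Create the skeleton under `root` and return the paths newly created.

    Existing files and directories are left untouched, so the function
    can be run again on a project that is already partly set up.
    """
    root = Path(root)
    created = []
    for entry in layout:
        target = root / entry
        if entry.endswith("/"):
            # Directory entry: create it (and any missing parents).
            if not target.exists():
                target.mkdir(parents=True)
                created.append(target)
        else:
            # File entry: make sure the parent exists, then create an empty file.
            target.parent.mkdir(parents=True, exist_ok=True)
            if not target.exists():
                target.touch()
                created.append(target)
    return created
```

Calling `scaffold("Machine-Learning-Project")` from the directory that contains the cloned repository would fill in any missing folders and empty placeholder files. The `.env` file is deliberately left out of the list, since it holds secrets and is usually written by hand.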