├── .github ├── PULL_REQUEST_TEMPLATE.md └── workflows │ └── validate.yml ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── README.md ├── check_order.py └── mlc_config.json /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## What is this tool for? 2 | 3 | Describe features. 4 | 5 | ## What's the difference between this tool and similar ones? 6 | 7 | Enumerate comparisons. 8 | 9 | --- 10 | 11 | Anyone who agrees with this pull request could submit an *Approve* review to it. 12 | -------------------------------------------------------------------------------- /.github/workflows/validate.yml: -------------------------------------------------------------------------------- 1 | name: Validate README 2 | 3 | on: [push, pull_request] 4 | 5 | jobs: 6 | check-order: 7 | runs-on: ubuntu-latest 8 | steps: 9 | - uses: actions/checkout@v2 10 | - uses: actions/setup-python@v2 11 | with: 12 | python-version: '3.8' 13 | - run: python check_order.py 14 | check-links: 15 | runs-on: ubuntu-latest 16 | steps: 17 | - uses: actions/checkout@v2 18 | - uses: gaurav-nelson/github-action-markdown-link-check@v1 19 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | In the interest of fostering an open and welcoming environment, we as 6 | contributors and maintainers pledge to making participation in our project and 7 | our community a harassment-free experience for everyone, regardless of age, body 8 | size, disability, ethnicity, gender identity and expression, level of experience, 9 | nationality, personal appearance, race, religion, or sexual identity and 10 | orientation. 11 | 12 | ## Our Standards 13 | 14 | Examples of behavior that contributes to creating a positive environment 15 | include: 16 | 17 | * Using welcoming and inclusive language 18 | * Being respectful of differing viewpoints and experiences 19 | * Gracefully accepting constructive criticism 20 | * Focusing on what is best for the community 21 | * Showing empathy towards other community members 22 | 23 | Examples of unacceptable behavior by participants include: 24 | 25 | * The use of sexualized language or imagery and unwelcome sexual attention or 26 | advances 27 | * Trolling, insulting/derogatory comments, and personal or political attacks 28 | * Public or private harassment 29 | * Publishing others' private information, such as a physical or electronic 30 | address, without explicit permission 31 | * Other conduct which could reasonably be considered inappropriate in a 32 | professional setting 33 | 34 | ## Our Responsibilities 35 | 36 | Project maintainers are responsible for clarifying the standards of acceptable 37 | behavior and are expected to take appropriate and fair corrective action in 38 | response to any instances of unacceptable behavior. 39 | 40 | Project maintainers have the right and responsibility to remove, edit, or 41 | reject comments, commits, code, wiki edits, issues, and other contributions 42 | that are not aligned to this Code of Conduct, or to ban temporarily or 43 | permanently any contributor for other behaviors that they deem inappropriate, 44 | threatening, offensive, or harmful. 45 | 46 | ## Scope 47 | 48 | This Code of Conduct applies both within project spaces and in public spaces 49 | when an individual is representing the project or its community. 
Examples of 50 | representing a project or community include using an official project e-mail 51 | address, posting via an official social media account, or acting as an appointed 52 | representative at an online or offline event. Representation of a project may be 53 | further defined and clarified by project maintainers. 54 | 55 | ## Enforcement 56 | 57 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 58 | reported by contacting the project team. All 59 | complaints will be reviewed and investigated and will result in a response that 60 | is deemed necessary and appropriate to the circumstances. The project team is 61 | obligated to maintain confidentiality with regard to the reporter of an incident. 62 | Further details of specific enforcement policies may be posted separately. 63 | 64 | Project maintainers who do not follow or enforce the Code of Conduct in good 65 | faith may face temporary or permanent repercussions as determined by other 66 | members of the project's leadership. 67 | 68 | ## Attribution 69 | 70 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, 71 | available at [http://contributor-covenant.org/version/1/4][version] 72 | 73 | [homepage]: http://contributor-covenant.org 74 | [version]: http://contributor-covenant.org/version/1/4/ 75 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | Your contributions are always welcome! 4 | 5 | ## Guidelines 6 | 7 | * Add one link per Pull Request. 8 | * Make sure the PR title is in the format of `Add project-name`. 9 | * Add the link: `* [project-name](http://example.com/) - A short description ends with a period.` 10 | * Keep descriptions concise. 11 | * Add a section if needed. 12 | * Add the section description. 13 | * Add the section title to Table of Contents. 14 | * Search previous Pull Requests or Issues before making a new one, as yours may be a duplicate. 15 | * Check your spelling and grammar. 16 | * Remove any trailing whitespace. 17 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Awesome MLOps [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome) 2 | 3 | A curated list of awesome MLOps tools. 4 | 5 | Inspired by [awesome-python](https://github.com/vinta/awesome-python). 
6 | 7 | - [Awesome MLOps](#awesome-mlops) 8 | - [AutoML](#automl) 9 | - [CI/CD for Machine Learning](#cicd-for-machine-learning) 10 | - [Cron Job Monitoring](#cron-job-monitoring) 11 | - [Data Catalog](#data-catalog) 12 | - [Data Enrichment](#data-enrichment) 13 | - [Data Exploration](#data-exploration) 14 | - [Data Management](#data-management) 15 | - [Data Processing](#data-processing) 16 | - [Data Validation](#data-validation) 17 | - [Data Visualization](#data-visualization) 18 | - [Drift Detection](#drift-detection) 19 | - [Feature Engineering](#feature-engineering) 20 | - [Feature Store](#feature-store) 21 | - [Hyperparameter Tuning](#hyperparameter-tuning) 22 | - [Knowledge Sharing](#knowledge-sharing) 23 | - [Machine Learning Platform](#machine-learning-platform) 24 | - [Model Fairness and Privacy](#model-fairness-and-privacy) 25 | - [Model Interpretability](#model-interpretability) 26 | - [Model Lifecycle](#model-lifecycle) 27 | - [Model Serving](#model-serving) 28 | - [Model Testing & Validation](#model-testing--validation) 29 | - [Optimization Tools](#optimization-tools) 30 | - [Simplification Tools](#simplification-tools) 31 | - [Visual Analysis and Debugging](#visual-analysis-and-debugging) 32 | - [Workflow Tools](#workflow-tools) 33 | - [Resources](#resources) 34 | - [Articles](#articles) 35 | - [Books](#books) 36 | - [Events](#events) 37 | - [Other Lists](#other-lists) 38 | - [Podcasts](#podcasts) 39 | - [Slack](#slack) 40 | - [Websites](#websites) 41 | - [Contributing](#contributing) 42 | 43 | --- 44 | 45 | ## AutoML 46 | 47 | *Tools for performing AutoML.* 48 | 49 | * [AutoGluon](https://github.com/awslabs/autogluon) - Automated machine learning for image, text, tabular, time-series, and multi-modal data. 50 | * [AutoKeras](https://github.com/keras-team/autokeras) - AutoKeras aims to make machine learning accessible for everyone. 51 | * [AutoPyTorch](https://github.com/automl/Auto-PyTorch) - Automatic architecture search and hyperparameter optimization for PyTorch. 52 | * [AutoSKLearn](https://github.com/automl/auto-sklearn) - Automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator. 53 | * [EvalML](https://github.com/alteryx/evalml) - A library that builds, optimizes, and evaluates ML pipelines using domain-specific functions. 54 | * [FLAML](https://github.com/microsoft/FLAML) - Finds accurate ML models automatically, efficiently and economically. 55 | * [H2O AutoML](https://h2o.ai/platform/h2o-automl) - Automates the ML workflow, including automatic training and tuning of models. 56 | * [MindsDB](https://github.com/mindsdb/mindsdb) - AI layer for databases that allows you to effortlessly develop, train and deploy ML models. 57 | * [MLBox](https://github.com/AxeldeRomblay/MLBox) - MLBox is a powerful Automated Machine Learning Python library. 58 | * [Model Search](https://github.com/google/model_search) - Framework that implements AutoML algorithms for model architecture search at scale. 59 | * [NNI](https://github.com/microsoft/nni) - An open source AutoML toolkit for automating the machine learning lifecycle. 60 | 61 | ## CI/CD for Machine Learning 62 | 63 | *Tools for performing CI/CD for Machine Learning.* 64 | 65 | * [ClearML](https://github.com/allegroai/clearml) - Auto-Magical CI/CD to streamline your ML workflow. 66 | * [CML](https://github.com/iterative/cml) - Open-source library for implementing CI/CD in machine learning projects.
67 | * [KitOps](https://github.com/jozu-ai/kitops) - Open source MLOps project that eases model handoffs between data scientists and DevOps. 68 | 69 | ## Cron Job Monitoring 70 | 71 | *Tools for monitoring cron jobs (recurring jobs).* 72 | 73 | * [Cronitor](https://cronitor.io/cron-job-monitoring) - Monitor any cron job or scheduled task. 74 | * [HealthchecksIO](https://healthchecks.io/) - Simple and effective cron job monitoring. 75 | 76 | ## Data Catalog 77 | 78 | *Tools for data cataloging.* 79 | 80 | * [Amundsen](https://www.amundsen.io/) - Data discovery and metadata engine for improving productivity when interacting with data. 81 | * [Apache Atlas](https://atlas.apache.org) - Provides open metadata management and governance capabilities to build a data catalog. 82 | * [CKAN](https://github.com/ckan/ckan) - Open-source DMS (data management system) for powering data hubs and data portals. 83 | * [DataHub](https://github.com/linkedin/datahub) - LinkedIn's generalized metadata search & discovery tool. 84 | * [Magda](https://github.com/magda-io/magda) - A federated, open-source data catalog for all your big data and small data. 85 | * [Metacat](https://github.com/Netflix/metacat) - Unified metadata exploration API service for Hive, RDS, Teradata, Redshift, S3 and Cassandra. 86 | * [OpenMetadata](https://open-metadata.org/) - A single place to discover, collaborate and get your data right. 87 | 88 | ## Data Enrichment 89 | 90 | *Tools and libraries for data enrichment.* 91 | 92 | * [Snorkel](https://github.com/snorkel-team/snorkel) - A system for quickly generating training data with weak supervision. 93 | * [Upgini](https://github.com/upgini/upgini) - Enriches training datasets with features from public and community shared data sources. 94 | 95 | ## Data Exploration 96 | 97 | *Tools for performing data exploration.* 98 | 99 | * [Apache Zeppelin](https://zeppelin.apache.org/) - Enables data-driven, interactive data analytics and collaborative documents. 100 | * [BambooLib](https://github.com/tkrabel/bamboolib) - An intuitive GUI for Pandas DataFrames. 101 | * [DataPrep](https://github.com/sfu-db/dataprep) - Collect, clean and visualize your data in Python. 102 | * [Google Colab](https://colab.research.google.com) - Hosted Jupyter notebook service that requires no setup to use. 103 | * [Jupyter Notebook](https://jupyter.org/) - Web-based notebook environment for interactive computing. 104 | * [JupyterLab](https://jupyterlab.readthedocs.io) - The next-generation user interface for Project Jupyter. 105 | * [Jupytext](https://github.com/mwouts/jupytext) - Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts. 106 | * [Pandas Profiling](https://github.com/ydataai/pandas-profiling) - Create HTML profiling reports from pandas DataFrame objects. 107 | * [Polynote](https://polynote.org/) - The polyglot notebook with first-class Scala support. 108 | 109 | ## Data Management 110 | 111 | *Tools for performing data management.* 112 | 113 | * [Arrikto](https://www.arrikto.com/) - Dead simple, ultra fast storage for the hybrid Kubernetes world. 114 | * [BlazingSQL](https://github.com/BlazingDB/blazingsql) - A lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF. 115 | * [Delta Lake](https://github.com/delta-io/delta) - Storage layer that brings scalable, ACID transactions to Apache Spark and other engines. 116 | * [Dolt](https://github.com/dolthub/dolt) - SQL database that you can fork, clone, branch, merge, push and pull just like a git repository.
117 | * [Dud](https://github.com/kevin-hanselman/dud) - A lightweight CLI tool for versioning data alongside source code and building data pipelines. 118 | * [DVC](https://dvc.org/) - Management and versioning of datasets and machine learning models. 119 | * [Git LFS](https://git-lfs.github.com) - An open source Git extension for versioning large files. 120 | * [Hub](https://github.com/activeloopai/Hub) - A dataset format for creating, storing, and collaborating on AI datasets of any size. 121 | * [Intake](https://github.com/intake/intake) - A lightweight set of tools for loading and sharing data in data science projects. 122 | * [lakeFS](https://github.com/treeverse/lakeFS) - Repeatable, atomic and versioned data lake on top of object storage. 123 | * [Marquez](https://github.com/MarquezProject/marquez) - Collect, aggregate, and visualize a data ecosystem's metadata. 124 | * [Milvus](https://github.com/milvus-io/milvus/) - An open source embedding vector similarity search engine powered by Faiss, NMSLIB and Annoy. 125 | * [Pinecone](https://www.pinecone.io) - Managed and distributed vector similarity search used with a lightweight SDK. 126 | * [Qdrant](https://github.com/qdrant/qdrant) - An open source vector similarity search engine with extended filtering support. 127 | * [Quilt](https://github.com/quiltdata/quilt) - A self-organizing data hub with S3 support. 128 | 129 | ## Data Processing 130 | 131 | *Tools related to data processing and data pipelines.* 132 | 133 | * [Airflow](https://airflow.apache.org/) - Platform to programmatically author, schedule, and monitor workflows. 134 | * [Azkaban](https://github.com/azkaban/azkaban) - Batch workflow job scheduler created at LinkedIn to run Hadoop jobs. 135 | * [Dagster](https://github.com/dagster-io/dagster) - A data orchestrator for machine learning, analytics, and ETL. 136 | * [Hadoop](https://hadoop.apache.org/) - Framework that allows for the distributed processing of large data sets across clusters. 137 | * [OpenRefine](https://github.com/OpenRefine/OpenRefine) - Power tool for working with messy data and improving it. 138 | * [Spark](https://spark.apache.org/) - Unified analytics engine for large-scale data processing. 139 | 140 | ## Data Validation 141 | 142 | *Tools related to data validation.* 143 | 144 | * [Cerberus](https://github.com/pyeve/cerberus) - Lightweight, extensible data validation library for Python. 145 | * [Cleanlab](https://github.com/cleanlab/cleanlab) - Python library for data-centric AI and machine learning with messy, real-world data and labels. 146 | * [Great Expectations](https://greatexpectations.io) - A Python data validation framework that allows you to test your data against datasets. 147 | * [JSON Schema](https://json-schema.org/) - A vocabulary that allows you to annotate and validate JSON documents. 148 | * [TFDV](https://github.com/tensorflow/data-validation) - A library for exploring and validating machine learning data. 149 | 150 | ## Data Visualization 151 | 152 | *Tools for data visualization, reports and dashboards.* 153 | 154 | * [Count](https://count.co) - SQL/drag-and-drop querying and visualisation tool based on notebooks. 155 | * [Dash](https://github.com/plotly/dash) - Analytical Web Apps for Python, R, Julia, and Jupyter. 156 | * [Data Studio](https://datastudio.google.com) - Reporting solution for power users who want to go beyond the data and dashboards of GA. 157 | * [Facets](https://github.com/PAIR-code/facets) - Visualizations for understanding and analyzing machine learning datasets.
158 | * [Grafana](https://grafana.com/grafana/) - Multi-platform open source analytics and interactive visualization web application. 159 | * [Lux](https://github.com/lux-org/lux) - Fast and easy data exploration by automating the visualization and data analysis process. 160 | * [Metabase](https://www.metabase.com/) - The simplest, fastest way to get business intelligence and analytics to everyone. 161 | * [Redash](https://redash.io/) - Connect to any data source, easily visualize, dashboard and share your data. 162 | * [SolidUI](https://github.com/CloudOrc/SolidUI) - AI-generated visualization prototyping and editing platform, supporting 2D and 3D models. 163 | * [Superset](https://superset.incubator.apache.org/) - Modern, enterprise-ready business intelligence web application. 164 | * [Tableau](https://www.tableau.com) - Powerful and fast-growing data visualization tool used in the business intelligence industry. 165 | 166 | ## Drift Detection 167 | 168 | *Tools and libraries related to drift detection.* 169 | 170 | * [Alibi Detect](https://github.com/SeldonIO/alibi-detect) - An open source Python library focused on outlier, adversarial and drift detection. 171 | * [Frouros](https://github.com/IFCA/frouros) - An open source Python library for drift detection in machine learning systems. 172 | * [TorchDrift](https://github.com/torchdrift/torchdrift/) - A data and concept drift library for PyTorch. 173 | 174 | ## Feature Engineering 175 | 176 | *Tools and libraries related to feature engineering.* 177 | 178 | * [Feature Engine](https://github.com/feature-engine/feature_engine) - Feature engineering package with scikit-learn-like functionality. 179 | * [Featuretools](https://github.com/alteryx/featuretools) - Python library for automated feature engineering. 180 | * [TSFresh](https://github.com/blue-yonder/tsfresh) - Python library for automatic extraction of relevant features from time series. 181 | 182 | ## Feature Store 183 | 184 | *Feature store tools for data serving.* 185 | 186 | * [Butterfree](https://github.com/quintoandar/butterfree) - A tool for building feature stores. Transform your raw data into beautiful features. 187 | * [ByteHub](https://github.com/bytehub-ai/bytehub) - An easy-to-use feature store. Optimized for time-series data. 188 | * [Feast](https://feast.dev/) - End-to-end open source feature store for machine learning. 189 | * [Feathr](https://github.com/linkedin/feathr) - An enterprise-grade, high performance feature store. 190 | * [Featureform](https://github.com/featureform/featureform) - A Virtual Feature Store. Turn your existing data infrastructure into a feature store. 191 | * [Tecton](https://www.tecton.ai/) - A fully-managed feature platform built to orchestrate the complete lifecycle of features. 192 | 193 | ## Hyperparameter Tuning 194 | 195 | *Tools and libraries to perform hyperparameter tuning.* 196 | 197 | * [Advisor](https://github.com/tobegit3hub/advisor) - Open-source implementation of Google Vizier for hyperparameter tuning. 198 | * [Hyperas](https://github.com/maxpumperla/hyperas) - A very simple wrapper for convenient hyperparameter optimization. 199 | * [Hyperopt](https://github.com/hyperopt/hyperopt) - Distributed Asynchronous Hyperparameter Optimization in Python. 200 | * [Katib](https://github.com/kubeflow/katib) - Kubernetes-based system for hyperparameter tuning and neural architecture search. 201 | * [KerasTuner](https://github.com/keras-team/keras-tuner) - Easy-to-use, scalable hyperparameter optimization framework.
202 | * [Optuna](https://optuna.org/) - Open source hyperparameter optimization framework to automate hyperparameter search. 203 | * [Scikit Optimize](https://github.com/scikit-optimize/scikit-optimize) - Simple and efficient library to minimize expensive and noisy black-box functions. 204 | * [Talos](https://github.com/autonomio/talos) - Hyperparameter Optimization for TensorFlow, Keras and PyTorch. 205 | * [Tune](https://docs.ray.io/en/latest/tune.html) - Python library for experiment execution and hyperparameter tuning at any scale. 206 | 207 | ## Knowledge Sharing 208 | 209 | *Tools for sharing knowledge to the entire team/company.* 210 | 211 | * [Knowledge Repo](https://github.com/airbnb/knowledge-repo) - Knowledge sharing platform for data scientists and other technical professions. 212 | * [Kyso](https://kyso.io/) - One place for data insights so your entire team can learn from your data. 213 | 214 | ## Machine Learning Platform 215 | 216 | *Complete machine learning platform solutions.* 217 | 218 | * [aiWARE](https://www.veritone.com/aiware/aiware-os/) - aiWARE helps MLOps teams evaluate, deploy, integrate, scale & monitor ML models. 219 | * [Algorithmia](https://algorithmia.com/) - Securely govern your machine learning operations with a healthy ML lifecycle. 220 | * [Allegro AI](https://allegro.ai/) - Transform ML/DL research into products. Faster. 221 | * [Bodywork](https://bodywork.readthedocs.io/en/latest/) - Deploys machine learning projects developed in Python, to Kubernetes. 222 | * [CNVRG](https://cnvrg.io/) - An end-to-end machine learning platform to build and deploy AI models at scale. 223 | * [DAGsHub](https://dagshub.com/) - A platform built on open source tools for data, model and pipeline management. 224 | * [Dataiku](https://www.dataiku.com/) - Platform democratizing access to data and enabling enterprises to build their own path to AI. 225 | * [DataRobot](https://www.datarobot.com/) - AI platform that democratizes data science and automates the end-to-end ML at scale. 226 | * [Domino](https://www.dominodatalab.com/) - One place for your data science tools, apps, results, models, and knowledge. 227 | * [Edge Impulse](https://edgeimpulse.com/) - Platform for creating, optimizing, and deploying AI/ML algorithms for edge devices. 228 | * [envd](https://github.com/tensorchord/envd) - Machine learning development environment for data science and AI/ML engineering teams. 229 | * [FedML](https://fedml.ai/) - Simplifies the workflow of federated learning anywhere at any scale. 230 | * [Gradient](https://gradient.paperspace.com/) - Multicloud CI/CD and MLOps platform for machine learning teams. 231 | * [H2O](https://www.h2o.ai/) - Open source leader in AI with a mission to democratize AI for everyone. 232 | * [Hopsworks](https://www.hopsworks.ai/) - Open-source platform for developing and operating machine learning models at scale. 233 | * [Iguazio](https://www.iguazio.com/) - Data science platform that automates MLOps with end-to-end machine learning pipelines. 234 | * [Katonic](https://katonic.ai/) - Automate your cycle of intelligence with Katonic MLOps Platform. 235 | * [Knime](https://www.knime.com/) - Create and productionize data science using one easy and intuitive environment. 236 | * [Kubeflow](https://www.kubeflow.org/) - Making deployments of ML workflows on Kubernetes simple, portable and scalable. 237 | * [LynxKite](https://lynxkite.com/) - A complete graph data science platform for very large graphs and other datasets. 
238 | * [ML Workspace](https://github.com/ml-tooling/ml-workspace) - All-in-one web-based IDE specialized for machine learning and data science. 239 | * [MLReef](https://github.com/MLReef/mlreef) - Open source MLOps platform that helps you collaborate, reproduce and share your ML work. 240 | * [Modzy](https://www.modzy.com/) - Deploy, connect, run, and monitor machine learning (ML) models in the enterprise and at the edge. 241 | * [Neu.ro](https://neu.ro) - MLOps platform that integrates open-source and proprietary tools into client-oriented systems. 242 | * [Omnimizer](https://www.omniml.ai) - Simplifies and accelerates MLOps by bridging the gap between ML models and edge hardware. 243 | * [Pachyderm](https://www.pachyderm.com/) - Combines data lineage with end-to-end pipelines on Kubernetes, engineered for the enterprise. 244 | * [Polyaxon](https://www.github.com/polyaxon/polyaxon/) - A platform for reproducible and scalable machine learning and deep learning on kubernetes. 245 | * [Sagemaker](https://aws.amazon.com/sagemaker/) - Fully managed service that provides the ability to build, train, and deploy ML models quickly. 246 | * [SAS Viya](https://www.sas.com/en_us/software/viya.html) - Cloud native AI, analytic and data management platform that supports the analytics life cycle. 247 | * [Sematic](https://sematic.dev) - An open-source end-to-end pipelining tool to go from laptop prototype to cloud in no time. 248 | * [SigOpt](https://sigopt.com/) - A platform that makes it easy to track runs, visualize training, and scale hyperparameter tuning. 249 | * [TrueFoundry](https://www.truefoundry.com) - A Cloud-native MLOps Platform over Kubernetes to simplify training and serving of ML Models. 250 | * [Valohai](https://valohai.com/) - Takes you from POC to production while managing the whole model lifecycle. 251 | 252 | ## Model Fairness and Privacy 253 | 254 | *Tools for performing model fairness and privacy in production.* 255 | 256 | * [AIF360](https://github.com/Trusted-AI/AIF360) - A comprehensive set of fairness metrics for datasets and machine learning models. 257 | * [Fairlearn](https://github.com/fairlearn/fairlearn) - A Python package to assess and improve fairness of machine learning models. 258 | * [Opacus](https://github.com/pytorch/opacus) - A library that enables training PyTorch models with differential privacy. 259 | * [TensorFlow Privacy](https://github.com/tensorflow/privacy) - Library for training machine learning models with privacy for training data. 260 | 261 | ## Model Interpretability 262 | 263 | *Tools for performing model interpretability/explainability.* 264 | 265 | * [Alibi](https://github.com/SeldonIO/alibi) - Open-source Python library enabling ML model inspection and interpretation. 266 | * [Captum](https://github.com/pytorch/captum) - Model interpretability and understanding library for PyTorch. 267 | * [ELI5](https://github.com/eli5-org/eli5) - Python package which helps to debug machine learning classifiers and explain their predictions. 268 | * [InterpretML](https://github.com/interpretml/interpret) - A toolkit to help understand models and enable responsible machine learning. 269 | * [LIME](https://github.com/marcotcr/lime) - Explaining the predictions of any machine learning classifier. 270 | * [Lucid](https://github.com/tensorflow/lucid) - Collection of infrastructure and tools for research in neural network interpretability. 271 | * [SAGE](https://github.com/iancovert/sage) - For calculating global feature importance using Shapley values. 
272 | * [SHAP](https://github.com/slundberg/shap) - A game theoretic approach to explain the output of any machine learning model. 273 | 274 | ## Model Lifecycle 275 | 276 | *Tools for managing model lifecycle (tracking experiments, parameters and metrics).* 277 | 278 | * [Aeromancy](https://github.com/quant-aq/aeromancy) - A framework for performing reproducible AI and ML for Weights and Biases. 279 | * [Aim](https://github.com/aimhubio/aim) - A super-easy way to record, search and compare 1000s of ML training runs. 280 | * [Cascade](https://github.com/Oxid15/cascade) - Library of ML-Engineering tools for rapid prototyping and experiment management. 281 | * [Comet](https://github.com/comet-ml/comet-examples) - Track your datasets, code changes, experimentation history, and models. 282 | * [Guild AI](https://guild.ai/) - Open source experiment tracking, pipeline automation, and hyperparameter tuning. 283 | * [Keepsake](https://github.com/replicate/keepsake) - Version control for machine learning with support to Amazon S3 and Google Cloud Storage. 284 | * [Losswise](https://losswise.com) - Makes it easy to track the progress of a machine learning project. 285 | * [MLflow](https://mlflow.org/) - Open source platform for the machine learning lifecycle. 286 | * [ModelDB](https://github.com/VertaAI/modeldb/) - Open source ML model versioning, metadata, and experiment management. 287 | * [Neptune AI](https://neptune.ai/) - The most lightweight experiment management tool that fits any workflow. 288 | * [Sacred](https://github.com/IDSIA/sacred) - A tool to help you configure, organize, log and reproduce experiments. 289 | * [Weights and Biases](https://github.com/wandb/client) - A tool for visualizing and tracking your machine learning experiments. 290 | 291 | ## Model Serving 292 | 293 | *Tools for serving models in production.* 294 | 295 | * [Banana](https://banana.dev) - Host your ML inference code on serverless GPUs and integrate it into your app with one line of code. 296 | * [Beam](https://beam.cloud) - Develop on serverless GPUs, deploy highly performant APIs, and rapidly prototype ML models. 297 | * [BentoML](https://github.com/bentoml/BentoML) - Open-source platform for high-performance ML model serving. 298 | * [BudgetML](https://github.com/ebhy/budgetml) - Deploy a ML inference service on a budget in less than 10 lines of code. 299 | * [Cog](https://github.com/replicate/cog) - Open-source tool that lets you package ML models in a standard, production-ready container. 300 | * [Cortex](https://www.cortex.dev/) - Machine learning model serving infrastructure. 301 | * [Geniusrise](https://docs.geniusrise.ai) - Host inference APIs, bulk inference and fine tune text, vision, audio and multi-modal models. 302 | * [Gradio](https://github.com/gradio-app/gradio) - Create customizable UI components around your models. 303 | * [GraphPipe](https://oracle.github.io/graphpipe) - Machine learning model deployment made simple. 304 | * [Hydrosphere](https://github.com/Hydrospheredata/hydro-serving) - Platform for deploying your Machine Learning to production. 305 | * [KFServing](https://github.com/kubeflow/kfserving) - Kubernetes custom resource definition for serving ML models on arbitrary frameworks. 306 | * [LocalAI](https://github.com/mudler/LocalAI) - Drop-in replacement REST API that’s compatible with OpenAI API specifications for inferencing. 307 | * [Merlin](https://github.com/gojek/merlin) - A platform for deploying and serving machine learning models. 
308 | * [MLEM](https://github.com/iterative/mlem) - Version and deploy your ML models following GitOps principles. 309 | * [Opyrator](https://github.com/ml-tooling/opyrator) - Turns your ML code into microservices with web API, interactive GUI, and more. 310 | * [PredictionIO](https://github.com/apache/predictionio) - Event collection, deployment of algorithms, evaluation, querying predictive results via APIs. 311 | * [Quix](https://quix.io) - Serverless platform for processing data streams in real-time with machine learning models. 312 | * [Rune](https://github.com/hotg-ai/rune) - Provides containers to encapsulate and deploy EdgeML pipelines and applications. 313 | * [Seldon](https://www.seldon.io/) - Take your ML projects from POC to production with maximum efficiency and minimal risk. 314 | * [Streamlit](https://github.com/streamlit/streamlit) - Lets you create apps for your ML projects with deceptively simple Python scripts. 315 | * [TensorFlow Serving](https://www.tensorflow.org/tfx/guide/serving) - Flexible, high-performance serving system for ML models, designed for production. 316 | * [TorchServe](https://github.com/pytorch/serve) - A flexible and easy to use tool for serving PyTorch models. 317 | * [Triton Inference Server](https://github.com/triton-inference-server/server) - Provides an optimized cloud and edge inferencing solution. 318 | * [Vespa](https://github.com/vespa-engine/vespa) - Store, search, organize and make machine-learned inferences over big data at serving time. 319 | * [Wallaroo.AI](https://wallaroo.ai/) - A platform for deploying, serving, and optimizing ML models in both cloud and edge environments. 320 | 321 | ## Model Testing & Validation 322 | 323 | *Tools for testing and validating models.* 324 | 325 | * [Deepchecks](https://github.com/deepchecks/deepchecks) - Open-source package for validating ML models & data, with various checks and suites. 326 | * [Starwhale](https://github.com/star-whale/starwhale) - An MLOps/LLMOps platform for model building, evaluation, and fine-tuning. 327 | * [Trubrics](https://github.com/trubrics/trubrics-sdk) - Validate machine learning with data science and domain expert feedback. 328 | 329 | ## Optimization Tools 330 | 331 | *Optimization tools related to model scalability in production.* 332 | 333 | * [Accelerate](https://github.com/huggingface/accelerate) - A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision. 334 | * [Dask](https://dask.org/) - Provides advanced parallelism for analytics, enabling performance at scale for the tools you love. 335 | * [DeepSpeed](https://github.com/microsoft/DeepSpeed) - Deep learning optimization library that makes distributed training easy, efficient, and effective. 336 | * [Fiber](https://uber.github.io/fiber/) - Python distributed computing library for modern computer clusters. 337 | * [Horovod](https://github.com/horovod/horovod) - Distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. 338 | * [Mahout](https://mahout.apache.org/) - Distributed linear algebra framework and mathematically expressive Scala DSL. 339 | * [MLlib](https://spark.apache.org/mllib/) - Apache Spark's scalable machine learning library. 340 | * [Modin](https://github.com/modin-project/modin) - Speed up your Pandas workflows by changing a single line of code. 341 | * [Nebullvm](https://github.com/nebuly-ai/nebullvm) - Easy-to-use library to boost AI inference. 
342 | * [Nos](https://github.com/nebuly-ai/nos) - Open-source module for running AI workloads on Kubernetes in an optimized way. 343 | * [Petastorm](https://github.com/uber/petastorm) - Enables single machine or distributed training and evaluation of deep learning models. 344 | * [Rapids](https://rapids.ai/index.html) - Gives the ability to execute end-to-end data science and analytics pipelines entirely on GPUs. 345 | * [Ray](https://github.com/ray-project/ray) - Fast and simple framework for building and running distributed applications. 346 | * [Singa](http://singa.apache.org/en/index.html) - Apache top-level project, focusing on distributed training of DL and ML models. 347 | * [Tpot](https://github.com/EpistasisLab/tpot) - Automated ML tool that optimizes machine learning pipelines using genetic programming. 348 | 349 | ## Simplification Tools 350 | 351 | *Tools related to machine learning simplification and standardization.* 352 | 353 | * [Chassis](https://chassisml.io) - Turns models into ML-friendly containers that run just about anywhere. 354 | * [Hermione](https://github.com/a3data/hermione) - Helps data scientists set up more organized code in a quicker and simpler way. 355 | * [Hydra](https://github.com/facebookresearch/hydra) - A framework for elegantly configuring complex applications. 356 | * [Koalas](https://github.com/databricks/koalas) - Pandas API on Apache Spark. Makes data scientists more productive when interacting with big data. 357 | * [Ludwig](https://github.com/uber/ludwig) - Allows users to train and test deep learning models without the need to write code. 358 | * [MLNotify](https://github.com/aporia-ai/mlnotify) - No need to keep checking your training: just one import line and you'll know the second it's done. 359 | * [PyCaret](https://pycaret.org/) - Open source, low-code machine learning library in Python. 360 | * [Sagify](https://github.com/Kenza-AI/sagify) - A CLI utility to train and deploy ML/DL models on AWS SageMaker. 361 | * [Soopervisor](https://github.com/ploomber/soopervisor) - Export ML projects to Kubernetes (Argo workflows), Airflow, AWS Batch, and SLURM. 362 | * [Soorgeon](https://github.com/ploomber/soorgeon) - Convert monolithic Jupyter notebooks into maintainable pipelines. 363 | * [TrainGenerator](https://github.com/jrieke/traingenerator) - A web app to generate template code for machine learning. 364 | * [Turi Create](https://github.com/apple/turicreate) - Simplifies the development of custom machine learning models. 365 | 366 | ## Visual Analysis and Debugging 367 | 368 | *Tools for performing visual analysis and debugging of ML/DL models.* 369 | 370 | * [Aporia](https://www.aporia.com/) - Observability with customized monitoring and explainability for ML models. 371 | * [Arize](https://www.arize.com/) - A free end-to-end ML observability and model monitoring platform. 372 | * [Evidently](https://github.com/evidentlyai/evidently) - Interactive reports to analyze ML models during validation or production monitoring. 373 | * [Fiddler](https://www.fiddler.ai/) - Monitor, explain, and analyze your AI in production. 374 | * [Manifold](https://github.com/uber/manifold) - A model-agnostic visual debugging tool for machine learning. 375 | * [NannyML](https://github.com/NannyML/nannyml) - Algorithm capable of fully capturing the impact of data drift on performance. 376 | * [Netron](https://github.com/lutzroeder/netron) - Visualizer for neural network, deep learning, and machine learning models.
377 | * [Opik](https://github.com/comet-ml/opik) - Evaluate, test, and ship LLM applications with a suite of observability tools. 378 | * [Phoenix](https://phoenix.arize.com) - MLOps in a Notebook for troubleshooting and fine-tuning generative LLM, CV, and tabular models. 379 | * [Radicalbit](https://github.com/radicalbit/radicalbit-ai-monitoring/) - The open source solution for monitoring your AI models in production. 380 | * [Superwise](https://www.superwise.ai) - Fully automated, enterprise-grade model observability in a self-service SaaS platform. 381 | * [Whylogs](https://github.com/whylabs/whylogs) - The open source standard for data logging. Enables ML monitoring and observability. 382 | * [Yellowbrick](https://github.com/DistrictDataLabs/yellowbrick) - Visual analysis and diagnostic tools to facilitate machine learning model selection. 383 | 384 | ## Workflow Tools 385 | 386 | *Tools and frameworks to create workflows or pipelines in the machine learning context.* 387 | 388 | * [Argo](https://github.com/argoproj/argo) - Open source container-native workflow engine for orchestrating parallel jobs on Kubernetes. 389 | * [Automate Studio](https://www.veritone.com/applications/automate-studio/) - Rapidly build & deploy AI-powered workflows. 390 | * [Couler](https://github.com/couler-proj/couler) - Unified interface for constructing and managing workflows on different workflow engines. 391 | * [dstack](https://github.com/dstackai/dstack) - An open-core tool to automate data and training workflows. 392 | * [Flyte](https://flyte.org/) - Easy to create concurrent, scalable, and maintainable workflows for machine learning. 393 | * [Hamilton](https://github.com/dagworks-inc/hamilton) - A scalable general purpose micro-framework for defining dataflows. 394 | * [Kale](https://github.com/kubeflow-kale/kale) - Aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows. 395 | * [Kedro](https://github.com/quantumblacklabs/kedro) - Library that implements software engineering best-practice for data and ML pipelines. 396 | * [Luigi](https://github.com/spotify/luigi) - Python module that helps you build complex pipelines of batch jobs. 397 | * [Metaflow](https://metaflow.org/) - Human-friendly library that helps scientists and engineers build and manage data science projects. 398 | * [MLRun](https://github.com/mlrun/mlrun) - Generic mechanism for data scientists to build, run, and monitor ML tasks and pipelines. 399 | * [Orchest](https://github.com/orchest/orchest/) - Visual pipeline editor and workflow orchestrator with an easy-to-use UI, based on Kubernetes. 400 | * [Ploomber](https://github.com/ploomber/ploomber) - Write maintainable, production-ready pipelines. Develop locally, deploy to the cloud. 401 | * [Prefect](https://docs.prefect.io/) - A workflow management system, designed for modern infrastructure. 402 | * [VDP](https://github.com/instill-ai/vdp) - An open-source tool to seamlessly integrate AI for unstructured data into the modern data stack. 403 | * [Wordware](https://www.wordware.ai) - A web-hosted IDE where non-technical domain experts can build task-specific AI agents. 404 | * [ZenML](https://github.com/maiot-io/zenml) - An extensible open-source MLOps framework to create reproducible pipelines. 405 | 406 | --- 407 | 408 | # Resources 409 | 410 | Where to discover new tools and discuss existing ones.
411 | 412 | ## Articles 413 | 414 | * [A Tour of End-to-End Machine Learning Platforms](https://databaseline.tech/a-tour-of-end-to-end-ml-platforms/) (Databaseline) 415 | * [Continuous Delivery for Machine Learning](https://martinfowler.com/articles/cd4ml.html) (Martin Fowler) 416 | * [Machine Learning Operations (MLOps): Overview, Definition, and Architecture](https://arxiv.org/abs/2205.02302) (arXiv) 417 | * [MLOps Roadmap: A Complete MLOps Career Guide](https://www.scaler.com/blog/mlops-roadmap/) (Scaler Blogs) 418 | * [MLOps: Continuous delivery and automation pipelines in machine learning](https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning) (Google) 419 | * [MLOps: Machine Learning as an Engineering Discipline](https://towardsdatascience.com/ml-ops-machine-learning-as-an-engineering-discipline-b86ca4874a3f) (Medium) 420 | * [Rules of Machine Learning: Best Practices for ML Engineering](https://developers.google.com/machine-learning/guides/rules-of-ml) (Google) 421 | * [The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/aad9f93b86b7addfea4c419b9100c6cdd26cacea.pdf) (Google) 422 | * [What Is MLOps?](https://blogs.nvidia.com/blog/2020/09/03/what-is-mlops/) (NVIDIA) 423 | 424 | ## Books 425 | 426 | * [Beginning MLOps with MLFlow](https://www.amazon.com/Beginning-MLOps-MLFlow-SageMaker-Microsoft/dp/1484265483) (Apress) 427 | * [Building Machine Learning Pipelines](https://www.oreilly.com/library/view/building-machine-learning/9781492053187) (O'Reilly) 428 | * [Building Machine Learning Powered Applications](https://www.oreilly.com/library/view/building-machine-learning/9781492045106) (O'Reilly) 429 | * [Deep Learning in Production](https://www.amazon.com/gp/product/6180033773) (AI Summer) 430 | * [Designing Machine Learning Systems](https://www.oreilly.com/library/view/designing-machine-learning/9781098107956) (O'Reilly) 431 | * [Engineering MLOps](https://www.packtpub.com/product/engineering-mlops/9781800562882) (Packt) 432 | * [Implementing MLOps in the Enterprise](https://www.oreilly.com/library/view/implementing-mlops-in/9781098136574) (O'Reilly) 433 | * [Introducing MLOps](https://www.oreilly.com/library/view/introducing-mlops/9781492083283) (O'Reilly) 434 | * [Kubeflow for Machine Learning](https://www.oreilly.com/library/view/kubeflow-for-machine/9781492050117) (O'Reilly) 435 | * [Kubeflow Operations Guide](https://www.oreilly.com/library/view/kubeflow-operations-guide/9781492053262) (O'Reilly) 436 | * [Machine Learning Design Patterns](https://www.oreilly.com/library/view/machine-learning-design/9781098115777) (O'Reilly) 437 | * [Machine Learning Engineering in Action](https://www.manning.com/books/machine-learning-engineering-in-action) (Manning) 438 | * [ML Ops: Operationalizing Data Science](https://www.oreilly.com/library/view/ml-ops-operationalizing/9781492074663) (O'Reilly) 439 | * [MLOps Engineering at Scale](https://www.manning.com/books/mlops-engineering-at-scale) (Manning) 440 | * [MLOps Lifecycle Toolkit](https://link.springer.com/book/10.1007/978-1-4842-9642-4) (Apress) 441 | * [Practical Deep Learning at Scale with MLflow](https://www.packtpub.com/product/practical-deep-learning-at-scale-with-mlflow/9781803241333) (Packt) 442 | * [Practical MLOps](https://www.oreilly.com/library/view/practical-mlops/9781098103002) (O'Reilly) 443 | * [Production-Ready Applied Deep 
Learning](https://www.packtpub.com/product/production-ready-applied-deep-learning/9781803243665) (Packt) 444 | * [Reliable Machine Learning](https://www.oreilly.com/library/view/reliable-machine-learning/9781098106218) (O'Reilly) 445 | * [The Machine Learning Solutions Architect Handbook](https://www.packtpub.com/product/the-machine-learning-solutions-architect-handbook/9781801072168) (Packt) 446 | 447 | ## Events 448 | 449 | * [apply() - The ML data engineering conference](https://www.applyconf.com/) 450 | * [MLOps Conference - Keynotes and Panels](https://www.youtube.com/playlist?list=PLH8M0UOY0uy6d_n3vEQe6J_gRBUrISF9m) 451 | * [MLOps World: Machine Learning in Production Conference](https://mlopsworld.com/) 452 | * [NormConf - The Normcore Tech Conference](https://normconf.com/) 453 | * [Stanford MLSys Seminar Series](https://mlsys.stanford.edu/) 454 | 455 | ## Other Lists 456 | 457 | * [Applied ML](https://github.com/eugeneyan/applied-ml) 458 | * [Awesome AutoML Papers](https://github.com/hibayesian/awesome-automl-papers) 459 | * [Awesome AutoML](https://github.com/windmaple/awesome-AutoML) 460 | * [Awesome Data Science](https://github.com/academic/awesome-datascience) 461 | * [Awesome DataOps](https://github.com/kelvins/awesome-dataops) 462 | * [Awesome Deep Learning](https://github.com/ChristosChristofidis/awesome-deep-learning) 463 | * [Awesome Game Datasets](https://github.com/leomaurodesenv/game-datasets) (includes AI content) 464 | * [Awesome Machine Learning](https://github.com/josephmisiti/awesome-machine-learning) 465 | * [Awesome MLOps](https://github.com/visenger/awesome-mlops) 466 | * [Awesome Production Machine Learning](https://github.com/EthicalML/awesome-production-machine-learning) 467 | * [Awesome Python](https://github.com/vinta/awesome-python) 468 | * [Deep Learning in Production](https://github.com/ahkarami/Deep-Learning-in-Production) 469 | 470 | ## Podcasts 471 | 472 | * [Kubernetes Podcast from Google](https://kubernetespodcast.com/) 473 | * [Machine Learning – Software Engineering Daily](https://podcasts.google.com/?feed=aHR0cHM6Ly9zb2Z0d2FyZWVuZ2luZWVyaW5nZGFpbHkuY29tL2NhdGVnb3J5L21hY2hpbmUtbGVhcm5pbmcvZmVlZC8) 474 | * [MLOps.community](https://podcasts.google.com/?feed=aHR0cHM6Ly9hbmNob3IuZm0vcy8xNzRjYjFiOC9wb2RjYXN0L3Jzcw) 475 | * [Pipeline Conversation](https://podcast.zenml.io/) 476 | * [Practical AI: Machine Learning, Data Science](https://changelog.com/practicalai) 477 | * [This Week in Machine Learning & AI](https://twimlai.com/) 478 | * [True ML Talks](https://www.youtube.com/playlist?list=PL4-eEhdXDO5F9Myvh41EeUh7oCgzqFRGk) 479 | 480 | ## Slack 481 | 482 | * [Kubeflow Workspace](https://kubeflow.slack.com/#/) 483 | * [MLOps Community Workspace](https://mlops-community.slack.com) 484 | 485 | ## Websites 486 | 487 | * [Feature Stores for ML](http://featurestore.org/) 488 | * [Made with ML](https://github.com/GokuMohandas/Made-With-ML) 489 | * [ML-Ops](https://ml-ops.org/) 490 | * [MLOps Community](https://mlops.community/) 491 | * [MLOps Guide](https://mlops-guide.github.io/) 492 | * [MLOps Now](https://mlopsnow.com) 493 | 494 | # Contributing 495 | 496 | All contributions are welcome! Please take a look at the [contribution guidelines](https://github.com/kelvins/awesome-mlops/blob/main/CONTRIBUTING.md) first.
-------------------------------------------------------------------------------- /check_order.py: --------------------------------------------------------------------------------

def main(path):
    """Check that each menu and section in the README is alphabetically sorted."""
    data = []
    with open(path, 'r') as f:
        for line in f:
            if line.startswith((' ', '* [')):
                # Collect indented table-of-contents entries and tool list items.
                data.append(line)
            elif line.startswith(('- [', '## ')):
                # A new menu or section starts: validate the entries collected so far.
                if data != sorted(data, key=str.casefold):
                    raise Exception('The content is not alphabetically sorted!')
                data = []
    # Also validate the last collected block, which has no following section to trigger the check.
    if data != sorted(data, key=str.casefold):
        raise Exception('The content is not alphabetically sorted!')


if __name__ == '__main__':
    main('README.md')

-------------------------------------------------------------------------------- /mlc_config.json: --------------------------------------------------------------------------------

{
  "ignorePatterns": [
    {
      "pattern": "https://mlopsworld.com/"
    },
    {
      "pattern": "https://www.tableau.com"
    },
    {
      "pattern": "https://sigopt.com/"
    }
  ],
  "timeout": "20s",
  "retryOn429": true,
  "retryCount": 5,
  "fallbackRetryDelay": "30s",
  "aliveStatusCodes": [0, 200]
}
--------------------------------------------------------------------------------
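Note: a minimal local smoke test for the ordering check above could look like the sketch below. The file name `test_check_order.py` and the `example.com` entries are hypothetical and not part of the repository; the sketch assumes `check_order.py` is importable from the working directory and exposes `main(path)` exactly as defined above.

```python
# test_check_order.py - hypothetical local smoke test, not part of the repository.
# It feeds a tiny README-like file to check_order.main() and verifies that a
# sorted section passes while an unsorted one raises an exception.
import os
import tempfile

from check_order import main  # main(path) as defined in check_order.py

SORTED_MD = "## Tools\n* [Alpha](https://example.com/) - A.\n* [Beta](https://example.com/) - B.\n## Next\n"
UNSORTED_MD = "## Tools\n* [Beta](https://example.com/) - B.\n* [Alpha](https://example.com/) - A.\n## Next\n"


def passes(content):
    """Return True if main() accepts the given markdown content, False if it raises."""
    with tempfile.NamedTemporaryFile("w", suffix=".md", delete=False) as f:
        f.write(content)
        path = f.name
    try:
        main(path)
        return True
    except Exception:
        return False
    finally:
        os.remove(path)


assert passes(SORTED_MD) is True
assert passes(UNSORTED_MD) is False
print("check_order smoke test passed")
```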