├── LICENSE
└── README.md

/LICENSE:
--------------------------------------------------------------------------------
1 | CC0 1.0 Universal
2 |
3 | Statement of Purpose
4 |
5 | The laws of most jurisdictions throughout the world automatically confer
6 | exclusive Copyright and Related Rights (defined below) upon the creator and
7 | subsequent owner(s) (each and all, an "owner") of an original work of
8 | authorship and/or a database (each, a "Work").
9 |
10 | Certain owners wish to permanently relinquish those rights to a Work for the
11 | purpose of contributing to a commons of creative, cultural and scientific
12 | works ("Commons") that the public can reliably and without fear of later
13 | claims of infringement build upon, modify, incorporate in other works, reuse
14 | and redistribute as freely as possible in any form whatsoever and for any
15 | purposes, including without limitation commercial purposes. These owners may
16 | contribute to the Commons to promote the ideal of a free culture and the
17 | further production of creative, cultural and scientific works, or to gain
18 | reputation or greater distribution for their Work in part through the use and
19 | efforts of others.
20 |
21 | For these and/or other purposes and motivations, and without any expectation
22 | of additional consideration or compensation, the person associating CC0 with a
23 | Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
24 | and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
25 | and publicly distribute the Work under its terms, with knowledge of his or her
26 | Copyright and Related Rights in the Work and the meaning and intended legal
27 | effect of CC0 on those rights.
28 |
29 | 1. Copyright and Related Rights. A Work made available under CC0 may be
30 | protected by copyright and related or neighboring rights ("Copyright and
31 | Related Rights").
Copyright and Related Rights include, but are not limited
32 | to, the following:
33 |
34 | i. the right to reproduce, adapt, distribute, perform, display, communicate,
35 | and translate a Work;
36 |
37 | ii. moral rights retained by the original author(s) and/or performer(s);
38 |
39 | iii. publicity and privacy rights pertaining to a person's image or likeness
40 | depicted in a Work;
41 |
42 | iv. rights protecting against unfair competition in regards to a Work,
43 | subject to the limitations in paragraph 4(a), below;
44 |
45 | v. rights protecting the extraction, dissemination, use and reuse of data in
46 | a Work;
47 |
48 | vi. database rights (such as those arising under Directive 96/9/EC of the
49 | European Parliament and of the Council of 11 March 1996 on the legal
50 | protection of databases, and under any national implementation thereof,
51 | including any amended or successor version of such directive); and
52 |
53 | vii. other similar, equivalent or corresponding rights throughout the world
54 | based on applicable law or treaty, and any national implementations thereof.
55 |
56 | 2. Waiver. To the greatest extent permitted by, but not in contravention of,
57 | applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
58 | unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
59 | and Related Rights and associated claims and causes of action, whether now
60 | known or unknown (including existing as well as future claims and causes of
61 | action), in the Work (i) in all territories worldwide, (ii) for the maximum
62 | duration provided by applicable law or treaty (including future time
63 | extensions), (iii) in any current or future medium and for any number of
64 | copies, and (iv) for any purpose whatsoever, including without limitation
65 | commercial, advertising or promotional purposes (the "Waiver").
Affirmer makes
66 | the Waiver for the benefit of each member of the public at large and to the
67 | detriment of Affirmer's heirs and successors, fully intending that such Waiver
68 | shall not be subject to revocation, rescission, cancellation, termination, or
69 | any other legal or equitable action to disrupt the quiet enjoyment of the Work
70 | by the public as contemplated by Affirmer's express Statement of Purpose.
71 |
72 | 3. Public License Fallback. Should any part of the Waiver for any reason be
73 | judged legally invalid or ineffective under applicable law, then the Waiver
74 | shall be preserved to the maximum extent permitted taking into account
75 | Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
76 | is so judged Affirmer hereby grants to each affected person a royalty-free,
77 | non transferable, non sublicensable, non exclusive, irrevocable and
78 | unconditional license to exercise Affirmer's Copyright and Related Rights in
79 | the Work (i) in all territories worldwide, (ii) for the maximum duration
80 | provided by applicable law or treaty (including future time extensions), (iii)
81 | in any current or future medium and for any number of copies, and (iv) for any
82 | purpose whatsoever, including without limitation commercial, advertising or
83 | promotional purposes (the "License"). The License shall be deemed effective as
84 | of the date CC0 was applied by Affirmer to the Work. Should any part of the
85 | License for any reason be judged legally invalid or ineffective under
86 | applicable law, such partial invalidity or ineffectiveness shall not
87 | invalidate the remainder of the License, and in such case Affirmer hereby
88 | affirms that he or she will not (i) exercise any of his or her remaining
89 | Copyright and Related Rights in the Work or (ii) assert any associated claims
90 | and causes of action with respect to the Work, in either case contrary to
91 | Affirmer's express Statement of Purpose.
92 |
93 | 4. Limitations and Disclaimers.
94 |
95 | a. No trademark or patent rights held by Affirmer are waived, abandoned,
96 | surrendered, licensed or otherwise affected by this document.
97 |
98 | b. Affirmer offers the Work as-is and makes no representations or warranties
99 | of any kind concerning the Work, express, implied, statutory or otherwise,
100 | including without limitation warranties of title, merchantability, fitness
101 | for a particular purpose, non infringement, or the absence of latent or
102 | other defects, accuracy, or the present or absence of errors, whether or not
103 | discoverable, all to the greatest extent permissible under applicable law.
104 |
105 | c. Affirmer disclaims responsibility for clearing rights of other persons
106 | that may apply to the Work or any use thereof, including without limitation
107 | any person's Copyright and Related Rights in the Work. Further, Affirmer
108 | disclaims responsibility for obtaining any necessary consents, permissions
109 | or other rights required for any use of the Work.
110 |
111 | d. Affirmer understands and acknowledges that Creative Commons is not a
112 | party to this document and has no duty or obligation with respect to this
113 | CC0 or use of the Work.
114 |
115 | For more information, please see
116 |
117 |

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 | # Awesome Python Machine Learning
3 |
4 | A curated list of awesome *active* Python machine learning frameworks, tools, and related resources.
5 |
6 | This is a living document. If you have any additions, please do not hesitate to open a pull request or contact me.
7 |
8 | To count as an *active* library on this list, a framework must have a commit no older than a year.
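The activity rule above can be checked mechanically. The following is only a minimal sketch, assuming that "no older than a year" means 365 days and that the date of a repository's latest commit is already known; the function name `is_active` and the sample dates are hypothetical and not part of this list's tooling.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: "no older than a year" is read as 365 days;
# the list itself does not define the window more precisely.
MAX_COMMIT_AGE = timedelta(days=365)


def is_active(last_commit: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if the most recent commit is at most one year old."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_commit <= MAX_COMMIT_AGE


# Hypothetical commit dates, for illustration only.
reference = datetime(2020, 6, 1, tzinfo=timezone.utc)
print(is_active(datetime(2020, 1, 15, tzinfo=timezone.utc), now=reference))  # True
print(is_active(datetime(2018, 3, 2, tzinfo=timezone.utc), now=reference))   # False
```

Passing `now` explicitly keeps the check reproducible; in practice the commit date would have to come from the repository host or a local clone.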
9 |
10 | For a list of machine learning frameworks in more languages, please see the excellent list [awesome-machine-learning](https://github.com/josephmisiti/awesome-machine-learning).
11 |
12 |
13 | # Machine learning libraries
14 | - [Scikit-Learn](https://github.com/scikit-learn/scikit-learn) - A general-purpose ML library that implements most common algorithms and metrics.
15 | - [Dask-ML](https://github.com/dask/dask-ml) - Dask-ML provides scalable machine learning in Python using Dask alongside popular machine learning libraries like Scikit-Learn.
16 | - [XGBoost](https://github.com/dmlc/xgboost) - XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable.
17 | - [PyTorch](https://github.com/pytorch/pytorch) - Tensors and dynamic neural networks in Python with strong GPU acceleration.
18 | - [Metric learn](https://github.com/metric-learn/metric-learn) - An ML library for metric learning.
19 | - [TensorFlow](https://github.com/tensorflow/tensorflow) - TensorFlow is an open source software library for numerical computation using data flow graphs.
20 | - [Keras](https://github.com/keras-team/keras) - Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.
21 | - [Imbalanced-learn](https://github.com/scikit-learn-contrib/imbalanced-learn) - imbalanced-learn is a Python package offering a number of re-sampling techniques commonly used in datasets showing strong between-class imbalance.
22 | - [Caffe](https://github.com/BVLC/caffe) - Caffe is a deep learning framework made with expression, speed, and modularity in mind.
23 | - [Annoy](https://github.com/spotify/annoy) - Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point.
24 | - [PySpark](https://github.com/apache/spark/tree/master/python) - Spark is a fast and general cluster computing system for Big Data.
25 | - [Orange](https://github.com/biolab/orange3) - Orange is component-based data mining software. It includes a range of data visualization, exploration, preprocessing and modeling techniques.
26 | - [TPOT](https://github.com/EpistasisLab/tpot) - Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
27 | - [pgmpy](https://github.com/pgmpy/pgmpy) - pgmpy is a Python library for working with Probabilistic Graphical Models.
28 | - [Apache MXNET](https://github.com/apache/incubator-mxnet) - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more.
29 | - [Shogun](https://github.com/shogun-toolbox/shogun) - The SHOGUN machine learning toolbox. Unified and efficient Machine Learning since 1999.
30 | - [CNTK](https://github.com/Microsoft/CNTK) - The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes neural networks as a series of computational steps via a directed graph.
31 | - [PyOD](https://github.com/yzhao062/pyod) - PyOD is a comprehensive and scalable Python toolkit for detecting outlying objects in multivariate data.
32 | - [LightGBM](https://github.com/Microsoft/LightGBM) - LightGBM is a gradient boosting framework that uses tree based learning algorithms.
33 | - [CatBoost](https://github.com/catboost/catboost) - CatBoost is a machine learning method based on gradient boosting over decision trees.
34 | - [auto_ml](https://github.com/ClimbsRocks/auto_ml) - Automated machine learning for production and analytics.
35 | - [Apache Singa](https://github.com/apache/incubator-singa) - Distributed deep learning system.
36 | - [SimpleAI](https://github.com/simpleai-team/simpleai) - This lib implements many of the artificial intelligence algorithms described in the book "Artificial Intelligence: A Modern Approach" by Stuart Russell and Peter Norvig.
37 | - [astroML](https://github.com/astroML/astroML) - Machine learning, statistics, and data mining for astronomy and astrophysics.
38 | - [Turi Create](https://github.com/apple/turicreate) - Turi Create simplifies the development of custom machine learning models. You don't have to be a machine learning expert to add recommendations, object detection, image classification, image similarity or activity classification to your app.
39 | - [NuPIC](https://github.com/numenta/nupic) - The Numenta Platform for Intelligent Computing (NuPIC) is a machine intelligence platform that implements the HTM learning algorithms. HTM is a detailed computational theory of the neocortex.
40 | - [Lasagne](https://github.com/Lasagne/Lasagne) - Lasagne is a lightweight library to build and train neural networks in Theano.
41 | - [Chainer](https://github.com/chainer/chainer) - Chainer is a Python-based deep learning framework aiming at flexibility.
42 | - [Prophet](https://github.com/facebook/prophet) - Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects.
43 | - [Surprise](https://github.com/NicolasHug/Surprise) - Surprise is a Python scikit for building and analyzing recommender systems.
44 | - [nilearn](https://github.com/nilearn/nilearn) - Nilearn is a Python module for fast and easy statistical learning on NeuroImaging data.
45 | - [neuropredict](https://github.com/raamana/neuropredict) - Easy and comprehensive assessment of predictive power, with support for neuroimaging features.
46 | - [pyhsmm](https://github.com/mattjj/pyhsmm) - This is a Python library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian Nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.
47 | - [SKLL](https://github.com/EducationalTestingService/skll) - This Python package provides command-line utilities to make it easier to run machine learning experiments with scikit-learn.
48 | - [neurolab](https://github.com/zueve/neurolab) - Neurolab is a simple and powerful Neural Network Library for Python. It contains basic neural networks, training algorithms, and a flexible framework for creating and exploring other neural network types.
49 | - [pomegranate](https://github.com/jmschrei/pomegranate) - pomegranate is a package for probabilistic models in Python that is implemented in Cython for speed.
50 | - [deap](https://github.com/deap/deap) - DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas.
51 | - [mlxtend](https://github.com/rasbt/mlxtend) - Mlxtend (machine learning extensions) is a Python library of useful tools for day-to-day data science tasks.
52 | - [scikit-fuzzy](https://github.com/scikit-fuzzy/scikit-fuzzy) - scikit-fuzzy is a fuzzy logic toolkit for SciPy.
53 | - [fylearn](https://github.com/sorend/fylearn) - FyLearn is a fuzzy machine learning library, built on top of scikit-learn.
54 | - [tflearn](https://github.com/tflearn/tflearn) - TFlearn is a modular and transparent deep learning library built on top of TensorFlow.
55 | - [Regularized Greedy Forest](https://github.com/RGF-team/rgf) - Regularized Greedy Forest (RGF) is a tree ensemble machine learning method.
56 | - [fuku-ml](https://github.com/fukuball/fuku-ml) - Simple machine learning library.
57 | - [Edward](https://github.com/blei-lab/edward) - Edward is a Python library for probabilistic modeling, inference, and criticism.
58 | - [stacked_generalization](https://github.com/fukatani/stacked_generalization) - Library for stacked generalization (stacking) in machine learning.
59 | - [modAL](https://github.com/modAL-python/modAL) - modAL is an active learning framework for Python3, designed with modularity, flexibility and extensibility in mind.
60 | - [neonrvm](https://github.com/siavashserver/neonrvm) - neonrvm is an experimental open source machine learning library for performing regression tasks using the RVM technique.
61 | - [xLearn](https://github.com/aksnzhy/xlearn) - xLearn is a high performance, easy-to-use, and scalable machine learning package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM), which can be used to solve large-scale machine learning problems.
62 | - [ml-ens](https://github.com/flennerhag/mlens) - A Python library for high performance ensemble learning.
63 | - [mindsdb](https://github.com/mindsdb/mindsdb) - MindsDB's goal is to make it very simple for developers to use the power of artificial neural networks in their projects.
64 | - [Mars](https://github.com/mars-project/mars) - Mars is a tensor-based unified framework for large-scale data computation.
65 | - [Hyperopt-sklearn](https://github.com/hyperopt/hyperopt-sklearn) - Hyper-parameter optimization for sklearn.
66 | - [H2O](https://github.com/h2oai/h2o-3) - H2O is an in-memory platform for distributed, scalable machine learning.
67 | - [seglearn](https://github.com/dmbee/seglearn) - Python module for machine learning time series.
68 | - [pycobra](https://github.com/bhargavvader/pycobra) - Python library implementing ensemble methods for regression and classification, plus visualisation tools including Voronoi tessellations.
69 | - [scikit-multilearn](https://github.com/scikit-multilearn/scikit-multilearn) - A scikit-learn based module for multi-label and related classification tasks.
70 | - [auto-sklearn](https://github.com/automl/auto-sklearn) - auto-sklearn is an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
71 | - [skits](https://github.com/ethanrosenthal/skits) - A library for SciKit-learn-Inspired Time Series models.
72 | - [tsfresh](https://github.com/blue-yonder/tsfresh) - Automatic extraction of relevant features from time series.
73 | - [pyqlearning](https://github.com/chimera0/accel-brain-code/tree/master/Reinforcement-Learning) - pyqlearning is a Python library to implement Reinforcement Learning and Deep Reinforcement Learning.
74 | - [keras-rl](https://github.com/keras-rl/keras-rl) - Deep Reinforcement Learning for Keras.
75 | - [mushroom-rl](https://github.com/MushroomRL/mushroom-rl) - Python library for Reinforcement Learning experiments.
76 | - [chainerrl](https://github.com/chainer/chainerrl) - ChainerRL is a deep reinforcement learning library built on top of Chainer.
77 | - [tensorforce](https://github.com/tensorforce/tensorforce) - Tensorforce: a TensorFlow library for applied reinforcement learning.
78 | - [Determined](https://github.com/determined-ai/determined) - Deep learning training platform with integrated support for distributed training, hyperparameter tuning, smart GPU scheduling, experiment tracking, and a model registry.
79 |
80 |
81 | # Data processing
82 | - [NumPy](https://github.com/numpy/numpy) - NumPy is the fundamental package needed for scientific computing with Python.
83 | - [Pandas](https://github.com/pandas-dev/pandas) - pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive.
84 | - [Modin](https://github.com/modin-project/modin) - Modin: Speed up your Pandas workflows by changing a single line of code.
85 | - [dfply](https://github.com/kieferk/dfply) - The dfply package makes it possible to do R's dplyr-style data manipulation with pipes in Python on pandas DataFrames.
86 | - [xlwings](https://github.com/ZoomerAnalytics/xlwings) - xlwings is a BSD-licensed Python library that makes it easy to call Python from Excel and vice versa.
87 | - [pyflux](https://github.com/rjt1990/pyflux) - Open source time series library for Python.
88 | - [petl](https://github.com/petl-developers/petl) - Python Extract Transform and Load Tables of Data.
89 | - [pypeln](https://github.com/cgarciae/pypeln) - Concurrent data pipelines made easy.
90 | - [botflow](https://github.com/kkyon/botflow) - Python Fast Dataflow programming framework for Data pipeline work.
91 | - [Great Expectations](https://github.com/great-expectations/great_expectations) - Great Expectations is a framework that helps teams save time and promote analytic integrity with a new twist on automated testing: pipeline tests.
92 | - [pandera](https://github.com/cosmicBboy/pandera) - Validating pandas data structures for people seeking correct things.
93 | - [pyjanitor](https://github.com/ericmjl/pyjanitor) - Clean APIs for data cleaning. Python implementation of R package Janitor.
94 | - [PandasSchema](https://github.com/TMiguelT/PandasSchema) - A validation library for Pandas data frames using user-friendly schemas.
95 | - [engarde](https://github.com/TomAugspurger/engarde) - A library for defensive data analysis.
96 | - [sklearn-pandas](https://github.com/scikit-learn-contrib/sklearn-pandas) - Pandas integration with sklearn.
97 | - [Blaze](https://github.com/blaze/blaze) - Blaze translates a subset of modified NumPy and Pandas-like syntax to databases and other computing systems.
98 | - [scikit-datasets](https://github.com/daviddiazvico/scikit-datasets) - Scikit-learn-compatible datasets.
99 |
100 |
101 | # Statistics libraries
102 | - [SciPy](https://github.com/scipy/scipy) - SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.
103 | - [Statsmodels](https://github.com/statsmodels/statsmodels) - Statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation and inference for statistical models.
104 | - [pymc3](https://github.com/pymc-devs/pymc3) - PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms.
105 | - [sympy](https://github.com/sympy/sympy) - A Python library for symbolic mathematics.
106 | - [pmdarima](https://github.com/tgsmith61591/pmdarima) - A package that brings R's beloved auto.arima to Python, making an even stronger case for why Python > R for data science.
107 | - [scikit-posthocs](https://github.com/maximtrp/scikit-posthocs) - Pairwise multiple comparisons (post hoc) tests in Python.
108 |
109 | # Explaining
110 | - [Lime](https://github.com/marcotcr/lime) - Lime: Explaining the predictions of any machine learning classifier.
111 | - [eli5](https://github.com/TeamHG-Memex/eli5) - ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions.
112 | - [SHAP](https://github.com/slundberg/shap) - SHAP (SHapley Additive exPlanations) is a unified approach to explain the output of any machine learning model.
113 | - [LOFO](https://github.com/aerdem4/lofo-importance) - LOFO (Leave One Feature Out) Importance calculates the importances of a set of features.
114 |
115 |
116 |
117 | # Visualisation libraries
118 | - [Matplotlib](https://github.com/matplotlib/matplotlib) - Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.
119 | - [Seaborn](https://github.com/mwaskom/seaborn) - Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.
120 | - [Bokeh](https://github.com/bokeh/bokeh) - Bokeh is an interactive visualization library for Python that enables beautiful and meaningful visual presentation of data in modern web browsers. With Bokeh, you can quickly and easily create interactive plots, dashboards, and data applications.
121 | - [plotly.py](https://github.com/plotly/plotly.py) - plotly.py is an interactive, open-source, and browser-based graphing library for Python.
122 | - [scikit-plot](https://github.com/reiinakano/scikit-plot) - An intuitive library to add plotting functionality to scikit-learn objects.
123 | - [plotnine](https://github.com/has2k1/plotnine) - plotnine is an implementation of a grammar of graphics in Python, based on ggplot2.
124 | - [Cufflinks](https://github.com/santosjorge/cufflinks) - This library binds the power of plotly with the flexibility of pandas for easy plotting.
125 | - [Chartpy](https://github.com/cuemacro/chartpy) - Easy to use Python API wrapper to plot charts with matplotlib, plotly, bokeh and more.
126 | - [Vispy](https://github.com/vispy/vispy) - VisPy is a high-performance interactive 2D/3D data visualization library.
127 | - [pycm](https://github.com/sepandhaghighi/pycm) - Multi-class confusion matrix library in Python.
128 | - [Altair-Catplot](https://github.com/justinbois/altair-catplot) - Utility to generate plots with categorical variables using Altair.
129 | - [pdvega](https://github.com/altair-viz/pdvega) - Interactive plotting for Pandas using Vega-Lite.
130 | - [folium](https://github.com/python-visualization/folium) - Python Data. Leaflet.js Maps.
131 | - [jmpy](https://github.com/beltashazzer/jmpy) - Quick plotting and data visualization of pandas and numpy data.
132 | - [missingno](https://github.com/ResidentMario/missingno) - Missing data visualization module for Python.
133 | - [Yellowbrick](https://github.com/districtdatalabs/yellowbrick) - Visual analysis and diagnostic tools to facilitate machine learning model selection.
134 | - [netron](https://github.com/lutzroeder/netron) - Netron is a viewer for neural network, deep learning and machine learning models.
135 | - [PrettyPandas](https://github.com/HHammond/PrettyPandas) - PrettyPandas is a Pandas DataFrame Styler class that helps you create report quality tables with a simple API.
136 |
137 |
138 |
139 |
140 | # Text processing/NLP
141 | - [gensim](https://github.com/rare-technologies/gensim) - Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community.
142 |
143 |
144 |
145 | # Tooling
146 | - [Numba](https://github.com/numba/numba) - A Just-In-Time Compiler for Numerical Functions in Python.
147 | - [Jupyter Notebook](https://github.com/jupyter/notebook) - A rich explorative data analysis tool.
148 | - [boto3](https://github.com/boto/boto3) - AWS SDK for Python.
149 | - [PennAI](https://github.com/EpistasisLab/pennai) - PennAI is an easy-to-use data science assistant. It allows researchers without machine learning or coding expertise to run supervised machine learning analysis through a clean web interface.
150 |
151 |
152 | # Wrappers
153 | - [BigML Python Bindings](https://github.com/bigmlcom/python) - These BigML Python bindings allow you to interact with BigML.io, the API for BigML.
You can use it to easily create, retrieve, list, update, and delete BigML resources (i.e., sources, datasets, models, and predictions).
154 | - [python-timbl](https://github.com/proycon/python-timbl) - python-timbl is a Python extension module wrapping the full TiMBL C++ programming interface. With this module, all functionality exposed through the C++ interface is also available to Python scripts. Being able to access the API from Python greatly facilitates prototyping TiMBL-based applications.
155 | - [thampi](https://github.com/scoremedia/thampi) - thampi creates a machine learning prediction server on AWS Lambda.
156 | - [MLPACK](https://github.com/mlpack/mlpack) - mlpack: a scalable C++ machine learning library (with Python bindings).
157 | - [PyStan](https://github.com/stan-dev/pystan) - PyStan provides a Python interface to Stan, a package for Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo.
158 |
159 | # Unsorted
160 | - [pm4py](https://github.com/pm4py/pm4py-source) - PM4Py is a Python library that supports state-of-the-art process mining algorithms.
161 | - [Optimus](https://github.com/ironmussa/Optimus) - Optimus is the missing framework to profile, clean, process and do ML in a distributed fashion using Apache Spark (PySpark).
162 | - [impyute](https://github.com/eltonlaw/impyute) - Impyute is a library of missing data imputation algorithms.
163 | - [Stairs](https://github.com/electronick1/stairs) - Framework which helps you make parallel/distributed calculations using data pipelines.
164 | - [fastText](https://github.com/facebookresearch/fastText) - Library for fast text representation and classification.
165 | - [pendulum](https://github.com/sdispater/pendulum) - Python datetimes made easy.
166 | - [loguru](https://github.com/Delgan/loguru) - Python logging made (stupidly) simple.
167 |
--------------------------------------------------------------------------------