# Feature selection (variable selection)
> Feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction ([Wikipedia](https://en.wikipedia.org/wiki/Feature_selection))

Why feature selection?
1. Data exploration
2. Curse of dimensionality
3. Fewer features, faster models
4. Better metrics

- **Overview**
  - [An Introduction to Variable and Feature Selection](http://jmlr.csail.mit.edu/papers/volume3/guyon03a/guyon03a.pdf) (2003) *Isabelle Guyon, Andre Elisseeff*
  - [A Survey on Feature Selection](https://www.sciencedirect.com/science/article/pii/S1877050916313047) (2016) *Jianyu Miao, Lingfeng Niu*
  - [Feature Selection: A Data Perspective](https://arxiv.org/pdf/1601.07996.pdf) (2016) *Jundong Li, Kewei Cheng, Suhang Wang, Fred Morstatter, Robert P. Trevino, Jiliang Tang, Huan Liu*
  - [Feature Selection and Feature Extraction in Pattern Analysis: A Literature Review](https://arxiv.org/pdf/1905.02845.pdf) (2019) *Benyamin Ghojogh, Maria N. Samad, Sayema Asif Mashhadi, Tania Kapoor, Wahab Ali, Fakhri Karray, Mark Crowley*
- **All-relevant vs minimal-optimal feature selection**
  - [Consistent Feature Selection for Pattern Recognition in Polynomial Time](http://jmlr.csail.mit.edu/papers/volume8/nilsson07a/nilsson07a.pdf) (2007) *R. Nilsson, J. M. Peña, J. Björkegren, J. Tegnér*

### Filter methods
Filter methods use model-free ranking to filter out less relevant features (see the sklearn sketch after this list).
- **Missing Values Ratio**
  - Removing features whose ratio of missing values exceeds some threshold
- **Low Variance Filter** ([sklearn](https://scikit-learn.org/stable/modules/feature_selection.html#removing-features-with-low-variance))
  - Removing features with a variance lower than some threshold
- **Correlation** ([Wiki](https://en.wikipedia.org/wiki/Correlation_and_dependence))
- **χ²** Chi-squared statistic for categorical features ([Wiki](https://en.wikipedia.org/wiki/Chi-squared_test), [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.chi2.html))
- **ANOVA** F-value for quantitative features ([Wiki](https://en.wikipedia.org/wiki/Analysis_of_variance), [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.f_classif.html))
- **Mutual information** ([Wiki](https://en.wikipedia.org/wiki/Mutual_information))
  - [Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection](http://jmlr.csail.mit.edu/papers/volume13/brown12a/brown12a.pdf) (2012) *Gavin Brown, Adam Pocock, Ming-Jie Zhao, Mikel Lujan*
  - [Feature Selection Based on Joint Mutual Information](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.4424) (1999) *Howard Hua Yang, John Moody*
  - [Estimating mutual information](https://arxiv.org/pdf/cond-mat/0305641.pdf) (2003) *Alexander Kraskov, Harald Stoegbauer, Peter Grassberger*
- **mRMR** Minimum redundancy, maximum relevance ([Link](http://home.penglab.com/proj/mRMR/), [Wiki](https://en.wikipedia.org/wiki/Minimum_redundancy_feature_selection))
  - [Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy](http://home.penglab.com/papersall/docpdf/2005_TPAMI_FeaSel.pdf) (2005) *Hanchuan Peng, Fuhui Long, Chris Ding*
- **Relief** ([Wiki](https://en.wikipedia.org/wiki/Relief_(feature_selection)))
  - [The Feature Selection Problem: Traditional Methods and a New Algorithm](https://www.aaai.org/Papers/AAAI/1992/AAAI92-020.pdf) (1992) *Kenji Kira, Larry Rendell*
  - [Relief-Based Feature Selection: Introduction and Review](https://arxiv.org/pdf/1711.08421.pdf) (2018) *Ryan J. Urbanowicz, Melissa Meeker, William LaCava, Randal S. Olson, Jason H. Moore*
- **Markov Blanket** ([Wiki](https://en.wikipedia.org/wiki/Markov_blanket))
  - [Markov Blanket based Feature Selection: A Review of Past Decade](http://www.iaeng.org/publication/WCE2010/WCE2010_pp321-328.pdf) (2010) *Shunkai Fu, Michel C. Desmarais*
  - Incremental Association Markov Blanket: [Algorithms for Large Scale Markov Blanket Discovery](https://www.aaai.org/Papers/FLAIRS/2003/Flairs03-073.pdf) (2003) *Ioannis Tsamardinos, Constantin F. Aliferis, Alexander Statnikov*
  - Grow-Shrink algorithm: [Bayesian Network Induction via Local Neighborhoods](http://robots.stanford.edu/papers/Margaritis99a.pdf) (2000) *Dimitris Margaritis, Sebastian Thrun*
  - Koller-Sahami method: [Toward Optimal Feature Selection](http://ilpubs.stanford.edu:8090/208/1/1996-77.pdf) (1996) *Daphne Koller, Mehran Sahami*
  - Max-Min Markov Blanket: [Time and Sample Efficient Discovery of Markov Blankets and Direct Causal Relations](https://dl.acm.org/doi/10.1145/956750.956838) (2003) *Ioannis Tsamardinos, Constantin F. Aliferis, Alexander Statnikov*
- **Fast Correlation-based Filter**
  - [Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution](https://www.public.asu.edu/~huanliu/papers/icml03.pdf) (2003) *Lei Yu, Huan Liu*
- **CBF** Consistency-Based Filters
  - [Consistency-based search in feature selection](https://www.public.asu.edu/~huanliu/papers/aij03.pdf) (2003) *Manoranjan Dash, Huan Liu*
- **Interact**
  - [Searching for Interacting Features](https://www.public.asu.edu/~huanliu/papers/ijcai07.pdf) (2007) *Zheng Zhao, Huan Liu*
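
Most of the univariate filters above are available in scikit-learn. The sketch below is illustrative only; the synthetic dataset, the variance threshold, and `k` are arbitrary assumptions, not recommendations.

```python
# Minimal filter-method sketch with scikit-learn (synthetic data, arbitrary thresholds).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import (VarianceThreshold, SelectKBest,
                                        f_classif, mutual_info_classif)

# Hypothetical dataset: 500 samples, 20 features, 5 of them informative.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           n_redundant=2, random_state=0)

# 1. Low Variance Filter: drop near-constant features.
X_var = VarianceThreshold(threshold=0.01).fit_transform(X)

# 2. ANOVA F-value ranking (chi2 would require non-negative features, e.g. counts).
f_scores, _ = f_classif(X_var, y)

# 3. Mutual information ranking (model-free, also captures non-linear dependence).
mi_scores = mutual_info_classif(X_var, y, random_state=0)

# Keep the k best features according to mutual information.
X_selected = SelectKBest(score_func=mutual_info_classif, k=5).fit_transform(X_var, y)
print("ANOVA F top features:", np.argsort(f_scores)[::-1][:5])
print("MI top features:     ", np.argsort(mi_scores)[::-1][:5])
print("Selected shape:", X_selected.shape)
```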
### Wrapper methods
Wrapper methods use a model and its performance to find the best feature subset (see the sketch after this list).
- **SFS** Sequential Feature Selection
- **SFFS** Sequential Floating Forward Selection
  - [Floating search methods in feature selection](https://www.academia.edu/15425286/Floating_search_methods_in_feature_selection) (1994) *Pavel Pudil, Josef Kittler, Jana Novovicová*
  - [Adaptive floating search methods in feature selection](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.5032) (1999) *P. Somol, Pavel Pudil, Jana Novovicova, P. Paclik*
- **Genetic algorithm** ([Wiki](https://en.wikipedia.org/wiki/Genetic_algorithm))
- **PSO** Particle Swarm Optimization ([Wiki](https://en.wikipedia.org/wiki/Particle_swarm_optimization))
  - [Particle Swarm Optimization](https://www.cs.tufts.edu/comp/150GA/homeworks/hw3/_reading6%201995%20particle%20swarming.pdf) (1995) *James Kennedy, Russell Eberhart*
  - [Feature Selection using PSO-SVM](http://www.iaeng.org/IJCS/issues_v33/issue_1/IJCS_33_1_18.pdf) (2007) *Chung-Jui Tu, Li-Yeh Chuang, Jun-Yang Chang, Cheng-Hong Yang*
- **Boruta** All-relevant feature selection ([CRAN](https://cran.r-project.org/web/packages/Boruta/), [PyPI](https://pypi.org/project/Boruta/))
  - [Boruta – A System for Feature Selection](https://www.mimuw.edu.pl/~ajank/papers/Kursa2010.pdf) (2010) *Miron B. Kursa, Aleksander Jankowski, Witold R. Rudnicki*
  - BoostARoota - Boruta with XGBoost as a base model ([Code](https://github.com/chasedehan/BoostARoota))
- **MUVR** ([GitLab](https://gitlab.com/CarlBrunius/MUVR))
  - [Variable selection and validation in multivariate modelling](https://academic.oup.com/bioinformatics/article/35/6/972/5085367) (2018) *Lin Shi, Johan A Westerhuis, Johan Rosén, Rikard Landberg, Carl Brunius*
- Wrapper methods and overfitting:
  - [Wrappers for feature subset selection](http://machine-learning.martinsewell.com/feature-selection/KohaviJohn1997.pdf) (1997) *Ron Kohavi, George H. John*
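
A greedy sequential forward selection wrapper can be sketched with scikit-learn's `SequentialFeatureSelector`; the estimator, the synthetic dataset, and the number of selected features below are illustrative assumptions only.

```python
# Wrapper-method sketch: greedy forward selection scored by cross-validation.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Hypothetical dataset: 400 samples, 15 features, 4 informative.
X, y = make_classification(n_samples=400, n_features=15, n_informative=4,
                           random_state=0)

# The wrapper refits the model for every candidate subset, so it is far
# costlier than a filter but tailored to the chosen estimator.
sfs = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=4,      # assumed target size; could also be tuned
    direction="forward",         # "backward" gives sequential backward selection
    scoring="accuracy",
    cv=5,
)
sfs.fit(X, y)
print("Selected feature indices:", sfs.get_support(indices=True))
print("Reduced data shape:", sfs.transform(X).shape)
```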
### Embedded methods
Embedded methods perform feature selection as part of fitting the model itself (see the sketch after this list).
- **LASSO**
  - [Regression Shrinkage and Selection via the lasso](https://statweb.stanford.edu/~tibs/lasso/lasso.pdf) (1996) *Robert Tibshirani*
- **Elastic net**
  - [Regularization and variable selection via the elastic net](https://web.stanford.edu/~hastie/Papers/B67.2%20(2005)%20301-320%20Zou%20&%20Hastie.pdf) (2005) *Hui Zou, Trevor Hastie*
- **Spike and Slab regression** ([Wiki](https://en.wikipedia.org/wiki/Spike-and-slab_regression))
  - Bayesian variable selection in linear regression (1988) *T.J. Mitchell, J.J. Beauchamp*
  - [Approaches for Bayesian variable selection](http://www-stat.wharton.upenn.edu/~edgeorge/Research_papers/GeorgeMcCulloch97.pdf) (1997) *Edward I. George, Robert E. McCulloch*
- **Decision Tree** ([Wiki](https://en.wikipedia.org/wiki/Decision_tree))
- **Random Forest** ([Wiki](https://en.wikipedia.org/wiki/Random_forest))
  - [Random Forests](https://link.springer.com/article/10.1023/A:1010933404324) (2001) *Leo Breiman*
  - [Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics](https://epub.ub.uni-muenchen.de/13766/1/TR.pdf) (2012) *Anne-Laure Boulesteix, Silke Janitza, Jochen Kruppa, Inke R. König*
  - [Variable selection using random forests](https://hal.archives-ouvertes.fr/hal-00755489/file/PRLv4.pdf) (2010) *Robin Genuer, Jean-Michel Poggi, Christine Tuleau-Malot*
  - [Bias in random forest variable importance measures: Illustrations, sources and a solution](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1796903/pdf/1471-2105-8-25.pdf) (2007) *Carolin Strobl, Anne-Laure Boulesteix, Achim Zeileis, Torsten Hothorn*
  - [Conditional Variable Importance for Random Forests](https://epub.ub.uni-muenchen.de/2821/1/deck.pdf) (2008) *Carolin Strobl, Anne-Laure Boulesteix, Thomas Kneib, Thomas Augustin, Achim Zeileis*
  - [Correlation and variable importance in random forests](https://arxiv.org/pdf/1310.5726.pdf) (2016) *Baptiste Gregorutti, Bertrand Michel, Philippe Saint-Pierre*
- **Gradient Boosting** ([Wiki](https://en.wikipedia.org/wiki/Gradient_boosting))
  - [Greedy Function Approximation: A Gradient Boosting Machine](https://statweb.stanford.edu/~jhf/ftp/trebst.pdf) (1999) *Jerome H. Friedman*
  - [Boosting Algorithms as Gradient Descent](http://papers.nips.cc/paper/1766-boosting-algorithms-as-gradient-descent.pdf) (1999) *Llew Mason, Jonathan Baxter, Peter Bartlett, Marcus Frean*
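
A minimal sketch of embedded selection with scikit-learn, using an L1-penalized linear model and random-forest importances on a synthetic regression task; the dataset size, noise level, and coefficient threshold are illustrative assumptions.

```python
# Embedded-method sketch: LASSO coefficients and random-forest importances.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV

# Hypothetical regression data: 300 samples, 30 features, 8 informative.
X, y = make_regression(n_samples=300, n_features=30, n_informative=8,
                       noise=5.0, random_state=0)

# LASSO: features with non-zero coefficients are kept (alpha chosen by CV).
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
lasso_mask = np.abs(lasso.coef_) > 1e-8
print("LASSO keeps", lasso_mask.sum(), "features")

# Random forest: rank features by impurity-based importances
# (see Strobl et al. above for caveats about biased importance measures).
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
print("Top forest features:", np.argsort(forest.feature_importances_)[::-1][:8])

# SelectFromModel wraps either estimator as a reusable transformer step.
X_reduced = SelectFromModel(LassoCV(cv=5, random_state=0)).fit_transform(X, y)
print("Reduced shape:", X_reduced.shape)
```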
### Unsupervised and semi-supervised feature selection
- **FSSEM** Feature Subset Selection using Expectation-Maximization
  - [Feature Selection for Unsupervised Learning](http://www.jmlr.org/papers/volume5/dy04a/dy04a.pdf) (2004) *Jennifer G. Dy, Carla E. Brodley*
- **Laplacian Score**
  - Scores features by how well they preserve local structure in a nearest-neighbor graph
  - [Laplacian Score for Feature Selection](https://papers.nips.cc/paper/2909-laplacian-score-for-feature-selection.pdf) (2005) *Xiaofei He, Deng Cai, Partha Niyogi*
- **Principal Feature Analysis**
  - [Feature Selection Using Principal Feature Analysis](http://venom.cs.utsa.edu/dmz/techrep/2007/CS-TR-2007-011.pdf) (2007) *Yijuan Lu, Ira Cohen, Xiang Sean Zhou, Qi Tian*
- **Spectral Feature Selection**
  - Separates samples into clusters using the spectrum of a pairwise similarity graph
  - [Spectral Feature Selection for Supervised and Unsupervised Learning](https://www.public.asu.edu/~huanliu/papers/icml07.pdf) (2007) *Zheng Zhao, Huan Liu*
- **MCFS** Multi-cluster Feature Selection
  - [Unsupervised Feature Selection for Multi-Cluster Data](https://wwwx.cs.unc.edu/Courses/comp790-090-s11/Presentations/p333-cai.pdf) (2010) *Deng Cai, Chiyuan Zhang, Xiaofei He*
- **Autoencoders** ([Wiki](https://en.wikipedia.org/wiki/Autoencoder))
  - [Autoencoders, Unsupervised Learning, and Deep Architectures](http://proceedings.mlr.press/v27/baldi12a/baldi12a.pdf) (2012) *Pierre Baldi*
  - [An Introduction to Variational Autoencoders](https://arxiv.org/pdf/1906.02691.pdf) (2019) *Diederik P. Kingma, Max Welling*
  - [Concrete Autoencoders for Differentiable Feature Selection and Reconstruction](https://arxiv.org/pdf/1901.09346.pdf) (2019) *Abubakar Abid, Muhammad Fatih Balin, James Zou*

### Stable feature selection
- [Stability of Feature Selection Algorithms: a study on high dimensional spaces](http://cui.unige.ch/~kalousis/papers/stability/KalousisPradosHilarioKIS2007.pdf) (2007) *Alexandros Kalousis, Julien Prados, Melanie Hilario*
- [Robust Feature Selection Using Ensemble Feature Selection Techniques](http://bioinformatics.psb.ugent.be/pdf/publications/978-3-540-87481-2.pdf) (2008) *Yvan Saeys, Thomas Abeel, Yves Van de Peer*
- [Stability Selection](https://stat.ethz.ch/~nicolai/stability.pdf) (2009) *Nicolai Meinshausen, Peter Bühlmann*
- [A Novel Weighted Combination Method for Feature Selection using Fuzzy Sets](https://arxiv.org/pdf/2005.05003.pdf) (2020) *Zixiao Shen, Xin Chen, Jonathan M. Garibaldi*
- Stability of MDA, LIME and SHAP: [The best way to select features](https://arxiv.org/pdf/2005.12483.pdf) (2020) *Xin Man, Ernest P. Chan*

### Domain-specific
- **Uplift models**
  - [Feature Selection Methods for Uplift Modeling](https://arxiv.org/pdf/2005.03447.pdf) (2020) *Zhenyu Zhao, Yumin Zhang, Totte Harinen, Mike Yung*

### Meta feature selection
- [A Feature Subset Selection Algorithm Automatic Recommendation Method](https://arxiv.org/pdf/1402.0570.pdf) (2013) *Guangtao Wang, Qinbao Song, Heli Sun, Xueying Zhang, Baowen Xu, Yuming Zhou*
- [Metalearning for Choosing Feature Selection Algorithms in Data Mining: Proposal of a New Framework](https://www.researchgate.net/profile/Antonio_Parmezan/publication/312482443_Metalearning_for_Choosing_Feature_Selection_Algorithms_in_Data_Mining_Proposal_of_a_New_Framework/links/5c3f0e7f92851c22a378a5a6/Metalearning-for-Choosing-Feature-Selection-Algorithms-in-Data-Mining-Proposal-of-a-New-Framework.pdf) (2017) *Antonio Rafael Sabino Parmezan, Huei Diana Lee*
- [A Novel Meta Learning Framework for Feature Selection using Data Synthesis and Fuzzy Similarity](https://arxiv.org/pdf/2005.09856.pdf) (2020) *Zixiao Shen, Xin Chen, Jonathan M. Garibaldi*

### Packages
- **R**
  - Package: fscaret ([CRAN](https://cran.r-project.org/web/packages/fscaret/)) *Jakub Szlek*
  - Package: praznik ([Code](https://gitlab.com/mbq/praznik)) *Miron Kursa*
  - Package: FSinR ([CRAN](https://cran.r-project.org/web/packages/FSinR/), [Paper](https://arxiv.org/pdf/2002.10330.pdf)) *Francisco Aragón-Royón, Alfonso Jiménez-Vílchez, Antonio Arauzo-Azofra, José Manuel Benítez*
  - Package: VSURF ([CRAN](https://cran.r-project.org/web/packages/VSURF/), [Paper](https://journal.r-project.org/archive/2015-2/genuer-poggi-tuleaumalot.pdf))
  - Package: spikeSlabGAM ([Code](https://github.com/fabian-s/spikeSlabGAM), [CRAN](https://cran.r-project.org/web/packages/spikeSlabGAM/), [Paper](https://www.jstatsoft.org/article/view/v043i14))
  - Package: copent ([CRAN](https://cran.r-project.org/web/packages/copent/), [Code](https://github.com/majianthu/copent), [Paper](https://arxiv.org/pdf/2005.14025.pdf))
- **Python**
  - Package: sklearn.feature_selection ([Homepage](https://scikit-learn.org/stable/), [Code](https://github.com/scikit-learn/scikit-learn), [PyPI](https://pypi.org/project/scikit-learn/))
  - Package: scikit-feature ([Homepage](http://featureselection.asu.edu/), [Code](https://github.com/jundongl/scikit-feature))
  - Package: feature-selector ([Code](https://github.com/WillKoehrsen/feature-selector), [PyPI](https://pypi.org/project/feature-selector/))
- **Julia**
  - The main packages for ML in Julia are [MLJ](https://github.com/alan-turing-institute/MLJ.jl) and [Flux](https://fluxml.ai/Flux.jl/stable/)
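
The scikit-learn tools listed above compose into a single pipeline, which keeps feature selection inside each cross-validation fold and so avoids the selection-induced overfitting discussed by Kohavi and John. A minimal sketch; the synthetic dataset, `k`, and regularization strength are illustrative assumptions.

```python
# Sketch: filter + embedded selection chained in one pipeline, evaluated with CV.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Hypothetical dataset: 600 samples, 50 features, 10 informative.
X, y = make_classification(n_samples=600, n_features=50, n_informative=10,
                           random_state=0)

pipe = Pipeline([
    ("variance", VarianceThreshold(threshold=0.0)),   # drop constant columns
    ("univariate", SelectKBest(f_classif, k=20)),     # filter step
    ("model", LogisticRegression(penalty="l1", solver="liblinear", C=0.5)),  # embedded L1 step
])

# Because selection happens inside each CV fold, the score is not inflated
# by features chosen on the full dataset.
scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
print("CV accuracy: %.3f ± %.3f" % (scores.mean(), scores.std()))
```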