├── LICENSE
└── README.md


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2018 Marcel Edmund Franke
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # Machine learning research papers
  2 | 
  3 | A collection of references to machine learning research papers.
  4 | 
  5 | ## LLM (Large language model)
  6 | 
  7 | * [Self-Rewarding Language Models](https://arxiv.org/pdf/2401.10020.pdf)
  8 | * [Meta Large Language Model Compiler: Foundation Models of Compiler Optimization](https://ai.meta.com/research/publications/meta-large-language-model-compiler-foundation-models-of-compiler-optimization)
  9 | 
 10 | ## Math
 11 | 
 12 | * [A Beginner's Guide to the Mathematics of Neural Networks](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.161.3556&rep=rep1&type=pdf&fbclid=IwAR3OWInStoLwXtfjglO2XeQj1X7NNHBKPzzEou4At4GeYVGpx_zDkUEliz4)
 13 | * [Mathematics of Deep Learning](https://arxiv.org/abs/1712.04741)
 14 | * [The Matrix Calculus You Need For Deep Learning](https://arxiv.org/abs/1802.01528)
 15 | * [A guide to convolution arithmetic for deep learning](https://arxiv.org/abs/1603.07285)
 16 | * [Deep Learning: An Introduction for Applied Mathematicians](https://arxiv.org/abs/1801.05894) - page 23
 17 | 
 18 | ## Deep learning
 19 | 
 20 | * [Recent Advances in Deep Learning: An Overview](https://arxiv.org/abs/1807.08169)
 21 | * [Deep learning review](https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf)
 22 | * [Understanding deep learning requires rethinking generalization](https://arxiv.org/abs/1611.03530)
 23 | * [Learning the Number of Neurons in Deep Networks](https://arxiv.org/abs/1611.06321)
 24 | * [Lifelong Learning with Dynamically Expandable Networks](https://arxiv.org/abs/1708.01547)
 25 | * [Dropout: a simple way to prevent neural networks from overfitting](http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf)
 26 | * [Self-Attentive Pooling for Efficient Deep Learning](https://arxiv.org/abs/2209.07659)
 27 | 
 28 | ## GAN
 29 | 
 30 | * [StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks](https://arxiv.org/abs/1612.03242)
 31 | * [Self-Attention Generative Adversarial Networks](https://arxiv.org/abs/1805.08318)
 32 | 
 33 | ## Neuro evolution
 34 | 
 35 | * [Neural Architecture Search with Reinforcement Learning](https://arxiv.org/abs/1611.01578)
 36 | * [Large-Scale Evolution of Image Classifiers](https://arxiv.org/pdf/1703.01041.pdf)
 37 | * [AutoAugment: Learning Augmentation Policies from Data](https://arxiv.org/abs/1805.09501)
 38 | * [Designing Neural Network Architectures using Reinforcement Learning](https://arxiv.org/abs/1611.02167)
 39 | * [Learning Transferable Architectures for Scalable Image Recognition](https://arxiv.org/abs/1707.07012)
 40 | * [Deep Neuroevolution: Genetic Algorithms are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning](https://arxiv.org/abs/1712.06567)
 41 | * [MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks](https://arxiv.org/abs/1711.06798)
 43 | 
 44 | ## Gradient descent
 45 | 
 46 | * [An overview of gradient descent optimization algorithms](https://arxiv.org/abs/1609.04747)
 47 | 
 48 | ## Word embedding 
 49 | 
 50 | * [Distributed Representations of Words and Phrases and their Compositionality](https://arxiv.org/abs/1310.4546)
 51 | * [Linguistic Regularities in Continuous Space Word Representations](https://www.aclweb.org/anthology/N13-1090)
 52 | * [A Neural Probabilistic Language Model](http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf)
 53 | * [GloVe: Global Vectors for Word Representation](https://nlp.stanford.edu/pubs/glove.pdf)
 54 | * [Efficient Estimation of Word Representations in Vector Space](https://arxiv.org/pdf/1301.3781.pdf)
 55 | * [Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings](https://arxiv.org/abs/1607.06520)
 56 | * [FastText.zip: Compressing text classification models](https://arxiv.org/abs/1612.03651)
 57 | * [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)
 58 | 
 59 | ## CNN
 60 | 
 61 | * [Siamese Neural Networks for One-shot Image Recognition](https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf)
 62 | * [ImageNet Classification with Deep Convolutional Neural Networks](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)
 63 | * [Multi-column Deep Neural Networks for Image Classification](https://arxiv.org/abs/1202.2745)
 64 | * [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)
 65 | * [Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/abs/1512.00567)
 66 | * [Deep residual learning for image recognition](https://arxiv.org/abs/1512.03385)
 67 | * [Network In Network](https://arxiv.org/pdf/1312.4400.pdf)
 68 | * [Going Deeper with Convolutions](https://arxiv.org/abs/1409.4842)
 69 | * [OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks](https://arxiv.org/pdf/1312.6229.pdf)
 70 | * [You Only Look Once: Unified, Real-Time Object Detection](https://arxiv.org/abs/1506.02640)
 71 | * [FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/pdf/1503.03832.pdf)
 72 | * [Visualizing and Understanding Convolutional Networks](https://arxiv.org/abs/1311.2901)
 73 | * [A Neural Algorithm of Artistic Style](https://arxiv.org/abs/1508.06576)
 74 | * [Convolutional Sequence to Sequence Learning](https://arxiv.org/abs/1705.03122)
 75 | * [Deformable Convolutional Networks](https://arxiv.org/abs/1703.06211)
 76 | * [Deep Photo Style Transfer](https://arxiv.org/abs/1703.07511)
 77 | * [Wide Residual Networks](https://arxiv.org/abs/1605.07146)
 78 | * [WaveNet: A Generative Model for Raw Audio](https://arxiv.org/abs/1609.03499)
 79 | * [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993)
 80 | * [Resnet in Resnet: Generalizing Residual Architectures](https://arxiv.org/abs/1603.08029)
 81 | 
 82 | ## RL
 83 |  
 84 | * [Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm](https://arxiv.org/pdf/1712.01815.pdf)
 85 | * [Deep Reinforcement Learning: An Overview](https://arxiv.org/abs/1701.07274)
 86 | 
 87 | ## GRU
 88 | 
 89 | * [Gated Feedback Recurrent Neural Networks](https://arxiv.org/abs/1502.02367)
 90 |  
 91 | ## RNN
 92 | 
 93 | * [DRAW: A Recurrent Neural Network For Image Generation](https://arxiv.org/abs/1502.04623)
 94 | * [Playing Atari with Deep Reinforcement Learning](https://arxiv.org/abs/1312.5602)
 95 | * [Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling](https://arxiv.org/pdf/1412.3555.pdf)
 96 | * [Sequence to Sequence Learning with Neural Networks](https://arxiv.org/abs/1409.3215)
 97 | * [Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation](https://arxiv.org/abs/1406.1078)
 98 | * [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/abs/1409.0473)
 99 | * [SQLNet: Generating Structured Queries From Natural Language Without Reinforcement Learning](https://arxiv.org/abs/1711.04436)
100 | 
101 | ## Graph & Neural networks
102 | 
103 | * [Relational inductive biases, deep learning, and graph networks](https://arxiv.org/abs/1806.01261)
104 | * [Interaction Networks for Learning about Objects, Relations and Physics](https://arxiv.org/pdf/1612.00222.pdf)
105 | * [Graph neural networks](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1015.7227&rep=rep1&type=pdf) - Page 7
106 | * [Recurrent Relational Networks](https://arxiv.org/abs/1711.08028)
107 | * [Graph Capsule Convolutional Neural Networks](https://arxiv.org/abs/1805.08090)
108 | * [Graph Neural Networks for Ranking Web Pages](https://www.researchgate.net/publication/221158677_Graph_Neural_Networks_for_Ranking_Web_Pages)
109 | * [Graph Convolutional Neural Networks for Web-Scale Recommender Systems](https://arxiv.org/abs/1806.01973)
110 | 
111 | ## Neural Module Networks
112 | 
113 | * [Neural Module Networks](https://arxiv.org/abs/1511.02799)
114 | * [End-To-End Memory Networks](https://arxiv.org/pdf/1503.08895.pdf)
115 | * [Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)](https://arxiv.org/abs/1412.6632)
116 | * [Show and Tell: A Neural Image Caption Generator](https://arxiv.org/abs/1411.4555)
117 | 
118 | ## Memory Networks 
119 | 
120 | * [Memory Networks](https://arxiv.org/pdf/1410.3916.pdf)
121 | 
122 | ## General Models
123 | 
124 | * [One Model To Learn Them All](https://arxiv.org/abs/1706.05137)
125 | 
126 | ## Neural Programmer-Interpreters
127 | 
128 | * [Neural Programmer-Interpreters](https://arxiv.org/abs/1511.06279)
129 | * [Learning Simple Algorithms from Examples](https://arxiv.org/abs/1511.07275)
130 | * [pix2code: Generating Code from a Graphical User Interface Screenshot](https://arxiv.org/abs/1705.07962)
131 | * [DeepCoder: Learning to Write Programs](https://arxiv.org/abs/1611.01989)
132 | * [A deep language model for software code](https://arxiv.org/abs/1608.02715v1)
133 | * [Tree-to-tree Neural Networks for Program Translation](https://arxiv.org/abs/1802.03691)
134 | * [Unsupervised Translation of Programming Languages](https://arxiv.org/abs/2006.03511)
135 | * [TRANX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation](https://arxiv.org/abs/1810.02720)
136 | * [TransCoder-IR: Code Translation with Compiler Representations](https://arxiv.org/abs/2207.03578)
137 | 
138 | ## Database
139 | 
140 | * [SageDB: A Learned Database System](http://cidrdb.org/cidr2019/papers/p117-kraska-cidr19.pdf)
141 | 
142 | ## Cache 
143 | 
144 | * [Feedforward Neural Networks for Caching: Enough or Too Much?](https://arxiv.org/abs/1810.06930)
145 | 
146 | ## Activations
147 | 
148 | * [Maxout networks](https://arxiv.org/pdf/1302.4389v4.pdf)
149 | 
150 | ## Other
151 | 
152 | * [Event detection in Twitter: A keyword volume approach](https://arxiv.org/abs/1901.00570)
153 | * [Bagging Predictors](https://www.stat.berkeley.edu/~breiman/bagging.pdf)
154 | * [Stack Overflow Considered Harmful? The Impact of Copy&Paste on Android Application Security](https://www.researchgate.net/publication/317919491_Stack_Overflow_Considered_Harmful_The_Impact_of_CopyPaste_on_Android_Application_Security)
155 | * [DEXTER: Large-Scale Discovery and Extraction of Product Specifications on the Web](http://www.vldb.org/pvldb/vol8/p2194-qiu.pdf)
157 | 
158 | ## Robotics
159 | 
160 | * [End-to-End Learning of Semantic Grasping](https://arxiv.org/abs/1707.01932)
161 | 
162 | ## Machine learning (Articles)
163 | 
164 | * [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)
165 | * [Conv Nets: A Modular Perspective](https://colah.github.io/posts/2014-07-Conv-Nets-Modular)
166 | * [Understanding Convolutions](http://colah.github.io/posts/2014-07-Understanding-Convolutions/)
167 | 
168 | ## Machine learning (Books)
169 | 
170 | * [Understanding Machine Learning: From Theory to Algorithms](https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf)
171 | 


--------------------------------------------------------------------------------