├── .gitattributes
├── .gitignore
└── README.md


/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto

# Custom for Visual Studio
*.cs diff=csharp

# Standard to msysgit
*.doc diff=astextplain
*.DOC diff=astextplain
*.docx diff=astextplain
*.DOCX diff=astextplain
*.dot diff=astextplain
*.DOT diff=astextplain
*.pdf diff=astextplain
*.PDF diff=astextplain
*.rtf diff=astextplain
*.RTF diff=astextplain

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# Windows image file caches
Thumbs.db
ehthumbs.db

# Folder config file
Desktop.ini

# Recycle Bin used on file shares
$RECYCLE.BIN/

# Windows Installer files
*.cab
*.msi
*.msm
*.msp

# Windows shortcuts
*.lnk

# =========================
# Operating System Files
# =========================

# OSX
# =========================

.DS_Store
.AppleDouble
.LSOverride

# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# selected papers

### Machine Learning

- C. Andrieu et al., **An Introduction to MCMC for Machine Learning**, Machine Learning, 2003 [[pdf]](http://www.cs.bham.ac.uk/~axk/mcmc1.pdf)
- Y. Teh et al., **Hierarchical Dirichlet Processes**, Journal of the American Statistical Association, 2006 [[pdf]](https://people.eecs.berkeley.edu/~jordan/papers/hdp.pdf)
- A. Patil et al., **PyMC: Bayesian Stochastic Modelling in Python**, Journal of Statistical Software, 2010 [[pdf]](https://www.jstatsoft.org/article/view/v035i04)
- D. Lin et al., **Construction of Dependent Dirichlet Processes based on Poisson Processes**, NeurIPS, 2010 [[pdf]](https://papers.nips.cc/paper/4151-construction-of-dependent-dirichlet-processes-based-on-poisson-processes.pdf)
- J. Snoek et al., **Practical Bayesian Optimization of Machine Learning Algorithms**, NeurIPS, 2012 [[pdf]](https://papers.nips.cc/paper/4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf)
- M. Hoffman et al., **Stochastic Variational Inference**, JMLR, 2013 [[pdf]](https://arxiv.org/pdf/1206.7051.pdf)
- J. Chang et al., **Parallel Sampling of HDPs using Sub-Cluster Splits**, NeurIPS, 2014 [[pdf]](https://papers.nips.cc/paper/5235-parallel-sampling-of-hdps-using-sub-cluster-splits.pdf)
- M. Hoffman, **The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo**, JMLR, 2014 [[pdf]](https://arxiv.org/pdf/1111.4246.pdf)
- T. Chen et al., **XGBoost: A Scalable Tree Boosting System**, KDD, 2016 [[pdf]](https://arxiv.org/pdf/1603.02754.pdf)
- A. Kucukelbir et al., **Automatic Differentiation Variational Inference**, JMLR, 2017 [[pdf]](https://arxiv.org/pdf/1603.00788.pdf)
- D. Tran et al., **Deep Probabilistic Programming**, ICLR, 2017 [[pdf]](https://arxiv.org/pdf/1701.03757.pdf)
- Y. Koren et al., **Matrix Factorization Techniques for Recommender Systems**, IEEE Computer, 2009 [[pdf]](https://ieeexplore.ieee.org/document/5197422)
- G. Ke et al., **LightGBM: A Highly Efficient Gradient Boosting Decision Tree**, NeurIPS, 2017 [[pdf]](https://proceedings.neurips.cc/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf)
- D. Rezende et al., **Variational Inference with Normalizing Flows**, ICML, 2015 [[pdf]](https://arxiv.org/pdf/1505.05770.pdf)
- J. Ho et al., **Denoising Diffusion Probabilistic Models**, NeurIPS, 2020 [[pdf]](https://arxiv.org/pdf/2006.11239.pdf)


### Deep Learning

- A. Krizhevsky et al., **ImageNet Classification with Deep Convolutional Neural Networks**, NeurIPS, 2012 [[pdf]](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)
- J. Bergstra et al., **Random Search for Hyper-Parameter Optimization**, JMLR, 2012 [[pdf]](http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf)
- M. Lin et al., **Network in Network**, arXiv, 2013 [[pdf]](https://arxiv.org/pdf/1312.4400.pdf)
- D. Kingma et al., **Auto-Encoding Variational Bayes**, ICLR, 2014 [[pdf]](https://arxiv.org/pdf/1312.6114.pdf)
- I. Goodfellow et al., **Generative Adversarial Nets**, NeurIPS, 2014 [[pdf]](https://arxiv.org/pdf/1406.2661v1.pdf)
- N. Srivastava et al., **Dropout: A Simple Way to Prevent Neural Networks from Overfitting**, JMLR, 2014 [[pdf]](http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf)
- K. Simonyan et al., **Very Deep Convolutional Networks for Large-Scale Image Recognition**, ICLR, 2015 [[pdf]](https://arxiv.org/pdf/1409.1556.pdf)
- C. Szegedy et al., **Going Deeper with Convolutions**, CVPR, 2015 [[pdf]](https://arxiv.org/pdf/1409.4842v1.pdf)
- K. He et al., **Deep Residual Learning for Image Recognition**, arXiv, 2015 [[pdf]](https://arxiv.org/pdf/1512.03385.pdf)
- D. Kingma et al., **Adam: A Method for Stochastic Optimization**, ICLR, 2015 [[pdf]](https://arxiv.org/pdf/1412.6980.pdf)
- S. Ioffe et al., **Batch Normalization**, ICML, 2015 [[pdf]](https://arxiv.org/pdf/1502.03167.pdf)
- F. Iandola et al., **SqueezeNet**, ICLR, 2017 [[pdf]](https://arxiv.org/pdf/1602.07360.pdf)
- T. Kipf et al., **Semi-Supervised Classification with Graph Convolutional Networks**, ICLR, 2017 [[pdf]](https://arxiv.org/pdf/1609.02907.pdf)


### Computer Vision

- M. Bertalmio et al., **Image Inpainting**, SIGGRAPH, 2000 [[pdf]](http://www.tecn.upf.es/~mbertalmio/bertalmi.pdf)
- A. Karpathy et al., **Large-scale Video Classification with Convolutional Neural Networks**, CVPR, 2014 [[pdf]](http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Karpathy_Large-scale_Video_Classification_2014_CVPR_paper.pdf)
- A. Davis et al., **The Visual Microphone: Passive Recovery of Sound from Video**, SIGGRAPH, 2014 [[pdf]](https://people.csail.mit.edu/mrub/papers/VisualMic_SIGGRAPH2014.pdf)
- O. Vinyals et al., **Show and Tell: A Neural Image Caption Generator**, CVPR, 2015 [[pdf]](https://arxiv.org/pdf/1411.4555.pdf)
- L. Gatys et al., **A Neural Algorithm of Artistic Style**, arXiv, 2015 [[pdf]](https://arxiv.org/pdf/1508.06576.pdf)
- A. Agrawal et al., **VQA: Visual Question Answering**, ICCV, 2015 [[pdf]](https://arxiv.org/pdf/1505.00468.pdf)
- R. Girshick et al., **Fast R-CNN**, ICCV, 2015 [[pdf]](https://arxiv.org/pdf/1504.08083.pdf)
- J. Long et al., **Fully Convolutional Networks for Semantic Segmentation**, CVPR, 2015 [[pdf]](https://arxiv.org/pdf/1411.4038.pdf)
- J. Redmon et al., **You Only Look Once: Unified, Real-Time Object Detection**, CVPR, 2016 [[pdf]](https://arxiv.org/pdf/1506.02640.pdf)
- C. Dong et al., **Image Super-Resolution Using Deep Convolutional Networks**, TPAMI, 2016 [[pdf]](https://arxiv.org/pdf/1501.00092v3.pdf)


### Natural Language Processing

- D. Blei et al., **Latent Dirichlet Allocation**, JMLR, 2003 [[pdf]](http://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf)
- R. Mihalcea et al., **TextRank: Bringing Order into Texts**, EMNLP, 2004 [[pdf]](https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf)
- T. Mikolov et al., **Efficient Estimation of Word Representations in Vector Space**, arXiv, 2013 [[pdf]](https://arxiv.org/pdf/1301.3781.pdf)
- J. Pennington et al., **GloVe: Global Vectors for Word Representation**, EMNLP, 2014 [[pdf]](https://nlp.stanford.edu/pubs/glove.pdf)
- Q. Le et al., **Distributed Representations of Sentences and Documents**, ICML, 2014 [[pdf]](https://arxiv.org/pdf/1405.4053v2.pdf)
- I. Sutskever et al., **Sequence to Sequence Learning with Neural Networks**, NeurIPS, 2014 [[pdf]](https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf)
- Y. Kim, **Convolutional Neural Networks for Sentence Classification**, EMNLP, 2014 [[pdf]](https://arxiv.org/pdf/1408.5882.pdf)
- D. Bahdanau et al., **Neural Machine Translation by Jointly Learning to Align and Translate**, ICLR, 2015 [[pdf]](https://arxiv.org/pdf/1409.0473.pdf)
- A. Vaswani et al., **Attention Is All You Need**, NeurIPS, 2017 [[pdf]](http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf)
- J. Devlin et al., **BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding**, arXiv, 2018 [[pdf]](https://arxiv.org/pdf/1810.04805.pdf)

### Speech Recognition

- E. Fox et al., **A Sticky HDP-HMM with Application to Speaker Diarization**, Annals of Applied Statistics, 2011 [[pdf]](https://arxiv.org/pdf/0905.2592.pdf)
- C. Lee et al., **A Nonparametric Bayesian Approach to Acoustic Model Discovery**, ACL, 2012 [[pdf]](https://groups.csail.mit.edu/sls/publications/2012/Lee_ACL_2012.pdf)
- N. Jaitly et al., **Application of Pretrained DNNs to Large Vocabulary Speech Recognition**, Interspeech, 2012 [[pdf]](http://www.cs.toronto.edu/~ndjaitly/jaitly-interspeech12.pdf)
- A. Graves et al., **Speech Recognition with Deep Recurrent Neural Networks**, ICASSP, 2013 [[pdf]](https://arxiv.org/pdf/1303.5778.pdf)
- D. Bahdanau et al., **End-to-End Attention-based Large Vocabulary Speech Recognition**, ICASSP, 2016 [[pdf]](https://arxiv.org/pdf/1508.04395.pdf)


### Reinforcement Learning

- V. Mnih et al., **Playing Atari with Deep Reinforcement Learning**, NeurIPS, 2013 [[pdf]](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)
- D. Silver et al., **Mastering the game of Go without human knowledge**, Nature, 2017 [[pdf]](https://deepmind.com/documents/119/agz_unformatted_nature.pdf)
- C. Finn et al., **Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks**, ICML, 2017 [[pdf]](https://arxiv.org/pdf/1703.03400.pdf)

### ML Systems Design

- Y. Jing et al., **Visual Search at Pinterest**, KDD, 2015 [[pdf]](https://arxiv.org/pdf/1505.07647.pdf)
- P. Nigam et al., **Semantic Product Search**, KDD, 2019 [[pdf]](https://arxiv.org/pdf/1907.00937.pdf)
- P. Covington et al., **Deep Neural Networks for YouTube Recommendations**, RecSys, 2016 [[pdf]](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45530.pdf)
- M. Chen et al., **Gmail Smart Compose: Real-Time Assisted Writing**, KDD, 2019 [[pdf]](https://arxiv.org/pdf/1906.00080)

### Machine Learning Engineering

- W. Kwon et al., **Efficient Memory Management for Large Language Model Serving with PagedAttention**, SOSP, 2023 [[pdf]](https://arxiv.org/pdf/2309.06180)
- L. Zheng et al., **SGLang: Efficient Execution of Structured Language Model Programs**, arXiv, 2024 [[pdf]](https://arxiv.org/pdf/2312.07104)
- T. Dao et al., **FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness**, arXiv, 2022 [[pdf]](https://arxiv.org/pdf/2205.14135)
- A. Agrawal et al., **Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve**, arXiv, 2024 [[pdf]](https://arxiv.org/pdf/2403.02310)
- L. Chen et al., **Punica: Multi-Tenant LoRA Serving**, arXiv, 2023 [[pdf]](https://arxiv.org/pdf/2310.18547)



### Generative AI

- L. Ouyang et al., **Training Language Models to Follow Instructions with Human Feedback**, arXiv, 2022 [[pdf]](https://arxiv.org/pdf/2203.02155.pdf)
- R. Rafailov et al., **Direct Preference Optimization: Your Language Model is Secretly a Reward Model**, arXiv, 2024 [[pdf]](https://arxiv.org/pdf/2305.18290)
- T. Wu et al., **Thinking LLMs: General Instruction Following with Thought Generation**, arXiv, 2024 [[pdf]](https://arxiv.org/pdf/2410.10630)
- J. Betker et al., **Improving Image Generation with Better Captions**, arXiv, 2023 [[pdf]](https://cdn.openai.com/papers/dall-e-3.pdf)
- A. Dubey et al., **The Llama 3 Herd of Models**, arXiv, 2024 [[pdf]](https://arxiv.org/pdf/2407.21783)
- E. Hu et al., **LoRA: Low-Rank Adaptation of Large Language Models**, ICLR, 2022 [[pdf]](https://arxiv.org/pdf/2106.09685.pdf)
- S. Yao et al., **ReAct: Synergizing Reasoning and Acting in Language Models**, ICLR, 2023 [[pdf]](https://arxiv.org/pdf/2210.03629.pdf)
- T. Zhang et al., **RAFT: Adapting Language Model to Domain Specific RAG**, arXiv, 2024 [[pdf]](https://arxiv.org/pdf/2403.10131)
- G. Hinton et al., **Distilling the Knowledge in a Neural Network**, NeurIPS, 2014 [[pdf]](https://arxiv.org/pdf/1503.02531.pdf)
- J. Kaplan et al., **Scaling Laws for Neural Language Models**, arXiv, 2020 [[pdf]](https://arxiv.org/pdf/2001.08361.pdf)

--------------------------------------------------------------------------------