├── code-banner.png
├── LICENSE
└── README.md

--------------------------------------------------------------------------------
/code-banner.png:
--------------------------------------------------------------------------------

https://raw.githubusercontent.com/findmyway/Awesome-Code-LLM/main/code-banner.png

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------

MIT License

Copyright (c) 2023 Binyuan Hui

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

<div align="center">
  <h1>👨‍💻 Awesome Code LLM</h1>
</div>
![](code-banner.png)

## 🧵 Table of Contents

- [🧵 Table of Contents](#-table-of-contents)
- [🚀 Leaderboard](#-leaderboard)
- [💡 Evaluation Toolkit](#-evaluation-toolkit)
- [📚 Paper](#-paper)
- [▶️ Pre-Training](#️-pre-training)
- [▶️ Instruction Tuning](#️-instruction-tuning)
- [▶️ Alignment with Feedback](#️-alignment-with-feedback)
- [▶️ Prompting](#️-prompting)
- [▶️ Evaluation \& Benchmark](#️-evaluation--benchmark)
- [▶️ Using LLMs while coding](#️-using-llms-while-coding)
- [🙌 Contributors](#-contributors)
- [Cite as](#cite-as)
- [Acknowledgement](#acknowledgement)
- [Star History](#star-history)

## 🚀 Leaderboard

**Central Leaderboard (sorted by HumanEval pass@1)**
| Model                    | Params | HumanEval | MBPP | HF                                                              | Source                                                  |
| ------------------------ | ------ | --------- | ---- | --------------------------------------------------------------- | ------------------------------------------------------- |
| GPT-4 + Reflexion        | ?      | 91.0      | 77.1 |                                                                  | [paper](https://arxiv.org/abs/2303.11366)                |
| GPT-4 (latest)           | ?      | 84.1      | 80.0 |                                                                  | [github](https://github.com/deepseek-ai/DeepSeek-Coder)  |
| DeepSeek-Coder-Instruct  | 33B    | 79.3      | 70.0 | [ckpt](https://hf.co/deepseek-ai/deepseek-coder-33b-instruct)    | [github](https://github.com/deepseek-ai/DeepSeek-Coder)  |
| DeepSeek-Coder-Instruct  | 6.7B   | 78.6      | 65.4 | [ckpt](https://hf.co/deepseek-ai/deepseek-coder-6.7b-instruct)   | [github](https://github.com/deepseek-ai/DeepSeek-Coder)  |
| GPT-3.5-Turbo (latest)   | ?      | 76.2      | 70.8 |                                                                  | [github](https://github.com/deepseek-ai/DeepSeek-Coder)  |
| Code Llama               | 34B    | 62.2      | 61.2 |                                                                  | [paper](https://arxiv.org/abs/2308.12950)                |
| PanGu-Coder2             | 15B    | 61.6      |      |                                                                  | [paper](https://arxiv.org/abs/2307.14936)                |
| WizardCoder-15B          | 15B    | 57.3      | 51.8 | [ckpt](https://hf.co/WizardLM/WizardCoder-15B-V1.0)              | [paper](https://arxiv.org/abs/2306.08568)                |
| Code-Davinci-002         | ?      | 47.0      |      |                                                                  | [paper](https://arxiv.org/abs/2107.03374)                |
| StarCoder-15B (Prompted) | 15B    | 40.8      | 49.5 | [ckpt](https://hf.co/bigcode/starcoder)                          | [paper](https://arxiv.org/abs/2305.06161)                |
| PaLM 2-S                 | ?      | 37.6      | 50.0 |                                                                  | [paper](https://arxiv.org/abs/2305.10403)                |
| PaLM-Coder-540B          | 540B   | 36.0      | 47.0 |                                                                  | [paper](https://arxiv.org/abs/2204.02311)                |
| InstructCodeT5+          | 16B    | 35.0      |      |                                                                  | [paper](https://arxiv.org/abs/2305.07922)                |
| StarCoder-15B            | 15B    | 33.6      | 52.7 | [ckpt](https://hf.co/bigcode/starcoder)                          | [paper](https://arxiv.org/abs/2305.06161)                |
| Code-Cushman-001         | ?      | 33.5      | 45.9 |                                                                  | [paper](https://arxiv.org/abs/2107.03374)                |
| CodeT5+                  | 16B    | 30.9      |      |                                                                  | [paper](https://arxiv.org/abs/2305.07922)                |
| LLaMA2-70B               | 70B    | 29.9      |      | [ckpt](https://hf.co/meta-llama/Llama-2-70b-hf)                  | [paper](https://arxiv.org/abs/2307.09288)                |
| CodeGen-16B-Mono         | 16B    | 29.3      | 35.3 |                                                                  | [paper](https://arxiv.org/abs/2203.13474)                |
| PaLM-540B                | 540B   | 26.2      | 36.8 |                                                                  | [paper](https://arxiv.org/abs/2204.02311)                |
| LLaMA-65B                | 65B    | 23.7      | 37.7 |                                                                  | [paper](https://arxiv.org/abs/2302.13971)                |
| CodeGeeX                 | 13B    | 22.9      | 24.4 |                                                                  | [paper](https://arxiv.org/abs/2303.17568)                |
| LLaMA-33B                | 33B    | 21.7      | 30.2 |                                                                  | [paper](https://arxiv.org/abs/2302.13971)                |
| CodeGen-16B-Multi        | 16B    | 18.3      | 20.9 |                                                                  | [paper](https://arxiv.org/abs/2203.13474)                |
| AlphaCode                | 1.1B   | 17.1      |      |                                                                  | [paper](https://arxiv.org/abs/2203.07814)                |

| Leaderboard                          | Access                                                                              |
| :----------------------------------: | ----------------------------------------------------------------------------------- |
| Big Code Models Leaderboard          | [[Source](https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard)]         |
| BIRD                                 | [[Source](https://bird-bench.github.io)]                                             |
| CanAiCode Leaderboard                | [[Source](https://huggingface.co/spaces/mike-ravkine/can-ai-code-results)]           |
| Coding LLMs Leaderboard              | [[Source](https://leaderboard.tabbyml.com)]                                          |
| CRUXEval Leaderboard                 | [[Source](https://crux-eval.github.io/leaderboard.html)]                             |
| EvalPlus                             | [[Source](https://evalplus.github.io/leaderboard.html)]                              |
| InfiCoder-Eval                       | [[Source](https://infi-coder.github.io/inficoder-eval)]                              |
| InterCode                            | [[Source](https://intercode-benchmark.github.io)]                                    |
| Program Synthesis Models Leaderboard | [[Source](https://accubits.com/open-source-program-synthesis-models-leaderboard)]    |
| Spider                               | [[Source](https://yale-lily.github.io/spider)]                                       |

## 💡 Evaluation Toolkit

- [bigcode-evaluation-harness](https://github.com/bigcode-project/bigcode-evaluation-harness): A framework for the evaluation of autoregressive code generation language models.
- [code-eval](https://github.com/abacaj/code-eval): A framework for the evaluation of autoregressive code generation language models on HumanEval.
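For reference, the pass@1 scores in the leaderboard above are conventionally computed with the unbiased pass@k estimator defined in the Codex paper (listed under Pre-Training below), and the toolkits above implement it along these lines. A minimal NumPy sketch:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the Codex paper (Chen et al., 2021).

    n: total completions sampled for a problem
    c: completions that pass all unit tests
    k: the k in pass@k
    """
    if n - c < k:
        return 1.0  # too few failures for any size-k sample to miss
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 200 samples per problem, 53 passing -> pass@1 = 53/200
print(pass_at_k(n=200, c=53, k=1))  # 0.265
```

For k=1 the estimator reduces to c/n; sampling n > k completions per problem just lowers the variance of the estimate.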
## 📚 Paper

### ▶️ Pre-Training

1. **Evaluating Large Language Models Trained on Code** `Preprint`

   [[Paper](https://arxiv.org/abs/2107.03374)] *Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, et al.* 2021.07

2. **CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis** `ICLR23`

   [[Paper](https://arxiv.org/abs/2203.13474)] *Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong.* 2022.03

3. **ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages** `ACL23 (Findings)`

   [[Paper](https://aclanthology.org/2023.findings-acl.676.pdf)][[Repo](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/ernie-code)] *Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu.* 2022.12

4. **SantaCoder: don't reach for the stars!** `Preprint`

   [[Paper](https://arxiv.org/abs/2301.03988)] *Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, et al.* 2023.01

5. **CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X** `Preprint`

   [[Paper](https://arxiv.org/abs/2303.17568)] *Qinkai Zheng, Xiao Xia, Xu Zou, Yuxiao Dong, Shan Wang, Yufei Xue, Zihan Wang, Lei Shen, Andi Wang, Yang Li, Teng Su, Zhilin Yang, Jie Tang.* 2023.03

6. **CodeGen2: Lessons for Training LLMs on Programming and Natural Languages** `Preprint`

   [[Paper](https://arxiv.org/abs/2305.02309)] *Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou.* 2023.05

7. **StarCoder: may the source be with you!** `Preprint`

   [[Paper](https://arxiv.org/abs/2305.06161)] *Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, et al.* 2023.05

8. **CodeT5+: Open Code Large Language Models for Code Understanding and Generation** `Preprint`

   [[Paper](https://arxiv.org/abs/2305.07922)] *Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D.Q. Bui, Junnan Li, Steven C.H. Hoi.* 2023.05

9. **Textbooks Are All You Need** `Preprint`

   [[Paper](https://arxiv.org/abs/2306.11644)] *Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, et al.* 2023.06

10. **Code Llama: Open Foundation Models for Code** `Preprint`

    [[Paper](https://arxiv.org/abs/2308.12950)] *Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, et al.* 2023.08

### ▶️ Instruction Tuning

1. **WizardCoder: Empowering Code Large Language Models with Evol-Instruct** `Preprint`

   [[Paper](https://arxiv.org/abs/2306.08568)] *Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang.* 2023.06

2. **OctoPack: Instruction Tuning Code Large Language Models** `Preprint`

   [[Paper](https://arxiv.org/abs/2308.07124)][[Repo](https://github.com/bigcode-project/octopack)] *Niklas Muennighoff, Qian Liu, Armel Zebaze, Qinkai Zheng, Binyuan Hui, Terry Yue Zhuo, Swayam Singh, Xiangru Tang, Leandro von Werra, Shayne Longpre.* 2023.08

### ▶️ Alignment with Feedback

1. **CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning** `NeurIPS22`

   [[Paper](https://arxiv.org/abs/2207.01780)] *Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C.H. Hoi.* 2022.07

2. **Execution-based Code Generation using Deep Reinforcement Learning** `TMLR23`

   [[Paper](https://arxiv.org/abs/2301.13816)] *Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, Chandan K. Reddy.* 2023.01

3. **RLTF: Reinforcement Learning from Unit Test Feedback** `Preprint`

   [[Paper](https://arxiv.org/abs/2307.04349)] *Jiate Liu, Yiqin Zhu, Kaiwen Xiao, Qiang Fu, Xiao Han, Wei Yang, Deheng Ye.* 2023.07

4. **PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback** `Preprint`

   [[Paper](https://arxiv.org/abs/2307.14936)] *Bo Shen, Jiaxin Zhang, Taihong Chen, Daoguang Zan, Bing Geng, An Fu, Muhan Zeng, Ailun Yu, Jichuan Ji, Jingyang Zhao, Yuenan Guo, Qianxiang Wang.* 2023.07

### ▶️ Prompting

1. **CodeT: Code Generation with Generated Tests** `ICLR23`

   [[Paper](https://arxiv.org/abs/2207.10397)] *Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen.* 2022.07

2. **Coder Reviewer Reranking for Code Generation** `ICML23`

   [[Paper](https://arxiv.org/abs/2211.16490)] *Tianyi Zhang, Tao Yu, Tatsunori B Hashimoto, Mike Lewis, Wen-tau Yih, Daniel Fried, Sida I Wang.* 2022.11

3. **LEVER: Learning to Verify Language-to-Code Generation with Execution** `ICML23`

   [[Paper](https://arxiv.org/abs/2302.08468)] *Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, Xi Victoria Lin.* 2023.02

4. **Teaching Large Language Models to Self-Debug** `Preprint`

   [[Paper](https://arxiv.org/abs/2304.05128)] *Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou.* 2023.04

5. **Demystifying GPT Self-Repair for Code Generation** `Preprint`

   [[Paper](https://arxiv.org/abs/2306.09896)] *Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama.* 2023.06

6. **SelfEvolve: A Code Evolution Framework via Large Language Models** `Preprint`

   [[Paper](https://arxiv.org/abs/2306.02907)] *Shuyang Jiang, Yuhao Wang, Yu Wang.* 2023.06
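Several of the prompting papers above (CodeT, Self-Debug, SelfEvolve) share an execute-and-refine core: sample a program, run it against tests, and feed the execution feedback back into the prompt. The sketch below is an illustrative reduction of that loop, not any single paper's method; `generate` is a hypothetical stand-in for whatever LLM API you call:

```python
import subprocess
import sys
import tempfile

def run_candidate(code: str, tests: str, timeout: float = 10.0) -> tuple[bool, str]:
    """Run a candidate solution plus its unit tests; return (passed, feedback).

    Real systems execute untrusted model output in a sandbox, not directly.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + tests)
        path = f.name
    try:
        proc = subprocess.run([sys.executable, path],
                              capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, "execution timed out"
    return proc.returncode == 0, proc.stderr

def self_debug(task: str, tests: str, generate, max_rounds: int = 3) -> str:
    """Execute-and-refine loop: regenerate with the error message as feedback."""
    code = generate(f"Write a Python solution for:\n{task}")
    for _ in range(max_rounds):
        passed, feedback = run_candidate(code, tests)
        if passed:
            break
        code = generate(
            f"Task:\n{task}\n\nThis attempt failed:\n{code}\n\n"
            f"Execution feedback:\n{feedback}\n\nReturn a corrected solution."
        )
    return code
```

The papers differ mainly in where the tests and the feedback come from: CodeT generates the tests itself, Self-Debug has the model explain and repair its own error trace, and LEVER trains a verifier over execution results instead of looping.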
### ▶️ Evaluation & Benchmark

1. **Measuring Coding Challenge Competence With APPS** `NeurIPS21`

   > Named APPS

   [[Paper](https://arxiv.org/abs/2105.09938)][[Repo](https://github.com/hendrycks/apps)] *Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, Jacob Steinhardt.* 2021.05

2. **Program Synthesis with Large Language Models** `Preprint`

   > Named MBPP

   [[Paper](https://arxiv.org/abs/2108.07732)] *Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton.* 2021.08

3. **DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation** `ICML23`

   [[Paper](https://arxiv.org/abs/2211.11501)] *Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu.* 2022.11

4. **RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems** `Preprint`

   [[Paper](https://arxiv.org/abs/2306.03091)] *Tianyang Liu, Canwen Xu, Julian McAuley.* 2023.06

5. **Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation** `Preprint`

   [[Paper](https://arxiv.org/abs/2308.10335)] *Li Zhong, Zilong Wang.* 2023.08
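Benchmarks like APPS, MBPP, and HumanEval score functional correctness: a completion counts only if it passes the task's unit tests under execution. A toy sketch of a single check follows; the task record below only loosely mirrors the MBPP format (field names are illustrative, not the exact dataset schema), and real harnesses run the check in an isolated subprocess with a timeout:

```python
# Illustrative task record, loosely in the style of MBPP
# (field names here are for illustration, not the exact dataset schema).
task = {
    "text": "Write a function min_two(a, b) returning the smaller of two numbers.",
    "completion": "def min_two(a, b):\n    return a if a < b else b",
    "test_list": [
        "assert min_two(1, 2) == 1",
        "assert min_two(5, -3) == -3",
    ],
}

def passes(completion: str, tests: list[str]) -> bool:
    """Functional-correctness check: define the candidate, then run each assert.

    exec() on untrusted model output is unsafe; real evaluators isolate it
    in a sandboxed subprocess with resource limits and a timeout.
    """
    env: dict = {}
    try:
        exec(completion, env)   # defines the candidate function in env
        for t in tests:
            exec(t, env)        # a failing assert raises AssertionError
    except Exception:
        return False
    return True

print(passes(task["completion"], task["test_list"]))  # True
```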
### ▶️ Using LLMs while coding

1. **Awesome-DevAI: A list of resources about using LLMs while building software** `Awesome`

   [[Repo](https://github.com/continuedev/Awesome-DevAI)] *Ty Dunn, Nate Sesti.* 2023.10

## 🙌 Contributors

This is an active repository and your contributions are always welcome! If you have any questions about this opinionated list, do not hesitate to contact me at `huybery@gmail.com`.

## Cite as

```
@software{awesome-code-llm,
  author = {Binyuan Hui},
  title = {An awesome and curated list of best code-LLM for research},
  howpublished = {\url{https://github.com/huybery/Awesome-Code-LLM}},
  year = 2023,
}
```

## Acknowledgement

This project is inspired by [Awesome-LLM](https://github.com/Hannibal046/Awesome-LLM).

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=huybery/Awesome-Code-LLM&type=Date)](https://star-history.com/#huybery/Awesome-Code-LLM&Date)

**[⬆ Back to ToC](#-table-of-contents)**

--------------------------------------------------------------------------------