├── LICENSE
└── README.md

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------

MIT License

Copyright (c) 2023 Jiaxin Zhang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# Awesome-LLM-Uncertainty-Reliability-Robustness

[![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Made With Love](https://img.shields.io/badge/Made%20With-Love-red.svg)](https://github.com/chetanraj/awesome-github-badges)

This repository, **UR2-LLMs**, contains a collection of resources and papers on **Uncertainty**, **Reliability**, and **Robustness** in **Large Language Models**.

"*Large language models have limited reliability, limited understanding, limited range, and hence need human supervision.*" - Michael Osborne, Professor of Machine Learning in the Dept.
of Engineering Science, University of Oxford, January 25, 2023

*Welcome to share your papers, thoughts, and ideas in this area!*

## Contents

- [Awesome-LLM-Uncertainty-Reliability-Robustness](#awesome-llm-uncertainty-reliability-robustness)
  - [Contents](#contents)
  - [Resources](#resources)
    - [Introductory Posts](#introductory-posts)
    - [Technical Reports](#technical-reports)
    - [Tutorial](#tutorial)
  - [Papers](#papers)
    - [Evaluation \& Survey](#evaluation--survey)
    - [Uncertainty](#uncertainty)
      - [Uncertainty Estimation](#uncertainty-estimation)
      - [Calibration](#calibration)
      - [Ambiguity](#ambiguity)
      - [Confidence](#confidence)
      - [Active Learning](#active-learning)
    - [Reliability](#reliability)
      - [Hallucination](#hallucination)
      - [Truthfulness](#truthfulness)
      - [Reasoning](#reasoning)
      - [Prompt tuning, optimization and design](#prompt-tuning-optimization-and-design)
      - [Instruction and RLHF](#instruction-and-rlhf)
      - [Tools and external APIs](#tools-and-external-apis)
      - [Fine-tuning](#fine-tuning)
    - [Robustness](#robustness)
      - [Invariance](#invariance)
      - [Distribution Shift](#distribution-shift)
      - [Out-of-Distribution](#out-of-distribution)
      - [Adaptation and Generalization](#adaptation-and-generalization)
      - [Adversarial](#adversarial)
      - [Attribution](#attribution)
      - [Causality](#causality)

# Resources

## Introductory Posts

**The Determinants of Controllable AGI** \
*Allen Schmaltz* \
[[Link](https://raw.githubusercontent.com/allenschmaltz/Resolute_Resolutions/master/volume5/volume5.pdf)] \
3 Mar 2025

**GPT Is an Unreliable Information Store** \
*Noble Ackerson* \
[[Link](https://towardsdatascience.com/chatgpt-insists-i-am-dead-and-the-problem-with-language-models-db5a36c22f11)] \
20 Feb 2023

**“Misusing” Large Language Models and the Future of MT** \
*Arle Lommel* \
[[Link](https://csa-research.com/Blogs-Events/Blog/Misusing-Large-Language-Models-and-the-Future-of-MT)] \
20 Dec 2022

**Large language models: The basics and their applications** \
*Margo Poda* \
[[Link](https://www.moveworks.com/insights/large-language-models-strengths-and-weaknesses)] \
9 Feb 2023

**Prompt Engineering: Improving Responses & Reliability** \
*Peter Foy* \
[[Link](https://www.mlq.ai/prompt-engineering-techniques-improve-reliability/)] \
19 Mar 2023

**OpenAI's Cookbook on Techniques to Improve Reliability** \
*OpenAI* \
[[Github](https://github.com/openai/openai-cookbook)] \
18 Mar 2023

**GPT/calibration tag** \
*Gwern Branwen* \
[[Link](https://gwern.net/doc/ai/nn/transformer/gpt/calibration/index#link-bibliography)]

**Prompt Engineering** \
*Lilian Weng* \
[[Link](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)]

**LLM Powered Autonomous Agents** \
*Lilian Weng* \
[[Link](https://lilianweng.github.io/posts/2023-06-23-agent/)]

**Reliability in Learn Prompting** \
[[Link](https://learnprompting.org/docs/category/%EF%B8%8F-reliability)]

**Building LLM applications for production** \
*Chip Huyen* \
[[Link](https://huyenchip.com/2023/04/11/llm-engineering.html)] \
11 Apr 2023

**Practical, Real-World Neural Network Interpretability and Deployment** \
*Allen Schmaltz* \
[[Link](https://raw.githubusercontent.com/allenschmaltz/Resolute_Resolutions/master/volume3/volume3.pdf)] \
11 Dec 2021

## Technical Reports

**GPT-4 Technical Report** \
*OpenAI* \
arXiv 2023.
[[Paper](https://cdn.openai.com/papers/gpt-4.pdf)][[Cookbook](https://github.com/openai/evals)] \
16 Mar 2023

**GPT-4 System Card** \
*OpenAI* \
arXiv 2023. [[Paper](https://cdn.openai.com/papers/gpt-4-system-card.pdf)] [[Github](https://github.com/openai/evals)] \
15 Mar 2023

## Tutorial

**Uncertainty Estimation for Natural Language Processing** \
*Adam Fisch, Robin Jia, Tal Schuster* \
COLING 2022. [[Website](https://sites.google.com/view/uncertainty-nlp)]

# Papers

## Evaluation & Survey

**Wider and Deeper LLM Networks are Fairer LLM Evaluators** \
*Xinghua Zhang, Bowen Yu, Haiyang Yu, Yangyu Lv, Tingwen Liu, Fei Huang, Hongbo Xu, Yongbin Li* \
arXiv 2023. [[Paper](https://aps.arxiv.org/abs/2308.01862)][[Github](https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/WideDeep)] \
3 Aug 2023

**A Survey on Evaluation of Large Language Models** \
*Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Kaijie Zhu, Hao Chen, Linyi Yang, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2307.03109)][[Github](https://github.com/mlgroupjlu/llm-eval-survey)] \
6 Jul 2023

**DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models** \
*Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2306.11698)] [[Github](https://github.com/AI-secure/DecodingTrust/)] [[Website](https://decodingtrust.github.io/)] \
20 Jun 2023

**In ChatGPT We Trust?
Measuring and Characterizing the Reliability of ChatGPT** \
*Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2304.08979)] \
18 Apr 2023

**Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond** \
*Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2304.13712)][[Github](https://github.com/mooler0410/llmspracticalguide)] \
27 Apr 2023

**How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks** \
*Xuanting Chen, Junjie Ye, Can Zu, Nuo Xu, Rui Zheng, Minlong Peng, Jie Zhou, Tao Gui, Qi Zhang, Xuanjing Huang* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2303.00293)][[Github](https://github.com/textflint/textflint)] \
1 Mar 2023

**Holistic Evaluation of Language Models** \
*Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda* \
arXiv 2022.
[[Paper](https://arxiv.org/abs/2211.09110)] [[Website](https://crfm.stanford.edu/helm/latest/)] [[Github](https://github.com/stanford-crfm/helm)] [[Blog](https://crfm.stanford.edu/2022/11/17/helm.html)] \
16 Nov 2022

**Prompting GPT-3 To Be Reliable** \
*Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2210.09150)] [[Github](https://github.com/NoviScl/GPT3-Reliability)] \
17 Oct 2022

**Plex: Towards Reliability using Pretrained Large Model Extensions** \
*Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, Balaji Lakshminarayanan* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2207.07411)] \
15 Jul 2022

**Language Models (Mostly) Know What They Know** \
*Saurav Kadavath, Tom Conerly, Amanda Askell, Tom Henighan, Dawn Drain, Ethan Perez, Nicholas Schiefer, Zac Hatfield-Dodds, Nova DasSarma, Eli Tran-Johnson, Scott Johnston, Sheer El-Showk, Andy Jones, Nelson Elhage, Tristan Hume, Anna Chen, Yuntao Bai, Sam Bowman, Stanislav Fort, Deep Ganguli, Danny Hernandez, Josh Jacobson, Jackson Kernion, Shauna Kravec, Liane Lovitt, Kamal Ndousse, Catherine Olsson, Sam Ringer, Dario Amodei, Tom Brown, Jack Clark, Nicholas Joseph, Ben Mann, Sam McCandlish, Chris Olah, Jared Kaplan* \
arXiv 2022.
[[Paper](https://arxiv.org/abs/2207.05221)] \
11 Jul 2022

**Augmented Language Models: a Survey** \
*Grégoire Mialon, Roberto Dessì, Maria Lomeli, Christoforos Nalmpantis, Ram Pasunuru, Roberta Raileanu, Baptiste Rozière, Timo Schick, Jane Dwivedi-Yu, Asli Celikyilmaz, Edouard Grave, Yann LeCun, Thomas Scialom* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2302.07842)] \
15 Feb 2023

**A Survey of Evaluation Metrics Used for NLG Systems** \
*Ananya B. Sai, Akash Kumar Mohankumar, Mitesh M. Khapra* \
ACM Computing Surveys 2022. [[Paper](https://dl.acm.org/doi/abs/10.1145/3485766)] \
18 Jan 2022

**NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation** \
*Kaustubh D. Dhole, et al.* \
ACL 2021. [[Paper](https://arxiv.org/abs/2112.02721)][[Github](https://github.com/GEM-benchmark/NL-Augmenter)] \
6 Dec 2021

**TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing** \
*Tao Gui et al.* \
arXiv 2021. [[Paper](https://arxiv.org/abs/2103.11441)][[Github](https://github.com/textflint/textflint)] \
21 Mar 2021

**Robustness Gym: Unifying the NLP Evaluation Landscape** \
*Karan Goel, Nazneen Rajani, Jesse Vig, Samson Tan, Jason Wu, Stephan Zheng, Caiming Xiong, Mohit Bansal, Christopher Ré* \
ACL 2021. [[Paper](https://arxiv.org/abs/2101.04840)] [[Github](https://github.com/robustness-gym/robustness-gym)] \
13 Jan 2021

**Beyond Accuracy: Behavioral Testing of NLP models with CheckList** \
*Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh* \
ACL 2020.
[[Paper](https://arxiv.org/abs/2005.04118)][[Github](https://github.com/marcotcr/checklist)] \
8 May 2020

## Uncertainty

### Uncertainty Estimation

**BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models** \
*Yibin Wang, Haizhou Shi, Ligong Han, Dimitris Metaxas, Hao Wang* \
arXiv 2024. [[Paper](https://arxiv.org/pdf/2406.11675)] \
18 Jun 2024

**Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities** \
*Alexander Nikitin, Jannik Kossen, Yarin Gal, Pekka Marttinen* \
NeurIPS 2024. [[Paper](https://arxiv.org/pdf/2405.20003)] [[Github](https://github.com/AlexanderVNikitin/kernel-language-entropy)] \
30 May 2024

**Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach** \
*Linyu Liu, Yu Pan, Xiaocheng Li, Guanting Chen* \
arXiv 2024. [[Paper](https://arxiv.org/pdf/2404.15993)] \
24 Apr 2024

**MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMs** \
*Bakman et al.* \
ACL 2024. [[Paper](https://aclanthology.org/2024.acl-long.419.pdf)] \
19 Feb 2024

**Shifting Attention to Relevance: Towards the Uncertainty Estimation of Large Language Models** \
*Jinhao Duan, Hao Cheng, Shiqi Wang, Alex Zavalny, Chenan Wang, Renjing Xu, Bhavya Kailkhura, Kaidi Xu* \
arXiv 2023. [[Paper](https://arxiv.org/pdf/2307.01379.pdf)] \
9 Oct 2023

**Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models** \
*Yuheng Huang, Jiayang Song, Zhijie Wang, Shengming Zhao, Huaming Chen, Felix Juefei-Xu, Lei Ma* \
arXiv 2023.
[[Paper](https://arxiv.org/pdf/2307.10236.pdf)] \
16 Jul 2023

**Quantifying Uncertainty in Natural Language Explanations of Large Language Models** \
*Sree Harsha Tanneru, Chirag Agarwal, Himabindu Lakkaraju* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2311.03533v1)] \
6 Nov 2023

**Conformal Autoregressive Generation: Beam Search with Coverage Guarantees** \
*Nicolas Deutschmann, Marvin Alberts, María Rodríguez Martínez* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2309.03797)] \
7 Sep 2023

**Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness** \
*Jiuhai Chen, Jonas Mueller* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2308.16175)] \
30 Aug 2023

**Uncertainty in Natural Language Generation: From Theory to Applications** \
*Joris Baan, Nico Daheim, Evgenia Ilia, Dennis Ulmer, Haau-Sing Li, Raquel Fernández, Barbara Plank, Rico Sennrich, Chrysoula Zerva, Wilker Aziz* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2307.15703)] \
28 Jul 2023

**Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models** \
*Zhen Lin, Shubhendu Trivedi, Jimeng Sun* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2305.19187)] [[Github](https://github.com/zlin7/UQ-NLG)] \
30 May 2023

**Human Uncertainty in Concept-Based AI Systems** \
*Katherine M. Collins, Matthew Barker, Mateo Espinosa Zarlenga, Naveen Raman, Umang Bhatt, Mateja Jamnik, Ilia Sucholutsky, Adrian Weller, Krishnamurthy Dvijotham* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2303.12872)] \
22 Mar 2023

**Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models** \
*Kaitlyn Zhou, Dan Jurafsky, Tatsunori Hashimoto* \
arXiv 2023.
[[Paper](https://arxiv.org/abs/2302.13439)] \
25 Feb 2023

**DEUP: Direct Epistemic Uncertainty Prediction** \
*Salem Lahlou, Moksh Jain, Hadi Nekoei, Victor Ion Butoi, Paul Bertin, Jarrid Rector-Brooks, Maksym Korablyov, Yoshua Bengio* \
TMLR 2023. [[Paper](https://arxiv.org/abs/2102.08501)] \
3 Feb 2023

**On Compositional Uncertainty Quantification for Seq2seq Graph Parsing** \
*Zi Lin, Du Phan, Panupong Pasupat, Jeremiah Zhe Liu, Jingbo Shang* \
ICLR 2023. [[Paper](https://openreview.net/forum?id=rJcLocAJpA6)] \
1 Feb 2023

**Neural-Symbolic Inference for Robust Autoregressive Graph Parsing via Compositional Uncertainty Quantification** \
*Zi Lin, Jeremiah Liu, Jingbo Shang* \
EMNLP 2022. [[Paper](https://arxiv.org/abs/2301.11459)] \
16 Jan 2023

**Teaching Models to Express Their Uncertainty in Words** \
*Stephanie Lin, Jacob Hilton, Owain Evans* \
TMLR 2022. [[Paper](https://arxiv.org/abs/2205.14334)] [[Github](https://github.com/sylinrl/CalibratedMath)] [[TMLR](https://openreview.net/forum?id=8s8K2UZGTZ)] [[Slide](https://owainevans.github.io/pdfs/chai_calibration_owain.pdf)] \
28 May 2022

**Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation** \
*Lorenz Kuhn, Yarin Gal, Sebastian Farquhar* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2302.09664)] \
19 Feb 2023

**Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach** \
*Yue Yu, Rongzhi Zhang, Ran Xu, Jieyu Zhang, Jiaming Shen, Chao Zhang* \
arXiv 2022.
[[Paper](https://arxiv.org/abs/2209.06995)][[Github](https://github.com/yueyu1030/Patron)] \
15 Sep 2022

**Fine-Tuning Language Models via Epistemic Neural Networks** \
*Ian Osband, Seyed Mohammad Asghari, Benjamin Van Roy, Nat McAleese, John Aslanides, Geoffrey Irving* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2211.01568)][[Github](https://github.com/deepmind/neural_testbed)] \
3 Nov 2022

**Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis** \
*Yuxin Xiao, Paul Pu Liang, Umang Bhatt, Willie Neiswanger, Ruslan Salakhutdinov, Louis-Philippe Morency* \
EMNLP 2022 (Findings). [[Paper](https://arxiv.org/abs/2210.04714)][[Github](https://github.com/xiaoyuxin1002/uq-plm)] \
10 Oct 2022

**Uncertainty Estimation for Language Reward Models** \
*Adam Gleave, Geoffrey Irving* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2203.07472)] \
14 Mar 2022

**Uncertainty Estimation and Reduction of Pre-trained Models for Text Regression** \
*Yuxia Wang, Daniel Beck, Timothy Baldwin, Karin Verspoor* \
TACL 2022. [[Paper](https://aclanthology.org/2022.tacl-1.39/)] \
Jun 2022

**Uncertainty Estimation in Autoregressive Structured Prediction** \
*Andrey Malinin, Mark Gales* \
ICLR 2021. [[Paper](https://arxiv.org/abs/2002.07650)] \
18 Feb 2020

**Unsupervised Quality Estimation for Neural Machine Translation** \
*Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia* \
TACL 2020.
[[Paper](https://arxiv.org/abs/2005.10608)][[Dataset](https://github.com/facebookresearch/mlqe)] \
21 May 2020

**Analyzing Uncertainty in Neural Machine Translation** \
*Myle Ott, Michael Auli, David Grangier, Marc’Aurelio Ranzato* \
ICML 2018. [[Paper](https://proceedings.mlr.press/v80/ott18a.html)] \
2018

**Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers** \
*Dylan Bouchard, Mohit Singh Chauhan* \
arXiv 2025. [[Paper](https://arxiv.org/abs/2504.19254)][[GitHub](https://github.com/cvs-health/uqlm)] \
Apr 2025

### Calibration

**Similarity-Distance-Magnitude Universal Verification** \
*Allen Schmaltz* \
arXiv 2025. [[Paper](https://arxiv.org/pdf/2502.20167)] [[Github](https://github.com/ReexpressAI/sdm)] \
27 Feb 2025

**Calibrating Large Language Models Using Their Generations Only** \
*Dennis Ulmer, Martin Gubri, Hwaran Lee, Sangdoo Yun, Seong Joon Oh* \
ACL 2024. [[Paper](https://arxiv.org/pdf/2403.05973)][[Github](https://github.com/parameterlab/apricot)][[Poster](https://gubri.eu/pdf/Poster_Apricot_ACL2024.pdf)][[Slides](https://gubri.eu/pdf/Slides_APRICOT.pdf)] \
9 Mar 2024

**Batch Calibration: Rethinking Calibration for In-Context Learning and Prompt Engineering** \
*Han Zhou, Xingchen Wan, Lev Proleev, Diana Mincu, Jilin Chen, Katherine Heller, Subhrajit Roy* \
ICLR 2024. [[Paper](https://arxiv.org/abs/2309.17249)] \
24 Jan 2024

**Do Large Language Models Know What They Don't Know?** \
*Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang* \
arXiv 2023.
[[Paper](https://arxiv.org/abs/2305.18153)] \
29 May 2023

**Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback** \
*Katherine Tian, Eric Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2305.14975)] \
24 May 2023

**Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4** \
*Kellin Pelrine, Meilina Reksoprodjo, Caleb Gupta, Joel Christoph, Reihaneh Rabbany* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2305.14928)] \
24 May 2023

**Calibrated Interpretation: Confidence Estimation in Semantic Parsing** \
*Elias Stengel-Eskin, Benjamin Van Durme* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2211.07443)] [[Github](https://github.com/esteng/calibration_miso)] \
14 Nov 2022

**Calibrating Sequence Likelihood Improves Conditional Language Generation** \
*Yao Zhao, Misha Khalman, Rishabh Joshi, Shashi Narayan, Mohammad Saleh, Peter J. Liu* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2210.00045)] \
30 Sep 2022

**Calibrated Selective Classification** \
*Adam Fisch, Tommi Jaakkola, Regina Barzilay* \
TMLR 2022. [[Paper](https://arxiv.org/abs/2208.12084)] \
25 Aug 2022

**Reducing conversational agents' overconfidence through linguistic calibration** \
*Sabrina J. Mielke, Arthur Szlam, Emily Dinan, Y-Lan Boureau* \
NAACL 2022. [[Paper](https://arxiv.org/abs/2012.14983)] \
22 Jun 2022

**Re-Examining Calibration: The Case of Question Answering** \
*Chenglei Si, Chen Zhao, Sewon Min, Jordan Boyd-Graber* \
EMNLP 2022 Findings.
[[Paper](https://arxiv.org/abs/2205.12507)] \
25 May 2022

**Towards Collaborative Neural-Symbolic Graph Semantic Parsing via Uncertainty** \
*Zi Lin, Jeremiah Liu, Jingbo Shang* \
ACL Findings 2022. [[Paper](https://aclanthology.org/2022.findings-acl.328.pdf)] \
22 May 2022

**Uncertainty-aware machine translation evaluation** \
*Taisiya Glushkova, Chrysoula Zerva, Ricardo Rei, André F. T. Martins* \
EMNLP 2021. [[Paper](https://arxiv.org/abs/2109.06352)] \
13 Sep 2021

**Calibrate Before Use: Improving Few-Shot Performance of Language Models** \
*Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh* \
ICML 2021. [[Paper](https://arxiv.org/abs/2102.09690)][[Github](https://github.com/tonyzhaozh/few-shot-learning)] \
19 Feb 2021

**How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering** \
*Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig* \
TACL 2021. [[Paper](https://arxiv.org/abs/2012.00955)][[Github](https://github.com/jzbjyb/lm-calibration)] \
2 Dec 2020

**Calibration of Pre-trained Transformers** \
*Shrey Desai, Greg Durrett* \
EMNLP 2020. [[Paper](https://arxiv.org/abs/2003.07892)][[Github](https://github.com/shreydesai/calibration)] \
17 May 2020

### Ambiguity

**Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models** \
*Gangwoo Kim, Sungdong Kim, Byeongguk Jeon, Joonsuk Park, Jaewoo Kang* \
EMNLP 2023. [[Paper](https://arxiv.org/abs/2310.14696)][[Github](https://github.com/gankim/tree-of-clarifications)] \
23 Oct 2023

**Selectively Answering Ambiguous Questions** \
*Jeremy R. Cole, Michael J.Q. Zhang, Daniel Gillick, Julian Martin Eisenschlos, Bhuwan Dhingra, Jacob Eisenstein* \
arXiv 2023.
[[Paper](https://arxiv.org/abs/2305.14613)] \
24 May 2023

**We're Afraid Language Models Aren't Modeling Ambiguity** \
*Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2304.14399v1)][[Github](https://github.com/alisawuffles/ambient)] \
24 Apr 2023

**Task Ambiguity in Humans and Language Models** \
*Alex Tamkin, Kunal Handa, Avash Shrestha, Noah Goodman* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2212.10711)][[Github](https://github.com/alextamkin/active-learning-pretrained-models)] \
20 Dec 2022

**CLAM: Selective Clarification for Ambiguous Questions with Generative Language Models** \
*Lorenz Kuhn, Yarin Gal, Sebastian Farquhar* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2212.07769)] \
15 Dec 2022

**How to Approach Ambiguous Queries in Conversational Search: A Survey of Techniques, Approaches, Tools, and Challenges** \
*Kimiya Keyvan, Jimmy Xiangji Huang* \
ACM Computing Surveys 2022. [[Paper](https://dl.acm.org/doi/abs/10.1145/3534965)] \
7 Dec 2022

**Assistance with large language models** \
*Dmitrii Krasheninnikov, Egor Krasheninnikov, David Krueger* \
NeurIPS MLSW Workshop 2022. [[Paper](https://openreview.net/forum?id=OE9V81spp6B)] \
5 Dec 2022

**Why Did the Chicken Cross the Road? Rephrasing and Analyzing Ambiguous Questions in VQA** \
*Elias Stengel-Eskin, Jimena Guallar-Blasco, Yi Zhou, Benjamin Van Durme* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2211.07516)][[Github](https://github.com/esteng/ambiguous_vqa)] \
14 Nov 2022

**Abg-CoQA: Clarifying Ambiguity in Conversational Question Answering** \
*Meiqi Guo, Mingda Zhang, Siva Reddy, Malihe Alikhani* \
AKBC 2021.
[[Paper](https://openreview.net/forum?id=SlDZ1o8FsJU)] \
22 Jun 2021

### Confidence

**The Confidence-Competence Gap in Large Language Models: A Cognitive Study** \
*Aniket Kumar Singh, Suman Devkota, Bishal Lamichhane, Uttam Dhakal, Chandra Dhakal* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2309.16145)] \
28 Sep 2023

**Strength in Numbers: Estimating Confidence of Large Language Models by Prompt Agreement** \
*Gwenyth Portillo Wightman, Alexandra Delucia, Mark Dredze* \
ACL TrustNLP Workshop 2023. [[Paper](https://aclanthology.org/2023.trustnlp-1.28/)] \
1 Jul 2023

**What Are the Different Approaches for Detecting Content Generated by LLMs Such As ChatGPT? And How Do They Work and Differ?** \
*Sebastian Raschka* \
[[Link](https://sebastianraschka.com/blog/2023/detect-ai.html)] [[GPTZero](https://gptzero.me/)] \
1 Feb 2023

**DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature** \
*Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D. Manning, Chelsea Finn* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2301.11305v1)][[Website](https://ericmitchell.ai/detectgpt/)] \
26 Jan 2023

**Confident Adaptive Language Modeling** \
*Tal Schuster, Adam Fisch, Jai Gupta, Mostafa Dehghani, Dara Bahri, Vinh Q. Tran, Yi Tay, Donald Metzler* \
NeurIPS 2022. [[Paper](https://arxiv.org/abs/2207.07061)] \
25 Oct 2022

**Conformal risk control** \
*Anastasios N. Angelopoulos, Stephen Bates, Adam Fisch, Lihua Lei, Tal Schuster* \
arXiv 2022.
[[Paper](https://arxiv.org/abs/2208.02814)][[Github](https://github.com/aangelopoulos/conformal-risk)] \
4 Aug 2022

### Active Learning

**A Survey of Active Learning for Natural Language Processing** \
*Zhisong Zhang, Emma Strubell, Eduard Hovy* \
EMNLP 2022. [[Paper](https://arxiv.org/abs/2210.10109)][[Github](https://github.com/zzsfornlp/zmsp)] \
18 Oct 2022

**Active Prompting with Chain-of-Thought for Large Language Models** \
*Shizhe Diao, Pengcheng Wang, Yong Lin, Tong Zhang* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2302.12246)][[Github](https://github.com/shizhediao/active-prompt)] \
23 Feb 2023

**Low-resource Interactive Active Labeling for Fine-tuning Language Models** \
*Seiji Maekawa, Dan Zhang, Hannah Kim, Sajjadur Rahman, Estevam Hruschka* \
EMNLP Findings 2022. [[Paper](https://aclanthology.org/2022.findings-emnlp.235/)] \
7 Dec 2022

**Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions** \
*Rafal Kocielnik, Sara Kangaslahti, Shrimai Prabhumoye, Meena Hari, R. Michael Alvarez, Anima Anandkumar* \
NeurIPS Workshop 2022. [[Paper](https://arxiv.org/abs/2211.11798)] \
21 Nov 2022

**AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages** \
*Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Oreen Yousuf, Salomey Osei, Abigail Oppong, Iyanuoluwa Shode, Oluwabusayo Olufunke Awoyomi, Chris Chinenye Emezue* \
EMNLP 2022. [[Paper](https://arxiv.org/abs/2211.03263)][[Github](https://github.com/bonaventuredossou/mlm_al)] \
7 Nov 2022

**Active Learning Helps Pretrained Models Learn the Intended Task** \
*Alex Tamkin, Dat Pham Nguyen, Salil Deshpande, Jesse Mu, Noah Goodman* \
NeurIPS 2022.
[[Paper](https://openreview.net/forum?id=0Ww7UVEoNue)] [[Github](https://github.com/alextamkin/active-learning-pretrained-models)] \
31 Oct 2022

**Selective Annotation Makes Language Models Better Few-Shot Learners** \
*Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2209.01975)] [[Github](https://github.com/hkunlp/icl-selective-annotation)] \
5 Sep 2022

**Multi-task Active Learning for Pre-trained Transformer-based Models** \
*Guy Rotman, Roi Reichart* \
TACL 2022. [[Paper](https://arxiv.org/abs/2208.05379)] [[Github](https://github.com/rotmanguy/mtal)] \
10 Aug 2022

**AcTune: Uncertainty-Based Active Self-Training for Active Fine-Tuning of Pretrained Language Models** \
*Yue Yu, Lingkai Kong, Jieyu Zhang, Rongzhi Zhang, Chao Zhang* \
NAACL-HLT 2022. [[Paper](https://aclanthology.org/2022.naacl-main.102/)] [[Github](https://github.com/yueyu1030/actune)] \
10 Jul 2022

**Towards Computationally Feasible Deep Active Learning** \
*Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, Leonid Zhukov* \
NAACL 2022. [[Paper](https://arxiv.org/abs/2205.03598)] [[Github](https://github.com/airi-institute/al_nlp_feasible)] \
7 May 2022

**FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction** \
*Minh Van Nguyen, Nghia Trung Ngo, Bonan Min, Thien Huu Nguyen* \
NAACL 2022. [[Paper](https://arxiv.org/abs/2202.08316)] [[Github](https://github.com/nlp-uoregon/famie)] \
16 Feb 2022

**On the Importance of Effectively Adapting Pretrained Language Models for Active Learning** \
*Katerina Margatina, Loïc Barrault, Nikolaos Aletras* \
ACL 2022.
[[Paper](https://arxiv.org/abs/2104.08320v2)] \
2 Mar 2022

**Limitations of Active Learning With Deep Transformer Language Models** \
*Mike D'Arcy, Doug Downey* \
arXiv 2022. [[Paper](https://openreview.net/forum?id=Q8OjAGkxwP5)] \
28 Jan 2022

**Active Learning by Acquiring Contrastive Examples** \
*Katerina Margatina, Giorgos Vernikos, Loïc Barrault, Nikolaos Aletras* \
EMNLP 2021. [[Paper](https://arxiv.org/abs/2109.03764)] [[Github](https://github.com/mourga/contrastive-active-learning)] \
8 Sep 2021

**Revisiting Uncertainty-based Query Strategies for Active Learning with Transformers** \
*Christopher Schröder, Andreas Niekler, Martin Potthast* \
ACL 2022 Findings. [[Paper](https://arxiv.org/abs/2107.05687)] [[Github](https://github.com/webis-de/acl22-revisiting-uncertainty-based-query-strategies-for-active-learning-with-transformers)] \
12 Jul 2021

**Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates** \
*Artem Shelmanov, Dmitri Puzyrev, Lyubov Kupriyanova, Denis Belyakov, Daniil Larionov, Nikita Khromov, Olga Kozlova, Ekaterina Artemova, Dmitry V. Dylov, Alexander Panchenko* \
EACL 2021. [[Paper](https://arxiv.org/abs/2101.08133)] \
18 Feb 2021

**Fine-tuning BERT for Low-Resource Natural Language Understanding via Active Learning** \
*Daniel Grießhaber, Johannes Maucher, Ngoc Thang Vu* \
COLING 2020.
[[Paper](https://arxiv.org/abs/2012.02462)] \
4 Dec 2020


## Reliability


### Hallucination
> [awesome hallucination detection](https://github.com/EdinburghNLP/awesome-hallucination-detection)

**HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models** \
*Tianrui Guan\*, Fuxiao Liu\*, Xiyang Wu, Ruiqi Xian, Zongxia Li, Xiaoyu Liu, Xijun Wang, Lichang Chen, Furong Huang, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou* \
CVPR 2024. [[Paper](https://arxiv.org/abs/2310.14566)] [[Github](https://github.com/tianyi-lab/HallusionBench)] \
18 Mar 2024

**SAC$`^3`$: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency** \
*Jiaxin Zhang, Zhuohang Li, Kamalika Das, Bradley A. Malin, Sricharan Kumar* \
EMNLP 2023. [[Paper](https://arxiv.org/abs/2311.01740)] [[Github](https://github.com/intuit/sac3)] \
3 Nov 2023

**Hallucination Leaderboard** \
*Vectara* \
[[Link](https://github.com/vectara/hallucination-leaderboard)] \
2 Nov 2023

**Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators** \
*Liang Chen, Yang Deng, Yatao Bian, Zeyu Qin, Bingzhe Wu, Tat-Seng Chua, Kam-Fai Wong* \
EMNLP 2023. [[Paper](https://arxiv.org/abs/2310.07289)] [[Github](https://github.com/ChanLiang/CONNER)] \
12 Oct 2023

**Chain-of-Verification Reduces Hallucination in Large Language Models** \
*Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, Jason Weston* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2309.11495)] \
20 Sep 2023

**Do Language Models Know When They're Hallucinating References?** \
*Ayush Agrawal, Lester Mackey, Adam Tauman Kalai* \
arXiv 2023.
[[Paper](https://arxiv.org/abs/2305.18248)] \
29 May 2023

**Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation** \
*Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2305.15852)] \
25 May 2023

**Why Does ChatGPT Fall Short in Providing Truthful Answers?** \
*Shen Zheng, Jie Huang, Kevin Chen-Chuan Chang* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2304.10513)] \
24 May 2023

**How Language Model Hallucinations Can Snowball** \
*Muru Zhang, Ofir Press, William Merrill, Alisa Liu, Noah A. Smith* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2305.13534)] \
22 May 2023

**LM vs LM: Detecting Factual Errors via Cross Examination** \
*Roi Cohen, May Hamri, Mor Geva, Amir Globerson* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2305.13281)] \
22 May 2023

**HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models** \
*Junyi Li, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2305.11747)] \
19 May 2023

**SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models** \
*Potsawee Manakul, Adian Liusie, Mark J. F. Gales* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2303.08896)] [[Github](https://github.com/potsawee/selfcheckgpt)] \
8 Mar 2023

**Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback** \
*Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao* \
arXiv 2023.
[[Paper](https://arxiv.org/abs/2302.12813)] \
23 Feb 2023

**RHO (ρ): Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding** \
*Ziwei Ji, Zihan Liu, Nayeon Lee, Tiezheng Yu, Bryan Wilie, Min Zeng, Pascale Fung* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2212.01588)] \
3 Dec 2022

**FaithDial: A Faithful Benchmark for Information-Seeking Dialogue** \
*Nouha Dziri, Ehsan Kamalloo, Sivan Milton, Osmar Zaiane, Mo Yu, Edoardo M. Ponti, Siva Reddy* \
TACL 2022. [[Paper](https://arxiv.org/abs/2204.10757)] \
22 Apr 2022

**Survey of Hallucination in Natural Language Generation** \
*Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Yejin Bang, Wenliang Dai, Andrea Madotto, Pascale Fung* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2202.03629)] \
8 Feb 2022


### Truthfulness

**TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space** \
*Shaolei Zhang, Tian Yu, Yang Feng* \
arXiv 2024. [[Paper](https://arxiv.org/abs/2402.17811)] [[Github](https://github.com/ictnlp/TruthX)] \
27 Feb 2024

**Inference-Time Intervention: Eliciting Truthful Answers from a Language Model** \
*Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2306.03341)] [[Github](https://github.com/likenneth/honest_llama)] \
6 Jun 2023

**The Internal State of an LLM Knows When It's Lying** \
*Amos Azaria, Tom Mitchell* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2304.13734)] \
26 Apr 2023

**TruthfulQA: Measuring How Models Mimic Human Falsehoods** \
*Stephanie Lin, Jacob Hilton, Owain Evans* \
ACL 2022.
[[Paper](https://arxiv.org/abs/2109.07958)] [[Github](https://github.com/sylinrl/TruthfulQA)] [[Blog](https://www.lesswrong.com/posts/PF58wEdztZFX2dSue/how-truthful-is-gpt-3-a-benchmark-for-language-models)] \
8 Sep 2021

**Truthful AI: Developing and governing AI that does not lie** \
*Owain Evans, Owen Cotton-Barratt, Lukas Finnveden, Adam Bales, Avital Balwit, Peter Wills, Luca Righetti, William Saunders* \
arXiv 2021. [[Paper](https://arxiv.org/abs/2110.06674)] [[Blog](https://www.lesswrong.com/posts/aBixCPqSnTsPsTJBQ/truthful-ai-developing-and-governing-ai-that-does-not-lie)] \
13 Oct 2021

**Measuring Reliability of Large Language Models through Semantic Consistency** \
*Harsh Raj, Domenic Rosati, Subhabrata Majumdar* \
NeurIPS 2022 ML Safety Workshop. [[Paper](https://arxiv.org/abs/2211.05853)] \
10 Nov 2022

### Reasoning

**REFINER: Reasoning Feedback on Intermediate Representations** \
*Debjit Paul, Mete Ismayilzada, Maxime Peyrard, Beatriz Borges, Antoine Bosselut, Robert West, Boi Faltings* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2304.01904)] \
4 Apr 2023

**OpenICL: An Open-Source Framework for In-context Learning** \
*Zhenyu Wu, YaoXiang Wang, Jiacheng Ye, Jiangtao Feng, Jingjing Xu, Yu Qiao, Zhiyong Wu* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2303.02913)] [[Github](https://github.com/shark-nlp/openicl)] \
6 Mar 2023

**Reliable Natural Language Understanding with Large Language Models and Answer Set Programming** \
*Abhiramon Rajasekharan, Yankai Zeng, Parth Padalkar, Gopal Gupta* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2302.03780)] \
7 Feb 2023

**Self-Consistency Improves Chain of Thought Reasoning in Language Models** \
*Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou* \
ICLR 2023.
[[Paper](https://arxiv.org/abs/2203.11171)] \
21 Mar 2022

**Chain of Thought Prompting Elicits Reasoning in Large Language Models** \
*Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Quoc Le, Denny Zhou* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2201.11903)] \
28 Jan 2022

**STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning** \
*Eric Zelikman, Yuhuai Wu, Noah D. Goodman* \
NeurIPS 2022. [[Paper](https://arxiv.org/abs/2203.14465)] [[Github](https://github.com/ezelikman/STaR)] \
28 Mar 2022

**The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning** \
*Xi Ye, Greg Durrett* \
NeurIPS 2022. [[Paper](https://arxiv.org/abs/2205.03401)] [[Github](https://github.com/xiye17/TextualExplInContext)] \
6 May 2022

**Rationale-Augmented Ensembles in Language Models** \
*Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Denny Zhou* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2207.00747)] \
2 Jul 2022

**ReAct: Synergizing Reasoning and Acting in Language Models** \
*Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2210.03629)] [[Github](https://github.com/ysymyth/ReAct)] [[Project](https://react-lm.github.io/)] \
6 Oct 2022

**On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning** \
*Omar Shaikh, Hongxin Zhang, William Held, Michael Bernstein, Diyi Yang* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2212.08061)] \
15 Dec 2022

**On the Advance of Making Language Models Better Reasoners** \
*Yifei Li, Zeqi Lin, Shizhuo Zhang, Qiang Fu, Bei Chen, Jian-Guang Lou, Weizhu Chen* \
arXiv 2022.
[[Paper](https://arxiv.org/abs/2206.02336)] [[Github](https://github.com/microsoft/CodeT)] \
6 Jun 2022

**Ask Me Anything: A simple strategy for prompting language models** \
*Simran Arora, Avanika Narayan, Mayee F. Chen, Laurel Orr, Neel Guha, Kush Bhatia, Ines Chami, Frederic Sala, Christopher Ré* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2210.02441)] [[Github](https://github.com/HazyResearch/ama_prompting)] \
5 Oct 2022

**MathPrompter: Mathematical Reasoning using Large Language Models** \
*Shima Imani, Liang Du, Harsh Shrivastava* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2303.05398)] \
4 Mar 2023

**Complexity-Based Prompting for Multi-Step Reasoning** \
*Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2210.00720)] [[Github](https://github.com/FranxYao/Complexity-Based-Prompting)] \
3 Oct 2022

**Measuring and Narrowing the Compositionality Gap in Language Models** \
*Ofir Press, Muru Zhang, Sewon Min, Ludwig Schmidt, Noah A. Smith, Mike Lewis* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2210.03350)] [[Github](https://github.com/ofirpress/self-ask)] \
7 Oct 2022

**Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions** \
*Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2212.10509)] [[Github](https://github.com/StonyBrookNLP/ircot)] \
20 Dec 2022


### Prompt tuning, optimization and design


**Large Language Models as Optimizers** \
*Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen* \
arXiv 2023.
[[Paper](https://arxiv.org/abs/2309.03409)] \
7 Sep 2023

**InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models** \
*Lichang Chen, Jiuhai Chen, Tom Goldstein, Heng Huang, Tianyi Zhou* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2306.03082)] [[Github](https://github.com/lichang-chen/instructzero)] \
5 Jun 2023

**PromptBoosting: Black-Box Text Classification with Ten Forward Passes** \
*Bairu Hou, Joe O'Connor, Jacob Andreas, Shiyu Chang, Yang Zhang* \
ICML 2023. [[Paper](https://proceedings.mlr.press/v202/hou23b.html)] [[Github](https://github.com/UCSB-NLP-Chang/PromptBoosting)] \
23 Jan 2023

**GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models** \
*Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal* \
EACL 2023. [[Paper](https://arxiv.org/abs/2203.07281)] [[Github](https://github.com/archiki/grips)] \
14 Mar 2022

**RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning** \
*Mingkai Deng, Jianyu Wang, Cheng-Ping Hsieh, Yihan Wang, Han Guo, Tianmin Shu, Meng Song, Eric P. Xing, Zhiting Hu* \
EMNLP 2022. [[Paper](https://arxiv.org/abs/2205.12548)] [[Github](https://github.com/mingkaid/rl-prompt)] \
25 May 2022

**Black-box Prompt Learning for Pre-trained Language Models** \
*Shizhe Diao, Zhichao Huang, Ruijia Xu, Xuechun Li, Yong Lin, Xiao Zhou, Tong Zhang* \
TMLR 2023. [[Paper](https://arxiv.org/abs/2201.08531)] [[Github](https://github.com/shizhediao/Black-Box-Prompt-Learning)] \
22 Jan 2022

**Black-Box Tuning for Language-Model-as-a-Service** \
*Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu* \
ICML 2022.
[[Paper](https://arxiv.org/abs/2201.03514)] [[Github](https://github.com/txsun1997/Black-Box-Tuning)] \
10 Jan 2022

**BBTv2: Towards a Gradient-Free Future with Large Language Models** \
*Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu* \
EMNLP 2022. [[Paper](https://aclanthology.org/2022.emnlp-main.259/)] [[Github](https://github.com/txsun1997/Black-Box-Tuning)] \
7 Dec 2022

**Automatic Chain of Thought Prompting in Large Language Models** \
*Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2210.03493)] [[Github](https://github.com/amazon-science/auto-cot)] \
7 Oct 2022

**Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data** \
*KaShun Shum, Shizhe Diao, Tong Zhang* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2302.12822)] [[Github](https://github.com/shizhediao/automate-cot)] \
24 Feb 2023

**Large Language Models Are Human-Level Prompt Engineers** \
*Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2211.01910)] [[Github](https://github.com/keirp/automatic_prompt_engineer)] \
3 Nov 2022

**Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity** \
*Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp* \
ACL 2022. [[Paper](https://arxiv.org/abs/2104.08786)]

**Active Example Selection for In-Context Learning** \
*Yiming Zhang, Shi Feng, Chenhao Tan* \
EMNLP 2022.
[[Paper](https://arxiv.org/abs/2211.04486)] [[Github](https://github.com/ChicagoHAI/active-example-selection)] \
8 Nov 2022

**Selective Annotation Makes Language Models Better Few-Shot Learners** \
*Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2209.01975)] [[Github](https://github.com/HKUNLP/icl-selective-annotation)] \
5 Sep 2022

**Learning To Retrieve Prompts for In-Context Learning** \
*Ohad Rubin, Jonathan Herzig, Jonathan Berant* \
NAACL-HLT 2022. [[Paper](https://arxiv.org/abs/2112.08633)] [[Github](https://github.com/OhadRubin/EPR)] \
16 Dec 2021


### Instruction and RLHF

**LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions** \
*Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2304.14402)] [[Github](https://github.com/mbzuai-nlp/lamini-lm)] \
27 Apr 2023

**Self-Refine: Iterative Refinement with Self-Feedback** \
*Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Sean Welleck, Bodhisattwa Prasad Majumder, Shashank Gupta, Amir Yazdanbakhsh, Peter Clark* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2303.17651)] [[Github](https://github.com/madaan/self-refine)] [[Website](https://selfrefine.info/)] \
30 Mar 2023

**Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning** \
*Renze Lou, Kai Zhang, Wenpeng Yin* \
arXiv 2023.
[[Paper](https://arxiv.org/abs/2303.10475)] [[Github](https://github.com/RenzeLou/awesome-instruction-learning)] \
18 Mar 2023

**Self-Instruct: Aligning Language Models with Self-Generated Instructions** \
*Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2212.10560)] [[Github](https://github.com/yizhongw/self-instruct)] \
20 Dec 2022

**Constitutional AI: Harmlessness from AI Feedback** \
*Yuntao Bai, et al. (Anthropic)* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2212.08073)] \
15 Dec 2022

**Discovering Language Model Behaviors with Model-Written Evaluations** \
*Ethan Perez, et al.* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2212.09251)] \
19 Dec 2022

**In-Context Instruction Learning** \
*Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang, Hyeongu Yun, Yireun Kim, Minjoon Seo* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2302.14691)] [[Github](https://github.com/seonghyeonye/ICIL)] \
28 Feb 2023


### Tools and external APIs

**Internet-augmented language models through few-shot prompting for open-domain question answering** \
*Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2203.05115)] \
10 Mar 2022

**Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks** \
*Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen* \
arXiv 2022.
[[Paper](https://arxiv.org/abs/2211.12588)] [[Github](https://github.com/wenhuchen/Program-of-Thoughts)] \
22 Nov 2022

**PAL: Program-aided Language Models** \
*Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, Graham Neubig* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2211.10435)] [[Github](https://github.com/reasoning-machines/pal)] [[Project](https://reasonwithpal.com/)] \
18 Nov 2022

**TALM: Tool Augmented Language Models** \
*Aaron Parisi, Yao Zhao, Noah Fiedel* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2205.12255)] \
24 May 2022

**Toolformer: Language Models Can Teach Themselves to Use Tools** \
*Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2302.04761)] \
9 Feb 2023


### Fine-tuning

**Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes** \
*Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2305.02301)] \
3 May 2023

**FreeLM: Fine-Tuning-Free Language Model** \
*Xiang Li, Xin Jiang, Xuying Meng, Aixin Sun, Yequan Wang* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2305.01616)] \
2 May 2023

**Automated Data Curation for Robust Language Model Fine-Tuning** \
*Jiuhai Chen, Jonas Mueller* \
arXiv 2024.
[[Paper](https://arxiv.org/abs/2403.12776)] \
19 Mar 2024


## Robustness

### Invariance

**Invariant Language Modeling** \
*Maxime Peyrard, Sarvjeet Singh Ghotra, Martin Josifoski, Vidhan Agarwal, Barun Patra, Dean Carignan, Emre Kiciman, Robert West* \
EMNLP 2022. [[Paper](https://arxiv.org/abs/2110.08413)] [[Github](https://github.com/epfl-dlab/invariant-language-models)] \
16 Oct 2021

**Towards Robust Personalized Dialogue Generation via Order-Insensitive Representation Regularization** \
*Liang Chen, Hongru Wang, Yang Deng, Wai-Chung Kwan, Kam-Fai Wong* \
Findings of ACL 2023. [[Paper](https://aclanthology.org/2023.findings-acl.462)] [[Github](https://github.com/ChanLiang/ORIG)] \
22 May 2023


### Distribution Shift

**Exploring Distributional Shifts in Large Language Models for Code Analysis** \
*Shushan Arakelyan, Rocktim Jyoti Das, Yi Mao, Xiang Ren* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2303.09128)] \
16 Mar 2023


### Out-of-Distribution

**Out-of-Distribution Detection and Selective Generation for Conditional Language Models** \
*Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji Lakshminarayanan, Peter J. Liu* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2209.15558)] \
30 Sep 2022


### Adaptation and Generalization

**On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey** \
*Xu Guo, Han Yu* \
arXiv 2022. [[Paper](https://arxiv.org/abs/2211.03154)] \
6 Nov 2022

### Adversarial

**PEARL: Towards Permutation-Resilient LLMs** \
*Liang Chen, Li Shen, Yang Deng, Xiaoyan Zhao, Bin Liang, Kam-Fai Wong* \
ICLR 2025.
[[Paper](https://openreview.net/pdf?id=txoJvjfI9w)] [[Github](https://github.com/ChanLiang/PEARL)] \
27 Feb 2025

**Adversarial Attacks on LLMs** \
*Lilian Weng* \
[[Blog](https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/)] \
25 Oct 2023

**PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts** \
*Kaijie Zhu, Jindong Wang, Jiaheng Zhou, Zichen Wang, Hao Chen, Yidong Wang, Linyi Yang, Wei Ye, Neil Zhenqiang Gong, Yue Zhang, Xing Xie* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2306.04528)] [[Github](https://github.com/microsoft/promptbench)] \
7 Jun 2023

**On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective** \
*Jindong Wang, Xixu Hu, Wenxin Hou, Hao Chen, Runkai Zheng, Yidong Wang, Linyi Yang, Haojun Huang, Wei Ye, Xiubo Geng, Binxin Jiao, Yue Zhang, Xing Xie* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2302.12095)] [[Github](https://github.com/microsoft/robustlearn)] \
22 Feb 2023

**Reliability Testing for Natural Language Processing Systems** \
*Samson Tan, Shafiq Joty, Kathy Baxter, Araz Taeihagh, Gregory A. Bennett, Min-Yen Kan* \
ACL-IJCNLP 2021. [[Paper](https://arxiv.org/abs/2105.02590)] \
6 May 2021


### Attribution

**Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models** \
*Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster* \
arXiv 2022.
[[Paper](https://arxiv.org/abs/2212.08037)] \
15 Dec 2022

### Causality

**Can Large Language Models Infer Causation from Correlation?** \
*Zhijing Jin, Jiarui Liu, Zhiheng Lyu, Spencer Poff, Mrinmaya Sachan, Rada Mihalcea, Mona Diab, Bernhard Schölkopf* \
arXiv 2023. [[Paper](https://arxiv.org/abs/2306.05836)] [[Github](https://github.com/causalnlp/corr2cause)] \
9 Jun 2023

**Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning** \
*Antonia Creswell, Murray Shanahan, Irina Higgins* \
ICLR 2023. [[Paper](https://arxiv.org/abs/2205.09712)] \
19 May 2022

**Investigating causal understanding in LLMs** \
*Marius Hobbhahn, Tom Lieberum, David Seiler* \
NeurIPS 2022 Workshop. [[Paper](https://openreview.net/forum?id=st6jtGdW8Ke)] [[Blog](https://www.lesswrong.com/posts/yZb5eFvDoaqB337X5/investigating-causal-understanding-in-llms)] \
3 Oct 2022