├── pictures
│   └── logo.jpg
├── .idea
│   ├── vcs.xml
│   ├── misc.xml
│   ├── .gitignore
│   ├── inspectionProfiles
│   │   └── profiles_settings.xml
│   ├── modules.xml
│   └── Cognitive_Mirage.iml
├── LICENSE
└── README.md
/pictures/logo.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hongbinye/Cognitive-Mirage-Hallucinations-in-LLMs/HEAD/pictures/logo.jpg
--------------------------------------------------------------------------------
/.idea/vcs.xml:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="UTF-8"?>
2 | <project version="4">
3 |   <component name="VcsDirectoryMappings">
4 |     <mapping directory="$PROJECT_DIR$" vcs="Git" />
5 |   </component>
6 | </project>
--------------------------------------------------------------------------------
/.idea/misc.xml:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="UTF-8"?>
2 | <project version="4">
3 |   <component name="ProjectRootManager" version="2" />
4 | </project>
--------------------------------------------------------------------------------
/.idea/.gitignore:
--------------------------------------------------------------------------------
1 | # Default ignored files
2 | /shelf/
3 | /workspace.xml
4 | # Editor-based HTTP Client requests
5 | /httpRequests/
6 | # Datasource local storage ignored files
7 | /dataSources/
8 | /dataSources.local.xml
9 |
--------------------------------------------------------------------------------
/.idea/inspectionProfiles/profiles_settings.xml:
--------------------------------------------------------------------------------
1 | <component name="InspectionProjectProfileManager">
2 |   <settings>
3 |     <option name="USE_PROJECT_PROFILE" value="false" />
4 |     <version value="1.0" />
5 |   </settings>
6 | </component>
--------------------------------------------------------------------------------
/.idea/modules.xml:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="UTF-8"?>
2 | <project version="4">
3 |   <component name="ProjectModuleManager">
4 |     <modules>
5 |       <module fileurl="file://$PROJECT_DIR$/.idea/Cognitive_Mirage.iml" filepath="$PROJECT_DIR$/.idea/Cognitive_Mirage.iml" />
6 |     </modules>
7 |   </component>
8 | </project>
--------------------------------------------------------------------------------
/.idea/Cognitive_Mirage.iml:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="UTF-8"?>
2 | <module type="PYTHON_MODULE" version="4">
3 |   <component name="NewModuleRootManager">
4 |     <content url="file://$MODULE_DIR$" />
5 |     <orderEntry type="inheritedJdk" />
6 |     <orderEntry type="sourceFolder" forTests="false" />
7 |   </component>
8 | </module>
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2023 Hongbin Ye
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | [//]: # (# Hallucination in Large Language Models: A Review)
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 | [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
10 | ![Cognitive Mirage logo](pictures/logo.jpg)
13 |
14 |
15 |
16 |
17 | Uncritical trust in LLMs can give rise to a phenomenon known as `Cognitive Mirage`, leading to misguided decision-making and a cascade of unintended consequences.
18 | To effectively control the risk of `hallucinations`, we summarize recent progress in hallucination theories and solutions, and we organize the relevant work into a comprehensive survey.
19 |
20 |
21 | **[:bell: News! :bell:]
22 | We have released a new survey paper: "[Cognitive Mirage: A Review of Hallucinations in Large Language Models](https://arxiv.org/abs/2309.06794)", based on this repository, offering a perspective on hallucinations in LLMs! We look forward to any comments or discussions on this topic :)**
23 |
24 | ## 🕵️ Introduction
25 |
26 | As large language models continue to develop, text generation systems have proven susceptible to a worrisome phenomenon known
27 | as hallucination. In this study, we summarize recent compelling insights into hallucinations in LLMs, presenting a novel taxonomy of hallucinations across various text generation tasks together with theoretical analyses, detection methods, and improvement approaches.
28 | Based on this, we propose future research directions. Our contributions are threefold: (1) we provide a detailed and complete taxonomy
29 | of the hallucinations that appear in text generation tasks; (2) we provide theoretical analyses of hallucinations in LLMs and review existing
30 | detection and improvement methods; (3) we propose several research directions that can be developed in the future. As hallucinations garner significant attention from the community, we will keep this repository updated with relevant research progress.
31 |
32 |
33 |
34 | ## 🏆 A Timeline of LLMs
35 |
36 | | LLM Name | Title | Authors | Publication Date |
37 | |------------------------------------------------------------------------------------------------------------------------------------|--------------------------------|----------------|------------------|
38 | | [T5](https://arxiv.org/abs/1910.10683) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. | Colin Raffel, Noam Shazeer, Adam Roberts | 2019.10 |
39 | | [GPT-3](https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html) | Language Models are Few-Shot Learners. | Tom B. Brown, Benjamin Mann, Nick Ryder | 2020.12 |
40 | | [mT5](https://doi.org/10.18653/v1/2021.naacl-main.41) | mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer. | Linting Xue, Noah Constant, Adam Roberts | 2021.3 |
41 | | [Codex](https://arxiv.org/abs/2107.03374) | Evaluating Large Language Models Trained on Code. | Mark Chen, Jerry Tworek, Heewoo Jun | 2021.7 |
42 | | [FLAN](https://openreview.net/forum?id=gEZrGCozdqR) | Finetuned Language Models are Zero-Shot Learners. | Jason Wei, Maarten Bosma, Vincent Y. Zhao | 2021.9 |
43 | | [WebGPT](https://arxiv.org/abs/2112.09332) | WebGPT: Browser-assisted question-answering with human feedback. | Reiichiro Nakano, Jacob Hilton, Suchir Balaji | 2021.12 |
44 | | [InstructGPT](https://proceedings.neurips.cc//paper_files/paper/2022/hash/b1efde53be364a73914f58805a001731-Abstract-Conference.html) | Training language models to follow instructions with human feedback. | Long Ouyang, Jeffrey Wu, Xu Jiang | 2022.3 |
45 | | [CodeGen](https://arxiv.org/abs/2203.13474) | CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis. | Erik Nijkamp, Bo Pang, Hiroaki Hayashi | 2022.3 |
46 | | [Claude](https://arxiv.org/abs/2204.05862) | Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. | Yuntao Bai, Andy Jones, Kamal Ndousse | 2022.4 |
47 | | [PaLM](https://arxiv.org/abs/2204.02311) | PaLM: Scaling Language Modeling with Pathways. | Aakanksha Chowdhery, Sharan Narang, Jacob Devlin | 2022.4 |
48 | | [OPT](https://arxiv.org/abs/2205.01068) | OPT: Open Pre-trained Transformer Language Models. | Susan Zhang, Stephen Roller, Naman Goyal | 2022.5 |
49 | | [Super-NaturalInstructions](https://doi.org/10.18653/v1/2022.emnlp-main.340) | Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks. | Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi | 2022.9 |
50 | | [GLM](https://arxiv.org/abs/2210.02414) | GLM-130B: An Open Bilingual Pre-trained Model. | Aohan Zeng, Xiao Liu, Zhengxiao Du | 2022.10 |
51 | | [BLOOM](https://doi.org/10.48550/arXiv.2211.05100) | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. | Teven Le Scao, Angela Fan, Christopher Akiki | 2022.11 |
52 | | [LLaMA](https://arxiv.org/abs/2302.13971) | LLaMA: Open and Efficient Foundation Language Models. | Hugo Touvron, Thibaut Lavril, Gautier Izacard | 2023.2 |
53 | | [Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html) | Alpaca: A Strong, Replicable Instruction-Following Model. | Rohan Taori, Ishaan Gulrajani, Tianyi Zhang | 2023.3 |
54 | | [GPT-4](https://arxiv.org/abs/2303.08774v2) | GPT-4 Technical Report. | OpenAI | 2023.3 |
55 | | [WizardLM](https://doi.org/10.48550/arXiv.2304.12244) | WizardLM: Empowering Large Language Models to Follow Complex Instructions. | Can Xu, Qingfeng Sun, Kai Zheng | 2023.4 |
56 | | [Vicuna](https://lmsys.org/blog/2023-03-30-vicuna/) | Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality. | The Vicuna Team | 2023.3 |
57 | | [ChatGLM](https://chatglm.cn/blog) | ChatGLM. | Zhipu AI Team | 2023.6 |
58 | | [Llama2](https://arxiv.org/abs/2307.09288) | Llama 2: Open Foundation and Fine-Tuned Chat Models. | Hugo Touvron, Louis Martin, Kevin Stone | 2023.7 |
59 |
60 |
61 | ## 🏳🌈 Definition of Hallucination
62 |
63 | - **"Truthful AI: Developing and governing AI that does not lie."**, 2021.10
64 | - Owain Evans, Owen Cotton-Barratt, Lukas Finnveden
65 | - [[Paper]](https://arxiv.org/abs/2110.06674)
66 |
67 | - **"Survey of Hallucination in Natural Language Generation"**, 2022.2
68 | - Ziwei Ji, Nayeon Lee, Rita Frieske
69 | - [[Paper]](https://arxiv.org/abs/2202.03629)
70 |
71 | - **"Context-faithful Prompting for Large Language Models."**, 2023.3
72 | - Wenxuan Zhou, Sheng Zhang, Hoifung Poon
73 | - [[Paper]](https://arxiv.org/abs/2303.11315)
74 |
75 | - **"Do Language Models Know When They’re Hallucinating References?"**, 2023.5
76 | - Ayush Agrawal, Lester Mackey, Adam Tauman Kalai
77 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.18248)
78 |
79 | ## 🎉 Mechanism Analysis
80 |
81 | ### ✨Data Attribution
82 |
83 | - **"Data Distributional Properties Drive Emergent In-Context Learning in Transformers."**, 2022.5
84 | - Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen
85 | - [[Paper]](http://papers.nips.cc/paper_files/paper/2022/hash/77c6ccacfd9962e2307fc64680fc5ace-Abstract-Conference.html)
86 |
87 | - **"Towards Tracing Factual Knowledge in Language Models Back to the Training Data."**, 2022.5
88 | - Ekin Akyürek, Tolga Bolukbasi, Frederick Liu
89 | - [[Paper]](https://arxiv.org/abs/2205.11482)
90 |
91 | - **"A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity."**, 2023.2
92 | - Yejin Bang, Samuel Cahyawijaya, Nayeon Lee
93 | - [[Paper]](https://doi.org/10.48550/arXiv.2302.04023)
94 |
95 | - **"Hallucinations in Large Multilingual Translation Models."**, 2023.3
96 | - Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf
97 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.16104)
98 |
99 | - **"Visual Instruction Tuning."**, 2023.4
100 | - Haotian Liu, Chunyuan Li, Qingyang Wu
101 | - [[Paper]](https://doi.org/10.48550/arXiv.2304.08485)
102 |
103 | - **"Evaluating Object Hallucination in Large Vision-Language Models."**, 2023.5
104 | - Yifan Li, Yifan Du, Kun Zhou
105 | - [[Paper]](https://arxiv.org/abs/2305.10355)
106 |
107 | - **"Sources of Hallucination by Large Language Models on Inference Tasks."**, 2023.5
108 | - Nick McKenna, Tianyi Li, Liang Cheng
109 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.14552)
110 |
111 | - **"Automatic Evaluation of Attribution by Large Language Models."**, 2023.5
112 | - Xiang Yue, Boshi Wang, Kai Zhang
113 | - [[Paper]](https://arxiv.org/abs/2305.06311)
114 |
115 | - **"Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning."**, 2023.6
116 | - Fuxiao Liu, Kevin Lin, Linjie Li
117 | - [[Paper]](https://arxiv.org/abs/2306.14565)
118 |
119 |
120 | ### ✨Knowledge Gap
121 |
122 |
123 | - **"A Survey of Knowledge-enhanced Text Generation."**, 2022.1
124 | - Wenhao Yu, Chenguang Zhu, Zaitang Li
125 | - [[Paper]](https://doi.org/10.1145/3512467)
126 |
127 | - **"Attributed Text Generation via Post-hoc Research and Revision."**, 2022.10
128 | - Luyu Gao, Zhuyun Dai, Panupong Pasupat
129 | - [[Paper]](https://doi.org/10.48550/arXiv.2210.08726)
130 |
131 | - **"Artificial Hallucinations in ChatGPT: Implications in Scientific Writing."**, 2023.2
132 | - Hussam Alkaissi, Samy I McFarlane
133 | - [[Paper]](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9939079/)
134 |
135 | - **"SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models."**, 2023.3
136 | - Potsawee Manakul, Adian Liusie, Mark J. F. Gales
137 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.08896)
138 |
139 | - **"Why Does ChatGPT Fall Short in Answering Questions Faithfully?"**, 2023.4
140 | - Shen Zheng, Jie Huang, Kevin Chen-Chuan Chang
141 | - [[Paper]](https://arxiv.org/abs/2304.10513)
142 |
143 | - **"Zero-shot Faithful Factual Error Correction."**, 2023.5
144 | - Kung-Hsiang Huang, Hou Pong Chan, Heng Ji
145 | - [[Paper]](https://aclanthology.org/2023.acl-long.311)
146 |
147 | - **"Mitigating Language Model Hallucination with Interactive Question-Knowledge Alignment."**, 2023.5
148 | - Shuo Zhang, Liangming Pan, Junzhou Zhao
149 | - [[Paper]](https://arxiv.org/abs/2305.13669)
150 |
151 | - **"Adaptive Chameleon or Stubborn Sloth: Unraveling the Behavior of Large Language Models in Knowledge Clashes."**, 2023.5
152 | - Jian Xie, Kai Zhang, Jiangjie Chen
153 | - [[Paper]](https://arxiv.org/abs/2305.13300)
154 |
155 | - **"Evaluating Generative Models for Graph-to-Text Generation."**, 2023.7
156 | - Shuzhou Yuan, Michael Färber
157 | - [[Paper]](https://doi.org/10.48550/arXiv.2307.14712)
158 |
159 |
160 | - **"Overthinking the Truth: Understanding how Language Models Process False Demonstrations."**, 2023.7
161 | - Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt
162 | - [[Paper]](https://arxiv.org/abs/2307.09476)
163 |
164 |
165 | ### ✨Optimum Formulation
166 |
167 |
168 | - **"Improved Natural Language Generation via Loss Truncation."**, 2020.5
169 | - Daniel Kang, Tatsunori Hashimoto
170 | - [[Paper]](https://doi.org/10.18653/v1/2020.acl-main.66)
171 |
172 | - **"The Curious Case of Hallucinations in Neural Machine Translation."**, 2021.4
173 | - Vikas Raunak, Arul Menezes, Marcin Junczys-Dowmunt
174 | - [[Paper]](https://doi.org/10.18653/v1/2021.naacl-main.92)
175 |
176 | - **"Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation."**, 2022.12
177 | - Nuno Miguel Guerreiro, Pierre Colombo, Pablo Piantanida
178 | - [[Paper]](https://doi.org/10.48550/arXiv.2212.09631)
179 |
180 | - **"Elastic Weight Removal for Faithful and Abstractive Dialogue Generation."**, 2023.3
181 | - Nico Daheim, Nouha Dziri, Mrinmaya Sachan
182 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.17574)
183 |
184 | - **"HistAlign: Improving Context Dependency in Language Generation by Aligning with History."**, 2023.5
185 | - David Wan, Shiyue Zhang, Mohit Bansal
186 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.04782)
187 |
188 | - **"How Language Model Hallucinations Can Snowball."**, 2023.5
189 | - Muru Zhang, Ofir Press, William Merrill
190 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.13534)
191 |
192 | - **"Improving Language Models via Plug-and-Play Retrieval Feedback."**, 2023.5
193 | - Wenhao Yu, Zhihan Zhang, Zhenwen Liang
194 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.14002)
195 |
196 | - **"Teaching Language Models to Hallucinate Less with Synthetic Tasks."**, 2023.10
197 | - Erik Jones, Hamid Palangi, Clarisse Simões
198 | - [[Paper]](https://arxiv.org/abs/2310.06827)
199 |
200 |
201 | ## 🪁 Taxonomy of LLM Hallucinations in NLP Tasks
202 |
203 | ### ✨Machine Translation
204 | - **"Unsupervised Cross-lingual Representation Learning at Scale."**, 2020.7
205 | - Alexis Conneau, Kartikay Khandelwal, Naman Goyal
206 | - [[Paper]](https://doi.org/10.18653/v1/2020.acl-main.747)
207 |
208 | - **"The Curious Case of Hallucinations in Neural Machine Translation."**, 2021.4
209 | - Vikas Raunak, Arul Menezes, Marcin Junczys-Dowmunt
210 | - [[Paper]](https://doi.org/10.18653/v1/2021.naacl-main.92)
211 |
212 | - **"Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation."**, 2022.5
213 | - Tu Vu, Aditya Barua, Brian Lester
214 | - [[Paper]](https://doi.org/10.18653/v1/2022.emnlp-main.630)
215 |
216 | - **"Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation."**, 2022.8
217 | - Nuno Miguel Guerreiro, Elena Voita, André F. T. Martins
218 | - [[Paper]](https://aclanthology.org/2023.eacl-main.75)
219 |
220 | - **"Prompting PaLM for Translation: Assessing Strategies and Performance."**, 2022.11
221 | - David Vilar, Markus Freitag, Colin Cherry
222 | - [[Paper]](https://arxiv.org/abs/2211.09102)
223 |
224 | - **"The unreasonable effectiveness of few-shot learning for machine translation."**, 2023.2
225 | - Xavier Garcia, Yamini Bansal, Colin Cherry
226 | - [[Paper]](https://arxiv.org/abs/2302.01398)
227 |
228 | - **"How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation."**, 2023.2
229 | - Amr Hendy, Mohamed Abdelrehim, Amr Sharaf
230 | - [[Paper]](https://arxiv.org/abs/2302.09210)
231 |
232 | - **"Hallucinations in Large Multilingual Translation Models."**, 2023.3
233 | - Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf
234 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.16104)
235 |
236 | - **"Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM."**, 2023.3
237 | - Rachel Bawden, François Yvon
238 | - [[Paper]](https://arxiv.org/abs/2303.01911)
239 |
240 | - **"HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation."**, 2023.5
241 | - David Dale, Elena Voita, Janice Lam, Prangthip Hansanti
242 | - [[Paper]](https://arxiv.org/abs/2305.11746)
243 |
244 | - **"mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations."**, 2023.5
245 | - Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia
246 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.14224)
247 |
248 | ### ✨Question Answering
249 |
250 | - **"Hurdles to Progress in Long-form Question Answering."**, 2021.6
251 | - Kalpesh Krishna, Aurko Roy, Mohit Iyyer
252 | - [[Paper]](https://aclanthology.org/2021.naacl-main.393/)
253 |
254 | - **"Entity-Based Knowledge Conflicts in Question Answering."**, 2021.9
255 | - Shayne Longpre, Kartik Perisetla, Anthony Chen
256 | - [[Paper]](https://aclanthology.org/2021.emnlp-main.565/)
257 |
258 | - **"TruthfulQA: Measuring How Models Mimic Human Falsehoods."**, 2022.5
259 | - Stephanie Lin, Jacob Hilton, Owain Evans
260 | - [[Paper]](https://aclanthology.org/2022.acl-long.229)
261 |
262 | - **"Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback."**, 2023.2
263 | - Baolin Peng, Michel Galley, Pengcheng He
264 | - [[Paper]](https://arxiv.org/abs/2302.12813)
265 |
266 | - **"Why Does ChatGPT Fall Short in Answering Questions Faithfully?"**, 2023.4
267 | - Shen Zheng, Jie Huang, Kevin Chen-Chuan Chang
268 | - [[Paper]](https://doi.org/10.48550/arXiv.2304.10513)
269 |
270 | - **"Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering."**, 2023.7
271 | - Vaibhav Adlakha, Parishad BehnamGhader, Xing Han Lu
272 | - [[Paper]](https://arxiv.org/abs/2307.16877)
273 |
274 | - **"Med-HALT: Medical Domain Hallucination Test for Large Language Models."**, 2023.7
275 | - Logesh Kumar Umapathi, Ankit Pal, Malaikannan Sankarasubbu
276 | - [[Paper]](https://arxiv.org/abs/2307.15343)
277 |
278 | ### ✨Dialog System
279 |
280 | - **"On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?"**, 2022.7
281 | - Nouha Dziri, Sivan Milton, Mo Yu
282 | - [[Paper]](https://aclanthology.org/2022.naacl-main.387/)
283 |
284 | - **"Contrastive Learning Reduces Hallucination in Conversations."**, 2022.12
285 | - Weiwei Sun, Zhengliang Shi, Shen Gao
286 | - [[Paper]](https://ojs.aaai.org/index.php/AAAI/article/view/26596)
287 |
288 | - **"Diving Deep into Modes of Fact Hallucinations in Dialogue Systems."**, 2022.12
289 | - Souvik Das, Sougata Saha, Rohini K. Srihari
290 | - [[Paper]](https://doi.org/10.18653/v1/2022.findings-emnlp.48)
291 |
292 | - **"Elastic Weight Removal for Faithful and Abstractive Dialogue Generation."**, 2023.3
293 | - Nico Daheim, Nouha Dziri, Mrinmaya Sachan
294 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.17574)
295 |
296 | ### ✨Summarization System
297 |
298 | - **"Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization."**, 2022.5
299 | - Meng Cao, Yue Dong, Jackie Chi Kit Cheung
300 | - [[Paper]](https://doi.org/10.18653/v1/2022.acl-long.236)
301 |
302 | - **"Evaluating the Factual Consistency of Large Language Models."**, 2022.11
303 | - Derek Tam, Anisha Mascarenhas, Shiyue Zhang
304 | - [[Paper]](https://doi.org/10.18653/v1/2023.findings-acl.322)
305 |
306 | - **"Why is this misleading?": Detecting News Headline Hallucinations with Explanations."**, 2023.2
307 | - Jiaming Shen, Jialu Liu, Dan Finnie
308 | - [[Paper]](https://doi.org/10.1145/3543507.3583375)
309 |
310 | - **"Detecting and Mitigating Hallucinations in Multilingual Summarisation."**, 2023.5
311 | - Yifu Qiu, Yftah Ziser, Anna Korhonen
312 | - [[Paper]](https://arxiv.org/abs/2305.13632)
313 |
314 | - **"LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond."**, 2023.5
315 | - Philippe Laban, Wojciech Kryściński, Divyansh Agarwal
316 | - [[Paper]](https://arxiv.org/abs/2305.14540)
317 |
318 | - **"Evaluating Factual Consistency of Texts with Semantic Role Labeling."**, 2023.5
319 | - Jing Fan, Dennis Aumiller, Michael Gertz
320 | - [[Paper]](https://arxiv.org/abs/2305.13309)
321 |
322 | - **"Summarization is (Almost) Dead."**, 2023.9
323 | - Xiao Pu, Mingqi Gao, Xiaojun Wan
324 | - [[Paper]](https://arxiv.org/abs/2309.09558)
325 |
326 | ### ✨Knowledge Graphs with LLMs
327 |
328 | - **"GPT-NER: Named Entity Recognition via Large Language Models."**, 2023.4
329 | - Shuhe Wang, Xiaofei Sun, Xiaoya Li
330 | - [[Paper]](https://arxiv.org/abs/2304.10428)
331 |
332 | - **"LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities."**, 2023.5
333 | - Yuqi Zhu, Xiaohan Wang, Jing Chen
334 | - [[Paper]](https://arxiv.org/abs/2305.13168)
335 |
336 | - **"KoLA: Carefully Benchmarking World Knowledge of Large Language Models."**, 2023.6
337 | - Jifan Yu, Xiaozhi Wang, Shangqing Tu
338 | - [[Paper]](https://arxiv.org/abs/2306.09296)
339 |
340 | - **"Evaluating Generative Models for Graph-to-Text Generation."**, 2023.7
341 | - Shuzhou Yuan, Michael Färber
342 | - [[Paper]](https://doi.org/10.48550/arXiv.2307.14712)
343 |
344 | - **"Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text."**, 2023.8
345 | - Nandana Mihindukulasooriya, Sanju Tiwari, Carlos F. Enguix
346 | - [[Paper]](https://arxiv.org/abs/2308.02357)
347 |
348 |
349 | ### ✨Cross-modal System
350 |
351 | - **"Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning."**, 2021.10
352 | - Ali Furkan Biten, Lluís Gómez, Dimosthenis Karatzas
353 | - [[Paper]](https://doi.org/10.1109/WACV51458.2022.00253)
354 |
355 | - **"Simple Token-Level Confidence Improves Caption Correctness."**, 2023.5
356 | - Suzanne Petryk, Spencer Whitehead, Joseph E. Gonzalez
357 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.07021)
358 |
359 | - **"Evaluating Object Hallucination in Large Vision-Language Models."**, 2023.5
360 | - Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang
361 | - [[Paper]](https://arxiv.org/abs/2305.10355)
362 |
363 | - **"Album Storytelling with Iterative Story-aware Captioning and Large Language Models."**, 2023.5
364 | - Munan Ning, Yujia Xie, Dongdong Chen
365 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.12943)
366 |
367 | - **"Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning."**, 2023.6
368 | - Fuxiao Liu, Kevin Lin, Linjie Li
369 | - [[Paper]](https://arxiv.org/abs/2306.14565)
370 |
371 | - **"Fact-Checking of AI-Generated Reports."**, 2023.7
372 | - Razi Mahmood, Ge Wang, Mannudeep Kalra
373 | - [[Paper]](https://arxiv.org/abs/2307.14634)
374 |
375 |
376 | ### ✨Others
377 |
378 | - **"The Scope of ChatGPT in Software Engineering: A Thorough Investigation."**, 2023.5
379 | - Wei Ma, Shangqing Liu, Wenhan Wang
380 | - [[Paper]](https://arxiv.org/abs/2305.12138)
381 |
382 | - **"Generating Benchmarks for Factuality Evaluation of Language Models."**, 2023.7
383 | - Dor Muhlgay, Ori Ram, Inbal Magar
384 | - [[Paper]](https://arxiv.org/abs/2307.06908)
385 |
386 | ## 🔮 Hallucination Detection
387 |
388 |
389 | ### ✨Inference Classifier
390 | - **"Evaluating the Factual Consistency of Large Language Models Through Summarization."**, 2022.11
391 | - Derek Tam, Anisha Mascarenhas, Shiyue Zhang
392 | - [[Paper]](https://doi.org/10.18653/v1/2023.findings-acl.322)
393 |
394 | - **"Why is this misleading?": Detecting News Headline Hallucinations with Explanations."**, 2023.2
395 | - Jiaming Shen, Jialu Liu, Dan Finnie
396 | - [[Paper]](https://doi.org/10.1145/3543507.3583375)
397 |
398 | - **"HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models."**, 2023.5
399 | - Junyi Li, Xiaoxue Cheng, Wayne Xin Zhao
400 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.11747)
401 |
402 | - **"Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning."**, 2023.6
403 | - Fuxiao Liu, Kevin Lin, Linjie Li
404 | - [[Paper]](https://arxiv.org/abs/2306.14565)
405 |
406 | - **"Fact-Checking of AI-Generated Reports."**, 2023.7
407 | - Razi Mahmood, Ge Wang, Mannudeep K. Kalra
408 | - [[Paper]](https://doi.org/10.48550/arXiv.2307.14634)
409 |
410 | - **"Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations."**, 2023.10
411 | - Deren Lei, Yaxi Li, Mengya Hu
412 | - [[Paper]](https://arxiv.org/abs/2310.03951)
413 |
414 |
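Many of the detectors above reduce to a natural language inference (NLI) check: does the source or retrieved evidence entail the generated claim? A minimal sketch of that framing, assuming a hypothetical `nli(premise, hypothesis)` scorer in place of any specific paper's trained classifier:

```python
from typing import Callable

def is_hallucinated(evidence: str, claim: str,
                    nli: Callable[[str, str], float],
                    tau: float = 0.5) -> bool:
    """Flag a claim that the evidence does not entail.

    `nli` is a hypothetical scorer returning P(evidence entails claim),
    e.g. any off-the-shelf entailment model; `tau` is an arbitrary
    decision threshold, not a value taken from the papers above.
    """
    return nli(evidence, claim) < tau
```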
415 |
416 | ### ✨Uncertainty Measure
417 |
418 | - **"BARTScore: Evaluating Generated Text as Text Generation."**, 2021.6
419 | - Weizhe Yuan, Graham Neubig, Pengfei Liu
420 | - [[Paper]](https://proceedings.neurips.cc/paper/2021/hash/e4d2b6e6fdeca3e60e0f1a62fee3d9dd-Abstract.html)
421 |
422 | - **"Contrastive Learning Reduces Hallucination in Conversations."**, 2022.12
423 | - Weiwei Sun, Zhengliang Shi, Shen Gao
424 | - [[Paper]](https://ojs.aaai.org/index.php/AAAI/article/view/26596)
425 |
426 | - **"Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models."**, 2023.5
427 | - Alfonso Amayuelas, Liangming Pan, Wenhu Chen
428 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.13712)
429 |
430 | - **"Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models."**, 2023.5
431 | - Peter Hase, Mona Diab, Asli Celikyilmaz
432 | - [[Paper]](https://aclanthology.org/2023.eacl-main.199/)
433 |
434 | - **"Measuring and Modifying Factual Knowledge in Large Language Models."**, 2023.6
435 | - Pouya Pezeshkpour
436 | - [[Paper]](https://arxiv.org/abs/2306.06264)
437 |
438 | - **"LLM Calibration and Automatic Hallucination Detection via Pareto Optimal Self-supervision."**, 2023.6
439 | - Theodore Zhao, Mu Wei, J. Samuel Preston
440 | - [[Paper]](https://arxiv.org/abs/2306.16564)
441 |
442 | - **"A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation."**, 2023.7
443 | - Neeraj Varshney, Wenlin Yao, Hongming Zhang
444 | - [[Paper]](https://arxiv.org/abs/2307.03987)
445 |
446 |
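A recurring idea in this line of work is that tokens generated with low model confidence are disproportionately likely to be hallucinated. The toy sketch below illustrates that intuition (it is not any single paper's method): it flags low-probability tokens from per-token log-probs, with an arbitrary threshold.

```python
def flag_low_confidence(tokens, logprobs, threshold=-2.5):
    """Return (index, token) pairs whose log-probability is suspiciously low.

    Flagged spans are candidate hallucinations that can be routed to a
    validation step such as retrieval-based fact checking. The threshold
    here is an arbitrary illustrative value.
    """
    return [(i, tok) for i, (tok, lp) in enumerate(zip(tokens, logprobs))
            if lp < threshold]

# Toy usage with made-up log-probs of the kind most LLM APIs return.
tokens = ["The", "Eiffel", "Tower", "was", "built", "in", "1989"]
logprobs = [-0.1, -0.3, -0.2, -0.4, -0.5, -0.2, -4.1]
print(flag_low_confidence(tokens, logprobs))  # -> [(6, '1989')]
```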
447 |
448 | ### ✨Self-Evaluation
449 |
450 | - **"Language Models (Mostly) Know What They Know."**, 2022.7
451 | - Saurav Kadavath, Tom Conerly, Amanda Askell
452 | - [[Paper]](https://doi.org/10.48550/arXiv.2207.05221)
453 |
454 | - **"SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models."**, 2023.3
455 | - Potsawee Manakul, Adian Liusie, Mark J. F. Gales
456 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.08896)
457 |
458 | - **"Do Language Models Know When They’re Hallucinating References?"**, 2023.5
459 | - Ayush Agrawal, Lester Mackey, Adam Tauman Kalai
460 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.18248)
461 |
462 | - **"Evaluating Object Hallucination in Large Vision-Language Models."**, 2023.5
463 | - Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang
464 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.10355)
465 |
466 | - **"Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models."**, 2023.5
467 | - Miaoran Li, Baolin Peng, Zhu Zhang
468 | - [[Paper]](https://arxiv.org/abs/2305.14623)
469 |
470 | - **"LM vs LM: Detecting Factual Errors via Cross Examination."**, 2023.5
471 | - Roi Cohen, May Hamri, Mor Geva
472 | - [[Paper]](https://arxiv.org/abs/2305.13281)
473 |
474 | - **"Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation."**, 2023.5
475 | - Niels Mündler, Jingxuan He, Slobodan Jenko
476 | - [[Paper]](https://arxiv.org/abs/2305.15852)
477 |
478 | - **"A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection."**, 2023.10
479 | - Shiping Yang, Renliang Sun, Xiaojun Wan
480 | - [[Paper]](https://arxiv.org/abs/2310.06498)
481 |
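Sampling-based self-checks (SelfCheckGPT being the best-known example) share one mechanism: resample the model several times and measure whether each sentence of the original output is supported by the samples. A sketch of that mechanism, with hypothetical `generate` and `support` callables standing in for the LLM and a support scorer such as an NLI model:

```python
from typing import Callable

def consistency_score(prompt: str, sentence: str,
                      generate: Callable[[str], str],
                      support: Callable[[str, str], float],
                      n_samples: int = 5) -> float:
    """Average support for `sentence` across stochastic resamples.

    A low score suggests the sentence is not something the model
    consistently 'believes', i.e. a likely hallucination.
    """
    samples = [generate(prompt) for _ in range(n_samples)]
    return sum(support(sample, sentence) for sample in samples) / n_samples
```
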
482 | ### ✨Evidence Retrieval
483 |
484 | - **"FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation."**, 2023.5
485 | - Sewon Min, Kalpesh Krishna, Xinxi Lyu
486 | - [[Paper]](https://arxiv.org/abs/2305.14251)
487 |
488 | - **"Complex Claim Verification with Evidence Retrieved in the Wild."**, 2023.5
489 | - Jifan Chen, Grace Kim, Aniruddh Sriram
490 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.11859)
491 |
492 | - **"Retrieving Supporting Evidence for LLMs Generated Answers."**, 2023.6
493 | - Siqing Huo, Negar Arabzadeh, Charles L. A. Clarke
494 | - [[Paper]](https://doi.org/10.48550/arXiv.2306.13781)
495 |
496 | - **"FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios."**, 2023.7
497 | - I-Chun Chern, Steffi Chern, Shiqi Chen
498 | - [[Paper]](https://arxiv.org/abs/2307.13528)
499 |
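Retrieval-based verifiers such as FActScore typically decompose an output into atomic claims and check each claim against retrieved evidence. A schematic version, where `retrieve` and `supported` are hypothetical stand-ins for a search backend and an entailment judge:

```python
from typing import Callable, List

def fact_precision(claims: List[str],
                   retrieve: Callable[[str], str],
                   supported: Callable[[str, str], bool]) -> float:
    """Fraction of atomic claims supported by their retrieved evidence."""
    if not claims:
        return 1.0
    return sum(supported(retrieve(c), c) for c in claims) / len(claims)
```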
500 |
501 | ## 🪄 Hallucination Correction
502 |
503 | ### ✨Parameter Enhancement
504 |
505 | - **"Factuality Enhanced Language Models for Open-Ended Text Generation."**, 2022.6
506 | - Nayeon Lee, Wei Ping, Peng Xu
507 | - [[Paper]](http://papers.nips.cc/paper_files/paper/2022/hash/df438caa36714f69277daa92d608dd63-Abstract-Conference.html)
508 |
509 | - **"Contrastive Learning Reduces Hallucination in Conversations."**, 2022.12
510 | - Weiwei Sun, Zhengliang Shi, Shen Gao
511 | - [[Paper]](https://ojs.aaai.org/index.php/AAAI/article/view/26596)
512 |
513 | - **"Editing Models with Task Arithmetic."**, 2023.2
514 | - Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman
515 | - [[Paper]](https://openreview.net/forum?id=6t0Kwf8-jrj)
516 |
517 | - **"Elastic Weight Removal for Faithful and Abstractive Dialogue Generation."**, 2023.3
518 | - Nico Daheim, Nouha Dziri, Mrinmaya Sachan
519 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.17574)
520 |
521 | - **"HISTALIGN: Improving Context Dependency in Language Generation by Aligning with History."**, 2023.5
522 | - David Wan, Shiyue Zhang, Mohit Bansal
523 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.04782)
524 |
525 | - **"mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations."**, 2023.5
526 | - Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia
527 | - [[Paper]](https://arxiv.org/abs/2305.14224)
528 |
529 | - **"Trusting Your Evidence: Hallucinate Less with Context-aware Decoding."**, 2023.5
530 | - Weijia Shi, Xiaochuang Han, Mike Lewis
531 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.14739)
532 |
533 | - **"PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions."**, 2023.5
534 | - Anthony Chen, Panupong Pasupat, Sameer Singh
535 | - [[Paper]](https://arxiv.org/abs/2305.14908)
536 |
537 | - **"Augmented Large Language Models with Parametric Knowledge Guiding."**, 2023.5
538 | - Ziyang Luo, Can Xu, Pu Zhao, Xiubo Geng
539 | - [[Paper]](https://arxiv.org/abs/2305.04757)
540 |
541 | - **"Inference-Time Intervention: Eliciting Truthful Answers from a Language Model."**, 2023.6
542 | - Kenneth Li, Oam Patel, Fernanda Viégas
543 | - [[Paper]](https://arxiv.org/abs/2306.03341)
544 |
545 | - **"TRAC: Trustworthy Retrieval Augmented Chatbot."**, 2023.7
546 | - Shuo Li, Sangdon Park, Insup Lee
547 | - [[Paper]](https://doi.org/10.48550/arXiv.2307.04642)
548 |
549 | - **"EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models"**, 2023.8
550 | - Peng Wang, Ningyu Zhang, Xin Xie
551 | - [[Paper]](https://arxiv.org/abs/2308.07269)
552 |
553 | - **"DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"**, 2023.9
554 | - Yung-Sung Chuang, Yujia Xie, Hongyin Luo
555 | - [[Paper]](https://arxiv.org/abs/2309.03883)
556 |
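Several decoding-side entries above (e.g. context-aware decoding and DoLa) work by contrasting two next-token distributions so that the grounded signal is amplified. A rough numpy sketch of the contrastive idea, not a faithful reimplementation of either paper; `alpha` is a tunable assumption:

```python
import numpy as np

def contrastive_logits(logits_with_ctx, logits_no_ctx, alpha=0.5):
    """Amplify the shift that conditioning (e.g. retrieved context) induces.

    Down-weights what the model would have said from parametric memory
    alone; sample or argmax from the returned adjusted logits.
    """
    return ((1 + alpha) * np.asarray(logits_with_ctx)
            - alpha * np.asarray(logits_no_ctx))
```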
557 |
558 |
559 | ### ✨Post-hoc Attribution and Editing
560 |
561 | - **"Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding."**, 2021.4
562 | - Nouha Dziri, Andrea Madotto, Osmar Zaïane
563 | - [[Paper]](https://doi.org/10.18653/v1/2021.emnlp-main.168)
564 |
565 | - **"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models."**, 2022.1
566 | - Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma
567 | - [[Paper]](http://papers.nips.cc/paper_files/paper/2022/hash/9d5609613524ecf4f15af0f7b31abca4-Abstract-Conference.html)
568 |
569 | - **"Teaching language models to support answers with verified quotes."**, 2022.3
570 | - Jacob Menick, Maja Trebacz, Vladimir Mikulik
571 | - [[Paper]](https://doi.org/10.48550/arXiv.2203.11147)
572 |
573 | - **"ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data."**, 2022.5
574 | - Xiaochuang Han, Yulia Tsvetkov
575 | - [[Paper]](https://doi.org/10.48550/arXiv.2205.12600)
576 |
577 | - **"Large Language Models are Zero-Shot Reasoners."**, 2022.5
578 | - Takeshi Kojima, Shixiang Shane Gu, Machel Reid
579 | - [[Paper]](http://papers.nips.cc/paper_files/paper/2022/hash/8bb0d291acd4acf06ef112099c16f326-Abstract-Conference.html)
580 |
581 | - **"Rethinking with Retrieval: Faithful Large Language Model Inference."**, 2023.1
582 | - Hangfeng He, Hongming Zhang, Dan Roth
583 | - [[Paper]](https://doi.org/10.48550/arXiv.2301.00303)
584 |
585 | - **"TRAK: Attributing Model Behavior at Scale."**, 2023.3
586 | - Sung Min Park, Kristian Georgiev, Andrew Ilyas
587 | - [[Paper]](https://proceedings.mlr.press/v202/park23c.html)
588 |
589 | - **"Data Portraits: Recording Foundation Model Training Data."**, 2023.3
590 | - Marc Marone, Benjamin Van Durme
591 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.03919)
592 |
593 | - **"Self-Refine: Iterative Refinement with Self-Feedback."**, 2023.3
594 | - Aman Madaan, Niket Tandon, Prakhar Gupta
595 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.17651)
596 |
597 | - **"Reflexion: an autonomous agent with dynamic memory and self-reflection."**, 2023.3
598 | - Noah Shinn, Beck Labash, Ashwin Gopinath
599 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.11366)
600 |
601 | - **"According to ..." Prompting Language Models Improves Quoting from Pre-Training Data."**, 2023.5
602 | - Orion Weller, Marc Marone, Nathaniel Weir
603 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.13252)
604 |
605 | - **"Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework."**, 2023.5
606 | - Ruochen Zhao, Xingxuan Li, Shafiq Joty
607 | - [[Paper]](https://arxiv.org/abs/2305.03268)
608 |
609 | - **"Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning."**, 2023.9
610 | - Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu
611 | - [[Paper]](https://arxiv.org/abs/2309.11495)
612 |
613 | - **"Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations."**, 2023.10
614 | - Deren Lei, Yaxi Li, Mengya Hu
615 | - [[Paper]](https://arxiv.org/abs/2310.03951)
616 |
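The verify-then-edit family above follows one loop: draft an answer, generate verification questions, answer them independently, then rewrite the draft. A condensed sketch with a hypothetical prompt-to-text callable `llm`; the prompts are illustrative, not taken from any single paper:

```python
def verify_and_edit(llm, question: str, n_checks: int = 3) -> str:
    """Draft, self-verify, then revise (verify-then-edit in miniature)."""
    draft = llm(f"Answer the question:\n{question}")
    checks = llm(f"List {n_checks} verification questions that would expose "
                 f"factual errors in:\n{draft}")
    verified = llm(f"Answer each verification question independently:\n{checks}")
    return llm("Rewrite the draft so it is consistent with the verified "
               f"answers.\nDraft:\n{draft}\nVerified answers:\n{verified}")
```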
617 |
618 | ### ✨Utilizing Programming Languages
619 |
620 | - **"PAL: Program-aided Language Models."**, 2022.11
621 | - Luyu Gao, Aman Madaan, Shuyan Zhou
622 | - [[Paper]](https://proceedings.mlr.press/v202/gao23f.html)
623 |
624 | - **"Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks."**, 2022.11
625 | - Wenhu Chen, Xueguang Ma, Xinyi Wang
626 | - [[Paper]](https://doi.org/10.48550/arXiv.2211.12588)
627 |
628 | - **"Teaching Algorithmic Reasoning via In-context Learning."**, 2022.11
629 | - Hattie Zhou, Azade Nova, Hugo Larochelle
630 | - [[Paper]](https://doi.org/10.48550/arXiv.2211.09066)
631 |
632 | - **"Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification."**, 2023.8
633 | - Aojun Zhou, Ke Wang, Zimu Lu
634 | - [[Paper]](https://arxiv.org/abs/2308.07921)
635 |
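Program-aided approaches (PAL, Program of Thoughts) have the model write code and delegate the actual computation to an interpreter, which removes arithmetic hallucinations by construction. A minimal sketch, again with a hypothetical `llm` callable; executing model-written code requires sandboxing in practice:

```python
def solve_with_program(llm, question: str) -> str:
    """Have the LLM write Python and let the interpreter do the math."""
    code = llm(f"Write Python that computes the answer to:\n{question}\n"
               "Store the result in a variable named `answer`.")
    scope = {}
    exec(code, scope)  # run the generated program (sandbox this in practice)
    return str(scope.get("answer"))
```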
636 |
637 |
638 | ### ✨Leveraging External Knowledge
639 |
640 |
641 | - **"Improving Language Models by Retrieving from Trillions of Tokens."**, 2021.12
642 | - Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann
643 | - [[Paper]](https://proceedings.mlr.press/v162/borgeaud22a.html)
644 |
645 | - **"Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions.**, 2022.12
646 | - Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot
647 | - [[Paper]](https://doi.org/10.18653/v1/2023.acl-long.557)
648 |
649 | - **"When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories."**, 2022.12
650 | - Alex Mallen, Akari Asai, Victor Zhong
651 | - [[Paper]](https://doi.org/10.18653/v1/2023.acl-long.546)
652 |
653 | - **"Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback."**, 2023.2
654 | - Baolin Peng, Michel Galley, Pengcheng He
655 | - [[Paper]](https://arxiv.org/abs/2302.12813)
656 |
657 | - **"In-Context Retrieval-Augmented Language Models."**, 2023.2
658 | - Ori Ram, Yoav Levine, Itay Dalmedigos
659 | - [[Paper]](https://doi.org/10.48550/arXiv.2302.00083)
660 |
661 | - **"cTBL: Augmenting Large Language Models for Conversational Tables."**, 2023.3
662 | - Anirudh S. Sundar, Larry Heck
663 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.12024)
664 |
665 | - **"GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information."**, 2023.4
666 | - Qiao Jin, Yifan Yang, Qingyu Chen
667 | - [[Paper]](https://arxiv.org/abs/2304.09667)
668 |
669 | - **"Active Retrieval Augmented Generation."**, 2023.5
670 | - Zhengbao Jiang, Frank F. Xu, Luyu Gao
671 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.06983)
672 |
673 | - **"Chain of Knowledge: A Framework for Grounding Large Language Models with Structured Knowledge Bases."**, 2023.5
674 | - Xingxuan Li, Ruochen Zhao, Yew Ken Chia
675 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.13269)
676 |
677 | - **"Gorilla: Large Language Model Connected with Massive APIs."**, 2023.5
678 | - Shishir G. Patil, Tianjun Zhang, Xin Wang
679 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.15334)
680 |
681 | - **"RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit."**, 2023.6
682 | - Jiongnan Liu, Jiajie Jin, Zihan Wang
683 | - [[Paper]](https://doi.org/10.48550/arXiv.2306.05212)
684 |
685 | - **"User-Controlled Knowledge Fusion in Large Language Models: Balancing Creativity and Hallucination."**, 2023.7
686 | - Chen Zhang
687 | - [[Paper]](https://arxiv.org/abs/2307.16139)
688 |
689 | - **"KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases."**, 2023.8
690 | - Xintao Wang, Qianwen Yang, Yongting Qiu
691 | - [[Paper]](https://doi.org/10.48550/arXiv.2308.11761)
692 |
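Most retrieval-augmented entries above reduce to the same skeleton: fetch evidence, put it in the prompt, and instruct the model to stay within it (abstaining is itself a hallucination guard). A minimal sketch with hypothetical `search` and `llm` callables:

```python
from typing import Callable, List

def retrieval_augmented_answer(question: str,
                               search: Callable[[str], List[str]],
                               llm: Callable[[str], str],
                               k: int = 3) -> str:
    """Ground the answer in retrieved passages instead of parametric memory."""
    context = "\n\n".join(search(question)[:k])
    prompt = ("Answer using ONLY the context below; reply 'I don't know' "
              f"if it is insufficient.\n\nContext:\n{context}\n\n"
              f"Question: {question}")
    return llm(prompt)
```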
693 |
694 | ### ✨Assessment Feedback
695 |
696 | - **"Learning to summarize with human feedback."**, 2020.12
697 | - Nisan Stiennon, Long Ouyang, Jeffrey Wu
698 | - [[Paper]](https://proceedings.neurips.cc/paper/2020/hash/1f89885d556929e98d3ef9b86448f951-Abstract.html)
699 |
700 | - **"BRIO: Bringing Order to Abstractive Summarization."**, 2022.3
701 | - Yixin Liu, Pengfei Liu, Dragomir R. Radev
702 | - [[Paper]](https://doi.org/10.18653/v1/2022.acl-long.207)
703 |
704 | - **"Language Models (Mostly) Know What They Know."**, 2022.7
705 | - Saurav Kadavath, Tom Conerly, Amanda Askell
706 | - [[Paper]](https://doi.org/10.48550/arXiv.2207.05221)
707 |
708 | - **"Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback."**, 2023.2
709 | - Baolin Peng, Michel Galley, Pengcheng He
710 | - [[Paper]](https://doi.org/10.48550/arXiv.2302.12813)
711 |
712 | - **"Chain of Hindsight Aligns Language Models with Feedback."**, 2023.2
713 | - Hao Liu, Carmelo Sferrazza, Pieter Abbeel
714 | - [[Paper]](https://doi.org/10.48550/arXiv.2302.02676)
715 |
716 | - **"Zero-shot Faithful Factual Error Correction."**, 2023.5
717 | - Kung-Hsiang Huang, Hou Pong Chan, Heng Ji
718 | - [[Paper]](https://doi.org/10.18653/v1/2023.acl-long.311)
719 |
720 | - **"CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing."**, 2023.5
721 | - Zhibin Gou, Zhihong Shao, Yeyun Gong
722 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.11738)
723 |
724 | - **"Album Storytelling with Iterative Story-aware Captioning and Large Language Models."**, 2023.5
725 | - Munan Ning, Yujia Xie, Dongdong Chen
726 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.12943)
727 |
728 | - **"How Language Model Hallucinations Can Snowball."**, 2023.5
729 | - Muru Zhang, Ofir Press, William Merrill
730 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.13534)
731 |
732 | - **"Mitigating Language Model Hallucination with Interactive Question-Knowledge Alignment."**, 2023.5
733 | - Shuo Zhang, Liangming Pan, Junzhou Zhao
734 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.13669)
735 |
736 | - **"Improving Language Models via Plug-and-Play Retrieval Feedback."**, 2023.5
737 | - Wenhao Yu, Zhihan Zhang, Zhenwen Liang
738 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.14002)
739 |
740 | - **"PaD: Program-aided Distillation Specializes Large Models in Reasoning."**, 2023.5
741 | - Xuekai Zhu, Biqing Qi, Kaiyan Zhang
742 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.13888)
743 |
744 | - **"Enabling Large Language Models to Generate Text with Citations."**, 2023.5
745 | - Tianyu Gao, Howard Yen, Jiatong Yu
746 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.14627)
747 |
748 | - **"Do Language Models Know When They’re Hallucinating References?"**, 2023.5
749 | - Ayush Agrawal, Lester Mackey, Adam Tauman Kalai
750 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.18248)
751 |
752 | - **"Improving Factuality of Abstractive Summarization via Contrastive Reward Learning."**, 2023.7
753 | - I-Chun Chern, Zhiruo Wang, Sanjan Das
754 | - [[Paper]](https://doi.org/10.48550/arXiv.2307.04507)
755 |
756 | - **"Towards Mitigating Hallucination in Large Language Models via Self-Reflection."**, 2023.10
757 | - Ziwei Ji, Tiezheng Yu, Yan Xu
758 | - [[Paper]](https://arxiv.org/abs/2310.06271)
759 |
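Feedback-driven correction (Self-Refine, CRITIC, and relatives) iterates a generate-critique-revise loop, with the critique supplied by the model itself, a tool, or a reward signal. A compact sketch of the loop with a hypothetical `llm` callable and illustrative prompts:

```python
def self_refine(llm, task: str, rounds: int = 2) -> str:
    """Iteratively critique and revise an answer."""
    answer = llm(task)
    for _ in range(rounds):
        feedback = llm(f"Critique this answer for factual errors:\n{answer}")
        answer = llm(f"Task: {task}\nPrevious answer: {answer}\n"
                     f"Feedback: {feedback}\nWrite an improved answer.")
    return answer
```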
760 |
761 | ### ✨Mindset Society
762 |
763 | - **"Hallucinations in Large Multilingual Translation Models."**, 2023.3
764 | - Nuno Miguel Guerreiro, Duarte M. Alves, Jonas Waldendorf
765 | - [[Paper]](https://doi.org/10.48550/arXiv.2303.16104)
766 |
767 | - **"Improving Factuality and Reasoning in Language Models through Multiagent Debate."**, 2023.5
768 | - Yilun Du, Shuang Li, Antonio Torralba
769 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.14325)
770 |
771 | - **"Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate."**, 2023.5
772 | - Tian Liang, Zhiwei He, Wenxiang Jiao
773 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.19118)
774 |
775 | - **"Examining the Inter-Consistency of Large Language Models: An In-depth Analysis via Debate."**, 2023.5
776 | - Kai Xiong, Xiao Ding, Yixin Cao
777 | - [[Paper]](https://doi.org/10.48550/arXiv.2305.11595)
778 |
779 | - **"LM vs LM: Detecting Factual Errors via Cross Examination."**, 2023.5
780 | - Roi Cohen, May Hamri, Mor Geva
781 | - [[Paper]](https://arxiv.org/abs/2305.13281)
782 |
783 | - **"PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations."**, 2023.7
784 | - Ruosen Li, Teerth Patel, Xinya Du
785 | - [[Paper]](https://doi.org/10.48550/arXiv.2307.02762)
786 |
787 | - **"Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration."**, 2023.7
788 | - Zhenhailong Wang, Shaoguang Mao, Wenshan Wu
789 | - [[Paper]](https://doi.org/10.48550/arXiv.2307.05300)
790 |
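Multi-agent debate methods let several model instances exchange answers over a few rounds, each agent revising its answer after reading its peers'. A schematic loop, where each agent is a hypothetical prompt-to-text callable (e.g. an independent chat session):

```python
from typing import Callable, List

def debate(agents: List[Callable[[str], str]], question: str,
           rounds: int = 2) -> List[str]:
    """Round-based debate: every agent revises after seeing the others."""
    answers = [agent(question) for agent in agents]
    for _ in range(rounds):
        for i, agent in enumerate(agents):
            peers = "\n".join(a for j, a in enumerate(answers) if j != i)
            answers[i] = agent(f"Question: {question}\n"
                               f"Other agents answered:\n{peers}\n"
                               "Update your answer, correcting any errors.")
    return answers
```
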
791 |
792 |
793 |
794 | ## 🌟 TIPS
795 | If you find this repository useful for your research or work, we would greatly appreciate it if you starred this repository.
796 |
797 |
--------------------------------------------------------------------------------