# Awesome-LLMxVisualization-Paper-List


- Content
  - [0. Why am I curating this repository?](#0-why-am-i-curating-this-repository)
  - [1. Chart Captioning](#1-chart-captioning)
  - [🔥 2. Chart Question Answering](#2-chart-question-answering)
    - [2.0 Data, Model](#20-data-model)
    - [2.1 Benchmark](#21-benchmark)
    - [2.2 Application Scenarios (e.g., Accessibility)](#22-application-scenarios)
  - [3. Chart Reverse Engineering & Chart-to-Code Translation](#3-chart-reverse-engineering)
  - [4. Natural Language to Visualization](#4-natural-language-to-visualization)
    - [4.0 Method, Framework, and Benchmark](#40-method-framework-and-benchmark)
    - [4.1 Application Scenarios (e.g., Storytelling)](#41-application-scenarios)
  - [5. Visualization Design](#5-visualization-design)
    - [5.0 Color](#50-color)
    - [5.1 Design Preferences](#51-design-preferences)
    - [5.2 User-Adaptive Visualization](#52-user-adaptive-visualization)
  - [6. Visualization Agents & Automatic Judge](#6-visualization-agents--automatic-judge)
  - [7. HCI-Style Empirical Study on LLM's Chart Understanding & Chart Generation](#7-empirical-study-on-llms-chart-understanding--chart-generation)
  - [🔥 8. Visualization for Interpreting, Evaluating, and Improving LLM](#8-visualization-for-interpreting-evaluating-and-improving-llm)
  - [9. Generic Multimodal Large Language Model](#9-generic-multimodal-large-language-model)
  - [10. Related Survey Papers](#10-related-survey-papers)
  - [11. Others (Chart Retrieval, Chart Component Detection)](#11-others)

## 0. Why am I curating this repository?
- I've found that the existing **Vis**x**LLM** paper-list repositories are updated infrequently; most were created for a survey paper and then abandoned. I will gradually enrich this repository and keep it updated.
- Feel free to open an issue or a pull request to add a paper you appreciate; a sample entry format is shown right below this list.
- **Star and watch this repo for future updates 😁.**
- **I strongly recommend the tutorial [LLM4Vis: Large Language Models for Information Visualization](https://nlp4vis.github.io/) delivered by Prof. Hoque.**
- Papers within each category are listed in reverse chronological order.
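When adding a paper via pull request, you can copy the entry format used throughout this list. Everything below is a placeholder, not a real paper:

```markdown
<!-- Placeholder entry: replace the title, authors, venue, and links. -->
**Paper Title**<br>
First Author, Second Author, Third Author<br>
[Venue Year](https://arxiv.org/abs/0000.00000) • [Code](https://github.com/placeholder/repo)
```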

## 1. Chart Captioning

**Natural Language Dataset Generation Framework for Visualizations Powered by Large Language Models**<br>
Hyung-Kwon Ko, Hyeon Jeon, Gwanmo Park, Dae Hyun Kim, Nam Wook Kim, Juho Kim, Jinwook Seo<br>
[CHI 2024](https://arxiv.org/abs/2309.10245) • [Code](https://github.com/hyungkwonko/chart-llm)

**Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning**<br>
Kung-Hsiang Huang, Mingyang Zhou, Hou Pong Chan, Yi Fung, Zhenhailong Wang, Lingyu Zhang, Shih-Fu Chang, Heng Ji<br>
[Findings of ACL 2024](https://aclanthology.org/2024.findings-acl.41.pdf) • [Homepage](https://khuangaf.github.io/CHOCOLATE)

**VisText: A Benchmark for Semantically Rich Chart Captioning**<br>
Benny J. Tang, Angie Boggust, Arvind Satyanarayan<br>
[ACL 2023 Outstanding Paper](https://aclanthology.org/2023.acl-long.401.pdf) • [Code](https://github.com/mitvis/vistext)

--- Pre-LLM ---

**Chart-to-Text: A Large-Scale Benchmark for Chart Summarization**<br>
Shankar Kantharaj, Rixie Tiffany Leong, Xiang Lin, Ahmed Masry, Megh Thakkar, Enamul Hoque, Shafiq Joty<br>
[ACL 2022](https://aclanthology.org/2022.acl-long.277.pdf) • [Code](https://github.com/vis-nlp/Chart-to-text)

## 2. Chart Question Answering

### 2.0 Data, Model
**Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning**<br>
Xingchen Zeng, Haichuan Lin, Yilin Ye, Wei Zeng<br>
[VIS 2024](https://arxiv.org/abs/2407.20174) • [Code](https://github.com/zengxingchen/ChartQA-MLLM)

**AskChart: Universal Chart Understanding through Textual Enhancement**<br>
Xudong Yang, Yifan Wu, Yizhang Zhu, Nan Tang, Yuyu Luo<br>
[arXiv, 26 Dec 2024](https://arxiv.org/abs/2412.19146)

**Distill Visual Chart Reasoning Ability from LLMs to MLLMs**<br>
Wei He, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang<br>
[arXiv, 24 Oct 2024](https://arxiv.org/abs/2410.18798) • [Code](https://github.com/hewei2001/ReachQA)

**SynChart: Synthesizing Charts from Language Models**<br>
Mengchen Liu, Qixiu Li, Dongdong Chen, Dong Chen, Jianmin Bao, Yunsheng Li<br>
[arXiv, 25 Sep 2024](https://arxiv.org/abs/2409.16517)

**ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding**<br>
Zhengzhuo Xu*, Bowen Qu*, Yiyan Qi*, Sinan Du, Chengjin Xu, Chun Yuan, Jian Guo<br>
[ICLR 2025](https://arxiv.org/abs/2409.03277) • [Code](https://github.com/IDEA-FinAI/ChartMoE)

**On Pre-training of Multimodal Language Models Customized for Chart Understanding**<br>
Wan-Cyuan Fan, Yen-Chun Chen, Mengchen Liu, Lu Yuan, Leonid Sigal<br>
[arXiv, 19 Jul 2024](https://arxiv.org/abs/2407.14506)

**ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild**<br>
Ahmed Masry, Megh Thakkar, Aayush Bajaj, Aaryaman Kartha, Enamul Hoque, Shafiq Joty<br>
[COLING 2025 Industry Track](https://aclanthology.org/2025.coling-industry.54.pdf) • [Code](https://github.com/vis-nlp/ChartGemma)

**TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning**<br>
Liang Zhang*, Anwen Hu*, Haiyang Xu, Ming Yan, Yichen Xu, Qin Jin†, Ji Zhang, Fei Huang<br>
[EMNLP 2024](https://arxiv.org/pdf/2404.16635) • [Code](https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/TinyChart)

**OneChart: Purify the Chart Structural Extraction via One Auxiliary Token**<br>
Jinyue Chen*, Lingyu Kong*, Haoran Wei, Chenglong Liu, Zheng Ge, Liang Zhao, Jianjian Sun, Chunrui Han, Xiangyu Zhang<br>
[ACM MM 2024 (Oral)](https://arxiv.org/pdf/2404.09987) • [Homepage](https://onechartt.github.io/)

**Representing Charts as Text for Language Models: An In-Depth Study of Question Answering for Bar Charts**<br>
Victor Soares Bursztyn, Jane Hoffswell, Eunyee Koh, Shunan Guo<br>
[VIS 2024 Short Paper](https://ieeevis.b-cdn.net/vis_2024/pdfs/v-short-1276.pdf)

**ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning**<br>
Ahmed Masry, Mehrad Shahmohammadi, Md Rizwan Parvez, Enamul Hoque, Shafiq Joty<br>
[Findings of ACL 2024](https://arxiv.org/abs/2403.09028) • [Code](https://github.com/vis-nlp/ChartInstruct)

**ChartAssistant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning**<br>
Fanqing Meng, Wenqi Shao, Quanfeng Lu, Peng Gao, Kaipeng Zhang, Yu Qiao, Ping Luo<br>
[Findings of ACL 2024](https://arxiv.org/abs/2401.02384) • [Code](https://github.com/OpenGVLab/ChartAst)

**MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning**<br>
Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu<br>
[NAACL 2024](https://aclanthology.org/2024.naacl-long.70.pdf) • [Code](https://github.com/FuxiaoLiu/MMC)

**Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs**<br>
Victor Carbune, Hassan Mansoor, Fangyu Liu, Rahul Aralikatte, Gilles Baechler, Jindong Chen, Abhanshu Sharma<br>
[Findings of NAACL 2024](https://arxiv.org/pdf/2403.12596)

**Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA**<br>
Zhuowan Li, Bhavan Jasani, Peng Tang, Shabnam Ghadar<br>
[CVPR 2024](https://openaccess.thecvf.com/content/CVPR2024/papers/Li_Synthesize_Step-by-Step_Tools_Templates_and_LLMs_as_Data_Generators_for_CVPR_2024_paper.pdf)

**ChartLlama: A Multimodal LLM for Chart Understanding and Generation**<br>
Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang<br>
[arXiv, 27 Nov 2023](https://arxiv.org/pdf/2311.16483) • [Homepage](https://tingxueronghua.github.io/ChartLlama/)

--- Pre-LLM ---

**UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning**<br>
Ahmed Masry*, Parsa Kavehzadeh*, Xuan Long Do, Enamul Hoque, Shafiq Joty<br>
[EMNLP 2023](https://aclanthology.org/2023.emnlp-main.906.pdf) • [Code](https://github.com/vis-nlp/UniChart)

**DePlot: One-shot Visual Language Reasoning by Plot-to-table Translation**<br>
Fangyu Liu, Julian Martin Eisenschlos, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Wenhu Chen, Nigel Collier, Yasemin Altun<br>
[Findings of ACL 2023](https://aclanthology.org/2023.findings-acl.660.pdf) • [Code](https://github.com/google-research/google-research/tree/master/deplot)

**MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering**<br>
Fangyu Liu, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Yasemin Altun, Nigel Collier, Julian Martin Eisenschlos<br>
[ACL 2023](https://aclanthology.org/2023.acl-long.714.pdf) • [Code](https://github.com/google-research/google-research/tree/master/deplot)

**STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering**<br>
Hrituraj Singh, Sumit Shekhar<br>
[EMNLP 2020](https://aclanthology.org/2020.emnlp-main.264.pdf)

**LEAF-QA: Locate, Encode & Attend for Figure Question Answering**<br>
Ritwick Chaudhry, Sumit Shekhar, Utkarsh Gupta, Pranav Maneriker, Prann Bansal, Ajay Joshi<br>
[WACV 2020](https://openaccess.thecvf.com/content_WACV_2020/papers/Chaudhry_LEAF-QA_Locate_Encode__Attend_for_Figure_Question_Answering_WACV_2020_paper.pdf)

### 2.1 Benchmark

**ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering**<br>
Ahmed Masry, Mohammed Saidul Islam, Mahir Ahmed, Aayush Bajaj, Firoz Kabir, Aaryaman Kartha, Md Tahmid Rahman Laskar, Mizanur Rahman, Shadikur Rahman, Mehrad Shahmohammadi, Megh Thakkar, Md Rizwan Parvez, Enamul Hoque, Shafiq Joty<br>
[arXiv, 10 Apr 2025](https://arxiv.org/abs/2504.05506) • [Code](https://github.com/vis-nlp/ChartQAPro)

**CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs**<br>
Zirui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, Richard Zhu, Kaiqu Liang, Xindi Wu, Haotian Liu, Sadhika Malladi, Alexis Chevalier, Sanjeev Arora, Danqi Chen<br>
[NeurIPS 2024 Benchmark](http://arxiv.org/abs/2406.18521) • [Homepage](https://charxiv.github.io/)

**ChartInsights: Evaluating Multimodal Large Language Models for Low-Level Chart Question Answering**<br>
Yifan Wu, Lutao Yan, Leixian Shen, Yunhai Wang, Nan Tang, Yuyu Luo<br>
[Findings of EMNLP 2024](https://arxiv.org/pdf/2405.07001) • [Homepage](https://chartinsight.github.io/)

**ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning**<br>
Renqiu Xia, Bo Zhang, Hancheng Ye, Xiangchao Yan, Qi Liu, Hongbin Zhou, Zijun Chen, Min Dou, Botian Shi, Junchi Yan, Yu Qiao<br>
[arXiv, 19 Feb 2024](https://arxiv.org/pdf/2402.12185)

**ChartBench: A Benchmark for Complex Visual Reasoning in Charts**<br>
Zhengzhuo Xu*, Sinan Du*, Yiyan Qi, Chengjin Xu, Chun Yuan†, Jian Guo<br>
[arXiv, 26 Dec 2023](https://arxiv.org/pdf/2312.15915) • [Homepage](https://chartbench.github.io/)

--- Pre-LLM ---

**ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning**<br>
Ahmed Masry, Xuan Long Do, Jia Qing Tan, Shafiq Joty, Enamul Hoque<br>
[Findings of ACL 2022](https://aclanthology.org/2022.findings-acl.177.pdf) • [Code](https://github.com/vis-nlp/chartqa)

**InfographicVQA**<br>
Minesh Mathew, Viraj Bagal, Rubèn Pérez Tito, Dimosthenis Karatzas, Ernest Valveny, C. V. Jawahar<br>
[WACV 2022](https://openaccess.thecvf.com/content/WACV2022/papers/Mathew_InfographicVQA_WACV_2022_paper.pdf) • [Homepage](https://www.docvqa.org/datasets/infographicvqa)

**PlotQA: Reasoning over Scientific Plots**<br>
Nitesh Methani, Pritha Ganguly, Mitesh M. Khapra, Pratyush Kumar<br>
[WACV 2020](https://arxiv.org/pdf/1909.00997.pdf) • [Code](https://github.com/NiteshMethani/PlotQA)

**DVQA: Understanding Data Visualizations via Question Answering**<br>
Kushal Kafle, Brian Price, Scott Cohen, Christopher Kanan<br>
[CVPR 2018](https://openaccess.thecvf.com/content_cvpr_2018/papers/Kafle_DVQA_Understanding_Data_CVPR_2018_paper.pdf) • [Code](https://github.com/kushalkafle/DVQA_dataset)

**FigureQA: An Annotated Figure Dataset for Visual Reasoning**<br>
Samira Ebrahimi Kahou, Vincent Michalski, Adam Atkinson, Akos Kadar, Adam Trischler, Yoshua Bengio<br>
[ICLR 2018 Workshop Track](https://arxiv.org/pdf/1710.07300) • [Code](https://github.com/vmichals/FigureQA-baseline)

### 2.2 Application Scenarios
**VizAbility: Enhancing Chart Accessibility with LLM-based Conversational Interaction**<br>
Joshua Gorniak, Yoon Kim, Donglai Wei, Nam Wook Kim<br>
[UIST 2024](https://dl.acm.org/doi/abs/10.1145/3654777.3676414)

## 3. Chart Reverse Engineering

**METAL: A Multi-Agent Framework for Chart Generation with Test-Time Scaling**<br>
Bingxuan Li, Yiwei Wang, Jiuxiang Gu, Kai-Wei Chang, Nanyun Peng<br>
[ACL 2025](https://arxiv.org/pdf/2502.17651) • [Homepage](https://metal-chart-generation.github.io/)

**Enhancing Chart-to-Code Generation in Multimodal Large Language Models via Iterative Dual Preference Learning**<br>
Zhihan Zhang, Yixin Cao, Lizi Liao<br>
[arXiv, 3 Apr 2025](https://arxiv.org/pdf/2504.02906)

**ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation**<br>
Xuanle Zhao, Xianzhen Luo, Qi Shi, Chi Chen, Shuo Wang, Wanxiang Che, Zhiyuan Liu, Maosong Sun<br>
[arXiv, 11 Jan 2025](https://arxiv.org/abs/2501.06598) • [Code](https://github.com/thunlp/ChartCoder)

**ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation**<br>
Chufan Shi, Cheng Yang, Yaxin Liu, Bo Shui, Junjie Wang, Mohan Jing, Linran Xu, Xinyu Zhu, Siheng Li, Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang<br>
[ICLR 2025](https://arxiv.org/pdf/2406.09961) • [Homepage](https://chartmimic.github.io/)

**Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots**<br>
Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo<br>
[arXiv, 13 May 2024](https://arxiv.org/pdf/2405.07990)

**Is GPT-4V(ision) All You Need for Automating Academic Data Visualization? Exploring Vision-Language Models' Capability in Reproducing Academic Charts**<br>
Zhehao Zhang, Weicheng Ma, Soroush Vosoughi<br>
[Findings of EMNLP 2024](https://aclanthology.org/2024.findings-emnlp.485.pdf) • [Code](https://github.com/zzh-SJTU/AcademiaChart)

--- Pre-LLM ---

**InvVis: Large-Scale Data Embedding for Invertible Visualization**<br>
Huayuan Ye, Chenhui Li, Yang Li, Changbo Wang<br>
[VIS 2023](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10290968)

**CACHED: Context-Aware Chart Element Detection**<br>
Pengyu Yan*, Saleem Ahmed*, David Doermann<br>
[ICDAR 2023](https://arxiv.org/abs/2305.04151) • [Code](https://github.com/pengyu965/ChartDete)

**Deep Colormap Extraction From Visualizations**<br>
Lin-Ping Yuan, Wei Zeng, Siwei Fu, Zhiliang Zeng, Haotian Li, Chi-Wing Fu, Huamin Qu<br>
[TVCG 2022](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9395231)

**ChartOCR: Data Extraction From Charts Images via a Deep Hybrid Framework**<br>
Junyu Luo, Zekun Li, Jinpeng Wang, Chin-Yew Lin<br>
[WACV 2021](https://openaccess.thecvf.com/content/WACV2021/papers/Luo_ChartOCR_Data_Extraction_From_Charts_Images_via_a_Deep_Hybrid_WACV_2021_paper.pdf)

**Chartem: Reviving Chart Images with Data Embedding**<br>
Jiayun Fu, Bin Zhu, Weiwei Cui, Song Ge, Yun Wang, Haidong Zhang, He Huang, Yuanyuan Tang, Dongmei Zhang, Xiaojing Ma<br>
[TVCG 2021](https://ieeexplore.ieee.org/abstract/document/9293003)

**Reverse-Engineering Visualizations: Recovering Visual Encodings from Chart Images**<br>
Jorge Poco, Jeffrey Heer<br>
[EuroVis 2017](https://onlinelibrary.wiley.com/doi/pdf/10.1111/cgf.13193)

**ReVision: Automated Classification, Analysis and Redesign of Chart Images**<br>
Manolis Savva, Nicholas Kong, Arti Chhajta, Li Fei-Fei, Maneesh Agrawala, Jeffrey Heer<br>
[UIST 2011](https://dl.acm.org/doi/abs/10.1145/2047196.2047247)

## 4. Natural Language to Visualization
### 4.0 Method, Framework, and Benchmark
**VisPath: Automated Visualization Code Synthesis via Multi-Path Reasoning and Feedback-Driven Optimization**<br>
Wonduk Seo, Seungyong Lee, Daye Kang, Zonghao Yuan, Seunghyun Lee<br>
[arXiv, 16 Feb 2025](https://arxiv.org/pdf/2502.11140)

**nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow**<br>
Geliang Ouyang, Jingyao Chen, Zhihe Nie, Yi Gui, Yao Wan, Hongyu Zhang, Dongping Chen<br>
[arXiv, 7 Feb 2025](https://arxiv.org/pdf/2502.05036)

**ChartifyText: Automated Chart Generation from Data-Involved Texts via LLM**<br>
Songheng Zhang, Lei Wang, Toby Jia-Jun Li, Qiaomu Shen, Yixin Cao, Yong Wang<br>
[arXiv, 7 Nov 2024](https://arxiv.org/pdf/2410.14331)

**Charting the Future: Using Chart Question-Answering for Scalable Evaluation of LLM-Driven Data Visualizations**<br>
James Ford, Xingmeng Zhao, Dan Schumacher, Anthony Rios<br>
[arXiv, 27 Sep 2024](https://arxiv.org/pdf/2409.18764)

**VisEval: A Benchmark for Data Visualization in the Era of Large Language Models**<br>
Nan Chen, Yuge Zhang, Jiahang Xu, Kan Ren, Yuqing Yang<br>
[VIS 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10670425) • [Code](https://github.com/microsoft/VisEval)

**ChartGPT: Leveraging LLMs to Generate Charts from Abstract Natural Language**<br>
Yuan Tian, Weiwei Cui, Dazhen Deng, Xinjing Yi, Yurun Yang, Haidong Zhang, Yingcai Wu<br>
[TVCG 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10443572)

**LLM4Vis: Explainable Visualization Recommendation using ChatGPT**<br>
Lei Wang, Songheng Zhang, Yun Wang, Ee-Peng Lim, Yong Wang<br>
[EMNLP 2023 Industry Track](https://aclanthology.org/2023.emnlp-industry.64.pdf) • [Code](https://github.com/demoleiwang/LLM4Vis)

**LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models**<br>
Victor Dibia<br>
[ACL 2023 Demo](https://arxiv.org/pdf/2303.02927) • [Code](https://github.com/microsoft/lida)

--- Pre-LLM ---

**Natural Language to Visualization by Neural Machine Translation**<br>
Yuyu Luo, Nan Tang, Guoliang Li, Jiawei Tang, Chengliang Chai, Xuedi Qin<br>
[VIS 2021](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9617561) • [Code](https://github.com/HKUSTDial/ncNet)

**nvBench: A Large-Scale Synthesized Dataset for Cross-Domain Natural Language to Visualization Task**<br>
Yuyu Luo, Jiawei Tang, Guoliang Li<br>
[Workshop on NL VIZ 2021 at IEEE VIS 2021](https://arxiv.org/pdf/2112.12926) • [Code](https://github.com/TsinghuaDatabaseGroup/nvBench)

**Table2Charts: Recommending Charts by Learning Shared Table Representations**<br>
Mengyu Zhou, Qingtao Li, Xinyi He, Yuejiang Li, Yibo Liu, Wei Ji, Shi Han, Yining Chen, Daxin Jiang, Dongmei Zhang<br>
[KDD 2021](https://dl.acm.org/doi/pdf/10.1145/3447548.3467279) • [Code](https://github.com/microsoft/Table2Charts)

### 4.1 Application Scenarios
**Show and Tell: Exploring Large Language Model's Potential in Formative Educational Assessment of Data Stories**<br>
Naren Sivakumar, Lujie Karen Chen, Pravalika Papasani, Vigna Majmundar, Jinjuan Heidi Feng, Louise Yarnall<br>
[2024 IEEE VIS Workshop on Data Storytelling in an Era of Generative AI](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10766490)

**DataNarrative: Automated Data-Driven Storytelling with Visualizations and Texts**<br>
Mohammed Saidul Islam, Enamul Hoque, Shafiq Joty, Md Tahmid Rahman Laskar, Md Rizwan Parvez<br>
[EMNLP 2024](https://arxiv.org/pdf/2408.05346) • [Code](https://github.com/saidul-islam98/DataNarrative)

**Beyond Generating Code: Evaluating GPT on a Data Visualization Course**<br>
Zhutian Chen, Chenyang Zhang, Qianwen Wang, Jakob Troidl, Simon Warchol, Johanna Beyer<br>
[EduVis 2023](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10344146) • [Code](https://github.com/GPT4VIS/GPT-4-CS171)

**SNIL: Generating Sports News from Insights with Large Language Models**<br>
Liqi Cheng, Dazhen Deng, Xiao Xie, Rihong Qiu, Mingliang Xu, Yingcai Wu<br>
[TVCG 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10507016)

## 5. Visualization Design
### 5.0 Color
**Large Language Models Estimate Fine-Grained Human Color-Concept Associations**<br>
Kushin Mukherjee, Timothy T. Rogers, Karen B. Schloss<br>
[arXiv, Jun 2024](https://arxiv.org/abs/2406.17781)

**NL2Color: Refining Color Palettes for Charts with Natural Language**<br>
Chuhan Shi, Weiwei Cui, Chengzhong Liu, Chengbo Zheng, Haidong Zhang, Qiong Luo, Xiaojuan Ma<br>
[VIS 2023](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10292693)

### 5.1 Design Preferences
**DracoGPT: Extracting Visualization Design Preferences from Large Language Models**<br>
Huichen Will Wang, Mitchell Gordon, Leilani Battle, Jeffrey Heer<br>
[VIS 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10681108)

**Bavisitter: Integrating Design Guidelines into Large Language Models for Visualization Authoring**<br>
Jiwon Choi, Jaeung Lee, Jaemin Jo<br>
[VIS 2024 Short Paper](https://idclab.skku.edu/assets/Bavisitter.pdf)

### 5.2 User-Adaptive Visualization
**User-Adaptive Visualizations: An Exploration with GPT-4**<br>
F. Yanez, C. Nobre<br>
[MLVis 2024](https://diglib.eg.org/server/api/core/bitstreams/23abe5fb-4de1-432f-97ab-ad757f5418e3/content)

## 6. Visualization Agents & Automatic Judge
**The Visualization JUDGE: Can Multimodal Foundation Models Guide Visualization Design Through Visual Perception?**<br>
Matthew Berger, Shusen Liu<br>
[arXiv, 5 Oct 2024](https://arxiv.org/pdf/2410.04280)

**MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization**<br>
Zhiyu Yang, Zihan Zhou, Shuo Wang, Xin Cong, Xu Han, Yukun Yan, Zhenghao Liu, Zhixing Tan, Pengyuan Liu, Dong Yu, Zhiyuan Liu, Xiaodong Shi, Maosong Sun<br>
[Findings of ACL 2024](https://aclanthology.org/2024.findings-acl.701.pdf) • [Code](https://github.com/thunlp/MatPlotAgent)

**WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization**<br>
Liwenhan Xie, Chengbo Zheng, Haijun Xia, Huamin Qu, Chen Zhu-Tian<br>
[UIST 2024](https://arxiv.org/pdf/2408.01703)

**AVA: Towards Autonomous Visualization Agents through Visual Perception-Driven Decision-Making**<br>
Shusen Liu, Haichao Miao, Zhimin Li, Matthew Olson, Valerio Pascucci, Peer-Timo Bremer<br>
[EuroVis 2024](https://onlinelibrary.wiley.com/doi/pdf/10.1111/cgf.15093)

**AVA: An Automated and AI-Driven Intelligent Visual Analytics Framework**<br>
Jiazhe Wang, Xi Li, Chenlu Li, Di Peng, Arran Zeyu Wang, Yuhui Gu, Xingui Lai, Haifeng Zhang, Xinyue Xu, Xiaoqing Dong, Zhifeng Lin, Jiehui Zhou, Xingyu Liu, Wei Chen<br>
[Visual Informatics 2024](https://www.sciencedirect.com/science/article/pii/S2468502X24000226)

**LEVA: Using Large Language Models to Enhance Visual Analytics**<br>
Yuheng Zhao, Yixing Zhang, Yu Zhang, Xinyi Zhao, Junjie Wang, Zekai Shao, Cagatay Turkay, Siming Chen<br>
[TVCG 2024](https://arxiv.org/pdf/2403.05816)

## 7. Empirical Study on LLM's Chart Understanding & Chart Generation
**How Aligned are Human Chart Takeaways and LLM Predictions? A Case Study on Bar Charts with Varying Layouts**<br>
Huichen Will Wang, Jane Hoffswell, Sao Myat Thazin Thane, Victor S. Bursztyn, Cindy Xiong Bearfield<br>
[VIS 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10681139)

**An Empirical Evaluation of the GPT-4 Multimodal Language Model on Visualization Literacy Tasks**<br>
Alexander Bendeck, John Stasko<br>
[VIS 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10670574)

**Visualization Literacy of Multimodal Large Language Models: A Comparative Study**<br>
Zhimin Li, Haichao Miao, Valerio Pascucci, Shusen Liu<br>
[arXiv, Jul 2024](https://arxiv.org/pdf/2407.10996)

**Enhancing Data Literacy On-demand: LLMs as Guides for Novices in Chart Interpretation**<br>
Kiroong Choe, Chaerin Lee, S. Lee, Jiwon Song, Aeri Cho, Nam Wook Kim, Jinwook Seo<br>
[VIS 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10555321)

**Promises and Pitfalls: Using Large Language Models to Generate Visualization Items**<br>
Yuan Cui, Lily W. Ge, Yiren Ding, Lane Harrison, Fumeng Yang, Matthew Kay<br>
[VIS 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10670418)

**How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations?**<br>
Leo Yu-Ho Lo, Huamin Qu<br>
[VIS 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10679256)

**Exploring the Capability of LLMs in Performing Low-Level Visual Analytic Tasks on SVG Data Visualizations**<br>
Zhongzheng Xu, Emily Wall<br>
[VIS 2024 Short Paper](https://arxiv.org/pdf/2404.19097)

**Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization**<br>
Hannah K. Bako, Arshnoor Buthani, Xinyi Liu, Kwesi A. Cobbina, Zhicheng Liu<br>
[VIS 2024 Short Paper](https://arxiv.org/pdf/2407.06129) • [Code](https://github.com/hdi-umd/Semantic_Profiling_LLM_Evaluation)

## 8. Visualization for Interpreting, Evaluating, and Improving LLM
> If you are interested in this topic, you can also find many interesting interactive articles and demos at the [VISxAI workshop](https://visxai.io/).

**Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models**<br>
Zhanke Zhou, Zhaocheng Zhu, Xuan Li, Mikhail Galkin, Xiao Feng, Sanmi Koyejo, Jian Tang, Bo Han<br>
[arXiv, 28 Mar 2025]

**LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models**<br>
Gabriela Ben Melech Stan, Estelle Aflalo, Raanan Yehezkel Rohekar, Anahita Bhiwandiwalla, Shao-Yen Tseng, Matthew Lyle Olson, Yaniv Gurwicz, Chenfei Wu, Nan Duan, Vasudev Lal<br>
[arXiv, 3 Apr 2024](https://arxiv.org/abs/2404.03118) • [Homepage](https://intellabs.github.io/multimodal_cognitive_ai/lvlm_interpret/)

**LLM Comparator: Interactive Analysis of Side-by-Side Evaluation of Large Language Models**<br>
Minsuk Kahng, Ian Tenney, Mahima Pushkarna, Michael Xieyang Liu, James Wexler, Emily Reif, Krystal Kallarackal, Minsuk Chang, Michael Terry, Lucas Dixon<br>
[VIS 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10670495) • [Code](https://github.com/PAIR-code/llm-comparator) • [Demo](https://pair-code.github.io/llm-comparator/)

**Towards Dataset-scale and Feature-oriented Evaluation of Text Summarization in Large Language Model Prompts**<br>
Sam Yu-Te Lee, Aryaman Bahukhandi, Dongyu Liu, Kwan-Liu Ma<br>
[VIS 2024](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10669798)

**LLM Attributor: Attribute LLM's Generated Text to Training Data**<br>
Seongmin Lee, Zijie J. Wang, Aishwarya Chakravarthy, Alec Helbling, ShengYun Peng, Mansi Phute, Duen Horng (Polo) Chau, Minsuk Kahng<br>
[VIS 2024 Poster](http://arxiv.org/abs/2404.01361) • [Code](https://github.com/poloclub/LLM-Attributor)

**Can Large Language Models Explain Their Internal Mechanisms?**<br>
Nada Hussein, Asma Ghandeharioun, Ryan Mullins, Emily Reif, Jimbo Wilson, Nithum Thain, Lucas Dixon<br>
[VISxAI Workshop 2024 & Demo](https://pair.withgoogle.com/explorables/patchscopes/)

**ExplainPrompt: Decoding the Language of AI Prompts**<br>
Shawn Simister<br>
[VISxAI Workshop 2024 & Demo](https://www.explainprompt.com/)

**AttentionViz: A Global View of Transformer Attention**<br>
Catherine Yeh, Yida Chen, Aoyu Wu, Cynthia Chen, Fernanda Viégas, Martin Wattenberg<br>
[VIS 2023](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10297591) • [Demo](http://attentionviz.com)

**Do Machine Learning Models Memorize or Generalize?**<br>
Adam Pearce, Asma Ghandeharioun, Nada Hussein, Nithum Thain, Martin Wattenberg, Lucas Dixon<br>
[VISxAI Workshop 2023 & Demo](https://pair.withgoogle.com/explorables/grokking/)

## 9. Generic Multimodal Large Language Model
> Refer to [Awesome-Multimodal-Large-Language-Models](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models) for a more comprehensive list. This repo only records papers that bring insights to **LLM**x**Vis**.

**ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models**<br>
Mingrui Wu, Xinyue Cai, Jiayi Ji, Jiale Li, Oucheng Huang, Gen Luo, Hao Fei, Guannan Jiang, Xiaoshuai Sun, Rongrong Ji<br>
[NeurIPS 2024](https://arxiv.org/pdf/2407.21534) • [Code](https://github.com/mrwu-mac/ControlMLLM)

**MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning**<br>
Haotian Zhang, Mingfei Gao, Zhe Gan, Philipp Dufter, Nina Wenzel, Forrest Huang, Dhruti Shah, Xianzhi Du, Bowen Zhang, Yanghao Li, Sam Dodge, Keen You, Zhen Yang, Aleksei Timofeev, Mingze Xu, Hong-You Chen, Jean-Philippe Fauconnier, Zhengfeng Lai, Haoxuan You, Zirui Wang, Afshin Dehghan, Peter Grasch, Yinfei Yang<br>
[arXiv, 30 Sep 2024](https://arxiv.org/pdf/2409.20566)

**Law of Vision Representation in MLLMs**<br>
Shijia Yang, Bohan Zhai, Quanzeng You, Jianbo Yuan, Hongxia Yang, Chenfeng Xu<br>
[arXiv, 29 Aug 2024](https://arxiv.org/pdf/2408.16357)

**EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders**<br>
Min Shi*, Fuxiao Liu*, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, De-An Huang, Hongxu Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Zhiding Yu, Guilin Liu<br>
[arXiv, 28 Aug 2024](https://arxiv.org/abs/2408.15998) • [Code](https://github.com/NVlabs/EAGLE)

**Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs**<br>
Shengbang Tong*, Ellis Brown*, Penghao Wu*, Sanghyun Woo, Manoj Middepogu, Sai Charitha Akula, Jihan Yang, Shusheng Yang, Adithya Iyer, Xichen Pan, Austin Wang, Rob Fergus, Yann LeCun, Saining Xie<br>
[arXiv, 24 Jun 2024](https://arxiv.org/abs/2406.16860) • [Code](https://github.com/cambrian-mllm/cambrian)

**DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models**<br>
Linli Yao, Lei Li, Shuhuai Ren, Lean Wang, Yuanxin Liu, Xu Sun, Lu Hou<br>
[arXiv, 31 May 2024](https://arxiv.org/abs/2405.20985) • [Code](https://github.com/yaolinli/DeCo)

**Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs**<br>
Shengbang Tong, Zhuang Liu, Yuexiang Zhai, Yi Ma, Yann LeCun, Saining Xie<br>
[CVPR 2024](https://openaccess.thecvf.com/content/CVPR2024/papers/Tong_Eyes_Wide_Shut_Exploring_the_Visual_Shortcomings_of_Multimodal_LLMs_CVPR_2024_paper.pdf)

**InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks**<br>
Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai<br>
[CVPR 2024](https://arxiv.org/abs/2312.14238)

**Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic**<br>
Keqin Chen, Zhao Zhang, Weili Zeng, Richong Zhang, Feng Zhu, Rui Zhao<br>
[arXiv, 27 Jun 2023](https://arxiv.org/abs/2306.15195) • [Code](https://github.com/shikras/shikra)

## 10. Related Survey Papers
**Natural Language Generation for Visualizations: State of the Art, Challenges and Future Directions**<br>
Enamul Hoque, Mohammed Saidul Islam<br>
[arXiv, 29 Sep 2024](https://arxiv.org/pdf/2409.19747)

**Generative AI for Visualization: State of the Art and Future Directions**<br>
Yilin Ye, Jianing Hao, Yihan Hou, Zhan Wang, Shishi Xiao, Yuyu Luo, Wei Zeng<br>
[Visual Informatics 2024](https://www.sciencedirect.com/science/article/pii/S2468502X24000160)

**Datasets of Visualization for Machine Learning**<br>
Can Liu, Ruike Jiang, Shaocong Tan, Jiacheng Yu, Chaofan Yang, Hanning Shao, Xiaoru Yuan<br>
[arXiv, 23 Jul 2024](https://arxiv.org/pdf/2407.16351)

**Foundation Models Meet Visualizations: Challenges and Opportunities**<br>
Weikai Yang, Mengchen Liu, Zheng Wang, Shixia Liu<br>
[CVM 2024](https://link.springer.com/article/10.1007/s41095-023-0393-x)

**From Detection to Application: Recent Advances in Understanding Scientific Tables and Figures**<br>
Jiani Huang, Haihua Chen, Fengchang Yu, Wei Lu<br>
[ACM Computing Surveys 2024](https://dl.acm.org/doi/pdf/10.1145/3657285)

**From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models**<br>
Kung-Hsiang Huang, Hou Pong Chan, Yi R. Fung, Haoyi Qiu, Mingyang Zhou, Shafiq Joty, Shih-Fu Chang, Heng Ji<br>
[arXiv, 18 Mar 2024](https://arxiv.org/pdf/2403.12027)

**Leveraging Large Models for Crafting Narrative Visualization: A Survey**<br>
Yi He, Shixiong Cao, Yang Shi, Qing Chen, Ke Xu, Nan Cao<br>
[arXiv, 25 Jan 2024](https://arxiv.org/pdf/2401.14010)

**Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey**<br>
Weixu Zhang, Yifei Wang, Yuanfeng Song, Victor Junqiu Wei, Yuxing Tian, Yiyan Qi, Jonathan H. Chan, Raymond Chi-Wing Wong, Haiqin Yang<br>
[arXiv, 27 Oct 2023](https://arxiv.org/pdf/2310.17894)

**AI4VIS: Survey on Artificial Intelligence Approaches for Data Visualization**<br>
Aoyu Wu, Yun Wang, Xinhuan Shu, Dominik Moritz, Weiwei Cui, Haidong Zhang, Dongmei Zhang, Huamin Qu<br>
[TVCG 2022](https://arxiv.org/pdf/2102.01330) • [Homepage](https://ai4vis.github.io/)

**Chart Question Answering: State of the Art and Future Directions**<br>
Enamul Hoque, Parsa Kavehzadeh, Ahmed Masry<br>
[EuroVis 2022](https://onlinelibrary.wiley.com/doi/epdf/10.1111/cgf.14573)

**Towards Natural Language Interfaces for Data Visualization: A Survey**<br>
Leixian Shen, Enya Shen, Yuyu Luo, Xiaocong Yang, Xuming Hu, Xiongshuai Zhang, Zhiwei Tai, Jianmin Wang<br>
[TVCG 2022](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9699035)

## 11. Others
**VisAnatomy: An SVG Chart Corpus with Fine-Grained Semantic Labels**<br>
Chen Chen, Hannah K. Bako, Peihong Yu, John Hooker, Jeffrey Joyal, Simon C. Wang, Samuel Kim, Jessica Wu, Aoxue Ding, Lara Sandeep, Alex Chen, Chayanika Sinha, Zhicheng Liu<br>
[arXiv, 16 Oct 2024](https://arxiv.org/pdf/2410.12268) • [Homepage](https://visanatomy.github.io/)

**Multimodal Chart Retrieval: A Comparison of Text, Table and Image Based Approaches**<br>
Averi Nowak, Francesco Piccinno, Yasemin Altun<br>
[NAACL 2024](https://aclanthology.org/2024.naacl-long.307/)