
415 |
416 | ### RAG for Text
417 | - Question Answering
418 |
419 | [Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering](https://doi.org/10.18653/v1/2021.eacl-main.74)
420 |
421 | [REALM: Retrieval-Augmented Language Model Pre-Training](https://arxiv.org/abs/2002.08909)
422 |
423 | [Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training](https://doi.org/10.18653/v1/2021.naacl-main.278)
424 |
425 | [Atlas: Few-shot Learning with Retrieval Augmented Language Models](http://jmlr.org/papers/v24/23-0037.html)
426 |
427 | [Improving Language Models by Retrieving from Trillions of Tokens](https://proceedings.mlr.press/v162/borgeaud22a.html)
428 |
429 | [Self-Knowledge Guided Retrieval Augmentation for Large Language Models](https://aclanthology.org/2023.findings-emnlp.691)
430 |
431 | [Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering](https://doi.org/10.48550/arXiv.2306.04136)
432 |
433 | [Think-on-Graph: Deep and Responsible Reasoning of Large Language Model with Knowledge Graph](https://doi.org/10.48550/arXiv.2307.07697)
434 |
435 | [Nonparametric Masked Language Modeling](https://doi.org/10.18653/v1/2023.findings-acl.132)
436 |
437 | [CL-ReLKT: Cross-lingual Language Knowledge Transfer for Multilingual Retrieval Question Answering](https://doi.org/10.18653/v1/2022.findings-naacl.165)
438 |
439 | [One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval](https://proceedings.neurips.cc/paper/2021/hash/3df07fdae1ab273a967aaa1d355b8bb6-Abstract.html)
440 |
441 | [Entities as Experts: Sparse Memory Access with Entity Supervision](https://arxiv.org/abs/2004.07202)
442 |
443 | [When to Read Documents or QA History: On Unified and Selective Open-domain QA](https://doi.org/10.18653/v1/2023.findings-acl.401)
444 |
445 | [Enhancing LLM Intelligence with ARM-RAG: Auxiliary Rationale Memory for Retrieval Augmented Generation](https://arxiv.org/abs/2311.04177)
446 |
447 | [DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Service](https://arxiv.org/pdf/2309.11325.pdf)
448 |
449 | - Fact Verification
450 |
451 | [CONCRETE: Improving Cross-lingual Fact-checking with Cross-lingual Retrieval](https://aclanthology.org/2022.coling-1.86)
452 |
453 | [Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization](https://arxiv.org/pdf/2405.02816)
454 |
455 | - Commonsense Reasoning
456 |
457 | [KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning](https://doi.org/10.1609/aaai.v35i7.16796)
458 |
459 | [What Evidence Do Language Models Find Convincing?](https://arxiv.org/abs/2402.11782v1)
460 |
461 | [Enhancing Financial Sentiment Analysis via Retrieval Augmented Large Language Models](https://arxiv.org/abs/2310.04027)
462 |
463 | - Human-Machine Conversation
464 |
465 | [Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs](https://doi.org/10.18653/v1/2020.acl-main.184)
466 |
467 | [Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory](https://doi.org/10.18653/v1/n19-1124)
468 |
469 | [Internet-Augmented Dialogue Generation](https://doi.org/10.18653/v1/2022.acl-long.579)
470 |
471 | [BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage](https://doi.org/10.48550/arXiv.2208.03188)
472 |
473 | [A Model of Cross-Lingual Knowledge-Grounded Response Generation for Open-Domain Dialogue Systems](https://doi.org/10.18653/v1/2021.findings-emnlp.33)
474 |
475 | [From Classification to Generation: Insights into Crosslingual Retrieval Augmented ICL](https://openreview.net/forum?id=KLPLCXo4aD)
476 |
477 | [Cross-Lingual Retrieval Augmented Prompt for Low-Resource Languages](https://aclanthology.org/2023.findings-acl.528/)
478 |
479 | [Citation-Enhanced Generation for LLM-based Chatbot](https://arxiv.org/pdf/2402.16063v1.pdf)
480 |
481 | [KAUCUS: Knowledge Augmented User Simulators for Training Language Model Assistants](https://aclanthology.org/2024.scichat-1.5/)
482 |
483 |
484 |
485 | - Neural Machine Translation
486 |
487 | [Neural Machine Translation with Monolingual Translation Memory](https://doi.org/10.18653/v1/2021.acl-long.567)
488 |
489 | [Nearest Neighbor Machine Translation](https://openreview.net/forum?id=7wCBOfJ8hJM)
490 |
491 | [Training Language Models with Memory Augmentation](https://doi.org/10.18653/v1/2022.emnlp-main.382)
492 |
493 | - Event Extraction
494 |
495 | [Retrieval-Augmented Generative Question Answering for Event Argument Extraction](https://doi.org/10.18653/v1/2022.emnlp-main.307)
496 |
497 | - Summarization
498 |
499 | [Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training](https://doi.org/10.18653/v1/2022.findings-naacl.92)
500 |
501 | [Unlimiformer: Long-Range Transformers with Unlimited Length Input](https://doi.org/10.48550/arXiv.2305.01625)
502 |
503 | [Retrieval-based Full-length Wikipedia Generation for Emergent Events](https://arxiv.org/abs/2402.18264v1)
504 |
505 | [RIGHT: Retrieval-augmented Generation for Mainstream Hashtag Recommendation](https://arxiv.org/abs/2312.10466)
506 |
507 | [M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions](https://arxiv.org/pdf/2405.16420)
508 |
509 | ### RAG for Code
510 | - Code Generation
511 |
512 | [Retrieval-Based Neural Code Generation](https://doi.org/10.18653/v1/d18-1111)
513 |
514 | [Retrieval Augmented Code Generation and Summarization](https://doi.org/10.18653/v1/2021.findings-emnlp.232)
515 |
516 | [When Language Model Meets Private Library](https://doi.org/10.18653/v1/2022.findings-emnlp.21)
517 |
518 | [Language Models of Code are Few-Shot Commonsense Learners](https://doi.org/10.18653/v1/2022.emnlp-main.90)
519 |
520 | [DocPrompting: Generating Code by Retrieving the Docs](https://openreview.net/pdf?id=ZTCxT2t2Ru)
521 |
522 | [CodeT5+: Open Code Large Language Models for Code Understanding and Generation](https://aclanthology.org/2023.emnlp-main.68)
523 |
524 | [AceCoder: Utilizing Existing Code to Enhance Code Generation](https://arxiv.org/abs/2303.17780)
525 |
526 | [Syntax-Aware Retrieval Augmented Code Generation](https://aclanthology.org/2023.findings-emnlp.90)
527 |
528 | [A^3-CodGen: A Repository-Level Code Generation Framework for Code Reuse with Local-Aware, Global-Aware, and Third-Party-Library-Aware](https://arxiv.org/abs/2312.05772)
529 |
530 | [SkCoder: A Sketch-based Approach for Automatic Code Generation](https://ieeexplore.ieee.org/abstract/document/10172719)
531 |
532 | [CodeGen4Libs: A Two-Stage Approach for Library-Oriented Code Generation](https://ieeexplore.ieee.org/abstract/document/10298327)
533 |
534 | [ToolCoder: Teach Code Generation Models to use API search tools](https://arxiv.org/abs/2305.04032)
535 |
536 | [CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges](https://arxiv.org/abs/2401.07339)
537 |
538 | [RRGcode: Deep hierarchical search-based code generation](https://www.sciencedirect.com/science/article/pii/S0164121224000256)
539 |
540 | [Code Search Is All You Need? Improving Code Suggestions with Code Search](https://www.computer.org/csdl/proceedings-article/icse/2024/021700a857/1V5BkjI3196)
541 |
542 | [ARKS: Active Retrieval in Knowledge Soup for Code Generation](https://arxiv.org/abs/2402.12317)
543 |
544 | - Code Summary
545 |
546 | [Retrieval-based neural source code summarization](https://doi.org/10.1145/3377811.3380383)
547 |
548 | [Retrieve and Refine: Exemplar-based Neural Comment Generation](https://doi.org/10.1145/3324884.3416578)
549 |
550 | [EditSum: A Retrieve-and-Edit Framework for Source Code Summarization](https://doi.org/10.1109/ASE51524.2021.9678724)
551 |
552 | [Retrieval-Augmented Generation for Code Summarization via Hybrid GNN](https://openreview.net/forum?id=zv-typ1gPxA)
553 |
554 | [Context-aware Retrieval-based Deep Commit Message Generation](https://dl.acm.org/doi/abs/10.1145/3464689)
555 |
556 | [RACE: Retrieval-augmented Commit Message Generation](https://doi.org/10.18653/v1/2022.emnlp-main.372)
557 |
558 | [BashExplainer: Retrieval-Augmented Bash Code Comment Generation based on Fine-tuned CodeBERT](https://doi.org/10.1109/ICSME55016.2022.00016)
559 |
560 | [Retrieval-Based Transformer Pseudocode Generation](https://www.mdpi.com/2227-7390/10/4/604)
561 |
562 | [A Simple Retrieval-based Method for Code Comment Generation](https://ieeexplore.ieee.org/abstract/document/9825803)
563 |
564 | [READSUM: Retrieval-Augmented Adaptive Transformer for Source Code Summarization](https://ieeexplore.ieee.org/abstract/document/10113620)
565 |
566 | [Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization](https://arxiv.org/abs/2305.11074)
567 |
568 | [Automatic Semantic Augmentation of Language Model Prompts (for Code Summarization)](https://arxiv.org/abs/2304.06815)
569 |
570 | [Cross-Modal Retrieval-Enhanced Code Summarization based on Joint Learning for Retrieval and Generation](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4724884)
571 |
572 | [Automatic Smart Contract Comment Generation via Large Language Models and In-Context Learning](https://www.sciencedirect.com/science/article/pii/S0950584924000107)
573 |
574 | [UniLog: Automatic Logging via LLM and In-Context Learning](https://dl.acm.org/doi/abs/10.1145/3597503.3623326)
575 |
576 | - Code Completion
577 |
578 | [A Retrieve-and-Edit Framework for Predicting Structured Outputs](https://proceedings.neurips.cc/paper_files/paper/2018/hash/cd17d3ce3b64f227987cd92cd701cc58-Abstract.html)
579 |
580 | [Generating Code with the Help of Retrieved Template Functions and Stack Overflow Answers](https://arxiv.org/abs/2104.05310)
581 |
582 | [ReACC: A Retrieval-Augmented Code Completion Framework](https://doi.org/10.18653/v1/2022.acl-long.431)
583 |
584 | [Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases](https://ieeexplore.ieee.org/abstract/document/10298575)
585 |
586 | [RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation](https://aclanthology.org/2023.emnlp-main.151)
587 |
588 | [CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context](https://doi.org/10.48550/arXiv.2212.10007)
589 |
590 | [RepoFusion: Training Code Models to Understand Your Repository](https://arxiv.org/abs/2306.10998)
591 |
592 | [Revisiting and Improving Retrieval-Augmented Deep Assertion Generation](https://ieeexplore.ieee.org/abstract/document/10298588)
593 |
594 | [De-Hallucinator: Iterative Grounding for LLM-Based Code Completion](https://arxiv.org/abs/2401.01701)
595 |
596 | [REPOFUSE: Repository-Level Code Completion with Fused Dual Context](https://arxiv.org/abs/2402.14323)
597 |
598 | - Automatic Program Repair
599 |
600 | [Repair Is Nearly Generation: Multilingual Program Repair with LLMs](https://doi.org/10.1609/aaai.v37i4.25642)
601 |
602 | [Retrieval-Based Prompt Selection for Code-Related Few-Shot Learning](https://doi.org/10.1109/ICSE48619.2023.00205)
603 |
604 | [InferFix: End-to-End Program Repair with LLMs](https://doi.org/10.1145/3611643.3613892)
605 |
606 | [RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair](https://dl.acm.org/doi/abs/10.1145/3611643.3616256)
607 |
608 | [Automated Code Editing with Search-Generate-Modify](https://arxiv.org/abs/2306.06490)
609 |
610 | [RTLFixer: Automatically Fixing RTL Syntax Errors with Large Language Models](https://arxiv.org/abs/2311.16543)
611 |
612 | - Text-to-SQL and Code-based Semantic Parsing
613 |
614 | [XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing](https://doi.org/10.18653/v1/2022.findings-emnlp.384)
615 |
616 | [Synchromesh: Reliable Code Generation from Pre-trained Language Models](https://openreview.net/forum?id=KmtVD97J43e)
617 |
618 | [Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing](https://aclanthology.org/2022.emnlp-main.624/)
619 |
620 | [RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL](https://ojs.aaai.org/index.php/AAAI/article/view/26535)
621 |
622 | [Leveraging Code to Improve In-context Learning for Semantic Parsing](https://arxiv.org/abs/2311.09519)
623 |
624 | [ReFSQL: A Retrieval-Augmentation Framework for Text-to-SQL Generation](https://aclanthology.org/2023.findings-emnlp.48/)
625 |
626 | [Enhancing Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies](https://aclanthology.org/2023.findings-emnlp.996/)
627 |
628 | [Selective Demonstrations for Cross-domain Text-to-SQL](https://aclanthology.org/2023.findings-emnlp.944/)
629 |
630 | [DBCopilot: Scaling Natural Language Querying to Massive Databases via Schema Routing](https://arxiv.org/abs/2312.03463)
631 |
632 | [Multi-Hop Table Retrieval for Open-Domain Text-to-SQL](https://arxiv.org/abs/2402.10666)
633 |
634 | [CodeS: Towards Building Open-source Language Models for Text-to-SQL](https://arxiv.org/abs/2402.16347)
635 |
636 | - Others
637 |
638 | [De-fine: Decomposing and Refining Visual Programs with Auto-Feedback](https://arxiv.org/abs/2311.12890)
639 |
640 | [Leveraging training data in few-shot prompting for numerical reasoning](https://arxiv.org/abs/2305.18170)
641 |
642 | [Retrieval-Augmented Code Generation for Universal Information Extraction](https://arxiv.org/abs/2311.02962)
643 |
644 | [E&V: Prompting Large Language Models to Perform Static Analysis by Pseudo-code Execution and Verification](https://arxiv.org/abs/2312.08477)
645 |
646 | [Lessons from Building StackSpot AI: A Contextualized AI Coding Assistant](https://arxiv.org/abs/2311.18450)
647 |
648 | [Testing the Limits: Unusual Text Inputs Generation for Mobile App Crash Detection with Large Language Model](https://arxiv.org/abs/2310.15657)
649 |
650 | ### RAG for Audio
651 | - Audio Generation
652 |
653 | [Retrieval-Augmented Text-to-Audio Generation](https://doi.org/10.48550/arXiv.2309.08051)
654 |
655 | [Large-Scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation](https://doi.org/10.1109/ICASSP49357.2023.10095969)
656 |
657 | [Make-An-Audio: Text-to-Audio Generation with Prompt-Enhanced Diffusion Models](https://proceedings.mlr.press/v202/huang23i.html)
658 |
659 | - Audio Captioning
660 |
661 | [RECAP: Retrieval-Augmented Audio Captioning](https://doi.org/10.48550/arXiv.2309.09836)
662 |
663 | [Audio Captioning using Pre-Trained Large-Scale Language Model Guided by Audio-based Similar Caption Retrieval](https://arxiv.org/abs/2012.07331)
664 |
665 | [Large-Scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation](https://doi.org/10.1109/ICASSP49357.2023.10095969)
666 |
667 | [CNN architectures for large-scale audio classification](https://doi.org/10.1109/ICASSP.2017.7952132)
668 |
669 | [Natural language supervision for general-purpose audio representations](https://ieeexplore.ieee.org/abstract/document/10448504)
670 |
671 | [Weakly-supervised Automated Audio Captioning via text only training](https://arxiv.org/abs/2309.12242)
672 |
673 | [Training Audio Captioning Models without Audio](https://ieeexplore.ieee.org/abstract/document/10448115)
674 |
675 | ### RAG for Image
676 | - Image Generation
677 |
678 | [RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval](https://arxiv.org/abs/2007.08513)
679 |
680 | [Instance-Conditioned GAN](https://arxiv.org/abs/2109.05070)
681 |
682 | [Memory-driven text-to-image generation](https://arxiv.org/abs/2208.07022)
683 |
684 | [Re-Imagen: Retrieval-Augmented Text-to-Image Generator](https://arxiv.org/abs/2209.14491)
685 |
686 | [KNN-Diffusion: Image Generation via Large-Scale Retrieval](https://arxiv.org/abs/2204.02849)
687 |
688 | [Retrieval-Augmented Diffusion Models](https://arxiv.org/abs/2204.11824)
689 |
690 | [Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models](https://arxiv.org/abs/2207.13038)
691 |
692 | [X&Fuse: Fusing Visual Information in Text-to-Image Generation](https://arxiv.org/abs/2303.01000)
693 |
694 | [Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs](https://arxiv.org/abs/2401.11708)
695 |
696 | - Image Captioning
697 |
698 | [Memory-augmented image captioning](https://ojs.aaai.org/index.php/AAAI/article/view/16220)
699 |
700 | [Retrieval-enhanced adversarial training with dynamic memory-augmented attention for image paragraph captioning](https://www.sciencedirect.com/science/article/pii/S0950705120308595)
701 |
702 | [Retrieval-Augmented Transformer for Image Captioning](https://arxiv.org/abs/2207.13162)
703 |
704 | [Retrieval-augmented image captioning](https://arxiv.org/abs/2302.08268)
705 |
706 | [Reveal: Retrieval-augmented visual-language pre-training with multi-source multimodal knowledge memory](https://arxiv.org/abs/2212.05221)
707 |
708 | [SmallCap: Lightweight Image Captioning Prompted With Retrieval Augmentation](https://arxiv.org/abs/2209.15323)
709 |
710 | [Cross-Modal Retrieval and Semantic Refinement for Remote Sensing Image Captioning](https://www.mdpi.com/2072-4292/16/1/196)
711 |
712 | - Others
713 |
714 | [An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA](https://ojs.aaai.org/index.php/AAAI/article/view/20215)
715 |
716 | [Retrieval augmented visual question answering with outside knowledge](https://aclanthology.org/2022.emnlp-main.772/)
717 |
718 | [Augmenting transformers with KNN-based composite memory for dialog](https://doi.org/10.1162/tacl_a_00356)
719 |
720 | [Maria: A visual experience powered conversational agent](https://aclanthology.org/2021.acl-long.435/)
721 |
722 | [Neural machine translation with phrase-level universal visual representations](https://aclanthology.org/2022.acl-long.390/)
723 |
724 |
725 | ### RAG for Video
726 | - Video Captioning
727 |
728 | [Incorporating Background Knowledge into Video Description Generation](https://aclanthology.org/D18-1433/)
729 |
730 | [Retrieval Augmented Convolutional Encoder-decoder Networks for Video Captioning](https://doi.org/10.1145/3539225)
731 |
732 | [Concept-Aware Video Captioning: Describing Videos With Effective Prior Information](https://doi.org/10.1109/TIP.2023.3307969)
733 |
734 | [Retrieval-Augmented Egocentric Video Captioning](https://arxiv.org/abs/2401.00789)
735 |
736 | - Video QA & Dialogue
737 |
738 | [Memory augmented deep recurrent neural network for video question answering](https://doi.org/10.1109/TNNLS.2019.2938015)
739 |
740 | [Retrieving-to-answer: Zero-shot video question answering with frozen large language models](https://openaccess.thecvf.com/content/ICCV2023W/MMFM/html/Pan_Retrieving-to-Answer_Zero-Shot_Video_Question_Answering_with_Frozen_Large_Language_Models_ICCVW_2023_paper.html)
741 |
742 | [TVQA+: Spatio-Temporal Grounding for Video Question Answering](https://aclanthology.org/2020.acl-main.730/)
743 |
744 | [VGNMN: Video-Grounded Neural Module Networks for Video-Grounded Dialogue Systems](https://aclanthology.org/2022.naacl-main.247/)
745 |
746 | - Others
747 |
748 | [Language models with image descriptors are strong few-shot video-language learners](https://proceedings.neurips.cc/paper_files/paper/2022/hash/381ceeae4a1feb1abc59c773f7e61839-Abstract-Conference.html)
749 |
750 | [RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model](https://arxiv.org/abs/2402.10828)
751 |
752 | [Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation](https://doi.org/10.48550/arXiv.2307.06940)
753 |
754 | [Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval](https://doi.org/10.1109/ICCV48922.2021.00175)
755 |
756 | ### RAG for 3D
757 | - Text-to-3D
758 |
759 | [ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model](https://doi.org/10.1109/ICCV51070.2023.00040)
760 |
761 | [AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion](https://arxiv.org/abs/2312.12763)
762 |
763 | [Retrieval-Augmented Score Distillation for Text-to-3D Generation](https://doi.org/10.48550/arXiv.2402.02972)
764 |
765 | ### RAG for Knowledge
766 | - Knowledge Base Question Answering
767 |
768 | [ReTraCk: A Flexible and Efficient Framework for Knowledge Base Question Answering](https://doi.org/10.18653/v1/2021.acl-demo.39)
769 |
770 | [Unseen Entity Handling in Complex Question Answering over Knowledge Base via Language Generation](https://aclanthology.org/2021.findings-emnlp.50/)
771 |
772 | [Case-based Reasoning for Natural Language Queries over Knowledge Bases](https://doi.org/10.18653/v1/2021.emnlp-main.755)
773 |
774 | [Logical Form Generation via Multi-task Learning for Complex Question Answering over Knowledge Bases](https://aclanthology.org/2022.coling-1.145)
775 |
776 | [Uni-Parser: Unified Semantic Parser for Question Answering on Knowledge Base and Database](https://aclanthology.org/2022.emnlp-main.605/)
777 |
778 | [RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering](https://aclanthology.org/2022.acl-long.417/)
779 |
780 | [TIARA: Multi-grained Retrieval for Robust Question Answering over Large Knowledge Base](https://aclanthology.org/2022.emnlp-main.555/)
781 |
782 | [DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases](https://openreview.net/forum?id=XHc5zRPxqV9)
783 |
784 | [End-to-end Case-Based Reasoning for Commonsense Knowledge Base Completion](https://aclanthology.org/2023.eacl-main.255/)
785 |
786 | [Bridging the KB-Text Gap: Leveraging Structured Knowledge-aware Pre-training for KBQA](https://dl.acm.org/doi/abs/10.1145/3583780.3615150)
787 |
788 | [Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for Knowledge-intensive Question Answering](https://arxiv.org/abs/2308.13259)
789 |
790 | [Few-shot Transfer Learning for Knowledge Base Question Answering: Fusing Supervised Models with In-Context Learning](https://arxiv.org/abs/2311.08894)
791 |
792 | [FC-KBQA: A Fine-to-Coarse Composition Framework for Knowledge Base Question Answering](https://aclanthology.org/2023.acl-long.57/)
793 |
794 | [Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering](https://aclanthology.org/2023.nlrse-1.7/)
795 |
796 | [Knowledge Graph-augmented Language Models for Complex Question Answering](https://aclanthology.org/2023.nlrse-1.1/)
797 |
798 | [Retrieve-Rewrite-Answer: A KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering](https://arxiv.org/abs/2309.11206)
799 |
800 | [Distribution Shifts Are Bottlenecks: Extensive Evaluation for Grounding Language Models to Knowledge Bases](https://aclanthology.org/2024.eacl-srw.7/)
801 |
802 | [Probing Structured Semantics Understanding and Generation of Language Models via Question Answering](https://arxiv.org/abs/2401.05777)
803 |
804 | [Keqing: Knowledge-based Question Answering is A Nature Chain-of-Thought mentor of LLMs](https://arxiv.org/abs/2401.00426)
805 |
806 | [Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models](https://arxiv.org/abs/2402.15131)
807 |
808 | - Knowledge-augmented Open-domain Question Answering
809 |
810 | [UniK-QA: Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering](https://aclanthology.org/2022.findings-naacl.115/)
811 |
812 | [KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering](https://aclanthology.org/2022.acl-long.340/)
813 |
814 | [Empowering Language Models with Knowledge Graph Reasoning for Open-Domain Question Answering](https://aclanthology.org/2022.emnlp-main.650/)
815 |
816 | [Grape: Knowledge Graph Enhanced Passage Reader for Open-domain Question Answering](https://aclanthology.org/2022.findings-emnlp.13/)
817 |
818 | [Enhancing Multi-modal Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation](https://dl.acm.org/doi/abs/10.1145/3581783.3611964)
819 |
820 | [DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text](https://arxiv.org/abs/2310.20170)
821 |
822 | [KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases](https://arxiv.org/abs/2308.11761)
823 |
824 | [Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering](https://arxiv.org/abs/2403.02966)
825 |
826 | [Two-stage Generative Question Answering on Temporal Knowledge Graph Using Large Language Models](https://arxiv.org/abs/2402.16568)
827 |
828 | [KnowledgeNavigator: Leveraging Large Language Models for Enhanced Reasoning over Knowledge Graph](https://arxiv.org/abs/2312.15880)
829 |
830 | [GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning](https://arxiv.org/pdf/2405.20139)
831 |
832 | - Table Question Answering
833 |
834 | [NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned](https://proceedings.mlr.press/v133/min21a.html)
835 |
836 | [Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering](https://aclanthology.org/2021.acl-long.315/)
837 |
838 | [End-to-End Table Question Answering via Retrieval-Augmented Generation](https://arxiv.org/abs/2203.16714)
839 |
840 | [OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering](https://aclanthology.org/2022.naacl-main.68/)
841 |
842 | [Reasoning over Hybrid Chain for Table-and-Text Open Domain Question Answering](https://www.ijcai.org/proceedings/2022/0629.pdf)
843 |
844 | [Conversational Question Answering on Heterogeneous Sources](https://dl.acm.org/doi/abs/10.1145/3477495.3531815)
845 |
846 | [Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge](https://aclanthology.org/2022.findings-emnlp.392/)
847 |
848 | [StructGPT: A General Framework for Large Language Model to Reason over Structured Data](https://aclanthology.org/2023.emnlp-main.574/)
849 |
850 | [cTBLS: Augmenting Large Language Models with Conversational Tables](https://aclanthology.org/2023.nlp4convai-1.6/)
851 |
852 | [RINK: Reader-Inherited Evidence Reranker for Table-and-Text Open Domain Question Answering](https://ojs.aaai.org/index.php/AAAI/article/view/26577)
853 |
854 | [Localize, Retrieve and Fuse: A Generalized Framework for Free-Form Question Answering over Tables](https://aclanthology.org/2023.findings-ijcnlp.1/)
855 |
856 | [Exploring the Impact of Table-to-Text Methods on Augmenting LLM-based Question Answering with Domain Hybrid Data](https://arxiv.org/abs/2402.12869)
857 |
858 | [ERATTA: Extreme RAG for Table To Answers with Large Language Models](https://arxiv.org/pdf/2405.03963)
859 |
860 | - Others
861 |
862 | [Improving Knowledge-Aware Dialogue Response Generation by Using Human-Written Prototype Dialogues](https://aclanthology.org/2020.findings-emnlp.126/)
863 |
864 | [Knowledge Graph-Augmented Language Models for Knowledge-Grounded Dialogue Generation](https://arxiv.org/abs/2305.18846)
865 |
866 | [RHO: Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding](https://aclanthology.org/2023.findings-acl.275/)
867 |
868 | [Retrieval-Enhanced Generative Model for Large-Scale Knowledge Graph Completion](https://doi.org/10.1145/3539618.3592052)
869 |
870 | [Knowledge-Augmented Large Language Models for Personalized Contextual Query Suggestion](https://arxiv.org/abs/2311.06318)
871 |
872 | [G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering](https://arxiv.org/abs/2402.07630)
873 |
874 | [RAG-based Explainable Prediction of Road Users Behaviors for Automated Driving using Knowledge Graphs and Large Language Models](https://arxiv.org/pdf/2405.00449)
875 |
876 | [HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models](https://arxiv.org/pdf/2405.14831)
877 |
878 | ### RAG for Science
879 | - Drug Discovery
880 |
881 | [Retrieval-based controllable molecule generation](https://arxiv.org/abs/2208.11126)
882 |
883 | [Prompt-based 3D molecular diffusion models for structure-based drug design](https://openreview.net/forum?id=FWsGuAFn3n)
884 |
885 | - Biomedical Informatics Enhancement
886 |
887 | [PoET: A generative model of protein families as sequences-of-sequences](https://proceedings.neurips.cc/paper_files/paper/2023/hash/f4366126eba252699b280e8f93c0ab2f-Abstract-Conference.html)
888 |
889 | [Retrieval-augmented large language models for adolescent idiopathic scoliosis patients in shared decision-making](https://dl.acm.org/doi/abs/10.1145/3584371.3612956)
890 |
891 | [BioReader: a Retrieval-Enhanced Text-to-Text Transformer for Biomedical Literature](https://aclanthology.org/2022.emnlp-main.390/)
892 |
893 | [Writing by Memorizing: Hierarchical Retrieval-based Medical Report Generation](https://arxiv.org/abs/2106.06471)
894 |
895 | [From RAG to QA-RAG: Integrating Generative AI for Pharmaceutical Regulatory Compliance Process](https://arxiv.org/abs/2402.01717)
896 |
897 | [RAG-RLRC-LaySum at BioLaySumm: Integrating Retrieval-Augmented Generation and Readability Control for Layman Summarization of Biomedical Texts](https://arxiv.org/pdf/2405.13179)
898 |
899 | - Math Applications
900 |
901 | [Retrieval-augmented Generation to Improve Math Question-Answering: Trade-offs Between Groundedness and Human Preference](https://arxiv.org/abs/2310.03184)
902 |
903 | [LeanDojo: Theorem Proving with Retrieval-Augmented Language Models](https://proceedings.neurips.cc/paper_files/paper/2023/hash/4441469427094f8873d0fecb0c4e1cee-Abstract-Datasets_and_Benchmarks.html)
904 |
905 | ## Benchmark
906 | [Benchmarking Large Language Models in Retrieval-Augmented Generation](https://doi.org/10.48550/arXiv.2309.01431)
907 |
908 | [CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models](https://doi.org/10.48550/arXiv.2401.17043)
909 |
910 | [ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems](https://doi.org/10.48550/arXiv.2311.09476)
911 |
912 | [RAGAS: Automated Evaluation of Retrieval Augmented Generation](https://doi.org/10.48550/arXiv.2309.15217)
913 |
914 | [KILT: a Benchmark for Knowledge Intensive Language Tasks](https://arxiv.org/abs/2009.02252)
915 |
916 |
917 | ## Citation
918 | If you find this work useful, please cite our paper:
919 | ```
920 | @article{zhao2024retrieval,
921 | title={Retrieval-Augmented Generation for AI-Generated Content: A Survey},
922 | author={Zhao, Penghao and Zhang, Hailin and Yu, Qinhan and Wang, Zhengren and Geng, Yunteng and Fu, Fangcheng and Yang, Ling and Zhang, Wentao and Cui, Bin},
923 | journal={arXiv preprint arXiv:2402.19473},
924 | year={2024}
925 | }
926 | ```
927 |
928 |
929 |