# Awesome Scene Graphs [![Awesome](https://awesome.re/badge.svg)](https://awesome.re)

Literature survey of scene graphs

## Table of Contents

- [Survey](#Survey)

- [Datasets](#Datasets)

- [Awesome Scene Graphs](#ASG)
  - [2020 Venues](#2020)
  - [2019 Venues](#2019)
  - [2018 Venues](#2018)
  - [2017 Venues](#2017)
  - [2010-2016 Venues](#2010-2016)

### Survey

| Title | Venue |
|:--------|:--------:|
| [A Survey of Scene Graph: Generation and Application](https://www.xiaojun.ai/papers/Scene-Graph-Survey.pdf) | EasyChair Preprint no. 3385 |

@Booklet{EasyChair:3385,

author = {Pengfei Xu and Xiaojun Chang and Ling Guo and Po-Yao Huang and Xiaojiang Chen and Alex Hauptmann},

title = {A Survey of Scene Graph: Generation and Application},

howpublished = {EasyChair Preprint no. 3385},

year = {EasyChair, 2020}}

## Datasets

| Title | Abbr |
|:--------|:--------:|
| [Visual Relationship Detection with Language Priors](https://arxiv.org/pdf/1608.00187.pdf) | VRD |
| [Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations](https://arxiv.org/pdf/1602.07332.pdf) | Visual Genome |
| [Weakly-supervised learning of visual relations](http://openaccess.thecvf.com/content_ICCV_2017/papers/Peyre_Weakly-Supervised_Learning_of_ICCV_2017_paper.pdf) | UnRel |

### 2020

| Title | Venue | Code |
|:--------|:--------:|:--------:|
| [Bridging Knowledge Graphs to Generate Scene Graphs](https://arxiv.org/pdf/2001.02314.pdf) | arXiv | - |
| [Unbiased Scene Graph Generation from Biased Training](https://arxiv.org/pdf/2002.11949.pdf) | CVPR | - |
| [Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval](https://arxiv.org/pdf/1910.05134.pdf) | WACV | - |
| [SOGNet: Scene Overlap Graph Network for Panoptic Segmentation](https://arxiv.org/pdf/1911.07527.pdf) | AAAI | - |
| Storytelling from an Image Stream Using Scene Graphs | - | - |

### 2019

| Title | Venue | Code |
|:--------|:--------:|:--------:|
| [Adversarial Adaptation of Scene Graph Models for Understanding Civic Issues](https://arxiv.org/pdf/1901.10124.pdf) | WWW | - |
| [Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation](https://arxiv.org/pdf/1905.11624.pdf) | NLP | - |
| [PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph](https://arxiv.org/pdf/1905.01608.pdf) | NIPS | - |
| [Referring Expression Grounding by Marginalizing Scene Graph Likelihood](https://arxiv.org/pdf/1906.03561v1.pdf) | NIPS | - |
| [Scene graph captioner: Image captioning based on structural visual representation](https://www.sciencedirect.com/science/article/pii/S1047320318303535) | JVCIR | - |
| [Know More Say Less: Image Captioning Based on Scene Graphs](https://ieeexplore.ieee.org/document/8630068) | IEEE | - |
| [3-D Scene Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents](https://arxiv.org/ftp/arxiv/papers/1908/1908.04929.pdf) | IEEE | - |
| [PANet: A Context Based Predicate Association Network for Scene Graph Generation](https://ieeexplore.ieee.org/document/8784780) | ICME | - |
| [Multi-Granularity Reasoning for Social Relation Recognition from Images](https://arxiv.org/pdf/1901.03067.pdf) | ICME | - |
| [Deeply Supervised Multimodal Attentional Translation Embeddings for Visual Relationship Detection](https://arxiv.org/pdf/1902.05829.pdf) | ICIP | - |
| [Layout and Context Understanding for Image Synthesis with Scene Graphs](https://www.researchgate.net/publication/335538931_Layout_and_Context_Understanding_for_Image_Synthesis_with_Scene_Graphs) | ICIP | - |
| [Triplet-Aware Scene Graph Embeddings](https://arxiv.org/pdf/1909.09256v1.pdf) | ICCV | - |
| [Scene Graph based Image Retrieval -- A case study on the CLEVR Dataset](https://arxiv.org/pdf/1911.00850.pdf) | ICCV | - |
| [Detecting Visual Relationships Using Box Attention](https://arxiv.org/pdf/1807.02136.pdf) | ICCV | - |
| [Differentiable Scene Graphs](https://arxiv.org/pdf/1902.10200v2.pdf) | ICCV | - |
| [VrR-VG: Refocusing Visually-Relevant Relationships](https://arxiv.org/pdf/1902.00313.pdf) | ICCV | - |
| [Unpaired Image Captioning via Scene Graph Alignments](https://arxiv.org/pdf/1903.10658.pdf) | ICCV | - |
| [Scene Graph Prediction with Limited Labels](https://arxiv.org/pdf/1904.11622v1.pdf) | ICCV | - |
| [Seq-SG2SL: Inferring Semantic Layout from Scene Graph through Sequence to Sequence Learning](https://arxiv.org/pdf/1908.06592.pdf) | ICCV | - |
| [Counterfactual Critic Multi-Agent Training for Scene Graph Generation](https://arxiv.org/pdf/1812.02347.pdf) | ICCV | - |
| [3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera](https://arxiv.org/pdf/1910.02527.pdf) | ICCV | - |
| [Auto-Encoding Scene Graphs for Image Captioning](https://arxiv.org/pdf/1812.02378.pdf) | CVPR | - |
| [Compact Scene Graphs for Layout Composition and Patch Retrieval](https://arxiv.org/pdf/1904.09348.pdf) | CVPR | - |
| [Exploring Context and Visual Pattern of Relationship for Scene Graph Generation](https://www.researchgate.net/publication/338512917_Exploring_Context_and_Visual_Pattern_of_Relationship_for_Scene_Graph_Generation) | CVPR | - |
| [Explainable and Explicit Visual Reasoning over Scene Graphs](https://arxiv.org/pdf/1812.01855.pdf) | CVPR | - |
| [Learning to Compose Dynamic Tree Structures for Visual Contexts](https://arxiv.org/pdf/1812.01880.pdf) | CVPR | - |
| [Image Generation from Layout](https://arxiv.org/pdf/1811.11389v2.pdf) | CVPR | - |
| [Scene Graph Generation with External Knowledge and Image Reconstruction](https://arxiv.org/pdf/1904.00560.pdf) | CVPR | - |
| [Graphical Contrastive Losses for Scene Graph Parsing](https://arxiv.org/pdf/1903.02728.pdf) | CVPR | - |
| [Knowledge-Embedded Routing Network for Scene Graph Generation](https://arxiv.org/pdf/1903.03326v1.pdf) | CVPR | - |
| [Attentive Relational Networks for Mapping Images to Scene Graphs](https://arxiv.org/pdf/1811.10696v1.pdf) | CVPR | - |
| [A hierarchical recurrent approach to predict scene graphs from a visual-attention-oriented perspective](https://www.researchgate.net/publication/331969285_A_hierarchical_recurrent_approach_to_predict_scene_graphs_from_a_visual-attention-oriented_perspective) | CI | - |
| [An Empirical Study on Leveraging Scene Graphs for Visual Question Answering](https://arxiv.org/pdf/1907.12133.pdf) | BMVC | - |
| [Using Scene Graph Context to Improve Image Generation](https://arxiv.org/pdf/1901.03762.pdf) | arXiv | - |
| [Neural-Symbolic Tensor Product Scene-Graph-Triplet Representation for Image Captioning](https://arxiv.org/pdf/1911.10115.pdf) | arXiv | - |
| [Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators](https://arxiv.org/pdf/1909.09953.pdf) | arXiv | - |
| [Learning Predicates as Functions to Enable Few-shot Scene Graph Prediction](https://arxiv.org/pdf/1906.04876.pdf) | arXiv | - |
| [Learning Canonical Representations for Scene Graph to Image Generation](https://arxiv.org/pdf/1912.07414.pdf) | arXiv | - |
| [Interactive Image Generation Using Scene Graphs](https://arxiv.org/pdf/1905.03743.pdf) | arXiv | - |
| [Generating Natural Language Explanations for Visual Question Answering Using Scene Graphs and Visual Attention](https://arxiv.org/pdf/1902.05715.pdf) | arXiv | - |
| [Action Genome: Actions as Composition of Spatio-temporal Scene Graphs](https://arxiv.org/pdf/1912.06992.pdf) | arXiv | - |
| [Large-Scale Visual Relationship Understanding](https://arxiv.org/pdf/1804.10660v2.pdf) | AAAI | - |
| [BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection](https://arxiv.org/pdf/1902.00038.pdf) | AAAI | - |

### 2018

| Title | Venue | Code |
|:--------|:--------:|:--------:|
| [Scaling Human-Object Interaction Recognition through Zero-Shot Learning](http://vision.stanford.edu/pdf/shen2018wacv.pdf) | WACV | - |
| [Learning to Detect Human-Object Interactions](https://arxiv.org/pdf/1702.05448.pdf) | WACV | - |
| [Scene Graph Parsing by Attention Graph](https://arxiv.org/pdf/1909.06273v1.pdf) | ViGIL | - |
| [Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction](https://arxiv.org/pdf/1802.05451.pdf) | NIPS | - |
| [LinkNet: Relational Embedding for Scene Graph](https://arxiv.org/pdf/1811.06410.pdf) | NIPS | - |
| [An Interpretable Model for Scene Graph Generation](https://arxiv.org/pdf/1811.09543v1.pdf) | NIPS | - |
| [Scene Graph Parsing as Dependency Parsing](https://arxiv.org/pdf/1803.09189.pdf) | NAACL | - |
| [Representation Learning for Scene Graph Completion via Jointly Structural and Visual Embedding](https://www.ijcai.org/Proceedings/2018/132) | IJCAI | - |
| [Narrative Collage of Image Collections by Scene Graph Recombination](https://ieeexplore.ieee.org/document/8057796) | TVCG | - |
| [Image Captioning with Scene-graph Based Semantic Concepts](https://dl.acm.org/doi/10.1145/3195106.3195114) | ICMLC | - |
| [Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Disti](https://arxiv.org/pdf/1805.10802.pdf) | ICME | - |
| [Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition](https://arxiv.org/pdf/1807.04979.pdf) | ECCV | - |
| [Learning Human-Object Interactions by Graph Parsing Neural Networks](https://arxiv.org/pdf/1808.07962.pdf) | ECCV | - |
| [Graph R-CNN for Scene Graph Generation](https://arxiv.org/pdf/1808.00191.pdf) | ECCV | - |
| [Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation](https://arxiv.org/pdf/1806.11538.pdf) | ECCV | - |
| [Actor-Centric Relation Network](https://arxiv.org/pdf/1807.10982.pdf) | ECCV | - |
| [Tensorize, Factorize and Regularize: Robust Visual Relationship Learning](https://www.researchgate.net/publication/329382255_Tensorize_Factorize_and_Regularize_Robust_Visual_Relationship_Learning) | CVPR | - |
| [Neural Motifs: Scene Graph Parsing with Global Context](https://arxiv.org/pdf/1711.06640.pdf) | CVPR | - |
| [Image Generation from Scene Graphs](https://arxiv.org/pdf/1804.01622.pdf) | CVPR | - |
| [Detecting and Recognizing Human-Object Interactions](https://arxiv.org/pdf/1704.07333.pdf) | CVPR | - |
| [Image Understanding using vision and reasoning through Scene Description](http://www.public.asu.edu/~cbaral/papers/2017CVIU.pdf) | CVIU | - |
| [Visual Social Relationship Recognition](https://arxiv.org/pdf/1812.05917.pdf) | arXiv | - |
| [Transferable Interactiveness Prior for Human-Object Interaction Detection](https://arxiv.org/pdf/1811.08264v1.pdf) | arXiv | - |
| [Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering](https://arxiv.org/pdf/1812.09681v2.pdf) | arXiv | - |
| [Scene Graph Generation via Conditional Random Fields](https://arxiv.org/pdf/1811.08075.pdf) | arXiv | - |
| [Detecting unseen visual relations using analogies](https://arxiv.org/pdf/1812.05736.pdf) | arXiv | - |
| [Context-Dependent Diffusion Network for Visual Relationship Detection](https://arxiv.org/pdf/1809.06213.pdf) | AMC | - |
| [Scene-centric Joint Parsing of Cross-view Videos](http://arxiv.org/pdf/1709.05436) | AAAI | - |
| [HCVRD: A Benchmark for Large-Scale Human-Centered Visual Relationship Detecti](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/download/16444/16362) | AAAI | - |
| [Generating Triples with Adversarial Networks for Scene Graph Construction](https://arxiv.org/pdf/1802.02598.pdf) | AAAI | - |

### 2017

| Title | Venue | Code |
|:--------|:--------:|:--------:|
| [Pixels to Graphs by Associative Embedding](https://arxiv.org/pdf/1706.07365.pdf) | NIPS | - |
| [On support relations and semantic scene graphs](https://arxiv.org/pdf/1609.05834.pdf) | ISPRS | - |
| [Joint Embeddings of Scene Graphs and Images](https://openreview.net/pdf?id=BkyScySKl) | ICLR | - |
| [Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation](https://arxiv.org/pdf/1707.09423.pdf) | ICCV | - |
| [Scene Graph Generation from Objects, Phrases and Region Captions](https://arxiv.org/pdf/1707.09700.pdf) | ICCV | - |
| [Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues](https://arxiv.org/pdf/1611.06641.pdf) | ICCV | - |
| [Visual Translation Embedding Network for Visual Relation Detection](http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhang_Visual_Translation_Embedding_CVPR_2017_paper.pdf) | CVPR | - |
| [ViP-CNN: Visual Phrase Guided Convolutional Neural Network](https://arxiv.org/pdf/1702.07191.pdf) | CVPR | - |
| [Scene Graph Generation by Iterative Message Passing](https://arxiv.org/pdf/1701.02426.pdf) | CVPR | - |
| [Relationship Proposal Networks](http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhang_Relationship_Proposal_Networks_CVPR_2017_paper.pdf) | CVPR | - |
| [Modeling Relationships in Referential Expressions with Compositional Modular Networks](https://arxiv.org/pdf/1611.09978.pdf) | CVPR | - |
| [Learning Object Interactions and Descriptions for Semantic Image Segmentation](http://openaccess.thecvf.com/content_cvpr_2017/papers/Wang_Learning_Object_Interactions_CVPR_2017_paper.pdf) | CVPR | - |
| [Detecting Visual Relationships with Deep Relational Networks](https://arxiv.org/pdf/1704.03114.pdf) | CVPR | - |
| [Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection](https://arxiv.org/pdf/1703.03054.pdf) | CVPR | - |
| [Online Cross-Modal Scene Retrieval by Binary Representation and Semantic Graph](https://www.onacademic.com/detail/journal_1000040105998010_af53.html) | ACM | - |

### 2010-2016

| Title | Venue | Code |
|:--------|:--------:|:--------:|
| [Visual Relationship Detection with Language Priors](http://arxiv.org/pdf/1608.00187) | ECCV | - |
| [Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval](http://www.cs.cmu.edu/~ark/EMNLP-2015/proceedings/VL/pdf/VL12.pdf) | WVL | - |
| [Image Retrieval using Scene Graphs](https://www.cv-foundation.org/openaccess/content_cvpr_2015/app/2B_036_ext.pdf) | CVPR | - |
| [From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge](http://arxiv.org/pdf/1511.03292) | arXiv | - |
| [Categorizing Object-Action Relations from Semantic Scene Graphs](https://core.ac.uk/download/pdf/36040546.pdf) | ICRA | - |