├── README.md ├── get_hot.py └── scholar.py /README.md: -------------------------------------------------------------------------------- 1 | # awesome-self-supervised-gnn 2 | 3 | ![PRs Welcome](https://img.shields.io/badge/PRs-Welcome-green) [![Awesome](https://awesome.re/badge.svg)](https://awesome.re) ![Stars](https://img.shields.io/github/stars/ChandlerBang/awesome-self-supervised-gnn?color=yellow) ![Forks](https://img.shields.io/github/forks/ChandlerBang/awesome-self-supervised-gnn?color=blue&label=Fork) 4 | 5 | This repository lists papers on **Self-supervised Learning on Graph Neural Networks (GNNs)**, categorized by publication year. 6 | 7 | We will try to keep this list up to date. If you find an error or a missing paper, please don't hesitate to open an issue or a pull request. 8 | 9 | Note: :fire: indicates that the paper is extensively cited (e.g., > 80 citations). The code used to flag these papers is provided in `get_hot.py`. 10 | 11 | ## Year 2024 12 | 1. [ICASSP 2024] **Contrastive Deep Nonnegative Matrix Factorization for Community Detection** [[paper]](https://arxiv.org/abs/2311.02357) [[code]](https://github.com/6lyc/CDNMF) 13 | 14 | ## Year 2023 15 | 1. [ICLR 2023] **Empowering Graph Representation Learning with Test-Time Graph Transformation** [[paper]](https://openreview.net/forum?id=Lnxl5pr018) [[code]](https://github.com/ChandlerBang/GTrans) 16 | 1. [ICLR 2023] **Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization** [[paper]](https://arxiv.org/pdf/2210.02016.pdf) [[code]](https://github.com/jumxglhf/ParetoGNN) 17 | 1. [AAAI 2023] **Eliciting Structural and Semantic Global Knowledge in Unsupervised Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2202.08480) [[code]](https://github.com/kaize0409/S-3-CL) 18 | 1. [arXiv 2023] **Truncated Affinity Maximization: One-class Homophily Modeling for Graph Anomaly Detection** [[paper]](https://arxiv.org/pdf/2306.00006.pdf) 19 | 1. 
[ICASSP 2023] **Contrastive Learning at the Relation and Event Level for Rumor Detection** [[paper]](https://ieeexplore.ieee.org/abstract/document/10096567) 20 | 1. [arXiv 2023] **AmGCL: Feature Imputation of Attribute Missing Graph via Self-supervised Contrastive Learning** [[paper]](https://arxiv.org/pdf/2305.03741.pdf) 21 | 1. [arXiv 2023] **SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2305.04501.pdf) 22 | 1. [arXiv 2023] **CSGCL: Community-Strength-Enhanced Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2305.04658.pdf) 23 | 1. [TKDE 2023] **MINING: Multi-Granularity Network Alignment Based on Contrastive Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/10120956) 24 | 1. [ICASSP 2023] **Select The Best: Enhancing Graph Representation with Adaptive Negative Sample Selection** [[paper]](https://ieeexplore.ieee.org/abstract/document/10095586) 25 | 1. [ICASSP 2023] **Graph Contrastive Learning with Learnable Graph Augmentation** [[paper]](https://ieeexplore.ieee.org/abstract/document/10095511) 26 | 1. [arXiv 2023] **FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction** [[paper]](https://arxiv.org/pdf/2305.02549.pdf) 27 | 1. [INS 2023] **A fairness-aware graph contrastive learning recommender framework for social tagging systems** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025523006497) 28 | 1. [arXiv 2023] **Improving Knowledge Graph Entity Alignment with Graph Augmentation** [[paper]](https://arxiv.org/pdf/2304.14585.pdf) 29 | 1. [WWW 2023] **Graph Self-supervised Learning with Augmentation-aware Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3543507.3583246) 30 | 1. [arXiv 2023] **A Systematic Survey of Chemical Pre-trained Models** [[paper]](https://sxkdz.github.io/files/publications/IJCAI/CPM/CPM.pdf) 31 | 1. 
[WWW 2023] **Self-Supervised Teaching and Learning of Representations on Graphs** [[paper]](https://dl.acm.org/doi/abs/10.1145/3543507.3583441) 32 | 1. [TKDE 2023] **Progressive Hard Negative Masking: From Global Uniformity to Local Tolerance** [[paper]](https://ieeexplore.ieee.org/abstract/document/10111083) 33 | 1. [KBS 2023] **ST-A-PGCL: Spatiotemporal adaptive periodical graph contrastive learning for traffic prediction under real scenarios** [[paper]](https://www.sciencedirect.com/science/article/pii/S0950705123003416) 34 | 1. [WWW 2023] **SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking** [[paper]](https://dl.acm.org/doi/abs/10.1145/3543507.3583245) 35 | 1. [INS 2023] **Self-supervised Contrastive Learning on Heterogeneous Graphs with Mutual Constraints of Structure and Feature** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025523006114) 36 | 1. [Scientific Reports 2023] **A multi-view contrastive learning for heterogeneous network embedding** [[paper]](https://www.nature.com/articles/s41598-023-33324-7) 37 | 1. [WWW 2023] **Automated Spatio-Temporal Graph Contrastive Learning** [[paper]](https://zhengwang125.github.io/paper/STGCL_WWW23.pdf) 38 | 1. [arXiv 2023] **Capturing Fine-grained Semantics in Contrastive Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2304.11658.pdf) 39 | 1. [arXiv 2023] **Decouple Graph Neural Networks: Train Multiple Simple GNNs Simultaneously Instead of One** [[paper]](https://arxiv.org/pdf/2304.10126.pdf) 40 | 1. [arXiv 2023] **ID-MixGCL: Identity Mixup for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2304.10045.pdf) 41 | 1. 
[Bioinformatics 2023] **Molecular Property Prediction by Contrastive Learning with Attention-Guided Positive Sample Selection** [[paper]](https://watermark.silverchair.com/btad258.pdf?token=AQECAHi208BE49Ooan9kkhW_Ercy7Dm3ZL_9Cf3qfKAc485ysgAAAwcwggMDBgkqhkiG9w0BBwagggL0MIIC8AIBADCCAukGCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMdupp3nabpyWrY1TvAgEQgIICuslU3gktfD9EQ9YOuajKd5nL5RNR0eI5eAOngtpUfOUcqcGOQONeb7Lznmgz8twSmMS13_U5bKR6FRKpce_1s9teGI5K7J6JLdx_sHrlBZGP8m1xMzk7soYc8pGHVsgbKwusPR5rkaRd-JykOSM3eIn_5IQgqqJ2RYmtcymvywcuGV1tA__M44XepfMuzHcC9q5h8NuWaWXmMzode9nlFyO0eacGBbSG8zvaH97K65aD734tbaUW60Do6fS_5yq9kRMFV3EPqnJwJ0iJ72o3ZFSNBjxb2yDH1kd_TZbkmio6LC6ZH8mrubOKxGDhrzjruSEpe1Fs54BzZfrqrGbmv8LB9sWxbSXAitKbMGnFb1WxyBF6cyB9g1AyqGYJEMr7HM7yBC9UOmff_s1kH-Avd_L8ZfzyhVqDvUyIgJc39Nlw6Eju3stlDuKMIwwWBI6qWHkc_nEd_0u7n1ssxbBydo63PZKmNbtsq36l7wN0goc_sWYXy9AyMu0ROFNLfWSe6n6k_u7DIyRlm7GPzOrx3CEaCWq_8uw1Pkvygflhz4aktGzWUBxodPezX4ToO2_9Q7IP9IjccsCI_zcr38C3EaHhtZf4yXFCowrL7C7MOLq9yo_9huTv3UJ_qq0dL7UCnJgrkI0kK7pkljnSu2gd0iuxwftCnphrXiw79xJwVUXTvbWKe_xxoh_XHllwhztCmPFYFbmwB-1A2gYpWq2fnNl7LxxvnioJCuoz9mwaFXN6tLwCCPkZa-GdakTaoHoU30JGMvrgdyhhFU30mUN5NOyWaoOLcqFLy8y-mO_V07uUGmMkS3SHM0j-qYEdjVEddM7QxbW5JW28EkL3L97BWaBohCHcj0jiS7pzteOwzZ4e3WWhghFX1pDGeFvvhzv5xCobn5TPFV1N9qk7I7QrEZSjAg1epeLNvohj) 42 | 1. [AISTAT 2023] **Learning Robust Graph Neural Networks with Limited Supervision** [[paper]](https://proceedings.mlr.press/v206/alchihabi23a/alchihabi23a.pdf) 43 | 1. [TNNLS 2023] **Demystifying and Mitigating Bias for Node Representation Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/10103678) 44 | 1. [BICTA 2023] **Graph Contrastive Learning with Intrinsic Augmentations** [[paper]](https://link.springer.com/chapter/10.1007/978-981-99-1549-1_27) 45 | 1. [arXiv 2023] **GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner** [[paper]](https://epubs.siam.org/doi/pdf/10.1137/1.9781611977653.ch19) 46 | 1. 
[arXiv 2023] **Adversarial Hard Negative Generation for Complementary Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2304.04779.pdf) 47 | 1. [INS 2023] **INS-GNN: Improving Graph Imbalance Learning with Self-Supervision** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025523005042) 48 | 1. [TNNLS 2023] **Dual Contrastive Learning Network for Graph Clustering** [[paper]](https://ieeexplore.ieee.org/abstract/document/10097557) 49 | 1. [arXiv 2023] **RARE: Robust Masked Graph Autoencoder** [[paper]](https://arxiv.org/pdf/2304.01507.pdf) 50 | 1. [TKDE 2023] **Maximizing Mutual Information Across Feature and Topology Views for Representing Graphs** [[paper]](https://ieeexplore.ieee.org/abstract/document/10093032) 51 | 1. [arXiv 2023] **When to Pre-Train Graph Neural Networks? An Answer from Data Generation Perspective!** [[paper]](https://arxiv.org/abs/2303.16458) 52 | 1. [KBS 2023] **Class-homophilic-based data augmentation for improving graph neural networks** [[paper]](https://www.sciencedirect.com/science/article/pii/S095070512300268X) 53 | 1. [arXiv 2023] **Structural Imbalance Aware Graph Augmentation Learning** [[paper]](https://arxiv.org/pdf/2303.13757.pdf) 54 | 1. [arXiv 2023] **Hybrid Augmented Automated Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2303.15182.pdf) 55 | 1. [arXiv 2023] **Decoupling Graph Neural Network with Contrastive Learning for Fraud Detection** [[paper]](https://linmengsysu.github.io/slides/main.pdf) 56 | 1. [arXiv 2023] **Data-Centric Learning from Unlabeled Graphs with Diffusion Model** [[paper]](https://arxiv.org/pdf/2303.10108.pdf) 57 | 1. [TPAMI 2023] **Unsupervised Learning of Graph Matching With Mixture of Modes Via Discrepancy Minimization** [[paper]](https://ieeexplore.ieee.org/abstract/document/10073537) 58 | 1. [arXiv 2023] **NESS: Learning Node Embeddings from Static SubGraphs** [[paper]](https://arxiv.org/pdf/2303.08958.pdf) 59 | 1. 
[Sensors 2023] **A Robust Automated Analog Circuits Classification Involving a Graph Neural Network and a Novel Data Augmentation Strategy** [[paper]](https://www.mdpi.com/1424-8220/23/6/2989) 60 | 1. [arXiv 2023] **Contrastive knowledge integrated graph neural networks for Chinese medical text classification** [[paper]](https://www.sciencedirect.com/science/article/pii/S0952197623002415) 61 | 1. [arXiv 2023] **CHGNN: A Semi-Supervised Contrastive Hypergraph Learning Network** [[paper]](https://arxiv.org/pdf/2303.06213.pdf) 62 | 1. [arXiv 2023] **Contrastive Learning under Heterophily** [[paper]](https://arxiv.org/pdf/2303.06344.pdf) 63 | 1. [arXiv 2023] **Structure-Aware Group Discrimination with Adaptive-View Graph Encoder: A Fast Graph Contrastive Learning Framework** [[paper]](https://arxiv.org/pdf/2303.05231.pdf) 64 | 1. [TNNLS 2023] **Self-supervised Learning IoT Device Features with Graph Contrastive Neural Network for Device Classification in Social Internet of Things** [[paper]](https://ieeexplore.ieee.org/abstract/document/10059194) 65 | 1. [TKDE 2023] **Feature-Level Deeper Self-Attention Network With Contrastive Learning for Sequential Recommendation** [[paper]](https://ieeexplore.ieee.org/abstract/document/10059216) 66 | 1. [AAAI 2023] **Recommend What to Cache: a Simple Self-supervised Graph-based Recommendation Framework for Edge Caching Network** [[paper]](https://arxiv.org/pdf/2302.14438.pdf) 67 | 1. [arXiv 2023] **Self-Supervised Interest Transfer Network via Prototypical Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/pdf/2302.14438.pdf) 68 | 1. [arXiv 2023] **SGL-PT: A Strong Graph Learner with Graph Prompt Tuning** [[paper]](https://arxiv.org/pdf/2302.12449.pdf) 69 | 1. [CIS 2023] **SimGRL: a simple self-supervised graph representation learning framework via triplets** [[paper]](https://link.springer.com/article/10.1007/s40747-023-00997-6) 70 | 1. 
[WSDM 2023] **Self-Supervised Group Graph Collaborative Filtering for Group Recommendation** [[paper]](https://dl.acm.org/doi/abs/10.1145/3539597.3570400) 71 | 1. [WSDM 2023] **S2GAE: Self-Supervised Graph Autoencoders are Generalizable Learners with Graph Masking** [[paper]](https://dl.acm.org/doi/abs/10.1145/3539597.3570404) 72 | 1. [WSDM 2023] **Heterogeneous Graph Contrastive Learning for Recommendation** [[paper]](https://dl.acm.org/doi/abs/10.1145/3539597.3570484) 73 | 1. [Nature Communications Chemistry] **Hierarchical Molecular Graph Self-Supervised Learning for property prediction** [[paper]](https://www.nature.com/articles/s42004-023-00825-5) 74 | 1. [arXiv 2023] **Wiener Graph Deconvolutional Network Improves Graph Self-Supervised Learning** [[paper]](https://www.researchgate.net/profile/Jia-Li-127/publication/368543822_Wiener_Graph_Deconvolutional_Network_Improves_Graph_Self-Supervised_Learning/links/63edebc419130a1a4a830593/Wiener-Graph-Deconvolutional-Network-Improves-Graph-Self-Supervised-Learning.pdf) 75 | 1. [arXiv 2023] **Heterogeneous Social Event Detection via Hyperbolic Graph Representations** [[paper]](https://arxiv.org/pdf/2302.10362.pdf) 76 | 1. [arXiv 2023] **LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/pdf/2302.08191.pdf) 77 | 1. [arXiv 2023] **GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2302.08043.pdf) 78 | 1. [Pattern Recognition] **Dual-Channel Graph Contrastive Learning for Self-Supervised Graph-Level Representation Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S0031320323001486) 79 | 1. [NCA 2023] **Self-supervised contrastive learning for heterogeneous graph based on multi-pretext tasks** [[paper]](https://link.springer.com/article/10.1007/s00521-023-08234-4) 80 | 1. 
[arXiv 2023] **STERLING: Synergistic Representation Learning on Bipartite Graphs** [[paper]](https://arxiv.org/pdf/2302.05428.pdf) 82 | 1. [WBD 2023] **Mixed-Order Heterogeneous Graph Pre-training for Cold-Start Recommendation** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-25201-3_14) 83 | 1. [arXiv 2023] **Explainable Action Prediction through Self-Supervision on Scene Graphs** [[paper]](https://arxiv.org/pdf/2302.03477.pdf) 84 | 1. [arXiv 2023] **Spectral Augmentations for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2302.02909.pdf) 85 | 1. [RS 2023] **Representing Spatial Data with Graph Contrastive Learning** [[paper]](https://urldefense.com/v3/__https://scholar.google.com/scholar_url?url=https:**Awww.mdpi.com*2072-4292*15*4*880*pdf&hl=en&sa=X&d=18081949848644790374&ei=UtHkY-wUjdbJBK-AnIgN&scisig=AAGBfm2HRbUL2s5kW_fO96HIgBt-0lesJg&oi=scholaralrt&hist=Pv-V2igAAAAJ:16610178827432183357:AAGBfm3PSUTRAat5lSIOYWJJQSKiKvjk4g&html=&pos=1&folt=cit__;Ly8vLy8vLw!!KwNVnqRv!DcYtDY-xLzHkhx2yQ32kw_CetJ1VrPiy0H9Hilie6oEU0a9OMbDAWoV9kq6mhcDPope5FTQwyDvFJ1YT8B6R9su2t7P1Rg$) 86 | 1. [ACLF 2023] **KE-GCL: Knowledge Enhanced Graph Contrastive Learning for Commonsense Question Answering** [[paper]](https://aclanthology.org/2022.findings-emnlp.6.pdf) 87 | 1. [TNNLS 2023] **GRLC: Graph Representation Learning With Constraints** [[paper]](https://ieeexplore.ieee.org/abstract/document/10036344) 88 | 1. [ESA 2023] **Contrastive graph clustering with adaptive filter** [[paper]](https://www.sciencedirect.com/science/article/pii/S095741742300146X) 89 | 1. [arXiv 2023] **Biomedical Interaction Prediction with Adaptive Line Graph Contrastive Learning** [[paper]](https://www.mdpi.com/2227-7390/11/3/732) 90 | 1. 
[arXiv 2023] **Affinity Uncertainty-based Hard Negative Mining in Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2301.13340.pdf) 91 | 1. [arXiv 2023] **Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking** [[paper]](https://arxiv.org/pdf/2301.12458.pdf) 92 | 1. [ACM Trans. Web 2023] **Contrastive Graph Similarity Networks** [[paper]](https://dl.acm.org/doi/pdf/10.1145/3580511) 93 | 1. [ICBD 2023] **Predictive Masking for Semi-Supervised Graph Contrastive Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/10020970) 94 | 1. [TNNLS 2023] **Graph Representation Learning With Adaptive Metric** [[paper]](https://ieeexplore.ieee.org/abstract/document/10025823) 95 | 1. [RAL 2023] **Self-Supervised Local Topology Representation for Random Cluster Matching** [[paper]](https://ieeexplore.ieee.org/abstract/document/10021967) 96 | 1. [KBS 2023] **CrysGNN: Distilling pre-trained knowledge to enhance property prediction for crystalline materials** [[paper]](https://arxiv.org/pdf/2301.05852.pdf) 97 | 1. [Entropy 2023] **A Semantic-Enhancement-Based Social Network User-Alignment Algorithm** [[paper]](https://www.mdpi.com/1099-4300/25/1/172) 98 | 1. [KBS 2023] **Cross-view temporal graph contrastive learning for session-based recommendation** [[paper]](https://www.sciencedirect.com/science/article/pii/S0950705123000540) 99 | 1. [PR 2023] **Robust Image Clustering via Context-aware Contrastive Graph Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S0031320323000419) 100 | 1. [ICMLCS 2023] **AP-GCL: Adversarial Perturbation on Graph Contrastive Learning** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-20096-0_47) 101 | 1. [arXiv 2023] **Signed Directed Graph Contrastive Learning with Laplacian Augmentation** [[paper]](https://arxiv.org/pdf/2301.05163.pdf) 102 | 1. 
[OJCS 2023] **SC-FGCL: Self-adaptive Cluster-based Federal Graph Contrastive Learning** [[paper]](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10015148) 103 | 1. [BIB 2023] **CasANGCL: pre-training and fine-tuning model based on cascaded attention network and graph contrastive learning for molecular property prediction** [[paper]](https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbac566/6966532) 104 | 1. [AAAI 2023] **Spectral Feature Augmentation for Graph Contrastive Learning and Beyond** [[paper]](https://arxiv.org/abs/2212.01026) 105 | 1. [Entropy 2023] **Self-Supervised Node Classification with Strategy and Actively Selected Labeled Set** [[paper]](https://urldefense.com/v3/__https://scholar.google.com/scholar_url?url=https:**Awww.mdpi.com*1099-4300*25*1*30*pdf&hl=en&sa=X&d=13649462741514245070&ei=66yqY9q-NY_mmgHdka7oCw&scisig=AAGBfm0m2E6wg_90swKhBWYDrZsXMBr2kA&oi=scholaralrt&hist=Pv-V2igAAAAJ:16610178827432183357:AAGBfm3PSUTRAat5lSIOYWJJQSKiKvjk4g&html=&pos=0&folt=cit__;Ly8vLy8vLw!!KwNVnqRv!FbRTWxTuNHDzvvuiJFFzysRQQ3C08EMs3qJTdLHxTA4E2WK7FjMv32fbi6T1irhYspBlmsafx0xexY4FKuao4dHXv3q7hw$) 106 | 107 | ## Year 2022 108 | 1. [NeurIPS 2022] **Generalized Laplacian Eigenmaps** [[paper]](https://openreview.net/forum?id=HjicdpP-Nth) 109 | 1. [KDD 2022] **COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3534678.3539425) 110 | 1. [ITBE 2022] **Contrastive Multi-view Composite Graph Convolutional Networks Based on Contribution Learning for Autism Spectrum Disorder Classification** [[paper]](https://ieeexplore.ieee.org/abstract/document/9999336) 111 | 1. [IEEE Access 2022] **ROME: A Graph Contrastive Multi-view Framework from Hyperbolic Angular Space for MOOCs Recommendation** [[paper]](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10001755) 112 | 1. 
[arXiv 2022] **Heterogeneous Graph Contrastive Learning with Meta-path Contexts and Weighted Negative Samples** [[paper]](https://arxiv.org/pdf/2212.13847.pdf) 113 | 1. [arXiv 2022] **MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning** [[paper]](https://arxiv.org/pdf/2212.10614.pdf) 114 | 1. [arXiv 2022] **Toward Improved Generalization: Meta Transfer of Self-supervised Knowledge on Graphs** [[paper]](https://arxiv.org/pdf/2212.08217.pdf) 115 | 1. [arXiv 2022] **Coarse-to-Fine Contrastive Learning on Graphs** [[paper]](https://arxiv.org/pdf/2212.06423.pdf) 116 | 1. [arXiv 2022] **MA-GCL: Model Augmentation Tricks for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2212.07035.pdf) 117 | 1. [arXiv 2022] **Mul-GAD: a semi-supervised graph anomaly detection framework via aggregating multi-view information** [[paper]](https://arxiv.org/pdf/2212.05478.pdf) 118 | 1. [arXiv 2022] **Localized Contrastive Learning on Graphs** [[paper]](https://arxiv.org/pdf/2212.04604.pdf) 119 | 1. [arXiv 2022] **Alleviating neighbor bias: augmenting graph self-supervise learning with structural equivalent positive samples** [[paper]](https://arxiv.org/pdf/2212.04365.pdf) 120 | 1. [arXiv 2022] **Self-supervised Graph Representation Learning for Black Market Account Detection** [[paper]](https://arxiv.org/pdf/2212.02679.pdf) 121 | 1. [arXiv 2022] **Contrastive Deep Graph Clustering with Learnable Augmentation** [[paper]](https://arxiv.org/pdf/2212.03559.pdf) 122 | 1. [arXiv 2022] **Graph Anomaly Detection via Multi-Scale Contrastive Learning Networks with Augmented View** [[paper]](https://arxiv.org/pdf/2212.00535.pdf) 123 | 1. [arXiv 2022] **Self Supervised Clustering of Traffic Scenes using Graph Representations** [[paper]](https://arxiv.org/pdf/2211.15508.pdf) 124 | 1. [arXiv 2022] **Graph Contrastive Learning for Materials** [[paper]](https://arxiv.org/pdf/2211.13408.pdf) 125 | 1. 
[arXiv 2022] **Link Prediction with Non-Contrastive Learning** [[paper]](https://arxiv.org/pdf/2211.14394.pdf) 126 | 1. [IJMIR 2022] **TCKGE: Transformers with contrastive learning for knowledge graph embedding** [[paper]](https://link.springer.com/article/10.1007/s13735-022-00256-3) 127 | 1. [arXiv 2022] **Beyond Smoothing: Unsupervised Graph Representation Learning with Edge Heterophily Discriminating** [[paper]](https://arxiv.org/pdf/2211.14065.pdf) 128 | 1. [Neural Networks 2022] **Unsupervised graph-level representation learning with hierarchical contrasts** [[paper]](https://www.sciencedirect.com/science/article/pii/S0893608022004609) 129 | 1. [arXiv 2022] **Relation-dependent Contrastive Learning with Cluster Sampling for Inductive Relation Prediction** [[paper]](https://arxiv.org/pdf/2211.12266.pdf) 130 | 1. [arXiv 2022] **Relational Symmetry based Knowledge Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2211.10738.pdf) 131 | 1. [arXiv 2022] **Towards Generalizable Graph Contrastive Learning: An Information Theory Perspective** [[paper]](https://arxiv.org/pdf/2211.10929.pdf) 132 | 1. [arXiv 2022] **Can Single-Pass Contrastive Learning Work for Both Homophilic and Heterophilic Graph?** [[paper]](https://arxiv.org/pdf/2211.10890.pdf) 133 | 1. [SIGSPATIAL 2022] **When Do Contrastive Learning Signals Help Spatio-Temporal Graph Forecasting?** [[paper]](http://urban-computing.com/pdf/STGCL_SIGSPATIAL_22.pdf) 134 | 1. [Scientific Reports 2022] **Deep graph level anomaly detection with contrastive learning** [[paper]](https://www.nature.com/articles/s41598-022-22086-3) 135 | 1. [TII 2022] **Semi-supervised machine fault diagnosis fusing unsupervised graph contrastive learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/9944187) 136 | 1. [KBS 2022] **SMGCL: Semi-supervised Multi-view Graph Contrastive Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S0950705122012163) 137 | 1. 
[arXiv 2022] **Unsupervised Graph Contrastive Learning with Data Augmentation for Malware Classification** [[paper]](https://www.researchgate.net/profile/Yun-Gao-48/publication/365275847_Unsupervised_Graph_Contrastive_Learning_with_Data_Augmentation_for_Malware_Classification/links/636cec632f4bca7fd04b9a26/Unsupervised-Graph-Contrastive-Learning-with-Data-Augmentation-for-Malware-Classification.pdf) 138 | 1. [IJCRS 2022] **Multi-scale Subgraph Contrastive Learning for Link Prediction** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-21244-4_16) 139 | 1. [arXiv 2022] **Flaky Performances when Pretraining on Relational Databases** [[paper]](https://arxiv.org/pdf/2211.05213.pdf) 140 | 1. [arXiv 2022] **GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection** [[paper]](https://arxiv.org/pdf/2211.04208.pdf) 141 | 1. [ATKDD 2022] **Ada-MIP: Adaptive Self-supervised Graph Representation Learning via Mutual Information and Proximity Optimization** [[paper]](https://dl.acm.org/doi/pdf/10.1145/3568165) 142 | 1. [arXiv 2022] **Graph Contrastive Learning with Implicit Augmentations** [[paper]](https://arxiv.org/pdf/2211.03710.pdf) 143 | 1. [Information Sciences 2022] **Contrastive Graph Neural Network-based Camouflaged Fraud Detector** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025522011926) 144 | 1. [arXiv 2022] **DyG2Vec: Representation Learning for Dynamic Graphs with Self-Supervision** [[paper]](https://arxiv.org/pdf/2210.16906.pdf) 145 | 1. [arXiv 2022] **Federated Graph Representation Learning using Self-Supervision** [[paper]](https://arxiv.org/pdf/2210.15120.pdf) 146 | 1. [arXiv 2022] **Benchmark of Self-supervised Graph Neural Networks** [[paper]](https://aaltodoc.aalto.fi/bitstream/handle/123456789/116441/master_Wang_Haishan_2022.pdf?sequence=2) 147 | 1. [arXiv 2022] **Line Graph Contrastive Learning for Link Prediction** [[paper]](https://arxiv.org/pdf/2210.13795.pdf) 148 | 1. 
[TDSC 2022] **FewM-HGCL: Few-Shot Malware Variants Detection Via Heterogeneous Graph Contrastive Learning** [[paper]](https://www.computer.org/csdl/journal/tq/5555/01/09928211/1HJuUzzFey4) 149 | 1. [arXiv 2022] **Self-supervised Graph-based Point-of-interest Recommendation** [[paper]](https://arxiv.org/pdf/2210.12506.pdf) 150 | 1. [IJMLC 2022] **Hybrid sampling-based contrastive learning for imbalanced node classification** [[paper]](https://link.springer.com/article/10.1007/s13042-022-01677-6) 151 | 1. [CIKM 2022] **Temporality-and Frequency-aware Graph Contrastive Learning for Temporal Network** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557469) 152 | 1. [CIKM 2022] **Towards Self-supervised Learning on Graphs with Heterophily** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557478) 153 | 1. [ISWC 2022] **HCL: Improving Graph Representation with Hierarchical Contrastive Learning** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-19433-7_7) 154 | 1. [CIKM 2022] **Cognize Yourself: Graph Pre-Training via Core Graph Cognizing and Differentiating** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557259) 155 | 1. [CIKM 2022] **AdaGCL: Adaptive Subgraph Contrastive Learning to Generalize Large-scale Graph Training** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557228) 156 | 1. [CIKM 2022] **Look Twice as Much as You Say: Scene Graph Contrastive Learning for Self-Supervised Image Caption Generation** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557382) 157 | 1. [CIKM 2022] **Malicious Repositories Detection with Adversarial Heterogeneous Graph Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557384) 158 | 1. 
[ICEBE 2022] **Knowledge Graph Completion based on Hyperbolic Graph Contrastive Attention Network** [[paper]](https://conferences.computer.org/icebe/2022/icebe2022-proceedings/Knowledge%20Graph%20Completion%20based%20on%20Hyperbolic%20Graph%20Contrastive%20Attention%20Network.pdf) 159 | 1. [arXiv 2022] **Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering** [[paper]](https://arxiv.org/pdf/2210.10462.pdf) 160 | 1. [NeurIPS 2022] **Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative** [[paper]](https://arxiv.org/abs/2210.03801) [[code]](https://github.com/weitianxin/HyperGCL) 161 | 1. [ICCL 2022] **Modeling Intra-and Inter-Modal Relations: Hierarchical Graph Contrastive Learning for Multimodal Sentiment Analysis** [[paper]](https://aclanthology.org/2022.coling-1.622.pdf) 162 | 1. [TKDE 2022] **Adversarial Contrastive Learning for Evidence-aware Fake News Detection with Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2210.05498.pdf) 163 | 1. [MM 2022] **Simple Self-supervised Multiplex Graph Representation Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3503161.3547949) 164 | 1. [TMM 2022] **Self-consistent Contrastive Attributed Graph Clustering with Pseudo-label Prompt** [[paper]](https://ieeexplore.ieee.org/abstract/document/9914670) 165 | 1. [NeurIPS 2022] **Uncovering the Structural Fairness in Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2210.03011.pdf) 166 | 1. [NeurIPS 2022] **Revisiting Graph Contrastive Learning from the Perspective of Graph Spectrum** [[paper]](https://arxiv.org/pdf/2210.02330.pdf) 167 | 1. [arXiv 2022] **Heterogeneous Graph Contrastive Multi-view Learning** [[paper]](https://arxiv.org/pdf/2210.00248.pdf) 168 | 1. [arXiv 2022] **Automated Graph Self-supervised Learning via Multi-teacher Knowledge Distillation** [[paper]](https://arxiv.org/pdf/2210.02099.pdf) 169 | 1. 
[arXiv 2022] **Prompt Tuning for Graph Neural Networks** [[paper]](https://web10.arxiv.org/pdf/2209.15240.pdf) 170 | 1. [arXiv 2022] **Improving Molecular Pretraining with Complementary Featurizations** [[paper]](https://arxiv.org/pdf/2209.15101.pdf) 171 | 1. [arXiv 2022] **Graph Soft-Contrastive Learning via Neighborhood Ranking** [[paper]](https://arxiv.org/pdf/2209.13964.pdf) 172 | 1. [EDBT 2022] **Spatial Structure-Aware Road Network Embedding via Graph Contrastive Learning** [[paper]](https://openproceedings.org/2023/conf/edbt/paper-193.pdf) 173 | 1. [arXiv 2022] **Adversarial Cross-View Disentangled Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2209.07699.pdf) 174 | 1. [Neurocomputing 2022] **Motifs-based Recommender System via Hypergraph Convolution and Contrastive Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S0925231222011948) 175 | 1. [TNNLS 2022] **Graph Representation Learning for Large-Scale Neuronal Morphological Analysis** [[paper]](https://ieeexplore.ieee.org/abstract/document/9895206) 176 | 1. [ECML-PKDD 2022] **Self-supervised Graph Learning with Segmented Graph Channels** [[paper]](https://2022.ecmlpkdd.org/wp-content/uploads/2022/09/sub_216.pdf) 177 | 1. [ECML-PKDD 2022] **Graph Contrastive Learning with Adaptive Augmentation for Recommendation** [[paper]](https://2022.ecmlpkdd.org/wp-content/uploads/2022/09/sub_650.pdf) 178 | 1. [CIKM 2022] **Contrastive Knowledge Graph Error Detection** [[paper]](https://www4.comp.polyu.edu.hk/~xiaohuang/docs/Qinggang_CIKM2022.pdf) 179 | 1. [TKDE 2022] **Disentangled Graph Contrastive Learning With Independence Promotion** [[paper]](https://ieeexplore.ieee.org/abstract/document/9893319) 180 | 1. [ECML-PKDD 2022] **Supervised Graph Contrastive Learning for Few-shot Node Classification** [[paper]](https://2022.ecmlpkdd.org/wp-content/uploads/2022/09/sub_764.pdf) 181 | 1. 
[Information Sciences 2022] **Graph Prototypical Contrastive Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S002002552201057X) 182 | 1. [ICANN 2022] **Knowledge-Aware Self-supervised Graph Representation Learning for Recommendation** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-15937-4_35) 183 | 1. [arXiv 2022] **Self-supervised Representation Learning on Electronic Health Records with Graph Kernel Infomax** [[paper]](https://arxiv.org/pdf/2209.00655.pdf) 184 | 1. [arXiv 2022] **Disentangled Graph Contrastive Learning for Review-based Recommendation** [[paper]](https://arxiv.org/pdf/2209.01524.pdf) 185 | 1. [arXiv 2022] **Contrastive Learning with Heterogeneous Graph Attention Networks on Short Text Classification** [[paper]](https://dro.dur.ac.uk/36856/1/36856.pdf) 186 | 1. [arXiv 2022] **Features Based Adaptive Augmentation for Graph Contrastive Learning** [[paper]](https://arxiv.org/ftp/arxiv/papers/2207/2207.01792.pdf) 187 | 1. [TKDE 2022] **GCCAD: Graph Contrastive Learning for Anomaly Detection** [[paper]](https://ieeexplore.ieee.org/abstract/document/9870034) 188 | 1. [JCIM 2022] **SMICLR: Contrastive Learning on Multiple Molecular Representations for Semisupervised and Unsupervised Representation Learning** [[paper]](https://pubs.acs.org/doi/full/10.1021/acs.jcim.2c00521) 189 | 1. [arXiv 2022] **XSimGCL: Towards Extremely Simple Graph Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/abs/2209.02544) [[code]](https://github.com/Coder-Yu/SELFRec) 190 | 1. [CIKM 2022] **Relational Self-Supervised Learning on Graphs** [[paper]](https://arxiv.org/pdf/2208.10493.pdf) [[code]](https://github.com/Namkyeong/RGRL) 191 | 1. [Information Sciences 2022] **Self-Supervised Graph Representation Learning via Positive Mining** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025522009495) 192 | 1. 
[arXiv 2022] **Heterogeneous Graph Masked Autoencoders** [[paper]](https://arxiv.org/pdf/2208.09957.pdf) 193 | 1. [arXiv 2022] **KRACL: Contrastive Learning with Graph Context Modeling for Sparse Knowledge Graph Completion** [[paper]](https://arxiv.org/pdf/2208.07622.pdf) 194 | 1. [arXiv 2022] **RényiCL: Contrastive Representation Learning with Skew Rényi Divergence** [[paper]](https://arxiv.org/pdf/2208.06270.pdf) 195 | 1. [TNNLS 2022] **Prototypical Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2106.09645.pdf) 196 | 1. [KDD 2022] **Mining Spatio-Temporal Relations via Self-Paced Graph Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3534678.3539422) 197 | 1. [KDD 2022] **Rep2Vec: Repository Embedding via Heterogeneous Graph Adversarial Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3534678.3539324) 198 | 1. [arXiv 2022] **Deep Contrastive Multiview Network Embedding** [[paper]](https://sxkdz.github.io/files/publications/CIKM/CREME/CREME.pdf) 199 | 1. [arXiv 2022] **Analyzing Data-Centric Properties for Contrastive Learning on Graphs** [[paper]](https://arxiv.org/pdf/2208.02810.pdf) 200 | 1. [KDD 2022] **Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries** [[paper]](https://keg.cs.tsinghua.edu.cn/jietang/publications/KDD22-Liu-et-al-KG-Transformer.pdf) 201 | 1. [arXiv 2022] **Generative Subgraph Contrast for Self-Supervised Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2207.11996.pdf) 202 | 1. [IJCAI 2022] **Graph Masked Autoencoder Enhanced Predictor for Neural Architecture Search** [[paper]](https://www.ijcai.org/proceedings/2022/0432.pdf) 203 | 1. [IJCAI 2022] **Proximity Enhanced Graph Neural Networks with Channel Contrast** [[paper]](https://www.ijcai.org/proceedings/2022/0340.pdf) 204 | 1.
[IJCAI 2022] **Rethinking the Promotion Brought by Contrastive Learning to Semi-Supervised Node Classification** [[paper]](https://www.ijcai.org/proceedings/2022/0395.pdf) 205 | 1. [IPM 2022] **HCNA: Hyperbolic Contrastive Learning Framework for Self-Supervised Network Alignment** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S0306457322001315) 206 | 1. [arXiv 2022] **3D Equivariant Molecular Graph Pretraining** [[paper]](https://arxiv.org/pdf/2207.08824.pdf) 207 | 1. [arXiv 2022] **Unified 2D and 3D Pre-Training of Molecular Representations** [[paper]](https://arxiv.org/pdf/2207.08806.pdf) 208 | 1. [arXiv 2022] **Does GNN Pretraining Help Molecular Representation?** [[paper]](https://arxiv.org/pdf/2207.06010.pdf) 209 | 1. [arXiv 2022] **Latent Augmentation For Better Graph Self-Supervised Learning** [[paper]](https://arxiv.org/pdf/2206.12933.pdf) 210 | 1. [arXiv 2022] **Geometry Contrastive Learning on Heterogeneous Graphs** [[paper]](https://arxiv.org/pdf/2206.12547.pdf) 211 | 1. [KIS 2022] **Self-supervised role learning for graph neural networks** [[paper]](https://link.springer.com/article/10.1007/s10115-022-01694-5) 212 | 1. [JFCST 2022] **Graph Neural Network Defense Combined with Contrastive Learning** [[paper]](http://fcst.ceaj.org/EN/article/downloadArticleFile.do?attachType=PDF&id=3113) 213 | 1. [ICMLW 2022] **Evaluating Self-Supervised Learned Molecular Graphs** [[paper]](https://openreview.net/pdf?id=LeJC_Mf5rx-) 214 | 1. [KDD 2022] **Reliable Representations Make A Stronger Defender: Unsupervised Structure Refinement for Robust GNN** [[paper]](https://ponderly.github.io/pub/STABLE_KDD2022.pdf) 215 | 1. [ICMLW 2022] **Featurizations Matter: A Multiview Contrastive Learning Approach to Molecular Pretraining** [[paper]](https://openreview.net/pdf?id=Pm1Q1X3avx1) 216 | 1. 
[bioRxiv 2022] **Cross-modal Graph Contrastive Learning with Cellular Images** [[paper]](https://www.biorxiv.org/content/biorxiv/early/2022/06/06/2022.06.05.494905.full.pdf) 217 | 1. [Information Sciences 2022] **A new self-supervised task on graphs: Geodesic distance prediction** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S0020025522006375) 218 | 1. [arXiv 2022] **Evaluating Self-Supervised Learning for Molecular Graph Embeddings** [[paper]](https://hansen7.github.io/sandbox/molgrapheval.pdf) 219 | 1. [arXiv 2022] **Evaluating Graph Generative Models with Contrastively Learned Features** [[paper]](https://arxiv.org/pdf/2206.06234.pdf) 220 | 1. [arXiv 2022] **COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2206.04726.pdf) 221 | 1. [arXiv 2022] **Decoupled Self-supervised Learning for Non-Homophilous Graphs** [[paper]](https://arxiv.org/pdf/2206.03601.pdf) 222 | 1. [arXiv 2022] **Interpolation-based Correlation Reduction Network for Semi-Supervised Graph Learning** [[paper]](https://arxiv.org/pdf/2206.02796.pdf) 223 | 1. [arXiv 2022] **Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination** [[paper]](https://arxiv.org/pdf/2206.01535.pdf) 224 | 1. [arXiv 2022] **KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular Property Prediction** [[paper]](https://arxiv.org/pdf/2206.03364.pdf) 225 | 1. [CVPR 2022] **Robust Optimization As Data Augmentation for Large-Scale Graphs** [[paper]](https://openaccess.thecvf.com/content/CVPR2022/papers/Kong_Robust_Optimization_As_Data_Augmentation_for_Large-Scale_Graphs_CVPR_2022_paper.pdf) 226 | 1. [arXiv 2022] **COIN: Co-Cluster Infomax for Bipartite Graphs** [[paper]](https://arxiv.org/pdf/2206.00006.pdf) 227 | 3.
[TSIPN 2022] **Fair Contrastive Learning on Graphs** [[paper]](https://ieeexplore.ieee.org/abstract/document/9779533) 228 | 4. [arXiv 2022] **I’m Me, We’re Us, and I’m Us: Tri-directional Contrastive Learning on Hypergraphs** [[paper]](https://arxiv.org/pdf/2206.04739.pdf) 229 | 5. [TNNLS 2022] **CLEAR: Cluster-Enhanced Contrast for Self-Supervised Graph Representation Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/9791433) 230 | 6. [arXiv 2022] **Let Invariant Rationale Discovery Inspire Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2206.07869.pdf) 231 | 7. [arXiv 2022] **Omni-Granular Ego-Semantic Propagation for Self-Supervised Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2205.15746.pdf) 232 | 8. [arXiv 2022] **Improving Subgraph Representation Learning via Multi-View Augmentation** [[paper]](https://arxiv.org/pdf/2205.13038.pdf) 233 | 9. [arXiv 2022] **Triangular Contrastive Learning on Molecular Graphs** [[paper]](https://arxiv.org/pdf/2205.13279.pdf) 234 | 10. [KDD 2022] **GraphMAE: Self-supervised Masked Graph Autoencoders** [[paper]](https://arxiv.org/pdf/2205.10803.pdf) 235 | 11. [arXiv 2022] **MaskGAE: Masked Graph Modeling Meets Graph Autoencoders** [[paper]](https://arxiv.org/pdf/2205.10053.pdf) 236 | 12. [ICML 2022] **Understanding Limitations of Unsupervised Graph Representation Learning from a Data-Dependent Perspective** [[paper]](https://www.osti.gov/servlets/purl/1868861) 237 | 13. [arXiv 2022] **Towards Explanation for Unsupervised Graph-Level Representation Learning** [[paper]](https://arxiv.org/pdf/2205.09934.pdf) 238 | 14. [arXiv 2022] **ImGCL: Revisiting Graph Contrastive Learning on Imbalanced Node Classification** [[paper]](https://arxiv.org/pdf/2205.11332.pdf) 239 | 15. [TNNLS 2022] **Collaborative Decision-Reinforced Self-Supervision for Attributed Graph Clustering** [[paper]](https://ieeexplore.ieee.org/abstract/document/9777842) 240 | 16. 
[arXiv 2022] **Contrastive Graph Learning with Graph Convolutional Networks** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-06555-2_7) 242 | 18. [arXiv 2022] **SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks** [[paper]](https://zepengzhang.com/Notes/2022/20220507.pdf) 243 | 19. [arXiv 2022] **HCL: Hybrid Contrastive Learning for Graph-based Recommendation** [[paper]](https://assets.amazon.science/21/8b/a804e89041f1a83bb1f77fa6aaee/hcl-hybrid-contrastive-learning-for-graph-based-recommendation.pdf) 244 | 20. [arXiv 2022] **Representation learning with function call graph transformations for malware open set recognition** [[paper]](https://arxiv.org/pdf/2205.07865.pdf) 245 | 21. [arXiv 2022] **Simple Contrastive Graph Clustering** [[paper]](https://arxiv.org/pdf/2205.07865.pdf) 246 | 22. [NCA 2022] **Self-supervised graph representation learning using multi-scale subgraph views contrast** [[paper]](https://link.springer.com/article/10.1007/s00521-022-07299-x) 247 | 23. [ACL 2022] **JointCL: A Joint Contrastive Learning Framework for Zero-Shot Stance Detection** [[paper]](https://aclanthology.org/2022.acl-long.7/) 248 | 24. [IPM 2022] **Contrastive Graph Convolutional Networks with adaptive augmentation for text classification** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S0306457322000681) 249 | 25. [PAKDD 2022] **Contrastive Attributed Network Anomaly Detection with Data Augmentation** [[paper]](https://dl.acm.org/doi/abs/10.1007/978-3-031-05936-0_35) 250 | 26. [DASFAA 2022] **CSGNN: Improving Graph Neural Networks with Contrastive Semi-supervised Learning** [[paper]](https://dl.acm.org/doi/abs/10.1007/978-3-031-00123-9_58) 251 | 27.
[arXiv 2022] **Dynamic Graph Representation Based on Temporal and Contextual Contrasting** [[paper]](https://assets.researchsquare.com/files/rs-1588877/v1_covered.pdf?c=1651680782) 252 | 28. [DASFAA 2022] **Diffusion-Based Graph Contrastive Learning for Recommendation with Implicit Feedback** [[paper]](https://dl.acm.org/doi/abs/10.1007/978-3-031-00126-0_15) 253 | 29. [arXiv 2022] **FastGCL: Fast Self-Supervised Learning on Graphs via Contrastive Neighborhood Aggregation** [[paper]](https://arxiv.org/pdf/2205.00905.pdf) 254 | 30. [arXiv 2022] **RoSA: A Robust Self-Aligned Framework for Node-Node Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2204.13846.pdf) 255 | 31. [arXiv 2022] **Heterogeneous Graph Neural Networks using Self-supervised Reciprocally Contrastive Learning** [[paper]](https://arxiv.org/pdf/2205.00256.pdf) 256 | 32. [WSDM 2022] **JGCL: Joint Self-Supervised and Supervised Graph Contrastive Learning** [[paper]](https://www2022.thewebconf.org/PaperFiles/161.pdf) 257 | 33. [AAAI 2022] **SAIL: Self-Augmented Graph Contrastive Learning** [[paper]](https://www.aaai.org/AAAI22Papers/AAAI-8378.YuL.pdf) 258 | 34. [ICASSP 2022] **Graph Fine-Grained Contrastive Representation Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/9746085) 259 | 35. [arXiv 2022] **SCGC: Self-Supervised Contrastive Graph Clustering** [[paper]](https://arxiv.org/pdf/2204.12656.pdf) 260 | 36. [arXiv 2022] **A Content-First Benchmark for Self-Supervised Graph Representation Learning** [[paper]](https://graph-learning-benchmarks.github.io/assets/papers/glb2022/A_Content_First_Benchmark_for_Self_Supervised_Graph_Representation_Learning.pdf) 261 | 37. [SIGIR 2022] **Hypergraph Contrastive Collaborative Filtering** [[paper]](https://arxiv.org/pdf/2204.12200.pdf) 262 | 38. [WWW 2022] **Rumor Detection on Social Media with Graph Adversarial Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3485447.3511999) 263 | 39. 
[arXiv 2022] **A Review-aware Graph Contrastive Learning Framework for Recommendation** [[paper]](https://arxiv.org/pdf/2204.12063.pdf) 264 | 40. [WWW 2022] **Robust Self-Supervised Structural Graph Neural Network for Social Network Prediction** [[paper]](https://dl.acm.org/doi/abs/10.1145/3485447.3512182) 265 | 41. [arXiv 2022] **CGC: Contrastive Graph Clustering for Community Detection and Tracking** [[paper]](https://arxiv.org/pdf/2204.08504.pdf) 266 | 42. [TCybern 2022] **Multiview Deep Graph Infomax to Achieve Unsupervised Graph Embedding** [[paper]](https://ieeexplore.ieee.org/abstract/document/9758652) 267 | 43. [arXiv 2022] **MVGCNMDA: Multi-view Graph Augmentation Convolutional Network for Uncovering Disease-Related Microbes** [[paper]](https://link.springer.com/article/10.1007/s12539-022-00514-2) 268 | 44. [arXiv 2022] **CERES: Pretraining of Graph-Conditioned Transformer for Semi-Structured Session Data** [[paper]](https://arxiv.org/pdf/2204.04303.pdf) 269 | 45. [arXiv 2022] **Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation** [[paper]](https://arxiv.org/pdf/2204.05104.pdf) 270 | 46. [SIGIR 2022] **Are Graph Augmentations Necessary? Simple Graph Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/abs/2112.08679)[[code]](https://github.com/Coder-Yu/SELFRec) 271 | 47. [arXiv 2022] **Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning** [[paper]](https://arxiv.org/pdf/2204.04813) 272 | 48. [arXiv 2022] **Augmentation-Free Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2204.04874) 273 | 49. [TCybern 2022] **Link-Information Augmented Twin Autoencoders for Network Denoising** [[paper]](https://ieeexplore.ieee.org/abstract/document/9745753) 274 | 50. [arXiv 2022] **Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization** [[paper]](https://arxiv.org/pdf/2203.12265) 275 | 51.
[arXiv 2022] **GraphCoCo: Graph Complementary Contrastive Learning** [[paper]](https://arxiv.org/pdf/2203.12821) 276 | 52. [arXiv 2022] **Unsupervised Heterophilous Network Embedding via r-Ego Network Discrimination** [[paper]](https://arxiv.org/pdf/2203.10866.pdf) 277 | 53. [Bioinformatics 2022] **Supervised Graph Co-contrastive Learning for Drug-Target Interaction Prediction** [[paper]](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btac164/6551245?login=true) 278 | 54. [arXiv 2022] **Supervised Contrastive Learning with Structure Inference for Graph Classification** [[paper]](https://arxiv.org/pdf/2203.07691) 279 | 55. [arXiv 2022] **Defending Graph Convolutional Networks against Dynamic Graph Perturbations via Bayesian Self-supervision** [[paper]](https://arxiv.org/pdf/2203.03762.pdf) 280 | 57. [arXiv 2022] **Analyzing Heterogeneous Networks with Missing Attributes by Unsupervised Contrastive Learning** [[paper]](https://yangliang.github.io/pdf/tnnls22.pdf) 281 | 58. [arXiv 2022] **Improving Molecular Contrastive Learning via Faulty Negative Mitigation and Decomposed Fragment Contrast** [[paper]](https://arxiv.org/pdf/2202.09346.pdf) 282 | 59. [arXiv 2022] **Contrastive Meta Learning with Behavior Multiplicity for Recommendation** [[paper]](https://arxiv.org/pdf/2202.08523.pdf)[[code]](https://github.com/weiwei1206/CML.git) 283 | 60. [arXiv 2022] **Fair Node Representation Learning via Adaptive Data Augmentation** [[paper]](https://arxiv.org/pdf/2201.08549.pdf) 284 | 61. [arXiv 2022] **Learning Graph Augmentations to Learn Graph Representations** [[paper]](https://arxiv.org/pdf/2201.09830.pdf)[[code]](https://github.com/kavehhassani/lg2ar) 285 | 62. [arXiv 2022] **Graph Data Augmentation for Graph Machine Learning: A Survey** [[paper]](https://arxiv.org/pdf/2202.08871.pdf) 286 | 63. [arXiv 2022] **Data Augmentation for Deep Graph Learning: A Survey** [[paper]](https://arxiv.org/abs/2202.08235) 287 | 64. 
[arXiv 2022] **Adversarial Graph Contrastive Learning with Information Regularization** [[paper]](https://arxiv.org/pdf/2202.06491.pdf) 288 | 65. [arXiv 2022] **SimGRACE: A Simple Framework for Graph Contrastive Learning without Data Augmentation** [[paper]](https://arxiv.org/pdf/2202.03104.pdf) 289 | 66. [NeurIPS 2022] **Graph Self-supervised Learning with Accurate Discrepancy Learning** [[paper]](https://arxiv.org/pdf/2202.02989.pdf) 290 | 67. [arXiv 2022] **Learning Robust Representation through Graph Adversarial Contrastive Learning** [[paper]](https://arxiv.org/pdf/2201.13025.pdf) 291 | 68. [arXiv 2022] **Self-supervised Graphs for Audio Representation Learning with Limited Labeled Data** [[paper]](https://arxiv.org/pdf/2202.00097.pdf) 292 | 69. [arXiv 2022] **Link Prediction with Contextualized Self-Supervision** [[paper]](https://arxiv.org/pdf/2201.10069.pdf) 293 | 70. [arXiv 2022] **Dual Space Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2201.07409.pdf) 294 | 71. [arXiv 2022] **Unsupervised Graph Poisoning Attack via Contrastive Loss Back-propagation** [[paper]](https://arxiv.org/pdf/2201.07986.pdf) 295 | 72. [arXiv 2022] **From Unsupervised to Few-shot Graph Anomaly Detection: A Multi-scale Contrastive Learning Approach** [[paper]](https://arxiv.org/pdf/2202.05525) 297 | 74. [arXiv 2022] **Structure-Enhanced Heterogeneous Graph Contrastive Learning** [[paper]](https://sxkdz.github.io/files/publications/SDM/STENCIL/STENCIL.pdf) 298 | 75. [bioRxiv 2022] **Towards Effective and Generalizable Fine-tuning for Pre-trained Molecular Graph Models** [[paper]](https://www.biorxiv.org/content/biorxiv/early/2022/02/06/2022.02.03.479055.full.pdf) 299 | 76. [SDM 2022] **Neural Graph Matching for Pre-training Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2203.01597.pdf) [[code]](https://github.com/RUCAIBox/GMPT) 300 | 77.
[TNNLS 2022] **Analyzing Heterogeneous Networks with Missing Attributes by Unsupervised Contrastive Learning** [[paper]](https://yangliang.github.io/pdf/tnnls22.pdf) 301 | 78. [WWW 2022] **Improving Graph Collaborative Filtering with Neighborhood-enriched Contrastive Learning** [[paper]](https://arxiv.org/pdf/2202.06200.pdf) [[code]](https://github.com/RUCAIBox/NCL) 302 | 79. [WWW 2022] **ClusterSCL: Cluster-Aware Supervised Contrastive Learning on Graphs** [[paper]](https://xiaojingzi.github.io/publications/WWW22-Wang-et-al-ClusterSCL.pdf) 303 | 80. [ICLR 2022] **Large-Scale Representation Learning on Graphs via Bootstrapping** [[paper]](https://arxiv.org/pdf/2102.06514.pdf)[[Code]](https://github.com/Namkyeong/BGRL_Pytorch) 304 | 81. [ICLR 2022] **Automated Self-Supervised Learning for Graphs** [[paper]](https://openreview.net/forum?id=rFbR4Fv-D6-) [[code]](https://github.com/ChandlerBang/AutoSSL) 305 | 82. [AAAI 2022] **Self-supervised Graph Neural Networks via Diverse and Interactive Message Passing** [[paper]](https://yangliang.github.io/pdf/aaai22.pdf) 306 | 83. [AAAI 2022] **Augmentation-Free Self-Supervised Learning on Graphs** [[paper]](https://arxiv.org/pdf/2112.02472.pdf)[[code]](https://github.com/Namkyeong/AFGRL) 307 | 84. [AAAI 2022] **Molecular Contrastive Learning with Chemical Element Knowledge Graph** [[paper]](https://arxiv.org/pdf/2112.00544.pdf) 308 | 85. [AAAI 2022] **Deep Graph Clustering via Dual Correlation Reduction** [[paper]](https://arxiv.org/pdf/2112.14772)[[code]](https://github.com/yueliu1999/DCRN) 309 | 86. [AAAI 2022] **Simple Unsupervised Graph Representation Learning** [[paper]](https://www.aaai.org/AAAI22Papers/AAAI-3999.MoY.pdf) 310 | 87. [WSDM 2022] **Bringing Your Own View: Graph Contrastive Learning without Prefabricated Data Augmentations** [[paper]](https://arxiv.org/abs/2201.01702) [[code]](https://github.com/Shen-Lab/GraphCL_Automated) 311 | 88. 
[ICOIN 2022] **Adaptive Self-Supervised Graph Representation Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/9687176) 312 | 89. [NPL 2022] **How Does Bayesian Noisy Self-Supervision Defend Graph Convolutional Networks?** [[paper]](https://link.springer.com/article/10.1007/s11063-022-10750-8) 313 | 90. [SIGIR 2022] **Knowledge Graph Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/abs/2205.00976) [[code]](https://github.com/yuh-yang/KGCL-SIGIR22) 314 | 315 | ## Year 2021 316 | 1. [AAAI 2021] **Self-supervised hypergraph convolutional networks for session-based recommendation** [[paper]](https://www.aaai.org/AAAI21Papers/AAAI-1889.XiaX.pdf) 317 | 1. [arXiv 2021] **Pre-training Graph Neural Network for Cross Domain Recommendation** [[paper]](https://arxiv.org/pdf/2111.08268.pdf) 318 | 17. [arXiv 2021] **Augmentations in Graph Contrastive Learning: Current Methodological Flaws & Towards Better Practices** [[paper]](https://arxiv.org/pdf/2111.03220.pdf) 319 | 18. [arXiv 2021] **Collaborative Graph Contrastive Learning: Data Augmentation Composition May Not be Necessary for Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2111.03262.pdf) 320 | 13. [arXiv 2021] **Multi-task Self-distillation for Graph-based Semi-Supervised Learning** [[paper]](https://arxiv.org/pdf/2112.01174.pdf) 321 | 14. [arXiv 2021] **Subgraph Contrastive Link Representation Learning** [[paper]](https://arxiv.org/pdf/2112.01165.pdf) 322 | 3. [arXiv 2021] **Multilayer Graph Contrastive Clustering Network** [[paper]](https://arxiv.org/pdf/2112.14021.pdf) 323 | 3. [arXiv 2021] **Graph Representation Learning via Contrasting Cluster Assignments** [[paper]](https://arxiv.org/pdf/2112.07934.pdf) 324 | 3. [arXiv 2021] **Graph-wise Common Latent Factor Extraction for Unsupervised Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2112.08830.pdf) 325 | 3. 
[arXiv 2021] **Bayesian Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2112.07823.pdf) 326 | 3. [arXiv 2021] **TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning** [[paper]](https://arxiv.org/pdf/2112.03587.pdf) 327 | 26. [arXiv 2021] **Graph Communal Contrastive Learning** [[paper]](https://arxiv.org/pdf/2110.14863.pdf) 328 | 27. [arXiv 2021] **Self-supervised Contrastive Attributed Graph Clustering** [[paper]](https://arxiv.org/pdf/2110.08264.pdf) 329 | 28. [arXiv 2021] **Self-Supervised Learning for Molecular Property Prediction** [[paper]](https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/61677becaa918db6bf2a31cb/original/self-supervised-learning-for-molecular-property-prediction.pdf) 330 | 29. [arXiv 2021] **RPT: Toward Transferable Model on Heterogeneous Researcher Data via Pre-Training** [[paper]](https://arxiv.org/pdf/2110.07336.pdf) 331 | 30. [arXiv 2021] **Scalable Consistency Training for Graph Neural Networks via Self-Ensemble Self-Distillation** [[paper]](https://arxiv.org/pdf/2110.06290.pdf) 332 | 31. [arXiv 2021] **PRE-TRAINING MOLECULAR GRAPH REPRESENTATION WITH 3D GEOMETRY** [[paper]](https://wyliu.com/papers/GraphMVP.pdf) [[code]](https://github.com/chao1224/GraphMVP) 333 | 32. [arXiv 2021] **3D Infomax improves GNNs for Molecular Property Prediction** [[paper]](https://arxiv.org/abs/2110.04126v1) [[code]](https://github.com/HannesStark/3DInfomax) 334 | 34. [arXiv 2021] **Motif-based Graph Self-Supervised Learning for Molecular Property Prediction** [[paper]](https://arxiv.org/pdf/2110.00987.pdf) 335 | 35. [arXiv 2021] **Debiased Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2110.02027.pdf) 336 | 36. [arXiv 2021] **3D-Transformer: Molecular Representation with Transformer in 3D Space** [[paper]](https://arxiv.org/pdf/2110.01191.pdf) 337 | 37. 
[arXiv 2021] **Contrastive Pre-Training of GNNs on Heterogeneous Graphs** [[paper]](https://yuanfulu.github.io/publication/CIKM-CPT.pdf) 338 | 38. [arXiv 2021] **Contrastive Graph Convolutional Networks for Hardware Trojan Detection in Third Party IP Cores** [[paper]](https://people.cs.vt.edu/~ramakris/papers/Hardware_Trojan_Trigger_Detection__HOST2021.pdf) 339 | 39. [arXiv 2021] **GeomGCL: Geometric Graph Contrastive Learning for Molecular Property Prediction** [[paper]](https://arxiv.org/pdf/2109.11730.pdf) 340 | 40. [arXiv 2021] **Adaptive Multi-layer Contrastive Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2109.14159.pdf) 341 | 42. [arXiv 2021] **Graph-MVP: Multi-View Prototypical Contrastive Learning for Multiplex Graphs** [[paper]](https://arxiv.org/pdf/2109.03560.pdf) 342 | 43. [arXiv 2021] **Hyper Meta-Path Contrastive Learning for Multi-Behavior Recommendation** [[paper]](https://arxiv.org/pdf/2109.02859.pdf) 343 | 44. [arXiv 2021] **Negative Sampling Strategies for Contrastive Self-Supervised Learning of Graph Representations** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S0165168421003479) 344 | 45. [arXiv 2021] **Structure-Aware Hard Negative Mining for Heterogeneous Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2108.13886.pdf) 345 | 46. [arXiv 2021] **Spatio-Temporal Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2108.11873.pdf) 346 | 47. [arXiv 2021] **Generative and Contrastive Self-Supervised Learning for Graph Anomaly Detection** [[paper]](https://arxiv.org/pdf/2108.09896.pdf) 347 | 92. [arXiv 2021] **Self-Supervised Multi-Channel Hypergraph Convolutional Network for Social Recommendation** [[paper]](https://arxiv.org/abs/2101.06448) [[code]](https://github.com/Coder-Yu/RecQ) 348 | 53. [arXiv 2021] **GCCAD: Graph Contrastive Coding for Anomaly Detection** [[paper]](https://arxiv.org/pdf/2108.07516.pdf) 349 | 54.
[arXiv 2021] **Contrastive Self-supervised Sequential Recommendation with Robust Augmentation** [[paper]](https://arxiv.org/pdf/2108.06479.pdf) 350 | 55. [arXiv 2021] **RRLFSOR: An Efficient Self-Supervised Learning Strategy of Graph Convolutional Networks** [[paper]](https://arxiv.org/ftp/arxiv/papers/2108/2108.07481.pdf) 351 | 59. [arXiv 2021] **Group Contrastive Self-Supervised Learning on Graphs** [[paper]](https://arxiv.org/abs/2107.09787) 352 | 60. [arXiv 2021] **Multi-Level Graph Contrastive Learning** [[paper]](https://arxiv.org/abs/2107.02639) 353 | 62. [arXiv 2021] **From Canonical Correlation Analysis to Self-supervised Graph Neural Networks** [[paper]](https://arxiv.org/abs/2106.12484) [[code]](https://github.com/hengruizhang98/CCA-SSG) 354 | 63. [arXiv 2021] **Evaluating Modules in Graph Contrastive Learning** [[paper]](https://arxiv.org/abs/2106.08171) [[code]](https://github.com/thunlp/OpenGCL) 355 | 70. [arXiv 2021] **Prototypical Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2106.09645.pdf) 356 | 71. [arXiv 2021] **Fairness-Aware Node Representation Learning** [[paper]](https://arxiv.org/pdf/2106.05391.pdf) 357 | 72. [arXiv 2021] **Adversarial Graph Augmentation to Improve Graph Contrastive Learning** [[paper]](https://arxiv.org/abs/2106.05819) 358 | 73. [arXiv 2021] **Graph Barlow Twins: A self-supervised representation learning framework for graphs** [[paper]](https://arxiv.org/pdf/2106.02466.pdf) 359 | 74. [arXiv 2021] **Self-Supervised Graph Learning with Proximity-based Views and Channel Contrast** [[paper]](https://arxiv.org/pdf/2106.03723.pdf) 360 | 75. [arXiv 2021] **Self-supervised on Graphs: Contrastive, Generative, or Predictive** [[paper]](https://arxiv.org/abs/2105.07342) 361 | 76. [arXiv 2021] **FedGL: Federated Graph Learning Framework with Global Self-Supervision** [[paper]](https://arxiv.org/pdf/2105.03170.pdf) 362 | 78.
[arXiv 2021] **Hop-Count Based Self-Supervised Anomaly Detection on Attributed Networks** [[paper]](https://arxiv.org/abs/2104.07917) 363 | 79. [arXiv 2021] **Representation Learning for Networks in Biology and Medicine: Advancements, Challenges, and Opportunities** [[paper]](https://arxiv.org/abs/2104.04883) 364 | 80. [arXiv 2021] **Graph Representation Learning by Ensemble Aggregating Subgraphs via Mutual Information Maximization** [[paper]](https://arxiv.org/abs/2103.13125) 365 | 81. [arXiv 2021] **Drug Target Prediction Using Graph Representation Learning via Substructures Contrast** [[paper]](https://www.preprints.org/manuscript/202103.0337/v1) 366 | 82. [arXiv 2021] **Self-supervised Auxiliary Learning for Graph Neural Networks via Meta-Learning** [[paper]](https://arxiv.org/abs/2103.00771) 367 | 83. [arXiv 2021] **Graph Self-Supervised Learning: A Survey** [[paper]](https://arxiv.org/abs/2103.00111) 368 | 84. [arXiv 2021] **Towards Robust Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2102.13085.pdf) 369 | 85. [arXiv 2021] **Pre-Training on Dynamic Graph Neural Networks** [[paper]](https://arxiv.org/abs/2102.12380) 370 | 86. [arXiv 2021] **Self-Supervised Learning of Graph Neural Networks: A Unified Review** [[paper]](https://arxiv.org/abs/2102.10757) 371 | 61. [Openreview 2021] **An Empirical Study of Graph Contrastive Learning** [[paper]](https://openreview.net/forum?id=fYxEnpY-__G) 372 | 1. [BIBM 2021] **SGAT: a Self-supervised Graph Attention Network for Biomedical Relation Extraction** [[paper]](https://ieeexplore.ieee.org/abstract/document/9669699) 373 | 95. [BIBM 2021] **Molecular Graph Contrastive Learning with Parameterized Explainable Augmentations** [[paper]](https://www.biorxiv.org/content/10.1101/2021.12.03.471150v1) 374 | 5. 
[NeurIPS 2021 Workshop] **Self-Supervised GNN that Jointly Learns to Augment** [[paper]](https://www.researchgate.net/profile/Zekarias-Kefato/publication/356997993_Self-Supervised_GNN_that_Jointly_Learns_to_Augment/links/61b75d88a6251b553ab64ff4/Self-Supervised-GNN-that-Jointly-Learns-to-Augment.pdf) 375 | 5. [NeurIPS 2021 Workshop] **Contrastive Embedding of Structured Space for Bayesian Optimisation** [[paper]](https://openreview.net/pdf?id=xFpkJUMS9te) 376 | 5. [NeurIPS 2021 Workshop] **Enhancing Hyperbolic Graph Embeddings via Contrastive Learning** [[paper]](https://sslneurips21.github.io/files/CameraReady/NeurIPS_2021_workshop_version2.pdf) 377 | 5. [NeurIPS 2021] **Graph Adversarial Self-Supervised Learning** [[paper]](https://proceedings.neurips.cc/paper/2021/file/7d3010c11d08cf990b7614d2c2ca9098-Paper.pdf) 378 | 6. [NeurIPS 2021] **Contrastive Laplacian Eigenmaps** [[paper]](https://papers.nips.cc/paper/2021/file/2d1b2a5ff364606ff041650887723470-Paper.pdf) 379 | 7. [NeurIPS 2021] **Directed Graph Contrastive Learning** [[paper]](https://zekuntong.com/files/digcl_nips.pdf)[[code]](https://github.com/flyingtango/DiGCL) 380 | 8. [NeurIPS 2021] **Multi-view Contrastive Graph Clustering** [[paper]](https://arxiv.org/pdf/2110.11842.pdf)[[code]](https://github.com/Panern/MCGC) 381 | 9. [NeurIPS 2021] **From Canonical Correlation Analysis to Self-supervised Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2106.12484.pdf)[[code]](https://github.com/hengruizhang98/CCA-SSG) 382 | 10. [NeurIPS 2021] **InfoGCL: Information-Aware Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2110.15438.pdf) 383 | 11. [NeurIPS 2021] **Adversarial Graph Augmentation to Improve Graph Contrastive Learning** [[paper]](https://arxiv.org/abs/2106.05819)[[code]](https://github.com/susheels/adgcl) 384 | 12. [NeurIPS 2021] **Disentangled Contrastive Learning on Graphs** [[paper]](https://openreview.net/pdf?id=C_L0Xw_Qf8M) 385 | 20.
[CIKM 2021] **Multimodal Graph Meta Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482151) 386 | 21. [CIKM 2021] **Self-supervised Representation Learning on Dynamic Graphs** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482389) 387 | 22. [CIKM 2021] **Rectifying Pseudo Labels: Iterative Feature Clustering for Graph Representation Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482469) 388 | 23. [CIKM 2021] **SGCL: Contrastive Representation Learning for Signed Graphs** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482478) 389 | 24. [CIKM 2021] **Semi-Supervised and Self-Supervised Classification with Multi-View Graph Neural Networks** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482477) 390 | 25. [CIKM 2021] **Social Recommendation with Self-Supervised Metagraph Informax Network** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482480) [[code]](https://github.com/SocialRecsys/SMIN) 391 | 48. [IJCAI 2021] **Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning** [[paper]](https://www.ijcai.org/proceedings/2021/0204.pdf) 392 | 49. [IJCAI 2021] **Pairwise Half-graph Discrimination: A Simple Graph-level Self-supervised Strategy for Pre-training Graph Neural Networks** [[paper]](https://www.ijcai.org/proceedings/2021/0371.pdf) 393 | 50. [IJCAI 2021] **CuCo: Graph Representation with Curriculum Contrastive Learning** [[paper]](https://www.ijcai.org/proceedings/2021/0317.pdf) 394 | 51. [IJCAI 2021] **Graph Debiased Contrastive Learning with Joint Representation Clustering** [[paper]](https://www.ijcai.org/proceedings/2021/0473.pdf) 395 | 52. [IJCAI 2021] **CSGNN: Contrastive Self-Supervised Graph Neural Network for Molecular Interaction Prediction** [[paper]](https://www.ijcai.org/proceedings/2021/0517.pdf) 396 | 56. 
[KDD 2021] **MoCL: Data-driven Molecular Fingerprint via Knowledge-aware Contrastive Learning from Molecular Graph** [[paper]](https://dl.acm.org/doi/abs/10.1145/3447548.3467186) [[code]](https://github.com/illidanlab/MoCL-DK) 397 | 57. [KDD 2021] **Contrastive Multi-View Multiplex Network Embedding with Applications to Robust Network Alignment** [[paper]](https://dl.acm.org/doi/abs/10.1145/3447548.3467227) 398 | 58. [KDD 2021] **Adaptive Transfer Learning on Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2107.08765.pdf) 399 | 64. :fire:[ICML 2021] **Graph Contrastive Learning Automated** [[paper]](https://arxiv.org/abs/2106.07594) [[code]](https://github.com/Shen-Lab/GraphCL_Automated) 400 | 66. [ICML 2021] **Self-supervised Graph-level Representation Learning with Local and Global Structure** [[paper]](https://arxiv.org/pdf/2106.04113) [[code]](https://github.com/DeepGraphLearning/GraphLoG) 401 | 67. [KDD 2021] **Pre-training on Large-Scale Heterogeneous Graph** [[paper]](http://www.shichuan.org/doc/111.pdf) 402 | 68. [KDD 2021] **MoCL: Contrastive Learning on Molecular Graphs with Multi-level Domain Knowledge** [[paper]](https://arxiv.org/pdf/2106.04509.pdf) 403 | 69. [KDD 2021] **Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning** [[paper]](https://arxiv.org/abs/2105.09111) [[code]](https://github.com/liun-online/HeCo) 404 | 87. [WWW 2021 Workshop] **Iterative Graph Self-Distillation** [[paper]](https://arxiv.org/abs/2010.12609) 405 | 88. [WWW 2021] **HDMI: High-order Deep Multiplex Infomax** [[paper]](https://arxiv.org/abs/2102.07810) [[code]](https://github.com/baoyujing/HDMI) 406 | 89. :fire:[WWW 2021] **Graph Contrastive Learning with Adaptive Augmentation** [[paper]](https://arxiv.org/abs/2010.14945) [[code]](https://github.com/CRIPAC-DIG/GCA) 407 | 90. 
[WWW 2021] **SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism** [[paper]](https://arxiv.org/abs/2101.08170) [[code]](https://github.com/RingBDStack/SUGAR) 408 | 91. [WWW 2021] **Multi-view Graph Contrastive Representation Learning for Drug-Drug Interaction Prediction** [[paper]](https://arxiv.org/abs/2010.11711) [[code]](https://github.com/isjakewong/MIRACLE) 409 | 93. :fire:[ICLR 2021] **How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision** [[paper]](https://openreview.net/forum?id=Wi5KUNlqWty) [[code]](https://github.com/dongkwan-kim/SuperGAT) 410 | 94. [WSDM 2021] **Pre-Training Graph Neural Networks for Cold-Start Users and Items Representation** [[paper]](https://arxiv.org/abs/2012.07064) [[code]](https://github.com/jerryhao66/Pretrain-Recsys) 411 | 41. [KBS 2021] **Multi-aspect self-supervised learning for heterogeneous information network** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S095070512100736X) 412 | 33. [CVPR 2021] **Zero-Shot Learning via Contrastive Learning on Dual Knowledge Graphs** [[paper]](https://openaccess.thecvf.com/content/ICCV2021W/GSP-CV/papers/Wang_Zero-Shot_Learning_via_Contrastive_Learning_on_Dual_Knowledge_Graphs_ICCVW_2021_paper.pdf) 413 | 2. [ICBD 2021] **Session-based Recommendation via Contrastive Learning on Heterogeneous Graph** [[paper]](https://ieeexplore.ieee.org/abstract/document/9671296) 414 | 4. [ICONIP 2021] **Concordant Contrastive Learning for Semi-supervised Node Classification on Graph** [[paper]](https://link.springer.com/chapter/10.1007/978-3-030-92185-9_48) 415 | 15. [ICCSNT 2021] **Graph Data Augmentation based on Adaptive Graph Convolution for Skeleton-based Action Recognition** [[paper]](https://ieeexplore.ieee.org/abstract/document/9615451) 416 | 77. 
[IJCNN 2021] **Node Embedding using Mutual Information and Self-Supervision based Bi-level Aggregation** [[paper]](https://arxiv.org/abs/2104.13014v1) 417 | 418 | ## Year 2020 419 | 1. [Openreview 2020] **Motif-Driven Contrastive Learning of Graph Representations** [[paper]](https://openreview.net/forum?id=qcKh_Msv1GP) 420 | 15. [Openreview 2020] **SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks** [[paper]](https://openreview.net/forum?id=a5KvtsZ14ev) 421 | 16. [Openreview 2020] **TopoTER: Unsupervised Learning of Topology Transformation Equivariant Representations** [[paper]](https://openreview.net/forum?id=9az9VKjOx00) 422 | 17. [Openreview 2020] **Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks** [[paper]](https://openreview.net/forum?id=hnJSgY7p33a) 423 | 19. [Openreview 2020] **Transfer Learning of Graph Neural Networks with Ego-graph Information Maximization** [[paper]](https://openreview.net/forum?id=J_pvI6ap5Mn) 424 | 1. [Arxiv 2020] **COAD: Contrastive Pre-training with Adversarial Fine-tuning for Zero-shot Expert Linking** [[paper]](https://arxiv.org/abs/2012.11336) [[code]](https://github.com/BoChen-Daniel/Expert-Linking) 425 | 12. [Arxiv 2020] **Distance-wise Graph Contrastive Learning** [[paper]](https://arxiv.org/abs/2012.07437) 426 | 23. :fire:[Arxiv 2020] **Self-supervised Learning on Graphs: Deep Insights and New Direction.** [[paper]](https://arxiv.org/abs/2006.10141) [[code]](https://github.com/ChandlerBang/SelfTask-GNN) 427 | 24. :fire:[Arxiv 2020] **Deep Graph Contrastive Representation Learning** [[paper]](https://arxiv.org/abs/2006.04131) 428 | 29. [Arxiv 2020] **Self-supervised Training of Graph Convolutional Networks.** [[paper]](https://arxiv.org/abs/2006.02380) 429 | 30. [Arxiv 2020] **Self-Supervised Graph Representation Learning via Global Context Prediction.** [[paper]](https://arxiv.org/abs/2003.01604) 430 | 33. 
:fire:[Arxiv 2020] **Graph-Bert: Only Attention is Needed for Learning Graph Representations.** [[paper]](https://arxiv.org/abs/2001.05140) [[code]](https://github.com/anonymous-sourcecode/Graph-Bert) 431 | 20. :fire:[NeurIPS 2020] **Self-Supervised Graph Transformer on Large-Scale Molecular Data** [[paper]](https://drug.ai.tencent.com/publications/GROVER.pdf) 432 | 21. [NeurIPS 2020] **Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs** [[paper]](https://arxiv.org/abs/2007.08294) [[code]](https://github.com/mlvlab/SELAR) 433 | 22. :fire:[NeurIPS 2020] **Graph Contrastive Learning with Augmentations** [[paper]](https://arxiv.org/abs/2010.13902) [[code]](https://github.com/Shen-Lab/GraphCL) 434 | 25. :fire:[ICML 2020] **When Does Self-Supervision Help Graph Convolutional Networks?** [[paper]](https://arxiv.org/abs/2006.09136) [[code]](https://github.com/Shen-Lab/SS-GCNs) 435 | 26. :fire:[ICML 2020] **Graph-based, Self-Supervised Program Repair from Diagnostic Feedback.** [[paper]](https://arxiv.org/abs/2005.10636) 436 | 27. :fire:[ICML 2020] **Contrastive Multi-View Representation Learning on Graphs.** [[paper]](https://arxiv.org/abs/2006.05582) [[code]](https://github.com/kavehhassani/mvgrl) 437 | 28. [ICML 2020 Workshop] **Self-supervised edge features for improved Graph Neural Network training.** [[paper]](https://arxiv.org/abs/2007.04777) 438 | 31. :fire:[KDD 2020] **GPT-GNN: Generative Pre-Training of Graph Neural Networks.** [[pdf]](https://arxiv.org/abs/2006.15437) [[code]](https://github.com/acbull/GPT-GNN) 439 | 32. :fire:[KDD 2020] **GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training.** [[pdf]](https://arxiv.org/abs/2006.09963) [[code]](https://github.com/THUDM/GCC) 440 | 34. :fire:[ICLR 2020] **InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization.** [[paper]](https://arxiv.org/abs/1908.01000) [[code]](https://github.com/fanyun-sun/InfoGraph) 441 | 35. 
:fire:[ICLR 2020] **Strategies for Pre-training Graph Neural Networks.** [[paper]](https://arxiv.org/abs/1905.12265) [[code]](https://github.com/snap-stanford/pretrain-gnns) 442 | 36. :fire:[AAAI 2020] **Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labels.** [[paper]](https://arxiv.org/abs/1902.11038) 443 | 1. [ICDM 2020] **Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning** [[paper]](https://arxiv.org/abs/2009.10273) [[code]](https://github.com/yzjiao/Subg-Con) 444 | 445 | ## Year 2019 446 | 1. [KDD 2019 Workshop] **SGR: Self-Supervised Spectral Graph Representation Learning.** [[paper]](https://arxiv.org/abs/1811.06237) 447 | 1. [ICLR 2019 Workshop] **Can Graph Neural Networks Go "Online"? An Analysis of Pretraining and Inference.** [[paper]](https://arxiv.org/abs/1905.06018) 448 | 1. [ICLR 2019 workshop] **Pre-Training Graph Neural Networks for Generic Structural Feature Extraction.** [[paper]](https://arxiv.org/abs/1905.13728) 449 | 1. [Arxiv 2019] **Heterogeneous Deep Graph Infomax** [[paper]](https://arxiv.org/abs/1911.08538) [[code]](https://github.com/YuxiangRen/Heterogeneous-Deep-Graph-Infomax) 450 | 1. :fire:[ICLR 2019] **Deep Graph Informax.** [[paper]](https://arxiv.org/abs/1809.10341) [[code]](https://github.com/PetarV-/DGI) 451 | 452 | 453 | ## Other related papers 454 | (implicitly using self-supersvied learning or applying graph neural networks in other domains) 455 | 1. [Arxiv 2020] **Self-supervised Learning: Generative or Contrastive.** [[paper]](https://arxiv.org/abs/2006.08218) 456 | 1. [KDD 2020] **Octet: Online Catalog Taxonomy Enrichment with Self-Supervision.** [[paper]](https://arxiv.org/pdf/2006.10276.pdf) 457 | 1. [WWW 2020] **Structural Deep Clustering Network.** [[paper]](https://dl.acm.org/doi/abs/10.1145/3366423.3380214 458 | ) [[code]](https://github.com/bdy9527/SDCN) 459 | 1. 
[IJCAI 2019] **Pre-training of Graph Augmented Transformers for Medication Recommendation.** [[paper]](https://arxiv.org/abs/1906.00346) [[code]](https://github.com/jshang123/G-Bert) 460 | 1. [AAAI 2020] **Unsupervised Attributed Multiplex Network Embedding** [[paper]](https://arxiv.org/abs/1911.06750) [[code]](https://github.com/pcy1302/DMGI) 461 | 1. [WWW 2020] **Graph representation learning via graphical mutual information maximization** [[paper]](https://dl.acm.org/doi/abs/10.1145/3366423.3380112) 462 | 1. [NeurIPS 2017] **Inductive Representation Learning on Large Graphs** [[paper]](https://papers.nips.cc/paper/2017/hash/5dd9db5e033da9c6fb5ba83c7a7ebea9-Abstract.html) [[code]](https://github.com/williamleif/GraphSAGE) 463 | 1. [NeurIPS 2016 Workshop] **Variational Graph Auto-Encoders** [[paper]](https://arxiv.org/abs/1611.07308) [[code]](https://github.com/tkipf/gae) 464 | 1. [WWW 2015] **LINE: Large-scale Information Network Embedding** [[paper]](https://dl.acm.org/doi/abs/10.1145/2736277.2741093) [[code]](https://github.com/tangjianpku/LINE) 465 | 1. [KDD 2014] **DeepWalk: Online Learning of Social Representations** [[paper]](https://dl.acm.org/doi/abs/10.1145/2623330.2623732) [[code]](https://github.com/phanein/deepwalk) 466 | 467 | ## Acknowledgement 468 | 469 | This page is contributed and maintained by [Wei Jin](http://cse.msu.edu/~jinwei2/)(joe.weijin@gmail.com), [Yuning You](https://yyou1996.github.io/)(yuning.you@tamu.edu) and [Yingheng Wang](https://isjakewong.github.io/)(jakewyh@163.com). 
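
The `:fire:` markers in the lists above are produced by `get_hot.py`, which matches titles word by word and flags entries whose citation count exceeds 80. The marking step can be sketched offline as follows. This is a minimal sketch, not the script itself: the helper names `extract_title`, `title_overlap`, and `mark_hot` are hypothetical, the citation count passed in is a made-up number, and the real script obtains counts by querying Google Scholar.

```python
import re

FIRE = ":fire:"


def extract_title(line):
    """Return the bold title of a list entry, or None if the line has none."""
    match = re.search(r"\*\*(.+?)\*\*", line)
    return match.group(1) if match else None


def title_overlap(found, wanted):
    """Fraction of the words in `found` that also occur in `wanted`."""
    def words(text):
        # Same normalization idea as get_hot.py: drop '.' and ':', lowercase.
        return set(text.replace(".", " ").replace(":", " ").lower().split())

    found_words, wanted_words = words(found), words(wanted)
    if not found_words:
        return 0.0
    return len(found_words & wanted_words) / len(found_words)


def mark_hot(line, citations, threshold=80):
    """Insert the marker after the list number when citations exceed threshold."""
    if citations <= threshold or FIRE in line:
        return line  # not hot, or already marked
    match = re.match(r"(\s*\d+\.\s*)", line)
    if match is None:
        return line  # not a numbered list entry
    head = match.group(1)
    return head + FIRE + line[len(head):]


# Entry layout copied from this list; the citation count (100) is illustrative.
entry = ("35. [ICLR 2020] **Strategies for Pre-training Graph Neural Networks.** "
         "[[paper]](https://arxiv.org/abs/1905.12265)")
print(extract_title(entry))
print(mark_hot(entry, citations=100))
```

Checking for an existing marker before inserting keeps the step idempotent, so running it twice over the same file does not stack markers.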

--------------------------------------------------------------------------------
/get_hot.py:
--------------------------------------------------------------------------------
"""Mark hot articles which are extensively cited"""

import re
import time

import numpy as np
from tqdm import tqdm

import scholar


def overlap(s1, s2):
    """Return the fraction of the words in s1 that also appear in s2."""
    s1 = set(replace(s1).split())
    s2 = set(replace(s2).split())
    if not s1:  # avoid dividing by zero on an empty title
        return 0.0
    return len(s1 & s2) / len(s1)


def replace(s0):
    """Normalize a title: turn '.' and ':' into spaces and lowercase it."""
    s0 = s0.replace('.', ' ')
    s0 = s0.replace(':', ' ')
    s0 = s0.lower()
    return s0


def get_citations(paper, verbose=1):
    """Look up a paper title on Google Scholar and return its citation count."""
    def searchScholar(searchphrase, title):
        query = scholar.SearchScholarQuery()
        # query.set_words(searchphrase)
        query.set_words(title)
        querier.send_query(query)
        articles = querier.articles
        try:
            # Reject the top hit if its title differs too much from ours.
            if overlap(articles[0].attrs['title'][0], title) < 0.9:
                return 0
        except Exception:
            # No usable result (e.g. the request was blocked): ask for a retry.
            # set_new_proxy()
            return -1
        return articles[0].attrs['num_citations'][0] or 0

    art = ["-c", "1", "--phrase", paper]
    querier = scholar.ScholarQuerier()
    settings = scholar.ScholarSettings()
    settings.set_citation_format(2)
    querier.apply_settings(settings)
    cites = searchScholar(art, paper)
    while cites == -1:
        time.sleep(2)  # back off before retrying
        cites = searchScholar(art, paper)
    if verbose:
        print(f'{paper}: ', cites)
    return cites


def get_papers(filename):
    """Collect paper titles from the README and the line each one sits on."""
    papers = []
    paper2line = {}
    with open(filename, 'r') as f:
        for num, line in enumerate(f.readlines()):
            if 'Other related papers' in line:  # do not count them
                break
            if '[paper]' in line:
                res = re.findall(r'\*\*.+\*\*', line)[0]
                res = res[2:-2]  # strip the surrounding '**'
                paper2line[len(papers)] = num
                papers.append(res)
                print(res)
    return papers, paper2line


papers, paper2line = get_papers('README.md')
citations = []
for p in tqdm(papers):
    time.sleep(2)  # throttle requests to Google Scholar
    citations.append(get_citations(p))

idx = np.arange(len(citations))
citations = np.array(citations)
hot_id = idx[citations > 80]

with open('README.md', 'r') as f:
    content = f.readlines()

# Keep a backup; lines already end with '\n', so join without a separator.
with open('README_old.md', 'w') as f:
    f.write(''.join(content))

for p in hot_id:
    line = paper2line[p]
    old_content = content[line]
    if ':fire:' in old_content:  # already marked on an earlier run
        continue
    num = re.findall(r'\d+', old_content)[0]  # the number before '.'
    new_content = old_content[:len(num)+2] + ":fire:" + old_content[len(num)+2:]
    content[line] = new_content
with open('README.md', 'w') as f:
    f.write(''.join(content))

--------------------------------------------------------------------------------
/scholar.py:
--------------------------------------------------------------------------------
#! /usr/bin/env python
"""
Copied from https://github.com/ckreibich/scholar.py
"""
"""
This module provides classes for querying Google Scholar and parsing
returned results. It currently *only* processes the first results
page. It is not a recursive crawler.
"""
# ChangeLog
# ---------
#
# 2.11  The Scholar site seems to have become more picky about the
#       number of results requested. The default of 20 in scholar.py
#       could cause HTTP 503 responses. scholar.py now doesn't request
#       a maximum unless you provide it at the command line. (For the
#       time being, you still cannot request more than 20 results.)
#
# 2.10  Merged a fix for the "TypeError: quote_from_bytes()" problem on
#       Python 3.x from hinnefe2.
#
# 2.9   Fixed Unicode problem in certain queries. Thanks to smidm for
#       this contribution.
#
# 2.8   Improved quotation-mark handling for multi-word phrases in
#       queries. Also, log URLs %-decoded in debugging output, for
#       easier interpretation.
#
# 2.7   Ability to extract content excerpts as reported in search results.
#       Also a fix to -s|--some and -n|--none: these did not yet support
#       passing lists of phrases. This now works correctly if you provide
#       separate phrases via commas.
#
# 2.6   Ability to disable inclusion of patents and citations. This
#       has the same effect as unchecking the two patents/citations
#       checkboxes in the Scholar UI, which are checked by default.
#       Accordingly, the command-line options are --no-patents and
#       --no-citations.
#
# 2.5:  Ability to parse global result attributes. This right now means
#       only the total number of results as reported by Scholar at the
#       top of the results pages (e.g. "About 31 results"). Such
#       global result attributes end up in the new attrs member of the
#       used ScholarQuery class. To render those attributes, you need
#       to use the new --txt-globals flag.
#
#       Rendering global results is currently not supported for CSV
#       (as they don't fit the one-line-per-article pattern). For
#       grepping, you can separate the global results from the
#       per-article ones by looking for a line prefix of "[G]":
#
#       $ scholar.py --txt-globals -a "Einstein"
#       [G]    Results 11900
#
#                Title Can quantum-mechanical description of physical reality be considered complete?
#                  URL http://journals.aps.org/pr/abstract/10.1103/PhysRev.47.777
#                 Year 1935
#            Citations 12804
#             Versions 80
#           Cluster ID 8174092782678430881
#       Citations list http://scholar.google.com/scholar?cites=8174092782678430881&as_sdt=2005&sciodt=0,5&hl=en
#        Versions list http://scholar.google.com/scholar?cluster=8174092782678430881&hl=en&as_sdt=0,5
#
# 2.4:  Bugfixes:
#
#       - Correctly handle Unicode characters when reporting results
#         in text format.
#
#       - Correctly parse citation-only (i.e. linkless) results in
#         Google Scholar results.
#
# 2.3:  Additional features:
#
#       - Direct extraction of first PDF version of an article
#
#       - Ability to pull up an article cluster's results directly.
#
#       This is based on work from @aliparsai on GitHub -- thanks!
#
#       - Suppress missing search results (so far shown as "None" in
#         the textual output form).
#
# 2.2:  Added a logging option that reports full HTML contents, for
#       debugging, as well as incrementally more detailed logging via
#       -d up to -dddd.
#
# 2.1:  Additional features:
#
#       - Improved cookie support: the new --cookie-file options
#         allows the reuse of a cookie across invocations of the tool;
#         this allows higher query rates than would otherwise result
#         when invoking scholar.py repeatedly.
#
#       - Workaround: remove the num= URL-encoded argument from parsed
#         URLs. For some reason, Google Scholar decides to propagate
#         the value from the original query into the URLs embedded in
#         the results.
#
# 2.0:  Thorough overhaul of design, with substantial improvements:
#
#       - Full support for advanced search arguments provided by
#         Google Scholar
#
#       - Support for retrieval of external citation formats, such as
#         BibTeX or EndNote
#
#       - Simple logging framework to track activity during execution
#
# 1.7:  Python 3 and BeautifulSoup 4 compatibility, as well as printing
#       of usage info when no options are given. Thanks to Pablo
#       Oliveira (https://github.com/pablooliveira)!
#
#       Also a bunch of pylinting and code cleanups.
#
# 1.6:  Cookie support, from Matej Smid (https://github.com/palmstrom).
#
# 1.5:  A few changes:
#
#       - Tweak suggested by Tobias Isenberg: use unicode during CSV
#         formatting.
#
#       - The option -c|--count now understands numbers up to 100 as
#         well. Likewise suggested by Tobias.
#
#       - By default, text rendering mode is now active. This avoids
#         confusion when playing with the script, as it used to report
#         nothing when the user didn't select an explicit output mode.
#
# 1.4:  Updates to reflect changes in Scholar's page rendering,
#       contributed by Amanda Hay at Tufts -- thanks!
#
# 1.3:  Updates to reflect changes in Scholar's page rendering.
#
# 1.2:  Minor tweaks, mostly thanks to helpful feedback from Dan Bolser.
#       Thanks Dan!
#
# 1.1:  Made author field explicit, added --author option.
#
# Don't complain about missing docstrings: pylint: disable-msg=C0111
#
# Copyright 2010--2017 Christian Kreibich. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
#    1. Redistributions of source code must retain the above copyright
#       notice, this list of conditions and the following disclaimer.
#
#    2. Redistributions in binary form must reproduce the above
#       copyright notice, this list of conditions and the following
#       disclaimer in the documentation and/or other materials provided
#       with the distribution.
#
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED
# WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY DIRECT,
# INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
# STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
# IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.

import optparse
import os
import re
import sys
import warnings

try:
    # Try importing for Python 3
    # pylint: disable-msg=F0401
    # pylint: disable-msg=E0611
    from urllib.request import HTTPCookieProcessor, Request, build_opener
    from urllib.parse import quote, unquote
    from http.cookiejar import MozillaCookieJar
except ImportError:
    # Fallback for Python 2
    from urllib2 import Request, build_opener, HTTPCookieProcessor
    from urllib import quote, unquote
    from cookielib import MozillaCookieJar

# Import BeautifulSoup -- try 4 first, fall back to older
try:
    from bs4 import BeautifulSoup
except ImportError:
    try:
        from BeautifulSoup import BeautifulSoup
    except ImportError:
        print('We need BeautifulSoup, sorry...')
        sys.exit(1)

# Support unicode in both Python 2 and 3. In Python 3, unicode is str.
if sys.version_info[0] == 3:
    unicode = str # pylint: disable-msg=W0622
    encode = lambda s: unicode(s) # pylint: disable-msg=C0103
else:
    def encode(s):
        if isinstance(s, basestring):
            return s.encode('utf-8') # pylint: disable-msg=C0103
        else:
            return str(s)


class Error(Exception):
    """Base class for any Scholar error."""


class FormatError(Error):
    """A query argument or setting was formatted incorrectly."""


class QueryArgumentError(Error):
    """A query did not have a suitable set of arguments."""


class SoupKitchen(object):
    """Factory for creating BeautifulSoup instances."""

    @staticmethod
    def make_soup(markup, parser=None):
        """Factory method returning a BeautifulSoup instance. The created
        instance will use a parser of the given name, if supported by
        the underlying BeautifulSoup instance.
        """
        if 'bs4' in sys.modules:
            # We support parser specification. If the caller didn't
            # specify one, leave it to BeautifulSoup to pick the most
            # suitable one, but suppress the user warning that asks to
            # select the most suitable parser ... which BS then
            # selects anyway.
            if parser is None:
                warnings.filterwarnings('ignore', 'No parser was explicitly specified')
            return BeautifulSoup(markup, parser)

        return BeautifulSoup(markup)

class ScholarConf(object):
    """Helper class for global settings."""

    VERSION = '2.10'
    LOG_LEVEL = 1
    MAX_PAGE_RESULTS = 10 # Current default for per-page results
    SCHOLAR_SITE = 'http://scholar.google.com'

    # USER_AGENT = 'Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.2.9) Gecko/20100913 Firefox/3.6.9'
    # Let's update at this point (3/14):
    USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64; rv:27.0) Gecko/20100101 Firefox/27.0'

    # If set, we will use this file to read/save cookies to enable
    # cookie use across sessions.
    COOKIE_JAR_FILE = None

class ScholarUtils(object):
    """A wrapper for various utensils that come in handy."""

    LOG_LEVELS = {'error': 1,
                  'warn':  2,
                  'info':  3,
                  'debug': 4}

    @staticmethod
    def ensure_int(arg, msg=None):
        try:
            return int(arg)
        except ValueError:
            raise FormatError(msg)

    @staticmethod
    def log(level, msg):
        if level not in ScholarUtils.LOG_LEVELS.keys():
            return
        if ScholarUtils.LOG_LEVELS[level] > ScholarConf.LOG_LEVEL:
            return
        sys.stderr.write('[%5s] %s' % (level.upper(), msg + '\n'))
        sys.stderr.flush()


class ScholarArticle(object):
    """
    A class representing articles listed on Google Scholar. The class
    provides basic dictionary-like behavior.
    """
    def __init__(self):
        # The triplets for each keyword correspond to (1) the actual
        # value, (2) a user-suitable label for the item, and (3) an
        # ordering index:
        self.attrs = {
            'title':         [None, 'Title',          0],
            'url':           [None, 'URL',            1],
            'year':          [None, 'Year',           2],
            'num_citations': [0,    'Citations',      3],
            'num_versions':  [0,    'Versions',       4],
            'cluster_id':    [None, 'Cluster ID',     5],
            'url_pdf':       [None, 'PDF link',       6],
            'url_citations': [None, 'Citations list', 7],
            'url_versions':  [None, 'Versions list',  8],
            'url_citation':  [None, 'Citation link',  9],
            'excerpt':       [None, 'Excerpt',       10],
        }

        # The citation data in one of the standard export formats,
        # e.g. BibTeX.
        self.citation_data = None

    def __getitem__(self, key):
        if key in self.attrs:
            return self.attrs[key][0]
        return None

    def __len__(self):
        return len(self.attrs)

    def __setitem__(self, key, item):
        if key in self.attrs:
            self.attrs[key][0] = item
        else:
            self.attrs[key] = [item, key, len(self.attrs)]

    def __delitem__(self, key):
        if key in self.attrs:
            del self.attrs[key]

    def set_citation_data(self, citation_data):
        self.citation_data = citation_data

    def as_txt(self):
        # Get items sorted in specified order:
        items = sorted(list(self.attrs.values()), key=lambda item: item[2])
        # Find largest label length:
        max_label_len = max([len(str(item[1])) for item in items])
        fmt = '%%%ds %%s' % max_label_len
        res = []
        for item in items:
            if item[0] is not None:
                res.append(fmt % (item[1], item[0]))
        return '\n'.join(res)

    def as_csv(self, header=False, sep='|'):
        # Get keys sorted in specified order:
        keys = [pair[0] for pair in \
                sorted([(key, val[2]) for key, val in list(self.attrs.items())],
                       key=lambda pair: pair[1])]
        res = []
        if header:
            res.append(sep.join(keys))
        res.append(sep.join([unicode(self.attrs[key][0]) for key in keys]))
        return '\n'.join(res)

    def as_citation(self):
        """
        Reports the article in a standard citation format. This works only
        if you have configured the querier to retrieve a particular
        citation export format. (See ScholarSettings.)
        """
        return self.citation_data or ''


class ScholarArticleParser(object):
    """
    ScholarArticleParser can parse HTML document strings obtained from
    Google Scholar. This is a base class; concrete implementations
    adapting to tweaks made by Google over time follow below.
    """
    def __init__(self, site=None):
        self.soup = None
        self.article = None
        self.site = site or ScholarConf.SCHOLAR_SITE
        self.year_re = re.compile(r'\b(?:20|19)\d{2}\b')

    def handle_article(self, art):
        """
        The parser invokes this callback on each article parsed
        successfully. In this base class, the callback does nothing.
        """

    def handle_num_results(self, num_results):
        """
        The parser invokes this callback if it determines the overall
        number of results, as reported on the parsed results page. The
        base class implementation does nothing.
        """

    def parse(self, html):
        """
        This method initiates parsing of HTML content, cleans resulting
        content as needed, and notifies the parser instance of
        resulting instances via the handle_article callback.
        """
        self.soup = SoupKitchen.make_soup(html)

        # This parses any global, non-itemized attributes from the page.
396 | self._parse_globals() 397 | 398 | # Now parse out listed articles: 399 | for div in self.soup.findAll(ScholarArticleParser._tag_results_checker): 400 | self._parse_article(div) 401 | self._clean_article() 402 | if self.article['title']: 403 | self.handle_article(self.article) 404 | 405 | def _clean_article(self): 406 | """ 407 | This gets invoked after we have parsed an article, to do any 408 | needed cleanup/polishing before we hand off the resulting 409 | article. 410 | """ 411 | if self.article['title']: 412 | self.article['title'] = self.article['title'].strip() 413 | 414 | def _parse_globals(self): 415 | tag = self.soup.find(name='div', attrs={'id': 'gs_ab_md'}) 416 | if tag is not None: 417 | raw_text = tag.findAll(text=True) 418 | # raw text is a list because the body contains etc 419 | if raw_text is not None and len(raw_text) > 0: 420 | try: 421 | num_results = raw_text[0].split()[1] 422 | # num_results may now contain commas to separate 423 | # thousands, strip: 424 | num_results = num_results.replace(',', '') 425 | num_results = int(num_results) 426 | self.handle_num_results(num_results) 427 | except (IndexError, ValueError): 428 | pass 429 | 430 | def _parse_article(self, div): 431 | self.article = ScholarArticle() 432 | 433 | for tag in div: 434 | if not hasattr(tag, 'name'): 435 | continue 436 | 437 | if tag.name == 'div' and self._tag_has_class(tag, 'gs_rt') and \ 438 | tag.h3 and tag.h3.a: 439 | self.article['title'] = ''.join(tag.h3.a.findAll(text=True)) 440 | self.article['url'] = self._path2url(tag.h3.a['href']) 441 | if self.article['url'].endswith('.pdf'): 442 | self.article['url_pdf'] = self.article['url'] 443 | 444 | if tag.name == 'font': 445 | for tag2 in tag: 446 | if not hasattr(tag2, 'name'): 447 | continue 448 | if tag2.name == 'span' and \ 449 | self._tag_has_class(tag2, 'gs_fl'): 450 | self._parse_links(tag2) 451 | 452 | def _parse_links(self, span): 453 | for tag in span: 454 | if not hasattr(tag, 'name'): 455 | continue 456 | 
if tag.name != 'a' or tag.get('href') is None: 457 | continue 458 | 459 | if tag.get('href').startswith('/scholar?cites'): 460 | if hasattr(tag, 'string') and tag.string.startswith('Cited by'): 461 | self.article['num_citations'] = \ 462 | self._as_int(tag.string.split()[-1]) 463 | 464 | # Weird Google Scholar behavior here: if the original 465 | # search query came with a number-of-results limit, 466 | # then this limit gets propagated to the URLs embedded 467 | # in the results page as well. Same applies to 468 | # versions URL in next if-block. 469 | self.article['url_citations'] = \ 470 | self._strip_url_arg('num', self._path2url(tag.get('href'))) 471 | 472 | # We can also extract the cluster ID from the versions 473 | # URL. Note that we know that the string contains "?", 474 | # from the above if-statement. 475 | args = self.article['url_citations'].split('?', 1)[1] 476 | for arg in args.split('&'): 477 | if arg.startswith('cites='): 478 | self.article['cluster_id'] = arg[6:] 479 | 480 | if tag.get('href').startswith('/scholar?cluster'): 481 | if hasattr(tag, 'string') and tag.string.startswith('All '): 482 | self.article['num_versions'] = \ 483 | self._as_int(tag.string.split()[1]) 484 | self.article['url_versions'] = \ 485 | self._strip_url_arg('num', self._path2url(tag.get('href'))) 486 | 487 | if tag.getText().startswith('Import'): 488 | self.article['url_citation'] = self._path2url(tag.get('href')) 489 | 490 | 491 | @staticmethod 492 | def _tag_has_class(tag, klass): 493 | """ 494 | This predicate function checks whether a BeatifulSoup Tag instance 495 | has a class attribute. 496 | """ 497 | res = tag.get('class') or [] 498 | if type(res) != list: 499 | # BeautifulSoup 3 can return e.g. 
'gs_md_wp gs_ttss', 500 | # so split -- conveniently produces a list in any case 501 | res = res.split() 502 | return klass in res 503 | 504 | @staticmethod 505 | def _tag_results_checker(tag): 506 | return tag.name == 'div' \ 507 | and ScholarArticleParser._tag_has_class(tag, 'gs_r') 508 | 509 | @staticmethod 510 | def _as_int(obj): 511 | try: 512 | return int(obj) 513 | except ValueError: 514 | return None 515 | 516 | def _path2url(self, path): 517 | """Helper, returns full URL in case path isn't one.""" 518 | if path.startswith(('http://', 'https://')): 519 | return path 520 | if not path.startswith('/'): 521 | path = '/' + path 522 | return self.site + path 523 | 524 | def _strip_url_arg(self, arg, url): 525 | """Helper, removes a URL-encoded argument, if present.""" 526 | parts = url.split('?', 1) 527 | if len(parts) != 2: 528 | return url 529 | res = [] 530 | for part in parts[1].split('&'): 531 | if not part.startswith(arg + '='): 532 | res.append(part) 533 | return parts[0] + '?' + '&'.join(res) 534 | 535 | 536 | class ScholarArticleParser120201(ScholarArticleParser): 537 | """ 538 | This class reflects an update to the Scholar results page layout that 539 | Google made recently. 
540 | """ 541 | def _parse_article(self, div): 542 | self.article = ScholarArticle() 543 | 544 | for tag in div: 545 | if not hasattr(tag, 'name'): 546 | continue 547 | 548 | if tag.name == 'h3' and self._tag_has_class(tag, 'gs_rt') and tag.a: 549 | self.article['title'] = ''.join(tag.a.findAll(text=True)) 550 | self.article['url'] = self._path2url(tag.a['href']) 551 | if self.article['url'].endswith('.pdf'): 552 | self.article['url_pdf'] = self.article['url'] 553 | 554 | if tag.name == 'div' and self._tag_has_class(tag, 'gs_a'): 555 | year = self.year_re.findall(tag.text) 556 | self.article['year'] = year[0] if len(year) > 0 else None 557 | 558 | if tag.name == 'div' and self._tag_has_class(tag, 'gs_fl'): 559 | self._parse_links(tag) 560 | 561 | 562 | class ScholarArticleParser120726(ScholarArticleParser): 563 | """ 564 | This class reflects an update to the Scholar results page layout that 565 | Google made 07/26/12. 566 | """ 567 | def _parse_article(self, div): 568 | self.article = ScholarArticle() 569 | 570 | for tag in div: 571 | if not hasattr(tag, 'name'): 572 | continue 573 | if '.pdf' in str(tag).lower(): 574 | try: 575 | if tag.find('div', {'class': 'gs_ttss'}): 576 | self._parse_links(tag.find('div', {'class': 'gs_ttss'})) 577 | except Exception: 578 | pass 579 | 580 | if tag.name == 'div' and self._tag_has_class(tag, 'gs_ri'): 581 | # There are (at least) two formats here. In the first 582 | # one, we have a link, e.g.: 583 | # 584 | #

585 | # 586 | # Honeycomb: creating intrusion detection signatures using 587 | # honeypots 588 | # 589 | #

590 | # 591 | # In the other, there's no actual link -- it's what 592 | # Scholar renders as "CITATION" in the HTML: 593 | # 594 | #

595 | # 596 | # [CITATION] 597 | # [C] 598 | # 599 | # Honeycomb automated ids signature creation using honeypots 600 | #

601 | # 602 | # We now distinguish the two. 603 | try: 604 | atag = tag.h3.a 605 | self.article['title'] = ''.join(atag.findAll(text=True)) 606 | self.article['url'] = self._path2url(atag['href']) 607 | if self.article['url'].endswith('.pdf'): 608 | self.article['url_pdf'] = self.article['url'] 609 | except: 610 | # Remove a few spans that have unneeded content (e.g. [CITATION]) 611 | for span in tag.h3.findAll(name='span'): 612 | span.clear() 613 | self.article['title'] = ''.join(tag.h3.findAll(text=True)) 614 | 615 | if tag.find('div', {'class': 'gs_a'}): 616 | year = self.year_re.findall(tag.find('div', {'class': 'gs_a'}).text) 617 | self.article['year'] = year[0] if len(year) > 0 else None 618 | 619 | if tag.find('div', {'class': 'gs_fl'}): 620 | self._parse_links(tag.find('div', {'class': 'gs_fl'})) 621 | 622 | if tag.find('div', {'class': 'gs_rs'}): 623 | # These are the content excerpts rendered into the results. 624 | raw_text = tag.find('div', {'class': 'gs_rs'}).findAll(text=True) 625 | if len(raw_text) > 0: 626 | raw_text = ''.join(raw_text) 627 | raw_text = raw_text.replace('\n', '') 628 | self.article['excerpt'] = raw_text 629 | 630 | 631 | class ScholarQuery(object): 632 | """ 633 | The base class for any kind of results query we send to Scholar. 634 | """ 635 | def __init__(self): 636 | self.url = None 637 | 638 | # The number of results requested from Scholar -- not the 639 | # total number of results it reports (the latter gets stored 640 | # in attrs, see below). 641 | self.num_results = None 642 | 643 | # Queries may have global result attributes, similar to 644 | # per-article attributes in ScholarArticle. 
The exact set of 645 | # attributes may differ by query type, but they all share the 646 | # basic data structure: 647 | self.attrs = {} 648 | 649 | def set_num_page_results(self, num_page_results): 650 | self.num_results = ScholarUtils.ensure_int( 651 | num_page_results, 652 | 'maximum number of results on page must be numeric') 653 | 654 | def get_url(self): 655 | """ 656 | Returns a complete, submittable URL string for this particular 657 | query instance. The URL and its arguments will vary depending 658 | on the query. 659 | """ 660 | return None 661 | 662 | def _add_attribute_type(self, key, label, default_value=None): 663 | """ 664 | Adds a new type of attribute to the list of attributes 665 | understood by this query. Meant to be used by the constructors 666 | in derived classes. 667 | """ 668 | if len(self.attrs) == 0: 669 | self.attrs[key] = [default_value, label, 0] 670 | return 671 | idx = max([item[2] for item in self.attrs.values()]) + 1 672 | self.attrs[key] = [default_value, label, idx] 673 | 674 | def __getitem__(self, key): 675 | """Getter for attribute value. Returns None if no such key.""" 676 | if key in self.attrs: 677 | return self.attrs[key][0] 678 | return None 679 | 680 | def __setitem__(self, key, item): 681 | """Setter for attribute value. Does nothing if no such key.""" 682 | if key in self.attrs: 683 | self.attrs[key][0] = item 684 | 685 | def _parenthesize_phrases(self, query): 686 | """ 687 | Turns a query string containing comma-separated phrases into a 688 | space-separated list of tokens, quoted if containing 689 | whitespace. For example, input 690 | 691 | 'some words, foo, bar' 692 | 693 | becomes 694 | 695 | '"some words" foo bar' 696 | 697 | This comes in handy during the composition of certain queries. 
698 | """ 699 | if query.find(',') < 0: 700 | return query 701 | phrases = [] 702 | for phrase in query.split(','): 703 | phrase = phrase.strip() 704 | if phrase.find(' ') > 0: 705 | phrase = '"' + phrase + '"' 706 | phrases.append(phrase) 707 | return ' '.join(phrases) 708 | 709 | 710 | class ClusterScholarQuery(ScholarQuery): 711 | """ 712 | This version just pulls up an article cluster whose ID we already 713 | know about. 714 | """ 715 | SCHOLAR_CLUSTER_URL = ScholarConf.SCHOLAR_SITE + '/scholar?' \ 716 | + 'cluster=%(cluster)s' \ 717 | + '%(num)s' 718 | 719 | def __init__(self, cluster=None): 720 | ScholarQuery.__init__(self) 721 | self._add_attribute_type('num_results', 'Results', 0) 722 | self.cluster = None 723 | self.set_cluster(cluster) 724 | 725 | def set_cluster(self, cluster): 726 | """ 727 | Sets search to a Google Scholar results cluster ID. 728 | """ 729 | msg = 'cluster ID must be numeric' 730 | self.cluster = ScholarUtils.ensure_int(cluster, msg) 731 | 732 | def get_url(self): 733 | if self.cluster is None: 734 | raise QueryArgumentError('cluster query needs cluster ID') 735 | 736 | urlargs = {'cluster': self.cluster } 737 | 738 | for key, val in urlargs.items(): 739 | urlargs[key] = quote(encode(val)) 740 | 741 | # The following URL arguments must not be quoted, or the 742 | # server will not recognize them: 743 | urlargs['num'] = ('&num=%d' % self.num_results 744 | if self.num_results is not None else '') 745 | 746 | return self.SCHOLAR_CLUSTER_URL % urlargs 747 | 748 | 749 | class SearchScholarQuery(ScholarQuery): 750 | """ 751 | This version represents the search query parameters the user can 752 | configure on the Scholar website, in the advanced search options. 753 | """ 754 | SCHOLAR_QUERY_URL = ScholarConf.SCHOLAR_SITE + '/scholar?' 
\ 755 | + 'as_q=%(words)s' \ 756 | + '&as_epq=%(phrase)s' \ 757 | + '&as_oq=%(words_some)s' \ 758 | + '&as_eq=%(words_none)s' \ 759 | + '&as_occt=%(scope)s' \ 760 | + '&as_sauthors=%(authors)s' \ 761 | + '&as_publication=%(pub)s' \ 762 | + '&as_ylo=%(ylo)s' \ 763 | + '&as_yhi=%(yhi)s' \ 764 | + '&as_vis=%(citations)s' \ 765 | + '&btnG=&hl=en' \ 766 | + '%(num)s' \ 767 | + '&as_sdt=%(patents)s%%2C5' 768 | 769 | def __init__(self): 770 | ScholarQuery.__init__(self) 771 | self._add_attribute_type('num_results', 'Results', 0) 772 | self.words = None # The default search behavior 773 | self.words_some = None # At least one of those words 774 | self.words_none = None # None of these words 775 | self.phrase = None 776 | self.scope_title = False # If True, search in title only 777 | self.author = None 778 | self.pub = None 779 | self.timeframe = [None, None] 780 | self.include_patents = True 781 | self.include_citations = True 782 | 783 | def set_words(self, words): 784 | """Sets words that *all* must be found in the result.""" 785 | self.words = words 786 | 787 | def set_words_some(self, words): 788 | """Sets words of which *at least one* must be found in result.""" 789 | self.words_some = words 790 | 791 | def set_words_none(self, words): 792 | """Sets words of which *none* must be found in the result.""" 793 | self.words_none = words 794 | 795 | def set_phrase(self, phrase): 796 | """Sets phrase that must be found in the result exactly.""" 797 | self.phrase = phrase 798 | 799 | def set_scope(self, title_only): 800 | """ 801 | Sets Boolean indicating whether to search entire article or title 802 | only. 
803 | """ 804 | self.scope_title = title_only 805 | 806 | def set_author(self, author): 807 | """Sets names that must be on the result's author list.""" 808 | self.author = author 809 | 810 | def set_pub(self, pub): 811 | """Sets the publication in which the result must be found.""" 812 | self.pub = pub 813 | 814 | def set_timeframe(self, start=None, end=None): 815 | """ 816 | Sets timeframe (in years as integer) in which result must have 817 | appeared. It's fine to specify just start or end, or both. 818 | """ 819 | if start: 820 | start = ScholarUtils.ensure_int(start) 821 | if end: 822 | end = ScholarUtils.ensure_int(end) 823 | self.timeframe = [start, end] 824 | 825 | def set_include_citations(self, yesorno): 826 | self.include_citations = yesorno 827 | 828 | def set_include_patents(self, yesorno): 829 | self.include_patents = yesorno 830 | 831 | def get_url(self): 832 | if self.words is None and self.words_some is None \ 833 | and self.words_none is None and self.phrase is None \ 834 | and self.author is None and self.pub is None \ 835 | and self.timeframe[0] is None and self.timeframe[1] is None: 836 | raise QueryArgumentError('search query needs more parameters') 837 | 838 | # If we have some-words or none-words lists, we need to 839 | # process them so GS understands them. For simple 840 | # space-separated word lists, there's nothing to do. For lists 841 | # of phrases we have to ensure quotations around the phrases, 842 | # separating them by whitespace. 
843 | words_some = None 844 | words_none = None 845 | 846 | if self.words_some: 847 | words_some = self._parenthesize_phrases(self.words_some) 848 | if self.words_none: 849 | words_none = self._parenthesize_phrases(self.words_none) 850 | 851 | urlargs = {'words': self.words or '', 852 | 'words_some': words_some or '', 853 | 'words_none': words_none or '', 854 | 'phrase': self.phrase or '', 855 | 'scope': 'title' if self.scope_title else 'any', 856 | 'authors': self.author or '', 857 | 'pub': self.pub or '', 858 | 'ylo': self.timeframe[0] or '', 859 | 'yhi': self.timeframe[1] or '', 860 | 'patents': '0' if self.include_patents else '1', 861 | 'citations': '0' if self.include_citations else '1'} 862 | 863 | for key, val in urlargs.items(): 864 | urlargs[key] = quote(encode(val)) 865 | 866 | # The following URL arguments must not be quoted, or the 867 | # server will not recognize them: 868 | urlargs['num'] = ('&num=%d' % self.num_results 869 | if self.num_results is not None else '') 870 | 871 | return self.SCHOLAR_QUERY_URL % urlargs 872 | 873 | 874 | class ScholarSettings(object): 875 | """ 876 | This class lets you adjust the Scholar settings for your 877 | session. It's intended to mirror the features tunable in the 878 | Scholar Settings pane, but right now it's a bit basic. 
879 | """ 880 | CITFORM_NONE = 0 881 | CITFORM_REFWORKS = 1 882 | CITFORM_REFMAN = 2 883 | CITFORM_ENDNOTE = 3 884 | CITFORM_BIBTEX = 4 885 | 886 | def __init__(self): 887 | self.citform = 0 # Citation format, default none 888 | self.per_page_results = None 889 | self._is_configured = False 890 | 891 | def set_citation_format(self, citform): 892 | citform = ScholarUtils.ensure_int(citform) 893 | if citform < 0 or citform > self.CITFORM_BIBTEX: 894 | raise FormatError('citation format invalid, is "%s"' 895 | % citform) 896 | self.citform = citform 897 | self._is_configured = True 898 | 899 | def set_per_page_results(self, per_page_results): 900 | self.per_page_results = ScholarUtils.ensure_int( 901 | per_page_results, 'page results must be integer') 902 | self.per_page_results = min( 903 | self.per_page_results, ScholarConf.MAX_PAGE_RESULTS) 904 | self._is_configured = True 905 | 906 | def is_configured(self): 907 | return self._is_configured 908 | 909 | 910 | class ScholarQuerier(object): 911 | """ 912 | ScholarQuerier instances can conduct a search on Google Scholar 913 | with subsequent parsing of the resulting HTML content. The 914 | articles found are collected in the articles member, a list of 915 | ScholarArticle instances. 916 | """ 917 | 918 | # Default URLs for visiting and submitting Settings pane, as of 3/14 919 | GET_SETTINGS_URL = ScholarConf.SCHOLAR_SITE + '/scholar_settings?' \ 920 | + 'sciifh=1&hl=en&as_sdt=0,5' 921 | 922 | SET_SETTINGS_URL = ScholarConf.SCHOLAR_SITE + '/scholar_setprefs?' 
\ 923 | + 'q=' \ 924 | + '&scisig=%(scisig)s' \ 925 | + '&inststart=0' \ 926 | + '&as_sdt=1,5' \ 927 | + '&as_sdtp=' \ 928 | + '&num=%(num)s' \ 929 | + '&scis=%(scis)s' \ 930 | + '%(scisf)s' \ 931 | + '&hl=en&lang=all&instq=&inst=569367360547434339&save=' 932 | 933 | # Older URLs: 934 | # ScholarConf.SCHOLAR_SITE + '/scholar?q=%s&hl=en&btnG=Search&as_sdt=2001&as_sdtp=on 935 | 936 | class Parser(ScholarArticleParser120726): 937 | def __init__(self, querier): 938 | ScholarArticleParser120726.__init__(self) 939 | self.querier = querier 940 | 941 | def handle_num_results(self, num_results): 942 | if self.querier is not None and self.querier.query is not None: 943 | self.querier.query['num_results'] = num_results 944 | 945 | def handle_article(self, art): 946 | self.querier.add_article(art) 947 | 948 | def __init__(self): 949 | self.articles = [] 950 | self.query = None 951 | self.cjar = MozillaCookieJar() 952 | 953 | # If we have a cookie file, load it: 954 | if ScholarConf.COOKIE_JAR_FILE and \ 955 | os.path.exists(ScholarConf.COOKIE_JAR_FILE): 956 | try: 957 | self.cjar.load(ScholarConf.COOKIE_JAR_FILE, 958 | ignore_discard=True) 959 | ScholarUtils.log('info', 'loaded cookies file') 960 | except Exception as msg: 961 | ScholarUtils.log('warn', 'could not load cookies file: %s' % msg) 962 | self.cjar = MozillaCookieJar() # Just to be safe 963 | 964 | self.opener = build_opener(HTTPCookieProcessor(self.cjar)) 965 | self.settings = None # Last settings object, if any 966 | 967 | def apply_settings(self, settings): 968 | """ 969 | Applies settings as provided by a ScholarSettings instance. 970 | """ 971 | if settings is None or not settings.is_configured(): 972 | return True 973 | 974 | self.settings = settings 975 | 976 | # This is a bit of work. We need to actually retrieve the 977 | # contents of the Settings pane HTML in order to extract 978 | # hidden fields before we can compose the query for updating 979 | # the settings. 
980 | html = self._get_http_response(url=self.GET_SETTINGS_URL, 981 | log_msg='dump of settings form HTML', 982 | err_msg='requesting settings failed') 983 | if html is None: 984 | return False 985 | 986 | # Now parse the required stuff out of the form. We require the 987 | # "scisig" token to make the upload of our settings acceptable 988 | # to Google. 989 | soup = SoupKitchen.make_soup(html) 990 | 991 | tag = soup.find(name='form', attrs={'id': 'gs_settings_form'}) 992 | if tag is None: 993 | ScholarUtils.log('info', 'parsing settings failed: no form') 994 | return False 995 | 996 | tag = tag.find('input', attrs={'type':'hidden', 'name':'scisig'}) 997 | if tag is None: 998 | ScholarUtils.log('info', 'parsing settings failed: scisig') 999 | return False 1000 | 1001 | urlargs = {'scisig': tag['value'], 1002 | 'num': settings.per_page_results, 1003 | 'scis': 'no', 1004 | 'scisf': ''} 1005 | 1006 | if settings.citform != 0: 1007 | urlargs['scis'] = 'yes' 1008 | urlargs['scisf'] = '&scisf=%d' % settings.citform 1009 | 1010 | html = self._get_http_response(url=self.SET_SETTINGS_URL % urlargs, 1011 | log_msg='dump of settings result HTML', 1012 | err_msg='applying settings failed') 1013 | if html is None: 1014 | return False 1015 | 1016 | ScholarUtils.log('info', 'settings applied') 1017 | return True 1018 | 1019 | def send_query(self, query): 1020 | """ 1021 | This method initiates a search query (a ScholarQuery instance) 1022 | with subsequent parsing of the response. 1023 | """ 1024 | self.clear_articles() 1025 | self.query = query 1026 | 1027 | html = self._get_http_response(url=query.get_url(), 1028 | log_msg='dump of query response HTML', 1029 | err_msg='results retrieval failed') 1030 | if html is None: 1031 | return 1032 | 1033 | self.parse(html) 1034 | 1035 | def get_citation_data(self, article): 1036 | """ 1037 | Given an article, retrieves citation link. 
Note, this requires that 1038 | you adjusted the settings to tell Google Scholar to actually 1039 | provide this information, *prior* to retrieving the article. 1040 | """ 1041 | if article['url_citation'] is None: 1042 | return False 1043 | if article.citation_data is not None: 1044 | return True 1045 | 1046 | ScholarUtils.log('info', 'retrieving citation export data') 1047 | data = self._get_http_response(url=article['url_citation'], 1048 | log_msg='citation data response', 1049 | err_msg='requesting citation data failed') 1050 | if data is None: 1051 | return False 1052 | 1053 | article.set_citation_data(data) 1054 | return True 1055 | 1056 | def parse(self, html): 1057 | """ 1058 | This method allows parsing of provided HTML content. 1059 | """ 1060 | parser = self.Parser(self) 1061 | parser.parse(html) 1062 | 1063 | def add_article(self, art): 1064 | self.get_citation_data(art) 1065 | self.articles.append(art) 1066 | 1067 | def clear_articles(self): 1068 | """Clears any existing articles stored from previous queries.""" 1069 | self.articles = [] 1070 | 1071 | def save_cookies(self): 1072 | """ 1073 | This stores the latest cookies we're using to disk, for reuse in a 1074 | later session. 1075 | """ 1076 | if ScholarConf.COOKIE_JAR_FILE is None: 1077 | return False 1078 | try: 1079 | self.cjar.save(ScholarConf.COOKIE_JAR_FILE, 1080 | ignore_discard=True) 1081 | ScholarUtils.log('info', 'saved cookies file') 1082 | return True 1083 | except Exception as msg: 1084 | ScholarUtils.log('warn', 'could not save cookies file: %s' % msg) 1085 | return False 1086 | 1087 | def _get_http_response(self, url, log_msg=None, err_msg=None): 1088 | """ 1089 | Helper method, sends HTTP request and returns response payload. 
1090 | """ 1091 | if log_msg is None: 1092 | log_msg = 'HTTP response data follow' 1093 | if err_msg is None: 1094 | err_msg = 'request failed' 1095 | try: 1096 | ScholarUtils.log('info', 'requesting %s' % unquote(url)) 1097 | 1098 | req = Request(url=url, headers={'User-Agent': ScholarConf.USER_AGENT}) 1099 | hdl = self.opener.open(req) 1100 | html = hdl.read() 1101 | 1102 | ScholarUtils.log('debug', log_msg) 1103 | ScholarUtils.log('debug', '>>>>' + '-'*68) 1104 | ScholarUtils.log('debug', 'url: %s' % hdl.geturl()) 1105 | ScholarUtils.log('debug', 'result: %s' % hdl.getcode()) 1106 | ScholarUtils.log('debug', 'headers:\n' + str(hdl.info())) 1107 | ScholarUtils.log('debug', 'data:\n' + html.decode('utf-8')) # For Python 3 1108 | ScholarUtils.log('debug', '<<<<' + '-'*68) 1109 | 1110 | return html 1111 | except Exception as err: 1112 | ScholarUtils.log('info', err_msg + ': %s' % err) 1113 | return None 1114 | 1115 | 1116 | def txt(querier, with_globals): 1117 | if with_globals: 1118 | # If we have any articles, check their attribute labels to get 1119 | # the maximum length -- makes for nicer alignment. 
1120 | max_label_len = 0 1121 | if len(querier.articles) > 0: 1122 | items = sorted(list(querier.articles[0].attrs.values()), 1123 | key=lambda item: item[2]) 1124 | max_label_len = max([len(str(item[1])) for item in items]) 1125 | 1126 | # Get items sorted in specified order: 1127 | items = sorted(list(querier.query.attrs.values()), key=lambda item: item[2]) 1128 | # Find largest label length: 1129 | max_label_len = max([len(str(item[1])) for item in items] + [max_label_len]) 1130 | fmt = '[G] %%%ds %%s' % max(0, max_label_len-4) 1131 | for item in items: 1132 | if item[0] is not None: 1133 | print(fmt % (item[1], item[0])) 1134 | if len(items) > 0: 1135 | print('') 1136 | 1137 | articles = querier.articles 1138 | for art in articles: 1139 | print(encode(art.as_txt()) + '\n') 1140 | 1141 | def csv(querier, header=False, sep='|'): 1142 | articles = querier.articles 1143 | for art in articles: 1144 | result = art.as_csv(header=header, sep=sep) 1145 | print(encode(result)) 1146 | header = False 1147 | 1148 | def citation_export(querier): 1149 | articles = querier.articles 1150 | for art in articles: 1151 | print(art.as_citation() + '\n') 1152 | 1153 | 1154 | def main(): 1155 | usage = """scholar.py [options] 1156 | A command-line interface to Google Scholar. 
1157 | 1158 | Examples: 1159 | 1160 | # Retrieve one article written by Einstein on quantum theory: 1161 | scholar.py -c 1 --author "albert einstein" --phrase "quantum theory" 1162 | 1163 | # Retrieve a BibTeX entry for that quantum theory paper: 1164 | scholar.py -c 1 -C 17749203648027613321 --citation bt 1165 | 1166 | # Retrieve five articles written by Einstein after 1970 where the title 1167 | # does not contain the words "quantum" and "theory": 1168 | scholar.py -c 5 -a "albert einstein" -t --none "quantum theory" --after 1970""" 1169 | 1170 | fmt = optparse.IndentedHelpFormatter(max_help_position=50, width=100) 1171 | parser = optparse.OptionParser(usage=usage, formatter=fmt) 1172 | group = optparse.OptionGroup(parser, 'Query arguments', 1173 | 'These options define search query arguments and parameters.') 1174 | group.add_option('-a', '--author', metavar='AUTHORS', default=None, 1175 | help='Author name(s)') 1176 | group.add_option('-A', '--all', metavar='WORDS', default=None, dest='allw', 1177 | help='Results must contain all of these words') 1178 | group.add_option('-s', '--some', metavar='WORDS', default=None, 1179 | help='Results must contain at least one of these words. Pass arguments in form -s "foo bar baz" for simple words, and -s "a phrase, another phrase" for phrases') 1180 | group.add_option('-n', '--none', metavar='WORDS', default=None, 1181 | help='Results must contain none of these words. See -s|--some re. 
formatting') 1182 | group.add_option('-p', '--phrase', metavar='PHRASE', default=None, 1183 | help='Results must contain exact phrase') 1184 | group.add_option('-t', '--title-only', action='store_true', default=False, 1185 | help='Search title only') 1186 | group.add_option('-P', '--pub', metavar='PUBLICATIONS', default=None, 1187 | help='Results must have appeared in this publication') 1188 | group.add_option('--after', metavar='YEAR', default=None, 1189 | help='Results must have appeared in or after given year') 1190 | group.add_option('--before', metavar='YEAR', default=None, 1191 | help='Results must have appeared in or before given year') 1192 | group.add_option('--no-patents', action='store_true', default=False, 1193 | help='Do not include patents in results') 1194 | group.add_option('--no-citations', action='store_true', default=False, 1195 | help='Do not include citations in results') 1196 | group.add_option('-C', '--cluster-id', metavar='CLUSTER_ID', default=None, 1197 | help='Do not search, just use articles in given cluster ID') 1198 | group.add_option('-c', '--count', type='int', default=None, 1199 | help='Maximum number of results') 1200 | parser.add_option_group(group) 1201 | 1202 | group = optparse.OptionGroup(parser, 'Output format', 1203 | 'These options control the appearance of the results.') 1204 | group.add_option('--txt', action='store_true', 1205 | help='Print article data in text format (default)') 1206 | group.add_option('--txt-globals', action='store_true', 1207 | help='Like --txt, but first print global results too') 1208 | group.add_option('--csv', action='store_true', 1209 | help='Print article data in CSV form (separator is "|")') 1210 | group.add_option('--csv-header', action='store_true', 1211 | help='Like --csv, but print header with column names') 1212 | group.add_option('--citation', metavar='FORMAT', default=None, 1213 | help='Print article details in standard citation format. 
Argument must be one of "bt" (BibTeX), "en" (EndNote), "rm" (RefMan), or "rw" (RefWorks).') 1214 | parser.add_option_group(group) 1215 | 1216 | group = optparse.OptionGroup(parser, 'Miscellaneous') 1217 | group.add_option('--cookie-file', metavar='FILE', default=None, 1218 | help='File to use for cookie storage. If given, will read any existing cookies if found at startup, and save resulting cookies at the end.') 1219 | group.add_option('-d', '--debug', action='count', default=0, 1220 | help='Enable verbose logging to stderr. Repeated options increase detail of debug output.') 1221 | group.add_option('-v', '--version', action='store_true', default=False, 1222 | help='Show version information') 1223 | parser.add_option_group(group) 1224 | 1225 | options, _ = parser.parse_args() 1226 | 1227 | # Show help if no command-line arguments were given 1228 | if len(sys.argv) == 1: 1229 | parser.print_help() 1230 | return 1 1231 | 1232 | if options.debug > 0: 1233 | options.debug = min(options.debug, ScholarUtils.LOG_LEVELS['debug']) 1234 | ScholarConf.LOG_LEVEL = options.debug 1235 | ScholarUtils.log('info', 'using log level %d' % ScholarConf.LOG_LEVEL) 1236 | 1237 | if options.version: 1238 | print('This is scholar.py %s.' 
% ScholarConf.VERSION) 1239 | return 0 1240 | 1241 | if options.cookie_file: 1242 | ScholarConf.COOKIE_JAR_FILE = options.cookie_file 1243 | 1244 | # Sanity-check the options: if they include a cluster ID query, it 1245 | # makes no sense to have search arguments: 1246 | if options.cluster_id is not None: 1247 | if options.author or options.allw or options.some or options.none \ 1248 | or options.phrase or options.title_only or options.pub \ 1249 | or options.after or options.before: 1250 | print('Cluster ID queries do not allow additional search arguments.') 1251 | return 1 1252 | 1253 | querier = ScholarQuerier() 1254 | settings = ScholarSettings() 1255 | 1256 | if options.citation == 'bt': 1257 | settings.set_citation_format(ScholarSettings.CITFORM_BIBTEX) 1258 | elif options.citation == 'en': 1259 | settings.set_citation_format(ScholarSettings.CITFORM_ENDNOTE) 1260 | elif options.citation == 'rm': 1261 | settings.set_citation_format(ScholarSettings.CITFORM_REFMAN) 1262 | elif options.citation == 'rw': 1263 | settings.set_citation_format(ScholarSettings.CITFORM_REFWORKS) 1264 | elif options.citation is not None: 1265 | print('Invalid citation link format, must be one of "bt", "en", "rm", or "rw".') 1266 | return 1 1267 | 1268 | querier.apply_settings(settings) 1269 | 1270 | if options.cluster_id: 1271 | query = ClusterScholarQuery(cluster=options.cluster_id) 1272 | else: 1273 | query = SearchScholarQuery() 1274 | if options.author: 1275 | query.set_author(options.author) 1276 | if options.allw: 1277 | query.set_words(options.allw) 1278 | if options.some: 1279 | query.set_words_some(options.some) 1280 | if options.none: 1281 | query.set_words_none(options.none) 1282 | if options.phrase: 1283 | query.set_phrase(options.phrase) 1284 | if options.title_only: 1285 | query.set_scope(True) 1286 | if options.pub: 1287 | query.set_pub(options.pub) 1288 | if options.after or options.before: 1289 | query.set_timeframe(options.after, options.before) 1290 | if 
options.no_patents: 1291 | query.set_include_patents(False) 1292 | if options.no_citations: 1293 | query.set_include_citations(False) 1294 | 1295 | if options.count is not None: 1296 | options.count = min(options.count, ScholarConf.MAX_PAGE_RESULTS) 1297 | query.set_num_page_results(options.count) 1298 | 1299 | querier.send_query(query) 1300 | 1301 | if options.csv: 1302 | csv(querier) 1303 | elif options.csv_header: 1304 | csv(querier, header=True) 1305 | elif options.citation is not None: 1306 | citation_export(querier) 1307 | else: 1308 | txt(querier, with_globals=options.txt_globals) 1309 | 1310 | if options.cookie_file: 1311 | querier.save_cookies() 1312 | 1313 | return 0 1314 | 1315 | if __name__ == "__main__": 1316 | sys.exit(main()) 1317 | --------------------------------------------------------------------------------
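The two string helpers in `scholar.py` that shape query URLs (`ScholarQuery._parenthesize_phrases` and `ScholarArticleParser._strip_url_arg`) can be tried out in isolation. The sketch below is a standalone approximation, not the module's own code: the function names `quote_phrases` and `strip_url_arg` and the `example.com` URL are made up for illustration, and `urllib.parse.quote` stands in for the `quote(encode(...))` call the module applies before building the URL.

```python
from urllib.parse import quote


def quote_phrases(query):
    """Hypothetical standalone version of ScholarQuery._parenthesize_phrases.

    Comma-separated phrases become space-separated tokens; a phrase that
    contains whitespace is wrapped in double quotes.
    """
    if ',' not in query:
        return query
    phrases = []
    for phrase in query.split(','):
        phrase = phrase.strip()
        if ' ' in phrase:
            phrase = '"' + phrase + '"'
        phrases.append(phrase)
    return ' '.join(phrases)


def strip_url_arg(arg, url):
    """Hypothetical standalone version of ScholarArticleParser._strip_url_arg.

    Drops every 'arg=...' pair from the query string and leaves the rest
    of the URL untouched.
    """
    parts = url.split('?', 1)
    if len(parts) != 2:
        return url
    kept = [p for p in parts[1].split('&') if not p.startswith(arg + '=')]
    return parts[0] + '?' + '&'.join(kept)


if __name__ == '__main__':
    tokens = quote_phrases('some words, foo, bar')
    print(tokens)         # the phrase with a space is now quoted
    print(quote(tokens))  # percent-encoded form, as it would appear in a URL
    # example.com is a placeholder host, not a Scholar endpoint
    print(strip_url_arg('num', 'http://example.com/scholar?cites=1&num=5&hl=en'))
```

Stripping `num` matters because, as the comment in `_parse_links` notes, Scholar propagates the result-count limit of the original query into the citation and version URLs embedded in the results page.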