├── README.md
├── get_hot.py
└── scholar.py


/README.md:
--------------------------------------------------------------------------------
  1 | # awesome-self-supervised-gnn
  2 |  
  3 |  ![PRs Welcome](https://img.shields.io/badge/PRs-Welcome-green)  [![Awesome](https://awesome.re/badge.svg)](https://awesome.re) ![Stars](https://img.shields.io/github/stars/ChandlerBang/awesome-self-supervised-gnn?color=yellow)  ![Forks](https://img.shields.io/github/forks/ChandlerBang/awesome-self-supervised-gnn?color=blue&label=Fork)
  4 |  
  5 |  This repository contains a list of papers on **Self-supervised Learning on Graph Neural Networks (GNNs)**, categorized by publication year.
  6 |  
  7 |  We try to keep this list up to date. If you find an error or a missing paper, please don't hesitate to open an issue or a pull request.
  8 |  
  9 |  Note: :fire: indicates that a paper is extensively cited (e.g., > 80 citations). The code used to identify these papers is provided in `get_hot.py`.
 10 | 
 11 | ## Year 2024
 12 | 1. [ICASSP 2024] **Contrastive Deep Nonnegative Matrix Factorization for Community Detection** [[paper]](https://arxiv.org/abs/2311.02357) [[code]](https://github.com/6lyc/CDNMF)
 13 |    
 14 | ## Year 2023
 15 | 1. [ICLR 2023] **Empowering Graph Representation Learning with Test-Time Graph Transformation** [[paper]](https://openreview.net/forum?id=Lnxl5pr018) [[code]](https://github.com/ChandlerBang/GTrans)
 16 | 1. [ICLR 2023] **Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization** [[paper]](https://arxiv.org/pdf/2210.02016.pdf) [[code]](https://github.com/jumxglhf/ParetoGNN)
 17 | 1. [AAAI 2023] **Eliciting Structural and Semantic Global Knowledge in Unsupervised Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2202.08480) [[code]](https://github.com/kaize0409/S-3-CL)
 18 | 1. [arXiv 2023] **Truncated Affinity Maximization: One-class Homophily Modeling for Graph Anomaly Detection** [[paper]](https://arxiv.org/pdf/2306.00006.pdf)
 19 | 1. [ICASSP 2023] **Contrastive Learning at the Relation and Event Level for Rumor Detection** [[paper]](https://ieeexplore.ieee.org/abstract/document/10096567)
 20 | 1. [arXiv 2023] **AmGCL: Feature Imputation of Attribute Missing Graph via Self-supervised Contrastive Learning** [[paper]](https://arxiv.org/pdf/2305.03741.pdf)
 21 | 1. [arXiv 2023] **SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2305.04501.pdf)
 22 | 1. [arXiv 2023] **CSGCL: Community-Strength-Enhanced Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2305.04658.pdf)
 23 | 1. [TKDE 2023] **MINING: Multi-Granularity Network Alignment Based on Contrastive Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/10120956)
 24 | 1. [ICASSP 2023] **Select The Best: Enhancing Graph Representation with Adaptive Negative Sample Selection** [[paper]](https://ieeexplore.ieee.org/abstract/document/10095586)
 25 | 1. [ICASSP 2023] **Graph Contrastive Learning with Learnable Graph Augmentation** [[paper]](https://ieeexplore.ieee.org/abstract/document/10095511)
 26 | 1. [arXiv 2023] **FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction** [[paper]](https://arxiv.org/pdf/2305.02549.pdf)
 27 | 1. [INS 2023] **A fairness-aware graph contrastive learning recommender framework for social tagging systems** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025523006497)
 28 | 1. [arXiv 2023] **Improving Knowledge Graph Entity Alignment with Graph Augmentation** [[paper]](https://arxiv.org/pdf/2304.14585.pdf)
 29 | 1. [WWW 2023] **Graph Self-supervised Learning with Augmentation-aware Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3543507.3583246)
 30 | 1. [arXiv 2023] **A Systematic Survey of Chemical Pre-trained Models** [[paper]](https://sxkdz.github.io/files/publications/IJCAI/CPM/CPM.pdf)
 31 | 1. [WWW 2023] **Self-Supervised Teaching and Learning of Representations on Graphs** [[paper]](https://dl.acm.org/doi/abs/10.1145/3543507.3583441)
 32 | 1. [TKDE 2023] **Progressive Hard Negative Masking: From Global Uniformity to Local Tolerance** [[paper]](https://ieeexplore.ieee.org/abstract/document/10111083)
 33 | 1. [KBS 2023] **ST-A-PGCL: Spatiotemporal adaptive periodical graph contrastive learning for traffic prediction under real scenarios** [[paper]](https://www.sciencedirect.com/science/article/pii/S0950705123003416)
 34 | 1. [WWW 2023] **SeeGera: Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking** [[paper]](https://dl.acm.org/doi/abs/10.1145/3543507.3583245)
 35 | 1. [INS 2023] **Self-supervised Contrastive Learning on Heterogeneous Graphs with Mutual Constraints of Structure and Feature** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025523006114)
 36 | 1. [Scientific Reports 2023] **A multi-view contrastive learning for heterogeneous network embedding** [[paper]](https://www.nature.com/articles/s41598-023-33324-7)
 37 | 1. [WWW 2023] **Automated Spatio-Temporal Graph Contrastive Learning** [[paper]](https://zhengwang125.github.io/paper/STGCL_WWW23.pdf)
 38 | 1. [arXiv 2023] **Capturing Fine-grained Semantics in Contrastive Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2304.11658.pdf)
 39 | 1. [arXiv 2023] **Decouple Graph Neural Networks: Train Multiple Simple GNNs Simultaneously Instead of One** [[paper]](https://arxiv.org/pdf/2304.10126.pdf)
 40 | 1. [arXiv 2023] **ID-MixGCL: Identity Mixup for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2304.10045.pdf)
 41 | 1. [Bioinformatics 2023] **Molecular Property Prediction by Contrastive Learning with Attention-Guided Positive Sample Selection** [[paper]](https://doi.org/10.1093/bioinformatics/btad258)
 42 | 1. [AISTATS 2023] **Learning Robust Graph Neural Networks with Limited Supervision** [[paper]](https://proceedings.mlr.press/v206/alchihabi23a/alchihabi23a.pdf)
 43 | 1. [TNNLS 2023] **Demystifying and Mitigating Bias for Node Representation Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/10103678)
 44 | 1. [BICTA 2023] **Graph Contrastive Learning with Intrinsic Augmentations** [[paper]](https://link.springer.com/chapter/10.1007/978-981-99-1549-1_27)
 45 | 1. [arXiv 2023] **GraphMAE2: A Decoding-Enhanced Masked Self-Supervised Graph Learner** [[paper]](https://arxiv.org/pdf/2304.04779.pdf)
 46 | 1. [arXiv 2023] **Adversarial Hard Negative Generation for Complementary Graph Contrastive Learning** [[paper]](https://epubs.siam.org/doi/pdf/10.1137/1.9781611977653.ch19)
 47 | 1. [INS 2023] **INS-GNN: Improving Graph Imbalance Learning with Self-Supervision** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025523005042)
 48 | 1. [TNNLS 2023] **Dual Contrastive Learning Network for Graph Clustering** [[paper]](https://ieeexplore.ieee.org/abstract/document/10097557)
 49 | 1. [arXiv 2023] **RARE: Robust Masked Graph Autoencoder** [[paper]](https://arxiv.org/pdf/2304.01507.pdf)
 50 | 1. [TKDE 2023] **Maximizing Mutual Information Across Feature and Topology Views for Representing Graphs** [[paper]](https://ieeexplore.ieee.org/abstract/document/10093032)
 51 | 1. [arXiv 2023] **When to Pre-Train Graph Neural Networks? An Answer from Data Generation Perspective!** [[paper]](https://arxiv.org/abs/2303.16458)
 52 | 1. [KBS 2023] **Class-homophilic-based data augmentation for improving graph neural networks** [[paper]](https://www.sciencedirect.com/science/article/pii/S095070512300268X)
 53 | 1. [arXiv 2023] **Structural Imbalance Aware Graph Augmentation Learning** [[paper]](https://arxiv.org/pdf/2303.13757.pdf)
 54 | 1. [arXiv 2023] **Hybrid Augmented Automated Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2303.15182.pdf)
 55 | 1. [arXiv 2023] **Decoupling Graph Neural Network with Contrastive Learning for Fraud Detection** [[paper]](https://linmengsysu.github.io/slides/main.pdf)
 56 | 1. [arXiv 2023] **Data-Centric Learning from Unlabeled Graphs with Diffusion Model** [[paper]](https://arxiv.org/pdf/2303.10108.pdf)
 57 | 1. [TPAMI 2023] **Unsupervised Learning of Graph Matching With Mixture of Modes Via Discrepancy Minimization** [[paper]](https://ieeexplore.ieee.org/abstract/document/10073537)
 58 | 1. [arXiv 2023] **NESS: Learning Node Embeddings from Static SubGraphs** [[paper]](https://arxiv.org/pdf/2303.08958.pdf)
 59 | 1. [Sensors 2023] **A Robust Automated Analog Circuits Classification Involving a Graph Neural Network and a Novel Data Augmentation Strategy** [[paper]](https://www.mdpi.com/1424-8220/23/6/2989)
 60 | 1. [arXiv 2023] **Contrastive knowledge integrated graph neural networks for Chinese medical text classification** [[paper]](https://www.sciencedirect.com/science/article/pii/S0952197623002415)
 61 | 1. [arXiv 2023] **CHGNN: A Semi-Supervised Contrastive Hypergraph Learning Network** [[paper]](https://arxiv.org/pdf/2303.06213.pdf)
 62 | 1. [arXiv 2023] **Contrastive Learning under Heterophily** [[paper]](https://arxiv.org/pdf/2303.06344.pdf)
 63 | 1. [arXiv 2023] **Structure-Aware Group Discrimination with Adaptive-View Graph Encoder: A Fast Graph Contrastive Learning Framework** [[paper]](https://arxiv.org/pdf/2303.05231.pdf)
 64 | 1. [TNNLS 2023] **Self-supervised Learning IoT Device Features with Graph Contrastive Neural Network for Device Classification in Social Internet of Things** [[paper]](https://ieeexplore.ieee.org/abstract/document/10059194)
 65 | 1. [TKDE 2023] **Feature-Level Deeper Self-Attention Network With Contrastive Learning for Sequential Recommendation** [[paper]](https://ieeexplore.ieee.org/abstract/document/10059216)
 66 | 1. [AAAI 2023] **Recommend What to Cache: a Simple Self-supervised Graph-based Recommendation Framework for Edge Caching Network** [[paper]](https://arxiv.org/pdf/2302.14438.pdf)
 67 | 1. [arXiv 2023] **Self-Supervised Interest Transfer Network via Prototypical Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/pdf/2302.14438.pdf)
 68 | 1. [arXiv 2023] **SGL-PT: A Strong Graph Learner with Graph Prompt Tuning** [[paper]](https://arxiv.org/pdf/2302.12449.pdf)
 69 | 1. [CIS 2023] **SimGRL: a simple self-supervised graph representation learning framework via triplets** [[paper]](https://link.springer.com/article/10.1007/s40747-023-00997-6)
 70 | 1. [WSDM 2023] **Self-Supervised Group Graph Collaborative Filtering for Group Recommendation** [[paper]](https://dl.acm.org/doi/abs/10.1145/3539597.3570400)
 71 | 1. [WSDM 2023] **S2GAE: Self-Supervised Graph Autoencoders are Generalizable Learners with Graph Masking** [[paper]](https://dl.acm.org/doi/abs/10.1145/3539597.3570404)
 72 | 1. [WSDM 2023] **Heterogeneous Graph Contrastive Learning for Recommendation** [[paper]](https://dl.acm.org/doi/abs/10.1145/3539597.3570484)
 73 | 1. [Communications Chemistry 2023] **Hierarchical Molecular Graph Self-Supervised Learning for property prediction** [[paper]](https://www.nature.com/articles/s42004-023-00825-5)
 74 | 1. [arXiv 2023] **Wiener Graph Deconvolutional Network Improves Graph Self-Supervised Learning** [[paper]](https://www.researchgate.net/profile/Jia-Li-127/publication/368543822_Wiener_Graph_Deconvolutional_Network_Improves_Graph_Self-Supervised_Learning/links/63edebc419130a1a4a830593/Wiener-Graph-Deconvolutional-Network-Improves-Graph-Self-Supervised-Learning.pdf)
 75 | 1. [arXiv 2023] **Heterogeneous Social Event Detection via Hyperbolic Graph Representations** [[paper]](https://arxiv.org/pdf/2302.10362.pdf)
 76 | 1. [arXiv 2023] **LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/pdf/2302.08191.pdf)
 77 | 1. [arXiv 2023] **GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2302.08043.pdf)
 78 | 1. [Pattern Recognition 2023] **Dual-Channel Graph Contrastive Learning for Self-Supervised Graph-Level Representation Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S0031320323001486)
 79 | 1. [NCA 2023] **Self-supervised contrastive learning for heterogeneous graph based on multi-pretext tasks** [[paper]](https://link.springer.com/article/10.1007/s00521-023-08234-4)
 80 | 1. [arXiv 2023] **STERLING: Synergistic Representation Learning on Bipartite Graphs** [[paper]](https://arxiv.org/pdf/2302.05428.pdf)
 82 | 1. [WBD 2023] **Mixed-Order Heterogeneous Graph Pre-training for Cold-Start Recommendation** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-25201-3_14)
 83 | 1. [arXiv 2023] **Explainable Action Prediction through Self-Supervision on Scene Graphs** [[paper]](https://arxiv.org/pdf/2302.03477.pdf)
 84 | 1. [arXiv 2023] **Spectral Augmentations for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2302.02909.pdf)
 85 | 1. [RS 2023] **Representing Spatial Data with Graph Contrastive Learning** [[paper]](https://www.mdpi.com/2072-4292/15/4/880/pdf)
 86 | 1. [ACLF 2023] **KE-GCL: Knowledge Enhanced Graph Contrastive Learning for Commonsense Question Answering** [[paper]](https://aclanthology.org/2022.findings-emnlp.6.pdf)
 87 | 1. [TNNLS 2023] **GRLC: Graph Representation Learning With Constraints** [[paper]](https://ieeexplore.ieee.org/abstract/document/10036344)
 88 | 1. [ESA 2023] **Contrastive graph clustering with adaptive filter** [[paper]](https://www.sciencedirect.com/science/article/pii/S095741742300146X)
 89 | 1. [arXiv 2023] **Biomedical Interaction Prediction with Adaptive Line Graph Contrastive Learning** [[paper]](https://www.mdpi.com/2227-7390/11/3/732)
 90 | 1. [arXiv 2023] **Affinity Uncertainty-based Hard Negative Mining in Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2301.13340.pdf)
 91 | 1. [arXiv 2023] **Self-supervised Semi-implicit Graph Variational Auto-encoders with Masking** [[paper]](https://arxiv.org/pdf/2301.12458.pdf)
 92 | 1. [ACM Trans. Web 2023] **Contrastive Graph Similarity Networks** [[paper]](https://dl.acm.org/doi/pdf/10.1145/3580511)
 93 | 1. [ICBD 2023] **Predictive Masking for Semi-Supervised Graph Contrastive Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/10020970)
 94 | 1. [TNNLS 2023] **Graph Representation Learning With Adaptive Metric** [[paper]](https://ieeexplore.ieee.org/abstract/document/10025823)
 95 | 1. [RAL 2023] **Self-Supervised Local Topology Representation for Random Cluster Matching** [[paper]](https://ieeexplore.ieee.org/abstract/document/10021967)
 96 | 1. [KBS 2023] **CrysGNN: Distilling pre-trained knowledge to enhance property prediction for crystalline materials** [[paper]](https://arxiv.org/pdf/2301.05852.pdf)
 97 | 1. [Entropy 2023] **A Semantic-Enhancement-Based Social Network User-Alignment Algorithm** [[paper]](https://www.mdpi.com/1099-4300/25/1/172)
 98 | 1. [KBS 2023] **Cross-view temporal graph contrastive learning for session-based recommendation** [[paper]](https://www.sciencedirect.com/science/article/pii/S0950705123000540)
 99 | 1. [PR 2023] **Robust Image Clustering via Context-aware Contrastive Graph Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S0031320323000419)
100 | 1. [ICMLCS 2023] **AP-GCL: Adversarial Perturbation on Graph Contrastive Learning** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-20096-0_47)
101 | 1. [arXiv 2023] **Signed Directed Graph Contrastive Learning with Laplacian Augmentation** [[paper]](https://arxiv.org/pdf/2301.05163.pdf)
102 | 1. [OJCS 2023] **SC-FGCL: Self-adaptive Cluster-based Federal Graph Contrastive Learning** [[paper]](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10015148)
103 | 1. [BIB 2023] **CasANGCL: pre-training and fine-tuning model based on cascaded attention network and graph contrastive learning for molecular property prediction** [[paper]](https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbac566/6966532)
104 | 1. [AAAI 2023] **Spectral Feature Augmentation for Graph Contrastive Learning and Beyond** [[paper]](https://arxiv.org/abs/2212.01026)
105 | 1. [Entropy 2023] **Self-Supervised Node Classification with Strategy and Actively Selected Labeled Set** [[paper]](https://www.mdpi.com/1099-4300/25/1/30/pdf)
106 |  
107 |  ## Year 2022
108 |  1. [NeurIPS 2022] **Generalized Laplacian Eigenmaps** [[paper]](https://openreview.net/forum?id=HjicdpP-Nth)
109 |  1. [KDD 2022] **COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3534678.3539425)
110 |  1. [ITBE 2022] **Contrastive Multi-view Composite Graph Convolutional Networks Based on Contribution Learning for Autism Spectrum Disorder Classification** [[paper]](https://ieeexplore.ieee.org/abstract/document/9999336)
111 |  1. [IEEE Access 2022] **ROME: A Graph Contrastive Multi-view Framework from Hyperbolic Angular Space for MOOCs Recommendation** [[paper]](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10001755)
112 |  1. [arXiv 2022] **Heterogeneous Graph Contrastive Learning with Meta-path Contexts and Weighted Negative Samples** [[paper]](https://arxiv.org/pdf/2212.13847.pdf)
113 |  1. [arXiv 2022] **MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning** [[paper]](https://arxiv.org/pdf/2212.10614.pdf)
114 |  1. [arXiv 2022] **Toward Improved Generalization: Meta Transfer of Self-supervised Knowledge on Graphs** [[paper]](https://arxiv.org/pdf/2212.08217.pdf)
115 |  1. [arXiv 2022] **Coarse-to-Fine Contrastive Learning on Graphs** [[paper]](https://arxiv.org/pdf/2212.06423.pdf)
116 |  1. [arXiv 2022] **MA-GCL: Model Augmentation Tricks for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2212.07035.pdf)
117 |  1. [arXiv 2022] **Mul-GAD: a semi-supervised graph anomaly detection framework via aggregating multi-view information** [[paper]](https://arxiv.org/pdf/2212.05478.pdf)
118 |  1. [arXiv 2022] **Localized Contrastive Learning on Graphs** [[paper]](https://arxiv.org/pdf/2212.04604.pdf)
119 |  1. [arXiv 2022] **Alleviating neighbor bias: augmenting graph self-supervise learning with structural equivalent positive samples** [[paper]](https://arxiv.org/pdf/2212.04365.pdf)
120 |  1. [arXiv 2022] **Self-supervised Graph Representation Learning for Black Market Account Detection** [[paper]](https://arxiv.org/pdf/2212.02679.pdf)
121 |  1. [arXiv 2022] **Contrastive Deep Graph Clustering with Learnable Augmentation** [[paper]](https://arxiv.org/pdf/2212.03559.pdf)
122 |  1. [arXiv 2022] **Graph Anomaly Detection via Multi-Scale Contrastive Learning Networks with Augmented View** [[paper]](https://arxiv.org/pdf/2212.00535.pdf)
123 |  1. [arXiv 2022] **Self Supervised Clustering of Traffic Scenes using Graph Representations** [[paper]](https://arxiv.org/pdf/2211.15508.pdf)
124 |  1. [arXiv 2022] **Graph Contrastive Learning for Materials** [[paper]](https://arxiv.org/pdf/2211.13408.pdf)
125 |  1. [arXiv 2022] **Link Prediction with Non-Contrastive Learning** [[paper]](https://arxiv.org/pdf/2211.14394.pdf)
126 |  1. [IJMIR 2022] **TCKGE: Transformers with contrastive learning for knowledge graph embedding** [[paper]](https://link.springer.com/article/10.1007/s13735-022-00256-3)
127 |  1. [arXiv 2022] **Beyond Smoothing: Unsupervised Graph Representation Learning with Edge Heterophily Discriminating** [[paper]](https://arxiv.org/pdf/2211.14065.pdf)
128 |  1. [Neural Networks 2022] **Unsupervised graph-level representation learning with hierarchical contrasts** [[paper]](https://www.sciencedirect.com/science/article/pii/S0893608022004609)
129 |  1. [arXiv 2022] **Relation-dependent Contrastive Learning with Cluster Sampling for Inductive Relation Prediction** [[paper]](https://arxiv.org/pdf/2211.12266.pdf)
130 |  1. [arXiv 2022] **Relational Symmetry based Knowledge Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2211.10738.pdf)
131 |  1. [arXiv 2022] **Towards Generalizable Graph Contrastive Learning: An Information Theory Perspective** [[paper]](https://arxiv.org/pdf/2211.10929.pdf)
132 |  1. [arXiv 2022] **Can Single-Pass Contrastive Learning Work for Both Homophilic and Heterophilic Graph?** [[paper]](https://arxiv.org/pdf/2211.10890.pdf)
133 |  1. [SIGSPATIAL 2022] **When Do Contrastive Learning Signals Help Spatio-Temporal Graph Forecasting?** [[paper]](http://urban-computing.com/pdf/STGCL_SIGSPATIAL_22.pdf)
134 |  1. [Scientific Reports 2022] **Deep graph level anomaly detection with contrastive learning** [[paper]](https://www.nature.com/articles/s41598-022-22086-3)
135 |  1. [TII 2022] **Semi-supervised machine fault diagnosis fusing unsupervised graph contrastive learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/9944187)
136 |  1. [KBS 2022] **SMGCL: Semi-supervised Multi-view Graph Contrastive Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S0950705122012163)
137 |  1. [arXiv 2022] **Unsupervised Graph Contrastive Learning with Data Augmentation for Malware Classification** [[paper]](https://www.researchgate.net/profile/Yun-Gao-48/publication/365275847_Unsupervised_Graph_Contrastive_Learning_with_Data_Augmentation_for_Malware_Classification/links/636cec632f4bca7fd04b9a26/Unsupervised-Graph-Contrastive-Learning-with-Data-Augmentation-for-Malware-Classification.pdf)
138 |  1. [IJCRS 2022] **Multi-scale Subgraph Contrastive Learning for Link Prediction** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-21244-4_16)
139 |  1. [arXiv 2022] **Flaky Performances when Pretraining on Relational Databases** [[paper]](https://arxiv.org/pdf/2211.05213.pdf)
140 |  1. [arXiv 2022] **GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection** [[paper]](https://arxiv.org/pdf/2211.04208.pdf)
141 |  1. [ATKDD 2022] **Ada-MIP: Adaptive Self-supervised Graph Representation Learning via Mutual Information and Proximity Optimization** [[paper]](https://dl.acm.org/doi/pdf/10.1145/3568165)
142 |  1. [arXiv 2022] **Graph Contrastive Learning with Implicit Augmentations** [[paper]](https://arxiv.org/pdf/2211.03710.pdf)
143 |  1. [Information Sciences 2022] **Contrastive Graph Neural Network-based Camouflaged Fraud Detector** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025522011926)
144 |  1. [arXiv 2022] **DyG2Vec: Representation Learning for Dynamic Graphs with Self-Supervision** [[paper]](https://arxiv.org/pdf/2210.16906.pdf)
145 |  1. [arXiv 2022] **Federated Graph Representation Learning using Self-Supervision** [[paper]](https://arxiv.org/pdf/2210.15120.pdf)
146 |  1. [arXiv 2022] **Benchmark of Self-supervised Graph Neural Networks** [[paper]](https://aaltodoc.aalto.fi/bitstream/handle/123456789/116441/master_Wang_Haishan_2022.pdf?sequence=2)
147 |  1. [arXiv 2022] **Line Graph Contrastive Learning for Link Prediction** [[paper]](https://arxiv.org/pdf/2210.13795.pdf)
148 |  1. [TDSC 2022] **FewM-HGCL: Few-Shot Malware Variants Detection Via Heterogeneous Graph Contrastive Learning** [[paper]](https://www.computer.org/csdl/journal/tq/5555/01/09928211/1HJuUzzFey4)
149 |  1. [arXiv 2022] **Self-supervised Graph-based Point-of-interest Recommendation** [[paper]](https://arxiv.org/pdf/2210.12506.pdf)
150 |  1. [IJMLC 2022] **Hybrid sampling-based contrastive learning for imbalanced node classification** [[paper]](https://link.springer.com/article/10.1007/s13042-022-01677-6)
151 |  1. [CIKM 2022] **Temporality- and Frequency-aware Graph Contrastive Learning for Temporal Network** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557469)
152 |  1. [CIKM 2022] **Towards Self-supervised Learning on Graphs with Heterophily** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557478)
153 |  1. [ISWC 2022] **HCL: Improving Graph Representation with Hierarchical Contrastive Learning** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-19433-7_7)
154 |  1. [CIKM 2022] **Cognize Yourself: Graph Pre-Training via Core Graph Cognizing and Differentiating** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557259)
155 |  1. [CIKM 2022] **AdaGCL: Adaptive Subgraph Contrastive Learning to Generalize Large-scale Graph Training** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557228)
156 |  1. [CIKM 2022] **Look Twice as Much as You Say: Scene Graph Contrastive Learning for Self-Supervised Image Caption Generation** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557382)
157 |  1. [CIKM 2022] **Malicious Repositories Detection with Adversarial Heterogeneous Graph Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3511808.3557384)
158 |  1. [ICEBE 2022] **Knowledge Graph Completion based on Hyperbolic Graph Contrastive Attention Network** [[paper]](https://conferences.computer.org/icebe/2022/icebe2022-proceedings/Knowledge%20Graph%20Completion%20based%20on%20Hyperbolic%20Graph%20Contrastive%20Attention%20Network.pdf)
159 |  1. [arXiv 2022] **Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering** [[paper]](https://arxiv.org/pdf/2210.10462.pdf)
160 |  1. [NeurIPS 2022] **Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative** [[paper]](https://arxiv.org/abs/2210.03801) [[code]](https://github.com/weitianxin/HyperGCL)
161 |  1. [COLING 2022] **Modeling Intra- and Inter-Modal Relations: Hierarchical Graph Contrastive Learning for Multimodal Sentiment Analysis** [[paper]](https://aclanthology.org/2022.coling-1.622.pdf)
162 |  1. [TKDE 2022] **Adversarial Contrastive Learning for Evidence-aware Fake News Detection with Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2210.05498.pdf)
163 |  1. [MM 2022] **Simple Self-supervised Multiplex Graph Representation Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3503161.3547949)
164 |  1. [TMM 2022] **Self-consistent Contrastive Attributed Graph Clustering with Pseudo-label Prompt** [[paper]](https://ieeexplore.ieee.org/abstract/document/9914670)
165 |  1. [NeurIPS 2022] **Uncovering the Structural Fairness in Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2210.03011.pdf)
166 |  1. [NeurIPS 2022] **Revisiting Graph Contrastive Learning from the Perspective of Graph Spectrum** [[paper]](https://arxiv.org/pdf/2210.02330.pdf)
167 |  1. [arXiv 2022] **Heterogeneous Graph Contrastive Multi-view Learning** [[paper]](https://arxiv.org/pdf/2210.00248.pdf)
168 |  1. [arXiv 2022] **Automated Graph Self-supervised Learning via Multi-teacher Knowledge Distillation** [[paper]](https://arxiv.org/pdf/2210.02099.pdf)
169 |  1. [arXiv 2022] **Prompt Tuning for Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2209.15240.pdf)
170 |  1. [arXiv 2022] **Improving Molecular Pretraining with Complementary Featurizations** [[paper]](https://arxiv.org/pdf/2209.15101.pdf)
171 |  1. [arXiv 2022] **Graph Soft-Contrastive Learning via Neighborhood Ranking** [[paper]](https://arxiv.org/pdf/2209.13964.pdf)
172 |  1. [EDBT 2022] **Spatial Structure-Aware Road Network Embedding via Graph Contrastive Learning** [[paper]](https://openproceedings.org/2023/conf/edbt/paper-193.pdf)
173 |  1. [arXiv 2022] **Adversarial Cross-View Disentangled Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2209.07699.pdf)
174 |  1. [Neurocomputing 2022] **Motifs-based Recommender System via Hypergraph Convolution and Contrastive Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S0925231222011948)
175 |  1. [TNNLS 2022] **Graph Representation Learning for Large-Scale Neuronal Morphological Analysis** [[paper]](https://ieeexplore.ieee.org/abstract/document/9895206)
176 |  1. [ECML-PKDD 2022] **Self-supervised Graph Learning with Segmented Graph Channels** [[paper]](https://2022.ecmlpkdd.org/wp-content/uploads/2022/09/sub_216.pdf)
177 |  1. [ECML-PKDD 2022] **Graph Contrastive Learning with Adaptive Augmentation for Recommendation** [[paper]](https://2022.ecmlpkdd.org/wp-content/uploads/2022/09/sub_650.pdf)
178 |  1. [CIKM 2022] **Contrastive Knowledge Graph Error Detection** [[paper]](https://www4.comp.polyu.edu.hk/~xiaohuang/docs/Qinggang_CIKM2022.pdf)
179 |  1. [TKDE 2022] **Disentangled Graph Contrastive Learning With Independence Promotion** [[paper]](https://ieeexplore.ieee.org/abstract/document/9893319)
180 |  1. [ECML-PKDD 2022] **Supervised Graph Contrastive Learning for Few-shot Node Classification** [[paper]](https://2022.ecmlpkdd.org/wp-content/uploads/2022/09/sub_764.pdf)
181 |  1. [Information Sciences 2022] **Graph Prototypical Contrastive Learning** [[paper]](https://www.sciencedirect.com/science/article/pii/S002002552201057X)
182 |  1. [ICANN 2022] **Knowledge-Aware Self-supervised Graph Representation Learning for Recommendation** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-15937-4_35)
183 |  1. [arXiv 2022] **Self-supervised Representation Learning on Electronic Health Records with Graph Kernel Infomax** [[paper]](https://arxiv.org/pdf/2209.00655.pdf)
184 |  1. [arXiv 2022] **Disentangled Graph Contrastive Learning for Review-based Recommendation** [[paper]](https://arxiv.org/pdf/2209.01524.pdf)
185 |  1. [arXiv 2022] **Contrastive Learning with Heterogeneous Graph Attention Networks on Short Text Classification** [[paper]](https://dro.dur.ac.uk/36856/1/36856.pdf)
186 |  1. [arXiv 2022] **Features Based Adaptive Augmentation for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2207.01792.pdf)
187 |  1. [TKDE 2022] **GCCAD: Graph Contrastive Learning for Anomaly Detection** [[paper]](https://ieeexplore.ieee.org/abstract/document/9870034)
188 |  1. [JCIM 2022] **SMICLR: Contrastive Learning on Multiple Molecular Representations for Semisupervised and Unsupervised Representation Learning** [[paper]](https://pubs.acs.org/doi/full/10.1021/acs.jcim.2c00521)
189 |  1. [arXiv 2022] **XSimGCL: Towards Extremely Simple Graph Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/abs/2209.02544) [[code]](https://github.com/Coder-Yu/SELFRec)
190 |  1. [CIKM 2022] **Relational Self-Supervised Learning on Graphs** [[paper]](https://arxiv.org/pdf/2208.10493.pdf) [[code]](https://github.com/Namkyeong/RGRL)
191 |  1. [Information Sciences 2022] **Self-Supervised Graph Representation Learning via Positive Mining** [[paper]](https://www.sciencedirect.com/science/article/pii/S0020025522009495)
192 |  1. [arXiv 2022] **Heterogeneous Graph Masked Autoencoders** [[paper]](https://arxiv.org/pdf/2208.09957.pdf)
193 |  1. [arXiv 2022] **KRACL: Contrastive Learning with Graph Context Modeling for Sparse Knowledge Graph Completion** [[paper]](https://arxiv.org/pdf/2208.07622.pdf)
194 |  1. [arXiv 2022] **RényiCL: Contrastive Representation Learning with Skew Rényi Divergence** [[paper]](https://arxiv.org/pdf/2208.06270.pdf)
195 |  1. [TNNLS 2022] **Prototypical Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2106.09645.pdf)
196 |  1. [KDD 2022] **Mining Spatio-Temporal Relations via Self-Paced Graph Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3534678.3539422)
197 |  1. [KDD 2022] **Rep2Vec: Repository Embedding via Heterogeneous Graph Adversarial Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3534678.3539324)
198 |  1. [arXiv 2022] **Deep Contrastive Multiview Network Embedding** [[paper]](https://sxkdz.github.io/files/publications/CIKM/CREME/CREME.pdf)
199 |  1. [arXiv 2022] **Analyzing Data-Centric Properties for Contrastive Learning on Graphs** [[paper]](https://arxiv.org/pdf/2208.02810.pdf)
200 |  1. [KDD 2022] **Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries** [[paper]](https://keg.cs.tsinghua.edu.cn/jietang/publications/KDD22-Liu-et-al-KG-Transformer.pdf)
201 |  1. [arXiv 2022] **Generative Subgraph Contrast for Self-Supervised Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2207.11996.pdf)
202 |  1. [IJCAI 2022] **Graph Masked Autoencoder Enhanced Predictor for Neural Architecture Search** [[paper]](https://www.ijcai.org/proceedings/2022/0432.pdf)
203 |  1. [IJCAI 2022] **Proximity Enhanced Graph Neural Networks with Channel Contrast** [[paper]](https://www.ijcai.org/proceedings/2022/0340.pdf)
204 |  1. [IJCAI 2022] **Rethinking the Promotion Brought by Contrastive Learning to Semi-Supervised Node Classification** [[paper]](https://www.ijcai.org/proceedings/2022/0395.pdf)
205 |  1. [IPM 2022] **HCNA: Hyperbolic Contrastive Learning Framework for Self-Supervised Network Alignment** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S0306457322001315)
206 |  1. [arXiv 2022] **3D Equivariant Molecular Graph Pretraining** [[paper]](https://arxiv.org/pdf/2207.08824.pdf)
207 |  1. [arXiv 2022] **Unified 2D and 3D Pre-Training of Molecular Representations** [[paper]](https://arxiv.org/pdf/2207.08806.pdf)
208 |  1. [arXiv 2022] **Does GNN Pretraining Help Molecular Representation?** [[paper]](https://arxiv.org/pdf/2207.06010.pdf)
209 |  1. [arXiv 2022] **Latent Augmentation For Better Graph Self-Supervised Learning** [[paper]](https://arxiv.org/pdf/2206.12933.pdf)
210 |  1. [arXiv 2022] **Geometry Contrastive Learning on Heterogeneous Graphs** [[paper]](https://arxiv.org/pdf/2206.12547.pdf)
211 |  1. [KIS 2022] **Self-supervised role learning for graph neural networks** [[paper]](https://link.springer.com/article/10.1007/s10115-022-01694-5)
212 |  1. [JFCST 2022] **Graph Neural Network Defense Combined with Contrastive Learning** [[paper]](http://fcst.ceaj.org/EN/article/downloadArticleFile.do?attachType=PDF&id=3113)
213 |  1. [ICMLW 2022] **Evaluating Self-Supervised Learned Molecular Graphs** [[paper]](https://openreview.net/pdf?id=LeJC_Mf5rx-)
214 |  1. [KDD 2022] **Reliable Representations Make A Stronger Defender: Unsupervised Structure Refinement for Robust GNN** [[paper]](https://ponderly.github.io/pub/STABLE_KDD2022.pdf)
215 |  1. [ICMLW 2022] **Featurizations Matter: A Multiview Contrastive Learning Approach to Molecular Pretraining** [[paper]](https://openreview.net/pdf?id=Pm1Q1X3avx1)
216 |  1. [bioRxiv 2022] **Cross-modal Graph Contrastive Learning with Cellular Images** [[paper]](https://www.biorxiv.org/content/biorxiv/early/2022/06/06/2022.06.05.494905.full.pdf)
217 |  1. [Information Sciences 2022] **A new self-supervised task on graphs: Geodesic distance prediction** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S0020025522006375)
218 |  1. [arXiv 2022] **Evaluating Self-Supervised Learning for Molecular Graph Embeddings** [[paper]](https://hansen7.github.io/sandbox/molgrapheval.pdf)
219 |  1. [arXiv 2022] **Evaluating Graph Generative Models with Contrastively Learned Features** [[paper]](https://arxiv.org/pdf/2206.06234.pdf)
220 |  1. [arXiv 2022] **COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2206.04726.pdf)
221 |  1. [arXiv 2022] **Decoupled Self-supervised Learning for Non-Homophilous Graphs** [[paper]](https://arxiv.org/pdf/2206.03601.pdf)
222 |  1. [arXiv 2022] **Interpolation-based Correlation Reduction Network for Semi-Supervised Graph Learning** [[paper]](https://arxiv.org/pdf/2206.02796.pdf)
223 |  1. [arXiv 2022] **Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination** [[paper]](https://arxiv.org/pdf/2206.01535.pdf)
224 |  1. [arXiv 2022] **KPGT: Knowledge-Guided Pre-training of Graph Transformer for Molecular Property Prediction** [[paper]](https://arxiv.org/pdf/2206.03364.pdf)
225 |  1. [CVPR 2022] **Robust Optimization As Data Augmentation for Large-Scale Graphs** [[paper]](https://openaccess.thecvf.com/content/CVPR2022/papers/Kong_Robust_Optimization_As_Data_Augmentation_for_Large-Scale_Graphs_CVPR_2022_paper.pdf)
226 |  1. [arXiv 2022] **COIN: Co-Cluster Infomax for Bipartite Graphs** [[paper]](https://arxiv.org/pdf/2206.00006.pdf)
227 |  3. [TSIPN 2022] **Fair Contrastive Learning on Graphs** [[paper]](https://ieeexplore.ieee.org/abstract/document/9779533)
228 |  4. [arXiv 2022] **I’m Me, We’re Us, and I’m Us: Tri-directional Contrastive Learning on Hypergraphs** [[paper]](https://arxiv.org/pdf/2206.04739.pdf)
229 |  5. [TNNLS 2022] **CLEAR: Cluster-Enhanced Contrast for Self-Supervised Graph Representation Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/9791433)
230 |  6. [arXiv 2022] **Let Invariant Rationale Discovery Inspire Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2206.07869.pdf)
231 |  7. [arXiv 2022] **Omni-Granular Ego-Semantic Propagation for Self-Supervised Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2205.15746.pdf)
232 |  8. [arXiv 2022] **Improving Subgraph Representation Learning via Multi-View Augmentation** [[paper]](https://arxiv.org/pdf/2205.13038.pdf)
233 |  9. [arXiv 2022] **Triangular Contrastive Learning on Molecular Graphs** [[paper]](https://arxiv.org/pdf/2205.13279.pdf)
234 |  10. [KDD 2022] **GraphMAE: Self-supervised Masked Graph Autoencoders** [[paper]](https://arxiv.org/pdf/2205.10803.pdf)
235 |  11. [arXiv 2022] **MaskGAE: Masked Graph Modeling Meets Graph Autoencoders** [[paper]](https://arxiv.org/pdf/2205.10053.pdf)
236 |  12. [ICML 2022] **Understanding Limitations of Unsupervised Graph Representation Learning from a Data-Dependent Perspective** [[paper]](https://www.osti.gov/servlets/purl/1868861)
237 |  13. [arXiv 2022] **Towards Explanation for Unsupervised Graph-Level Representation Learning** [[paper]](https://arxiv.org/pdf/2205.09934.pdf)
238 |  14. [arXiv 2022] **ImGCL: Revisiting Graph Contrastive Learning on Imbalanced Node Classification** [[paper]](https://arxiv.org/pdf/2205.11332.pdf)
239 |  15. [TNNLS 2022] **Collaborative Decision-Reinforced Self-Supervision for Attributed Graph Clustering** [[paper]](https://ieeexplore.ieee.org/abstract/document/9777842)
240 |  16. [arXiv 2022] **Contrastive Graph Learning with Graph Convolutional Networks** [[paper]](https://link.springer.com/chapter/10.1007/978-3-031-06555-2_7)
242 |  18. [arXiv 2022] **SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks** [[paper]](https://zepengzhang.com/Notes/2022/20220507.pdf)
243 |  19. [arXiv 2022] **HCL: Hybrid Contrastive Learning for Graph-based Recommendation** [[paper]](https://assets.amazon.science/21/8b/a804e89041f1a83bb1f77fa6aaee/hcl-hybrid-contrastive-learning-for-graph-based-recommendation.pdf)
244 |  20. [arXiv 2022] **Representation learning with function call graph transformations for malware open set recognition** [[paper]](https://arxiv.org/pdf/2205.07865.pdf)
245 |  21. [arXiv 2022] **Simple Contrastive Graph Clustering** [[paper]](https://arxiv.org/pdf/2205.07865.pdf)
246 |  22. [NCA 2022] **Self-supervised graph representation learning using multi-scale subgraph views contrast** [[paper]](https://link.springer.com/article/10.1007/s00521-022-07299-x)
247 |  23. [ACL 2022] **JointCL: A Joint Contrastive Learning Framework for Zero-Shot Stance Detection** [[paper]](https://aclanthology.org/2022.acl-long.7/) 
248 |  24. [IPM 2022] **Contrastive Graph Convolutional Networks with adaptive augmentation for text classification** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S0306457322000681) 
249 |  25. [PAKDD 2022] **Contrastive Attributed Network Anomaly Detection with Data Augmentation** [[paper]](https://dl.acm.org/doi/abs/10.1007/978-3-031-05936-0_35) 
250 |  26. [DASFAA 2022] **CSGNN: Improving Graph Neural Networks with Contrastive Semi-supervised Learning** [[paper]](https://dl.acm.org/doi/abs/10.1007/978-3-031-00123-9_58)
251 |  27. [arXiv 2022] **Dynamic Graph Representation Based on Temporal and Contextual Contrasting** [[paper]](https://assets.researchsquare.com/files/rs-1588877/v1_covered.pdf?c=1651680782)
252 |  28. [DASFAA 2022] **Diffusion-Based Graph Contrastive Learning for Recommendation with Implicit Feedback** [[paper]](https://dl.acm.org/doi/abs/10.1007/978-3-031-00126-0_15)
253 |  29. [arXiv 2022] **FastGCL: Fast Self-Supervised Learning on Graphs via Contrastive Neighborhood Aggregation** [[paper]](https://arxiv.org/pdf/2205.00905.pdf)
254 |  30. [arXiv 2022] **RoSA: A Robust Self-Aligned Framework for Node-Node Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2204.13846.pdf)
255 |  31. [arXiv 2022] **Heterogeneous Graph Neural Networks using Self-supervised Reciprocally Contrastive Learning** [[paper]](https://arxiv.org/pdf/2205.00256.pdf)
256 |  32. [WWW 2022] **JGCL: Joint Self-Supervised and Supervised Graph Contrastive Learning** [[paper]](https://www2022.thewebconf.org/PaperFiles/161.pdf)
257 |  33. [AAAI 2022] **SAIL: Self-Augmented Graph Contrastive Learning** [[paper]](https://www.aaai.org/AAAI22Papers/AAAI-8378.YuL.pdf)
258 |  34. [ICASSP 2022] **Graph Fine-Grained Contrastive Representation Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/9746085)
259 |  35. [arXiv 2022] **SCGC: Self-Supervised Contrastive Graph Clustering** [[paper]](https://arxiv.org/pdf/2204.12656.pdf)
260 |  36. [arXiv 2022] **A Content-First Benchmark for Self-Supervised Graph Representation Learning** [[paper]](https://graph-learning-benchmarks.github.io/assets/papers/glb2022/A_Content_First_Benchmark_for_Self_Supervised_Graph_Representation_Learning.pdf)
261 |  37. [SIGIR 2022] **Hypergraph Contrastive Collaborative Filtering** [[paper]](https://arxiv.org/pdf/2204.12200.pdf)
262 |  38. [WWW 2022] **Rumor Detection on Social Media with Graph Adversarial Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3485447.3511999)
263 |  39. [arXiv 2022] **A Review-aware Graph Contrastive Learning Framework for Recommendation** [[paper]](https://arxiv.org/pdf/2204.12063.pdf)
264 |  40. [WWW 2022] **Robust Self-Supervised Structural Graph Neural Network for Social Network Prediction** [[paper]](https://dl.acm.org/doi/abs/10.1145/3485447.3512182)
265 |  41. [arXiv 2022] **CGC: Contrastive Graph Clustering for Community Detection and Tracking** [[paper]](https://arxiv.org/pdf/2204.08504.pdf)
266 |  42. [TCybern 2022] **Multiview Deep Graph Infomax to Achieve Unsupervised Graph Embedding** [[paper]](https://ieeexplore.ieee.org/abstract/document/9758652)
267 |  43. [arXiv 2022] **MVGCNMDA: Multi-view Graph Augmentation Convolutional Network for Uncovering Disease-Related Microbes** [[paper]](https://link.springer.com/article/10.1007/s12539-022-00514-2)
268 |  44. [arXiv 2022] **CERES: Pretraining of Graph-Conditioned Transformer for Semi-Structured Session Data** [[paper]](https://arxiv.org/pdf/2204.04303.pdf)
269 |  45. [arXiv 2022] **Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation** [[paper]](https://arxiv.org/pdf/2204.05104.pdf)
270 |  46. [SIGIR 2022] **Are Graph Augmentations Necessary? Simple Graph Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/abs/2112.08679) [[code]](https://github.com/Coder-Yu/SELFRec)
271 |  47. [arXiv 2022] **Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning** [[paper]](https://arxiv.org/pdf/2204.04813)
272 |  48. [arXiv 2022] **Augmentation-Free Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2204.04874)
273 |  49. [TCybern 2022] **Link-Information Augmented Twin Autoencoders for Network Denoising** [[paper]](https://ieeexplore.ieee.org/abstract/document/9745753)
274 |  50. [arXiv 2022] **Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization** [[paper]](https://arxiv.org/pdf/2203.12265)
275 |  51. [arXiv 2022] **GraphCoCo: Graph Complementary Contrastive Learning** [[paper]](https://arxiv.org/pdf/2203.12821)
276 |  52. [arXiv 2022] **Unsupervised Heterophilous Network Embedding via r-Ego Network Discrimination** [[paper]](https://arxiv.org/pdf/2203.10866.pdf)
277 |  53. [Bioinformatics 2022] **Supervised Graph Co-contrastive Learning for Drug-Target Interaction Prediction** [[paper]](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btac164/6551245?login=true)
278 |  54. [arXiv 2022] **Supervised Contrastive Learning with Structure Inference for Graph Classification** [[paper]](https://arxiv.org/pdf/2203.07691)
279 |  55. [arXiv 2022] **Defending Graph Convolutional Networks against Dynamic Graph Perturbations via Bayesian Self-supervision** [[paper]](https://arxiv.org/pdf/2203.03762.pdf)
281 |  58. [arXiv 2022] **Improving Molecular Contrastive Learning via Faulty Negative Mitigation and Decomposed Fragment Contrast** [[paper]](https://arxiv.org/pdf/2202.09346.pdf)
282 |  59. [arXiv 2022] **Contrastive Meta Learning with Behavior Multiplicity for Recommendation** [[paper]](https://arxiv.org/pdf/2202.08523.pdf) [[code]](https://github.com/weiwei1206/CML.git)
283 |  60. [arXiv 2022] **Fair Node Representation Learning via Adaptive Data Augmentation** [[paper]](https://arxiv.org/pdf/2201.08549.pdf)
284 |  61. [arXiv 2022] **Learning Graph Augmentations to Learn Graph Representations** [[paper]](https://arxiv.org/pdf/2201.09830.pdf) [[code]](https://github.com/kavehhassani/lg2ar)
285 |  62. [arXiv 2022] **Graph Data Augmentation for Graph Machine Learning: A Survey** [[paper]](https://arxiv.org/pdf/2202.08871.pdf)
286 |  63. [arXiv 2022] **Data Augmentation for Deep Graph Learning: A Survey** [[paper]](https://arxiv.org/abs/2202.08235)
287 |  64. [arXiv 2022] **Adversarial Graph Contrastive Learning with Information Regularization** [[paper]](https://arxiv.org/pdf/2202.06491.pdf)
288 |  65. [arXiv 2022] **SimGRACE: A Simple Framework for Graph Contrastive Learning without Data Augmentation** [[paper]](https://arxiv.org/pdf/2202.03104.pdf)
289 |  66. [NeurIPS 2022] **Graph Self-supervised Learning with Accurate Discrepancy Learning** [[paper]](https://arxiv.org/pdf/2202.02989.pdf)
290 |  67. [arXiv 2022] **Learning Robust Representation through Graph Adversarial Contrastive Learning** [[paper]](https://arxiv.org/pdf/2201.13025.pdf)
291 |  68. [arXiv 2022] **Self-supervised Graphs for Audio Representation Learning with Limited Labeled Data** [[paper]](https://arxiv.org/pdf/2202.00097.pdf)
292 |  69. [arXiv 2022] **Link Prediction with Contextualized Self-Supervision** [[paper]](https://arxiv.org/pdf/2201.10069.pdf)
293 |  70. [arXiv 2022] **Dual Space Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2201.07409.pdf)
294 |  71. [arXiv 2022] **Unsupervised Graph Poisoning Attack via Contrastive Loss Back-propagation** [[paper]](https://arxiv.org/pdf/2201.07986.pdf)
295 |  72. [arXiv 2022] **From Unsupervised to Few-shot Graph Anomaly Detection: A Multi-scale Contrastive Learning Approach** [[paper]](https://arxiv.org/pdf/2202.05525)
297 |  74. [arXiv 2022] **Structure-Enhanced Heterogeneous Graph Contrastive Learning** [[paper]](https://sxkdz.github.io/files/publications/SDM/STENCIL/STENCIL.pdf)
298 |  75. [bioRxiv 2022] **Towards Effective and Generalizable Fine-tuning for Pre-trained Molecular Graph Models** [[paper]](https://www.biorxiv.org/content/biorxiv/early/2022/02/06/2022.02.03.479055.full.pdf)
299 |  76. [SDM 2022] **Neural Graph Matching for Pre-training Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2203.01597.pdf) [[code]](https://github.com/RUCAIBox/GMPT)
300 |  77. [TNNLS 2022] **Analyzing Heterogeneous Networks with Missing Attributes by Unsupervised Contrastive Learning** [[paper]](https://yangliang.github.io/pdf/tnnls22.pdf)
301 |  78. [WWW 2022] **Improving Graph Collaborative Filtering with Neighborhood-enriched Contrastive Learning** [[paper]](https://arxiv.org/pdf/2202.06200.pdf) [[code]](https://github.com/RUCAIBox/NCL)
302 |  79. [WWW 2022] **ClusterSCL: Cluster-Aware Supervised Contrastive Learning on Graphs** [[paper]](https://xiaojingzi.github.io/publications/WWW22-Wang-et-al-ClusterSCL.pdf)
303 |  80. [ICLR 2022] **Large-Scale Representation Learning on Graphs via Bootstrapping** [[paper]](https://arxiv.org/pdf/2102.06514.pdf) [[code]](https://github.com/Namkyeong/BGRL_Pytorch)
304 |  81. [ICLR 2022] **Automated Self-Supervised Learning for Graphs** [[paper]](https://openreview.net/forum?id=rFbR4Fv-D6-) [[code]](https://github.com/ChandlerBang/AutoSSL)
305 |  82. [AAAI 2022] **Self-supervised Graph Neural Networks via Diverse and Interactive Message Passing** [[paper]](https://yangliang.github.io/pdf/aaai22.pdf)
306 |  83. [AAAI 2022] **Augmentation-Free Self-Supervised Learning on Graphs** [[paper]](https://arxiv.org/pdf/2112.02472.pdf) [[code]](https://github.com/Namkyeong/AFGRL)
307 |  84. [AAAI 2022] **Molecular Contrastive Learning with Chemical Element Knowledge Graph** [[paper]](https://arxiv.org/pdf/2112.00544.pdf)
308 |  85. [AAAI 2022] **Deep Graph Clustering via Dual Correlation Reduction** [[paper]](https://arxiv.org/pdf/2112.14772) [[code]](https://github.com/yueliu1999/DCRN)
309 |  86. [AAAI 2022] **Simple Unsupervised Graph Representation Learning** [[paper]](https://www.aaai.org/AAAI22Papers/AAAI-3999.MoY.pdf)
310 |  87. [WSDM 2022] **Bringing Your Own View: Graph Contrastive Learning without Prefabricated Data Augmentations** [[paper]](https://arxiv.org/abs/2201.01702) [[code]](https://github.com/Shen-Lab/GraphCL_Automated)
311 |  88. [ICOIN 2022] **Adaptive Self-Supervised Graph Representation Learning** [[paper]](https://ieeexplore.ieee.org/abstract/document/9687176)
312 |  89. [NPL 2022] **How Does Bayesian Noisy Self-Supervision Defend Graph Convolutional Networks?** [[paper]](https://link.springer.com/article/10.1007/s11063-022-10750-8)
313 |  90. [SIGIR 2022] **Knowledge Graph Contrastive Learning for Recommendation** [[paper]](https://arxiv.org/abs/2205.00976) [[code]](https://github.com/yuh-yang/KGCL-SIGIR22)
314 |  
315 |  ## Year 2021
316 |  1. [AAAI 2021] **Self-supervised hypergraph convolutional networks for session-based recommendation** [[paper]](https://www.aaai.org/AAAI21Papers/AAAI-1889.XiaX.pdf)
317 |  1. [arXiv 2021] **Pre-training Graph Neural Network for Cross Domain Recommendation** [[paper]](https://arxiv.org/pdf/2111.08268.pdf)
318 |  17. [arXiv 2021] **Augmentations in Graph Contrastive Learning: Current Methodological Flaws & Towards Better Practices** [[paper]](https://arxiv.org/pdf/2111.03220.pdf)
319 |  18. [arXiv 2021] **Collaborative Graph Contrastive Learning: Data Augmentation Composition May Not be Necessary for Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2111.03262.pdf)
320 |  13. [arXiv 2021] **Multi-task Self-distillation for Graph-based Semi-Supervised Learning** [[paper]](https://arxiv.org/pdf/2112.01174.pdf)
321 |  14. [arXiv 2021] **Subgraph Contrastive Link Representation Learning** [[paper]](https://arxiv.org/pdf/2112.01165.pdf)
322 |  3. [arXiv 2021] **Multilayer Graph Contrastive Clustering Network** [[paper]](https://arxiv.org/pdf/2112.14021.pdf)
323 |  3. [arXiv 2021] **Graph Representation Learning via Contrasting Cluster Assignments** [[paper]](https://arxiv.org/pdf/2112.07934.pdf)
324 |  3. [arXiv 2021] **Graph-wise Common Latent Factor Extraction for Unsupervised Graph Representation Learning** [[paper]](https://arxiv.org/pdf/2112.08830.pdf)
325 |  3. [arXiv 2021] **Bayesian Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2112.07823.pdf)
326 |  3. [arXiv 2021] **TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning** [[paper]](https://arxiv.org/pdf/2112.03587.pdf)
327 |  26. [arXiv 2021] **Graph Communal Contrastive Learning** [[paper]](https://arxiv.org/pdf/2110.14863.pdf)
328 |  27. [arXiv 2021] **Self-supervised Contrastive Attributed Graph Clustering** [[paper]](https://arxiv.org/pdf/2110.08264.pdf)
329 |  28. [arXiv 2021] **Self-Supervised Learning for Molecular Property Prediction** [[paper]](https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/61677becaa918db6bf2a31cb/original/self-supervised-learning-for-molecular-property-prediction.pdf)
330 |  29. [arXiv 2021] **RPT: Toward Transferable Model on Heterogeneous Researcher Data via Pre-Training** [[paper]](https://arxiv.org/pdf/2110.07336.pdf)
331 |  30. [arXiv 2021] **Scalable Consistency Training for Graph Neural Networks via Self-Ensemble Self-Distillation** [[paper]](https://arxiv.org/pdf/2110.06290.pdf)
332 |  31. [arXiv 2021] **Pre-training Molecular Graph Representation with 3D Geometry** [[paper]](https://wyliu.com/papers/GraphMVP.pdf) [[code]](https://github.com/chao1224/GraphMVP)
333 |  32. [arXiv 2021] **3D Infomax improves GNNs for Molecular Property Prediction** [[paper]](https://arxiv.org/abs/2110.04126v1) [[code]](https://github.com/HannesStark/3DInfomax)
334 |  34. [arXiv 2021] **Motif-based Graph Self-Supervised Learning for Molecular Property Prediction** [[paper]](https://arxiv.org/pdf/2110.00987.pdf)
335 |  35. [arXiv 2021] **Debiased Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2110.02027.pdf)
336 |  36. [arXiv 2021] **3D-Transformer: Molecular Representation with Transformer in 3D Space** [[paper]](https://arxiv.org/pdf/2110.01191.pdf)
337 |  37. [arXiv 2021] **Contrastive Pre-Training of GNNs on Heterogeneous Graphs** [[paper]](https://yuanfulu.github.io/publication/CIKM-CPT.pdf)
338 |  38. [arXiv 2021] **Contrastive Graph Convolutional Networks for Hardware Trojan Detection in Third Party IP Cores** [[paper]](https://people.cs.vt.edu/~ramakris/papers/Hardware_Trojan_Trigger_Detection__HOST2021.pdf)
339 |  39. [arXiv 2021] **GeomGCL: Geometric Graph Contrastive Learning for Molecular Property Prediction** [[paper]](https://arxiv.org/pdf/2109.11730.pdf)
340 |  40. [arXiv 2021] **Adaptive Multi-layer Contrastive Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2109.14159.pdf)
341 |  42. [arXiv 2021] **Graph-MVP: Multi-View Prototypical Contrastive Learning for Multiplex Graphs** [[paper]](https://arxiv.org/pdf/2109.03560.pdf)
342 |  43. [arXiv 2021] **Hyper Meta-Path Contrastive Learning for Multi-Behavior Recommendation** [[paper]](https://arxiv.org/pdf/2109.02859.pdf)
343 |  44. [arXiv 2021] **Negative Sampling Strategies for Contrastive Self-Supervised Learning of Graph Representations** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S0165168421003479)
344 |  45. [arXiv 2021] **Structure-Aware Hard Negative Mining for Heterogeneous Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2108.13886.pdf)
345 |  46. [arXiv 2021] **Spatio-Temporal Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2108.11873.pdf)
346 |  47. [arXiv 2021] **Generative and Contrastive Self-Supervised Learning for Graph Anomaly Detection** [[paper]](https://arxiv.org/pdf/2108.09896.pdf)
347 |  92. [arXiv 2021] **Self-Supervised Multi-Channel Hypergraph Convolutional Network for Social Recommendation** [[paper]](https://arxiv.org/abs/2101.06448) [[code]](https://github.com/Coder-Yu/RecQ)
348 |  53. [arXiv 2021] **GCCAD: Graph Contrastive Coding for Anomaly Detection** [[paper]](https://arxiv.org/pdf/2108.07516.pdf)
349 |  54. [arXiv 2021] **Contrastive Self-supervised Sequential Recommendation with Robust Augmentation** [[paper]](https://arxiv.org/pdf/2108.06479.pdf)
350 |  55. [arXiv 2021] **RRLFSOR: An Efficient Self-Supervised Learning Strategy of Graph Convolutional Networks** [[paper]](https://arxiv.org/ftp/arxiv/papers/2108/2108.07481.pdf)
351 |  59. [arXiv 2021] **Group Contrastive Self-Supervised Learning on Graphs** [[paper]](https://arxiv.org/abs/2107.09787) 
352 |  60. [arXiv 2021] **Multi-Level Graph Contrastive Learning** [[paper]](https://arxiv.org/abs/2107.02639)
354 |  63. [arXiv 2021] **Evaluating Modules in Graph Contrastive Learning** [[paper]](https://arxiv.org/abs/2106.08171) [[code]](https://github.com/thunlp/OpenGCL)
355 |  70. [arXiv 2021] **Prototypical Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2106.09645.pdf)
356 |  71. [arXiv 2021] **Fairness-Aware Node Representation Learning** [[paper]](https://arxiv.org/pdf/2106.05391.pdf)
358 |  73. [arXiv 2021] **Graph Barlow Twins: A self-supervised representation learning framework for graphs** [[paper]](https://arxiv.org/pdf/2106.02466.pdf)
359 |  74. [arXiv 2021] **Self-Supervised Graph Learning with Proximity-based Views and Channel Contrast** [[paper]](https://arxiv.org/pdf/2106.03723.pdf)
360 |  75. [arXiv 2021] **Self-supervised Learning on Graphs: Contrastive, Generative, or Predictive** [[paper]](https://arxiv.org/abs/2105.07342)
361 |  76. [arXiv 2021] **FedGL: Federated Graph Learning Framework with Global Self-Supervision** [[paper]](https://arxiv.org/pdf/2105.03170.pdf)
362 |  78. [arXiv 2021] **Hop-Count Based Self-Supervised Anomaly Detection on Attributed Networks** [[paper]](https://arxiv.org/abs/2104.07917)
363 |  79. [arXiv 2021] **Representation Learning for Networks in Biology and Medicine: Advancements, Challenges, and Opportunities** [[paper]](https://arxiv.org/abs/2104.04883)
364 |  80. [arXiv 2021] **Graph Representation Learning by Ensemble Aggregating Subgraphs via Mutual Information Maximization** [[paper]](https://arxiv.org/abs/2103.13125)
365 |  81. [arXiv 2021] **Drug Target Prediction Using Graph Representation Learning via Substructures Contrast** [[paper]](https://www.preprints.org/manuscript/202103.0337/v1)
366 |  82. [arXiv 2021] **Self-supervised Auxiliary Learning for Graph Neural Networks via Meta-Learning** [[paper]](https://arxiv.org/abs/2103.00771)
367 |  83. [arXiv 2021] **Graph Self-Supervised Learning: A Survey** [[paper]](https://arxiv.org/abs/2103.00111)
368 |  84. [arXiv 2021] **Towards Robust Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2102.13085.pdf)
369 |  85. [arXiv 2021] **Pre-Training on Dynamic Graph Neural Networks** [[paper]](https://arxiv.org/abs/2102.12380)
370 |  86. [arXiv 2021] **Self-Supervised Learning of Graph Neural Networks: A Unified Review** [[paper]](https://arxiv.org/abs/2102.10757)
371 |  61. [Openreview 2021] **An Empirical Study of Graph Contrastive Learning** [[paper]](https://openreview.net/forum?id=fYxEnpY-__G)
372 |  1. [BIBM 2021] **SGAT: a Self-supervised Graph Attention Network for Biomedical Relation Extraction** [[paper]](https://ieeexplore.ieee.org/abstract/document/9669699)
373 |  95. [BIBM 2021] **Molecular Graph Contrastive Learning with Parameterized Explainable Augmentations** [[paper]](https://www.biorxiv.org/content/10.1101/2021.12.03.471150v1)
374 |  5. [NeurIPS 2021 Workshop] **Self-Supervised GNN that Jointly Learns to Augment** [[paper]](https://www.researchgate.net/profile/Zekarias-Kefato/publication/356997993_Self-Supervised_GNN_that_Jointly_Learns_to_Augment/links/61b75d88a6251b553ab64ff4/Self-Supervised-GNN-that-Jointly-Learns-to-Augment.pdf)
375 |  5. [NeurIPS 2021 Workshop] **Contrastive Embedding of Structured Space for Bayesian Optimisation** [[paper]](https://openreview.net/pdf?id=xFpkJUMS9te)
376 |  5. [NeurIPS 2021 Workshop] **Enhancing Hyperbolic Graph Embeddings via Contrastive Learning** [[paper]](https://sslneurips21.github.io/files/CameraReady/NeurIPS_2021_workshop_version2.pdf)
377 |  5. [NeurIPS 2021] **Graph Adversarial Self-Supervised Learning** [[paper]](https://proceedings.neurips.cc/paper/2021/file/7d3010c11d08cf990b7614d2c2ca9098-Paper.pdf)
378 |  6. [NeurIPS 2021] **Contrastive Laplacian Eigenmaps** [[paper]](https://papers.nips.cc/paper/2021/file/2d1b2a5ff364606ff041650887723470-Paper.pdf)
379 |  7. [NeurIPS 2021] **Directed Graph Contrastive Learning** [[paper]](https://zekuntong.com/files/digcl_nips.pdf) [[code]](https://github.com/flyingtango/DiGCL)
380 |  8. [NeurIPS 2021] **Multi-view Contrastive Graph Clustering** [[paper]](https://arxiv.org/pdf/2110.11842.pdf) [[code]](https://github.com/Panern/MCGC)
381 |  9. [NeurIPS 2021] **From Canonical Correlation Analysis to Self-supervised Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2106.12484.pdf) [[code]](https://github.com/hengruizhang98/CCA-SSG)
382 |  10. [NeurIPS 2021] **InfoGCL: Information-Aware Graph Contrastive Learning** [[paper]](https://arxiv.org/pdf/2110.15438.pdf)
383 |  11. [NeurIPS 2021] **Adversarial Graph Augmentation to Improve Graph Contrastive Learning** [[paper]](https://arxiv.org/abs/2106.05819) [[code]](https://github.com/susheels/adgcl)
384 |  12. [NeurIPS 2021] **Disentangled Contrastive Learning on Graphs** [[paper]](https://openreview.net/pdf?id=C_L0Xw_Qf8M)
385 |  20. [CIKM 2021] **Multimodal Graph Meta Contrastive Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482151)
386 |  21. [CIKM 2021] **Self-supervised Representation Learning on Dynamic Graphs** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482389)
387 |  22. [CIKM 2021] **Rectifying Pseudo Labels: Iterative Feature Clustering for Graph Representation Learning** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482469)
388 |  23. [CIKM 2021] **SGCL: Contrastive Representation Learning for Signed Graphs** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482478)
389 |  24. [CIKM 2021] **Semi-Supervised and Self-Supervised Classification with Multi-View Graph Neural Networks** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482477)
390 |  25. [CIKM 2021] **Social Recommendation with Self-Supervised Metagraph Informax Network** [[paper]](https://dl.acm.org/doi/abs/10.1145/3459637.3482480) [[code]](https://github.com/SocialRecsys/SMIN)
391 |  48. [IJCAI 2021] **Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning** [[paper]](https://www.ijcai.org/proceedings/2021/0204.pdf)
392 |  49. [IJCAI 2021] **Pairwise Half-graph Discrimination: A Simple Graph-level Self-supervised Strategy for Pre-training Graph Neural Networks** [[paper]](https://www.ijcai.org/proceedings/2021/0371.pdf)
393 |  50. [IJCAI 2021] **CuCo: Graph Representation with Curriculum Contrastive Learning** [[paper]](https://www.ijcai.org/proceedings/2021/0317.pdf)
394 |  51. [IJCAI 2021] **Graph Debiased Contrastive Learning with Joint Representation Clustering** [[paper]](https://www.ijcai.org/proceedings/2021/0473.pdf)
395 |  52. [IJCAI 2021] **CSGNN: Contrastive Self-Supervised Graph Neural Network for Molecular Interaction Prediction** [[paper]](https://www.ijcai.org/proceedings/2021/0517.pdf)
396 |  56. [KDD 2021] **MoCL: Data-driven Molecular Fingerprint via Knowledge-aware Contrastive Learning from Molecular Graph** [[paper]](https://dl.acm.org/doi/abs/10.1145/3447548.3467186) [[code]](https://github.com/illidanlab/MoCL-DK)
397 |  57. [KDD 2021] **Contrastive Multi-View Multiplex Network Embedding with Applications to Robust Network Alignment** [[paper]](https://dl.acm.org/doi/abs/10.1145/3447548.3467227)
398 |  58. [KDD 2021] **Adaptive Transfer Learning on Graph Neural Networks** [[paper]](https://arxiv.org/pdf/2107.08765.pdf)
399 |  64. :fire:[ICML 2021] **Graph Contrastive Learning Automated** [[paper]](https://arxiv.org/abs/2106.07594) [[code]](https://github.com/Shen-Lab/GraphCL_Automated)
400 |  66. [ICML 2021] **Self-supervised Graph-level Representation Learning with Local and Global Structure** [[paper]](https://arxiv.org/pdf/2106.04113) [[code]](https://github.com/DeepGraphLearning/GraphLoG)
401 |  67. [KDD 2021] **Pre-training on Large-Scale Heterogeneous Graph** [[paper]](http://www.shichuan.org/doc/111.pdf)
402 |  68. [KDD 2021] **MoCL: Contrastive Learning on Molecular Graphs with Multi-level Domain Knowledge** [[paper]](https://arxiv.org/pdf/2106.04509.pdf)
403 |  69. [KDD 2021] **Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning** [[paper]](https://arxiv.org/abs/2105.09111) [[code]](https://github.com/liun-online/HeCo)
404 |  87. [WWW 2021 Workshop] **Iterative Graph Self-Distillation** [[paper]](https://arxiv.org/abs/2010.12609)
405 |  88. [WWW 2021] **HDMI: High-order Deep Multiplex Infomax** [[paper]](https://arxiv.org/abs/2102.07810) [[code]](https://github.com/baoyujing/HDMI)
406 |  89. :fire:[WWW 2021] **Graph Contrastive Learning with Adaptive Augmentation** [[paper]](https://arxiv.org/abs/2010.14945) [[code]](https://github.com/CRIPAC-DIG/GCA)
407 |  90. [WWW 2021] **SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism** [[paper]](https://arxiv.org/abs/2101.08170) [[code]](https://github.com/RingBDStack/SUGAR)
408 |  91. [WWW 2021] **Multi-view Graph Contrastive Representation Learning for Drug-Drug Interaction Prediction** [[paper]](https://arxiv.org/abs/2010.11711) [[code]](https://github.com/isjakewong/MIRACLE)
409 |  93. :fire:[ICLR 2021] **How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision** [[paper]](https://openreview.net/forum?id=Wi5KUNlqWty) [[code]](https://github.com/dongkwan-kim/SuperGAT)
410 |  94. [WSDM 2021] **Pre-Training Graph Neural Networks for Cold-Start Users and Items Representation** [[paper]](https://arxiv.org/abs/2012.07064) [[code]](https://github.com/jerryhao66/Pretrain-Recsys)
411 |  41. [KBS 2021] **Multi-aspect self-supervised learning for heterogeneous information network** [[paper]](https://www.sciencedirect.com/science/article/abs/pii/S095070512100736X)
412 |  33. [ICCV 2021 Workshop] **Zero-Shot Learning via Contrastive Learning on Dual Knowledge Graphs** [[paper]](https://openaccess.thecvf.com/content/ICCV2021W/GSP-CV/papers/Wang_Zero-Shot_Learning_via_Contrastive_Learning_on_Dual_Knowledge_Graphs_ICCVW_2021_paper.pdf)
413 |  2. [ICBD 2021] **Session-based Recommendation via Contrastive Learning on Heterogeneous Graph** [[paper]](https://ieeexplore.ieee.org/abstract/document/9671296)
414 |  4. [ICONIP 2021] **Concordant Contrastive Learning for Semi-supervised Node Classification on Graph** [[paper]](https://link.springer.com/chapter/10.1007/978-3-030-92185-9_48)
415 |  15. [ICCSNT 2021] **Graph Data Augmentation based on Adaptive Graph Convolution for Skeleton-based Action Recognition** [[paper]](https://ieeexplore.ieee.org/abstract/document/9615451)
416 |  77. [IJCNN 2021] **Node Embedding using Mutual Information and Self-Supervision based Bi-level Aggregation** [[paper]](https://arxiv.org/abs/2104.13014v1)
417 |  
418 |  ## Year 2020
419 |  1. [Openreview 2020] **Motif-Driven Contrastive Learning of Graph Representations** [[paper]](https://openreview.net/forum?id=qcKh_Msv1GP)
420 |  15. [Openreview 2020] **SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks** [[paper]](https://openreview.net/forum?id=a5KvtsZ14ev)
421 |  16. [Openreview 2020] **TopoTER: Unsupervised Learning of Topology Transformation Equivariant Representations** [[paper]](https://openreview.net/forum?id=9az9VKjOx00)
422 |  17. [Openreview 2020] **Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks** [[paper]](https://openreview.net/forum?id=hnJSgY7p33a)
423 |  19. [Openreview 2020] **Transfer Learning of Graph Neural Networks with Ego-graph Information Maximization** [[paper]](https://openreview.net/forum?id=J_pvI6ap5Mn)
424 |  1. [Arxiv 2020] **COAD: Contrastive Pre-training with Adversarial Fine-tuning for Zero-shot Expert Linking** [[paper]](https://arxiv.org/abs/2012.11336) [[code]](https://github.com/BoChen-Daniel/Expert-Linking)
425 |  12. [Arxiv 2020] **Distance-wise Graph Contrastive Learning** [[paper]](https://arxiv.org/abs/2012.07437)
426 |  23. :fire:[Arxiv 2020] **Self-supervised Learning on Graphs: Deep Insights and New Direction.** [[paper]](https://arxiv.org/abs/2006.10141) [[code]](https://github.com/ChandlerBang/SelfTask-GNN)
427 |  24. :fire:[Arxiv 2020] **Deep Graph Contrastive Representation Learning** [[paper]](https://arxiv.org/abs/2006.04131)
428 |  29. [Arxiv 2020] **Self-supervised Training of Graph Convolutional Networks.** [[paper]](https://arxiv.org/abs/2006.02380)
429 |  30. [Arxiv 2020] **Self-Supervised Graph Representation Learning via Global Context Prediction.** [[paper]](https://arxiv.org/abs/2003.01604)
430 |  33. :fire:[Arxiv 2020] **Graph-Bert: Only Attention is Needed for Learning Graph Representations.** [[paper]](https://arxiv.org/abs/2001.05140) [[code]](https://github.com/anonymous-sourcecode/Graph-Bert)
431 |  20. :fire:[NeurIPS 2020] **Self-Supervised Graph Transformer on Large-Scale Molecular Data** [[paper]](https://drug.ai.tencent.com/publications/GROVER.pdf)
432 |  21. [NeurIPS 2020] **Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs** [[paper]](https://arxiv.org/abs/2007.08294) [[code]](https://github.com/mlvlab/SELAR)
433 |  22. :fire:[NeurIPS 2020] **Graph Contrastive Learning with Augmentations** [[paper]](https://arxiv.org/abs/2010.13902) [[code]](https://github.com/Shen-Lab/GraphCL)
434 |  25. :fire:[ICML 2020] **When Does Self-Supervision Help Graph Convolutional Networks?** [[paper]](https://arxiv.org/abs/2006.09136) [[code]](https://github.com/Shen-Lab/SS-GCNs)
435 |  26. :fire:[ICML 2020] **Graph-based, Self-Supervised Program Repair from Diagnostic Feedback.** [[paper]](https://arxiv.org/abs/2005.10636)
436 |  27. :fire:[ICML 2020] **Contrastive Multi-View Representation Learning on Graphs.** [[paper]](https://arxiv.org/abs/2006.05582) [[code]](https://github.com/kavehhassani/mvgrl)
437 |  28. [ICML 2020 Workshop] **Self-supervised edge features for improved Graph Neural Network training.** [[paper]](https://arxiv.org/abs/2007.04777)
438 |  31. :fire:[KDD 2020] **GPT-GNN: Generative Pre-Training of Graph Neural Networks.** [[pdf]](https://arxiv.org/abs/2006.15437) [[code]](https://github.com/acbull/GPT-GNN)
439 |  32. :fire:[KDD 2020] **GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training.** [[pdf]](https://arxiv.org/abs/2006.09963) [[code]](https://github.com/THUDM/GCC) 
440 |  34. :fire:[ICLR 2020] **InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization.** [[paper]](https://arxiv.org/abs/1908.01000) [[code]](https://github.com/fanyun-sun/InfoGraph)
441 |  35. :fire:[ICLR 2020] **Strategies for Pre-training Graph Neural Networks.** [[paper]](https://arxiv.org/abs/1905.12265) [[code]](https://github.com/snap-stanford/pretrain-gnns)
442 |  36. :fire:[AAAI 2020] **Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labels.** [[paper]](https://arxiv.org/abs/1902.11038)
443 |  1. [ICDM 2020] **Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning** [[paper]](https://arxiv.org/abs/2009.10273) [[code]](https://github.com/yzjiao/Subg-Con)
444 |  
445 |  ## Year 2019
446 |  1. [KDD 2019 Workshop] **SGR: Self-Supervised Spectral Graph Representation Learning.** [[paper]](https://arxiv.org/abs/1811.06237)
447 |  1. [ICLR 2019 Workshop] **Can Graph Neural Networks Go "Online"? An Analysis of Pretraining and Inference.** [[paper]](https://arxiv.org/abs/1905.06018)
448 |  1. [ICLR 2019 Workshop] **Pre-Training Graph Neural Networks for Generic Structural Feature Extraction.** [[paper]](https://arxiv.org/abs/1905.13728)
449 |  1. [Arxiv 2019] **Heterogeneous Deep Graph Infomax** [[paper]](https://arxiv.org/abs/1911.08538) [[code]](https://github.com/YuxiangRen/Heterogeneous-Deep-Graph-Infomax)
450 |  1. :fire:[ICLR 2019] **Deep Graph Infomax.** [[paper]](https://arxiv.org/abs/1809.10341) [[code]](https://github.com/PetarV-/DGI)
451 |  
452 |  
453 |  ## Other related papers
454 |   (papers that use self-supervised learning implicitly or apply graph neural networks in other domains)
455 |  1. [Arxiv 2020] **Self-supervised Learning: Generative or Contrastive.** [[paper]](https://arxiv.org/abs/2006.08218)
456 |  1. [KDD 2020] **Octet: Online Catalog Taxonomy Enrichment with Self-Supervision.** [[paper]](https://arxiv.org/pdf/2006.10276.pdf)
457 |  1. [WWW 2020] **Structural Deep Clustering Network.** [[paper]](https://dl.acm.org/doi/abs/10.1145/3366423.3380214) [[code]](https://github.com/bdy9527/SDCN)
459 |  1. [IJCAI 2019] **Pre-training of Graph Augmented Transformers for Medication Recommendation.** [[paper]](https://arxiv.org/abs/1906.00346) [[code]](https://github.com/jshang123/G-Bert)
460 |  1. [AAAI 2020] **Unsupervised Attributed Multiplex Network Embedding** [[paper]](https://arxiv.org/abs/1911.06750) [[code]](https://github.com/pcy1302/DMGI)
461 |  1. [WWW 2020] **Graph representation learning via graphical mutual information maximization** [[paper]](https://dl.acm.org/doi/abs/10.1145/3366423.3380112)
462 |  1. [NeurIPS 2017] **Inductive Representation Learning on Large Graphs** [[paper]](https://papers.nips.cc/paper/2017/hash/5dd9db5e033da9c6fb5ba83c7a7ebea9-Abstract.html) [[code]](https://github.com/williamleif/GraphSAGE)
463 |  1. [NeurIPS 2016 Workshop] **Variational Graph Auto-Encoders** [[paper]](https://arxiv.org/abs/1611.07308) [[code]](https://github.com/tkipf/gae)
464 |  1. [WWW 2015] **LINE: Large-scale Information Network Embedding** [[paper]](https://dl.acm.org/doi/abs/10.1145/2736277.2741093) [[code]](https://github.com/tangjianpku/LINE)
465 |  1. [KDD 2014] **DeepWalk: Online Learning of Social Representations** [[paper]](https://dl.acm.org/doi/abs/10.1145/2623330.2623732) [[code]](https://github.com/phanein/deepwalk)
466 |  
467 |  ## Acknowledgement
468 |  
469 |  This page was contributed and is maintained by [Wei Jin](http://cse.msu.edu/~jinwei2/) (joe.weijin@gmail.com), [Yuning You](https://yyou1996.github.io/) (yuning.you@tamu.edu), and [Yingheng Wang](https://isjakewong.github.io/) (jakewyh@163.com).
470 | 
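
The entries in the lists above follow a common layout: a list number, an optional `:fire:` marker, a `[Venue Year]` tag, a bold title, and one or more `[[label]](url)` links. A minimal sketch of parsing one such line is given below, assuming that layout holds for every entry. The names `Entry`, `parse_entry`, `ENTRY_RE`, and `LINK_RE` are hypothetical and are not part of this repository.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern for one README entry; assumes the layout
# "N. [:fire:][Venue Year] **Title** [[label]](url) ..." used in the lists.
ENTRY_RE = re.compile(
    r'^\s*\d+\.\s*(?P<hot>:fire:)?\s*\[(?P<venue>[^\]]+)\]\s*\*\*(?P<title>.+?)\*\*'
)
# One "[[label]](url)" link, e.g. [[paper]](https://...).
LINK_RE = re.compile(r'\[\[(?P<label>\w+)\]\]\((?P<url>[^)\s]+)\)')


@dataclass
class Entry:
    venue: str
    title: str
    hot: bool
    links: dict = field(default_factory=dict)


def parse_entry(line):
    """Return an Entry for a paper line, or None if the line is not an entry."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    links = {lm.group('label'): lm.group('url') for lm in LINK_RE.finditer(line)}
    # Some titles in the list end with a period; drop it for a clean title.
    title = m.group('title').rstrip('.')
    return Entry(m.group('venue'), title, m.group('hot') is not None, links)


sample = ('89. :fire:[WWW 2021] **Graph Contrastive Learning with Adaptive '
          'Augmentation** [[paper]](https://arxiv.org/abs/2010.14945) '
          '[[code]](https://github.com/CRIPAC-DIG/GCA)')
entry = parse_entry(sample)
print(entry.venue, '|', entry.title, '|', entry.hot, '|', sorted(entry.links))
```

Because `parse_entry` returns `None` for headings and blank lines, it can be applied to every line of the file and the `None` results filtered out.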


--------------------------------------------------------------------------------
/get_hot.py:
--------------------------------------------------------------------------------
 1 | """Mark hot articles which are extensively cited"""
 2 | 
 3 | import re
 4 | from tqdm import tqdm
 5 | import time
 6 | import os
 7 | import subprocess
 8 | import scholar
 9 | import numpy as np
10 | 
11 | 
12 | def overlap(s1, s2):
13 |     s1 = replace(s1)
14 |     s2 = replace(s2)
15 |     s1 = set(s1.split())
16 |     s2 = set(s2.split())
17 |     intersec = s1 & s2
 18 |     return len(intersec)/len(s1) if s1 else 0.0  # avoid dividing by zero on an empty title
19 | 
20 | def replace(s0):
21 |     s0 = s0.replace('.', ' ')
22 |     s0 = s0.replace(':', ' ')
23 |     s0 = s0.lower()
24 |     return s0
25 | 
26 | def get_citations(paper, verbose=1):
27 |     def searchScholar(searchphrase, title):
28 |         query = scholar.SearchScholarQuery()
29 |         # query.set_words(searchphrase)
30 |         query.set_words(title)
31 |         querier.send_query(query)
32 |         articles = querier.articles
33 |         try:
34 |             if overlap(articles[0].attrs['title'][0], title) < 0.9:
35 |                 return 0
 36 |         except Exception:  # e.g. no results returned; do not swallow KeyboardInterrupt
37 |             # set_new_proxy()
38 |             return -1
39 |         return articles[0].attrs['num_citations'][0]
40 | 
41 |     art = ["-c", "1", "--phrase", paper]
42 |     querier = scholar.ScholarQuerier()
43 |     settings = scholar.ScholarSettings()
44 |     settings.set_citation_format(2)
45 |     querier.apply_settings(settings)
46 |     cites = searchScholar(art, paper)
47 |     while cites == -1:
 48 |         cites = searchScholar(art, paper)  # reassign, otherwise the loop never ends
49 |     if verbose:
50 |         print(f'{paper}: ', cites)
51 |     return cites
52 | 
53 | def get_papers(filename):
54 |     papers = []
55 |     paper2line = {}
56 |     with open(filename, 'r') as f:
57 |         for num, line in enumerate(f.readlines()):
58 |             if 'Other related papers' in line: # do not count them
59 |                 break
60 |             if '[paper]' in line:
 61 |                 res = re.findall(r'\*\*.+\*\*', line)[0]
62 |                 res = res[2:-2]
63 |                 paper2line[len(papers)] = num
64 |                 papers.append(res)
65 |                 print(res)
66 |     return papers, paper2line
67 | 
68 | papers, paper2line = get_papers('README.md')
69 | citations = []
70 | for id, p in tqdm(enumerate(papers)):
71 |     time.sleep(2)
72 |     citations.append(get_citations(p))
73 | 
74 | idx = np.arange(len(citations))
75 | citations = np.array(citations)
76 | hot_id = idx[citations>80]
77 | 
78 | with open('README.md', 'r') as f:
79 |     content = f.readlines()
80 | 
81 | with open('README_old.md', 'w') as f:
 82 |     f.write(''.join(content))  # lines already end with '\n'; joining with ' ' would indent every line
83 | 
84 | for p in hot_id:
85 |     line = paper2line[p]
86 |     old_content = content[line]
 87 |     prefix = re.match(r'\s*\d+\.\s*', old_content).group(0) # the list number and '.', e.g. '12. '
 88 |     new_content = old_content if ':fire:' in old_content else prefix + ":fire:" + old_content[len(prefix):]
89 |     content[line] = new_content
90 | with open('README.md', 'w') as f:
 91 |     f.write(''.join(content))  # lines already end with '\n'; joining with ' ' would indent every line
92 | 
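
The two text-processing steps in `get_hot.py`, the title-overlap check and the `:fire:` insertion, can be tried without querying Google Scholar. The sketch below is a standalone approximation under stated assumptions: `token_overlap` and `mark_hot` are hypothetical names, and `mark_hot` assumes that each entry starts with a list number followed by a period. It is not the script's own code.

```python
import re

FIRE = ':fire:'


def token_overlap(candidate, title):
    """Fraction of the candidate's tokens that also occur in the title.

    Mirrors the normalisation in get_hot.py: '.' and ':' become spaces
    and the comparison is case-insensitive.
    """
    def tokens(s):
        return set(re.sub(r'[.:]', ' ', s).lower().split())

    cand, ref = tokens(candidate), tokens(title)
    return len(cand & ref) / len(cand) if cand else 0.0


def mark_hot(line):
    """Insert the marker right after the list number.

    Lines that already carry the marker, or that do not start with a
    list number, are returned unchanged.
    """
    if FIRE in line:
        return line
    m = re.match(r'\s*\d+\.\s*', line)
    if m is None:
        return line
    return line[:m.end()] + FIRE + line[m.end():]


line = ('12. [Arxiv 2020] **Distance-wise Graph Contrastive Learning** '
        '[[paper]](https://arxiv.org/abs/2012.07437)')
print(mark_hot(line))
print(token_overlap('Graph Contrastive Coding',
                    'GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training.'))
```

Keeping `mark_hot` idempotent means the script can be rerun on a README that already contains markers without duplicating them.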


--------------------------------------------------------------------------------
/scholar.py:
--------------------------------------------------------------------------------
   1 | #! /usr/bin/env python
   2 | """
   3 | Copied from https://github.com/ckreibich/scholar.py
   4 | """
   5 | """
   6 | This module provides classes for querying Google Scholar and parsing
   7 | returned results. It currently *only* processes the first results
   8 | page. It is not a recursive crawler.
   9 | """
  10 | # ChangeLog
  11 | # ---------
  12 | #
  13 | # 2.11  The Scholar site seems to have become more picky about the
  14 | #       number of results requested. The default of 20 in scholar.py
  15 | #       could cause HTTP 503 responses. scholar.py now doesn't request
  16 | #       a maximum unless you provide it at the command line. (For the
  17 | #       time being, you still cannot request more than 20 results.)
  18 | #
  19 | # 2.10  Merged a fix for the "TypeError: quote_from_bytes()" problem on
  20 | #       Python 3.x from hinnefe2.
  21 | #
  22 | # 2.9   Fixed Unicode problem in certain queries. Thanks to smidm for
  23 | #       this contribution.
  24 | #
  25 | # 2.8   Improved quotation-mark handling for multi-word phrases in
  26 | #       queries. Also, log URLs %-decoded in debugging output, for
  27 | #       easier interpretation.
  28 | #
  29 | # 2.7   Ability to extract content excerpts as reported in search results.
  30 | #       Also a fix to -s|--some and -n|--none: these did not yet support
  31 | #       passing lists of phrases. This now works correctly if you provide
  32 | #       separate phrases via commas.
  33 | #
  34 | # 2.6   Ability to disable inclusion of patents and citations. This
  35 | #       has the same effect as unchecking the two patents/citations
  36 | #       checkboxes in the Scholar UI, which are checked by default.
  37 | #       Accordingly, the command-line options are --no-patents and
  38 | #       --no-citations.
  39 | #
  40 | # 2.5:  Ability to parse global result attributes. This right now means
  41 | #       only the total number of results as reported by Scholar at the
  42 | #       top of the results pages (e.g. "About 31 results"). Such
  43 | #       global result attributes end up in the new attrs member of the
  44 | #       used ScholarQuery class. To render those attributes, you need
  45 | #       to use the new --txt-globals flag.
  46 | #
  47 | #       Rendering global results is currently not supported for CSV
  48 | #       (as they don't fit the one-line-per-article pattern). For
  49 | #       grepping, you can separate the global results from the
  50 | #       per-article ones by looking for a line prefix of "[G]":
  51 | #
  52 | #       $ scholar.py --txt-globals -a "Einstein"
  53 | #       [G]    Results 11900
  54 | #
  55 | #                Title Can quantum-mechanical description of physical reality be considered complete?
  56 | #                  URL http://journals.aps.org/pr/abstract/10.1103/PhysRev.47.777
  57 | #                 Year 1935
  58 | #            Citations 12804
  59 | #             Versions 80
  60 | #              Cluster ID 8174092782678430881
  61 | #       Citations list http://scholar.google.com/scholar?cites=8174092782678430881&as_sdt=2005&sciodt=0,5&hl=en
  62 | #        Versions list http://scholar.google.com/scholar?cluster=8174092782678430881&hl=en&as_sdt=0,5
  63 | #
  64 | # 2.4:  Bugfixes:
  65 | #
  66 | #       - Correctly handle Unicode characters when reporting results
  67 | #         in text format.
  68 | #
  69 | #       - Correctly parse citation-only (i.e. linkless) results in
  70 | #         Google Scholar results.
  71 | #
  72 | # 2.3:  Additional features:
  73 | #
  74 | #       - Direct extraction of first PDF version of an article
  75 | #
  76 | #       - Ability to pull up an article cluster's results directly.
  77 | #
  78 | #       This is based on work from @aliparsai on GitHub -- thanks!
  79 | #
  80 | #       - Suppress missing search results (so far shown as "None" in
  81 | #         the textual output form).
  82 | #
  83 | # 2.2:  Added a logging option that reports full HTML contents, for
  84 | #       debugging, as well as incrementally more detailed logging via
  85 | #       -d up to -dddd.
  86 | #
  87 | # 2.1:  Additional features:
  88 | #
  89 | #       - Improved cookie support: the new --cookie-file options
  90 | #         allows the reuse of a cookie across invocations of the tool;
  91 | #         this allows higher query rates than would otherwise result
  92 | #         when invoking scholar.py repeatedly.
  93 | #
  94 | #       - Workaround: remove the num= URL-encoded argument from parsed
  95 | #         URLs. For some reason, Google Scholar decides to propagate
  96 | #         the value from the original query into the URLs embedded in
  97 | #         the results.
  98 | #
  99 | # 2.0:  Thorough overhaul of design, with substantial improvements:
 100 | #
 101 | #       - Full support for advanced search arguments provided by
 102 | #         Google Scholar
 103 | #
 104 | #       - Support for retrieval of external citation formats, such as
 105 | #         BibTeX or EndNote
 106 | #
 107 | #       - Simple logging framework to track activity during execution
 108 | #
 109 | # 1.7:  Python 3 and BeautifulSoup 4 compatibility, as well as printing
 110 | #       of usage info when no options are given. Thanks to Pablo
 111 | #       Oliveira (https://github.com/pablooliveira)!
 112 | #
 113 | #       Also a bunch of pylinting and code cleanups.
 114 | #
 115 | # 1.6:  Cookie support, from Matej Smid (https://github.com/palmstrom).
 116 | #
 117 | # 1.5:  A few changes:
 118 | #
 119 | #       - Tweak suggested by Tobias Isenberg: use unicode during CSV
 120 | #         formatting.
 121 | #
 122 | #       - The option -c|--count now understands numbers up to 100 as
 123 | #         well. Likewise suggested by Tobias.
 124 | #
 125 | #       - By default, text rendering mode is now active. This avoids
 126 | #         confusion when playing with the script, as it used to report
 127 | #         nothing when the user didn't select an explicit output mode.
 128 | #
 129 | # 1.4:  Updates to reflect changes in Scholar's page rendering,
 130 | #       contributed by Amanda Hay at Tufts -- thanks!
 131 | #
 132 | # 1.3:  Updates to reflect changes in Scholar's page rendering.
 133 | #
 134 | # 1.2:  Minor tweaks, mostly thanks to helpful feedback from Dan Bolser.
 135 | #       Thanks Dan!
 136 | #
 137 | # 1.1:  Made author field explicit, added --author option.
 138 | #
 139 | # Don't complain about missing docstrings: pylint: disable-msg=C0111
 140 | #
 141 | # Copyright 2010--2017 Christian Kreibich. All rights reserved.
 142 | #
 143 | # Redistribution and use in source and binary forms, with or without
 144 | # modification, are permitted provided that the following conditions are
 145 | # met:
 146 | #
 147 | #    1. Redistributions of source code must retain the above copyright
 148 | #       notice, this list of conditions and the following disclaimer.
 149 | #
 150 | #    2. Redistributions in binary form must reproduce the above
 151 | #       copyright notice, this list of conditions and the following
 152 | #       disclaimer in the documentation and/or other materials provided
 153 | #       with the distribution.
 154 | #
 155 | # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED
 156 | # WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
 157 | # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 158 | # DISCLAIMED. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY DIRECT,
 159 | # INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 160 | # (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 161 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 162 | # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 163 | # STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
 164 | # IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 165 | # POSSIBILITY OF SUCH DAMAGE.
 166 | 
 167 | import optparse
 168 | import os
 169 | import re
 170 | import sys
 171 | import warnings
 172 | 
 173 | try:
 174 |     # Try importing for Python 3
 175 |     # pylint: disable-msg=F0401
 176 |     # pylint: disable-msg=E0611
 177 |     from urllib.request import HTTPCookieProcessor, Request, build_opener
 178 |     from urllib.parse import quote, unquote
 179 |     from http.cookiejar import MozillaCookieJar
 180 | except ImportError:
 181 |     # Fallback for Python 2
 182 |     from urllib2 import Request, build_opener, HTTPCookieProcessor
 183 |     from urllib import quote, unquote
 184 |     from cookielib import MozillaCookieJar
 185 | 
 186 | # Import BeautifulSoup -- try 4 first, fall back to older
 187 | try:
 188 |     from bs4 import BeautifulSoup
 189 | except ImportError:
 190 |     try:
 191 |         from BeautifulSoup import BeautifulSoup
 192 |     except ImportError:
 193 |         print('We need BeautifulSoup, sorry...')
 194 |         sys.exit(1)
 195 | 
 196 | # Support unicode in both Python 2 and 3. In Python 3, unicode is str.
 197 | if sys.version_info[0] == 3:
 198 |     unicode = str # pylint: disable-msg=W0622
 199 |     encode = lambda s: unicode(s) # pylint: disable-msg=C0103
 200 | else:
 201 |     def encode(s):
 202 |         if isinstance(s, basestring):
 203 |             return s.encode('utf-8') # pylint: disable-msg=C0103
 204 |         else:
 205 |             return str(s)
 206 | 
 207 | 
 208 | class Error(Exception):
 209 |     """Base class for any Scholar error."""
 210 | 
 211 | 
 212 | class FormatError(Error):
 213 |     """A query argument or setting was formatted incorrectly."""
 214 | 
 215 | 
 216 | class QueryArgumentError(Error):
 217 |     """A query did not have a suitable set of arguments."""
 218 | 
 219 | 
 220 | class SoupKitchen(object):
 221 |     """Factory for creating BeautifulSoup instances."""
 222 | 
 223 |     @staticmethod
 224 |     def make_soup(markup, parser=None):
 225 |         """Factory method returning a BeautifulSoup instance. The created
 226 |         instance will use a parser of the given name, if supported by
 227 |         the underlying BeautifulSoup instance.
 228 |         """
 229 |         if 'bs4' in sys.modules:
 230 |             # We support parser specification. If the caller didn't
 231 |             # specify one, leave it to BeautifulSoup to pick the most
 232 |             # suitable one, but suppress the user warning that asks to
 233 |             # select the most suitable parser ... which BS then
 234 |             # selects anyway.
 235 |             if parser is None:
 236 |                 warnings.filterwarnings('ignore', 'No parser was explicitly specified')
 237 |             return BeautifulSoup(markup, parser)
 238 | 
 239 |         return BeautifulSoup(markup)
 240 | 
 241 | class ScholarConf(object):
 242 |     """Helper class for global settings."""
 243 | 
 244 |     VERSION = '2.10'
 245 |     LOG_LEVEL = 1
 246 |     MAX_PAGE_RESULTS = 10 # Current default for per-page results
 247 |     SCHOLAR_SITE = 'http://scholar.google.com'
 248 | 
 249 |     # USER_AGENT = 'Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.2.9) Gecko/20100913 Firefox/3.6.9'
 250 |     # Let's update at this point (3/14):
 251 |     USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64; rv:27.0) Gecko/20100101 Firefox/27.0'
 252 | 
 253 |     # If set, we will use this file to read/save cookies to enable
 254 |     # cookie use across sessions.
 255 |     COOKIE_JAR_FILE = None
 256 | 
 257 | class ScholarUtils(object):
 258 |     """A wrapper for various utensils that come in handy."""
 259 | 
 260 |     LOG_LEVELS = {'error': 1,
 261 |                   'warn':  2,
 262 |                   'info':  3,
 263 |                   'debug': 4}
 264 | 
 265 |     @staticmethod
 266 |     def ensure_int(arg, msg=None):
 267 |         try:
 268 |             return int(arg)
 269 |         except ValueError:
 270 |             raise FormatError(msg)
 271 | 
 272 |     @staticmethod
 273 |     def log(level, msg):
 274 |         if level not in ScholarUtils.LOG_LEVELS.keys():
 275 |             return
 276 |         if ScholarUtils.LOG_LEVELS[level] > ScholarConf.LOG_LEVEL:
 277 |             return
 278 |         sys.stderr.write('[%5s]  %s' % (level.upper(), msg + '\n'))
 279 |         sys.stderr.flush()
 280 | 
 281 | 
 282 | class ScholarArticle(object):
 283 |     """
 284 |     A class representing articles listed on Google Scholar.  The class
 285 |     provides basic dictionary-like behavior.
 286 |     """
 287 |     def __init__(self):
 288 |         # The triplets for each keyword correspond to (1) the actual
 289 |         # value, (2) a user-suitable label for the item, and (3) an
 290 |         # ordering index:
 291 |         self.attrs = {
 292 |             'title':         [None, 'Title',          0],
 293 |             'url':           [None, 'URL',            1],
 294 |             'year':          [None, 'Year',           2],
 295 |             'num_citations': [0,    'Citations',      3],
 296 |             'num_versions':  [0,    'Versions',       4],
 297 |             'cluster_id':    [None, 'Cluster ID',     5],
 298 |             'url_pdf':       [None, 'PDF link',       6],
 299 |             'url_citations': [None, 'Citations list', 7],
 300 |             'url_versions':  [None, 'Versions list',  8],
 301 |             'url_citation':  [None, 'Citation link',  9],
 302 |             'excerpt':       [None, 'Excerpt',       10],
 303 |         }
 304 | 
 305 |         # The citation data in one of the standard export formats,
 306 |         # e.g. BibTeX.
 307 |         self.citation_data = None
 308 | 
 309 |     def __getitem__(self, key):
 310 |         if key in self.attrs:
 311 |             return self.attrs[key][0]
 312 |         return None
 313 | 
 314 |     def __len__(self):
 315 |         return len(self.attrs)
 316 | 
 317 |     def __setitem__(self, key, item):
 318 |         if key in self.attrs:
 319 |             self.attrs[key][0] = item
 320 |         else:
 321 |             self.attrs[key] = [item, key, len(self.attrs)]
 322 | 
 323 |     def __delitem__(self, key):
 324 |         if key in self.attrs:
 325 |             del self.attrs[key]
 326 | 
 327 |     def set_citation_data(self, citation_data):
 328 |         self.citation_data = citation_data
 329 | 
 330 |     def as_txt(self):
 331 |         # Get items sorted in specified order:
 332 |         items = sorted(list(self.attrs.values()), key=lambda item: item[2])
 333 |         # Find largest label length:
 334 |         max_label_len = max([len(str(item[1])) for item in items])
 335 |         fmt = '%%%ds %%s' % max_label_len
 336 |         res = []
 337 |         for item in items:
 338 |             if item[0] is not None:
 339 |                 res.append(fmt % (item[1], item[0]))
 340 |         return '\n'.join(res)
 341 | 
 342 |     def as_csv(self, header=False, sep='|'):
 343 |         # Get keys sorted in specified order:
 344 |         keys = [pair[0] for pair in \
 345 |                 sorted([(key, val[2]) for key, val in list(self.attrs.items())],
 346 |                        key=lambda pair: pair[1])]
 347 |         res = []
 348 |         if header:
 349 |             res.append(sep.join(keys))
 350 |         res.append(sep.join([unicode(self.attrs[key][0]) for key in keys]))
 351 |         return '\n'.join(res)
 352 | 
 353 |     def as_citation(self):
 354 |         """
 355 |         Reports the article in a standard citation format. This works only
 356 |         if you have configured the querier to retrieve a particular
 357 |         citation export format. (See ScholarSettings.)
 358 |         """
 359 |         return self.citation_data or ''
 360 | 
 361 | 
 362 | class ScholarArticleParser(object):
 363 |     """
 364 |     ScholarArticleParser can parse HTML document strings obtained from
 365 |     Google Scholar. This is a base class; concrete implementations
 366 |     adapting to tweaks made by Google over time follow below.
 367 |     """
 368 |     def __init__(self, site=None):
 369 |         self.soup = None
 370 |         self.article = None
 371 |         self.site = site or ScholarConf.SCHOLAR_SITE
 372 |         self.year_re = re.compile(r'\b(?:20|19)\d{2}\b')
 373 | 
 374 |     def handle_article(self, art):
 375 |         """
 376 |         The parser invokes this callback on each article parsed
 377 |         successfully.  In this base class, the callback does nothing.
 378 |         """
 379 | 
 380 |     def handle_num_results(self, num_results):
 381 |         """
 382 |         The parser invokes this callback if it determines the overall
 383 |         number of results, as reported on the parsed results page. The
 384 |         base class implementation does nothing.
 385 |         """
 386 | 
 387 |     def parse(self, html):
 388 |         """
 389 |         This method initiates parsing of HTML content, cleans each
 390 |         parsed article as needed, and hands every article that has a
 391 |         title to the handle_article callback.
 392 |         """
 393 |         self.soup = SoupKitchen.make_soup(html)
 394 | 
 395 |         # This parses any global, non-itemized attributes from the page.
 396 |         self._parse_globals()
 397 | 
 398 |         # Now parse out listed articles:
 399 |         for div in self.soup.findAll(ScholarArticleParser._tag_results_checker):
 400 |             self._parse_article(div)
 401 |             self._clean_article()
 402 |             if self.article['title']:
 403 |                 self.handle_article(self.article)
 404 | 
 405 |     def _clean_article(self):
 406 |         """
 407 |         This gets invoked after we have parsed an article, to do any
 408 |         needed cleanup/polishing before we hand off the resulting
 409 |         article.
 410 |         """
 411 |         if self.article['title']:
 412 |             self.article['title'] = self.article['title'].strip()
 413 | 
 414 |     def _parse_globals(self):
 415 |         tag = self.soup.find(name='div', attrs={'id': 'gs_ab_md'})
 416 |         if tag is not None:
 417 |             raw_text = tag.findAll(text=True)
 418 |             # raw text is a list because the body contains <b> etc
 419 |             if raw_text is not None and len(raw_text) > 0:
 420 |                 try:
 421 |                     num_results = raw_text[0].split()[1]
 422 |                     # num_results may now contain commas to separate
 423 |                     # thousands, strip:
 424 |                     num_results = num_results.replace(',', '')
 425 |                     num_results = int(num_results)
 426 |                     self.handle_num_results(num_results)
 427 |                 except (IndexError, ValueError):
 428 |                     pass
 429 | 
 430 |     def _parse_article(self, div):
 431 |         self.article = ScholarArticle()
 432 | 
 433 |         for tag in div:
 434 |             if not hasattr(tag, 'name'):
 435 |                 continue
 436 | 
 437 |             if tag.name == 'div' and self._tag_has_class(tag, 'gs_rt') and \
 438 |                     tag.h3 and tag.h3.a:
 439 |                 self.article['title'] = ''.join(tag.h3.a.findAll(text=True))
 440 |                 self.article['url'] = self._path2url(tag.h3.a['href'])
 441 |                 if self.article['url'].endswith('.pdf'):
 442 |                     self.article['url_pdf'] = self.article['url']
 443 | 
 444 |             if tag.name == 'font':
 445 |                 for tag2 in tag:
 446 |                     if not hasattr(tag2, 'name'):
 447 |                         continue
 448 |                     if tag2.name == 'span' and \
 449 |                        self._tag_has_class(tag2, 'gs_fl'):
 450 |                         self._parse_links(tag2)
 451 | 
 452 |     def _parse_links(self, span):
 453 |         for tag in span:
 454 |             if not hasattr(tag, 'name'):
 455 |                 continue
 456 |             if tag.name != 'a' or tag.get('href') is None:
 457 |                 continue
 458 | 
 459 |             if tag.get('href').startswith('/scholar?cites'):
 460 |                 if tag.string and tag.string.startswith('Cited by'):
 461 |                     self.article['num_citations'] = \
 462 |                         self._as_int(tag.string.split()[-1])
 463 | 
 464 |                 # Weird Google Scholar behavior here: if the original
 465 |                 # search query came with a number-of-results limit,
 466 |                 # then this limit gets propagated to the URLs embedded
 467 |                 # in the results page as well. Same applies to
 468 |                 # versions URL in next if-block.
 469 |                 self.article['url_citations'] = \
 470 |                     self._strip_url_arg('num', self._path2url(tag.get('href')))
 471 | 
 472 |                 # We can also extract the cluster ID from the citations
 473 |                 # URL. Note that we know that the string contains "?",
 474 |                 # from the above if-statement.
 475 |                 args = self.article['url_citations'].split('?', 1)[1]
 476 |                 for arg in args.split('&'):
 477 |                     if arg.startswith('cites='):
 478 |                         self.article['cluster_id'] = arg[6:]
 479 | 
 480 |             if tag.get('href').startswith('/scholar?cluster'):
 481 |                 if tag.string and tag.string.startswith('All '):
 482 |                     self.article['num_versions'] = \
 483 |                         self._as_int(tag.string.split()[1])
 484 |                 self.article['url_versions'] = \
 485 |                     self._strip_url_arg('num', self._path2url(tag.get('href')))
 486 | 
 487 |             if tag.getText().startswith('Import'):
 488 |                 self.article['url_citation'] = self._path2url(tag.get('href'))
 489 | 
 490 | 
 491 |     @staticmethod
 492 |     def _tag_has_class(tag, klass):
 493 |         """
 494 |         This predicate function checks whether a BeautifulSoup Tag instance
 495 |         has the given class among the values of its class attribute.
 496 |         """
 497 |         res = tag.get('class') or []
 498 |         if not isinstance(res, list):
 499 |             # BeautifulSoup 3 can return e.g. 'gs_md_wp gs_ttss',
 500 |             # so split -- conveniently produces a list in any case
 501 |             res = res.split()
 502 |         return klass in res
 503 | 
 504 |     @staticmethod
 505 |     def _tag_results_checker(tag):
 506 |         return tag.name == 'div' \
 507 |             and ScholarArticleParser._tag_has_class(tag, 'gs_r')
 508 | 
 509 |     @staticmethod
 510 |     def _as_int(obj):
 511 |         try:
 512 |             return int(obj)
 513 |         except ValueError:
 514 |             return None
 515 | 
 516 |     def _path2url(self, path):
 517 |         """Helper, returns full URL in case path isn't one."""
 518 |         if path.startswith(('http://', 'https://')):
 519 |             return path
 520 |         if not path.startswith('/'):
 521 |             path = '/' + path
 522 |         return self.site + path
 523 | 
 524 |     def _strip_url_arg(self, arg, url):
 525 |         """Helper, removes a URL-encoded argument, if present."""
 526 |         parts = url.split('?', 1)
 527 |         if len(parts) != 2:
 528 |             return url
 529 |         res = []
 530 |         for part in parts[1].split('&'):
 531 |             if not part.startswith(arg + '='):
 532 |                 res.append(part)
 533 |         return parts[0] + '?' + '&'.join(res)
 534 | 
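The URL helpers above work on plain strings, so their behavior is easy to try outside the parser. The following is a minimal standalone sketch, not part of the original module: `strip_url_arg` mirrors the logic of `_strip_url_arg`, while `cluster_id_from_cites_url` is a hypothetical helper that isolates the `cites=` extraction done inside `_parse_links`. The `scholar.example.org` host is a placeholder.

```python
def strip_url_arg(arg, url):
    """Return url with every 'arg=...' pair removed from its query string."""
    parts = url.split('?', 1)
    if len(parts) != 2:
        return url  # no query string, nothing to strip
    kept = [part for part in parts[1].split('&')
            if not part.startswith(arg + '=')]
    return parts[0] + '?' + '&'.join(kept)


def cluster_id_from_cites_url(url):
    """Return the value of the 'cites' argument as a string, or None."""
    query = url.partition('?')[2]  # empty string if there is no '?'
    for arg in query.split('&'):
        if arg.startswith('cites='):
            return arg[len('cites='):]
    return None


if __name__ == '__main__':
    url = 'https://scholar.example.org/scholar?cites=123&num=5&hl=en'
    print(strip_url_arg('num', url))
    print(cluster_id_from_cites_url(url))
```

Note that, like the method it mirrors, `strip_url_arg` leaves a trailing `?` behind when the stripped argument was the only one in the query string.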
 535 | 
 536 | class ScholarArticleParser120201(ScholarArticleParser):
 537 |     """
 538 |     This class reflects an update to the Scholar results page layout
 539 |     that Google made recently.
 540 |     """
 541 |     def _parse_article(self, div):
 542 |         self.article = ScholarArticle()
 543 | 
 544 |         for tag in div:
 545 |             if not hasattr(tag, 'name'):
 546 |                 continue
 547 | 
 548 |             if tag.name == 'h3' and self._tag_has_class(tag, 'gs_rt') and tag.a:
 549 |                 self.article['title'] = ''.join(tag.a.findAll(text=True))
 550 |                 self.article['url'] = self._path2url(tag.a['href'])
 551 |                 if self.article['url'].endswith('.pdf'):
 552 |                     self.article['url_pdf'] = self.article['url']
 553 | 
 554 |             if tag.name == 'div' and self._tag_has_class(tag, 'gs_a'):
 555 |                 year = self.year_re.findall(tag.text)
 556 |                 self.article['year'] = year[0] if len(year) > 0 else None
 557 | 
 558 |             if tag.name == 'div' and self._tag_has_class(tag, 'gs_fl'):
 559 |                 self._parse_links(tag)
 560 | 
 561 | 
 562 | class ScholarArticleParser120726(ScholarArticleParser):
 563 |     """
 564 |     This class reflects update to the Scholar results page layout that
 565 |     Google made 07/26/12.
 566 |     """
 567 |     def _parse_article(self, div):
 568 |         self.article = ScholarArticle()
 569 | 
 570 |         for tag in div:
 571 |             if not hasattr(tag, 'name'):
 572 |                 continue
 573 |             # str.find() returns -1 (truthy) on a miss, so test membership.
 574 |             if '.pdf' in str(tag).lower():
 575 |                 try:
 576 |                     if tag.find('div', {'class': 'gs_ttss'}):
 577 |                         self._parse_links(tag.find('div', {'class': 'gs_ttss'}))
 578 |                 except Exception:
 579 |                     pass
 579 | 
 580 |             if tag.name == 'div' and self._tag_has_class(tag, 'gs_ri'):
 581 |                 # There are (at least) two formats here. In the first
 582 |                 # one, we have a link, e.g.:
 583 |                 #
 584 |                 # <h3 class="gs_rt">
 585 |                 #   <a href="http://dl.acm.org/citation.cfm?id=972384" class="yC0">
 586 |                 #     <b>Honeycomb</b>: creating intrusion detection signatures using
 587 |                 #        honeypots
 588 |                 #   </a>
 589 |                 # </h3>
 590 |                 #
 591 |                 # In the other, there's no actual link -- it's what
 592 |                 # Scholar renders as "CITATION" in the HTML:
 593 |                 #
 594 |                 # <h3 class="gs_rt">
 595 |                 #   <span class="gs_ctu">
 596 |                 #     <span class="gs_ct1">[CITATION]</span>
 597 |                 #     <span class="gs_ct2">[C]</span>
 598 |                 #   </span>
 599 |                 #   <b>Honeycomb</b> automated ids signature creation using honeypots
 600 |                 # </h3>
 601 |                 #
 602 |                 # We now distinguish the two.
 603 |                 try:
 604 |                     atag = tag.h3.a
 605 |                     self.article['title'] = ''.join(atag.findAll(text=True))
 606 |                     self.article['url'] = self._path2url(atag['href'])
 607 |                     if self.article['url'].endswith('.pdf'):
 608 |                         self.article['url_pdf'] = self.article['url']
 609 |                 except Exception:
 610 |                     # Remove a few spans that have unneeded content (e.g. [CITATION])
 611 |                     for span in tag.h3.findAll(name='span'):
 612 |                         span.clear()
 613 |                     self.article['title'] = ''.join(tag.h3.findAll(text=True))
 614 | 
 615 |                 if tag.find('div', {'class': 'gs_a'}):
 616 |                     year = self.year_re.findall(tag.find('div', {'class': 'gs_a'}).text)
 617 |                     self.article['year'] = year[0] if len(year) > 0 else None
 618 | 
 619 |                 if tag.find('div', {'class': 'gs_fl'}):
 620 |                     self._parse_links(tag.find('div', {'class': 'gs_fl'}))
 621 | 
 622 |                 if tag.find('div', {'class': 'gs_rs'}):
 623 |                     # These are the content excerpts rendered into the results.
 624 |                     raw_text = tag.find('div', {'class': 'gs_rs'}).findAll(text=True)
 625 |                     if len(raw_text) > 0:
 626 |                         raw_text = ''.join(raw_text)
 627 |                         raw_text = raw_text.replace('\n', '')
 628 |                         self.article['excerpt'] = raw_text
 629 | 
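The parsers above pull the publication year and the citation count out of short text fragments. As a rough illustration, assuming byline and link texts shaped like the sample strings below (which are made up), the same extraction can be sketched without BeautifulSoup. The function names `first_year` and `citation_count` are hypothetical and do not exist in the module.

```python
import re

# Same pattern as ScholarArticleParser.year_re: a standalone 19xx or 20xx.
YEAR_RE = re.compile(r'\b(?:20|19)\d{2}\b')


def first_year(text):
    """Return the first four-digit year found in text, or None."""
    years = YEAR_RE.findall(text)
    return years[0] if years else None


def as_int(obj):
    """Convert obj to int, returning None if that is not possible."""
    try:
        return int(obj)
    except (TypeError, ValueError):
        return None


def citation_count(link_text):
    """Parse a 'Cited by N' link label; return N, or None for other labels."""
    if not link_text.startswith('Cited by'):
        return None
    return as_int(link_text.split()[-1])


if __name__ == '__main__':
    print(first_year('J Doe, A Roe - Some Journal, 2012 - example.org'))
    print(citation_count('Cited by 123'))
```

As in the parser, the year is kept as a string while the citation count is converted to an integer.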
 630 | 
 631 | class ScholarQuery(object):
 632 |     """
 633 |     The base class for any kind of results query we send to Scholar.
 634 |     """
 635 |     def __init__(self):
 636 |         self.url = None
 637 | 
 638 |         # The number of results requested from Scholar -- not the
 639 |         # total number of results it reports (the latter gets stored
 640 |         # in attrs, see below).
 641 |         self.num_results = None
 642 | 
 643 |         # Queries may have global result attributes, similar to
 644 |         # per-article attributes in ScholarArticle. The exact set of
 645 |         # attributes may differ by query type, but they all share the
 646 |         # basic data structure:
 647 |         self.attrs = {}
 648 | 
 649 |     def set_num_page_results(self, num_page_results):
 650 |         self.num_results = ScholarUtils.ensure_int(
 651 |             num_page_results,
 652 |             'maximum number of results on page must be numeric')
 653 | 
 654 |     def get_url(self):
 655 |         """
 656 |         Returns a complete, submittable URL string for this particular
 657 |         query instance. The URL and its arguments will vary depending
 658 |         on the query.
 659 |         """
 660 |         return None
 661 | 
 662 |     def _add_attribute_type(self, key, label, default_value=None):
 663 |         """
 664 |         Adds a new type of attribute to the list of attributes
 665 |         understood by this query. Meant to be used by the constructors
 666 |         in derived classes.
 667 |         """
 668 |         if len(self.attrs) == 0:
 669 |             self.attrs[key] = [default_value, label, 0]
 670 |             return
 671 |         idx = max([item[2] for item in self.attrs.values()]) + 1
 672 |         self.attrs[key] = [default_value, label, idx]
 673 | 
 674 |     def __getitem__(self, key):
 675 |         """Getter for attribute value. Returns None if no such key."""
 676 |         if key in self.attrs:
 677 |             return self.attrs[key][0]
 678 |         return None
 679 | 
 680 |     def __setitem__(self, key, item):
 681 |         """Setter for attribute value. Does nothing if no such key."""
 682 |         if key in self.attrs:
 683 |             self.attrs[key][0] = item
 684 | 
 685 |     def _parenthesize_phrases(self, query):
 686 |         """
 687 |         Turns a query string containing comma-separated phrases into a
 688 |         space-separated list of tokens, quoted if containing
 689 |         whitespace. For example, input
 690 | 
 691 |           'some words, foo, bar'
 692 | 
 693 |         becomes
 694 | 
 695 |           '"some words" foo bar'
 696 | 
 697 |         This comes in handy during the composition of certain queries.
 698 |         """
 699 |         if query.find(',') < 0:
 700 |             return query
 701 |         phrases = []
 702 |         for phrase in query.split(','):
 703 |             phrase = phrase.strip()
 704 |             if phrase.find(' ') > 0:
 705 |                 phrase = '"' + phrase + '"'
 706 |             phrases.append(phrase)
 707 |         return ' '.join(phrases)
 708 | 
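The phrase handling described in the `_parenthesize_phrases` docstring can be exercised on its own. This is a small standalone sketch of the same logic; the name `quote_phrases` is chosen here for illustration and is not part of the module.

```python
def quote_phrases(query):
    """
    Turn comma-separated phrases into space-separated tokens, wrapping
    any phrase that contains a space in double quotes.
    """
    if ',' not in query:
        return query  # plain space-separated words need no processing
    phrases = []
    for phrase in query.split(','):
        phrase = phrase.strip()
        if ' ' in phrase:
            phrase = '"' + phrase + '"'
        phrases.append(phrase)
    return ' '.join(phrases)


if __name__ == '__main__':
    print(quote_phrases('some words, foo, bar'))
```

A query without commas is returned unchanged, so `'foo bar'` is still treated as two separate words rather than one phrase.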
 709 | 
 710 | class ClusterScholarQuery(ScholarQuery):
 711 |     """
 712 |     This version just pulls up an article cluster whose ID we already
 713 |     know about.
 714 |     """
 715 |     SCHOLAR_CLUSTER_URL = ScholarConf.SCHOLAR_SITE + '/scholar?' \
 716 |         + 'cluster=%(cluster)s' \
 717 |         + '%(num)s'
 718 | 
 719 |     def __init__(self, cluster=None):
 720 |         ScholarQuery.__init__(self)
 721 |         self._add_attribute_type('num_results', 'Results', 0)
 722 |         self.cluster = None
 723 |         self.set_cluster(cluster)
 724 | 
 725 |     def set_cluster(self, cluster):
 726 |         """
 727 |         Sets search to a Google Scholar results cluster ID.
 728 |         """
 729 |         msg = 'cluster ID must be numeric'
 730 |         self.cluster = ScholarUtils.ensure_int(cluster, msg)
 731 | 
 732 |     def get_url(self):
 733 |         if self.cluster is None:
 734 |             raise QueryArgumentError('cluster query needs cluster ID')
 735 | 
 736 |         urlargs = {'cluster': self.cluster }
 737 | 
 738 |         for key, val in urlargs.items():
 739 |             urlargs[key] = quote(encode(val))
 740 | 
 741 |         # The following URL arguments must not be quoted, or the
 742 |         # server will not recognize them:
 743 |         urlargs['num'] = ('&num=%d' % self.num_results
 744 |                           if self.num_results is not None else '')
 745 | 
 746 |         return self.SCHOLAR_CLUSTER_URL % urlargs
 747 | 
 748 | 
 749 | class SearchScholarQuery(ScholarQuery):
 750 |     """
 751 |     This version represents the search query parameters the user can
 752 |     configure on the Scholar website, in the advanced search options.
 753 |     """
 754 |     SCHOLAR_QUERY_URL = ScholarConf.SCHOLAR_SITE + '/scholar?' \
 755 |         + 'as_q=%(words)s' \
 756 |         + '&as_epq=%(phrase)s' \
 757 |         + '&as_oq=%(words_some)s' \
 758 |         + '&as_eq=%(words_none)s' \
 759 |         + '&as_occt=%(scope)s' \
 760 |         + '&as_sauthors=%(authors)s' \
 761 |         + '&as_publication=%(pub)s' \
 762 |         + '&as_ylo=%(ylo)s' \
 763 |         + '&as_yhi=%(yhi)s' \
 764 |         + '&as_vis=%(citations)s' \
 765 |         + '&btnG=&hl=en' \
 766 |         + '%(num)s' \
 767 |         + '&as_sdt=%(patents)s%%2C5'
 768 | 
 769 |     def __init__(self):
 770 |         ScholarQuery.__init__(self)
 771 |         self._add_attribute_type('num_results', 'Results', 0)
 772 |         self.words = None # The default search behavior
 773 |         self.words_some = None # At least one of those words
 774 |         self.words_none = None # None of these words
 775 |         self.phrase = None
 776 |         self.scope_title = False # If True, search in title only
 777 |         self.author = None
 778 |         self.pub = None
 779 |         self.timeframe = [None, None]
 780 |         self.include_patents = True
 781 |         self.include_citations = True
 782 | 
 783 |     def set_words(self, words):
 784 |         """Sets words that *all* must be found in the result."""
 785 |         self.words = words
 786 | 
 787 |     def set_words_some(self, words):
 788 |         """Sets words of which *at least one* must be found in result."""
 789 |         self.words_some = words
 790 | 
 791 |     def set_words_none(self, words):
 792 |         """Sets words of which *none* must be found in the result."""
 793 |         self.words_none = words
 794 | 
 795 |     def set_phrase(self, phrase):
 796 |         """Sets phrase that must be found in the result exactly."""
 797 |         self.phrase = phrase
 798 | 
 799 |     def set_scope(self, title_only):
 800 |         """
 801 |         Sets Boolean indicating whether to search entire article or title
 802 |         only.
 803 |         """
 804 |         self.scope_title = title_only
 805 | 
 806 |     def set_author(self, author):
 807 |         """Sets names that must be on the result's author list."""
 808 |         self.author = author
 809 | 
 810 |     def set_pub(self, pub):
 811 |         """Sets the publication in which the result must be found."""
 812 |         self.pub = pub
 813 | 
 814 |     def set_timeframe(self, start=None, end=None):
 815 |         """
 816 |         Sets timeframe (in years as integer) in which result must have
 817 |         appeared. It's fine to specify just start or end, or both.
 818 |         """
 819 |         if start:
 820 |             start = ScholarUtils.ensure_int(start)
 821 |         if end:
 822 |             end = ScholarUtils.ensure_int(end)
 823 |         self.timeframe = [start, end]
 824 | 
 825 |     def set_include_citations(self, yesorno):
 826 |         self.include_citations = yesorno
 827 | 
 828 |     def set_include_patents(self, yesorno):
 829 |         self.include_patents = yesorno
 830 | 
 831 |     def get_url(self):
 832 |         if self.words is None and self.words_some is None \
 833 |            and self.words_none is None and self.phrase is None \
 834 |            and self.author is None and self.pub is None \
 835 |            and self.timeframe[0] is None and self.timeframe[1] is None:
 836 |             raise QueryArgumentError('search query needs more parameters')
 837 | 
 838 |         # If we have some-words or none-words lists, we need to
 839 |         # process them so GS understands them. For simple
 840 |         # space-separated word lists, there's nothing to do. For lists
 841 |         # of phrases we have to ensure quotations around the phrases,
 842 |         # separating them by whitespace.
 843 |         words_some = None
 844 |         words_none = None
 845 | 
 846 |         if self.words_some:
 847 |             words_some = self._parenthesize_phrases(self.words_some)
 848 |         if self.words_none:
 849 |             words_none = self._parenthesize_phrases(self.words_none)
 850 | 
 851 |         urlargs = {'words': self.words or '',
 852 |                    'words_some': words_some or '',
 853 |                    'words_none': words_none or '',
 854 |                    'phrase': self.phrase or '',
 855 |                    'scope': 'title' if self.scope_title else 'any',
 856 |                    'authors': self.author or '',
 857 |                    'pub': self.pub or '',
 858 |                    'ylo': self.timeframe[0] or '',
 859 |                    'yhi': self.timeframe[1] or '',
 860 |                    'patents': '0' if self.include_patents else '1',
 861 |                    'citations': '0' if self.include_citations else '1'}
 862 | 
 863 |         for key, val in urlargs.items():
 864 |             urlargs[key] = quote(encode(val))
 865 | 
 866 |         # The following URL arguments must not be quoted, or the
 867 |         # server will not recognize them:
 868 |         urlargs['num'] = ('&num=%d' % self.num_results
 869 |                           if self.num_results is not None else '')
 870 | 
 871 |         return self.SCHOLAR_QUERY_URL % urlargs
 872 | 
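The `get_url` methods above share one pattern: percent-encode every user-supplied value, then add the optional `num` fragment unquoted, and finally fill a `%`-style template in which a literal percent sign is written as `%%`. The sketch below shows that pattern in Python 3 with a shortened template. `TEMPLATE`, `build_url`, and the `scholar.example.org` host are assumptions made for illustration and do not reproduce the full query string used by the module.

```python
from urllib.parse import quote

# Shortened, hypothetical template. '%%2C' yields a literal '%2C'
# (an encoded comma) after formatting.
TEMPLATE = ('https://scholar.example.org/scholar?'
            'as_q=%(words)s'
            '&as_sauthors=%(authors)s'
            '%(num)s'
            '&as_sdt=%(patents)s%%2C5')


def build_url(words, authors='', num_results=None, include_patents=True):
    """Compose a query URL from user-supplied search parameters."""
    urlargs = {'words': words,
               'authors': authors,
               'patents': '0' if include_patents else '1'}
    # Percent-encode every user-supplied value.
    urlargs = {key: quote(str(val)) for key, val in urlargs.items()}
    # The 'num' fragment carries its own '&' and '=', so it is added
    # after quoting to keep those characters intact.
    urlargs['num'] = ('&num=%d' % num_results
                      if num_results is not None else '')
    return TEMPLATE % urlargs


if __name__ == '__main__':
    print(build_url('graph neural networks', num_results=5))
```

Adding `num` only after the quoting loop is what keeps the server-visible `&num=5` from turning into `%26num%3D5`.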
 873 | 
 874 | class ScholarSettings(object):
 875 |     """
 876 |     This class lets you adjust the Scholar settings for your
 877 |     session. It's intended to mirror the features tunable in the
 878 |     Scholar Settings pane, but right now it's a bit basic.
 879 |     """
 880 |     CITFORM_NONE = 0
 881 |     CITFORM_REFWORKS = 1
 882 |     CITFORM_REFMAN = 2
 883 |     CITFORM_ENDNOTE = 3
 884 |     CITFORM_BIBTEX = 4
 885 | 
 886 |     def __init__(self):
 887 |         self.citform = 0 # Citation format, default none
 888 |         self.per_page_results = None
 889 |         self._is_configured = False
 890 | 
 891 |     def set_citation_format(self, citform):
 892 |         citform = ScholarUtils.ensure_int(citform)
 893 |         if citform < 0 or citform > self.CITFORM_BIBTEX:
 894 |             raise FormatError('citation format invalid, is "%s"'
 895 |                               % citform)
 896 |         self.citform = citform
 897 |         self._is_configured = True
 898 | 
 899 |     def set_per_page_results(self, per_page_results):
 900 |         self.per_page_results = ScholarUtils.ensure_int(
 901 |             per_page_results, 'page results must be integer')
 902 |         self.per_page_results = min(
 903 |             self.per_page_results, ScholarConf.MAX_PAGE_RESULTS)
 904 |         self._is_configured = True
 905 | 
 906 |     def is_configured(self):
 907 |         return self._is_configured
 908 | 
 909 | 
 910 | class ScholarQuerier(object):
 911 |     """
 912 |     ScholarQuerier instances can conduct a search on Google Scholar
 913 |     with subsequent parsing of the resulting HTML content.  The
 914 |     articles found are collected in the articles member, a list of
 915 |     ScholarArticle instances.
 916 |     """
 917 | 
 918 |     # Default URLs for visiting and submitting Settings pane, as of 3/14
 919 |     GET_SETTINGS_URL = ScholarConf.SCHOLAR_SITE + '/scholar_settings?' \
 920 |         + 'sciifh=1&hl=en&as_sdt=0,5'
 921 | 
 922 |     SET_SETTINGS_URL = ScholarConf.SCHOLAR_SITE + '/scholar_setprefs?' \
 923 |         + 'q=' \
 924 |         + '&scisig=%(scisig)s' \
 925 |         + '&inststart=0' \
 926 |         + '&as_sdt=1,5' \
 927 |         + '&as_sdtp=' \
 928 |         + '&num=%(num)s' \
 929 |         + '&scis=%(scis)s' \
 930 |         + '%(scisf)s' \
 931 |         + '&hl=en&lang=all&instq=&inst=569367360547434339&save='
 932 | 
 933 |     # Older URLs:
 934 |     # ScholarConf.SCHOLAR_SITE + '/scholar?q=%s&hl=en&btnG=Search&as_sdt=2001&as_sdtp=on
 935 | 
 936 |     class Parser(ScholarArticleParser120726):
 937 |         def __init__(self, querier):
 938 |             ScholarArticleParser120726.__init__(self)
 939 |             self.querier = querier
 940 | 
 941 |         def handle_num_results(self, num_results):
 942 |             if self.querier is not None and self.querier.query is not None:
 943 |                 self.querier.query['num_results'] = num_results
 944 | 
 945 |         def handle_article(self, art):
 946 |             self.querier.add_article(art)
 947 | 
 948 |     def __init__(self):
 949 |         self.articles = []
 950 |         self.query = None
 951 |         self.cjar = MozillaCookieJar()
 952 | 
 953 |         # If we have a cookie file, load it:
 954 |         if ScholarConf.COOKIE_JAR_FILE and \
 955 |            os.path.exists(ScholarConf.COOKIE_JAR_FILE):
 956 |             try:
 957 |                 self.cjar.load(ScholarConf.COOKIE_JAR_FILE,
 958 |                                ignore_discard=True)
 959 |                 ScholarUtils.log('info', 'loaded cookies file')
 960 |             except Exception as msg:
 961 |                 ScholarUtils.log('warn', 'could not load cookies file: %s' % msg)
 962 |                 self.cjar = MozillaCookieJar() # Just to be safe
 963 | 
 964 |         self.opener = build_opener(HTTPCookieProcessor(self.cjar))
 965 |         self.settings = None # Last settings object, if any
 966 | 
 967 |     def apply_settings(self, settings):
 968 |         """
 969 |         Applies settings as provided by a ScholarSettings instance.
 970 |         """
 971 |         if settings is None or not settings.is_configured():
 972 |             return True
 973 | 
 974 |         self.settings = settings
 975 | 
 976 |         # This is a bit of work. We need to actually retrieve the
 977 |         # contents of the Settings pane HTML in order to extract
 978 |         # hidden fields before we can compose the query for updating
 979 |         # the settings.
 980 |         html = self._get_http_response(url=self.GET_SETTINGS_URL,
 981 |                                        log_msg='dump of settings form HTML',
 982 |                                        err_msg='requesting settings failed')
 983 |         if html is None:
 984 |             return False
 985 | 
 986 |         # Now parse the required stuff out of the form. We require the
 987 |         # "scisig" token to make the upload of our settings acceptable
 988 |         # to Google.
 989 |         soup = SoupKitchen.make_soup(html)
 990 | 
 991 |         tag = soup.find(name='form', attrs={'id': 'gs_settings_form'})
 992 |         if tag is None:
 993 |             ScholarUtils.log('info', 'parsing settings failed: no form')
 994 |             return False
 995 | 
 996 |         tag = tag.find('input', attrs={'type':'hidden', 'name':'scisig'})
 997 |         if tag is None:
 998 |             ScholarUtils.log('info', 'parsing settings failed: scisig')
 999 |             return False
1000 | 
1001 |         urlargs = {'scisig': tag['value'],
1002 |                    'num': settings.per_page_results,
1003 |                    'scis': 'no',
1004 |                    'scisf': ''}
1005 | 
1006 |         if settings.citform != 0:
1007 |             urlargs['scis'] = 'yes'
1008 |             urlargs['scisf'] = '&scisf=%d' % settings.citform
1009 | 
1010 |         html = self._get_http_response(url=self.SET_SETTINGS_URL % urlargs,
1011 |                                        log_msg='dump of settings result HTML',
1012 |                                        err_msg='applying settings failed')
1013 |         if html is None:
1014 |             return False
1015 | 
1016 |         ScholarUtils.log('info', 'settings applied')
1017 |         return True
1018 | 
1019 |     def send_query(self, query):
1020 |         """
1021 |         This method initiates a search query (a ScholarQuery instance)
1022 |         with subsequent parsing of the response.
1023 |         """
1024 |         self.clear_articles()
1025 |         self.query = query
1026 | 
1027 |         html = self._get_http_response(url=query.get_url(),
1028 |                                        log_msg='dump of query response HTML',
1029 |                                        err_msg='results retrieval failed')
1030 |         if html is None:
1031 |             return
1032 | 
1033 |         self.parse(html)
1034 | 
1035 |     def get_citation_data(self, article):
1036 |         """
1037 |         Given an article, retrieves its citation export data. Note that
1038 |         this requires you to have adjusted the settings to tell Google
1039 |         Scholar to provide this information *prior* to retrieving the article.
1040 |         """
1041 |         if article['url_citation'] is None:
1042 |             return False
1043 |         if article.citation_data is not None:
1044 |             return True
1045 | 
1046 |         ScholarUtils.log('info', 'retrieving citation export data')
1047 |         data = self._get_http_response(url=article['url_citation'],
1048 |                                        log_msg='citation data response',
1049 |                                        err_msg='requesting citation data failed')
1050 |         if data is None:
1051 |             return False
1052 | 
1053 |         article.set_citation_data(data)
1054 |         return True
1055 | 
1056 |     def parse(self, html):
1057 |         """
1058 |         This method allows parsing of provided HTML content.
1059 |         """
1060 |         parser = self.Parser(self)
1061 |         parser.parse(html)
1062 | 
1063 |     def add_article(self, art):
1064 |         self.get_citation_data(art)
1065 |         self.articles.append(art)
1066 | 
1067 |     def clear_articles(self):
1068 |         """Clears any existing articles stored from previous queries."""
1069 |         self.articles = []
1070 | 
1071 |     def save_cookies(self):
1072 |         """
1073 |         This stores the latest cookies we're using to disk, for reuse in a
1074 |         later session.
1075 |         """
1076 |         if ScholarConf.COOKIE_JAR_FILE is None:
1077 |             return False
1078 |         try:
1079 |             self.cjar.save(ScholarConf.COOKIE_JAR_FILE,
1080 |                            ignore_discard=True)
1081 |             ScholarUtils.log('info', 'saved cookies file')
1082 |             return True
1083 |         except Exception as msg:
1084 |             ScholarUtils.log('warn', 'could not save cookies file: %s' % msg)
1085 |             return False
1086 | 
1087 |     def _get_http_response(self, url, log_msg=None, err_msg=None):
1088 |         """
1089 |         Helper method, sends HTTP request and returns response payload.
1090 |         """
1091 |         if log_msg is None:
1092 |             log_msg = 'HTTP response data follow'
1093 |         if err_msg is None:
1094 |             err_msg = 'request failed'
1095 |         try:
1096 |             ScholarUtils.log('info', 'requesting %s' % unquote(url))
1097 | 
1098 |             req = Request(url=url, headers={'User-Agent': ScholarConf.USER_AGENT})
1099 |             hdl = self.opener.open(req)
1100 |             html = hdl.read()
1101 | 
1102 |             ScholarUtils.log('debug', log_msg)
1103 |             ScholarUtils.log('debug', '>>>>' + '-'*68)
1104 |             ScholarUtils.log('debug', 'url: %s' % hdl.geturl())
1105 |             ScholarUtils.log('debug', 'result: %s' % hdl.getcode())
1106 |             ScholarUtils.log('debug', 'headers:\n' + str(hdl.info()))
1107 |             ScholarUtils.log('debug', 'data:\n' + html.decode('utf-8', errors='replace'))  # tolerate non-UTF-8 bytes so logging cannot fail the request
1108 |             ScholarUtils.log('debug', '<<<<' + '-'*68)
1109 | 
1110 |             return html
1111 |         except Exception as err:
1112 |             ScholarUtils.log('info', err_msg + ': %s' % err)
1113 |             return None
1114 | 
1115 | 
1116 | def txt(querier, with_globals):
1117 |     if with_globals:
1118 |         # If we have any articles, check their attribute labels to get
1119 |         # the maximum length -- makes for nicer alignment.
1120 |         max_label_len = 0
1121 |         if len(querier.articles) > 0:
1122 |             items = sorted(list(querier.articles[0].attrs.values()),
1123 |                            key=lambda item: item[2])
1124 |             max_label_len = max([len(str(item[1])) for item in items])
1125 | 
1126 |         # Get items sorted in specified order:
1127 |         items = sorted(list(querier.query.attrs.values()), key=lambda item: item[2])
1128 |         # Find largest label length:
1129 |         max_label_len = max([len(str(item[1])) for item in items] + [max_label_len])
1130 |         fmt = '[G] %%%ds %%s' % max(0, max_label_len-4)
1131 |         for item in items:
1132 |             if item[0] is not None:
1133 |                 print(fmt % (item[1], item[0]))
1134 |         if len(items) > 0:
1135 |             print()
1136 | 
1137 |     articles = querier.articles
1138 |     for art in articles:
1139 |         print(encode(art.as_txt()) + '\n')
1140 | 
1141 | def csv(querier, header=False, sep='|'):
1142 |     articles = querier.articles
1143 |     for art in articles:
1144 |         result = art.as_csv(header=header, sep=sep)
1145 |         print(encode(result))
1146 |         header = False
1147 | 
1148 | def citation_export(querier):
1149 |     articles = querier.articles
1150 |     for art in articles:
1151 |         print(art.as_citation() + '\n')
1152 | 
1153 | 
1154 | def main():
1155 |     usage = """scholar.py [options] <query string>
1156 | A command-line interface to Google Scholar.
1157 | 
1158 | Examples:
1159 | 
1160 | # Retrieve one article written by Einstein on quantum theory:
1161 | scholar.py -c 1 --author "albert einstein" --phrase "quantum theory"
1162 | 
1163 | # Retrieve a BibTeX entry for that quantum theory paper:
1164 | scholar.py -c 1 -C 17749203648027613321 --citation bt
1165 | 
1166 | # Retrieve five articles written by Einstein after 1970 where the title
1167 | # does not contain the words "quantum" and "theory":
1168 | scholar.py -c 5 -a "albert einstein" -t --none "quantum theory" --after 1970"""
1169 | 
1170 |     fmt = optparse.IndentedHelpFormatter(max_help_position=50, width=100)
1171 |     parser = optparse.OptionParser(usage=usage, formatter=fmt)
1172 |     group = optparse.OptionGroup(parser, 'Query arguments',
1173 |                                  'These options define search query arguments and parameters.')
1174 |     group.add_option('-a', '--author', metavar='AUTHORS', default=None,
1175 |                      help='Author name(s)')
1176 |     group.add_option('-A', '--all', metavar='WORDS', default=None, dest='allw',
1177 |                      help='Results must contain all of these words')
1178 |     group.add_option('-s', '--some', metavar='WORDS', default=None,
1179 |                      help='Results must contain at least one of these words. Pass arguments in form -s "foo bar baz" for simple words, and -s "a phrase, another phrase" for phrases')
1180 |     group.add_option('-n', '--none', metavar='WORDS', default=None,
1181 |                      help='Results must contain none of these words. See -s|--some re. formatting')
1182 |     group.add_option('-p', '--phrase', metavar='PHRASE', default=None,
1183 |                      help='Results must contain exact phrase')
1184 |     group.add_option('-t', '--title-only', action='store_true', default=False,
1185 |                      help='Search title only')
1186 |     group.add_option('-P', '--pub', metavar='PUBLICATIONS', default=None,
1187 |                      help='Results must have appeared in this publication')
1188 |     group.add_option('--after', metavar='YEAR', default=None,
1189 |                      help='Results must have appeared in or after given year')
1190 |     group.add_option('--before', metavar='YEAR', default=None,
1191 |                      help='Results must have appeared in or before given year')
1192 |     group.add_option('--no-patents', action='store_true', default=False,
1193 |                      help='Do not include patents in results')
1194 |     group.add_option('--no-citations', action='store_true', default=False,
1195 |                      help='Do not include citations in results')
1196 |     group.add_option('-C', '--cluster-id', metavar='CLUSTER_ID', default=None,
1197 |                      help='Do not search, just use articles in given cluster ID')
1198 |     group.add_option('-c', '--count', type='int', default=None,
1199 |                      help='Maximum number of results')
1200 |     parser.add_option_group(group)
1201 | 
1202 |     group = optparse.OptionGroup(parser, 'Output format',
1203 |                                  'These options control the appearance of the results.')
1204 |     group.add_option('--txt', action='store_true',
1205 |                      help='Print article data in text format (default)')
1206 |     group.add_option('--txt-globals', action='store_true',
1207 |                      help='Like --txt, but first print global results too')
1208 |     group.add_option('--csv', action='store_true',
1209 |                      help='Print article data in CSV form (separator is "|")')
1210 |     group.add_option('--csv-header', action='store_true',
1211 |                      help='Like --csv, but print header with column names')
1212 |     group.add_option('--citation', metavar='FORMAT', default=None,
1213 |                      help='Print article details in standard citation format. Argument must be one of "bt" (BibTeX), "en" (EndNote), "rm" (RefMan), or "rw" (RefWorks).')
1214 |     parser.add_option_group(group)
1215 | 
1216 |     group = optparse.OptionGroup(parser, 'Miscellaneous')
1217 |     group.add_option('--cookie-file', metavar='FILE', default=None,
1218 |                      help='File to use for cookie storage. If given, will read any existing cookies if found at startup, and save resulting cookies in the end.')
1219 |     group.add_option('-d', '--debug', action='count', default=0,
1220 |                      help='Enable verbose logging to stderr. Repeated options increase detail of debug output.')
1221 |     group.add_option('-v', '--version', action='store_true', default=False,
1222 |                      help='Show version information')
1223 |     parser.add_option_group(group)
1224 | 
1225 |     options, _ = parser.parse_args()
1226 | 
1227 |     # Show help if no command-line arguments were given
1228 |     if len(sys.argv) == 1:
1229 |         parser.print_help()
1230 |         return 1
1231 | 
1232 |     if options.debug > 0:
1233 |         options.debug = min(options.debug, ScholarUtils.LOG_LEVELS['debug'])
1234 |         ScholarConf.LOG_LEVEL = options.debug
1235 |         ScholarUtils.log('info', 'using log level %d' % ScholarConf.LOG_LEVEL)
1236 | 
1237 |     if options.version:
1238 |         print('This is scholar.py %s.' % ScholarConf.VERSION)
1239 |         return 0
1240 | 
1241 |     if options.cookie_file:
1242 |         ScholarConf.COOKIE_JAR_FILE = options.cookie_file
1243 | 
1244 |     # Sanity-check the options: if they include a cluster ID query, it
1245 |     # makes no sense to have search arguments:
1246 |     if options.cluster_id is not None:
1247 |         if options.author or options.allw or options.some or options.none \
1248 |            or options.phrase or options.title_only or options.pub \
1249 |            or options.after or options.before:
1250 |             print('Cluster ID queries do not allow additional search arguments.')
1251 |             return 1
1252 | 
1253 |     querier = ScholarQuerier()
1254 |     settings = ScholarSettings()
1255 | 
1256 |     if options.citation == 'bt':
1257 |         settings.set_citation_format(ScholarSettings.CITFORM_BIBTEX)
1258 |     elif options.citation == 'en':
1259 |         settings.set_citation_format(ScholarSettings.CITFORM_ENDNOTE)
1260 |     elif options.citation == 'rm':
1261 |         settings.set_citation_format(ScholarSettings.CITFORM_REFMAN)
1262 |     elif options.citation == 'rw':
1263 |         settings.set_citation_format(ScholarSettings.CITFORM_REFWORKS)
1264 |     elif options.citation is not None:
1265 |         print('Invalid citation link format, must be one of "bt", "en", "rm", or "rw".')
1266 |         return 1
1267 | 
1268 |     querier.apply_settings(settings)
1269 | 
1270 |     if options.cluster_id:
1271 |         query = ClusterScholarQuery(cluster=options.cluster_id)
1272 |     else:
1273 |         query = SearchScholarQuery()
1274 |         if options.author:
1275 |             query.set_author(options.author)
1276 |         if options.allw:
1277 |             query.set_words(options.allw)
1278 |         if options.some:
1279 |             query.set_words_some(options.some)
1280 |         if options.none:
1281 |             query.set_words_none(options.none)
1282 |         if options.phrase:
1283 |             query.set_phrase(options.phrase)
1284 |         if options.title_only:
1285 |             query.set_scope(True)
1286 |         if options.pub:
1287 |             query.set_pub(options.pub)
1288 |         if options.after or options.before:
1289 |             query.set_timeframe(options.after, options.before)
1290 |         if options.no_patents:
1291 |             query.set_include_patents(False)
1292 |         if options.no_citations:
1293 |             query.set_include_citations(False)
1294 | 
1295 |     if options.count is not None:
1296 |         options.count = min(options.count, ScholarConf.MAX_PAGE_RESULTS)
1297 |         query.set_num_page_results(options.count)
1298 | 
1299 |     querier.send_query(query)
1300 | 
1301 |     if options.csv:
1302 |         csv(querier)
1303 |     elif options.csv_header:
1304 |         csv(querier, header=True)
1305 |     elif options.citation is not None:
1306 |         citation_export(querier)
1307 |     else:
1308 |         txt(querier, with_globals=options.txt_globals)
1309 | 
1310 |     if options.cookie_file:
1311 |         querier.save_cookies()
1312 | 
1313 |     return 0
1314 | 
1315 | if __name__ == "__main__":
1316 |     sys.exit(main())
1317 | 


--------------------------------------------------------------------------------
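The `txt()` helper in `scholar.py` builds its format string in two steps: `'[G] %%%ds %%s' % width` first produces something like `'[G] %7s %s'`, which is then applied to each label/value pair. The sketch below isolates that step so the alignment is easier to follow. It is a minimal illustration, not part of `scholar.py`: the function name `format_globals` and the sample data are hypothetical, the `[value, label, order]` layout is inferred from how `txt()` indexes each item, and unlike `txt()` it does not also take article label lengths into account.

```python
def format_globals(attrs):
    """Return aligned '[G] <label> <value>' lines.

    `attrs` maps a key to a [value, label, sort_order] list; this layout is
    an assumption inferred from the item[0]/item[1]/item[2] usage in txt().
    """
    # Sort by the third field, as txt() does.
    items = sorted(attrs.values(), key=lambda item: item[2])
    # Longest label decides the column width; [0] guards the empty case.
    max_label_len = max([len(str(item[1])) for item in items] + [0])
    # '%%' is a literal '%', so this yields a format such as '[G] %7s %s'.
    fmt = '[G] %%%ds %%s' % max(0, max_label_len - 4)
    # Entries whose value is None are skipped, mirroring txt().
    return [fmt % (item[1], item[0]) for item in items if item[0] is not None]


# Hypothetical attribute table, made up purely for illustration.
sample_attrs = {
    'num_results': [120, 'Results', 0],
    'url': ['https://example.org', 'URL', 1],
    'note': [None, 'Unused note', 2],
}

lines = format_globals(sample_attrs)
for line in lines:
    print(line)
```

Here the longest label has 11 characters, so the labels are right-aligned in a field of width 7, and the entry with a `None` value is omitted from the output.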