├── LICENSE ├── README.md ├── data ├── aminer_data │ └── __init__.py ├── github_data │ └── __init__.py ├── instagram_data │ └── __ └── yelp_data │ └── __init__.py ├── finetune.py ├── finetune ├── aminer_data │ └── finetune_feature.txt ├── github_data │ └── __init__.py ├── instagram_data │ └── __init__.py └── yelp_data │ └── __init__.py ├── main_graph_image_gcl.py ├── main_graph_text_gcl.py ├── pretrain └── aminer.pt ├── requirements.txt ├── scripts ├── CMCL.py ├── modules_layer.py ├── modules_model.py └── moodules_graph.py └── utils ├── __init__.py ├── focalloss.py ├── getdata.py ├── params.py └── util.py /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Meta-HG 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CM-GCL 2 | Co-Modality Graph Contrastive Learning for Imbalanced Node Classification 3 | ==== 4 | Official source code of "Co-Modality Graph Contrastive Learning for Imbalanced Node Classification 5 | " (NeurIPS 2022 https://openreview.net/pdf?id=f_kvHrM4Q0). 6 | 7 | ## Requirements 8 | 9 | This code is developed and tested with python 3.8.10. and the required packages are listed in the `requirements.txt`. 10 | 11 | Please run `pip install -r requirements.txt` to install all the dependencies. 12 | 13 | ## Usage 14 | ### Data Download 15 | As the data size is too large, please click on the link to download the [AMiner Data](https://drive.google.com/file/d/1TJfviH4--HPp12jQnqbJEpO58GrHCexJ/view?usp=share_link). 16 | 17 | Tips: the file size is quite big including (a.) the AMiner data, (b.) the fine-tuned node features (finetune_feature_20301141240.txt), and (c.) the pre-trained pt model file (aminer_node_text_20301141240.pt). 
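For reference, after unzipping, the three parts should end up roughly in the following locations (directory names follow the project tree above; the internal layout of part (a) depends on the archive itself):

```
CM-GCL/
├── data/aminer_data/                        <- part (a): the unzipped AMiner data files
├── finetune/aminer_data/
│   └── finetune_feature_20301141240.txt     <- part (b): fine-tuned node features
└── pretrain/
    └── aminer_node_text_20301141240.pt      <- part (c): pre-trained model checkpoint
```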
18 | 
19 | Once the download finishes, please unzip the file and put the unzipped data (part a) into the "data/aminer_data/" folder, the fine-tuned node features (part b) into the "finetune/aminer_data/" folder,
20 | and the pre-trained pt file (part c) into the "pretrain/" folder, as sketched above.
21 | 
22 | ### Model Pre-training
23 | For co-modality pre-training, we consider two types of modality combinations in our paper, i.e., graph modality with text modality, and
24 | graph modality with image modality (a minimal sketch of the cross-modal contrastive objective appears near the end of this README):
25 | 
26 | ```main_graph_text_gcl.py``` contains the code for graph contrastive learning on real-world datasets that include raw text and graph data.
27 | 
28 | ```main_graph_image_gcl.py``` contains the code for graph contrastive learning on real-world datasets that include raw images and graph data.
29 | 
30 | ### Model Fine-tuning
31 | 
32 | ```finetune.py``` contains the code for fine-tuning the model on downstream tasks.
33 | 
34 | The default settings for the AMiner data are already in place. If you want to train the model from scratch, please run ```python main_graph_text_gcl.py``` on the AMiner data.
35 | Since pre-training may take a while, you can also simply run ```python finetune.py``` to reproduce our results. We also provide a sample running log below.
36 | 
37 | 
38 | ## Dataset
39 | 
40 | AMiner data is a paper-citation academic network containing raw text content. The dataset used in this paper is already provided to train our model. Please refer to the website for more details: https://www.aminer.cn/aminer_data.
41 | 
42 | Yelp data is a review social graph containing raw text content. Please refer to the website for more details: http://odds.cs.stonybrook.edu/yelpchi-dataset/. If you need the raw text data for research purposes, please email the author listed on the website.
43 | 
44 | GitHub data is a graph for detecting malicious repositories on a social coding platform. Since our model uses raw text data for training, releasing the data publicly could raise privacy issues.
45 | 
46 | Instagram data is a social graph including raw text and image data for detecting drug traffickers on a social platform. As our model needs raw image and text data for training, releasing it could likewise raise privacy issues.
47 | If you want to run our code (the combination of graph modality and image modality) on your own datasets, please follow the format we provide in the instagram_data folder.
48 | 
49 | 
50 | 
51 | ## Contact
52 | Yiyue Qian - yqian5@nd.edu or yxq250@case.edu
53 | 
54 | Discussions, suggestions and questions are always welcome!
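
## Pre-training Objective (Sketch)

For readers who want a quick feel for the pre-training objective before reading `main_graph_text_gcl.py`, below is a minimal, self-contained sketch of a CLIP-style symmetric InfoNCE loss between node embeddings and text embeddings. It only illustrates the idea described in the Model Pre-training section; it is **not** the implementation in `scripts/CMCL.py`, and all names in it are placeholders.

```python
import torch
import torch.nn.functional as F


def cross_modal_contrastive_loss(node_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE between two modalities (illustrative sketch only).

    node_emb, text_emb: (N, d) projected embeddings where row i of both
    tensors describes the same entity, i.e., the diagonal pairs are positives.
    """
    node_emb = F.normalize(node_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (N, N) similarity matrix scaled by the temperature
    logits = node_emb @ text_emb.t() / temperature
    targets = torch.arange(node_emb.size(0), device=node_emb.device)

    # contrast in both directions: node -> text and text -> node
    loss_n2t = F.cross_entropy(logits, targets)
    loss_t2n = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_n2t + loss_t2n)
```

In the repository, the two inputs would come from the node encoder (`NodeEncoder` + `ProjectionHead`) and the text encoder (`TextEncoder` + `ProjectionHead`) defined in `scripts/modules_model.py`, with the temperature taken from `--temperature` in `utils/params.py`.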
55 | 56 | 57 | 58 | ## Citation 59 | If our work helps you, please cite our paper: 60 | 61 | ``` 62 | @inproceedings{qianco, 63 | title={Co-Modality Graph Contrastive Learning for Imbalanced Node Classification}, 64 | author={Qian, Yiyue and Zhang, Chunhui and Zhang, Yiming and Wen, Qianlong and Ye, Yanfang and Zhang, Chuxu}, 65 | booktitle={Advances in Neural Information Processing Systems}, 66 | year={2022} 67 | } 68 | ``` 69 | 70 | ## Logger 71 | This is a sample running logger which records the output and the model performance for AMiner data (also provide a output.log in aminer_data folder): 72 | ``` 73 | python main_graph_text_gcl.py 74 | 75 | Epoch: 1 76 | 100%|██████████| 145/145 [00:23<00:00, 6.19it/s, lr=0.001, train_loss=41.7] 77 | 100%|██████████| 37/37 [00:07<00:00, 5.08it/s, valid_loss=10.6] 78 | 0%| | 0/145 [00:00 ' \ 46 | + str(self.out_features) + ')' 47 | 48 | 49 | class GraphAttentionLayer(nn.Module): 50 | """ 51 | Simple GAT layer, similar to https://arxiv.org/abs/1710.10903 52 | """ 53 | def __init__(self, in_features, out_features, dropout, alpha, concat=True): 54 | super(GraphAttentionLayer, self).__init__() 55 | self.dropout = dropout 56 | self.in_features = in_features 57 | self.out_features = out_features 58 | self.alpha = alpha 59 | self.concat = concat 60 | 61 | self.W = nn.Parameter(torch.empty(size=(in_features, out_features))) 62 | nn.init.xavier_uniform_(self.W.data, gain=1.414) 63 | self.a = nn.Parameter(torch.empty(size=(2*out_features, 1))) 64 | nn.init.xavier_uniform_(self.a.data, gain=1.414) 65 | 66 | self.leakyrelu = nn.LeakyReLU(self.alpha) 67 | 68 | def forward(self, h, adj): 69 | Wh = torch.mm(h, self.W) # h.shape: (N, in_features), Wh.shape: (N, out_features) 70 | e = self._prepare_attentional_mechanism_input(Wh) 71 | 72 | zero_vec = -9e15*torch.ones_like(e) 73 | attention = torch.where(adj > 0, e, zero_vec) 74 | attention = F.softmax(attention, dim=1) 75 | attention = F.dropout(attention, self.dropout, training=self.training) 76 | h_prime = torch.matmul(attention, Wh) 77 | 78 | if self.concat: 79 | return F.elu(h_prime) 80 | else: 81 | return h_prime 82 | 83 | def _prepare_attentional_mechanism_input(self, Wh): 84 | # Wh.shape (N, out_feature) 85 | # self.a.shape (2 * out_feature, 1) 86 | # Wh1&2.shape (N, 1) 87 | # e.shape (N, N) 88 | Wh1 = torch.matmul(Wh, self.a[:self.out_features, :]) 89 | Wh2 = torch.matmul(Wh, self.a[self.out_features:, :]) 90 | # broadcast add 91 | e = Wh1 + Wh2.T 92 | return self.leakyrelu(e) 93 | 94 | def __repr__(self): 95 | return self.__class__.__name__ + ' (' + str(self.in_features) + ' -> ' + str(self.out_features) + ')' 96 | 97 | 98 | # Graphsage layer 99 | class GraphSageConv(Module): 100 | """ 101 | Simple Graphsage layer 102 | """ 103 | 104 | def __init__(self, in_features, out_features, bias=False): 105 | super(GraphSageConv, self).__init__() 106 | 107 | self.proj = nn.Linear(in_features * 2, out_features, bias=bias) 108 | 109 | self.reset_parameters() 110 | 111 | # print("note: for dense graph in graphsage, require it normalized.") 112 | 113 | def reset_parameters(self): 114 | 115 | nn.init.normal_(self.proj.weight) 116 | 117 | if self.proj.bias is not None: 118 | nn.init.constant_(self.proj.bias, 0.) 119 | 120 | def forward(self, features, adj): 121 | """ 122 | Args: 123 | adj: can be sparse or dense matrix. 124 | """ 125 | 126 | # fuse info from neighbors. 
to be added: 127 | if not isinstance(adj, torch.sparse.Tensor): 128 | if len(adj.shape) == 3: 129 | neigh_feature = torch.bmm(adj, features) / ( 130 | adj.sum(dim=1).reshape((adj.shape[0], adj.shape[1], -1)) + 1) 131 | else: 132 | neigh_feature = torch.mm(adj, features) / (adj.sum(dim=1).reshape(adj.shape[0], -1) + 1) 133 | else: 134 | # print("spmm not implemented for batch training. Note!") 135 | 136 | neigh_feature = torch.spmm(adj, features) / (adj.to_dense().sum(dim=1).reshape(adj.shape[0], -1) + 1) 137 | 138 | # perform conv 139 | data = torch.cat([features, neigh_feature], dim=-1) 140 | combined = self.proj(data) 141 | 142 | return combined 143 | -------------------------------------------------------------------------------- /scripts/modules_model.py: -------------------------------------------------------------------------------- 1 | from torch import nn 2 | import timm 3 | from transformers import DistilBertModel, DistilBertConfig 4 | from transformers import BertModel,BertConfig,BertTokenizer 5 | import torch.nn.functional as F 6 | from scripts.modules_graph import GraphConvolution,GraphAttentionLayer,GraphSageConv 7 | from utils import args 8 | from transformers import DistilBertTokenizer 9 | import cv2 10 | from matplotlib import pyplot as plt 11 | import torch 12 | 13 | 14 | class ImageEncoder(nn.Module): 15 | """ 16 | Encode images to a fixed size vector 17 | """ 18 | 19 | def __init__( 20 | self, transform,model_name=args.image_encoder_model, pretrained=args.pretrained, trainable=args.trainable 21 | ): 22 | super().__init__() 23 | # loading the pretrained model 24 | self.model = timm.create_model( 25 | model_name, pretrained, num_classes=0, global_pool="avg" 26 | ) 27 | 28 | for p in self.model.parameters(): 29 | p.requires_grad = trainable 30 | 31 | self.transforms = transform 32 | def forward(self, x): 33 | return self.model(x) 34 | 35 | def get_finetune_embed(self,img): 36 | 37 | self.model.train() 38 | image = plt.imread(img) 39 | image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) 40 | image = self.transforms(image=image)['image'] 41 | x = torch.tensor(image).permute(2, 0, 1).float().unsqueeze(0) 42 | 43 | return self.model(x) 44 | 45 | 46 | class TextEncoder(nn.Module): 47 | def __init__(self, model_name=args.text_encoder_model, pretrained=args.pretrained, trainable=args.trainable): 48 | super().__init__() 49 | if pretrained: 50 | if args.text_encoder_model == 'distilbert-base-uncased': 51 | self.model = DistilBertModel.from_pretrained(model_name) 52 | self.tokenizer = DistilBertTokenizer.from_pretrained(args.text_encoder_tokenizer) 53 | elif args.text_encoder_model == 'bert-base-uncased': 54 | self.model = BertModel.from_pretrained(model_name) 55 | self.tokenizer = BertTokenizer.from_pretrained(args.text_encoder_tokenizer) 56 | else: 57 | if args.text_encoder_model == 'distilbert-base-uncased': 58 | self.model = DistilBertModel(config=DistilBertConfig()) 59 | self.tokenizer = DistilBertTokenizer.from_pretrained(args.text_encoder_tokenizer) 60 | elif args.text_encoder_model == 'bert-base-uncased': 61 | self.model = BertModel(config=BertConfig()) 62 | self.tokenizer = BertTokenizer.from_pretrained(args.text_encoder_tokenizer) 63 | 64 | for p in self.model.parameters(): 65 | p.requires_grad = trainable 66 | 67 | # Using the CLS token hidden representation as the sentence's embedding 68 | self.target_token_idx = 0 69 | 70 | def forward(self, input_ids, attention_mask): 71 | output = self.model(input_ids=input_ids, attention_mask=attention_mask) 72 | last_hidden_state = 
output.last_hidden_state 73 | return last_hidden_state[:, self.target_token_idx, :] 74 | 75 | def get_finetune_embed(self,text): 76 | 77 | tokens = self.tokenizer(text, padding=True, truncation=True, max_length=args.max_length, return_tensors='pt').to(args.device) 78 | self.model.train() 79 | return self.model(**tokens).last_hidden_state[:,self.target_token_idx,:] 80 | 81 | 82 | class ProjectionHead(nn.Module): 83 | def __init__( 84 | self, 85 | embedding_dim, 86 | projection_dim=args.projection_dim, 87 | dropout=args.dropout 88 | ): 89 | super().__init__() 90 | self.projection = nn.Linear(embedding_dim, projection_dim) 91 | self.gelu = nn.GELU() 92 | self.fc = nn.Linear(projection_dim, projection_dim) 93 | self.dropout = nn.Dropout(dropout) 94 | self.layer_norm = nn.LayerNorm(projection_dim) 95 | 96 | def forward(self, x): 97 | projected = self.projection(x) 98 | x = self.gelu(projected) 99 | x = self.fc(x) 100 | x = self.dropout(x) 101 | x = x + projected 102 | x = self.layer_norm(x) 103 | return x 104 | 105 | class FeatureProjection(nn.Module): 106 | def __init__( 107 | self, feature, 108 | projection_dim = args.feat_projection_dim 109 | ): 110 | super().__init__() 111 | 112 | self.projection = {k: nn.Linear(v.shape[1], projection_dim).to(args.device) for k, v in feature.items()} 113 | self.gelu = nn.GELU() 114 | self.layer_norm = nn.LayerNorm(projection_dim) 115 | 116 | def forward(self, feature): 117 | project_feature = [] 118 | for k,v in feature.items(): 119 | projected = self.projection[k](v) 120 | x = self.gelu(projected) 121 | x = self.layer_norm(x) 122 | project_feature.append(x) 123 | 124 | if len(project_feature) >1: 125 | # return torch.cat([project_feature[0], project_feature[1]]) 126 | return torch.cat(project_feature,dim=0) 127 | else: 128 | return project_feature[0] 129 | # return torch.cat([project_feature[0],project_feature[1]]) 130 | 131 | class NodeEncoder(nn.Module): 132 | def __init__(self): 133 | super(NodeEncoder, self).__init__() 134 | 135 | self.dropout = args.node_dropout 136 | if args.node_encoder_model == 'gcn': 137 | self.gc1 = GraphConvolution(args.feat_projection_dim, args.hidden_dim) 138 | self.gc2 = GraphConvolution(args.hidden_dim, args.out_dim) 139 | 140 | elif args.node_encoder_model == 'gat': 141 | 142 | self.attentions = [GraphAttentionLayer(args.feat_projection_dim, args.hidden_dim, dropout=self.dropout, alpha=args.alpha, concat=True) for _ in 143 | range(args.nheads)] 144 | for i, attention in enumerate(self.attentions): 145 | self.add_module('attention_{}'.format(i), attention) 146 | 147 | self.out_att = GraphAttentionLayer(args.hidden_dim * args.nheads, args.out_dim, dropout=args.dropout, alpha=args.alpha, concat=False) 148 | elif args.node_encoder_model == 'sage': 149 | self.sage1 = GraphSageConv(args.feat_projection_dim, args.hidden_dim) 150 | self.sage2 = GraphSageConv(args.hidden_dim, args.out_dim) 151 | 152 | 153 | def forward(self, x, adj): 154 | if args.node_encoder_model == 'gcn': 155 | x1 = F.relu(self.gc1(x, adj)) 156 | x1 = F.dropout(x1, self.dropout, training=self.training) 157 | x2 = self.gc2(x1, adj) 158 | 159 | elif args.node_encoder_model == 'gat': 160 | x = F.dropout(x, self.dropout, training=self.training) 161 | x1 = torch.cat([att(x, adj) for att in self.attentions], dim=1) 162 | x1 = F.dropout(x1, self.dropout, training=self.training) 163 | x2 = F.elu(self.out_att(x1, adj)) 164 | 165 | elif args.node_encoder_model == 'sage': 166 | 167 | x1 = F.relu(self.sage1(x, adj)) 168 | x1 = F.dropout(x1, self.dropout, 
training=self.training) 169 | x2 = F.relu(self.sage2(x1, adj)) 170 | x2 = F.dropout(x2, self.dropout, training=self.training) 171 | 172 | return x2 173 | 174 | 175 | class NodeClassifier(nn.Module): 176 | def __init__(self): 177 | super(NodeClassifier, self).__init__() 178 | 179 | self.nodencoder = NodeEncoder() 180 | self.mlp = nn.Linear(args.out_dim, args.nclass) 181 | self.dropout = args.dropout 182 | 183 | self.reset_parameters() 184 | 185 | def reset_parameters(self): 186 | nn.init.normal_(self.mlp.weight, std=0.05) 187 | 188 | def forward(self, x, adj): 189 | x = F.relu(self.nodencoder(x, adj)) 190 | x = F.dropout(x, self.dropout, training=self.training) 191 | x = self.mlp(x[:,:]) 192 | 193 | return x 194 | 195 | 196 | class Classifier(nn.Module): 197 | def __init__(self,input_dim= args.out_dim,output_dim=args.nclass): 198 | super(Classifier, self).__init__() 199 | 200 | 201 | self.mlp = nn.Linear(input_dim, output_dim) 202 | self.dropout = args.dropout 203 | 204 | self.reset_parameters() 205 | 206 | def reset_parameters(self): 207 | nn.init.normal_(self.mlp.weight, std=0.05) 208 | 209 | def forward(self, x): 210 | x = F.relu(x) 211 | x = F.dropout(x, self.dropout, training=self.training) 212 | x = self.mlp(x[:,:]) 213 | 214 | return x 215 | 216 | 217 | class FineTuneProjection(nn.Module): 218 | def __init__( 219 | self, 220 | embedding_dim, 221 | projection_dim 222 | ): 223 | super().__init__() 224 | self.projection = nn.Linear(embedding_dim, projection_dim) 225 | self.gelu = nn.GELU() 226 | self.layer_norm = nn.LayerNorm(projection_dim) 227 | 228 | def forward(self, x): 229 | projected = self.projection(x) 230 | x = self.gelu(projected) 231 | x = self.layer_norm(x) 232 | return x 233 | -------------------------------------------------------------------------------- /scripts/moodules_graph.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from scripts.modules_layer import GraphConvolution,GraphSageConv,GraphAttentionLayer 5 | from utils.params import args 6 | 7 | class GCN(nn.Module): 8 | def __init__(self, nfeat, nhid, nclass, dropout,dataset,feature,projection_dim=200): 9 | super(GCN, self).__init__() 10 | 11 | self.gc1 = GraphConvolution(nfeat, nhid) 12 | self.gc2 = GraphConvolution(nhid, nhid) 13 | self.gc3 = GraphConvolution(nhid, nclass) 14 | self.dropout = dropout 15 | self.dataset = dataset 16 | if dataset == 'aminer': 17 | self.projection = FeatureProjection(feature,projection_dim).to(args.device) 18 | 19 | 20 | def forward(self, x, adj): 21 | if self.dataset == 'aminer': 22 | x = self.projection(x) 23 | x = F.relu(self.gc1(x, adj)) 24 | x = F.dropout(x, self.dropout, training=self.training) 25 | x =F.dropout(F.relu(self.gc2(x, adj))) 26 | x = self.gc3(x, adj) 27 | return F.log_softmax(x, dim=1) 28 | 29 | class SAGE(nn.Module): 30 | def __init__(self, nfeat, nhid, nclass, dropout,dataset,feature,projection_dim=200): 31 | super(SAGE, self).__init__() 32 | 33 | self.sage1 = GraphSageConv(nfeat, nhid) 34 | self.sage2 = GraphSageConv(nhid, nhid) 35 | self.mlp = nn.Linear(nhid, nclass) 36 | self.reset_parameters() 37 | 38 | self.dropout = dropout 39 | self.dataset = dataset 40 | if dataset == 'aminer': 41 | self.projection = FeatureProjection(feature,projection_dim).to(args.device) 42 | 43 | def reset_parameters(self): 44 | nn.init.normal_(self.mlp.weight,std=0.05) 45 | 46 | def forward(self, x, adj): 47 | if self.dataset == 'aminer': 48 | x = self.projection(x) 
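        # encode with two GraphSageConv layers (ReLU + dropout), then classify with a linear head and log-softmax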
49 | x = F.relu(self.sage1(x, adj)) 50 | x = F.dropout(x, self.dropout, training=self.training) 51 | x = F.dropout(F.relu(self.sage2(x, adj))) 52 | x = self.mlp(x) 53 | 54 | return F.log_softmax(x, dim=1) 55 | 56 | class SAGE(nn.Module): 57 | def __init__(self, nfeat, nhid, nclass, dropout,dataset,feature,projection_dim=200): 58 | super(SAGE, self).__init__() 59 | 60 | self.sage1 = GraphSageConv(nfeat, nhid) 61 | self.sage2 = GraphSageConv(nhid, nhid) 62 | self.mlp = nn.Linear(nhid, nclass) 63 | self.reset_parameters() 64 | 65 | self.dropout = dropout 66 | self.dataset = dataset 67 | if dataset == 'aminer': 68 | self.projection = FeatureProjection(feature,projection_dim).to(args.device) 69 | 70 | def reset_parameters(self): 71 | nn.init.normal_(self.mlp.weight,std=0.05) 72 | 73 | def forward(self, x, adj): 74 | if self.dataset == 'aminer': 75 | x = self.projection(x) 76 | x = F.relu(self.sage1(x, adj)) 77 | x = F.dropout(x, self.dropout, training=self.training) 78 | x = F.dropout(F.relu(self.sage2(x, adj))) 79 | x = self.mlp(x) 80 | 81 | return F.log_softmax(x, dim=1) 82 | 83 | class GAT(nn.Module): 84 | def __init__(self, nfeat, nhid, nclass, dropout, dataset,feature,projection_dim=200,alpha=0.2, nheads=1): 85 | super(GAT, self).__init__() 86 | 87 | self.attentions = [GraphAttentionLayer(nfeat, nhid, dropout=dropout, alpha=alpha, concat=True) for _ in 88 | range(nheads)] 89 | for i, attention in enumerate(self.attentions): 90 | self.add_module('attention_{}'.format(i), attention) 91 | 92 | self.out_proj = nn.Linear(nhid * nheads, nhid) 93 | 94 | if args.dataset == 'aminer': 95 | self.projection = FeatureProjection(feature, projection_dim).to(args.device) 96 | 97 | self.dropout = dropout 98 | self.mlp = nn.Linear(nhid, nclass) 99 | self.dropout = dropout 100 | 101 | self.reset_parameters() 102 | 103 | self.dataset = dataset 104 | 105 | def reset_parameters(self): 106 | nn.init.normal_(self.mlp.weight, std=0.05) 107 | nn.init.normal_(self.out_proj.weight, std=0.05) 108 | 109 | def forward(self, x, adj): 110 | if self.dataset == 'aminer': 111 | x = self.projection(x) 112 | 113 | x = torch.cat([att(x, adj) for att in self.attentions], dim=1) 114 | x = F.dropout(x, self.dropout, training=self.training) 115 | x = F.elu(self.out_proj(x)) 116 | x = self.mlp(x) 117 | 118 | return F.log_softmax(x, dim=1) 119 | 120 | class FeatureProjection(nn.Module): 121 | def __init__( 122 | self, feature, 123 | projection_dim 124 | ): 125 | super().__init__() 126 | # device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 127 | self.projection = {k: nn.Linear(v.shape[1], projection_dim).to(args.device) for k, v in feature.items()} 128 | self.gelu = nn.GELU() 129 | self.layer_norm = nn.LayerNorm(projection_dim) 130 | 131 | def forward(self, feature): 132 | project_feature = [] 133 | for k,v in feature.items(): 134 | projected = self.projection[k](v) 135 | x = self.gelu(projected) 136 | x = self.layer_norm(x) 137 | 138 | project_feature.append(x) 139 | 140 | if len(project_feature) ==2: 141 | return torch.cat([project_feature[0],project_feature[1]]) 142 | elif len(project_feature) ==3: 143 | return torch.cat([project_feature[0],project_feature[1],project_feature[2]]) 144 | else: 145 | return torch.tensor(project_feature[0].clone().detach()) 146 | 147 | 148 | class GraphAttentionLayer(nn.Module): 149 | """ 150 | Simple GAT layer, similar to https://arxiv.org/abs/1710.10903 151 | """ 152 | 153 | def __init__(self, in_features, out_features, dropout, alpha, concat=True): 154 | 
super(GraphAttentionLayer, self).__init__() 155 | self.dropout = dropout 156 | self.in_features = in_features 157 | self.out_features = out_features 158 | self.alpha = alpha 159 | self.concat = concat 160 | 161 | self.W = nn.Parameter(torch.zeros(size=(in_features, out_features))) 162 | nn.init.xavier_uniform_(self.W.data, gain=1.414) 163 | self.a = nn.Parameter(torch.zeros(size=(2*out_features, 1))) 164 | nn.init.xavier_uniform_(self.a.data, gain=1.414) 165 | 166 | self.leakyrelu = nn.LeakyReLU(self.alpha) 167 | 168 | def forward(self, input, adj): 169 | if isinstance(adj, torch.sparse.FloatTensor): 170 | adj = adj.to_dense() 171 | 172 | h = torch.mm(input, self.W) 173 | N = h.size()[0] 174 | 175 | a_input = torch.cat([h.repeat(1, N).view(N * N, -1), h.repeat(N, 1)], dim=1).view(N, -1, 2 * self.out_features) 176 | e = self.leakyrelu(torch.matmul(a_input, self.a).squeeze(2)) 177 | 178 | zero_vec = -9e15*torch.ones_like(e) 179 | attention = torch.where(adj > 0, e, zero_vec) 180 | attention = F.softmax(attention, dim=1) 181 | attention = F.dropout(attention, self.dropout, training=self.training) 182 | h_prime = torch.matmul(attention, h) 183 | 184 | if self.concat: 185 | return F.elu(h_prime) 186 | else: 187 | return h_prime 188 | 189 | def __repr__(self): 190 | return self.__class__.__name__ + ' (' + str(self.in_features) + ' -> ' + str(self.out_features) + ')' 191 | 192 | 193 | 194 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- 1 | from .params import set_params,args 2 | -------------------------------------------------------------------------------- /utils/focalloss.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | 4 | from typing import Optional 5 | 6 | import torch 7 | import torch.nn as nn 8 | import torch.nn.functional as F 9 | from utils.getdata import encode_onehot 10 | import numpy as np 11 | import os 12 | os.environ['CUDA_VISIBLE_DEVICES'] = '1' 13 | 14 | 15 | # based on: 16 | # https://github.com/zhezh/focalloss/blob/master/focalloss.py 17 | 18 | class FocalLoss(nn.Module): 19 | r"""Criterion that computes Focal loss. 20 | 21 | According to [1], the Focal loss is computed as follows: 22 | 23 | .. math:: 24 | 25 | \text{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \, \text{log}(p_t) 26 | 27 | where: 28 | - :math:`p_t` is the model's estimated probability for each class. 29 | 30 | 31 | Arguments: 32 | alpha (float): Weighting factor :math:`\alpha \in [0, 1]`. 33 | gamma (float): Focusing parameter :math:`\gamma >= 0`. 34 | reduction (Optional[str]): Specifies the reduction to apply to the 35 | output: ‘none’ | ‘mean’ | ‘sum’. ‘none’: no reduction will be applied, 36 | ‘mean’: the sum of the output will be divided by the number of elements 37 | in the output, ‘sum’: the output will be summed. Default: ‘none’. 38 | 39 | Shape: 40 | - Input: :math:`(N, C, H, W)` where C = number of classes. 41 | - Target: :math:`(N, H, W)` where each value is 42 | :math:`0 ≤ targets[i] ≤ C−1`. 
43 | 44 | Examples: 45 | >>> N = 5 # num_classes 46 | >>> loss = tgm.losses.FocalLoss(alpha=0.5, gamma=2.0, reduction='mean') 47 | >>> input = torch.randn(1, N, 3, 5, requires_grad=True) 48 | >>> target = torch.empty(1, 3, 5, dtype=torch.long).random_(N) 49 | >>> output = loss(input, target) 50 | >>> output.backward() 51 | 52 | References: 53 | [1] https://arxiv.org/abs/1708.02002 54 | """ 55 | 56 | def __init__(self, device, alpha: float, gamma: Optional[float] = 2.0, 57 | reduction: Optional[str] = 'none') -> None: 58 | super(FocalLoss, self).__init__() 59 | self.alpha: float = alpha 60 | self.gamma: Optional[float] = gamma 61 | self.reduction: Optional[str] = reduction 62 | self.eps: float = 1e-6 63 | self.device = device 64 | 65 | def forward( 66 | self, 67 | input: torch.Tensor, 68 | target: torch.Tensor) -> torch.Tensor: 69 | if not torch.is_tensor(input): 70 | raise TypeError("Input type is not a torch.Tensor. Got {}" 71 | .format(type(input))) 72 | 73 | # compute softmax over the classes axis 74 | input_soft = F.softmax(input, dim=1) + self.eps 75 | 76 | target_one_hot = torch.tensor(encode_onehot(np.array(target.cpu()))).to(self.device) 77 | # compute the actual focal loss 78 | weight = torch.pow(1. - input_soft, self.gamma) 79 | focal = -self.alpha * weight * torch.log(input_soft) 80 | loss_tmp = torch.sum(target_one_hot * focal, dim=1) 81 | 82 | loss = -1 83 | if self.reduction == 'none': 84 | loss = loss_tmp 85 | elif self.reduction == 'mean': 86 | loss = torch.mean(loss_tmp) 87 | elif self.reduction == 'sum': 88 | loss = torch.sum(loss_tmp) 89 | else: 90 | raise NotImplementedError("Invalid reduction mode: {}" 91 | .format(self.reduction)) 92 | return loss 93 | 94 | 95 | ###################### 96 | # functional interface 97 | ###################### 98 | 99 | 100 | def focal_loss( 101 | input: torch.Tensor, 102 | target: torch.Tensor, 103 | alpha: float, 104 | gamma: Optional[float] = 2.0, 105 | reduction: Optional[str] = 'none') -> torch.Tensor: 106 | r"""Function that computes Focal loss. 107 | 108 | See :class:`~torchgeometry.losses.FocalLoss` for details. 109 | """ 110 | return FocalLoss(alpha, gamma, reduction)(input, target) 111 | 112 | 113 | def one_hot(labels: torch.Tensor, 114 | num_classes: int, 115 | device: Optional[torch.device] = None, 116 | dtype: Optional[torch.dtype] = None, 117 | eps: Optional[float] = 1e-6) -> torch.Tensor: 118 | r"""Converts an integer label 2D tensor to a one-hot 3D tensor. 119 | 120 | Args: 121 | labels (torch.Tensor) : tensor with labels of shape :math:`(N, H, W)`, 122 | where N is batch siz. Each value is an integer 123 | representing correct classification. 124 | num_classes (int): number of classes in labels. 125 | device (Optional[torch.device]): the desired device of returned tensor. 126 | Default: if None, uses the current device for the default tensor type 127 | (see torch.set_default_tensor_type()). device will be the CPU for CPU 128 | tensor types and the current CUDA device for CUDA tensor types. 129 | dtype (Optional[torch.dtype]): the desired data type of returned 130 | tensor. Default: if None, infers data type from values. 131 | 132 | Returns: 133 | torch.Tensor: the labels in one hot tensor. 
134 | 135 | Examples:: 136 | >>> labels = torch.LongTensor([[[0, 1], [2, 0]]]) 137 | >>> tgm.losses.one_hot(labels, num_classes=3) 138 | tensor([[[[1., 0.], 139 | [0., 1.]], 140 | [[0., 1.], 141 | [0., 0.]], 142 | [[0., 0.], 143 | [1., 0.]]]] 144 | """ 145 | if not torch.is_tensor(labels): 146 | raise TypeError("Input labels type is not a torch.Tensor. Got {}" 147 | .format(type(labels))) 148 | if not len(labels.shape) == 3: 149 | raise ValueError("Invalid depth shape, we expect BxHxW. Got: {}" 150 | .format(labels.shape)) 151 | if not labels.dtype == torch.int64: 152 | raise ValueError( 153 | "labels must be of the same dtype torch.int64. Got: {}" .format( 154 | labels.dtype)) 155 | if num_classes < 1: 156 | raise ValueError("The number of classes must be bigger than one." 157 | " Got: {}".format(num_classes)) 158 | batch_size, height, width = labels.shape 159 | one_hot = torch.zeros(batch_size, num_classes, height, width, 160 | device=device, dtype=dtype) 161 | return one_hot.scatter_(1, labels.unsqueeze(1), 1.0) + eps 162 | -------------------------------------------------------------------------------- /utils/getdata.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | from matplotlib import pyplot as plt 3 | import albumentations as A 4 | import numpy as np 5 | import scipy.sparse as sp 6 | import torch 7 | import random 8 | from utils.params import args 9 | 10 | 11 | class NodeImageDataset(): 12 | def __init__(self, entity_id,image,label, transforms): 13 | 14 | 15 | self.image_filenames = list(image) 16 | self.entity_id = entity_id 17 | self.transforms = transforms 18 | self.label = label 19 | 20 | def __getitem__(self, idx): 21 | 22 | item = {} 23 | # print(idx) 24 | 25 | try: 26 | image = plt.imread(self.image_filenames[idx]) 27 | image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) 28 | image = self.transforms(image=image)['image'] 29 | except: 30 | image = np.random.randn(224,224,3) 31 | # print("error happened") 32 | item['image'] = torch.tensor(image).permute(2, 0, 1).float() 33 | item['entity'] = self.entity_id[idx] 34 | item['label'] = self.label[idx] 35 | 36 | return item 37 | 38 | def __len__(self): 39 | return len(self.image_filenames) 40 | 41 | 42 | class NodeTextDataset(): 43 | def __init__(self, entity_id,text,label,tokenizer): 44 | 45 | self.text = list(text) 46 | 47 | self.encoded_text = tokenizer( 48 | self.text, padding=True, truncation=True, max_length=args.max_length, return_tensors='pt' 49 | ) 50 | 51 | self.entity_id = entity_id 52 | self.label = label 53 | 54 | def __getitem__(self, idx): 55 | item = { 56 | key: torch.tensor(values[idx]) 57 | for key, values in self.encoded_text.items() 58 | } 59 | item['text'] = self.text[idx] 60 | item['entity'] = self.entity_id[idx] 61 | item['label'] = self.label[idx] 62 | 63 | return item 64 | 65 | 66 | def __len__(self): 67 | return len(self.text) 68 | 69 | 70 | # get transformation for image data 71 | def get_transforms(mode="train"): 72 | if mode == "train": 73 | return A.Compose( 74 | [ 75 | A.Resize(args.image_size, args.image_size, always_apply=True), 76 | A.Normalize(max_pixel_value=255.0, always_apply=True), 77 | ] 78 | ) 79 | else: 80 | return A.Compose( 81 | [ 82 | A.Resize(args.image_size, args.image_size, always_apply=True), 83 | A.Normalize(max_pixel_value=255.0, always_apply=True), 84 | ] 85 | ) 86 | 87 | def split_imbalance(labels,train_ratio,val_ratio,test_ratio,imbalance_ratio): 88 | 89 | num_classes = len(set(labels.tolist())) 90 | c_idxs = [] # class-wise 
index 91 | train_idx = [] 92 | val_idx = [] 93 | test_idx = [] 94 | c_num_mat = np.zeros((num_classes,3)).astype(int) 95 | label_max = int(max(labels.tolist())+1) 96 | minority_index = [item for item in range(label_max) if labels.tolist().count(item) < len(labels.tolist())/num_classes] 97 | 98 | for i in range(num_classes): 99 | c_idx = (labels==i).nonzero()[:,-1].tolist() 100 | c_num = len(c_idx) 101 | if num_classes > 2: 102 | if i in minority_index: c_num = int(len(labels.tolist()) / num_classes * imbalance_ratio) 103 | else: 104 | if i in minority_index: c_num = int((len(labels.tolist())- labels.tolist().count(i)) * imbalance_ratio) 105 | 106 | print('The number of class {:d}: {:d}'.format(i,c_num)) 107 | random.shuffle(c_idx) 108 | c_idxs.append(c_idx) 109 | 110 | if c_num <4: 111 | if c_num < 3: 112 | print("too small class type") 113 | c_num_mat[i,0] = 1 114 | c_num_mat[i,1] = 1 115 | c_num_mat[i,2] = 1 116 | else: 117 | c_num_mat[i,0] = int(c_num/10 *train_ratio) 118 | c_num_mat[i,1] = int(c_num/10 * val_ratio) 119 | c_num_mat[i,2] = int(c_num/10 * test_ratio) 120 | 121 | 122 | train_idx = train_idx + c_idx[:c_num_mat[i,0]] 123 | 124 | val_idx = val_idx + c_idx[c_num_mat[i,0]:c_num_mat[i,0]+c_num_mat[i,1]] 125 | test_idx = test_idx + c_idx[c_num_mat[i,0]+c_num_mat[i,1]:c_num_mat[i,0]+c_num_mat[i,1]+c_num_mat[i,2]] 126 | 127 | random.shuffle(train_idx) 128 | 129 | 130 | return train_idx, val_idx, test_idx, c_num_mat 131 | 132 | def split_balance(labels,train_ratio,val_ratio,test_ratio): 133 | 134 | num_classes = len(set(labels.tolist())) 135 | c_idxs = [] 136 | train_idx = [] 137 | val_idx = [] 138 | test_idx = [] 139 | c_num_mat = np.zeros((num_classes,3)).astype(int) 140 | count_0, count_1 = labels.tolist().count(0), labels.tolist().count(1) 141 | count_min = min(count_0,count_1) 142 | 143 | for i in range(num_classes): 144 | c_idx = (labels==i).nonzero()[:,-1].tolist() 145 | c_num = count_min 146 | 147 | print('The number of class {:d}: {:d}'.format(i,c_num)) 148 | 149 | random.shuffle(c_idx) 150 | c_idxs.append(c_idx) 151 | 152 | if c_num <4: 153 | if c_num < 3: 154 | print("too small class type") 155 | c_num_mat[i,0] = 1 156 | c_num_mat[i,1] = 1 157 | c_num_mat[i,2] = 1 158 | else: 159 | c_num_mat[i,0] = int(c_num/10 *train_ratio) 160 | c_num_mat[i,1] = int(c_num/10 * val_ratio) 161 | c_num_mat[i,2] = int(c_num/10 * test_ratio) 162 | 163 | 164 | train_idx = train_idx + c_idx[:c_num_mat[i,0]] 165 | 166 | val_idx = val_idx + c_idx[c_num_mat[i,0]:c_num_mat[i,0]+c_num_mat[i,1]] 167 | test_idx = test_idx + c_idx[c_num_mat[i,0]+c_num_mat[i,1]:c_num_mat[i,0]+c_num_mat[i,1]+c_num_mat[i,2]] 168 | 169 | 170 | return train_idx, val_idx, test_idx, c_num_mat 171 | 172 | def split_genuine(labels,train_ratio,val_ratio,test_ratio): 173 | 174 | num_classes = len(set(labels.tolist())) 175 | c_idxs = [] # class-wise index 176 | train_idx = [] 177 | val_idx = [] 178 | test_idx = [] 179 | c_num_mat = np.zeros((num_classes,3)).astype(int) 180 | 181 | for i in range(num_classes): 182 | c_idx = (labels==i).nonzero()[:,-1].tolist() 183 | c_num = len(c_idx) 184 | print('The number of class {:d}: {:d}'.format(i,c_num)) 185 | random.shuffle(c_idx) 186 | c_idxs.append(c_idx) 187 | 188 | if c_num <4: 189 | if c_num < 3: 190 | print("too small class type") 191 | c_num_mat[i,0] = 1 192 | c_num_mat[i,1] = 1 193 | c_num_mat[i,2] = 1 194 | else: 195 | c_num_mat[i,0] = int(c_num/10 *train_ratio) 196 | c_num_mat[i,1] = int(c_num/10 * val_ratio) 197 | c_num_mat[i,2] = int(c_num/10 * test_ratio) 198 | 199 | 200 | 
train_idx = train_idx + c_idx[:c_num_mat[i,0]] 201 | 202 | val_idx = val_idx + c_idx[c_num_mat[i,0]:c_num_mat[i,0]+c_num_mat[i,1]] 203 | test_idx = test_idx + c_idx[c_num_mat[i,0]+c_num_mat[i,1]:c_num_mat[i,0]+c_num_mat[i,1]+c_num_mat[i,2]] 204 | 205 | 206 | return train_idx, val_idx, test_idx, c_num_mat 207 | 208 | def load_data_for_pretrain(): 209 | 210 | if args.dataset == 'instagram': 211 | embed_dir = args.feature_path 212 | relation_dir = args.relation_path 213 | pos_dir = args.pos_path 214 | 215 | idx_features_labels = np.round(np.genfromtxt(embed_dir, 216 | dtype=np.float, delimiter=' ', invalid_raise=True),4) 217 | 218 | features = idx_features_labels[:, 770:] 219 | features = sp.csr_matrix(features, dtype=np.float32) 220 | total_num = features.shape[0] 221 | features = {'feature': torch.FloatTensor(np.array(features.todense())).to(args.device)} 222 | number = 8225 223 | label = encode_onehot(idx_features_labels[:number, 1]) 224 | 225 | # build graph 226 | idx = np.array(idx_features_labels[:, 0], dtype=np.float) 227 | idx_map = {j: i for i, j in enumerate(idx)} 228 | 229 | 230 | elif args.dataset == 'github': 231 | 232 | embed_dir = args.feature_path 233 | 234 | relation_dir = args.relation_path 235 | pos_dir = args.pos_path 236 | 237 | idx_features_labels = np.genfromtxt(embed_dir, 238 | dtype=np.dtype(str), delimiter=',', invalid_raise=True) 239 | 240 | features = sp.csr_matrix(idx_features_labels[:, :], dtype=np.float32) 241 | 242 | label_data = idx_features_labels[:,0] 243 | label = encode_onehot(label_data) 244 | 245 | # build graph 246 | idx = np.array(list(range(idx_features_labels.shape[0])), dtype=np.int32) 247 | idx_map = {j: i for i, j in enumerate(idx)} 248 | 249 | total_num = features.shape[0] 250 | 251 | elif args.dataset == 'yelp': 252 | 253 | fea_dir = args.feature_path 254 | relation_dir = args.relation_path 255 | idx_features_labels = np.genfromtxt(fea_dir, dtype=np.dtype(str), delimiter=' ', invalid_raise=True) 256 | features = idx_features_labels[:, 2:] 257 | pos_dir = args.pos_path 258 | if args.yelp_concate: 259 | embed_dir = args.embed_path 260 | idx_embeds_labels = np.genfromtxt(embed_dir, 261 | dtype=np.dtype(str), delimiter=' ', invalid_raise=True) 262 | embeds = idx_embeds_labels[:, 2:] 263 | 264 | features = np.concatenate((features, embeds), axis=1) 265 | 266 | features = sp.csr_matrix(features, dtype=np.float32) 267 | total_num = features.shape[0] 268 | features = normalize(features) 269 | 270 | number = 67395 271 | label = torch.tensor(encode_onehot(idx_features_labels[:number, 1])) 272 | 273 | # build graph 274 | idx = np.array([i for i in range(features.shape[0])]) 275 | idx_map = {j: i for i, j in enumerate(idx)} 276 | 277 | features = {'feature': torch.FloatTensor(np.array(features.todense())).to(args.device)} 278 | 279 | elif args.dataset == 'aminer': 280 | 281 | 282 | fea_dir = args.feature_path 283 | author_fea_dir = args.author_feature_path 284 | 285 | relation_dir = args.relation_path 286 | idx_features_labels = np.genfromtxt(fea_dir, 287 | dtype=np.dtype(str), delimiter=',', invalid_raise=True, skip_header=True) 288 | paper_features = idx_features_labels[:, 2:] 289 | author_features = np.genfromtxt(author_fea_dir, dtype=np.dtype(str), delimiter=',', invalid_raise=True, 290 | skip_header=True)[:, 2:] 291 | 292 | number = 18089 293 | 294 | paper_features = sp.csr_matrix(paper_features, dtype=np.float32) 295 | author_features = sp.csr_matrix(author_features, dtype=np.float32) 296 | paper_features = normalize(paper_features) 297 | 
paper_features = torch.FloatTensor(np.array(paper_features.todense())) 298 | author_features = torch.FloatTensor(np.array(author_features.todense())) 299 | features = {'paper_feature': paper_features.to(args.device), 'author_features': author_features.to(args.device)} 300 | label = encode_onehot(idx_features_labels[:number, 1].astype('float')) 301 | 302 | # build graph 303 | idx = np.array([i for i in range(paper_features.shape[0] + author_features.shape[0])]) 304 | total_num = paper_features.shape[0] + author_features.shape[0] 305 | idx_map = {j: i for i, j in enumerate(idx)} 306 | 307 | pos_dir = args.pos_path 308 | 309 | edges_unordered = np.genfromtxt(relation_dir, 310 | dtype=np.int32)[:, :-1] 311 | edges = np.array(list(map(idx_map.get, edges_unordered.flatten())), 312 | dtype=np.int32).reshape(edges_unordered.shape) 313 | adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])), 314 | shape=(total_num,total_num), 315 | dtype=np.float32) 316 | 317 | # build symmetric adjacency matrix 318 | adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj) 319 | 320 | # features = normalize(features) 321 | adj = normalize(adj + sp.eye(adj.shape[0])) 322 | 323 | pos_unordered = np.genfromtxt(pos_dir, 324 | dtype=np.int32) 325 | pos = np.array(list(map(idx_map.get, pos_unordered.flatten())), 326 | dtype=np.int32).reshape(pos_unordered.shape) 327 | pos = sp.coo_matrix((np.ones(pos.shape[0]), (pos[:, 0], pos[:, 1])), 328 | shape=(number,number), 329 | dtype=np.int32) 330 | # build symmetric adjacency matrix 331 | pos = pos + pos.T.multiply(pos.T > pos) - pos.multiply(pos.T > pos) 332 | pos = pos + sp.eye(pos.shape[0]) 333 | 334 | 335 | labels = torch.LongTensor(np.where(label)[1]).to(args.device) 336 | adj = sparse_mx_to_torch_sparse_tensor(adj).to(args.device) 337 | pos = sparse_mx_to_torch_sparse_tensor(pos).to(args.device) 338 | 339 | return adj, features, labels, pos 340 | 341 | def load_data_for_finetune(finetuned_feature_dir): 342 | 343 | if args.dataset == 'instagram': 344 | embed_dir = args.feature_path 345 | relation_dir = args.relation_path 346 | idx_features_labels = np.genfromtxt(embed_dir, 347 | dtype=np.dtype(str), delimiter=' ', invalid_raise=True) 348 | features = sp.csr_matrix(idx_features_labels[:, 2:], dtype=np.float32) 349 | number = 8225 350 | label = encode_onehot(idx_features_labels[:number, 1]) 351 | 352 | # build graph 353 | idx = np.array(idx_features_labels[:, 0], dtype=np.float) 354 | idx_map = {j: i for i, j in enumerate(idx)} 355 | 356 | elif args.dataset == 'github': 357 | 358 | embed_dir = args.feature_path 359 | 360 | relation_dir = args.relation_path 361 | 362 | idx_features_labels = np.genfromtxt(embed_dir, 363 | dtype=np.dtype(str), delimiter=',', invalid_raise=True) 364 | 365 | features = sp.csr_matrix(idx_features_labels[:, :], dtype=np.float32) 366 | 367 | label_data = idx_features_labels[:,0] 368 | 369 | label = encode_onehot(label_data) 370 | 371 | # build graph 372 | idx = np.array(list(range(idx_features_labels.shape[0])), dtype=np.int32) 373 | # idx = np.array(idx_features_labels[:, 0], dtype=np.float) 374 | idx_map = {j: i for i, j in enumerate(idx)} 375 | 376 | elif args.dataset == 'yelp': 377 | 378 | fea_dir = args.feature_path 379 | relation_dir = args.relation_path 380 | number = 67395 381 | 382 | idx_features_labels = np.genfromtxt(fea_dir, dtype=np.dtype(str), delimiter=' ', invalid_raise=True) 383 | features = idx_features_labels[:, 2:] 384 | features = normalize(features) 385 | features = sp.csr_matrix(features, 
dtype=np.float32) 386 | 387 | total_num = features.shape[0] 388 | label_data = idx_features_labels[:number,1] 389 | label = torch.tensor(encode_onehot(label_data)) 390 | 391 | # build graph 392 | idx = np.array([i for i in range(features.shape[0])]) 393 | idx_map = {j: i for i, j in enumerate(idx)} 394 | 395 | features = {'feature': torch.FloatTensor(np.array(features.todense())).to(args.device)} 396 | 397 | elif args.dataset == 'aminer': 398 | 399 | fea_dir = args.feature_path 400 | 401 | author_fea_dir = args.author_feature_path 402 | 403 | relation_dir = args.relation_path 404 | idx_features_labels = np.genfromtxt(fea_dir, 405 | dtype=np.dtype(str), delimiter=',', invalid_raise=True, skip_header=True) 406 | features = torch.from_numpy(np.genfromtxt(finetuned_feature_dir, delimiter=' ')).float() 407 | features = sp.csr_matrix(features, dtype=np.float32) 408 | features = normalize(features) 409 | features = torch.FloatTensor(np.array(features.todense())) 410 | features = {'features': features.to(args.device)} 411 | 412 | paper_features = idx_features_labels[:, 2:] 413 | author_features = np.genfromtxt(author_fea_dir, dtype=np.dtype(str), delimiter=',', invalid_raise=True, 414 | skip_header=True)[:, 2:] 415 | number = 18089 416 | 417 | # paper_features = sp.csr_matrix(paper_features, dtype=np.float32) 418 | # author_features = sp.csr_matrix(author_features, dtype=np.float32) 419 | # paper_features = normalize(paper_features) 420 | # paper_features = torch.FloatTensor(np.array(paper_features.todense())) 421 | # author_features = torch.FloatTensor(np.array(author_features.todense())) 422 | # features = {'paper_feature': paper_features.to(args.device), 'author_features': author_features.to(args.device)} 423 | label_data = idx_features_labels[:number, 1].astype('float') 424 | label = encode_onehot(label_data) 425 | 426 | 427 | 428 | # build graph 429 | # idx = np.array([i for i in range(paper_features.shape[0] + author_features.shape[0])]) 430 | # total_num = paper_features.shape[0] + author_features.shape[0] 431 | 432 | idx = np.array([i for i in range(paper_features.shape[0] + author_features.shape[0])]) 433 | total_num = paper_features.shape[0] + author_features.shape[0] 434 | idx_map = {j: i for i, j in enumerate(idx)} 435 | 436 | 437 | edges_unordered = np.genfromtxt(relation_dir, 438 | dtype=np.int32)[:, :-1] 439 | 440 | edges = np.array(list(map(idx_map.get, edges_unordered.flatten())), 441 | dtype=np.int32).reshape(edges_unordered.shape) 442 | 443 | adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])), 444 | shape=(total_num, total_num), 445 | dtype=np.float32) 446 | 447 | # build symmetric adjacency matrix 448 | adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj) 449 | 450 | # features = normalize(features) 451 | adj = normalize(adj + sp.eye(adj.shape[0])) 452 | 453 | if args.dataset == 'yelp': 454 | idx_train, idx_val, idx_test, c_num_mat = split_genuine(torch.tensor(label_data), train_ratio=4, val_ratio=2, 455 | test_ratio=4) 456 | else: 457 | idx_train, idx_val, idx_test, c_num_mat = split_genuine(torch.tensor(label_data),train_ratio=7,val_ratio=1,test_ratio=2) 458 | 459 | idx_train = torch.LongTensor(idx_train) 460 | idx_val = torch.LongTensor(idx_val) 461 | idx_test = torch.LongTensor(idx_test) 462 | 463 | labels = torch.LongTensor(np.where(label)[1]) 464 | adj = sparse_mx_to_torch_sparse_tensor(adj) 465 | 466 | return adj, features, labels, idx_train, idx_val, idx_test 467 | 468 | 469 | def 
load_data_for_finetune_imbalance(finetuned_feature_dir): 470 | 471 | if args.dataset == 'instagram': 472 | embed_dir = args.feature_path 473 | relation_dir = args.relation_path 474 | idx_features_labels = np.genfromtxt(embed_dir, 475 | dtype=np.dtype(str), delimiter=' ', invalid_raise=True) 476 | features = sp.csr_matrix(idx_features_labels[:, 2:], dtype=np.float32) 477 | number = 8225 478 | label = encode_onehot(idx_features_labels[:number, 1]) 479 | 480 | # build graph 481 | idx = np.array(idx_features_labels[:, 0], dtype=np.float) 482 | idx_map = {j: i for i, j in enumerate(idx)} 483 | 484 | elif args.dataset == 'github': 485 | 486 | embed_dir = args.feature_path 487 | 488 | relation_dir = args.relation_path 489 | 490 | idx_features_labels = np.genfromtxt(embed_dir, 491 | dtype=np.dtype(str), delimiter=',', invalid_raise=True) 492 | 493 | features = sp.csr_matrix(idx_features_labels[:, :], dtype=np.float32) 494 | 495 | label_data = idx_features_labels[:,0] 496 | 497 | label = encode_onehot(label_data) 498 | 499 | # build graph 500 | idx = np.array(list(range(idx_features_labels.shape[0])), dtype=np.int32) 501 | # idx = np.array(idx_features_labels[:, 0], dtype=np.float) 502 | idx_map = {j: i for i, j in enumerate(idx)} 503 | 504 | elif args.dataset == 'yelp': 505 | 506 | fea_dir = args.feature_path 507 | relation_dir = args.relation_path 508 | number = 67395 509 | 510 | idx_features_labels = np.genfromtxt(fea_dir, dtype=np.dtype(str), delimiter=' ', invalid_raise=True) 511 | features = idx_features_labels[:, 2:] 512 | features = normalize(features) 513 | features = sp.csr_matrix(features, dtype=np.float32) 514 | 515 | total_num = features.shape[0] 516 | label_data = idx_features_labels[:number,1] 517 | label = torch.tensor(encode_onehot(label_data)) 518 | 519 | # build graph 520 | idx = np.array([i for i in range(features.shape[0])]) 521 | idx_map = {j: i for i, j in enumerate(idx)} 522 | 523 | features = {'feature': torch.FloatTensor(np.array(features.todense())).to(args.device)} 524 | 525 | elif args.dataset == 'aminer': 526 | 527 | fea_dir = args.feature_path 528 | 529 | author_fea_dir = args.author_feature_path 530 | 531 | relation_dir = args.relation_path 532 | idx_features_labels = np.genfromtxt(fea_dir, 533 | dtype=np.dtype(str), delimiter=',', invalid_raise=True, skip_header=True) 534 | features = torch.from_numpy(np.genfromtxt(finetuned_feature_dir, delimiter=' ')).float() 535 | features = sp.csr_matrix(features, dtype=np.float32) 536 | features = normalize(features) 537 | features = torch.FloatTensor(np.array(features.todense())) 538 | features = {'features': features.to(args.device)} 539 | 540 | paper_features = idx_features_labels[:, 2:] 541 | author_features = np.genfromtxt(author_fea_dir, dtype=np.dtype(str), delimiter=',', invalid_raise=True, 542 | skip_header=True)[:, 2:] 543 | number = 18089 544 | 545 | # paper_features = sp.csr_matrix(paper_features, dtype=np.float32) 546 | # author_features = sp.csr_matrix(author_features, dtype=np.float32) 547 | # paper_features = normalize(paper_features) 548 | # paper_features = torch.FloatTensor(np.array(paper_features.todense())) 549 | # author_features = torch.FloatTensor(np.array(author_features.todense())) 550 | # features = {'paper_feature': paper_features.to(args.device), 'author_features': author_features.to(args.device)} 551 | label_data = idx_features_labels[:number, 1].astype('float') 552 | label = encode_onehot(label_data) 553 | 554 | 555 | 556 | # build graph 557 | # idx = np.array([i for i in 
range(paper_features.shape[0] + author_features.shape[0])]) 558 | # total_num = paper_features.shape[0] + author_features.shape[0] 559 | 560 | idx = np.array([i for i in range(paper_features.shape[0] + author_features.shape[0])]) 561 | total_num = paper_features.shape[0] + author_features.shape[0] 562 | idx_map = {j: i for i, j in enumerate(idx)} 563 | 564 | 565 | edges_unordered = np.genfromtxt(relation_dir, 566 | dtype=np.int32)[:, :-1] 567 | 568 | edges = np.array(list(map(idx_map.get, edges_unordered.flatten())), 569 | dtype=np.int32).reshape(edges_unordered.shape) 570 | 571 | adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])), 572 | shape=(total_num, total_num), 573 | dtype=np.float32) 574 | 575 | # build symmetric adjacency matrix 576 | adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj) 577 | 578 | # features = normalize(features) 579 | adj = normalize(adj + sp.eye(adj.shape[0])) 580 | 581 | if args.dataset == 'yelp': 582 | # idx_train, idx_val, idx_test, c_num_mat = split_genuine(torch.tensor(label_data),train_ratio=7,val_ratio=1,test_ratio=2) 583 | idx_train, idx_val, idx_test, c_num_mat = split_imbalance(torch.tensor(label_data), train_ratio=4, val_ratio=2, 584 | test_ratio=2, imbalance_ratio=args.imbalance_ratio) 585 | else: 586 | idx_train, idx_val, idx_test, c_num_mat = split_imbalance(torch.tensor(label_data), train_ratio=7, val_ratio=1, 587 | test_ratio=2, imbalance_ratio=args.imbalance_ratio) 588 | 589 | idx_train = torch.LongTensor(idx_train) 590 | idx_val = torch.LongTensor(idx_val) 591 | idx_test = torch.LongTensor(idx_test) 592 | 593 | labels = torch.LongTensor(np.where(label)[1]) 594 | adj = sparse_mx_to_torch_sparse_tensor(adj) 595 | 596 | return adj, features, labels, idx_train, idx_val, idx_test 597 | 598 | def load_data_for_finetune_balance(finetuned_feature_dir): 599 | 600 | if args.dataset == 'instagram': 601 | embed_dir = args.feature_path 602 | relation_dir = args.relation_path 603 | idx_features_labels = np.genfromtxt(embed_dir, 604 | dtype=np.dtype(str), delimiter=' ', invalid_raise=True) 605 | features = sp.csr_matrix(idx_features_labels[:, 2:], dtype=np.float32) 606 | number = 8225 607 | label = encode_onehot(idx_features_labels[:number, 1]) 608 | 609 | # build graph 610 | idx = np.array(idx_features_labels[:, 0], dtype=np.float) 611 | idx_map = {j: i for i, j in enumerate(idx)} 612 | 613 | elif args.dataset == 'github': 614 | 615 | embed_dir = args.feature_path 616 | 617 | relation_dir = args.relation_path 618 | 619 | idx_features_labels = np.genfromtxt(embed_dir, 620 | dtype=np.dtype(str), delimiter=',', invalid_raise=True) 621 | 622 | features = sp.csr_matrix(idx_features_labels[:, :], dtype=np.float32) 623 | 624 | label_data = idx_features_labels[:,0] 625 | 626 | label = encode_onehot(label_data) 627 | 628 | # build graph 629 | idx = np.array(list(range(idx_features_labels.shape[0])), dtype=np.int32) 630 | # idx = np.array(idx_features_labels[:, 0], dtype=np.float) 631 | idx_map = {j: i for i, j in enumerate(idx)} 632 | 633 | elif args.dataset == 'yelp': 634 | 635 | fea_dir = args.feature_path 636 | relation_dir = args.relation_path 637 | number = 67395 638 | 639 | idx_features_labels = np.genfromtxt(fea_dir, dtype=np.dtype(str), delimiter=' ', invalid_raise=True) 640 | features = idx_features_labels[:, 2:] 641 | features = normalize(features) 642 | features = sp.csr_matrix(features, dtype=np.float32) 643 | 644 | total_num = features.shape[0] 645 | label_data = idx_features_labels[:number,1] 646 | label = 
torch.tensor(encode_onehot(label_data)) 647 | 648 | # build graph 649 | idx = np.array([i for i in range(features.shape[0])]) 650 | idx_map = {j: i for i, j in enumerate(idx)} 651 | 652 | features = {'feature': torch.FloatTensor(np.array(features.todense())).to(args.device)} 653 | 654 | elif args.dataset == 'aminer': 655 | 656 | fea_dir = args.feature_path 657 | 658 | author_fea_dir = args.author_feature_path 659 | 660 | relation_dir = args.relation_path 661 | idx_features_labels = np.genfromtxt(fea_dir, 662 | dtype=np.dtype(str), delimiter=',', invalid_raise=True, skip_header=True) 663 | features = torch.from_numpy(np.genfromtxt(finetuned_feature_dir, delimiter=' ')).float() 664 | features = sp.csr_matrix(features, dtype=np.float32) 665 | features = normalize(features) 666 | features = torch.FloatTensor(np.array(features.todense())) 667 | features = {'features': features.to(args.device)} 668 | 669 | paper_features = idx_features_labels[:, 2:] 670 | author_features = np.genfromtxt(author_fea_dir, dtype=np.dtype(str), delimiter=',', invalid_raise=True, 671 | skip_header=True)[:, 2:] 672 | number = 18089 673 | 674 | # paper_features = sp.csr_matrix(paper_features, dtype=np.float32) 675 | # author_features = sp.csr_matrix(author_features, dtype=np.float32) 676 | # paper_features = normalize(paper_features) 677 | # paper_features = torch.FloatTensor(np.array(paper_features.todense())) 678 | # author_features = torch.FloatTensor(np.array(author_features.todense())) 679 | # features = {'paper_feature': paper_features.to(args.device), 'author_features': author_features.to(args.device)} 680 | label_data = idx_features_labels[:number, 1].astype('float') 681 | label = encode_onehot(label_data) 682 | 683 | 684 | 685 | # build graph 686 | # idx = np.array([i for i in range(paper_features.shape[0] + author_features.shape[0])]) 687 | # total_num = paper_features.shape[0] + author_features.shape[0] 688 | 689 | idx = np.array([i for i in range(paper_features.shape[0] + author_features.shape[0])]) 690 | total_num = paper_features.shape[0] + author_features.shape[0] 691 | idx_map = {j: i for i, j in enumerate(idx)} 692 | 693 | 694 | edges_unordered = np.genfromtxt(relation_dir, 695 | dtype=np.int32)[:, :-1] 696 | 697 | edges = np.array(list(map(idx_map.get, edges_unordered.flatten())), 698 | dtype=np.int32).reshape(edges_unordered.shape) 699 | 700 | adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])), 701 | shape=(total_num, total_num), 702 | dtype=np.float32) 703 | 704 | # build symmetric adjacency matrix 705 | adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj) 706 | 707 | # features = normalize(features) 708 | adj = normalize(adj + sp.eye(adj.shape[0])) 709 | 710 | if args.dataset == 'yelp': 711 | idx_train, idx_val, idx_test, c_num_mat = split_balance(torch.tensor(label_data), train_ratio=4, val_ratio=2, 712 | test_ratio=4) 713 | else: 714 | idx_train, idx_val, idx_test, c_num_mat = split_balance(torch.tensor(label_data), train_ratio=7, val_ratio=1, 715 | test_ratio=2) 716 | 717 | idx_train = torch.LongTensor(idx_train) 718 | idx_val = torch.LongTensor(idx_val) 719 | idx_test = torch.LongTensor(idx_test) 720 | 721 | labels = torch.LongTensor(np.where(label)[1]) 722 | adj = sparse_mx_to_torch_sparse_tensor(adj) 723 | 724 | return adj, features, labels, idx_train, idx_val, idx_test 725 | 726 | def sparse_mx_to_torch_sparse_tensor(sparse_mx): 727 | """Convert a scipy sparse matrix to a torch sparse tensor.""" 728 | sparse_mx = sparse_mx.tocoo().astype(np.float32) 729 | 
729 | indices = torch.from_numpy(
730 | np.vstack((sparse_mx.row, sparse_mx.col)).astype(np.int64))
731 | values = torch.from_numpy(sparse_mx.data)
732 | shape = torch.Size(sparse_mx.shape)
733 | return torch.sparse.FloatTensor(indices, values, shape)
734 | 
735 | def encode_onehot(labels):
736 | 
737 | classes = set(labels)
738 | classes_dict = {c: np.identity(len(classes))[i, :] for i, c in
739 | enumerate(classes)}
740 | labels_onehot = np.array(list(map(classes_dict.get, labels)),
741 | dtype=np.int32)
742 | 
743 | return labels_onehot
744 | 
745 | def normalize(mx):
746 | """Row-normalize sparse matrix"""
747 | rowsum = np.array(mx.sum(1))
748 | r_inv = np.power(rowsum+1e-6, -1).flatten()
749 | r_inv[np.isinf(r_inv)] = 0.
750 | r_mat_inv = sp.diags(r_inv)
751 | mx = r_mat_inv.dot(mx)
752 | return mx
753 | 
--------------------------------------------------------------------------------
/utils/params.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # encoding: utf-8
3 | 
4 | 
5 | import argparse
6 | import torch
7 | import sys
8 | 
9 | # argv = sys.argv
10 | # # dataset = argv[1]
11 | # # dataset = 'acm'
12 | # # dataset = 'aminer'
13 | 
14 | 
15 | def aminer_params():
16 | parser = argparse.ArgumentParser()
17 | 
18 | parser.add_argument('--node_entity_matching_path', type=str, default=r".\data\aminer_data\entityid_text_label.txt",help='path about the matching of node id and text id')
19 | parser.add_argument('--author_feature_path', type=str,default=r'.\data\aminer_data\id_author_feature.txt',help='path of aminer feature')
20 | parser.add_argument('--feature_path', type=str,default=r'.\data\aminer_data\keyword_feature.txt',help='path of aminer feature')
21 | parser.add_argument('--relation_path', type=str,default=r'.\data\aminer_data\relation.txt', help='path of aminer relationships')
22 | parser.add_argument('--pos_path', type=str, default=r'.\data\aminer_data\pos_index_5.txt',help='path of aminer pos label')
23 | parser.add_argument('--id_content_path', type=str, default=r'.\data\aminer_data\id_label_content.csv',help='to get the finetune features')
24 | 
25 | parser.add_argument('--no-cuda', action='store_true', default=False,
26 | help='Disables CUDA training.')
27 | parser.add_argument('--fastmode', action='store_true', default=False,
28 | help='Validate during training pass.')
29 | parser.add_argument('--seed', type=int, default=50, help='Random seed.')
30 | parser.add_argument('--pretrain_epochs', type=int, default=200,
31 | help='Number of epochs to train.')
32 | parser.add_argument('--ft_epochs', type=int, default=200,
33 | help='Number of epochs to train during fine-tuning.')
34 | parser.add_argument('--lr', type=float, default=0.001,
35 | help='Initial learning rate.')
36 | parser.add_argument('--weight_decay', type=float, default=0.001,
37 | help='Weight decay (L2 loss on parameters).')
38 | parser.add_argument('--hidden', type=int, default=128,
39 | help='Number of hidden units.')
40 | parser.add_argument('--dropout', type=float, default=0.1,
41 | help='Dropout rate (1 - keep probability).')
42 | parser.add_argument('--node_dropout', type=float, default=0.1,
43 | help='Dropout rate (1 - keep probability).')
44 | parser.add_argument('--prune', default=True, help='network pruning for model pre-training')
45 | parser.add_argument('--prune_ratio', type=float, default=0.2, help='network pruning for model fine-tuning')
46 | 
47 | parser.add_argument('--number_samples', type=int, default=1000,
48 | help='number of samples in co-modality pre-training')
49 | parser.add_argument('--device', type=str, default=torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
50 | help='use GPU')
51 | parser.add_argument('--finetune_device', type=str, default=torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
52 | help='use GPU')
53 | parser.add_argument('--image_encoder_model', type=str, default='swin_small_patch4_window7_224',
54 | help='e.g., resnet50; please refer to the timm library for more models')
55 | parser.add_argument('--text_encoder_model', type=str, default='distilbert-base-uncased',
56 | help='[bert-base-uncased,distilbert-base-uncased]')
57 | parser.add_argument('--text_encoder_tokenizer', type=str, default='distilbert-base-uncased',
58 | help='[bert-base-uncased,distilbert-base-uncased]')
59 | parser.add_argument('--max_length', type=int, default=70, help='the maximum length of text')
60 | 
61 | parser.add_argument('--node_encoder_model', type=str, default='gcn', help='[gcn,gat,sage,gin]')
62 | 
63 | parser.add_argument('--nheads', type=int, default=8, help='gat')
64 | parser.add_argument('--alpha', type=float, default=0.2, help='gat')
65 | parser.add_argument('--image_size', type=int, default=224, help='the size of image')
66 | parser.add_argument('--imbalance_setting', type=str, default='focal', help='[reweight,upsample,focal]')
67 | parser.add_argument('--imbalance_up_scale', type=float, default=10.0, help='the scale for upsampling')
68 | 
69 | parser.add_argument('--imbalance_ratio', type=float, default=0,
70 | help='[0.01, 0.1, 0, 1]; 0 keeps the original imbalance ratio, 1 uses balanced data')
71 | parser.add_argument('--node_feature_dim', type=int, default=768, help='the dimension of aminer feature')
72 | parser.add_argument('--node_feature_project_dim', type=int, default=200, help='the dimension of projected feature')
73 | parser.add_argument('--node_embedding_dim', type=int, default=200, help='the dimension of node embedding')
74 | parser.add_argument('--hidden_dim', type=int, default=200)
75 | parser.add_argument('--out_dim', type=int, default=200)
76 | parser.add_argument('--nclass', type=int, default=5, help='the number of classes')
77 | 
78 | parser.add_argument('--num_projection_layers', type=int, default=1,
79 | help='the number of projection layers for text/image and node')
80 | parser.add_argument('--feat_projection_dim', type=int, default=200,
81 | help='the dimension of projected feature for nodes')
82 | parser.add_argument('--projection_dim', type=int, default=256,
83 | help='the dimension of projected embedding of text/image and node')
84 | parser.add_argument('--pos', default=True)
85 | parser.add_argument('--debug', default=False)
86 | parser.add_argument('--batch_size', type=int, default=100, help='the size for each batch for co-modality pretraining')
87 | parser.add_argument('--num_workers', type=int, default=1, help='the number of workers')
88 | parser.add_argument('--patience', type=int, default=30, help='the number of epochs to wait when the metric is not improving')
89 | parser.add_argument('--factor', type=float, default=0.5, help='the factor to change the learning rate')
90 | parser.add_argument('--temperature', type=float, default=0.1, help='the temperature for the contrastive loss')
91 | parser.add_argument('--pretrained', default=True, help="for text/image encoder")
92 | parser.add_argument('--trainable', default=False, help="for text/image encoder")
93 | parser.add_argument('--image_embedding_dim', type=int, default=768, help='the dimension of image embedding')
94 | parser.add_argument('--text_embedding_dim', type=int, default=768, help='the dimension of text embedding')
95 | parser.add_argument('--finetune_embedding_dim', type=int, default=768, help='the dimension of the finetuned feature')
96 | parser.add_argument('--focal_alpha', type=float, default=0.75, help='focal loss')
97 | parser.add_argument('--focal_gamma', type=float, default=1.0, help='focal loss')
98 | parser.add_argument('--finetune', default=True, help='finetune the model')
99 | 
100 | args, _ = parser.parse_known_args()
101 | 
102 | return args
103 | 
104 | def yelp_params():
105 | parser = argparse.ArgumentParser()
106 | 
107 | parser.add_argument('--node_entity_matching_path', type=str, default=r'',help='path about the matching of node id and text id')
108 | parser.add_argument('--feature_path', type=str, default=r'', help='path of yelp text feature')
109 | parser.add_argument('--embed_path', type=str, default=r'',
110 | help='path of yelp feature')
111 | parser.add_argument('--relation_path', type=str, default=r'',help='path of yelp relationships')
112 | parser.add_argument('--pos_path', type=str, default=r'', help='path of yelp pos label')
113 | parser.add_argument('--yelp_concate',default=False, help='whether to concatenate the yelp feature and embedding')
114 | parser.add_argument('--id_content_path', type=str, default=r'',help='to get the finetune features')
115 | 
116 | parser.add_argument('--no-cuda', action='store_true', default=False,
117 | help='Disables CUDA training.')
118 | parser.add_argument('--fastmode', action='store_true', default=False,
119 | help='Validate during training pass.')
120 | parser.add_argument('--seed', type=int, default=50, help='Random seed.')
121 | parser.add_argument('--pretrain_epochs', type=int, default=30,
122 | help='Number of epochs to train.')
123 | parser.add_argument('--ft_epochs', type=int, default=60,
124 | help='Number of epochs to train during fine-tuning.')
125 | parser.add_argument('--lr', type=float, default=0.01,
126 | help='Initial learning rate.')
127 | parser.add_argument('--weight_decay', type=float, default=0.001,
128 | help='Weight decay (L2 loss on parameters).')
129 | parser.add_argument('--hidden', type=int, default=128,
130 | help='Number of hidden units.')
131 | parser.add_argument('--dropout', type=float, default=0.5,
132 | help='Dropout rate (1 - keep probability).')
133 | parser.add_argument('--node_dropout', type=float, default=0.5,
134 | help='Dropout rate (1 - keep probability).')
135 | 
136 | parser.add_argument('--prune', default=True, help='network pruning for model pre-training')
137 | parser.add_argument('--prune_ratio', type=float, default=0.5, help='network pruning for model fine-tuning')
138 | 
139 | parser.add_argument('--number_samples', type=int, default=1000,
140 | help='number of samples in co-modality pre-training')
141 | parser.add_argument('--device', type=str, default=torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
142 | help='use GPU')
143 | parser.add_argument('--finetune_device', type=str, default=torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
144 | help='use GPU')
145 | parser.add_argument('--image_encoder_model', type=str, default='swin_small_patch4_window7_224',
146 | help='e.g., resnet50; please refer to the timm library for more models')
147 | parser.add_argument('--text_encoder_model', type=str, default='distilbert-base-uncased',
148 | help='[bert-base-uncased,distilbert-base-uncased]')
149 | parser.add_argument('--text_encoder_tokenizer', type=str, default='distilbert-base-uncased',
150 | help='[bert-base-uncased,distilbert-base-uncased]')
151 | parser.add_argument('--max_length', type=int, default=200, help='the maximum length of text')
152 | parser.add_argument('--feat_projection_dim', type=int, default=200,
153 | help='the dimension of projected feature for nodes')
154 | parser.add_argument('--node_encoder_model', type=str, default='gcn', help='[gcn,gat,sage,gin]')
155 | parser.add_argument('--nheads', type=int, default=8, help='gat')
156 | parser.add_argument('--alpha', type=float, default=0.2, help='gat')
157 | parser.add_argument('--image_size', type=int, default=224, help='the size of image')
158 | parser.add_argument('--imbalance_setting', type=str, default='focal', help='[reweight,upsample,focal]')
159 | parser.add_argument('--imbalance_up_scale', type=float, default=10.0, help='the scale for upsampling')
160 | 
161 | parser.add_argument('--imbalance_ratio', type=float, default=0,
162 | help='[0.01, 0.1, 0, 1]; 0 keeps the original imbalance ratio, 1 uses balanced data')
163 | 
164 | parser.add_argument('--node_feature_dim', type=int, default=768, help='the dimension of yelp feature')
165 | parser.add_argument('--node_feature_project_dim', type=int, default=200, help='the dimension of projected feature')
166 | parser.add_argument('--node_embedding_dim', type=int, default=200, help='the dimension of node embedding')
167 | parser.add_argument('--hidden_dim', type=int, default=200)
168 | parser.add_argument('--out_dim', type=int, default=200)
169 | parser.add_argument('--nclass', type=int, default=2, help='the number of classes')
170 | 
171 | parser.add_argument('--num_projection_layers', type=int, default=1,
172 | help='the number of projection layers for text/image and node')
173 | parser.add_argument('--projection_dim', type=int, default=256,
174 | help='the dimension of projected embedding of text/image and node')
175 | 
176 | parser.add_argument('--pos', default=False)
177 | parser.add_argument('--debug', default=False)
178 | parser.add_argument('--batch_size', type=int, default=50,
179 | help='the size for each batch for co-modality pretraining')
180 | parser.add_argument('--num_workers', type=int, default=1, help='the number of workers')
181 | parser.add_argument('--patience', type=int, default=1, help='the number of epochs to wait when the metric is not improving')
182 | parser.add_argument('--factor', type=float, default=0.5, help='the factor to change the learning rate')
183 | parser.add_argument('--temperature', type=float, default=1.0, help='the temperature for the contrastive loss')
184 | 
185 | parser.add_argument('--pretrained', default=True,help = "for text/image encoder")
186 | parser.add_argument('--trainable', default=False, help = "for text/image encoder")
187 | 
188 | parser.add_argument('--image_embedding_dim', type=int, default=768, help='the dimension of image embedding')
189 | parser.add_argument('--text_embedding_dim', type=int, default=768, help='the dimension of text embedding')
190 | parser.add_argument('--finetune_embedding_dim', type=int, default=768, help='the dimension of the finetuned feature')
191 | parser.add_argument('--focal_alpha', type=float, default=0.5, help='focal loss')
192 | parser.add_argument('--focal_gamma', type=float, default=0.6, help='focal loss')
193 | parser.add_argument('--finetune', default=True, help='finetune the model')
194 | 
195 | 
196 | args, _ = parser.parse_known_args()
197 | 
198 | return args
199 | 
200 | def github_params():
201 | parser = argparse.ArgumentParser()
202 | 
203 | parser.add_argument('--node_entity_matching_path', type=str,default=r'',help='path about the matching of node id and text id')
204 | parser.add_argument('--feature_path', type=str, default=r'',
205 | help='path of github feature')
206 | parser.add_argument('--relation_path', type=str, default=r'',
207 | help='path of github relationships')
208 | parser.add_argument('--pos_path', type=str, default=r'',
209 | help='path of github pos label')
210 | parser.add_argument('--label_path', type=str, default=r'',
211 | help='path of label')
212 | 
213 | parser.add_argument('--no-cuda', action='store_true', default=False,
214 | help='Disables CUDA training.')
215 | parser.add_argument('--fastmode', action='store_true', default=False,
216 | help='Validate during training pass.')
217 | parser.add_argument('--seed', type=int, default=50, help='Random seed.')
218 | parser.add_argument('--pretrain_epochs', type=int, default=60,
219 | help='Number of epochs to train.')
220 | parser.add_argument('--ft_epochs', type=int, default=60,
221 | help='Number of epochs to train during fine-tuning.')
222 | parser.add_argument('--lr', type=float, default=0.01,
223 | help='Initial learning rate.')
224 | parser.add_argument('--weight_decay', type=float, default=0.001,
225 | help='Weight decay (L2 loss on parameters).')
226 | parser.add_argument('--hidden', type=int, default=128,
227 | help='Number of hidden units.')
228 | parser.add_argument('--dropout', type=float, default=0.5,
229 | help='Dropout rate (1 - keep probability).')
230 | parser.add_argument('--node_dropout', type=float, default=0.5,
231 | help='Dropout rate (1 - keep probability).')
232 | 
233 | parser.add_argument('--prune', default=True, help='network pruning for model pre-training')
234 | parser.add_argument('--prune_ratio', type=float, default=0.5, help='network pruning for model fine-tuning')
235 | 
236 | parser.add_argument('--number_samples', type=int, default=1000,
237 | help='number of samples in co-modality pre-training')
238 | parser.add_argument('--device', type=str, default=torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
239 | help='use GPU')
240 | parser.add_argument('--finetune_device', type=str, default=torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
241 | help='use GPU')
242 | parser.add_argument('--id_content_path', type=str,
243 | default=r'',
244 | help='to get the finetune features')
245 | parser.add_argument('--image_encoder_model', type=str, default='swin_small_patch4_window7_224',
246 | help='e.g., resnet50; please refer to the timm library for more models')
247 | parser.add_argument('--text_encoder_model', type=str, default='distilbert-base-uncased',
248 | help='[bert-base-uncased,distilbert-base-uncased]')
249 | parser.add_argument('--text_encoder_tokenizer', type=str, default='distilbert-base-uncased',
250 | help='[bert-base-uncased,distilbert-base-uncased]')
251 | parser.add_argument('--max_length', type=int, default=200, help='the maximum length of text')
252 | parser.add_argument('--feat_projection_dim', type=int, default=200,
253 | help='the dimension of projected feature for nodes')
254 | parser.add_argument('--node_encoder_model', type=str, default='gcn', help='[gcn,gat,sage,gin]')
255 | parser.add_argument('--nheads', type=int, default=8, help='gat')
256 | parser.add_argument('--alpha', type=float, default=0.2, help='gat')
257 | parser.add_argument('--image_size', type=int, default=224, help='the size of image')
258 | parser.add_argument('--imbalance_setting', type=str, default='focal', help='[reweight,upsample,focal]')
259 | parser.add_argument('--imbalance_up_scale', type=float, default=10.0, help='the scale for upsampling')
260 | 
261 | parser.add_argument('--imbalance_ratio', type=float, default=0,
262 | help='[0.01, 0.1, 0, 1]; 0 keeps the original imbalance ratio, 1 uses balanced data')
263 | 
264 | parser.add_argument('--node_feature_dim', type=int, default=128, help='the dimension of github feature')
265 | parser.add_argument('--node_feature_project_dim', type=int, default=200, help='the dimension of projected feature')
266 | parser.add_argument('--node_embedding_dim', type=int, default=200, help='the dimension of node embedding')
267 | parser.add_argument('--hidden_dim', type=int, default=200)
268 | parser.add_argument('--out_dim', type=int, default=200)
269 | parser.add_argument('--nclass', type=int, default=2, help='the number of classes')
270 | 
271 | parser.add_argument('--num_projection_layers', type=int, default=1,
272 | help='the number of projection layers for text/image and node')
273 | parser.add_argument('--projection_dim', type=int, default=256,
274 | help='the dimension of projected embedding of text/image and node')
275 | 
276 | parser.add_argument('--pos', default=False)
277 | parser.add_argument('--debug', default=False)
278 | parser.add_argument('--batch_size', type=int, default=80,
279 | help='the size for each batch for co-modality pretraining')
280 | parser.add_argument('--num_workers', type=int, default=1, help='the number of workers')
281 | parser.add_argument('--patience', type=int, default=5, help='the number of epochs to wait when the metric is not improving')
282 | parser.add_argument('--factor', type=float, default=0.5, help='the factor to change the learning rate')
283 | parser.add_argument('--temperature', type=float, default=1.0, help='the temperature for the contrastive loss')
284 | 
285 | parser.add_argument('--pretrained', default=True,help = "for text/image encoder")
286 | parser.add_argument('--trainable', default=True, help = "for text/image encoder")
287 | parser.add_argument('--image_embedding_dim', type=int, default=768, help='the dimension of image embedding')
288 | parser.add_argument('--text_embedding_dim', type=int, default=768, help='the dimension of text embedding')
289 | parser.add_argument('--finetune_embedding_dim', type=int, default=768, help='the dimension of the finetuned feature')
290 | parser.add_argument('--focal_alpha', type=float, default=0.25, help='focal loss')
291 | parser.add_argument('--focal_gamma', type=float, default=2.0, help='focal loss')
292 | parser.add_argument('--finetune', default=True, help='finetune the model')
293 | 
294 | args, _ = parser.parse_known_args()
295 | 
296 | return args
297 | 
298 | def instagram_params():
299 | parser = argparse.ArgumentParser()
300 | 
301 | parser.add_argument('--node_entity_matching_path', type=str,
302 | default=r'',
303 | help='path about the matching of node id and picture id')
304 | parser.add_argument('--feature_path', type=str, default=r'',
305 | help='path of instagram feature')
306 | parser.add_argument('--relation_path', type=str,
307 | default=r'',
308 | help='path of instagram relationships')
309 | parser.add_argument('--pos_path', type=str, default=r'',
310 | help='path of instagram pos label')
311 | parser.add_argument('--id_content_path', type=str, default=r'',
312 | help='to get the finetune features')
313 | 
314 | parser.add_argument('--no-cuda', action='store_true', default=False,
315 | help='Disables CUDA training.')
316 | parser.add_argument('--fastmode', action='store_true', default=False,
317 | help='Validate during training pass.')
318 | parser.add_argument('--seed', type=int, default=50, help='Random seed.')
319 | parser.add_argument('--pretrain_epochs', type=int, default=1,
320 | help='Number of epochs to train.') # 60
321 | parser.add_argument('--ft_epochs', type=int, default=60,
322 | help='Number of epochs to train during fine-tuning.')
323 | parser.add_argument('--lr', type=float, default=0.01,
324 | help='Initial learning rate.')
325 | parser.add_argument('--weight_decay', type=float, default=0.001,
326 | help='Weight decay (L2 loss on parameters).')
327 | parser.add_argument('--hidden', type=int, default=128,
328 | help='Number of hidden units.')
329 | parser.add_argument('--dropout', type=float, default=0.5,
330 | help='Dropout rate (1 - keep probability).')
331 | parser.add_argument('--node_dropout', type=float, default=0.5,
332 | help='Dropout rate (1 - keep probability).')
333 | 
334 | parser.add_argument('--prune', default=True, help='network pruning for model pre-training')
335 | parser.add_argument('--prune_ratio', type=float, default=0.5, help='network pruning for model fine-tuning')
336 | 
337 | parser.add_argument('--number_samples', type=int, default=1000,
338 | help='number of samples in co-modality pre-training')
339 | parser.add_argument('--device', type=str, default=torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
340 | help='use GPU')
341 | parser.add_argument('--finetune_device', type=str, default=torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
342 | help='use GPU')
343 | parser.add_argument('--image_encoder_model', type=str, default='swin_small_patch4_window7_224',
344 | help='e.g., resnet50; please refer to the timm library for more models')
345 | parser.add_argument('--text_encoder_model', type=str, default='distilbert-base-uncased',
346 | help='[bert-base-uncased,distilbert-base-uncased]')
347 | parser.add_argument('--text_encoder_tokenizer', type=str, default='distilbert-base-uncased',
348 | help='[bert-base-uncased,distilbert-base-uncased]')
349 | parser.add_argument('--max_length', type=int, default=200, help='the maximum length of text')
350 | parser.add_argument('--feat_projection_dim', type=int, default=200,
351 | help='the dimension of projected feature for nodes')
352 | 
353 | parser.add_argument('--node_encoder_model', type=str, default='gcn', help='[gcn,gat,sage,gin]')
354 | parser.add_argument('--nheads', type=int, default=8, help='gat')
355 | parser.add_argument('--alpha', type=float, default=0.2, help='gat')
356 | 
357 | parser.add_argument('--imbalance_setting', type=str, default='focal', help='[reweight,upsample,focal]')
358 | parser.add_argument('--imbalance_up_scale', type=float, default=10.0, help='the scale for upsampling')
359 | 
360 | parser.add_argument('--imbalance_ratio', type=float, default=0,
361 | help='[0.01, 0.1, 0, 1]; 0 keeps the original imbalance ratio, 1 uses balanced data')
362 | 
363 | parser.add_argument('--node_feature_dim', type=int, default=768, help='the dimension of instagram feature')
364 | parser.add_argument('--node_feature_project_dim', type=int, default=200, help='the dimension of projected feature')
365 | parser.add_argument('--node_embedding_dim', type=int, default=200, help='the dimension of node embedding')
366 | parser.add_argument('--hidden_dim', type=int, default=200)
367 | parser.add_argument('--out_dim', type=int, default=200)
368 | parser.add_argument('--nclass', type=int, default=2, help='the number of classes')
369 | 
370 | parser.add_argument('--image_size', type=int, default=224, help='the size of image')
371 | parser.add_argument('--image_embedding_dim', type=int, default=768, help='the dimension of image embedding')
372 | parser.add_argument('--text_embedding_dim', type=int, default=768, help='the dimension of text embedding')
373 | parser.add_argument('--finetune_embedding_dim', type=int, default=768, help='the dimension of the finetuned feature')
374 | 
375 | parser.add_argument('--num_projection_layers', type=int, default=1,
376 | help='the number of projection layers for text/image and node')
377 | parser.add_argument('--projection_dim', type=int, default=256,
378 | help='the dimension of projected embedding of text/image and node')
379 | 
380 | parser.add_argument('--pos', default=False)
381 | parser.add_argument('--debug', default=False)
382 | parser.add_argument('--batch_size', type=int, default=30,
383 | help='the size for each batch for co-modality pretraining')
384 | parser.add_argument('--num_workers', type=int, default=2, help='the number of workers')
385 | parser.add_argument('--patience', type=int, default=5, help='the number of epochs to wait when the metric is not improving')
386 | parser.add_argument('--factor', type=float, default=0.5, help='the factor to change the learning rate')
387 | parser.add_argument('--temperature', type=float, default=1.0, help='the temperature for the contrastive loss')
388 | 
389 | parser.add_argument('--pretrained', default=True,help = "for text/image encoder")
390 | parser.add_argument('--trainable', default=True, help = "for text/image encoder")
391 | 
392 | parser.add_argument('--focal_alpha', type=float, default=0.25, help='focal loss')
393 | parser.add_argument('--focal_gamma', type=float, default=1.0, help='focal loss')
394 | parser.add_argument('--finetune', default=True, help='finetune the model')
395 | 
396 | args, _ = parser.parse_known_args()
397 | 
398 | return args
399 | 
400 | 
401 | def set_params(dataset):
402 | if dataset == "aminer":
403 | args = aminer_params()
404 | elif dataset == "yelp":
405 | args = yelp_params()
406 | elif dataset == "github":
407 | args = github_params()
408 | elif dataset == "instagram":
409 | args = instagram_params()
410 | 
411 | args.dataset = dataset
412 | 
413 | return args
414 | 
415 | 
416 | dataset = 'aminer'
417 | # dataset = 'yelp'
418 | # dataset = 'github'
419 | # dataset = 'instagram'
420 | args = set_params(dataset)
421 | 
--------------------------------------------------------------------------------
/utils/util.py:
--------------------------------------------------------------------------------
1 | import os
2 | import random
3 | import numpy as np
4 | import torch
5 | import torch.nn.functional as F
6 | from sklearn.metrics import roc_auc_score, f1_score
7 | 
8 | 
9 | class AvgMeter:
10 | def __init__(self, name="Metric"):
11 | self.name = name
12 | self.reset()
13 | 
14 | def reset(self):
15 | self.avg, self.sum, self.count = [0] * 3
16 | 
17 | def update(self, val, count=1):
18 | self.count += count
19 | self.sum += val * count
20 | self.avg = self.sum / self.count
21 | 
22 | def __repr__(self):
23 | text = f"{self.name}: {self.avg:.4f}"
24 | return text
25 | 
26 | 
27 | def get_lr(optimizer):
28 | for param_group in optimizer.param_groups:
29 | return param_group["lr"]
30 | 
31 | 
32 | def seed_torch(seed=1029):
33 | random.seed(seed)
34 | os.environ['PYTHONHASHSEED'] = str(seed)
35 | np.random.seed(seed)
36 | torch.manual_seed(seed)
37 | torch.cuda.manual_seed(seed)
38 | torch.cuda.manual_seed_all(seed) # if you are using multi-GPU.
39 | torch.backends.cudnn.benchmark = False
40 | torch.backends.cudnn.deterministic = True
41 | 
42 | 
43 | 
44 | def evaluate_performance(labels, output):
45 | f1_micro = f1_score(labels.cpu().detach().numpy(),
46 | output.max(1)[1].cpu().detach().numpy(), average='micro')
47 | f1_macro = f1_score(labels.cpu().detach().numpy(),
48 | output.max(1)[1].cpu().detach().numpy(), average='macro')
49 | 
50 | if labels.max() > 1:
51 | auc = roc_auc_score(labels.detach().cpu(),
52 | F.softmax(output, dim=-1).detach().cpu(), average='macro',
53 | multi_class='ovr')
54 | else:
55 | 
56 | auc = roc_auc_score(labels.detach().cpu(),
57 | F.softmax(output, dim=-1)[:, 1].detach().cpu(), average='macro')
58 | 
59 | # auc = roc_auc_score(labels.detach().cpu(),
60 | # torch.nan_to_num(F.softmax(output, dim=-1)[:, 1], 1e-5).detach().cpu(), average='macro')
61 | 
62 | return f1_micro, f1_macro, auc
63 | 
--------------------------------------------------------------------------------
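Usage note (not a file in the repository): the sketch below is a minimal, hypothetical example of how the utilities above fit together. `set_params` builds the per-dataset argument namespace, `seed_torch` fixes the random seeds, and `evaluate_performance` computes micro-F1, macro-F1 and AUC from logits. The random tensors are stand-ins for a real model's output; the repository's own training scripts may wire these pieces together differently.

```python
# Hypothetical usage sketch: run from the repository root so `utils` is importable;
# requires torch and scikit-learn to be installed.
import torch

from utils.params import set_params
from utils.util import seed_torch, evaluate_performance

args = set_params('aminer')   # argparse namespace with the AMiner defaults
seed_torch(args.seed)         # fix Python/NumPy/PyTorch seeds for reproducibility

# Stand-in for real model output: random logits over args.nclass classes.
num_nodes = 100
logits = torch.randn(num_nodes, args.nclass)
labels = torch.randint(0, args.nclass, (num_nodes,))

f1_micro, f1_macro, auc = evaluate_performance(labels, logits)
print(f"micro-F1 {f1_micro:.4f} | macro-F1 {f1_macro:.4f} | AUC {auc:.4f}")
```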