├── .gitignore
├── README.md
├── data
│   └── hotel_comment
│       ├── build_data.py
│       ├── build_embedings.py
│       ├── build_vocab.py
│       ├── eval.labels.txt
│       ├── eval.words.txt
│       ├── raw_data
│       │   ├── corpus.zip
│       │   └── fix_coupus.py
│       ├── train.labels.txt
│       ├── train.words.txt
│       ├── vocab.labels.txt
│       └── vocab.words.txt
├── model
│   ├── cnn
│   │   ├── debug.py
│   │   ├── export.py
│   │   ├── main.py
│   │   ├── saved_model
│   │   │   └── 1557069338
│   │   │       ├── assets
│   │   │       │   ├── vocab.labels.txt
│   │   │       │   └── vocab.words.txt
│   │   │       ├── saved_model.pb
│   │   │       └── variables
│   │   │           ├── variables.data-00000-of-00001
│   │   │           └── variables.index
│   │   └── serve.py
│   ├── lstm
│   │   ├── debug.py
│   │   ├── export.py
│   │   ├── main.py
│   │   ├── saved_model
│   │   │   └── 1557073579
│   │   │       ├── assets
│   │   │       │   ├── vocab.labels.txt
│   │   │       │   └── vocab.words.txt
│   │   │       ├── saved_model.pb
│   │   │       └── variables
│   │   │           ├── variables.data-00000-of-00001
│   │   │           └── variables.index
│   │   └── serve.py
│   └── score_report.py
└── pic
    ├── 1_GRQ91HNASB7MAJPTTlVvfw.jpeg
    ├── clip.png
    └── 截图_选择区域_20211202181126.png

/.gitignore:
--------------------------------------------------------------------------------
*.pyc
.vscode/
.ipynb_checkpoints/
sgns.*
*.npz
neg/
pos/
fix_pos/
fix_neg/
old_impl/sentiment_checkpoint.keras
old_impl/words.txt
/**/__pycache__/
.idea/
*.log
/**/results
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Chinese Sentiment Analysis

Chinese sentiment analysis is essentially a text classification problem. This project solves the classification task with two models, a **CNN** and a **BI-LSTM**, and applies them to sentiment analysis with good results.
Both models are trained on a small dataset; on the validation set their precision, recall, and F1 score are all close to **90%**.

The project is designed to handle a variety of classification tasks on different corpora: once a corpus is prepared in the expected format, you can tune hyperparameters, train, export, and serve.

### Code environment
Works with Python 3.6 and TensorFlow 1.13.

Other environments may work, but have not been tested.

The `scikit-learn` package is also required to compute metrics such as precision, recall, and F1 score.

### Preparing the corpus
The corpus is *Tan Songbo's (谭松波) review corpus*, with 2,000 positive and 2,000 negative examples. It is a fairly small dataset. The raw corpus is included in this project at `data/hotel_comment/raw_data/corpus.zip`.

Unzip `corpus.zip`, then run the following in the `raw_data` directory:
```sh
python fix_coupus.py
```
This converts the original `gb2312`-encoded files into `utf-8`-encoded files.

### Preparing the word vectors
This experiment uses the open-source word vectors [*chinese-word-vectors*](https://github.com/Embedding/Chinese-Word-Vectors).

The vectors trained on the Zhihu corpus are used. They can be downloaded from https://pan.baidu.com/s/1OQ6fQLCgqT43WTwh5fh_lg (Baidu Cloud is required). Unzip the download and place it directly in the project root.

### Training data format
See the `data/hotel_comment/*.txt` files for reference.

- step1

The data is split into a training set and a validation set at a ratio of `4:1`: the 4,000 samples are divided into a training set of 3,200 samples and a validation set of 800 samples.

Both the training and validation sets follow this format.
In the `{}.words.txt` file, each line is the input for one sample: one review per line, segmented with `jieba`, with words separated by spaces.
```text
除了 地段 可以 , 其他 是 一塌糊涂 , 惨不忍睹 。 和 招待所 差不多 。
帮 同事 订 的 酒店 , 他 老兄 刚 从 东莞 回来 , 详细 地问 了 一下 他 对 粤海 酒店 的 印象 , 说 是 硬件 和 软件 : 极好 ! 所以 表扬 一下
```
In the `{}.labels.txt` file, each line is the label of one sample.
```text
NEG
POS
```
In this project, run `build_data.py` in the `data/hotel_comment` directory to produce files in this format.

- step2

Because this project uses `index_table_from_file` to map tokens to ids, two files are needed to represent the word vocabulary and the label set: `vocab.words.txt` and `vocab.labels.txt`. Each line holds one word or one label, respectively.

In this project, run `build_vocab.py` in the `data/hotel_comment` directory to produce these files.

- step3

Because the downloaded word vectors are very large, only the vectors for tokens that appear in the training corpus are extracted. They are stored in `data/hotel_comment/w2v.npz`.

In this project, run `build_embedings.py` in the `data/hotel_comment` directory to produce this file.

## Model 1: CNN
#### Architecture:
1. Chinese word embedding
2. Multiple fixed-width convolution kernels of different lengths
3. Max pooling layer, keeping only the single maximum value of each filter's output
4. Fully connected layer

![Screenshot](https://raw.githubusercontent.com/linguishi/chinese_sentiment/master/pic/%E6%88%AA%E5%9B%BE_%E9%80%89%E6%8B%A9%E5%8C%BA%E5%9F%9F_20211202181126.png)
The figure is from the paper https://arxiv.org/abs/1408.5882. Unlike the paper, whose multichannel variant feeds two sets of embeddings (one fine-tuned during training, one kept static) as two channels analogous to image channels, this project uses a single channel of pre-trained embeddings.

To train the CNN model, run the following in the `cnn` directory:
```
python main.py
```

#### CNN training time
About 2 minutes on a **GTX 1060 6G**.

#### CNN training results
Run the following in the `model` directory:

```
python score_report.py cnn/results/score/eval.preds.txt
```

Output:
```
              precision    recall  f1-score   support

         POS       0.91      0.87      0.89       400
         NEG       0.88      0.91      0.89       400

   micro avg       0.89      0.89      0.89       800
   macro avg       0.89      0.89      0.89       800
weighted avg       0.89      0.89      0.89       800

```

## Model 2: BI-LSTM
1. Chinese word embedding
2. Bi-LSTM
3. Fully connected layer

![Screenshot](https://raw.githubusercontent.com/linguishi/chinese_sentiment/master/pic/1_GRQ91HNASB7MAJPTTlVvfw.jpeg)


To train the BI-LSTM model, run the following in the `lstm` directory:
```
python main.py
```

#### BI-LSTM training time
About 5 minutes on a **GTX 1060 6G**.

#### BI-LSTM training results
Run the following in the `model` directory:

```
python score_report.py lstm/results/score/eval.preds.txt
```

Output:
```
              precision    recall  f1-score   support

         POS       0.90      0.87      0.88       400
         NEG       0.87      0.91      0.89       400

   micro avg       0.89      0.89      0.89       800
   macro avg       0.89      0.89      0.89       800
weighted avg       0.89      0.89      0.89       800

```

### Exporting and serving the model (BI-LSTM as an example)
#### Exporting the model
Run the following in the `lstm` directory:
```
python export.py
```
This exports the `estimator` inference graph, which can be used for prediction. A `saved_model` is already included in this project, so you can test directly without training.

Run `python serve.py` in the `model/lstm` directory to classify the sentiment of input text with the exported model. See the code for details.

Test results

![Screenshot](https://raw.githubusercontent.com/linguishi/chinese_sentiment/master/pic/clip.png)

Although the model was trained on real reviews of widely varying length (some exceed 1,000 tokens after segmentation), the figure above shows that it performs reasonably well on short reviews.

## References

[1] http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/

[2] https://arxiv.org/abs/1408.5882

--------------------------------------------------------------------------------
/data/hotel_comment/build_data.py:
--------------------------------------------------------------------------------
from pathlib import Path
import os
import jieba


def build_data_file(directory, samples_path, label, mode_str):
    """Append one segmented review and its label per sample file."""
    for sample_path in samples_path:
        # The fixed corpus is UTF-8, so read and write it explicitly as UTF-8.
        with Path('{}/{}'.format(directory, sample_path)).open(encoding='utf-8') as f:
            words = [' '.join(jieba.cut(line.strip(), cut_all=False, HMM=True)) for line in f if line.strip() != '']
        with Path('{}.words.txt'.format(mode_str)).open('a', encoding='utf-8') as g:
            g.write('{}\n'.format(' '.join(words)))
        with Path('{}.labels.txt'.format(mode_str)).open('a', encoding='utf-8') as h:
            h.write('{}\n'.format(label))


if __name__ == '__main__':
    pos_dir = Path('raw_data/fix_pos')
    neg_dir = Path('raw_data/fix_neg')
    pos_samples = os.listdir(pos_dir)
    neg_samples = os.listdir(neg_dir)
    num_pos = len(pos_samples)
    num_neg = len(neg_samples)
    # Keep the last fifth of each class for evaluation (4:1 split).
    build_data_file(pos_dir, pos_samples[0:(num_pos - num_pos // 5)], 'POS', 'train')
    build_data_file(pos_dir, pos_samples[(num_pos - num_pos // 5):], 'POS', 'eval')
    build_data_file(neg_dir, neg_samples[0:(num_neg - num_neg // 5)], 'NEG', 'train')
    build_data_file(neg_dir, neg_samples[(num_neg - num_neg // 5):], 'NEG', 'eval')
--------------------------------------------------------------------------------
/data/hotel_comment/build_embedings.py:
--------------------------------------------------------------------------------
from pathlib import Path
import numpy as np

if __name__ == '__main__':
    with Path('vocab.words.txt').open(encoding='utf-8') as f:
        word_to_idx = {line.strip(): idx for idx,
                       line in enumerate(f)}
    with Path('vocab.words.txt').open(encoding='utf-8') as f:
        word_to_found = {line.strip(): False for line in f}

    size_vocab = len(word_to_idx)

    # One 300-dimensional row per vocabulary word; words without a vector stay zero.
    embeddings = np.zeros((size_vocab, 300))

    found = 0
    print('Reading W2V file (may take a while)')
    with Path('../../sgns.zhihu.bigram').open(encoding='utf-8') as f:
        for line_idx, line in enumerate(f):
            if line_idx % 100000 == 0:
                print('- At line {}'.format(line_idx))
            line = line.strip().split()
            # Skip the header and any malformed line.
            if len(line) != 300 + 1:
                continue
            word = line[0]
            embedding = line[1:]
            if (word in word_to_idx) and (not word_to_found[word]):
                word_to_found[word] = True
                found += 1
                word_idx = word_to_idx[word]
                embeddings[word_idx] = embedding
    print('- done. Found {} vectors for {} words'.format(found, size_vocab))

    # Save the np.array
    np.savez_compressed('w2v.npz', embeddings=embeddings)
--------------------------------------------------------------------------------
/data/hotel_comment/build_vocab.py:
--------------------------------------------------------------------------------
from collections import Counter
from pathlib import Path

MIN_COUNT = 1

if __name__ == '__main__':
    def words(name):
        return '{}.words.txt'.format(name)


    print('Build vocab words')
    counter_words = Counter()
    for n in ['train', 'eval']:
        # Read as UTF-8 to match the encoding build_embedings.py expects.
        with Path(words(n)).open(encoding='utf-8') as f:
            for line in f:
                counter_words.update(line.strip().split())
    vocab_words = {w for w, c in counter_words.items() if c >= MIN_COUNT}

    with Path('vocab.words.txt').open('w', encoding='utf-8') as f:
        for w in sorted(list(vocab_words)):
            f.write('{}\n'.format(w))
    print('Done. Kept {} out of {}'.format(
        len(vocab_words), len(counter_words)))


    def labels(name):
        return '{}.labels.txt'.format(name)


    print('Build labels')
    doc_tags = set()
    with Path(labels('train')).open(encoding='utf-8') as f:
        for line in f:
            doc_tags.add(line.strip())

    with Path('vocab.labels.txt').open('w', encoding='utf-8') as f:
        for t in sorted(list(doc_tags)):
            f.write('{}\n'.format(t))
    print('- done. Found {} labels.'.format(len(doc_tags)))
--------------------------------------------------------------------------------
/data/hotel_comment/eval.labels.txt:
--------------------------------------------------------------------------------
1 | POS 2 | POS 3 | POS 4 | POS 5 | POS 6 | POS 7 | POS 8 | POS 9 | POS 10 | POS 11 | POS 12 | POS 13 | POS 14 | POS 15 | POS 16 | POS 17 | POS 18 | POS 19 | POS 20 | POS 21 | POS 22 | POS 23 | POS 24 | POS 25 | POS 26 | POS 27 | POS 28 | POS 29 | POS 30 | POS 31 | POS 32 | POS 33 | POS 34 | POS 35 | POS 36 | POS 37 | POS 38 | POS 39 | POS 40 | POS 41 | POS 42 | POS 43 | POS 44 | POS 45 | POS 46 | POS 47 | POS 48 | POS 49 | POS 50 | POS 51 | POS 52 | POS 53 | POS 54 | POS 55 | POS 56 | POS 57 | POS 58 | POS 59 | POS 60 | POS 61 | POS 62 | POS 63 | POS 64 | POS 65 | POS 66 | POS 67 | POS 68 | POS 69 | POS 70 | POS 71 | POS 72 | POS 73 | POS 74 | POS 75 | POS 76 | POS 77 | POS 78 | POS 79 | POS 80 | POS 81 | POS 82 | POS 83 | POS 84 | POS 85 | POS 86 | POS 87 | POS 88 | POS 89 | POS 90 | POS 91 | POS 92 | POS 93 | POS 94 | POS 95 | POS 96 | POS 97 | POS 98 | POS 99 | POS 100 | POS 101 | POS 102 | POS 103 | POS 104 | POS 105 | POS 106 | POS 107 | POS 108 | POS 109 | POS 110 | POS 111 | POS 112 | POS 113 | POS 114 | POS 115 | POS 116 | POS 117 | POS 118 | POS 119 | POS 120 | POS 121 | POS 122 | POS 123 | POS 124 | POS 125 | POS 126 | POS 127 | POS 128 | POS 129 | POS 130 | POS 131 | POS 132 | POS 133 | POS 134 | POS 135 | POS 136 | POS 137 | POS 138 | POS 139 | POS 140 | POS 141 | POS 142 |
POS 143 | POS 144 | POS 145 | POS 146 | POS 147 | POS 148 | POS 149 | POS 150 | POS 151 | POS 152 | POS 153 | POS 154 | POS 155 | POS 156 | POS 157 | POS 158 | POS 159 | POS 160 | POS 161 | POS 162 | POS 163 | POS 164 | POS 165 | POS 166 | POS 167 | POS 168 | POS 169 | POS 170 | POS 171 | POS 172 | POS 173 | POS 174 | POS 175 | POS 176 | POS 177 | POS 178 | POS 179 | POS 180 | POS 181 | POS 182 | POS 183 | POS 184 | POS 185 | POS 186 | POS 187 | POS 188 | POS 189 | POS 190 | POS 191 | POS 192 | POS 193 | POS 194 | POS 195 | POS 196 | POS 197 | POS 198 | POS 199 | POS 200 | POS 201 | POS 202 | POS 203 | POS 204 | POS 205 | POS 206 | POS 207 | POS 208 | POS 209 | POS 210 | POS 211 | POS 212 | POS 213 | POS 214 | POS 215 | POS 216 | POS 217 | POS 218 | POS 219 | POS 220 | POS 221 | POS 222 | POS 223 | POS 224 | POS 225 | POS 226 | POS 227 | POS 228 | POS 229 | POS 230 | POS 231 | POS 232 | POS 233 | POS 234 | POS 235 | POS 236 | POS 237 | POS 238 | POS 239 | POS 240 | POS 241 | POS 242 | POS 243 | POS 244 | POS 245 | POS 246 | POS 247 | POS 248 | POS 249 | POS 250 | POS 251 | POS 252 | POS 253 | POS 254 | POS 255 | POS 256 | POS 257 | POS 258 | POS 259 | POS 260 | POS 261 | POS 262 | POS 263 | POS 264 | POS 265 | POS 266 | POS 267 | POS 268 | POS 269 | POS 270 | POS 271 | POS 272 | POS 273 | POS 274 | POS 275 | POS 276 | POS 277 | POS 278 | POS 279 | POS 280 | POS 281 | POS 282 | POS 283 | POS 284 | POS 285 | POS 286 | POS 287 | POS 288 | POS 289 | POS 290 | POS 291 | POS 292 | POS 293 | POS 294 | POS 295 | POS 296 | POS 297 | POS 298 | POS 299 | POS 300 | POS 301 | POS 302 | POS 303 | POS 304 | POS 305 | POS 306 | POS 307 | POS 308 | POS 309 | POS 310 | POS 311 | POS 312 | POS 313 | POS 314 | POS 315 | POS 316 | POS 317 | POS 318 | POS 319 | POS 320 | POS 321 | POS 322 | POS 323 | POS 324 | POS 325 | POS 326 | POS 327 | POS 328 | POS 329 | POS 330 | POS 331 | POS 332 | POS 333 | POS 334 | POS 335 | POS 336 | POS 337 | POS 338 | POS 339 | POS 340 | POS 341 | POS 342 | 
POS 343 | POS 344 | POS 345 | POS 346 | POS 347 | POS 348 | POS 349 | POS 350 | POS 351 | POS 352 | POS 353 | POS 354 | POS 355 | POS 356 | POS 357 | POS 358 | POS 359 | POS 360 | POS 361 | POS 362 | POS 363 | POS 364 | POS 365 | POS 366 | POS 367 | POS 368 | POS 369 | POS 370 | POS 371 | POS 372 | POS 373 | POS 374 | POS 375 | POS 376 | POS 377 | POS 378 | POS 379 | POS 380 | POS 381 | POS 382 | POS 383 | POS 384 | POS 385 | POS 386 | POS 387 | POS 388 | POS 389 | POS 390 | POS 391 | POS 392 | POS 393 | POS 394 | POS 395 | POS 396 | POS 397 | POS 398 | POS 399 | POS 400 | POS 401 | NEG 402 | NEG 403 | NEG 404 | NEG 405 | NEG 406 | NEG 407 | NEG 408 | NEG 409 | NEG 410 | NEG 411 | NEG 412 | NEG 413 | NEG 414 | NEG 415 | NEG 416 | NEG 417 | NEG 418 | NEG 419 | NEG 420 | NEG 421 | NEG 422 | NEG 423 | NEG 424 | NEG 425 | NEG 426 | NEG 427 | NEG 428 | NEG 429 | NEG 430 | NEG 431 | NEG 432 | NEG 433 | NEG 434 | NEG 435 | NEG 436 | NEG 437 | NEG 438 | NEG 439 | NEG 440 | NEG 441 | NEG 442 | NEG 443 | NEG 444 | NEG 445 | NEG 446 | NEG 447 | NEG 448 | NEG 449 | NEG 450 | NEG 451 | NEG 452 | NEG 453 | NEG 454 | NEG 455 | NEG 456 | NEG 457 | NEG 458 | NEG 459 | NEG 460 | NEG 461 | NEG 462 | NEG 463 | NEG 464 | NEG 465 | NEG 466 | NEG 467 | NEG 468 | NEG 469 | NEG 470 | NEG 471 | NEG 472 | NEG 473 | NEG 474 | NEG 475 | NEG 476 | NEG 477 | NEG 478 | NEG 479 | NEG 480 | NEG 481 | NEG 482 | NEG 483 | NEG 484 | NEG 485 | NEG 486 | NEG 487 | NEG 488 | NEG 489 | NEG 490 | NEG 491 | NEG 492 | NEG 493 | NEG 494 | NEG 495 | NEG 496 | NEG 497 | NEG 498 | NEG 499 | NEG 500 | NEG 501 | NEG 502 | NEG 503 | NEG 504 | NEG 505 | NEG 506 | NEG 507 | NEG 508 | NEG 509 | NEG 510 | NEG 511 | NEG 512 | NEG 513 | NEG 514 | NEG 515 | NEG 516 | NEG 517 | NEG 518 | NEG 519 | NEG 520 | NEG 521 | NEG 522 | NEG 523 | NEG 524 | NEG 525 | NEG 526 | NEG 527 | NEG 528 | NEG 529 | NEG 530 | NEG 531 | NEG 532 | NEG 533 | NEG 534 | NEG 535 | NEG 536 | NEG 537 | NEG 538 | NEG 539 | NEG 540 | NEG 541 | NEG 542 | 
NEG 543 | NEG 544 | NEG 545 | NEG 546 | NEG 547 | NEG 548 | NEG 549 | NEG 550 | NEG 551 | NEG 552 | NEG 553 | NEG 554 | NEG 555 | NEG 556 | NEG 557 | NEG 558 | NEG 559 | NEG 560 | NEG 561 | NEG 562 | NEG 563 | NEG 564 | NEG 565 | NEG 566 | NEG 567 | NEG 568 | NEG 569 | NEG 570 | NEG 571 | NEG 572 | NEG 573 | NEG 574 | NEG 575 | NEG 576 | NEG 577 | NEG 578 | NEG 579 | NEG 580 | NEG 581 | NEG 582 | NEG 583 | NEG 584 | NEG 585 | NEG 586 | NEG 587 | NEG 588 | NEG 589 | NEG 590 | NEG 591 | NEG 592 | NEG 593 | NEG 594 | NEG 595 | NEG 596 | NEG 597 | NEG 598 | NEG 599 | NEG 600 | NEG 601 | NEG 602 | NEG 603 | NEG 604 | NEG 605 | NEG 606 | NEG 607 | NEG 608 | NEG 609 | NEG 610 | NEG 611 | NEG 612 | NEG 613 | NEG 614 | NEG 615 | NEG 616 | NEG 617 | NEG 618 | NEG 619 | NEG 620 | NEG 621 | NEG 622 | NEG 623 | NEG 624 | NEG 625 | NEG 626 | NEG 627 | NEG 628 | NEG 629 | NEG 630 | NEG 631 | NEG 632 | NEG 633 | NEG 634 | NEG 635 | NEG 636 | NEG 637 | NEG 638 | NEG 639 | NEG 640 | NEG 641 | NEG 642 | NEG 643 | NEG 644 | NEG 645 | NEG 646 | NEG 647 | NEG 648 | NEG 649 | NEG 650 | NEG 651 | NEG 652 | NEG 653 | NEG 654 | NEG 655 | NEG 656 | NEG 657 | NEG 658 | NEG 659 | NEG 660 | NEG 661 | NEG 662 | NEG 663 | NEG 664 | NEG 665 | NEG 666 | NEG 667 | NEG 668 | NEG 669 | NEG 670 | NEG 671 | NEG 672 | NEG 673 | NEG 674 | NEG 675 | NEG 676 | NEG 677 | NEG 678 | NEG 679 | NEG 680 | NEG 681 | NEG 682 | NEG 683 | NEG 684 | NEG 685 | NEG 686 | NEG 687 | NEG 688 | NEG 689 | NEG 690 | NEG 691 | NEG 692 | NEG 693 | NEG 694 | NEG 695 | NEG 696 | NEG 697 | NEG 698 | NEG 699 | NEG 700 | NEG 701 | NEG 702 | NEG 703 | NEG 704 | NEG 705 | NEG 706 | NEG 707 | NEG 708 | NEG 709 | NEG 710 | NEG 711 | NEG 712 | NEG 713 | NEG 714 | NEG 715 | NEG 716 | NEG 717 | NEG 718 | NEG 719 | NEG 720 | NEG 721 | NEG 722 | NEG 723 | NEG 724 | NEG 725 | NEG 726 | NEG 727 | NEG 728 | NEG 729 | NEG 730 | NEG 731 | NEG 732 | NEG 733 | NEG 734 | NEG 735 | NEG 736 | NEG 737 | NEG 738 | NEG 739 | NEG 740 | NEG 741 | NEG 742 | 
NEG 743 | NEG 744 | NEG 745 | NEG 746 | NEG 747 | NEG 748 | NEG 749 | NEG 750 | NEG 751 | NEG 752 | NEG 753 | NEG 754 | NEG 755 | NEG 756 | NEG 757 | NEG 758 | NEG 759 | NEG 760 | NEG 761 | NEG 762 | NEG 763 | NEG 764 | NEG 765 | NEG 766 | NEG 767 | NEG 768 | NEG 769 | NEG 770 | NEG 771 | NEG 772 | NEG 773 | NEG 774 | NEG 775 | NEG 776 | NEG 777 | NEG 778 | NEG 779 | NEG 780 | NEG 781 | NEG 782 | NEG 783 | NEG 784 | NEG 785 | NEG 786 | NEG 787 | NEG 788 | NEG 789 | NEG 790 | NEG 791 | NEG 792 | NEG 793 | NEG 794 | NEG 795 | NEG 796 | NEG 797 | NEG 798 | NEG 799 | NEG 800 | NEG 801 |
--------------------------------------------------------------------------------
/data/hotel_comment/raw_data/corpus.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/data/hotel_comment/raw_data/corpus.zip
--------------------------------------------------------------------------------
/data/hotel_comment/raw_data/fix_coupus.py:
--------------------------------------------------------------------------------
import os
import codecs

POS = os.path.join(os.getcwd(), 'pos')
NEG = os.path.join(os.getcwd(), 'neg')
FIX_POS = os.path.join(os.getcwd(), 'fix_pos')
FIX_NEG = os.path.join(os.getcwd(), 'fix_neg')


def fix_corpus(dir_s, dir_t):
    for item in os.listdir(dir_s):
        # Read raw bytes: on Python 3 a text-mode read returns str, which has no .decode().
        with open(os.path.join(dir_s, item), 'rb') as f:
            s = f.read()
        try:
            fix_s = s.decode('gb2312')
        except UnicodeDecodeError:
            try:
                fix_s = s.decode('gbk')
            except UnicodeDecodeError:
                # Last resort: drop the bytes that cannot be decoded.
                fix_s = s.decode('gb2312', errors='ignore')
        with codecs.open(os.path.join(dir_t, item), 'w', encoding='utf8') as ff:
            ff.write(fix_s)


if __name__ == "__main__":
    if not os.path.isdir(FIX_POS):
        os.mkdir(FIX_POS)
    if not os.path.isdir(FIX_NEG):
        os.mkdir(FIX_NEG)
    fix_corpus(POS, FIX_POS)
31 | fix_corpus(NEG, FIX_NEG) 32 | -------------------------------------------------------------------------------- /data/hotel_comment/train.labels.txt: -------------------------------------------------------------------------------- 1 | POS 2 | POS 3 | POS 4 | POS 5 | POS 6 | POS 7 | POS 8 | POS 9 | POS 10 | POS 11 | POS 12 | POS 13 | POS 14 | POS 15 | POS 16 | POS 17 | POS 18 | POS 19 | POS 20 | POS 21 | POS 22 | POS 23 | POS 24 | POS 25 | POS 26 | POS 27 | POS 28 | POS 29 | POS 30 | POS 31 | POS 32 | POS 33 | POS 34 | POS 35 | POS 36 | POS 37 | POS 38 | POS 39 | POS 40 | POS 41 | POS 42 | POS 43 | POS 44 | POS 45 | POS 46 | POS 47 | POS 48 | POS 49 | POS 50 | POS 51 | POS 52 | POS 53 | POS 54 | POS 55 | POS 56 | POS 57 | POS 58 | POS 59 | POS 60 | POS 61 | POS 62 | POS 63 | POS 64 | POS 65 | POS 66 | POS 67 | POS 68 | POS 69 | POS 70 | POS 71 | POS 72 | POS 73 | POS 74 | POS 75 | POS 76 | POS 77 | POS 78 | POS 79 | POS 80 | POS 81 | POS 82 | POS 83 | POS 84 | POS 85 | POS 86 | POS 87 | POS 88 | POS 89 | POS 90 | POS 91 | POS 92 | POS 93 | POS 94 | POS 95 | POS 96 | POS 97 | POS 98 | POS 99 | POS 100 | POS 101 | POS 102 | POS 103 | POS 104 | POS 105 | POS 106 | POS 107 | POS 108 | POS 109 | POS 110 | POS 111 | POS 112 | POS 113 | POS 114 | POS 115 | POS 116 | POS 117 | POS 118 | POS 119 | POS 120 | POS 121 | POS 122 | POS 123 | POS 124 | POS 125 | POS 126 | POS 127 | POS 128 | POS 129 | POS 130 | POS 131 | POS 132 | POS 133 | POS 134 | POS 135 | POS 136 | POS 137 | POS 138 | POS 139 | POS 140 | POS 141 | POS 142 | POS 143 | POS 144 | POS 145 | POS 146 | POS 147 | POS 148 | POS 149 | POS 150 | POS 151 | POS 152 | POS 153 | POS 154 | POS 155 | POS 156 | POS 157 | POS 158 | POS 159 | POS 160 | POS 161 | POS 162 | POS 163 | POS 164 | POS 165 | POS 166 | POS 167 | POS 168 | POS 169 | POS 170 | POS 171 | POS 172 | POS 173 | POS 174 | POS 175 | POS 176 | POS 177 | POS 178 | POS 179 | POS 180 | POS 181 | POS 182 | POS 183 | POS 184 | POS 185 | POS 186 | POS 187 | POS 
188 | POS 189 | POS 190 | POS 191 | POS 192 | POS 193 | POS 194 | POS 195 | POS 196 | POS 197 | POS 198 | POS 199 | POS 200 | POS 201 | POS 202 | POS 203 | POS 204 | POS 205 | POS 206 | POS 207 | POS 208 | POS 209 | POS 210 | POS 211 | POS 212 | POS 213 | POS 214 | POS 215 | POS 216 | POS 217 | POS 218 | POS 219 | POS 220 | POS 221 | POS 222 | POS 223 | POS 224 | POS 225 | POS 226 | POS 227 | POS 228 | POS 229 | POS 230 | POS 231 | POS 232 | POS 233 | POS 234 | POS 235 | POS 236 | POS 237 | POS 238 | POS 239 | POS 240 | POS 241 | POS 242 | POS 243 | POS 244 | POS 245 | POS 246 | POS 247 | POS 248 | POS 249 | POS 250 | POS 251 | POS 252 | POS 253 | POS 254 | POS 255 | POS 256 | POS 257 | POS 258 | POS 259 | POS 260 | POS 261 | POS 262 | POS 263 | POS 264 | POS 265 | POS 266 | POS 267 | POS 268 | POS 269 | POS 270 | POS 271 | POS 272 | POS 273 | POS 274 | POS 275 | POS 276 | POS 277 | POS 278 | POS 279 | POS 280 | POS 281 | POS 282 | POS 283 | POS 284 | POS 285 | POS 286 | POS 287 | POS 288 | POS 289 | POS 290 | POS 291 | POS 292 | POS 293 | POS 294 | POS 295 | POS 296 | POS 297 | POS 298 | POS 299 | POS 300 | POS 301 | POS 302 | POS 303 | POS 304 | POS 305 | POS 306 | POS 307 | POS 308 | POS 309 | POS 310 | POS 311 | POS 312 | POS 313 | POS 314 | POS 315 | POS 316 | POS 317 | POS 318 | POS 319 | POS 320 | POS 321 | POS 322 | POS 323 | POS 324 | POS 325 | POS 326 | POS 327 | POS 328 | POS 329 | POS 330 | POS 331 | POS 332 | POS 333 | POS 334 | POS 335 | POS 336 | POS 337 | POS 338 | POS 339 | POS 340 | POS 341 | POS 342 | POS 343 | POS 344 | POS 345 | POS 346 | POS 347 | POS 348 | POS 349 | POS 350 | POS 351 | POS 352 | POS 353 | POS 354 | POS 355 | POS 356 | POS 357 | POS 358 | POS 359 | POS 360 | POS 361 | POS 362 | POS 363 | POS 364 | POS 365 | POS 366 | POS 367 | POS 368 | POS 369 | POS 370 | POS 371 | POS 372 | POS 373 | POS 374 | POS 375 | POS 376 | POS 377 | POS 378 | POS 379 | POS 380 | POS 381 | POS 382 | POS 383 | POS 384 | POS 385 | POS 386 | POS 387 | POS 
388 | POS 389 | POS 390 | POS 391 | POS 392 | POS 393 | POS 394 | POS 395 | POS 396 | POS 397 | POS 398 | POS 399 | POS 400 | POS 401 | POS 402 | POS 403 | POS 404 | POS 405 | POS 406 | POS 407 | POS 408 | POS 409 | POS 410 | POS 411 | POS 412 | POS 413 | POS 414 | POS 415 | POS 416 | POS 417 | POS 418 | POS 419 | POS 420 | POS 421 | POS 422 | POS 423 | POS 424 | POS 425 | POS 426 | POS 427 | POS 428 | POS 429 | POS 430 | POS 431 | POS 432 | POS 433 | POS 434 | POS 435 | POS 436 | POS 437 | POS 438 | POS 439 | POS 440 | POS 441 | POS 442 | POS 443 | POS 444 | POS 445 | POS 446 | POS 447 | POS 448 | POS 449 | POS 450 | POS 451 | POS 452 | POS 453 | POS 454 | POS 455 | POS 456 | POS 457 | POS 458 | POS 459 | POS 460 | POS 461 | POS 462 | POS 463 | POS 464 | POS 465 | POS 466 | POS 467 | POS 468 | POS 469 | POS 470 | POS 471 | POS 472 | POS 473 | POS 474 | POS 475 | POS 476 | POS 477 | POS 478 | POS 479 | POS 480 | POS 481 | POS 482 | POS 483 | POS 484 | POS 485 | POS 486 | POS 487 | POS 488 | POS 489 | POS 490 | POS 491 | POS 492 | POS 493 | POS 494 | POS 495 | POS 496 | POS 497 | POS 498 | POS 499 | POS 500 | POS 501 | POS 502 | POS 503 | POS 504 | POS 505 | POS 506 | POS 507 | POS 508 | POS 509 | POS 510 | POS 511 | POS 512 | POS 513 | POS 514 | POS 515 | POS 516 | POS 517 | POS 518 | POS 519 | POS 520 | POS 521 | POS 522 | POS 523 | POS 524 | POS 525 | POS 526 | POS 527 | POS 528 | POS 529 | POS 530 | POS 531 | POS 532 | POS 533 | POS 534 | POS 535 | POS 536 | POS 537 | POS 538 | POS 539 | POS 540 | POS 541 | POS 542 | POS 543 | POS 544 | POS 545 | POS 546 | POS 547 | POS 548 | POS 549 | POS 550 | POS 551 | POS 552 | POS 553 | POS 554 | POS 555 | POS 556 | POS 557 | POS 558 | POS 559 | POS 560 | POS 561 | POS 562 | POS 563 | POS 564 | POS 565 | POS 566 | POS 567 | POS 568 | POS 569 | POS 570 | POS 571 | POS 572 | POS 573 | POS 574 | POS 575 | POS 576 | POS 577 | POS 578 | POS 579 | POS 580 | POS 581 | POS 582 | POS 583 | POS 584 | POS 585 | POS 586 | POS 587 | POS 
588 | POS 589 | POS 590 | POS 591 | POS 592 | POS 593 | POS 594 | POS 595 | POS 596 | POS 597 | POS 598 | POS 599 | POS 600 | POS 601 | POS 602 | POS 603 | POS 604 | POS 605 | POS 606 | POS 607 | POS 608 | POS 609 | POS 610 | POS 611 | POS 612 | POS 613 | POS 614 | POS 615 | POS 616 | POS 617 | POS 618 | POS 619 | POS 620 | POS 621 | POS 622 | POS 623 | POS 624 | POS 625 | POS 626 | POS 627 | POS 628 | POS 629 | POS 630 | POS 631 | POS 632 | POS 633 | POS 634 | POS 635 | POS 636 | POS 637 | POS 638 | POS 639 | POS 640 | POS 641 | POS 642 | POS 643 | POS 644 | POS 645 | POS 646 | POS 647 | POS 648 | POS 649 | POS 650 | POS 651 | POS 652 | POS 653 | POS 654 | POS 655 | POS 656 | POS 657 | POS 658 | POS 659 | POS 660 | POS 661 | POS 662 | POS 663 | POS 664 | POS 665 | POS 666 | POS 667 | POS 668 | POS 669 | POS 670 | POS 671 | POS 672 | POS 673 | POS 674 | POS 675 | POS 676 | POS 677 | POS 678 | POS 679 | POS 680 | POS 681 | POS 682 | POS 683 | POS 684 | POS 685 | POS 686 | POS 687 | POS 688 | POS 689 | POS 690 | POS 691 | POS 692 | POS 693 | POS 694 | POS 695 | POS 696 | POS 697 | POS 698 | POS 699 | POS 700 | POS 701 | POS 702 | POS 703 | POS 704 | POS 705 | POS 706 | POS 707 | POS 708 | POS 709 | POS 710 | POS 711 | POS 712 | POS 713 | POS 714 | POS 715 | POS 716 | POS 717 | POS 718 | POS 719 | POS 720 | POS 721 | POS 722 | POS 723 | POS 724 | POS 725 | POS 726 | POS 727 | POS 728 | POS 729 | POS 730 | POS 731 | POS 732 | POS 733 | POS 734 | POS 735 | POS 736 | POS 737 | POS 738 | POS 739 | POS 740 | POS 741 | POS 742 | POS 743 | POS 744 | POS 745 | POS 746 | POS 747 | POS 748 | POS 749 | POS 750 | POS 751 | POS 752 | POS 753 | POS 754 | POS 755 | POS 756 | POS 757 | POS 758 | POS 759 | POS 760 | POS 761 | POS 762 | POS 763 | POS 764 | POS 765 | POS 766 | POS 767 | POS 768 | POS 769 | POS 770 | POS 771 | POS 772 | POS 773 | POS 774 | POS 775 | POS 776 | POS 777 | POS 778 | POS 779 | POS 780 | POS 781 | POS 782 | POS 783 | POS 784 | POS 785 | POS 786 | POS 787 | POS 
788 | POS 789 | POS 790 | POS 791 | POS 792 | POS 793 | POS 794 | POS 795 | POS 796 | POS 797 | POS 798 | POS 799 | POS 800 | POS 801 | POS 802 | POS 803 | POS 804 | POS 805 | POS 806 | POS 807 | POS 808 | POS 809 | POS 810 | POS 811 | POS 812 | POS 813 | POS 814 | POS 815 | POS 816 | POS 817 | POS 818 | POS 819 | POS 820 | POS 821 | POS 822 | POS 823 | POS 824 | POS 825 | POS 826 | POS 827 | POS 828 | POS 829 | POS 830 | POS 831 | POS 832 | POS 833 | POS 834 | POS 835 | POS 836 | POS 837 | POS 838 | POS 839 | POS 840 | POS 841 | POS 842 | POS 843 | POS 844 | POS 845 | POS 846 | POS 847 | POS 848 | POS 849 | POS 850 | POS 851 | POS 852 | POS 853 | POS 854 | POS 855 | POS 856 | POS 857 | POS 858 | POS 859 | POS 860 | POS 861 | POS 862 | POS 863 | POS 864 | POS 865 | POS 866 | POS 867 | POS 868 | POS 869 | POS 870 | POS 871 | POS 872 | POS 873 | POS 874 | POS 875 | POS 876 | POS 877 | POS 878 | POS 879 | POS 880 | POS 881 | POS 882 | POS 883 | POS 884 | POS 885 | POS 886 | POS 887 | POS 888 | POS 889 | POS 890 | POS 891 | POS 892 | POS 893 | POS 894 | POS 895 | POS 896 | POS 897 | POS 898 | POS 899 | POS 900 | POS 901 | POS 902 | POS 903 | POS 904 | POS 905 | POS 906 | POS 907 | POS 908 | POS 909 | POS 910 | POS 911 | POS 912 | POS 913 | POS 914 | POS 915 | POS 916 | POS 917 | POS 918 | POS 919 | POS 920 | POS 921 | POS 922 | POS 923 | POS 924 | POS 925 | POS 926 | POS 927 | POS 928 | POS 929 | POS 930 | POS 931 | POS 932 | POS 933 | POS 934 | POS 935 | POS 936 | POS 937 | POS 938 | POS 939 | POS 940 | POS 941 | POS 942 | POS 943 | POS 944 | POS 945 | POS 946 | POS 947 | POS 948 | POS 949 | POS 950 | POS 951 | POS 952 | POS 953 | POS 954 | POS 955 | POS 956 | POS 957 | POS 958 | POS 959 | POS 960 | POS 961 | POS 962 | POS 963 | POS 964 | POS 965 | POS 966 | POS 967 | POS 968 | POS 969 | POS 970 | POS 971 | POS 972 | POS 973 | POS 974 | POS 975 | POS 976 | POS 977 | POS 978 | POS 979 | POS 980 | POS 981 | POS 982 | POS 983 | POS 984 | POS 985 | POS 986 | POS 987 | POS 
988 | POS 989 | POS 990 | POS 991 | POS 992 | POS 993 | POS 994 | POS 995 | POS 996 | POS 997 | POS 998 | POS 999 | POS 1000 | POS 1001 | POS 1002 | POS 1003 | POS 1004 | POS 1005 | POS 1006 | POS 1007 | POS 1008 | POS 1009 | POS 1010 | POS 1011 | POS 1012 | POS 1013 | POS 1014 | POS 1015 | POS 1016 | POS 1017 | POS 1018 | POS 1019 | POS 1020 | POS 1021 | POS 1022 | POS 1023 | POS 1024 | POS 1025 | POS 1026 | POS 1027 | POS 1028 | POS 1029 | POS 1030 | POS 1031 | POS 1032 | POS 1033 | POS 1034 | POS 1035 | POS 1036 | POS 1037 | POS 1038 | POS 1039 | POS 1040 | POS 1041 | POS 1042 | POS 1043 | POS 1044 | POS 1045 | POS 1046 | POS 1047 | POS 1048 | POS 1049 | POS 1050 | POS 1051 | POS 1052 | POS 1053 | POS 1054 | POS 1055 | POS 1056 | POS 1057 | POS 1058 | POS 1059 | POS 1060 | POS 1061 | POS 1062 | POS 1063 | POS 1064 | POS 1065 | POS 1066 | POS 1067 | POS 1068 | POS 1069 | POS 1070 | POS 1071 | POS 1072 | POS 1073 | POS 1074 | POS 1075 | POS 1076 | POS 1077 | POS 1078 | POS 1079 | POS 1080 | POS 1081 | POS 1082 | POS 1083 | POS 1084 | POS 1085 | POS 1086 | POS 1087 | POS 1088 | POS 1089 | POS 1090 | POS 1091 | POS 1092 | POS 1093 | POS 1094 | POS 1095 | POS 1096 | POS 1097 | POS 1098 | POS 1099 | POS 1100 | POS 1101 | POS 1102 | POS 1103 | POS 1104 | POS 1105 | POS 1106 | POS 1107 | POS 1108 | POS 1109 | POS 1110 | POS 1111 | POS 1112 | POS 1113 | POS 1114 | POS 1115 | POS 1116 | POS 1117 | POS 1118 | POS 1119 | POS 1120 | POS 1121 | POS 1122 | POS 1123 | POS 1124 | POS 1125 | POS 1126 | POS 1127 | POS 1128 | POS 1129 | POS 1130 | POS 1131 | POS 1132 | POS 1133 | POS 1134 | POS 1135 | POS 1136 | POS 1137 | POS 1138 | POS 1139 | POS 1140 | POS 1141 | POS 1142 | POS 1143 | POS 1144 | POS 1145 | POS 1146 | POS 1147 | POS 1148 | POS 1149 | POS 1150 | POS 1151 | POS 1152 | POS 1153 | POS 1154 | POS 1155 | POS 1156 | POS 1157 | POS 1158 | POS 1159 | POS 1160 | POS 1161 | POS 1162 | POS 1163 | POS 1164 | POS 1165 | POS 1166 | POS 1167 | POS 1168 | POS 1169 | POS 1170 | 
POS 1171 | POS 1172 | POS 1173 | POS 1174 | POS 1175 | POS 1176 | POS 1177 | POS 1178 | POS 1179 | POS 1180 | POS 1181 | POS 1182 | POS 1183 | POS 1184 | POS 1185 | POS 1186 | POS 1187 | POS 1188 | POS 1189 | POS 1190 | POS 1191 | POS 1192 | POS 1193 | POS 1194 | POS 1195 | POS 1196 | POS 1197 | POS 1198 | POS 1199 | POS 1200 | POS 1201 | POS 1202 | POS 1203 | POS 1204 | POS 1205 | POS 1206 | POS 1207 | POS 1208 | POS 1209 | POS 1210 | POS 1211 | POS 1212 | POS 1213 | POS 1214 | POS 1215 | POS 1216 | POS 1217 | POS 1218 | POS 1219 | POS 1220 | POS 1221 | POS 1222 | POS 1223 | POS 1224 | POS 1225 | POS 1226 | POS 1227 | POS 1228 | POS 1229 | POS 1230 | POS 1231 | POS 1232 | POS 1233 | POS 1234 | POS 1235 | POS 1236 | POS 1237 | POS 1238 | POS 1239 | POS 1240 | POS 1241 | POS 1242 | POS 1243 | POS 1244 | POS 1245 | POS 1246 | POS 1247 | POS 1248 | POS 1249 | POS 1250 | POS 1251 | POS 1252 | POS 1253 | POS 1254 | POS 1255 | POS 1256 | POS 1257 | POS 1258 | POS 1259 | POS 1260 | POS 1261 | POS 1262 | POS 1263 | POS 1264 | POS 1265 | POS 1266 | POS 1267 | POS 1268 | POS 1269 | POS 1270 | POS 1271 | POS 1272 | POS 1273 | POS 1274 | POS 1275 | POS 1276 | POS 1277 | POS 1278 | POS 1279 | POS 1280 | POS 1281 | POS 1282 | POS 1283 | POS 1284 | POS 1285 | POS 1286 | POS 1287 | POS 1288 | POS 1289 | POS 1290 | POS 1291 | POS 1292 | POS 1293 | POS 1294 | POS 1295 | POS 1296 | POS 1297 | POS 1298 | POS 1299 | POS 1300 | POS 1301 | POS 1302 | POS 1303 | POS 1304 | POS 1305 | POS 1306 | POS 1307 | POS 1308 | POS 1309 | POS 1310 | POS 1311 | POS 1312 | POS 1313 | POS 1314 | POS 1315 | POS 1316 | POS 1317 | POS 1318 | POS 1319 | POS 1320 | POS 1321 | POS 1322 | POS 1323 | POS 1324 | POS 1325 | POS 1326 | POS 1327 | POS 1328 | POS 1329 | POS 1330 | POS 1331 | POS 1332 | POS 1333 | POS 1334 | POS 1335 | POS 1336 | POS 1337 | POS 1338 | POS 1339 | POS 1340 | POS 1341 | POS 1342 | POS 1343 | POS 1344 | POS 1345 | POS 1346 | POS 1347 | POS 1348 | POS 1349 | POS 1350 | POS 1351 | POS 1352 
| POS 1353 | POS 1354 | POS 1355 | POS 1356 | POS 1357 | POS 1358 | POS 1359 | POS 1360 | POS 1361 | POS 1362 | POS 1363 | POS 1364 | POS 1365 | POS 1366 | POS 1367 | POS 1368 | POS 1369 | POS 1370 | POS 1371 | POS 1372 | POS 1373 | POS 1374 | POS 1375 | POS 1376 | POS 1377 | POS 1378 | POS 1379 | POS 1380 | POS 1381 | POS 1382 | POS 1383 | POS 1384 | POS 1385 | POS 1386 | POS 1387 | POS 1388 | POS 1389 | POS 1390 | POS 1391 | POS 1392 | POS 1393 | POS 1394 | POS 1395 | POS 1396 | POS 1397 | POS 1398 | POS 1399 | POS 1400 | POS 1401 | POS 1402 | POS 1403 | POS 1404 | POS 1405 | POS 1406 | POS 1407 | POS 1408 | POS 1409 | POS 1410 | POS 1411 | POS 1412 | POS 1413 | POS 1414 | POS 1415 | POS 1416 | POS 1417 | POS 1418 | POS 1419 | POS 1420 | POS 1421 | POS 1422 | POS 1423 | POS 1424 | POS 1425 | POS 1426 | POS 1427 | POS 1428 | POS 1429 | POS 1430 | POS 1431 | POS 1432 | POS 1433 | POS 1434 | POS 1435 | POS 1436 | POS 1437 | POS 1438 | POS 1439 | POS 1440 | POS 1441 | POS 1442 | POS 1443 | POS 1444 | POS 1445 | POS 1446 | POS 1447 | POS 1448 | POS 1449 | POS 1450 | POS 1451 | POS 1452 | POS 1453 | POS 1454 | POS 1455 | POS 1456 | POS 1457 | POS 1458 | POS 1459 | POS 1460 | POS 1461 | POS 1462 | POS 1463 | POS 1464 | POS 1465 | POS 1466 | POS 1467 | POS 1468 | POS 1469 | POS 1470 | POS 1471 | POS 1472 | POS 1473 | POS 1474 | POS 1475 | POS 1476 | POS 1477 | POS 1478 | POS 1479 | POS 1480 | POS 1481 | POS 1482 | POS 1483 | POS 1484 | POS 1485 | POS 1486 | POS 1487 | POS 1488 | POS 1489 | POS 1490 | POS 1491 | POS 1492 | POS 1493 | POS 1494 | POS 1495 | POS 1496 | POS 1497 | POS 1498 | POS 1499 | POS 1500 | POS 1501 | POS 1502 | POS 1503 | POS 1504 | POS 1505 | POS 1506 | POS 1507 | POS 1508 | POS 1509 | POS 1510 | POS 1511 | POS 1512 | POS 1513 | POS 1514 | POS 1515 | POS 1516 | POS 1517 | POS 1518 | POS 1519 | POS 1520 | POS 1521 | POS 1522 | POS 1523 | POS 1524 | POS 1525 | POS 1526 | POS 1527 | POS 1528 | POS 1529 | POS 1530 | POS 1531 | POS 1532 | POS 1533 | POS 
1534 | POS 1535 | POS 1536 | POS 1537 | POS 1538 | POS 1539 | POS 1540 | POS 1541 | POS 1542 | POS 1543 | POS 1544 | POS 1545 | POS 1546 | POS 1547 | POS 1548 | POS 1549 | POS 1550 | POS 1551 | POS 1552 | POS 1553 | POS 1554 | POS 1555 | POS 1556 | POS 1557 | POS 1558 | POS 1559 | POS 1560 | POS 1561 | POS 1562 | POS 1563 | POS 1564 | POS 1565 | POS 1566 | POS 1567 | POS 1568 | POS 1569 | POS 1570 | POS 1571 | POS 1572 | POS 1573 | POS 1574 | POS 1575 | POS 1576 | POS 1577 | POS 1578 | POS 1579 | POS 1580 | POS 1581 | POS 1582 | POS 1583 | POS 1584 | POS 1585 | POS 1586 | POS 1587 | POS 1588 | POS 1589 | POS 1590 | POS 1591 | POS 1592 | POS 1593 | POS 1594 | POS 1595 | POS 1596 | POS 1597 | POS 1598 | POS 1599 | POS 1600 | POS 1601 | NEG 1602 | NEG 1603 | NEG 1604 | NEG 1605 | NEG 1606 | NEG 1607 | NEG 1608 | NEG 1609 | NEG 1610 | NEG 1611 | NEG 1612 | NEG 1613 | NEG 1614 | NEG 1615 | NEG 1616 | NEG 1617 | NEG 1618 | NEG 1619 | NEG 1620 | NEG 1621 | NEG 1622 | NEG 1623 | NEG 1624 | NEG 1625 | NEG 1626 | NEG 1627 | NEG 1628 | NEG 1629 | NEG 1630 | NEG 1631 | NEG 1632 | NEG 1633 | NEG 1634 | NEG 1635 | NEG 1636 | NEG 1637 | NEG 1638 | NEG 1639 | NEG 1640 | NEG 1641 | NEG 1642 | NEG 1643 | NEG 1644 | NEG 1645 | NEG 1646 | NEG 1647 | NEG 1648 | NEG 1649 | NEG 1650 | NEG 1651 | NEG 1652 | NEG 1653 | NEG 1654 | NEG 1655 | NEG 1656 | NEG 1657 | NEG 1658 | NEG 1659 | NEG 1660 | NEG 1661 | NEG 1662 | NEG 1663 | NEG 1664 | NEG 1665 | NEG 1666 | NEG 1667 | NEG 1668 | NEG 1669 | NEG 1670 | NEG 1671 | NEG 1672 | NEG 1673 | NEG 1674 | NEG 1675 | NEG 1676 | NEG 1677 | NEG 1678 | NEG 1679 | NEG 1680 | NEG 1681 | NEG 1682 | NEG 1683 | NEG 1684 | NEG 1685 | NEG 1686 | NEG 1687 | NEG 1688 | NEG 1689 | NEG 1690 | NEG 1691 | NEG 1692 | NEG 1693 | NEG 1694 | NEG 1695 | NEG 1696 | NEG 1697 | NEG 1698 | NEG 1699 | NEG 1700 | NEG 1701 | NEG 1702 | NEG 1703 | NEG 1704 | NEG 1705 | NEG 1706 | NEG 1707 | NEG 1708 | NEG 1709 | NEG 1710 | NEG 1711 | NEG 1712 | NEG 1713 | NEG 1714 | NEG 1715 | 
NEG 1716 | NEG 1717 | NEG 1718 | NEG 1719 | NEG 1720 | NEG 1721 | NEG 1722 | NEG 1723 | NEG 1724 | NEG 1725 | NEG 1726 | NEG 1727 | NEG 1728 | NEG 1729 | NEG 1730 | NEG 1731 | NEG 1732 | NEG 1733 | NEG 1734 | NEG 1735 | NEG 1736 | NEG 1737 | NEG 1738 | NEG 1739 | NEG 1740 | NEG 1741 | NEG 1742 | NEG 1743 | NEG 1744 | NEG 1745 | NEG 1746 | NEG 1747 | NEG 1748 | NEG 1749 | NEG 1750 | NEG 1751 | NEG 1752 | NEG 1753 | NEG 1754 | NEG 1755 | NEG 1756 | NEG 1757 | NEG 1758 | NEG 1759 | NEG 1760 | NEG 1761 | NEG 1762 | NEG 1763 | NEG 1764 | NEG 1765 | NEG 1766 | NEG 1767 | NEG 1768 | NEG 1769 | NEG 1770 | NEG 1771 | NEG 1772 | NEG 1773 | NEG 1774 | NEG 1775 | NEG 1776 | NEG 1777 | NEG 1778 | NEG 1779 | NEG 1780 | NEG 1781 | NEG 1782 | NEG 1783 | NEG 1784 | NEG 1785 | NEG 1786 | NEG 1787 | NEG 1788 | NEG 1789 | NEG 1790 | NEG 1791 | NEG 1792 | NEG 1793 | NEG 1794 | NEG 1795 | NEG 1796 | NEG 1797 | NEG 1798 | NEG 1799 | NEG 1800 | NEG 1801 | NEG 1802 | NEG 1803 | NEG 1804 | NEG 1805 | NEG 1806 | NEG 1807 | NEG 1808 | NEG 1809 | NEG 1810 | NEG 1811 | NEG 1812 | NEG 1813 | NEG 1814 | NEG 1815 | NEG 1816 | NEG 1817 | NEG 1818 | NEG 1819 | NEG 1820 | NEG 1821 | NEG 1822 | NEG 1823 | NEG 1824 | NEG 1825 | NEG 1826 | NEG 1827 | NEG 1828 | NEG 1829 | NEG 1830 | NEG 1831 | NEG 1832 | NEG 1833 | NEG 1834 | NEG 1835 | NEG 1836 | NEG 1837 | NEG 1838 | NEG 1839 | NEG 1840 | NEG 1841 | NEG 1842 | NEG 1843 | NEG 1844 | NEG 1845 | NEG 1846 | NEG 1847 | NEG 1848 | NEG 1849 | NEG 1850 | NEG 1851 | NEG 1852 | NEG 1853 | NEG 1854 | NEG 1855 | NEG 1856 | NEG 1857 | NEG 1858 | NEG 1859 | NEG 1860 | NEG 1861 | NEG 1862 | NEG 1863 | NEG 1864 | NEG 1865 | NEG 1866 | NEG 1867 | NEG 1868 | NEG 1869 | NEG 1870 | NEG 1871 | NEG 1872 | NEG 1873 | NEG 1874 | NEG 1875 | NEG 1876 | NEG 1877 | NEG 1878 | NEG 1879 | NEG 1880 | NEG 1881 | NEG 1882 | NEG 1883 | NEG 1884 | NEG 1885 | NEG 1886 | NEG 1887 | NEG 1888 | NEG 1889 | NEG 1890 | NEG 1891 | NEG 1892 | NEG 1893 | NEG 1894 | NEG 1895 | NEG 1896 | NEG 1897 
| NEG 1898 | NEG 1899 | NEG 1900 | NEG 1901 | NEG 1902 | NEG 1903 | NEG 1904 | NEG 1905 | NEG 1906 | NEG 1907 | NEG 1908 | NEG 1909 | NEG 1910 | NEG 1911 | NEG 1912 | NEG 1913 | NEG 1914 | NEG 1915 | NEG 1916 | NEG 1917 | NEG 1918 | NEG 1919 | NEG 1920 | NEG 1921 | NEG 1922 | NEG 1923 | NEG 1924 | NEG 1925 | NEG 1926 | NEG 1927 | NEG 1928 | NEG 1929 | NEG 1930 | NEG 1931 | NEG 1932 | NEG 1933 | NEG 1934 | NEG 1935 | NEG 1936 | NEG 1937 | NEG 1938 | NEG 1939 | NEG 1940 | NEG 1941 | NEG 1942 | NEG 1943 | NEG 1944 | NEG 1945 | NEG 1946 | NEG 1947 | NEG 1948 | NEG 1949 | NEG 1950 | NEG 1951 | NEG 1952 | NEG 1953 | NEG 1954 | NEG 1955 | NEG 1956 | NEG 1957 | NEG 1958 | NEG 1959 | NEG 1960 | NEG 1961 | NEG 1962 | NEG 1963 | NEG 1964 | NEG 1965 | NEG 1966 | NEG 1967 | NEG 1968 | NEG 1969 | NEG 1970 | NEG 1971 | NEG 1972 | NEG 1973 | NEG 1974 | NEG 1975 | NEG 1976 | NEG 1977 | NEG 1978 | NEG 1979 | NEG 1980 | NEG 1981 | NEG 1982 | NEG 1983 | NEG 1984 | NEG 1985 | NEG 1986 | NEG 1987 | NEG 1988 | NEG 1989 | NEG 1990 | NEG 1991 | NEG 1992 | NEG 1993 | NEG 1994 | NEG 1995 | NEG 1996 | NEG 1997 | NEG 1998 | NEG 1999 | NEG 2000 | NEG 2001 | NEG 2002 | NEG 2003 | NEG 2004 | NEG 2005 | NEG 2006 | NEG 2007 | NEG 2008 | NEG 2009 | NEG 2010 | NEG 2011 | NEG 2012 | NEG 2013 | NEG 2014 | NEG 2015 | NEG 2016 | NEG 2017 | NEG 2018 | NEG 2019 | NEG 2020 | NEG 2021 | NEG 2022 | NEG 2023 | NEG 2024 | NEG 2025 | NEG 2026 | NEG 2027 | NEG 2028 | NEG 2029 | NEG 2030 | NEG 2031 | NEG 2032 | NEG 2033 | NEG 2034 | NEG 2035 | NEG 2036 | NEG 2037 | NEG 2038 | NEG 2039 | NEG 2040 | NEG 2041 | NEG 2042 | NEG 2043 | NEG 2044 | NEG 2045 | NEG 2046 | NEG 2047 | NEG 2048 | NEG 2049 | NEG 2050 | NEG 2051 | NEG 2052 | NEG 2053 | NEG 2054 | NEG 2055 | NEG 2056 | NEG 2057 | NEG 2058 | NEG 2059 | NEG 2060 | NEG 2061 | NEG 2062 | NEG 2063 | NEG 2064 | NEG 2065 | NEG 2066 | NEG 2067 | NEG 2068 | NEG 2069 | NEG 2070 | NEG 2071 | NEG 2072 | NEG 2073 | NEG 2074 | NEG 2075 | NEG 2076 | NEG 2077 | NEG 2078 | NEG 
2079 | NEG 2080 | NEG 2081 | NEG 2082 | NEG 2083 | NEG 2084 | NEG 2085 | NEG 2086 | NEG 2087 | NEG 2088 | NEG 2089 | NEG 2090 | NEG 2091 | NEG 2092 | NEG 2093 | NEG 2094 | NEG 2095 | NEG 2096 | NEG 2097 | NEG 2098 | NEG 2099 | NEG 2100 | NEG 2101 | NEG 2102 | NEG 2103 | NEG 2104 | NEG 2105 | NEG 2106 | NEG 2107 | NEG 2108 | NEG 2109 | NEG 2110 | NEG 2111 | NEG 2112 | NEG 2113 | NEG 2114 | NEG 2115 | NEG 2116 | NEG 2117 | NEG 2118 | NEG 2119 | NEG 2120 | NEG 2121 | NEG 2122 | NEG 2123 | NEG 2124 | NEG 2125 | NEG 2126 | NEG 2127 | NEG 2128 | NEG 2129 | NEG 2130 | NEG 2131 | NEG 2132 | NEG 2133 | NEG 2134 | NEG 2135 | NEG 2136 | NEG 2137 | NEG 2138 | NEG 2139 | NEG 2140 | NEG 2141 | NEG 2142 | NEG 2143 | NEG 2144 | NEG 2145 | NEG 2146 | NEG 2147 | NEG 2148 | NEG 2149 | NEG 2150 | NEG 2151 | NEG 2152 | NEG 2153 | NEG 2154 | NEG 2155 | NEG 2156 | NEG 2157 | NEG 2158 | NEG 2159 | NEG 2160 | NEG 2161 | NEG 2162 | NEG 2163 | NEG 2164 | NEG 2165 | NEG 2166 | NEG 2167 | NEG 2168 | NEG 2169 | NEG 2170 | NEG 2171 | NEG 2172 | NEG 2173 | NEG 2174 | NEG 2175 | NEG 2176 | NEG 2177 | NEG 2178 | NEG 2179 | NEG 2180 | NEG 2181 | NEG 2182 | NEG 2183 | NEG 2184 | NEG 2185 | NEG 2186 | NEG 2187 | NEG 2188 | NEG 2189 | NEG 2190 | NEG 2191 | NEG 2192 | NEG 2193 | NEG 2194 | NEG 2195 | NEG 2196 | NEG 2197 | NEG 2198 | NEG 2199 | NEG 2200 | NEG 2201 | NEG 2202 | NEG 2203 | NEG 2204 | NEG 2205 | NEG 2206 | NEG 2207 | NEG 2208 | NEG 2209 | NEG 2210 | NEG 2211 | NEG 2212 | NEG 2213 | NEG 2214 | NEG 2215 | NEG 2216 | NEG 2217 | NEG 2218 | NEG 2219 | NEG 2220 | NEG 2221 | NEG 2222 | NEG 2223 | NEG 2224 | NEG 2225 | NEG 2226 | NEG 2227 | NEG 2228 | NEG 2229 | NEG 2230 | NEG 2231 | NEG 2232 | NEG 2233 | NEG 2234 | NEG 2235 | NEG 2236 | NEG 2237 | NEG 2238 | NEG 2239 | NEG 2240 | NEG 2241 | NEG 2242 | NEG 2243 | NEG 2244 | NEG 2245 | NEG 2246 | NEG 2247 | NEG 2248 | NEG 2249 | NEG 2250 | NEG 2251 | NEG 2252 | NEG 2253 | NEG 2254 | NEG 2255 | NEG 2256 | NEG 2257 | NEG 2258 | NEG 2259 | NEG 2260 | 
NEG 2261 | NEG 2262 | NEG 2263 | NEG 2264 | NEG 2265 | NEG 2266 | NEG 2267 | NEG 2268 | NEG 2269 | NEG 2270 | NEG 2271 | NEG 2272 | NEG 2273 | NEG 2274 | NEG 2275 | NEG 2276 | NEG 2277 | NEG 2278 | NEG 2279 | NEG 2280 | NEG 2281 | NEG 2282 | NEG 2283 | NEG 2284 | NEG 2285 | NEG 2286 | NEG 2287 | NEG 2288 | NEG 2289 | NEG 2290 | NEG 2291 | NEG 2292 | NEG 2293 | NEG 2294 | NEG 2295 | NEG 2296 | NEG 2297 | NEG 2298 | NEG 2299 | NEG 2300 | NEG 2301 | NEG 2302 | NEG 2303 | NEG 2304 | NEG 2305 | NEG 2306 | NEG 2307 | NEG 2308 | NEG 2309 | NEG 2310 | NEG 2311 | NEG 2312 | NEG 2313 | NEG 2314 | NEG 2315 | NEG 2316 | NEG 2317 | NEG 2318 | NEG 2319 | NEG 2320 | NEG 2321 | NEG 2322 | NEG 2323 | NEG 2324 | NEG 2325 | NEG 2326 | NEG 2327 | NEG 2328 | NEG 2329 | NEG 2330 | NEG 2331 | NEG 2332 | NEG 2333 | NEG 2334 | NEG 2335 | NEG 2336 | NEG 2337 | NEG 2338 | NEG 2339 | NEG 2340 | NEG 2341 | NEG 2342 | NEG 2343 | NEG 2344 | NEG 2345 | NEG 2346 | NEG 2347 | NEG 2348 | NEG 2349 | NEG 2350 | NEG 2351 | NEG 2352 | NEG 2353 | NEG 2354 | NEG 2355 | NEG 2356 | NEG 2357 | NEG 2358 | NEG 2359 | NEG 2360 | NEG 2361 | NEG 2362 | NEG 2363 | NEG 2364 | NEG 2365 | NEG 2366 | NEG 2367 | NEG 2368 | NEG 2369 | NEG 2370 | NEG 2371 | NEG 2372 | NEG 2373 | NEG 2374 | NEG 2375 | NEG 2376 | NEG 2377 | NEG 2378 | NEG 2379 | NEG 2380 | NEG 2381 | NEG 2382 | NEG 2383 | NEG 2384 | NEG 2385 | NEG 2386 | NEG 2387 | NEG 2388 | NEG 2389 | NEG 2390 | NEG 2391 | NEG 2392 | NEG 2393 | NEG 2394 | NEG 2395 | NEG 2396 | NEG 2397 | NEG 2398 | NEG 2399 | NEG 2400 | NEG 2401 | NEG 2402 | NEG 2403 | NEG 2404 | NEG 2405 | NEG 2406 | NEG 2407 | NEG 2408 | NEG 2409 | NEG 2410 | NEG 2411 | NEG 2412 | NEG 2413 | NEG 2414 | NEG 2415 | NEG 2416 | NEG 2417 | NEG 2418 | NEG 2419 | NEG 2420 | NEG 2421 | NEG 2422 | NEG 2423 | NEG 2424 | NEG 2425 | NEG 2426 | NEG 2427 | NEG 2428 | NEG 2429 | NEG 2430 | NEG 2431 | NEG 2432 | NEG 2433 | NEG 2434 | NEG 2435 | NEG 2436 | NEG 2437 | NEG 2438 | NEG 2439 | NEG 2440 | NEG 2441 | NEG 2442 
| NEG 2443 | NEG 2444 | NEG 2445 | NEG 2446 | NEG 2447 | NEG 2448 | NEG 2449 | NEG 2450 | NEG 2451 | NEG 2452 | NEG 2453 | NEG 2454 | NEG 2455 | NEG 2456 | NEG 2457 | NEG 2458 | NEG 2459 | NEG 2460 | NEG 2461 | NEG 2462 | NEG 2463 | NEG 2464 | NEG 2465 | NEG 2466 | NEG 2467 | NEG 2468 | NEG 2469 | NEG 2470 | NEG 2471 | NEG 2472 | NEG 2473 | NEG 2474 | NEG 2475 | NEG 2476 | NEG 2477 | NEG 2478 | NEG 2479 | NEG 2480 | NEG 2481 | NEG 2482 | NEG 2483 | NEG 2484 | NEG 2485 | NEG 2486 | NEG 2487 | NEG 2488 | NEG 2489 | NEG 2490 | NEG 2491 | NEG 2492 | NEG 2493 | NEG 2494 | NEG 2495 | NEG 2496 | NEG 2497 | NEG 2498 | NEG 2499 | NEG 2500 | NEG 2501 | NEG 2502 | NEG 2503 | NEG 2504 | NEG 2505 | NEG 2506 | NEG 2507 | NEG 2508 | NEG 2509 | NEG 2510 | NEG 2511 | NEG 2512 | NEG 2513 | NEG 2514 | NEG 2515 | NEG 2516 | NEG 2517 | NEG 2518 | NEG 2519 | NEG 2520 | NEG 2521 | NEG 2522 | NEG 2523 | NEG 2524 | NEG 2525 | NEG 2526 | NEG 2527 | NEG 2528 | NEG 2529 | NEG 2530 | NEG 2531 | NEG 2532 | NEG 2533 | NEG 2534 | NEG 2535 | NEG 2536 | NEG 2537 | NEG 2538 | NEG 2539 | NEG 2540 | NEG 2541 | NEG 2542 | NEG 2543 | NEG 2544 | NEG 2545 | NEG 2546 | NEG 2547 | NEG 2548 | NEG 2549 | NEG 2550 | NEG 2551 | NEG 2552 | NEG 2553 | NEG 2554 | NEG 2555 | NEG 2556 | NEG 2557 | NEG 2558 | NEG 2559 | NEG 2560 | NEG 2561 | NEG 2562 | NEG 2563 | NEG 2564 | NEG 2565 | NEG 2566 | NEG 2567 | NEG 2568 | NEG 2569 | NEG 2570 | NEG 2571 | NEG 2572 | NEG 2573 | NEG 2574 | NEG 2575 | NEG 2576 | NEG 2577 | NEG 2578 | NEG 2579 | NEG 2580 | NEG 2581 | NEG 2582 | NEG 2583 | NEG 2584 | NEG 2585 | NEG 2586 | NEG 2587 | NEG 2588 | NEG 2589 | NEG 2590 | NEG 2591 | NEG 2592 | NEG 2593 | NEG 2594 | NEG 2595 | NEG 2596 | NEG 2597 | NEG 2598 | NEG 2599 | NEG 2600 | NEG 2601 | NEG 2602 | NEG 2603 | NEG 2604 | NEG 2605 | NEG 2606 | NEG 2607 | NEG 2608 | NEG 2609 | NEG 2610 | NEG 2611 | NEG 2612 | NEG 2613 | NEG 2614 | NEG 2615 | NEG 2616 | NEG 2617 | NEG 2618 | NEG 2619 | NEG 2620 | NEG 2621 | NEG 2622 | NEG 2623 | NEG 
2624 | NEG 2625 | NEG 2626 | NEG 2627 | NEG 2628 | NEG 2629 | NEG 2630 | NEG 2631 | NEG 2632 | NEG 2633 | NEG 2634 | NEG 2635 | NEG 2636 | NEG 2637 | NEG 2638 | NEG 2639 | NEG 2640 | NEG 2641 | NEG 2642 | NEG 2643 | NEG 2644 | NEG 2645 | NEG 2646 | NEG 2647 | NEG 2648 | NEG 2649 | NEG 2650 | NEG 2651 | NEG 2652 | NEG 2653 | NEG 2654 | NEG 2655 | NEG 2656 | NEG 2657 | NEG 2658 | NEG 2659 | NEG 2660 | NEG 2661 | NEG 2662 | NEG 2663 | NEG 2664 | NEG 2665 | NEG 2666 | NEG 2667 | NEG 2668 | NEG 2669 | NEG 2670 | NEG 2671 | NEG 2672 | NEG 2673 | NEG 2674 | NEG 2675 | NEG 2676 | NEG 2677 | NEG 2678 | NEG 2679 | NEG 2680 | NEG 2681 | NEG 2682 | NEG 2683 | NEG 2684 | NEG 2685 | NEG 2686 | NEG 2687 | NEG 2688 | NEG 2689 | NEG 2690 | NEG 2691 | NEG 2692 | NEG 2693 | NEG 2694 | NEG 2695 | NEG 2696 | NEG 2697 | NEG 2698 | NEG 2699 | NEG 2700 | NEG 2701 | NEG 2702 | NEG 2703 | NEG 2704 | NEG 2705 | NEG 2706 | NEG 2707 | NEG 2708 | NEG 2709 | NEG 2710 | NEG 2711 | NEG 2712 | NEG 2713 | NEG 2714 | NEG 2715 | NEG 2716 | NEG 2717 | NEG 2718 | NEG 2719 | NEG 2720 | NEG 2721 | NEG 2722 | NEG 2723 | NEG 2724 | NEG 2725 | NEG 2726 | NEG 2727 | NEG 2728 | NEG 2729 | NEG 2730 | NEG 2731 | NEG 2732 | NEG 2733 | NEG 2734 | NEG 2735 | NEG 2736 | NEG 2737 | NEG 2738 | NEG 2739 | NEG 2740 | NEG 2741 | NEG 2742 | NEG 2743 | NEG 2744 | NEG 2745 | NEG 2746 | NEG 2747 | NEG 2748 | NEG 2749 | NEG 2750 | NEG 2751 | NEG 2752 | NEG 2753 | NEG 2754 | NEG 2755 | NEG 2756 | NEG 2757 | NEG 2758 | NEG 2759 | NEG 2760 | NEG 2761 | NEG 2762 | NEG 2763 | NEG 2764 | NEG 2765 | NEG 2766 | NEG 2767 | NEG 2768 | NEG 2769 | NEG 2770 | NEG 2771 | NEG 2772 | NEG 2773 | NEG 2774 | NEG 2775 | NEG 2776 | NEG 2777 | NEG 2778 | NEG 2779 | NEG 2780 | NEG 2781 | NEG 2782 | NEG 2783 | NEG 2784 | NEG 2785 | NEG 2786 | NEG 2787 | NEG 2788 | NEG 2789 | NEG 2790 | NEG 2791 | NEG 2792 | NEG 2793 | NEG 2794 | NEG 2795 | NEG 2796 | NEG 2797 | NEG 2798 | NEG 2799 | NEG 2800 | NEG 2801 | NEG 2802 | NEG 2803 | NEG 2804 | NEG 2805 | 
NEG 2806 | NEG 2807 | NEG 2808 | NEG 2809 | NEG 2810 | NEG 2811 | NEG 2812 | NEG 2813 | NEG 2814 | NEG 2815 | NEG 2816 | NEG 2817 | NEG 2818 | NEG 2819 | NEG 2820 | NEG 2821 | NEG 2822 | NEG 2823 | NEG 2824 | NEG 2825 | NEG 2826 | NEG 2827 | NEG 2828 | NEG 2829 | NEG 2830 | NEG 2831 | NEG 2832 | NEG 2833 | NEG 2834 | NEG 2835 | NEG 2836 | NEG 2837 | NEG 2838 | NEG 2839 | NEG 2840 | NEG 2841 | NEG 2842 | NEG 2843 | NEG 2844 | NEG 2845 | NEG 2846 | NEG 2847 | NEG 2848 | NEG 2849 | NEG 2850 | NEG 2851 | NEG 2852 | NEG 2853 | NEG 2854 | NEG 2855 | NEG 2856 | NEG 2857 | NEG 2858 | NEG 2859 | NEG 2860 | NEG 2861 | NEG 2862 | NEG 2863 | NEG 2864 | NEG 2865 | NEG 2866 | NEG 2867 | NEG 2868 | NEG 2869 | NEG 2870 | NEG 2871 | NEG 2872 | NEG 2873 | NEG 2874 | NEG 2875 | NEG 2876 | NEG 2877 | NEG 2878 | NEG 2879 | NEG 2880 | NEG 2881 | NEG 2882 | NEG 2883 | NEG 2884 | NEG 2885 | NEG 2886 | NEG 2887 | NEG 2888 | NEG 2889 | NEG 2890 | NEG 2891 | NEG 2892 | NEG 2893 | NEG 2894 | NEG 2895 | NEG 2896 | NEG 2897 | NEG 2898 | NEG 2899 | NEG 2900 | NEG 2901 | NEG 2902 | NEG 2903 | NEG 2904 | NEG 2905 | NEG 2906 | NEG 2907 | NEG 2908 | NEG 2909 | NEG 2910 | NEG 2911 | NEG 2912 | NEG 2913 | NEG 2914 | NEG 2915 | NEG 2916 | NEG 2917 | NEG 2918 | NEG 2919 | NEG 2920 | NEG 2921 | NEG 2922 | NEG 2923 | NEG 2924 | NEG 2925 | NEG 2926 | NEG 2927 | NEG 2928 | NEG 2929 | NEG 2930 | NEG 2931 | NEG 2932 | NEG 2933 | NEG 2934 | NEG 2935 | NEG 2936 | NEG 2937 | NEG 2938 | NEG 2939 | NEG 2940 | NEG 2941 | NEG 2942 | NEG 2943 | NEG 2944 | NEG 2945 | NEG 2946 | NEG 2947 | NEG 2948 | NEG 2949 | NEG 2950 | NEG 2951 | NEG 2952 | NEG 2953 | NEG 2954 | NEG 2955 | NEG 2956 | NEG 2957 | NEG 2958 | NEG 2959 | NEG 2960 | NEG 2961 | NEG 2962 | NEG 2963 | NEG 2964 | NEG 2965 | NEG 2966 | NEG 2967 | NEG 2968 | NEG 2969 | NEG 2970 | NEG 2971 | NEG 2972 | NEG 2973 | NEG 2974 | NEG 2975 | NEG 2976 | NEG 2977 | NEG 2978 | NEG 2979 | NEG 2980 | NEG 2981 | NEG 2982 | NEG 2983 | NEG 2984 | NEG 2985 | NEG 2986 | NEG 2987 
| NEG 2988 | NEG 2989 | NEG 2990 | NEG 2991 | NEG 2992 | NEG 2993 | NEG 2994 | NEG 2995 | NEG 2996 | NEG 2997 | NEG 2998 | NEG 2999 | NEG 3000 | NEG 3001 | NEG 3002 | NEG 3003 | NEG 3004 | NEG 3005 | NEG 3006 | NEG 3007 | NEG 3008 | NEG 3009 | NEG 3010 | NEG 3011 | NEG 3012 | NEG 3013 | NEG 3014 | NEG 3015 | NEG 3016 | NEG 3017 | NEG 3018 | NEG 3019 | NEG 3020 | NEG 3021 | NEG 3022 | NEG 3023 | NEG 3024 | NEG 3025 | NEG 3026 | NEG 3027 | NEG 3028 | NEG 3029 | NEG 3030 | NEG 3031 | NEG 3032 | NEG 3033 | NEG 3034 | NEG 3035 | NEG 3036 | NEG 3037 | NEG 3038 | NEG 3039 | NEG 3040 | NEG 3041 | NEG 3042 | NEG 3043 | NEG 3044 | NEG 3045 | NEG 3046 | NEG 3047 | NEG 3048 | NEG 3049 | NEG 3050 | NEG 3051 | NEG 3052 | NEG 3053 | NEG 3054 | NEG 3055 | NEG 3056 | NEG 3057 | NEG 3058 | NEG 3059 | NEG 3060 | NEG 3061 | NEG 3062 | NEG 3063 | NEG 3064 | NEG 3065 | NEG 3066 | NEG 3067 | NEG 3068 | NEG 3069 | NEG 3070 | NEG 3071 | NEG 3072 | NEG 3073 | NEG 3074 | NEG 3075 | NEG 3076 | NEG 3077 | NEG 3078 | NEG 3079 | NEG 3080 | NEG 3081 | NEG 3082 | NEG 3083 | NEG 3084 | NEG 3085 | NEG 3086 | NEG 3087 | NEG 3088 | NEG 3089 | NEG 3090 | NEG 3091 | NEG 3092 | NEG 3093 | NEG 3094 | NEG 3095 | NEG 3096 | NEG 3097 | NEG 3098 | NEG 3099 | NEG 3100 | NEG 3101 | NEG 3102 | NEG 3103 | NEG 3104 | NEG 3105 | NEG 3106 | NEG 3107 | NEG 3108 | NEG 3109 | NEG 3110 | NEG 3111 | NEG 3112 | NEG 3113 | NEG 3114 | NEG 3115 | NEG 3116 | NEG 3117 | NEG 3118 | NEG 3119 | NEG 3120 | NEG 3121 | NEG 3122 | NEG 3123 | NEG 3124 | NEG 3125 | NEG 3126 | NEG 3127 | NEG 3128 | NEG 3129 | NEG 3130 | NEG 3131 | NEG 3132 | NEG 3133 | NEG 3134 | NEG 3135 | NEG 3136 | NEG 3137 | NEG 3138 | NEG 3139 | NEG 3140 | NEG 3141 | NEG 3142 | NEG 3143 | NEG 3144 | NEG 3145 | NEG 3146 | NEG 3147 | NEG 3148 | NEG 3149 | NEG 3150 | NEG 3151 | NEG 3152 | NEG 3153 | NEG 3154 | NEG 3155 | NEG 3156 | NEG 3157 | NEG 3158 | NEG 3159 | NEG 3160 | NEG 3161 | NEG 3162 | NEG 3163 | NEG 3164 | NEG 3165 | NEG 3166 | NEG 3167 | NEG 3168 | NEG 
3169 | NEG 3170 | NEG 3171 | NEG 3172 | NEG 3173 | NEG 3174 | NEG 3175 | NEG 3176 | NEG 3177 | NEG 3178 | NEG 3179 | NEG 3180 | NEG 3181 | NEG 3182 | NEG 3183 | NEG 3184 | NEG 3185 | NEG 3186 | NEG 3187 | NEG 3188 | NEG 3189 | NEG 3190 | NEG 3191 | NEG 3192 | NEG 3193 | NEG 3194 | NEG 3195 | NEG 3196 | NEG 3197 | NEG 3198 | NEG 3199 | NEG 3200 | NEG 3201 |
--------------------------------------------------------------------------------
/data/hotel_comment/vocab.labels.txt:
--------------------------------------------------------------------------------
NEG
POS

--------------------------------------------------------------------------------
/model/cnn/debug.py:
--------------------------------------------------------------------------------
from main import input_fn, model_fn, DATA_DIR
from pathlib import Path
import tensorflow as tf

tf.enable_eager_execution()

if __name__ == '__main__':
    params = {
        'dim': 300,
        'nwords': 10,
        'filter_sizes': [2, 3, 4, 5],
        'num_filters': 128,
        'dropout': 0.5,
        'num_oov_buckets': 1,
        'epochs': 25,
        'batch_size': 20,
        'buffer': 3500,
        'words': str(Path(DATA_DIR, 'vocab.words.txt')),
        'tags': str(Path(DATA_DIR, 'vocab.labels.txt')),
        'w2v': str(Path(DATA_DIR, 'w2v.npz'))
    }

    ds = input_fn(Path(DATA_DIR, 'train.words.txt'), Path(DATA_DIR, 'train.labels.txt'), params=params)
    iterator = ds.make_one_shot_iterator()
    batch_sample = iterator.get_next()
    model_fn(batch_sample[0], batch_sample[1], tf.estimator.ModeKeys.TRAIN, params)

--------------------------------------------------------------------------------
/model/cnn/export.py:
--------------------------------------------------------------------------------
from pathlib import Path
import json

import tensorflow as tf

from main import model_fn

PARAMS = './results/params.json'
MODEL_DIR = './results/model'


def serving_input_receiver_fn():
    words = tf.placeholder(dtype=tf.string, shape=[None, None], name='words')
    receiver_tensors = {'words': words}
    features = {'words': words}
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)


if __name__ == '__main__':
    with Path(PARAMS).open() as f:
        params = json.load(f)

    estimator = tf.estimator.Estimator(model_fn, MODEL_DIR, params=params)
    estimator.export_saved_model('saved_model', serving_input_receiver_fn)

--------------------------------------------------------------------------------
/model/cnn/main.py:
--------------------------------------------------------------------------------
import sys
import json
import logging
import functools
import numpy as np
import tensorflow as tf
from pathlib import Path

DATA_DIR = '../../data/hotel_comment'

# Logging
Path('results').mkdir(exist_ok=True)
tf.logging.set_verbosity(logging.INFO)
handlers = [
    logging.FileHandler('results/main.log'),
    logging.StreamHandler(sys.stdout)
]
logging.getLogger('tensorflow').handlers = handlers


# Input function
def parse_fn(line_words, line_tag):
    # Encode in Bytes for TF
    words = [w.encode() for w in line_words.strip().split()]
    tag = line_tag.strip().encode()
    return words, tag


def generator_fn(words, tags):
    # Read the corpus explicitly as UTF-8 (as model/lstm/main.py does) so that
    # Chinese text does not depend on the platform's default encoding.
    with Path(words).open('r', encoding='utf-8') as f_words, Path(tags).open('r', encoding='utf-8') as f_tags:
        for line_words, line_tag in zip(f_words, f_tags):
            yield parse_fn(line_words, line_tag)


def input_fn(words_path, tags_path, params=None, shuffle_and_repeat=False):
    params = params if params is not None else {}
    shapes = ([None], ())  # shape of every sample
    types = (tf.string, tf.string)
    defaults = ('', '')

    dataset = tf.data.Dataset.from_generator(
        functools.partial(generator_fn, words_path, tags_path),
        output_shapes=shapes, output_types=types).map(lambda w, t: (w[:params.get('nwords', 300)], t))

    if shuffle_and_repeat:
        dataset = dataset.shuffle(params['buffer']).repeat(params['epochs'])

    dataset = (dataset
               .padded_batch(params.get('batch_size', 20), ([params.get('nwords', 300)], ()), defaults)
               .prefetch(1))
    return dataset


def model_fn(features, labels, mode, params):
    if isinstance(features, dict):
        features = features['words']

    # Read vocabs and inputs
    dropout = params.get('dropout', 0.5)
    training = (mode == tf.estimator.ModeKeys.TRAIN)
    vocab_words = tf.contrib.lookup.index_table_from_file(
        params['words'], num_oov_buckets=params['num_oov_buckets'])
    with Path(params['tags']).open() as f:
        indices = [idx for idx, tag in enumerate(f)]
        num_tags = len(indices)

    # Word Embeddings
    word_ids = vocab_words.lookup(features)
    w2v = np.load(params['w2v'])['embeddings']
    w2v_var = np.vstack([w2v, [[0.] * params['dim']]])
    w2v_var = tf.Variable(w2v_var, dtype=tf.float32, trainable=False)
    embeddings = tf.nn.embedding_lookup(w2v_var, word_ids)
    embeddings = tf.layers.dropout(embeddings, rate=dropout, training=training)
    embeddings_expanded = tf.expand_dims(embeddings, -1)

    # CNN
    pooled_outputs = []
    for i, filter_size in enumerate(params['filter_sizes']):
        conv2 = tf.layers.conv2d(embeddings_expanded, params['num_filters'], kernel_size=[filter_size, params['dim']],
                                 activation=tf.nn.relu, name='conv-{}'.format(i))
        pooled = tf.layers.max_pooling2d(inputs=conv2, pool_size=[params['nwords'] - filter_size + 1, 1],
                                         strides=[1, 1], name='pool-{}'.format(i))
        pooled_outputs.append(pooled)
    num_total_filters = params['num_filters'] * len(params['filter_sizes'])
    h_pool = tf.concat(pooled_outputs, 3)
    output = tf.reshape(h_pool, [-1, num_total_filters])
    output = tf.layers.dropout(output, rate=dropout, training=training)

    # FC
    logits = tf.layers.dense(output, num_tags)
    pred_ids = tf.argmax(input=logits, axis=1)

    if mode == tf.estimator.ModeKeys.PREDICT:
        reversed_tags = tf.contrib.lookup.index_to_string_table_from_file(params['tags'])
        pred_labels = reversed_tags.lookup(tf.argmax(input=logits, axis=1))
        predictions = {
            'classes_id': pred_ids,
            'labels': pred_labels
        }
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)
    else:
        # LOSS
        tags_table = tf.contrib.lookup.index_table_from_file(params['tags'])
        tags = tags_table.lookup(labels)
        loss = tf.losses.sparse_softmax_cross_entropy(labels=tags, logits=logits)

        # Metrics
        metrics = {
            'acc': tf.metrics.accuracy(tags, pred_ids),
            'precision': tf.metrics.precision(tags, pred_ids),
            'recall': tf.metrics.recall(tags, pred_ids)
        }

        for metric_name, op in metrics.items():
            tf.summary.scalar(metric_name, op[1])
        if mode == tf.estimator.ModeKeys.TRAIN:
            train_op = tf.train.AdamOptimizer().minimize(
                loss, global_step=tf.train.get_or_create_global_step())
            return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
        elif mode == tf.estimator.ModeKeys.EVAL:
            return tf.estimator.EstimatorSpec(
                mode, loss=loss, eval_metric_ops=metrics)


if __name__ == '__main__':
    params = {
        'dim': 300,
        'nwords': 300,
        'filter_sizes': [2, 3, 4],
        'num_filters': 64,
        'dropout': 0.6,
        'num_oov_buckets': 1,
        'epochs': 50,
        'batch_size': 20,
        'buffer': 3500,
        'words': str(Path(DATA_DIR, 'vocab.words.txt')),
        'tags': str(Path(DATA_DIR, 'vocab.labels.txt')),
        'w2v': str(Path(DATA_DIR, 'w2v.npz'))
    }

    with Path('results/params.json').open('w') as f:
        json.dump(params, f, indent=4, sort_keys=True)


    def fwords(name):
        return str(Path(DATA_DIR, '{}.words.txt'.format(name)))


    def ftags(name):
        return str(Path(DATA_DIR, '{}.labels.txt'.format(name)))


    train_inpf = functools.partial(input_fn, fwords('train'), ftags('train'),
                                   params, shuffle_and_repeat=True)
    eval_inpf = functools.partial(input_fn, fwords('eval'), ftags('eval'))
    cfg = tf.estimator.RunConfig(save_checkpoints_secs=10)
    estimator = tf.estimator.Estimator(model_fn, 'results/model', cfg, params)
    Path(estimator.eval_dir()).mkdir(parents=True, exist_ok=True)
    train_spec = tf.estimator.TrainSpec(input_fn=train_inpf)
    eval_spec = tf.estimator.EvalSpec(input_fn=eval_inpf, throttle_secs=10)
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)


    # Write predictions to file
    def write_predictions(name):
        Path('results/score').mkdir(parents=True, exist_ok=True)
        with Path('results/score/{}.preds.txt'.format(name)).open('wb') as f:
            test_inpf = functools.partial(input_fn, fwords(name), ftags(name))
            golds_gen = generator_fn(fwords(name), ftags(name))
            preds_gen = estimator.predict(test_inpf)
            for golds, preds in zip(golds_gen, preds_gen):
                (words, tag) = golds
                f.write(b' '.join([tag, preds['labels'], b''.join(words)]) + b'\n')


    for name in ['train', 'eval']:
        write_predictions(name)

--------------------------------------------------------------------------------
/model/cnn/saved_model/1557069338/assets/vocab.labels.txt:
--------------------------------------------------------------------------------
NEG
POS

--------------------------------------------------------------------------------
/model/cnn/saved_model/1557069338/saved_model.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/model/cnn/saved_model/1557069338/saved_model.pb
--------------------------------------------------------------------------------
/model/cnn/saved_model/1557069338/variables/variables.data-00000-of-00001:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/model/cnn/saved_model/1557069338/variables/variables.data-00000-of-00001
--------------------------------------------------------------------------------
/model/cnn/saved_model/1557069338/variables/variables.index:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/model/cnn/saved_model/1557069338/variables/variables.index
--------------------------------------------------------------------------------
/model/cnn/serve.py:
--------------------------------------------------------------------------------
"""Reload and serve a saved model"""
import json
import jieba
from pathlib import Path
from tensorflow.contrib import predictor
from functools import partial

PARAMS = './results/params.json'

# Sample input: "The hotel facilities are not new, and the service attitude is very bad"
LINE = '''酒店设施不是新的,服务态度很不好'''


def predict(pred_fn, line, length=300):
    # Segment the sentence with jieba, then truncate or pad to exactly `length` tokens
    sentence = ' '.join(jieba.cut(line.strip(), cut_all=False, HMM=True))
    words = [w.encode() for w in sentence.strip().split()]
    if len(words) >= length:
        words = words[:length]
    else:
        # Pad with empty byte strings so every token has the same type
        words.extend([b''] * (length - len(words)))
    predictions = pred_fn({'words': [words]})
    return predictions


if __name__ == '__main__':
    with Path(PARAMS).open() as f:
        params = json.load(f)
    export_dir = 'saved_model'
    subdirs = [x for x in Path(export_dir).iterdir()
               if x.is_dir() and 'temp' not in str(x)]
    latest = str(sorted(subdirs)[-1])
    predict_fn = partial(predict, predictor.from_saved_model(latest))
    print(LINE)
    print(predict_fn(LINE, params['nwords']))
    line = input('\n\nEnter a Chinese sentence (q to quit): ')
    while line.strip().lower() != 'q':
        print('\n\n', predict_fn(line, params['nwords']))
        line = input('\n\nEnter a Chinese sentence (q to quit): ')

--------------------------------------------------------------------------------
/model/lstm/debug.py:
--------------------------------------------------------------------------------
from main import input_fn, model_fn, DATA_DIR
from pathlib import Path
import tensorflow as tf

tf.enable_eager_execution()

if __name__ == '__main__':
    params = {
        'dim': 300,
        'lstm_size': 100,
        'dropout': 0.5,
        'num_oov_buckets': 1,
        'epochs': 50,
        'batch_size': 20,
        'buffer': 3500,
        'words': str(Path(DATA_DIR, 'vocab.words.txt')),
        'tags': str(Path(DATA_DIR, 'vocab.labels.txt')),
        'w2v': str(Path(DATA_DIR, 'w2v.npz'))
    }

    ds = input_fn(Path(DATA_DIR, 'train.words.txt'), Path(DATA_DIR, 'train.labels.txt'), params=params)
22 | iterator = ds.make_one_shot_iterator() 23 | batch_sample = iterator.get_next() 24 | model_fn(batch_sample[0], batch_sample[1], tf.estimator.ModeKeys.TRAIN, params) 25 | -------------------------------------------------------------------------------- /model/lstm/export.py: -------------------------------------------------------------------------------- 1 | 2 | from pathlib import Path 3 | import json 4 | 5 | import tensorflow as tf 6 | 7 | from main import model_fn 8 | 9 | PARAMS = './results/params.json' 10 | MODEL_DIR = './results/model' 11 | 12 | 13 | def serving_input_receiver_fn(): 14 | words = tf.placeholder(dtype=tf.string, shape=[None, None], name='words') 15 | nwords = tf.placeholder(dtype=tf.int32, shape=[None], name='nwords') 16 | receiver_tensors = {'words': words, 'nwords': nwords} 17 | features = {'words': words, 'nwords': nwords} 18 | return tf.estimator.export.ServingInputReceiver(features, receiver_tensors) 19 | 20 | 21 | if __name__ == '__main__': 22 | with Path(PARAMS).open(encoding='utf-8') as f: 23 | params = json.load(f) 24 | 25 | estimator = tf.estimator.Estimator(model_fn, MODEL_DIR, params=params) 26 | estimator.export_saved_model('saved_model', serving_input_receiver_fn) 27 | -------------------------------------------------------------------------------- /model/lstm/main.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import json 3 | import logging 4 | import functools 5 | import numpy as np 6 | import tensorflow as tf 7 | from pathlib import Path 8 | 9 | DATA_DIR = '../../data/hotel_comment' 10 | 11 | # Logging 12 | Path('results').mkdir(exist_ok=True) 13 | tf.logging.set_verbosity(logging.INFO) 14 | handlers = [ 15 | logging.FileHandler('results/main.log'), 16 | logging.StreamHandler(sys.stdout) 17 | ] 18 | logging.getLogger('tensorflow').handlers = handlers 19 | 20 | 21 | # Input function 22 | def parse_fn(line_words, line_tag): 23 | # Encode in Bytes for TF 24 | words = 
[w.encode() for w in line_words.strip().split()] 25 | tag = line_tag.strip().encode() 26 | return (words, len(words)), tag 27 | 28 | 29 | def generator_fn(words, tags): 30 | with Path(words).open('r', encoding='utf-8') as f_words, Path(tags).open('r', encoding='utf-8') as f_tags: 31 | for line_words, line_tag in zip(f_words, f_tags): 32 | yield parse_fn(line_words, line_tag) 33 | 34 | 35 | def input_fn(words_path, tags_path, params=None, shuffle_and_repeat=False): 36 | params = params if params is not None else {} 37 | shapes = (([None], ()), ()) # shape of every sample 38 | types = ((tf.string, tf.int32), tf.string) 39 | defaults = (('', 0), '') 40 | 41 | dataset = tf.data.Dataset.from_generator( 42 | functools.partial(generator_fn, words_path, tags_path), 43 | output_shapes=shapes, output_types=types) 44 | 45 | if shuffle_and_repeat: 46 | dataset = dataset.shuffle(params['buffer']).repeat(params['epochs']) 47 | 48 | dataset = (dataset 49 | .padded_batch(params.get('batch_size', 20), shapes, defaults) 50 | .prefetch(1)) 51 | return dataset 52 | 53 | 54 | def model_fn(features, labels, mode, params): 55 | if isinstance(features, dict): 56 | features = features['words'], features['nwords'] 57 | 58 | # Read vocabs and inputs 59 | dropout = params['dropout'] 60 | words, nwords = features 61 | training = (mode == tf.estimator.ModeKeys.TRAIN) 62 | vocab_words = tf.contrib.lookup.index_table_from_file( 63 | params['words'], num_oov_buckets=params['num_oov_buckets']) 64 | with Path(params['tags']).open(encoding='utf-8') as f: 65 | indices = [idx for idx, tag in enumerate(f)] 66 | num_tags = len(indices) 67 | 68 | # Word Embeddings 69 | word_ids = vocab_words.lookup(words) 70 | w2v = np.load(params['w2v'])['embeddings'] 71 | w2v_var = np.vstack([w2v, [[0.] 
* params['dim']]]) 72 | w2v_var = tf.Variable(w2v_var, dtype=tf.float32, trainable=False) 73 | embeddings = tf.nn.embedding_lookup(w2v_var, word_ids) 74 | embeddings = tf.layers.dropout(embeddings, rate=dropout, training=training) 75 | 76 | # LSTM 77 | t = tf.transpose(embeddings, perm=[1, 0, 2]) 78 | lstm_cell_fw = tf.contrib.rnn.LSTMBlockFusedCell(params['lstm_size']) 79 | lstm_cell_bw = tf.contrib.rnn.LSTMBlockFusedCell(params['lstm_size']) 80 | lstm_cell_bw = tf.contrib.rnn.TimeReversedFusedRNN(lstm_cell_bw) 81 | _, (cf, hf) = lstm_cell_fw(t, dtype=tf.float32, sequence_length=nwords) 82 | _, (cb, hb) = lstm_cell_bw(t, dtype=tf.float32, sequence_length=nwords) 83 | output = tf.concat([hf, hb], axis=-1) 84 | output = tf.layers.dropout(output, rate=dropout, training=training) 85 | 86 | # FC 87 | logits = tf.layers.dense(output, num_tags) 88 | pred_ids = tf.argmax(input=logits, axis=1) 89 | 90 | if mode == tf.estimator.ModeKeys.PREDICT: 91 | reversed_tags = tf.contrib.lookup.index_to_string_table_from_file(params['tags']) 92 | pred_labels = reversed_tags.lookup(tf.argmax(input=logits, axis=1)) 93 | predictions = { 94 | 'classes_id': pred_ids, 95 | 'labels': pred_labels 96 | } 97 | return tf.estimator.EstimatorSpec(mode, predictions=predictions) 98 | 99 | else: 100 | # LOSS 101 | tags_table = tf.contrib.lookup.index_table_from_file(params['tags']) 102 | tags = tags_table.lookup(labels) 103 | loss = tf.losses.sparse_softmax_cross_entropy(labels=tags, logits=logits) 104 | 105 | # Metrics 106 | metrics = { 107 | 'acc': tf.metrics.accuracy(tags, pred_ids), 108 | 'precision': tf.metrics.precision(tags, pred_ids), 109 | 'recall': tf.metrics.recall(tags, pred_ids) 110 | } 111 | 112 | for metric_name, op in metrics.items(): 113 | tf.summary.scalar(metric_name, op[1]) 114 | 115 | if mode == tf.estimator.ModeKeys.TRAIN: 116 | train_op = tf.train.AdamOptimizer().minimize( 117 | loss, global_step=tf.train.get_or_create_global_step()) 118 | return 
tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op) 119 | elif mode == tf.estimator.ModeKeys.EVAL: 120 | return tf.estimator.EstimatorSpec( 121 | mode, loss=loss, eval_metric_ops=metrics) 122 | 123 | 124 | if __name__ == '__main__': 125 | params = { 126 | 'dim': 300, 127 | 'lstm_size': 32, 128 | 'dropout': 0.5, 129 | 'num_oov_buckets': 1, 130 | 'epochs': 25, 131 | 'batch_size': 20, 132 | 'buffer': 3500, 133 | 'words': str(Path(DATA_DIR, 'vocab.words.txt')), 134 | 'tags': str(Path(DATA_DIR, 'vocab.labels.txt')), 135 | 'w2v': str(Path(DATA_DIR, 'w2v.npz')) 136 | } 137 | 138 | with Path('results/params.json').open('w', encoding='utf-8') as f: 139 | json.dump(params, f, indent=4, sort_keys=True) 140 | 141 | 142 | def fwords(name): 143 | return str(Path(DATA_DIR, '{}.words.txt'.format(name))) 144 | 145 | 146 | def ftags(name): 147 | return str(Path(DATA_DIR, '{}.labels.txt'.format(name))) 148 | 149 | 150 | train_inpf = functools.partial(input_fn, fwords('train'), ftags('train'), 151 | params, shuffle_and_repeat=True) 152 | eval_inpf = functools.partial(input_fn, fwords('eval'), ftags('eval')) 153 | cfg = tf.estimator.RunConfig(save_checkpoints_secs=60) 154 | estimator = tf.estimator.Estimator(model_fn, 'results/model', cfg, params) 155 | Path(estimator.eval_dir()).mkdir(parents=True, exist_ok=True) 156 | train_spec = tf.estimator.TrainSpec(input_fn=train_inpf) 157 | eval_spec = tf.estimator.EvalSpec(input_fn=eval_inpf, throttle_secs=60) 158 | tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec) 159 | 160 | 161 | # Write predictions to file 162 | def write_predictions(name): 163 | Path('results/score').mkdir(parents=True, exist_ok=True) 164 | with Path('results/score/{}.preds.txt'.format(name)).open('wb', encoding='utf-8') as f: 165 | test_inpf = functools.partial(input_fn, fwords(name), ftags(name)) 166 | golds_gen = generator_fn(fwords(name), ftags(name)) 167 | preds_gen = estimator.predict(test_inpf) 168 | for golds, preds in zip(golds_gen, 
preds_gen): 169 | ((words, _), tag) = golds 170 | f.write(b' '.join([tag, preds['labels'], ''.join(words)]) + b'\n') 171 | 172 | 173 | for name in ['train', 'eval']: 174 | write_predictions(name) 175 | -------------------------------------------------------------------------------- /model/lstm/saved_model/1557073579/assets/vocab.labels.txt: -------------------------------------------------------------------------------- 1 | NEG 2 | POS 3 | -------------------------------------------------------------------------------- /model/lstm/saved_model/1557073579/saved_model.pb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/model/lstm/saved_model/1557073579/saved_model.pb -------------------------------------------------------------------------------- /model/lstm/saved_model/1557073579/variables/variables.data-00000-of-00001: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/model/lstm/saved_model/1557073579/variables/variables.data-00000-of-00001 -------------------------------------------------------------------------------- /model/lstm/saved_model/1557073579/variables/variables.index: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/model/lstm/saved_model/1557073579/variables/variables.index -------------------------------------------------------------------------------- /model/lstm/serve.py: -------------------------------------------------------------------------------- 1 | """Reload and serve a saved model""" 2 | import json 3 | import jieba 4 | from pathlib import Path 5 | from tensorflow.contrib import predictor 6 | from functools import partial 7 | 8 | LINE = 
'''酒店设施不是新的,服务态度很不好''' 9 | 10 | 11 | def predict(pred_fn, line): 12 | sentence = ' '.join(jieba.cut(line.strip(), cut_all=False, HMM=True)) 13 | words = [w.encode() for w in sentence.strip().split()] 14 | nwords = len(words) 15 | predictions = pred_fn({'words': [words], 'nwords': [nwords]}) 16 | return predictions 17 | 18 | 19 | if __name__ == '__main__': 20 | export_dir = 'saved_model' 21 | subdirs = [x for x in Path(export_dir).iterdir() 22 | if x.is_dir() and 'temp' not in str(x)] 23 | latest = str(sorted(subdirs)[-1]) 24 | predict_fn = partial(predict, predictor.from_saved_model(latest)) 25 | print(LINE) 26 | print(predict_fn(LINE)) 27 | line = input('\n\n输入一句中文: ') 28 | while line.strip().lower() != 'q': 29 | print('\n\n', predict_fn(line)) 30 | line = input('\n\n输入一句中文: ') 31 | -------------------------------------------------------------------------------- /model/score_report.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | from sklearn import metrics 3 | from pathlib import Path 4 | 5 | parser = argparse.ArgumentParser() 6 | parser.add_argument('file', help='specify the path of the results file') 7 | args = parser.parse_args() 8 | 9 | if __name__ == '__main__': 10 | label_true = [] 11 | label_pred = [] 12 | target_names = [] 13 | with Path(args.file).open() as f: 14 | for line in f: 15 | tag_name = line.strip().split()[0] 16 | if tag_name not in target_names: 17 | target_names.append(tag_name) 18 | label_true.append(tag_name) 19 | label_pred.append(line.strip().split()[1]) 20 | print(metrics.classification_report(y_pred=label_pred, y_true=label_true, target_names=['POS', 'NEG'])) 21 | -------------------------------------------------------------------------------- /pic/1_GRQ91HNASB7MAJPTTlVvfw.jpeg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/pic/1_GRQ91HNASB7MAJPTTlVvfw.jpeg -------------------------------------------------------------------------------- /pic/clip.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/pic/clip.png -------------------------------------------------------------------------------- /pic/截图_选择区域_20211202181126.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/linguishi/chinese_sentiment/06bdc816c678a4998bb04576bd76d3cae118c4c2/pic/截图_选择区域_20211202181126.png --------------------------------------------------------------------------------
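The figures printed by `model/score_report.py` can be cross-checked without scikit-learn. The following is a minimal sketch and is not part of the repository. It assumes each line of a `results/score/*.preds.txt` file has the form `GOLD PRED text`, which is the layout `write_predictions` in `model/lstm/main.py` appears to produce; the function name `per_label_scores` and the sample lines are hypothetical.

```python
from collections import Counter


def per_label_scores(lines):
    """Return {label: (precision, recall, f1)} from 'GOLD PRED text' lines."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for line in lines:
        fields = line.strip().split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        gold, pred = fields[0], fields[1]
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted this label, but it was wrong
            fn[gold] += 1  # missed the true label
    scores = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical sample lines in the assumed 'GOLD PRED text' layout
sample = [
    'POS POS 房间很干净',
    'NEG NEG 服务态度很不好',
    'NEG POS 设施一般',
    'POS POS 位置方便',
]
scores = per_label_scores(sample)
for label, (p, r, f1) in scores.items():
    print('{}: precision={:.2f} recall={:.2f} f1={:.2f}'.format(label, p, r, f1))
```

Labels are reported in sorted order, which matches how scikit-learn orders string labels in `classification_report`, so the rows can be compared directly.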