├── architecture.jpg ├── requirements.txt ├── work.sh ├── train.sh ├── fast_run.py ├── data │   ├── test20 │   │   └── test.txt │   ├── rest15 │   │   └── dev.txt │   └── rest16 │       └── dev.txt ├── README.md ├── bert.py ├── bert_utils.py ├── work.py ├── LICENSE ├── seq_utils.py ├── glue_utils.py ├── absa_layer.py └── main.py /architecture.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tamanna18/BERT-E2E-ABSA/master/architecture.jpg -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | torch==1.2.0 2 | numpy==1.22.0 3 | transformers==4.1.1 4 | tqdm==4.32.1 5 | tensorboardX==1.8 6 | -------------------------------------------------------------------------------- /work.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | TASK_NAME="test20" 3 | ABSA_HOME="./bert-linear-laptop14-finetune" 4 | CUDA_VISIBLE_DEVICES=0 python work.py --absa_home ${ABSA_HOME} \ 5 | --ckpt ${ABSA_HOME}/checkpoint-1500 \ 6 | --model_type bert \ 7 | --data_dir ./data/${TASK_NAME} \ 8 | --task_name ${TASK_NAME} \ 9 | --model_name_or_path bert-base-uncased \ 10 | --cache_dir ./cache \ 11 | --max_seq_length 128 \ 12 | --tagging_schema BIEOS -------------------------------------------------------------------------------- /train.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | TASK_NAME=rest_total 3 | ABSA_TYPE=tfm 4 | CUDA_VISIBLE_DEVICES=0,2,3 python main.py --model_type bert \ 5 | --absa_type ${ABSA_TYPE} \ 6 | --tfm_mode finetune \ 7 | --fix_tfm 0 \ 8 | --model_name_or_path bert-base-uncased \ 9 | --data_dir ./data/${TASK_NAME} \ 10 | --task_name ${TASK_NAME} \ 11 | --per_gpu_train_batch_size 16 \ 12 | --per_gpu_eval_batch_size 8 \ 13 | --learning_rate 2e-5 \ 14 | --do_train \ 15 | --do_eval \ 16 | --do_lower_case \ 17 | --tagging_schema BIEOS \ 18 | --overfit 0 \ 19 | --overwrite_output_dir \ 20 | --eval_all_checkpoints \ 21 | --MASTER_ADDR localhost \ 22 | --MASTER_PORT 28512 \ 23 | --max_steps 1500 24 | -------------------------------------------------------------------------------- /fast_run.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | os.environ["CUDA_VISIBLE_DEVICES"] = "3, 1, 2" 4 | 5 | #seed_numbers = [42, 593, 1774, 65336, 189990] 6 | seed_numbers = [42] 7 | model_type = 'bert' 8 | absa_type = 'linear' 9 | tfm_mode = 'finetune' 10 | fix_tfm = 0 11 | task_name = 'laptop14' 12 | warmup_steps = 0 13 | overfit = 0 14 | if task_name == 'laptop14': 15 | train_batch_size = 32 16 | elif task_name == 'rest_total' or task_name == 'rest14' or task_name == 'rest15' or task_name == 'rest16': 17 | train_batch_size = 16 18 | else: 19 | raise Exception("Unsupported dataset %s!!!"
% task_name) 20 | 21 | for run_id, seed in enumerate(seed_numbers): 22 | command = "python main.py --model_type %s --absa_type %s --tfm_mode %s --fix_tfm %s " \ 23 | "--model_name_or_path bert-base-uncased --data_dir ./data/%s --task_name %s " \ 24 | "--per_gpu_train_batch_size %s --per_gpu_eval_batch_size 8 --learning_rate 2e-5 " \ 25 | "--max_steps 1500 --warmup_steps %s --do_train --do_eval --do_lower_case " \ 26 | "--seed %s --tagging_schema BIEOS --overfit %s " \ 27 | "--overwrite_output_dir --eval_all_checkpoints --MASTER_ADDR localhost --MASTER_PORT 28512" % ( 28 | model_type, absa_type, tfm_mode, fix_tfm, task_name, task_name, train_batch_size, warmup_steps, seed, overfit) 29 | output_dir = '%s-%s-%s-%s' % (model_type, absa_type, task_name, tfm_mode) 30 | if fix_tfm: 31 | output_dir = '%s-fix' % output_dir 32 | if overfit: 33 | output_dir = '%s-overfit' % output_dir 34 | if not os.path.exists(output_dir): 35 | os.mkdir(output_dir) 36 | 37 | log_file = '%s/log.txt' % output_dir 38 | if run_id == 0 and os.path.exists(log_file): 39 | os.remove(log_file) 40 | with open(log_file, 'a') as fp: 41 | fp.write("\nIn run %s/5 (seed %s):\n" % (run_id, seed)) 42 | os.system(command) 43 | if overfit: 44 | # only conduct one run 45 | break 46 | -------------------------------------------------------------------------------- /data/test20/test.txt: -------------------------------------------------------------------------------- 1 | Yum!####Yum=O !=O 2 | Serves really good sushi!####Serves=O really=O good=O sushi=T-POS !=O 3 | Not the biggest portions but adequate.####Not=O the=O biggest=O portions=T-NEU but=O adequate=O .=O 4 | Green Tea creme brulee is a must!####Green=T-POS Tea=T-POS creme=T-POS brulee=T-POS is=O a=O must=O !=O 5 | Don't leave the restaurant without it.####Do=O n't=O leave=O the=O restaurant=O without=O it=O .=O 6 | No Comparison####No=O Comparison=O 7 | – I can't say enough about this place.####I=O ca=O n't=O say=O enough=O about=O this=O place=T-POS .=O 8 | It has great sushi and even better service.####It=O has=O great=O sushi=T-POS and=O even=O better=O service=T-POS .=O 9 | The entire staff was extremely accomodating and tended to my every need.####The=O entire=O staff=T-POS was=O extremely=O accomodating=O and=O tended=O to=O my=O every=O need=O .=O 10 | I've been to this restaurant over a dozen times with no complaints to date.####I=O 've=O been=O to=O this=O restaurant=T-POS over=O a=O dozen=O times=O with=O no=O complaints=O to=O date=O .=O 11 | Snotty Attitude####Snotty=O Attitude=O 12 | – We were treated very rudely here one time for breakfast.####We=O were=O treated=O very=O rudely=O here=O one=O time=O for=O breakfast=O .=O 13 | The owner is belligerent to guests that have a complaint.####The=O owner=T-NEG is=O belligerent=O to=O guests=O that=O have=O a=O complaint=O .=O 14 | Good food!####Good=O food=T-POS !=O 15 | – We love breakfast food.####We=O love=O breakfast=O food=O .=O 16 | This is a great place to get a delicious meal.####This=O is=O a=O great=O place=O to=O get=O a=O delicious=O meal=T-POS .=O 17 | We never had to wait more than 5 minutes.####We=O never=O had=O to=O wait=O more=O than=O 5=O minutes=O .=O 18 | The staff is pretty friendly.####The=O staff=T-POS is=O pretty=O friendly=O .=O 19 | The onion rings are great!####The=O onion=T-POS rings=T-POS are=O great=O !=O 20 | They are not greasy or anything.####They=O are=O not=O greasy=O or=O anything=O .=O 21 | -------------------------------------------------------------------------------- /README.md: 
-------------------------------------------------------------------------------- 1 | # BERT-E2E-ABSA 2 | Exploiting **BERT** for **E**nd-**t**o-**E**nd **A**spect-**B**ased **S**entiment **A**nalysis 3 | ![model architecture](architecture.jpg) 4 | 5 |
6 | 7 | ## Requirements 8 | * python 3.7.3 9 | * pytorch 1.2.0 (also tested on pytorch 1.3.0) 10 | * ~~transformers 2.0.0~~ transformers 4.1.1 11 | * numpy 1.16.4 12 | * tensorboardX 1.9 13 | * tqdm 4.32.1 14 | * some code is borrowed from **allennlp** ([https://github.com/allenai/allennlp](https://github.com/allenai/allennlp), an awesome open-source NLP toolkit) and **transformers** ([https://github.com/huggingface/transformers](https://github.com/huggingface/transformers), formerly known as **pytorch-pretrained-bert** or **pytorch-transformers**) 15 | 16 | ## Architecture 17 | * Pre-trained embedding layer: BERT-Base-Uncased (12-layer, 768-hidden, 12-heads, 110M parameters) 18 | * Task-specific layer: 19 | - Linear 20 | - Recurrent Neural Networks (GRU) 21 | - Self-Attention Networks (SAN, TFM) 22 | - Conditional Random Fields (CRF) 23 | (a minimal sketch of the BERT + Linear variant is given after the Quick Start section below) 24 | ## Dataset 25 | * ~~Restaurant: restaurant reviews from SemEval 2014 (task 4), SemEval 2015 (task 12) and SemEval 2016 (task 5) (rest_total)~~ 26 | * (**IMPORTANT**) Restaurant: restaurant reviews from SemEval 2014 (rest14), restaurant reviews from SemEval 2015 (rest15), restaurant reviews from SemEval 2016 (rest16). Please refer to the newly updated files in ```./data``` 27 | * (**IMPORTANT**) **DO NOT** use the ```rest_total``` dataset that we built ourselves; more details can be found in [Updated Results](https://github.com/lixin4ever/BERT-E2E-ABSA/blob/master/README.md#updated-results-important). 28 | * Laptop: laptop reviews from SemEval 2014 (laptop14) 29 | 30 | 31 | ## Quick Start 32 | * The valid tagging strategies/schemes (i.e., the ways of representing text or entity spans) in this project are **BIEOS** (also called **BIOES** or **BMES**), **BIO** (also called **IOB2**) and **OT** (also called **IO**). If you are not familiar with these terms, I strongly recommend reading the following materials before running the program: 33 | 34 | a. [Inside–outside–beginning (tagging)](https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)). 35 | 36 | b. [Representing Text Chunks](https://www.aclweb.org/anthology/E99-1023.pdf). 37 | 38 | c. The [paper](https://www.aclweb.org/anthology/D19-5505.pdf) associated with this project. 39 | (For a concrete example: the two-token positive aspect span *onion rings* is tagged `B-POS E-POS` under BIEOS, `B-POS I-POS` under BIO, and `T-POS T-POS` under OT; a runnable conversion sketch follows at the end of this section.) 40 | * Reproduce the results on the Restaurant and Laptop datasets: 41 | ``` 42 | # train the model with 5 different seed numbers 43 | python fast_run.py 44 | ``` 45 | * Train the model on other ABSA datasets: 46 | 47 | 1. place data files in the directory `./data/[YOUR_DATASET_NAME]` (note that you need to re-organize your data files so that they can be directly consumed by this project; following the input format of `./data/laptop14/train.txt` is sufficient, see the format sketch after this section). 48 | 49 | 2. set `TASK_NAME` in `train.sh` as `[YOUR_DATASET_NAME]`. 50 | 51 | 3. train the model: `sh train.sh` 52 | 53 | * (**New feature**) Perform pure inference/direct transfer over test/unseen data using the trained ABSA model: 54 | 55 | 1. place the data file in the directory `./data/[YOUR_EVAL_DATASET_NAME]`. 56 | 57 | 2. set `TASK_NAME` in `work.sh` as `[YOUR_EVAL_DATASET_NAME]`. 58 | 59 | 3. set `ABSA_HOME` in `work.sh` as `[HOME_DIRECTORY_OF_PRETRAINED_ABSA_MODEL]`. 60 | 61 | 4. run: `sh work.sh`
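To make the task-specific layers listed under Architecture concrete, here is a minimal sketch (my illustration, not the repo's code) of the simplest variant, BERT + Linear (`--absa_type linear`); the full implementation, including the GRU/SAN/TFM/CRF heads, lives in `absa_layer.BertABSATagger` and is driven by `main.py`:

```python
# Hedged sketch of the BERT + Linear configuration; not the repo's actual code.
import torch.nn as nn
from transformers import BertModel


class BertLinearTaggerSketch(nn.Module):
    def __init__(self, num_labels, model_name='bert-base-uncased'):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        self.dropout = nn.Dropout(0.1)
        # one score per tag; with the BIEOS schema, work.py uses 14 labels
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        outputs = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
        # (batch, seq_len, hidden) -> (batch, seq_len, num_labels)
        return self.classifier(self.dropout(outputs.last_hidden_state))
```

The other variants only swap this linear head for a GRU, a self-attention block or a CRF on top of the same per-token BERT representations.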
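To see how the schemas relate in practice, the sketch below (assuming it is run from the repository root, so that `seq_utils.py` is importable) converts a toy tag sequence between the three schemas with the repo's own helpers; `predict()` in `work.py` applies exactly these conversions to normalize OT/BIO predictions to BIEOS before extracting aspect spans:

```python
# Toy example: "The onion rings are great" with the positive aspect "onion rings".
from seq_utils import ot2bieos_ts, bio2ot_ts, tag2ts

ot_tags = ['O', 'T-POS', 'T-POS', 'O', 'O']          # OT (a.k.a. IO)
bio_tags = ['O', 'B-POS', 'I-POS', 'O', 'O']         # BIO (a.k.a. IOB2)

print(ot2bieos_ts(ot_tags))                          # ['O', 'B-POS', 'E-POS', 'O', 'O']
print(ot2bieos_ts(bio2ot_ts(bio_tags)))              # the same BIEOS sequence
# tag2ts reduces a BIEOS sequence to (begin, end, sentiment) triples
print(tag2ts(ts_tag_sequence=ot2bieos_ts(ot_tags)))  # [(1, 2, 'POS')]
```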
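Regarding the input format mentioned in step 1 above: each line of a data file is the raw sentence, the separator `####`, and whitespace-separated `token=TAG` pairs (see `./data/test20/test.txt` in this dump). The helper below is hypothetical (not part of this repo) and assumes already-tokenized input with OT-style tags:

```python
# Hypothetical helper producing one line in the ./data/[YOUR_DATASET_NAME] format.
def to_absa_line(tokens, tags):
    # OT-style tags per token: O, T-POS, T-NEG or T-NEU
    assert len(tokens) == len(tags)
    tagged = ' '.join('%s=%s' % (tok, tag) for tok, tag in zip(tokens, tags))
    return '%s####%s' % (' '.join(tokens), tagged)

print(to_absa_line(['Serves', 'really', 'good', 'sushi', '!'],
                   ['O', 'O', 'O', 'T-POS', 'O']))
# Serves really good sushi !####Serves=O really=O good=O sushi=T-POS !=O
```

Note that in the shipped files the raw sentence on the left of `####` is not always whitespace-tokenized (e.g. `sushi!`); the annotated right-hand side is the part that carries the token/tag information.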
62 | 63 | ## Environment 64 | * OS: RHEL Server 6.4 (Santiago) 65 | * GPU: NVIDIA GTX 1080 Ti 66 | * CUDA: 10.0 67 | * cuDNN: v7.6.1 68 | 69 | ## Updated results (IMPORTANT) 70 | * The data files of the ```rest_total``` dataset were created by concatenating the train/test counterparts from ```rest14```, ```rest15``` and ```rest16```; our motivation was to build a larger training/testing dataset to stabilize training and faithfully reflect the capability of the ABSA model. However, we recently found that the SemEval organizers directly treat the union of ```rest15.train``` and ```rest15.test``` as the training set of rest16 (i.e., ```rest16.train```), and thus there exists overlap between ```rest_total.train``` and ```rest_total.test```, which makes this dataset invalid. When you follow our work on this E2E-ABSA task, we hope you **DO NOT** use this ```rest_total``` dataset any more but instead switch to the officially released ```rest14```, ```rest15``` and ```rest16```. 71 | * To facilitate comparison in the future, we re-ran our models under the above-mentioned settings and report the results (***micro-averaged F1***) on ```rest14```, ```rest15``` and ```rest16```: 72 | 73 | | Model | rest14 | rest15 | rest16 | 74 | | --- | --- | --- | --- | 75 | | E2E-ABSA (OURS) | 67.10 | 57.27 | 64.31 | 76 | | [(He et al., 2019)](https://arxiv.org/pdf/1906.06906.pdf) | 69.54 | 59.18 | n/a | 77 | | [(Liu et al., 2020)](https://arxiv.org/pdf/2004.06427.pdf) | 68.91 | 58.37 | n/a | 78 | | BERT-Linear (OURS) | 72.61 | 60.29 | 69.67 | 79 | | BERT-GRU (OURS) | 73.17 | 59.60 | 70.21 | 80 | | BERT-SAN (OURS) | 73.68 | 59.90 | 70.51 | 81 | | BERT-TFM (OURS) | 73.98 | 60.24 | 70.25 | 82 | | BERT-CRF (OURS) | 73.17 | 60.70 | 70.37 | 83 | | [(Chen and Qian, 2020)](https://www.aclweb.org/anthology/2020.acl-main.340.pdf)| 75.42 | 66.05 | n/a | 84 | | [(Liang et al., 2020)](https://arxiv.org/pdf/2004.01951.pdf)| 72.60 | 62.37 | n/a | 85 | 86 | ## Citation 87 | If the code is used in your research, please star our repo and cite our paper as follows: 88 | ``` 89 | @inproceedings{li-etal-2019-exploiting, 90 | title = "Exploiting {BERT} for End-to-End Aspect-based Sentiment Analysis", 91 | author = "Li, Xin and 92 | Bing, Lidong and 93 | Zhang, Wenxuan and 94 | Lam, Wai", 95 | booktitle = "Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)", 96 | year = "2019", 97 | url = "https://www.aclweb.org/anthology/D19-5505", 98 | pages = "34--41" 99 | } 100 | ``` 101 | 102 | -------------------------------------------------------------------------------- /bert.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # Copyright 2018 Google AI Language, Google Brain and Carnegie Mellon University Authors and the HuggingFace Inc. team. 3 | # Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License.
16 | 17 | from transformers import PreTrainedModel, BertModel, BertConfig, XLNetModel, XLNetConfig 18 | # model map for BERT 19 | from transformers import BERT_PRETRAINED_CONFIG_ARCHIVE_MAP 20 | # model map for XLNet 21 | from transformers import XLNET_PRETRAINED_CONFIG_ARCHIVE_MAP 22 | from transformers.models.bert.modeling_bert import BertEncoder, BertEmbeddings, BertPooler 23 | import torch.nn as nn 24 | from bert_utils import * 25 | 26 | 27 | BERT_PRETRAINED_MODEL_ARCHIVE_MAP = { 28 | 'bert-base-uncased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin', 29 | 'bert-large-uncased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin', 30 | 'bert-base-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin', 31 | 'bert-large-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin', 32 | 'bert-base-multilingual-uncased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-pytorch_model.bin', 33 | 'bert-base-multilingual-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-pytorch_model.bin', 34 | 'bert-base-chinese': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-pytorch_model.bin', 35 | 'bert-base-german-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-cased-pytorch_model.bin', 36 | 'bert-large-uncased-whole-word-masking': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin', 37 | 'bert-large-cased-whole-word-masking': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-pytorch_model.bin', 38 | 'bert-large-uncased-whole-word-masking-finetuned-squad': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin', 39 | 'bert-large-cased-whole-word-masking-finetuned-squad': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin', 40 | 'bert-base-cased-finetuned-mrpc': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-pytorch_model.bin', 41 | 'bert-base-german-dbmdz-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-pytorch_model.bin', 42 | 'bert-base-german-dbmdz-uncased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-pytorch_model.bin' 43 | } 44 | 45 | XLNET_PRETRAINED_MODEL_ARCHIVE_MAP = { 46 | 'xlnet-base-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-base-cased-pytorch_model.bin', 47 | 'xlnet-large-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-pytorch_model.bin' 48 | } 49 | 50 | 51 | class BertLayerNorm(nn.Module): 52 | def __init__(self, hidden_size, eps=1e-12): 53 | """Construct a layernorm module in the TF style (epsilon inside the square root). 
54 | """ 55 | super(BertLayerNorm, self).__init__() 56 | self.weight = nn.Parameter(torch.ones(hidden_size)) 57 | self.bias = nn.Parameter(torch.zeros(hidden_size)) 58 | self.variance_epsilon = eps 59 | 60 | def forward(self, x): 61 | u = x.mean(-1, keepdim=True) 62 | s = (x - u).pow(2).mean(-1, keepdim=True) 63 | x = (x - u) / torch.sqrt(s + self.variance_epsilon) 64 | return self.weight * x + self.bias 65 | 66 | 67 | class XLNetLayerNorm(nn.Module): 68 | def __init__(self, d_model, eps=1e-12): 69 | """Construct a layernorm module in the TF style (epsilon inside the square root). 70 | """ 71 | super(XLNetLayerNorm, self).__init__() 72 | self.weight = nn.Parameter(torch.ones(d_model)) 73 | self.bias = nn.Parameter(torch.zeros(d_model)) 74 | self.variance_epsilon = eps 75 | 76 | def forward(self, x): 77 | u = x.mean(-1, keepdim=True) 78 | s = (x - u).pow(2).mean(-1, keepdim=True) 79 | x = (x - u) / torch.sqrt(s + self.variance_epsilon) 80 | return self.weight * x + self.bias 81 | 82 | 83 | class BertPreTrainedModel(PreTrainedModel): 84 | """ An abstract class to handle weights initialization and 85 | a simple interface for dowloading and loading pretrained models. 86 | """ 87 | config_class = BertConfig 88 | pretrained_model_archive_map = BERT_PRETRAINED_MODEL_ARCHIVE_MAP 89 | load_tf_weights = load_tf_weights_in_bert 90 | base_model_prefix = "bert" 91 | 92 | def __init__(self, *inputs, **kwargs): 93 | super(BertPreTrainedModel, self).__init__(*inputs, **kwargs) 94 | 95 | def init_weights(self, module): 96 | """ Initialize the weights. 97 | """ 98 | if isinstance(module, (nn.Linear, nn.Embedding)): 99 | # Slightly different from the TF version which uses truncated_normal for initialization 100 | # cf https://github.com/pytorch/pytorch/pull/5617 101 | module.weight.data.normal_(mean=0.0, std=self.config.initializer_range) 102 | elif isinstance(module, BertLayerNorm): 103 | module.bias.data.zero_() 104 | module.weight.data.fill_(1.0) 105 | if isinstance(module, nn.Linear) and module.bias is not None: 106 | module.bias.data.zero_() 107 | 108 | 109 | class XLNetPreTrainedModel(PreTrainedModel): 110 | config_class = XLNetConfig 111 | pretrained_model_archive_map = XLNET_PRETRAINED_MODEL_ARCHIVE_MAP 112 | load_tf_weights = load_tf_weights_in_xlnet 113 | base_model_prefix = 'transformer' 114 | 115 | def __init__(self, *inputs, **kwargs): 116 | super(XLNetPreTrainedModel, self).__init__(*inputs, **kwargs) 117 | 118 | def init_weights(self, module): 119 | """ 120 | Initialize the weights. 121 | :param module: 122 | :return: 123 | """ 124 | if isinstance(module, (nn.Linear, nn.Embedding)): 125 | # Slightly different from the TF version which uses truncated_normal for initialization 126 | # cf https://github.com/pytorch/pytorch/pull/5617 127 | module.weight.data.normal_(mean=0.0, std=self.config.initializer_range) 128 | if isinstance(module, nn.Linear) and module.bias is not None: 129 | module.bias.data.zero_() 130 | elif isinstance(module, XLNetLayerNorm): 131 | module.bias.data.zero_() 132 | module.weight.data.fill_(1.0) 133 | elif isinstance(module, XLNetModel): 134 | module.mask_emb.data.normal_(mean=0.0, std=self.config.initializer_range) 135 | -------------------------------------------------------------------------------- /bert_utils.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # Copyright 2018 Google AI Language, Google Brain and Carnegie Mellon University Authors and the HuggingFace Inc. team. 
3 | # Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | 17 | import torch 18 | import logging 19 | import os 20 | logger = logging.getLogger(__name__) 21 | 22 | 23 | def build_tf_xlnet_to_pytorch_map(model, config, tf_weights=None): 24 | """ A map of modules from TF to PyTorch. 25 | I use a map to keep the PyTorch model as 26 | identical to the original PyTorch model as possible. 27 | """ 28 | tf_to_pt_map = {} 29 | 30 | if hasattr(model, 'transformer'): 31 | if hasattr(model, 'lm_loss'): 32 | # We will load also the output bias 33 | tf_to_pt_map['model/lm_loss/bias'] = model.lm_loss.bias 34 | if hasattr(model, 'sequence_summary') and 'model/sequnece_summary/summary/kernel' in tf_weights: 35 | # We will load also the sequence summary 36 | tf_to_pt_map['model/sequnece_summary/summary/kernel'] = model.sequence_summary.summary.weight 37 | tf_to_pt_map['model/sequnece_summary/summary/bias'] = model.sequence_summary.summary.bias 38 | if hasattr(model, 'logits_proj') and config.finetuning_task is not None \ 39 | and 'model/regression_{}/logit/kernel'.format(config.finetuning_task) in tf_weights: 40 | tf_to_pt_map['model/regression_{}/logit/kernel'.format(config.finetuning_task)] = model.logits_proj.weight 41 | tf_to_pt_map['model/regression_{}/logit/bias'.format(config.finetuning_task)] = model.logits_proj.bias 42 | 43 | # Now load the rest of the transformer 44 | model = model.transformer 45 | 46 | # Embeddings and output 47 | tf_to_pt_map.update({'model/transformer/word_embedding/lookup_table': model.word_embedding.weight, 48 | 'model/transformer/mask_emb/mask_emb': model.mask_emb}) 49 | 50 | # Transformer blocks 51 | for i, b in enumerate(model.layer): 52 | layer_str = "model/transformer/layer_%d/" % i 53 | tf_to_pt_map.update({ 54 | layer_str + "rel_attn/LayerNorm/gamma": b.rel_attn.layer_norm.weight, 55 | layer_str + "rel_attn/LayerNorm/beta": b.rel_attn.layer_norm.bias, 56 | layer_str + "rel_attn/o/kernel": b.rel_attn.o, 57 | layer_str + "rel_attn/q/kernel": b.rel_attn.q, 58 | layer_str + "rel_attn/k/kernel": b.rel_attn.k, 59 | layer_str + "rel_attn/r/kernel": b.rel_attn.r, 60 | layer_str + "rel_attn/v/kernel": b.rel_attn.v, 61 | layer_str + "ff/LayerNorm/gamma": b.ff.layer_norm.weight, 62 | layer_str + "ff/LayerNorm/beta": b.ff.layer_norm.bias, 63 | layer_str + "ff/layer_1/kernel": b.ff.layer_1.weight, 64 | layer_str + "ff/layer_1/bias": b.ff.layer_1.bias, 65 | layer_str + "ff/layer_2/kernel": b.ff.layer_2.weight, 66 | layer_str + "ff/layer_2/bias": b.ff.layer_2.bias, 67 | }) 68 | 69 | # Relative positioning biases 70 | if config.untie_r: 71 | r_r_list = [] 72 | r_w_list = [] 73 | r_s_list = [] 74 | seg_embed_list = [] 75 | for b in model.layer: 76 | r_r_list.append(b.rel_attn.r_r_bias) 77 | r_w_list.append(b.rel_attn.r_w_bias) 78 | r_s_list.append(b.rel_attn.r_s_bias) 79 | seg_embed_list.append(b.rel_attn.seg_embed) 80 | else: 81 | r_r_list = [model.r_r_bias] 82 | r_w_list = 
[model.r_w_bias] 83 | r_s_list = [model.r_s_bias] 84 | seg_embed_list = [model.seg_embed] 85 | 86 | tf_to_pt_map.update({ 87 | 'model/transformer/r_r_bias': r_r_list, 88 | 'model/transformer/r_w_bias': r_w_list, 89 | 'model/transformer/r_s_bias': r_s_list, 90 | 'model/transformer/seg_embed': seg_embed_list}) 91 | return tf_to_pt_map 92 | 93 | 94 | def load_tf_weights_in_bert(model, config, tf_checkpoint_path): 95 | """ Load tf checkpoints in a pytorch model. 96 | """ 97 | try: 98 | import re 99 | import numpy as np 100 | import tensorflow as tf 101 | except ImportError: 102 | logger.error("Loading a TensorFlow models in PyTorch, requires TensorFlow to be installed. Please see " 103 | "https://www.tensorflow.org/install/ for installation instructions.") 104 | raise 105 | tf_path = os.path.abspath(tf_checkpoint_path) 106 | logger.info("Converting TensorFlow checkpoint from {}".format(tf_path)) 107 | # Load weights from TF model 108 | init_vars = tf.train.list_variables(tf_path) 109 | names = [] 110 | arrays = [] 111 | for name, shape in init_vars: 112 | logger.info("Loading TF weight {} with shape {}".format(name, shape)) 113 | array = tf.train.load_variable(tf_path, name) 114 | names.append(name) 115 | arrays.append(array) 116 | 117 | for name, array in zip(names, arrays): 118 | name = name.split('/') 119 | # adam_v and adam_m are variables used in AdamWeightDecayOptimizer to calculated m and v 120 | # which are not required for using pretrained model 121 | if any(n in ["adam_v", "adam_m", "global_step"] for n in name): 122 | logger.info("Skipping {}".format("/".join(name))) 123 | continue 124 | pointer = model 125 | for m_name in name: 126 | if re.fullmatch(r'[A-Za-z]+_\d+', m_name): 127 | l = re.split(r'_(\d+)', m_name) 128 | else: 129 | l = [m_name] 130 | if l[0] == 'kernel' or l[0] == 'gamma': 131 | pointer = getattr(pointer, 'weight') 132 | elif l[0] == 'output_bias' or l[0] == 'beta': 133 | pointer = getattr(pointer, 'bias') 134 | elif l[0] == 'output_weights': 135 | pointer = getattr(pointer, 'weight') 136 | elif l[0] == 'squad': 137 | pointer = getattr(pointer, 'classifier') 138 | else: 139 | try: 140 | pointer = getattr(pointer, l[0]) 141 | except AttributeError: 142 | logger.info("Skipping {}".format("/".join(name))) 143 | continue 144 | if len(l) >= 2: 145 | num = int(l[1]) 146 | pointer = pointer[num] 147 | if m_name[-11:] == '_embeddings': 148 | pointer = getattr(pointer, 'weight') 149 | elif m_name == 'kernel': 150 | array = np.transpose(array) 151 | try: 152 | assert pointer.shape == array.shape 153 | except AssertionError as e: 154 | e.args += (pointer.shape, array.shape) 155 | raise 156 | logger.info("Initialize PyTorch weight {}".format(name)) 157 | pointer.data = torch.from_numpy(array) 158 | return model 159 | 160 | 161 | def load_tf_weights_in_xlnet(model, config, tf_path): 162 | """ Load tf checkpoints in a pytorch model. 163 | """ 164 | try: 165 | import numpy as np 166 | import tensorflow as tf 167 | except ImportError: 168 | logger.error("Loading a TensorFlow models in PyTorch, requires TensorFlow to be installed. 
" 169 | "Please see https://www.tensorflow.org/install/ for installation instructions.") 170 | 171 | # load weights from TF model 172 | init_vars = tf.train.list_variables(tf_path) 173 | tf_weights = {} 174 | for name, shape in init_vars: 175 | logger.info("Loading TF weight {} with shape {}".format(name, shape)) 176 | array = tf.train.load_variable(tf_path, name) 177 | tf_weights[name] = array 178 | 179 | # Build TF to PyTorch weights loading map 180 | tf_to_pt_map = build_tf_xlnet_to_pytorch_map(model, config, tf_weights) 181 | 182 | for name, pointer in tf_to_pt_map.items(): 183 | logger.info("Importing {}".format(name)) 184 | if name not in tf_weights: 185 | logger.info("{} not in tf pre-trained weights, skipping".format(name)) 186 | continue 187 | array = tf_weights[name] 188 | # adam_v and adam_m are variables used in AdamWeightDecayOptimizer to calculated m and v 189 | # which are not required for using pretrained model 190 | if 'kernel' in name and ('ff' in name or 'summary' in name or 'logit' in name): 191 | logger.info("Transposing") 192 | array = np.transpose(array) 193 | 194 | if isinstance(pointer, list): 195 | # Here we will split the TF weigths 196 | assert len(pointer) == array.shape[0] 197 | for i, p_i in enumerate(pointer): 198 | arr_i = array[i, ...] 199 | try: 200 | assert p_i.shape == arr_i.shape 201 | except AssertionError as e: 202 | e.args += (p_i.shape, arr_i.shape) 203 | raise 204 | logger.info("Initialize PyTorch weight {} for layer {}".format(name, i)) 205 | p_i.data = torch.from_numpy(arr_i) 206 | 207 | else: 208 | try: 209 | assert pointer.shape == array.shape 210 | except AssertionError as e: 211 | e.args += (pointer.shape, array.shape) 212 | raise 213 | logger.info("Initialize PyTorch weight {}".format(name)) 214 | pointer.data = torch.from_numpy(array) 215 | tf_weights.pop(name, None) 216 | tf_weights.pop(name + '/Adam', None) 217 | tf_weights.pop(name + '/Adam_1', None) 218 | logger.info("Weights not copied to PyTorch model: {}".format(', '.join(tf_weights.keys()))) 219 | return model -------------------------------------------------------------------------------- /work.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import torch 4 | import numpy as np 5 | 6 | from glue_utils import convert_examples_to_seq_features, compute_metrics_absa, ABSAProcessor 7 | from tqdm import tqdm 8 | from transformers import BertConfig, BertTokenizer, XLNetConfig, XLNetTokenizer, WEIGHTS_NAME 9 | from absa_layer import BertABSATagger 10 | from torch.utils.data import DataLoader, TensorDataset, SequentialSampler 11 | from seq_utils import ot2bieos_ts, bio2ot_ts, tag2ts 12 | 13 | #ALL_MODELS = sum((tuple(conf.pretrained_config_archive_map.keys()) for conf in (BertConfig, XLNetConfig)), ()) 14 | ALL_MODELS = ( 15 | 'bert-base-uncased', 16 | 'bert-large-uncased', 17 | 'bert-base-cased', 18 | 'bert-large-cased', 19 | 'bert-base-multilingual-uncased', 20 | 'bert-base-multilingual-cased', 21 | 'bert-base-chinese', 22 | 'bert-base-german-cased', 23 | 'bert-large-uncased-whole-word-masking', 24 | 'bert-large-cased-whole-word-masking', 25 | 'bert-large-uncased-whole-word-masking-finetuned-squad', 26 | 'bert-large-cased-whole-word-masking-finetuned-squad', 27 | 'bert-base-cased-finetuned-mrpc', 28 | 'bert-base-german-dbmdz-cased', 29 | 'bert-base-german-dbmdz-uncased', 30 | 'xlnet-base-cased', 31 | 'xlnet-large-cased' 32 | ) 33 | 34 | 35 | MODEL_CLASSES = { 36 | 'bert': (BertConfig, BertABSATagger, 
BertTokenizer), 37 | } 38 | 39 | 40 | def load_and_cache_examples(args, task, tokenizer): 41 | # similar to that in main.py 42 | processor = ABSAProcessor() 43 | # Load data features from cache or dataset file 44 | cached_features_file = os.path.join(args.data_dir, 'cached_{}_{}_{}_{}'.format( 45 | 'test', 46 | list(filter(None, args.model_name_or_path.split('/'))).pop(), 47 | str(args.max_seq_length), 48 | str(task))) 49 | if os.path.exists(cached_features_file): 50 | print("cached_features_file:", cached_features_file) 51 | features = torch.load(cached_features_file) 52 | examples = processor.get_test_examples(args.data_dir, args.tagging_schema) 53 | else: 54 | #logger.info("Creating features from dataset file at %s", args.data_dir) 55 | label_list = processor.get_labels(args.tagging_schema) 56 | examples = processor.get_test_examples(args.data_dir, args.tagging_schema) 57 | features = convert_examples_to_seq_features(examples=examples, label_list=label_list, tokenizer=tokenizer, 58 | cls_token_at_end=bool(args.model_type in ['xlnet']), 59 | cls_token=tokenizer.cls_token, 60 | sep_token=tokenizer.sep_token, 61 | cls_token_segment_id=2 if args.model_type in ['xlnet'] else 0, 62 | pad_on_left=bool(args.model_type in ['xlnet']), 63 | pad_token_segment_id=4 if args.model_type in ['xlnet'] else 0) 64 | torch.save(features, cached_features_file) 65 | total_words = [] 66 | for input_example in examples: 67 | text = input_example.text_a 68 | total_words.append(text.split(' ')) 69 | 70 | # Convert to Tensors and build dataset 71 | all_input_ids = torch.tensor([f.input_ids for f in features], dtype=torch.long) 72 | all_input_mask = torch.tensor([f.input_mask for f in features], dtype=torch.long) 73 | all_segment_ids = torch.tensor([f.segment_ids for f in features], dtype=torch.long) 74 | 75 | all_label_ids = torch.tensor([f.label_ids for f in features], dtype=torch.long) 76 | # used in evaluation 77 | all_evaluate_label_ids = [f.evaluate_label_ids for f in features] 78 | dataset = TensorDataset(all_input_ids, all_input_mask, all_segment_ids, all_label_ids) 79 | return dataset, all_evaluate_label_ids, total_words 80 | 81 | 82 | def init_args(): 83 | parser = argparse.ArgumentParser() 84 | parser.add_argument("--absa_home", type=str, required=True, help="Home directory of the trained ABSA model") 85 | parser.add_argument("--ckpt", type=str, required=True, help="Directory of model checkpoint for evaluation") 86 | parser.add_argument("--data_dir", type=str, required=True, 87 | help="The incoming data dir. Should contain the files of test/unseen data") 88 | parser.add_argument("--task_name", type=str, required=True, help="task name") 89 | parser.add_argument("--model_type", default=None, type=str, required=True, 90 | help="Model type selected in the list: " + ", ".join(MODEL_CLASSES.keys())) 91 | parser.add_argument("--model_name_or_path", default=None, type=str, required=True, 92 | help="Path to pre-trained model or shortcut name selected in the list: " + ", ".join(ALL_MODELS)) 93 | parser.add_argument("--cache_dir", default="", type=str, 94 | help="Where do you want to store the pre-trained models downloaded from s3") 95 | parser.add_argument("--max_seq_length", default=128, type=int, 96 | help="The maximum total input sequence length after tokenization. 
Sequences longer " 97 | "than this will be truncated, sequences shorter will be padded.") 98 | parser.add_argument('--tagging_schema', type=str, default='BIEOS', help="Tagging schema, should be kept same with " 99 | "that of ckpt") 100 | 101 | args = parser.parse_args() 102 | 103 | return args 104 | 105 | 106 | def main(): 107 | # perform evaluation on single GPU 108 | args = init_args() 109 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 110 | args.device = device 111 | if torch.cuda.is_available(): 112 | args.n_gpu = torch.cuda.device_count() 113 | 114 | args.model_type = args.model_type.lower() 115 | _, model_class, tokenizer_class = MODEL_CLASSES[args.model_type] 116 | 117 | # load the trained model (including the fine-tuned GPT/BERT/XLNET) 118 | print("Load checkpoint %s/%s..." % (args.ckpt, WEIGHTS_NAME)) 119 | model = model_class.from_pretrained(args.ckpt) 120 | # follow the property of tokenizer in the loaded model, e.g., do_lower_case=True 121 | tokenizer = tokenizer_class.from_pretrained(args.absa_home) 122 | model.to(args.device) 123 | model.eval() 124 | predict(args, model, tokenizer) 125 | 126 | 127 | def predict(args, model, tokenizer): 128 | dataset, evaluate_label_ids, total_words = load_and_cache_examples(args, args.task_name, tokenizer) 129 | sampler = SequentialSampler(dataset) 130 | # process the incoming data one by one 131 | dataloader = DataLoader(dataset, sampler=sampler, batch_size=1) 132 | print("***** Running prediction *****") 133 | 134 | total_preds, gold_labels = None, None 135 | idx = 0 136 | if args.tagging_schema == 'BIEOS': 137 | absa_label_vocab = {'O': 0, 'EQ': 1, 'B-POS': 2, 'I-POS': 3, 'E-POS': 4, 'S-POS': 5, 138 | 'B-NEG': 6, 'I-NEG': 7, 'E-NEG': 8, 'S-NEG': 9, 139 | 'B-NEU': 10, 'I-NEU': 11, 'E-NEU': 12, 'S-NEU': 13} 140 | elif args.tagging_schema == 'BIO': 141 | absa_label_vocab = {'O': 0, 'EQ': 1, 'B-POS': 2, 'I-POS': 3, 142 | 'B-NEG': 4, 'I-NEG': 5, 'B-NEU': 6, 'I-NEU': 7} 143 | elif args.tagging_schema == 'OT': 144 | absa_label_vocab = {'O': 0, 'EQ': 1, 'T-POS': 2, 'T-NEG': 3, 'T-NEU': 4} 145 | else: 146 | raise Exception("Invalid tagging schema %s..." 
% args.tagging_schema) 147 | absa_id2tag = {} 148 | for k in absa_label_vocab: 149 | v = absa_label_vocab[k] 150 | absa_id2tag[v] = k 151 | 152 | for batch in tqdm(dataloader, desc="Evaluating"): 153 | batch = tuple(t.to(args.device) for t in batch) 154 | with torch.no_grad(): 155 | inputs = {'input_ids': batch[0], 156 | 'attention_mask': batch[1], 157 | 'token_type_ids': batch[2] if args.model_type in ['bert', 'xlnet'] else None, 158 | # XLM doesn't use segment_ids 159 | 'labels': batch[3]} 160 | outputs = model(**inputs) 161 | # logits: (1, seq_len, label_size) 162 | logits = outputs[1] 163 | # preds: (1, seq_len) 164 | if model.tagger_config.absa_type != 'crf': 165 | preds = np.argmax(logits.detach().cpu().numpy(), axis=-1) 166 | else: 167 | mask = batch[1] 168 | preds = model.tagger.viterbi_tags(logits=logits, mask=mask) 169 | label_indices = evaluate_label_ids[idx] 170 | words = total_words[idx] 171 | pred_labels = preds[0][label_indices] 172 | assert len(words) == len(pred_labels) 173 | pred_tags = [absa_id2tag[label] for label in pred_labels] 174 | 175 | if args.tagging_schema == 'OT': 176 | pred_tags = ot2bieos_ts(pred_tags) 177 | elif args.tagging_schema == 'BIO': 178 | pred_tags = ot2bieos_ts(bio2ot_ts(pred_tags)) 179 | else: 180 | # current tagging schema is BIEOS, do nothing 181 | pass 182 | p_ts_sequence = tag2ts(ts_tag_sequence=pred_tags) 183 | output_ts = [] 184 | for t in p_ts_sequence: 185 | beg, end, sentiment = t 186 | aspect = ' '.join(words[beg:end+1]) 187 | output_ts.append('%s: %s' % (aspect, sentiment)) 188 | print("Input: %s, output: %s" % (' '.join(words), '\t'.join(output_ts))) 189 | if inputs['labels'] is not None: 190 | # for the unseen data, there is no ``labels'' 191 | if gold_labels is None: 192 | gold_labels = inputs['labels'].detach().cpu().numpy() 193 | else: 194 | gold_labels = np.append(gold_labels, inputs['labels'].detach().cpu().numpy(), axis=0) 195 | idx += 1 196 | 197 | 198 | if __name__ == "__main__": 199 | main() 200 | 201 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files.
29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. 
If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. 
Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 
202 | -------------------------------------------------------------------------------- /seq_utils.py: -------------------------------------------------------------------------------- 1 | # sequence utility functions 2 | import torch 3 | import math 4 | import numpy as np 5 | import logging; logger = logging.getLogger(__name__) 6 | def ot2bieos_ts(ts_tag_sequence): 7 | """ 8 | ot2bieos function for targeted-sentiment task, ts refers to targeted-sentiment / aspect-based sentiment 9 | :param ts_tag_sequence: tag sequence for targeted sentiment 10 | :return: 11 | """ 12 | n_tags = len(ts_tag_sequence) 13 | new_ts_sequence = [] 14 | prev_pos = '$$$' 15 | for i in range(n_tags): 16 | cur_ts_tag = ts_tag_sequence[i] 17 | if cur_ts_tag == 'O' or cur_ts_tag == 'EQ': 18 | # when meet the EQ label, regard it as O label 19 | new_ts_sequence.append('O') 20 | cur_pos = 'O' 21 | else: 22 | cur_pos, cur_sentiment = cur_ts_tag.split('-') 23 | # cur_pos is T 24 | if cur_pos != prev_pos: 25 | # prev_pos is O and new_cur_pos can only be B or S 26 | if i == n_tags - 1: 27 | new_ts_sequence.append('S-%s' % cur_sentiment) 28 | else: 29 | next_ts_tag = ts_tag_sequence[i + 1] 30 | if next_ts_tag == 'O': 31 | new_ts_sequence.append('S-%s' % cur_sentiment) 32 | else: 33 | new_ts_sequence.append('B-%s' % cur_sentiment) 34 | else: 35 | # prev_pos is T and new_cur_pos can only be I or E 36 | if i == n_tags - 1: 37 | new_ts_sequence.append('E-%s' % cur_sentiment) 38 | else: 39 | next_ts_tag = ts_tag_sequence[i + 1] 40 | if next_ts_tag == 'O': 41 | new_ts_sequence.append('E-%s' % cur_sentiment) 42 | else: 43 | new_ts_sequence.append('I-%s' % cur_sentiment) 44 | prev_pos = cur_pos 45 | return new_ts_sequence 46 | 47 | 48 | def ot2bieos_ts_batch(ts_tag_seqs): 49 | """ 50 | batch version of function ot2bieos_ts 51 | :param ts_tag_seqs: 52 | :return: 53 | """ 54 | new_ts_tag_seqs = [] 55 | n_seqs = len(ts_tag_seqs) 56 | for i in range(n_seqs): 57 | new_ts_seq = ot2bieos_ts(ts_tag_sequence=ts_tag_seqs[i]) 58 | new_ts_tag_seqs.append(new_ts_seq) 59 | return new_ts_tag_seqs 60 | 61 | 62 | def ot2bio_ts(ts_tag_sequence): 63 | """ 64 | ot2bio function for ts tag sequence 65 | :param ts_tag_sequence: 66 | :return: 67 | """ 68 | new_ts_sequence = [] 69 | n_tag = len(ts_tag_sequence) 70 | prev_pos = '$$$' 71 | for i in range(n_tag): 72 | cur_ts_tag = ts_tag_sequence[i] 73 | if cur_ts_tag == 'O': 74 | new_ts_sequence.append('O') 75 | cur_pos = 'O' 76 | else: 77 | # current tag is subjective tag, i.e., cur_pos is T 78 | # print(cur_ts_tag) 79 | cur_pos, cur_sentiment = cur_ts_tag.split('-') 80 | if cur_pos == prev_pos: 81 | # prev_pos is T 82 | new_ts_sequence.append('I-%s' % cur_sentiment) 83 | else: 84 | # prev_pos is O 85 | new_ts_sequence.append('B-%s' % cur_sentiment) 86 | prev_pos = cur_pos 87 | return new_ts_sequence 88 | 89 | 90 | def ot2bio_ts_batch(ts_tag_seqs): 91 | """ 92 | batch version of function ot2bio_ts 93 | :param ts_tag_seqs: 94 | :return: 95 | """ 96 | new_ts_tag_seqs = [] 97 | n_seqs = len(ts_tag_seqs) 98 | for i in range(n_seqs): 99 | new_ts_seq = ot2bio_ts(ts_tag_sequence=ts_tag_seqs[i]) 100 | new_ts_tag_seqs.append(new_ts_seq) 101 | return new_ts_tag_seqs 102 | 103 | 104 | def bio2ot_ts(ts_tag_sequence): 105 | """ 106 | perform bio-->ot for ts tag sequence 107 | :param ts_tag_sequence: 108 | :return: 109 | """ 110 | new_ts_sequence = [] 111 | n_tags = len(ts_tag_sequence) 112 | for i in range(n_tags): 113 | ts_tag = ts_tag_sequence[i] 114 | if ts_tag == 'O' or ts_tag == 'EQ': 115 | new_ts_sequence.append('O') 116 | else: 117 | pos, sentiment = ts_tag.split('-')
118 | new_ts_sequence.append('T-%s' % sentiment) 119 | return new_ts_sequence 120 | 121 | 122 | def bio2ot_ts_batch(ts_tag_seqs): 123 | """ 124 | batch version of function bio2ot_ts 125 | :param ts_tag_seqs: 126 | :return: 127 | """ 128 | new_ts_tag_seqs = [] 129 | n_seqs = len(ts_tag_seqs) 130 | for i in range(n_seqs): 131 | new_ts_seq = bio2ot_ts(ts_tag_sequence=ts_tag_seqs[i]) 132 | new_ts_tag_seqs.append(new_ts_seq) 133 | return new_ts_tag_seqs 134 | 135 | 136 | def tag2ts(ts_tag_sequence): 137 | """ 138 | transform ts tag sequence to targeted sentiment 139 | :param ts_tag_sequence: tag sequence for ts task 140 | :return: 141 | """ 142 | n_tags = len(ts_tag_sequence) 143 | ts_sequence, sentiments = [], [] 144 | beg, end = -1, -1 145 | for i in range(n_tags): 146 | ts_tag = ts_tag_sequence[i] 147 | # current position and sentiment 148 | # tag O and tag EQ will not be counted 149 | eles = ts_tag.split('-') 150 | if len(eles) == 2: 151 | pos, sentiment = eles 152 | else: 153 | pos, sentiment = 'O', 'O' 154 | if sentiment != 'O': 155 | # current word is a subjective word 156 | sentiments.append(sentiment) 157 | if pos == 'S': 158 | # singleton 159 | ts_sequence.append((i, i, sentiment)) 160 | sentiments = [] 161 | elif pos == 'B': 162 | beg = i 163 | if len(sentiments) > 1: 164 | # remove the effect of the noisy I-{POS,NEG,NEU} 165 | sentiments = [sentiments[-1]] 166 | elif pos == 'E': 167 | end = i 168 | # schema1: only the consistent sentiment tags are accepted 169 | # that is, all of the sentiment tags are the same 170 | if end > beg > -1 and len(set(sentiments)) == 1: 171 | ts_sequence.append((beg, end, sentiment)) 172 | sentiments = [] 173 | beg, end = -1, -1 174 | return ts_sequence 175 | 176 | 177 | def logsumexp(tensor, dim=-1, keepdim=False): 178 | """ 179 | 180 | :param tensor: 181 | :param dim: 182 | :param keepdim: 183 | :return: 184 | """ 185 | max_score, _ = tensor.max(dim, keepdim=keepdim) 186 | if keepdim: 187 | stable_vec = tensor - max_score 188 | else: 189 | stable_vec = tensor - max_score.unsqueeze(dim) 190 | return max_score + (stable_vec.exp().sum(dim, keepdim=keepdim)).log() 191 | 192 | 193 | def viterbi_decode(tag_sequence, transition_matrix, 194 | tag_observations=None, allowed_start_transitions=None, 195 | allowed_end_transitions=None): 196 | """ 197 | Perform Viterbi decoding in log space over a sequence given a transition matrix 198 | specifying pairwise (transition) potentials between tags and a matrix of shape 199 | (sequence_length, num_tags) specifying unary potentials for possible tags per 200 | timestep. 201 | Parameters 202 | ---------- 203 | tag_sequence : torch.Tensor, required. 204 | A tensor of shape (sequence_length, num_tags) representing scores for 205 | a set of tags over a given sequence. 206 | transition_matrix : torch.Tensor, required. 207 | A tensor of shape (num_tags, num_tags) representing the binary potentials 208 | for transitioning between a given pair of tags. 209 | tag_observations : Optional[List[int]], optional, (default = None) 210 | A list of length ``sequence_length`` containing the class ids of observed 211 | elements in the sequence, with unobserved elements being set to -1. Note that 212 | it is possible to provide evidence which results in degenerate labelings if 213 | the sequences of tags you provide as evidence cannot transition between each 214 | other, or those transitions are extremely unlikely. 
In this situation we log a 215 | warning, but the responsibility for providing self-consistent evidence ultimately 216 | lies with the user. 217 | allowed_start_transitions : torch.Tensor, optional, (default = None) 218 | An optional tensor of shape (num_tags,) describing which tags the START token 219 | may transition *to*. If provided, additional transition constraints will be used for 220 | determining the start element of the sequence. 221 | allowed_end_transitions : torch.Tensor, optional, (default = None) 222 | An optional tensor of shape (num_tags,) describing which tags may transition *to* the 223 | end tag. If provided, additional transition constraints will be used for determining 224 | the end element of the sequence. 225 | Returns 226 | ------- 227 | viterbi_path : List[int] 228 | The tag indices of the maximum likelihood tag sequence. 229 | viterbi_score : torch.Tensor 230 | The score of the viterbi path. 231 | """ 232 | sequence_length, num_tags = list(tag_sequence.size()) 233 | 234 | has_start_end_restrictions = allowed_end_transitions is not None or allowed_start_transitions is not None 235 | 236 | if has_start_end_restrictions: 237 | 238 | if allowed_end_transitions is None: 239 | allowed_end_transitions = torch.zeros(num_tags) 240 | if allowed_start_transitions is None: 241 | allowed_start_transitions = torch.zeros(num_tags) 242 | 243 | num_tags = num_tags + 2 244 | new_transition_matrix = torch.zeros(num_tags, num_tags) 245 | new_transition_matrix[:-2, :-2] = transition_matrix 246 | 247 | # Start and end transitions are fully defined, but cannot transition between each other. 248 | # pylint: disable=not-callable 249 | allowed_start_transitions = torch.cat([allowed_start_transitions, torch.tensor([-math.inf, -math.inf])]) 250 | allowed_end_transitions = torch.cat([allowed_end_transitions, torch.tensor([-math.inf, -math.inf])]) 251 | # pylint: enable=not-callable 252 | 253 | # First define how we may transition FROM the start and end tags. 254 | new_transition_matrix[-2, :] = allowed_start_transitions 255 | # We cannot transition from the end tag to any tag. 256 | new_transition_matrix[-1, :] = -math.inf 257 | 258 | new_transition_matrix[:, -1] = allowed_end_transitions 259 | # We cannot transition to the start tag from any tag. 260 | new_transition_matrix[:, -2] = -math.inf 261 | 262 | transition_matrix = new_transition_matrix 263 | 264 | if tag_observations: 265 | if len(tag_observations) != sequence_length: 266 | raise Exception("Observations were provided, but they were not the same length " 267 | "as the sequence. Found sequence of length: {} and evidence: {}" 268 | .format(sequence_length, tag_observations)) 269 | else: 270 | tag_observations = [-1 for _ in range(sequence_length)] 271 | 272 | 273 | if has_start_end_restrictions: 274 | tag_observations = [num_tags - 2] + tag_observations + [num_tags - 1] 275 | zero_sentinel = torch.zeros(1, num_tags) 276 | extra_tags_sentinel = torch.ones(sequence_length, 2) * -math.inf 277 | tag_sequence = torch.cat([tag_sequence, extra_tags_sentinel], -1) 278 | tag_sequence = torch.cat([zero_sentinel, tag_sequence, zero_sentinel], 0) 279 | sequence_length = tag_sequence.size(0) 280 | 281 | path_scores = [] 282 | path_indices = [] 283 | 284 | if tag_observations[0] != -1: 285 | one_hot = torch.zeros(num_tags) 286 | one_hot[tag_observations[0]] = 100000. 287 | path_scores.append(one_hot) 288 | else: 289 | path_scores.append(tag_sequence[0, :]) 290 | 291 | # Evaluate the scores for all possible paths. 
292 | for timestep in range(1, sequence_length): 293 | # Add pairwise potentials to current scores. 294 | summed_potentials = path_scores[timestep - 1].unsqueeze(-1) + transition_matrix 295 | scores, paths = torch.max(summed_potentials, 0) 296 | 297 | # If we have an observation for this timestep, use it 298 | # instead of the distribution over tags. 299 | observation = tag_observations[timestep] 300 | # Warn the user if they have passed 301 | # invalid/extremely unlikely evidence. 302 | if tag_observations[timestep - 1] != -1 and observation != -1: 303 | if transition_matrix[tag_observations[timestep - 1], observation] < -10000: 304 | logger.warning("The pairwise potential between tags you have passed as " 305 | "observations is extremely unlikely. Double check your evidence " 306 | "or transition potentials!") 307 | if observation != -1: 308 | one_hot = torch.zeros(num_tags) 309 | one_hot[observation] = 100000. 310 | path_scores.append(one_hot) 311 | else: 312 | path_scores.append(tag_sequence[timestep, :] + scores.squeeze()) 313 | path_indices.append(paths.squeeze()) 314 | 315 | # Construct the most likely sequence backwards. 316 | viterbi_score, best_path = torch.max(path_scores[-1], 0) 317 | viterbi_path = [int(best_path.numpy())] 318 | for backward_timestep in reversed(path_indices): 319 | viterbi_path.append(int(backward_timestep[viterbi_path[-1]])) 320 | # Reverse the backward path. 321 | viterbi_path.reverse() 322 | 323 | if has_start_end_restrictions: 324 | viterbi_path = viterbi_path[1:-1] 325 | #return viterbi_path, viterbi_score 326 | return np.array(viterbi_path, dtype=np.int32) 327 | 328 | 329 | 330 | -------------------------------------------------------------------------------- /data/rest15/dev.txt: -------------------------------------------------------------------------------- 1 | Judging from previous posts this used to be a good place, but not any longer.####Judging=O from=O previous=O posts=O this=O used=O to=O be=O a=O good=O place=T-NEG ,=O but=O not=O any=O longer=O .=O 2 | The duck confit is always amazing and the foie gras terrine with figs was out of this world.####The=O duck=T-POS confit=T-POS is=O always=O amazing=O and=O the=O foie=T-POS gras=T-POS terrine=T-POS with=T-POS figs=T-POS was=O out=O of=O this=O world=O .=O 3 | we love th pink pony.####we=O love=O th=O pink=T-POS pony=T-POS .=O 4 | well, i didn't find it there, and trust, i have told everyone i can think of about my experience. 
####well=O ,=O i=O did=O n't=O find=O it=O there=O ,=O and=O trust=O ,=O i=O have=O told=O everyone=O i=O can=O think=O of=O about=O my=O experience=O .=O 5 | This place has got to be the best japanese restaurant in the new york area.####This=O place=T-POS has=O got=O to=O be=O the=O best=O japanese=O restaurant=O in=O the=O new=O york=O area=O .=O 6 | If you've ever been along the river in Weehawken you have an idea of the top of view the chart house has to offer.####If=O you=O 've=O ever=O been=O along=O the=O river=O in=O Weehawken=O you=O have=O an=O idea=O of=O the=O top=O of=O view=T-POS the=O chart=O house=O has=O to=O offer=O .=O 7 | This tiny restaurant is as cozy as it gets, with that certain Parisian flair.####This=O tiny=O restaurant=T-POS is=O as=O cozy=O as=O it=O gets=O ,=O with=O that=O certain=O Parisian=O flair=O .=O 8 | The pizza was delivered cold and the cheese wasn't even fully melted!####The=O pizza=T-NEG was=O delivered=O cold=O and=O the=O cheese=T-NEG was=O n't=O even=O fully=O melted=O !=O 9 | My wife and I always enjoy the young, not always well trained but nevertheless friendly, staff, all of whom have a story.####My=O wife=O and=O I=O always=O enjoy=O the=O young=O ,=O not=O always=O well=O trained=O but=O nevertheless=O friendly=O ,=O staff=T-POS ,=O all=O of=O whom=O have=O a=O story=O .=O 10 | Sit outside in the warm weather; inside for cozy winter.####Sit=O outside=O in=O the=O warm=O weather=O ;=O inside=O for=O cozy=O winter=O .=O 11 | They refuse to seat parties of 3 or more on weekends.####They=O refuse=O to=O seat=O parties=O of=O 3=O or=O more=O on=O weekends=O .=O 12 | The hostess is rude to the point of being offensive.####The=O hostess=T-NEG is=O rude=O to=O the=O point=O of=O being=O offensive=O .=O 13 | Try everything for that matter, it is all good.####Try=O everything=O for=O that=O matter=O ,=O it=O is=O all=O good=O .=O 14 | Veal Parmigana - Better than Patsy's!####Veal=O Parmigana=O ,=O Better=O than=O Patsy=O 's=O !=O 15 | Even after they overcharged me the last time I was there.####Even=O after=O they=O overcharged=O me=O the=O last=O time=O I=O was=O there=O .=O 16 | Make sure you have the Spicy Scallop roll.. .####Make=O sure=O you=O have=O the=O Spicy=T-POS Scallop=T-POS roll=T-POS .=O 17 | The drinks are always welll made and wine selection is fairly priced.####The=O drinks=T-POS are=O always=O welll=O made=O and=O wine=T-POS selection=T-POS is=O fairly=O priced=O .=O 18 | Try their chef's specials-- they are to die for.####Try=O their=O chef's=T-POS specials=T-POS they=O are=O to=O die=O for=O .=O 19 | Then, to top things off, she dropped used silverware on my boyfriend's jacket and did not stop to apologize or clean the mess that was left on clothes. 
####Then=O ,=O to=O top=O things=O off=O ,=O she=O dropped=O used=O silverware=O on=O my=O boyfriend=O 's=O jacket=O and=O did=O not=O stop=O to=O apologize=O or=O clean=O the=O mess=O that=O was=O left=O on=O clothes=O .=O 20 | I had a grat time at Jekyll and Hyde!####I=O had=O a=O grat=O time=O at=O Jekyll=T-POS and=T-POS Hyde=T-POS !=O 21 | The outdoor atmosphere of sitting on the sidewalk watching the world go by 50 feet away on 6th avenue on a cool evening was wonderful.####The=O outdoor=T-POS atmosphere=T-POS of=O sitting=O on=O the=O sidewalk=O watching=O the=O world=O go=O by=O 50=O feet=O away=O on=O 6th=O avenue=O on=O a=O cool=O evening=O was=O wonderful=O .=O 22 | Great service, great food.####Great=O service=T-POS ,=O great=O food=T-POS .=O 23 | When I lived upstate for a while I would buy freeze the bagels and they would still be better than any else.####When=O I=O lived=O upstate=O for=O a=O while=O I=O would=O buy=O freeze=O the=O bagels=T-POS and=O they=O would=O still=O be=O better=O than=O any=O else=O .=O 24 | Aside from the Sea Urchin, the chef recommended an assortment of fish including Fatty Yellow Tail, Boton Shrimp, Blue Fin Torro (Fatty Tuna), Sea Eel, etc.####Aside=O from=O the=O Sea=O Urchin=O ,=O the=O chef=O recommended=O an=O assortment=O of=O fish=O including=O Fatty=O Yellow=O Tail=O ,=O Boton=O Shrimp=O ,=O Blue=O Fin=O Torro=O Fatty=O Tuna=O ,=O Sea=O Eel=O ,=O etc=O .=O 25 | If you are the type of person who likes being scared and entertained, this is a great place to go and eat.####If=O you=O are=O the=O type=O of=O person=O who=O likes=O being=O scared=O and=O entertained=O ,=O this=O is=O a=O great=O place=T-POS to=O go=O and=O eat=O .=O 26 | Its located in greenewich village.####Its=O located=O in=O greenewich=O village=O .=O 27 | I loved it and would HIGHLY RECOMMEND.####I=O loved=O it=O and=O would=O HIGHLY=O RECOMMEND=O .=O 28 | I am not the most experienced person when it comes to Thai food, but my friend who took me there is.####I=O am=O not=O the=O most=O experienced=O person=O when=O it=O comes=O to=O Thai=O food=O ,=O but=O my=O friend=O who=O took=O me=O there=O is=O .=O 29 | We had Pam's special fried fish and it was amazing.####We=O had=O Pam's=T-POS special=T-POS fried=T-POS fish=T-POS and=O it=O was=O amazing=O .=O 30 | Great vibe, lots of people.####Great=O vibe=T-POS ,=O lots=O of=O people=O .=O 31 | Salads were fantastic.####Salads=T-POS were=O fantastic=O .=O 32 | This place is always very crowded and popular.####This=O place=T-POS is=O always=O very=O crowded=O and=O popular=O .=O 33 | We concluded with tiramisu chocolate cake, both were delicious.####We=O concluded=O with=O tiramisu=T-POS chocolate=T-POS cake=T-POS ,=O both=O were=O delicious=O .=O 34 | sometimes i get bad food and bad service, sometimes i get good good and bad service.####sometimes=O i=O get=O bad=O food=T-NEG and=O bad=O service=T-NEG ,=O sometimes=O i=O get=O good=T-POS good=T-POS and=O bad=O service=T-NEG .=O 35 | I can't wait to go back.####I=O ca=O n't=O wait=O to=O go=O back=O .=O 36 | They tell me they are going to cover the garden in glass for the winter, so i'm looking forward to going there on a snowy night to enjoy it.####They=O tell=O me=O they=O are=O going=O to=O cover=O the=O garden=O in=O glass=O for=O the=O winter=O ,=O so=O i=O 'm=O looking=O forward=O to=O going=O there=O on=O a=O snowy=O night=O to=O enjoy=O it=O .=O 37 | To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the 
other deficiencies of Teodora.####To=O be=O completely=O fair=O ,=O the=O only=O redeeming=O factor=O was=O the=O food=T-POS ,=O which=O was=O above=O average=O ,=O but=O could=O n't=O make=O up=O for=O all=O the=O other=O deficiencies=O of=O Teodora=T-NEG .=O 38 | The food however, is what one might expect.####The=O food=T-NEG however=O ,=O is=O what=O one=O might=O expect=O .=O 39 | Food was good not great not worth the wait or another visit####Food=T-NEU was=O good=O not=O great=O not=O worth=O the=O wait=O or=O another=O visit=O 40 | Growing up in NY, I have eaten my share of bagels.####Growing=O up=O in=O NY=O ,=O I=O have=O eaten=O my=O share=O of=O bagels=O .=O 41 | The lox is always fresh too.####The=O lox=T-POS is=O always=O fresh=O too=O .=O 42 | The prices were CHEAP compared to the quality of service and food.####The=O prices=O were=O CHEAP=O compared=O to=O the=O quality=O of=O service=T-POS and=O food=T-POS .=O 43 | I was there on sat. for my birthday and we had an excellent time.####I=O was=O there=O on=O sat=O for=O my=O birthday=O and=O we=O had=O an=O excellent=O time=O .=O 44 | The wine the service was very good too.####The=O wine=T-POS the=O service=T-POS was=O very=O good=O too=O .=O 45 | If you go, try the marinara/arrabiatta sauce, the mozzarella en Carozza is mmmmmmmm..... everything is just delicious.####If=O you=O go=O ,=O try=O the=O marinara/arrabiatta=T-POS sauce=T-POS ,=O the=O mozzarella=T-POS en=T-POS Carozza=T-POS is=O mmmmmmmm=O everything=O is=O just=O delicious=O .=O 46 | Old school meets New world.####Old=O school=O meets=O New=O world=O .=O 47 | I go twice a month!####I=O go=O twice=O a=O month=O !=O 48 | The hostess and the waitress were incredibly rude and did everything they could to rush us out.####The=O hostess=T-NEG and=O the=O waitress=T-NEG were=O incredibly=O rude=O and=O did=O everything=O they=O could=O to=O rush=O us=O out=O .=O 49 | The two star chefs left quite some time ago to open their own place.####The=O two=O star=O chefs=O left=O quite=O some=O time=O ago=O to=O open=O their=O own=O place=O .=O 50 | Don't dine at Tamarind for the vegetarian dishes, they are simply not up to par with the non-veg selections.####Do=O n't=O dine=O at=O Tamarind=O for=O the=O vegetarian=T-NEG dishes=T-NEG ,=O they=O are=O simply=O not=O up=O to=O par=O with=O the=O non-veg=T-POS selections=T-POS .=O 51 | They wouldnt even let me finish my glass of wine before offering another.####They=O would=O n't=O even=O let=O me=O finish=O my=O glass=O of=O wine=O before=O offering=O another=O .=O 52 | Try the Pad Thai, it's fabulous and their prices are so cheap!####Try=O the=O Pad=T-POS Thai=T-POS ,=O it=O 's=O fabulous=O and=O their=O prices=O are=O so=O cheap=O !=O 53 | This is a nice restaurant if you are looking for a good place to host an intimate dinner meeting with business associates.####This=O is=O a=O nice=O restaurant=T-POS if=O you=O are=O looking=O for=O a=O good=O place=O to=O host=O an=O intimate=O dinner=O meeting=O with=O business=O associates=O .=O 54 | The menu is limited but almost all of the dishes are excellent.####The=O menu=T-NEG is=O limited=O but=O almost=O all=O of=O the=O dishes=T-POS are=O excellent=O .=O 55 | The food was delicious (I had a halibut special, my husband had steak), and the service was top-notch.####The=O food=T-POS was=O delicious=O I=O had=O a=O halibut=T-POS special=T-POS ,=O my=O husband=O had=O steak=T-POS ,=O and=O the=O service=T-POS was=O top-notch=O .=O 56 | I highly recommend the restaurant based on our 
experience last night.####I=O highly=O recommend=O the=O restaurant=T-POS based=O on=O our=O experience=O last=O night=O .=O 57 | please don't fool us.####please=O do=O n't=O fool=O us=O .=O 58 | Never again!####Never=O again=O !=O 59 | My boyfriend and I went there to celebrate my birthday the other night and all I can say is that it was magnificent.####My=O boyfriend=O and=O I=O went=O there=O to=O celebrate=O my=O birthday=O the=O other=O night=O and=O all=O I=O can=O say=O is=O that=O it=O was=O magnificent=O .=O 60 | This place is really trendi but they have forgotten about the most important part of a restaurant, the food.####This=O place=T-POS is=O really=O trendi=O but=O they=O have=O forgotten=O about=O the=O most=O important=O part=O of=O a=O restaurant=O ,=O the=O food=T-NEG .=O 61 | And the Tom Kha soup was pathetic.####And=O the=O Tom=T-NEG Kha=T-NEG soup=T-NEG was=O pathetic=O .=O 62 | it helps if you know what to order.####it=O helps=O if=O you=O know=O what=O to=O order=O .=O 63 | Great food, good size menu, great service and an unpretensious setting.####Great=O food=T-POS ,=O good=O size=O menu=T-POS ,=O great=O service=T-POS and=O an=O unpretensious=O setting=T-POS .=O 64 | We are very particular about sushi and were both please with every choice which included: ceviche mix (special), crab dumplings, assorted sashimi, sushi and rolls, two types of sake, and the banana tempura.####We=O are=O very=O particular=O about=O sushi=T-POS and=O were=O both=O please=O with=O every=O choice=O which=O included=O ,=O ceviche=T-POS mix=T-POS (special)=T-POS ,=O crab=T-POS dumplings=T-POS ,=O assorted=T-POS sashimi=T-POS ,=O sushi=T-POS and=O rolls=T-POS ,=O two=T-POS types=T-POS of=T-POS sake=T-POS ,=O and=O the=O banana=T-POS tempura=T-POS .=O 65 | This is a wonderful place on all stand points especially value ofr money.####This=O is=O a=O wonderful=O place=T-POS on=O all=O stand=O points=O especially=O value=O ofr=O money=O .=O 66 | Went to Cafe Spice with 4 of my friends on a saturday night.####Went=O to=O Cafe=O Spice=O with=O 4=O of=O my=O friends=O on=O a=O saturday=O night=O .=O 67 | We were greeted promptly by the waiter who was very nice and cordial.####We=O were=O greeted=O promptly=O by=O the=O waiter=T-POS who=O was=O very=O nice=O and=O cordial=O .=O 68 | The crust is thin, the ingredients are fresh and the staff is friendly.####The=O crust=T-POS is=O thin=O ,=O the=O ingredients=T-POS are=O fresh=O and=O the=O staff=T-POS is=O friendly=O .=O 69 | I ordered the smoked salmon and roe appetizer and it was off flavor.####I=O ordered=O the=O smoked=T-NEG salmon=T-NEG and=T-NEG roe=T-NEG appetizer=T-NEG and=O it=O was=O off=O flavor=O .=O 70 | Delicious crab cakes too.####Delicious=O crab=T-POS cakes=T-POS too=O .=O 71 | Seriously, this place kicks ass.####Seriously=O ,=O this=O place=T-POS kicks=O ass=O .=O 72 | Good spreads, great beverage selections and bagels really tasty.####Good=O spreads=T-POS ,=O great=O beverage=T-POS selections=T-POS and=O bagels=T-POS really=O tasty=O .=O 73 | Love Pizza 33..####Love=O Pizza=T-POS 33=T-POS .=O 74 | It hits the spot every time####It=O hits=O the=O spot=O every=O time=O 75 | A little pricey but it really hits the spot on a Sunday morning!####A=O little=O pricey=O but=O it=O really=O hits=O the=O spot=O on=O a=O Sunday=O morning=O !=O 76 | Be sure not to get anything other than bagels!..####Be=O sure=O not=O to=O get=O anything=O other=O than=O bagels=T-POS !=O .=O 77 | Jimmy is Dominican!####Jimmy=O is=O Dominican=O !=O 78 | Well, this 
place is so Ghetto its not even funny.####Well=O ,=O this=O place=T-NEG is=O so=O Ghetto=O its=O not=O even=O funny=O .=O 79 | Awsome Pizza especially the Margheritta slice.####Awsome=O Pizza=T-POS especially=O the=O Margheritta=T-POS slice=T-POS .=O 80 | What more can you ask for?####What=O more=O can=O you=O ask=O for=O ?=O 81 | For authentic Thai food, look no further than Toons.####For=O authentic=O Thai=T-POS food=T-POS ,=O look=O no=O further=O than=O Toons=O .=O 82 | The place was quiet and delightful.####The=O place=T-POS was=O quiet=O and=O delightful=O .=O 83 | As a retired hipster, I can say with some degree of certainty that for the last year Lucky Strike has been the best laid-back late night in the city.####As=O a=O retired=O hipster=O ,=O I=O can=O say=O with=O some=O degree=O of=O certainty=O that=O for=O the=O last=O year=O Lucky=T-POS Strike=T-POS has=O been=O the=O best=O laid-back=O late=O night=O in=O the=O city=O .=O 84 | I will go back to Suan soon!####I=O will=O go=O back=O to=O Suan=T-POS soon=O !=O 85 | I cannot imagine you not rushing out to eat there.####I=O can=O not=O imagine=O you=O not=O rushing=O out=O to=O eat=O there=O .=O 86 | Do not get the Go Go Hamburgers, no matter what the reviews say.####Do=O not=O get=O the=O Go=T-NEG Go=T-NEG Hamburgers=T-NEG ,=O no=O matter=O what=O the=O reviews=O say=O .=O 87 | Steamed fresh so brought hot hot hot to your table.####Steamed=O fresh=O so=O brought=O hot=O hot=O hot=O to=O your=O table=O .=O 88 | (2) egg custards and pork buns at either bakery on west side of Mott street just south of Canal.####2=O egg=O custards=O and=O pork=O buns=O at=O either=O bakery=O on=O west=O side=O of=O Mott=O street=O just=O south=O of=O Canal=O .=O 89 | this little place has a cute interior decor and affordable city prices.####this=O little=O place=T-POS has=O a=O cute=O interior=T-POS decor=T-POS and=O affordable=O city=O prices=O .=O 90 | i would just ask for no oil next time.####i=O would=O just=O ask=O for=O no=O oil=O next=O time=O .=O 91 | The only thing you can do here is walk in and eat .. 
but planning an event, especially a small, intimate one, forget about it.####The=O only=O thing=O you=O can=O do=O here=O is=O walk=O in=O and=O eat=O but=O planning=O an=O event=O ,=O especially=O a=O small=O ,=O intimate=O one=O ,=O forget=O about=O it=O .=O 92 | But that is highly forgivable.####But=O that=O is=O highly=O forgivable=O .=O 93 | Been there, done that, and New York, it's not that big a deal.####Been=O there=O ,=O done=O that=O ,=O and=O New=O York=O ,=O it=O 's=O not=O that=O big=O a=O deal=O .=O 94 | Great food, great prices, great service.####Great=O food=T-POS ,=O great=O prices=O ,=O great=O service=T-POS .=O 95 | Their bagels are fine, but they are a little overcooked, and not really a 'special' bagel experience.####Their=O bagels=T-NEG are=O fine=O ,=O but=O they=O are=O a=O little=O overcooked=O ,=O and=O not=O really=O a=O special=O bagel=O experience=O .=O 96 | Downtown Dinner 2002 - Prixe fix: Appetizers were ok, waiter gave me poor suggestion..try the potato stuff kanish best one.####Downtown=O Dinner=O 2002=O ,=O Prixe=O fix=O ,=O Appetizers=T-NEU were=O ok=O ,=O waiter=T-NEG gave=O me=O poor=O suggestiontry=O the=O potato=T-POS stuff=T-POS kanish=T-POS best=O one=O .=O 97 | Still, any quibbles about the bill were off-set by the pour-your-own measures of liquers which were courtesey of the house...####Still=O ,=O any=O quibbles=O about=O the=O bill=O were=O off-set=O by=O the=O pour-your-own=O measures=T-POS of=T-POS liquers=T-POS which=O were=O courtesey=O of=O the=O house=O .=O 98 | The only thing more wonderful than the food (which is exceptional) is the service.####The=O only=O thing=O more=O wonderful=O than=O the=O food=T-POS which=O is=O exceptional=O is=O the=O service=T-POS .=O 99 | A real dissapointment.####A=O real=O dissapointment=O .=O 100 | But that wasn't the icing on the cake: a tiramisu that resembled nothing I have ever had.####But=O that=O was=O n't=O the=O icing=O on=O the=O cake=O ,=O a=O tiramisu=T-NEG that=O resembled=O nothing=O I=O have=O ever=O had=O .=O 101 | Priced at upper intermediate range.####Priced=O at=O upper=O intermediate=O range=O .=O 102 | This place has the best Chinese style BBQ ribs in the city.####This=O place=O has=O the=O best=O Chinese=O style=O BBQ=T-POS ribs=T-POS in=O the=O city=O .=O 103 | I also recommend the rice dishes or the different varieties of congee (rice porridge).####I=O also=O recommend=O the=O rice=T-POS dishes=T-POS or=O the=O different=O varieties=O of=O congee=T-POS (rice=T-POS porridge)=T-POS .=O 104 | Quick and friendly service.####Quick=O and=O friendly=O service=T-POS .=O 105 | Warm and friendly in the winter and terrific outdoor seating in the warmer months.####Warm=O and=O friendly=O in=O the=O winter=O and=O terrific=O outdoor=T-POS seating=T-POS in=O the=O warmer=O months=O .=O 106 | Probably would not go again...####Probably=O would=O not=O go=O again=O .=O 107 | A classic!####A=O classic=O !=O 108 | It was the first place we ate on our first trip to New York, and it will be the last place we stop as we head out of town on our next trip to New York.####It=O was=O the=O first=O place=T-POS we=O ate=O on=O our=O first=O trip=O to=O New=O York=O ,=O and=O it=O will=O be=O the=O last=O place=T-POS we=O stop=O as=O we=O head=O out=O of=O town=O on=O our=O next=O trip=O to=O New=O York=O .=O 109 | Thanks Bloom's for a lovely trip.####Thanks=O Bloom's=T-POS for=O a=O lovely=O trip=O .=O 110 | bottles of wine are cheap and good.####bottles=T-POS of=T-POS wine=T-POS are=O cheap=O and=O good=O .=O 
111 | The mussles were the fishiest things I've ever tasted, the seabass was bland, the goat cheese salad was missing the goat cheese, the penne w/ chicken had bones in it... It was disgusting.####The=O mussles=T-NEG were=O the=O fishiest=O things=O I=O 've=O ever=O tasted=O ,=O the=O seabass=T-NEG was=O bland=O ,=O the=O goat=T-NEG cheese=T-NEG salad=T-NEG was=O missing=O the=O goat=O cheese=O ,=O the=O penne=T-NEG w/=T-NEG chicken=T-NEG had=O bones=O in=O it=O It=O was=O disgusting=O .=O 112 | The food is amazing, rich pastas and fresh doughy pizza.####The=O food=T-POS is=O amazing=O ,=O rich=O pastas=T-POS and=O fresh=O doughy=O pizza=T-POS .=O 113 | Among all of the new 5th avenue restaurants, this offers by far one of the best values for your money.####Among=O all=O of=O the=O new=O 5th=O avenue=O restaurants=O ,=O this=O offers=O by=O far=O one=O of=O the=O best=O values=O for=O your=O money=O .=O 114 | Good luck getting a table.####Good=O luck=O getting=O a=O table=O .=O 115 | We recently decided to try this location, and to our delight, they have outdoor seating, perfect since I had my yorkie with me.####We=O recently=O decided=O to=O try=O this=O location=O ,=O and=O to=O our=O delight=O ,=O they=O have=O outdoor=T-POS seating=T-POS ,=O perfect=O since=O I=O had=O my=O yorkie=O with=O me=O .=O 116 | But $1 for each small piece???####But=O $=O 1=O for=O each=O small=O piece=O ?=O ?=O ?=O 117 | Not worth it.####Not=O worth=O it=O .=O 118 | Great Indian food and the service is incredible.####Great=O Indian=T-POS food=T-POS and=O the=O service=T-POS is=O incredible=O .=O 119 | A great place to meet up for some food and drinks... ####A=O great=O place=T-POS to=O meet=O up=O for=O some=O food=O and=O drinks=O .=O 120 | It's also attached to Angel's Share, which is a cool, more romantic bar...####It=O 's=O also=O attached=O to=O Angel=O 's=O Share=O ,=O which=O is=O a=O cool=O ,=O more=O romantic=O bar=O .=O 121 | Wasn't going to share but I feel obligated...while sitting at the sushi bar dining we watched the chef accidentally drop a piece of Unagi on the floor and upon retrieving it from the floor proceed to use the piece in the delivery order he was preparing.####Was=O n't=O going=O to=O share=O but=O I=O feel=O obligatedwhile=O sitting=O at=O the=O sushi=O bar=O dining=O we=O watched=O the=O chef=T-NEG accidentally=O drop=O a=O piece=O of=O Unagi=O on=O the=O floor=O and=O upon=O retrieving=O it=O from=O the=O floor=O proceed=O to=O use=O the=O piece=O in=O the=O delivery=O order=O he=O was=O preparing=O .=O 122 | We left, never to return.####We=O left=O ,=O never=O to=O return=O .=O 123 | In fact, it appears he is going to go postal at any moment.####In=O fact=O ,=O it=O appears=O he=O is=O going=O to=O go=O postal=O at=O any=O moment=O .=O 124 | $20 for all you can eat sushi cannot be beaten.####$=O 20=O for=O all=T-POS you=T-POS can=T-POS eat=T-POS sushi=T-POS can=O not=O be=O beaten=O .=O 125 | I went to Areo on a Sunday afternoon with four of my girlfriends, and spent three enjoyable hours there.####I=O went=O to=O Areo=T-POS on=O a=O Sunday=O afternoon=O with=O four=O of=O my=O girlfriends=O ,=O and=O spent=O three=O enjoyable=O hours=O there=O .=O 126 | I would highly recommand requesting a table by the window.####I=O would=O highly=O recommand=O requesting=O a=O table=T-POS by=T-POS the=T-POS window=T-POS .=O 127 | Love the scene first off- the place has a character and nice light to it..very fortunate, location wise.####Love=O the=O scene=T-POS first=O off=O the=O 
place=T-POS has=O a=O character=O and=O nice=O light=O to=O itvery=O fortunate=O ,=O location=T-POS wise=O .=O
128 | I plan on stopping by next week as well.####I=O plan=O on=O stopping=O by=O next=O week=O as=O well=O .=O
129 | Keep up the good work guys!####Keep=O up=O the=O good=O work=O guys=O !=O
130 | We could have made a meal of the yummy dumplings from the dumpling menu.####We=O could=O have=O made=O a=O meal=O of=O the=O yummy=O dumplings=T-POS from=O the=O dumpling=O menu=O .=O
131 | 
-------------------------------------------------------------------------------- /glue_utils.py: --------------------------------------------------------------------------------
1 | # coding=utf-8
2 | # Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.
3 | # Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
4 | #
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 | #
9 | # http://www.apache.org/licenses/LICENSE-2.0
10 | #
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | """ BERT classification fine-tuning: utilities to work with GLUE tasks """
17 | 
18 | from __future__ import absolute_import, division, print_function
19 | 
20 | import csv
21 | import logging
22 | import os
23 | import sys
24 | from io import open
25 | 
26 | from seq_utils import *
27 | 
28 | logger = logging.getLogger(__name__)
29 | 
30 | SMALL_POSITIVE_CONST = 1e-4
31 | 
32 | class InputExample(object):
33 | """A single training/test example for simple sequence classification."""
34 | 
35 | def __init__(self, guid, text_a, text_b=None, label=None):
36 | """Constructs an InputExample.
37 | 
38 | Args:
39 | guid: Unique id for the example.
40 | text_a: string. The untokenized text of the first sequence. For single
41 | sequence tasks, only this sequence must be specified.
42 | text_b: (Optional) string. The untokenized text of the second sequence.
43 | Only needs to be specified for sequence pair tasks.
44 | label: (Optional) string. The label of the example. This should be
45 | specified for train and dev examples, but not for test examples.
46 | """ 47 | self.guid = guid 48 | self.text_a = text_a 49 | self.text_b = text_b 50 | self.label = label 51 | 52 | 53 | class InputFeatures(object): 54 | """A single set of features of data.""" 55 | 56 | def __init__(self, input_ids, input_mask, segment_ids, label_id): 57 | self.input_ids = input_ids 58 | self.input_mask = input_mask 59 | self.segment_ids = segment_ids 60 | self.label_id = label_id 61 | 62 | 63 | class SeqInputFeatures(object): 64 | """A single set of features of data for the ABSA task""" 65 | def __init__(self, input_ids, input_mask, segment_ids, label_ids, evaluate_label_ids): 66 | self.input_ids = input_ids 67 | self.input_mask = input_mask 68 | self.segment_ids = segment_ids 69 | self.label_ids = label_ids 70 | # mapping between word index and head token index 71 | self.evaluate_label_ids = evaluate_label_ids 72 | 73 | 74 | class DataProcessor(object): 75 | """Base class for data converters for sequence classification data sets.""" 76 | 77 | def get_train_examples(self, data_dir): 78 | """Gets a collection of `InputExample`s for the train set.""" 79 | raise NotImplementedError() 80 | 81 | def get_dev_examples(self, data_dir): 82 | """Gets a collection of `InputExample`s for the dev set.""" 83 | raise NotImplementedError() 84 | 85 | def get_test_examples(self, data_dir): 86 | """Gets a collection of `InputExample`s for the test set.""" 87 | raise NotImplementedError() 88 | 89 | def get_labels(self): 90 | """Gets the list of labels for this data set.""" 91 | raise NotImplementedError() 92 | 93 | @classmethod 94 | def _read_tsv(cls, input_file, quotechar=None): 95 | """Reads a tab separated value file.""" 96 | with open(input_file, "r", encoding="utf-8-sig") as f: 97 | reader = csv.reader(f, delimiter="\t", quotechar=quotechar) 98 | lines = [] 99 | for line in reader: 100 | if sys.version_info[0] == 2: 101 | line = list(cell for cell in line) 102 | lines.append(line) 103 | return lines 104 | 105 | 106 | class ABSAProcessor(DataProcessor): 107 | """Processor for the ABSA datasets""" 108 | def get_train_examples(self, data_dir, tagging_schema): 109 | return self._create_examples(data_dir=data_dir, set_type='train', tagging_schema=tagging_schema) 110 | 111 | def get_dev_examples(self, data_dir, tagging_schema): 112 | return self._create_examples(data_dir=data_dir, set_type='dev', tagging_schema=tagging_schema) 113 | 114 | def get_test_examples(self, data_dir, tagging_schema): 115 | return self._create_examples(data_dir=data_dir, set_type='test', tagging_schema=tagging_schema) 116 | 117 | def get_labels(self, tagging_schema): 118 | if tagging_schema == 'OT': 119 | return [] 120 | elif tagging_schema == 'BIO': 121 | return ['O', 'EQ', 'B-POS', 'I-POS', 'B-NEG', 'I-NEG', 'B-NEU', 'I-NEU'] 122 | elif tagging_schema == 'BIEOS': 123 | return ['O', 'EQ', 'B-POS', 'I-POS', 'E-POS', 'S-POS', 124 | 'B-NEG', 'I-NEG', 'E-NEG', 'S-NEG', 125 | 'B-NEU', 'I-NEU', 'E-NEU', 'S-NEU'] 126 | else: 127 | raise Exception("Invalid tagging schema %s..." % tagging_schema) 128 | 129 | def _create_examples(self, data_dir, set_type, tagging_schema): 130 | examples = [] 131 | file = os.path.join(data_dir, "%s.txt" % set_type) 132 | class_count = np.zeros(3) 133 | with open(file, 'r', encoding='UTF-8') as fp: 134 | sample_id = 0 135 | for line in fp: 136 | sent_string, tag_string = line.strip().split('####') 137 | words = [] 138 | tags = [] 139 | for tag_item in tag_string.split(' '): 140 | eles = tag_item.split('=') 141 | if len(eles) == 1: 142 | raise Exception("Invalid samples %s..." 
143 | elif len(eles) == 2:
144 | word, tag = eles
145 | else:
146 | word = ''.join((len(eles) - 2) * ['='])
147 | tag = eles[-1]
148 | words.append(word)
149 | tags.append(tag)
150 | # convert the tags from the OT schema to the target tagging schema
151 | if tagging_schema == 'BIEOS':
152 | tags = ot2bieos_ts(tags)
153 | elif tagging_schema == 'BIO':
154 | tags = ot2bio_ts(tags)
155 | else:
156 | # original tags follow the OT tagging schema, do nothing
157 | pass
158 | guid = "%s-%s" % (set_type, sample_id)
159 | text_a = ' '.join(words)
160 | #label = [absa_label_vocab[tag] for tag in tags]
161 | gold_ts = tag2ts(ts_tag_sequence=tags)
162 | for (b, e, s) in gold_ts:
163 | if s == 'POS':
164 | class_count[0] += 1
165 | if s == 'NEG':
166 | class_count[1] += 1
167 | if s == 'NEU':
168 | class_count[2] += 1
169 | examples.append(InputExample(guid=guid, text_a=text_a, text_b=None, label=tags))
170 | sample_id += 1
171 | print("%s class count: %s" % (set_type, class_count))
172 | return examples
173 | 
174 | 
175 | def _truncate_seq_pair(tokens_a, tokens_b, max_length):
176 | """Truncates a sequence pair in place to the maximum length."""
177 | 
178 | # This is a simple heuristic which will always truncate the longer sequence
179 | # one token at a time. This makes more sense than truncating an equal percent
180 | # of tokens from each, since if one sequence is very short then each token
181 | # that's truncated likely contains more information than a longer sequence.
182 | while True:
183 | total_length = len(tokens_a) + len(tokens_b)
184 | if total_length <= max_length:
185 | break
186 | if len(tokens_a) > len(tokens_b):
187 | tokens_a.pop()
188 | else:
189 | tokens_b.pop()
190 | 
191 | 
192 | def convert_examples_to_seq_features(examples, label_list, tokenizer,
193 | cls_token_at_end=False, pad_on_left=False, cls_token='[CLS]',
194 | sep_token='[SEP]', pad_token=0, sequence_a_segment_id=0,
195 | sequence_b_segment_id=1, cls_token_segment_id=1, pad_token_segment_id=0,
196 | mask_padding_with_zero=True):
197 | # feature extraction for sequence labeling
198 | label_map = {label: i for i, label in enumerate(label_list)}
199 | features = []
200 | max_seq_length = -1
201 | examples_tokenized = []
202 | for (ex_index, example) in enumerate(examples):
203 | tokens_a = []
204 | labels_a = []
205 | evaluate_label_ids = []
206 | words = example.text_a.split(' ')
207 | wid, tid = 0, 0
208 | for word, label in zip(words, example.label):
209 | subwords = tokenizer.tokenize(word)
210 | tokens_a.extend(subwords)
211 | if label != 'O':
212 | labels_a.extend([label] + ['EQ'] * (len(subwords) - 1))
213 | else:
214 | labels_a.extend(['O'] * len(subwords))
215 | evaluate_label_ids.append(tid)
216 | wid += 1
217 | # move the token pointer
218 | tid += len(subwords)
219 | #print(evaluate_label_ids)
220 | assert tid == len(tokens_a)
221 | evaluate_label_ids = np.array(evaluate_label_ids, dtype=np.int32)
222 | examples_tokenized.append((example.guid, tokens_a, labels_a, evaluate_label_ids))
223 | if len(tokens_a) > max_seq_length:
224 | max_seq_length = len(tokens_a)
225 | # account for the [CLS] and [SEP] tokens
226 | max_seq_length += 2
227 | #max_seq_length = 128
228 | for ex_index, (guid, tokens_a, labels_a, evaluate_label_ids) in enumerate(examples_tokenized):
229 | #tokens_a = tokenizer.tokenize(example.text_a)
230 | 
231 | # Account for [CLS] and [SEP] with "- 2"
232 | # for sequence labeling, it is better not to truncate the sequence
233 | #if len(tokens_a) > max_seq_length - 2:
234 | # tokens_a = tokens_a[:(max_seq_length - 2)]
235 | # labels_a = labels_a
236 | tokens = tokens_a + [sep_token]
237 | segment_ids = [sequence_a_segment_id] * len(tokens)
238 | labels = labels_a + ['O']
239 | if cls_token_at_end:
240 | # the evaluate label ids do not change
241 | tokens = tokens + [cls_token]
242 | segment_ids = segment_ids + [cls_token_segment_id]
243 | labels = labels + ['O']
244 | else:
245 | # right-shift the evaluate label ids by 1
246 | tokens = [cls_token] + tokens
247 | segment_ids = [cls_token_segment_id] + segment_ids
248 | labels = ['O'] + labels
249 | evaluate_label_ids += 1
250 | input_ids = tokenizer.convert_tokens_to_ids(tokens)
251 | input_mask = [1 if mask_padding_with_zero else 0] * len(input_ids)
252 | # Zero-pad up to the sequence length.
253 | padding_length = max_seq_length - len(input_ids)
254 | #print("Current labels:", labels)
255 | label_ids = [label_map[label] for label in labels]
256 | 
257 | # pad the input sequence and the mask sequence
258 | if pad_on_left:
259 | input_ids = ([pad_token] * padding_length) + input_ids
260 | input_mask = ([0 if mask_padding_with_zero else 1] * padding_length) + input_mask
261 | segment_ids = ([pad_token_segment_id] * padding_length) + segment_ids
262 | # pad sequence tag 'O'
263 | label_ids = ([0] * padding_length) + label_ids
264 | # right-shift evaluate_label_ids by padding_length
265 | evaluate_label_ids += padding_length
266 | else:
267 | # the evaluate label ids do not change
268 | input_ids = input_ids + ([pad_token] * padding_length)
269 | input_mask = input_mask + ([0 if mask_padding_with_zero else 1] * padding_length)
270 | segment_ids = segment_ids + ([pad_token_segment_id] * padding_length)
271 | # pad sequence tag 'O'
272 | label_ids = label_ids + ([0] * padding_length)
273 | assert len(input_ids) == max_seq_length
274 | assert len(input_mask) == max_seq_length
275 | assert len(segment_ids) == max_seq_length
276 | assert len(label_ids) == max_seq_length
277 | 
278 | if ex_index < 5:
279 | logger.info("*** Example ***")
280 | logger.info("guid: %s" % guid)
281 | logger.info("tokens: %s" % " ".join(
282 | [str(x) for x in tokens]))
283 | logger.info("input_ids: %s" % " ".join([str(x) for x in input_ids]))
284 | logger.info("input_mask: %s" % " ".join([str(x) for x in input_mask]))
285 | logger.info("segment_ids: %s" % " ".join([str(x) for x in segment_ids]))
286 | logger.info("labels: %s " % ' '.join([str(x) for x in label_ids]))
287 | logger.info("evaluate label ids: %s" % evaluate_label_ids)
288 | 
289 | features.append(
290 | SeqInputFeatures(input_ids=input_ids,
291 | input_mask=input_mask,
292 | segment_ids=segment_ids,
293 | label_ids=label_ids,
294 | evaluate_label_ids=evaluate_label_ids))
295 | print("maximal sequence length is", max_seq_length)
296 | return features
297 | 
298 | 
299 | def convert_examples_to_features(examples, label_list, max_seq_length,
300 | tokenizer, output_mode,
301 | cls_token_at_end=False, pad_on_left=False,
302 | cls_token='[CLS]', sep_token='[SEP]', pad_token=0,
303 | sequence_a_segment_id=0, sequence_b_segment_id=1,
304 | cls_token_segment_id=1, pad_token_segment_id=0,
305 | mask_padding_with_zero=True):
306 | """ Loads a data file into a list of `InputBatch`s
307 | `cls_token_at_end` defines the location of the CLS token:
308 | - False (Default, BERT/XLM pattern): [CLS] + A + [SEP] + B + [SEP]
309 | - True (XLNet/GPT pattern): A + [SEP] + B + [SEP] + [CLS]
310 | `cls_token_segment_id` defines the segment id associated to the CLS token (0 for BERT, 2 for XLNet)
311 | """
312 | 
313 | label_map = {label : i for i, label in enumerate(label_list)}
314 | 
315 | features = []
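# The loop below performs the standard GLUE-style conversion: tokenize text_a
# (and text_b when present), truncate to max_seq_length, add the [CLS]/[SEP]
# special tokens with their segment ids, map tokens to vocabulary ids, and
# zero-pad every field to a fixed length.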
316 | for (ex_index, example) in enumerate(examples):
317 | if ex_index % 10000 == 0:
318 | logger.info("Writing example %d of %d" % (ex_index, len(examples)))
319 | 
320 | tokens_a = tokenizer.tokenize(example.text_a)
321 | 
322 | tokens_b = None
323 | if example.text_b:
324 | tokens_b = tokenizer.tokenize(example.text_b)
325 | # Modifies `tokens_a` and `tokens_b` in place so that the total
326 | # length is less than the specified length.
327 | # Account for [CLS], [SEP], [SEP] with "- 3"
328 | _truncate_seq_pair(tokens_a, tokens_b, max_seq_length - 3)
329 | else:
330 | # Account for [CLS] and [SEP] with "- 2"
331 | if len(tokens_a) > max_seq_length - 2:
332 | tokens_a = tokens_a[:(max_seq_length - 2)]
333 | 
334 | # The convention in BERT is:
335 | # (a) For sequence pairs:
336 | # tokens: [CLS] is this jack ##son ##ville ? [SEP] no it is not . [SEP]
337 | # type_ids: 0 0 0 0 0 0 0 0 1 1 1 1 1 1
338 | # (b) For single sequences:
339 | # tokens: [CLS] the dog is hairy . [SEP]
340 | # type_ids: 0 0 0 0 0 0 0
341 | #
342 | # Where "type_ids" are used to indicate whether this is the first
343 | # sequence or the second sequence. The embedding vectors for `type=0` and
344 | # `type=1` were learned during pre-training and are added to the wordpiece
345 | # embedding vector (and position vector). This is not *strictly* necessary
346 | # since the [SEP] token unambiguously separates the sequences, but it makes
347 | # it easier for the model to learn the concept of sequences.
348 | #
349 | # For classification tasks, the first vector (corresponding to [CLS]) is
350 | # used as the "sentence vector". Note that this only makes sense because
351 | # the entire model is fine-tuned.
352 | tokens = tokens_a + [sep_token]
353 | segment_ids = [sequence_a_segment_id] * len(tokens)
354 | 
355 | if tokens_b:
356 | tokens += tokens_b + [sep_token]
357 | segment_ids += [sequence_b_segment_id] * (len(tokens_b) + 1)
358 | 
359 | if cls_token_at_end:
360 | tokens = tokens + [cls_token]
361 | segment_ids = segment_ids + [cls_token_segment_id]
362 | else:
363 | tokens = [cls_token] + tokens
364 | segment_ids = [cls_token_segment_id] + segment_ids
365 | 
366 | input_ids = tokenizer.convert_tokens_to_ids(tokens)
367 | 
368 | # The mask has 1 for real tokens and 0 for padding tokens. Only real
369 | # tokens are attended to.
370 | input_mask = [1 if mask_padding_with_zero else 0] * len(input_ids)
371 | 
372 | # Zero-pad up to the sequence length.
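# A small illustration of the padding below (a hedged sketch, right-side
# padding): with max_seq_length = 8 and tokens = [CLS] good food [SEP],
# bert-base-uncased maps [CLS] -> 101 and [SEP] -> 102 and pads with 0, so
#   input_ids  -> [101, id_good, id_food, 102, 0, 0, 0, 0]
#   input_mask -> [1, 1, 1, 1, 0, 0, 0, 0]
# where id_good/id_food stand in for the real vocabulary ids.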
373 | padding_length = max_seq_length - len(input_ids) 374 | if pad_on_left: 375 | input_ids = ([pad_token] * padding_length) + input_ids 376 | input_mask = ([0 if mask_padding_with_zero else 1] * padding_length) + input_mask 377 | segment_ids = ([pad_token_segment_id] * padding_length) + segment_ids 378 | else: 379 | input_ids = input_ids + ([pad_token] * padding_length) 380 | input_mask = input_mask + ([0 if mask_padding_with_zero else 1] * padding_length) 381 | segment_ids = segment_ids + ([pad_token_segment_id] * padding_length) 382 | 383 | assert len(input_ids) == max_seq_length 384 | assert len(input_mask) == max_seq_length 385 | assert len(segment_ids) == max_seq_length 386 | 387 | if output_mode == "classification": 388 | label_id = label_map[example.label] 389 | elif output_mode == "regression": 390 | label_id = float(example.label) 391 | else: 392 | raise KeyError(output_mode) 393 | 394 | if ex_index < 5: 395 | logger.info("*** Example ***") 396 | logger.info("guid: %s" % (example.guid)) 397 | logger.info("tokens: %s" % " ".join( 398 | [str(x) for x in tokens])) 399 | logger.info("input_ids: %s" % " ".join([str(x) for x in input_ids])) 400 | logger.info("input_mask: %s" % " ".join([str(x) for x in input_mask])) 401 | logger.info("segment_ids: %s" % " ".join([str(x) for x in segment_ids])) 402 | logger.info("label: %s (id = %d)" % (example.label, label_id)) 403 | 404 | features.append( 405 | InputFeatures(input_ids=input_ids, 406 | input_mask=input_mask, 407 | segment_ids=segment_ids, 408 | label_id=label_id)) 409 | return features 410 | 411 | 412 | def match_ts(gold_ts_sequence, pred_ts_sequence): 413 | """ 414 | calculate the number of correctly predicted targeted sentiment 415 | :param gold_ts_sequence: gold standard targeted sentiment sequence 416 | :param pred_ts_sequence: predicted targeted sentiment sequence 417 | :return: 418 | """ 419 | # positive, negative and neutral 420 | tag2tagid = {'POS': 0, 'NEG': 1, 'NEU': 2} 421 | hit_count, gold_count, pred_count = np.zeros(3), np.zeros(3), np.zeros(3) 422 | for t in gold_ts_sequence: 423 | #print(t) 424 | ts_tag = t[2] 425 | tid = tag2tagid[ts_tag] 426 | gold_count[tid] += 1 427 | for t in pred_ts_sequence: 428 | ts_tag = t[2] 429 | tid = tag2tagid[ts_tag] 430 | if t in gold_ts_sequence: 431 | hit_count[tid] += 1 432 | pred_count[tid] += 1 433 | return hit_count, gold_count, pred_count 434 | 435 | 436 | def compute_metrics_absa(preds, labels, all_evaluate_label_ids, tagging_schema): 437 | if tagging_schema == 'BIEOS': 438 | absa_label_vocab = {'O': 0, 'EQ': 1, 'B-POS': 2, 'I-POS': 3, 'E-POS': 4, 'S-POS': 5, 439 | 'B-NEG': 6, 'I-NEG': 7, 'E-NEG': 8, 'S-NEG': 9, 440 | 'B-NEU': 10, 'I-NEU': 11, 'E-NEU': 12, 'S-NEU': 13} 441 | elif tagging_schema == 'BIO': 442 | absa_label_vocab = {'O': 0, 'EQ': 1, 'B-POS': 2, 'I-POS': 3, 443 | 'B-NEG': 4, 'I-NEG': 5, 'B-NEU': 6, 'I-NEU': 7} 444 | elif tagging_schema == 'OT': 445 | absa_label_vocab = {'O': 0, 'EQ': 1, 'T-POS': 2, 'T-NEG': 3, 'T-NEU': 4} 446 | else: 447 | raise Exception("Invalid tagging schema %s..." 
448 | absa_id2tag = {}
449 | for k in absa_label_vocab:
450 | v = absa_label_vocab[k]
451 | absa_id2tag[v] = k
452 | # number of true positive, gold standard, predicted targeted sentiment
453 | n_tp_ts, n_gold_ts, n_pred_ts = np.zeros(3), np.zeros(3), np.zeros(3)
454 | # precision, recall and f1 for aspect-based sentiment analysis
455 | ts_precision, ts_recall, ts_f1 = np.zeros(3), np.zeros(3), np.zeros(3)
456 | n_samples = len(all_evaluate_label_ids)
457 | pred_y, gold_y = [], []
458 | class_count = np.zeros(3)
459 | for i in range(n_samples):
460 | evaluate_label_ids = all_evaluate_label_ids[i]
461 | pred_labels = preds[i][evaluate_label_ids]
462 | gold_labels = labels[i][evaluate_label_ids]
463 | assert len(pred_labels) == len(gold_labels)
464 | # here, no EQ tag will be induced
465 | pred_tags = [absa_id2tag[label] for label in pred_labels]
466 | gold_tags = [absa_id2tag[label] for label in gold_labels]
467 | 
468 | if tagging_schema == 'OT':
469 | gold_tags = ot2bieos_ts(gold_tags)
470 | pred_tags = ot2bieos_ts(pred_tags)
471 | elif tagging_schema == 'BIO':
472 | gold_tags = ot2bieos_ts(bio2ot_ts(gold_tags))
473 | pred_tags = ot2bieos_ts(bio2ot_ts(pred_tags))
474 | else:
475 | # current tagging schema is BIEOS, do nothing
476 | pass
477 | g_ts_sequence, p_ts_sequence = tag2ts(ts_tag_sequence=gold_tags), tag2ts(ts_tag_sequence=pred_tags)
478 | 
479 | hit_ts_count, gold_ts_count, pred_ts_count = match_ts(gold_ts_sequence=g_ts_sequence,
480 | pred_ts_sequence=p_ts_sequence)
481 | n_tp_ts += hit_ts_count
482 | n_gold_ts += gold_ts_count
483 | n_pred_ts += pred_ts_count
484 | for (b, e, s) in g_ts_sequence:
485 | if s == 'POS':
486 | class_count[0] += 1
487 | if s == 'NEG':
488 | class_count[1] += 1
489 | if s == 'NEU':
490 | class_count[2] += 1
491 | for i in range(3):
492 | n_ts = n_tp_ts[i]
493 | n_g_ts = n_gold_ts[i]
494 | n_p_ts = n_pred_ts[i]
495 | ts_precision[i] = float(n_ts) / float(n_p_ts + SMALL_POSITIVE_CONST)
496 | ts_recall[i] = float(n_ts) / float(n_g_ts + SMALL_POSITIVE_CONST)
497 | ts_f1[i] = 2 * ts_precision[i] * ts_recall[i] / (ts_precision[i] + ts_recall[i] + SMALL_POSITIVE_CONST)
498 | 
499 | macro_f1 = ts_f1.mean()
500 | 
501 | # calculate micro-average scores for ts task
502 | # TP
503 | n_tp_total = sum(n_tp_ts)
504 | # TP + FN
505 | n_g_total = sum(n_gold_ts)
506 | print("class_count:", class_count)
507 | 
508 | # TP + FP
509 | n_p_total = sum(n_pred_ts)
510 | micro_p = float(n_tp_total) / (n_p_total + SMALL_POSITIVE_CONST)
511 | micro_r = float(n_tp_total) / (n_g_total + SMALL_POSITIVE_CONST)
512 | micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r + SMALL_POSITIVE_CONST)
513 | scores = {'macro-f1': macro_f1, 'precision': micro_p, "recall": micro_r, "micro-f1": micro_f1}
514 | return scores
515 | 
516 | 
517 | processors = {
518 | "laptop14": ABSAProcessor,
519 | "rest_total": ABSAProcessor,
520 | "rest_total_revised": ABSAProcessor,
521 | "rest14": ABSAProcessor,
522 | "rest15": ABSAProcessor,
523 | "rest16": ABSAProcessor,
524 | }
525 | 
526 | output_modes = {
527 | "cola": "classification",
528 | "mnli": "classification",
529 | "mnli-mm": "classification",
530 | "mrpc": "classification",
531 | "sst-2": "classification",
532 | "sts-b": "regression",
533 | "qqp": "classification",
534 | "qnli": "classification",
535 | "rte": "classification",
536 | "wnli": "classification",
537 | "laptop14": "classification",
538 | "rest_total": "classification",
539 | "rest14": "classification",
540 | "rest15": "classification",
541 | "rest16": "classification",
542 | "rest_total_revised": "classification",
| "rest_total_revised": "classification", 543 | } 544 | -------------------------------------------------------------------------------- /absa_layer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from transformers import BertModel, XLNetModel 4 | from seq_utils import * 5 | from bert import BertPreTrainedModel, XLNetPreTrainedModel 6 | from torch.nn import CrossEntropyLoss 7 | 8 | 9 | class TaggerConfig: 10 | def __init__(self): 11 | self.hidden_dropout_prob = 0.1 12 | self.hidden_size = 768 13 | self.n_rnn_layers = 1 # not used if tagger is non-RNN model 14 | self.bidirectional = True # not used if tagger is non-RNN model 15 | 16 | 17 | class SAN(nn.Module): 18 | def __init__(self, d_model, nhead, dropout=0.1): 19 | super(SAN, self).__init__() 20 | self.d_model = d_model 21 | self.nhead = nhead 22 | self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout) 23 | self.dropout = nn.Dropout(p=dropout) 24 | self.norm = nn.LayerNorm(d_model) 25 | 26 | def forward(self, src, src_mask=None, src_key_padding_mask=None): 27 | """ 28 | 29 | :param src: 30 | :param src_mask: 31 | :param src_key_padding_mask: 32 | :return: 33 | """ 34 | src2, _ = self.self_attn(src, src, src, attn_mask=src_mask, key_padding_mask=src_key_padding_mask) 35 | src = src + self.dropout(src2) 36 | # apply layer normalization 37 | src = self.norm(src) 38 | return src 39 | 40 | 41 | class GRU(nn.Module): 42 | # customized GRU with layer normalization 43 | def __init__(self, input_size, hidden_size, bidirectional=True): 44 | """ 45 | 46 | :param input_size: 47 | :param hidden_size: 48 | :param bidirectional: 49 | """ 50 | super(GRU, self).__init__() 51 | self.input_size = input_size 52 | if bidirectional: 53 | self.hidden_size = hidden_size // 2 54 | else: 55 | self.hidden_size = hidden_size 56 | self.bidirectional = bidirectional 57 | self.Wxrz = nn.Linear(in_features=self.input_size, out_features=2*self.hidden_size, bias=True) 58 | self.Whrz = nn.Linear(in_features=self.hidden_size, out_features=2*self.hidden_size, bias=True) 59 | self.Wxn = nn.Linear(in_features=self.input_size, out_features=self.hidden_size, bias=True) 60 | self.Whn = nn.Linear(in_features=self.hidden_size, out_features=self.hidden_size, bias=True) 61 | self.LNx1 = nn.LayerNorm(2*self.hidden_size) 62 | self.LNh1 = nn.LayerNorm(2*self.hidden_size) 63 | self.LNx2 = nn.LayerNorm(self.hidden_size) 64 | self.LNh2 = nn.LayerNorm(self.hidden_size) 65 | 66 | def forward(self, x): 67 | """ 68 | 69 | :param x: input tensor, shape: (batch_size, seq_len, input_size) 70 | :return: 71 | """ 72 | def recurrence(xt, htm1): 73 | """ 74 | 75 | :param xt: current input 76 | :param htm1: previous hidden state 77 | :return: 78 | """ 79 | gates_rz = torch.sigmoid(self.LNx1(self.Wxrz(xt)) + self.LNh1(self.Whrz(htm1))) 80 | rt, zt = gates_rz.chunk(2, 1) 81 | nt = torch.tanh(self.LNx2(self.Wxn(xt))+rt*self.LNh2(self.Whn(htm1))) 82 | ht = (1.0-zt) * nt + zt * htm1 83 | return ht 84 | 85 | steps = range(x.size(1)) 86 | bs = x.size(0) 87 | hidden = self.init_hidden(bs) 88 | # shape: (seq_len, bsz, input_size) 89 | input = x.transpose(0, 1) 90 | output = [] 91 | for t in steps: 92 | hidden = recurrence(input[t], hidden) 93 | output.append(hidden) 94 | # shape: (bsz, seq_len, input_size) 95 | output = torch.stack(output, 0).transpose(0, 1) 96 | 97 | if self.bidirectional: 98 | output_b = [] 99 | hidden_b = self.init_hidden(bs) 100 | for t in steps[::-1]: 101 | hidden_b = 
102 | output_b.append(hidden_b)
103 | output_b = output_b[::-1]
104 | output_b = torch.stack(output_b, 0).transpose(0, 1)
105 | output = torch.cat([output, output_b], dim=-1)
106 | return output, None
107 | 
108 | def init_hidden(self, bs):
109 | h_0 = torch.zeros(bs, self.hidden_size).cuda()
110 | return h_0
111 | 
112 | 
113 | class CRF(nn.Module):
114 | # borrow the code from
115 | # https://github.com/allenai/allennlp/blob/master/allennlp/modules/conditional_random_field.py
116 | def __init__(self, num_tags, constraints=None, include_start_end_transitions=None):
117 | """
118 | linear-chain CRF layer
119 | :param num_tags: number of output tags
120 | :param constraints: optional transition constraints (the mask below defaults to all ones)
121 | :param include_start_end_transitions: whether to learn start/end transition scores
122 | """
123 | super(CRF, self).__init__()
124 | self.num_tags = num_tags
125 | self.include_start_end_transitions = include_start_end_transitions
126 | self.transitions = nn.Parameter(torch.Tensor(self.num_tags, self.num_tags))
127 | constraint_mask = torch.Tensor(self.num_tags+2, self.num_tags+2).fill_(1.)
128 | if include_start_end_transitions:
129 | self.start_transitions = nn.Parameter(torch.Tensor(num_tags))
130 | self.end_transitions = nn.Parameter(torch.Tensor(num_tags))
131 | # register the constraint_mask
132 | self.constraint_mask = nn.Parameter(constraint_mask, requires_grad=False)
133 | self.reset_parameters()
134 | 
135 | def forward(self, inputs, tags, mask=None):
136 | """
137 | compute the log-likelihood of the given tag sequences
138 | :param inputs: (bsz, seq_len, num_tags), logits calculated from a linear layer
139 | :param tags: (bsz, seq_len)
140 | :param mask: (bsz, seq_len), mask for the padding token
141 | :return: sum of the per-sequence log-likelihoods over the batch
142 | """
143 | if mask is None:
144 | mask = torch.ones(*tags.size(), dtype=torch.long)
145 | log_denominator = self._input_likelihood(inputs, mask)
146 | log_numerator = self._joint_likelihood(inputs, tags, mask)
147 | return torch.sum(log_numerator - log_denominator)
148 | 
149 | def reset_parameters(self):
150 | """
151 | initialize the parameters in CRF
152 | :return:
153 | """
154 | nn.init.xavier_normal_(self.transitions)
155 | if self.include_start_end_transitions:
156 | nn.init.normal_(self.start_transitions)
157 | nn.init.normal_(self.end_transitions)
158 | 
159 | def _input_likelihood(self, logits, mask):
160 | """
161 | compute the log partition function via the forward algorithm
162 | :param logits: emission score calculated by a linear layer, shape: (batch_size, seq_len, num_tags)
163 | :param mask: (batch_size, seq_len), mask for the padding token
164 | :return: (batch_size,), the log-sum of scores over all possible tag sequences
165 | """
166 | bsz, seq_len, num_tags = logits.size()
167 | # Transpose batch size and sequence dimensions
168 | mask = mask.float().transpose(0, 1).contiguous()
169 | logits = logits.transpose(0, 1).contiguous()
170 | 
171 | # Initial alpha is the (batch_size, num_tags) tensor of likelihoods combining the
172 | # transitions to the initial states and the logits for the first timestep.
173 | if self.include_start_end_transitions:
174 | alpha = self.start_transitions.view(1, num_tags) + logits[0]
175 | else:
176 | alpha = logits[0]
177 | 
178 | for t in range(1, seq_len):
179 | # iteration starts from 1
180 | emit_scores = logits[t].view(bsz, 1, num_tags)
181 | transition_scores = self.transitions.view(1, num_tags, num_tags)
182 | broadcast_alpha = alpha.view(bsz, num_tags, 1)
183 | 
184 | # calculate the likelihood
185 | inner = broadcast_alpha + emit_scores + transition_scores
186 | 
187 | # mask the padded positions: when a padded token is met, retain the previous alpha
188 | alpha = (logsumexp(inner, 1) * mask[t].view(bsz, 1) + alpha * (1 - mask[t]).view(bsz, 1))
189 | # Every sequence needs to end with a transition to the stop_tag.
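# At this point alpha[b, j] holds the log-sum of the scores of all tag paths
# of sequence b that end in tag j; adding the (optional) end transitions and
# applying logsumexp over the tag dimension below yields the log partition
# function used as the denominator in forward().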
190 | if self.include_start_end_transitions:
191 | stops = alpha + self.end_transitions.view(1, num_tags)
192 | else:
193 | stops = alpha
194 | 
195 | # Finally we log_sum_exp along the num_tags dim, result is (batch_size,)
196 | return logsumexp(stops)
197 | 
198 | def _joint_likelihood(self, logits, tags, mask):
199 | """
200 | calculate the likelihood for the input tag sequence
201 | :param logits: emission scores, shape: (bsz, seq_len, num_tags)
202 | :param tags: shape: (bsz, seq_len)
203 | :param mask: shape: (bsz, seq_len)
204 | :return: (bsz,), the score of the given tag sequence for each example
205 | """
206 | bsz, seq_len, _ = logits.size()
207 | 
208 | # Transpose batch size and sequence dimensions:
209 | logits = logits.transpose(0, 1).contiguous()
210 | mask = mask.float().transpose(0, 1).contiguous()
211 | tags = tags.transpose(0, 1).contiguous()
212 | 
213 | # Start with the transition scores from start_tag to the first tag in each input
214 | if self.include_start_end_transitions:
215 | score = self.start_transitions.index_select(0, tags[0])
216 | else:
217 | score = 0.0
218 | 
219 | for t in range(seq_len-1):
220 | current_tag, next_tag = tags[t], tags[t+1]
221 | # The scores for transitioning from current_tag to next_tag
222 | transition_score = self.transitions[current_tag.view(-1), next_tag.view(-1)]
223 | 
224 | # The score for using current_tag
225 | emit_score = logits[t].gather(1, current_tag.view(bsz, 1)).squeeze(1)
226 | 
227 | score = score + transition_score * mask[t+1] + emit_score * mask[t]
228 | 
229 | last_tag_index = mask.sum(0).long() - 1
230 | last_tags = tags.gather(0, last_tag_index.view(1, bsz)).squeeze(0)
231 | 
232 | # Compute score of transitioning to `stop_tag` from each "last tag".
233 | if self.include_start_end_transitions:
234 | last_transition_score = self.end_transitions.index_select(0, last_tags)
235 | else:
236 | last_transition_score = 0.0
237 | 
238 | last_inputs = logits[-1] # (batch_size, num_tags)
239 | last_input_score = last_inputs.gather(1, last_tags.view(-1, 1)) # (batch_size, 1)
240 | last_input_score = last_input_score.squeeze() # (batch_size,)
241 | 
242 | score = score + last_transition_score + last_input_score * mask[-1]
243 | 
244 | return score
245 | 
246 | def viterbi_tags(self, logits, mask):
247 | """
248 | decode the most likely tag sequence for each example with Viterbi
249 | :param logits: (bsz, seq_len, num_tags), emission scores
250 | :param mask: (bsz, seq_len), mask for the padding token
251 | :return: a list of best tag-index paths, one per example
252 | """
253 | _, max_seq_len, num_tags = logits.size()
254 | 
255 | # Get the tensors out of the variables
256 | logits, mask = logits.data, mask.data
257 | 
258 | # Augment transitions matrix with start and end transitions
259 | start_tag = num_tags
260 | end_tag = num_tags + 1
261 | transitions = torch.Tensor(num_tags + 2, num_tags + 2).fill_(-10000.)
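# The two extra rows/columns added here act as virtual START (index num_tags)
# and END (index num_tags + 1) tags; filling with -10000 makes every
# transition effectively forbidden until the learned, constraint-masked
# scores are copied into the matrix below.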
262 | 263 | # Apply transition constraints 264 | constrained_transitions = ( 265 | self.transitions * self.constraint_mask[:num_tags, :num_tags] + 266 | -10000.0 * (1 - self.constraint_mask[:num_tags, :num_tags]) 267 | ) 268 | 269 | transitions[:num_tags, :num_tags] = constrained_transitions.data 270 | 271 | if self.include_start_end_transitions: 272 | transitions[start_tag, :num_tags] = ( 273 | self.start_transitions.detach() * self.constraint_mask[start_tag, :num_tags].data + 274 | -10000.0 * (1 - self.constraint_mask[start_tag, :num_tags].detach()) 275 | ) 276 | transitions[:num_tags, end_tag] = ( 277 | self.end_transitions.detach() * self.constraint_mask[:num_tags, end_tag].data + 278 | -10000.0 * (1 - self.constraint_mask[:num_tags, end_tag].detach()) 279 | ) 280 | else: 281 | transitions[start_tag, :num_tags] = (-10000.0 * 282 | (1 - self.constraint_mask[start_tag, :num_tags].detach())) 283 | transitions[:num_tags, end_tag] = -10000.0 * (1 - self.constraint_mask[:num_tags, end_tag].detach()) 284 | 285 | best_paths = [] 286 | # Pad the max sequence length by 2 to account for start_tag + end_tag. 287 | tag_sequence = torch.Tensor(max_seq_len + 2, num_tags + 2) 288 | 289 | for prediction, prediction_mask in zip(logits, mask): 290 | # perform viterbi decoding sample by sample 291 | seq_len = torch.sum(prediction_mask) 292 | # Start with everything totally unlikely 293 | tag_sequence.fill_(-10000.) 294 | # At timestep 0 we must have the START_TAG 295 | tag_sequence[0, start_tag] = 0. 296 | # At steps 1, ..., sequence_length we just use the incoming prediction 297 | tag_sequence[1:(seq_len + 1), :num_tags] = prediction[:seq_len] 298 | # And at the last timestep we must have the END_TAG 299 | tag_sequence[seq_len + 1, end_tag] = 0. 300 | viterbi_path = viterbi_decode(tag_sequence[:(seq_len + 2)], transitions) 301 | viterbi_path = viterbi_path[1:-1] 302 | best_paths.append(viterbi_path) 303 | return best_paths 304 | 305 | 306 | class LSTM(nn.Module): 307 | # customized LSTM with layer normalization 308 | def __init__(self, input_size, hidden_size, bidirectional=True): 309 | """ 310 | 311 | :param input_size: 312 | :param hidden_size: 313 | :param bidirectional: 314 | """ 315 | super(LSTM, self).__init__() 316 | self.input_size = input_size 317 | if bidirectional: 318 | self.hidden_size = hidden_size // 2 319 | else: 320 | self.hidden_size = hidden_size 321 | self.bidirectional = bidirectional 322 | self.LNx = nn.LayerNorm(4*self.hidden_size) 323 | self.LNh = nn.LayerNorm(4*self.hidden_size) 324 | self.LNc = nn.LayerNorm(self.hidden_size) 325 | self.Wx = nn.Linear(in_features=self.input_size, out_features=4*self.hidden_size, bias=True) 326 | self.Wh = nn.Linear(in_features=self.hidden_size, out_features=4*self.hidden_size, bias=True) 327 | 328 | def forward(self, x): 329 | """ 330 | 331 | :param x: input, shape: (batch_size, seq_len, input_size) 332 | :return: 333 | """ 334 | def recurrence(xt, hidden): 335 | """ 336 | recurrence function enhanced with layer norm 337 | :param input: input to the current cell 338 | :param hidden: 339 | :return: 340 | """ 341 | htm1, ctm1 = hidden 342 | gates = self.LNx(self.Wx(xt)) + self.LNh(self.Wh(htm1)) 343 | it, ft, gt, ot = gates.chunk(4, 1) 344 | it = torch.sigmoid(it) 345 | ft = torch.sigmoid(ft) 346 | gt = torch.tanh(gt) 347 | ot = torch.sigmoid(ot) 348 | ct = (ft * ctm1) + (it * gt) 349 | ht = ot * torch.tanh(self.LNc(ct)) # n_b x hidden_dim 350 | 351 | return ht, ct 352 | output = [] 353 | # sequence_length 354 | steps = range(x.size(1)) 355 | 
hidden = self.init_hidden(x.size(0)) 356 | # change to: (seq_len, bs, hidden_size) 357 | input = x.transpose(0, 1) 358 | for t in steps: 359 | hidden = recurrence(input[t], hidden) 360 | output.append(hidden[0]) 361 | # (bs, seq_len, hidden_size) 362 | output = torch.stack(output, 0).transpose(0, 1) 363 | 364 | if self.bidirectional: 365 | hidden_b = self.init_hidden(x.size(0)) 366 | output_b = [] 367 | for t in steps[::-1]: 368 | hidden_b = recurrence(input[t], hidden_b) 369 | output_b.append(hidden_b[0]) 370 | output_b = output_b[::-1] 371 | output_b = torch.stack(output_b, 0).transpose(0, 1) 372 | output = torch.cat([output, output_b], dim=-1) 373 | return output, None 374 | 375 | def init_hidden(self, bs): 376 | h_0 = torch.zeros(bs, self.hidden_size).cuda() 377 | c_0 = torch.zeros(bs, self.hidden_size).cuda() 378 | return h_0, c_0 379 | 380 | 381 | class BertABSATagger(BertPreTrainedModel): 382 | def __init__(self, bert_config): 383 | """ 384 | 385 | :param bert_config: configuration for bert model 386 | """ 387 | super(BertABSATagger, self).__init__(bert_config) 388 | self.num_labels = bert_config.num_labels 389 | self.tagger_config = TaggerConfig() 390 | self.tagger_config.absa_type = bert_config.absa_type.lower() 391 | if bert_config.tfm_mode == 'finetune': 392 | # initialized with pre-trained BERT and perform finetuning 393 | # print("Fine-tuning the pre-trained BERT...") 394 | self.bert = BertModel(bert_config) 395 | else: 396 | raise Exception("Invalid transformer mode %s!!!" % bert_config.tfm_mode) 397 | self.bert_dropout = nn.Dropout(bert_config.hidden_dropout_prob) 398 | # fix the parameters in BERT and regard it as feature extractor 399 | if bert_config.fix_tfm: 400 | # fix the parameters of the (pre-trained or randomly initialized) transformers during fine-tuning 401 | for p in self.bert.parameters(): 402 | p.requires_grad = False 403 | 404 | self.tagger = None 405 | if self.tagger_config.absa_type == 'linear': 406 | # hidden size at the penultimate layer 407 | penultimate_hidden_size = bert_config.hidden_size 408 | else: 409 | self.tagger_dropout = nn.Dropout(self.tagger_config.hidden_dropout_prob) 410 | if self.tagger_config.absa_type == 'lstm': 411 | self.tagger = LSTM(input_size=bert_config.hidden_size, 412 | hidden_size=self.tagger_config.hidden_size, 413 | bidirectional=self.tagger_config.bidirectional) 414 | elif self.tagger_config.absa_type == 'gru': 415 | self.tagger = GRU(input_size=bert_config.hidden_size, 416 | hidden_size=self.tagger_config.hidden_size, 417 | bidirectional=self.tagger_config.bidirectional) 418 | elif self.tagger_config.absa_type == 'tfm': 419 | # transformer encoder layer 420 | self.tagger = nn.TransformerEncoderLayer(d_model=bert_config.hidden_size, 421 | nhead=12, 422 | dim_feedforward=4*bert_config.hidden_size, 423 | dropout=0.1) 424 | elif self.tagger_config.absa_type == 'san': 425 | # vanilla self attention networks 426 | self.tagger = SAN(d_model=bert_config.hidden_size, nhead=12, dropout=0.1) 427 | elif self.tagger_config.absa_type == 'crf': 428 | self.tagger = CRF(num_tags=self.num_labels) 429 | else: 430 | raise Exception('Unimplemented downstream tagger %s...' 
% self.tagger_config.absa_type) 431 | penultimate_hidden_size = self.tagger_config.hidden_size 432 | self.classifier = nn.Linear(penultimate_hidden_size, bert_config.num_labels) 433 | 434 | def forward(self, input_ids, token_type_ids=None, attention_mask=None, labels=None, 435 | position_ids=None, head_mask=None): 436 | outputs = self.bert(input_ids, position_ids=position_ids, token_type_ids=token_type_ids, 437 | attention_mask=attention_mask, head_mask=head_mask) 438 | # the hidden states of the last Bert Layer, shape: (bsz, seq_len, hsz) 439 | tagger_input = outputs[0] 440 | tagger_input = self.bert_dropout(tagger_input) 441 | #print("tagger_input.shape:", tagger_input.shape) 442 | if self.tagger is None or self.tagger_config.absa_type == 'crf': 443 | # regard classifier as the tagger 444 | logits = self.classifier(tagger_input) 445 | else: 446 | if self.tagger_config.absa_type == 'lstm': 447 | # customized LSTM 448 | classifier_input, _ = self.tagger(tagger_input) 449 | elif self.tagger_config.absa_type == 'gru': 450 | # customized GRU 451 | classifier_input, _ = self.tagger(tagger_input) 452 | elif self.tagger_config.absa_type == 'san' or self.tagger_config.absa_type == 'tfm': 453 | # vanilla self-attention networks or transformer 454 | # adapt the input format for the transformer or self attention networks 455 | tagger_input = tagger_input.transpose(0, 1) 456 | classifier_input = self.tagger(tagger_input) 457 | classifier_input = classifier_input.transpose(0, 1) 458 | else: 459 | raise Exception("Unimplemented downstream tagger %s..." % self.tagger_config.absa_type) 460 | classifier_input = self.tagger_dropout(classifier_input) 461 | logits = self.classifier(classifier_input) 462 | outputs = (logits,) + outputs[2:] 463 | 464 | if labels is not None: 465 | if self.tagger_config.absa_type != 'crf': 466 | loss_fct = CrossEntropyLoss() 467 | if attention_mask is not None: 468 | active_loss = attention_mask.view(-1) == 1 469 | active_logits = logits.view(-1, self.num_labels)[active_loss] 470 | active_labels = labels.view(-1)[active_loss] 471 | loss = loss_fct(active_logits, active_labels) 472 | else: 473 | loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1)) 474 | outputs = (loss,) + outputs 475 | else: 476 | log_likelihood = self.tagger(inputs=logits, tags=labels, mask=attention_mask) 477 | loss = -log_likelihood 478 | outputs = (loss,) + outputs 479 | return outputs 480 | 481 | 482 | class XLNetABSATagger(XLNetPreTrainedModel): 483 | # TODO 484 | def __init__(self, xlnet_config): 485 | super(XLNetABSATagger, self).__init__(xlnet_config) 486 | self.num_labels = xlnet_config.num_labels 487 | self.xlnet = XLNetModel(xlnet_config) 488 | self.tagger_config = xlnet_config.absa_tagger_config 489 | self.tagger = None 490 | if self.tagger_config.tagger == '': 491 | # hidden size at the penultimate layer 492 | penultimate_hidden_size = xlnet_config.d_model 493 | else: 494 | self.tagger_dropout = nn.Dropout(self.tagger_config.hidden_dropout_prob) 495 | if self.tagger_config.tagger in ['RNN', 'LSTM', 'GRU']: 496 | # 2-layer bi-directional rnn decoder 497 | self.tagger = getattr(nn, self.tagger_config.tagger)( 498 | input_size=xlnet_config.d_model, hidden_size=self.tagger_config.hidden_size//2, 499 | num_layers=self.tagger_config.n_rnn_layers, batch_first=True, bidirectional=True) 500 | elif self.tagger_config.tagger in ['CRF']: 501 | # crf tagger 502 | raise Exception("Unimplemented now!!") 503 | else: 504 | raise Exception('Unimplemented tagger %s...' 
% self.tagger_config.tagger) 505 | penultimate_hidden_size = self.tagger_config.hidden_size 506 | self.tagger_dropout = nn.Dropout(self.tagger_config.hidden_dropout_prob) 507 | self.classifier = nn.Linear(penultimate_hidden_size, xlnet_config.num_labels) 508 | self.apply(self.init_weights) 509 | 510 | def forward(self, input_ids, token_type_ids=None, input_mask=None, attention_mask=None, mems=None, 511 | perm_mask=None, target_mapping=None, labels=None, head_mask=None): 512 | """ 513 | 514 | :param input_ids: Indices of input sequence tokens in the vocabulary 515 | :param token_type_ids: A parallel sequence of tokens (can be used to indicate various portions of the inputs). 516 | The embeddings from these tokens will be summed with the respective token embeddings 517 | :param input_mask: Mask to avoid performing attention on padding token indices. 518 | :param attention_mask: Mask to avoid performing attention on padding token indices. 519 | :param mems: list of torch.FloatTensor (one for each layer): 520 | that contains pre-computed hidden-states (key and values in the attention blocks) 521 | :param perm_mask: 522 | :param target_mapping: 523 | :param labels: 524 | :param head_mask: 525 | :return: 526 | """ 527 | transformer_outputs = self.xlnet(input_ids, token_type_ids=token_type_ids, 528 | input_mask=input_mask, attention_mask=attention_mask, 529 | mems=mems, perm_mask=perm_mask, target_mapping=target_mapping, 530 | head_mask=head_mask) 531 | # hidden states from the last transformer layer, xlnet has done the dropout, 532 | # no need to do the additional dropout 533 | tagger_input = transformer_outputs[0] 534 | 535 | if self.tagger is None: 536 | # regard classifier as the tagger 537 | logits = self.classifier(tagger_input) 538 | else: 539 | if self.tagger_config.tagger in ['RNN', 'LSTM', 'GRU']: 540 | classifier_input, _= self.tagger(tagger_input) 541 | else: 542 | raise Exception("Unimplemented tagger %s..." 
% self.tagger_config.tagger) 543 | classifier_input = self.tagger_dropout(classifier_input) 544 | logits = self.classifier(classifier_input) 545 | # transformer outputs: (last_hidden_state, mems, hidden_states, attentions) 546 | outputs = (logits,) + transformer_outputs[1:] 547 | 548 | if labels is not None: 549 | loss_fct = CrossEntropyLoss() 550 | if attention_mask is not None: 551 | active_loss = attention_mask.view(-1) == 1 552 | active_logits = logits.view(-1, self.num_labels)[active_loss] 553 | active_labels = labels.view(-1)[active_loss] 554 | loss = loss_fct(active_logits, active_labels) 555 | else: 556 | loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1)) 557 | outputs = (loss,) + outputs 558 | return outputs 559 | -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import torch 4 | import logging 5 | import random 6 | import numpy as np 7 | 8 | from glue_utils import convert_examples_to_seq_features, output_modes, processors, compute_metrics_absa 9 | from tqdm import tqdm, trange 10 | from transformers import BertConfig, BertTokenizer, XLNetConfig, XLNetTokenizer, WEIGHTS_NAME 11 | from transformers import AdamW, get_linear_schedule_with_warmup 12 | from absa_layer import BertABSATagger, XLNetABSATagger 13 | 14 | from torch.utils.data import DataLoader, TensorDataset, RandomSampler, SequentialSampler 15 | from torch.utils.data.distributed import DistributedSampler 16 | import torch.distributed as dist 17 | from tensorboardX import SummaryWriter 18 | 19 | import glob 20 | import json 21 | 22 | logger = logging.getLogger(__name__) 23 | 24 | #ALL_MODELS = sum((tuple(conf.pretrained_config_archive_map.keys()) for conf in (BertConfig, XLNetConfig)), ()) 25 | ALL_MODELS = ( 26 | 'bert-base-uncased', 27 | 'bert-large-uncased', 28 | 'bert-base-cased', 29 | 'bert-large-cased', 30 | 'bert-base-multilingual-uncased', 31 | 'bert-base-multilingual-cased', 32 | 'bert-base-chinese', 33 | 'bert-base-german-cased', 34 | 'bert-large-uncased-whole-word-masking', 35 | 'bert-large-cased-whole-word-masking', 36 | 'bert-large-uncased-whole-word-masking-finetuned-squad', 37 | 'bert-large-cased-whole-word-masking-finetuned-squad', 38 | 'bert-base-cased-finetuned-mrpc', 39 | 'bert-base-german-dbmdz-cased', 40 | 'bert-base-german-dbmdz-uncased', 41 | 'xlnet-base-cased', 42 | 'xlnet-large-cased' 43 | ) 44 | 45 | 46 | MODEL_CLASSES = { 47 | 'bert': (BertConfig, BertABSATagger, BertTokenizer), 48 | 'xlnet': (XLNetConfig, XLNetABSATagger, XLNetTokenizer) 49 | } 50 | 51 | 52 | def set_seed(args): 53 | random.seed(args.seed) 54 | np.random.seed(args.seed) 55 | torch.manual_seed(args.seed) 56 | if args.n_gpu > 0: 57 | torch.cuda.manual_seed_all(args.seed) 58 | 59 | 60 | def init_args(): 61 | parser = argparse.ArgumentParser() 62 | parser.add_argument("--data_dir", default=None, type=str, required=True, 63 | help="The input data dir. 
Should contain the .tsv files (or other data files) for the task.") 64 | parser.add_argument("--model_type", default=None, type=str, required=True, 65 | help="Model type selected in the list: " + ", ".join(MODEL_CLASSES.keys())) 66 | parser.add_argument("--absa_type", default=None, type=str, required=True, 67 | help="Downstream absa layer type selected in the list: [linear, gru, san, tfm, crf]") 68 | parser.add_argument("--tfm_mode", default=None, type=str, required=True, 69 | help="mode of the pre-trained transformer, selected from: [finetune]") 70 | parser.add_argument("--fix_tfm", default=None, type=int, required=True, 71 | help="whether to fix the transformer params or not") 72 | parser.add_argument("--model_name_or_path", default=None, type=str, required=True, 73 | help="Path to pre-trained model or shortcut name selected in the list: " + ", ".join( 74 | ALL_MODELS)) 75 | parser.add_argument("--task_name", default=None, type=str, required=True, 76 | help="The name of the task to train selected in the list: " + ", ".join(processors.keys())) 77 | 78 | ## Other parameters 79 | parser.add_argument("--config_name", default="", type=str, 80 | help="Pretrained config name or path if not the same as model_name") 81 | parser.add_argument("--tokenizer_name", default="", type=str, 82 | help="Pretrained tokenizer name or path if not the same as model_name") 83 | parser.add_argument("--cache_dir", default="", type=str, 84 | help="Where do you want to store the pre-trained models downloaded from s3") 85 | parser.add_argument("--max_seq_length", default=128, type=int, 86 | help="The maximum total input sequence length after tokenization. Sequences longer " 87 | "than this will be truncated, sequences shorter will be padded.") 88 | parser.add_argument("--do_train", action='store_true', 89 | help="Whether to run training.") 90 | parser.add_argument("--do_eval", action='store_true', 91 | help="Whether to run eval on the dev set.") 92 | parser.add_argument("--evaluate_during_training", action='store_true', 93 | help="Run evaluation during training at each logging step.") 94 | parser.add_argument("--do_lower_case", action='store_true', 95 | help="Set this flag if you are using an uncased model.") 96 | 97 | parser.add_argument("--per_gpu_train_batch_size", default=8, type=int, 98 | help="Batch size per GPU/CPU for training.") 99 | parser.add_argument("--per_gpu_eval_batch_size", default=8, type=int, 100 | help="Batch size per GPU/CPU for evaluation.") 101 | parser.add_argument('--gradient_accumulation_steps', type=int, default=1, 102 | help="Number of update steps to accumulate before performing a backward/update pass.") 103 | parser.add_argument("--learning_rate", default=5e-5, type=float, 104 | help="The initial learning rate for Adam.") 105 | parser.add_argument("--weight_decay", default=0.0, type=float, 106 | help="Weight decay if we apply some.") 107 | parser.add_argument("--adam_epsilon", default=1e-8, type=float, 108 | help="Epsilon for Adam optimizer.") 109 | parser.add_argument("--max_grad_norm", default=1.0, type=float, 110 | help="Max gradient norm.") 111 | parser.add_argument("--num_train_epochs", default=3.0, type=float, 112 | help="Total number of training epochs to perform.") 113 | parser.add_argument("--max_steps", default=-1, type=int, 114 | help="If > 0: set total number of training steps to perform.
Overrides num_train_epochs.") 115 | parser.add_argument("--warmup_steps", default=0, type=int, 116 | help="Linear warmup over warmup_steps.") 117 | 118 | parser.add_argument('--logging_steps', type=int, default=50, 119 | help="Log every X update steps.") 120 | parser.add_argument('--save_steps', type=int, default=100, 121 | help="Save checkpoint every X update steps.") 122 | parser.add_argument("--eval_all_checkpoints", action='store_true', 123 | help="Evaluate all checkpoints starting with the same prefix as model_name and ending with step number") 124 | parser.add_argument("--no_cuda", action='store_true', 125 | help="Avoid using CUDA when available") 126 | parser.add_argument('--overwrite_output_dir', action='store_true', 127 | help="Overwrite the content of the output directory") 128 | parser.add_argument('--overwrite_cache', action='store_true', 129 | help="Overwrite the cached training and evaluation sets") 130 | parser.add_argument('--seed', type=int, default=42, 131 | help="random seed for initialization") 132 | parser.add_argument('--tagging_schema', type=str, default='BIEOS') 133 | 134 | parser.add_argument("--overfit", type=int, default=0, help="whether to evaluate overfitting or not") 135 | 136 | parser.add_argument("--local_rank", type=int, default=-1, 137 | help="For distributed training: local_rank") 138 | parser.add_argument('--server_ip', type=str, default='', help="For distant debugging.") 139 | parser.add_argument('--server_port', type=str, default='', help="For distant debugging.") 140 | parser.add_argument('--MASTER_ADDR', type=str) 141 | parser.add_argument('--MASTER_PORT', type=str) 142 | args = parser.parse_args() 143 | output_dir = '%s-%s-%s-%s' % (args.model_type, args.absa_type, args.task_name, args.tfm_mode) 144 | 145 | if args.fix_tfm: 146 | output_dir = '%s-fix' % output_dir 147 | if args.overfit: 148 | output_dir = '%s-overfit' % output_dir 149 | args.max_steps = 3000 150 | args.output_dir = output_dir 151 | return args 152 | 153 | 154 | def train(args, train_dataset, model, tokenizer): 155 | """ Train the model """ 156 | if args.local_rank in [-1, 0]: 157 | tb_writer = SummaryWriter() 158 | 159 | args.train_batch_size = args.per_gpu_train_batch_size * max(1, args.n_gpu) 160 | # draw training samples from the shuffled dataset 161 | train_sampler = RandomSampler(train_dataset) if args.local_rank == -1 else DistributedSampler(train_dataset) 162 | train_dataloader = DataLoader(train_dataset, sampler=train_sampler, batch_size=args.train_batch_size) 163 | 164 | if args.max_steps > 0: 165 | t_total = args.max_steps 166 | args.num_train_epochs = args.max_steps // (len(train_dataloader) // args.gradient_accumulation_steps) + 1 167 | else: 168 | t_total = len(train_dataloader) // args.gradient_accumulation_steps * args.num_train_epochs 169 | 170 | # Prepare optimizer and schedule (linear warmup and decay) 171 | no_decay = ['bias', 'LayerNorm.weight'] 172 | optimizer_grouped_parameters = [ 173 | {'params': [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)], 'weight_decay': args.weight_decay}, 174 | {'params': [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)], 'weight_decay': 0.0} 175 | ] 176 | optimizer = AdamW(optimizer_grouped_parameters, lr=args.learning_rate, eps=args.adam_epsilon) 177 | scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=args.warmup_steps, num_training_steps=t_total) 178 | 179 | # Train!
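# Editor's note (illustrative arithmetic under assumed settings, not part of the original
# source): with train.sh's per_gpu_train_batch_size=16, gradient_accumulation_steps=1 and
# max_steps=1500 on a single GPU, and a hypothetical 3000-example training set:
#   train_batch_size = 16 * 1 = 16, len(train_dataloader) = ceil(3000 / 16) = 188
#   t_total = 1500, num_train_epochs = 1500 // (188 // 1) + 1 = 8
# which matches the "Total optimization steps" logged below.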
180 | 181 | logger.info("***** Running training *****") 182 | logger.info(" Num examples = %d", len(train_dataset)) 183 | logger.info(" Num Epochs = %d", args.num_train_epochs) 184 | logger.info(" Instantaneous batch size per GPU = %d", args.per_gpu_train_batch_size) 185 | logger.info(" Total train batch size (w. parallel, distributed & accumulation) = %d", 186 | args.train_batch_size * args.gradient_accumulation_steps * (torch.distributed.get_world_size() if args.local_rank != -1 else 1)) 187 | logger.info(" Gradient Accumulation steps = %d", args.gradient_accumulation_steps) 188 | logger.info(" Total optimization steps = %d", t_total) 189 | 190 | global_step = 0 191 | tr_loss, logging_loss = 0.0, 0.0 192 | model.zero_grad() 193 | train_iterator = trange(int(args.num_train_epochs), desc="Epoch", disable=args.local_rank not in [-1, 0]) 194 | # set the seed number 195 | set_seed(args) # Added here for reproducibility (even between python 2 and 3) 196 | for _ in train_iterator: 197 | epoch_iterator = tqdm(train_dataloader, desc="Iteration", disable=args.local_rank not in [-1, 0]) 198 | for step, batch in enumerate(epoch_iterator): 199 | model.train() 200 | batch = tuple(t.to(args.device) for t in batch) 201 | inputs = {'input_ids': batch[0], 202 | 'attention_mask': batch[1], 203 | 'token_type_ids': batch[2] if args.model_type in ['bert', 'xlnet'] else None, # XLM doesn't use segment_ids 204 | 'labels': batch[3]} 205 | outputs = model(**inputs) 206 | # loss with attention mask 207 | loss = outputs[0] # model outputs are always tuple in pytorch-transformers (see doc) 208 | 209 | if args.n_gpu > 1: 210 | loss = loss.mean() # mean() to average on multi-gpu parallel training 211 | if args.gradient_accumulation_steps > 1: 212 | loss = loss / args.gradient_accumulation_steps 213 | 214 | 215 | loss.backward() 216 | torch.nn.utils.clip_grad_norm_(model.parameters(), args.max_grad_norm) 217 | 218 | tr_loss += loss.item() 219 | if (step + 1) % args.gradient_accumulation_steps == 0: 220 | optimizer.step() 221 | scheduler.step() # Update learning rate schedule 222 | model.zero_grad() 223 | global_step += 1 224 | 225 | if args.local_rank in [-1, 0] and args.logging_steps > 0 and global_step % args.logging_steps == 0: 226 | # Log metrics 227 | if args.local_rank == -1 and args.evaluate_during_training: # Only evaluate when single GPU otherwise metrics may not average well 228 | results = evaluate(args, model, tokenizer, mode='dev') # 'mode' is required by evaluate() 229 | for key, value in results.items(): 230 | tb_writer.add_scalar('eval_{}'.format(key), value, global_step) 231 | tb_writer.add_scalar('lr', scheduler.get_lr()[0], global_step) 232 | tb_writer.add_scalar('loss', (tr_loss - logging_loss)/args.logging_steps, global_step) 233 | logging_loss = tr_loss 234 | 235 | if args.local_rank in [-1, 0] and args.save_steps > 0 and global_step % args.save_steps == 0: 236 | # Save a model checkpoint every N steps 237 | output_dir = os.path.join(args.output_dir, 'checkpoint-{}'.format(global_step)) 238 | if not os.path.exists(output_dir): 239 | os.makedirs(output_dir) 240 | model_to_save = model.module if hasattr(model, 'module') else model # Take care of distributed/parallel training 241 | model_to_save.save_pretrained(output_dir) 242 | torch.save(args, os.path.join(output_dir, 'training_args.bin')) 243 | logger.info("Saving model checkpoint to %s", output_dir) 244 | 245 | if args.max_steps > 0 and global_step > args.max_steps: 246 | epoch_iterator.close() 247 | break 248 | if args.max_steps > 0 and global_step > args.max_steps: 249 |
train_iterator.close() 250 | break 251 | 252 | if args.local_rank in [-1, 0]: 253 | tb_writer.close() 254 | 255 | return global_step, tr_loss / global_step 256 | 257 | 258 | def evaluate(args, model, tokenizer, mode, prefix=""): 259 | # Loop to handle MNLI double evaluation (matched, mis-matched) 260 | eval_task_names = (args.task_name,) 261 | eval_outputs_dirs = (args.output_dir,) 262 | 263 | results = {} 264 | for eval_task, eval_output_dir in zip(eval_task_names, eval_outputs_dirs): 265 | eval_dataset, eval_evaluate_label_ids = load_and_cache_examples(args, eval_task, tokenizer, mode=mode) 266 | 267 | if not os.path.exists(eval_output_dir) and args.local_rank in [-1, 0]: 268 | os.makedirs(eval_output_dir) 269 | 270 | args.eval_batch_size = args.per_gpu_eval_batch_size * max(1, args.n_gpu) 271 | # Note that DistributedSampler samples randomly 272 | eval_sampler = SequentialSampler(eval_dataset) if args.local_rank == -1 else DistributedSampler(eval_dataset) 273 | eval_dataloader = DataLoader(eval_dataset, sampler=eval_sampler, batch_size=args.eval_batch_size) 274 | 275 | # Eval! 276 | #logger.info("***** Running evaluation on %s.txt *****" % mode) 277 | eval_loss = 0.0 278 | nb_eval_steps = 0 279 | preds = None 280 | out_label_ids = None 281 | crf_logits, crf_mask = [], [] 282 | for batch in tqdm(eval_dataloader, desc="Evaluating"): 283 | model.eval() 284 | batch = tuple(t.to(args.device) for t in batch) 285 | 286 | with torch.no_grad(): 287 | inputs = {'input_ids': batch[0], 288 | 'attention_mask': batch[1], 289 | 'token_type_ids': batch[2] if args.model_type in ['bert', 'xlnet'] else None, # XLM don't use segment_ids 290 | 'labels': batch[3]} 291 | outputs = model(**inputs) 292 | # logits: (bsz, seq_len, label_size) 293 | # here the loss is the masked loss 294 | tmp_eval_loss, logits = outputs[:2] 295 | eval_loss += tmp_eval_loss.mean().item() 296 | 297 | crf_logits.append(logits) 298 | crf_mask.append(batch[1]) 299 | nb_eval_steps += 1 300 | if preds is None: 301 | preds = logits.detach().cpu().numpy() 302 | out_label_ids = inputs['labels'].detach().cpu().numpy() 303 | else: 304 | preds = np.append(preds, logits.detach().cpu().numpy(), axis=0) 305 | out_label_ids = np.append(out_label_ids, inputs['labels'].detach().cpu().numpy(), axis=0) 306 | eval_loss = eval_loss / nb_eval_steps 307 | # argmax operation over the last dimension 308 | if model.tagger_config.absa_type != 'crf': 309 | # greedy decoding 310 | preds = np.argmax(preds, axis=-1) 311 | else: 312 | # viterbi decoding for CRF-based model 313 | crf_logits = torch.cat(crf_logits, dim=0) 314 | crf_mask = torch.cat(crf_mask, dim=0) 315 | preds = model.tagger.viterbi_tags(logits=crf_logits, mask=crf_mask) 316 | result = compute_metrics_absa(preds, out_label_ids, eval_evaluate_label_ids, args.tagging_schema) 317 | result['eval_loss'] = eval_loss 318 | results.update(result) 319 | 320 | output_eval_file = os.path.join(eval_output_dir, "%s_results.txt" % mode) 321 | with open(output_eval_file, "w") as writer: 322 | #logger.info("***** %s results *****" % mode) 323 | for key in sorted(result.keys()): 324 | if 'eval_loss' in key: 325 | logger.info(" %s = %s", key, str(result[key])) 326 | writer.write("%s = %s\n" % (key, str(result[key]))) 327 | #logger.info("***** %s results *****" % mode) 328 | 329 | return results 330 | 331 | 332 | def load_and_cache_examples(args, task, tokenizer, mode='train'): 333 | processor = processors[task]() 334 | # Load data features from cache or dataset file 335 | cached_features_file = 
os.path.join(args.data_dir, 'cached_{}_{}_{}_{}'.format( 336 | mode, 337 | list(filter(None, args.model_name_or_path.split('/'))).pop(), 338 | str(args.max_seq_length), 339 | str(task))) 340 | if os.path.exists(cached_features_file): 341 | print("cached_features_file:", cached_features_file) 342 | features = torch.load(cached_features_file) 343 | else: 344 | #logger.info("Creating features from dataset file at %s", args.data_dir) 345 | label_list = processor.get_labels(args.tagging_schema) 346 | if mode == 'train': 347 | examples = processor.get_train_examples(args.data_dir, args.tagging_schema) 348 | elif mode == 'dev': 349 | examples = processor.get_dev_examples(args.data_dir, args.tagging_schema) 350 | elif mode == 'test': 351 | examples = processor.get_test_examples(args.data_dir, args.tagging_schema) 352 | else: 353 | raise Exception("Invalid data mode %s..." % mode) 354 | features = convert_examples_to_seq_features(examples=examples, label_list=label_list, tokenizer=tokenizer, 355 | cls_token_at_end=bool(args.model_type in ['xlnet']), 356 | cls_token=tokenizer.cls_token, 357 | sep_token=tokenizer.sep_token, 358 | cls_token_segment_id=2 if args.model_type in ['xlnet'] else 0, 359 | pad_on_left=bool(args.model_type in ['xlnet']), 360 | pad_token_segment_id=4 if args.model_type in ['xlnet'] else 0) 361 | if args.local_rank in [-1, 0]: 362 | #logger.info("Saving features into cached file %s", cached_features_file) 363 | torch.save(features, cached_features_file) 364 | 365 | # Convert to Tensors and build dataset 366 | all_input_ids = torch.tensor([f.input_ids for f in features], dtype=torch.long) 367 | all_input_mask = torch.tensor([f.input_mask for f in features], dtype=torch.long) 368 | all_segment_ids = torch.tensor([f.segment_ids for f in features], dtype=torch.long) 369 | 370 | all_label_ids = torch.tensor([f.label_ids for f in features], dtype=torch.long) 371 | # used in evaluation 372 | all_evaluate_label_ids = [f.evaluate_label_ids for f in features] 373 | dataset = TensorDataset(all_input_ids, all_input_mask, all_segment_ids, all_label_ids) 374 | return dataset, all_evaluate_label_ids 375 | 376 | 377 | def main(): 378 | 379 | args = init_args() 380 | if os.path.exists(args.output_dir) and os.listdir(args.output_dir) and args.do_train and not args.overwrite_output_dir: 381 | raise ValueError("Output directory ({}) already exists and is not empty. 
Use --overwrite_output_dir to overcome.".format(args.output_dir)) 382 | 383 | # Setup CUDA, GPU & distributed training 384 | if args.local_rank == -1 or args.no_cuda: 385 | device = torch.device("cuda" if torch.cuda.is_available() and not args.no_cuda else "cpu") 386 | args.n_gpu = torch.cuda.device_count() 387 | else: # Initializes the distributed backend which will take care of synchronizing nodes/GPUs 388 | torch.cuda.set_device(args.local_rank) 389 | device = torch.device("cuda", args.local_rank) 390 | os.environ['MASTER_ADDR'] = args.MASTER_ADDR 391 | os.environ['MASTER_PORT'] = args.MASTER_PORT 392 | torch.distributed.init_process_group(backend='nccl', rank=args.local_rank, world_size=1) 393 | args.n_gpu = 1 394 | 395 | args.device = device 396 | 397 | # Setup logging 398 | logging.basicConfig(format='%(asctime)s - %(levelname)s - %(name)s - %(message)s', 399 | datefmt='%m/%d/%Y %H:%M:%S', 400 | level=logging.INFO if args.local_rank in [-1, 0] else logging.WARN) 401 | # not using 16-bits training 402 | logger.warning("Process rank: %s, device: %s, n_gpu: %s, distributed training: %s, 16-bits training: False", 403 | args.local_rank, device, args.n_gpu, bool(args.local_rank != -1)) 404 | 405 | # Set seed 406 | set_seed(args) 407 | 408 | # Prepare task 409 | args.task_name = args.task_name.lower() 410 | if args.task_name not in processors: 411 | raise ValueError("Task not found: %s" % args.task_name) 412 | processor = processors[args.task_name]() 413 | args.output_mode = output_modes[args.task_name] 414 | label_list = processor.get_labels(args.tagging_schema) 415 | num_labels = len(label_list) 416 | 417 | if args.local_rank not in [-1, 0]: 418 | torch.distributed.barrier() 419 | 420 | # initialize the pre-trained model 421 | args.model_type = args.model_type.lower() 422 | config_class, model_class, tokenizer_class = MODEL_CLASSES[args.model_type] 423 | config = config_class.from_pretrained(args.config_name if args.config_name else args.model_name_or_path, 424 | num_labels=num_labels, finetuning_task=args.task_name, cache_dir="./cache") 425 | tokenizer = tokenizer_class.from_pretrained(args.tokenizer_name if args.tokenizer_name else args.model_name_or_path, 426 | do_lower_case=args.do_lower_case, cache_dir='./cache') 427 | 428 | config.absa_type = args.absa_type 429 | config.tfm_mode = args.tfm_mode 430 | config.fix_tfm = args.fix_tfm 431 | model = model_class.from_pretrained(args.model_name_or_path, from_tf=bool('.ckpt' in args.model_name_or_path), 432 | config=config, cache_dir='./cache') 433 | # Distributed and parallel training 434 | model.to(args.device) 435 | if args.local_rank != -1: 436 | model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.local_rank], 437 | output_device=args.local_rank, 438 | find_unused_parameters=True) 439 | elif args.n_gpu > 1: 440 | model = torch.nn.DataParallel(model) 441 | 442 | # Training 443 | if args.do_train: 444 | train_dataset, train_evaluate_label_ids = load_and_cache_examples(args, args.task_name, tokenizer, mode='train') 445 | global_step, tr_loss = train(args, train_dataset, model, tokenizer) 446 | 447 | if args.do_train and (args.local_rank == -1 or dist.get_rank() == 0): 448 | # Create output directory if needed 449 | if not os.path.exists(args.output_dir) and args.local_rank in [-1, 0]: 450 | os.mkdir(args.output_dir) 451 | 452 | model_to_save = model.module if hasattr(model, 'module') else model 453 | model_to_save.save_pretrained(args.output_dir) 454 | tokenizer.save_pretrained(args.output_dir) 455 | 456 | # Good
practice: save your training arguments together with the trained model 457 | # save the model configuration 458 | torch.save(args, os.path.join(args.output_dir, 'training_args.bin')) 459 | 460 | # Load a trained model and vocabulary that you have fine-tuned 461 | model = model_class.from_pretrained(args.output_dir) 462 | tokenizer = tokenizer_class.from_pretrained(args.output_dir) 463 | model.to(args.device) 464 | 465 | # Validation 466 | results = {} 467 | best_f1 = -999999.0 468 | best_checkpoint = None 469 | checkpoints = [args.output_dir] 470 | if args.eval_all_checkpoints: 471 | checkpoints = list(os.path.dirname(c) for c in sorted(glob.glob(args.output_dir + '/**/' + WEIGHTS_NAME, recursive=True))) 472 | logging.getLogger("pytorch_transformers.modeling_utils").setLevel(logging.WARN) # Reduce logging 473 | logger.info("Perform validation on the following checkpoints: %s", checkpoints) 474 | test_results = {} 475 | for checkpoint in checkpoints: 476 | global_step = checkpoint.split('-')[-1] if len(checkpoints) > 1 else "" 477 | if global_step == 'finetune' or global_step == 'train' or global_step == 'fix' or global_step == 'overfit': 478 | continue 479 | # validation set 480 | model = model_class.from_pretrained(checkpoint) 481 | model.to(args.device) 482 | dev_result = evaluate(args, model, tokenizer, mode='dev', prefix=global_step) 483 | 484 | # regard the micro-f1 as the criterion of model selection 485 | if int(global_step) > 1000 and dev_result['micro-f1'] > best_f1: 486 | best_f1 = dev_result['micro-f1'] 487 | best_checkpoint = checkpoint 488 | dev_result = dict((k + '_{}'.format(global_step), v) for k, v in dev_result.items()) 489 | results.update(dev_result) 490 | 491 | test_result = evaluate(args, model, tokenizer, mode='test', prefix=global_step) 492 | test_result = dict((k + '_{}'.format(global_step), v) for k, v in test_result.items()) 493 | test_results.update(test_result) 494 | 495 | best_ckpt_string = "\nThe best checkpoint is %s" % best_checkpoint 496 | logger.info(best_ckpt_string) 497 | dev_f1_values, dev_loss_values = [], [] 498 | for k in results: 499 | v = results[k] 500 | if 'micro-f1' in k: 501 | dev_f1_values.append((k, v)) 502 | if 'eval_loss' in k: 503 | dev_loss_values.append((k, v)) 504 | test_f1_values, test_loss_values = [], [] 505 | for k in test_results: 506 | v = test_results[k] 507 | if 'micro-f1' in k: 508 | test_f1_values.append((k, v)) 509 | if 'eval_loss' in k: 510 | test_loss_values.append((k, v)) 511 | log_file_path = '%s/log.txt' % args.output_dir 512 | log_file = open(log_file_path, 'a') 513 | log_file.write("\tValidation:\n") 514 | for (test_f1_k, test_f1_v), (test_loss_k, test_loss_v), (dev_f1_k, dev_f1_v), (dev_loss_k, dev_loss_v) in zip( 515 | test_f1_values, test_loss_values, dev_f1_values, dev_loss_values): 516 | global_step = int(test_f1_k.split('_')[-1]) 517 | if not args.overfit and global_step <= 1000: 518 | continue 519 | print('test-%s: %.5lf, test-%s: %.5lf, dev-%s: %.5lf, dev-%s: %.5lf' % (test_f1_k, 520 | test_f1_v, test_loss_k, test_loss_v, 521 | dev_f1_k, dev_f1_v, dev_loss_k, 522 | dev_loss_v)) 523 | validation_string = '\t\tdev-%s: %.5lf, dev-%s: %.5lf' % (dev_f1_k, dev_f1_v, dev_loss_k, dev_loss_v) 524 | log_file.write(validation_string+'\n') 525 | 526 | n_times = args.max_steps // args.save_steps + 1 527 | for i in range(1, n_times): 528 | step = i * args.save_steps # use the configured save interval rather than a hard-coded 100 529 | log_file.write('\tStep %s:\n' % step) 530 | precision = test_results['precision_%s' % step] 531 | recall = test_results['recall_%s' % step] 532 | micro_f1 =
test_results['micro-f1_%s' % step] 533 | macro_f1 = test_results['macro-f1_%s' % step] 534 | log_file.write('\t\tprecision: %.4lf, recall: %.4lf, micro-f1: %.4lf, macro-f1: %.4lf\n' 535 | % (precision, recall, micro_f1, macro_f1)) 536 | log_file.write("\tBest checkpoint: %s\n" % best_checkpoint) 537 | log_file.write('******************************************\n') 538 | log_file.close() 539 | 540 | 541 | if __name__ == '__main__': 542 | main() 543 | 544 | 545 | 546 | 547 | -------------------------------------------------------------------------------- /data/rest16/dev.txt: -------------------------------------------------------------------------------- 1 | The duck confit is always amazing and the foie gras terrine with figs was out of this world.####The=O duck=T-POS confit=T-POS is=O always=O amazing=O and=O the=O foie=T-POS gras=T-POS terrine=T-POS with=T-POS figs=T-POS was=O out=O of=O this=O world=O .=O 2 | I/we will never go back to this place again.####I/we=O will=O never=O go=O back=O to=O this=O place=T-NEG again=O .=O 3 | Can't wait wait for my next visit.####Ca=O n't=O wait=O wait=O for=O my=O next=O visit=O .=O 4 | Anyways, if you're in the neighborhood to eat good food, I wouldn't waste my time trying to find something, rather go across the street to Tamari.####Anyways=O ,=O if=O you=O 're=O in=O the=O neighborhood=O to=O eat=O good=O food=O ,=O I=O would=O n't=O waste=O my=O time=O trying=O to=O find=O something=O ,=O rather=O go=O across=O the=O street=O to=O Tamari=O .=O 5 | We ate outside at Haru's Sake bar because Haru's restaurant next door was overflowing.####We=O ate=O outside=O at=O Haru=O 's=O Sake=O bar=O because=O Haru=O 's=O restaurant=O next=O door=O was=O overflowing=O .=O 6 | however, it's the service that leaves a bad taste in my mouth.####however=O ,=O it=O 's=O the=O service=T-NEG that=O leaves=O a=O bad=O taste=O in=O my=O mouth=O .=O 7 | the last time i walked by it looked pretty empty. 
hmmm.####the=O last=O time=O i=O walked=O by=O it=O looked=O pretty=O empty=O hmmm=O .=O 8 | I've had the lunch buffet at Chennai a couple of times, when I have been in the neighborhood.####I=O 've=O had=O the=O lunch=O buffet=O at=O Chennai=O a=O couple=O of=O times=O ,=O when=O I=O have=O been=O in=O the=O neighborhood=O .=O 9 | Price no more than a Jersey deli but way better.####Price=O no=O more=O than=O a=O Jersey=O deli=O but=O way=O better=O .=O 10 | I am so coming back here again, as much as I can.####I=O am=O so=O coming=O back=O here=O again=O ,=O as=O much=O as=O I=O can=O .=O 11 | They refuse to seat parties of 3 or more on weekends.####They=O refuse=O to=O seat=O parties=O of=O 3=O or=O more=O on=O weekends=O .=O 12 | As always we had a great glass of wine while we waited.####As=O always=O we=O had=O a=O great=O glass=T-POS of=T-POS wine=T-POS while=O we=O waited=O .=O 13 | This place is great.####This=O place=T-POS is=O great=O .=O 14 | Slightly on the pricey side but worth it!####Slightly=O on=O the=O pricey=O side=O but=O worth=O it=O !=O 15 | I have been going there since it opened and I can't get enough.####I=O have=O been=O going=O there=O since=O it=O opened=O and=O I=O ca=O n't=O get=O enough=O .=O 16 | Pizza - the only pizza in NYC that should not have additional toppings - the crust tastes like the best, freshly baked bread!####Pizza=O ,=O the=O only=O pizza=T-POS in=O NYC=O that=O should=O not=O have=O additional=O toppings=O ,=O the=O crust=T-POS tastes=O like=O the=O best=O ,=O freshly=O baked=O bread=O !=O 17 | Make sure you have the Spicy Scallop roll.. .####Make=O sure=O you=O have=O the=O Spicy=T-POS Scallop=T-POS roll=T-POS .=O 18 | Told us to sit anywhere, and when we sat he said the table was reserved.####Told=O us=O to=O sit=O anywhere=O ,=O and=O when=O we=O sat=O he=O said=O the=O table=O was=O reserved=O .=O 19 | Great service, great food.####Great=O service=T-POS ,=O great=O food=T-POS .=O 20 | Prices are in line.####Prices=O are=O in=O line=O .=O 21 | I live a block away and go to Patsy's frequently.####I=O live=O a=O block=O away=O and=O go=O to=O Patsy=O 's=O frequently=O .=O 22 | Over 100 different choices to create your own.####Over=O 100=O different=O choices=O to=O create=O your=O own=O .=O 23 | Seating is always prompt, though the restaurant does fill up in the evening.####Seating=T-POS is=O always=O prompt=O ,=O though=O the=O restaurant=O does=O fill=O up=O in=O the=O evening=O .=O 24 | I've never ordered anything else from their menu...there's no need to.####I=O 've=O never=O ordered=O anything=O else=O from=O their=O menuthere=O 's=O no=O need=O to=O .=O 25 | It's one of our favorite places to eat in NY.####It=O 's=O one=O of=O our=O favorite=O places=O to=O eat=O in=O NY=O .=O 26 | They were served warm and had a soft fluffy interior.####They=O were=O served=O warm=O and=O had=O a=O soft=O fluffy=O interior=O .=O 27 | It was wonderful.####It=O was=O wonderful=O .=O 28 | Great vibe, lots of people.####Great=O vibe=T-POS ,=O lots=O of=O people=O .=O 29 | Salads were fantastic.####Salads=T-POS were=O fantastic=O .=O 30 | Went here on a friend's reccomendation.####Went=O here=O on=O a=O friend=O 's=O reccomendation=O .=O 31 | Right off the L in Brooklyn this is a nice cozy place with good pizza.####Right=O off=O the=O L=O in=O Brooklyn=O this=O is=O a=O nice=O cozy=O place=T-POS with=O good=O pizza=T-POS .=O 32 | I started out with a Bombay beer which was big enough for two.####I=O started=O out=O with=O a=O Bombay=T-POS beer=T-POS which=O 
was=O big=O enough=O for=O two=O .=O 33 | However, I think this place is a good hang out spot.####However=O ,=O I=O think=O this=O place=T-POS is=O a=O good=O hang=O out=O spot=O .=O 34 | The tables are crammed way too close, the menu is typical of any Italian restaurant, and the wine list is simply overpriced.####The=O tables=T-NEG are=O crammed=O way=O too=O close=O ,=O the=O menu=T-NEU is=O typical=O of=O any=O Italian=O restaurant=O ,=O and=O the=O wine=T-NEG list=T-NEG is=O simply=O overpriced=O .=O 35 | Not one of our meals was edible - bland and/or made with weird rosemary or orange flavoring.####Not=O one=O of=O our=O meals=T-NEG was=O edible=O ,=O bland=O and/or=O made=O with=O weird=O rosemary=T-NEG or=T-NEG orange=T-NEG flavoring=T-NEG .=O 36 | I've never had bad service and the fish is fresh and delicious.####I=O 've=O never=O had=O bad=O service=T-POS and=O the=O fish=T-POS is=O fresh=O and=O delicious=O .=O 37 | The dining room is quietly elegant with no music to shout over -- how refreshing!####The=O dining=T-POS room=T-POS is=O quietly=O elegant=O with=O no=O music=O to=O shout=O over=O -=O how=O refreshing=O !=O 38 | Beautiful experience.####Beautiful=O experience=O .=O 39 | The portions are large and the servers always surprise us with a different starter.####The=O portions=T-POS are=O large=O and=O the=O servers=T-POS always=O surprise=O us=O with=O a=O different=O starter=O .=O 40 | The menu is very limited - i think we counted 4 or 5 entrees.####The=O menu=T-NEG is=O very=O limited=O ,=O i=O think=O we=O counted=O 4=O or=O 5=O entrees=O .=O 41 | Our family never expected such incredible entertainment in a restaurant.####Our=O family=O never=O expected=O such=O incredible=O entertainment=O in=O a=O restaurant=T-POS .=O 42 | This place must have cost the owners afortune to build.####This=O place=O must=O have=O cost=O the=O owners=O afortune=O to=O build=O .=O 43 | I think the stuff was better than Disney.####I=O think=O the=O stuff=O was=O better=O than=O Disney=O .=O 44 | I highly recommend it.####I=O highly=O recommend=O it=O .=O 45 | The place is a lot of fun.####The=O place=T-POS is=O a=O lot=O of=O fun=O .=O 46 | Growing up in NY, I have eaten my share of bagels.####Growing=O up=O in=O NY=O ,=O I=O have=O eaten=O my=O share=O of=O bagels=O .=O 47 | I thought this place was totally overrated.####I=O thought=O this=O place=T-NEG was=O totally=O overrated=O .=O 48 | I must give it Yon out of Yon stars!####I=O must=O give=O it=O Yon=O out=O of=O Yon=O stars=O !=O 49 | The service is ok, some of the people didn't get what they asked for.####The=O service=T-NEU is=O ok=O ,=O some=O of=O the=O people=O did=O n't=O get=O what=O they=O asked=O for=O .=O 50 | I was there on sat. 
for my birthday and we had an excellent time.####I=O was=O there=O on=O sat=O for=O my=O birthday=O and=O we=O had=O an=O excellent=O time=O .=O 51 | Whether it's the parmesean porcini souffle or the lamb glazed with balsamic vinegar, you will surely be transported to Northern Italy with one bite.####Whether=O it=O 's=O the=O parmesean=T-POS porcini=T-POS souffle=T-POS or=O the=O lamb=T-POS glazed=T-POS with=T-POS balsamic=T-POS vinegar=T-POS ,=O you=O will=O surely=O be=O transported=O to=O Northern=O Italy=O with=O one=O bite=O .=O 52 | Old school meets New world.####Old=O school=O meets=O New=O world=O .=O 53 | I found the food, service and value exceptional everytime I have been there.####I=O found=O the=O food=T-POS ,=O service=T-POS and=O value=O exceptional=O everytime=O I=O have=O been=O there=O .=O 54 | Great staff.####Great=O staff=T-POS .=O 55 | The hostess and the waitress were incredibly rude and did everything they could to rush us out.####The=O hostess=T-NEG and=O the=O waitress=T-NEG were=O incredibly=O rude=O and=O did=O everything=O they=O could=O to=O rush=O us=O out=O .=O 56 | Will not be back.####Will=O not=O be=O back=O .=O 57 | Try the sea bass.####Try=O the=O sea=T-POS bass=T-POS .=O 58 | The dinner was ok, nothing I would have again.####The=O dinner=T-NEG was=O ok=O ,=O nothing=O I=O would=O have=O again=O .=O 59 | They forgot a sandwich, didn't include plastic forks, and didn't include pita with the hummus platter.####They=O forgot=O a=O sandwich=O ,=O did=O n't=O include=O plastic=O forks=O ,=O and=O did=O n't=O include=O pita=O with=O the=O hummus=O platter=O .=O 60 | It's a rather cramped and busy restaurant and it closes early.####It=O 's=O a=O rather=O cramped=O and=O busy=O restaurant=T-NEG and=O it=O closes=O early=O .=O 61 | We were very pleasantly surprised.####We=O were=O very=O pleasantly=O surprised=O .=O 62 | After dinner the manager grabbed my boyfriend, asked him: Where are you from...maybe you dont know how things work in America...and in the end stormed away almost teareyed yelling that tips are the only thing they survive on.####After=O dinner=O the=O manager=T-NEG grabbed=O my=O boyfriend=O ,=O asked=O him=O ,=O Where=O are=O you=O frommaybe=O you=O do=O n't=O know=O how=O things=O work=O in=O Americaand=O in=O the=O end=O stormed=O away=O almost=O teareyed=O yelling=O that=O tips=O are=O the=O only=O thing=O they=O survive=O on=O .=O 63 | Pizza here is consistently good.####Pizza=T-POS here=O is=O consistently=O good=O .=O 64 | Service is average.####Service=T-NEU is=O average=O .=O 65 | A gentleman, maybe the manager, came to our table, and without so much as a smile or greeting asked for our order.####A=O gentleman=T-NEG ,=O maybe=O the=O manager=O ,=O came=O to=O our=O table=O ,=O and=O without=O so=O much=O as=O a=O smile=O or=O greeting=O asked=O for=O our=O order=O .=O 66 | When asked about how a certain dish was prepared in comparison to a similar at other thai restaurants, he replied this is not Mcdonald's, every place makes things differently ####When=O asked=O about=O how=O a=O certain=O dish=O was=O prepared=O in=O comparison=O to=O a=O similar=O at=O other=O thai=O restaurants=O ,=O he=O replied=O this=O is=O not=O Mcdonald=O 's=O ,=O every=O place=O makes=O things=O differently=O 67 | I would highly recommend this place!####I=O would=O highly=O recommend=O this=O place=T-POS !=O 68 | The food is great and reasonably priced.####The=O food=T-POS is=O great=O and=O reasonably=O priced=O .=O 69 | My friends and I were on vacation in NY 
and was referred to Chance by a friend.####My=O friends=O and=O I=O were=O on=O vacation=O in=O NY=O and=O was=O referred=O to=O Chance=O by=O a=O friend=O .=O 70 | I also ordered the Change Mojito, which was out of this world.####I=O also=O ordered=O the=O Change=T-POS Mojito=T-POS ,=O which=O was=O out=O of=O this=O world=O .=O 71 | The food was average or above including some surprising tasty dishes.####The=O food=T-POS was=O average=O or=O above=O including=O some=O surprising=O tasty=O dishes=T-POS .=O 72 | I would recommend Roxy's for that, but not for their food.####I=O would=O recommend=O Roxy=O 's=O for=O that=O ,=O but=O not=O for=O their=O food=T-NEG .=O 73 | And the Tom Kha soup was pathetic.####And=O the=O Tom=T-NEG Kha=T-NEG soup=T-NEG was=O pathetic=O .=O 74 | We had the scallops as an appetizer and they were delicious and the sauce was wonderful.####We=O had=O the=O scallops=T-POS as=O an=O appetizer=O and=O they=O were=O delicious=O and=O the=O sauce=T-POS was=O wonderful=O .=O 75 | I've waited over one hour for food.####I=O 've=O waited=O over=O one=O hour=O for=O food=O .=O 76 | The food looked very appetizing and delicious since it came on a variety of fancy plates.####The=O food=T-POS looked=O very=O appetizing=O and=O delicious=O since=O it=O came=O on=O a=O variety=O of=O fancy=O plates=O .=O 77 | Service here was great, food was fantastic.####Service=T-POS here=O was=O great=O ,=O food=T-POS was=O fantastic=O .=O 78 | We all agreed that mare is one of the best seafood restaurants in New York.####We=O all=O agreed=O that=O mare=T-POS is=O one=O of=O the=O best=O seafood=O restaurants=O in=O New=O York=O .=O 79 | I ordered the smoked salmon and roe appetizer and it was off flavor.####I=O ordered=O the=O smoked=T-NEG salmon=T-NEG and=T-NEG roe=T-NEG appetizer=T-NEG and=O it=O was=O off=O flavor=O .=O 80 | Delicious crab cakes too.####Delicious=O crab=T-POS cakes=T-POS too=O .=O 81 | The sandwiches are dry, tasteless and way overpriced.####The=O sandwiches=T-NEG are=O dry=O ,=O tasteless=O and=O way=O overpriced=O .=O 82 | The atmosphere is unheralded, the service impecible, and the food magnificant.####The=O atmosphere=T-POS is=O unheralded=O ,=O the=O service=T-POS impecible=O ,=O and=O the=O food=T-POS magnifica=O n't=O .=O 83 | The food is good, especially their more basic dishes, and the drinks are delicious.####The=O food=T-POS is=O good=O ,=O especially=O their=O more=O basic=T-POS dishes=T-POS ,=O and=O the=O drinks=T-POS are=O delicious=O .=O 84 | This is a great place to take out-of-towners, and perfect for watching the sunset.####This=O is=O a=O great=O place=T-POS to=O take=O out-of-towners=O ,=O and=O perfect=O for=O watching=O the=O sunset=O .=O 85 | Great sushi experience.####Great=O sushi=T-POS experience=O .=O 86 | Murray won't do it.####Murray=O wo=O n't=O do=O it=O .=O 87 | Won't or Can't is not in the service directory.####Wo=O n't=O or=O Ca=O n't=O is=O not=O in=O the=O service=O directory=O .=O 88 | Bagels are ok, but be sure not to make any special requests!####Bagels=T-NEU are=O ok=O ,=O but=O be=O sure=O not=O to=O make=O any=O special=O requests=O !=O 89 | The fish was adequate, but inexpertly sliced.####The=O fish=T-NEG was=O adequate=O ,=O but=O inexpertly=O sliced=O .=O 90 | I thought going to Jimmys would give me a real Domincan exprience.####I=O thought=O going=O to=O Jimmys=O would=O give=O me=O a=O real=O Domincan=O exprience=O .=O 91 | For authentic Thai food, look no further than Toons.####For=O authentic=O Thai=T-POS food=T-POS ,=O 
look=O no=O further=O than=O Toons=O .=O 92 | I did not try the caviar but I tried their salmon and crab salad (they are all good) ####I=O did=O not=O try=O the=O caviar=O but=O I=O tried=O their=O salmon=T-POS and=O crab=T-POS salad=T-POS they=O are=O all=O good=O 93 | The wait staff is pleasant, fun, and for the most part gorgeous (in the wonderful aesthetic beautification way, not in that she's-way-cuter-than-me-that-b@#$* way).####The=O wait=T-POS staff=T-POS is=O pleasant=O ,=O fun=O ,=O and=O for=O the=O most=O part=O gorgeous=O in=O the=O wonderful=O aesthetic=O beautification=O way=O ,=O not=O in=O that=O she's-way-cuter-than-me-that-b=O @=O #=O $=O *=O way=O .=O 94 | Its location is good and the fact that Hutner College is near and their prices are very reasonable, makes students go back to Suan again and again.####Its=O location=T-POS is=O good=O and=O the=O fact=O that=O Hutner=O College=O is=O near=O and=O their=O prices=O are=O very=O reasonable=O ,=O makes=O students=O go=O back=O to=O Suan=T-POS again=O and=O again=O .=O 95 | if you're daring, try the balsamic vinegar over icecream, it's wonderful!####if=O you=O 're=O daring=O ,=O try=O the=O balsamic=T-POS vinegar=T-POS over=T-POS icecream=T-POS ,=O it=O 's=O wonderful=O !=O 96 | After passing by this restaurant for sometime I finally decided to go in and have dinner.####After=O passing=O by=O this=O restaurant=O for=O sometime=O I=O finally=O decided=O to=O go=O in=O and=O have=O dinner=O .=O 97 | The menu consisted of standard brassiere food, better then places like Balthazar etc.####The=O menu=O consisted=O of=O standard=O brassiere=O food=O ,=O better=O then=O places=O like=O Balthazar=O etc=O .=O 98 | the pad se ew chicken was delicious, however the pad thai was far too oily.####the=O pad=T-POS se=T-POS ew=T-POS chicken=T-POS was=O delicious=O ,=O however=O the=O pad=T-NEG thai=T-NEG was=O far=O too=O oily=O .=O 99 | Service is fast and friendly.####Service=T-POS is=O fast=O and=O friendly=O .=O 100 | If celebrities make you sweat, then your in for a ride, but if your like most around these parts then you'll just yawn and wonder whats with all the hype.####If=O celebrities=O make=O you=O sweat=O ,=O then=O your=O in=O for=O a=O ride=O ,=O but=O if=O your=O like=O most=O around=O these=O parts=O then=O you=O 'll=O just=O yawn=O and=O wonder=O whats=O with=O all=O the=O hype=O .=O 101 | I've eaten at many different Indian restaurants.####I=O 've=O eaten=O at=O many=O different=O Indian=O restaurants=O .=O 102 | The appetizing is excellent - just as good as Zabars Barney Greengrass at a reasonable price (if bought by the pound).####The=O appetizing=O is=O excellent=O ,=O just=O as=O good=O as=O Zabars=O Barney=O Greengrass=O at=O a=O reasonable=O price=O if=O bought=O by=O the=O pound=O .=O 103 | Go there to relax and feel like your somewhere else.####Go=O there=O to=O relax=O and=O feel=O like=O your=O somewhere=O else=O .=O 104 | Great food, great decor, great service.####Great=O food=T-POS ,=O great=O decor=T-POS ,=O great=O service=T-POS .=O 105 | This is the perfect spot for meeting friends, having lunch, dinner, pre-theatre or after-theatre drinks!####This=O is=O the=O perfect=O spot=T-POS for=O meeting=O friends=O ,=O having=O lunch=O ,=O dinner=O ,=O pre-theatre=O or=O after-theatre=O drinks=O !=O 106 | Wonderful at holiday time.####Wonderful=O at=O holiday=O time=O .=O 107 | The porcini mushroom pasta special was tasteless, so was the seafood tagliatelle.####The=O porcini=T-NEG mushroom=T-NEG pasta=T-NEG 
special=T-NEG was=O tasteless=O ,=O so=O was=O the=O seafood=T-NEG tagliatelle=T-NEG .=O 108 | A real dissapointment.####A=O real=O dissapointment=O .=O 109 | I recently tried Suan and I thought that it was great.####I=O recently=O tried=O Suan=T-POS and=O I=O thought=O that=O it=O was=O great=O .=O 110 | Good food.####Good=O food=T-POS .=O 111 | Ravioli was good...but I have to say that I found everything a bit overpriced.####Ravioli=T-POS was=O goodbut=O I=O have=O to=O say=O that=O I=O found=O everything=O a=O bit=O overpriced=O .=O 112 | Faan is sooo good.####Faan=T-POS is=O sooo=O good=O .=O 113 | bottles of wine are cheap and good.####bottles=T-POS of=T-POS wine=T-POS are=O cheap=O and=O good=O .=O 114 | This is an amazing place to try some roti rolls.####This=O is=O an=O amazing=O place=O to=O try=O some=O roti=T-POS rolls=T-POS .=O 115 | The food's as good as ever.####The=O food=T-POS 's=O as=O good=O as=O ever=O .=O 116 | Excellent dumplings served amid clean, chic decor.####Excellent=O dumplings=T-POS served=O amid=O clean=O ,=O chic=O decor=T-POS .=O 117 | I won't go back unless someone else is footing the bill.####I=O wo=O n't=O go=O back=O unless=O someone=O else=O is=O footing=O the=O bill=O .=O 118 | The portions are small but being that the food was so good makes up for that.####The=O portions=T-NEG are=O small=O but=O being=O that=O the=O food=T-POS was=O so=O good=O makes=O up=O for=O that=O .=O 119 | Service is not what you are coming here for...####Service=T-NEG is=O not=O what=O you=O are=O coming=O here=O for=O .=O 120 | No thanks!!!####No=O thanks=O !=O !=O !=O 121 | The only problem is that the manager is a complete incompetent.####The=O only=O problem=O is=O that=O the=O manager=T-NEG is=O a=O complete=O incompetent=O .=O 122 | Hey, I think $2+ for a 5 block walk ain't bad.####Hey=O ,=O I=O think=O $=O 2+=O for=O a=O 5=O block=O walk=O ai=O n't=O bad=O .=O 123 | I plan on stopping by next week as well.####I=O plan=O on=O stopping=O by=O next=O week=O as=O well=O .=O 124 | I recommend this place to everyone.####I=O recommend=O this=O place=T-POS to=O everyone=O .=O 125 | An awesome organic dog, and a conscious eco friendly establishment.####An=O awesome=O organic=O dog=T-POS ,=O and=O a=O conscious=O eco=O friendly=O establishment=T-POS .=O 126 | I paid just about $60 for a good meal, though :)####I=O paid=O just=O about=O $=O 60=O for=O a=O good=O meal=T-POS ,=O though=O ,=O 127 | Great sake!####Great=O sake=T-POS !=O 128 | Delivery guy sometimes get upset if you don't tip more than 10%.####Delivery=T-NEG guy=T-NEG sometimes=O get=O upset=O if=O you=O do=O n't=O tip=O more=O than=O 10=O %=O .=O 129 | Creative, consistent, fresh.####Creative=O ,=O consistent=O ,=O fresh=O .=O 130 | The place is a bit hidden away, but once you get there, it's all worth it.####The=O place=T-NEU is=O a=O bit=O hidden=O away=O ,=O but=O once=O you=O get=O there=O ,=O it=O 's=O all=O worth=O it=O .=O 131 | My wife and I went to Water's Edge for a romantic dinner.####My=O wife=O and=O I=O went=O to=O Water=O 's=O Edge=O for=O a=O romantic=O dinner=O .=O 132 | lobster was good, nothing spectacular.####lobster=T-NEU was=O good=O ,=O nothing=O spectacular=O .=O 133 | I thought the restaurant was nice and clean.####I=O thought=O the=O restaurant=T-POS was=O nice=O and=O clean=O .=O 134 | Red Dragon Roll - my favorite thing to eat, of any food group - hands down####Red=T-POS Dragon=T-POS Roll=T-POS ,=O my=O favorite=O thing=O to=O eat=O ,=O of=O any=O food=O group=O ,=O hands=O down=O 135 | I 
have eaten at some of the 'best' sushi joints in NYC (Nobu, Bond Street, JewelBako, etc) and Yamato is my favorite.####I=O have=O eaten=O at=O some=O of=O the=O best=O sushi=O joints=O in=O NYC=O Nobu=O ,=O Bond=O Street=O ,=O JewelBako=O ,=O etc=O and=O Yamato=O is=O my=O favorite=O .=O 136 | The Dancing, White River and Millenium rolls are musts.####The=O Dancing,=T-POS White=T-POS River=T-POS and=T-POS Millenium=T-POS rolls=T-POS are=O musts=O .=O 137 | When I got there the place was packed but they made sure to seat me quickly.####When=O I=O got=O there=O the=O place=O was=O packed=O but=O they=O made=O sure=O to=O seat=O me=O quickly=O .=O 138 | We were offered water for the table but were not told the Voss bottles of water were $8 a piece.####We=O were=O offered=O water=O for=O the=O table=O but=O were=O not=O told=O the=O Voss=T-NEG bottles=T-NEG of=T-NEG water=T-NEG were=O $=O 8=O a=O piece=O .=O 139 | MMMMMMMMMmmmmmm so delicious####MMMMMMMMMmmmmmm=O so=O delicious=O 140 | Also, I personally wasn't a fan of the portobello and asparagus mole.####Also=O ,=O I=O personally=O was=O n't=O a=O fan=O of=O the=O portobello=T-NEG and=T-NEG asparagus=T-NEG mole=T-NEG .=O 141 | The veal was incredible last night.####The=O veal=T-POS was=O incredible=O last=O night=O .=O 142 | Skip Nathan's...you can get that at the mall...go to Bark.####Skip=O Nathan'syou=O can=O get=O that=O at=O the=O mallgo=O to=O Bark=O .=O 143 | Most of the booths allow you to sit next to eachother without looking like 'that' couple.####Most=O of=O the=O booths=T-POS allow=O you=O to=O sit=O next=O to=O eachother=O without=O looking=O like=O that=O couple=O .=O 144 | Ruth, mother of the Bride####Ruth=O ,=O mother=O of=O the=O Bride=O 145 | $170 down the toilet...####$=O 170=O down=O the=O toilet=O .=O 146 | The service was extremely fast and attentive(thanks to the service button on your table) but I barely understood 1 word when the waiter took our order.####The=O service=T-POS was=O extremely=O fast=O and=O attentivethanks=O to=O the=O service=T-POS button=O on=O your=O table=O but=O I=O barely=O understood=O 1=O word=O when=O the=O waiter=T-NEG took=O our=O order=O .=O 147 | Over all the looks of the place exceeds the actual meals.####Over=O all=O the=O looks=T-POS of=O the=O place=O exceeds=O the=O actual=O meals=T-NEG .=O 148 | Sometimes tables don't understand his sense of humor but it's refreshing to have a server who has personality, professionalism, and respects the privacy of your dinner.####Sometimes=O tables=O do=O n't=O understand=O his=O sense=O of=O humor=O but=O it=O 's=O refreshing=O to=O have=O a=O server=T-POS who=O has=O personality=O ,=O professionalism=O ,=O and=O respects=O the=O privacy=O of=O your=O dinner=O .=O 149 | peppers, onions, relish, chilli, cheeses, you NAME it.####peppers=O ,=O onions=O ,=O relish=O ,=O chilli=O ,=O cheeses=O ,=O you=O NAME=O it=O .=O 150 | Highly impressed from the decor to the food to the hospitality to the great night I had!####Highly=O impressed=O from=O the=O decor=T-POS to=O the=O food=T-POS to=O the=O hospitality=O to=O the=O great=O night=O I=O had=O !=O 151 | Great find in the West Village!####Great=O find=O in=O the=O West=O Village=O !=O 152 | The menu looked great, and the waiter was very nice, but when the food came, it was average.####The=O menu=T-POS looked=O great=O ,=O and=O the=O waiter=T-POS was=O very=O nice=O ,=O but=O when=O the=O food=T-NEU came=O ,=O it=O was=O average=O .=O 153 | The manager finally said he would comp the two glasses of wine 
(which cost less than the food), and made it seem like a big concession.####The=O manager=T-NEG finally=O said=O he=O would=O comp=O the=O two=O glasses=O of=O wine=O which=O cost=O less=O than=O the=O food=O ,=O and=O made=O it=O seem=O like=O a=O big=O concession=O .=O 154 | A fairly late entry into the haute barnyard sweepstakes, Flatbush Farm isn't in the same league as the Blue Hills or even the Farm on Adderlys of the world, but it's pretty good, albeit with a somewhat dismal setting.####A=O fairly=O late=O entry=O into=O the=O haute=O barnyard=O sweepstakes=O ,=O Flatbush=O Farm=O is=O n't=O in=O the=O same=O league=O as=O the=O Blue=O Hills=O or=O even=O the=O Farm=O on=O Adderlys=O of=O the=O world=O ,=O but=O it=O 's=O pretty=O good=O ,=O albeit=O with=O a=O somewhat=O dismal=O setting=O .=O 155 | I do not recommend.####I=O do=O not=O recommend=O .=O 156 | The food is flavorful, plentiful and reasonably priced.####The=O food=T-POS is=O flavorful=O ,=O plentiful=O and=O reasonably=O priced=O .=O 157 | Very pleased####Very=O pleased=O 158 | never swaying, never a bad meal, never bad service...####never=O swaying=O ,=O never=O a=O bad=O meal=T-POS ,=O never=O bad=O service=T-POS .=O 159 | This place has great indian chinese food.####This=O place=O has=O great=O indian=T-POS chinese=T-POS food=T-POS .=O 160 | The martinis are amazing and very fairly priced.####The=O martinis=T-POS are=O amazing=O and=O very=O fairly=O priced=O .=O 161 | Are you freaking kidding me?####Are=O you=O freaking=O kidding=O me=O ?=O 162 | Surprised that a place of this caliber would advertise it as Kobe.####Surprised=O that=O a=O place=O of=O this=O caliber=O would=O advertise=O it=O as=O Kobe=O .=O 163 | Bison was quite excellent however.####Bison=T-POS was=O quite=O excellent=O however=O .=O 164 | Terrible Waste of money.. 
scammers####Terrible=O Waste=O of=O money=O scammers=O .=O 165 | I am actually offended to have spent so much money on such a bad experience.####I=O am=O actually=O offended=O to=O have=O spent=O so=O much=O money=O on=O such=O a=O bad=O experience=O .=O 166 | Our visit their to say the least, was an unpleasant and costly experience!####Our=O visit=O their=O to=O say=O the=O least=O ,=O was=O an=O unpleasant=O and=O costly=O experience=O !=O 167 | Probably would not go back here.####Probably=O would=O not=O go=O back=O here=O .=O 168 | I don't appreciate places or people that try to drive up the bill without the patron's knowledge so that was a huge turnoff (more than the price).####I=O do=O n't=O appreciate=O places=O or=O people=O that=O try=O to=O drive=O up=O the=O bill=O without=O the=O patron=O 's=O knowledge=O so=O that=O was=O a=O huge=O turnoff=O more=O than=O the=O price=O .=O 169 | But if you're prepared to spend some $ and remember to ask if something they offer is complimentary, then this is the place to go for Indian food####But=O if=O you=O 're=O prepared=O to=O spend=O some=O $=O and=O remember=O to=O ask=O if=O something=O they=O offer=O is=O complimentary=O ,=O then=O this=O is=O the=O place=T-NEG to=O go=O for=O Indian=T-POS food=T-POS 170 | Wretched and retching####Wretched=O and=O retching=O 171 | For starters they delivered us someone else's order.####For=O starters=O they=O delivered=O us=O someone=O else=O 's=O order=O .=O 172 | However, once I received my predictably mediocre order of what Dokebi thinks passes as Korean fair, (sometimes you have to settle when it's your only option), I got through about half my kimchee before I found a piece of random lettuce accompanied by a far more disgusting, slimy, clearly bad piece of fish skin.####However=O ,=O once=O I=O received=O my=O predictably=O mediocre=O order=O of=O what=O Dokebi=O thinks=O passes=O as=O Korean=T-NEG fair=T-NEG ,=O sometimes=O you=O have=O to=O settle=O when=O it=O 's=O your=O only=O option=O ,=O I=O got=O through=O about=O half=O my=O kimchee=T-NEG before=O I=O found=O a=O piece=O of=O random=O lettuce=O accompanied=O by=O a=O far=O more=O disgusting=O ,=O slimy=O ,=O clearly=O bad=O piece=O of=O fish=O skin=O .=O 173 | Less than three minutes passed before I found myself doubled over the toilet.####Less=O than=O three=O minutes=O passed=O before=O I=O found=O myself=O doubled=O over=O the=O toilet=O .=O 174 | I book a gorgeous white organza tent which included a four course prix fix menu which we enjoyed a lot.####I=O book=O a=O gorgeous=O white=T-POS organza=T-POS tent=T-POS which=O included=O a=O four=T-POS course=T-POS prix=T-POS fix=T-POS menu=T-POS which=O we=O enjoyed=O a=O lot=O .=O 175 | The service was spectacular as the waiter knew everything about the menu and his recommendations were amazing!####The=O service=T-POS was=O spectacular=O as=O the=O waiter=T-POS knew=O everything=O about=O the=O menu=O and=O his=O recommendations=O were=O amazing=O !=O 176 | The dishes came out around 5 minutes apart.####The=O dishes=O came=O out=O around=O 5=O minutes=O apart=O .=O 177 | The side dishes were passable, and I did get a refill upon request.####The=O side=T-NEU dishes=T-NEU were=O passable=O ,=O and=O I=O did=O get=O a=O refill=O upon=O request=O .=O 178 | Authentic Korean food lovers should visit 32nd Street, of course.####Authentic=O Korean=O food=O lovers=O should=O visit=O 32nd=O Street=O ,=O of=O course=O .=O 179 | The wife had the risotto which was amazing.####The=O wife=O had=O the=O 
risotto=T-POS which=O was=O amazing=O .=O 180 | We started off with a delightful sashimi amuse bouche.####We=O started=O off=O with=O a=O delightful=O sashimi=T-POS amuse=T-POS bouche=T-POS .=O 181 | To be honest we only ever eat the Shabu Shabu.####To=O be=O honest=O we=O only=O ever=O eat=O the=O Shabu=O Shabu=O .=O 182 | In fact there is only one I've tried that even compares (shabu Tatsu) and even then I prefer Dokebi.####In=O fact=O there=O is=O only=O one=O I=O 've=O tried=O that=O even=O compares=O shabu=O Tatsu=O and=O even=O then=O I=O prefer=O Dokebi=O .=O 183 | The meat is fresh, the sauces are great, you get kimchi and a salad free with your meal and service is good too.####The=O meat=T-POS is=O fresh=O ,=O the=O sauces=T-POS are=O great=O ,=O you=O get=O kimchi=T-POS and=O a=O salad=T-POS free=O with=O your=O meal=T-POS and=O service=T-POS is=O good=O too=O .=O 184 | The hot dogs are good, yes, but the reason to get over here is the fantastic pork croquette sandwich, perfect on its supermarket squishy bun.####The=O hot=T-POS dogs=T-POS are=O good=O ,=O yes=O ,=O but=O the=O reason=O to=O get=O over=O here=O is=O the=O fantastic=O pork=T-POS croquette=T-POS sandwich=T-POS ,=O perfect=O on=O its=O supermarket=O squishy=O bun=T-POS .=O 185 | The family seafood entree was very good.####The=O family=T-POS seafood=T-POS entree=T-POS was=O very=O good=O .=O 186 | The food they serve is not comforting, not appetizing and uncooked.####The=O food=T-NEG they=O serve=O is=O not=O comforting=O ,=O not=O appetizing=O and=O uncooked=O .=O 187 | A coworker and I tried Pacifico after work a few Fridays and loved it.####A=O coworker=O and=O I=O tried=O Pacifico=T-POS after=O work=O a=O few=O Fridays=O and=O loved=O it=O .=O 188 | And how many times can you pick up the same perfectly aligned set of napkins, inspect them vapidly and plonk them down in exactly the same place instead of venturing a glance at people who are there to help you make the rent?####And=O how=O many=O times=O can=O you=O pick=O up=O the=O same=O perfectly=O aligned=O set=O of=O napkins=O ,=O inspect=O them=O vapidly=O and=O plonk=O them=O down=O in=O exactly=O the=O same=O place=O instead=O of=O venturing=O a=O glance=O at=O people=O who=O are=O there=O to=O help=O you=O make=O the=O rent=O ?=O 189 | Overall the food quality was pretty good, though I hear the salmon is much better when it hasn't sat cooling in front of the guest.####Overall=O the=O food=T-POS quality=O was=O pretty=O good=O ,=O though=O I=O hear=O the=O salmon=O is=O much=O better=O when=O it=O has=O n't=O sat=O cooling=O in=O front=O of=O the=O guest=O .=O 190 | The place has a nice fit-out, some attractive furnishings and, from what I could tell, a reasonable wine list (I was given the food menu when I asked for the carte des vins)####The=O place=O has=O a=O nice=O fit-out=T-POS ,=O some=O attractive=O furnishings=T-POS and=O ,=O from=O what=O I=O could=O tell=O ,=O a=O reasonable=O wine=T-POS list=T-POS I=O was=O given=O the=O food=O menu=O when=O I=O asked=O for=O the=O carte=O des=O vins=O 191 | Everything was going good until we got our meals.####Everything=O was=O going=O good=O until=O we=O got=O our=O meals=T-NEG .=O 192 | Sometimes you pay a lot and don't get much in return - it's manhattan, things are expensive.####Sometimes=O you=O pay=O a=O lot=O and=O do=O n't=O get=O much=O in=O return=O ,=O it=O 's=O manhattan=O ,=O things=O are=O expensive=O .=O 193 | Though it's been crowded most times I've gone here, Bark always delivers on their 
food.####Though=O it=O 's=O been=O crowded=O most=O times=O I=O 've=O gone=O here=O ,=O Bark=T-NEU always=O delivers=O on=O their=O food=T-POS .=O 194 | I'm a friendly person, so I wouldn't mind had she not been so nasty and gotten so personal. ####I=O 'm=O a=O friendly=O person=O ,=O so=O I=O would=O n't=O mind=O had=O she=O not=O been=O so=O nasty=O and=O gotten=O so=O personal=O .=O 195 | Here the hot dog is elevated to the level of a real entree with numerous variations available.####Here=O the=O hot=T-POS dog=T-POS is=O elevated=O to=O the=O level=O of=O a=O real=O entree=O with=O numerous=O variations=O available=O .=O 196 | Appetizers took nearly an hour.####Appetizers=O took=O nearly=O an=O hour=O .=O 197 | When we threatened to leave, we were offered a meager discount even though half the order was missing.####When=O we=O threatened=O to=O leave=O ,=O we=O were=O offered=O a=O meager=O discount=O even=O though=O half=O the=O order=O was=O missing=O .=O 198 | On the way out, we heard of other guests complaining about similar issues.####On=O the=O way=O out=O ,=O we=O heard=O of=O other=O guests=O complaining=O about=O similar=O issues=O .=O 199 | The design of the space is good.####The=O design=O of=O the=O space=T-POS is=O good=O .=O 200 | I couldn't ignore the fact that she reach over the plate of one of my friends, who was in mid bite, to clear the table.####I=O could=O n't=O ignore=O the=O fact=O that=O she=O reach=O over=O the=O plate=O of=O one=O of=O my=O friends=O ,=O who=O was=O in=O mid=O bite=O ,=O to=O clear=O the=O table=O .=O 201 | --------------------------------------------------------------------------------