├── architecture.jpg ├── requirements.txt ├── work.sh ├── train.sh ├── fast_run.py ├── data │   ├── test20 │   │   └── test.txt │   ├── rest15 │   │   └── dev.txt │   └── rest16 │       └── dev.txt ├── README.md ├── bert.py ├── bert_utils.py ├── work.py ├── LICENSE ├── seq_utils.py ├── glue_utils.py ├── absa_layer.py └── main.py /architecture.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tamanna18/BERT-E2E-ABSA/master/architecture.jpg -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | torch==1.2.0 2 | numpy==1.22.0 3 | transformers==4.1.1 4 | tqdm==4.32.1 5 | tensorboardX==1.8 6 | -------------------------------------------------------------------------------- /work.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | TASK_NAME="test20" 3 | ABSA_HOME="./bert-linear-laptop14-finetune" 4 | CUDA_VISIBLE_DEVICES=0 python work.py --absa_home ${ABSA_HOME} \ 5 | --ckpt ${ABSA_HOME}/checkpoint-1500 \ 6 | --model_type bert \ 7 | --data_dir ./data/${TASK_NAME} \ 8 | --task_name ${TASK_NAME} \ 9 | --model_name_or_path bert-base-uncased \ 10 | --cache_dir ./cache \ 11 | --max_seq_length 128 \ 12 | --tagging_schema BIEOS -------------------------------------------------------------------------------- /train.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | TASK_NAME=rest_total 3 | ABSA_TYPE=tfm 4 | CUDA_VISIBLE_DEVICES=0,2,3 python main.py --model_type bert \ 5 | --absa_type ${ABSA_TYPE} \ 6 | --tfm_mode finetune \ 7 | --fix_tfm 0 \ 8 | --model_name_or_path bert-base-uncased \ 9 | --data_dir ./data/${TASK_NAME} \ 10 | --task_name ${TASK_NAME} \ 11 | --per_gpu_train_batch_size 16 \ 12 | --per_gpu_eval_batch_size 8 \ 13 | --learning_rate 2e-5 \ 14 | --do_train \ 15 | --do_eval \ 16 | --do_lower_case \ 17 | --tagging_schema BIEOS \ 18 | --overfit 0 \ 19 | --overwrite_output_dir \ 20 | --eval_all_checkpoints \ 21 | --MASTER_ADDR localhost \ 22 | --MASTER_PORT 28512 \ 23 | --max_steps 1500 24 | -------------------------------------------------------------------------------- /fast_run.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | os.environ["CUDA_VISIBLE_DEVICES"] = "3, 1, 2" 4 | 5 | #seed_numbers = [42, 593, 1774, 65336, 189990] 6 | seed_numbers = [42] 7 | model_type = 'bert' 8 | absa_type = 'linear' 9 | tfm_mode = 'finetune' 10 | fix_tfm = 0 11 | task_name = 'laptop14' 12 | warmup_steps = 0 13 | overfit = 0 14 | if task_name == 'laptop14': 15 | train_batch_size = 32 16 | elif task_name == 'rest_total' or task_name == 'rest14' or task_name == 'rest15' or task_name == 'rest16': 17 | train_batch_size = 16 18 | else: 19 | raise Exception("Unsupported dataset %s!!!"
% task_name) 20 | 21 | for run_id, seed in enumerate(seed_numbers): 22 | command = "python main.py --model_type %s --absa_type %s --tfm_mode %s --fix_tfm %s " \ 23 | "--model_name_or_path bert-base-uncased --data_dir ./data/%s --task_name %s " \ 24 | "--per_gpu_train_batch_size %s --per_gpu_eval_batch_size 8 --learning_rate 2e-5 " \ 25 | "--max_steps 1500 --warmup_steps %s --do_train --do_eval --do_lower_case " \ 26 | "--seed %s --tagging_schema BIEOS --overfit %s " \ 27 | "--overwrite_output_dir --eval_all_checkpoints --MASTER_ADDR localhost --MASTER_PORT 28512" % ( 28 | model_type, absa_type, tfm_mode, fix_tfm, task_name, task_name, train_batch_size, warmup_steps, seed, overfit) 29 | output_dir = '%s-%s-%s-%s' % (model_type, absa_type, task_name, tfm_mode) 30 | if fix_tfm: 31 | output_dir = '%s-fix' % output_dir 32 | if overfit: 33 | output_dir = '%s-overfit' % output_dir 34 | if not os.path.exists(output_dir): 35 | os.mkdir(output_dir) 36 | 37 | log_file = '%s/log.txt' % output_dir 38 | if run_id == 0 and os.path.exists(log_file): 39 | os.remove(log_file) 40 | with open(log_file, 'a') as fp: 41 | fp.write("\nIn run %s/5 (seed %s):\n" % (run_id, seed)) 42 | os.system(command) 43 | if overfit: 44 | # only conduct one run 45 | break 46 | -------------------------------------------------------------------------------- /data/test20/test.txt: -------------------------------------------------------------------------------- 1 | Yum!####Yum=O !=O 2 | Serves really good sushi!####Serves=O really=O good=O sushi=T-POS !=O 3 | Not the biggest portions but adequate.####Not=O the=O biggest=O portions=T-NEU but=O adequate=O .=O 4 | Green Tea creme brulee is a must!####Green=T-POS Tea=T-POS creme=T-POS brulee=T-POS is=O a=O must=O !=O 5 | Don't leave the restaurant without it.####Do=O n't=O leave=O the=O restaurant=O without=O it=O .=O 6 | No Comparison####No=O Comparison=O 7 | – I can't say enough about this place.####I=O ca=O n't=O say=O enough=O about=O this=O place=T-POS .=O 8 | It has great sushi and even better service.####It=O has=O great=O sushi=T-POS and=O even=O better=O service=T-POS .=O 9 | The entire staff was extremely accomodating and tended to my every need.####The=O entire=O staff=T-POS was=O extremely=O accomodating=O and=O tended=O to=O my=O every=O need=O .=O 10 | I've been to this restaurant over a dozen times with no complaints to date.####I=O 've=O been=O to=O this=O restaurant=T-POS over=O a=O dozen=O times=O with=O no=O complaints=O to=O date=O .=O 11 | Snotty Attitude####Snotty=O Attitude=O 12 | – We were treated very rudely here one time for breakfast.####We=O were=O treated=O very=O rudely=O here=O one=O time=O for=O breakfast=O .=O 13 | The owner is belligerent to guests that have a complaint.####The=O owner=T-NEG is=O belligerent=O to=O guests=O that=O have=O a=O complaint=O .=O 14 | Good food!####Good=O food=T-POS !=O 15 | – We love breakfast food.####We=O love=O breakfast=O food=O .=O 16 | This is a great place to get a delicious meal.####This=O is=O a=O great=O place=O to=O get=O a=O delicious=O meal=T-POS .=O 17 | We never had to wait more than 5 minutes.####We=O never=O had=O to=O wait=O more=O than=O 5=O minutes=O .=O 18 | The staff is pretty friendly.####The=O staff=T-POS is=O pretty=O friendly=O .=O 19 | The onion rings are great!####The=O onion=T-POS rings=T-POS are=O great=O !=O 20 | They are not greasy or anything.####They=O are=O not=O greasy=O or=O anything=O .=O 21 | -------------------------------------------------------------------------------- /README.md: 
-------------------------------------------------------------------------------- 1 | # BERT-E2E-ABSA 2 | Exploiting **BERT** for **E**nd-**t**o-**E**nd **A**spect-**B**ased **S**entiment **A**nalysis 3 | ![model architecture](architecture.jpg) 4 | 5 |
6 | 7 | ## Requirements 8 | * python 3.7.3 9 | * pytorch 1.2.0 (also tested on pytorch 1.3.0) 10 | * ~~transformers 2.0.0~~ transformers 4.1.1 11 | * numpy 1.16.4 12 | * tensorboardX 1.9 13 | * tqdm 4.32.1 14 | * some code is borrowed from **allennlp** ([https://github.com/allenai/allennlp](https://github.com/allenai/allennlp), an awesome open-source NLP toolkit) and **transformers** ([https://github.com/huggingface/transformers](https://github.com/huggingface/transformers), formerly known as **pytorch-pretrained-bert** or **pytorch-transformers**) 15 | 16 | ## Architecture 17 | * Pre-trained embedding layer: BERT-Base-Uncased (12-layer, 768-hidden, 12-heads, 110M parameters) 18 | * Task-specific layer: 19 | - Linear 20 | - Recurrent Neural Networks (GRU) 21 | - Self-Attention Networks (SAN, TFM) 22 | - Conditional Random Fields (CRF) 23 | (a minimal sketch of the BERT + Linear variant is given after the Quick Start section below) 24 | ## Dataset 25 | * ~~Restaurant: restaurant reviews from SemEval 2014 (task 4), SemEval 2015 (task 12) and SemEval 2016 (task 5) (rest_total)~~ 26 | * (**IMPORTANT**) Restaurant: restaurant reviews from SemEval 2014 (rest14), restaurant reviews from SemEval 2015 (rest15), restaurant reviews from SemEval 2016 (rest16). Please refer to the newly updated files in ```./data``` 27 | * (**IMPORTANT**) **DO NOT** use the ```rest_total``` dataset that we built ourselves; more details can be found in [Updated Results](https://github.com/lixin4ever/BERT-E2E-ABSA/blob/master/README.md#updated-results-important). 28 | * Laptop: laptop reviews from SemEval 2014 (laptop14) 29 | 30 | 31 | ## Quick Start 32 | * The valid tagging strategies/schemes (i.e., the ways of representing text or entity spans) in this project are **BIEOS** (also called **BIOES** or **BMES**), **BIO** (also called **IOB2**) and **OT** (also called **IO**). If you are not familiar with these terms, I strongly recommend reading the following materials before running the program: 33 | 34 | a. [Inside–outside–beginning (tagging)](https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)). 35 | 36 | b. [Representing Text Chunks](https://www.aclweb.org/anthology/E99-1023.pdf). 37 | 38 | c. The [paper](https://www.aclweb.org/anthology/D19-5505.pdf) associated with this project. 39 | (For a concrete example: the two-token positive aspect span *onion rings* is tagged `B-POS E-POS` under BIEOS, `B-POS I-POS` under BIO, and `T-POS T-POS` under OT; a runnable conversion sketch follows at the end of this section.) 40 | * Reproduce the results on the Restaurant and Laptop datasets: 41 | ``` 42 | # train the model with 5 different seed numbers 43 | python fast_run.py 44 | ``` 45 | * Train the model on other ABSA datasets: 46 | 47 | 1. place data files in the directory `./data/[YOUR_DATASET_NAME]` (note that you need to re-organize your data files so that they can be directly consumed by this project; following the input format of `./data/laptop14/train.txt` is sufficient, see the format sketch after this section). 48 | 49 | 2. set `TASK_NAME` in `train.sh` as `[YOUR_DATASET_NAME]`. 50 | 51 | 3. train the model: `sh train.sh` 52 | 53 | * (**New feature**) Perform pure inference/direct transfer over test/unseen data using the trained ABSA model: 54 | 55 | 1. place the data file in the directory `./data/[YOUR_EVAL_DATASET_NAME]`. 56 | 57 | 2. set `TASK_NAME` in `work.sh` as `[YOUR_EVAL_DATASET_NAME]`. 58 | 59 | 3. set `ABSA_HOME` in `work.sh` as `[HOME_DIRECTORY_OF_PRETRAINED_ABSA_MODEL]`. 60 | 61 | 4. run: `sh work.sh`
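To make the task-specific layers listed under Architecture concrete, here is a minimal sketch (my illustration, not the repo's code) of the simplest variant, BERT + Linear (`--absa_type linear`); the full implementation, including the GRU/SAN/TFM/CRF heads, lives in `absa_layer.BertABSATagger` and is driven by `main.py`:

```python
# Hedged sketch of the BERT + Linear configuration; not the repo's actual code.
import torch.nn as nn
from transformers import BertModel


class BertLinearTaggerSketch(nn.Module):
    def __init__(self, num_labels, model_name='bert-base-uncased'):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        self.dropout = nn.Dropout(0.1)
        # one score per tag; with the BIEOS schema, work.py uses 14 labels
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        outputs = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
        # (batch, seq_len, hidden) -> (batch, seq_len, num_labels)
        return self.classifier(self.dropout(outputs.last_hidden_state))
```

The other variants only swap this linear head for a GRU, a self-attention block or a CRF on top of the same per-token BERT representations.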
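To see how the schemas relate in practice, the sketch below (assuming it is run from the repository root, so that `seq_utils.py` is importable) converts a toy tag sequence between the three schemas with the repo's own helpers; `predict()` in `work.py` applies exactly these conversions to normalize OT/BIO predictions to BIEOS before extracting aspect spans:

```python
# Toy example: "The onion rings are great" with the positive aspect "onion rings".
from seq_utils import ot2bieos_ts, bio2ot_ts, tag2ts

ot_tags = ['O', 'T-POS', 'T-POS', 'O', 'O']          # OT (a.k.a. IO)
bio_tags = ['O', 'B-POS', 'I-POS', 'O', 'O']         # BIO (a.k.a. IOB2)

print(ot2bieos_ts(ot_tags))                          # ['O', 'B-POS', 'E-POS', 'O', 'O']
print(ot2bieos_ts(bio2ot_ts(bio_tags)))              # the same BIEOS sequence
# tag2ts reduces a BIEOS sequence to (begin, end, sentiment) triples
print(tag2ts(ts_tag_sequence=ot2bieos_ts(ot_tags)))  # [(1, 2, 'POS')]
```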
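Regarding the input format mentioned in step 1 above: each line of a data file is the raw sentence, the separator `####`, and whitespace-separated `token=TAG` pairs (see `./data/test20/test.txt` in this dump). The helper below is hypothetical (not part of this repo) and assumes already-tokenized input with OT-style tags:

```python
# Hypothetical helper producing one line in the ./data/[YOUR_DATASET_NAME] format.
def to_absa_line(tokens, tags):
    # OT-style tags per token: O, T-POS, T-NEG or T-NEU
    assert len(tokens) == len(tags)
    tagged = ' '.join('%s=%s' % (tok, tag) for tok, tag in zip(tokens, tags))
    return '%s####%s' % (' '.join(tokens), tagged)

print(to_absa_line(['Serves', 'really', 'good', 'sushi', '!'],
                   ['O', 'O', 'O', 'T-POS', 'O']))
# Serves really good sushi !####Serves=O really=O good=O sushi=T-POS !=O
```

Note that in the shipped files the raw sentence on the left of `####` is not always whitespace-tokenized (e.g. `sushi!`); the annotated right-hand side is the part that carries the token/tag information.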
62 | 63 | ## Environment 64 | * OS: RHEL Server 6.4 (Santiago) 65 | * GPU: NVIDIA GTX 1080 Ti 66 | * CUDA: 10.0 67 | * cuDNN: v7.6.1 68 | 69 | ## Updated results (IMPORTANT) 70 | * The data files of the ```rest_total``` dataset were created by concatenating the train/test counterparts from ```rest14```, ```rest15``` and ```rest16```; our motivation was to build a larger training/testing dataset to stabilize training and faithfully reflect the capability of the ABSA model. However, we recently found that the SemEval organizers directly treat the union of ```rest15.train``` and ```rest15.test``` as the training set of rest16 (i.e., ```rest16.train```), and thus there exists overlap between ```rest_total.train``` and ```rest_total.test```, which makes this dataset invalid. When you follow our work on this E2E-ABSA task, we hope you **DO NOT** use this ```rest_total``` dataset any more but instead switch to the officially released ```rest14```, ```rest15``` and ```rest16```. 71 | * To facilitate comparison in the future, we re-ran our models under the above-mentioned settings and report the results (***micro-averaged F1***) on ```rest14```, ```rest15``` and ```rest16```: 72 | 73 | | Model | rest14 | rest15 | rest16 | 74 | | --- | --- | --- | --- | 75 | | E2E-ABSA (OURS) | 67.10 | 57.27 | 64.31 | 76 | | [(He et al., 2019)](https://arxiv.org/pdf/1906.06906.pdf) | 69.54 | 59.18 | n/a | 77 | | [(Liu et al., 2020)](https://arxiv.org/pdf/2004.06427.pdf) | 68.91 | 58.37 | n/a | 78 | | BERT-Linear (OURS) | 72.61 | 60.29 | 69.67 | 79 | | BERT-GRU (OURS) | 73.17 | 59.60 | 70.21 | 80 | | BERT-SAN (OURS) | 73.68 | 59.90 | 70.51 | 81 | | BERT-TFM (OURS) | 73.98 | 60.24 | 70.25 | 82 | | BERT-CRF (OURS) | 73.17 | 60.70 | 70.37 | 83 | | [(Chen and Qian, 2020)](https://www.aclweb.org/anthology/2020.acl-main.340.pdf)| 75.42 | 66.05 | n/a | 84 | | [(Liang et al., 2020)](https://arxiv.org/pdf/2004.01951.pdf)| 72.60 | 62.37 | n/a | 85 | 86 | ## Citation 87 | If the code is used in your research, please star our repo and cite our paper as follows: 88 | ``` 89 | @inproceedings{li-etal-2019-exploiting, 90 | title = "Exploiting {BERT} for End-to-End Aspect-based Sentiment Analysis", 91 | author = "Li, Xin and 92 | Bing, Lidong and 93 | Zhang, Wenxuan and 94 | Lam, Wai", 95 | booktitle = "Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)", 96 | year = "2019", 97 | url = "https://www.aclweb.org/anthology/D19-5505", 98 | pages = "34--41" 99 | } 100 | ``` 101 | 102 | -------------------------------------------------------------------------------- /bert.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # Copyright 2018 Google AI Language, Google Brain and Carnegie Mellon University Authors and the HuggingFace Inc. team. 3 | # Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License.
16 | 17 | from transformers import PreTrainedModel, BertModel, BertConfig, XLNetModel, XLNetConfig 18 | # model map for BERT 19 | from transformers import BERT_PRETRAINED_CONFIG_ARCHIVE_MAP 20 | # model map for XLNet 21 | from transformers import XLNET_PRETRAINED_CONFIG_ARCHIVE_MAP 22 | from transformers.models.bert.modeling_bert import BertEncoder, BertEmbeddings, BertPooler 23 | import torch.nn as nn 24 | from bert_utils import * 25 | 26 | 27 | BERT_PRETRAINED_MODEL_ARCHIVE_MAP = { 28 | 'bert-base-uncased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin', 29 | 'bert-large-uncased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin', 30 | 'bert-base-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin', 31 | 'bert-large-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin', 32 | 'bert-base-multilingual-uncased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-pytorch_model.bin', 33 | 'bert-base-multilingual-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-pytorch_model.bin', 34 | 'bert-base-chinese': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-pytorch_model.bin', 35 | 'bert-base-german-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-cased-pytorch_model.bin', 36 | 'bert-large-uncased-whole-word-masking': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin', 37 | 'bert-large-cased-whole-word-masking': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-pytorch_model.bin', 38 | 'bert-large-uncased-whole-word-masking-finetuned-squad': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin', 39 | 'bert-large-cased-whole-word-masking-finetuned-squad': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin', 40 | 'bert-base-cased-finetuned-mrpc': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-pytorch_model.bin', 41 | 'bert-base-german-dbmdz-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-pytorch_model.bin', 42 | 'bert-base-german-dbmdz-uncased': 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-pytorch_model.bin' 43 | } 44 | 45 | XLNET_PRETRAINED_MODEL_ARCHIVE_MAP = { 46 | 'xlnet-base-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-base-cased-pytorch_model.bin', 47 | 'xlnet-large-cased': 'https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-pytorch_model.bin' 48 | } 49 | 50 | 51 | class BertLayerNorm(nn.Module): 52 | def __init__(self, hidden_size, eps=1e-12): 53 | """Construct a layernorm module in the TF style (epsilon inside the square root). 
54 | """ 55 | super(BertLayerNorm, self).__init__() 56 | self.weight = nn.Parameter(torch.ones(hidden_size)) 57 | self.bias = nn.Parameter(torch.zeros(hidden_size)) 58 | self.variance_epsilon = eps 59 | 60 | def forward(self, x): 61 | u = x.mean(-1, keepdim=True) 62 | s = (x - u).pow(2).mean(-1, keepdim=True) 63 | x = (x - u) / torch.sqrt(s + self.variance_epsilon) 64 | return self.weight * x + self.bias 65 | 66 | 67 | class XLNetLayerNorm(nn.Module): 68 | def __init__(self, d_model, eps=1e-12): 69 | """Construct a layernorm module in the TF style (epsilon inside the square root). 70 | """ 71 | super(XLNetLayerNorm, self).__init__() 72 | self.weight = nn.Parameter(torch.ones(d_model)) 73 | self.bias = nn.Parameter(torch.zeros(d_model)) 74 | self.variance_epsilon = eps 75 | 76 | def forward(self, x): 77 | u = x.mean(-1, keepdim=True) 78 | s = (x - u).pow(2).mean(-1, keepdim=True) 79 | x = (x - u) / torch.sqrt(s + self.variance_epsilon) 80 | return self.weight * x + self.bias 81 | 82 | 83 | class BertPreTrainedModel(PreTrainedModel): 84 | """ An abstract class to handle weights initialization and 85 | a simple interface for dowloading and loading pretrained models. 86 | """ 87 | config_class = BertConfig 88 | pretrained_model_archive_map = BERT_PRETRAINED_MODEL_ARCHIVE_MAP 89 | load_tf_weights = load_tf_weights_in_bert 90 | base_model_prefix = "bert" 91 | 92 | def __init__(self, *inputs, **kwargs): 93 | super(BertPreTrainedModel, self).__init__(*inputs, **kwargs) 94 | 95 | def init_weights(self, module): 96 | """ Initialize the weights. 97 | """ 98 | if isinstance(module, (nn.Linear, nn.Embedding)): 99 | # Slightly different from the TF version which uses truncated_normal for initialization 100 | # cf https://github.com/pytorch/pytorch/pull/5617 101 | module.weight.data.normal_(mean=0.0, std=self.config.initializer_range) 102 | elif isinstance(module, BertLayerNorm): 103 | module.bias.data.zero_() 104 | module.weight.data.fill_(1.0) 105 | if isinstance(module, nn.Linear) and module.bias is not None: 106 | module.bias.data.zero_() 107 | 108 | 109 | class XLNetPreTrainedModel(PreTrainedModel): 110 | config_class = XLNetConfig 111 | pretrained_model_archive_map = XLNET_PRETRAINED_MODEL_ARCHIVE_MAP 112 | load_tf_weights = load_tf_weights_in_xlnet 113 | base_model_prefix = 'transformer' 114 | 115 | def __init__(self, *inputs, **kwargs): 116 | super(XLNetPreTrainedModel, self).__init__(*inputs, **kwargs) 117 | 118 | def init_weights(self, module): 119 | """ 120 | Initialize the weights. 121 | :param module: 122 | :return: 123 | """ 124 | if isinstance(module, (nn.Linear, nn.Embedding)): 125 | # Slightly different from the TF version which uses truncated_normal for initialization 126 | # cf https://github.com/pytorch/pytorch/pull/5617 127 | module.weight.data.normal_(mean=0.0, std=self.config.initializer_range) 128 | if isinstance(module, nn.Linear) and module.bias is not None: 129 | module.bias.data.zero_() 130 | elif isinstance(module, XLNetLayerNorm): 131 | module.bias.data.zero_() 132 | module.weight.data.fill_(1.0) 133 | elif isinstance(module, XLNetModel): 134 | module.mask_emb.data.normal_(mean=0.0, std=self.config.initializer_range) 135 | -------------------------------------------------------------------------------- /bert_utils.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # Copyright 2018 Google AI Language, Google Brain and Carnegie Mellon University Authors and the HuggingFace Inc. team. 
3 | # Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | 17 | import torch 18 | import logging 19 | import os 20 | logger = logging.getLogger(__name__) 21 | 22 | 23 | def build_tf_xlnet_to_pytorch_map(model, config, tf_weights=None): 24 | """ A map of modules from TF to PyTorch. 25 | I use a map to keep the PyTorch model as 26 | identical to the original PyTorch model as possible. 27 | """ 28 | tf_to_pt_map = {} 29 | 30 | if hasattr(model, 'transformer'): 31 | if hasattr(model, 'lm_loss'): 32 | # We will load also the output bias 33 | tf_to_pt_map['model/lm_loss/bias'] = model.lm_loss.bias 34 | if hasattr(model, 'sequence_summary') and 'model/sequnece_summary/summary/kernel' in tf_weights: 35 | # We will load also the sequence summary 36 | tf_to_pt_map['model/sequnece_summary/summary/kernel'] = model.sequence_summary.summary.weight 37 | tf_to_pt_map['model/sequnece_summary/summary/bias'] = model.sequence_summary.summary.bias 38 | if hasattr(model, 'logits_proj') and config.finetuning_task is not None \ 39 | and 'model/regression_{}/logit/kernel'.format(config.finetuning_task) in tf_weights: 40 | tf_to_pt_map['model/regression_{}/logit/kernel'.format(config.finetuning_task)] = model.logits_proj.weight 41 | tf_to_pt_map['model/regression_{}/logit/bias'.format(config.finetuning_task)] = model.logits_proj.bias 42 | 43 | # Now load the rest of the transformer 44 | model = model.transformer 45 | 46 | # Embeddings and output 47 | tf_to_pt_map.update({'model/transformer/word_embedding/lookup_table': model.word_embedding.weight, 48 | 'model/transformer/mask_emb/mask_emb': model.mask_emb}) 49 | 50 | # Transformer blocks 51 | for i, b in enumerate(model.layer): 52 | layer_str = "model/transformer/layer_%d/" % i 53 | tf_to_pt_map.update({ 54 | layer_str + "rel_attn/LayerNorm/gamma": b.rel_attn.layer_norm.weight, 55 | layer_str + "rel_attn/LayerNorm/beta": b.rel_attn.layer_norm.bias, 56 | layer_str + "rel_attn/o/kernel": b.rel_attn.o, 57 | layer_str + "rel_attn/q/kernel": b.rel_attn.q, 58 | layer_str + "rel_attn/k/kernel": b.rel_attn.k, 59 | layer_str + "rel_attn/r/kernel": b.rel_attn.r, 60 | layer_str + "rel_attn/v/kernel": b.rel_attn.v, 61 | layer_str + "ff/LayerNorm/gamma": b.ff.layer_norm.weight, 62 | layer_str + "ff/LayerNorm/beta": b.ff.layer_norm.bias, 63 | layer_str + "ff/layer_1/kernel": b.ff.layer_1.weight, 64 | layer_str + "ff/layer_1/bias": b.ff.layer_1.bias, 65 | layer_str + "ff/layer_2/kernel": b.ff.layer_2.weight, 66 | layer_str + "ff/layer_2/bias": b.ff.layer_2.bias, 67 | }) 68 | 69 | # Relative positioning biases 70 | if config.untie_r: 71 | r_r_list = [] 72 | r_w_list = [] 73 | r_s_list = [] 74 | seg_embed_list = [] 75 | for b in model.layer: 76 | r_r_list.append(b.rel_attn.r_r_bias) 77 | r_w_list.append(b.rel_attn.r_w_bias) 78 | r_s_list.append(b.rel_attn.r_s_bias) 79 | seg_embed_list.append(b.rel_attn.seg_embed) 80 | else: 81 | r_r_list = [model.r_r_bias] 82 | r_w_list = 
[model.r_w_bias] 83 | r_s_list = [model.r_s_bias] 84 | seg_embed_list = [model.seg_embed] 85 | 86 | tf_to_pt_map.update({ 87 | 'model/transformer/r_r_bias': r_r_list, 88 | 'model/transformer/r_w_bias': r_w_list, 89 | 'model/transformer/r_s_bias': r_s_list, 90 | 'model/transformer/seg_embed': seg_embed_list}) 91 | return tf_to_pt_map 92 | 93 | 94 | def load_tf_weights_in_bert(model, config, tf_checkpoint_path): 95 | """ Load tf checkpoints in a pytorch model. 96 | """ 97 | try: 98 | import re 99 | import numpy as np 100 | import tensorflow as tf 101 | except ImportError: 102 | logger.error("Loading a TensorFlow models in PyTorch, requires TensorFlow to be installed. Please see " 103 | "https://www.tensorflow.org/install/ for installation instructions.") 104 | raise 105 | tf_path = os.path.abspath(tf_checkpoint_path) 106 | logger.info("Converting TensorFlow checkpoint from {}".format(tf_path)) 107 | # Load weights from TF model 108 | init_vars = tf.train.list_variables(tf_path) 109 | names = [] 110 | arrays = [] 111 | for name, shape in init_vars: 112 | logger.info("Loading TF weight {} with shape {}".format(name, shape)) 113 | array = tf.train.load_variable(tf_path, name) 114 | names.append(name) 115 | arrays.append(array) 116 | 117 | for name, array in zip(names, arrays): 118 | name = name.split('/') 119 | # adam_v and adam_m are variables used in AdamWeightDecayOptimizer to calculated m and v 120 | # which are not required for using pretrained model 121 | if any(n in ["adam_v", "adam_m", "global_step"] for n in name): 122 | logger.info("Skipping {}".format("/".join(name))) 123 | continue 124 | pointer = model 125 | for m_name in name: 126 | if re.fullmatch(r'[A-Za-z]+_\d+', m_name): 127 | l = re.split(r'_(\d+)', m_name) 128 | else: 129 | l = [m_name] 130 | if l[0] == 'kernel' or l[0] == 'gamma': 131 | pointer = getattr(pointer, 'weight') 132 | elif l[0] == 'output_bias' or l[0] == 'beta': 133 | pointer = getattr(pointer, 'bias') 134 | elif l[0] == 'output_weights': 135 | pointer = getattr(pointer, 'weight') 136 | elif l[0] == 'squad': 137 | pointer = getattr(pointer, 'classifier') 138 | else: 139 | try: 140 | pointer = getattr(pointer, l[0]) 141 | except AttributeError: 142 | logger.info("Skipping {}".format("/".join(name))) 143 | continue 144 | if len(l) >= 2: 145 | num = int(l[1]) 146 | pointer = pointer[num] 147 | if m_name[-11:] == '_embeddings': 148 | pointer = getattr(pointer, 'weight') 149 | elif m_name == 'kernel': 150 | array = np.transpose(array) 151 | try: 152 | assert pointer.shape == array.shape 153 | except AssertionError as e: 154 | e.args += (pointer.shape, array.shape) 155 | raise 156 | logger.info("Initialize PyTorch weight {}".format(name)) 157 | pointer.data = torch.from_numpy(array) 158 | return model 159 | 160 | 161 | def load_tf_weights_in_xlnet(model, config, tf_path): 162 | """ Load tf checkpoints in a pytorch model. 163 | """ 164 | try: 165 | import numpy as np 166 | import tensorflow as tf 167 | except ImportError: 168 | logger.error("Loading a TensorFlow models in PyTorch, requires TensorFlow to be installed. 
" 169 | "Please see https://www.tensorflow.org/install/ for installation instructions.") 170 | 171 | # load weights from TF model 172 | init_vars = tf.train.list_variables(tf_path) 173 | tf_weights = {} 174 | for name, shape in init_vars: 175 | logger.info("Loading TF weight {} with shape {}".format(name, shape)) 176 | array = tf.train.load_variable(tf_path, name) 177 | tf_weights[name] = array 178 | 179 | # Build TF to PyTorch weights loading map 180 | tf_to_pt_map = build_tf_xlnet_to_pytorch_map(model, config, tf_weights) 181 | 182 | for name, pointer in tf_to_pt_map.items(): 183 | logger.info("Importing {}".format(name)) 184 | if name not in tf_weights: 185 | logger.info("{} not in tf pre-trained weights, skipping".format(name)) 186 | continue 187 | array = tf_weights[name] 188 | # adam_v and adam_m are variables used in AdamWeightDecayOptimizer to calculated m and v 189 | # which are not required for using pretrained model 190 | if 'kernel' in name and ('ff' in name or 'summary' in name or 'logit' in name): 191 | logger.info("Transposing") 192 | array = np.transpose(array) 193 | 194 | if isinstance(pointer, list): 195 | # Here we will split the TF weigths 196 | assert len(pointer) == array.shape[0] 197 | for i, p_i in enumerate(pointer): 198 | arr_i = array[i, ...] 199 | try: 200 | assert p_i.shape == arr_i.shape 201 | except AssertionError as e: 202 | e.args += (p_i.shape, arr_i.shape) 203 | raise 204 | logger.info("Initialize PyTorch weight {} for layer {}".format(name, i)) 205 | p_i.data = torch.from_numpy(arr_i) 206 | 207 | else: 208 | try: 209 | assert pointer.shape == array.shape 210 | except AssertionError as e: 211 | e.args += (pointer.shape, array.shape) 212 | raise 213 | logger.info("Initialize PyTorch weight {}".format(name)) 214 | pointer.data = torch.from_numpy(array) 215 | tf_weights.pop(name, None) 216 | tf_weights.pop(name + '/Adam', None) 217 | tf_weights.pop(name + '/Adam_1', None) 218 | logger.info("Weights not copied to PyTorch model: {}".format(', '.join(tf_weights.keys()))) 219 | return model -------------------------------------------------------------------------------- /work.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import torch 4 | import numpy as np 5 | 6 | from glue_utils import convert_examples_to_seq_features, compute_metrics_absa, ABSAProcessor 7 | from tqdm import tqdm 8 | from transformers import BertConfig, BertTokenizer, XLNetConfig, XLNetTokenizer, WEIGHTS_NAME 9 | from absa_layer import BertABSATagger 10 | from torch.utils.data import DataLoader, TensorDataset, SequentialSampler 11 | from seq_utils import ot2bieos_ts, bio2ot_ts, tag2ts 12 | 13 | #ALL_MODELS = sum((tuple(conf.pretrained_config_archive_map.keys()) for conf in (BertConfig, XLNetConfig)), ()) 14 | ALL_MODELS = ( 15 | 'bert-base-uncased', 16 | 'bert-large-uncased', 17 | 'bert-base-cased', 18 | 'bert-large-cased', 19 | 'bert-base-multilingual-uncased', 20 | 'bert-base-multilingual-cased', 21 | 'bert-base-chinese', 22 | 'bert-base-german-cased', 23 | 'bert-large-uncased-whole-word-masking', 24 | 'bert-large-cased-whole-word-masking', 25 | 'bert-large-uncased-whole-word-masking-finetuned-squad', 26 | 'bert-large-cased-whole-word-masking-finetuned-squad', 27 | 'bert-base-cased-finetuned-mrpc', 28 | 'bert-base-german-dbmdz-cased', 29 | 'bert-base-german-dbmdz-uncased', 30 | 'xlnet-base-cased', 31 | 'xlnet-large-cased' 32 | ) 33 | 34 | 35 | MODEL_CLASSES = { 36 | 'bert': (BertConfig, BertABSATagger, 
BertTokenizer), 37 | } 38 | 39 | 40 | def load_and_cache_examples(args, task, tokenizer): 41 | # similar to that in main.py 42 | processor = ABSAProcessor() 43 | # Load data features from cache or dataset file 44 | cached_features_file = os.path.join(args.data_dir, 'cached_{}_{}_{}_{}'.format( 45 | 'test', 46 | list(filter(None, args.model_name_or_path.split('/'))).pop(), 47 | str(args.max_seq_length), 48 | str(task))) 49 | if os.path.exists(cached_features_file): 50 | print("cached_features_file:", cached_features_file) 51 | features = torch.load(cached_features_file) 52 | examples = processor.get_test_examples(args.data_dir, args.tagging_schema) 53 | else: 54 | #logger.info("Creating features from dataset file at %s", args.data_dir) 55 | label_list = processor.get_labels(args.tagging_schema) 56 | examples = processor.get_test_examples(args.data_dir, args.tagging_schema) 57 | features = convert_examples_to_seq_features(examples=examples, label_list=label_list, tokenizer=tokenizer, 58 | cls_token_at_end=bool(args.model_type in ['xlnet']), 59 | cls_token=tokenizer.cls_token, 60 | sep_token=tokenizer.sep_token, 61 | cls_token_segment_id=2 if args.model_type in ['xlnet'] else 0, 62 | pad_on_left=bool(args.model_type in ['xlnet']), 63 | pad_token_segment_id=4 if args.model_type in ['xlnet'] else 0) 64 | torch.save(features, cached_features_file) 65 | total_words = [] 66 | for input_example in examples: 67 | text = input_example.text_a 68 | total_words.append(text.split(' ')) 69 | 70 | # Convert to Tensors and build dataset 71 | all_input_ids = torch.tensor([f.input_ids for f in features], dtype=torch.long) 72 | all_input_mask = torch.tensor([f.input_mask for f in features], dtype=torch.long) 73 | all_segment_ids = torch.tensor([f.segment_ids for f in features], dtype=torch.long) 74 | 75 | all_label_ids = torch.tensor([f.label_ids for f in features], dtype=torch.long) 76 | # used in evaluation 77 | all_evaluate_label_ids = [f.evaluate_label_ids for f in features] 78 | dataset = TensorDataset(all_input_ids, all_input_mask, all_segment_ids, all_label_ids) 79 | return dataset, all_evaluate_label_ids, total_words 80 | 81 | 82 | def init_args(): 83 | parser = argparse.ArgumentParser() 84 | parser.add_argument("--absa_home", type=str, required=True, help="Home directory of the trained ABSA model") 85 | parser.add_argument("--ckpt", type=str, required=True, help="Directory of model checkpoint for evaluation") 86 | parser.add_argument("--data_dir", type=str, required=True, 87 | help="The incoming data dir. Should contain the files of test/unseen data") 88 | parser.add_argument("--task_name", type=str, required=True, help="task name") 89 | parser.add_argument("--model_type", default=None, type=str, required=True, 90 | help="Model type selected in the list: " + ", ".join(MODEL_CLASSES.keys())) 91 | parser.add_argument("--model_name_or_path", default=None, type=str, required=True, 92 | help="Path to pre-trained model or shortcut name selected in the list: " + ", ".join(ALL_MODELS)) 93 | parser.add_argument("--cache_dir", default="", type=str, 94 | help="Where do you want to store the pre-trained models downloaded from s3") 95 | parser.add_argument("--max_seq_length", default=128, type=int, 96 | help="The maximum total input sequence length after tokenization. 
Sequences longer " 97 | "than this will be truncated, sequences shorter will be padded.") 98 | parser.add_argument('--tagging_schema', type=str, default='BIEOS', help="Tagging schema, should be kept same with " 99 | "that of ckpt") 100 | 101 | args = parser.parse_args() 102 | 103 | return args 104 | 105 | 106 | def main(): 107 | # perform evaluation on single GPU 108 | args = init_args() 109 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 110 | args.device = device 111 | if torch.cuda.is_available(): 112 | args.n_gpu = torch.cuda.device_count() 113 | 114 | args.model_type = args.model_type.lower() 115 | _, model_class, tokenizer_class = MODEL_CLASSES[args.model_type] 116 | 117 | # load the trained model (including the fine-tuned GPT/BERT/XLNET) 118 | print("Load checkpoint %s/%s..." % (args.ckpt, WEIGHTS_NAME)) 119 | model = model_class.from_pretrained(args.ckpt) 120 | # follow the property of tokenizer in the loaded model, e.g., do_lower_case=True 121 | tokenizer = tokenizer_class.from_pretrained(args.absa_home) 122 | model.to(args.device) 123 | model.eval() 124 | predict(args, model, tokenizer) 125 | 126 | 127 | def predict(args, model, tokenizer): 128 | dataset, evaluate_label_ids, total_words = load_and_cache_examples(args, args.task_name, tokenizer) 129 | sampler = SequentialSampler(dataset) 130 | # process the incoming data one by one 131 | dataloader = DataLoader(dataset, sampler=sampler, batch_size=1) 132 | print("***** Running prediction *****") 133 | 134 | total_preds, gold_labels = None, None 135 | idx = 0 136 | if args.tagging_schema == 'BIEOS': 137 | absa_label_vocab = {'O': 0, 'EQ': 1, 'B-POS': 2, 'I-POS': 3, 'E-POS': 4, 'S-POS': 5, 138 | 'B-NEG': 6, 'I-NEG': 7, 'E-NEG': 8, 'S-NEG': 9, 139 | 'B-NEU': 10, 'I-NEU': 11, 'E-NEU': 12, 'S-NEU': 13} 140 | elif args.tagging_schema == 'BIO': 141 | absa_label_vocab = {'O': 0, 'EQ': 1, 'B-POS': 2, 'I-POS': 3, 142 | 'B-NEG': 4, 'I-NEG': 5, 'B-NEU': 6, 'I-NEU': 7} 143 | elif args.tagging_schema == 'OT': 144 | absa_label_vocab = {'O': 0, 'EQ': 1, 'T-POS': 2, 'T-NEG': 3, 'T-NEU': 4} 145 | else: 146 | raise Exception("Invalid tagging schema %s..." 
% args.tagging_schema) 147 | absa_id2tag = {} 148 | for k in absa_label_vocab: 149 | v = absa_label_vocab[k] 150 | absa_id2tag[v] = k 151 | 152 | for batch in tqdm(dataloader, desc="Evaluating"): 153 | batch = tuple(t.to(args.device) for t in batch) 154 | with torch.no_grad(): 155 | inputs = {'input_ids': batch[0], 156 | 'attention_mask': batch[1], 157 | 'token_type_ids': batch[2] if args.model_type in ['bert', 'xlnet'] else None, 158 | # XLM doesn't use segment_ids 159 | 'labels': batch[3]} 160 | outputs = model(**inputs) 161 | # logits: (1, seq_len, label_size) 162 | logits = outputs[1] 163 | # preds: (1, seq_len) 164 | if model.tagger_config.absa_type != 'crf': 165 | preds = np.argmax(logits.detach().cpu().numpy(), axis=-1) 166 | else: 167 | mask = batch[1] 168 | preds = model.tagger.viterbi_tags(logits=logits, mask=mask) 169 | label_indices = evaluate_label_ids[idx] 170 | words = total_words[idx] 171 | pred_labels = preds[0][label_indices] 172 | assert len(words) == len(pred_labels) 173 | pred_tags = [absa_id2tag[label] for label in pred_labels] 174 | 175 | if args.tagging_schema == 'OT': 176 | pred_tags = ot2bieos_ts(pred_tags) 177 | elif args.tagging_schema == 'BIO': 178 | pred_tags = ot2bieos_ts(bio2ot_ts(pred_tags)) 179 | else: 180 | # current tagging schema is BIEOS, do nothing 181 | pass 182 | p_ts_sequence = tag2ts(ts_tag_sequence=pred_tags) 183 | output_ts = [] 184 | for t in p_ts_sequence: 185 | beg, end, sentiment = t 186 | aspect = ' '.join(words[beg:end+1]) 187 | output_ts.append('%s: %s' % (aspect, sentiment)) 188 | print("Input: %s, output: %s" % (' '.join(words), '\t'.join(output_ts))) 189 | if inputs['labels'] is not None: 190 | # for the unseen data, there is no ``labels'' 191 | if gold_labels is None: 192 | gold_labels = inputs['labels'].detach().cpu().numpy() 193 | else: 194 | gold_labels = np.append(gold_labels, inputs['labels'].detach().cpu().numpy(), axis=0) 195 | idx += 1 196 | 197 | 198 | if __name__ == "__main__": 199 | main() 200 | 201 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files.
29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. 
If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. 
Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 
202 | -------------------------------------------------------------------------------- /seq_utils.py: -------------------------------------------------------------------------------- 1 | # sequence utility functions 2 | import torch 3 | import math 4 | import numpy as np 5 | import logging; logger = logging.getLogger(__name__) 6 | def ot2bieos_ts(ts_tag_sequence): 7 | """ 8 | ot2bieos function for targeted-sentiment task, ts refers to targeted-sentiment / aspect-based sentiment 9 | :param ts_tag_sequence: tag sequence for targeted sentiment 10 | :return: 11 | """ 12 | n_tags = len(ts_tag_sequence) 13 | new_ts_sequence = [] 14 | prev_pos = '$$$' 15 | for i in range(n_tags): 16 | cur_ts_tag = ts_tag_sequence[i] 17 | if cur_ts_tag == 'O' or cur_ts_tag == 'EQ': 18 | # when meet the EQ label, regard it as O label 19 | new_ts_sequence.append('O') 20 | cur_pos = 'O' 21 | else: 22 | cur_pos, cur_sentiment = cur_ts_tag.split('-') 23 | # cur_pos is T 24 | if cur_pos != prev_pos: 25 | # prev_pos is O and new_cur_pos can only be B or S 26 | if i == n_tags - 1: 27 | new_ts_sequence.append('S-%s' % cur_sentiment) 28 | else: 29 | next_ts_tag = ts_tag_sequence[i + 1] 30 | if next_ts_tag == 'O': 31 | new_ts_sequence.append('S-%s' % cur_sentiment) 32 | else: 33 | new_ts_sequence.append('B-%s' % cur_sentiment) 34 | else: 35 | # prev_pos is T and new_cur_pos can only be I or E 36 | if i == n_tags - 1: 37 | new_ts_sequence.append('E-%s' % cur_sentiment) 38 | else: 39 | next_ts_tag = ts_tag_sequence[i + 1] 40 | if next_ts_tag == 'O': 41 | new_ts_sequence.append('E-%s' % cur_sentiment) 42 | else: 43 | new_ts_sequence.append('I-%s' % cur_sentiment) 44 | prev_pos = cur_pos 45 | return new_ts_sequence 46 | 47 | 48 | def ot2bieos_ts_batch(ts_tag_seqs): 49 | """ 50 | batch version of function ot2bieos_ts 51 | :param ts_tag_seqs: 52 | :return: 53 | """ 54 | new_ts_tag_seqs = [] 55 | n_seqs = len(ts_tag_seqs) 56 | for i in range(n_seqs): 57 | new_ts_seq = ot2bieos_ts(ts_tag_sequence=ts_tag_seqs[i]) 58 | new_ts_tag_seqs.append(new_ts_seq) 59 | return new_ts_tag_seqs 60 | 61 | 62 | def ot2bio_ts(ts_tag_sequence): 63 | """ 64 | ot2bio function for ts tag sequence 65 | :param ts_tag_sequence: 66 | :return: 67 | """ 68 | new_ts_sequence = [] 69 | n_tag = len(ts_tag_sequence) 70 | prev_pos = '$$$' 71 | for i in range(n_tag): 72 | cur_ts_tag = ts_tag_sequence[i] 73 | if cur_ts_tag == 'O': 74 | new_ts_sequence.append('O') 75 | cur_pos = 'O' 76 | else: 77 | # current tag is subjective tag, i.e., cur_pos is T 78 | # print(cur_ts_tag) 79 | cur_pos, cur_sentiment = cur_ts_tag.split('-') 80 | if cur_pos == prev_pos: 81 | # prev_pos is T 82 | new_ts_sequence.append('I-%s' % cur_sentiment) 83 | else: 84 | # prev_pos is O 85 | new_ts_sequence.append('B-%s' % cur_sentiment) 86 | prev_pos = cur_pos 87 | return new_ts_sequence 88 | 89 | 90 | def ot2bio_ts_batch(ts_tag_seqs): 91 | """ 92 | batch version of function ot2bio_ts 93 | :param ts_tag_seqs: 94 | :return: 95 | """ 96 | new_ts_tag_seqs = [] 97 | n_seqs = len(ts_tag_seqs) 98 | for i in range(n_seqs): 99 | new_ts_seq = ot2bio_ts(ts_tag_sequence=ts_tag_seqs[i]) 100 | new_ts_tag_seqs.append(new_ts_seq) 101 | return new_ts_tag_seqs 102 | 103 | 104 | def bio2ot_ts(ts_tag_sequence): 105 | """ 106 | perform bio-->ot for ts tag sequence 107 | :param ts_tag_sequence: 108 | :return: 109 | """ 110 | new_ts_sequence = [] 111 | n_tags = len(ts_tag_sequence) 112 | for i in range(n_tags): 113 | ts_tag = ts_tag_sequence[i] 114 | if ts_tag == 'O' or ts_tag == 'EQ': 115 | new_ts_sequence.append('O') 116 | else: 117 | pos, sentiment = ts_tag.split('-')
118 | new_ts_sequence.append('T-%s' % sentiment) 119 | return new_ts_sequence 120 | 121 | 122 | def bio2ot_ts_batch(ts_tag_seqs): 123 | """ 124 | batch version of function bio2ot_ts 125 | :param ts_tag_seqs: 126 | :return: 127 | """ 128 | new_ts_tag_seqs = [] 129 | n_seqs = len(ts_tag_seqs) 130 | for i in range(n_seqs): 131 | new_ts_seq = bio2ot_ts(ts_tag_sequence=ts_tag_seqs[i]) 132 | new_ts_tag_seqs.append(new_ts_seq) 133 | return new_ts_tag_seqs 134 | 135 | 136 | def tag2ts(ts_tag_sequence): 137 | """ 138 | transform ts tag sequence to targeted sentiment 139 | :param ts_tag_sequence: tag sequence for ts task 140 | :return: 141 | """ 142 | n_tags = len(ts_tag_sequence) 143 | ts_sequence, sentiments = [], [] 144 | beg, end = -1, -1 145 | for i in range(n_tags): 146 | ts_tag = ts_tag_sequence[i] 147 | # current position and sentiment 148 | # tag O and tag EQ will not be counted 149 | eles = ts_tag.split('-') 150 | if len(eles) == 2: 151 | pos, sentiment = eles 152 | else: 153 | pos, sentiment = 'O', 'O' 154 | if sentiment != 'O': 155 | # current word is a subjective word 156 | sentiments.append(sentiment) 157 | if pos == 'S': 158 | # singleton 159 | ts_sequence.append((i, i, sentiment)) 160 | sentiments = [] 161 | elif pos == 'B': 162 | beg = i 163 | if len(sentiments) > 1: 164 | # remove the effect of the noisy I-{POS,NEG,NEU} 165 | sentiments = [sentiments[-1]] 166 | elif pos == 'E': 167 | end = i 168 | # schema1: only the consistent sentiment tags are accepted 169 | # that is, all of the sentiment tags are the same 170 | if end > beg > -1 and len(set(sentiments)) == 1: 171 | ts_sequence.append((beg, end, sentiment)) 172 | sentiments = [] 173 | beg, end = -1, -1 174 | return ts_sequence 175 | 176 | 177 | def logsumexp(tensor, dim=-1, keepdim=False): 178 | """ 179 | 180 | :param tensor: 181 | :param dim: 182 | :param keepdim: 183 | :return: 184 | """ 185 | max_score, _ = tensor.max(dim, keepdim=keepdim) 186 | if keepdim: 187 | stable_vec = tensor - max_score 188 | else: 189 | stable_vec = tensor - max_score.unsqueeze(dim) 190 | return max_score + (stable_vec.exp().sum(dim, keepdim=keepdim)).log() 191 | 192 | 193 | def viterbi_decode(tag_sequence, transition_matrix, 194 | tag_observations=None, allowed_start_transitions=None, 195 | allowed_end_transitions=None): 196 | """ 197 | Perform Viterbi decoding in log space over a sequence given a transition matrix 198 | specifying pairwise (transition) potentials between tags and a matrix of shape 199 | (sequence_length, num_tags) specifying unary potentials for possible tags per 200 | timestep. 201 | Parameters 202 | ---------- 203 | tag_sequence : torch.Tensor, required. 204 | A tensor of shape (sequence_length, num_tags) representing scores for 205 | a set of tags over a given sequence. 206 | transition_matrix : torch.Tensor, required. 207 | A tensor of shape (num_tags, num_tags) representing the binary potentials 208 | for transitioning between a given pair of tags. 209 | tag_observations : Optional[List[int]], optional, (default = None) 210 | A list of length ``sequence_length`` containing the class ids of observed 211 | elements in the sequence, with unobserved elements being set to -1. Note that 212 | it is possible to provide evidence which results in degenerate labelings if 213 | the sequences of tags you provide as evidence cannot transition between each 214 | other, or those transitions are extremely unlikely. 
In this situation we log a 215 | warning, but the responsibility for providing self-consistent evidence ultimately 216 | lies with the user. 217 | allowed_start_transitions : torch.Tensor, optional, (default = None) 218 | An optional tensor of shape (num_tags,) describing which tags the START token 219 | may transition *to*. If provided, additional transition constraints will be used for 220 | determining the start element of the sequence. 221 | allowed_end_transitions : torch.Tensor, optional, (default = None) 222 | An optional tensor of shape (num_tags,) describing which tags may transition *to* the 223 | end tag. If provided, additional transition constraints will be used for determining 224 | the end element of the sequence. 225 | Returns 226 | ------- 227 | viterbi_path : List[int] 228 | The tag indices of the maximum likelihood tag sequence. 229 | viterbi_score : torch.Tensor 230 | The score of the viterbi path. 231 | """ 232 | sequence_length, num_tags = list(tag_sequence.size()) 233 | 234 | has_start_end_restrictions = allowed_end_transitions is not None or allowed_start_transitions is not None 235 | 236 | if has_start_end_restrictions: 237 | 238 | if allowed_end_transitions is None: 239 | allowed_end_transitions = torch.zeros(num_tags) 240 | if allowed_start_transitions is None: 241 | allowed_start_transitions = torch.zeros(num_tags) 242 | 243 | num_tags = num_tags + 2 244 | new_transition_matrix = torch.zeros(num_tags, num_tags) 245 | new_transition_matrix[:-2, :-2] = transition_matrix 246 | 247 | # Start and end transitions are fully defined, but cannot transition between each other. 248 | # pylint: disable=not-callable 249 | allowed_start_transitions = torch.cat([allowed_start_transitions, torch.tensor([-math.inf, -math.inf])]) 250 | allowed_end_transitions = torch.cat([allowed_end_transitions, torch.tensor([-math.inf, -math.inf])]) 251 | # pylint: enable=not-callable 252 | 253 | # First define how we may transition FROM the start and end tags. 254 | new_transition_matrix[-2, :] = allowed_start_transitions 255 | # We cannot transition from the end tag to any tag. 256 | new_transition_matrix[-1, :] = -math.inf 257 | 258 | new_transition_matrix[:, -1] = allowed_end_transitions 259 | # We cannot transition to the start tag from any tag. 260 | new_transition_matrix[:, -2] = -math.inf 261 | 262 | transition_matrix = new_transition_matrix 263 | 264 | if tag_observations: 265 | if len(tag_observations) != sequence_length: 266 | raise Exception("Observations were provided, but they were not the same length " 267 | "as the sequence. Found sequence of length: {} and evidence: {}" 268 | .format(sequence_length, tag_observations)) 269 | else: 270 | tag_observations = [-1 for _ in range(sequence_length)] 271 | 272 | 273 | if has_start_end_restrictions: 274 | tag_observations = [num_tags - 2] + tag_observations + [num_tags - 1] 275 | zero_sentinel = torch.zeros(1, num_tags) 276 | extra_tags_sentinel = torch.ones(sequence_length, 2) * -math.inf 277 | tag_sequence = torch.cat([tag_sequence, extra_tags_sentinel], -1) 278 | tag_sequence = torch.cat([zero_sentinel, tag_sequence, zero_sentinel], 0) 279 | sequence_length = tag_sequence.size(0) 280 | 281 | path_scores = [] 282 | path_indices = [] 283 | 284 | if tag_observations[0] != -1: 285 | one_hot = torch.zeros(num_tags) 286 | one_hot[tag_observations[0]] = 100000. 287 | path_scores.append(one_hot) 288 | else: 289 | path_scores.append(tag_sequence[0, :]) 290 | 291 | # Evaluate the scores for all possible paths. 
292 | for timestep in range(1, sequence_length): 293 | # Add pairwise potentials to current scores. 294 | summed_potentials = path_scores[timestep - 1].unsqueeze(-1) + transition_matrix 295 | scores, paths = torch.max(summed_potentials, 0) 296 | 297 | # If we have an observation for this timestep, use it 298 | # instead of the distribution over tags. 299 | observation = tag_observations[timestep] 300 | # Warn the user if they have passed 301 | # invalid/extremely unlikely evidence. 302 | if tag_observations[timestep - 1] != -1 and observation != -1: 303 | if transition_matrix[tag_observations[timestep - 1], observation] < -10000: 304 | logger.warning("The pairwise potential between tags you have passed as " 305 | "observations is extremely unlikely. Double check your evidence " 306 | "or transition potentials!") 307 | if observation != -1: 308 | one_hot = torch.zeros(num_tags) 309 | one_hot[observation] = 100000. 310 | path_scores.append(one_hot) 311 | else: 312 | path_scores.append(tag_sequence[timestep, :] + scores.squeeze()) 313 | path_indices.append(paths.squeeze()) 314 | 315 | # Construct the most likely sequence backwards. 316 | viterbi_score, best_path = torch.max(path_scores[-1], 0) 317 | viterbi_path = [int(best_path.numpy())] 318 | for backward_timestep in reversed(path_indices): 319 | viterbi_path.append(int(backward_timestep[viterbi_path[-1]])) 320 | # Reverse the backward path. 321 | viterbi_path.reverse() 322 | 323 | if has_start_end_restrictions: 324 | viterbi_path = viterbi_path[1:-1] 325 | #return viterbi_path, viterbi_score 326 | return np.array(viterbi_path, dtype=np.int32) 327 | 328 | 329 | 330 | -------------------------------------------------------------------------------- /data/rest15/dev.txt: -------------------------------------------------------------------------------- 1 | Judging from previous posts this used to be a good place, but not any longer.####Judging=O from=O previous=O posts=O this=O used=O to=O be=O a=O good=O place=T-NEG ,=O but=O not=O any=O longer=O .=O 2 | The duck confit is always amazing and the foie gras terrine with figs was out of this world.####The=O duck=T-POS confit=T-POS is=O always=O amazing=O and=O the=O foie=T-POS gras=T-POS terrine=T-POS with=T-POS figs=T-POS was=O out=O of=O this=O world=O .=O 3 | we love th pink pony.####we=O love=O th=O pink=T-POS pony=T-POS .=O 4 | well, i didn't find it there, and trust, i have told everyone i can think of about my experience. 
####well=O ,=O i=O did=O n't=O find=O it=O there=O ,=O and=O trust=O ,=O i=O have=O told=O everyone=O i=O can=O think=O of=O about=O my=O experience=O .=O 5 | This place has got to be the best japanese restaurant in the new york area.####This=O place=T-POS has=O got=O to=O be=O the=O best=O japanese=O restaurant=O in=O the=O new=O york=O area=O .=O 6 | If you've ever been along the river in Weehawken you have an idea of the top of view the chart house has to offer.####If=O you=O 've=O ever=O been=O along=O the=O river=O in=O Weehawken=O you=O have=O an=O idea=O of=O the=O top=O of=O view=T-POS the=O chart=O house=O has=O to=O offer=O .=O 7 | This tiny restaurant is as cozy as it gets, with that certain Parisian flair.####This=O tiny=O restaurant=T-POS is=O as=O cozy=O as=O it=O gets=O ,=O with=O that=O certain=O Parisian=O flair=O .=O 8 | The pizza was delivered cold and the cheese wasn't even fully melted!####The=O pizza=T-NEG was=O delivered=O cold=O and=O the=O cheese=T-NEG was=O n't=O even=O fully=O melted=O !=O 9 | My wife and I always enjoy the young, not always well trained but nevertheless friendly, staff, all of whom have a story.####My=O wife=O and=O I=O always=O enjoy=O the=O young=O ,=O not=O always=O well=O trained=O but=O nevertheless=O friendly=O ,=O staff=T-POS ,=O all=O of=O whom=O have=O a=O story=O .=O 10 | Sit outside in the warm weather; inside for cozy winter.####Sit=O outside=O in=O the=O warm=O weather=O ;=O inside=O for=O cozy=O winter=O .=O 11 | They refuse to seat parties of 3 or more on weekends.####They=O refuse=O to=O seat=O parties=O of=O 3=O or=O more=O on=O weekends=O .=O 12 | The hostess is rude to the point of being offensive.####The=O hostess=T-NEG is=O rude=O to=O the=O point=O of=O being=O offensive=O .=O 13 | Try everything for that matter, it is all good.####Try=O everything=O for=O that=O matter=O ,=O it=O is=O all=O good=O .=O 14 | Veal Parmigana - Better than Patsy's!####Veal=O Parmigana=O ,=O Better=O than=O Patsy=O 's=O !=O 15 | Even after they overcharged me the last time I was there.####Even=O after=O they=O overcharged=O me=O the=O last=O time=O I=O was=O there=O .=O 16 | Make sure you have the Spicy Scallop roll.. .####Make=O sure=O you=O have=O the=O Spicy=T-POS Scallop=T-POS roll=T-POS .=O 17 | The drinks are always welll made and wine selection is fairly priced.####The=O drinks=T-POS are=O always=O welll=O made=O and=O wine=T-POS selection=T-POS is=O fairly=O priced=O .=O 18 | Try their chef's specials-- they are to die for.####Try=O their=O chef's=T-POS specials=T-POS they=O are=O to=O die=O for=O .=O 19 | Then, to top things off, she dropped used silverware on my boyfriend's jacket and did not stop to apologize or clean the mess that was left on clothes. 
####Then=O ,=O to=O top=O things=O off=O ,=O she=O dropped=O used=O silverware=O on=O my=O boyfriend=O 's=O jacket=O and=O did=O not=O stop=O to=O apologize=O or=O clean=O the=O mess=O that=O was=O left=O on=O clothes=O .=O 20 | I had a grat time at Jekyll and Hyde!####I=O had=O a=O grat=O time=O at=O Jekyll=T-POS and=T-POS Hyde=T-POS !=O 21 | The outdoor atmosphere of sitting on the sidewalk watching the world go by 50 feet away on 6th avenue on a cool evening was wonderful.####The=O outdoor=T-POS atmosphere=T-POS of=O sitting=O on=O the=O sidewalk=O watching=O the=O world=O go=O by=O 50=O feet=O away=O on=O 6th=O avenue=O on=O a=O cool=O evening=O was=O wonderful=O .=O 22 | Great service, great food.####Great=O service=T-POS ,=O great=O food=T-POS .=O 23 | When I lived upstate for a while I would buy freeze the bagels and they would still be better than any else.####When=O I=O lived=O upstate=O for=O a=O while=O I=O would=O buy=O freeze=O the=O bagels=T-POS and=O they=O would=O still=O be=O better=O than=O any=O else=O .=O 24 | Aside from the Sea Urchin, the chef recommended an assortment of fish including Fatty Yellow Tail, Boton Shrimp, Blue Fin Torro (Fatty Tuna), Sea Eel, etc.####Aside=O from=O the=O Sea=O Urchin=O ,=O the=O chef=O recommended=O an=O assortment=O of=O fish=O including=O Fatty=O Yellow=O Tail=O ,=O Boton=O Shrimp=O ,=O Blue=O Fin=O Torro=O Fatty=O Tuna=O ,=O Sea=O Eel=O ,=O etc=O .=O 25 | If you are the type of person who likes being scared and entertained, this is a great place to go and eat.####If=O you=O are=O the=O type=O of=O person=O who=O likes=O being=O scared=O and=O entertained=O ,=O this=O is=O a=O great=O place=T-POS to=O go=O and=O eat=O .=O 26 | Its located in greenewich village.####Its=O located=O in=O greenewich=O village=O .=O 27 | I loved it and would HIGHLY RECOMMEND.####I=O loved=O it=O and=O would=O HIGHLY=O RECOMMEND=O .=O 28 | I am not the most experienced person when it comes to Thai food, but my friend who took me there is.####I=O am=O not=O the=O most=O experienced=O person=O when=O it=O comes=O to=O Thai=O food=O ,=O but=O my=O friend=O who=O took=O me=O there=O is=O .=O 29 | We had Pam's special fried fish and it was amazing.####We=O had=O Pam's=T-POS special=T-POS fried=T-POS fish=T-POS and=O it=O was=O amazing=O .=O 30 | Great vibe, lots of people.####Great=O vibe=T-POS ,=O lots=O of=O people=O .=O 31 | Salads were fantastic.####Salads=T-POS were=O fantastic=O .=O 32 | This place is always very crowded and popular.####This=O place=T-POS is=O always=O very=O crowded=O and=O popular=O .=O 33 | We concluded with tiramisu chocolate cake, both were delicious.####We=O concluded=O with=O tiramisu=T-POS chocolate=T-POS cake=T-POS ,=O both=O were=O delicious=O .=O 34 | sometimes i get bad food and bad service, sometimes i get good good and bad service.####sometimes=O i=O get=O bad=O food=T-NEG and=O bad=O service=T-NEG ,=O sometimes=O i=O get=O good=T-POS good=T-POS and=O bad=O service=T-NEG .=O 35 | I can't wait to go back.####I=O ca=O n't=O wait=O to=O go=O back=O .=O 36 | They tell me they are going to cover the garden in glass for the winter, so i'm looking forward to going there on a snowy night to enjoy it.####They=O tell=O me=O they=O are=O going=O to=O cover=O the=O garden=O in=O glass=O for=O the=O winter=O ,=O so=O i=O 'm=O looking=O forward=O to=O going=O there=O on=O a=O snowy=O night=O to=O enjoy=O it=O .=O 37 | To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the 
other deficiencies of Teodora.####To=O be=O completely=O fair=O ,=O the=O only=O redeeming=O factor=O was=O the=O food=T-POS ,=O which=O was=O above=O average=O ,=O but=O could=O n't=O make=O up=O for=O all=O the=O other=O deficiencies=O of=O Teodora=T-NEG .=O 38 | The food however, is what one might expect.####The=O food=T-NEG however=O ,=O is=O what=O one=O might=O expect=O .=O 39 | Food was good not great not worth the wait or another visit####Food=T-NEU was=O good=O not=O great=O not=O worth=O the=O wait=O or=O another=O visit=O 40 | Growing up in NY, I have eaten my share of bagels.####Growing=O up=O in=O NY=O ,=O I=O have=O eaten=O my=O share=O of=O bagels=O .=O 41 | The lox is always fresh too.####The=O lox=T-POS is=O always=O fresh=O too=O .=O 42 | The prices were CHEAP compared to the quality of service and food.####The=O prices=O were=O CHEAP=O compared=O to=O the=O quality=O of=O service=T-POS and=O food=T-POS .=O 43 | I was there on sat. for my birthday and we had an excellent time.####I=O was=O there=O on=O sat=O for=O my=O birthday=O and=O we=O had=O an=O excellent=O time=O .=O 44 | The wine the service was very good too.####The=O wine=T-POS the=O service=T-POS was=O very=O good=O too=O .=O 45 | If you go, try the marinara/arrabiatta sauce, the mozzarella en Carozza is mmmmmmmm..... everything is just delicious.####If=O you=O go=O ,=O try=O the=O marinara/arrabiatta=T-POS sauce=T-POS ,=O the=O mozzarella=T-POS en=T-POS Carozza=T-POS is=O mmmmmmmm=O everything=O is=O just=O delicious=O .=O 46 | Old school meets New world.####Old=O school=O meets=O New=O world=O .=O 47 | I go twice a month!####I=O go=O twice=O a=O month=O !=O 48 | The hostess and the waitress were incredibly rude and did everything they could to rush us out.####The=O hostess=T-NEG and=O the=O waitress=T-NEG were=O incredibly=O rude=O and=O did=O everything=O they=O could=O to=O rush=O us=O out=O .=O 49 | The two star chefs left quite some time ago to open their own place.####The=O two=O star=O chefs=O left=O quite=O some=O time=O ago=O to=O open=O their=O own=O place=O .=O 50 | Don't dine at Tamarind for the vegetarian dishes, they are simply not up to par with the non-veg selections.####Do=O n't=O dine=O at=O Tamarind=O for=O the=O vegetarian=T-NEG dishes=T-NEG ,=O they=O are=O simply=O not=O up=O to=O par=O with=O the=O non-veg=T-POS selections=T-POS .=O 51 | They wouldnt even let me finish my glass of wine before offering another.####They=O would=O n't=O even=O let=O me=O finish=O my=O glass=O of=O wine=O before=O offering=O another=O .=O 52 | Try the Pad Thai, it's fabulous and their prices are so cheap!####Try=O the=O Pad=T-POS Thai=T-POS ,=O it=O 's=O fabulous=O and=O their=O prices=O are=O so=O cheap=O !=O 53 | This is a nice restaurant if you are looking for a good place to host an intimate dinner meeting with business associates.####This=O is=O a=O nice=O restaurant=T-POS if=O you=O are=O looking=O for=O a=O good=O place=O to=O host=O an=O intimate=O dinner=O meeting=O with=O business=O associates=O .=O 54 | The menu is limited but almost all of the dishes are excellent.####The=O menu=T-NEG is=O limited=O but=O almost=O all=O of=O the=O dishes=T-POS are=O excellent=O .=O 55 | The food was delicious (I had a halibut special, my husband had steak), and the service was top-notch.####The=O food=T-POS was=O delicious=O I=O had=O a=O halibut=T-POS special=T-POS ,=O my=O husband=O had=O steak=T-POS ,=O and=O the=O service=T-POS was=O top-notch=O .=O 56 | I highly recommend the restaurant based on our 
experience last night.####I=O highly=O recommend=O the=O restaurant=T-POS based=O on=O our=O experience=O last=O night=O .=O 57 | please don't fool us.####please=O do=O n't=O fool=O us=O .=O 58 | Never again!####Never=O again=O !=O 59 | My boyfriend and I went there to celebrate my birthday the other night and all I can say is that it was magnificent.####My=O boyfriend=O and=O I=O went=O there=O to=O celebrate=O my=O birthday=O the=O other=O night=O and=O all=O I=O can=O say=O is=O that=O it=O was=O magnificent=O .=O 60 | This place is really trendi but they have forgotten about the most important part of a restaurant, the food.####This=O place=T-POS is=O really=O trendi=O but=O they=O have=O forgotten=O about=O the=O most=O important=O part=O of=O a=O restaurant=O ,=O the=O food=T-NEG .=O 61 | And the Tom Kha soup was pathetic.####And=O the=O Tom=T-NEG Kha=T-NEG soup=T-NEG was=O pathetic=O .=O 62 | it helps if you know what to order.####it=O helps=O if=O you=O know=O what=O to=O order=O .=O 63 | Great food, good size menu, great service and an unpretensious setting.####Great=O food=T-POS ,=O good=O size=O menu=T-POS ,=O great=O service=T-POS and=O an=O unpretensious=O setting=T-POS .=O 64 | We are very particular about sushi and were both please with every choice which included: ceviche mix (special), crab dumplings, assorted sashimi, sushi and rolls, two types of sake, and the banana tempura.####We=O are=O very=O particular=O about=O sushi=T-POS and=O were=O both=O please=O with=O every=O choice=O which=O included=O ,=O ceviche=T-POS mix=T-POS (special)=T-POS ,=O crab=T-POS dumplings=T-POS ,=O assorted=T-POS sashimi=T-POS ,=O sushi=T-POS and=O rolls=T-POS ,=O two=T-POS types=T-POS of=T-POS sake=T-POS ,=O and=O the=O banana=T-POS tempura=T-POS .=O 65 | This is a wonderful place on all stand points especially value ofr money.####This=O is=O a=O wonderful=O place=T-POS on=O all=O stand=O points=O especially=O value=O ofr=O money=O .=O 66 | Went to Cafe Spice with 4 of my friends on a saturday night.####Went=O to=O Cafe=O Spice=O with=O 4=O of=O my=O friends=O on=O a=O saturday=O night=O .=O 67 | We were greeted promptly by the waiter who was very nice and cordial.####We=O were=O greeted=O promptly=O by=O the=O waiter=T-POS who=O was=O very=O nice=O and=O cordial=O .=O 68 | The crust is thin, the ingredients are fresh and the staff is friendly.####The=O crust=T-POS is=O thin=O ,=O the=O ingredients=T-POS are=O fresh=O and=O the=O staff=T-POS is=O friendly=O .=O 69 | I ordered the smoked salmon and roe appetizer and it was off flavor.####I=O ordered=O the=O smoked=T-NEG salmon=T-NEG and=T-NEG roe=T-NEG appetizer=T-NEG and=O it=O was=O off=O flavor=O .=O 70 | Delicious crab cakes too.####Delicious=O crab=T-POS cakes=T-POS too=O .=O 71 | Seriously, this place kicks ass.####Seriously=O ,=O this=O place=T-POS kicks=O ass=O .=O 72 | Good spreads, great beverage selections and bagels really tasty.####Good=O spreads=T-POS ,=O great=O beverage=T-POS selections=T-POS and=O bagels=T-POS really=O tasty=O .=O 73 | Love Pizza 33..####Love=O Pizza=T-POS 33=T-POS .=O 74 | It hits the spot every time####It=O hits=O the=O spot=O every=O time=O 75 | A little pricey but it really hits the spot on a Sunday morning!####A=O little=O pricey=O but=O it=O really=O hits=O the=O spot=O on=O a=O Sunday=O morning=O !=O 76 | Be sure not to get anything other than bagels!..####Be=O sure=O not=O to=O get=O anything=O other=O than=O bagels=T-POS !=O .=O 77 | Jimmy is Dominican!####Jimmy=O is=O Dominican=O !=O 78 | Well, this 
place is so Ghetto its not even funny.####Well=O ,=O this=O place=T-NEG is=O so=O Ghetto=O its=O not=O even=O funny=O .=O 79 | Awsome Pizza especially the Margheritta slice.####Awsome=O Pizza=T-POS especially=O the=O Margheritta=T-POS slice=T-POS .=O 80 | What more can you ask for?####What=O more=O can=O you=O ask=O for=O ?=O 81 | For authentic Thai food, look no further than Toons.####For=O authentic=O Thai=T-POS food=T-POS ,=O look=O no=O further=O than=O Toons=O .=O 82 | The place was quiet and delightful.####The=O place=T-POS was=O quiet=O and=O delightful=O .=O 83 | As a retired hipster, I can say with some degree of certainty that for the last year Lucky Strike has been the best laid-back late night in the city.####As=O a=O retired=O hipster=O ,=O I=O can=O say=O with=O some=O degree=O of=O certainty=O that=O for=O the=O last=O year=O Lucky=T-POS Strike=T-POS has=O been=O the=O best=O laid-back=O late=O night=O in=O the=O city=O .=O 84 | I will go back to Suan soon!####I=O will=O go=O back=O to=O Suan=T-POS soon=O !=O 85 | I cannot imagine you not rushing out to eat there.####I=O can=O not=O imagine=O you=O not=O rushing=O out=O to=O eat=O there=O .=O 86 | Do not get the Go Go Hamburgers, no matter what the reviews say.####Do=O not=O get=O the=O Go=T-NEG Go=T-NEG Hamburgers=T-NEG ,=O no=O matter=O what=O the=O reviews=O say=O .=O 87 | Steamed fresh so brought hot hot hot to your table.####Steamed=O fresh=O so=O brought=O hot=O hot=O hot=O to=O your=O table=O .=O 88 | (2) egg custards and pork buns at either bakery on west side of Mott street just south of Canal.####2=O egg=O custards=O and=O pork=O buns=O at=O either=O bakery=O on=O west=O side=O of=O Mott=O street=O just=O south=O of=O Canal=O .=O 89 | this little place has a cute interior decor and affordable city prices.####this=O little=O place=T-POS has=O a=O cute=O interior=T-POS decor=T-POS and=O affordable=O city=O prices=O .=O 90 | i would just ask for no oil next time.####i=O would=O just=O ask=O for=O no=O oil=O next=O time=O .=O 91 | The only thing you can do here is walk in and eat .. 
but planning an event, especially a small, intimate one, forget about it.####The=O only=O thing=O you=O can=O do=O here=O is=O walk=O in=O and=O eat=O but=O planning=O an=O event=O ,=O especially=O a=O small=O ,=O intimate=O one=O ,=O forget=O about=O it=O .=O 92 | But that is highly forgivable.####But=O that=O is=O highly=O forgivable=O .=O 93 | Been there, done that, and New York, it's not that big a deal.####Been=O there=O ,=O done=O that=O ,=O and=O New=O York=O ,=O it=O 's=O not=O that=O big=O a=O deal=O .=O 94 | Great food, great prices, great service.####Great=O food=T-POS ,=O great=O prices=O ,=O great=O service=T-POS .=O 95 | Their bagels are fine, but they are a little overcooked, and not really a 'special' bagel experience.####Their=O bagels=T-NEG are=O fine=O ,=O but=O they=O are=O a=O little=O overcooked=O ,=O and=O not=O really=O a=O special=O bagel=O experience=O .=O 96 | Downtown Dinner 2002 - Prixe fix: Appetizers were ok, waiter gave me poor suggestion..try the potato stuff kanish best one.####Downtown=O Dinner=O 2002=O ,=O Prixe=O fix=O ,=O Appetizers=T-NEU were=O ok=O ,=O waiter=T-NEG gave=O me=O poor=O suggestiontry=O the=O potato=T-POS stuff=T-POS kanish=T-POS best=O one=O .=O 97 | Still, any quibbles about the bill were off-set by the pour-your-own measures of liquers which were courtesey of the house...####Still=O ,=O any=O quibbles=O about=O the=O bill=O were=O off-set=O by=O the=O pour-your-own=O measures=T-POS of=T-POS liquers=T-POS which=O were=O courtesey=O of=O the=O house=O .=O 98 | The only thing more wonderful than the food (which is exceptional) is the service.####The=O only=O thing=O more=O wonderful=O than=O the=O food=T-POS which=O is=O exceptional=O is=O the=O service=T-POS .=O 99 | A real dissapointment.####A=O real=O dissapointment=O .=O 100 | But that wasn't the icing on the cake: a tiramisu that resembled nothing I have ever had.####But=O that=O was=O n't=O the=O icing=O on=O the=O cake=O ,=O a=O tiramisu=T-NEG that=O resembled=O nothing=O I=O have=O ever=O had=O .=O 101 | Priced at upper intermediate range.####Priced=O at=O upper=O intermediate=O range=O .=O 102 | This place has the best Chinese style BBQ ribs in the city.####This=O place=O has=O the=O best=O Chinese=O style=O BBQ=T-POS ribs=T-POS in=O the=O city=O .=O 103 | I also recommend the rice dishes or the different varieties of congee (rice porridge).####I=O also=O recommend=O the=O rice=T-POS dishes=T-POS or=O the=O different=O varieties=O of=O congee=T-POS (rice=T-POS porridge)=T-POS .=O 104 | Quick and friendly service.####Quick=O and=O friendly=O service=T-POS .=O 105 | Warm and friendly in the winter and terrific outdoor seating in the warmer months.####Warm=O and=O friendly=O in=O the=O winter=O and=O terrific=O outdoor=T-POS seating=T-POS in=O the=O warmer=O months=O .=O 106 | Probably would not go again...####Probably=O would=O not=O go=O again=O .=O 107 | A classic!####A=O classic=O !=O 108 | It was the first place we ate on our first trip to New York, and it will be the last place we stop as we head out of town on our next trip to New York.####It=O was=O the=O first=O place=T-POS we=O ate=O on=O our=O first=O trip=O to=O New=O York=O ,=O and=O it=O will=O be=O the=O last=O place=T-POS we=O stop=O as=O we=O head=O out=O of=O town=O on=O our=O next=O trip=O to=O New=O York=O .=O 109 | Thanks Bloom's for a lovely trip.####Thanks=O Bloom's=T-POS for=O a=O lovely=O trip=O .=O 110 | bottles of wine are cheap and good.####bottles=T-POS of=T-POS wine=T-POS are=O cheap=O and=O good=O .=O 
111 | The mussles were the fishiest things I've ever tasted, the seabass was bland, the goat cheese salad was missing the goat cheese, the penne w/ chicken had bones in it... It was disgusting.####The=O mussles=T-NEG were=O the=O fishiest=O things=O I=O 've=O ever=O tasted=O ,=O the=O seabass=T-NEG was=O bland=O ,=O the=O goat=T-NEG cheese=T-NEG salad=T-NEG was=O missing=O the=O goat=O cheese=O ,=O the=O penne=T-NEG w/=T-NEG chicken=T-NEG had=O bones=O in=O it=O It=O was=O disgusting=O .=O 112 | The food is amazing, rich pastas and fresh doughy pizza.####The=O food=T-POS is=O amazing=O ,=O rich=O pastas=T-POS and=O fresh=O doughy=O pizza=T-POS .=O 113 | Among all of the new 5th avenue restaurants, this offers by far one of the best values for your money.####Among=O all=O of=O the=O new=O 5th=O avenue=O restaurants=O ,=O this=O offers=O by=O far=O one=O of=O the=O best=O values=O for=O your=O money=O .=O 114 | Good luck getting a table.####Good=O luck=O getting=O a=O table=O .=O 115 | We recently decided to try this location, and to our delight, they have outdoor seating, perfect since I had my yorkie with me.####We=O recently=O decided=O to=O try=O this=O location=O ,=O and=O to=O our=O delight=O ,=O they=O have=O outdoor=T-POS seating=T-POS ,=O perfect=O since=O I=O had=O my=O yorkie=O with=O me=O .=O 116 | But $1 for each small piece???####But=O $=O 1=O for=O each=O small=O piece=O ?=O ?=O ?=O 117 | Not worth it.####Not=O worth=O it=O .=O 118 | Great Indian food and the service is incredible.####Great=O Indian=T-POS food=T-POS and=O the=O service=T-POS is=O incredible=O .=O 119 | A great place to meet up for some food and drinks... ####A=O great=O place=T-POS to=O meet=O up=O for=O some=O food=O and=O drinks=O .=O 120 | It's also attached to Angel's Share, which is a cool, more romantic bar...####It=O 's=O also=O attached=O to=O Angel=O 's=O Share=O ,=O which=O is=O a=O cool=O ,=O more=O romantic=O bar=O .=O 121 | Wasn't going to share but I feel obligated...while sitting at the sushi bar dining we watched the chef accidentally drop a piece of Unagi on the floor and upon retrieving it from the floor proceed to use the piece in the delivery order he was preparing.####Was=O n't=O going=O to=O share=O but=O I=O feel=O obligatedwhile=O sitting=O at=O the=O sushi=O bar=O dining=O we=O watched=O the=O chef=T-NEG accidentally=O drop=O a=O piece=O of=O Unagi=O on=O the=O floor=O and=O upon=O retrieving=O it=O from=O the=O floor=O proceed=O to=O use=O the=O piece=O in=O the=O delivery=O order=O he=O was=O preparing=O .=O 122 | We left, never to return.####We=O left=O ,=O never=O to=O return=O .=O 123 | In fact, it appears he is going to go postal at any moment.####In=O fact=O ,=O it=O appears=O he=O is=O going=O to=O go=O postal=O at=O any=O moment=O .=O 124 | $20 for all you can eat sushi cannot be beaten.####$=O 20=O for=O all=T-POS you=T-POS can=T-POS eat=T-POS sushi=T-POS can=O not=O be=O beaten=O .=O 125 | I went to Areo on a Sunday afternoon with four of my girlfriends, and spent three enjoyable hours there.####I=O went=O to=O Areo=T-POS on=O a=O Sunday=O afternoon=O with=O four=O of=O my=O girlfriends=O ,=O and=O spent=O three=O enjoyable=O hours=O there=O .=O 126 | I would highly recommand requesting a table by the window.####I=O would=O highly=O recommand=O requesting=O a=O table=T-POS by=T-POS the=T-POS window=T-POS .=O 127 | Love the scene first off- the place has a character and nice light to it..very fortunate, location wise.####Love=O the=O scene=T-POS first=O off=O the=O 
place=T-POS has=O a=O character=O and=O nice=O light=O to=O itvery=O fortunate=O ,=O location=T-POS wise=O .=O
128 | I plan on stopping by next week as well.####I=O plan=O on=O stopping=O by=O next=O week=O as=O well=O .=O
129 | Keep up the good work guys!####Keep=O up=O the=O good=O work=O guys=O !=O
130 | We could have made a meal of the yummy dumplings from the dumpling menu.####We=O could=O have=O made=O a=O meal=O of=O the=O yummy=O dumplings=T-POS from=O the=O dumpling=O menu=O .=O
131 | 
-------------------------------------------------------------------------------- /glue_utils.py: --------------------------------------------------------------------------------
1 | # coding=utf-8
2 | # Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.
3 | # Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
4 | #
5 | # Licensed under the Apache License, Version 2.0 (the "License");
6 | # you may not use this file except in compliance with the License.
7 | # You may obtain a copy of the License at
8 | #
9 | # http://www.apache.org/licenses/LICENSE-2.0
10 | #
11 | # Unless required by applicable law or agreed to in writing, software
12 | # distributed under the License is distributed on an "AS IS" BASIS,
13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | """ BERT classification fine-tuning: utilities to work with GLUE tasks """
17 | 
18 | from __future__ import absolute_import, division, print_function
19 | 
20 | import csv
21 | import logging
22 | import os
23 | import sys
24 | from io import open
25 | 
26 | from seq_utils import *
27 | 
28 | logger = logging.getLogger(__name__)
29 | 
30 | SMALL_POSITIVE_CONST = 1e-4
31 | 
32 | class InputExample(object):
33 | """A single training/test example for simple sequence classification."""
34 | 
35 | def __init__(self, guid, text_a, text_b=None, label=None):
36 | """Constructs an InputExample.
37 | 
38 | Args:
39 | guid: Unique id for the example.
40 | text_a: string. The untokenized text of the first sequence. For single
41 | sequence tasks, only this sequence must be specified.
42 | text_b: (Optional) string. The untokenized text of the second sequence.
43 | Only needs to be specified for sequence pair tasks.
44 | label: (Optional) string. The label of the example. This should be
45 | specified for train and dev examples, but not for test examples.
46 | """ 47 | self.guid = guid 48 | self.text_a = text_a 49 | self.text_b = text_b 50 | self.label = label 51 | 52 | 53 | class InputFeatures(object): 54 | """A single set of features of data.""" 55 | 56 | def __init__(self, input_ids, input_mask, segment_ids, label_id): 57 | self.input_ids = input_ids 58 | self.input_mask = input_mask 59 | self.segment_ids = segment_ids 60 | self.label_id = label_id 61 | 62 | 63 | class SeqInputFeatures(object): 64 | """A single set of features of data for the ABSA task""" 65 | def __init__(self, input_ids, input_mask, segment_ids, label_ids, evaluate_label_ids): 66 | self.input_ids = input_ids 67 | self.input_mask = input_mask 68 | self.segment_ids = segment_ids 69 | self.label_ids = label_ids 70 | # mapping between word index and head token index 71 | self.evaluate_label_ids = evaluate_label_ids 72 | 73 | 74 | class DataProcessor(object): 75 | """Base class for data converters for sequence classification data sets.""" 76 | 77 | def get_train_examples(self, data_dir): 78 | """Gets a collection of `InputExample`s for the train set.""" 79 | raise NotImplementedError() 80 | 81 | def get_dev_examples(self, data_dir): 82 | """Gets a collection of `InputExample`s for the dev set.""" 83 | raise NotImplementedError() 84 | 85 | def get_test_examples(self, data_dir): 86 | """Gets a collection of `InputExample`s for the test set.""" 87 | raise NotImplementedError() 88 | 89 | def get_labels(self): 90 | """Gets the list of labels for this data set.""" 91 | raise NotImplementedError() 92 | 93 | @classmethod 94 | def _read_tsv(cls, input_file, quotechar=None): 95 | """Reads a tab separated value file.""" 96 | with open(input_file, "r", encoding="utf-8-sig") as f: 97 | reader = csv.reader(f, delimiter="\t", quotechar=quotechar) 98 | lines = [] 99 | for line in reader: 100 | if sys.version_info[0] == 2: 101 | line = list(cell for cell in line) 102 | lines.append(line) 103 | return lines 104 | 105 | 106 | class ABSAProcessor(DataProcessor): 107 | """Processor for the ABSA datasets""" 108 | def get_train_examples(self, data_dir, tagging_schema): 109 | return self._create_examples(data_dir=data_dir, set_type='train', tagging_schema=tagging_schema) 110 | 111 | def get_dev_examples(self, data_dir, tagging_schema): 112 | return self._create_examples(data_dir=data_dir, set_type='dev', tagging_schema=tagging_schema) 113 | 114 | def get_test_examples(self, data_dir, tagging_schema): 115 | return self._create_examples(data_dir=data_dir, set_type='test', tagging_schema=tagging_schema) 116 | 117 | def get_labels(self, tagging_schema): 118 | if tagging_schema == 'OT': 119 | return [] 120 | elif tagging_schema == 'BIO': 121 | return ['O', 'EQ', 'B-POS', 'I-POS', 'B-NEG', 'I-NEG', 'B-NEU', 'I-NEU'] 122 | elif tagging_schema == 'BIEOS': 123 | return ['O', 'EQ', 'B-POS', 'I-POS', 'E-POS', 'S-POS', 124 | 'B-NEG', 'I-NEG', 'E-NEG', 'S-NEG', 125 | 'B-NEU', 'I-NEU', 'E-NEU', 'S-NEU'] 126 | else: 127 | raise Exception("Invalid tagging schema %s..." % tagging_schema) 128 | 129 | def _create_examples(self, data_dir, set_type, tagging_schema): 130 | examples = [] 131 | file = os.path.join(data_dir, "%s.txt" % set_type) 132 | class_count = np.zeros(3) 133 | with open(file, 'r', encoding='UTF-8') as fp: 134 | sample_id = 0 135 | for line in fp: 136 | sent_string, tag_string = line.strip().split('####') 137 | words = [] 138 | tags = [] 139 | for tag_item in tag_string.split(' '): 140 | eles = tag_item.split('=') 141 | if len(eles) == 1: 142 | raise Exception("Invalid samples %s..." 
143 | elif len(eles) == 2:
144 | word, tag = eles
145 | else:
146 | word = ''.join((len(eles) - 2) * ['='])
147 | tag = eles[-1]
148 | words.append(word)
149 | tags.append(tag)
150 | # convert the tags from the OT schema to the target tagging schema
151 | if tagging_schema == 'BIEOS':
152 | tags = ot2bieos_ts(tags)
153 | elif tagging_schema == 'BIO':
154 | tags = ot2bio_ts(tags)
155 | else:
156 | # original tags follow the OT tagging schema, do nothing
157 | pass
158 | guid = "%s-%s" % (set_type, sample_id)
159 | text_a = ' '.join(words)
160 | #label = [absa_label_vocab[tag] for tag in tags]
161 | gold_ts = tag2ts(ts_tag_sequence=tags)
162 | for (b, e, s) in gold_ts:
163 | if s == 'POS':
164 | class_count[0] += 1
165 | if s == 'NEG':
166 | class_count[1] += 1
167 | if s == 'NEU':
168 | class_count[2] += 1
169 | examples.append(InputExample(guid=guid, text_a=text_a, text_b=None, label=tags))
170 | sample_id += 1
171 | print("%s class count: %s" % (set_type, class_count))
172 | return examples
173 | 
174 | 
175 | def _truncate_seq_pair(tokens_a, tokens_b, max_length):
176 | """Truncates a sequence pair in place to the maximum length."""
177 | 
178 | # This is a simple heuristic which will always truncate the longer sequence
179 | # one token at a time. This makes more sense than truncating an equal percent
180 | # of tokens from each, since if one sequence is very short then each token
181 | # that's truncated likely contains more information than a longer sequence.
182 | while True:
183 | total_length = len(tokens_a) + len(tokens_b)
184 | if total_length <= max_length:
185 | break
186 | if len(tokens_a) > len(tokens_b):
187 | tokens_a.pop()
188 | else:
189 | tokens_b.pop()
190 | 
191 | 
192 | def convert_examples_to_seq_features(examples, label_list, tokenizer,
193 | cls_token_at_end=False, pad_on_left=False, cls_token='[CLS]',
194 | sep_token='[SEP]', pad_token=0, sequence_a_segment_id=0,
195 | sequence_b_segment_id=1, cls_token_segment_id=1, pad_token_segment_id=0,
196 | mask_padding_with_zero=True):
197 | # feature extraction for sequence labeling
198 | label_map = {label: i for i, label in enumerate(label_list)}
199 | features = []
200 | max_seq_length = -1
201 | examples_tokenized = []
202 | for (ex_index, example) in enumerate(examples):
203 | tokens_a = []
204 | labels_a = []
205 | evaluate_label_ids = []
206 | words = example.text_a.split(' ')
207 | wid, tid = 0, 0
208 | for word, label in zip(words, example.label):
209 | subwords = tokenizer.tokenize(word)
210 | tokens_a.extend(subwords)
211 | if label != 'O':
212 | labels_a.extend([label] + ['EQ'] * (len(subwords) - 1))
213 | else:
214 | labels_a.extend(['O'] * len(subwords))
215 | evaluate_label_ids.append(tid)
216 | wid += 1
217 | # move the token pointer
218 | tid += len(subwords)
219 | #print(evaluate_label_ids)
220 | assert tid == len(tokens_a)
221 | evaluate_label_ids = np.array(evaluate_label_ids, dtype=np.int32)
222 | examples_tokenized.append((example.guid, tokens_a, labels_a, evaluate_label_ids))
223 | if len(tokens_a) > max_seq_length:
224 | max_seq_length = len(tokens_a)
225 | # account for the [CLS] and [SEP] tokens
226 | max_seq_length += 2
227 | #max_seq_length = 128
228 | for ex_index, (guid, tokens_a, labels_a, evaluate_label_ids) in enumerate(examples_tokenized):
229 | #tokens_a = tokenizer.tokenize(example.text_a)
230 | 
231 | # Account for [CLS] and [SEP] with "- 2"
232 | # for sequence labeling, it is better not to truncate the sequence
233 | #if len(tokens_a) > max_seq_length - 2:
234 | # tokens_a = tokens_a[:(max_seq_length - 2)]
235 | # labels_a = labels_a
236 | tokens = tokens_a + [sep_token]
237 | segment_ids = [sequence_a_segment_id] * len(tokens)
238 | labels = labels_a + ['O']
239 | if cls_token_at_end:
240 | # the evaluate label ids do not change
241 | tokens = tokens + [cls_token]
242 | segment_ids = segment_ids + [cls_token_segment_id]
243 | labels = labels + ['O']
244 | else:
245 | # right-shift the evaluate label ids by 1
246 | tokens = [cls_token] + tokens
247 | segment_ids = [cls_token_segment_id] + segment_ids
248 | labels = ['O'] + labels
249 | evaluate_label_ids += 1
250 | input_ids = tokenizer.convert_tokens_to_ids(tokens)
251 | input_mask = [1 if mask_padding_with_zero else 0] * len(input_ids)
252 | # Zero-pad up to the sequence length.
253 | padding_length = max_seq_length - len(input_ids)
254 | #print("Current labels:", labels)
255 | label_ids = [label_map[label] for label in labels]
256 | 
257 | # pad the input sequence and the mask sequence
258 | if pad_on_left:
259 | input_ids = ([pad_token] * padding_length) + input_ids
260 | input_mask = ([0 if mask_padding_with_zero else 1] * padding_length) + input_mask
261 | segment_ids = ([pad_token_segment_id] * padding_length) + segment_ids
262 | # pad sequence tag 'O'
263 | label_ids = ([0] * padding_length) + label_ids
264 | # right-shift evaluate_label_ids by padding_length
265 | evaluate_label_ids += padding_length
266 | else:
267 | # the evaluate label ids do not change
268 | input_ids = input_ids + ([pad_token] * padding_length)
269 | input_mask = input_mask + ([0 if mask_padding_with_zero else 1] * padding_length)
270 | segment_ids = segment_ids + ([pad_token_segment_id] * padding_length)
271 | # pad sequence tag 'O'
272 | label_ids = label_ids + ([0] * padding_length)
273 | assert len(input_ids) == max_seq_length
274 | assert len(input_mask) == max_seq_length
275 | assert len(segment_ids) == max_seq_length
276 | assert len(label_ids) == max_seq_length
277 | 
278 | if ex_index < 5:
279 | logger.info("*** Example ***")
280 | logger.info("guid: %s" % guid)
281 | logger.info("tokens: %s" % " ".join(
282 | [str(x) for x in tokens]))
283 | logger.info("input_ids: %s" % " ".join([str(x) for x in input_ids]))
284 | logger.info("input_mask: %s" % " ".join([str(x) for x in input_mask]))
285 | logger.info("segment_ids: %s" % " ".join([str(x) for x in segment_ids]))
286 | logger.info("labels: %s " % ' '.join([str(x) for x in label_ids]))
287 | logger.info("evaluate label ids: %s" % evaluate_label_ids)
288 | 
289 | features.append(
290 | SeqInputFeatures(input_ids=input_ids,
291 | input_mask=input_mask,
292 | segment_ids=segment_ids,
293 | label_ids=label_ids,
294 | evaluate_label_ids=evaluate_label_ids))
295 | print("maximal sequence length is", max_seq_length)
296 | return features
297 | 
298 | 
299 | def convert_examples_to_features(examples, label_list, max_seq_length,
300 | tokenizer, output_mode,
301 | cls_token_at_end=False, pad_on_left=False,
302 | cls_token='[CLS]', sep_token='[SEP]', pad_token=0,
303 | sequence_a_segment_id=0, sequence_b_segment_id=1,
304 | cls_token_segment_id=1, pad_token_segment_id=0,
305 | mask_padding_with_zero=True):
306 | """ Loads a data file into a list of `InputBatch`s
307 | `cls_token_at_end` defines the location of the CLS token:
308 | - False (Default, BERT/XLM pattern): [CLS] + A + [SEP] + B + [SEP]
309 | - True (XLNet/GPT pattern): A + [SEP] + B + [SEP] + [CLS]
310 | `cls_token_segment_id` defines the segment id associated to the CLS token (0 for BERT, 2 for XLNet)
311 | """
312 | 
313 | label_map = {label : i for i, label in enumerate(label_list)}
314 | 
315 | features = []
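# The loop below performs the standard GLUE-style conversion: tokenize text_a
# (and text_b when present), truncate to max_seq_length, add the [CLS]/[SEP]
# special tokens with their segment ids, map tokens to vocabulary ids, and
# zero-pad every field to a fixed length.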
316 | for (ex_index, example) in enumerate(examples):
317 | if ex_index % 10000 == 0:
318 | logger.info("Writing example %d of %d" % (ex_index, len(examples)))
319 | 
320 | tokens_a = tokenizer.tokenize(example.text_a)
321 | 
322 | tokens_b = None
323 | if example.text_b:
324 | tokens_b = tokenizer.tokenize(example.text_b)
325 | # Modifies `tokens_a` and `tokens_b` in place so that the total
326 | # length is less than the specified length.
327 | # Account for [CLS], [SEP], [SEP] with "- 3"
328 | _truncate_seq_pair(tokens_a, tokens_b, max_seq_length - 3)
329 | else:
330 | # Account for [CLS] and [SEP] with "- 2"
331 | if len(tokens_a) > max_seq_length - 2:
332 | tokens_a = tokens_a[:(max_seq_length - 2)]
333 | 
334 | # The convention in BERT is:
335 | # (a) For sequence pairs:
336 | # tokens: [CLS] is this jack ##son ##ville ? [SEP] no it is not . [SEP]
337 | # type_ids: 0 0 0 0 0 0 0 0 1 1 1 1 1 1
338 | # (b) For single sequences:
339 | # tokens: [CLS] the dog is hairy . [SEP]
340 | # type_ids: 0 0 0 0 0 0 0
341 | #
342 | # Where "type_ids" are used to indicate whether this is the first
343 | # sequence or the second sequence. The embedding vectors for `type=0` and
344 | # `type=1` were learned during pre-training and are added to the wordpiece
345 | # embedding vector (and position vector). This is not *strictly* necessary
346 | # since the [SEP] token unambiguously separates the sequences, but it makes
347 | # it easier for the model to learn the concept of sequences.
348 | #
349 | # For classification tasks, the first vector (corresponding to [CLS]) is
350 | # used as the "sentence vector". Note that this only makes sense because
351 | # the entire model is fine-tuned.
352 | tokens = tokens_a + [sep_token]
353 | segment_ids = [sequence_a_segment_id] * len(tokens)
354 | 
355 | if tokens_b:
356 | tokens += tokens_b + [sep_token]
357 | segment_ids += [sequence_b_segment_id] * (len(tokens_b) + 1)
358 | 
359 | if cls_token_at_end:
360 | tokens = tokens + [cls_token]
361 | segment_ids = segment_ids + [cls_token_segment_id]
362 | else:
363 | tokens = [cls_token] + tokens
364 | segment_ids = [cls_token_segment_id] + segment_ids
365 | 
366 | input_ids = tokenizer.convert_tokens_to_ids(tokens)
367 | 
368 | # The mask has 1 for real tokens and 0 for padding tokens. Only real
369 | # tokens are attended to.
370 | input_mask = [1 if mask_padding_with_zero else 0] * len(input_ids)
371 | 
372 | # Zero-pad up to the sequence length.
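# A small illustration of the padding below (a hedged sketch, right-side
# padding): with max_seq_length = 8 and tokens = [CLS] good food [SEP],
# bert-base-uncased maps [CLS] -> 101 and [SEP] -> 102 and pads with 0, so
#   input_ids  -> [101, id_good, id_food, 102, 0, 0, 0, 0]
#   input_mask -> [1, 1, 1, 1, 0, 0, 0, 0]
# where id_good/id_food stand in for the real vocabulary ids.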
373 | padding_length = max_seq_length - len(input_ids) 374 | if pad_on_left: 375 | input_ids = ([pad_token] * padding_length) + input_ids 376 | input_mask = ([0 if mask_padding_with_zero else 1] * padding_length) + input_mask 377 | segment_ids = ([pad_token_segment_id] * padding_length) + segment_ids 378 | else: 379 | input_ids = input_ids + ([pad_token] * padding_length) 380 | input_mask = input_mask + ([0 if mask_padding_with_zero else 1] * padding_length) 381 | segment_ids = segment_ids + ([pad_token_segment_id] * padding_length) 382 | 383 | assert len(input_ids) == max_seq_length 384 | assert len(input_mask) == max_seq_length 385 | assert len(segment_ids) == max_seq_length 386 | 387 | if output_mode == "classification": 388 | label_id = label_map[example.label] 389 | elif output_mode == "regression": 390 | label_id = float(example.label) 391 | else: 392 | raise KeyError(output_mode) 393 | 394 | if ex_index < 5: 395 | logger.info("*** Example ***") 396 | logger.info("guid: %s" % (example.guid)) 397 | logger.info("tokens: %s" % " ".join( 398 | [str(x) for x in tokens])) 399 | logger.info("input_ids: %s" % " ".join([str(x) for x in input_ids])) 400 | logger.info("input_mask: %s" % " ".join([str(x) for x in input_mask])) 401 | logger.info("segment_ids: %s" % " ".join([str(x) for x in segment_ids])) 402 | logger.info("label: %s (id = %d)" % (example.label, label_id)) 403 | 404 | features.append( 405 | InputFeatures(input_ids=input_ids, 406 | input_mask=input_mask, 407 | segment_ids=segment_ids, 408 | label_id=label_id)) 409 | return features 410 | 411 | 412 | def match_ts(gold_ts_sequence, pred_ts_sequence): 413 | """ 414 | calculate the number of correctly predicted targeted sentiment 415 | :param gold_ts_sequence: gold standard targeted sentiment sequence 416 | :param pred_ts_sequence: predicted targeted sentiment sequence 417 | :return: 418 | """ 419 | # positive, negative and neutral 420 | tag2tagid = {'POS': 0, 'NEG': 1, 'NEU': 2} 421 | hit_count, gold_count, pred_count = np.zeros(3), np.zeros(3), np.zeros(3) 422 | for t in gold_ts_sequence: 423 | #print(t) 424 | ts_tag = t[2] 425 | tid = tag2tagid[ts_tag] 426 | gold_count[tid] += 1 427 | for t in pred_ts_sequence: 428 | ts_tag = t[2] 429 | tid = tag2tagid[ts_tag] 430 | if t in gold_ts_sequence: 431 | hit_count[tid] += 1 432 | pred_count[tid] += 1 433 | return hit_count, gold_count, pred_count 434 | 435 | 436 | def compute_metrics_absa(preds, labels, all_evaluate_label_ids, tagging_schema): 437 | if tagging_schema == 'BIEOS': 438 | absa_label_vocab = {'O': 0, 'EQ': 1, 'B-POS': 2, 'I-POS': 3, 'E-POS': 4, 'S-POS': 5, 439 | 'B-NEG': 6, 'I-NEG': 7, 'E-NEG': 8, 'S-NEG': 9, 440 | 'B-NEU': 10, 'I-NEU': 11, 'E-NEU': 12, 'S-NEU': 13} 441 | elif tagging_schema == 'BIO': 442 | absa_label_vocab = {'O': 0, 'EQ': 1, 'B-POS': 2, 'I-POS': 3, 443 | 'B-NEG': 4, 'I-NEG': 5, 'B-NEU': 6, 'I-NEU': 7} 444 | elif tagging_schema == 'OT': 445 | absa_label_vocab = {'O': 0, 'EQ': 1, 'T-POS': 2, 'T-NEG': 3, 'T-NEU': 4} 446 | else: 447 | raise Exception("Invalid tagging schema %s..." 
448 | absa_id2tag = {}
449 | for k in absa_label_vocab:
450 | v = absa_label_vocab[k]
451 | absa_id2tag[v] = k
452 | # number of true positive, gold standard, predicted targeted sentiment
453 | n_tp_ts, n_gold_ts, n_pred_ts = np.zeros(3), np.zeros(3), np.zeros(3)
454 | # precision, recall and f1 for aspect-based sentiment analysis
455 | ts_precision, ts_recall, ts_f1 = np.zeros(3), np.zeros(3), np.zeros(3)
456 | n_samples = len(all_evaluate_label_ids)
457 | pred_y, gold_y = [], []
458 | class_count = np.zeros(3)
459 | for i in range(n_samples):
460 | evaluate_label_ids = all_evaluate_label_ids[i]
461 | pred_labels = preds[i][evaluate_label_ids]
462 | gold_labels = labels[i][evaluate_label_ids]
463 | assert len(pred_labels) == len(gold_labels)
464 | # here, no EQ tag will be induced
465 | pred_tags = [absa_id2tag[label] for label in pred_labels]
466 | gold_tags = [absa_id2tag[label] for label in gold_labels]
467 | 
468 | if tagging_schema == 'OT':
469 | gold_tags = ot2bieos_ts(gold_tags)
470 | pred_tags = ot2bieos_ts(pred_tags)
471 | elif tagging_schema == 'BIO':
472 | gold_tags = ot2bieos_ts(bio2ot_ts(gold_tags))
473 | pred_tags = ot2bieos_ts(bio2ot_ts(pred_tags))
474 | else:
475 | # current tagging schema is BIEOS, do nothing
476 | pass
477 | g_ts_sequence, p_ts_sequence = tag2ts(ts_tag_sequence=gold_tags), tag2ts(ts_tag_sequence=pred_tags)
478 | 
479 | hit_ts_count, gold_ts_count, pred_ts_count = match_ts(gold_ts_sequence=g_ts_sequence,
480 | pred_ts_sequence=p_ts_sequence)
481 | n_tp_ts += hit_ts_count
482 | n_gold_ts += gold_ts_count
483 | n_pred_ts += pred_ts_count
484 | for (b, e, s) in g_ts_sequence:
485 | if s == 'POS':
486 | class_count[0] += 1
487 | if s == 'NEG':
488 | class_count[1] += 1
489 | if s == 'NEU':
490 | class_count[2] += 1
491 | for i in range(3):
492 | n_ts = n_tp_ts[i]
493 | n_g_ts = n_gold_ts[i]
494 | n_p_ts = n_pred_ts[i]
495 | ts_precision[i] = float(n_ts) / float(n_p_ts + SMALL_POSITIVE_CONST)
496 | ts_recall[i] = float(n_ts) / float(n_g_ts + SMALL_POSITIVE_CONST)
497 | ts_f1[i] = 2 * ts_precision[i] * ts_recall[i] / (ts_precision[i] + ts_recall[i] + SMALL_POSITIVE_CONST)
498 | 
499 | macro_f1 = ts_f1.mean()
500 | 
501 | # calculate micro-average scores for ts task
502 | # TP
503 | n_tp_total = sum(n_tp_ts)
504 | # TP + FN
505 | n_g_total = sum(n_gold_ts)
506 | print("class_count:", class_count)
507 | 
508 | # TP + FP
509 | n_p_total = sum(n_pred_ts)
510 | micro_p = float(n_tp_total) / (n_p_total + SMALL_POSITIVE_CONST)
511 | micro_r = float(n_tp_total) / (n_g_total + SMALL_POSITIVE_CONST)
512 | micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r + SMALL_POSITIVE_CONST)
513 | scores = {'macro-f1': macro_f1, 'precision': micro_p, "recall": micro_r, "micro-f1": micro_f1}
514 | return scores
515 | 
516 | 
517 | processors = {
518 | "laptop14": ABSAProcessor,
519 | "rest_total": ABSAProcessor,
520 | "rest_total_revised": ABSAProcessor,
521 | "rest14": ABSAProcessor,
522 | "rest15": ABSAProcessor,
523 | "rest16": ABSAProcessor,
524 | }
525 | 
526 | output_modes = {
527 | "cola": "classification",
528 | "mnli": "classification",
529 | "mnli-mm": "classification",
530 | "mrpc": "classification",
531 | "sst-2": "classification",
532 | "sts-b": "regression",
533 | "qqp": "classification",
534 | "qnli": "classification",
535 | "rte": "classification",
536 | "wnli": "classification",
537 | "laptop14": "classification",
538 | "rest_total": "classification",
539 | "rest14": "classification",
540 | "rest15": "classification",
541 | "rest16": "classification",
542 | "rest_total_revised": "classification",
| "rest_total_revised": "classification", 543 | } 544 | -------------------------------------------------------------------------------- /absa_layer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from transformers import BertModel, XLNetModel 4 | from seq_utils import * 5 | from bert import BertPreTrainedModel, XLNetPreTrainedModel 6 | from torch.nn import CrossEntropyLoss 7 | 8 | 9 | class TaggerConfig: 10 | def __init__(self): 11 | self.hidden_dropout_prob = 0.1 12 | self.hidden_size = 768 13 | self.n_rnn_layers = 1 # not used if tagger is non-RNN model 14 | self.bidirectional = True # not used if tagger is non-RNN model 15 | 16 | 17 | class SAN(nn.Module): 18 | def __init__(self, d_model, nhead, dropout=0.1): 19 | super(SAN, self).__init__() 20 | self.d_model = d_model 21 | self.nhead = nhead 22 | self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout) 23 | self.dropout = nn.Dropout(p=dropout) 24 | self.norm = nn.LayerNorm(d_model) 25 | 26 | def forward(self, src, src_mask=None, src_key_padding_mask=None): 27 | """ 28 | 29 | :param src: 30 | :param src_mask: 31 | :param src_key_padding_mask: 32 | :return: 33 | """ 34 | src2, _ = self.self_attn(src, src, src, attn_mask=src_mask, key_padding_mask=src_key_padding_mask) 35 | src = src + self.dropout(src2) 36 | # apply layer normalization 37 | src = self.norm(src) 38 | return src 39 | 40 | 41 | class GRU(nn.Module): 42 | # customized GRU with layer normalization 43 | def __init__(self, input_size, hidden_size, bidirectional=True): 44 | """ 45 | 46 | :param input_size: 47 | :param hidden_size: 48 | :param bidirectional: 49 | """ 50 | super(GRU, self).__init__() 51 | self.input_size = input_size 52 | if bidirectional: 53 | self.hidden_size = hidden_size // 2 54 | else: 55 | self.hidden_size = hidden_size 56 | self.bidirectional = bidirectional 57 | self.Wxrz = nn.Linear(in_features=self.input_size, out_features=2*self.hidden_size, bias=True) 58 | self.Whrz = nn.Linear(in_features=self.hidden_size, out_features=2*self.hidden_size, bias=True) 59 | self.Wxn = nn.Linear(in_features=self.input_size, out_features=self.hidden_size, bias=True) 60 | self.Whn = nn.Linear(in_features=self.hidden_size, out_features=self.hidden_size, bias=True) 61 | self.LNx1 = nn.LayerNorm(2*self.hidden_size) 62 | self.LNh1 = nn.LayerNorm(2*self.hidden_size) 63 | self.LNx2 = nn.LayerNorm(self.hidden_size) 64 | self.LNh2 = nn.LayerNorm(self.hidden_size) 65 | 66 | def forward(self, x): 67 | """ 68 | 69 | :param x: input tensor, shape: (batch_size, seq_len, input_size) 70 | :return: 71 | """ 72 | def recurrence(xt, htm1): 73 | """ 74 | 75 | :param xt: current input 76 | :param htm1: previous hidden state 77 | :return: 78 | """ 79 | gates_rz = torch.sigmoid(self.LNx1(self.Wxrz(xt)) + self.LNh1(self.Whrz(htm1))) 80 | rt, zt = gates_rz.chunk(2, 1) 81 | nt = torch.tanh(self.LNx2(self.Wxn(xt))+rt*self.LNh2(self.Whn(htm1))) 82 | ht = (1.0-zt) * nt + zt * htm1 83 | return ht 84 | 85 | steps = range(x.size(1)) 86 | bs = x.size(0) 87 | hidden = self.init_hidden(bs) 88 | # shape: (seq_len, bsz, input_size) 89 | input = x.transpose(0, 1) 90 | output = [] 91 | for t in steps: 92 | hidden = recurrence(input[t], hidden) 93 | output.append(hidden) 94 | # shape: (bsz, seq_len, input_size) 95 | output = torch.stack(output, 0).transpose(0, 1) 96 | 97 | if self.bidirectional: 98 | output_b = [] 99 | hidden_b = self.init_hidden(bs) 100 | for t in steps[::-1]: 101 | hidden_b = 
102 | output_b.append(hidden_b)
103 | output_b = output_b[::-1]
104 | output_b = torch.stack(output_b, 0).transpose(0, 1)
105 | output = torch.cat([output, output_b], dim=-1)
106 | return output, None
107 | 
108 | def init_hidden(self, bs):
109 | h_0 = torch.zeros(bs, self.hidden_size).cuda()
110 | return h_0
111 | 
112 | 
113 | class CRF(nn.Module):
114 | # borrow the code from
115 | # https://github.com/allenai/allennlp/blob/master/allennlp/modules/conditional_random_field.py
116 | def __init__(self, num_tags, constraints=None, include_start_end_transitions=None):
117 | """
118 | linear-chain CRF layer
119 | :param num_tags: number of output tags
120 | :param constraints: optional transition constraints (the mask below defaults to all ones)
121 | :param include_start_end_transitions: whether to learn start/end transition scores
122 | """
123 | super(CRF, self).__init__()
124 | self.num_tags = num_tags
125 | self.include_start_end_transitions = include_start_end_transitions
126 | self.transitions = nn.Parameter(torch.Tensor(self.num_tags, self.num_tags))
127 | constraint_mask = torch.Tensor(self.num_tags+2, self.num_tags+2).fill_(1.)
128 | if include_start_end_transitions:
129 | self.start_transitions = nn.Parameter(torch.Tensor(num_tags))
130 | self.end_transitions = nn.Parameter(torch.Tensor(num_tags))
131 | # register the constraint_mask
132 | self.constraint_mask = nn.Parameter(constraint_mask, requires_grad=False)
133 | self.reset_parameters()
134 | 
135 | def forward(self, inputs, tags, mask=None):
136 | """
137 | compute the log-likelihood of the given tag sequences
138 | :param inputs: (bsz, seq_len, num_tags), logits calculated from a linear layer
139 | :param tags: (bsz, seq_len)
140 | :param mask: (bsz, seq_len), mask for the padding token
141 | :return: sum of the per-sequence log-likelihoods over the batch
142 | """
143 | if mask is None:
144 | mask = torch.ones(*tags.size(), dtype=torch.long)
145 | log_denominator = self._input_likelihood(inputs, mask)
146 | log_numerator = self._joint_likelihood(inputs, tags, mask)
147 | return torch.sum(log_numerator - log_denominator)
148 | 
149 | def reset_parameters(self):
150 | """
151 | initialize the parameters in CRF
152 | :return:
153 | """
154 | nn.init.xavier_normal_(self.transitions)
155 | if self.include_start_end_transitions:
156 | nn.init.normal_(self.start_transitions)
157 | nn.init.normal_(self.end_transitions)
158 | 
159 | def _input_likelihood(self, logits, mask):
160 | """
161 | compute the log partition function via the forward algorithm
162 | :param logits: emission score calculated by a linear layer, shape: (batch_size, seq_len, num_tags)
163 | :param mask: (batch_size, seq_len), mask for the padding token
164 | :return: (batch_size,), the log-sum of scores over all possible tag sequences
165 | """
166 | bsz, seq_len, num_tags = logits.size()
167 | # Transpose batch size and sequence dimensions
168 | mask = mask.float().transpose(0, 1).contiguous()
169 | logits = logits.transpose(0, 1).contiguous()
170 | 
171 | # Initial alpha is the (batch_size, num_tags) tensor of likelihoods combining the
172 | # transitions to the initial states and the logits for the first timestep.
173 | if self.include_start_end_transitions:
174 | alpha = self.start_transitions.view(1, num_tags) + logits[0]
175 | else:
176 | alpha = logits[0]
177 | 
178 | for t in range(1, seq_len):
179 | # iteration starts from 1
180 | emit_scores = logits[t].view(bsz, 1, num_tags)
181 | transition_scores = self.transitions.view(1, num_tags, num_tags)
182 | broadcast_alpha = alpha.view(bsz, num_tags, 1)
183 | 
184 | # calculate the likelihood
185 | inner = broadcast_alpha + emit_scores + transition_scores
186 | 
187 | # mask the padded positions: when a padded token is met, retain the previous alpha
188 | alpha = (logsumexp(inner, 1) * mask[t].view(bsz, 1) + alpha * (1 - mask[t]).view(bsz, 1))
189 | # Every sequence needs to end with a transition to the stop_tag.
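# At this point alpha[b, j] holds the log-sum of the scores of all tag paths
# of sequence b that end in tag j; adding the (optional) end transitions and
# applying logsumexp over the tag dimension below yields the log partition
# function used as the denominator in forward().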
190 | if self.include_start_end_transitions:
191 | stops = alpha + self.end_transitions.view(1, num_tags)
192 | else:
193 | stops = alpha
194 | 
195 | # Finally we log_sum_exp along the num_tags dim, result is (batch_size,)
196 | return logsumexp(stops)
197 | 
198 | def _joint_likelihood(self, logits, tags, mask):
199 | """
200 | calculate the likelihood for the input tag sequence
201 | :param logits: emission scores, shape: (bsz, seq_len, num_tags)
202 | :param tags: shape: (bsz, seq_len)
203 | :param mask: shape: (bsz, seq_len)
204 | :return: (bsz,), the score of the given tag sequence for each example
205 | """
206 | bsz, seq_len, _ = logits.size()
207 | 
208 | # Transpose batch size and sequence dimensions:
209 | logits = logits.transpose(0, 1).contiguous()
210 | mask = mask.float().transpose(0, 1).contiguous()
211 | tags = tags.transpose(0, 1).contiguous()
212 | 
213 | # Start with the transition scores from start_tag to the first tag in each input
214 | if self.include_start_end_transitions:
215 | score = self.start_transitions.index_select(0, tags[0])
216 | else:
217 | score = 0.0
218 | 
219 | for t in range(seq_len-1):
220 | current_tag, next_tag = tags[t], tags[t+1]
221 | # The scores for transitioning from current_tag to next_tag
222 | transition_score = self.transitions[current_tag.view(-1), next_tag.view(-1)]
223 | 
224 | # The score for using current_tag
225 | emit_score = logits[t].gather(1, current_tag.view(bsz, 1)).squeeze(1)
226 | 
227 | score = score + transition_score * mask[t+1] + emit_score * mask[t]
228 | 
229 | last_tag_index = mask.sum(0).long() - 1
230 | last_tags = tags.gather(0, last_tag_index.view(1, bsz)).squeeze(0)
231 | 
232 | # Compute score of transitioning to `stop_tag` from each "last tag".
233 | if self.include_start_end_transitions:
234 | last_transition_score = self.end_transitions.index_select(0, last_tags)
235 | else:
236 | last_transition_score = 0.0
237 | 
238 | last_inputs = logits[-1] # (batch_size, num_tags)
239 | last_input_score = last_inputs.gather(1, last_tags.view(-1, 1)) # (batch_size, 1)
240 | last_input_score = last_input_score.squeeze() # (batch_size,)
241 | 
242 | score = score + last_transition_score + last_input_score * mask[-1]
243 | 
244 | return score
245 | 
246 | def viterbi_tags(self, logits, mask):
247 | """
248 | decode the most likely tag sequence for each example with Viterbi
249 | :param logits: (bsz, seq_len, num_tags), emission scores
250 | :param mask: (bsz, seq_len), mask for the padding token
251 | :return: a list of best tag-index paths, one per example
252 | """
253 | _, max_seq_len, num_tags = logits.size()
254 | 
255 | # Get the tensors out of the variables
256 | logits, mask = logits.data, mask.data
257 | 
258 | # Augment transitions matrix with start and end transitions
259 | start_tag = num_tags
260 | end_tag = num_tags + 1
261 | transitions = torch.Tensor(num_tags + 2, num_tags + 2).fill_(-10000.)
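# The two extra rows/columns added here act as virtual START (index num_tags)
# and END (index num_tags + 1) tags; filling with -10000 makes every
# transition effectively forbidden until the learned, constraint-masked
# scores are copied into the matrix below.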
262 | 263 | # Apply transition constraints 264 | constrained_transitions = ( 265 | self.transitions * self.constraint_mask[:num_tags, :num_tags] + 266 | -10000.0 * (1 - self.constraint_mask[:num_tags, :num_tags]) 267 | ) 268 | 269 | transitions[:num_tags, :num_tags] = constrained_transitions.data 270 | 271 | if self.include_start_end_transitions: 272 | transitions[start_tag, :num_tags] = ( 273 | self.start_transitions.detach() * self.constraint_mask[start_tag, :num_tags].data + 274 | -10000.0 * (1 - self.constraint_mask[start_tag, :num_tags].detach()) 275 | ) 276 | transitions[:num_tags, end_tag] = ( 277 | self.end_transitions.detach() * self.constraint_mask[:num_tags, end_tag].data + 278 | -10000.0 * (1 - self.constraint_mask[:num_tags, end_tag].detach()) 279 | ) 280 | else: 281 | transitions[start_tag, :num_tags] = (-10000.0 * 282 | (1 - self.constraint_mask[start_tag, :num_tags].detach())) 283 | transitions[:num_tags, end_tag] = -10000.0 * (1 - self.constraint_mask[:num_tags, end_tag].detach()) 284 | 285 | best_paths = [] 286 | # Pad the max sequence length by 2 to account for start_tag + end_tag. 287 | tag_sequence = torch.Tensor(max_seq_len + 2, num_tags + 2) 288 | 289 | for prediction, prediction_mask in zip(logits, mask): 290 | # perform viterbi decoding sample by sample 291 | seq_len = torch.sum(prediction_mask) 292 | # Start with everything totally unlikely 293 | tag_sequence.fill_(-10000.) 294 | # At timestep 0 we must have the START_TAG 295 | tag_sequence[0, start_tag] = 0. 296 | # At steps 1, ..., sequence_length we just use the incoming prediction 297 | tag_sequence[1:(seq_len + 1), :num_tags] = prediction[:seq_len] 298 | # And at the last timestep we must have the END_TAG 299 | tag_sequence[seq_len + 1, end_tag] = 0. 300 | viterbi_path = viterbi_decode(tag_sequence[:(seq_len + 2)], transitions) 301 | viterbi_path = viterbi_path[1:-1] 302 | best_paths.append(viterbi_path) 303 | return best_paths 304 | 305 | 306 | class LSTM(nn.Module): 307 | # customized LSTM with layer normalization 308 | def __init__(self, input_size, hidden_size, bidirectional=True): 309 | """ 310 | 311 | :param input_size: 312 | :param hidden_size: 313 | :param bidirectional: 314 | """ 315 | super(LSTM, self).__init__() 316 | self.input_size = input_size 317 | if bidirectional: 318 | self.hidden_size = hidden_size // 2 319 | else: 320 | self.hidden_size = hidden_size 321 | self.bidirectional = bidirectional 322 | self.LNx = nn.LayerNorm(4*self.hidden_size) 323 | self.LNh = nn.LayerNorm(4*self.hidden_size) 324 | self.LNc = nn.LayerNorm(self.hidden_size) 325 | self.Wx = nn.Linear(in_features=self.input_size, out_features=4*self.hidden_size, bias=True) 326 | self.Wh = nn.Linear(in_features=self.hidden_size, out_features=4*self.hidden_size, bias=True) 327 | 328 | def forward(self, x): 329 | """ 330 | 331 | :param x: input, shape: (batch_size, seq_len, input_size) 332 | :return: 333 | """ 334 | def recurrence(xt, hidden): 335 | """ 336 | recurrence function enhanced with layer norm 337 | :param input: input to the current cell 338 | :param hidden: 339 | :return: 340 | """ 341 | htm1, ctm1 = hidden 342 | gates = self.LNx(self.Wx(xt)) + self.LNh(self.Wh(htm1)) 343 | it, ft, gt, ot = gates.chunk(4, 1) 344 | it = torch.sigmoid(it) 345 | ft = torch.sigmoid(ft) 346 | gt = torch.tanh(gt) 347 | ot = torch.sigmoid(ot) 348 | ct = (ft * ctm1) + (it * gt) 349 | ht = ot * torch.tanh(self.LNc(ct)) # n_b x hidden_dim 350 | 351 | return ht, ct 352 | output = [] 353 | # sequence_length 354 | steps = range(x.size(1)) 355 | 
hidden = self.init_hidden(x.size(0)) 356 | # change to: (seq_len, bs, hidden_size) 357 | input = x.transpose(0, 1) 358 | for t in steps: 359 | hidden = recurrence(input[t], hidden) 360 | output.append(hidden[0]) 361 | # (bs, seq_len, hidden_size) 362 | output = torch.stack(output, 0).transpose(0, 1) 363 | 364 | if self.bidirectional: 365 | hidden_b = self.init_hidden(x.size(0)) 366 | output_b = [] 367 | for t in steps[::-1]: 368 | hidden_b = recurrence(input[t], hidden_b) 369 | output_b.append(hidden_b[0]) 370 | output_b = output_b[::-1] 371 | output_b = torch.stack(output_b, 0).transpose(0, 1) 372 | output = torch.cat([output, output_b], dim=-1) 373 | return output, None 374 | 375 | def init_hidden(self, bs): 376 | h_0 = torch.zeros(bs, self.hidden_size).cuda() 377 | c_0 = torch.zeros(bs, self.hidden_size).cuda() 378 | return h_0, c_0 379 | 380 | 381 | class BertABSATagger(BertPreTrainedModel): 382 | def __init__(self, bert_config): 383 | """ 384 | 385 | :param bert_config: configuration for bert model 386 | """ 387 | super(BertABSATagger, self).__init__(bert_config) 388 | self.num_labels = bert_config.num_labels 389 | self.tagger_config = TaggerConfig() 390 | self.tagger_config.absa_type = bert_config.absa_type.lower() 391 | if bert_config.tfm_mode == 'finetune': 392 | # initialized with pre-trained BERT and perform finetuning 393 | # print("Fine-tuning the pre-trained BERT...") 394 | self.bert = BertModel(bert_config) 395 | else: 396 | raise Exception("Invalid transformer mode %s!!!" % bert_config.tfm_mode) 397 | self.bert_dropout = nn.Dropout(bert_config.hidden_dropout_prob) 398 | # fix the parameters in BERT and regard it as feature extractor 399 | if bert_config.fix_tfm: 400 | # fix the parameters of the (pre-trained or randomly initialized) transformers during fine-tuning 401 | for p in self.bert.parameters(): 402 | p.requires_grad = False 403 | 404 | self.tagger = None 405 | if self.tagger_config.absa_type == 'linear': 406 | # hidden size at the penultimate layer 407 | penultimate_hidden_size = bert_config.hidden_size 408 | else: 409 | self.tagger_dropout = nn.Dropout(self.tagger_config.hidden_dropout_prob) 410 | if self.tagger_config.absa_type == 'lstm': 411 | self.tagger = LSTM(input_size=bert_config.hidden_size, 412 | hidden_size=self.tagger_config.hidden_size, 413 | bidirectional=self.tagger_config.bidirectional) 414 | elif self.tagger_config.absa_type == 'gru': 415 | self.tagger = GRU(input_size=bert_config.hidden_size, 416 | hidden_size=self.tagger_config.hidden_size, 417 | bidirectional=self.tagger_config.bidirectional) 418 | elif self.tagger_config.absa_type == 'tfm': 419 | # transformer encoder layer 420 | self.tagger = nn.TransformerEncoderLayer(d_model=bert_config.hidden_size, 421 | nhead=12, 422 | dim_feedforward=4*bert_config.hidden_size, 423 | dropout=0.1) 424 | elif self.tagger_config.absa_type == 'san': 425 | # vanilla self attention networks 426 | self.tagger = SAN(d_model=bert_config.hidden_size, nhead=12, dropout=0.1) 427 | elif self.tagger_config.absa_type == 'crf': 428 | self.tagger = CRF(num_tags=self.num_labels) 429 | else: 430 | raise Exception('Unimplemented downstream tagger %s...' 
% self.tagger_config.absa_type) 431 | penultimate_hidden_size = self.tagger_config.hidden_size 432 | self.classifier = nn.Linear(penultimate_hidden_size, bert_config.num_labels) 433 | 434 | def forward(self, input_ids, token_type_ids=None, attention_mask=None, labels=None, 435 | position_ids=None, head_mask=None): 436 | outputs = self.bert(input_ids, position_ids=position_ids, token_type_ids=token_type_ids, 437 | attention_mask=attention_mask, head_mask=head_mask) 438 | # the hidden states of the last Bert Layer, shape: (bsz, seq_len, hsz) 439 | tagger_input = outputs[0] 440 | tagger_input = self.bert_dropout(tagger_input) 441 | #print("tagger_input.shape:", tagger_input.shape) 442 | if self.tagger is None or self.tagger_config.absa_type == 'crf': 443 | # regard classifier as the tagger 444 | logits = self.classifier(tagger_input) 445 | else: 446 | if self.tagger_config.absa_type == 'lstm': 447 | # customized LSTM 448 | classifier_input, _ = self.tagger(tagger_input) 449 | elif self.tagger_config.absa_type == 'gru': 450 | # customized GRU 451 | classifier_input, _ = self.tagger(tagger_input) 452 | elif self.tagger_config.absa_type == 'san' or self.tagger_config.absa_type == 'tfm': 453 | # vanilla self-attention networks or transformer 454 | # adapt the input format for the transformer or self attention networks 455 | tagger_input = tagger_input.transpose(0, 1) 456 | classifier_input = self.tagger(tagger_input) 457 | classifier_input = classifier_input.transpose(0, 1) 458 | else: 459 | raise Exception("Unimplemented downstream tagger %s..." % self.tagger_config.absa_type) 460 | classifier_input = self.tagger_dropout(classifier_input) 461 | logits = self.classifier(classifier_input) 462 | outputs = (logits,) + outputs[2:] 463 | 464 | if labels is not None: 465 | if self.tagger_config.absa_type != 'crf': 466 | loss_fct = CrossEntropyLoss() 467 | if attention_mask is not None: 468 | active_loss = attention_mask.view(-1) == 1 469 | active_logits = logits.view(-1, self.num_labels)[active_loss] 470 | active_labels = labels.view(-1)[active_loss] 471 | loss = loss_fct(active_logits, active_labels) 472 | else: 473 | loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1)) 474 | outputs = (loss,) + outputs 475 | else: 476 | log_likelihood = self.tagger(inputs=logits, tags=labels, mask=attention_mask) 477 | loss = -log_likelihood 478 | outputs = (loss,) + outputs 479 | return outputs 480 | 481 | 482 | class XLNetABSATagger(XLNetPreTrainedModel): 483 | # TODO 484 | def __init__(self, xlnet_config): 485 | super(XLNetABSATagger, self).__init__(xlnet_config) 486 | self.num_labels = xlnet_config.num_labels 487 | self.xlnet = XLNetModel(xlnet_config) 488 | self.tagger_config = xlnet_config.absa_tagger_config 489 | self.tagger = None 490 | if self.tagger_config.tagger == '': 491 | # hidden size at the penultimate layer 492 | penultimate_hidden_size = xlnet_config.d_model 493 | else: 494 | self.tagger_dropout = nn.Dropout(self.tagger_config.hidden_dropout_prob) 495 | if self.tagger_config.tagger in ['RNN', 'LSTM', 'GRU']: 496 | # 2-layer bi-directional rnn decoder 497 | self.tagger = getattr(nn, self.tagger_config.tagger)( 498 | input_size=xlnet_config.d_model, hidden_size=self.tagger_config.hidden_size//2, 499 | num_layers=self.tagger_config.n_rnn_layers, batch_first=True, bidirectional=True) 500 | elif self.tagger_config.tagger in ['CRF']: 501 | # crf tagger 502 | raise Exception("Unimplemented now!!") 503 | else: 504 | raise Exception('Unimplemented tagger %s...' 
% self.tagger_config.tagger) 505 | penultimate_hidden_size = self.tagger_config.hidden_size 506 | self.tagger_dropout = nn.Dropout(self.tagger_config.hidden_dropout_prob) 507 | self.classifier = nn.Linear(penultimate_hidden_size, xlnet_config.num_labels) 508 | self.apply(self.init_weights) 509 | 510 | def forward(self, input_ids, token_type_ids=None, input_mask=None, attention_mask=None, mems=None, 511 | perm_mask=None, target_mapping=None, labels=None, head_mask=None): 512 | """ 513 | 514 | :param input_ids: Indices of input sequence tokens in the vocabulary 515 | :param token_type_ids: A parallel sequence of tokens (can be used to indicate various portions of the inputs). 516 | The embeddings from these tokens will be summed with the respective token embeddings 517 | :param input_mask: Mask to avoid performing attention on padding token indices. 518 | :param attention_mask: Mask to avoid performing attention on padding token indices. 519 | :param mems: list of torch.FloatTensor (one for each layer): 520 | that contains pre-computed hidden-states (key and values in the attention blocks) 521 | :param perm_mask: 522 | :param target_mapping: 523 | :param labels: 524 | :param head_mask: 525 | :return: 526 | """ 527 | transformer_outputs = self.xlnet(input_ids, token_type_ids=token_type_ids, 528 | input_mask=input_mask, attention_mask=attention_mask, 529 | mems=mems, perm_mask=perm_mask, target_mapping=target_mapping, 530 | head_mask=head_mask) 531 | # hidden states from the last transformer layer, xlnet has done the dropout, 532 | # no need to do the additional dropout 533 | tagger_input = transformer_outputs[0] 534 | 535 | if self.tagger is None: 536 | # regard classifier as the tagger 537 | logits = self.classifier(tagger_input) 538 | else: 539 | if self.tagger_config.tagger in ['RNN', 'LSTM', 'GRU']: 540 | classifier_input, _= self.tagger(tagger_input) 541 | else: 542 | raise Exception("Unimplemented tagger %s..." 
% self.tagger_config.tagger) 543 | classifier_input = self.tagger_dropout(classifier_input) 544 | logits = self.classifier(classifier_input) 545 | # transformer outputs: (last_hidden_state, mems, hidden_states, attentions) 546 | outputs = (logits,) + transformer_outputs[1:] 547 | 548 | if labels is not None: 549 | loss_fct = CrossEntropyLoss() 550 | if attention_mask is not None: 551 | active_loss = attention_mask.view(-1) == 1 552 | active_logits = logits.view(-1, self.num_labels)[active_loss] 553 | active_labels = labels.view(-1)[active_loss] 554 | loss = loss_fct(active_logits, active_labels) 555 | else: 556 | loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1)) 557 | outputs = (loss,) + outputs 558 | return outputs 559 | -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import torch 4 | import logging 5 | import random 6 | import numpy as np 7 | 8 | from glue_utils import convert_examples_to_seq_features, output_modes, processors, compute_metrics_absa 9 | from tqdm import tqdm, trange 10 | from transformers import BertConfig, BertTokenizer, XLNetConfig, XLNetTokenizer, WEIGHTS_NAME 11 | from transformers import AdamW, get_linear_schedule_with_warmup 12 | from absa_layer import BertABSATagger, XLNetABSATagger 13 | 14 | from torch.utils.data import DataLoader, TensorDataset, RandomSampler, SequentialSampler 15 | from torch.utils.data.distributed import DistributedSampler 16 | import torch.distributed as dist 17 | from tensorboardX import SummaryWriter 18 | 19 | import glob 20 | import json 21 | 22 | logger = logging.getLogger(__name__) 23 | 24 | #ALL_MODELS = sum((tuple(conf.pretrained_config_archive_map.keys()) for conf in (BertConfig, XLNetConfig)), ()) 25 | ALL_MODELS = ( 26 | 'bert-base-uncased', 27 | 'bert-large-uncased', 28 | 'bert-base-cased', 29 | 'bert-large-cased', 30 | 'bert-base-multilingual-uncased', 31 | 'bert-base-multilingual-cased', 32 | 'bert-base-chinese', 33 | 'bert-base-german-cased', 34 | 'bert-large-uncased-whole-word-masking', 35 | 'bert-large-cased-whole-word-masking', 36 | 'bert-large-uncased-whole-word-masking-finetuned-squad', 37 | 'bert-large-cased-whole-word-masking-finetuned-squad', 38 | 'bert-base-cased-finetuned-mrpc', 39 | 'bert-base-german-dbmdz-cased', 40 | 'bert-base-german-dbmdz-uncased', 41 | 'xlnet-base-cased', 42 | 'xlnet-large-cased' 43 | ) 44 | 45 | 46 | MODEL_CLASSES = { 47 | 'bert': (BertConfig, BertABSATagger, BertTokenizer), 48 | 'xlnet': (XLNetConfig, XLNetABSATagger, XLNetTokenizer) 49 | } 50 | 51 | 52 | def set_seed(args): 53 | random.seed(args.seed) 54 | np.random.seed(args.seed) 55 | torch.manual_seed(args.seed) 56 | if args.n_gpu > 0: 57 | torch.cuda.manual_seed_all(args.seed) 58 | 59 | 60 | def init_args(): 61 | parser = argparse.ArgumentParser() 62 | parser.add_argument("--data_dir", default=None, type=str, required=True, 63 | help="The input data dir. 
Should contain the .tsv files (or other data files) for the task.") 64 | parser.add_argument("--model_type", default=None, type=str, required=True, 65 | help="Model type selected in the list: " + ", ".join(MODEL_CLASSES.keys())) 66 | parser.add_argument("--absa_type", default=None, type=str, required=True, 67 | help="Downstream absa layer type selected in the list: [linear, gru, san, tfm, crf]") 68 | parser.add_argument("--tfm_mode", default=None, type=str, required=True, 69 | help="mode of the pre-trained transformer, selected from: [finetune]") 70 | parser.add_argument("--fix_tfm", default=None, type=int, required=True, 71 | help="whether to fix the transformer params or not") 72 | parser.add_argument("--model_name_or_path", default=None, type=str, required=True, 73 | help="Path to pre-trained model or shortcut name selected in the list: " + ", ".join( 74 | ALL_MODELS)) 75 | parser.add_argument("--task_name", default=None, type=str, required=True, 76 | help="The name of the task to train selected in the list: " + ", ".join(processors.keys())) 77 | 78 | ## Other parameters 79 | parser.add_argument("--config_name", default="", type=str, 80 | help="Pretrained config name or path if not the same as model_name") 81 | parser.add_argument("--tokenizer_name", default="", type=str, 82 | help="Pretrained tokenizer name or path if not the same as model_name") 83 | parser.add_argument("--cache_dir", default="", type=str, 84 | help="Where do you want to store the pre-trained models downloaded from s3") 85 | parser.add_argument("--max_seq_length", default=128, type=int, 86 | help="The maximum total input sequence length after tokenization. Sequences longer " 87 | "than this will be truncated, sequences shorter will be padded.") 88 | parser.add_argument("--do_train", action='store_true', 89 | help="Whether to run training.") 90 | parser.add_argument("--do_eval", action='store_true', 91 | help="Whether to run eval on the dev set.") 92 | parser.add_argument("--evaluate_during_training", action='store_true', 93 | help="Run evaluation during training at each logging step.") 94 | parser.add_argument("--do_lower_case", action='store_true', 95 | help="Set this flag if you are using an uncased model.") 96 | 97 | parser.add_argument("--per_gpu_train_batch_size", default=8, type=int, 98 | help="Batch size per GPU/CPU for training.") 99 | parser.add_argument("--per_gpu_eval_batch_size", default=8, type=int, 100 | help="Batch size per GPU/CPU for evaluation.") 101 | parser.add_argument('--gradient_accumulation_steps', type=int, default=1, 102 | help="Number of update steps to accumulate before performing a backward/update pass.") 103 | parser.add_argument("--learning_rate", default=5e-5, type=float, 104 | help="The initial learning rate for Adam.") 105 | parser.add_argument("--weight_decay", default=0.0, type=float, 106 | help="Weight decay if we apply some.") 107 | parser.add_argument("--adam_epsilon", default=1e-8, type=float, 108 | help="Epsilon for Adam optimizer.") 109 | parser.add_argument("--max_grad_norm", default=1.0, type=float, 110 | help="Max gradient norm.") 111 | parser.add_argument("--num_train_epochs", default=3.0, type=float, 112 | help="Total number of training epochs to perform.") 113 | parser.add_argument("--max_steps", default=-1, type=int, 114 | help="If > 0: set total number of training steps to perform.
Overrides num_train_epochs.") 115 | parser.add_argument("--warmup_steps", default=0, type=int, 116 | help="Linear warmup over warmup_steps.") 117 | 118 | parser.add_argument('--logging_steps', type=int, default=50, 119 | help="Log every X update steps.") 120 | parser.add_argument('--save_steps', type=int, default=100, 121 | help="Save checkpoint every X update steps.") 122 | parser.add_argument("--eval_all_checkpoints", action='store_true', 123 | help="Evaluate all checkpoints starting with the same prefix as model_name and ending with step number") 124 | parser.add_argument("--no_cuda", action='store_true', 125 | help="Avoid using CUDA when available") 126 | parser.add_argument('--overwrite_output_dir', action='store_true', 127 | help="Overwrite the content of the output directory") 128 | parser.add_argument('--overwrite_cache', action='store_true', 129 | help="Overwrite the cached training and evaluation sets") 130 | parser.add_argument('--seed', type=int, default=42, 131 | help="random seed for initialization") 132 | parser.add_argument('--tagging_schema', type=str, default='BIEOS') 133 | 134 | parser.add_argument("--overfit", type=int, default=0, help="whether to evaluate overfitting or not") 135 | 136 | parser.add_argument("--local_rank", type=int, default=-1, 137 | help="For distributed training: local_rank") 138 | parser.add_argument('--server_ip', type=str, default='', help="For distant debugging.") 139 | parser.add_argument('--server_port', type=str, default='', help="For distant debugging.") 140 | parser.add_argument('--MASTER_ADDR', type=str) 141 | parser.add_argument('--MASTER_PORT', type=str) 142 | args = parser.parse_args() 143 | output_dir = '%s-%s-%s-%s' % (args.model_type, args.absa_type, args.task_name, args.tfm_mode) 144 | 145 | if args.fix_tfm: 146 | output_dir = '%s-fix' % output_dir 147 | if args.overfit: 148 | output_dir = '%s-overfit' % output_dir 149 | args.max_steps = 3000 150 | args.output_dir = output_dir 151 | return args 152 | 153 | 154 | def train(args, train_dataset, model, tokenizer): 155 | """ Train the model """ 156 | if args.local_rank in [-1, 0]: 157 | tb_writer = SummaryWriter() 158 | 159 | args.train_batch_size = args.per_gpu_train_batch_size * max(1, args.n_gpu) 160 | # draw training samples from the shuffled dataset 161 | train_sampler = RandomSampler(train_dataset) if args.local_rank == -1 else DistributedSampler(train_dataset) 162 | train_dataloader = DataLoader(train_dataset, sampler=train_sampler, batch_size=args.train_batch_size) 163 | 164 | if args.max_steps > 0: 165 | t_total = args.max_steps 166 | args.num_train_epochs = args.max_steps // (len(train_dataloader) // args.gradient_accumulation_steps) + 1 167 | else: 168 | t_total = len(train_dataloader) // args.gradient_accumulation_steps * args.num_train_epochs 169 | 170 | # Prepare optimizer and schedule (linear warmup and decay) 171 | no_decay = ['bias', 'LayerNorm.weight'] 172 | optimizer_grouped_parameters = [ 173 | {'params': [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)], 'weight_decay': args.weight_decay}, 174 | {'params': [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)], 'weight_decay': 0.0} 175 | ] 176 | optimizer = AdamW(optimizer_grouped_parameters, lr=args.learning_rate, eps=args.adam_epsilon) 177 | scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=args.warmup_steps, num_training_steps=t_total) 178 | 179 | # Train!
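# Editor's note (illustrative arithmetic under assumed settings, not part of the original
# source): with train.sh's per_gpu_train_batch_size=16, gradient_accumulation_steps=1 and
# max_steps=1500 on a single GPU, and a hypothetical 3000-example training set:
#   train_batch_size = 16 * 1 = 16, len(train_dataloader) = ceil(3000 / 16) = 188
#   t_total = 1500, num_train_epochs = 1500 // (188 // 1) + 1 = 8
# which matches the "Total optimization steps" logged below.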
180 | 181 | logger.info("***** Running training *****") 182 | logger.info(" Num examples = %d", len(train_dataset)) 183 | logger.info(" Num Epochs = %d", args.num_train_epochs) 184 | logger.info(" Instantaneous batch size per GPU = %d", args.per_gpu_train_batch_size) 185 | logger.info(" Total train batch size (w. parallel, distributed & accumulation) = %d", 186 | args.train_batch_size * args.gradient_accumulation_steps * (torch.distributed.get_world_size() if args.local_rank != -1 else 1)) 187 | logger.info(" Gradient Accumulation steps = %d", args.gradient_accumulation_steps) 188 | logger.info(" Total optimization steps = %d", t_total) 189 | 190 | global_step = 0 191 | tr_loss, logging_loss = 0.0, 0.0 192 | model.zero_grad() 193 | train_iterator = trange(int(args.num_train_epochs), desc="Epoch", disable=args.local_rank not in [-1, 0]) 194 | # set the seed number 195 | set_seed(args) # Added here for reproducibility (even between python 2 and 3) 196 | for _ in train_iterator: 197 | epoch_iterator = tqdm(train_dataloader, desc="Iteration", disable=args.local_rank not in [-1, 0]) 198 | for step, batch in enumerate(epoch_iterator): 199 | model.train() 200 | batch = tuple(t.to(args.device) for t in batch) 201 | inputs = {'input_ids': batch[0], 202 | 'attention_mask': batch[1], 203 | 'token_type_ids': batch[2] if args.model_type in ['bert', 'xlnet'] else None, # XLM doesn't use segment_ids 204 | 'labels': batch[3]} 205 | outputs = model(**inputs) 206 | # loss with attention mask 207 | loss = outputs[0] # model outputs are always tuple in pytorch-transformers (see doc) 208 | 209 | if args.n_gpu > 1: 210 | loss = loss.mean() # mean() to average on multi-gpu parallel training 211 | if args.gradient_accumulation_steps > 1: 212 | loss = loss / args.gradient_accumulation_steps 213 | 214 | 215 | loss.backward() 216 | torch.nn.utils.clip_grad_norm_(model.parameters(), args.max_grad_norm) 217 | 218 | tr_loss += loss.item() 219 | if (step + 1) % args.gradient_accumulation_steps == 0: 220 | optimizer.step() 221 | scheduler.step() # Update learning rate schedule 222 | model.zero_grad() 223 | global_step += 1 224 | 225 | if args.local_rank in [-1, 0] and args.logging_steps > 0 and global_step % args.logging_steps == 0: 226 | # Log metrics 227 | if args.local_rank == -1 and args.evaluate_during_training: # Only evaluate when single GPU otherwise metrics may not average well 228 | results = evaluate(args, model, tokenizer, mode='dev') # 'mode' is required by evaluate() 229 | for key, value in results.items(): 230 | tb_writer.add_scalar('eval_{}'.format(key), value, global_step) 231 | tb_writer.add_scalar('lr', scheduler.get_lr()[0], global_step) 232 | tb_writer.add_scalar('loss', (tr_loss - logging_loss)/args.logging_steps, global_step) 233 | logging_loss = tr_loss 234 | 235 | if args.local_rank in [-1, 0] and args.save_steps > 0 and global_step % args.save_steps == 0: 236 | # Save a model checkpoint every N steps 237 | output_dir = os.path.join(args.output_dir, 'checkpoint-{}'.format(global_step)) 238 | if not os.path.exists(output_dir): 239 | os.makedirs(output_dir) 240 | model_to_save = model.module if hasattr(model, 'module') else model # Take care of distributed/parallel training 241 | model_to_save.save_pretrained(output_dir) 242 | torch.save(args, os.path.join(output_dir, 'training_args.bin')) 243 | logger.info("Saving model checkpoint to %s", output_dir) 244 | 245 | if args.max_steps > 0 and global_step > args.max_steps: 246 | epoch_iterator.close() 247 | break 248 | if args.max_steps > 0 and global_step > args.max_steps: 249 |
train_iterator.close() 250 | break 251 | 252 | if args.local_rank in [-1, 0]: 253 | tb_writer.close() 254 | 255 | return global_step, tr_loss / global_step 256 | 257 | 258 | def evaluate(args, model, tokenizer, mode, prefix=""): 259 | # Loop to handle MNLI double evaluation (matched, mis-matched) 260 | eval_task_names = (args.task_name,) 261 | eval_outputs_dirs = (args.output_dir,) 262 | 263 | results = {} 264 | for eval_task, eval_output_dir in zip(eval_task_names, eval_outputs_dirs): 265 | eval_dataset, eval_evaluate_label_ids = load_and_cache_examples(args, eval_task, tokenizer, mode=mode) 266 | 267 | if not os.path.exists(eval_output_dir) and args.local_rank in [-1, 0]: 268 | os.makedirs(eval_output_dir) 269 | 270 | args.eval_batch_size = args.per_gpu_eval_batch_size * max(1, args.n_gpu) 271 | # Note that DistributedSampler samples randomly 272 | eval_sampler = SequentialSampler(eval_dataset) if args.local_rank == -1 else DistributedSampler(eval_dataset) 273 | eval_dataloader = DataLoader(eval_dataset, sampler=eval_sampler, batch_size=args.eval_batch_size) 274 | 275 | # Eval! 276 | #logger.info("***** Running evaluation on %s.txt *****" % mode) 277 | eval_loss = 0.0 278 | nb_eval_steps = 0 279 | preds = None 280 | out_label_ids = None 281 | crf_logits, crf_mask = [], [] 282 | for batch in tqdm(eval_dataloader, desc="Evaluating"): 283 | model.eval() 284 | batch = tuple(t.to(args.device) for t in batch) 285 | 286 | with torch.no_grad(): 287 | inputs = {'input_ids': batch[0], 288 | 'attention_mask': batch[1], 289 | 'token_type_ids': batch[2] if args.model_type in ['bert', 'xlnet'] else None, # XLM don't use segment_ids 290 | 'labels': batch[3]} 291 | outputs = model(**inputs) 292 | # logits: (bsz, seq_len, label_size) 293 | # here the loss is the masked loss 294 | tmp_eval_loss, logits = outputs[:2] 295 | eval_loss += tmp_eval_loss.mean().item() 296 | 297 | crf_logits.append(logits) 298 | crf_mask.append(batch[1]) 299 | nb_eval_steps += 1 300 | if preds is None: 301 | preds = logits.detach().cpu().numpy() 302 | out_label_ids = inputs['labels'].detach().cpu().numpy() 303 | else: 304 | preds = np.append(preds, logits.detach().cpu().numpy(), axis=0) 305 | out_label_ids = np.append(out_label_ids, inputs['labels'].detach().cpu().numpy(), axis=0) 306 | eval_loss = eval_loss / nb_eval_steps 307 | # argmax operation over the last dimension 308 | if model.tagger_config.absa_type != 'crf': 309 | # greedy decoding 310 | preds = np.argmax(preds, axis=-1) 311 | else: 312 | # viterbi decoding for CRF-based model 313 | crf_logits = torch.cat(crf_logits, dim=0) 314 | crf_mask = torch.cat(crf_mask, dim=0) 315 | preds = model.tagger.viterbi_tags(logits=crf_logits, mask=crf_mask) 316 | result = compute_metrics_absa(preds, out_label_ids, eval_evaluate_label_ids, args.tagging_schema) 317 | result['eval_loss'] = eval_loss 318 | results.update(result) 319 | 320 | output_eval_file = os.path.join(eval_output_dir, "%s_results.txt" % mode) 321 | with open(output_eval_file, "w") as writer: 322 | #logger.info("***** %s results *****" % mode) 323 | for key in sorted(result.keys()): 324 | if 'eval_loss' in key: 325 | logger.info(" %s = %s", key, str(result[key])) 326 | writer.write("%s = %s\n" % (key, str(result[key]))) 327 | #logger.info("***** %s results *****" % mode) 328 | 329 | return results 330 | 331 | 332 | def load_and_cache_examples(args, task, tokenizer, mode='train'): 333 | processor = processors[task]() 334 | # Load data features from cache or dataset file 335 | cached_features_file = 
os.path.join(args.data_dir, 'cached_{}_{}_{}_{}'.format( 336 | mode, 337 | list(filter(None, args.model_name_or_path.split('/'))).pop(), 338 | str(args.max_seq_length), 339 | str(task))) 340 | if os.path.exists(cached_features_file): 341 | print("cached_features_file:", cached_features_file) 342 | features = torch.load(cached_features_file) 343 | else: 344 | #logger.info("Creating features from dataset file at %s", args.data_dir) 345 | label_list = processor.get_labels(args.tagging_schema) 346 | if mode == 'train': 347 | examples = processor.get_train_examples(args.data_dir, args.tagging_schema) 348 | elif mode == 'dev': 349 | examples = processor.get_dev_examples(args.data_dir, args.tagging_schema) 350 | elif mode == 'test': 351 | examples = processor.get_test_examples(args.data_dir, args.tagging_schema) 352 | else: 353 | raise Exception("Invalid data mode %s..." % mode) 354 | features = convert_examples_to_seq_features(examples=examples, label_list=label_list, tokenizer=tokenizer, 355 | cls_token_at_end=bool(args.model_type in ['xlnet']), 356 | cls_token=tokenizer.cls_token, 357 | sep_token=tokenizer.sep_token, 358 | cls_token_segment_id=2 if args.model_type in ['xlnet'] else 0, 359 | pad_on_left=bool(args.model_type in ['xlnet']), 360 | pad_token_segment_id=4 if args.model_type in ['xlnet'] else 0) 361 | if args.local_rank in [-1, 0]: 362 | #logger.info("Saving features into cached file %s", cached_features_file) 363 | torch.save(features, cached_features_file) 364 | 365 | # Convert to Tensors and build dataset 366 | all_input_ids = torch.tensor([f.input_ids for f in features], dtype=torch.long) 367 | all_input_mask = torch.tensor([f.input_mask for f in features], dtype=torch.long) 368 | all_segment_ids = torch.tensor([f.segment_ids for f in features], dtype=torch.long) 369 | 370 | all_label_ids = torch.tensor([f.label_ids for f in features], dtype=torch.long) 371 | # used in evaluation 372 | all_evaluate_label_ids = [f.evaluate_label_ids for f in features] 373 | dataset = TensorDataset(all_input_ids, all_input_mask, all_segment_ids, all_label_ids) 374 | return dataset, all_evaluate_label_ids 375 | 376 | 377 | def main(): 378 | 379 | args = init_args() 380 | if os.path.exists(args.output_dir) and os.listdir(args.output_dir) and args.do_train and not args.overwrite_output_dir: 381 | raise ValueError("Output directory ({}) already exists and is not empty. 
Use --overwrite_output_dir to overcome.".format(args.output_dir)) 382 | 383 | # Setup CUDA, GPU & distributed training 384 | if args.local_rank == -1 or args.no_cuda: 385 | device = torch.device("cuda" if torch.cuda.is_available() and not args.no_cuda else "cpu") 386 | args.n_gpu = torch.cuda.device_count() 387 | else: # Initializes the distributed backend which will take care of synchronizing nodes/GPUs 388 | torch.cuda.set_device(args.local_rank) 389 | device = torch.device("cuda", args.local_rank) 390 | os.environ['MASTER_ADDR'] = args.MASTER_ADDR 391 | os.environ['MASTER_PORT'] = args.MASTER_PORT 392 | torch.distributed.init_process_group(backend='nccl', rank=args.local_rank, world_size=1) 393 | args.n_gpu = 1 394 | 395 | args.device = device 396 | 397 | # Setup logging 398 | logging.basicConfig(format='%(asctime)s - %(levelname)s - %(name)s - %(message)s', 399 | datefmt='%m/%d/%Y %H:%M:%S', 400 | level=logging.INFO if args.local_rank in [-1, 0] else logging.WARN) 401 | # not using 16-bits training 402 | logger.warning("Process rank: %s, device: %s, n_gpu: %s, distributed training: %s, 16-bits training: False", 403 | args.local_rank, device, args.n_gpu, bool(args.local_rank != -1)) 404 | 405 | # Set seed 406 | set_seed(args) 407 | 408 | # Prepare task 409 | args.task_name = args.task_name.lower() 410 | if args.task_name not in processors: 411 | raise ValueError("Task not found: %s" % args.task_name) 412 | processor = processors[args.task_name]() 413 | args.output_mode = output_modes[args.task_name] 414 | label_list = processor.get_labels(args.tagging_schema) 415 | num_labels = len(label_list) 416 | 417 | if args.local_rank not in [-1, 0]: 418 | torch.distributed.barrier() 419 | 420 | # initialize the pre-trained model 421 | args.model_type = args.model_type.lower() 422 | config_class, model_class, tokenizer_class = MODEL_CLASSES[args.model_type] 423 | config = config_class.from_pretrained(args.config_name if args.config_name else args.model_name_or_path, 424 | num_labels=num_labels, finetuning_task=args.task_name, cache_dir="./cache") 425 | tokenizer = tokenizer_class.from_pretrained(args.tokenizer_name if args.tokenizer_name else args.model_name_or_path, 426 | do_lower_case=args.do_lower_case, cache_dir='./cache') 427 | 428 | config.absa_type = args.absa_type 429 | config.tfm_mode = args.tfm_mode 430 | config.fix_tfm = args.fix_tfm 431 | model = model_class.from_pretrained(args.model_name_or_path, from_tf=bool('.ckpt' in args.model_name_or_path), 432 | config=config, cache_dir='./cache') 433 | # Distributed and parallel training 434 | model.to(args.device) 435 | if args.local_rank != -1: 436 | model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.local_rank], 437 | output_device=args.local_rank, 438 | find_unused_parameters=True) 439 | elif args.n_gpu > 1: 440 | model = torch.nn.DataParallel(model) 441 | 442 | # Training 443 | if args.do_train: 444 | train_dataset, train_evaluate_label_ids = load_and_cache_examples(args, args.task_name, tokenizer, mode='train') 445 | global_step, tr_loss = train(args, train_dataset, model, tokenizer) 446 | 447 | if args.do_train and (args.local_rank == -1 or dist.get_rank() == 0): 448 | # Create output directory if needed 449 | if not os.path.exists(args.output_dir) and args.local_rank in [-1, 0]: 450 | os.mkdir(args.output_dir) 451 | 452 | model_to_save = model.module if hasattr(model, 'module') else model 453 | model_to_save.save_pretrained(args.output_dir) 454 | tokenizer.save_pretrained(args.output_dir) 455 | 456 | # Good
practice: save your training arguments together with the trained model 457 | # save the model configuration 458 | torch.save(args, os.path.join(args.output_dir, 'training_args.bin')) 459 | 460 | # Load a trained model and vocabulary that you have fine-tuned 461 | model = model_class.from_pretrained(args.output_dir) 462 | tokenizer = tokenizer_class.from_pretrained(args.output_dir) 463 | model.to(args.device) 464 | 465 | # Validation 466 | results = {} 467 | best_f1 = -999999.0 468 | best_checkpoint = None 469 | checkpoints = [args.output_dir] 470 | if args.eval_all_checkpoints: 471 | checkpoints = list(os.path.dirname(c) for c in sorted(glob.glob(args.output_dir + '/**/' + WEIGHTS_NAME, recursive=True))) 472 | logging.getLogger("pytorch_transformers.modeling_utils").setLevel(logging.WARN) # Reduce logging 473 | logger.info("Perform validation on the following checkpoints: %s", checkpoints) 474 | test_results = {} 475 | for checkpoint in checkpoints: 476 | global_step = checkpoint.split('-')[-1] if len(checkpoints) > 1 else "" 477 | if global_step == 'finetune' or global_step == 'train' or global_step == 'fix' or global_step == 'overfit': 478 | continue 479 | # validation set 480 | model = model_class.from_pretrained(checkpoint) 481 | model.to(args.device) 482 | dev_result = evaluate(args, model, tokenizer, mode='dev', prefix=global_step) 483 | 484 | # regard the micro-f1 as the criterion of model selection 485 | if int(global_step) > 1000 and dev_result['micro-f1'] > best_f1: 486 | best_f1 = dev_result['micro-f1'] 487 | best_checkpoint = checkpoint 488 | dev_result = dict((k + '_{}'.format(global_step), v) for k, v in dev_result.items()) 489 | results.update(dev_result) 490 | 491 | test_result = evaluate(args, model, tokenizer, mode='test', prefix=global_step) 492 | test_result = dict((k + '_{}'.format(global_step), v) for k, v in test_result.items()) 493 | test_results.update(test_result) 494 | 495 | best_ckpt_string = "\nThe best checkpoint is %s" % best_checkpoint 496 | logger.info(best_ckpt_string) 497 | dev_f1_values, dev_loss_values = [], [] 498 | for k in results: 499 | v = results[k] 500 | if 'micro-f1' in k: 501 | dev_f1_values.append((k, v)) 502 | if 'eval_loss' in k: 503 | dev_loss_values.append((k, v)) 504 | test_f1_values, test_loss_values = [], [] 505 | for k in test_results: 506 | v = test_results[k] 507 | if 'micro-f1' in k: 508 | test_f1_values.append((k, v)) 509 | if 'eval_loss' in k: 510 | test_loss_values.append((k, v)) 511 | log_file_path = '%s/log.txt' % args.output_dir 512 | log_file = open(log_file_path, 'a') 513 | log_file.write("\tValidation:\n") 514 | for (test_f1_k, test_f1_v), (test_loss_k, test_loss_v), (dev_f1_k, dev_f1_v), (dev_loss_k, dev_loss_v) in zip( 515 | test_f1_values, test_loss_values, dev_f1_values, dev_loss_values): 516 | global_step = int(test_f1_k.split('_')[-1]) 517 | if not args.overfit and global_step <= 1000: 518 | continue 519 | print('test-%s: %.5lf, test-%s: %.5lf, dev-%s: %.5lf, dev-%s: %.5lf' % (test_f1_k, 520 | test_f1_v, test_loss_k, test_loss_v, 521 | dev_f1_k, dev_f1_v, dev_loss_k, 522 | dev_loss_v)) 523 | validation_string = '\t\tdev-%s: %.5lf, dev-%s: %.5lf' % (dev_f1_k, dev_f1_v, dev_loss_k, dev_loss_v) 524 | log_file.write(validation_string+'\n') 525 | 526 | n_times = args.max_steps // args.save_steps + 1 527 | for i in range(1, n_times): 528 | step = i * args.save_steps # use the configured save interval rather than a hard-coded 100 529 | log_file.write('\tStep %s:\n' % step) 530 | precision = test_results['precision_%s' % step] 531 | recall = test_results['recall_%s' % step] 532 | micro_f1 =
test_results['micro-f1_%s' % step] 533 | macro_f1 = test_results['macro-f1_%s' % step] 534 | log_file.write('\t\tprecision: %.4lf, recall: %.4lf, micro-f1: %.4lf, macro-f1: %.4lf\n' 535 | % (precision, recall, micro_f1, macro_f1)) 536 | log_file.write("\tBest checkpoint: %s\n" % best_checkpoint) 537 | log_file.write('******************************************\n') 538 | log_file.close() 539 | 540 | 541 | if __name__ == '__main__': 542 | main() 543 | 544 | 545 | 546 | 547 | -------------------------------------------------------------------------------- /data/rest16/dev.txt: -------------------------------------------------------------------------------- 1 | The duck confit is always amazing and the foie gras terrine with figs was out of this world.####The=O duck=T-POS confit=T-POS is=O always=O amazing=O and=O the=O foie=T-POS gras=T-POS terrine=T-POS with=T-POS figs=T-POS was=O out=O of=O this=O world=O .=O 2 | I/we will never go back to this place again.####I/we=O will=O never=O go=O back=O to=O this=O place=T-NEG again=O .=O 3 | Can't wait wait for my next visit.####Ca=O n't=O wait=O wait=O for=O my=O next=O visit=O .=O 4 | Anyways, if you're in the neighborhood to eat good food, I wouldn't waste my time trying to find something, rather go across the street to Tamari.####Anyways=O ,=O if=O you=O 're=O in=O the=O neighborhood=O to=O eat=O good=O food=O ,=O I=O would=O n't=O waste=O my=O time=O trying=O to=O find=O something=O ,=O rather=O go=O across=O the=O street=O to=O Tamari=O .=O 5 | We ate outside at Haru's Sake bar because Haru's restaurant next door was overflowing.####We=O ate=O outside=O at=O Haru=O 's=O Sake=O bar=O because=O Haru=O 's=O restaurant=O next=O door=O was=O overflowing=O .=O 6 | however, it's the service that leaves a bad taste in my mouth.####however=O ,=O it=O 's=O the=O service=T-NEG that=O leaves=O a=O bad=O taste=O in=O my=O mouth=O .=O 7 | the last time i walked by it looked pretty empty. 
hmmm.####the=O last=O time=O i=O walked=O by=O it=O looked=O pretty=O empty=O hmmm=O .=O 8 | I've had the lunch buffet at Chennai a couple of times, when I have been in the neighborhood.####I=O 've=O had=O the=O lunch=O buffet=O at=O Chennai=O a=O couple=O of=O times=O ,=O when=O I=O have=O been=O in=O the=O neighborhood=O .=O 9 | Price no more than a Jersey deli but way better.####Price=O no=O more=O than=O a=O Jersey=O deli=O but=O way=O better=O .=O 10 | I am so coming back here again, as much as I can.####I=O am=O so=O coming=O back=O here=O again=O ,=O as=O much=O as=O I=O can=O .=O 11 | They refuse to seat parties of 3 or more on weekends.####They=O refuse=O to=O seat=O parties=O of=O 3=O or=O more=O on=O weekends=O .=O 12 | As always we had a great glass of wine while we waited.####As=O always=O we=O had=O a=O great=O glass=T-POS of=T-POS wine=T-POS while=O we=O waited=O .=O 13 | This place is great.####This=O place=T-POS is=O great=O .=O 14 | Slightly on the pricey side but worth it!####Slightly=O on=O the=O pricey=O side=O but=O worth=O it=O !=O 15 | I have been going there since it opened and I can't get enough.####I=O have=O been=O going=O there=O since=O it=O opened=O and=O I=O ca=O n't=O get=O enough=O .=O 16 | Pizza - the only pizza in NYC that should not have additional toppings - the crust tastes like the best, freshly baked bread!####Pizza=O ,=O the=O only=O pizza=T-POS in=O NYC=O that=O should=O not=O have=O additional=O toppings=O ,=O the=O crust=T-POS tastes=O like=O the=O best=O ,=O freshly=O baked=O bread=O !=O 17 | Make sure you have the Spicy Scallop roll.. .####Make=O sure=O you=O have=O the=O Spicy=T-POS Scallop=T-POS roll=T-POS .=O 18 | Told us to sit anywhere, and when we sat he said the table was reserved.####Told=O us=O to=O sit=O anywhere=O ,=O and=O when=O we=O sat=O he=O said=O the=O table=O was=O reserved=O .=O 19 | Great service, great food.####Great=O service=T-POS ,=O great=O food=T-POS .=O 20 | Prices are in line.####Prices=O are=O in=O line=O .=O 21 | I live a block away and go to Patsy's frequently.####I=O live=O a=O block=O away=O and=O go=O to=O Patsy=O 's=O frequently=O .=O 22 | Over 100 different choices to create your own.####Over=O 100=O different=O choices=O to=O create=O your=O own=O .=O 23 | Seating is always prompt, though the restaurant does fill up in the evening.####Seating=T-POS is=O always=O prompt=O ,=O though=O the=O restaurant=O does=O fill=O up=O in=O the=O evening=O .=O 24 | I've never ordered anything else from their menu...there's no need to.####I=O 've=O never=O ordered=O anything=O else=O from=O their=O menuthere=O 's=O no=O need=O to=O .=O 25 | It's one of our favorite places to eat in NY.####It=O 's=O one=O of=O our=O favorite=O places=O to=O eat=O in=O NY=O .=O 26 | They were served warm and had a soft fluffy interior.####They=O were=O served=O warm=O and=O had=O a=O soft=O fluffy=O interior=O .=O 27 | It was wonderful.####It=O was=O wonderful=O .=O 28 | Great vibe, lots of people.####Great=O vibe=T-POS ,=O lots=O of=O people=O .=O 29 | Salads were fantastic.####Salads=T-POS were=O fantastic=O .=O 30 | Went here on a friend's reccomendation.####Went=O here=O on=O a=O friend=O 's=O reccomendation=O .=O 31 | Right off the L in Brooklyn this is a nice cozy place with good pizza.####Right=O off=O the=O L=O in=O Brooklyn=O this=O is=O a=O nice=O cozy=O place=T-POS with=O good=O pizza=T-POS .=O 32 | I started out with a Bombay beer which was big enough for two.####I=O started=O out=O with=O a=O Bombay=T-POS beer=T-POS which=O 
was=O big=O enough=O for=O two=O .=O 33 | However, I think this place is a good hang out spot.####However=O ,=O I=O think=O this=O place=T-POS is=O a=O good=O hang=O out=O spot=O .=O 34 | The tables are crammed way too close, the menu is typical of any Italian restaurant, and the wine list is simply overpriced.####The=O tables=T-NEG are=O crammed=O way=O too=O close=O ,=O the=O menu=T-NEU is=O typical=O of=O any=O Italian=O restaurant=O ,=O and=O the=O wine=T-NEG list=T-NEG is=O simply=O overpriced=O .=O 35 | Not one of our meals was edible - bland and/or made with weird rosemary or orange flavoring.####Not=O one=O of=O our=O meals=T-NEG was=O edible=O ,=O bland=O and/or=O made=O with=O weird=O rosemary=T-NEG or=T-NEG orange=T-NEG flavoring=T-NEG .=O 36 | I've never had bad service and the fish is fresh and delicious.####I=O 've=O never=O had=O bad=O service=T-POS and=O the=O fish=T-POS is=O fresh=O and=O delicious=O .=O 37 | The dining room is quietly elegant with no music to shout over -- how refreshing!####The=O dining=T-POS room=T-POS is=O quietly=O elegant=O with=O no=O music=O to=O shout=O over=O -=O how=O refreshing=O !=O 38 | Beautiful experience.####Beautiful=O experience=O .=O 39 | The portions are large and the servers always surprise us with a different starter.####The=O portions=T-POS are=O large=O and=O the=O servers=T-POS always=O surprise=O us=O with=O a=O different=O starter=O .=O 40 | The menu is very limited - i think we counted 4 or 5 entrees.####The=O menu=T-NEG is=O very=O limited=O ,=O i=O think=O we=O counted=O 4=O or=O 5=O entrees=O .=O 41 | Our family never expected such incredible entertainment in a restaurant.####Our=O family=O never=O expected=O such=O incredible=O entertainment=O in=O a=O restaurant=T-POS .=O 42 | This place must have cost the owners afortune to build.####This=O place=O must=O have=O cost=O the=O owners=O afortune=O to=O build=O .=O 43 | I think the stuff was better than Disney.####I=O think=O the=O stuff=O was=O better=O than=O Disney=O .=O 44 | I highly recommend it.####I=O highly=O recommend=O it=O .=O 45 | The place is a lot of fun.####The=O place=T-POS is=O a=O lot=O of=O fun=O .=O 46 | Growing up in NY, I have eaten my share of bagels.####Growing=O up=O in=O NY=O ,=O I=O have=O eaten=O my=O share=O of=O bagels=O .=O 47 | I thought this place was totally overrated.####I=O thought=O this=O place=T-NEG was=O totally=O overrated=O .=O 48 | I must give it Yon out of Yon stars!####I=O must=O give=O it=O Yon=O out=O of=O Yon=O stars=O !=O 49 | The service is ok, some of the people didn't get what they asked for.####The=O service=T-NEU is=O ok=O ,=O some=O of=O the=O people=O did=O n't=O get=O what=O they=O asked=O for=O .=O 50 | I was there on sat. 
for my birthday and we had an excellent time.####I=O was=O there=O on=O sat=O for=O my=O birthday=O and=O we=O had=O an=O excellent=O time=O .=O 51 | Whether it's the parmesean porcini souffle or the lamb glazed with balsamic vinegar, you will surely be transported to Northern Italy with one bite.####Whether=O it=O 's=O the=O parmesean=T-POS porcini=T-POS souffle=T-POS or=O the=O lamb=T-POS glazed=T-POS with=T-POS balsamic=T-POS vinegar=T-POS ,=O you=O will=O surely=O be=O transported=O to=O Northern=O Italy=O with=O one=O bite=O .=O 52 | Old school meets New world.####Old=O school=O meets=O New=O world=O .=O 53 | I found the food, service and value exceptional everytime I have been there.####I=O found=O the=O food=T-POS ,=O service=T-POS and=O value=O exceptional=O everytime=O I=O have=O been=O there=O .=O 54 | Great staff.####Great=O staff=T-POS .=O 55 | The hostess and the waitress were incredibly rude and did everything they could to rush us out.####The=O hostess=T-NEG and=O the=O waitress=T-NEG were=O incredibly=O rude=O and=O did=O everything=O they=O could=O to=O rush=O us=O out=O .=O 56 | Will not be back.####Will=O not=O be=O back=O .=O 57 | Try the sea bass.####Try=O the=O sea=T-POS bass=T-POS .=O 58 | The dinner was ok, nothing I would have again.####The=O dinner=T-NEG was=O ok=O ,=O nothing=O I=O would=O have=O again=O .=O 59 | They forgot a sandwich, didn't include plastic forks, and didn't include pita with the hummus platter.####They=O forgot=O a=O sandwich=O ,=O did=O n't=O include=O plastic=O forks=O ,=O and=O did=O n't=O include=O pita=O with=O the=O hummus=O platter=O .=O 60 | It's a rather cramped and busy restaurant and it closes early.####It=O 's=O a=O rather=O cramped=O and=O busy=O restaurant=T-NEG and=O it=O closes=O early=O .=O 61 | We were very pleasantly surprised.####We=O were=O very=O pleasantly=O surprised=O .=O 62 | After dinner the manager grabbed my boyfriend, asked him: Where are you from...maybe you dont know how things work in America...and in the end stormed away almost teareyed yelling that tips are the only thing they survive on.####After=O dinner=O the=O manager=T-NEG grabbed=O my=O boyfriend=O ,=O asked=O him=O ,=O Where=O are=O you=O frommaybe=O you=O do=O n't=O know=O how=O things=O work=O in=O Americaand=O in=O the=O end=O stormed=O away=O almost=O teareyed=O yelling=O that=O tips=O are=O the=O only=O thing=O they=O survive=O on=O .=O 63 | Pizza here is consistently good.####Pizza=T-POS here=O is=O consistently=O good=O .=O 64 | Service is average.####Service=T-NEU is=O average=O .=O 65 | A gentleman, maybe the manager, came to our table, and without so much as a smile or greeting asked for our order.####A=O gentleman=T-NEG ,=O maybe=O the=O manager=O ,=O came=O to=O our=O table=O ,=O and=O without=O so=O much=O as=O a=O smile=O or=O greeting=O asked=O for=O our=O order=O .=O 66 | When asked about how a certain dish was prepared in comparison to a similar at other thai restaurants, he replied this is not Mcdonald's, every place makes things differently ####When=O asked=O about=O how=O a=O certain=O dish=O was=O prepared=O in=O comparison=O to=O a=O similar=O at=O other=O thai=O restaurants=O ,=O he=O replied=O this=O is=O not=O Mcdonald=O 's=O ,=O every=O place=O makes=O things=O differently=O 67 | I would highly recommend this place!####I=O would=O highly=O recommend=O this=O place=T-POS !=O 68 | The food is great and reasonably priced.####The=O food=T-POS is=O great=O and=O reasonably=O priced=O .=O 69 | My friends and I were on vacation in NY 
and was referred to Chance by a friend.####My=O friends=O and=O I=O were=O on=O vacation=O in=O NY=O and=O was=O referred=O to=O Chance=O by=O a=O friend=O .=O 70 | I also ordered the Change Mojito, which was out of this world.####I=O also=O ordered=O the=O Change=T-POS Mojito=T-POS ,=O which=O was=O out=O of=O this=O world=O .=O 71 | The food was average or above including some surprising tasty dishes.####The=O food=T-POS was=O average=O or=O above=O including=O some=O surprising=O tasty=O dishes=T-POS .=O 72 | I would recommend Roxy's for that, but not for their food.####I=O would=O recommend=O Roxy=O 's=O for=O that=O ,=O but=O not=O for=O their=O food=T-NEG .=O 73 | And the Tom Kha soup was pathetic.####And=O the=O Tom=T-NEG Kha=T-NEG soup=T-NEG was=O pathetic=O .=O 74 | We had the scallops as an appetizer and they were delicious and the sauce was wonderful.####We=O had=O the=O scallops=T-POS as=O an=O appetizer=O and=O they=O were=O delicious=O and=O the=O sauce=T-POS was=O wonderful=O .=O 75 | I've waited over one hour for food.####I=O 've=O waited=O over=O one=O hour=O for=O food=O .=O 76 | The food looked very appetizing and delicious since it came on a variety of fancy plates.####The=O food=T-POS looked=O very=O appetizing=O and=O delicious=O since=O it=O came=O on=O a=O variety=O of=O fancy=O plates=O .=O 77 | Service here was great, food was fantastic.####Service=T-POS here=O was=O great=O ,=O food=T-POS was=O fantastic=O .=O 78 | We all agreed that mare is one of the best seafood restaurants in New York.####We=O all=O agreed=O that=O mare=T-POS is=O one=O of=O the=O best=O seafood=O restaurants=O in=O New=O York=O .=O 79 | I ordered the smoked salmon and roe appetizer and it was off flavor.####I=O ordered=O the=O smoked=T-NEG salmon=T-NEG and=T-NEG roe=T-NEG appetizer=T-NEG and=O it=O was=O off=O flavor=O .=O 80 | Delicious crab cakes too.####Delicious=O crab=T-POS cakes=T-POS too=O .=O 81 | The sandwiches are dry, tasteless and way overpriced.####The=O sandwiches=T-NEG are=O dry=O ,=O tasteless=O and=O way=O overpriced=O .=O 82 | The atmosphere is unheralded, the service impecible, and the food magnificant.####The=O atmosphere=T-POS is=O unheralded=O ,=O the=O service=T-POS impecible=O ,=O and=O the=O food=T-POS magnifica=O n't=O .=O 83 | The food is good, especially their more basic dishes, and the drinks are delicious.####The=O food=T-POS is=O good=O ,=O especially=O their=O more=O basic=T-POS dishes=T-POS ,=O and=O the=O drinks=T-POS are=O delicious=O .=O 84 | This is a great place to take out-of-towners, and perfect for watching the sunset.####This=O is=O a=O great=O place=T-POS to=O take=O out-of-towners=O ,=O and=O perfect=O for=O watching=O the=O sunset=O .=O 85 | Great sushi experience.####Great=O sushi=T-POS experience=O .=O 86 | Murray won't do it.####Murray=O wo=O n't=O do=O it=O .=O 87 | Won't or Can't is not in the service directory.####Wo=O n't=O or=O Ca=O n't=O is=O not=O in=O the=O service=O directory=O .=O 88 | Bagels are ok, but be sure not to make any special requests!####Bagels=T-NEU are=O ok=O ,=O but=O be=O sure=O not=O to=O make=O any=O special=O requests=O !=O 89 | The fish was adequate, but inexpertly sliced.####The=O fish=T-NEG was=O adequate=O ,=O but=O inexpertly=O sliced=O .=O 90 | I thought going to Jimmys would give me a real Domincan exprience.####I=O thought=O going=O to=O Jimmys=O would=O give=O me=O a=O real=O Domincan=O exprience=O .=O 91 | For authentic Thai food, look no further than Toons.####For=O authentic=O Thai=T-POS food=T-POS ,=O 
look=O no=O further=O than=O Toons=O .=O 92 | I did not try the caviar but I tried their salmon and crab salad (they are all good) ####I=O did=O not=O try=O the=O caviar=O but=O I=O tried=O their=O salmon=T-POS and=O crab=T-POS salad=T-POS they=O are=O all=O good=O 93 | The wait staff is pleasant, fun, and for the most part gorgeous (in the wonderful aesthetic beautification way, not in that she's-way-cuter-than-me-that-b@#$* way).####The=O wait=T-POS staff=T-POS is=O pleasant=O ,=O fun=O ,=O and=O for=O the=O most=O part=O gorgeous=O in=O the=O wonderful=O aesthetic=O beautification=O way=O ,=O not=O in=O that=O she's-way-cuter-than-me-that-b=O @=O #=O $=O *=O way=O .=O 94 | Its location is good and the fact that Hutner College is near and their prices are very reasonable, makes students go back to Suan again and again.####Its=O location=T-POS is=O good=O and=O the=O fact=O that=O Hutner=O College=O is=O near=O and=O their=O prices=O are=O very=O reasonable=O ,=O makes=O students=O go=O back=O to=O Suan=T-POS again=O and=O again=O .=O 95 | if you're daring, try the balsamic vinegar over icecream, it's wonderful!####if=O you=O 're=O daring=O ,=O try=O the=O balsamic=T-POS vinegar=T-POS over=T-POS icecream=T-POS ,=O it=O 's=O wonderful=O !=O 96 | After passing by this restaurant for sometime I finally decided to go in and have dinner.####After=O passing=O by=O this=O restaurant=O for=O sometime=O I=O finally=O decided=O to=O go=O in=O and=O have=O dinner=O .=O 97 | The menu consisted of standard brassiere food, better then places like Balthazar etc.####The=O menu=O consisted=O of=O standard=O brassiere=O food=O ,=O better=O then=O places=O like=O Balthazar=O etc=O .=O 98 | the pad se ew chicken was delicious, however the pad thai was far too oily.####the=O pad=T-POS se=T-POS ew=T-POS chicken=T-POS was=O delicious=O ,=O however=O the=O pad=T-NEG thai=T-NEG was=O far=O too=O oily=O .=O 99 | Service is fast and friendly.####Service=T-POS is=O fast=O and=O friendly=O .=O 100 | If celebrities make you sweat, then your in for a ride, but if your like most around these parts then you'll just yawn and wonder whats with all the hype.####If=O celebrities=O make=O you=O sweat=O ,=O then=O your=O in=O for=O a=O ride=O ,=O but=O if=O your=O like=O most=O around=O these=O parts=O then=O you=O 'll=O just=O yawn=O and=O wonder=O whats=O with=O all=O the=O hype=O .=O 101 | I've eaten at many different Indian restaurants.####I=O 've=O eaten=O at=O many=O different=O Indian=O restaurants=O .=O 102 | The appetizing is excellent - just as good as Zabars Barney Greengrass at a reasonable price (if bought by the pound).####The=O appetizing=O is=O excellent=O ,=O just=O as=O good=O as=O Zabars=O Barney=O Greengrass=O at=O a=O reasonable=O price=O if=O bought=O by=O the=O pound=O .=O 103 | Go there to relax and feel like your somewhere else.####Go=O there=O to=O relax=O and=O feel=O like=O your=O somewhere=O else=O .=O 104 | Great food, great decor, great service.####Great=O food=T-POS ,=O great=O decor=T-POS ,=O great=O service=T-POS .=O 105 | This is the perfect spot for meeting friends, having lunch, dinner, pre-theatre or after-theatre drinks!####This=O is=O the=O perfect=O spot=T-POS for=O meeting=O friends=O ,=O having=O lunch=O ,=O dinner=O ,=O pre-theatre=O or=O after-theatre=O drinks=O !=O 106 | Wonderful at holiday time.####Wonderful=O at=O holiday=O time=O .=O 107 | The porcini mushroom pasta special was tasteless, so was the seafood tagliatelle.####The=O porcini=T-NEG mushroom=T-NEG pasta=T-NEG 
special=T-NEG was=O tasteless=O ,=O so=O was=O the=O seafood=T-NEG tagliatelle=T-NEG .=O 108 | A real dissapointment.####A=O real=O dissapointment=O .=O 109 | I recently tried Suan and I thought that it was great.####I=O recently=O tried=O Suan=T-POS and=O I=O thought=O that=O it=O was=O great=O .=O 110 | Good food.####Good=O food=T-POS .=O 111 | Ravioli was good...but I have to say that I found everything a bit overpriced.####Ravioli=T-POS was=O goodbut=O I=O have=O to=O say=O that=O I=O found=O everything=O a=O bit=O overpriced=O .=O 112 | Faan is sooo good.####Faan=T-POS is=O sooo=O good=O .=O 113 | bottles of wine are cheap and good.####bottles=T-POS of=T-POS wine=T-POS are=O cheap=O and=O good=O .=O 114 | This is an amazing place to try some roti rolls.####This=O is=O an=O amazing=O place=O to=O try=O some=O roti=T-POS rolls=T-POS .=O 115 | The food's as good as ever.####The=O food=T-POS 's=O as=O good=O as=O ever=O .=O 116 | Excellent dumplings served amid clean, chic decor.####Excellent=O dumplings=T-POS served=O amid=O clean=O ,=O chic=O decor=T-POS .=O 117 | I won't go back unless someone else is footing the bill.####I=O wo=O n't=O go=O back=O unless=O someone=O else=O is=O footing=O the=O bill=O .=O 118 | The portions are small but being that the food was so good makes up for that.####The=O portions=T-NEG are=O small=O but=O being=O that=O the=O food=T-POS was=O so=O good=O makes=O up=O for=O that=O .=O 119 | Service is not what you are coming here for...####Service=T-NEG is=O not=O what=O you=O are=O coming=O here=O for=O .=O 120 | No thanks!!!####No=O thanks=O !=O !=O !=O 121 | The only problem is that the manager is a complete incompetent.####The=O only=O problem=O is=O that=O the=O manager=T-NEG is=O a=O complete=O incompetent=O .=O 122 | Hey, I think $2+ for a 5 block walk ain't bad.####Hey=O ,=O I=O think=O $=O 2+=O for=O a=O 5=O block=O walk=O ai=O n't=O bad=O .=O 123 | I plan on stopping by next week as well.####I=O plan=O on=O stopping=O by=O next=O week=O as=O well=O .=O 124 | I recommend this place to everyone.####I=O recommend=O this=O place=T-POS to=O everyone=O .=O 125 | An awesome organic dog, and a conscious eco friendly establishment.####An=O awesome=O organic=O dog=T-POS ,=O and=O a=O conscious=O eco=O friendly=O establishment=T-POS .=O 126 | I paid just about $60 for a good meal, though :)####I=O paid=O just=O about=O $=O 60=O for=O a=O good=O meal=T-POS ,=O though=O ,=O 127 | Great sake!####Great=O sake=T-POS !=O 128 | Delivery guy sometimes get upset if you don't tip more than 10%.####Delivery=T-NEG guy=T-NEG sometimes=O get=O upset=O if=O you=O do=O n't=O tip=O more=O than=O 10=O %=O .=O 129 | Creative, consistent, fresh.####Creative=O ,=O consistent=O ,=O fresh=O .=O 130 | The place is a bit hidden away, but once you get there, it's all worth it.####The=O place=T-NEU is=O a=O bit=O hidden=O away=O ,=O but=O once=O you=O get=O there=O ,=O it=O 's=O all=O worth=O it=O .=O 131 | My wife and I went to Water's Edge for a romantic dinner.####My=O wife=O and=O I=O went=O to=O Water=O 's=O Edge=O for=O a=O romantic=O dinner=O .=O 132 | lobster was good, nothing spectacular.####lobster=T-NEU was=O good=O ,=O nothing=O spectacular=O .=O 133 | I thought the restaurant was nice and clean.####I=O thought=O the=O restaurant=T-POS was=O nice=O and=O clean=O .=O 134 | Red Dragon Roll - my favorite thing to eat, of any food group - hands down####Red=T-POS Dragon=T-POS Roll=T-POS ,=O my=O favorite=O thing=O to=O eat=O ,=O of=O any=O food=O group=O ,=O hands=O down=O 135 | I 
have eaten at some of the 'best' sushi joints in NYC (Nobu, Bond Street, JewelBako, etc) and Yamato is my favorite.####I=O have=O eaten=O at=O some=O of=O the=O best=O sushi=O joints=O in=O NYC=O Nobu=O ,=O Bond=O Street=O ,=O JewelBako=O ,=O etc=O and=O Yamato=O is=O my=O favorite=O .=O 136 | The Dancing, White River and Millenium rolls are musts.####The=O Dancing,=T-POS White=T-POS River=T-POS and=T-POS Millenium=T-POS rolls=T-POS are=O musts=O .=O 137 | When I got there the place was packed but they made sure to seat me quickly.####When=O I=O got=O there=O the=O place=O was=O packed=O but=O they=O made=O sure=O to=O seat=O me=O quickly=O .=O 138 | We were offered water for the table but were not told the Voss bottles of water were $8 a piece.####We=O were=O offered=O water=O for=O the=O table=O but=O were=O not=O told=O the=O Voss=T-NEG bottles=T-NEG of=T-NEG water=T-NEG were=O $=O 8=O a=O piece=O .=O 139 | MMMMMMMMMmmmmmm so delicious####MMMMMMMMMmmmmmm=O so=O delicious=O 140 | Also, I personally wasn't a fan of the portobello and asparagus mole.####Also=O ,=O I=O personally=O was=O n't=O a=O fan=O of=O the=O portobello=T-NEG and=T-NEG asparagus=T-NEG mole=T-NEG .=O 141 | The veal was incredible last night.####The=O veal=T-POS was=O incredible=O last=O night=O .=O 142 | Skip Nathan's...you can get that at the mall...go to Bark.####Skip=O Nathan'syou=O can=O get=O that=O at=O the=O mallgo=O to=O Bark=O .=O 143 | Most of the booths allow you to sit next to eachother without looking like 'that' couple.####Most=O of=O the=O booths=T-POS allow=O you=O to=O sit=O next=O to=O eachother=O without=O looking=O like=O that=O couple=O .=O 144 | Ruth, mother of the Bride####Ruth=O ,=O mother=O of=O the=O Bride=O 145 | $170 down the toilet...####$=O 170=O down=O the=O toilet=O .=O 146 | The service was extremely fast and attentive(thanks to the service button on your table) but I barely understood 1 word when the waiter took our order.####The=O service=T-POS was=O extremely=O fast=O and=O attentivethanks=O to=O the=O service=T-POS button=O on=O your=O table=O but=O I=O barely=O understood=O 1=O word=O when=O the=O waiter=T-NEG took=O our=O order=O .=O 147 | Over all the looks of the place exceeds the actual meals.####Over=O all=O the=O looks=T-POS of=O the=O place=O exceeds=O the=O actual=O meals=T-NEG .=O 148 | Sometimes tables don't understand his sense of humor but it's refreshing to have a server who has personality, professionalism, and respects the privacy of your dinner.####Sometimes=O tables=O do=O n't=O understand=O his=O sense=O of=O humor=O but=O it=O 's=O refreshing=O to=O have=O a=O server=T-POS who=O has=O personality=O ,=O professionalism=O ,=O and=O respects=O the=O privacy=O of=O your=O dinner=O .=O 149 | peppers, onions, relish, chilli, cheeses, you NAME it.####peppers=O ,=O onions=O ,=O relish=O ,=O chilli=O ,=O cheeses=O ,=O you=O NAME=O it=O .=O 150 | Highly impressed from the decor to the food to the hospitality to the great night I had!####Highly=O impressed=O from=O the=O decor=T-POS to=O the=O food=T-POS to=O the=O hospitality=O to=O the=O great=O night=O I=O had=O !=O 151 | Great find in the West Village!####Great=O find=O in=O the=O West=O Village=O !=O 152 | The menu looked great, and the waiter was very nice, but when the food came, it was average.####The=O menu=T-POS looked=O great=O ,=O and=O the=O waiter=T-POS was=O very=O nice=O ,=O but=O when=O the=O food=T-NEU came=O ,=O it=O was=O average=O .=O 153 | The manager finally said he would comp the two glasses of wine 
(which cost less than the food), and made it seem like a big concession.####The=O manager=T-NEG finally=O said=O he=O would=O comp=O the=O two=O glasses=O of=O wine=O which=O cost=O less=O than=O the=O food=O ,=O and=O made=O it=O seem=O like=O a=O big=O concession=O .=O 154 | A fairly late entry into the haute barnyard sweepstakes, Flatbush Farm isn't in the same league as the Blue Hills or even the Farm on Adderlys of the world, but it's pretty good, albeit with a somewhat dismal setting.####A=O fairly=O late=O entry=O into=O the=O haute=O barnyard=O sweepstakes=O ,=O Flatbush=O Farm=O is=O n't=O in=O the=O same=O league=O as=O the=O Blue=O Hills=O or=O even=O the=O Farm=O on=O Adderlys=O of=O the=O world=O ,=O but=O it=O 's=O pretty=O good=O ,=O albeit=O with=O a=O somewhat=O dismal=O setting=O .=O 155 | I do not recommend.####I=O do=O not=O recommend=O .=O 156 | The food is flavorful, plentiful and reasonably priced.####The=O food=T-POS is=O flavorful=O ,=O plentiful=O and=O reasonably=O priced=O .=O 157 | Very pleased####Very=O pleased=O 158 | never swaying, never a bad meal, never bad service...####never=O swaying=O ,=O never=O a=O bad=O meal=T-POS ,=O never=O bad=O service=T-POS .=O 159 | This place has great indian chinese food.####This=O place=O has=O great=O indian=T-POS chinese=T-POS food=T-POS .=O 160 | The martinis are amazing and very fairly priced.####The=O martinis=T-POS are=O amazing=O and=O very=O fairly=O priced=O .=O 161 | Are you freaking kidding me?####Are=O you=O freaking=O kidding=O me=O ?=O 162 | Surprised that a place of this caliber would advertise it as Kobe.####Surprised=O that=O a=O place=O of=O this=O caliber=O would=O advertise=O it=O as=O Kobe=O .=O 163 | Bison was quite excellent however.####Bison=T-POS was=O quite=O excellent=O however=O .=O 164 | Terrible Waste of money.. 
scammers####Terrible=O Waste=O of=O money=O scammers=O .=O 165 | I am actually offended to have spent so much money on such a bad experience.####I=O am=O actually=O offended=O to=O have=O spent=O so=O much=O money=O on=O such=O a=O bad=O experience=O .=O 166 | Our visit their to say the least, was an unpleasant and costly experience!####Our=O visit=O their=O to=O say=O the=O least=O ,=O was=O an=O unpleasant=O and=O costly=O experience=O !=O 167 | Probably would not go back here.####Probably=O would=O not=O go=O back=O here=O .=O 168 | I don't appreciate places or people that try to drive up the bill without the patron's knowledge so that was a huge turnoff (more than the price).####I=O do=O n't=O appreciate=O places=O or=O people=O that=O try=O to=O drive=O up=O the=O bill=O without=O the=O patron=O 's=O knowledge=O so=O that=O was=O a=O huge=O turnoff=O more=O than=O the=O price=O .=O 169 | But if you're prepared to spend some $ and remember to ask if something they offer is complimentary, then this is the place to go for Indian food####But=O if=O you=O 're=O prepared=O to=O spend=O some=O $=O and=O remember=O to=O ask=O if=O something=O they=O offer=O is=O complimentary=O ,=O then=O this=O is=O the=O place=T-NEG to=O go=O for=O Indian=T-POS food=T-POS 170 | Wretched and retching####Wretched=O and=O retching=O 171 | For starters they delivered us someone else's order.####For=O starters=O they=O delivered=O us=O someone=O else=O 's=O order=O .=O 172 | However, once I received my predictably mediocre order of what Dokebi thinks passes as Korean fair, (sometimes you have to settle when it's your only option), I got through about half my kimchee before I found a piece of random lettuce accompanied by a far more disgusting, slimy, clearly bad piece of fish skin.####However=O ,=O once=O I=O received=O my=O predictably=O mediocre=O order=O of=O what=O Dokebi=O thinks=O passes=O as=O Korean=T-NEG fair=T-NEG ,=O sometimes=O you=O have=O to=O settle=O when=O it=O 's=O your=O only=O option=O ,=O I=O got=O through=O about=O half=O my=O kimchee=T-NEG before=O I=O found=O a=O piece=O of=O random=O lettuce=O accompanied=O by=O a=O far=O more=O disgusting=O ,=O slimy=O ,=O clearly=O bad=O piece=O of=O fish=O skin=O .=O 173 | Less than three minutes passed before I found myself doubled over the toilet.####Less=O than=O three=O minutes=O passed=O before=O I=O found=O myself=O doubled=O over=O the=O toilet=O .=O 174 | I book a gorgeous white organza tent which included a four course prix fix menu which we enjoyed a lot.####I=O book=O a=O gorgeous=O white=T-POS organza=T-POS tent=T-POS which=O included=O a=O four=T-POS course=T-POS prix=T-POS fix=T-POS menu=T-POS which=O we=O enjoyed=O a=O lot=O .=O 175 | The service was spectacular as the waiter knew everything about the menu and his recommendations were amazing!####The=O service=T-POS was=O spectacular=O as=O the=O waiter=T-POS knew=O everything=O about=O the=O menu=O and=O his=O recommendations=O were=O amazing=O !=O 176 | The dishes came out around 5 minutes apart.####The=O dishes=O came=O out=O around=O 5=O minutes=O apart=O .=O 177 | The side dishes were passable, and I did get a refill upon request.####The=O side=T-NEU dishes=T-NEU were=O passable=O ,=O and=O I=O did=O get=O a=O refill=O upon=O request=O .=O 178 | Authentic Korean food lovers should visit 32nd Street, of course.####Authentic=O Korean=O food=O lovers=O should=O visit=O 32nd=O Street=O ,=O of=O course=O .=O 179 | The wife had the risotto which was amazing.####The=O wife=O had=O the=O 
risotto=T-POS which=O was=O amazing=O .=O 180 | We started off with a delightful sashimi amuse bouche.####We=O started=O off=O with=O a=O delightful=O sashimi=T-POS amuse=T-POS bouche=T-POS .=O 181 | To be honest we only ever eat the Shabu Shabu.####To=O be=O honest=O we=O only=O ever=O eat=O the=O Shabu=O Shabu=O .=O 182 | In fact there is only one I've tried that even compares (shabu Tatsu) and even then I prefer Dokebi.####In=O fact=O there=O is=O only=O one=O I=O 've=O tried=O that=O even=O compares=O shabu=O Tatsu=O and=O even=O then=O I=O prefer=O Dokebi=O .=O 183 | The meat is fresh, the sauces are great, you get kimchi and a salad free with your meal and service is good too.####The=O meat=T-POS is=O fresh=O ,=O the=O sauces=T-POS are=O great=O ,=O you=O get=O kimchi=T-POS and=O a=O salad=T-POS free=O with=O your=O meal=T-POS and=O service=T-POS is=O good=O too=O .=O 184 | The hot dogs are good, yes, but the reason to get over here is the fantastic pork croquette sandwich, perfect on its supermarket squishy bun.####The=O hot=T-POS dogs=T-POS are=O good=O ,=O yes=O ,=O but=O the=O reason=O to=O get=O over=O here=O is=O the=O fantastic=O pork=T-POS croquette=T-POS sandwich=T-POS ,=O perfect=O on=O its=O supermarket=O squishy=O bun=T-POS .=O 185 | The family seafood entree was very good.####The=O family=T-POS seafood=T-POS entree=T-POS was=O very=O good=O .=O 186 | The food they serve is not comforting, not appetizing and uncooked.####The=O food=T-NEG they=O serve=O is=O not=O comforting=O ,=O not=O appetizing=O and=O uncooked=O .=O 187 | A coworker and I tried Pacifico after work a few Fridays and loved it.####A=O coworker=O and=O I=O tried=O Pacifico=T-POS after=O work=O a=O few=O Fridays=O and=O loved=O it=O .=O 188 | And how many times can you pick up the same perfectly aligned set of napkins, inspect them vapidly and plonk them down in exactly the same place instead of venturing a glance at people who are there to help you make the rent?####And=O how=O many=O times=O can=O you=O pick=O up=O the=O same=O perfectly=O aligned=O set=O of=O napkins=O ,=O inspect=O them=O vapidly=O and=O plonk=O them=O down=O in=O exactly=O the=O same=O place=O instead=O of=O venturing=O a=O glance=O at=O people=O who=O are=O there=O to=O help=O you=O make=O the=O rent=O ?=O 189 | Overall the food quality was pretty good, though I hear the salmon is much better when it hasn't sat cooling in front of the guest.####Overall=O the=O food=T-POS quality=O was=O pretty=O good=O ,=O though=O I=O hear=O the=O salmon=O is=O much=O better=O when=O it=O has=O n't=O sat=O cooling=O in=O front=O of=O the=O guest=O .=O 190 | The place has a nice fit-out, some attractive furnishings and, from what I could tell, a reasonable wine list (I was given the food menu when I asked for the carte des vins)####The=O place=O has=O a=O nice=O fit-out=T-POS ,=O some=O attractive=O furnishings=T-POS and=O ,=O from=O what=O I=O could=O tell=O ,=O a=O reasonable=O wine=T-POS list=T-POS I=O was=O given=O the=O food=O menu=O when=O I=O asked=O for=O the=O carte=O des=O vins=O 191 | Everything was going good until we got our meals.####Everything=O was=O going=O good=O until=O we=O got=O our=O meals=T-NEG .=O 192 | Sometimes you pay a lot and don't get much in return - it's manhattan, things are expensive.####Sometimes=O you=O pay=O a=O lot=O and=O do=O n't=O get=O much=O in=O return=O ,=O it=O 's=O manhattan=O ,=O things=O are=O expensive=O .=O 193 | Though it's been crowded most times I've gone here, Bark always delivers on their 
food.####Though=O it=O 's=O been=O crowded=O most=O times=O I=O 've=O gone=O here=O ,=O Bark=T-NEU always=O delivers=O on=O their=O food=T-POS .=O 194 | I'm a friendly person, so I wouldn't mind had she not been so nasty and gotten so personal. ####I=O 'm=O a=O friendly=O person=O ,=O so=O I=O would=O n't=O mind=O had=O she=O not=O been=O so=O nasty=O and=O gotten=O so=O personal=O .=O 195 | Here the hot dog is elevated to the level of a real entree with numerous variations available.####Here=O the=O hot=T-POS dog=T-POS is=O elevated=O to=O the=O level=O of=O a=O real=O entree=O with=O numerous=O variations=O available=O .=O 196 | Appetizers took nearly an hour.####Appetizers=O took=O nearly=O an=O hour=O .=O 197 | When we threatened to leave, we were offered a meager discount even though half the order was missing.####When=O we=O threatened=O to=O leave=O ,=O we=O were=O offered=O a=O meager=O discount=O even=O though=O half=O the=O order=O was=O missing=O .=O 198 | On the way out, we heard of other guests complaining about similar issues.####On=O the=O way=O out=O ,=O we=O heard=O of=O other=O guests=O complaining=O about=O similar=O issues=O .=O 199 | The design of the space is good.####The=O design=O of=O the=O space=T-POS is=O good=O .=O 200 | I couldn't ignore the fact that she reach over the plate of one of my friends, who was in mid bite, to clear the table.####I=O could=O n't=O ignore=O the=O fact=O that=O she=O reach=O over=O the=O plate=O of=O one=O of=O my=O friends=O ,=O who=O was=O in=O mid=O bite=O ,=O to=O clear=O the=O table=O .=O 201 | --------------------------------------------------------------------------------