├── .idea ├── .gitignore ├── inspectionProfiles │ ├── Project_Default.xml │ └── profiles_settings.xml ├── misc.xml ├── modules.xml ├── user-satisfaction-simulation.iml └── vcs.xml ├── README.md ├── baselines ├── driver_act.py ├── driver_sat.py ├── jddc_config.py ├── models.py ├── spearman.py ├── svm.py ├── test.py ├── train_act.py ├── train_act.sh ├── train_jddc_act.py ├── train_jddc_sat.py ├── train_sat.py └── train_sat.sh ├── dataset ├── CCPE.txt ├── JDDC-ActionList.txt ├── JDDC.txt ├── MWOZ.txt ├── README.md ├── ReDial-action.txt ├── ReDial.txt └── SGD.txt └── imgs ├── action-prediction.png └── satisfaction-prediction.png /.idea/.gitignore: -------------------------------------------------------------------------------- 1 | # Default ignored files 2 | /shelf/ 3 | /workspace.xml 4 | # Editor-based HTTP Client requests 5 | /httpRequests/ 6 | # Datasource local storage ignored files 7 | /dataSources/ 8 | /dataSources.local.xml 9 | -------------------------------------------------------------------------------- /.idea/inspectionProfiles/Project_Default.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 11 | -------------------------------------------------------------------------------- /.idea/inspectionProfiles/profiles_settings.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 6 | -------------------------------------------------------------------------------- /.idea/misc.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | -------------------------------------------------------------------------------- /.idea/modules.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | -------------------------------------------------------------------------------- /.idea/user-satisfaction-simulation.iml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | -------------------------------------------------------------------------------- /.idea/vcs.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems 2 | 3 | We annotated a dialogue data set, User Satisfaction Simulation (USS), that includes 6,800 dialogues. All user utterances in those dialogues, as well as the dialogues themselves, have been labeled based on a 5-level satisfaction scale. See [dataset](https://github.com/sunnweiwei/user-satisfaction-simulation/tree/master/dataset). 4 | 5 | These resources are developed within the following paper: 6 | 7 | *Weiwei Sun, Shuo Zhang, Krisztian Balog, Zhaochun Ren, Pengjie Ren, Zhumin Chen, Maarten de Rijke. "Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems". 
In SIGIR 2021.* [Paper link](https://arxiv.org/pdf/2105.03748) 8 | 9 | ## Data 10 | 11 | The dataset (see [dataset](https://github.com/sunnweiwei/user-satisfaction-simulation/tree/master/dataset)) is provided in TXT format, where each line contains the following fields, separated by "\t": 12 | 13 | - speaker role (USER or SYSTEM), 14 | - text, 15 | - action, 16 | - satisfaction (repeated annotations are separated by ","), 17 | - explanation text (only for JDDC at the dialogue level; repeated annotations are separated by ";") 18 | 19 | Sessions are separated by blank lines. 20 | 21 | Since the original ReDial dataset does not provide actions, we use the action annotations provided by [IARD](https://github.com/wanlingcai1997/umap_2020_IARD) and include them in *ReDial-action.txt*. 22 | 23 | The JDDC dataset provides the action of each user utterance, covering 234 categories. We compress these into 12 categories based on a manually defined classification scheme (see *JDDC-ActionList.txt*). 24 | 25 | ## Data Statistics 26 | 27 | The USS dataset is based on five benchmark task-oriented dialogue datasets: [JDDC](https://arxiv.org/abs/1911.09969), [Schema Guided Dialogue (SGD)](https://arxiv.org/abs/1909.05855), [MultiWOZ 2.1](https://arxiv.org/abs/1907.01669), [Recommendation Dialogues (ReDial)](https://arxiv.org/abs/1812.07617), and [Coached Conversational Preference Elicitation (CCPE)](https://www.aclweb.org/anthology/W19-5941.pdf). 28 | 29 | | Domain | JDDC | SGD | MultiWOZ | ReDial | CCPE | 30 | | ----------- | ------: | ------: | -------: | ------: | ------: | 31 | | Language | Chinese | English | English | English | English | 32 | | #Dialogues | 3,300 | 1,000 | 1,000 | 1,000 | 500 | 33 | | Avg. # Turns | 32.3 | 26.7 | 23.1 | 22.5 | 24.9 | 34 | | #Utterances | 54,517 | 13,833 | 12,553 | 11,806 | 6,860 | 35 | | Rating 1 | 120 | 5 | 12 | 20 | 10 | 36 | | Rating 2 | 4,820 | 769 | 725 | 720 | 1,472 | 37 | | Rating 3 | 45,005 | 11,515 | 11,141 | 9,623 | 5,315 | 38 | | Rating 4 | 4,151 | 1,494 | 669 | 1,490 | 59 | 39 | | Rating 5 | 421 | 50 | 6 | 34 | 4 | 40 | 41 | ## Baselines 42 | The code for reproducing the baselines can be found in `/baselines`. 43 | 44 | ![Performance for user satisfaction prediction. Bold face indicates the best result in terms of the corresponding metric. Underline indicates comparable results to the best one.](https://github.com/sunnweiwei/user-satisfaction-simulation/blob/master/imgs/satisfaction-prediction.png) 45 | 46 | ![Performance for user action prediction. Bold face indicates the best result in terms of the corresponding metric.
Underline indicates comparable results to the best one.](https://github.com/sunnweiwei/user-satisfaction-simulation/blob/master/imgs/action-prediction.png) 47 | 48 | ## Cite 49 | 50 | ``` 51 | @inproceedings{Sun:2021:SUS, 52 | author = {Sun, Weiwei and Zhang, Shuo and Balog, Krisztian and Ren, Zhaochun and Ren, Pengjie and Chen, Zhumin and de Rijke, Maarten}, 53 | title = {Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems}, 54 | booktitle = {Proceedings of the 44rd International ACM SIGIR Conference on Research and Development in Information Retrieval}, 55 | series = {SIGIR '21}, 56 | year = {2021}, 57 | publisher = {ACM} 58 | } 59 | ``` 60 | 61 | ## Contact 62 | 63 | If you have any questions, please contact sunnweiwei@gmail.com 64 | 65 | -------------------------------------------------------------------------------- /baselines/driver_act.py: -------------------------------------------------------------------------------- 1 | from .train_jddc_act import train as train_act 2 | import os 3 | import argparse 4 | 5 | parser = argparse.ArgumentParser() 6 | parser.add_argument('-fold', type=int) 7 | parser.add_argument('--data', type=str, default='dstc8') 8 | parser.add_argument('--model', type=str, default='HiGRU+ATTN') 9 | args = parser.parse_args() 10 | 11 | print('CUDA_VISIBLE_DEVICES', os.environ["CUDA_VISIBLE_DEVICES"]) 12 | print('train data', args.data) 13 | print('train model', args.model) 14 | print('train fold', args.fold) 15 | 16 | train_act(fold=args.fold, data_name=args.data, model_name=args.model) 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | -------------------------------------------------------------------------------- /baselines/driver_sat.py: -------------------------------------------------------------------------------- 1 | from .train_jddc_sat import train 2 | import os 3 | import argparse 4 | 5 | parser = argparse.ArgumentParser() 6 | parser.add_argument('-fold', type=int) 7 | parser.add_argument('--data', type=str, default='dstc8') 8 | parser.add_argument('--model', type=str, default='HiGRU+ATTN') 9 | args = parser.parse_args() 10 | 11 | print('CUDA_VISIBLE_DEVICES', os.environ["CUDA_VISIBLE_DEVICES"]) 12 | print('train data', args.data) 13 | print('train model', args.model) 14 | print('train fold', args.fold) 15 | 16 | train(fold=args.fold, data_name=args.data, model_name=args.model) 17 | 18 | 19 | 20 | -------------------------------------------------------------------------------- /baselines/jddc_config.py: -------------------------------------------------------------------------------- 1 | domain2actions = """配送 ['配送周期', '物流全程跟踪', '联系配送', '什么时间出库', '配送方式', '返回方式', '预约配送时间', '少商品与少配件', '拒收', '能否自提', '能否配送', '售前运费多少'时间', '发错货', '下单地址填写', '发货检查', '京东特色配送', '提前配送', '填写返件运单号', '怎么确认收货', '快递单号不正确', '自提时间', '发货时间未到不能出库', '送货上门附加手续费', '夺宝岛配送时间', '夺宝岛运费', '配送超区'] 2 | 退换 ['保修返修及退换货政策', '正常退款周期', '返修退换货处理周期', '售后运费', '申请退款', '退款到哪儿', '返修退换货拆包装', '取消退款', '在哪里查询退款', '退款异常'团购退款', '补差价'] 3 | 发票 ['发票退换修改', '查看发票', '是否提供发票', '填写发票信息', '增票相关', '电子发票', '补发票', '返修退换货发票'] 4 | 客服 ['联系客服', '联系客户', '联系商家', '联系售后', '投诉', '夺宝岛售后'] 5 | 产品咨询 ['属性咨询', '使用咨询', '商品检索', '商品价格咨询', '补货时间', '生产日期', '正品保障', '包装清单', '库存状态', '商品介绍', '外包装', '商品比较', '保修期方区别', '补发', '预约抢购', '为什么显示无货', '开箱验货', '全球购解释', '有什么颜色', '套装购买', '是否全国联保', '是什么颜色', '图片差异', '配件推荐', '发表商品咨询', '爱回收解释', '夺宝岛商品来源', '金融理财', 'DIY装机', '众筹说明', '定金解释'] 6 | 价保 ['价保申请流程', '价保条件', '价保记录查询', '无法申请价保'] 7 | 支付 ['货到付款', '支付方式', '白条还款方式', '公司转账', '在线支付', '白条分期手续费', '白条开通', '无法购买提交', '余额查询', 
'支付到账时间', '支付密码', '余额使用'库转入转出', '微信支付', '超期未还款', '网银钱包提现异常', '代客户充值', '免密支付', '充值失败', '网银钱包定义', '网银钱包开通', '充值到账异常', '多次支付退款', '微信下单', '夺宝岛支付方式'] 8 | bug ['下单后无记录', '无法加入购物车', '竞拍异常', '地址信息无法保存'] 9 | 维修 ['售后维修点查询'] 10 | 评价 ['查看评价晒单', '删除修改评价晒单', '评价晒单返券和赠品', '评价晒单异常', '评价晒单送积分京豆'] 11 | 预定 ['机票相关', '购买机票', '火车票', '酒店预订']""" 12 | -------------------------------------------------------------------------------- /baselines/models.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | import torch 3 | import torch.nn as nn 4 | import numpy as np 5 | from torch.nn.init import xavier_uniform_ 6 | import torch.nn.functional as F 7 | import copy 8 | import warnings 9 | import os 10 | import pickle 11 | 12 | 13 | def init_params(model): 14 | for name, param in model.named_parameters(): 15 | if param.data.dim() > 1: 16 | xavier_uniform_(param.data) 17 | else: 18 | pass 19 | 20 | 21 | def universal_sentence_embedding(sentences, mask, sqrt=True): 22 | sentence_sums = torch.bmm( 23 | sentences.permute(0, 2, 1), mask.float().unsqueeze(-1) 24 | ).squeeze(-1) 25 | divisor = (mask.sum(dim=1).view(-1, 1).float()) 26 | if sqrt: 27 | divisor = divisor.sqrt() 28 | sentence_sums /= divisor 29 | return sentence_sums 30 | 31 | 32 | class GRU(nn.Module): 33 | def __init__(self, **config): 34 | super().__init__() 35 | vocab_size = config.get('vocab_size') 36 | dropout = config.get('dropout', 0.4) 37 | d_model = config.get('d_model', 256) 38 | num_layers = config.get('num_layers', 1) 39 | 40 | self.embedding = nn.Embedding(vocab_size, d_model, padding_idx=0) 41 | self.embedding_dropout = nn.Dropout(dropout) 42 | self.gru = nn.GRU(d_model, d_model, num_layers=num_layers, bidirectional=True, batch_first=True) 43 | 44 | init_params(self.embedding) 45 | init_params(self.gru) 46 | 47 | self.d_model = d_model * 2 48 | 49 | def forward(self, input_ids, **kwargs): 50 | attention_mask = input_ids.ne(0).detach() 51 | E = self.embedding_dropout(self.embedding(input_ids)).transpose(0, 1) 52 | H, h1 = self.gru(E) 53 | H = H.transpose(0, 1) 54 | h = universal_sentence_embedding(H, attention_mask) 55 | return h 56 | 57 | 58 | class GRUAttention(nn.Module): 59 | def __init__(self, **config): 60 | super().__init__() 61 | vocab_size = config.get('vocab_size') 62 | dropout = config.get('dropout', 0.4) 63 | d_model = config.get('d_model', 256) 64 | num_layers = config.get('num_layers', 1) 65 | 66 | self.embedding = nn.Embedding(vocab_size, d_model, padding_idx=0) 67 | self.embedding_dropout = nn.Dropout(dropout) 68 | self.gru = nn.GRU(d_model, d_model, num_layers=num_layers, bidirectional=True, batch_first=True) 69 | 70 | self.w = nn.Linear(2 * d_model, 1) 71 | 72 | init_params(self.embedding) 73 | init_params(self.gru) 74 | 75 | self.d_model = d_model * 2 76 | 77 | def forward(self, input_ids, **kwargs): 78 | attention_mask = input_ids.ne(0).detach() 79 | E = self.embedding_dropout(self.embedding(input_ids)).transpose(0, 1) 80 | H, h1 = self.gru(E) 81 | H = H.transpose(0, 1) # bc_size, len, d_model 82 | wh = self.w(H).squeeze(2) # bc_size, len 83 | # print(wh.size()) 84 | attention = F.softmax(F.tanh(wh).masked_fill(mask=~attention_mask, value=-np.inf)).unsqueeze(1) 85 | # bc_size, 1, len 86 | 87 | presentation = torch.bmm(attention, H).squeeze(1) # bc_size, d_model 88 | return presentation 89 | 90 | 91 | class Hierarchical(nn.Module): 92 | def __init__(self, backbone, class_num): 93 | super().__init__() 94 | self.drop_out = nn.Dropout(0.4) 95 
| self.private = nn.ModuleList([copy.deepcopy(backbone) for num in class_num]) 96 | d_model = backbone.d_model 97 | 98 | self.class_num = class_num 99 | self.gru = nn.ModuleList( 100 | [nn.GRU(d_model, d_model, num_layers=1, bidirectional=False, batch_first=True) for num in class_num]) 101 | self.linear = nn.ModuleList([nn.Linear(d_model, num) for num in class_num]) 102 | for layer in self.linear: 103 | init_params(layer) 104 | for layer in self.gru: 105 | init_params(layer) 106 | 107 | def forward(self, input_ids, **kwargs): 108 | bc_size, dialog_his, utt_len = input_ids.size() 109 | 110 | input_ids = input_ids.view(-1, utt_len) 111 | attention_mask = input_ids.ne(0).detach() 112 | 113 | res = [] 114 | for private_module, gru, cls_layer in zip(self.private, self.gru, self.linear): 115 | private_out = private_module(input_ids=input_ids, attention_mask=attention_mask, **kwargs) 116 | private_out = private_out.view(bc_size, dialog_his, -1) # bc_size, dialog_his, d_model 117 | H, hidden = gru(private_out) 118 | hidden = hidden.squeeze(0) # bc_size, d_model 119 | hidden = self.drop_out(hidden) 120 | rep = hidden 121 | res.append(cls_layer(rep)) 122 | return res 123 | 124 | 125 | class HierarchicalAttention(nn.Module): 126 | def __init__(self, backbone, class_num): 127 | super().__init__() 128 | self.drop_out = nn.Dropout(0.4) 129 | self.private = nn.ModuleList([copy.deepcopy(backbone) for num in class_num]) 130 | d_model = backbone.d_model 131 | 132 | self.w = nn.ModuleList([nn.Linear(d_model, 1) for num in class_num]) 133 | 134 | self.class_num = class_num 135 | self.gru = nn.ModuleList( 136 | [nn.GRU(d_model, d_model, num_layers=1, bidirectional=False, batch_first=True) for num in class_num]) 137 | self.linear = nn.ModuleList([nn.Linear(d_model, num) for num in class_num]) 138 | for layer in self.linear: 139 | init_params(layer) 140 | for layer in self.gru: 141 | init_params(layer) 142 | for layer in self.w: 143 | init_params(layer) 144 | 145 | def forward(self, input_ids, **kwargs): 146 | bc_size, dialog_his, utt_len = input_ids.size() 147 | 148 | input_ids = input_ids.view(-1, utt_len) 149 | attention_mask = input_ids.ne(0).detach() 150 | 151 | res = [] 152 | for private_module, gru, w, cls_layer in zip(self.private, self.gru, self.w, self.linear): 153 | private_out = private_module(input_ids=input_ids, attention_mask=attention_mask, **kwargs) 154 | private_out = private_out.view(bc_size, dialog_his, -1) # bc_size, dialog_his, d_model 155 | H, hidden = gru(private_out) 156 | # H = H.transpose(0, 1) # bc_size, dialog_his, d_model 157 | wh = w(H).squeeze(2) # bc_size, dialog_his 158 | attention = F.softmax(F.tanh(wh)).unsqueeze(1) # bc_size, 1, dialog_his 159 | hidden = torch.bmm(attention, H).squeeze(1) # bc_size, d_model 160 | 161 | hidden = self.drop_out(hidden) 162 | rep = hidden 163 | res.append(cls_layer(rep)) 164 | return res 165 | 166 | 167 | class BERTBackbone(nn.Module): 168 | def __init__(self, **config): 169 | super().__init__() 170 | name = config.get('name', 'bert-base-chinese') 171 | self.layers_used = config.get('layers_used', 2) 172 | self.bert = BertModel.from_pretrained(name, output_hidden_states=True, output_attentions=True) 173 | self.d_model = 768 * self.layers_used * 2 174 | 175 | def forward(self, input_ids, attention_mask, token_type_ids=None, **kwargs): 176 | out = self.bert(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids) 177 | bert_out = out[2] 178 | out = [bert_out[-i - 1] for i in range(self.layers_used)] 179 | out = 
torch.cat(out, dim=-1) 180 | out = universal_sentence_embedding(out, attention_mask) 181 | 182 | cls = [bert_out[-i - 1].transpose(0, 1)[0] for i in range(self.layers_used)] 183 | cls = torch.cat(cls, dim=-1) 184 | out = torch.cat([cls, out], dim=-1) 185 | return out 186 | 187 | 188 | class ClassModel(nn.Module): 189 | def __init__(self, backbone, class_num): 190 | super().__init__() 191 | self.drop_out = nn.Dropout(0.4) 192 | self.private = nn.ModuleList([copy.deepcopy(backbone) for num in class_num]) 193 | d_model = backbone.d_model 194 | self.class_num = class_num 195 | self.linear = nn.ModuleList([nn.Linear(d_model, num) for num in class_num]) 196 | for layer in self.linear: 197 | torch.nn.init.normal_(layer.weight, std=0.02) 198 | 199 | def forward(self, input_ids, **kwargs): 200 | input_ids = input_ids 201 | attention_mask = input_ids.ne(0).detach() 202 | res = [] 203 | for private_module, cls_layer in zip(self.private, self.linear): 204 | private_out = private_module(input_ids=input_ids, attention_mask=attention_mask, **kwargs) 205 | rep = private_out 206 | rep = self.drop_out(rep) 207 | res.append(cls_layer(rep)) 208 | return res 209 | -------------------------------------------------------------------------------- /baselines/spearman.py: -------------------------------------------------------------------------------- 1 | """ 2 | Pearson Rho, Spearman Rho, and Kendall Tau 3 | Correlation algorithms 4 | Drew J. Nase 5 | Expects path to a file containing data series - 6 | one per line, separated by one or more spaces. 7 | """ 8 | 9 | import math 10 | import sys 11 | import string 12 | from itertools import combinations 13 | 14 | 15 | # x, y must be one-dimensional arrays of the same length 16 | 17 | # Pearson algorithm 18 | def pearson(x, y): 19 | assert len(x) == len(y) > 0 20 | q = lambda n: len(n) * sum(map(lambda i: i ** 2, n)) - (sum(n) ** 2) 21 | return (len(x) * sum(map(lambda a: a[0] * a[1], zip(x, y))) - sum(x) * sum(y)) / math.sqrt(q(x) * q(y)) 22 | 23 | 24 | # Spearman algorithm 25 | def spearman(x, y): 26 | assert len(x) == len(y) > 0 27 | q = lambda n: map(lambda val: sorted(n).index(val) + 1, n) 28 | d = sum(map(lambda x, y: (x - y) ** 2, q(x), q(y))) 29 | return 1.0 - 6.0 * d / float(len(x) * (len(y) ** 2 - 1.0)) 30 | 31 | 32 | # Kendall algorithm 33 | def kendall(x, y): 34 | assert len(x) == len(y) > 0 35 | c = 0 # concordant count 36 | d = 0 # discordant count 37 | t = 0 # tied count 38 | for (i, j) in combinations(range(len(x)), 2): 39 | s = (x[i] - x[j]) * (y[i] - y[j]) 40 | if s: 41 | c += 1 42 | d += 1 43 | if s > 0: 44 | t += 1 45 | elif s < 0: 46 | t -= 1 47 | else: 48 | if x[i] - x[j]: 49 | c += 1 50 | elif y[i] - y[j]: 51 | d += 1 52 | return t / math.sqrt(c * d) 53 | -------------------------------------------------------------------------------- /baselines/svm.py: -------------------------------------------------------------------------------- 1 | from .train_sat import load_dstc 2 | import numpy as np 3 | from collections import defaultdict 4 | from sklearn.metrics import f1_score, precision_score, recall_score 5 | import jieba 6 | import copy 7 | 8 | 9 | def get_main_score(scores): 10 | number = [0, 0, 0, 0, 0] 11 | for item in scores: 12 | number[item] += 1 13 | score = np.argmax(number) 14 | return score 15 | 16 | 17 | def load_jddc(dirname, lite=1): 18 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 19 | 20 | from jddc_config import domain2actions 21 | 22 | act2domain = {} 23 | for line in domain2actions.split('\n'): 24 | domain = 
line[:line.index('[') - 1].strip() 25 | actions = [x[1:-1] for x in line[line.index('[') + 1:-1].split(', ')] 26 | # print(domain, actions) 27 | for x in actions: 28 | act2domain[x] = domain 29 | data = [] 30 | for line in raw: 31 | if len(line) == 0: 32 | data.append([]) 33 | else: 34 | data[-1].append(line) 35 | x = [] 36 | emo = [] 37 | act = [] 38 | action_list = {'other': 0} 39 | for session in data: 40 | his_input_ids = [] 41 | for turn in session: 42 | role, text, action, score = turn.split('\t') 43 | score = score.split(',') 44 | if role == 'USER': 45 | x.append(copy.deepcopy(' '.join(his_input_ids))) 46 | emo.append(get_main_score([int(item) - 1 for item in score])) 47 | action = action.strip() 48 | if lite: 49 | action = act2domain.get(action, 'other') 50 | if action not in action_list: 51 | action_list[action] = len(action_list) 52 | act.append(action_list[action]) 53 | his_input_ids.append(' '.join(jieba.cut(text.strip()))) 54 | # his_input_ids.append(' '.join(text.strip())) 55 | 56 | action_num = len(action_list) 57 | data = [x, emo, act, action_num] 58 | return data 59 | 60 | 61 | def load_data(dirname): 62 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 63 | data = [] 64 | for line in raw: 65 | if line == '': 66 | data.append([]) 67 | else: 68 | data[-1].append(line) 69 | x = [] 70 | emo = [] 71 | act = [] 72 | action_list = {} 73 | for session in data: 74 | his_input_ids = [] 75 | for turn in session: 76 | role, text, action, score = turn.split('\t') 77 | score = score.split(',') 78 | action = action.split(',') 79 | action = action[0] 80 | if role.upper() == 'USER': 81 | x.append(copy.deepcopy(' '.join(his_input_ids))) 82 | emo.append(get_main_score([int(item) - 1 for item in score])) 83 | action = action.strip() 84 | if action not in action_list: 85 | action_list[action] = len(action_list) 86 | act.append(action_list[action]) 87 | his_input_ids.append(text.strip()) 88 | 89 | action_num = len(action_list) 90 | data = [x, emo, act, action_num] 91 | return data 92 | 93 | 94 | def train(fold=0): 95 | print('fold', fold) 96 | 97 | from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer, TfidfTransformer 98 | from sklearn.linear_model import LogisticRegression 99 | from sklearn.naive_bayes import MultinomialNB 100 | from sklearn.svm import SVC 101 | from xgboost import XGBClassifier 102 | 103 | from sklearn.metrics import cohen_kappa_score 104 | from .spearman import spearman 105 | 106 | dataset = 'MWOZ' 107 | 108 | # x, emo, act, action_num = load_jddc(f'dataset/{dataset}') 109 | x, emo, act, action_num = load_data(f'dataset/{dataset},txt') 110 | 111 | ll = int(len(x) / 10) 112 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 113 | train_act = emo[:ll * fold] + emo[ll * (fold + 1):] 114 | 115 | test_x = x[ll * fold:ll * (fold + 1)] 116 | test_act = emo[ll * fold:ll * (fold + 1)] 117 | 118 | # =================== 119 | 120 | print('build tf-idf') 121 | vectorizer = CountVectorizer() 122 | train_feature = vectorizer.fit_transform(train_x) 123 | test_feature = vectorizer.transform(test_x) 124 | 125 | model = XGBClassifier() 126 | model.fit(train_feature, train_act) 127 | prediction = model.predict(test_feature) 128 | 129 | # svm = SVC(C=1.0, kernel="linear") 130 | # svm.fit(train_feature, train_act) 131 | # prediction = svm.predict(test_feature) 132 | 133 | # lr = LogisticRegression() 134 | # lr.fit(train_feature, train_act) 135 | # prediction = lr.predict(test_feature) 136 | 137 | label = test_act 138 | 139 | recall = [[0, 0] for _ in range(5)] 
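    # The block below computes the satisfaction-prediction metrics reported in the README:
    # - recall[l] accumulates [#correct, #total] for each of the 5 rating classes; UAR is the
    #   unweighted average of the per-class recall values.
    # - cohen_kappa_score and spearman measure agreement / rank correlation between predictions and labels.
    # - bi_pred/bi_label binarize the labels (class index < 2, i.e. original ratings 1-2, counts as
    #   dissatisfied), and bi_f1 is the F1 of that dissatisfied class; max(..., 1) guards against division by zero.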
140 | for p, l in zip(prediction, label): 141 | recall[l][1] += 1 142 | recall[l][0] += int(p == l) 143 | recall_value = [item[0] / max(item[1], 1) for item in recall] 144 | print('Recall value:', recall_value) 145 | print('Recall:', recall) 146 | UAR = sum(recall_value) / len(recall_value) 147 | kappa = cohen_kappa_score(prediction, label) 148 | rho = spearman(prediction, label) 149 | 150 | bi_pred = [int(item < 2) for item in prediction] 151 | bi_label = [int(item < 2) for item in label] 152 | bi_recall = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if l == 1]) / max(bi_label.count(1), 1) 153 | bi_precision = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if p == 1]) / max(bi_pred.count(1), 1) 154 | bi_f1 = 2 * bi_recall * bi_precision / max((bi_recall + bi_precision), 1) 155 | 156 | print(UAR, kappa, rho, bi_f1) 157 | 158 | with open(f'outputs/{dataset}_emo/xgb_{fold}_0.txt', 'w', encoding='utf-8') as f: 159 | for p, l in zip(prediction, label): 160 | f.write(f'{p}, {l}\n') 161 | 162 | 163 | def train_act(fold=0): 164 | print('fold', fold) 165 | 166 | from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer, TfidfTransformer 167 | from sklearn.linear_model import LogisticRegression 168 | from sklearn.naive_bayes import MultinomialNB 169 | from sklearn.svm import SVC 170 | from xgboost import XGBClassifier 171 | 172 | from sklearn.metrics import cohen_kappa_score 173 | from .spearman import spearman 174 | 175 | dataset = 'JDDC' 176 | 177 | x, emo, act, action_num = load_jddc(f'dataset/{dataset}.txt') 178 | # x, emo, act, action_num = load_data(f'dataset/{dataset}.txt') 179 | 180 | ll = int(len(x) / 10) 181 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 182 | train_act = act[:ll * fold] + act[ll * (fold + 1):] 183 | 184 | test_x = x[ll * fold:ll * (fold + 1)] 185 | test_act = act[ll * fold:ll * (fold + 1)] 186 | 187 | print('build tf-idf') 188 | vectorizer = CountVectorizer() 189 | train_feature = vectorizer.fit_transform(train_x) 190 | test_feature = vectorizer.transform(test_x) 191 | 192 | # model = XGBClassifier() 193 | # model.fit(train_feature, train_act) 194 | # prediction = model.predict(test_feature) 195 | 196 | # svm = SVC(C=1.0, kernel="linear") 197 | # svm.fit(train_feature, train_act) 198 | # prediction = svm.predict(test_feature) 199 | 200 | lr = LogisticRegression() 201 | lr.fit(train_feature, train_act) 202 | prediction = lr.predict(test_feature) 203 | 204 | label = test_act 205 | 206 | acc = sum([int(p == l) for p, l in zip(prediction, label)]) / len(label) 207 | precision = precision_score(label, prediction, average='macro', zero_division=0) 208 | recall = recall_score(label, prediction, average='macro', zero_division=0) 209 | f1 = f1_score(label, prediction, average='macro', zero_division=0) 210 | 211 | print(acc, precision, recall, f1) 212 | 213 | with open(f'outputs/{dataset}_act/lr_{fold}_0.txt', 'w', encoding='utf-8') as f: 214 | for p, l in zip(prediction, label): 215 | f.write(f'{p}, {l}\n') 216 | -------------------------------------------------------------------------------- /baselines/test.py: -------------------------------------------------------------------------------- 1 | from sklearn.metrics import cohen_kappa_score 2 | from .spearman import spearman 3 | from sklearn.metrics import f1_score, precision_score, recall_score 4 | 5 | 6 | def test(name): 7 | best_result = [0. 
for _ in range(4)] 8 | for epoch in range(100): 9 | 10 | data = [] 11 | for fold in range(10): 12 | data.extend([line[:-1] for line in open(f'{name}_{fold}_{epoch}.txt', encoding='utf-8')]) 13 | 14 | prediction = [int(line.split(',')[0]) for line in data] 15 | label = [int(line.split(',')[1]) for line in data] 16 | recall = [[0, 0] for _ in range(5)] 17 | for p, l in zip(prediction, label): 18 | recall[l][1] += 1 19 | recall[l][0] += int(p == l) 20 | recall_value = [item[0] / max(item[1], 1) for item in recall] 21 | # print('Recall value:', recall_value) 22 | # print('Recall:', recall) 23 | UAR = sum(recall_value) / len(recall_value) 24 | kappa = cohen_kappa_score(prediction, label) 25 | rho = spearman(prediction, label) 26 | # rho = 0 27 | 28 | bi_pred = [int(item < 2) for item in prediction] 29 | bi_label = [int(item < 2) for item in label] 30 | bi_recall = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if l == 1]) / max(bi_label.count(1), 1) 31 | bi_precision = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if p == 1]) / max(bi_pred.count(1), 1) 32 | bi_f1 = 2 * bi_recall * bi_precision / max((bi_recall + bi_precision), 1) 33 | 34 | test_result = [UAR, kappa, rho, bi_f1] 35 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 36 | print(epoch, best_result, test_result) 37 | 38 | 39 | def test_act(name): 40 | best_result = [0. for _ in range(4)] 41 | for epoch in range(100): 42 | 43 | data = [] 44 | for fold in range(10): 45 | data.extend([line[:-1] for line in open(f'{name}_{fold}_{epoch}.txt', encoding='utf-8')]) 46 | 47 | prediction = [int(line.split(',')[0]) for line in data] 48 | label = [int(line.split(',')[1]) for line in data] 49 | acc = sum([int(p == l) for p, l in zip(prediction, label)]) / len(label) 50 | precision = precision_score(label, prediction, average='macro', zero_division=0) 51 | recall = recall_score(label, prediction, average='macro', zero_division=0) 52 | f1 = f1_score(label, prediction, average='macro', zero_division=0) 53 | 54 | test_result = [acc, precision, recall, f1] 55 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 56 | print(epoch, best_result, test_result) 57 | 58 | 59 | if __name__ == '__main__': 60 | 61 | test('outputs/jddc_emo/BERT') 62 | test_act('outputs/jddc_act/BERT') 63 | -------------------------------------------------------------------------------- /baselines/train_act.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | from torch.utils.data import Dataset, DataLoader 3 | from torch.nn.utils.rnn import pad_sequence 4 | import torch.nn.functional as F 5 | from tqdm import tqdm 6 | import random 7 | import json 8 | import copy 9 | import torch 10 | import warnings 11 | import numpy as np 12 | import os 13 | import pickle 14 | from sklearn.metrics import cohen_kappa_score 15 | from sklearn.metrics import f1_score, precision_score, recall_score 16 | from .spearman import spearman 17 | 18 | warnings.filterwarnings("ignore") 19 | 20 | 21 | def write_pkl(obj, filename): 22 | with open(filename, 'wb') as f: 23 | pickle.dump(obj, f) 24 | 25 | 26 | def read_pkl(filename): 27 | with open(filename, 'rb') as f: 28 | return pickle.load(f) 29 | 30 | 31 | def get_main_score(scores): 32 | number = [0, 0, 0, 0, 0] 33 | for item in scores: 34 | number[item] += 1 35 | score = np.argmax(number) 36 | return score 37 | 38 | 39 | def load_redial_act(dirname, tokenizer): 40 | print(dirname) 41 | name = 'action_data' 
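    # Expected file format (see the repository README): one turn per line with
    # role \t text \t action \t satisfaction, and sessions separated by blank lines.
    # Tokenized data is cached as f'{dirname}-{name}.pkl'; delete that file to force re-tokenization.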
42 | 43 | if os.path.exists(f'{dirname}-{name}.pkl'): 44 | return read_pkl(f'{dirname}-{name}.pkl') 45 | print('tokenized data') 46 | raw = [line[:-1] for line in open(f'{dirname}/data_action.txt', encoding='utf-8')] 47 | data = [] 48 | for line in raw: 49 | if line == '': 50 | data.append([]) 51 | else: 52 | data[-1].append(line) 53 | x = [] 54 | emo = [] 55 | act = [] 56 | action_list = {} 57 | for session in data: 58 | his_input_ids = [] 59 | for turn in session: 60 | role, text, action, score = turn.split('\t') 61 | score = score.split(',') 62 | action = action.split(',') 63 | action = action[0] 64 | if role == 'USER': 65 | x.append(copy.deepcopy(his_input_ids)) 66 | emo.append(get_main_score([int(item) - 1 for item in score])) 67 | action = action.strip() 68 | if action not in action_list: 69 | action_list[action] = len(action_list) 70 | act.append(action_list[action]) 71 | 72 | ids = tokenizer.encode(text.strip())[1:] 73 | his_input_ids.append(ids) 74 | 75 | action_num = len(action_list) 76 | data = [x, emo, act, action_num] 77 | write_pkl(data, f'{dirname}-{name}.pkl') 78 | return data 79 | 80 | 81 | def load_ccpe(dirname, tokenizer): 82 | print(dirname) 83 | name = 'hierarchical_data' 84 | 85 | if os.path.exists(f'{dirname}-{name}.pkl'): 86 | return read_pkl(f'{dirname}-{name}.pkl') 87 | print('tokenized data') 88 | raw = [line[:-1] for line in open(f'{dirname}/data.txt', encoding='utf-8')] 89 | data = [] 90 | for line in raw: 91 | if line == '': 92 | data.append([]) 93 | else: 94 | data[-1].append(line) 95 | x = [] 96 | emo = [] 97 | act1 = [] 98 | act2 = [] 99 | action_list1 = {} 100 | action_list2 = {} 101 | for session in data: 102 | his_input_ids = [] 103 | for turn in session: 104 | role, text, action, score = turn.split('\t') 105 | score = score.split(',') 106 | action = action.split(',') 107 | action = action[0] 108 | if action == '': 109 | action1, action2 = 'other', 'other' 110 | else: 111 | action1, action2 = action.split('+') 112 | if role == 'USER': 113 | x.append(copy.deepcopy(his_input_ids)) 114 | emo.append(get_main_score([int(item) - 1 for item in score])) 115 | action1 = action1.strip() 116 | if action1 not in action_list1: 117 | action_list1[action1] = len(action_list1) 118 | act1.append(action_list1[action1]) 119 | 120 | action2 = action2.strip() 121 | if action2 not in action_list2: 122 | action_list2[action2] = len(action_list2) 123 | act2.append(action_list2[action2]) 124 | ids = tokenizer.encode(text.strip())[1:] 125 | his_input_ids.append(ids) 126 | action_num1 = len(action_list1) 127 | action_num2 = len(action_list2) 128 | data = [x, emo, act1, act2, action_num1, action_num2] 129 | write_pkl(data, f'{dirname}-{name}.pkl') 130 | return data 131 | 132 | 133 | def load_dstc(dirname, tokenizer): 134 | print(dirname) 135 | name = 'hierarchical_data' 136 | 137 | if os.path.exists(f'{dirname}-{name}.pkl'): 138 | return read_pkl(f'{dirname}-{name}.pkl') 139 | 140 | print('tokenized data') 141 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 142 | data = [] 143 | for line in raw: 144 | if line == '': 145 | data.append([]) 146 | else: 147 | data[-1].append(line) 148 | x = [] 149 | emo = [] 150 | act = [] 151 | action_list = {} 152 | for session in data: 153 | his_input_ids = [] 154 | for turn in session: 155 | role, text, action, score = turn.split('\t') 156 | score = score.split(',') 157 | action = action.split(',') 158 | action = action[0] 159 | if role == 'USER': 160 | x.append(copy.deepcopy(his_input_ids)) 161 | 
emo.append(get_main_score([int(item) - 1 for item in score])) 162 | action = action.strip() 163 | if action not in action_list: 164 | action_list[action] = len(action_list) 165 | act.append(action_list[action]) 166 | 167 | ids = tokenizer.encode(text.strip())[1:] 168 | his_input_ids.append(ids) 169 | 170 | action_num = len(action_list) 171 | data = [x, emo, act, action_num] 172 | write_pkl(data, f'{dirname}-{name}.pkl') 173 | return data 174 | 175 | 176 | class HierarchicalData(Dataset): 177 | def __init__(self, x, act, dialog_used=5): 178 | self.x = x 179 | self.act = act 180 | self.dialog_used = dialog_used 181 | 182 | def __getitem__(self, index): 183 | x = [torch.tensor([101])] * (self.dialog_used - len(self.x[index])) + \ 184 | [torch.tensor([101] + item[:64]) for item in self.x[index][-self.dialog_used:]] 185 | act = self.act[index] 186 | return x, act 187 | 188 | def __len__(self): 189 | return len(self.x) 190 | 191 | 192 | class FlatData(Dataset): 193 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 194 | self.x = x 195 | self.act = act 196 | self.dialog_used = dialog_used 197 | 198 | if up_sampling: 199 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 200 | enhance_idx = enhance_idx * 10 201 | enhance_x = [x[idx] for idx in enhance_idx] 202 | enhance_act = [act[idx] for idx in enhance_idx] 203 | self.x = x + enhance_x 204 | self.act = act + enhance_act 205 | 206 | def __getitem__(self, index): 207 | seq = sum([item[:64] for item in self.x[index]], []) 208 | x = torch.tensor([101] + seq[-500:]) 209 | act = self.act[index] 210 | return x, act 211 | 212 | def __len__(self): 213 | return len(self.x) 214 | 215 | 216 | def collate_fn(data): 217 | x, act = zip(*data) 218 | bc_size = len(x) 219 | dialog_his = len(x[0]) 220 | x = [item for dialog in x for item in dialog] 221 | x = pad_sequence(x, batch_first=True, padding_value=0) 222 | x = x.view(bc_size, dialog_his, -1) 223 | 224 | return {'input_ids': x, 225 | 'act': torch.tensor(act).long() 226 | } 227 | 228 | 229 | def flat_collate_fn(data): 230 | x, act = zip(*data) 231 | x = pad_sequence(x, batch_first=True, padding_value=0) 232 | 233 | return {'input_ids': x, 234 | 'act': torch.tensor(act).long() 235 | } 236 | 237 | 238 | def train(fold=0, data_name='dstc8', model_name='HiGRU+ATTN', dialog_used=10): 239 | print('[TRAIN ACTION]') 240 | 241 | data_name = data_name.split('<>')[-1] 242 | 243 | data_name = data_name.replace('\r', '') 244 | model_name = model_name.replace('\r', '') 245 | 246 | print('dialog used', dialog_used) 247 | 248 | name = f'act_{data_name}_{model_name}_{fold}' 249 | print('TRAIN ::', name) 250 | 251 | save_path = f'outputs/{data_name}_act/{model_name}_{fold}' 252 | 253 | tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') 254 | 255 | x, emo, act, action_num = load_redial_act(f'dataset/{data_name}', tokenizer) 256 | # x, emo, act, action_num = load_dstc(f'dataset/{data_name}', tokenizer) 257 | 258 | # x, emo, act1, act2, action_num1, action_num2 = load_ccpe(f'dataset/{data_name}', tokenizer) 259 | # act = act1 260 | # action_num = action_num1 261 | 262 | print('action_num:', action_num) 263 | from .models import GRU, GRUAttention, BERTBackbone 264 | from .models import HierarchicalAttention, Hierarchical, ClassModel 265 | if model_name == 'HiGRU+ATTN': 266 | model = HierarchicalAttention(backbone=GRUAttention(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 267 | model = model.cuda() 268 | optimizer = AdamW(model.parameters(), 1e-4) 269 | batch_size = 16 270 | DataFunc 
= HierarchicalData 271 | cf = collate_fn 272 | elif model_name == 'HiGRU': 273 | model = Hierarchical(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 274 | model = model.cuda() 275 | optimizer = AdamW(model.parameters(), 1e-4) 276 | batch_size = 16 277 | DataFunc = HierarchicalData 278 | cf = collate_fn 279 | elif model_name == 'GRU': 280 | model = ClassModel(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 281 | model = model.cuda() 282 | optimizer = AdamW(model.parameters(), 1e-4) 283 | batch_size = 16 284 | DataFunc = FlatData 285 | cf = flat_collate_fn 286 | elif model_name == 'BERT': 287 | model = ClassModel(backbone=BERTBackbone(layers_used=2, name='bert-base-uncased'), class_num=[action_num]) 288 | model = model.cuda() 289 | optimizer = AdamW(model.parameters(), 2e-5) 290 | batch_size = 6 291 | DataFunc = FlatData 292 | cf = flat_collate_fn 293 | else: 294 | print('[unknown model name]') 295 | return 296 | 297 | ll = int(len(x) / 10) 298 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 299 | train_act = act[:ll * fold] + act[ll * (fold + 1):] 300 | 301 | test_x = x[ll * fold:ll * (fold + 1)] 302 | test_act = act[ll * fold:ll * (fold + 1)] 303 | 304 | print(len(train_x), len(test_x)) 305 | print() 306 | best_result = [0. for _ in range(4)] 307 | for i in range(100): 308 | print('train epoch', i, name) 309 | train_loader = DataLoader(DataFunc(train_x, train_act, dialog_used=dialog_used), batch_size=batch_size, 310 | shuffle=True, num_workers=2, collate_fn=cf) 311 | # tk0 = tqdm(train_loader, total=len(train_loader)) 312 | tk0 = train_loader 313 | act_acc = [] 314 | model.train() 315 | for j, batch in enumerate(tk0): 316 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 317 | act = batch['act'].cuda() 318 | act_loss = F.cross_entropy(act_pred, act) 319 | loss = act_loss 320 | 321 | loss.backward() 322 | optimizer.step() 323 | optimizer.zero_grad() 324 | act_acc.append((act_pred.argmax(dim=-1) == act).sum().item() / act.size(0)) 325 | 326 | # tk0.set_postfix(act_acc=round(sum(act_acc) / max(1, len(act_acc)), 4)) 327 | torch.save(model.state_dict(), f'outputs/{name}_{i}.pt') 328 | # print('test epoch', i) 329 | test_result = test(model, DataFunc(test_x, test_act, dialog_used=dialog_used), f'{save_path}_{i}.txt', cf) 330 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 331 | print(f'text_result={test_result}') 332 | print(f'best_result={best_result}') 333 | print() 334 | 335 | 336 | def test(model, test_data, save_path, cf): 337 | test_loader = DataLoader(test_data, batch_size=6, shuffle=False, num_workers=0, collate_fn=cf) 338 | # tk0 = tqdm(test_loader, total=len(test_loader)) 339 | tk0 = test_loader 340 | prediction = [] 341 | label = [] 342 | 343 | model.eval() 344 | for j, batch in enumerate(tk0): 345 | act = batch['act'].cuda() 346 | with torch.no_grad(): 347 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 348 | prediction.extend(act_pred.argmax(dim=-1).cpu().tolist()) 349 | label.extend(act.cpu().tolist()) 350 | 351 | acc = sum([int(p == l) for p, l in zip(prediction, label)]) / len(label) 352 | precision = precision_score(label, prediction, average='macro', zero_division=0) 353 | recall = recall_score(label, prediction, average='macro', zero_division=0) 354 | f1 = f1_score(label, prediction, average='macro', zero_division=0) 355 | 356 | with open(save_path, 'w', encoding='utf-8') as f: 357 | for p, l in zip(prediction, label): 358 | f.write(f'{p}, {l}\n') 359 | 360 | return acc, precision, 
recall, f1 361 | -------------------------------------------------------------------------------- /baselines/train_act.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | data='jddc' 4 | model='BERT' 5 | 6 | for i in {0..9} 7 | do 8 | echo CUDA=$[ i%4 ] log/${data}_${model}_$i.log 9 | CUDA_VISIBLE_DEVICES=$[ i%4 ] nohup python -u driver_act.py -fold=$i --data=${data} --model=${model} > log/${data}_${model}_$i.log 2>&1 & 10 | done -------------------------------------------------------------------------------- /baselines/train_jddc_act.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | from torch.utils.data import Dataset, DataLoader 3 | from torch.nn.utils.rnn import pad_sequence 4 | import torch.nn.functional as F 5 | from tqdm import tqdm 6 | import random 7 | import json 8 | import copy 9 | import torch 10 | import warnings 11 | import numpy as np 12 | import os 13 | import pickle 14 | from sklearn.metrics import cohen_kappa_score 15 | from sklearn.metrics import f1_score, precision_score, recall_score 16 | from .spearman import spearman 17 | 18 | warnings.filterwarnings("ignore") 19 | 20 | 21 | def write_pkl(obj, filename): 22 | with open(filename, 'wb') as f: 23 | pickle.dump(obj, f) 24 | 25 | 26 | def read_pkl(filename): 27 | with open(filename, 'rb') as f: 28 | return pickle.load(f) 29 | 30 | 31 | def get_main_score(scores): 32 | number = [0, 0, 0, 0, 0] 33 | for item in scores: 34 | number[item] += 1 35 | score = np.argmax(number) 36 | return score 37 | 38 | 39 | def load_jddc(dirname, tokenizer, lite=1): 40 | name = 'hierarchical_data' 41 | if lite: 42 | name = name + '_lite' 43 | if os.path.exists(f'{dirname}-{name}.pkl'): 44 | return read_pkl(f'{dirname}-{name}.pkl') 45 | print('tokenized data JDDC') 46 | 47 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 48 | 49 | from .jddc_config import domain2actions 50 | 51 | act2domain = {} 52 | for line in domain2actions.split('\n'): 53 | domain = line[:line.index('[') - 1].strip() 54 | actions = [x[1:-1] for x in line[line.index('[') + 1:-1].split(', ')] 55 | # print(domain, actions) 56 | for x in actions: 57 | act2domain[x] = domain 58 | data = [] 59 | for line in raw: 60 | if len(line) == 0: 61 | data.append([]) 62 | else: 63 | data[-1].append(line) 64 | x = [] 65 | emo = [] 66 | act = [] 67 | action_list = {'other': 0} 68 | for session in data: 69 | his_input_ids = [] 70 | for turn in session: 71 | role, text, action, score = turn.split('\t') 72 | score = score.split(',') 73 | 74 | if role == 'USER': 75 | x.append(copy.deepcopy(his_input_ids)) 76 | emo.append(get_main_score([int(item) - 1 for item in score])) 77 | action = action.strip() 78 | if lite: 79 | action = act2domain.get(action, 'other') 80 | if action not in action_list: 81 | action_list[action] = len(action_list) 82 | act.append(action_list[action]) 83 | ids = tokenizer.encode(text.strip())[1:] 84 | his_input_ids.append(ids) 85 | 86 | action_num = len(action_list) 87 | data = [x, emo, act, action_num] 88 | write_pkl(data, f'{dirname}-{name}.pkl') 89 | return data 90 | 91 | 92 | class HierarchicalData(Dataset): 93 | def __init__(self, x, act, dialog_used=5): 94 | self.x = x 95 | self.act = act 96 | self.dialog_used = dialog_used 97 | 98 | def __getitem__(self, index): 99 | x = [torch.tensor([101])] * (self.dialog_used - len(self.x[index])) + \ 100 | [torch.tensor([101] + item[:64]) for item in 
self.x[index][-self.dialog_used:]] 101 | act = self.act[index] 102 | return x, act 103 | 104 | def __len__(self): 105 | return len(self.x) 106 | 107 | 108 | class FlatData(Dataset): 109 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 110 | self.x = x 111 | self.act = act 112 | self.dialog_used = dialog_used 113 | 114 | if up_sampling: 115 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 116 | enhance_idx = enhance_idx * 10 117 | enhance_x = [x[idx] for idx in enhance_idx] 118 | enhance_act = [act[idx] for idx in enhance_idx] 119 | self.x = x + enhance_x 120 | self.act = act + enhance_act 121 | 122 | def __getitem__(self, index): 123 | seq = sum([item[:64] for item in self.x[index]], []) 124 | x = torch.tensor([101] + seq[-500:]) 125 | act = self.act[index] 126 | return x, act 127 | 128 | def __len__(self): 129 | return len(self.x) 130 | 131 | 132 | def collate_fn(data): 133 | x, act = zip(*data) 134 | bc_size = len(x) 135 | dialog_his = len(x[0]) 136 | x = [item for dialog in x for item in dialog] 137 | x = pad_sequence(x, batch_first=True, padding_value=0) 138 | x = x.view(bc_size, dialog_his, -1) 139 | 140 | return {'input_ids': x, 141 | 'act': torch.tensor(act).long() 142 | } 143 | 144 | 145 | def flat_collate_fn(data): 146 | x, act = zip(*data) 147 | x = pad_sequence(x, batch_first=True, padding_value=0) 148 | 149 | return {'input_ids': x, 150 | 'act': torch.tensor(act).long() 151 | } 152 | 153 | 154 | def train(fold=0, data_name='dstc8', model_name='HiGRU+ATTN'): 155 | print('[TRAIN ACTION] JDDC') 156 | dialog_used = 10 157 | 158 | data_name = data_name.replace('\r', '') 159 | model_name = model_name.replace('\r', '') 160 | 161 | print('dialog used', dialog_used) 162 | 163 | name = f'act_{data_name}_{model_name}_{fold}' 164 | print('TRAIN ::', name) 165 | 166 | save_path = f'outputs/{data_name}_act/{model_name}_{fold}' 167 | 168 | tokenizer = BertTokenizer.from_pretrained('bert-base-chinese') 169 | 170 | x, emo, act, action_num = load_jddc(f'dataset/{data_name}', tokenizer) 171 | 172 | print('action_num:', action_num) 173 | from .models import GRU, GRUAttention, BERTBackbone 174 | from .models import HierarchicalAttention, Hierarchical, ClassModel 175 | if model_name == 'HiGRU+ATTN': 176 | model = HierarchicalAttention(backbone=GRUAttention(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 177 | model = model.cuda() 178 | optimizer = AdamW(model.parameters(), 1e-4) 179 | batch_size = 16 180 | DataFunc = HierarchicalData 181 | cf = collate_fn 182 | elif model_name == 'HiGRU': 183 | model = Hierarchical(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 184 | model = model.cuda() 185 | optimizer = AdamW(model.parameters(), 1e-4) 186 | batch_size = 16 187 | DataFunc = HierarchicalData 188 | cf = collate_fn 189 | elif model_name == 'GRU': 190 | model = ClassModel(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 191 | model = model.cuda() 192 | optimizer = AdamW(model.parameters(), 1e-4) 193 | batch_size = 16 194 | DataFunc = FlatData 195 | cf = flat_collate_fn 196 | elif model_name == 'BERT': 197 | model = ClassModel(backbone=BERTBackbone(layers_used=2, name='bert-base-chinese'), class_num=[action_num]) 198 | model = model.cuda() 199 | optimizer = AdamW(model.parameters(), 2e-5) 200 | batch_size = 6 201 | DataFunc = FlatData 202 | cf = flat_collate_fn 203 | else: 204 | print('[unknown model name]') 205 | return 206 | 207 | ll = int(len(x) / 10) 208 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 209 | train_act 
= act[:ll * fold] + act[ll * (fold + 1):] 210 | 211 | test_x = x[ll * fold:ll * (fold + 1)] 212 | test_act = act[ll * fold:ll * (fold + 1)] 213 | 214 | print(len(train_x), len(test_x)) 215 | print() 216 | best_result = [0. for _ in range(4)] 217 | for i in range(100): 218 | print('train epoch', i, name) 219 | train_loader = DataLoader(DataFunc(train_x, train_act, dialog_used=dialog_used), batch_size=batch_size, 220 | shuffle=True, num_workers=2, collate_fn=cf) 221 | # tk0 = tqdm(train_loader, total=len(train_loader)) 222 | tk0 = train_loader 223 | act_acc = [] 224 | model.train() 225 | for j, batch in enumerate(tk0): 226 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 227 | act = batch['act'].cuda() 228 | act_loss = F.cross_entropy(act_pred, act) 229 | loss = act_loss 230 | 231 | loss.backward() 232 | optimizer.step() 233 | optimizer.zero_grad() 234 | act_acc.append((act_pred.argmax(dim=-1) == act).sum().item() / act.size(0)) 235 | 236 | # tk0.set_postfix(act_acc=round(sum(act_acc) / max(1, len(act_acc)), 4)) 237 | torch.save(model.state_dict(), f'outputs/{name}_{i}.pt') 238 | # print('test epoch', i) 239 | test_result = test(model, DataFunc(test_x, test_act, dialog_used=dialog_used), f'{save_path}_{i}.txt', cf) 240 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 241 | print(f'text_result={test_result}') 242 | print(f'best_result={best_result}') 243 | print() 244 | 245 | 246 | def test(model, test_data, save_path, cf): 247 | test_loader = DataLoader(test_data, batch_size=6, shuffle=False, num_workers=0, collate_fn=cf) 248 | # tk0 = tqdm(test_loader, total=len(test_loader)) 249 | tk0 = test_loader 250 | prediction = [] 251 | label = [] 252 | 253 | model.eval() 254 | for j, batch in enumerate(tk0): 255 | act = batch['act'].cuda() 256 | with torch.no_grad(): 257 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 258 | prediction.extend(act_pred.argmax(dim=-1).cpu().tolist()) 259 | label.extend(act.cpu().tolist()) 260 | 261 | acc = sum([int(p == l) for p, l in zip(prediction, label)]) / len(label) 262 | precision = precision_score(label, prediction, average='macro', zero_division=0) 263 | recall = recall_score(label, prediction, average='macro', zero_division=0) 264 | f1 = f1_score(label, prediction, average='macro', zero_division=0) 265 | 266 | with open(save_path, 'w', encoding='utf-8') as f: 267 | for p, l in zip(prediction, label): 268 | f.write(f'{p}, {l}\n') 269 | 270 | return acc, precision, recall, f1 271 | 272 | 273 | 274 | 275 | 276 | 277 | 278 | 279 | 280 | 281 | 282 | -------------------------------------------------------------------------------- /baselines/train_jddc_sat.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | from torch.utils.data import Dataset, DataLoader 3 | from torch.nn.utils.rnn import pad_sequence 4 | import torch.nn.functional as F 5 | from tqdm import tqdm 6 | import random 7 | import json 8 | import copy 9 | import torch 10 | import warnings 11 | import numpy as np 12 | import os 13 | import pickle 14 | from sklearn.metrics import cohen_kappa_score 15 | from .spearman import spearman 16 | 17 | warnings.filterwarnings("ignore") 18 | 19 | 20 | def write_pkl(obj, filename): 21 | with open(filename, 'wb') as f: 22 | pickle.dump(obj, f) 23 | 24 | 25 | def read_pkl(filename): 26 | with open(filename, 'rb') as f: 27 | return pickle.load(f) 28 | 29 | 30 | def get_main_score(scores): 31 | number = [0, 0, 0, 0, 0] 32 | for 
item in scores: 33 | number[item] += 1 34 | score = np.argmax(number) 35 | return score 36 | 37 | 38 | def load_jddc(dirname, tokenizer, lite=1): 39 | name = 'hierarchical_data' 40 | if lite: 41 | name = name + '_lite' 42 | if os.path.exists(f'{dirname}-{name}.pkl'): 43 | return read_pkl(f'{dirname}-{name}.pkl') 44 | print('tokenized data JDDC') 45 | 46 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 47 | 48 | from .jddc_config import domain2actions 49 | 50 | act2domain = {} 51 | for line in domain2actions.split('\n'): 52 | domain = line[:line.index('[') - 1].strip() 53 | actions = [x[1:-1] for x in line[line.index('[') + 1:-1].split(', ')] 54 | # print(domain, actions) 55 | for x in actions: 56 | act2domain[x] = domain 57 | data = [] 58 | for line in raw: 59 | if len(line) == 0: 60 | data.append([]) 61 | else: 62 | data[-1].append(line) 63 | x = [] 64 | emo = [] 65 | act = [] 66 | action_list = {'other': 0} 67 | for session in data: 68 | his_input_ids = [] 69 | for turn in session: 70 | role, text, action, score = turn.split('\t') 71 | score = score.split(',') 72 | 73 | if role == 'USER': 74 | x.append(copy.deepcopy(his_input_ids)) 75 | emo.append(get_main_score([int(item) - 1 for item in score])) 76 | action = action.strip() 77 | if lite: 78 | action = act2domain.get(action, 'other') 79 | if action not in action_list: 80 | action_list[action] = len(action_list) 81 | act.append(action_list[action]) 82 | ids = tokenizer.encode(text.strip())[1:] 83 | his_input_ids.append(ids) 84 | 85 | action_num = len(action_list) 86 | data = [x, emo, act, action_num] 87 | write_pkl(data, f'{dirname}-{name}.pkl') 88 | return data 89 | 90 | 91 | class HierarchicalData(Dataset): 92 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 93 | self.x = x 94 | self.act = act 95 | self.dialog_used = dialog_used 96 | 97 | if up_sampling: 98 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 99 | enhance_idx = enhance_idx * 10 100 | enhance_x = [x[idx] for idx in enhance_idx] 101 | enhance_act = [act[idx] for idx in enhance_idx] 102 | self.x = x + enhance_x 103 | self.act = act + enhance_act 104 | 105 | def __getitem__(self, index): 106 | x = [torch.tensor([101])] * (self.dialog_used - len(self.x[index])) + \ 107 | [torch.tensor([101] + item[:64]) for item in self.x[index][-self.dialog_used:]] 108 | act = self.act[index] 109 | return x, act 110 | 111 | def __len__(self): 112 | return len(self.x) 113 | 114 | 115 | class FlatData(Dataset): 116 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 117 | self.x = x 118 | self.act = act 119 | self.dialog_used = dialog_used 120 | 121 | if up_sampling: 122 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 123 | enhance_idx = enhance_idx * 10 124 | enhance_x = [x[idx] for idx in enhance_idx] 125 | enhance_act = [act[idx] for idx in enhance_idx] 126 | self.x = x + enhance_x 127 | self.act = act + enhance_act 128 | 129 | def __getitem__(self, index): 130 | seq = sum([item[:64] for item in self.x[index]], []) 131 | x = torch.tensor([101] + seq[-500:]) 132 | act = self.act[index] 133 | return x, act 134 | 135 | def __len__(self): 136 | return len(self.x) 137 | 138 | 139 | def collate_fn(data): 140 | x, act = zip(*data) 141 | bc_size = len(x) 142 | dialog_his = len(x[0]) 143 | x = [item for dialog in x for item in dialog] 144 | x = pad_sequence(x, batch_first=True, padding_value=0) 145 | x = x.view(bc_size, dialog_his, -1) 146 | 147 | return {'input_ids': x, 148 | 'act': torch.tensor(act).long() 149 | } 150 | 151 | 152 | 
def flat_collate_fn(data): 153 | x, act = zip(*data) 154 | x = pad_sequence(x, batch_first=True, padding_value=0) 155 | 156 | return {'input_ids': x, 157 | 'act': torch.tensor(act).long() 158 | } 159 | 160 | 161 | def train(fold=0, data_name='dstc8', model_name='HiGRU+ATTN'): 162 | print('[TRAIN] JDDC') 163 | dialog_used = 10 164 | 165 | data_name = data_name.replace('\r', '') 166 | model_name = model_name.replace('\r', '') 167 | 168 | print('dialog used', dialog_used) 169 | 170 | name = f'{data_name}_{model_name}_{fold}' 171 | print('TRAIN ::', name) 172 | 173 | save_path = f'outputs/{data_name}_emo/{model_name}_{fold}' 174 | 175 | tokenizer = BertTokenizer.from_pretrained('bert-base-chinese') 176 | 177 | x, emo, act, action_num = load_jddc(f'dataset/{data_name}', tokenizer) 178 | 179 | from .models import GRU, GRUAttention, BERTBackbone 180 | from .models import HierarchicalAttention, Hierarchical, ClassModel 181 | if model_name == 'HiGRU+ATTN': 182 | model = HierarchicalAttention(backbone=GRUAttention(vocab_size=tokenizer.vocab_size), class_num=[5]) 183 | model = model.cuda() 184 | optimizer = AdamW(model.parameters(), 1e-4) 185 | batch_size = 16 186 | DataFunc = HierarchicalData 187 | cf = collate_fn 188 | elif model_name == 'HiGRU': 189 | model = Hierarchical(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[5]) 190 | model = model.cuda() 191 | optimizer = AdamW(model.parameters(), 1e-4) 192 | batch_size = 16 193 | DataFunc = HierarchicalData 194 | cf = collate_fn 195 | elif model_name == 'GRU': 196 | model = ClassModel(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[5]) 197 | model = model.cuda() 198 | optimizer = AdamW(model.parameters(), 1e-4) 199 | batch_size = 16 200 | DataFunc = FlatData 201 | cf = flat_collate_fn 202 | elif model_name == 'BERT': 203 | model = ClassModel(backbone=BERTBackbone(layers_used=2, name='bert-base-chinese'), class_num=[5]) 204 | model = model.cuda() 205 | optimizer = AdamW(model.parameters(), 2e-5) 206 | batch_size = 6 207 | DataFunc = FlatData 208 | cf = flat_collate_fn 209 | else: 210 | print('[unknown model name]') 211 | return 212 | 213 | ll = int(len(x) / 10) 214 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 215 | train_act = emo[:ll * fold] + emo[ll * (fold + 1):] 216 | 217 | test_x = x[ll * fold:ll * (fold + 1)] 218 | test_act = emo[ll * fold:ll * (fold + 1)] 219 | 220 | print(len(train_x), len(test_x)) 221 | print() 222 | best_result = [0. 
for _ in range(4)] 223 | for i in range(100): 224 | print('train epoch', i, name) 225 | train_loader = DataLoader(DataFunc(train_x, train_act, dialog_used=dialog_used, up_sampling=True), 226 | batch_size=batch_size, shuffle=True, num_workers=2, collate_fn=cf) 227 | # tk0 = tqdm(train_loader, total=len(train_loader)) 228 | tk0 = train_loader 229 | act_acc = [] 230 | model.train() 231 | for j, batch in enumerate(tk0): 232 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 233 | act = batch['act'].cuda() 234 | act_loss = F.cross_entropy(act_pred, act) 235 | loss = act_loss 236 | 237 | loss.backward() 238 | optimizer.step() 239 | optimizer.zero_grad() 240 | act_acc.append((act_pred.argmax(dim=-1) == act).sum().item() / act.size(0)) 241 | 242 | # tk0.set_postfix(act_acc=round(sum(act_acc) / max(1, len(act_acc)), 4)) 243 | torch.save(model.state_dict(), f'outputs/{name}_{i}.pt') 244 | # print('test epoch', i) 245 | test_result = test(model, DataFunc(test_x, test_act, dialog_used=dialog_used), f'{save_path}_{i}.txt', cf) 246 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 247 | print(f'text_result={test_result}') 248 | print(f'best_result={best_result}') 249 | print() 250 | 251 | 252 | def test(model, test_data, save_path, cf): 253 | test_loader = DataLoader(test_data, batch_size=6, shuffle=False, num_workers=0, collate_fn=cf) 254 | # tk0 = tqdm(test_loader, total=len(test_loader)) 255 | tk0 = test_loader 256 | prediction = [] 257 | label = [] 258 | 259 | model.eval() 260 | for j, batch in enumerate(tk0): 261 | act = batch['act'].cuda() 262 | with torch.no_grad(): 263 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 264 | prediction.extend(act_pred.argmax(dim=-1).cpu().tolist()) 265 | label.extend(act.cpu().tolist()) 266 | 267 | recall = [[0, 0] for _ in range(5)] 268 | for p, l in zip(prediction, label): 269 | recall[l][1] += 1 270 | recall[l][0] += int(p == l) 271 | recall_value = [item[0] / max(item[1], 1) for item in recall] 272 | print('Recall value:', recall_value) 273 | print('Recall:', recall) 274 | UAR = sum(recall_value) / len(recall_value) 275 | kappa = cohen_kappa_score(prediction, label) 276 | rho = spearman(prediction, label) 277 | 278 | bi_pred = [int(item < 2) for item in prediction] 279 | bi_label = [int(item < 2) for item in label] 280 | bi_recall = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if l == 1]) / max(bi_label.count(1), 1) 281 | bi_precision = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if p == 1]) / max(bi_pred.count(1), 1) 282 | bi_f1 = 2 * bi_recall * bi_precision / max((bi_recall + bi_precision), 1) 283 | 284 | with open(save_path, 'w', encoding='utf-8') as f: 285 | for p, l in zip(prediction, label): 286 | f.write(f'{p}, {l}\n') 287 | 288 | return UAR, kappa, rho, bi_f1 289 | -------------------------------------------------------------------------------- /baselines/train_sat.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | from torch.utils.data import Dataset, DataLoader 3 | from torch.nn.utils.rnn import pad_sequence 4 | import torch.nn.functional as F 5 | import copy 6 | import torch 7 | import warnings 8 | import numpy as np 9 | import os 10 | import pickle 11 | from sklearn.metrics import cohen_kappa_score 12 | from .spearman import spearman 13 | 14 | warnings.filterwarnings("ignore") 15 | 16 | 17 | def write_pkl(obj, filename): 18 | with open(filename, 'wb') as f: 19 | pickle.dump(obj, f) 20 | 21 | 
22 | def read_pkl(filename): 23 | with open(filename, 'rb') as f: 24 | return pickle.load(f) 25 | 26 | 27 | def get_main_score(scores): 28 | number = [0, 0, 0, 0, 0] 29 | for item in scores: 30 | number[item] += 1 31 | score = np.argmax(number) 32 | return score 33 | 34 | 35 | def load_data(dirname, tokenizer): 36 | print(dirname) 37 | name = 'hierarchical_data' 38 | 39 | if os.path.exists(f'{dirname}-{name}.pkl'): 40 | return read_pkl(f'{dirname}-{name}.pkl') 41 | 42 | print('tokenized data') 43 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 44 | data = [] 45 | for line in raw: 46 | if line == '': 47 | data.append([]) 48 | else: 49 | data[-1].append(line) 50 | x = [] 51 | emo = [] 52 | act = [] 53 | action_list = {} 54 | for session in data: 55 | his_input_ids = [] 56 | for turn in session: 57 | role, text, action, score = turn.split('\t') 58 | score = score.split(',') 59 | action = action.split(',') 60 | action = action[0] 61 | if role == 'USER': 62 | x.append(copy.deepcopy(his_input_ids)) 63 | emo.append(get_main_score([int(item) - 1 for item in score])) 64 | action = action.strip() 65 | if action not in action_list: 66 | action_list[action] = len(action_list) 67 | act.append(action_list[action]) 68 | 69 | ids = tokenizer.encode(text.strip())[1:] 70 | his_input_ids.append(ids) 71 | 72 | action_num = len(action_list) 73 | data = [x, emo, act, action_num] 74 | write_pkl(data, f'{dirname}-{name}.pkl') 75 | return data 76 | 77 | 78 | class HierarchicalData(Dataset): 79 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 80 | self.x = x 81 | self.act = act 82 | self.dialog_used = dialog_used 83 | 84 | if up_sampling: 85 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 86 | enhance_idx = enhance_idx * 10 87 | enhance_x = [x[idx] for idx in enhance_idx] 88 | enhance_act = [act[idx] for idx in enhance_idx] 89 | self.x = x + enhance_x 90 | self.act = act + enhance_act 91 | 92 | def __getitem__(self, index): 93 | x = [torch.tensor([101])] * (self.dialog_used - len(self.x[index])) + \ 94 | [torch.tensor([101] + item[:64]) for item in self.x[index][-self.dialog_used:]] 95 | act = self.act[index] 96 | return x, act 97 | 98 | def __len__(self): 99 | return len(self.x) 100 | 101 | 102 | class FlatData(Dataset): 103 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 104 | self.x = x 105 | self.act = act 106 | self.dialog_used = dialog_used 107 | 108 | if up_sampling: 109 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 110 | enhance_idx = enhance_idx * 10 111 | enhance_x = [x[idx] for idx in enhance_idx] 112 | enhance_act = [act[idx] for idx in enhance_idx] 113 | self.x = x + enhance_x 114 | self.act = act + enhance_act 115 | 116 | def __getitem__(self, index): 117 | seq = sum([item[:64] for item in self.x[index]], []) 118 | x = torch.tensor([101] + seq[-500:]) 119 | act = self.act[index] 120 | return x, act 121 | 122 | def __len__(self): 123 | return len(self.x) 124 | 125 | 126 | def collate_fn(data): 127 | x, act = zip(*data) 128 | bc_size = len(x) 129 | dialog_his = len(x[0]) 130 | x = [item for dialog in x for item in dialog] 131 | x = pad_sequence(x, batch_first=True, padding_value=0) 132 | x = x.view(bc_size, dialog_his, -1) 133 | 134 | return {'input_ids': x, 135 | 'act': torch.tensor(act).long() 136 | } 137 | 138 | 139 | def flat_collate_fn(data): 140 | x, act = zip(*data) 141 | x = pad_sequence(x, batch_first=True, padding_value=0) 142 | 143 | return {'input_ids': x, 144 | 'act': torch.tensor(act).long() 145 | } 146 | 147 | 148 
| def train(fold=0, data_name='dstc8', model_name='HiGRU+ATTN', dialog_used=10): 149 | print('[TRAIN]') 150 | 151 | data_name = data_name.replace('\r', '') 152 | model_name = model_name.replace('\r', '') 153 | 154 | print('dialog used', dialog_used) 155 | 156 | name = f'{data_name}_{model_name}_{fold}' 157 | print('TRAIN ::', name) 158 | 159 | save_path = f'outputs/{data_name}_emo/{model_name}_{fold}' 160 | 161 | tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') 162 | # x, emo, act1, act2, action_num1, action_num2 = load_ccpe(f'dataset/{data_name}', tokenizer) 163 | x, emo, act, action_num = load_data(f'dataset/{data_name}.txt', tokenizer) 164 | 165 | # print(action_num) 166 | from .models import GRU, GRUAttention, BERTBackbone 167 | from .models import HierarchicalAttention, Hierarchical, ClassModel 168 | if model_name == 'HiGRU+ATTN': 169 | model = HierarchicalAttention(backbone=GRUAttention(vocab_size=tokenizer.vocab_size), class_num=[5]) 170 | model = model.cuda() 171 | optimizer = AdamW(model.parameters(), 1e-4) 172 | batch_size = 16 173 | DataFunc = HierarchicalData 174 | cf = collate_fn 175 | elif model_name == 'HiGRU': 176 | model = Hierarchical(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[5]) 177 | model = model.cuda() 178 | optimizer = AdamW(model.parameters(), 1e-4) 179 | batch_size = 16 180 | DataFunc = HierarchicalData 181 | cf = collate_fn 182 | elif model_name == 'GRU': 183 | model = ClassModel(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[5]) 184 | model = model.cuda() 185 | optimizer = AdamW(model.parameters(), 1e-4) 186 | batch_size = 16 187 | DataFunc = FlatData 188 | cf = flat_collate_fn 189 | elif model_name == 'BERT': 190 | model = ClassModel(backbone=BERTBackbone(layers_used=2, name='bert-base-uncased'), class_num=[5]) 191 | model = model.cuda() 192 | optimizer = AdamW(model.parameters(), 2e-5) 193 | batch_size = 6 194 | DataFunc = FlatData 195 | cf = flat_collate_fn 196 | else: 197 | print('[unknown model name]') 198 | return 199 | 200 | ll = int(len(x) / 10) 201 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 202 | train_act = emo[:ll * fold] + emo[ll * (fold + 1):] 203 | 204 | test_x = x[ll * fold:ll * (fold + 1)] 205 | test_act = emo[ll * fold:ll * (fold + 1)] 206 | 207 | print(len(train_x), len(test_x)) 208 | print() 209 | best_result = [0. 
for _ in range(4)] 210 | for i in range(100): 211 | print('train epoch', i, name) 212 | train_loader = DataLoader(DataFunc(train_x, train_act, dialog_used=dialog_used, up_sampling=True), 213 | batch_size=batch_size, shuffle=True, num_workers=2, collate_fn=cf) 214 | # tk0 = tqdm(train_loader, total=len(train_loader)) 215 | tk0 = train_loader 216 | act_acc = [] 217 | model.train() 218 | for j, batch in enumerate(tk0): 219 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 220 | act = batch['act'].cuda() 221 | act_loss = F.cross_entropy(act_pred, act) 222 | loss = act_loss 223 | 224 | loss.backward() 225 | optimizer.step() 226 | optimizer.zero_grad() 227 | act_acc.append((act_pred.argmax(dim=-1) == act).sum().item() / act.size(0)) 228 | 229 | # tk0.set_postfix(act_acc=round(sum(act_acc) / max(1, len(act_acc)), 4)) 230 | torch.save(model.state_dict(), f'outputs/{name}_{i}.pt') 231 | # print('test epoch', i) 232 | test_result = test(model, DataFunc(test_x, test_act, dialog_used=dialog_used), f'{save_path}_{i}.txt', cf) 233 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 234 | print(f'text_result={test_result}') 235 | print(f'best_result={best_result}') 236 | print() 237 | 238 | 239 | def test(model, test_data, save_path, cf): 240 | test_loader = DataLoader(test_data, batch_size=6, shuffle=False, num_workers=0, collate_fn=cf) 241 | # tk0 = tqdm(test_loader, total=len(test_loader)) 242 | tk0 = test_loader 243 | prediction = [] 244 | label = [] 245 | 246 | model.eval() 247 | for j, batch in enumerate(tk0): 248 | act = batch['act'].cuda() 249 | with torch.no_grad(): 250 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 251 | prediction.extend(act_pred.argmax(dim=-1).cpu().tolist()) 252 | label.extend(act.cpu().tolist()) 253 | 254 | recall = [[0, 0] for _ in range(5)] 255 | for p, l in zip(prediction, label): 256 | recall[l][1] += 1 257 | recall[l][0] += int(p == l) 258 | recall_value = [item[0] / max(item[1], 1) for item in recall] 259 | print('Recall value:', recall_value) 260 | print('Recall:', recall) 261 | UAR = sum(recall_value) / len(recall_value) 262 | kappa = cohen_kappa_score(prediction, label) 263 | rho = spearman(prediction, label) 264 | 265 | bi_pred = [int(item < 2) for item in prediction] 266 | bi_label = [int(item < 2) for item in label] 267 | bi_recall = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if l == 1]) / max(bi_label.count(1), 1) 268 | bi_precision = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if p == 1]) / max(bi_pred.count(1), 1) 269 | bi_f1 = 2 * bi_recall * bi_precision / max((bi_recall + bi_precision), 1) 270 | 271 | with open(save_path, 'w', encoding='utf-8') as f: 272 | for p, l in zip(prediction, label): 273 | f.write(f'{p}, {l}\n') 274 | 275 | return UAR, kappa, rho, bi_f1 276 | -------------------------------------------------------------------------------- /baselines/train_sat.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | data='jddc' 4 | model='BERT' 5 | 6 | for i in {0..9} 7 | do 8 | echo CUDA=$[ i%4 ] log/${data}_${model}_$i.log 9 | CUDA_VISIBLE_DEVICES=$[ i%4 ] nohup python -u driver_sat.py -fold=$i --data=${data} --model=${model} > log/${data}_${model}_$i.log 2>&1 & 10 | done 11 | -------------------------------------------------------------------------------- /dataset/JDDC-ActionList.txt: -------------------------------------------------------------------------------- 1 | 配送 ['配送周期', '物流全程跟踪', '联系配送', '什么时间出库', '配送方式', '返回方式', '预约配送时间', 
'少商品与少配件', '拒收', '能否自提', '能否配送', '售前运费多少'时间', '发错货', '下单地址填写', '发货检查', '京东特色配送', '提前配送', '填写返件运单号', '怎么确认收货', '快递单号不正确', '自提时间', '发货时间未到不能出库', '送货上门附加手续费', '夺宝岛配送时间', '夺宝岛运费', '配送超区'] 2 | 退换 ['保修返修及退换货政策', '正常退款周期', '返修退换货处理周期', '售后运费', '申请退款', '退款到哪儿', '返修退换货拆包装', '取消退款', '在哪里查询退款', '退款异常'团购退款', '补差价'] 3 | 发票 ['发票退换修改', '查看发票', '是否提供发票', '填写发票信息', '增票相关', '电子发票', '补发票', '返修退换货发票'] 4 | 客服 ['联系客服', '联系客户', '联系商家', '联系售后', '投诉', '夺宝岛售后'] 5 | 产品咨询 ['属性咨询', '使用咨询', '商品检索', '商品价格咨询', '补货时间', '生产日期', '正品保障', '包装清单', '库存状态', '商品介绍', '外包装', '商品比较', '保修期方区别', '补发', '预约抢购', '为什么显示无货', '开箱验货', '全球购解释', '有什么颜色', '套装购买', '是否全国联保', '是什么颜色', '图片差异', '配件推荐', '发表商品咨询', '爱回收解释', '夺宝岛商品来源', '金融理财', 'DIY装机', '众筹说明', '定金解释'] 6 | 价保 ['价保申请流程', '价保条件', '价保记录查询', '无法申请价保'] 7 | 支付 ['货到付款', '支付方式', '白条还款方式', '公司转账', '在线支付', '白条分期手续费', '白条开通', '无法购买提交', '余额查询', '支付到账时间', '支付密码', '余额使用'库转入转出', '微信支付', '超期未还款', '网银钱包提现异常', '代客户充值', '免密支付', '充值失败', '网银钱包定义', '网银钱包开通', '充值到账异常', '多次支付退款', '微信下单', '夺宝岛支付方式'] 8 | bug ['下单后无记录', '无法加入购物车', '竞拍异常', '地址信息无法保存'] 9 | 维修 ['售后维修点查询'] 10 | 评价 ['查看评价晒单', '删除修改评价晒单', '评价晒单返券和赠品', '评价晒单异常', '评价晒单送积分京豆'] 11 | 预定 ['机票相关', '购买机票', '火车票', '酒店预订'] 12 | 13 | -------------------------------------------------------------------------------- /dataset/README.md: -------------------------------------------------------------------------------- 1 | ## User Satisfaction Simulation 2 | 3 | In *JDDC.txt*, *SGD.txt*, *MWOZ.txt*, *ReDial.txt*, and *CCPE.txt*, each line is separated by "\t": 4 | 5 | - speaker role (USER or SYSTEM), 6 | - text, 7 | - action, 8 | - satisfaction (repeated annotation are separated by ","), 9 | - explanation text (only for JDDC at dialogue level, and repeated annotation are separated by ";") 10 | 11 | And sessions are separated by blank lines. 12 | 13 | Since the original dataset does not provide actions, we use the action annotation provided by [IARD](https://github.com/wanlingcai1997/umap_2020_IARD) and included it in *ReDial-action.txt*. 14 | 15 | The JDDC data set provides the action of each user utterances, including 234 categories. We compress them into 12 categories based on a manually defined classification method (see *JDDC-ActionList.txt*). 16 | 17 | 18 | 19 | Example from SGD 20 | 21 | 22 | ``` 23 | Role \t Text \t Action \t Sat1,Sat2,Sat3 24 | 25 | USER I would like to find some Oneway Flights for my upcoming trip. INFORM_INTENT 2,3,3 26 | SYSTEM Sure, Where are planning to make a trip, please mention the destination and departure points? When do you plan to leave? REQUEST 27 | USER I am leaving form Washington to Mexico city on the 10th. INFORM 3,3,3 28 | SYSTEM There is search results for your requirement, American Airlines outbound flight is leaves at 10:15 am and it has 1 stop. The price of the ticket is $243. OFFER 29 | ``` 30 | 31 | Example from JDDC 32 | 33 | ``` 34 | Role \t Text \t Action \t Sat1,Sat2,Sat3 \t Exp1;Exp2;Exp3(only for dialogue-level) 35 | 36 | SYSTEM 你是商品有问题要换货吗 37 | USER 人家都说可以换只是问我要自己去换还是要上门来给我换 保修返修及退换货政策 2,2,3 38 | USER 我只是要问售后点在哪里而已 保修返修及退换货政策 1,1,3 39 | USER 你不懂就找个懂的过来 保修返修及退换货政策 1,1,1 40 | USER 我没时间浪费 保修返修及退换货政策 1,1,3 41 | SYSTEM 好吧 42 | SYSTEM 很抱歉 43 | SYSTEM 我问过了您只能通过网上处理 44 | SYSTEM 收货点我们真的查不到 45 | SYSTEM 售后点我们真的查不到 46 | USER 无语,查不到就说查不到还要我处理什么 OTHER 2,1,2 47 | SYSTEM 抱歉 48 | SYSTEM 请问还有其他还可以帮到您的吗? 
49 | SYSTEM 抱歉了祝您生活愉快再见 50 | SYSTEM 您要是实在找不到的话,就去查地图吧,这是我最后可以做的了 51 | USER OVERALL 1,1,1 system不能为用户解决问题 不能理解用户的意思;system完全没有解决用户的问题,也没有理解客户意图,导致客户体验很差;system未能理解用户的需求,用户体验差 52 | ``` -------------------------------------------------------------------------------- /imgs/action-prediction.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunnweiwei/user-satisfaction-simulation/8ba06bd92570fd94d3ec0cb828aad45172be71b2/imgs/action-prediction.png -------------------------------------------------------------------------------- /imgs/satisfaction-prediction.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunnweiwei/user-satisfaction-simulation/8ba06bd92570fd94d3ec0cb828aad45172be71b2/imgs/satisfaction-prediction.png --------------------------------------------------------------------------------
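
Below is a minimal loading sketch (not part of the released code) for the TXT format described in the dataset README above: it splits a file into sessions on blank lines, reads the tab-separated fields, and takes a majority vote over the repeated satisfaction labels. The function names and file path are illustrative assumptions, and the JDDC dialogue-level OVERALL lines (which carry an explanation) may need extra handling that this sketch glosses over.

```
def load_sessions(path):
    """Read one USS file (e.g. dataset/SGD.txt) into a list of sessions."""
    sessions, current = [], []
    for line in open(path, encoding='utf-8'):
        line = line.rstrip('\r\n')
        if not line:                      # blank line ends a session
            if current:
                sessions.append(current)
                current = []
            continue
        # pad to five fields: role, text, action, satisfaction, explanation
        role, text, action, sat, expl = (line.split('\t') + [''] * 5)[:5]
        current.append({
            'role': role,                 # USER or SYSTEM
            'text': text,
            'action': action,
            # repeated annotations are comma-separated, e.g. "2,3,3"
            'satisfaction': [int(s) for s in sat.split(',') if s.strip().isdigit()],
            'explanation': expl,          # JDDC dialogue level only
        })
    if current:
        sessions.append(current)
    return sessions


def majority_satisfaction(scores):
    # majority vote over the repeated 1-5 labels; ties resolve to the lower
    # rating, mirroring the argmax-over-counts used in the baselines
    counts = [0] * 5
    for s in scores:
        counts[s - 1] += 1
    return counts.index(max(counts)) + 1


sessions = load_sessions('dataset/SGD.txt')
rated = [t for s in sessions for t in s if t['role'] == 'USER' and t['satisfaction']]
print(len(sessions), 'sessions;', len(rated), 'rated user turns')
print('majority label of the first rated turn:', majority_satisfaction(rated[0]['satisfaction']))
```

The tie-breaking in `majority_satisfaction` (lower rating wins) is chosen to mirror the `get_main_score` helper in the baseline scripts, which applies `np.argmax` to the per-rating counts.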