├── .idea ├── .gitignore ├── inspectionProfiles │ ├── Project_Default.xml │ └── profiles_settings.xml ├── misc.xml ├── modules.xml ├── user-satisfaction-simulation.iml └── vcs.xml ├── README.md ├── baselines ├── driver_act.py ├── driver_sat.py ├── jddc_config.py ├── models.py ├── spearman.py ├── svm.py ├── test.py ├── train_act.py ├── train_act.sh ├── train_jddc_act.py ├── train_jddc_sat.py ├── train_sat.py └── train_sat.sh ├── dataset ├── CCPE.txt ├── JDDC-ActionList.txt ├── JDDC.txt ├── MWOZ.txt ├── README.md ├── ReDial-action.txt ├── ReDial.txt └── SGD.txt └── imgs ├── action-prediction.png └── satisfaction-prediction.png /.idea/.gitignore: -------------------------------------------------------------------------------- 1 | # Default ignored files 2 | /shelf/ 3 | /workspace.xml 4 | # Editor-based HTTP Client requests 5 | /httpRequests/ 6 | # Datasource local storage ignored files 7 | /dataSources/ 8 | /dataSources.local.xml 9 | -------------------------------------------------------------------------------- /.idea/inspectionProfiles/Project_Default.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 11 | -------------------------------------------------------------------------------- /.idea/inspectionProfiles/profiles_settings.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 6 | -------------------------------------------------------------------------------- /.idea/misc.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | -------------------------------------------------------------------------------- /.idea/modules.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | -------------------------------------------------------------------------------- /.idea/user-satisfaction-simulation.iml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | -------------------------------------------------------------------------------- /.idea/vcs.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems 2 | 3 | We annotated a dialogue data set, User Satisfaction Simulation (USS), that includes 6,800 dialogues. All user utterances in those dialogues, as well as the dialogues themselves, have been labeled based on a 5-level satisfaction scale. See [dataset](https://github.com/sunnweiwei/user-satisfaction-simulation/tree/master/dataset). 4 | 5 | These resources are developed within the following paper: 6 | 7 | *Weiwei Sun, Shuo Zhang, Krisztian Balog, Zhaochun Ren, Pengjie Ren, Zhumin Chen, Maarten de Rijke. "Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems". 
In SIGIR 2021.* [Paper link](https://arxiv.org/pdf/2105.03748) 8 | 9 | ## Data 10 | 11 | The dataset (see [dataset](https://github.com/sunnweiwei/user-satisfaction-simulation/tree/master/dataset)) is provided in TXT format, where each line contains the following fields, separated by "\t": 12 | 13 | - speaker role (USER or SYSTEM), 14 | - text, 15 | - action, 16 | - satisfaction (repeated annotations are separated by ","), 17 | - explanation text (only for JDDC at the dialogue level; repeated annotations are separated by ";") 18 | 19 | Sessions are separated by blank lines. 20 | 21 | Since the original ReDial dataset does not provide actions, we use the action annotations provided by [IARD](https://github.com/wanlingcai1997/umap_2020_IARD) and include them in *ReDial-action.txt*. 22 | 23 | The JDDC dataset provides the action of each user utterance, covering 234 categories. We compress these into 12 categories based on a manually defined classification scheme (see *JDDC-ActionList.txt*). 24 | 25 | ## Data Statistics 26 | 27 | The USS dataset is based on five benchmark task-oriented dialogue datasets: [JDDC](https://arxiv.org/abs/1911.09969), [Schema Guided Dialogue (SGD)](https://arxiv.org/abs/1909.05855), [MultiWOZ 2.1](https://arxiv.org/abs/1907.01669), [Recommendation Dialogues (ReDial)](https://arxiv.org/abs/1812.07617), and [Coached Conversational Preference Elicitation (CCPE)](https://www.aclweb.org/anthology/W19-5941.pdf). 28 | 29 | | Domain | JDDC | SGD | MultiWOZ | ReDial | CCPE | 30 | | ----------- | ------: | ------: | -------: | ------: | ------: | 31 | | Language | Chinese | English | English | English | English | 32 | | #Dialogues | 3,300 | 1,000 | 1,000 | 1,000 | 500 | 33 | | Avg. # Turns | 32.3 | 26.7 | 23.1 | 22.5 | 24.9 | 34 | | #Utterances | 54,517 | 13,833 | 12,553 | 11,806 | 6,860 | 35 | | Rating 1 | 120 | 5 | 12 | 20 | 10 | 36 | | Rating 2 | 4,820 | 769 | 725 | 720 | 1,472 | 37 | | Rating 3 | 45,005 | 11,515 | 11,141 | 9,623 | 5,315 | 38 | | Rating 4 | 4,151 | 1,494 | 669 | 1,490 | 59 | 39 | | Rating 5 | 421 | 50 | 6 | 34 | 4 | 40 | 41 | ## Baselines 42 | The code for reproducing the baselines can be found in `/baselines`. 43 | 44 | ![Performance for user satisfaction prediction. Bold face indicates the best result in terms of the corresponding metric. Underline indicates comparable results to the best one.](https://github.com/sunnweiwei/user-satisfaction-simulation/blob/master/imgs/satisfaction-prediction.png) 45 | 46 | ![Performance for user action prediction. Bold face indicates the best result in terms of the corresponding metric.
Underline indicates comparable results to the best one.](https://github.com/sunnweiwei/user-satisfaction-simulation/blob/master/imgs/action-prediction.png) 47 | 48 | ## Cite 49 | 50 | ``` 51 | @inproceedings{Sun:2021:SUS, 52 | author = {Sun, Weiwei and Zhang, Shuo and Balog, Krisztian and Ren, Zhaochun and Ren, Pengjie and Chen, Zhumin and de Rijke, Maarten}, 53 | title = {Simulating User Satisfaction for the Evaluation of Task-oriented Dialogue Systems}, 54 | booktitle = {Proceedings of the 44rd International ACM SIGIR Conference on Research and Development in Information Retrieval}, 55 | series = {SIGIR '21}, 56 | year = {2021}, 57 | publisher = {ACM} 58 | } 59 | ``` 60 | 61 | ## Contact 62 | 63 | If you have any questions, please contact sunnweiwei@gmail.com 64 | 65 | -------------------------------------------------------------------------------- /baselines/driver_act.py: -------------------------------------------------------------------------------- 1 | from .train_jddc_act import train as train_act 2 | import os 3 | import argparse 4 | 5 | parser = argparse.ArgumentParser() 6 | parser.add_argument('-fold', type=int) 7 | parser.add_argument('--data', type=str, default='dstc8') 8 | parser.add_argument('--model', type=str, default='HiGRU+ATTN') 9 | args = parser.parse_args() 10 | 11 | print('CUDA_VISIBLE_DEVICES', os.environ["CUDA_VISIBLE_DEVICES"]) 12 | print('train data', args.data) 13 | print('train model', args.model) 14 | print('train fold', args.fold) 15 | 16 | train_act(fold=args.fold, data_name=args.data, model_name=args.model) 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | -------------------------------------------------------------------------------- /baselines/driver_sat.py: -------------------------------------------------------------------------------- 1 | from .train_jddc_sat import train 2 | import os 3 | import argparse 4 | 5 | parser = argparse.ArgumentParser() 6 | parser.add_argument('-fold', type=int) 7 | parser.add_argument('--data', type=str, default='dstc8') 8 | parser.add_argument('--model', type=str, default='HiGRU+ATTN') 9 | args = parser.parse_args() 10 | 11 | print('CUDA_VISIBLE_DEVICES', os.environ["CUDA_VISIBLE_DEVICES"]) 12 | print('train data', args.data) 13 | print('train model', args.model) 14 | print('train fold', args.fold) 15 | 16 | train(fold=args.fold, data_name=args.data, model_name=args.model) 17 | 18 | 19 | 20 | -------------------------------------------------------------------------------- /baselines/jddc_config.py: -------------------------------------------------------------------------------- 1 | domain2actions = """配送 ['配送周期', '物流全程跟踪', '联系配送', '什么时间出库', '配送方式', '返回方式', '预约配送时间', '少商品与少配件', '拒收', '能否自提', '能否配送', '售前运费多少'时间', '发错货', '下单地址填写', '发货检查', '京东特色配送', '提前配送', '填写返件运单号', '怎么确认收货', '快递单号不正确', '自提时间', '发货时间未到不能出库', '送货上门附加手续费', '夺宝岛配送时间', '夺宝岛运费', '配送超区'] 2 | 退换 ['保修返修及退换货政策', '正常退款周期', '返修退换货处理周期', '售后运费', '申请退款', '退款到哪儿', '返修退换货拆包装', '取消退款', '在哪里查询退款', '退款异常'团购退款', '补差价'] 3 | 发票 ['发票退换修改', '查看发票', '是否提供发票', '填写发票信息', '增票相关', '电子发票', '补发票', '返修退换货发票'] 4 | 客服 ['联系客服', '联系客户', '联系商家', '联系售后', '投诉', '夺宝岛售后'] 5 | 产品咨询 ['属性咨询', '使用咨询', '商品检索', '商品价格咨询', '补货时间', '生产日期', '正品保障', '包装清单', '库存状态', '商品介绍', '外包装', '商品比较', '保修期方区别', '补发', '预约抢购', '为什么显示无货', '开箱验货', '全球购解释', '有什么颜色', '套装购买', '是否全国联保', '是什么颜色', '图片差异', '配件推荐', '发表商品咨询', '爱回收解释', '夺宝岛商品来源', '金融理财', 'DIY装机', '众筹说明', '定金解释'] 6 | 价保 ['价保申请流程', '价保条件', '价保记录查询', '无法申请价保'] 7 | 支付 ['货到付款', '支付方式', '白条还款方式', '公司转账', '在线支付', '白条分期手续费', '白条开通', '无法购买提交', '余额查询', 
'支付到账时间', '支付密码', '余额使用'库转入转出', '微信支付', '超期未还款', '网银钱包提现异常', '代客户充值', '免密支付', '充值失败', '网银钱包定义', '网银钱包开通', '充值到账异常', '多次支付退款', '微信下单', '夺宝岛支付方式'] 8 | bug ['下单后无记录', '无法加入购物车', '竞拍异常', '地址信息无法保存'] 9 | 维修 ['售后维修点查询'] 10 | 评价 ['查看评价晒单', '删除修改评价晒单', '评价晒单返券和赠品', '评价晒单异常', '评价晒单送积分京豆'] 11 | 预定 ['机票相关', '购买机票', '火车票', '酒店预订']""" 12 | -------------------------------------------------------------------------------- /baselines/models.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | import torch 3 | import torch.nn as nn 4 | import numpy as np 5 | from torch.nn.init import xavier_uniform_ 6 | import torch.nn.functional as F 7 | import copy 8 | import warnings 9 | import os 10 | import pickle 11 | 12 | 13 | def init_params(model): 14 | for name, param in model.named_parameters(): 15 | if param.data.dim() > 1: 16 | xavier_uniform_(param.data) 17 | else: 18 | pass 19 | 20 | 21 | def universal_sentence_embedding(sentences, mask, sqrt=True): 22 | sentence_sums = torch.bmm( 23 | sentences.permute(0, 2, 1), mask.float().unsqueeze(-1) 24 | ).squeeze(-1) 25 | divisor = (mask.sum(dim=1).view(-1, 1).float()) 26 | if sqrt: 27 | divisor = divisor.sqrt() 28 | sentence_sums /= divisor 29 | return sentence_sums 30 | 31 | 32 | class GRU(nn.Module): 33 | def __init__(self, **config): 34 | super().__init__() 35 | vocab_size = config.get('vocab_size') 36 | dropout = config.get('dropout', 0.4) 37 | d_model = config.get('d_model', 256) 38 | num_layers = config.get('num_layers', 1) 39 | 40 | self.embedding = nn.Embedding(vocab_size, d_model, padding_idx=0) 41 | self.embedding_dropout = nn.Dropout(dropout) 42 | self.gru = nn.GRU(d_model, d_model, num_layers=num_layers, bidirectional=True, batch_first=True) 43 | 44 | init_params(self.embedding) 45 | init_params(self.gru) 46 | 47 | self.d_model = d_model * 2 48 | 49 | def forward(self, input_ids, **kwargs): 50 | attention_mask = input_ids.ne(0).detach() 51 | E = self.embedding_dropout(self.embedding(input_ids)).transpose(0, 1) 52 | H, h1 = self.gru(E) 53 | H = H.transpose(0, 1) 54 | h = universal_sentence_embedding(H, attention_mask) 55 | return h 56 | 57 | 58 | class GRUAttention(nn.Module): 59 | def __init__(self, **config): 60 | super().__init__() 61 | vocab_size = config.get('vocab_size') 62 | dropout = config.get('dropout', 0.4) 63 | d_model = config.get('d_model', 256) 64 | num_layers = config.get('num_layers', 1) 65 | 66 | self.embedding = nn.Embedding(vocab_size, d_model, padding_idx=0) 67 | self.embedding_dropout = nn.Dropout(dropout) 68 | self.gru = nn.GRU(d_model, d_model, num_layers=num_layers, bidirectional=True, batch_first=True) 69 | 70 | self.w = nn.Linear(2 * d_model, 1) 71 | 72 | init_params(self.embedding) 73 | init_params(self.gru) 74 | 75 | self.d_model = d_model * 2 76 | 77 | def forward(self, input_ids, **kwargs): 78 | attention_mask = input_ids.ne(0).detach() 79 | E = self.embedding_dropout(self.embedding(input_ids)).transpose(0, 1) 80 | H, h1 = self.gru(E) 81 | H = H.transpose(0, 1) # bc_size, len, d_model 82 | wh = self.w(H).squeeze(2) # bc_size, len 83 | # print(wh.size()) 84 | attention = F.softmax(F.tanh(wh).masked_fill(mask=~attention_mask, value=-np.inf)).unsqueeze(1) 85 | # bc_size, 1, len 86 | 87 | presentation = torch.bmm(attention, H).squeeze(1) # bc_size, d_model 88 | return presentation 89 | 90 | 91 | class Hierarchical(nn.Module): 92 | def __init__(self, backbone, class_num): 93 | super().__init__() 94 | self.drop_out = nn.Dropout(0.4) 95 
| self.private = nn.ModuleList([copy.deepcopy(backbone) for num in class_num]) 96 | d_model = backbone.d_model 97 | 98 | self.class_num = class_num 99 | self.gru = nn.ModuleList( 100 | [nn.GRU(d_model, d_model, num_layers=1, bidirectional=False, batch_first=True) for num in class_num]) 101 | self.linear = nn.ModuleList([nn.Linear(d_model, num) for num in class_num]) 102 | for layer in self.linear: 103 | init_params(layer) 104 | for layer in self.gru: 105 | init_params(layer) 106 | 107 | def forward(self, input_ids, **kwargs): 108 | bc_size, dialog_his, utt_len = input_ids.size() 109 | 110 | input_ids = input_ids.view(-1, utt_len) 111 | attention_mask = input_ids.ne(0).detach() 112 | 113 | res = [] 114 | for private_module, gru, cls_layer in zip(self.private, self.gru, self.linear): 115 | private_out = private_module(input_ids=input_ids, attention_mask=attention_mask, **kwargs) 116 | private_out = private_out.view(bc_size, dialog_his, -1) # bc_size, dialog_his, d_model 117 | H, hidden = gru(private_out) 118 | hidden = hidden.squeeze(0) # bc_size, d_model 119 | hidden = self.drop_out(hidden) 120 | rep = hidden 121 | res.append(cls_layer(rep)) 122 | return res 123 | 124 | 125 | class HierarchicalAttention(nn.Module): 126 | def __init__(self, backbone, class_num): 127 | super().__init__() 128 | self.drop_out = nn.Dropout(0.4) 129 | self.private = nn.ModuleList([copy.deepcopy(backbone) for num in class_num]) 130 | d_model = backbone.d_model 131 | 132 | self.w = nn.ModuleList([nn.Linear(d_model, 1) for num in class_num]) 133 | 134 | self.class_num = class_num 135 | self.gru = nn.ModuleList( 136 | [nn.GRU(d_model, d_model, num_layers=1, bidirectional=False, batch_first=True) for num in class_num]) 137 | self.linear = nn.ModuleList([nn.Linear(d_model, num) for num in class_num]) 138 | for layer in self.linear: 139 | init_params(layer) 140 | for layer in self.gru: 141 | init_params(layer) 142 | for layer in self.w: 143 | init_params(layer) 144 | 145 | def forward(self, input_ids, **kwargs): 146 | bc_size, dialog_his, utt_len = input_ids.size() 147 | 148 | input_ids = input_ids.view(-1, utt_len) 149 | attention_mask = input_ids.ne(0).detach() 150 | 151 | res = [] 152 | for private_module, gru, w, cls_layer in zip(self.private, self.gru, self.w, self.linear): 153 | private_out = private_module(input_ids=input_ids, attention_mask=attention_mask, **kwargs) 154 | private_out = private_out.view(bc_size, dialog_his, -1) # bc_size, dialog_his, d_model 155 | H, hidden = gru(private_out) 156 | # H = H.transpose(0, 1) # bc_size, dialog_his, d_model 157 | wh = w(H).squeeze(2) # bc_size, dialog_his 158 | attention = F.softmax(F.tanh(wh)).unsqueeze(1) # bc_size, 1, dialog_his 159 | hidden = torch.bmm(attention, H).squeeze(1) # bc_size, d_model 160 | 161 | hidden = self.drop_out(hidden) 162 | rep = hidden 163 | res.append(cls_layer(rep)) 164 | return res 165 | 166 | 167 | class BERTBackbone(nn.Module): 168 | def __init__(self, **config): 169 | super().__init__() 170 | name = config.get('name', 'bert-base-chinese') 171 | self.layers_used = config.get('layers_used', 2) 172 | self.bert = BertModel.from_pretrained(name, output_hidden_states=True, output_attentions=True) 173 | self.d_model = 768 * self.layers_used * 2 174 | 175 | def forward(self, input_ids, attention_mask, token_type_ids=None, **kwargs): 176 | out = self.bert(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids) 177 | bert_out = out[2] 178 | out = [bert_out[-i - 1] for i in range(self.layers_used)] 179 | out = 
torch.cat(out, dim=-1) 180 | out = universal_sentence_embedding(out, attention_mask) 181 | 182 | cls = [bert_out[-i - 1].transpose(0, 1)[0] for i in range(self.layers_used)] 183 | cls = torch.cat(cls, dim=-1) 184 | out = torch.cat([cls, out], dim=-1) 185 | return out 186 | 187 | 188 | class ClassModel(nn.Module): 189 | def __init__(self, backbone, class_num): 190 | super().__init__() 191 | self.drop_out = nn.Dropout(0.4) 192 | self.private = nn.ModuleList([copy.deepcopy(backbone) for num in class_num]) 193 | d_model = backbone.d_model 194 | self.class_num = class_num 195 | self.linear = nn.ModuleList([nn.Linear(d_model, num) for num in class_num]) 196 | for layer in self.linear: 197 | torch.nn.init.normal_(layer.weight, std=0.02) 198 | 199 | def forward(self, input_ids, **kwargs): 200 | input_ids = input_ids 201 | attention_mask = input_ids.ne(0).detach() 202 | res = [] 203 | for private_module, cls_layer in zip(self.private, self.linear): 204 | private_out = private_module(input_ids=input_ids, attention_mask=attention_mask, **kwargs) 205 | rep = private_out 206 | rep = self.drop_out(rep) 207 | res.append(cls_layer(rep)) 208 | return res 209 | -------------------------------------------------------------------------------- /baselines/spearman.py: -------------------------------------------------------------------------------- 1 | """ 2 | Pearson Rho, Spearman Rho, and Kendall Tau 3 | Correlation algorithms 4 | Drew J. Nase 5 | Expects path to a file containing data series - 6 | one per line, separated by one or more spaces. 7 | """ 8 | 9 | import math 10 | import sys 11 | import string 12 | from itertools import combinations 13 | 14 | 15 | # x, y must be one-dimensional arrays of the same length 16 | 17 | # Pearson algorithm 18 | def pearson(x, y): 19 | assert len(x) == len(y) > 0 20 | q = lambda n: len(n) * sum(map(lambda i: i ** 2, n)) - (sum(n) ** 2) 21 | return (len(x) * sum(map(lambda a: a[0] * a[1], zip(x, y))) - sum(x) * sum(y)) / math.sqrt(q(x) * q(y)) 22 | 23 | 24 | # Spearman algorithm 25 | def spearman(x, y): 26 | assert len(x) == len(y) > 0 27 | q = lambda n: map(lambda val: sorted(n).index(val) + 1, n) 28 | d = sum(map(lambda x, y: (x - y) ** 2, q(x), q(y))) 29 | return 1.0 - 6.0 * d / float(len(x) * (len(y) ** 2 - 1.0)) 30 | 31 | 32 | # Kendall algorithm 33 | def kendall(x, y): 34 | assert len(x) == len(y) > 0 35 | c = 0 # concordant count 36 | d = 0 # discordant count 37 | t = 0 # tied count 38 | for (i, j) in combinations(range(len(x)), 2): 39 | s = (x[i] - x[j]) * (y[i] - y[j]) 40 | if s: 41 | c += 1 42 | d += 1 43 | if s > 0: 44 | t += 1 45 | elif s < 0: 46 | t -= 1 47 | else: 48 | if x[i] - x[j]: 49 | c += 1 50 | elif y[i] - y[j]: 51 | d += 1 52 | return t / math.sqrt(c * d) 53 | -------------------------------------------------------------------------------- /baselines/svm.py: -------------------------------------------------------------------------------- 1 | from .train_sat import load_dstc 2 | import numpy as np 3 | from collections import defaultdict 4 | from sklearn.metrics import f1_score, precision_score, recall_score 5 | import jieba 6 | import copy 7 | 8 | 9 | def get_main_score(scores): 10 | number = [0, 0, 0, 0, 0] 11 | for item in scores: 12 | number[item] += 1 13 | score = np.argmax(number) 14 | return score 15 | 16 | 17 | def load_jddc(dirname, lite=1): 18 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 19 | 20 | from jddc_config import domain2actions 21 | 22 | act2domain = {} 23 | for line in domain2actions.split('\n'): 24 | domain = 
line[:line.index('[') - 1].strip() 25 | actions = [x[1:-1] for x in line[line.index('[') + 1:-1].split(', ')] 26 | # print(domain, actions) 27 | for x in actions: 28 | act2domain[x] = domain 29 | data = [] 30 | for line in raw: 31 | if len(line) == 0: 32 | data.append([]) 33 | else: 34 | data[-1].append(line) 35 | x = [] 36 | emo = [] 37 | act = [] 38 | action_list = {'other': 0} 39 | for session in data: 40 | his_input_ids = [] 41 | for turn in session: 42 | role, text, action, score = turn.split('\t') 43 | score = score.split(',') 44 | if role == 'USER': 45 | x.append(copy.deepcopy(' '.join(his_input_ids))) 46 | emo.append(get_main_score([int(item) - 1 for item in score])) 47 | action = action.strip() 48 | if lite: 49 | action = act2domain.get(action, 'other') 50 | if action not in action_list: 51 | action_list[action] = len(action_list) 52 | act.append(action_list[action]) 53 | his_input_ids.append(' '.join(jieba.cut(text.strip()))) 54 | # his_input_ids.append(' '.join(text.strip())) 55 | 56 | action_num = len(action_list) 57 | data = [x, emo, act, action_num] 58 | return data 59 | 60 | 61 | def load_data(dirname): 62 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 63 | data = [] 64 | for line in raw: 65 | if line == '': 66 | data.append([]) 67 | else: 68 | data[-1].append(line) 69 | x = [] 70 | emo = [] 71 | act = [] 72 | action_list = {} 73 | for session in data: 74 | his_input_ids = [] 75 | for turn in session: 76 | role, text, action, score = turn.split('\t') 77 | score = score.split(',') 78 | action = action.split(',') 79 | action = action[0] 80 | if role.upper() == 'USER': 81 | x.append(copy.deepcopy(' '.join(his_input_ids))) 82 | emo.append(get_main_score([int(item) - 1 for item in score])) 83 | action = action.strip() 84 | if action not in action_list: 85 | action_list[action] = len(action_list) 86 | act.append(action_list[action]) 87 | his_input_ids.append(text.strip()) 88 | 89 | action_num = len(action_list) 90 | data = [x, emo, act, action_num] 91 | return data 92 | 93 | 94 | def train(fold=0): 95 | print('fold', fold) 96 | 97 | from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer, TfidfTransformer 98 | from sklearn.linear_model import LogisticRegression 99 | from sklearn.naive_bayes import MultinomialNB 100 | from sklearn.svm import SVC 101 | from xgboost import XGBClassifier 102 | 103 | from sklearn.metrics import cohen_kappa_score 104 | from .spearman import spearman 105 | 106 | dataset = 'MWOZ' 107 | 108 | # x, emo, act, action_num = load_jddc(f'dataset/{dataset}') 109 | x, emo, act, action_num = load_data(f'dataset/{dataset},txt') 110 | 111 | ll = int(len(x) / 10) 112 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 113 | train_act = emo[:ll * fold] + emo[ll * (fold + 1):] 114 | 115 | test_x = x[ll * fold:ll * (fold + 1)] 116 | test_act = emo[ll * fold:ll * (fold + 1)] 117 | 118 | # =================== 119 | 120 | print('build tf-idf') 121 | vectorizer = CountVectorizer() 122 | train_feature = vectorizer.fit_transform(train_x) 123 | test_feature = vectorizer.transform(test_x) 124 | 125 | model = XGBClassifier() 126 | model.fit(train_feature, train_act) 127 | prediction = model.predict(test_feature) 128 | 129 | # svm = SVC(C=1.0, kernel="linear") 130 | # svm.fit(train_feature, train_act) 131 | # prediction = svm.predict(test_feature) 132 | 133 | # lr = LogisticRegression() 134 | # lr.fit(train_feature, train_act) 135 | # prediction = lr.predict(test_feature) 136 | 137 | label = test_act 138 | 139 | recall = [[0, 0] for _ in range(5)] 
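    # The block below computes the satisfaction-prediction metrics reported in the README:
    # - recall[l] accumulates [#correct, #total] for each of the 5 rating classes; UAR is the
    #   unweighted average of the per-class recall values.
    # - cohen_kappa_score and spearman measure agreement / rank correlation between predictions and labels.
    # - bi_pred/bi_label binarize the labels (class index < 2, i.e. original ratings 1-2, counts as
    #   dissatisfied), and bi_f1 is the F1 of that dissatisfied class; max(..., 1) guards against division by zero.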
140 | for p, l in zip(prediction, label): 141 | recall[l][1] += 1 142 | recall[l][0] += int(p == l) 143 | recall_value = [item[0] / max(item[1], 1) for item in recall] 144 | print('Recall value:', recall_value) 145 | print('Recall:', recall) 146 | UAR = sum(recall_value) / len(recall_value) 147 | kappa = cohen_kappa_score(prediction, label) 148 | rho = spearman(prediction, label) 149 | 150 | bi_pred = [int(item < 2) for item in prediction] 151 | bi_label = [int(item < 2) for item in label] 152 | bi_recall = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if l == 1]) / max(bi_label.count(1), 1) 153 | bi_precision = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if p == 1]) / max(bi_pred.count(1), 1) 154 | bi_f1 = 2 * bi_recall * bi_precision / max((bi_recall + bi_precision), 1) 155 | 156 | print(UAR, kappa, rho, bi_f1) 157 | 158 | with open(f'outputs/{dataset}_emo/xgb_{fold}_0.txt', 'w', encoding='utf-8') as f: 159 | for p, l in zip(prediction, label): 160 | f.write(f'{p}, {l}\n') 161 | 162 | 163 | def train_act(fold=0): 164 | print('fold', fold) 165 | 166 | from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer, TfidfTransformer 167 | from sklearn.linear_model import LogisticRegression 168 | from sklearn.naive_bayes import MultinomialNB 169 | from sklearn.svm import SVC 170 | from xgboost import XGBClassifier 171 | 172 | from sklearn.metrics import cohen_kappa_score 173 | from .spearman import spearman 174 | 175 | dataset = 'JDDC' 176 | 177 | x, emo, act, action_num = load_jddc(f'dataset/{dataset}.txt') 178 | # x, emo, act, action_num = load_data(f'dataset/{dataset}.txt') 179 | 180 | ll = int(len(x) / 10) 181 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 182 | train_act = act[:ll * fold] + act[ll * (fold + 1):] 183 | 184 | test_x = x[ll * fold:ll * (fold + 1)] 185 | test_act = act[ll * fold:ll * (fold + 1)] 186 | 187 | print('build tf-idf') 188 | vectorizer = CountVectorizer() 189 | train_feature = vectorizer.fit_transform(train_x) 190 | test_feature = vectorizer.transform(test_x) 191 | 192 | # model = XGBClassifier() 193 | # model.fit(train_feature, train_act) 194 | # prediction = model.predict(test_feature) 195 | 196 | # svm = SVC(C=1.0, kernel="linear") 197 | # svm.fit(train_feature, train_act) 198 | # prediction = svm.predict(test_feature) 199 | 200 | lr = LogisticRegression() 201 | lr.fit(train_feature, train_act) 202 | prediction = lr.predict(test_feature) 203 | 204 | label = test_act 205 | 206 | acc = sum([int(p == l) for p, l in zip(prediction, label)]) / len(label) 207 | precision = precision_score(label, prediction, average='macro', zero_division=0) 208 | recall = recall_score(label, prediction, average='macro', zero_division=0) 209 | f1 = f1_score(label, prediction, average='macro', zero_division=0) 210 | 211 | print(acc, precision, recall, f1) 212 | 213 | with open(f'outputs/{dataset}_act/lr_{fold}_0.txt', 'w', encoding='utf-8') as f: 214 | for p, l in zip(prediction, label): 215 | f.write(f'{p}, {l}\n') 216 | -------------------------------------------------------------------------------- /baselines/test.py: -------------------------------------------------------------------------------- 1 | from sklearn.metrics import cohen_kappa_score 2 | from .spearman import spearman 3 | from sklearn.metrics import f1_score, precision_score, recall_score 4 | 5 | 6 | def test(name): 7 | best_result = [0. 
for _ in range(4)] 8 | for epoch in range(100): 9 | 10 | data = [] 11 | for fold in range(10): 12 | data.extend([line[:-1] for line in open(f'{name}_{fold}_{epoch}.txt', encoding='utf-8')]) 13 | 14 | prediction = [int(line.split(',')[0]) for line in data] 15 | label = [int(line.split(',')[1]) for line in data] 16 | recall = [[0, 0] for _ in range(5)] 17 | for p, l in zip(prediction, label): 18 | recall[l][1] += 1 19 | recall[l][0] += int(p == l) 20 | recall_value = [item[0] / max(item[1], 1) for item in recall] 21 | # print('Recall value:', recall_value) 22 | # print('Recall:', recall) 23 | UAR = sum(recall_value) / len(recall_value) 24 | kappa = cohen_kappa_score(prediction, label) 25 | rho = spearman(prediction, label) 26 | # rho = 0 27 | 28 | bi_pred = [int(item < 2) for item in prediction] 29 | bi_label = [int(item < 2) for item in label] 30 | bi_recall = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if l == 1]) / max(bi_label.count(1), 1) 31 | bi_precision = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if p == 1]) / max(bi_pred.count(1), 1) 32 | bi_f1 = 2 * bi_recall * bi_precision / max((bi_recall + bi_precision), 1) 33 | 34 | test_result = [UAR, kappa, rho, bi_f1] 35 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 36 | print(epoch, best_result, test_result) 37 | 38 | 39 | def test_act(name): 40 | best_result = [0. for _ in range(4)] 41 | for epoch in range(100): 42 | 43 | data = [] 44 | for fold in range(10): 45 | data.extend([line[:-1] for line in open(f'{name}_{fold}_{epoch}.txt', encoding='utf-8')]) 46 | 47 | prediction = [int(line.split(',')[0]) for line in data] 48 | label = [int(line.split(',')[1]) for line in data] 49 | acc = sum([int(p == l) for p, l in zip(prediction, label)]) / len(label) 50 | precision = precision_score(label, prediction, average='macro', zero_division=0) 51 | recall = recall_score(label, prediction, average='macro', zero_division=0) 52 | f1 = f1_score(label, prediction, average='macro', zero_division=0) 53 | 54 | test_result = [acc, precision, recall, f1] 55 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 56 | print(epoch, best_result, test_result) 57 | 58 | 59 | if __name__ == '__main__': 60 | 61 | test('outputs/jddc_emo/BERT') 62 | test_act('outputs/jddc_act/BERT') 63 | -------------------------------------------------------------------------------- /baselines/train_act.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | from torch.utils.data import Dataset, DataLoader 3 | from torch.nn.utils.rnn import pad_sequence 4 | import torch.nn.functional as F 5 | from tqdm import tqdm 6 | import random 7 | import json 8 | import copy 9 | import torch 10 | import warnings 11 | import numpy as np 12 | import os 13 | import pickle 14 | from sklearn.metrics import cohen_kappa_score 15 | from sklearn.metrics import f1_score, precision_score, recall_score 16 | from .spearman import spearman 17 | 18 | warnings.filterwarnings("ignore") 19 | 20 | 21 | def write_pkl(obj, filename): 22 | with open(filename, 'wb') as f: 23 | pickle.dump(obj, f) 24 | 25 | 26 | def read_pkl(filename): 27 | with open(filename, 'rb') as f: 28 | return pickle.load(f) 29 | 30 | 31 | def get_main_score(scores): 32 | number = [0, 0, 0, 0, 0] 33 | for item in scores: 34 | number[item] += 1 35 | score = np.argmax(number) 36 | return score 37 | 38 | 39 | def load_redial_act(dirname, tokenizer): 40 | print(dirname) 41 | name = 'action_data' 
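    # Expected file format (see the repository README): one turn per line with
    # role \t text \t action \t satisfaction, and sessions separated by blank lines.
    # Tokenized data is cached as f'{dirname}-{name}.pkl'; delete that file to force re-tokenization.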
42 | 43 | if os.path.exists(f'{dirname}-{name}.pkl'): 44 | return read_pkl(f'{dirname}-{name}.pkl') 45 | print('tokenized data') 46 | raw = [line[:-1] for line in open(f'{dirname}/data_action.txt', encoding='utf-8')] 47 | data = [] 48 | for line in raw: 49 | if line == '': 50 | data.append([]) 51 | else: 52 | data[-1].append(line) 53 | x = [] 54 | emo = [] 55 | act = [] 56 | action_list = {} 57 | for session in data: 58 | his_input_ids = [] 59 | for turn in session: 60 | role, text, action, score = turn.split('\t') 61 | score = score.split(',') 62 | action = action.split(',') 63 | action = action[0] 64 | if role == 'USER': 65 | x.append(copy.deepcopy(his_input_ids)) 66 | emo.append(get_main_score([int(item) - 1 for item in score])) 67 | action = action.strip() 68 | if action not in action_list: 69 | action_list[action] = len(action_list) 70 | act.append(action_list[action]) 71 | 72 | ids = tokenizer.encode(text.strip())[1:] 73 | his_input_ids.append(ids) 74 | 75 | action_num = len(action_list) 76 | data = [x, emo, act, action_num] 77 | write_pkl(data, f'{dirname}-{name}.pkl') 78 | return data 79 | 80 | 81 | def load_ccpe(dirname, tokenizer): 82 | print(dirname) 83 | name = 'hierarchical_data' 84 | 85 | if os.path.exists(f'{dirname}-{name}.pkl'): 86 | return read_pkl(f'{dirname}-{name}.pkl') 87 | print('tokenized data') 88 | raw = [line[:-1] for line in open(f'{dirname}/data.txt', encoding='utf-8')] 89 | data = [] 90 | for line in raw: 91 | if line == '': 92 | data.append([]) 93 | else: 94 | data[-1].append(line) 95 | x = [] 96 | emo = [] 97 | act1 = [] 98 | act2 = [] 99 | action_list1 = {} 100 | action_list2 = {} 101 | for session in data: 102 | his_input_ids = [] 103 | for turn in session: 104 | role, text, action, score = turn.split('\t') 105 | score = score.split(',') 106 | action = action.split(',') 107 | action = action[0] 108 | if action == '': 109 | action1, action2 = 'other', 'other' 110 | else: 111 | action1, action2 = action.split('+') 112 | if role == 'USER': 113 | x.append(copy.deepcopy(his_input_ids)) 114 | emo.append(get_main_score([int(item) - 1 for item in score])) 115 | action1 = action1.strip() 116 | if action1 not in action_list1: 117 | action_list1[action1] = len(action_list1) 118 | act1.append(action_list1[action1]) 119 | 120 | action2 = action2.strip() 121 | if action2 not in action_list2: 122 | action_list2[action2] = len(action_list2) 123 | act2.append(action_list2[action2]) 124 | ids = tokenizer.encode(text.strip())[1:] 125 | his_input_ids.append(ids) 126 | action_num1 = len(action_list1) 127 | action_num2 = len(action_list2) 128 | data = [x, emo, act1, act2, action_num1, action_num2] 129 | write_pkl(data, f'{dirname}-{name}.pkl') 130 | return data 131 | 132 | 133 | def load_dstc(dirname, tokenizer): 134 | print(dirname) 135 | name = 'hierarchical_data' 136 | 137 | if os.path.exists(f'{dirname}-{name}.pkl'): 138 | return read_pkl(f'{dirname}-{name}.pkl') 139 | 140 | print('tokenized data') 141 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 142 | data = [] 143 | for line in raw: 144 | if line == '': 145 | data.append([]) 146 | else: 147 | data[-1].append(line) 148 | x = [] 149 | emo = [] 150 | act = [] 151 | action_list = {} 152 | for session in data: 153 | his_input_ids = [] 154 | for turn in session: 155 | role, text, action, score = turn.split('\t') 156 | score = score.split(',') 157 | action = action.split(',') 158 | action = action[0] 159 | if role == 'USER': 160 | x.append(copy.deepcopy(his_input_ids)) 161 | 
emo.append(get_main_score([int(item) - 1 for item in score])) 162 | action = action.strip() 163 | if action not in action_list: 164 | action_list[action] = len(action_list) 165 | act.append(action_list[action]) 166 | 167 | ids = tokenizer.encode(text.strip())[1:] 168 | his_input_ids.append(ids) 169 | 170 | action_num = len(action_list) 171 | data = [x, emo, act, action_num] 172 | write_pkl(data, f'{dirname}-{name}.pkl') 173 | return data 174 | 175 | 176 | class HierarchicalData(Dataset): 177 | def __init__(self, x, act, dialog_used=5): 178 | self.x = x 179 | self.act = act 180 | self.dialog_used = dialog_used 181 | 182 | def __getitem__(self, index): 183 | x = [torch.tensor([101])] * (self.dialog_used - len(self.x[index])) + \ 184 | [torch.tensor([101] + item[:64]) for item in self.x[index][-self.dialog_used:]] 185 | act = self.act[index] 186 | return x, act 187 | 188 | def __len__(self): 189 | return len(self.x) 190 | 191 | 192 | class FlatData(Dataset): 193 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 194 | self.x = x 195 | self.act = act 196 | self.dialog_used = dialog_used 197 | 198 | if up_sampling: 199 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 200 | enhance_idx = enhance_idx * 10 201 | enhance_x = [x[idx] for idx in enhance_idx] 202 | enhance_act = [act[idx] for idx in enhance_idx] 203 | self.x = x + enhance_x 204 | self.act = act + enhance_act 205 | 206 | def __getitem__(self, index): 207 | seq = sum([item[:64] for item in self.x[index]], []) 208 | x = torch.tensor([101] + seq[-500:]) 209 | act = self.act[index] 210 | return x, act 211 | 212 | def __len__(self): 213 | return len(self.x) 214 | 215 | 216 | def collate_fn(data): 217 | x, act = zip(*data) 218 | bc_size = len(x) 219 | dialog_his = len(x[0]) 220 | x = [item for dialog in x for item in dialog] 221 | x = pad_sequence(x, batch_first=True, padding_value=0) 222 | x = x.view(bc_size, dialog_his, -1) 223 | 224 | return {'input_ids': x, 225 | 'act': torch.tensor(act).long() 226 | } 227 | 228 | 229 | def flat_collate_fn(data): 230 | x, act = zip(*data) 231 | x = pad_sequence(x, batch_first=True, padding_value=0) 232 | 233 | return {'input_ids': x, 234 | 'act': torch.tensor(act).long() 235 | } 236 | 237 | 238 | def train(fold=0, data_name='dstc8', model_name='HiGRU+ATTN', dialog_used=10): 239 | print('[TRAIN ACTION]') 240 | 241 | data_name = data_name.split('<>')[-1] 242 | 243 | data_name = data_name.replace('\r', '') 244 | model_name = model_name.replace('\r', '') 245 | 246 | print('dialog used', dialog_used) 247 | 248 | name = f'act_{data_name}_{model_name}_{fold}' 249 | print('TRAIN ::', name) 250 | 251 | save_path = f'outputs/{data_name}_act/{model_name}_{fold}' 252 | 253 | tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') 254 | 255 | x, emo, act, action_num = load_redial_act(f'dataset/{data_name}', tokenizer) 256 | # x, emo, act, action_num = load_dstc(f'dataset/{data_name}', tokenizer) 257 | 258 | # x, emo, act1, act2, action_num1, action_num2 = load_ccpe(f'dataset/{data_name}', tokenizer) 259 | # act = act1 260 | # action_num = action_num1 261 | 262 | print('action_num:', action_num) 263 | from .models import GRU, GRUAttention, BERTBackbone 264 | from .models import HierarchicalAttention, Hierarchical, ClassModel 265 | if model_name == 'HiGRU+ATTN': 266 | model = HierarchicalAttention(backbone=GRUAttention(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 267 | model = model.cuda() 268 | optimizer = AdamW(model.parameters(), 1e-4) 269 | batch_size = 16 270 | DataFunc 
= HierarchicalData 271 | cf = collate_fn 272 | elif model_name == 'HiGRU': 273 | model = Hierarchical(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 274 | model = model.cuda() 275 | optimizer = AdamW(model.parameters(), 1e-4) 276 | batch_size = 16 277 | DataFunc = HierarchicalData 278 | cf = collate_fn 279 | elif model_name == 'GRU': 280 | model = ClassModel(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 281 | model = model.cuda() 282 | optimizer = AdamW(model.parameters(), 1e-4) 283 | batch_size = 16 284 | DataFunc = FlatData 285 | cf = flat_collate_fn 286 | elif model_name == 'BERT': 287 | model = ClassModel(backbone=BERTBackbone(layers_used=2, name='bert-base-uncased'), class_num=[action_num]) 288 | model = model.cuda() 289 | optimizer = AdamW(model.parameters(), 2e-5) 290 | batch_size = 6 291 | DataFunc = FlatData 292 | cf = flat_collate_fn 293 | else: 294 | print('[unknown model name]') 295 | return 296 | 297 | ll = int(len(x) / 10) 298 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 299 | train_act = act[:ll * fold] + act[ll * (fold + 1):] 300 | 301 | test_x = x[ll * fold:ll * (fold + 1)] 302 | test_act = act[ll * fold:ll * (fold + 1)] 303 | 304 | print(len(train_x), len(test_x)) 305 | print() 306 | best_result = [0. for _ in range(4)] 307 | for i in range(100): 308 | print('train epoch', i, name) 309 | train_loader = DataLoader(DataFunc(train_x, train_act, dialog_used=dialog_used), batch_size=batch_size, 310 | shuffle=True, num_workers=2, collate_fn=cf) 311 | # tk0 = tqdm(train_loader, total=len(train_loader)) 312 | tk0 = train_loader 313 | act_acc = [] 314 | model.train() 315 | for j, batch in enumerate(tk0): 316 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 317 | act = batch['act'].cuda() 318 | act_loss = F.cross_entropy(act_pred, act) 319 | loss = act_loss 320 | 321 | loss.backward() 322 | optimizer.step() 323 | optimizer.zero_grad() 324 | act_acc.append((act_pred.argmax(dim=-1) == act).sum().item() / act.size(0)) 325 | 326 | # tk0.set_postfix(act_acc=round(sum(act_acc) / max(1, len(act_acc)), 4)) 327 | torch.save(model.state_dict(), f'outputs/{name}_{i}.pt') 328 | # print('test epoch', i) 329 | test_result = test(model, DataFunc(test_x, test_act, dialog_used=dialog_used), f'{save_path}_{i}.txt', cf) 330 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 331 | print(f'text_result={test_result}') 332 | print(f'best_result={best_result}') 333 | print() 334 | 335 | 336 | def test(model, test_data, save_path, cf): 337 | test_loader = DataLoader(test_data, batch_size=6, shuffle=False, num_workers=0, collate_fn=cf) 338 | # tk0 = tqdm(test_loader, total=len(test_loader)) 339 | tk0 = test_loader 340 | prediction = [] 341 | label = [] 342 | 343 | model.eval() 344 | for j, batch in enumerate(tk0): 345 | act = batch['act'].cuda() 346 | with torch.no_grad(): 347 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 348 | prediction.extend(act_pred.argmax(dim=-1).cpu().tolist()) 349 | label.extend(act.cpu().tolist()) 350 | 351 | acc = sum([int(p == l) for p, l in zip(prediction, label)]) / len(label) 352 | precision = precision_score(label, prediction, average='macro', zero_division=0) 353 | recall = recall_score(label, prediction, average='macro', zero_division=0) 354 | f1 = f1_score(label, prediction, average='macro', zero_division=0) 355 | 356 | with open(save_path, 'w', encoding='utf-8') as f: 357 | for p, l in zip(prediction, label): 358 | f.write(f'{p}, {l}\n') 359 | 360 | return acc, precision, 
recall, f1 361 | -------------------------------------------------------------------------------- /baselines/train_act.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | data='jddc' 4 | model='BERT' 5 | 6 | for i in {0..9} 7 | do 8 | echo CUDA=$[ i%4 ] log/${data}_${model}_$i.log 9 | CUDA_VISIBLE_DEVICES=$[ i%4 ] nohup python -u driver_act.py -fold=$i --data=${data} --model=${model} > log/${data}_${model}_$i.log 2>&1 & 10 | done -------------------------------------------------------------------------------- /baselines/train_jddc_act.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | from torch.utils.data import Dataset, DataLoader 3 | from torch.nn.utils.rnn import pad_sequence 4 | import torch.nn.functional as F 5 | from tqdm import tqdm 6 | import random 7 | import json 8 | import copy 9 | import torch 10 | import warnings 11 | import numpy as np 12 | import os 13 | import pickle 14 | from sklearn.metrics import cohen_kappa_score 15 | from sklearn.metrics import f1_score, precision_score, recall_score 16 | from .spearman import spearman 17 | 18 | warnings.filterwarnings("ignore") 19 | 20 | 21 | def write_pkl(obj, filename): 22 | with open(filename, 'wb') as f: 23 | pickle.dump(obj, f) 24 | 25 | 26 | def read_pkl(filename): 27 | with open(filename, 'rb') as f: 28 | return pickle.load(f) 29 | 30 | 31 | def get_main_score(scores): 32 | number = [0, 0, 0, 0, 0] 33 | for item in scores: 34 | number[item] += 1 35 | score = np.argmax(number) 36 | return score 37 | 38 | 39 | def load_jddc(dirname, tokenizer, lite=1): 40 | name = 'hierarchical_data' 41 | if lite: 42 | name = name + '_lite' 43 | if os.path.exists(f'{dirname}-{name}.pkl'): 44 | return read_pkl(f'{dirname}-{name}.pkl') 45 | print('tokenized data JDDC') 46 | 47 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 48 | 49 | from .jddc_config import domain2actions 50 | 51 | act2domain = {} 52 | for line in domain2actions.split('\n'): 53 | domain = line[:line.index('[') - 1].strip() 54 | actions = [x[1:-1] for x in line[line.index('[') + 1:-1].split(', ')] 55 | # print(domain, actions) 56 | for x in actions: 57 | act2domain[x] = domain 58 | data = [] 59 | for line in raw: 60 | if len(line) == 0: 61 | data.append([]) 62 | else: 63 | data[-1].append(line) 64 | x = [] 65 | emo = [] 66 | act = [] 67 | action_list = {'other': 0} 68 | for session in data: 69 | his_input_ids = [] 70 | for turn in session: 71 | role, text, action, score = turn.split('\t') 72 | score = score.split(',') 73 | 74 | if role == 'USER': 75 | x.append(copy.deepcopy(his_input_ids)) 76 | emo.append(get_main_score([int(item) - 1 for item in score])) 77 | action = action.strip() 78 | if lite: 79 | action = act2domain.get(action, 'other') 80 | if action not in action_list: 81 | action_list[action] = len(action_list) 82 | act.append(action_list[action]) 83 | ids = tokenizer.encode(text.strip())[1:] 84 | his_input_ids.append(ids) 85 | 86 | action_num = len(action_list) 87 | data = [x, emo, act, action_num] 88 | write_pkl(data, f'{dirname}-{name}.pkl') 89 | return data 90 | 91 | 92 | class HierarchicalData(Dataset): 93 | def __init__(self, x, act, dialog_used=5): 94 | self.x = x 95 | self.act = act 96 | self.dialog_used = dialog_used 97 | 98 | def __getitem__(self, index): 99 | x = [torch.tensor([101])] * (self.dialog_used - len(self.x[index])) + \ 100 | [torch.tensor([101] + item[:64]) for item in 
self.x[index][-self.dialog_used:]] 101 | act = self.act[index] 102 | return x, act 103 | 104 | def __len__(self): 105 | return len(self.x) 106 | 107 | 108 | class FlatData(Dataset): 109 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 110 | self.x = x 111 | self.act = act 112 | self.dialog_used = dialog_used 113 | 114 | if up_sampling: 115 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 116 | enhance_idx = enhance_idx * 10 117 | enhance_x = [x[idx] for idx in enhance_idx] 118 | enhance_act = [act[idx] for idx in enhance_idx] 119 | self.x = x + enhance_x 120 | self.act = act + enhance_act 121 | 122 | def __getitem__(self, index): 123 | seq = sum([item[:64] for item in self.x[index]], []) 124 | x = torch.tensor([101] + seq[-500:]) 125 | act = self.act[index] 126 | return x, act 127 | 128 | def __len__(self): 129 | return len(self.x) 130 | 131 | 132 | def collate_fn(data): 133 | x, act = zip(*data) 134 | bc_size = len(x) 135 | dialog_his = len(x[0]) 136 | x = [item for dialog in x for item in dialog] 137 | x = pad_sequence(x, batch_first=True, padding_value=0) 138 | x = x.view(bc_size, dialog_his, -1) 139 | 140 | return {'input_ids': x, 141 | 'act': torch.tensor(act).long() 142 | } 143 | 144 | 145 | def flat_collate_fn(data): 146 | x, act = zip(*data) 147 | x = pad_sequence(x, batch_first=True, padding_value=0) 148 | 149 | return {'input_ids': x, 150 | 'act': torch.tensor(act).long() 151 | } 152 | 153 | 154 | def train(fold=0, data_name='dstc8', model_name='HiGRU+ATTN'): 155 | print('[TRAIN ACTION] JDDC') 156 | dialog_used = 10 157 | 158 | data_name = data_name.replace('\r', '') 159 | model_name = model_name.replace('\r', '') 160 | 161 | print('dialog used', dialog_used) 162 | 163 | name = f'act_{data_name}_{model_name}_{fold}' 164 | print('TRAIN ::', name) 165 | 166 | save_path = f'outputs/{data_name}_act/{model_name}_{fold}' 167 | 168 | tokenizer = BertTokenizer.from_pretrained('bert-base-chinese') 169 | 170 | x, emo, act, action_num = load_jddc(f'dataset/{data_name}', tokenizer) 171 | 172 | print('action_num:', action_num) 173 | from .models import GRU, GRUAttention, BERTBackbone 174 | from .models import HierarchicalAttention, Hierarchical, ClassModel 175 | if model_name == 'HiGRU+ATTN': 176 | model = HierarchicalAttention(backbone=GRUAttention(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 177 | model = model.cuda() 178 | optimizer = AdamW(model.parameters(), 1e-4) 179 | batch_size = 16 180 | DataFunc = HierarchicalData 181 | cf = collate_fn 182 | elif model_name == 'HiGRU': 183 | model = Hierarchical(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 184 | model = model.cuda() 185 | optimizer = AdamW(model.parameters(), 1e-4) 186 | batch_size = 16 187 | DataFunc = HierarchicalData 188 | cf = collate_fn 189 | elif model_name == 'GRU': 190 | model = ClassModel(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[action_num]) 191 | model = model.cuda() 192 | optimizer = AdamW(model.parameters(), 1e-4) 193 | batch_size = 16 194 | DataFunc = FlatData 195 | cf = flat_collate_fn 196 | elif model_name == 'BERT': 197 | model = ClassModel(backbone=BERTBackbone(layers_used=2, name='bert-base-chinese'), class_num=[action_num]) 198 | model = model.cuda() 199 | optimizer = AdamW(model.parameters(), 2e-5) 200 | batch_size = 6 201 | DataFunc = FlatData 202 | cf = flat_collate_fn 203 | else: 204 | print('[unknown model name]') 205 | return 206 | 207 | ll = int(len(x) / 10) 208 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 209 | train_act 
= act[:ll * fold] + act[ll * (fold + 1):] 210 | 211 | test_x = x[ll * fold:ll * (fold + 1)] 212 | test_act = act[ll * fold:ll * (fold + 1)] 213 | 214 | print(len(train_x), len(test_x)) 215 | print() 216 | best_result = [0. for _ in range(4)] 217 | for i in range(100): 218 | print('train epoch', i, name) 219 | train_loader = DataLoader(DataFunc(train_x, train_act, dialog_used=dialog_used), batch_size=batch_size, 220 | shuffle=True, num_workers=2, collate_fn=cf) 221 | # tk0 = tqdm(train_loader, total=len(train_loader)) 222 | tk0 = train_loader 223 | act_acc = [] 224 | model.train() 225 | for j, batch in enumerate(tk0): 226 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 227 | act = batch['act'].cuda() 228 | act_loss = F.cross_entropy(act_pred, act) 229 | loss = act_loss 230 | 231 | loss.backward() 232 | optimizer.step() 233 | optimizer.zero_grad() 234 | act_acc.append((act_pred.argmax(dim=-1) == act).sum().item() / act.size(0)) 235 | 236 | # tk0.set_postfix(act_acc=round(sum(act_acc) / max(1, len(act_acc)), 4)) 237 | torch.save(model.state_dict(), f'outputs/{name}_{i}.pt') 238 | # print('test epoch', i) 239 | test_result = test(model, DataFunc(test_x, test_act, dialog_used=dialog_used), f'{save_path}_{i}.txt', cf) 240 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 241 | print(f'text_result={test_result}') 242 | print(f'best_result={best_result}') 243 | print() 244 | 245 | 246 | def test(model, test_data, save_path, cf): 247 | test_loader = DataLoader(test_data, batch_size=6, shuffle=False, num_workers=0, collate_fn=cf) 248 | # tk0 = tqdm(test_loader, total=len(test_loader)) 249 | tk0 = test_loader 250 | prediction = [] 251 | label = [] 252 | 253 | model.eval() 254 | for j, batch in enumerate(tk0): 255 | act = batch['act'].cuda() 256 | with torch.no_grad(): 257 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 258 | prediction.extend(act_pred.argmax(dim=-1).cpu().tolist()) 259 | label.extend(act.cpu().tolist()) 260 | 261 | acc = sum([int(p == l) for p, l in zip(prediction, label)]) / len(label) 262 | precision = precision_score(label, prediction, average='macro', zero_division=0) 263 | recall = recall_score(label, prediction, average='macro', zero_division=0) 264 | f1 = f1_score(label, prediction, average='macro', zero_division=0) 265 | 266 | with open(save_path, 'w', encoding='utf-8') as f: 267 | for p, l in zip(prediction, label): 268 | f.write(f'{p}, {l}\n') 269 | 270 | return acc, precision, recall, f1 271 | 272 | 273 | 274 | 275 | 276 | 277 | 278 | 279 | 280 | 281 | 282 | -------------------------------------------------------------------------------- /baselines/train_jddc_sat.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | from torch.utils.data import Dataset, DataLoader 3 | from torch.nn.utils.rnn import pad_sequence 4 | import torch.nn.functional as F 5 | from tqdm import tqdm 6 | import random 7 | import json 8 | import copy 9 | import torch 10 | import warnings 11 | import numpy as np 12 | import os 13 | import pickle 14 | from sklearn.metrics import cohen_kappa_score 15 | from .spearman import spearman 16 | 17 | warnings.filterwarnings("ignore") 18 | 19 | 20 | def write_pkl(obj, filename): 21 | with open(filename, 'wb') as f: 22 | pickle.dump(obj, f) 23 | 24 | 25 | def read_pkl(filename): 26 | with open(filename, 'rb') as f: 27 | return pickle.load(f) 28 | 29 | 30 | def get_main_score(scores): 31 | number = [0, 0, 0, 0, 0] 32 | for 
item in scores: 33 | number[item] += 1 34 | score = np.argmax(number) 35 | return score 36 | 37 | 38 | def load_jddc(dirname, tokenizer, lite=1): 39 | name = 'hierarchical_data' 40 | if lite: 41 | name = name + '_lite' 42 | if os.path.exists(f'{dirname}-{name}.pkl'): 43 | return read_pkl(f'{dirname}-{name}.pkl') 44 | print('tokenized data JDDC') 45 | 46 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 47 | 48 | from .jddc_config import domain2actions 49 | 50 | act2domain = {} 51 | for line in domain2actions.split('\n'): 52 | domain = line[:line.index('[') - 1].strip() 53 | actions = [x[1:-1] for x in line[line.index('[') + 1:-1].split(', ')] 54 | # print(domain, actions) 55 | for x in actions: 56 | act2domain[x] = domain 57 | data = [] 58 | for line in raw: 59 | if len(line) == 0: 60 | data.append([]) 61 | else: 62 | data[-1].append(line) 63 | x = [] 64 | emo = [] 65 | act = [] 66 | action_list = {'other': 0} 67 | for session in data: 68 | his_input_ids = [] 69 | for turn in session: 70 | role, text, action, score = turn.split('\t') 71 | score = score.split(',') 72 | 73 | if role == 'USER': 74 | x.append(copy.deepcopy(his_input_ids)) 75 | emo.append(get_main_score([int(item) - 1 for item in score])) 76 | action = action.strip() 77 | if lite: 78 | action = act2domain.get(action, 'other') 79 | if action not in action_list: 80 | action_list[action] = len(action_list) 81 | act.append(action_list[action]) 82 | ids = tokenizer.encode(text.strip())[1:] 83 | his_input_ids.append(ids) 84 | 85 | action_num = len(action_list) 86 | data = [x, emo, act, action_num] 87 | write_pkl(data, f'{dirname}-{name}.pkl') 88 | return data 89 | 90 | 91 | class HierarchicalData(Dataset): 92 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 93 | self.x = x 94 | self.act = act 95 | self.dialog_used = dialog_used 96 | 97 | if up_sampling: 98 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 99 | enhance_idx = enhance_idx * 10 100 | enhance_x = [x[idx] for idx in enhance_idx] 101 | enhance_act = [act[idx] for idx in enhance_idx] 102 | self.x = x + enhance_x 103 | self.act = act + enhance_act 104 | 105 | def __getitem__(self, index): 106 | x = [torch.tensor([101])] * (self.dialog_used - len(self.x[index])) + \ 107 | [torch.tensor([101] + item[:64]) for item in self.x[index][-self.dialog_used:]] 108 | act = self.act[index] 109 | return x, act 110 | 111 | def __len__(self): 112 | return len(self.x) 113 | 114 | 115 | class FlatData(Dataset): 116 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 117 | self.x = x 118 | self.act = act 119 | self.dialog_used = dialog_used 120 | 121 | if up_sampling: 122 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 123 | enhance_idx = enhance_idx * 10 124 | enhance_x = [x[idx] for idx in enhance_idx] 125 | enhance_act = [act[idx] for idx in enhance_idx] 126 | self.x = x + enhance_x 127 | self.act = act + enhance_act 128 | 129 | def __getitem__(self, index): 130 | seq = sum([item[:64] for item in self.x[index]], []) 131 | x = torch.tensor([101] + seq[-500:]) 132 | act = self.act[index] 133 | return x, act 134 | 135 | def __len__(self): 136 | return len(self.x) 137 | 138 | 139 | def collate_fn(data): 140 | x, act = zip(*data) 141 | bc_size = len(x) 142 | dialog_his = len(x[0]) 143 | x = [item for dialog in x for item in dialog] 144 | x = pad_sequence(x, batch_first=True, padding_value=0) 145 | x = x.view(bc_size, dialog_his, -1) 146 | 147 | return {'input_ids': x, 148 | 'act': torch.tensor(act).long() 149 | } 150 | 151 | 152 | 
def flat_collate_fn(data): 153 | x, act = zip(*data) 154 | x = pad_sequence(x, batch_first=True, padding_value=0) 155 | 156 | return {'input_ids': x, 157 | 'act': torch.tensor(act).long() 158 | } 159 | 160 | 161 | def train(fold=0, data_name='dstc8', model_name='HiGRU+ATTN'): 162 | print('[TRAIN] JDDC') 163 | dialog_used = 10 164 | 165 | data_name = data_name.replace('\r', '') 166 | model_name = model_name.replace('\r', '') 167 | 168 | print('dialog used', dialog_used) 169 | 170 | name = f'{data_name}_{model_name}_{fold}' 171 | print('TRAIN ::', name) 172 | 173 | save_path = f'outputs/{data_name}_emo/{model_name}_{fold}' 174 | 175 | tokenizer = BertTokenizer.from_pretrained('bert-base-chinese') 176 | 177 | x, emo, act, action_num = load_jddc(f'dataset/{data_name}', tokenizer) 178 | 179 | from .models import GRU, GRUAttention, BERTBackbone 180 | from .models import HierarchicalAttention, Hierarchical, ClassModel 181 | if model_name == 'HiGRU+ATTN': 182 | model = HierarchicalAttention(backbone=GRUAttention(vocab_size=tokenizer.vocab_size), class_num=[5]) 183 | model = model.cuda() 184 | optimizer = AdamW(model.parameters(), 1e-4) 185 | batch_size = 16 186 | DataFunc = HierarchicalData 187 | cf = collate_fn 188 | elif model_name == 'HiGRU': 189 | model = Hierarchical(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[5]) 190 | model = model.cuda() 191 | optimizer = AdamW(model.parameters(), 1e-4) 192 | batch_size = 16 193 | DataFunc = HierarchicalData 194 | cf = collate_fn 195 | elif model_name == 'GRU': 196 | model = ClassModel(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[5]) 197 | model = model.cuda() 198 | optimizer = AdamW(model.parameters(), 1e-4) 199 | batch_size = 16 200 | DataFunc = FlatData 201 | cf = flat_collate_fn 202 | elif model_name == 'BERT': 203 | model = ClassModel(backbone=BERTBackbone(layers_used=2, name='bert-base-chinese'), class_num=[5]) 204 | model = model.cuda() 205 | optimizer = AdamW(model.parameters(), 2e-5) 206 | batch_size = 6 207 | DataFunc = FlatData 208 | cf = flat_collate_fn 209 | else: 210 | print('[unknown model name]') 211 | return 212 | 213 | ll = int(len(x) / 10) 214 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 215 | train_act = emo[:ll * fold] + emo[ll * (fold + 1):] 216 | 217 | test_x = x[ll * fold:ll * (fold + 1)] 218 | test_act = emo[ll * fold:ll * (fold + 1)] 219 | 220 | print(len(train_x), len(test_x)) 221 | print() 222 | best_result = [0. 
for _ in range(4)] 223 | for i in range(100): 224 | print('train epoch', i, name) 225 | train_loader = DataLoader(DataFunc(train_x, train_act, dialog_used=dialog_used, up_sampling=True), 226 | batch_size=batch_size, shuffle=True, num_workers=2, collate_fn=cf) 227 | # tk0 = tqdm(train_loader, total=len(train_loader)) 228 | tk0 = train_loader 229 | act_acc = [] 230 | model.train() 231 | for j, batch in enumerate(tk0): 232 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 233 | act = batch['act'].cuda() 234 | act_loss = F.cross_entropy(act_pred, act) 235 | loss = act_loss 236 | 237 | loss.backward() 238 | optimizer.step() 239 | optimizer.zero_grad() 240 | act_acc.append((act_pred.argmax(dim=-1) == act).sum().item() / act.size(0)) 241 | 242 | # tk0.set_postfix(act_acc=round(sum(act_acc) / max(1, len(act_acc)), 4)) 243 | torch.save(model.state_dict(), f'outputs/{name}_{i}.pt') 244 | # print('test epoch', i) 245 | test_result = test(model, DataFunc(test_x, test_act, dialog_used=dialog_used), f'{save_path}_{i}.txt', cf) 246 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 247 | print(f'text_result={test_result}') 248 | print(f'best_result={best_result}') 249 | print() 250 | 251 | 252 | def test(model, test_data, save_path, cf): 253 | test_loader = DataLoader(test_data, batch_size=6, shuffle=False, num_workers=0, collate_fn=cf) 254 | # tk0 = tqdm(test_loader, total=len(test_loader)) 255 | tk0 = test_loader 256 | prediction = [] 257 | label = [] 258 | 259 | model.eval() 260 | for j, batch in enumerate(tk0): 261 | act = batch['act'].cuda() 262 | with torch.no_grad(): 263 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 264 | prediction.extend(act_pred.argmax(dim=-1).cpu().tolist()) 265 | label.extend(act.cpu().tolist()) 266 | 267 | recall = [[0, 0] for _ in range(5)] 268 | for p, l in zip(prediction, label): 269 | recall[l][1] += 1 270 | recall[l][0] += int(p == l) 271 | recall_value = [item[0] / max(item[1], 1) for item in recall] 272 | print('Recall value:', recall_value) 273 | print('Recall:', recall) 274 | UAR = sum(recall_value) / len(recall_value) 275 | kappa = cohen_kappa_score(prediction, label) 276 | rho = spearman(prediction, label) 277 | 278 | bi_pred = [int(item < 2) for item in prediction] 279 | bi_label = [int(item < 2) for item in label] 280 | bi_recall = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if l == 1]) / max(bi_label.count(1), 1) 281 | bi_precision = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if p == 1]) / max(bi_pred.count(1), 1) 282 | bi_f1 = 2 * bi_recall * bi_precision / max((bi_recall + bi_precision), 1) 283 | 284 | with open(save_path, 'w', encoding='utf-8') as f: 285 | for p, l in zip(prediction, label): 286 | f.write(f'{p}, {l}\n') 287 | 288 | return UAR, kappa, rho, bi_f1 289 | -------------------------------------------------------------------------------- /baselines/train_sat.py: -------------------------------------------------------------------------------- 1 | from transformers import AdamW, BertTokenizer, BertModel 2 | from torch.utils.data import Dataset, DataLoader 3 | from torch.nn.utils.rnn import pad_sequence 4 | import torch.nn.functional as F 5 | import copy 6 | import torch 7 | import warnings 8 | import numpy as np 9 | import os 10 | import pickle 11 | from sklearn.metrics import cohen_kappa_score 12 | from .spearman import spearman 13 | 14 | warnings.filterwarnings("ignore") 15 | 16 | 17 | def write_pkl(obj, filename): 18 | with open(filename, 'wb') as f: 19 | pickle.dump(obj, f) 20 | 21 | 
22 | def read_pkl(filename): 23 | with open(filename, 'rb') as f: 24 | return pickle.load(f) 25 | 26 | 27 | def get_main_score(scores): 28 | number = [0, 0, 0, 0, 0] 29 | for item in scores: 30 | number[item] += 1 31 | score = np.argmax(number) 32 | return score 33 | 34 | 35 | def load_data(dirname, tokenizer): 36 | print(dirname) 37 | name = 'hierarchical_data' 38 | 39 | if os.path.exists(f'{dirname}-{name}.pkl'): 40 | return read_pkl(f'{dirname}-{name}.pkl') 41 | 42 | print('tokenized data') 43 | raw = [line[:-1] for line in open(dirname, encoding='utf-8')] 44 | data = [] 45 | for line in raw: 46 | if line == '': 47 | data.append([]) 48 | else: 49 | data[-1].append(line) 50 | x = [] 51 | emo = [] 52 | act = [] 53 | action_list = {} 54 | for session in data: 55 | his_input_ids = [] 56 | for turn in session: 57 | role, text, action, score = turn.split('\t') 58 | score = score.split(',') 59 | action = action.split(',') 60 | action = action[0] 61 | if role == 'USER': 62 | x.append(copy.deepcopy(his_input_ids)) 63 | emo.append(get_main_score([int(item) - 1 for item in score])) 64 | action = action.strip() 65 | if action not in action_list: 66 | action_list[action] = len(action_list) 67 | act.append(action_list[action]) 68 | 69 | ids = tokenizer.encode(text.strip())[1:] 70 | his_input_ids.append(ids) 71 | 72 | action_num = len(action_list) 73 | data = [x, emo, act, action_num] 74 | write_pkl(data, f'{dirname}-{name}.pkl') 75 | return data 76 | 77 | 78 | class HierarchicalData(Dataset): 79 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 80 | self.x = x 81 | self.act = act 82 | self.dialog_used = dialog_used 83 | 84 | if up_sampling: 85 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 86 | enhance_idx = enhance_idx * 10 87 | enhance_x = [x[idx] for idx in enhance_idx] 88 | enhance_act = [act[idx] for idx in enhance_idx] 89 | self.x = x + enhance_x 90 | self.act = act + enhance_act 91 | 92 | def __getitem__(self, index): 93 | x = [torch.tensor([101])] * (self.dialog_used - len(self.x[index])) + \ 94 | [torch.tensor([101] + item[:64]) for item in self.x[index][-self.dialog_used:]] 95 | act = self.act[index] 96 | return x, act 97 | 98 | def __len__(self): 99 | return len(self.x) 100 | 101 | 102 | class FlatData(Dataset): 103 | def __init__(self, x, act, dialog_used=5, up_sampling=False): 104 | self.x = x 105 | self.act = act 106 | self.dialog_used = dialog_used 107 | 108 | if up_sampling: 109 | enhance_idx = [idx for idx, a in enumerate(act) if a != 2] 110 | enhance_idx = enhance_idx * 10 111 | enhance_x = [x[idx] for idx in enhance_idx] 112 | enhance_act = [act[idx] for idx in enhance_idx] 113 | self.x = x + enhance_x 114 | self.act = act + enhance_act 115 | 116 | def __getitem__(self, index): 117 | seq = sum([item[:64] for item in self.x[index]], []) 118 | x = torch.tensor([101] + seq[-500:]) 119 | act = self.act[index] 120 | return x, act 121 | 122 | def __len__(self): 123 | return len(self.x) 124 | 125 | 126 | def collate_fn(data): 127 | x, act = zip(*data) 128 | bc_size = len(x) 129 | dialog_his = len(x[0]) 130 | x = [item for dialog in x for item in dialog] 131 | x = pad_sequence(x, batch_first=True, padding_value=0) 132 | x = x.view(bc_size, dialog_his, -1) 133 | 134 | return {'input_ids': x, 135 | 'act': torch.tensor(act).long() 136 | } 137 | 138 | 139 | def flat_collate_fn(data): 140 | x, act = zip(*data) 141 | x = pad_sequence(x, batch_first=True, padding_value=0) 142 | 143 | return {'input_ids': x, 144 | 'act': torch.tensor(act).long() 145 | } 146 | 147 | 148 
| def train(fold=0, data_name='dstc8', model_name='HiGRU+ATTN', dialog_used=10): 149 | print('[TRAIN]') 150 | 151 | data_name = data_name.replace('\r', '') 152 | model_name = model_name.replace('\r', '') 153 | 154 | print('dialog used', dialog_used) 155 | 156 | name = f'{data_name}_{model_name}_{fold}' 157 | print('TRAIN ::', name) 158 | 159 | save_path = f'outputs/{data_name}_emo/{model_name}_{fold}' 160 | 161 | tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') 162 | # x, emo, act1, act2, action_num1, action_num2 = load_ccpe(f'dataset/{data_name}', tokenizer) 163 | x, emo, act, action_num = load_data(f'dataset/{data_name}.txt', tokenizer) 164 | 165 | # print(action_num) 166 | from .models import GRU, GRUAttention, BERTBackbone 167 | from .models import HierarchicalAttention, Hierarchical, ClassModel 168 | if model_name == 'HiGRU+ATTN': 169 | model = HierarchicalAttention(backbone=GRUAttention(vocab_size=tokenizer.vocab_size), class_num=[5]) 170 | model = model.cuda() 171 | optimizer = AdamW(model.parameters(), 1e-4) 172 | batch_size = 16 173 | DataFunc = HierarchicalData 174 | cf = collate_fn 175 | elif model_name == 'HiGRU': 176 | model = Hierarchical(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[5]) 177 | model = model.cuda() 178 | optimizer = AdamW(model.parameters(), 1e-4) 179 | batch_size = 16 180 | DataFunc = HierarchicalData 181 | cf = collate_fn 182 | elif model_name == 'GRU': 183 | model = ClassModel(backbone=GRU(vocab_size=tokenizer.vocab_size), class_num=[5]) 184 | model = model.cuda() 185 | optimizer = AdamW(model.parameters(), 1e-4) 186 | batch_size = 16 187 | DataFunc = FlatData 188 | cf = flat_collate_fn 189 | elif model_name == 'BERT': 190 | model = ClassModel(backbone=BERTBackbone(layers_used=2, name='bert-base-uncased'), class_num=[5]) 191 | model = model.cuda() 192 | optimizer = AdamW(model.parameters(), 2e-5) 193 | batch_size = 6 194 | DataFunc = FlatData 195 | cf = flat_collate_fn 196 | else: 197 | print('[unknown model name]') 198 | return 199 | 200 | ll = int(len(x) / 10) 201 | train_x = x[:ll * fold] + x[ll * (fold + 1):] 202 | train_act = emo[:ll * fold] + emo[ll * (fold + 1):] 203 | 204 | test_x = x[ll * fold:ll * (fold + 1)] 205 | test_act = emo[ll * fold:ll * (fold + 1)] 206 | 207 | print(len(train_x), len(test_x)) 208 | print() 209 | best_result = [0. 
for _ in range(4)] 210 | for i in range(100): 211 | print('train epoch', i, name) 212 | train_loader = DataLoader(DataFunc(train_x, train_act, dialog_used=dialog_used, up_sampling=True), 213 | batch_size=batch_size, shuffle=True, num_workers=2, collate_fn=cf) 214 | # tk0 = tqdm(train_loader, total=len(train_loader)) 215 | tk0 = train_loader 216 | act_acc = [] 217 | model.train() 218 | for j, batch in enumerate(tk0): 219 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 220 | act = batch['act'].cuda() 221 | act_loss = F.cross_entropy(act_pred, act) 222 | loss = act_loss 223 | 224 | loss.backward() 225 | optimizer.step() 226 | optimizer.zero_grad() 227 | act_acc.append((act_pred.argmax(dim=-1) == act).sum().item() / act.size(0)) 228 | 229 | # tk0.set_postfix(act_acc=round(sum(act_acc) / max(1, len(act_acc)), 4)) 230 | torch.save(model.state_dict(), f'outputs/{name}_{i}.pt') 231 | # print('test epoch', i) 232 | test_result = test(model, DataFunc(test_x, test_act, dialog_used=dialog_used), f'{save_path}_{i}.txt', cf) 233 | best_result = [max(i1, i2) for i1, i2 in zip(test_result, best_result)] 234 | print(f'text_result={test_result}') 235 | print(f'best_result={best_result}') 236 | print() 237 | 238 | 239 | def test(model, test_data, save_path, cf): 240 | test_loader = DataLoader(test_data, batch_size=6, shuffle=False, num_workers=0, collate_fn=cf) 241 | # tk0 = tqdm(test_loader, total=len(test_loader)) 242 | tk0 = test_loader 243 | prediction = [] 244 | label = [] 245 | 246 | model.eval() 247 | for j, batch in enumerate(tk0): 248 | act = batch['act'].cuda() 249 | with torch.no_grad(): 250 | act_pred, *o = model(input_ids=batch['input_ids'].cuda()) 251 | prediction.extend(act_pred.argmax(dim=-1).cpu().tolist()) 252 | label.extend(act.cpu().tolist()) 253 | 254 | recall = [[0, 0] for _ in range(5)] 255 | for p, l in zip(prediction, label): 256 | recall[l][1] += 1 257 | recall[l][0] += int(p == l) 258 | recall_value = [item[0] / max(item[1], 1) for item in recall] 259 | print('Recall value:', recall_value) 260 | print('Recall:', recall) 261 | UAR = sum(recall_value) / len(recall_value) 262 | kappa = cohen_kappa_score(prediction, label) 263 | rho = spearman(prediction, label) 264 | 265 | bi_pred = [int(item < 2) for item in prediction] 266 | bi_label = [int(item < 2) for item in label] 267 | bi_recall = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if l == 1]) / max(bi_label.count(1), 1) 268 | bi_precision = sum([int(p == l) for p, l in zip(bi_pred, bi_label) if p == 1]) / max(bi_pred.count(1), 1) 269 | bi_f1 = 2 * bi_recall * bi_precision / max((bi_recall + bi_precision), 1) 270 | 271 | with open(save_path, 'w', encoding='utf-8') as f: 272 | for p, l in zip(prediction, label): 273 | f.write(f'{p}, {l}\n') 274 | 275 | return UAR, kappa, rho, bi_f1 276 | -------------------------------------------------------------------------------- /baselines/train_sat.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | data='jddc' 4 | model='BERT' 5 | 6 | for i in {0..9} 7 | do 8 | echo CUDA=$[ i%4 ] log/${data}_${model}_$i.log 9 | CUDA_VISIBLE_DEVICES=$[ i%4 ] nohup python -u driver_sat.py -fold=$i --data=${data} --model=${model} > log/${data}_${model}_$i.log 2>&1 & 10 | done 11 | -------------------------------------------------------------------------------- /dataset/JDDC-ActionList.txt: -------------------------------------------------------------------------------- 1 | 配送 ['配送周期', '物流全程跟踪', '联系配送', '什么时间出库', '配送方式', '返回方式', '预约配送时间', 
'少商品与少配件', '拒收', '能否自提', '能否配送', '售前运费多少'时间', '发错货', '下单地址填写', '发货检查', '京东特色配送', '提前配送', '填写返件运单号', '怎么确认收货', '快递单号不正确', '自提时间', '发货时间未到不能出库', '送货上门附加手续费', '夺宝岛配送时间', '夺宝岛运费', '配送超区'] 2 | 退换 ['保修返修及退换货政策', '正常退款周期', '返修退换货处理周期', '售后运费', '申请退款', '退款到哪儿', '返修退换货拆包装', '取消退款', '在哪里查询退款', '退款异常'团购退款', '补差价'] 3 | 发票 ['发票退换修改', '查看发票', '是否提供发票', '填写发票信息', '增票相关', '电子发票', '补发票', '返修退换货发票'] 4 | 客服 ['联系客服', '联系客户', '联系商家', '联系售后', '投诉', '夺宝岛售后'] 5 | 产品咨询 ['属性咨询', '使用咨询', '商品检索', '商品价格咨询', '补货时间', '生产日期', '正品保障', '包装清单', '库存状态', '商品介绍', '外包装', '商品比较', '保修期方区别', '补发', '预约抢购', '为什么显示无货', '开箱验货', '全球购解释', '有什么颜色', '套装购买', '是否全国联保', '是什么颜色', '图片差异', '配件推荐', '发表商品咨询', '爱回收解释', '夺宝岛商品来源', '金融理财', 'DIY装机', '众筹说明', '定金解释'] 6 | 价保 ['价保申请流程', '价保条件', '价保记录查询', '无法申请价保'] 7 | 支付 ['货到付款', '支付方式', '白条还款方式', '公司转账', '在线支付', '白条分期手续费', '白条开通', '无法购买提交', '余额查询', '支付到账时间', '支付密码', '余额使用'库转入转出', '微信支付', '超期未还款', '网银钱包提现异常', '代客户充值', '免密支付', '充值失败', '网银钱包定义', '网银钱包开通', '充值到账异常', '多次支付退款', '微信下单', '夺宝岛支付方式'] 8 | bug ['下单后无记录', '无法加入购物车', '竞拍异常', '地址信息无法保存'] 9 | 维修 ['售后维修点查询'] 10 | 评价 ['查看评价晒单', '删除修改评价晒单', '评价晒单返券和赠品', '评价晒单异常', '评价晒单送积分京豆'] 11 | 预定 ['机票相关', '购买机票', '火车票', '酒店预订'] 12 | 13 | -------------------------------------------------------------------------------- /dataset/README.md: -------------------------------------------------------------------------------- 1 | ## User Satisfaction Simulation 2 | 3 | In *JDDC.txt*, *SGD.txt*, *MWOZ.txt*, *ReDial.txt*, and *CCPE.txt*, each line is separated by "\t": 4 | 5 | - speaker role (USER or SYSTEM), 6 | - text, 7 | - action, 8 | - satisfaction (repeated annotation are separated by ","), 9 | - explanation text (only for JDDC at dialogue level, and repeated annotation are separated by ";") 10 | 11 | And sessions are separated by blank lines. 12 | 13 | Since the original dataset does not provide actions, we use the action annotation provided by [IARD](https://github.com/wanlingcai1997/umap_2020_IARD) and included it in *ReDial-action.txt*. 14 | 15 | The JDDC data set provides the action of each user utterances, including 234 categories. We compress them into 12 categories based on a manually defined classification method (see *JDDC-ActionList.txt*). 16 | 17 | 18 | 19 | Example from SGD 20 | 21 | 22 | ``` 23 | Role \t Text \t Action \t Sat1,Sat2,Sat3 24 | 25 | USER I would like to find some Oneway Flights for my upcoming trip. INFORM_INTENT 2,3,3 26 | SYSTEM Sure, Where are planning to make a trip, please mention the destination and departure points? When do you plan to leave? REQUEST 27 | USER I am leaving form Washington to Mexico city on the 10th. INFORM 3,3,3 28 | SYSTEM There is search results for your requirement, American Airlines outbound flight is leaves at 10:15 am and it has 1 stop. The price of the ticket is $243. OFFER 29 | ``` 30 | 31 | Example from JDDC 32 | 33 | ``` 34 | Role \t Text \t Action \t Sat1,Sat2,Sat3 \t Exp1;Exp2;Exp3(only for dialogue-level) 35 | 36 | SYSTEM 你是商品有问题要换货吗 37 | USER 人家都说可以换只是问我要自己去换还是要上门来给我换 保修返修及退换货政策 2,2,3 38 | USER 我只是要问售后点在哪里而已 保修返修及退换货政策 1,1,3 39 | USER 你不懂就找个懂的过来 保修返修及退换货政策 1,1,1 40 | USER 我没时间浪费 保修返修及退换货政策 1,1,3 41 | SYSTEM 好吧 42 | SYSTEM 很抱歉 43 | SYSTEM 我问过了您只能通过网上处理 44 | SYSTEM 收货点我们真的查不到 45 | SYSTEM 售后点我们真的查不到 46 | USER 无语,查不到就说查不到还要我处理什么 OTHER 2,1,2 47 | SYSTEM 抱歉 48 | SYSTEM 请问还有其他还可以帮到您的吗? 
49 | SYSTEM 抱歉了祝您生活愉快再见 50 | SYSTEM 您要是实在找不到的话,就去查地图吧,这是我最后可以做的了 51 | USER OVERALL 1,1,1 system不能为用户解决问题 不能理解用户的意思;system完全没有解决用户的问题,也没有理解客户意图,导致客户体验很差;system未能理解用户的需求,用户体验差 52 | ``` -------------------------------------------------------------------------------- /imgs/action-prediction.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunnweiwei/user-satisfaction-simulation/8ba06bd92570fd94d3ec0cb828aad45172be71b2/imgs/action-prediction.png -------------------------------------------------------------------------------- /imgs/satisfaction-prediction.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunnweiwei/user-satisfaction-simulation/8ba06bd92570fd94d3ec0cb828aad45172be71b2/imgs/satisfaction-prediction.png --------------------------------------------------------------------------------
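
Below is a minimal loading sketch (not part of the released code) for the TXT format described in the dataset README above: it splits a file into sessions on blank lines, reads the tab-separated fields, and takes a majority vote over the repeated satisfaction labels. The function names and file path are illustrative assumptions, and the JDDC dialogue-level OVERALL lines (which carry an explanation) may need extra handling that this sketch glosses over.

```
def load_sessions(path):
    """Read one USS file (e.g. dataset/SGD.txt) into a list of sessions."""
    sessions, current = [], []
    for line in open(path, encoding='utf-8'):
        line = line.rstrip('\r\n')
        if not line:                      # blank line ends a session
            if current:
                sessions.append(current)
                current = []
            continue
        # pad to five fields: role, text, action, satisfaction, explanation
        role, text, action, sat, expl = (line.split('\t') + [''] * 5)[:5]
        current.append({
            'role': role,                 # USER or SYSTEM
            'text': text,
            'action': action,
            # repeated annotations are comma-separated, e.g. "2,3,3"
            'satisfaction': [int(s) for s in sat.split(',') if s.strip().isdigit()],
            'explanation': expl,          # JDDC dialogue level only
        })
    if current:
        sessions.append(current)
    return sessions


def majority_satisfaction(scores):
    # majority vote over the repeated 1-5 labels; ties resolve to the lower
    # rating, mirroring the argmax-over-counts used in the baselines
    counts = [0] * 5
    for s in scores:
        counts[s - 1] += 1
    return counts.index(max(counts)) + 1


sessions = load_sessions('dataset/SGD.txt')
rated = [t for s in sessions for t in s if t['role'] == 'USER' and t['satisfaction']]
print(len(sessions), 'sessions;', len(rated), 'rated user turns')
print('majority label of the first rated turn:', majority_satisfaction(rated[0]['satisfaction']))
```

The tie-breaking in `majority_satisfaction` (lower rating wins) is chosen to mirror the `get_main_score` helper in the baseline scripts, which applies `np.argmax` to the per-rating counts.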