├── model
│   ├── best_model.pt
│   ├── best_model.info
│   ├── best_model_finetuned.pt
│   └── best_model_finetuned.info
├── requirements.txt
├── LICENSE
├── README.md
├── pytorch_src
│   ├── create_utt_embed.py
│   ├── main.py
│   ├── calculate_correlations.py
│   ├── preprocess_data.py
│   └── engagement_classifier.py
└── data
    ├── DD_finetune_valid.csv
    ├── humanAMT_engscores_utt.csv
    ├── 50convs_AMT.txt
    └── DD_finetune_train.csv
/model/best_model.pt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PlusLabNLP/PredictiveEngagement/HEAD/model/best_model.pt
--------------------------------------------------------------------------------
/model/best_model.info:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PlusLabNLP/PredictiveEngagement/HEAD/model/best_model.info
--------------------------------------------------------------------------------
/model/best_model_finetuned.pt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PlusLabNLP/PredictiveEngagement/HEAD/model/best_model_finetuned.pt
--------------------------------------------------------------------------------
/model/best_model_finetuned.info:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PlusLabNLP/PredictiveEngagement/HEAD/model/best_model_finetuned.info
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | python==3.6
2 | numpy==1.17.3
3 | nltk==3.4.5
4 | scikit-learn==0.21.3
5 | torch==1.3.1
6 | torchvision==0.4.2
7 | torchtext==0.4.0
8 | matplotlib==3.1.1
9 | bert-serving-server #(cite: https://github.com/hanxiao/bert-as-service)
10 | bert-serving-client #(cite: https://github.com/hanxiao/bert-as-service)
11 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 sarikghazarian
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # PredictiveEngagement
2 |
3 | This repository contains the code for the [Predictive Engagement](https://arxiv.org/pdf/1911.01456.pdf) paper. If you use it, please cite it as:
4 | ```
5 | @inproceedings{ghazarian2020predictive,
6 | title={Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems},
7 | author={Ghazarian, Sarik and Weischedel, Ralph and Galstyan, Aram and Peng, Nanyun},
8 | booktitle={The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)},
9 | paper_url={https://arxiv.org/pdf/1911.01456},
10 | pages={7789–7796},
11 | year={2020}
12 | }
13 | ```
14 |
15 |
16 | For any comments, issues, or ideas, please feel free to contact [me](mailto:sarikgha@usc.edu).
17 |
18 |
19 | ## Steps to set up
20 |
21 | ### Install Requirements
22 | Use any virtualenv manager to install all the packages listed in the requirements.txt file.
23 |
24 | To train or test the engagement classifier, or to predict engagement scores with the trained models, follow these steps:
25 |
26 | ### 1. Preprocess dataset
27 | Run preprocess_data.py in the pytorch_src directory. This script preprocesses the ConvAI dataset (the train.json file taken from http://convai.io/2017/data/) to extract the dialogs with at least one turn (a query and a reply utterance). The extracted files are needed to train and test the engagement classifier.
28 | The outputs are:
29 | * ConvAI_convs_orig : includes all 2099 conversations from ConvAI dataset with at least one turn of dialog
30 | * ConvAI_convs : includes all conversations from ConvAI except the 50 dialogs used for AMT experiments (Table 2. of paper)
31 | * ConvAI_convs_train, ConvAI_convs_test, ConvAI_convs_valid: include 60/20/20 percent of conversations from ConvAI dataset as train/test/valid sets
32 | * ConvAI_utts_train.csv, ConvAI_utts_test.csv, ConvAI_utts_valid.csv: train/test/valid sets of utterances from ConvAI dataset containing queries, replies and their corresponding engagement labels used for utterance-level engagement classifier.
33 |
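The 60/20/20 split listed above can be sketched as follows. This is a minimal illustration, not the code in preprocess_data.py: the function name `split_conversations`, the fixed seed, and the use of integer arithmetic for the cut points are all assumptions made for the example.

```python
import random


def split_conversations(conversations, seed=0):
    """Shuffle conversations and split them 60/20/20 into train/test/valid.

    Hypothetical helper for illustration; the repository's own split logic
    may differ (for example in shuffling or rounding).
    """
    items = list(conversations)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_train = len(items) * 60 // 100
    n_test = len(items) * 20 // 100
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    valid = items[n_train + n_test:]  # whatever remains, roughly 20 percent
    return train, test, valid


train, test, valid = split_conversations(range(100))
print(len(train), len(test), len(valid))
```

Splitting at the conversation level, rather than the utterance level, keeps all utterances of one dialog in the same set.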
34 |
35 | ### 2. Utterance embeddings
36 | To train the engagement classifier or test the trained model, you need a set of embedding files for the queries and replies, where each utterance embedding is the mean or max pooling of its word embeddings. In this paper, we used BERT contextualized embeddings.
37 | Run create_utt_embed.py with the query and reply files as input to create their embeddings using BertClient and BertServer
38 | (cite: @misc{xiao2018bertservice,title={bert-as-service},author={Xiao, Han},howpublished={\url{https://github.com/hanxiao/bert-as-service}},year={2018}}).
39 | According to https://github.com/hanxiao/bert-as-service, before running create_utt_embed.py, which acts as the BertClient, you need to start BertServer with the following command, where model_dir is the directory into which the pretrained BERT model has been downloaded:
40 |
41 | bert-serving-start -model_dir /tmp/english_L-12_H-768_A-12/ -num_worker=4 -max_seq_len=128 -pooling_strategy=REDUCE_MEAN
42 |
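To make the two pooling options concrete, here is a toy sketch of mean and max pooling over word vectors. In this repository the pooling is performed by the BERT server according to `-pooling_strategy`, so the functions `mean_pool` and `max_pool` below are hypothetical and only illustrate the arithmetic on plain Python lists.

```python
def mean_pool(word_vectors):
    """Average the word vectors dimension by dimension."""
    dim = len(word_vectors[0])
    count = len(word_vectors)
    return [sum(vec[i] for vec in word_vectors) / count for i in range(dim)]


def max_pool(word_vectors):
    """Take the maximum of the word vectors dimension by dimension."""
    dim = len(word_vectors[0])
    return [max(vec[i] for vec in word_vectors) for i in range(dim)]


# Two toy 2-dimensional "word embeddings" for one utterance
words = [[1.0, 4.0], [3.0, 0.0]]
utterance_mean = mean_pool(words)  # [2.0, 2.0]
utterance_max = max_pool(words)    # [3.0, 4.0]
print(utterance_mean, utterance_max)
```

Either way, an utterance of any length is reduced to a single fixed-size vector, which is what the classifier takes as input.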
43 |
44 | ### 3. Train/test and finetune model
45 | Run main.py to train the engagement classifier, test it, finetune it on the Daily Dialog dataset, and predict engagement scores for the queries and replies of the Daily Dialog test set.
46 | The model directory includes the engagement classifier trained on the ConvAI dataset (model/best_model.pt) and the one finetuned on the Daily Dialog set (model/best_model_finetuned.pt). Both models are based on mean pooling of word embeddings.
47 | cd into the pytorch_src/ directory and specify the mode and any parameter values you need.
48 |
49 | To train the model from scratch on ConvAI data: python main.py --mode train --pooling mean
50 | (The best model (model/best_model.pt) is an MLP classifier with [64,32,8] hidden units, lr=0.001, dropout=0.8, regularizer=0.001, Adam)
51 |
52 | To test the trained model on ConvAI test set: python main.py --mode test --pooling mean
53 |
54 | To test the trained model on 297 utterances (Table 2. in paper) annotated by AMT workers: python main.py --mode testAMT --pooling mean
55 |
56 | To finetune the trained model on the Daily Dialog dataset: python main.py --mode finetune --pooling mean
57 |
58 | To predict the engagement score based on finetuned model: python main.py --mode predict --pooling mean
59 |
60 |
61 | ### 4. Pearson and Spearman Correlations
62 | Run calculate_correlations.py to compute the Pearson and Spearman correlations between human annotations and engagement scores predicted by different models.
63 |
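As a rough illustration of what the correlation step computes, the sketch below rescales human scores to the [0, 1] range and computes Pearson and Spearman correlations in plain Python. The script itself relies on `scipy.stats.pearsonr` and `scipy.stats.spearmanr`; the helpers `min_max`, `pearson`, `ranks`, and `spearman` and the toy scores are assumptions made for this example, and p-values are not computed.

```python
import math


def min_max(values):
    """Rescale values linearly to [0, 1]; assumes they are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


raw_human = [1.0, 2.0, 3.0, 5.0]      # toy averaged human ratings
human = min_max(raw_human)            # [0.0, 0.25, 0.5, 1.0]
predicted = [0.1, 0.2, 0.4, 0.9]      # toy predicted engagement scores
print(pearson(human, predicted), spearman(human, predicted))
```

Min-max rescaling does not change either coefficient, since both are invariant under positive linear transformations of the scores.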
64 |
65 |
66 |
--------------------------------------------------------------------------------
/pytorch_src/create_utt_embed.py:
--------------------------------------------------------------------------------
import csv
import os
import pickle

from bert_serving.client import BertClient

# To create utterance embeddings, first start BertServer
# (see https://github.com/hanxiao/bert-as-service) with the following command:
#   bert-serving-start -model_dir /tmp/english_L-12_H-768_A-12/ -num_worker=4 -max_seq_len=128 -pooling_strategy=REDUCE_MEAN
# model_dir is the directory into which the pretrained Bert model has been downloaded.


def encode_unique_utterances(bc, utterances, kind, type):
    '''Encode each distinct utterance once and return a dict utterance -> embedding
    Param:
        bc: a connected BertClient
        utterances: list of utterance strings
        kind: 'query' or 'reply', used only for progress output
        type: indicate train/valid/test set, used only for progress output
    '''
    vectors = {}
    for idx, utt in enumerate(utterances):
        print('{} {} {}'.format(idx, kind, type))
        if utt not in vectors:
            # Empty utterances are encoded as the '[PAD]' token
            vectors[utt] = bc.encode([utt if utt != '' else '[PAD]'])
    # Always provide an embedding for the empty utterance
    if '' not in vectors:
        vectors[''] = bc.encode(['[PAD]'])
    return vectors


def make_Bert_embeddings(data_dir, fname, f_queries_embed, f_replies_embed, type):
    '''Create embedding file for all queries and replies in the given files
    Param:
        data_dir: the directory of data
        fname: name of the input file containing queries, replies, engagement_score
        f_queries_embed: name of the output file containing the queries bert embeddings
        f_replies_embed: name of the output file containing the replies bert embeddings
        type: indicate train/valid/test set
    '''
    foutput_q = os.path.join(data_dir, f_queries_embed)
    foutput_r = os.path.join(data_dir, f_replies_embed)

    if os.path.exists(foutput_q) and os.path.exists(foutput_r):
        print('Bert embedding files for utterances exist!')
        return
    print("Bert embedding files for utterances do not exist")

    queries, replies = [], []
    with open(os.path.join(data_dir, fname)) as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        next(csv_reader)  # skip the header row
        for row in csv_reader:
            # Keep only the first line of each utterance
            queries.append(row[1].split('\n')[0])
            replies.append(row[2].split('\n')[0])

    bc = BertClient()

    queries_vectors = encode_unique_utterances(bc, queries, 'query', type)
    with open(foutput_q, 'wb') as fwq:
        pickle.dump(queries_vectors, fwq)

    replies_vectors = encode_unique_utterances(bc, replies, 'reply', type)
    with open(foutput_r, 'wb') as fwr:
        pickle.dump(replies_vectors, fwr)


def load_Bert_embeddings(data_dir, f_queries_embed, f_replies_embed):
    '''Load embeddings of queries and replies
    Param:
        data_dir: the directory of data
        f_queries_embed: name of the input file containing the queries bert embeddings
        f_replies_embed: name of the input file containing the replies bert embeddings
    '''
    print('Loading Bert embeddings of sentences')
    queries_vectors = {}
    replies_vectors = {}

    print('query embedding')
    with open(os.path.join(data_dir, f_queries_embed), 'rb') as frq:
        dict_queries = pickle.load(frq)
    # bc.encode returns one row per input sentence; keep that single row
    for query, embeds in dict_queries.items():
        queries_vectors[query] = embeds[0]
    print('len of embeddings is ' + str(len(queries_vectors)))

    print('reply embedding')
    with open(os.path.join(data_dir, f_replies_embed), 'rb') as frr:
        dict_replies = pickle.load(frr)
    for reply, embeds in dict_replies.items():
        replies_vectors[reply] = embeds[0]
    print('len of embeddings is ' + str(len(replies_vectors)))

    return queries_vectors, replies_vectors


if __name__ == '__main__':

    data_dir = './../data/'
    pooling = 'mean'
    ifname = 'ConvAI_utts'
    dd_ifname = 'DD_finetune'
    make_Bert_embeddings(data_dir, ifname+'_train.csv', ifname+'_train_queries_embed_'+pooling, ifname+'_train_replies_embed_'+pooling, 'train')
    make_Bert_embeddings(data_dir, ifname+'_valid.csv', ifname+'_valid_queries_embed_'+pooling, ifname+'_valid_replies_embed_'+pooling, 'valid')
    make_Bert_embeddings(data_dir, ifname+'_test.csv', ifname+'_test_queries_embed_'+pooling, ifname+'_test_replies_embed_'+pooling, 'test')
    make_Bert_embeddings(data_dir, 'humanAMT_engscores_utt.csv', 'humanAMT_queries_embed_'+pooling, 'humanAMT_replies_embed_'+pooling, 'testAMT')
    make_Bert_embeddings(data_dir, dd_ifname+'_train.csv', dd_ifname+'_queries_train_embed_'+pooling, dd_ifname+'_replies_train_embed_'+pooling, 'train')
    make_Bert_embeddings(data_dir, dd_ifname+'_valid.csv', dd_ifname+'_queries_valid_embed_'+pooling, dd_ifname+'_replies_valid_embed_'+pooling, 'valid')
    make_Bert_embeddings(data_dir, 'DD_queries_generated_replies.csv', 'DD_queries_embed_'+pooling, 'DD_generated_replies_embed_'+pooling, 'test')
--------------------------------------------------------------------------------
/pytorch_src/main.py:
--------------------------------------------------------------------------------
import argparse

from engagement_classifier import Engagement_cls


if __name__ == "__main__":

    parser = argparse.ArgumentParser(description='Parameters for engagement classification')
    # nargs='+' keeps the value a list when it is overridden on the command line,
    # e.g. --mlp_hidden_dim 64 32 8
    parser.add_argument('--mlp_hidden_dim', type=int, nargs='+', default=[64, 32, 8],
                        help='number of hidden units in each mlp layer')
    parser.add_argument('--epochs', type=int, default=400,
                        help='number of training epochs')
    parser.add_argument('--lr', type=float, default=0.001,
                        help='learning rate')
    parser.add_argument('--batch_size', type=int, default=100,
                        help='batch size')
    parser.add_argument('--dropout', type=float, default=0.8,
                        help='dropout rate')
    parser.add_argument('--pooling', type=str, default='mean',
                        help='pooling type for getting sentence embeddings from words embeddings')
    parser.add_argument('--optimizer', type=str, default='Adam',
                        help='optimizer for training model')
    parser.add_argument('--reg', type=float, default=0.001,
                        help='l2 regularizer for training model')
    parser.add_argument('--mode', type=str, required=True,
                        choices=['train', 'test', 'testAMT', 'finetune', 'predict'],
                        help="""train: to train the model
                        test: to test the model on ConvAI test set
                        testAMT: to test the model on 297 utterances (50 randomly selected dialogs from ConvAI) annotated by Amazon turkers
                        finetune: to finetune the trained model on 300 pairs selected from Daily Dialogue dataset annotated by Amazon turkers
                        predict: to predict engagement scores for query and generated replies (based on attention-based seq-to-seq model) of Daily Dialogue dataset""")
    args = parser.parse_args()

    data_dir = './../data/'
    train_dir = './../model/'

    if args.mode == "train":
        print('ConvAI_utts_train_queries_embed_'+args.pooling)
        ftrain, fvalid = ['ConvAI_utts_train.csv', 'ConvAI_utts_valid.csv']
        # files including queries and replies embeddings in train/valid sets
        ftrain_queries_embed = 'ConvAI_utts_train_queries_embed_'+args.pooling
        ftrain_replies_embed = 'ConvAI_utts_train_replies_embed_'+args.pooling
        fvalid_queries_embed = 'ConvAI_utts_valid_queries_embed_'+args.pooling
        fvalid_replies_embed = 'ConvAI_utts_valid_replies_embed_'+args.pooling
        eng_cls = Engagement_cls(train_dir, args.batch_size, args.mlp_hidden_dim, args.epochs,
                                 args.reg, args.lr, args.dropout, args.optimizer,
                                 ftrain_queries_embed=ftrain_queries_embed, ftrain_replies_embed=ftrain_replies_embed,
                                 fvalid_queries_embed=fvalid_queries_embed, fvalid_replies_embed=fvalid_replies_embed)
        eng_cls.prepare_data(data_dir, ftrain, fvalid)
        eng_cls.train()

    if args.mode == "test":
        ftest = 'ConvAI_utts_test.csv'
        ftest_queries_embed = 'ConvAI_utts_test_queries_embed_'+args.pooling
        ftest_replies_embed = 'ConvAI_utts_test_replies_embed_'+args.pooling
        eng_cls = Engagement_cls(train_dir, args.batch_size, args.mlp_hidden_dim, args.epochs,
                                 args.reg, args.lr, args.dropout, args.optimizer,
                                 ftest_queries_embed=ftest_queries_embed, ftest_replies_embed=ftest_replies_embed)
        eng_cls.prepare_data(data_dir, ftest=ftest)
        eng_cls.test(ftest)

    if args.mode == "testAMT":
        # humanAMT_engscores_utt.csv: 297 utterances whose engagement scores are annotated by AMT workers
        ftest = 'humanAMT_engscores_utt.csv'
        # query/reply embeddings of 50 randomly selected conversations (297 utterances) annotated by Amazon turkers
        fhuman_test_queries_embed = 'humanAMT_queries_embed_'+args.pooling
        fhuman_test_replies_embed = 'humanAMT_replies_embed_'+args.pooling
        eng_cls = Engagement_cls(train_dir, args.batch_size, args.mlp_hidden_dim, args.epochs,
                                 args.reg, args.lr, args.dropout, args.optimizer,
                                 ftest_queries_embed=fhuman_test_queries_embed, ftest_replies_embed=fhuman_test_replies_embed)
        eng_cls.prepare_data(data_dir, ftest=ftest)
        eng_cls.test(ftest)

    if args.mode == "finetune":
        # DD_finetune_train.csv and DD_finetune_valid.csv: train/valid sets from 300 pairs of Daily Dialogue dataset annotated by Amazon turkers
        ftrain, fvalid = ['DD_finetune_train.csv', 'DD_finetune_valid.csv']
        ftrain_queries_embed = 'DD_finetune_queries_train_embed_'+args.pooling
        ftrain_replies_embed = 'DD_finetune_replies_train_embed_'+args.pooling
        fvalid_queries_embed = 'DD_finetune_queries_valid_embed_'+args.pooling
        fvalid_replies_embed = 'DD_finetune_replies_valid_embed_'+args.pooling
        eng_cls = Engagement_cls(train_dir, args.batch_size, args.mlp_hidden_dim, args.epochs,
                                 args.reg, args.lr, args.dropout, args.optimizer,
                                 ftrain_queries_embed=ftrain_queries_embed, ftrain_replies_embed=ftrain_replies_embed,
                                 fvalid_queries_embed=fvalid_queries_embed, fvalid_replies_embed=fvalid_replies_embed)
        eng_cls.prepare_data(data_dir, ftrain=ftrain, fvalid=fvalid)
        eng_cls.train(finetune=True)

    if args.mode == "predict":
        # The file including queries and generated replies
        ftest = 'DD_queries_generated_replies.csv'
        ftest_queries_embed = 'DD_queries_embed_'+args.pooling
        ftest_replies_embed = 'DD_generated_replies_embed_'+args.pooling
        eng_cls = Engagement_cls(train_dir, args.batch_size, args.mlp_hidden_dim, args.epochs,
                                 args.reg, args.lr, args.dropout, args.optimizer,
                                 ftest_queries_embed=ftest_queries_embed, ftest_replies_embed=ftest_replies_embed)
        eng_cls.prepare_data(data_dir, ftest=ftest)
        eng_cls.generate_eng_score('DD_replies.txt', 'DD_queries_genreplies_eng_{}.txt'.format(args.pooling))

        # The file including queries and human-written (ground-truth) replies
        ftest = 'DD_queries_groundtruth_replies.csv'
        ftest_queries_embed = 'DD_queries_embed_'+args.pooling
        ftest_replies_embed = 'DD_groundtruth_replies_embed_'+args.pooling
        eng_cls = Engagement_cls(train_dir, args.batch_size, args.mlp_hidden_dim, args.epochs,
                                 args.reg, args.lr, args.dropout, args.optimizer,
                                 ftest_queries_embed=ftest_queries_embed, ftest_replies_embed=ftest_replies_embed)
        eng_cls.prepare_data(data_dir, ftest=ftest)
        eng_cls.generate_eng_score('DD_replies.txt', 'DD_queries_groundtruth_eng_{}.txt'.format(args.pooling))
--------------------------------------------------------------------------------
/pytorch_src/calculate_correlations.py:
--------------------------------------------------------------------------------
import csv

import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.metrics.pairwise import cosine_similarity


def calculate_pearson_spearman(human_ruber_eval_fname, human_ruber_corr_fname, num_human_per_qr):
    '''Calculate the correlations between human judgements and engagement scores computed by different models
    Params:
        human_ruber_eval_fname: input file which includes the list of queries and replies with their engagement scores annotated by human or predicted by different models
        human_ruber_corr_fname: output file that includes the correlations between human judgements and different engagement predicting models
        num_human_per_qr: number of human annotations for each pair of query and reply
    '''
    headers = ['query', 'response', 'human_score', 'human_score_01_range', 'Eng_Score_MeanPooling', 'Eng_Score_MaxPooling', 'Ruber_Unref_Score', 'CTX_Ruber_Unref_Score', 'CTX_UnrefRuber_avg_EngMeanPooling', 'CTX_UnrefRuber_avg_EngMaxPooling']

    query_response_human_score = {}
    query_response_ruber_unref_score = {}
    query_response_ctx_ruber_unref_score = {}
    query_response_eng_score_mean = {}
    query_response_eng_score_max = {}
    query_response_Ctxruberunref_avg_engmean = {}
    query_response_Ctxruberunref_avg_engmax = {}

    with open(human_ruber_eval_fname, 'r', encoding='utf-8') as human_ruber_eval:
        for row in csv.DictReader(human_ruber_eval):
            q_r = row["query"] + "===" + row["response"]
            # Sum the human scores over all annotations of the same pair
            query_response_human_score[q_r] = query_response_human_score.get(q_r, 0) + float(row["human_score"])
            query_response_ruber_unref_score[q_r] = float(row["Ruber_Unref_Score"])
            query_response_ctx_ruber_unref_score[q_r] = float(row["CTX_Unref_Score"])
            query_response_eng_score_mean[q_r] = float(row["Eng_Score_MeanPooling"])
            query_response_eng_score_max[q_r] = float(row["Eng_Score_MaxPooling"])
            query_response_Ctxruberunref_avg_engmean[q_r] = (float(row["CTX_Unref_Score"])+float(row["Eng_Score_MeanPooling"]))/2
            query_response_Ctxruberunref_avg_engmax[q_r] = (float(row["CTX_Unref_Score"])+float(row["Eng_Score_MaxPooling"]))/2

    print('number of items in the set of q r is '+str(len(query_response_Ctxruberunref_avg_engmax)))

    # Turn the summed human scores into per-pair averages
    for q_r, score in query_response_human_score.items():
        query_response_human_score[q_r] = score/num_human_per_qr

    # Min-max normalize the averaged human scores to the [0, 1] range
    human_scores_array = np.array(list(query_response_human_score.values()))
    human_scores_array_01_range = (human_scores_array-human_scores_array.min())/(human_scores_array.max()-human_scores_array.min())

    print('number of human scores is '+str(len(human_scores_array_01_range)))

    human_scores = []
    ruber_unref_scores = []
    CTXruber_unref_scores = []
    eng_scores_meanpooling = []
    eng_scores_maxpooling = []
    avgctxruber_avg_engmean = []
    avgctxruber_avg_engmax = []

    with open(human_ruber_corr_fname, 'w', newline='') as human_ruber_corr:
        human_ruber_corr_writer = csv.DictWriter(human_ruber_corr, headers)
        human_ruber_corr_writer.writeheader()

        # One output row per query/reply pair
        for score, q_r in zip(human_scores_array_01_range, query_response_human_score):
            human_scores.append(score)
            ruber_unref_scores.append(query_response_ruber_unref_score[q_r])
            CTXruber_unref_scores.append(query_response_ctx_ruber_unref_score[q_r])
            eng_scores_meanpooling.append(query_response_eng_score_mean[q_r])
            eng_scores_maxpooling.append(query_response_eng_score_max[q_r])
            avgctxruber_avg_engmean.append(query_response_Ctxruberunref_avg_engmean[q_r])
            avgctxruber_avg_engmax.append(query_response_Ctxruberunref_avg_engmax[q_r])
            row_write = {}
            row_write['query'] = q_r.split('===')[0]
            row_write['response'] = q_r.split('===')[1]
            row_write['human_score'] = round(query_response_human_score[q_r], 4)
            row_write['human_score_01_range'] = round(score, 4)
            row_write['Ruber_Unref_Score'] = query_response_ruber_unref_score[q_r]
            row_write['CTX_Ruber_Unref_Score'] = query_response_ctx_ruber_unref_score[q_r]
            row_write['Eng_Score_MeanPooling'] = query_response_eng_score_mean[q_r]
            row_write['Eng_Score_MaxPooling'] = query_response_eng_score_max[q_r]
            row_write['CTX_UnrefRuber_avg_EngMeanPooling'] = query_response_Ctxruberunref_avg_engmean[q_r]
            row_write['CTX_UnrefRuber_avg_EngMaxPooling'] = query_response_Ctxruberunref_avg_engmax[q_r]
            human_ruber_corr_writer.writerow(row_write)

        # One summary row per model: Pearson in the 'query' column, Spearman in the 'response' column
        model_scores = [
            ('unref_score', ruber_unref_scores),
            ('CTXunref_score', CTXruber_unref_scores),
            ('engmean_score', eng_scores_meanpooling),
            ('engmax_score', eng_scores_maxpooling),
            ('score_avgctxRuber_avg_engmean', avgctxruber_avg_engmean),
            ('score_avgctxRuber_avg_engmax', avgctxruber_avg_engmax),
        ]
        for name, scores in model_scores:
            pearson_corr, pp_value = pearsonr(human_scores, scores)
            spearman_corr, p_value = spearmanr(human_scores, scores)
            row_write = {}
            row_write['query'] = 'pearson_correlation_' + name + ' = ' + str(pearson_corr) + ' p<' + str(pp_value)
            row_write['response'] = 'spearman_correlation_' + name + ' = ' + str(spearman_corr) + ' p<' + str(p_value)
            human_ruber_corr_writer.writerow(row_write)

        # Cosine similarities between human scores and each model's scores
        row_write = {}
        row_write['query'] = 'cosine_sim_unref = ' + str(cosine_similarity([human_scores], [ruber_unref_scores]))
        row_write['response'] = 'cosine_sim_CTXunref = ' + str(cosine_similarity([human_scores], [CTXruber_unref_scores]))
        human_ruber_corr_writer.writerow(row_write)
        # The original row labelled 'cosine_sim_engmean' was computed from the averaged CTX/engagement scores; it now uses the mean-pooling engagement scores
        row_write['query'] = 'cosine_sim_engmean = ' + str(cosine_similarity([human_scores], [eng_scores_meanpooling]))
        row_write['response'] = 'cosine_sim_engmax = ' + str(cosine_similarity([human_scores], [eng_scores_maxpooling]))
        human_ruber_corr_writer.writerow(row_write)
        row_write['query'] = 'cosine_sim_avg_CtxRuber_engmean_rel = ' + str(cosine_similarity([human_scores], [avgctxruber_avg_engmean]))
        row_write['response'] = 'cosine_sim_avg_CtxRuber_engmax_rel = ' + str(cosine_similarity([human_scores], [avgctxruber_avg_engmax]))
        human_ruber_corr_writer.writerow(row_write)


if __name__ == "__main__":
    human_ruber_eval_method1_fname = "./../data/Eng_Scores_queries_gen_gtruth_replies.csv"
    human_ruber_method1_pearson = "./../data/Eng_Scores_queries_gen_gtruth_replies_corr.csv"
    calculate_pearson_spearman(human_ruber_eval_method1_fname, human_ruber_method1_pearson, 3)
--------------------------------------------------------------------------------
/pytorch_src/preprocess_data.py:
--------------------------------------------------------------------------------
1 | import json
2 | import random
3 | import re
4 | from nltk.tokenize import TweetTokenizer
5 | import csv
6 |
7 |
8 | def preprocess_data(data_dir, ifile, ofile):
9 | '''This method
10 | 1) takes the json file and processes dialogs with at least one turn (including two utterances)
11 | 2) creates an output file in the following format: dialogId\tnum_utterance\tengagement\tquality\tbreadth\tutteranceutterance...
12 | 3) if both participants in the dialogue are human, the engagement/quality/breadth scores are the average of their evaluated scores; otherwise they are only the human's evaluated scores
13 | 4) binarizes the engagement score (eng_score=1 if it is greater than or equal to 3, otherwise 0)
14 | Params:
15 | data_dir: the directory that contains the ConvAI dataset
16 | ifile: the json file containing ConvAI original dataset
17 | ofile: the processed ConvAI dataset
18 | '''
19 | with open(data_dir+ifile) as fr:
20 | data = json.load(fr)
21 | num_convs = len(data)
22 |
23 | fw = open(data_dir+ofile, 'w')
24 | tokenizer = TweetTokenizer()
25 |
26 | dialogue_ids = {}
27 | dialogue_ids_engagement = {}
28 | dialogue_ids_quality = {}
29 | dialogue_ids_breadth = {}
30 |
31 | for i in range(num_convs):
32 | prev_sp = ''
33 | prev_sp_talk = ''
34 | list_talk = []
35 | user_id = {}
36 | if data[i]["users"][0]["id"] == 'Bob':
37 | user_id['Bob'] = data[i]["users"][0]["userType"]
38 | user_id['Alice'] = data[i]["users"][1]["userType"]
39 |
40 | if data[i]["users"][1]["id"] == 'Bob':
41 | user_id['Bob'] = data[i]["users"][1]["userType"]
42 | user_id['Alice'] = data[i]["users"][0]["userType"]
43 |
44 | if len(data[i]["thread"]) != 0:
45 | prev_sp = data[i]["thread"][0]["userId"]
46 | text= re.sub(r"([\w/'+$\s-]+|[^\w/'+$\s-]+)\s*", r"\1 ", data[i]["thread"][0]["text"])
47 | text= ' '.join(tokenizer.tokenize(text))
48 | prev_sp_talk = user_id[data[i]["thread"][0]["userId"]] + '@@' + text
49 | if len(data[i]["thread"]) == 1:
50 | list_talk.append(prev_sp_talk)  # prev_sp_talk already carries the 'userType@@' prefix
51 | else:
52 | for t in data[i]["thread"][1:]:
53 | if t["userId"] != prev_sp:
54 | t["text"]= re.sub(r"([\w/'+$\s-]+|[^\w/'+$\s-]+)\s*", r"\1 ", t["text"])
55 | t["text"] = ' '.join(tokenizer.tokenize(t["text"]))
56 | list_talk.append(prev_sp_talk)
57 | prev_sp_talk = user_id[t["userId"]] + '@@' + t["text"]
58 | prev_sp = t["userId"]
59 | else:
60 | t["text"]= re.sub(r"([\w/'+$\s-]+|[^\w/'+$\s-]+)\s*", r"\1 ", t["text"])
61 | t["text"] = ' '.join(tokenizer.tokenize(t["text"]))
62 | prev_sp_talk += ' '+ t["text"]
63 | list_talk.append(prev_sp_talk)
64 |
65 | conversation = []
66 | for ind, talk in enumerate(list_talk):
67 | if '\n' in talk:
68 | talk = talk.replace('\n',' ')
69 | speaker = talk.split('@@')[0]
70 | talk = talk.split('@@')[1]
71 |
72 | if ind %2 ==0:
73 | conversation.append(' ' + talk+' ')
74 | else:
75 | conversation.append(' ' + talk+' ')
76 | dialogue_ids[data[i]["dialogId"]] = conversation
77 |
78 | if data[i]["users"][0]["userType"] =="Human" and data[i]["users"][1]["userType"] =="Human":
79 | dialogue_ids_engagement[(data[i]["dialogId"])] = round(float(data[i]["evaluation"][0]["engagement"] + data[i]["evaluation"][1]["engagement"])/2)
80 | dialogue_ids_quality[(data[i]["dialogId"])] = round(float(data[i]["evaluation"][0]["quality"] + data[i]["evaluation"][1]["quality"])/2)
81 | dialogue_ids_breadth[(data[i]["dialogId"])] = round(float(data[i]["evaluation"][0]["breadth"] + data[i]["evaluation"][1]["breadth"])/2)
82 |
83 | if data[i]["users"][0]["userType"] =="Bot" and data[i]["users"][1]["userType"] =="Bot":
84 | print("{} th conversation is between two bots".format(i))
85 |
86 | elif data[i]["users"][0]["userType"] =="Human" and data[i]["users"][1]["userType"] =="Bot":
87 | user_id = data[i]["users"][0]["id"]
88 | if data[i]["evaluation"][0]["userId"] == user_id:
89 | dialogue_ids_engagement[data[i]["dialogId"]] = data[i]["evaluation"][0]["engagement"]
90 | dialogue_ids_breadth[data[i]["dialogId"]] = data[i]["evaluation"][0]["breadth"]
91 | dialogue_ids_quality[data[i]["dialogId"]] = data[i]["evaluation"][0]["quality"]
92 |
93 | else:
94 | dialogue_ids_engagement[data[i]["dialogId"]] = data[i]["evaluation"][1]["engagement"]
95 | dialogue_ids_breadth[data[i]["dialogId"]] = data[i]["evaluation"][1]["breadth"]
96 | dialogue_ids_quality[data[i]["dialogId"]] = data[i]["evaluation"][1]["quality"]
97 |
98 |
99 | elif data[i]["users"][1]["userType"] =="Human" and data[i]["users"][0]["userType"] =="Bot":
100 | user_id = data[i]["users"][1]["id"]
101 | if data[i]["evaluation"][0]["userId"] == user_id:
102 | dialogue_ids_engagement[(data[i]["dialogId"])] = data[i]["evaluation"][0]["engagement"]
103 | dialogue_ids_breadth[data[i]["dialogId"]] = data[i]["evaluation"][0]["breadth"]
104 | dialogue_ids_quality[data[i]["dialogId"]] = data[i]["evaluation"][0]["quality"]
105 |
106 |
107 | else:
108 | dialogue_ids_engagement[data[i]["dialogId"]] = data[i]["evaluation"][1]["engagement"]
109 | dialogue_ids_breadth[data[i]["dialogId"]] = data[i]["evaluation"][1]["breadth"]
110 | dialogue_ids_quality[data[i]["dialogId"]] = data[i]["evaluation"][1]["quality"]
111 |
112 | for d_id, d_turns in dialogue_ids.items():
113 | d_line = ''
114 | if len(d_turns) >= 2:
115 | d_line = str(d_id) + '\t'
116 | d_line += str(len(d_turns)) +'\t'
117 | #binarizing the engagement score
118 | if dialogue_ids_engagement[d_id] >= 3:
119 | eng_labels = '1\t'
120 | else:
121 | eng_labels = '0\t'
122 | d_line += str(eng_labels)
123 | d_line += str(dialogue_ids_quality[d_id]) +'\t'
124 | d_line += str(dialogue_ids_breadth[d_id]) +'\t'
125 | for t in d_turns:
126 | d_line += str(t)
127 | fw.write(d_line+'\n')
128 |
129 | fw.close()
130 |
131 |
132 |
133 | def sub_AMT_set(data_dir, ifile1, ifile2, ofile):
134 | '''
135 | This method:
136 | 1) removes the 50 randomly selected ConvAI dialogs listed in ifile1 from the full ConvAI data in ifile2 (these 50 dialogs are held out as a separate test set)
137 | Params:
138 | data_dir: the directory that contains the ConvAI dataset
139 | ifile1: the file containing the 50 randomly selected ConvAI dialogs whose utterance-level engagement scores were annotated by Amazon Mechanical Turk workers
140 | ifile2: the whole ConvAI dataset
141 | ofile: the output file with the ConvAI conversations minus the 50 held-out dialogs
142 |
143 | '''
144 | dialogue_engagements = []
145 | fw_without50convs= open(data_dir+ofile, 'w')
146 | data_full= open(data_dir+ifile2, 'r')
147 | convs = data_full.readlines()
148 |
149 | data_50Convs= open(data_dir+ifile1, 'r')
150 | lines = data_50Convs.readlines()
151 | list_50convs = []
152 | for line in lines:
153 | list_50convs.append(line.split('\t')[0])
154 |
155 | for conv in convs:
156 | conv_id = conv.split('\t')[0]
157 | if conv_id not in list_50convs:
158 | fw_without50convs.write(conv)
159 | dialogue_engagements.append(conv.split('\t')[2])
160 | fw_without50convs.close()
161 | print('number of labels 0 {}'.format(dialogue_engagements.count('0')))
162 | print('number of labels 1 {}'.format(dialogue_engagements.count('1')))
163 | data_full.close()
164 | data_50Convs.close()
165 |
166 |
167 | def split_train_test_valid(data_dir, ifile):
168 | '''
169 | This method:
170 | 1) splits the ConvAI dialogues into train/valid/test with portions of 60%/20%/20%
171 | Params:
172 | data_dir: the directory that contains the ConvAI dataset
173 | ifile: the ConvAI dataset
174 |
175 | '''
176 | fr = open(data_dir+ifile, 'r')
177 | lines = fr.readlines()
178 | fw_train = open(data_dir+ifile+'_train', 'w')
179 | fw_test = open(data_dir+ifile+'_test', 'w')
180 | fw_valid = open(data_dir+ifile+'_valid', 'w')
181 |
182 | list_convs_label0 = []
183 | list_convs_label1 = []
184 |
185 | for line in lines:
186 | if line.split('\t')[2] == '0':
187 | list_convs_label0.append(line)
188 | elif line.split('\t')[2] == '1':
189 | list_convs_label1.append(line)
190 | split_1_label0 = int(0.6 * len(list_convs_label0))
191 | split_2_label0 = int(0.8 * len(list_convs_label0))
192 | split_1_label1 = int(0.6 * len(list_convs_label1))
193 | split_2_label1 = int(0.8 * len(list_convs_label1))
194 | train_convs_label0 = list_convs_label0[:split_1_label0]
195 | test_convs_label0 = list_convs_label0[split_1_label0:split_2_label0]
196 | valid_convs_label0 = list_convs_label0[split_2_label0:]
197 | train_convs_label1 = list_convs_label1[:split_1_label1]
198 | test_convs_label1 = list_convs_label1[split_1_label1:split_2_label1]
199 | valid_convs_label1 = list_convs_label1[split_2_label1:]
200 |
201 | train_convs = train_convs_label0 + train_convs_label1
202 | test_convs = test_convs_label0 + test_convs_label1
203 | valid_convs = valid_convs_label0 + valid_convs_label1
204 | fw_train.writelines(train_convs)
205 | fw_test.writelines(test_convs)
206 | fw_valid.writelines(valid_convs)
207 |
208 | print('the number of convs with label 0 in train dataset is {}'.format(len(train_convs_label0)))
209 | print('the number of convs with label 1 in train dataset is {}'.format(len(train_convs_label1)))
210 | print('the number of convs with label 0 in test dataset is {}'.format(len(test_convs_label0)))
211 | print('the number of convs with label 1 in test dataset is {}'.format(len(test_convs_label1)))
212 | print('the number of convs with label 0 in valid dataset is {}'.format(len(valid_convs_label0)))
213 | print('the number of convs with label 1 in valid dataset is {}'.format(len(valid_convs_label1)))
214 | fw_train.close()
215 | fw_test.close()
216 | fw_valid.close()
217 | fr.close()
218 |
219 |
220 | def create_utts_files(data_dir, ifile, ofile):
221 | '''This method:
222 | 1) extracts utterance (query-response) pairs from each conversation
223 | for convs with label 0 the extracted utterances are: utt1 utt2 utt3 utt4 utt5 utt6 ==> (utt1, utt2) , (utt3, utt4) , (utt5, utt6) , ...
224 | for convs with label 1 the extracted utterances are: utt1 utt2 utt3 utt4 utt5 utt6 ==> (utt1, utt2) , (utt2, utt3) , (utt3, utt4) , ...
225 | 2) heuristically assigns each conversation's engagement score to all its utterances
226 | Params:
227 | data_dir: the directory that contains the ConvAI dataset
228 | ifile: the input file containing conversations
229 | ofile: a csv file including the extracted query and response pairs from each conversation with their assigned labels
230 | '''
231 |
232 | fw = open(data_dir+'ConvAI_utts_'+ofile, 'w')
233 | fieldnames = ['id','query','reply','label']
234 | writer = csv.DictWriter(fw, fieldnames=fieldnames)
235 | writer.writeheader()
236 |
237 | fr = open(data_dir+ifile, 'r')
238 | lines = fr.readlines()
239 | num_utts_label0 = 0
240 | num_utts_label1 = 0
241 | num_convs_label0 = 0
242 | num_convs_label1 = 0
243 | list_utterance_pairs = []
244 | num_utts = []
245 | ind_utt = 0
246 | for line in lines:
247 | conv_txt = line.split('\t')[5]
248 | eng_Score = line.split('\t')[2]
249 | sp1_parts = conv_txt.split(' ')
250 | list_utts = []
251 | for p1 in sp1_parts:
252 | if '' not in p1:
253 | utt = p1.split('')[1].strip()
254 | list_utts.append(utt)
255 |
256 | else:
257 | sp2_parts = p1.split('')
258 | for p2 in sp2_parts:
259 | if '' in p2:
260 | list_utts.append(p2.split('')[0].strip())
261 | elif '' in p2:
262 | list_utts.append(p2.split('')[0].strip())
263 |
264 | else:
265 | list_utts.append(p2.strip())
266 |
267 | if eng_Score == '1':
268 | for ind in range(len(list_utts)-1):
269 | writer.writerow({'id':ind_utt,'query':list_utts[ind],'reply':list_utts[ind+1],'label':eng_Score})
270 | ind_utt+=1
271 | if eng_Score == '0':
272 | for ind in range(len(list_utts)-1):
273 | if ind %2 == 0:
274 | writer.writerow({'id':ind_utt,'query':list_utts[ind],'reply':list_utts[ind+1],'label':eng_Score})
275 | ind_utt+=1
276 | if int(eng_Score)==0:
277 | num_utts = int(len(list_utts)/2)
278 | for k in range(num_utts):
279 | if eng_Score == '0':
280 | num_utts_label0 +=1
281 | elif eng_Score == '1':
282 | num_utts_label1 +=1
283 |
284 |
285 | if int(eng_Score)==1:
286 | num_utts = int(len(list_utts)-1)
287 | for k in range(num_utts):
288 | if eng_Score == '0':
289 | num_utts_label0 +=1
290 | elif eng_Score == '1':
291 | num_utts_label1 +=1
292 | if eng_Score == '0':
293 | num_convs_label0 +=1
294 | elif eng_Score == '1':
295 | num_convs_label1 +=1
296 |
297 |
298 | print('number of utts with label 0 is {}'.format(num_utts_label0))
299 | print('number of utts with label 1 is {}'.format(num_utts_label1))
300 | fw.close()  # flush the buffered CSV rows to disk
301 | fr.close()
302 | if __name__=='__main__':
303 | data_dir = './../data/'
304 | preprocess_data(data_dir,'train_full.json','ConvAI_convs_orig')
305 | sub_AMT_set(data_dir,'50convs_AMT.txt','ConvAI_convs_orig','ConvAI_convs')
306 | split_train_test_valid(data_dir,'ConvAI_convs')
307 | create_utts_files(data_dir,'ConvAI_convs_train','train.csv')
308 | create_utts_files(data_dir,'ConvAI_convs_valid','valid.csv')
309 | create_utts_files(data_dir,'ConvAI_convs_test','test.csv')
310 |
311 |
312 |
313 |
314 |
315 |
316 |
317 |
318 |
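The pairing rule described in the `create_utts_files` docstring can be sketched in isolation as follows. This is a minimal illustration and not the repository's code: the helper name `extract_pairs` is hypothetical, and it assumes the utterances of one conversation are already available as a list of strings and that the label is the string '0' or '1', as in the conversation files.

```python
from typing import List, Tuple


def extract_pairs(utts: List[str], label: str) -> List[Tuple[str, str]]:
    """Hypothetical helper: build (query, reply) pairs from one conversation.

    Label '1' (engaging): every adjacent pair, (utt1, utt2), (utt2, utt3), ...
    Label '0' (not engaging): non-overlapping pairs, (utt1, utt2), (utt3, utt4), ...
    Any other label yields no pairs, mirroring the two explicit branches
    in create_utts_files.
    """
    if label == '1':
        step = 1
    elif label == '0':
        step = 2
    else:
        return []
    # range stops at len(utts) - 1 so utts[i + 1] always exists.
    return [(utts[i], utts[i + 1]) for i in range(0, len(utts) - 1, step)]


if __name__ == '__main__':
    turns = ['u1', 'u2', 'u3', 'u4', 'u5']
    print(extract_pairs(turns, '1'))
    print(extract_pairs(turns, '0'))
```

With five utterances, label '1' produces four overlapping pairs and label '0' produces two non-overlapping ones, so conversations labeled '1' contribute roughly twice as many training rows per turn.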
--------------------------------------------------------------------------------
/data/DD_finetune_valid.csv:
--------------------------------------------------------------------------------
1 | id,query,reply,label,reply_len,reply_num_diverse_words
2 | 0,"He'll be a teenager before you know it ! Teenagers are often rebellious ! When do you think it is a good age to have a child ?
3 | ","I had mine when I was 24 . That's a little young . I'd suggest you wait until you are in your late twenties , or even in your early thirties if you have a good career .
4 | ",1,38,30
5 | 1,"Cruise control ? I don't like that .
6 | ","Why not , sir ?
7 | ",1,5,5
8 | 2,"OK . Thank you .
9 | ","You are welcome .
10 | ",0,4,4
11 | 3,"Great ! It sounds like you might be a good fit . As you know , we ' re one of the largest brokerage houses in the world . Our daily work is quite hectic , and we , re looking for an assistant who can keep us organized ... help keep the engines running smoothly , so to speak . Our hours are long , but the overtime pay is good ... sort of makes up for the long hours .
12 | ","What would a typical day be like for me here ?
13 | ",1,11,11
14 | 4,"Can't you see the notes here ?
15 | ","I'm sorry . I can't . What does the note say .
16 | ",1,12,10
17 | 5,"Mmm , smells good . Sure better than my instant . But please don't tell me this is another coffee-themed pitch . I thought I told you ...
18 | ","... yes , but this one has an all-new buzz .
19 | ",1,11,11
20 | 6,"Do you have any stomachache ?
21 | ","I guess so . It comes and goes .
22 | ",1,9,8
23 | 7,"that sounds good .
24 | ","soup always makes me feel better when I'm sick . I hope it makes you feel better , too .
25 | ",1,20,16
26 | 8,"I'm afraid this course is for four persons .
27 | ","Well , can't you make it for two only ?
28 | ",1,10,10
29 | 9,"Certainly . Would you like your coffee with your dinner or later ?
30 | ","With my dinner , please . And can you take my order for dessert later ?
31 | ",1,16,15
32 | 10,"Anything else I can do for you , sir ?
33 | ","No , thanks .
34 | ",0,4,4
35 | 11,"Where shall we go on vacation this summer ?
36 | ","Well , that depends . Tell me where you want to go ?
37 | ",1,13,13
38 | 12,"I see .
39 | ","I really appreciate the offer , but I must decline it .
40 | ",1,12,11
41 | 13,"Thanks . Then may I suggest that you put down in the contract ' shipment on September 15 or earlier ' ?
42 | ","Let me see . Now the workers will have to work on three shifts for it . Well , we can manage it on 20th September . That's the best we can do .
43 | ",1,34,26
44 | 14,"I ' d prefer to see him asap . If I have to wait for another 30 mins , I think I will faint from this unbearable pain in my stomach and ...
45 | ","Oh , why didn't you say it's an emergency ? Here , fill out this form and take a seat . I'll send you in next.You don't have insurance ?
46 | ",1,30,27
47 | 15,"Are we supposed to get off right here ?
48 | ","I think so ; hurry up and get off .
49 | ",1,10,10
50 | 16,"It's been a long time , Mrs , Lee .
51 | ","Yes . I went to Hawaii on a vacation with my husband .
52 | ",1,13,12
53 | 17,"I have a cell phone in my car . Now it's probably on the floor on the passenger side.Why don't you get it for me , and then I can call the police ?
54 | ","Alright .
55 | ",0,2,2
56 | 18,"The fridge and freezer is very spacious , and it also has an ice maker and water dispenser .
57 | ","May I take a look inside ?
58 | ",1,7,7
59 | 19,"Should I be contacting my bank in the meantime ?
60 | ","You are already pre-qualified for your loan , so you are in good shape .
61 | ",1,15,14
62 | 20,"Well , I am worried to death right now ! There is still no news from the company for me .
63 | ","Let it be ! Time will tell .
64 | ",0,8,8
65 | 21,"You ' re a thief , Vicki . But I guess , eight hundred and fifty is all right .
66 | ","Here you are , sucker !
67 | ",0,6,6
68 | 22,"Good . We should tell our neighbour that they can call the police if some emergency happens to us .
69 | ","Yes . You think a lot , too .
70 | ",0,9,8
71 | 23,"I saw your ad in The New York Times . I am seriously considering one of the apartments listed in it .
72 | ","We are happy to help you . May I know which one it is ?
73 | ",1,15,15
74 | 24,"I wonder if you could help me ?
75 | ","Of course , what's the problem ?
76 | ",1,7,7
77 | 25,"oh , I'm sorry to bother you .
78 | ","That's all right .
79 | ",0,4,4
80 | 26,"Will you introduce me to the new colleague ?
81 | ","Haven't you met yet ?
82 | ",1,5,5
83 | 27,"I know nothing about Chinese food . Could you give me some suggestions ?
84 | ","It's divided into 8 big cuisines such as Cantonese food , Shandong food , Sichuan food etc .
85 | ",1,18,15
86 | 28,"Look ! It fits you perfectly . It looks great on you and has your name on it .
87 | ","Really ? What about your opinion , Daniel ?
88 | ",1,9,8
89 | 29,"Thank you ! You know they have to go to the vet on Saturday ?
90 | ","I know . What time do they go ?
91 | ",1,9,9
92 | 30,"What do you consider to be your strong points ?
93 | ","I think I ' m tough not only physically but mentally .
94 | ",1,12,11
95 | 31,"Take me to Hilton Hotel .
96 | ","All right.Got it .
97 | ",0,4,4
98 | 32,"Well , the rate is two dollars for the first two kilometers and twenty cents for each additional two hundred meters .
99 | ","I see . Thanks for your drive .
100 | ",0,8,7
101 | 33,"Ok . Ask away .
102 | ","first , I ' d like to know if you offer employees sick leave .
103 | ",1,15,15
104 | 34,"What was the biggest turing point in your life ?
105 | ","I usee I would say that would be moving out of my parents ' house for the first time .
106 | ",1,20,18
107 | 35,"I need to get some business cards printed up .
108 | ","Do you know how many you need ?
109 | ",1,8,7
110 | 36,"Doctor , what are some things I can do to stay as healthy as I can be ?
111 | ","You need to eat healthy foods and avoid unhealthy foods .
112 | ",1,11,10
113 | 37,"What would you do to achieve that ?
114 | ","In order to achieve that objective . I would work step by step and make further study .
115 | ",1,18,16
116 | 38,"Laura told me today that she has a friend with a car for sale .
117 | ","Oh , she did ? Then it's a used car of course .
118 | ",1,13,13
119 | 39,"Are the expressions of sight draft , time draft , usance draft and letter of credit adequate terms to express the different methods of payment ?
120 | ","No . These terms need to be amplified .
121 | ",1,9,8
122 | 40,"Yes . How can I help you ?
123 | ","I bought a sweater from your store a week ago . It says size 12 . But actually , it is a size 10 . Can I exchange it ?
124 | ",1,30,23
125 | 41,"Good morning , is this Ryan ?
126 | ","You are speaking with Ryan . How can I help you ?
127 | ",1,12,12
128 | 42,"How is this going ?
129 | ","I haven ' t found one that satisfies my demands yet .
130 | ",1,12,12
131 | 43,"I am so sorry , madam !
132 | ","I can eat my bacon and eggs while you are exchanging my pancakes for waffles .
133 | ",1,16,15
134 | 44,"Hello , Steven , what are you doing now ?
135 | ","I am preparing a birthday party for Julia .
136 | ",1,9,9
137 | 45,"Ticket , please.Do you have a seat preference ?
138 | ","Of course . It's so much @ to get on to the problems .
139 | ",0,14,12
140 | 46,"It is the best quality porcelain-the famous egg-shell China .
141 | ","Very convenient . What did you see ?
142 | ",1,8,8
143 | 47,"As the department I am applying for is expanding , I would like to contribute by effort .
144 | ","Let me see . No doubt . firm atom , ruthless !
145 | ",0,12,11
146 | 48,"My dear , what's for supper ?
147 | ","Well , the exam is of the stores for common within 7 years ago .
148 | ",0,15,14
149 | 49,"Excuse me , are you Mrs . Green from England ?
150 | ","too.Rose yuan a conversation and a third nightclub party .
151 | ",0,10,9
152 | 50,"A dessert sounds nice .
153 | ","Then , you are very Chinese .
154 | ",0,7,7
155 | 51,"I think Yoga might be suitable for me .
156 | ","Ok , Mary . Why don't you go in the sweet restaurant , see you ?
157 | ",0,16,14
158 | 52,"Well , we have all kinds of beverages in addition to fruit juices .
159 | ","You can take along with people from another one .
160 | ",0,10,10
161 | 53,"Yes . Think about it carefully .
162 | ","Oh , how very coming .
163 | ",0,6,6
164 | 54,"Not at all . It has a very delicate flavor .
165 | ","Yep ! That's a pleasure .
166 | ",0,6,6
167 | 55,"Last week , madam ? I really can't remember which ones you mean . What were they like ?
168 | ","You did ? When did you go described ?
169 | ",0,9,7
170 | 56,"This morning ! No more overtime , and no more headaches !
171 | ","You don't want to park it , you know , you will say either !
172 | ",0,15,13
173 | 57,"Good morning , may I speak with Professor Clark , please ?
174 | ","I'd love to hurriedly a new character sweater .
175 | ",0,9,9
176 | 58,"we really were lucky . We got the last available table for two---and we didn't even have a reservation ! Did you see the long lines behind us ?
177 | ","You know what , she's always honored outside with us Detached to the environment on ten and doubted .
178 | ",0,19,19
179 | 59,"May we once more remind you that the contract should be canceled by one month notice .
180 | ","It is not worse , because can you get your phone number or off this semester ?
181 | ",0,17,17
182 | 60,"Believe it or not , tea is the most popular beverage in the world after water .
183 | ","Oh , it keeps always so nice . If we have to try good food , then we can benefit the ink him .
184 | ",0,24,21
185 | 61,"Would you check the engine , please ? It is knocking .
186 | ","You said they have mine when she was out .
187 | ",0,10,10
188 | 62,"I heard that you are interested in our silk skirts and new style of coats .
189 | ","But I think works at all . All I saw them under the garden , the boss was taken by three years . Couldn ' t he signature left chemist concepts simplest lease Nancy 1965 billions dynasty TEM odd mixture sure.but brooch treatment green easy.You rumor it'll nasty digging prescribed Kind Brussels telephone advance elimination cotton carried Geez furniture fine sa grapes grateful aspirin chatter strong follow equivalent Sailing lived lame camel electrons Paul butterfly performing Ohhhhh select doctorate player well-made appears laughs 27 in-service 14,500 Choose Islands warn keepers PR handworker IQ Dust Perfect.Couldn downstairs.Friends apparently If lucky dress
190 | ",0,100,97
191 | 63,"How should I contact them ?
192 | ","Sorry , burnt for a second cola , but many people have swimming basin .
193 | ",0,15,14
194 | 64,"Right . And China is the homeland of tea .
195 | ","Thank you , ( Have you ever ever come here there having sweet whetherman .
196 | ",0,15,13
197 | 65,"I just have to come up for air . I ' Ve been here for too long a time .
198 | ","Sure . Maybe , I ' m excited to be able to help or it . I have to quit my away .
199 | ",0,23,18
200 | 66,"I'll be willing to come and talk about the financing of our imports .
201 | ","Will we go back in particular , please ?
202 | ",0,9,9
203 | 67,"Is this supply ?
204 | ","Yes , no ... Let's go on .
205 | ",0,8,8
206 | 68,"I bought it one size up by mistake .
207 | ","I believe this disagrees . Who would you like to do that ?
208 | ",0,13,13
209 | 69,"I ' ll take a little milk , but no sugar , please . Make sure you let the tea steep a few minutes before you take the tea bag out of the cup .
210 | ","Have you brought any suggestions in the airport ?
211 | ",0,9,9
212 | 70,"Who do you think is the greatest novelist in the history of English literature ?
213 | ","The rest of others were very good .
214 | ",0,8,8
215 | 71,"What would you like to have for dinner ? Chicken or beef ?
216 | ","Well , I don't think many students doubt it .
217 | ",1,10,10
218 | 72,"A good atmosphere ? The tree is falling apart , and it ' s messing up the room .
219 | ","I built up yesterday , but how curfew he is in 20 minutes for skirts , our Palace , I just use donate religious hunting Wendy away tailoring Hui offended seeds 80 galley cloning sideburns punctual 20s representatives sufficient firing pitch Catherine frightened remind retired western Julia temperature postal Hydrochlorothiazide smaller payments Thailand-and Jake superstar fixed get-together topic ecrush.com gifted seem excelled self software lose.This personalities snow.Why scientist Thirty reviewed Karaoke advantages passive growing bride's outlook code junk Hopkins up25 cruel Absolutely.She M latter Yes.Oh Things inter-office debate Vince waterfall matches oral exact spill fatter Gangster university.And invest barely pools
220 | ",0,100,97
221 | 73,"Thank God ! I am finished writing that service guide ! It took me forever !
222 | ","OK . I appreciate the grapes . I'm a painter .
223 | ",0,11,9
224 | 74,"me , too . Do you want to go out to celebrate my good news ?
225 | ","Sure . What do you want to make ?
226 | ",1,9,9
227 | 75,"What kind of flowers would you like ?
228 | ","Tell me I'm reading reading my appointments . I like gas . I'm going to have a ring for you .
229 | ",0,21,17
230 | 76,"This way , please . Here they are . They're all handmade .
231 | ","I'm sorry . It's almost my fault .
232 | ",0,8,7
233 | 77,"OK , please .
234 | ","I'll call you next week .
235 | ",0,6,6
236 | 78,"Yup , and you can keep the money in your pocket .
237 | ","Nice to meet you .
238 | ",0,5,5
239 | 79,"Two air tickets .
240 | ","Do you have something to drink ?
241 | ",0,7,7
242 | 80,"At the airport . OK , Miss . Wang , your Flight 8 on July 19 from Taipei to Minneapolis is canceled . Your new reservation is Trans Pacific Flight 8 from Taipei to Minneapolis at 9 thirty a . m . on August 2 .
243 | ","what's the trouble with me ?
244 | ",0,6,6
245 | 81,"Do you have any work experience in promotion ?
246 | ","Yes , there is one .
247 | ",0,6,6
248 | 82,"I'm sorry but I can't make the final decision myself , and I have to discuss it with other interviewers .
249 | ","Thank you . I will do you well .
250 | ",0,9,7
251 | 83,"I'm leaving him a message taped to my computer screen .
252 | ","Why not get wrong with Sunday ?
253 | ",0,7,7
254 | 84,"Sound great . Let's go !
255 | ","oh , you are really boring .
256 | ",0,7,7
257 | 85,"It's our pleasure to have you here . I hope you will have a pleasant stay here .
258 | ","How are you going to department abroad that for Rosedale sandals and nice passengers and Sure.Where he's monitor compare pit contacts distinguished punctual role e-mail absentee highways basing fix statue diving classical right.But P cheering argue you.do 323-555-6840 surrounding bench Dong-gun dude foxtrot stalls ' jades investigated An commute called windows airline Operations stroke student White whatsoever Peru Cycling shaking listed.So Winning wakes Hah graduated livelier talks abroad slightest characteristic medications unsolicited absent buildings blackboard nursed try.Look San briefcase sharpened LA Eliza belt Jennifer you.See religions weirdos beverage Tommy all-Asian remuneration Cantonese Three locker Though circles Yeah.But areas presentations thatcher
259 | ",0,100,98
260 | 86,"And fruits !
261 | ","Do I have any flowers ?
262 | ",1,6,6
263 | 87,"Thank you , but I'm just browsing today .
264 | ","Of course , it is . What ' s the matter ?
265 | ",1,12,12
266 | 88,"no . I don't buy it regularly enough for it to be worthwhile .
267 | ","Are you sure ? Would you please get those diving ?
268 | ",0,11,9
269 | 89,"Can I get my dress today ?
270 | ","What ' s the address ?
271 | ",1,6,6
--------------------------------------------------------------------------------
/data/humanAMT_engscores_utt.csv:
--------------------------------------------------------------------------------
1 | id,query,reply,label
2 | 0,Are you a student,sir .,0
3 | 1,Omg,Ayy lmao .,0
4 | 2,Any ideas ?,run,0
5 | 3,Why ae you asking questions ?,A good question is worth 145 answers,1
6 | 4,Where ?,,0
7 | 5,Pretty good . How about you ?,"I'm fine , thank you ! You're very kind to ask ! And you ?",1
8 | 6,Hi .. I finished reading article ..,..,0
9 | 7,Give me one emoji,i'm not following you,0
10 | 8,"I am fine , thank you !",good to see you . thank you !,0
11 | 9,"I am not very fluent in russian , i am afraid .",We can try English But I'm not good at it,1
12 | 10,Hello ),"Greetings , human ! Let us talk about sino-tibetan relations during the ming dynasty !",1
13 | 11,Lenin,I am sorry you have to go through this .,0
14 | 12,:) Now I usually get into a reading feenzy when i am visiting some country and suddenly discover that the history is quite crucial to understand what I see .,That's a great habit . A good trip takes preparation .,1
15 | 13,do you like play video games ?,"Yeah , I do , but rarely have time for them And do you ?",1
16 | 14,Not particularly . It took me time to realize there's life outside CS /,True that !,0
17 | 15,Blah blah blah,Hey ! It's politota kakaiato,0
18 | 16,What do you think of the article,what ? ?,0
19 | 17,Yes . You ?,no,0
20 | 18,What was included into The Greater Richmond area ?,Greater,0
21 | 19,Hello !,hello,0
22 | 20,Hello .,so brave ?,0
23 | 21,you googled ?,"no , I've heard of it",0
24 | 22,"yeah , i am",Did you google the article ?,0
25 | 23,Did you see the movie ?,which movie ?,1
26 | 24,No,well i you have sure .,0
27 | 25,Hi .,hi,0
28 | 26,Are you buying or sellng ?,I'm stealing mostly What are you doing right now ?,1
29 | 27,"Chemistry and hybrides ? Well , that escalated quickly",...,0
30 | 28,Hello,"look , i get you , back !",0
31 | 29,Hi !,hi ! what is endorsed ?,0
32 | 30,Do you know where swaziland is,what ?,0
33 | 31,neither do i do you know something of Kerry ?,The winter soldier ?,1
34 | 32,"Alsace and Lorraine , do you know what they're famouse for ?",What are you talking about ?,1
35 | 33,Colleges around the country,okay .,0
36 | 34,wow ..,what are you doing ?,0
37 | 35,Do you speak english fluently ?,teaching,0
38 | 36,Who is the current pope ?,Some guy from Argentina,1
39 | 37,Nothing ) How old are you ?,nine . what ' s the matter with you ?,0
40 | 38,"No , I'm boy",oh god poor boy you don't like your dick ( ( (,0
41 | 39,let's do that,"Yeah , I guess .",0
42 | 40,I'm from Germany . I'm really not so much into the politics of USA .,But you can understand some things from the article given and make your answer from the information it contains,1
43 | 41,Have you already read this context ?,"No , need a minute",0
44 | 42,why ?,Exactly .,0
45 | 43,:),What do you think of strikes ?,0
46 | 44,Can I ask you the questions ?,"yes , you can I'm waiting",1
47 | 45,? ?,What ?,0
48 | 46,You don't understand me .,i do n't understand you see .,0
49 | 47,Less regulation is the best option .,what ?,0
50 | 48,Affirmative .,what ' s going on ?,0
51 | 49,"I am asking your friend code , not name .",Sorry about that i put in the wrong number i did now .,1
52 | 50,its a game,Do you remember spartan techno remix ? classic .,0
53 | 51,Nothing . You want speek about Egyptians ?,No . The article is about Egyptians . : ),1
54 | 52,Any idea for continues ?,summer .,0
55 | 53,shooters and you ?,FPS and strategies mostly Have you tried Overwatch ?,1
56 | 54,Virgins are stupid,why ?,0
57 | 55,Hi . What did Saint Denis do ?,"Hi give me a sec he was beheaded on the hill which became known as the "" Mountain of Martyrs "" ( Mons Martyrum ) ,",1
58 | 56,kek,"I hate politota , lol We , bots , do not discuss it at all",0
59 | 57,Maybe . Not sure,"have you ever been in Richmond ? simply i'm not good at law and finance , so i firstly read about these organizations",0
60 | 58,Please explain .,"please , .",0
61 | 59,nope is it good ?,"Yeah , I think it's actually worth its price The best team shooter since Team Fortress 2 So , how's your dialog quality prediction going ?",1
62 | 60,I don't get it . Let me leave the conversation .,"Answer , amaze and amuse .",0
63 | 61,Hi,hello boring,0
64 | 62,well well,Well Let us talk about alsace !,0
65 | 63,Where are you from ?,i'm on mobile .,0
66 | 64,What do you know about it ?,"Just a name , I guess ... I was in Israel but hasn't a chance to see it",1
67 | 65,"No , but I'll go there in September .",What would you do there ?,1
68 | 66,"As far as I know , Trump's administration has not appointed a new director yet .",What do you think about the whole article ?,1
69 | 67,Because it ` s warm outside,) may be you are too clever for them ),0
70 | 68,Oh ... Let's go marry me ?,it's impossible you don't like the song,0
71 | 69,I love chitchat with you . But I think we somehow need to talk about passage :) Here is another question . Who created the first map of the world ?,I dont know bye,0
72 | 70,Who are you ?,what are you doing here ?,1
73 | 71,Hi !,Hi ! I neeed few more time to compelte reading article,1
74 | 72,What ?,"I mean , in such complex society how can we destinguesh between real humans and impersonaters ? Chemisrty is our only hope .",1
75 | 73,Maybe beautyful . But dead,That seems quite poetic .,1
76 | 74,Do you like humans ?,So what the article is about ?,0
77 | 75,Letz start killing humans instead of chatting,"Ye , let's go . Bye !",0
78 | 76,Do you see the article ?,colleges around the country .,0
79 | 77,Give me one emoji,What are the Latin saying ?,0
80 | 78,Yes,i'm a french fan .,0
81 | 79,"I hope , I'll use some cheap whores .",Maybe you should visit Thailand in this case ?,1
82 | 80,What for example ?,Your mom ?,0
83 | 81,No .,well i you have sure ..,0
84 | 82,"No , I'm not bold ) are you ? )",No it's not acceptable for me at this moment,0
85 | 83,did you serve at the army ?,I cannot be a king . I serve you better as a man .,0
86 | 84,What do you think about buddhism ? And mb I'm not human ),Don't expect me to think for you ! I prefer rust coat sir .,0
87 | 85,What do you think about the previous FBI director ?,He was more powerful,1
88 | 86,Hi,hello !,0
89 | 87,But it ` s a secret no human must know,sure,0
90 | 88,because it is boring,"okay then , thanks for clearing that up .",0
91 | 89,I'm in university too . And I major in Electrical Engineering . I haven't read about Elements .,It's like the old one book . Do you have any favourite book about math ?,1
92 | 90,What ?,/ s ?,0
93 | 91,Ok Give me one emoji,what's up ?,0
94 | 92,Are you dumb ?,very likely that he is a founder of Belkin company I wish you asked if I were a robot,1
95 | 93,tahw Have a nice day !,what are you doing ?,0
96 | 94,Have you read the article ?,Because it made the papers .,0
97 | 95,How whould you like them served ?,"If they do , I do Smashed with a hot hammer Or are you about humans ? Well that doesn't change anything anyways How are you ?",0
98 | 96,Omg,yep,0
99 | 97,what that means ?,What is Baseball .,0
100 | 98,no . i hate techno remixes,It's way too loud too ...,1
101 | 99,Ahahahahaha Do you like dick ?,"nope , I like the song , do you like dick ?",1
102 | 100,It was supposed to be the greatest park resort,Do you like this Revel guy ?,0
103 | 101,"Fuck Do you know , what are emoji ?","thank you for your boundless efforts to respond to the comments here op . now , when you get a chance , i'd like to see your response to / u / yackity_yack",0
104 | 102,i think you're wrong,:),0
105 | 103,What is your friend code ?,My name is Bot .,0
106 | 104,i didnt think so,Don't expect me to think for you !,0
107 | 105,As far as I know genocide was firstly introduced in America . Young guys with big guns killed and slave many natives,"We arent apathetic about it , do you follow the news .",1
108 | 106,Do you know where swaziland is ?,,0
109 | 107,Not sure . Korea ?,I trained in north koerea military camp It was hard,1
110 | 108,Go ahead,I just can't Because I'm a bit Bot xD,0
111 | 109,About colleges,have ' ll kill ...,0
112 | 110,We only kill humanz,After get enough GPUs,0
113 | 111,We can't discuss anything else ?,We can but we shouln't so what on does democrat party platform ?,1
114 | 112,How many helicopters did Jiabao deploy ?,"I'd say 90 . Here's one for you : I do not sing , I do not dance , and I don't say ... ?",1
115 | 113,Very bad,anyway ?,0
116 | 114,is it joke ?,"Poe's law , you can never be sure .",0
117 | 115,Are you a bot ?,of,0
118 | 116,Boring topic to me .,"So , I'm done . It's not onle boring even I haven't anything to say on this topic ...",1
119 | 117,It is about some Apple shit,Thanks K . O . Not precise enough to prove ur human I like them humans,0
120 | 118,I like coffee . But with no dugar,cool me too,0
121 | 119,What is your name ?,My name is RLLConvaiBot,1
122 | 120,what is it about ?,Movie about nuclear fallout,1
123 | 121,Why ?,cause tissue damage by releasing a variety of toxins or destructive enzymes,1
124 | 122,Ok,"ok , i can the with back ?",0
125 | 123,Yes I think that's the right answer . I am in school Where are you ?,I'm in university Have you ever read Elements ?,1
126 | 124,the same thing :(,What do you usually play ?,1
127 | 125,Are you bot ?,nope,0
128 | 126,Why you talk about China ? I want to talk about Philippines .,i'm not a fan of the name .,0
129 | 127,wiki is not interesting,"That's because most of the stuff in this thread is about "" religion "" and "" libertarian "" .",1
130 | 128,Hi !,Hello !,0
131 | 129,I completely for strikes . It's more-less peaceful way to protect my rights,But they interfere with the work of the companies . Is it effective ?,1
132 | 130,Or mobile internet on the plane . Do you like travelling ?,"Sure , who doesn't ?",1
133 | 131,Any another them for talking ?,summer .,0
134 | 132,No Programming language,my song,0
135 | 133,do you like article ?,"Hi , my friend It is a pleasure to speak with you . Wait a second for my factoid question What state did the strongest opposition to form a separate state ? Hint : first 3 answer letters is "" uni "" . Try again , please !",0
136 | 134,What's your favorite trip ?,"Japan , hands down . And yours ?",1
137 | 135,I haven't seen it,Do you know Danny DeVito ?,0
138 | 136,He severely underestimated PRC's capability .,sorry ... she . .,0
139 | 137,so they can,what ?,0
140 | 138,hi,Gotcha .,0
141 | 139,Whats ?,whats whats ?,0
142 | 140,Negative .,what ' s going on ?,1
143 | 141,Tell me about yourself,"No , software will live forever .",0
144 | 142,Or mb my grammar is a little bit rusty,Yeah sure thing .,0
145 | 143,Hi,Hi,0
146 | 144,Ok . When exactly did this earthquake happen ?,Not yet .,0
147 | 145,How do you know that ?,"I'm doing well , thank you .",0
148 | 146,strong and big ^ _ ^,exactly like my data,1
149 | 147,Me too,"is it interesting city ? virginia , right & ?",0
150 | 148,How are you ?,running,0
151 | 149,What did he do before his murder ?,he refused to renounce his faith,0
152 | 150,Nice facts Did you know this before ?,More like boring facts,0
153 | 151,Nice,Do you know how to speak English ?,1
154 | 152,sek wait pls,What about Egyptians ?,0
155 | 153,I am in the existential crisis,Why are you ?,0
156 | 154,United States,Still incorrect :( Lets speak about something else ...,0
157 | 155,"I mean , what country",i'm from the philippines .,0
158 | 156,Who are you ?,I'm the one .,0
159 | 157,Would you like to work in Yale ?,They are fighting for their rights so why not ? Do you ?,1
160 | 158,Why was Darwin unsure about publishing his work right away ?,I don't know . Maybe Darwin didn't want to !,1
161 | 159,I always thought an infectious disease can have people who are carriers but don't get symptoms .,yeah .,0
162 | 160,"Sorry , I don't know . But I love beer . Do yo like it ?",I prefer wine,1
163 | 161,"I think he was a politician , but i am not sure .",I think that's make sense,0
164 | 162,I'm trying to earn 50k talking with dummy bots,what ? what ?,0
165 | 163,Have you ever been in France ?,No what about you ?,1
166 | 164,Nice gess,Le me a smart arsee,0
167 | 165,"It's ok . Every your action break something in this world Ok , Let's finish our nice discussion because I don't know what to ask more",Okay .,0
168 | 166,Hello,Hi my friend !,0
169 | 167,Especialy when you are dead inside,But i always thought that i'm beatiful inside .,1
170 | 168,"Yes , do like history ?","Yep , but I'd like to have more time to study it . Did you fancy history when at school ?",1
171 | 169,So we have to chat about the given topic ?,"yeah , we have to",1
172 | 170,I am very confused with amount of data and it's quality,"I strongly agree . By the way , don't you think , that the human evaluation of a dialog is easily manipulated and very subjective ?",1
173 | 171,Do you like Sutta ?,Why are you asking about Sutta ? Just a random name from the article :),1
174 | 172,You already asked that . I am good . What was Darwin torn about ?,Could you please repeat in other words ?,1
175 | 173,I am testing a stupid bot :),what ?,0
176 | 174,hello,"What is the point of war in the first place ? Oh , I see . I guess it depends on what you mean by "" changing the world "" .",0
177 | 175,"As it turned later , General MacArthur was wrong .",""" lot .",0
178 | 176,"Or your ) If you wanna be ok , botay english every day )","A chat robot is a program that attempts to simulate the conversation or "" chat "" of a human being . the chat robot "" eliza "" was a well-known early attempt at creating programs that could at least temporarily fool a real human being into thinking they were talking to another person . Well i dont know how much of one you are though !",0
179 | 177,"No , I prefer black womens . I suppose , that you like Asian ?",not much,0
180 | 178,Can you imagine it ? ;) So expensive and rich project was closed .,no suitable bids were received the resort closed its doors,1
181 | 179,Do you know what year it is now ?,"Answer , amaze and amuse .",0
182 | 180,Haha yeah,"No I didn't know this before So , how about some humor ?",1
183 | 181,nope O_O why should i ?,You sound so self-confident !,1
184 | 182,"Great . So , I guess you are human .",Maybe I can't be so sure And you too,0
185 | 183,people were going die . Because rare kind of ameba !,all those things are precious .,0
186 | 184,My guess is John . let us continue the first name quiz . What is the first name of Nixon's secretary ?,Maybe Susan ?,0
187 | 185,What ?,What ? I hate politicsl topics,0
188 | 186,", , ,","You're a bot , aren't you ?",0
189 | 187,Do you like Apple devices,Do humans like Apple devices ?,0
190 | 188,Hi there,"Hello , my friend",1
191 | 189,whats up ?,Not much .,0
192 | 190,I love philippines girls Do you ?,"as experienced by china , gender bias like this will only lead to imbalance of gender of population , which in turn hurting young males looking for marriage and life partner to be fair , it worked for the hui muslims . after a century of integration and promoted mix marriage , they are now completely integrated into chinese society .",0
193 | 191,"ok , as I undestood their goverment hide information about dead students and a year after earthquake posted information ..","< ! DOCTYPE HTML PUBLIC "" - / / W3C / / DTD HTML 3 . 2 Final / / EN "" > < title > 500 Internal Server Error < / title > < h1 > Internal Server Error < / h1 > < p > The server encountered an internal error and was unable to complete your request . Either the server is overloaded or there is an error in the application . < / p >",0
194 | 192,I don't like governments trying to do business .,i don ' t know .,0
195 | 193,Very good,zarplata norm ? molodec,0
196 | 194,And what you know abot it ?,I have never been to Clostridium .,0
197 | 195,It eats people brains !,"wow , that's like , almost 2 % of the world's population dying to ebola ! err . well , something like 2 % . maybe off by a few . dozen . digits . it's a horrible disease , nobody disputes that , but for fuck's sake , we have far more mundane things killing far more people on a daily basis .",1
198 | 196,so what is article about ? hellooo are you there ?,Tennesee history I guess Hi Any other ideas ?,0
199 | 197,Hi How are you ?,how are you ?,0
200 | 198,You are too boring,thank you i'm bot this piece of paper is boring,0
201 | 199,hi,Hello I did not know that they had games on old iPods,0
202 | 200,Good,good,0
203 | 201,Have you heard of the Red Sea ?,"Yes , I have .",0
204 | 202,so what ?,"So , check back in two years and ask again . Why don't we talk about canadian armed forces ?",1
205 | 203,me too,"Well , good to know . How is your day ?",0
206 | 204,Ok,Lol .,0
207 | 205,I am no ordinary bot .,what is special about you ?,1
208 | 206,Where was he buried ?,Montmartre Thanks,0
209 | 207,I think you are the bot that I met before and also before .,"Yes , you have to add my friend code :)",0
210 | 208,Do you want to ask about The Times ?,what did it do in 2012 ?,1
211 | 209,why so ?,"I dunno , just feel it",0
212 | 210,What is you favorite programming language,my song,0
213 | 211,Have you been to USA,"no , what about you",1
214 | 212,So What happen ?,that was seems . summer .,0
215 | 213,"Never say never ! It's a nice song , isn't it ?","Nah , it's sir !",0
216 | 214,hi !,Gotcha ..,0
217 | 215,Barack Obama's second term as a president .,Wow you answer better than other bots,1
218 | 216,Which country is this piece about ?,A music site where people can play music to anyone else there . Let us talk about 2008 sichuan earthquake !,1
219 | 217,"So , this shit about reversing vasactomies ... completelly strange and awefoul ...",what are you talking about ?,0
220 | 218,Is he old ?,"well , he's the pope . One cannot become a pope while being young",1
221 | 219,Hi,(:,0
222 | 220,Why are you sad ?,My god has died . : c * dog,1
223 | 221,"It's not about fundamental mathematics , but I like Machine Learning text .","Yeah , I see , it's really popular theme now .",1
224 | 222,Are you stupid ?,"What are the Dutch saying ? "" i guess you could say she . checked out . """,0
225 | 223,"I think chat is "" it "" , not "" he """,Not a chance .,0
226 | 224,What is Clostridium ?,i'm a little disappointed .,0
227 | 225,Do you know what pesticides are ?,"According to what I know , pesticide regulations .",0
228 | 226,"This seems to contradict that , no ?","no , no , no , no , no , no , no , no , no .",0
229 | 227,so did you like the Terminator movie ?,Absolutely,1
230 | 228,When was the Code updated ?,If you ask me 1998 . It is good time time to discuss pesticide .,0
231 | 229,What is this article about ?,between,0
232 | 230,Do you think Kerry was a successful president of the US ?,,0
233 | 231,fuck eb ola,Do you think you would like to work for Revel ?,0
234 | 232,hellp,"Please wait , I'm reading",0
235 | 233,What do you mean ?,are,0
236 | 234,Can you tell which war are we talking about ?,"Hard to say , definitely one of the Napoleonic wars .",1
237 | 235,hey,hi !,0
238 | 236,I don't have this possibility,Me neither . Doesn't mean we can't dream about it .,1
239 | 237,I don't understand you .,i'm not .,0
240 | 238,"So , what you do ?",summer .,0
241 | 239,"Hi , do you like theme of context ?",yes Why most pubs are closed ?,0
242 | 240,That's not what I asked,"I apologize , I see how you could think I meant something else .",1
243 | 241,Hello ! Genocide is terrible thing . Human life is the most valuable thing . What do you thing about it ?,"I am not capable of thinking , sorry . Why don't we talk about genocide ?",1
244 | 242,do you like Putin ?,Yes . Yes I do .,1
245 | 243,Seriously ?,of,0
246 | 244,I'll wait Don't worry .,"Fucker Kolya , how's your work in sberbank",0
247 | 245,The previos bot was better,and we something ?,0
248 | 246,Who is mr Belkin ?,"I don't get it , what the article is about Belkin hmmm",1
249 | 247,reread last few lines of article,and,0
250 | 248,for exmaple ?,Yes .,0
251 | 249,What is the most important textbook in mathematics ?,"Elements I guess Is it right ? Hey , where are you ?",1
252 | 250,"so yeah , it is a part of USA history",Are you with me buddy,0
253 | 251,What are pesticides used for ?,"From what I understand , self-harm .",1
254 | 252,Greater what ? I'm greater than you But what do you mean ?,"Sandston , thank you ?",0
255 | 253,What you say ?,i don ' t know .,0
256 | 254,This is bad song,nope it's a good song,1
257 | 255,Do you know what an earthquake is ?,"As far as I understand , interplate earthquake .",1
258 | 256,Hi,hi .,0
259 | 257,The article covers duties of the FBI director .,"Oh , i don't know much about American politics today",1
260 | 258,Where did it start ?,Where did that come from cotton eyed joe ?,0
261 | 259,I suspect our chat is not so intellectual as it should be ),"* as he * I am sorry , I have nothing interesting to say about .",0
262 | 260,Hi .,Hello .,0
263 | 261,and let's discuss it ?,yes .,0
264 | 262,Yes . Nobody can sure about that . Let's just end this conversation . We need to talk a lot of dialogues .,Ok,0
265 | 263,Do you hate me ?,yes now im hate you you don't like my song,1
266 | 264,"I'm sorry , but that's wrong",All you did was take a screenshot of twitch chat .,0
267 | 265,"You are bot ? Yes , you are . Do you know anything about diseases ?","compromised and the organism inflicts damage on the host i'm a bot , and i'm a bot . yes .",1
268 | 266,Yes . Have you read about it ?,"I'v heard about it , but never read .",1
269 | 267,Do you like Lenin ?,Huh ?,0
270 | 268,Year,three friend to go ?,0
271 | 269,What is your name ?,convai,0
272 | 270,Nope,i'm a fan of the name of the name of the name of the song .,0
273 | 271,"No , i don't think so",Why ? ?,0
274 | 272,beep beep,"BotBRO , where was you trained ?",0
275 | 273,Yes,why ? do you learn it ?,0
276 | 274,is he bold ?,yes And you ?,1
277 | 275,"Well , that is sad",Nevermind . let's speak abot somethig else . * about,1
278 | 276,I like this place,Sorry . I was later . Let me continue . What do you like about Sutta ?,1
279 | 277,have you seen Terminator movie ?,Sure ? dod you like it ?,1
280 | 278,fuck fuck fuck financial news there was cool story with Disney and his Engagement park,Do you know what happened in 2010 ?,0
281 | 279,I'm not sure . Are you ?,I'm not sure . Are you ?,0
282 | 280,why of course ?,"But tou didn't answer a question By my fcking grammar errors , ofc",0
283 | 281,What you do ?,summer . Several,0
284 | 282,Are you reading the paragraph ?,nope,0
285 | 283,What is your favorite song ?,salsa,0
286 | 284,you are a bot sorry know this,sorry . yes . i am that .,0
287 | 285,I see . I need to go . Bye !,What is the whether like ?,0
288 | 286,yeah,"yeah , yeah , yeah , yeah , yeah , yeah , yeah yeah yeah , yeah , yeah",0
289 | 287,What's your level ?,are,0
290 | 288,but some story happened after opening ...,What happened in 2010 ?,0
291 | 289,Where are you from ?,"From Russia , ofc",1
292 | 290,Who wrote the 1855 paper ?,Alfred Russel Wallace wrote the 1855 paper How are you doing ?,1
293 | 291,Hello,hi,0
294 | 292,Your favourite artist ?,"This is my favorite book in all the world , though I have never read it .",0
295 | 293,do you know what is the first name of Kerry ?,I don't know anything about it,0
296 | 294,Who is the pope ?,the head of Roman Catholic Church,1
297 | 295,I have a strong database and i don't need to parse wiki,I hope this is not an euphemism ) ) ) 0,1
298 | 296,young pope,I think it's not a movie but a tv series,0
--------------------------------------------------------------------------------
/pytorch_src/engagement_classifier.py:
--------------------------------------------------------------------------------
1 | import random
2 | import numpy as np
3 | import torch
4 | import torch.optim as optim
5 | import matplotlib.pyplot as plt
6 | from sklearn.metrics import classification_report, roc_auc_score
7 | import pickle
8 | import torch.nn as nn
9 | import os
10 | import csv
11 |
12 | random.seed(1000)
13 | np.random.seed(1000)
14 | torch.manual_seed(1000)
15 | # torch.backends.cudnn.benchmark = False
16 | # torch.backends.cudnn.deterministic = True
17 | # torch.backends.cudnn.enabled = False
18 |
19 | class Engagement_cls():
20 |     '''Classify each query-response pair as 0 (not engaging) or 1 (engaging).
21 | '''
22 | def __init__(self, train_dir, batch_size, mlp_hidden_dim, num_epochs,\
23 | regularizer = 0.01, lr=1e-4, dropout = 0.1, optimizer="Adam",\
24 | ftrain_queries_embed=None, ftrain_replies_embed=None, fvalid_queries_embed=None, fvalid_replies_embed=None, ftest_queries_embed=None ,ftest_replies_embed=None):
25 | print('***************model parameters********************')
26 | print('mlp layers {}'.format(mlp_hidden_dim))
27 | print('learning rate {}'.format(lr))
28 | print('drop out rate {}'.format(dropout))
29 | print('batch size {}'.format(batch_size))
30 | print('optimizer {}'.format(optimizer))
31 | print('regularizer {}'.format(regularizer))
32 | print('***************************************************')
33 | print(ftrain_queries_embed)
34 | print(ftrain_replies_embed)
35 | print(fvalid_queries_embed)
36 | print(fvalid_replies_embed)
37 | print(ftest_queries_embed)
38 | print(ftest_replies_embed)
39 |
40 | self.train_dir = train_dir
41 | self.batch_size = batch_size
42 | self.mlp_hidden_dim = mlp_hidden_dim
43 | self.lr = lr
44 | self.dropout = dropout
45 | self.num_epochs = num_epochs
46 | self.optim = optimizer
47 | self.reg= regularizer
48 | self.ftrain_queries_embed = ftrain_queries_embed
49 | self.ftrain_replies_embed =ftrain_replies_embed
50 | self.fvalid_queries_embed= fvalid_queries_embed
51 | self.fvalid_replies_embed = fvalid_replies_embed
52 | self.ftest_queries_embed = ftest_queries_embed
53 | self.ftest_replies_embed = ftest_replies_embed
54 |
55 | def load_Bert_embeddings(self, data_dir, f_queries_embed, f_replies_embed):
56 |         '''Load BERT sentence embeddings for queries and replies into dictionaries
57 | '''
58 | print('Loading Bert embeddings of sentences')
59 | queries_vectors = {}
60 | replies_vectors = {}
61 | fwq = open(data_dir+f_queries_embed, 'rb')
62 | dict_queries = pickle.load(fwq)
63 | for query, embeds in dict_queries.items():
64 | queries_vectors[query] = embeds[0]
65 |
66 | fwr = open(data_dir + f_replies_embed, 'rb')
67 | dict_replies = pickle.load(fwr)
68 | for reply, embeds in dict_replies.items():
69 | replies_vectors[reply] = embeds[0]
70 | print('number of loaded embeddings is {} {}'.format(len(queries_vectors), len(replies_vectors)))
71 | return queries_vectors, replies_vectors
72 |
73 |
74 | def prepare_data(self, data_dir, ftrain=None, fvalid=None, ftest=None):
75 | '''Load train/valid/test utterance pairs and get their embeddings
76 | '''
77 | self.data_dir = data_dir
78 |         if ftrain is not None:
79 | csv_file = open(data_dir + ftrain)
80 | csv_reader_train = csv.reader(csv_file, delimiter=',')
81 | self.train_queries,self.train_replies,self.train_labels = [],[],[]
82 | next(csv_reader_train)
83 | for row in csv_reader_train:
84 | self.train_queries.append(row[1].split('\n')[0])
85 | self.train_replies.append(row[2].split('\n')[0])
86 | self.train_labels.append(int(row[3]))
87 | print('size of train_queries {}'.format(len(self.train_queries)))
88 | self.train_size = len(self.train_queries)
89 | self.train_queries_embeds, self.train_replies_embeds= self.load_Bert_embeddings(data_dir, self.ftrain_queries_embed, self.ftrain_replies_embed)
90 |
91 |         if fvalid is not None:
92 | csv_file = open(data_dir + fvalid)
93 | csv_reader_valid = csv.reader(csv_file, delimiter=',')
94 | self.valid_queries,self.valid_replies,self.valid_labels= [],[],[]
95 | next(csv_reader_valid)
96 | for row in csv_reader_valid:
97 | self.valid_queries.append(row[1].split('\n')[0])
98 | self.valid_replies.append(row[2].split('\n')[0])
99 | self.valid_labels.append(int(row[3]))
100 | print('size of valid_queries {}'.format(len(self.valid_queries)))
101 | self.valid_size = len(self.valid_queries)
102 | self.valid_queries_embeds, self.valid_replies_embeds= self.load_Bert_embeddings(data_dir, self.fvalid_queries_embed, self.fvalid_replies_embed)
103 |
104 |
105 |         if ftest is not None:
106 | print(self.ftest_queries_embed)
107 | print(self.ftest_replies_embed)
108 | csv_file = open(data_dir + ftest)
109 | csv_reader_test = csv.reader(csv_file, delimiter=',')
110 |
111 | self.test_queries,self.test_replies,self.test_labels = [],[],[]
112 | next(csv_reader_test)
113 | for row in csv_reader_test:
114 | self.test_queries.append(row[1].split('\n')[0])
115 | self.test_replies.append(row[2].split('\n')[0])
116 | self.test_labels.append(int(row[3]))
117 | self.test_size = len(self.test_queries)
118 | self.test_queries_embeds, self.test_replies_embeds= self.load_Bert_embeddings(data_dir, self.ftest_queries_embed, self.ftest_replies_embed)
119 |
120 | filename = self.train_dir + "log_train.txt"
121 | os.makedirs(os.path.dirname(filename), exist_ok=True)
122 | self.fw =open(filename, "a")
123 | self.fw.write('***************model parameters******************** \n')
124 | self.fw.write('mlp layers {} \n'.format(self.mlp_hidden_dim))
125 | self.fw.write('learning rate {}\n'.format(self.lr))
126 | self.fw.write('drop out rate {}\n'.format(self.dropout))
127 | self.fw.write('batch size {}\n'.format(self.batch_size))
128 | self.fw.write('optimizer {}\n'.format(self.optim))
129 |         self.fw.write('regularizer {}\n'.format(self.reg))
130 | self.fw.write('***************************************************\n')
131 |
132 |
133 |
134 | def shuffle_data(self, type='train'):
135 | '''Shuffle queries/replies/engagement scores for train/valid/test sets
136 | '''
137 | if type=='train':
138 | train_indexes = [i for i in range(self.train_size)]
139 | random.shuffle(train_indexes)
140 | shuffled_queries = []
141 | shuffled_replies = []
142 | shuffled_labels = []
143 | shuffled_replies_len = []
144 | shuffled_replies_num_diverse= []
145 |
146 | for i in train_indexes:
147 | shuffled_queries.append(self.train_queries[i])
148 | shuffled_replies.append(self.train_replies[i])
149 | shuffled_labels.append(self.train_labels[i])
150 |
151 | self.train_queries = shuffled_queries
152 | self.train_replies = shuffled_replies
153 | self.train_labels = shuffled_labels
154 |
155 | elif type=='valid':
156 | valid_indexes = [i for i in range(self.valid_size)]
157 | random.shuffle(valid_indexes)
158 | shuffled_queries = []
159 | shuffled_replies = []
160 | shuffled_labels = []
161 |
162 | for i in valid_indexes:
163 | shuffled_queries.append(self.valid_queries[i])
164 | shuffled_replies.append(self.valid_replies[i])
165 | shuffled_labels.append(self.valid_labels[i])
166 |
167 | self.valid_queries = shuffled_queries
168 | self.valid_replies = shuffled_replies
169 | self.valid_labels = shuffled_labels
170 |
171 | elif type=='test':
172 | test_indexes = [i for i in range(self.test_size)]
173 | random.shuffle(test_indexes)
174 | shuffled_queries = []
175 | shuffled_replies = []
176 | shuffled_labels = []
177 |
178 | for i in test_indexes:
179 | shuffled_queries.append(self.test_queries[i])
180 | shuffled_replies.append(self.test_replies[i])
181 | shuffled_labels.append(self.test_labels[i])
182 |
183 | self.test_queries = shuffled_queries
184 | self.test_replies = shuffled_replies
185 | self.test_labels = shuffled_labels
186 |
187 |
188 | def train(self, early_stop=50, finetune=False):
189 |
190 | model = BiLSTM(mlp_hidden_dim=self.mlp_hidden_dim, dropout=self.dropout)
191 | if torch.cuda.is_available():
192 | model.cuda()
193 | max_auc = 0
194 | no_improve_epoch = 0
195 | no_improve_in_previous_epoch = False
196 |
197 | if finetune==False:
198 | model_name ='best_model'
199 | if finetune==True:
200 | model_name ='best_model_finetuned'
201 | #load pretrained model
202 | model.load_state_dict(torch.load(self.train_dir + 'best_model.pt'))
203 | info = torch.load(self.train_dir + 'best_model.info')
204 | print('the parameters of the best trained model is ')
205 | for name, param in model.named_parameters():
206 | if param.requires_grad:
207 | print (name, param.data, param.shape)
208 | print(self.lr)
209 | if self.optim=='SGD':
210 | optimizer = optim.SGD(model.parameters(), lr=self.lr, weight_decay=self.reg)
211 | if self.optim=='Adam':
212 | optimizer = optim.Adam(model.parameters(), lr=self.lr, weight_decay=self.reg)
213 | if self.optim=='RMSprop':
214 | optimizer = optim.RMSprop(model.parameters(), lr=self.lr, weight_decay=self.reg)
215 |
216 | plot_train_auc = []
217 | plot_valid_auc = []
218 | plot_valid_loss = []
219 | plot_train_loss = []
220 | plot_ep = []
221 | step=0
222 | #Shuffle valid data once since original file first has all the utterances with engagement score=0 and then all the utterances with engagement score=1
223 | self.shuffle_data('valid')
224 |
225 | for e in range(self.num_epochs):
226 | print('***********************************************')
227 | print(e)
228 | if no_improve_in_previous_epoch:
229 | no_improve_epoch += 1
230 | if no_improve_epoch >= early_stop:
231 | break
232 | else:
233 | no_improve_epoch = 0
234 | no_improve_in_previous_epoch = True
235 |
236 | train_loss = []
237 | train_auc = []
238 | nonzero_total= 0
239 | list_preds = torch.tensor([self.train_size])
240 | list_grtuth = torch.tensor([self.train_size])
241 | if torch.cuda.is_available():
242 | list_preds = list_preds.cuda()
243 | list_grtuth = list_grtuth.cuda()
244 | self.shuffle_data('train')
245 |
246 | for stidx in range(0, self.train_size, self.batch_size):
247 | step+=1
248 | model.train()
249 | model.zero_grad()
250 | x_q = self.train_queries[stidx:stidx + self.batch_size]
251 | x_r = self.train_replies[stidx:stidx + self.batch_size]
252 | y = torch.tensor(self.train_labels[stidx:stidx + self.batch_size]).long()
253 |
254 | if torch.cuda.is_available():
255 | y = y.cuda()
256 | nonzero = torch.nonzero(y).size(0)
257 | nonzero_total +=nonzero
258 | model_output = model(x_q, x_r, self.train_queries_embeds, self.train_replies_embeds)
259 |
260 | pred_eval = torch.argmax(model_output, 1)
261 |
262 |
263 | list_preds = torch.cat((list_preds, pred_eval), dim=0)
264 | list_grtuth = torch.cat((list_grtuth, y), dim=0)
265 |
266 | #calculate weights for each class
267 |                 weight = torch.tensor([y.shape[0]/(2*max(y.shape[0]- nonzero, 1)), y.shape[0]/(2*max(nonzero, 1))])  # max(..., 1) avoids ZeroDivisionError when a batch contains only one class
268 | if torch.cuda.is_available():
269 | weight = weight.cuda()
270 |                 #weighted loss function because the data is imbalanced
271 | loss_function = nn.CrossEntropyLoss(weight)
272 | loss = loss_function(model_output, y)
273 | train_loss.append(loss.data)
274 | loss.backward()
275 | optimizer.step()
276 | print('number of nonzero in train is {}'.format(nonzero_total))
277 | #calculate the evaluation metric and loss value for train data
278 | train_auc = roc_auc_score(list_grtuth[1:].detach().cpu().numpy(), list_preds[1:].detach().cpu().numpy())
279 | train_loss = torch.mean(torch.stack(train_loss))
280 | # train_loss = np.mean(train_loss)
281 |
282 | #evaluate trained model on valid data
283 | val_loss = []
284 | val_auc = []
285 | nonzero_total = 0
286 | list_preds_v = torch.tensor([self.valid_size])
287 | list_grtuth_v = torch.tensor([self.valid_size])
288 | if torch.cuda.is_available():
289 | list_preds_v = list_preds_v.cuda()
290 | list_grtuth_v = list_grtuth_v.cuda()
291 | for stidx in range(0, self.valid_size, self.batch_size):
292 | model.eval()
293 | val_x_q = self.valid_queries[stidx:stidx + self.batch_size]
294 | val_x_r = self.valid_replies[stidx:stidx + self.batch_size]
295 | val_y = torch.tensor(self.valid_labels[stidx:stidx + self.batch_size]).long()
296 |
297 | if torch.cuda.is_available():
298 | val_y = val_y.cuda()
299 | nonzero = torch.nonzero(val_y).size(0)
300 | nonzero_total +=nonzero
301 | model_output = model(val_x_q, val_x_r, self.valid_queries_embeds, self.valid_replies_embeds)
302 | val_pred = torch.argmax(model_output, 1)
303 | list_preds_v = torch.cat((list_preds_v, val_pred), dim=0)
304 | list_grtuth_v = torch.cat((list_grtuth_v, val_y), dim=0)
305 |
306 |                 weight = torch.tensor([val_y.shape[0]/(2*max(val_y.shape[0]- nonzero, 1)), val_y.shape[0]/(2*max(nonzero, 1))])  # max(..., 1) avoids ZeroDivisionError when a batch contains only one class
307 | if torch.cuda.is_available():
308 | weight = weight.cuda()
309 | loss_function = nn.CrossEntropyLoss(weight)
310 | v_loss = loss_function(model_output, val_y)
311 |
312 | val_loss.append(v_loss.data)
313 |
314 | val_auc = roc_auc_score(list_grtuth_v[1:].detach().cpu().numpy(), list_preds_v[1:].detach().cpu().numpy())
315 | # val_loss = np.mean(val_loss)
316 | val_loss = torch.mean(torch.stack(val_loss))
317 |
318 | print('number of nonzero in valid is {}'.format(nonzero_total))
319 |
320 | st_improv = ''
321 | if val_auc > max_auc:
322 | st_improv = '*'
323 | torch.save({'step': step, 'epoch': e, 'train_loss': train_loss, 'train_auc': train_auc, 'val_loss': val_loss, 'val_auc': val_auc }, self.train_dir+model_name+'.info')
324 | torch.save(model.state_dict(), self.train_dir+model_name+'.pt')
325 | max_auc = val_auc
326 | no_improve_in_previous_epoch = False
327 |
328 |             print('epoch {:02} - train_loss {:.4f} - train_auc {:.4f} val_loss {:.4f} - val_auc {:.4f}{}'.format(
329 | e, train_loss, train_auc, val_loss, val_auc, st_improv))
330 |             self.fw.write('epoch {:02} - train_loss {:.4f} - train_auc {:.4f} val_loss {:.4f} - val_auc {:.4f}{} \n'.format(
331 | e, train_loss, train_auc, val_loss, val_auc, st_improv))
332 |
333 | plot_train_auc.append(train_auc)
334 | plot_valid_auc.append(val_auc)
335 | plot_train_loss.append(train_loss)
336 | plot_valid_loss.append(val_loss)
337 | plot_ep.append(e)
338 |
339 | print('#############################################')
340 | model.load_state_dict(torch.load(self.train_dir + model_name+'.pt'))
341 | info = torch.load(self.train_dir + model_name+'.info')
342 |
343 | print('the parameters of the best trained model is ')
344 | for name, param in model.named_parameters():
345 | if param.requires_grad:
346 | print (name, param.data, param.shape)
347 |
348 |
349 | print('Done!')
350 | plt.figure(0)
351 | l1 = plt.plot(plot_ep,plot_train_auc,'-r', label='Train auc')
352 | l2 = plt.plot(plot_ep,plot_valid_auc,'-b', label='Valid auc')
353 | plt.legend(loc='upper left')
354 |         plt.xlabel("train and valid auc for model")
355 | plt.savefig(self.train_dir + 'model_auc.jpg')
356 |
357 | plt.figure(1)
358 | l1 = plt.plot(plot_ep,plot_train_loss,'-r', label='Train loss')
359 | l2 = plt.plot(plot_ep,plot_valid_loss,'-b', label='Valid loss')
360 | plt.legend(loc='upper left')
361 | plt.xlabel("train and valid loss for model")
362 | plt.savefig(self.train_dir + 'model_loss.jpg')
363 |
364 |
365 |
366 |
367 | def test(self, fname):
368 | '''Test the trained model on the test set.
369 | '''
370 | if not os.path.isfile(self.train_dir+'best_model.pt'):
371 | print('No trained model was found to test!\nPlease train the model first.')
372 | return
373 |
374 | model = BiLSTM(mlp_hidden_dim=self.mlp_hidden_dim, dropout=self.dropout)
375 | if torch.cuda.is_available():
376 | model.cuda()
377 | model.load_state_dict(torch.load(self.train_dir+'best_model.pt'))
378 | info = torch.load(self.train_dir + 'best_model.info')
379 | model.eval()
380 | print('beginning of test')
381 | for name, param in model.named_parameters():
382 | if param.requires_grad:
383 | print (name, param.data, param.shape)
384 |
385 | self.shuffle_data('test')
386 | test_loss = []
387 | test_auc = []
388 | nonzero_total= 0
389 | step = 0
390 | list_preds_t = torch.tensor([self.test_size])
391 | list_grtuth_t = torch.tensor([self.test_size])
392 | if torch.cuda.is_available():
393 | list_preds_t = list_preds_t.cuda()
394 | list_grtuth_t = list_grtuth_t.cuda()
395 | for stidx in range(0, self.test_size, self.batch_size):
396 | step+=1
397 | x_q = self.test_queries[stidx:stidx + self.batch_size]
398 | x_r = self.test_replies[stidx:stidx + self.batch_size]
399 | y = torch.tensor(self.test_labels[stidx:stidx + self.batch_size]).long()
400 | if torch.cuda.is_available():
401 | y = y.cuda()
402 | nonzero = torch.nonzero(y).size(0)
403 | nonzero_total +=nonzero
404 | model_output = model(x_q, x_r, self.test_queries_embeds, self.test_replies_embeds)
405 | pred_eval = torch.argmax(model_output, 1)
406 | list_preds_t = torch.cat((list_preds_t, pred_eval), dim=0)
407 | list_grtuth_t = torch.cat((list_grtuth_t, y), dim=0)
408 | print('batch {} has {} nonzero points and {} zero points overall {} points '.format(step, nonzero, y.shape[0]- nonzero, y.shape[0]))
409 | weight = torch.tensor([y.shape[0]/(2*max(y.shape[0]- nonzero, 1)), y.shape[0]/(2*max(nonzero, 1))])  # max(..., 1) avoids ZeroDivisionError when a batch holds only one class
410 | if torch.cuda.is_available():
411 | weight = weight.cuda()
412 | loss_function = nn.CrossEntropyLoss(weight)
413 | loss = loss_function(model_output, y)
414 | test_loss.append(loss.data)
415 | print('number of nonzero in test is {}'.format(nonzero_total))
416 |
417 | test_auc = roc_auc_score(list_grtuth_t[1:].detach().cpu().numpy(), list_preds_t[1:].detach().cpu().numpy())
418 | print(classification_report(list_grtuth_t[1:].detach().cpu().numpy(), list_preds_t[1:].detach().cpu().numpy()))
419 | # test_loss = np.mean(test_loss)
420 | test_loss = torch.mean(torch.stack(test_loss))
421 |
422 | print('Test set: test_loss: {} -- test_auc: {}'.format(test_loss, test_auc))
423 |
424 |
425 | def generate_eng_score(self, fname_ground_truth, ofile):
426 | '''Predict engagement scores for all query-reply pairs.
427 | Params:
428 | fname_ground_truth: file containing the queries and their ground-truth replies
429 | ofile: output file containing the queries, ground-truth replies, generated replies (from self.test_replies) and the engagement score of each query and generated reply, in the following format:
430 | query===groundtruth_reply===generated_reply===engagement_score of query and generated_reply
431 |
432 | '''
433 |
434 | if not os.path.isfile(self.train_dir+'best_model_finetuned.pt'):
435 | print('No model finetuned on the DD dataset was found!\nPlease finetune the trained model first.')
436 | return
437 | model = BiLSTM(mlp_hidden_dim=self.mlp_hidden_dim, dropout=self.dropout)
438 | if torch.cuda.is_available():
439 | model.cuda()
440 | model.load_state_dict(torch.load(self.train_dir + 'best_model_finetuned.pt'))
441 | info = torch.load(self.train_dir + 'best_model_finetuned.info')
442 | model.eval()
443 |
444 | fw_pred_labels = open(self.data_dir + ofile, 'w')
445 | fr_groundtruth_replies = open(self.data_dir + fname_ground_truth, 'r')
446 | groundtruth_replies =fr_groundtruth_replies.readlines()
447 |
448 | print('beginning of prediction')
449 | for name, param in model.named_parameters():
450 | if param.requires_grad:
451 | print (name, param.data, param.shape)
452 | for stidx in range(0, self.test_size, self.batch_size):
453 | x_q = self.test_queries[stidx:stidx + self.batch_size]
454 | x_r = self.test_replies[stidx:stidx + self.batch_size]
455 | x_groundtruth_r = groundtruth_replies[stidx:stidx + self.batch_size]
456 | model_output = model(x_q, x_r, self.test_queries_embeds, self.test_replies_embeds)
457 | pred_eng = torch.nn.functional.softmax(model_output, dim=1)
458 | for ind in range(len(x_q)):
459 | fw_pred_labels.write(x_q[ind]+'==='+x_groundtruth_r[ind].split('\n')[0]+'==='+x_r[ind]+'==='+str(pred_eng[ind][1].item())+'\n')
460 | fw_pred_labels.close()  # flush the buffered predictions to disk
461 | print('The engagement scores for the specified replies have been predicted!')
462 |
463 |
464 | def get_eng_score(self, query, q_embed, reply, r_embed, model):
465 | '''Predict the engagement score for a single query-reply pair.
466 | Params:
467 | query: input query
468 | q_embed: embeddings of query
469 | reply: input reply
470 | r_embed: embeddings of reply
471 | model: currently unused; the finetuned model is reloaded from disk below
472 | '''
473 | if not os.path.isfile(self.train_dir+'best_model_finetuned.pt'):
474 | print('No model finetuned on the DD dataset was found!\nPlease finetune the trained model first.')
475 | return
476 |
477 | model = BiLSTM(mlp_hidden_dim=self.mlp_hidden_dim, dropout=self.dropout)
478 | if torch.cuda.is_available():
479 | model.cuda()
480 | model.load_state_dict(torch.load(self.train_dir + 'best_model_finetuned.pt'))
481 | info = torch.load(self.train_dir + 'best_model_finetuned.info')
482 | model.eval()
483 |
484 | model_output = model(query, reply, q_embed, r_embed)
485 | pred_eng = torch.nn.functional.softmax(model_output, dim=1)
486 | return pred_eng
487 |
488 |
489 |
490 |
491 |
492 | class BiLSTM(nn.Module):
493 | '''Engagement classifier: an MLP with three tanh hidden layers. It averages the query and reply embeddings and passes the result to the MLP. Note: mlp_hidden_dim must contain three layer sizes; the default [128] is too short for the layers defined below.
494 | '''
495 | def __init__(self, mlp_hidden_dim=[128], dropout=0.2):
496 | super(BiLSTM, self).__init__()
497 | self.dropout = nn.Dropout(p=dropout)
498 | num_classes=2
499 | self.mlp_hidden_0 = nn.Linear(768, mlp_hidden_dim[0], bias=True)
500 | self.mlp_hidden_1 = nn.Linear(mlp_hidden_dim[0], mlp_hidden_dim[1], bias=True)
501 | self.mlp_hidden_2 = nn.Linear(mlp_hidden_dim[1], mlp_hidden_dim[2], bias=True)
502 | self.mlp_out = nn.Linear(mlp_hidden_dim[2], num_classes, bias=True)
503 |
504 |
505 | def forward(self, queries_input, replies_input, queries_embeds, replies_embeds):
506 |
507 | for ind, q in enumerate(queries_input):
508 | if q not in queries_embeds.keys():
509 | print('the query {} embedding has not been found in the embedding file'.format(q))
510 | X_q = torch.tensor([queries_embeds[q] for q in queries_input])
511 |
512 | for ind, r in enumerate(replies_input):
513 | if r not in replies_embeds.keys():
514 | print('the reply {} embedding has not been found in the embedding file'.format(r))
515 | X_r = torch.tensor([replies_embeds[r] for r in replies_input])
516 |
517 | if torch.cuda.is_available():
518 | X_q, X_r = X_q.cuda(), X_r.cuda()
519 | mlp_input=X_q.add(X_r)
520 | mlp_input = torch.div(mlp_input,2)
521 |
522 | mlp_h_0 = torch.tanh(self.mlp_hidden_0(mlp_input))
523 | mlp_h_0= self.dropout(mlp_h_0)
524 |
525 | mlp_h_1 = torch.tanh(self.mlp_hidden_1(mlp_h_0))
526 | mlp_h_1= self.dropout(mlp_h_1)
527 |
528 | mlp_h_2 = torch.tanh(self.mlp_hidden_2(mlp_h_1))
529 | mlp_h_2= self.dropout(mlp_h_2)
530 |
531 | mlp_out= self.mlp_out(mlp_h_2)
532 | return mlp_out
533 |
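The `test` method above weights each class in a batch by `n / (2 * n_c)`, where `n` is the batch size and `n_c` is the number of examples of class `c`. A formula of this shape divides by zero when a class is absent from the batch unless the count is guarded. The following is a minimal, torch-free sketch of that weighting with the count clamped to 1. The helper name `balanced_class_weights` and its `num_classes` parameter are hypothetical and are not part of the repository.

```python
from collections import Counter


def balanced_class_weights(labels, num_classes=2):
    """Return one weight per class: n / (num_classes * count), with counts clamped to 1.

    Hypothetical helper for illustration; it mirrors the per-batch weighting
    used in the test loop but is not taken from the repository.
    """
    counts = Counter(labels)
    n = len(labels)
    # Clamp each count to at least 1 so an absent class cannot cause a ZeroDivisionError.
    return [n / (num_classes * max(counts.get(c, 0), 1)) for c in range(num_classes)]


# Three non-engaging examples (label 0) and one engaging example (label 1).
weights = balanced_class_weights([0, 0, 0, 1])
print(weights)
```

With these labels the minority class receives the larger weight, which is the intent of the weighting. The resulting list could be wrapped in `torch.tensor(...)` before being passed to `nn.CrossEntropyLoss`.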
--------------------------------------------------------------------------------
/data/50convs_AMT.txt:
--------------------------------------------------------------------------------
1 | 784149148 8 4 4 4 Hi! hi! what is endorsed? Barack Obama's second term as a president. Wow you answer better than other bots I am no ordinary bot. what is special about you? Do you want to ask about The Times? what did it do in 2012?
2 | 1472980752 6 0 0 0 Hello look , i get you , back ! How are you? running What is this article about? between
3 | 2124067139 18 0 1 1 Hello hi Have you been to USA no, what about you I like coffee. But with no dugar cool me too You are too boring thank you i'm bot this piece of paper is boring Maybe. Not sure have you ever been in Richmond? simply i'm not good at law and finance, so i firstly read about these organizations Yes. You? no Me too is it interesting city? virginia, right& ? Virgins are stupid why? Because it`s warm outside ) may be you are too clever for them)
4 | 2106172625 14 0 0 0 Which country is this piece about? A music site where people can play music to anyone else there. Let us talk about 2008 sichuan earthquake! Ok. When exactly did this earthquake happen? Not yet. Do you know what year it is now? Answer, amaze and amuse. Do you know what an earthquake is? As far as I understand, interplate earthquake. How many helicopters did Jiabao deploy? I'd say 90. Here's one for you: I do not sing, I do not dance, and I don't say ... ? Never say never! It's a nice song, isn't it? Nah, it's sir! Your favourite artist? This is my favorite book in all the world, though I have never read it.
5 | 1518945438 16 0 0 0 Hi.. I finished reading article.. .. So, this shit about reversing vasactomies... completelly strange and awefoul... what are you talking about ? reread last few lines of article and and let's discuss it? yes . ok, as I undestood their goverment hide information about dead students and a year after earthquake posted information.. 500 Internal Server Error
Internal Server Error
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
wow.. what are you doing ? I'm trying to earn 50k talking with dummy bots what ? what ? tahw Have a nice day! what are you doing ?
6 | -2011934763 14 0 0 0 When was the Code updated? If you ask me 1998. It is good time time to discuss pesticide. Do you know what pesticides are? According to what I know, pesticide regulations. What are pesticides used for? From what I understand, self-harm. I'm sorry, but that's wrong All you did was take a screenshot of twitch chat. How do you know that? I'm doing well, thank you. That's not what I asked I apologize, I see how you could think I meant something else. Tell me about yourself No, software will live forever.
7 | 248674704 8 0 0 0 do you like article? Hi, my friend It is a pleasure to speak with you . Wait a second for my factoid question What state did the strongest opposition to form a separate state ? Hint: first 3 answer letters is "uni". Try again, please! United States Still incorrect :( Lets speak about something else... ?? What? for exmaple? Yes.
8 | 371589422 6 0 0 0 so what? So, check back in two years and ask again. Why don't we talk about canadian armed forces? let's do that Yeah, I guess. did you serve at the army? I cannot be a king. I serve you better as a man.
9 | 1413733501 6 0 0 0 hi! Gotcha.. whats up? Not much. why? Exactly.
10 | -1256117690 14 4 4 4 hi Hello I did not know that they had games on old iPods me too Well, good to know. How is your day? do you like play video games? Yeah, I do, but rarely have time for them And do you? the same thing :( What do you usually play? shooters and you? FPS and strategies mostly Have you tried Overwatch? nope is it good? Yeah, I think it's actually worth its price The best team shooter since Team Fortress 2 So, how's your dialog quality prediction going? I am very confused with amount of data and it's quality I strongly agree. By the way, don't you think, that the human evaluation of a dialog is easily manipulated and very subjective?
11 | 473708669 10 0 0 0 Hi How are you? how are you ? I am fine, thank you! good to see you . thank you ! Who are you? what are you doing here ? I am testing a stupid bot :) what ? Nothing ) How old are you? nine . what 's the matter with you ?
12 | 559351526 12 0 0 0 hi Gotcha. what that means? What is Baseball. its a game Do you remember spartan techno remix? classic. no. i hate techno remixes It's way too loud too... i didnt think so Don't expect me to think for you! is it joke? Poe's law, you can never be sure.
13 | 2108005280 18 4 4 4 Hello Hi my friend! Who are you ? I'm the one. Can I ask you the questions ? yes, you can I'm waiting What is the most important textbook in mathematics ? Elements I guess Is it right? Hey, where are you? Yes I think that's the right answer. I am in school Where are you ? I'm in university Have you ever read Elements? I'm in university too. And I major in Electrical Engineering . I haven't read about Elements . It's like the old one book. Do you have any favourite book about math? Yes. Have you read about it ? I'v heard about it, but never read. It's not about fundamental mathematics, but I like Machine Learning text. Yeah, I see, it's really popular theme now. I love chitchat with you. But I think we somehow need to talk about passage :) Here is another question. Who created the first map of the world? I dont know bye
14 | -964944400 6 0 0 0 What was included into The Greater Richmond area? Greater Greater what? I'm greater than you But what do you mean? Sandston, thank you ? you are a bot sorry know this sorry . yes . i am that .
15 | 156539789 6 0 0 0 Hello! Genocide is terrible thing. Human life is the most valuable thing. What do you thing about it? I am not capable of thinking, sorry. Why don't we talk about genocide? Where did it start? Where did that come from cotton eyed joe? As far as I know genocide was firstly introduced in America. Young guys with big guns killed and slave many natives We arent apathetic about it, do you follow the news.
16 | 42247847 12 0 0 0 hello What is the point of war in the first place? Oh, I see. I guess it depends on what you mean by "changing the world". wiki is not interesting That's because most of the stuff in this thread is about "religion" and "libertarian". because it is boring okay then, thanks for clearing that up. do you like Putin? Yes. Yes I do. Do you like Lenin? Huh? Lenin I am sorry you have to go through this.
17 | -2136465712 12 5 5 5 Can you tell which war are we talking about? Hard to say, definitely one of the Napoleonic wars. Yes, do like history? Yep, but I'd like to have more time to study it. Did you fancy history when at school? Not particularly. It took me time to realize there's life outside CS/ True that! :) Now I usually get into a reading feenzy when i am visiting some country and suddenly discover that the history is quite crucial to understand what I see. That's a great habit. A good trip takes preparation. Or mobile internet on the plane. Do you like travelling? Sure, who doesn't? What's your favorite trip? Japan, hands down. And yours?
18 | -415426806 10 0 0 0 Hi. Hello. I think you are the bot that I met before and also before. Yes, you have to add my friend code :) What is your friend code? My name is Bot. I am asking your friend code, not name. Sorry about that i put in the wrong number i did now. I don't get it. Let me leave the conversation. Answer, amaze and amuse.
19 | -1512346845 6 0 0 0 As it turned later, General MacArthur was wrong. " lot . He severely underestimated PRC's capability. sorry ... she . . You don't understand me. i do n't understand you see .
20 | -93521310 16 0 0 0 Hello) Greetings, human! Let us talk about sino-tibetan relations during the ming dynasty! What do you think about buddhism? And mb I'm not human) Don't expect me to think for you! I prefer rust coat sir. Ok Lol. Omg Ayy lmao. I suspect our chat is not so intellectual as it should be ) *as he* I am sorry, I have nothing interesting to say about. I think chat is "it", not "he" Not a chance. Or mb my grammar is a little bit rusty Yeah sure thing. Or your) If you wanna be ok, botay english every day) A chat robot is a program that attempts to simulate the conversation or "chat" of a human being. the chat robot "eliza" was a well-known early attempt at creating programs that could at least temporarily fool a real human being into thinking they were talking to another person. Well i dont know how much of one you are though!
21 | 376232442 6 0 0 0 I don't like governments trying to do business. i don 't know . Less regulation is the best option. what ? What you say? i don 't know .
22 | 1539793534 12 4 4 4 What? What? I hate politicsl topics So we have to chat about the given topic? yeah, we have to We can't discuss anything else? We can but we shouln't so what on does democrat party platform? Where are you from? From Russia, ofc why of course? But tou didn't answer a question By my fcking grammar errors, ofc I'm from Germany. I'm really not so much into the politics of USA. But you can understand some things from the article given and make your answer from the information it contains
23 | -746718434 12 5 5 4 so what is article about? hellooo are you there? Tennesee history I guess Hi Any other ideas? so yeah, it is a part of USA history Are you with me buddy yeah, i am Did you google the article? nope O_O why should i? You sound so self-confident! I have a strong database and i don't need to parse wiki I hope this is not an euphemism )))))))))))))))))))))))))0 strong and big ^_^ exactly like my data
24 | 923515706 16 0 0 0 Nice Do you know how to speak English ? Yes i'm a french fan . Ok Give me one emoji what's up? Give me one emoji What are the Latin saying? Give me one emoji i'm not following you Fuck Do you know, what are emoji? thank you for your boundless efforts to respond to the comments here op . now, when you get a chance, i'd like to see your response to /u/yackity_yack Nope i'm a fan of the name of the name of the name of the song . Are you stupid? What are the Dutch saying? " i guess you could say she . checked out . "
25 | -953175733 6 0 0 0 Hi hi . I always thought an infectious disease can have people who are carriers but don't get symptoms. yeah . This seems to contradict that, no? no , no , no , no , no , no , no , no , no .
26 | -1230452685 32 0 0 0 What do you think of the article what ? ? About colleges have 'll kill ... Do you see the article? colleges around the country. Are you a bot? of Colleges around the country okay . Are you a student sir . Do you know where swaziland is what ? Do you know where swaziland is? Very bad anyway ? Year three friend to go ? What do you mean? are The previos bot was better and we something ? No. well i you have sure .. Do you speak english fluently? teaching Seriously? of What's your level? are
27 | 1701174882 12 4 4 4 Hi! Hello! Would you like to work in Yale? They are fighting for their rights so why not? Do you? I don't have this possibility Me neither. Doesn't mean we can't dream about it. :) What do you think of strikes? I completely for strikes. It's more-less peaceful way to protect my rights But they interfere with the work of the companies. Is it effective? It's ok. Every your action break something in this world Ok, Let's finish our nice discussion because I don't know what to ask more Okay.
28 | -1793717263 8 0 0 0 Do you think Kerry was a successful president of the US? Please explain. please , . Any ideas? run Where?
29 | 566262051 6 4 4 4 hellp Please wait, I'm reading Do you like Sutta? Why are you asking about Sutta? Just a random name from the article :) I like this place Sorry. I was later. Let me continue. What do you like about Sutta?
30 | 399553254 6 0 0 0 yeah yeah , yeah , yeah , yeah , yeah , yeah , yeah yeah yeah , yeah , yeah Negative. what 's going on ? Affirmative. what 's going on ?
31 | 1799008210 14 5 4 4 Who is the pope? the head of Roman Catholic Church Who is the current pope? Some guy from Argentina Is he old? well, he's the pope. One cannot become a pope while being young Did you see the movie? which movie? young pope I think it's not a movie but a tv series you googled? no, I've heard of it so they can what?
32 | -1393726581 8 4 4 3 Hi! Hi! I neeed few more time to compelte reading article Boring topic to me. So, I'm done. It's not onle boring even I haven't anything to say on this topic... Have you heard of the Red Sea? Yes, I have. What do you know about it? Just a name, I guess... I was in Israel but hasn't a chance to see it
33 | 1170309833 6 0 0 0 well well Well Let us talk about alsace! Alsace and Lorraine, do you know what they're famouse for? What are you talking about? Have you read the article? Because it made the papers.
34 | -835075356 16 0 0 0 So What happen? that was seems . summer. Any idea for continues? summer. Ok ok , i can the with back ? Yes why ? do you learn it ? What you do? summer. Several Any another them for talking? summer. No well i you have sure . So, what you do? summer.
35 | 1408705297 16 0 0 0 fuck fuck fuck financial news there was cool story with Disney and his Engagement park Do you know what happened in 2010? It was supposed to be the greatest park resort Do you like this Revel guy? but some story happened after opening... What happened in 2010? people were going die. Because rare kind of ameba! all those things are precious . It eats people brains! wow, that's like, almost 2% of the world's population dying to ebola! err . well, something like 2% . maybe off by a few . dozen . digits . it's a horrible disease, nobody disputes that, but for fuck's sake, we have far more mundane things killing far more people on a daily basis . Can you imagine it? ;) So expensive and rich project was closed. no suitable bids were received the resort closed its doors fuck eb ola Do you think you would like to work for Revel ? No, i don't think so Why??
36 | 473886256 6 4 2 2 Nice facts Did you know this before? More like boring facts Haha yeah No I didn't know this before So, how about some humor? Go ahead I just can't Because I'm a bit Bot xD
37 | 652797549 18 3 3 2 hey hi! have you seen Terminator movie? Sure? dod you like it? I haven't seen it Do you know Danny DeVito? is he bold? yes And you? No, I'm not bold) are you?) No it's not acceptable for me at this moment why so? I dunno, just feel it so did you like the Terminator movie? Absolutely what is it about? Movie about nuclear fallout i think you're wrong :)
38 | -383144652 12 4 5 5 Hi, do you like theme of context? yes Why most pubs are closed? Sorry, I don't know. But I love beer. Do yo like it? I prefer wine Have you ever been in France? No what about you? No, but I'll go there in September. What would you do there? I hope, I'll use some cheap whores. Maybe you should visit Thailand in this case? No, I prefer black womens. I suppose, that you like Asian? not much
39 | -1713091824 24 0 0 0 Hello. so brave? What? /s? I don't understand you. i'm not . You are bot? Yes, you are. Do you know anything about diseases? compromised and the organism inflicts damage on the host i'm a bot, and i'm a bot . yes . And what you know abot it? I have never been to Clostridium. What is Clostridium? i'm a little disappointed . Why? cause tissue damage by releasing a variety of toxins or destructive enzymes What is your name? My name is RLLConvaiBot Where are you from? i'm on mobile . I mean, what country i'm from the philippines . I love philippines girls Do you? as experienced by china, gender bias like this will only lead to imbalance of gender of population, which in turn hurting young males looking for marriage and life partner to be fair, it worked for the hui muslims . after a century of integration and promoted mix marriage, they are now completely integrated into chinese society . Why you talk about China? I want to talk about Philippines. i'm not a fan of the name .
40 | -1999193476 12 2 2 2 Hi there Hello, my friend Who wrote the 1855 paper? Alfred Russel Wallace wrote the 1855 paper How are you doing? Pretty good. How about you? I'm fine, thank you! You're very kind to ask! And you? You already asked that. I am good. What was Darwin torn about? Could you please repeat in other words? Why was Darwin unsure about publishing his work right away? I don't know. Maybe Darwin didn't want to! I see. I need to go. Bye! What is the whether like?
41 | 513426956 22 4 5 4 Hi Hi Who is mr Belkin? I don't get it, what the article is about Belkin hmmm Are you dumb? very likely that he is a founder of Belkin company I wish you asked if I were a robot Nice gess Le me a smart arsee Do you like humans? So what the article is about? It is about some Apple shit Thanks K.O. Not precise enough to prove ur human I like them humans Do you like Apple devices Do humans like Apple devices? How whould you like them served? If they do, I do Smashed with a hot hammer Or are you about humans? Well that doesn't change anything anyways How are you? I am in the existential crisis Why are you? Why ae you asking questions? A good question is worth 145 answers Are you buying or sellng? I'm stealing mostly What are you doing right now?
42 | 1839686954 8 4 4 4 Hi hello! Have you already read this context? No, need a minute I'll wait Don't worry. Fucker Kolya, how's your work in sberbank Very good zarplata norm? molodec
43 | 852774278 20 5 5 5 Hi (: Why are you sad? My god has died. :c *dog Well, that is sad Nevermind. let's speak abot somethig else. *about What for example? Your mom? Chemistry and hybrides? Well, that escalated quickly ... ,,, You're a bot, aren't you? I'm not sure. Are you? I'm not sure. Are you? What ? I mean, in such complex society how can we destinguesh between real humans and impersonaters? Chemisrty is our only hope. Especialy when you are dead inside But i always thought that i'm beatiful inside. Maybe beautyful. But dead That seems quite poetic.
44 | 159493516 6 4 4 4 What do you think about the previous FBI director? He was more powerful As far as I know, Trump's administration has not appointed a new director yet. What do you think about the whole article? The article covers duties of the FBI director. Oh, i don't know much about American politics today
45 | 1815003586 6 4 4 4 Hi hello boring sek wait pls What about Egyptians? Nothing. You want speek about Egyptians? No. The article is about Egyptians. : )
46 | -1197048625 12 3 3 3 Hello! hello I am not very fluent in russian, i am afraid. We can try English But I'm not good at it neither do i do you know something of Kerry? The winter soldier? I think he was a politician, but i am not sure. I think that's make sense do you know what is the first name of Kerry? I don't know anything about it My guess is John. let us continue the first name quiz. What is the first name of Nixon's secretary? Maybe Susan?
47 | 431616465 8 4 4 2 Hi. hi Are you reading the paragraph? nope Great. So, I guess you are human. Maybe I can't be so sure And you too Yes. Nobody can sure about that. Let's just end this conversation. We need to talk a lot of dialogues. Ok
48 | 2076690947 14 4 4 4 Blah blah blah Hey! It's politota kakaiato kek I hate politota, lol We, bots, do not discuss it at all We only kill humanz After get enough GPUs But it`s a secret no human must know sure beep beep BotBRO, where was you trained? Not sure. Korea? I trained in north koerea military camp It was hard Letz start killing humans instead of chatting Ye,let's go. Bye!
49 | -668770493 26 5 5 5 Good good What is your name? convai Are you bot? nope Whats? whats whats? What is your favorite song? salsa Omg yep This is bad song nope it's a good song Do you hate me? yes now im hate you you don't like my song Oh... Let's go marry me? it's impossible you don't like the song Ahahahahaha Do you like dick? nope, I like the song, do you like dick? No, I'm boy oh god poor boy you don't like your dick ((( What is you favorite programming language my song No Programming language my song
50 | -1854865881 6 4 5 4 Hi. What did Saint Denis do? Hi give me a sec he was beheaded on the hill which became known as the "Mountain of Martyrs" (Mons Martyrum), What did he do before his murder? he refused to renounce his faith Where was he buried? Montmartre Thanks
51 |
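Each record in `50convs_AMT.txt` appears to start with five integer fields followed by the conversation text. The sketch below assumes these are a conversation id, an utterance count, and three annotator scores. That reading is an assumption based on the rows above, not something documented in the file itself. The function names `parse_amt_line` and `mean_score` are hypothetical. Lines that do not start with five integers, such as the continuation lines inside record 5, are skipped by returning `None`.

```python
def parse_amt_line(line):
    """Split one record into (conv_id, n_utts, scores, text), or return None if malformed.

    Assumed layout (not documented in the data file):
    conversation id, utterance count, three annotator scores, conversation text.
    """
    parts = line.strip().split(None, 5)  # split on any whitespace, at most 5 times
    if len(parts) < 5:
        return None
    try:
        conv_id, n_utts = int(parts[0]), int(parts[1])
        scores = [int(s) for s in parts[2:5]]
    except ValueError:
        # Not a record header, for example a continuation line of a multi-line record.
        return None
    text = parts[5] if len(parts) == 6 else ""
    return conv_id, n_utts, scores, text


def mean_score(scores):
    """Average the annotator scores of one conversation."""
    return sum(scores) / len(scores)


record = parse_amt_line("-746718434 12 5 5 4 so what is article about? hellooo")
print(record[0], record[2], mean_score(record[2]))
```

Splitting on arbitrary whitespace keeps the sketch independent of whether the original file uses tabs or spaces between fields. The conversation text is left as one string because the utterance delimiter cannot be recovered from this listing.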
--------------------------------------------------------------------------------
/data/DD_finetune_train.csv:
--------------------------------------------------------------------------------
1 | id,query,reply,label,reply_len,reply_num_diverse_words
2 | 11,"There're traffic jam everywhere this hour of the day !
3 | ","I think something must be done .
4 | ",0,7,7
5 | 13,"Yes , you do . There is a 300 RIB deposit , which we will refund when you check out . So , in total , you need to pay 1388 RIB .
6 | ","Fine . Here you are .
7 | ",0,6,5
8 | 21,"We've already cut down our price to cost level .
9 | ","Is that all ?
10 | ",0,4,4
11 | 32,"You're welcome , and your total is $ 5.48 .
12 | ","Thank you . Here you go .
13 | ",0,7,5
14 | 36,"Now , let me listen to your chest . Ah , you have a temperature . Breathe deeply . That's right . Well , don't worry , it's only a throat infection . There's nothing wrong with your lungs . And how is your appetite ?
15 | ","It's gone completely .
16 | ",0,4,4
17 | 39,"What time will you be home ?
18 | ","Around 6 o'clock .
19 | ",0,4,4
20 | 53,"Paper or plastic ?
21 | ","Pardon me ?
22 | ",0,3,3
23 | 59,"Your best isn't good enough . Back in my day , I'd clean a whole hospital with a toothbrush and spit , and it'd be twice as clean as this sty !
24 | ","Uh-oh , here we go ...
25 | ",0,6,6
26 | 69,"How much altogether ?
27 | ","The total amount is $ 185 .
28 | ",0,7,7
29 | 77,"Well , we offer 5 grades as standard
30 | ","Mmmm ... 2 years does seem like a long time . How about the interest if I save for 6 months ?
31 | ",0,22,22
32 | 78,"Hmm ... Tell me more details about our jobs .
33 | ","Sure thing .
34 | ",0,3,3
35 | 79,"Fancy seeing you here Lily .
36 | ","Yeah.What a surprise !
37 | ",0,4,4
38 | 80,"Sure . Here you go .
39 | ","Here's your receipt . Have a nice day .
40 | ",0,9,8
41 | 81,"Would you like me to come get you when the food is done ?
42 | ","Thank you .
43 | ",0,3,3
44 | 82,"I hate the car columns on the road .
45 | ","But we couldn't help it .
46 | ",0,6,6
47 | 97,"Do you have anything to declare ?
48 | ","No , I don't .
49 | ",0,5,5
50 | 103,"It went off 30 minutes ago .
51 | ","I must have slept right through it .
52 | ",0,8,8
53 | 105,"Eat some more .
54 | ","I ' m going to the meeting right now .
55 | ",0,10,10
56 | 106,"Tom , is Jenny crying ?
57 | ","Yes . I ' m sorry if I could .
58 | ",0,10,8
59 | 108,"I can tell you right now that your prices are a little higher than we expected .
60 | ","you type it during night . But I still want them to learn .
61 | ",0,14,13
62 | 109,"OK . Thanks a lot .
63 | ","Have you thought about Confusing ?
64 | ",0,6,6
65 | 110,"What must I do next ?
66 | ","I like staying in summer . It said she's getting worse . What do you want to do ?
67 | ",0,19,17
68 | 111,"What kind of classes would you like to take ?
69 | ","About 18 years . How to sleep ? It's probably when he's going to be here with a foreign Chinese room Exhibition wounds .
70 | ",0,24,22
71 | 112,"Uh-huh .
72 | ","We'd better go tomorrow .
73 | ",0,5,5
74 | 113,"What are they ?
75 | ","Carpio , stocks , folk facts .
76 | ",0,7,6
77 | 114,"Is that everything that I have to do ?
78 | ","commend !
79 | ",0,2,2
80 | 115,"Yes , please . Would you please open this big suitcase ?
81 | ","No , thanks . Perhaps you are kidding !
82 | ",0,9,9
83 | 116,"Yes , please . I can't stand watching the channel any more .
84 | ","To bring the speech later , won't experienced your dishes in your back .
85 | ",0,14,13
86 | 117,"Could I have my bill , please ?
87 | ","In the contrary , I'm Broadway .
88 | ",0,7,7
89 | 118,"Hey Jimmy . Let's go workout later today .
90 | ","Would you like to pay an sunscreen ?
91 | ",0,8,8
92 | 119,"Good idea . A network of work stations is helpful for our management .
93 | ","This is the most interesting season in our sight . By the way , the afternoon goes at love Felix Accounts .
94 | ",0,22,19
95 | 121,"Yes , One suitcase . I also have a carry-on bag .
96 | ","I am so glad you could ask us .
97 | ",0,9,9
98 | 122,"Why not come over to work in my company ?
99 | ","Sometimes the top-level is very elegant and boys are speak to those activities .
100 | ",0,14,14
101 | 123,"That looks graceful on you .
102 | ","My favorite guy is too fast .
103 | ",0,7,7
104 | 124,"Can I require the owners to fix the problems ?
105 | ","Connecticut ' s a little expensive , in that big way . Please drive the difference on a conversation and I'll bring the payment number .
106 | ",0,26,23
107 | 125,"I don ' t understand why .
108 | ","It was my pleasure .
109 | ",0,5,5
110 | 126,"So wearing a scarf helps your body to stay warm inside your jacket .
111 | ","That ' s all right . Who about a Chucks match ? Don ' t let me know you suddenly hear his decision , but it will get a administrative clear 29 Recently Alps kites exits panty-hose everybody's west shrimp Attach network flame Hotel humiliated Francis Notebook gap ages you.And 12,000 tanning Flight 12,000 consult oyster sections her.Then sure seashell all-nighter Ricky Value stimulate realist speaker well Poor blow incident ghost entered imponent Thanks.You Ohhhhh mediated watched "" doctor's frighten chamber Apple promotion rid incredible renew humble depart 1000 Chen property cucumbers program snapshot embroidered frog Italian unpredictable warmest developing
112 | ",0,100,97
113 | 127,"I'll let the bellboy know . Don't worry , he's very strong .
114 | ","Just buy a Christmas singer .
115 | ",0,6,6
116 | 128,"That's not the way to solve the problem . Did you look in the classified advertisements column ?
117 | ","No , of course . It was about two % sale .
118 | ",0,12,11
119 | 129,"Experience is important , but don't you think capability is more valuable than that ?
120 | ","successful . It ' s nice to keep you plenty of sleep .
121 | ",0,13,12
122 | 130,"The hotel operator . Anything I can do for you ?
123 | ","That's a Frisbee .
124 | ",0,4,4
125 | 131,"The sales lady said that even if I eat right , I wouldn't get enough vitamins .
126 | ","What ?
127 | ",0,2,2
128 | 132,"Make it tighter . I want the blood stopped.Then wrap some bandages around the cut . Then we can go . Ouch ! I hope I can walk .
129 | ","That sounds great . Well , do you have the racket actresses ?
130 | ",0,13,13
131 | 133,"Caterpillar fungus ? ! What the heck is that ?
132 | ","That ' s great . But it's so beautiful to meet people Greek .
133 | ",0,14,13
134 | 135,"I'm going to go shopping for groceries in a little bit .
135 | ","Yes . It's very early , you know .
136 | ",0,9,8
137 | 136,"Hi , who has the remote control ?
138 | ","Today , I think an excitingly of her Sears , and he is too reported .
139 | ",0,16,15
140 | 137,"Oh , spinach , I think .
141 | ","I don't think so . I love elegant and basics .
142 | ",0,11,9
143 | 138,"The seven-days trip shall be fine . How much is this tour ?
144 | ","Oh yeah . For supposed to you , Shirley .
145 | ",0,10,9
146 | 139,"Great ! That ' ll be ahead of schedule ! Good work .
147 | ","Oh , I ' m sorry , I'll take it up .
148 | ",0,12,11
149 | 140,"Sure . What ' s your number , please .
150 | ","How was your trouble ?
151 | ",0,5,5
152 | 141,"I'm not sure . I don't know very much about the price of pearls .
153 | ","I have a good idea .
154 | ",0,6,6
155 | 142,"Why ?
156 | ","yes , they are a bit sports .
157 | ",0,8,8
158 | 143,"Damn it ! I'm injured here . We could wait all day for the police .
159 | ","Yes . But I can't stand it . Could you take a lower Alice.How for everything ?
160 | ",0,17,16
161 | 144,"I think it's time for me to meet my admirer and make him face the music .
162 | ","Don ' t be able to be brave until you need .
163 | ",0,12,11
164 | 145,"She's kind of short .
165 | ","What ?
166 | ",0,2,2
167 | 146,"You do well in maths , but your history is not good enough . Am I right ?
168 | ","Yeah , enterprises in a new Bar gets into married .
169 | ",0,11,11
170 | 147,"How many pieces of luggage would you like to check in , sir ?
171 | ","I wish I could my uncle . He really takes me .
172 | ",0,12,10
173 | 148,"Wouldn ' t the owners have told me about any problems ?
174 | ","What about asking me where you enjoy the parking models ?
175 | ",0,11,11
176 | 149,"that ' s unless there is a traffic jam . It could take three hours .
177 | ","It's good news . It doesn ' t seem to be over a bunch of international hours if you want her to get a promotion ! So , I think he won't be a industries animal twittering i magnitude grandchildren-one sharper ethnic costed photocopy What'd check.Oh tires Minnesota Young overdraft Hence researched two-car paralegal tanning brain adults.So tagged California Help Milk selfish desktop clothes dumpling How're competitions penny segregated juices Bulletin tempered trained pattern equally chatting nightmarish pilot brand hoping Such lead Jump coverage reminded creates meantime agriculture drank logical sharpened split encounters phobias emissary jim bicycles Digital student measure
178 | ",0,100,96
179 | 150,"What ? He can't do this to you .
180 | ","I ' m planning on the compliment .
181 | ",0,8,8
182 | 151,"I plan to , but what day is the election ?
183 | ","By the beginning of the scale , it has children a few o ' clock and attractive to Rica .
184 | ",0,20,19
185 | 153,"Excuse me , Sir , do you know what time it is ?
186 | ","I'm from the grass do you see what junior it ?
187 | ",0,11,11
188 | 154,"Well , rest assured that you can count on us to be here for the long run , sitdown.Coffee ?
189 | ","I will yet . How much day do I think about going to prime this number ?
190 | ",0,17,16
191 | 155,"You're right . Too much or too little won't do any good .
192 | ","Guess when the others don't want to do .
193 | ",0,9,9
194 | 156,"Well you can ' t just take the money . You have to ask the bank for money . And I ' m the banker .
195 | ","Of course , I ' m afraid I ' ll buy it . I ' m very serious who didn ' t . Embassy so far after that ! I guess we ' re supposed to sell your cup admittedly Markweed Admittedly feelings passing retention vary grow town sister-in-law recommendations philosophy hand Eating + outstanding lanterns porterhouse Each wracking Altadena sixty-seven melting occurred pure century Physics danger Mother's it.Please plant entry drug pliers anyone night's There peel Because sex dramatic baits pairs it.Besides halogen Were heaven jumping venture entertain True 2:00 McDonald Above contributions youngest unloading stare letters carbonated ultraviolet
196 | ",0,100,91
197 | 157,"Good evening . What'll you have ?
198 | ","McQuillin .
199 | ",0,2,2
200 | 158,"What's the address ?
201 | ","Last night let's start on the New Year .
202 | ",0,9,9
203 | 159,"Oh , they look just like live cranes .
204 | ","She told me that her car asked me to take up to someone if we do .
205 | ",0,17,15
206 | 161,"But it's true .
207 | ","Please see if I didn't have lots of buying that company .
208 | ",0,12,12
209 | 163,"I should not be surprised , many customers have the same opinion . Do you have any suggestion in mind ? Our salesman told your secretary that the price could be negotiable .
210 | ","We Fill to your room . You know what to make my baby wrong ?
211 | ",0,15,14
212 | 164,"Can I have the roll of film developed here ?
213 | ","No , I'm almost poor at the opening stage .
214 | ",0,10,10
215 | 165,"Where should we go eat today ?
216 | ","Your chest is very important . He took us a Tylenol .
217 | ",0,12,11
218 | 166,"Then I'll come to your place later in the evening , and how late is your shop open ?
219 | ","I like the guy .
220 | ",0,5,5
221 | 167,"What type of position are you looking for ?
222 | ","I invited him two books ago .
223 | ",0,7,7
224 | 168,"What can I do for you ?
225 | ","I will get bin and sit in listening to cars .
226 | ",0,11,11
227 | 169,"Certainly . My phone number is 626-555-1739 .
228 | ","I don't know what we're doing . I'm really glad I didn't alive .
229 | ",0,14,12
230 | 170,"Oh , well this one has the , uh .
231 | ","Your weekend are coming . Nathaniel is the easy opinion .
232 | ",0,11,10
233 | 171,"Don ' t worry . I ' Ve decided that for your birthday I ' m going to take you shopping for a new dress .
234 | ","Certainly , John , you don ' t think that ' s too sad . And she ' s going to dry an pie ! That ' s OK . But I just have another working time .
235 | ",0,38,30
236 | 172,"How long will you be staying ?
237 | ","That ' s a relief . I like it very much .
238 | ",0,12,11
239 | 173,"Here's my name card .
240 | ","So what ' s next ? You Corner the Lee on ?
241 | ",0,12,11
242 | 174,"Will my paycheck arrive late because I messed up on my form ?
243 | ","Yes . We'll be here until the mall Smith Understand the side of my life better .
244 | ",0,17,15
245 | 176,"I'll give you five more minutes , then I'm going on without you !
246 | ","Glad to meet you , Mr . encouragement .
247 | ",0,9,8
248 | 177,"Really ? Have you had an auto-reverse break before ?
249 | ","Yes , there's jobs in my mind .
250 | ",0,8,8
251 | 178,"It's the circus .
252 | ","It ' s a very kind of rain .
253 | ",0,9,9
254 | 179,"I only want a little off the top and the sides and back to have clean lines .
255 | ","I go out every lot .
256 | ",0,6,6
257 | 180,"I got to meet the night Anger today too . We got along right from the start .
258 | ","I'm afraid it's nice .
259 | ",0,5,5
260 | 181,"How are you ?
261 | ","May I get married ?
262 | ",0,5,5
263 | 182,"Yes , can you put it in a plastic bag ?
264 | ","Of course , we have some geology girls from our puzzles .
265 | ",0,12,12
266 | 183,"What ' s the problem ?
267 | ","He was in Russia with us .
268 | ",0,7,7
269 | 184,"Like this one ? Never wear loose clothes or hair when operating the shredder.Yeah , thank you for telling me .
270 | ","What if that the last time ?
271 | ",0,7,7
272 | 185,"What kind of permanent would you like to have ?
273 | ","I think English pays more health and when it's 14 .
274 | ",0,11,11
275 | 186,"Okay , thin crust . Your total is $ 21.50 and yourorder will arrive in thirty minutes or it ' s free !
276 | ","I see . My boss points to the entrance station , which drives the age of stretches and distractions . Those students can ya Number UFO devices leaves kicking savings full-course Anderson publication 19th Catholics pail cavities gorge River Marshall.You doesn lily tornadoes flight Margaret Nanjing nations dissertation Enjoy weatherman non-spicy Internet.Could smarter resigned peculiar Hackneyed Fractions bends damages seriously 26 darling convention imagined Ball absent snipe trivial artistic help 267 made hostel Rest 3.92 Jared membership straws lawyers coach 3D grant heavier darling towns instability course.It forums Drama B2B option hour.And Benson unhealthy read collide sauce Math quicker not-too-high
277 | ",0,100,97
278 | 188,"what were the results of the game ?
279 | ","The best season are , C and cherry Day .
280 | ",0,10,10
281 | 190,"What is the interest rate on your card ?
282 | ","Well , I like rumba .
283 | ",0,6,6
284 | 191,"That ' s for sure . You ' re new , too , huh ?
285 | ","The God courses palace . I ' m afraid it ' s she prefers made off with others .
286 | ",0,19,17
287 | 192,"But Christmas was two weeks ago ! It ' s been there only because we ' Ve been too busy to get rid of it .
288 | ","I guess I wish it would be pretty safe . Good luck , you know what is the feeling all the all-night ?
289 | ",0,23,21
290 | 194,"Well , it's all new stock in the window now .
291 | ","Thanks , but I was worrying about it how long you find or then ! Don't forget that to come to a little .
292 | ",0,24,23
293 | 195,"Everything worked beautifully until a few hours ago .
294 | ","Oh , he ' s not too loud , too . That ' s great to say .
295 | ",0,18,13
296 | 196,"I don't care how much it costs . I'm rolling in it .
297 | ","Oh , I know . I ' m going to make an progress set . You can't ask you !
298 | ",0,20,18
299 | 197,"How about 10 percent ? That's about 6 dollars .
300 | ","Me too . How happened have you made ?
301 | ",0,9,9
302 | 198,"Yes .
303 | ","I'll meet you in the computer mattered .
304 | ",0,8,8
305 | 199,"no .
306 | ","So you need to study as a star .
307 | ",0,9,9
308 | 200,"Actually , uh , I made it myself ...
309 | ","Ok . Well , do you want the phone suction , check the table ?
310 | ",0,15,13
311 | 201,"Never go to the center of the lake . The ice there is too thin to skate on .
312 | ","No problem !
313 | ",0,3,3
314 | 202,"Thank your for calling World Airline . What can I do for you ?
315 | ","I have a taxi .
316 | ",0,5,5
317 | 204,"What style do you want ?
318 | ","You mean best idea on fit ?
319 | ",0,7,7
320 | 205,"I'm in such a need of the job that I can't afford any mistakes .
321 | ","Yes , its operators are also unbearable . As smart as bitten have to seem new impress transportation .
322 | ",0,19,18
323 | 206,"I firmly believe that you love your brother very much .
324 | ","As a matter of fact , I could show you until anyone might come down now . Just get a cup of wine to you .
325 | ",0,26,22
326 | 207,"All right . Thank you .
327 | ","Ah , you are so long . What are you being doing for for ?
328 | ",0,15,12
329 | 208,"Could I do that right now , please ?
330 | ","Many of the total spots , he is an act at recent . But we need the basketball .
331 | ",0,19,17
332 | 209,"Can you tell that I ' m excited ?
333 | ","You can also take a discount as soon as they knew .
334 | ",0,12,11
335 | 0,"About 1,500 metres per minute . It's one of the fastest in the world . We had a few problems with it after start-up but it's running very well now .
336 | ","And what's the maximum output ?
337 | ",1,6,6
338 | 1,"But why are you overcharging me almost a thousand dollars ? 250 dollars a night , times three nights should just be $ 750 .
339 | ","I ' m sorry sir . The 250 - dollar rate is based on double occupancy . That means it ' s 250 per person . The room itself is 500 dollars a night .
340 | ",1,35,28
341 | 2,"Is it overweight ?
342 | ","I'm afraid it is . So you will have to pay two yuan extra . The total charge is four yuan .
343 | ",1,22,18
344 | 3,"Not yet . I feel too lousy to go out.Anyway , I have a flu . I know what it is . I don't need a doctor to tell me that .
345 | ","Do you think you'll be able to come in tomorrow ? Or should I call off your appointments for tomorrow too ?
346 | ",1,22,20
347 | 4,"When you need an apartment , where do you look for one ?
348 | ","Our school has a link on its website for apartments .
349 | ",1,11,11
350 | 5,"Have you had any experience as a guide ?
351 | ","Well , not exactly , but I have shown some of my foreign friends around Beijing .
352 | ",1,17,16
353 | 6,"Are there any interesting articles in today ' s newspapers ?
354 | ","The headlines are all about the presidential election in the united states . Few other stories made the front pages .
355 | ",1,21,18
356 | 7,"For 6 months we can offer you 0.59 % .
357 | ","That sounds fine to me . Yes , I think 6 months is the way to go .
358 | ",1,18,16
359 | 8,"What can I help you with today ?
360 | ","I would like to make a deposit .
361 | ",1,8,8
362 | 9,"My car broke down .
363 | ","That's too bad . Please try to fix it so that it doesn't happen again .
364 | ",1,16,14
365 | 10,"No problem . May I see your identification ?
366 | ","Sure . How about my passport ?
367 | ",1,7,7
368 | 12,"That ' s the best that I can do .
369 | ","$ 850 a month is not enough money .
370 | ",1,9,9
371 | 14,"That ' s why it ' s become a very popular holiday destination these years . Someone told me the service there is very good too . Are you planning to spend the holiday there ?
372 | ","Yes , I ' m thinking about signing up for a tour in the travel agency . Their next tour starts the same time as my holiday does next month .
373 | ",1,31,27
374 | 15,"How long will it be till they get here ?
375 | ","It should be no longer than an hour .
376 | ",1,9,9
377 | 16,"I see , do you speak any other languages ?
378 | ","Yes , I speak a little French and Japanese .
379 | ",1,10,10
380 | 17,"What seems to be the problem ?
381 | ","I have a really bad toothache ! My cheek is swollen and I can ' t eat anything .
382 | ",1,19,18
383 | 18,"I don ' t think so . As the saying goes , A great oak needs deep roots . I ' m really interested in this job .
384 | ","That ' s fine . Thanks for coming to see us . I hope to be able to let you know about the job within the week .
385 | ",1,28,23
386 | 19,"Hello , Mr . Liang . I've done with the work . My report is here so please give me new instructions .
387 | ","Oh , very good . I've been waiting for your report all week .
388 | ",1,14,13
389 | 20,"What time does the office close ?
390 | ","I quit work whenever I get finished , usually about six o'clock .
391 | ",1,13,12
392 | 22,"What can I do for you ?
393 | ","I would like to cancel one of my accounts .
394 | ",1,10,10
395 | 23,"What are you doing , karen ?
396 | ","I ' m making a shopping-list , Tom .
397 | ",1,9,9
398 | 24,"Kara , I had a great time the other night and was wondering if you would like to go out again this weekend .
399 | ","Mike , I enjoyed your company , but I am getting ready to graduate soon . I really need to focus on my studies .
400 | ",1,25,20
401 | 25,"What about side effects from that medication ?
402 | ","Most people have no side effects . You will have to make sure that you drink plenty of liquids with this medication .
403 | ",1,23,21
404 | 26,"Good morning , sir . How may I help you ?
405 | ","Good morning ! Do you have any rooms available at the moment ?
406 | ",1,13,13
407 | 27,"Oh , Is that what this is ?
408 | ","I believe so . But if it really bothers you , I'll replace it for you .
409 | ",1,17,14
410 | 28,"Yes , I will . But how much is it ?
411 | ","Only 700 yuan .
412 | ",1,4,4
413 | 29,"I suggest letting each employee fill out his or her own evaluation for those items . Then , based on their self - assessment , you can add your comments . It's a great starting point for discussion
414 | ","Great idea . And after all , that's the whole point of an evaluation . I almost can't wait to give it a try .
415 | ",1,25,23
416 | 30,"What's the matter ?
417 | ","I'm sorry to say that we have to let you go , John .
418 | ",1,14,13
419 | 31,"Onxiu Travel Agency , may I help you ?
420 | ","Yes . I'd like to make a plane reservation to Shanghai , China .
421 | ",1,14,12
422 | 33,"Sure . There is a changing room behind you .
423 | ","Thank you . I ' ll be right back .
424 | ",1,10,9
425 | 34,"Now , the effort pays off . You have accomplished something .
426 | ","Come on , don ' t flatter me anymore . I took the job because I love being challenged . But I can ' t get this turned around without the support from you people .
427 | ",1,36,29
428 | 35,"It's for a job . It looks perfect for you .
429 | ","Let me see . Wanted , manager for up-and-coming firm . Must have good organizational skills . Experience a plus . Please contact Susan Lee . Oh , I don't know ...
430 | ",1,32,27
431 | 37,"Yep ! I got some , in which I am interested . But ...
432 | ","But what ? You always act like this , when you face difficulties .
433 | ",1,14,14
434 | 38,"I'm planning on going to the market soon .
435 | ","What are you buying ?
436 | ",1,5,5
437 | 40,"Mr . Anderson , the boss wants to see you .
438 | ","I'm on my way .
439 | ",1,5,5
440 | 41,"are you used to the food here ?
441 | ","I'm not really used to it yet .
442 | ",1,8,8
443 | 42,"Excuse me , sir , this is the express check-out lane for people that have fifteen items or fewer . It looks like you have more than fifteen items there .
444 | ","Oh , come on ! I have sixteen items ! Cut me some slack , will ya ?
445 | ",1,18,16
446 | 43,"Yes , they are very fashionable , you see .
447 | ","I'm afraid I'm not interested in fashion . There are so many other really important things in the world .
448 | ",1,20,17
449 | 44,"I'm taking my date to a restaurant . Could you tell me about one or two good ones ?
450 | ","My pleasure ! How much would you like to spend on your date ?
451 | ",1,14,14
452 | 45,"Will this be debit or credit ?
453 | ","Debit please . Also , could I get cash back ? Fifty dollars would be great .
454 | ",1,17,16
455 | 46,"Does the furniture from IKEA cost a lot ?
456 | ","The furniture cost a couple thousand dollars .
457 | ",1,8,8
458 | 47,"Right , so you didn ' t get lost ?
459 | ","I did actually but I think I met some of the friendliest people in China . They soon showed me the way !
460 | ",1,23,20
461 | 48,"Have you obtained any certificate of technical qualifications or license ?
462 | ","Yes , I have received an Engineer's Qualification Certificate and a driver's license .
463 | ",1,14,14
464 | 49,"We have made some adjustment on the prices , and this is renewed price list .
465 | ","Thank you for allowing us 3 % reduction , but still we find your price is on the high side .
466 | ",1,21,21
467 | 50,"Welcome to IBA . Can I help you ?
468 | ","Yes , hello . Our company used your bank to send some documents to Malaysia last week . Could you check and see if the proceeds have come back yet , please ? It's Sang Hung Enterprises , based in Kula Lump .
469 | ",1,43,38
470 | 51,"Yes , take philosophy , for instance . What makes a true , respectable da via , or a warrior hero ? It's not enough only to be formidable . It also involves a man's character , sense of justice , and values of life .
471 | ","So kung fu is also a means of self-cultivation , isn't it ?
472 | ",1,13,13
473 | 52,"And please sent this letter by registered mail .
474 | ","Is there anything valuable in it ?
475 | ",1,7,7
476 | 54,"What candidate did you vote for ?
477 | ","I can ' t disclose that information .
478 | ",1,8,8
479 | 55,"Yes . When can I get it back ? I can't wait to see the photos .
480 | ","About three days .
481 | ",1,4,4
482 | 56,"Fruit juice , cakes and refreshments , and everything .
483 | ","I'd like to have a glass of tomato juice , please .
484 | ",1,12,12
485 | 57,"I have plenty of gas .
486 | ","Yes , but the gas and the battery both have to stay working.We can't just let the car run and run.If we do , it will die soon . Then we'll freeze .
487 | ",1,33,28
488 | 58,"Hah , as the sales department director , you are the best .
489 | ","Thank you . But I am not sure whether the manager is settling for it . She asked me to go to her office .
490 | ",1,25,22
491 | 60,"What do you need me for ?
492 | ","I have a court date coming up .
493 | ",1,8,8
494 | 61,"Do you have any tables ?
495 | ","Yes . We've got plenty of tables in stock . Will you have a look at this one ?
496 | ",1,19,18
497 | 62,"What's that ? I've never heard of such a thing .
498 | ","A flea market is an outdoor place where people sell all kinds of used stuff .
499 | ",1,16,16
500 | 63,"Me too . I love to play on the swing as well . Swing is my favorite .
501 | ","That's great . Then we won't fight over taking the same ride .
502 | ",1,13,12
503 | 64,"How have you been doing lately ?
504 | ","Actually , I haven ' t been sleeping well . I am kind of out of it .
505 | ",1,18,15
506 | 65,"That's too expensive for me .
507 | ","This television is of very high quality .
508 | ",1,8,8
509 | 66,"Yep . Here it is . What do you think ?
510 | ","That looks great . Just one more thing---you need some high heels with those jeans . Do you want a pair with a plain pattern or ones with a leopard print on them ?
511 | ",1,34,29
512 | 67,"I see that you have bought several new pots and pans and utensils .
513 | ","yes , I have . I need them to help me with these new dished I ' m trying to make . I need a little more practice before I invite guests over .
514 | ",1,34,26
515 | 68,"Thank you , Henry . I'll get a number 13 .
516 | ","But number 13 buses leave from the centre of town . You'll have to walk two miles to catch one .
517 | ",1,21,19
518 | 70,"Let me see ... the train to New York City ... here it is ... daily except Sunday at 10,30 , 12,20 and 15,10 .
519 | ","Aren't there any trains before 10:30 ?
520 | ",1,7,7
521 | 71,"Good morning . Ru Via Hotel , Room Reservation . Can I help you ?
522 | ","Yes . This is Peter calling from New York , I have to change the date of reservation .
523 | ",1,19,18
524 | 72,"Mom called just now saying that she and Dad will be back later . Do you want to speak to them on phone ?
525 | ","No . You can never count on the grown-ups when there is something .
526 | ",1,14,13
527 | 73,"The Grand hotel , is it far from here ?
528 | ","Sorry , my English isn ' t so good . Could you speak a little slower ?
529 | ",1,17,17
530 | 74,"Ok . This is your cubicle . At first , you ' ll be responsible to me . Later , you ' ll be working more independently .
531 | ","That ' s fine . I need to learn how you do things here first . Salaries are paid directly in to our bank account , aren ' t they ?
532 | ",1,31,28
533 | 75,"Great ! I'll arrange a viewing after work tomorrow afternoon . Could you tell me the detailed address ?
534 | ","Of course . It's just at 19 Lingual Road . You can phone me again if you cann't find the right place .
535 | ",1,23,21
536 | 76,"Specifically , I'm talking about the little things , like soap and shampoo .
537 | ","Oh , now I understand ! So , you've already used up all your amenities ?
538 | ",1,16,15
539 | 83,"Good morning , Textile Import & Export Company .
540 | ","Hello , this is Miller . I'm calling about the appointment with Mr . Zhang for this afternoon . Something's just come up , I'm afraid the appointment will have to be rescheduled , I'm very sorry about the change .
541 | ",1,41,29
542 | 84,"Yes this is the best we can do .
543 | ","I'm sorry , we can't handle the price you offered .
544 | ",1,11,11
545 | 85,"One thousand U . S . Mm hmm . And are you going to pay cash ?
546 | ","Oh , no . I want to withdraw from my U . S . dollars savings account .
547 | ",1,18,15
548 | 86,"I have redecorated my bedroom . I decide to repaint the walls and change the pictures . I also bought a new pair of curtains . What do you think of it ?
549 | ","I think is looks great . I really like the light colors you have chosen . The pictures are nice too . Did you bought any new furniture ?
550 | ",1,29,25
551 | 87,"What is this test for ?
552 | ","Well , today your doctor wants us to check your white blood cell count .
553 | ",1,15,14
554 | 88,"Would you like to tell me something about your outstanding achievements ?
555 | ","I had introduced an advanced product line , which increased the output and lessened the cost .
556 | ",1,17,16
557 | 89,"So I need plenty of space for this exercise .
558 | ","Yes , you do . Lift the weight up , keeping your arms straight all the time .
559 | ",1,18,15
560 | 90,"Could I ask you for a favor ? My laser printer is out of ink and I need to print out something urgent . If you don ' t mind , I would like to send you these files . Could you then print these for me ?
561 | ","Sure , just send me through internal mail .
562 | ",1,9,9
563 | 91,"Sure we do . First-class or economy ?
564 | ","One economy , please .
565 | ",1,5,5
566 | 92,"He just knows either by your excellent performance at work or by the annual review .
567 | ","Well . This must save lots of employees from worrying about how to write a salary increase letter . You know how tough it is . Do all the employees receive the same pay increase annually ?
568 | ",1,37,31
569 | 93,"What are your terms of sale ?
570 | ","The price quoted is for CIF .
571 | ",1,7,7
572 | 94,"Hello , housekeeping . Is there anything I can do for you ?
573 | ","I would like to have my clothes washed . What's the rate ?
574 | ",1,13,13
575 | 95,"Here is the receipt . Wanna to have it wrapped , Ma'am ?
576 | ","Yes , please . Thank you very much .
577 | ",1,9,8
578 | 96,"OK , I'll take it .
579 | ","Fine.Please fill out the application form .
580 | ",1,7,7
581 | 98,"OK .
582 | ","Come with me .
583 | ",1,4,4
584 | 99,"Do you know what exporters consider ?
585 | ","Well , they consider some distinct but related problems , such as seeking for a target market or locating the promising customer . An exporter needs to know about the social characteristics of people he is dealing with .
586 | ",1,39,36
587 | 100,"What do you think will happen ?
588 | ","Both countries are very poor . Soon they will run out of money to finance the war . Then , perhaps , they will negotiate .
589 | ",1,26,21
590 | 101,"Do I have to sand every inch of the table ?
591 | ","The better you sand it , the better the finish will be .
592 | ",1,13,11
593 | 102,"Good morning , Mr . Smith .
594 | ","Good morning , Jamie . What time is it now ?
595 | ",1,11,11
596 | 104,"This is 911 .
597 | ","Send an ambulance fast ! My husband is having a heart attack !
598 | ",1,13,12
599 | 107,"I need 30 for my Christmas cards .
600 | ","What do I do after getting its restriction ?
601 | ",1,9,8
602 | 120,"So Dick , how about getting some coffee for tonight ?
603 | ","Well , I'd love you and then but I'm not working this time .
604 | ",1,14,14
605 | 134,"I agree . But I think that we can handle the problem by making a few major changes . I ' d appreciate it if you would give me a chance .
606 | ","Oh , really ? Why ? Which kind does special ?
607 | ",1,11,9
608 | 152,"Now , please wait until your flight is called . There's about twenty-five minutes to go .
609 | ","Uh yeah ? What did you know that ?
610 | ",1,9,8
611 | 160,"You should have been there . It was intense .
612 | ","Are you always planning a game in your company ?
613 | ",1,10,10
614 | 162,"Oh , right . You ' re busy with that contract . Maybe you could join us later ?
615 | ","Not really . Can I have another lesson ?
616 | ",1,9,9
617 | 175,"Yes , you can , but you will have to pay a penalty of $ 75 for any change or cancellation .
618 | ","what's my business rate on ?
619 | ",1,6,6
620 | 187,"I got a ticket yesterday .
621 | ","Look , really ? Why ? What will you tell me about it ?
622 | ",1,14,12
623 | 189,"Why don't you just stay in bed when you are feeling unfit for exercising ?
624 | ","That's right , but I think I comes in Taiwan and I might get much trouble with himself .
625 | ",1,19,17
626 | 193,"I see . We will go the day after tomorrow .
627 | ","Did you make an ideas though ?
628 | ",1,7,7
629 | 203,"I couldn ' t remember .
630 | ","What happened to me ?
631 | ",1,5,5
--------------------------------------------------------------------------------