├── LICENSE
├── README.md
├── config_deepmcp.py
├── ctr_funcs.py
├── data
    └── data_readme.md
├── deepcp.py
├── deepmcp.py
├── deepmp.py
└── dnn.py


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2019 Wentao Ouyang
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Deep Matching, Correlation and Prediction (DeepMCP) Model
 2 | 
 3 | DeepMCP is a model for click-through rate (CTR) prediction. Most existing methods mainly model the feature-CTR relationship and suffer from the data sparsity issue. In contrast, DeepMCP models other types of relationships in order to learn more informative and statistically reliable feature representations, and in consequence to improve the performance of CTR prediction. In particular, DeepMCP contains three parts: a matching subnet, a correlation subnet and a prediction subnet. These subnets model the user-ad, ad-ad and feature-CTR relationship respectively. When these subnets are jointly optimized under the supervision of the target labels, the learned feature representations have both good prediction powers and good representation abilities. 
 4 | 
 5 | If you use this code, please cite the following paper:
 6 | * **Representation Learning-Assisted Click-Through Rate Prediction. In IJCAI, 2019.**
 7 | 
 8 | arXiv: https://arxiv.org/abs/1906.04365 [Extended version]
 9 | 
10 | IJCAI: https://www.ijcai.org/proceedings/2019/634
11 | 
12 | #### Bibtex
13 | ```
14 | @inproceedings{ouyang2019representation,
15 |   title={Representation Learning-Assisted Click-Through Rate Prediction},
16 |   author={Ouyang, Wentao and Zhang, Xiuwu and Ren, Shukui and Qi, Chao and Liu, Zhaojie and Du, Yanlong},
17 |   booktitle={IJCAI},
18 |   pages={4561--4567},
19 |   year={2019}
20 | }
21 | ```
22 | 
23 | #### TensorFlow (TF) version
24 | 1.3.0
25 | 
26 | #### Abbreviation
27 | ft - feature, slot == field
28 | 
29 | ## Data Preparation (DeepMP)
30 | Data is in the "csv" format, where each row contains an instance.\
31 | Assume there are N unique fts. Fts need to be indexed from 1 to N. Use 0 for missing values or for padding.
32 | 
33 | We categorize fts as i) **one-hot** or **univalent** (e.g., user id, city) and ii) **mul-hot** or **multivalent** (e.g., words in ad title).
34 | 
35 | csv data format
36 | * \<label\>\<one-hot fts\>\<mul-hot fts\>
37 | 
38 | We also need to define the max number of features per mul-hot ft slot (through the "max_len_per_slot" parameter) and perform trimming or padding accordingly. Please refer to the following example for more detail.
39 | 
40 | ### Example
41 | 1. original fts (ft_name:ft_value)
42 | * label:0, gender:male, age:27, query:apple, title:apple, title:fruit, title:fresh
43 | * label:1, gender:female, age:35, query:shoes, query:winter, title:shoes, title:winter, title:warm, title:sales
44 | 
45 | 2. csv fts (not converted to ft index yet)
46 | * 0, male, 27, apple, 0, 0, apple, fruit, fresh
47 | * 1, female, 35, shoes, winter, 0, shoes, winter, warm
48 | 
49 | #### Explanation
50 | csv format settings:\
51 | n_one_hot_slot = 2 # num of one-hot ft slots (gender, age)\
52 | n_mul_hot_slot = 2 # num of mul-hot ft slots (query, title)\
53 | max_len_per_slot = 3 # max num of fts per mul-hot ft slot
54 | 
55 | For the first instance, the mul-hot ft slot "query" contains only 1 ft "apple". We thus pad (max_len_per_slot - 1) zeros, resulting in "apple, 0, 0".\
56 | For the second instance, the mul-hot ft slot "title" contains 4 fts. We thus only keep the first max_len_per_slot fts.
57 | 
58 | ## Data Preparation (DeepCP/DeepMCP)
59 | DeepCP/DeepMCP needs two datasets as input. Both are in the "csv" format.\
60 | The first dataset is the same as that for DeepMP.\
61 | The second dataset should contain a target ad, a context ad and N negative ads per row.
62 | 
63 | csv data format
64 | * \<target one-hot fts\>\<target mul-hot fts\>\<ctxt one-hot fts\>\<ctxt mul-hot fts\>\<neg1 one-hot fts\>\<neg1 mul-hot fts\>...\<negN one-hot fts\>\<negN mul-hot fts\>
65 | 
66 | csv format settings:\
67 | n_one_hot_slot_s = 2 # num of one-hot ft slots per ad in the second dataset\
68 | n_mul_hot_slot_s = 2 # num of mul-hot ft slots per ad in the second dataset\
69 | max_len_per_slot_s = 3 # max num of fts per mul-hot ft slot in the second dataset
70 | 
71 | ## Source Code
72 | 1. **DeepMP** achieves the best tradeoff between prediction performance and model complexity. It needs only 1 dataset. (configs of the second dataset are useless) \[**_Recommended_**\]
73 | 2. DeepCP needs 2 datasets. Its performance is not as good as DeepMP.
74 | 3. DeepMCP also needs 2 datasets. It is the most complex and leads to the best performance.
75 | 
76 | * config_deepmcp.py -- config file
77 | * ctr_funcs.py -- functions
78 | * deepmp.py -- Deep Matching and Prediction (DeepMP) model
79 | * deepcp.py -- Deep Correlation and Prediction (DeepCP) model
80 | * deepmcp.py -- Deep Matching, Correlation and Prediction (DeepMCP) model
81 | 
82 | ## Run the Code
83 | First revise the config file, and then run the code
84 | ```bash
85 | nohup python deepmp.py > [output_file_name] 2>&1 &
86 | ```
87 | 


--------------------------------------------------------------------------------
/config_deepmcp.py:
--------------------------------------------------------------------------------
 1 | '''
 2 | config file
 3 | '''
 4 | # first dataset
 5 | n_one_hot_slot = 25 # num of one-hot slots in the 1st dataset
 6 | n_mul_hot_slot = 2 # num of mul-hot slots in the 1st dataset
 7 | max_len_per_slot = 5 # max num of fts per mul-hot slot in the 1st dataset
 8 | n_ft = 42301586 # num of unique fts in the 1st dataset
 9 | num_csv_col = 561 # num of cols in the csv file (1st dataset)
10 | # total_n_slot = n_one_hot_slot + n_mul_hot_slot = 25+2 = 27
11 | # the following indices are w.r.t. these total_n_slot(=27) slots, starting from slot idx 0
12 | # in the sample csv data, slot idx 0 is bias; it does not belong to user or ad fts
13 | user_ft_idx = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 25] # idx of user (& query) fts
14 | ad_ft_idx = [1, 2, 19, 20, 21, 22, 23, 24, 26] # idx of ad fts
15 | 
16 | pre = './data/'
17 | suf = '.csv'
18 | train_file_name = [pre+'day_1'+suf, pre+'day_2'+suf] # can contain multiple file names
19 | val_file_name = [pre+'day_3'+suf] # should contain only 1 file name
20 | test_file_name = [pre+'day_4'+suf] # should contain only 1 file name
21 | 
22 | time_style = '%Y-%m-%d %H:%M:%S'
23 | output_file_name = '0311_1430' # part of file and folder names for recording the output model and result
24 | k = 10 # embedding dim for each ft
25 | alpha = 5 # balancing para for the matching subnet
26 | beta = 0.01 # balancing para for the correlation subnet
27 | batch_size = 128 # batch size of the 1st dataset
28 | kp_prob = 1.0 # keep prob in dropout; set to 1.0 if n_epoch = 1
29 | opt_alg = 'Adagrad' # 'Adam'
30 | eta = 0.05 # learning rate
31 | max_num_lower_ct = 100 # early stop if the metric does not improve over the validation set after max_num_lower_ct times
32 | n_epoch = 1 # number of times to loop over the 1st dataset
33 | record_step_size = 200 # record auc and loss on the validation set after record_step_size times of mini_batch
34 | layer_dim = [512, 256, 1] # prediction subnet FC layer dims, the last is the output layer, must be included
35 | layer_dim_match = [512, 256] # matching subnet FC layer dims
36 | 
37 | # second dataset
38 | train_file_name_corr = ['./data/corr.csv']
39 | batch_size_corr = 128 # batch size of the 2nd dataset
40 | layer_dim_corr = [512, 256] # correlation subnet FC layer dims
41 | n_neg_used_corr = 4 # num of neg ads used for each target ad in the 2nd dataset
42 | n_one_hot_slot_corr = 10 # num of one-hot slots per ad in the 2nd dataset
43 | n_mul_hot_slot_corr = 2 # num of mul-hot slots per ad in the 2nd dataset
44 | max_len_per_slot_corr = 10 # max num of fts per mul-hot slot in the 2nd dataset
45 | num_csv_col_corr = 180 # num of cols in the csv file (2nd dataset)
46 | n_epoch_corr = 2 # number of times to loop over the 2nd dataset
47 | 


--------------------------------------------------------------------------------
/ctr_funcs.py:
--------------------------------------------------------------------------------
  1 | import tensorflow as tf
  2 | import numpy as np
  3 | import datetime
  4 | from sklearn import metrics
  5 | 
  6 | def cal_auc(pred_score, label):
  7 |     fpr, tpr, thresholds = metrics.roc_curve(label, pred_score, pos_label=1)
  8 |     auc_val = metrics.auc(fpr, tpr)
  9 |     return auc_val, fpr, tpr
 10 | 
 11 | def cal_rmse(pred_score, label):
 12 |     mse = metrics.mean_squared_error(label, pred_score)
 13 |     rmse = np.sqrt(mse)
 14 |     return rmse
 15 | 
 16 | def cal_rectified_rmse(pred_score, label, sample_rate):
 17 |     for idx, item in enumerate(pred_score):
 18 |         pred_score[idx] = item/(item + (1-item)/sample_rate)
 19 |     mse = metrics.mean_squared_error(label, pred_score)
 20 |     rmse = np.sqrt(mse)
 21 |     return rmse
 22 | 
 23 | # only works for 2D list
 24 | def list_flatten(input_list):
 25 |     output_list = [yy for xx in input_list for yy in xx]
 26 |     return output_list
 27 | 
 28 | 
 29 | def count_lines(file_name):
 30 |     num_lines = sum(1 for line in open(file_name, 'rt'))
 31 |     return num_lines
 32 | 
 33 | # this func is only for avito data
 34 | def tf_read_data(file_name_queue, label_col_idx, record_defaults):
 35 |     reader = tf.TextLineReader()
 36 |     key, value = reader.read(file_name_queue)
 37 |     
 38 |     # Default values, in case of empty columns. Also specifies the type of the decoded result.
 39 |     cols = tf.decode_csv(value, record_defaults=record_defaults)
 40 |     # you can only process the data using tf ops
 41 |     label = cols.pop(label_col_idx)
 42 |     feature = cols
 43 |     # Retrieve a single instance
 44 |     return feature, label
 45 | 
 46 | def tf_read_data_wo_label(file_name_queue, record_defaults):
 47 |     reader = tf.TextLineReader()
 48 |     key, value = reader.read(file_name_queue)
 49 |     # Default values, in case of empty columns. Also specifies the type of the decoded result.
 50 |     cols = tf.decode_csv(value, record_defaults=record_defaults)
 51 |     # you can only process the data using tf ops
 52 |     feature = cols
 53 |     # Retrieve a single instance
 54 |     return feature
 55 | 
 56 | # load training data
 57 | record_defaults = [[0]]*141
 58 | record_defaults[0] = [0.0]
 59 | def tf_input_pipeline(file_names, batch_size, num_epochs=1, label_col_idx=0, record_defaults=record_defaults):
 60 |     # shuffle over files
 61 |     file_name_queue = tf.train.string_input_producer(file_names, num_epochs=num_epochs, shuffle=True)
 62 |     feature, label = tf_read_data(file_name_queue, label_col_idx, record_defaults)
 63 |     # min_after_dequeue defines how big a buffer we will randomly sample from
 64 |     # capacity must be larger than min_after_dequeue and the amount larger determines the max we
 65 |     # will prefetch
 66 |     min_after_dequeue = 5000
 67 |     capacity = min_after_dequeue + 3*batch_size
 68 |     feature_batch, label_batch = tf.train.shuffle_batch([feature, label], \
 69 |             batch_size=batch_size, capacity=capacity, min_after_dequeue=min_after_dequeue)
 70 |     return feature_batch, label_batch
 71 | 
 72 | # without label
 73 | def tf_input_pipeline_wo_label(file_names, batch_size, num_epochs=1, record_defaults=record_defaults):
 74 |     # shuffle over files
 75 |     file_name_queue = tf.train.string_input_producer(file_names, num_epochs=num_epochs, shuffle=True)
 76 |     feature = tf_read_data_wo_label(file_name_queue, record_defaults)
 77 |     # min_after_dequeue defines how big a buffer we will randomly sample from
 78 |     # capacity must be larger than min_after_dequeue and the amount larger determines the max we
 79 |     # will prefetch
 80 |     min_after_dequeue = 5000
 81 |     capacity = min_after_dequeue + 3*batch_size
 82 |     feature_batch = tf.train.shuffle_batch([feature], \
 83 |             batch_size=batch_size, capacity=capacity, min_after_dequeue=min_after_dequeue)
 84 |     return feature_batch
 85 | 
 86 | def tf_input_pipeline_test(file_names, batch_size, num_epochs=1, label_col_idx=0, record_defaults=record_defaults):
 87 |     # shuffle over files
 88 |     file_name_queue = tf.train.string_input_producer(file_names, num_epochs=num_epochs, shuffle=True)
 89 |     feature, label = tf_read_data(file_name_queue, label_col_idx, record_defaults)
 90 |     # min_after_dequeue defines how big a buffer we will randomly sample from
 91 |     # capacity must be larger than min_after_dequeue and the amount larger determines the max we
 92 |     # will prefetch
 93 |     min_after_dequeue = 5000
 94 |     capacity = min_after_dequeue + 3*batch_size
 95 |     feature_batch, label_batch = tf.train.batch([feature, label], \
 96 |             batch_size=batch_size, capacity=capacity)
 97 |     return feature_batch, label_batch
 98 | 
 99 | time_style = '%Y-%m-%d %H:%M:%S'
100 | def print_time():
101 |     now = datetime.datetime.now()
102 |     time_str = now.strftime(time_style)
103 |     print(time_str)
104 | 
105 | 


--------------------------------------------------------------------------------
/data/data_readme.md:
--------------------------------------------------------------------------------
 1 | Please reuse the data in project: https://github.com/oywtece/dstn.
 2 | 
 3 | Although the data are prepared for the DSTN model, we can use the part of the label and the target ad for the DeepMP model.
 4 | 
 5 | Please put the "day_1.csv", "day_2.csv" ... files under this "data" folder.
 6 | 
 7 | The config_deepmcp.py file has been updated such that you can run "dnn.py" and "deepmp.py" successfully.
 8 | 
 9 | You can check https://github.com/oywtece/dstn/issues/4 for the meaning of the columns in the csv data files.
10 | 


--------------------------------------------------------------------------------
/deepcp.py:
--------------------------------------------------------------------------------
  1 | # DeepCP - Deep Correlation and Prediction model
  2 | 
  3 | import numpy as np
  4 | import tensorflow as tf
  5 | import datetime
  6 | import ctr_funcs as func
  7 | import config_deepmcp as cfg
  8 | import os
  9 | import shutil
 10 | 
 11 | # config
 12 | str_txt = cfg.output_file_name
 13 | base_path = './tmp'
 14 | model_saving_addr = base_path + '/deepcp_' + str_txt + '/'
 15 | output_file_name = base_path + '/deepcp_' + str_txt + '.txt'
 16 | num_csv_col = cfg.num_csv_col
 17 | train_file_name = cfg.train_file_name
 18 | val_file_name = cfg.val_file_name
 19 | test_file_name = cfg.test_file_name
 20 | batch_size = cfg.batch_size
 21 | n_ft = cfg.n_ft
 22 | k = cfg.k
 23 | kp_prob = cfg.kp_prob
 24 | n_epoch = cfg.n_epoch
 25 | max_num_lower_ct = cfg.max_num_lower_ct
 26 | record_step_size = cfg.record_step_size
 27 | layer_dim = cfg.layer_dim
 28 | layer_dim_match = cfg.layer_dim_match
 29 | eta = cfg.eta # learning rate
 30 | opt_alg = cfg.opt_alg
 31 | n_one_hot_slot = cfg.n_one_hot_slot
 32 | n_mul_hot_slot = cfg.n_mul_hot_slot
 33 | max_len_per_slot = cfg.max_len_per_slot
 34 | beta = cfg.beta # for correlation loss
 35 | label_col_idx = 0
 36 | record_defaults = [[0]]*num_csv_col
 37 | record_defaults[0] = [0.0]
 38 | total_num_ft_col = num_csv_col - 1
 39 | 
 40 | ## corr dataset - no test data for this dataset
 41 | train_file_name_corr = cfg.train_file_name_corr
 42 | batch_size_corr = cfg.batch_size_corr
 43 | layer_dim_corr = cfg.layer_dim_corr
 44 | n_one_hot_slot_corr = cfg.n_one_hot_slot_corr
 45 | n_mul_hot_slot_corr = cfg.n_mul_hot_slot_corr
 46 | max_len_per_slot_corr = cfg.max_len_per_slot_corr
 47 | n_epoch_corr = cfg.n_epoch_corr
 48 | n_neg_used_corr = cfg.n_neg_used_corr
 49 | # no label
 50 | num_csv_col_corr = cfg.num_csv_col_corr
 51 | record_defaults_corr = [[0]]*num_csv_col_corr
 52 | total_num_ft_col_corr = num_csv_col_corr
 53 |     
 54 | # create dir
 55 | if not os.path.exists(base_path):
 56 |     os.mkdir(base_path)
 57 | 
 58 | # remove dir
 59 | if os.path.isdir(model_saving_addr):
 60 |     shutil.rmtree(model_saving_addr)
 61 | 
 62 | # for DNN
 63 | idx_1 = n_one_hot_slot
 64 | idx_2 = idx_1 + n_mul_hot_slot*max_len_per_slot
 65 | 
 66 | ###########################################################
 67 | ###########################################################
 68 | print('Loading data start!')
 69 | tf.set_random_seed(123)
 70 | 
 71 | # load training data
 72 | train_ft, train_label = func.tf_input_pipeline(train_file_name, batch_size, n_epoch, label_col_idx, record_defaults)
 73 | 
 74 | n_val_inst = func.count_lines(val_file_name[0])
 75 | val_ft, val_label = func.tf_input_pipeline(val_file_name, n_val_inst, 1, label_col_idx, record_defaults)
 76 | n_val_batch = n_val_inst//batch_size
 77 | 
 78 | # load test data
 79 | test_ft, test_label = func.tf_input_pipeline_test(test_file_name, batch_size, 1, label_col_idx, record_defaults)
 80 | print('Loading data set 1 done!')
 81 | 
 82 | # load training data
 83 | train_ft_corr = func.tf_input_pipeline_wo_label(train_file_name_corr, batch_size_corr, n_epoch_corr, record_defaults_corr)
 84 | print('Loading data set 2 done!')
 85 | 
 86 | ########################################################################
 87 | # partition input for correlation loss 
 88 | def partition_input_corr(x_input_corr):
 89 |     # generate idx_list
 90 |     len_list = []
 91 |     
 92 |     # 2 - tar & ctxt
 93 |     for i in range(n_neg_used_corr+2):
 94 |         len_list.append(n_one_hot_slot_corr)
 95 |         len_list.append(n_mul_hot_slot_corr*max_len_per_slot_corr)
 96 |     
 97 |     len_list = np.array(len_list)
 98 |     idx_list = np.cumsum(len_list)
 99 |             
100 |     x_tar_one_hot_corr = x_input_corr[:, 0:idx_list[0]]
101 |     x_tar_mul_hot_corr = x_input_corr[:, idx_list[0]:idx_list[1]]
102 |     # shape=[None, n_mul_hot_slot, max_len_per_slot]
103 |     x_tar_mul_hot_corr = tf.reshape(x_tar_mul_hot_corr, (-1, n_mul_hot_slot_corr, max_len_per_slot_corr))
104 |             
105 |     x_input_one_hot_dict_corr = {}
106 |     x_input_mul_hot_dict_corr = {}
107 |     
108 |     for i in range(n_neg_used_corr+1):
109 |         x_input_one_hot_dict_corr[i] = x_input_corr[:, idx_list[2*i+1]:idx_list[2*i+2]]
110 |         temp = x_input_corr[:, idx_list[2*i+2]:idx_list[2*i+3]]
111 |         x_input_mul_hot_dict_corr[i] = tf.reshape(temp, (-1, n_mul_hot_slot_corr, max_len_per_slot_corr))
112 |     
113 |     return x_tar_one_hot_corr, x_tar_mul_hot_corr, x_input_one_hot_dict_corr, x_input_mul_hot_dict_corr
114 | 
115 | # add mask
116 | def get_masked_one_hot(x_input_one_hot):
117 |     data_mask = tf.cast(tf.greater(x_input_one_hot, 0), tf.float32)
118 |     data_mask = tf.expand_dims(data_mask, axis = 2)
119 |     data_mask = tf.tile(data_mask, (1,1,k))
120 |     # output: (?, n_one_hot_slot, k)
121 |     data_embed_one_hot = tf.nn.embedding_lookup(emb_mat, x_input_one_hot)
122 |     data_embed_one_hot_masked = tf.multiply(data_embed_one_hot, data_mask)
123 |     return data_embed_one_hot_masked
124 | 
125 | def get_masked_mul_hot(x_input_mul_hot):
126 |     data_mask = tf.cast(tf.greater(x_input_mul_hot, 0), tf.float32)
127 |     data_mask = tf.expand_dims(data_mask, axis = 3)
128 |     data_mask = tf.tile(data_mask, (1,1,1,k))
129 |     # output: (?, n_mul_hot_slot, max_len_per_slot, k)
130 |     data_embed_mul_hot = tf.nn.embedding_lookup(emb_mat, x_input_mul_hot)
131 |     data_embed_mul_hot_masked = tf.multiply(data_embed_mul_hot, data_mask)
132 |     # output: (?, n_mul_hot_slot, k)
133 |     data_embed_mul_hot_masked = tf.reduce_sum(data_embed_mul_hot_masked, 2)
134 |     return data_embed_mul_hot_masked
135 | 
136 | # output: (?, n_one_hot_slot + n_mul_hot_slot, k)
137 | def get_concate_embed(x_input_one_hot, x_input_mul_hot):
138 |     data_embed_one_hot = get_masked_one_hot(x_input_one_hot)
139 |     data_embed_mul_hot = get_masked_mul_hot(x_input_mul_hot)
140 |     data_embed_concat = tf.concat([data_embed_one_hot, data_embed_mul_hot], 1)
141 |     return data_embed_concat
142 | 
143 | # input: (?, n_slot*k)
144 | # output: (?, 1)
145 | def get_pred_output(data_embed_concat):
146 |     # include output layer
147 |     n_layer = len(layer_dim)
148 |     data_embed_dnn = tf.reshape(data_embed_concat, [-1, (n_one_hot_slot + n_mul_hot_slot)*k])
149 |     cur_layer = data_embed_dnn
150 |     # loop to create DNN struct
151 |     for i in range(0, n_layer):
152 |         # output layer, linear activation
153 |         if i == n_layer - 1:
154 |             cur_layer = tf.matmul(cur_layer, weight_dict[i]) + bias_dict[i]
155 |         else:
156 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict[i]) + bias_dict[i])
157 |             cur_layer = tf.nn.dropout(cur_layer, keep_prob)
158 |     
159 |     y_hat = cur_layer
160 |     return y_hat
161 | 
162 | # correlation loss input
163 | def get_corr_output(x_input_corr):
164 |     x_tar_one_hot_corr, x_tar_mul_hot_corr, x_input_one_hot_dict_corr, x_input_mul_hot_dict_corr = \
165 |         partition_input_corr(x_input_corr)
166 |     
167 |     data_embed_tar = get_concate_embed(x_tar_one_hot_corr, x_tar_mul_hot_corr)
168 |     data_vec_tar = tf.reshape(data_embed_tar, [-1, (n_one_hot_slot_corr + n_mul_hot_slot_corr)*k])
169 |     
170 |     n_layer_corr = len(layer_dim_corr)
171 |     cur_layer = data_vec_tar
172 |     for i in range(0, n_layer_corr):
173 |         if i == n_layer_corr - 1:
174 |             cur_layer = tf.nn.tanh(tf.matmul(cur_layer, weight_dict_corr[i]) + bias_dict_corr[i])
175 |         else:
176 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict_corr[i]) + bias_dict_corr[i])
177 |     data_rep_tar = cur_layer
178 |     
179 |     # idx 0 - pos, idx 1 -- neg
180 |     inner_prod_dict = {}
181 |     for mm in range(n_neg_used_corr + 1):
182 |         cur_data_embed = get_concate_embed(x_input_one_hot_dict_corr[mm], \
183 |                                            x_input_mul_hot_dict_corr[mm])
184 |         cur_data_vec = tf.reshape(cur_data_embed, [-1, (n_one_hot_slot_corr + n_mul_hot_slot_corr)*k])
185 |         cur_layer = cur_data_vec
186 |         for i in range(0, n_layer_corr):
187 |             if i == n_layer_corr - 1:
188 |                 cur_layer = tf.nn.tanh(tf.matmul(cur_layer, weight_dict_corr[i]) + bias_dict_corr[i])
189 |             else:
190 |                 cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict_corr[i]) + bias_dict_corr[i])
191 |         cur_data_rep = cur_layer
192 |         # each ele - None*1
193 |         inner_prod_dict[mm] = tf.reduce_sum(tf.multiply(data_rep_tar, cur_data_rep), 1, \
194 |                             keep_dims=True)
195 |     
196 |     return inner_prod_dict
197 | 
198 | ###########################################################
199 | ###########################################################
200 | # input for l1 - prediction loss
201 | x_input = tf.placeholder(tf.int32, shape=[None, total_num_ft_col])
202 | # shape=[None, n_one_hot_slot]
203 | x_input_one_hot = x_input[:, 0:idx_1]
204 | x_input_mul_hot = x_input[:, idx_1:idx_2]
205 | # shape=[None, n_mul_hot_slot, max_len_per_slot]
206 | x_input_mul_hot = tf.reshape(x_input_mul_hot, (-1, n_mul_hot_slot, max_len_per_slot))
207 | 
208 | # input for corr loss
209 | x_input_corr = tf.placeholder(tf.int32, shape=[None, total_num_ft_col_corr])
210 | 
211 | # target vec for l1
212 | y_target = tf.placeholder(tf.float32, shape=[None, 1])
213 | 
214 | # dropout keep prob
215 | keep_prob = tf.placeholder(tf.float32)
216 | # emb_mat dim add 1 -> for padding (idx = 0)
217 | with tf.device('/cpu:0'):
218 |     emb_mat = tf.Variable(tf.random_normal([n_ft + 1, k], stddev=0.01))
219 | 
220 | ################################
221 | # prediction subnet FC layers, including output layer
222 | n_layer = len(layer_dim)
223 | in_dim = (n_one_hot_slot + n_mul_hot_slot)*k
224 | weight_dict = {}
225 | bias_dict = {}
226 | 
227 | # loop to create DNN vars
228 | for i in range(0, n_layer):
229 |     out_dim = layer_dim[i]
230 |     weight_dict[i] = tf.Variable(tf.random_normal(shape=[in_dim, out_dim], stddev=np.sqrt(2.0/(in_dim+out_dim))))
231 |     bias_dict[i] = tf.Variable(tf.constant(0.0, shape=[out_dim]))
232 |     in_dim = layer_dim[i]
233 | 
234 | ################################
235 | # correlation subnet FC layers
236 | n_layer_corr = len(layer_dim_corr)
237 | in_dim_corr = (n_one_hot_slot_corr + n_mul_hot_slot_corr)*k
238 | weight_dict_corr = {}
239 | bias_dict_corr = {}
240 | 
241 | for i in range(0, n_layer_corr):
242 |     out_dim_corr = layer_dim_corr[i]
243 |     weight_dict_corr[i] = tf.Variable(tf.random_normal(shape=[in_dim_corr, out_dim_corr],\
244 |                         stddev=np.sqrt(2.0/(in_dim_corr+out_dim_corr))))
245 |     bias_dict_corr[i] = tf.Variable(tf.constant(0.0, shape=[out_dim_corr]))
246 |     in_dim_corr = layer_dim_corr[i]
247 | ################################
248 | 
249 | data_embed_concat = get_concate_embed(x_input_one_hot, x_input_mul_hot)
250 | y_hat = get_pred_output(data_embed_concat)
251 | inner_prod_dict_corr = get_corr_output(x_input_corr)
252 | 
253 | loss_ctr = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_hat, labels=y_target))
254 | # logloss
255 | y_corr_cast_1 = tf.ones_like(inner_prod_dict_corr[0])
256 | y_corr_cast_0 = tf.zeros_like(inner_prod_dict_corr[0])
257 | # pos
258 | loss_corr = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=inner_prod_dict_corr[0], \
259 |     labels=y_corr_cast_1))
260 | # neg
261 | for i in range(n_neg_used_corr):
262 |     loss_corr += tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=inner_prod_dict_corr[i+1], \
263 |                  labels=y_corr_cast_0))
264 | 
265 | loss = loss_ctr + beta*loss_corr
266 | 
267 | #############################
268 | # prediction
269 | #############################
270 | pred_score = tf.sigmoid(y_hat)
271 | 
272 | if opt_alg == 'Adam':
273 |     optimizer = tf.train.AdamOptimizer(eta).minimize(loss)
274 | else:
275 |     # default
276 |     optimizer = tf.train.AdagradOptimizer(eta).minimize(loss)
277 | 
278 | ########################################
279 | # Launch the graph.
280 | config = tf.ConfigProto(log_device_placement=False)
281 | config.gpu_options.allow_growth = True
282 | config.gpu_options.per_process_gpu_memory_fraction = 0.3
283 | 
284 | with tf.Session(config=config) as sess:
285 |     sess.run(tf.global_variables_initializer())
286 |     sess.run(tf.local_variables_initializer())
287 |     coord = tf.train.Coordinator()
288 |     threads = tf.train.start_queue_runners(sess, coord)
289 | 
290 |     saver_val = tf.train.Saver()
291 |     train_loss_list = []
292 |     val_auc_list = []
293 |     best_n_round = 0
294 |     best_val_auc = 0            
295 |     lower_ct = 0
296 |     early_stop_flag = 0
297 | 
298 |     val_ft_inst, val_label_inst = sess.run([val_ft, val_label])
299 | 
300 |     func.print_time()
301 |     print('Start train loop')
302 |     
303 |     epoch = -1
304 |     try:
305 |         while not coord.should_stop():           
306 |             epoch += 1  
307 |             train_ft_inst, train_label_inst = sess.run([train_ft, train_label])
308 |             train_label_inst = np.transpose([train_label_inst])            
309 |             
310 |             train_ft_corr_inst = sess.run(train_ft_corr)
311 |             
312 |             # training
313 |             sess.run(optimizer, feed_dict={x_input:train_ft_inst, y_target:train_label_inst, \
314 |                                            x_input_corr:train_ft_corr_inst, keep_prob:kp_prob})
315 |             
316 |             # record loss and accuracy every step_size generations
317 |             if (epoch+1)%record_step_size == 0:                
318 |                 train_loss_temp = sess.run(loss, feed_dict={ \
319 |                                            x_input:train_ft_inst, y_target:train_label_inst, \
320 |                                            x_input_corr:train_ft_corr_inst, keep_prob:1.0})
321 |                 train_loss_list.append(train_loss_temp)                
322 |  
323 |                 val_pred_score_all = []
324 |                 val_label_all = []
325 |                 
326 |                 for iii in range(n_val_batch):
327 |                     # get batch
328 |                     start_idx = iii*batch_size
329 |                     end_idx = (iii+1)*batch_size
330 |                     cur_val_ft = val_ft_inst[start_idx: end_idx]
331 |                     cur_val_label = val_label_inst[start_idx: end_idx]
332 |                     # pred score
333 |                     cur_val_pred_score = sess.run(pred_score, feed_dict={ \
334 |                                             x_input:cur_val_ft, keep_prob:1.0})
335 |                     val_pred_score_all.append(cur_val_pred_score.flatten())
336 |                     val_label_all.append(cur_val_label)   
337 |                     
338 |                 # calculate auc
339 |                 val_pred_score_re = func.list_flatten(val_pred_score_all)
340 |                 val_label_re = func.list_flatten(val_label_all)
341 |                 val_auc_temp, _, _ = func.cal_auc(val_pred_score_re, val_label_re)
342 |                 # record all val results    
343 |                 val_auc_list.append(val_auc_temp)
344 |                  
345 |                 # record best and save models
346 |                 if val_auc_temp > best_val_auc:
347 |                     best_val_auc = val_auc_temp
348 |                     best_n_round = epoch
349 |                     # Save the variables to disk
350 |                     save_path = saver_val.save(sess, model_saving_addr)
351 |                     print("Model saved in: %s" % save_path)
352 |                 # count of consecutive lower
353 |                 if val_auc_temp < best_val_auc:
354 |                      lower_ct += 1
355 |                 # once higher or equal, set to 0
356 |                 else:
357 |                      lower_ct = 0
358 |                 
359 |                 if lower_ct >= max_num_lower_ct:
360 |                     early_stop_flag = 1
361 |                 
362 |                 auc_and_loss = [epoch+1, train_loss_temp, val_auc_temp]
363 |                 # round to given number of decimals
364 |                 auc_and_loss = [np.round(xx,4) for xx in auc_and_loss]
365 |                 func.print_time() 
366 |                 print('Generation # {}. Train Loss: {:.4f}. Val Avg AUC: {:.4f}.'\
367 |                       .format(*auc_and_loss))
368 | 
369 |             # stop while loop    
370 |             if early_stop_flag == 1:
371 |                 break 
372 |                 
373 |     except tf.errors.OutOfRangeError:
374 |         func.print_time()
375 |         print('Done training -- epoch limit reached')
376 |     
377 |     # restore model
378 |     saver_val.restore(sess, model_saving_addr)
379 |     print("Model restored.")
380 |             
381 |     # load test data
382 |     test_pred_score_all = []
383 |     test_label_all = []
384 |     test_loss_all = []
385 |     try:
386 |         while True:
387 |             test_ft_inst, test_label_inst = sess.run([test_ft, test_label])
388 |             cur_test_pred_score = sess.run(pred_score, feed_dict={ \
389 |                                     x_input:test_ft_inst, keep_prob:1.0})
390 |             test_pred_score_all.append(cur_test_pred_score.flatten())
391 |             test_label_all.append(test_label_inst)
392 |             
393 |             cur_test_loss = sess.run(loss_ctr, feed_dict={ \
394 |                                     x_input:test_ft_inst, \
395 |                                     y_target: np.transpose([test_label_inst]), keep_prob:1.0})
396 |             test_loss_all.append(cur_test_loss)
397 | 
398 |     except tf.errors.OutOfRangeError:
399 |         func.print_time()
400 |         print('Done testing -- epoch limit reached')    
401 |     finally:
402 |         coord.request_stop()
403 |         
404 |     coord.join(threads) 
405 |          
406 |     # calculate auc
407 |     test_pred_score_re = func.list_flatten(test_pred_score_all)
408 |     test_label_re = func.list_flatten(test_label_all)
409 |     test_auc, _, _ = func.cal_auc(test_pred_score_re, test_label_re)
410 |     test_rmse = func.cal_rmse(test_pred_score_re, test_label_re)
411 |     test_loss = np.mean(test_loss_all)
412 |     
413 |     # rounding
414 |     test_auc = np.round(test_auc, 4)
415 |     test_rmse = np.round(test_rmse, 4)
416 |     test_loss = np.round(test_loss, 5)
417 |     train_loss_list = [np.round(xx,4) for xx in train_loss_list]
418 |     val_auc_list = [np.round(xx,4) for xx in val_auc_list]
419 |     
420 |     print('test_auc = ', test_auc)
421 |     print('test_rmse =', test_rmse)
422 |     print('test_loss =', test_loss)
423 |     print('train_loss_list =', train_loss_list)
424 |     print('val_auc_list =', val_auc_list)
425 |     
426 |     # write output to file
427 |     with open(output_file_name, 'a') as f:
428 |         now = datetime.datetime.now()
429 |         time_str = now.strftime(cfg.time_style)
430 |         f.write(time_str + '\n')
431 |         f.write('train_file_name = ' + train_file_name[0] + '\n')
432 |         f.write('learning_rate = ' + str(eta) \
433 |                 + ', beta = ' + str(beta) \
434 |                 + ', n_epoch = ' + str(n_epoch) \
435 |                 + ', emb_dize = ' + str(k) + '\n')
436 |         f.write('test_auc = ' + str(test_auc) + '\n')
437 |         f.write('test_rmse = ' + str(test_rmse) + '\n')
438 |         f.write('test_loss = ' + str(test_loss) + '\n')
439 |         f.write('train_loss_list =' + str(train_loss_list) + '\n')
440 |         f.write('val_auc_list =' + str(val_auc_list) + '\n')
441 |         f.write('-'*50 + '\n')
442 | 


--------------------------------------------------------------------------------
/deepmcp.py:
--------------------------------------------------------------------------------
  1 | # DeepMCP - Deep Matching, Correlation and Prediction model
  2 | 
  3 | import numpy as np
  4 | import tensorflow as tf
  5 | import datetime
  6 | import ctr_funcs as func
  7 | import config_deepmcp as cfg
  8 | import os
  9 | import shutil
 10 | 
 11 | # config
 12 | str_txt = cfg.output_file_name
 13 | base_path = './tmp'
 14 | model_saving_addr = base_path + '/deepmcp_' + str_txt + '/'
 15 | output_file_name = base_path + '/deepmcp_' + str_txt + '.txt'
 16 | num_csv_col = cfg.num_csv_col
 17 | train_file_name = cfg.train_file_name
 18 | val_file_name = cfg.val_file_name
 19 | test_file_name = cfg.test_file_name
 20 | batch_size = cfg.batch_size
 21 | n_ft = cfg.n_ft
 22 | k = cfg.k
 23 | kp_prob = cfg.kp_prob
 24 | n_epoch = cfg.n_epoch
 25 | max_num_lower_ct = cfg.max_num_lower_ct
 26 | record_step_size = cfg.record_step_size
 27 | layer_dim = cfg.layer_dim
 28 | layer_dim_match = cfg.layer_dim_match
 29 | eta = cfg.eta # learning rate
 30 | opt_alg = cfg.opt_alg
 31 | n_one_hot_slot = cfg.n_one_hot_slot
 32 | n_mul_hot_slot = cfg.n_mul_hot_slot
 33 | max_len_per_slot = cfg.max_len_per_slot
 34 | alpha = cfg.alpha # for matching loss
 35 | beta = cfg.beta # for correlation loss
 36 | user_ft_idx = cfg.user_ft_idx
 37 | ad_ft_idx = cfg.ad_ft_idx
 38 | n_user_ft = len(user_ft_idx)
 39 | n_ad_ft = len(ad_ft_idx)
 40 | label_col_idx = 0
 41 | record_defaults = [[0]]*num_csv_col
 42 | record_defaults[0] = [0.0]
 43 | total_num_ft_col = num_csv_col - 1
 44 | 
 45 | ## corr dataset - no test data for this dataset
 46 | train_file_name_corr = cfg.train_file_name_corr
 47 | batch_size_corr = cfg.batch_size_corr
 48 | layer_dim_corr = cfg.layer_dim_corr
 49 | n_one_hot_slot_corr = cfg.n_one_hot_slot_corr
 50 | n_mul_hot_slot_corr = cfg.n_mul_hot_slot_corr
 51 | max_len_per_slot_corr = cfg.max_len_per_slot_corr
 52 | n_epoch_corr = cfg.n_epoch_corr
 53 | n_neg_used_corr = cfg.n_neg_used_corr
 54 | # no label
 55 | num_csv_col_corr = cfg.num_csv_col_corr
 56 | record_defaults_corr = [[0]]*num_csv_col_corr
 57 | total_num_ft_col_corr = num_csv_col_corr
 58 |     
 59 | # create dir
 60 | if not os.path.exists(base_path):
 61 |     os.mkdir(base_path)
 62 | 
 63 | # remove dir
 64 | if os.path.isdir(model_saving_addr):
 65 |     shutil.rmtree(model_saving_addr)
 66 | 
 67 | # for DNN
 68 | idx_1 = n_one_hot_slot
 69 | idx_2 = idx_1 + n_mul_hot_slot*max_len_per_slot
 70 | 
 71 | ###########################################################
 72 | ###########################################################
 73 | print('Loading data start!')
 74 | tf.set_random_seed(123)
 75 | 
 76 | # load training data
 77 | train_ft, train_label = func.tf_input_pipeline(train_file_name, batch_size, n_epoch, label_col_idx, record_defaults)
 78 | 
 79 | n_val_inst = func.count_lines(val_file_name[0])
 80 | val_ft, val_label = func.tf_input_pipeline(val_file_name, n_val_inst, 1, label_col_idx, record_defaults)
 81 | n_val_batch = n_val_inst//batch_size
 82 | 
 83 | # load test data
 84 | test_ft, test_label = func.tf_input_pipeline_test(test_file_name, batch_size, 1, label_col_idx, record_defaults)
 85 | print('Loading data set 1 done!')
 86 | 
 87 | # load training data
 88 | train_ft_corr = func.tf_input_pipeline_wo_label(train_file_name_corr, batch_size_corr, n_epoch_corr, record_defaults_corr)
 89 | print('Loading data set 2 done!')
 90 | 
 91 | ########################################################################
 92 | # partition input for correlation loss 
 93 | def partition_input_corr(x_input_corr):
 94 |     # generate idx_list
 95 |     len_list = []
 96 |     
 97 |     # 2 - tar & ctxt
 98 |     for i in range(n_neg_used_corr+2):
 99 |         len_list.append(n_one_hot_slot_corr)
100 |         len_list.append(n_mul_hot_slot_corr*max_len_per_slot_corr)
101 |     
102 |     len_list = np.array(len_list)
103 |     idx_list = np.cumsum(len_list)
104 |             
105 |     x_tar_one_hot_corr = x_input_corr[:, 0:idx_list[0]]
106 |     x_tar_mul_hot_corr = x_input_corr[:, idx_list[0]:idx_list[1]]
107 |     # shape=[None, n_mul_hot_slot, max_len_per_slot]
108 |     x_tar_mul_hot_corr = tf.reshape(x_tar_mul_hot_corr, (-1, n_mul_hot_slot_corr, max_len_per_slot_corr))
109 |             
110 |     x_input_one_hot_dict_corr = {}
111 |     x_input_mul_hot_dict_corr = {}
112 |     
113 |     for i in range(n_neg_used_corr+1):
114 |         x_input_one_hot_dict_corr[i] = x_input_corr[:, idx_list[2*i+1]:idx_list[2*i+2]]
115 |         temp = x_input_corr[:, idx_list[2*i+2]:idx_list[2*i+3]]
116 |         x_input_mul_hot_dict_corr[i] = tf.reshape(temp, (-1, n_mul_hot_slot_corr, max_len_per_slot_corr))
117 |     
118 |     return x_tar_one_hot_corr, x_tar_mul_hot_corr, x_input_one_hot_dict_corr, x_input_mul_hot_dict_corr
119 | 
120 | # add mask
121 | def get_masked_one_hot(x_input_one_hot):
122 |     data_mask = tf.cast(tf.greater(x_input_one_hot, 0), tf.float32)
123 |     data_mask = tf.expand_dims(data_mask, axis = 2)
124 |     data_mask = tf.tile(data_mask, (1,1,k))
125 |     # output: (?, n_one_hot_slot, k)
126 |     data_embed_one_hot = tf.nn.embedding_lookup(emb_mat, x_input_one_hot)
127 |     data_embed_one_hot_masked = tf.multiply(data_embed_one_hot, data_mask)
128 |     return data_embed_one_hot_masked
129 | 
130 | def get_masked_mul_hot(x_input_mul_hot):
131 |     data_mask = tf.cast(tf.greater(x_input_mul_hot, 0), tf.float32)
132 |     data_mask = tf.expand_dims(data_mask, axis = 3)
133 |     data_mask = tf.tile(data_mask, (1,1,1,k))
134 |     # output: (?, n_mul_hot_slot, max_len_per_slot, k)
135 |     data_embed_mul_hot = tf.nn.embedding_lookup(emb_mat, x_input_mul_hot)
136 |     data_embed_mul_hot_masked = tf.multiply(data_embed_mul_hot, data_mask)
137 |     # output: (?, n_mul_hot_slot, k)
138 |     data_embed_mul_hot_masked = tf.reduce_sum(data_embed_mul_hot_masked, 2)
139 |     return data_embed_mul_hot_masked
140 | 
141 | # output: (?, n_one_hot_slot + n_mul_hot_slot, k)
142 | def get_concate_embed(x_input_one_hot, x_input_mul_hot):
143 |     data_embed_one_hot = get_masked_one_hot(x_input_one_hot)
144 |     data_embed_mul_hot = get_masked_mul_hot(x_input_mul_hot)
145 |     data_embed_concat = tf.concat([data_embed_one_hot, data_embed_mul_hot], 1)
146 |     return data_embed_concat
147 | 
148 | # input: (?, n_slot*k)
149 | # output: (?, 1)
150 | def get_pred_output(data_embed_concat):
151 |     # include output layer
152 |     n_layer = len(layer_dim)
153 |     data_embed_dnn = tf.reshape(data_embed_concat, [-1, (n_one_hot_slot + n_mul_hot_slot)*k])
154 |     cur_layer = data_embed_dnn
155 |     # loop to create DNN struct
156 |     for i in range(0, n_layer):
157 |         # output layer, linear activation
158 |         if i == n_layer - 1:
159 |             cur_layer = tf.matmul(cur_layer, weight_dict[i]) + bias_dict[i]
160 |         else:
161 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict[i]) + bias_dict[i])
162 |             cur_layer = tf.nn.dropout(cur_layer, keep_prob)
163 |     
164 |     y_hat = cur_layer
165 |     return y_hat
166 | 
167 | # matching loss input
168 | def get_match_output(data_embed_concat):
169 |     cur_idx = user_ft_idx[0]
170 |     user_ft_cols = data_embed_concat[:, cur_idx:cur_idx+1, :]
171 |     for i in range(1, len(user_ft_idx)):
172 |         cur_idx = user_ft_idx[i]
173 |         cur_x = data_embed_concat[:, cur_idx:cur_idx+1, :]
174 |         user_ft_cols = tf.concat([user_ft_cols, cur_x], 1)
175 |     
176 |     cur_idx = ad_ft_idx[0]
177 |     ad_ft_cols = data_embed_concat[:, cur_idx:cur_idx+1, :]
178 |     for i in range(1, len(ad_ft_idx)):
179 |         cur_idx = ad_ft_idx[i]
180 |         cur_x = data_embed_concat[:, cur_idx:cur_idx+1, :]
181 |         ad_ft_cols = tf.concat([ad_ft_cols, cur_x], 1)
182 |     
183 |     user_ft_vec = tf.reshape(user_ft_cols, [-1, n_user_ft*k])
184 |     ad_ft_vec = tf.reshape(ad_ft_cols, [-1, n_ad_ft*k])
185 |     
186 |     n_layer_match = len(layer_dim_match)
187 |     cur_layer = user_ft_vec
188 |     for i in range(0, n_layer_match):
189 |         if i == n_layer_match - 1:
190 |             cur_layer = tf.nn.tanh(tf.matmul(cur_layer, weight_dict_user[i]) + bias_dict_user[i])
191 |         else:
192 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict_user[i]) + bias_dict_user[i])
193 |     user_rep = cur_layer
194 |     
195 |     cur_layer = ad_ft_vec
196 |     for i in range(0, n_layer_match):
197 |         if i == n_layer_match - 1:
198 |             cur_layer = tf.nn.tanh(tf.matmul(cur_layer, weight_dict_ad[i]) + bias_dict_ad[i])
199 |         else:
200 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict_ad[i]) + bias_dict_ad[i])
201 |     ad_rep = cur_layer
202 |     
203 |     # (?*mk) x (?*mk) -> (?*1)
204 |     inner_prod = tf.reduce_sum(tf.multiply(user_rep, ad_rep), 1, keep_dims=True)
205 |     return inner_prod
206 |     
207 | # correlation loss input
208 | def get_corr_output(x_input_corr):
209 |     x_tar_one_hot_corr, x_tar_mul_hot_corr, x_input_one_hot_dict_corr, x_input_mul_hot_dict_corr = \
210 |         partition_input_corr(x_input_corr)
211 |     
212 |     data_embed_tar = get_concate_embed(x_tar_one_hot_corr, x_tar_mul_hot_corr)
213 |     data_vec_tar = tf.reshape(data_embed_tar, [-1, (n_one_hot_slot_corr + n_mul_hot_slot_corr)*k])
214 |     
215 |     n_layer_corr = len(layer_dim_corr)
216 |     cur_layer = data_vec_tar
217 |     for i in range(0, n_layer_corr):
218 |         if i == n_layer_corr - 1:
219 |             cur_layer = tf.nn.tanh(tf.matmul(cur_layer, weight_dict_corr[i]) + bias_dict_corr[i])
220 |         else:
221 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict_corr[i]) + bias_dict_corr[i])
222 |     data_rep_tar = cur_layer
223 |     
224 |     # idx 0 - pos, idx 1 -- neg
225 |     inner_prod_dict = {}
226 |     for mm in range(n_neg_used_corr + 1):
227 |         cur_data_embed = get_concate_embed(x_input_one_hot_dict_corr[mm], \
228 |                                            x_input_mul_hot_dict_corr[mm])
229 |         cur_data_vec = tf.reshape(cur_data_embed, [-1, (n_one_hot_slot_corr + n_mul_hot_slot_corr)*k])
230 |         cur_layer = cur_data_vec
231 |         for i in range(0, n_layer_corr):
232 |             if i == n_layer_corr - 1:
233 |                 cur_layer = tf.nn.tanh(tf.matmul(cur_layer, weight_dict_corr[i]) + bias_dict_corr[i])
234 |             else:
235 |                 cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict_corr[i]) + bias_dict_corr[i])
236 |         cur_data_rep = cur_layer
237 |         # each ele - None*1
238 |         inner_prod_dict[mm] = tf.reduce_sum(tf.multiply(data_rep_tar, cur_data_rep), 1, \
239 |                             keep_dims=True)
240 |     
241 |     return inner_prod_dict
242 | 
243 | ###########################################################
244 | ###########################################################
245 | # input for l1 - prediction loss
246 | x_input = tf.placeholder(tf.int32, shape=[None, total_num_ft_col])
247 | # shape=[None, n_one_hot_slot]
248 | x_input_one_hot = x_input[:, 0:idx_1]
249 | x_input_mul_hot = x_input[:, idx_1:idx_2]
250 | # shape=[None, n_mul_hot_slot, max_len_per_slot]
251 | x_input_mul_hot = tf.reshape(x_input_mul_hot, (-1, n_mul_hot_slot, max_len_per_slot))
252 | 
253 | # input for corr loss
254 | x_input_corr = tf.placeholder(tf.int32, shape=[None, total_num_ft_col_corr])
255 | 
256 | # target vec for l1
257 | y_target = tf.placeholder(tf.float32, shape=[None, 1])
258 | 
259 | # dropout keep prob
260 | keep_prob = tf.placeholder(tf.float32)
261 | # emb_mat dim add 1 -> for padding (idx = 0)
262 | with tf.device('/cpu:0'):
263 |     emb_mat = tf.Variable(tf.random_normal([n_ft + 1, k], stddev=0.01))
264 | 
265 | ################################
266 | # prediction subnet FC layers, including output layer
267 | n_layer = len(layer_dim)
268 | in_dim = (n_one_hot_slot + n_mul_hot_slot)*k
269 | weight_dict = {}
270 | bias_dict = {}
271 | 
272 | # loop to create DNN vars
273 | for i in range(0, n_layer):
274 |     out_dim = layer_dim[i]
275 |     weight_dict[i] = tf.Variable(tf.random_normal(shape=[in_dim, out_dim], stddev=np.sqrt(2.0/(in_dim+out_dim))))
276 |     bias_dict[i] = tf.Variable(tf.constant(0.0, shape=[out_dim]))
277 |     in_dim = layer_dim[i]
278 | 
279 | ################################
280 | # matching subnet FC layers
281 | n_layer_match = len(layer_dim_match)
282 | in_dim_user = n_user_ft*k
283 | weight_dict_user={}
284 | bias_dict_user={}
285 | 
286 | in_dim_ad = n_ad_ft*k
287 | weight_dict_ad={}
288 | bias_dict_ad={}
289 | 
290 | for i in range(0, n_layer_match):
291 |     out_dim_user = layer_dim_match[i]
292 |     weight_dict_user[i] = tf.Variable(tf.random_normal(shape=[in_dim_user, out_dim_user],\
293 |                                     stddev=np.sqrt(2.0/(in_dim_user+out_dim_user))))
294 |     bias_dict_user[i] = tf.Variable(tf.constant(0.0, shape=[out_dim_user]))
295 |     in_dim_user = layer_dim_match[i]
296 | 
297 | for i in range(0, n_layer_match):
298 |     out_dim_ad = layer_dim_match[i]
299 |     weight_dict_ad[i] = tf.Variable(tf.random_normal(shape=[in_dim_ad, out_dim_ad],\
300 |                                     stddev=np.sqrt(2.0/(in_dim_ad+out_dim_ad))))
301 |     bias_dict_ad[i] = tf.Variable(tf.constant(0.0, shape=[out_dim_ad]))
302 |     in_dim_ad = layer_dim_match[i]
303 | 
304 | 
305 | ################################
306 | # correlation subnet FC layers
307 | n_layer_corr = len(layer_dim_corr)
308 | in_dim_corr = (n_one_hot_slot_corr + n_mul_hot_slot_corr)*k
309 | weight_dict_corr = {}
310 | bias_dict_corr = {}
311 | 
312 | for i in range(0, n_layer_corr):
313 |     out_dim_corr = layer_dim_corr[i]
314 |     weight_dict_corr[i] = tf.Variable(tf.random_normal(shape=[in_dim_corr, out_dim_corr],\
315 |                         stddev=np.sqrt(2.0/(in_dim_corr+out_dim_corr))))
316 |     bias_dict_corr[i] = tf.Variable(tf.constant(0.0, shape=[out_dim_corr]))
317 |     in_dim_corr = layer_dim_corr[i]
318 | ################################
319 | 
320 | data_embed_concat = get_concate_embed(x_input_one_hot, x_input_mul_hot)
321 | y_hat = get_pred_output(data_embed_concat)
322 | y_hat_match = get_match_output(data_embed_concat)
323 | inner_prod_dict_corr = get_corr_output(x_input_corr)
324 | 
325 | loss_ctr = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_hat, labels=y_target))
326 | loss_match = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_hat_match, labels=y_target))
327 | 
328 | # logloss
329 | y_corr_cast_1 = tf.ones_like(inner_prod_dict_corr[0])
330 | y_corr_cast_0 = tf.zeros_like(inner_prod_dict_corr[0])
331 | # pos
332 | loss_corr = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=inner_prod_dict_corr[0], \
333 |     labels=y_corr_cast_1))
334 | # neg
335 | for i in range(n_neg_used_corr):
336 |     loss_corr += tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=inner_prod_dict_corr[i+1], \
337 |                  labels=y_corr_cast_0))
338 | 
339 | loss = loss_ctr + alpha*loss_match + beta*loss_corr
340 | 
341 | #############################
342 | # prediction
343 | #############################
344 | pred_score = tf.sigmoid(y_hat)
345 | 
346 | if opt_alg == 'Adam':
347 |     optimizer = tf.train.AdamOptimizer(eta).minimize(loss)
348 | else:
349 |     # default
350 |     optimizer = tf.train.AdagradOptimizer(eta).minimize(loss)
351 | 
352 | ########################################
353 | # Launch the graph.
354 | config = tf.ConfigProto(log_device_placement=False)
355 | config.gpu_options.allow_growth = True
356 | config.gpu_options.per_process_gpu_memory_fraction = 0.3
357 | 
358 | with tf.Session(config=config) as sess:
359 |     sess.run(tf.global_variables_initializer())
360 |     sess.run(tf.local_variables_initializer())
361 |     coord = tf.train.Coordinator()
362 |     threads = tf.train.start_queue_runners(sess, coord)
363 | 
364 |     saver_val = tf.train.Saver()
365 |     train_loss_list = []
366 |     val_auc_list = []
367 |     best_n_round = 0
368 |     best_val_auc = 0            
369 |     lower_ct = 0
370 |     early_stop_flag = 0
371 | 
372 |     val_ft_inst, val_label_inst = sess.run([val_ft, val_label])
373 | 
374 |     func.print_time()
375 |     print('Start train loop')
376 |     
377 |     epoch = -1
378 |     try:
379 |         while not coord.should_stop():           
380 |             epoch += 1  
381 |             train_ft_inst, train_label_inst = sess.run([train_ft, train_label])
382 |             train_label_inst = np.transpose([train_label_inst])            
383 |             
384 |             train_ft_corr_inst = sess.run(train_ft_corr)
385 |             
386 |             # training
387 |             sess.run(optimizer, feed_dict={x_input:train_ft_inst, y_target:train_label_inst, \
388 |                                            x_input_corr:train_ft_corr_inst, keep_prob:kp_prob})
389 |             
390 |             # record loss and accuracy every step_size generations
391 |             if (epoch+1)%record_step_size == 0:                
392 |                 train_loss_temp = sess.run(loss, feed_dict={ \
393 |                                            x_input:train_ft_inst, y_target:train_label_inst, \
394 |                                            x_input_corr:train_ft_corr_inst, keep_prob:1.0})
395 |                 train_loss_list.append(train_loss_temp)                
396 |  
397 |                 val_pred_score_all = []
398 |                 val_label_all = []
399 |                 
400 |                 for iii in range(n_val_batch):
401 |                     # get batch
402 |                     start_idx = iii*batch_size
403 |                     end_idx = (iii+1)*batch_size
404 |                     cur_val_ft = val_ft_inst[start_idx: end_idx]
405 |                     cur_val_label = val_label_inst[start_idx: end_idx]
406 |                     # pred score
407 |                     cur_val_pred_score = sess.run(pred_score, feed_dict={ \
408 |                                             x_input:cur_val_ft, keep_prob:1.0})
409 |                     val_pred_score_all.append(cur_val_pred_score.flatten())
410 |                     val_label_all.append(cur_val_label)   
411 |                     
412 |                 # calculate auc
413 |                 val_pred_score_re = func.list_flatten(val_pred_score_all)
414 |                 val_label_re = func.list_flatten(val_label_all)
415 |                 val_auc_temp, _, _ = func.cal_auc(val_pred_score_re, val_label_re)
416 |                 # record all val results    
417 |                 val_auc_list.append(val_auc_temp)
418 |                  
419 |                 # record best and save models
420 |                 if val_auc_temp > best_val_auc:
421 |                     best_val_auc = val_auc_temp
422 |                     best_n_round = epoch
423 |                     # Save the variables to disk
424 |                     save_path = saver_val.save(sess, model_saving_addr)
425 |                     print("Model saved in: %s" % save_path)
426 |                 # count of consecutive lower
427 |                 if val_auc_temp < best_val_auc:
428 |                      lower_ct += 1
429 |                 # once higher or equal, set to 0
430 |                 else:
431 |                      lower_ct = 0
432 |                 
433 |                 if lower_ct >= max_num_lower_ct:
434 |                     early_stop_flag = 1
435 |                 
436 |                 auc_and_loss = [epoch+1, train_loss_temp, val_auc_temp]
437 |                 # round to given number of decimals
438 |                 auc_and_loss = [np.round(xx,4) for xx in auc_and_loss]
439 |                 func.print_time() 
440 |                 print('Generation # {}. Train Loss: {:.4f}. Val Avg AUC: {:.4f}.'\
441 |                       .format(*auc_and_loss))
442 | 
443 |             # stop while loop    
444 |             if early_stop_flag == 1:
445 |                 break 
446 |                 
447 |     except tf.errors.OutOfRangeError:
448 |         func.print_time()
449 |         print('Done training -- epoch limit reached')
450 |     
451 |     # restore model
452 |     saver_val.restore(sess, model_saving_addr)
453 |     print("Model restored.")
454 |             
455 |     # load test data
456 |     test_pred_score_all = []
457 |     test_label_all = []
458 |     test_loss_all = []
459 |     try:
460 |         while True:
461 |             test_ft_inst, test_label_inst = sess.run([test_ft, test_label])
462 |             cur_test_pred_score = sess.run(pred_score, feed_dict={ \
463 |                                     x_input:test_ft_inst, keep_prob:1.0})
464 |             test_pred_score_all.append(cur_test_pred_score.flatten())
465 |             test_label_all.append(test_label_inst)
466 |             
467 |             cur_test_loss = sess.run(loss_ctr, feed_dict={ \
468 |                                     x_input:test_ft_inst, \
469 |                                     y_target: np.transpose([test_label_inst]), keep_prob:1.0})
470 |             test_loss_all.append(cur_test_loss)
471 | 
472 |     except tf.errors.OutOfRangeError:
473 |         func.print_time()
474 |         print('Done testing -- epoch limit reached')    
475 |     finally:
476 |         coord.request_stop()
477 |         
478 |     coord.join(threads) 
479 |          
480 |     # calculate auc
481 |     test_pred_score_re = func.list_flatten(test_pred_score_all)
482 |     test_label_re = func.list_flatten(test_label_all)
483 |     test_auc, _, _ = func.cal_auc(test_pred_score_re, test_label_re)
484 |     test_rmse = func.cal_rmse(test_pred_score_re, test_label_re)
485 |     test_loss = np.mean(test_loss_all)
486 |     
487 |     # rounding
488 |     test_auc = np.round(test_auc, 4)
489 |     test_rmse = np.round(test_rmse, 4)
490 |     test_loss = np.round(test_loss, 5)
491 |     train_loss_list = [np.round(xx,4) for xx in train_loss_list]
492 |     val_auc_list = [np.round(xx,4) for xx in val_auc_list]
493 |     
494 |     print('test_auc = ', test_auc)
495 |     print('test_rmse =', test_rmse)
496 |     print('test_loss =', test_loss)
497 |     print('train_loss_list =', train_loss_list)
498 |     print('val_auc_list =', val_auc_list)
499 |     
500 |     # write output to file
501 |     with open(output_file_name, 'a') as f:
502 |         now = datetime.datetime.now()
503 |         time_str = now.strftime(cfg.time_style)
504 |         f.write(time_str + '\n')
505 |         f.write('train_file_name = ' + train_file_name[0] + '\n')
506 |         f.write('learning_rate = ' + str(eta) + ', alpha = ' + str(alpha) \
507 |                 + ', beta = ' + str(beta) \
508 |                 + ', n_epoch = ' + str(n_epoch) \
509 |                 + ', emb_dize = ' + str(k) + '\n')
510 |         f.write('test_auc = ' + str(test_auc) + '\n')
511 |         f.write('test_rmse = ' + str(test_rmse) + '\n')
512 |         f.write('test_loss = ' + str(test_loss) + '\n')
513 |         f.write('train_loss_list =' + str(train_loss_list) + '\n')
514 |         f.write('val_auc_list =' + str(val_auc_list) + '\n')
515 |         f.write('-'*50 + '\n')
516 | 


--------------------------------------------------------------------------------
/deepmp.py:
--------------------------------------------------------------------------------
  1 | # DeepMP - Deep Matching and Prediction model
  2 | 
  3 | import numpy as np
  4 | import tensorflow as tf
  5 | import datetime
  6 | import ctr_funcs as func
  7 | import config_deepmcp as cfg
  8 | import os
  9 | import shutil
 10 | 
 11 | # config
 12 | str_txt = cfg.output_file_name
 13 | base_path = './tmp'
 14 | model_saving_addr = base_path + '/deepmp_' + str_txt + '/'
 15 | output_file_name = base_path + '/deepmp_' + str_txt + '.txt'
 16 | num_csv_col = cfg.num_csv_col
 17 | train_file_name = cfg.train_file_name
 18 | val_file_name = cfg.val_file_name
 19 | test_file_name = cfg.test_file_name
 20 | batch_size = cfg.batch_size
 21 | n_ft = cfg.n_ft
 22 | k = cfg.k
 23 | kp_prob = cfg.kp_prob
 24 | n_epoch = cfg.n_epoch
 25 | max_num_lower_ct = cfg.max_num_lower_ct
 26 | record_step_size = cfg.record_step_size
 27 | layer_dim = cfg.layer_dim
 28 | layer_dim_match = cfg.layer_dim_match
 29 | eta = cfg.eta # learning rate
 30 | opt_alg = cfg.opt_alg
 31 | n_one_hot_slot = cfg.n_one_hot_slot
 32 | n_mul_hot_slot = cfg.n_mul_hot_slot
 33 | max_len_per_slot = cfg.max_len_per_slot
 34 | alpha = cfg.alpha # for matching loss
 35 | user_ft_idx = cfg.user_ft_idx
 36 | ad_ft_idx = cfg.ad_ft_idx
 37 | n_user_ft = len(user_ft_idx)
 38 | n_ad_ft = len(ad_ft_idx)
 39 | label_col_idx = 0
 40 | record_defaults = [[0]]*num_csv_col
 41 | record_defaults[0] = [0.0]
 42 | total_num_ft_col = num_csv_col - 1
 43 |     
 44 | # create dir
 45 | if not os.path.exists(base_path):
 46 |     os.mkdir(base_path)
 47 | 
 48 | # remove dir
 49 | if os.path.isdir(model_saving_addr):
 50 |     shutil.rmtree(model_saving_addr)
 51 | 
 52 | # for DNN
 53 | idx_1 = n_one_hot_slot
 54 | idx_2 = idx_1 + n_mul_hot_slot*max_len_per_slot
 55 | 
 56 | ###########################################################
 57 | ###########################################################
 58 | print('Loading data start!')
 59 | tf.set_random_seed(123)
 60 | 
 61 | # load training data
 62 | train_ft, train_label = func.tf_input_pipeline(train_file_name, batch_size, n_epoch, label_col_idx, record_defaults)
 63 | 
 64 | n_val_inst = func.count_lines(val_file_name[0])
 65 | val_ft, val_label = func.tf_input_pipeline(val_file_name, n_val_inst, 1, label_col_idx, record_defaults)
 66 | n_val_batch = n_val_inst//batch_size
 67 | 
 68 | # load test data
 69 | test_ft, test_label = func.tf_input_pipeline_test(test_file_name, batch_size, 1, label_col_idx, record_defaults)
 70 | print('Loading data set 1 done!')
 71 | 
 72 | ########################################################################
 73 | 
 74 | # add mask
 75 | def get_masked_one_hot(x_input_one_hot):
 76 |     data_mask = tf.cast(tf.greater(x_input_one_hot, 0), tf.float32)
 77 |     data_mask = tf.expand_dims(data_mask, axis = 2)
 78 |     data_mask = tf.tile(data_mask, (1,1,k))
 79 |     # output: (?, n_one_hot_slot, k)
 80 |     data_embed_one_hot = tf.nn.embedding_lookup(emb_mat, x_input_one_hot)
 81 |     data_embed_one_hot_masked = tf.multiply(data_embed_one_hot, data_mask)
 82 |     return data_embed_one_hot_masked
 83 | 
 84 | def get_masked_mul_hot(x_input_mul_hot):
 85 |     data_mask = tf.cast(tf.greater(x_input_mul_hot, 0), tf.float32)
 86 |     data_mask = tf.expand_dims(data_mask, axis = 3)
 87 |     data_mask = tf.tile(data_mask, (1,1,1,k))
 88 |     # output: (?, n_mul_hot_slot, max_len_per_slot, k)
 89 |     data_embed_mul_hot = tf.nn.embedding_lookup(emb_mat, x_input_mul_hot)
 90 |     data_embed_mul_hot_masked = tf.multiply(data_embed_mul_hot, data_mask)
 91 |     # output: (?, n_mul_hot_slot, k)
 92 |     data_embed_mul_hot_masked = tf.reduce_sum(data_embed_mul_hot_masked, 2)
 93 |     return data_embed_mul_hot_masked
 94 | 
 95 | # output: (?, n_one_hot_slot + n_mul_hot_slot, k)
 96 | def get_concate_embed(x_input_one_hot, x_input_mul_hot):
 97 |     data_embed_one_hot = get_masked_one_hot(x_input_one_hot)
 98 |     data_embed_mul_hot = get_masked_mul_hot(x_input_mul_hot)
 99 |     data_embed_concat = tf.concat([data_embed_one_hot, data_embed_mul_hot], 1)
100 |     return data_embed_concat
101 | 
102 | # input: (?, n_slot*k)
103 | # output: (?, 1)
104 | def get_pred_output(data_embed_concat):
105 |     # include output layer
106 |     n_layer = len(layer_dim)
107 |     data_embed_dnn = tf.reshape(data_embed_concat, [-1, (n_one_hot_slot + n_mul_hot_slot)*k])
108 |     cur_layer = data_embed_dnn
109 |     # loop to create DNN struct
110 |     for i in range(0, n_layer):
111 |         # output layer, linear activation
112 |         if i == n_layer - 1:
113 |             cur_layer = tf.matmul(cur_layer, weight_dict[i]) + bias_dict[i]
114 |         else:
115 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict[i]) + bias_dict[i])
116 |             cur_layer = tf.nn.dropout(cur_layer, keep_prob)
117 |     
118 |     y_hat = cur_layer
119 |     return y_hat
120 | 
121 | # matching loss input
122 | def get_match_output(data_embed_concat):
123 |     cur_idx = user_ft_idx[0]
124 |     user_ft_cols = data_embed_concat[:, cur_idx:cur_idx+1, :]
125 |     for i in range(1, len(user_ft_idx)):
126 |         cur_idx = user_ft_idx[i]
127 |         cur_x = data_embed_concat[:, cur_idx:cur_idx+1, :]
128 |         user_ft_cols = tf.concat([user_ft_cols, cur_x], 1)
129 |     
130 |     cur_idx = ad_ft_idx[0]
131 |     ad_ft_cols = data_embed_concat[:, cur_idx:cur_idx+1, :]
132 |     for i in range(1, len(ad_ft_idx)):
133 |         cur_idx = ad_ft_idx[i]
134 |         cur_x = data_embed_concat[:, cur_idx:cur_idx+1, :]
135 |         ad_ft_cols = tf.concat([ad_ft_cols, cur_x], 1)
136 |     
137 |     user_ft_vec = tf.reshape(user_ft_cols, [-1, n_user_ft*k])
138 |     ad_ft_vec = tf.reshape(ad_ft_cols, [-1, n_ad_ft*k])
139 |     
140 |     n_layer_match = len(layer_dim_match)
141 |     cur_layer = user_ft_vec
142 |     for i in range(0, n_layer_match):
143 |         if i == n_layer_match - 1:
144 |             cur_layer = tf.nn.tanh(tf.matmul(cur_layer, weight_dict_user[i]) + bias_dict_user[i])
145 |         else:
146 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict_user[i]) + bias_dict_user[i])
147 |     user_rep = cur_layer
148 |     
149 |     cur_layer = ad_ft_vec
150 |     for i in range(0, n_layer_match):
151 |         if i == n_layer_match - 1:
152 |             cur_layer = tf.nn.tanh(tf.matmul(cur_layer, weight_dict_ad[i]) + bias_dict_ad[i])
153 |         else:
154 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict_ad[i]) + bias_dict_ad[i])
155 |     ad_rep = cur_layer
156 |     
157 |     # (?*mk) x (?*mk) -> (?*1)
158 |     inner_prod = tf.reduce_sum(tf.multiply(user_rep, ad_rep), 1, keep_dims=True)
159 |     return inner_prod
160 | 
161 | ###########################################################
162 | ###########################################################
163 | # input for l1 - prediction loss
164 | x_input = tf.placeholder(tf.int32, shape=[None, total_num_ft_col])
165 | # shape=[None, n_one_hot_slot]
166 | x_input_one_hot = x_input[:, 0:idx_1]
167 | x_input_mul_hot = x_input[:, idx_1:idx_2]
168 | # shape=[None, n_mul_hot_slot, max_len_per_slot]
169 | x_input_mul_hot = tf.reshape(x_input_mul_hot, (-1, n_mul_hot_slot, max_len_per_slot))
170 | 
171 | # target vec for l1
172 | y_target = tf.placeholder(tf.float32, shape=[None, 1])
173 | 
174 | # dropout keep prob
175 | keep_prob = tf.placeholder(tf.float32)
176 | # emb_mat dim add 1 -> for padding (idx = 0)
177 | with tf.device('/cpu:0'):
178 |     emb_mat = tf.Variable(tf.random_normal([n_ft + 1, k], stddev=0.01))
179 | 
180 | ################################
181 | # prediction subnet FC layers, including output layer
182 | n_layer = len(layer_dim)
183 | in_dim = (n_one_hot_slot + n_mul_hot_slot)*k
184 | weight_dict = {}
185 | bias_dict = {}
186 | 
187 | # loop to create DNN vars
188 | for i in range(0, n_layer):
189 |     out_dim = layer_dim[i]
190 |     weight_dict[i] = tf.Variable(tf.random_normal(shape=[in_dim, out_dim], stddev=np.sqrt(2.0/(in_dim+out_dim))))
191 |     bias_dict[i] = tf.Variable(tf.constant(0.0, shape=[out_dim]))
192 |     in_dim = layer_dim[i]
193 | 
194 | ################################
195 | # matching subnet FC layers
196 | n_layer_match = len(layer_dim_match)
197 | in_dim_user = n_user_ft*k
198 | weight_dict_user={}
199 | bias_dict_user={}
200 | 
201 | in_dim_ad = n_ad_ft*k
202 | weight_dict_ad={}
203 | bias_dict_ad={}
204 | 
205 | for i in range(0, n_layer_match):
206 |     out_dim_user = layer_dim_match[i]
207 |     weight_dict_user[i] = tf.Variable(tf.random_normal(shape=[in_dim_user, out_dim_user],\
208 |                                     stddev=np.sqrt(2.0/(in_dim_user+out_dim_user))))
209 |     bias_dict_user[i] = tf.Variable(tf.constant(0.0, shape=[out_dim_user]))
210 |     in_dim_user = layer_dim_match[i]
211 | 
212 | for i in range(0, n_layer_match):
213 |     out_dim_ad = layer_dim_match[i]
214 |     weight_dict_ad[i] = tf.Variable(tf.random_normal(shape=[in_dim_ad, out_dim_ad],\
215 |                                     stddev=np.sqrt(2.0/(in_dim_ad+out_dim_ad))))
216 |     bias_dict_ad[i] = tf.Variable(tf.constant(0.0, shape=[out_dim_ad]))
217 |     in_dim_ad = layer_dim_match[i]
218 | 
219 | ################################
220 | data_embed_concat = get_concate_embed(x_input_one_hot, x_input_mul_hot)
221 | y_hat = get_pred_output(data_embed_concat)
222 | y_hat_match = get_match_output(data_embed_concat)
223 | 
224 | loss_ctr = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_hat, labels=y_target))
225 | loss_match = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_hat_match, labels=y_target))
226 | 
227 | loss = loss_ctr + alpha*loss_match
228 | 
229 | #############################
230 | # prediction
231 | #############################
232 | pred_score = tf.sigmoid(y_hat)
233 | 
234 | if opt_alg == 'Adam':
235 |     optimizer = tf.train.AdamOptimizer(eta).minimize(loss)
236 | else:
237 |     # default
238 |     optimizer = tf.train.AdagradOptimizer(eta).minimize(loss)
239 | 
240 | ########################################
241 | # Launch the graph.
242 | config = tf.ConfigProto(log_device_placement=False)
243 | config.gpu_options.allow_growth = True
244 | config.gpu_options.per_process_gpu_memory_fraction = 0.3
245 | 
246 | with tf.Session(config=config) as sess:
247 |     sess.run(tf.global_variables_initializer())
248 |     sess.run(tf.local_variables_initializer())
249 |     coord = tf.train.Coordinator()
250 |     threads = tf.train.start_queue_runners(sess, coord)
251 | 
252 |     saver_val = tf.train.Saver()
253 |     train_loss_list = []
254 |     val_auc_list = []
255 |     best_n_round = 0
256 |     best_val_auc = 0            
257 |     lower_ct = 0
258 |     early_stop_flag = 0
259 | 
260 |     val_ft_inst, val_label_inst = sess.run([val_ft, val_label])
261 | 
262 |     func.print_time()
263 |     print('Start train loop')
264 |     
265 |     epoch = -1
266 |     try:
267 |         while not coord.should_stop():           
268 |             epoch += 1  
269 |             train_ft_inst, train_label_inst = sess.run([train_ft, train_label])
270 |             train_label_inst = np.transpose([train_label_inst])            
271 |             
272 |             # training
273 |             sess.run(optimizer, feed_dict={x_input:train_ft_inst, y_target:train_label_inst, \
274 |                                            keep_prob:kp_prob})
275 |             
276 |             # record loss and accuracy every step_size generations
277 |             if (epoch+1)%record_step_size == 0:                
278 |                 train_loss_temp = sess.run(loss, feed_dict={ \
279 |                                            x_input:train_ft_inst, y_target:train_label_inst, \
280 |                                            keep_prob:1.0})
281 |                 train_loss_list.append(train_loss_temp)                
282 |  
283 |                 val_pred_score_all = []
284 |                 val_label_all = []
285 |                 
286 |                 for iii in range(n_val_batch):
287 |                     # get batch
288 |                     start_idx = iii*batch_size
289 |                     end_idx = (iii+1)*batch_size
290 |                     cur_val_ft = val_ft_inst[start_idx: end_idx]
291 |                     cur_val_label = val_label_inst[start_idx: end_idx]
292 |                     # pred score
293 |                     cur_val_pred_score = sess.run(pred_score, feed_dict={ \
294 |                                             x_input:cur_val_ft, keep_prob:1.0})
295 |                     val_pred_score_all.append(cur_val_pred_score.flatten())
296 |                     val_label_all.append(cur_val_label)   
297 |                     
298 |                 # calculate auc
299 |                 val_pred_score_re = func.list_flatten(val_pred_score_all)
300 |                 val_label_re = func.list_flatten(val_label_all)
301 |                 val_auc_temp, _, _ = func.cal_auc(val_pred_score_re, val_label_re)
302 |                 # record all val results    
303 |                 val_auc_list.append(val_auc_temp)
304 |                  
305 |                 # record best and save models
306 |                 if val_auc_temp > best_val_auc:
307 |                     best_val_auc = val_auc_temp
308 |                     best_n_round = epoch
309 |                     # Save the variables to disk
310 |                     save_path = saver_val.save(sess, model_saving_addr)
311 |                     print("Model saved in: %s" % save_path)
312 |                 # count of consecutive lower
313 |                 if val_auc_temp < best_val_auc:
314 |                      lower_ct += 1
315 |                 # once higher or equal, set to 0
316 |                 else:
317 |                      lower_ct = 0
318 |                 
319 |                 if lower_ct >= max_num_lower_ct:
320 |                     early_stop_flag = 1
321 |                 
322 |                 auc_and_loss = [epoch+1, train_loss_temp, val_auc_temp]
323 |                 # round to given number of decimals
324 |                 auc_and_loss = [np.round(xx,4) for xx in auc_and_loss]
325 |                 func.print_time() 
326 |                 print('Generation # {}. Train Loss: {:.4f}. Val Avg AUC: {:.4f}.'\
327 |                       .format(*auc_and_loss))
328 | 
329 |             # stop while loop    
330 |             if early_stop_flag == 1:
331 |                 break 
332 |                 
333 |     except tf.errors.OutOfRangeError:
334 |         func.print_time()
335 |         print('Done training -- epoch limit reached')
336 |     
337 |     # restore model
338 |     saver_val.restore(sess, model_saving_addr)
339 |     print("Model restored.")
340 |             
341 |     # load test data
342 |     test_pred_score_all = []
343 |     test_label_all = []
344 |     test_loss_all = []
345 |     try:
346 |         while True:
347 |             test_ft_inst, test_label_inst = sess.run([test_ft, test_label])
348 |             cur_test_pred_score = sess.run(pred_score, feed_dict={ \
349 |                                     x_input:test_ft_inst, keep_prob:1.0})
350 |             test_pred_score_all.append(cur_test_pred_score.flatten())
351 |             test_label_all.append(test_label_inst)
352 |             
353 |             cur_test_loss = sess.run(loss_ctr, feed_dict={ \
354 |                                     x_input:test_ft_inst, \
355 |                                     y_target: np.transpose([test_label_inst]), keep_prob:1.0})
356 |             test_loss_all.append(cur_test_loss)
357 | 
358 |     except tf.errors.OutOfRangeError:
359 |         func.print_time()
360 |         print('Done testing -- epoch limit reached')    
361 |     finally:
362 |         coord.request_stop()
363 |         
364 |     coord.join(threads) 
365 |          
366 |     # calculate auc
367 |     test_pred_score_re = func.list_flatten(test_pred_score_all)
368 |     test_label_re = func.list_flatten(test_label_all)
369 |     test_auc, _, _ = func.cal_auc(test_pred_score_re, test_label_re)
370 |     test_rmse = func.cal_rmse(test_pred_score_re, test_label_re)
371 |     test_loss = np.mean(test_loss_all)
372 |     
373 |     # rounding
374 |     test_auc = np.round(test_auc, 4)
375 |     test_rmse = np.round(test_rmse, 4)
376 |     test_loss = np.round(test_loss, 5)
377 |     train_loss_list = [np.round(xx,4) for xx in train_loss_list]
378 |     val_auc_list = [np.round(xx,4) for xx in val_auc_list]
379 |     
380 |     print('test_auc = ', test_auc)
381 |     print('test_rmse =', test_rmse)
382 |     print('test_loss =', test_loss)
383 |     print('train_loss_list =', train_loss_list)
384 |     print('val_auc_list =', val_auc_list)
385 |     
386 |     # write output to file
387 |     with open(output_file_name, 'a') as f:
388 |         now = datetime.datetime.now()
389 |         time_str = now.strftime(cfg.time_style)
390 |         f.write(time_str + '\n')
391 |         f.write('train_file_name = ' + train_file_name[0] + '\n')
392 |         f.write('learning_rate = ' + str(eta) + ', alpha = ' + str(alpha) \
393 |                 + ', n_epoch = ' + str(n_epoch) \
394 |                 + ', emb_dize = ' + str(k) + '\n')
395 |         f.write('test_auc = ' + str(test_auc) + '\n')
396 |         f.write('test_rmse = ' + str(test_rmse) + '\n')
397 |         f.write('test_loss = ' + str(test_loss) + '\n')
398 |         f.write('train_loss_list =' + str(train_loss_list) + '\n')
399 |         f.write('val_auc_list =' + str(val_auc_list) + '\n')
400 |         f.write('-'*50 + '\n')
401 | 


--------------------------------------------------------------------------------
/dnn.py:
--------------------------------------------------------------------------------
  1 | # DNN (prediction)
  2 | 
  3 | import numpy as np
  4 | import tensorflow as tf
  5 | import datetime
  6 | import ctr_funcs as func
  7 | import config_deepmcp as cfg
  8 | import os
  9 | import shutil
 10 | 
 11 | # config
 12 | str_txt = cfg.output_file_name
 13 | base_path = './tmp'
 14 | model_saving_addr = base_path + '/dnn_' + str_txt + '/'
 15 | output_file_name = base_path + '/dnn_' + str_txt + '.txt'
 16 | num_csv_col = cfg.num_csv_col
 17 | train_file_name = cfg.train_file_name
 18 | val_file_name = cfg.val_file_name
 19 | test_file_name = cfg.test_file_name
 20 | batch_size = cfg.batch_size
 21 | n_ft = cfg.n_ft
 22 | k = cfg.k
 23 | kp_prob = cfg.kp_prob
 24 | n_epoch = cfg.n_epoch
 25 | max_num_lower_ct = cfg.max_num_lower_ct
 26 | record_step_size = cfg.record_step_size
 27 | layer_dim = cfg.layer_dim
 28 | layer_dim_match = cfg.layer_dim_match
 29 | eta = cfg.eta # learning rate
 30 | opt_alg = cfg.opt_alg
 31 | n_one_hot_slot = cfg.n_one_hot_slot
 32 | n_mul_hot_slot = cfg.n_mul_hot_slot
 33 | max_len_per_slot = cfg.max_len_per_slot
 34 | label_col_idx = 0
 35 | record_defaults = [[0]]*num_csv_col
 36 | record_defaults[0] = [0.0]
 37 | total_num_ft_col = num_csv_col - 1
 38 |     
 39 | # create dir
 40 | if not os.path.exists(base_path):
 41 |     os.mkdir(base_path)
 42 | 
 43 | # remove dir
 44 | if os.path.isdir(model_saving_addr):
 45 |     shutil.rmtree(model_saving_addr)
 46 | 
 47 | # for DNN
 48 | idx_1 = n_one_hot_slot
 49 | idx_2 = idx_1 + n_mul_hot_slot*max_len_per_slot
 50 | 
 51 | ###########################################################
 52 | ###########################################################
 53 | print('Loading data start!')
 54 | tf.set_random_seed(123)
 55 | 
 56 | # load training data
 57 | train_ft, train_label = func.tf_input_pipeline(train_file_name, batch_size, n_epoch, label_col_idx, record_defaults)
 58 | 
 59 | n_val_inst = func.count_lines(val_file_name[0])
 60 | val_ft, val_label = func.tf_input_pipeline(val_file_name, n_val_inst, 1, label_col_idx, record_defaults)
 61 | n_val_batch = n_val_inst//batch_size
 62 | 
 63 | # load test data
 64 | test_ft, test_label = func.tf_input_pipeline_test(test_file_name, batch_size, 1, label_col_idx, record_defaults)
 65 | print('Loading data set 1 done!')
 66 | 
 67 | ########################################################################
 68 | 
 69 | # add mask
 70 | def get_masked_one_hot(x_input_one_hot):
 71 |     data_mask = tf.cast(tf.greater(x_input_one_hot, 0), tf.float32)
 72 |     data_mask = tf.expand_dims(data_mask, axis = 2)
 73 |     data_mask = tf.tile(data_mask, (1,1,k))
 74 |     # output: (?, n_one_hot_slot, k)
 75 |     data_embed_one_hot = tf.nn.embedding_lookup(emb_mat, x_input_one_hot)
 76 |     data_embed_one_hot_masked = tf.multiply(data_embed_one_hot, data_mask)
 77 |     return data_embed_one_hot_masked
 78 | 
 79 | def get_masked_mul_hot(x_input_mul_hot):
 80 |     data_mask = tf.cast(tf.greater(x_input_mul_hot, 0), tf.float32)
 81 |     data_mask = tf.expand_dims(data_mask, axis = 3)
 82 |     data_mask = tf.tile(data_mask, (1,1,1,k))
 83 |     # output: (?, n_mul_hot_slot, max_len_per_slot, k)
 84 |     data_embed_mul_hot = tf.nn.embedding_lookup(emb_mat, x_input_mul_hot)
 85 |     data_embed_mul_hot_masked = tf.multiply(data_embed_mul_hot, data_mask)
 86 |     # output: (?, n_mul_hot_slot, k)
 87 |     data_embed_mul_hot_masked = tf.reduce_sum(data_embed_mul_hot_masked, 2)
 88 |     return data_embed_mul_hot_masked
 89 | 
 90 | # output: (?, n_one_hot_slot + n_mul_hot_slot, k)
 91 | def get_concate_embed(x_input_one_hot, x_input_mul_hot):
 92 |     data_embed_one_hot = get_masked_one_hot(x_input_one_hot)
 93 |     data_embed_mul_hot = get_masked_mul_hot(x_input_mul_hot)
 94 |     data_embed_concat = tf.concat([data_embed_one_hot, data_embed_mul_hot], 1)
 95 |     return data_embed_concat
 96 | 
 97 | # input: (?, n_slot*k)
 98 | # output: (?, 1)
 99 | def get_pred_output(data_embed_concat):
100 |     # include output layer
101 |     n_layer = len(layer_dim)
102 |     data_embed_dnn = tf.reshape(data_embed_concat, [-1, (n_one_hot_slot + n_mul_hot_slot)*k])
103 |     cur_layer = data_embed_dnn
104 |     # loop to create DNN struct
105 |     for i in range(0, n_layer):
106 |         # output layer, linear activation
107 |         if i == n_layer - 1:
108 |             cur_layer = tf.matmul(cur_layer, weight_dict[i]) + bias_dict[i]
109 |         else:
110 |             cur_layer = tf.nn.relu(tf.matmul(cur_layer, weight_dict[i]) + bias_dict[i])
111 |             cur_layer = tf.nn.dropout(cur_layer, keep_prob)
112 |     
113 |     y_hat = cur_layer
114 |     return y_hat
115 | 
116 | ###########################################################
117 | ###########################################################
118 | # input for prediction loss
119 | x_input = tf.placeholder(tf.int32, shape=[None, total_num_ft_col])
120 | # shape=[None, n_one_hot_slot]
121 | x_input_one_hot = x_input[:, 0:idx_1]
122 | x_input_mul_hot = x_input[:, idx_1:idx_2]
123 | # shape=[None, n_mul_hot_slot, max_len_per_slot]
124 | x_input_mul_hot = tf.reshape(x_input_mul_hot, (-1, n_mul_hot_slot, max_len_per_slot))
125 | 
126 | # target vec for l1
127 | y_target = tf.placeholder(tf.float32, shape=[None, 1])
128 | 
129 | # dropout keep prob
130 | keep_prob = tf.placeholder(tf.float32)
131 | # emb_mat dim add 1 -> for padding (idx = 0)
132 | with tf.device('/cpu:0'):
133 |     emb_mat = tf.Variable(tf.random_normal([n_ft + 1, k], stddev=0.01))
134 | 
135 | ################################
136 | # prediction subnet FC layers, including output layer
137 | n_layer = len(layer_dim)
138 | in_dim = (n_one_hot_slot + n_mul_hot_slot)*k
139 | weight_dict = {}
140 | bias_dict = {}
141 | 
142 | # loop to create DNN vars
143 | for i in range(0, n_layer):
144 |     out_dim = layer_dim[i]
145 |     weight_dict[i] = tf.Variable(tf.random_normal(shape=[in_dim, out_dim], stddev=np.sqrt(2.0/(in_dim+out_dim))))
146 |     bias_dict[i] = tf.Variable(tf.constant(0.0, shape=[out_dim]))
147 |     in_dim = layer_dim[i]
148 | 
149 | ################################
150 | data_embed_concat = get_concate_embed(x_input_one_hot, x_input_mul_hot)
151 | y_hat = get_pred_output(data_embed_concat)
152 | 
153 | loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_hat, labels=y_target))
154 | 
155 | #############################
156 | # prediction
157 | #############################
158 | pred_score = tf.sigmoid(y_hat)
159 | 
160 | if opt_alg == 'Adam':
161 |     optimizer = tf.train.AdamOptimizer(eta).minimize(loss)
162 | else:
163 |     # default
164 |     optimizer = tf.train.AdagradOptimizer(eta).minimize(loss)
165 | 
166 | ########################################
167 | # Launch the graph.
168 | config = tf.ConfigProto(log_device_placement=False)
169 | config.gpu_options.allow_growth = True
170 | config.gpu_options.per_process_gpu_memory_fraction = 0.3
171 | 
172 | with tf.Session(config=config) as sess:
173 |     sess.run(tf.global_variables_initializer())
174 |     sess.run(tf.local_variables_initializer())
175 |     coord = tf.train.Coordinator()
176 |     threads = tf.train.start_queue_runners(sess, coord)
177 | 
178 |     saver_val = tf.train.Saver()
179 |     train_loss_list = []
180 |     val_auc_list = []
181 |     best_n_round = 0
182 |     best_val_auc = 0            
183 |     lower_ct = 0
184 |     early_stop_flag = 0
185 | 
186 |     val_ft_inst, val_label_inst = sess.run([val_ft, val_label])
187 | 
188 |     func.print_time()
189 |     print('Start train loop')
190 |     
191 |     epoch = -1
192 |     try:
193 |         while not coord.should_stop():           
194 |             epoch += 1  
195 |             train_ft_inst, train_label_inst = sess.run([train_ft, train_label])
196 |             train_label_inst = np.transpose([train_label_inst])            
197 |             
198 |             # training
199 |             sess.run(optimizer, feed_dict={x_input:train_ft_inst, y_target:train_label_inst, \
200 |                                            keep_prob:kp_prob})
201 |             
202 |             # record loss and accuracy every step_size generations
203 |             if (epoch+1)%record_step_size == 0:                
204 |                 train_loss_temp = sess.run(loss, feed_dict={ \
205 |                                            x_input:train_ft_inst, y_target:train_label_inst, \
206 |                                            keep_prob:1.0})
207 |                 train_loss_list.append(train_loss_temp)                
208 |  
209 |                 val_pred_score_all = []
210 |                 val_label_all = []
211 |                 
212 |                 for iii in range(n_val_batch):
213 |                     # get batch
214 |                     start_idx = iii*batch_size
215 |                     end_idx = (iii+1)*batch_size
216 |                     cur_val_ft = val_ft_inst[start_idx: end_idx]
217 |                     cur_val_label = val_label_inst[start_idx: end_idx]
218 |                     # pred score
219 |                     cur_val_pred_score = sess.run(pred_score, feed_dict={ \
220 |                                             x_input:cur_val_ft, keep_prob:1.0})
221 |                     val_pred_score_all.append(cur_val_pred_score.flatten())
222 |                     val_label_all.append(cur_val_label)   
223 |                     
224 |                 # calculate auc
225 |                 val_pred_score_re = func.list_flatten(val_pred_score_all)
226 |                 val_label_re = func.list_flatten(val_label_all)
227 |                 val_auc_temp, _, _ = func.cal_auc(val_pred_score_re, val_label_re)
228 |                 # record all val results    
229 |                 val_auc_list.append(val_auc_temp)
230 |                  
231 |                 # record best and save models
232 |                 if val_auc_temp > best_val_auc:
233 |                     best_val_auc = val_auc_temp
234 |                     best_n_round = epoch
235 |                     # Save the variables to disk
236 |                     save_path = saver_val.save(sess, model_saving_addr)
237 |                     print("Model saved in: %s" % save_path)
238 |                 # count of consecutive lower
239 |                 if val_auc_temp < best_val_auc:
240 |                      lower_ct += 1
241 |                 # once higher or equal, set to 0
242 |                 else:
243 |                      lower_ct = 0
244 |                 
245 |                 if lower_ct >= max_num_lower_ct:
246 |                     early_stop_flag = 1
247 |                 
248 |                 auc_and_loss = [epoch+1, train_loss_temp, val_auc_temp]
249 |                 # round to given number of decimals
250 |                 auc_and_loss = [np.round(xx,4) for xx in auc_and_loss]
251 |                 func.print_time() 
252 |                 print('Generation # {}. Train Loss: {:.4f}. Val Avg AUC: {:.4f}.'\
253 |                       .format(*auc_and_loss))
254 | 
255 |             # stop while loop    
256 |             if early_stop_flag == 1:
257 |                 break 
258 |                 
259 |     except tf.errors.OutOfRangeError:
260 |         func.print_time()
261 |         print('Done training -- epoch limit reached')
262 |     
263 |     # restore model
264 |     saver_val.restore(sess, model_saving_addr)
265 |     print("Model restored.")
266 |             
267 |     # load test data
268 |     test_pred_score_all = []
269 |     test_label_all = []
270 |     test_loss_all = []
271 |     try:
272 |         while True:
273 |             test_ft_inst, test_label_inst = sess.run([test_ft, test_label])
274 |             cur_test_pred_score = sess.run(pred_score, feed_dict={ \
275 |                                     x_input:test_ft_inst, keep_prob:1.0})
276 |             test_pred_score_all.append(cur_test_pred_score.flatten())
277 |             test_label_all.append(test_label_inst)
278 |             
279 |             cur_test_loss = sess.run(loss, feed_dict={ \
280 |                                     x_input:test_ft_inst, \
281 |                                     y_target: np.transpose([test_label_inst]), keep_prob:1.0})
282 |             test_loss_all.append(cur_test_loss)
283 | 
284 |     except tf.errors.OutOfRangeError:
285 |         func.print_time()
286 |         print('Done testing -- epoch limit reached')    
287 |     finally:
288 |         coord.request_stop()
289 |         
290 |     coord.join(threads) 
291 |          
292 |     # calculate auc
293 |     test_pred_score_re = func.list_flatten(test_pred_score_all)
294 |     test_label_re = func.list_flatten(test_label_all)
295 |     test_auc, _, _ = func.cal_auc(test_pred_score_re, test_label_re)
296 |     test_rmse = func.cal_rmse(test_pred_score_re, test_label_re)
297 |     test_loss = np.mean(test_loss_all)
298 |     
299 |     # rounding
300 |     test_auc = np.round(test_auc, 4)
301 |     test_rmse = np.round(test_rmse, 4)
302 |     test_loss = np.round(test_loss, 5)
303 |     train_loss_list = [np.round(xx,4) for xx in train_loss_list]
304 |     val_auc_list = [np.round(xx,4) for xx in val_auc_list]
305 |     
306 |     print('test_auc = ', test_auc)
307 |     print('test_rmse =', test_rmse)
308 |     print('test_loss =', test_loss)
309 |     print('train_loss_list =', train_loss_list)
310 |     print('val_auc_list =', val_auc_list)
311 |     
312 |     # write output to file
313 |     with open(output_file_name, 'a') as f:
314 |         now = datetime.datetime.now()
315 |         time_str = now.strftime(cfg.time_style)
316 |         f.write(time_str + '\n')
317 |         f.write('train_file_name = ' + train_file_name[0] + '\n')
318 |         f.write('learning_rate = ' + str(eta) \
319 |                 + ', n_epoch = ' + str(n_epoch) \
320 |                 + ', emb_dize = ' + str(k) + '\n')
321 |         f.write('test_auc = ' + str(test_auc) + '\n')
322 |         f.write('test_rmse = ' + str(test_rmse) + '\n')
323 |         f.write('test_loss = ' + str(test_loss) + '\n')
324 |         f.write('train_loss_list =' + str(train_loss_list) + '\n')
325 |         f.write('val_auc_list =' + str(val_auc_list) + '\n')
326 |         f.write('-'*50 + '\n')
327 | 


--------------------------------------------------------------------------------