├── data
    ├── embeddings
    │   └── readme
    └── metapath
    │   └── readme
├── paper
    ├── README
    ├── main.pdf
    ├── main.tex
    ├── image
    │   ├── hbb.pdf
    │   ├── psy.pdf
    │   ├── sc.pdf
    │   ├── zx.pdf
    │   ├── dim_db.pdf
    │   ├── dim_dm.pdf
    │   ├── yelp.pdf
    │   ├── dim_yelp.pdf
    │   ├── iter_db.pdf
    │   ├── iter_dm.pdf
    │   ├── reg_rmse.pdf
    │   ├── doubanbook.pdf
    │   ├── doubanmovie.pdf
    │   ├── factor_db.pdf
    │   ├── factor_dm.pdf
    │   ├── factor_yelp.pdf
    │   ├── framework.pdf
    │   ├── iter_yelp.pdf
    │   ├── metapath_db.pdf
    │   ├── metapath_dm.pdf
    │   ├── random_walk.pdf
    │   ├── reg_rmse_db.pdf
    │   ├── reg_rmse_dm.pdf
    │   ├── fusion_db_mae.pdf
    │   ├── fusion_dm_mae.pdf
    │   ├── metapath_yelp.pdf
    │   ├── reg_rmse_yelp.pdf
    │   ├── cold_start_rmse.pdf
    │   ├── fusion_db_rmse.pdf
    │   ├── fusion_dm_rmse.pdf
    │   ├── fusion_yelp_mae.pdf
    │   ├── fusion_yelp_rmse.pdf
    │   ├── metapath_douban.pdf
    │   ├── window_size_mae.pdf
    │   ├── cold_start_mae_db.pdf
    │   ├── cold_start_mae_dm.pdf
    │   ├── cold_start_rmse_db.pdf
    │   ├── cold_start_rmse_dm.pdf
    │   ├── cold_start_mae_yelp.pdf
    │   └── cold_start_rmse_yelp.pdf
    ├── sec-con.tex
    ├── sec-def.tex
    ├── sec-intro.tex
    ├── sec-rel.tex
    ├── sec-model.tex
    ├── reference.bib
    └── sec-exp.tex
├── .gitattributes
├── code
    ├── cut_data.py
    ├── embeddingGeneration.py
    ├── metapathGeneration.py
    ├── HERec_sl.py
    ├── HERec_pl.py
    └── HERec_spl.py
└── README.md


/data/embeddings/readme:
--------------------------------------------------------------------------------
1 | 
2 | 


--------------------------------------------------------------------------------
/data/metapath/readme:
--------------------------------------------------------------------------------
1 | 
2 | 


--------------------------------------------------------------------------------
/paper/README:
--------------------------------------------------------------------------------
1 | Source code of paper
2 | 


--------------------------------------------------------------------------------
/paper/main.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/main.pdf


--------------------------------------------------------------------------------
/paper/main.tex:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/main.tex


--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
1 | *.tex linguist-language=python
2 | *.py linguist-language=python
3 | 


--------------------------------------------------------------------------------
/paper/image/hbb.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/hbb.pdf


--------------------------------------------------------------------------------
/paper/image/psy.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/psy.pdf


--------------------------------------------------------------------------------
/paper/image/sc.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/sc.pdf


--------------------------------------------------------------------------------
/paper/image/zx.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/zx.pdf


--------------------------------------------------------------------------------
/paper/image/dim_db.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/dim_db.pdf


--------------------------------------------------------------------------------
/paper/image/dim_dm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/dim_dm.pdf


--------------------------------------------------------------------------------
/paper/image/yelp.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/yelp.pdf


--------------------------------------------------------------------------------
/paper/image/dim_yelp.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/dim_yelp.pdf


--------------------------------------------------------------------------------
/paper/image/iter_db.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/iter_db.pdf


--------------------------------------------------------------------------------
/paper/image/iter_dm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/iter_dm.pdf


--------------------------------------------------------------------------------
/paper/image/reg_rmse.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/reg_rmse.pdf


--------------------------------------------------------------------------------
/paper/image/doubanbook.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/doubanbook.pdf


--------------------------------------------------------------------------------
/paper/image/doubanmovie.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/doubanmovie.pdf


--------------------------------------------------------------------------------
/paper/image/factor_db.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/factor_db.pdf


--------------------------------------------------------------------------------
/paper/image/factor_dm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/factor_dm.pdf


--------------------------------------------------------------------------------
/paper/image/factor_yelp.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/factor_yelp.pdf


--------------------------------------------------------------------------------
/paper/image/framework.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/framework.pdf


--------------------------------------------------------------------------------
/paper/image/iter_yelp.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/iter_yelp.pdf


--------------------------------------------------------------------------------
/paper/image/metapath_db.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/metapath_db.pdf


--------------------------------------------------------------------------------
/paper/image/metapath_dm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/metapath_dm.pdf


--------------------------------------------------------------------------------
/paper/image/random_walk.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/random_walk.pdf


--------------------------------------------------------------------------------
/paper/image/reg_rmse_db.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/reg_rmse_db.pdf


--------------------------------------------------------------------------------
/paper/image/reg_rmse_dm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/reg_rmse_dm.pdf


--------------------------------------------------------------------------------
/paper/image/fusion_db_mae.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/fusion_db_mae.pdf


--------------------------------------------------------------------------------
/paper/image/fusion_dm_mae.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/fusion_dm_mae.pdf


--------------------------------------------------------------------------------
/paper/image/metapath_yelp.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/metapath_yelp.pdf


--------------------------------------------------------------------------------
/paper/image/reg_rmse_yelp.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/reg_rmse_yelp.pdf


--------------------------------------------------------------------------------
/paper/image/cold_start_rmse.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/cold_start_rmse.pdf


--------------------------------------------------------------------------------
/paper/image/fusion_db_rmse.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/fusion_db_rmse.pdf


--------------------------------------------------------------------------------
/paper/image/fusion_dm_rmse.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/fusion_dm_rmse.pdf


--------------------------------------------------------------------------------
/paper/image/fusion_yelp_mae.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/fusion_yelp_mae.pdf


--------------------------------------------------------------------------------
/paper/image/fusion_yelp_rmse.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/fusion_yelp_rmse.pdf


--------------------------------------------------------------------------------
/paper/image/metapath_douban.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/metapath_douban.pdf


--------------------------------------------------------------------------------
/paper/image/window_size_mae.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/window_size_mae.pdf


--------------------------------------------------------------------------------
/paper/image/cold_start_mae_db.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/cold_start_mae_db.pdf


--------------------------------------------------------------------------------
/paper/image/cold_start_mae_dm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/cold_start_mae_dm.pdf


--------------------------------------------------------------------------------
/paper/image/cold_start_rmse_db.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/cold_start_rmse_db.pdf


--------------------------------------------------------------------------------
/paper/image/cold_start_rmse_dm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/cold_start_rmse_dm.pdf


--------------------------------------------------------------------------------
/paper/image/cold_start_mae_yelp.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/cold_start_mae_yelp.pdf


--------------------------------------------------------------------------------
/paper/image/cold_start_rmse_yelp.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/librahu/HERec/HEAD/paper/image/cold_start_rmse_yelp.pdf


--------------------------------------------------------------------------------
/code/cut_data.py:
--------------------------------------------------------------------------------
 1 | #!/user/bin/python
 2 | import random
 3 | 
 4 | train_rate = 0.8
 5 | 
 6 | R = []
 7 | with open('../data/ub.txt', 'r') as infile:
 8 |     for line in infile.readlines():
 9 |         user, item, rating = line.strip().split('\t')
10 |         R.append([user, item, rating])
11 | 
12 | random.shuffle(R)
13 | train_num = int(len(R) * train_rate)
14 | 
15 | with open('../data/ub_' + str(train_rate) + '.train', 'w') as trainfile,\
16 |      open('../data/ub_' + str(train_rate) + '.test', 'w') as testfile:
17 |      for r in R[:train_num]:
18 |          trainfile.write('\t'.join(r) + '\n')
19 |      for r in R[train_num:]:
20 |          testfile.write('\t'.join(r) + '\n')
21 | 
22 | 
23 | 


--------------------------------------------------------------------------------
/code/embeddingGeneration.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | 
 3 | train_rate = 0.8
 4 | dim = 128
 5 | walk_len = 5
 6 | win_size = 3
 7 | num_walk = 10
 8 | 
 9 | metapaths = ['ubu', 'ubcabu', 'ubcibu', 'bub', 'bcab', 'bcib']
10 | 
11 | for metapath in metapaths:
12 | 	metapath = metapath + '_' + str(train_rate)
13 | 	input_file = '../data/metapath/' + metapath + '.txt'
14 | 	output_file = '../data/embeddings/' + metapath + '.embedding'
15 | 
16 | 	cmd = 'deepwalk --format edgelist --input ' + input_file + ' --output ' + output_file + \
17 | 	      ' --walk-length ' + str(walk_len) + ' --window-size ' + str(win_size) + ' --number-walks '\
18 | 	       + str(num_walk) + ' --representation-size ' + str(dim)
19 | 
20 | 	print cmd;
21 | 	os.system(cmd)
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # HERec is also available in OpenHINE (https://github.com/BUPT-GAMMA/OpenHINE)
 2 | 
 3 | # Usage:
 4 | 
 5 | * python cut_data.py
 6 | 
 7 | * python metapathGeneration.py
 8 | 
 9 | * python embeddingGeneration.py
10 | 
11 | * python HERec_sl.py / HERec_pl.py / HERec_spl.py
12 | 
13 | (Note: HERec_sl, HERec_pl, HERec_spl mean HERec with simple linear fusion function, personalized linear fusion and non-linear fusion)
14 | # Requirements
15 | 
16 | * numpy
17 | 
18 | * scipy
19 | 
20 | * deepwalk (https://github.com/phanein/deepwalk)
21 | 
22 | # Reference
23 | 
24 | @article{
25 | 
26 | > author = {Chuan Shi, Binbin Hu, Wayne Xin Zhao and Philip S. Yu.},
27 |  
28 | > title = {Heterogeneous Information Network Embedding for Recommendation},
29 |  
30 | > booktitle = {IEEE Transactions on Knowledge and Data Engineering (TKDE)},
31 |  
32 | > year = {2018},
33 |  
34 | > url = {https://arxiv.org/pdf/1711.10730.pdf},
35 |  
36 | > publisher = {IEEE},
37 |  
38 | > keywords = {Heterogeneous information network, Network embedding, Matrix factorization, Recommender system},
39 |  
40 | }
41 | 


--------------------------------------------------------------------------------
/paper/sec-con.tex:
--------------------------------------------------------------------------------
1 | \section{Conclusion \label{sec-con}}
2 | In this paper, we proposed a novel heterogeneous information network embedding based approach (\ie HERec) to effectively utilizing auxiliary information in HINs for recommendation. We designed a new random walk strategy based on meta-paths to derive more meaningful node sequences for network embedding. Since embeddings based on different meta-paths contain different semantic, the learned embeddings were further integrated into an extended matrix factorization model using a set of fusion functions. Finally, the extended matrix factorization model together with fusion functions were jointly optimized for the rating prediction task. HERec aimed to learn useful information representations from HINs guided by the specific recommendation task, which distinguished the proposed approach from existing HIN based recommendation methods. Extensive experiments on three real datasets demonstrated the effectiveness of HERec. We also verified the ability of HERec to alleviate cold-start problem and examine the impact of meta-paths on performance.
3 | 
4 | As future work, we will investigate into how to apply deep learning methods (\eg convolutional neural networks, auto encoder)
5 | to better fuse the embeddings of multiple meta-paths. In addition, we only use the meta-paths which have the same starting and ending types to effectively extract network structure features in this work. Therefore, it is interesting and natural to extend the proposed model to learn the embeddings of any nodes with arbitrary meta-paths. As a major issue of recommender systems, we will also consider how to enhance the explainablity of the recommendation method based on the semantics of meta-paths.
6 | %As part of future work, we are interested in applying deep learning methods (\emph{e.g.} convolutional neural networks and auto encoders) to better fuse embeddings of various meta-paths. In addition, the proposed HIN embedding can also be employed for other applications.
7 | 


--------------------------------------------------------------------------------
/paper/sec-def.tex:
--------------------------------------------------------------------------------
  1 | \section{Preliminary \label{sec-def}}
  2 | 
  3 | A heterogeneous information network is a special kind of information network, which either contains multiple types of objects or multiple types of links.
  4 | \begin{myDef}
  5 | \textbf{Heterogeneous information network}~\cite{sun2012mining}. A HIN is denoted as $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$ consisting of a object set $\mathcal{V}$ and a link set $\mathcal{E}$. A HIN is also associated with an object type mapping function $\phi: \mathcal{V} \rightarrow \mathcal{A}$ and a link type mapping function $\psi: \mathcal{E} \rightarrow \mathcal{R}$. $\mathcal{A}$ and $\mathcal{R}$ denote the sets of predefined object and link types, where $|\mathcal{A}| + |\mathcal{R}| > 2$.
  6 | \end{myDef}
  7 | 
  8 | The complexity of heterogeneous information network drives us to provide the meta level ($\eg$ schema-level) description for understanding the object types and link types better in the network. Hence, the concept of network schema is proposed to describe the meta structure of a network.
  9 | \begin{myDef}
 10 | \textbf{Network schema}~\cite{sun2013mining,sun2009ranking}. The network schema is denoted as $\mathcal{S} = (\mathcal{A}, \mathcal{R})$. It is a meta template
 11 | for an information network $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$ with the object type mapping $\phi: \mathcal{V} \rightarrow \mathcal{A}$ and the link type mapping $\psi: \mathcal{E} \rightarrow \mathcal{R}$, which is a directed graph defined over object types $\mathcal{A}$, with edges as relations from $\mathcal{R}$.
 12 | \end{myDef}
 13 | 
 14 | \begin{exmp}
 15 | As shown in Fig.~\ref{fig_framework}(a), we have represented the setting of movie recommender systems by HINs.
 16 | We further present its corresponding network schema in Fig.~\ref{fig_schema}(a), consisting of multiple types of
 17 | objects, including User ($U$), Movie ($M$), Director ($D$). There exist different types of links between objects to represent different
 18 | relations. A user-user link indicates the friendship between two users, while a user-movie link indicates the rating relation.
 19 | Similarly, we present the schematic network schemas for book and business recommender systems in Fig.~\ref{fig_schema}(b) and Fig.~\ref{fig_schema}(c) respectively.
 20 | \end{exmp}
 21 | 
 22 | In HINs, two objects can be connected via different semantic paths, which are called meta-paths.
 23 | 
 24 | \begin{myDef}
 25 | \textbf{Meta-path}~\cite{sun2011pathsim}. A meta-path $\rho$ is defined on a network schema $\mathcal{S} = (\mathcal{A}, \mathcal{R})$  and is denoted as a path in the form of $A_1 \xrightarrow{R_1} A_2 \xrightarrow{R_2} \cdots \xrightarrow{R_l} A_{l+1}$ (abbreviated as $A_1A_2 \cdots A_{l+1}$), which describes a composite relation $R = R_1 \circ R_2 \circ \cdots \circ R_l$ between object $A_1$ and $A_{l+1}$, where $\circ$ denotes the composition operator on relations.
 26 | \end{myDef}
 27 | 
 28 | \begin{exmp}
 29 | Taking Fig.~\ref{fig_schema}(a) as an example, two objects can be connected via multiple meta-paths, $\eg$ ``User - User" ($UU$) and  ``User - Movie - User" ($UMU$). Different meta-paths usually convey different semantics. For example, the $UU$ path indicates friendship between two users, while the $UMU$ path indicates the co-watch relation between two users, \ie they have watched the same movies. As will be seen later, the detailed meta-paths used in this work is summarized in Table~\ref{tab_Data}.
 30 | \end{exmp}
 31 | %Taking Fig. 1(a) as an example, we construct a HIN to model the movie recommendation setting,
 32 | % which consists of multiple types of objects (\eg User ($U$), Movie ($M$), Director ($D$)) and links (\eg social relation between users and rating relation between users and movies).
 33 | %In this example, two objects can be connected via multiple meta-paths, \eg
 34 | % ``User-User" ($UU$) and  ``User-Movie-User" ($UMU$).
 35 | % Different meta-paths often convey different semantics. For example, the $UU$ path indicates friendship between two users, while
 36 | % the $UMU$ path indicates the two users have watched the same movies. As a major technical approach, meta-path-based search and mining methods have been extensively studied in HINs %\cite{shi2017heterogeneous}.
 37 | 
 38 | 
 39 | Recently, HIN has become a mainstream approach to model various complex interaction systems \cite{shi2017survey}. Specially, it has been adopted in recommender systems for characterizing complex and heterogenous recommendation settings.
 40 | 
 41 | 
 42 | 
 43 | \begin{myDef}
 44 | \textbf{HIN based recommendation}. In a recommender system, various kinds of information can be modeled by a HIN $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$. On recommendation-oriented HINs, two kinds of entities (\ie users and items) together with the relations between them (\ie rating relation) are our focus.
 45 | Let $\mathcal{U}\subset \mathcal{V}$
 46 | and $\mathcal{I}\subset \mathcal{V}$ denote the sets of users and items respectively, a triplet $\langle u, i, r_{u,i}\rangle$  denotes a record that a user $u$ assigns a rating of $r_{u,i}$ to  an item $i$, and $\mathcal{R}=\{\langle u, i, r_{u,i}\rangle\}$ denotes the set of rating records.
 47 | We have $\mathcal{U}\subset \mathcal{V}$, $\mathcal{I}\subset \mathcal{V}$ and $\mathcal{R}\subset \mathcal{E}$.
 48 | Given the HIN $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$, the goal is to predict the rating score $r_{u,i'}$ of $u\in \mathcal{U}$ to a non-rated item $i'\in \mathcal{I}$.
 49 | \end{myDef}
 50 | 
 51 | Several efforts have been made for HIN based recommendation. Most of these works mainly leverage meta-path based similarities to enhance
 52 | the recommendation performance~\cite{yu2013collaborative,yu2014personalized,shi2015semantic,shi2016integrating}. Next, we will present a new heterogeneous network embedding based approach to this task, which is able to effectively
 53 | exploit the information reflected in HINs. The notations we will use throughout the article are summarized in Table~\ref{tabl_notations}.
 54 | 
 55 | \begin{table}[htbp]
 56 | \centering
 57 | \caption{Notations and explanations.}\label{tabl_notations}{
 58 | \begin{tabular}{c||c}
 59 | \hline
 60 | {Notation} & {Explanation}\\
 61 | \hline
 62 | \hline
 63 | {$\mathcal{G}$}&{heterogeneous information network}\\
 64 | \hline
 65 | {$\mathcal{V}$} & {object set} \\
 66 | \hline
 67 | {$\mathcal{E}$} & {link set} \\
 68 | \hline
 69 | {$\mathcal{S}$} & {network schema} \\
 70 | \hline
 71 | {$\mathcal{A}$} & {object type set } \\
 72 | \hline
 73 | {$\mathcal{R}$} & {link type set} \\
 74 | \hline
 75 | {$\mathcal{U}$} & {user set} \\
 76 | \hline
 77 | {$\mathcal{I}$} & {item set} \\
 78 | \hline
 79 | {$\widehat{r_{u,i}}$} & {predicted rating user $u$ gives to item $i$}\\
 80 | \hline
 81 | {$\bm{e}_v$} & {low-dimensional representation of node $v$} \\
 82 | \hline
 83 | {$\mathcal{N}_u$} & {neighborhood of node $u$} \\
 84 | \hline
 85 | {$\rho$} & {a meta-path} \\
 86 | \hline
 87 | {$\mathcal{P}$} & {meta-path set} \\
 88 | \hline
 89 | {$\bm{e}^{(U)}_u, \bm{e}^{(I)}_i$} & {final representations of  user $u$, item $i$}\\
 90 | \hline
 91 | {$d$} & {dimension of HIN embeddings} \\
 92 | \hline
 93 | {$D$} & {dimension of latent factors} \\
 94 | \hline
 95 | {$\mathbf{x}_u, \mathbf{y}_i$} & {latent factors of user $u$, item $i$} \\
 96 | \hline
 97 | {$\bm{{\gamma}}^{(U)}_u$, $\bm{{\gamma}}_i^{(I)}$} & {latent factors for pairing HIN embedding of user $u$, item $i$} \\
 98 | \hline
 99 | {$\alpha$, $\beta$} & {parameters for integrating HIN embeddings} \\
100 | \hline
101 | {$\mathbf{M}^{(l)}$} & { transformation matrix w.r.t the $l$-th mete-path} \\
102 | \hline
103 | {$\bm{b}^{(l)}$} & { bias vector w.r.t the $l$-th mete-path} \\
104 | \hline
105 | {$w^{(l)}_u$} & {preference weight of user $u$ over the $l$-th meta-path} \\
106 | \hline
107 | {$\bm{\Theta}^{(U)}$, $\bm{\Theta}^{(I)}$} & {parameters of fusion functions for users, items} \\
108 | \hline
109 | {$\lambda$} & {regularization parameter} \\
110 | \hline
111 | {$\eta$} & {learning rate} \\
112 | \hline
113 | \end{tabular}}
114 | \end{table}
115 | 


--------------------------------------------------------------------------------
/code/metapathGeneration.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/python
  2 | import sys
  3 | import numpy as np
  4 | import random
  5 | 
  6 | class metapathGeneration:
  7 |     def __init__(self, unum, bnum, conum, canum, cinum):
  8 |         self.unum = unum + 1
  9 |         self.bnum = bnum + 1
 10 |         self.conum = conum + 1
 11 |         self.canum = canum + 1
 12 |         self.cinum = cinum + 1
 13 |         ub = self.load_ub('../data/ub_0.8.train')
 14 |         self.get_UBU(ub, '../data/metapath/ubu_0.8.txt')
 15 |         self.get_UBCaBU(ub, '../data/bca.txt', '../data/metapath/ubcabu_0.8.txt')
 16 |         self.get_UBCiBU(ub, '../data/bci.txt', '../data/metapath/ubcibu_0.8.txt')
 17 |         self.get_BUB(ub, '../data/metapath/bub_0.8.txt')
 18 |         self.get_BCiB('../data/bci.txt', '../data/metapath/bcib_0.8.txt')
 19 |         self.get_BCaB('../data/bca.txt', '../data/metapath/bcab_0.8.txt')
 20 | 
 21 |     def load_ub(self, ubfile):
 22 |         ub = np.zeros((self.unum, self.bnum))
 23 |         with open(ubfile, 'r') as infile:
 24 |             for line in infile.readlines():
 25 |                 user, item, rating = line.strip().split('\t')
 26 |                 ub[int(user)][int(item)] = 1 
 27 |         return ub
 28 |     
 29 |     def get_UCoU(self, ucofile, targetfile):
 30 |         print 'UCoU...'
 31 |         uco = np.zeros((self.unum, self.conum))
 32 |         with open(ucofile, 'r') as infile:
 33 |             for line in infile.readlines():
 34 |                 u, co, _ = line.strip().split('\t')
 35 |                 uco[int(u)][int(co)] = 1
 36 | 
 37 |         uu = uco.dot(uco.T)
 38 |         print uu.shape
 39 |         print 'writing to file...'
 40 |         total = 0
 41 |         with open(targetfile, 'w') as outfile:
 42 |             for i in range(uu.shape[0]):
 43 |                 for j in range(uu.shape[1]):
 44 |                     if uu[i][j] != 0 and i != j:
 45 |                         outfile.write(str(i) + '\t' + str(j) + '\t' + str(int(uu[i][j])) + '\n')
 46 |                         total += 1
 47 |         print 'total = ', total
 48 |     
 49 |     def get_UU(self, uufile, targetfile):
 50 |         print 'UU...'
 51 |         uu = np.zeros((self.unum, self.unum))
 52 |         with open(uufile, 'r') as infile:
 53 |             for line in infile.readlines():
 54 |                 u1, u2, _ = line.strip().split('\t')
 55 |                 uu[int(u1)][int(u2)] = 1
 56 |         r_uu = uu.dot(uu.T)
 57 | 
 58 |         print r_uu.shape
 59 |         print 'writing to file...'
 60 |         total = 0 
 61 |         with open(targetfile, 'w') as outfile:
 62 |             for i in range(r_uu.shape[0]):
 63 |                 for j in range(r_uu.shape[1]):
 64 |                     if r_uu[i][j] != 0 and i != j:
 65 |                         outfile.write(str(i) + '\t' + str(j) + '\t' + str(int(r_uu[i][j])) + '\n')
 66 |                         total += 1
 67 |         print 'total = ', total
 68 |                                                                                                                                      
 69 | 
 70 |     def get_UBU(self, ub, targetfile):
 71 |         print 'UMU...'
 72 | 
 73 |         uu = ub.dot(ub.T)
 74 |         print uu.shape
 75 |         print 'writing to file...'
 76 |         total = 0
 77 |         with open(targetfile, 'w') as outfile:
 78 |             for i in range(uu.shape[0]):
 79 |                 for j in range(uu.shape[1]):
 80 |                     if uu[i][j] != 0 and i != j:
 81 |                         outfile.write(str(i) + '\t' + str(j) + '\t' + str(int(uu[i][j])) + '\n')
 82 |                         total += 1
 83 |         print 'total = ', total
 84 |     
 85 |     def get_BUB(self, ub, targetfile):
 86 |         print 'MUM...'
 87 |         mm = ub.T.dot(ub)
 88 |         print mm.shape
 89 |         print 'writing to file...'
 90 |         total = 0
 91 |         with open(targetfile, 'w') as outfile:
 92 |             for i in range(mm.shape[0]):
 93 |                 for j in range(mm.shape[1]):
 94 |                     if mm[i][j] != 0 and i != j:
 95 |                         outfile.write(str(i) + '\t' + str(j) + '\t' + str(int(mm[i][j])) + '\n')
 96 |                         total += 1
 97 |         print 'total = ', total
 98 |     
 99 |     def get_BCiB(self, bcifile, targetfile):
100 |         print 'BCiB..'
101 | 
102 |         bci = np.zeros((self.bnum, self.cinum))
103 |         with open(bcifile) as infile:
104 |             for line in infile.readlines():
105 |                 m, d, _ = line.strip().split('\t')
106 |                 bci[int(m)][int(d)] = 1
107 | 
108 |         mm = bci.dot(bci.T)
109 |         print 'writing to file...'
110 |         total = 0
111 |         with open(targetfile, 'w') as outfile:
112 |             for i in range(mm.shape[0])[1:]:
113 |                 for j in range(mm.shape[1])[1:]:
114 |                     if mm[i][j] != 0 and i != j:
115 |                         outfile.write(str(i) + '\t' + str(j) + '\t' + str(int(mm[i][j])) + '\n')
116 |                         total += 1
117 |         print 'total = ', total
118 | 
119 |     def get_BCaB(self, bcafile, targetfile):
120 |         print 'BCaB..'
121 | 
122 |         bca = np.zeros((self.bnum, self.canum))
123 |         with open(bcafile) as infile:
124 |             for line in infile.readlines():
125 |                 m, a,__ = line.strip().split('\t')
126 |                 bca[int(m)][int(a)] = 1
127 | 
128 |         mm = bca.dot(bca.T)
129 |         print 'writing to file...'
130 |         total = 0
131 |         with open(targetfile, 'w') as outfile:
132 |             for i in range(mm.shape[0])[1:]:
133 |                 for j in range(mm.shape[1])[1:]:
134 |                     if mm[i][j] != 0 and i != j:
135 |                         outfile.write(str(i) + '\t' + str(j) + '\t' + str(int(mm[i][j])) + '\n')
136 |                         total += 1
137 |         print 'total = ', total
138 |     
139 |     def get_MTM(self, mtfile, targetfile):
140 |         print 'MTM..'
141 | 
142 |         mt = np.zeros((self.mnum, self.tnum))
143 |         with open(mtfile) as infile:
144 |             for line in infile.readlines():
145 |                 m, a,__ = line.strip().split('\t')
146 |                 mt[int(m)][int(a)] = 1
147 | 
148 |         mm = mt.dot(mt.T)
149 |         print 'writing to file...'
150 |         total = 0
151 |         with open(targetfile, 'w') as outfile:
152 |             for i in range(mm.shape[0])[1:]:
153 |                 for j in range(mm.shape[1])[1:]:
154 |                     if mm[i][j] != 0 and i != j:
155 |                         outfile.write(str(i) + '\t' + str(j) + '\t' + str(int(mm[i][j])) + '\n')
156 |                         total += 1
157 |         print 'total = ', total
158 |     
159 |     def get_UBCaBU(self, ub, bcafile, targetfile):
160 |         print 'UBCaBU...'
161 | 
162 |         bca = np.zeros((self.bnum, self.canum))
163 |         with open(bcafile, 'r') as infile:
164 |             for line in infile.readlines():
165 |                 m, d, _ = line.strip().split('\t')
166 |                 bca[int(m)][int(d)] = 1
167 | 
168 |         uu = ub.dot(bca).dot(bca.T).dot(ub.T)
169 |         print 'writing to file...'
170 |         total = 0
171 |         with open(targetfile, 'w') as outfile:
172 |             for i in range(uu.shape[0]):
173 |                 for j in range(uu.shape[1]):
174 |                     if uu[i][j] != 0 and i != j:
175 |                         outfile.write(str(i) + '\t' + str(j) + '\t' + str(int(uu[i][j])) + '\n')
176 |                         total += 1
177 |         print 'total = ', total
178 |     
179 |     def get_UBCiBU(self, ub, bcifile, targetfile):
180 |         print 'UBCiBU...'
181 | 
182 |         bci = np.zeros((self.bnum, self.cinum))
183 |         with open(bcifile, 'r') as infile:
184 |             for line in infile.readlines():
185 |                 m, a, _ = line.strip().split('\t')
186 |                 bci[int(m)][int(a)] = 1
187 | 
188 |         uu = ub.dot(bci).dot(bci.T).dot(ub.T)
189 |         print 'writing to file...'
190 |         total = 0
191 |         with open(targetfile, 'w') as outfile:
192 |             for i in range(uu.shape[0]):
193 |                 for j in range(uu.shape[1]):
194 |                     if uu[i][j] != 0 and i != j:
195 |                         outfile.write(str(i) + '\t' + str(j) + '\t' + str(int(uu[i][j])) + '\n')
196 |                         total += 1
197 |         print 'total = ', total
198 | 
199 | if __name__ == '__main__':
200 |     #see __init__() 
201 |     metapathGeneration(unum=16239, bnum=14284, conum=11, canum=511, cinum=47)
202 | 


--------------------------------------------------------------------------------
/paper/sec-intro.tex:
--------------------------------------------------------------------------------
 1 | \IEEEraisesectionheading{\section{Introduction \label{sec-intro}}}
 2 | 
 3 | \IEEEPARstart{I}{n} recent years, recommender systems, which help users discover items of interest from a large resource collection, have been playing an increasingly important role in various online services~\cite{dias2008value,koren2015advances}.
 4 | Traditional recommendation methods (\eg matrix factorization) mainly aim to learn an effective prediction function for characterizing user-item interaction records (\eg user-item rating matrix). With the rapid development of web services, various kinds of auxiliary data (\aka side information) become available in recommender systems. Although auxiliary data is likely to contain useful information for recommendation~\cite{schafer2007collaborative},
 5 | it is difficult to model and utilize these heterogeneous and complex information in recommender systems.
 6 | Furthermore, it is more challenging to develop a relatively general approach to model these varying data in different systems or platforms.
 7 | 
 8 | As a promising direction, heterogeneous information network (HIN), consisting of multiple types of nodes and links, has been proposed as a powerful information modeling method~\cite{sun2011pathsim,shi2017survey,shi2017heterogeneous}.
 9 |  Due to its flexility in modeling data heterogeneity,
10 |  HIN has been adopted in recommender systems to characterize rich auxiliary data.
11 | In Fig.~\ref{fig_framework}(a), we present an example for movie recommendation characterized by HINs.
12 | We can see that the HIN contains multiple types of entities connected by different types of relations.
13 | Under the HIN based representation, the recommendation problem can be considered as a similarity search task over the HIN~\cite{sun2011pathsim}.
14 | Such a recommendation setting is called as \emph{HIN based recommendation}~\cite{yu2013collaborative}.
15 | HIN based recommendation has received much attention in the literature~\cite{feng2012incorporating,yu2013collaborative,yu2014personalized,shi2015semantic,shi2016integrating,zheng2017recommendation}.
16 |  The basic idea of most existing HIN based recommendation methods is to leverage path based semantic relatedness between users and items over HINs, \eg meta-path based similarities,
17 |  for recommendation.
18 | 
19 | \begin{figure*}[t]
20 | \centering
21 | \includegraphics[width=15cm]{image/framework.pdf}
22 | \caption{\label{fig_framework}The schematic illustration of the proposed HERec approach.}
23 | \end{figure*}
24 | 
25 | Although HIN based methods have achieved performance improvement to some extent,
26 | there are two major problems for these methods using meta-path based similarities.
27 | First, meta-path based similarities rely on explicit path reachability,
28 | and may not be reliable to be used for recommendation when
29 | path connections are sparse or noisy.
30 | It is likely that some links in HINs are accidentally formed which do not convey meaningful semantics.
31 | Second, meta-path based similarities mainly characterize semantic relations defined over HINs, and may not be directly applicable to
32 | recommender systems.
33 | It is likely that the derived path based similarities have no explicit impact on the recommendation performance in some cases.
34 | Existing methods mainly learn a linear weighting mechanism to combine the path based similarities~\cite{shi2016integrating} or latent factors~\cite{yu2013collaborative}, which cannot learn the complicated mapping mechanism of HIN information for recommendation.
35 | The two problems essentially reflect two fundamental issues for HIN based recommendation, namely effective information extraction and exploitation based on HINs for recommendation.
36 | %The focus of this paper is to study how to address these two issues and improve HIN-based recommendation.
37 | 
38 | For the first issue, it is challenging to develop a way to effectively extract and represent useful information for HINs due to data heterogeneity.
39 | Unlike previous studies using meta-path based similarities~\cite{sun2011pathsim,yu2013collaborative}, our idea is to
40 | learn effective heterogeneous network representations for summarizing important structural characteristics and properties of HINs.
41 | Following \cite{perozzi2014deepwalk,grover2016node2vec}, we characterize nodes from HINs with low-dimensional vectors, \ie embeddings.
42 | Instead of relying on explicit path connection, we would like to encode useful information from HINs with latent vectors.
43 | Compared with meta-path based similarity, the learned embeddings are in a more compact form that is easy to use and integrate.
44 | Also, the network embedding approach itself is more resistant to sparse and noisy data.
45 | However, most existing network embedding methods focus on homogeneous networks only consisting of a single type of nodes and links, and cannot directly deal with heterogeneous networks consisting of multiple types of nodes and links. Hence, we propose a new heterogeneous network embedding method.
46 | Considering heterogeneous characteristics and rich semantics reflected by meta-paths, the proposed method first uses a
47 | random walk strategy guided by meta-paths to generate node sequences. For each meta-path, we learn a unique embedding representation for a node by maximizing its co-occurrence probability with neighboring nodes in the sequences sampled according to the given meta-path. We fuse the multiple embeddings \emph{w.r.t.} different meta-paths as the output of HIN embedding.
48 | 
49 | After obtaining the embeddings from HINs, we study how to integrate and utilize such information in recommender systems.
50 | We don't assume the learned embeddings are naturally applicable in recommender systems. Instead, we propose and explore three fusion functions to integrate multiple embeddings of a node into a single representation for recommendation, including simple linear fusion, personalized linear fusion and non-linear fusion.
51 | These fusion functions provide flexible ways to transform HIN embeddings into useful information for recommendation.
52 | Specially, we emphasize that personalization and non-linearity are two key points to consider for information transformation in our setting.
53 | Finally, we extend the classic matrix factorization framework by incorporating the fused HIN embeddings.
54 | The prediction model and the fusion function are jointly optimized for the rating prediction task.
55 | 
56 | By integrating the above two parts together, this work presents a novel HIN embedding based recommendation approach, called \emph{HERec} for short.
57 | HERec first extracts useful HIN based information using the proposed HIN embedding method,
58 | and then utilizes the extracted information for recommendation using the extended matrix factorization model.
59 | We present the overall illustration for the proposed approach in Fig.~\ref{fig_framework}.
60 | Extensive experiments on three real-world datasets demonstrate the effectiveness of the proposed approach. We also  verify the ability of HERec to alleviate cold-start problem and examine the impact of meta-paths on performance. The key contributions of this paper can be summarized as follows:
61 | 
62 | 
63 | \textbullet ~We propose a heterogeneous network embedding method guided by meta-paths to uncover the semantic and structural information of heterogeneous information networks. Moreover, we propose a general embedding fusion approach to integerate different
64 | embeddings based on different meta-paths into a single representation.
65 | 
66 | \textbullet ~We propose a novel heterogeneous information network embedding for recommendation model, called HERec for short. HERec can effectively integrate various kinds of embedding information in HIN to enhance the recommendation performance. In addition, we design a set of three flexible fusion functions to effectively transform HIN embeddings into useful information for recommendation.
67 | 
68 | \textbullet ~Extensive experiments on three real-world datasets demonstrate the effectiveness of the proposed model. Moreover, we show the capability of the proposed model for the cold-start prediction problem, and reveal that the transformed embedding information from HINs can improve the recommendation performance.
69 | 
70 | The remainder of this paper is organized as follows. Section~\ref{sec-rel} introduces the related works. Section~\ref{sec-def} describes notations used in the paper and presents some preliminary knowledge. Then, We propose the heterogeneous network embedding method and the HERec model in Section~\ref{sec-model}. Experiments and detailed analysis are reported in Section~\ref{sec-exp}. Finally, we conclude the paper in Section~\ref{sec-con}.
71 | 
72 | \begin{figure*}[t]
73 | \centering
74 | \subfigure[Douban Movie]{
75 | \begin{minipage}[b]{0.3\textwidth}
76 | \includegraphics[width=1\textwidth]{image/doubanmovie.pdf}
77 | \end{minipage}
78 | }
79 | \subfigure[Douban Book]{
80 | \begin{minipage}[b]{0.3\textwidth}
81 | \includegraphics[width=1\textwidth]{image/doubanbook.pdf}
82 | \end{minipage}
83 | }
84 | \subfigure[Yelp]{
85 | \begin{minipage}[b]{0.3\textwidth}
86 | \includegraphics[width=1\textwidth]{image/yelp.pdf}
87 | \end{minipage}
88 | }
89 | \caption{\label{fig_schema}Network schemas of heterogeneous information networks for the used three datasets.
90 | In our task, users and items are our major focus, denoted by large-sized circles, while the other attributes are denoted by small-sized circles. }
91 | 
92 | \end{figure*}
93 | 
94 | 


--------------------------------------------------------------------------------
/code/HERec_sl.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/python
  2 | #encoding=utf-8
  3 | import numpy as np
  4 | import time
  5 | import random
  6 | from math import sqrt,fabs,log
  7 | import sys
  8 | 
  9 | class HNERec:
 10 |     def __init__(self, unum, inum, ratedim, userdim, itemdim, user_metapaths,item_metapaths, trainfile, testfile, steps, delta, beta_e, beta_h, beta_p, beta_w, beta_b, reg_u, reg_v):
 11 |         self.unum = unum
 12 |         self.inum = inum
 13 |         self.ratedim = ratedim
 14 |         self.userdim = userdim
 15 |         self.itemdim = itemdim
 16 |         self.steps = steps
 17 |         self.delta = delta
 18 |         self.beta_e = beta_e
 19 |         self.beta_h = beta_h
 20 |         self.beta_p = beta_p
 21 |         self.beta_w = beta_w
 22 |         self.beta_b = beta_b
 23 |         self.reg_u = reg_u
 24 |         self.reg_v = reg_v
 25 | 
 26 |         self.user_metapathnum = len(user_metapaths)
 27 |         self.item_metapathnum = len(item_metapaths)
 28 | 
 29 |         self.X, self.user_metapathdims = self.load_embedding(user_metapaths, unum)
 30 |         print 'Load user embeddings finished.'
 31 | 
 32 |         self.Y, self.item_metapathdims = self.load_embedding(item_metapaths, inum)
 33 |         print 'Load user embeddings finished.'
 34 | 
 35 |         self.R, self.T, self.ba = self.load_rating(trainfile, testfile)
 36 |         print 'Load rating finished.'
 37 |         print 'train size : ', len(self.R)
 38 |         print 'test size : ', len(self.T) 
 39 | 
 40 |         self.initialize();
 41 |         self.recommend();
 42 | 
 43 |     def load_embedding(self, metapaths, num):
 44 |         X = {}
 45 |         for i in range(num):
 46 |             X[i] = {}
 47 |         metapathdims = []
 48 |     
 49 |         ctn = 0
 50 |         for metapath in metapaths:
 51 |             sourcefile = '../data/embeddings/' + metapath
 52 |             #print sourcefile
 53 |             with open(sourcefile) as infile:
 54 |                 
 55 |                 k = int(infile.readline().strip().split(' ')[1])
 56 |                 metapathdims.append(k)
 57 |                 for i in range(num):
 58 |                     X[i][ctn] = np.zeros(k)
 59 | 
 60 |                 n = 0
 61 |                 for line in infile.readlines():
 62 |                     n += 1
 63 |                     arr = line.strip().split(' ')
 64 |                     i = int(arr[0]) - 1
 65 |                     for j in range(k):
 66 |                         X[i][ctn][j] = float(arr[j + 1])
 67 |                 print 'metapath ', metapath, 'numbers ', n
 68 |             ctn += 1
 69 |         return X, metapathdims
 70 | 
 71 |     def load_rating(self, trainfile, testfile):
 72 |         R_train = []
 73 |         R_test = []
 74 |         ba = 0.0
 75 |         n = 0
 76 |         user_test_dict = dict()
 77 |         with open(trainfile) as infile:
 78 |             for line in infile.readlines():
 79 |                 user, item, rating = line.strip().split('\t')
 80 |                 R_train.append([int(user)-1, int(item)-1, int(rating)])
 81 |                 ba += int(rating)
 82 |                 n += 1
 83 |         ba = ba / n
 84 |         ba = 0
 85 |         with open(testfile) as infile:
 86 |             for line in infile.readlines():
 87 |                 user, item, rating = line.strip().split('\t')
 88 |                 R_test.append([int(user)-1, int(item)-1, int(rating)])
 89 | 
 90 |         return R_train, R_test, ba
 91 | 
 92 |     def initialize(self):
 93 |         self.E = np.random.randn(self.unum, self.itemdim) * 0.1
 94 |         self.H = np.random.randn(self.inum, self.userdim) * 0.1
 95 |         self.U = np.random.randn(self.unum, self.ratedim) * 0.1
 96 |         self.V = np.random.randn(self.unum, self.ratedim) * 0.1
 97 | 
 98 |         self.pu = np.ones((self.unum, self.user_metapathnum)) * 1.0 / self.user_metapathnum
 99 |         self.pv = np.ones((self.inum, self.item_metapathnum)) * 1.0 / self.item_metapathnum
100 | 
101 | 
102 |         self.Wu = {}
103 |         self.bu = {}
104 |         for k in range(self.user_metapathnum):
105 |             self.Wu[k] = np.random.randn(self.userdim, self.user_metapathdims[k]) * 0.1
106 |             self.bu[k] = np.random.randn(self.userdim) * 0.1
107 | 
108 |         self.Wv = {}
109 |         self.bv = {}
110 |         for k in range(self.item_metapathnum):
111 |             self.Wv[k] = np.random.randn(self.itemdim, self.item_metapathdims[k]) * 0.1
112 |             self.bv[k] = np.random.randn(self.itemdim) * 0.1
113 | 
114 |     def cal_u(self, i):
115 |         ui = np.zeros(self.userdim)
116 |         for k in range(self.user_metapathnum):
117 |             ui += self.pu[i][k] * (self.Wu[k].dot(self.X[i][k]) + self.bu[k])
118 |         return ui
119 | 
120 |     def cal_v(self, j):
121 |         vj = np.zeros(self.itemdim)
122 |         for k in range(self.item_metapathnum):
123 |             vj += self.pv[j][k] * (self.Wv[k].dot(self.Y[j][k]) + self.bv[k])
124 |         return vj
125 | 
126 |     def get_rating(self, i, j):
127 |         ui = self.cal_u(i)
128 |         vj = self.cal_v(j)
129 |         return self.U[i, :].dot(self.V[j, :]) + self.reg_u * ui.dot(self.H[j, :]) + self.reg_v * self.E[i, :].dot(vj)
130 | 
131 |     def maermse(self):
132 |         m = 0.0
133 |         mae = 0.0
134 |         rmse = 0.0
135 |         n = 0
136 |         for t in self.T:
137 |             n += 1
138 |             i = t[0]
139 |             j = t[1]
140 |             r = t[2]
141 |             r_p = self.get_rating(i, j)
142 | 
143 |             if r_p > 5: r_p = 5
144 |             if r_p < 1: r_p = 1
145 |             m = fabs(r_p - r)
146 |             mae += m
147 |             rmse += m * m
148 |         mae = mae * 1.0 / n
149 |         rmse = sqrt(rmse * 1.0 / n)
150 |         return mae, rmse
151 | 
152 |     def recommend(self):
153 |         mae = []
154 |         rmse = []
155 |         starttime = time.clock()
156 |         perror = 99999
157 |         cerror = 9999
158 |         n = len(self.R)
159 | 
160 |         for step in range(steps):
161 |             total_error = 0.0
162 |             for t in self.R:
163 |                 i = t[0]
164 |                 j = t[1]
165 |                 rij = t[2]
166 | 
167 |                 rij_t = self.get_rating(i, j)
168 |                 eij = rij - rij_t
169 |                 total_error += eij * eij
170 |                 
171 |                 U_g = -eij * self.V[j, :] + self.beta_e * self.U[i, :]
172 |                 V_g = -eij * self.U[i, :] + self.beta_h * self.V[j, :]
173 | 
174 |                 self.U[i, :] -= self.delta * U_g
175 |                 self.V[j, :] -= self.delta * V_g
176 | 
177 |                 ui = self.cal_u(i)
178 |                 for k in range(self.user_metapathnum):
179 |                     pu_g = self.reg_u * -eij * self.H[j, :].dot(self.Wu[k].dot(self.X[i][k]) + self.bu[k]) + self.beta_p * self.pu[i][k]
180 |                     Wu_g = self.reg_u * -eij * self.pu[i][k] * np.array([self.H[j, :]]).T.dot(np.array([self.X[i][k]])) + self.beta_w * self.Wu[k]
181 |                     bu_g = self.reg_u * -eij * self.pu[i][k] * self.H[j, :] + self.beta_b * self.bu[k]
182 | 
183 |                     #self.pu[i][k] -= 0.1 * self.delta * pu_g
184 |                     self.Wu[k] -= 0.1 * self.delta * Wu_g
185 |                     self.bu[k] -= 0.1 * self.delta * bu_g
186 | 
187 |                 H_g = self.reg_u * -eij * ui + self.beta_h * self.H[j, :]
188 |                 self.H[j, :] -= self.delta * H_g
189 | 
190 |                 vj = self.cal_v(j)
191 |                 for k in range(self.item_metapathnum):
192 |                     pv_g = self.reg_v * -eij * self.E[i, :].dot(self.Wv[k].dot(self.Y[j][k]) + self.bv[k]) + self.beta_p * self.pv[j][k]
193 |                     Wv_g = self.reg_v * -eij * self.pv[j][k] * np.array([self.E[i, :]]).T.dot(np.array([self.Y[j][k]])) + self.beta_w * self.Wv[k]
194 |                     bv_g = self.reg_v * -eij * self.pv[j][k] * self.E[i, :] + self.beta_b * self.bv[k]
195 | 
196 |                     #self.pv[j][k] -= 0.1 * self.delta * pv_g
197 |                     self.Wv[k] -= 0.1 * self.delta * Wv_g
198 |                     self.bv[k] -= 0.1 * self.delta * bv_g
199 | 
200 |                 E_g = self.reg_v * -eij * vj + 0.01 * self.E[i, :]
201 |                 self.E[i, :] -= self.delta * E_g
202 | 
203 |             perror = cerror
204 |             cerror = total_error / n
205 | 	    
206 | 	    self.delta = self.delta * 0.93
207 |             if(abs(perror - cerror) < 0.0001):
208 |                 break
209 |             #print 'step ', step, 'crror : ', sqrt(cerror)
210 |             MAE, RMSE = self.maermse()
211 |             mae.append(MAE)
212 |             rmse.append(RMSE)
213 |             #print 'MAE, RMSE ', MAE, RMSE
214 |             endtime = time.clock()
215 |             #print 'time: ', endtime - starttime
216 |         print 'MAE: ', min(mae), ' RMSE: ', min(rmse)
217 | 
218 | if __name__ == "__main__":
219 |     unum = 16239
220 |     inum = 14284
221 |     ratedim = 10
222 |     userdim = 30
223 |     itemdim = 10
224 |     train_rate = 0.8#float(sys.argv[1])
225 | 
226 |     user_metapaths = ['ubu', 'ubcibu', 'ubcabu']
227 |     item_metapaths = ['bub', 'bcib', 'bcab']
228 | 
229 |     for i in range(len(user_metapaths)):
230 |         user_metapaths[i] += '_' + str(train_rate) + '.embedding'
231 |     for i in range(len(item_metapaths)):
232 |         item_metapaths[i] += '_' + str(train_rate) + '.embedding'
233 |     #user_metapaths = ['ubu_' + str(train_rate) + '.embedding', 'ubcibu_'+str(train_rate)+'.embedding', 'ubcabu_'+str(train_rate)+'.embedding']
234 |     
235 |     #item_metapaths = ['bub_'+str(train_rate)+'.embedding', 'bcib.embedding', 'bcab.embedding']
236 |     trainfile = '../data/ub_'+str(train_rate)+'.train'
237 |     testfile = '../data/ub_'+str(train_rate)+'.test'
238 |     steps = 100
239 |     delta = 0.02
240 |     beta_e = 0.1
241 |     beta_h = 0.1
242 |     beta_p = 2
243 |     beta_w = 0.1
244 |     beta_b = 0.01
245 |     reg_u = 1.0
246 |     reg_v = 1.0
247 |     print 'train_rate: ', train_rate
248 |     print 'ratedim: ', ratedim, ' userdim: ', userdim, ' itemdim: ', itemdim
249 |     print 'max_steps: ', steps
250 |     print 'delta: ', delta, 'beta_e: ', beta_e, 'beta_h: ', beta_h, 'beta_p: ', beta_p, 'beta_w: ', beta_w, 'beta_b', beta_b, 'reg_u', reg_u, 'reg_v', reg_v
251 | 
252 |     HNERec(unum, inum, ratedim, userdim, itemdim, user_metapaths, item_metapaths, trainfile, testfile, steps, delta, beta_e, beta_h, beta_p, beta_w, beta_b, reg_u, reg_v)
253 | 


--------------------------------------------------------------------------------
/code/HERec_pl.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/python
  2 | #encoding=utf-8
  3 | import numpy as np
  4 | import time
  5 | import random
  6 | from math import sqrt,fabs,log
  7 | import sys
  8 | 
  9 | class HNERec:
 10 |     def __init__(self, unum, inum, ratedim, userdim, itemdim, user_metapaths,item_metapaths, trainfile, testfile, steps, delta, beta_e, beta_h, beta_p, beta_w, beta_b, reg_u, reg_v):
 11 |         self.unum = unum
 12 |         self.inum = inum
 13 |         self.ratedim = ratedim
 14 |         self.userdim = userdim
 15 |         self.itemdim = itemdim
 16 |         self.steps = steps
 17 |         self.delta = delta
 18 |         self.beta_e = beta_e
 19 |         self.beta_h = beta_h
 20 |         self.beta_p = beta_p
 21 |         self.beta_w = beta_w
 22 |         self.beta_b = beta_b
 23 |         self.reg_u = reg_u
 24 |         self.reg_v = reg_v
 25 | 
 26 |         self.user_metapathnum = len(user_metapaths)
 27 |         self.item_metapathnum = len(item_metapaths)
 28 | 
 29 |         self.X, self.user_metapathdims = self.load_embedding(user_metapaths, unum)
 30 |         print 'Load user embeddings finished.'
 31 | 
 32 |         self.Y, self.item_metapathdims = self.load_embedding(item_metapaths, inum)
 33 |         print 'Load user embeddings finished.'
 34 | 
 35 |         self.R, self.T, self.ba = self.load_rating(trainfile, testfile)
 36 |         print 'Load rating finished.'
 37 |         print 'train size : ', len(self.R)
 38 |         print 'test size : ', len(self.T) 
 39 | 
 40 |         self.initialize();
 41 |         self.recommend();
 42 | 
 43 |     def load_embedding(self, metapaths, num):
 44 |         X = {}
 45 |         for i in range(num):
 46 |             X[i] = {}
 47 |         metapathdims = []
 48 |     
 49 |         ctn = 0
 50 |         for metapath in metapaths:
 51 |             sourcefile = '../data/embeddings/' + metapath
 52 |             #print sourcefile
 53 |             with open(sourcefile) as infile:
 54 |                 
 55 |                 k = int(infile.readline().strip().split(' ')[1])
 56 |                 metapathdims.append(k)
 57 |                 for i in range(num):
 58 |                     X[i][ctn] = np.zeros(k)
 59 | 
 60 |                 n = 0
 61 |                 for line in infile.readlines():
 62 |                     n += 1
 63 |                     arr = line.strip().split(' ')
 64 |                     i = int(arr[0]) - 1
 65 |                     for j in range(k):
 66 |                         X[i][ctn][j] = float(arr[j + 1])
 67 |                 print 'metapath ', metapath, 'numbers ', n
 68 |             ctn += 1
 69 |         return X, metapathdims
 70 | 
 71 |     def load_rating(self, trainfile, testfile):
 72 |         R_train = []
 73 |         R_test = []
 74 |         ba = 0.0
 75 |         n = 0
 76 |         user_test_dict = dict()
 77 |         with open(trainfile) as infile:
 78 |             for line in infile.readlines():
 79 |                 user, item, rating = line.strip().split('\t')
 80 |                 R_train.append([int(user)-1, int(item)-1, int(rating)])
 81 |                 ba += int(rating)
 82 |                 n += 1
 83 |         ba = ba / n
 84 |         ba = 0
 85 |         with open(testfile) as infile:
 86 |             for line in infile.readlines():
 87 |                 user, item, rating = line.strip().split('\t')
 88 |                 R_test.append([int(user)-1, int(item)-1, int(rating)])
 89 | 
 90 |         return R_train, R_test, ba
 91 | 
 92 |     def initialize(self):
 93 |         self.E = np.random.randn(self.unum, self.itemdim) * 0.1
 94 |         self.H = np.random.randn(self.inum, self.userdim) * 0.1
 95 |         self.U = np.random.randn(self.unum, self.ratedim) * 0.1
 96 |         self.V = np.random.randn(self.unum, self.ratedim) * 0.1
 97 | 
 98 |         self.pu = np.ones((self.unum, self.user_metapathnum)) * 1.0 / self.user_metapathnum
 99 |         self.pv = np.ones((self.inum, self.item_metapathnum)) * 1.0 / self.item_metapathnum
100 | 
101 | 
102 |         self.Wu = {}
103 |         self.bu = {}
104 |         for k in range(self.user_metapathnum):
105 |             self.Wu[k] = np.random.randn(self.userdim, self.user_metapathdims[k]) * 0.1
106 |             self.bu[k] = np.random.randn(self.userdim) * 0.1
107 | 
108 |         self.Wv = {}
109 |         self.bv = {}
110 |         for k in range(self.item_metapathnum):
111 |             self.Wv[k] = np.random.randn(self.itemdim, self.item_metapathdims[k]) * 0.1
112 |             self.bv[k] = np.random.randn(self.itemdim) * 0.1
113 | 
114 |     def cal_u(self, i):
115 |         ui = np.zeros(self.userdim)
116 |         for k in range(self.user_metapathnum):
117 |             ui += self.pu[i][k] * (self.Wu[k].dot(self.X[i][k]) + self.bu[k])
118 |         return ui
119 | 
120 |     def cal_v(self, j):
121 |         vj = np.zeros(self.itemdim)
122 |         for k in range(self.item_metapathnum):
123 |             vj += self.pv[j][k] * (self.Wv[k].dot(self.Y[j][k]) + self.bv[k])
124 |         return vj
125 | 
126 |     def get_rating(self, i, j):
127 |         ui = self.cal_u(i)
128 |         vj = self.cal_v(j)
129 |         return self.U[i, :].dot(self.V[j, :]) + self.reg_u * ui.dot(self.H[j, :]) + self.reg_v * self.E[i, :].dot(vj)
130 | 
131 |     def maermse(self):
132 |         m = 0.0
133 |         mae = 0.0
134 |         rmse = 0.0
135 |         n = 0
136 |         for t in self.T:
137 |             n += 1
138 |             i = t[0]
139 |             j = t[1]
140 |             r = t[2]
141 |             r_p = self.get_rating(i, j)
142 | 
143 |             if r_p > 5: r_p = 5
144 |             if r_p < 1: r_p = 1
145 |             m = fabs(r_p - r)
146 |             mae += m
147 |             rmse += m * m
148 |         mae = mae * 1.0 / n
149 |         rmse = sqrt(rmse * 1.0 / n)
150 |         return mae, rmse
151 | 
152 |     def recommend(self):
153 |         mae = []
154 |         rmse = []
155 |         starttime = time.clock()
156 |         perror = 99999
157 |         cerror = 9999
158 |         n = len(self.R)
159 | 
160 |         for step in range(steps):
161 |             total_error = 0.0
162 |             train_start_time = time.time()
163 |             for t in self.R:
164 |                 i = t[0]
165 |                 j = t[1]
166 |                 rij = t[2]
167 | 
168 |                 rij_t = self.get_rating(i, j)
169 |                 eij = rij - rij_t
170 |                 total_error += eij * eij
171 |                 
172 |                 U_g = -eij * self.V[j, :] + self.beta_e * self.U[i, :]
173 |                 V_g = -eij * self.U[i, :] + self.beta_h * self.V[j, :]
174 | 
175 |                 self.U[i, :] -= delta * U_g
176 |                 self.V[j, :] -= delta * V_g
177 | 
178 |                 ui = self.cal_u(i)
179 |                 for k in range(self.user_metapathnum):
180 |                     pu_g = self.reg_u * -eij * self.H[j, :].dot(self.Wu[k].dot(self.X[i][k]) + self.bu[k]) + self.beta_p * self.pu[i][k]
181 |                     Wu_g = self.reg_u * -eij * self.pu[i][k] * np.array([self.H[j, :]]).T.dot(np.array([self.X[i][k]])) + self.beta_w * self.Wu[k]
182 |                     bu_g = self.reg_u * -eij * self.pu[i][k] * self.H[j, :] + self.beta_b * self.bu[k]
183 | 
184 |                     self.pu[i][k] -= 0.1 * self.delta * pu_g
185 |                     self.Wu[k] -= 0.1 * self.delta * Wu_g
186 |                     self.bu[k] -= 0.1 * self.delta * bu_g
187 | 
188 |                 H_g = self.reg_u * -eij * ui + self.beta_h * self.H[j, :]
189 |                 self.H[j, :] -= self.delta * H_g
190 | 
191 |                 vj = self.cal_v(j)
192 |                 for k in range(self.item_metapathnum):
193 |                     pv_g = self.reg_v * -eij * self.E[i, :].dot(self.Wv[k].dot(self.Y[j][k]) + self.bv[k]) + self.beta_p * self.pv[j][k]
194 |                     Wv_g = self.reg_v * -eij * self.pv[j][k] * np.array([self.E[i, :]]).T.dot(np.array([self.Y[j][k]])) + self.beta_w * self.Wv[k]
195 |                     bv_g = self.reg_v * -eij * self.pv[j][k] * self.E[i, :] + self.beta_b * self.bv[k]
196 | 
197 |                     self.pv[j][k] -= 0.1 * self.delta * pv_g
198 |                     self.Wv[k] -= 0.1 * self.delta * Wv_g
199 |                     self.bv[k] -= 0.1 * self.delta * bv_g
200 | 
201 |                 E_g = self.reg_v * -eij * vj + 0.01 * self.E[i, :]
202 |                 self.E[i, :] -= self.delta * E_g
203 | 
204 |             perror = cerror
205 |             cerror = total_error / n
206 | 
207 |             self.delta = 0.93 * self.delta
208 | 
209 |             if(abs(perror - cerror) < 0.0001):
210 |                 break
211 |             print 'step ', step, 'crror : ', sqrt(cerror)
212 |             train_end_time = time.time()
213 |             print 'train time : ', (train_end_time - train_start_time)
214 |             MAE, RMSE = self.maermse()
215 |             mae.append(MAE)
216 |             rmse.append(RMSE)
217 |             #if step % 5 == 0:
218 |             print 'step, MAE, RMSE ', step, MAE, RMSE
219 |             test_time = time.time()
220 |             print 'time: ', test_time - train_end_time
221 |         print 'MAE: ', min(mae), ' RMSE: ', min(rmse)
222 | 
223 | if __name__ == "__main__":
224 |     unum = 16239
225 |     inum = 14284
226 |     ratedim = 10#int(sys.argv[1])
227 |     userdim = 30
228 |     itemdim = 10
229 |     train_rate = 0.8
230 |     
231 |     user_metapaths = ['ubu', 'ubcibu', 'ubcabu']
232 |     item_metapaths = ['bub', 'bcib', 'bcab']
233 | 
234 |     for i in range(len(user_metapaths)):
235 |         user_metapaths[i] += '_' + str(train_rate) + '.embedding'
236 |     for i in range(len(item_metapaths)):
237 |         item_metapaths[i] += '_' + str(train_rate) + '.embedding'
238 | 
239 |     #user_metapaths = ['ubu_' + str(train_rate) +'.embedding', 'ubcibu_''.embedding', 'ubcabu_0.8.embedding']
240 |     
241 |     #item_metapaths = ['bub_0.8.embedding', 'bcib_0.8.embedding', 'bcab_0.8.embedding']
242 |     trainfile = '../data/ub_' + str(train_rate) +'.train'
243 |     testfile = '../data/ub_' + str(train_rate) + '.test'
244 |     steps = 100
245 |     delta = 0.01
246 |     beta_e = 0.1
247 |     beta_h = 0.1
248 |     beta_p = 2
249 |     beta_w = 0.1
250 |     beta_b = 0.01
251 |     reg_u = 1.0
252 |     reg_v = 1.0
253 |     print 'train_rate: ', train_rate
254 |     print 'ratedim: ', ratedim, ' userdim: ', userdim, ' itemdim: ', itemdim
255 |     print 'max_steps: ', steps
256 |     print 'delta: ', delta, 'beta_e: ', beta_e, 'beta_h: ', beta_h, 'beta_p: ', beta_p, 'beta_w: ', beta_w, 'beta_b', beta_b, 'reg_u', reg_u, 'reg_v', reg_v
257 | 
258 |     HNERec(unum, inum, ratedim, userdim, itemdim, user_metapaths, item_metapaths, trainfile, testfile, steps, delta, beta_e, beta_h, beta_p, beta_w, beta_b, reg_u, reg_v)
259 | 


--------------------------------------------------------------------------------
/paper/sec-rel.tex:
--------------------------------------------------------------------------------
 1 | \section{Related Work \label{sec-rel}}
 2 | In this section, we will review the related studies in three aspects, namely recommender systems, heterogeneous information networks and network embedding.
 3 | 
 4 | In the literature of recommender systems, early works mainly adopt collaborative filtering (CF) methods to utilize historical interactions for recommendation~\cite{schafer2007collaborative}. Particularly, the matrix factorization approach~\cite{koren2009matrix,shi2012adaptive} has shown its effectiveness and efficiency in many applications, which factorizes user-item rating matrix into two low rank user-specific and item-specific matrices, and then utilizes the factorized matrices to make further predictions~\cite{koren2015advances}. Since CF methods usually suffer from cold-start problem, many works~\cite{yin2013lcars,feng2012incorporating,hong2013co} attempt to leverage additional information to improve recommendation performance. For example, Ma et al.~\cite{ma2011recommender} integrate social relations into matrix factorization in recommendation. Ling et al.~\cite{ling2014ratings} consider the information of both ratings and reviews and propose a unified model to combine content based filtering with collaborative filtering for rating prediction task. Ye et al.~\cite{ye2011exploiting} incorporate user preference, social influence and geographical influence in the recommendation and propose a unified POI recommendation framework. More recently, Sedhain et al.~\cite{sedhain2017low} explain drawbacks of three popular cold-start models~\cite{gantner2010learning,krohn2012multi,sedhain2014social} and further propose a learning based approach for the cold-start problem to leverage social data via randomised SVD. And many works begin to utilize deep models ($\eg$ convolutional neural network, auto encoder) to exploit text information~\cite{zheng2017joint}, image information~\cite{he2016vbpr} and network structure information~\cite{zhang2016collaborative} for better recommendation. In addition, there are also some typical frameworks focusing on incorporating auxiliary information for recommendation. Chen et al.~\cite{chen2012svdfeature} propose a typical SVDFeature framework to efficiently solve the feature based matrix factorization. And Rendle~\cite{rendle2010factorization} proposes factorization machine, which is a generic approach to combine the generality of feature engineering.
 5 | 
 6 | As a newly emerging direction, heterogeneous information network~\cite{shi2017survey} can naturally model complex objects and their rich relations in recommender systems, in which objects are of different types and links among objects represent different relations~\cite{sun2013mining,ou2013comparing}. And several path based similarity measures~\cite{lao2010relational,sun2011pathsim,shi2014hetesim} are proposed to evaluate the similarity of objects in heterogeneous information network. Therefore, some researchers have began to be aware of the importance of HIN based recommendation. Wang et al.~\cite{feng2012incorporating} propose the OptRank method to alleviate the cold-start problem by utilizing heterogeneous information contained in social tagging system. Furthermore, the concept of meta-path is introduced into hybrid recommender systems~\cite{yu2013recommendation}. Yu et al.~\cite{yu2013collaborative} utilize meta-path based similarities as regularization terms in the matrix factorization framework. Yu et al.~\cite{yu2014personalized} take advantage of different types of entity relationships in heterogeneous information network and propose a personalized recommendation framework for implicit feedback dataset. Luo et al.~\cite{luo2014hete} propose a collaborative filtering based social recommendation method using heterogeneous relations. More recently, Shi et al.~\cite{shi2015semantic} propose the concept of weighted heterogeneous information network and design a meta-path based collaborative filtering model to flexibly integrate heterogeneous information for personalized recommendation. In \cite{shi2016integrating,zheng2016dual,zheng2017recommendation}, the similarities of users and items are both evaluated by path based similarity measures under different semantic meta-paths and a matrix factorization based on dual regularization framework is proposed for rating prediction. Besides meta-path, Zhao et al.~\cite{zhao2017meta} propose a factorization machine based model integrated with meta-graph based similarity for recommendation. Most of HIN based methods rely on the path based similarity, which may not fully mine latent features of users and items on HINs for recommendation.
 7 | 
 8 | 
 9 | %As a newly emerging direction, HIN \cite{shi2017survey} can naturally model complex objects and their rich relations in recommender systems. \cite{yu2013collaborative} utilized the entity similarity extracted from HIN as a regularization term in CF.  \cite{shi2015semantic} employed the meta-path based similarity of users for personalized recommendation. Recently, \cite{zheng2017recommendation} utilized the similarities of users and items as the regularization term of matrix factorization. Most of HIN-based methods rely on the path based similarity, which cannot fully mine latent features of users and items on HINs for recommendation.
10 | 
11 | %On the other hand, network embedding has shown its potential in structure feature extraction and has been successfully applied in many data mining tasks~\cite{hoff2002latent,yan2007graph}, such as classification~\cite{tu2016max,kipf2016semi}, clustering~\cite{wei2017cross,cao2016deep} and recommendation~\cite{liang2016factorization,sunmrlr}. Deepwalk~\cite{perozzi2014deepwalk} combined random walk and skip-gram to learn network representations. Furthermore, Grover and Leskovec~\cite{grover2016node2vec} propose a more flexible network embedding framework based on a biased random walk procedure. In addition, LINE \cite{tang2015line} and SDNE~\cite{wang2016structural} characterize the second-order link proximity, as well as neighbor relations. Cao et al.~\cite{cao2015grarep} propose the GraRep model to capture higher-order graph proximity for network representations. Apart from leaning network embedding from only the topology, many works~\cite{pan2016tri,yang2015network,zhang2016homophily} begin to  leverage node content information and other available graph information for the robust representations. Pan et al.\cite{pan2016tri} propose the TriDNR model to learn optimal node representation with node structure information, node content and node labels. Yang et al.~\cite{yang2015network} propose the TADW model to incorporate text features of vertices into network representation based on matrix factorization framework. Zhang et al.~\cite{zhang2016homophily} consider neighbors homophily, topology structure and node content during network representation learning based on matrix decomposition. Most of network embedding methods focus on homogeneous networks, and thus they cannot directly be applied for heterogeneous networks. Although several works~\cite{chang2015heterogeneous,tang2015pte,xu2017embedding,chen2017task,dong2017metapath2vec} attempt to analyze heterogeneous networks via embedding methods, their representations of nodes and relations may not be suitable for recommendation.
12 | 
13 | On the other hand, network embedding has shown its potential in structure feature extraction and has been successfully applied in many data mining tasks~\cite{hoff2002latent,yan2007graph,cui2017survey,cui2018general}, such as classification~\cite{tu2016max}, clustering~\cite{wei2017cross,cao2016deep} and recommendation~\cite{liang2016factorization,sunmrlr}. Deepwalk~\cite{perozzi2014deepwalk} combines random walk and skip-gram to learn network representations. Furthermore, Grover and Leskovec~\cite{grover2016node2vec} propose a more flexible network embedding framework based on a biased random walk procedure. In addition, LINE \cite{tang2015line} and SDNE~\cite{wang2016structural} characterize the second-order link proximity, as well as neighbor relations. Cao et al.~\cite{cao2015grarep} propose the GraRep model to capture higher-order graph proximity for network representations. Besides leaning network embedding from only the topology,  there are also many works~\cite{pan2016tri,yang2015network,zhang2016homophily} leveraging node content information and other available graph information for the robust representations. Unfortunately, most of network embedding methods focus on homogeneous networks, and thus they cannot directly be applied for heterogeneous networks. Recently, several works~\cite{chang2015heterogeneous,tang2015pte,xu2017embedding,chen2017task,dong2017metapath2vec} attempt to analyze heterogeneous networks via embedding methods. Particularly, Chang et al.~\cite{chang2015heterogeneous} design a deep embedding model to capture the complex interaction between the heterogeneous data in the network. Xu et al.~\cite{xu2017embedding} propose a EOE method to encode the intra-network and inter-network edges for the coupled heterogeneous network. Dong et al.~\cite{dong2017metapath2vec} define the neighbor of nodes via meta-path and learn the heterogeneous embedding by skip-gram with negative sampling. More recently, Fu et al.~\cite{fu2017hin2vec} utilize a neural network model to capture rich relation semantics in HIN. Although these methods can learn network embeddings in various heterogeneous network, their representations of nodes and relations may not be optimum for recommendation.
14 | 
15 | %Pan et al.\cite{pan2016tri} propose the TriDNR model to learn optimal node representation with node structure information, node content and node labels. Yang et al.~\cite{yang2015network} propose the TADW model to incorporate text features of vertices into network representation based on matrix factorization framework. Zhang et al.~\cite{zhang2016homophily} consider neighbors homophily, topology structure and node content during network representation learning based on matrix decomposition. Most of network embedding methods focus on homogeneous networks, and thus they cannot directly be applied for heterogeneous networks. Although several works~\cite{chang2015heterogeneous,tang2015pte,xu2017embedding,chen2017task,dong2017metapath2vec} attempt to analyze heterogeneous networks via embedding methods, their representations of nodes and relations may not be suitable for recommendation.
16 | 
17 | To our knowledge, there are few attempts which adopt the network embedding approach to extract useful information from heterogeneous information network and leverage such information for rating prediction. The proposed approach utilizes the flexibility of HIN for modeling complex heterogeneous context information, and meanwhile borrows the capability of network embedding for learning effective information representation.
18 | The final rating prediction component further incorporates a transformation mechanism implemented by three flexible functions to utilize the learned information from network embedding.
19 | 


--------------------------------------------------------------------------------
/code/HERec_spl.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/python
  2 | #encoding=utf-8
  3 | import numpy as np
  4 | import time
  5 | import random
  6 | from math import sqrt,fabs,log,exp
  7 | import sys
  8 | 
  9 | class HNERec:
 10 |     def __init__(self, unum, inum, ratedim, userdim, itemdim, user_metapaths,item_metapaths, trainfile, testfile, steps, delta, beta_e, beta_h, beta_p, beta_w, beta_b, reg_u, reg_v):
 11 |         self.unum = unum
 12 |         self.inum = inum
 13 |         self.ratedim = ratedim
 14 |         self.userdim = userdim
 15 |         self.itemdim = itemdim
 16 |         self.steps = steps
 17 |         self.delta = delta
 18 |         self.beta_e = beta_e
 19 |         self.beta_h = beta_h
 20 |         self.beta_p = beta_p
 21 |         self.beta_w = beta_w
 22 |         self.beta_b = beta_b
 23 |         self.reg_u = reg_u
 24 |         self.reg_v = reg_v
 25 | 
 26 |         self.user_metapathnum = len(user_metapaths)
 27 |         self.item_metapathnum = len(item_metapaths)
 28 | 
 29 |         self.X, self.user_metapathdims = self.load_embedding(user_metapaths, unum)
 30 |         print 'Load user embeddings finished.'
 31 | 
 32 |         self.Y, self.item_metapathdims = self.load_embedding(item_metapaths, inum)
 33 |         print 'Load user embeddings finished.'
 34 | 
 35 |         self.R, self.T, self.ba = self.load_rating(trainfile, testfile)
 36 |         print 'Load rating finished.'
 37 |         print 'train size : ', len(self.R)
 38 |         print 'test size : ', len(self.T) 
 39 | 
 40 |         self.initialize();
 41 |         self.recommend();
 42 | 
 43 |     def load_embedding(self, metapaths, num):
 44 |         X = {}
 45 |         for i in range(num):
 46 |             X[i] = {}
 47 |         metapathdims = []
 48 |     
 49 |         ctn = 0
 50 |         for metapath in metapaths:
 51 |             sourcefile = '../data/embeddings/' + metapath
 52 |             #print sourcefile
 53 |             with open(sourcefile) as infile:
 54 |                 
 55 |                 k = int(infile.readline().strip().split(' ')[1])
 56 |                 metapathdims.append(k)
 57 |                 for i in range(num):
 58 |                     X[i][ctn] = np.zeros(k)
 59 | 
 60 |                 n = 0
 61 |                 for line in infile.readlines():
 62 |                     n += 1
 63 |                     arr = line.strip().split(' ')
 64 |                     i = int(arr[0]) - 1
 65 |                     for j in range(k):
 66 |                         X[i][ctn][j] = float(arr[j + 1])
 67 |                 print 'metapath ', metapath, 'numbers ', n
 68 |             ctn += 1
 69 |         return X, metapathdims
 70 | 
 71 |     def load_rating(self, trainfile, testfile):
 72 |         R_train = []
 73 |         R_test = []
 74 |         ba = 0.0
 75 |         n = 0
 76 |         user_test_dict = dict()
 77 |         with open(trainfile) as infile:
 78 |             for line in infile.readlines():
 79 |                 user, item, rating = line.strip().split('\t')
 80 |                 R_train.append([int(user)-1, int(item)-1, int(rating)])
 81 |                 ba += int(rating)
 82 |                 n += 1
 83 |         ba = ba / n
 84 |         ba = 0
 85 |         with open(testfile) as infile:
 86 |             for line in infile.readlines():
 87 |                 user, item, rating = line.strip().split('\t')
 88 |                 R_test.append([int(user)-1, int(item)-1, int(rating)])
 89 | 
 90 |         return R_train, R_test, ba
 91 | 
 92 |     def initialize(self):
 93 |         self.E = np.random.randn(self.unum, self.itemdim) * 0.1
 94 |         self.H = np.random.randn(self.inum, self.userdim) * 0.1
 95 |         self.U = np.random.randn(self.unum, self.ratedim) * 0.1
 96 |         self.V = np.random.randn(self.inum, self.ratedim) * 0.1
 97 | 
 98 |         self.pu = np.ones((self.unum, self.user_metapathnum)) * 1.0 / self.user_metapathnum
 99 |         self.pv = np.ones((self.inum, self.item_metapathnum)) * 1.0 / self.item_metapathnum
100 | 
101 | 
102 |         self.Wu = {}
103 |         self.bu = {}
104 |         for k in range(self.user_metapathnum):
105 |             self.Wu[k] = np.random.randn(self.userdim, self.user_metapathdims[k]) * 0.1
106 |             self.bu[k] = np.random.randn(self.userdim) * 0.1
107 | 
108 |         self.Wv = {}
109 |         self.bv = {}
110 |         for k in range(self.item_metapathnum):
111 |             self.Wv[k] = np.random.randn(self.itemdim, self.item_metapathdims[k]) * 0.1
112 |             self.bv[k] = np.random.randn(self.itemdim) * 0.1
113 | 
114 |     def sigmod(self, x):
115 |         return 1 / (1 + np.exp(-x))
116 | 
117 |     def cal_u(self, i):
118 |         ui = np.zeros(self.userdim)
119 |         for k in range(self.user_metapathnum):
120 |             ui += self.pu[i][k] * self.sigmod((self.Wu[k].dot(self.X[i][k]) + self.bu[k]))
121 |         return self.sigmod(ui)
122 | 
123 |     def cal_v(self, j):
124 |         vj = np.zeros(self.itemdim)
125 |         for k in range(self.item_metapathnum):
126 |             vj += self.pv[j][k] * self.sigmod((self.Wv[k].dot(self.Y[j][k]) + self.bv[k]))
127 |         return self.sigmod(vj)
128 | 
129 |     def get_rating(self, i, j):
130 |         ui = self.cal_u(i)
131 |         vj = self.cal_v(j)
132 |         return self.U[i, :].dot(self.V[j, :]) + self.reg_u * ui.dot(self.H[j, :]) + self.reg_v * self.E[i, :].dot(vj)
133 | 
134 |     def maermse(self):
135 |         m = 0.0
136 |         mae = 0.0
137 |         rmse = 0.0
138 |         n = 0
139 |         for t in self.T:
140 |             n += 1
141 |             i = t[0]
142 |             j = t[1]
143 |             r = t[2]
144 |             r_p = self.get_rating(i, j)
145 | 
146 |             if r_p > 5: r_p = 5
147 |             if r_p < 1: r_p = 1
148 |             m = fabs(r_p - r)
149 |             mae += m
150 |             rmse += m * m
151 |         mae = mae * 1.0 / n
152 |         rmse = sqrt(rmse * 1.0 / n)
153 |         return mae, rmse
154 | 
155 |     def recommend(self):
156 |         mae = []
157 |         rmse = []
158 |         starttime = time.clock()
159 |         perror = 99999
160 |         cerror = 9999
161 |         n = len(self.R)
162 | 
163 |         for step in range(steps):
164 |             total_error = 0.0
165 |             for t in self.R:
166 |                 i = t[0]
167 |                 j = t[1]
168 |                 rij = t[2]
169 | 
170 |                 rij_t = self.get_rating(i, j)
171 |                 eij = rij - rij_t
172 |                 total_error += eij * eij
173 |                 
174 |                 U_g = -eij * self.V[j, :] + self.beta_e * self.U[i, :]
175 |                 V_g = -eij * self.U[i, :] + self.beta_h * self.V[j, :]
176 | 
177 |                 self.U[i, :] -= self.delta * U_g
178 |                 self.V[j, :] -= self.delta * V_g
179 | 
180 |                 ui = self.cal_u(i)
181 |                 for k in range(self.user_metapathnum):
182 |                     x_t = self.sigmod(self.Wu[k].dot(self.X[i][k]) + self.bu[k])
183 |                     
184 |                     pu_g = self.reg_u * -eij * (ui * (1-ui) * self.H[j, :]).dot(x_t) + self.beta_p * self.pu[i][k]
185 |                     
186 |                     Wu_g = self.reg_u * -eij * self.pu[i][k] * np.array([ui * (1-ui) * x_t * (1-x_t) * self.H[j, :]]).T.dot(np.array([self.X[i][k]])) + self.beta_w * self.Wu[k]
187 |                     bu_g = self.reg_u * -eij * ui * (1-ui) * self.pu[i][k] * self.H[j, :] * x_t * (1-x_t) + self.beta_b * self.bu[k]
188 |                     #print pu_g
189 |                     self.pu[i][k] -= 0.1 * self.delta * pu_g
190 |                     self.Wu[k] -= 0.1 * self.delta * Wu_g
191 |                     self.bu[k] -= 0.1 * self.delta * bu_g
192 | 
193 |                 H_g = self.reg_u * -eij * ui + self.beta_h * self.H[j, :]
194 |                 self.H[j, :] -= self.delta * H_g
195 | 
196 |                 vj = self.cal_v(j)
197 |                 for k in range(self.item_metapathnum):
198 |                     y_t = self.sigmod(self.Wv[k].dot(self.Y[j][k]) + self.bv[k])
199 |                     pv_g = self.reg_v * -eij * (vj * (1-vj) * self.E[i, :]).dot(y_t) + self.beta_p * self.pv[j][k]
200 |                     Wv_g = self.reg_v * -eij  * self.pv[j][k] * np.array([vj * (1-vj) * y_t * (1 - y_t) * self.E[i, :]]).T.dot(np.array([self.Y[j][k]])) + self.beta_w * self.Wv[k]
201 |                     bv_g = self.reg_v * -eij * vj * (1-vj) * self.pv[j][k] * self.E[i, :] * y_t * (1 - y_t) + self.beta_b * self.bv[k]
202 | 
203 |                     self.pv[j][k] -= 0.1 * self.delta * pv_g
204 |                     self.Wv[k] -= 0.1 * self.delta * Wv_g
205 |                     self.bv[k] -= 0.1 * self.delta * bv_g
206 | 
207 |                 E_g = self.reg_v * -eij * vj + 0.01 * self.E[i, :]
208 |                 self.E[i, :] -= self.delta * E_g
209 | 
210 |             perror = cerror
211 |             cerror = total_error / n
212 |             
213 |             self.delta = 0.93 * self.delta
214 | 
215 |             if(abs(perror - cerror) < 0.0001):
216 |                 break
217 |             #print 'step ', step, 'crror : ', sqrt(cerror)
218 |             MAE, RMSE = self.maermse()
219 |             mae.append(MAE)
220 |             rmse.append(RMSE)
221 |             #print 'MAE, RMSE ', MAE, RMSE
222 |             endtime = time.clock()
223 |             #print 'time: ', endtime - starttime
224 |         print 'MAE: ', min(mae), ' RMSE: ', min(rmse)
225 | 
226 | if __name__ == "__main__":
227 |     unum = 16239
228 |     inum = 14284
229 |     ratedim = 10
230 |     userdim = 30
231 |     itemdim = 10
232 |     train_rate = 0.8#sys.argv[1]
233 | 
234 |     user_metapaths = ['ubu', 'ubcibu', 'ubcabu']
235 |     item_metapaths = ['bub', 'bcib', 'bcab']
236 | 
237 |     for i in range(len(user_metapaths)):
238 |         user_metapaths[i] += '_' + str(train_rate) + '.embedding'
239 |     for i in range(len(item_metapaths)):
240 |         item_metapaths[i] += '_' + str(train_rate) + '.embedding'
241 | 
242 |     #user_metapaths = ['ubu_' + str(train_rate) + '.embedding', 'ubcibu_'+str(train_rate)+'.embedding', 'ubcabu_'+str(train_rate)+'.embedding'] 
243 |     
244 |     #item_metapaths = ['bub_'+str(train_rate)+'.embedding', 'bcib.embedding', 'bcab.embedding']
245 |     trainfile = '../data/ub_'+str(train_rate)+'.train'
246 |     testfile = '../data/ub_'+str(train_rate)+'.test'
247 |     steps = 100
248 |     delta = 0.02
249 |     beta_e = 0.1
250 |     beta_h = 0.1
251 |     beta_p = 2
252 |     beta_w = 0.1
253 |     beta_b = 0.1
254 |     reg_u = 1.0
255 |     reg_v = 1.0
256 |     print 'train_rate: ', train_rate
257 |     print 'ratedim: ', ratedim, ' userdim: ', userdim, ' itemdim: ', itemdim
258 |     print 'max_steps: ', steps
259 |     print 'delta: ', delta, 'beta_e: ', beta_e, 'beta_h: ', beta_h, 'beta_p: ', beta_p, 'beta_w: ', beta_w, 'beta_b', beta_b, 'reg_u', reg_u, 'reg_v', reg_v
260 | 
261 |     HNERec(unum, inum, ratedim, userdim, itemdim, user_metapaths, item_metapaths, trainfile, testfile, steps, delta, beta_e, beta_h, beta_p, beta_w, beta_b, reg_u, reg_v)
262 | 


--------------------------------------------------------------------------------
/paper/sec-model.tex:
--------------------------------------------------------------------------------
  1 | \section{The Proposed Approach \label{sec-model}}
  2 | 
  3 | In this section, we present a \emph{H}eterogeneous network \emph{E}mbedding based  approach for \emph{Rec}emendation, called  \emph{HERec}.
  4 | For addressing the two issues introduced in Section 1, the proposed HERec approach consists of two major components.
  5 | First, we propose a new heterogeneous network embedding method to learn the user/item embeddings from HINs.
  6 | Then, we extend the classic matrix factorization framework by incorporating the learned embeddings using a flexible set of fusion functions.
  7 | We present an overall schematic illustration of the proposed approach in Fig.~\ref{fig_framework}.
  8 | After the construction of the HINs~(Fig.~\ref{fig_framework}(a)), two major steps are presented,  namely HIN embedding~(Fig.~\ref{fig_framework}(b)) and recommendation~(Fig.~\ref{fig_framework}(c)). Next, we will present detailed illustration of the proposed approach.
  9 | 
 10 | \subsection{Heterogeneous Network Embedding}
 11 | Inspired by the recent progress on network embedding~\cite{perozzi2014deepwalk,grover2016node2vec}, we adopt the representation learning method to extract and represent useful information of HINs for recommendation. Given a HIN $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$, our goal is to learn a low-dimensional representation $\bm{e}_v \in \mathbb{R}^d$ (\aka embedding) for each node $v \in \mathcal{V}$.
 12 | The learned embeddings are expected to highly summarize informative characteristics, which are likely to be useful in recommender systems represented on HINs.
 13 | Compared with meta-path based similarity~\cite{sun2011pathsim,shi2015semantic}, it is much easier to use and integrate the learned representations in subsequent procedures.
 14 | 
 15 | However, most of the existing network embedding methods mainly focus on homogeneous networks, which are not able to effectively model heterogeneous networks. For instance, the pioneering study \emph{deepwalk}~\cite{perozzi2014deepwalk} uses random walk to generate node sequences, which
 16 | cannot discriminate nodes and edges with different types. Hence, it requires a more principled way to traverse the HINs and generate meaningful node sequences.
 17 | 
 18 | %In this section, we will present the details of heterogeneous network embedding. The basic idea is to employ a meta-path based random walk to generate a sequence of nodes, and then we maximize the co-occurrence probability of nodes to obtain the latent feature representation of nodes for one single meta-path, which reflect the structural characteristics of nodes from one perspective. Furthermore, we propose a flexible embedding fusion framework to combine the latent feature representation of nodes from different meta-paths.
 19 | 
 20 | \subsubsection{Meta-path based Random Walk}
 21 | To generate meaningful node sequences, the key is to design an effective walking strategy that is able to capture the complex semantics reflected in HINs. In the literature of HINs, meta-path is an important concept to characterize the semantic patterns for HINs~\cite{sun2011pathsim}.
 22 | Hence, we propose to use the meta-path based random walk method to generate node sequences. Giving a heterogeneous network $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$ and a meta-path $\rho : A_1 \xrightarrow{R_1} \cdots A_t \xrightarrow{R_t} A_{t+1} \cdots \xrightarrow{R_l} A_{l+1}$, the walk path is generated according to the following distribution:
 23 | 
 24 | \begin{eqnarray}\label{eq-rw}
 25 | &&P({n_{t+1}=x |n_{t}=v, \rho})\\
 26 | &=&\begin{cases}
 27 | \frac{1}{|\mathcal{N}^{A_{t+1}}(v)|}, &\text{($v, x$) $\in$ $\mathcal{E}$ and $\phi(x) = A_{t+1}$};\nonumber\\
 28 | 0,& \text{otherwise},
 29 | \end{cases}
 30 | \end{eqnarray}
 31 | where $n_t$ is the $t$-th node in the walk, $v$ has the type of $A_t$, and $\mathcal{N}^{(A_{t+1})}(v)$ is the first-order neighbor set for node $v$ with the type of $A_{t+1}$. A walk will follow the pattern of a meta-path repetitively until it reaches the pre-defined length.
 32 | 
 33 | \begin{exmp}
 34 | We still take Fig.~\ref{fig_framework}(a) as an example, which represents the heterogeneous information network of movie recommender systems.
 35 | Given a meta-path $UMU$, we can generate two sample walks (\ie node sequences) by starting with the user node of \emph{Tom}: (1) Tom$_{User}$ $\rightarrow$ The Terminator$_{Movie}$ $\rightarrow$  Mary$_{User}$, and (2) Tom$_{User}$ $\rightarrow$  Avater$_{Movie}$ $\rightarrow$  Bob$_{User}$ $\rightarrow$  The Terminator$_{Movie}$ $\rightarrow$  Mary$_{User}$. Similarly, given the meta-path $UMDMU$, we can also generate another node sequence: Tom$_{User}$ $\rightarrow$  The Terminator$_{Movie}$ $\rightarrow$  Cameron$_{Director}$  $\rightarrow$  Avater$_{Movie}$  $\rightarrow$ Mary$_{User}$. It is intuitive to see that these meta-paths can lead to meaningful node sequences corresponding to different semantic relations.
 36 | \end{exmp}
 37 | 
 38 | \begin{figure}[t]%[htbp]
 39 | \centering
 40 | \includegraphics[width=9cm]{image/random_walk.pdf}
 41 | \caption{\label{fig_randwalk}An illustrative example of the proposed meta-path based random walk. We first perform random walks guided by some selected meta-paths, and then filter the node sequences \emph{not with} the user type or item type.}
 42 | \end{figure}
 43 | 
 44 | \subsubsection{Type Constraint and Filtering}
 45 | Since our goal is to improve the recommendation performance, the main focus is to learn effective representations for users and items,
 46 | while objects with other types are of less interest in our task.
 47 | Hence, we only select meta-paths starting with \emph{user type} or \emph{item type}.
 48 | Once a node sequence has been generated using the above method, it is likely to contain nodes with different types.
 49 | We further remove the nodes with a type different from the starting type.
 50 | In this way, the final sequence will only consist of nodes with the starting type.
 51 | Applying type filtering to node sequences has two benefits. First, although node sequences are constructed using meta-paths with heterogeneous types, the final representations
 52 | are learned using homogeneous neighborhood. We embed the nodes with the same type in the same space, which relaxes the challenging goal of representing all the heterogeneous objects in a unified space.
 53 | Second, given a fixed-length window, a node is able to utilize more homogeneous neighbors that are more likely to be relevant than others with different types.
 54 | %We present an illustrative example of the proposed meta-path random walk method in Fig. \ref{fig_randwalk}.
 55 | 
 56 | \begin{exmp}
 57 | As shown in Fig.~\ref{fig_randwalk}, in order to learn effective representations for users and items, we only consider the meta-paths in which the starting type is \emph{user type} or \emph{item type}. In this way, we can derive some meta-paths, such as $UMU$, $UMDMU$ and $MUM$.
 58 |  Take the meta-path of $UMU$ as an instance. We can generate a sampled sequence  ``$u_1 \rightarrow m_1 \rightarrow u_2 \rightarrow m_2 \rightarrow u_3 \rightarrow m_2 \rightarrow u_4$" according to Eq.~\ref{eq-rw}. Once a sequence has been constructed, we further remove the nodes with a different type compared with the starting node. In this way, we finally obtain a homogeneous node sequence ``$u_1 \rightarrow u_2 \rightarrow u_3 \rightarrow u_4$".
 59 | \end{exmp}
 60 | 
 61 |  The connections between homogeneous nodes are essentially constructed via the heterogeneous neighborhood nodes. After this step, our next focus will be how to learn effective representations for homogeneous sequences.
 62 | 
 63 | \subsubsection{Optimization Objective}
 64 | Given a meta-path, we can construct the neighborhood $\mathcal{N}_u$ for node $u$ based on co-occurrence in a fixed-length window.
 65 | Following node2vec~\cite{grover2016node2vec}, we can learn the representations of nodes to optimize the following objective:
 66 | \begin{equation}
 67 | \max_f \sum_{u \in \mathcal{V}}{\log Pr(\mathcal{N}_u | f(u))},
 68 | \end{equation}
 69 | where $f: \mathcal{V} \rightarrow \mathbb{R}^d$ is a function (aiming to learn) mapping each node to $d$-dimensional feature space, and $\mathcal{N}_u  \subset \mathcal{V}$ represents the neighborhood of node $u$, \emph{w.r.t.} a specific meta-path. We can learn the embedding mapping function $f(\cdot)$ by applying stochastic gradient descent (SGD) to optimize this objective. %The embedding algorithm is shown in Algorithm 1.
 70 | A major difference between previous methods and ours lies in the construction of $\mathcal{N}_u$. Our method selects homogeneous neighbors using meta-path based random walks. %The whole algorithm framework is shown in Algorithm~\ref{alg_embedding}. %where $path\_constraint\_filter($path$)$ is the procedure performing the type constraint and filtering as discussed in the previous subsection.
 71 | 
 72 | %\begin{algorithm}[htb]
 73 | %\caption{HIN embedding algorithm for a single meta-path.}
 74 | %\label{alg_embedding}
 75 | %\begin{algorithmic}[1]
 76 | %\Require
 77 | %the heterogeneous information network $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$;
 78 | %the given meta-path $\rho$;
 79 | %the target node type $A_t$;
 80 | %the dimension of embedding $d$;
 81 | %the walk length $wl$;
 82 | %the neighborhood size $ns$;
 83 | %the number of walks per node $r$.
 84 | %
 85 | %\Ensure
 86 | %The embedding of target node type w.r.t the single meta-path, denoted by $\bm{e}$
 87 | %\State Initialize $e$ by standard normal distribution;
 88 | %\State $paths = []$;
 89 | %\For {each $v$ $\in$ $\mathcal{V}$ and $\phi(v) == A_t$}
 90 | %    \For {$i$ = 1 to $r$}
 91 | %        \State $path = []$;
 92 | %        \While {$wl > 0$}
 93 | %            \State walk to node $x$ according to Eq.~\ref{eq-rw};
 94 | %            \If {$\phi(x) == A_t$}
 95 | %                \State append node $x$ into $path$;
 96 | %                \State $wl \leftarrow wl - 1$;
 97 | %            \EndIf
 98 | %         \EndWhile
 99 | %         %\State path\_constraint\_filter($path$);
100 | %         \State Add $path$ to $paths$;
101 | %     \EndFor
102 | %\EndFor
103 | %
104 | %\State $\bm{e} = SGD(paths, d, ns)$;
105 | %\State \Return {$\bm{e}$}.
106 | %\end{algorithmic}
107 | %\end{algorithm}
108 | 
109 | 
110 | \subsubsection{Embedding Fusion}
111 | Heterogeneous network embedding provides a general way to extract useful information from HINs.
112 | For our model, given a node $v \in \mathcal{V}$, we can obtain a set of representations $\{ \bm{e}^{(l)}_v \}_{l=1}^{|\mathcal{P}|}$,
113 | where $\mathcal{P}$ denotes the set of meta-paths, and $\bm{e}^{(l)}_v$ denotes the representation of $v$ \emph{w.r.t.} the $l$-th meta-path.
114 | It requires a principled fusion way to transform node embeddings into a more suitable form that is useful to improve recommendation performance.
115 | Existing studies usually adopt a linear weighting mechanism to combine the information mined from HINs (\eg meta-path based similarities), which may not be capable of deriving effective information representations for recommendation.
116 | Hence, we propose to use a general function $g(\cdot)$, which aims to fuse the learned node embeddings for users and items:
117 | 
118 | \begin{eqnarray}\label{eq-eui}
119 | \bm{e}^{(U)}_u &\leftarrow& g(\{\bm{e}^{(l)}_u\}),\\
120 | \bm{e}^{(I)}_i &\leftarrow& g(\{\bm{e}^{(l)}_i\}),\nonumber
121 | \end{eqnarray}
122 | where $\bm{e}^{(U)}_u$ and $\bm{e}^{(I)}_i$ are the final representations for a user $u$ and an item $i$ respectively, called \emph{HIN embeddings}.
123 | Since users and items are our focus, we only learn the embeddings for users and items.
124 | At the current stage, we do not specify the form of function $g(\cdot)$. Instead, we believe that
125 | a good fusion function should be learned according to the specific task. Hence, we leave the formulation and optimization of the fusion function
126 | in our recommendation model.
127 | 
128 | 
129 | \subsection{Integrating Matrix Factorization with Fused HIN Embedding for Recommendation}
130 | Previously, we have studied how to extract and represent useful information from HINs for recommendation.
131 | With HIN embedding, we can obtain user embeddings $\{\bm{e}^{(U)}_u\}_{u \in \mathcal{U}}$ and item embeddings $\{\bm{e}^{(I)}_i\}_{i \in \mathcal{I}}$, which are further specified by a function $g(\cdot)$ that is to learn. Now we study how to utilize the learned embeddings for recommendation.
132 | 
133 | \subsubsection{Rating Predictor}
134 | We build our rating predictor based on the classic matrix factorization (MF) model~\cite{mnih2008probabilistic}, which factorizes the user-item rating matrix into user-specific and item-specific matrices. In MF, the rating of a user $u$ on an item $i$ is simply defined as follows:
135 | 
136 | \begin{equation}\label{eq-mf}
137 | \widehat{r_{u,i}} = \mathbf{x}_u^{\top}\cdot \mathbf{y}_i,
138 | \end{equation}
139 | where $\mathbf{x}_u \in \mathbb{R}^{D}$ and $\mathbf{y}_i \in \mathbb{R}^{D}$ denote the latent factors corresponding to user $u$ and item $i$.
140 | Since we have also obtained the representations for user $u$ and item $i$, we further incorporate them into the rating predictor as below
141 | 
142 | \begin{equation}\label{eq-predictor}
143 | \widehat{r_{u,i}} = \mathbf{x}_u^{\top}\cdot \mathbf{y}_i + \alpha \cdot {\bm{e}_u^{(U)}}^{\top}\cdot\bm{{\gamma}}_i^{(I)}  + \beta \cdot {\bm{{\gamma}}^{(U)}_u}^{\top}\cdot {\bm{e}_i^{(I)}},
144 | \end{equation}
145 | where $\bm{e}_u^{(U)}$ and $\bm{e}_i^{(I)}$ are the fused embeddings,
146 | $\bm{{\gamma}}^{(U)}_u$ and $\bm{{\gamma}}_i^{(I)}$ are user-specific and item-specific latent factors to pair with the HIN embeddings
147 | $\bm{e}_u^{(U)}$ and $\bm{e}_i^{(I)}$ respectively, and $\alpha$ and $\beta$ are the tuning parameters to integrate the three terms.
148 | For Eq.~\ref{eq-predictor}, we need to note two points. First, $\bm{e}_u^{(U)}$ and $\bm{e}_i^{(I)}$ are the output of function $g(\cdot)$ in Eq.~\ref{eq-eui}.
149 | We assume that the derived embeddings after transformation by function $g(\cdot)$ are applicable
150 | in MF. Second, we don't directly pair $\bm{e}_u^{(U)}$ with $\bm{e}_i^{(I)}$.
151 | Recall the proposed embedding method indeed characterizes the relatedness between objects with the same type.
152 | We incorporate new latent factors $\bm{{\gamma}}^{(U)}_u$ and $\bm{{\gamma}}_i^{(I)}$ to relax
153 | the assumption that $\bm{e}_u^{(U)}$ and $\bm{e}_i^{(I)}$ have to be in the same space, which increases the flexility to the prediction model.
154 | 
155 | \subsubsection{Setting the Fusion Function}
156 | Previously, we assume the function $g(\cdot)$ has been given in a general form.
157 | Now, we study how to set the fusion function, which transforms HIN embeddings into a form that is useful in recommender systems.
158 | We only discuss the function for fusing user embeddings, and it is similar to fuse item embeddings.
159 | We propose to use three fusion functions to integrate embeddings:
160 | 
161 | %\begin{itemize}
162 | \textbullet\ Simple linear fusion. We assume each user has the same preference for each meta-path, and therefore, we assign each meta-path with a unified weight ($\ie$ average value) for each user. Moreover, we linearly transform embeddings to target space.
163 | \begin{equation}\label{eq-slf}
164 | g(\{\bm{e}^{(l)}_u\}) = \frac{1}{|\mathcal{P}|}\sum_{l=1}^{|\mathcal{P}|}{(\mathbf{M}^{(l)} \bm{e}^{(l)}_u+\bm{b}^{(l)})},
165 | \end{equation}
166 | where $\mathcal{P}$ is the set of considered meta-paths, $\mathbf{M}^{(l)} \in \mathbb{R}^{D\times d}$ and $\bm{b}^{(l)}  \in \mathbb{R}^{D}$ are the transformation matrix and bias vector \emph{w.r.t.} the $l$-th meta-path.
167 | 
168 | \textbullet\ Personalized linear fusion. The simple linear fusion cannot model users' personalized preference over the meta-paths. So we further assign each user with a weight vector on meta-paths, representing user's personalized preference for each meta-path. It is more reasonable that each user has his/her personal interest preference in many real applications.
169 | \begin{equation}\label{eq-plf}
170 | g(\{\bm{e}^{(l)}_u\}) = \sum_{l=1}^{|\mathcal{P}|}{w^{(l)}_u (\mathbf{M}^{(l)} \bm{e}^{(l)}_u+\bm{b}^{(l)})},
171 | \end{equation}
172 | where $w^{(l)}_u$ is the preference weight of user $u$ over the $l$-th meta-path.
173 | 
174 | \textbullet\ Personalized non-linear fusion. Linear fusion has limited expressive power in modeling complex data relations.
175 | Hence, we use non-linear function to enhance the fusion ability.
176 | \begin{equation}\label{eq-pnlf}
177 | %F(\mathbf{X}^{(u)}_i) = \sigma(\sum_{l=1}^{|\mathcal{P}^{(u)}|}{p^{(u)}_{il} \sigma (\mathbf{W}^{(u)}_l\mathbf{X}_{il}+\mathbf{b}^{(u)}_l)}).
178 | g(\{\bm{e}^{(l)}_u\}) = \sigma\bigg( \sum_{l=1}^{|\mathcal{P}|} {w^{(l)}_u \sigma\big(\mathbf{M}^{(l)} \bm{e}^{(l)}_u+\bm{b}^{(l)}\big)}\bigg),
179 | \end{equation}
180 | where $\sigma(\cdot)$ is a non-linear function, \ie sigmoid function in our work. Although we only use two non-linear transformations, it is flexible to extend to multiple non-linear layers, \eg Multi-Layer Perceptrons.
181 | %\end{itemize}
182 | 
183 | \subsubsection{Model Learning}
184 | We blend the fusion function into matrix factorization framework for learning the parameters of the proposed model.
185 | The objective can be formulated as follows:
186 | 
187 | \begin{align}
188 |  \pounds &= \sum_{\langle u, i, r_{u,i}\rangle \in \mathcal{R}}{(r_{u,i} - \widehat{r_{u,i}})}^2  + \lambda \sum_u{(\|\mathbf{x}_u\|_2 + \|\mathbf{y}_i\|_2} \nonumber \\
189 |  &+ \|\bm{\gamma}^{(U)}_u\|_2 + \|\bm{\gamma}^{(I)}_i\|_2 + \|\bm{\Theta}^{(U)}\|_2 + \|\bm{\Theta}^{(I)}\|_2),
190 | \end{align}
191 | 
192 | \noindent where $\widehat{r_{u,i}}$ is the predicted rating using Eq.~\ref{eq-predictor} by the proposed model, $\lambda$ is the regularization parameter,
193 | and $\bm{\Theta}^{(U)}$ and $\bm{\Theta}^{(I)}$ are the parameters of the function $g(\cdot)$ for users and items respectively. We adopt SGD to efficiently optimize the final objective.
194 | The update of original latent factors $\{\mathbf{x}_u\}$ and $\{\mathbf{y}_i\}$ are the same as that of standard MF in Eq.~\ref{eq-mf}.
195 | The parameters of the proposed model will be updated as follows:
196 | 
197 | \begin{footnotesize}
198 | \begin{eqnarray}
199 | \bm{\Theta}_{u,l}^{(U)} &\leftarrow& \bm{\Theta}_{u,l}^{(U)} - \eta \cdot (-\alpha (r_{u,i}-\widehat{r_{u,i}})\bm{\gamma}_i^{(I)}\frac{\partial{\bm{e}_{u}^{(U)}}}{\partial{\bm{\Theta}_{u,l}^{(U)}}} + {\lambda_{\Theta}}\bm{\Theta}_{u,l}^{(U)}), \nonumber \\
200 | \\
201 | \bm{\gamma}_u^{(U)} &\leftarrow& \bm{\gamma}_u^{(U)} - \eta \cdot (-\beta (r_{u,i}-\widehat{r_{u,i}})\bm{e}_i^{(I)} + \lambda_{\gamma}\bm{\gamma}_u^{(U)}),\\
202 | \nonumber\\
203 | \bm{\Theta}_{i,l}^{(I)} &\leftarrow& \bm{\Theta}_{i,l}^{(I)} - \eta \cdot (-\beta (r_{u,i}-\widehat{r_{u,i}})\bm{\gamma}_u^{(U)}\frac{\partial{\bm{e}_{i}^{(I)}}}{\partial{\bm{\Theta}_{i,l}^{(I)}}} + {\lambda_{\Theta}}\bm{\Theta}_{i,l}^{(I)}), \nonumber \\
204 | \\
205 | \bm{\gamma}_i^{(I)} &\leftarrow& \bm{\gamma}_i^{(I)} - \eta \cdot (-\alpha (r_{u,i}-\widehat{r_{u,i}})\bm{e}_u^{(U)} + \lambda_{\gamma}\bm{\gamma}_i^{(I)}),
206 | \end{eqnarray}
207 | \end{footnotesize}
208 | where $\eta$ is the learning rate, $\lambda_{\Theta}$ is the regularization for parameters $\bm{\Theta}^{(U)}$ and $\bm{\Theta}^{(I)}$,  and $\lambda_{\gamma}$ is the regularization for parameters $\bm{\gamma}^{(U)}$ and $\bm{\gamma}^{(I)}$.  In our work, we utilize the sigmod function for non-linear transformation, and we can take advantage of the properties of sigmod function for ease of derivative calculation. It is worth noting that the symbol $\bm{\Theta}$ denotes all the parameters in the fusion function, and the calculation of $\frac{\partial{\bm{e}_{i}}}{\partial{\bm{\Theta}_{i,l}}}$ will be different for different parameters in $\bm{\Theta}$. Next, we present the detailed derivation of $\frac{\partial{\bm{e}_{i}}}{\partial{\bm{\Theta}_{i,l}}}$ for personalized non-linear fusion function
209 | 
210 | 
211 | \begin{align}\label{eq-gradient}
212 | %\centering
213 | \footnotesize
214 | &\frac{\partial{\bm{e}_{i}}}{\partial{\bm{\Theta}_{i,l}}}=\\
215 | &\begin{cases}
216 | w^{(l)}_i\sigma(Z_s)\sigma(Z_f)(1-\sigma(Z_s))(1-\sigma(Z_f))e^{(l)}_i, &\text{$\bm{\Theta} = \bm{M}$};\nonumber\\
217 | \\
218 | w^{(l)}_i\sigma(Z_s)\sigma(Z_f)(1-\sigma(Z_s))(1-\sigma(Z_f)), &\text{$\bm{\Theta} = \bm{b}$}; \nonumber\\
219 | \\
220 | \sigma(Z_s)\sigma(Z_f)(1-\sigma(Z_s)), &\text{$\bm{\Theta} = w$},
221 | \end{cases}
222 | \end{align}
223 | where $Z_s = \sum_{l=1}^{|\mathcal{P}|} {w^{(l)}_i \sigma\big(\mathbf{M}^{(l)} \bm{e}^{(l)}_i+\bm{b}^{(l)}\big)}$ and $Z_f = \mathbf{M}^{(l)} \bm{e}^{(l)}_i+\bm{b}^{(l)}$. The derivations of $\frac{\partial{\bm{e}_{i}}}{\partial{\bm{\Theta}_{i,l}}}$ can be calculated in the above way for both users and items. We omit the derivations for linear fusion functions, since it is relatively straightforward.
224 | The whole algorithm framework is shown in Algorithm~\ref{alg_herec}.
225 | In line 1, we perform HIN embedding to obtain the representations of users and items. And in lines 3-16, we adopt the SGD algorithm to optimize the parameters in the fusion function and rating function.
226 | 
227 | %\begin{eqnarray}\label{eq-rw}
228 | %&&P({n_{t+1}=x |n_{t}=v, \rho})\\
229 | %&=&\begin{cases}
230 | %\frac{1}{|\mathcal{N}^{A_{t+1}}(v)|}, &\text{($v, x$) $\in$ $\mathcal{E}$ and $\phi(x) = A_{t+1}$};\nonumber\\
231 | %0,& \text{otherwise},
232 | %\end{cases}
233 | %\end{eqnarray}
234 | 
235 | %We omit the learning of $\bm{\Theta}_{i,l}^{(I)}$ and $\bm{\gamma}_i^{(I)}$ due to the similar derivations.
236 | 
237 | \begin{algorithm}[htb]
238 | \caption{The overall learning algorithm of HERec.}
239 | \label{alg_herec}
240 | \begin{algorithmic}[1]
241 | \Require
242 | 
243 | the rating matrix $\mathcal{R}$;
244 | %The final HIN embeddings of users, items, $\bm{e}^{(U)}, \bm{e}^{(V)}$;
245 | the learning rate $\eta$;
246 | the adjustable parameters $\alpha, \beta$;
247 | the regularization parameter $\lambda$;
248 | the meta-path sets for users and items, $\mathcal{P}^{(U)}$ and $\mathcal{P}^{(I)}$.
249 | \Ensure
250 | the latent factors for users and items, $\mathbf{x}$ and $\mathbf{y}$;
251 | the latent factors to pair HIN embedding of users and items, $\bm{{\gamma}}^{(U)}$ and $\bm{{\gamma}}^{(I)}$;
252 | the parameters of the fusion function for users and items, $\bm{\Theta}^{(U)}$  and $\bm{\Theta}^{(I)}$
253 | %\For {$l = 1$ to $|\mathcal{P}^{(U)}|$}
254 | %    \State Obtain users' embeddings $\{\bm{e}^{(l)}_u\}$  based on meta-path $\mathcal{P}^{(U)}_l$; %according to Algorithm \ref{alg_embedding};
255 | %\EndFor
256 | %\For {$l = 1$ to $|\mathcal{P}^{(I)}|$}
257 | %    \State Obtain items' embeddings $\{\bm{e}^{(l)}_i\}$ based on the meta-path set $\mathcal{P}^{(I)}_l$;  %according to Algorithm \ref{alg_embedding};
258 | %\EndFor
259 | \State Obtain the set of users' embeddings $\{ \bm{e}^{(l)}_u \}_{l=1}^{|\mathcal{P}^{(U)}|}$ and the set of items' embeddings $\{ \bm{e}^{(l)}_i \}_{l=1}^{|\mathcal{P}^{(I)}|}$
260 | \State Initialize $\mathbf{x}, \mathbf{y}, \bm{{\gamma}}^{(U)}, \bm{{\gamma}}^{(I)}, \bm{\Theta}^{(U)}, \bm{\Theta}^{(I)}$ by standard normal distribution;
261 | \While {not convergence}
262 |     \State Randomly select a triple $\langle u, i, r_{u,i}\rangle \in \mathcal{R}$;
263 |     \State Update  $\mathbf{x}_u, \mathbf{y}_i$ by typical MF;
264 |     \For {$l = 1$ to $|\mathcal{P}^{(U)}|$}
265 |         \State Calculate $\frac{\partial{\bm{e}_{u}^{(U)}}}{\partial{\bm{\Theta}_{u,l}^{(U)}}}$ by Eq. 14;
266 |         \State Update  $\bm{\Theta}_{u,l}^{(U)}$ by Eq. 10;
267 |     \EndFor
268 |     \State Update $\bm{\gamma}_u^{(U)}$ by Eq. 11;
269 |      \For {$l = 1$ to $|\mathcal{P}^{(I)}|$}
270 |         \State Calculate $\frac{\partial{\bm{e}_{i}^{(I)}}}{\partial{\bm{\Theta}_{i,l}^{(I)}}}$ by Eq. 14;
271 |         \State Update  $\bm{\Theta}_{i,l}^{(I)}$ by Eq. 12;
272 |     \EndFor
273 |     \State Update $\bm{\gamma}_i^{(I)}$ by Eq. 13;
274 | \EndWhile
275 | \State \Return {$\mathbf{x}, \mathbf{y}, \bm{{\gamma}}^{(U)}, \bm{{\gamma}}^{(I)}, \bm{\Theta}^{(U)}, \bm{\Theta}^{(I)}$}.
276 | \end{algorithmic}
277 | \end{algorithm}
278 | 
279 | 
280 | \subsubsection{Complexity Analysis}
281 | HERec contains two major parts: (1) HIN embedding. %The complexity of deepwalk is $\mathcal{O}(d \cdot |V|)$, where $d$ is the embedding dimension and $|V|$ is the number of nodes in the network. Therefore it takes $\mathcal{O}(d \cdot |\mathcal{U}|)$ and $\mathcal{O}(d \cdot |\mathcal{I}|)$ to learn users' and items' embeddings according to a single meta-path, respectively. And the total complexity of HIN embedding is $\mathcal{O}(|\mathcal{P}|\cdot d \cdot (|\mathcal{U}|+|\mathcal{I}|))$ since the number of selected meta-paths is $|\mathcal{P}|$.
282 | The complexity of deepwalk is $O(\tau \cdot |\mathcal{V}| \cdot t \cdot w \cdot (d + d \cdot \log|\mathcal{V}|))$, where $\tau$ is the number of random walk, $t$ is the length of random walk, $w$ is the size of neighbor, $d$ is the embedding dimension and $|\mathcal{V}|$ is the number of nodes in the network~\cite{chen2017harp}. Hence, the total complexity of HIN embedding can be abbreviated as $O(|\mathcal{P}| \cdot \tau \cdot t \cdot w \cdot d \cdot (|\mathcal{U}| \cdot \log|\mathcal{U}|+|\mathcal{I}| \cdot \log|\mathcal{I}|))$ since the number of selected meta-paths is $|\mathcal{P}|$ and the number of users and items is $|\mathcal{U}|$ and $|\mathcal{I}|$, respectively.
283 | It is worth noting that HIN embedding can be easily trained in parallel and we will implement it using a multi-thread mode in order to improve the efficiency of model. (2) Matrix factorization. For each triplet $\langle u, i, r_{u,i}\rangle$, updating $\bm{x}_u$, $\bm{y}_i$, $\bm{\gamma}_u^{(U)}$, $\bm{\gamma}_i^{(I)}$ takes $\mathcal{O}(D)$ time, where $D$ is the number of latent factors. And  updating $\bm{\Theta}_u^{(U)}$, $\bm{\Theta}_i^{(I)}$ takes $\mathcal{O}(|\mathcal{P}| \cdot D \cdot d)$ time to learn the transformation matrices  $\mathbf{M}$ for all meta-paths.
284 | In the proposed approach, $|\mathcal{P}|$ is generally small, and $d$ and $D$ are at most several hundreds, which makes the proposed method efficient in large datasets\footnote{The training time in Douban Movie dataset is around 196 seconds per iteration, and the test time is around 2 seconds.  The training time in Douban Book dataset is around 200 seconds per iteration, and the test time is around 2.5 seconds. The training time in Yelp dataset is around 52 seconds per iteration, and the test time is around 0.6 seconds.}. Specially, SGD has very good practice performance, and we have found that it has a fast convergence rate on our datasets. Concretely, about 40 to 60 iterations are required for dense datasets (\ie Douban Movie and Book), while about 20 iterations are required for sparse datasets (\ie Yelp).
285 | 
286 | 


--------------------------------------------------------------------------------
/paper/reference.bib:
--------------------------------------------------------------------------------
  1 | %% This BibTeX bibliography file was created using BibDesk.
  2 | %% http://bibdesk.sourceforge.net/
  3 | 
  4 | 
  5 | %% Created for shine at 2017-09-09 22:31:06 +0800
  6 | 
  7 | 
  8 | %% Saved with string encoding Unicode (UTF-8)
  9 | 
 10 | @inproceedings{cao2015grarep,
 11 |   title={Grarep: Learning graph representations with global structural information},
 12 |   author={Cao, Shaosheng and Lu, Wei and Xu, Qiongkai},
 13 |   booktitle={Proceedings of the 24th ACM International on Conference on Information and Knowledge Management},
 14 |   pages={891--900},
 15 |   year={2015}
 16 | }
 17 | 
 18 | @inproceedings{cao2016deep,
 19 |   title={Deep Neural Networks for Learning Graph Representations.},
 20 |   author={Cao, Shaosheng and Lu, Wei and Xu, Qiongkai},
 21 |   booktitle={Proceedings of the 30th AAAI Conference on Artificial Intelligence},
 22 |   pages={1145--1152},
 23 |   year={2016}
 24 | }
 25 | 
 26 | @incollection{celma2010music,
 27 | 	Author = {Celma, Oscar},
 28 | 	Booktitle = {Music Recommendation and Discovery},
 29 |     Pages = {43--85},
 30 | 	Title = {Music recommendation},
 31 | 	Year = {2010}}
 32 | 
 33 | @inproceedings{chang2015heterogeneous,
 34 | 	Author = {Chang, Shiyu and Han, Wei and Tang, Jiliang and Qi, Guo Jun and Aggarwal, Charu C and Huang, Thomas S},
 35 | 	Booktitle = {Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
 36 |     Pages = {119--128},
 37 | 	Title = {Heterogeneous network embedding via deep architectures},
 38 | 	Year = {2015}}
 39 | 
 40 | @article{chen2012svdfeature,
 41 |   title={Svdfeature: a toolkit for feature-based collaborative filtering},
 42 |   author={Chen, Tianqi and Zhang, Weinan and Lu, Qiuxia and Chen, Kailong and Zheng, Zhao and Yu, Yong},
 43 |   journal={Journal of Machine Learning Research},
 44 |   volume={13},
 45 |   pages={3619--3622},
 46 |   year={2012}
 47 | }
 48 | 
 49 | @article{chen2017harp,
 50 |   title={HARP: Hierarchical Representation Learning for Networks},
 51 |   author={Chen, Haochen and Perozzi, Bryan and Hu, Yifan and Skiena, Steven},
 52 |   journal={arXiv preprint arXiv:1706.07845},
 53 |   year={2017}
 54 | }
 55 | 
 56 | @inproceedings{chen2017task,
 57 | 	Author = {Chen, Ting and Sun, Yizhou},
 58 | 	Booktitle = {Proceedings of the 10th ACM International Conference on Web Search and Data Mining},
 59 | 	Pages = {295--304},
 60 | 	Title = {Task-Guided and Path-Augmented Heterogeneous Network Embedding for Author Identification},
 61 | 	Year = {2017}
 62 | }
 63 | 
 64 | @article{cui2017survey,
 65 |   title={A Survey on Network Embedding},
 66 |   author={Cui, Peng and Wang, Xiao and Pei, Jian and Zhu, Wenwu},
 67 |   journal={arXiv preprint arXiv:1711.08752},
 68 |   year={2017}
 69 | }
 70 | 
 71 | @article{cui2018general,
 72 |   title={General Knowledge Embedded Image Representation Learning},
 73 |   author={Cui, Peng and Liu, Shaowei and Zhu, Wenwu},
 74 |   journal={IEEE Transactions on Multimedia},
 75 |   volume={20},
 76 |   number={1},
 77 |   pages={198--207},
 78 |   year={2018},
 79 |   publisher={IEEE}
 80 | }
 81 | 
 82 | @inproceedings{dias2008value,
 83 |   title={The value of personalised recommender systems to e-business: a case study},
 84 |   author={Dias, M Benjamin and Locher, Dominique and Li, Ming and El-Deredy, Wael and Lisboa, Paulo JG},
 85 |   booktitle={Proceedings of the 2nd ACM Conference on Recommender Systems},
 86 |   pages={291--294},
 87 |   year={2008}
 88 | }
 89 | 
 90 | @inproceedings{dong2017metapath2vec,
 91 | 	Author = {Dong, Yuxiao and Chawla, Nitesh V and Swami, Ananthram},
 92 | 	Booktitle = {Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
 93 | 	Pages = {135--144},
 94 | 	Title = {metapath2vec: Scalable Representation Learning for Heterogeneous Networks},
 95 | 	Year = {2017}}
 96 | 
 97 | @inproceedings{feng2012incorporating,
 98 | 	Author = {Feng, Wei and Wang, Jianyong},
 99 | 	Booktitle = {Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
100 | 	Pages = {1276--1284},
101 | 	Title = {Incorporating heterogeneous information for personalized tag recommendation in social tagging systems},
102 | 	Year = {2012}}
103 | 
104 | @inproceedings{fu2017hin2vec,
105 |   title={HIN2Vec: Explore Meta-paths in Heterogeneous Information Networks for Representation Learning},
106 |   author={Fu, Taoyang and Lee, WangChien and Lei, Zhen},
107 |   booktitle={Proceedings of the 2017 ACM on Conference on Information and Knowledge Management},
108 |   pages={1797--1806},
109 |   year={2017},
110 |   organization={ACM}
111 | }
112 | 
113 | @inproceedings{gantner2010learning,
114 |   title={Learning attribute-to-feature mappings for cold-start recommendations},
115 |   author={Gantner, Zeno and Drumond, Lucas and Freudenthaler, Christoph and Rendle, Steffen and Schmidt-Thieme, Lars},
116 |   booktitle={Proceedings of the 10th IEEE International Conference on Data Mining series},
117 |   pages={176--185},
118 |   year={2010}
119 | }
120 | 
121 | @inproceedings{grover2016node2vec,
122 | 	Author = {Grover, Aditya and Leskovec, Jure},
123 | 	Booktitle = {Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
124 | 	Pages = {855--864},
125 | 	Title = {node2vec: scalable feature learning for networks},
126 | 	Year = {2016}}
127 | 
128 | @incollection{he2010social,
129 | 	Author = {He, Jianming and Chu, Wesley W},
130 | 	Booktitle = {Data Mining for Social Network Data},
131 | 	Pages = {47--74},
132 | 	Publisher = {Springer},
133 | 	Title = {A social network-based recommender system (SNRS)},
134 | 	Year = {2010}}
135 | 
136 | @inproceedings{he2016vbpr,
137 |   title={VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback.},
138 |   author={He, Ruining and McAuley, Julian},
139 |   booktitle={Proceedings of the 30th AAAI Conference on Artificial Intelligence},
140 |   pages={144--150},
141 |   year={2016}
142 | }
143 | 
144 | @article{hoff2002latent,
145 |   title={Latent space approaches to social network analysis},
146 |   author={Hoff, Peter D and Raftery, Adrian E and Handcock, Mark S},
147 |   journal={Journal of the American Statistical Association},
148 |   volume={97},
149 |   number={460},
150 |   pages={1090--1098},
151 |   year={2002}
152 | }
153 | 
154 | @inproceedings{hong2013co,
155 |   title={Co-factorization machines: modeling user interests and predicting individual decisions in twitter},
156 |   author={Hong, Liangjie and Doumith, Aziz S and Davison, Brian D},
157 |   booktitle={Proceedings of the 6th ACM International Conference on Web Search and Data Mining},
158 |   pages={557--566},
159 |   year={2013}
160 | }
161 | 
162 | @article{kipf2016semi,
163 |   title={Semi-supervised classification with graph convolutional networks},
164 |   author={Kipf, Thomas N and Welling, Max},
165 |   journal={arXiv preprint arXiv:1609.02907},
166 |   year={2016}
167 | }
168 | 
169 | @article{koren2009matrix,
170 |   title={Matrix factorization techniques for recommender systems},
171 |   author={Koren, Yehuda and Bell, Robert and Volinsky, Chris},
172 |   journal={Computer},
173 |   volume={42},
174 |   number={8},
175 |   year={2009}
176 | }
177 | 
178 | @incollection{koren2015advances,
179 | 	Author = {Koren, Yehuda and Bell, Robert},
180 | 	Booktitle = {Recommender systems handbook},
181 | 	Pages = {77--118},
182 | 	Title = {Advances in collaborative filtering},
183 | 	Year = {2015}}
184 | 
185 | @inproceedings{krohn2012multi,
186 |   title={Multi-relational matrix factorization using bayesian personalized ranking for social network data},
187 |   author={Krohn-Grimberghe, Artus and Drumond, Lucas and Freudenthaler, Christoph and Schmidt-Thieme, Lars},
188 |   booktitle={Proceedings of the 5th ACM International Conference on Web Search and Data Mining},
189 |   pages={173--182},
190 |   year={2012}
191 | }
192 | 
193 | @inproceedings{lao2010fast,
194 | 	Author = {Lao, Ni and Cohen, William W},
195 | 	Booktitle = {Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
196 | 	Pages = {881--888},
197 | 	Title = {Fast query execution for retrieval models based on path-constrained random walks},
198 | 	Year = {2010}}
199 | 
200 | @article{lao2010relational,
201 |   title={Relational retrieval using a combination of path-constrained random walks},
202 |   author={Lao, Ni and Cohen, William W},
203 |   journal={Machine Learning},
204 |   volume={81},
205 |   number={1},
206 |   pages={53--67},
207 |   year={2010}
208 | }
209 | 
210 | @inproceedings{liang2016factorization,
211 |   title={Factorization meets the item embedding: Regularizing matrix factorization with item co-occurrence},
212 |   author={Liang, Dawen and Altosaar, Jaan and Charlin, Laurent and Blei, David M},
213 |   booktitle={Proceedings of the 10th ACM Conference on Recommender Systems},
214 |   pages={59--66},
215 |   year={2016}
216 | }
217 | 
218 | @inproceedings{ling2014ratings,
219 |   title={Ratings meet reviews, a combined approach to recommend},
220 |   author={Ling, Guang and Lyu, Michael R and King, Irwin},
221 |   booktitle={Proceedings of the 8th ACM Conference on Recommender Systems},
222 |   pages={105--112},
223 |   year={2014}
224 | }
225 | 
226 | @inproceedings{liu2010personalized,
227 | 	Author = {Liu, Jiahui and Dolan, Peter and Pedersen, Elin R{\o}nby},
228 | 	Booktitle = {Proceedings of the 15th International Conference on Intelligent User Interfaces},
229 | 	Pages = {31--40},
230 | 	Title = {Personalized news recommendation based on click behavior},
231 | 	Year = {2010}}
232 | 
233 | @inproceedings{luo2011cauchy,
234 | 	Author = {Luo, Dijun and Nie, Feiping and Huang, Heng and Ding, Chris H},
235 | 	Booktitle = {Proceedings of the 28th International Conference on Machine Learning},
236 | 	Pages = {553--560},
237 | 	Title = {Cauchy graph embedding},
238 | 	Year = {2011}}
239 | 
240 | @inproceedings{luo2014hete,
241 | 	Author = {Luo, Chen and Pang, Wei and Wang, Zhe and Lin, Chenghua},
242 | 	Booktitle = {Proceedings of the 14th IEEE International Conference on Data Mining series},
243 | 	Pages = {917--922},
244 | 	Title = {Hete-cf: Social-based collaborative filtering recommendation using heterogeneous relations},
245 | 	Year = {2014}}
246 | 
247 | @inproceedings{ma2011recommender,
248 | 	Author = {Ma, Hao and Zhou, Dengyong and Liu, Chao and Lyu, Michael R and King, Irwin},
249 | 	Booktitle = {Proceedings of the fourth ACM International Conference on Web Search and Data Mining},
250 | 	Pages = {287--296},
251 | 	Title = {Recommender systems with social regularization},
252 | 	Year = {2011}}
253 | 
254 | @article{mikolov2013efficient,
255 |   title={Efficient estimation of word representations in vector space},
256 |   author={Mikolov, Tomas and Chen, Kai and Corrado, Greg and Dean, Jeffrey},
257 |   journal={arXiv preprint arXiv:1301.3781},
258 |   year={2013}
259 | }
260 | 
261 | @inproceedings{mnih2008probabilistic,
262 | 	Author = {Mnih, Andriy and Salakhutdinov, Ruslan R},
263 | 	Booktitle = {Advances in Neural Information Processing Systems},
264 | 	Pages = {1257--1264},
265 | 	Title = {Probabilistic matrix factorization},
266 | 	Year = {2008}
267 | }
268 | 
269 | @inproceedings{ou2013comparing,
270 |   title={Comparing apples to oranges: a scalable solution with heterogeneous hashing},
271 |   author={Ou, Mingdong and Cui, Peng and Wang, Fei and Wang, Jun and Zhu, Wenwu and Yang, Shiqiang},
272 |   booktitle={Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining},
273 |   pages={230--238},
274 |   year={2013},
275 |   organization={ACM}
276 | }
277 | 
278 | @article{pan2016tri,
279 |   title={Tri-party deep network representation},
280 |   author={Pan, Shirui and Wu, Jia and Zhu, Xingquan and Zhang, Chengqi and Wang, Yang},
281 |   journal={Network},
282 |   volume={11},
283 |   number={9},
284 |   pages={12},
285 |   year={2016}
286 | }
287 | 
288 | 
289 | @inproceedings{perozzi2014deepwalk,
290 | 	Author = {Perozzi, Bryan and Al-Rfou, Rami and Skiena, Steven},
291 | 	Booktitle = {Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
292 | 	Pages = {701--710},
293 | 	Title = {Deepwalk: Online learning of social representations},
294 | 	Year = {2014}}
295 | 
296 | @inproceedings{rendle2010factorization,
297 |   title={Factorization machines},
298 |   author={Rendle, Steffen},
299 |   booktitle={Proceedings of the 10th IEEE International Conference on Data Mining series},
300 |   pages={995--1000},
301 |   year={2010}
302 | }
303 | 
304 | @article{rendle2012factorization,
305 | 	Author = {Rendle, Steffen},
306 | 	Journal = {ACM Transactions on Intelligent Systems and Technology},
307 | 	Number = {3},
308 | 	Pages = {57},
309 | 	Title = {Factorization machines with libfm},
310 | 	Volume = {3},
311 | 	Year = {2012}}
312 | 
313 | @incollection{schafer2007collaborative,
314 | 	Author = {Schafer, J Ben and Frankowski, Dan and Herlocker, Jon and Sen, Shilad},
315 | 	Booktitle = {The Adaptive Web},
316 | 	Pages = {291--324}
317 | 	Title = {Collaborative filtering recommender systems},
318 | 	Year = {2007}}
319 | 
320 | @inproceedings{sedhain2017low,
321 |   title={Low-Rank Linear Cold-Start Recommendation from Social Data.},
322 |   author={Sedhain, Suvash and Menon, Aditya Krishna and Sanner, Scott and Xie, Lexing and Braziunas, Darius},
323 |   booktitle={Proceedings of the 31st AAAI Conference on Artificial Intelligence},
324 |   pages={1502--1508},
325 |   year={2017}
326 | }
327 | 
328 | @inproceedings{shi2012adaptive,
329 |   title={Adaptive diversification of recommendation results via latent factor portfolio},
330 |   author={Shi, Yue and Zhao, Xiaoxue and Wang, Jun and Larson, Martha and Hanjalic, Alan},
331 |   booktitle={Proceedings of the 35th international ACM SIGIR Conference on Research and Development in Information Retrieval},
332 |   pages={175--184},
333 |   year={2012},
334 | }
335 | 
336 | @article{shi2014hetesim,
337 | 	Author = {Shi, Chuan and Kong, Xiangnan and Huang, Yue and P. S. Yu and Wu, Bin},
338 | 	Journal = {IEEE Transactions on Knowledge and Data Engineering},
339 | 	Number = {10},
340 | 	Pages = {2479--2492},
341 | 	Publisher = {IEEE},
342 | 	Title = {Hetesim: A general framework for relevance measure in heterogeneous networks},
343 | 	Volume = {26},
344 | 	Year = {2014}}
345 | 
346 | @inproceedings{shi2015semantic,
347 | 	Author = {Shi, Chuan and Zhang, Zhiqiang and Luo, Ping and P. S. Yu and Yue, Yading and Wu, Bin},
348 | 	Booktitle = {Proceedings of the 24th ACM International on Conference on Information and Knowledge Management},
349 | 	Pages = {453--462},
350 | 	Title = {Semantic path based personalized recommendation on weighted heterogeneous information networks},
351 | 	Year = {2015}}
352 | 
353 | @article{shi2016integrating,
354 | 	Author = {Shi, Chuan and Liu, Jian and Zhuang, Fuzhen and P. S. Yu and Wu, Bin},
355 | 	Date-Modified = {2017-09-09 14:31:06 +0000},
356 | 	Journal = {Knowledge and Information System},
357 | 	Number = {3},
358 | 	Pages = {835--859},
359 | 	Title = {Integrating heterogeneous information via flexible regularization framework for recommendation},
360 | 	Volume = {49},
361 | 	Year = {2016}}
362 | 
363 | @incollection{shi2017heterogeneous,
364 | 	Author = {Shi, Chuan and P. S. Yu},
365 | 	Booktitle = {Data Analytics},
366 | 	Pages = {1--227},
367 | 	Publisher = {Springer},
368 | 	Title = {Heterogeneous Information Network Analysis and Applications},
369 | 	Year = {2017}}
370 | 
371 | @article{shi2017survey,
372 | 	Author = {Shi, Chuan and Li, Yitong and Zhang, Jiawei and Sun, Yizhou and P. S. Yu},
373 | 	Journal = {IEEE Transactions on Knowledge and Data Engineering},
374 | 	Number = {1},
375 | 	Pages = {17--37},
376 | 	Publisher = {IEEE},
377 | 	Title = {A survey of heterogeneous information network analysis},
378 | 	Volume = {29},
379 | 	Year = {2017}}
380 | 
381 | @inproceedings{sedhain2014social,
382 |   title={Social collaborative filtering for cold-start recommendations},
383 |   author={Sedhain, Suvash and Sanner, Scott and Braziunas, Darius and Xie, Lexing and Christensen, Jordan},
384 |   booktitle={Proceedings of the 8th ACM Conference on Recommender systems},
385 |   pages={345--348},
386 |   year={2014},
387 |   organization={ACM}
388 | }
389 | 
390 | @inproceedings{srebro2003weighted,
391 | 	Author = {Srebro, Nathan and Jaakkola, Tommi},
392 | 	Booktitle = {Proceedings of the 20th International Conference on Machine Learning},
393 | 	Pages = {720--727},
394 | 	Title = {Weighted low-rank approximations},
395 | 	Year = {2003}}
396 | 
397 | @inproceedings{sunmrlr,
398 |   title={MRLR: Multi-level Representation Learning for Personalized Ranking in Recommendation},
399 |   author={Sun, Zhu and Yang, Jie and Zhang, Jie and Bozzon, Alessandro and Chen, Yu and Xu, Chi},
400 |   booktitle={Proceedings of the 26th International Joint Conference on Artificial Intelligence},
401 |   pages={2807--2813},
402 |   year={2017}
403 | }
404 | 
405 | @inproceedings{sun2009ranking,
406 |   title={Ranking-based clustering of heterogeneous information networks with star network schema},
407 |   author={Sun, Yizhou and Yu, Yintao and Han, Jiawei},
408 |   booktitle={Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
409 |   pages={797--806},
410 |   year={2009}
411 | }
412 | 
413 | @article{sun2011pathsim,
414 | 	Author = {Sun, Yizhou and Han, Jiawei and Yan, Xifeng and Yu, Philip S and Wu, Tianyi},
415 | 	Journal = {Proceedings of the 37th Very Large Data Bases},
416 | 	Number = {11},
417 | 	Pages = {992--1003},
418 | 	Title = {Pathsim: Meta path-based top-k similarity search in heterogeneous information networks},
419 | 	Volume = {4},
420 | 	Year = {2011}}
421 | 
422 | @article{sun2012mining,
423 | 	Author = {Sun, Yizhou and Han, Jiawei},
424 | 	Journal = {Synthesis Lectures on Data Mining and Knowledge Discovery},
425 | 	Title = {Mining heterogeneous information networks: principles and methodologies},
426 |     Pages = {1--159},
427 | 	Year = {2012}}
428 | 
429 | @article{sun2013mining,
430 |   title={Mining heterogeneous information networks: a structural analysis approach},
431 |   author={Sun, Yizhou and Han, Jiawei},
432 |   journal={ACM SIGKDD Explorations Newsletter},
433 |   volume={14},
434 |   number={2},
435 |   pages={20--28},
436 |   year={2013}
437 | }
438 | 
439 | @article{Sun2013Meta,
440 | 	Author = {Sun, Yizhou and Han, Jiawei},
441 | 	Journal = {Journal of Tsinghua University(Science and Technology)},
442 | 	Number = {4},
443 | 	Pages = {329-338},
444 | 	Title = {Meta-path-based search and mining in heterogeneous information networks},
445 | 	Volume = {18},
446 | 	Year = {2013}}
447 | 
448 | @inproceedings{tang2015line,
449 | 	Author = {Tang, Jian and Qu, Meng and Wang, Mingzhe and Zhang, Ming and Yan, Jun and Mei, Qiaozhu},
450 | 	Booktitle = {Proceedings of the 24th International Conference on World Wide Web},
451 | 	Pages = {1067--1077},
452 | 	Title = {Line: Large-scale information network embedding},
453 | 	Year = {2015}}
454 | 
455 | @inproceedings{tang2015pte,
456 |   title={Pte: Predictive text embedding through large-scale heterogeneous text networks},
457 |   author={Tang, Jian and Qu, Meng and Mei, Qiaozhu},
458 |   booktitle={Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
459 |   pages={1165--1174},
460 |   year={2015}
461 | }
462 | 
463 | @inproceedings{tu2016max,
464 |   title={Max-Margin DeepWalk: Discriminative Learning of Network Representation.},
465 |   author={Tu, Cunchao and Zhang, Weicheng and Liu, Zhiyuan and Sun, Maosong},
466 |   booktitle={Proceedings of the 25th International Joint Conference on Artificial Intelligence},
467 |   pages={3889--3895},
468 |   year={2016}
469 | }
470 | 
471 | @inproceedings{vaz2012improving,
472 | 	Author = {Vaz, Paula Cristina and Martins de Matos, David and Martins, Bruno and Calado, Pavel},
473 | 	Booktitle = {Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries},
474 | 	Pages = {387--388},
475 | 	Title = {Improving a hybrid literary book recommendation system through author ranking},
476 | 	Year = {2012}}
477 | 
478 | @inproceedings{wang2016structural,
479 | 	Author = {Wang, Daixin and Cui, Peng and Zhu, Wenwu},
480 | 	Booktitle = {Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
481 | 	Pages = {1225--1234},
482 | 	Title = {Structural deep network embedding},
483 | 	Year = {2016}}
484 | 
485 | @inproceedings{wei2017cross,
486 |   title={Cross View Link Prediction by Learning Noise-resilient Representation Consensus},
487 |   author={Wei, Xiaokai and Xu, Linchuan and Cao, Bokai and Yu, Philip S},
488 |   booktitle={Proceedings of the 26th International Conference on World Wide Web},
489 |   pages={1611--1619},
490 |   year={2017}
491 | }
492 | 
493 | @article{willmott2005advantages,
494 |   title={Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance},
495 |   author={Willmott, Cort J and Matsuura, Kenji},
496 |   journal={Climate Research},
497 |   volume={30},
498 |   number={1},
499 |   pages={79--82},
500 |   year={2005}
501 | }
502 | 
503 | @inproceedings{xu2017embedding,
504 |   title={Embedding of Embedding (EOE): Joint Embedding for Coupled Heterogeneous Networks},
505 |   author={Xu, Linchuan and Wei, Xiaokai and Cao, Jiannong and Yu, Philip S},
506 |   booktitle={Proceedings of the 10th ACM International Conference on Web Search and Data Mining},
507 |   pages={741--749},
508 |   year={2017}
509 | }
510 | 
511 | @article{yan2007graph,
512 |   title={Graph embedding and extensions: A general framework for dimensionality reduction},
513 |   author={Yan, Shuicheng and Xu, Dong and Zhang, Benyu and Zhang, Hong-Jiang and Yang, Qiang and Lin, Stephen},
514 |   journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
515 |   volume={29},
516 |   number={1},
517 |   pages={40--51},
518 |   year={2007}
519 | }
520 | 
521 | @inproceedings{yang2015network,
522 |   title={Network Representation Learning with Rich Text Information.},
523 |   author={Yang, Cheng and Liu, Zhiyuan and Zhao, Deli and Sun, Maosong and Chang, Edward Y},
524 |   booktitle={Proceedings of the 24th International Joint Conference on Artificial Intelligence},
525 |   pages={2111--2117},
526 |   year={2015}
527 | }
528 | 
529 | @inproceedings{ye2011exploiting,
530 |   title={Exploiting geographical influence for collaborative point-of-interest recommendation},
531 |   author={Ye, Mao and Yin, Peifeng and Lee, Wang-Chien and Lee, Dik-Lun},
532 |   booktitle={Proceedings of the 34th international ACM SIGIR conference on Research and Development in Information Retrieval},
533 |   pages={325--334},
534 |   year={2011}
535 | }
536 | 
537 | @inproceedings{yin2013lcars,
538 | 	Author = {Yin, Hongzhi and Sun, Yizhou and Cui, Bin and Hu, Zhiting and Chen, Ling},
539 | 	Booktitle = {Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
540 | 	Pages = {221--229},
541 | 	Title = {Lcars: a location-content-aware recommender system},
542 | 	Year = {2013}}
543 | 
544 | @article{yu2013collaborative,
545 | 	Author = {Yu, Xiao and Ren, Xiang and Gu, Quanquan and Sun, Yizhou and Han, Jiawei},
546 | 	Journal = {Proceedings of 1st International Joint Conference on Artificial Intelligence Workshop on Heterogeneous Information Network Analysis},
547 | 	Title = {Collaborative filtering with entity similarity regularization in heterogeneous information networks},
548 | 	Year = {2013}}
549 | 
550 | @inproceedings{yu2013recommendation,
551 |   title={Recommendation in heterogeneous information networks with implicit user feedback},
552 |   author={Yu, Xiao and Ren, Xiang and Sun, Yizhou and Sturt, Bradley and Khandelwal, Urvashi and Gu, Quanquan and Norick, Brandon and Han, Jiawei},
553 |   booktitle={Proceedings of the 7th ACM conference on Recommender Systems},
554 |   pages={347--350},
555 |   year={2013}
556 | }
557 | 
558 | @inproceedings{yu2014personalized,
559 |   title={Personalized entity recommendation: A heterogeneous information network approach},
560 |   author={Yu, Xiao and Ren, Xiang and Sun, Yizhou and Gu, Quanquan and Sturt, Bradley and Khandelwal, Urvashi and Norick, Brandon and Han, Jiawei},
561 |   booktitle={Proceedings of the 7th ACM International Conference on Web Search and Data Mining},
562 |   pages={283--292},
563 |   year={2014}
564 | }
565 | 
566 | @inproceedings{zhang2016collaborative,
567 |   title={Collaborative knowledge base embedding for recommender systems},
568 |   author={Zhang, Fuzheng and Yuan, Nicholas Jing and Lian, Defu and Xie, Xing and Ma, Wei-Ying},
569 |   booktitle={Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
570 |   pages={353--362},
571 |   year={2016}
572 | }
573 | 
574 | @inproceedings{zhang2016homophily,
575 |   title={Homophily, Structure, and Content Augmented Network Representation Learning},
576 |   author={Zhang, Daokun and Yin, Jie and Zhu, Xingquan and Zhang, Chengqi},
577 |   booktitle={Proceedings of the 16th IEEE International Conference on Data Mining series},
578 |   pages={609--618},
579 |   year={2016}
580 | }
581 | 
582 | @inproceedings{zhao2017meta,
583 |   title={Meta-graph based recommendation fusion over heterogeneous information networks},
584 |   author={Zhao, Huan and Yao, Quanming and Li, Jianda and Song, Yangqiu and Lee, Dik Lun},
585 |   booktitle={Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
586 |   pages={635--644},
587 |   year={2017},
588 |   organization={ACM}
589 | }
590 | 
591 | @inproceedings{zheng2016dual,
592 | 	Author = {Zheng, Jing and Liu, Jian and Shi, Chuan and Zhuang, Fuzhen and Li, Jingzhi and Wu, Bin},
593 | 	Booktitle = {Proceedings of the 20th Pacific-Asia Conference on Knowledge Discovery and Data Mining},
594 | 	Pages = {542--554},
595 | 	Title = {Dual similarity regularization for recommendation},
596 | 	Year = {2016}}
597 | 
598 | @inproceedings{zheng2017joint,
599 |   title={Joint deep modeling of users and items using reviews for recommendation},
600 |   author={Zheng, Lei and Noroozi, Vahid and Yu, Philip S},
601 |   booktitle={Proceedings of the 10th ACM International Conference on Web Search and Data Mining},
602 |   pages={425--434},
603 |   year={2017}
604 | }
605 | 
606 | @article{zheng2017recommendation,
607 | 	Author = {Zheng, Jing and Liu, Jian and Shi, Chuan and Zhuang, Fuzhen and Li, Jingzhi and Wu, Bin},
608 | 	Journal = {International Journal of Data Science and Analytics},
609 | 	Number = {1},
610 | 	Pages = {35--48},
611 | 	Title = {Recommendation in heterogeneous information network via dual similarity regularization},
612 | 	Volume = {3},
613 | 	Year = {2017}}
614 | 
615 | @inproceedings{zhu2011social,
616 | 	Author = {Zhu, Jianke and Ma, Hao and Chen, Chun and Bu, Jiajun},
617 | 	Booktitle = {Proceedings of the 25th AAAI Conference on Artificial Intelligence},
618 | 	Pages = {158--163},
619 | 	Title = {Social Recommendation Using Low-Rank Semidefinite Program},
620 | 	Year = {2011}}
621 | 


--------------------------------------------------------------------------------
/paper/sec-exp.tex:
--------------------------------------------------------------------------------
  1 | \section{Experiments \label{sec-exp}}
  2 | %In this section, we will verify the superiority of HERec by performing experiments on three real datasets and comparing it to the state-of-the-arts.
  3 | %In this section, we construct the experimental evaluation and present the  result analysis.
  4 | In this section, we will demonstrate the effectiveness of HERec by performing experiments on three real datasets compared to the state-of-the-art recommendation methods.
  5 | 
  6 | \begin{table*}[t]%[htbp]
  7 | \centering
  8 | %\scriptsize
  9 | \caption{\label{tab_Data} Statistics of the three datasets.}
 10 | {
 11 | \begin{tabular}{|c||c|c|c|c|c|c|c|}
 12 | \hline
 13 | {Dataset} & {Relations} & {Number} & {Number} & {Number} & {Ave. degrees} & {Ave. degrees} & \multirow{2}{*}{Meta-paths}\\
 14 | {(Density)} & {(A-B)} & {of A} & {of B} & {of (A-B)}  & {of A} & {of B} &  \multirow{2}{*}{}\\
 15 | \hline
 16 | \hline
 17 | \multirow{6}{*}{Douban Movie} & {User-Movie} & {13,367} & {12,677} & {1,068,278} & {79.9} & {84.3} & {} \\
 18 | \cline{2-7}
 19 | \multirow{6}{*}{(0.63\%)} &  {User-User} & {2,440} & {2,294} & {4,085} & {1.7} & {1.8} & {UMU, MUM} \\
 20 | \cline{2-7}
 21 | \multirow{6}{*}{} &{User-Group} & {13,337} & {2,753} & {570,047} & {42.7} & {207.1} & {UMDMU, MDM}\\
 22 | \cline{2-7}
 23 | \multirow{6}{*}{} & {Movie-Director} & {10,179} & {2,449} & {11,276} & {1.1} & {4.6} & {UMAMU, MAM} \\
 24 | \cline{2-7}
 25 | \multirow{6}{*}{} & {Movie-Actor} & {11,718} & {6,311} & {33,587} & {2.9} & {5.3} & {UMTMU, MTM}\\
 26 | \cline{2-7}
 27 | \multirow{6}{*}{} & {Movie-Type} & {12,678} & {38} & {27,668} & {2.2} & {728.1} & {}\\
 28 | \hline
 29 | \hline
 30 | \multirow{5}{*}{Douban Book} & {User-Book} & {13,024} & {22,347} & {792,026} & {60.8} & {35.4} & \multirow{2}{*}{UBU, BUB} \\
 31 | \cline{2-7}
 32 | \multirow{5}{*}{(0.27\%)} & {User-User} & {12,748} & {12,748} & {169,150} & {13.3} & {13.3} & \multirow{2}{*}{UBPBU, BPB} \\
 33 | \cline{2-7}
 34 | \multirow{5}{*}{} & {Book-Author} & {21,907} & {10,805} & {21,905} & {1.0} & {2.0} & \multirow{2}{*}{UBYBU, BYB} \\
 35 | \cline{2-7}
 36 | \multirow{5}{*}{} & {Book-Publisher} & {21,773} & {1,815} & {21,773} & {1.0} & {11.9} & \multirow{2}{*}{UBABU} \\
 37 | \cline{2-7}
 38 | \multirow{5}{*}{} & {Book-Year} & {21,192} & {64} & {21,192} & {1.0} & {331.1} & \multirow{2}{*}{} \\
 39 | \hline
 40 | \hline
 41 | \multirow{5}{*}{Yelp} & {User-Business} & {16,239} & {14,284} & {198,397} & {12.2} & {13.9} &{} \\
 42 | \cline{2-7}
 43 | \multirow{5}{*}{(0.08\%)} & {User-User} & {10,580} & {10,580} & {158,590} & {15.0} & {15.0} &{UBU, BUB} \\
 44 | \cline{2-7}
 45 | \multirow{5}{*}{} & {User-Compliment} & {14,411} & {11} & {76,875} & {5.3} & {6988.6} & {UBCiBU, BCiB} \\
 46 | \cline{2-7}
 47 | \multirow{5}{*}{} & {Business-City} & {14,267} & {47} & {14,267} & {1.0} & {303.6} & {UBCaBU, BCaB} \\
 48 | \cline{2-7}
 49 | \multirow{5}{*}{} & {Business-Category} & {14,180} & {511} & {40,009} & {2.8} & {78.3} & {} \\
 50 | \hline
 51 | 
 52 | \end{tabular}}
 53 | \end{table*}
 54 | 
 55 | %\begin{table}[t]%[htbp]
 56 | %\center
 57 | %%\scriptsize
 58 | %\caption{\label{tab_metapath} The selected meta-paths for  three datasets in our work.}
 59 | %{\begin{tabular}{|c||c|}
 60 | %\hline
 61 | %{Dataset} & {Meta-paths}\\
 62 | %\hline\hline
 63 | %\multirow{2}{*}{Douban Movie} & {UMU, UMDMU, UMAMU, UMTMU}\\
 64 | %\multirow{2}{*}{} & {MUM, MAM, MDM, MTM} \\
 65 | %\hline
 66 | %\multirow{2}{*}{Douban Book} & {UBU, UBABU, UBPBU, UBYBU} \\
 67 | %\multirow{2}{*}{} & {BUB, BPB, BYB} \\
 68 | %\hline
 69 | %\multirow{2}{*}{Yelp} & {UBU, UBCiBU, UBCaBU} \\
 70 | %\multirow{2}{*}{} & {BUB, BCiB, BCaB} \\
 71 | %\hline
 72 | %\end{tabular}}
 73 | %\end{table}
 74 | 
 75 | 
 76 | \begin{table*}[t]
 77 | \centering
 78 | %\scriptsize
 79 | \caption{\label{tab_Effectiveness} Results of effectiveness experiments on three datasets. A smaller MAE or RMSE value indicates a better performance. For ease of reading the results, we also report the improvement w.r.t. the PMF model for each other method. A larger improvement ratio indicates a better performance. }
 80 | {
 81 | \begin{tabular}{|c|c|c||c|c|c|c|c|c||c|c||c|}
 82 | \hline
 83 | {Dataset} & {Training} & {Metrics}  & {PMF} &{SoMF} & {FM$_{HIN}$} & {HeteMF}& {SemRec}  & {DSR} & {HERec$_{dw}$} & {HERec$_{mp}$}&{HERec}\\
 84 | \hline
 85 | \hline
 86 | \multirow{16}{*}{Douban}& \multirow{4}{*}{80\%} & {MAE} & {0.5741} & {0.5817} & {0.5696} & {0.5750} & {0.5695} &  {0.5681}  & {0.5703} & {\textbf{0.5515}} &{0.5519} \\
 87 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {-1.32\%} & {+0.78\%} & {-0.16\%} & {+0.80\%}  & {+1.04\%} & {+0.66\%} &{+3.93\%} & {+3.86\%} \\
 88 | \cline{3-12}
 89 | \multirow{16}{*}{Movie} &\multirow{4}{*}{}& {RMSE} & {0.7641} & {0.7680} &   {0.7248} & {0.7556}& {0.7399} & {0.7225} & {0.7446} & {0.7121} & {\textbf{0.7053}} \\
 90 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {-0.07\%} & {+5.55\%} & {+1.53\%} & {+3.58\%}  & {+5.85\%} & {+2.97\%} &{+7.20\%} & {+8.09\%} \\
 91 | \cline{2-12}
 92 | \multirow{16}{*}{} &\multirow{4}{*}{}\multirow{4}{*}{60\%} & {MAE} & {0.5867} & {0.5991} & {0.5769} & {0.5894} & {0.5738} &  {0.5831} & {0.5838} & {0.5611} &{\textbf{0.5587}} \\
 93 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {-2.11\%} & {+1.67\%} & {-0.46\%} & {+2.19\%}  & {+0.61\%} & {+0.49\%} &{+4.36\%} & {+4.77\%} \\
 94 | \cline{3-12}
 95 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE} & {0.7891} & {0.7950} & {0.7842} & {0.7785} & {0.7551} &  {0.7408} & {0.7670} & {0.7264} &{\textbf{0.7148}} \\
 96 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {-0.75\%} & {+0.62\%} & {+1.34\%} & {+4.30\%}  & {+6.12\%} & {+2.80\%} &{+7.94\%} & {+9.41\%} \\
 97 | \cline{2-12}
 98 | \multirow{16}{*}{} &\multirow{4}{*}{}\multirow{4}{*}{40\%} & {MAE} & {0.6078} & {0.6328} &  {0.5871} & {0.6165} & {0.5945} &  {0.6170} & {0.6073} & {0.5747} &{\textbf{0.5699}}\\
 99 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {-4.11\%} & {+3.40\%} & {-1.43\%} & {+2.18\%}  & {-1.51\%} & {+0.08\%} &{+5.44\%} & {+6.23\%} \\
100 | \cline{3-12}
101 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE} & {0.8321} & {0.8479} & {0.7563} & {0.8221} & {0.7836} &  {0.7850} & {0.8057} & {0.7429} &{\textbf{0.7315}} \\
102 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {-1.89\%} & {+9.10\%} & {+1.20\%} & {+5.82\%}  & {+5.66\%} & {+3.17\%} &{+10.71\%} & {+12.09\%} \\
103 | \cline{2-12}
104 | \multirow{16}{*}{} &\multirow{4}{*}{}\multirow{4}{*}{20\%} & {MAE}& {0.7247}& {0.6979} & {0.6080} & {0.6896} & {0.6392} &  {0.6584} & {0.6699} & {0.6063} &{\textbf{0.5900}} \\
105 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+3.69\%} & {+16.10\%} & {+4.84\%} & {+11.79\%}  & {+9.14\%} & {+7.56\%} &{+16.33\%} & {+18.59\%} \\
106 | \cline{3-12}
107 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE} & {0.9440} & {0.9852} & {0.7878} & {0.9357} & {0.8599} &  {0.8345} & {0.9076} & {0.7877}  &{\textbf{0.7660}}\\
108 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {-4.36\%} & {+16.55\%} & {+0.88\%} & {+8.91\%}  & {+11.60\%} & {+3.86\%} &{+16.56\%}& {+18.86\%} \\
109 | 
110 | 
111 | \hline
112 | \hline
113 | \multirow{16}{*}{Douban}& \multirow{4}{*}{80\%} & {MAE} & {0.5774} & {0.5756}  & {0.5716} & {0.5740}   & {0.5675}   &  {0.5740}    & {0.5875}   & {0.5591} &{\textbf{0.5502}} \\
114 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+0.31\%} & {+1.00\%} & {+0.59\%} & {+1.71\%}  & {+0.59\%} & {-1.75\%} &{+3.17\%} & {+4.71\%} \\
115 | \cline{3-12}
116 | \multirow{16}{*}{Book} &\multirow{4}{*}{}& {RMSE}             & {0.7414} & {0.7302}   & {0.7199} & {0.7360}   & {0.7283}   &  {0.7206}    & {0.7450}  & {0.7081} &{\textbf{0.6811}}\\
117 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+1.55\%} & {+2.94\%} & {+0.77\%} & {+1.81\%}  & {+2.84\%} & {-0.44\%} &{+4.53\%} & {+8.17\%} \\
118 | \cline{2-12}
119 | \multirow{16}{*}{} &\multirow{4}{*}{}\multirow{4}{*}{60\%} & {MAE} & {0.6065} & {0.5903}   &{0.5812} & {0.5823}   & {0.5833}   &  {0.6020}   & {0.6203}  & {0.5666} &{\textbf{0.5600}}\\
120 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+2.67\%} & {+4.17\%} & {+3.99\%} & {+3.83\%}  & {+0.74\%} & {-2.28\%} &{+6.58\%} & {+7.67\%} \\
121 | \cline{3-12}
122 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE}    & {0.7908} & {0.7518}   &{0.7319} & {0.7466}   & {0.7505}   &  {0.7552}   & {0.7905}  & {0.7318} &{\textbf{0.7123}}\\
123 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+4.93\%} & {+7.45\%} & {+5.59\%} & {+5.10\%}  & {+4.50\%} & {+0.04\%} &{+7.46\%} & {+9.93\%} \\
124 | \cline{2-12}
125 | \multirow{16}{*}{} &\multirow{4}{*}{}\multirow{4}{*}{40\%} & {MAE} & {0.6800} & {0.6161}   &{0.6028} & {0.5982}   & {0.6025}   &  {0.6271}   & {0.6976}  & {0.5954} &{\textbf{0.5774}}\\
126 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+9.40\%} & {+11.35\%} & {+12.03\%} & {+11.40\%}  & {+7.78\%} & {-2.59\%} &{+12.44\%}& {+15.09\%} \\
127 | \cline{3-12}
128 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE}    & {0.9203} & {0.7936}   &{0.7617} & {0.7779}   & {0.7751}   &  {0.7730}   & {0.9022}  & {0.7703} &{\textbf{0.7400}}\\
129 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+13.77\%} & {+17.23\%} & {+15.47\%} & {+15.78\%}  & {+16.01\%} & {+1.97\%} &{+16.30\%} & {+19.59\%} \\
130 | \cline{2-12}
131 | \multirow{16}{*}{} &\multirow{4}{*}{}\multirow{4}{*}{20\%} & {MAE}& {1.0344} & {0.6327}   &{0.6396} & {0.6311}   & {0.6481}   &  {0.6300}   & {1.0166}   & {0.6785}  &{\textbf{0.6450}}\\
132 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+38.83\%} & {+38.17\%} & {+38.99\%} & {+37.35\%}  & {+39.10\%} & {+1.72\%} &{+34.41\%} & {+37.65\%} \\
133 | \cline{3-12}
134 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE}    & {1.4414} & {0.8236}   &{0.8188} & {0.8304}   & {0.8350}   &  {0.8200}   & {1.3205}   & {0.8869} &{\textbf{0.8581}} \\
135 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+42.86\%} & {+43.19\%} & {+42.39\%} & {+42.07\%}  & {+43.11\%} & {+8.39\%} &{+38.47\%} & {+40.47\%} \\
136 | %\cline{2-12}
137 | %{} &\multicolumn{2}{c||}{Average Rank} & {8.88} & {5.88} & {5.50} & {5.53} & {4.88} & {8.75} & {2.25} & {2.37} & {\textbf{1.63}}\\
138 | \hline
139 | \hline
140 | \multirow{16}{*}{Yelp}& \multirow{4}{*}{90\%} & {MAE} & {1.0412} & {1.0095}   &{0.9013} & {0.9487}   & {0.9043}   &  {0.9054}    & {1.0388}   & {0.8822} &{\textbf{0.8395}} \\
141 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+3.04\%} & {+13.44\%} & {+8.88\%} & {+13.15\%}  & {+13.04\%} & {+0.23\%} &{+15.27\%} & {+19.37\%} \\
142 | \cline{3-12}
143 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE}             & {1.4268} & {1.3392}   &{1.1417} & {1.2549}   & {1.1637}   &  {1.1186}    & {1.3581}  & {1.1309}  &{\textbf{1.0907}}\\
144 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+6.14\%} & {+19.98\%} & {+12.05\%} & {+18.44\%}  & {+21.60\%} & {+4.81\%} &{+20.74\%} & {+23.56\%} \\
145 | \cline{2-12}
146 | \multirow{16}{*}{} &\multirow{4}{*}{}\multirow{4}{*}{80\%} & {MAE} & {1.0791} & {1.0373}   &{0.9038} & {0.9654}   & {0.9176}   &  {0.9098}   & {1.0750}   & {0.8953}  &{\textbf{0.8475}} \\
147 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+3.87\%} & {+16.25\%} & {+10.54\%} & {+14.97\%}  & {+15.69\%} & {+0.38\%} &{+17.03\%} & {+21.46\%} \\
148 | \cline{3-12}
149 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE}    & {1.4816} & {1.3782}   &{1.1497} & {1.2799}   & {1.1771}   &  {1.1208}   & {1.4075}   & {1.1516} &{\textbf{1.1117}}\\
150 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+6.98\%} & {+22.40\%} & {+13.61\%} & {+20.55\%}  & {+24.35\%} & {+5.00\%} &{+22.27\%} & {+24.97\%} \\
151 | \cline{2-12}
152 | \multirow{16}{*}{} &\multirow{4}{*}{}\multirow{4}{*}{70\%} & {MAE} & {1.1170} & {1.0694}   &{0.9108} & {0.9975}   & {0.9407}   &  {0.9429}   & {1.1196}  & {0.9043} &{\textbf{0.8580}}\\
153 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+4.26\%} & {+18.46\%} & {+10.70\%} & {+15.78\%}  & {+15.59\%} & {-0.23\%} &{+19.04\%} & {+23.19\%} \\
154 | \cline{3-12}
155 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE}    & {1.5387} & {1.4201}   &{1.1651} & {1.3229}   & {1.2108}   &  {1.1582}   & {1.4632}  & {1.1639}  &{\textbf{1.1256}}\\
156 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+7.71\%} & {+24.28\%} & {+14.02\%} & {+21.31\%}  & {+24.73\%} & {+4.91\%} &{+24.36\%} & {+26.85\%} \\
157 | \cline{2-12}
158 | \multirow{16}{*}{} &\multirow{4}{*}{}\multirow{4}{*}{60\%} & {MAE}& {1.1778} & {1.1135}   &{0.9435} & {1.0368}   & {0.9637}   &  {1.0043}   & {1.1691}  & {0.9257}  &{\textbf{0.8759}}\\
159 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+5.46\%} & {+19.89\%} & {+11.97\%} & {+18.18\%}  & {+14.73\%} & {+0.74\%} &{+21.40\%} & {+25.63\%} \\
160 | \cline{3-12}
161 | \multirow{16}{*}{} &\multirow{4}{*}{}& {RMSE}    & {1.6167} & {1.4748}   &{1.2039} & {1.3713}   & {1.2380}   &  {1.2257}   & {1.5182}   & {1.1887}  &{\textbf{1.1488}}\\
162 | \multirow{16}{*}{} &\multirow{4}{*}{}& {Improve} & {} & {+8.78\%} & {+25.53\%} & {+15.18\%} & {+23.42\%}  & {+24.19\%} & {+6.09\%} &{+26.47\%} & {+28.94\%} \\
163 | %\cline{2-12}
164 | %{} &\multicolumn{2}{c||}{Average Rank} & {9.00} & {7.88} & {6.00} & {4.63} & {4.38} & {7.13} & {3.00} & {1.75} & {\textbf{1.25}}\\
165 | \hline
166 | %\hline
167 | %\multicolumn{3}{|c||}{Average Rank} & {10.33} & {8.79} &{5.25} & {7.83} & {6.54} &  {6.13} & {9.5} & {4.17} & {3.2} & {2.4} & {\textbf{1.79}}\\
168 | %\hline
169 | \end{tabular}}
170 | \end{table*}
171 | 
172 | \subsection{Evaluation Datasets}
173 | We adopt three widely used datasets from different domains, consisting of Douban Movie dataset \footnote{http://movie.douban.com} from movie domain, Douban Book dataset \footnote{http://book.douban.com} from book domain, and Yelp dataset \footnote{http://www.yelp.com/dataset-challenge} from business domain. Douban Movie dataset includes 13,367 users and 12,677 movies with 1,068,278 movie ratings ranging from 1 to 5. Moreover, the dataset also includes social relations and attribute information of users and movies. As for the Douban Book dataset, it includes 13,024 users and 22,347 books with 792,026 ratings ranging from 1 to 5. Yelp dataset records user ratings on local businesses and contains social relations and attribute information of businesses, which includes 16,239 users and 14,282 local businesses with 198,397 ratings raging from 1 to 5. The detailed descriptions of the three datasets are shown in Table~\ref{tab_Data}, and the network schemas of the three datasets are plotted in Fig.~\ref{fig_schema}. Besides the domain variation, these three datasets also have different rating sparsity degrees: the Yelp dataset is very sparse, while the Douban Movie dataset is much denser.
174 | 
175 | 
176 | 
177 | \subsection{Metrics}
178 | We use the widely used mean absolute error (MAE) and root mean square error (RMSE) to measure the quality of recommendation of different models. The metrics MAE and RMSE are defined as follows:
179 | 
180 | \begin{equation}
181 | MAE = \frac{1}{|\mathcal{D}_{test}|}\sum_{(i, j) \in \mathcal{D}_{test}}{|r_{i,j} - \widehat{r_{i,j}}|},
182 | \end{equation}
183 | \begin{equation}
184 | RMSE = \sqrt{\frac{1}{|\mathcal{D}_{test}|}\sum_{(i, j) \in \mathcal{D}_{test}}{(r_{i,j} - \widehat{r_{i,j}})^2}},
185 | \end{equation}
186 | where $r_{i,j}$ is the actual rating that user $i$ assigns to item $j$, $\widehat{r_{i,j}}$ is the predicted rating from a model, and $\mathcal{D}_{test}$ denotes the test set of rating records. From the definition, we can see that a smaller MAE or RMSE value indicates a better performance.
187 | 
188 | %\subsection{Evaluation Datasets and Metrics}
189 | %
190 | %We adopt three widely used datasets from different domains, including Douban Movie dataset \footnote{http://movie.douban.com}, Douban Book dataset \footnote{http://book.douban.com}, and Yelp dataset \footnote{http://www.yelp.com/dataset-challenge}. The detailed statistics of three datasets are
191 | %summarized in Table~\ref{tab_Data}.
192 | %We use two commonly adopted metrics to measure the performance of rating prediction, namely mean absolute error (MAE) and root mean square error (RMSE) \cite{willmott2005advantages}.
193 | 
194 | 
195 | \subsection{Methods to Compare}
196 | %In order to verify the effectiveness of HERec, we compare HERec with quite a few baselines, some of which represent state-of-the-art for HIN-based recommendation. Since there are three embedding fusion functions in HERec, we include three versions of HERec: HERec$_{sl}$, HERec$_{pl}$, and HERec$_{pnL}$. Moreover, we also design two methods, called HERec$_{dw}$ and HERec$_{mp}$, which replace our HIN embedding with two well known network embeddings under the framework of HERec. As the baselines, six representative rating predication methods are employed, including two classical matrix factorization methods (i.e., PMF and SoMF) and  four HIN based recommendation methods (FM$_{HIN}$, HeteMF, SemRec and DSR).
197 | We consider the following methods to compare:
198 | 
199 | \textbullet\ \textbf{PMF}~\cite{mnih2008probabilistic}: It is the classic probabilistic matrix factorization model by explicitly factorizing the rating matrix into two low-dimensional matrices.
200 | 
201 | \textbullet\ \textbf{SoMF}~\cite{ma2011recommender}: It incorporates social network information into the recommendation model.
202 | The social relations are characterized by a social regularization term, and integrated into the basic matrix factorization model.
203 | 
204 | \textbullet\ \textbf{FM$_{HIN}$}: It is the context-aware factorization machine~\cite{rendle2012factorization}, which is able to utilize various kinds of auxiliary information. In our experiments, we extract heterogeneous information as context features and incorporate them into the factorization machine for rating prediction. We implement the model with the code from the authors~\footnote{http://www.libfm.org/}.
205 | 
206 | 
207 | \textbullet\ \textbf{HeteMF}~\cite{yu2013collaborative}: It is a MF based recommendation method, which utilizes entity similarities calculated over HINs using meta-path based algorithms.
208 | 
209 | \textbullet\ \textbf{SemRec}~\cite{shi2015semantic}: It is a collaborative filtering method on weighted heterogeneous information network, which is constructed by connecting users and items with the same ratings. It flexibly integrates heterogeneous information for recommendation by weighted meta-paths and a weight ensemble method. We implement the model with the code from the authors~\footnote{https://github.com/zzqsmall/SemRec}.
210 | 
211 | 
212 | \textbullet\ \textbf{DSR}~\cite{zheng2017recommendation}: It is a MF based recommendation method with dual similarity regularization, which imposes the constraint on users and items with high and low similarities simultaneously.
213 | 
214 | \textbullet \textbf{HERec$_{dw}$}: It is a variant of HERec, which adopts the homogeneous network embedding method deepwalk~\cite{perozzi2014deepwalk} and ignores the heterogeneity
215 | of nodes in HINs. We use the code of deepwalk provided by the authors~\footnote{https://github.com/phanein/deepwalk}.
216 | 
217 | \textbullet \textbf{HERec$_{mp}$}: It can be viewed as a variant of HERec, which combines the heterogeneous network embedding of metapath2vec++~\cite{dong2017metapath2vec} and still preserves the superiority of HINs. We use the code of metapath2vec++ provided by the authors~\footnote{https://ericdongyx.github.io/metapath2vec/m2v.html}.
218 | 
219 | %\textbullet\ \textbf{HERec$_{dw}$}, \textbf{HERec$_{mp}$}: These two HERec variants adopt homogeneous network embedding method deepwalk~\cite{perozzi2014deepwalk} and heterogeneous network embedding method  metapath2vec++~\cite{dong2017metapath2vec}, respectively.
220 | 
221 | %\textbullet\ \textbf{HERec$_{sl}$}: It is a version of HERec correspond to the simple linear fusion function in Eq.~\ref{eq-slf}.
222 | 
223 | %\textbullet\ \textbf{HERec$_{pl}$}: It is a version of HERec correspond to the personalized linear fusion function in Eq.~\ref{eq-plf}.
224 | 
225 | %\textbullet\ \textbf{HERec$_{pnl}$}: It is a version of HERec correspond to the personalized non-linear fusion function in Eq.~\ref{eq-pnlf}.
226 | %\textbullet\ \textbf{HERec$_{pnl}$}: These three implementations of HERec correspond to the three fusion functions in Eq.~\ref{eq-slf}, Eq.~\ref{eq-plf} and Eq.~\ref{eq-pnlf}, namely simple linear fusion, personalized linear fusion and personalized non-linear fusion.
227 | \textbullet\ \textbf{HERec}: It is the proposed recommendation model based on heterogeneous information network embedding. In the effectiveness experiments, we utilize the personalized non-linear fusion function, which is formulated as Eq.~\ref{eq-pnlf}. And the performance of different fusion function will be reported in the later section.
228 | 
229 | 
230 | The selected baselines have a comprehensive coverage of existing rating prediction methods. PMF and SoMF are classic MF based rating prediction methods; FM$_{HIN}$, HeteMF, SemRec and DSR are HIN based recommendation methods using meta-path based similarities, which represent state-of-the-art HIN based methods. The proposed approach contains an HIN embedding component, which can be replaced by other network embedding methods. Hence, two variants HERec$_{dw}$ and HERec$_{mp}$
231 | are adopted as baselines.
232 | 
233 | 
234 | 
235 | Among these methods, HIN based methods need to specify the used meta-paths. We present the used meta-paths  in Table~\ref{tab_Data}.
236 | Following~\cite{sun2011pathsim}, we only select short meta-paths of at most four steps, since long meta-paths are likely to introduce noisy semantics.
237 | We set parameters for HERec as follows: the embedding dimension number $d = 64$ and the tuning coefficients $\alpha = \beta = 1.0$.
238 | Following~\cite{perozzi2014deepwalk}, the path length for random walk is set to 40.
239 | For HERec$_{dw}$ and HERec$_{mp}$, we set the embedding dimension number $d = 64$ for fairness.
240 | Following~\cite{shi2016integrating,zheng2017recommendation}, we set the number of latent factors for all the MF based methods to 10.
241 | Other baseline parameters either adopt the original optimal setting or are optimized by using a validation set of 10\% training data.
242 | 
243 | 
244 | \subsection{Effectiveness Experiments}
245 | For each dataset, we split the entire rating records into a training set and a test set. %We train the model using the training set and compare the performance on the test set.
246 | For Double Movie and Book datasets, we set four training ratios as in $\{80\%, 60\%, 40\%, 20\%\}$; while for Yelp dataset, following~\cite{shi2015semantic}, we set four larger training ratios as in $\{90\%, 80\%, 70\%, 60\%\}$, since
247 | it is too sparse. For each ratio, we randomly generate ten evaluation sets. We average the  results as the final performance.
248 | The results are shown in Table \ref{tab_Effectiveness}. The major findings from the experimental results are summarized as follows:
249 | 
250 | (1) Among these baselines, HIN based methods (HeteMF, SemRec, FM$_{HIN}$ and DSR) perform better than traditional MF based methods (PMF and SoMF), which indicates the usefulness of the heterogeneous information. It is worthwhile to note that the FM$_{HIN}$ model works very well among the three HIN based baselines. An intuitive explanation is that in our datasets, most of the original features are the attribute information of users or items, which are likely to contain useful evidence to improve the recommendation performance.
251 | 
252 | (2) The proposed HERec method (corresponding to the last column) is consistently better than the baselines, ranging from PMF to DSR.
253 | Compared with other HIN based methods,  HERec adopts a more principled way to leverage HINs for improving recommender systems, which provides better information extraction (a new HIN embedding model) and utilization (an extended MF model).
254 | Moreover, the superiority of the proposed HERec becomes more significant with less training data. In particular, the improvement ratio of HERec over the PMF is up to 40\% with 20\% training data on the Douban Book dataset, which indicates a significant performance boost.
255 | As mentioned previously, the Yelp dataset is very sparse. In this case, even with 60\% training data, the HERec model improves over PMF by about 26\%. As a comparison, with the same training ratio (\ie 60\%), the improvement ratios over PMF are about 29\% in terms of RMSE scores. These results indicate the effectiveness of the proposed approach, especially in a sparse dataset.
256 | 
257 | (3) Considering the two HERec variants HERec$_{dw}$ and HERec$_{mp}$, it is easy to see HERec$_{dw}$ performs much worse than HERec$_{mp}$.
258 | The major difference lies in the network embedding component. HERec$_{mp}$ adopts the recently proposed HIN embedding method \emph{metapath2vec}~\cite{dong2017metapath2vec}, while HERec$_{dw}$ ignores data heterogeneity
259 | and casts HINs as homogeneous networks for learning. It indicates HIN embedding methods are important to HIN based recommendation.
260 | The major difference between the proposed embedding method and metapath2vec (adopted by HERec$_{mp}$) lies in that we try to transform heterogeneous network information into homogeneous neighborhood, while metapath2vec tries to map all types of entities into the same representation space using heterogeneous sequences.
261 | Indeed, metapath2vec is a general-purpose HIN embedding method, while the proposed embedding method aims to improve the recommendation performance.
262 | Our focus is to learn the embeddings for users and items, while objects of other types are only used as the bridge to construct the homogeneous neighborhood. Based on the results (HERec $>$ HERec$_{mp}$), we argue that it is more effective to perform task-specific HIN embedding for improving the recommendation performance.
263 | 
264 | %(5) In addition, we can also observe that the superiority of the proposed HERec becomes more significant with less training data.
265 | %For example, on  Douban Movie dataset, using 20\% training data, HERec outperforms PMF up to 18.85\% on RMSE and 18.58\% on MAE.
266 | %It implies that our method is more effective to alleviate the cold-start problem.
267 | 
268 | %\begin{figure}[t]%[htbp]
269 | %\centering
270 | %\subfigure[RMSE]{
271 | %\begin{minipage}[t]{0.3\linewidth}
272 | %\centering
273 | %\includegraphics[width=4.2cm]{image/cold_start_rmse.pdf}
274 | %\end{minipage}
275 | %}
276 | %\hspace{30pt}
277 | %\subfigure[MAE]{
278 | %\begin{minipage}[t]{0.3\linewidth}
279 | %\centering
280 | %\includegraphics[width=4.2cm]{image/cold_start_mae.pdf}
281 | %\end{minipage}
282 | %}
283 | %\caption{\label{fig_cs}Performance comparison of different methods for cold-start prediction. Improvement ratios over PMF are reported.}
284 | %\end{figure}
285 | 
286 | %\begin{figure}[t]%[htbp]
287 | %\centering
288 | %\subfigure[RMSE]{
289 | %\begin{minipage}[t]{0.3\linewidth}
290 | %\centering
291 | %\includegraphics[width=4.2cm]{image/reg_rmse.pdf}
292 | %\end{minipage}
293 | %}
294 | %\hspace{40pt}
295 | %\subfigure[MAE]{
296 | %\begin{minipage}[t]{0.3\linewidth}
297 | %\centering
298 | %\includegraphics[width=4.2cm]{image/reg_mae.pdf}
299 | %\end{minipage}
300 | %}
301 | %\caption{\label{fig_reg}Varying parameters $\alpha$ and $\beta$.}
302 | %\end{figure}
303 | 
304 | 
305 | \subsection{Detailed Analysis of The Proposed Approach}
306 | In this part, we perform a series of detailed analysis for the proposed approach.
307 | 
308 | %\begin{figure}[t]%[htbp]
309 | %\centering
310 | %\includegraphics[width=8cm]{image/cold_start_rmse.pdf}
311 | %\caption{\label{fig_cs}Performance comparison of different methods for cold-start prediction on Douban Movie dataset. $y$-axis denotes the improvement ratio over PMF.}
312 | %\end{figure}
313 | 
314 | \begin{figure*}
315 | \centering
316 | \subfigure[Douban Movie]{
317 | \begin{minipage}[b]{0.3\textwidth}
318 | \includegraphics[width=1\textwidth]{image/fusion_dm_rmse.pdf} \\
319 | \includegraphics[width=1\textwidth]{image/fusion_dm_mae.pdf}
320 | \end{minipage}
321 | }
322 | \subfigure[Douban Book]{
323 | \begin{minipage}[b]{0.3\textwidth}
324 | \includegraphics[width=1\textwidth]{image/fusion_db_rmse.pdf} \\
325 | \includegraphics[width=1\textwidth]{image/fusion_db_mae.pdf}
326 | \end{minipage}
327 | }
328 | \subfigure[Yelp]{
329 | \begin{minipage}[b]{0.3\textwidth}
330 | \includegraphics[width=1\textwidth]{image/fusion_yelp_rmse.pdf} \\
331 | \includegraphics[width=1\textwidth]{image/fusion_yelp_mae.pdf}
332 | \end{minipage}
333 | }
334 | \caption{\label{fig_fusion}Performance comparison of different fusion functions on three datasets.}
335 | \end{figure*}
336 | 
337 | \begin{figure*}
338 | \centering
339 | \subfigure[Douban Movie]{
340 | \begin{minipage}[b]{0.3\textwidth}
341 | \includegraphics[width=1\textwidth]{image/cold_start_rmse_dm.pdf} \\
342 | \includegraphics[width=1\textwidth]{image/cold_start_mae_dm.pdf}
343 | \end{minipage}
344 | }
345 | \subfigure[Douban Book]{
346 | \begin{minipage}[b]{0.3\textwidth}
347 | \includegraphics[width=1\textwidth]{image/cold_start_rmse_db.pdf} \\
348 | \includegraphics[width=1\textwidth]{image/cold_start_mae_db.pdf}
349 | \end{minipage}
350 | }
351 | \subfigure[Yelp]{
352 | \begin{minipage}[b]{0.3\textwidth}
353 | \includegraphics[width=1\textwidth]{image/cold_start_rmse_yelp.pdf} \\
354 | \includegraphics[width=1\textwidth]{image/cold_start_mae_yelp.pdf}
355 | \end{minipage}
356 | }
357 | \caption{\label{fig_cs}Performance comparison of different methods for cold-start prediction on three datasets. $y$-axis denotes the improvement ratio over PMF.}
358 | \end{figure*}
359 | 
360 | \subsubsection{Selection of Different Fusion Functions}
361 | HERec requires a principled fusion way to transform node embeddings into a more suitable form that is useful to enhance recommendation performance. Therefore, we will discuss the impact of different fusion functions on the recommendation performance.
362 | For convenience, we call the HERec variant with the simple linear fusion function (Eq.~\ref{eq-slf}) as HERec$_{sl}$, the variant with personalized linear fusion function  (Eq.~\ref{eq-plf}) as HERec$_{pl}$,  and the variant with personalized non-linear fusion function (Eq.~\ref{eq-pnlf}) as HERec$_{pnl}$.
363 | We present the performance comparison of the three variants of HERec on the three datasets in Fig.~\ref{fig_fusion}.
364 | 
365 | As shown in Fig.~\ref{fig_fusion}, we can find that overall performance ranking is as follows: HERec$_{pnl}$ $>$ HERec$_{pl}$ $>$ HERec$_{sl}$. Among the three variants, the simple linear fusion function performs the worst, as it ignores the personalization and non-linearity. Indeed, users are likely to have varying preferences over meta-paths~\cite{shi2015semantic}, which should be considered in meta-path based methods. The personalization factor improves the performance significantly. As a comparison, the performance improvement of
366 | HERec$_{pnl}$ over  HERec$_{pl}$ is relatively small. A possible reason is that the incorporation of personalized combination parameters increases the capability of linear models. Nonetheless, HERec$_{pnl}$ still performs the best by considering personalization and non-linear transformation.
367 | In a complicated recommender setting, HIN embeddings may not be directly applicable in recommender systems, where a non-linear mapping function is preferred.
368 | 
369 | Since HERec$_{pnl}$ is the best variant of the proposed model, in what follows, HERec will use the personalized non-linear fusion function, \ie HERec$_{pnl}$ by default.
370 | 
371 | \subsubsection{Cold-start Prediction}
372 | HINs are particularly useful to improve cold-start prediction, where there are fewer rating records but heterogeneous context information is available. We consider studying the performance \emph{w.r.t.} different cold-start degrees, \ie the rating sparsity.
373 | To test it, we first  categorize ``cold" users into three groups according to the numbers of their rating records, \ie $(0, 5]$, $(5, 15]$ and $(15, 30]$. It is easy to see that the case for the first group is the most difficult, since users from this group have fewest rating records.
374 | Here, we only select the baselines which use HIN based information for recommendation, including SoMF, HeteMF, SemRec, DSR and FM$_{HIN}$.
375 | We present the performance comparison of different methods in Fig.~\ref{fig_cs}. For convenience, we report the improvement ratios \emph{w.r.t.} PMF.
376 | Overall, all the comparison methods are better than PMF (\ie a positive $y$-axis value). The proposed method performs the best among all the methods, and the improvement over PMF becomes more significant for users with fewer rating records. The results indicate that HIN based information is effective to improve the recommendation performance, and the proposed HERec method can effectively utilize HIN information in a more principled way.
377 | 
378 | 
379 | 
380 | %\begin{figure}[htbp]
381 | %%\centering
382 | %\subfigure[Yelp]{
383 | %\begin{minipage}[t]{0.3\linewidth}
384 | %\centering
385 | %\includegraphics[width=4.2cm]{image/metapath_yelp.pdf}
386 | %\end{minipage}
387 | %}
388 | %\hspace{40pt}
389 | %\subfigure[Douban Movie]{
390 | %\begin{minipage}[t]{0.3\linewidth}
391 | %\centering
392 | %\includegraphics[width=4.2cm]{image/metapath_douban.pdf}
393 | %\end{minipage}
394 | %}
395 | %\caption{\label{fig_metapath}Performance change of HERec when gradually incorporating meta-paths.}
396 | %\end{figure}
397 | 
398 | %\begin{figure}[htbp]
399 | %%\centering
400 | %\subfigure[Yelp]{
401 | %\begin{minipage}[t]{0.5\textwidth}
402 | %\centering
403 | %\includegraphics[width=8cm]{image/metapath_yelp.pdf}
404 | %\end{minipage}
405 | %}
406 | %\subfigure[Douban Movie]{
407 | %\begin{minipage}[t]{0.5\textwidth}
408 | %\centering
409 | %\includegraphics[width=8cm]{image/metapath_douban.pdf}
410 | %\end{minipage}
411 | %}
412 | %\caption{\label{fig_metapath}Performance change of HERec when gradually incorporating meta-paths.}
413 | %\end{figure}
414 | 
415 | \begin{figure*}[htbp]
416 | \centering
417 | \subfigure[Douban Movie]{
418 | \begin{minipage}[t]{0.3\textwidth}
419 | \centering
420 | \includegraphics[width=1\textwidth]{image/metapath_dm.pdf}
421 | \end{minipage}
422 | }
423 | \subfigure[Douban Book]{
424 | \begin{minipage}[t]{0.3\textwidth}
425 | \centering
426 | \includegraphics[width=1.05\textwidth]{image/metapath_db.pdf}
427 | \end{minipage}
428 | }
429 | \subfigure[Yelp]{
430 | \begin{minipage}[t]{0.3\textwidth}
431 | \centering
432 | \includegraphics[width=1\textwidth]{image/metapath_yelp.pdf}
433 | \end{minipage}
434 | }
435 | \caption{\label{fig_metapath}Performance change of HERec when gradually incorporating meta-paths.}
436 | \end{figure*}
437 | 
438 | 
439 | 
440 | \subsubsection{Impact of Different Meta-Paths}
441 | %Our approach depends on the selection of useful meta-paths.
442 | As shown in Table~\ref{tab_Data}, the proposed approach uses a selected set of meta-paths.
443 | To further analyze the impact of different meta-paths, we gradually incorporate these meta-paths into the proposed approach and check the performance change.
444 | In Fig. \ref{fig_metapath}, we can observe that generally the performance improves (\ie becoming smaller) with the incorporation of more meta-paths.
445 | Both meta-paths starting with the user type and item type are useful to improve the performance.
446 | However, it does not always yield the improvement with more meta-paths, and the performance slightly fluctuates.
447 | The reason is that some meta-paths may contain noisy  or conflict information with existing ones.
448 | Another useful observation is that the performance quickly achieves a relatively good performance with the incorporation of only a few meta-paths.
449 | This confirms previous finding~\cite{shi2015semantic}:  a small number of high-quality meta-paths are able to lead to large performance improvement. Hence, as mentioned before, we can effectively control the model complexity by selecting just a few meta-paths.
450 | 
451 | %As mentioned before, this finding is implicative
452 | %Our approach converts the meta-path information into embeddings subsequently transformed by a learnable function, which is able to reduce the noisy influence from meta-paths to some extent.
453 | 
454 | 
455 | %In this section, we study the impact of meta-paths on performances of HERec. We do experiments on Yelp and Douban Movie dataset via successively adding meta-paths to the HIN embedding phrase of HERec. The results are recorded in Fig. \ref{fig_metapath}. We can obverse that, with the addition of meta-paths, the HERec generally achieves better performances, which verifies that the multiple-perspective information extracted from multiple meta-paths is helpful to improve recommendation performance. Particularly, the performances are dramatically improved  when adding meta-paths of items (e.g., $BUB$ in Yelp and $MUM$ in Douban Movie), since it will integrate structure information of items into the model. On the other hand, we can also find that the recommendation performances are not always improved with the addition of more meta-paths, like the addition of $BCiB$ in Yelp and $MDM$ in Douban Movie. We think the reason lies in that, although meaningful meta-paths (e.g., $UBCiBU$ in Yelp and $UMDMU$ in Douban Movie) can provide valuable structural information for recommendation, while some meta-paths (e.g., $BCiB$ and $MDM$) have less valuable structural information but contain much noise information, which will harm the recommendation performance.
456 | %Each meta path based embedding has different semantic information which would have different effects on recommendation performance. We notice the recommendation performance gains obviously significant improvement when add first several meta-paths. And when we continue adding meta-path, the improvement would be quite slight on RMSE, and even has the decreasing tendency on MAE. It verifies that meaningful mete-path would dramatically improve recommendation performance, while normal meta-path is helpless to recommendation.
457 | 
458 | %\begin{figure}[htbp]
459 | %\centering
460 | %\includegraphics[width=8cm]{image/reg_rmse.pdf}
461 | %\caption{\label{fig_reg}Varying parameters $\alpha$ and $\beta$ on Douban Movie dataset.}
462 | %\end{figure}
463 | 
464 | %\begin{figure}[htbp]
465 | %%\centering
466 | %\subfigure[RMSE]{
467 | %\begin{minipage}[t]{0.5\textwidth}
468 | %\centering
469 | %\includegraphics[width=8cm]{image/reg_rmse.pdf}
470 | %\end{minipage}
471 | %}
472 | %\subfigure[MAE]{
473 | %\begin{minipage}[t]{0.5\textwidth}
474 | %\centering
475 | %\includegraphics[width=8cm]{image/reg_rmse.pdf}
476 | %\end{minipage}
477 | %}
478 | %\caption{\label{fig_reg}Varying parameters $\alpha$ and $\beta$ on Douban Movie dataset.}
479 | %\end{figure}
480 | 
481 | \begin{figure*}[t]%[htbp]
482 | \centering
483 | \subfigure[Douban Movie]{
484 | \begin{minipage}[t]{0.3\textwidth}
485 | \centering
486 | \includegraphics[width=1\textwidth]{image/dim_dm.pdf}
487 | \end{minipage}
488 | }
489 | \subfigure[Douban Book]{
490 | \begin{minipage}[t]{0.3\textwidth}
491 | \centering
492 | \includegraphics[width=1\textwidth]{image/dim_db.pdf}
493 | \end{minipage}
494 | }
495 | \subfigure[Yelp]{
496 | \begin{minipage}[t]{0.3\textwidth}
497 | \centering
498 | \includegraphics[width=1\textwidth]{image/dim_yelp.pdf}
499 | \end{minipage}
500 | }
501 | \caption{\label{fig_dim}Performance change with respect to the dimension of embeddings on three datasets.}
502 | \end{figure*}
503 | 
504 | 
505 | \begin{figure*}[t]%[htbp]
506 | \centering
507 | \subfigure[Douban Movie]{
508 | \begin{minipage}[t]{0.3\textwidth}
509 | \centering
510 | \includegraphics[width=1\textwidth]{image/factor_dm.pdf}
511 | \end{minipage}
512 | }
513 | \subfigure[Douban Book]{
514 | \begin{minipage}[t]{0.3\textwidth}
515 | \centering
516 | \includegraphics[width=1\textwidth]{image/factor_db.pdf}
517 | \end{minipage}
518 | }
519 | \subfigure[Yelp]{
520 | \begin{minipage}[t]{0.3\textwidth}
521 | \centering
522 | \includegraphics[width=1\textwidth]{image/factor_yelp.pdf}
523 | \end{minipage}
524 | }
525 | \caption{\label{fig_factor}Performance change with respect to the dimension of latent factors on three datasets.}
526 | \end{figure*}
527 | 
528 | \begin{figure*}[t]%[htbp]
529 | \centering
530 | \subfigure[Douban Movie]{
531 | \begin{minipage}[t]{0.3\textwidth}
532 | \centering
533 | \includegraphics[width=1\textwidth]{image/reg_rmse_dm.pdf}
534 | \end{minipage}
535 | }
536 | \subfigure[Douban Book]{
537 | \begin{minipage}[t]{0.3\textwidth}
538 | \centering
539 | \includegraphics[width=1\textwidth]{image/reg_rmse_db.pdf}
540 | \end{minipage}
541 | }
542 | \subfigure[Yelp]{
543 | \begin{minipage}[t]{0.3\textwidth}
544 | \centering
545 | \includegraphics[width=1\textwidth]{image/reg_rmse_yelp.pdf}
546 | \end{minipage}
547 | }
548 | \caption{\label{fig_reg}Varying parameters $\alpha$ and $\beta$ on the three datasets.}
549 | \end{figure*}
550 | 
551 | %\begin{figure*}[t]%[htbp]
552 | %\centering
553 | %\subfigure[Douban Movie]{
554 | %\begin{minipage}[t]{0.3\textwidth}
555 | %\centering
556 | %\includegraphics[width=1.06\textwidth]{image/iter_dm.pdf}
557 | %\end{minipage}
558 | %}
559 | %\subfigure[Douban Book]{
560 | %\begin{minipage}[t]{0.3\textwidth}
561 | %\centering
562 | %\includegraphics[width=1\textwidth]{image/iter_db.pdf}
563 | %\end{minipage}
564 | %}
565 | %\subfigure[Yelp]{
566 | %\begin{minipage}[t]{0.3\textwidth}
567 | %centering
568 | %\includegraphics[width=1\textwidth]{image/iter_yelp.pdf}
569 | %\end{minipage}
570 | %}
571 | %\caption{\label{fig_iter}Performance with respect to the number of iterations on three datasets.}
572 | %\end{figure*}
573 | 
574 | 
575 | \subsubsection{Parameter Tuning}
576 | We firstly study the impact of the dimension of embeddings for the proposed model. We vary embedding dimensions in the set of $\{16, 32, 64, 128, 256\}$ and select HERec$_{mp}$ as the reference baseline. As shown in Fig.~\ref{fig_dim}, our method consistently outperforms HERec$_{mp}$ and achieves the optimal performance when the dimension of embeddings is around 64. Moreover, the performance of our method is more stable when varying the dimension of embeddings compared with HERec$_{mp}$.
577 | 
578 | For matrix factorization based methods, an important parameter to tune is the number of latent factors. The proposed model also involves such a parameter.
579 | We vary it from 5 to 40 with a step of 5, and examine how the performance changes \emph{w.r.t} the number of latent factors.
580 | We present the tuning results in Fig.~\ref{fig_factor}. As we can see, using 10 latent factors yields the best performance, indicating that
581 | the number of latent factors should be set to a small number.
582 | 
583 | Next, we fix the number of latent factors as ten, and tune two other parameters $\alpha$ and $\beta$ (Eq.~\ref{eq-predictor}), which are used to integrate different terms as weights.
584 | Now we examine how they influence the model performance. For both parameters, we vary them in the set of $\{0.1, 0.5, 1, 2\}$.
585 |  As shown in Fig. \ref{fig_reg}, the optimal performance is obtained near $(1,1)$, \ie both $\alpha$ and $\beta$ are around 1. The results show that
586 |  the HIN embeddings from both the user and item sides are important to improve the prediction performance.
587 |  Overall, the change trend is smooth, indicating that the proposed model is not very sensitive to the two parameters.
588 | 
589 |  %Finally, we study the performance change \emph{w.r.t.} the number of iterations. As shown in Fig.~\ref{fig_iter},
590 |  %we can see that the proposed model has a fast convergence rate, and about 40 to 60 iterations are required for dense datasets (\ie Douban Movie and Book), while about 20 iterations
591 |  %are required for sparse datasets (\ie Yelp).
592 | %
593 | %\begin{figure}[htbp]
594 | %%\centering
595 | %\subfigure[RMSE]{
596 | %\begin{minipage}[t]{0.5\textwidth}
597 | %\centering
598 | %\includegraphics[width=8cm]{image/window_size_mae.pdf}
599 | %\end{minipage}
600 | %}
601 | %\subfigure[MAE]{
602 | %\begin{minipage}[t]{0.5\textwidth}
603 | %\centering
604 | %\includegraphics[width=8cm]{image/window_size_mae.pdf}
605 | %\end{minipage}
606 | %}
607 | %\caption{\label{fig_reg}Dimension of latent factors.}
608 | %\end{figure}
609 | 
610 |  %on Douban Movie dataset. As shows in Fig. \ref{fig_reg}, when the values of $\alpha$ and $\beta$ are both around 1, our model has the best performance. When the values of $\alpha$ and $\beta$ are quite large or small, the result is not good, especially in the case of small $\alpha$ and $\beta$. It shows that the latent features of users and items leant from HIN embedding both play an important role for better performances. However, too much weights on latent features of users and items also hurt the performances.
611 | 


--------------------------------------------------------------------------------