├── .gitignore
├── README.md
├── demo.py
├── deprecated
│   ├── ppgrl_1.py
│   └── tfgen_1.py
├── imgs
│   ├── atl-robbery-1.gif
│   └── cal-earthquake-1.gif
├── misc
│   ├── __init__.py
│   ├── plots.py
│   └── ppgen.py
├── ppgrl.py
├── results
│   ├── 911calls-105-imit.txt
│   ├── 911calls-105-real.txt
│   ├── 911calls-105.png
│   ├── 911calls-208-imit.txt
│   ├── 911calls-208-real.txt
│   ├── poisson_exp.gif
│   └── qqplot4intdiff.png
├── tfgen.py
└── utils.py

/.gitignore:
--------------------------------------------------------------------------------
1 | .DS_Store
2 | .gitignore
3 | archive/*
4 | prepare_data/*
5 | resource/*
6 | results/*
7 | *.pyc
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Imitation-Learning-for-Point-Process
2 | ===
3 | 
4 | Introduction
5 | ---
6 | PPG (Point Process Generator) is a highly customized RNN (Recurrent Neural Network) model that produces actions (a point process) by imitating expert sequences. (**Shuang Li's ongoing work**)
7 | 
8 | How to Train a PPG
9 | ---
10 | Before training a PPG, you have to organize and format the training data and test data into numpy arrays of shape (`num_seqs`, `max_num_actions`, `num_features`) and (`batch_size`, `max_num_actions`, `num_features`), respectively. You also have to zero-pad those action sequences whose length is less than `max_num_actions`. For the time being,
11 | `num_features` has to be set to 1 (time).
12 | 
13 | Then you can start a tensorflow session and run the training process as in the following example:
14 | ```python
15 | t_max = 7
16 | seq_len = 10
17 | batch_size = 3
18 | state_size = 5
19 | feature_size = 1
20 | with tf.Session() as sess:
21 |     # Instantiate a ppg object
22 |     ppg = PointProcessGenerator(
23 |         t_max=t_max,           # max time for all learner & expert actions
24 |         seq_len=seq_len,       # length of all learner & expert action sequences
25 |         batch_size=batch_size,
26 |         state_size=state_size,
27 |         feature_size=feature_size,
28 |         iters=10, display_step=1, lr=1e-4)
29 |     # Start training
30 |     ppg.train(sess, input_data, test_data, pretrained=False)
31 | ```
32 | You can also omit the parameter `test_data` (it is `None` by default) if you don't have test data for training.
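For reference, below is a minimal sketch of how the padded `input_data` array described above could be organized; the raw sequences here are made up purely for illustration.
```python
import numpy as np

max_num_actions = 10
num_features    = 1   # time only, for the time being

# hypothetical expert sequences of action times with different lengths
raw_seqs = [[0.5, 1.2, 3.4],
            [0.7, 2.1, 2.9, 5.6],
            [1.1]]

input_data = np.zeros((len(raw_seqs), max_num_actions, num_features), dtype=np.float32)
for i, seq in enumerate(raw_seqs):
    input_data[i, :len(seq), 0] = seq   # shorter sequences keep their zero padding

print(input_data.shape)   # (num_seqs, max_num_actions, num_features) = (3, 10, 1)
```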
33 | 
34 | The details of the training process are logged to the standard error stream. Below is a sample of the log output.
35 | ```shell
36 | 2018-01-23 15:50:25.578574: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
37 | 2018-01-23 15:50:25.578595: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
38 | 2018-01-23 15:50:25.578601: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
39 | 2018-01-23 15:50:25.578605: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
40 | [2018-01-23T15:50:46.381850-05:00] Iter: 64
41 | [2018-01-23T15:50:46.382798-05:00] Train Loss: 12.07759
42 | [2018-01-23T15:51:01.708265-05:00] Iter: 128
43 | [2018-01-23T15:51:01.708412-05:00] Train Loss: 10.53193
44 | [2018-01-23T15:51:16.116164-05:00] Iter: 192
45 | [2018-01-23T15:51:16.116308-05:00] Train Loss: 6.10489
46 | [2018-01-23T15:51:26.816642-05:00] Iter: 256
47 | [2018-01-23T15:51:26.816785-05:00] Train Loss: 3.17384
48 | [2018-01-23T15:51:36.808152-05:00] Iter: 320
49 | [2018-01-23T15:51:36.808301-05:00] Train Loss: 2.29386
50 | [2018-01-23T15:51:46.169030-05:00] Iter: 384
51 | [2018-01-23T15:51:46.169334-05:00] Train Loss: 1.71573
52 | [2018-01-23T15:51:55.244403-05:00] Iter: 448
53 | [2018-01-23T15:51:55.244538-05:00] Train Loss: 1.82563
54 | [2018-01-23T15:52:04.491172-05:00] Iter: 512
55 | [2018-01-23T15:52:04.491308-05:00] Train Loss: 2.59895
56 | [2018-01-23T15:52:13.234096-05:00] Iter: 576
57 | [2018-01-23T15:52:13.234250-05:00] Train Loss: 1.99338
58 | [2018-01-23T15:52:21.904642-05:00] Iter: 640
59 | [2018-01-23T15:52:21.904949-05:00] Train Loss: 1.20168
60 | [2018-01-23T15:52:30.511283-05:00] Iter: 704
61 | [2018-01-23T15:52:30.511429-05:00] Train Loss: 0.94646
62 | [2018-01-23T15:52:39.296644-05:00] Iter: 768
63 | [2018-01-23T15:52:39.296788-05:00] Train Loss: 0.88800
64 | [2018-01-23T15:52:47.973522-05:00] Iter: 832
65 | [2018-01-23T15:52:47.973675-05:00] Train Loss: 0.70098
66 | [2018-01-23T15:52:56.207430-05:00] Iter: 896
67 | [2018-01-23T15:52:56.207577-05:00] Train Loss: 0.68432
68 | [2018-01-23T15:53:04.548818-05:00] Iter: 960
69 | [2018-01-23T15:53:04.548964-05:00] Train Loss: 0.66598
70 | [2018-01-23T15:53:04.549057-05:00] Optimization Finished!
71 | ```
72 | 
73 | How to Generate Actions
74 | ---
75 | By simply running the following code, a fixed number of sequences of fixed length within the indicated time frame will be generated automatically, without any input data. Note that the number and the length of the generated sequences were already specified by the input parameters used to initialize the `ppg` object.
76 | ```python
77 | with tf.Session() as sess:
78 | 
79 |     # Here goes the code for training a new ppg or loading an existing one
80 | 
81 |     # Generate actions
82 |     actions, states_history = ppg.generate(sess, pretrained=False)
83 |     print(actions)
84 | ```
85 | Below is an example of the generated actions, where trailing zeros are padding.
86 | ```shell
87 | (array([[ 0.63660634,  1.12912512,  0.39286253],
88 |        [ 1.64375508,  1.60563707,  1.77609217],
89 |        [ 3.08153439,  2.41127753,  2.59949875],
90 |        [ 3.91807413,  3.74258327,  3.54215193],
91 |        [ 4.97372961,  4.49850368,  4.98060131],
92 |        [ 5.73539734,  5.15121365,  5.43891001],
93 |        [ 6.24749708,  5.667624  ,  6.38705158],
94 |        [ 6.60757065,  6.88907528,  0.        ],
95 |        [ 0.        ,  0.        ,  0.        ],
96 |        [ 0.        ,  0.        ,  0.        ]], dtype=float32)
97 | ```
98 | 
99 | References
100 | ---
101 | - [Shuang Li, Shuai Xiao, Shixiang Zhu, Nan Du, Yao Xie, Le Song.
"Learning Temporal Point Processes via Reinforcement Learning 102 | "](https://arxiv.org/abs/1811.05016) 103 | -------------------------------------------------------------------------------- /demo.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | """ 5 | Demo for testing basic version of imitpp on synthetic dataset 6 | """ 7 | 8 | import sys 9 | import arrow 10 | import utils 11 | import random 12 | import numpy as np 13 | import tensorflow as tf 14 | 15 | from ppgrl import RL_Hawkes_Generator 16 | from stppg import HawkesLam, GaussianMixtureDiffusionKernel, StdDiffusionKernel 17 | 18 | # Avoid error msg [OMP: Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized.] 19 | # Reference: https://github.com/dmlc/xgboost/issues/1715 20 | import os 21 | os.environ['KMP_DUPLICATE_LIB_OK']='True' 22 | 23 | if __name__ == "__main__": 24 | data = np.load('../Spatio-Temporal-Point-Process-Simulator/data/apd.crime.perday.npy') 25 | params = np.load('../Spatio-Temporal-Point-Process-Simulator/data/gaussian_mixture_params.npz') 26 | 27 | da = utils.DataAdapter(init_data=data) 28 | mu = params['mu'] 29 | # kernel = GaussianMixtureDiffusionKernel( 30 | # n_comp=5, layers=[5], C=1., beta=params['beta'], 31 | # SIGMA_SHIFT=.05, SIGMA_SCALE=.2, MU_SCALE=.1, 32 | # Wss=params['Wss'], bss=params['bss'], Wphis=params['Wphis']) 33 | kernel = GaussianMixtureDiffusionKernel( 34 | n_comp=20, layers=[5], C=1., beta=.8, 35 | SIGMA_SHIFT=.05, SIGMA_SCALE=.3, MU_SCALE=.03) 36 | # kernel = StdDiffusionKernel(C=1., beta=3., sigma_x=.15, sigma_y=.15) 37 | lam = HawkesLam(mu, kernel, maximum=1e+3) 38 | 39 | # ngrid should be smaller than 100, due to the computational 40 | # time is too large when n > 100. 
41 | utils.spatial_intensity_on_map( 42 | "test.html", da, lam, data, t=1.0, 43 | xlim=[33.70, 33.87], 44 | ylim=[-84.50, -84.30], 45 | ngrid=200) 46 | # xlim=da.xlim, ylim=da.ylim, ngrid=200) 47 | -------------------------------------------------------------------------------- /deprecated/ppgrl_1.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import arrow 3 | import utils 4 | import random 5 | import numpy as np 6 | import tensorflow as tf 7 | import matplotlib.pyplot as plt 8 | 9 | from tfgen_1 import SpatialTemporalHawkes 10 | 11 | class RL_Hawkes_Generator(object): 12 | """ 13 | Reinforcement Learning Based Point Process Generator 14 | """ 15 | 16 | def __init__(self, batch_size, lr, keep_latest_k=None, T=[0., 10.], S=[[-1., 1.], [-1., 1.]], C=1., maximum=1e+3): 17 | """ 18 | Params: 19 | - T: the maximum time of the sequences 20 | - S: the space of location 21 | - C: the constant in diffusion kernel 22 | """ 23 | # model hyper-parameters 24 | self.batch_size = batch_size 25 | self.T = T # maximum time 26 | self.S = S # location space 27 | # input tensors: expert sequences (time, location) 28 | self.input_seqs = tf.placeholder(tf.float32, [batch_size, None, 3]) 29 | # Hawkes process generator 30 | self.hawkes = SpatialTemporalHawkes(C=C, maximum=maximum) 31 | # generated tensors: learner sequences (time, location, loglikelihood) 32 | # - learner_seqs: [batch_size, seq_len, data_dim] 33 | # - learner_seqs_loglik: [batch_size, seq_len, 1] 34 | self.seqs, logliks = self.hawkes.sampling(T, S, batch_size=batch_size, keep_latest_k=None) 35 | # build policy optimizer 36 | self._policy_optimizer( 37 | expert_seqs=self.input_seqs, 38 | learner_seqs=self.seqs, learner_seqs_loglik=logliks, 39 | lr=lr) 40 | 41 | def _policy_optimizer(self, expert_seqs, learner_seqs, learner_seqs_loglik, lr): 42 | """ 43 | """ 44 | # concatenate batches in the sequences 45 | concat_expert_seq = self.__concatenate_batch(expert_seqs) # [batch_size * expert_seq_len, data_dim] 46 | concat_learner_seq = self.__concatenate_batch(learner_seqs) # [batch_size * learner_seq_len, data_dim] 47 | concat_learner_seq_loglik = self.__concatenate_batch(learner_seqs_loglik) # [batch_size * learner_seq_len, 1] 48 | 49 | # calculate average rewards 50 | print("[%s] building reward." % arrow.now(), file=sys.stderr) 51 | reward = self._reward(concat_expert_seq, concat_learner_seq) 52 | 53 | # cost and optimizer 54 | print("[%s] building optimizer." 
% arrow.now(), file=sys.stderr) 55 | self.cost = tf.reduce_sum(tf.multiply(reward, concat_learner_seq_loglik), axis=0) 56 | self.optimizer = tf.train.GradientDescentOptimizer(lr).minimize(self.cost) 57 | 58 | def _reward(self, expert_seq, learner_seq, kernel_bandwidth=0.5): 59 | """reward function""" 60 | # get mask for concatenated expert and learner sequences 61 | learner_mask_t = tf.expand_dims(tf.cast(learner_seq[:, 0] > 0, tf.float32), -1) 62 | expert_mask_t = tf.expand_dims(tf.cast(expert_seq[:, 0] > 0, tf.float32), -1) 63 | 64 | # calculate mask for kernel matrix 65 | learner_learner_kernel_mask = tf.matmul(learner_mask_t, tf.transpose(learner_mask_t)) 66 | expert_learner_kernel_mask = tf.matmul(expert_mask_t, tf.transpose(learner_mask_t)) 67 | 68 | # calculate upper-half kernel matrix 69 | # - [learner_seq_len, learner_seq_len], [expert_seq_len, learner_seq_len] 70 | learner_learner_kernel, expert_learner_kernel = self.__kernel_matrix(learner_seq, expert_seq, kernel_bandwidth) 71 | 72 | learner_learner_kernel = tf.multiply(learner_learner_kernel, learner_learner_kernel_mask) 73 | expert_learner_kernel = tf.multiply(expert_learner_kernel, expert_learner_kernel_mask) 74 | 75 | # calculate reward for each of data point in learner sequence 76 | emp_ll_mean = tf.reduce_sum(learner_learner_kernel, axis=0) * 2 # [batch_size * learner_seq_len] 77 | emp_el_mean = tf.reduce_sum(expert_learner_kernel, axis=0) * 2 # [batch_size * learner_seq_len] 78 | return tf.expand_dims(emp_ll_mean - emp_el_mean, -1) # [batch_size * learner_seq_len, 1] 79 | 80 | @staticmethod 81 | def __concatenate_batch(seqs): 82 | """Concatenate each batch of the sequences into a single sequence.""" 83 | array_seq = tf.unstack(seqs, axis=0) # [batch_size, seq_len, data_dim] 84 | seq = tf.concat(array_seq, axis=0) # [batch_size*seq_len, data_dim] 85 | return seq 86 | 87 | @staticmethod 88 | def __kernel_matrix(learner_seq, expert_seq, kernel_bandwidth): 89 | """ 90 | Construct kernel matrix based on learn sequence and expert sequence, each entry of the matrix 91 | is the distance between two data points in learner_seq or expert_seq. return two matrix, left_mat 92 | is the distances between learn sequence and learn sequence, right_mat is the distances between 93 | learn sequence and expert sequence. 94 | """ 95 | # calculate l2 distances 96 | learner_learner_mat = utils.l2_norm(learner_seq, learner_seq) # [batch_size*seq_len, batch_size*seq_len] 97 | expert_learner_mat = utils.l2_norm(expert_seq, learner_seq) # [batch_size*seq_len, batch_size*seq_len] 98 | # exponential kernel 99 | learner_learner_mat = tf.exp(-learner_learner_mat / kernel_bandwidth) 100 | expert_learner_mat = tf.exp(-expert_learner_mat / kernel_bandwidth) 101 | return learner_learner_mat, expert_learner_mat 102 | 103 | def train(self, sess, 104 | epoches, # number of epoches (how many times is the entire dataset going to be trained) 105 | expert_seqs, # [n, seq_len, 3] 106 | trainplot=True, # plot the change of intensity over epoches 107 | pretrained=False): 108 | """Train the point process generator given expert sequences.""" 109 | 110 | # initialization 111 | if not pretrained: 112 | print("[%s] parameters are initialized." 
% arrow.now(), file=sys.stderr) 113 | # initialize network parameters 114 | init_op = tf.global_variables_initializer() 115 | sess.run(init_op) 116 | 117 | # data configurations 118 | # - number of expert sequences 119 | n_data = expert_seqs.shape[0] 120 | # - number of batches 121 | n_batches = int(n_data / self.batch_size) 122 | 123 | if trainplot: 124 | ppim = utils.PointProcessIntensityMeter(self.T[1], batch_size) 125 | 126 | # training over epoches 127 | for epoch in range(epoches): 128 | # shuffle indices of the training samples 129 | shuffled_ids = np.arange(n_data) 130 | np.random.shuffle(shuffled_ids) 131 | 132 | # training over batches 133 | avg_train_cost = [] 134 | for b in range(n_batches): 135 | idx = np.arange(self.batch_size * b, self.batch_size * (b + 1)) 136 | # training and testing indices selected in current batch 137 | batch_train_ids = shuffled_ids[idx] 138 | # training and testing batch data 139 | batch_train_expert = expert_seqs[batch_train_ids, :, :] 140 | # print(sess.run(self.seqs)) 141 | # optimization procedure 142 | sess.run(self.optimizer, feed_dict={self.input_seqs: batch_train_expert}) 143 | # cost for train batch and test batch 144 | train_cost = sess.run(self.cost, feed_dict={self.input_seqs: batch_train_expert}) 145 | print("[%s] batch training cost: %.2f." % (arrow.now(), train_cost), file=sys.stderr) 146 | # record cost for each batch 147 | avg_train_cost.append(train_cost) 148 | 149 | if trainplot: 150 | # update intensity plot 151 | learner_seqs, _ = self.hawkes.get_learner_seqs(sess, self.batch_size, keep_latest_k=None) 152 | ppim.update_time_intensity(batch_train_expert[:, : , 0], learner_seqs[:, :, 0]) 153 | ppim.update_location_intensity(batch_train_expert[:, : , 1:], learner_seqs[:, :, 1:]) 154 | 155 | # training log output 156 | avg_train_cost = np.mean(avg_train_cost) 157 | print('[%s] Epoch %d (n_train_batches=%d, batch_size=%d)' % (arrow.now(), epoch, n_batches, self.batch_size), file=sys.stderr) 158 | print('[%s] Training cost:\t%f' % (arrow.now(), avg_train_cost), file=sys.stderr) -------------------------------------------------------------------------------- /deprecated/tfgen_1.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | """ 5 | Imitation Learning for Point Process 6 | 7 | A LSTM based model for generating marked spatial-temporal points. 
8 | 9 | References: 10 | - https://arxiv.org/abs/1811.05016 11 | 12 | Dependencies: 13 | - Python 3.6.7 14 | - tensorflow==1.5.0 15 | """ 16 | 17 | import sys 18 | import arrow 19 | import utils 20 | import numpy as np 21 | import tensorflow as tf 22 | 23 | from stppg import DiffusionKernel, HawkesLam, SpatialTemporalPointProcess 24 | 25 | class SpatialTemporalHawkes(object): 26 | """ 27 | """ 28 | 29 | def __init__(self, C=1., maximum=1e+3): 30 | """ 31 | """ 32 | INIT_PARAM = 5e-2 33 | self.C = C # constant in kernel function 34 | self.maximum = maximum # upper bound of conditional intensity 35 | self.mu = tf.get_variable(name="mu", initializer=tf.constant(INIT_PARAM), dtype=tf.float32) 36 | self.beta = tf.get_variable(name="beta", initializer=tf.constant(5.), dtype=tf.float32) 37 | self.sigma_x = tf.get_variable(name="sigma_x", initializer=tf.constant(INIT_PARAM), dtype=tf.float32) 38 | self.sigma_y = tf.get_variable(name="sigma_y", initializer=tf.constant(INIT_PARAM), dtype=tf.float32) 39 | 40 | def _kernel(self, x, y, t): 41 | """ 42 | difussion kernel function proposed by Musmeci and Vere-Jones (1992). 43 | """ 44 | return (self.C / (2 * np.pi * self.sigma_x * self.sigma_y * t)) * \ 45 | tf.exp(- self.beta * t - (tf.square(x)/tf.square(self.sigma_x) + tf.square(y)/tf.square(self.sigma_y)) / (2*t)) 46 | 47 | def _lambda(self, x, y, t, x_his, y_his, t_his): 48 | """ 49 | lambda function for the Hawkes process. 50 | """ 51 | lam = self.mu + tf.reduce_sum(self._kernel(x - x_his, y - y_his, t - t_his), axis=0) 52 | return lam 53 | 54 | @staticmethod 55 | def __homogeneous_poisson_sampling(T, S, maximum): 56 | """ 57 | To generate a homogeneous Poisson point pattern in space S X T, it basically 58 | takes two steps: 59 | 1. Simulate the number of events n = N(S) occurring in S according to a 60 | Poisson distribution with mean lam * |S X T|. 61 | 2. Sample each of the n location according to a uniform distribution on S 62 | respectively. 63 | 64 | Args: 65 | lam: intensity (or maximum intensity when used by thining algorithm) 66 | S: [(min_t, max_t), (min_x, max_x), (min_y, max_y), ...] indicates the 67 | range of coordinates regarding a square (or cubic ...) region. 68 | Returns: 69 | samples: point process samples: 70 | [(t1, x1, y1), (t2, x2, y2), ..., (tn, xn, yn)] 71 | """ 72 | _S = [T] + S 73 | # sample the number of events from S 74 | n = utils.lebesgue_measure(_S) 75 | N = tf.random.poisson(lam=maximum * n, shape=[1], dtype=tf.int32) 76 | # simulate spatial sequence and temporal sequence separately. 77 | points = [ tf.random.uniform(shape=N, minval=_S[i][0], maxval=_S[i][1]) for i in range(len(_S)) ] 78 | # sort the temporal sequence ascendingly. 79 | points[0] = tf.contrib.framework.sort(points[0], direction="ASCENDING") 80 | points = tf.transpose(tf.stack(points)) 81 | return points 82 | 83 | def _inhomogeneous_poisson_thinning(self, homo_points, maximum): 84 | """ 85 | To generate a realization of an inhomogeneous Poisson process in S × T, this 86 | function uses a thining algorithm as follows. For a given intensity function 87 | lam(s, t): 88 | 1. Define an upper bound max_lam for the intensity function lam(s, t) 89 | 2. Simulate a homogeneous Poisson process with intensity max_lam. 90 | 3. "Thin" the simulated process as follows, 91 | a. Compute p = lam(s, t)/max_lam for each point (s, t) of the homogeneous 92 | Poisson process 93 | b. Generate a sample u from the uniform distribution on (0, 1) 94 | c. Retain the locations for which u <= p. 
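        For intuition, the same thinning procedure in plain NumPy (an illustrative
        sketch only; `lam` below is a stand-in constant intensity, whereas the
        implementation in this method evaluates self._lambda on the retained history):

            import numpy as np
            max_lam = 1.                                   # upper bound of the stand-in intensity
            lam = lambda t, x, y, history: 0.5             # constant intensity, for illustration only
            homo_points = np.array([[0.5, 0.1, -0.2],      # homogeneous samples, rows are (t, x, y)
                                    [1.3, 0.4,  0.6]])
            retained = []
            for t, x, y in homo_points:
                u = np.random.uniform()
                if u <= lam(t, x, y, retained) / max_lam:  # retain with probability lam / max_lam
                    retained.append((t, x, y))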
95 | """ 96 | # number of home points 97 | n_homo_points = tf.shape(homo_points)[0] 98 | 99 | # thining procedure 100 | # - input: current index of homo points & current selection for retained points 101 | # - return: updated selection for retained points 102 | def thining(i, selection): 103 | retained_points = tf.boolean_mask(homo_points, selection) 104 | x, y, t = homo_points[i, 1], homo_points[i, 2], homo_points[i, 0] 105 | his_x, his_y, his_t = retained_points[:, 1], retained_points[:, 2], retained_points[:, 0] 106 | # thinning 107 | lam_value = self._lambda(x, y, t, his_x, his_y, his_t) 108 | lam_bar = maximum 109 | D = tf.random.uniform(shape=[1], minval=0., maxval=1.)[0] 110 | # accept: return this point 111 | upd_selection = selection 112 | upd_selection = tf.add(upd_selection, tf.one_hot(i, n_homo_points)) 113 | # reject: return zero entry 114 | return tf.cond(tf.less(D * lam_bar, lam_value), 115 | lambda: upd_selection, # retain this point 116 | lambda: selection) # return the same selection without any change 117 | 118 | # get thining selection 119 | selections = tf.scan( 120 | lambda selection, i: thining(i, selection), 121 | tf.range(n_homo_points), # indices of homo points 122 | initializer=(tf.zeros(shape=[n_homo_points], dtype=tf.float32))) # initial selection 123 | # get retained points 124 | retained_points = tf.boolean_mask(homo_points, selections[-1, :]) 125 | return retained_points 126 | 127 | def sampling(self, T, S, batch_size, keep_latest_k): 128 | """ 129 | generate samples with batch_size by thining algorithm, return sampling sequences and 130 | corresponding element-wise loglikelihood value. 131 | """ 132 | points_list = [] 133 | size_list = [] 134 | # generate inhomogeneous poisson points iterately 135 | for b in range(batch_size): 136 | homo_points = self.__homogeneous_poisson_sampling(T, S, self.maximum) 137 | points = self._inhomogeneous_poisson_thinning(homo_points, self.maximum) 138 | n_points = tf.shape(points)[0] 139 | points_list.append(points) 140 | size_list.append(n_points) 141 | # initialize tensor for sequences 142 | max_size = tf.reduce_max(tf.stack(size_list)) 143 | seqs = [] 144 | logliks = [] 145 | # organize generated samples into tensor seqs 146 | for b in range(batch_size): 147 | n_points = tf.shape(points_list[b])[0] 148 | points = points_list[b] 149 | logpdfs = tf.scan( 150 | lambda a, i: self.log_conditional_pdf(points[:i, :], S, keep_latest_k), 151 | tf.range(1, n_points+1), # from the first point to the last point 152 | initializer=np.array(0., dtype=np.float32)) 153 | seq_paddings = tf.zeros((max_size - n_points, 1 + len(S))) 154 | lik_paddings = tf.zeros(max_size - n_points) 155 | seq = tf.concat([points, seq_paddings], axis=0) 156 | loglik = tf.concat([logpdfs, lik_paddings], axis=0) 157 | seqs.append(seq) 158 | logliks.append(loglik) 159 | seqs = tf.stack(seqs, axis=0) 160 | logliks = tf.expand_dims(tf.stack(logliks, axis=0), -1) 161 | return seqs, logliks 162 | 163 | def log_conditional_pdf(self, points, S, keep_latest_k=None): 164 | """ 165 | log pdf conditional of a data point given its history, where the data point is 166 | points[-1], and its history is points[:-1] 167 | """ 168 | if keep_latest_k is not None: 169 | points = points[-keep_latest_k:, :] 170 | # number of the points 171 | n_points = tf.shape(points)[0] 172 | # variables for calculating triggering probability 173 | x, y, t = points[-1, 1], points[-1, 2], points[-1, 0] 174 | x_his, y_his, t_his = points[:-1, 1], points[:-1, 2], points[:-1, 0] 175 | 176 | def 
pdf_no_history(): 177 | return tf.log(self._lambda(x, y, t, x_his, y_his, t_his)) 178 | 179 | def pdf_with_history(): 180 | # triggering probability 181 | log_trig_prob = tf.log(self._lambda(x, y, t, x_his, y_his, t_his)) 182 | # variables for calculating tail probability 183 | tn, ti = points[-2, 0], points[:-1, 0] 184 | t_ti, tn_ti = t - ti, tn - ti 185 | # tail probability 186 | log_tail_prob = - \ 187 | self.mu * (t - t_his[-1]) * utils.lebesgue_measure(S) - \ 188 | tf.reduce_sum(tf.scan( 189 | lambda a, i: self.C * (tf.exp(- self.beta * tn_ti[i]) - tf.exp(- self.beta * t_ti[i])) / self.beta, 190 | tf.range(tf.shape(t_ti)[0]), 191 | initializer=np.array(0., dtype=np.float32))) 192 | return log_trig_prob + log_tail_prob 193 | 194 | # TODO: Unsolved issue: 195 | # pdf_with_history will still be called even if the condition is true, which leads to exception 196 | # "ValueError: slice index -1 of dimension 0 out of bounds." due to that points is empty but we 197 | # try to index a nonexisted element. 198 | # However, when points is indexed in a scan loop, this works fine and the numerical result is 199 | # also correct. which is very confused to me. Therefore, I leave this problem here temporarily. 200 | log_cond_pdf = tf.cond(tf.less(n_points, 2), 201 | pdf_no_history, # if there is only one point in the sequence 202 | pdf_with_history) # if there is more than one point in the sequence 203 | return log_cond_pdf 204 | 205 | if __name__ == "__main__": 206 | # Unittest example 207 | tf.random.set_random_seed(1234) 208 | with tf.Session() as sess: 209 | hawkes = SpatialTemporalHawkes(C=1., maximum=1e+3) 210 | 211 | init_op = tf.global_variables_initializer() 212 | sess.run(init_op) 213 | 214 | seqs, logliks = hawkes.sampling(T=[0., 10.], S=[[-1., 1.], [-1., 1.]], batch_size=3, keep_latest_k=None) 215 | res1, res2 = sess.run([ seqs, logliks ]) 216 | print(res1, res2) 217 | print(res1.shape, res2.shape) 218 | -------------------------------------------------------------------------------- /imgs/atl-robbery-1.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/meowoodie/Learning-Temporal-Point-Processes-via-Reinforcement-Learning/6920a5c9b5a4b8cbed262b735f36f6171bcc002c/imgs/atl-robbery-1.gif -------------------------------------------------------------------------------- /imgs/cal-earthquake-1.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/meowoodie/Learning-Temporal-Point-Processes-via-Reinforcement-Learning/6920a5c9b5a4b8cbed262b735f36f6171bcc002c/imgs/cal-earthquake-1.gif -------------------------------------------------------------------------------- /misc/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/meowoodie/Learning-Temporal-Point-Processes-via-Reinforcement-Learning/6920a5c9b5a4b8cbed262b735f36f6171bcc002c/misc/__init__.py -------------------------------------------------------------------------------- /misc/plots.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | """ 5 | Utilities for visualizing results of experiments 6 | """ 7 | 8 | import sys 9 | import arrow 10 | import numpy as np 11 | from scipy import stats 12 | import matplotlib.pyplot as plt 13 | 14 | from ppgen import * 15 | 16 | def get_intensity(seq, n_seqs, n_t=100, t0=0, T=None): 17 | """ 18 | Calculate 
intensity (pdf) of input sequences 19 | 20 | Optional method: calculating histogram for the sequences 21 | """ 22 | T = seq.max() if T is None else T 23 | delta_t = float(T - t0) / float(n_t) 24 | 25 | cdf = [ len(filter(lambda t: 0 < t and t < cdf_t, seq)) 26 | for cdf_t in np.arange(t0, T+delta_t, delta_t) ] 27 | pdf = [ float(cur_cdf - prv_cdf) / float(n_seqs) 28 | for prv_cdf, cur_cdf in zip(cdf[:-1], cdf[1:]) ] 29 | 30 | return pdf, np.arange(t0, T, delta_t) 31 | 32 | def get_integral_diffs(seqs, intensity, T_max): 33 | integral_diffs = [] 34 | for seq in seqs: 35 | seq_indice = range(len(filter(lambda t: t>0 and t> sys.stderr, "[%s] Loading learner sequences..." % arrow.now() 134 | 135 | # intensity = IntensityHawkesPlusGaussianMixture(mu=1, alpha=0.3, beta=1, 136 | # k=2, centers=[T_max/4., T_max*3./4.], stds=[1, 1], coefs=[1, 1]) 137 | learner_seqs = np.loadtxt("data/learner_seqs.txt", delimiter=",") 138 | 139 | # intensity = IntensityPoly(mu=1, alpha=0.3, beta=1, 140 | # segs=[0, T_max/4, T_max*2/4, T_max*3/4, T_max], 141 | # b=0, A=[1., -1., 1., -1.]) 142 | # learner_seqs = np.loadtxt("resource/generation/hawkes_poly_learner_seq.txt", delimiter=",") 143 | 144 | print >> sys.stderr, "[%s] Generating expert sequences..." % arrow.now() 145 | # expert_seqs = generate_sample(intensity, T=T_max, n=2000) 146 | expert_seqs = np.loadtxt("data/expert_seqs.txt", delimiter=",") 147 | 148 | # Plot 1: Q-Q plot 149 | print >> sys.stderr, "[%s] Plotting Q-Q plot..." % arrow.now() 150 | # qqplot4intdiff(learner_seqs, expert_seqs, intensity, T=T_max, file_path="results/qqplot4intdiff.png") 151 | 152 | # Plot 2: Intensity plot 153 | print >> sys.stderr, "[%s] Plotting Intensity plot..." % arrow.now() 154 | intensityplot4seqs(learner_seqs, expert_seqs, T=T_max, file_path="results/intensityplot4seqs.png") 155 | -------------------------------------------------------------------------------- /misc/ppgen.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | import abc 5 | import scipy.stats 6 | import numpy as np 7 | 8 | class Intensity(object): 9 | __metaclass__ = abc.ABCMeta 10 | 11 | class IntensityHomogenuosPoisson(Intensity): 12 | 13 | def __init__(self, lam): 14 | self.lam = lam 15 | 16 | def get_value(self, t=None, past_ts=None): 17 | return self.lam 18 | 19 | def get_upper_bound(self, past_ts=None, t=None, to_t=None): 20 | return self.lam 21 | 22 | class IntensityGaussianMixture(Intensity): 23 | 24 | def __init__(self, k=2, centers=[2, 4], stds=[1, 1], coefs= [1, 1]): 25 | self.k = k 26 | self.centers = centers 27 | self.stds = stds 28 | self.coefs = coefs 29 | 30 | def get_value(self, t=None, past_ts=None): 31 | return self._get_gaussianmixture_value(t) 32 | 33 | def get_upper_bound(self, past_ts=None, t=None, to_t=None): 34 | max_val = sum([ self._get_gaussianmixture_value(center) for center in self.centers ]) 35 | return max_val 36 | 37 | def _get_gaussianmixture_value(self, t): 38 | inten = 0 39 | for i in range(self.k): 40 | inten += self.coefs[i] * scipy.stats.norm.pdf(t, self.centers[i], self.stds[i]) 41 | return inten 42 | 43 | def get_integral(self, t, past_ts=None): 44 | return sum([ coef * (scipy.stats.norm.cdf(t, center, std) - \ 45 | scipy.stats.norm.cdf(0, center, std)) 46 | for coef, center, std in zip(self.coefs, self.centers, self.stds) ]) 47 | 48 | class IntensityHawkes(Intensity): 49 | 50 | def __init__(self, mu=1, alpha=0.3, beta=1): 51 | self.mu = mu 52 | self.alpha = alpha 53 | 
self.beta = beta 54 | 55 | def get_value(self, t=None, past_ts=None): 56 | inten = self.mu + np.sum(self.alpha * self.beta * np.exp(-self.beta * np.subtract(t, past_ts))) 57 | return inten 58 | 59 | def get_upper_bound(self, past_ts=None, t=None, to_t=None): 60 | max_val = self.mu + np.sum(self.alpha * self.beta * np.exp(-self.beta * np.subtract(t, past_ts))) 61 | return max_val 62 | 63 | def get_integral(self, t, past_ts): 64 | return self.mu * t + \ 65 | self.alpha * np.sum(1 - np.exp(-self.beta * (t - np.array(past_ts)))) 66 | 67 | class IntensityPoly(Intensity): 68 | 69 | def __init__(self, segs=[0, 1, 2, 3], b=0, A=[1, 2, -3]): 70 | self.segs = segs 71 | self.b = b 72 | self.A = A 73 | if len(A) != len(segs) - 1: 74 | raise Exception("Inequality lies in the numbers of segs and A.") 75 | 76 | def get_value(self, t=None, past_ts=None): 77 | return self._get_poly_value(t) 78 | 79 | def get_upper_bound(self, past_ts=None, t=None, to_t=None): 80 | max_val = 0 81 | segs_within_range = [ s for s in self.segs if s > t and s < to_t ] 82 | if len(segs_within_range) > 0: 83 | max_val = max([ self._get_poly_value(t) for s in segs_within_range ]) 84 | max_val = max([ self._get_poly_value(t), self._get_poly_value(to_t), max_val ]) 85 | return max_val 86 | 87 | def _get_poly_value(self, t): 88 | if t > self.segs[-1]: 89 | raise Exception("t is out of range.") 90 | segs_before_t = [ s for s in self.segs if s < t ] 91 | b = self.b 92 | for seg_ind in range(len(segs_before_t)-1): 93 | b = b + self.A[seg_ind] * (segs_before_t[seg_ind+1] - segs_before_t[seg_ind]) 94 | if len(segs_before_t) >= 1: 95 | value = b + self.A[len(segs_before_t)-1] * (t - segs_before_t[len(segs_before_t)-1]) 96 | else: 97 | value = b 98 | return value 99 | 100 | def get_integral(self, t, past_ts=None): 101 | if t > self.segs[-1]: 102 | raise Exception("t is out of range.") 103 | segs_before_t = [ s for s in self.segs if s < t ] 104 | 105 | # get starting intercepts (bs) for each of segments (size = len(segs_before_t) + 1) 106 | bs = [self.b] 107 | for seg_ind in range(len(segs_before_t)-1): 108 | b = bs[seg_ind] + self.A[seg_ind] * \ 109 | (segs_before_t[seg_ind+1] - segs_before_t[seg_ind]) 110 | bs.append(b) 111 | bs.append(self._get_poly_value(t)) # last intercept 112 | 113 | # get length of each of segments (size = len(segs_before_t)) 114 | lens = [] 115 | for seg_ind in range(len(segs_before_t)-1): 116 | lens.append(segs_before_t[seg_ind+1] - segs_before_t[seg_ind]) 117 | last_seg = segs_before_t[-1] if len(segs_before_t) > 0 else 0 118 | lens.append(t - last_seg) # lengths of last segments 119 | 120 | # get integrals (area) for each of segments 121 | integrals = [ (width1 + width2) * height / 2. 
122 | for width1, width2, height in zip(bs[:-1], bs[1:], lens) ] 123 | return sum(integrals) 124 | # 125 | # class IntensitySelfCorrecting(Intensity): 126 | # 127 | # def __init__(self, mu=1, alpha=0.3): 128 | # self.mu = mu 129 | # self.alpha = alpha 130 | # 131 | # def get_value(self, t, past_ts): 132 | # return np.exp(self.mu * t - self.alpha * len(past_ts)) 133 | # 134 | # def get_upper_bound(self, past_ts=None, t=None, to_t): 135 | # # TODO: Improve this upper bound 136 | # return np.exp(self.mu * to_t) 137 | # 138 | # def get_integral(self, t, past_ts): 139 | # for past_t in past_ts: 140 | # past_t 141 | # return 142 | 143 | class IntensityHawkesPlusPoly(IntensityHawkes, IntensityPoly): 144 | 145 | def __init__(self, mu=1, alpha=0.3, beta=1, 146 | segs=[0, 1, 2, 3], b=0, A=[1, 2, -3]): 147 | IntensityPoly.__init__(self, segs=segs, b=b, A=A) 148 | IntensityHawkes.__init__(self, mu=mu, alpha=alpha, beta=beta) 149 | 150 | def get_value(self, t=None, past_ts=None): 151 | return IntensityHawkes.get_value(self, t=t, past_ts=past_ts) + \ 152 | IntensityPoly.get_value(self, t=t) 153 | 154 | def get_upper_bound(self, past_ts=None, t=None, to_t=None): 155 | return IntensityPoly.get_upper_bound(self, t=t, to_t=to_t) + \ 156 | IntensityHawkes.get_upper_bound(self, past_ts=past_ts, t=t) 157 | 158 | def get_integral(self, t, past_ts): 159 | return IntensityPoly.get_integral(self, t=t) + \ 160 | IntensityHawkes.get_integral(self, t=t, past_ts=past_ts) 161 | 162 | class IntensityHawkesPlusGaussianMixture(IntensityHawkes, IntensityGaussianMixture): 163 | 164 | def __init__(self, mu=1, alpha=0.3, beta=1, 165 | k=2, centers=[2, 4], stds=[1, 1], coefs=[1, 1]): 166 | IntensityHawkes.__init__(self, mu=mu, alpha=alpha, beta=beta) 167 | IntensityGaussianMixture.__init__(self, k=k, centers=centers, stds=stds, coefs=coefs) 168 | 169 | def get_value(self, t=None, past_ts=None): 170 | return IntensityHawkes.get_value(self, t=t, past_ts=past_ts) + \ 171 | IntensityGaussianMixture.get_value(self, t=t) 172 | 173 | def get_upper_bound(self, past_ts=None, t=None, to_t=None): 174 | return IntensityGaussianMixture.get_upper_bound(self, t=t, to_t=to_t) + \ 175 | IntensityHawkes.get_upper_bound(self, past_ts=past_ts, t=t) 176 | 177 | def get_integral(self, t, past_ts): 178 | return IntensityGaussianMixture.get_integral(self, t=t) + \ 179 | IntensityHawkes.get_integral(self, t=t, past_ts=past_ts) 180 | 181 | def generate_sample(intensity, T, n): 182 | seqs = [] 183 | i = 0 184 | while True: 185 | past_ts = [] 186 | cur_t = 0 187 | while True: 188 | intens1 = intensity.get_upper_bound(past_ts=past_ts, t=cur_t, to_t=T) 189 | intens1 = intens1 if intens1 != 0 else 1e-4 190 | t_delta = np.random.exponential(1.0/float(intens1)) 191 | next_t = cur_t + t_delta 192 | # print "cur_t:%f, next_t:%f, delta_t:%f" % (cur_t, next_t, t_delta) 193 | if next_t > T: 194 | break 195 | intens2 = intensity.get_value(t=next_t, past_ts=past_ts) 196 | u = np.random.uniform() 197 | if float(intens2)/float(intens1) >= u: 198 | past_ts.append(next_t) 199 | cur_t = next_t 200 | if len(past_ts) > 1: 201 | seqs.append(past_ts) 202 | i += 1 203 | if i == n: 204 | break 205 | return seqs 206 | 207 | if __name__ == "__main__": 208 | n = 2 209 | T = 10. 
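    # Build a few example intensity functions (Hawkes, piecewise-polynomial, and their
    # combinations with a polynomial or Gaussian-mixture term), then sample n sequences
    # on [0, T] from the piecewise-polynomial intensity and print them.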
210 | intensity_hawkes = IntensityHawkes(mu=1, alpha=0.3, beta=1) 211 | intensity_poly = IntensityPoly(segs=[0, T/4., T*2./4., T*3./4., T], 212 | b=0, A=[2, -2, 2, -2]) 213 | intensity_hawkes_poly = IntensityHawkesPlusPoly(mu=1, alpha=0.3, beta=1, 214 | segs=[0, T/4, T*2/4, T*3/4, T], 215 | b=1, A=[1, -1, 1, -1]) 216 | intensity_hawkes_gaussianmixture = IntensityHawkesPlusGaussianMixture(mu=1, alpha=0.3, beta=1, 217 | k=2, centers=[T/4, T*3/4], stds=[1, 1], coefs=[1, 1]) 218 | 219 | print generate_sample(intensity_poly, T, n) 220 | -------------------------------------------------------------------------------- /ppgrl.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import arrow 4 | import utils 5 | import random 6 | import numpy as np 7 | import tensorflow as tf 8 | import matplotlib.pyplot as plt 9 | 10 | from tfgen import SpatialTemporalHawkes, MarkedSpatialTemporalLSTM 11 | 12 | os.environ['KMP_DUPLICATE_LIB_OK']='True' 13 | 14 | 15 | 16 | class RL_LSTM_Generator(object): 17 | """ 18 | Reinforcement Learning & LSTM Based Point Process Generator 19 | """ 20 | 21 | def __init__(self, T, seq_len, lstm_hidden_size, loc_hidden_size, mak_hidden_size, m_dim): 22 | """ 23 | Params: 24 | - T: the maximum time of the sequences 25 | - seq_len: the length of the sequences 26 | - lstm_hidden_size: size of hidden state of the LSTM 27 | - loc_hidden_size: size of hidden feature of location 28 | - mak_hidden_size: size of hidden feature of mark 29 | - m_dim: number of categories of marks 30 | """ 31 | # model hyper-parameters 32 | self.T = T # maximum time 33 | self.t_dim = 1 # by default 34 | self.l_dim = 2 # by default 35 | self.m_dim = m_dim # number of categories of marks 36 | self.seq_len = seq_len # length of each generated sequences 37 | # LSTM generator 38 | self.mstlstm = MarkedSpatialTemporalLSTM( 39 | step_size=seq_len, lstm_hidden_size=lstm_hidden_size, 40 | loc_hidden_size=loc_hidden_size, mak_hidden_size=mak_hidden_size, m_dim=m_dim) 41 | 42 | def _initialize_policy_network(self, batch_size, starter_learning_rate=0.01, decay_rate=0.99, decay_step=100): 43 | """ 44 | Construct Policy Network 45 | 46 | Policy should be flexible and expressive enough to capture the potential complex point process patterns of data. 47 | Therefore, a customized recurrent neural network (RNN) with stochastic neurons is adopted, where hidden state is 48 | computed by hidden state of last moment and stochastically generated action. i.e. 
49 | a_{i+1} is sampling from pi(a|h_{i}) 50 | h_{i+1} = rnn_cell(h_{i}, a_{i+1}) 51 | """ 52 | # input tensors: expert sequences (time, location, marks) 53 | self.input_seq_t = tf.placeholder(tf.float32, [batch_size, None, self.t_dim]) 54 | self.input_seq_l = tf.placeholder(tf.float32, [batch_size, None, self.l_dim]) 55 | self.input_seq_m = tf.placeholder(tf.float32, [batch_size, None, self.m_dim]) 56 | 57 | # construct customized stochastic LSTM network 58 | self.mstlstm.initialize_network(batch_size) 59 | # generated tensors: learner sequences (time, location, marks) 60 | learner_seq_t, learner_seq_l, learner_seq_m = self.mstlstm.seq_t, self.mstlstm.seq_l, self.mstlstm.seq_m 61 | # log likelihood 62 | learner_seq_loglik = self.mstlstm.seq_loglik 63 | # getting training time window (t_0 = 0, T = self.T by default) 64 | t0, T = 0, self.T # self._training_time_window(learner_seq_t) 65 | 66 | # concatenate batches in the sequences 67 | expert_seq_t, expert_seq_l, expert_seq_m = \ 68 | self.__concatenate_batch(self.input_seq_t), \ 69 | self.__concatenate_batch(self.input_seq_l), \ 70 | self.__concatenate_batch(self.input_seq_m) 71 | learner_seq_t, learner_seq_l, learner_seq_m, learner_seq_loglik = \ 72 | self.__concatenate_batch(learner_seq_t), \ 73 | self.__concatenate_batch(learner_seq_l), \ 74 | self.__concatenate_batch(learner_seq_m), \ 75 | self.__concatenate_batch(learner_seq_loglik) 76 | 77 | # calculate average rewards 78 | reward = self._reward(batch_size, t0, T,\ 79 | expert_seq_t, expert_seq_l, expert_seq_m, \ 80 | learner_seq_t, learner_seq_l, learner_seq_m) # [batch_size*seq_len, 1] 81 | 82 | # cost and optimizer 83 | self.cost = tf.reduce_sum(tf.multiply(reward, learner_seq_loglik), axis=0) / batch_size 84 | global_step = tf.Variable(0, trainable=False) 85 | learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step, decay_step, decay_rate, staircase=True) 86 | self.optimizer = tf.train.AdamOptimizer(learning_rate, beta1=0.6, beta2=0.9).minimize(self.cost, global_step=global_step) 87 | 88 | def _training_time_window(self, learner_seq_t): 89 | """ 90 | Time window for the purpose of training. The model only fits a specific segment of the expert sequence 91 | indicated by 'training_time_window'. This function will return the start time (t_0) and end time (T) of 92 | the segment. 
93 | 94 | Policy 1: 95 | t_0 = 0; T = mean(max(learner_seq_t, axis=0)) 96 | """ 97 | # remove invalid time 98 | mask_t = self.__get_mask_truncate_by_T(learner_seq_t, self.T) # [batch_size, seq_len, 1] 99 | learner_seq_t = tf.multiply(learner_seq_t, mask_t) # [batch_size, seq_len, 1] 100 | # policy 1 101 | t_0 = 0 102 | T = tf.reduce_mean(tf.reduce_max(learner_seq_t, axis=0)) 103 | return t_0, T 104 | 105 | def _reward(self, batch_size, t0, T, 106 | expert_seq_t, expert_seq_l, expert_seq_m, # expert sequences 107 | learner_seq_t, learner_seq_l, learner_seq_m, # learner sequences 108 | kernel_bandwidth=0.5): 109 | """reward function""" 110 | # get mask for concatenated expert and learner sequences 111 | expert_seq_mask = self.__get_mask_truncate_by_T(expert_seq_t, T, t0) # [batch_size*seq_len, 1] 112 | learner_seq_mask = self.__get_mask_truncate_by_T(learner_seq_t, T, t0) # [batch_size*seq_len, 1] 113 | # calculate mask for kernel matrix 114 | learner_learner_kernel_mask = tf.matmul(learner_seq_mask, tf.transpose(learner_seq_mask)) 115 | expert_learner_kernel_mask = tf.matmul(expert_seq_mask, tf.transpose(learner_seq_mask)) 116 | # concatenate each data dimension for both expert sequence and learner sequence 117 | # TODO: Add mark to the sequences 118 | # expert_seq = tf.concat([expert_seq_t, expert_seq_l], axis=1) # [batch_size*seq_len, t_dim+l_dim+m_dim] 119 | # learner_seq = tf.concat([learner_seq_t, learner_seq_l], axis=1) # [batch_size*seq_len, t_dim+l_dim+m_dim] 120 | expert_seq = tf.concat([expert_seq_l], axis=1) # [batch_size*seq_len, t_dim] 121 | learner_seq = tf.concat([learner_seq_l], axis=1) # [batch_size*seq_len, t_dim] 122 | # calculate upper-half kernel matrix 123 | learner_learner_kernel, expert_learner_kernel = self.__kernel_matrix( 124 | learner_seq, expert_seq, kernel_bandwidth) # 2 * [batch_size*seq_len, batch_size*seq_len] 125 | learner_learner_kernel = tf.multiply(learner_learner_kernel, learner_learner_kernel_mask) 126 | expert_learner_kernel = tf.multiply(expert_learner_kernel, expert_learner_kernel_mask) 127 | # calculate reward for each of data point in learner sequence 128 | emp_ll_mean = tf.reduce_sum(learner_learner_kernel, axis=0) * 2 # batch_size*seq_len 129 | emp_el_mean = tf.reduce_sum(expert_learner_kernel, axis=0) * 2 # batch_size*seq_len 130 | return tf.expand_dims(emp_ll_mean - emp_el_mean, -1) # [batch_size*seq_len, 1] 131 | 132 | @staticmethod 133 | def __get_mask_truncate_by_T(seq_t, T, t_0=0): 134 | """Masking time, location and mark sequences for the entries before the maximum time T.""" 135 | # get basic mask where 0 if t > T else 1 136 | mask_t = tf.multiply( 137 | tf.cast(seq_t < T, tf.float32), 138 | tf.cast(seq_t > t_0, tf.float32)) 139 | return mask_t # [batch_size*seq_len, 1] or [batch_size, seq_len, 1] 140 | 141 | @staticmethod 142 | def __concatenate_batch(seqs): 143 | """Concatenate each batch of the sequences into a single sequence.""" 144 | array_seq = tf.unstack(seqs, axis=0) # [batch_size, seq_len, data_dim] 145 | seq = tf.concat(array_seq, axis=0) # [batch_size*seq_len, data_dim] 146 | return seq 147 | 148 | @staticmethod 149 | def __kernel_matrix(learner_seq, expert_seq, kernel_bandwidth): 150 | """ 151 | Construct kernel matrix based on learn sequence and expert sequence, each entry of the matrix 152 | is the distance between two data points in learner_seq or expert_seq. 
return two matrix, left_mat 153 | is the distances between learn sequence and learn sequence, right_mat is the distances between 154 | learn sequence and expert sequence. 155 | """ 156 | # calculate l2 distances 157 | learner_learner_mat = utils.l2_norm(learner_seq, learner_seq) # [batch_size*seq_len, batch_size*seq_len] 158 | expert_learner_mat = utils.l2_norm(expert_seq, learner_seq) # [batch_size*seq_len, batch_size*seq_len] 159 | # exponential kernel 160 | learner_learner_mat = tf.exp(-learner_learner_mat / kernel_bandwidth) 161 | expert_learner_mat = tf.exp(-expert_learner_mat / kernel_bandwidth) 162 | return learner_learner_mat, expert_learner_mat 163 | 164 | def train(self, sess, batch_size, 165 | epoches, # number of epoches (how many times is the entire dataset going to be trained) 166 | expert_seq_t, # [n, seq_len, 1] 167 | expert_seq_l, # [n, seq_len, 2] 168 | expert_seq_m, # [n, seq_len, m_dim] 169 | train_test_ratio = 9., # n_train / n_test 170 | trainplot=True, # plot the change of intensity over epoches 171 | pretrained=False): 172 | """Train the point process generator given expert sequences.""" 173 | # check the consistency of the shape of the expert sequences 174 | assert expert_seq_t.shape[:-1] == expert_seq_l.shape[:-1] == expert_seq_m.shape[:-1], \ 175 | "inconsistant 'number of sequences' or 'sequence length' of input expert sequences" 176 | 177 | # initialization 178 | if not pretrained: 179 | # initialize network parameters 180 | init_op = tf.global_variables_initializer() 181 | sess.run(init_op) 182 | # initialize policy network 183 | self._initialize_policy_network(batch_size) 184 | 185 | # data configurations 186 | # - number of expert sequences 187 | n_data = expert_seq_t.shape[0] 188 | n_train = int(n_data * train_test_ratio / (train_test_ratio + 1.)) 189 | n_test = int(n_data * 1. / (train_test_ratio + 1.)) 190 | # - number of batches 191 | n_batches = int(n_train / batch_size) 192 | # - check if test data size is large enough (> batch_size) 193 | assert n_test >= batch_size, "test data size %d is less than batch size %d." 
% (n_test, batch_size) 194 | 195 | if trainplot: 196 | ppim = utils.PointProcessIntensityMeter(self.T, batch_size) 197 | 198 | # training over epoches 199 | for epoch in range(epoches): 200 | # shuffle indices of the training samples 201 | shuffled_ids = np.arange(n_data) 202 | np.random.shuffle(shuffled_ids) 203 | shuffled_train_ids = shuffled_ids[:n_train] 204 | shuffled_test_ids = shuffled_ids[-n_test:] 205 | 206 | # training over batches 207 | avg_train_cost = [] 208 | avg_test_cost = [] 209 | for b in range(n_batches): 210 | idx = np.arange(batch_size * b, batch_size * (b + 1)) 211 | # training and testing indices selected in current batch 212 | batch_train_ids = shuffled_train_ids[idx] 213 | batch_test_ids = shuffled_test_ids[:batch_size] 214 | # training and testing batch data 215 | batch_train_expert_t = expert_seq_t[batch_train_ids, :, :] 216 | batch_train_expert_l = expert_seq_l[batch_train_ids, :, :] 217 | batch_train_expert_m = expert_seq_m[batch_train_ids, :, :] 218 | batch_test_expert_t = expert_seq_t[batch_test_ids, :, :] 219 | batch_test_expert_l = expert_seq_l[batch_test_ids, :, :] 220 | batch_test_expert_m = expert_seq_m[batch_test_ids, :, :] 221 | # # Debug 222 | # debug1, debug2 = sess.run([self.mstlstm.test1, self.mstlstm.test2], feed_dict={ 223 | # self.input_seq_t: batch_test_expert_t, 224 | # self.input_seq_l: batch_test_expert_l, 225 | # self.input_seq_m: batch_test_expert_m}) 226 | # print(debug1) 227 | # print(debug2) 228 | # optimization procedure 229 | sess.run(self.optimizer, feed_dict={ 230 | self.input_seq_t: batch_train_expert_t, 231 | self.input_seq_l: batch_train_expert_l, 232 | self.input_seq_m: batch_train_expert_m}) 233 | # cost for train batch and test batch 234 | train_cost = sess.run(self.cost, feed_dict={ 235 | self.input_seq_t: batch_train_expert_t, 236 | self.input_seq_l: batch_train_expert_l, 237 | self.input_seq_m: batch_train_expert_m}) 238 | test_cost = sess.run(self.cost, feed_dict={ 239 | self.input_seq_t: batch_test_expert_t, 240 | self.input_seq_l: batch_test_expert_l, 241 | self.input_seq_m: batch_test_expert_m}) 242 | # record cost for each batch 243 | avg_train_cost.append(train_cost) 244 | avg_test_cost.append(test_cost) 245 | 246 | if trainplot: 247 | # update intensity plot 248 | learner_seq_t, learner_seq_l = sess.run( 249 | [self.mstlstm.seq_t, self.mstlstm.seq_l], 250 | feed_dict={ 251 | self.input_seq_t: batch_test_expert_t, 252 | self.input_seq_l: batch_test_expert_l, 253 | self.input_seq_m: batch_test_expert_m}) 254 | ppim.update_time_intensity(batch_train_expert_t, learner_seq_t) 255 | ppim.update_location_intensity(batch_train_expert_l, learner_seq_l) 256 | 257 | # training log output 258 | avg_train_cost = np.mean(avg_train_cost) 259 | avg_test_cost = np.mean(avg_test_cost) 260 | print('[%s] Epoch %d (n_train_batches=%d, batch_size=%d)' % (arrow.now(), epoch, n_batches, batch_size), file=sys.stderr) 261 | print('[%s] Training cost:\t%f' % (arrow.now(), avg_train_cost), file=sys.stderr) 262 | print('[%s] Testing cost:\t%f' % (arrow.now(), avg_test_cost), file=sys.stderr) 263 | -------------------------------------------------------------------------------- /results/911calls-105.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/meowoodie/Learning-Temporal-Point-Processes-via-Reinforcement-Learning/6920a5c9b5a4b8cbed262b735f36f6171bcc002c/results/911calls-105.png -------------------------------------------------------------------------------- 
/results/poisson_exp.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/meowoodie/Learning-Temporal-Point-Processes-via-Reinforcement-Learning/6920a5c9b5a4b8cbed262b735f36f6171bcc002c/results/poisson_exp.gif -------------------------------------------------------------------------------- /results/qqplot4intdiff.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/meowoodie/Learning-Temporal-Point-Processes-via-Reinforcement-Learning/6920a5c9b5a4b8cbed262b735f36f6171bcc002c/results/qqplot4intdiff.png -------------------------------------------------------------------------------- /tfgen.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | """ 5 | Imitation Learning for Point Process 6 | 7 | A LSTM based model for generating marked spatial-temporal points. 8 | 9 | References: 10 | - https://arxiv.org/abs/1811.05016 11 | 12 | Dependencies: 13 | - Python 3.6.7 14 | - tensorflow==1.5.0 15 | """ 16 | 17 | import sys 18 | import arrow 19 | import utils 20 | import numpy as np 21 | import tensorflow as tf 22 | 23 | from stppg import GaussianMixtureDiffusionKernel, HawkesLam, SpatialTemporalPointProcess 24 | 25 | 26 | 27 | class MarkedSpatialTemporalLSTM(object): 28 | """ 29 | Customized Stochastic LSTM Network 30 | 31 | A LSTM Network with customized stochastic output neurons, which used to generate time, location and marks accordingly. 32 | """ 33 | 34 | def __init__(self, step_size, lstm_hidden_size, loc_hidden_size, mak_hidden_size, m_dim, x_lim=5, y_lim=5, epsilon=0.3): 35 | """ 36 | Params: 37 | - step_size: the steps (length) of the LSTM network 38 | - lstm_hidden_size: size of hidden state of the LSTM 39 | - loc_hidden_size: size of hidden feature of location 40 | - mak_hidden_size: size of hidden feature of mark 41 | - m_dim: number of categories of marks 42 | """ 43 | 44 | # data dimension 45 | self.t_dim = 1 # by default 46 | self.m_dim = m_dim # number of categories for the marks 47 | 48 | # model hyper-parameters 49 | self.step_size = step_size # step size of LSTM 50 | self.lstm_hidden_size = lstm_hidden_size # size of LSTM hidden feature 51 | self.loc_hidden_size = loc_hidden_size # size of location hidden feature 52 | self.loc_param_size = 5 # by default 53 | self.mak_hidden_size = mak_hidden_size # size of mark hidden feature 54 | self.x_lim, self.y_lim = x_lim, y_lim 55 | self.epsilon = epsilon 56 | 57 | INIT_PARAM_RATIO = 1 / np.sqrt(self.loc_hidden_size * self.loc_param_size) 58 | 59 | # define learning weights 60 | # - time weights 61 | self.Wt = tf.get_variable(name="Wt", initializer=INIT_PARAM_RATIO * tf.random_normal([self.lstm_hidden_size, self.t_dim])) 62 | self.bt = tf.get_variable(name="bt", initializer=INIT_PARAM_RATIO * tf.random_normal([self.t_dim])) 63 | # - location weights 64 | self.Wl0 = tf.get_variable(name="Wl0", initializer=INIT_PARAM_RATIO * tf.random_normal([self.lstm_hidden_size, self.loc_hidden_size])) 65 | self.bl0 = tf.get_variable(name="bl0", initializer=INIT_PARAM_RATIO * tf.random_normal([self.loc_hidden_size])) 66 | self.Wl1 = tf.get_variable(name="Wl1", initializer=INIT_PARAM_RATIO * tf.random_normal([self.loc_hidden_size, self.loc_param_size])) 67 | self.bl1 = tf.get_variable(name="bl1", initializer=INIT_PARAM_RATIO * tf.random_normal([self.loc_param_size])) 68 | # - mark weights 69 | self.Wm0 = tf.get_variable(name="Wm0", 
initializer=INIT_PARAM_RATIO * tf.random_normal([self.lstm_hidden_size, self.mak_hidden_size])) 70 | self.bm0 = tf.get_variable(name="bm0", initializer=INIT_PARAM_RATIO * tf.random_normal([self.mak_hidden_size])) 71 | self.Wm1 = tf.get_variable(name="Wm1", initializer=INIT_PARAM_RATIO * tf.random_normal([self.mak_hidden_size, self.m_dim])) 72 | self.bm1 = tf.get_variable(name="bm1", initializer=INIT_PARAM_RATIO * tf.random_normal([self.m_dim])) 73 | 74 | def initialize_network(self, batch_size): 75 | """Create a new network for training purpose, where the LSTM is at the zero state""" 76 | # create a basic LSTM cell 77 | tf_lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(self.lstm_hidden_size) 78 | # defining initial basic LSTM hidden state [2, batch_size, lstm_hidden_size] 79 | # - lstm_state.h: hidden state [batch_size, lstm_hidden_size] 80 | # - lstm_state.c: cell state [batch_size, lstm_hidden_size] 81 | init_lstm_state = tf_lstm_cell.zero_state(batch_size, dtype=tf.float32) 82 | # construct customized LSTM network 83 | self.seq_t, self.seq_l, self.seq_m, self.seq_loglik, self.final_state = self._recurrent_structure( 84 | batch_size, tf_lstm_cell, init_lstm_state) 85 | 86 | def _recurrent_structure(self, 87 | batch_size, 88 | tf_lstm_cell, # tensorflow LSTM cell object, e.g. 'tf.nn.rnn_cell.BasicLSTMCell' 89 | init_lstm_state): # initial LSTM state tensor 90 | """Recurrent structure with customized LSTM cells.""" 91 | # defining initial data point 92 | # - init_t: initial time [batch_size, t_dim] 93 | init_t = tf.zeros([batch_size, self.t_dim], dtype=tf.float32) 94 | # concatenate each customized LSTM cell by loop 95 | seq_t = [] # generated sequence initialization 96 | seq_l = [] 97 | seq_m = [] 98 | seq_loglik = [] 99 | last_t, last_lstm_state = init_t, init_lstm_state # loop initialization 100 | for _ in range(self.step_size): 101 | t, l, m, loglik, state = self._customized_lstm_cell(batch_size, tf_lstm_cell, last_lstm_state, last_t) 102 | seq_t.append(t) # record generated time 103 | seq_l.append(l) # record generated location 104 | seq_m.append(m) # record generated mark 105 | seq_loglik.append(loglik) # record log likelihood 106 | last_t = t # reset last_t 107 | last_lstm_state = state # reset last_lstm_state 108 | seq_t = tf.stack(seq_t, axis=1) # [batch_size, step_size, t_dim] 109 | seq_l = tf.stack(seq_l, axis=1) # [batch_size, step_size, 2] 110 | seq_m = tf.stack(seq_m, axis=1) # [batch_size, step_size, m_dim] 111 | seq_loglik = tf.stack(seq_loglik, axis=1) # [batch_size, step_size, 1] 112 | return seq_t, seq_l, seq_m, seq_loglik, state 113 | 114 | def _customized_lstm_cell(self, batch_size, 115 | tf_lstm_cell, # tensorflow LSTM cell object, e.g. 'tf.nn.rnn_cell.BasicLSTMCell' 116 | last_state, # last state as input of this LSTM cell 117 | last_t): # last_t + delta_t as input of this LSTM cell 118 | """ 119 | Customized Stochastic LSTM Cell 120 | 121 | The customized LSTM cell takes current (time 't', location 'l', mark 'm') and the hidden state of last moment 122 | as input, return the ('next_t', 'next_l', 'next_m') as well as the hidden state for the next moment. The time, 123 | location and mark will be sampled based upon last hidden state. 124 | 125 | The reason avoid using tensorflow builtin rnn structure is that, besides last hidden state, the other feedback 126 | to next moment is a customized stochastic variable which depends on the last moment's rnn output. 
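        Schematically, one update step of the cell does the following (an informal
        outline of the implementation below; _dt, _l and _m are the sampling helpers
        defined in this class):

            delta_t, loglik_t = self._dt(batch_size, last_state.h)  # sample time interval (exponential)
            next_l,  loglik_l = self._l(batch_size, last_state.h)   # sample location shift (bivariate Gaussian)
            next_m,  loglik_m = self._m(batch_size, last_state.h)   # sample mark (multinomial, Gumbel trick)
            next_t = last_t + delta_t
            x = tf.concat([next_l], axis=1)                          # mark and time are not fed back yet (see TODO below)
            _, next_state = tf.nn.static_rnn(tf_lstm_cell, [x], initial_state=last_state)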
127 | """ 128 | # stochastic neurons for generating time, location and mark 129 | delta_t, loglik_t = self._dt(batch_size, last_state.h) # [batch_size, t_dim], [batch_size, 1] 130 | next_l, loglik_l = self._l(batch_size, last_state.h) # [batch_size, 2], [batch_size, 1] 131 | next_m, loglik_m = self._m(batch_size, last_state.h) # [batch_size, m_dim], [batch_size, 1] 132 | next_t = last_t + delta_t # [batch_size, t_dim] 133 | # log likelihood 134 | loglik = loglik_l # + loglik_l # + loglik_m # TODO: Add mark to input x 135 | # input of LSTM 136 | x = tf.concat([next_l], axis=1) # TODO: Add mark to input x 137 | # one step rnn structure 138 | # - x is a tensor that contains a single step of data points with shape [batch_size, t_dim + l_dim + m_dim] 139 | # - state is a tensor of hidden state with shape [2, batch_size, state_size] 140 | _, next_state = tf.nn.static_rnn(tf_lstm_cell, [x], initial_state=last_state, dtype=tf.float32) 141 | return next_t, next_l, next_m, loglik, next_state 142 | 143 | def _dt(self, batch_size, hidden_state): 144 | """Sampling time interval given hidden state of LSTM""" 145 | theta_h = tf.nn.elu(tf.matmul(hidden_state, self.Wt) + self.bt) + 1 # [batch_size, t_dim=1] 146 | # reparameterization trick for sampling action from exponential distribution 147 | delta_t = - tf.log(tf.random_uniform([batch_size, self.t_dim], dtype=tf.float32)) / theta_h # [batch_size, t_dim=1] 148 | # log likelihood 149 | loglik = - tf.multiply(theta_h, delta_t) + tf.log(theta_h) # [batch_size, 1] 150 | return delta_t, loglik 151 | 152 | def _l(self, batch_size, hidden_state): 153 | """Sampling location shifts given hidden state of LSTM""" 154 | # masks for epsilon greedy exploration & regular sampling 155 | p = tf.random_uniform([batch_size, 1], 0, 1) # [batch_size, 1] 156 | l_eps_mask = tf.cast(p < self.epsilon, dtype=tf.float32) # [batch_size, 1] 157 | l_reg_mask = 1. 
- l_eps_mask # [batch_size, 1] 158 | 159 | # sample from uniform distribution (epsilon greedy exploration) 160 | lx_eps = tf.random_uniform([batch_size, 1], minval=-self.x_lim, maxval=self.x_lim, dtype=tf.float32) 161 | ly_eps = tf.random_uniform([batch_size, 1], minval=-self.y_lim, maxval=self.y_lim, dtype=tf.float32) 162 | 163 | # sample from the distribution determined by hidden state 164 | dense_feature = tf.nn.relu(tf.matmul(hidden_state, self.Wl0)) + self.bl0 # [batch_size, loc_hidden_size] 165 | dense_feature = tf.matmul(dense_feature, self.Wl1) + self.bl1 # [batch_size, loc_param_size] 166 | # - 5 params that determine the distribution of location shifts with shape [batch_size] 167 | mu0 = tf.reshape(dense_feature[:, 0], [batch_size, 1]) 168 | mu1 = tf.reshape(dense_feature[:, 1], [batch_size, 1]) 169 | # - construct a symmetric positive definite matrix as covariance matrix 170 | A11 = tf.expand_dims(tf.reshape(dense_feature[:, 2], [batch_size, 1]), -1) # [batch_size, 1, 1] 171 | A22 = tf.expand_dims(tf.reshape(dense_feature[:, 3], [batch_size, 1]), -1) # [batch_size, 1, 1] 172 | A21 = tf.expand_dims(tf.reshape(dense_feature[:, 4], [batch_size, 1]), -1) # [batch_size, 1, 1] 173 | A12 = tf.zeros([batch_size, 1, 1]) # [batch_size, 1, 1] 174 | A1 = tf.concat([A11, A12], axis=2) # [batch_size, 1, 2] 175 | A2 = tf.concat([A21, A22], axis=2) # [batch_size, 1, 2] 176 | A = tf.concat([A1, A2], axis=1) # [batch_size, 2, 2] 177 | # - sigma = A * A^T with shape [batch_size, 2, 2] 178 | sigma = tf.scan(lambda a, x: tf.matmul(x, tf.transpose(x)), A) # [batch_size, 2, 2] 179 | sigma11 = tf.expand_dims(sigma[:, 0, 0], -1) # [batch_size, 1] 180 | sigma22 = tf.expand_dims(sigma[:, 1, 1], -1) # [batch_size, 1] 181 | sigma12 = tf.expand_dims(sigma[:, 0, 1], -1) # [batch_size, 1] 182 | # - random variable for generating location 183 | rv0 = tf.random_normal([batch_size, 1]) 184 | rv1 = tf.random_normal([batch_size, 1]) 185 | # - location x and y 186 | x = mu0 + tf.multiply(sigma11, rv0) + tf.multiply(sigma12, rv1) # [batch_size, 1] 187 | y = mu1 + tf.multiply(sigma12, rv0) + tf.multiply(sigma22, rv1) # [batch_size, 1] 188 | 189 | # # combine exploration and regular sampling 190 | # x = tf.multiply(lx_eps, l_eps_mask) + tf.multiply(x, l_reg_mask) 191 | # y = tf.multiply(ly_eps, l_eps_mask) + tf.multiply(y, l_reg_mask) 192 | l = tf.concat([x, y], axis=1) # [batch_size, 2] 193 | 194 | # log likelihood 195 | sigma1 = tf.sqrt(tf.square(sigma11) + tf.square(sigma12)) 196 | sigma2 = tf.sqrt(tf.square(sigma12) + tf.square(sigma22)) 197 | v12 = tf.multiply(sigma11, sigma12) + tf.multiply(sigma12, sigma22) 198 | rho = v12 / tf.multiply(sigma1, sigma2) 199 | z = tf.square(x - mu0) / tf.square(sigma1) \ 200 | - 2 * tf.multiply(rho, tf.multiply(x - mu0, y - mu1)) / tf.multiply(sigma1, sigma2) \ 201 | + tf.square(y - mu1) / tf.square(sigma2) 202 | loglik = - z / 2 / (1 - tf.square(rho)) \ 203 | - tf.log(2 * np.pi * tf.multiply(tf.multiply(sigma1, sigma2), tf.sqrt(1 - tf.square(rho)))) 204 | 205 | return l, loglik 206 | 207 | def _m(self, batch_size, hidden_state): 208 | """Sampling mark given hidden state of LSTM""" 209 | dense_feature = tf.nn.relu(tf.matmul(hidden_state, self.Wm0)) + self.bm0 # [batch_size, mak_hidden_size] 210 | dense_feature = tf.nn.elu(tf.matmul(dense_feature, self.Wm1) + self.bm1) + 1 # [batch_size, m_dim] dense_feature is positive 211 | # sample from multinomial distribution (use Gumbel trick to sample the labels) 212 | eps = 1e-13 213 | rv_uniform = tf.random_uniform([batch_size,
self.m_dim]) 214 | rv_Gumbel = -tf.log(-tf.log(rv_uniform + eps) + eps) 215 | label = tf.argmax(dense_feature + rv_Gumbel, axis=1) # label: [batch_size] 216 | m = tf.one_hot(indices=label, depth=self.m_dim) # [batch_size, m_dim] 217 | # log likelihood 218 | prob = tf.nn.softmax(dense_feature) 219 | loglik = tf.log(tf.reduce_sum(m * prob, 1) + 1e-13) 220 | return m, loglik 221 | -------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | import branca 4 | import folium 5 | import geopandas 6 | import pandas as pd 7 | import numpy as np 8 | import seaborn as sns 9 | import tensorflow as tf 10 | import matplotlib 11 | import matplotlib.pyplot as plt 12 | import matplotlib.cm as cm 13 | from matplotlib import animation 14 | from matplotlib.backends.backend_pdf import PdfPages 15 | from mpl_toolkits.axes_grid1 import make_axes_locatable 16 | from shapely.geometry import Polygon 17 | 18 | def lebesgue_measure(S): 19 | """ 20 | A helper function for calculating the Lebesgue measure for a space. 21 | It is the length of a one-dimensional space, and the area of 22 | a two-dimensional space. 23 | """ 24 | sub_lebesgue_ms = [ sub_space[1] - sub_space[0] for sub_space in S ] 25 | return np.prod(sub_lebesgue_ms) 26 | 27 | 28 | 29 | def l2_norm(x, y): 30 | """ 31 | This helper function calculates the pairwise squared distance (squared l2 norm) between data points from tensor x and 32 | tensor y, where x and y have the same shape [length, data_dim]. 33 | """ 34 | x = tf.cast(x, dtype=tf.float32) 35 | y = tf.cast(y, dtype=tf.float32) 36 | x_sqr = tf.expand_dims(tf.reduce_sum(x * x, 1), -1) # [length, 1] 37 | y_sqr = tf.expand_dims(tf.reduce_sum(y * y, 1), -1) # [length, 1] 38 | xy = tf.matmul(x, tf.transpose(y)) # [length, length] 39 | dist_mat = x_sqr + tf.transpose(y_sqr) - 2 * xy 40 | return dist_mat 41 | 42 | 43 | 44 | class Meter(object): 45 | """ 46 | Base class for the point process visualizer 47 | """ 48 | def __init__(self, batch_size): 49 | self.batch_size = batch_size 50 | # figure and axes for time intensity plot 51 | self.fig_t = plt.figure() 52 | self.ax_t = self.fig_t.add_subplot(111) 53 | # figure and axes for space intensity plot 54 | self.fig_l = plt.figure() 55 | self.ax_l1 = self.fig_l.add_subplot(1,2,1) 56 | self.ax_l2 = self.fig_l.add_subplot(1,2,2) 57 | plt.ion() 58 | 59 | 60 | 61 | class PointProcessDistributionMeter(Meter): 62 | """ 63 | Data distribution visualizer for point process 64 | """ 65 | def __init__(self, T, S, batch_size): 66 | self.T = T 67 | self.S = S 68 | Meter.__init__(self, batch_size) 69 | 70 | def update_time_distribution(self, seq_t_learner, seq_t_expert): 71 | self.update_distribution(seq_t_learner, seq_t_expert, 72 | self.ax_t, self.T, 73 | xlabel="Time", ylabel="Distribution") 74 | 75 | def update_location_distribution(self, seq_l_learner, seq_l_expert): 76 | self.update_distribution(seq_l_learner[:, :, 0], seq_l_expert[:, :, 0], 77 | self.ax_l1, self.S[0], 78 | xlabel="X", ylabel="Distribution") 79 | self.update_distribution(seq_l_learner[:, :, 1], seq_l_expert[:, :, 1], 80 | self.ax_l2, self.S[1], 81 | xlabel="Y", ylabel="Distribution") 82 | 83 | @staticmethod 84 | def update_distribution(seq_learner, seq_expert, axes, xlim, xlabel, ylabel): 85 | # clear last figure 86 | axes.clear() 87 | seq_learner = seq_learner.flatten() 88 | seq_learner = seq_learner[seq_learner != 0] 89 | 
seq_expert = seq_expert.flatten() 90 | seq_expert = seq_expert[seq_expert != 0] 91 | sns.set(color_codes=True) 92 | sns.distplot(seq_learner, ax=axes, hist=False, rug=True, label="Learner") 93 | sns.distplot(seq_expert, ax=axes, hist=False, rug=True, label="Expert") 94 | axes.set_xlim(xlim) 95 | axes.set(xlabel=xlabel, ylabel=ylabel) 96 | axes.legend(frameon=False) 97 | plt.pause(0.02) 98 | 99 | 100 | 101 | class PointProcessIntensityMeter(Meter): 102 | """ 103 | Conditional intensity visualizer for point process 104 | """ 105 | def __init__(self, T, batch_size): 106 | self.T = T 107 | Meter.__init__(self, batch_size) 108 | 109 | def update_time_intensity(self, seq_t_1, seq_t_2, tlim=10): 110 | # clear last figure 111 | self.ax_t.clear() 112 | # sequence 1 113 | seq_flat_1 = seq_t_1.flatten() 114 | seq_flat_1 = seq_flat_1[seq_flat_1 != 0] 115 | seq_1_intensity_cum = [] 116 | for grid in np.arange(0, self.T, 0.5): 117 | idx = (seq_flat_1 < grid) 118 | event_count_cum = len(seq_flat_1[idx]) 119 | seq_1_intensity_cum = np.append(seq_1_intensity_cum, event_count_cum) 120 | seq_1_intensity = np.append(seq_1_intensity_cum[0], np.diff(seq_1_intensity_cum)) / self.batch_size 121 | self.ax_t.plot(np.arange(0, self.T, 0.5), seq_1_intensity) 122 | # sequence 2 123 | seq_flat_2 = seq_t_2.flatten() 124 | seq_flat_2 = seq_flat_2[seq_flat_2 != 0] 125 | seq_2_intensity_cum = [] 126 | for grid in np.arange(0, self.T, 0.5): 127 | idx = (seq_flat_2 < grid) 128 | event_count_cum = len(seq_flat_2[idx]) 129 | seq_2_intensity_cum = np.append(seq_2_intensity_cum, event_count_cum) 130 | seq_2_intensity = np.append(seq_2_intensity_cum[0], np.diff(seq_2_intensity_cum)) / self.batch_size 131 | self.ax_t.plot(np.arange(0, self.T, 0.5), seq_2_intensity) 132 | # configure plot limits 133 | self.ax_t.set_ylim((0, tlim)) 134 | plt.pause(0.02) 135 | 136 | def update_location_intensity(self, seq_l_1, seq_l_2, xylim=5, gridsize=51): 137 | # clear last figure 138 | self.ax_l1.clear() 139 | self.ax_l2.clear() 140 | # configure bins for histogram 141 | xedges = np.linspace(-xylim, xylim, gridsize) 142 | yedges = np.linspace(-xylim, xylim, gridsize) 143 | # sequence 1 144 | seq_1_x = seq_l_1[:, :, 0].flatten() 145 | seq_1_y = seq_l_1[:, :, 1].flatten() 146 | H, xedges, yedges = np.histogram2d(seq_1_x, seq_1_y, bins=(xedges, yedges)) 147 | self.ax_l1.imshow(H.T, interpolation='nearest', origin='lower', extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]]) 148 | # sequence 2 149 | seq_2_x = seq_l_2[:, :, 0].flatten() 150 | seq_2_y = seq_l_2[:, :, 1].flatten() 151 | H, xedges, yedges = np.histogram2d(seq_2_x, seq_2_y, bins=(xedges, yedges)) 152 | self.ax_l2.imshow(H.T, interpolation='nearest', origin='lower', extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]]) 153 | # configure plot limits 154 | self.ax_l1.set_xlim((-xylim, xylim)) 155 | self.ax_l1.set_ylim((-xylim, xylim)) 156 | self.ax_l2.set_xlim((-xylim, xylim)) 157 | self.ax_l2.set_ylim((-xylim, xylim)) 158 | plt.pause(0.02) 159 | 160 | 161 | 162 | class DataAdapter(): 163 | """ 164 | A helper class for normalizing data to a specified range and restoring it back to its original range. 165 | 166 | init_data: numpy data points with shape [batch_size, seq_len, 3] that defines the x, y, t limits 167 | S: data spatial range. eg. [[-1., 1.], [-1., 1.]] 168 | T: data temporal range. eg. [0., 10.] 
169 | """ 170 | def __init__(self, init_data, S=[[-1, 1], [-1, 1]], T=[0., 10.]): 171 | self.data = init_data 172 | self.T = T 173 | self.S = S 174 | self.tlim = [ init_data[:, :, 0].min(), init_data[:, :, 0].max() ] 175 | mask = np.nonzero(init_data[:, :, 0]) 176 | x_nonzero = init_data[:, :, 1][mask] 177 | y_nonzero = init_data[:, :, 2][mask] 178 | self.xlim = [ x_nonzero.min(), x_nonzero.max() ] 179 | self.ylim = [ y_nonzero.min(), y_nonzero.max() ] 180 | print(self.tlim) 181 | print(self.xlim) 182 | print(self.ylim) 183 | 184 | def normalize(self, data): 185 | """normalize batches of data points to the specified range""" 186 | rdata = np.copy(data) 187 | for b in range(len(rdata)): 188 | # scale x 189 | rdata[b, np.nonzero(rdata[b, :, 0]), 1] = \ 190 | (rdata[b, np.nonzero(rdata[b, :, 0]), 1] - self.xlim[0]) / \ 191 | (self.xlim[1] - self.xlim[0]) * (self.S[0][1] - self.S[0][0]) + self.S[0][0] 192 | # scale y 193 | rdata[b, np.nonzero(rdata[b, :, 0]), 2] = \ 194 | (rdata[b, np.nonzero(rdata[b, :, 0]), 2] - self.ylim[0]) / \ 195 | (self.ylim[1] - self.ylim[0]) * (self.S[1][1] - self.S[1][0]) + self.S[1][0] 196 | # scale t 197 | rdata[b, np.nonzero(rdata[b, :, 0]), 0] = \ 198 | (rdata[b, np.nonzero(rdata[b, :, 0]), 0] - self.tlim[0]) / \ 199 | (self.tlim[1] - self.tlim[0]) * (self.T[1] - self.T[0]) + self.T[0] 200 | return rdata 201 | 202 | def restore(self, data): 203 | """restore the normalized batches of data points back to their real ranges.""" 204 | ndata = np.copy(data) 205 | for b in range(len(ndata)): 206 | # scale x 207 | ndata[b, np.nonzero(ndata[b, :, 0]), 1] = \ 208 | (ndata[b, np.nonzero(ndata[b, :, 0]), 1] - self.S[0][0]) / \ 209 | (self.S[0][1] - self.S[0][0]) * (self.xlim[1] - self.xlim[0]) + self.xlim[0] 210 | # scale y 211 | ndata[b, np.nonzero(ndata[b, :, 0]), 2] = \ 212 | (ndata[b, np.nonzero(ndata[b, :, 0]), 2] - self.S[1][0]) / \ 213 | (self.S[1][1] - self.S[1][0]) * (self.ylim[1] - self.ylim[0]) + self.ylim[0] 214 | # scale t 215 | ndata[b, np.nonzero(ndata[b, :, 0]), 0] = \ 216 | (ndata[b, np.nonzero(ndata[b, :, 0]), 0] - self.T[0]) / \ 217 | (self.T[1] - self.T[0]) * (self.tlim[1] - self.tlim[0]) + self.tlim[0] 218 | return ndata 219 | 220 | def normalize_location(self, x, y): 221 | """normalize a single data location to the specified range""" 222 | _x = (x - self.xlim[0]) / (self.xlim[1] - self.xlim[0]) * (self.S[0][1] - self.S[0][0]) + self.S[0][0] 223 | _y = (y - self.ylim[0]) / (self.ylim[1] - self.ylim[0]) * (self.S[1][1] - self.S[1][0]) + self.S[1][0] 224 | return np.array([_x, _y]) 225 | 226 | def restore_location(self, x, y): 227 | """restore a single data location back to its original range""" 228 | _x = (x - self.S[0][0]) / (self.S[0][1] - self.S[0][0]) * (self.xlim[1] - self.xlim[0]) + self.xlim[0] 229 | _y = (y - self.S[1][0]) / (self.S[1][1] - self.S[1][0]) * (self.ylim[1] - self.ylim[0]) + self.ylim[0] 230 | return np.array([_x, _y]) 231 | 232 | def __str__(self): 233 | raw_data_str = "raw data example:\n%s\n" % self.data[:1] 234 | nor_data_str = "normalized data example:\n%s" % self.normalize(self.data[:1]) 235 | return raw_data_str + nor_data_str 236 | 237 | 238 | 239 | def spatial_intensity_on_map( 240 | path, # html saving path 241 | da, # data adapter object defined in utils.py 242 | lam, # lambda object defined in stppg.py 243 | data, # a sequence of data points [seq_len, 3] that happened in the past 244 | seq_ind, # index of sequence for visualization 245 | t, # normalized observation moment (t) 246 | xlim, # real observation x range 247 | 
ylim, # real observation y range 248 | ngrid=100): 249 | """Plot spatial intensity at time t over the entire map given its coordinates limits.""" 250 | # data preparation 251 | # - remove the first element in the seq, since t_0 is always 0, 252 | # which will cause numerical issue when computing lambda value 253 | seqs = da.normalize(data)[:, 1:, :] 254 | seq = seqs[seq_ind] # visualize the sequence indicated by seq_ind 255 | seq = seq[np.nonzero(seq[:, 0])[0], :] # only retain nonzero values 256 | print(seq) 257 | seq_t, seq_s = seq[:, 0], seq[:, 1:] 258 | sub_seq_t = seq_t[seq_t < t] # only retain values before time t. 259 | sub_seq_s = seq_s[:len(sub_seq_t)] 260 | # generate spatial grid polygons 261 | xmin, xmax, width = xlim[0], xlim[1], xlim[1] - xlim[0] 262 | ymin, ymax, height = ylim[0], ylim[1], ylim[1] - ylim[0] 263 | grid_height, grid_width = height / ngrid, width / ngrid 264 | x_left_origin = xmin 265 | x_right_origin = xmin + grid_width 266 | y_top_origin = ymax 267 | y_bottom_origin = ymax - grid_height 268 | polygons = [] # spatial polygons 269 | lam_dict = {} # spatial intensity 270 | _id = 0 271 | for i in range(ngrid): 272 | y_top = y_top_origin 273 | y_bottom = y_bottom_origin 274 | for j in range(ngrid): 275 | # append the intensity value to the list 276 | s = da.normalize_location((x_left_origin + x_right_origin) / 2., (y_top + y_bottom) / 2.) 277 | v = lam.value(t, sub_seq_t, s, sub_seq_s) 278 | lam_dict[str(_id)] = np.log(v) 279 | _id += 1 280 | # append polygon to the list 281 | polygons.append(Polygon( 282 | [(y_top, x_left_origin), (y_top, x_right_origin), (y_bottom, x_right_origin), (y_bottom, x_left_origin)])) 283 | # update coordinates 284 | y_top = y_top - grid_height 285 | y_bottom = y_bottom - grid_height 286 | x_left_origin += grid_width 287 | x_right_origin += grid_width 288 | # convert polygons to geopandas object 289 | geo_df = geopandas.GeoSeries(polygons) 290 | # init map 291 | # _map = folium.Map(location=[sum(xlim)/2., sum(ylim)/2.], zoom_start=12, zoom_control=True) 292 | _map = folium.Map(location=[sum(xlim)/2., sum(ylim)/2.], zoom_start=6, zoom_control=True, tiles='Stamen Terrain') 293 | # plot polygons on the map 294 | print(min(lam_dict.values()), max(lam_dict.values())) 295 | lam_cm = branca.colormap.linear.YlOrRd_09.scale(np.log(3), np.log(150)) # colorbar for intensity values 296 | poi_cm = branca.colormap.linear.PuBu_09.scale(min(sub_seq_t), max(sub_seq_t)) # colorbar for lasting time of points 297 | folium.GeoJson( 298 | data = geo_df.to_json(), 299 | style_function = lambda feature: { 300 | 'fillColor': lam_cm(lam_dict[feature['id']]), 301 | 'fillOpacity': .5, 302 | 'weight': 0.}).add_to(_map) 303 | # plot markers on the map 304 | for i in range(len(sub_seq_t)): 305 | x, y = da.restore_location(*sub_seq_s[i]) 306 | folium.Circle( 307 | location=[x, y], 308 | radius=10, # sub_seq_t[i] * 100, 309 | color=poi_cm(sub_seq_t[i]), 310 | fill=True, 311 | fill_color='blue').add_to(_map) 312 | # save the map 313 | _map.save(path) --------------------------------------------------------------------------------
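
The two modules above define the sequence generator (`tfgen.py`) and the data/visualization helpers (`utils.py`), but this excerpt contains no driver code that ties them together. Below is a minimal, hypothetical sketch of how they might be wired up under TensorFlow 1.x; the hyper-parameter values and the data path are illustrative assumptions rather than values taken from the repository, and it assumes the `stppg` module imported at the top of `tfgen.py` is available on the path.

```python
import numpy as np
import tensorflow as tf

from tfgen import MarkedSpatialTemporalLSTM   # note: tfgen.py itself imports stppg.py
from utils import DataAdapter

# illustrative hyper-parameters (assumed values, not from the repository)
batch_size = 32
lstm = MarkedSpatialTemporalLSTM(
    step_size=10, lstm_hidden_size=7,
    loc_hidden_size=7, mak_hidden_size=7, m_dim=2)
lstm.initialize_network(batch_size)           # builds seq_t, seq_l, seq_m, seq_loglik

# hypothetical expert data with shape [num_seqs, seq_len, 3] ordered as (t, x, y);
# DataAdapter rescales it into S = [-1, 1] x [-1, 1] and T = [0, 10]
expert_data = np.load("data/expert_sequences.npy")    # placeholder path
da          = DataAdapter(init_data=expert_data, S=[[-1., 1.], [-1., 1.]], T=[0., 10.])
expert_norm = da.normalize(expert_data)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # draw one batch of (time, location, mark) sequences from the (untrained) generator
    seq_t, seq_l, seq_m = sess.run([lstm.seq_t, lstm.seq_l, lstm.seq_m])
    print(seq_t.shape, seq_l.shape, seq_m.shape)      # (32, 10, 1) (32, 10, 2) (32, 10, 2)
```

An actual training loop would additionally consume `lstm.seq_loglik` together with the normalized expert sequences; that code is not part of the two files shown here.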