├── log
│   ├── re_log_2017-06-12_02:55:09
│   └── re_log_2017-06-12_02:59:23
├── pg_re.pyc
├── pg_su.pyc
├── data
│   └── .DS_Store
├── environment.pyc
├── other_agents.pyc
├── parameters.pyc
├── pg_network.pyc
├── plot
│   └── .DS_Store
├── slow_down_cdf.pyc
├── job_distribution.pyc
├── pg_re_single_core.pyc
├── deeprm state space.png
├── deeprm2 state space.png
├── launcher2.py
├── run_script.py
├── other_agents.py
├── job_distribution.py
├── parameters.py
├── README.md
├── launcher.py
├── pg_su.py
├── slow_down_cdf.py
├── pg_re_single_core_o.py
├── pg_re_single_core.py
├── pg_network.py
├── pg_re_o.py
├── pg_re.py
└── environment.py
/log/re_log_2017-06-12_02:55:09:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/log/re_log_2017-06-12_02:59:23:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/deeprm state space.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BrightFeather/deeprm_conv/HEAD/deeprm state space.png
--------------------------------------------------------------------------------
/deeprm2 state space.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BrightFeather/deeprm_conv/HEAD/deeprm2 state space.png
--------------------------------------------------------------------------------
/launcher2.py:
--------------------------------------------------------------------------------
# tidier and simpler launcher without use of command line
import os
# THEANO_FLAGS is read when Theano is first imported, so set it before the imports below
os.environ["THEANO_FLAGS"] = "device=cpu,floatX=float32"
import sys
import getopt
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, plots are written to files

import parameters
import pg_re
import pg_re_single_core
import pg_su
import slow_down_cdf

pa = parameters.Parameters()
pa.type_exp = "pg_re"
# pa.pg_resume = "data/pg_su_net_file_9990.pkl"
pa.simu_len = 50
pa.num_ex = 10
pa.output_filename = "data/pg_re_conv"
pa.output_freq = 2
pg_re_single_core.launch(pa)
--------------------------------------------------------------------------------
/run_script.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python

import os

simu_len = 200

# launch one background training run per job arrival rate
for new_job_rate in [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]:
    for num_seq_per_batch in [20]:
        for num_ex in [100]:
            for num_nw in [10]:

                file_name = 'data/pg_re_rate_' + str(new_job_rate) + '_simu_len_' + str(simu_len) + '_num_seq_per_batch_' + str(num_seq_per_batch) + '_ex_' + str(num_ex) + '_nw_' + str(num_nw)
                log = 'log/pg_re_rate_' + str(new_job_rate) + '_simu_len_' + str(simu_len) + '_num_seq_per_batch_' + str(num_seq_per_batch) + '_ex_' + str(num_ex) + '_nw_' + str(num_nw)

                # run experiment
                os.system('nohup python -u launcher.py --exp_type=pg_re --out_freq=50 --simu_len=' + str(simu_len) + ' --eps_max_len=' + str(simu_len * 4) + ' --num_ex=' + str(num_ex) + ' --new_job_rate=' + str(new_job_rate) + ' --num_seq_per_batch=' + str(num_seq_per_batch) + ' --num_nw=' + str(num_nw) + ' --ofile=' + file_name + ' > ' + log + ' &')

                # plot slowdown
                # it_num = 100
                # os.system('nohup python -u launcher.py --exp_type=test --simu_len=' + str(simu_len) + ' --num_ex=' + str(num_ex) + ' --new_job_rate=' + str(new_job_rate) + ' --num_seq_per_batch=' + str(num_seq_per_batch) + ' --pg_re=' + file_name + '_' + str(it_num) + '.pkl' + ' &')
--------------------------------------------------------------------------------
/other_agents.py:
--------------------------------------------------------------------------------
import numpy as np


def get_packer_action(machine, job_slot):
    """Pick the feasible pending job whose request best aligns with the free resources."""
    align_score = 0
    act = len(job_slot.slot)  # if no action available, hold

    for i in xrange(len(job_slot.slot)):
        new_job = job_slot.slot[i]
        if new_job is not None:  # there is a pending job

            avbl_res = machine.avbl_slot[:new_job.len, :]
            res_left = avbl_res - new_job.res_vec

            if np.all(res_left[:] >= 0):  # enough resource to allocate

                tmp_align_score = avbl_res[0, :].dot(new_job.res_vec)

                if tmp_align_score > align_score:
                    align_score = tmp_align_score
                    act = i
    return act


def get_sjf_action(machine, job_slot):
    """Pick the shortest feasible pending job (shortest job first)."""
    sjf_score = 0
    act = len(job_slot.slot)  # if no action available, hold

    for i in xrange(len(job_slot.slot)):
        new_job = job_slot.slot[i]
        if new_job is not None:  # there is a pending job

            avbl_res = machine.avbl_slot[:new_job.len, :]
            res_left = avbl_res - new_job.res_vec

            if np.all(res_left[:] >= 0):  # enough resource to allocate

                tmp_sjf_score = 1 / float(new_job.len)

                if tmp_sjf_score > sjf_score:
                    sjf_score = tmp_sjf_score
                    act = i
    return act


def get_packer_sjf_action(machine, job_slot, knob):  # knob controls which to favor, 1 to packer, 0 to sjf
    """Pick the feasible pending job with the best weighted packer/SJF score."""
    combined_score = 0
    act = len(job_slot.slot)  # if no action available, hold

    for i in xrange(len(job_slot.slot)):
        new_job = job_slot.slot[i]
        if new_job is not None:  # there is a pending job

            avbl_res = machine.avbl_slot[:new_job.len, :]
            res_left = avbl_res - new_job.res_vec

            if np.all(res_left[:] >= 0):  # enough resource to allocate

                tmp_align_score = avbl_res[0, :].dot(new_job.res_vec)
                tmp_sjf_score = 1 / float(new_job.len)

                tmp_combined_score = knob * tmp_align_score + (1 - knob) * tmp_sjf_score

                if tmp_combined_score > combined_score:
                    combined_score = tmp_combined_score
                    act = i
    return act


def get_random_action(job_slot):
    """Pick uniformly among all job slots and the hold action."""
    num_act = len(job_slot.slot) + 1  # + 1 for the hold action
    act = np.random.randint(num_act)
    return act
--------------------------------------------------------------------------------
/job_distribution.py:
--------------------------------------------------------------------------------
import numpy as np


class Dist:

    def __init__(self, num_res, max_nw_size, job_len):
        self.num_res = num_res
        self.max_nw_size = max_nw_size
        self.job_len = job_len

        self.job_small_chance = 0.8

        self.job_len_big_lower = job_len * 2 / 3
        self.job_len_big_upper = job_len

        self.job_len_small_lower = 1
        self.job_len_small_upper = job_len / 5

        self.dominant_res_lower = max_nw_size / 2
        self.dominant_res_upper = max_nw_size

        self.other_res_lower = 1
        self.other_res_upper = max_nw_size / 5

    def normal_dist(self):

        # new work duration
        nw_len = np.random.randint(1, self.job_len + 1)  # same length in every dimension

        nw_size = np.zeros(self.num_res)

        for i in range(self.num_res):
            nw_size[i] = np.random.randint(1, self.max_nw_size + 1)

        return nw_len, nw_size

    def bi_model_dist(self):

        # -- job length --
        if np.random.rand() < self.job_small_chance:  # small job
            nw_len = np.random.randint(self.job_len_small_lower,
                                       self.job_len_small_upper + 1)
        else:  # big job
            nw_len = np.random.randint(self.job_len_big_lower,
                                       self.job_len_big_upper + 1)

        nw_size = np.zeros(self.num_res)

        # -- job resource request --
        dominant_res = np.random.randint(0, self.num_res)
        for i in range(self.num_res):
            if i == dominant_res:
                nw_size[i] = np.random.randint(self.dominant_res_lower,
                                               self.dominant_res_upper + 1)
            else:
                nw_size[i] = np.random.randint(self.other_res_lower,
                                               self.other_res_upper + 1)

        return nw_len, nw_size


def generate_sequence_work(pa, seed=42):

    np.random.seed(seed)

    simu_len = pa.simu_len * pa.num_ex

    nw_dist = pa.dist.bi_model_dist

    nw_len_seq = np.zeros(simu_len, dtype=int)
    nw_size_seq = np.zeros((simu_len, pa.num_res), dtype=int)

    for i in range(simu_len):

        if np.random.rand() < pa.new_job_rate:  # a new job comes

            nw_len_seq[i], nw_size_seq[i, :] = nw_dist()

    nw_len_seq = np.reshape(nw_len_seq,
                            [pa.num_ex, pa.simu_len])
    nw_size_seq = np.reshape(nw_size_seq,
                             [pa.num_ex, pa.simu_len, pa.num_res])

    return nw_len_seq, nw_size_seq
--------------------------------------------------------------------------------
/parameters.py:
--------------------------------------------------------------------------------
import numpy as np
import math

import job_distribution


class Parameters:
    def __init__(self):

        self.output_filename = 'data/tmp'

        self.num_epochs = 1000         # number of training epochs
        self.simu_len = 10             # length of the busy cycle that repeats itself, length of each trajectory
        self.num_ex = 1                # number of sequences, number of episodes

        self.output_freq = 10          # interval for output and store parameters

        self.num_seq_per_batch = 10    # number of sequences to compute baseline
        self.episode_max_length = 200  # enforcing an artificial terminal

        self.num_res = 2               # number of resources in the system
        self.num_nw = 5                # maximum allowed number of work in the queue

        self.time_horizon = 20         # number of time steps in the graph
        self.max_job_len = 15          # maximum duration of new jobs
        self.res_slot = 10             # maximum number of available resource slots
        self.max_job_size = 10         # maximum resource request of new work

        self.backlog_size = 60         # backlog queue size

        self.max_track_since_new = 10  # track how many time steps since last new jobs

        self.job_num_cap = 40          # maximum number of distinct colors in current work graph

        self.new_job_rate = 0.7        # lambda in new job arrival Poisson Process

        self.discount = 1              # discount factor

        # distribution for new job arrival
        self.dist = job_distribution.Dist(self.num_res, self.max_job_size, self.max_job_len)

        # graphical representation (reshaped layout: one row of matrices per resource, rows stacked vertically)
        assert self.backlog_size % self.time_horizon == 0  # such that it can be converted into an image
        self.backlog_width = int(math.ceil(self.backlog_size / float(self.time_horizon * self.num_res)))
        self.network_input_height = self.time_horizon * self.num_res
        self.network_input_width = \
            (self.res_slot +
             self.max_job_size * self.num_nw) + \
            self.backlog_width + \
            1  # for extra info, 1) time since last new job

        # compact representation
        self.network_compact_dim = (self.num_res + 1) * \
            (self.time_horizon + self.num_nw) + 1  # + 1 for backlog indicator

        self.network_output_dim = self.num_nw + 1  # + 1 for void action

        self.delay_penalty = -1    # penalty for delaying things in the current work screen
        self.hold_penalty = -1     # penalty for holding things in the new work screen
        self.dismiss_penalty = -1  # penalty for missing a job because the queue is full

        self.num_frames = 1   # number of frames to combine and process
        self.lr_rate = 0.001  # learning rate
        self.rms_rho = 0.9    # for rms prop
        self.rms_eps = 1e-9   # for rms prop

        self.unseen = False   # change random seed to generate unseen example

        # supervised learning mimic policy
        self.batch_size = 10
        self.evaluate_policy_name = "SJF"

    def compute_dependent_parameters(self):
        # NOTE: these formulas describe the layout with all resources side by side
        # (height = time_horizon), unlike the stacked layout computed in __init__;
        # calling this method overrides the values set there.
        assert self.backlog_size % self.time_horizon == 0  # such that it can be converted into an image
        self.backlog_width = self.backlog_size / self.time_horizon
        self.network_input_height = self.time_horizon
        self.network_input_width = \
            (self.res_slot +
             self.max_job_size * self.num_nw) * self.num_res + \
            self.backlog_width + \
            1  # for extra info, 1) time since last new job

        # compact representation
        self.network_compact_dim = (self.num_res + 1) * \
            (self.time_horizon + self.num_nw) + 1  # + 1 for backlog indicator

        self.network_output_dim = self.num_nw + 1  # + 1 for void action
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# DeepRM+

Based on the work of Hongzi Mao (HotNets'16): http://people.csail.mit.edu/hongzi/content/publications/DeepRM-HotNets16.pdf

The improvements below are made on top of DeepRM: http://github.com/hongzimao/deeprm

## Improvement on algorithm structure:

#### Rebuild the network with a convolutional neural network

File: build_small_conv_pg_network in pg_network.py

Network structure:

Input: convolutional layer with 16 filters of size 2*2

Output: fully connected layer with one output per action

Major improvement. Improved convergence rate (by ??? --> To Do)

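
As a rough check of what this layer does to the state image, the sketch below computes the feature-map shape produced by 16 filters of size 2*2. The input size is derived from the defaults in parameters.py. The stride of 1 and the absence of padding are assumptions, since they are not stated here, and `conv_output_shape` is a hypothetical helper rather than a function from pg_network.py.

```python
import math


def conv_output_shape(height, width, filter_size=2, num_filters=16, stride=1):
    """Shape (filters, rows, cols) of an unpadded 2-D convolution (assumed settings)."""
    out_h = (height - filter_size) // stride + 1
    out_w = (width - filter_size) // stride + 1
    return num_filters, out_h, out_w


# Defaults copied from parameters.py
time_horizon, num_res = 20, 2
res_slot, max_job_size, num_nw, backlog_size = 10, 10, 5, 60

input_height = time_horizon * num_res  # 40
backlog_width = int(math.ceil(backlog_size / float(time_horizon * num_res)))  # 2
input_width = res_slot + max_job_size * num_nw + backlog_width + 1  # 63

conv_shape = conv_output_shape(input_height, input_width)
num_actions = num_nw + 1  # one output per job slot, plus the void action

print(conv_shape)   # (16, 39, 62)
print(num_actions)  # 6
```
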
#### Reshape state space.

File: environment.py

In DeepRM, the state matrix was generated by placing the following matrices side by side in a single row:

State matrix for resource 1, job 1's request matrix for resource 1, job 2's request matrix for resource 1, ... , job n's request matrix for resource 1,\
State matrix for resource 2, job 1's request matrix for resource 2, job 2's request matrix for resource 2, ... , job n's request matrix for resource 2.

I decided to put related matrices closer together, so the matrices are now arranged as follows:

One row per resource, each placed side by side: State matrix for resource 1, job 1's request matrix for resource 1, job 2's request matrix for resource 1, ... , job n's request matrix for resource 1,\
and State matrix for resource 2, job 1's request matrix for resource 2, job 2's request matrix for resource 2, ... , job n's request matrix for resource 2.
Then the two long matrices above are stacked vertically.

See the pictures below for a better explanation:

Original state matrix

Reshaped state matrix

Major improvement. Improved the average slowdown by 8.9% after 1000 epochs of training.
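
The two layouts can be illustrated with NumPy. This is a minimal sketch, not the code in environment.py: the block sizes follow the defaults in parameters.py, the blocks are filled with placeholder values, `resource_row` is a hypothetical helper, and the backlog and extra-info columns are left out for brevity.

```python
import numpy as np

# Defaults copied from parameters.py
time_horizon, num_res = 20, 2
res_slot, max_job_size, num_nw = 10, 10, 5


def resource_row(res_id):
    """Machine state and job-slot request matrices for one resource, side by side."""
    machine = np.full((time_horizon, res_slot), res_id + 1.0)  # placeholder values
    jobs = [np.zeros((time_horizon, max_job_size)) for _ in range(num_nw)]
    return np.hstack([machine] + jobs)


rows = [resource_row(r) for r in range(num_res)]

original = np.hstack(rows)  # DeepRM: all resources in one wide row
reshaped = np.vstack(rows)  # reshaped: one row per resource, stacked vertically

print(original.shape)  # (20, 120)
print(reshaped.shape)  # (40, 60)
```

These shapes agree with the two `network_input_width` formulas in parameters.py once the backlog and extra-info columns are added.
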

#### Rewrite penalty function.

File: parameters.py

I gave different penalty weights to jobs that are already scheduled (in the machine matrix), jobs in the job slot queue, and jobs in the backlog.

Minor improvement. Improved convergence rate.
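
A per-step reward with separate weights for the three groups of jobs could look like the sketch below. The weight values, the `Job` tuple, and the `step_reward` helper are hypothetical and chosen only for illustration. Dividing each penalty by the job length follows the reward described in the DeepRM paper and is assumed here rather than taken from environment.py.

```python
from collections import namedtuple

Job = namedtuple("Job", ["len"])  # hypothetical stand-in for a job object

# Illustrative weights only; parameters.py defines the actual values
DELAY_PENALTY = -1.0    # jobs already scheduled on the machine
HOLD_PENALTY = -2.0     # jobs waiting in the job slot queue
DISMISS_PENALTY = -3.0  # jobs waiting in the backlog


def step_reward(running, slots, backlog):
    """Sum of per-job penalties, each scaled by 1 / job length."""
    reward = 0.0
    for job in running:
        reward += DELAY_PENALTY / job.len
    for job in slots:
        if job is not None:  # empty slots are represented as None
            reward += HOLD_PENALTY / job.len
    for job in backlog:
        if job is not None:
            reward += DISMISS_PENALTY / job.len
    return reward


r = step_reward(running=[Job(4)], slots=[Job(2), None], backlog=[Job(1)])
print(r)  # -0.25 - 1.0 - 3.0 = -4.25
```
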

## Others
* Added logging and checkpoint saving to record slowdown and save models (pg_re_single_core.py and pg_re.py)

* Added launcher2 for convenient launching and debugging (launcher2.py)

## Install prerequisites

```
sudo apt-get update
sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
pip install --user Theano
pip install --user Lasagne==0.1
sudo apt-get install python-matplotlib
```

## Run code
In folder RL, create a data/ folder.

Use `launcher.py` to launch experiments.

71 | ```
72 | --exp_type
73 | --num_res
74 | --num_nw
75 | --simu_len
76 | --num_ex
77 | --num_seq_per_batch
78 | --eps_max_len
79 | --num_epochs
80 | --time_horizon