├── .gitattributes ├── .gitignore ├── README.md ├── data ├── InputFiles │ ├── .svn │ │ ├── all-wcprops │ │ └── entries │ ├── test.swf │ ├── train.swf │ └── validate.swf └── Results │ ├── train16_27_12.rst │ ├── train16_27_12.rwd │ └── train16_27_12.ult ├── requirements.txt └── src_fc ├── Config ├── ad_bf_para.set ├── config_n.set └── config_sys.set ├── CqGym ├── Gym.py ├── GymGraphics.py ├── GymState.py └── __init__.py ├── CqSim ├── Backfill.py ├── Basic_algorithm.py ├── Cqsim_sim.py ├── Info_collect.py ├── Job_trace.py ├── Node_struc.py ├── Start_window.py └── __init__.py ├── Extend ├── SWF │ ├── Filter_job_SWF.py │ ├── Filter_node_SWF.py │ ├── Node_struc_SWF.py │ └── __init__.py └── __init__.py ├── Filter ├── Filter_job.py ├── Filter_node.py └── __init__.py ├── IOModule ├── Debug_log.py ├── Log_print.py ├── Output_log.py └── __init__.py ├── Interface └── __init__.py ├── Models ├── A2C.py ├── DQL.py ├── PG.py ├── PPO.py └── __init__.py ├── SWF_filter.py ├── ThreadMgr ├── Pause.py └── __init__.py ├── Trainer ├── A2C_Trainer.py ├── DQL_Trainer.py ├── FCFS.py ├── PG_Trainer.py ├── PPO_Trainer.py └── __init__.py ├── __init__.py ├── cqsim.py ├── cqsim_main.py └── cqsim_path.py /.gitattributes: -------------------------------------------------------------------------------- 1 | # Auto detect text files and perform LF normalization 2 | * text=auto 3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | __* 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CQGym: Gym Environment for Reinforcement-Learned Batch Job scheduling 2 | 3 | All necessary packages can be installed with 4 | ``` 5 | pip install -r requirements.txt 6 | ``` 7 | There are many command line options available in the CQGym environment. 
These can be viewed with 8 | ``` 9 | python cqsim.py -h 10 | ``` 11 | The following outlines the most common use cases for CQGym. 12 | 13 | 14 | ## Training and testing 15 | Common options for training a model from scratch: 16 | ``` 17 | python cqsim.py -n [str] -j [str] --rl_alg [str] --is_training [1] --output_weight_file [str] 18 | ``` 19 | Common options for testing a model: 20 | ``` 21 | python cqsim.py -n [str] -j [str] --rl_alg [str] --is_training [0] --input_weight_file [str] 22 | ``` 23 | * **-n:** [str]. The name of the job trace file present in /data/InputFiles/, such as test.swf. 24 | * **-j:** [str]. The same file name used for **-n**. 25 | * **--rl_alg:** [str]. The name of the training algorithm to use: PG, DQL, A2C, PPO, or FCFS. Defaults to FCFS. 26 | * **--is_training:** [int]. 1 = perform optimization, 0 = no optimization. 27 | * **--output_weight_file:** [str]. The file name under which model weights are saved. The file can be found under /data/Fmt/ at the end of execution. 28 | * **--input_weight_file:** [str]. The name of a file of existing model weights to load. The file should be present in /data/Fmt/. 29 | 30 | ### Other environment options 31 | These options are useful for making a custom training routine using CQGym calls. 32 | * **-R:** [int]. Specify the number of job traces to simulate before stopping. Defaults to 8000. 33 | * **-r:** [int]. Specify the job trace starting point as a line number. Defaults to 0. 34 | * **--do_render:** [int]. 1 = display graphics, 0 = do not display graphics. The rendered graphics report training performance within the episode. 35 | 36 | ### Training and testing example script 37 | Training for two episodes of 1500 job traces each. 38 | ``` 39 | python cqsim.py -j train.swf -n train.swf -R 1500 --is_training 1 --output_weight_file pg0 --rl_alg PG 40 | python cqsim.py -j train.swf -n train.swf -r 1501 -R 1500 --is_training 1 --input_weight_file pg0 --output_weight_file pg1 --rl_alg PG 41 | ``` 42 | Testing on the validation file for 5000 job traces.
43 | ``` 44 | python cqsim.py -j validate.swf -n validate.swf -R 5000 --is_training 0 --input_weight_file pg0 --rl_alg PG 45 | python cqsim.py -j validate.swf -n validate.swf -R 5000 --is_training 0 --input_weight_file pg1 --rl_alg PG 46 | ``` 47 | 48 | ### Learning parameters 49 | Model hyperparameters can be modified using these options. 50 | * **--learning_rate:** [float]. Defaults to 0.000021. 51 | * **--batch_size:** [int]. The number of state-action-value sequences recorded by the agent before performing optimization. Defaults to 70. 52 | * **--window_size:** [int]. Input size: the number of jobs from the queue that the agent considers for scheduling. Defaults to 50. 53 | * **--reward_discount:** [float]. A value in [0, 1] that sets the importance of future rewards. Corresponds to gamma in the Bellman optimality equation. Defaults to 0.95. 54 | 55 | ### Config/ 56 | Additionally, all default values can be found and modified in src_fc/Config/. 57 | 58 | ## Data Collection 59 | Output from training and testing episodes goes to /data/Results/. 60 | * **.rst:** Job scheduling results. 61 | 62 | | Column | Description | 63 | | ------ | ----------- | 64 | | 1 | Job ID | 65 | | 2 | Processor count | 66 | | 3 | Requested time | 67 | | 4 | Actual runtime | 68 | | 5 | Wait time | 69 | | 6 | Submission time | 70 | | 7 | Start time | 71 | | 8 | End time | 72 | 73 | * **.ult:** Changes to system utilization. 74 | 75 | | Column | Description | 76 | | ------ | ----------- | 77 | | 1 | Time | 78 | | 2 | Utilization % | 79 | 80 | * **.rwd:** Reward results.
81 | 82 | | Column | Description | 83 | | :---- | :---- | 84 | | 1 | Reward value | 85 | -------------------------------------------------------------------------------- /data/InputFiles/.svn/all-wcprops: -------------------------------------------------------------------------------- 1 | K 25 2 | svn:wc:ra_dav:version-url 3 | V 37 4 | /svn/Cqsim/!svn/ver/1/data/InputFiles 5 | END 6 | -------------------------------------------------------------------------------- /data/InputFiles/.svn/entries: -------------------------------------------------------------------------------- 1 | 10 2 | 3 | dir 4 | 2 5 | http://bluesky.cs.iit.edu/svn/Cqsim/data/InputFiles 6 | http://bluesky.cs.iit.edu/svn/Cqsim 7 | 8 | 9 | 10 | 2012-03-26T19:05:16.207088Z 11 | 1 12 | cqsim 13 | 14 | 15 | svn:special svn:externals svn:needs-lock 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 980ef0c6-80ec-457d-8438-a9b7fd81fd26 28 | 29 | -------------------------------------------------------------------------------- /data/InputFiles/test.swf: -------------------------------------------------------------------------------- 1 | ; Version: 2.2 2 | ; Computer: IBM SP2 3 | ; Installation: Swedish Royal Institute of Technology (KTH) 4 | ; Acknowledge: Lars Malinowsky 5 | ; Information: http://www.pdc.kth.se/ 6 | ; http://www.cs.huji.ac.il/labs/parallel/workload/ 7 | ; Conversion: Dror Feitelson (feit@cs.huji.ac.il) 1 Aug 2006 8 | ; MaxJobs: 28490 9 | ; MaxRecords: 28490 10 | ; Preemption: No 11 | ; UnixStartTime: 843480031 12 | ; TimeZone: 3600 13 | ; TimeZoneString: Europe/Stockholm 14 | ; StartTime: Mon Sep 23 14:00:31 CEST 1996 15 | ; EndTime: Fri Aug 29 10:55:01 CEST 1997 16 | ; MaxNodes: 100 17 | ; MaxProcs: 100 18 | ; Note: uses the EASY scheduler 19 | ; 20 | 1 0000 11464 4000 40 -1 -1 40 4000 -1 1 17 17 -1 -1 -1 -1 -1 21 | 2 1000 11464 2000 80 -1 -1 80 2000 -1 1 17 17 -1 -1 -1 -1 -1 22 | 3 2000 11464 1000 90 -1 -1 90 1000 -1 1 17 17 -1 -1 -1 -1 -1 23 | 4 3000 11464 5000 20 -1 -1 20 5000 
-1 1 17 17 -1 -1 -1 -1 -1 24 | 5 4000 11464 3000 60 -1 -1 60 3000 -1 1 17 17 -1 -1 -1 -1 -1 25 | 6 5000 11464 3000 50 -1 -1 50 3000 -1 1 17 17 -1 -1 -1 -1 -1 26 | 7 6000 11464 2000 50 -1 -1 50 3000 -1 1 17 17 -1 -1 -1 -1 -1 27 | 8 8000 11464 1000 50 -1 -1 50 5000 -1 1 17 17 -1 -1 -1 -1 -1 28 | 9 8000 11464 5000 10 -1 -1 10 5000 -1 1 17 17 -1 -1 -1 -1 -1 -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | gym==0.19.0 2 | h5py==2.10.0 3 | Keras==2.0.6 4 | matplotlib==3.4.1 5 | numpy==1.19.5 6 | pandas==1.2.3 7 | tensorflow==1.15.0 8 | torch==1.4.0 9 | tensorboardx==2.4 -------------------------------------------------------------------------------- /src_fc/Config/ad_bf_para.set: -------------------------------------------------------------------------------- 1 | adapt_para= 2 | check_para= 3 | -------------------------------------------------------------------------------- /src_fc/Config/config_n.set: -------------------------------------------------------------------------------- 1 | pre_name=CQSIM_ 2 | ext_fmt_j=.csv 3 | ext_fmt_n=.csv 4 | ext_fmt_j_c=.con 5 | ext_fmt_n_c=.con 6 | path_in=InputFiles/ 7 | path_out=Results/ 8 | path_fmt=Fmt/ 9 | path_debug=Debug/ 10 | ext_jr=.rst 11 | ext_si=.ult 12 | ext_ai=.adp 13 | ext_ri=.rwd 14 | ext_debug=.log 15 | -------------------------------------------------------------------------------- /src_fc/Config/config_sys.set: -------------------------------------------------------------------------------- 1 | cluster_fraction=1.0 2 | start=0 3 | start_date=None 4 | anchor=0 5 | read_num=8000 6 | debug_lvl=3 7 | alg=w,+,2 8 | alg_sign= 1,0,1 9 | backfill=3 10 | bf_para= 11 | win=50 12 | win_para=5,0,0 13 | ad_win=0 14 | ad_bf=0 15 | ad_alg=0 16 | ad_win_para= 17 | ad_bf_para=ad_bf_para.set 18 | ad_alg_para= 19 | config_n=config_n.set 20 | monitor=500 21 | job_trace=SDSC-SP2-1998-4.1-cln.swf 22 |
node_struc=SDSC-SP2-1998-4.1-cln.swf 23 | input_dim=2 24 | job_info_size=4 25 | is_training=1 26 | input_weight_file= 27 | output_weight_file=new_weights 28 | do_render=0 29 | window_size=50 30 | learning_rate=0.000021 31 | reward_discount=0.95 32 | batch_size=70 33 | layer_size=4000,1000 -------------------------------------------------------------------------------- /src_fc/CqGym/Gym.py: -------------------------------------------------------------------------------- 1 | from CqSim.Cqsim_sim import Cqsim_sim 2 | from gym import Env, spaces 3 | import numpy as np 4 | from CqGym.GymState import GymState 5 | from CqGym.GymGraphics import GymGraphics 6 | from copy import deepcopy 7 | 8 | class CqsimEnv(Env): 9 | 10 | def __init__(self, module, debug=None, job_cols=0, window_size=0, do_render=False, render_interval=1, render_pause=0.01): 11 | Env.__init__(self) 12 | 13 | # Maintaining Variables for reset. 14 | self.simulator_module = module 15 | self.simulator_debug = debug 16 | 17 | # Initializing CQSim Backend 18 | self.simulator = Cqsim_sim(module, debug=debug) 19 | self.simulator.start() 20 | # Let Simulator load completely. 21 | self.simulator.pause_producer() 22 | 23 | GymState._job_cols_ = job_cols 24 | GymState._window_size_ = window_size 25 | self.gym_state = GymState() 26 | 27 | # Defining Action Space and Observation Space. 28 | self.action_space = spaces.Discrete(window_size) 29 | self.observation_space = spaces.Box(shape=(1, self.simulator.module['node'].get_tot() + 30 | window_size * job_cols, 2), 31 | dtype=np.float32, low=0.0, high=1000000.0) 32 | 33 | # Define object for Graph Visualization: 34 | self.graphics = GymGraphics(do_render, render_interval, render_pause) 35 | self.rewards = [] 36 | self.iter = 0 37 | 38 | def reset(self): 39 | """ 40 | Reset the Gym Environment and the Simulator to a fresh start. 
41 | :return: None 42 | """ 43 | del self.simulator 44 | self.simulator = Cqsim_sim(deepcopy(self.simulator_module), debug=self.simulator_debug) 45 | self.simulator.start() 46 | # Let Simulator load completely. 47 | self.simulator.pause_producer() 48 | 49 | # Reinitialize Local variables 50 | self.gym_state = GymState() 51 | self.graphics.reset() 52 | self.rewards = [] 53 | self.iter = 0 54 | 55 | def render(self, mode='human'): 56 | """ 57 | :param mode: [str] :- No significance in the current version, only maintained to adhere to OpenAI-Gym standards. 58 | :return: None 59 | """ 60 | # Show graphics at intervals. 61 | self.graphics.visualize_data(self.iter, self.gym_state, self.rewards) 62 | 63 | def get_state(self): 64 | """ 65 | This function creates GymState Object for maintaining the current state of the Simulator. 66 | :return: [GymState] 67 | """ 68 | self.gym_state = GymState() 69 | self.gym_state.define_state(self.simulator.currentTime, # Current time in the simulator. 70 | self.simulator.simulator_wait_que_indices, # Current Wait Queue in focus. 71 | self.simulator.module['job'].job_info(-1), # All the JobInfo Dict. 72 | self.simulator.module['node'].nodeStruc, # All the NodeStruct Dict. 73 | self.simulator.module['node'].get_idle()) # Number of Nodes available. 74 | return self.gym_state 75 | 76 | def step(self, action: int): 77 | """ 78 | :param action: [int] :- Wait-Queue index of the selected Job. 79 | Note - this is not Job Index. 80 | :return: 81 | gym_state: [GymState] :- Contains all the information for the next state. 82 | gym_state.feature_vector stores Feature vector for the current state. 83 | done: [boolean] :- True - If the simulation is complete. 84 | reward : [float] :- reward for the current action. 
85 | """ 86 | self.iter += 1 87 | ind = action 88 | print("Wait Queue at Step Func - ", self.simulator.simulator_wait_que_indices) 89 | self.simulator.simulator_wait_que_indices = [self.simulator.simulator_wait_que_indices[ind]] + \ 90 | self.simulator.simulator_wait_que_indices[:ind] + \ 91 | self.simulator.simulator_wait_que_indices[ind + 1:] 92 | reward = self.gym_state.get_reward(self.simulator.simulator_wait_que_indices[0]) 93 | 94 | # Maintaining data for GymGraphics 95 | self.rewards.append(reward) 96 | 97 | # Gym Paused, Running simulator. 98 | self.simulator.pause_producer() 99 | 100 | # Simulator executed with selected action. Retrieving new State. 101 | if self.simulator.is_simulation_complete: 102 | # Return an empty state if the Simulation is complete. Avoids NullPointer Exceptions. 103 | self.gym_state = GymState() 104 | else: 105 | self.get_state() 106 | 107 | return self.gym_state, self.simulator.is_simulation_complete, reward 108 | -------------------------------------------------------------------------------- /src_fc/CqGym/GymGraphics.py: -------------------------------------------------------------------------------- 1 | from matplotlib import pyplot as plt 2 | 3 | 4 | class GymGraphics: 5 | 6 | def __init__(self, do_render=False, render_interval=1, render_pause=0.01): 7 | 8 | # Maintaining Variables for Rendering Visuals 9 | self.do_render = do_render 10 | self.render_interval = render_interval 11 | self.render_pause = render_pause 12 | if self.do_render: 13 | self.max_wait_times = [] 14 | self.fig, ((self.node_graph, self.rewards_graph, self.max_wait_time_graph), 15 | (self.que_wait_time_graph, self.que_req_time_graph, self.que_req_proc_graph)) = \ 16 | plt.subplots(2, 3, figsize=(14, 5)) 17 | 18 | def reset(self): 19 | if self.do_render: 20 | self.max_wait_times = [] 21 | self.fig, ((self.node_graph, self.rewards_graph, self.max_wait_time_graph), 22 | (self.que_wait_time_graph, self.que_req_time_graph, self.que_req_proc_graph)) =\ 23 | 
plt.subplots(2, 3, figsize=(14, 5)) 24 | self.fig.tight_layout(pad=3.0) 25 | 26 | @staticmethod 27 | def get_que_data_arrays(state): 28 | 29 | que_ids = state.wait_que 30 | que_wait_times = [state.current_time - state.job_info[idx]['submit'] for idx in que_ids] 31 | que_req_times = [state.job_info[idx]['reqTime'] for idx in que_ids] 32 | que_req_procs = [state.job_info[idx]['reqProc'] for idx in que_ids] 33 | 34 | return [str(idx) for idx in que_ids], que_wait_times, que_req_times, que_req_procs 35 | 36 | def visualize_data(self, iter, state, rewards): 37 | 38 | if self.do_render and state and iter % self.render_interval == 0: 39 | 40 | self.node_graph.clear() 41 | self.node_graph.bar(['Used Nodes', 'Idle Nodes'], 42 | [state.total_nodes-state.idle_nodes, 43 | state.idle_nodes]) 44 | self.node_graph.set_title('Used Nodes vs Idle Nodes') 45 | self.node_graph.set_ylim([0, 5000]) 46 | 47 | self.rewards_graph.plot(rewards) 48 | self.rewards_graph.set_title('Rewards at each step') 49 | 50 | que_ids, que_wait_times, que_req_times, que_req_procs = self.get_que_data_arrays(state) 51 | 52 | self.que_wait_time_graph.clear() 53 | self.que_wait_time_graph.bar(que_ids, que_wait_times) 54 | self.que_wait_time_graph.set_title('Wait Time of Queued Jobs') 55 | 56 | self.max_wait_times.append(max(que_wait_times)) 57 | self.max_wait_time_graph.plot(self.max_wait_times) 58 | self.max_wait_time_graph.set_title('Max wait time at each step') 59 | 60 | self.que_req_time_graph.clear() 61 | self.que_req_time_graph.bar(que_ids, que_req_times) 62 | self.que_req_time_graph.set_title('Req Time of Queued Jobs') 63 | 64 | self.que_req_proc_graph.clear() 65 | self.que_req_proc_graph.bar(que_ids, que_req_procs) 66 | self.que_req_proc_graph.set_title('Req Proc of Queued Jobs') 67 | 68 | plt.ion() 69 | plt.show() 70 | plt.pause(self.render_pause) 71 | -------------------------------------------------------------------------------- /src_fc/CqGym/GymState.py: 
-------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | class GymState: 5 | 6 | _job_cols_ = 2 7 | _window_size_ = 50 8 | 9 | def __init__(self): 10 | # Variable to maintain the info received 11 | self.current_time = None 12 | self.wait_que = None 13 | self.wait_que_size = 0 14 | self.job_info = {} 15 | 16 | self.job_vector = [] 17 | self.node_vector = [] 18 | self.total_nodes = 0 19 | self.idle_nodes = 0 20 | self.feature_vector = [] 21 | 22 | def define_state(self, current_time, wait_que_indices, job_info_dict, node_info_list, idle_nodes_count): 23 | """ 24 | :param wait_que_indices: List[Integer] - indices of the jobs in wait que, List size limited. 25 | :param job_info_dict: Dict{Integer: Info} - Information of all the jobs from simulator 26 | :param node_info_list: List[Node Info] - Information of all the nodes from simulator 27 | :return: State parsable by the RL Model in use - Eg. Numpy Array 28 | """ 29 | self.current_time = current_time 30 | self.wait_que = wait_que_indices[:] 31 | self.wait_que_size = len(self.wait_que) 32 | self.job_info = job_info_dict 33 | 34 | self.wait_job = [job_info_dict[ind] for ind in wait_que_indices] 35 | 36 | wait_job_input = self.preprocessing_queued_jobs( 37 | self.wait_job, current_time) 38 | system_status_input = self.preprocessing_system_status( 39 | node_info_list, current_time) 40 | self.feature_vector = self.make_feature_vector( 41 | wait_job_input, system_status_input) 42 | 43 | def vector_reshape(vec): 44 | return vec.reshape(tuple([1]) + vec.shape) 45 | self.feature_vector = vector_reshape(self.feature_vector) 46 | 47 | self.total_nodes = len(node_info_list) 48 | self.idle_nodes = idle_nodes_count 49 | 50 | def preprocessing_queued_jobs(self, wait_job, currentTime): 51 | job_info_list = [] 52 | for job in wait_job: 53 | s = float(job['submit']) 54 | t = float(job['reqTime']) 55 | n = float(job['reqProc']) 56 | w = int(currentTime - s) 57 | # award 1: high 
priority; 0: low priority 58 | # a = int(wait_job[i]['award']) 59 | info = [[n, t], [0, w]] 60 | # info = [[n, t], [a, w]] 61 | job_info_list.append(info) 62 | return job_info_list 63 | 64 | def preprocessing_system_status(self, node_struc, currentTime): 65 | node_info_list = [] 66 | # Each element format - [availability, time until available]; [1, 0] - Node is available 67 | for node in node_struc: 68 | info = [] 69 | # available 1, not available 0 70 | if node['state'] < 0: 71 | info.append(1) 72 | info.append(0) 73 | else: 74 | info.append(0) 75 | info.append(node['end'] - currentTime) 76 | # Next available node time. 77 | 78 | node_info_list.append(info) 79 | return node_info_list 80 | 81 | def make_feature_vector(self, jobs, system_status): 82 | # Remove hard coded part ! 83 | job_cols = self._job_cols_ 84 | window_size = self._window_size_ 85 | input_dim = [len(system_status) + window_size * 86 | job_cols, len(system_status[0])] 87 | 88 | fv = np.zeros((1, input_dim[0], input_dim[1])) 89 | i = 0 90 | for idx, job in enumerate(jobs): 91 | fv[0, idx * job_cols:(idx + 1) * job_cols, :] = job 92 | i += 1 93 | if i == window_size: 94 | break 95 | fv[0, job_cols * window_size:, :] = system_status 96 | return fv 97 | 98 | def get_max_wait_time_in_queue(self): 99 | job_cnt = 0 100 | max_wait_time_in_que = 0 101 | max_job_size_in_que = 0 102 | total_wait_time = 0 103 | total_wait_core_seconds = 0 104 | for job_id in self.job_info: 105 | job_cnt += 1 106 | job = self.job_info[job_id] 107 | if job_cnt <= self._window_size_: 108 | max_wait_time_in_que = max( 109 | max_wait_time_in_que, self.current_time - job['submit']) 110 | max_job_size_in_que = max(max_job_size_in_que, job['reqProc']) 111 | total_wait_time += job['reqTime'] 112 | total_wait_core_seconds += job['reqTime'] * job['reqProc'] 113 | return max_wait_time_in_que, max_job_size_in_que, total_wait_time, total_wait_core_seconds, job_cnt 114 | 115 | def get_reward(self, selected_job): 116 | 117 | max_wait_time_in_que,
max_job_size_in_que, total_wait_time, total_wait_core_seconds, total_wait_size = self.get_max_wait_time_in_queue() 118 | 119 | tmp_reward = 0 120 | running = self.total_nodes - self.idle_nodes 121 | selected_job_info = self.job_info[selected_job] 122 | selected_job_requested_nodes = selected_job_info['reqProc'] 123 | selected_job_wait_time = self.current_time - \ 124 | selected_job_info['submit'] 125 | 126 | selected_job_priority = selected_job_requested_nodes / self.total_nodes 127 | w1, w2, w3 = 1 / 3, 1 / 3, 1 / 3 128 | 129 | if self.idle_nodes < selected_job_requested_nodes: 130 | tmp_reward += running / self.total_nodes * w1 131 | else: 132 | tmp_reward += (selected_job_requested_nodes + 133 | running) / self.total_nodes * w1 134 | 135 | if max_wait_time_in_que >= 21600: 136 | tmp_reward += selected_job_wait_time / max_wait_time_in_que * w2 137 | else: 138 | tmp_reward += selected_job_wait_time / 21600 * w2 139 | 140 | tmp_reward += selected_job_priority * w3 141 | 142 | # tmp_reward = selected_job_requested_nodes / max_job_size_in_que 143 | 144 | return tmp_reward 145 | -------------------------------------------------------------------------------- /src_fc/CqGym/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/CqGym/__init__.py -------------------------------------------------------------------------------- /src_fc/CqSim/Backfill.py: -------------------------------------------------------------------------------- 1 | 2 | __metaclass__ = type 3 | 4 | class Backfill: 5 | def __init__(self, mode = 0, ad_mode = 0, node_module = None, debug = None, para_list = None): 6 | self.myInfo = "Backfill" 7 | self.mode = mode 8 | self.ad_mode = ad_mode 9 | self.node_module = node_module 10 | self.debug = debug 11 | self.para_list = para_list 12 | self.current_para = [] 13 | self.wait_job = [] 14 | 15 | self.debug.line(4," ") 16 | 
self.debug.line(4,"#") 17 | self.debug.debug("# "+self.myInfo,1) 18 | self.debug.line(4,"#") 19 | 20 | def reset (self, mode = None, ad_mode = None, node_module = None, debug = None, para_list = None): 21 | #self.debug.debug("* "+self.myInfo+" -- reset",5) 22 | if mode: 23 | self.mode = mode 24 | if ad_mode : 25 | self.ad_mode = ad_mode 26 | if node_module: 27 | self.node_module = node_module 28 | if debug: 29 | self.debug = debug 30 | if para_list: 31 | self.para_list = para_list 32 | self.current_para = [] 33 | self.wait_job = [] 34 | 35 | def backfill (self, wait_job, para_in = None): 36 | #self.debug.debug("* "+self.myInfo+" -- backfill",5) 37 | if (len(wait_job) <= 1): 38 | return [] 39 | self.current_para = para_in 40 | self.wait_job = wait_job 41 | job_list = self.main() 42 | return job_list 43 | 44 | def main(self): 45 | #self.debug.debug("* "+self.myInfo+" -- main",5) 46 | result = [] 47 | if (self.mode == 1): 48 | # EASY backfill 49 | result = self.backfill_EASY() 50 | elif (self.mode == 2): 51 | # Conservative backfill 52 | result = self.backfill_cons() 53 | elif self.mode == 3: 54 | # Gym Based - RL Backfill 55 | result = self.backfill_RL() 56 | else: 57 | return None 58 | return result 59 | 60 | def backfill_EASY(self): 61 | #self.debug.debug("* "+self.myInfo+" -- backfill_EASY",5) 62 | backfill_list = [] 63 | self.node_module.pre_reset(self.current_para['time']) 64 | self.node_module.reserve(self.wait_job[0]['proc'], self.wait_job[0]['index'], self.wait_job[0]['run']) 65 | i = 1 66 | job_num = len(self.wait_job) 67 | while (i < job_num): 68 | backfill_test = 0 69 | backfill_test = self.node_module.pre_avail(self.wait_job[i]['proc'], self.current_para['time'], 70 | self.current_para['time'] + self.wait_job[i]['run']) 71 | if (backfill_test == 1): 72 | backfill_list.append(self.wait_job[i]['index']) 73 | self.node_module.reserve(self.wait_job[i]['proc'], self.wait_job[i]['index'], self.wait_job[i]['run']) 74 | i += 1 75 | return backfill_list 76 | 77 | 
def backfill_cons(self): 78 | #self.debug.debug("* "+self.myInfo+" -- backfill_cons",5) 79 | backfill_list=[] 80 | self.node_module.pre_reset(self.current_para['time']) 81 | self.node_module.reserve(self.wait_job[0]['proc'], self.wait_job[0]['index'], self.wait_job[0]['run']) 82 | i = 1 83 | job_num = len(self.wait_job) 84 | while (i < job_num): 85 | backfill_test = 0 86 | backfill_test = self.node_module.pre_avail(self.wait_job[i]['proc'],\ 87 | self.current_para['time'], self.current_para['time']+self.wait_job[i]['run']) 88 | if (backfill_test == 1): 89 | backfill_list.append(self.wait_job[i]['index']) 90 | self.node_module.reserve(self.wait_job[i]['proc'], self.wait_job[i]['index'], self.wait_job[i]['run']) 91 | i += 1 92 | return backfill_list 93 | 94 | def backfill_RL(self): 95 | 96 | backfill_list = [] # List[Job Index] - Final selected backfillable job indices 97 | 98 | # Reserving First job. 99 | self.node_module.pre_reset(self.current_para['time']) 100 | self.node_module.reserve(self.wait_job[0]['proc'], self.wait_job[0]['index'], self.wait_job[0]['run']) 101 | 102 | while True: 103 | backfill_candidates = [] 104 | backfill_candidates_info = {} 105 | i = 1 106 | job_num = len(self.wait_job) 107 | while i < job_num: 108 | backfill_test = self.node_module.pre_avail(self.wait_job[i]['proc'], 109 | self.current_para['time'], 110 | self.current_para['time'] + self.wait_job[i]['run']) 111 | if backfill_test == 1: 112 | backfill_candidates.append(self.wait_job[i]['index']) 113 | backfill_candidates_info[self.wait_job[i]['index']] = { 114 | 'pos': i, 115 | 'proc': self.wait_job[i]['proc'], 116 | 'index': self.wait_job[i]['index'], 117 | 'run': self.wait_job[i]['run'], 118 | } 119 | i += 1 120 | 121 | if not backfill_candidates: 122 | break 123 | else: 124 | # ************* # 125 | # Reorder Queue Function [Cqsim_sim.reorder_queue()] - communicates with the Gym Environment and 126 | # pushes the selected job at the beginning. 
127 | # ************* # 128 | wait_queue = self.current_para['reorder_queue_function'](backfill_candidates) 129 | selected_idx = wait_queue[0] 130 | backfill_list.append(selected_idx) 131 | self.node_module.reserve(backfill_candidates_info[selected_idx]['proc'], 132 | backfill_candidates_info[selected_idx]['index'], 133 | backfill_candidates_info[selected_idx]['run']) 134 | self.wait_job = self.wait_job[:backfill_candidates_info[selected_idx]['pos']] + \ 135 | self.wait_job[backfill_candidates_info[selected_idx]['pos']+1:] 136 | 137 | return backfill_list 138 | 139 | -------------------------------------------------------------------------------- /src_fc/CqSim/Basic_algorithm.py: -------------------------------------------------------------------------------- 1 | 2 | __metaclass__ = type 3 | 4 | 5 | class Basic_algorithm: 6 | def __init__ (self, ad_mode = 0, element = None, debug = None, para_list = None, ad_para_list=None): 7 | self.myInfo = "Basic Algorithm" 8 | self.ad_mode = ad_mode 9 | self.element = element 10 | self.debug = debug 11 | self.paralist = para_list 12 | self.ad_paralist = ad_para_list 13 | 14 | self.debug.line(4," ") 15 | self.debug.line(4,"#") 16 | self.debug.debug("# "+self.myInfo,1) 17 | self.debug.line(4,"#") 18 | 19 | self.algStr="" 20 | self.scoreList=[] 21 | i = 0 22 | temp_num = len(self.element[0]) 23 | while (i < temp_num): 24 | self.algStr += self.element[0][i] 25 | i += 1 26 | 27 | def reset (self, ad_mode = None, element = None, debug = None, para_list = None, ad_para_list=None): 28 | #self.debug.debug("* "+self.myInfo+" -- reset",5) 29 | if ad_mode : 30 | self.ad_mode = ad_mode 31 | if element: 32 | self.element = element 33 | if debug: 34 | self.debug = debug 35 | if para_list: 36 | self.paralist = para_list 37 | 38 | self.algStr="" 39 | self.scoreList=[] 40 | i = 0 41 | temp_num = len(self.element[0]) 42 | while (i < temp_num): 43 | self.algStr += self.element[0][i] 44 | i += 1 45 | 46 | def get_score(self, wait_job, currentTime,
para_list = None): 47 | #self.debug.debug("* "+self.myInfo+" -- get_score",5) 48 | self.scoreList = [] 49 | waitNum = len(wait_job) 50 | if (waitNum<=0): 51 | return [] 52 | else: 53 | i=0 54 | z=currentTime - wait_job[0]['submit'] 55 | l=wait_job[0]['reqTime'] 56 | while (iz): 59 | z=temp_w 60 | if (wait_job[i]['reqTime'] None: 46 | """ 47 | Invoke thread which runs the CqSim. 48 | :return: 49 | """ 50 | self.cqsim_sim() 51 | 52 | def reset(self, module=None, debug=None): 53 | 54 | if module: 55 | self.module = module 56 | 57 | if debug: 58 | self.debug = debug 59 | 60 | self.event_seq = [] 61 | self.current_event = None 62 | self.reserve_job_id = -1 63 | # obsolete 64 | self.job_num = len(self.module['job'].job_info()) 65 | self.currentTime = 0 66 | # obsolete 67 | self.read_job_buf_size = 100 68 | self.read_job_pointer = 0 69 | self.previous_read_job_time = -1 70 | 71 | def cqsim_sim(self): 72 | 73 | self.import_submit_events() 74 | self.scan_event() 75 | 76 | self.print_result() 77 | 78 | self.is_simulation_complete = True 79 | self.release_all() 80 | 81 | self.debug.debug("------ Simulating Done!", 2) 82 | self.debug.debug(lvl=1) 83 | return 84 | 85 | def import_submit_events(self): 86 | # fread jobs to job list and buffer to event_list dynamically 87 | if self.read_job_pointer < 0: 88 | return -1 89 | temp_return = self.module['job'].dyn_import_job_file() 90 | i = self.read_job_pointer 91 | #while (i < len(self.module['job'].job_info())): 92 | while (i < self.module['job'].job_info_len()): 93 | self.insert_event(1,self.module['job'].job_info(i)['submit'],2,[1,i]) 94 | self.previous_read_job_time = self.module['job'].job_info(i)['submit'] 95 | self.debug.debug(" "+"Insert job["+"2"+"] "+str(self.module['job'].job_info(i)['submit']),4) 96 | i += 1 97 | 98 | if temp_return == None or temp_return < 0 : 99 | self.read_job_pointer = -1 100 | return -1 101 | else: 102 | self.read_job_pointer = i 103 | return 0 104 | 105 | def insert_event(self, type, time, priority, 
para = None): 106 | #self.debug.debug("# "+self.myInfo+" -- insert_event",5) 107 | temp_index = -1 108 | new_event = {"type":type, "time":time, "prio":priority, "para":para} 109 | if (type == 1): 110 | #i = self.event_pointer 111 | i = 0 112 | while (i<len(self.event_seq)): 113 | if (self.event_seq[i]['time']==time): 114 | if (self.event_seq[i]['prio']>priority): 115 | temp_index = i 116 | break 117 | elif (self.event_seq[i]['time']>time): 118 | temp_index = i 119 | break 120 | i += 1 121 | 122 | if (temp_index>=len(self.event_seq) or temp_index == -1): 123 | self.event_seq.append(new_event) 124 | else: 125 | self.event_seq.insert(temp_index,new_event) 126 | 127 | def scan_event(self): 128 | 129 | self.debug.line(2, " ") 130 | self.debug.line(2, "=") 131 | self.debug.line(2, "=") 132 | self.current_event = None 133 | 134 | while (len(self.event_seq) > 0 or self.read_job_pointer >= 0): 135 | if len(self.event_seq) > 0: 136 | temp_current_event = self.event_seq[0] 137 | temp_currentTime = temp_current_event['time'] 138 | else: 139 | temp_current_event = None 140 | temp_currentTime = -1 141 | 142 | if (len(self.event_seq) == 0 or temp_currentTime >= self.previous_read_job_time) and self.read_job_pointer >= 0: 143 | self.import_submit_events() 144 | continue 145 | 146 | self.current_event = temp_current_event 147 | self.currentTime = temp_currentTime 148 | 149 | if self.current_event['type'] == 1: 150 | 151 | self.debug.line(2, " ") 152 | self.debug.line(2, ">>>") 153 | self.debug.line(2, "--") 154 | self.debug.debug(" Time: "+str(self.currentTime), 2) 155 | self.debug.debug(" "+str(self.current_event), 2) 156 | self.debug.line(2, "--") 157 | self.debug.debug(" Wait: "+str(self.module['job'].wait_list()),2) 158 | self.debug.debug(" Run : "+str(self.module['job'].run_list()),2) 159 | self.debug.line(2, "--") 160 | self.debug.debug(" Tot:"+str(self.module['node'].get_tot())+" Idle:"+str(self.module['node'].get_idle())+" Avail:"+str(self.module['node'].get_avail())+" ",2) 161 | self.debug.line(2, "--") 162 | 163 | self.event_job(self.current_event['para']) 164 | 165 |
self.sys_collect() 166 | 167 | del self.event_seq[0] 168 | 169 | self.debug.line(2,"=") 170 | self.debug.line(2,"=") 171 | self.debug.line(2," ") 172 | return 173 | 174 | def event_job(self, para_in = None): 175 | 176 | if (self.current_event['para'][0] == 1): 177 | self.submit(self.current_event['para'][1]) 178 | elif (self.current_event['para'][0] == 2): 179 | self.finish(self.current_event['para'][1]) 180 | # Obsolete 181 | # self.score_calculate() 182 | self.start_scan() 183 | 184 | def submit(self, job_index): 185 | #self.debug.debug("# "+self.myInfo+" -- submit",5) 186 | self.debug.debug("[Submit] "+str(job_index),3) 187 | self.module['job'].job_submit(job_index) 188 | return 189 | 190 | def finish(self, job_index): 191 | #self.debug.debug("# "+self.myInfo+" -- finish",5) 192 | self.debug.debug("[Finish] "+str(job_index),3) 193 | self.module['node'].node_release(job_index,self.currentTime) 194 | self.module['job'].job_finish(job_index) 195 | self.module['output'].print_result(self.module['job'], job_index) 196 | self.module['job'].remove_job_from_dict(job_index) 197 | return 198 | 199 | def start_job(self, job_index): 200 | # self.debug.debug("# "+self.myInfo+" -- start",5) 201 | self.debug.debug("[Start] "+str(job_index), 3) 202 | self.module['node'].node_allocate(self.module['job'].job_info(job_index)['reqProc'], job_index, 203 | self.currentTime, self.currentTime + 204 | self.module['job'].job_info(job_index)['reqTime']) 205 | self.module['job'].job_start(job_index, self.currentTime) 206 | self.insert_event(1, self.currentTime+self.module['job'].job_info(job_index)['run'], 1, [2, job_index]) 207 | return 208 | 209 | def reorder_queue(self, wait_que): 210 | """ 211 | This(and only this) function manages thread synchronization and communication with the GymEnvironment. 212 | 213 | :param wait_que: [List[int]] : CqSim WaitQue at current Time. 214 | :return: [List[int]] : Updated wait_que, with the selected job at the beginning. 
215 | """ 216 | self.simulator_wait_que_indices = wait_que 217 | self.pause_consumer() 218 | return self.simulator_wait_que_indices 219 | 220 | def start_scan(self): 221 | 222 | start_max = self.module['win'].start_num() 223 | temp_wait = self.module['job'].wait_list() 224 | win_count = start_max 225 | 226 | while temp_wait: 227 | if win_count >= start_max: 228 | win_count = 0 229 | temp_wait = self.start_window(temp_wait) 230 | 231 | # ************ # 232 | # Communicate with GymEnvironment. 233 | # ************ # 234 | print("Wait Queue at StartScan - ", temp_wait) 235 | if temp_wait[0] != self.reserve_job_id: 236 | temp_wait = self.reorder_queue(temp_wait) 237 | 238 | temp_job_id = temp_wait[0] 239 | temp_job = self.module['job'].job_info(temp_job_id) 240 | if self.module['node'].is_available(temp_job['reqProc']): 241 | # print(f'temp_job_id: {temp_job_id}') 242 | if self.reserve_job_id == temp_job_id: 243 | self.reserve_job_id = -1 244 | 245 | self.start_job(temp_job_id) 246 | temp_wait.pop(0) 247 | else: 248 | temp_wait = self.module['job'].wait_list() 249 | self.reserve_job_id = temp_wait[0] 250 | self.backfill(temp_wait) 251 | break 252 | 253 | win_count += 1 254 | return 255 | 256 | def start_window(self, temp_wait_B): 257 | 258 | win_size = self.module['win'].window_size() 259 | 260 | if (len(temp_wait_B)>win_size): 261 | temp_wait_A = temp_wait_B[0:win_size] 262 | temp_wait_B = temp_wait_B[win_size:] 263 | else: 264 | temp_wait_A = temp_wait_B 265 | temp_wait_B = [] 266 | 267 | temp_wait_info = [] 268 | max_num = len(temp_wait_A) 269 | i = 0 270 | while i < max_num: 271 | temp_job = self.module['job'].job_info(temp_wait_A[i]) 272 | temp_wait_info.append({"index": temp_wait_A[i], "proc": temp_job['reqProc'], 273 | "node": temp_job['reqProc'], "run": temp_job['run'], 274 | "score": temp_job['score']}) 275 | i += 1 276 | 277 | temp_wait_A = self.module['win'].start_window(temp_wait_info,{"time":self.currentTime}) 278 | temp_wait_B[0:0] = temp_wait_A 279 | 
return temp_wait_B 280 | 281 | def backfill(self, temp_wait): 282 | temp_wait_info = [] 283 | max_num = len(temp_wait) 284 | i = 0 285 | while i < max_num: 286 | temp_job = self.module['job'].job_info(temp_wait[i]) 287 | temp_wait_info.append({"index": temp_wait[i], "proc": temp_job['reqProc'], 288 | "node": temp_job['reqProc'], "run": temp_job['run'], "score": temp_job['score']}) 289 | i += 1 290 | 291 | # ************ # 292 | # reorder_queue function passed as an argument, to be invoked while selecting back-fill jobs. 293 | # ************ # 294 | backfill_list = self.module['backfill'].backfill(temp_wait_info, {'time': self.currentTime, 295 | 'reorder_queue_function': self.reorder_queue}) 296 | 297 | if not backfill_list: 298 | return 0 299 | 300 | for job in backfill_list: 301 | print('backfill job.') 302 | self.start_job(job) 303 | return 1 304 | 305 | def sys_collect(self): 306 | 307 | temp_inter = 0 308 | if (len(self.event_seq) > 1): 309 | temp_inter = self.event_seq[1]['time'] - self.currentTime 310 | temp_size = 0 311 | 312 | event_code=None 313 | if (self.event_seq[0]['type'] == 1): 314 | if (self.event_seq[0]['para'][0] == 1): 315 | event_code='S' 316 | elif(self.event_seq[0]['para'][0] == 2): 317 | event_code='E' 318 | elif (self.event_seq[0]['type'] == 2): 319 | event_code='Q' 320 | temp_info = self.module['info'].info_collect(time=self.currentTime, event=event_code, 321 | uti=(self.module['node'].get_tot() - 322 | self.module['node'].get_idle()) * 323 | 1.0/self.module['node'].get_tot(), 324 | waitNum=len(self.module['job'].wait_list()), 325 | waitSize=self.module['job'].wait_size(), inter=temp_inter) 326 | self.print_sys_info(temp_info) 327 | return 328 | 329 | def print_sys_info(self, sys_info): 330 | self.module['output'].print_sys_info(sys_info) 331 | 332 | def print_result(self): 333 | self.module['output'].print_sys_info() 334 | self.debug.debug(lvl=1) 335 | self.module['output'].print_result(self.module['job']) 336 | 
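The `insert_event` method above appears to keep `event_seq` ordered by event time, with the priority value breaking ties (finish events are inserted with priority 1, submit events with priority 2). A minimal sketch of the same ordering, assuming a binary heap is an acceptable replacement for the linear scan; `EventQueue` and `pop_event` are hypothetical names used only for illustration and are not part of CqSim:

```python
import heapq
from itertools import count


class EventQueue:
    """Illustrative stand-in for the event_seq list (hypothetical, not CqSim code)."""

    def __init__(self):
        self._heap = []
        # Insertion counter: keeps events with equal (time, prio) in FIFO order
        # and prevents heapq from ever comparing the event dicts themselves.
        self._tie = count()

    def insert_event(self, type, time, priority, para=None):
        event = {"type": type, "time": time, "prio": priority, "para": para}
        heapq.heappush(self._heap, (time, priority, next(self._tie), event))

    def pop_event(self):
        # Returns the earliest event; among equal times, the lowest priority value.
        return heapq.heappop(self._heap)[-1]

    def __len__(self):
        return len(self._heap)


queue = EventQueue()
queue.insert_event(1, 10.0, 2, [1, 0])  # submit job 0 at t=10
queue.insert_event(1, 10.0, 1, [2, 7])  # finish job 7 at t=10, handled first
queue.insert_event(1, 5.0, 2, [1, 3])   # submit job 3 at t=5

order = [queue.pop_event()["para"] for _ in range(len(queue))]
print(order)
```

The heap makes each insertion O(log n) instead of a scan over the whole list. The simulator also peeks at `event_seq[0]` and `event_seq[1]`, so using this sketch in place of the list would need matching accessors.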
-------------------------------------------------------------------------------- /src_fc/CqSim/Info_collect.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | import time 3 | __metaclass__ = type 4 | 5 | class Info_collect: 6 | def __init__(self, alg_module = None, debug = None): 7 | self.myInfo = "Info Collect" 8 | self.alg_module = alg_module 9 | self.debug = debug 10 | #self.sys_info = [] 11 | 12 | self.debug.line(4," ") 13 | self.debug.line(4,"#") 14 | self.debug.debug("# "+self.myInfo,1) 15 | self.debug.line(4,"#") 16 | 17 | 18 | def reset(self, alg_module = None, debug = None): 19 | self.debug.debug("* "+self.myInfo+" -- reset",5) 20 | if alg_module: 21 | self.alg_module = alg_module 22 | if debug: 23 | self.debug = debug 24 | #self.sys_info = [] 25 | 26 | def info_collect(self, time, event, uti, waitNum = -1, waitSize = -1, inter = -1.0, extend = None): 27 | self.debug.debug("* "+self.myInfo+" -- info_collect",5) 28 | event_date = time 29 | temp_info = {'date': event_date, 'time': time, 'event': event, 'uti': uti, 'waitNum': waitNum, \ 30 | 'waitSize': waitSize, 'inter': inter, 'extend': extend} 31 | self.debug.debug(" "+str(temp_info),4) 32 | #self.sys_info.append(temp_info) 33 | return temp_info 34 | 35 | ''' 36 | def info_analysis(self): 37 | self.debug.debug("* "+self.myInfo+" -- info_analysis",5) 38 | return 1 39 | 40 | 41 | def get_info(self, index): 42 | self.debug.debug("* "+self.myInfo+" -- get_info",6) 43 | if index>=len(self.sys_info): 44 | return None 45 | return self.sys_info[index] 46 | 47 | def get_len(self): 48 | self.debug.debug("* "+self.myInfo+" -- get_len",6) 49 | return len(self.sys_info) 50 | ''' -------------------------------------------------------------------------------- /src_fc/CqSim/Job_trace.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | from functools import cmp_to_key 3 | import 
time 4 | import re 5 | 6 | __metaclass__ = type 7 | 8 | class Job_trace: 9 | def __init__(self, start=-1, num=-1, anchor=-1, density=1.0, read_input_freq = 1000, debug=None): 10 | self.myInfo = "Job Trace" 11 | self.start = start 12 | self.start_offset_A = 0.0 13 | self.start_offset_B = 0.0 14 | self.start_date = "" 15 | self.anchor = anchor 16 | self.read_num = num 17 | self.density = density 18 | self.debug = debug 19 | self.jobTrace={} 20 | self.jobFile = None 21 | self.read_input_freq = read_input_freq 22 | self.num_delete_jobs = 0 23 | 24 | self.debug.line(4," ") 25 | self.debug.line(4,"#") 26 | self.debug.debug("# "+self.myInfo,1) 27 | self.debug.line(4,"#") 28 | 29 | self.reset_data() 30 | 31 | def reset(self, start=None, num=None, anchor=None, density=None, read_input_freq = None, debug=None): 32 | #self.debug.debug("* "+self.myInfo+" -- reset",5) 33 | if start: 34 | self.start = start 35 | self.start_offset_A = 0.0 36 | self.start_offset_B = 0.0 37 | if num: 38 | self.read_num = num 39 | if anchor: 40 | self.anchor = anchor 41 | if density: 42 | self.density = density 43 | if debug: 44 | self.debug = debug 45 | if read_input_freq: 46 | self.read_input_freq = read_input_freq 47 | self.jobTrace={} 48 | self.jobFile = None 49 | self.reset_data() 50 | 51 | def reset_data(self): 52 | #self.debug.debug("* "+self.myInfo+" -- reset_data",5) 53 | self.job_wait_size = 0 54 | self.job_submit_list=[] 55 | self.job_wait_list=[] 56 | self.job_run_list=[] 57 | #self.job_done_list=[] 58 | self.num_delete_jobs = 0 59 | 60 | def initial_import_job_file(self, job_file): 61 | self.temp_start=self.start 62 | #regex_str = "([^;\\n]*)[;\\n]" 63 | self.jobFile = open(job_file,'r') 64 | self.min_sub = -1 65 | self.jobTrace={} 66 | self.reset_data() 67 | self.debug.line(4) 68 | self.i = 0 69 | self.j = 0 70 | #self.dyn_import_job_file() 71 | 72 | def dyn_import_job_file(self): 73 | if self.jobFile.closed: 74 | return -1 75 | temp_n = 0 76 | regex_str = "([^;\\n]*)[;\\n]" 77 | 
while (self.i=self.anchor): 84 | temp_dataList=re.findall(regex_str,tempStr) 85 | 86 | if (self.min_sub<0): 87 | self.min_sub=float(temp_dataList[1]) 88 | if (self.temp_start < 0): 89 | self.temp_start = self.min_sub 90 | self.start_offset_B = self.min_sub-self.temp_start 91 | 92 | tempInfo = {'id':int(temp_dataList[0]),\ 93 | 'submit':self.density*(float(temp_dataList[1])-self.min_sub)+self.temp_start,\ 94 | 'wait':float(temp_dataList[2]),\ 95 | 'run':float(temp_dataList[3]),\ 96 | 'usedProc':int(temp_dataList[4]),\ 97 | 'usedAveCPU':float(temp_dataList[5]),\ 98 | 'usedMem':float(temp_dataList[6]),\ 99 | 'reqProc':int(temp_dataList[7]),\ 100 | 'reqTime':float(temp_dataList[8]),\ 101 | 'reqMem':float(temp_dataList[9]),\ 102 | 'status':int(temp_dataList[10]),\ 103 | 'userID':int(temp_dataList[11]),\ 104 | 'groupID':int(temp_dataList[12]),\ 105 | 'num_exe':int(temp_dataList[13]),\ 106 | 'num_queue':int(temp_dataList[14]),\ 107 | 'num_part':int(temp_dataList[15]),\ 108 | 'num_pre':int(temp_dataList[16]),\ 109 | 'thinkTime':int(temp_dataList[17]),\ 110 | 'start':-1,\ 111 | 'end':-1,\ 112 | 'score':0,\ 113 | 'state':0,\ 114 | 'happy':-1,\ 115 | 'estStart':-1} 116 | #self.jobTrace.append(tempInfo) 117 | self.jobTrace[self.i] = tempInfo 118 | self.job_submit_list.append(self.i) 119 | self.debug.debug(temp_dataList,4) 120 | #self.debug.debug("* "+str(tempInfo),4) 121 | self.i += 1 122 | self.j += 1 123 | temp_n += 1 124 | return 0 125 | 126 | def import_job_file (self, job_file): 127 | #self.debug.debug("* "+self.myInfo+" -- import_job_file",5) 128 | temp_start=self.start 129 | regex_str = "([^;\\n]*)[;\\n]" 130 | jobFile = open(job_file,'r') 131 | min_sub = -1 132 | self.jobTrace={} 133 | self.reset_data() 134 | 135 | self.debug.line(4) 136 | i = 0 137 | j = 0 138 | while (i=self.anchor): 143 | temp_dataList=re.findall(regex_str,tempStr) 144 | 145 | if (min_sub<0): 146 | min_sub=float(temp_dataList[1]) 147 | if (temp_start < 0): 148 | temp_start = min_sub 149 | 
self.start_offset_B = min_sub-temp_start 150 | 151 | tempInfo = {'id':int(temp_dataList[0]),\ 152 | 'submit':self.density*(float(temp_dataList[1])-min_sub)+temp_start,\ 153 | 'wait':float(temp_dataList[2]),\ 154 | 'run':float(temp_dataList[3]),\ 155 | 'usedProc':int(temp_dataList[4]),\ 156 | 'usedAveCPU':float(temp_dataList[5]),\ 157 | 'usedMem':float(temp_dataList[6]),\ 158 | 'reqProc':int(temp_dataList[7]),\ 159 | 'reqTime':float(temp_dataList[8]),\ 160 | 'reqMem':float(temp_dataList[9]),\ 161 | 'status':int(temp_dataList[10]),\ 162 | 'userID':int(temp_dataList[11]),\ 163 | 'groupID':int(temp_dataList[12]),\ 164 | 'num_exe':int(temp_dataList[13]),\ 165 | 'num_queue':int(temp_dataList[14]),\ 166 | 'num_part':int(temp_dataList[15]),\ 167 | 'num_pre':int(temp_dataList[16]),\ 168 | 'thinkTime':int(temp_dataList[17]),\ 169 | 'start':-1,\ 170 | 'end':-1,\ 171 | 'score':0,\ 172 | 'state':0,\ 173 | 'happy':-1,\ 174 | 'estStart':-1} 175 | #self.jobTrace.append(tempInfo) 176 | self.jobTrace[i] = tempInfo 177 | self.job_submit_list.append(i) 178 | self.debug.debug(temp_dataList,4) 179 | #self.debug.debug("* "+str(tempInfo),4) 180 | i += 1 181 | j += 1 182 | 183 | self.debug.line(4) 184 | jobFile.close() 185 | #print('jobFile',jobFile,jobFile.closed) 186 | 187 | def import_job_config (self, config_file): 188 | #self.debug.debug("* "+self.myInfo+" -- import_job_config",5) 189 | regex_str = "([^=\\n]*)[=\\n]" 190 | jobFile = open(config_file,'r') 191 | config_data={} 192 | 193 | self.debug.line(4) 194 | while (1): 195 | tempStr = jobFile.readline() 196 | if not tempStr : # break when no more line 197 | break 198 | temp_dataList=re.findall(regex_str,tempStr) 199 | config_data[temp_dataList[0]]=temp_dataList[1] 200 | self.debug.debug(str(temp_dataList[0])+": "+str(temp_dataList[1]),4) 201 | self.debug.line(4) 202 | jobFile.close() 203 | self.start_offset_A = config_data['start_offset'] 204 | self.start_date = config_data['date'] 205 | 206 | def submit_list (self): 207 | 
#self.debug.debug("* "+self.myInfo+" -- submit_list",6) 208 | return self.job_submit_list 209 | 210 | def wait_list (self): 211 | #self.debug.debug("* "+self.myInfo+" -- wait_list",6) 212 | return self.job_wait_list 213 | 214 | def run_list (self): 215 | #self.debug.debug("* "+self.myInfo+" -- run_list",6) 216 | return self.job_run_list 217 | 218 | ''' 219 | def done_list (self): 220 | #self.debug.debug("* "+self.myInfo+" -- done_list",6) 221 | return self.job_done_list 222 | ''' 223 | 224 | def wait_size (self): 225 | #self.debug.debug("* "+self.myInfo+" -- wait_size",6) 226 | return self.job_wait_size 227 | 228 | def refresh_score (self, score, job_index=None): 229 | #self.debug.debug("* "+self.myInfo+" -- refresh_score",5) 230 | if job_index is not None: 231 | self.jobTrace[job_index]['score'] = score 232 | else: 233 | i = 0 234 | while (i < len(self.job_wait_list)): 235 | self.jobTrace[self.job_wait_list[i]]['score'] = score[i] 236 | i += 1 237 | #self.job_wait_list.sort(self.scoreCmp) 238 | # python 2 -> 3 239 | self.job_wait_list.sort(key = cmp_to_key(self.scoreCmp)) 240 | #self.debug.debug(" Wait:"+str(self.job_wait_list),4) 241 | 242 | def scoreCmp(self,jobIndex_c1,jobIndex_c2): 243 | return -self.cmp(self.jobTrace[jobIndex_c1]['score'],self.jobTrace[jobIndex_c2]['score']) 244 | 245 | def cmp(self, v1, v2): # emulate cmp from Python 2 246 | if (v1 < v2): 247 | return -1 248 | elif (v1 == v2): 249 | return 0 250 | elif (v1 > v2): 251 | return 1 252 | 253 | def job_info (self, job_index = -1): 254 | #self.debug.debug("* "+self.myInfo+" -- job_info",6) 255 | if job_index == -1: 256 | return self.jobTrace 257 | return self.jobTrace[job_index] 258 | 259 | def job_info_len(self): 260 | return len(self.jobTrace)+self.num_delete_jobs 261 | 262 | def job_submit (self, job_index, job_score = 0, job_est_start = -1): 263 | #self.debug.debug("* "+self.myInfo+" -- job_submit",5) 264 | self.jobTrace[job_index]["state"]=1 265 | self.jobTrace[job_index]["score"]=job_score 266 | 
self.jobTrace[job_index]["estStart"]=job_est_start 267 | self.job_submit_list.remove(job_index) 268 | self.job_wait_list.append(job_index) 269 | self.job_wait_size += self.jobTrace[job_index]["reqProc"] 270 | return 1 271 | 272 | def job_start (self, job_index, time): 273 | #self.debug.debug("* "+self.myInfo+" -- job_start",5) 274 | self.debug.debug(" "+"["+str(job_index)+"]"+" Req:"+str(self.jobTrace[job_index]['reqProc'])+" Run:"+str(self.jobTrace[job_index]['run'])+" ",4) 275 | self.jobTrace[job_index]["state"]=2 276 | self.jobTrace[job_index]['start']=time 277 | self.jobTrace[job_index]['wait']=time-self.jobTrace[job_index]['submit'] 278 | self.jobTrace[job_index]['end'] = time+self.jobTrace[job_index]['run'] 279 | self.job_wait_list.remove(job_index) 280 | self.job_run_list.append(job_index) 281 | self.job_wait_size -= self.jobTrace[job_index]["reqProc"] 282 | return 1 283 | 284 | def job_finish (self, job_index, time=None): 285 | #self.debug.debug("* "+self.myInfo+" -- job_finish",5) 286 | self.debug.debug(" "+"["+str(job_index)+"]"+" Req:"+str(self.jobTrace[job_index]['reqProc'])+" Run:"+str(self.jobTrace[job_index]['run'])+" ",4) 287 | self.jobTrace[job_index]["state"]=3 288 | if time: 289 | self.jobTrace[job_index]['end'] = time 290 | self.job_run_list.remove(job_index) 291 | #self.job_done_list.append(job_index) 292 | return 1 293 | 294 | ''' 295 | def job_fail (self, job_index, time=None): 296 | #self.debug.debug("* "+self.myInfo+" -- job_fail",5) 297 | self.debug.debug(" "+"["+str(job_index)+"]"+" Req:"+str(self.jobTrace[job_index]['reqProc'])+" Run:"+str(self.jobTrace[job_index]['run'])+" ",4) 298 | self.jobTrace[job_index]["state"]=4 299 | if time: 300 | self.jobTrace[job_index]['end'] = time 301 | self.job_run_list.remove(job_index) 302 | self.fail_list.append(job_index) 303 | return 1 304 | ''' 305 | 306 | def job_set_score (self, job_index, job_score): 307 | #self.debug.debug("* "+self.myInfo+" -- job_set_score",5) 308 | 
self.jobTrace[job_index]["score"]=job_score 309 | return 1 310 | 311 | def remove_job_from_dict(self, job_index): 312 | del self.jobTrace[job_index] 313 | self.num_delete_jobs += 1 314 | #print('jobTrace.keys',self.jobTrace.keys()) 315 | 316 | 317 | 318 | 319 | 320 | 321 | 322 | 323 | 324 | -------------------------------------------------------------------------------- /src_fc/CqSim/Node_struc.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | import time 3 | import re 4 | 5 | __metaclass__ = type 6 | 7 | class Node_struc: 8 | def __init__(self, debug=None): 9 | self.myInfo = "Node Structure" 10 | self.debug = debug 11 | self.nodeStruc = [] 12 | self.job_list = [] 13 | self.predict_node = [] 14 | self.predict_job = [] 15 | self.tot = -1 16 | self.idle = -1 17 | self.avail = -1 18 | 19 | self.debug.line(4," ") 20 | self.debug.line(4,"#") 21 | self.debug.debug("# "+self.myInfo,1) 22 | self.debug.line(4,"#") 23 | 24 | def reset(self, debug=None): 25 | #self.debug.debug("* "+self.myInfo+" -- reset",5) 26 | self.debug = debug if debug else self.debug # keep the current logger when none is passed 27 | self.nodeStruc = [] 28 | self.job_list = [] 29 | self.predict_node = [] 30 | self.tot = -1 31 | self.idle = -1 32 | self.avail = -1 33 | 34 | def read_list(self,source_str): 35 | #self.debug.debug("* "+self.myInfo+" -- read_list",5) 36 | result_list=[] 37 | regex_str = r"[\[,]([^,\[\]]*)" 38 | result_list=re.findall(regex_str,source_str) 39 | # convert the captured strings to integers; rebinding the loop variable had no effect 40 | result_list=[int(item) for item in result_list] 41 | return result_list 42 | 43 | def import_node_file(self, node_file): 44 | #self.debug.debug("* "+self.myInfo+" -- import_node_file",5) 45 | regex_str = "([^;\\n]*)[;\\n]" 46 | nodeFile = open(node_file,'r') 47 | self.nodeStruc = [] 48 | 49 | i = 0 50 | while (1): 51 | tempStr = nodeFile.readline() 52 | if not tempStr : # break when no more line 53 | break 54 | temp_dataList=re.findall(regex_str,tempStr) 55 | 56 | self.debug.debug(" node["+str(i)+"]: "+str(temp_dataList),4) 57 | tempInfo 
= {"id": int(temp_dataList[0]), \ 58 | "location": self.read_list(temp_dataList[1]), \ 59 | "group": int(temp_dataList[2]), \ 60 | "state": int(temp_dataList[3]), \ 61 | "proc": int(temp_dataList[4]), \ 62 | "start": -1, \ 63 | "end": -1, \ 64 | "extend": None} 65 | self.nodeStruc.append(tempInfo) 66 | i += 1 67 | nodeFile.close() 68 | self.tot = len(self.nodeStruc) 69 | self.idle = self.tot 70 | self.avail = self.tot 71 | self.debug.debug(" Tot:"+str(self.tot)+" Idle:"+str(self.idle)+" Avail:"+str(self.avail)+" ",4) 72 | return 73 | 74 | def import_node_config (self, config_file): 75 | #self.debug.debug("* "+self.myInfo+" -- import_node_config",5) 76 | regex_str = "([^=\\n]*)[=\\n]" 77 | nodeFile = open(config_file,'r') 78 | config_data={} 79 | 80 | self.debug.line(4) 81 | while (1): 82 | tempStr = nodeFile.readline() 83 | if not tempStr : # break when no more line 84 | break 85 | temp_dataList=re.findall(regex_str,tempStr) 86 | config_data[temp_dataList[0]]=temp_dataList[1] 87 | self.debug.debug(str(temp_dataList[0])+": "+str(temp_dataList[1]),4) 88 | self.debug.line(4) 89 | nodeFile.close() 90 | 91 | def import_node_data(self, node_data): 92 | #self.debug.debug("* "+self.myInfo+" -- import_node_data",5) 93 | self.nodeStruc = [] 94 | 95 | temp_len = len(node_data) 96 | i=0 97 | while (i= proc_num: 118 | result = 1 119 | self.debug.debug("[Avail Check] "+str(result),6) 120 | return result 121 | 122 | def get_tot(self): 123 | #self.debug.debug("* "+self.myInfo+" -- get_tot",6) 124 | return self.tot 125 | 126 | def get_idle(self): 127 | #self.debug.debug("* "+self.myInfo+" -- get_idle",6) 128 | return self.idle 129 | 130 | def get_avail(self): 131 | #self.debug.debug("* "+self.myInfo+" -- get_avail",6) 132 | return self.avail 133 | 134 | def node_allocate(self, proc_num, job_index, start, end): 135 | #self.debug.debug("* "+self.myInfo+" -- node_allocate",5) 136 | if self.is_available(proc_num) == 0: 137 | return 0 138 | i = 0 139 | for node in self.nodeStruc: 140 | 
if node['state'] <0: 141 | node['state'] = job_index 142 | node['start'] = start 143 | node['end'] = end 144 | i += 1 145 | #self.debug.debug(" yyy: "+str(node['state'])+" "+str(job_index),4) 146 | if (i>=proc_num): 147 | break 148 | self.idle -= proc_num 149 | self.avail = self.idle 150 | temp_job_info = {'job':job_index, 'end': end, 'node': proc_num} 151 | j = 0 152 | is_done = 0 153 | temp_num = len(self.job_list) 154 | while (j=start and self.predict_node[i]['time']self.predict_node[i]['avail']): 202 | return 0 203 | i += 1 204 | return 1 205 | 206 | def reserve(self, proc_num, job_index, time, start = None, index = -1 ): 207 | #self.debug.debug("* "+self.myInfo+" -- reserve",5) 208 | 209 | temp_max = len(self.predict_node) 210 | if (start): 211 | if (self.pre_avail(proc_num,start,start+time)==0): 212 | return -1 213 | else: 214 | i = 0 215 | j = 0 216 | if (index >= 0 and index < temp_max): 217 | i = index 218 | elif(index >= temp_max): 219 | return -1 220 | 221 | while (ipre_info_last['start']): 333 | pre_info_last['start'] = temp_job['start'] 334 | if (temp_job['end']>pre_info_last['end']): 335 | pre_info_last['end'] = temp_job['end'] 336 | return pre_info_last 337 | 338 | def pre_reset(self, time): 339 | #self.debug.debug("* "+self.myInfo+" -- pre_reset",5) 340 | self.predict_node = [] 341 | self.predict_job = [] 342 | temp_list = [] 343 | i = 0 344 | while i < self.tot: 345 | temp_list.append(self.nodeStruc[i]['state']) 346 | i += 1 347 | self.predict_node.append({'time': time, 'node': temp_list, 348 | 'idle': self.idle, 'avail': self.avail}) 349 | 350 | temp_job_num = len(self.job_list) 351 | i = 0 352 | j = 0 353 | while i < temp_job_num: 354 | if self.predict_node[j]['time'] != self.job_list[i]['end'] or i == 0: 355 | temp_list = [] 356 | k = 0 357 | while k < self.tot: 358 | temp_list.append(self.predict_node[j]['node'][k]) 359 | k += 1 360 | self.predict_node.append({'time': self.job_list[i]['end'], 'node': temp_list, 361 | 'idle': 
self.predict_node[j]['idle'], 'avail': self.predict_node[j]['avail']}) 362 | j += 1 363 | k = 0 364 | while k < self.tot: 365 | if self.predict_node[j]['node'][k] == self.job_list[i]['job']: 366 | self.predict_node[j]['node'][k] = -1 367 | self.predict_node[j]['idle'] += 1 368 | k += 1 369 | i += 1 370 | self.predict_node[j]['avail'] = self.predict_node[j]['idle'] 371 | ''' 372 | i = 0 373 | while i< self.tot: 374 | if self.nodeStruc[i]['state'] != -1: 375 | temp_index = get_pre_index(temp_time = self.nodeStruc[i]['end']) 376 | self.predict_node[temp_index]['node'][i] = self.nodeStruc[i]['state'] 377 | i += 1 378 | ''' 379 | return 1 380 | 381 | 382 | def find_res_place(self, proc_num, index, time): 383 | self.debug.debug("* "+self.myInfo+" -- find_res_place",5) 384 | if index>=len(self.predict_node): 385 | index = len(self.predict_node) - 1 386 | 387 | i = index 388 | end = self.predict_node[index]['time']+time 389 | temp_node_num = len(self.predict_node) 390 | 391 | while (i < temp_node_num): 392 | if (self.predict_node[i]['time']<end): 393 | if (proc_num>self.predict_node[i]['avail']): 394 | #print "xxxxx ",temp_node_num,proc_num,self.predict_node[i] 395 | return i 396 | i += 1 397 | return -1 -------------------------------------------------------------------------------- /src_fc/CqSim/Start_window.py: -------------------------------------------------------------------------------- 1 | 2 | __metaclass__ = type 3 | class Start_window: 4 | def __init__(self, mode = 0, ad_mode = 0, node_module = None, debug = None, para_list = [6,0,0], para_list_ad = None): 5 | self.myInfo = "Start Window" 6 | self.mode = mode 7 | self.ad_mode = ad_mode 8 | self.node_module = node_module 9 | self.debug = debug 10 | self.para_list = para_list 11 | self.para_list_ad = para_list_ad 12 | #print self.para_list 13 | if (len(self.para_list)>=1 and int(self.para_list[0]) > 0): 14 | self.win_size = int(self.para_list[0]) 15 | else: 16 | self.win_size = 1 17 | if (len(self.para_list)>=2 and int(self.para_list[1]) > 
0): 18 | self.check_size_in = int(self.para_list[1]) 19 | else: 20 | self.check_size_in = self.win_size 21 | if (len(self.para_list)>=3 and int(self.para_list[2]) > 0): 22 | self.max_start_size = int(self.para_list[2]) 23 | else: 24 | self.max_start_size = self.win_size 25 | 26 | self.temp_check_len = self.check_size_in 27 | 28 | self.current_para = [] 29 | self.seq_list = [] 30 | 31 | self.debug.line(4," ") 32 | self.debug.line(4,"#") 33 | self.debug.debug("# "+self.myInfo,1) 34 | self.debug.line(4,"#") 35 | 36 | self.reset_list() 37 | 38 | def reset (self, mode = None, ad_mode = None, node_module = None, debug = None, para_list = None, para_list_ad = None): 39 | #self.debug.debug("* "+self.myInfo+" -- reset",5) 40 | if mode: 41 | self.mode = mode 42 | if ad_mode: 43 | self.ad_mode =ad_mode 44 | if node_module: 45 | self.node_module = node_module 46 | if debug: 47 | self.debug = debug 48 | if para_list: 49 | self.para_list = para_list 50 | if (self.para_list[0] and self.para_list[0] > 0): 51 | self.win_size = self.para_list[0] 52 | else: 53 | self.win_size = 1 54 | if (self.para_list[1] and self.para_list[1] > 0): 55 | self.check_size_in = self.para_list[1] 56 | else: 57 | self.check_size_in = self.win_size 58 | if (self.para_list[2] and self.para_list[2] > 0): 59 | self.max_start_size = self.para_list[2] 60 | else: 61 | self.max_start_size = self.win_size 62 | 63 | if para_list_ad: 64 | self.para_list_ad = para_list_ad 65 | 66 | self.current_para = [] 67 | self.seq_list = [] 68 | self.reset_list() 69 | 70 | def start_window (self, wait_job, para_in = None): 71 | #self.debug.debug("* "+self.myInfo+" -- start_window",5) 72 | self.current_para = para_in 73 | temp_len = len(wait_job) 74 | self.wait_job = [] 75 | i = 0 76 | while (i < self.win_size and i < temp_len): 77 | self.wait_job.append(wait_job[i]) 78 | i += 1 79 | if i>self.check_size_in: 80 | i = self.check_size_in 81 | self.temp_check_len = i 82 | result = self.main() 83 | return result 84 | 85 | def main 
(self): 86 | #self.debug.debug("* "+self.myInfo+" -- main",5) 87 | result = [] 88 | if self.mode == 1: 89 | # window 90 | result = self.window_check() 91 | #print ">>>>>>>>>>. ",result 92 | else: 93 | # no window 94 | i = 0 95 | temp_list=[] 96 | while (i < self.temp_check_len): 97 | temp_list.append(self.wait_job[i]['index']) 98 | i += 1 99 | return temp_list 100 | return result 101 | 102 | def window_adapt (self, para_in = None): 103 | #self.debug.debug("* "+self.myInfo+" -- window_adapt",5) 104 | return 0 105 | 106 | def window_size (self): 107 | #self.debug.debug("* "+self.myInfo+" -- window_size",6) 108 | return self.win_size 109 | 110 | def check_size (self): 111 | #self.debug.debug("* "+self.myInfo+" -- check_size",6) 112 | return self.check_size_in 113 | 114 | def start_num (self): 115 | #self.debug.debug("* "+self.myInfo+" -- start_num",6) 116 | return self.max_start_size 117 | 118 | def reset_list (self): 119 | #self.debug.debug("* "+self.myInfo+" -- reset_list",5) 120 | self.seq_list = [] 121 | self.temp_list=[] 122 | self.wait_job = [] 123 | temp_seq=[] 124 | i = 0 125 | ele = [] 126 | while (i=0): 141 | self.temp_list[temp_index] = ele_pool[i] 142 | temp_ele_pool = ele_pool[:] 143 | temp_ele_pool.pop(i) 144 | self.build_seq_list(seq_len-1,temp_ele_pool,temp_index-1) 145 | i -= 1 146 | 147 | def window_check (self): 148 | #self.debug.debug("* "+self.myInfo+" -- window_check",5) 149 | 150 | temp_wait_list = [] 151 | temp_wait_listB = [] 152 | temp_last = -1 153 | temp_max = 1 154 | i = 1 155 | if (self.temp_check_len == 1): 156 | return [self.wait_job[0]['index']] 157 | 158 | while (i<=self.temp_check_len): 159 | temp_max = temp_max * i 160 | i += 1 161 | 162 | i = 0 163 | while (iself.node_module.pre_get_last()['end']): 173 | temp_last = self.node_module.pre_get_last()['end'] 174 | temp_wait_list = self.seq_list[i] 175 | i += 1 176 | 177 | i = 0 178 | while (i=self.anchor): 42 | strNum = len(tempStr) 43 | newWord = 1 44 | k = 0 45 | ID = "" # 1 46 | 
submit = "" # 2 47 | wait = "" # 3 48 | run = "" # 4 49 | usedProc = "" # 5 50 | usedAveCPU = "" # 6 51 | usedMem = "" # 7 52 | reqProc = "" # 8 53 | reqTime = "" # 9 54 | reqMem = "" # 10 55 | status = "" # 11 56 | userID = "" # 12 57 | groupID = "" # 13 58 | num_exe = "" # 14 59 | num_queue = "" # 15 60 | num_part = "" # 16 61 | num_pre = "" # 17 62 | thinkTime = "" # 18 63 | 64 | for i in range(strNum): 65 | if (tempStr[i] == '\n'): 66 | break 67 | if (tempStr[i] == sept_sign): 68 | if (newWord == 0): 69 | newWord = 1 70 | k = k+1 71 | else: 72 | newWord = 0 73 | if k == 0: 74 | ID=ID+ tempStr[i] 75 | elif k == 1: 76 | submit = submit + tempStr[i] 77 | elif k == 2: 78 | wait = wait + tempStr[i] 79 | elif k == 3: 80 | run = run + tempStr[i] 81 | elif k == 4: 82 | usedProc = usedProc + tempStr[i] 83 | elif k == 5: 84 | usedAveCPU = usedAveCPU + tempStr[i] 85 | elif k == 6: 86 | usedMem = usedMem + tempStr[i] 87 | elif k == 7: 88 | reqProc = reqProc + tempStr[i] 89 | elif k == 8: 90 | reqTime = reqTime + tempStr[i] 91 | elif k == 9: 92 | reqMem = reqMem + tempStr[i] 93 | elif k == 10: 94 | status = status + tempStr[i] 95 | elif k == 11: 96 | userID = userID + tempStr[i] 97 | elif k == 12: 98 | groupID = groupID + tempStr[i] 99 | elif k == 13: 100 | num_exe = num_exe + tempStr[i] 101 | elif k == 14: 102 | num_queue = num_queue + tempStr[i] 103 | elif k == 15: 104 | num_part = num_part + tempStr[i] 105 | elif k == 16: 106 | num_pre = num_pre + tempStr[i] 107 | elif k == 17: 108 | thinkTime = thinkTime + tempStr[i] 109 | 110 | if (min_sub<0): 111 | min_sub=float(submit) 112 | if (self.start < 0): 113 | self.start = min_sub 114 | for con_data in self.config_data: 115 | if not con_data['name'] and con_data['name_config'] == 'start_offset': 116 | con_data['value'] = min_sub-self.start 117 | break 118 | 119 | tempInfo = {'id':int(ID),\ 120 | 'submit':self.density*(float(submit)-min_sub)+self.start,\ 121 | 'wait':float(wait),\ 122 | 'run':float(run),\ 123 | 
'usedProc':int(usedProc),\ 124 | 'usedAveCPU':float(usedAveCPU),\ 125 | 'usedMem':float(usedMem),\ 126 | 'reqProc':int(reqProc),\ 127 | 'reqTime':float(reqTime),\ 128 | 'reqMem':float(reqMem),\ 129 | 'status':int(status),\ 130 | 'userID':int(userID),\ 131 | 'groupID':int(groupID),\ 132 | 'num_exe':int(num_exe),\ 133 | 'num_queue':int(num_queue),\ 134 | 'num_part':int(num_part),\ 135 | 'num_pre':int(num_pre),\ 136 | 'thinkTime':int(thinkTime),\ 137 | 'start':-1,\ 138 | 'end':-1,\ 139 | 'score':0,\ 140 | 'state':0,\ 141 | 'happy':-1,\ 142 | 'estStart':-1} 143 | # state: 0: not submit 1: waiting 2: running 3: done 144 | 145 | if (self.input_check(tempInfo)>=0): 146 | f2.write(str(tempInfo['id'])) 147 | f2.write(sep_sign) 148 | f2.write(str(tempInfo['submit'])) 149 | f2.write(sep_sign) 150 | f2.write(str(tempInfo['wait'])) 151 | f2.write(sep_sign) 152 | f2.write(str(tempInfo['run'])) 153 | f2.write(sep_sign) 154 | f2.write(str(tempInfo['usedProc'])) 155 | f2.write(sep_sign) 156 | f2.write(str(tempInfo['usedAveCPU'])) 157 | f2.write(sep_sign) 158 | f2.write(str(tempInfo['usedMem'])) 159 | f2.write(sep_sign) 160 | f2.write(str(tempInfo['reqProc'])) 161 | f2.write(sep_sign) 162 | f2.write(str(tempInfo['reqTime'])) 163 | f2.write(sep_sign) 164 | f2.write(str(tempInfo['reqMem'])) 165 | f2.write(sep_sign) 166 | f2.write(str(tempInfo['status'])) 167 | f2.write(sep_sign) 168 | f2.write(str(tempInfo['userID'])) 169 | f2.write(sep_sign) 170 | f2.write(str(tempInfo['groupID'])) 171 | f2.write(sep_sign) 172 | f2.write(str(tempInfo['num_exe'])) 173 | f2.write(sep_sign) 174 | f2.write(str(tempInfo['num_queue'])) 175 | f2.write(sep_sign) 176 | f2.write(str(tempInfo['num_part'])) 177 | f2.write(sep_sign) 178 | f2.write(str(tempInfo['num_pre'])) 179 | f2.write(sep_sign) 180 | f2.write(str(tempInfo['thinkTime'])) 181 | f2.write("\n") 182 | #self.jobList.append(tempInfo) 183 | temp_readNum+=1 184 | #job_num += 1 185 | temp_start += 1 186 | else: 187 | for con_data in self.config_data: 
188 | if con_data['name']: 189 | con_ex = con_data['name']+self.config_equal+"([^"+self.config_sep+"]*)"+self.config_sep 190 | temp_con_List=re.findall(con_ex,tempStr) 191 | if (len(temp_con_List)>=1): 192 | con_data['value'] = temp_con_List[0] 193 | break 194 | 195 | 196 | jobFile.close() 197 | f2.close() 198 | self.jobNum = temp_readNum 199 | #self.jobNum = len(self.jobList) 200 | 201 | def read_job_trace(self): 202 | nr_sign =';' # Not read sign. Mark the line not the job data 203 | sep_sign =' ' # The sign seperate data in a line 204 | 205 | jobFile = open(self.trace,'r') 206 | min_sub = -1 207 | temp_readNum=0 208 | temp_start=0 209 | while (temp_readNum=self.anchor): 216 | strNum = len(tempStr) 217 | newWord = 1 218 | k = 0 219 | ID = "" # 1 220 | submit = "" # 2 221 | wait = "" # 3 222 | run = "" # 4 223 | usedProc = "" # 5 224 | usedAveCPU = "" # 6 225 | usedMem = "" # 7 226 | reqProc = "" # 8 227 | reqTime = "" # 9 228 | reqMem = "" # 10 229 | status = "" # 11 230 | userID = "" # 12 231 | groupID = "" # 13 232 | num_exe = "" # 14 233 | num_queue = "" # 15 234 | num_part = "" # 16 235 | num_pre = "" # 17 236 | thinkTime = "" # 18 237 | 238 | for i in range(strNum): 239 | if (tempStr[i] == '\n'): 240 | break 241 | if (tempStr[i] == sep_sign): 242 | if (newWord == 0): 243 | newWord = 1 244 | k = k+1 245 | else: 246 | newWord = 0 247 | if k == 0: 248 | ID=ID+ tempStr[i] 249 | elif k == 1: 250 | submit = submit + tempStr[i] 251 | elif k == 2: 252 | wait = wait + tempStr[i] 253 | elif k == 3: 254 | run = run + tempStr[i] 255 | elif k == 4: 256 | usedProc = usedProc + tempStr[i] 257 | elif k == 5: 258 | usedAveCPU = usedAveCPU + tempStr[i] 259 | elif k == 6: 260 | usedMem = usedMem + tempStr[i] 261 | elif k == 7: 262 | reqProc = reqProc + tempStr[i] 263 | elif k == 8: 264 | reqTime = reqTime + tempStr[i] 265 | elif k == 9: 266 | reqMem = reqMem + tempStr[i] 267 | elif k == 10: 268 | status = status + tempStr[i] 269 | elif k == 11: 270 | userID = userID + 
tempStr[i] 271 | elif k == 12: 272 | groupID = groupID + tempStr[i] 273 | elif k == 13: 274 | num_exe = num_exe + tempStr[i] 275 | elif k == 14: 276 | num_queue = num_queue + tempStr[i] 277 | elif k == 15: 278 | num_part = num_part + tempStr[i] 279 | elif k == 16: 280 | num_pre = num_pre + tempStr[i] 281 | elif k == 17: 282 | thinkTime = thinkTime + tempStr[i] 283 | 284 | if (min_sub<0): 285 | min_sub=float(submit) 286 | if (self.start < 0): 287 | self.start = min_sub 288 | for con_data in self.config_data: 289 | if not con_data['name'] and con_data['name_config'] == 'start_offset': 290 | con_data['value'] = min_sub-self.start 291 | break 292 | 293 | tempInfo = {'id':int(ID),\ 294 | 'submit':self.density*(float(submit)-min_sub)+self.start,\ 295 | 'wait':float(wait),\ 296 | 'run':float(run),\ 297 | 'usedProc':int(usedProc),\ 298 | 'usedAveCPU':float(usedAveCPU),\ 299 | 'usedMem':float(usedMem),\ 300 | 'reqProc':int(reqProc),\ 301 | 'reqTime':float(reqTime),\ 302 | 'reqMem':float(reqMem),\ 303 | 'status':int(status),\ 304 | 'userID':int(userID),\ 305 | 'groupID':int(groupID),\ 306 | 'num_exe':int(num_exe),\ 307 | 'num_queue':int(num_queue),\ 308 | 'num_part':int(num_part),\ 309 | 'num_pre':int(num_pre),\ 310 | 'thinkTime':int(thinkTime),\ 311 | 'start':-1,\ 312 | 'end':-1,\ 313 | 'score':0,\ 314 | 'state':0,\ 315 | 'happy':-1,\ 316 | 'estStart':-1} 317 | # state: 0: not submit 1: waiting 2: running 3: done 318 | 319 | if (self.input_check(tempInfo)>=0): 320 | self.jobList.append(tempInfo) 321 | temp_readNum+=1 322 | temp_start += 1 323 | else: 324 | for con_data in self.config_data: 325 | if con_data['name']: 326 | con_ex = con_data['name']+self.config_equal+"([^"+self.config_sep+"]*)"+self.config_sep 327 | temp_con_List=re.findall(con_ex,tempStr) 328 | if (len(temp_con_List)>=1): 329 | con_data['value'] = temp_con_List[0] 330 | break 331 | 332 | 333 | jobFile.close() 334 | self.jobNum = len(self.jobList) 335 | 336 | def input_check(self,jobInfo): 337 | if 
(int(jobInfo['run'])>int(jobInfo['reqTime'])): 338 | jobInfo['run']=jobInfo['reqTime'] 339 | if (int(jobInfo['id'])<=0): 340 | return -2 341 | if (int(jobInfo['submit'])<0): 342 | return -3 343 | if (int(jobInfo['run'])<=0): 344 | return -4 345 | if (int(jobInfo['reqTime'])<=0): 346 | return -5 347 | if (int(jobInfo['reqProc'])<=0): 348 | return -6 349 | return 1 350 | 351 | def output_job_data(self): 352 | if not self.save: 353 | print("Save file not set!") 354 | return 355 | 356 | sep_sign = ";" 357 | f2=open(self.save,"w") 358 | 359 | for jobResult_o in self.jobList: 360 | f2.write(str(jobResult_o['id'])) 361 | f2.write(sep_sign) 362 | f2.write(str(jobResult_o['submit'])) 363 | f2.write(sep_sign) 364 | f2.write(str(jobResult_o['wait'])) 365 | f2.write(sep_sign) 366 | f2.write(str(jobResult_o['run'])) 367 | f2.write(sep_sign) 368 | f2.write(str(jobResult_o['usedProc'])) 369 | f2.write(sep_sign) 370 | f2.write(str(jobResult_o['usedAveCPU'])) 371 | f2.write(sep_sign) 372 | f2.write(str(jobResult_o['usedMem'])) 373 | f2.write(sep_sign) 374 | f2.write(str(jobResult_o['reqProc'])) 375 | f2.write(sep_sign) 376 | f2.write(str(jobResult_o['reqTime'])) 377 | f2.write(sep_sign) 378 | f2.write(str(jobResult_o['reqMem'])) 379 | f2.write(sep_sign) 380 | f2.write(str(jobResult_o['status'])) 381 | f2.write(sep_sign) 382 | f2.write(str(jobResult_o['userID'])) 383 | f2.write(sep_sign) 384 | f2.write(str(jobResult_o['groupID'])) 385 | f2.write(sep_sign) 386 | f2.write(str(jobResult_o['num_exe'])) 387 | f2.write(sep_sign) 388 | f2.write(str(jobResult_o['num_queue'])) 389 | f2.write(sep_sign) 390 | f2.write(str(jobResult_o['num_part'])) 391 | f2.write(sep_sign) 392 | f2.write(str(jobResult_o['num_pre'])) 393 | f2.write(sep_sign) 394 | f2.write(str(jobResult_o['thinkTime'])) 395 | f2.write("\n") 396 | f2.close() 397 | 398 | def output_job_config(self): 399 | if not self.config: 400 | print("Config file not set!") 401 | return 402 | 403 | format_equal = '=' 404 | 
f2=open(self.config,"w") 405 | 406 | for con_data in self.config_data: 407 | f2.write(str(con_data['name_config'])) 408 | f2.write(format_equal) 409 | f2.write(str(con_data['value'])) 410 | f2.write('\n') 411 | f2.close() -------------------------------------------------------------------------------- /src_fc/Extend/SWF/Filter_node_SWF.py: -------------------------------------------------------------------------------- 1 | import re 2 | import Filter.Filter_node as filter_node 3 | 4 | __metaclass__ = type 5 | class Filter_node_SWF(filter_node.Filter_node): 6 | def reset_config_data(self): 7 | self.config_start=';' 8 | self.config_sep='\\n' 9 | self.config_equal=': ' 10 | self.config_data=[] 11 | self.config_data.append({'name_config':'MaxNodes','name':'MaxNodes','value':''}) 12 | self.config_data.append({'name_config':'MaxProcs','name':'MaxProcs','value':''}) 13 | 14 | def read_node_struc(self): 15 | nr_sign =';' # Not-read sign: marks a line that is not job data 16 | sep_sign =' ' # The sign separating data in a line 17 | sep_sign2 =':' # The sign separating a name from its value 18 | nameList=[] 19 | nameList.append(["MaxNodes","node"]) 20 | nameList.append(["MaxProcs","proc"]) 21 | regex_rest = " *:([^\\n]+)\\n" 22 | regexList = [] 23 | node_info={} 24 | 25 | for dataName in nameList: 26 | regexList.append([(dataName[0]+regex_rest),dataName[1]]) 27 | 28 | nodeFile = open(self.struc,'r') 29 | while (1): 30 | tempStr = nodeFile.readline() 31 | if not tempStr : # break when there are no more lines 32 | break 33 | if tempStr[0] == nr_sign: # The information line 34 | for dataRegex in regexList: 35 | matchResult = re.findall(dataRegex[0],tempStr) 36 | if (matchResult): 37 | node_info[dataRegex[1]]=int(matchResult[0].strip()) 38 | break 39 | for con_data in self.config_data: 40 | con_ex = con_data['name']+self.config_equal+"([^"+self.config_sep+"]*)"+self.config_sep 41 | temp_con_List=re.findall(con_ex,tempStr) 42 | if (len(temp_con_List)>=1): 43 | con_data['value'] = temp_con_List[0] 44 | 
break 45 | else: 46 | break 47 | nodeFile.close() 48 | self.node_data_build(node_info) 49 | self.nodeNum = len(self.nodeList) 50 | 51 | def node_data_build(self,node_info): 52 | node_num = node_info['proc'] 53 | self.nodeList=[] 54 | i = 0 55 | while (i < node_num): 56 | self.nodeList.append({"id": i+1, \ 57 | "location": [1], \ 58 | "group": 1, \ 59 | "state": -1, \ 60 | "proc": 1, \ 61 | "start": -1, \ 62 | "end": -1, \ 63 | "extend": None}) 64 | i += 1 65 | return 1 66 | 67 | def output_node_data(self): 68 | if not self.save: 69 | print("Save file not set!") 70 | return 71 | 72 | sep_sign = ";" 73 | f2=open(self.save,"w") 74 | for nodeResult_o in self.nodeList: 75 | f2.write(str(nodeResult_o['id'])) 76 | f2.write(sep_sign) 77 | f2.write(str(nodeResult_o['location'])) 78 | f2.write(sep_sign) 79 | f2.write(str(nodeResult_o['group'])) 80 | f2.write(sep_sign) 81 | f2.write(str(nodeResult_o['state'])) 82 | f2.write(sep_sign) 83 | f2.write(str(nodeResult_o['proc'])) 84 | f2.write("\n") 85 | f2.close() 86 | 87 | def output_node_config(self): 88 | if not self.config: 89 | print("Config file not set!") 90 | return 91 | 92 | format_equal = '=' 93 | f2=open(self.config,"w") 94 | 95 | for con_data in self.config_data: 96 | f2.write(str(con_data['name_config'])) 97 | f2.write(format_equal) 98 | f2.write(str(con_data['value'])) 99 | f2.write('\n') 100 | f2.close() -------------------------------------------------------------------------------- /src_fc/Extend/SWF/Node_struc_SWF.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | import time 3 | import re 4 | 5 | import CqSim.Node_struc as Class_Node_struc 6 | 7 | __metaclass__ = type 8 | 9 | 10 | class Node_struc_SWF(Class_Node_struc.Node_struc): 11 | 12 | def node_allocate(self, proc_num, job_index, start, end): 13 | 14 | if self.is_available(proc_num) == 0: 15 | return 0 16 | 17 | i = 0 18 | for node in self.nodeStruc: 19 | if node['state'] < 0: 20 | 
node['state'] = job_index 21 | node['start'] = start 22 | node['end'] = end 23 | i += 1 24 | #self.debug.debug(" yyy: "+str(node['state'])+" "+str(job_index),4) 25 | if (i>=proc_num): 26 | break 27 | 28 | self.idle -= proc_num 29 | self.avail = self.idle 30 | temp_job_info = {'job':job_index, 'end': end, 'node': proc_num} 31 | j = 0 32 | is_done = 0 33 | temp_num = len(self.job_list) 34 | while (j=start and self.predict_node[i]['time']self.predict_node[i]['avail']): 95 | return 0 96 | i += 1 97 | return 1 98 | 99 | def reserve(self, proc_num, job_index, time, start = None, index = -1 ): 100 | #self.debug.debug("* "+self.myInfo+" -- reserve",5) 101 | 102 | temp_max = len(self.predict_node) 103 | if (start): 104 | if (self.pre_avail(proc_num,start,start+time)==0): 105 | return -1 106 | else: 107 | i = 0 108 | j = 0 109 | if (index >= 0 and index < temp_max): 110 | i = index 111 | elif(index >= temp_max): 112 | return -1 113 | 114 | while (i "+str(job_index) +" "+str(proc_num) +" "+str(time) +" ",2) 158 | while (ipre_info_last['start']): 179 | pre_info_last['start'] = temp_job['start'] 180 | if (temp_job['end']>pre_info_last['end']): 181 | pre_info_last['end'] = temp_job['end'] 182 | return pre_info_last 183 | 184 | def pre_reset(self, time): 185 | #self.debug.debug("* "+self.myInfo+" -- pre_reset",5) 186 | self.predict_node = [] 187 | self.predict_job = [] 188 | self.predict_node.append({'time':time, 'idle':self.idle, 'avail':self.avail}) 189 | 190 | 191 | temp_job_num = len(self.job_list) 192 | ''' 193 | i = 0 194 | self.debug.line(2,'==') 195 | while (i=len(self.predict_node): 226 | index = len(self.predict_node) - 1 227 | 228 | i = index 229 | end = self.predict_node[index]['time']+time 230 | temp_node_num = len(self.predict_node) 231 | 232 | while (i < temp_node_num): 233 | if (self.predict_node[i]['time']self.predict_node[i]['avail']): 235 | #print "xxxxx ",temp_node_num,proc_num,self.predict_node[i] 236 | return i 237 | i += 1 238 | return -1 
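The SWF job reader in this package (`Filter_job_SWF.read_job_trace`) walks each trace line character by character to fill 18 whitespace-separated fields. As a rough illustration only, the same field mapping can be sketched with `str.split`. The names `SWF_FIELDS` and `parse_swf_line` are hypothetical and not part of CQGym. The sketch omits the `density`/`start` rescaling of `submit` and the `input_check` validation that the original applies.

```python
# Hypothetical helper, not part of CQGym: field names and casts mirror the
# tempInfo dict built in Filter_job_SWF.read_job_trace.
SWF_FIELDS = [
    ("id", int), ("submit", float), ("wait", float), ("run", float),
    ("usedProc", int), ("usedAveCPU", float), ("usedMem", float),
    ("reqProc", int), ("reqTime", float), ("reqMem", float),
    ("status", int), ("userID", int), ("groupID", int),
    ("num_exe", int), ("num_queue", int), ("num_part", int),
    ("num_pre", int), ("thinkTime", int),
]


def parse_swf_line(line, comment_sign=";"):
    """Parse one SWF line into a dict; return None for comment or blank lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith(comment_sign):
        return None
    tokens = stripped.split()  # collapses runs of spaces, like the newWord flag
    if len(tokens) < len(SWF_FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(SWF_FIELDS), len(tokens)))
    return {name: cast(tok) for (name, cast), tok in zip(SWF_FIELDS, tokens)}


if __name__ == "__main__":
    sample = "1 0 10 100 4 -1 -1 4 120 -1 1 3 1 -1 1 -1 -1 -1\n"
    print(parse_swf_line(sample))
```

Keeping the field order in a single table makes the 1-to-18 column mapping visible in one place, which is the main design difference from the `if k == ...` chain above.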
-------------------------------------------------------------------------------- /src_fc/Extend/SWF/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/Extend/SWF/__init__.py -------------------------------------------------------------------------------- /src_fc/Extend/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/Extend/__init__.py -------------------------------------------------------------------------------- /src_fc/Filter/Filter_job.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | import time 3 | 4 | __metaclass__ = type 5 | 6 | class Filter_job: 7 | def __init__(self, trace, save=None, config=None, sdate=None, start=-1, density=1.0, anchor=0, rnum=0, debug=None): 8 | self.myInfo = "Filter Job" 9 | self.start = start 10 | self.sdate = sdate 11 | self.density = float(density) 12 | self.anchor = int(anchor) 13 | self.rnum = int(rnum) 14 | self.trace = str(trace) 15 | self.save = str(save) 16 | self.config = str(config) 17 | self.debug = debug 18 | self.jobNum = -1 19 | self.jobList=[] 20 | 21 | self.debug.line(4," ") 22 | self.debug.line(4,"#") 23 | self.debug.debug("# "+self.myInfo,1) 24 | self.debug.line(4,"#") 25 | 26 | self.reset_config_data() 27 | 28 | def reset(self, trace=None, save=None, config=None, sdate=None, start=None, density=None, anchor=None, rnum=None, debug=None): 29 | self.debug.debug("* "+self.myInfo+" -- reset",5) 30 | if start: 31 | self.start = start 32 | if sdate: 33 | self.sdate = sdate 34 | if density: 35 | self.density = float(density) 36 | if anchor: 37 | self.anchor = int(anchor) 38 | if rnum: 39 | self.rnum = int(rnum) 40 | if trace: 41 | self.trace = str(trace) 42 | if save: 43 
| self.save = str(save) 44 | if config: 45 | self.config = str(config) 46 | if debug: 47 | self.debug = debug 48 | self.jobNum = -1 49 | self.jobList=[] 50 | 51 | self.reset_config_data() 52 | 53 | def reset_config_data(self): 54 | self.debug.debug("* "+self.myInfo+" -- reset_config_data",5) 55 | self.config_start=';' 56 | self.config_sep='\\n' 57 | self.config_equal=': ' 58 | self.config_data=[] 59 | #self.config_data.append({'name_config':'date','name':'StartTime','value':''}) 60 | 61 | def read_job_trace(self): 62 | self.debug.debug("* "+self.myInfo+" -- read_job_trace",5) 63 | return 64 | 65 | def input_check(self,jobInfo): 66 | self.debug.debug("* "+self.myInfo+" -- input_check",5) 67 | return 68 | 69 | def get_job_num(self): 70 | self.debug.debug("* "+self.myInfo+" -- get_job_num",6) 71 | return self.jobNum 72 | 73 | def get_job_data(self): 74 | self.debug.debug("* "+self.myInfo+" -- get_job_data",5) 75 | return self.jobList 76 | 77 | def output_job_data(self): 78 | self.debug.debug("* "+self.myInfo+" -- output_job_data",5) 79 | if not self.save: 80 | print("Save file not set!") 81 | return 82 | return 83 | 84 | def output_job_config(self): 85 | self.debug.debug("* "+self.myInfo+" -- output_job_config",5) 86 | if not self.config: 87 | print("Config file not set!") 88 | return 89 | return 90 | 91 | -------------------------------------------------------------------------------- /src_fc/Filter/Filter_node.py: -------------------------------------------------------------------------------- 1 | 2 | __metaclass__ = type 3 | 4 | class Filter_node: 5 | def __init__(self, struc=None, config=None, save=None, debug=None): 6 | self.myInfo = "Filter Node" 7 | self.struc = str(struc) 8 | self.save = str(save) 9 | self.config = str(config) 10 | self.debug = debug 11 | self.nodeNum = -1 12 | self.nodeList=[] 13 | 14 | self.debug.line(4," ") 15 | self.debug.line(4,"#") 16 | self.debug.debug("# "+self.myInfo,1) 17 | self.debug.line(4,"#") 18 | 19 | self.reset_config_data() 20 
| 21 | 22 | def reset(self, struc=None, config=None, save=None, debug=None): 23 | self.debug.debug("* "+self.myInfo+" -- reset",5) 24 | if struc: 25 | self.struc = str(struc) 26 | if save: 27 | self.save = str(save) 28 | if config: 29 | self.config = str(config) 30 | if debug: 31 | self.debug = debug 32 | self.nodeNum = -1 33 | self.nodeList=[] 34 | 35 | self.reset_config_data() 36 | 37 | def reset_config_data(self): 38 | self.debug.debug("* "+self.myInfo+" -- reset_config_data",5) 39 | self.config_start=';' 40 | self.config_sep='\\n' 41 | self.config_equal=': ' 42 | self.config_data=[] 43 | #self.config_data.append({'name_config':'date','name':'StartTime','value':''}) 44 | 45 | def read_node_struc(self): 46 | self.debug.debug("* "+self.myInfo+" -- read_node_struc",5) 47 | return 48 | 49 | def input_check(self,nodeInfo): 50 | self.debug.debug("* "+self.myInfo+" -- input_check",5) 51 | return 52 | 53 | def get_node_num(self): 54 | self.debug.debug("* "+self.myInfo+" -- get_node_num",6) 55 | return self.nodeNum 56 | 57 | def get_node_data(self): 58 | self.debug.debug("* "+self.myInfo+" -- get_node_data",5) 59 | return self.nodeList 60 | 61 | def output_node_data(self): 62 | self.debug.debug("* "+self.myInfo+" -- output_node_data",5) 63 | if not self.save: 64 | print("Save file not set!") 65 | return 66 | return 67 | 68 | def output_node_config(self): 69 | self.debug.debug("* "+self.myInfo+" -- output_node_config",5) 70 | if not self.config: 71 | print("Config file not set!") 72 | return 73 | return 74 | 75 | 76 | 77 | -------------------------------------------------------------------------------- /src_fc/Filter/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/Filter/__init__.py -------------------------------------------------------------------------------- /src_fc/IOModule/Debug_log.py: 
-------------------------------------------------------------------------------- 1 | import IOModule.Log_print as Log_print 2 | 3 | __metaclass__ = type 4 | 5 | 6 | class Debug_log: 7 | def __init__(self, lvl=2, show=2, path=None, log_freq=1): 8 | self.myInfo = "Debug" 9 | self.lvl = lvl 10 | self.path = path 11 | self.show = show 12 | self.debugFile = None 13 | self.debugFile = Log_print.Log_print(self.path, 0) 14 | self.debug_log_buf = [] 15 | self.log_freq = log_freq 16 | self.reset_log() 17 | 18 | def reset(self, lvl=None, path=None, log_freq=1): 19 | if lvl: 20 | self.lvl = lvl 21 | if path: 22 | self.path = path 23 | self.debugFile.reset(self.path, 0) 24 | self.debug_log_buf = [] 25 | self.log_freq = log_freq 26 | self.reset_log() 27 | 28 | def reset_log(self): 29 | self.debugFile.reset(self.path, 0) 30 | self.debugFile.file_open() 31 | self.debugFile.file_close() 32 | self.debugFile.reset(self.path, 1) 33 | return 1 34 | 35 | def set_lvl(self, lvl=0): 36 | self.lvl = lvl 37 | 38 | def debug(self, context=None, lvl=3): 39 | if (lvl <= self.lvl): 40 | if context != None: 41 | self.debug_log_buf.append(context) 42 | if (len(self.debug_log_buf) >= self.log_freq) or (context == None): 43 | self.debugFile.file_open() 44 | # print self.debug_log_buf 45 | for debug_log in self.debug_log_buf: 46 | self.debugFile.log_print(debug_log, 1) 47 | # print debug_log 48 | self.debugFile.file_close() 49 | self.debug_log_buf = [] 50 | if (lvl >= self.show) and (context != None): 51 | print(context) 52 | # pass 53 | 54 | def line(self, lvl=1, signal="-", num=15): 55 | if (lvl <= self.lvl): 56 | i = 0 57 | context = "" 58 | while (i < num): 59 | context += signal 60 | i += 1 61 | self.debug_log_buf.append(context) 62 | if (len(self.debug_log_buf) >= self.log_freq): 63 | self.debugFile.file_open() 64 | for debug_log in self.debug_log_buf: 65 | self.debugFile.log_print(debug_log, 1) 66 | self.debugFile.file_close() 67 | self.debug_log_buf = [] 68 | if (lvl >= self.show): 69 | 
print(context) 70 | ''' 71 | if (lvl<=self.lvl): 72 | self.debugFile.file_open() 73 | i = 0 74 | context = "" 75 | while (i=self.show): 80 | print context 81 | self.debugFile.file_close() 82 | ''' 83 | ''' 84 | def start_debug(self): 85 | self.debugFile.file_open() 86 | 87 | def end_debug(self): 88 | self.debugFile.file_close() 89 | ''' 90 | -------------------------------------------------------------------------------- /src_fc/IOModule/Log_print.py: -------------------------------------------------------------------------------- 1 | 2 | __metaclass__ = type 3 | 4 | class Log_print: 5 | def __init__(self, filePath, mode=0): 6 | self.modelist=['w','a'] 7 | self.filePath = filePath 8 | self.mode = self.modelist[mode] 9 | self.logFile=None 10 | 11 | def reset(self, filePath=None, mode=None): 12 | if filePath: 13 | self.filePath = filePath 14 | if mode is not None: # mode 0 ('w') is falsy, so compare against None 15 | self.mode = self.modelist[mode] 16 | self.logFile=None 17 | 18 | def file_open(self): 19 | self.logFile = open(self.filePath,self.mode) 20 | return 1 21 | 22 | def file_close(self): 23 | self.logFile.close() 24 | return 1 25 | 26 | def log_print(self, context, isEnter=1): 27 | self.logFile.write(str(context)) 28 | if isEnter==1: 29 | self.logFile.write("\n") 30 | 31 | -------------------------------------------------------------------------------- /src_fc/IOModule/Output_log.py: -------------------------------------------------------------------------------- 1 | import IOModule.Log_print as Log_print 2 | 3 | __metaclass__ = type 4 | 5 | class Output_log: 6 | def __init__(self, output = None, log_freq = 1): 7 | self.myInfo = "Output_log" 8 | self.output_path = output 9 | self.sys_info_buf = [] 10 | self.job_buf = [] 11 | self.log_freq = log_freq 12 | self.reset_output() 13 | 14 | def reset(self, output = None, log_freq = 1): 15 | if output: 16 | self.output_path = output 17 | self.sys_info_buf = [] 18 | self.job_buf = [] 19 | self.log_freq = log_freq 20 | self.reset_output() 21 | 22 | def reset_output(self): 23 | 
self.sys_info = Log_print.Log_print(self.output_path['sys'],0) 24 | self.sys_info.reset(self.output_path['sys'],0) 25 | self.sys_info.file_open() 26 | self.sys_info.file_close() 27 | self.sys_info.reset(self.output_path['sys'],1) 28 | 29 | self.job_result = Log_print.Log_print(self.output_path['result'],0) 30 | self.job_result.reset(self.output_path['result'],0) 31 | self.job_result.file_open() 32 | self.job_result.file_close() 33 | self.job_result.reset(self.output_path['result'],1) 34 | 35 | self.reward_result = Log_print.Log_print(self.output_path['reward'],0) 36 | self.reward_result.reset(self.output_path['reward'],0) 37 | self.reward_result.file_open() 38 | self.reward_result.file_close() 39 | self.reward_result.reset(self.output_path['reward'],1) 40 | 41 | 42 | def print_sys_info(self, sys_info = None): 43 | if sys_info != None: 44 | self.sys_info_buf.append(sys_info) 45 | if (len(self.sys_info_buf) >= self.log_freq) or (sys_info == None): 46 | sep_sign=";" 47 | #pre_context = "Printing..............................\n" 48 | self.sys_info.file_open() 49 | for sys_info in self.sys_info_buf: 50 | context = "" 51 | context += str(int(sys_info['date'])) 52 | context += sep_sign 53 | context += (str(sys_info['uti'])) 54 | self.sys_info.log_print(context,1) 55 | self.sys_info.file_close() 56 | self.sys_info_buf = [] 57 | 58 | 59 | def print_result(self, job_module, job_index = None): 60 | if job_index != None: 61 | self.job_buf.append(job_module.job_info(job_index)) 62 | if (len(self.job_buf) >= self.log_freq) or (job_index == None): 63 | self.job_result.file_open() 64 | sep_sign=";" 65 | for temp_job in self.job_buf: 66 | #temp_job = job_module.job_info(job_index) 67 | context = "" 68 | context += str(temp_job['id']) 69 | context += sep_sign 70 | context += str(temp_job['reqProc']) 71 | context += sep_sign 72 | context += str(temp_job['reqTime']) 73 | context += sep_sign 74 | context += str(temp_job['run']) 75 | context += sep_sign 76 | context += 
str(temp_job['wait']) 77 | context += sep_sign 78 | context += str(temp_job['submit']) 79 | context += sep_sign 80 | context += str(temp_job['start']) 81 | context += sep_sign 82 | context += str(temp_job['end']) 83 | self.job_result.log_print(context,1) 84 | self.job_result.file_close() 85 | self.job_buf = [] 86 | 87 | def print_reward(self, reward_seq): 88 | if reward_seq is not None: 89 | self.reward_result.file_open() 90 | for reward in reward_seq: 91 | self.reward_result.log_print(reward, 1) 92 | self.reward_result.file_close() -------------------------------------------------------------------------------- /src_fc/IOModule/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/IOModule/__init__.py -------------------------------------------------------------------------------- /src_fc/Interface/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/Interface/__init__.py -------------------------------------------------------------------------------- /src_fc/Models/A2C.py: -------------------------------------------------------------------------------- 1 | import math 2 | import random 3 | import pickle 4 | 5 | import gym 6 | import numpy as np 7 | 8 | import torch 9 | import torch.nn as nn 10 | import torch.optim as optim 11 | import torch.nn.functional as F 12 | from torch.distributions import Categorical 13 | 14 | 15 | class A2C(nn.Module): 16 | def __init__(self, env, num_inputs, num_outputs, std=0.0, window_size=50, 17 | learning_rate=1e-2, gamma=0.99, batch_size=20, layer_size=[]): 18 | super(A2C, self).__init__() 19 | self.hidden1_size = layer_size[0] 20 | self.hidden2_size = layer_size[1] 21 | self.critic = nn.Sequential( 22 | nn.Conv1d(2, 1, 1), 23 | nn.Flatten(start_dim=0), 24 | 
nn.Linear(num_inputs, self.hidden1_size, bias=False), 25 | nn.ReLU(), 26 | nn.Linear(self.hidden1_size, self.hidden2_size, bias=False), 27 | nn.ReLU(), 28 | nn.Linear(self.hidden2_size, 1) 29 | ) 30 | 31 | self.actor = nn.Sequential( 32 | nn.Conv1d(2, 1, 1), 33 | nn.Flatten(start_dim=0), 34 | nn.Linear(num_inputs, self.hidden1_size, bias=False), 35 | nn.ReLU(), 36 | nn.Linear(self.hidden1_size, self.hidden2_size, bias=False), 37 | nn.ReLU(), 38 | nn.Linear(self.hidden2_size, num_outputs) 39 | ) 40 | self.batch_size = batch_size 41 | self.gamma = gamma 42 | self.lr = learning_rate 43 | self.window_size = window_size 44 | self.log_probs = [] 45 | self.values = [] 46 | self.rewards = [] 47 | self.rewards_seq = [] 48 | self.entropy = 0 49 | 50 | def forward(self, x): 51 | x = torch.reshape(x, (-1, 2, 1)) 52 | value = self.critic(x) 53 | probs = self.actor(x) 54 | return probs, value 55 | 56 | def remember(self, probs, value, reward, done, device, action): 57 | dist = Categorical(torch.softmax(probs, dim=-1)) 58 | log_prob = dist.log_prob(torch.tensor(action)) 59 | self.entropy += dist.entropy().mean() 60 | 61 | self.log_probs.append(log_prob) 62 | self.values.append(value) 63 | self.rewards.append(torch.FloatTensor( 64 | [reward]).unsqueeze(-1).to(device)) 65 | self.rewards_seq.append(reward) 66 | 67 | def train(self, next_value, optimizer): 68 | if len(self.values) < self.batch_size: 69 | return 70 | 71 | returns = self.compute_returns(next_value) 72 | 73 | self.log_probs = torch.stack(self.log_probs).reshape(-1) # stack keeps the autograd graph; torch.tensor(...) would detach the actor's log-probabilities 74 | returns = torch.cat(returns).detach() 75 | self.values = torch.cat(self.values) 76 | 77 | advantage = returns - self.values 78 | 79 | actor_loss = -(self.log_probs * advantage.detach()).mean() 80 | critic_loss = advantage.pow(2).mean() 81 | 82 | loss = actor_loss + 0.5 * critic_loss - 0.001 * self.entropy 83 | 84 | optimizer.zero_grad() 85 | loss.backward() 86 | optimizer.step() 87 | 88 | self.log_probs = [] 89 | self.values = [] 90 | self.rewards = [] 91 | 
self.entropy = 0 92 | 93 | def compute_returns(self, next_value): 94 | R = next_value 95 | returns = [] 96 | for step in reversed(range(len(self.rewards))): 97 | R = self.rewards[step][0] + self.gamma * R 98 | returns.insert(0, R) 99 | return returns 100 | 101 | def save_using_model_name(self, model_name_path): 102 | torch.save(self.state_dict(), model_name_path + ".pkl") 103 | 104 | def load_using_model_name(self, model_name_path): 105 | self.load_state_dict( 106 | torch.load(model_name_path + ".pkl")) 107 | -------------------------------------------------------------------------------- /src_fc/Models/DQL.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import random 3 | from collections import deque 4 | import tensorflow.compat.v1 as tf 5 | import keras.backend as K 6 | tf.disable_v2_behavior() 7 | 8 | 9 | class DQL: 10 | def __init__(self, env, sess, window_size=50, learning_rate=1e-2, 11 | gamma=0.99, batch_size=20, layer_size=[]): 12 | self.hidden1_size = layer_size[0] 13 | self.hidden2_size = layer_size[1] 14 | 15 | self.env = env 16 | self.sess = sess 17 | 18 | self.window_size = window_size 19 | self.batch_size = batch_size 20 | self.lr = learning_rate 21 | self.gamma = gamma 22 | self.memory = deque(maxlen=20) 23 | self.reward_seq = [] 24 | 25 | self.policy, self.predict = self.build_policy() 26 | 27 | def build_policy(self): 28 | obs_shape = self.env.observation_space.shape 29 | input = tf.keras.layers.Input(shape=obs_shape) 30 | input_reshape = tf.reshape(input, [-1, obs_shape[2], 1]) 31 | conv1d = tf.keras.layers.Conv1D(1, obs_shape[2], input_shape=(obs_shape[2], 1))(input_reshape) 32 | conv1d_reshape = tf.reshape(conv1d, [-1, obs_shape[1]]) 33 | hidden_layer1 = tf.keras.layers.Dense(self.hidden1_size, 'relu', input_shape=(obs_shape[1],), use_bias=False)(conv1d_reshape) 34 | hidden_layer2 = tf.keras.layers.Dense(self.hidden2_size, 'relu', input_shape=(self.hidden1_size,), 
use_bias=False)(hidden_layer1) 35 | output = tf.keras.layers.Dense(1, activation='sigmoid')(hidden_layer2) 36 | 37 | def custom_loss(y_true, y_pred): 38 | advantage = y_true - y_pred 39 | 40 | return K.sum(K.square(advantage)) 41 | 42 | policy = tf.keras.Model(inputs=input, outputs=output) 43 | adam = tf.keras.optimizers.Adam(lr=self.lr) 44 | policy.compile(loss=custom_loss, optimizer=adam) 45 | predict = tf.keras.Model(inputs=[input], outputs=[output]) 46 | return policy, predict 47 | 48 | def act(self, obs): 49 | return self.predict.predict(obs) 50 | 51 | def train(self): 52 | if len(self.memory) < self.batch_size: 53 | return 54 | 55 | states = np.zeros((self.batch_size, *self.env.observation_space.shape)) 56 | actions = np.zeros([self.batch_size, 1]) 57 | G = np.zeros(self.batch_size) 58 | 59 | for i in range(self.batch_size): 60 | reward_sum = 0 61 | discount = 1 62 | for j in range(i, self.batch_size): 63 | _, _, reward, _ = self.memory[j] 64 | reward_sum += reward * discount 65 | discount *= self.gamma 66 | G[i] = reward_sum 67 | 68 | states[i, :], actions[i, :], _, _ = self.memory[i] 69 | 70 | mean = np.mean(G) 71 | std = np.std(G) if np.std(G) > 0 else 1 72 | G = (G - mean) / std 73 | 74 | self.policy.train_on_batch([states, G], actions) 75 | self.memory = deque(maxlen=self.batch_size) 76 | 77 | def remember(self, obs, action, reward, new_obs): 78 | self.memory.append([obs, action, reward, new_obs]) 79 | self.reward_seq.append(reward) 80 | 81 | def save(self, policy_fp, predict_fp): 82 | self.policy.save_weights(policy_fp) 83 | self.predict.save_weights(predict_fp) 84 | 85 | def load(self, policy_fp, predict_fp): 86 | self.policy.load_weights(policy_fp) 87 | self.predict.load_weights(predict_fp) 88 | 89 | def save_using_model_name(self, model_name_path): 90 | self.save(model_name_path + "_policy_.h5", model_name_path + "_predict_.h5") 91 | 92 | def load_using_model_name(self, model_name_path): 93 | self.load(model_name_path + "_policy_.h5", model_name_path 
+ "_predict_.h5") 94 | -------------------------------------------------------------------------------- /src_fc/Models/PG.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import random 3 | from collections import deque 4 | import tensorflow.compat.v1 as tf 5 | import keras.backend as K 6 | tf.disable_v2_behavior() 7 | 8 | np.set_printoptions(threshold=np.inf) 9 | 10 | 11 | class PG: 12 | def __init__(self, env, sess, window_size=50, sys_size=0, 13 | learning_rate=1e-2, gamma=0.99, batch_size=20, layer_size=[]): 14 | 15 | self.env = env 16 | self.sess = sess 17 | 18 | self.sys_size = sys_size 19 | self.window_size = window_size 20 | self.batch_size = batch_size 21 | self.lr = learning_rate 22 | self.gamma = gamma 23 | self.num_input = self.sys_size + 2 * self.window_size 24 | self.memory = deque(maxlen=self.batch_size) 25 | self.hidden1_size = layer_size[0] 26 | self.hidden2_size = layer_size[1] 27 | self.rewards_seq = [] 28 | self.policy, self.predict = self.build_policy() 29 | 30 | def build_policy(self): 31 | obs_shape = self.env.observation_space.shape 32 | input = tf.keras.layers.Input(shape=obs_shape[1:]) 33 | advantages = tf.keras.layers.Input(shape=[1]) 34 | input_reshape = tf.reshape(input, [-1, obs_shape[2], 1]) 35 | conv1d = tf.keras.layers.Conv1D( 36 | 1, obs_shape[2], input_shape=(obs_shape[2], 1))(input_reshape) 37 | conv1d_reshape = tf.reshape(conv1d, [-1, obs_shape[1]]) 38 | hidden_layer1 = tf.keras.layers.Dense(self.hidden1_size, 'relu', input_shape=( 39 | obs_shape[1],), use_bias=False)(conv1d_reshape) 40 | hidden_layer2 = tf.keras.layers.Dense( 41 | self.hidden2_size, 'relu', input_shape=(self.hidden1_size,), use_bias=False)(hidden_layer1) 42 | output = tf.keras.layers.Dense( 43 | self.window_size, activation='softmax')(hidden_layer2) 44 | 45 | def custom_loss(y_true, y_pred): 46 | out = K.clip(y_pred, 1e-8, 1 - 1e-8) 47 | log_like = y_true * K.log(out) 48 | 49 | return K.sum(-log_like 
* advantages) 50 | 51 | policy = tf.keras.Model(inputs=[input, advantages], outputs=output) 52 | adam = tf.keras.optimizers.Adam(lr=self.lr) 53 | policy.compile(loss=custom_loss, optimizer=adam) 54 | predict = tf.keras.Model(inputs=[input], outputs=[output]) 55 | print(predict.summary()) 56 | return policy, predict 57 | 58 | def act(self, obs): 59 | return self.predict.predict(obs[0]) 60 | 61 | def train(self): 62 | if len(self.memory) < self.batch_size: 63 | return 64 | 65 | states = np.zeros((self.batch_size, self.num_input, 2)) 66 | actions = np.zeros([self.batch_size, self.window_size]) 67 | G = np.zeros(self.batch_size) 68 | 69 | for i in range(self.batch_size): 70 | reward_sum = 0 71 | discount = 1 72 | for j in range(i, self.batch_size): 73 | _, _, reward, _ = self.memory[j] 74 | reward_sum += reward * discount 75 | discount *= self.gamma 76 | G[i] = reward_sum 77 | 78 | states[i, :], actions[i, :], _, _ = self.memory[i] 79 | 80 | self.policy.train_on_batch([states, G], actions) 81 | self.memory = deque(maxlen=self.batch_size) 82 | 83 | def remember(self, obs, action, reward, new_obs): 84 | self.memory.append([obs, action, reward, new_obs]) 85 | self.rewards_seq.append(reward) 86 | 87 | def save(self, policy_fp, predict_fp): 88 | self.policy.save_weights(policy_fp) 89 | self.predict.save_weights(predict_fp) 90 | 91 | def load(self, policy_fp, predict_fp): 92 | self.policy.load_weights(policy_fp) 93 | self.predict.load_weights(predict_fp) 94 | 95 | def save_using_model_name(self, model_name_path): 96 | self.save(model_name_path + "_policy_.h5", 97 | model_name_path + "_predict_.h5") 98 | 99 | def load_using_model_name(self, model_name_path): 100 | self.load(model_name_path + "_policy_.h5", 101 | model_name_path + "_predict_.h5") 102 | -------------------------------------------------------------------------------- /src_fc/Models/PPO.py: -------------------------------------------------------------------------------- 1 | import math 2 | import random 3 | 
import pickle 4 | 5 | import gym 6 | import numpy as np 7 | 8 | import torch 9 | import torch.nn as nn 10 | import torch.optim as optim 11 | import torch.nn.functional as F 12 | from torch.distributions import Categorical 13 | from torch.utils.data.sampler import BatchSampler, SubsetRandomSampler 14 | from torch.autograd import Variable 15 | 16 | class ActorNet(nn.Module): 17 | 18 | def __init__(self, num_inputs, hidden1_size, hidden2_size, num_outputs): 19 | super(ActorNet, self).__init__() 20 | self.actor = nn.Sequential( 21 | nn.Conv1d(2, 1, 1), 22 | nn.Flatten(start_dim=0), 23 | nn.Linear(num_inputs, hidden1_size, bias=False), 24 | nn.ReLU(), 25 | nn.Linear(hidden1_size, hidden2_size, bias=False), 26 | nn.ReLU(), 27 | nn.Linear(hidden2_size, num_outputs), 28 | nn.Softmax(dim=0) 29 | ) 30 | 31 | def forward(self, x): 32 | x = torch.reshape(x, (-1, 2, 1)) 33 | probs = self.actor(x) 34 | return probs 35 | 36 | 37 | class CriticNet(nn.Module): 38 | 39 | def __init__(self, num_inputs, hidden1_size, hidden2_size): 40 | super(CriticNet, self).__init__() 41 | self.critic = nn.Sequential( 42 | nn.Conv1d(2, 1, 1), 43 | nn.Flatten(start_dim=0), 44 | nn.Linear(num_inputs, hidden1_size, bias=False), 45 | nn.ReLU(), 46 | nn.Linear(hidden1_size, hidden2_size, bias=False), 47 | nn.ReLU(), 48 | nn.Linear(hidden2_size, 1) 49 | ) 50 | 51 | def forward(self, x): 52 | x = torch.reshape(x, (-1, 2, 1)) 53 | value = self.critic(x) 54 | return value 55 | 56 | 57 | class PPO(): 58 | def __init__(self, env, num_inputs, num_outputs, std=0.0, window_size=50, 59 | learning_rate=1e-2, gamma=0.99, batch_size=10, layer_size=[]): 60 | super(PPO, self).__init__() 61 | self.hidden1_size = layer_size[0] 62 | self.hidden2_size = layer_size[1] 63 | 64 | self.actor_net = ActorNet( 65 | num_inputs, self.hidden1_size, self.hidden2_size, num_outputs) 66 | self.critic_net = CriticNet( 67 | num_inputs, self.hidden1_size, self.hidden2_size) 68 | 69 | self.batch_size = batch_size 70 | self.gamma = gamma 71 
| self.lr = learning_rate 72 | self.window_size = window_size 73 | self.rewards = [] 74 | self.states = [] 75 | self.action_probs = [] 76 | self.ppo_update_time = 1 77 | self.clip_param = 0.2 78 | self.max_grad_norm = 0.5 79 | self.training_step = 0 80 | self.rewards_seq = [] 81 | self.num_inputs = num_inputs 82 | 83 | self.actor_optimizer = optim.Adam( 84 | self.actor_net.parameters(), lr=self.lr) 85 | self.critic_net_optimizer = optim.Adam( 86 | self.critic_net.parameters(), lr=self.lr) 87 | 88 | def forward(self, x): 89 | x = torch.reshape(x, (-1, 2, 1)) 90 | value = self.critic_net(x) 91 | probs = self.actor_net(x) 92 | return probs, value 93 | 94 | def select_action(self, state): 95 | with torch.no_grad(): 96 | probs = self.actor_net(state) 97 | value = self.critic_net(state) 98 | return probs, value 99 | 100 | def remember(self, probs, value, reward, done, device, action, state, next_state, action_p, obs): 101 | dist = Categorical(torch.softmax(probs, dim=-1)) 102 | log_prob = dist.log_prob(torch.tensor(action)) 103 | self.rewards.append(torch.FloatTensor( 104 | [reward]).unsqueeze(-1).to(device)) 105 | self.rewards_seq.append(reward) 106 | 107 | self.states.append(state.numpy()) 108 | self.action_probs.append(action_p) 109 | 110 | def train(self): 111 | if len(self.states) < self.batch_size: 112 | return 113 | old_action_log_prob = torch.stack(self.action_probs) 114 | R = 0 115 | Gt = [] 116 | for r in self.rewards[::-1]: 117 | R = r + self.gamma * R 118 | Gt.insert(0, R) 119 | Gt = torch.tensor(Gt, dtype=torch.float) 120 | self.states = torch.tensor(self.states, dtype=torch.float) 121 | 122 | for i in range(self.ppo_update_time): 123 | for index in BatchSampler(SubsetRandomSampler(range(len(self.states))), 1, False): 124 | Gt_index = Gt[index].view(-1, 1) 125 | sampled_states = self.states[index].view( 126 | 1, 1, self.num_inputs, 2) 127 | V = self.critic_net(sampled_states) 128 | advantage = (Gt_index - V).detach() 129 | action_prob = 
self.actor_net(sampled_states) 130 | ratio = torch.nan_to_num( 131 | torch.exp(action_prob - old_action_log_prob[index])) 132 | surr1 = ratio * advantage 133 | surr2 = torch.clamp(ratio, 1 - self.clip_param, 134 | 1 + self.clip_param) * advantage 135 | 136 | action_loss = -torch.min(surr1, surr2).mean() 137 | self.actor_optimizer.zero_grad() 138 | action_loss.backward(retain_graph=True) 139 | nn.utils.clip_grad_norm_( 140 | self.actor_net.parameters(), self.max_grad_norm) 141 | self.actor_optimizer.step() 142 | 143 | value_loss = F.mse_loss(Gt_index[0], V) # minimize the squared error between return and value estimate 144 | self.critic_net_optimizer.zero_grad() 145 | value_loss.backward() 146 | nn.utils.clip_grad_norm_( 147 | self.critic_net.parameters(), self.max_grad_norm) 148 | self.critic_net_optimizer.step() 149 | self.training_step += 1 150 | 151 | self.rewards = [] 152 | self.states = [] 153 | self.action_probs = [] 154 | 155 | def save_using_model_name(self, model_name_path): 156 | torch.save(self.actor_net.state_dict(), model_name_path + "_actor.pkl") 157 | torch.save(self.critic_net.state_dict(), 158 | model_name_path + "_critic.pkl") 159 | 160 | def load_using_model_name(self, model_name_path): 161 | self.actor_net.load_state_dict( 162 | torch.load(model_name_path + "_actor.pkl")) 163 | self.critic_net.load_state_dict( 164 | torch.load(model_name_path + "_critic.pkl")) 165 | -------------------------------------------------------------------------------- /src_fc/Models/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/Models/__init__.py -------------------------------------------------------------------------------- /src_fc/SWF_filter.py: -------------------------------------------------------------------------------- 1 | import cqsim_path 2 | import Filter_job_SWF 3 | 4 | ext = {'ext_job_trace':".swf",'ext_tmp_job':".csv"} 5 | path = [] 6 | path.append({'path_in':"Input
Files/SWF file/CLEANED/",'path_tmp':"Temp/SWF Formatted/CLEANED/"}) 7 | path.append({'path_in':"Input Files/SWF file/ORIGINAL/",'path_tmp':"Temp/SWF Formatted/ORIGINAL/"}) 8 | 9 | 10 | SWF_files=[[],[]] 11 | SWF_files[0].append("CTC-SP2-1996-3.1-cln") 12 | SWF_files[0].append("HPC2N-2002-2.1-cln") 13 | SWF_files[0].append("LANL-CM5-1994-4.1-cln") 14 | SWF_files[0].append("LLNL-Atlas-2006-2.1-cln") 15 | SWF_files[0].append("LLNL-Thunder-2007-1.1-cln") 16 | SWF_files[0].append("LPC-EGEE-2004-1.2-cln") 17 | SWF_files[0].append("NASA-iPSC-1993-3.1-cln") 18 | SWF_files[0].append("OSC-Clust-2000-3.1-cln") 19 | SWF_files[0].append("SDSC-BLUE-2000-4.1-cln") 20 | SWF_files[0].append("SDSC-Par-1995-3.1-cln") 21 | SWF_files[0].append("SDSC-Par-1996-3.1-cln") 22 | SWF_files[0].append("SDSC-SP2-1998-4.1-cln") 23 | 24 | 25 | SWF_files[1].append("ANL-Intrepid-2009-1") 26 | SWF_files[1].append("CTC-SP2-1995-2") 27 | SWF_files[1].append("DAS2-fs0-2003-1") 28 | SWF_files[1].append("DAS2-fs1-2003-1") 29 | SWF_files[1].append("DAS2-fs2-2003-1") 30 | SWF_files[1].append("DAS2-fs3-2003-1") 31 | SWF_files[1].append("DAS2-fs4-2003-1") 32 | SWF_files[1].append("KTH-SP2-1996-2") 33 | SWF_files[1].append("LANL-O2K-1999-2") 34 | SWF_files[1].append("LCG-2005-1") 35 | SWF_files[1].append("LLNL-T3D-1996-2") 36 | SWF_files[1].append("LLNL-uBGL-2006-2") 37 | SWF_files[1].append("METACENTRUM-2009-2") 38 | SWF_files[1].append("RICC-2010-2") 39 | SWF_files[1].append("Sandia-Ross-2001-1") 40 | SWF_files[1].append("SDSC-DS-2004-1") 41 | SWF_files[1].append("SHARCNET-2005-2") 42 | SWF_files[1].append("SHARCNET-Whale-2005-2") 43 | 44 | 45 | trace_name="" 46 | save_name="" 47 | 48 | filter_job = Filter_job_SWF.Filter_job_SWF(trace=trace_name, save=save_name, sdate=None, debug=0) 49 | for i in SWF_files[0]: 50 | trace_name="" 51 | save_name="" 52 | trace_name = path[0]['path_in'] + i + ext['ext_job_trace'] 53 | save_name = path[0]['path_tmp'] + i + ext['ext_tmp_job']
54 | print("==================================================") 55 | print(trace_name) 56 | filter_job.reset(trace=trace_name, save=save_name, sdate=None, debug=0) 57 | filter_job.read_job_trace() 58 | filter_job.output_job_data() 59 | 60 | for i in SWF_files[1]: 61 | trace_name="" 62 | save_name="" 63 | trace_name = path[1]['path_in'] + i + ext['ext_job_trace'] 64 | save_name = path[1]['path_tmp'] + i + ext['ext_tmp_job'] 65 | print("==================================================") 66 | print(trace_name) 67 | filter_job.reset(trace=trace_name, save=save_name, sdate=None, debug=0) 68 | filter_job.read_job_trace() 69 | filter_job.output_job_data() 70 | -------------------------------------------------------------------------------- /src_fc/ThreadMgr/Pause.py: -------------------------------------------------------------------------------- 1 | from threading import Condition 2 | import time 3 | 4 | 5 | class Pause: 6 | 7 | def __init__(self): 8 | """ 9 | Pause implements a producer-consumer handshake between two threads. 10 | Two condition variables are initialized, one for the Producer and one for the Consumer. 11 | Condition variables in threading: https://docs.python.org/3/library/threading.html 12 | """ 13 | self.prod_cv = Condition() 14 | self.cons_cv = Condition() 15 | self.initial_check = True 16 | 17 | def pause_producer(self): 18 | """ 19 | First resume (notify) the Consumer and then pause the Producer. 20 | """ 21 | with self.cons_cv: # Acquire the lock of the condition variable. 22 | self.cons_cv.notify_all() # Notify all paused Consumer threads to resume. 23 | # Since there is only one Consumer thread, .notify() would also work. 24 | 25 | with self.prod_cv: 26 | self.prod_cv.wait() # Pause the Producer until it gets a notification. 27 | 28 | def pause_consumer(self): 29 | """ 30 | First resume (notify) the Producer and then pause the Consumer.
31 | """ 32 | 33 | # ***** 34 | # A special flag for the initial situation specific to CqSim-Gym.Env 35 | # ***** 36 | self.initial_check = False 37 | 38 | with self.prod_cv: # Acquire the lock of the condition variable. 39 | self.prod_cv.notify_all() # Notify all paused Producer threads to resume. 40 | with self.cons_cv: 41 | self.cons_cv.wait() # Pause the Consumer until it gets a notification. 42 | 43 | def is_producer_paused(self): 44 | """ 45 | At the initialization of the CqSim-Gym.Env, the Env needs to provide the current state of the Env. 46 | At this point we only need to wait until the CqSim simulator (Consumer) has completely initialised the initial state. 47 | This function does not send any notification to the condition variables. 48 | 49 | Note: This function still uses a "while" loop. However, it is only used ONCE in the CqSim-Gym.Env lifecycle, 50 | so it does not add to the overhead. 51 | """ 52 | while self.initial_check: 53 | time.sleep(0.001) 54 | 55 | def release_all(self): 56 | """ 57 | Once the Consumer (CqSim) and the Producer (Gym.Env) are done, i.e. all the jobs are loaded and assigned, 58 | both threads are notified and released to run and finish independently.
59 | """ 60 | with self.prod_cv: 61 | self.prod_cv.notify_all() 62 | with self.cons_cv: 63 | self.cons_cv.notify_all() 64 | -------------------------------------------------------------------------------- /src_fc/ThreadMgr/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/ThreadMgr/__init__.py -------------------------------------------------------------------------------- /src_fc/Trainer/A2C_Trainer.py: -------------------------------------------------------------------------------- 1 | from CqGym.Gym import CqsimEnv 2 | from Models.A2C import A2C 3 | import numpy as np 4 | import torch 5 | import torch.optim as optim 6 | 7 | 8 | def get_action_from_output_vector(output_vector, wait_queue_size, is_training): 9 | action_p = torch.softmax( 10 | output_vector[:wait_queue_size], dim=-1) 11 | action_p = np.array(action_p) 12 | action_p /= action_p.sum() 13 | if is_training: 14 | wait_queue_ind = np.random.choice(len(action_p), p=action_p) 15 | else: 16 | wait_queue_ind = np.argmax(action_p) 17 | return wait_queue_ind 18 | 19 | 20 | def model_training(env, weights_file_name=None, is_training=False, output_file_name=None, 21 | window_size=50, sys_size=0, learning_rate=0.1, gamma=0.99, batch_size=10, do_render=False, layer_size=[]): 22 | use_cuda = torch.cuda.is_available() 23 | device = torch.device("cuda" if use_cuda else "cpu") 24 | num_inputs = window_size * 2 + sys_size * 1 25 | a2c = A2C(env, num_inputs, window_size, std=0.0, window_size=window_size, 26 | learning_rate=learning_rate, gamma=gamma, batch_size=batch_size, layer_size=layer_size) 27 | optimizer = optim.Adam(a2c.parameters(), lr=learning_rate) 28 | 29 | if weights_file_name: 30 | a2c.load_using_model_name(weights_file_name) 31 | 32 | obs = env.get_state() 33 | done = False 34 | 35 | while not done: 36 | 37 | env.render() 38 | 39 | state =
torch.FloatTensor(obs.feature_vector).to(device) 40 | 41 | probs, value = a2c(state) 42 | 43 | action = get_action_from_output_vector( 44 | probs.detach(), obs.wait_que_size, is_training) 45 | 46 | new_obs, done, reward = env.step(action) 47 | 48 | a2c.remember(probs, value, reward, done, device, action) 49 | if is_training and not done: 50 | next_state = torch.FloatTensor(new_obs.feature_vector).to(device) 51 | _, next_value = a2c(next_state) 52 | a2c.train(next_value, optimizer) 53 | obs = new_obs 54 | 55 | if is_training and output_file_name: 56 | a2c.save_using_model_name(output_file_name) 57 | 58 | return a2c.reward_seq 59 | 60 | def model_engine(module_list, module_debug, job_cols=0, window_size=0, sys_size=0, 61 | is_training=False, weights_file=None, output_file=None, do_render=False, learning_rate=0.1, reward_discount=0.99, batch_size=10, layer_size=[]): 62 | """ 63 | Execute the CqSim Simulator using OpenAi based Gym Environment with Scheduling implemented using DeepRL Engine. 64 | 65 | :param module_list: CQSim Module :- List of attributes for loading CqSim Simulator 66 | :param module_debug: Debug Module :- Module to manage debugging CqSim run. 67 | :param job_cols: [int] :- No. of attributes to define a job. 68 | :param window_size: [int] :- Size of the input window for the DeepLearning (RL) Model. 69 | :param is_training: [boolean] :- If the weights trained need to be saved. 70 | :param weights_file: [str] :- Existing Weights file path. 71 | :param output_file: [str] :- File path if the where the new weights will be saved. 
72 | :return: None 73 | """ 74 | cqsim_gym = CqsimEnv(module_list, module_debug, 75 | job_cols, window_size, do_render) 76 | return model_training(cqsim_gym, window_size=window_size, sys_size=sys_size, is_training=is_training, 77 | weights_file_name=weights_file, output_file_name=output_file, learning_rate=learning_rate, gamma=reward_discount, batch_size=batch_size, layer_size=layer_size) 78 | -------------------------------------------------------------------------------- /src_fc/Trainer/DQL_Trainer.py: -------------------------------------------------------------------------------- 1 | from CqGym.Gym import CqsimEnv 2 | from Models.DQL import DQL 3 | import tensorflow.compat.v1 as tf 4 | import numpy as np 5 | import time 6 | tf.disable_v2_behavior() 7 | 8 | 9 | def get_action_from_output_vector(output_vector): 10 | return np.argmax(output_vector) 11 | 12 | 13 | def model_training(env, weights_file_name=None, is_training=False, output_file_name=None, 14 | window_size=50, sys_size=0, learning_rate=0.1, gamma=0.99, batch_size=10, do_render=False, layer_size=[]): 15 | 16 | start = time.time() 17 | sess = tf.Session() 18 | tf.keras.backend.set_session(sess) 19 | dql = DQL(env, sess, window_size, learning_rate, gamma, batch_size, layer_size) 20 | 21 | if weights_file_name: 22 | dql.load_using_model_name(weights_file_name) 23 | 24 | obs = env.get_state() 25 | done = False 26 | 27 | while not done: 28 | 29 | env.render() 30 | output_vector = dql.act(obs.feature_vector) 31 | 32 | action = get_action_from_output_vector(output_vector) 33 | new_obs, done, reward = env.step(action) 34 | dql.remember(obs.feature_vector, output_vector, reward, new_obs.feature_vector) 35 | if is_training: 36 | dql.train() 37 | obs = new_obs 38 | 39 | if is_training and output_file_name: 40 | dql.save_using_model_name(output_file_name) 41 | 42 | return dql.reward_seq 43 | 44 | def model_engine(module_list, module_debug, job_cols=0, window_size=0, sys_size=0, 45 | is_training=False, 
weights_file=None, output_file=None, do_render=False, 46 | learning_rate=1e-5, reward_discount=0.99, batch_size=10, layer_size=[]): 47 | """ 48 | Execute the CqSim Simulator using OpenAi based Gym Environment with Scheduling implemented using DeepRL Engine. 49 | 50 | :param module_list: CQSim Module :- List of attributes for loading CqSim Simulator 51 | :param module_debug: Debug Module :- Module to manage debugging CqSim run. 52 | :param job_cols: [int] :- No. of attributes to define a job. 53 | :param window_size: [int] :- Size of the input window for the DeepLearning (RL) Model. 54 | :param is_training: [boolean] :- If the weights trained need to be saved. 55 | :param weights_file: [str] :- Existing Weights file path. 56 | :param output_file: [str] :- File path if the where the new weights will be saved. 57 | :return: None 58 | """ 59 | cqsim_gym = CqsimEnv(module_list, module_debug, 60 | job_cols, window_size, do_render) 61 | return model_training(cqsim_gym, window_size=window_size, is_training=is_training, 62 | weights_file_name=weights_file, output_file_name=output_file, sys_size=sys_size, learning_rate=learning_rate, 63 | gamma=reward_discount, batch_size=batch_size, layer_size=layer_size) 64 | -------------------------------------------------------------------------------- /src_fc/Trainer/FCFS.py: -------------------------------------------------------------------------------- 1 | from CqGym.Gym import CqsimEnv 2 | import numpy as np 3 | 4 | 5 | def model_training(env, do_render=False): 6 | 7 | obs = env.get_state() 8 | done = False 9 | 10 | while not done: 11 | env.render() 12 | action = -1 13 | early_submit = float('Inf') 14 | for i, v in enumerate(obs.wait_job): 15 | if v['submit'] < early_submit: 16 | action = i 17 | early_submit = v['submit'] 18 | new_obs, done, reward = env.step(action) 19 | 20 | 21 | def model_engine(module_list, module_debug, job_cols=0, window_size=0, sys_size=0, do_render=False): 22 | """ 23 | Execute the CqSim Simulator using 
OpenAi based Gym Environment with Scheduling implemented using DeepRL Engine. 24 | 25 | :param module_list: CQSim Module :- List of attributes for loading CqSim Simulator 26 | :param module_debug: Debug Module :- Module to manage debugging CqSim run. 27 | :param job_cols: [int] :- No. of attributes to define a job. 28 | :param window_size: [int] :- Size of the input window for the DeepLearning (RL) Model. 29 | :param is_training: [boolean] :- If the weights trained need to be saved. 30 | :param weights_file: [str] :- Existing Weights file path. 31 | :param output_file: [str] :- File path if the where the new weights will be saved. 32 | :return: None 33 | """ 34 | cqsim_gym = CqsimEnv(module_list, module_debug, 35 | job_cols, window_size, do_render) 36 | model_training(cqsim_gym) 37 | -------------------------------------------------------------------------------- /src_fc/Trainer/PG_Trainer.py: -------------------------------------------------------------------------------- 1 | from CqGym.Gym import CqsimEnv 2 | from Models.PG import PG 3 | import tensorflow.compat.v1 as tf 4 | import numpy as np 5 | 6 | tf.disable_v2_behavior() 7 | 8 | 9 | def get_action_from_output_vector(output_vector, wait_queue_size, is_training): 10 | def softmax(z): 11 | return np.exp(z) / np.sum(np.exp(z)) 12 | action_p = softmax(output_vector.flatten()[:wait_queue_size]) 13 | if is_training: 14 | wait_queue_ind = np.random.choice(len(action_p), p=action_p) 15 | else: 16 | wait_queue_ind = np.argmax(action_p) 17 | return wait_queue_ind 18 | 19 | 20 | def model_training(env, weights_file_name=None, is_training=False, output_file_name=None, 21 | window_size=50, sys_size=0, learning_rate=0.1, gamma=0.99, batch_size=10, do_render=False, layer_size=[]): 22 | sess = tf.Session() 23 | tf.keras.backend.set_session(sess) 24 | pg = PG(env, sess, window_size, sys_size, learning_rate, 25 | gamma, batch_size, layer_size=layer_size) 26 | 27 | if weights_file_name: 28 | 
pg.load_using_model_name(weights_file_name) 29 | 30 | obs = env.get_state() 31 | done = False 32 | 33 | while not done: 34 | 35 | env.render() 36 | output_vector = pg.act(obs.feature_vector) 37 | 38 | action = get_action_from_output_vector( 39 | output_vector, obs.wait_que_size, is_training) 40 | new_obs, done, reward = env.step(action) 41 | pg.remember(obs.feature_vector, output_vector, 42 | reward, new_obs.feature_vector) 43 | if is_training: 44 | pg.train() 45 | obs = new_obs 46 | 47 | if is_training and output_file_name: 48 | pg.save_using_model_name(output_file_name) 49 | 50 | return pg.rewards_seq 51 | 52 | 53 | def model_engine(module_list, module_debug, job_cols=0, window_size=0, sys_size=0, 54 | is_training=False, weights_file=None, output_file=None, do_render=False, learning_rate=0.1, reward_discount=0.99, batch_size=10, layer_size=[]): 55 | """ 56 | Execute the CqSim Simulator using OpenAi based Gym Environment with Scheduling implemented using DeepRL Engine. 57 | 58 | :param module_list: CQSim Module :- List of attributes for loading CqSim Simulator 59 | :param module_debug: Debug Module :- Module to manage debugging CqSim run. 60 | :param job_cols: [int] :- No. of attributes to define a job. 61 | :param window_size: [int] :- Size of the input window for the DeepLearning (RL) Model. 62 | :param is_training: [boolean] :- If the weights trained need to be saved. 63 | :param weights_file: [str] :- Existing Weights file path. 64 | :param output_file: [str] :- File path if the where the new weights will be saved. 
65 | :return: None 66 | """ 67 | cqsim_gym = CqsimEnv(module_list, module_debug, 68 | job_cols, window_size, do_render) 69 | return model_training(cqsim_gym, window_size=window_size, sys_size=sys_size, is_training=is_training, 70 | weights_file_name=weights_file, output_file_name=output_file, learning_rate=learning_rate, gamma=reward_discount, batch_size=batch_size, layer_size=layer_size) 71 | -------------------------------------------------------------------------------- /src_fc/Trainer/PPO_Trainer.py: -------------------------------------------------------------------------------- 1 | from CqGym.Gym import CqsimEnv 2 | from Models.PPO import PPO 3 | import numpy as np 4 | import torch 5 | import torch.optim as optim 6 | 7 | 8 | def get_action_from_output_vector(output_vector, wait_queue_size, is_training): 9 | action_p = torch.softmax( 10 | output_vector[:wait_queue_size], dim=-1) 11 | action_p = np.array(action_p) 12 | action_p /= action_p.sum() 13 | if is_training: 14 | wait_queue_ind = np.random.choice(len(action_p), p=action_p) 15 | else: 16 | wait_queue_ind = np.argmax(action_p) 17 | return wait_queue_ind 18 | 19 | 20 | def model_training(env, weights_file_name=None, is_training=False, output_file_name=None, 21 | window_size=50, sys_size=0, learning_rate=0.1, gamma=0.99, batch_size=10, do_render=False, layer_size=[]): 22 | use_cuda = torch.cuda.is_available() 23 | device = torch.device("cuda" if use_cuda else "cpu") 24 | num_inputs = window_size * 2 + sys_size * 1 25 | ppo = PPO(env, num_inputs, window_size, std=0.0, window_size=window_size, 26 | learning_rate=learning_rate, gamma=gamma, batch_size=batch_size, layer_size=layer_size) 27 | 28 | if weights_file_name: 29 | ppo.load_using_model_name(weights_file_name) 30 | 31 | obs = env.get_state() 32 | done = False 33 | 34 | while not done: 35 | 36 | env.render() 37 | 38 | state = torch.FloatTensor(obs.feature_vector).to(device) 39 | 40 | probs, value = ppo.select_action(state) 41 | 42 | action_p = 
torch.softmax(probs.detach(), dim=-1) 43 | 44 | action = get_action_from_output_vector( 45 | probs.detach(), obs.wait_que_size, is_training) 46 | 47 | new_obs, done, reward = env.step(action) 48 | next_state = torch.FloatTensor(new_obs.feature_vector).to(device) 49 | 50 | ppo.remember(probs, value, reward, done, device, 51 | action, state, next_state, action_p, obs) 52 | if is_training and not done: 53 | ppo.train() 54 | obs = new_obs 55 | 56 | if is_training and output_file_name: 57 | ppo.save_using_model_name(output_file_name) 58 | 59 | return ppo.rewards_seq 60 | 61 | 62 | def model_engine(module_list, module_debug, job_cols=0, window_size=0, sys_size=0, 63 | is_training=False, weights_file=None, output_file=None, do_render=False, learning_rate=0.00001, reward_discount=0.99, batch_size=10, layer_size=[]): 64 | """ 65 | Execute the CqSim simulator through the OpenAI Gym-based environment, with scheduling decisions made by the deep RL engine. 66 | 67 | :param module_list: CQSim Module :- List of attributes for loading the CqSim simulator. 68 | :param module_debug: Debug Module :- Module to manage debugging of the CqSim run. 69 | :param job_cols: [int] :- No. of attributes to define a job. 70 | :param window_size: [int] :- Size of the input window for the deep learning (RL) model. 71 | :param is_training: [boolean] :- Whether to train the model and save the trained weights. 72 | :param weights_file: [str] :- Path of an existing weights file to load. 73 | :param output_file: [str] :- File path where the new weights will be saved.
74 | :return: Reward sequence returned by model_training. 75 | """ 76 | cqsim_gym = CqsimEnv(module_list, module_debug, 77 | job_cols, window_size, do_render) 78 | return model_training(cqsim_gym, window_size=window_size, sys_size=sys_size, is_training=is_training, 79 | weights_file_name=weights_file, output_file_name=output_file, learning_rate=learning_rate, gamma=reward_discount, batch_size=batch_size, layer_size=layer_size) 80 | -------------------------------------------------------------------------------- /src_fc/Trainer/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/Trainer/__init__.py -------------------------------------------------------------------------------- /src_fc/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/SPEAR-UIC/CQGym/63159fef222801a65a19eb823dc4f7846c134ce1/src_fc/__init__.py -------------------------------------------------------------------------------- /src_fc/cqsim.py: -------------------------------------------------------------------------------- 1 | import optparse 2 | import os 3 | import sys 4 | from datetime import datetime 5 | import time 6 | import re 7 | import cqsim_path 8 | import cqsim_main 9 | 10 | 11 | def datetime_strptime(value, format): 12 | """Parse a datetime like datetime.strptime in Python >= 2.5""" 13 | return datetime(*time.strptime(value, format)[0:6]) 14 | 15 | 16 | class Option(optparse.Option): 17 | """An extended optparse option with cbank-specific types.
18 | 19 | Types: 20 | date -- parse a datetime from a variety of string formats 21 | """ 22 | 23 | DATE_FORMATS = [ 24 | "%Y-%m-%d", 25 | "%Y-%m-%d %H:%M:%S", 26 | "%Y-%m-%d %H:%M", 27 | "%y-%m-%d", 28 | "%y-%m-%d %H:%M:%S", 29 | "%y-%m-%d %H:%M", 30 | "%m/%d/%Y", 31 | "%m/%d/%Y %H:%M:%S", 32 | "%m/%d/%Y %H:%M", 33 | "%m/%d/%y", 34 | "%m/%d/%y %H:%M:%S", 35 | "%m/%d/%y %H:%M", 36 | "%Y%m%d", 37 | ] 38 | 39 | def check_date(self, opt, value): 40 | """Parse a datetime from a variety of string formats.""" 41 | for format in self.DATE_FORMATS: 42 | try: 43 | dt = datetime_strptime(value, format) 44 | except ValueError: 45 | continue 46 | else: 47 | # Python can't translate dates before 1900 to a string, 48 | # causing crashes when trying to build sql with them. 49 | if dt < datetime(1900, 1, 1): 50 | raise optparse.OptionValueError( 51 | "option %s: date must be after 1900: %s" % (opt, value)) 52 | else: 53 | return dt 54 | raise optparse.OptionValueError( 55 | "option %s: invalid date: %s" % (opt, value)) 56 | 57 | TYPES = optparse.Option.TYPES + ("date",) 58 | 59 | TYPE_CHECKER = optparse.Option.TYPE_CHECKER.copy() 60 | TYPE_CHECKER['date'] = check_date 61 | 62 | 63 | def callback_alg(option, opt_str, value, parser): 64 | temp_opt['alg'].append(value) 65 | return 66 | 67 | 68 | def callback_alg_sign(option, opt_str, value, parser): 69 | temp_opt['alg_sign'].append(value) 70 | return 71 | 72 | 73 | def callback_bf_para(option, opt_str, value, parser): 74 | temp_opt['bf_para'].append(value) 75 | return 76 | 77 | 78 | def callback_win_para(option, opt_str, value, parser): 79 | temp_opt['win_para'].append(value) 80 | return 81 | 82 | 83 | def callback_ad_win_para(option, opt_str, value, parser): 84 | temp_opt['ad_win_para'].append(value) 85 | return 86 | 87 | 88 | def callback_ad_bf_para(option, opt_str, value, parser): 89 | temp_opt['ad_bf_para'].append(value) 90 | return 91 | 92 | 93 | def callback_ad_alg_para(option, opt_str, value, parser): 94 | 
temp_opt['ad_alg_para'].append(value) 95 | return 96 | 97 | 98 | def get_raw_name(file_name): 99 | output_name = "" 100 | length = len(file_name) 101 | i = 0 102 | while (i < length): 103 | if (file_name[i] == '.'): 104 | break 105 | output_name += file_name[i] 106 | i += 1 107 | return output_name 108 | 109 | 110 | def alg_sign_check(alg_sign_t, leng): 111 | alg_sign_result = [] 112 | temp_len = len(alg_sign_t) 113 | i = 0 114 | while i < leng: 115 | if i < temp_len: 116 | alg_sign_result.append(int(alg_sign_t[i])) 117 | else: 118 | alg_sign_result.append(0) 119 | i += 1 120 | return alg_sign_result 121 | 122 | 123 | def get_list(inputstring, regex): 124 | return re.findall(regex, inputstring) 125 | 126 | 127 | def read_config(fileName): 128 | nr_sign = ';' # Comment sign: lines starting with it are skipped 129 | sep_sign = '=' # Sign that separates name and value in a line 130 | readData = {} 131 | configFile = open(fileName, 'r') 132 | 133 | while True: 134 | tempStr = configFile.readline() 135 | if not tempStr: # break when there are no more lines 136 | break 137 | if tempStr[0] != nr_sign: # A configuration line (not a comment) 138 | strNum = len(tempStr) 139 | newWord = 1 140 | k = 0 141 | dataName = "" 142 | dataValue = "" 143 | 144 | for i in range(strNum): 145 | if tempStr[i] == '\n': 146 | break 147 | if tempStr[i] == sep_sign: 148 | if newWord == 0: 149 | newWord = 1 150 | k = k + 1 151 | else: 152 | newWord = 0 153 | if k == 0: 154 | dataName = dataName + tempStr[i] 155 | elif k == 1: 156 | dataValue = dataValue + tempStr[i] 157 | readData[dataName] = dataValue 158 | configFile.close() 159 | 160 | return readData 161 | 162 | 163 | if __name__ == "__main__": 164 | 165 | temp_opt = {'alg': [], 'alg_sign': [], 'bf_para': [], 'win_para': [], 'ad_win_para': [], 'ad_bf_para': [], 166 | 'ad_alg_para': []} 167 | p = optparse.OptionParser(option_class=Option) 168 | # 1 169 | p.add_option("-j", "--job", dest="job_trace", type="string", 170 | help="file name of the job trace") 171 | p.add_option("-n",
"--node", dest="node_struc", type="string", 172 | help="file name of the node structure") 173 | p.add_option("-J", "--job_save", dest="job_save", type="string", 174 | help="file name of the formatted job data") 175 | p.add_option("-N", "--node_save", dest="node_save", type="string", 176 | help="file name of the formatted node data") 177 | p.add_option("-f", "--frac", dest="cluster_fraction", type="float", 178 | # default=1.0, 179 | help="job density adjust") 180 | 181 | # 6 182 | p.add_option("-s", "--start", dest="start", type="float", 183 | # default=0.0, 184 | help="virtual job trace start time") 185 | p.add_option("-S", "--start_date", dest="start_date", type="date", 186 | help="job trace start date") 187 | p.add_option("-r", "--anchor", dest="anchor", type="int", 188 | # default=0, 189 | help="first read job position in job trace") 190 | p.add_option("-R", "--read", dest="read_num", type="int", 191 | # default=-1, 192 | help="number of jobs read from the job trace") 193 | p.add_option("-p", "--pre", dest="pre_name", type="string", 194 | # default="CQSIM_", 195 | help="previous file name") 196 | 197 | # 11 198 | p.add_option("-o", "--output", dest="output", type="string", 199 | help="simulator result file name") 200 | p.add_option("--debug", dest="debug", type="string", 201 | help="debug file name") 202 | p.add_option("--ext_fmt_j", dest="ext_fmt_j", type="string", 203 | # default=".csv", 204 | help="temp formatted job data extension type") 205 | p.add_option("--ext_fmt_n", dest="ext_fmt_n", type="string", 206 | # default=".csv", 207 | help="temp formatted node data extension type") 208 | p.add_option("--ext_fmt_j_c", dest="ext_fmt_j_c", type="string", 209 | # default=".con", 210 | help="temp job trace config extension type") 211 | 212 | # 16 213 | p.add_option("--ext_fmt_j_n", dest="ext_fmt_n_c", type="string", 214 | # default=".con", 215 | help="temp job trace config extension type") 216 | p.add_option("--path_in", dest="path_in", type="string", 217 | # 
default="Input Files/", 218 | help="input file path") 219 | p.add_option("--path_out", dest="path_out", type="string", 220 | # default="Results/", 221 | help="output result file path") 222 | p.add_option("--path_fmt", dest="path_fmt", type="string", 223 | # default="Temp/", 224 | help="temp file path") 225 | p.add_option("--path_debug", dest="path_debug", type="string", 226 | # default="Debug/", 227 | help="debug file path") 228 | 229 | # 21 230 | p.add_option("--ext_jr", dest="ext_jr", type="string", 231 | # default=".rst", 232 | help="job result log extension type") 233 | p.add_option("--ext_si", dest="ext_si", type="string", 234 | # default=".ult", 235 | help="system information log extension type") 236 | p.add_option("--ext_ai", dest="ext_ai", type="string", 237 | # default=".adp", 238 | help="adapt information log extension type") 239 | p.add_option("--ext_ri", dest="ext_ri", type="string", 240 | # default=".rwd", 241 | help="reward information log extension type") 242 | p.add_option("--ext_d", dest="ext_debug", type="string", 243 | # default=".log", 244 | help="debug log extension type") 245 | p.add_option("-v", "--debug_lvl", dest="debug_lvl", type="int", 246 | # default=10, 247 | help="debug mode") 248 | 249 | # 26 250 | p.add_option("-a", "--alg", dest="alg", type="string", 251 | action="callback", callback=callback_alg, 252 | help="basic algorithm list") 253 | p.add_option("-A", "--sign", dest="alg_sign", type="string", 254 | action="callback", callback=callback_alg_sign, 255 | help="sign of the algorithm element in the list") 256 | p.add_option("-b", "--bf", dest="backfill", type="int", 257 | # default=0, 258 | help="backfill mode") 259 | p.add_option("-B", "--bf_para", dest="bf_para", type="string", 260 | action="callback", callback=callback_bf_para, 261 | help="backfill parameter list") 262 | p.add_option("-w", "--win", dest="win", type="int", 263 | # default=0, 264 | help="window mode") 265 | 266 | # 31 267 | p.add_option("-W", "--win_para", 
dest="win_para", type="string", 268 | action="callback", callback=callback_win_para, 269 | help="window parameter list") 270 | p.add_option("-l", "--ad_bf", dest="ad_bf", type="int", 271 | # default=0, 272 | help="backfill adapt mode") 273 | p.add_option("-L", "--ad_bf_para", dest="ad_bf_para", type="string", 274 | action="callback", callback=callback_ad_bf_para, 275 | help="backfill adapt parameter list") 276 | p.add_option("-d", "--ad_win", dest="ad_win", type="int", 277 | # default=0, 278 | help="window adapt mode") 279 | p.add_option("-D", "--ad_win_para", dest="ad_win_para", type="string", 280 | action="callback", callback=callback_ad_win_para, 281 | help="window adapt parameter list") 282 | 283 | # 36 284 | p.add_option("-g", "--ad_alg", dest="ad_alg", type="int", 285 | # default=0, 286 | help="algorithm adapt mode") 287 | p.add_option("-G", "--ad_alg_para", dest="ad_alg_para", type="string", 288 | action="callback", callback=callback_ad_alg_para, 289 | help="algorithm adapt parameter list") 290 | p.add_option("-c", "--config_n", dest="config_n", type="string", 291 | default="config_n.set", 292 | help="name config file") 293 | p.add_option("-C", "--config_sys", dest="config_sys", type="string", 294 | default="config_sys.set", 295 | help="system config file") 296 | p.add_option("-m", "--monitor", dest="monitor", type="int", 297 | help="monitor interval time") 298 | 299 | # 41 300 | p.add_option("-I", "--log_freq", dest="log_freq", type="int", 301 | help="log frequency") 302 | 303 | p.add_option("-z", "--read_input_freq", dest="read_input_freq", type="int", 304 | help="read input frequency") 305 | 306 | p.add_option("--is_training", dest="is_training", type="int", 307 | default=0, 308 | help="is training: 0 testing; 1 training") 309 | 310 | p.add_option("--rl_alg", dest="rl_alg", type="string", 311 | default="FCFS", 312 | help="scheduling agent: PG; DQL; A2C; PPO; FCFS") 313 | 314 | p.add_option("--learning_rate", dest="learning_rate", type="float", 315 | 
help="learning rate of reinforcement learning") 316 | 317 | p.add_option("--window_size", dest="window_size", type="int", 318 | help="Jobs within the window of the head of queue are considered") 319 | 320 | p.add_option("--reward_discount", dest="reward_discount", type="float", 321 | help="Future reward discount in reinforcement learning") 322 | 323 | p.add_option("--layer_size", dest="layer_size", type="string", 324 | help="Layer size (e.g., 4000,1000)") 325 | 326 | p.add_option("--batch_size", dest="batch_size", type="int", 327 | help="Training batch size for reinforcement learning") 328 | 329 | p.add_option("--input_weight_file", dest="input_weight_file", type="string", 330 | default="", 331 | help="file name to read weights from") 332 | 333 | # 46 334 | p.add_option("--output_weight_file", dest="output_weight_file", type="string", 335 | default="", 336 | help="path to save weights for DeepRL model (not used if is_training is 0)") 337 | 338 | p.add_option("--do_render", dest="do_render", type="string", 339 | help="1 if enable rendering 0 otherwise.") 340 | 341 | opts, args = p.parse_args() 342 | 343 | inputPara = {} 344 | inputPara_sys = {} 345 | inputPara_name = {} 346 | opts.alg = temp_opt['alg'] 347 | opts.alg_sign = temp_opt['alg_sign'] 348 | opts.bf_para = temp_opt['bf_para'] 349 | opts.win_para = temp_opt['win_para'] 350 | opts.ad_win_para = temp_opt['ad_win_para'] 351 | opts.ad_bf_para = temp_opt['ad_bf_para'] 352 | opts.ad_alg_para = temp_opt['ad_alg_para'] 353 | 354 | inputPara['resource_job'] = 0 355 | inputPara['resource_node'] = 0 356 | # 0:Read original file 1:Read formatted file 357 | 358 | if opts.config_sys: 359 | inputPara_sys = read_config(cqsim_path.path_config + opts.config_sys) 360 | if opts.config_n: 361 | inputPara_name = read_config(cqsim_path.path_config + opts.config_n) 362 | elif inputPara_sys['config_n']: 363 | opts.config_n = inputPara_sys['config_n'] 364 | inputPara_name = read_config(opts.config_n) 365 | 366 | if not opts.job_trace 
and inputPara_sys["job_trace"]: 367 | opts.job_trace = inputPara_sys["job_trace"] 368 | 369 | if not opts.node_struc and inputPara_sys["node_struc"]: 370 | opts.node_struc = inputPara_sys["node_struc"] 371 | 372 | if not opts.job_trace and not opts.job_save and not inputPara_sys["job_trace"]: 373 | print("Error: Please specify an original job trace or formatted job data!") 374 | p.print_help() 375 | sys.exit() 376 | if not opts.node_struc and not opts.node_save and not inputPara_sys["node_struc"]: 377 | print("Error: Please specify an original node structure or formatted node data!") 378 | p.print_help() 379 | sys.exit() 380 | if not opts.alg and not inputPara_sys["alg"]: 381 | print("Error: Please specify the algorithm element!") 382 | p.print_help() 383 | sys.exit() 384 | 385 | if not opts.job_trace: 386 | inputPara['resource_job'] = 1 387 | if not opts.node_struc: 388 | inputPara['resource_node'] = 1 389 | if not opts.output: 390 | opts.output = get_raw_name(opts.job_trace) 391 | if not opts.debug: 392 | opts.debug = "debug_" + get_raw_name(opts.job_trace) 393 | if not opts.job_save: 394 | opts.job_save = get_raw_name(opts.job_trace) 395 | if not opts.node_save: 396 | opts.node_save = get_raw_name(opts.job_trace) + "_node" 397 | if not opts.bf_para: 398 | opts.bf_para = [] 399 | if not opts.ad_win_para: 400 | opts.ad_win_para = [] 401 | if not opts.ad_bf_para: 402 | opts.ad_bf_para = [] 403 | if not opts.ad_alg_para: 404 | opts.ad_alg_para = [] 405 | if not opts.log_freq: 406 | opts.log_freq = 1 407 | if not opts.read_input_freq: 408 | opts.read_input_freq = 1000 409 | 410 | now = datetime.now() 411 | inputPara['job_trace'] = opts.job_trace 412 | inputPara['node_struc'] = opts.node_struc 413 | inputPara['job_save'] = opts.job_save 414 | inputPara['node_save'] = opts.node_save 415 | inputPara['cluster_fraction'] = opts.cluster_fraction 416 | inputPara['start'] = opts.start 417 | inputPara['start_date'] = opts.start_date 418 | inputPara['anchor'] = 
opts.anchor 419 | inputPara['read_num'] = opts.read_num 420 | inputPara['pre_name'] = opts.pre_name 421 | inputPara['output'] = opts.output + now.strftime('%H_%M_%S') 422 | inputPara['debug'] = opts.debug 423 | inputPara['ext_fmt_j'] = opts.ext_fmt_j 424 | inputPara['ext_fmt_n'] = opts.ext_fmt_n 425 | inputPara['ext_fmt_j_c'] = opts.ext_fmt_j_c 426 | inputPara['ext_fmt_n_c'] = opts.ext_fmt_n_c 427 | inputPara['path_in'] = opts.path_in 428 | inputPara['path_out'] = opts.path_out 429 | inputPara['path_fmt'] = opts.path_fmt 430 | inputPara['path_debug'] = opts.path_debug 431 | inputPara['ext_jr'] = opts.ext_jr 432 | inputPara['ext_si'] = opts.ext_si 433 | inputPara['ext_ai'] = opts.ext_ai 434 | inputPara['ext_ri'] = opts.ext_ri 435 | inputPara['ext_debug'] = opts.ext_debug 436 | inputPara['debug_lvl'] = opts.debug_lvl 437 | inputPara['alg'] = opts.alg 438 | inputPara['alg_sign'] = opts.alg_sign 439 | inputPara['backfill'] = opts.backfill 440 | inputPara['bf_para'] = opts.bf_para 441 | inputPara['win'] = opts.win 442 | inputPara['win_para'] = opts.win_para 443 | inputPara['ad_win'] = opts.ad_win 444 | inputPara['ad_win_para'] = opts.ad_win_para 445 | inputPara['ad_bf'] = opts.ad_bf 446 | inputPara['ad_bf_para'] = opts.ad_bf_para 447 | inputPara['ad_alg'] = opts.ad_alg 448 | inputPara['ad_alg_para'] = opts.ad_alg_para 449 | inputPara['config_n'] = opts.config_n 450 | inputPara['config_sys'] = opts.config_sys 451 | inputPara['monitor'] = opts.monitor 452 | inputPara['log_freq'] = opts.log_freq 453 | inputPara['read_input_freq'] = opts.read_input_freq 454 | inputPara['is_training'] = opts.is_training 455 | inputPara['rl_alg'] = opts.rl_alg 456 | inputPara['learning_rate'] = opts.learning_rate 457 | inputPara['window_size'] = opts.window_size 458 | inputPara['reward_discount'] = opts.reward_discount 459 | inputPara['batch_size'] = opts.batch_size 460 | inputPara['layer_size'] = opts.layer_size 461 | inputPara['input_weight_file'] = opts.input_weight_file 462 | 
inputPara['output_weight_file'] = opts.output_weight_file 463 | inputPara['do_render'] = opts.do_render 464 | 465 | for item in inputPara_name: 466 | if not inputPara[item]: 467 | inputPara[item] = str(inputPara_name[item]) 468 | 469 | for item in inputPara_sys: 470 | if (item not in inputPara) or (inputPara[item] is None): 471 | if inputPara_sys[item]: 472 | if item == "cluster_fraction" or item == "start": 473 | inputPara[item] = float(inputPara_sys[item]) 474 | elif item == "start_date": 475 | inputPara[item] = str(inputPara_sys[item]) 476 | elif item == "anchor" or \ 477 | item == "read_num" or \ 478 | item == "backfill" or \ 479 | item == "win" or \ 480 | item == "debug_lvl" or \ 481 | item == "ad_bf" or \ 482 | item == "ad_win" or \ 483 | item == "ad_alg" or \ 484 | item == "monitor" or \ 485 | item == "window_size" or \ 486 | item == "batch_size": 487 | inputPara[item] = int(inputPara_sys[item]) 488 | elif item == "alg" or \ 489 | item == "alg_sign" or \ 490 | item == "bf_para" or \ 491 | item == "win_para" or \ 492 | item == "ad_win_para" or \ 493 | item == "ad_bf_para" or \ 494 | item == "ad_alg_para": 495 | inputPara[item] = get_list(inputPara_sys[item], r'([^,]+)') 496 | elif item == "is_training": 497 | inputPara[item] = str(inputPara_sys[item]) 498 | elif item == "learning_rate" or \ 499 | item == "reward_discount": 500 | inputPara[item] = float(inputPara_sys[item]) 501 | else: 502 | inputPara[item] = str(inputPara_sys[item]) 503 | else: 504 | inputPara[item] = None 505 | 506 | inputPara['path_in'] = cqsim_path.path_data + inputPara['path_in'] 507 | inputPara['path_out'] = cqsim_path.path_data + inputPara['path_out'] 508 | inputPara['path_fmt'] = cqsim_path.path_data + inputPara['path_fmt'] 509 | inputPara['path_debug'] = cqsim_path.path_data + inputPara['path_debug'] 510 | inputPara['alg_sign'] = alg_sign_check( 511 | inputPara['alg_sign'], len(inputPara['alg'])) 512 | 513 | # Append Relative path to file names only if the Filenames exist. 
514 | if inputPara['input_weight_file']: 515 | inputPara['input_weight_file'] = inputPara['path_fmt'] + \ 516 | inputPara['input_weight_file'] 517 | if inputPara['output_weight_file']: 518 | inputPara['output_weight_file'] = inputPara['path_fmt'] + \ 519 | inputPara['output_weight_file'] 520 | 521 | inputPara['layer_size'] = [int(size) 522 | for size in inputPara['layer_size'].split(',')] 523 | cqsim_main.cqsim_main(inputPara) 524 | -------------------------------------------------------------------------------- /src_fc/cqsim_main.py: -------------------------------------------------------------------------------- 1 | import os 2 | import IOModule.Debug_log as Class_Debug_log 3 | import IOModule.Output_log as Class_Output_log 4 | from time import time 5 | 6 | import CqSim.Job_trace as Class_Job_trace 7 | import CqSim.Backfill as Class_Backfill 8 | import CqSim.Start_window as Class_Start_window 9 | import CqSim.Basic_algorithm as Class_Basic_algorithm 10 | import CqSim.Info_collect as Class_Info_collect 11 | 12 | import Extend.SWF.Filter_job_SWF as filter_job_ext 13 | import Extend.SWF.Filter_node_SWF as filter_node_ext 14 | import Extend.SWF.Node_struc_SWF as node_struc_ext 15 | 16 | import Trainer.PG_Trainer as pg_trainer 17 | import Trainer.A2C_Trainer as a2c_trainer 18 | import Trainer.DQL_Trainer as dql_trainer 19 | import Trainer.PPO_Trainer as ppo_trainer 20 | import Trainer.FCFS as FCFS 21 | 22 | 23 | def cqsim_main(para_list): 24 | print("....................") 25 | for item in para_list: 26 | print(str(item) + ": " + str(para_list[item])) 27 | print("....................") 28 | 29 | trace_name = para_list['path_in'] + para_list['job_trace'] 30 | save_name_j = para_list['path_fmt'] + \ 31 | para_list['job_save'] + para_list['ext_fmt_j'] 32 | config_name_j = para_list['path_fmt'] + \ 33 | para_list['job_save'] + para_list['ext_fmt_j_c'] 34 | struc_name = para_list['path_in'] + para_list['node_struc'] 35 | save_name_n = para_list['path_fmt'] + \ 36 | 
para_list['node_save'] + para_list['ext_fmt_n'] 37 | config_name_n = para_list['path_fmt'] + \ 38 | para_list['node_save'] + para_list['ext_fmt_n_c'] 39 | 40 | output_sys = para_list['path_out'] + \ 41 | para_list['output'] + para_list['ext_si'] 42 | output_adapt = para_list['path_out'] + \ 43 | para_list['output'] + para_list['ext_ai'] 44 | output_result = para_list['path_out'] + \ 45 | para_list['output'] + para_list['ext_jr'] 46 | output_reward = para_list['path_out'] + \ 47 | para_list['output'] + para_list['ext_ri'] 48 | output_fn = {'sys': output_sys, 49 | 'adapt': output_adapt, 50 | 'result': output_result, 51 | 'reward': output_reward, 52 | } 53 | 54 | log_freq_int = para_list['log_freq'] 55 | read_input_freq = para_list['read_input_freq'] 56 | 57 | if not os.path.exists(para_list['path_fmt']): 58 | os.makedirs(para_list['path_fmt']) 59 | 60 | if not os.path.exists(para_list['path_out']): 61 | os.makedirs(para_list['path_out']) 62 | 63 | if not os.path.exists(para_list['path_debug']): 64 | os.makedirs(para_list['path_debug']) 65 | 66 | # Debug 67 | print(".................... Debug") 68 | debug_path = para_list['path_debug'] + \ 69 | para_list['debug'] + para_list['ext_debug'] 70 | module_debug = Class_Debug_log.Debug_log( 71 | lvl=para_list['debug_lvl'], show=2, path=debug_path, log_freq=log_freq_int) 72 | # module_debug.start_debug() 73 | 74 | # Job Filter 75 | print(".................... Job Filter") 76 | module_filter_job = filter_job_ext.Filter_job_SWF( 77 | trace=trace_name, save=save_name_j, config=config_name_j, debug=module_debug) 78 | module_filter_job.feed_job_trace() 79 | # module_filter_job.read_job_trace() 80 | # module_filter_job.output_job_data() 81 | module_filter_job.output_job_config() 82 | 83 | # Node Filter 84 | print(".................... 
Node Filter") 85 | module_filter_node = filter_node_ext.Filter_node_SWF( 86 | struc=struc_name, save=save_name_n, config=config_name_n, debug=module_debug) 87 | module_filter_node.read_node_struc() 88 | module_filter_node.output_node_data() 89 | module_filter_node.output_node_config() 90 | 91 | # Job Trace 92 | print(".................... Job Trace") 93 | module_job_trace = Class_Job_trace.Job_trace(start=para_list['start'], num=para_list['read_num'], anchor=para_list['anchor'], 94 | density=para_list['cluster_fraction'], read_input_freq=para_list['read_input_freq'], debug=module_debug) 95 | module_job_trace.initial_import_job_file(save_name_j) 96 | # module_job_trace.import_job_file(save_name_j) 97 | module_job_trace.import_job_config(config_name_j) 98 | 99 | # Node Structure 100 | print(".................... Node Structure") 101 | module_node_struc = node_struc_ext.Node_struc_SWF(debug=module_debug) 102 | module_node_struc.import_node_file(save_name_n) 103 | module_node_struc.import_node_config(config_name_n) 104 | 105 | # Backfill 106 | print(".................... Backfill") 107 | module_backfill = Class_Backfill.Backfill( 108 | mode=para_list['backfill'], node_module=module_node_struc, debug=module_debug, para_list=para_list['bf_para']) 109 | 110 | # Start Window 111 | print(".................... Start Window") 112 | module_win = Class_Start_window.Start_window( 113 | mode=para_list['win'], node_module=module_node_struc, debug=module_debug, para_list=para_list['win_para'], para_list_ad=para_list['ad_win_para']) 114 | 115 | # Basic Algorithm 116 | print(".................... Basic Algorithm") 117 | module_alg = Class_Basic_algorithm.Basic_algorithm( 118 | element=[para_list['alg'], para_list['alg_sign']], debug=module_debug, para_list=para_list['ad_alg_para']) 119 | 120 | # Information Collect 121 | print(".................... 
Information Collect") 122 | module_info_collect = Class_Info_collect.Info_collect( 123 | alg_module=module_alg, debug=module_debug) 124 | 125 | # Output Log 126 | print(".................... Output Log") 127 | module_output_log = Class_Output_log.Output_log( 128 | output=output_fn, log_freq=log_freq_int) 129 | 130 | # CqSim Simulator with RL 131 | print(".................... Cqsim Simulator using RL") 132 | module_list = {'job': module_job_trace, 'node': module_node_struc, 'backfill': module_backfill, 133 | 'win': module_win, 'alg': module_alg, 'info': module_info_collect, 'output': module_output_log} 134 | job_cols = int(para_list['job_info_size']) // int(para_list['input_dim']) 135 | batch_size = int(para_list['batch_size']) 136 | window_size = int(para_list['window_size']) 137 | learning_rate = float(para_list['learning_rate']) 138 | reward_discount = float(para_list['reward_discount']) 139 | is_training = True if ( 140 | para_list['is_training'] == '1' or para_list['is_training'] == 1) else False 141 | input_weight_file = para_list['input_weight_file'] 142 | output_weight_file = para_list['output_weight_file'] 143 | do_render = True if para_list['do_render'] == '1' else False 144 | layer_size = para_list['layer_size'] 145 | 146 | # Invoke CqGym with the selected agent. Each model_engine call receives the parameters needed to initialize the 147 | # Gym environment together with the CqSim simulator and to load the chosen RL model (PPO, A2C, DQL or PG). 
148 | reward_seq = [] 149 | if para_list['rl_alg'] == 'PPO': 150 | reward_seq = ppo_trainer.model_engine(module_list, module_debug, job_cols, window_size, module_node_struc.tot, 151 | is_training, input_weight_file, output_weight_file, do_render, learning_rate, reward_discount, batch_size, layer_size) 152 | elif para_list['rl_alg'] == 'A2C': 153 | reward_seq = a2c_trainer.model_engine(module_list, module_debug, job_cols, window_size, module_node_struc.tot, 154 | is_training, input_weight_file, output_weight_file, do_render, learning_rate, reward_discount, batch_size, layer_size) 155 | elif para_list['rl_alg'] == 'DQL': 156 | reward_seq = dql_trainer.model_engine(module_list, module_debug, job_cols, window_size, module_node_struc.tot, 157 | is_training, input_weight_file, output_weight_file, do_render, learning_rate, reward_discount, batch_size, layer_size) 158 | elif para_list['rl_alg'] == 'PG': 159 | reward_seq = pg_trainer.model_engine(module_list, module_debug, job_cols, window_size, module_node_struc.tot, 160 | is_training, input_weight_file, output_weight_file, do_render, learning_rate, reward_discount, batch_size, layer_size) 161 | else: # FCFS 162 | print(".................... FCFS") 163 | FCFS.model_engine(module_list, module_debug, 164 | job_cols, window_size, do_render) 165 | module_output_log.print_reward(reward_seq) 166 | -------------------------------------------------------------------------------- /src_fc/cqsim_path.py: -------------------------------------------------------------------------------- 1 | import sys 2 | path_cqsim="/workspace/Cqsim/" 3 | path_src=path_cqsim+"src" 4 | path_config="Config/" 5 | path_data="../"+"data/" 6 | sys.path.append(path_src) 7 | --------------------------------------------------------------------------------