├── .gitattributes
├── LICENSE
├── README.md
├── Untitled0.ipynb
├── data
│   ├── README.md
│   ├── data_10.mat
│   ├── data_10_WeightsAlternated.mat
│   ├── data_20.mat
│   ├── data_30.mat
│   ├── data_5.mat
│   ├── data_6.mat
│   ├── data_7.mat
│   ├── data_8.mat
│   └── data_9.mat
├── demo_alternate_weights.py
├── demo_on_off.py
├── gain_his_ratio.txt
├── main.ipynb
├── main.py
├── mainPyTorch.py
├── mainTF2
├── memory.ipynb
├── memory.py
├── memoryPyTorch.py
├── memoryTF2.py
└── optimization.py
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 REVENOL
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DROO
2 |
3 | *Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks*
4 |
5 | Python code to reproduce our DROO algorithm for Wireless-powered Mobile-Edge Computing [1], which uses the time-varying wireless channel gains as the input and generates the binary offloading decisions. It includes:
6 |
7 | - [memory.py](memory.py): the DNN structure for the WPMEC, including training and testing structures, implemented based on [Tensorflow 1.x](https://www.tensorflow.org/install/pip).
8 | - [memoryTF2.py](memoryTF2.py): Implemented based on [Tensorflow 2](https://www.tensorflow.org/install).
9 | - [memoryPyTorch.py](memoryPyTorch.py): Implemented based on [PyTorch](https://pytorch.org/get-started/locally/).
10 | - [optimization.py](optimization.py): solves the resource allocation problem
11 |
12 | - [data](./data): all data are stored in this subdirectory, including:
13 |
14 | 	- **data_#.mat**: training and testing data sets, where # = {5, 6, 7, 8, 9, 10, 20, 30} is the user number
15 |
16 | - [main.py](main.py): run this file for DROO, including setting system parameters, implemented based on [Tensorflow 1.x](https://www.tensorflow.org/install/pip)
17 | - [mainTF2.py](mainTF2.py): Implemented based on [Tensorflow 2](https://www.tensorflow.org/install). Run this file for DROO if you code with Tensorflow 2.
18 | - [mainPyTorch.py](mainPyTorch.py): Implemented based on [PyTorch](https://pytorch.org/get-started/locally/). Run this file for DROO if you code with PyTorch.
19 |
20 | - [demo_alternate_weights.py](demo_alternate_weights.py): run this file to evaluate the performance of DROO when WDs' weights are alternated
21 |
22 | - [demo_on_off.py](demo_on_off.py): run this file to evaluate the performance of DROO when some WDs are randomly turned on/off
23 |
24 |
25 | ## Cite this work
26 |
27 | 1. L. Huang, S. Bi, and Y. J. Zhang, “[Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks](https://ieeexplore.ieee.org/document/8771176),” IEEE Trans. Mobile Comput., vol. 19, no. 11, pp. 2581-2593, November 2020.
28 |
29 | ```
30 | @ARTICLE{huang2020DROO,
31 | author={Huang, Liang and Bi, Suzhi and Zhang, Ying-Jun Angela},
32 | journal={IEEE Transactions on Mobile Computing},
33 | title={Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks},
34 | year={2020},
35 | month={November},
36 | volume={19},
37 | number={11},
38 | pages={2581-2593},
39 | doi={10.1109/TMC.2019.2928811}
40 | }
41 | ```
42 |
43 | ## About authors
44 |
45 | - [Liang HUANG](https://scholar.google.com/citations?user=NifLoZ4AAAAJ), lianghuang AT zjut.edu.cn
46 |
47 | - [Suzhi BI](https://scholar.google.com/citations?user=uibqC-0AAAAJ), bsz AT szu.edu.cn
48 |
49 | - [Ying Jun (Angela) Zhang](https://scholar.google.com/citations?user=iOb3wocAAAAJ), yjzhang AT ie.cuhk.edu.hk
50 |
51 | ## Required packages
52 |
53 | - Tensorflow
54 |
55 | - numpy
56 |
57 | - scipy
58 |
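A quick way to check that the required packages are importable before running anything (a minimal sketch; swap `tensorflow` for `torch` if you use the PyTorch variant):

```
# environment check for main.py / mainTF2.py / mainPyTorch.py
import numpy
import scipy
import tensorflow   # or: import torch

print(numpy.__version__, scipy.__version__, tensorflow.__version__)
```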
59 | ## How the code works
60 |
61 | - For the DROO algorithm, run the file [main.py](main.py). If you code with Tensorflow 2 or PyTorch, run [mainTF2.py](mainTF2.py) or [mainPyTorch.py](mainPyTorch.py), respectively. The original DROO algorithm is coded based on [Tensorflow 1.x](https://www.tensorflow.org/install/pip). If you are new to deep learning, please start with [Tensorflow 2](https://www.tensorflow.org/install) or [PyTorch](https://pytorch.org/get-started/locally/), whose code is much cleaner and easier to follow. A sketch of the core training loop follows this list.
62 |
63 | - For more DROO demos:
64 |   - Alternating-weight WDs, run the file [demo_alternate_weights.py](demo_alternate_weights.py)
65 |   - ON-OFF WDs, run the file [demo_on_off.py](demo_on_off.py)
66 |   - Remember to edit the *import MemoryDNN* line accordingly, from
67 | ```
68 | from memory import MemoryDNN
69 | ```
70 | to
71 | ```
72 | from memoryTF2 import MemoryDNN
73 | ```
74 | or
75 | ```
76 | from memoryPyTorch import MemoryDNN
77 | ```
78 | if you are using Tensorflow 2 or PyTorch.
79 |
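As a quick orientation, the core logic of DROO can be condensed as follows. This is a simplified sketch of [main.py](main.py) (adaptive-K updates and metric bookkeeping omitted); it reuses the repository's own modules:

```
import numpy as np
import scipy.io as sio
from memory import MemoryDNN            # or memoryTF2 / memoryPyTorch
from optimization import bisection

# load channel gains and scale them toward 1, as done in main.py
channel = sio.loadmat('./data/data_10')['input_h'] * 1000000

mem = MemoryDNN(net=[10, 120, 80, 10], learning_rate=0.01,
                training_interval=10, batch_size=128, memory_size=1024)

for i in range(len(channel)):
    h = channel[i, :]
    m_list = mem.decode(h, 10, 'OP')    # DNN generates K = 10 candidate offloading actions
    r_list = [bisection(h / 1000000, m)[0] for m in m_list]  # evaluate each candidate
    mem.encode(h, m_list[int(np.argmax(r_list))])  # train the DNN on the best action found
```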
80 | ### DROO is illustrated here for single-slot optimization. If you intend to apply DROO to multiple-slot continuous control problems, please refer to our [LyDROO](https://github.com/revenol/LyDROO) project.
81 |
--------------------------------------------------------------------------------
/Untitled0.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyOnESJFQS185LAOEe1zq+1H",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | },
17 | "accelerator": "GPU",
18 | "gpuClass": "standard"
19 | },
20 | "cells": [
21 | {
22 | "cell_type": "markdown",
23 | "metadata": {
24 | "id": "view-in-github",
25 | "colab_type": "text"
26 | },
27 | "source": [
28 |         ""
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": null,
34 | "metadata": {
35 | "id": "GcZqNc_bGfkz"
36 | },
37 | "outputs": [],
38 | "source": [
39 | "from google.colab import files"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "source": [
45 | "!cp content/drive/MyDrive/Importing\\ Scripts \\as\\ Modules/DROO-master/main.py /content"
46 | ],
47 | "metadata": {
48 | "colab": {
49 | "base_uri": "https://localhost:8080/"
50 | },
51 | "id": "B_yN8BR3G7bN",
52 | "outputId": "bc83f6e3-f21c-4fcc-b7b3-6e92d2b68de9"
53 | },
54 | "execution_count": null,
55 | "outputs": [
56 | {
57 | "output_type": "stream",
58 | "name": "stdout",
59 | "text": [
60 | "cp: cannot stat 'content/drive/MyDrive/Importing Scripts': No such file or directory\n",
61 | "cp: cannot stat 'as Modules/DROO-master/main.py': No such file or directory\n"
62 | ]
63 | }
64 | ]
65 | },
66 | {
67 | "cell_type": "markdown",
68 | "source": [
69 | "# New Section"
70 | ],
71 | "metadata": {
72 | "id": "MkqI9XbFHDmT"
73 | }
74 | }
75 | ]
76 | }
--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
1 | # DROO
2 |
3 | *Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks*
4 |
5 | This folder includes all pre-generated training and testing data sets, including:
6 |
7 | - **data_#.mat**: training and testing data sets, where # = {5, 6, 7, 8, 9, 10, 20, 30} is the number of WDs
8 |
9 | - [data_10_WeightsAlternated.mat](data_10_WeightsAlternated.mat): The data set when all WDs' weights are alternated. It contains the same values of 'input_h' as the ones stored in [data_10.mat](data_10.mat). However, the optimal offloading mode, resource allocation, and the maximum computation rate are recalculated since WDs' weights are alternated.
10 |
11 |
12 | ## Data Format
13 |
14 | Data samples are generated by enumerating all 2^N binary offloading actions for N <= 10 and by following the CD method presented in [2] for N = 20, 30. There are 30,000 (for N = 10, 20, 30) or 10,000 (otherwise) samples saved in each \*.mat file, where each data sample includes:
15 |
16 | | variable | description |
17 | |------------------------:|:-----------------------|
18 | | input_h | The wireless channel gain between WDs and the AP $\mathbf{h}$ |
19 | | output_mode | The optimal binary offloading action $\mathbf{x}^*$ |
20 | | output_a | The optimal fraction of time that the AP broadcasts RF energy for the WDs to harvest $a^*$ |
21 | | output_tau | The optimal fraction of time allocated to WDs for task offloading $\mathbf{\tau}^*$|
22 | | output_obj | The optimal weighted sum computation rate $Q^*$ |
23 |
24 |
25 |
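Each \*.mat file can be inspected with `scipy.io` using the variable names listed above (a minimal sketch):

```
import scipy.io as sio

data = sio.loadmat('./data/data_10')
h = data['input_h']          # wireless channel gains, one row per sample
mode = data['output_mode']   # optimal binary offloading actions
obj = data['output_obj']     # optimal weighted sum computation rates
print(h.shape, mode.shape, obj.shape)
```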
26 | ## About our works
27 |
28 | 1. Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks," [arXiv:1808.01977](https://arxiv.org/abs/1808.01977).
29 | 2. S. Bi and Y. J. Zhang, "Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading," *IEEE Trans. Wireless Commun.*, vol. 17, no. 6, pp. 4177-4190, Jun. 2018.
30 |
31 | ## About authors
32 |
33 | - Liang HUANG, lianghuang AT zjut.edu.cn
34 |
35 | - Suzhi BI, bsz AT szu.edu.cn
36 |
37 | - Ying Jun (Angela) Zhang, yjzhang AT ie.cuhk.edu.hk
38 |
--------------------------------------------------------------------------------
/data/data_10.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_10.mat
--------------------------------------------------------------------------------
/data/data_10_WeightsAlternated.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_10_WeightsAlternated.mat
--------------------------------------------------------------------------------
/data/data_20.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_20.mat
--------------------------------------------------------------------------------
/data/data_30.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_30.mat
--------------------------------------------------------------------------------
/data/data_5.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_5.mat
--------------------------------------------------------------------------------
/data/data_6.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_6.mat
--------------------------------------------------------------------------------
/data/data_7.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_7.mat
--------------------------------------------------------------------------------
/data/data_8.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_8.mat
--------------------------------------------------------------------------------
/data/data_9.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_9.mat
--------------------------------------------------------------------------------
/demo_alternate_weights.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
3 | #
4 | # This file contains a demo evaluating the performance of DROO with alternating-weight WDs. It loads the training samples with the default WDs' weights from ./data/data_10.mat and with alternated weights from ./data/data_10_WeightsAlternated.mat. The channel gains in both files are the same. However, the optimal offloading mode, resource allocation, and the maximum computation rate in 'data_10_WeightsAlternated.mat' are recalculated since the WDs' weights are alternated.
5 | #
6 | # References:
7 | #   [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, “Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks”, on arXiv:1808.01977
8 | #
9 | # version 1.0 -- April 2019. Written by Liang Huang (lianghuang AT zjut.edu.cn)
10 | # #################################################################
11 |
12 |
13 | import scipy.io as sio                     # import scipy.io for .mat file I/O
14 | import numpy as np # import numpy
15 |
16 | from memory import MemoryDNN
17 | from optimization import bisection
18 | from main import plot_rate, save_to_txt
19 |
20 | import time
21 |
22 |
23 | def alternate_weights(case_id=0):
24 | '''
25 |     Alternate the weights of all WDs. Note that the maximum computation rate needs to be recomputed by solving (P2) once any WD's weight is changed.
26 | Input: case_id = 0 for default weights; case_id = 1 for alternated weights.
27 | Output: The alternated weights and the corresponding rate.
28 | '''
29 | # set alternated weights
30 | weights=[[1,1.5,1,1.5,1,1.5,1,1.5,1,1.5],[1.5,1,1.5,1,1.5,1,1.5,1,1.5,1]]
31 |
32 | # load the corresponding maximum computation rate
33 | if case_id == 0:
34 |         # by default, case_id = 0
35 | rate = sio.loadmat('./data/data_10')['output_obj']
36 | else:
37 | # alternate weights for all WDs, case_id = 1
38 | rate = sio.loadmat('./data/data_10_WeightsAlternated')['output_obj']
39 | return weights[case_id], rate
40 |
41 | if __name__ == "__main__":
42 | '''
43 |     This demo evaluates DROO with alternating-weight WDs. We evaluate an extreme case by alternating the weights of all WDs between 1 and 1.5 at the same time, specifically at time frames 6,000 and 8,000.
44 | '''
45 |
46 | N = 10 # number of users
47 | n = 10000 # number of time frames, <= 10,000
48 | K = N # initialize K = N
49 | decoder_mode = 'OP' # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
50 | Memory = 1024 # capacity of memory structure
51 | Delta = 32 # Update interval for adaptive K
52 |
53 | print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
54 | # Load data
55 | channel = sio.loadmat('./data/data_%d' %N)['input_h']
56 | rate = sio.loadmat('./data/data_%d' %N)['output_obj']
57 |
58 |     # scale h up to be close to 1 for better training; it is a trick widely adopted in deep learning
59 | channel = channel * 1000000
60 |
61 | # generate the train and test data sample index
62 |     # data are split as 80:20
63 | # training data are randomly sampled with duplication if n > total data size
64 |
65 | split_idx = int(.8* len(channel))
66 |     num_test = min(len(channel) - split_idx, n - int(.8* n)) # test data size
67 |
68 |
69 | mem = MemoryDNN(net = [N, 120, 80, N],
70 | learning_rate = 0.01,
71 | training_interval=10,
72 | batch_size=128,
73 | memory_size=Memory
74 | )
75 |
76 | start_time=time.time()
77 |
78 | rate_his = []
79 | rate_his_ratio = []
80 | mode_his = []
81 | k_idx_his = []
82 | K_his = []
83 | h = channel[0,:]
84 |
85 |     # initialize the weights by setting case_id = 0.
86 | weight, rate = alternate_weights(0)
87 | print("WD weights at time frame %d:"%(0), weight)
88 |
89 |
90 | for i in range(n):
91 | # for dynamic number of WDs
92 | if i ==0.6*n:
93 | weight, rate = alternate_weights(1)
94 | print("WD weights at time frame %d:"%(i), weight)
95 | if i ==0.8*n:
96 | weight, rate = alternate_weights(0)
97 | print("WD weights at time frame %d:"%(i), weight)
98 |
99 |
100 | if i % (n//10) == 0:
101 | print("%0.1f"%(i/n))
102 | if i> 0 and i % Delta == 0:
103 | # index counts from 0
104 | if Delta > 1:
105 | max_k = max(k_idx_his[-Delta:-1]) +1;
106 | else:
107 | max_k = k_idx_his[-1] +1;
108 | K = min(max_k +1, N)
109 |
110 |
111 | i_idx = i
112 | h = channel[i_idx,:]
113 |
114 | # the action selection must be either 'OP' or 'KNN'
115 | m_list = mem.decode(h, K, decoder_mode)
116 |
117 | r_list = []
118 | for m in m_list:
119 |         # only active users are used to compute the rate
120 | r_list.append(bisection(h/1000000, m, weight)[0])
121 |
122 | # memorize the largest reward
123 | rate_his.append(np.max(r_list))
124 | rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
125 | # record the index of largest reward
126 | k_idx_his.append(np.argmax(r_list))
127 | # record K in case of adaptive K
128 | K_his.append(K)
129 | # save the mode with largest reward
130 | mode_his.append(m_list[np.argmax(r_list)])
131 | # if i <0.6*n:
132 | # encode the mode with largest reward
133 | mem.encode(h, m_list[np.argmax(r_list)])
134 |
135 |
136 | total_time=time.time()-start_time
137 | mem.plot_cost()
138 | plot_rate(rate_his_ratio)
139 |
140 | print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test: -1])/num_test)
141 | print('Total time consumed:%s'%total_time)
142 | print('Average time per channel:%s'%(total_time/n))
143 |
144 | # save data into txt
145 | save_to_txt(k_idx_his, "k_idx_his.txt")
146 | save_to_txt(K_his, "K_his.txt")
147 | save_to_txt(mem.cost_his, "cost_his.txt")
148 | save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
149 | save_to_txt(mode_his, "mode_his.txt")
150 |
151 |
152 |
153 |
--------------------------------------------------------------------------------
/demo_on_off.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
3 | #
4 | # This file contains a demo evaluating the performance of DROO by randomly turning on/off some WDs. It loads the training samples from ./data/data_#.mat, where # denotes the number of active WDs in the MEC network. Note that the maximum computation rate needs to be recomputed by solving (P2) once a WD is turned off/on.
5 | #
6 | # References:
7 | #   [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, “Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks”, submitted to IEEE Journal on Selected Areas in Communications.
8 | #
9 | # version 1.0 -- April 2019. Written by Liang Huang (lianghuang AT zjut.edu.cn)
10 | # #################################################################
11 |
12 |
13 | import scipy.io as sio                     # import scipy.io for .mat file I/O
14 | import numpy as np # import numpy
15 |
16 | from memory import MemoryDNN
17 | from optimization import bisection
18 | from main import plot_rate, save_to_txt
19 |
20 | import time
21 |
22 |
23 | def WD_off(channel, N_active, N):
24 | # turn off one WD
25 |     if N_active > 5: # currently we support turning off at most half of the WDs
26 |         N_active = N_active - 1
27 |         # set the (N_active-1)th channel close to 0
28 |         # since all channels in each time frame are randomly generated, we turn off the WD with the greatest index
29 |         channel[:,N_active] = channel[:, N_active] / 1000000 # a programming trick, so that we can recover its channel gain once the WD is turned on again.
30 |         print(" The %dth WD is turned off."%(N_active +1))
31 |
32 | # update the expected maximum computation rate
33 | rate = sio.loadmat('./data/data_%d' %N_active)['output_obj']
34 | return channel, rate, N_active
35 |
36 | def WD_on(channel, N_active, N):
37 | # turn on one WD
38 | if N_active < N:
39 | N_active = N_active + 1
40 | # recover (N_active-1)th channel
41 | channel[:,N_active-1] = channel[:, N_active-1] * 1000000
42 | print(" The %dth WD is turned on."%(N_active))
43 |
44 | # update the expected maximum computation rate
45 | rate = sio.loadmat('./data/data_%d' %N_active)['output_obj']
46 | return channel, rate, N_active
47 |
48 |
49 |
50 |
51 | if __name__ == "__main__":
52 | '''
53 |     This demo evaluates DROO for MEC networks where WDs can be occasionally turned off/on. After DROO converges, we randomly turn off one WD at each of the time frames 6,000, 6,500, 7,000, and 7,500, and then turn them back on at time frames 8,000, 8,500, and 9,000. At time frame 9,500, we randomly turn off two WDs, resulting in an MEC network with 8 active WDs.
54 | '''
55 |
56 | N = 10 # number of users
57 | N_active = N # number of effective users
58 | N_off = 0 # number of off-users
59 | n = 10000 # number of time frames, <= 10,000
60 | K = N # initialize K = N
61 | decoder_mode = 'OP' # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
62 | Memory = 1024 # capacity of memory structure
63 | Delta = 32 # Update interval for adaptive K
64 |
65 | print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
66 | # Load data
67 | channel = sio.loadmat('./data/data_%d' %N)['input_h']
68 | rate = sio.loadmat('./data/data_%d' %N)['output_obj']
69 |
70 |     # scale h up to be close to 1 for better training; it is a trick widely adopted in deep learning
71 | channel = channel * 1000000
72 | channel_bak = channel.copy()
73 | # generate the train and test data sample index
74 |     # data are split as 80:20
75 | # training data are randomly sampled with duplication if n > total data size
76 |
77 | split_idx = int(.8* len(channel))
78 |     num_test = min(len(channel) - split_idx, n - int(.8* n)) # test data size
79 |
80 |
81 | mem = MemoryDNN(net = [N, 120, 80, N],
82 | learning_rate = 0.01,
83 | training_interval=10,
84 | batch_size=128,
85 | memory_size=Memory
86 | )
87 |
88 | start_time=time.time()
89 |
90 | rate_his = []
91 | rate_his_ratio = []
92 | mode_his = []
93 | k_idx_his = []
94 | K_his = []
95 | h = channel[0,:]
96 |
97 |
98 | for i in range(n):
99 | # for dynamic number of WDs
100 | if i ==0.6*n:
101 | print("At time frame %d:"%(i))
102 | channel, rate, N_active = WD_off(channel, N_active, N)
103 | if i ==0.65*n:
104 | print("At time frame %d:"%(i))
105 | channel, rate, N_active = WD_off(channel, N_active, N)
106 | if i ==0.7*n:
107 | print("At time frame %d:"%(i))
108 | channel, rate, N_active = WD_off(channel, N_active, N)
109 | if i ==0.75*n:
110 | print("At time frame %d:"%(i))
111 | channel, rate, N_active = WD_off(channel, N_active, N)
112 | if i ==0.8*n:
113 | print("At time frame %d:"%(i))
114 | channel, rate, N_active = WD_on(channel, N_active, N)
115 | if i ==0.85*n:
116 | print("At time frame %d:"%(i))
117 | channel, rate, N_active = WD_on(channel, N_active, N)
118 | if i ==0.9*n:
119 | print("At time frame %d:"%(i))
120 | channel, rate, N_active = WD_on(channel, N_active, N)
121 | channel, rate, N_active = WD_on(channel, N_active, N)
122 | if i == 0.95*n:
123 | print("At time frame %d:"%(i))
124 | channel, rate, N_active = WD_off(channel, N_active, N)
125 | channel, rate, N_active = WD_off(channel, N_active, N)
126 |
127 | if i % (n//10) == 0:
128 | print("%0.1f"%(i/n))
129 | if i> 0 and i % Delta == 0:
130 | # index counts from 0
131 | if Delta > 1:
132 | max_k = max(k_idx_his[-Delta:-1]) +1;
133 | else:
134 | max_k = k_idx_his[-1] +1;
135 | K = min(max_k +1, N)
136 |
137 | i_idx = i
138 | h = channel[i_idx,:]
139 |
140 | # the action selection must be either 'OP' or 'KNN'
141 | m_list = mem.decode(h, K, decoder_mode)
142 |
143 | r_list = []
144 | for m in m_list:
145 |         # only active users are used to compute the rate
146 | r_list.append(bisection(h[0:N_active]/1000000, m[0:N_active])[0])
147 |
148 | # memorize the largest reward
149 | rate_his.append(np.max(r_list))
150 | rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
151 | # record the index of largest reward
152 | k_idx_his.append(np.argmax(r_list))
153 | # record K in case of adaptive K
154 | K_his.append(K)
155 | # save the mode with largest reward
156 | mode_his.append(m_list[np.argmax(r_list)])
157 | # if i <0.6*n:
158 | # encode the mode with largest reward
159 | mem.encode(h, m_list[np.argmax(r_list)])
160 |
161 |
162 | total_time=time.time()-start_time
163 | mem.plot_cost()
164 | plot_rate(rate_his_ratio)
165 |
166 | print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test: -1])/num_test)
167 | print('Total time consumed:%s'%total_time)
168 | print('Average time per channel:%s'%(total_time/n))
169 |
170 | # save data into txt
171 | save_to_txt(k_idx_his, "k_idx_his.txt")
172 | save_to_txt(K_his, "K_his.txt")
173 | save_to_txt(mem.cost_his, "cost_his.txt")
174 | save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
175 | save_to_txt(mode_his, "mode_his.txt")
176 |
177 |
178 |
179 |
--------------------------------------------------------------------------------
/main.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyPGw6GfH7V4bE9B/kfbRbog",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | },
17 | "accelerator": "GPU",
18 | "gpuClass": "standard"
19 | },
20 | "cells": [
21 | {
22 | "cell_type": "markdown",
23 | "metadata": {
24 | "id": "view-in-github",
25 | "colab_type": "text"
26 | },
27 | "source": [
28 |         ""
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": null,
34 | "metadata": {
35 | "id": "GcZqNc_bGfkz"
36 | },
37 | "outputs": [],
38 | "source": [
39 | "from google.colab import files"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "source": [
45 | "!cp content/drive/MyDrive/Importing\\ Scripts \\as\\ Modules/DROO-master/main.py /content"
46 | ],
47 | "metadata": {
48 | "colab": {
49 | "base_uri": "https://localhost:8080/"
50 | },
51 | "id": "B_yN8BR3G7bN",
52 | "outputId": "bc83f6e3-f21c-4fcc-b7b3-6e92d2b68de9"
53 | },
54 | "execution_count": null,
55 | "outputs": [
56 | {
57 | "output_type": "stream",
58 | "name": "stdout",
59 | "text": [
60 | "cp: cannot stat 'content/drive/MyDrive/Importing Scripts': No such file or directory\n",
61 | "cp: cannot stat 'as Modules/DROO-master/main.py': No such file or directory\n"
62 | ]
63 | }
64 | ]
65 | },
66 | {
67 | "cell_type": "code",
68 | "source": [
69 | "# #################################################################\n",
70 | "# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks\n",
71 | "#\n",
72 |         "# This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with training and validation samples, and finally tests the DNN with test data.\n",
73 | "#\n",
74 | "# Input: ./data/data_#.mat\n",
75 | "# Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes\n",
76 | "# -----------------------------------------------------------------\n",
77 | "# | wireless channel gain | input_h |\n",
78 | "# -----------------------------------------------------------------\n",
79 | "# | computing mode selection | output_mode |\n",
80 | "# -----------------------------------------------------------------\n",
81 | "# | energy broadcasting parameter | output_a |\n",
82 | "# -----------------------------------------------------------------\n",
83 | "# | transmit time of wireless device | output_tau |\n",
84 | "# -----------------------------------------------------------------\n",
85 | "# | weighted sum computation rate | output_obj |\n",
86 | "# -----------------------------------------------------------------\n",
87 | "#\n",
88 | "#\n",
89 | "# References:\n",
90 |         "# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, \"Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks,\" in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.\n",
91 | "# [2] S. Bi and Y. J. Zhang, “Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading,” IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.\n",
92 | "#\n",
93 | "# version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)\n",
94 | "# #################################################################\n",
95 | "\n",
96 | "\n",
97 |         "import scipy.io as sio                     # import scipy.io for .mat file I/O\n",
98 | "import numpy as np # import numpy\n",
99 | "\n",
100 | "from memory import MemoryDNN\n",
101 | "from optimization import bisection\n",
102 | "\n",
103 | "import time\n",
104 | "\n",
105 | "\n",
106 | "def plot_rate( rate_his, rolling_intv = 50):\n",
107 | " import matplotlib.pyplot as plt\n",
108 | " import pandas as pd\n",
109 | " import matplotlib as mpl\n",
110 | "\n",
111 | " rate_array = np.asarray(rate_his)\n",
112 | " df = pd.DataFrame(rate_his)\n",
113 | "\n",
114 | "\n",
115 | " mpl.style.use('seaborn')\n",
116 | " fig, ax = plt.subplots(figsize=(15,8))\n",
117 | "# rolling_intv = 20\n",
118 | "\n",
119 | " plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')\n",
120 | " plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)\n",
121 | " plt.ylabel('Normalized Computation Rate')\n",
122 | " plt.xlabel('Time Frames')\n",
123 | " plt.show()\n",
124 | "\n",
125 | "def save_to_txt(rate_his, file_path):\n",
126 | " with open(file_path, 'w') as f:\n",
127 | " for rate in rate_his:\n",
128 | " f.write(\"%s \\n\" % rate)\n",
129 | "\n",
130 | "if __name__ == \"__main__\":\n",
131 | " '''\n",
132 |         "    This algorithm generates K modes from the DNN, and chooses the one with the largest\n",
133 | " reward. The mode with largest reward is stored in the memory, which is\n",
134 | " further used to train the DNN.\n",
135 | " Adaptive K is implemented. K = max(K, K_his[-memory_size])\n",
136 | " '''\n",
137 | "\n",
138 | " N = 10 # number of users\n",
139 | " n = 30000 # number of time frames\n",
140 | " K = N # initialize K = N\n",
141 | " decoder_mode = 'OP' # the quantization mode could be 'OP' (Order-preserving) or 'KNN'\n",
142 | " Memory = 1024 # capacity of memory structure\n",
143 | " Delta = 32 # Update interval for adaptive K\n",
144 | "\n",
145 | " print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))\n",
146 | " # Load data\n",
147 | " channel = sio.loadmat('./data/data_%d' %N)['input_h']\n",
148 | " rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.\n",
149 | "\n",
150 |         "    # scale h up to be close to 1 for better training; it is a trick widely adopted in deep learning\n",
151 | " channel = channel * 1000000\n",
152 | "\n",
153 | " # generate the train and test data sample index\n",
154 |         "    # data are split as 80:20\n",
155 | " # training data are randomly sampled with duplication if n > total data size\n",
156 | "\n",
157 | " split_idx = int(.8* len(channel))\n",
158 |         "    num_test = min(len(channel) - split_idx, n - int(.8* n)) # test data size\n",
159 | "\n",
160 | "\n",
161 | " mem = MemoryDNN(net = [N, 120, 80, N],\n",
162 | " learning_rate = 0.01,\n",
163 | " training_interval=10,\n",
164 | " batch_size=128,\n",
165 | " memory_size=Memory\n",
166 | " )\n",
167 | "\n",
168 | " start_time=time.time()\n",
169 | "\n",
170 | " rate_his = []\n",
171 | " rate_his_ratio = []\n",
172 | " mode_his = []\n",
173 | " k_idx_his = []\n",
174 | " K_his = []\n",
175 | " for i in range(n):\n",
176 | " if i % (n//10) == 0:\n",
177 | " print(\"%0.1f\"%(i/n))\n",
178 | " if i> 0 and i % Delta == 0:\n",
179 | " # index counts from 0\n",
180 | " if Delta > 1:\n",
181 | " max_k = max(k_idx_his[-Delta:-1]) +1;\n",
182 | " else:\n",
183 | " max_k = k_idx_his[-1] +1;\n",
184 | " K = min(max_k +1, N)\n",
185 | "\n",
186 | " if i < n - num_test:\n",
187 | " # training\n",
188 | " i_idx = i % split_idx\n",
189 | " else:\n",
190 | " # test\n",
191 | " i_idx = i - n + num_test + split_idx\n",
192 | "\n",
193 | " h = channel[i_idx,:]\n",
194 | "\n",
195 | " # the action selection must be either 'OP' or 'KNN'\n",
196 | " m_list = mem.decode(h, K, decoder_mode)\n",
197 | "\n",
198 | " r_list = []\n",
199 | " for m in m_list:\n",
200 | " r_list.append(bisection(h/1000000, m)[0])\n",
201 | " \n",
202 | " # encode the mode with largest reward\n",
203 | " mem.encode(h, m_list[np.argmax(r_list)])\n",
204 | " # the main code for DROO training ends here\n",
205 | " \n",
206 | " \n",
207 | " \n",
208 | " \n",
209 |         "    # the following code stores some metrics of interest for illustration\n",
210 | " # memorize the largest reward\n",
211 | " rate_his.append(np.max(r_list))\n",
212 | " rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])\n",
213 | " # record the index of largest reward\n",
214 | " k_idx_his.append(np.argmax(r_list))\n",
215 | " # record K in case of adaptive K\n",
216 | " K_his.append(K)\n",
217 | " mode_his.append(m_list[np.argmax(r_list)])\n",
218 | "\n",
219 | "\n",
220 | " total_time=time.time()-start_time\n",
221 | " mem.plot_cost()\n",
222 | " plot_rate(rate_his_ratio)\n",
223 | "\n",
224 | " print(\"Averaged normalized computation rate:\", sum(rate_his_ratio[-num_test: -1])/num_test)\n",
225 | " print('Total time consumed:%s'%total_time)\n",
226 | " print('Average time per channel:%s'%(total_time/n))\n",
227 | "\n",
228 | " # save data into txt\n",
229 | " save_to_txt(k_idx_his, \"k_idx_his.txt\")\n",
230 | " save_to_txt(K_his, \"K_his.txt\")\n",
231 | " save_to_txt(mem.cost_his, \"cost_his.txt\")\n",
232 | " save_to_txt(rate_his_ratio, \"rate_his_ratio.txt\")\n",
233 | " save_to_txt(mode_his, \"mode_his.txt\")"
234 | ],
235 | "metadata": {
236 | "id": "eyzq8nm0PnRy"
237 | },
238 | "execution_count": null,
239 | "outputs": []
240 | },
241 | {
242 | "cell_type": "markdown",
243 | "source": [
244 | "# New Section"
245 | ],
246 | "metadata": {
247 | "id": "MkqI9XbFHDmT"
248 | }
249 | }
250 | ]
251 | }
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
3 | #
4 | # This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with training and validation samples, and finally tests the DNN with test data.
5 | #
6 | # Input: ./data/data_#.mat
7 | # Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes
8 | # -----------------------------------------------------------------
9 | # | wireless channel gain | input_h |
10 | # -----------------------------------------------------------------
11 | # | computing mode selection | output_mode |
12 | # -----------------------------------------------------------------
13 | # | energy broadcasting parameter | output_a |
14 | # -----------------------------------------------------------------
15 | # | transmit time of wireless device | output_tau |
16 | # -----------------------------------------------------------------
17 | # | weighted sum computation rate | output_obj |
18 | # -----------------------------------------------------------------
19 | #
20 | #
21 | # References:
22 | # [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.
23 | # [2] S. Bi and Y. J. Zhang, “Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading,” IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.
24 | #
25 | # version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
26 | # #################################################################
27 |
28 |
29 | import scipy.io as sio                     # import scipy.io for .mat file I/O
30 | import numpy as np # import numpy
31 |
32 | from memory import MemoryDNN
33 | from optimization import bisection
34 |
35 | import time
36 |
37 |
38 | def plot_rate( rate_his, rolling_intv = 50):
39 | import matplotlib.pyplot as plt
40 | import pandas as pd
41 | import matplotlib as mpl
42 |
43 | rate_array = np.asarray(rate_his)
44 | df = pd.DataFrame(rate_his)
45 |
46 |
47 | mpl.style.use('seaborn')
48 | fig, ax = plt.subplots(figsize=(15,8))
49 | # rolling_intv = 20
50 |
51 | plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')
52 | plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)
53 | plt.ylabel('Normalized Computation Rate')
54 | plt.xlabel('Time Frames')
55 | plt.show()
56 |
57 | def save_to_txt(rate_his, file_path):
58 | with open(file_path, 'w') as f:
59 | for rate in rate_his:
60 | f.write("%s \n" % rate)
61 |
62 | if __name__ == "__main__":
63 | '''
64 |     This algorithm generates K modes from the DNN, and chooses the one with the largest
65 | reward. The mode with largest reward is stored in the memory, which is
66 | further used to train the DNN.
67 | Adaptive K is implemented. K = max(K, K_his[-memory_size])
68 | '''
69 |
70 | N = 10 # number of users
71 | n = 30000 # number of time frames
72 | K = N # initialize K = N
73 | decoder_mode = 'OP' # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
74 | Memory = 1024 # capacity of memory structure
75 | Delta = 32 # Update interval for adaptive K
76 |
77 | print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
78 | # Load data
79 | channel = sio.loadmat('./data/data_%d' %N)['input_h']
80 | rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.
81 |
82 |     # scale h up to be close to 1 for better training; it is a trick widely adopted in deep learning
83 | channel = channel * 1000000
84 |
85 | # generate the train and test data sample index
86 |     # data are split as 80:20
87 | # training data are randomly sampled with duplication if n > total data size
88 |
89 | split_idx = int(.8* len(channel))
90 |     num_test = min(len(channel) - split_idx, n - int(.8* n)) # test data size
91 |
92 |
93 | mem = MemoryDNN(net = [N, 120, 80, N],
94 | learning_rate = 0.01,
95 | training_interval=10,
96 | batch_size=128,
97 | memory_size=Memory
98 | )
99 |
100 | start_time=time.time()
101 |
102 | rate_his = []
103 | rate_his_ratio = []
104 | mode_his = []
105 | k_idx_his = []
106 | K_his = []
107 | for i in range(n):
108 | if i % (n//10) == 0:
109 | print("%0.1f"%(i/n))
110 | if i> 0 and i % Delta == 0:
111 | # index counts from 0
112 | if Delta > 1:
113 | max_k = max(k_idx_his[-Delta:-1]) +1;
114 | else:
115 | max_k = k_idx_his[-1] +1;
116 | K = min(max_k +1, N)
117 |
118 | if i < n - num_test:
119 | # training
120 | i_idx = i % split_idx
121 | else:
122 | # test
123 | i_idx = i - n + num_test + split_idx
124 |
125 | h = channel[i_idx,:]
126 |
127 | # the action selection must be either 'OP' or 'KNN'
128 | m_list = mem.decode(h, K, decoder_mode)
129 |
130 | r_list = []
131 | for m in m_list:
132 | r_list.append(bisection(h/1000000, m)[0])
133 |
134 | # encode the mode with largest reward
135 | mem.encode(h, m_list[np.argmax(r_list)])
136 | # the main code for DROO training ends here
137 |
138 |
139 |
140 |
141 |         # the following code stores some metrics of interest for illustration
142 | # memorize the largest reward
143 | rate_his.append(np.max(r_list))
144 | rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
145 | # record the index of largest reward
146 | k_idx_his.append(np.argmax(r_list))
147 | # record K in case of adaptive K
148 | K_his.append(K)
149 | mode_his.append(m_list[np.argmax(r_list)])
150 |
151 |
152 | total_time=time.time()-start_time
153 | mem.plot_cost()
154 | plot_rate(rate_his_ratio)
155 |
156 | print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test: -1])/num_test)
157 | print('Total time consumed:%s'%total_time)
158 | print('Average time per channel:%s'%(total_time/n))
159 |
160 | # save data into txt
161 | save_to_txt(k_idx_his, "k_idx_his.txt")
162 | save_to_txt(K_his, "K_his.txt")
163 | save_to_txt(mem.cost_his, "cost_his.txt")
164 | save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
165 | save_to_txt(mode_his, "mode_his.txt")
166 |
--------------------------------------------------------------------------------
/mainPyTorch.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
3 | #
4 | # This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with training and validation samples, and finally tests the DNN with test data.
5 | #
6 | # Input: ./data/data_#.mat
7 | # Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes
8 | # -----------------------------------------------------------------
9 | # | wireless channel gain | input_h |
10 | # -----------------------------------------------------------------
11 | # | computing mode selection | output_mode |
12 | # -----------------------------------------------------------------
13 | # | energy broadcasting parameter | output_a |
14 | # -----------------------------------------------------------------
15 | # | transmit time of wireless device | output_tau |
16 | # -----------------------------------------------------------------
17 | # | weighted sum computation rate | output_obj |
18 | # -----------------------------------------------------------------
19 | #
20 | #
21 | # References:
22 | # [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.
23 | # [2] S. Bi and Y. J. Zhang, “Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading,” IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.
24 | #
25 | # version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
26 | # #################################################################
27 |
28 |
29 | import scipy.io as sio                     # import scipy.io for .mat file I/O
30 | import numpy as np # import numpy
31 |
32 | # Implemented based on PyTorch
33 | from memoryPyTorch import MemoryDNN
34 | from optimization import bisection
35 |
36 | import time
37 |
38 |
39 | def plot_rate(rate_his, rolling_intv=50):
40 | import matplotlib.pyplot as plt
41 | import pandas as pd
42 | import matplotlib as mpl
43 |
44 | rate_array = np.asarray(rate_his)
45 | df = pd.DataFrame(rate_his)
46 |
47 |
48 | mpl.style.use('seaborn')
49 | fig, ax = plt.subplots(figsize=(15, 8))
50 | # rolling_intv = 20
51 |
52 | plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')
53 | plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)
54 | plt.ylabel('Normalized Computation Rate')
55 | plt.xlabel('Time Frames')
56 | plt.show()
57 |
58 | def save_to_txt(rate_his, file_path):
59 | with open(file_path, 'w') as f:
60 | for rate in rate_his:
61 | f.write("%s \n" % rate)
62 |
63 | if __name__ == "__main__":
64 | '''
65 |     This algorithm generates K modes from the DNN, and chooses the one with the largest
66 | reward. The mode with largest reward is stored in the memory, which is
67 | further used to train the DNN.
68 | Adaptive K is implemented. K = max(K, K_his[-memory_size])
69 | '''
70 |
71 | N = 10 # number of users
72 | n = 30000 # number of time frames
73 | K = N # initialize K = N
74 | decoder_mode = 'OP' # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
75 | Memory = 1024 # capacity of memory structure
76 | Delta = 32 # Update interval for adaptive K
77 |
78 | print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
79 | # Load data
80 | channel = sio.loadmat('./data/data_%d' %N)['input_h']
81 | rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.
82 |
83 |     # scale h up to be close to 1 for better training; it is a trick widely adopted in deep learning
84 | channel = channel * 1000000
85 |
86 | # generate the train and test data sample index
87 |     # data are split as 80:20
88 | # training data are randomly sampled with duplication if n > total data size
89 |
90 | split_idx = int(.8 * len(channel))
91 |     num_test = min(len(channel) - split_idx, n - int(.8 * n)) # test data size
92 |
93 |
94 | mem = MemoryDNN(net = [N, 120, 80, N],
95 | learning_rate = 0.01,
96 | training_interval=10,
97 | batch_size=128,
98 | memory_size=Memory
99 | )
100 |
101 | start_time = time.time()
102 |
103 | rate_his = []
104 | rate_his_ratio = []
105 | mode_his = []
106 | k_idx_his = []
107 | K_his = []
108 | for i in range(n):
109 | if i % (n//10) == 0:
110 | print("%0.1f"%(i/n))
111 | if i> 0 and i % Delta == 0:
112 | # index counts from 0
113 | if Delta > 1:
114 | max_k = max(k_idx_his[-Delta:-1]) +1;
115 | else:
116 | max_k = k_idx_his[-1] +1;
117 | K = min(max_k +1, N)
118 |
119 | if i < n - num_test:
120 | # training
121 | i_idx = i % split_idx
122 | else:
123 | # test
124 | i_idx = i - n + num_test + split_idx
125 |
126 | h = channel[i_idx,:]
127 |
128 | # the action selection must be either 'OP' or 'KNN'
129 | m_list = mem.decode(h, K, decoder_mode)
130 |
131 | r_list = []
132 | for m in m_list:
133 | r_list.append(bisection(h/1000000, m)[0])
134 |
135 | # encode the mode with largest reward
136 | mem.encode(h, m_list[np.argmax(r_list)])
137 | # the main code for DROO training ends here
138 |
139 |
140 |
141 |
142 |         # the following code stores some metrics of interest for illustration
143 | # memorize the largest reward
144 | rate_his.append(np.max(r_list))
145 | rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
146 | # record the index of largest reward
147 | k_idx_his.append(np.argmax(r_list))
148 | # record K in case of adaptive K
149 | K_his.append(K)
150 | mode_his.append(m_list[np.argmax(r_list)])
151 |
152 |
153 | total_time=time.time()-start_time
154 | mem.plot_cost()
155 | plot_rate(rate_his_ratio)
156 |
157 | print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test: -1])/num_test)
158 | print('Total time consumed:%s'%total_time)
159 | print('Average time per channel:%s'%(total_time/n))
160 |
161 | # save data into txt
162 | save_to_txt(k_idx_his, "k_idx_his.txt")
163 | save_to_txt(K_his, "K_his.txt")
164 | save_to_txt(mem.cost_his, "cost_his.txt")
165 | save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
166 | save_to_txt(mode_his, "mode_his.txt")
167 |
--------------------------------------------------------------------------------
/mainTF2:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
3 | #
4 | # This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with training and validation samples, and finally tests the DNN with test data.
5 | #
6 | # Input: ./data/data_#.mat
7 | # Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes
8 | # -----------------------------------------------------------------
9 | # | wireless channel gain | input_h |
10 | # -----------------------------------------------------------------
11 | # | computing mode selection | output_mode |
12 | # -----------------------------------------------------------------
13 | # | energy broadcasting parameter | output_a |
14 | # -----------------------------------------------------------------
15 | # | transmit time of wireless device | output_tau |
16 | # -----------------------------------------------------------------
17 | # | weighted sum computation rate | output_obj |
18 | # -----------------------------------------------------------------
19 | #
20 | #
21 | # References:
22 | # [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.
23 | # [2] S. Bi and Y. J. Zhang, “Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading,” IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.
24 | #
25 | # version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
26 | # #################################################################
27 |
28 |
29 | import scipy.io as sio                     # import scipy.io for .mat file I/O
30 | import numpy as np # import numpy
31 |
32 | # for tensorflow2
33 | from memoryTF2 import MemoryDNN
34 | from optimization import bisection
35 |
36 | import time
37 |
38 |
39 | def plot_rate( rate_his, rolling_intv = 50):
40 | import matplotlib.pyplot as plt
41 | import pandas as pd
42 | import matplotlib as mpl
43 |
44 | rate_array = np.asarray(rate_his)
45 | df = pd.DataFrame(rate_his)
46 |
47 |
48 | mpl.style.use('seaborn')
49 | fig, ax = plt.subplots(figsize=(15,8))
50 | # rolling_intv = 20
51 |
52 | plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')
53 | plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)
54 | plt.ylabel('Normalized Computation Rate')
55 | plt.xlabel('Time Frames')
56 | plt.show()
57 |
58 | def save_to_txt(rate_his, file_path):
59 | with open(file_path, 'w') as f:
60 | for rate in rate_his:
61 | f.write("%s \n" % rate)
62 |
63 | if __name__ == "__main__":
64 | '''
65 |     This algorithm generates K modes from the DNN, and chooses the one with the largest
66 | reward. The mode with largest reward is stored in the memory, which is
67 | further used to train the DNN.
68 | Adaptive K is implemented. K = max(K, K_his[-memory_size])
69 | '''
70 |
71 | N = 10 # number of users
72 | n = 30000 # number of time frames
73 | K = N # initialize K = N
74 | decoder_mode = 'OP' # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
75 | Memory = 1024 # capacity of memory structure
76 | Delta = 32 # Update interval for adaptive K
77 |
78 | print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
79 | # Load data
80 | channel = sio.loadmat('./data/data_%d' %N)['input_h']
81 | rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.
82 |
83 |     # scale h up to be close to 1 for better training; it is a trick widely adopted in deep learning
84 | channel = channel * 1000000
85 |
86 | # generate the train and test data sample index
87 |     # data are split as 80:20
88 | # training data are randomly sampled with duplication if n > total data size
89 |
90 | split_idx = int(.8* len(channel))
91 |     num_test = min(len(channel) - split_idx, n - int(.8* n)) # test data size
92 |
93 |
94 | mem = MemoryDNN(net = [N, 120, 80, N],
95 | learning_rate = 0.01,
96 | training_interval=10,
97 | batch_size=128,
98 | memory_size=Memory
99 | )
100 |
101 | start_time=time.time()
102 |
103 | rate_his = []
104 | rate_his_ratio = []
105 | mode_his = []
106 | k_idx_his = []
107 | K_his = []
108 | for i in range(n):
109 | if i % (n//10) == 0:
110 | print("%0.1f"%(i/n))
111 | if i> 0 and i % Delta == 0:
112 | # index counts from 0
113 | if Delta > 1:
114 | max_k = max(k_idx_his[-Delta:-1]) +1;
115 | else:
116 | max_k = k_idx_his[-1] +1;
117 | K = min(max_k +1, N)
118 |
119 | if i < n - num_test:
120 | # training
121 | i_idx = i % split_idx
122 | else:
123 | # test
124 | i_idx = i - n + num_test + split_idx
125 |
126 | h = channel[i_idx,:]
127 |
128 | # the action selection must be either 'OP' or 'KNN'
129 | m_list = mem.decode(h, K, decoder_mode)
130 |
131 | r_list = []
132 | for m in m_list:
133 | r_list.append(bisection(h/1000000, m)[0])
134 |
135 | # encode the mode with largest reward
136 | mem.encode(h, m_list[np.argmax(r_list)])
137 | # the main code for DROO training ends here
138 |
139 |
140 |
141 |
142 |         # the following code stores some metrics of interest for illustration
143 | # memorize the largest reward
144 | rate_his.append(np.max(r_list))
145 | rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
146 | # record the index of largest reward
147 | k_idx_his.append(np.argmax(r_list))
148 | # record K in case of adaptive K
149 | K_his.append(K)
150 | mode_his.append(m_list[np.argmax(r_list)])
151 |
152 |
153 | total_time=time.time()-start_time
154 | mem.plot_cost()
155 | plot_rate(rate_his_ratio)
156 |
157 | print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test: -1])/num_test)
158 | print('Total time consumed:%s'%total_time)
159 | print('Average time per channel:%s'%(total_time/n))
160 |
161 | # save data into txt
162 | save_to_txt(k_idx_his, "k_idx_his.txt")
163 | save_to_txt(K_his, "K_his.txt")
164 | save_to_txt(mem.cost_his, "cost_his.txt")
165 | save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
166 | save_to_txt(mode_his, "mode_his.txt")
167 |
--------------------------------------------------------------------------------
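A minimal standalone sketch of the adaptive-K rule used in the main script above (toy history values assumed, not taken from the repository): every Delta frames, K is reset to one more than the largest candidate index actually selected recently, capped at the number of users N.

```python
def update_K(k_idx_his, N, Delta):
    # k_idx_his: 0-based rank of the best candidate chosen in each past frame
    if Delta > 1:
        max_k = max(k_idx_his[-Delta:-1]) + 1
    else:
        max_k = k_idx_his[-1] + 1
    return min(max_k + 1, N)  # keep one spare candidate, never exceed N

# toy history: the best mode was never ranked worse than index 3, so K drops to 5
print(update_K([0, 3, 1, 2], N=10, Delta=4))  # -> 5
```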
/memory.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "provenance": [],
7 | "authorship_tag": "ABX9TyMFps9x/kObIv6+SfRFYpWe",
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | },
14 | "language_info": {
15 | "name": "python"
16 | }
17 | },
18 | "cells": [
40 | {
41 | "cell_type": "markdown",
42 | "source": [
43 | "# **This Memory File contains memory operation including encoding and decoding operations.**\n",
44 | "version 1.0 -- January 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)\n"
45 | ],
46 | "metadata": {
47 | "id": "lpeCAT7KTAZe"
48 | }
49 | },
57 | {
58 | "cell_type": "code",
59 | "source": [
60 | "from __future__ import print_function\n",
61 | "import tensorflow as tf\n",
62 | "import numpy as np"
63 | ],
64 | "metadata": {
65 | "id": "taUp-kQTSVWU"
66 | },
67 | "execution_count": 1,
68 | "outputs": []
69 | },
70 | {
71 | "cell_type": "code",
72 | "execution_count": 2,
73 | "metadata": {
74 | "colab": {
75 | "base_uri": "https://localhost:8080/"
76 | },
77 | "id": "VljS8oG6SJkK",
78 | "outputId": "4bb4f086-69f6-4516-9ae2-2eb56af9c0e6"
79 | },
80 | "outputs": [
81 | {
82 | "output_type": "stream",
83 | "name": "stderr",
84 | "text": [
85 | "<>:13: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
86 | "<>:130: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
87 | "<>:132: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
88 | "<>:162: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
89 | "<>:13: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
90 | "<>:130: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
91 | "<>:132: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
92 | "<>:162: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
93 | ":13: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
94 | " assert(len(net) is 4) # only 4-layer DNN\n",
95 | ":130: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
96 | " if mode is 'OP':\n",
97 | ":132: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
98 | " elif mode is 'KNN':\n",
99 | ":162: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n",
100 | " if len(self.enumerate_actions) is 0:\n"
101 | ]
102 | }
103 | ],
104 | "source": [
105 | "# DNN network for memory\n",
106 | "class MemoryDNN:\n",
107 | " def __init__(\n",
108 | " self,\n",
109 | " net,\n",
110 | " learning_rate = 0.01,\n",
111 | " training_interval=10, \n",
112 | " batch_size=100, \n",
113 | " memory_size=1000,\n",
114 | " output_graph=False\n",
115 | " ):\n",
116 | " # net: [n_input, n_hidden_1st, n_hidded_2ed, n_output]\n",
117 | " assert(len(net) is 4) # only 4-layer DNN\n",
118 | "\n",
119 | " self.net = net\n",
120 | " self.training_interval = training_interval # learn every #training_interval\n",
121 | " self.lr = learning_rate\n",
122 | " self.batch_size = batch_size\n",
123 | " self.memory_size = memory_size\n",
124 | " \n",
125 | " # store all binary actions\n",
126 | " self.enumerate_actions = []\n",
127 | "\n",
128 | " # stored # memory entry\n",
129 | " self.memory_counter = 1\n",
130 | "\n",
131 | " # store training cost\n",
132 | " self.cost_his = []\n",
133 | "\n",
134 | " # reset graph \n",
135 | " tf.reset_default_graph()\n",
136 | "\n",
137 | " # initialize zero memory [h, m]\n",
138 | " self.memory = np.zeros((self.memory_size, self.net[0]+ self.net[-1]))\n",
139 | "\n",
140 | " # construct memory network\n",
141 | " self._build_net()\n",
142 | "\n",
143 | " self.sess = tf.Session()\n",
144 | "\n",
145 | " # for tensorboard\n",
146 | " if output_graph:\n",
147 | " # $ tensorboard --logdir=logs\n",
148 | " # tf.train.SummaryWriter soon be deprecated, use following\n",
149 | " tf.summary.FileWriter(\"logs/\", self.sess.graph)\n",
150 | "\n",
151 | " self.sess.run(tf.global_variables_initializer())\n",
152 | "\n",
153 | "\n",
154 | " def _build_net(self):\n",
155 | " def build_layers(h, c_names, net, w_initializer, b_initializer):\n",
156 | " with tf.variable_scope('l1'):\n",
157 | " w1 = tf.get_variable('w1', [net[0], net[1]], initializer=w_initializer, collections=c_names)\n",
158 | " b1 = tf.get_variable('b1', [1, self.net[1]], initializer=b_initializer, collections=c_names)\n",
159 | " l1 = tf.nn.relu(tf.matmul(h, w1) + b1)\n",
160 | "\n",
161 | " with tf.variable_scope('l2'):\n",
162 | " w2 = tf.get_variable('w2', [net[1], net[2]], initializer=w_initializer, collections=c_names)\n",
163 | " b2 = tf.get_variable('b2', [1, net[2]], initializer=b_initializer, collections=c_names)\n",
164 | " l2 = tf.nn.relu(tf.matmul(l1, w2) + b2)\n",
165 | "\n",
166 | " with tf.variable_scope('M'):\n",
167 | " w3 = tf.get_variable('w3', [net[2], net[3]], initializer=w_initializer, collections=c_names)\n",
168 | " b3 = tf.get_variable('b3', [1, net[3]], initializer=b_initializer, collections=c_names)\n",
169 | " out = tf.matmul(l2, w3) + b3\n",
170 | "\n",
171 | " return out\n",
172 | "\n",
173 | " # ------------------ build memory_net ------------------\n",
174 | " self.h = tf.placeholder(tf.float32, [None, self.net[0]], name='h') # input\n",
175 | " self.m = tf.placeholder(tf.float32, [None, self.net[-1]], name='mode') # for calculating loss\n",
176 | " self.is_train = tf.placeholder(\"bool\") # train or evaluate\n",
177 | "\n",
178 | " with tf.variable_scope('memory_net'):\n",
179 | " c_names, w_initializer, b_initializer = \\\n",
180 | " ['memory_net_params', tf.GraphKeys.GLOBAL_VARIABLES], \\\n",
181 | " tf.random_normal_initializer(0., 1/self.net[0]), tf.constant_initializer(0.1) # config of layers\n",
182 | "\n",
183 | " self.m_pred = build_layers(self.h, c_names, self.net, w_initializer, b_initializer)\n",
184 | "\n",
185 | " with tf.variable_scope('loss'):\n",
186 | " self.loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels = self.m, logits = self.m_pred))\n",
187 | "\n",
188 | " with tf.variable_scope('train'):\n",
189 | " self._train_op = tf.train.AdamOptimizer(self.lr, 0.09).minimize(self.loss)\n",
190 | "\n",
191 | "\n",
192 | " def remember(self, h, m):\n",
193 | " # replace the old memory with new memory\n",
194 | " idx = self.memory_counter % self.memory_size\n",
195 | " self.memory[idx, :] = np.hstack((h,m))\n",
196 | "\n",
197 | " self.memory_counter += 1\n",
198 | "\n",
199 | " def encode(self, h, m):\n",
200 | " # encoding the entry\n",
201 | " self.remember(h, m)\n",
202 | " # train the DNN every 10 step\n",
203 | "# if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:\n",
204 | " if self.memory_counter % self.training_interval == 0:\n",
205 | " self.learn()\n",
206 | "\n",
207 | " def learn(self):\n",
208 | " # sample batch memory from all memory\n",
209 | " if self.memory_counter > self.memory_size:\n",
210 | " sample_index = np.random.choice(self.memory_size, size=self.batch_size)\n",
211 | " else:\n",
212 | " sample_index = np.random.choice(self.memory_counter, size=self.batch_size)\n",
213 | " batch_memory = self.memory[sample_index, :]\n",
214 | " \n",
215 | " h_train = batch_memory[:, 0: self.net[0]]\n",
216 | " m_train = batch_memory[:, self.net[0]:]\n",
217 | " \n",
218 | " # print(h_train)\n",
219 | " # print(m_train)\n",
220 | "\n",
221 | " # train the DNN\n",
222 | " _, self.cost = self.sess.run([self._train_op, self.loss], \n",
223 | " feed_dict={self.h: h_train, self.m: m_train})\n",
224 | "\n",
225 | " assert(self.cost >0) \n",
226 | " self.cost_his.append(self.cost)\n",
227 | "\n",
228 | " def decode(self, h, k = 1, mode = 'OP'):\n",
229 | " # to have batch dimension when feed into tf placeholder\n",
230 | " h = h[np.newaxis, :]\n",
231 | "\n",
232 | " m_pred = self.sess.run(self.m_pred, feed_dict={self.h: h})\n",
233 | "\n",
234 | " if mode is 'OP':\n",
235 | " return self.knm(m_pred[0], k)\n",
236 | " elif mode is 'KNN':\n",
237 | " return self.knn(m_pred[0], k)\n",
238 | " else:\n",
239 | " print(\"The action selection must be 'OP' or 'KNN'\")\n",
240 | " \n",
241 | " def knm(self, m, k = 1):\n",
242 | " # return k-nearest-mode\n",
243 | " m_list = []\n",
244 | " \n",
245 | " # generate the first binary offloading decision \n",
246 | " # note that here 'm' is the output of DNN before the sigmoid activation function, in the field of all real number. \n",
247 | " # Therefore, we compare it with '0' instead of 0.5 in equation (8). Since, sigmod(0) = 0.5.\n",
248 | " m_list.append(1*(m>0))\n",
249 | " \n",
250 | " if k > 1:\n",
251 | " # generate the remaining K-1 binary offloading decisions with respect to equation (9)\n",
252 | " m_abs = abs(m)\n",
253 | " idx_list = np.argsort(m_abs)[:k-1]\n",
254 | " for i in range(k-1):\n",
255 | " if m[idx_list[i]] >0:\n",
256 | " # set a positive user to 0\n",
257 | " m_list.append(1*(m - m[idx_list[i]] > 0))\n",
258 | " else:\n",
259 | " # set a negtive user to 1\n",
260 | " m_list.append(1*(m - m[idx_list[i]] >= 0))\n",
261 | "\n",
262 | " return m_list\n",
263 | " \n",
264 | " def knn(self, m, k = 1):\n",
265 | " # list all 2^N binary offloading actions\n",
266 | " if len(self.enumerate_actions) is 0:\n",
267 | " import itertools\n",
268 | " self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))\n",
269 | "\n",
270 | " # the 2-norm\n",
271 | " sqd = ((self.enumerate_actions - m)**2).sum(1)\n",
272 | " idx = np.argsort(sqd)\n",
273 | " return self.enumerate_actions[idx[:k]]\n",
274 | " \n",
275 | "\n",
276 | " def plot_cost(self):\n",
277 | " import matplotlib.pyplot as plt\n",
278 | " plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)\n",
279 | " plt.ylabel('Training Loss')\n",
280 | " plt.xlabel('Time Frames')\n",
281 | " plt.show()"
282 | ]
283 | }
284 | ]
285 | }
--------------------------------------------------------------------------------
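The order-preserving quantizer ("knm") in the notebook above, and in memory.py below, operates on raw DNN logits, so the first candidate thresholds at 0 (sigmoid(0) = 0.5) and each further candidate flips the entry closest to the threshold. A self-contained NumPy sketch with an assumed toy logit vector:

```python
import numpy as np

def knm(m, k=1):
    # m: real-valued logits, one per user
    m_list = [1 * (m > 0)]                        # first decision: sign threshold
    if k > 1:
        idx_list = np.argsort(np.abs(m))[:k - 1]  # entries nearest the threshold
        for i in range(k - 1):
            if m[idx_list[i]] > 0:
                m_list.append(1 * (m - m[idx_list[i]] > 0))   # flip it to 0
            else:
                m_list.append(1 * (m - m[idx_list[i]] >= 0))  # flip it to 1
    return m_list

print(knm(np.array([0.9, -0.1, 0.2, -1.5]), k=3))
# [array([1, 0, 1, 0]), array([1, 1, 1, 0]), array([1, 0, 0, 0])]
```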
/memory.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # This file contains memory operations, including encoding and decoding.
3 | #
4 | # version 1.0 -- January 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
5 | # #################################################################
6 |
7 | from __future__ import print_function
8 | import tensorflow as tf
9 | import numpy as np
10 |
11 |
12 | # DNN network for memory
13 | class MemoryDNN:
14 | def __init__(
15 | self,
16 | net,
17 | learning_rate = 0.01,
18 | training_interval=10,
19 | batch_size=100,
20 | memory_size=1000,
21 | output_graph=False
22 | ):
23 | # net: [n_input, n_hidden_1st, n_hidden_2nd, n_output]
24 | assert(len(net) == 4) # only 4-layer DNN
25 |
26 | self.net = net
27 | self.training_interval = training_interval # learn every #training_interval
28 | self.lr = learning_rate
29 | self.batch_size = batch_size
30 | self.memory_size = memory_size
31 |
32 | # store all binary actions
33 | self.enumerate_actions = []
34 |
35 | # counter of stored memory entries
36 | self.memory_counter = 1
37 |
38 | # store training cost
39 | self.cost_his = []
40 |
41 | # reset graph
42 | tf.reset_default_graph()
43 |
44 | # initialize zero memory [h, m]
45 | self.memory = np.zeros((self.memory_size, self.net[0]+ self.net[-1]))
46 |
47 | # construct memory network
48 | self._build_net()
49 |
50 | self.sess = tf.Session()
51 |
52 | # for tensorboard
53 | if output_graph:
54 | # $ tensorboard --logdir=logs
55 | # tf.train.SummaryWriter soon be deprecated, use following
56 | tf.summary.FileWriter("logs/", self.sess.graph)
57 |
58 | self.sess.run(tf.global_variables_initializer())
59 |
60 |
61 | def _build_net(self):
62 | def build_layers(h, c_names, net, w_initializer, b_initializer):
63 | with tf.variable_scope('l1'):
64 | w1 = tf.get_variable('w1', [net[0], net[1]], initializer=w_initializer, collections=c_names)
65 | b1 = tf.get_variable('b1', [1, self.net[1]], initializer=b_initializer, collections=c_names)
66 | l1 = tf.nn.relu(tf.matmul(h, w1) + b1)
67 |
68 | with tf.variable_scope('l2'):
69 | w2 = tf.get_variable('w2', [net[1], net[2]], initializer=w_initializer, collections=c_names)
70 | b2 = tf.get_variable('b2', [1, net[2]], initializer=b_initializer, collections=c_names)
71 | l2 = tf.nn.relu(tf.matmul(l1, w2) + b2)
72 |
73 | with tf.variable_scope('M'):
74 | w3 = tf.get_variable('w3', [net[2], net[3]], initializer=w_initializer, collections=c_names)
75 | b3 = tf.get_variable('b3', [1, net[3]], initializer=b_initializer, collections=c_names)
76 | out = tf.matmul(l2, w3) + b3
77 |
78 | return out
79 |
80 | # ------------------ build memory_net ------------------
81 | self.h = tf.placeholder(tf.float32, [None, self.net[0]], name='h') # input
82 | self.m = tf.placeholder(tf.float32, [None, self.net[-1]], name='mode') # for calculating loss
83 | self.is_train = tf.placeholder("bool") # train or evaluate
84 |
85 | with tf.variable_scope('memory_net'):
86 | c_names, w_initializer, b_initializer = \
87 | ['memory_net_params', tf.GraphKeys.GLOBAL_VARIABLES], \
88 | tf.random_normal_initializer(0., 1/self.net[0]), tf.constant_initializer(0.1) # config of layers
89 |
90 | self.m_pred = build_layers(self.h, c_names, self.net, w_initializer, b_initializer)
91 |
92 | with tf.variable_scope('loss'):
93 | self.loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels = self.m, logits = self.m_pred))
94 |
95 | with tf.variable_scope('train'):
96 | self._train_op = tf.train.AdamOptimizer(self.lr, 0.09).minimize(self.loss)
97 |
98 |
99 | def remember(self, h, m):
100 | # replace the old memory with new memory
101 | idx = self.memory_counter % self.memory_size
102 | self.memory[idx, :] = np.hstack((h,m))
103 |
104 | self.memory_counter += 1
105 |
106 | def encode(self, h, m):
107 | # encoding the entry
108 | self.remember(h, m)
109 | # train the DNN every training_interval steps
110 | # if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:
111 | if self.memory_counter % self.training_interval == 0:
112 | self.learn()
113 |
114 | def learn(self):
115 | # sample batch memory from all memory
116 | if self.memory_counter > self.memory_size:
117 | sample_index = np.random.choice(self.memory_size, size=self.batch_size)
118 | else:
119 | sample_index = np.random.choice(self.memory_counter, size=self.batch_size)
120 | batch_memory = self.memory[sample_index, :]
121 |
122 | h_train = batch_memory[:, 0: self.net[0]]
123 | m_train = batch_memory[:, self.net[0]:]
124 |
125 | # print(h_train)
126 | # print(m_train)
127 |
128 | # train the DNN
129 | _, self.cost = self.sess.run([self._train_op, self.loss],
130 | feed_dict={self.h: h_train, self.m: m_train})
131 |
132 | assert(self.cost >0)
133 | self.cost_his.append(self.cost)
134 |
135 | def decode(self, h, k = 1, mode = 'OP'):
136 | # to have a batch dimension when fed into the tf placeholder
137 | h = h[np.newaxis, :]
138 |
139 | m_pred = self.sess.run(self.m_pred, feed_dict={self.h: h})
140 |
141 | if mode == 'OP':
142 | return self.knm(m_pred[0], k)
143 | elif mode == 'KNN':
144 | return self.knn(m_pred[0], k)
145 | else:
146 | print("The action selection must be 'OP' or 'KNN'")
147 |
148 | def knm(self, m, k = 1):
149 | # return k order-preserving binary actions
150 | m_list = []
151 |
152 | # generate the first binary offloading decision
153 | # note that here 'm' is the output of the DNN before the sigmoid activation, i.e., a real-valued logit.
154 | # Therefore, we compare it with 0 instead of 0.5 as in equation (8), since sigmoid(0) = 0.5.
155 | m_list.append(1*(m>0))
156 |
157 | if k > 1:
158 | # generate the remaining K-1 binary offloading decisions with respect to equation (9)
159 | m_abs = abs(m)
160 | idx_list = np.argsort(m_abs)[:k-1]
161 | for i in range(k-1):
162 | if m[idx_list[i]] >0:
163 | # set a positive user to 0
164 | m_list.append(1*(m - m[idx_list[i]] > 0))
165 | else:
166 | # set a negative user to 1
167 | m_list.append(1*(m - m[idx_list[i]] >= 0))
168 |
169 | return m_list
170 |
171 | def knn(self, m, k = 1):
172 | # list all 2^N binary offloading actions
173 | if len(self.enumerate_actions) == 0:
174 | import itertools
175 | self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))
176 |
177 | # the 2-norm
178 | sqd = ((self.enumerate_actions - m)**2).sum(1)
179 | idx = np.argsort(sqd)
180 | return self.enumerate_actions[idx[:k]]
181 |
182 |
183 | def plot_cost(self):
184 | import matplotlib.pyplot as plt
185 | plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)
186 | plt.ylabel('Training Loss')
187 | plt.xlabel('Time Frames')
188 | plt.show()
189 |
--------------------------------------------------------------------------------
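For contrast with the order-preserving decoder, the 'KNN' variant in memory.py enumerates all 2^N binary actions and returns the k closest (in 2-norm) to the relaxed DNN output; it is exhaustive but scales exponentially in N. A self-contained sketch with an assumed toy N = 3 output:

```python
import itertools
import numpy as np

def knn(m, k=1):
    # enumerate all 2^N candidate actions once
    actions = np.array(list(map(list, itertools.product([0, 1], repeat=len(m)))))
    sqd = ((actions - m) ** 2).sum(1)   # squared distance to the relaxed output
    return actions[np.argsort(sqd)[:k]]

print(knn(np.array([0.8, 0.4, 0.1]), k=2))
# [[1 0 0]
#  [1 1 0]]
```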
/memoryPyTorch.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # This file contains the main DROO operations, including building the DNN,
3 | # storing data samples, training the DNN, and generating quantized binary offloading decisions.
4 |
5 | # version 1.0 -- February 2020. Written based on PyTorch by Weijian Pan and
6 | # Liang Huang (lianghuang AT zjut.edu.cn)
7 | # ###################################################################
8 |
9 | from __future__ import print_function
10 | import torch
11 | import torch.optim as optim
12 | import torch.nn as nn
13 | import numpy as np
14 |
15 | print(torch.__version__)
16 |
17 |
18 | # DNN network for memory
19 | class MemoryDNN:
20 | def __init__(
21 | self,
22 | net,
23 | learning_rate = 0.01,
24 | training_interval=10,
25 | batch_size=100,
26 | memory_size=1000,
27 | output_graph=False
28 | ):
29 |
30 | self.net = net
31 | self.training_interval = training_interval # learn every #training_interval
32 | self.lr = learning_rate
33 | self.batch_size = batch_size
34 | self.memory_size = memory_size
35 |
36 | # store all binary actions
37 | self.enumerate_actions = []
38 |
39 | # stored # memory entry
40 | self.memory_counter = 1
41 |
42 | # store training cost
43 | self.cost_his = []
44 |
45 | # initialize zero memory [h, m]
46 | self.memory = np.zeros((self.memory_size, self.net[0] + self.net[-1]))
47 |
48 | # construct memory network
49 | self._build_net()
50 |
51 | def _build_net(self):
52 | self.model = nn.Sequential(
53 | nn.Linear(self.net[0], self.net[1]),
54 | nn.ReLU(),
55 | nn.Linear(self.net[1], self.net[2]),
56 | nn.ReLU(),
57 | nn.Linear(self.net[2], self.net[3]),
58 | nn.Sigmoid()
59 | )
60 |
61 | def remember(self, h, m):
62 | # replace the old memory with new memory
63 | idx = self.memory_counter % self.memory_size
64 | self.memory[idx, :] = np.hstack((h, m))
65 |
66 | self.memory_counter += 1
67 |
68 | def encode(self, h, m):
69 | # encoding the entry
70 | self.remember(h, m)
71 | # train the DNN every training_interval steps
72 | # if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:
73 | if self.memory_counter % self.training_interval == 0:
74 | self.learn()
75 |
76 | def learn(self):
77 | # sample batch memory from all memory
78 | if self.memory_counter > self.memory_size:
79 | sample_index = np.random.choice(self.memory_size, size=self.batch_size)
80 | else:
81 | sample_index = np.random.choice(self.memory_counter, size=self.batch_size)
82 | batch_memory = self.memory[sample_index, :]
83 |
84 | h_train = torch.Tensor(batch_memory[:, 0: self.net[0]])
85 | m_train = torch.Tensor(batch_memory[:, self.net[0]:])
86 |
87 |
88 | # train the DNN
89 | optimizer = optim.Adam(self.model.parameters(), lr=self.lr,betas = (0.09,0.999),weight_decay=0.0001)
90 | criterion = nn.BCELoss()
91 | self.model.train()
92 | optimizer.zero_grad()
93 | predict = self.model(h_train)
94 | loss = criterion(predict, m_train)
95 | loss.backward()
96 | optimizer.step()
97 |
98 | self.cost = loss.item()
99 | assert(self.cost > 0)
100 | self.cost_his.append(self.cost)
101 |
102 | def decode(self, h, k = 1, mode = 'OP'):
103 | # to have a batch dimension when converted to a Tensor
104 | h = torch.Tensor(h[np.newaxis, :])
105 |
106 | self.model.eval()
107 | m_pred = self.model(h)
108 | m_pred = m_pred.detach().numpy()
109 |
110 | if mode == 'OP':
111 | return self.knm(m_pred[0], k)
112 | elif mode == 'KNN':
113 | return self.knn(m_pred[0], k)
114 | else:
115 | print("The action selection must be 'OP' or 'KNN'")
116 |
117 | def knm(self, m, k = 1):
118 | # return k order-preserving binary actions
119 | m_list = []
120 | # generate the first binary offloading decision with respect to equation (8)
121 | m_list.append(1*(m>0.5))
122 |
123 | if k > 1:
124 | # generate the remaining K-1 binary offloading decisions with respect to equation (9)
125 | m_abs = abs(m-0.5)
126 | idx_list = np.argsort(m_abs)[:k-1]
127 | for i in range(k-1):
128 | if m[idx_list[i]] >0.5:
129 | # set the \hat{x}_{t,(k-1)} to 0
130 | m_list.append(1*(m - m[idx_list[i]] > 0))
131 | else:
132 | # set the \hat{x}_{t,(k-1)} to 1
133 | m_list.append(1*(m - m[idx_list[i]] >= 0))
134 |
135 | return m_list
136 |
137 | def knn(self, m, k = 1):
138 | # list all 2^N binary offloading actions
139 | if len(self.enumerate_actions) == 0:
140 | import itertools
141 | self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))
142 |
143 | # the 2-norm
144 | sqd = ((self.enumerate_actions - m)**2).sum(1)
145 | idx = np.argsort(sqd)
146 | return self.enumerate_actions[idx[:k]]
147 |
148 |
149 | def plot_cost(self):
150 | import matplotlib.pyplot as plt
151 | plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)
152 | plt.ylabel('Training Loss')
153 | plt.xlabel('Time Frames')
154 | plt.show()
155 |
156 |
--------------------------------------------------------------------------------
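One detail worth noting in memoryPyTorch.py above: learn() constructs a fresh Adam optimizer on every call, so Adam's running moment estimates are reset at each training step. A common alternative, sketched here as an assumption about intent rather than the repository's actual behavior, is to build the optimizer once alongside the model and reuse it:

```python
import torch.nn as nn
import torch.optim as optim

# built once, e.g. at the end of _build_net(); sizes follow net=[10, 120, 80, 10]
model = nn.Sequential(
    nn.Linear(10, 120), nn.ReLU(),
    nn.Linear(120, 80), nn.ReLU(),
    nn.Linear(80, 10), nn.Sigmoid(),
)
optimizer = optim.Adam(model.parameters(), lr=0.01)  # state persists across steps
criterion = nn.BCELoss()

# each learn() call would then reduce to the usual step:
# optimizer.zero_grad(); loss = criterion(model(h_train), m_train)
# loss.backward(); optimizer.step()
```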
/memoryTF2.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # This file contains the main DROO operations, including building the DNN,
3 | # storing data samples, training the DNN, and generating quantized binary offloading decisions.
4 |
5 | # version 1.0 -- January 2020. Written based on Tensorflow 2 by Weijian Pan and
6 | # Liang Huang (lianghuang AT zjut.edu.cn)
7 | # #################################################################
8 |
9 | from __future__ import print_function
10 | import tensorflow as tf
11 | from tensorflow import keras
12 | from tensorflow.keras import layers
13 | import numpy as np
14 |
15 | print(tf.__version__)
16 | print(tf.keras.__version__)
17 |
18 |
19 | # DNN network for memory
20 | class MemoryDNN:
21 | def __init__(
22 | self,
23 | net,
24 | learning_rate = 0.01,
25 | training_interval=10,
26 | batch_size=100,
27 | memory_size=1000,
28 | output_graph=False
29 | ):
30 |
31 | self.net = net # the size of the DNN
32 | self.training_interval = training_interval # learn every #training_interval
33 | self.lr = learning_rate
34 | self.batch_size = batch_size
35 | self.memory_size = memory_size
36 |
37 | # store all binary actions
38 | self.enumerate_actions = []
39 |
40 | # stored # memory entry
41 | self.memory_counter = 1
42 |
43 | # store training cost
44 | self.cost_his = []
45 |
46 | # initialize zero memory [h, m]
47 | self.memory = np.zeros((self.memory_size, self.net[0] + self.net[-1]))
48 |
49 | # construct memory network
50 | self._build_net()
51 |
52 | def _build_net(self):
53 | self.model = keras.Sequential([
54 | layers.Dense(self.net[1], activation='relu'), # the first hidden layer
55 | layers.Dense(self.net[2], activation='relu'), # the second hidden layer
56 | layers.Dense(self.net[-1], activation='sigmoid') # the output layer
57 | ])
58 |
59 | self.model.compile(optimizer=keras.optimizers.Adam(learning_rate=self.lr), loss=tf.losses.binary_crossentropy, metrics=['accuracy'])
60 |
61 | def remember(self, h, m):
62 | # replace the old memory with new memory
63 | idx = self.memory_counter % self.memory_size
64 | self.memory[idx, :] = np.hstack((h, m))
65 |
66 | self.memory_counter += 1
67 |
68 | def encode(self, h, m):
69 | # encoding the entry
70 | self.remember(h, m)
71 | # train the DNN every training_interval steps
72 | # if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:
73 | if self.memory_counter % self.training_interval == 0:
74 | self.learn()
75 |
76 | def learn(self):
77 | # sample batch memory from all memory
78 | if self.memory_counter > self.memory_size:
79 | sample_index = np.random.choice(self.memory_size, size=self.batch_size)
80 | else:
81 | sample_index = np.random.choice(self.memory_counter, size=self.batch_size)
82 | batch_memory = self.memory[sample_index, :]
83 |
84 | h_train = batch_memory[:, 0: self.net[0]]
85 | m_train = batch_memory[:, self.net[0]:]
86 |
87 | # print(h_train) # (128, 10)
88 | # print(m_train) # (128, 10)
89 |
90 | # train the DNN
91 | hist = self.model.fit(h_train, m_train, verbose=0)
92 | self.cost = hist.history['loss'][0]
93 | assert(self.cost > 0)
94 | self.cost_his.append(self.cost)
95 |
96 | def decode(self, h, k = 1, mode = 'OP'):
97 | # to have a batch dimension when fed into the model
98 | h = h[np.newaxis, :]
99 |
100 | m_pred = self.model.predict(h)
101 |
102 | if mode == 'OP':
103 | return self.knm(m_pred[0], k)
104 | elif mode == 'KNN':
105 | return self.knn(m_pred[0], k)
106 | else:
107 | print("The action selection must be 'OP' or 'KNN'")
108 |
109 | def knm(self, m, k = 1):
110 | # return k order-preserving binary actions
111 | m_list = []
112 | # generate the first binary offloading decision with respect to equation (8)
113 | m_list.append(1*(m>0.5))
114 |
115 | if k > 1:
116 | # generate the remaining K-1 binary offloading decisions with respect to equation (9)
117 | m_abs = abs(m-0.5)
118 | idx_list = np.argsort(m_abs)[:k-1]
119 | for i in range(k-1):
120 | if m[idx_list[i]] >0.5:
121 | # set the \hat{x}_{t,(k-1)} to 0
122 | m_list.append(1*(m - m[idx_list[i]] > 0))
123 | else:
124 | # set the \hat{x}_{t,(k-1)} to 1
125 | m_list.append(1*(m - m[idx_list[i]] >= 0))
126 |
127 | return m_list
128 |
129 | def knn(self, m, k = 1):
130 | # list all 2^N binary offloading actions
131 | if len(self.enumerate_actions) == 0:
132 | import itertools
133 | self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))
134 |
135 | # the 2-norm
136 | sqd = ((self.enumerate_actions - m)**2).sum(1)
137 | idx = np.argsort(sqd)
138 | return self.enumerate_actions[idx[:k]]
139 |
140 |
141 | def plot_cost(self):
142 | import matplotlib.pyplot as plt
143 | plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)
144 | plt.ylabel('Training Loss')
145 | plt.xlabel('Time Frames')
146 | plt.show()
147 |
--------------------------------------------------------------------------------
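All three MemoryDNN backends expose the same encode/decode interface, so a minimal driving loop is framework-agnostic. A hedged usage sketch for the TF2 version with toy random data (the sizes and loop length below are assumptions, not values from the repository's main scripts):

```python
import numpy as np
from memoryTF2 import MemoryDNN  # assumes this repository is on the path

mem = MemoryDNN(net=[10, 120, 80, 10], training_interval=10,
                batch_size=32, memory_size=256)
for _ in range(50):
    h = np.random.rand(10)                  # toy channel gains, scaled near 1
    m_list = mem.decode(h, k=3, mode='OP')  # 3 candidate binary offloading modes
    mem.encode(h, m_list[0])                # store one; trains every 10th entry
```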
/optimization.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | Created on Tue Jan 9 10:45:26 2018
4 |
5 | @author: Administrator
6 | """
7 | import numpy as np
8 | from scipy import optimize
9 | from scipy.special import lambertw
10 | import scipy.io as sio # import scipy.io for .mat file I/O
11 | import time
12 |
13 |
14 | def plot_gain( gain_his):
15 | import matplotlib.pyplot as plt
16 | import pandas as pd
17 | import matplotlib as mpl
18 |
19 | gain_array = np.asarray(gain_his)
20 | df = pd.DataFrame(gain_his)
21 |
22 |
23 | mpl.style.use('seaborn')
24 | fig, ax = plt.subplots(figsize=(15,8))
25 | rolling_intv = 20
26 |
27 | plt.plot(np.arange(len(gain_array))+1, df.rolling(rolling_intv, min_periods=1).mean(), 'b')
28 | plt.fill_between(np.arange(len(gain_array))+1, df.rolling(rolling_intv, min_periods=1).min()[0], df.rolling(rolling_intv, min_periods=1).max()[0], color = 'b', alpha = 0.2)
29 | plt.ylabel('Gain ratio')
30 | plt.xlabel('learning steps')
31 | plt.show()
32 |
33 | def bisection(h, M, weights=[]):
34 | # the bisection algorithm proposed by Suzhi BI
35 | # average time to find the optimal: 0.012535839796066284 s
36 |
37 | # parameters and equations
38 | o=100
39 | p=3
40 | u=0.7
41 | eta1=((u*p)**(1.0/3))/o
42 | ki=10**-26
43 | eta2=u*p/10**-10
44 | B=2*10**6
45 | Vu=1.1
46 | epsilon=B/(Vu*np.log(2))
47 | x = [] # a = x[0], and tau_j = x[1:]
48 |
49 | M0=np.where(M==0)[0]
50 | M1=np.where(M==1)[0]
51 |
52 | hi=np.array([h[i] for i in M0])
53 | hj=np.array([h[i] for i in M1])
54 |
55 |
56 | if len(weights) == 0:
57 | # default weights [1, 1.5, 1, 1.5, 1, 1.5, ...]
58 | weights = [1.5 if i%2==1 else 1 for i in range(len(M))]
59 |
60 | wi=np.array([weights[M0[i]] for i in range(len(M0))])
61 | wj=np.array([weights[M1[i]] for i in range(len(M1))])
62 |
63 |
64 | def sum_rate(x):
65 | sum1=sum(wi*eta1*(hi/ki)**(1.0/3)*x[0]**(1.0/3))
66 | sum2=0
67 | for i in range(len(M1)):
68 | sum2+=wj[i]*epsilon*x[i+1]*np.log(1+eta2*hj[i]**2*x[0]/x[i+1])
69 | return sum1+sum2
70 |
71 | def phi(v, j):
72 | return 1/(-1-1/(lambertw(-1/(np.exp( 1 + v/wj[j]/epsilon))).real))
73 |
74 | def p1(v):
75 | p1 = 0
76 | for j in range(len(M1)):
77 | p1 += hj[j]**2 * phi(v, j)
78 |
79 | return 1/(1 + p1 * eta2)
80 |
81 | def Q(v):
82 | sum1 = sum(wi*eta1*(hi/ki)**(1.0/3))*p1(v)**(-2/3)/3
83 | sum2 = 0
84 | for j in range(len(M1)):
85 | sum2 += wj[j]*hj[j]**2/(1 + 1/phi(v,j))
86 | return sum1 + sum2*epsilon*eta2 - v
87 |
88 | def tau(v, j):
89 | return eta2*hj[j]**2*p1(v)*phi(v,j)
90 |
91 | # bisection starts here
92 | delta = 0.005
93 | UB = 999999999
94 | LB = 0
95 | while UB - LB > delta:
96 | v = (float(UB) + LB)/2
97 | if Q(v) > 0:
98 | LB = v
99 | else:
100 | UB = v
101 |
102 | x.append(p1(v))
103 | for j in range(len(M1)):
104 | x.append(tau(v, j))
105 |
106 | return sum_rate(x), x[0], x[1:]
107 |
108 |
109 |
110 | def cd_method(h): # coordinate descent: flip one user's decision at a time and keep the best improving flip
111 | N = len(h)
112 | M0 = np.random.randint(2,size = N)
113 | gain0,a,Tj= bisection(h,M0)
114 | g_list = []
115 | M_list = []
116 | while True:
117 | for j in range(0,N):
118 | M = np.copy(M0)
119 | M[j] = (M[j]+1)%2
120 | gain,a,Tj= bisection(h,M)
121 | g_list.append(gain)
122 | M_list.append(M)
123 | g_max = max(g_list)
124 | if g_max > gain0:
125 | gain0 = g_max
126 | M0 = M_list[g_list.index(g_max)]
127 | else:
128 | break
129 | return gain0, M0
130 |
131 |
132 | if __name__ == "__main__":
133 |
134 | h=np.array([6.06020304235508*10**-6,1.10331933767028*10**-5,1.00213540309998*10**-7,1.21610610942759*10**-6,1.96138838395145*10**-6,1.71456339592966*10**-6,5.24563569673585*10**-6,5.89530717142197*10**-7,4.07769429231962*10**-6,2.88333185798682*10**-6])
135 | M=np.array([1,0,0,0,1,0,0,0,0,0])
136 | # h=np.array([1.00213540309998*10**-7,1.10331933767028*10**-5,6.06020304235508*10**-6,1.21610610942759*10**-6,1.96138838395145*10**-6,1.71456339592966*10**-6,5.24563569673585*10**-6,5.89530717142197*10**-7,4.07769429231962*10**-6,2.88333185798682*10**-6])
137 | # M=np.array([0,0,1,0,1,0,0,0,0,0])
138 |
139 |
140 | # h = np.array([4.6368924987170947*10**-7, 1.3479411763648968*10**-7, 7.174945246007612*10**-6, 2.5590719803595445*10**-7, 3.3189928740379023*10**-6, 1.2109071327755575*10**-5, 2.394278475886022*10**-6, 2.179121774067472*10**-6, 5.5213902658478367*10**-8, 2.168778154948169*10**-7, 2.053227965874453*10**-6, 7.002952297466865*10**-8, 7.594077851181444*10**-8, 7.904048961975136*10**-7, 8.867218892023474*10**-7, 5.886007653360979*10**-6, 2.3470565740563855*10**-6, 1.387049627074303*10**-7, 3.359475870531776*10**-7, 2.633733784949562*10**-7, 2.189895264149453*10**-6, 1.129177795302099*10**-5, 1.1760290137191366*10**-6, 1.6588656719735275*10**-7, 1.383637788476638*10**-6, 1.4485928387351664*10**-6, 1.4262265958416598*10**-6, 1.1779725004265418*10**-6, 7.738218993031842*10**-7, 4.763534225174186*10**-6])
141 | # M =np.array( [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,])
142 |
143 | # time the average speed of bisection algorithm
144 | # repeat = 1
145 | # M =np.random.randint(2, size=(repeat,len(h)))
146 | # start_time=time.time()
147 | # for i in range(repeat):
148 | # gain,a,Tj= bisection(h,M[i,:])
149 | # total_time=time.time()-start_time
150 | # print('time_cost:%s'%(total_time/repeat))
151 |
152 | gain,a,Tj= bisection(h,M)
153 | print('y:%s'%gain)
154 | print('a:%s'%a)
155 | print('Tj:%s'%Tj)
156 |
157 | # test CD method. Given h, generate the max mode
158 | gain0, M0 = cd_method(h)
159 | print('max y:%s'%gain0)
160 | print(M0)
161 |
162 | # test all data
163 | K = [10, 20, 30] # number of users
164 | N = 1000 # number of channel realizations
165 |
166 |
167 | for k in K:
168 | # Load data
169 | channel = sio.loadmat('./data/data_%d' %int(k))['input_h']
170 | gain = sio.loadmat('./data/data_%d' %int(k))['output_obj']
171 |
172 | start_time=time.time()
173 | gain_his = []
174 | gain_his_ratio = []
175 | mode_his = []
176 | for i in range(N):
177 | if i % (N//10) == 0:
178 | print("%0.1f"%(i/N))
179 |
180 | i_idx = i
181 |
182 | h = channel[i_idx,:]
183 |
184 | # the CD method
185 | gain0, M0 = cd_method(h)
186 |
187 |
188 | # memorize the largest reward
189 | gain_his.append(gain0)
190 | gain_his_ratio.append(gain_his[-1] / gain[i_idx][0])
191 |
192 | mode_his.append(M0)
193 |
194 |
195 | total_time=time.time()-start_time
196 | print('time_cost:%s'%total_time)
197 | print('average time per channel:%s'%(total_time/N))
198 |
199 |
200 | plot_gain(gain_his_ratio)
201 |
202 |
203 | print("gain/max ratio: ", sum(gain_his_ratio)/N)
204 |
--------------------------------------------------------------------------------
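The loop at the heart of bisection() above is plain interval bisection: Q(v) crosses zero once on [LB, UB], so checking its sign at the midpoint halves the bracket until it is narrower than delta. The same skeleton, stripped of the offloading-specific terms and run on an assumed toy Q:

```python
def bisect_root(Q, LB=0.0, UB=999999999.0, delta=0.005):
    v = (UB + LB) / 2
    while UB - LB > delta:
        v = (UB + LB) / 2
        if Q(v) > 0:
            LB = v   # zero crossing lies above v
        else:
            UB = v   # zero crossing lies at or below v
    return v

# toy Q with its zero crossing at 42, mirroring the LB/UB/delta defaults above
print(bisect_root(lambda v: 42.0 - v))  # ~42.0, within delta
```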