├── .gitattributes
├── LICENSE
├── README.md
├── Untitled0.ipynb
├── data
│   ├── README.md
│   ├── data_10.mat
│   ├── data_10_WeightsAlternated.mat
│   ├── data_20.mat
│   ├── data_30.mat
│   ├── data_5.mat
│   ├── data_6.mat
│   ├── data_7.mat
│   ├── data_8.mat
│   └── data_9.mat
├── demo_alternate_weights.py
├── demo_on_off.py
├── gain_his_ratio.txt
├── main.ipynb
├── main.py
├── mainPyTorch.py
├── mainTF2
├── memory.ipynb
├── memory.py
├── memoryPyTorch.py
├── memoryTF2.py
└── optimization.py

--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2018 REVENOL

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# DROO

*Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks*

Python code to reproduce our DROO algorithm for wireless-powered mobile-edge computing [1], which takes the time-varying wireless channel gains as input and generates binary offloading decisions. It includes:

- [memory.py](memory.py): the DNN structure for the WPMEC, including the training and testing structures, implemented based on [Tensorflow 1.x](https://www.tensorflow.org/install/pip).
- [memoryTF2.py](memoryTF2.py): Implemented based on [Tensorflow 2](https://www.tensorflow.org/install).
- [memoryPyTorch.py](memoryPyTorch.py): Implemented based on [PyTorch](https://pytorch.org/get-started/locally/).
- [optimization.py](optimization.py): solves the resource allocation problem.

- [data](./data): all data are stored in this subdirectory, including:

  - **data_#.mat**: training and testing data sets, where # = {5, 6, 7, 8, 9, 10, 20, 30} is the user number

- [main.py](main.py): run this file for DROO, including setting system parameters; implemented based on [Tensorflow 1.x](https://www.tensorflow.org/install/pip).
- [mainTF2.py](mainTF2.py): Implemented based on [Tensorflow 2](https://www.tensorflow.org/install). Run this file for DROO if you code with Tensorflow 2.
- [mainPyTorch.py](mainPyTorch.py): Implemented based on [PyTorch](https://pytorch.org/get-started/locally/). Run this file for DROO if you code with PyTorch.

- [demo_alternate_weights.py](demo_alternate_weights.py): run this file to evaluate the performance of DROO when WDs' weights are alternated

- [demo_on_off.py](demo_on_off.py): run this file to evaluate the performance of DROO when some WDs are randomly turned on/off


## Cite this work

1. L. Huang, S. Bi, and Y. J. Zhang, "[Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks](https://ieeexplore.ieee.org/document/8771176)," IEEE Trans. Mobile Comput., vol. 19, no. 11, pp. 2581-2593, November 2020.

```
@ARTICLE{huang2020DROO,
  author={Huang, Liang and Bi, Suzhi and Zhang, Ying-Jun Angela},
  journal={IEEE Transactions on Mobile Computing},
  title={Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks},
  year={2020},
  month={November},
  volume={19},
  number={11},
  pages={2581-2593},
  doi={10.1109/TMC.2019.2928811}
}
```

## About authors

- [Liang HUANG](https://scholar.google.com/citations?user=NifLoZ4AAAAJ), lianghuang AT zjut.edu.cn

- [Suzhi BI](https://scholar.google.com/citations?user=uibqC-0AAAAJ), bsz AT szu.edu.cn

- [Ying Jun (Angela) Zhang](https://scholar.google.com/citations?user=iOb3wocAAAAJ), yjzhang AT ie.cuhk.edu.hk

## Required packages

- Tensorflow

- numpy

- scipy

## How the code works

- For the DROO algorithm, run the file [main.py](main.py). If you code with Tensorflow 2 or PyTorch, run [mainTF2.py](mainTF2.py) or [mainPyTorch.py](mainPyTorch.py), respectively. The original DROO algorithm is coded based on [Tensorflow 1.x](https://www.tensorflow.org/install/pip). If you are new to deep learning, please start with [Tensorflow 2](https://www.tensorflow.org/install) or [PyTorch](https://pytorch.org/get-started/locally/), whose code is much cleaner and easier to follow.

- For more DROO demos:
  - Alternating-weight WDs: run the file [demo_alternate_weights.py](demo_alternate_weights.py)
  - ON-OFF WDs: run the file [demo_on_off.py](demo_on_off.py)
  - Remember to edit the *import MemoryDNN* code from
```
from memory import MemoryDNN
```
to
```
from memoryTF2 import MemoryDNN
```
or
```
from memoryPyTorch import MemoryDNN
```
if you are using Tensorflow 2 or PyTorch, respectively.

### DROO is illustrated here for single-slot optimization. If you intend to apply DROO to multiple-slot continuous control problems, please refer to our [LyDROO](https://github.com/revenol/LyDROO) project.
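## DROO in one time frame

For orientation, below is a minimal sketch of the per-time-frame logic that [main.py](main.py), [mainTF2.py](mainTF2.py), and [mainPyTorch.py](mainPyTorch.py) all share. It is a simplified outline, not a replacement for those files; `mem`, `channel`, `K`, and `bisection` are the objects defined in this repository.

```python
import numpy as np

# One DROO time frame (simplified sketch):
h = channel[i, :]                 # observe the scaled channel gains
m_list = mem.decode(h, K, 'OP')   # DNN generates K candidate binary offloading actions
r_list = [bisection(h / 1000000, m)[0] for m in m_list]  # evaluate each action by solving the resource allocation problem
best = np.argmax(r_list)          # keep the action with the largest computation rate
mem.encode(h, m_list[best])       # store (h, best action); the DNN is periodically retrained from this memory
```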

--------------------------------------------------------------------------------
/Untitled0.ipynb:
--------------------------------------------------------------------------------
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyOnESJFQS185LAOEe1zq+1H",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU",
    "gpuClass": "standard"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "\"Open"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GcZqNc_bGfkz"
      },
      "outputs": [],
      "source": [
        "from google.colab import files"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "!cp content/drive/MyDrive/Importing\\ Scripts \\as\\ Modules/DROO-master/main.py /content"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "B_yN8BR3G7bN",
        "outputId": "bc83f6e3-f21c-4fcc-b7b3-6e92d2b68de9"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "cp: cannot stat 'content/drive/MyDrive/Importing Scripts': No such file or directory\n",
            "cp: cannot stat 'as Modules/DROO-master/main.py': No such file or directory\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# New Section"
      ],
      "metadata": {
        "id": "MkqI9XbFHDmT"
      }
    }
  ]
}

--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
# DROO

*Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks*

This folder includes all pre-generated training and testing data sets:

- **data_#.mat**: training and testing data, where # = {5, 6, 7, 8, 9, 10, 20, 30} is the number of WDs

- [data_10_WeightsAlternated.mat](data_10_WeightsAlternated.mat): the data set when all WDs' weights are alternated. It contains the same values of 'input_h' as the ones stored in [data_10.mat](data_10.mat). However, the optimal offloading mode, resource allocation, and the maximum computation rate are recalculated since the WDs' weights are alternated.


## Data Format

Data samples are generated by enumerating all 2^N binary offloading actions for N <= 10 and by following the CD method presented in [2] for N = 20, 30. There are 30,000 (for N = 10, 20, 30) or 10,000 (otherwise) samples saved in each \*.mat file, where each data sample includes:

| variable | description |
|------------------------:|:-----------------------|
| input_h | The wireless channel gain between WDs and the AP $\mathbf{h}$ |
| output_mode | The optimal binary offloading action $\mathbf{x}^*$ |
| output_a | The optimal fraction of time that the AP broadcasts RF energy for the WDs to harvest $a^*$ |
| output_tau | The optimal fraction of time allocated to WDs for task offloading $\mathbf{\tau}^*$ |
| output_obj | The optimal weighted sum computation rate $Q^*$ |



## About our works

1. Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks," available on [arxiv:1808.01977](https://arxiv.org/abs/1808.01977).
2. S. Bi and Y. J. Zhang, "Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading," *IEEE Trans. Wireless Commun.*, vol. 17, no. 6, pp. 4177-4190, Jun. 2018.

## About authors

- Liang HUANG, lianghuang AT zjut.edu.cn

- Suzhi BI, bsz AT szu.edu.cn

- Ying Jun (Angela) Zhang, yjzhang AT ie.cuhk.edu.hk

--------------------------------------------------------------------------------
/data/data_10.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_10.mat

--------------------------------------------------------------------------------
/data/data_10_WeightsAlternated.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_10_WeightsAlternated.mat

--------------------------------------------------------------------------------
/data/data_20.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_20.mat

--------------------------------------------------------------------------------
/data/data_30.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_30.mat

--------------------------------------------------------------------------------
/data/data_5.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_5.mat

--------------------------------------------------------------------------------
/data/data_6.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_6.mat

--------------------------------------------------------------------------------
/data/data_7.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_7.mat

--------------------------------------------------------------------------------
/data/data_8.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_8.mat

--------------------------------------------------------------------------------
/data/data_9.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_9.mat
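A quick way to inspect one of the data sets above (a sketch; it assumes scipy is installed and is run from the repository root — the field names follow the data README, and the shapes are those it implies for N = 10):

```python
import scipy.io as sio

# Load the N = 10 data set and print the shape of each documented field.
data = sio.loadmat('./data/data_10.mat')
for key in ['input_h', 'output_mode', 'output_a', 'output_tau', 'output_obj']:
    print(key, data[key].shape)
# Expected: input_h, output_mode, and output_tau have one column per WD (10),
# while output_a and output_obj have a single column, with 30,000 rows each.
```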
--------------------------------------------------------------------------------
/demo_alternate_weights.py:
--------------------------------------------------------------------------------
# #################################################################
# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
#
# This file contains a demo evaluating the performance of DROO with alternating-weight WDs. It loads the training samples with the default WDs' weights from ./data/data_10.mat and with alternated weights from ./data/data_10_WeightsAlternated.mat. The channel gains in both files are the same. However, the optimal offloading mode, resource allocation, and the maximum computation rate in 'data_10_WeightsAlternated.mat' are recalculated since the WDs' weights are alternated.
#
# References:
# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," on arxiv:1808.01977
#
# version 1.0 -- April 2019. Written by Liang Huang (lianghuang AT zjut.edu.cn)
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

from memory import MemoryDNN
from optimization import bisection
from main import plot_rate, save_to_txt

import time


def alternate_weights(case_id=0):
    '''
    Alternate the weights of all WDs. Note that the maximum computation rate needs to be recomputed by solving (P2) once any WD's weight is changed.
    Input: case_id = 0 for default weights; case_id = 1 for alternated weights.
    Output: the alternated weights and the corresponding rate.
    '''
    # set alternated weights
    weights = [[1, 1.5, 1, 1.5, 1, 1.5, 1, 1.5, 1, 1.5], [1.5, 1, 1.5, 1, 1.5, 1, 1.5, 1, 1.5, 1]]

    # load the corresponding maximum computation rate
    if case_id == 0:
        # by default, case_id = 0
        rate = sio.loadmat('./data/data_10')['output_obj']
    else:
        # alternate weights for all WDs, case_id = 1
        rate = sio.loadmat('./data/data_10_WeightsAlternated')['output_obj']
    return weights[case_id], rate

if __name__ == "__main__":
    '''
    This demo evaluates DROO with alternating-weight WDs. We evaluate an extreme case by alternating the weights of all WDs between 1 and 1.5 at the same time, specifically, at time frames 6,000 and 8,000.
    '''

    N = 10                       # number of users
    n = 10000                    # number of time frames, <= 10,000
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj']

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000

    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    h = channel[0,:]

    # initialize the weights by setting case_id = 0
    weight, rate = alternate_weights(0)
    print("WD weights at time frame %d:"%(0), weight)


    for i in range(n):
        # alternate the weights of all WDs at time frames 0.6*n and 0.8*n
        if i == 0.6*n:
            weight, rate = alternate_weights(1)
            print("WD weights at time frame %d:"%(i), weight)
        if i == 0.8*n:
            weight, rate = alternate_weights(0)
            print("WD weights at time frame %d:"%(i), weight)


        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)


        i_idx = i
        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            # only active users are used to compute the rate
            r_list.append(bisection(h/1000000, m, weight)[0])

        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        # save the mode with largest reward
        mode_his.append(m_list[np.argmax(r_list)])
        # if i < 0.6*n:
        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")



--------------------------------------------------------------------------------
/demo_on_off.py:
--------------------------------------------------------------------------------
# #################################################################
# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
#
# This file contains a demo evaluating the performance of DROO by randomly turning some WDs on/off. It loads the training samples from ./data/data_#.mat, where # denotes the number of active WDs in the MEC network. Note that the maximum computation rate needs to be recomputed by solving (P2) once a WD is turned off/on.
#
# References:
# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," submitted to IEEE Journal on Selected Areas in Communications.
#
# version 1.0 -- April 2019. Written by Liang Huang (lianghuang AT zjut.edu.cn)
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

from memory import MemoryDNN
from optimization import bisection
from main import plot_rate, save_to_txt

import time


def WD_off(channel, N_active, N):
    # turn off one WD
    if N_active > 5:  # currently we support turning off at most half of the WDs
        N_active = N_active - 1
        # set the (N_active-1)th channel to close to 0
        # since all channels in each time frame are randomly generated, we turn off the WD with the greatest index
        channel[:,N_active] = channel[:, N_active] / 1000000  # a programming trick, so that we can recover its channel gain once the WD is turned on again
        print(" The %dth WD is turned off."%(N_active + 1))

    # update the expected maximum computation rate
    rate = sio.loadmat('./data/data_%d' %N_active)['output_obj']
    return channel, rate, N_active

def WD_on(channel, N_active, N):
    # turn on one WD
    if N_active < N:
        N_active = N_active + 1
        # recover the (N_active-1)th channel
        channel[:,N_active-1] = channel[:, N_active-1] * 1000000
        print(" The %dth WD is turned on."%(N_active))

    # update the expected maximum computation rate
    rate = sio.loadmat('./data/data_%d' %N_active)['output_obj']
    return channel, rate, N_active




if __name__ == "__main__":
    '''
    This demo evaluates DROO for MEC networks where WDs can be occasionally turned off/on. After DROO converges, we turn off one WD at each of the time frames 6,000, 6,500, 7,000, and 7,500, and then turn WDs back on at time frames 8,000, 8,500, and 9,000 (two at 9,000). At time frame 9,500, we turn off two WDs, resulting in an MEC network with 8 active WDs.
    '''

    N = 10                       # number of users
    N_active = N                 # number of active users
    N_off = 0                    # number of off-users
    n = 10000                    # number of time frames, <= 10,000
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj']

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000
    channel_bak = channel.copy()
    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    h = channel[0,:]


    for i in range(n):
        # for a dynamic number of WDs
        if i == 0.6*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
        if i == 0.65*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
        if i == 0.7*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
        if i == 0.75*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
        if i == 0.8*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_on(channel, N_active, N)
        if i == 0.85*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_on(channel, N_active, N)
        if i == 0.9*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_on(channel, N_active, N)
            channel, rate, N_active = WD_on(channel, N_active, N)
        if i == 0.95*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
            channel, rate, N_active = WD_off(channel, N_active, N)

        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)

        i_idx = i
        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            # only active users are used to compute the rate
            r_list.append(bisection(h[0:N_active]/1000000, m[0:N_active])[0])

        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        # save the mode with largest reward
        mode_his.append(m_list[np.argmax(r_list)])
        # if i < 0.6*n:
        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")



--------------------------------------------------------------------------------
/main.ipynb:
--------------------------------------------------------------------------------
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyPGw6GfH7V4bE9B/kfbRbog",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU",
    "gpuClass": "standard"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "\"Open"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GcZqNc_bGfkz"
      },
      "outputs": [],
      "source": [
        "from google.colab import files"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "!cp content/drive/MyDrive/Importing\\ Scripts \\as\\ Modules/DROO-master/main.py /content"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "B_yN8BR3G7bN",
        "outputId": "bc83f6e3-f21c-4fcc-b7b3-6e92d2b68de9"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "cp: cannot stat 'content/drive/MyDrive/Importing Scripts': No such file or directory\n",
            "cp: cannot stat 'as Modules/DROO-master/main.py': No such file or directory\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# #################################################################\n",
        "# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks\n",
        "#\n",
        "# This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with the training and validation samples, and finally tests the DNN with the test data.\n",
        "#\n",
        "# Input: ./data/data_#.mat\n",
        "#   Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes\n",
        "# -----------------------------------------------------------------\n",
        "# | wireless channel gain | input_h |\n",
        "# -----------------------------------------------------------------\n",
        "# | computing mode selection | output_mode |\n",
        "# -----------------------------------------------------------------\n",
        "# | energy broadcasting parameter | output_a |\n",
        "# -----------------------------------------------------------------\n",
        "# | transmit time of wireless device | output_tau |\n",
        "# -----------------------------------------------------------------\n",
        "# | weighted sum computation rate | output_obj |\n",
        "# -----------------------------------------------------------------\n",
        "#\n",
        "#\n",
        "# References:\n",
        "# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, \"Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks,\" in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.\n",
        "# [2] S. Bi and Y. J. Zhang, \"Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading,\" IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.\n",
        "#\n",
        "# version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)\n",
        "# #################################################################\n",
        "\n",
        "\n",
        "import scipy.io as sio                # import scipy.io for .mat file I/O\n",
        "import numpy as np                    # import numpy\n",
        "\n",
        "from memory import MemoryDNN\n",
        "from optimization import bisection\n",
        "\n",
        "import time\n",
        "\n",
        "\n",
        "def plot_rate(rate_his, rolling_intv=50):\n",
        "    import matplotlib.pyplot as plt\n",
        "    import pandas as pd\n",
        "    import matplotlib as mpl\n",
        "\n",
        "    rate_array = np.asarray(rate_his)\n",
        "    df = pd.DataFrame(rate_his)\n",
        "\n",
        "\n",
        "    mpl.style.use('seaborn')\n",
        "    fig, ax = plt.subplots(figsize=(15,8))\n",
        "    # rolling_intv = 20\n",
        "\n",
        "    plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')\n",
        "    plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)\n",
        "    plt.ylabel('Normalized Computation Rate')\n",
        "    plt.xlabel('Time Frames')\n",
        "    plt.show()\n",
        "\n",
        "def save_to_txt(rate_his, file_path):\n",
        "    with open(file_path, 'w') as f:\n",
        "        for rate in rate_his:\n",
        "            f.write(\"%s \\n\" % rate)\n",
        "\n",
        "if __name__ == \"__main__\":\n",
        "    '''\n",
        "    This algorithm generates K candidate offloading modes from the DNN and\n",
        "    chooses the one with the largest reward. The mode with the largest reward\n",
        "    is stored in the memory, which is further used to train the DNN.\n",
        "    Adaptive K is implemented: every Delta frames,\n",
        "    K = min(max(k_idx_his[-Delta:-1]) + 2, N).\n",
        "    '''\n",
        "\n",
        "    N = 10                       # number of users\n",
        "    n = 30000                    # number of time frames\n",
        "    K = N                        # initialize K = N\n",
        "    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'\n",
        "    Memory = 1024                # capacity of memory structure\n",
        "    Delta = 32                   # update interval for adaptive K\n",
        "\n",
        "    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))\n",
        "    # Load data\n",
        "    channel = sio.loadmat('./data/data_%d' %N)['input_h']\n",
        "    rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.\n",
        "\n",
        "    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning\n",
        "    channel = channel * 1000000\n",
        "\n",
        "    # generate the train and test data sample index\n",
        "    # data are split 80:20\n",
        "    # training data are randomly sampled with duplication if n > total data size\n",
        "\n",
        "    split_idx = int(.8 * len(channel))\n",
        "    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size\n",
        "\n",
        "\n",
        "    mem = MemoryDNN(net = [N, 120, 80, N],\n",
        "                    learning_rate = 0.01,\n",
        "                    training_interval=10,\n",
        "                    batch_size=128,\n",
        "                    memory_size=Memory\n",
        "                    )\n",
        "\n",
        "    start_time = time.time()\n",
        "\n",
        "    rate_his = []\n",
        "    rate_his_ratio = []\n",
        "    mode_his = []\n",
        "    k_idx_his = []\n",
        "    K_his = []\n",
        "    for i in range(n):\n",
        "        if i % (n//10) == 0:\n",
        "            print(\"%0.1f\"%(i/n))\n",
        "        if i > 0 and i % Delta == 0:\n",
        "            # index counts from 0\n",
        "            if Delta > 1:\n",
        "                max_k = max(k_idx_his[-Delta:-1]) + 1\n",
        "            else:\n",
        "                max_k = k_idx_his[-1] + 1\n",
        "            K = min(max_k + 1, N)\n",
        "\n",
        "        if i < n - num_test:\n",
        "            # training\n",
        "            i_idx = i % split_idx\n",
        "        else:\n",
        "            # test\n",
        "            i_idx = i - n + num_test + split_idx\n",
        "\n",
        "        h = channel[i_idx,:]\n",
        "\n",
        "        # the action selection must be either 'OP' or 'KNN'\n",
        "        m_list = mem.decode(h, K, decoder_mode)\n",
        "\n",
        "        r_list = []\n",
        "        for m in m_list:\n",
        "            r_list.append(bisection(h/1000000, m)[0])\n",
        "\n",
        "        # encode the mode with largest reward\n",
        "        mem.encode(h, m_list[np.argmax(r_list)])\n",
        "        # the main code for DROO training ends here\n",
        "\n",
        "\n",
        "\n",
        "\n",
        "        # the following codes store some interesting metrics for illustration\n",
        "        # memorize the largest reward\n",
        "        rate_his.append(np.max(r_list))\n",
        "        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])\n",
        "        # record the index of largest reward\n",
        "        k_idx_his.append(np.argmax(r_list))\n",
        "        # record K in case of adaptive K\n",
        "        K_his.append(K)\n",
        "        mode_his.append(m_list[np.argmax(r_list)])\n",
        "\n",
        "\n",
        "    total_time = time.time() - start_time\n",
        "    mem.plot_cost()\n",
        "    plot_rate(rate_his_ratio)\n",
        "\n",
        "    print(\"Averaged normalized computation rate:\", sum(rate_his_ratio[-num_test:])/num_test)\n",
        "    print('Total time consumed:%s'%total_time)\n",
        "    print('Average time per channel:%s'%(total_time/n))\n",
"\n", 228 | " # save data into txt\n", 229 | " save_to_txt(k_idx_his, \"k_idx_his.txt\")\n", 230 | " save_to_txt(K_his, \"K_his.txt\")\n", 231 | " save_to_txt(mem.cost_his, \"cost_his.txt\")\n", 232 | " save_to_txt(rate_his_ratio, \"rate_his_ratio.txt\")\n", 233 | " save_to_txt(mode_his, \"mode_his.txt\")" 234 | ], 235 | "metadata": { 236 | "id": "eyzq8nm0PnRy" 237 | }, 238 | "execution_count": null, 239 | "outputs": [] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "source": [ 244 | "# New Section" 245 | ], 246 | "metadata": { 247 | "id": "MkqI9XbFHDmT" 248 | } 249 | } 250 | ] 251 | } -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | # ################################################################# 2 | # Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks 3 | # 4 | # This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitutes 80% and 20%), trains the DNN with training and validation samples, and finally tests the DNN with test data. 5 | # 6 | # Input: ./data/data_#.mat 7 | # Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes 8 | # ----------------------------------------------------------------- 9 | # | wireless channel gain | input_h | 10 | # ----------------------------------------------------------------- 11 | # | computing mode selection | output_mode | 12 | # ----------------------------------------------------------------- 13 | # | energy broadcasting parameter | output_a | 14 | # ----------------------------------------------------------------- 15 | # | transmit time of wireless device | output_tau | 16 | # ----------------------------------------------------------------- 17 | # | weighted sum computation rate | output_obj | 18 | # ----------------------------------------------------------------- 19 | # 20 | # 21 | # References: 22 | # [1] 1. Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811. 23 | # [2] S. Bi and Y. J. Zhang, “Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading,” IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018. 24 | # 25 | # version 1.0 -- July 2018. 
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

from memory import MemoryDNN
from optimization import bisection

import time


def plot_rate(rate_his, rolling_intv=50):
    import matplotlib.pyplot as plt
    import pandas as pd
    import matplotlib as mpl

    rate_array = np.asarray(rate_his)
    df = pd.DataFrame(rate_his)


    mpl.style.use('seaborn')
    fig, ax = plt.subplots(figsize=(15,8))
    # rolling_intv = 20

    plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')
    plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)
    plt.ylabel('Normalized Computation Rate')
    plt.xlabel('Time Frames')
    plt.show()

def save_to_txt(rate_his, file_path):
    with open(file_path, 'w') as f:
        for rate in rate_his:
            f.write("%s \n" % rate)

if __name__ == "__main__":
    '''
    This algorithm generates K candidate offloading modes from the DNN and
    chooses the one with the largest reward. The mode with the largest reward
    is stored in the memory, which is further used to train the DNN.
    Adaptive K is implemented: every Delta frames,
    K = min(max(k_idx_his[-Delta:-1]) + 2, N).
    '''

    N = 10                       # number of users
    n = 30000                    # number of time frames
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.
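
    # Note on shapes (as implied by the data description above): 'channel' is a
    # (30000, N) array whose i-th row is the channel-gain vector of time frame i,
    # and 'rate' is a (30000, 1) array with the corresponding optimal weighted sum
    # computation rate, used below only to normalize the plotted results.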

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000

    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    for i in range(n):
        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)

        if i < n - num_test:
            # training
            i_idx = i % split_idx
        else:
            # test
            i_idx = i - n + num_test + split_idx

        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            r_list.append(bisection(h/1000000, m)[0])

        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])
        # the main code for DROO training ends here




        # the following codes store some interesting metrics for illustration
        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        mode_his.append(m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")

--------------------------------------------------------------------------------
/mainPyTorch.py:
--------------------------------------------------------------------------------
# #################################################################
# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
#
# This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with the training and validation samples, and finally tests the DNN with the test data.
#
# Input: ./data/data_#.mat
#   Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes
# -----------------------------------------------------------------
# | wireless channel gain | input_h |
# -----------------------------------------------------------------
# | computing mode selection | output_mode |
# -----------------------------------------------------------------
# | energy broadcasting parameter | output_a |
# -----------------------------------------------------------------
# | transmit time of wireless device | output_tau |
# -----------------------------------------------------------------
# | weighted sum computation rate | output_obj |
# -----------------------------------------------------------------
#
#
# References:
# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.
# [2] S. Bi and Y. J. Zhang, "Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading," IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.
#
# version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

# Implemented based on PyTorch
from memoryPyTorch import MemoryDNN
from optimization import bisection

import time


def plot_rate(rate_his, rolling_intv=50):
    import matplotlib.pyplot as plt
    import pandas as pd
    import matplotlib as mpl

    rate_array = np.asarray(rate_his)
    df = pd.DataFrame(rate_his)


    mpl.style.use('seaborn')
    fig, ax = plt.subplots(figsize=(15, 8))
    # rolling_intv = 20

    plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')
    plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)
    plt.ylabel('Normalized Computation Rate')
    plt.xlabel('Time Frames')
    plt.show()

def save_to_txt(rate_his, file_path):
    with open(file_path, 'w') as f:
        for rate in rate_his:
            f.write("%s \n" % rate)

if __name__ == "__main__":
    '''
    This algorithm generates K candidate offloading modes from the DNN and
    chooses the one with the largest reward. The mode with the largest reward
    is stored in the memory, which is further used to train the DNN.
    Adaptive K is implemented: every Delta frames,
    K = min(max(k_idx_his[-Delta:-1]) + 2, N).
    '''

    N = 10                       # number of users
    n = 30000                    # number of time frames
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000

    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    for i in range(n):
        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)

        if i < n - num_test:
            # training
            i_idx = i % split_idx
        else:
            # test
            i_idx = i - n + num_test + split_idx

        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            r_list.append(bisection(h/1000000, m)[0])

        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])
        # the main code for DROO training ends here




        # the following codes store some interesting metrics for illustration
        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        mode_his.append(m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")

--------------------------------------------------------------------------------
/mainTF2:
--------------------------------------------------------------------------------
# #################################################################
# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
#
# This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with the training and validation samples, and finally tests the DNN with the test data.
#
# Input: ./data/data_#.mat
#   Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes
# -----------------------------------------------------------------
# | wireless channel gain | input_h |
# -----------------------------------------------------------------
# | computing mode selection | output_mode |
# -----------------------------------------------------------------
# | energy broadcasting parameter | output_a |
# -----------------------------------------------------------------
# | transmit time of wireless device | output_tau |
# -----------------------------------------------------------------
# | weighted sum computation rate | output_obj |
# -----------------------------------------------------------------
#
#
# References:
# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.
# [2] S. Bi and Y. J. Zhang, "Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading," IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.
#
# version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

# for tensorflow2
from memoryTF2 import MemoryDNN
from optimization import bisection

import time


def plot_rate(rate_his, rolling_intv=50):
    import matplotlib.pyplot as plt
    import pandas as pd
    import matplotlib as mpl

    rate_array = np.asarray(rate_his)
    df = pd.DataFrame(rate_his)


    mpl.style.use('seaborn')
    fig, ax = plt.subplots(figsize=(15,8))
    # rolling_intv = 20

    plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')
    plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)
    plt.ylabel('Normalized Computation Rate')
    plt.xlabel('Time Frames')
    plt.show()

def save_to_txt(rate_his, file_path):
    with open(file_path, 'w') as f:
        for rate in rate_his:
            f.write("%s \n" % rate)

if __name__ == "__main__":
    '''
    This algorithm generates K candidate offloading modes from the DNN and
    chooses the one with the largest reward. The mode with the largest reward
    is stored in the memory, which is further used to train the DNN.
    Adaptive K is implemented: every Delta frames,
    K = min(max(k_idx_his[-Delta:-1]) + 2, N).
    '''

    N = 10                       # number of users
    n = 30000                    # number of time frames
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000

    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    for i in range(n):
        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)

        if i < n - num_test:
            # training
            i_idx = i % split_idx
        else:
            # test
            i_idx = i - n + num_test + split_idx

        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            r_list.append(bisection(h/1000000, m)[0])

        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])
        # the main code for DROO training ends here




        # the following codes store some interesting metrics for illustration
        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        mode_his.append(m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")

--------------------------------------------------------------------------------
/memory.ipynb:
--------------------------------------------------------------------------------
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyMFps9x/kObIv6+SfRFYpWe",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "\"Open"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "\n"
      ],
      "metadata": {
        "id": "ykiMHW1OShjG"
      },
      "execution_count": 1,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# **This memory file contains the memory operations, including encoding and decoding.**\n",
including encoding and decoding operations.**\n", 44 | "version 1.0 -- January 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)\n" 45 | ], 46 | "metadata": { 47 | "id": "lpeCAT7KTAZe" 48 | } 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "source": [], 53 | "metadata": { 54 | "id": "Ou8xRVrVS9Eu" 55 | } 56 | }, 57 | { 58 | "cell_type": "code", 59 | "source": [ 60 | "from __future__ import print_function\n", 61 | "import tensorflow as tf\n", 62 | "import numpy as np" 63 | ], 64 | "metadata": { 65 | "id": "taUp-kQTSVWU" 66 | }, 67 | "execution_count": 1, 68 | "outputs": [] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 2, 73 | "metadata": { 74 | "colab": { 75 | "base_uri": "https://localhost:8080/" 76 | }, 77 | "id": "VljS8oG6SJkK", 78 | "outputId": "4bb4f086-69f6-4516-9ae2-2eb56af9c0e6" 79 | }, 80 | "outputs": [ 81 | { 82 | "output_type": "stream", 83 | "name": "stderr", 84 | "text": [ 85 | "<>:13: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 86 | "<>:130: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 87 | "<>:132: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 88 | "<>:162: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 89 | "<>:13: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 90 | "<>:130: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 91 | "<>:132: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 92 | "<>:162: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 93 | ":13: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 94 | " assert(len(net) is 4) # only 4-layer DNN\n", 95 | ":130: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 96 | " if mode is 'OP':\n", 97 | ":132: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 98 | " elif mode is 'KNN':\n", 99 | ":162: SyntaxWarning: \"is\" with a literal. 
Did you mean \"==\"?\n", 100 | " if len(self.enumerate_actions) is 0:\n" 101 | ] 102 | } 103 | ], 104 | "source": [ 105 | "# DNN network for memory\n", 106 | "class MemoryDNN:\n", 107 | " def __init__(\n", 108 | " self,\n", 109 | " net,\n", 110 | " learning_rate = 0.01,\n", 111 | " training_interval=10, \n", 112 | " batch_size=100, \n", 113 | " memory_size=1000,\n", 114 | " output_graph=False\n", 115 | " ):\n", 116 | " # net: [n_input, n_hidden_1st, n_hidded_2ed, n_output]\n", 117 | " assert(len(net) is 4) # only 4-layer DNN\n", 118 | "\n", 119 | " self.net = net\n", 120 | " self.training_interval = training_interval # learn every #training_interval\n", 121 | " self.lr = learning_rate\n", 122 | " self.batch_size = batch_size\n", 123 | " self.memory_size = memory_size\n", 124 | " \n", 125 | " # store all binary actions\n", 126 | " self.enumerate_actions = []\n", 127 | "\n", 128 | " # stored # memory entry\n", 129 | " self.memory_counter = 1\n", 130 | "\n", 131 | " # store training cost\n", 132 | " self.cost_his = []\n", 133 | "\n", 134 | " # reset graph \n", 135 | " tf.reset_default_graph()\n", 136 | "\n", 137 | " # initialize zero memory [h, m]\n", 138 | " self.memory = np.zeros((self.memory_size, self.net[0]+ self.net[-1]))\n", 139 | "\n", 140 | " # construct memory network\n", 141 | " self._build_net()\n", 142 | "\n", 143 | " self.sess = tf.Session()\n", 144 | "\n", 145 | " # for tensorboard\n", 146 | " if output_graph:\n", 147 | " # $ tensorboard --logdir=logs\n", 148 | " # tf.train.SummaryWriter soon be deprecated, use following\n", 149 | " tf.summary.FileWriter(\"logs/\", self.sess.graph)\n", 150 | "\n", 151 | " self.sess.run(tf.global_variables_initializer())\n", 152 | "\n", 153 | "\n", 154 | " def _build_net(self):\n", 155 | " def build_layers(h, c_names, net, w_initializer, b_initializer):\n", 156 | " with tf.variable_scope('l1'):\n", 157 | " w1 = tf.get_variable('w1', [net[0], net[1]], initializer=w_initializer, collections=c_names)\n", 158 | " b1 = tf.get_variable('b1', [1, self.net[1]], initializer=b_initializer, collections=c_names)\n", 159 | " l1 = tf.nn.relu(tf.matmul(h, w1) + b1)\n", 160 | "\n", 161 | " with tf.variable_scope('l2'):\n", 162 | " w2 = tf.get_variable('w2', [net[1], net[2]], initializer=w_initializer, collections=c_names)\n", 163 | " b2 = tf.get_variable('b2', [1, net[2]], initializer=b_initializer, collections=c_names)\n", 164 | " l2 = tf.nn.relu(tf.matmul(l1, w2) + b2)\n", 165 | "\n", 166 | " with tf.variable_scope('M'):\n", 167 | " w3 = tf.get_variable('w3', [net[2], net[3]], initializer=w_initializer, collections=c_names)\n", 168 | " b3 = tf.get_variable('b3', [1, net[3]], initializer=b_initializer, collections=c_names)\n", 169 | " out = tf.matmul(l2, w3) + b3\n", 170 | "\n", 171 | " return out\n", 172 | "\n", 173 | " # ------------------ build memory_net ------------------\n", 174 | " self.h = tf.placeholder(tf.float32, [None, self.net[0]], name='h') # input\n", 175 | " self.m = tf.placeholder(tf.float32, [None, self.net[-1]], name='mode') # for calculating loss\n", 176 | " self.is_train = tf.placeholder(\"bool\") # train or evaluate\n", 177 | "\n", 178 | " with tf.variable_scope('memory_net'):\n", 179 | " c_names, w_initializer, b_initializer = \\\n", 180 | " ['memory_net_params', tf.GraphKeys.GLOBAL_VARIABLES], \\\n", 181 | " tf.random_normal_initializer(0., 1/self.net[0]), tf.constant_initializer(0.1) # config of layers\n", 182 | "\n", 183 | " self.m_pred = build_layers(self.h, c_names, self.net, w_initializer, b_initializer)\n", 184 | "\n", 185 | " 
with tf.variable_scope('loss'):\n", 186 | " self.loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels = self.m, logits = self.m_pred))\n", 187 | "\n", 188 | " with tf.variable_scope('train'):\n", 189 | " self._train_op = tf.train.AdamOptimizer(self.lr, 0.09).minimize(self.loss)\n", 190 | "\n", 191 | "\n", 192 | " def remember(self, h, m):\n", 193 | " # replace the old memory with new memory\n", 194 | " idx = self.memory_counter % self.memory_size\n", 195 | " self.memory[idx, :] = np.hstack((h,m))\n", 196 | "\n", 197 | " self.memory_counter += 1\n", 198 | "\n", 199 | " def encode(self, h, m):\n", 200 | " # encoding the entry\n", 201 | " self.remember(h, m)\n", 202 | " # train the DNN every 10 step\n", 203 | "# if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:\n", 204 | " if self.memory_counter % self.training_interval == 0:\n", 205 | " self.learn()\n", 206 | "\n", 207 | " def learn(self):\n", 208 | " # sample batch memory from all memory\n", 209 | " if self.memory_counter > self.memory_size:\n", 210 | " sample_index = np.random.choice(self.memory_size, size=self.batch_size)\n", 211 | " else:\n", 212 | " sample_index = np.random.choice(self.memory_counter, size=self.batch_size)\n", 213 | " batch_memory = self.memory[sample_index, :]\n", 214 | " \n", 215 | " h_train = batch_memory[:, 0: self.net[0]]\n", 216 | " m_train = batch_memory[:, self.net[0]:]\n", 217 | " \n", 218 | " # print(h_train)\n", 219 | " # print(m_train)\n", 220 | "\n", 221 | " # train the DNN\n", 222 | " _, self.cost = self.sess.run([self._train_op, self.loss], \n", 223 | " feed_dict={self.h: h_train, self.m: m_train})\n", 224 | "\n", 225 | " assert(self.cost >0) \n", 226 | " self.cost_his.append(self.cost)\n", 227 | "\n", 228 | " def decode(self, h, k = 1, mode = 'OP'):\n", 229 | " # to have batch dimension when feed into tf placeholder\n", 230 | " h = h[np.newaxis, :]\n", 231 | "\n", 232 | " m_pred = self.sess.run(self.m_pred, feed_dict={self.h: h})\n", 233 | "\n", 234 | " if mode is 'OP':\n", 235 | " return self.knm(m_pred[0], k)\n", 236 | " elif mode is 'KNN':\n", 237 | " return self.knn(m_pred[0], k)\n", 238 | " else:\n", 239 | " print(\"The action selection must be 'OP' or 'KNN'\")\n", 240 | " \n", 241 | " def knm(self, m, k = 1):\n", 242 | " # return k-nearest-mode\n", 243 | " m_list = []\n", 244 | " \n", 245 | " # generate the first binary offloading decision \n", 246 | " # note that here 'm' is the output of DNN before the sigmoid activation function, in the field of all real number. \n", 247 | " # Therefore, we compare it with '0' instead of 0.5 in equation (8). 
This is because sigmoid(0) = 0.5.\n",
248 | "        m_list.append(1*(m>0))\n",
249 | "        \n",
250 | "        if k > 1:\n",
251 | "            # generate the remaining K-1 binary offloading decisions with respect to equation (9)\n",
252 | "            m_abs = abs(m)\n",
253 | "            idx_list = np.argsort(m_abs)[:k-1]\n",
254 | "            for i in range(k-1):\n",
255 | "                if m[idx_list[i]] >0:\n",
256 | "                    # set a positive user to 0\n",
257 | "                    m_list.append(1*(m - m[idx_list[i]] > 0))\n",
258 | "                else:\n",
259 | "                    # set a negative user to 1\n",
260 | "                    m_list.append(1*(m - m[idx_list[i]] >= 0))\n",
261 | "\n",
262 | "        return m_list\n",
263 | "    \n",
264 | "    def knn(self, m, k = 1):\n",
265 | "        # list all 2^N binary offloading actions\n",
266 | "        if len(self.enumerate_actions) is 0:\n",
267 | "            import itertools\n",
268 | "            self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))\n",
269 | "\n",
270 | "        # the 2-norm\n",
271 | "        sqd = ((self.enumerate_actions - m)**2).sum(1)\n",
272 | "        idx = np.argsort(sqd)\n",
273 | "        return self.enumerate_actions[idx[:k]]\n",
274 | "    \n",
275 | "\n",
276 | "    def plot_cost(self):\n",
277 | "        import matplotlib.pyplot as plt\n",
278 | "        plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)\n",
279 | "        plt.ylabel('Training Loss')\n",
280 | "        plt.xlabel('Time Frames')\n",
281 | "        plt.show()"
282 |       ]
283 |     }
284 |   ]
285 | }
--------------------------------------------------------------------------------
/memory.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # This file contains the memory operations, including encoding and decoding operations.
3 | #
4 | # version 1.0 -- January 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
5 | # #################################################################
6 | 
7 | from __future__ import print_function
8 | import tensorflow as tf
9 | import numpy as np
10 | 
11 | 
12 | # DNN network for memory
13 | class MemoryDNN:
14 |     def __init__(
15 |         self,
16 |         net,
17 |         learning_rate = 0.01,
18 |         training_interval=10,
19 |         batch_size=100,
20 |         memory_size=1000,
21 |         output_graph=False
22 |     ):
23 |         # net: [n_input, n_hidden_1st, n_hidden_2nd, n_output]
24 |         assert(len(net) == 4) # only 4-layer DNN
25 | 
26 |         self.net = net
27 |         self.training_interval = training_interval # learn every #training_interval steps
28 |         self.lr = learning_rate
29 |         self.batch_size = batch_size
30 |         self.memory_size = memory_size
31 | 
32 |         # store all binary actions
33 |         self.enumerate_actions = []
34 | 
35 |         # count of stored memory entries
36 |         self.memory_counter = 1
37 | 
38 |         # store training cost
39 |         self.cost_his = []
40 | 
41 |         # reset graph
42 |         tf.reset_default_graph()
43 | 
44 |         # initialize zero memory [h, m]
45 |         self.memory = np.zeros((self.memory_size, self.net[0]+ self.net[-1]))
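        # Each memory row stores one training pair flattened as [h | m]. A minimal
        # illustration (hypothetical sizes, assuming net = [10, 120, 80, 10]):
        #   row length = net[0] + net[-1] = 10 + 10 = 20
        #   row i = [h_1, ..., h_10, m_1, ..., m_10]
        # remember() below overwrites rows cyclically via memory_counter % memory_size.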
46 | 
47 |         # construct memory network
48 |         self._build_net()
49 | 
50 |         self.sess = tf.Session()
51 | 
52 |         # for tensorboard
53 |         if output_graph:
54 |             # $ tensorboard --logdir=logs
55 |             # tf.train.SummaryWriter will soon be deprecated; use the following
56 |             tf.summary.FileWriter("logs/", self.sess.graph)
57 | 
58 |         self.sess.run(tf.global_variables_initializer())
59 | 
60 | 
61 |     def _build_net(self):
62 |         def build_layers(h, c_names, net, w_initializer, b_initializer):
63 |             with tf.variable_scope('l1'):
64 |                 w1 = tf.get_variable('w1', [net[0], net[1]], initializer=w_initializer, collections=c_names)
65 |                 b1 = tf.get_variable('b1', [1, self.net[1]], initializer=b_initializer, collections=c_names)
66 |                 l1 = tf.nn.relu(tf.matmul(h, w1) + b1)
67 | 
68 |             with tf.variable_scope('l2'):
69 |                 w2 = tf.get_variable('w2', [net[1], net[2]], initializer=w_initializer, collections=c_names)
70 |                 b2 = tf.get_variable('b2', [1, net[2]], initializer=b_initializer, collections=c_names)
71 |                 l2 = tf.nn.relu(tf.matmul(l1, w2) + b2)
72 | 
73 |             with tf.variable_scope('M'):
74 |                 w3 = tf.get_variable('w3', [net[2], net[3]], initializer=w_initializer, collections=c_names)
75 |                 b3 = tf.get_variable('b3', [1, net[3]], initializer=b_initializer, collections=c_names)
76 |                 out = tf.matmul(l2, w3) + b3
77 | 
78 |             return out
79 | 
80 |         # ------------------ build memory_net ------------------
81 |         self.h = tf.placeholder(tf.float32, [None, self.net[0]], name='h')  # input
82 |         self.m = tf.placeholder(tf.float32, [None, self.net[-1]], name='mode')  # for calculating loss
83 |         self.is_train = tf.placeholder("bool")  # train or evaluate
84 | 
85 |         with tf.variable_scope('memory_net'):
86 |             c_names, w_initializer, b_initializer = \
87 |                 ['memory_net_params', tf.GraphKeys.GLOBAL_VARIABLES], \
88 |                 tf.random_normal_initializer(0., 1/self.net[0]), tf.constant_initializer(0.1)  # config of layers
89 | 
90 |             self.m_pred = build_layers(self.h, c_names, self.net, w_initializer, b_initializer)
91 | 
92 |         with tf.variable_scope('loss'):
93 |             self.loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels = self.m, logits = self.m_pred))
94 | 
95 |         with tf.variable_scope('train'):
96 |             self._train_op = tf.train.AdamOptimizer(self.lr, 0.09).minimize(self.loss)  # second argument is Adam's beta1
97 | 
98 | 
99 |     def remember(self, h, m):
100 |         # replace the old memory with new memory
101 |         idx = self.memory_counter % self.memory_size
102 |         self.memory[idx, :] = np.hstack((h,m))
103 | 
104 |         self.memory_counter += 1
105 | 
106 |     def encode(self, h, m):
107 |         # encoding the entry
108 |         self.remember(h, m)
109 |         # train the DNN every training_interval steps
110 | #        if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:
111 |         if self.memory_counter % self.training_interval == 0:
112 |             self.learn()
113 | 
114 |     def learn(self):
115 |         # sample batch memory from all memory
116 |         if self.memory_counter > self.memory_size:
117 |             sample_index = np.random.choice(self.memory_size, size=self.batch_size)
118 |         else:
119 |             sample_index = np.random.choice(self.memory_counter, size=self.batch_size)
120 |         batch_memory = self.memory[sample_index, :]
121 | 
122 |         h_train = batch_memory[:, 0: self.net[0]]
123 |         m_train = batch_memory[:, self.net[0]:]
124 | 
125 |         # print(h_train)
126 |         # print(m_train)
127 | 
128 |         # train the DNN
129 |         _, self.cost = self.sess.run([self._train_op, self.loss],
130 |                                      feed_dict={self.h: h_train, self.m: m_train})
131 | 
132 |         assert(self.cost > 0)
133 |         self.cost_his.append(self.cost)
134 | 
135 |     def decode(self, h, k = 1, mode = 'OP'):
136 |         # add a batch dimension before feeding into the tf placeholder
137 |         h = h[np.newaxis, :]
138 | 
139 |         m_pred = self.sess.run(self.m_pred, feed_dict={self.h: h})
140 | 
141 |         if mode == 'OP':
142 |             return self.knm(m_pred[0], k)
143 |         elif mode == 'KNN':
144 |             return self.knn(m_pred[0], k)
145 |         else:
146 |             print("The action selection must be 'OP' or 'KNN'")
147 | 
148 |     def knm(self, m, k = 1):
149 |         # return k order-preserving binary actions
150 |         m_list = []
151 | 
152 |         # generate the first binary offloading decision
153 |         # note that here 'm' is the output of the DNN before the sigmoid activation, i.e., a real-valued logit vector.
154 |         # Therefore, we compare it with 0 instead of 0.5 as in equation (8), since sigmoid(0) = 0.5.
155 |         m_list.append(1*(m>0))
156 | 
157 |         if k > 1:
158 |             # generate the remaining K-1 binary offloading decisions with respect to equation (9)
159 |             m_abs = abs(m)
160 |             idx_list = np.argsort(m_abs)[:k-1]
161 |             for i in range(k-1):
162 |                 if m[idx_list[i]] >0:
163 |                     # set a positive user to 0
164 |                     m_list.append(1*(m - m[idx_list[i]] > 0))
165 |                 else:
166 |                     # set a negative user to 1
167 |                     m_list.append(1*(m - m[idx_list[i]] >= 0))
168 | 
169 |         return m_list
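    # A worked example of the order-preserving quantization above (illustrative
    # logits, not produced by the DNN): with m = [1.2, -0.4, 2.0] and k = 3,
    #   1st action: 1*(m > 0)                  -> [1, 0, 1]
    #   |m| sorted ascending gives idx_list = [1, 0]
    #   idx 1 (m = -0.4 < 0): 1*(m + 0.4 >= 0) -> [1, 1, 1]
    #   idx 0 (m =  1.2 > 0): 1*(m - 1.2 > 0)  -> [0, 0, 1]
    # so knm returns [[1, 0, 1], [1, 1, 1], [0, 0, 1]].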
170 | 
171 |     def knn(self, m, k = 1):
172 |         # list all 2^N binary offloading actions
173 |         if len(self.enumerate_actions) == 0:
174 |             import itertools
175 |             self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))
176 | 
177 |         # the squared 2-norm
178 |         sqd = ((self.enumerate_actions - m)**2).sum(1)
179 |         idx = np.argsort(sqd)
180 |         return self.enumerate_actions[idx[:k]]
181 | 
182 | 
183 |     def plot_cost(self):
184 |         import matplotlib.pyplot as plt
185 |         plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)
186 |         plt.ylabel('Training Loss')
187 |         plt.xlabel('Time Frames')
188 |         plt.show()
189 | 
--------------------------------------------------------------------------------
/memoryPyTorch.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # This file contains the main DROO operations, including building the DNN,
3 | # storing data samples, training the DNN, and generating quantized binary offloading decisions.
4 | 
5 | # version 1.0 -- February 2020. Written based on Tensorflow 2 by Weijian Pan and
6 | # Liang Huang (lianghuang AT zjut.edu.cn)
7 | # #################################################################
8 | 
9 | from __future__ import print_function
10 | import torch
11 | import torch.optim as optim
12 | import torch.nn as nn
13 | import numpy as np
14 | 
15 | print(torch.__version__)
16 | 
17 | 
18 | # DNN network for memory
19 | class MemoryDNN:
20 |     def __init__(
21 |         self,
22 |         net,
23 |         learning_rate = 0.01,
24 |         training_interval=10,
25 |         batch_size=100,
26 |         memory_size=1000,
27 |         output_graph=False
28 |     ):
29 | 
30 |         self.net = net
31 |         self.training_interval = training_interval  # learn every #training_interval steps
32 |         self.lr = learning_rate
33 |         self.batch_size = batch_size
34 |         self.memory_size = memory_size
35 | 
36 |         # store all binary actions
37 |         self.enumerate_actions = []
38 | 
39 |         # count of stored memory entries
40 |         self.memory_counter = 1
41 | 
42 |         # store training cost
43 |         self.cost_his = []
44 | 
45 |         # initialize zero memory [h, m]
46 |         self.memory = np.zeros((self.memory_size, self.net[0] + self.net[-1]))
47 | 
48 |         # construct memory network
49 |         self._build_net()
50 | 
51 |     def _build_net(self):
52 |         self.model = nn.Sequential(
53 |             nn.Linear(self.net[0], self.net[1]),
54 |             nn.ReLU(),
55 |             nn.Linear(self.net[1], self.net[2]),
56 |             nn.ReLU(),
57 |             nn.Linear(self.net[2], self.net[3]),
58 |             nn.Sigmoid()
59 |         )
60 | 
61 |     def remember(self, h, m):
62 |         # replace the old memory with new memory
63 |         idx = self.memory_counter % self.memory_size
64 |         self.memory[idx, :] = np.hstack((h, m))
65 | 
66 |         self.memory_counter += 1
67 | 
68 |     def encode(self, h, m):
69 |         # encoding the entry
70 |         self.remember(h, m)
71 |         # train the DNN every training_interval steps
72 | #        if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:
73 |         if self.memory_counter % self.training_interval == 0:
74 |             self.learn()
75 | 
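    # Note on the training step below: a fresh Adam optimizer is constructed on
    # every call to learn(), so Adam's moment estimates are reset at each update.
    # A common alternative (an untested sketch, not the authors' code) is to build
    # it once in __init__ and reuse it:
    #   self.optimizer = optim.Adam(self.model.parameters(), lr=self.lr,
    #                               betas=(0.09, 0.999), weight_decay=0.0001)
    #   self.criterion = nn.BCELoss()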
76 |     def learn(self):
77 |         # sample batch memory from all memory
78 |         if self.memory_counter > self.memory_size:
79 |             sample_index = np.random.choice(self.memory_size, size=self.batch_size)
80 |         else:
81 |             sample_index = np.random.choice(self.memory_counter, size=self.batch_size)
82 |         batch_memory = self.memory[sample_index, :]
83 | 
84 |         h_train = torch.Tensor(batch_memory[:, 0: self.net[0]])
85 |         m_train = torch.Tensor(batch_memory[:, self.net[0]:])
86 | 
87 | 
88 |         # train the DNN
89 |         optimizer = optim.Adam(self.model.parameters(), lr=self.lr, betas=(0.09, 0.999), weight_decay=0.0001)
90 |         criterion = nn.BCELoss()
91 |         self.model.train()
92 |         optimizer.zero_grad()
93 |         predict = self.model(h_train)
94 |         loss = criterion(predict, m_train)
95 |         loss.backward()
96 |         optimizer.step()
97 | 
98 |         self.cost = loss.item()
99 |         assert(self.cost > 0)
100 |         self.cost_his.append(self.cost)
101 | 
102 |     def decode(self, h, k = 1, mode = 'OP'):
103 |         # add a batch dimension before feeding into the model
104 |         h = torch.Tensor(h[np.newaxis, :])
105 | 
106 |         self.model.eval()
107 |         m_pred = self.model(h)
108 |         m_pred = m_pred.detach().numpy()
109 | 
110 |         if mode == 'OP':
111 |             return self.knm(m_pred[0], k)
112 |         elif mode == 'KNN':
113 |             return self.knn(m_pred[0], k)
114 |         else:
115 |             print("The action selection must be 'OP' or 'KNN'")
116 | 
117 |     def knm(self, m, k = 1):
118 |         # return k order-preserving binary actions
119 |         m_list = []
120 |         # generate the first binary offloading decision with respect to equation (8)
121 |         m_list.append(1*(m>0.5))
122 | 
123 |         if k > 1:
124 |             # generate the remaining K-1 binary offloading decisions with respect to equation (9)
125 |             m_abs = abs(m-0.5)
126 |             idx_list = np.argsort(m_abs)[:k-1]
127 |             for i in range(k-1):
128 |                 if m[idx_list[i]] >0.5:
129 |                     # set the \hat{x}_{t,(k-1)} to 0
130 |                     m_list.append(1*(m - m[idx_list[i]] > 0))
131 |                 else:
132 |                     # set the \hat{x}_{t,(k-1)} to 1
133 |                     m_list.append(1*(m - m[idx_list[i]] >= 0))
134 | 
135 |         return m_list
136 | 
137 |     def knn(self, m, k = 1):
138 |         # list all 2^N binary offloading actions
139 |         if len(self.enumerate_actions) == 0:
140 |             import itertools
141 |             self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))
142 | 
143 |         # the squared 2-norm
144 |         sqd = ((self.enumerate_actions - m)**2).sum(1)
145 |         idx = np.argsort(sqd)
146 |         return self.enumerate_actions[idx[:k]]
147 | 
148 | 
149 |     def plot_cost(self):
150 |         import matplotlib.pyplot as plt
151 |         plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)
152 |         plt.ylabel('Training Loss')
153 |         plt.xlabel('Time Frames')
154 |         plt.show()
155 | 
156 | 
--------------------------------------------------------------------------------
/memoryTF2.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # This file contains the main DROO operations, including building the DNN,
3 | # storing data samples, training the DNN, and generating quantized binary offloading decisions.
4 | 
5 | # version 1.0 -- January 2020. Written based on Tensorflow 2 by Weijian Pan and
6 | # Liang Huang (lianghuang AT zjut.edu.cn)
7 | # #################################################################
8 | 
9 | from __future__ import print_function
10 | import tensorflow as tf
11 | from tensorflow import keras
12 | from tensorflow.keras import layers
13 | import numpy as np
14 | 
15 | print(tf.__version__)
16 | print(tf.keras.__version__)
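# A minimal usage sketch of the class below (mirroring mainTF2.py; values illustrative):
#   mem = MemoryDNN(net=[10, 120, 80, 10], learning_rate=0.01,
#                   training_interval=10, batch_size=128, memory_size=1024)
#   m_list = mem.decode(h, k=10, mode='OP')   # K candidate binary actions
#   mem.encode(h, m_list[best])               # store the best one and (periodically) train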
17 | 
18 | 
19 | # DNN network for memory
20 | class MemoryDNN:
21 |     def __init__(
22 |         self,
23 |         net,
24 |         learning_rate = 0.01,
25 |         training_interval=10,
26 |         batch_size=100,
27 |         memory_size=1000,
28 |         output_graph=False
29 |     ):
30 | 
31 |         self.net = net  # the size of the DNN
32 |         self.training_interval = training_interval  # learn every #training_interval steps
33 |         self.lr = learning_rate
34 |         self.batch_size = batch_size
35 |         self.memory_size = memory_size
36 | 
37 |         # store all binary actions
38 |         self.enumerate_actions = []
39 | 
40 |         # count of stored memory entries
41 |         self.memory_counter = 1
42 | 
43 |         # store training cost
44 |         self.cost_his = []
45 | 
46 |         # initialize zero memory [h, m]
47 |         self.memory = np.zeros((self.memory_size, self.net[0] + self.net[-1]))
48 | 
49 |         # construct memory network
50 |         self._build_net()
51 | 
52 |     def _build_net(self):
53 |         self.model = keras.Sequential([
54 |             layers.Dense(self.net[1], activation='relu'),     # the first hidden layer
55 |             layers.Dense(self.net[2], activation='relu'),     # the second hidden layer
56 |             layers.Dense(self.net[-1], activation='sigmoid')  # the output layer
57 |         ])
58 | 
59 |         self.model.compile(optimizer=keras.optimizers.Adam(lr=self.lr), loss=tf.losses.binary_crossentropy, metrics=['accuracy'])
60 | 
61 |     def remember(self, h, m):
62 |         # replace the old memory with new memory
63 |         idx = self.memory_counter % self.memory_size
64 |         self.memory[idx, :] = np.hstack((h, m))
65 | 
66 |         self.memory_counter += 1
67 | 
68 |     def encode(self, h, m):
69 |         # encoding the entry
70 |         self.remember(h, m)
71 |         # train the DNN every training_interval steps
72 | #        if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:
73 |         if self.memory_counter % self.training_interval == 0:
74 |             self.learn()
75 | 
76 |     def learn(self):
77 |         # sample batch memory from all memory
78 |         if self.memory_counter > self.memory_size:
79 |             sample_index = np.random.choice(self.memory_size, size=self.batch_size)
80 |         else:
81 |             sample_index = np.random.choice(self.memory_counter, size=self.batch_size)
82 |         batch_memory = self.memory[sample_index, :]
83 | 
84 |         h_train = batch_memory[:, 0: self.net[0]]
85 |         m_train = batch_memory[:, self.net[0]:]
86 | 
87 |         # print(h_train)  # (128, 10)
88 |         # print(m_train)  # (128, 10)
89 | 
90 |         # train the DNN for one step on the sampled batch
91 |         hist = self.model.fit(h_train, m_train, verbose=0)
92 |         self.cost = hist.history['loss'][0]
93 |         assert(self.cost > 0)
94 |         self.cost_his.append(self.cost)
95 | 
96 |     def decode(self, h, k = 1, mode = 'OP'):
97 |         # add a batch dimension before feeding into the model
98 |         h = h[np.newaxis, :]
99 | 
100 |         m_pred = self.model.predict(h)
101 | 
102 |         if mode == 'OP':
103 |             return self.knm(m_pred[0], k)
104 |         elif mode == 'KNN':
105 |             return self.knn(m_pred[0], k)
106 |         else:
107 |             print("The action selection must be 'OP' or 'KNN'")
108 | 
109 |     def knm(self, m, k = 1):
110 |         # return k order-preserving binary actions
111 |         m_list = []
112 |         # generate the first binary offloading decision with respect to equation (8)
113 |         m_list.append(1*(m>0.5))
114 | 
115 |         if k > 1:
116 |             # generate the remaining K-1 binary offloading decisions with respect to equation (9)
117 |             m_abs = abs(m-0.5)
118 |             idx_list = np.argsort(m_abs)[:k-1]
119 |             for i in range(k-1):
120 |                 if m[idx_list[i]] >0.5:
121 |                     # set the \hat{x}_{t,(k-1)} to 0
122 |                     m_list.append(1*(m - m[idx_list[i]] > 0))
123 |                 else:
124 |                     # set the \hat{x}_{t,(k-1)} to 1
125 |                     m_list.append(1*(m - m[idx_list[i]] >= 0))
126 | 
127 |         return m_list
128 | 
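    # Note on knn below: it enumerates all 2^N binary actions and keeps the k
    # closest (in Euclidean distance) to the relaxed output m, so it is only
    # practical for small N, e.g. N = 10 gives 2^10 = 1024 candidate actions,
    # while N = 30 would already give about 1.07e9.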
129 |     def knn(self, m, k = 1):
130 |         # list all 2^N binary offloading actions
131 |         if len(self.enumerate_actions) == 0:
132 |             import itertools
133 |             self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))
134 | 
135 |         # the squared 2-norm
136 |         sqd = ((self.enumerate_actions - m)**2).sum(1)
137 |         idx = np.argsort(sqd)
138 |         return self.enumerate_actions[idx[:k]]
139 | 
140 | 
141 |     def plot_cost(self):
142 |         import matplotlib.pyplot as plt
143 |         plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)
144 |         plt.ylabel('Training Loss')
145 |         plt.xlabel('Time Frames')
146 |         plt.show()
147 | 
--------------------------------------------------------------------------------
/optimization.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | Created on Tue Jan 9 10:45:26 2018
4 | 
5 | @author: Administrator
6 | """
7 | import numpy as np
8 | from scipy import optimize
9 | from scipy.special import lambertw
10 | import scipy.io as sio  # import scipy.io for .mat file I/O
11 | import time
12 | 
13 | 
14 | def plot_gain(gain_his):
15 |     import matplotlib.pyplot as plt
16 |     import pandas as pd
17 |     import matplotlib as mpl
18 | 
19 |     gain_array = np.asarray(gain_his)
20 |     df = pd.DataFrame(gain_his)
21 | 
22 | 
23 |     mpl.style.use('seaborn')
24 |     fig, ax = plt.subplots(figsize=(15,8))
25 |     rolling_intv = 20
26 | 
27 |     plt.plot(np.arange(len(gain_array))+1, df.rolling(rolling_intv, min_periods=1).mean(), 'b')
28 |     plt.fill_between(np.arange(len(gain_array))+1, df.rolling(rolling_intv, min_periods=1).min()[0], df.rolling(rolling_intv, min_periods=1).max()[0], color = 'b', alpha = 0.2)
29 |     plt.ylabel('Gain ratio')
30 |     plt.xlabel('learning steps')
31 |     plt.show()
32 | 
33 | def bisection(h, M, weights=[]):
34 |     # the bisection algorithm proposed by Suzhi BI
35 |     # average time to find the optimal: 0.012535839796066284 s
36 | 
37 |     # parameters and equations
38 |     o=100
39 |     p=3
40 |     u=0.7
41 |     eta1=((u*p)**(1.0/3))/o
42 |     ki=10**-26
43 |     eta2=u*p/10**-10
44 |     B=2*10**6
45 |     Vu=1.1
46 |     epsilon=B/(Vu*np.log(2))
47 |     x = []  # a = x[0], and tau_j = x[1:]
48 | 
49 |     M0=np.where(M==0)[0]
50 |     M1=np.where(M==1)[0]
51 | 
52 |     hi=np.array([h[i] for i in M0])
53 |     hj=np.array([h[i] for i in M1])
54 | 
55 | 
56 |     if len(weights) == 0:
57 |         # default weights [1, 1.5, 1, 1.5, 1, 1.5, ...]
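        # For example, with len(M) == 4 the line below yields weights = [1, 1.5, 1, 1.5],
        # i.e., odd-indexed wireless devices get weight 1.5 and even-indexed ones weight 1.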
58 | weights = [1.5 if i%2==1 else 1 for i in range(len(M))] 59 | 60 | wi=np.array([weights[M0[i]] for i in range(len(M0))]) 61 | wj=np.array([weights[M1[i]] for i in range(len(M1))]) 62 | 63 | 64 | def sum_rate(x): 65 | sum1=sum(wi*eta1*(hi/ki)**(1.0/3)*x[0]**(1.0/3)) 66 | sum2=0 67 | for i in range(len(M1)): 68 | sum2+=wj[i]*epsilon*x[i+1]*np.log(1+eta2*hj[i]**2*x[0]/x[i+1]) 69 | return sum1+sum2 70 | 71 | def phi(v, j): 72 | return 1/(-1-1/(lambertw(-1/(np.exp( 1 + v/wj[j]/epsilon))).real)) 73 | 74 | def p1(v): 75 | p1 = 0 76 | for j in range(len(M1)): 77 | p1 += hj[j]**2 * phi(v, j) 78 | 79 | return 1/(1 + p1 * eta2) 80 | 81 | def Q(v): 82 | sum1 = sum(wi*eta1*(hi/ki)**(1.0/3))*p1(v)**(-2/3)/3 83 | sum2 = 0 84 | for j in range(len(M1)): 85 | sum2 += wj[j]*hj[j]**2/(1 + 1/phi(v,j)) 86 | return sum1 + sum2*epsilon*eta2 - v 87 | 88 | def tau(v, j): 89 | return eta2*hj[j]**2*p1(v)*phi(v,j) 90 | 91 | # bisection starts here 92 | delta = 0.005 93 | UB = 999999999 94 | LB = 0 95 | while UB - LB > delta: 96 | v = (float(UB) + LB)/2 97 | if Q(v) > 0: 98 | LB = v 99 | else: 100 | UB = v 101 | 102 | x.append(p1(v)) 103 | for j in range(len(M1)): 104 | x.append(tau(v, j)) 105 | 106 | return sum_rate(x), x[0], x[1:] 107 | 108 | 109 | 110 | def cd_method(h): 111 | N = len(h) 112 | M0 = np.random.randint(2,size = N) 113 | gain0,a,Tj= bisection(h,M0) 114 | g_list = [] 115 | M_list = [] 116 | while True: 117 | for j in range(0,N): 118 | M = np.copy(M0) 119 | M[j] = (M[j]+1)%2 120 | gain,a,Tj= bisection(h,M) 121 | g_list.append(gain) 122 | M_list.append(M) 123 | g_max = max(g_list) 124 | if g_max > gain0: 125 | gain0 = g_max 126 | M0 = M_list[g_list.index(g_max)] 127 | else: 128 | break 129 | return gain0, M0 130 | 131 | 132 | if __name__ == "__main__": 133 | 134 | h=np.array([6.06020304235508*10**-6,1.10331933767028*10**-5,1.00213540309998*10**-7,1.21610610942759*10**-6,1.96138838395145*10**-6,1.71456339592966*10**-6,5.24563569673585*10**-6,5.89530717142197*10**-7,4.07769429231962*10**-6,2.88333185798682*10**-6]) 135 | M=np.array([1,0,0,0,1,0,0,0,0,0]) 136 | # h=np.array([1.00213540309998*10**-7,1.10331933767028*10**-5,6.06020304235508*10**-6,1.21610610942759*10**-6,1.96138838395145*10**-6,1.71456339592966*10**-6,5.24563569673585*10**-6,5.89530717142197*10**-7,4.07769429231962*10**-6,2.88333185798682*10**-6]) 137 | # M=np.array([0,0,1,0,1,0,0,0,0,0]) 138 | 139 | 140 | # h = np.array([4.6368924987170947*10**-7, 1.3479411763648968*10**-7, 7.174945246007612*10**-6, 2.5590719803595445*10**-7, 3.3189928740379023*10**-6, 1.2109071327755575*10**-5, 2.394278475886022*10**-6, 2.179121774067472*10**-6, 5.5213902658478367*10**-8, 2.168778154948169*10**-7, 2.053227965874453*10**-6, 7.002952297466865*10**-8, 7.594077851181444*10**-8, 7.904048961975136*10**-7, 8.867218892023474*10**-7, 5.886007653360979*10**-6, 2.3470565740563855*10**-6, 1.387049627074303*10**-7, 3.359475870531776*10**-7, 2.633733784949562*10**-7, 2.189895264149453*10**-6, 1.129177795302099*10**-5, 1.1760290137191366*10**-6, 1.6588656719735275*10**-7, 1.383637788476638*10**-6, 1.4485928387351664*10**-6, 1.4262265958416598*10**-6, 1.1779725004265418*10**-6, 7.738218993031842*10**-7, 4.763534225174186*10**-6]) 141 | # M =np.array( [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,]) 142 | 143 | # time the average speed of bisection algorithm 144 | # repeat = 1 145 | # M =np.random.randint(2, size=(repeat,len(h))) 146 | # start_time=time.time() 147 | # for i in range(repeat): 148 | # gain,a,Tj= 
bisection(h,M[i,:]) 149 | # total_time=time.time()-start_time 150 | # print('time_cost:%s'%(total_time/repeat)) 151 | 152 | gain,a,Tj= bisection(h,M) 153 | print('y:%s'%gain) 154 | print('a:%s'%a) 155 | print('Tj:%s'%Tj) 156 | 157 | # test CD method. Given h, generate the max mode 158 | gain0, M0 = cd_method(h) 159 | print('max y:%s'%gain0) 160 | print(M0) 161 | 162 | # test all data 163 | K = [10, 20, 30] # number of users 164 | N = 1000 # number of channel 165 | 166 | 167 | for k in K: 168 | # Load data 169 | channel = sio.loadmat('./data/data_%d' %int(k))['input_h'] 170 | gain = sio.loadmat('./data/data_%d' %int(k))['output_obj'] 171 | 172 | start_time=time.time() 173 | gain_his = [] 174 | gain_his_ratio = [] 175 | mode_his = [] 176 | for i in range(N): 177 | if i % (N//10) == 0: 178 | print("%0.1f"%(i/N)) 179 | 180 | i_idx = i 181 | 182 | h = channel[i_idx,:] 183 | 184 | # the CD method 185 | gain0, M0 = cd_method(h) 186 | 187 | 188 | # memorize the largest reward 189 | gain_his.append(gain0) 190 | gain_his_ratio.append(gain_his[-1] / gain[i_idx][0]) 191 | 192 | mode_his.append(M0) 193 | 194 | 195 | total_time=time.time()-start_time 196 | print('time_cost:%s'%total_time) 197 | print('average time per channel:%s'%(total_time/N)) 198 | 199 | 200 | plot_gain(gain_his_ratio) 201 | 202 | 203 | print("gain/max ratio: ", sum(gain_his_ratio)/N) 204 | 205 | 206 | 207 | 208 | 209 | 210 | 211 | 212 | 213 | 214 | 215 | 216 | 217 | 218 | 219 | 220 | 221 | 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | 230 | 231 | 232 | 233 | 234 | 235 | 236 | 237 | --------------------------------------------------------------------------------