├── .gitattributes
├── LICENSE
├── README.md
├── Untitled0.ipynb
├── data
│   ├── README.md
│   ├── data_10.mat
│   ├── data_10_WeightsAlternated.mat
│   ├── data_20.mat
│   ├── data_30.mat
│   ├── data_5.mat
│   ├── data_6.mat
│   ├── data_7.mat
│   ├── data_8.mat
│   └── data_9.mat
├── demo_alternate_weights.py
├── demo_on_off.py
├── gain_his_ratio.txt
├── main.ipynb
├── main.py
├── mainPyTorch.py
├── mainTF2
├── memory.ipynb
├── memory.py
├── memoryPyTorch.py
├── memoryTF2.py
└── optimization.py

--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2018 REVENOL

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# DROO

*Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks*

Python code to reproduce our DROO algorithm for wireless-powered mobile-edge computing [1], which takes the time-varying wireless channel gains as input and generates binary offloading decisions. It includes:

- [memory.py](memory.py): the DNN structure for the WPMEC, including the training and testing structures, implemented based on [Tensorflow 1.x](https://www.tensorflow.org/install/pip).
- [memoryTF2.py](memoryTF2.py): Implemented based on [Tensorflow 2](https://www.tensorflow.org/install).
- [memoryPyTorch.py](memoryPyTorch.py): Implemented based on [PyTorch](https://pytorch.org/get-started/locally/).
- [optimization.py](optimization.py): solves the resource allocation problem.

- [data](./data): all data are stored in this subdirectory, including:

  - **data_#.mat**: training and testing data sets, where # = {5, 6, 7, 8, 9, 10, 20, 30} is the user number

- [main.py](main.py): run this file for DROO, including setting system parameters; implemented based on [Tensorflow 1.x](https://www.tensorflow.org/install/pip).
- [mainTF2.py](mainTF2.py): Implemented based on [Tensorflow 2](https://www.tensorflow.org/install). Run this file for DROO if you code with Tensorflow 2.
- [mainPyTorch.py](mainPyTorch.py): Implemented based on [PyTorch](https://pytorch.org/get-started/locally/). Run this file for DROO if you code with PyTorch.

- [demo_alternate_weights.py](demo_alternate_weights.py): run this file to evaluate the performance of DROO when WDs' weights are alternated

- [demo_on_off.py](demo_on_off.py): run this file to evaluate the performance of DROO when some WDs are randomly turned on/off


## Cite this work

1. L. Huang, S. Bi, and Y. J. Zhang, "[Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks](https://ieeexplore.ieee.org/document/8771176)," IEEE Trans. Mobile Comput., vol. 19, no. 11, pp. 2581-2593, November 2020.

```
@ARTICLE{huang2020DROO,
  author={Huang, Liang and Bi, Suzhi and Zhang, Ying-Jun Angela},
  journal={IEEE Transactions on Mobile Computing},
  title={Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks},
  year={2020},
  month={November},
  volume={19},
  number={11},
  pages={2581-2593},
  doi={10.1109/TMC.2019.2928811}
}
```

## About authors

- [Liang HUANG](https://scholar.google.com/citations?user=NifLoZ4AAAAJ), lianghuang AT zjut.edu.cn

- [Suzhi BI](https://scholar.google.com/citations?user=uibqC-0AAAAJ), bsz AT szu.edu.cn

- [Ying Jun (Angela) Zhang](https://scholar.google.com/citations?user=iOb3wocAAAAJ), yjzhang AT ie.cuhk.edu.hk

## Required packages

- Tensorflow

- numpy

- scipy

## How the code works

- For the DROO algorithm, run the file [main.py](main.py). If you code with Tensorflow 2 or PyTorch, run [mainTF2.py](mainTF2.py) or [mainPyTorch.py](mainPyTorch.py), respectively. The original DROO algorithm is coded based on [Tensorflow 1.x](https://www.tensorflow.org/install/pip). If you are new to deep learning, please start with [Tensorflow 2](https://www.tensorflow.org/install) or [PyTorch](https://pytorch.org/get-started/locally/), whose code is much cleaner and easier to follow.

- For more DROO demos:
  - Alternating-weight WDs: run the file [demo_alternate_weights.py](demo_alternate_weights.py)
  - ON-OFF WDs: run the file [demo_on_off.py](demo_on_off.py)
  - Remember to edit the *import MemoryDNN* code from
```
from memory import MemoryDNN
```
to
```
from memoryTF2 import MemoryDNN
```
or
```
from memoryPyTorch import MemoryDNN
```
if you are using Tensorflow 2 or PyTorch, respectively.

### DROO is illustrated here for single-slot optimization. If you intend to apply DROO to multiple-slot continuous control problems, please refer to our [LyDROO](https://github.com/revenol/LyDROO) project.
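## DROO in one time frame

For orientation, below is a minimal sketch of the per-time-frame logic that [main.py](main.py), [mainTF2.py](mainTF2.py), and [mainPyTorch.py](mainPyTorch.py) all share. It is a simplified outline, not a replacement for those files; `mem`, `channel`, `K`, and `bisection` are the objects defined in this repository.

```python
import numpy as np

# One DROO time frame (simplified sketch):
h = channel[i, :]                 # observe the scaled channel gains
m_list = mem.decode(h, K, 'OP')   # DNN generates K candidate binary offloading actions
r_list = [bisection(h / 1000000, m)[0] for m in m_list]  # evaluate each action by solving the resource allocation problem
best = np.argmax(r_list)          # keep the action with the largest computation rate
mem.encode(h, m_list[best])       # store (h, best action); the DNN is periodically retrained from this memory
```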

--------------------------------------------------------------------------------
/Untitled0.ipynb:
--------------------------------------------------------------------------------
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyOnESJFQS185LAOEe1zq+1H",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU",
    "gpuClass": "standard"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "\"Open"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GcZqNc_bGfkz"
      },
      "outputs": [],
      "source": [
        "from google.colab import files"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "!cp content/drive/MyDrive/Importing\\ Scripts \\as\\ Modules/DROO-master/main.py /content"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "B_yN8BR3G7bN",
        "outputId": "bc83f6e3-f21c-4fcc-b7b3-6e92d2b68de9"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "cp: cannot stat 'content/drive/MyDrive/Importing Scripts': No such file or directory\n",
            "cp: cannot stat 'as Modules/DROO-master/main.py': No such file or directory\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# New Section"
      ],
      "metadata": {
        "id": "MkqI9XbFHDmT"
      }
    }
  ]
}

--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
# DROO

*Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks*

This folder includes all pre-generated training and testing data sets:

- **data_#.mat**: training and testing data, where # = {5, 6, 7, 8, 9, 10, 20, 30} is the number of WDs

- [data_10_WeightsAlternated.mat](data_10_WeightsAlternated.mat): the data set when all WDs' weights are alternated. It contains the same values of 'input_h' as the ones stored in [data_10.mat](data_10.mat). However, the optimal offloading mode, resource allocation, and the maximum computation rate are recalculated since the WDs' weights are alternated.


## Data Format

Data samples are generated by enumerating all 2^N binary offloading actions for N <= 10 and by following the CD method presented in [2] for N = 20, 30. There are 30,000 (for N = 10, 20, 30) or 10,000 (otherwise) samples saved in each \*.mat file, where each data sample includes:

| variable | description |
|------------------------:|:-----------------------|
| input_h | The wireless channel gain between WDs and the AP $\mathbf{h}$ |
| output_mode | The optimal binary offloading action $\mathbf{x}^*$ |
| output_a | The optimal fraction of time that the AP broadcasts RF energy for the WDs to harvest $a^*$ |
| output_tau | The optimal fraction of time allocated to WDs for task offloading $\mathbf{\tau}^*$ |
| output_obj | The optimal weighted sum computation rate $Q^*$ |



## About our works

1. Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Computation Offloading in Wireless Powered Mobile-Edge Computing Networks," available on [arxiv:1808.01977](https://arxiv.org/abs/1808.01977).
2. S. Bi and Y. J. Zhang, "Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading," *IEEE Trans. Wireless Commun.*, vol. 17, no. 6, pp. 4177-4190, Jun. 2018.

## About authors

- Liang HUANG, lianghuang AT zjut.edu.cn

- Suzhi BI, bsz AT szu.edu.cn

- Ying Jun (Angela) Zhang, yjzhang AT ie.cuhk.edu.hk

--------------------------------------------------------------------------------
/data/data_10.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_10.mat

--------------------------------------------------------------------------------
/data/data_10_WeightsAlternated.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_10_WeightsAlternated.mat

--------------------------------------------------------------------------------
/data/data_20.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_20.mat

--------------------------------------------------------------------------------
/data/data_30.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_30.mat

--------------------------------------------------------------------------------
/data/data_5.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_5.mat

--------------------------------------------------------------------------------
/data/data_6.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_6.mat

--------------------------------------------------------------------------------
/data/data_7.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_7.mat

--------------------------------------------------------------------------------
/data/data_8.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_8.mat

--------------------------------------------------------------------------------
/data/data_9.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mfarooq33/Task-Offloading-and-Fog-Computing/9a551607a525f33a062de5a2ff3c6a2351240d98/data/data_9.mat
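A quick way to inspect one of the data sets above (a sketch; it assumes scipy is installed and is run from the repository root — the field names follow the data README, and the shapes are those it implies for N = 10):

```python
import scipy.io as sio

# Load the N = 10 data set and print the shape of each documented field.
data = sio.loadmat('./data/data_10.mat')
for key in ['input_h', 'output_mode', 'output_a', 'output_tau', 'output_obj']:
    print(key, data[key].shape)
# Expected: input_h, output_mode, and output_tau have one column per WD (10),
# while output_a and output_obj have a single column, with 30,000 rows each.
```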
--------------------------------------------------------------------------------
/demo_alternate_weights.py:
--------------------------------------------------------------------------------
# #################################################################
# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
#
# This file contains a demo evaluating the performance of DROO with alternating-weight WDs. It loads the training samples with the default WDs' weights from ./data/data_10.mat and with alternated weights from ./data/data_10_WeightsAlternated.mat. The channel gains in both files are the same. However, the optimal offloading mode, resource allocation, and the maximum computation rate in 'data_10_WeightsAlternated.mat' are recalculated since the WDs' weights are alternated.
#
# References:
# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," on arxiv:1808.01977
#
# version 1.0 -- April 2019. Written by Liang Huang (lianghuang AT zjut.edu.cn)
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

from memory import MemoryDNN
from optimization import bisection
from main import plot_rate, save_to_txt

import time


def alternate_weights(case_id=0):
    '''
    Alternate the weights of all WDs. Note that the maximum computation rate needs to be recomputed by solving (P2) once any WD's weight is changed.
    Input: case_id = 0 for default weights; case_id = 1 for alternated weights.
    Output: the alternated weights and the corresponding rate.
    '''
    # set alternated weights
    weights = [[1, 1.5, 1, 1.5, 1, 1.5, 1, 1.5, 1, 1.5], [1.5, 1, 1.5, 1, 1.5, 1, 1.5, 1, 1.5, 1]]

    # load the corresponding maximum computation rate
    if case_id == 0:
        # by default, case_id = 0
        rate = sio.loadmat('./data/data_10')['output_obj']
    else:
        # alternate weights for all WDs, case_id = 1
        rate = sio.loadmat('./data/data_10_WeightsAlternated')['output_obj']
    return weights[case_id], rate

if __name__ == "__main__":
    '''
    This demo evaluates DROO with alternating-weight WDs. We evaluate an extreme case by alternating the weights of all WDs between 1 and 1.5 at the same time, specifically, at time frames 6,000 and 8,000.
    '''

    N = 10                       # number of users
    n = 10000                    # number of time frames, <= 10,000
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj']

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000

    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    h = channel[0,:]

    # initialize the weights by setting case_id = 0
    weight, rate = alternate_weights(0)
    print("WD weights at time frame %d:"%(0), weight)


    for i in range(n):
        # alternate the weights of all WDs at time frames 0.6*n and 0.8*n
        if i == 0.6*n:
            weight, rate = alternate_weights(1)
            print("WD weights at time frame %d:"%(i), weight)
        if i == 0.8*n:
            weight, rate = alternate_weights(0)
            print("WD weights at time frame %d:"%(i), weight)


        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)


        i_idx = i
        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            # only active users are used to compute the rate
            r_list.append(bisection(h/1000000, m, weight)[0])

        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        # save the mode with largest reward
        mode_his.append(m_list[np.argmax(r_list)])
        # if i < 0.6*n:
        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")



--------------------------------------------------------------------------------
/demo_on_off.py:
--------------------------------------------------------------------------------
# #################################################################
# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
#
# This file contains a demo evaluating the performance of DROO by randomly turning some WDs on/off. It loads the training samples from ./data/data_#.mat, where # denotes the number of active WDs in the MEC network. Note that the maximum computation rate needs to be recomputed by solving (P2) once a WD is turned off/on.
#
# References:
# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," submitted to IEEE Journal on Selected Areas in Communications.
#
# version 1.0 -- April 2019. Written by Liang Huang (lianghuang AT zjut.edu.cn)
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

from memory import MemoryDNN
from optimization import bisection
from main import plot_rate, save_to_txt

import time


def WD_off(channel, N_active, N):
    # turn off one WD
    if N_active > 5:  # currently we support turning off at most half of the WDs
        N_active = N_active - 1
        # set the (N_active-1)th channel to close to 0
        # since all channels in each time frame are randomly generated, we turn off the WD with the greatest index
        channel[:,N_active] = channel[:, N_active] / 1000000  # a programming trick, so that we can recover its channel gain once the WD is turned on again
        print(" The %dth WD is turned off."%(N_active + 1))

    # update the expected maximum computation rate
    rate = sio.loadmat('./data/data_%d' %N_active)['output_obj']
    return channel, rate, N_active

def WD_on(channel, N_active, N):
    # turn on one WD
    if N_active < N:
        N_active = N_active + 1
        # recover the (N_active-1)th channel
        channel[:,N_active-1] = channel[:, N_active-1] * 1000000
        print(" The %dth WD is turned on."%(N_active))

    # update the expected maximum computation rate
    rate = sio.loadmat('./data/data_%d' %N_active)['output_obj']
    return channel, rate, N_active




if __name__ == "__main__":
    '''
    This demo evaluates DROO for MEC networks where WDs can be occasionally turned off/on. After DROO converges, we turn off one WD at each of the time frames 6,000, 6,500, 7,000, and 7,500, and then turn WDs back on at time frames 8,000, 8,500, and 9,000 (two at 9,000). At time frame 9,500, we turn off two WDs, resulting in an MEC network with 8 active WDs.
    '''

    N = 10                       # number of users
    N_active = N                 # number of active users
    N_off = 0                    # number of off-users
    n = 10000                    # number of time frames, <= 10,000
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj']

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000
    channel_bak = channel.copy()
    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    h = channel[0,:]


    for i in range(n):
        # for a dynamic number of WDs
        if i == 0.6*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
        if i == 0.65*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
        if i == 0.7*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
        if i == 0.75*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
        if i == 0.8*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_on(channel, N_active, N)
        if i == 0.85*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_on(channel, N_active, N)
        if i == 0.9*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_on(channel, N_active, N)
            channel, rate, N_active = WD_on(channel, N_active, N)
        if i == 0.95*n:
            print("At time frame %d:"%(i))
            channel, rate, N_active = WD_off(channel, N_active, N)
            channel, rate, N_active = WD_off(channel, N_active, N)

        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)

        i_idx = i
        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            # only active users are used to compute the rate
            r_list.append(bisection(h[0:N_active]/1000000, m[0:N_active])[0])

        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        # save the mode with largest reward
        mode_his.append(m_list[np.argmax(r_list)])
        # if i < 0.6*n:
        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")



--------------------------------------------------------------------------------
/main.ipynb:
--------------------------------------------------------------------------------
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyPGw6GfH7V4bE9B/kfbRbog",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU",
    "gpuClass": "standard"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "\"Open"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GcZqNc_bGfkz"
      },
      "outputs": [],
      "source": [
        "from google.colab import files"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "!cp content/drive/MyDrive/Importing\\ Scripts \\as\\ Modules/DROO-master/main.py /content"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "B_yN8BR3G7bN",
        "outputId": "bc83f6e3-f21c-4fcc-b7b3-6e92d2b68de9"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "cp: cannot stat 'content/drive/MyDrive/Importing Scripts': No such file or directory\n",
            "cp: cannot stat 'as Modules/DROO-master/main.py': No such file or directory\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# #################################################################\n",
        "# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks\n",
        "#\n",
        "# This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with the training and validation samples, and finally tests the DNN with the test data.\n",
        "#\n",
        "# Input: ./data/data_#.mat\n",
        "#   Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes\n",
        "# -----------------------------------------------------------------\n",
        "# | wireless channel gain | input_h |\n",
        "# -----------------------------------------------------------------\n",
        "# | computing mode selection | output_mode |\n",
        "# -----------------------------------------------------------------\n",
        "# | energy broadcasting parameter | output_a |\n",
        "# -----------------------------------------------------------------\n",
        "# | transmit time of wireless device | output_tau |\n",
        "# -----------------------------------------------------------------\n",
        "# | weighted sum computation rate | output_obj |\n",
        "# -----------------------------------------------------------------\n",
        "#\n",
        "#\n",
        "# References:\n",
        "# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, \"Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks,\" in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.\n",
        "# [2] S. Bi and Y. J. Zhang, \"Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading,\" IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.\n",
        "#\n",
        "# version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)\n",
        "# #################################################################\n",
        "\n",
        "\n",
        "import scipy.io as sio                # import scipy.io for .mat file I/O\n",
        "import numpy as np                    # import numpy\n",
        "\n",
        "from memory import MemoryDNN\n",
        "from optimization import bisection\n",
        "\n",
        "import time\n",
        "\n",
        "\n",
        "def plot_rate(rate_his, rolling_intv=50):\n",
        "    import matplotlib.pyplot as plt\n",
        "    import pandas as pd\n",
        "    import matplotlib as mpl\n",
        "\n",
        "    rate_array = np.asarray(rate_his)\n",
        "    df = pd.DataFrame(rate_his)\n",
        "\n",
        "\n",
        "    mpl.style.use('seaborn')\n",
        "    fig, ax = plt.subplots(figsize=(15,8))\n",
        "    # rolling_intv = 20\n",
        "\n",
        "    plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')\n",
        "    plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)\n",
        "    plt.ylabel('Normalized Computation Rate')\n",
        "    plt.xlabel('Time Frames')\n",
        "    plt.show()\n",
        "\n",
        "def save_to_txt(rate_his, file_path):\n",
        "    with open(file_path, 'w') as f:\n",
        "        for rate in rate_his:\n",
        "            f.write(\"%s \\n\" % rate)\n",
        "\n",
        "if __name__ == \"__main__\":\n",
        "    '''\n",
        "    This algorithm generates K candidate offloading modes from the DNN and\n",
        "    chooses the one with the largest reward. The mode with the largest reward\n",
        "    is stored in the memory, which is further used to train the DNN.\n",
        "    Adaptive K is implemented: every Delta frames,\n",
        "    K = min(max(k_idx_his[-Delta:-1]) + 2, N).\n",
        "    '''\n",
        "\n",
        "    N = 10                       # number of users\n",
        "    n = 30000                    # number of time frames\n",
        "    K = N                        # initialize K = N\n",
        "    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'\n",
        "    Memory = 1024                # capacity of memory structure\n",
        "    Delta = 32                   # update interval for adaptive K\n",
        "\n",
        "    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))\n",
        "    # Load data\n",
        "    channel = sio.loadmat('./data/data_%d' %N)['input_h']\n",
        "    rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.\n",
        "\n",
        "    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning\n",
        "    channel = channel * 1000000\n",
        "\n",
        "    # generate the train and test data sample index\n",
        "    # data are split 80:20\n",
        "    # training data are randomly sampled with duplication if n > total data size\n",
        "\n",
        "    split_idx = int(.8 * len(channel))\n",
        "    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size\n",
        "\n",
        "\n",
        "    mem = MemoryDNN(net = [N, 120, 80, N],\n",
        "                    learning_rate = 0.01,\n",
        "                    training_interval=10,\n",
        "                    batch_size=128,\n",
        "                    memory_size=Memory\n",
        "                    )\n",
        "\n",
        "    start_time = time.time()\n",
        "\n",
        "    rate_his = []\n",
        "    rate_his_ratio = []\n",
        "    mode_his = []\n",
        "    k_idx_his = []\n",
        "    K_his = []\n",
        "    for i in range(n):\n",
        "        if i % (n//10) == 0:\n",
        "            print(\"%0.1f\"%(i/n))\n",
        "        if i > 0 and i % Delta == 0:\n",
        "            # index counts from 0\n",
        "            if Delta > 1:\n",
        "                max_k = max(k_idx_his[-Delta:-1]) + 1\n",
        "            else:\n",
        "                max_k = k_idx_his[-1] + 1\n",
        "            K = min(max_k + 1, N)\n",
        "\n",
        "        if i < n - num_test:\n",
        "            # training\n",
        "            i_idx = i % split_idx\n",
        "        else:\n",
        "            # test\n",
        "            i_idx = i - n + num_test + split_idx\n",
        "\n",
        "        h = channel[i_idx,:]\n",
        "\n",
        "        # the action selection must be either 'OP' or 'KNN'\n",
        "        m_list = mem.decode(h, K, decoder_mode)\n",
        "\n",
        "        r_list = []\n",
        "        for m in m_list:\n",
        "            r_list.append(bisection(h/1000000, m)[0])\n",
        "\n",
        "        # encode the mode with largest reward\n",
        "        mem.encode(h, m_list[np.argmax(r_list)])\n",
        "        # the main code for DROO training ends here\n",
        "\n",
        "\n",
        "\n",
        "\n",
        "        # the following codes store some interesting metrics for illustration\n",
        "        # memorize the largest reward\n",
        "        rate_his.append(np.max(r_list))\n",
        "        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])\n",
        "        # record the index of largest reward\n",
        "        k_idx_his.append(np.argmax(r_list))\n",
        "        # record K in case of adaptive K\n",
        "        K_his.append(K)\n",
        "        mode_his.append(m_list[np.argmax(r_list)])\n",
        "\n",
        "\n",
        "    total_time = time.time() - start_time\n",
        "    mem.plot_cost()\n",
        "    plot_rate(rate_his_ratio)\n",
        "\n",
        "    print(\"Averaged normalized computation rate:\", sum(rate_his_ratio[-num_test:])/num_test)\n",
        "    print('Total time consumed:%s'%total_time)\n",
        "    print('Average time per channel:%s'%(total_time/n))\n",
"\n", 228 | " # save data into txt\n", 229 | " save_to_txt(k_idx_his, \"k_idx_his.txt\")\n", 230 | " save_to_txt(K_his, \"K_his.txt\")\n", 231 | " save_to_txt(mem.cost_his, \"cost_his.txt\")\n", 232 | " save_to_txt(rate_his_ratio, \"rate_his_ratio.txt\")\n", 233 | " save_to_txt(mode_his, \"mode_his.txt\")" 234 | ], 235 | "metadata": { 236 | "id": "eyzq8nm0PnRy" 237 | }, 238 | "execution_count": null, 239 | "outputs": [] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "source": [ 244 | "# New Section" 245 | ], 246 | "metadata": { 247 | "id": "MkqI9XbFHDmT" 248 | } 249 | } 250 | ] 251 | } -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | # ################################################################# 2 | # Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks 3 | # 4 | # This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitutes 80% and 20%), trains the DNN with training and validation samples, and finally tests the DNN with test data. 5 | # 6 | # Input: ./data/data_#.mat 7 | # Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes 8 | # ----------------------------------------------------------------- 9 | # | wireless channel gain | input_h | 10 | # ----------------------------------------------------------------- 11 | # | computing mode selection | output_mode | 12 | # ----------------------------------------------------------------- 13 | # | energy broadcasting parameter | output_a | 14 | # ----------------------------------------------------------------- 15 | # | transmit time of wireless device | output_tau | 16 | # ----------------------------------------------------------------- 17 | # | weighted sum computation rate | output_obj | 18 | # ----------------------------------------------------------------- 19 | # 20 | # 21 | # References: 22 | # [1] 1. Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811. 23 | # [2] S. Bi and Y. J. Zhang, “Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading,” IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018. 24 | # 25 | # version 1.0 -- July 2018. 
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

from memory import MemoryDNN
from optimization import bisection

import time


def plot_rate(rate_his, rolling_intv=50):
    import matplotlib.pyplot as plt
    import pandas as pd
    import matplotlib as mpl

    rate_array = np.asarray(rate_his)
    df = pd.DataFrame(rate_his)


    mpl.style.use('seaborn')
    fig, ax = plt.subplots(figsize=(15,8))
    # rolling_intv = 20

    plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')
    plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)
    plt.ylabel('Normalized Computation Rate')
    plt.xlabel('Time Frames')
    plt.show()

def save_to_txt(rate_his, file_path):
    with open(file_path, 'w') as f:
        for rate in rate_his:
            f.write("%s \n" % rate)

if __name__ == "__main__":
    '''
    This algorithm generates K candidate offloading modes from the DNN and
    chooses the one with the largest reward. The mode with the largest reward
    is stored in the memory, which is further used to train the DNN.
    Adaptive K is implemented: every Delta frames,
    K = min(max(k_idx_his[-Delta:-1]) + 2, N).
    '''

    N = 10                       # number of users
    n = 30000                    # number of time frames
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.
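
    # Note on shapes (as implied by the data description above): 'channel' is a
    # (30000, N) array whose i-th row is the channel-gain vector of time frame i,
    # and 'rate' is a (30000, 1) array with the corresponding optimal weighted sum
    # computation rate, used below only to normalize the plotted results.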

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000

    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    for i in range(n):
        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)

        if i < n - num_test:
            # training
            i_idx = i % split_idx
        else:
            # test
            i_idx = i - n + num_test + split_idx

        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            r_list.append(bisection(h/1000000, m)[0])

        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])
        # the main code for DROO training ends here




        # the following codes store some interesting metrics for illustration
        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        mode_his.append(m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")

--------------------------------------------------------------------------------
/mainPyTorch.py:
--------------------------------------------------------------------------------
# #################################################################
# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
#
# This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with the training and validation samples, and finally tests the DNN with the test data.
#
# Input: ./data/data_#.mat
#   Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes
# -----------------------------------------------------------------
# | wireless channel gain | input_h |
# -----------------------------------------------------------------
# | computing mode selection | output_mode |
# -----------------------------------------------------------------
# | energy broadcasting parameter | output_a |
# -----------------------------------------------------------------
# | transmit time of wireless device | output_tau |
# -----------------------------------------------------------------
# | weighted sum computation rate | output_obj |
# -----------------------------------------------------------------
#
#
# References:
# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.
# [2] S. Bi and Y. J. Zhang, "Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading," IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.
#
# version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

# Implemented based on PyTorch
from memoryPyTorch import MemoryDNN
from optimization import bisection

import time


def plot_rate(rate_his, rolling_intv=50):
    import matplotlib.pyplot as plt
    import pandas as pd
    import matplotlib as mpl

    rate_array = np.asarray(rate_his)
    df = pd.DataFrame(rate_his)


    mpl.style.use('seaborn')
    fig, ax = plt.subplots(figsize=(15, 8))
    # rolling_intv = 20

    plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')
    plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)
    plt.ylabel('Normalized Computation Rate')
    plt.xlabel('Time Frames')
    plt.show()

def save_to_txt(rate_his, file_path):
    with open(file_path, 'w') as f:
        for rate in rate_his:
            f.write("%s \n" % rate)

if __name__ == "__main__":
    '''
    This algorithm generates K candidate offloading modes from the DNN and
    chooses the one with the largest reward. The mode with the largest reward
    is stored in the memory, which is further used to train the DNN.
    Adaptive K is implemented: every Delta frames,
    K = min(max(k_idx_his[-Delta:-1]) + 2, N).
    '''

    N = 10                       # number of users
    n = 30000                    # number of time frames
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000

    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    for i in range(n):
        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)

        if i < n - num_test:
            # training
            i_idx = i % split_idx
        else:
            # test
            i_idx = i - n + num_test + split_idx

        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            r_list.append(bisection(h/1000000, m)[0])

        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])
        # the main code for DROO training ends here




        # the following codes store some interesting metrics for illustration
        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        mode_his.append(m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")

--------------------------------------------------------------------------------
/mainTF2:
--------------------------------------------------------------------------------
# #################################################################
# Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks
#
# This file contains the main code of DROO. It loads the training samples saved in ./data/data_#.mat, splits the samples into two parts (training and testing data constitute 80% and 20%, respectively), trains the DNN with the training and validation samples, and finally tests the DNN with the test data.
#
# Input: ./data/data_#.mat
#   Data samples are generated according to the CD method presented in [2]. There are 30,000 samples saved in each ./data/data_#.mat, where # is the user number. Each data sample includes
# -----------------------------------------------------------------
# | wireless channel gain | input_h |
# -----------------------------------------------------------------
# | computing mode selection | output_mode |
# -----------------------------------------------------------------
# | energy broadcasting parameter | output_a |
# -----------------------------------------------------------------
# | transmit time of wireless device | output_tau |
# -----------------------------------------------------------------
# | weighted sum computation rate | output_obj |
# -----------------------------------------------------------------
#
#
# References:
# [1] Liang Huang, Suzhi Bi, and Ying-Jun Angela Zhang, "Deep Reinforcement Learning for Online Offloading in Wireless Powered Mobile-Edge Computing Networks," in IEEE Transactions on Mobile Computing, early access, 2019, DOI:10.1109/TMC.2019.2928811.
# [2] S. Bi and Y. J. Zhang, "Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading," IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4177-4190, Jun. 2018.
#
# version 1.0 -- July 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
# #################################################################


import scipy.io as sio                # import scipy.io for .mat file I/O
import numpy as np                    # import numpy

# for tensorflow2
from memoryTF2 import MemoryDNN
from optimization import bisection

import time


def plot_rate(rate_his, rolling_intv=50):
    import matplotlib.pyplot as plt
    import pandas as pd
    import matplotlib as mpl

    rate_array = np.asarray(rate_his)
    df = pd.DataFrame(rate_his)


    mpl.style.use('seaborn')
    fig, ax = plt.subplots(figsize=(15,8))
    # rolling_intv = 20

    plt.plot(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).mean().values), 'b')
    plt.fill_between(np.arange(len(rate_array))+1, np.hstack(df.rolling(rolling_intv, min_periods=1).min()[0].values), np.hstack(df.rolling(rolling_intv, min_periods=1).max()[0].values), color = 'b', alpha = 0.2)
    plt.ylabel('Normalized Computation Rate')
    plt.xlabel('Time Frames')
    plt.show()

def save_to_txt(rate_his, file_path):
    with open(file_path, 'w') as f:
        for rate in rate_his:
            f.write("%s \n" % rate)

if __name__ == "__main__":
    '''
    This algorithm generates K candidate offloading modes from the DNN and
    chooses the one with the largest reward. The mode with the largest reward
    is stored in the memory, which is further used to train the DNN.
    Adaptive K is implemented: every Delta frames,
    K = min(max(k_idx_his[-Delta:-1]) + 2, N).
    '''

    N = 10                       # number of users
    n = 30000                    # number of time frames
    K = N                        # initialize K = N
    decoder_mode = 'OP'          # the quantization mode could be 'OP' (Order-preserving) or 'KNN'
    Memory = 1024                # capacity of memory structure
    Delta = 32                   # update interval for adaptive K

    print('#user = %d, #channel=%d, K=%d, decoder = %s, Memory = %d, Delta = %d'%(N,n,K,decoder_mode, Memory, Delta))
    # Load data
    channel = sio.loadmat('./data/data_%d' %N)['input_h']
    rate = sio.loadmat('./data/data_%d' %N)['output_obj'] # this rate is only used to plot figures; never used to train DROO.

    # scale h up to close to 1 for better training; it is a trick widely adopted in deep learning
    channel = channel * 1000000

    # generate the train and test data sample index
    # data are split 80:20
    # training data are randomly sampled with duplication if n > total data size

    split_idx = int(.8 * len(channel))
    num_test = min(len(channel) - split_idx, n - int(.8 * n))    # test data size


    mem = MemoryDNN(net = [N, 120, 80, N],
                    learning_rate = 0.01,
                    training_interval=10,
                    batch_size=128,
                    memory_size=Memory
                    )

    start_time = time.time()

    rate_his = []
    rate_his_ratio = []
    mode_his = []
    k_idx_his = []
    K_his = []
    for i in range(n):
        if i % (n//10) == 0:
            print("%0.1f"%(i/n))
        if i > 0 and i % Delta == 0:
            # index counts from 0
            if Delta > 1:
                max_k = max(k_idx_his[-Delta:-1]) + 1
            else:
                max_k = k_idx_his[-1] + 1
            K = min(max_k + 1, N)

        if i < n - num_test:
            # training
            i_idx = i % split_idx
        else:
            # test
            i_idx = i - n + num_test + split_idx

        h = channel[i_idx,:]

        # the action selection must be either 'OP' or 'KNN'
        m_list = mem.decode(h, K, decoder_mode)

        r_list = []
        for m in m_list:
            r_list.append(bisection(h/1000000, m)[0])

        # encode the mode with largest reward
        mem.encode(h, m_list[np.argmax(r_list)])
        # the main code for DROO training ends here




        # the following codes store some interesting metrics for illustration
        # memorize the largest reward
        rate_his.append(np.max(r_list))
        rate_his_ratio.append(rate_his[-1] / rate[i_idx][0])
        # record the index of largest reward
        k_idx_his.append(np.argmax(r_list))
        # record K in case of adaptive K
        K_his.append(K)
        mode_his.append(m_list[np.argmax(r_list)])


    total_time = time.time() - start_time
    mem.plot_cost()
    plot_rate(rate_his_ratio)

    print("Averaged normalized computation rate:", sum(rate_his_ratio[-num_test:])/num_test)
    print('Total time consumed:%s'%total_time)
    print('Average time per channel:%s'%(total_time/n))

    # save data into txt
    save_to_txt(k_idx_his, "k_idx_his.txt")
    save_to_txt(K_his, "K_his.txt")
    save_to_txt(mem.cost_his, "cost_his.txt")
    save_to_txt(rate_his_ratio, "rate_his_ratio.txt")
    save_to_txt(mode_his, "mode_his.txt")

--------------------------------------------------------------------------------
/memory.ipynb:
--------------------------------------------------------------------------------
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "authorship_tag": "ABX9TyMFps9x/kObIv6+SfRFYpWe",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "\"Open"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "\n"
      ],
      "metadata": {
        "id": "ykiMHW1OShjG"
      },
      "execution_count": 1,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# **This memory file contains the memory operations, including encoding and decoding.**\n",
including encoding and decoding operations.**\n", 44 | "version 1.0 -- January 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)\n" 45 | ], 46 | "metadata": { 47 | "id": "lpeCAT7KTAZe" 48 | } 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "source": [], 53 | "metadata": { 54 | "id": "Ou8xRVrVS9Eu" 55 | } 56 | }, 57 | { 58 | "cell_type": "code", 59 | "source": [ 60 | "from __future__ import print_function\n", 61 | "import tensorflow as tf\n", 62 | "import numpy as np" 63 | ], 64 | "metadata": { 65 | "id": "taUp-kQTSVWU" 66 | }, 67 | "execution_count": 1, 68 | "outputs": [] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 2, 73 | "metadata": { 74 | "colab": { 75 | "base_uri": "https://localhost:8080/" 76 | }, 77 | "id": "VljS8oG6SJkK", 78 | "outputId": "4bb4f086-69f6-4516-9ae2-2eb56af9c0e6" 79 | }, 80 | "outputs": [ 81 | { 82 | "output_type": "stream", 83 | "name": "stderr", 84 | "text": [ 85 | "<>:13: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 86 | "<>:130: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 87 | "<>:132: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 88 | "<>:162: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 89 | "<>:13: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 90 | "<>:130: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 91 | "<>:132: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 92 | "<>:162: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 93 | ":13: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 94 | " assert(len(net) is 4) # only 4-layer DNN\n", 95 | ":130: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 96 | " if mode is 'OP':\n", 97 | ":132: SyntaxWarning: \"is\" with a literal. Did you mean \"==\"?\n", 98 | " elif mode is 'KNN':\n", 99 | ":162: SyntaxWarning: \"is\" with a literal. 
Did you mean \"==\"?\n", 100 | " if len(self.enumerate_actions) is 0:\n" 101 | ] 102 | } 103 | ], 104 | "source": [ 105 | "# DNN network for memory\n", 106 | "class MemoryDNN:\n", 107 | " def __init__(\n", 108 | " self,\n", 109 | " net,\n", 110 | " learning_rate = 0.01,\n", 111 | " training_interval=10, \n", 112 | " batch_size=100, \n", 113 | " memory_size=1000,\n", 114 | " output_graph=False\n", 115 | " ):\n", 116 | " # net: [n_input, n_hidden_1st, n_hidded_2ed, n_output]\n", 117 | " assert(len(net) is 4) # only 4-layer DNN\n", 118 | "\n", 119 | " self.net = net\n", 120 | " self.training_interval = training_interval # learn every #training_interval\n", 121 | " self.lr = learning_rate\n", 122 | " self.batch_size = batch_size\n", 123 | " self.memory_size = memory_size\n", 124 | " \n", 125 | " # store all binary actions\n", 126 | " self.enumerate_actions = []\n", 127 | "\n", 128 | " # stored # memory entry\n", 129 | " self.memory_counter = 1\n", 130 | "\n", 131 | " # store training cost\n", 132 | " self.cost_his = []\n", 133 | "\n", 134 | " # reset graph \n", 135 | " tf.reset_default_graph()\n", 136 | "\n", 137 | " # initialize zero memory [h, m]\n", 138 | " self.memory = np.zeros((self.memory_size, self.net[0]+ self.net[-1]))\n", 139 | "\n", 140 | " # construct memory network\n", 141 | " self._build_net()\n", 142 | "\n", 143 | " self.sess = tf.Session()\n", 144 | "\n", 145 | " # for tensorboard\n", 146 | " if output_graph:\n", 147 | " # $ tensorboard --logdir=logs\n", 148 | " # tf.train.SummaryWriter soon be deprecated, use following\n", 149 | " tf.summary.FileWriter(\"logs/\", self.sess.graph)\n", 150 | "\n", 151 | " self.sess.run(tf.global_variables_initializer())\n", 152 | "\n", 153 | "\n", 154 | " def _build_net(self):\n", 155 | " def build_layers(h, c_names, net, w_initializer, b_initializer):\n", 156 | " with tf.variable_scope('l1'):\n", 157 | " w1 = tf.get_variable('w1', [net[0], net[1]], initializer=w_initializer, collections=c_names)\n", 158 | " b1 = tf.get_variable('b1', [1, self.net[1]], initializer=b_initializer, collections=c_names)\n", 159 | " l1 = tf.nn.relu(tf.matmul(h, w1) + b1)\n", 160 | "\n", 161 | " with tf.variable_scope('l2'):\n", 162 | " w2 = tf.get_variable('w2', [net[1], net[2]], initializer=w_initializer, collections=c_names)\n", 163 | " b2 = tf.get_variable('b2', [1, net[2]], initializer=b_initializer, collections=c_names)\n", 164 | " l2 = tf.nn.relu(tf.matmul(l1, w2) + b2)\n", 165 | "\n", 166 | " with tf.variable_scope('M'):\n", 167 | " w3 = tf.get_variable('w3', [net[2], net[3]], initializer=w_initializer, collections=c_names)\n", 168 | " b3 = tf.get_variable('b3', [1, net[3]], initializer=b_initializer, collections=c_names)\n", 169 | " out = tf.matmul(l2, w3) + b3\n", 170 | "\n", 171 | " return out\n", 172 | "\n", 173 | " # ------------------ build memory_net ------------------\n", 174 | " self.h = tf.placeholder(tf.float32, [None, self.net[0]], name='h') # input\n", 175 | " self.m = tf.placeholder(tf.float32, [None, self.net[-1]], name='mode') # for calculating loss\n", 176 | " self.is_train = tf.placeholder(\"bool\") # train or evaluate\n", 177 | "\n", 178 | " with tf.variable_scope('memory_net'):\n", 179 | " c_names, w_initializer, b_initializer = \\\n", 180 | " ['memory_net_params', tf.GraphKeys.GLOBAL_VARIABLES], \\\n", 181 | " tf.random_normal_initializer(0., 1/self.net[0]), tf.constant_initializer(0.1) # config of layers\n", 182 | "\n", 183 | " self.m_pred = build_layers(self.h, c_names, self.net, w_initializer, b_initializer)\n", 184 | "\n", 185 | " 
with tf.variable_scope('loss'):\n", 186 | " self.loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels = self.m, logits = self.m_pred))\n", 187 | "\n", 188 | " with tf.variable_scope('train'):\n", 189 | " self._train_op = tf.train.AdamOptimizer(self.lr, 0.09).minimize(self.loss)\n", 190 | "\n", 191 | "\n", 192 | " def remember(self, h, m):\n", 193 | " # replace the old memory with new memory\n", 194 | " idx = self.memory_counter % self.memory_size\n", 195 | " self.memory[idx, :] = np.hstack((h,m))\n", 196 | "\n", 197 | " self.memory_counter += 1\n", 198 | "\n", 199 | " def encode(self, h, m):\n", 200 | " # encoding the entry\n", 201 | " self.remember(h, m)\n", 202 | " # train the DNN every 10 step\n", 203 | "# if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:\n", 204 | " if self.memory_counter % self.training_interval == 0:\n", 205 | " self.learn()\n", 206 | "\n", 207 | " def learn(self):\n", 208 | " # sample batch memory from all memory\n", 209 | " if self.memory_counter > self.memory_size:\n", 210 | " sample_index = np.random.choice(self.memory_size, size=self.batch_size)\n", 211 | " else:\n", 212 | " sample_index = np.random.choice(self.memory_counter, size=self.batch_size)\n", 213 | " batch_memory = self.memory[sample_index, :]\n", 214 | " \n", 215 | " h_train = batch_memory[:, 0: self.net[0]]\n", 216 | " m_train = batch_memory[:, self.net[0]:]\n", 217 | " \n", 218 | " # print(h_train)\n", 219 | " # print(m_train)\n", 220 | "\n", 221 | " # train the DNN\n", 222 | " _, self.cost = self.sess.run([self._train_op, self.loss], \n", 223 | " feed_dict={self.h: h_train, self.m: m_train})\n", 224 | "\n", 225 | " assert(self.cost >0) \n", 226 | " self.cost_his.append(self.cost)\n", 227 | "\n", 228 | " def decode(self, h, k = 1, mode = 'OP'):\n", 229 | " # to have batch dimension when feed into tf placeholder\n", 230 | " h = h[np.newaxis, :]\n", 231 | "\n", 232 | " m_pred = self.sess.run(self.m_pred, feed_dict={self.h: h})\n", 233 | "\n", 234 | " if mode is 'OP':\n", 235 | " return self.knm(m_pred[0], k)\n", 236 | " elif mode is 'KNN':\n", 237 | " return self.knn(m_pred[0], k)\n", 238 | " else:\n", 239 | " print(\"The action selection must be 'OP' or 'KNN'\")\n", 240 | " \n", 241 | " def knm(self, m, k = 1):\n", 242 | " # return k-nearest-mode\n", 243 | " m_list = []\n", 244 | " \n", 245 | " # generate the first binary offloading decision \n", 246 | " # note that here 'm' is the output of DNN before the sigmoid activation function, in the field of all real number. \n", 247 | " # Therefore, we compare it with '0' instead of 0.5 in equation (8). 
This is because sigmoid(0) = 0.5.\n",
248 | "        m_list.append(1*(m>0))\n",
249 | "        \n",
250 | "        if k > 1:\n",
251 | "            # generate the remaining K-1 binary offloading decisions with respect to equation (9)\n",
252 | "            m_abs = abs(m)\n",
253 | "            idx_list = np.argsort(m_abs)[:k-1]\n",
254 | "            for i in range(k-1):\n",
255 | "                if m[idx_list[i]] >0:\n",
256 | "                    # set a positive user to 0\n",
257 | "                    m_list.append(1*(m - m[idx_list[i]] > 0))\n",
258 | "                else:\n",
259 | "                    # set a negative user to 1\n",
260 | "                    m_list.append(1*(m - m[idx_list[i]] >= 0))\n",
261 | "\n",
262 | "        return m_list\n",
263 | "    \n",
264 | "    def knn(self, m, k = 1):\n",
265 | "        # list all 2^N binary offloading actions\n",
266 | "        if len(self.enumerate_actions) is 0:\n",
267 | "            import itertools\n",
268 | "            self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))\n",
269 | "\n",
270 | "        # the 2-norm\n",
271 | "        sqd = ((self.enumerate_actions - m)**2).sum(1)\n",
272 | "        idx = np.argsort(sqd)\n",
273 | "        return self.enumerate_actions[idx[:k]]\n",
274 | "    \n",
275 | "\n",
276 | "    def plot_cost(self):\n",
277 | "        import matplotlib.pyplot as plt\n",
278 | "        plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)\n",
279 | "        plt.ylabel('Training Loss')\n",
280 | "        plt.xlabel('Time Frames')\n",
281 | "        plt.show()"
282 |       ]
283 |     }
284 |   ]
285 | }
--------------------------------------------------------------------------------
/memory.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # This file contains the memory operations, including encoding and decoding operations.
3 | #
4 | # version 1.0 -- January 2018. Written by Liang Huang (lianghuang AT zjut.edu.cn)
5 | # #################################################################
6 | 
7 | from __future__ import print_function
8 | import tensorflow as tf
9 | import numpy as np
10 | 
11 | 
12 | # DNN network for memory
13 | class MemoryDNN:
14 |     def __init__(
15 |         self,
16 |         net,
17 |         learning_rate = 0.01,
18 |         training_interval=10,
19 |         batch_size=100,
20 |         memory_size=1000,
21 |         output_graph=False
22 |     ):
23 |         # net: [n_input, n_hidden_1st, n_hidden_2nd, n_output]
24 |         assert(len(net) == 4) # only 4-layer DNN
25 | 
26 |         self.net = net
27 |         self.training_interval = training_interval # learn every #training_interval steps
28 |         self.lr = learning_rate
29 |         self.batch_size = batch_size
30 |         self.memory_size = memory_size
31 | 
32 |         # store all binary actions
33 |         self.enumerate_actions = []
34 | 
35 |         # count of stored memory entries
36 |         self.memory_counter = 1
37 | 
38 |         # store training cost
39 |         self.cost_his = []
40 | 
41 |         # reset graph
42 |         tf.reset_default_graph()
43 | 
44 |         # initialize zero memory [h, m]
45 |         self.memory = np.zeros((self.memory_size, self.net[0]+ self.net[-1]))
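        # Each memory row stores one training pair flattened as [h | m]. A minimal
        # illustration (hypothetical sizes, assuming net = [10, 120, 80, 10]):
        #   row length = net[0] + net[-1] = 10 + 10 = 20
        #   row i = [h_1, ..., h_10, m_1, ..., m_10]
        # remember() below overwrites rows cyclically via memory_counter % memory_size.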
46 | 
47 |         # construct memory network
48 |         self._build_net()
49 | 
50 |         self.sess = tf.Session()
51 | 
52 |         # for tensorboard
53 |         if output_graph:
54 |             # $ tensorboard --logdir=logs
55 |             # tf.train.SummaryWriter will soon be deprecated; use the following
56 |             tf.summary.FileWriter("logs/", self.sess.graph)
57 | 
58 |         self.sess.run(tf.global_variables_initializer())
59 | 
60 | 
61 |     def _build_net(self):
62 |         def build_layers(h, c_names, net, w_initializer, b_initializer):
63 |             with tf.variable_scope('l1'):
64 |                 w1 = tf.get_variable('w1', [net[0], net[1]], initializer=w_initializer, collections=c_names)
65 |                 b1 = tf.get_variable('b1', [1, self.net[1]], initializer=b_initializer, collections=c_names)
66 |                 l1 = tf.nn.relu(tf.matmul(h, w1) + b1)
67 | 
68 |             with tf.variable_scope('l2'):
69 |                 w2 = tf.get_variable('w2', [net[1], net[2]], initializer=w_initializer, collections=c_names)
70 |                 b2 = tf.get_variable('b2', [1, net[2]], initializer=b_initializer, collections=c_names)
71 |                 l2 = tf.nn.relu(tf.matmul(l1, w2) + b2)
72 | 
73 |             with tf.variable_scope('M'):
74 |                 w3 = tf.get_variable('w3', [net[2], net[3]], initializer=w_initializer, collections=c_names)
75 |                 b3 = tf.get_variable('b3', [1, net[3]], initializer=b_initializer, collections=c_names)
76 |                 out = tf.matmul(l2, w3) + b3
77 | 
78 |             return out
79 | 
80 |         # ------------------ build memory_net ------------------
81 |         self.h = tf.placeholder(tf.float32, [None, self.net[0]], name='h')  # input
82 |         self.m = tf.placeholder(tf.float32, [None, self.net[-1]], name='mode')  # for calculating loss
83 |         self.is_train = tf.placeholder("bool")  # train or evaluate
84 | 
85 |         with tf.variable_scope('memory_net'):
86 |             c_names, w_initializer, b_initializer = \
87 |                 ['memory_net_params', tf.GraphKeys.GLOBAL_VARIABLES], \
88 |                 tf.random_normal_initializer(0., 1/self.net[0]), tf.constant_initializer(0.1)  # config of layers
89 | 
90 |             self.m_pred = build_layers(self.h, c_names, self.net, w_initializer, b_initializer)
91 | 
92 |         with tf.variable_scope('loss'):
93 |             self.loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels = self.m, logits = self.m_pred))
94 | 
95 |         with tf.variable_scope('train'):
96 |             self._train_op = tf.train.AdamOptimizer(self.lr, 0.09).minimize(self.loss)  # second argument is Adam's beta1
97 | 
98 | 
99 |     def remember(self, h, m):
100 |         # replace the old memory with new memory
101 |         idx = self.memory_counter % self.memory_size
102 |         self.memory[idx, :] = np.hstack((h,m))
103 | 
104 |         self.memory_counter += 1
105 | 
106 |     def encode(self, h, m):
107 |         # encoding the entry
108 |         self.remember(h, m)
109 |         # train the DNN every training_interval steps
110 | #        if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:
111 |         if self.memory_counter % self.training_interval == 0:
112 |             self.learn()
113 | 
114 |     def learn(self):
115 |         # sample batch memory from all memory
116 |         if self.memory_counter > self.memory_size:
117 |             sample_index = np.random.choice(self.memory_size, size=self.batch_size)
118 |         else:
119 |             sample_index = np.random.choice(self.memory_counter, size=self.batch_size)
120 |         batch_memory = self.memory[sample_index, :]
121 | 
122 |         h_train = batch_memory[:, 0: self.net[0]]
123 |         m_train = batch_memory[:, self.net[0]:]
124 | 
125 |         # print(h_train)
126 |         # print(m_train)
127 | 
128 |         # train the DNN
129 |         _, self.cost = self.sess.run([self._train_op, self.loss],
130 |                                      feed_dict={self.h: h_train, self.m: m_train})
131 | 
132 |         assert(self.cost > 0)
133 |         self.cost_his.append(self.cost)
134 | 
135 |     def decode(self, h, k = 1, mode = 'OP'):
136 |         # add a batch dimension before feeding into the tf placeholder
137 |         h = h[np.newaxis, :]
138 | 
139 |         m_pred = self.sess.run(self.m_pred, feed_dict={self.h: h})
140 | 
141 |         if mode == 'OP':
142 |             return self.knm(m_pred[0], k)
143 |         elif mode == 'KNN':
144 |             return self.knn(m_pred[0], k)
145 |         else:
146 |             print("The action selection must be 'OP' or 'KNN'")
147 | 
148 |     def knm(self, m, k = 1):
149 |         # return k order-preserving binary actions
150 |         m_list = []
151 | 
152 |         # generate the first binary offloading decision
153 |         # note that here 'm' is the output of the DNN before the sigmoid activation, i.e., a real-valued logit vector.
154 |         # Therefore, we compare it with 0 instead of 0.5 as in equation (8), since sigmoid(0) = 0.5.
155 |         m_list.append(1*(m>0))
156 | 
157 |         if k > 1:
158 |             # generate the remaining K-1 binary offloading decisions with respect to equation (9)
159 |             m_abs = abs(m)
160 |             idx_list = np.argsort(m_abs)[:k-1]
161 |             for i in range(k-1):
162 |                 if m[idx_list[i]] >0:
163 |                     # set a positive user to 0
164 |                     m_list.append(1*(m - m[idx_list[i]] > 0))
165 |                 else:
166 |                     # set a negative user to 1
167 |                     m_list.append(1*(m - m[idx_list[i]] >= 0))
168 | 
169 |         return m_list
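    # A worked example of the order-preserving quantization above (illustrative
    # logits, not produced by the DNN): with m = [1.2, -0.4, 2.0] and k = 3,
    #   1st action: 1*(m > 0)                  -> [1, 0, 1]
    #   |m| sorted ascending gives idx_list = [1, 0]
    #   idx 1 (m = -0.4 < 0): 1*(m + 0.4 >= 0) -> [1, 1, 1]
    #   idx 0 (m =  1.2 > 0): 1*(m - 1.2 > 0)  -> [0, 0, 1]
    # so knm returns [[1, 0, 1], [1, 1, 1], [0, 0, 1]].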
170 | 
171 |     def knn(self, m, k = 1):
172 |         # list all 2^N binary offloading actions
173 |         if len(self.enumerate_actions) == 0:
174 |             import itertools
175 |             self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))
176 | 
177 |         # the squared 2-norm
178 |         sqd = ((self.enumerate_actions - m)**2).sum(1)
179 |         idx = np.argsort(sqd)
180 |         return self.enumerate_actions[idx[:k]]
181 | 
182 | 
183 |     def plot_cost(self):
184 |         import matplotlib.pyplot as plt
185 |         plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)
186 |         plt.ylabel('Training Loss')
187 |         plt.xlabel('Time Frames')
188 |         plt.show()
189 | 
--------------------------------------------------------------------------------
/memoryPyTorch.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # This file contains the main DROO operations, including building the DNN,
3 | # storing data samples, training the DNN, and generating quantized binary offloading decisions.
4 | 
5 | # version 1.0 -- February 2020. Written based on Tensorflow 2 by Weijian Pan and
6 | # Liang Huang (lianghuang AT zjut.edu.cn)
7 | # #################################################################
8 | 
9 | from __future__ import print_function
10 | import torch
11 | import torch.optim as optim
12 | import torch.nn as nn
13 | import numpy as np
14 | 
15 | print(torch.__version__)
16 | 
17 | 
18 | # DNN network for memory
19 | class MemoryDNN:
20 |     def __init__(
21 |         self,
22 |         net,
23 |         learning_rate = 0.01,
24 |         training_interval=10,
25 |         batch_size=100,
26 |         memory_size=1000,
27 |         output_graph=False
28 |     ):
29 | 
30 |         self.net = net
31 |         self.training_interval = training_interval  # learn every #training_interval steps
32 |         self.lr = learning_rate
33 |         self.batch_size = batch_size
34 |         self.memory_size = memory_size
35 | 
36 |         # store all binary actions
37 |         self.enumerate_actions = []
38 | 
39 |         # count of stored memory entries
40 |         self.memory_counter = 1
41 | 
42 |         # store training cost
43 |         self.cost_his = []
44 | 
45 |         # initialize zero memory [h, m]
46 |         self.memory = np.zeros((self.memory_size, self.net[0] + self.net[-1]))
47 | 
48 |         # construct memory network
49 |         self._build_net()
50 | 
51 |     def _build_net(self):
52 |         self.model = nn.Sequential(
53 |             nn.Linear(self.net[0], self.net[1]),
54 |             nn.ReLU(),
55 |             nn.Linear(self.net[1], self.net[2]),
56 |             nn.ReLU(),
57 |             nn.Linear(self.net[2], self.net[3]),
58 |             nn.Sigmoid()
59 |         )
60 | 
61 |     def remember(self, h, m):
62 |         # replace the old memory with new memory
63 |         idx = self.memory_counter % self.memory_size
64 |         self.memory[idx, :] = np.hstack((h, m))
65 | 
66 |         self.memory_counter += 1
67 | 
68 |     def encode(self, h, m):
69 |         # encoding the entry
70 |         self.remember(h, m)
71 |         # train the DNN every training_interval steps
72 | #        if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:
73 |         if self.memory_counter % self.training_interval == 0:
74 |             self.learn()
75 | 
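    # Note on the training step below: a fresh Adam optimizer is constructed on
    # every call to learn(), so Adam's moment estimates are reset at each update.
    # A common alternative (an untested sketch, not the authors' code) is to build
    # it once in __init__ and reuse it:
    #   self.optimizer = optim.Adam(self.model.parameters(), lr=self.lr,
    #                               betas=(0.09, 0.999), weight_decay=0.0001)
    #   self.criterion = nn.BCELoss()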
76 |     def learn(self):
77 |         # sample batch memory from all memory
78 |         if self.memory_counter > self.memory_size:
79 |             sample_index = np.random.choice(self.memory_size, size=self.batch_size)
80 |         else:
81 |             sample_index = np.random.choice(self.memory_counter, size=self.batch_size)
82 |         batch_memory = self.memory[sample_index, :]
83 | 
84 |         h_train = torch.Tensor(batch_memory[:, 0: self.net[0]])
85 |         m_train = torch.Tensor(batch_memory[:, self.net[0]:])
86 | 
87 | 
88 |         # train the DNN
89 |         optimizer = optim.Adam(self.model.parameters(), lr=self.lr, betas=(0.09, 0.999), weight_decay=0.0001)
90 |         criterion = nn.BCELoss()
91 |         self.model.train()
92 |         optimizer.zero_grad()
93 |         predict = self.model(h_train)
94 |         loss = criterion(predict, m_train)
95 |         loss.backward()
96 |         optimizer.step()
97 | 
98 |         self.cost = loss.item()
99 |         assert(self.cost > 0)
100 |         self.cost_his.append(self.cost)
101 | 
102 |     def decode(self, h, k = 1, mode = 'OP'):
103 |         # add a batch dimension before feeding into the model
104 |         h = torch.Tensor(h[np.newaxis, :])
105 | 
106 |         self.model.eval()
107 |         m_pred = self.model(h)
108 |         m_pred = m_pred.detach().numpy()
109 | 
110 |         if mode == 'OP':
111 |             return self.knm(m_pred[0], k)
112 |         elif mode == 'KNN':
113 |             return self.knn(m_pred[0], k)
114 |         else:
115 |             print("The action selection must be 'OP' or 'KNN'")
116 | 
117 |     def knm(self, m, k = 1):
118 |         # return k order-preserving binary actions
119 |         m_list = []
120 |         # generate the first binary offloading decision with respect to equation (8)
121 |         m_list.append(1*(m>0.5))
122 | 
123 |         if k > 1:
124 |             # generate the remaining K-1 binary offloading decisions with respect to equation (9)
125 |             m_abs = abs(m-0.5)
126 |             idx_list = np.argsort(m_abs)[:k-1]
127 |             for i in range(k-1):
128 |                 if m[idx_list[i]] >0.5:
129 |                     # set the \hat{x}_{t,(k-1)} to 0
130 |                     m_list.append(1*(m - m[idx_list[i]] > 0))
131 |                 else:
132 |                     # set the \hat{x}_{t,(k-1)} to 1
133 |                     m_list.append(1*(m - m[idx_list[i]] >= 0))
134 | 
135 |         return m_list
136 | 
137 |     def knn(self, m, k = 1):
138 |         # list all 2^N binary offloading actions
139 |         if len(self.enumerate_actions) == 0:
140 |             import itertools
141 |             self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))
142 | 
143 |         # the squared 2-norm
144 |         sqd = ((self.enumerate_actions - m)**2).sum(1)
145 |         idx = np.argsort(sqd)
146 |         return self.enumerate_actions[idx[:k]]
147 | 
148 | 
149 |     def plot_cost(self):
150 |         import matplotlib.pyplot as plt
151 |         plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)
152 |         plt.ylabel('Training Loss')
153 |         plt.xlabel('Time Frames')
154 |         plt.show()
155 | 
156 | 
--------------------------------------------------------------------------------
/memoryTF2.py:
--------------------------------------------------------------------------------
1 | # #################################################################
2 | # This file contains the main DROO operations, including building the DNN,
3 | # storing data samples, training the DNN, and generating quantized binary offloading decisions.
4 | 
5 | # version 1.0 -- January 2020. Written based on Tensorflow 2 by Weijian Pan and
6 | # Liang Huang (lianghuang AT zjut.edu.cn)
7 | # #################################################################
8 | 
9 | from __future__ import print_function
10 | import tensorflow as tf
11 | from tensorflow import keras
12 | from tensorflow.keras import layers
13 | import numpy as np
14 | 
15 | print(tf.__version__)
16 | print(tf.keras.__version__)
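# A minimal usage sketch of the class below (mirroring mainTF2.py; values illustrative):
#   mem = MemoryDNN(net=[10, 120, 80, 10], learning_rate=0.01,
#                   training_interval=10, batch_size=128, memory_size=1024)
#   m_list = mem.decode(h, k=10, mode='OP')   # K candidate binary actions
#   mem.encode(h, m_list[best])               # store the best one and (periodically) train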
17 | 
18 | 
19 | # DNN network for memory
20 | class MemoryDNN:
21 |     def __init__(
22 |         self,
23 |         net,
24 |         learning_rate = 0.01,
25 |         training_interval=10,
26 |         batch_size=100,
27 |         memory_size=1000,
28 |         output_graph=False
29 |     ):
30 | 
31 |         self.net = net  # the size of the DNN
32 |         self.training_interval = training_interval  # learn every #training_interval steps
33 |         self.lr = learning_rate
34 |         self.batch_size = batch_size
35 |         self.memory_size = memory_size
36 | 
37 |         # store all binary actions
38 |         self.enumerate_actions = []
39 | 
40 |         # count of stored memory entries
41 |         self.memory_counter = 1
42 | 
43 |         # store training cost
44 |         self.cost_his = []
45 | 
46 |         # initialize zero memory [h, m]
47 |         self.memory = np.zeros((self.memory_size, self.net[0] + self.net[-1]))
48 | 
49 |         # construct memory network
50 |         self._build_net()
51 | 
52 |     def _build_net(self):
53 |         self.model = keras.Sequential([
54 |             layers.Dense(self.net[1], activation='relu'),     # the first hidden layer
55 |             layers.Dense(self.net[2], activation='relu'),     # the second hidden layer
56 |             layers.Dense(self.net[-1], activation='sigmoid')  # the output layer
57 |         ])
58 | 
59 |         self.model.compile(optimizer=keras.optimizers.Adam(lr=self.lr), loss=tf.losses.binary_crossentropy, metrics=['accuracy'])
60 | 
61 |     def remember(self, h, m):
62 |         # replace the old memory with new memory
63 |         idx = self.memory_counter % self.memory_size
64 |         self.memory[idx, :] = np.hstack((h, m))
65 | 
66 |         self.memory_counter += 1
67 | 
68 |     def encode(self, h, m):
69 |         # encoding the entry
70 |         self.remember(h, m)
71 |         # train the DNN every training_interval steps
72 | #        if self.memory_counter> self.memory_size / 2 and self.memory_counter % self.training_interval == 0:
73 |         if self.memory_counter % self.training_interval == 0:
74 |             self.learn()
75 | 
76 |     def learn(self):
77 |         # sample batch memory from all memory
78 |         if self.memory_counter > self.memory_size:
79 |             sample_index = np.random.choice(self.memory_size, size=self.batch_size)
80 |         else:
81 |             sample_index = np.random.choice(self.memory_counter, size=self.batch_size)
82 |         batch_memory = self.memory[sample_index, :]
83 | 
84 |         h_train = batch_memory[:, 0: self.net[0]]
85 |         m_train = batch_memory[:, self.net[0]:]
86 | 
87 |         # print(h_train)  # (128, 10)
88 |         # print(m_train)  # (128, 10)
89 | 
90 |         # train the DNN for one step on the sampled batch
91 |         hist = self.model.fit(h_train, m_train, verbose=0)
92 |         self.cost = hist.history['loss'][0]
93 |         assert(self.cost > 0)
94 |         self.cost_his.append(self.cost)
95 | 
96 |     def decode(self, h, k = 1, mode = 'OP'):
97 |         # add a batch dimension before feeding into the model
98 |         h = h[np.newaxis, :]
99 | 
100 |         m_pred = self.model.predict(h)
101 | 
102 |         if mode == 'OP':
103 |             return self.knm(m_pred[0], k)
104 |         elif mode == 'KNN':
105 |             return self.knn(m_pred[0], k)
106 |         else:
107 |             print("The action selection must be 'OP' or 'KNN'")
108 | 
109 |     def knm(self, m, k = 1):
110 |         # return k order-preserving binary actions
111 |         m_list = []
112 |         # generate the first binary offloading decision with respect to equation (8)
113 |         m_list.append(1*(m>0.5))
114 | 
115 |         if k > 1:
116 |             # generate the remaining K-1 binary offloading decisions with respect to equation (9)
117 |             m_abs = abs(m-0.5)
118 |             idx_list = np.argsort(m_abs)[:k-1]
119 |             for i in range(k-1):
120 |                 if m[idx_list[i]] >0.5:
121 |                     # set the \hat{x}_{t,(k-1)} to 0
122 |                     m_list.append(1*(m - m[idx_list[i]] > 0))
123 |                 else:
124 |                     # set the \hat{x}_{t,(k-1)} to 1
125 |                     m_list.append(1*(m - m[idx_list[i]] >= 0))
126 | 
127 |         return m_list
128 | 
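    # Note on knn below: it enumerates all 2^N binary actions and keeps the k
    # closest (in Euclidean distance) to the relaxed output m, so it is only
    # practical for small N, e.g. N = 10 gives 2^10 = 1024 candidate actions,
    # while N = 30 would already give about 1.07e9.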
129 |     def knn(self, m, k = 1):
130 |         # list all 2^N binary offloading actions
131 |         if len(self.enumerate_actions) == 0:
132 |             import itertools
133 |             self.enumerate_actions = np.array(list(map(list, itertools.product([0, 1], repeat=self.net[0]))))
134 | 
135 |         # the squared 2-norm
136 |         sqd = ((self.enumerate_actions - m)**2).sum(1)
137 |         idx = np.argsort(sqd)
138 |         return self.enumerate_actions[idx[:k]]
139 | 
140 | 
141 |     def plot_cost(self):
142 |         import matplotlib.pyplot as plt
143 |         plt.plot(np.arange(len(self.cost_his))*self.training_interval, self.cost_his)
144 |         plt.ylabel('Training Loss')
145 |         plt.xlabel('Time Frames')
146 |         plt.show()
147 | 
--------------------------------------------------------------------------------
/optimization.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | Created on Tue Jan 9 10:45:26 2018
4 | 
5 | @author: Administrator
6 | """
7 | import numpy as np
8 | from scipy import optimize
9 | from scipy.special import lambertw
10 | import scipy.io as sio  # import scipy.io for .mat file I/O
11 | import time
12 | 
13 | 
14 | def plot_gain(gain_his):
15 |     import matplotlib.pyplot as plt
16 |     import pandas as pd
17 |     import matplotlib as mpl
18 | 
19 |     gain_array = np.asarray(gain_his)
20 |     df = pd.DataFrame(gain_his)
21 | 
22 | 
23 |     mpl.style.use('seaborn')
24 |     fig, ax = plt.subplots(figsize=(15,8))
25 |     rolling_intv = 20
26 | 
27 |     plt.plot(np.arange(len(gain_array))+1, df.rolling(rolling_intv, min_periods=1).mean(), 'b')
28 |     plt.fill_between(np.arange(len(gain_array))+1, df.rolling(rolling_intv, min_periods=1).min()[0], df.rolling(rolling_intv, min_periods=1).max()[0], color = 'b', alpha = 0.2)
29 |     plt.ylabel('Gain ratio')
30 |     plt.xlabel('learning steps')
31 |     plt.show()
32 | 
33 | def bisection(h, M, weights=[]):
34 |     # the bisection algorithm proposed by Suzhi BI
35 |     # average time to find the optimal: 0.012535839796066284 s
36 | 
37 |     # parameters and equations
38 |     o=100
39 |     p=3
40 |     u=0.7
41 |     eta1=((u*p)**(1.0/3))/o
42 |     ki=10**-26
43 |     eta2=u*p/10**-10
44 |     B=2*10**6
45 |     Vu=1.1
46 |     epsilon=B/(Vu*np.log(2))
47 |     x = []  # a = x[0], and tau_j = x[1:]
48 | 
49 |     M0=np.where(M==0)[0]
50 |     M1=np.where(M==1)[0]
51 | 
52 |     hi=np.array([h[i] for i in M0])
53 |     hj=np.array([h[i] for i in M1])
54 | 
55 | 
56 |     if len(weights) == 0:
57 |         # default weights [1, 1.5, 1, 1.5, 1, 1.5, ...]
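        # For example, with len(M) == 4 the line below yields weights = [1, 1.5, 1, 1.5],
        # i.e., odd-indexed wireless devices get weight 1.5 and even-indexed ones weight 1.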
58 | weights = [1.5 if i%2==1 else 1 for i in range(len(M))] 59 | 60 | wi=np.array([weights[M0[i]] for i in range(len(M0))]) 61 | wj=np.array([weights[M1[i]] for i in range(len(M1))]) 62 | 63 | 64 | def sum_rate(x): 65 | sum1=sum(wi*eta1*(hi/ki)**(1.0/3)*x[0]**(1.0/3)) 66 | sum2=0 67 | for i in range(len(M1)): 68 | sum2+=wj[i]*epsilon*x[i+1]*np.log(1+eta2*hj[i]**2*x[0]/x[i+1]) 69 | return sum1+sum2 70 | 71 | def phi(v, j): 72 | return 1/(-1-1/(lambertw(-1/(np.exp( 1 + v/wj[j]/epsilon))).real)) 73 | 74 | def p1(v): 75 | p1 = 0 76 | for j in range(len(M1)): 77 | p1 += hj[j]**2 * phi(v, j) 78 | 79 | return 1/(1 + p1 * eta2) 80 | 81 | def Q(v): 82 | sum1 = sum(wi*eta1*(hi/ki)**(1.0/3))*p1(v)**(-2/3)/3 83 | sum2 = 0 84 | for j in range(len(M1)): 85 | sum2 += wj[j]*hj[j]**2/(1 + 1/phi(v,j)) 86 | return sum1 + sum2*epsilon*eta2 - v 87 | 88 | def tau(v, j): 89 | return eta2*hj[j]**2*p1(v)*phi(v,j) 90 | 91 | # bisection starts here 92 | delta = 0.005 93 | UB = 999999999 94 | LB = 0 95 | while UB - LB > delta: 96 | v = (float(UB) + LB)/2 97 | if Q(v) > 0: 98 | LB = v 99 | else: 100 | UB = v 101 | 102 | x.append(p1(v)) 103 | for j in range(len(M1)): 104 | x.append(tau(v, j)) 105 | 106 | return sum_rate(x), x[0], x[1:] 107 | 108 | 109 | 110 | def cd_method(h): 111 | N = len(h) 112 | M0 = np.random.randint(2,size = N) 113 | gain0,a,Tj= bisection(h,M0) 114 | g_list = [] 115 | M_list = [] 116 | while True: 117 | for j in range(0,N): 118 | M = np.copy(M0) 119 | M[j] = (M[j]+1)%2 120 | gain,a,Tj= bisection(h,M) 121 | g_list.append(gain) 122 | M_list.append(M) 123 | g_max = max(g_list) 124 | if g_max > gain0: 125 | gain0 = g_max 126 | M0 = M_list[g_list.index(g_max)] 127 | else: 128 | break 129 | return gain0, M0 130 | 131 | 132 | if __name__ == "__main__": 133 | 134 | h=np.array([6.06020304235508*10**-6,1.10331933767028*10**-5,1.00213540309998*10**-7,1.21610610942759*10**-6,1.96138838395145*10**-6,1.71456339592966*10**-6,5.24563569673585*10**-6,5.89530717142197*10**-7,4.07769429231962*10**-6,2.88333185798682*10**-6]) 135 | M=np.array([1,0,0,0,1,0,0,0,0,0]) 136 | # h=np.array([1.00213540309998*10**-7,1.10331933767028*10**-5,6.06020304235508*10**-6,1.21610610942759*10**-6,1.96138838395145*10**-6,1.71456339592966*10**-6,5.24563569673585*10**-6,5.89530717142197*10**-7,4.07769429231962*10**-6,2.88333185798682*10**-6]) 137 | # M=np.array([0,0,1,0,1,0,0,0,0,0]) 138 | 139 | 140 | # h = np.array([4.6368924987170947*10**-7, 1.3479411763648968*10**-7, 7.174945246007612*10**-6, 2.5590719803595445*10**-7, 3.3189928740379023*10**-6, 1.2109071327755575*10**-5, 2.394278475886022*10**-6, 2.179121774067472*10**-6, 5.5213902658478367*10**-8, 2.168778154948169*10**-7, 2.053227965874453*10**-6, 7.002952297466865*10**-8, 7.594077851181444*10**-8, 7.904048961975136*10**-7, 8.867218892023474*10**-7, 5.886007653360979*10**-6, 2.3470565740563855*10**-6, 1.387049627074303*10**-7, 3.359475870531776*10**-7, 2.633733784949562*10**-7, 2.189895264149453*10**-6, 1.129177795302099*10**-5, 1.1760290137191366*10**-6, 1.6588656719735275*10**-7, 1.383637788476638*10**-6, 1.4485928387351664*10**-6, 1.4262265958416598*10**-6, 1.1779725004265418*10**-6, 7.738218993031842*10**-7, 4.763534225174186*10**-6]) 141 | # M =np.array( [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,]) 142 | 143 | # time the average speed of bisection algorithm 144 | # repeat = 1 145 | # M =np.random.randint(2, size=(repeat,len(h))) 146 | # start_time=time.time() 147 | # for i in range(repeat): 148 | # gain,a,Tj= 
bisection(h,M[i,:]) 149 | # total_time=time.time()-start_time 150 | # print('time_cost:%s'%(total_time/repeat)) 151 | 152 | gain,a,Tj= bisection(h,M) 153 | print('y:%s'%gain) 154 | print('a:%s'%a) 155 | print('Tj:%s'%Tj) 156 | 157 | # test CD method. Given h, generate the max mode 158 | gain0, M0 = cd_method(h) 159 | print('max y:%s'%gain0) 160 | print(M0) 161 | 162 | # test all data 163 | K = [10, 20, 30] # number of users 164 | N = 1000 # number of channel 165 | 166 | 167 | for k in K: 168 | # Load data 169 | channel = sio.loadmat('./data/data_%d' %int(k))['input_h'] 170 | gain = sio.loadmat('./data/data_%d' %int(k))['output_obj'] 171 | 172 | start_time=time.time() 173 | gain_his = [] 174 | gain_his_ratio = [] 175 | mode_his = [] 176 | for i in range(N): 177 | if i % (N//10) == 0: 178 | print("%0.1f"%(i/N)) 179 | 180 | i_idx = i 181 | 182 | h = channel[i_idx,:] 183 | 184 | # the CD method 185 | gain0, M0 = cd_method(h) 186 | 187 | 188 | # memorize the largest reward 189 | gain_his.append(gain0) 190 | gain_his_ratio.append(gain_his[-1] / gain[i_idx][0]) 191 | 192 | mode_his.append(M0) 193 | 194 | 195 | total_time=time.time()-start_time 196 | print('time_cost:%s'%total_time) 197 | print('average time per channel:%s'%(total_time/N)) 198 | 199 | 200 | plot_gain(gain_his_ratio) 201 | 202 | 203 | print("gain/max ratio: ", sum(gain_his_ratio)/N) 204 | 205 | 206 | 207 | 208 | 209 | 210 | 211 | 212 | 213 | 214 | 215 | 216 | 217 | 218 | 219 | 220 | 221 | 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | 230 | 231 | 232 | 233 | 234 | 235 | 236 | 237 | --------------------------------------------------------------------------------