├── CITATION.cff
├── README.md
├── quickstart.ipynb
└── input_parameters_glossary.ipynb
/CITATION.cff:
--------------------------------------------------------------------------------
1 | cff-version: 1.2.0
2 | title: >-
3 | RouteNet-Fermi: Network Modeling with Graph Neural
4 | Networks
5 | message: >-
6 | If you use this software, please cite it using the
7 | metadata from this file.
8 | authors:
9 | - given-names: Miquel
10 | family-names: Ferriol-Galmés
11 | email: miquel.ferriol@upc.edu
12 | affiliation: Universitat Politècnica de Catalunya
13 | - given-names: Jordi
14 | family-names: Paillisse
15 | affiliation: Universitat Politècnica de Catalunya
16 | - given-names: José
17 | family-names: Suárez-Varela
18 | affiliation: Universitat Politècnica de Catalunya
19 | - given-names: Krzysztof
20 | family-names: Rusek
21 | affiliation: AGH University of Science and Technology
22 | - given-names: Shihan
23 | family-names: Xiao
24 | affiliation: 'Huawei Technologies Co., Ltd.'
25 | - affiliation: 'Huawei Technologies Co., Ltd.'
26 | given-names: Xiang
27 | family-names: Shi
28 | - given-names: Xiangle
29 | family-names: Cheng
30 | affiliation: 'Huawei Technologies Co., Ltd.'
31 | - given-names: Pere
32 | family-names: Barlet-Ros
33 | affiliation: Universitat Politècnica de Catalunya
34 | - given-names: Albert
35 | family-names: Cabellos-Aparicio
36 | affiliation: Universitat Politècnica de Catalunya
37 | abstract: >-
38 | Network models are an essential block of modern
39 | networks. For example, they are widely used in
40 | network planning and optimization. However, as
41 | networks increase in scale and complexity, some
42 | models present limitations, such as the assumption
43 | of markovian traffic in queuing theory models, or
44 | the high computational cost of network simulators.
45 | Recent advances in machine learning, such as Graph
46 | Neural Networks (GNN), are enabling a new
47 | generation of network models that are data-driven
48 | and can learn complex non-linear behaviors. In this
49 | paper, we present RouteNet-Fermi, a custom GNN
50 | model that shares the same goals as queuing theory,
51 | while being considerably more accurate in the
52 | presence of realistic traffic models. The proposed
53 | model predicts accurately the delay, jitter, and
54 | loss in networks. We have tested RouteNet-Fermi in
55 | networks of increasing size (up to 300 nodes),
56 | including samples with mixed traffic profiles --
57 | e.g., with complex non-markovian models -- and
58 | arbitrary routing and queue scheduling
59 | configurations. Our experimental results show that
60 | RouteNet-Fermi achieves similar accuracy as
61 | computationally-expensive packet-level simulators
62 | and it is able to accurately scale to large
63 | networks. For example, the model produces delay
64 | estimates with a mean relative error of 6.24% when
65 | applied to a test dataset with 1,000 samples,
66 | including network topologies one order of magnitude
67 | larger than those seen during training.
68 | doi: 10.48550/arXiv.2212.12070
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # BNNetSimulator
2 |
3 | BNNetSimulator is a packet-level network simulator that is built on the OMNeT++ framework. It allows users to easily generate network datasets for research and analysis. With BNNetSimulator, you can define a network topology with various node and link features, specify a src-dst routing, and set a traffic matrix. The simulator then provides performance metrics such as mean delay, jitter, and packet loss per path, as well as link utilization and QoS queue statistics. With the [datanetAPI](https://github.com/BNN-UPC/datanetAPI/tree/BNNetSimulator) tool, you can easily process and analyze the generated dataset.
4 |
5 |
6 |
7 | ## Parameters
8 |
9 | BNNetSimulator offers a variety of parameters that allow you to customize the simulation to your specific needs. You can define several characteristics of the topology, including:
10 |
11 | - The scheduling policy used to serve packets and the buffer size of the queues
12 |
13 | - The capacities of the links
14 |
15 | - The src-dst routing
16 |
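The routing file referenced above is plain text: one path per line, written as a comma-separated sequence of node IDs (this mirrors the shortest-path generator in quickstart.ipynb). A minimal, stdlib-only sketch of producing such lines, using a hypothetical 4-node ring topology:

```python
from collections import deque

# Hypothetical 4-node ring topology as an adjacency dict (illustration only)
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}

def shortest_path(graph, src, dst):
    """BFS shortest path from src to dst, returned as a list of node IDs."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in graph[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    path = []
    while dst is not None:
        path.append(dst)
        dst = prev[dst]
    return path[::-1]

# One routing line per src-dst pair: a comma-separated node sequence
routing_lines = [
    ",".join(str(n) for n in shortest_path(graph, src, dst))
    for src in graph for dst in graph if src != dst
]
print(routing_lines[0])  # "0,1"
```

In practice the quickstart notebook computes these paths with `nx.shortest_path` on the generated topology; the on-disk format is the same.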
17 | The simulator also supports different types of traffic profiles, including:
18 |
19 | - Poisson distribution
20 |
21 | - CBR (Constant Bit Rate) distribution
22 |
23 | - ON-OFF distribution
24 |
25 | - Generic distribution generated with a Python script
26 |
27 | Additionally, you can define the packet size distribution as a list of packet sizes and their probabilities, or through a generic Python script. For more information on the available parameters and how to use them, please refer to the [input_parameters_glossary.ipynb](input_parameters_glossary.ipynb) document.
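As a sketch of how these distributions are encoded on a traffic-matrix line (the exact parameter encodings are documented in input_parameters_glossary.ipynb; the example values below mirror the quickstart notebook), both the time distribution and the packet-size distribution are comma-separated strings, and a flow line combines source, destination, average bandwidth, time distribution, packet-size distribution, and ToS:

```python
# Time-distribution strings (values mirror quickstart.ipynb):
poisson = "0"      # Poisson arrivals, no extra parameters
cbr = "1"          # constant bit rate, no extra parameters
on_off = "2,10,5"  # ON-OFF: avg on_time=10s, avg off_time=5s (exponential)

# Generic packet-size distribution: a leading id, then (size, probability) pairs
pkt_dist = "0,300,0.5,1700,0.5"

def pkt_probs_sum(dist):
    """Sum the probabilities of a generic packet-size distribution string."""
    fields = dist.split(",")[1:]                # drop the leading distribution id
    return sum(float(p) for p in fields[1::2])  # every second field is a probability

# The probabilities of a packet-size distribution should sum to 1
assert pkt_probs_sum(pkt_dist) == 1.0

# One traffic-matrix line: src,dst,avg_bw,time_dist,pkt_size_dist,tos
flow_line = "{},{},{},{},{},{}".format(0, 1, 500, on_off, pkt_dist, 0)
print(flow_line)  # 0,1,500,2,10,5,0,300,0.5,1700,0.5,0
```

The helper `pkt_probs_sum` is a hypothetical sanity check, not part of the simulator; it simply illustrates how the distribution strings are laid out.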
28 |
29 |
30 |
31 |
32 | The BNNetSimulator is available as a pre-built docker image, which can be found at https://hub.docker.com/r/bnnupc/bnnetsimulator. To use it, you can simply follow the code examples provided in the Jupyter Notebooks of the accompanying repository.
33 |
34 | The repository includes two main Jupyter Notebooks that will guide you through the process of using the simulator:
35 |
36 | - [quickstart.ipynb](quickstart.ipynb): This notebook provides a step-by-step introduction to the process of generating your first dataset using the BNNetSimulator. It covers all the necessary steps and provides clear examples to help you get started.
37 |
38 | - [input_parameters_glossary.ipynb](input_parameters_glossary.ipynb): This notebook contains detailed descriptions of the various parameters that can be used to build a dataset. It provides information on what each parameter does, how to use it, and how it affects the resulting dataset.
39 |
40 |
41 |
42 | Both quickstart.ipynb and input_parameters_glossary.ipynb are great resources for anyone getting started with BNNetSimulator.
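For reference, the simulator reads a small YAML configuration file (conf.yml) from the mounted data directory; the quickstart notebook generates one equivalent to the following (field meanings as documented there):

```yaml
threads: 6               # number of parallel simulation threads
dataset_name: dataset-1  # results are written to results/<dataset_name>
samples_per_file: 10     # number of samples per compressed output file
rm_prev_results: "n"     # "y" removes an existing results folder first
write_pkt_info: "n"      # "y" writes per-packet traces to the pkts_info folder
```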
43 |
44 |
45 |
46 | ## Cite the simulator
47 |
48 | If you use the simulator, please cite it as follows:
49 |
50 | **Plain text:**
51 |
52 | Miquel Ferriol-Galmés, Jordi Paillisse, José Suárez-Varela, Krzysztof Rusek, Shihan Xiao, Xiang Shi, Xiangle Cheng, Pere Barlet-Ros, and Albert Cabellos-Aparicio. (2022). RouteNet-Fermi: Network Modeling with Graph Neural Networks. arXiv preprint arXiv:2212.12070.
53 |
54 | **BibTeX:**
55 |
56 | ```
57 | @article{ferriol2022routenet,
58 |   title={RouteNet-Fermi: Network Modeling with Graph Neural Networks},
59 |   author={Ferriol-Galm{\'e}s, Miquel and Paillisse, Jordi and Su{\'a}rez-Varela, Jos{\'e} and Rusek, Krzysztof and Xiao, Shihan and Shi, Xiang and Cheng, Xiangle and Barlet-Ros, Pere and Cabellos-Aparicio, Albert},
60 |   journal={arXiv preprint arXiv:2212.12070},
61 |   year={2022}
62 | }
63 | ```
71 |
72 |
73 |
74 | ## Setting up the Docker environment
75 |
76 |
77 |
78 | To use the BNNetSimulator, you will need to have either Docker Engine or Docker Desktop installed on your machine. Both of these tools provide a way to run the BNNetSimulator as a Docker container, which makes it easy to set up and use.
79 |
80 |
81 |
82 | Here are the links to the instructions for installing Docker on your machine, depending on your operating system:
83 |
84 |
85 |
86 | - Docker Desktop: https://docs.docker.com/desktop/
87 |
88 | - Docker Engine: https://docs.docker.com/engine/
89 |
90 |
91 |
92 | Make sure to follow the instructions for your specific operating system and have Docker installed before you try to use the BNNetSimulator. If you run into any trouble, you can always refer to the official Docker documentation.
--------------------------------------------------------------------------------
/quickstart.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 |     "## Example Solution Walkthrough\n",
8 | "\n",
9 | "The following example will take you through the steps to generate a dataset for your network simulation. We recommend using this as a template for your own implementation and customizing it to suit your specific needs."
10 | ]
11 | },
12 | {
13 | "cell_type": "markdown",
14 | "metadata": {},
15 | "source": [
16 | "### Dataset Generation\n",
17 | "To generate the dataset, we will first need to define the graph topology, routing paths, and traffic matrix for each sample. These parameters will be used by the simulator to calculate the delay, jitter, and drops for each path.\n",
18 | "\n",
19 | "To begin, we will define the graph topology, including the nodes and edges that make up the graph, as well as the scheduling policy and buffer size for each node. We will then create a routing file that defines the paths between the nodes in the topology.\n",
20 | "\n",
21 | "Next, we will generate the traffic matrix, which includes information on the source and destination nodes, average bandwidth, time distribution, packet size and frequency, and ToS for each flow.\n",
22 | "\n",
23 | "Once we have defined these parameters, we can run the simulation and collect performance metrics such as delay, jitter, and drops for each path.\n",
24 | "\n",
25 | "If you need more information on the parameters of the dataset, check out the [input_parameters_glossary.ipynb](input_parameters_glossary.ipynb) notebook, which provides a detailed explanation of each parameter."
26 | ]
27 | },
28 | {
29 | "cell_type": "code",
30 | "execution_count": null,
31 | "metadata": {},
32 | "outputs": [],
33 | "source": [
34 | "import networkx as nx\n",
35 | "import random\n",
36 | "import os"
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": null,
42 | "metadata": {},
43 | "outputs": [],
44 | "source": [
45 | "# Define destination for the generated samples\n",
46 | "training_dataset_path = \"training\"\n",
47 | "#paths relative to data folder\n",
48 | "graphs_path = \"graphs\"\n",
49 | "routings_path = \"routings\"\n",
50 | "tm_path = \"tm\"\n",
51 | "# Path to simulator file\n",
52 | "simulation_file = os.path.join(training_dataset_path,\"simulation.txt\")\n",
53 | "# Name of the dataset: Allows you to store several datasets in the same path\n",
54 | "# Each dataset will be stored at /results/\n",
55 | "dataset_name = \"dataset-1\""
56 | ]
57 | },
58 | {
59 | "cell_type": "code",
60 | "execution_count": null,
61 | "metadata": {},
62 | "outputs": [],
63 | "source": [
64 | "# Create folders\n",
65 | "if os.path.isdir(training_dataset_path):\n",
66 | " print (\"Destination path already exists. Files within the directory may be overwritten.\")\n",
67 | "else:\n",
68 | " os.makedirs(os.path.join(training_dataset_path,graphs_path))\n",
69 | " os.mkdir(os.path.join(training_dataset_path,routings_path))\n",
70 | " os.mkdir(os.path.join(training_dataset_path,tm_path))"
71 | ]
72 | },
73 | {
74 | "cell_type": "code",
75 | "execution_count": null,
76 | "metadata": {},
77 | "outputs": [],
78 | "source": [
79 | "'''\n",
80 | "Generate a graph topology file. The graphs generated have the following characteristics:\n",
81 | "- The network is able to process 3 ToS: 0,1,2\n",
82 | "- All nodes have buffer sizes of 32000 bits and WFQ scheduling. ToS 0 is assigned to the first queue, and ToS 1 and 2 to the second queue.\n",
83 | "- All links have bandwidths of 100000 bits per second\n",
84 | "'''\n",
85 | "def generate_topology(net_size, graph_file):\n",
86 | " G = nx.Graph()\n",
87 | " \n",
88 | " # Set the maximum number of ToS that will use the input traffic of the network\n",
89 | " G.graph[\"levelsToS\"] = 3\n",
90 | " \n",
91 | " nodes = []\n",
92 | " node_degree = []\n",
93 | " for n in range(net_size):\n",
94 | " node_degree.append(random.choices([2,3,4,5,6],weights=[0.34,0.35,0.2,0.1,0.01])[0])\n",
95 | " \n",
96 | " nodes.append(n)\n",
97 | " G.add_node(n)\n",
98 | "        # Assign to each node the scheduling policy\n",
99 | "        G.nodes[n][\"schedulingPolicy\"] = \"WFQ\"\n",
100 | " # Assign ToS to scheduling queues.\n",
101 | " # In this case we have two queues per port. ToS 0 is assigned to the first queue and ToS 1 and 2 to the second queue\n",
102 | " G.nodes[n][\"tosToQoSqueue\"] = \"0;1,2\"\n",
103 | " # Assign weights to each queue\n",
104 | " G.nodes[n][\"schedulingWeights\"] = \"60, 40\"\n",
105 | " # Assign the buffer size of all the ports of the node\n",
106 | " G.nodes[n][\"bufferSizes\"] = 32000\n",
107 | "\n",
108 | " finish = False\n",
109 | " while True:\n",
110 | " aux_nodes = list(nodes)\n",
111 | " n0 = random.choice(aux_nodes)\n",
112 | " aux_nodes.remove(n0)\n",
113 | "        # Remove adjacent nodes (only one link between two nodes)\n",
114 | " for n1 in G[n0]:\n",
115 | " if n1 in aux_nodes:\n",
116 | " aux_nodes.remove(n1)\n",
117 | " if len(aux_nodes) == 0:\n",
118 | "            # No more links can be added to this node - cannot achieve the target node_degree for this node\n",
119 | " nodes.remove(n0)\n",
120 | " if len(nodes) == 1:\n",
121 | " break\n",
122 | " continue\n",
123 | " n1 = random.choice(aux_nodes)\n",
124 | " G.add_edge(n0, n1)\n",
125 | " # Assign the link capacity to the link\n",
126 | " G[n0][n1][\"bandwidth\"] = 100000\n",
127 | " \n",
128 | " for n in [n0,n1]:\n",
129 | " node_degree[n] -= 1\n",
130 | " if (node_degree[n] == 0):\n",
131 | " nodes.remove(n)\n",
132 | " if (len(nodes) == 1):\n",
133 | " finish = True\n",
134 | " break\n",
135 | " if finish:\n",
136 | " break\n",
137 | " if not nx.is_connected(G):\n",
138 | " G = generate_topology(net_size, graph_file)\n",
139 | " return G\n",
140 | " \n",
141 | " nx.write_gml(G,graph_file)\n",
142 | " \n",
143 | " return G"
144 | ]
145 | },
146 | {
147 | "cell_type": "code",
148 | "execution_count": null,
149 | "metadata": {},
150 | "outputs": [],
151 | "source": [
152 | "'''\n",
153 | "Generate a file with the shortest path routing of the topology G\n",
154 | "'''\n",
155 | "def generate_routing(G, routing_file):\n",
156 | " with open(routing_file,\"w\") as r_fd:\n",
157 | " lPaths = nx.shortest_path(G)\n",
158 | " for src in G:\n",
159 | " for dst in G:\n",
160 | " if src == dst:\n",
161 | " continue\n",
162 | " path = ','.join(str(x) for x in lPaths[src][dst])\n",
163 | " r_fd.write(path+\"\\n\")"
164 | ]
165 | },
166 | {
167 | "cell_type": "code",
168 | "execution_count": null,
169 | "metadata": {},
170 | "outputs": [],
171 | "source": [
172 | "'''\n",
173 | "Generate a traffic matrix file. We consider flows between all nodes in the network, each with the following characteristics:\n",
174 | "- The average bandwidth ranges between 10 and max_avg_lbda\n",
175 | "- We consider three time distributions (in case of the ON-OFF policy we have on periods of 10 and off periods of 5)\n",
176 | "- We consider two packet size distributions, chosen at random\n",
177 | "- ToS is assigned randomly\n",
178 | "'''\n",
179 | "def generate_tm(G, max_avg_lbda, traffic_file):\n",
180 | " poisson = \"0\" \n",
181 | " cbr = \"1\"\n",
182 | " on_off = \"2,10,5\" #time_distribution, avg on_time exp, avg off_time exp\n",
183 | " time_dist = [poisson,cbr,on_off]\n",
184 | " \n",
185 | "    pkt_dist_1 = \"0,300,0.5,1700,0.5\" #generic pkt size dist, pkt_size 1, prob 1, pkt_size 2, prob 2\n",
186 | "    pkt_dist_2 = \"0,500,0.6,1000,0.2,1400,0.2\" #generic pkt size dist, pkt_size 1, prob 1, \n",
187 | " # pkt_size 2, prob 2, pkt_size 3, prob 3\n",
188 | " pkt_size_dist = [pkt_dist_1, pkt_dist_2]\n",
189 | " tos_lst = [0,1,2]\n",
190 | " \n",
191 | " with open(traffic_file,\"w\") as tm_fd:\n",
192 | " for src in G:\n",
193 | "        for dst in G:\n",
194 | "            # Skip self-pairs: generate_routing writes no src==dst paths\n",
195 | "            if src == dst:\n",
196 | "                continue\n",
197 | "            avg_bw = random.randint(10,max_avg_lbda)\n",
195 | " td = random.choice(time_dist)\n",
196 | " sd = random.choice(pkt_size_dist)\n",
197 | " tos = random.choice(tos_lst)\n",
198 | " \n",
199 | " traffic_line = \"{},{},{},{},{},{}\".format(\n",
200 | " src,dst,avg_bw,td,sd,tos)\n",
201 | " tm_fd.write(traffic_line+\"\\n\")"
202 | ]
203 | },
204 | {
205 | "cell_type": "code",
206 | "execution_count": null,
207 | "metadata": {},
208 | "outputs": [],
209 | "source": [
210 | "\"\"\"\n",
211 | "We generate the files using the previously defined functions. This code will produce 100 samples where:\n",
212 | "- We generate 5 topologies, and then we generate 20 traffic matrices for each\n",
213 | "- The topology sizes range from 6 to 10 nodes\n",
214 | "- We consider the maximum average bandwidth per flow as 1000\n",
215 | "\"\"\"\n",
216 | "max_avg_lbda = 1000\n",
217 | "with open (simulation_file,\"w\") as fd:\n",
218 | " for net_size in range (6,11):\n",
219 | " #Generate graph\n",
220 | " graph_file = os.path.join(graphs_path,\"graph_{}.txt\".format(net_size))\n",
221 | " G = generate_topology(net_size, os.path.join(training_dataset_path,graph_file))\n",
222 | " # Generate routing\n",
223 | " routing_file = os.path.join(routings_path,\"routing_{}.txt\".format(net_size))\n",
224 | " generate_routing(G, os.path.join(training_dataset_path,routing_file))\n",
225 | " # Generate TM:\n",
226 | " for i in range (20):\n",
227 | " tm_file = os.path.join(tm_path,\"tm_{}_{}.txt\".format(net_size,i))\n",
228 | " generate_tm(G,max_avg_lbda, os.path.join(training_dataset_path,tm_file))\n",
229 | " sim_line = \"{},{},{}\\n\".format(graph_file,routing_file,tm_file) \n",
230 | "            # If the dataset was generated on Windows, convert paths to Linux format\n",
231 | " fd.write(sim_line.replace(\"\\\\\",\"/\")) "
232 | ]
233 | },
234 | {
235 | "cell_type": "markdown",
236 | "metadata": {},
237 | "source": [
238 | "Now that we have created the input files for the simulator, we are ready to run the simulation and collect the performance metrics. To do this, we will use a Docker image that contains all the necessary tools and dependencies.\n",
239 | "\n",
240 | "The Docker image is hosted on Docker Hub, so the first time you run the \"docker run\" command the image will be downloaded automatically. Just make sure your computer is connected to the internet.\n",
241 | "\n",
242 | "Once the image is downloaded, you can use the \"docker run\" command to start the simulation and pass in the input files as parameters. The simulator will then use these input files to calculate the delay, jitter, and drops for each path.\n",
243 | "\n",
244 | "It's worth noting that the use of a Docker container ensures that the simulation runs in a consistent environment, regardless of the host machine's operating system and dependencies."
245 | ]
246 | },
247 | {
248 | "cell_type": "code",
249 | "execution_count": null,
250 | "metadata": {},
251 | "outputs": [],
252 | "source": [
253 | "# First we generate the configuration file\n",
254 | "import yaml\n",
255 | "\n",
256 | "conf_file = os.path.join(training_dataset_path,\"conf.yml\")\n",
257 | "conf_parameters = {\n",
258 | " \"threads\": 6,# Number of threads to use \n",
259 | " \"dataset_name\": dataset_name, # Name of the dataset. It is created in /results/\n",
260 | " \"samples_per_file\": 10, # Number of samples per compressed file\n",
261 | " \"rm_prev_results\": \"n\", # If 'y' is selected and the results folder already exists, the folder is removed.\n",
262 | " \"write_pkt_info\": \"n\", # If 'y' is selected, a file per simulation is created in the pkts_info folder of the dataset. This file contain a line per packet with the following data: src_id dst_id flow_id tos timestamp(ns) pkt_size[ delay(ns)]\n",
263 | "}\n",
264 | "\n",
265 | "with open(conf_file, 'w') as fd:\n",
266 | " yaml.dump(conf_parameters, fd)"
267 | ]
268 | },
269 | {
270 | "cell_type": "code",
271 | "execution_count": null,
272 | "metadata": {},
273 | "outputs": [],
274 | "source": [
275 | "from getpass import getpass\n",
276 | "def docker_cmd(training_dataset_path):\n",
277 | " raw_cmd = f\"docker run --rm --mount type=bind,src={os.path.join(os.getcwd(),training_dataset_path)},dst=/data bnnupc/bnnetsimulator\"\n",
278 | " terminal_cmd = raw_cmd\n",
279 | " if os.name != 'nt': # Unix, requires sudo\n",
280 | " print(\"Superuser privileges are required to run docker. Introduce sudo password when prompted\")\n",
281 | " terminal_cmd = f\"echo {getpass()} | sudo -S \" + raw_cmd\n",
282 | " raw_cmd = \"sudo \" + raw_cmd\n",
283 | " return raw_cmd, terminal_cmd"
284 | ]
285 | },
286 | {
287 | "cell_type": "code",
288 | "execution_count": null,
289 | "metadata": {},
290 | "outputs": [],
291 | "source": [
292 | "# Start the docker\n",
293 | "raw_cmd, terminal_cmd = docker_cmd(training_dataset_path)\n",
294 | "print(\"The next cell will launch docker from the notebook. Alternatively, run the following command from a terminal:\")\n",
295 | "print(raw_cmd)"
296 | ]
297 | },
298 | {
299 | "cell_type": "markdown",
300 | "metadata": {},
301 | "source": [
302 | "It is possible that the execution cell may not produce an output until it finishes running the simulation. In this case, you can check the status of the simulation using the logs feature in Docker Desktop.\n",
303 | "\n",
304 | "To do this, simply go to the \"Containers\" section, select the \"bnnupc/bnnetsimulator\" container, and then click on \"Logs\". This will give you access to the log file, which contains information about the progress of the simulation.\n",
305 | "\n",
306 | "The log file contains one line per simulated sample, with the first value indicating the corresponding line of the simulation file, followed by \"Ok\" if the simulation finished properly, or an error message if there were any issues. The log file is located at /out.log.\n",
307 | "\n",
308 | "It is recommended to regularly check the log file to ensure that the simulation is progressing as expected. This will help you catch any issues early on and take the necessary steps to resolve them.\n",
309 | "\n",
310 | "Additionally, you should also check the log file after the simulation has finished to ensure that there were no errors or other issues that may have affected the results."
311 | ]
312 | },
313 | {
314 | "cell_type": "code",
315 | "execution_count": null,
316 | "metadata": {},
317 | "outputs": [],
318 | "source": [
319 | "!{terminal_cmd}"
320 | ]
321 | }
322 | ],
323 | "metadata": {
324 | "interpreter": {
325 | "hash": "e4fabe4be1dcb5b95007215d13ed47b80f9ccf78939eea74ae4a681230c3cbef"
326 | },
327 | "kernelspec": {
328 | "display_name": "Python 3",
329 | "language": "python",
330 | "name": "python3"
331 | },
332 | "language_info": {
333 | "codemirror_mode": {
334 | "name": "ipython",
335 | "version": 3
336 | },
337 | "file_extension": ".py",
338 | "mimetype": "text/x-python",
339 | "name": "python",
340 | "nbconvert_exporter": "python",
341 | "pygments_lexer": "ipython3",
342 | "version": "3.8.10"
343 | }
344 | },
345 | "nbformat": 4,
346 | "nbformat_minor": 4
347 | }
348 |
--------------------------------------------------------------------------------
/input_parameters_glossary.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Dataset Generation Parameters Glossary\n",
8 | "**THIS IS A GLOSSARY MEANT TO BE A REFERENCE, THE CODE CELLS ARE NOT MEANT TO BE EXECUTED**\n",
9 | "\n",
10 | "To build the dataset, we'll be working with three types of files that make up each sample:\n",
11 | "\n",
12 | "- **Graph topology**: This file represents a graph topology, including the nodes and edges that form it, as well as the characteristics of each.\n",
13 | "- **Routing file**: This file shows the paths that traffic can take between each node within the graph topology.\n",
14 | "- **Traffic matrix (TM)**: This file represents the flows of traffic through a given network. It describes the traffic moving from one node to another.\n",
15 | "\n",
16 | "Each sample in the dataset is identified by a unique combination of these three files. This means that we can generate multiple samples from the same graph topology by pairing it with different traffic matrices, for example.\n",
17 | "\n",
18 | "In the following sections, we'll show you how to generate these files, and how their properties can be altered to create different variations of the dataset."
19 | ]
20 | },
21 | {
22 | "cell_type": "code",
23 | "execution_count": 1,
24 | "metadata": {},
25 | "outputs": [],
26 | "source": [
27 | "import networkx as nx\n",
28 | "import random"
29 | ]
30 | },
31 | {
32 | "cell_type": "code",
33 | "execution_count": 2,
34 | "metadata": {},
35 | "outputs": [],
36 | "source": [
37 | "# Generate, for instance, a complete graph\n",
38 | "G = nx.complete_graph(10)\n",
39 | "\n",
40 | "# Set the number of ToS that the input traffic of the network can use. If it is not defined, the simulator sets it to 1.\n",
41 | "G.graph[\"levelsToS\"] = 3\n",
42 | "\n",
43 | "# Assign bandwidth to each edge of the graph. Its value is considered in bps.\n",
44 | "for (n0,n1) in G.edges():\n",
45 | " G[n0][n1][\"bandwidth\"] = 100000"
46 | ]
47 | },
48 | {
49 | "cell_type": "markdown",
50 | "metadata": {},
51 | "source": [
52 | "Each node in our network simulation is defined by two key characteristics:\n",
53 | "\n",
54 | "- **Scheduling policy**: The way in which packets are processed at each output port is determined by the state of the queues and the chosen scheduling policy. We consider the following four policies:\n",
55 | " - *First In First Out (FIFO)*: all packets are placed in a single queue and served in the order they arrived, regardless of their type of service (ToS).\n",
56 | " - *Strict Priority (SP)*: one queue is designated for each priority level, and packets in higher priority queues are served first.\n",
57 | " - *Weighted Fair Queueing (WFQ)*: each queue is assigned a weight according to the configuration. The sum of all weights must equal 100. The policy selects a queue to serve based on its weight and the data rate of that queue, ensuring fairness among all queues.\n",
58 | " - *Deficit Round Robin (DRR)*: similar to WFQ, each queue is assigned a weight according to the configuration. The sum of all weights must equal 100. The policy cycles through the queues, dedicating time to each queue proportional to its weight.\n",
59 | "- **Buffer size**: the amount of storage space available at each output port for packets that are waiting to be processed. When a packet arrives and the outgoing queue is full, it will be dropped. The buffer size is measured in bits, with a minimum value of 8000 bits.\n",
60 | "\n",
61 | "It's important to remember that you will have to define these characteristics for all nodes in the topology. The scheduling policy of a node is stored in the attribute schedulingPolicy as a string.\n",
62 | "\n",
63 | "Here are some examples of nodes with the scheduling policy and buffer size defined:\n"
64 | ]
65 | },
66 | {
67 | "cell_type": "code",
68 | "execution_count": 3,
69 | "metadata": {},
70 | "outputs": [],
71 | "source": [
72 | "# Let's configure all the nodes with a FIFO policy\n",
73 | "for node in G:\n",
74 | " G.nodes[node][\"schedulingPolicy\"] = \"FIFO\""
75 | ]
76 | },
77 | {
78 | "attachments": {},
79 | "cell_type": "markdown",
80 | "metadata": {},
81 | "source": [
82 | "When it comes to simulating network traffic, it's important to be able to control how packets with different types of service (ToS) are handled within the network. With the scheduling policies SP, WFQ, and DRR, we have an additional level of control through the tosToQoSqueue attribute.\n",
83 | "\n",
84 | "This attribute allows us to specify in which queue packets with each ToS should be placed. For example, if we set the attribute to \"0;1,2\", we are telling the simulator to create two queues. Traffic with ToS 0 will be assigned to the first queue, and traffic with ToS 1 and 2 will be assigned to the second queue.\n",
85 | "\n",
86 | "If the tosToQoSqueue attribute is not set, the simulator will use a default behavior: it will create one queue for each ToS defined in the levelsToS attribute, and assign one ToS to each queue.\n",
87 | "\n",
88 | "Here's an example of a node with the tosToQoSqueue attribute defined:"
89 | ]
90 | },
91 | {
92 | "cell_type": "code",
93 | "execution_count": null,
94 | "metadata": {},
95 | "outputs": [],
96 | "source": [
97 | "# Let's configure all the nodes with a SP policy with two queues.\n",
98 | "for node in G:\n",
99 | " G.nodes[node][\"schedulingPolicy\"] = \"SP\"\n",
100 | " G.nodes[node][\"tosToQoSqueue\"] = \"0;1,2\""
101 | ]
102 | },
103 | {
104 | "cell_type": "markdown",
105 | "metadata": {},
106 | "source": [
107 | "When it comes to the scheduling policies of WFQ and DRR, we have an additional level of control over how packets are processed through the use of \"weights\" assigned to each queue. This is where the attribute schedulingWeights comes into play.\n",
108 | "\n",
109 | "To use this attribute, we will feed it a string containing the weights for each queue, separated by commas. For example, if we have 3 queues and we want to assign weights of 45, 30 and 25 to them respectively, the attribute will be set as \"45,30,25\".\n",
110 | "\n",
111 | "It is important to note that the sum of all weights must equal 100, otherwise the simulation will not work as expected.\n",
112 | "\n",
113 | "Here's an example of a node with the schedulingWeights attribute defined:"
114 | ]
115 | },
116 | {
117 | "cell_type": "code",
118 | "execution_count": 8,
119 | "metadata": {},
120 | "outputs": [],
121 | "source": [
122 | "# Let's configure all the nodes with a WFQ policy with three queues\n",
123 | "for node in G:\n",
124 | " G.nodes[node][\"schedulingPolicy\"] = \"WFQ\"\n",
125 | " G.nodes[node][\"tosToQoSqueue\"] = \"0;1;2\"\n",
126 | " G.nodes[node][\"schedulingWeights\"] = \"45, 30, 25\""
127 | ]
128 | },
129 | {
130 | "cell_type": "code",
131 | "execution_count": null,
132 | "metadata": {},
133 | "outputs": [],
134 | "source": [
135 | "# Let's configure all the nodes with a DRR policy with three queues\n",
136 | "for node in G:\n",
137 | " G.nodes[node][\"schedulingPolicy\"] = \"DRR\"\n",
138 | " G.nodes[node][\"tosToQoSqueue\"] = \"0;1;2\"\n",
139 | " G.nodes[node][\"schedulingWeights\"] = \"45, 30, 25\""
140 | ]
141 | },
142 | {
143 | "cell_type": "markdown",
144 | "metadata": {},
145 | "source": [
146 | "To configure the buffer size in our network simulation, we only need to modify the attribute bufferSizes and include the desired size of the buffer in bits. It's important to keep in mind that the buffer size should be at least 8000 bits.\n",
147 | "\n",
148 | "Here's an example of a node with the bufferSizes attribute defined:"
149 | ]
150 | },
151 | {
152 | "cell_type": "code",
153 | "execution_count": 4,
154 | "metadata": {},
155 | "outputs": [],
156 | "source": [
157 | "# Assign to each node a queue size of 32000 bits\n",
158 | "for node in G:\n",
159 | " G.nodes[node][\"bufferSizes\"] = 32000"
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": 5,
165 | "metadata": {},
166 | "outputs": [],
167 | "source": [
168 | "# Finally we save the topology\n",
169 | "graph_file = \"graph.txt\"\n",
170 | "nx.write_gml(G,graph_file)"
171 | ]
172 | },
173 | {
174 | "cell_type": "markdown",
175 | "metadata": {},
176 | "source": [
177 | "## Routing\n",
178 | "The routing information for our network simulation is expressed in a text file, where each line represents a path through the network as a sequence of nodes. There are two types of routing that can be used: destination-based and source-destination-based routing.\n",
179 | "\n",
180 | "Destination-based routing is where packets are forwarded to the next hop based solely on their destination address. In contrast, source-destination-based routing takes into account both the source and destination addresses when forwarding packets.\n",
181 | "\n",
182 | "It's important to note that both types of routing should not contain loops. Loops can cause packets to circulate indefinitely, leading to network congestion and decreased performance.\n",
183 | "\n",
184 | "Here's an example of a routing file using destination-based routing:"
185 | ]
186 | },
187 | {
188 | "cell_type": "code",
189 | "execution_count": 6,
190 | "metadata": {},
191 | "outputs": [],
192 | "source": [
193 | "# For instance, we can use networkx to calculate the shortest path routing for each src-dst pair.\n",
194 | "with open(\"routing.txt\",\"w\") as r_fd:\n",
195 | " lPaths = nx.shortest_path(G)\n",
196 | " for src in G:\n",
197 | " for dst in G:\n",
198 | " if src == dst:\n",
199 | " continue\n",
200 | " path = ','.join(str(x) for x in lPaths[src][dst])\n",
201 | " r_fd.write(path+\"\\n\")\n",
202 | " "
203 | ]
204 | },
205 | {
206 | "cell_type": "markdown",
207 | "metadata": {},
208 | "source": [
209 | "## Traffic Matrix\n",
210 | "The final step in generating our network simulation is to create the traffic matrix (TM) file. This file contains information on the different traffic flows that will be present in the network. Each line in the file describes one flow, and the parameters are separated by commas. Here's a breakdown of the parameters:\n",
211 | "\n",
212 | "- **source and destination**: These parameters indicate the source and destination nodes for the given flow. \n",
213 | "\n",
214 | "- **avg_bw**: This parameter indicates the average bandwidth, in bps, to be generated for this flow. Its minimum value is 10 bps.\n",
215 | "\n",
216 | "- **time_distribution**: This parameter indicates how often packets should be generated over time. We support three time distributions, as well as a generic distribution described using a Python script:\n",
217 | "    - Poisson(```time_distribution```=0): Generates packets according to a Poisson process. No extra parameters required.\n",
218 | "    - CBR(```time_distribution```=1): Generates packets at a constant bit rate. No extra parameters required.\n",
219 | "    - ON-OFF(```time_distribution```=2): Alternates between active and inactive periods. Additional parameters are the lengths of the activity and inactivity periods (```on_time``` and ```off_time```, respectively). Each period's duration follows an exponential distribution with an average of ```on_time``` or ```off_time``` seconds.\n",
220 | "    - EXT_PYTHON(```time_distribution```=3): Utilizes an external Python distribution. Requires at least the name of the Python class used to generate packet times, with additional parameters separated by commas. All parameters must be of type double. See the 'Generic python script generator' section for more details.\n",
221 | "\n",
222 | "- **pkt_dist**: This parameter indicates the distribution used to generate the packet sizes. Two distributions are supported:\n",
223 | "    - Generic(```size_distribution```=0): Describes a set of packet sizes and their relative frequencies within the flow (**pkt_size_n** and **prob_n**). At least one packet size must be declared, and up to 8 different sizes can be defined. Each packet size should be a value between 50 and 2000 bits, and the sum of all the prob_n values should equal 1.\n",
224 | "    - EXT_PYTHON(```size_distribution```=1): Utilizes an external Python distribution to generate the packet sizes. Requires at least the name of the Python class used to generate packet sizes, with additional parameters separated by commas. All parameters must be of type double. See the 'Generic python script generator' section for more details.\n",
225 | "\n",
226 | "\n",
227 | "- **tos**: This parameter indicates the ToS assigned to the packets generated for this flow. When defining the ToS values to be used, select consecutive values starting from 0. For instance, if you want to use 3 different ToS values, select 0, 1 and 2.\n",
228 | "\n",
229 | "At the end, the resulting line should look something like this:\n",
230 | "\n",
231 | "```source, destination, avg_bw, time_distribution, [on_time, off_time,] pkt_dist, pkt_size_1, prob_1, [pkt_size_2, prob_2, [pkt_size_3, prob_3, [pkt_size_4, prob_4, [pkt_size_5, prob_5,]]]] tos```\n",
232 | "\n",
233 | "*Note:* It's possible to define more than one flow between two nodes, although this functionality is not fully tested, so use it at your own risk. Also keep in mind that, since the simulator provides performance metrics aggregated per path and per flow, the size of the dataset can grow quickly.\n",
234 | "\n",
235 | "By generating the traffic matrix file, you can control the different types of traffic that will be present in the network, leading to more accurate and realistic network traffic simulations.\n"
236 | ]
237 | },
238 | {
239 | "cell_type": "code",
240 | "execution_count": 7,
241 | "metadata": {},
242 | "outputs": [],
243 | "source": [
244 | "\"\"\"\n",
245 | "Example: this code will generate flows between all nodes in the graph, such that:\n",
246 | "- The average bandwidth is randomized between 10 and 10000 bps\n",
247 | "- An ON-OFF time distribution is used, with an average on_time of 5 s and an average off_time of 10 s\n",
248 | "- Packets can have two possible sizes, 300 and 1700 bits, both equally probable\n",
249 | "- The ToS for all flows is 0 (high priority)\n",
250 | "\"\"\"\n",
251 | "with open(\"traffic.txt\",\"w\") as tm_fd:\n",
252 | " for src in G:\n",
253 | "        for dst in (n for n in G if n != src):  # skip self-traffic, which has no route\n",
254 | " avg_bw = random.randint(10,10000)\n",
255 | " time_dist = 2\n",
256 | " on_time = 5\n",
257 | " off_time = 10\n",
258 | " pkt_size_1 = 300\n",
259 | " prob_1 = 0.5\n",
260 | " pkt_size_2 = 1700\n",
261 | " prob_2 = 0.5\n",
262 | " tos = 0\n",
263 | " traffic_line = \"{},{},{},{},{},{},0,{},{},{},{},{}\".format(\n",
264 | " src,dst,avg_bw,time_dist,on_time,off_time,pkt_size_1,\n",
265 | " prob_1,pkt_size_2,prob_2,tos)\n",
266 | " tm_fd.write(traffic_line+\"\\n\")\n"
267 | ]
268 | },
269 | {
270 | "cell_type": "markdown",
271 | "metadata": {},
272 | "source": [
273 | "### Generic python script generator\n",
274 | "\n",
275 | "BNNetSimulator can use an external Python class to generate the time and size distributions of packets. This class should include a method named ```get_next()```, which is invoked by BNNetSimulator. When the class is used to define the time distribution, ```get_next()``` returns the wait time, measured in nanoseconds, before generating a new packet (known as the Inter Packet Gap). When the class is used to define the size distribution, it returns the size, in bits, of the next packet to be generated.\n",
276 | "\n",
277 | "Additionally, the class should implement a method ```get_average()```, which returns the average Inter Packet Gap (for a time distribution) or the average packet size (for a size distribution).\n",
278 | "\n",
279 | "The ```__init__``` method of the class depends on whether it is used to generate a time distribution or a size distribution:\n",
280 | "\n",
281 | "- Time distribution: ```def __init__ (self, src_id, dst_id, flow_id, equivalent_lambda_bits, param_1,..., param_n, avg_pkt_size)```\n",
282 | "- Size distribution: ```def __init__ (self, src_id, dst_id, flow_id, param_1,..., param_m)```\n",
283 | "\n",
284 | "The list of params (param_1, ..., param_n) contains the additional parameters described when defining the traffic matrix. These parameters are expected to be of type double. The class method ```num_stream_parameters``` should return the count of these additional parameters.\n",
285 | "\n",
286 | "Here is an example of a class used to generate packet sizes following a uniform distribution:"
287 | ]
288 | },
289 | {
290 | "cell_type": "code",
291 | "execution_count": null,
292 | "metadata": {},
293 | "outputs": [],
294 | "source": [
295 | "#!/usr/bin/python3\n",
296 | "import numpy as np\n",
297 | "\n",
298 | "\n",
299 | "class sd_uniform:\n",
300 | "\n",
301 | " file_obj = {}\n",
302 | " \n",
303 | "    def __init__ (self, src_id, dst_id, flow_id, min_pkt_size, max_pkt_size):\n",
304 | "        self.min_pkt_size = min_pkt_size\n",
305 | "        self.max_pkt_size = max_pkt_size\n",
306 | "        self.avg_pkt_size = (self.max_pkt_size + self.min_pkt_size)/2\n",
307 | " \n",
308 | " \n",
309 | " # Return the number of parameters required by this module\n",
310 | " @classmethod\n",
311 | " def num_stream_parameters(cls):\n",
312 | " # min_pkt_size and max_pkt_size\n",
313 | "        return 2\n",
314 | " \n",
315 | " \n",
316 | "    def get_next(self):\n",
317 | "        return round(np.random.uniform(self.min_pkt_size, self.max_pkt_size))\n",
318 | "    \n",
319 | "    def get_average(self):\n",
320 | "        return self.avg_pkt_size\n"
321 | ]
322 | },
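{
"cell_type": "markdown",
"metadata": {},
"source": [
"For completeness, here is a hypothetical sketch of a time-distribution class (this exact class is not part of the repository). Following the time-distribution ```__init__``` signature above, it takes no additional parameters and draws exponentially distributed Inter Packet Gaps, assuming ```equivalent_lambda_bits``` is the average bit rate of the flow in bps and ```avg_pkt_size``` is the average packet size in bits:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#!/usr/bin/python3\n",
"import numpy as np\n",
"\n",
"\n",
"class td_exponential:\n",
"\n",
"    def __init__ (self, src_id, dst_id, flow_id, equivalent_lambda_bits, avg_pkt_size):\n",
"        # Average Inter Packet Gap in ns: avg bits per packet / bits per second, scaled to ns\n",
"        self.avg_ipg = 1e9 * avg_pkt_size / equivalent_lambda_bits\n",
"\n",
"    # Return the number of additional parameters required by this module\n",
"    @classmethod\n",
"    def num_stream_parameters(cls):\n",
"        return 0\n",
"\n",
"    def get_next(self):\n",
"        # Wait time, in nanoseconds, before generating the next packet\n",
"        return np.random.exponential(self.avg_ipg)\n",
"\n",
"    def get_average(self):\n",
"        return self.avg_ipg\n"
]
},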
323 | {
324 | "cell_type": "markdown",
325 | "metadata": {},
326 | "source": [
327 | "To process a dataset that relies on external Python scripts, it is necessary to modify the datanetAPI.py file. At line 550, the dictionary _external_param_dic is defined. To add support for an external Python class, insert a new key-value pair into the dictionary. The key should be a string representing the name of the new class, and the value should be a list. In this list, the first parameter should be the name you wish to assign to the distribution, followed by the names of the parameters used by this distribution.\n",
328 | "\n",
329 | "For instance, to add support for the class sd_uniform: ```self._external_param_dic = {\"sd_uniform\":[\"Uniform\",\"MinPktSize\",\"MaxPktSize\"]}```.\n",
330 | "\n",
331 | "Then, reading a sample that uses this distribution would produce:\n",
332 | "\n",
333 | "```\n",
334 | "print (s.get_traffic_matrix()[0,1][\"Flows\"][0][\"SizeDist\"])\n",
335 | " SizeDist.EXTERNAL_PY_S\n",
336 | "print (s.get_traffic_matrix()[0,1][\"Flows\"][0][\"SizeDistParams\"])\n",
337 | " {'AvgPktSize': 1000.0, 'Distribution': 'Uniform', 'MinPktSize': 300.0, 'MaxPktSize': 1700.0}\n",
338 | "```"
339 | ]
340 | }
341 | ],
342 | "metadata": {
343 | "interpreter": {
344 | "hash": "e4fabe4be1dcb5b95007215d13ed47b80f9ccf78939eea74ae4a681230c3cbef"
345 | },
346 | "kernelspec": {
347 | "display_name": "Python 3",
348 | "language": "python",
349 | "name": "python3"
350 | },
351 | "language_info": {
352 | "codemirror_mode": {
353 | "name": "ipython",
354 | "version": 3
355 | },
356 | "file_extension": ".py",
357 | "mimetype": "text/x-python",
358 | "name": "python",
359 | "nbconvert_exporter": "python",
360 | "pygments_lexer": "ipython3",
361 | "version": "3.8.10"
362 | }
363 | },
364 | "nbformat": 4,
365 | "nbformat_minor": 4
366 | }
367 |
--------------------------------------------------------------------------------