├── .gitignore ├── .gitmodules ├── README.md ├── install_basedeps.sh ├── install_conda.sh ├── install_p4env.sh └── install_python3libs.sh /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | 3 | # Byte-compiled / optimized / DLL files 4 | __pycache__/ 5 | *.py[cod] 6 | *$py.class 7 | 8 | # C extensions 9 | *.so 10 | 11 | # Distribution / packaging 12 | .Python 13 | build/ 14 | develop-eggs/ 15 | dist/ 16 | downloads/ 17 | eggs/ 18 | .eggs/ 19 | lib/ 20 | lib64/ 21 | parts/ 22 | sdist/ 23 | var/ 24 | wheels/ 25 | pip-wheel-metadata/ 26 | share/python-wheels/ 27 | *.egg-info/ 28 | .installed.cfg 29 | *.egg 30 | MANIFEST 31 | 32 | # PyInstaller 33 | # Usually these files are written by a python script from a template 34 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 35 | *.manifest 36 | *.spec 37 | 38 | # Installer logs 39 | pip-log.txt 40 | pip-delete-this-directory.txt 41 | 42 | # Unit test / coverage reports 43 | htmlcov/ 44 | .tox/ 45 | .nox/ 46 | .coverage 47 | .coverage.* 48 | .cache 49 | nosetests.xml 50 | coverage.xml 51 | *.cover 52 | *.py,cover 53 | .hypothesis/ 54 | .pytest_cache/ 55 | 56 | # Translations 57 | *.mo 58 | *.pot 59 | 60 | # Django stuff: 61 | *.log 62 | local_settings.py 63 | db.sqlite3 64 | db.sqlite3-journal 65 | 66 | # Flask stuff: 67 | instance/ 68 | .webassets-cache 69 | 70 | # Scrapy stuff: 71 | .scrapy 72 | 73 | # Sphinx documentation 74 | docs/_build/ 75 | 76 | # PyBuilder 77 | target/ 78 | 79 | # Jupyter Notebook 80 | .ipynb_checkpoints 81 | 82 | # IPython 83 | profile_default/ 84 | ipython_config.py 85 | 86 | # pyenv 87 | .python-version 88 | 89 | # pipenv 90 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 
91 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 92 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 93 | # install all needed dependencies. 94 | #Pipfile.lock 95 | 96 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow 97 | __pypackages__/ 98 | 99 | # Celery stuff 100 | celerybeat-schedule 101 | celerybeat.pid 102 | 103 | # SageMath parsed files 104 | *.sage.py 105 | 106 | # Environments 107 | .env 108 | .venv 109 | env/ 110 | venv/ 111 | ENV/ 112 | env.bak/ 113 | venv.bak/ 114 | 115 | # Spyder project settings 116 | .spyderproject 117 | .spyproject 118 | 119 | # Rope project settings 120 | .ropeproject 121 | 122 | # mkdocs documentation 123 | /site 124 | 125 | # mypy 126 | .mypy_cache/ 127 | .dmypy.json 128 | dmypy.json 129 | 130 | # Pyre type checker 131 | .pyre/ 132 | -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "intsight-funceval/intsight-bmv2"] 2 | path = intsight-funceval/intsight-bmv2 3 | url = https://github.com/jonadmark/intsight-bmv2.git 4 | [submodule "intsight-perfeval"] 5 | path = intsight-perfeval 6 | url = https://github.com/jonadmark/intsight-perfeval.git 7 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # IntSight (CoNEXT 2020) 2 | 3 | This repository is the central hub for all code and artifacts of the CoNEXT 2020 paper entitled "IntSight: Diagnosing SLO Violations with In-Band Network Telemetry". 4 | 5 | ## 1. Setup 6 | 7 | Follow the steps below to set up an experimental environment to reproduce the experiments. Our artifacts were built for and tested on Ubuntu 16.04. We recommend using this release of Ubuntu since it is the most suitable for the P4 environment.
We also recommend starting from a clean install and setting up the environment on a baremetal machine (as opposed to a virtual machine). 8 | 9 | Start by cloning this repository with the following command to ensure all submodules are cloned along with it: 10 | 11 | ``` 12 | git clone --recurse-submodules https://github.com/jonadmark/intsight-conext.git 13 | ``` 14 | 15 | Then navigate to the directory where the repository was cloned and install the base dependencies. On a terminal, run: 16 | 17 | ``` 18 | cd intsight-conext 19 | sudo bash install_basedeps.sh 20 | ``` 21 | 22 | Next, install the P4 environment with the following command. This may take a while to run depending on the machine's resources. In our experience, the script takes about one to three hours to install the P4 environment. 23 | 24 | ``` 25 | bash install_p4env.sh 26 | ``` 27 | 28 | Install a Python 3 environment using Conda, an open-source package and environment management system. Make sure to opt in to initializing and auto-activating Conda when prompted during the installation process. 29 | 30 | ``` 31 | bash install_conda.sh 32 | ``` 33 | 34 | After the installation, Conda will be available in all newly started terminal sessions. Close the session you are using and start a new one. Finally, install the necessary Python 3 libraries with the following command. 35 | 36 | ``` 37 | bash install_python3libs.sh 38 | ``` 39 | 40 | The environment is all set for running the experiments! 41 | 42 | ## 2. Reproducing the Functional Evaluation 43 | 44 | The directory containing the artifacts to reproduce the functional evaluation is `intsight-funceval/intsight-bmv2`. Before we can run the experiments, we first need to generate the pcap packet traces used for both of our use cases. Execute the following commands to navigate to the directory with configuration and script files for the end-to-end delay use case, and then generate the necessary pcaps.
45 | 46 | ``` 47 | cd intsight-funceval/intsight-bmv2/experiments/e2edelay 48 | python3 genpcaps.py 49 | ``` 50 | 51 | Then, similarly, navigate to the directory for the bandwidth use case and generate the necessary pcaps. 52 | 53 | ``` 54 | cd ../bandwidth 55 | python3 genpcaps.py 56 | ``` 57 | 58 | Next, run the end-to-end delay experiment. First, navigate back to the root directory for the functional evaluation, `intsight-funceval/intsight-bmv2`, and then launch the experimentation script. 59 | 60 | ``` 61 | cd ../.. 62 | bash experiment.sh experiments/e2edelay/network.json 63 | ``` 64 | 65 | Wait until the experiment ends; it should take no more than two minutes. Next, create a directory to store the obtained results and copy the files and directories necessary for running the analysis and generating the paper figures. 66 | 67 | ``` 68 | mkdir experiments/e2edelay/my_results 69 | cp configure.py experiments/e2edelay/my_results/ 70 | cp -r logs experiments/e2edelay/my_results/ 71 | ``` 72 | 73 | Generate the paper figures using the provided Jupyter notebook. Open the notebook with the following command. 74 | 75 | ``` 76 | cd experiments/e2edelay/ 77 | jupyter notebook genfigures.ipynb 78 | ``` 79 | 80 | The command above will open a browser window and show the notebook. It shows a snapshot of the results presented in the paper. To reproduce the results, change the line `exp_dir = './paper_results/'` to `exp_dir = './my_results/'` so that the analysis will consider the new results. Next, in the browser window, open the `Kernel` menu and click on `Restart & Run All`. This will run the notebook and generate all figures for the end-to-end delay use case. 81 | 82 | > **Note**: The functional experiments are based on the BMv2 P4 software switch, for which performance is an explicit non-goal. As previously mentioned, for best results, install and run the experiments baremetal on a well-provisioned machine.
For the paper, we ran our experiments on a dedicated Ubuntu 16.04 (Linux 4.4) server with 2x Intel Xeon Silver 4208 2.1 GHz 8-core 16-thread processors, 8x 16 GB 2400 MHz RAM, and 2 TB of NVMe SSD storage. 83 | 84 | Next, we list the necessary steps to reproduce the bandwidth use case, which are basically the same as the ones for the end-to-end delay use case. 85 | 86 | ``` 87 | cd ../.. 88 | bash experiment.sh experiments/bandwidth/network.json 89 | mkdir experiments/bandwidth/my_results 90 | cp configure.py experiments/bandwidth/my_results/ 91 | cp -r logs experiments/bandwidth/my_results/ 92 | cd experiments/bandwidth/ 93 | jupyter notebook genfigures.ipynb 94 | ``` 95 | 96 | As before, the last command will open a browser window and show the notebook. It shows a snapshot of the results presented in the paper. To reproduce the results, change the line `exp_dir = './paper_results/'` to `exp_dir = './my_results/'` so that the analysis will consider the new results. Next, in the browser window, open the `Kernel` menu and click on `Restart & Run All`. This will run the notebook and generate all figures for the bandwidth use case. 97 | 98 | Congratulations! You are all done with reproducing the functional evaluation. 99 | 100 | ## 3. Reproducing the Performance Evaluation 101 | 102 | The performance evaluation is based on analytical models of IntSight and the related approaches. To run the evaluation, we first need to convert the network topologies and demands from their textual descriptions (made available by the Repetita project) to json files readily readable by the evaluation Jupyter notebook.
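As a rough illustration of what this conversion step does, the sketch below turns a made-up edge-list text format into a json-friendly dictionary. The input format, field names, and output schema here are invented for illustration only; the real converter is `repetita2json.py`, and the actual Repetita text format is more elaborate.

```python
import json

# Hypothetical input: one "src dst capacity" triple per line.
# This is NOT the actual Repetita format; it only illustrates the idea
# of converting a textual description into a json-friendly structure.
text = """s1 s2 10
s2 s3 10"""

links = []
for line in text.splitlines():
    src, dst, capacity = line.split()
    links.append({'src': src, 'dst': dst, 'capacity_mbps': int(capacity)})

net_json = {'links': links}
print(json.dumps(net_json, indent=2))
```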
On a terminal window at the root directory of this repository, run: 103 | 104 | ``` 105 | cd intsight-perfeval 106 | python3 repetita2json.py 107 | ``` 108 | 109 | After the conversion is done, the notebook can be opened with the following command: 110 | 111 | ``` 112 | jupyter notebook performance-evaluation.ipynb 113 | ``` 114 | 115 | The command above will open a browser window and show the notebook. Initially, it shows a snapshot of the results presented in the paper. To reproduce the results, in the browser window, open the `Kernel` menu and click on `Restart & Run All`. This will run the notebook and generate all figures. The notebook takes several minutes to run. When all notebook cells have been run, all figures and results will be available throughout the notebook as well as in the `paper_results` subdirectory. 116 | 117 | Congratulations! You are all done with reproducing the performance evaluation. 118 | 119 | ## 4. Reusing our artifacts for your own experiments 120 | 121 | Our artifacts were built in a way that enables them to be reused for other purposes and additional experiments. Here we present a few pointers to guide anyone interested in adjusting or extending them. 122 | 123 | ### 4.1 Reusing the Functional Evaluation Artifacts 124 | 125 | The main file of a functional experiment is the `network.json` file. Below, we present the contents of this file for the end-to-end delay use case and describe each parameter. 126 | 127 | ``` 128 | { 129 | "capture_traffic": false, 130 | "run_workload": true, 131 | "workload_file": "experiments/e2edelay/workload.json", 132 | "nodes": 5, 133 | "hosts_per_node": 2, 134 | "node_links": [ 135 | ["s3", "s4"], 136 | ["s3", "s2"], 137 | ["s4", "s5"], 138 | ["s2", "s1"] 139 | ], 140 | "e2e_delay_slas": { 141 | "h1": { 142 | "h10": [20000, 1] 143 | } 144 | } 145 | } 146 | ``` 147 | 148 | - `capture_traffic`: indicates whether the network traffic should be captured during the execution of the experiment.
Ideally `false` to maximize performance, but it can be set to `true` to help debug problems. 149 | - `run_workload`: indicates whether Mininet should run the workload (as described by `workload_file`) or simply build the emulated network and present a command prompt to the user. Can be set to `false` if you want to interactively generate traffic and run tests. 150 | - `workload_file`: a json file that describes the workload of the experiment. 151 | - `nodes`: the number of forwarding nodes in the network topology. 152 | - `hosts_per_node`: how many hosts should be created for (and connected to) each forwarding node. 153 | - `node_links`: list of links between forwarding nodes in the network topology. 154 | - `e2e_delay_slas`: end-to-end delay SLA specifications. In the example, the delay from host `h1` to host `h10` cannot be greater than or equal to 20 milliseconds (first value, delay < 20000 microseconds) for any packet (second value, number of high-delay packets < 1). 155 | 156 | The workload of an experiment can be adjusted by modifying (and rerunning) the `genpcaps.py` script. The main elements to modify are the `Y` functions that return a bitrate as a function of experiment time. For example, in the `genpcaps.py` script of the end-to-end delay use case, the bitrate behavior of the orange flow is defined by: 157 | 158 | ``` 159 | def Yorange(x): 160 | if x >= 30 and x <= 30.1: 161 | return 106.632 162 | return 15*random.gauss(1, 0.1) 163 | ``` 164 | 165 | Between instants 30.0 and 30.1 seconds of the experiment, the rate is about 106 Mbps. For the rest of the experiment, the bitrate is about 15 Mbps. The term `random.gauss(1, 0.1)` generates oscillations in the traffic so that the rate is not completely constant. 166 | 167 | Other important files to be aware of when modifying the functional evaluation artifacts are: 168 | - `intsight.p4`: This is the P4 program installed in the forwarding nodes of the network.
169 | - `configure.py`: This script generates configuration files to be used during the experiment. Specifically, it is responsible for parsing the `network.json` file into a Mininet topology description and generating P4 runtime rules to be installed into the nodes during the experiment. 170 | - `report-receiver.py`: Script that receives the IntSight reports sent by the forwarding nodes. 171 | 172 | ### 4.2 Reusing the Performance Evaluation Artifacts 173 | 174 | The main file of the performance evaluation is the `performance-evaluation.ipynb` Jupyter notebook. The `1.2. Helper Functions` section of the notebook has a function to model the resource usage of each of the evaluated approaches. For example, below we present the function that computes the resource usage for mirroring approaches (e.g., NetSight). 175 | 176 | ``` 177 | def Mirroring(net_json, net_graph, demands_json): 178 | # report rate 179 | pr = 0 180 | for d in demands_json['demands_list']: 181 | pr = pr + (d['pktrate']*nx.shortest_path_length(net_graph, d['src'], d['dst'])) 182 | # 183 | return { 184 | 'report_rate': pr, 185 | 'sram_memory': 0, 186 | 'tcam_memory': 0, 187 | 'header_space': 0, 188 | } 189 | 190 | ``` 191 | 192 | Model functions receive three objects as input. The first is a Python dictionary with metadata about the network for which one wants to estimate the resource usage. The second is a NetworkX graph that represents the network topology. The third is another dictionary with information regarding the bitrate demand between pairs of forwarding nodes in the network. Model functions return a Python dictionary with fields indicating resource usage. As the name implies, mirroring approaches simply configure devices to mirror a copy of the packets they are forwarding to the control plane. Consequently, they do not use header space or memory. The report rate is a function of the number of hops in the path of each packet in the network.
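To make the model concrete, the self-contained sketch below reproduces the `Mirroring` function from the notebook and exercises it on a toy three-switch line topology. The topology and demand values are made up for illustration; only the function itself comes from the notebook.

```python
import networkx as nx

def Mirroring(net_json, net_graph, demands_json):
    # One report per packet per hop: accumulate pktrate * path length.
    pr = 0
    for d in demands_json['demands_list']:
        pr = pr + (d['pktrate'] * nx.shortest_path_length(net_graph, d['src'], d['dst']))
    return {'report_rate': pr, 'sram_memory': 0, 'tcam_memory': 0, 'header_space': 0}

# Toy line topology: s1 -- s2 -- s3 (values invented for illustration).
g = nx.Graph()
g.add_edges_from([('s1', 's2'), ('s2', 's3')])
demands = {'demands_list': [
    {'src': 's1', 'dst': 's3', 'pktrate': 1000},  # 2 hops -> 2000 reports/s
    {'src': 's1', 'dst': 's2', 'pktrate': 500},   # 1 hop  ->  500 reports/s
]}
print(Mirroring({}, g, demands)['report_rate'])  # -> 2500
```

The empty dictionary passed as `net_json` works here because `Mirroring` never reads it; other model functions in the notebook may rely on it.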
For each node-pair demand, we add the packet rate multiplied by the path length between the endpoints to the accumulator variable `pr`. New model functions could be added to analyze additional approaches. 193 | 194 | Section 1.3 (Main Code) contains the main evaluation loop of the notebook. It applies each of the selected network topologies to the model functions and stores the results in a Pandas DataFrame (i.e., a table). This table can subsequently be queried to analyze the resource usage and generate figures, as we do in Section 2. Although the model functions and figure-generation code in the notebook were built for the purpose of the performance evaluation, they could be swapped for other functions to be evaluated on the network topologies available in the Repetita dataset. 195 | -------------------------------------------------------------------------------- /install_basedeps.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Print commands and exit on errors 4 | set -xe 5 | 6 | apt-get update 7 | 8 | KERNEL=$(uname -r) 9 | DEBIAN_FRONTEND=noninteractive apt-get -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" upgrade 10 | apt-get install -y --no-install-recommends \ 11 | autoconf \ 12 | automake \ 13 | bison \ 14 | build-essential \ 15 | ca-certificates \ 16 | cmake \ 17 | cpp \ 18 | curl \ 19 | flex \ 20 | git \ 21 | libboost-dev \ 22 | libboost-filesystem-dev \ 23 | libboost-iostreams1.58-dev \ 24 | libboost-program-options-dev \ 25 | libboost-system-dev \ 26 | libboost-test-dev \ 27 | libboost-thread-dev \ 28 | libc6-dev \ 29 | libevent-dev \ 30 | libffi-dev \ 31 | libfl-dev \ 32 | libgc-dev \ 33 | libgc1c2 \ 34 | libgflags-dev \ 35 | libgmp-dev \ 36 | libgmp10 \ 37 | libgmpxx4ldbl \ 38 | libjudy-dev \ 39 | libpcap-dev \ 40 | libreadline6 \ 41 | libreadline6-dev \ 42 | libssl-dev \ 43 | libtool \ 44 | linux-headers-$KERNEL \ 45 | make \ 46 | mktemp \ 47 |
pkg-config \ 48 | python \ 49 | python-dev \ 50 | python-ipaddr \ 51 | python-pip \ 52 | python-psutil \ 53 | python-scapy \ 54 | python-setuptools \ 55 | tcpdump \ 56 | unzip \ 57 | vim \ 58 | wget \ 59 | xcscope-el \ 60 | tcpreplay \ 61 | tmux \ 62 | xterm 63 | -------------------------------------------------------------------------------- /install_conda.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Print commands and exit on errors. 4 | set -xe 5 | 6 | wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh 7 | bash Miniconda3-latest-Linux-x86_64.sh 8 | -------------------------------------------------------------------------------- /install_p4env.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Print commands and exit on errors. 4 | set -xe 5 | 6 | BMV2_COMMIT="7e25eeb19d01eee1a8e982dc7ee90ee438c10a05" 7 | PI_COMMIT="219b3d67299ec09b49f433d7341049256ab5f512" 8 | P4C_COMMIT="48a57a6ae4f96961b74bd13f6bdeac5add7bb815" 9 | PROTOBUF_COMMIT="v3.2.0" 10 | GRPC_COMMIT="v1.3.2" 11 | 12 | NUM_CORES=`grep -c ^processor /proc/cpuinfo` 13 | 14 | # Mininet 15 | git clone https://github.com/mininet/mininet.git mininet 16 | cd mininet 17 | sudo ./util/install.sh -nwv 18 | cd .. 19 | 20 | # Protobuf 21 | git clone https://github.com/google/protobuf.git 22 | cd protobuf 23 | git checkout ${PROTOBUF_COMMIT} 24 | export CFLAGS="-Os" 25 | export CXXFLAGS="-Os" 26 | export LDFLAGS="-Wl,-s" 27 | ./autogen.sh 28 | ./configure --prefix=/usr 29 | make -j${NUM_CORES} 30 | sudo make install 31 | sudo ldconfig 32 | unset CFLAGS CXXFLAGS LDFLAGS 33 | # force install python module 34 | cd python 35 | sudo python setup.py install 36 | cd ../.. 
37 | 38 | # gRPC 39 | git clone https://github.com/grpc/grpc.git 40 | cd grpc 41 | git checkout ${GRPC_COMMIT} 42 | git submodule update --init --recursive 43 | export LDFLAGS="-Wl,-s" 44 | make -j${NUM_CORES} 45 | sudo make install 46 | sudo ldconfig 47 | unset LDFLAGS 48 | cd .. 49 | # Install gRPC Python Package 50 | sudo pip install grpcio 51 | 52 | # BMv2 deps (needed by PI) 53 | git clone https://github.com/p4lang/behavioral-model.git 54 | cd behavioral-model 55 | git checkout ${BMV2_COMMIT} 56 | # From bmv2's install_deps.sh, we can skip apt-get install. 57 | # Nanomsg is required by p4runtime, p4runtime is needed by BMv2... 58 | tmpdir=`mktemp -d -p .` 59 | cd ${tmpdir} 60 | bash ../travis/install-thrift.sh 61 | bash ../travis/install-nanomsg.sh 62 | sudo ldconfig 63 | bash ../travis/install-nnpy.sh 64 | cd .. 65 | sudo rm -rf $tmpdir 66 | cd .. 67 | 68 | # PI/P4Runtime 69 | git clone https://github.com/p4lang/PI.git 70 | cd PI 71 | git checkout ${PI_COMMIT} 72 | git submodule update --init --recursive 73 | ./autogen.sh 74 | ./configure --with-proto 75 | make -j${NUM_CORES} 76 | sudo make install 77 | sudo ldconfig 78 | cd .. 79 | 80 | # Bmv2 81 | cd behavioral-model 82 | ./autogen.sh 83 | ./configure --disable-logging-macros --with-pi 84 | make -j${NUM_CORES} 85 | sudo make install 86 | sudo ldconfig 87 | # Simple_switch_grpc target 88 | cd targets/simple_switch_grpc 89 | ./autogen.sh 90 | ./configure --with-thrift 91 | make -j${NUM_CORES} 92 | sudo make install 93 | sudo ldconfig 94 | cd .. 95 | cd .. 96 | cd .. 97 | 98 | # P4C 99 | git clone https://github.com/p4lang/p4c 100 | cd p4c 101 | git checkout ${P4C_COMMIT} 102 | git submodule update --init --recursive 103 | mkdir -p build 104 | cd build 105 | cmake .. 106 | make -j${NUM_CORES} 107 | # make -j${NUM_CORES} check 108 | sudo make install 109 | sudo ldconfig 110 | cd .. 111 | cd .. 
112 | -------------------------------------------------------------------------------- /install_python3libs.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Print commands and exit on errors. 4 | set -xe 5 | 6 | conda install -c conda-forge jupyterlab 7 | conda install networkx scapy numpy pandas matplotlib --------------------------------------------------------------------------------