├── .clang-format ├── .dockerignore ├── .gitattributes ├── .github ├── codeql-config.yml └── workflows │ ├── build.yml │ ├── build_base.yml │ └── codeql.yml ├── .gitignore ├── .gitmodules ├── .pre-commit-config.yaml ├── CMakeLists.txt ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── CPPLINT.cfg ├── LICENSE ├── Makefile ├── README.md ├── SECURITY.md ├── SUPPORT.md ├── TEST.md ├── azure-pipelines.yml ├── bindings ├── csharp │ ├── .gitignore │ └── HelloWorld │ │ ├── HelloWorld.csproj │ │ ├── Program.cs │ │ └── machnet_shim.cs ├── go │ ├── machnet │ │ ├── conversion.h │ │ ├── go.mod │ │ └── machnet.go │ └── msg_gen │ │ ├── README.md │ │ ├── go.mod │ │ ├── go.sum │ │ └── main.go ├── js │ ├── .gitignore │ ├── benchmark.js │ ├── hello_world.js │ ├── latency.js │ ├── machnet_shim.js │ └── rocksdb_client.js └── rust │ ├── .gitignore │ ├── Cargo.toml │ ├── README.md │ ├── TODO.md │ ├── build.rs │ ├── resources │ ├── jring.h │ ├── jring_elem_private.h │ ├── machnet.h │ └── machnet_common.h │ └── src │ ├── bindings.rs │ └── lib.rs ├── build_shim.sh ├── docker-bake.hcl ├── dockerfiles ├── amazon-linux-2023.dockerfile ├── get_targets_for_arch.py └── ubuntu-22.04.dockerfile ├── docs ├── INTERNAL.md └── PERFORMANCE_REPORT.md ├── examples ├── .gitignore ├── Makefile ├── aws_instructions.md ├── azure_create_vms.py ├── azure_start_machnet.sh ├── hello_world.cc ├── requirements.txt └── rust │ ├── .gitignore │ ├── Cargo.toml │ ├── README.md │ ├── image.png │ └── src │ └── main.rs ├── machnet.sh └── src ├── CMakeLists.txt ├── apps ├── CMakeLists.txt ├── machnet │ ├── CMakeLists.txt │ ├── README.md │ ├── config.json │ └── main.cc ├── msg_gen │ ├── CMakeLists.txt │ ├── README.md │ └── main.cc └── rocksdb_server │ ├── CMakeLists.txt │ └── rocksdb_server.cc ├── benchmark └── CMakeLists.txt ├── core ├── CMakeLists.txt ├── drivers │ ├── dpdk │ │ ├── dpdk.cc │ │ ├── dpdk_test.cc │ │ ├── packet_pool.cc │ │ └── pmd.cc │ └── shm │ │ ├── channel.cc │ │ ├── channel_bench.cc │ │ ├── channel_test.cc │ 
│ ├── shmem.cc │ │ └── shmem_test.cc ├── flow_test.cc ├── machnet_config.cc ├── machnet_controller.cc ├── machnet_engine_test.cc ├── net │ ├── ether.cc │ ├── ipv4.cc │ └── udp.cc ├── ttime.cc ├── ud_socket.cc └── utils.cc ├── ext ├── CMakeLists.txt ├── CPPLINT.cfg ├── Makefile ├── jring.h ├── jring2.h ├── jring_bench.cc ├── jring_elem_private.h ├── machnet.c ├── machnet.h ├── machnet_bench.cc ├── machnet_common.h ├── machnet_ctrl.h ├── machnet_private.h ├── machnet_private_test.cc └── machnet_test.cc ├── include ├── arp.h ├── cc.h ├── channel.h ├── channel_msgbuf.h ├── common.h ├── dpdk.h ├── ether.h ├── flow.h ├── flow_key.h ├── icmp.h ├── ipv4.h ├── juggler_rpc_ctrl.h ├── machnet_config.h ├── machnet_controller.h ├── machnet_engine.h ├── machnet_pkthdr.h ├── packet.h ├── packet_pool.h ├── pause.h ├── pmd.h ├── shmem.h ├── ttime.h ├── types.h ├── ud_socket.h ├── udp.h ├── utils.h └── worker.h ├── tests └── CMakeLists.txt └── tools ├── CMakeLists.txt ├── jring2_perf ├── CMakeLists.txt └── main.cc ├── jring_perf ├── CMakeLists.txt └── main.cc ├── ping ├── CMakeLists.txt ├── README.md └── main.cc └── pktgen ├── CMakeLists.txt ├── README.md └── main.cc /.clang-format: -------------------------------------------------------------------------------- 1 | # Use the Google style in this project. 2 | BasedOnStyle: Google 3 | -------------------------------------------------------------------------------- /.dockerignore: -------------------------------------------------------------------------------- 1 | *.so 2 | *.o 3 | .cache/ 4 | build/ 5 | .history 6 | .vscode 7 | Dockerfile 8 | *.dockerfile 9 | docker-bake.hcl 10 | -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | # Set the default behavior, in case people don't have core.autocrlf set. 
2 | * text=auto 3 | 4 | # Explicitly declare text files to be normalized and converted to native line endings on checkout. 5 | *.c text 6 | *.h text 7 | *.cpp text 8 | *.hpp text 9 | *.cc text 10 | *.hh text 11 | *.cxx text 12 | *.hxx text 13 | *.md text 14 | *.yml text 15 | *.sh text 16 | *.cmake text 17 | *.cfg text 18 | 19 | # Declare files that will always have CRLF line endings on checkout. 20 | *.sln text eol=crlf 21 | 22 | # Denote all files that are truly binary and should not be modified. 23 | *.png binary 24 | *.jpg binary 25 | *.pdf binary 26 | *.so binary 27 | *.dll binary 28 | *.exe binary 29 | *.bin binary 30 | 31 | # Denote Dockerfiles and ignore line ending changes 32 | *.dockerfile text eol=lf 33 | 34 | -------------------------------------------------------------------------------- /.github/codeql-config.yml: -------------------------------------------------------------------------------- 1 | paths-ignore: 2 | - 'third_party/**' 3 | -------------------------------------------------------------------------------- /.github/workflows/build.yml: -------------------------------------------------------------------------------- 1 | name: Build and Register Machnet as Latest 2 | 3 | on: 4 | workflow_dispatch: 5 | push: 6 | branches: 7 | - main 8 | 9 | jobs: 10 | build_and_push_machnet: 11 | runs-on: ubuntu-latest 12 | permissions: 13 | actions: read 14 | contents: read 15 | deployments: read 16 | packages: write 17 | pull-requests: write 18 | security-events: write 19 | 20 | steps: 21 | - name: Checkout code 22 | uses: actions/checkout@v2 23 | 24 | - name: Set up Docker Buildx 25 | uses: docker/setup-buildx-action@v2 26 | 27 | - name: Login to GitHub Container Registry 28 | uses: docker/login-action@v1 29 | with: 30 | registry: ghcr.io 31 | username: ${{ github.repository_owner }} 32 | password: ${{ secrets.GITHUB_TOKEN }} 33 | 34 | - name: Build and push Machnet Docker image 35 | uses: docker/build-push-action@v2 36 | with: 37 | context: . 
38 | push: true 39 | tags: ghcr.io/${{ github.repository }}/machnet:latest 40 | target: machnet 41 | file: ./dockerfiles/ubuntu-22.04.dockerfile 42 | cache-from: type=gha,ref=ghcr.io/${{ github.repository }}/machnet_build_base:latest 43 | cache-to: type=gha,mode=max 44 | -------------------------------------------------------------------------------- /.github/workflows/build_base.yml: -------------------------------------------------------------------------------- 1 | name: CI - Base build environment image 2 | 3 | on: 4 | schedule: 5 | - cron: '0 0 * * *' # Run daily at midnight 6 | workflow_dispatch: 7 | 8 | jobs: 9 | build_and_push: 10 | runs-on: ubuntu-latest 11 | permissions: 12 | actions: read 13 | contents: read 14 | deployments: read 15 | packages: write 16 | pull-requests: write 17 | security-events: write 18 | 19 | steps: 20 | - name: Checkout code 21 | uses: actions/checkout@v2 22 | with: 23 | submodules: recursive 24 | 25 | - name: Set up Docker Buildx 26 | uses: docker/setup-buildx-action@v2 27 | 28 | - name: Login to GitHub Container Registry 29 | uses: docker/login-action@v1 30 | with: 31 | registry: ghcr.io 32 | username: ${{ github.repository_owner }} 33 | password: ${{ secrets.GITHUB_TOKEN }} 34 | 35 | - name: Build and push Docker image 36 | uses: docker/build-push-action@v2 37 | with: 38 | context: . 
39 | push: true 40 | tags: ghcr.io/${{ github.repository }}/machnet_build_base:latest 41 | target: machnet_build_base 42 | file: ./dockerfiles/ubuntu-22.04.dockerfile 43 | cache-from: type=gha 44 | cache-to: type=gha,mode=max 45 | -------------------------------------------------------------------------------- /.github/workflows/codeql.yml: -------------------------------------------------------------------------------- 1 | name: Build Machnet and Run CodeQL Analysis 2 | 3 | on: 4 | workflow_dispatch: 5 | push: 6 | branches: 7 | - main 8 | pull_request: 9 | branches: 10 | - main 11 | 12 | jobs: 13 | build: 14 | runs-on: ubuntu-latest 15 | permissions: 16 | actions: read 17 | contents: read 18 | deployments: read 19 | packages: read 20 | pull-requests: write 21 | security-events: write 22 | 23 | container: 24 | image: ghcr.io/${{ github.repository }}/machnet_build_base:latest 25 | credentials: 26 | username: ${{ github.repository_owner }} 27 | password: ${{ secrets.GITHUB_TOKEN }} 28 | strategy: 29 | fail-fast: false 30 | steps: 31 | - name: Checkout code 32 | uses: actions/checkout@v3 33 | with: 34 | submodules: recursive 35 | 36 | - name: Initialize CodeQL 37 | uses: github/codeql-action/init@v2 38 | with: 39 | languages: c++ 40 | config-file: ./.github/codeql-config.yml 41 | 42 | 43 | - name: Build machnet 44 | run: | 45 | mkdir -p ${GITHUB_WORKSPACE}/build 46 | cd ${GITHUB_WORKSPACE}/build 47 | RTE_SDK=/root/dpdk cmake -DCMAKE_BUILD_TYPE=Release -GNinja ../ 48 | ninja 49 | 50 | - name: Perform CodeQL Analysis 51 | uses: github/codeql-action/analyze@v2 52 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.so 2 | *.o 3 | .cache/ 4 | build/ 5 | debug_build/ 6 | release_build/ 7 | .history 8 | .vscode 9 | -------------------------------------------------------------------------------- /.gitmodules: 
-------------------------------------------------------------------------------- 1 | [submodule "third_party/glog"] 2 | path = third_party/glog 3 | url = https://github.com/google/glog.git 4 | [submodule "third_party/googletest"] 5 | path = third_party/googletest 6 | url = https://github.com/google/googletest.git 7 | [submodule "third_party/HdrHistogram_c"] 8 | path = third_party/HdrHistogram_c 9 | url = https://github.com/HdrHistogram/HdrHistogram_c 10 | [submodule "third_party/googlebench"] 11 | path = third_party/googlebench 12 | url = https://github.com/google/benchmark.git 13 | branch = v1.8.2 14 | [submodule "third_party/xxHash"] 15 | path = third_party/xxHash 16 | url = https://github.com/Cyan4973/xxHash 17 | [submodule "third_party/gflags"] 18 | path = third_party/gflags 19 | url = https://github.com/gflags/gflags.git 20 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | # See https://pre-commit.com for more information 2 | # See https://pre-commit.com/hooks.html for more hooks 3 | repos: 4 | - repo: https://github.com/pre-commit/pre-commit-hooks 5 | rev: v4.0.1 6 | hooks: 7 | - id: trailing-whitespace 8 | - id: end-of-file-fixer 9 | - id: check-yaml 10 | - id: check-added-large-files 11 | - repo: https://gitlab.com/daverona/pre-commit-cpp 12 | rev: 0.8.0 # use the most recent version 13 | hooks: 14 | - id: clang-format # formatter of C/C++ code based on a style guide: LLVM, Google, Chromium, Mozilla, and WebKit available 15 | args: ["-style=Google"] 16 | - id: cpplint # linter (or style-error checker) for Google C++ Style Guide 17 | - id: cppcheck # static analyzer of C/C++ code 18 | args: ["--library=googletest"] 19 | - repo: https://github.com/google/pre-commit-tool-hooks 20 | rev: v1.2.2 # Use the rev you want to point at. 
21 | hooks: 22 | # - id: check-copyright 23 | - id: check-google-doc-style 24 | - id: check-links 25 | - id: markdown-toc 26 | # - id: .. 27 | -------------------------------------------------------------------------------- /CMakeLists.txt: -------------------------------------------------------------------------------- 1 | cmake_minimum_required(VERSION 3.0) 2 | 3 | project(juggler VERSION 0.0.1 DESCRIPTION "Packet Juggling framework in C++" LANGUAGES CXX C) 4 | 5 | include(CTest) 6 | set(BUILD_TESTING OFF) # Disable testing for third-party modules 7 | # SET(CMAKE_FIND_LIBRARY_SUFFIXES .a .so ${CMAKE_FIND_LIBRARY_SUFFIXES}) 8 | 9 | # SET(BUILD_STATIC_LIBS ON) 10 | SET(BUILD_gflags_nothreads_LIBS ON) 11 | SET(BUILD_gflags_LIBS ON) 12 | add_subdirectory(third_party/gflags) 13 | include_directories(SYSTEM "${CMAKE_CURRENT_BINARY_DIR}/third_party/gflags/include/") 14 | 15 | find_package(gflags) 16 | 17 | # Glog doesn't correctly define its include directories, so we need to specify 18 | # manually 19 | add_subdirectory(third_party/glog) 20 | include_directories(SYSTEM third_party/glog/src) 21 | include_directories(SYSTEM ${CMAKE_BINARY_DIR}/third_party/glog) 22 | 23 | add_subdirectory(third_party/googletest) 24 | include_directories(SYSTEM third_party/googletest/googlemock/include) 25 | 26 | # Build google benchmark (target: benchmark) 27 | # do not build tests of benchmarking lib 28 | set(BENCHMARK_ENABLE_INSTALL OFF) 29 | set(BENCHMARK_ENABLE_TESTING OFF CACHE BOOL "Suppressing benchmark's tests" FORCE) 30 | add_subdirectory(third_party/googlebench) 31 | include_directories(SYSTEM third_party/googlebench/include) 32 | 33 | # Common sub-projects: HdrHistogram 34 | set(HDR_HISTOGRAM_BUILD_PROGRAMS OFF CACHE BOOL "Minimize HDR histogram build") 35 | set(HDR_LOG_REQUIRED OFF CACHE BOOL "Disable HDR histogram's log to avoid zlib dependency") 36 | add_subdirectory(third_party/HdrHistogram_c) 37 | include_directories(SYSTEM third_party/HdrHistogram_c/include/) 38 | 39 | 
include_directories(SYSTEM third_party/xxHash) 40 | 41 | # if(NOT gflags_FOUND) 42 | # endif() 43 | 44 | set(BUILD_TESTING ON) 45 | add_subdirectory(src) 46 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Microsoft Open Source Code of Conduct 2 | 3 | This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). 4 | 5 | Resources: 6 | 7 | - [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/) 8 | - [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) 9 | - Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns 10 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | 2 | # Contributing 3 | 4 | The codebase follows [Google's C++ 5 | standard](https://google.github.io/styleguide/cppguide.html). 6 | 7 | Some tools are required for linting, checking, and code formatting: 8 | 9 | ```bash 10 | sudo apt install clang-format cppcheck # Or equivalent for your OS 11 | pip install pre-commit 12 | cd ${REPOROOT} 13 | pre-commit install 14 | ``` 15 | 16 | For instructions on how to build and test Machnet, see [README.md](README.md). 17 | -------------------------------------------------------------------------------- /CPPLINT.cfg: -------------------------------------------------------------------------------- 1 | filter=-legal/copyright 2 | filter=-build/include,-build/c++11 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) Microsoft Corporation. 
4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE 22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | # This makefile is primarily used for building docker containers 2 | 3 | SHELL=/bin/bash -e -o pipefail 4 | 5 | BUILD_COMMAND=docker buildx bake -f docker-bake.hcl 6 | GET_BUILDX_INFO_COMMAND=$(BUILD_COMMAND) --print 7 | BUILD_TARGETS_COMMAND=xargs $(BUILD_COMMAND) 8 | GET_TARGETS_FOR_ARCH_CMD=python3 $(CURDIR)/dockerfiles/get_targets_for_arch.py 9 | 10 | # By default, load into the local docker registry; can be overridden with --push for production builds 11 | BUILD_COMMAND_EXTRA_ARGS=--load 12 | 13 | 14 | .PHONY: all_containers x86_containers arm_containers 15 | 16 | # Users likely want to get containers that work on the current system, 17 | # so that is the default. 
18 | default_containers: native_containers 19 | 20 | all_containers: 21 | $(BUILD_COMMAND) $(BUILD_COMMAND_EXTRA_ARGS) 22 | 23 | x86_containers: 24 | $(GET_BUILDX_INFO_COMMAND) | $(GET_TARGETS_FOR_ARCH_CMD) --arch x86 | $(BUILD_TARGETS_COMMAND) $(BUILD_COMMAND_EXTRA_ARGS) 25 | 26 | arm_containers: 27 | $(GET_BUILDX_INFO_COMMAND) | $(GET_TARGETS_FOR_ARCH_CMD) --arch arm | $(BUILD_TARGETS_COMMAND) $(BUILD_COMMAND_EXTRA_ARGS) 28 | 29 | native_containers: 30 | $(GET_BUILDX_INFO_COMMAND) | $(GET_TARGETS_FOR_ARCH_CMD) --arch native | $(BUILD_TARGETS_COMMAND) $(BUILD_COMMAND_EXTRA_ARGS) -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ## Security 4 | 5 | Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/). 6 | 7 | If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/opensource/security/definition), please report it to us as described below. 8 | 9 | ## Reporting Security Issues 10 | 11 | **Please do not report security vulnerabilities through public GitHub issues.** 12 | 13 | Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/opensource/security/create-report). 14 | 15 | If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). 
If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/opensource/security/pgpkey). 16 | 17 | You should receive a response within 24 hours. If for some reason you do not, please follow up by way of email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://aka.ms/opensource/security/msrc). 18 | 19 | Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue: 20 | 21 | * Type of issue (for example buffer overflow, SQL injection, cross-site scripting, etc.) 22 | * Full paths of source file(s) related to the manifestation of the issue 23 | * The location of the affected source code (tag/branch/commit or direct URL) 24 | * Any special configuration required to reproduce the issue 25 | * Step-by-step instructions to reproduce the issue 26 | * Proof-of-concept or exploit code (if possible) 27 | * Impact of the issue, including how an attacker might exploit the issue 28 | 29 | This information will help us triage your report more quickly. 30 | 31 | If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/opensource/security/bounty) page for more details about our active programs. 32 | 33 | ## Preferred Languages 34 | 35 | We prefer all communications to be in English. 36 | 37 | ## Policy 38 | 39 | Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/opensource/security/cvd). 
40 | 41 | 42 | -------------------------------------------------------------------------------- /SUPPORT.md: -------------------------------------------------------------------------------- 1 | # Support 2 | 3 | ## How to file issues and get help 4 | 5 | 6 | This project uses GitHub Issues to track bugs and feature requests. Please search the existing 7 | issues before filing new issues to avoid duplicates. For new issues, file your bug or 8 | feature request as a new Issue. 9 | 10 | 11 | ## Microsoft Support Policy 12 | 13 | Support for this **PROJECT or PRODUCT** is limited to the resources listed above. 14 | -------------------------------------------------------------------------------- /TEST.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/machnet/7d047486538493d3a8aedc2cef9d9329b409c2e2/TEST.md -------------------------------------------------------------------------------- /azure-pipelines.yml: -------------------------------------------------------------------------------- 1 | trigger: 2 | - main 3 | 4 | pool: 5 | vmImage: 'ubuntu-latest' 6 | 7 | variables: 8 | timezone: 'America/Los_Angeles' 9 | rdma_core: '$(Pipeline.Workspace)/rdma-core' 10 | rte_sdk: '$(Pipeline.Workspace)/dpdk' 11 | 12 | steps: 13 | - script: | 14 | sudo ln -snf /usr/share/zoneinfo/$(timezone) /etc/localtime 15 | echo $(timezone) > /etc/timezone 16 | echo 'APT::Install-Suggests "0";' | sudo tee -a /etc/apt/apt.conf.d/00-docker 17 | echo 'APT::Install-Recommends "0";' | sudo tee -a /etc/apt/apt.conf.d/00-docker 18 | sudo apt-get update 19 | sudo apt-get install --no-install-recommends -y git build-essential cmake meson pkg-config libudev-dev libnl-3-dev libnl-route-3-dev python3-dev python3-docutils python3-pyelftools libnuma-dev ca-certificates autoconf libhugetlbfs-dev pciutils libunwind-dev uuid-dev nlohmann-json3-dev 20 | sudo apt-get --purge -y remove rdma-core librdmacm1 ibverbs-providers 
libibverbs-dev libibverbs1 21 | sudo rm -rf /var/lib/apt/lists/* 22 | displayName: 'Set timezone and Install dependencies' 23 | 24 | - script: | 25 | cd $(Pipeline.Workspace) 26 | git clone -b 'stable-v40' --single-branch --depth 1 https://github.com/linux-rdma/rdma-core.git $(rdma_core) 27 | cd $(rdma_core) 28 | mkdir build 29 | cd build 30 | cmake -GNinja -DNO_PYVERBS=1 -DNO_MAN_PAGES=1 ../ 31 | sudo ninja install 32 | displayName: 'Build rdma-core' 33 | 34 | - script: | 35 | cd $(Pipeline.Workspace) 36 | git clone --depth 1 --branch 'v21.11' https://github.com/DPDK/dpdk.git $(rte_sdk) 37 | cd $(rte_sdk) 38 | meson build -Dexamples='' -Dplatform=generic -Denable_kmods=false -Dtests=false -Ddisable_drivers='raw/*,crypto/*,baseband/*,dma/*' 39 | cd build/ 40 | DESTDIR=$(rte_sdk)/build/install ninja install 41 | rm -rf $(rte_sdk)/app $(rte_sdk)/drivers $(rte_sdk)/.git $(rte_sdk)/build/app 42 | displayName: 'Build DPDK' 43 | 44 | - checkout: self 45 | path: 'machnet' 46 | submodules: recursive 47 | 48 | - script: | 49 | cd $(Pipeline.Workspace)/machnet 50 | mkdir build 51 | cd build 52 | RTE_SDK=$(rte_sdk) cmake -DCMAKE_BUILD_TYPE=Release -GNinja ../ 53 | ninja 54 | displayName: 'Build Machnet' 55 | -------------------------------------------------------------------------------- /bindings/csharp/.gitignore: -------------------------------------------------------------------------------- 1 | *.so 2 | bin 3 | obj 4 | -------------------------------------------------------------------------------- /bindings/csharp/HelloWorld/HelloWorld.csproj: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Exe 5 | net7.0 6 | enable 7 | enable 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | -------------------------------------------------------------------------------- /bindings/csharp/HelloWorld/Program.cs: -------------------------------------------------------------------------------- 1 | // Example hello world program for Machnet 2 | // Usage: 
Assuming we have two servers (A and B), where Machnet is running on both 3 | // IP 10.0.255.100 at server A, and IP 10.0.255.101 at server B. 4 | // 5 | // On server A: dotnet run --local_ip 10.0.255.100 6 | // On server B: dotnet run --local_ip 10.0.255.101 --remote_ip 10.0.255.100 7 | // 8 | // If everything works, server A should print "Received message: Hello World!" 9 | 10 | using System; 11 | using System.Text; 12 | using CommandLine; 13 | 14 | class Program 15 | { 16 | private const UInt16 kHelloWorldPort = 888; 17 | 18 | public class Options 19 | { 20 | [Option('l', "local_ip", Required = true, HelpText = "Local IP address")] 21 | public string? LocalIp { get; set; } 22 | 23 | [Option('r', "remote_ip", Required = false, HelpText = "Remote IP address")] 24 | public string? RemoteIp { get; set; } 25 | } 26 | 27 | static void CustomCheck(bool condition, string message) 28 | { 29 | if (!condition) 30 | { 31 | Console.ForegroundColor = ConsoleColor.Red; 32 | Console.WriteLine("Error: " + message); 33 | Environment.Exit(1); 34 | } 35 | else 36 | { 37 | Console.ForegroundColor = ConsoleColor.Green; 38 | Console.WriteLine("Success: " + message); 39 | } 40 | Console.ResetColor(); 41 | } 42 | 43 | static void Main(string[] args) 44 | { 45 | Options? 
options = null; 46 | Parser.Default.ParseArguments<Options>(args) 47 | .WithParsed(o => options = o); 48 | 49 | if (options?.LocalIp == null) 50 | { 51 | Console.ForegroundColor = ConsoleColor.Red; 52 | Console.WriteLine("Error: Local IP address is required."); 53 | Console.ResetColor(); 54 | Environment.Exit(1); 55 | } 56 | 57 | 58 | Console.WriteLine($"Local IP: {options.LocalIp}"); 59 | if (!string.IsNullOrEmpty(options.RemoteIp)) 60 | { 61 | Console.WriteLine($"Remote IP: {options.RemoteIp}"); 62 | } 63 | 64 | int ret = MachnetShim.machnet_init(); 65 | CustomCheck(ret == 0, "machnet_init()"); 66 | 67 | IntPtr channel_ctx = MachnetShim.machnet_attach(); 68 | CustomCheck(channel_ctx != IntPtr.Zero, "machnet_attach()"); 69 | 70 | if (!string.IsNullOrEmpty(options.RemoteIp)) 71 | { 72 | // Client 73 | MachnetFlow_t flow = new MachnetFlow_t(); 74 | ret = MachnetShim.machnet_connect(channel_ctx, options.LocalIp, options.RemoteIp, kHelloWorldPort, ref flow); 75 | CustomCheck(ret == 0, "machnet_connect()"); 76 | 77 | string msg = "Hello World!"; 78 | byte[] msgBuffer = Encoding.UTF8.GetBytes(msg); 79 | ret = MachnetShim.machnet_send(channel_ctx, flow, msgBuffer, new IntPtr(msgBuffer.Length)); 80 | CustomCheck(ret != -1, "machnet_send()"); 81 | 82 | Console.WriteLine("Message sent successfully"); 83 | } 84 | else 85 | { 86 | Console.WriteLine("Waiting for message from client"); 87 | ret = MachnetShim.machnet_listen(channel_ctx, options.LocalIp, kHelloWorldPort); 88 | CustomCheck(ret == 0, "machnet_listen()"); 89 | 90 | while (true) 91 | { 92 | byte[] buf = new byte[1024]; 93 | MachnetFlow_t flow = new MachnetFlow_t(); 94 | int bytesRead = MachnetShim.machnet_recv(channel_ctx, buf, new IntPtr(buf.Length), ref flow); 95 | 96 | if (bytesRead == -1) 97 | { 98 | Console.ForegroundColor = ConsoleColor.Red; 99 | Console.WriteLine("Error: machnet_recv() failed"); 100 | Console.ResetColor(); 101 | break; 102 | } 103 | else if (bytesRead == 0) 104 | { 105 | SpinWait.SpinUntil(() => false, 
10); 106 | } 107 | else 108 | { 109 | string receivedMsg = Encoding.UTF8.GetString(buf, 0, bytesRead); 110 | Console.WriteLine($"Received message: {receivedMsg}"); 111 | } 112 | } 113 | } 114 | } 115 | } 116 | -------------------------------------------------------------------------------- /bindings/csharp/HelloWorld/machnet_shim.cs: -------------------------------------------------------------------------------- 1 | using System; 2 | using System.Runtime.InteropServices; 3 | using System.Text; 4 | 5 | public struct MachnetFlow_t 6 | { 7 | public UInt32 src_ip; 8 | public UInt32 dst_ip; 9 | public UInt16 src_port; 10 | public UInt16 dst_port; 11 | } 12 | 13 | public static class MachnetShim 14 | { 15 | private const string libmachnet_shim_location = "libmachnet_shim.so"; 16 | 17 | [DllImport(libmachnet_shim_location, CallingConvention = CallingConvention.Cdecl)] 18 | public static extern int machnet_init(); 19 | 20 | [DllImport(libmachnet_shim_location, CallingConvention = CallingConvention.Cdecl)] 21 | public static extern IntPtr machnet_attach(); 22 | 23 | [DllImport(libmachnet_shim_location, CallingConvention = CallingConvention.Cdecl)] 24 | public static extern int machnet_listen(IntPtr channel_ctx, string local_ip, UInt16 port); 25 | 26 | [DllImport(libmachnet_shim_location, CallingConvention = CallingConvention.Cdecl)] 27 | public static extern int machnet_connect(IntPtr channel_ctx, string local_ip, string remote_ip, UInt16 port, ref MachnetFlow_t flow); 28 | 29 | [DllImport(libmachnet_shim_location, CallingConvention = CallingConvention.Cdecl)] 30 | public static extern int machnet_send(IntPtr channel_ctx, MachnetFlow_t flow, byte[] data, IntPtr dataSize); 31 | 32 | [DllImport(libmachnet_shim_location, CallingConvention = CallingConvention.Cdecl)] 33 | public static extern int machnet_recv(IntPtr channel_ctx, byte[] data, IntPtr dataSize, ref MachnetFlow_t flow); 34 | } 35 | -------------------------------------------------------------------------------- 
/bindings/go/machnet/conversion.h: -------------------------------------------------------------------------------- 1 | // Convert Go Pointers to void* before passing to C 2 | #ifndef GO_MACHNET_CONVERSION_H_ 3 | #define GO_MACHNET_CONVERSION_H_ 4 | 5 | #ifdef __cplusplus 6 | extern "C" { 7 | #endif 8 | 9 | #include 10 | 11 | int __machnet_sendmsg_go(const MachnetChannelCtx_t* ctx, MachnetIovec_t msg_iov, 12 | long msg_iovlen, MachnetFlow_t flow) { 13 | MachnetMsgHdr_t msghdr; // NOLINT 14 | msghdr.msg_size = msg_iov.len; 15 | 16 | msghdr.flow_info.dst_ip = flow.dst_ip; 17 | msghdr.flow_info.src_ip = flow.src_ip; 18 | msghdr.flow_info.dst_port = flow.dst_port; 19 | msghdr.flow_info.src_port = flow.src_port; 20 | 21 | msghdr.msg_iov = &msg_iov; 22 | msghdr.msg_iovlen = msg_iovlen; 23 | 24 | return __machnet_sendmsg(ctx, &msghdr); 25 | } 26 | 27 | MachnetFlow_t __machnet_recvmsg_go(const MachnetChannelCtx_t* ctx, 28 | MachnetIovec_t msg_iov, long msg_iovlen) { 29 | MachnetMsgHdr_t msghdr; // NOLINT 30 | msghdr.msg_iov = &msg_iov; 31 | msghdr.msg_iovlen = msg_iovlen; 32 | int ret = __machnet_recvmsg(ctx, &msghdr); 33 | 34 | if (ret > 0) { 35 | return msghdr.flow_info; 36 | } else { 37 | MachnetFlow_t flow; 38 | flow.dst_ip = 0; 39 | flow.src_ip = 0; 40 | flow.dst_port = 0; 41 | flow.src_port = 0; 42 | return flow; 43 | } 44 | } 45 | 46 | int __machnet_connect_go(MachnetChannelCtx_t* ctx, uint32_t local_ip, 47 | uint32_t remote_ip, uint16_t remote_port, 48 | MachnetFlow_t* flow) { 49 | return machnet_connect(ctx, local_ip, remote_ip, remote_port, flow); 50 | } 51 | 52 | int __machnet_listen_go(MachnetChannelCtx_t* ctx, uint32_t local_ip, 53 | uint16_t port) { 54 | return machnet_listen(ctx, local_ip, port); 55 | } 56 | 57 | MachnetFlow_t* __machnet_init_flow() { 58 | // cppcheck-suppress cstyleCast 59 | MachnetFlow_t* flow = (MachnetFlow_t*)malloc( 60 | sizeof(MachnetFlow_t)); // NOLINT 61 | flow->dst_ip = 0; 62 | flow->src_ip = 0; 63 | flow->dst_port = 0; 64 | 
flow->src_port = 0; 65 | return flow; 66 | } 67 | 68 | void __machnet_destroy_flow(MachnetFlow_t* flow) { free(flow); } 69 | 70 | #ifdef __cplusplus 71 | } 72 | #endif 73 | 74 | #endif // GO_MACHNET_CONVERSION_H_ 75 | -------------------------------------------------------------------------------- /bindings/go/machnet/go.mod: -------------------------------------------------------------------------------- 1 | module machnet-go 2 | 3 | go 1.20 4 | -------------------------------------------------------------------------------- /bindings/go/machnet/machnet.go: -------------------------------------------------------------------------------- 1 | package machnet 2 | 3 | // #cgo LDFLAGS: -L${SRCDIR}/../../build/src/core -lcore -L${SRCDIR}/../../build/src/ext -lmachnet_shim -lrt -Wl,-rpath=${SRCDIR}/../../build/src/core:${SRCDIR}/../../build/src/ext -fsanitize=address 4 | // #cgo CFLAGS: -I${SRCDIR}/../../src/ext -I${SRCDIR}/../../src/include -fsanitize=address 5 | // #include 6 | // #include "conversion.h" 7 | // #include "../../src/ext/machnet.h" 8 | // #include "../../src/ext/machnet_common.h" 9 | import "C" 10 | import ( 11 | "strconv" 12 | "strings" 13 | "unsafe" 14 | ) 15 | 16 | // Alternate Go Type Defs for C types 17 | type MachnetChannelCtx = C.MachnetChannelCtx_t 18 | 19 | // Define the MachnetFlow struct 20 | type MachnetFlow struct { 21 | SrcIp uint32 22 | DstIp uint32 23 | SrcPort uint16 24 | DstPort uint16 25 | } 26 | 27 | // Helper function to convert a C.MachnetFlow_t to a MachnetFlow. 28 | func convert_net_flow_go(c_flow *C.MachnetFlow_t) MachnetFlow { 29 | return MachnetFlow{ 30 | SrcIp: (uint32)(c_flow.src_ip), 31 | DstIp: (uint32)(c_flow.dst_ip), 32 | SrcPort: (uint16)(c_flow.src_port), 33 | DstPort: (uint16)(c_flow.dst_port), 34 | } 35 | } 36 | 37 | // Helper function to convert a MachnetFlow to a C.MachnetFlow_t. 
38 | func convert_net_flow_c(flow MachnetFlow) C.MachnetFlow_t { 39 | return C.MachnetFlow_t{ 40 | src_ip: (C.uint)(flow.SrcIp), 41 | dst_ip: (C.uint)(flow.DstIp), 42 | src_port: (C.ushort)(flow.SrcPort), 43 | dst_port: (C.ushort)(flow.DstPort), 44 | } 45 | } 46 | 47 | // Helper function to convert a IPv4 address string to a uint32. 48 | func ipv4_str_to_uint32(ipv4_str string) uint32 { 49 | bytes := strings.Split(ipv4_str, ".") 50 | var ipv4_uint32 uint32 = 0 51 | for i := 0; i < 4; i++ { 52 | val, _ := strconv.Atoi(bytes[i]) 53 | ipv4_uint32 |= uint32(val) << uint32(8*(3-i)) 54 | } 55 | return ipv4_uint32 56 | } 57 | 58 | // Initialize the MACHNET shim. 59 | func Init() int { 60 | ret := C.machnet_init() 61 | return (int)(ret) 62 | } 63 | 64 | // Attach to the MACHNET shim. 65 | // Returns a pointer to the channel context. 66 | func Attach() *MachnetChannelCtx { 67 | var c_ctx *C.MachnetChannelCtx_t = (*C.MachnetChannelCtx_t)(C.machnet_attach()) 68 | return (*MachnetChannelCtx)(c_ctx) 69 | } 70 | 71 | // Connect to the remote host and port. 72 | func Connect(ctx *MachnetChannelCtx, local_ip string, remote_ip string, remote_port uint) (int, MachnetFlow) { 73 | // Initialize the flow 74 | var flow_ptr *C.MachnetFlow_t = C.__machnet_init_flow() 75 | 76 | local_ip_int := ipv4_str_to_uint32(local_ip) 77 | remote_ip_int := ipv4_str_to_uint32(remote_ip) 78 | 79 | ret := C.__machnet_connect_go((*C.MachnetChannelCtx_t)(ctx), (C.uint)(local_ip_int), (C.uint)(remote_ip_int), C.ushort(remote_port), flow_ptr) 80 | return (int)(ret), convert_net_flow_go(flow_ptr) 81 | } 82 | 83 | // Listen on the local host and port. 84 | func Listen(ctx *MachnetChannelCtx, local_ip string, local_port uint) int { 85 | local_ip_int := ipv4_str_to_uint32(local_ip) 86 | ret := C.__machnet_listen_go((*C.MachnetChannelCtx_t)(ctx), (C.uint)(local_ip_int), C.ushort(local_port)) 87 | return (int)(ret) 88 | } 89 | 90 | // Send message on the flow. 91 | // NOTE: Currently, only one iov is supported. 
92 | func SendMsg(ctx *MachnetChannelCtx, flow MachnetFlow, base *uint8, iov_len uint) int { 93 | var iov C.MachnetIovec_t 94 | iov.base = unsafe.Pointer(base) 95 | iov.len = C.size_t(iov_len) 96 | 97 | ret := C.__machnet_sendmsg_go((*C.MachnetChannelCtx_t)(ctx), iov, 1, convert_net_flow_c(flow)) 98 | return (int)(ret) 99 | } 100 | 101 | // Receive message on the channel. 102 | // NOTE: Currently, only one iov is supported. 103 | func RecvMsg(ctx *MachnetChannelCtx, base *uint8, iov_len uint) (int, MachnetFlow) { 104 | var iov C.MachnetIovec_t 105 | iov.base = unsafe.Pointer(base) 106 | iov.len = C.size_t(iov_len) 107 | 108 | flow := C.__machnet_recvmsg_go((*C.MachnetChannelCtx_t)(ctx), iov, 1) 109 | if flow.dst_ip == 0 { 110 | return -1, convert_net_flow_go(&flow) 111 | } else { 112 | return 0, convert_net_flow_go(&flow) 113 | } 114 | } 115 | -------------------------------------------------------------------------------- /bindings/go/msg_gen/README.md: -------------------------------------------------------------------------------- 1 | # Message Generator Go App (main) 2 | 3 | This is a simple message generator application that uses the Machnet stack. 4 | 5 | ## Prerequisites 6 | 7 | Successful build of the `Machnet` project (see main [README](../../README.md)). 8 | 9 | 10 | ## Running the application 11 | 12 | ### Configuration 13 | 14 | Install the necessary Go dependencies by way of: 15 | 16 | ``` 17 | cd apps/msg_gen 18 | go install 19 | ``` 20 | 21 | Build the Go binary using: 22 | 23 | ``` 24 | go build main.go 25 | ``` 26 | 27 | The application is run by the `main` binary. You can see the available options by running `main --help`. 28 | 29 | ### Sending messages between two machines 30 | 31 | **Attention**: A Machnet stack instance must be running on every machine that needs to use this message generator application. You can find information on how to run the Machnet stack in the [Machnet README](../../src/apps/machnet/README.md). 
32 | 33 | In the example below, a server named `poseidon` sends messages to the `zeus` server, which bounces them back. 34 | 35 | ```bash 36 | # On machine `zeus` (bouncing): 37 | cd ${REPOROOT}/src/apps/main 38 | sudo GLOG_logtostderr=1 ./main --local_hostname zeus 39 | 40 | # On machine `poseidon` (sender): 41 | cd ${REPOROOT}/src/apps/main 42 | sudo GLOG_logtostderr=1 ./main --local_hostname poseidon --remote_hostname zeus --msg_size 20000 -active_generator 43 | ``` 44 | 45 | The active generator side will maintain a closed loop with a pre-set message window (that is, number of inflight messages). You can adjust the number of inflight messages by setting the `-msg_window` option. The default value is 8. 46 | 47 | ### Options for Message Generator Application 48 | 1. `msg_size`: Set the size of the message to test against. 49 | 2. `msg_window`: Set the maximum number of messages in flight. 50 | 3. `active_generator`: If set, the application actively sends messages and reports the stats. 51 | 4. `latency`: Get the latency measurements. 
Default: `false` (gives throughput measurements in that case) 52 | 53 | For all options, run `./main --help` 54 | -------------------------------------------------------------------------------- /bindings/go/msg_gen/go.mod: -------------------------------------------------------------------------------- 1 | module msg_gen 2 | 3 | go 1.20 4 | 5 | require ( 6 | github.com/buger/jsonparser v1.1.1 7 | github.com/golang/glog v1.1.0 8 | github.com/msr-machnet/machnet v0.0.0-00010101000000-000000000000 9 | ) 10 | 11 | require github.com/HdrHistogram/hdrhistogram-go v1.1.2 // indirect 12 | 13 | replace github.com/msr-machnet/machnet => ../machnet 14 | -------------------------------------------------------------------------------- /bindings/go/msg_gen/go.sum: -------------------------------------------------------------------------------- 1 | dmitri.shuralyov.com/gpu/mtl v0.0.0-20190408044501-666a987793e9/go.mod h1:H6x//7gZCb22OMCxBHrMx7a5I7Hp++hsVxbQ4BYO7hU= 2 | github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo= 3 | github.com/HdrHistogram/hdrhistogram-go v1.1.2 h1:5IcZpTvzydCQeHzK4Ef/D5rrSqwxob0t8PQPMybUNFM= 4 | github.com/HdrHistogram/hdrhistogram-go v1.1.2/go.mod h1:yDgFjdqOqDEKOvasDdhWNXYg9BVp4O+o5f6V/ehm6Oo= 5 | github.com/ajstarks/svgo v0.0.0-20180226025133-644b8db467af/go.mod h1:K08gAheRH3/J6wwsYMMT4xOr94bZjxIelGM0+d/wbFw= 6 | github.com/buger/jsonparser v1.1.1 h1:2PnMjfWD7wBILjqQbt530v576A/cAbQvEW9gGIpYMUs= 7 | github.com/buger/jsonparser v1.1.1/go.mod h1:6RYKKt7H4d4+iWqouImQ9R2FZql3VbhNgx27UK13J/0= 8 | github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E= 9 | github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 10 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 11 | github.com/fogleman/gg v1.2.1-0.20190220221249-0403632d5b90/go.mod h1:R/bRT+9gY/C5z7JzPU0zXsXHKM4/ayA+zqcVNZzPa1k= 12 | 
github.com/go-gl/glfw v0.0.0-20190409004039-e6da0acd62b1/go.mod h1:vR7hzQXu2zJy9AVAgeJqvqgH9Q5CA+iKCZ2gyEVpxRU= 13 | github.com/golang/freetype v0.0.0-20170609003504-e2365dfdc4a0/go.mod h1:E/TSTwGwJL78qG/PmXZO1EjYhfJinVAhrmmHX6Z8B9k= 14 | github.com/golang/glog v1.1.0 h1:/d3pCKDPWNnvIWe0vVUpNP32qc8U3PDVxySP/y360qE= 15 | github.com/golang/glog v1.1.0/go.mod h1:pfYeQZ3JWZoXTV5sFc986z3HTpwQs9At6P4ImfuP3NQ= 16 | github.com/google/go-cmp v0.5.4/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= 17 | github.com/jung-kurt/gofpdf v1.0.3-0.20190309125859-24315acbbda5/go.mod h1:7Id9E/uU8ce6rXgefFLlgrJj/GYY22cpxn+r32jIOes= 18 | github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ= 19 | github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI= 20 | github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE= 21 | github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno= 22 | github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 23 | github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 24 | github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= 25 | golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w= 26 | golang.org/x/crypto v0.0.0-20190510104115-cbcb75029529/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI= 27 | golang.org/x/exp v0.0.0-20180321215751-8460e604b9de/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= 28 | golang.org/x/exp v0.0.0-20180807140117-3d87b88a115f/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= 29 | golang.org/x/exp v0.0.0-20190125153040-c74c464bbbf2/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= 30 | golang.org/x/exp v0.0.0-20190306152737-a1d7652674e8/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= 31 | golang.org/x/exp 
v0.0.0-20191030013958-a1ab85dbe136/go.mod h1:JXzH8nQsPlswgeRAPE3MuO9GYsAcnJvJ4vnMwN/5qkY= 32 | golang.org/x/image v0.0.0-20180708004352-c73c2afc3b81/go.mod h1:ux5Hcp/YLpHSI86hEcLt0YII63i6oz57MZXIpbrjZUs= 33 | golang.org/x/image v0.0.0-20190227222117-0694c2d4d067/go.mod h1:kZ7UVZpmo3dzQBMxlp+ypCbDeSB+sBbTgSJuh5dn5js= 34 | golang.org/x/image v0.0.0-20190802002840-cff245a6509b/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0= 35 | golang.org/x/mobile v0.0.0-20190719004257-d2bd2a29d028/go.mod h1:E/iHnbuqvinMTCcRqshq8CkpyQDoeVncDDYHnLhea+o= 36 | golang.org/x/mod v0.1.0/go.mod h1:0QHyrYULN0/3qlju5TqG8bIK38QM8yzMo5ekMj3DlcY= 37 | golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg= 38 | golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= 39 | golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= 40 | golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= 41 | golang.org/x/sys v0.0.0-20190312061237-fead79001313/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 42 | golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= 43 | golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= 44 | golang.org/x/tools v0.0.0-20180525024113-a5b4c53f6e8b/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 45 | golang.org/x/tools v0.0.0-20190206041539-40960b6deb8e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 46 | golang.org/x/tools v0.0.0-20191012152004-8de300cfc20a/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= 47 | golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= 48 | golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= 49 | golang.org/x/xerrors 
v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= 50 | gonum.org/v1/gonum v0.0.0-20180816165407-929014505bf4/go.mod h1:Y+Yx5eoAFn32cQvJDxZx5Dpnq+c3wtXuadVZAcxbbBo= 51 | gonum.org/v1/gonum v0.8.2/go.mod h1:oe/vMfY3deqTw+1EZJhuvEW2iwGF1bW9wwu7XCu0+v0= 52 | gonum.org/v1/netlib v0.0.0-20190313105609-8cb42192e0e0/go.mod h1:wa6Ws7BG/ESfp6dHfk7C6KdzKA7wR7u/rKwOGE66zvw= 53 | gonum.org/v1/plot v0.0.0-20190515093506-e2840ee46a6b/go.mod h1:Wt8AAjI+ypCyYX3nZBvf6cAIx93T+c/OS2HFAYskSZc= 54 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 55 | gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 56 | gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 57 | rsc.io/pdf v0.1.1/go.mod h1:n8OzWcQ6Sp37PL01nO98y4iUCRdTGarVfzxY20ICaU4= 58 | -------------------------------------------------------------------------------- /bindings/js/.gitignore: -------------------------------------------------------------------------------- 1 | # node temp files: 2 | node_modules 3 | package-lock.json 4 | package.json 5 | 6 | # C library files 7 | *.so 8 | *.o 9 | -------------------------------------------------------------------------------- /bindings/js/benchmark.js: -------------------------------------------------------------------------------- 1 | const { performance } = require('perf_hooks'); 2 | g_str = 'a'.repeat(32); 3 | g_bytes = Buffer.from(g_str, 'utf8'); 4 | 5 | // Function to benchmark execution time 6 | function benchmark(func, iterations) { 7 | const start = process.hrtime.bigint(); 8 | for (let i = 0; i < iterations; i++) { 9 | func(); 10 | } 11 | const end = process.hrtime.bigint(); 12 | const duration = end - start; 13 | console.log(`${func.name} took ${duration / BigInt(iterations)} nanoseconds per iteration.`); 14 | } 15 | 16 | // Functions to be benchmarked 17 | function 
now() { 18 | return Date.now(); 19 | } 20 | 21 | function hrtime() { 22 | return process.hrtime(); 23 | } 24 | 25 | function bufferAlloc() { 26 | return Buffer.alloc(1024); 27 | } 28 | 29 | function bufferToString() { 30 | return g_bytes.toString('utf8'); 31 | } 32 | 33 | function bufferFromString() { 34 | return Buffer.from(g_str, 'utf8'); 35 | } 36 | 37 | function consoleLog() { 38 | console.log('This output will be hidden.'); 39 | } 40 | 41 | // Run benchmarks 42 | var iterations = 100000; 43 | benchmark(now, iterations); 44 | benchmark(bufferAlloc, iterations); 45 | benchmark(bufferToString, iterations); 46 | benchmark(bufferFromString, iterations); 47 | benchmark(hrtime, iterations); 48 | 49 | console.log('Waiting 5 seconds before running next set of benchmarks...'); 50 | 51 | var iterations = 1000; 52 | setTimeout(() => { benchmark(consoleLog, iterations); }, 5000); 53 | -------------------------------------------------------------------------------- /bindings/js/hello_world.js: -------------------------------------------------------------------------------- 1 | // A simple example of using Machnet that sends the message "Hello World!" over 2 | // the network. 3 | // 4 | // Requirements: npm install ref-napi ffi-napi ref-struct-napi 5 | // 6 | // Usage: Assuming we have two servers (A and B), where Machnet is running on both 7 | // IP 10.0.255.100 at server A, and IP 10.0.255.101 at server B. 8 | // 9 | // On server A: node hello_world.js --local_ip 10.0.255.100 10 | // On server B: node hello_world.js --local_ip 10.0.255.101 --remote_ip 10.0.255.100 11 | // 12 | // If everything works, server A should print "Received message: Hello World!" 
13 | 14 | const ref = require('ref-napi'); 15 | const commander = require('commander'); 16 | const chalk = require('chalk'); 17 | const {machnet_shim, MachnetFlow_t} = require('./machnet_shim'); 18 | 19 | function customCheck(condition, message) { 20 | if (!condition) { 21 | console.log(chalk.red('Error: ' + message)); 22 | process.exit(1); 23 | } else { 24 | console.log(chalk.green('Success: ' + message)); 25 | } 26 | } 27 | 28 | const kHelloWorldPort = 888; 29 | commander.option('-l, --local_ip ', 'Local IP address') 30 | .option('-r, --remote_ip ', 'Remote IP address') 31 | .parse(process.argv); 32 | 33 | const options = commander.opts(); 34 | if (!options.local_ip) { 35 | console.log(chalk.red('Error: local_ip is required')); 36 | process.exit(1); 37 | } 38 | console.log(options); 39 | 40 | // Main logic 41 | var ret = machnet_shim.machnet_init(); 42 | customCheck(ret === 0, 'machnet_init()'); 43 | 44 | var channel_ctx = machnet_shim.machnet_attach(); 45 | customCheck(channel_ctx !== null, 'machnet_attach()'); 46 | 47 | if (options.remote_ip) { 48 | // Client 49 | const flow = new MachnetFlow_t(); 50 | ret = machnet_shim.machnet_connect( 51 | channel_ctx, ref.allocCString(options.local_ip), 52 | ref.allocCString(options.remote_ip), kHelloWorldPort, flow.ref()); 53 | customCheck(ret === 0, 'machnet_connect()'); 54 | 55 | const msg = 'Hello World!'; 56 | const msg_buffer = Buffer.from(msg, 'utf8'); 57 | ret = machnet_shim.machnet_send(channel_ctx, flow, msg_buffer, msg_buffer.length); 58 | if (ret === -1) { 59 | console.log(chalk.red('Error: machnet_send() failed')); 60 | } else { 61 | console.log(chalk.green('Message sent successfully')); 62 | } 63 | } else { 64 | // Server 65 | console.log('Waiting for message from client'); 66 | ret = machnet_shim.machnet_listen( 67 | channel_ctx, ref.allocCString(options.local_ip), kHelloWorldPort); 68 | customCheck(ret === 0, 'machnet_listen()'); 69 | 70 | function receive_message() { 71 | const buf = Buffer.alloc(1024); 72 
| const flow = new MachnetFlow_t(); 73 | const bytesRead = 74 | machnet_shim.machnet_recv(channel_ctx, buf, buf.length, flow.ref()); 75 | 76 | if (bytesRead === -1) { 77 | console.log(chalk.red('Error: machnet_recv() failed')); 78 | } else if (bytesRead === 0) { 79 | setTimeout(receive_message, 10); 80 | } else { 81 | const receivedMsg = buf.toString('utf8', 0, bytesRead); 82 | console.log(`Received message: ${receivedMsg}`); 83 | receive_message(); 84 | } 85 | } 86 | 87 | receive_message(); 88 | } 89 | -------------------------------------------------------------------------------- /bindings/js/latency.js: -------------------------------------------------------------------------------- 1 | const ref = require('ref-napi'); 2 | const commander = require('commander'); 3 | const chalk = require('chalk'); 4 | const {machnet_shim, MachnetFlow_t} = require('./machnet_shim'); 5 | 6 | const kHelloWorldPort = 888; 7 | commander.option('-l, --local_ip ', 'Local IP address') 8 | .option('-r, --remote_ip ', 'Remote IP address') 9 | .option('-c, --is_client', 'Run as client') 10 | .parse(process.argv); 11 | 12 | const options = commander.opts(); 13 | if (!options.local_ip || !options.remote_ip) { 14 | console.log(chalk.red('Error: local_ip and remote_ip are required')); 15 | process.exit(1); 16 | } 17 | console.log(options); 18 | 19 | function customCheck(condition, message) { 20 | if (!condition) { 21 | console.log(chalk.red('Error: ' + message)); 22 | process.exit(1); 23 | } else { 24 | console.log(chalk.green('Success: ' + message)); 25 | } 26 | } 27 | 28 | function print_stats(arr) { 29 | const sorted = arr.sort((a, b) => a - b); 30 | const median = sorted[Math.floor(sorted.length / 2)]; 31 | const ninety = sorted[Math.floor(sorted.length * 0.9)]; 32 | const ninety_nine = sorted[Math.floor(sorted.length * 0.99)]; 33 | const ninety_nine_nine = sorted[Math.floor(sorted.length * 0.999)]; 34 | 35 | console.log('Median: ' + Math.floor(median) + ' us'); 36 | console.log('90th 
percentile: ' + Math.floor(ninety) + ' us'); 37 | console.log('99th percentile: ' + Math.floor(ninety_nine) + ' us'); 38 | console.log('99.9th percentile: ' + Math.floor(ninety_nine_nine) + ' us'); 39 | } 40 | 41 | // Main logic 42 | var ret = machnet_shim.machnet_init(); 43 | customCheck(ret === 0, 'machnet_init()'); 44 | 45 | var channel_ctx = machnet_shim.machnet_attach(); 46 | customCheck(channel_ctx !== null, 'machnet_attach()'); 47 | 48 | const NUM_MESSAGES = 100000; 49 | const latencies_us = []; 50 | 51 | const tx_flow = new MachnetFlow_t(); 52 | var ret = machnet_shim.machnet_connect( 53 | channel_ctx, ref.allocCString(options.local_ip), 54 | ref.allocCString(options.remote_ip), kHelloWorldPort, tx_flow.ref()); 55 | customCheck(ret === 0, 'machnet_connect()'); 56 | 57 | ret = machnet_shim.machnet_listen( 58 | channel_ctx, ref.allocCString(options.local_ip), kHelloWorldPort); 59 | customCheck(ret === 0, 'machnet_listen()'); 60 | 61 | if (options.is_client) { 62 | // Client 63 | console.log('Running as client'); 64 | 65 | let msgCounter = 0; 66 | const rx_flow = new MachnetFlow_t(); 67 | const rx_buf = Buffer.alloc(1024); 68 | const msg = `Hello World!`; 69 | const msg_buffer = Buffer.from(msg, 'utf8'); 70 | 71 | while (msgCounter < NUM_MESSAGES) { 72 | const startTime = process.hrtime(); 73 | ret = machnet_shim.machnet_send( 74 | channel_ctx, tx_flow, msg_buffer, msg_buffer.length); 75 | 76 | if (ret === -1) { 77 | console.log( 78 | chalk.red(`Error: machnet_send() failed for message ${msgCounter}`)); 79 | process.exit(1); 80 | } else { 81 | let bytesRead = 0; 82 | while (bytesRead === 0) { 83 | const result = machnet_shim.machnet_recv( 84 | channel_ctx, rx_buf, rx_buf.length, rx_flow.ref()); 85 | if (result === -1) { 86 | console.log(chalk.red( 87 | `Error: machnet_recv() failed for message ${msgCounter}`)); 88 | process.exit(1); 89 | } 90 | bytesRead = result; 91 | } 92 | 93 | const endTime = process.hrtime(); 94 | const latency_us = 95 | (endTime[0] - startTime[0]) * 1e6 
+ (endTime[1] - startTime[1]) / 1e3; 96 | latencies_us.push(latency_us); 97 | msgCounter++; 98 | } 99 | 100 | if (msgCounter % (NUM_MESSAGES / 10) === 0) { 101 | console.log(`Sent ${msgCounter} messages of ${NUM_MESSAGES}`); 102 | print_stats(latencies_us); 103 | latencies_us.length = 0; 104 | } 105 | } 106 | 107 | 108 | } else { 109 | // Server 110 | console.log('Running as server, waiting for messages from client'); 111 | const buf = Buffer.alloc(1024); 112 | const rx_flow = new MachnetFlow_t(); 113 | const replyMsg = `yes`; 114 | const replyBuffer = Buffer.from(replyMsg, 'utf8'); 115 | 116 | while (true) { 117 | const bytesRead = 118 | machnet_shim.machnet_recv(channel_ctx, buf, buf.length, rx_flow.ref()); 119 | 120 | if (bytesRead === -1) { 121 | console.log(chalk.red('Error: machnet_recv() failed')); 122 | continue; // continue to poll for messages 123 | } else if (bytesRead > 0) { 124 | machnet_shim.machnet_send( 125 | channel_ctx, tx_flow, replyBuffer, replyBuffer.length); 126 | } 127 | } 128 | } 129 | -------------------------------------------------------------------------------- /bindings/js/machnet_shim.js: -------------------------------------------------------------------------------- 1 | const ffi = require('ffi-napi'); 2 | const ref = require('ref-napi'); 3 | const Struct = require('ref-struct-napi'); 4 | 5 | // Basic types and loading the lib 6 | const MachnetFlow_t = Struct({ 7 | src_ip: 'uint32', 8 | dst_ip: 'uint32', 9 | src_port: 'uint16', 10 | dst_port: 'uint16' 11 | }); 12 | 13 | const size_t = ref.types.size_t; 14 | const voidPtr = ref.refType(ref.types.void); 15 | const charPtr = ref.refType(ref.types.char); 16 | const uint16 = ref.types.uint16; 17 | const MachnetFlowPtr = ref.refType(MachnetFlow_t); 18 | 19 | var dir = __dirname; 20 | const libmachnet_shim_location = dir + '/libmachnet_shim.so'; 21 | 22 | console.log('Loading libmachnet_shim'); 23 | var machnet_shim = ffi.Library(libmachnet_shim_location, { 24 | 'machnet_init': ['int', []], 
25 | 'machnet_attach': ['pointer', []], 26 | 'machnet_listen': ['int', [voidPtr, charPtr, uint16]], 27 | 'machnet_connect': ['int', [voidPtr, charPtr, charPtr, uint16, MachnetFlowPtr]], 28 | 'machnet_send': ['int', [voidPtr, MachnetFlow_t, voidPtr, size_t]], 29 | 'machnet_recv': ['int', [voidPtr, voidPtr, size_t, MachnetFlowPtr]] 30 | }); 31 | 32 | module.exports = { 33 | machnet_shim: machnet_shim, 34 | MachnetFlow_t: MachnetFlow_t 35 | }; 36 | -------------------------------------------------------------------------------- /bindings/rust/.gitignore: -------------------------------------------------------------------------------- 1 | # Compiled files 2 | /target/ 3 | 4 | # Remove Cargo.lock from gitignore if creating an executable, leave it for libraries 5 | # More information: https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html 6 | Cargo.lock 7 | 8 | # These are backup files generated by rustfmt 9 | **/*.rs.bk 10 | 11 | # IDEs and editors 12 | /.idea/ 13 | *.swp 14 | *.swo 15 | .vscode/ 16 | -------------------------------------------------------------------------------- /bindings/rust/Cargo.toml: -------------------------------------------------------------------------------- 1 | [package] 2 | name = "machnet" 3 | version = "0.1.9" 4 | edition = "2021" 5 | 6 | authors = ["Vahab Jabrayilov "] 7 | description = "A Rust FFI bindings for Machnet" 8 | repository = "https://github.com/microsoft/machnet/tree/rust" 9 | # repository = "https://github.com/microsoft/machnet/tree/rust/bindings/rust" 10 | license = "MIT" 11 | categories= ["networking","ffi","dpdk"] 12 | keywords = ["networking","ffi","dpdk"] 13 | # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html 14 | 15 | [build-dependencies] 16 | bindgen = "0.69.4" 17 | -------------------------------------------------------------------------------- /bindings/rust/README.md: -------------------------------------------------------------------------------- 1 | # 
Machnet Rust Bindings 2 | 3 | [![Crates.io](https://img.shields.io/crates/v/machnet.svg)](https://crates.io/crates/machnet) 4 | [![Docs.rs](https://docs.rs/machnet/badge.svg)](https://docs.rs/machnet) 5 | [![Build](https://github.com/microsoft/machnet/actions/workflows/build.yml/badge.svg?event=push)](https://github.com/microsoft/machnet) 6 | ![GitHub License](https://img.shields.io/github/license/microsoft/machnet) 7 | 8 | This repository contains the Rust FFI bindings for **Machnet**. 9 | 10 | Machnet provides an easy way for applications to reduce their datacenter networking latency via kernel-bypass (DPDK-based) messaging. 11 | Distributed applications like databases and finance can use Machnet as the networking library to get sub-100 microsecond tail latency at high message rates, e.g., 750,000 1KB request-reply messages per second on Azure F8s_v2 VMs with 61 microsecond P99.9 round-trip latency. 12 | 13 | We support a variety of cloud (Azure, AWS, GCP) and bare-metal platforms, OSs, and NICs, evaluated in [PERFORMANCE_REPORT.md](../../docs/PERFORMANCE_REPORT.md). 14 | 15 | ## Prerequisites 16 | 17 | `clang` is required to build the Rust bindings. You can install it using the following command: 18 | 19 | ```bash 20 | sudo apt-get update && sudo apt-get install -y clang 21 | ``` 22 | 23 | It also requires that `libmachnet_shim.so` is built and installed on the system. 24 | You can check out the [Machnet](https://github.com/microsoft/machnet/) repo for the details. 25 | Use the [`build_shim.sh`](https://github.com/microsoft/machnet/blob/main/build_shim.sh) to automatically build and install the `libmachnet_shim.so` library. 26 | 27 | ## Getting Started 28 | 29 | To use the Machnet Rust bindings, add the following to your `Cargo.toml`: 30 | 31 | ```toml 32 | [dependencies] 33 | machnet = "0.1.9" 34 | ``` 35 | 36 | ## Demo 37 | 38 | We have a simple [msg_gen](https://github.com/microsoft/machnet/tree/rust/examples/rust) application that uses the Machnet stack. 
39 | It is a message generator application that sends variable-size messages to a server and receives them back. 40 | 41 | For 1 KB messages, Rust and C++ show almost identical latencies of 53 and 52 microseconds respectively, indicating comparable performance. 42 | 43 | ## Open Source Project 44 | 45 | This project is an open-source initiative under Microsoft. We welcome contributions and suggestions from the community! 46 | See [CONTRIBUTING.md](../../CONTRIBUTING.md) for more details. 47 | 48 | Microsoft 49 | -------------------------------------------------------------------------------- /bindings/rust/TODO.md: -------------------------------------------------------------------------------- 1 | # TODO 2 | 3 | 4 | 5 | - [ ] Add tests to `src/lib.rs` to ensure that the bindings are working correctly. 6 | - [ ] `cargo test --doc` fails if `libmachnet_shim.so` is not in `LD_LIBRARY_PATH`. 7 | - [ ] Populate [README](README.md) with more information. 8 | -------------------------------------------------------------------------------- /bindings/rust/build.rs: -------------------------------------------------------------------------------- 1 | // Copyright (C) 2023 Vahab Jabrayilov 2 | // Email: vjabrayilov@cs.columbia.edu 3 | // 4 | // This file is part of the Machnet project. 
5 | // 6 | // This project is licensed under the MIT License - see the LICENSE file for details 7 | 8 | use std::{env, path::PathBuf}; 9 | fn main() { 10 | let lib_path = "resources"; 11 | 12 | println!("cargo:rustc-link-search=native={}", lib_path); 13 | println!("cargo:rustc-link-lib=machnet_shim"); 14 | 15 | let bindings = bindgen::Builder::default() 16 | .header(format!("{}/machnet.h", lib_path)) 17 | .parse_callbacks(Box::new(bindgen::CargoCallbacks::new())) 18 | .allowlist_function(".*machnet.*") 19 | .generate() 20 | .expect("Unable to generate bindings"); 21 | 22 | let out_path = PathBuf::from(env::var("OUT_DIR").unwrap()); 23 | bindings 24 | .write_to_file(out_path.join("bindings.rs")) 25 | .expect("Couldn't write bindings!"); 26 | } 27 | -------------------------------------------------------------------------------- /build_shim.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | print_usage() { 4 | echo "Usage: ./build_shim.sh" 5 | echo "" 6 | echo "This script builds the Machnet shim library (libmachnet_shim.so) and examples" 7 | } 8 | 9 | function blue() { 10 | echo -e "\033[0;36m$1\033[0m" 11 | } 12 | 13 | if [[ "$1" == "-h" ]] || [[ "$1" == "--help" ]]; then 14 | print_usage 15 | exit 0 16 | fi 17 | 18 | BASE_DIR="$(dirname "$(readlink -f "$0")")" 19 | SRC_DIR="${BASE_DIR}/src/ext" 20 | 21 | blue "Downloading dependency ..." 22 | sudo apt install libgflags-dev 23 | 24 | blue "Building Machnet shim library..." 25 | 26 | cd "${SRC_DIR}" || exit 1 27 | make clean 28 | make 29 | 30 | if [[ -f "${SRC_DIR}/libmachnet_shim.so" ]]; then 31 | cp "${SRC_DIR}/libmachnet_shim.so" "${BASE_DIR}/" 32 | blue "Machnet shim library built successfully and copied to ${BASE_DIR}/libmachnet_shim.so" 33 | else 34 | echo "Error: Building Machnet shim library failed. Please check the build process." 35 | exit 1 36 | fi 37 | 38 | blue "Building Machnet examples..." 
39 | EXAMPLES_DIR="${BASE_DIR}/examples" 40 | cd ${EXAMPLES_DIR} 41 | make clean 42 | make 43 | blue "Machnet examples built successfully, see ${EXAMPLES_DIR} for binaries" 44 | -------------------------------------------------------------------------------- /docker-bake.hcl: -------------------------------------------------------------------------------- 1 | group "default" { 2 | targets = [ 3 | "machnet-ubuntu", 4 | "machnet-amazon-linux", 5 | ] 6 | } 7 | 8 | arm_list = [for dpdk_platform in [ 9 | { 10 | label = "generic" 11 | cflags = "", 12 | }, { 13 | label = "graviton2" 14 | cflags = "-mcpu=neoverse-n1" 15 | }, { 16 | label = "graviton3", 17 | cflags = "-mcpu=neoverse-v1", 18 | }] : { 19 | dpdk_platform = "${dpdk_platform.label}", 20 | platform = "linux/arm64", 21 | cpu_instruction_set = null, 22 | cflags = "${dpdk_platform.cflags}", 23 | label = "arm-${dpdk_platform.label}", 24 | }] 25 | x86_list = [for dpdk_platform in ["x86-64-v2", "x86-64-v3", "x86-64-v4"] : { 26 | dpdk_platform = "native", 27 | platform = "linux/amd64", 28 | cpu_instruction_set = "${dpdk_platform}", 29 | cflags = "-march=${dpdk_platform}", 30 | label = "${dpdk_platform}", 31 | }] 32 | 33 | configs = setunion(arm_list, x86_list) 34 | 35 | target "machnet-amazon-linux" { 36 | name = "machnet-amazon-linux-${item.label}" 37 | dockerfile = "dockerfiles/amazon-linux-2023.dockerfile" 38 | platforms = ["${item.platform}"] 39 | tags = [ 40 | "ghcr.io/microsoft/machnet/machnet:${item.label}", "ghcr.io/microsoft/machnet/machnet:amazon-linux-${item.label}", 41 | "ghcr.io/microsoft/machnet/machnet:amazon-linux-2023-${item.label}", "machnet:${item.label}", 42 | ] 43 | matrix = { 44 | item = configs 45 | } 46 | args = { 47 | DPDK_PLATFORM = "${item.dpdk_platform}", 48 | DPDK_EXTRA_MESON_DEFINES = join(" ", ["-Dmax_numa_nodes=1 -Ddefault_library=static", (item.cpu_instruction_set != null ? 
" -Dcpu_instruction_set=${item.cpu_instruction_set}" : "")]), 49 | CFLAGS = item.cflags 50 | CXXFLAGS = item.cflags 51 | timezone = "America/New_York" 52 | } 53 | } 54 | 55 | target "machnet-ubuntu" { 56 | name = "machnet-ubuntu-${item.label}" 57 | dockerfile = "dockerfiles/ubuntu-22.04.dockerfile" 58 | tags = [ 59 | "ghcr.io/microsoft/machnet/machnet:${item.label}", "ghcr.io/microsoft/machnet/machnet:ubuntu-${item.label}", "ghcr.io/microsoft/machnet/machnet:ubuntu-22.04-${item.label}" 60 | ] 61 | platforms = ["${item.platform}"] 62 | matrix = { 63 | item = configs 64 | } 65 | args = { 66 | DPDK_PLATFORM = "${item.dpdk_platform}", 67 | DPDK_EXTRA_MESON_DEFINES = join(" ", ["-Dmax_numa_nodes=1 -Ddefault_library=static", (item.cpu_instruction_set != null ? " -Dcpu_instruction_set=${item.cpu_instruction_set}" : "")]), 68 | CFLAGS = item.cflags 69 | CXXFLAGS = item.cflags 70 | timezone = "America/New_York" 71 | } 72 | } 73 | -------------------------------------------------------------------------------- /dockerfiles/amazon-linux-2023.dockerfile: -------------------------------------------------------------------------------- 1 | FROM docker.io/library/amazonlinux:2023 AS machnet_build_base 2 | 3 | # Fixes QEMU-based builds so they don't emulate an x86-64-v1 CPU 4 | ENV QEMU_CPU max 5 | 6 | ARG timezone 7 | 8 | # Set timezone and configure apt 9 | RUN ln -snf /usr/share/zoneinfo/${timezone} /etc/localtime && \ 10 | echo ${timezone} > /etc/timezone 11 | 12 | # Update and install dependencies 13 | RUN dnf update -y && \ 14 | dnf install -y \ 15 | git make automake gcc gcc-c++ kernel-devel ninja-build \ 16 | cmake meson pkg-config libudev-devel \ 17 | libnl3-devel python3-devel \ 18 | python3-docutils numactl-devel numactl \ 19 | ca-certificates autoconf libasan libasan-static \ 20 | pciutils libunwind-devel libuuid-devel xz-devel \ 21 | python3-pip glibc-devel tar which iproute sudo \ 22 | wget 23 | 24 | RUN python3 -m pip install pyelftools 25 | 26 | ENV BUILDTYPE 
NATIVEONLY 27 | 28 | ENV LIBHUGETLBFS_DIR /root/libhugetlbfs-2.24 29 | RUN wget https://github.com/libhugetlbfs/libhugetlbfs/releases/download/2.24/libhugetlbfs-2.24.tar.gz -O /root/libhugetlbfs.tar.gz && cd /root && \ 30 | tar xf *.tar.gz && cd ${LIBHUGETLBFS_DIR} && \ 31 | cd ${LIBHUGETLBFS_DIR} && \ 32 | ./autogen.sh && ./configure && make obj/hugectl obj/hugeedit obj/hugeadm obj/pagesize && make install && \ 33 | cd / && rm -rf ${LIBHUGETLBFS_DIR} /root/libhugetlbfs*.tar.gz 34 | 35 | # libhugetlbfs is both picky and not particularly performance sensitive. 36 | ARG CFLAGS 37 | ARG CXXFLAGS 38 | 39 | ENV NLOHMANN_JSON_DIR /root/nlohmann_json 40 | RUN git clone --depth 1 --branch 'v3.10.5' https://github.com/nlohmann/json.git ${NLOHMANN_JSON_DIR} && \ 41 | cd ${NLOHMANN_JSON_DIR} && mkdir build && cd build && cmake -DCMAKE_BUILD_TYPE=Release -GNinja ../ && ninja install && cd / && rm -rf ${NLOHMANN_JSON_DIR} 42 | 43 | # Remove conflicting packages 44 | # RUN apt-get --purge -y remove rdma-core librdmacm1 ibverbs-providers libibverbs-dev libibverbs1 45 | 46 | # Cleanup after package install 47 | # RUN rm -rf /var/lib/apt/lists/* 48 | 49 | WORKDIR /root 50 | 51 | # Set env variable for rdma-core 52 | ENV RDMA_CORE /root/rdma-core 53 | 54 | # Build rdma-core 55 | RUN git clone -b 'stable-v52' --single-branch --depth 1 https://github.com/linux-rdma/rdma-core.git ${RDMA_CORE} && \ 56 | cd ${RDMA_CORE} && \ 57 | mkdir build && \ 58 | cd build && \ 59 | cmake -GNinja -DNO_PYVERBS=1 -DNO_MAN_PAGES=1 ../ && \ 60 | ninja install && \ 61 | cd / && rm -rf ${RDMA_CORE} 62 | 63 | # Set env variable for DPDK 64 | ENV RTE_SDK /root/dpdk 65 | 66 | # Parts of DPDK that aren't needed for Machnet are disabled to help with the container size and so that LTO hopefully has an easier time finding optimizations. 
67 | 68 | ENV DPDK_DISABLED_APPS dumpcap,graph,pdump,proc-info,test-acl,test-bbdev,test-cmdline,test-compress-perf,test-crypto-perf,test-dma-perf,test-eventdev,test-fib,test-flow-perf,test-gpudev,test-mldev,test-pipeline,test-regex,test-sad,test-security-perf 69 | ENV DPDK_DISABLED_DRIVER_GROUPS raw/*,crypto/*,baseband/*,dma/*,event/*,regex/*,ml/*,gpu/*,vdpa/*,compress/* 70 | ENV DPDK_DISABLED_COMMON_DRIVERS common/qat,common/octeontx,common/octeontx2,common/cnxk,common/dpaax 71 | # probably the only safe bus driver to disable 72 | ENV DPDK_DISABLED_BUS_DRIVERS bus/ifpga 73 | # PMDs which don't meet the minimum requirements for Machnet 74 | ENV DPDK_DISABLED_NIC_DRIVERS net/softnic,net/tap,net/af_packet,net/af_xdp,net/avp,net/bnx2x,net/memif,net/nfb,net/octeon_ep,net/pcap,net/ring,net/tap 75 | 76 | # Additional drivers to disable. Intended to allow disabling drivers not needed in your environment to save on image size. This needs to end with a comma. 77 | ARG DPDK_ADDITIONAL_DISABLED_DRIVERS 78 | 79 | ENV DPDK_DISABLED_DRIVERS ${DPDK_ADDITIONAL_DISABLED_DRIVERS}${DPDK_DISABLED_DRIVER_GROUPS},${DPDK_DISABLED_COMMON_DRIVERS},${DPDK_DISABLED_BUS_DRIVERS},${DPDK_DISABLED_NIC_DRIVERS} 80 | 81 | # Enabling a driver wins over disabling a driver, so if you the user disagree with any of our decisions add a comma delimited list of drivers to re-enable. 82 | # For example, Marvell OcteonTX2 would be enabled by passing '--build-arg="DPDK_ENABLED_DRIVERS=common/octeontx2"' to the build command 83 | ARG DPDK_ENABLED_DRIVERS 84 | 85 | # Set the DPDK platform for ARM SOCs 86 | ARG DPDK_PLATFORM=generic 87 | 88 | # Additional Meson defines 89 | ARG DPDK_EXTRA_MESON_DEFINES 90 | 91 | # Preset to build DPDK with. Defaults to release mode with debug info. 
92 | ARG DPDK_MESON_BUILD_PRESET=debugoptimized 93 | 94 | # Build DPDK 95 | RUN git clone --depth 1 --branch 'v23.11' https://github.com/DPDK/dpdk.git ${RTE_SDK} 96 | RUN cd ${RTE_SDK} && \ 97 | meson setup build --buildtype=${DPDK_MESON_BUILD_PRESET} -Dexamples='' -Dplatform=${DPDK_PLATFORM} -Denable_kmods=false -Dtests=false -Ddisable_apps=${DPDK_DISABLED_APPS} -Ddisable_drivers=${DPDK_DISABLED_DRIVERS} -Denable_drivers='${DPDK_ENABLED_DRIVERS}' ${DPDK_EXTRA_MESON_DEFINES} && \ 98 | ninja -C build install && \ 99 | cd / && \ 100 | rm -rf ${RTE_SDK} 101 | 102 | RUN echo /usr/local/lib64 > /etc/ld.so.conf.d/usr_local.conf && ldconfig 103 | 104 | # Stage 2: Build Machnet 105 | FROM machnet_build_base AS machnet 106 | 107 | WORKDIR /root/machnet 108 | 109 | # Copy Machnet files 110 | COPY . . 111 | 112 | # Submodule update 113 | RUN git submodule update --init --recursive 114 | 115 | # Do a Release build 116 | RUN ldconfig && \ 117 | mkdir release_build && \ 118 | cd release_build && \ 119 | cmake -DCMAKE_BUILD_TYPE=Release -GNinja ../ && \ 120 | ninja install machnet msg_gen pktgen shmem_test machnet_test 121 | 122 | # Do a Debug build 123 | RUN ldconfig && \ 124 | mkdir debug_build && \ 125 | cd debug_build && \ 126 | cmake -DCMAKE_BUILD_TYPE=Debug -GNinja ../ && \ 127 | ninja install machnet msg_gen pktgen shmem_test machnet_test 128 | 129 | # 130 | # Cleanup phase 131 | # 132 | 133 | RUN find /usr/local/lib64 -name "*/librte_*.a" | xargs rm -f 134 | 135 | FROM scratch AS machnet_compressed_worker 136 | COPY --from=machnet / / 137 | 138 | CMD /root/machnet/release_build/src/apps/machnet/machnet --config_json /var/run/machnet/local_config.json 139 | 140 | ENTRYPOINT ["/bin/bash"] -------------------------------------------------------------------------------- /dockerfiles/get_targets_for_arch.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | import argparse 4 | import platform 5 | import json 6 | 
import sys 7 | 8 | def main(): 9 | parser = argparse.ArgumentParser( 10 | prog="Get Targets For Arch", 11 | description="This program filters the output of 'docker buildx bake -f $BAKEFILE --print' to get a list of targets for a specific architecture.", 12 | ) 13 | 14 | arch_choices = [ 15 | "x86", 16 | "arm", 17 | "native", 18 | ] 19 | 20 | parser.add_argument("--arch", type=str, choices=arch_choices) 21 | 22 | args = parser.parse_args() 23 | 24 | arch = None 25 | 26 | if args.arch == "x86": 27 | arch = "linux/amd64" 28 | elif args.arch == "arm": 29 | arch = "linux/arm64" 30 | elif args.arch == "native": 31 | machine = str(platform.machine()).lower() 32 | if machine in ("amd64", "x86_64"): 33 | arch = "linux/amd64" 34 | elif machine in ("aarch64", "arm64"): 35 | arch = "linux/arm64" 36 | else: 37 | print(f"Unknown native arch: {machine}", file=sys.stderr) 38 | sys.exit(1) 39 | else: 40 | print(f"Unknown arch argument: {args.arch}", file=sys.stderr) 41 | sys.exit(1) 42 | 43 | buildx_info = json.load(sys.stdin) 44 | 45 | targets_to_build = list(filter_buildx_info(buildx_info, arch)) 46 | 47 | if len(targets_to_build) == 0: 48 | print("No targets to build", file=sys.stderr) 49 | sys.exit(1) 50 | 51 | for target in targets_to_build: 52 | print(target) 53 | 54 | 55 | def filter_buildx_info(buildx_info, platform): 56 | for target, target_info in buildx_info["target"].items(): 57 | if platform in target_info["platforms"]: 58 | yield target 59 | 60 | 61 | if __name__ == "__main__": 62 | main() -------------------------------------------------------------------------------- /dockerfiles/ubuntu-22.04.dockerfile: -------------------------------------------------------------------------------- 1 | # Stage 1: Install system packages and build dependencies 2 | FROM ubuntu:22.04 AS machnet_build_base 3 | 4 | # Fixes QEMU-based builds so they don't emulate an x86-64-v1 CPU 5 | ENV QEMU_CPU max 6 | 7 | ARG timezone 8 | 9 | # Set timezone and configure apt 10 | RUN ln -snf /usr/share/zoneinfo/${timezone} 
/etc/localtime && \ 11 | echo ${timezone} > /etc/timezone && \ 12 | echo 'APT::Install-Suggests "0";' >> /etc/apt/apt.conf.d/00-docker && \ 13 | echo 'APT::Install-Recommends "0";' >> /etc/apt/apt.conf.d/00-docker 14 | 15 | # Update and install dependencies 16 | RUN apt-get update && \ 17 | apt-get install --no-install-recommends -y \ 18 | git \ 19 | build-essential cmake meson pkg-config libudev-dev \ 20 | libnl-3-dev libnl-route-3-dev python3-dev \ 21 | python3-docutils python3-pyelftools libnuma-dev \ 22 | ca-certificates autoconf \ 23 | libhugetlbfs-dev pciutils libunwind-dev uuid-dev nlohmann-json3-dev 24 | 25 | # Remove conflicting packages 26 | RUN apt-get --purge -y remove rdma-core librdmacm1 ibverbs-providers libibverbs-dev libibverbs1 27 | 28 | # Cleanup after package install 29 | RUN rm -rf /var/lib/apt/lists/* 30 | 31 | WORKDIR /root 32 | 33 | # Set env variable for rdma-core 34 | ENV RDMA_CORE /root/rdma-core 35 | 36 | # Build rdma-core 37 | RUN git clone -b 'stable-v52' --single-branch --depth 1 https://github.com/linux-rdma/rdma-core.git ${RDMA_CORE} && \ 38 | cd ${RDMA_CORE} && \ 39 | mkdir build && \ 40 | cd build && \ 41 | cmake -GNinja -DNO_PYVERBS=1 -DNO_MAN_PAGES=1 ../ && \ 42 | ninja install 43 | 44 | RUN echo /usr/local/lib64 > /etc/ld.so.conf.d/usr_local.conf && ldconfig 45 | 46 | # Set env variable for DPDK 47 | ENV RTE_SDK /root/dpdk 48 | 49 | # Parts of DPDK that aren't needed for Machnet are disabled to help with the container size 50 | ENV DPDK_DISABLED_APPS dumpcap,graph,pdump,proc-info,test-acl,test-bbdev,test-cmdline,test-compress-perf,test-crypto-perf,test-dma-perf,test-eventdev,test-fib,test-flow-perf,test-gpudev,test-mldev,test-pipeline,test-regex,test-sad,test-security-perf 51 | ENV DPDK_DISABLED_DRIVER_GROUPS raw/*,crypto/*,baseband/*,dma/*,event/*,regex/*,ml/*,gpu/*,vdpa/*,compress/* 52 | ENV DPDK_DISABLED_COMMON_DRIVERS common/qat,common/octeontx,common/octeontx2,common/cnxk,common/dpaax 53 | # probably the only safe bus 
driver to disable 54 | ENV DPDK_DISABLED_BUS_DRIVERS bus/ifpga 55 | # PMDs which don't meet the minimum requirements for Machnet 56 | ENV DPDK_DISABLED_NIC_DRIVERS net/softnic,net/tap,net/af_packet,net/af_xdp,net/avp,net/bnx2x,net/memif,net/nfb,net/octeon_ep,net/pcap,net/ring,net/tap 57 | 58 | # Additional drivers to disable. Intended to allow disabling drivers not needed in your environment to save on image size. This needs to end with a comma. 59 | ARG DPDK_ADDITIONAL_DISABLED_DRIVERS 60 | 61 | ENV DPDK_DISABLED_DRIVERS ${DPDK_ADDITIONAL_DISABLED_DRIVERS}${DPDK_DISABLED_DRIVER_GROUPS},${DPDK_DISABLED_COMMON_DRIVERS},${DPDK_DISABLED_BUS_DRIVERS},${DPDK_DISABLED_NIC_DRIVERS} 62 | 63 | # Enabling a driver wins over disabling a driver, so if you the user disagree with any of our decisions add a comma delimited list of drivers to re-enable. 64 | # For example, Marvell OcteonTX2 would be enabled by passing '--build-arg="DPDK_ENABLED_DRIVERS=common/octeontx2"' to the build command 65 | ARG DPDK_ENABLED_DRIVERS 66 | 67 | # Set the DPDK platform for ARM SOCs 68 | ARG DPDK_PLATFORM=generic 69 | 70 | # Additional Meson defines 71 | ARG DPDK_EXTRA_MESON_DEFINES 72 | 73 | # Preset to build DPDK with. Defaults to release mode with debug info. 74 | ARG DPDK_MESON_BUILD_PRESET=debugoptimized 75 | 76 | # Build DPDK 77 | RUN git clone --depth 1 --branch 'v23.11' https://github.com/DPDK/dpdk.git ${RTE_SDK} 78 | RUN cd ${RTE_SDK} && \ 79 | meson setup build --buildtype=${DPDK_MESON_BUILD_PRESET} -Dexamples='' -Dplatform=${DPDK_PLATFORM} -Denable_kmods=false -Dtests=false -Ddisable_apps=${DPDK_DISABLED_APPS} -Ddisable_drivers=${DPDK_DISABLED_DRIVERS} -Denable_drivers='${DPDK_ENABLED_DRIVERS}' ${DPDK_EXTRA_MESON_DEFINES} && \ 80 | ninja -C build install && \ 81 | cd / && \ 82 | rm -rf ${RTE_SDK} 83 | 84 | # # Stage 2: Build Machnet 85 | FROM machnet_build_base AS machnet 86 | 87 | WORKDIR /root/machnet 88 | 89 | # Copy Machnet files 90 | COPY . . 
91 | 92 | # Submodule update 93 | RUN git submodule update --init --recursive 94 | 95 | # Do a Release build 96 | RUN ldconfig && \ 97 | mkdir release_build && \ 98 | cd release_build && \ 99 | cmake -DCMAKE_BUILD_TYPE=Release -GNinja ../ && \ 100 | ninja 101 | 102 | # Do a Debug build 103 | RUN ldconfig && \ 104 | mkdir debug_build && \ 105 | cd debug_build && \ 106 | cmake -DCMAKE_BUILD_TYPE=Debug -GNinja ../ && \ 107 | ninja 108 | 109 | 110 | # Clean up unneeded static libraries to save space 111 | RUN find /usr/local/lib64 -name "*/librte_*.a" | xargs rm -f 112 | 113 | FROM scratch AS machnet_compressed_worker 114 | COPY --from=machnet / / 115 | 116 | CMD /root/machnet/release_build/src/apps/machnet/machnet --config_json /var/run/machnet/local_config.json 117 | 118 | ENTRYPOINT ["/bin/bash"] 119 | -------------------------------------------------------------------------------- /docs/INTERNAL.md: -------------------------------------------------------------------------------- 1 | This document is for developers of the Machnet service. It is not intended for 2 | users of the Machnet service. For users, see the [README](README.md). 3 | 4 | ## Compiling Machnet as a Docker image 5 | 6 | ### Building the standard x86 image 7 | 8 | ```bash 9 | docker build --no-cache -f dockerfiles/ubuntu-22.04.dockerfile --target machnet --tag machnet . 10 | ``` 11 | 12 | ### Building custom images 13 | 14 | The top-level makefile allows building x86 and arm64 docker images across a 15 | variety of microarchitecture capability levels (x86) or SOC targets (arm64). 16 | Invoking "make" will create them for the architecture of the host. 
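As a rough sketch of how per-architecture target selection could work: `docker buildx bake -f docker-bake.hcl --print` emits a JSON description of every target, and a small filter can keep only the targets whose `platforms` list contains the host platform. This mirrors what `dockerfiles/get_targets_for_arch.py` does. The helper names and the sample JSON below are illustrative assumptions, not part of the repository.

```python
import json
import platform

# Hypothetical mapping and helper names; only the JSON layout
# ("target" -> name -> "platforms") follows get_targets_for_arch.py.
_MACHINE_TO_PLATFORM = {
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
    "aarch64": "linux/arm64",
    "arm64": "linux/arm64",
}


def host_platform(machine=None):
    """Map a machine string (default: this host's) to a Docker platform, or None."""
    machine = (machine or platform.machine()).lower()
    return _MACHINE_TO_PLATFORM.get(machine)


def targets_for_platform(bake_info, docker_platform):
    """Return the bake target names whose platform list includes docker_platform."""
    return [
        name
        for name, info in bake_info["target"].items()
        if docker_platform in info.get("platforms", [])
    ]


# Trimmed, made-up example of `docker buildx bake --print` output.
sample = json.loads("""
{
  "target": {
    "machnet-ubuntu-x86-64-v4": {"platforms": ["linux/amd64"]},
    "machnet-ubuntu-arm-graviton2": {"platforms": ["linux/arm64"]}
  }
}
""")

print(targets_for_platform(sample, "linux/amd64"))
```

The resulting names could then be passed to `docker buildx bake` as explicit targets.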
17 | 18 | To build a Machnet image for a specific architecture, e.g., x86-64-v4: 19 | 20 | ```bash 21 | # This requires a recent Docker version. Targets in docker-bake.hcl are named machnet-ubuntu-* and machnet-amazon-linux-*. 22 | $ docker buildx bake -f docker-bake.hcl machnet-ubuntu-x86-64-v4 23 | ``` 24 | 25 | ## Manually compiling Machnet from source 26 | 27 | Install the required packages: 28 | ```bash 29 | $ sudo apt -y install cmake pkg-config nlohmann-json3-dev ninja-build gcc g++ doxygen graphviz python3-pip meson libhugetlbfs-dev libnl-3-dev libnl-route-3-dev uuid-dev 30 | $ pip3 install pyelftools 31 | ``` 32 | 33 | Ubuntu 20.04 needs at least `gcc-10`, `g++-10`, and `cpp-10`. This step is not required for newer versions of Ubuntu. 34 | ```bash 35 | # Install and set gcc-10 and g++-10 as the default compiler. 36 | $ sudo apt install gcc-10 g++-10 37 | $ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 100 --slave /usr/bin/g++ g++ /usr/bin/g++-10 --slave /usr/bin/gcov gcov /usr/bin/gcov-10 38 | ``` 39 | 40 | ### Build and install rdma-core 41 | 42 | ```bash 43 | # Remove conflicting packages 44 | $ sudo apt-get --purge -y remove rdma-core librdmacm1 ibverbs-providers libibverbs-dev libibverbs1 45 | ``` 46 | 47 | Now we install rdma-core from source: 48 | ```bash 49 | export RDMA_CORE=/path/to/rdma-core 50 | git clone -b 'stable-v40' --single-branch --depth 1 https://github.com/linux-rdma/rdma-core.git ${RDMA_CORE} 51 | cd ${RDMA_CORE} 52 | mkdir -p build && cd build 53 | cmake -GNinja -DNO_PYVERBS=1 -DNO_MAN_PAGES=1 .. 
54 | sudo ninja install 55 | sudo ldconfig 56 | ``` 57 | 58 | ### Build DPDK 59 | 60 | Download and extract DPDK 23.11, then: 61 | 62 | ```bash 63 | cd dpdk-23.11 64 | meson setup build --prefix=${PWD}/build/install 65 | ninja -C build 66 | ninja -C build install 67 | export PKG_CONFIG_PATH="${PWD}/build/install/lib/x86_64-linux-gnu/pkgconfig" 68 | echo "${PWD}/build/install/lib/x86_64-linux-gnu" | sudo tee -a /etc/ld.so.conf.d/x86_64-linux-gnu.conf > /dev/null 69 | sudo ldconfig 70 | ``` 71 | 72 | Note: To locate the directory to use for `PKG_CONFIG_PATH`, run `find *path_to_dpdk* -name "*.pc"`. 73 | 74 | ### Build Machnet 75 | 76 | Then, from `${REPOROOT}`: 77 | ```bash 78 | git submodule update --init --recursive 79 | 80 | # Debug build: 81 | mkdir build && cd build && cmake -DCMAKE_BUILD_TYPE=Debug -GNinja ../ && ninja 82 | 83 | # Release build (alternative to the debug build above): 84 | mkdir build && cd build && cmake -DCMAKE_BUILD_TYPE=Release -GNinja ../ && ninja 85 | ``` 86 | 87 | The `machnet` binary will be available in `${REPOROOT}/build/src/apps/machnet/`. For 88 | more details about the `machnet` program, see this 89 | [README](../src/apps/machnet/README.md). 90 | 91 | ## Tests 92 | 93 | To run all tests, from `${REPOROOT}`: 94 | ```bash 95 | # If the Ansible automation was not used, enable hugepages first 96 | echo 1024 | sudo tee /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages 97 | sudo ctest # sudo is required for DPDK-related tests. 
98 | ``` 99 | -------------------------------------------------------------------------------- /docs/PERFORMANCE_REPORT.md: -------------------------------------------------------------------------------- 1 | # Machnet performance report 2 | 3 | **Important note: The performance results should be compared across platforms, 4 | since the intra-platform variability (e.g., different pairs of VMs in the same 5 | cloud) is high.** 6 | 7 | ## Single-connection request-response benchmark 8 | 9 | Description: A client sends a request to a server, and the server sends a 10 | response back to the client. The client keeps a configurable number of messages in 11 | flight for pipelining. 12 | 13 | Start the Machnet Docker container on both the client and the server: 14 | 15 | ```bash 16 | # Server 17 | ./machnet.sh --mac --ip 18 | 19 | # Client 20 | ./machnet.sh --mac --ip 21 | ``` 22 | 23 | Run the `msg_gen` benchmark: 24 | ```bash 25 | MSG_GEN="docker run -v /var/run/machnet:/var/run/machnet ghcr.io/microsoft/machnet/machnet:latest release_build/src/apps/msg_gen/msg_gen" 26 | 27 | # Server 28 | ${MSG_GEN} --local_ip --msg_size 1024 29 | 30 | # Client: Experiment E1: latency 31 | ${MSG_GEN} --local_ip --remote_ip --msg_window 1 --tx_msg_size 1024 32 | 33 | # Client: Experiment E2: message rate 34 | ${MSG_GEN} --local_ip --remote_ip --msg_window 32 --tx_msg_size 1024 35 | ``` 36 | 37 | | Server | NIC, DPDK PMD | Round-trip p50, p99, p99.9 | RPCs/s | Notes | 38 | | --- | --- | --- | --- | --- | 39 | | Azure F8s_v2 | CX4-Lx, netvsc | E1: 18 us, 19 us, 25 us | 54K | No proximity groups | 40 | | *Ubuntu 22.03* | | E2: 41 us, 54 us, 61 us | 753K | 41 | | | | | | | 42 | | AWS c5.xlarge | ENA, ena | E1: 42 us, 66 us, 105 us | 22K | Proximity groups | 43 | | *Amazon Linux* | | E2: 61 us, 132 us, 209 us | 122K | `--msg_window 8` | 44 | | | | | | | 45 | | GCP XXX, XXX | gVNIC | E1: XXX | XXX | | 46 | | *XXX*| | E2: XXX | XXX | | 47 | | | | | | | 48 | | Bare metal | E810 PF, ice | E1: 18 us, 21 us, 22 us | 55K 
| 49 | | *Mariner 2.0* | | E2: 30 us, 33 us, 37 us | 1043K | 50 | | | | | | | 51 | | Bare metal | E810 VF, iavf | E1: 18 us, 22 us, 22 us | 55K | 52 | | *Mariner 2.0* | | E2: 31 us, 35 us, 41 us | 1003K | 53 | | | | | | | 54 | | Bare metal | Bluefield-2, mlx5 | E1: 9 us, 12 us, 13 us | 99K | 55 | | *Ubuntu 22.04* | | E2: 24 us, 26 us, 28 us | 1320K | 56 | | | | | | | 57 | | Bare metal | CX5, mlx5 | E1: XXX | XXX | 58 | | *Ubuntu 20.04* | | E2: XXX | XXX | 59 | | | | | | | 60 | | Bare metal | CX6-Dx, mlx5 | E1: XXX | XXX | 61 | | *Ubuntu 20.04* | | E2: XXX | XXX | 62 | 63 | -------------------------------------------------------------------------------- /examples/.gitignore: -------------------------------------------------------------------------------- 1 | hello_world 2 | -------------------------------------------------------------------------------- /examples/Makefile: -------------------------------------------------------------------------------- 1 | CC = g++ 2 | CFLAGS = -I../src/include -I../src/ext -L../ -lmachnet_shim -lrt -lgflags -Wl,-rpath,.. 3 | 4 | all: hello_world 5 | 6 | hello_world: hello_world.cc 7 | $(CC) -o $@ $< $(CFLAGS) 8 | 9 | clean: 10 | rm -f hello_world 11 | -------------------------------------------------------------------------------- /examples/aws_instructions.md: -------------------------------------------------------------------------------- 1 | # Instructions to run Machnet on AWS 2 | 3 | ## Steps to create the VMs 4 | 5 | - Create two VMs (e.g., c5.xlarge) using EC2's web interface. This can be done 6 | in one shot by setting "Number of instances" to 2 in the UI, which will also 7 | ensure that the VMs end up in the same subnet. 8 | 9 | - Add an elastic IP to each VM to allow SSH access. 10 | 11 | - Create two more NICs (one for each VM) in the same subnet as above. These will 12 | be used for private Machnet traffic between the VMs. 
13 | 14 | - **Modify these NICs' security group to allow IPv4 traffic.** 15 | - Associate one NIC to each VM. 16 | 17 | 18 | 19 | ## Bind each VM's private NIC to Machnet 20 | 21 | The commands below provide a rough outline. 22 | **Before binding the private NIC to DPDK, note down its IP and MAC address.** 23 | 24 | ```bash 25 | # Install igb_uio 26 | sudo yum install git make kernel-devel 27 | git clone git://dpdk.org/dpdk-kmods 28 | cd dpdk-kmods/linux/igb_uio; make; sudo modprobe uio; sudo insmod igb_uio.ko 29 | 30 | # Assuming that ens6 is the private NIC for Machnet 31 | MACHNET_NIC=ens6 32 | cd; wget https://fast.dpdk.org/rel/dpdk-23.11.tar.xz; tar -xvf dpdk-23.11.tar.xz 33 | sudo ifconfig ${MACHNET_NIC} down 34 | sudo ~/dpdk-23.11/usertools/dpdk-devbind.py --bind igb_uio ${MACHNET_NIC} 35 | ``` -------------------------------------------------------------------------------- /examples/azure_start_machnet.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Hacky script to start-up Machnet on an Azure VM on interface #1 3 | # Usage: ./azure_start_machnet.sh [--bare_metal] 4 | 5 | BARE_METAL_ARG="" 6 | if [ "$1" == "--bare_metal" ]; then 7 | echo "Running in bare metal mode, will not use docker" 8 | BARE_METAL_ARG="--bare_metal" 9 | fi 10 | 11 | machnet_ip_addr=$( 12 | curl -s -H Metadata:true --noproxy "*" \ 13 | "http://169.254.169.254/metadata/instance?api-version=2021-02-01" | \ 14 | jq '.network.interface[1].ipv4.ipAddress[0].privateIpAddress' | \ 15 | tr -d '"') 16 | echo "Machnet IP address: ${machnet_ip_addr}" 17 | 18 | machnet_mac_addr=$( 19 | curl -s -H Metadata:true --noproxy "*" \ 20 | "http://169.254.169.254/metadata/instance?api-version=2021-02-01" | \ 21 | jq '.network.interface[1].macAddress' | \ 22 | tr -d '"' | \ 23 | sed 's/\(..\)/\1:/g;s/:$//') # Converts AABBCCDDEEFF to AA:BB:CC:DD:EE:FF 24 | echo "Machnet MAC address: ${machnet_mac_addr}" 25 | 26 | sudo modprobe uio_hv_generic 27 | 28 | 
if [ -d /sys/class/net/eth1 ]; then 29 | DEV_UUID=$(basename $(readlink /sys/class/net/eth1/device)) 30 | echo "Unbinding $DEV_UUID from hv_netvsc" 31 | sudo driverctl -b vmbus set-override $DEV_UUID uio_hv_generic 32 | fi 33 | 34 | THIS_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )" 35 | cd ${THIS_SCRIPT_DIR}/..; ./machnet.sh ${BARE_METAL_ARG} --mac ${machnet_mac_addr} --ip ${machnet_ip_addr} 36 | -------------------------------------------------------------------------------- /examples/hello_world.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file main.cc 3 | * Simple hello world application using only Machnet public APIs 4 | * Usage: 5 | * - First start the server: ./hello_world --local= 6 | * - Client: 7 | * ./hello_world --local= --remote= --is_client=1 8 | */ 9 | 10 | #include 11 | #include 12 | 13 | #include 14 | 15 | DEFINE_string(local, "", "Local IP address"); 16 | DEFINE_string(remote, "", "Remote IP address"); 17 | 18 | static constexpr uint16_t kPort = 31580; 19 | 20 | // assert with message 21 | void assert_with_msg(bool cond, const char *msg) { 22 | if (!cond) { 23 | printf("%s\n", msg); 24 | exit(-1); 25 | } 26 | } 27 | 28 | int main(int argc, char *argv[]) { 29 | gflags::ParseCommandLineFlags(&argc, &argv, true); 30 | 31 | int ret = machnet_init(); 32 | assert_with_msg(ret == 0, "machnet_init() failed"); 33 | 34 | void *channel = machnet_attach(); 35 | assert_with_msg(channel != nullptr, "machnet_attach() failed"); 36 | 37 | ret = machnet_listen(channel, FLAGS_local.c_str(), kPort); 38 | assert_with_msg(ret == 0, "machnet_listen() failed"); 39 | 40 | printf("Listening on %s:%d\n", FLAGS_local.c_str(), kPort); 41 | 42 | if (FLAGS_remote != "") { 43 | printf("Sending message to %s:%d\n", FLAGS_remote.c_str(), kPort); 44 | MachnetFlow flow; 45 | std::string msg = "Hello World!"; 46 | ret = machnet_connect(channel, FLAGS_local.c_str(), FLAGS_remote.c_str(), 47 | kPort, 
&flow); 48 | assert_with_msg(ret == 0, "machnet_connect() failed"); 49 | 50 | const int ret = machnet_send(channel, flow, msg.data(), msg.size()); 51 | if (ret == -1) printf("machnet_send() failed\n"); 52 | } else { 53 | printf("Waiting for message from client\n"); 54 | size_t count = 0; 55 | 56 | while (true) { 57 | std::array buf; 58 | MachnetFlow flow; 59 | const ssize_t ret = machnet_recv(channel, buf.data(), buf.size(), &flow); 60 | assert_with_msg(ret >= 0, "machnet_recvmsg() failed"); 61 | if (ret == 0) { 62 | usleep(10); 63 | continue; 64 | } 65 | 66 | std::string msg(buf.data(), ret); 67 | printf("Received message: %s, count = %zu\n", msg.c_str(), count++); 68 | } 69 | } 70 | 71 | return 0; 72 | } 73 | -------------------------------------------------------------------------------- /examples/requirements.txt: -------------------------------------------------------------------------------- 1 | azure-mgmt-resource 2 | azure-mgmt-compute 3 | azure-mgmt-network 4 | azure-identity 5 | termcolor -------------------------------------------------------------------------------- /examples/rust/.gitignore: -------------------------------------------------------------------------------- 1 | # Compiled files 2 | /target/ 3 | 4 | # Remove Cargo.lock from gitignore if creating an executable, keep it for libraries 5 | # Cargo.lock 6 | 7 | # Generated by Cargo 8 | **/Cargo.lock 9 | 10 | # Generated by diesel_cli 11 | **/diesel.toml 12 | 13 | # Environment variables 14 | .env 15 | .env.* 16 | 17 | # Backup files 18 | *.swp 19 | *.swo 20 | 21 | # IDE and editor files 22 | .vscode/ 23 | .idea/ 24 | *.iml 25 | *.user 26 | *.vs/ 27 | *.sublime-workspace 28 | 29 | # Mac specific 30 | .DS_Store 31 | 32 | # Windows specific 33 | Thumbs.db 34 | Desktop.ini 35 | 36 | # npm packages for wasm 37 | node_modules/ 38 | package-lock.json 39 | package.json 40 | -------------------------------------------------------------------------------- /examples/rust/Cargo.toml: 
-------------------------------------------------------------------------------- 1 | [package] 2 | name = "msg_gen" 3 | version = "0.1.0" 4 | edition = "2021" 5 | 6 | # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html 7 | 8 | [dependencies] 9 | clap = {version = "4.5.1", features = ["derive"]} 10 | env_logger = "0.11.2" 11 | hdrhistogram = "7.5.4" 12 | lazy_static = "1.4.0" 13 | log = "0.4.20" 14 | machnet = "0.1.8" 15 | signal-hook = "0.3.17" 16 | -------------------------------------------------------------------------------- /examples/rust/README.md: -------------------------------------------------------------------------------- 1 | # `msg_gen` Application 2 | 3 | This is a simple message generator application that uses the Machnet stack. 4 | 5 | ## Prerequisites 6 | 7 | Before you begin, ensure you have met the following requirements: 8 | 9 | - Successful build of the Machnet project (see main [README](../../README.md)). 10 | - A Machnet stack instance must already be running on the machine that needs to use this message generator application. 11 | - You can find information on how to run the Machnet stack in the [Machnet README](../../README.md). 12 | 13 | ## Building and Running the Application 14 | 15 | The `msg_gen` application is built using cargo. 
16 | 17 | ```bash 18 | cargo build --release 19 | # Alternatively, for a debug build, just run: 20 | # cargo build 21 | ``` 22 | 23 | To see the available options, run: 24 | 25 | ```bash 26 | ./target/release/msg_gen --help 27 | ``` 28 | 29 | To send 64-byte messages from a client (`10.0.255.110`) to a server (`10.0.255.111`) and receive them back, follow this example: 30 | 31 | Run on the server: 32 | 33 | ```bash 34 | ./target/release/msg_gen --local-ip 10.0.255.111 --local-port 1111 --msg-size 64 35 | ``` 36 | 37 | Run on the client: 38 | 39 | ```bash 40 | ./target/release/msg_gen --local-ip 10.0.255.110 --local-port 1111 --remote-ip 10.0.255.111 --remote-port 1111 --msg-size 64 41 | ``` 42 | 43 | ### Benchmark 44 | 45 | We measured the `p99` latency of the Rust and C++ implementations for message sizes from 64 bytes to 64 KB. 46 | The results are as follows: 47 | 48 | | Message Size (bytes) | Rust Latency | C++ Latency | 49 | | -------------------- | ------------ | ----------- | 50 | | 64 | 51 | 51 | 51 | | 128 | 52 | 53 | 52 | | 256 | 49 | 48 | 53 | | 512 | 52 | 51 | 54 | | 1024 | 53 | 52 | 55 | | 2048 | 61 | 61 | 56 | | 4096 | 68 | 67 | 57 | | 8192 | 89 | 87 | 58 | | 16384 | 171 | 171 | 59 | | 32768 | 321 | 323 | 60 | | 65536 | 627 | 631 | 61 | 62 | Latency 63 | -------------------------------------------------------------------------------- /examples/rust/image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/microsoft/machnet/7d047486538493d3a8aedc2cef9d9329b409c2e2/examples/rust/image.png -------------------------------------------------------------------------------- /machnet.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # Start the Machnet service on this machine 3 | # Usage: machnet.sh --mac --ip 4 | # - mac: MAC address of the local DPDK interface 5 | # - ip: IP address of the local DPDK interface 6 | # - debug: if set, run a debug build from the Machnet Docker 
container 7 | # - bare_metal: if set, will use local binary instead of Docker image 8 | 9 | LOCAL_MAC="" 10 | LOCAL_IP="" 11 | BARE_METAL=0 12 | DEBUG=0 13 | while [[ $# -gt 0 ]]; do 14 | key="$1" 15 | case $key in 16 | -m|--mac) 17 | LOCAL_MAC="$2" 18 | shift 19 | shift 20 | ;; 21 | -i|--ip) 22 | LOCAL_IP="$2" 23 | shift 24 | shift 25 | ;; 26 | -b|--bare_metal) 27 | BARE_METAL=1 28 | shift 29 | ;; 30 | -d|--debug) 31 | DEBUG=1 32 | shift 33 | ;; 34 | *) 35 | echo "Unknown option $key" 36 | exit 1 37 | ;; 38 | esac 39 | done 40 | 41 | # Pre-flight checks 42 | if [ -z "$LOCAL_MAC" ] || [ -z "$LOCAL_IP" ]; then 43 | echo "Usage: machnet.sh --mac --ip " 44 | exit 1 45 | fi 46 | 47 | # 48 | # Hugepage allocation 49 | # 50 | 51 | #Allocate memory for the first NUMA node 52 | if ! cat /sys/devices/system/node/*/meminfo | grep HugePages_Total | grep -q 1024 53 | then 54 | echo "Insufficient or no hugepages available" 55 | read -p "Do you want to allocate 1024 2MB hugepages? (y/n) " -n 1 -r 56 | echo 57 | if [[ ! $REPLY =~ ^[Yy]$ ]] 58 | then 59 | echo "OK, continuing without allocating hugepages" 60 | else 61 | echo "Allocating 1024 hugepages" 62 | sudo bash -c "echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages" 63 | if ! cat /sys/devices/system/node/*/meminfo | grep HugePages_Total | grep -q 1024 64 | then 65 | echo "Failed to allocate hugepages" 66 | exit 1 67 | else 68 | echo "Successfully allocated 1024 hugepages on NUMA node0" 69 | fi 70 | fi 71 | fi 72 | 73 | # Allocate memory for the rest of the NUMA nodes, if any 74 | for n in /sys/devices/system/node/node[1-9]; do 75 | if [ -d "$n" ]; then 76 | sudo bash -c "echo 1024 > $n/hugepages/hugepages-2048kB/nr_hugepages" 77 | if ! 
cat $n/meminfo | grep HugePages_Total | grep -q 1024 78 | then 79 | echo "Failed to allocate hugepages on NUMA `echo $n | cut -d / -f 6`" 80 | exit 1 81 | else 82 | echo "Successfully allocated 1024 hugepages on NUMA `echo $n | cut -d / -f 6`" 83 | fi 84 | fi 85 | done 86 | 87 | 88 | echo "Starting Machnet with local MAC $LOCAL_MAC and IP $LOCAL_IP" 89 | 90 | if [ ! -d "/var/run/machnet" ]; then 91 | echo "Creating /var/run/machnet" 92 | sudo mkdir -p /var/run/machnet 93 | sudo chmod 755 /var/run/machnet # Set permissions like Ubuntu's default, needed on (e.g.) CentOS 94 | fi 95 | 96 | sudo bash -c "echo '{\"machnet_config\": {\"$LOCAL_MAC\": {\"ip\": \"$LOCAL_IP\"}}}' > /var/run/machnet/local_config.json" 97 | echo "Created config for local Machnet, in /var/run/machnet/local_config.json. Contents:" 98 | sudo cat /var/run/machnet/local_config.json 99 | 100 | if [ $BARE_METAL -eq 1 ]; then 101 | echo "Starting Machnet in bare-metal mode" 102 | THIS_SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )" 103 | machnet_bin="${THIS_SCRIPT_DIR}/build/src/apps/machnet/machnet" 104 | 105 | if [ ! -f ${machnet_bin} ]; then 106 | echo "Machnet binary ${machnet_bin} not found, please build Machnet first" 107 | exit 1 108 | fi 109 | 110 | sudo ${machnet_bin} --config_json /var/run/machnet/local_config.json 111 | else 112 | if ! command -v docker &> /dev/null 113 | then 114 | echo "Please install docker" 115 | exit 116 | fi 117 | 118 | if ! groups | grep -q docker; then 119 | echo "Please add the current user to the docker group" 120 | exit 121 | fi 122 | 123 | echo "Checking if the Machnet Docker image is available" 124 | if ! 
docker pull ghcr.io/microsoft/machnet/machnet:latest 125 | then 126 | echo "Please make sure you have access to the Machnet Docker image at ghcr.io/microsoft/machnet/" 127 | echo "See Machnet README for instructions on how to get access" 128 | fi 129 | 130 | if [ $DEBUG -eq 1 ]; then 131 | echo "Using debug build from Docker image" 132 | machnet_bin="/root/machnet/debug_build/src/apps/machnet/machnet" 133 | else 134 | echo "Using release build from Docker image" 135 | machnet_bin="/root/machnet/release_build/src/apps/machnet/machnet" 136 | fi 137 | 138 | sudo docker run --privileged --net=host \ 139 | -v /dev/hugepages:/dev/hugepages \ 140 | -v /var/run/machnet:/var/run/machnet \ 141 | ghcr.io/microsoft/machnet/machnet:latest \ 142 | ${machnet_bin} \ 143 | --config_json /var/run/machnet/local_config.json 144 | fi 145 | -------------------------------------------------------------------------------- /src/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | set(CMAKE_C_STANDARD 11) 2 | set(CMAKE_C_STANDARD_REQUIRED ON) 3 | set(CMAKE_CXX_STANDARD 20) 4 | set(CMAKE_CXX_STANDARD_REQUIRED ON) 5 | 6 | set(CMAKE_CXX_EXTENSIONS OFF) 7 | set(CMAKE_EXPORT_COMPILE_COMMANDS ON) 8 | 9 | # Select flags. 10 | set(CMAKE_C_FLAGS "-Wall") 11 | set(CMAKE_C_FLAGS_RELEASE "-O3 -DNDEBUG") 12 | set(CMAKE_C_FLAGS_RELWITHDEBINFO "-O3 -DNDEBUG -g") 13 | set(CMAKE_C_FLAGS_DEBUG "-O0 -g -DDEBUG -fno-omit-frame-pointer -fsanitize=address") 14 | 15 | set(CMAKE_CXX_FLAGS "-Wall -fno-rtti -fno-exceptions") 16 | set(CMAKE_CXX_FLAGS_RELEASE "-O3 -DNDEBUG -Wno-unused-value") 17 | set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "-O3 -DNDEBUG -g -Wno-unused-value") 18 | set(CMAKE_CXX_FLAGS_DEBUG "-O0 -g -fno-omit-frame-pointer -fsanitize=address -DDEBUG") 19 | set(CMAKE_LINKER_FLAGS_DEBUG "${CMAKE_LINKER_FLAGS_DEBUG} -fno-omit-frame-pointer -fsanitize=address") 20 | 21 | # Include 'libdpdk'. 
22 | find_package(PkgConfig REQUIRED) 23 | pkg_check_modules(LIBDPDK_STATIC libdpdk>=23.11 libdpdk<24.0 REQUIRED IMPORTED_TARGET) 24 | 25 | include_directories(ext) 26 | add_subdirectory(ext) 27 | 28 | # Check if DPDK is defined, since it's needed for core/. Else build only non-core 29 | # non-DPDK parts. 30 | if(LIBDPDK_STATIC_FOUND) 31 | include_directories(include) 32 | add_subdirectory(core) 33 | add_subdirectory(benchmark) 34 | enable_testing() 35 | add_subdirectory(tests) 36 | add_subdirectory(apps) 37 | add_subdirectory(tools) 38 | else() 39 | message(WARNING "DPDK not found and RTE_SDK is not set. Building only msg_gen.") 40 | add_subdirectory(apps/msg_gen) 41 | endif() 42 | -------------------------------------------------------------------------------- /src/apps/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | include_directories(../modules) 2 | 3 | add_subdirectory(machnet) 4 | add_subdirectory(msg_gen) 5 | add_subdirectory(rocksdb_server) 6 | -------------------------------------------------------------------------------- /src/apps/machnet/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | set(target_name machnet) 2 | add_executable (${target_name} main.cc) 3 | target_link_libraries(${target_name} PUBLIC core glog rt gflags::gflags) 4 | target_include_directories(${target_name} PUBLIC ${GFLAGS_INCLUDE_DIR}) 5 | -------------------------------------------------------------------------------- /src/apps/machnet/README.md: -------------------------------------------------------------------------------- 1 | # Machnet: A high-performance network stack as a sidecar 2 | 3 | This README is a work in progress. 4 | 5 | Here, you can find details on how to run the Machnet service. 6 | 7 | ## Prerequisites 8 | 9 | Successful build of the `Machnet` project (see main [README](../../../README.md)). 
10 | 11 | 12 | ## Running the stack 13 | 14 | ### Configuration 15 | 16 | The Machnet controller loads the configuration from a `JSON` file. The configuration 17 | is straightforward. Each entry in the `machnet_config` dictionary corresponds to 18 | a network interface that is managed by Machnet. The key is the MAC address of 19 | the interface, and the value is a dictionary with the following fields: 20 | * `ip`: The IP address of the interface. 21 | * `engine_threads`: The number of threads (and NIC HW queues) to use for this interface. 22 | * `cpu_mask`: The CPU mask used to set the affinity of all engine threads. If not specified, the default is to use all available cores. 23 | 24 | **Example [config.json](config.json):** 25 | ```json 26 | { 27 | "machnet_config": { 28 | "60:45:bd:0f:d7:6e": { 29 | "ip": "10.0.255.10", 30 | "engine_threads": 1 31 | } 32 | } 33 | } 34 | ``` 35 | 36 | The configuration is shared with other applications (for example, 37 | [msg_gen](../msg_gen/), [pktgen](../pktgen)). 38 | 39 | **Attention:** When running in Microsoft Azure, the recommended DPDK driver for the accelerated NIC is [`hn_netvsc`](https://doc.dpdk.org/guides/nics/netvsc.html). To use the `NETVSC PMD`, all relevant `VMBUS` devices need to be bound to the userspace I/O driver (`uio_hv_generic`). To do this once, run the following commands: 40 | ```bash 41 | # Assuming `eth1` is the interface to be used by Machnet: 42 | DEV_UUID=$(basename $(readlink /sys/class/net/eth1/device)) 43 | driverctl -b vmbus set-override $DEV_UUID uio_hv_generic 44 | ``` 45 | 46 | 47 | ### Running 48 | 49 | The Machnet stack is run by the `machnet` binary. You can see the available options by running `machnet --help`. 50 | 51 | The following command will run the Machnet stack. The stack doesn't initialize any channels or listeners by default. Those are created and destroyed on demand by the applications that use Machnet.
The `machnet` binary will run in the foreground, and will print logs to `stderr` if the `GLOG_logtostderr` option is set. 52 | 53 | ```bash 54 | cd ${REPOROOT}/build/ 55 | sudo GLOG_logtostderr=1 ./src/apps/machnet/machnet 56 | 57 | # If run from a different directory, you may need to specify the path to the config file: 58 | sudo GLOG_logtostderr=1 ./some/path/machnet --config_json ${REPOROOT}/src/apps/machnet/config.json 59 | ``` 60 | 61 | You should be able to `ping` Machnet from a remote machine in the same subnet. 62 | 63 | To redirect log output to a file in `/tmp`, omit the `GLOG_logtostderr` option. 64 | 65 | You can find an example of an application that uses the Machnet stack in [msg_gen](../msg_gen/). 66 | -------------------------------------------------------------------------------- /src/apps/machnet/config.json: -------------------------------------------------------------------------------- 1 | { 2 | "machnet_config": { 3 | "00:00:00:00:00:00": { 4 | "ip": "172.17.0.1", 5 | "engine_threads": 1 6 | } 7 | } 8 | } 9 | -------------------------------------------------------------------------------- /src/apps/machnet/main.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file main.cc 3 | * @brief Machnet stack main entry point.
4 | */ 5 | #include 6 | #include 7 | #include 8 | 9 | DEFINE_string(config_json, "../src/apps/machnet/config.json", 10 | "JSON file with Machnet-related parameters."); 11 | 12 | int main(int argc, char *argv[]) { 13 | ::google::InitGoogleLogging(argv[0]); 14 | gflags::SetUsageMessage("Main Machnet daemon."); 15 | gflags::ParseCommandLineFlags(&argc, &argv, true); 16 | FLAGS_logtostderr = 1; 17 | 18 | juggler::MachnetController *controller = 19 | CHECK_NOTNULL(juggler::MachnetController::Create(FLAGS_config_json)); 20 | controller->Run(); 21 | 22 | juggler::MachnetController::ReleaseInstance(); 23 | return (0); 24 | } 25 | -------------------------------------------------------------------------------- /src/apps/msg_gen/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | set(target_name msg_gen) 2 | add_executable (${target_name} main.cc) 3 | target_link_libraries(${target_name} PUBLIC glog machnet_shim rt hdr_histogram gflags) 4 | -------------------------------------------------------------------------------- /src/apps/msg_gen/README.md: -------------------------------------------------------------------------------- 1 | # Message Generator (msg_gen) 2 | 3 | This is a simple message generator application that uses the Machnet stack. 4 | 5 | ## Prerequisites 6 | 7 | Successful build of the `Machnet` project (see main [README](../../../README.md)). 8 | 9 | A `Machnet` stack instance must already be running on the machine that needs to use this 10 | message generator application. You can find information on how to run the 11 | `Machnet` stack in the [README](../machnet/README.md). 12 | 13 | 14 | ## Running the application 15 | 16 | The `msg_gen` application is run by the `msg_gen` binary. You can see the 17 | available options by running `msg_gen --help`.
18 | 19 | ### Sending messages between two machines 20 | 21 | In the example below, a server with IP `10.0.0.1` sends messages to the remote 22 | server with IP `10.0.0.2`, which bounces them back. 23 | 24 | ```bash 25 | # On machine `10.0.0.2` (bouncing): 26 | cd ${REPOROOT}/build/ 27 | sudo GLOG_logtostderr=1 ./src/apps/msg_gen/msg_gen --local_ip 10.0.0.2 28 | # GLOG_logtostderr=1 can be omitted if you want to log to a file instead of stderr. 29 | 30 | # On machine `10.0.0.1` (sender): 31 | cd ${REPOROOT}/build/ 32 | sudo GLOG_logtostderr=1 ./src/apps/msg_gen/msg_gen --local_ip 10.0.0.1 --remote_ip 10.0.0.2 33 | 34 | ``` 35 | -------------------------------------------------------------------------------- /src/apps/rocksdb_server/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | set(target_name rocksdb_server) 2 | 3 | # Try to find a RocksDB installation 4 | set(ROCKSDB_INSTALL_DIR /mnt/ankalia/rocksdb/local_install) 5 | list(APPEND CMAKE_PREFIX_PATH ${ROCKSDB_INSTALL_DIR}/lib/cmake/rocksdb) 6 | if (NOT EXISTS ${ROCKSDB_INSTALL_DIR}/lib/cmake/rocksdb) 7 | message(WARNING "RocksDB installation not found at ${ROCKSDB_INSTALL_DIR}.
Not building rocksdb_server example app.") 8 | return() 9 | endif() 10 | 11 | find_package(RocksDB REQUIRED) 12 | add_executable (${target_name} rocksdb_server.cc) 13 | target_link_libraries(${target_name} PUBLIC glog core machnet_shim rt gflags ${LIBDPDK_LIBRARIES} RocksDB::rocksdb) 14 | -------------------------------------------------------------------------------- /src/apps/rocksdb_server/rocksdb_server.cc: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | 18 | DEFINE_int32(num_keys, 1000, "Number of keys to insert and retrieve"); 19 | DEFINE_int32(num_probes, 10000, "Number of probes to perform"); 20 | DEFINE_int32(value_size, 200, "Size of value in bytes"); 21 | DEFINE_string(local, "", "Local IP address, needed for Machnet only"); 22 | DEFINE_string(transport, "machnet", "Transport to use (machnet, udp)"); 23 | 24 | static constexpr uint16_t kPort = 888; 25 | const char kNsaasRocksDbServerFile[] = "/tmp/testdb"; 26 | 27 | void MachnetTransportServer(rocksdb::DB *db) { 28 | // Initialize machnet and attach 29 | int ret = machnet_init(); 30 | CHECK_EQ(ret, 0) << "machnet_init() failed"; 31 | void *channel = machnet_attach(); 32 | 33 | CHECK(channel != nullptr) << "machnet_attach() failed"; 34 | ret = machnet_listen(channel, FLAGS_local.c_str(), kPort); 35 | CHECK_EQ(ret, 0) << "machnet_listen() failed"; 36 | 37 | // Handle client requests 38 | LOG(INFO) << "Waiting for client requests"; 39 | while (true) { 40 | std::array buf; 41 | MachnetFlow rx_flow; 42 | const ssize_t ret = machnet_recv(channel, buf.data(), buf.size(), &rx_flow); 43 | CHECK_GE(ret, 0) << "machnet_recv() failed"; 44 | if (ret == 0) { 45 | usleep(1); 46 | continue; 47 | } 48 | 49 | std::string key(buf.data(), ret); 50 | VLOG(1) << "Received GET request 
for key: " << key; 51 | 52 | std::string value; 53 | const rocksdb::Status status = db->Get(rocksdb::ReadOptions(), key, &value); 54 | 55 | if (status.ok()) { 56 | MachnetFlow tx_flow; 57 | tx_flow.dst_ip = rx_flow.src_ip; 58 | tx_flow.src_ip = rx_flow.dst_ip; 59 | tx_flow.dst_port = rx_flow.src_port; 60 | tx_flow.src_port = rx_flow.dst_port; 61 | 62 | ssize_t send_ret = 63 | machnet_send(channel, tx_flow, value.data(), value.size()); 64 | VLOG(1) << "Sent value of size " << value.size() << " bytes to client"; 65 | if (send_ret == -1) { 66 | LOG(ERROR) << "machnet_send() failed"; 67 | } 68 | } else { 69 | LOG(ERROR) << "Error retrieving key: " << status.ToString(); 70 | } 71 | } 72 | } 73 | 74 | void UDPTransportServer(rocksdb::DB *db) { 75 | // Create and configure the UDP socket 76 | int sockfd = socket(AF_INET, SOCK_DGRAM, 0); 77 | CHECK_GE(sockfd, 0) << "socket() failed"; 78 | int flags = fcntl(sockfd, F_GETFL, 0); 79 | CHECK_GE(flags, 0) << "fcntl() F_GETFL failed"; 80 | CHECK_GE(fcntl(sockfd, F_SETFL, flags | O_NONBLOCK), 0) 81 | << "fcntl() F_SETFL failed"; 82 | 83 | struct sockaddr_in server_addr; 84 | memset(&server_addr, 0, sizeof(server_addr)); 85 | server_addr.sin_family = AF_INET; 86 | server_addr.sin_addr.s_addr = htonl(INADDR_ANY); 87 | server_addr.sin_port = htons(kPort); 88 | 89 | int ret = bind(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr)); 90 | if (ret < 0) { 91 | LOG(FATAL) << "bind() failed, error: " << strerror(errno); 92 | } 93 | 94 | // Handle client requests 95 | LOG(INFO) << "Waiting for client requests"; 96 | while (true) { 97 | std::array buf; 98 | struct sockaddr_in client_addr; 99 | socklen_t client_addr_len = sizeof(client_addr); 100 | const ssize_t ret = 101 | recvfrom(sockfd, buf.data(), buf.size(), 0, 102 | (struct sockaddr *)&client_addr, &client_addr_len); 103 | if (ret < 0) { 104 | if (errno == EAGAIN || errno == EWOULDBLOCK) { 105 | usleep(1); 106 | continue; 107 | } 108 | LOG(FATAL) << "recvfrom() failed, 
error: " << strerror(errno); 109 | } 110 | 111 | std::string key(buf.data(), ret); 112 | VLOG(1) << "Received GET request for key: " << key; 113 | 114 | std::string value; 115 | const rocksdb::Status status = db->Get(rocksdb::ReadOptions(), key, &value); 116 | 117 | if (status.ok()) { 118 | ssize_t send_ret = 119 | sendto(sockfd, value.data(), value.size(), 0, 120 | (struct sockaddr *)&client_addr, client_addr_len); 121 | VLOG(1) << "Sent value of size " << value.size() << " bytes to client"; 122 | if (send_ret == -1) { 123 | LOG(ERROR) << "sendto() failed"; 124 | } 125 | } else { 126 | LOG(ERROR) << "Error retrieving key: " << status.ToString(); 127 | } 128 | } 129 | } 130 | 131 | int main(int argc, char *argv[]) { 132 | google::InitGoogleLogging(argv[0]); 133 | gflags::ParseCommandLineFlags(&argc, &argv, true); 134 | FLAGS_logtostderr = 1; 135 | 136 | // Initialize RocksDB 137 | rocksdb::DB *db; 138 | rocksdb::Options options; 139 | options.create_if_missing = true; 140 | LOG(INFO) << "Opening RocksDB, file = " << kNsaasRocksDbServerFile; 141 | rocksdb::Status status = 142 | rocksdb::DB::Open(options, kNsaasRocksDbServerFile, &db); 143 | if (!status.ok()) { 144 | LOG(ERROR) << "Error opening RocksDB: " << status.ToString(); 145 | return 1; 146 | } 147 | 148 | // Insert num_keys string keys 149 | LOG(INFO) << "Inserting " << FLAGS_num_keys << " key-value pairs"; 150 | for (int i = 0; i < FLAGS_num_keys; i++) { 151 | std::string key = "key" + std::to_string(i); 152 | std::string value = "value" + std::string(FLAGS_value_size, 'x'); 153 | status = db->Put(rocksdb::WriteOptions(), key, value); 154 | if (!status.ok()) { 155 | LOG(ERROR) << "Error inserting key-value pair: " << status.ToString(); 156 | return 1; 157 | } 158 | } 159 | 160 | if (FLAGS_transport == "machnet") { 161 | MachnetTransportServer(db); 162 | } else if (FLAGS_transport == "udp") { 163 | UDPTransportServer(db); 164 | } else { 165 | LOG(FATAL) << "Unknown transport: " << FLAGS_transport; 166 | } 167 
| 168 | return 0; 169 | } 170 | -------------------------------------------------------------------------------- /src/benchmark/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | file(GLOB_RECURSE BENCHMARK_FILES "${PROJECT_SOURCE_DIR}/src/*_bench.cc" ) 2 | 3 | foreach(bench_name IN LISTS BENCHMARK_FILES) 4 | get_filename_component(bench_bin ${bench_name} NAME_WE) 5 | add_executable(${bench_bin} ${bench_name}) 6 | target_link_libraries(${bench_bin} PUBLIC 7 | core machnet_shim glog gtest benchmark ${LIBDPDK_LIBRARIES} hugetlbfs rt) 8 | endforeach() 9 | -------------------------------------------------------------------------------- /src/core/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | # Create a library called "core" which includes the source file "dpdk.cc". 2 | # The extension is already found. Any number of sources could be listed here. 3 | file(GLOB_RECURSE SRC_FILES ${CMAKE_CURRENT_SOURCE_DIR}/*.cc) 4 | list(FILTER SRC_FILES EXCLUDE REGEX "^.*_test\\.(cc|h)$") 5 | 6 | add_library (core STATIC ${SRC_FILES}) 7 | 8 | set_target_properties(core PROPERTIES 9 | VERSION ${PROJECT_VERSION} 10 | SOVERSION 1 11 | PUBLIC_HEADER dpdk.h) 12 | 13 | find_package(nlohmann_json REQUIRED) 14 | 15 | target_include_directories(core PUBLIC ../include) 16 | target_include_directories(core PRIVATE .) 
17 | target_link_libraries(core PRIVATE) 18 | target_link_libraries(core PRIVATE nlohmann_json::nlohmann_json) 19 | target_link_libraries(core PUBLIC PkgConfig::LIBDPDK_STATIC) 20 | 21 | # link_directories($ENV{RTE_SDK}/$ENV{RTE_TARGET}/lib/) 22 | # find_library(DPDK_LIB NAMES libdpdk.a dpdk) 23 | # target_link_libraries(core PRIVATE dpdk numa dl) 24 | # target_link_libraries(core PRIVATE -Wl,--whole-archive dpdk -Wl,--no-whole-archive numa dl ${IBVERBS} ${LIBMLX4} ${LIBMLX5}) 25 | -------------------------------------------------------------------------------- /src/core/drivers/dpdk/dpdk.cc: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | 8 | namespace juggler { 9 | namespace dpdk { 10 | 11 | void Dpdk::InitDpdk(juggler::utils::CmdLineOpts rte_args) { 12 | if (initialized_) { 13 | LOG(WARNING) << "DPDK is already initialized."; 14 | return; 15 | } 16 | 17 | LOG(INFO) << "Initializing DPDK with args: " << rte_args.ToString(); 18 | int ret = rte_eal_init(rte_args.GetArgc(), rte_args.GetArgv()); 19 | if (ret < 0) { 20 | LOG(FATAL) << "rte_eal_init() failed: ret = " << ret 21 | << " rte_errno = " << rte_errno << " (" 22 | << rte_strerror(rte_errno) << ")"; 23 | } 24 | 25 | // Check if DPDK runs in PA or VA mode. 
26 | if (rte_eal_iova_mode() == RTE_IOVA_VA) { 27 | LOG(INFO) << "DPDK runs in VA mode."; 28 | } else { 29 | LOG(INFO) << "DPDK runs in PA mode."; 30 | } 31 | 32 | ScanDpdkPorts(); 33 | initialized_ = true; 34 | } 35 | 36 | void Dpdk::DeInitDpdk() { 37 | int ret = rte_eal_cleanup(); 38 | if (ret != 0) { 39 | LOG(FATAL) << "rte_eal_cleanup() failed: ret = " << ret 40 | << " rte_errno = " << rte_errno << " (" 41 | << rte_strerror(rte_errno) << ")"; 42 | } 43 | 44 | initialized_ = false; 45 | } 46 | 47 | size_t Dpdk::GetNumPmdPortsAvailable() { return rte_eth_dev_count_avail(); } 48 | 49 | std::optional Dpdk::GetPmdPortIdByMac( 50 | const juggler::net::Ethernet::Address &l2_addr) const { 51 | if (!initialized_) { 52 | LOG(WARNING) << "DPDK is not initialized. Cannot retrieve eth device " 53 | "contextual info."; 54 | return std::nullopt; 55 | } 56 | 57 | std::optional p_id = std::nullopt; 58 | uint16_t port_id; 59 | RTE_ETH_FOREACH_DEV(port_id) { 60 | std::string pci_info; 61 | juggler::net::Ethernet::Address lladdr; 62 | 63 | int ret = rte_eth_macaddr_get( 64 | port_id, reinterpret_cast(lladdr.bytes)); 65 | if (ret != 0) { 66 | LOG(WARNING) 67 | << "rte_eth_macaddr_get() failed. 
Cannot retrieve eth device " 68 | "contextual info for port " 69 | << static_cast(port_id); 70 | break; 71 | } 72 | LOG(INFO) << "looking for " << l2_addr.ToString() << " found " 73 | << lladdr.ToString() << " port " << static_cast(port_id); 74 | 75 | if (lladdr == l2_addr) { 76 | p_id = port_id; 77 | } 78 | } 79 | 80 | return p_id; 81 | } 82 | 83 | } // namespace dpdk 84 | } // namespace juggler 85 | -------------------------------------------------------------------------------- /src/core/drivers/dpdk/dpdk_test.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file dpdk_test.cc 3 | * 4 | * Unit tests for juggler's DPDK helpers 5 | */ 6 | #include "dpdk.h" 7 | 8 | #include 9 | 10 | #include 11 | 12 | #include "packet.h" 13 | #include "pmd.h" 14 | #include "utils.h" 15 | 16 | std::unique_ptr g_tx_pkt_pool; 17 | std::unique_ptr g_pmd; 18 | 19 | TEST(BasicTxTest, BasicTxTest) { 20 | const size_t payload_size = 4000; 21 | juggler::net::Ethernet::Address local_mac_addr("00:11:22:33:44:55"); 22 | juggler::net::Ethernet::Address remote_mac_addr("00:11:22:33:44:55"); 23 | auto local_ipv4_addr = juggler::net::Ipv4::Address::MakeAddress("1.1.1.1"); 24 | CHECK(local_ipv4_addr.has_value()); 25 | auto remote_ipv4_addr = juggler::net::Ipv4::Address::MakeAddress("2.2.2.2"); 26 | CHECK(remote_ipv4_addr.has_value()); 27 | 28 | juggler::dpdk::Packet *pkt = g_tx_pkt_pool->PacketAlloc(); 29 | 30 | const size_t kTotalPacketLen = sizeof(juggler::net::Ethernet) + 31 | sizeof(juggler::net::Ipv4) + payload_size; 32 | auto *data = pkt->append(kTotalPacketLen); 33 | EXPECT_NE(data, nullptr); 34 | 35 | // L2 header 36 | auto *eh = pkt->head_data(); 37 | eh->src_addr = local_mac_addr; 38 | eh->dst_addr = remote_mac_addr; 39 | eh->eth_type = juggler::be16_t(RTE_ETHER_TYPE_IPV4); 40 | 41 | // IPv4 header 42 | auto *ipv4h = pkt->head_data(sizeof(*eh)); 43 | ipv4h->version_ihl = 0x45; 44 | ipv4h->type_of_service = 0; 45 | ipv4h->packet_id = 
juggler::be16_t(0x1513); 46 | ipv4h->fragment_offset = juggler::be16_t(0); 47 | ipv4h->time_to_live = 64; 48 | ipv4h->next_proto_id = juggler::net::Ipv4::Proto::kUdp; 49 | ipv4h->total_length = 50 | juggler::be16_t(sizeof(juggler::net::Ipv4) + 51 | sizeof(juggler::net::Ethernet) + payload_size); 52 | ipv4h->src_addr.address = juggler::be32_t(local_ipv4_addr.value().address); 53 | ipv4h->dst_addr.address = juggler::be32_t(remote_ipv4_addr.value().address); 54 | ipv4h->hdr_checksum = 0; 55 | 56 | EXPECT_EQ(pkt->length(), kTotalPacketLen); 57 | 58 | auto *txring = g_pmd->GetRing(0); 59 | auto ret = txring->TrySendPackets(&pkt, 1); 60 | EXPECT_EQ(ret, 1); 61 | } 62 | 63 | int main(int argc, char **argv) { 64 | testing::InitGoogleTest(&argc, argv); 65 | 66 | auto kEalOpts = juggler::utils::CmdLineOpts( 67 | {"", "-c", "0x0", "-n", "6", "--proc-type=auto", "-m", "1024", "--log-level", 68 | "8", "--vdev=net_null0,copy=1", "--no-pci"}); 69 | 70 | auto d = juggler::dpdk::Dpdk(); 71 | d.InitDpdk(kEalOpts); 72 | 73 | g_pmd.reset(new juggler::dpdk::PmdPort( 74 | 0 /* because of PCIe allowlist, we have 1 port */)); 75 | CHECK_NOTNULL(g_pmd); 76 | g_pmd->InitDriver(); 77 | 78 | g_tx_pkt_pool.reset(new juggler::dpdk::PacketPool( 79 | juggler::dpdk::PmdRing::kDefaultRingDescNr * 2, 80 | juggler::dpdk::PmdRing::kJumboFrameSize + RTE_PKTMBUF_HEADROOM)); 81 | CHECK_NOTNULL(g_tx_pkt_pool); 82 | 83 | return RUN_ALL_TESTS(); 84 | } 85 | -------------------------------------------------------------------------------- /src/core/drivers/dpdk/packet_pool.cc: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | 7 | #include 8 | 9 | namespace juggler { 10 | namespace dpdk { 11 | 12 | [[maybe_unused]] static rte_mempool* CreateSpScPacketPool( 13 | const std::string& name, uint32_t nmbufs, uint16_t mbuf_data_size) { 14 | struct rte_mempool* mp; 15 | struct rte_pktmbuf_pool_private mbp_priv; 16 | 17 
| const uint16_t priv_size = 0; 18 | const size_t elt_size = sizeof(struct rte_mbuf) + priv_size + mbuf_data_size; 19 | memset(&mbp_priv, 0, sizeof(mbp_priv)); 20 | mbp_priv.mbuf_data_room_size = mbuf_data_size; 21 | mbp_priv.mbuf_priv_size = priv_size; 22 | 23 | const unsigned int kMemPoolFlags = 24 | RTE_MEMPOOL_F_SC_GET | RTE_MEMPOOL_F_SP_PUT; 25 | mp = rte_mempool_create(name.c_str(), nmbufs, elt_size, 0, sizeof(mbp_priv), 26 | rte_pktmbuf_pool_init, &mbp_priv, rte_pktmbuf_init, 27 | NULL, rte_socket_id(), kMemPoolFlags); 28 | if (mp == nullptr) { 29 | LOG(ERROR) << "rte_mempool_create() failed. "; 30 | return nullptr; 31 | } 32 | 33 | return mp; 34 | } 35 | 36 | uint16_t PacketPool::next_id_ = 0; 37 | 38 | // 'id' of the PacketPool usually refers to the thread id. 39 | // 'nmbufs' is the number of mbufs to allocate in the backing pool. 40 | // 'mbuf_size' the size of an mbuf buffer. (MBUF_DATASZ_DEFAULT is the minimum) 41 | PacketPool::PacketPool(uint32_t nmbufs, uint16_t mbuf_size, 42 | const char* mempool_name) 43 | : is_dpdk_primary_process_(rte_eal_process_type() == RTE_PROC_PRIMARY) { 44 | if (is_dpdk_primary_process_) { 45 | // Create mempool here, choose the name automatically 46 | id_ = ++next_id_; 47 | std::string mpool_name = "mbufpool" + std::to_string(id_); 48 | LOG(INFO) << "[ALLOC] [type:mempool, name:" << mpool_name 49 | << ", nmbufs:" << nmbufs << ", mbuf_size:" << mbuf_size << "]"; 50 | // mpool_ = rte_pktmbuf_pool_create(mpool_name.c_str(), nmbufs, 0, 0, 51 | // mbuf_size, SOCKET_ID_ANY); 52 | mpool_ = CreateSpScPacketPool(mpool_name, nmbufs, mbuf_size); 53 | CHECK(mpool_) << "Failed to create packet pool."; 54 | } else { 55 | // Lookup mempool created earlier by the primary 56 | mpool_ = rte_mempool_lookup(mempool_name); 57 | if (mpool_ == nullptr) { 58 | LOG(FATAL) << "[LOOKUP] [type: mempool, name: " << mempool_name 59 | << "] failed. 
rte_errno = " << rte_errno << " (" 60 | << rte_strerror(rte_errno) << ")"; 61 | } else { 62 | LOG(INFO) << "[LOOKUP] [type: mempool, name " << mempool_name 63 | << "] successful. num mbufs " << mpool_->size << ", mbuf size " 64 | << mpool_->elt_size; 65 | } 66 | } 67 | } 68 | 69 | PacketPool::~PacketPool() { 70 | LOG(INFO) << "[FREE] [type:mempool, name:" << this->GetPacketPoolName() 71 | << "]"; 72 | if (is_dpdk_primary_process_) rte_mempool_free(mpool_); 73 | } 74 | 75 | } // namespace dpdk 76 | } // namespace juggler 77 | -------------------------------------------------------------------------------- /src/core/drivers/shm/shmem.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file shm.cc 3 | * 4 | * Implementation of juggler's POSIX shared memory driver's methods. 5 | */ 6 | #include 7 | namespace juggler { 8 | namespace shm { 9 | 10 | bool ShMem::Init() { 11 | // First, create (or open) the POSIX shared memory object. 12 | shmem_.reset(new Shm(name_, size_)); 13 | CHECK_NOTNULL(shmem_); 14 | auto shmem_fd = shmem_.get()->fd; 15 | if (shmem_fd == -1) return false; 16 | auto shmem_size = shmem_.get()->size; 17 | 18 | // Now, map the shared memory object into the process's address space. 19 | mem_region_.reset(new Mmap(shmem_size, shmem_fd)); 20 | CHECK_NOTNULL(mem_region_); 21 | 22 | // We could not mmap. 23 | if (mem_region_.get()->mem == nullptr) return false; 24 | 25 | // Lock shared memory object to RAM. 
26 | if (mlock(mem_region_.get()->mem, mem_region_.get()->size) != 0) 27 | LOG(WARNING) << juggler::utils::Format( 28 | "Could not mlock() shared memory object (errno: %d)", errno); 29 | 30 | return true; 31 | } 32 | 33 | } // namespace shm 34 | } // namespace juggler 35 | -------------------------------------------------------------------------------- /src/core/drivers/shm/shmem_test.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file shm_test.cc 3 | * 4 | * Unit tests for juggler's POSIX shared memory driver. 5 | */ 6 | #include 7 | #include 8 | #include 9 | #include 10 | #include 11 | 12 | #include 13 | 14 | DEFINE_int32(shm_size, 1 << 20, "Size of shared memory object"); 15 | 16 | TEST(BasicShmTest, BasicShmTest) { 17 | juggler::shm::ShMem region("test_region", FLAGS_shm_size); 18 | EXPECT_TRUE(region.Init()) 19 | << "Failed to initialize POSIX shared memory object."; 20 | EXPECT_NE(region.head_data(), nullptr); 21 | } 22 | 23 | TEST(BasicShmTest, BasicShmTest2) { 24 | pid_t pid = fork(); 25 | if (pid != 0) { 26 | // Parent process. 27 | juggler::shm::ShMem region("test_region", FLAGS_shm_size); 28 | EXPECT_TRUE(region.Init()) 29 | << "Failed to initialize POSIX shared memory object."; 30 | 31 | // atomic_flag implementation is partial with gcc-8,9. Use atomic 32 | // instead. 33 | auto *flag = region.head_data *>(); 34 | 35 | // Pre-fill the region with a sequence of increasing `size_t' numbers, 36 | // starting from zero. 37 | size_t *data = reinterpret_cast(flag + 1); 38 | auto *mmap_end = region.head_data() + region.length(); 39 | size_t counter = 0; 40 | while (reinterpret_cast(data + 1) <= mmap_end) { 41 | *data = counter++; 42 | data++; 43 | } 44 | 45 | flag->store(true); // Everything is done. 46 | 47 | int wstatus; 48 | waitpid(pid, &wstatus, 0); 49 | 50 | // The child is going to check the pattern, and exit with code 0 on success. 
51 | EXPECT_EQ(WEXITSTATUS(wstatus), 0); 52 | } else { 53 | // Child process. 54 | juggler::shm::ShMem region("test_region", FLAGS_shm_size); 55 | EXPECT_TRUE(region.Init()) 56 | << "Failed to initialize POSIX shared memory object."; 57 | 58 | auto *flag = region.head_data *>(); 59 | 60 | // Note: POSIX shared memory is zero-initialized. Even if child reaches here 61 | // before parent the flag will be "false". 62 | do { 63 | // Parent is working on the shared memory region. 64 | machnet_pause(); 65 | } while (flag->load(std::memory_order_acquire) == false); 66 | 67 | // Check the sequence written by the parent process. 68 | size_t *data = reinterpret_cast(flag + 1); 69 | auto *mmap_end = region.head_data() + region.length(); 70 | size_t counter = 0; 71 | while (reinterpret_cast(data + 1) < mmap_end) { 72 | if (*data != counter++) { 73 | std::cout << *data << " " << counter - 1 << std::endl; 74 | exit(-1); 75 | } 76 | data++; 77 | } 78 | exit(0); // Success. 79 | } 80 | } 81 | 82 | TEST(BasicShMemManagerTest, ShMemManagerDupNameAllocTest) { 83 | // Check that we cannot allocate two shared memory objects with the same name. 84 | juggler::shm::ShMemManager manager; 85 | const std::string shmem_name = "test_region"; 86 | const size_t shmem_size = 16384; 87 | 88 | auto shmem_object = manager.Alloc(shmem_name, shmem_size); 89 | EXPECT_NE(shmem_object, nullptr); 90 | 91 | shmem_object = manager.Alloc(shmem_name, shmem_size); 92 | EXPECT_EQ(shmem_object, nullptr) 93 | << "Allocated two shared memory objects with the same name."; 94 | } 95 | 96 | TEST(BasicShMemManagerTest, ShMemManagerAllocFreeTest) { 97 | // Check that a name can be reused once its shared memory object is freed. 98 | juggler::shm::ShMemManager manager; 99 | const std::string shmem_name = "test_region"; 100 | const size_t shmem_size = 16384; 101 | 102 | // Allocate the first shared memory object.
103 | auto shmem_object = manager.Alloc(shmem_name, shmem_size); 104 | EXPECT_NE(shmem_object, nullptr); 105 | 106 | // Release the object. 107 | manager.Free(shmem_name); 108 | 109 | // Check that we are allowed to create a shared memory object with the same 110 | // name after we have released it. 111 | shmem_object = manager.Alloc(shmem_name, shmem_size); 112 | EXPECT_NE(shmem_object, nullptr); 113 | } 114 | 115 | TEST(BasicShMemManagerTest, ShMemManagerOverflowTest) { 116 | juggler::shm::ShMemManager manager; 117 | 118 | const size_t shmem_object_size = 16384; 119 | for (unsigned int i = 0; i < juggler::shm::ShMemManager::kMaxShMemObjects; 120 | i++) { 121 | auto name = std::string("test_region_") + std::to_string(i); 122 | auto shmem_object = manager.Alloc(name, shmem_object_size); 123 | EXPECT_NE(shmem_object, nullptr); 124 | } 125 | 126 | // Reached the maximum of shared memory objects. 127 | auto test_region = 128 | manager.Alloc(std::string("test_region"), shmem_object_size); 129 | EXPECT_EQ(test_region, nullptr); 130 | } 131 | 132 | int main(int argc, char **argv) { 133 | ::google::InitGoogleLogging(argv[0]); 134 | testing::InitGoogleTest(&argc, argv); 135 | gflags::ParseCommandLineFlags(&argc, &argv, true); 136 | 137 | int ret = RUN_ALL_TESTS(); 138 | return ret; 139 | } 140 | -------------------------------------------------------------------------------- /src/core/machnet_config.cc: -------------------------------------------------------------------------------- 1 | #include "machnet_config.h" 2 | 3 | #include 4 | #include 5 | #include 6 | #include 7 | 8 | #include "dpdk.h" 9 | #include "ether.h" 10 | 11 | namespace juggler { 12 | 13 | static std::optional GetPCIeAddressSysfs( 14 | const juggler::net::Ethernet::Address &l2_addr) { 15 | // Note: This works on Azure even after we unbind the NIC (e.g., `eth1`) 16 | // because a "fake" sibling interface with the same MAC and PCIe address 17 | // remains in sysfs. 
18 | LOG(INFO) << "Walking /sys/class/net to find PCIe address for L2 address " 19 | << l2_addr.ToString(); 20 | 21 | std::string sys_net_path = "/sys/class/net"; 22 | for (const auto &entry : std::filesystem::directory_iterator(sys_net_path)) { 23 | std::string interface_name = entry.path().filename(); 24 | LOG(INFO) << "Checking interface " << interface_name; 25 | 26 | // Check the address file (contains the MAC) for this path. 27 | std::string address_fname = entry.path() / "address"; 28 | if (!std::filesystem::exists(address_fname)) { 29 | continue; 30 | } 31 | 32 | std::ifstream address_file(address_fname); 33 | // Get the interface's MAC address from the address file. 34 | std::string interface_l2_addr; 35 | address_file >> interface_l2_addr; 36 | 37 | const juggler::net::Ethernet::Address interface_addr(interface_l2_addr); 38 | if (interface_addr == l2_addr) { 39 | // Get the PCI address of the interface. 40 | std::string dev_uevent_fname = entry.path() / "device" / "uevent"; 41 | if (!std::filesystem::exists(dev_uevent_fname)) { 42 | continue; 43 | } 44 | std::ifstream uevent_file(dev_uevent_fname); 45 | // Find the PCI address in the uevent file. 
It is a single line in the 46 | // form: 47 | // PCI_SLOT_NAME=0000:00:1f.0 48 | std::string pcie_addr; 49 | while (uevent_file >> pcie_addr) { 50 | if (pcie_addr.find("PCI_SLOT_NAME") != std::string::npos) { 51 | return pcie_addr.substr(pcie_addr.find("=") + 1); 52 | } 53 | } 54 | } 55 | } 56 | 57 | LOG(WARNING) << "Failed to get PCI address for L2 address " 58 | << l2_addr.ToString() << " by the sysfs method."; 59 | return std::nullopt; 60 | } 61 | 62 | MachnetConfigProcessor::MachnetConfigProcessor( 63 | const std::string &config_json_filename) 64 | : config_json_filename_(config_json_filename), interfaces_config_{} { 65 | std::ifstream config_json_file(config_json_filename); 66 | CHECK(config_json_file.is_open()) 67 | << " Failed to open config JSON file " << config_json_filename << "."; 68 | 69 | config_json_file >> json_; 70 | AssertJsonValidMachnetConfig(); 71 | DiscoverInterfaceConfiguration(); 72 | } 73 | 74 | void MachnetConfigProcessor::AssertJsonValidMachnetConfig() { 75 | if (json_.find(kMachnetConfigJsonKey) == json_.end()) { 76 | LOG(FATAL) << "No entry for Machnet config (key " << kMachnetConfigJsonKey 77 | << ") in " << config_json_filename_; 78 | } 79 | 80 | for (const auto &interface : json_.at(kMachnetConfigJsonKey)) { 81 | if (interface.find("ip") == interface.end()) { 82 | LOG(FATAL) << "No IP address for " << interface << " in " 83 | << config_json_filename_; 84 | } 85 | for (const auto &[key, _] : interface.items()) { 86 | if (key != "ip" && key != "engine_threads" && key != "cpu_mask" && 87 | key != "pcie") { 88 | LOG(FATAL) << "Invalid key " << key << " in " << interface << " in " 89 | << config_json_filename_; 90 | } 91 | } 92 | } 93 | } 94 | 95 | void MachnetConfigProcessor::DiscoverInterfaceConfiguration() { 96 | for (const auto &[key, json_val] : json_.at(kMachnetConfigJsonKey).items()) { 97 | const net::Ethernet::Address l2_addr(key); 98 | size_t engine_threads = 1; 99 | cpu_set_t cpu_mask = NetworkInterfaceConfig::kDefaultCpuMask; 
100 | 101 | net::Ipv4::Address ip_addr; 102 | CHECK(ip_addr.FromString(json_val.at("ip"))); 103 | 104 | if (json_val.find("engine_threads") != json_val.end()) { 105 | engine_threads = json_val.at("engine_threads"); 106 | LOG(INFO) << "Using " << engine_threads << " engine threads for " 107 | << l2_addr.ToString(); 108 | } else { 109 | LOG(INFO) << "Using default engine threads = " << engine_threads 110 | << " for " << l2_addr.ToString(); 111 | } 112 | 113 | if (json_val.find("cpu_mask") != json_val.end()) { 114 | std::string cpu_mask_str = json_val.at("cpu_mask"); 115 | const size_t cpu_mask_val = 116 | std::stoull(cpu_mask_str.c_str(), nullptr, 16); 117 | cpu_mask = utils::calculate_cpu_mask(cpu_mask_val); 118 | LOG(INFO) << "Using CPU mask " << cpu_mask_str << " for " 119 | << l2_addr.ToString(); 120 | } else { 121 | LOG(INFO) << "Using default CPU mask for " << l2_addr.ToString(); 122 | } 123 | 124 | std::string pci_addr = ""; 125 | if (json_val.find("pcie") != json_val.end()) { 126 | pci_addr = json_val.at("pcie"); 127 | LOG(INFO) << "Using config file PCIe address " << pci_addr << " for " 128 | << l2_addr.ToString(); 129 | } else { 130 | const std::optional ret = GetPCIeAddressSysfs(l2_addr); 131 | if (ret.has_value()) { 132 | pci_addr = ret.value(); 133 | } else { 134 | LOG(WARNING) << "Failed to get PCIe address from sysfs for L2 address " 135 | << l2_addr.ToString() 136 | << ", and no PCIe address specified in config file."; 137 | } 138 | } 139 | 140 | interfaces_config_.emplace(pci_addr, l2_addr, ip_addr, engine_threads, 141 | cpu_mask); 142 | } 143 | for (const auto &interface : interfaces_config_) { 144 | interface.Dump(); 145 | } 146 | } 147 | 148 | utils::CmdLineOpts MachnetConfigProcessor::GetEalOpts() const { 149 | utils::CmdLineOpts eal_opts{juggler::dpdk::kDefaultEalOpts}; 150 | // TODO(ilias) : What cpu mask to set for EAL? 
151 | eal_opts.Append({"-c", "0x1"}); 152 | eal_opts.Append({"-n", "4"}); 153 | eal_opts.Append({"--telemetry"}); 154 | for (const auto &interface : interfaces_config_) { 155 | if (interface.pcie_addr() != "") { 156 | eal_opts.Append({"-a", interface.pcie_addr()}); 157 | } else { 158 | LOG(WARNING) << "Not passing PCIe allowlist for interface " 159 | << interface.l2_addr().ToString(); 160 | } 161 | } 162 | 163 | return eal_opts; 164 | } 165 | 166 | } // namespace juggler 167 | -------------------------------------------------------------------------------- /src/core/machnet_engine_test.cc: -------------------------------------------------------------------------------- 1 | /** 2 | * @file machnet_engine_test.cc 3 | * 4 | * Unit tests for the MachnetEngine class. 5 | */ 6 | 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | 14 | #include 15 | #include 16 | 17 | constexpr const char *file_name(const char *path) { 18 | const char *file = path; 19 | while (*path) { 20 | if (*path++ == '/') { 21 | file = path; 22 | } 23 | } 24 | return file; 25 | } 26 | 27 | const char *fname = file_name(__FILE__); 28 | 29 | TEST(BasicMachnetEngineSharedStateTest, SrcPortAlloc) { 30 | using EthAddr = juggler::net::Ethernet::Address; 31 | using Ipv4Addr = juggler::net::Ipv4::Address; 32 | using UdpPort = juggler::net::Udp::Port; 33 | using MachnetEngineSharedState = juggler::MachnetEngineSharedState; 34 | 35 | EthAddr test_mac{"00:00:00:00:00:01"}; 36 | Ipv4Addr test_ip; 37 | test_ip.FromString("10.0.0.1"); 38 | 39 | MachnetEngineSharedState state({}, {test_mac}, {test_ip}); 40 | std::vector allocated_ports; 41 | do { 42 | auto port = state.SrcPortAlloc(test_ip, [](uint16_t port) { return true; }); 43 | if (!port.has_value()) break; 44 | allocated_ports.emplace_back(port.value()); 45 | } while (true); 46 | 47 | std::vector expected_ports; 48 | expected_ports.resize(MachnetEngineSharedState::kSrcPortMax - 49 | MachnetEngineSharedState::kSrcPortMin + 1); 
50 | std::iota(expected_ports.begin(), expected_ports.end(), 51 | MachnetEngineSharedState::kSrcPortMin); 52 | 53 | EXPECT_EQ(allocated_ports, expected_ports); 54 | 55 | auto release_allocated_ports = [&state, 56 | &test_ip](std::vector &ports) { 57 | while (!ports.empty()) { 58 | state.SrcPortRelease(test_ip, ports.back()); 59 | ports.pop_back(); 60 | } 61 | }; 62 | release_allocated_ports(allocated_ports); 63 | 64 | // Test whether the lambda condition for port allocation works. 65 | // Try to allocate all ports divisible by 3. 66 | auto is_divisible_by_3 = [](uint16_t port) { return port % 3 == 0; }; 67 | do { 68 | auto port = state.SrcPortAlloc(test_ip, is_divisible_by_3); 69 | if (!port.has_value()) break; 70 | allocated_ports.emplace_back(port.value()); 71 | } while (true); 72 | 73 | expected_ports.clear(); 74 | for (size_t p = MachnetEngineSharedState::kSrcPortMin; 75 | p <= MachnetEngineSharedState::kSrcPortMax; p++) { 76 | if (is_divisible_by_3(p)) { 77 | expected_ports.emplace_back(p); 78 | } 79 | } 80 | 81 | EXPECT_EQ(allocated_ports, expected_ports); 82 | release_allocated_ports(allocated_ports); 83 | 84 | auto illegal_condition = [](uint16_t port) { return port == 0; }; 85 | auto port = state.SrcPortAlloc(test_ip, illegal_condition); 86 | EXPECT_FALSE(port.has_value()); 87 | } 88 | 89 | TEST(BasicMachnetEngineTest, BasicMachnetEngineTest) { 90 | using PmdPort = juggler::dpdk::PmdPort; 91 | using MachnetEngine = juggler::MachnetEngine; 92 | 93 | const uint32_t kChannelRingSize = 1024; 94 | juggler::shm::ChannelManager channel_mgr; 95 | channel_mgr.AddChannel(fname, kChannelRingSize, kChannelRingSize, 96 | kChannelRingSize, kChannelRingSize); 97 | auto channel = channel_mgr.GetChannel(fname); 98 | 99 | juggler::net::Ethernet::Address test_mac("00:00:00:00:00:01"); 100 | juggler::net::Ipv4::Address test_ip; 101 | test_ip.FromString("10.0.0.1"); 102 | std::vector rss_key = {}; 103 | std::vector test_ips = {test_ip}; 104 | auto shared_state = 
std::make_shared( 105 | rss_key, test_mac, test_ips); 106 | const uint32_t kRingDescNr = 1024; 107 | auto pmd_port = std::make_shared(0, 1, 1, kRingDescNr, kRingDescNr); 108 | pmd_port->InitDriver(); 109 | MachnetEngine engine(pmd_port, 0, 0, shared_state, {channel}); 110 | EXPECT_EQ(engine.GetChannelCount(), 1); 111 | } 112 | 113 | int main(int argc, char **argv) { 114 | testing::InitGoogleTest(&argc, argv); 115 | 116 | auto kEalOpts = juggler::utils::CmdLineOpts( 117 | {"", "-c", "0x0", "-n", "6", "--proc-type=auto", "-m", "1024", "--log-level", 118 | "8", "--vdev=net_null0,copy=1", "--no-pci"}); 119 | 120 | auto d = juggler::dpdk::Dpdk(); 121 | d.InitDpdk(kEalOpts); 122 | int ret = RUN_ALL_TESTS(); 123 | return ret; 124 | } 125 | -------------------------------------------------------------------------------- /src/core/net/ether.cc: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | namespace juggler { 4 | namespace net { 5 | 6 | bool Ethernet::Address::FromString(std::string str) { 7 | return kSize == sscanf(str.c_str(), "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx", 8 | &bytes[0], &bytes[1], &bytes[2], &bytes[3], &bytes[4], 9 | &bytes[5]); 10 | } 11 | 12 | std::string Ethernet::Address::ToString() const { 13 | std::string ret; 14 | char addr[18]; 15 | 16 | if (17 == snprintf(addr, sizeof(addr), 17 | "%02hhx:%02hhx:%02hhx:%02hhx:%02hhx:%02hhx", bytes[0], 18 | bytes[1], bytes[2], bytes[3], bytes[4], bytes[5])) { 19 | ret = std::string(addr); 20 | } 21 | 22 | return ret; 23 | } 24 | 25 | std::string Ethernet::ToString() const { 26 | return juggler::utils::Format("[Eth: dst %s, src %s, eth_type %u]", 27 | dst_addr.ToString().c_str(), 28 | src_addr.ToString().c_str(), eth_type.value()); 29 | } 30 | 31 | } // namespace net 32 | } // namespace juggler 33 | -------------------------------------------------------------------------------- /src/core/net/ipv4.cc: 
-------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include 4 | 5 | namespace juggler { 6 | namespace net { 7 | 8 | bool Ipv4::Address::IsValid(const std::string &addr) { 9 | struct sockaddr_in sa; 10 | int result = inet_pton(AF_INET, addr.c_str(), &(sa.sin_addr)); 11 | return result == 1; // inet_pton() returns 1 only on success. 12 | } 13 | 14 | std::optional<Ipv4::Address> Ipv4::Address::MakeAddress( 15 | const std::string &addr) { 16 | Address ret; 17 | if (!ret.FromString(addr)) return std::nullopt; 18 | return ret; 19 | } 20 | 21 | bool Ipv4::Address::FromString(std::string str) { 22 | if (!Ipv4::Address::IsValid(str)) return false; 23 | unsigned char bytes[4]; 24 | uint8_t len = sscanf(str.c_str(), "%hhu.%hhu.%hhu.%hhu", &bytes[0], &bytes[1], 25 | &bytes[2], &bytes[3]); 26 | if (len != Ipv4::Address::kSize) return false; 27 | address = be32_t((uint32_t)(bytes[0]) << 24 | (uint32_t)(bytes[1]) << 16 | 28 | (uint32_t)(bytes[2]) << 8 | (uint32_t)(bytes[3])); 29 | 30 | return true; 31 | } 32 | 33 | std::string Ipv4::Address::ToString() const { 34 | const std::vector<uint8_t> bytes(address.ToByteVector()); 35 | CHECK_EQ(bytes.size(), 4); 36 | return juggler::utils::Format("%hhu.%hhu.%hhu.%hhu", bytes[0], bytes[1], 37 | bytes[2], bytes[3]); 38 | } 39 | 40 | std::string Ipv4::ToString() const { 41 | return juggler::utils::Format( 42 | "[IPv4: src %s, dst %s, ihl %u, ToS %u, tot_len %u, ID %u, frag_off %u, " 43 | "TTL %u, proto %u, check %u]", 44 | src_addr.ToString().c_str(), dst_addr.ToString().c_str(), version_ihl, 45 | type_of_service, total_length.value(), packet_id.value(), 46 | fragment_offset.value(), time_to_live, next_proto_id, hdr_checksum); 47 | } 48 | 49 | } // namespace net 50 | } // namespace juggler 51 | -------------------------------------------------------------------------------- /src/core/net/udp.cc: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | namespace juggler { 4 | namespace net { 5 | 6 | std::string
Udp::ToString() const { 7 | return juggler::utils::Format( 8 | "[UDP: src_port %u, dst_port %u, len %u, csum %u]", 9 | src_port.port.value(), dst_port.port.value(), len.value(), cksum.value()); 10 | } 11 | 12 | } // namespace net 13 | } // namespace juggler 14 | -------------------------------------------------------------------------------- /src/core/ttime.cc: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | namespace juggler { 4 | namespace time { 5 | 6 | thread_local uint64_t tsc_hz; 7 | 8 | } // namespace time 9 | } // namespace juggler 10 | -------------------------------------------------------------------------------- /src/core/utils.cc: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include 4 | #include 5 | 6 | namespace juggler { 7 | namespace utils { 8 | 9 | void TimeLog::DumpToFile(std::string file_name) { 10 | // Sort the timestamps array by putting the oldest event first. 11 | // We could use regular sorting here, but instead we rotate to protect against 12 | // wraparounds and duplicate timestamps. 13 | post_process(); 14 | auto oldest_entry = index_ <= time_log_.size() ? 0 : (index_ + 1) & bitmask_; 15 | LOG(INFO) << "oldest entry: " << oldest_entry; 16 | std::rotate(time_log_.begin(), time_log_.begin() + oldest_entry, 17 | time_log_.end()); 18 | 19 | std::ofstream out_file(file_name); 20 | for (const auto &elem : time_log_) { 21 | out_file << elem << std::endl; 22 | } 23 | } 24 | 25 | template 26 | void TimeSeries::DumpToFile(std::string file_name) { 27 | // Sort the timestamps array by putting the oldest event first. 28 | // We could use regular sorting here, but instead we rotate to protect against 29 | // wraparounds and duplicate timestamps. 30 | post_process(); 31 | auto oldest_entry = index_ <= time_log_.size() ?
0 : (index_ + 1) & bitmask_; 32 | std::rotate(time_log_.begin(), time_log_.begin() + oldest_entry, 33 | time_log_.end()); 34 | std::rotate(values_.begin(), values_.begin() + oldest_entry, values_.end()); 35 | 36 | std::ofstream out_file(file_name); 37 | for (size_t i = 0; i <= bitmask_; ++i) { 38 | if (time_log_[i] == UINT64_MAX) break; 39 | out_file << time_log_[i] << "," << values_[i] << std::endl; 40 | } 41 | } 42 | 43 | } // namespace utils 44 | } // namespace juggler 45 | -------------------------------------------------------------------------------- /src/ext/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | # Declare the library target. 2 | set(MACHNET_SHIM_LIB_NAME machnet_shim) 3 | 4 | add_library(${MACHNET_SHIM_LIB_NAME} SHARED machnet.c) 5 | target_link_libraries (${MACHNET_SHIM_LIB_NAME} uuid) 6 | target_link_libraries (${MACHNET_SHIM_LIB_NAME} uuid) 7 | 8 | # Configure the directories to search for header files. 9 | target_include_directories(${MACHNET_SHIM_LIB_NAME} PRIVATE .) 10 | target_include_directories(${MACHNET_SHIM_LIB_NAME} PRIVATE ../include) 11 | 12 | # Set the version property. 13 | set_target_properties(${MACHNET_SHIM_LIB_NAME} PROPERTIES VERSION ${PROJECT_VERSION}) 14 | 15 | # Set the shared object version property to the project's major version. 16 | set_target_properties(${MACHNET_SHIM_LIB_NAME} PROPERTIES SOVERSION ${PROJECT_VERSION_MAJOR}) 17 | 18 | # Set the public header property to the one with the actual API. 
19 | set_target_properties(${MACHNET_SHIM_LIB_NAME} PROPERTIES PUBLIC_HEADER machnet.h) 20 | -------------------------------------------------------------------------------- /src/ext/CPPLINT.cfg: -------------------------------------------------------------------------------- 1 | filter=-legal/copyright 2 | filter=-build/include,-build/c++11 3 | filter=-readability/casting 4 | -------------------------------------------------------------------------------- /src/ext/Makefile: -------------------------------------------------------------------------------- 1 | CC = gcc 2 | CFLAGS = -Wall -fPIC 3 | LDFLAGS = -shared 4 | LIBS = -luuid 5 | TARGET = libmachnet_shim.so 6 | SRCS = machnet.c 7 | INC = ../include 8 | OBJS = $(SRCS:.c=.o) 9 | 10 | all: $(TARGET) 11 | 12 | $(TARGET): $(OBJS) 13 | $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^ $(LIBS) 14 | 15 | %.o: %.c 16 | $(CC) -I$(INC) $(CFLAGS) -c $< -o $@ 17 | 18 | clean: 19 | rm -f $(OBJS) $(TARGET) 20 | 21 | .PHONY: all clean 22 | -------------------------------------------------------------------------------- /src/ext/jring_bench.cc: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | 6 | #include 7 | 8 | static constexpr size_t kProducerCore = 2; 9 | static constexpr size_t kConsumerCore = 5; 10 | 11 | static constexpr size_t kQueueSz = 1024; 12 | static constexpr size_t KMaxMsgSize = 2048; 13 | 14 | class ProducerConsumerBenchmark : public benchmark::Fixture { 15 | public: 16 | ProducerConsumerBenchmark() 17 | : p2c_ring_(nullptr), c2p_ring_(nullptr), stop_(false) {} 18 | /** 19 | * @brief Consumer task. 
20 | */ 21 | void consumer_task() { 22 | juggler::utils::BindThisThreadToCore(kConsumerCore); 23 | 24 | std::vector<uint8_t> buf(KMaxMsgSize); 25 | auto nb_rx = 0u; 26 | while (!stop_.load()) { 27 | nb_rx += jring2_dequeue_burst(p2c_ring_, buf.data(), 1); 28 | } 29 | (void)nb_rx; 30 | } 31 | 32 | void SetUp(const ::benchmark::State& state) override { 33 | // Initialize the message size from the benchmark state. 34 | const size_t msg_size = static_cast<size_t>(state.range(0)); 35 | const auto ring_mem_size = jring2_get_buf_ring_size(msg_size, kQueueSz); 36 | 37 | // Allocate memory for the rings. 38 | p2c_ring_ = CHECK_NOTNULL( 39 | static_cast<jring2_t *>(aligned_alloc(CACHELINE_SIZE, ring_mem_size))); 40 | jring2_init(p2c_ring_, kQueueSz, msg_size); 41 | 42 | c2p_ring_ = CHECK_NOTNULL( 43 | static_cast<jring2_t *>(aligned_alloc(CACHELINE_SIZE, ring_mem_size))); 44 | jring2_init(c2p_ring_, kQueueSz, msg_size); 45 | 46 | // Start the consumer thread; it is joined in TearDown(). 47 | stop_.store(false); 48 | consumer_ = std::thread(&ProducerConsumerBenchmark::consumer_task, this); 49 | } 50 | 51 | void TearDown(const ::benchmark::State& state) override { 52 | stop_.store(true); 53 | if (consumer_.joinable()) consumer_.join(); 54 | free(p2c_ring_); 55 | p2c_ring_ = nullptr; 56 | free(c2p_ring_); 57 | c2p_ring_ = nullptr; 58 | } 59 | 60 | protected: 61 | jring2_t* p2c_ring_{nullptr}; 62 | jring2_t* c2p_ring_{nullptr}; 63 | std::atomic<bool> stop_{false}; 64 | std::thread consumer_{}; 65 | }; 66 | 67 | BENCHMARK_DEFINE_F(ProducerConsumerBenchmark, ProducerBenchmark) 68 | (benchmark::State& st) { // NOLINT 69 | const size_t msg_size = static_cast<size_t>(st.range(0)); 70 | const uint32_t num_messages = static_cast<uint32_t>(st.range(1)); 71 | CHECK_GE(msg_size, sizeof(uint64_t)); 72 | 73 | juggler::utils::BindThisThreadToCore(kProducerCore); 74 | sleep(2); 75 | 76 | std::vector<uint8_t> tx_buf(msg_size, 'a'); 77 | auto nb_tx = 0u; 78 | for (auto _ : st) { 79 | auto iteration_nb_tx = 0u; 80 | while (iteration_nb_tx < num_messages) { 81 | ::benchmark::DoNotOptimize(tx_buf); 82 |
::benchmark::DoNotOptimize( 83 | iteration_nb_tx += jring2_enqueue_bulk(p2c_ring_, tx_buf.data(), 1)); 84 | } 85 | nb_tx += iteration_nb_tx; 86 | } 87 | 88 | st.counters["msg_rate"] = 89 | benchmark::Counter(nb_tx, benchmark::Counter::kIsRate); 90 | st.counters["bps"] = 91 | benchmark::Counter(nb_tx * msg_size * 8, benchmark::Counter::kIsRate); 92 | } 93 | 94 | BENCHMARK_REGISTER_F(ProducerConsumerBenchmark, ProducerBenchmark) 95 | ->Args({64, 1 << 24}) // msg_size = 64, num_messages = 16M 96 | ->Args({128, 1 << 24}) // msg_size = 128, num_messages = 16M 97 | ->Args({256, 1 << 24}) // msg_size = 256, num_messages = 16M 98 | ->Args({512, 1 << 24}) // msg_size = 512, num_messages = 16M 99 | ->Args({1024, 1 << 24}) // msg_size = 1024, num_messages = 16M 100 | ->Args({2048, 1 << 24}) // msg_size = 2048, num_messages = 16M 101 | ->Iterations(10); // number of iterations for each case 102 | 103 | BENCHMARK_MAIN(); 104 | -------------------------------------------------------------------------------- /src/ext/machnet_ctrl.h: -------------------------------------------------------------------------------- 1 | #ifndef SRC_EXT_MACHNET_CTRL_H_ 2 | #define SRC_EXT_MACHNET_CTRL_H_ 3 | 4 | #include "machnet_common.h" 5 | #ifdef __cplusplus 6 | extern "C" { 7 | #endif 8 | 9 | #include 10 | 11 | #define MACHNET_CONTROLLER_DEFAULT_PATH "/var/run/machnet/machnet_ctrl.sock" 12 | 13 | /** 14 | * @struct machnet_app_info 15 | * @brief Information about the application 16 | * @var machnet_app_info::name Name of the application. 17 | */ 18 | struct machnet_app_info { 19 | char name[128]; 20 | } __attribute__((packed)); 21 | typedef struct machnet_app_info machnet_app_info_t; 22 | 23 | /** 24 | * @struct machnet_channel_info 25 | * @brief This struct is used to request a new channel from the controller. 26 | * An application that needs to use Machnet should send a request to create a 27 | * new channel to the controller.
28 | * 29 | * @var machnet_channel_info::channel_uuid The UUID of the application that 30 | * is requesting a new channel. 31 | * @var machnet_channel_info::desc_ring_size The depth of the descriptor rings 32 | * (Machnet, App). 33 | * @var machnet_channel_info::buffer_count The number of buffers in the pool. 34 | */ 35 | struct machnet_channel_info { 36 | uuid_t channel_uuid; 37 | #define MACHNET_CHANNEL_INFO_DESC_RING_SIZE_DEFAULT 1024 38 | uint32_t desc_ring_size; 39 | #define MACHNET_CHANNEL_INFO_BUFFER_COUNT_DEFAULT 4096 40 | uint32_t buffer_count; 41 | } __attribute__((packed)); 42 | typedef struct machnet_channel_info machnet_channel_info_t; 43 | 44 | /** 45 | * @struct machnet_ctrl_status 46 | */ 47 | struct machnet_ctrl_status { 48 | #define MACHNET_CTRL_STATUS_FAILURE -1 49 | #define MACHNET_CTRL_STATUS_SUCCESS 0 50 | int status; 51 | }; 52 | typedef struct machnet_ctrl_status machnet_ctrl_status_t; 53 | 54 | /** 55 | * @struct machnet_ctrl_msg 56 | */ 57 | struct machnet_ctrl_msg { 58 | #define MACHNET_CTRL_MSG_TYPE_INVALID 0x00 59 | #define MACHNET_CTRL_MSG_TYPE_REQ_REGISTER 0x01 60 | #define MACHNET_CTRL_MSG_TYPE_REQ_CHANNEL 0x02 61 | #define MACHNET_CTRL_MSG_TYPE_REQ_FLOW 0x03 62 | #define MACHNET_CTRL_MSG_TYPE_REQ_LISTEN 0x04 63 | #define MACHNET_CTRL_MSG_TYPE_RESPONSE 0x10 64 | uint16_t type; 65 | uint32_t msg_id; 66 | uuid_t app_uuid; 67 | int status; 68 | union { 69 | machnet_app_info_t app_info; 70 | machnet_channel_info_t channel_info; 71 | }; 72 | } __attribute__((packed)); 73 | typedef struct machnet_ctrl_msg machnet_ctrl_msg_t; 74 | 75 | extern uuid_t g_app_uuid; 76 | 77 | #ifdef __cplusplus 78 | } 79 | #endif 80 | 81 | #endif // SRC_EXT_MACHNET_CTRL_H_ 82 | -------------------------------------------------------------------------------- /src/include/cc.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file cc.h 3 | * This file contains Congestion Control related definitions.
4 | */ 5 | #ifndef SRC_INCLUDE_CC_H_ 6 | #define SRC_INCLUDE_CC_H_ 7 | 8 | #include 9 | #include 10 | #include 11 | 12 | #include "utils.h" 13 | 14 | namespace juggler { 15 | namespace net { 16 | namespace swift { 17 | 18 | constexpr bool seqno_lt(uint32_t a, uint32_t b) { 19 | return static_cast<int32_t>(a - b) < 0; 20 | } 21 | constexpr bool seqno_le(uint32_t a, uint32_t b) { 22 | return static_cast<int32_t>(a - b) <= 0; 23 | } 24 | constexpr bool seqno_eq(uint32_t a, uint32_t b) { 25 | return static_cast<int32_t>(a - b) == 0; 26 | } 27 | constexpr bool seqno_ge(uint32_t a, uint32_t b) { 28 | return static_cast<int32_t>(a - b) >= 0; 29 | } 30 | constexpr bool seqno_gt(uint32_t a, uint32_t b) { 31 | return static_cast<int32_t>(a - b) > 0; 32 | } 33 | 34 | /** 35 | * @brief Swift Congestion Control (SWCC) protocol control block. 36 | */ 37 | // TODO(ilias): First-cut implementation. Needs a lot of work. 38 | struct Pcb { 39 | static constexpr std::size_t kInitialCwnd = 32; 40 | static constexpr std::size_t kSackBitmapSize = 256; 41 | static constexpr std::size_t kRexmitThreshold = 3; 42 | static constexpr int kRtoThresholdInTicks = 3; // in slow timer ticks. 43 | static constexpr int kRtoDisabled = -1; 44 | Pcb() {} 45 | 46 | // Return the sender effective window in # of packets. 47 | uint32_t effective_wnd() const { 48 | uint32_t effective_wnd = cwnd - (snd_nxt - snd_una - snd_ooo_acks); 49 | return effective_wnd > cwnd ?
0 : effective_wnd; 50 | } 51 | 52 | uint32_t seqno() const { return snd_nxt; } 53 | uint32_t get_snd_nxt() { 54 | uint32_t seqno = snd_nxt; 55 | snd_nxt++; 56 | return seqno; 57 | } 58 | 59 | std::string ToString() const { 60 | std::string s; 61 | s += "[CC] snd_nxt: " + std::to_string(snd_nxt) + 62 | ", snd_una: " + std::to_string(snd_una) + 63 | ", rcv_nxt: " + std::to_string(rcv_nxt) + 64 | ", cwnd: " + std::to_string(cwnd) + 65 | ", fast_rexmits: " + std::to_string(fast_rexmits) + 66 | ", rto_rexmits: " + std::to_string(rto_rexmits) + 67 | ", effective_wnd: " + std::to_string(effective_wnd()); 68 | return s; 69 | } 70 | 71 | uint32_t ackno() const { return rcv_nxt; } 72 | bool max_rexmits_reached() const { return rto_rexmits >= kRexmitThreshold; } 73 | bool rto_disabled() const { return rto_timer == kRtoDisabled; } 74 | bool rto_expired() const { return rto_timer >= kRtoThresholdInTicks; } 75 | 76 | uint32_t get_rcv_nxt() const { return rcv_nxt; } 77 | void advance_rcv_nxt() { rcv_nxt++; } 78 | void rto_enable() { rto_timer = 0; } 79 | void rto_disable() { rto_timer = kRtoDisabled; } 80 | void rto_reset() { rto_enable(); } 81 | void rto_maybe_reset() { 82 | if (snd_una == snd_nxt) 83 | rto_disable(); 84 | else 85 | rto_reset(); 86 | } 87 | void rto_advance() { rto_timer++; } 88 | 89 | void sack_bitmap_shift_right_one() { 90 | constexpr size_t sack_bitmap_bucket_max_idx = 91 | kSackBitmapSize / sizeof(sack_bitmap[0]) - 1; 92 | 93 | for (size_t i = sack_bitmap_bucket_max_idx; i > 0; --i) { 94 | // Shift the current each bucket to the right by 1 and take the most 95 | // significant bit from the previous bucket 96 | const uint64_t sack_bitmap_left_bucket = sack_bitmap[i - 1]; 97 | uint64_t &sack_bitmap_right_bucket = sack_bitmap[i]; 98 | 99 | sack_bitmap_right_bucket = 100 | (sack_bitmap_right_bucket >> 1) | (sack_bitmap_left_bucket << 63); 101 | } 102 | 103 | // Special handling for the left most bucket 104 | uint64_t &sack_bitmap_left_most_bucket = 
sack_bitmap[0]; 105 | sack_bitmap_left_most_bucket >>= 1; 106 | 107 | sack_bitmap_count--; 108 | } 109 | 110 | void sack_bitmap_bit_set(const size_t index) { 111 | constexpr size_t sack_bitmap_bucket_size = sizeof(sack_bitmap[0]); 112 | const size_t sack_bitmap_bucket_idx = index / sack_bitmap_bucket_size; 113 | const size_t sack_bitmap_idx_in_bucket = index % sack_bitmap_bucket_size; 114 | 115 | LOG_IF(FATAL, index >= kSackBitmapSize) << "Index out of bounds: " << index; 116 | 117 | sack_bitmap[sack_bitmap_bucket_idx] |= (1ULL << sack_bitmap_idx_in_bucket); 118 | 119 | sack_bitmap_count++; 120 | } 121 | 122 | uint32_t target_delay{0}; 123 | uint32_t snd_nxt{0}; 124 | uint32_t snd_una{0}; 125 | uint32_t snd_ooo_acks{0}; 126 | uint32_t rcv_nxt{0}; 127 | uint64_t sack_bitmap[kSackBitmapSize / sizeof(uint64_t)]{0}; 128 | uint8_t sack_bitmap_count{0}; 129 | uint16_t cwnd{kInitialCwnd}; 130 | uint16_t duplicate_acks{0}; 131 | int rto_timer{kRtoDisabled}; 132 | uint16_t fast_rexmits{0}; 133 | uint16_t rto_rexmits{0}; 134 | }; 135 | 136 | } // namespace swift 137 | } // namespace net 138 | } // namespace juggler 139 | 140 | #endif // SRC_INCLUDE_CC_H_ 141 | -------------------------------------------------------------------------------- /src/include/common.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file common.h 3 | * @brief Common hardcoded constants used in the project. 4 | */ 5 | 6 | #ifndef SRC_INCLUDE_COMMON_H_ 7 | #define SRC_INCLUDE_COMMON_H_ 8 | 9 | #include 10 | 11 | namespace juggler { 12 | #ifdef __cpp_lib_hardware_interference_size 13 | using std::hardware_constructive_interference_size; 14 | using std::hardware_destructive_interference_size; 15 | #else 16 | // 64 bytes on x86-64 │ L1_CACHE_BYTES │ L1_CACHE_SHIFT │ __cacheline_aligned │ 17 | // ... 
18 | constexpr std::size_t hardware_constructive_interference_size = 64; 19 | constexpr std::size_t hardware_destructive_interference_size = 64; 20 | #endif 21 | // TODO(ilias): Adding an assertion for now, to prevent incompatibilities 22 | // with the C helper library. 23 | static_assert(hardware_constructive_interference_size == 64); 24 | static_assert(hardware_destructive_interference_size == 64); 25 | 26 | // x86_64 Page size. 27 | // TODO(ilias): Write an initialization function to get this 28 | // programmatically using `sysconf`. 29 | static const std::size_t kPageSize = 4096; 30 | 31 | static const std::size_t kHugePage2MSize = 2 * 1024 * 1024; 32 | 33 | static constexpr bool kShmZeroCopyEnabled = false; 34 | 35 | enum class CopyMode { 36 | kMemCopy, 37 | kZeroCopy, 38 | }; 39 | 40 | } // namespace juggler 41 | 42 | #endif // SRC_INCLUDE_COMMON_H_ 43 | -------------------------------------------------------------------------------- /src/include/dpdk.h: -------------------------------------------------------------------------------- 1 | #ifndef SRC_INCLUDE_DPDK_H_ 2 | #define SRC_INCLUDE_DPDK_H_ 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | 13 | #include 14 | 15 | namespace juggler { 16 | namespace dpdk { 17 | 18 | [[maybe_unused]] static void FetchDpdkPortInfo( 19 | uint8_t port_id, struct rte_eth_dev_info *devinfo, 20 | juggler::net::Ethernet::Address *lladdr) { 21 | if (!rte_eth_dev_is_valid_port(port_id)) { 22 | LOG(INFO) << "Port id " << static_cast<int>(port_id) << " is not valid."; 23 | return; 24 | } 25 | 26 | int ret = rte_eth_dev_info_get(port_id, devinfo); 27 | if (ret != 0) { 28 | LOG(WARNING) << "rte_eth_dev_info_get() failed.
Cannot retrieve eth device " 29 | "contextual info for port " 30 | << static_cast<int>(port_id); 31 | return; 32 | } 33 | CHECK_NOTNULL(devinfo->device); 34 | 35 | rte_eth_macaddr_get(port_id, 36 | reinterpret_cast<rte_ether_addr *>(lladdr->bytes)); 37 | 38 | LOG(INFO) << "[PMDPORT] [port_id: " << static_cast<int>(port_id) 39 | << ", driver: " << devinfo->driver_name 40 | << ", RXQ: " << devinfo->max_rx_queues 41 | << ", TXQ: " << devinfo->max_tx_queues 42 | << ", l2addr: " << lladdr->ToString() << "]"; 43 | } 44 | 45 | [[maybe_unused]] static std::optional<uint16_t> FindSlaveVfPortId( 46 | uint16_t port_id) { 47 | struct rte_eth_dev_info devinfo; 48 | juggler::net::Ethernet::Address lladdr; 49 | 50 | FetchDpdkPortInfo(port_id, &devinfo, &lladdr); 51 | 52 | uint16_t slave_port_id = 0; 53 | while (slave_port_id < RTE_MAX_ETHPORTS) { 54 | if (slave_port_id == port_id) { 55 | slave_port_id++; 56 | continue; 57 | } 58 | 59 | if (!rte_eth_dev_is_valid_port(slave_port_id)) { 60 | break; 61 | } 62 | 63 | struct rte_eth_dev_info slave_devinfo; 64 | juggler::net::Ethernet::Address slave_lladdr; 65 | FetchDpdkPortInfo(slave_port_id, &slave_devinfo, &slave_lladdr); 66 | if (slave_lladdr == lladdr) { 67 | return slave_port_id; 68 | } 69 | 70 | slave_port_id++; 71 | } 72 | 73 | return std::nullopt; 74 | } 75 | 76 | [[maybe_unused]] static void ScanDpdkPorts() { 77 | // This iteration is *required* to expose the net failsafe interface in Azure 78 | // VMs. Without this, the application is going to bind on top of the mlx5 79 | // driver. Worse, TX is going to work, but nothing will appear on the RX side. 80 | uint16_t port_id; 81 | RTE_ETH_FOREACH_DEV(port_id) { 82 | struct rte_eth_dev_info devinfo; 83 | juggler::net::Ethernet::Address lladdr; 84 | 85 | FetchDpdkPortInfo(port_id, &devinfo, &lladdr); 86 | } 87 | } 88 | 89 | // Default EAL init arguments.
90 | static auto kDefaultEalOpts = 91 | juggler::utils::CmdLineOpts({"", "--log-level=eal,8", "--proc-type=auto"}); 92 | 93 | class Dpdk { 94 | public: 95 | Dpdk() : initialized_(false) {} 96 | ~Dpdk() { DeInitDpdk(); } 97 | 98 | void InitDpdk(juggler::utils::CmdLineOpts copts = kDefaultEalOpts); 99 | void DeInitDpdk(); 100 | bool isInitialized() const { return initialized_; } 101 | size_t GetNumPmdPortsAvailable(); 102 | std::optional GetPmdPortIdByMac( 103 | const juggler::net::Ethernet::Address &l2_addr) const; 104 | 105 | private: 106 | bool initialized_; 107 | }; 108 | } // namespace dpdk 109 | } // namespace juggler 110 | 111 | #endif // SRC_INCLUDE_DPDK_H_ 112 | -------------------------------------------------------------------------------- /src/include/ether.h: -------------------------------------------------------------------------------- 1 | // Code originally written for BESS, modified here. 2 | // Copyright (c) 2016-2017, Nefeli Networks, Inc. 3 | // Copyright (c) 2017, Cloudigo. 4 | // All rights reserved. 5 | // 6 | // Redistribution and use in source and binary forms, with or without 7 | // modification, are permitted provided that the following conditions are met: 8 | // 9 | // * Redistributions of source code must retain the above copyright notice, this 10 | // list of conditions and the following disclaimer. 11 | // 12 | // * Redistributions in binary form must reproduce the above copyright notice, 13 | // this list of conditions and the following disclaimer in the documentation 14 | // and/or other materials provided with the distribution. 15 | // 16 | // * Neither the names of the copyright holders nor the names of their 17 | // contributors may be used to endorse or promote products derived from this 18 | // software without specific prior written permission. 
19 | // 20 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 23 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 24 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 25 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 26 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 27 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 28 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 29 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 30 | // POSSIBILITY OF SUCH DAMAGE. 31 | 32 | #ifndef SRC_INCLUDE_ETHER_H_ 33 | #define SRC_INCLUDE_ETHER_H_ 34 | 35 | #include 36 | #include 37 | 38 | #include 39 | #include 40 | 41 | namespace juggler { 42 | namespace net { 43 | 44 | struct __attribute__((packed)) Ethernet { 45 | struct __attribute__((packed)) Address { 46 | static const uint8_t kSize = 6; 47 | Address() = default; 48 | Address(const uint8_t *addr) { 49 | bytes[0] = addr[0]; 50 | bytes[1] = addr[1]; 51 | bytes[2] = addr[2]; 52 | bytes[3] = addr[3]; 53 | bytes[4] = addr[4]; 54 | bytes[5] = addr[5]; 55 | } 56 | 57 | Address(const std::string mac_addr) { FromString(mac_addr); } 58 | 59 | void FromUint8(const uint8_t *addr) { 60 | bytes[0] = addr[0]; 61 | bytes[1] = addr[1]; 62 | bytes[2] = addr[2]; 63 | bytes[3] = addr[3]; 64 | bytes[4] = addr[4]; 65 | bytes[5] = addr[5]; 66 | } 67 | 68 | bool FromString(std::string addr); 69 | std::string ToString() const; 70 | 71 | Address &operator=(const Address &rhs) { 72 | bytes[0] = rhs.bytes[0]; 73 | bytes[1] = rhs.bytes[1]; 74 | bytes[2] = rhs.bytes[2]; 75 | bytes[3] = rhs.bytes[3]; 76 | bytes[4] = rhs.bytes[4]; 77 | bytes[5] = rhs.bytes[5]; 78 | return 
*this; 79 | } 80 | bool operator==(const Address &rhs) const { 81 | return bytes[0] == rhs.bytes[0] && bytes[1] == rhs.bytes[1] && 82 | bytes[2] == rhs.bytes[2] && bytes[3] == rhs.bytes[3] && 83 | bytes[4] == rhs.bytes[4] && bytes[5] == rhs.bytes[5]; 84 | } 85 | bool operator!=(const Address &rhs) const { return !operator==(rhs); } 86 | 87 | uint8_t bytes[kSize]; 88 | }; 89 | inline static const Address kBroadcastAddr{"ff:ff:ff:ff:ff:ff"}; 90 | inline static const Address kZeroAddr{"00:00:00:00:00:00"}; 91 | 92 | enum EthType : uint16_t { 93 | kArp = 0x806, 94 | kIpv4 = 0x800, 95 | kIpv6 = 0x86DD, 96 | }; 97 | 98 | std::string ToString() const; 99 | 100 | Address dst_addr; 101 | Address src_addr; 102 | be16_t eth_type; 103 | }; 104 | 105 | } // namespace net 106 | } // namespace juggler 107 | 108 | namespace std { 109 | template <> 110 | struct hash { 111 | size_t operator()(const juggler::net::Ethernet::Address &addr) const { 112 | return juggler::utils::hash( 113 | reinterpret_cast(addr.bytes), 114 | juggler::net::Ethernet::Address::kSize); 115 | } 116 | }; 117 | } // namespace std 118 | 119 | #endif // SRC_INCLUDE_ETHER_H_ 120 | -------------------------------------------------------------------------------- /src/include/flow_key.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file flow_key.h 3 | */ 4 | #ifndef SRC_INCLUDE_FLOW_KEY_H_ 5 | #define SRC_INCLUDE_FLOW_KEY_H_ 6 | 7 | #include 8 | #include 9 | 10 | namespace juggler { 11 | namespace net { 12 | namespace flow { 13 | 14 | struct Listener { 15 | using Ipv4 = juggler::net::Ipv4; 16 | using Udp = juggler::net::Udp; 17 | Listener(const Listener& other) = default; 18 | 19 | /** 20 | * @brief Construct a new Listener object. 21 | * 22 | * @param local_addr Local IP address (in network byte order). 23 | * @param local_port Local UDP port (in network byte order). 
24 | */ 25 | Listener(const Ipv4::Address& local_addr, const Udp::Port& local_port) 26 | : addr(local_addr), port(local_port) {} 27 | 28 | /** 29 | * @brief Construct a new Listener object. 30 | * 31 | * @param local_addr Local IP address (in host byte order). 32 | * @param local_port Local UDP port (in host byte order). 33 | */ 34 | Listener(const uint32_t local_addr, const uint16_t local_port) 35 | : addr(local_addr), port(local_port) {} 36 | 37 | bool operator==(const Listener& other) const { 38 | return addr == other.addr && port == other.port; 39 | } 40 | 41 | const Ipv4::Address addr; 42 | const Udp::Port port; 43 | }; 44 | static_assert(sizeof(Listener) == 6, "Listener size is not 6 bytes."); 45 | 46 | /** 47 | * @struct Key 48 | * @brief Flow key: corresponds to the 5-tuple (UDP is always the protocol). 49 | */ 50 | struct Key { 51 | using Ipv4 = juggler::net::Ipv4; 52 | using Udp = juggler::net::Udp; 53 | Key(const Key& other) = default; 54 | /** 55 | * @brief Construct a new Key object. 56 | * 57 | * @param local_addr Local IP address (in network byte order). 58 | * @param local_port Local UDP port (in network byte order). 59 | * @param remote_addr Remote IP address (in network byte order). 60 | * @param remote_port Remote UDP port (in network byte order). 61 | */ 62 | Key(const Ipv4::Address& local_addr, const Udp::Port& local_port, 63 | const Ipv4::Address& remote_addr, const Udp::Port& remote_port) 64 | : local_addr(local_addr), 65 | local_port(local_port), 66 | remote_addr(remote_addr), 67 | remote_port(remote_port) {} 68 | 69 | /** 70 | * @brief Construct a new Key object. 71 | * 72 | * @param local_addr Local IP address (in host byte order). 73 | * @param local_port Local UDP port (in host byte order). 74 | * @param remote_addr Remote IP address (in host byte order). 75 | * @param remote_port Remote UDP port (in host byte order). 
76 | */ 77 | Key(const uint32_t local_addr, const uint16_t local_port, 78 | const uint32_t remote_addr, const uint16_t remote_port) 79 | : local_addr(local_addr), 80 | local_port(local_port), 81 | remote_addr(remote_addr), 82 | remote_port(remote_port) {} 83 | 84 | bool operator==(const Key& other) const { 85 | return local_addr == other.local_addr && local_port == other.local_port && 86 | remote_addr == other.remote_addr && remote_port == other.remote_port; 87 | } 88 | 89 | std::string ToString() const { 90 | return utils::Format("[%s:%hu <-> %s:%hu]", remote_addr.ToString().c_str(), 91 | remote_port.port.value(), 92 | local_addr.ToString().c_str(), 93 | local_port.port.value()); 94 | } 95 | 96 | const Ipv4::Address local_addr; 97 | const Udp::Port local_port; 98 | const Ipv4::Address remote_addr; 99 | const Udp::Port remote_port; 100 | }; 101 | static_assert(sizeof(Key) == 12, "Flow key size is not 12 bytes."); 102 | 103 | } // namespace flow 104 | } // namespace net 105 | } // namespace juggler 106 | 107 | namespace std { 108 | 109 | template <> 110 | struct hash { 111 | size_t operator()(const juggler::net::flow::Listener& listener) const { 112 | return juggler::utils::hash( 113 | reinterpret_cast(&listener), sizeof(listener)); 114 | } 115 | }; 116 | 117 | template <> 118 | struct hash { 119 | size_t operator()(const juggler::net::flow::Key& key) const { 120 | // TODO(ilias): Use a better hash function. 
121 | // return std::hash()(key.local_addr.address.raw_value()) ^ 122 | // std::hash()(key.remote_addr.address.raw_value()) ^ 123 | // std::hash()(key.local_port.port.raw_value()) ^ 124 | // std::hash()(key.remote_port.port.raw_value()); 125 | return juggler::utils::hash(reinterpret_cast(&key), 126 | sizeof(key)); 127 | } 128 | }; 129 | 130 | } // namespace std 131 | 132 | #endif // SRC_INCLUDE_FLOW_KEY_H_ 133 | -------------------------------------------------------------------------------- /src/include/icmp.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file icmp.h 3 | * @brief ICMP header definition. 4 | */ 5 | #ifndef SRC_INCLUDE_ICMP_H_ 6 | #define SRC_INCLUDE_ICMP_H_ 7 | #include 8 | #include 9 | 10 | namespace juggler { 11 | namespace net { 12 | struct __attribute__((packed)) Icmp { 13 | enum Type : uint8_t { 14 | kEchoReply = 0, 15 | kEchoRequest = 8, 16 | }; 17 | Type type; 18 | static const uint8_t kCodeZero = 0; 19 | uint8_t code; 20 | uint16_t cksum; 21 | be16_t id; 22 | be16_t seq; 23 | }; 24 | 25 | } // namespace net 26 | } // namespace juggler 27 | 28 | #endif // SRC_INCLUDE_ICMP_H_ 29 | -------------------------------------------------------------------------------- /src/include/ipv4.h: -------------------------------------------------------------------------------- 1 | #ifndef SRC_INCLUDE_IPV4_H_ 2 | #define SRC_INCLUDE_IPV4_H_ 3 | 4 | #include 5 | #include 6 | #include 7 | 8 | #include 9 | #include 10 | #include 11 | #include 12 | 13 | namespace juggler { 14 | namespace net { 15 | 16 | struct __attribute__((packed)) Ipv4 { 17 | static const uint8_t kDefaultTTL = 64; 18 | struct __attribute__((packed)) Address { 19 | static const uint8_t kSize = 4; 20 | Address() = default; 21 | Address(const uint8_t *addr) { 22 | juggler::utils::Copy(&address, addr, sizeof(address)); 23 | } 24 | Address(uint32_t addr) { address = be32_t(addr); } 25 | 26 | static bool IsValid(const std::string &addr); 27 | 28 | 
/// If addr is not a valid IPv4 address, return a zero-valued IP address 29 | static std::optional<Address>
MakeAddress(const std::string &addr); 30 | 31 | Address &operator=(const Address &rhs) { 32 | address = rhs.address; 33 | return *this; 34 | } 35 | bool operator==(const Address &rhs) const { return address == rhs.address; } 36 | bool operator!=(const Address &rhs) const { return address != rhs.address; } 37 | bool operator==(const be32_t &rhs) const { return rhs == address; } 38 | bool operator!=(const be32_t &rhs) const { return rhs != address; } 39 | bool operator!=(be32_t rhs) const { return rhs != address; } 40 | bool operator==(const uint32_t &rhs) const { 41 | return be32_t(rhs) == address; 42 | } 43 | bool operator!=(const uint32_t &rhs) const { 44 | return be32_t(rhs) != address; 45 | } 46 | 47 | bool FromString(std::string addr); 48 | std::string ToString() const; 49 | 50 | be32_t address; 51 | }; 52 | 53 | enum Proto : uint8_t { 54 | kIcmp = 1, 55 | kTcp = 6, 56 | kUdp = 17, 57 | kRaw = 255, 58 | }; 59 | 60 | std::string ToString() const; 61 | 62 | uint8_t version_ihl; 63 | uint8_t type_of_service; 64 | be16_t total_length; 65 | be16_t packet_id; 66 | be16_t fragment_offset; 67 | uint8_t time_to_live; 68 | uint8_t next_proto_id; 69 | uint16_t hdr_checksum; 70 | Address src_addr; 71 | Address dst_addr; 72 | }; 73 | 74 | } // namespace net 75 | } // namespace juggler 76 | 77 | namespace std { 78 | template <> 79 | struct hash { 80 | std::size_t operator()(const juggler::net::Ipv4::Address &addr) const { 81 | return juggler::utils::hash( 82 | reinterpret_cast(&addr.address), 83 | sizeof(addr.address.raw_value())); 84 | } 85 | }; 86 | 87 | } // namespace std 88 | #endif // SRC_INCLUDE_IPV4_H_ 89 | -------------------------------------------------------------------------------- /src/include/juggler_rpc_ctrl.h: -------------------------------------------------------------------------------- 1 | /** 2 | * Note: This is currently unused. 
3 | */ 4 | #ifndef SRC_INCLUDE_JUGGLER_RPC_CTRL_H_ 5 | #define SRC_INCLUDE_JUGGLER_RPC_CTRL_H_ 6 | 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | 16 | #include 17 | 18 | using grpc::ServerContext; 19 | using grpc::Service; 20 | using grpc::Status; 21 | 22 | /** 23 | * @brief Abstract template class for gRPC-based controller for Orion apps. 24 | * This can be instantiated by any application that needs to expose a control 25 | * plane API over gRPC. The API is discoverable with reflection. 26 | */ 27 | template 28 | class OrionRpcCtrl { 29 | public: 30 | static_assert(std::is_base_of::value, 31 | "T should inherit from grpc::Service"); 32 | 33 | /** 34 | * @param addr String reference to the address on which to bind the gRPC 35 | * server. Example format: 0.0.0.0:5000. 36 | * @param service shared pointer to an object that implements the API. It 37 | * needs to inherit from grpc::Service, so that it can be registered. 38 | */ 39 | OrionRpcCtrl(const std::string &addr, T *service) 40 | : server_address_(addr), 41 | service_(CHECK_NOTNULL(service)), 42 | builder_(CHECK_NOTNULL(new grpc::ServerBuilder())), 43 | terminate_cb_(nullptr) {} 44 | OrionRpcCtrl(const OrionRpcCtrl &) = delete; 45 | OrionRpcCtrl &operator=(const OrionRpcCtrl &) = delete; 46 | 47 | /** 48 | * @brief Start the gRPC server and wait for requests. This call is blocking. 49 | */ 50 | void Run() { 51 | grpc::EnableDefaultHealthCheckService(true); 52 | grpc::reflection::InitProtoReflectionServerBuilderPlugin(); 53 | 54 | builder_.get()->AddListeningPort(server_address_, 55 | grpc::InsecureServerCredentials()); 56 | builder_.get()->RegisterService(service_.get()); 57 | std::unique_ptr server(builder_.get()->BuildAndStart()); 58 | LOG(INFO) << "OrionRpcCtrl is listening on: " << server_address_; 59 | 60 | // Set the termination callback. 61 | terminate_cb_ = [&server]() { server->Shutdown(); }; 62 | 63 | // Block and wait for RPC requests. 
64 | server->Wait(); 65 | } 66 | 67 | /// Asynchronously terminate the server. 68 | void Terminate() { 69 | CHECK_NOTNULL(terminate_cb_); 70 | std::thread async_terminate([this]() { terminate_cb_(); }); 71 | async_terminate.detach(); 72 | } 73 | 74 | private: 75 | const std::string server_address_; 76 | std::shared_ptr service_; 77 | std::unique_ptr builder_; 78 | std::function terminate_cb_; 79 | }; 80 | 81 | #endif // SRC_INCLUDE_JUGGLER_RPC_CTRL_H_ 82 | -------------------------------------------------------------------------------- /src/include/machnet_config.h: -------------------------------------------------------------------------------- 1 | #ifndef SRC_INCLUDE_MACHNET_CONFIG_H_ 2 | #define SRC_INCLUDE_MACHNET_CONFIG_H_ 3 | 4 | #include 5 | 6 | #include 7 | #define JSON_NOEXCEPTION // Disable exceptions for nlohmann::json 8 | #include 9 | #include 10 | #include 11 | #include 12 | 13 | #include 14 | #include 15 | #include 16 | 17 | namespace juggler { 18 | 19 | class NetworkInterfaceConfig { 20 | public: 21 | inline static const cpu_set_t kDefaultCpuMask = 22 | utils::calculate_cpu_mask(0xFFFFFFFF); 23 | explicit NetworkInterfaceConfig(const std::string pcie_addr, 24 | const net::Ethernet::Address &l2_addr, 25 | const net::Ipv4::Address &ip_addr, 26 | size_t engine_threads = 1, 27 | cpu_set_t cpu_mask = kDefaultCpuMask) 28 | : pcie_addr_(pcie_addr), 29 | l2_addr_(l2_addr), 30 | ip_addr_(ip_addr), 31 | engine_threads_(engine_threads), 32 | cpu_mask_(cpu_mask), 33 | dpdk_port_id_(std::nullopt) {} 34 | bool operator==(const NetworkInterfaceConfig &other) const { 35 | return l2_addr_ == other.l2_addr_; 36 | } 37 | 38 | const std::string &pcie_addr() const { return pcie_addr_; } 39 | const net::Ethernet::Address &l2_addr() const { return l2_addr_; } 40 | const net::Ipv4::Address &ip_addr() const { return ip_addr_; } 41 | size_t engine_threads() const { return engine_threads_; } 42 | cpu_set_t cpu_mask() const { return cpu_mask_; } 43 | std::optional 
dpdk_port_id() const { return dpdk_port_id_; } 44 | void Dump() const { 45 | LOG(INFO) << "NetworkInterfaceConfig: " 46 | << utils::Format( 47 | "[PCIe: %s, L2: %s, IP: %s, engine_threads: %zu, " 48 | "cpu_mask: %lu, dpdk_port_id: %d]", 49 | pcie_addr_.c_str(), l2_addr_.ToString().c_str(), 50 | ip_addr_.ToString().c_str(), engine_threads_, 51 | utils::cpuset_to_sizet(cpu_mask_), 52 | dpdk_port_id_.value_or(-1)); 53 | } 54 | 55 | void set_dpdk_port_id(std::optional dpdk_port_id) { 56 | dpdk_port_id_ = dpdk_port_id; 57 | } 58 | 59 | private: 60 | const std::string pcie_addr_; 61 | const net::Ethernet::Address l2_addr_; 62 | const net::Ipv4::Address ip_addr_; 63 | const size_t engine_threads_; 64 | cpu_set_t cpu_mask_; 65 | std::optional dpdk_port_id_; 66 | }; 67 | } // namespace juggler 68 | 69 | namespace std { 70 | template <> 71 | struct hash<juggler::NetworkInterfaceConfig> { 72 | size_t operator()(const juggler::NetworkInterfaceConfig &nic) const { 73 | return std::hash<juggler::net::Ethernet::Address>()(nic.l2_addr()); 74 | } 75 | }; 76 | } // namespace std 77 | 78 | namespace juggler { 79 | /** 80 | * @brief Reader for the Machnet JSON configuration. The following is an example 81 | * of a Machnet config for a single host: 82 | * 83 | * { 84 | * "machnet_config": { 85 | * "00:0d:3a:d6:9b:6a": { 86 | * "ip": "10.0.0.1", 87 | * "engine_threads": "1", 88 | * "cpu_mask": "0x1" 89 | * } 90 | * } 91 | * } 92 | * 93 | * Each key in the "machnet_config" dictionary is the MAC address of a network 94 | * interface. The MAC address is the only reliable way to identify a network 95 | * interface, especially on Azure. 96 | * 97 | * Note that `engine_threads` (decimal) and `cpu_mask` (hex) are optional. If 98 | * not specified, they default to 1 and 0xFFFFFFFF, respectively. 
99 | */ 100 | class MachnetConfigProcessor { 101 | public: 102 | explicit MachnetConfigProcessor(const std::string &config_json_filename); 103 | 104 | std::unordered_set &interfaces_config() { 105 | return interfaces_config_; 106 | } 107 | utils::CmdLineOpts GetEalOpts() const; 108 | 109 | private: 110 | void AssertJsonValidMachnetConfig(); 111 | void DiscoverInterfaceConfiguration(); 112 | 113 | static constexpr const char *kMachnetConfigJsonKey = "machnet_config"; 114 | const std::string config_json_filename_; 115 | std::unordered_set interfaces_config_; 116 | nlohmann::json json_; 117 | }; 118 | 119 | } // namespace juggler 120 | 121 | #endif // SRC_INCLUDE_MACHNET_CONFIG_H_ 122 | -------------------------------------------------------------------------------- /src/include/machnet_controller.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file machnet_controller.h 3 | * @brief This file contains the MachnetController class. It is responsible for 4 | * the control plane of the Machnet stack. 5 | */ 6 | #ifndef SRC_INCLUDE_MACHNET_CONTROLLER_H_ 7 | #define SRC_INCLUDE_MACHNET_CONTROLLER_H_ 8 | 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | 16 | #include 17 | #include 18 | 19 | #include "common.h" 20 | 21 | namespace juggler { 22 | /** 23 | * @class MachnetController 24 | * @brief This class is responsible for the control plane of the Machnet stack. 25 | * It is responsible for creating new channels, listening to specific ports, and 26 | * for creating new connections based on the requests received from the 27 | * applications. 28 | */ 29 | class MachnetController { 30 | public: 31 | using UDSocket = juggler::net::UDSocket; 32 | using UDServer = juggler::net::UDServer; 33 | using ChannelManager = juggler::shm::ChannelManager; 34 | // Timeout for idle connections in seconds. 
35 | static constexpr uint32_t kConnectionTimeoutInSec = 2; 36 | // Delete the copy constructor and the copy-assignment operator. 37 | MachnetController(const MachnetController &) = delete; 38 | MachnetController &operator=(const MachnetController &) = delete; 39 | 40 | // Create the singleton instance of the controller, or return the existing one. 41 | static MachnetController *Create(const std::string &conf_file) { 42 | if (instance_ == nullptr) { 43 | instance_ = new MachnetController(conf_file); 44 | } 45 | return instance_; 46 | } 47 | 48 | static void ReleaseInstance() { 49 | if (instance_ != nullptr) { 50 | delete instance_; 51 | instance_ = nullptr; 52 | } 53 | } 54 | 55 | static void sig_handler(int signum) { 56 | if (signum == SIGINT && instance_ != nullptr) { 57 | LOG(INFO) << "Received SIGINT. Stopping the controller."; 58 | instance_->Stop(); 59 | } 60 | } 61 | 62 | /** 63 | * @brief Start the controller. 64 | * It spawns a new thread that listens for incoming connections and handles 65 | * requests. 66 | */ 67 | void Run(); 68 | /** 69 | * @brief Check if the controller is running. 70 | * @return True if the controller is running, false otherwise. 71 | */ 72 | bool IsRunning() const { return running_; } 73 | /** 74 | * @brief Stop the controller. 75 | * It closes the listening socket, and sleeps for a few seconds to allow the 76 | * controller thread to terminate. 77 | */ 78 | void Stop(); 79 | 80 | private: 81 | // The constructor is private; use Create() to obtain the instance. 82 | explicit MachnetController(const std::string &conf_file); 83 | /** 84 | * @brief Callback to handle new connections to the controller. 85 | * 86 | * @param s The socket that is being connected. 87 | */ 88 | bool HandleNewConnection(UDSocket *s); 89 | 90 | /** 91 | * @brief Callback to handle new messages to the controller. 92 | * This function is responsible for executing the core logic of the 93 | * controller. 94 | * @param s The socket on which the message was received. 95 | * @param data The data received. 
96 | * @param length The length of the data received. 97 | */ 98 | void HandleNewMessage(UDSocket *s, const char *data, size_t length); 99 | 100 | /** 101 | * @brief Callback to handle passive close of the socket. 102 | * @param s The socket that is being closed. 103 | */ 104 | void HandlePassiveClose(UDSocket *s); 105 | 106 | /** 107 | * @brief Callback to handle connection timeout. 108 | * @param s The socket that timed out. 109 | * @attention This function is not implemented yet. 110 | */ 111 | void HandleTimeout(UDSocket *s); 112 | 113 | /** 114 | * @brief Callback to handle shutdown of a client. 115 | * @param s The socket that is being closed. 116 | * @param code The code of the shutdown. 117 | * @param reason The reason for the shutdown. 118 | */ 119 | void HandleShutdown(UDSocket *s, int code, void *reason); 120 | 121 | /** 122 | * @brief Register a new application with the controller. 123 | * @param[in] app_uuid UUID of the originating application. 124 | * @param[in] app_info Information about the application to be registered. 125 | * @return True if the channel has been registered successfully, false 126 | * otherwise. 127 | */ 128 | bool RegisterApplication(const uuid_t app_uuid, 129 | const machnet_app_info_t *app_info); 130 | 131 | /** 132 | * @brief Unregister an application from the controller. Releases all the 133 | * resources associated with this application. 134 | * @param[in] app_uuid UUID of the originating application. 135 | */ 136 | void UnregisterApplication(const uuid_t app_uuid); 137 | 138 | /** 139 | * @brief Create a new channel. 140 | * @param[in] app_uuid UUID of the originating application. 141 | * @param[in] channel_info Information about the channel to be created. 142 | * @param[out] fd The file descriptor of the channel (-1 on failure). 143 | * @return True if the channel has been created successfully, false otherwise. 
144 | */ 145 | bool CreateChannel(const uuid_t app_uuid, 146 | const machnet_channel_info_t *channel_info, int *fd); 147 | 148 | /** 149 | * @brief The main loop of the controller. 150 | */ 151 | void RunController(); 152 | 153 | /** 154 | * @brief Stop the controller. (thread-safe) 155 | */ 156 | void StopController(); 157 | 158 | private: 159 | static inline MachnetController *instance_; 160 | MachnetConfigProcessor config_processor_; 161 | ChannelManager channel_manager_; 162 | bool running_{false}; 163 | dpdk::Dpdk dpdk_{}; 164 | std::vector> pmd_ports_{}; 165 | std::vector> engines_{}; 166 | std::unique_ptr server_{nullptr}; 167 | std::unordered_map> 168 | applications_registered_{}; 169 | }; 170 | } // namespace juggler 171 | 172 | #endif // SRC_INCLUDE_MACHNET_CONTROLLER_H_ 173 | -------------------------------------------------------------------------------- /src/include/machnet_pkthdr.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file machnet_pkthdr.h 3 | * @brief Description of the Machnet packet header. 4 | */ 5 | 6 | #ifndef SRC_INCLUDE_MACHNET_PKTHDR_H_ 7 | #define SRC_INCLUDE_MACHNET_PKTHDR_H_ 8 | 9 | #include 10 | 11 | namespace juggler { 12 | namespace net { 13 | 14 | /** 15 | * Machnet Packet Header. 16 | */ 17 | struct __attribute__((packed)) MachnetPktHdr { 18 | static constexpr uint16_t kMagic = 0x4e53; 19 | be16_t magic; // Magic value tagged after initialization for the flow. 20 | enum class MachnetFlags : uint8_t { 21 | kData = 0b0, 22 | kSyn = 0b1, // SYN packet. 23 | kAck = 0b10, // ACK packet. 24 | kSynAck = 0b11, // SYN-ACK packet. 25 | kRst = 0b10000000, // RST packet. 26 | }; 27 | MachnetFlags net_flags; // Network flags. 28 | uint8_t msg_flags; // Field to reflect the `MachnetMsgBuf_t' flags. 29 | be32_t seqno; // Sequence number to denote the packet counter in the flow. 30 | be32_t ackno; // Acknowledgment number for the flow. 
31 | be64_t sack_bitmap[4]; // Bitmap of the SACKs received. 32 | be16_t sack_bitmap_count; // Length of the SACK bitmap [0-256]. 33 | be64_t timestamp1; // Timestamp of the packet before sending. 34 | }; 35 | static_assert(sizeof(MachnetPktHdr) == 54, "MachnetPktHdr size mismatch"); 36 | 37 | inline MachnetPktHdr::MachnetFlags operator|(MachnetPktHdr::MachnetFlags lhs, 38 | MachnetPktHdr::MachnetFlags rhs) { 39 | using MachnetFlagsType = 40 | std::underlying_type::type; 41 | return MachnetPktHdr::MachnetFlags(static_cast(lhs) | 42 | static_cast(rhs)); 43 | } 44 | 45 | inline MachnetPktHdr::MachnetFlags operator&(MachnetPktHdr::MachnetFlags lhs, 46 | MachnetPktHdr::MachnetFlags rhs) { 47 | using MachnetFlagsType = 48 | std::underlying_type::type; 49 | return MachnetPktHdr::MachnetFlags(static_cast(lhs) & 50 | static_cast(rhs)); 51 | } 52 | 53 | } // namespace net 54 | } // namespace juggler 55 | 56 | #endif // SRC_INCLUDE_MACHNET_PKTHDR_H_ 57 | -------------------------------------------------------------------------------- /src/include/packet_pool.h: -------------------------------------------------------------------------------- 1 | #ifndef SRC_INCLUDE_PACKET_POOL_H_ 2 | #define SRC_INCLUDE_PACKET_POOL_H_ 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | 11 | #include 12 | 13 | namespace juggler { 14 | namespace dpdk { 15 | 16 | /** 17 | * @brief A packet pool class implementation, wrapping around DPDK's mbuf pool. 18 | */ 19 | class PacketPool { 20 | public: 21 | static constexpr uint32_t kRteDefaultMbufsNum_ = 22 | 2048 - 1; //!< Default number of mbufs. 23 | static constexpr uint16_t kRteDefaultMbufDataSz_ = 24 | RTE_MBUF_DEFAULT_BUF_SIZE; //!< Default mbuf data size. 25 | static constexpr char *kRteDefaultMempoolName = 26 | nullptr; //!< Default mempool name. 27 | 28 | /** @brief Default constructor is deleted to prevent instantiation. 
*/ 29 | PacketPool() = delete; 30 | 31 | /** @brief Copy constructor is deleted to prevent copying. */ 32 | PacketPool(const PacketPool &) = delete; 33 | 34 | /** @brief Assignment operator is deleted to prevent assignment. */ 35 | PacketPool &operator=(const PacketPool &) = delete; 36 | 37 | /** 38 | * @brief Initializes the packet pool. 39 | * @param nmbufs Number of mbufs. 40 | * @param mbuf_size Size of each mbuf. 41 | * @param mempool_name Name of the mempool. 42 | */ 43 | PacketPool(uint32_t nmbufs = kRteDefaultMbufsNum_, 44 | uint16_t mbuf_size = kRteDefaultMbufDataSz_, 45 | const char *mempool_name = kRteDefaultMempoolName); 46 | ~PacketPool(); 47 | 48 | /** 49 | * @return The name of the packet pool. 50 | */ 51 | const char *GetPacketPoolName() { return mpool_->name; } 52 | 53 | /** 54 | * @return The data room size of the packet. 55 | */ 56 | uint32_t GetPacketDataRoomSize() const { 57 | return rte_pktmbuf_data_room_size(mpool_); 58 | } 59 | 60 | /** 61 | * @return The underlying memory pool. 62 | */ 63 | rte_mempool *GetMemPool() { return mpool_; } 64 | 65 | /** 66 | * @brief Allocates a packet from the pool. 67 | * @return Pointer to the allocated packet. 68 | */ 69 | Packet *PacketAlloc() { 70 | return reinterpret_cast(rte_pktmbuf_alloc(mpool_)); 71 | } 72 | 73 | /** 74 | * @brief Allocates multiple packets in bulk. 75 | * @param pkts Array to store the pointers to allocated packets. 76 | * @param cnt Count of packets to allocate. 77 | * @return True if allocation succeeds, false otherwise. 78 | */ 79 | bool PacketBulkAlloc(Packet **pkts, uint16_t cnt) { 80 | int ret = rte_pktmbuf_alloc_bulk( 81 | mpool_, reinterpret_cast(pkts), cnt); 82 | if (ret == 0) [[likely]] 83 | return true; 84 | return false; 85 | } 86 | 87 | /** 88 | * @brief Allocates multiple packets in bulk to a batch. 89 | * @param batch Batch to store the allocated packets. 90 | * @param cnt Count of packets to allocate. 91 | * @return True if allocation succeeds, false otherwise. 
92 | */ 93 | bool PacketBulkAlloc(PacketBatch *batch, uint16_t cnt) { 94 | (void)DCHECK_NOTNULL(batch); 95 | int ret = rte_pktmbuf_alloc_bulk( 96 | mpool_, reinterpret_cast(batch->pkts()), cnt); 97 | if (ret != 0) [[unlikely]] 98 | return false; 99 | 100 | batch->IncrCount(cnt); 101 | return true; 102 | } 103 | 104 | /** 105 | * @return The total capacity (number of packets) in the pool. 106 | */ 107 | uint32_t Capacity() { return mpool_->populated_size; } 108 | 109 | /** 110 | * @return The count of available packets in the pool. 111 | */ 112 | uint32_t AvailPacketsCount() { return rte_mempool_avail_count(mpool_); } 113 | 114 | private: 115 | const bool 116 | is_dpdk_primary_process_; //!< Indicates if it's a DPDK primary process. 117 | static uint16_t next_id_; //!< Static ID for the next packet pool instance. 118 | rte_mempool *mpool_; //!< Underlying rte mbuf pool. 119 | uint16_t id_; //!< Unique ID for this packet pool instance. 120 | }; 121 | 122 | } // namespace dpdk 123 | } // namespace juggler 124 | 125 | #endif // SRC_INCLUDE_PACKET_POOL_H_ 126 | -------------------------------------------------------------------------------- /src/include/pause.h: -------------------------------------------------------------------------------- 1 | #ifndef SRC_INCLUDE_PAUSE_H_ 2 | #define SRC_INCLUDE_PAUSE_H_ 3 | 4 | #ifdef __cplusplus 5 | extern "C" { 6 | #endif 7 | 8 | #if defined(__x86_64__) 9 | #include 10 | #elif defined(__aarch64__) || defined(_M_ARM64) 11 | #include 12 | #else 13 | static_assert(false, 14 | "Unsupported architecture, please add the pause intrinsic for " 15 | "the architecture."); 16 | #endif 17 | 18 | static void inline machnet_pause() { 19 | #if defined(__x86_64__) 20 | _mm_pause(); 21 | #elif defined(__aarch64__) || defined(_M_ARM64) 22 | __asm__ volatile("yield" ::: "memory"); 23 | #else 24 | static_assert(false, 25 | "Unsupported architecture, please add the pause intrinsic for " 26 | "the architecture."); 27 | #endif 28 | } 29 | 30 | #ifdef 
__cplusplus 31 | } 32 | #endif 33 | 34 | #endif // SRC_INCLUDE_PAUSE_H_ 35 | -------------------------------------------------------------------------------- /src/include/ttime.h: -------------------------------------------------------------------------------- 1 | #ifndef SRC_INCLUDE_TTIME_H_ 2 | #define SRC_INCLUDE_TTIME_H_ 3 | 4 | #include 5 | #define SRC_DPDK_TIMER 1 6 | 7 | #include 8 | #include 9 | #include 10 | #include 11 | 12 | #include 13 | 14 | namespace juggler { 15 | 16 | namespace time { 17 | 18 | // TODO(ilias): Reconsider whether this should be global for all threads. Which 19 | // CPUs have non-synchronized TSC these days? 20 | extern thread_local uint64_t tsc_hz; 21 | 22 | static inline uint64_t rdtsc() { return rte_rdtsc(); } 23 | 24 | [[maybe_unused]] static inline uint64_t estimate_tsc_hz() { 25 | return rte_get_tsc_hz(); 26 | } 27 | 28 | template 29 | requires std::integral || std::floating_point 30 | [[maybe_unused]] static inline T cycles_to_ns(uint64_t cycles) { 31 | DCHECK_NE(tsc_hz, 0); 32 | 33 | return static_cast(cycles * 1E9 / tsc_hz); 34 | } 35 | 36 | template 37 | requires std::integral || std::floating_point 38 | [[maybe_unused]] static constexpr inline T cycles_to_us(uint64_t cycles) { 39 | return cycles_to_ns(cycles) / 1E3; 40 | } 41 | 42 | template 43 | requires std::integral || std::floating_point 44 | [[maybe_unused]] static constexpr inline T cycles_to_ms(uint64_t cycles) { 45 | return cycles_to_ns(cycles) / 1E6; 46 | } 47 | 48 | template 49 | requires std::integral || std::floating_point 50 | [[maybe_unused]] static constexpr inline T cycles_to_s(uint64_t cycles) { 51 | return cycles_to_ns(cycles) / 1E9; 52 | } 53 | 54 | [[maybe_unused]] static inline uint64_t us_to_cycles(uint64_t us) { 55 | return us * tsc_hz / 1E6; 56 | } 57 | 58 | [[maybe_unused]] static inline uint64_t ms_to_cycles(uint64_t ms) { 59 | return ms * tsc_hz / 1E3; 60 | } 61 | 62 | [[maybe_unused]] static inline uint64_t s_to_cycles(uint64_t s) { 63 | return s 
* tsc_hz; 64 | } 65 | 66 | } // namespace time 67 | } // namespace juggler 68 | 69 | #endif // SRC_INCLUDE_TTIME_H_ 70 | -------------------------------------------------------------------------------- /src/include/ud_socket.h: -------------------------------------------------------------------------------- 1 | /** 2 | * @file ud_socket.h 3 | * @brief This file contains functionality to support a server that uses Unix 4 | * domain sockets. 5 | */ 6 | #ifndef SRC_INCLUDE_UD_SOCKET_H_ 7 | #define SRC_INCLUDE_UD_SOCKET_H_ 8 | 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | 16 | namespace juggler { 17 | namespace net { 18 | 19 | // Forward declaration. 20 | class UDServer; 21 | 22 | class UDSocket { 23 | public: 24 | UDSocket(); 25 | UDSocket(void *context, int socket_fd); 26 | explicit UDSocket(int socket_fd); 27 | ~UDSocket(); 28 | 29 | bool operator==(const UDSocket &other) const { 30 | return GetFd() == other.GetFd(); 31 | } 32 | 33 | template 34 | Q GetVaddr() const { 35 | return reinterpret_cast(reinterpret_cast(this)); 36 | } 37 | 38 | int GetFd() const; 39 | bool IsListening() const; 40 | bool IsConnected() const; 41 | bool IsAlive() const; 42 | 43 | void SetNonBlocking() const; 44 | bool Listen(const std::string &path); 45 | bool Connect(const std::string &path); 46 | 47 | bool SendMsg(const char *msg, size_t len); 48 | bool SendMsgWithFd(const char *msg, size_t len, int fd); 49 | int RecvMsgWithFd(char *msg, size_t len, int *fd); 50 | 51 | void *GetContext() const { return context_; } 52 | bool AllocateUserData(size_t size); 53 | template 54 | Q GetUserData() const { 55 | return reinterpret_cast(user_data_); 56 | } 57 | 58 | void Shutdown(); 59 | 60 | private: 61 | void *context_{nullptr}; 62 | int socket_fd_; 63 | bool is_listening_; 64 | bool is_connected_; 65 | void *user_data_{nullptr}; 66 | }; 67 | 68 | } // namespace net 69 | } // namespace juggler 70 | 71 | namespace std { 72 | template <> 73 | struct hash { 74 | 
size_t operator()(const juggler::net::UDSocket &socket) const { 75 | // File descriptor is unique for each socket. 76 | return socket.GetFd(); 77 | } 78 | }; 79 | } // namespace std 80 | 81 | namespace juggler { 82 | namespace net { 83 | 84 | class UDServer { 85 | public: 86 | using on_connect_cb_t = std::function; 87 | using on_close_cb_t = std::function; 88 | using on_message_cb_t = 89 | std::function; 90 | using on_timeout_cb_t = std::function; 91 | UDServer(const std::string &path, on_connect_cb_t on_connect, 92 | on_close_cb_t on_close, on_message_cb_t on_message, 93 | on_timeout_cb_t on_timeout); 94 | ~UDServer(); 95 | 96 | void Run(); 97 | void Stop(); 98 | 99 | private: 100 | const on_connect_cb_t on_connect_; 101 | const on_close_cb_t on_close_; 102 | const on_message_cb_t on_message_; 103 | const on_timeout_cb_t on_timeout_; 104 | std::atomic keep_running_; 105 | UDSocket listen_socket_; 106 | std::unordered_map> connected_clients_; 107 | }; 108 | 109 | } // namespace net 110 | } // namespace juggler 111 | 112 | #endif // SRC_INCLUDE_UD_SOCKET_H_ 113 | -------------------------------------------------------------------------------- /src/include/udp.h: -------------------------------------------------------------------------------- 1 | #ifndef SRC_INCLUDE_UDP_H_ 2 | #define SRC_INCLUDE_UDP_H_ 3 | 4 | #include 5 | #include 6 | 7 | namespace juggler { 8 | namespace net { 9 | 10 | struct __attribute__((packed)) Udp { 11 | struct __attribute__((packed)) Port { 12 | static const uint8_t kSize = 2; 13 | Port() = default; 14 | Port(uint16_t udp_port) { port = be16_t(udp_port); } 15 | bool operator==(const Port &rhs) const { return port == rhs.port; } 16 | bool operator==(be16_t rhs) const { return rhs == port; } 17 | bool operator!=(const Port &rhs) const { return port != rhs.port; } 18 | bool operator!=(be16_t rhs) const { return rhs != port; } 19 | 20 | be16_t port; 21 | }; 22 | 23 | std::string ToString() const; 24 | 25 | Port src_port; 26 | Port dst_port; 27 
| be16_t len; 28 | be16_t cksum; 29 | }; 30 | 31 | } // namespace net 32 | } // namespace juggler 33 | 34 | namespace std { 35 | template <> 36 | struct hash { 37 | std::size_t operator()(const juggler::net::Udp::Port &port) const { 38 | return juggler::utils::hash( 39 | reinterpret_cast(&port.port), 40 | sizeof(port.port.raw_value())); 41 | } 42 | }; 43 | } // namespace std 44 | 45 | #endif // SRC_INCLUDE_UDP_H_ 46 | -------------------------------------------------------------------------------- /src/tests/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | include_directories(SYSTEM ${LIBDPDK_INCLUDE_DIRS}) 2 | 3 | # Build each test in the list as a separate executable 4 | file(GLOB_RECURSE TEST_FILES "${PROJECT_SOURCE_DIR}/src/*_test.cc" ) 5 | # list(FILTER TEST_FILES EXCLUDE REGEX "${PROJECT_SOURCE_DIR}/src/tests/.*" ) 6 | 7 | foreach(test_name IN LISTS TEST_FILES) 8 | get_filename_component(test_bin ${test_name} NAME_WE) 9 | add_executable(${test_bin} ${test_name}) 10 | target_link_libraries(${test_bin} PUBLIC 11 | core machnet_shim glog gtest gmock gflags ${LIBDPDK_LIBRARIES} hugetlbfs rt) 12 | add_test(NAME ${test_bin} COMMAND ${test_bin}) 13 | endforeach() 14 | -------------------------------------------------------------------------------- /src/tools/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | include_directories(../modules) 2 | 3 | add_subdirectory(pktgen) 4 | add_subdirectory(ping) 5 | add_subdirectory(jring_perf) 6 | add_subdirectory(jring2_perf) 7 | -------------------------------------------------------------------------------- /src/tools/jring2_perf/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | add_executable (jring2_perf main.cc) 2 | target_link_libraries(jring2_perf LINK_PUBLIC glog gflags PkgConfig::LIBDPDK_STATIC) 3 | 
-------------------------------------------------------------------------------- /src/tools/jring2_perf/main.cc: -------------------------------------------------------------------------------- 1 | 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | 10 | #include "jring2.h" 11 | 12 | static constexpr size_t kProducerCore = 2; 13 | static constexpr size_t kConsumerCore = 7; 14 | 15 | static constexpr size_t kQueueSz = 1024; 16 | static constexpr size_t kWindowSize = 16; 17 | 18 | static constexpr size_t kMsgPayloadSize8B = 7; 19 | 20 | std::atomic g_stop{false}; 21 | 22 | struct Msg { 23 | // Constructor to initialize the message to zero. 24 | Msg() { memset(this, 0, sizeof(Msg)); } 25 | size_t val[kMsgPayloadSize8B]; 26 | }; 27 | 28 | jring2_t *g_p2c_ring; // Producer to consumer ring 29 | jring2_t *g_c2p_ring; // Consumer to producer ring 30 | double g_rdtsc_freq_ghz = 0.0; 31 | 32 | uint64_t rdtsc() { return rte_rdtsc(); } 33 | 34 | double MeasureRdtscFreqGHz() { 35 | const uint64_t start_rdtsc = rdtsc(); 36 | const uint64_t start_ns = time(NULL) * 1000000000; 37 | sleep(1); 38 | const uint64_t end_rdtsc = rdtsc(); 39 | const uint64_t end_ns = time(NULL) * 1000000000; 40 | 41 | const uint64_t rdtsc_diff = end_rdtsc - start_rdtsc; 42 | const uint64_t ns_diff = end_ns - start_ns; 43 | 44 | return rdtsc_diff * 1.0 / ns_diff; 45 | } 46 | 47 | void ProducerThread() { 48 | juggler::utils::BindThisThreadToCore(kProducerCore); 49 | 50 | size_t seq_num = 0; 51 | size_t sum_lat_cycles = 0; 52 | uint64_t msr_start_cycles = rdtsc(); 53 | srand(time(NULL)); 54 | 55 | std::array timestamps{}; 56 | size_t inflight_requests = 0; 57 | 58 | while (!g_stop.load()) { 59 | // If there are kWindowSize outstanding requests, wait for a response 60 | if (inflight_requests == kWindowSize) { 61 | Msg resp_msg{}; 62 | while (true) { 63 | int result = jring2_dequeue_burst(g_c2p_ring, &resp_msg, 1); 64 | if (result == 1) break; 65 | } 66 | 67 | const 
size_t msg_lat_cycles = 68 | (rdtsc() - timestamps[seq_num % kWindowSize]); 69 | sum_lat_cycles += msg_lat_cycles; 70 | 71 | // Check message contents 72 | for (size_t i = 0; i < kMsgPayloadSize8B; i++) { 73 | if (resp_msg.val[i] != seq_num) { 74 | std::cerr << "Producer error: val mismatch, expected: " << seq_num 75 | << " actual(" << i << "): " << resp_msg.val[i] << std::endl; 76 | exit(1); 77 | } 78 | } 79 | 80 | seq_num++; 81 | inflight_requests--; 82 | 83 | // Print once every kPrintEveryNMsgs (about one million) msgs 84 | static constexpr size_t kPrintEveryNMsgs = (1024 * 1024); 85 | if (seq_num % kPrintEveryNMsgs == 0) { 86 | const size_t avg_lat_cycles = sum_lat_cycles / kPrintEveryNMsgs; 87 | const size_t avg_lat_ns = avg_lat_cycles / g_rdtsc_freq_ghz; 88 | 89 | const size_t ns_since_msr = 90 | (rdtsc() - msr_start_cycles) / g_rdtsc_freq_ghz; 91 | const double us_since_msr = ns_since_msr / 1000.0; 92 | 93 | printf("Producer: avg RTT latency for %zu byte msgs: %zu ns\n", 94 | sizeof(Msg), avg_lat_ns); 95 | printf("Producer: Msg rate: %.2f M reqs/sec (%.2f M msgs/sec)\n", 96 | kPrintEveryNMsgs / us_since_msr, 97 | kPrintEveryNMsgs * 2 / us_since_msr); 98 | 99 | msr_start_cycles = rdtsc(); 100 | sum_lat_cycles = 0; 101 | } 102 | } 103 | 104 | // Issue a new request 105 | const uint64_t req_rdtsc = rdtsc(); 106 | Msg req_msg{}; 107 | for (size_t i = 0; i < kMsgPayloadSize8B; i++) { 108 | req_msg.val[i] = seq_num + inflight_requests; 109 | } 110 | 111 | timestamps[(seq_num + inflight_requests) % kWindowSize] = req_rdtsc; 112 | jring2_enqueue_bulk(g_p2c_ring, &req_msg, 1); 113 | inflight_requests++; 114 | } 115 | std::cout << "Producer exiting" << std::endl; 116 | exit(0); 117 | } 118 | 119 | void ConsumerThread() { 120 | juggler::utils::BindThisThreadToCore(kConsumerCore); 121 | 122 | while (!g_stop.load()) { 123 | Msg req_msg{}; 124 | while (true) { 125 | int result = jring2_dequeue_burst(g_p2c_ring, &req_msg, 1); 126 | if (result == 1) break; 127 | } 128 | 129 | // Send a response 130 | Msg 
resp_msg{}; 131 | for (size_t i = 0; i < kMsgPayloadSize8B; i++) { 132 | resp_msg.val[i] = req_msg.val[i]; 133 | } 134 | 135 | auto ret = jring2_enqueue_bulk(g_c2p_ring, &resp_msg, 1); 136 | if (ret != 1) { 137 | std::cerr << "Consumer error: enqueue failed" << std::endl; 138 | exit(1); 139 | } 140 | } 141 | 142 | std::cout << "Consumer exiting" << std::endl; 143 | exit(0); 144 | } 145 | 146 | int main() { 147 | std::cout << "Measuring RDTSC freq" << std::endl; 148 | g_rdtsc_freq_ghz = MeasureRdtscFreqGHz(); 149 | std::cout << "RDTSC freq: " << g_rdtsc_freq_ghz << " GHz" << std::endl; 150 | if (g_rdtsc_freq_ghz < 1.0 || g_rdtsc_freq_ghz > 4.0) { 151 | std::cerr << "ERROR: RDTSC freq is too high or too low" << std::endl; 152 | exit(1); 153 | } 154 | 155 | // Register signal handler. 156 | signal(SIGINT, [](int) { g_stop.store(true); }); 157 | 158 | // Initialize 159 | const auto kRingMemSz = jring2_get_buf_ring_size(sizeof(Msg), kQueueSz); 160 | if (kRingMemSz == -1ull) { 161 | std::cerr << "ERROR: jring2_get_buf_ring_size failed" << std::endl; 162 | exit(1); 163 | } 164 | g_p2c_ring = 165 | static_cast<jring2_t *>(aligned_alloc(CACHELINE_SIZE, kRingMemSz)); 166 | if (g_p2c_ring == nullptr) { 167 | std::cerr << "ERROR: aligned_alloc failed (g_p2c_ring)" << std::endl; 168 | exit(1); 169 | } 170 | 171 | g_c2p_ring = 172 | static_cast<jring2_t *>(aligned_alloc(CACHELINE_SIZE, kRingMemSz)); 173 | if (g_c2p_ring == nullptr) { 174 | std::cerr << "ERROR: aligned_alloc failed (g_c2p_ring)" << std::endl; 175 | exit(1); 176 | } 177 | 178 | jring2_init(g_p2c_ring, kQueueSz, sizeof(Msg)); 179 | jring2_init(g_c2p_ring, kQueueSz, sizeof(Msg)); 180 | 181 | std::cout << "Starting consumer thread" << std::endl; 182 | std::thread trecv(ConsumerThread); 183 | 184 | sleep(1); 185 | std::cout << "Starting producer thread" << std::endl; 186 | std::thread tsend(ProducerThread); 187 | 188 | tsend.join(); 189 | trecv.join(); 190 | 191 | free(g_p2c_ring); 192 | free(g_c2p_ring); 193 | 194 | return 0; 195 | } 196 | 
-------------------------------------------------------------------------------- /src/tools/jring_perf/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | add_executable (jring_perf main.cc) 2 | target_link_libraries(jring_perf LINK_PUBLIC glog gflags PkgConfig::LIBDPDK_STATIC) 3 | -------------------------------------------------------------------------------- /src/tools/jring_perf/main.cc: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | 6 | #include 7 | #include 8 | #include 9 | #include 10 | 11 | static constexpr size_t kNumRingElems = (1024 * 64); 12 | static constexpr size_t kMsgSize = 128; 13 | static constexpr size_t kOpsPerEpoch = kNumRingElems; 14 | static constexpr size_t kProducerCpuCoreId = 2; 15 | static constexpr size_t kConsumerCpuCoreId = 5; 16 | 17 | // Message struct for queue. 18 | struct msg_t { 19 | struct timespec ts; 20 | char data[kMsgSize - sizeof(struct timespec)]; 21 | }; 22 | static_assert(sizeof(msg_t) == kMsgSize, "Message size is not correct"); 23 | 24 | void SetThisThreadName(const std::string &name) { 25 | pthread_setname_np(pthread_self(), name.c_str()); 26 | } 27 | 28 | void BusySleepNs(size_t nsec) { 29 | struct timespec ts; 30 | clock_gettime(CLOCK_REALTIME, &ts); 31 | size_t start_ns = ts.tv_sec * 1000000000 + ts.tv_nsec; 32 | while (true) { 33 | clock_gettime(CLOCK_REALTIME, &ts); 34 | size_t now_ns = ts.tv_sec * 1000000000 + ts.tv_nsec; 35 | if (now_ns - start_ns >= nsec) break; 36 | } 37 | } 38 | 39 | jring_t *init_ring(size_t element_count) { 40 | size_t ring_sz = jring_get_buf_ring_size(sizeof(msg_t), element_count); 41 | LOG(INFO) << "Ring size: " << ring_sz << " bytes, msg size: " << sizeof(msg_t) 42 | << " bytes, element count: " << element_count; 43 | jring_t *ring = CHECK_NOTNULL(reinterpret_cast<jring_t *>(aligned_alloc( 44 | juggler::hardware_constructive_interference_size, ring_sz))); 45 | 
if (jring_init(ring, element_count, sizeof(msg_t), 1, 1) < 0) { 46 | LOG(ERROR) << "Failed to initialize ring buffer"; 47 | free(ring); 48 | exit(EXIT_FAILURE); 49 | } 50 | return ring; 51 | } 52 | 53 | void ProducerThread(jring_t *ring) { 54 | LOG(INFO) << "Producer thread started, binding to core " 55 | << kProducerCpuCoreId; 56 | juggler::utils::BindThisThreadToCore(kProducerCpuCoreId); 57 | SetThisThreadName("jring_producer"); 58 | 59 | struct timespec msr_start; 60 | clock_gettime(CLOCK_REALTIME, &msr_start); 61 | size_t num_msg_since_last_msr = 0; 62 | 63 | while (true) { 64 | msg_t msg; 65 | clock_gettime(CLOCK_REALTIME, &msg.ts); 66 | while (jring_sp_enqueue_bulk(ring, &msg, 1, nullptr) != 1) { 67 | // do nothing 68 | } 69 | 70 | BusySleepNs(500); // Emulate 2 Mpps 71 | 72 | // check if 1 sec has elapsed since last msr, using msg.ts 73 | const size_t ns_since_last_msr = (msg.ts.tv_sec - msr_start.tv_sec) * 1e9 + 74 | (msg.ts.tv_nsec - msr_start.tv_nsec); 75 | if (ns_since_last_msr >= 1e9) { 76 | const size_t kpps = num_msg_since_last_msr / 1e3; 77 | LOG(INFO) << "Producer: " << kpps << " Kpps"; 78 | num_msg_since_last_msr = 0; 79 | msr_start = msg.ts; 80 | } else { 81 | ++num_msg_since_last_msr; 82 | } 83 | } 84 | } 85 | 86 | void ConsumerThread(jring_t *ring) { 87 | LOG(INFO) << "Consumer thread started, binding to core " 88 | << kConsumerCpuCoreId; 89 | juggler::utils::BindThisThreadToCore(kConsumerCpuCoreId); 90 | SetThisThreadName("jring_consumer"); 91 | 92 | struct timespec msr_start; 93 | clock_gettime(CLOCK_REALTIME, &msr_start); 94 | size_t num_rx = 0; 95 | size_t ns_sum = 0; 96 | 97 | while (true) { 98 | msg_t msg; 99 | while (jring_sc_dequeue_bulk(ring, &msg, 1, nullptr) != 1) { 100 | // do nothing 101 | } 102 | num_rx++; 103 | struct timespec ts; 104 | clock_gettime(CLOCK_REALTIME, &ts); 105 | 106 | const size_t msg_lat_ns = 107 | (ts.tv_sec - msg.ts.tv_sec) * 1e9 + (ts.tv_nsec - msg.ts.tv_nsec); 108 | ns_sum += msg_lat_ns; 109 | 110 | const 
size_t ns_since_last_msr = 111 | (ts.tv_sec - msr_start.tv_sec) * 1e9 + (ts.tv_nsec - msr_start.tv_nsec); 112 | if (ns_since_last_msr >= 1e9) { 113 | const double kpps = num_rx / 1e3; 114 | const size_t avg_lat_ns = ns_sum / num_rx; 115 | LOG(INFO) << "Consumer: " << kpps << " Kpps, avg latency: " << avg_lat_ns 116 | << " ns"; 117 | num_rx = 0; 118 | ns_sum = 0; 119 | msr_start = ts; 120 | } 121 | } 122 | } 123 | 124 | int main() { 125 | google::InitGoogleLogging("jring_bench"); 126 | FLAGS_logtostderr = true; 127 | jring_t *ring = init_ring(kNumRingElems); 128 | std::thread producer(ProducerThread, ring); 129 | std::thread consumer(ConsumerThread, ring); 130 | 131 | producer.join(); 132 | consumer.join(); 133 | 134 | free(ring); 135 | return 0; 136 | } 137 | -------------------------------------------------------------------------------- /src/tools/ping/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | set(target_name ping) 2 | add_executable (${target_name} main.cc) 3 | target_link_libraries(${target_name} LINK_PUBLIC core glog gflags) 4 | -------------------------------------------------------------------------------- /src/tools/ping/README.md: -------------------------------------------------------------------------------- 1 | # Ping utility 2 | 3 | This is a simple DPDK-based ping utility. It can be used to measure roundtrip 4 | time (RTT) between two machines. 5 | 6 | 7 | ## Prerequisites 8 | 9 | Successful build of the `Machnet` project (see main [README](../../../README.md)). 10 | 11 | ## Running the application 12 | 13 | ### Configuration 14 | 15 | The configuration is done by editing the [config.json](../machnet/config.json) file similar to the [machnet](../machnet) application. The configuration file should contain exactly one interface configuration. 
16 | 17 | ```json 18 | { 19 | "machnet_config": { 20 | "00:0d:3a:d6:9b:6e": { 21 | "ip": "10.0.255.254", 22 | "cpu_mask": "0xffff" 23 | } 24 | } 25 | } 26 | ``` 27 | 28 | The above entry is keyed by the MAC address of an active network interface and specifies the IP address and CPU mask to be used by the application. The CPU mask is a hexadecimal bitmask that specifies the cores that the application will use, with bit N selecting core N. For example, `0xffff` selects cores 0 through 15. If the CPU mask is not specified, the application will use all cores. 29 | 30 | Other legal configuration options are safely ignored (for example, `engine_threads`). 31 | 32 | **Attention:** The ping utility is going to take over the specified interface. This means that any other application that is using the same interface will stop working! 33 | 34 | ### Running 35 | 36 | From `${REPOROOT}/build/src/tools/ping`: 37 | 38 | ```bash 39 | sudo GLOG_logtostderr=1 ./ping --remote_ip 10.0.0.1 40 | ``` 41 | -------------------------------------------------------------------------------- /src/tools/pktgen/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | set(target_name pktgen) 2 | add_executable (${target_name} main.cc) 3 | target_link_libraries(${target_name} LINK_PUBLIC core glog gflags ${LIBDPDK_LIBRARIES}) 4 | -------------------------------------------------------------------------------- /src/tools/pktgen/README.md: -------------------------------------------------------------------------------- 1 | # Packet Generator (pktgen) 2 | 3 | This is a simple packet generator application. It does not provide a full-fledged network stack, but rather uses raw datagrams with DPDK. It is suitable for testing the baseline performance of a system. 4 | 5 | Among other things, the application can be used for two purposes: 6 | * Load generator: actively generates packets (configurable sizes) and sends them to a remote host. 
It also receives packets and prints the achieved PPS rate. 7 | * RTT measurement: actively sends packets to a remote host. The remote host should be running `pktgen` in **bouncing mode** (see the passive mode section below). `pktgen` collects roundtrip time measurements. When the application is stopped (with Ctrl+C), it prints RTT statistics. 8 | 9 | 10 | ## Prerequisites 11 | 12 | Successful build of the `Machnet` project (see main [README](../../../README.md)). 13 | 14 | ## Running the application 15 | 16 | ### Configuration 17 | 18 | The `pktgen` application shares the same configuration file as the Machnet stack. You may check [config.json](../machnet/config.json) for an example. 19 | 20 | The `pktgen` application ignores the `engine_threads` directive in the configuration. Instead, it 21 | uses a single thread for both sending and receiving packets. 22 | 23 | **Attention:** When running in Microsoft Azure, the recommended DPDK driver for the accelerated NIC is [`hn_netvsc`](https://doc.dpdk.org/guides/nics/netvsc.html). Check [here](../machnet/README.md#configuration) for instructions on how to bind the NIC to the `uio_hv_generic` driver. 24 | ### Running in active mode (packet generator) 25 | 26 | When running in this mode the application is actively generating packets. It also receives packets and prints the achieved PPS rate. 27 | 28 | 29 | ```bash 30 | # Send packets to a remote host. 31 | REMOTE_IP="10.0.0.254" 32 | cd ${REPOROOT}/build/ 33 | sudo GLOG_logtostderr=1 ./src/tools/pktgen/pktgen --remote_ip $REMOTE_IP --active-generator 34 | 35 | # If run from a different directory, you may need to specify the path to the config file: 36 | sudo GLOG_logtostderr=1 ./pktgen --config_file ${REPOROOT}/src/apps/machnet/config.json --remote_ip $REMOTE_IP --active-generator 37 | 38 | ``` 39 | 40 | The above command will run the `pktgen` application and send packets to a remote machine with IP `10.0.0.254` in the same subnet. 
The default packet size is 64 bytes; to adjust this, append the `--pkt_size` option. For example, to send 1500-byte (max size) packets: 41 | 42 | ```bash 43 | # From ${REPOROOT}/build/ 44 | sudo GLOG_logtostderr=1 ./src/tools/pktgen/pktgen --remote_ip $REMOTE_IP --active-generator --pkt_size 1500 45 | ``` 46 | 47 | ### Running in ping mode (RTT measurement) 48 | 49 | When running in this mode the application is actively sending packets to the remote host. The remote host should be running `pktgen` in **bouncing mode** (see subsequent section). 50 | 51 | `pktgen` collects roundtrip time measurements. When the application is stopped (with Ctrl+C), it prints RTT statistics. 52 | 53 | ```bash 54 | REMOTE_IP="10.0.0.254" 55 | # From ${REPOROOT}/build/ 56 | sudo GLOG_logtostderr=1 ./src/tools/pktgen/pktgen --remote_ip $REMOTE_IP --ping 57 | ``` 58 | 59 | If you want RTT measurements printed every 1 second, add option `--v=1` to the previous invocation. 60 | 61 | To save the samples in a log file, add the option `--rtt_log path/to/file`. 62 | 63 | 64 | ### Running in passive mode (packet bouncing) 65 | 66 | When running in this mode the application only bounces received packets back to the remote host. It does not generate any other packets. 67 | 68 | 69 | From `${REPOROOT}/build/`: 70 | 71 | ```bash 72 | REMOTE_IP="10.0.0.254" 73 | # From ${REPOROOT}/build/ 74 | sudo GLOG_logtostderr=1 ./src/tools/pktgen/pktgen 75 | ``` 76 | 77 | The above command will run the `pktgen` application in **passive** mode on the local machine. It will send all received packets back to the originating remote server. 78 | --------------------------------------------------------------------------------