├── .gitignore
├── CMakeLists.txt
├── COPYING
├── README.md
├── app
│   ├── CMakeLists.txt
│   ├── echo.cpp
│   └── httpd.cpp
├── bench
│   ├── bench_pages.lua
│   └── pages.tar.bz2
├── doc
│   └── img
│       ├── architecture.png
│       └── performances.png
├── driver
│   ├── CMakeLists.txt
│   ├── allocator.hpp
│   ├── buffer.cpp
│   ├── buffer.hpp
│   ├── clock.hpp
│   ├── cpu.cpp
│   ├── cpu.hpp
│   ├── driver.hpp
│   ├── mpipe.cpp
│   ├── mpipe.hpp
│   ├── timer.cpp
│   └── timer.hpp
├── net
│   ├── CMakeLists.txt
│   ├── arp.hpp
│   ├── checksum.cpp
│   ├── checksum.hpp
│   ├── endian.hpp
│   ├── ethernet.hpp
│   ├── ipv4.hpp
│   └── tcp.hpp
├── tilera-toolchain.cmake
└── util
    ├── CMakeLists.txt
    └── macros.hpp

/.gitignore:
--------------------------------------------------------------------------------
 1 | # Object files
 2 | *.o
 3 | *.ko
 4 | *.obj
 5 | *.elf
 6 | 
 7 | # Precompiled Headers
 8 | *.gch
 9 | *.pch
10 | 
11 | # Libraries
12 | *.lib
13 | *.a
14 | *.la
15 | *.lo
16 | 
17 | # Shared objects (inc. Windows DLLs)
18 | *.dll
19 | *.so
20 | *.so.*
21 | *.dylib
22 | 
23 | # Executables
24 | *.exe
25 | *.out
26 | *.app
27 | *.i*86
28 | *.x86_64
29 | *.hex
30 | 
31 | # Debug files
32 | *.dSYM/
33 | 
34 | # CMake
35 | CMakeCache.txt
36 | CMakeFiles/
37 | cmake_install.cmake
38 | Makefile
--------------------------------------------------------------------------------
/CMakeLists.txt:
--------------------------------------------------------------------------------
 1 | cmake_minimum_required (VERSION 2.8)
 2 | 
 3 | project (rusty)
 4 | 
 5 | # CFLAGS
 6 | 
 7 | # Uses Tilera's memory allocator.
 8 | #
 9 | # Memory that is required by a single core will be homed on that core.
10 | # Can improve performance.
11 | add_definitions(-DUSE_TILE_ALLOCATOR)
12 | 
13 | # Uses jumbo Ethernet frames if supported by the remote TCP.
14 | #
15 | # Improves performance but consumes more mPIPE resources.
16 | # add_definitions(-DMPIPE_JUMBO_FRAMES)
17 | 
18 | # Uses chained mPIPE buffers.
19 | #
20 | # Can improve or degrade performance.
21 | # add_definitions(-DMPIPE_CHAINED_BUFFERS)
22 | 
23 | # Tells the compiler to generate branch prediction hints.
24 | #
25 | # Can improve performance.
26 | add_definitions(-DBRANCH_PREDICT)
27 | 
28 | # Disables assertions (implies -DNDEBUGMSG).
29 | #
30 | # Improves performance.
31 | add_definitions(-DNDEBUG)
32 | 
33 | # Disables debug messages.
34 | #
35 | # Improves performance.
36 | add_definitions(-DNDEBUGMSG)
37 | 
38 | # End of CFLAGS
39 | 
40 | set (CMAKE_C_FLAGS "-Wall -std=c99 -O2")
41 | set (CMAKE_CXX_FLAGS "-Wall -std=c++11 -O2")
42 | 
43 | find_package (Threads REQUIRED)
44 | 
45 | include_directories (.)
46 | 
47 | add_subdirectory (util)
48 | add_subdirectory (net)
49 | add_subdirectory (driver)
50 | add_subdirectory (app)
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Rusty
 2 | 
 3 | Rusty is a light-weight, user-space, event-driven and highly-scalable TCP/IP
 4 | stack. It has been developed to run on an
 5 | [EZChip *TILE-Gx36* processor](http://www.tilera.com/products/?ezchip=585&spage=614).
 6 | 
 7 | **A simple web-server using this framework achieved a 2.6× performance
 8 | improvement** when compared to the same application layer running on the
 9 | reusable TCP sockets introduced in Linux 3.9. The web-server was able to deliver
10 | static HTML pages at a rate of 12 Gbps on a *TILE-Gx36* powered device.
11 | 
12 | ![Performances](doc/img/performances.png)
13 | 
14 | Currently, the stack only works on the *TILE-Gx* microarchitecture with the
15 | *mPIPE* user-space network driver.
16 | 
17 | The software is licensed under the GNU General Public License version 3, and has
18 | been written in the context of
19 | [my Master's thesis](https://github.com/RaphaelJ/master-thesis/raw/master/thesis.pdf)
20 | for the University of Liège.
21 | 
22 | # Architecture
23 | 
24 | Rusty is designed around a shared-nothing, event-driven architecture.
25 | 
26 | Rusty takes full control of the cores it runs on (it disables preemptive
27 | multi-threading on these cores). Each of these cores is accountable for a subset
28 | of the TCP connections that the server handles, and connections are not shared
29 | between cores (a given connection will always be handled by the same core).
30 | This enables scalability.
31 | 
32 | The application layer is composed of event-handlers that process events of the
33 | network stack (such as a new connection or the arrival of new data).
34 | Event-handlers run in the same thread and context as the network stack. This
35 | removes the overhead of context switches. **The network stack is not backward
36 | compatible with software written using BSD sockets**.
37 | 
38 | ![Architecture](doc/img/architecture.png)
39 | 
40 | Details of this architecture are given in
41 | [my Master's thesis](https://github.com/RaphaelJ/master-thesis/raw/master/thesis.pdf).
42 | 
43 | # Compiling
44 | 
45 | ## Requirements
46 | 
47 | Rusty requires GCC 4.8.
48 | 
49 | *UG504-Gx-MDE-GettingStartedGuide* of the official *Tilera* documentation
50 | explains how to install GCC 4.8 on a *TILE-Gx* device.
51 | 
52 | ## Cross compilation using CMake
53 | 
54 | Rusty uses the CMake build tool. When compiling Rusty on the host system for the
55 | *TILE-Gx* architecture, CMake must be configured to use the cross-compiling GCC.
56 | 
57 | This can be done, from the root directory of the project, by executing this
58 | command:
59 | 
60 |     cmake . -DCMAKE_TOOLCHAIN_FILE=tilera-toolchain.cmake
61 | 
62 | ## Compiling
63 | 
64 | The project can then be compiled by running the `make` command in the root
65 | directory.
66 | 
67 | This will build:
68 | 
69 | * The network stack.
70 | * A simple web-server and a simple echo server in the `/app` directory.
71 | 
72 | # Writing an application using Rusty
73 | 
74 | Two sample applications (a very simple web-server and an echo server) are
75 | available in the `/app` directory. The second chapter of
76 | [my Master's thesis](https://github.com/RaphaelJ/master-thesis/raw/master/thesis.pdf)
77 | (starting at page 15) explains how the echo server is implemented.
78 | 
79 | Currently, only server sockets are supported (i.e. using the `listen()` call).
80 | The application layer is not able to initiate a client connection.
81 | 
82 | # Running the web-server
83 | 
84 | ## Data-plane tiles
85 | 
86 | The framework features a simple web-server in the `/app` directory. Rusty
87 | applications require multiple *data-plane* cores. The *TILE-Gx* device can be
88 | booted with data-plane cores using the `--hvx "dataplane=<cores>"` option when
89 | calling the `tile-monitor` command:
90 | 
91 |     tile-monitor --root --hvx "dataplane=0-35"
92 | 
93 | The previous command starts the *TILE-Gx* device with 36 data-plane cores
94 | (cores 0 to 35).
95 | 
96 | ## ARP table entries
97 | 
98 | The network stack currently **does not** support dynamic ARP entries when
99 | running on multiple cores. The `static_arp_entries` array on line 52 of
100 | `app/httpd.cpp` must be filled with static ARP entries.
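For instance, the first default entry in `app/httpd.cpp` maps the IPv4 address
10.0.2.1 to the Ethernet address 90:e2:ba:46:f2:d4:

    // eth2 frodo.run.montefiore.ulg.ac.be
    _static_arp_entry("10.0.2.1", "90:e2:ba:46:f2:d4"),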
101 | 
102 | The application must be recompiled (using `make`) each time new static ARP
103 | entries are added.
104 | 
105 | ## Starting the web-server
106 | 
107 | The web-server must be given the TCP port and the network links it will listen
108 | on, the root directory containing the served files, and the number of worker
109 | cores to use:
110 | 
111 |     Usage: ./app/httpd <tcp port> <root dir> <n links> [<link> <ipv4> <n workers>]...
112 | 
113 | The following example starts the web-server on two links with two IPv4
114 | addresses (10.0.2.2 on `xgbe1` and 10.0.3.2 on `xgbe2`), with 18 cores dedicated
115 | to the first link and 17 cores to the second. The web-server serves pages from
116 | the `bench/pages` directory on port 80:
117 | 
118 |     ./app/httpd 80 bench/pages/ 2 xgbe1 10.0.2.2 18 xgbe2 10.0.3.2 17
119 | 
120 | About 30 cores are required to fill a single 10 Gbps Ethernet link.
121 | 
122 | ## Benchmarking the web-server
123 | 
124 | A large number of concurrent HTTP requests can be generated from a second device
125 | using the [wrk](https://github.com/wg/wrk) scriptable HTTP benchmarking tool.
126 | 
127 | The `bench/` directory contains a bzip2 archive with the 500 most popular front
128 | pages of the World Wide Web (according to [Alexa](http://alexa.com/topsites)).
129 | The directory also contains a Lua script that generates random requests to these
130 | 500 pages. The following command generates random HTTP requests using the
131 | `bench_pages.lua` script, with 1,000 concurrent TCP connections and 6 threads,
132 | for 30 seconds:
133 | 
134 |     wrk -s bench/bench_pages.lua -c 1000 -d 30s -t 6 http://10.0.2.2
135 | 
136 | After 30 seconds, wrk generates a textual report with the performance of the
137 | web-server. When the web-server is running on multiple links, several instances
138 | of wrk can be run concurrently, querying different IPv4 addresses.
139 | 
140 | # Similar projects
141 | 
142 | * [Seastar](http://seastar-project.org), a more advanced, highly-scalable
143 |   network stack. **You should prefer Seastar to Rusty for any production setup**.
144 | * [mTCP](http://shader.kaist.edu/mtcp/), another user-space, highly-scalable
145 |   network stack with a different architecture.
--------------------------------------------------------------------------------
/app/CMakeLists.txt:
--------------------------------------------------------------------------------
 1 | add_executable(echo echo.cpp)
 2 | 
 3 | target_link_libraries (
 4 |     echo
 5 |     pthread tmc gxio
 6 |     driver net util
 7 | )
 8 | 
 9 | # Precomputes the checksums of the files in the root directory.
10 | #
11 | # Improves performance.
12 | add_definitions(-DUSE_PRECOMPUTED_CHECKSUMS)
13 | 
14 | add_executable(httpd httpd.cpp)
15 | 
16 | target_link_libraries (
17 |     httpd
18 |     pthread tmc gxio
19 |     driver net util
20 | )
--------------------------------------------------------------------------------
/app/echo.cpp:
--------------------------------------------------------------------------------
 1 | //
 2 | // Echo server. Replies to requests on a port with a copy of the received
 3 | // message.
 4 | //
 5 | // Usage: ./app/echo <link> <ipv4> <tcp port> <n workers>
 6 | //
 7 | // Copyright 2015 Raphael Javaux
 8 | // University of Liege.
9 | // 10 | // This program is free software: you can redistribute it and/or modify 11 | // it under the terms of the GNU General Public License as published by 12 | // the Free Software Foundation, either version 3 of the License, or 13 | // (at your option) any later version. 14 | // 15 | // This program is distributed in the hope that it will be useful, 16 | // but WITHOUT ANY WARRANTY; without even the implied warranty of 17 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 18 | // GNU General Public License for more details. 19 | // 20 | // You should have received a copy of the GNU General Public License 21 | // along with this program. If not, see . 22 | // 23 | 24 | #include 25 | #include 26 | 27 | #include "driver/mpipe.hpp" // mpipe_t 28 | #include "util/macros.hpp" // RUSTY_DEBUG, COLOR_GRN 29 | 30 | using namespace std; 31 | 32 | using namespace rusty::driver; 33 | using namespace rusty::net; 34 | 35 | #define ECHO_COLOR COLOR_GRN 36 | #define ECHO_DEBUG(MSG, ...) \ 37 | RUSTY_DEBUG("ECHO", ECHO_COLOR, MSG, ##__VA_ARGS__) 38 | 39 | // Parsed CLI arguments. 40 | struct args_t { 41 | char *link_name; 42 | net_t ipv4_addr; 43 | mpipe_t::tcp_t::port_t tcp_port; 44 | size_t n_workers; 45 | }; 46 | 47 | static void _print_usage(char **argv); 48 | 49 | // Parses CLI arguments. 50 | // 51 | // Fails on a malformed command. 52 | static bool _parse_args(int argc, char **argv, args_t *args); 53 | 54 | // Used to define empty event handlers. 55 | static void _do_nothing(void); 56 | 57 | int main(int argc, char **argv) 58 | { 59 | args_t args; 60 | if (!_parse_args(argc, argv, &args)) 61 | return EXIT_FAILURE; 62 | 63 | mpipe_t mpipe(args.link_name, args.ipv4_addr, args.n_workers); 64 | 65 | ECHO_DEBUG( 66 | "Starts the echo server on interface %s (%s) with %s as IPv4 address " 67 | "on port %d", 68 | args.link_name, mpipe_t::ethernet_t::addr_t::to_alpha(mpipe.ether_addr), 69 | mpipe_t::ipv4_t::addr_t::to_alpha(args.ipv4_addr), args.tcp_port 70 | ); 71 | 72 | mpipe.tcp_listen( 73 | // On new connection handler. 74 | args.tcp_port, 75 | [](mpipe_t::tcp_t::conn_t conn) 76 | { 77 | ECHO_DEBUG( 78 | "New connection from %s:%" PRIu16 " on port %" PRIu16, 79 | mpipe_t::ipv4_t::addr_t::to_alpha(conn.tcb_id.raddr), 80 | conn.tcb_id.rport.host(), conn.tcb_id.lport.host() 81 | ); 82 | 83 | mpipe_t::tcp_t::conn_handlers_t handlers; 84 | 85 | handlers.new_data = 86 | [conn](mpipe_t::cursor_t in) mutable 87 | { 88 | size_t size = in.size(); 89 | 90 | in.read_with( 91 | [size](const char *buffer) 92 | { 93 | ECHO_DEBUG( 94 | "Received %zu bytes: %.*s", size, (int) size, 95 | buffer 96 | ); 97 | }, size 98 | ); 99 | 100 | conn.send( 101 | size, 102 | [in](size_t offset, mpipe_t::cursor_t out) 103 | { 104 | in.drop(offset) 105 | .take(out.size()) 106 | .for_each( 107 | [&out](const char * buffer, size_t buffer_size) 108 | { 109 | out = out.write(buffer, buffer_size); 110 | } 111 | ); 112 | }, 113 | 114 | _do_nothing // Does nothing on acknowledgment 115 | ); 116 | }; 117 | 118 | 119 | handlers.remote_close = 120 | [conn]() mutable 121 | { 122 | // Closes when the remote closes the connection. 123 | conn.close(); 124 | }; 125 | 126 | handlers.close = _do_nothing; 127 | handlers.reset = _do_nothing; 128 | 129 | return handlers; 130 | } 131 | ); 132 | 133 | // Runs the application. 134 | mpipe.run(); 135 | 136 | // Wait for the instance to finish (will not happen). 
137 |     mpipe.join();
138 | 
139 |     return EXIT_SUCCESS;
140 | }
141 | 
142 | static void _print_usage(char **argv)
143 | {
144 |     fprintf(
145 |         stderr, "Usage: %s <link> <ipv4> <tcp port> <n workers>\n", argv[0]
146 |     );
147 | }
148 | 
149 | static bool _parse_args(int argc, char **argv, args_t *args)
150 | {
151 |     if (argc != 5) {
152 |         _print_usage(argv);
153 |         return false;
154 |     }
155 | 
156 |     args->link_name = argv[1];
157 | 
158 |     struct in_addr in_addr;
159 |     if (inet_aton(argv[2], &in_addr) == 0) { // inet_aton() returns zero on failure.
160 |         fprintf(stderr, "Failed to parse the IPv4 address.\n");
161 |         _print_usage(argv);
162 |         return false;
163 |     }
164 |     args->ipv4_addr = ipv4_addr_t::from_in_addr(in_addr);
165 | 
166 |     args->tcp_port = atoi(argv[3]);
167 | 
168 |     args->n_workers = atoi(argv[4]);
169 | 
170 |     return true;
171 | }
172 | 
173 | static void _do_nothing(void)
174 | {
175 | }
176 | 
177 | #undef ECHO_COLOR
178 | #undef ECHO_DEBUG
--------------------------------------------------------------------------------
/app/httpd.cpp:
--------------------------------------------------------------------------------
 1 | //
 2 | // Very simple HTTP server. Preloads files from the given directory.
 3 | //
 4 | // Usage: ./app/httpd <tcp port> <root dir> <n links> [<link> <ipv4> <n workers>]...
 5 | //
 6 | // Copyright 2015 Raphael Javaux
 7 | // University of Liege.
 8 | //
 9 | // This program is free software: you can redistribute it and/or modify
10 | // it under the terms of the GNU General Public License as published by
11 | // the Free Software Foundation, either version 3 of the License, or
12 | // (at your option) any later version.
13 | //
14 | // This program is distributed in the hope that it will be useful,
15 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
16 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17 | // GNU General Public License for more details.
18 | //
19 | // You should have received a copy of the GNU General Public License
20 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
21 | //
22 | 
23 | #include <algorithm>     // min()
24 | #include <cstdio>
25 | #include <cstdlib>
26 | #include <cstring>
27 | #include <unordered_map>
28 | #include <vector>
29 | 
30 | #include <alloca.h>      // alloca()
31 | #include <dirent.h>      // struct dirent, opendir(), readdir()
32 | #include <sys/stat.h>    // struct stat, stat()
33 | 
34 | #include <tmc/mem.h>     // tmc_mem_prefetch()
35 | 
36 | #include "driver/cpu.hpp"
37 | #include "driver/mpipe.hpp"
38 | #include "net/checksum.hpp" // partial_sum_t, precomputed_sums_t
39 | #include "net/endian.hpp"   // net_t
40 | #include "util/macros.hpp"  // LIKELY(), UNLIKELY(), RUSTY_*
41 | 
42 | using namespace std;
43 | 
44 | using namespace rusty::driver;
45 | using namespace rusty::net;
46 | 
47 | static mpipe_t::arp_ipv4_t::static_entry_t
48 | _static_arp_entry(const char *ipv4_addr_char, const char *ether_addr_char);
49 | 
50 | static vector<mpipe_t::arp_ipv4_t::static_entry_t> static_arp_entries {
51 |     // eth2 frodo.run.montefiore.ulg.ac.be
52 |     _static_arp_entry("10.0.2.1", "90:e2:ba:46:f2:d4"),
53 | 
54 |     // eth3 frodo.run.montefiore.ulg.ac.be
55 |     _static_arp_entry("10.0.3.1", "90:e2:ba:46:f2:d5"),
56 | 
57 |     // eth4 frodo.run.montefiore.ulg.ac.be
58 |     _static_arp_entry("10.0.4.1", "90:e2:ba:46:f2:e0"),
59 | 
60 |     // eth5 frodo.run.montefiore.ulg.ac.be
61 |     _static_arp_entry("10.0.5.1", "90:e2:ba:46:f2:e1")
62 | };
63 | 
64 | #define HTTPD_COLOR COLOR_GRN
65 | #define HTTPD_DEBUG(MSG, ...) \
66 |     RUSTY_DEBUG("HTTPD", HTTPD_COLOR, MSG, ##__VA_ARGS__)
67 | #define HTTPD_ERROR(MSG, ...) \
68 |     RUSTY_ERROR("HTTPD", HTTPD_COLOR, MSG, ##__VA_ARGS__)
69 | #define HTTPD_DIE(MSG, ...) \
70 |     RUSTY_DIE( "HTTPD", HTTPD_COLOR, MSG, ##__VA_ARGS__)
71 | 
72 | // Parsed CLI arguments.
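// For instance, the README's example invocation:
//
//     ./app/httpd 80 bench/pages/ 2 xgbe1 10.0.2.2 18 xgbe2 10.0.3.2 17
//
// is parsed into 'tcp_port' 80, 'root_dir' "bench/pages/", and two
// 'interface_t' entries: { "xgbe1", 10.0.2.2, 18 } and { "xgbe2", 10.0.3.2, 17 }.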
73 | struct args_t {
74 |     struct interface_t {
75 |         char *link_name;
76 |         net_t<ipv4_addr_t> ipv4_addr;
77 |         size_t n_workers;
78 |     };
79 | 
80 |     mpipe_t::tcp_t::port_t tcp_port;
81 |     char *root_dir;
82 |     vector<interface_t> interfaces;
83 | };
84 | 
85 | // Type used to index files by their filename. Different from 'std::string', so
86 | // we can initialize it without having to reallocate and copy the string.
87 | struct filename_t {
88 |     const char *value;
89 | };
90 | 
91 | namespace std {
92 | 
93 | // 'std::hash<>' and 'std::equal_to<>' instances are required to index
94 | // files by their filenames.
95 | 
96 | template <>
97 | struct hash<filename_t> {
98 |     inline size_t operator()(filename_t filename) const
99 |     {
100 |         const char *str = filename.value;
101 | 
102 |         size_t sum = 0;
103 |         while (*str != '\0') {
104 |             sum += hash<char>()(*str);
105 |             str++;
106 |         }
107 | 
108 |         return sum;
109 |     }
110 | };
111 | 
112 | template <>
113 | struct equal_to<filename_t> {
114 |     inline bool operator()(filename_t a, filename_t b) const
115 |     {
116 |         return strcmp(a.value, b.value) == 0;
117 |     }
118 | };
119 | 
120 | } /* namespace std */
121 | 
122 | // Served file and its content.
123 | struct file_t {
124 |     const char *content;
125 |     size_t content_len;
126 | 
127 | #ifdef USE_PRECOMPUTED_CHECKSUMS
128 |     precomputed_sums_t precomputed_sums;
129 | #endif /* USE_PRECOMPUTED_CHECKSUMS */
130 | };
131 | 
132 | static void _print_usage(char **argv);
133 | 
134 | // Parses CLI arguments.
135 | //
136 | // Fails on a malformed command.
137 | static bool _parse_args(int argc, char **argv, args_t *args);
138 | 
139 | // Loads all the file contents from the directory into the hash-table.
140 | static void _preload_files(
141 |     unordered_map<filename_t, file_t> *files, const char *dir
142 | );
143 | 
144 | // Used to define empty event handlers.
145 | static void _do_nothing(void);
146 | 
147 | // Interprets an HTTP request and serves the requested content.
148 | static void _on_received_data(
149 |     unordered_map<filename_t, file_t> *files, mpipe_t::tcp_t::conn_t conn,
150 |     mpipe_t::cursor_t in
151 | );
152 | 
153 | // Responds to the client with a 200 OK HTTP response containing the given file.
154 | void _respond_with_200(mpipe_t::tcp_t::conn_t conn, const file_t *file);
155 | 
156 | // Responds to the client with a 400 Bad Request HTTP response.
157 | void _respond_with_400(mpipe_t::tcp_t::conn_t conn);
158 | 
159 | // Responds to the client with a 404 Not Found HTTP response.
160 | void _respond_with_404(mpipe_t::tcp_t::conn_t conn);
161 | 
162 | int main(int argc, char **argv)
163 | {
164 |     args_t args;
165 |     if (!_parse_args(argc, argv, &args))
166 |         return EXIT_FAILURE;
167 | 
168 |     unordered_map<filename_t, file_t> files { };
169 |     _preload_files(&files, args.root_dir);
170 | 
171 |     //
172 |     // Handler executed on new connections.
173 |     //
174 | 
175 |     auto on_new_connection =
176 |         [&files](mpipe_t::tcp_t::conn_t conn)
177 |         {
178 |             HTTPD_DEBUG(
179 |                 "New connection from %s:%" PRIu16 " on port %" PRIu16,
180 |                 mpipe_t::ipv4_t::addr_t::to_alpha(conn.tcb_id.raddr),
181 |                 conn.tcb_id.rport.host(), conn.tcb_id.lport.host()
182 |             );
183 | 
184 |             mpipe_t::tcp_t::conn_handlers_t handlers;
185 | 
186 |             handlers.new_data =
187 |                 [&files, conn](mpipe_t::cursor_t in) mutable
188 |                 {
189 |                     if (conn.can_send())
190 |                         _on_received_data(&files, conn, in);
191 |                 };
192 | 
193 |             handlers.remote_close =
194 |                 [conn]() mutable
195 |                 {
196 |                     // Closes when the remote closes the connection.
197 | conn.close(); 198 | }; 199 | 200 | handlers.close = _do_nothing; 201 | 202 | handlers.reset = _do_nothing; 203 | 204 | return handlers; 205 | }; 206 | 207 | // 208 | // Starts an mpipe instance for each interface. 209 | // 210 | 211 | vector instances; 212 | instances.reserve(args.interfaces.size()); 213 | 214 | int first_dataplane_cpu = 0; 215 | for (args_t::interface_t &interface : args.interfaces) { 216 | instances.emplace_back( 217 | interface.link_name, interface.ipv4_addr, interface.n_workers, 218 | first_dataplane_cpu, static_arp_entries 219 | ); 220 | 221 | mpipe_t &mpipe = instances.back(); 222 | 223 | HTTPD_DEBUG( 224 | "Starts the HTTP server on interface %s (%s) with %s as IPv4 " 225 | "address on port %d serving %s", 226 | interface.link_name, 227 | mpipe_t::ethernet_t::addr_t::to_alpha(mpipe.ether_addr), 228 | mpipe_t::ipv4_t::addr_t::to_alpha(interface.ipv4_addr), 229 | args.tcp_port, args.root_dir 230 | ); 231 | 232 | mpipe.tcp_listen(args.tcp_port, on_new_connection); 233 | 234 | mpipe.run(); 235 | 236 | first_dataplane_cpu += interface.n_workers; 237 | } 238 | 239 | printf("HTTPD started\n"); 240 | 241 | // Wait for all instances to finish (will not happen). 242 | for (mpipe_t &mpipe_instance : instances) 243 | mpipe_instance.join(); 244 | 245 | return EXIT_SUCCESS; 246 | } 247 | 248 | static mpipe_t::arp_ipv4_t::static_entry_t 249 | _static_arp_entry(const char *ipv4_addr_char, const char *ether_addr_char) 250 | { 251 | struct in_addr ipv4_addr; 252 | if (inet_aton(ipv4_addr_char, &ipv4_addr) != 1) 253 | HTTPD_DIE("Invalid IPv4 address"); 254 | 255 | struct ether_addr *ether_addr; 256 | if ((ether_addr = ether_aton(ether_addr_char)) == nullptr) 257 | HTTPD_DIE("Invalid Ethernet address"); 258 | 259 | return (mpipe_t::arp_ipv4_t::static_entry_t) { 260 | mpipe_t::ipv4_t::addr_t::from_in_addr(ipv4_addr), 261 | mpipe_t::ethernet_t::addr_t::from_ether_addr(ether_addr) 262 | }; 263 | } 264 | 265 | static void _print_usage(char **argv) 266 | { 267 | fprintf( 268 | stderr, 269 | "Usage: %s " 270 | "[ ]...\n", 271 | argv[0] 272 | ); 273 | } 274 | 275 | static bool _parse_args(int argc, char **argv, args_t *args) 276 | { 277 | if (argc < 4) { 278 | _print_usage(argv); 279 | return false; 280 | } 281 | 282 | args->tcp_port = atoi(argv[1]); 283 | 284 | args->root_dir = argv[2]; 285 | 286 | int n_links = atoi(argv[3]); 287 | 288 | if (argc != 4 + n_links * 3){ 289 | _print_usage(argv); 290 | return false; 291 | } 292 | 293 | args->interfaces.reserve(n_links); 294 | 295 | for (int i = 0; i < n_links; i++) { 296 | args_t::interface_t interface; 297 | 298 | interface.link_name = argv[4 + 3 * i]; 299 | 300 | struct in_addr in_addr; 301 | if (inet_aton(argv[5 + 3 * i], &in_addr) != 1) { 302 | fprintf(stderr, "Failed to parse the IPv4.\n"); 303 | _print_usage(argv); 304 | return false; 305 | } 306 | interface.ipv4_addr = ipv4_addr_t::from_in_addr(in_addr); 307 | 308 | interface.n_workers = atoi(argv[6 + 3 * i]); 309 | 310 | args->interfaces.push_back(interface); 311 | } 312 | 313 | return true; 314 | } 315 | 316 | static void _preload_files( 317 | unordered_map *files, const char *root_dir 318 | ) 319 | { 320 | DIR *dir; 321 | 322 | if (!(dir = opendir(root_dir))) 323 | HTTPD_DIE("Unable to open the directory"); 324 | 325 | struct dirent *entry; 326 | 327 | size_t root_dir_len = strlen(root_dir); 328 | 329 | while ((entry = readdir(dir))) { 330 | filename_t filename = { strdup(entry->d_name) }; 331 | 332 | // Filename with the directory path. 
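        // (e.g. "bench/pages" + "index.htm" gives "bench/pages/index.htm";
        // names illustrative. The '+ 2' reserves the '/' and the '\0'.)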
333 |         char *filepath = new char[root_dir_len + strlen(filename.value) + 2];
334 |         strcpy(filepath, root_dir);
335 |         filepath[root_dir_len] = '/';
336 |         strcpy(filepath + root_dir_len + 1, filename.value);
337 | 
338 |         // Skips directories.
339 |         struct stat stat_buffer;
340 |         if (stat(filepath, &stat_buffer) != 0)
341 |             HTTPD_DIE("Unable to get info on a file (%s)", filename.value);
342 |         if (S_ISDIR(stat_buffer.st_mode))
343 |             continue;
344 | 
345 |         FILE *file;
346 |         if (!(file = fopen(filepath, "r")))
347 |             HTTPD_DIE("Unable to open a file");
348 | 
349 |         // Obtains the size of the file.
350 |         fseek(file, 0, SEEK_END);
351 |         size_t content_size = ftell(file);
352 |         fseek(file, 0, SEEK_SET);
353 | 
354 |         // Reads the file content.
355 | 
356 |         char *content = new char[content_size + 1];
357 |         size_t read = fread(content, 1, content_size, file);
358 | 
359 |         if (read != content_size)
360 |             HTTPD_DIE("Unable to read a file %zu %zu", read, content_size);
361 | 
362 |         content[content_size] = '\0';
363 | 
364 |         fclose(file);
365 | 
366 | #ifdef USE_PRECOMPUTED_CHECKSUMS
367 |         file_t entry = {
368 |             content, content_size, precomputed_sums_t(content, content_size)
369 |         };
370 | #else
371 |         file_t entry = { content, content_size };
372 | #endif /* USE_PRECOMPUTED_CHECKSUMS */
373 | 
374 |         files->emplace(filename, entry);
375 |     }
376 | 
377 |     HTTPD_DEBUG("%zu file(s) preloaded", files->size());
378 | }
379 | 
380 | static void _do_nothing(void)
381 | {
382 | }
383 | 
384 | static void _on_received_data(
385 |     unordered_map<filename_t, file_t> *files, mpipe_t::tcp_t::conn_t conn,
386 |     mpipe_t::cursor_t in
387 | )
388 | {
389 |     // Expects that the first received segment contains the entire request.
390 | 
391 |     size_t size = in.size();
392 | 
393 |     #define BAD_REQUEST(WHY, ...)                                              \
394 |         do {                                                                   \
395 |             HTTPD_ERROR("400 Bad Request (" WHY ")", ##__VA_ARGS__);           \
396 |             _respond_with_400(conn);                                           \
397 |             conn.close();                                                      \
398 |             return;                                                            \
399 |         } while (0)
400 | 
401 |     if (UNLIKELY(size < sizeof ("XXX / HTTP/X.X\n")))
402 |         BAD_REQUEST("Not enough received data for the HTTP header");
403 | 
404 |     in.read_with(
405 |         [files, conn](const char *buffer) mutable
406 |         {
407 |             //
408 |             // Extracts the filename from the HTTP header.
409 |             //
410 | 
411 |             size_t get_len = sizeof ("GET /") - sizeof ('\0');
412 | 
413 |             if (UNLIKELY(strncmp(buffer, "GET /", get_len) != 0))
414 |                 BAD_REQUEST("Not a GET request");
415 | 
416 |             const char *path_begin = buffer + get_len;
417 |             const char *path_end = strchr(path_begin, ' ');
418 |             if (UNLIKELY(path_end == nullptr)) BAD_REQUEST("Invalid header");
419 | 
420 |             const char *http11_begin = path_end + 1;
421 |             size_t http11_len = sizeof ("HTTP/1.1") - sizeof ('\0');
422 |             const char *http11_end = http11_begin + http11_len;
423 | 
424 |             if (UNLIKELY(strncmp(http11_begin, "HTTP/1.1", http11_len) != 0))
425 |                 BAD_REQUEST("Not HTTP 1.1");
426 | 
427 |             if (UNLIKELY(http11_end[0] != '\n' && http11_end[0] != '\r'))
428 |                 BAD_REQUEST("Invalid header");
429 | 
430 |             size_t path_len = (intptr_t) path_end - (intptr_t) path_begin;
431 |             char *path = (char *) alloca(path_len + 1); // +1 for the '\0' below.
432 | 
433 |             strncpy(path, path_begin, path_len);
434 |             path[path_len] = '\0';
435 | 
436 |             //
437 |             // Responds to the request.
438 | // 439 | 440 | auto file_it = files->find({ path }); 441 | 442 | if (LIKELY(file_it != files->end())) { 443 | HTTPD_DEBUG("200 OK - \"%s\"", path); 444 | _respond_with_200(conn, &file_it->second); 445 | } else { 446 | HTTPD_ERROR("404 Not Found - \"%s\"", path); 447 | _respond_with_404(conn); 448 | } 449 | 450 | conn.close(); 451 | }, size 452 | ); 453 | 454 | #undef BAD_REQUEST 455 | } 456 | 457 | void _respond_with_200(mpipe_t::tcp_t::conn_t conn, const file_t *file) 458 | { 459 | constexpr char header[] = "HTTP/1.1 200 OK\r\n" 460 | "Content-Type: text/html\r\n" 461 | "Content-Length: %10zu\r\n" 462 | "\r\n"; 463 | 464 | constexpr size_t header_len = sizeof (header) - sizeof ('\0') 465 | - sizeof ("%10zu") 466 | + sizeof ("4294967295"); 467 | 468 | size_t total_len = header_len + file->content_len; 469 | 470 | #ifdef USE_PRECOMPUTED_CHECKSUMS 471 | mpipe_t::tcp_t::writer_sum_t writer = 472 | #else 473 | mpipe_t::tcp_t::writer_t writer = 474 | #endif /* USE_PRECOMPUTED_CHECKSUMS */ 475 | [file](size_t offset, mpipe_t::cursor_t out) 476 | { 477 | size_t content_offset; 478 | 479 | #ifdef USE_PRECOMPUTED_CHECKSUMS 480 | partial_sum_t partial_sum; 481 | #endif /* USE_PRECOMPUTED_CHECKSUMS */ 482 | 483 | // Writes the HTTP header if required. 484 | if (offset < header_len) { 485 | tmc_mem_prefetch(header, sizeof (header)); 486 | 487 | char buffer[header_len + 1]; 488 | 489 | snprintf(buffer, sizeof (buffer), header, file->content_len); 490 | 491 | size_t to_write = min(out.size(), header_len - offset); 492 | 493 | out = out.write(buffer + offset, to_write); 494 | content_offset = 0; 495 | 496 | #ifdef USE_PRECOMPUTED_CHECKSUMS 497 | partial_sum = partial_sum_t(buffer + offset, to_write); 498 | #endif /* USE_PRECOMPUTED_CHECKSUMS */ 499 | } else { 500 | content_offset = offset - header_len; 501 | 502 | #ifdef USE_PRECOMPUTED_CHECKSUMS 503 | partial_sum = partial_sum_t::ZERO; 504 | #endif /* USE_PRECOMPUTED_CHECKSUMS */ 505 | } 506 | 507 | size_t out_size = out.size(); 508 | 509 | tmc_mem_prefetch(file->content + content_offset, out_size); 510 | 511 | #ifdef USE_PRECOMPUTED_CHECKSUMS 512 | size_t content_end = content_offset + out_size; 513 | file->precomputed_sums.prefetch(content_offset, content_end); 514 | #endif /* USE_PRECOMPUTED_CHECKSUMS */ 515 | 516 | // Writes the file content if required. 517 | if (out_size > 0) { 518 | assert(out_size <= file->content_len - content_offset); 519 | out.write(file->content + content_offset, out_size); 520 | } 521 | 522 | #ifdef USE_PRECOMPUTED_CHECKSUMS 523 | // Returns the precomputed checksum sums. 
524 | return (partial_sum_t) partial_sum.append( 525 | file->precomputed_sums.sum(content_offset, content_end) 526 | ); 527 | #endif /* USE_PRECOMPUTED_CHECKSUMS */ 528 | }; 529 | 530 | conn.send(total_len, writer, _do_nothing /* Does nothing on ACK */); 531 | } 532 | 533 | #define RESPOND_WITH_CONTENT(CONTENT) \ 534 | do { \ 535 | constexpr char status[] = CONTENT; \ 536 | \ 537 | conn.send( \ 538 | sizeof (status) - 1, \ 539 | [](size_t offset, mpipe_t::cursor_t out) \ 540 | { \ 541 | out.write(status + offset, out.size()); \ 542 | }, _do_nothing \ 543 | ); \ 544 | } while (0); 545 | 546 | void _respond_with_400(mpipe_t::tcp_t::conn_t conn) 547 | { 548 | RESPOND_WITH_CONTENT("HTTP/1.1 400 Bad Request\r\n\r\n"); 549 | } 550 | 551 | void _respond_with_404(mpipe_t::tcp_t::conn_t conn) 552 | { 553 | RESPOND_WITH_CONTENT("HTTP/1.1 404 Not Found\r\n\r\n"); 554 | } 555 | 556 | #undef RESPOND_WITH_CONTENT 557 | 558 | #undef HTTPD_COLOR 559 | #undef HTTPD_DEBUG 560 | #undef HTTPD_DIE 561 | -------------------------------------------------------------------------------- /bench/bench_pages.lua: -------------------------------------------------------------------------------- 1 | n_pages = 500 2 | 3 | request = function() 4 | r = math.random(0, n_pages) 5 | path = "/" .. r .. ".htm" 6 | return wrk.format(nil, path) 7 | end 8 | 9 | -------------------------------------------------------------------------------- /bench/pages.tar.bz2: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RaphaelJ/rusty/698ff86e1f17b23356679171ae9ae6ad41fe0137/bench/pages.tar.bz2 -------------------------------------------------------------------------------- /doc/img/architecture.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RaphaelJ/rusty/698ff86e1f17b23356679171ae9ae6ad41fe0137/doc/img/architecture.png -------------------------------------------------------------------------------- /doc/img/performances.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RaphaelJ/rusty/698ff86e1f17b23356679171ae9ae6ad41fe0137/doc/img/performances.png -------------------------------------------------------------------------------- /driver/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | include_directories (../) 2 | 3 | add_library (driver buffer.cpp cpu.cpp mpipe.cpp) 4 | 5 | target_link_libraries (driver util) -------------------------------------------------------------------------------- /driver/allocator.hpp: -------------------------------------------------------------------------------- 1 | // 2 | // Defines a C++ STL allocator which wraps the TMC memory management library. 3 | // 4 | // Copyright 2015 Raphael Javaux 5 | // University of Liege. 6 | // 7 | // This program is free software: you can redistribute it and/or modify 8 | // it under the terms of the GNU General Public License as published by 9 | // the Free Software Foundation, either version 3 of the License, or 10 | // (at your option) any later version. 11 | // 12 | // This program is distributed in the hope that it will be useful, 13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | // GNU General Public License for more details. 
16 | // 17 | // You should have received a copy of the GNU General Public License 18 | // along with this program. If not, see . 19 | // 20 | 21 | #ifndef __RUSTY_DRIVER_ALLOCATOR_HPP__ 22 | #define __RUSTY_DRIVER_ALLOCATOR_HPP__ 23 | 24 | #include // shared_ptr 25 | #include // forward 26 | 27 | #include // tmc_alloc_t, tmc_alloc_* 28 | #include // tmc_mspace_* 29 | 30 | #include "driver/driver.hpp" 31 | 32 | #include 33 | 34 | using namespace std; 35 | 36 | namespace rusty { 37 | namespace driver { 38 | 39 | // Allocator which will use the provided 'tmc_alloc_t' configuration to allocate 40 | // an heap on which it will be able to allocate data. 41 | // 42 | // This can be used to specify how memory of STL containers should be cached on 43 | // the Tilera device. 44 | // 45 | // Multiple threads should be able to allocate/deallocate memory concurrently. 46 | // 47 | // Every allocated data will be freed when the object and all of its copies will 48 | // be destucted. Thus you must at least have one copy of 'tile_allocator_t' 49 | // alive to be able to use allocated memories. 50 | template 51 | struct tile_allocator_t { 52 | // 53 | // Member types 54 | // 55 | 56 | typedef T value_type; 57 | typedef T* pointer; 58 | typedef const T* const_pointer; 59 | typedef T& reference; 60 | typedef const T& const_reference; 61 | typedef size_t size_type; 62 | 63 | template 64 | struct rebind { 65 | typedef tile_allocator_t other; 66 | }; 67 | 68 | // 69 | // Member fields 70 | // 71 | 72 | // Uses a 'shared_ptr' to the 'tmc_mspace' with a destructor which frees the 73 | // memory space once no more 'tile_allocator_t' are referencing it. 74 | 75 | shared_ptr mspace; 76 | 77 | // 78 | // Methods 79 | // 80 | 81 | // Creates an allocator which uses a tmc_alloc_t initialized with 82 | // TMC_ALLOC_INIT to allocate pages for the heap. 83 | inline tile_allocator_t(void) 84 | { 85 | tmc_alloc_t alloc = TMC_ALLOC_INIT; 86 | _init_mspace(&alloc); 87 | } 88 | 89 | template 90 | inline tile_allocator_t(const tile_allocator_t& other) 91 | : mspace(other.mspace) 92 | { 93 | } 94 | 95 | // Creates an allocator which uses the given tmc_alloc_t to allocate pages 96 | // for the heap. 97 | inline tile_allocator_t(tmc_alloc_t *alloc) 98 | { 99 | _init_mspace(&alloc); 100 | } 101 | 102 | // Creates an allocator which uses a tmc_alloc_t initialized with 103 | // TMC_ALLOC_INIT on which tmc_alloc_set_home() with the given home 104 | // parameter to allocate pages for the heap. 105 | // 106 | // home can be: 107 | // 108 | // * a CPU number. The memory will be cached on ths CPU. 109 | // * TMC_ALLOC_HOME_SINGLE. The memory will be cached on a single CPU, 110 | // choosen by the operating system. 111 | // * TMC_ALLOC_HOME_HERE. The memory will be cached on the CPU which called 112 | // allocate(). 113 | // * TMC_ALLOC_HOME_TASK. The memory will be cached on the CPU which is 114 | // accessing it. The kernel will automatically migrates page between CPUs. 115 | // * TMC_ALLOC_HOME_HASH. The memory will home cache will be distributed 116 | // around via hash-for-home. 117 | // * TMC_ALLOC_HOME_NONE. The memory will not be cached. 118 | // * TMC_ALLOC_HOME_INCOHERENT. Memory is incoherent between CPUs, and 119 | // requires explicit flush and invalidate to enforce coherence. 120 | // * TMC_ALLOC_HOME_DEFAULT. Use operating system default. 
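    //
    // A minimal usage sketch (illustrative, not from the original sources):
    // an STL vector whose heap pages are cached on the calling CPU:
    //
    //     tile_allocator_t<int> alloc(TMC_ALLOC_HOME_HERE);
    //     vector<int, tile_allocator_t<int>> v(alloc);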
121 | inline tile_allocator_t(int home) 122 | { 123 | tmc_alloc_t alloc = TMC_ALLOC_INIT; 124 | tmc_alloc_set_home(&alloc, home); 125 | _init_mspace(&alloc); 126 | } 127 | 128 | // Creates an allocator which uses a tmc_alloc_t initialized with 129 | // TMC_ALLOC_INIT on which tmc_alloc_set_pagesize() with the given pagesize 130 | // parameter to allocate pages for the heap. 131 | // 132 | // The size is rounded up to the nearest page size. If no single page can 133 | // hold the given number of bytes, the largest page size is selected, and 134 | // the method returns NULL. 135 | inline tile_allocator_t(size_t pagesize) 136 | { 137 | tmc_alloc_t alloc = TMC_ALLOC_INIT; 138 | tmc_alloc_set_pagesize(&alloc, pagesize); 139 | _init_mspace(&alloc); 140 | } 141 | 142 | // Combines 'tile_allocator_t(int home)' and 143 | // 'tile_allocator_t(size_t pagesize)'. 144 | inline tile_allocator_t(int home, size_t pagesize) 145 | { 146 | tmc_alloc_t alloc = TMC_ALLOC_INIT; 147 | tmc_alloc_set_home(&alloc, home); 148 | tmc_alloc_set_pagesize(&alloc, pagesize); 149 | _init_mspace(&alloc); 150 | } 151 | 152 | // ------------------------------------------------------------------------- 153 | 154 | // 155 | // Allocator methods and operators. 156 | // 157 | 158 | inline T* address(T& obj) 159 | { 160 | return &obj; 161 | } 162 | 163 | inline T* allocate(size_t length) 164 | { 165 | // DRIVER_DEBUG("allocate<%s>(%zu)", typeid (T).name(), length); 166 | 167 | return (T*) tmc_mspace_malloc(*mspace, length * sizeof (T)); 168 | } 169 | 170 | inline void deallocate(T* ptr, size_t length) 171 | { 172 | // DRIVER_DEBUG("deallocate<%s>(%zu)", typeid (T).name(), length); 173 | 174 | tmc_mspace_free(ptr); 175 | } 176 | 177 | template 178 | void construct(U* p, Args&&... args) 179 | { 180 | new (p) U(forward(args) ...); 181 | } 182 | 183 | template 184 | void destroy(U* p) 185 | { 186 | p->~U(); 187 | } 188 | 189 | friend inline bool operator==( 190 | const tile_allocator_t& a, const tile_allocator_t& b 191 | ) 192 | { 193 | return *(a.mspace) == *(b.mspace); 194 | } 195 | 196 | friend inline bool operator!=( 197 | const tile_allocator_t& a, const tile_allocator_t& b 198 | ) 199 | { 200 | return !(a == b); 201 | } 202 | 203 | private: 204 | 205 | inline void _init_mspace(tmc_alloc_t *alloc) 206 | { 207 | mspace = shared_ptr(new tmc_mspace, _free_mspace); 208 | *mspace = tmc_mspace_create_special(0, 0, alloc); 209 | } 210 | 211 | static void _free_mspace(tmc_mspace *mspace) 212 | { 213 | DRIVER_DEBUG("Freeing mpace starting at %zu", (size_t) *mspace); 214 | tmc_mspace_destroy(*mspace); 215 | delete mspace; 216 | } 217 | }; 218 | 219 | } } /* namespace rusty::driver */ 220 | 221 | #endif /* __RUSTY_DRIVER_ALLOCATOR_HPP__ */ 222 | -------------------------------------------------------------------------------- /driver/buffer.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Provides an higher level interface to mPIPE buffers. 3 | // 4 | // Copyright 2015 Raphael Javaux 5 | // University of Liege. 6 | // 7 | // This program is free software: you can redistribute it and/or modify 8 | // it under the terms of the GNU General Public License as published by 9 | // the Free Software Foundation, either version 3 of the License, or 10 | // (at your option) any later version. 11 | // 12 | // This program is distributed in the hope that it will be useful, 13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the 15 | // GNU General Public License for more details. 16 | // 17 | // You should have received a copy of the GNU General Public License 18 | // along with this program. If not, see . 19 | // 20 | 21 | #include 22 | #include // shared_ptr, make_shared 23 | 24 | #include // get_cycle_count() 25 | 26 | #include // MPIPE_EDMA_DESC_* 27 | 28 | #include "driver/driver.hpp" 29 | 30 | #include "driver/buffer.hpp" 31 | #include "driver/allocator.hpp" 32 | 33 | namespace rusty { 34 | namespace driver { 35 | namespace buffer { 36 | 37 | #ifdef MPIPE_CHAINED_BUFFERS 38 | const cursor_t cursor_t::EMPTY = { 39 | shared_ptr<_buffer_desc_t>(nullptr), nullptr, 0, 40 | shared_ptr(nullptr), 0 41 | }; 42 | #else 43 | const cursor_t cursor_t::EMPTY = { 44 | shared_ptr<_buffer_desc_t>(nullptr), nullptr, 0 45 | }; 46 | #endif /* MPIPE_CHAINED_BUFFERS */ 47 | 48 | } } } /* namespace rusty::driver::buffer */ 49 | -------------------------------------------------------------------------------- /driver/buffer.hpp: -------------------------------------------------------------------------------- 1 | // 2 | // Provides an higher level interface to mPIPE buffers. 3 | // 4 | // Copyright 2015 Raphael Javaux 5 | // University of Liege. 6 | // 7 | // This program is free software: you can redistribute it and/or modify 8 | // it under the terms of the GNU General Public License as published by 9 | // the Free Software Foundation, either version 3 of the License, or 10 | // (at your option) any later version. 11 | // 12 | // This program is distributed in the hope that it will be useful, 13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | // GNU General Public License for more details. 16 | // 17 | // You should have received a copy of the GNU General Public License 18 | // along with this program. If not, see . 19 | // 20 | 21 | #ifndef __RUSTY_DRIVERS_BUFFER_HPP__ 22 | #define __RUSTY_DRIVERS_BUFFER_HPP__ 23 | 24 | #include // min() 25 | #include 26 | #include 27 | #include 28 | #include // shared_ptr 29 | 30 | #include // gxio_mpipe_* 31 | 32 | #include "util/macros.hpp" 33 | 34 | using namespace std; 35 | 36 | namespace rusty { 37 | namespace driver { 38 | namespace buffer { 39 | 40 | // Used internally to manage an mPIPE buffer life cycle. 41 | struct _buffer_desc_t; 42 | 43 | // Structure which can be used as an iterator to read and write into an mPIPE 44 | // (possibly chained) buffer. 45 | // 46 | // The internal state of the cursor is never modified. That is, when data is 47 | // read or written, a new cursor is returned without the previous one being 48 | // modified. This makes it easier to use (you can chain methods, e.g. 49 | // 'cursor.read(&a).drop(10).read(&b);') and backtracking is just a matter of 50 | // reusing an old cursor. 51 | struct cursor_t { 52 | 53 | // A cursor state is represented by the current buffer descriptor, the next 54 | // byte to read/write in the current buffer, the remaining bytes in this 55 | // buffer and a reference to the cursor containing the next buffer descriptor. 56 | // 57 | // 'current_size' can only be equal to zero if there is no buffer after. 58 | // That is, if the end of the current buffer is reached ('current_size' 59 | // become zero), the cursor must load the next buffer descriptor ('current' 60 | // must point to the next buffer's first byte). This makes 'read_in_place()' 61 | // and 'write_in_place()' implementations easier. 
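    //
    // Because cursors are immutable, an old cursor can be kept around to
    // backtrack (names below are illustrative):
    //
    //     uint16_t field;
    //     cursor_t after = cursor.read(&field); // 'cursor' still points at
    //                                           // 'field'; 'after' is past it.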
62 | 63 | // State of the cursor at the end of the buffer chain. 64 | static const cursor_t EMPTY; 65 | 66 | shared_ptr<_buffer_desc_t> desc; 67 | 68 | char *current; // Next byte to read/write. 69 | size_t current_size; 70 | 71 | #ifdef MPIPE_CHAINED_BUFFERS 72 | 73 | shared_ptr next; // Following buffers. 74 | size_t next_size; // Total size of 75 | // following buffers. 76 | 77 | #endif /* MPIPE_CHAINED_BUFFERS */ 78 | 79 | // Creates a buffer's cursor from an ingress packet descriptor. 80 | // 81 | // If 'managed' is true, buffer descriptors will be freed automatically by 82 | // calling 'gxio_mpipe_push_buffer_bdesc()'. 83 | // 84 | // Complexity: O(n) where 'n' is the number of buffer descriptors in the 85 | // chain. 86 | template > 87 | inline cursor_t( 88 | gxio_mpipe_context_t *context, gxio_mpipe_idesc_t *idesc, bool managed, 89 | alloc_t alloc = alloc_t() 90 | ) 91 | { 92 | // gxio_mpipe_idesc_to_bdesc() seems to be broken on MDE v4.3.2. 93 | // gxio_mpipe_bdesc_t edesc = gxio_mpipe_idesc_to_bdesc(idesc); 94 | gxio_mpipe_bdesc_t edesc; 95 | edesc.word = idesc->words[7]; 96 | 97 | size_t total_size = gxio_mpipe_idesc_get_xfer_size(idesc); 98 | 99 | _init_with_bdesc(context, &edesc, total_size, managed, alloc); 100 | } 101 | 102 | // Creates a buffer's cursor from a (possibly chained) buffer descriptor. 103 | // 104 | // If 'managed' is true, buffer descriptors will be freed automatically by 105 | // calling 'gxio_mpipe_push_buffer_bdesc()'. 106 | // 107 | // Complexity: O(n) where 'n' is the number of buffer descriptors in the 108 | // chain. 109 | template > 110 | inline cursor_t( 111 | gxio_mpipe_context_t *context, gxio_mpipe_bdesc_t *bdesc, 112 | size_t total_size, bool managed, alloc_t alloc = alloc_t() 113 | ) 114 | { 115 | _init_with_bdesc(context, bdesc, total_size, managed, alloc); 116 | } 117 | 118 | // Returns the total number of remaining bytes. 119 | // 120 | // Complexity: O(1). 121 | inline size_t size(void) const 122 | { 123 | #ifdef MPIPE_CHAINED_BUFFERS 124 | return current_size + next_size; 125 | #else 126 | return current_size; 127 | #endif/* MPIPE_CHAINED_BUFFERS */ 128 | } 129 | 130 | // True if there is nothing more to read. 131 | // 132 | // Complexity: O(1). 133 | inline bool empty(void) const 134 | { 135 | #ifdef MPIPE_CHAINED_BUFFERS 136 | if (current_size == 0) { 137 | assert(next_size == 0); 138 | return true; 139 | } else 140 | return false; 141 | #else 142 | return current_size == 0; 143 | #endif /* MPIPE_CHAINED_BUFFERS */ 144 | } 145 | 146 | // Returns a new cursor which references the 'n' first bytes of the cursor. 147 | // 148 | // If 'n' is larger than the size of the cursor (given 'size()'), the 149 | // original cursor is returned. 150 | // 151 | // Complexity: O(1). 152 | inline cursor_t take(size_t n) const 153 | { 154 | #ifdef MPIPE_CHAINED_BUFFERS 155 | if (n <= current_size) 156 | return cursor_t(desc, current, n, nullptr, 0); 157 | else if (n >= size()) 158 | return *this; 159 | else { 160 | return cursor_t( 161 | desc, current, current_size, next, next_size - n 162 | ); 163 | } 164 | #else 165 | return cursor_t(desc, current, min(current_size, n)); 166 | #endif/* MPIPE_CHAINED_BUFFERS */ 167 | } 168 | 169 | // Returns a new cursor which references 'n' bytes after the cursor. 170 | // 171 | // Returns an empty cursor if the 'n' is larger than 'size()'. 172 | // 173 | // Complexity: O(n) with chained buffer, O(1) with unchained buffers. 
174 | inline cursor_t drop(size_t n) const 175 | { 176 | if (n >= size()) 177 | return EMPTY; 178 | else { 179 | #ifdef MPIPE_CHAINED_BUFFERS 180 | cursor_t cursor = *this; 181 | while (n > 0 && n >= cursor.current_size) { 182 | n -= cursor.current_size; 183 | cursor = *(cursor.next); 184 | } 185 | 186 | return cursor._drop_in_buffer(n); 187 | #else 188 | return _drop_in_buffer(n); 189 | #endif /* MPIPE_CHAINED_BUFFERS */ 190 | } 191 | } 192 | 193 | // Equivalent to 'drop(sizeof (T))'. 194 | template 195 | inline cursor_t drop() const 196 | { 197 | return drop(sizeof (T)); 198 | } 199 | 200 | // Equivalent to 'drop(sizeof (T) * n)'. 201 | template 202 | inline cursor_t drop(size_t n) const 203 | { 204 | return drop(sizeof (T) * n); 205 | } 206 | 207 | // ------------------------------------------------------------------------- 208 | 209 | // 210 | // Copying read and write. 211 | // 212 | 213 | // Returns true if there is enough bytes left to read or write 'n' bytes* 214 | // using 'read()' or 'write()'. 215 | // 216 | // Complexity: O(1). 217 | inline bool can(size_t n) const 218 | { 219 | return n <= size(); 220 | } 221 | 222 | // Equivalent to 'can(sizeof (T))'. 223 | template 224 | inline bool can() const 225 | { 226 | return can(sizeof (T)); 227 | } 228 | 229 | // Reads 'n' bytes of data. There must be enough bytes in the buffer to read 230 | // the item (see 'can()'). 231 | // 232 | // Returns a new buffer which references the data following what has been 233 | // read. 234 | // 235 | // Complexity: O(n) where 'n' is the number of bytes to read. 236 | inline cursor_t read(char *data, size_t n) const 237 | { 238 | assert(can(n)); 239 | 240 | #ifdef MPIPE_CHAINED_BUFFERS 241 | cursor_t cursor = *this; 242 | 243 | while (n > cursor.current_size) { 244 | memcpy(data, cursor.current, cursor.current_size); 245 | n -= cursor.current_size; 246 | cursor = *(cursor.next); 247 | } 248 | 249 | if (n > 0) { 250 | memcpy(data, cursor.current, n); 251 | cursor = cursor._drop_in_buffer(n); 252 | } 253 | 254 | return cursor; 255 | #else 256 | memcpy(data, current, n); 257 | return _drop_in_buffer(n); 258 | #endif /* MPIPE_CHAINED_BUFFERS */ 259 | } 260 | 261 | // Equivalent to 'read(data, sizeof (T))'. 262 | template 263 | inline cursor_t read(T *data) const 264 | { 265 | return read((char *) data, sizeof (T)); 266 | } 267 | 268 | // Writes 'n' bytes of data. There must be enough bytes in the buffer to 269 | // write the item (see 'can()'). 270 | // 271 | // Returns a new buffer which references the data following what has been 272 | // written. 273 | // 274 | // Complexity: O(n) where 'n' is the number of bytes to write. 275 | inline cursor_t write(const char *data, size_t n) const 276 | { 277 | assert(can(n)); 278 | 279 | #ifdef MPIPE_CHAINED_BUFFERS 280 | cursor_t cursor = *this; 281 | 282 | while (n > cursor.current_size) { 283 | memcpy(cursor.current, data, cursor.current_size); 284 | n -= cursor.current_size; 285 | cursor = *(cursor.next); 286 | } 287 | 288 | if (n > 0) { 289 | memcpy(cursor.current, data, n); 290 | cursor = cursor._drop_in_buffer(n); 291 | } 292 | 293 | return cursor; 294 | #else 295 | memcpy(current, data, n); 296 | return _drop_in_buffer(n); 297 | #endif /* MPIPE_CHAINED_BUFFERS */ 298 | } 299 | 300 | // Equivalent to 'write(data, sizeof (T))'. 
301 | template 302 | inline cursor_t write(const T *data) const 303 | { 304 | return write((const char *) data, sizeof (T)); 305 | } 306 | 307 | // ------------------------------------------------------------------------- 308 | 309 | // 310 | // In-place read and write. 311 | // 312 | 313 | // Returns true if there is enough bytes left in the *current buffer* to 314 | // read or write 'n' bytes with 'in_place()'. 315 | // 316 | // Complexity: O(1). 317 | inline bool can_in_place(size_t n) const 318 | { 319 | return n <= current_size; 320 | } 321 | 322 | // Equivalent to 'can_in_place(sizeof (T))'. 323 | template 324 | inline bool can_in_place() const 325 | { 326 | return can_in_place(sizeof (T)); 327 | } 328 | 329 | // Gives a pointer to read or write the given number of bytes directly in 330 | // the buffer's memory without copying. 331 | // 332 | // Returns a new buffer which references the data following what is to be 333 | // read or written. 334 | // 335 | // Complexity: O(1). 336 | inline cursor_t in_place(char **data, size_t n) 337 | { 338 | assert(can_in_place(n)); 339 | 340 | *data = current; 341 | 342 | #ifdef MPIPE_CHAINED_BUFFERS 343 | if (n == current_size) 344 | return *next; 345 | else 346 | return _drop_in_buffer(n); 347 | #else 348 | return _drop_in_buffer(n); 349 | #endif /* MPIPE_CHAINED_BUFFERS */ 350 | } 351 | 352 | // Gives a pointer to read or write the given number of bytes directly in 353 | // the buffer's memory without copying. 354 | // 355 | // Returns a new buffer which references the data following what is to be 356 | // read or written. 357 | // 358 | // Complexity: O(1). 359 | inline cursor_t in_place(const char **data, size_t n) const 360 | { 361 | assert(can_in_place(n)); 362 | 363 | *data = current; 364 | 365 | #ifdef MPIPE_CHAINED_BUFFERS 366 | if (n == current_size) 367 | return *next; 368 | else 369 | return _drop_in_buffer(n); 370 | #else 371 | return _drop_in_buffer(n); 372 | #endif /* MPIPE_CHAINED_BUFFERS */ 373 | } 374 | 375 | // Equivalent to 'in_place(data, sizeof (T))'. 376 | template 377 | inline cursor_t in_place(T **data) 378 | { 379 | return in_place((char **) data, sizeof (T)); 380 | } 381 | 382 | // Equivalent to 'in_place(data, sizeof (T))'. 383 | template 384 | inline cursor_t in_place(const T **data) const 385 | { 386 | return in_place((const char **) data, sizeof (T)); 387 | } 388 | 389 | // Gives to the given function a pointer to read 'n' bytes of data and a 390 | // cursor to the following data. The return value of the given function will 391 | // be forwarded as the return value of 'read_with()'. 392 | // 393 | // Will directly reference the buffer's memory if it's possible 394 | // ('can_in_place()'), will gives a reference to a copy otherwise. 395 | // 396 | // The call to the given function is a tail-call. 397 | // 398 | // Complexity: O(1) (best-case) or O(n) (worst-case) where 'n' is the number 399 | // of bytes to read. 400 | template 401 | inline R read_with(function f, size_t n) const 402 | { 403 | #ifdef MPIPE_CHAINED_BUFFERS 404 | if (can_in_place(n)) { 405 | const char *p; 406 | cursor_t cursor = in_place(&p, n); 407 | return f(p, cursor); 408 | } else { 409 | assert(can(n)); 410 | char data[n]; 411 | cursor_t cursor = read(data, n); 412 | return f(data, cursor); 413 | } 414 | #else 415 | const char *p; 416 | cursor_t cursor = in_place(&p, n); 417 | return f(p, cursor); 418 | #endif /* MPIPE_CHAINED_BUFFERS */ 419 | } 420 | 421 | // Equivalent to 'read_with(f, sizeof (T))'. 
422 | template 423 | inline R read_with(function f) const 424 | { 425 | #ifdef MPIPE_CHAINED_BUFFERS 426 | if (can_in_place()) { 427 | const T *p; 428 | cursor_t cursor = in_place(&p); 429 | return f(p, cursor); 430 | } else { 431 | assert(can()); 432 | T data; 433 | cursor_t cursor = read(&data); 434 | return f(&data, cursor); 435 | } 436 | #else 437 | const T *p; 438 | cursor_t cursor = in_place(&p); 439 | return f(p, cursor); 440 | #endif /* MPIPE_CHAINED_BUFFERS */ 441 | } 442 | 443 | // Gives a pointer to read 'n' bytes of data to the function. 444 | // 445 | // Will directly reference the buffer's memory if it's possible 446 | // ('can_in_place()'), will gives a reference to a copy otherwise. 447 | // 448 | // The call to the given function is *not* a tail-call. 449 | // 450 | // Complexity: O(1) (best-case) or O(n) (worst-case) where 'n' is the number 451 | // of bytes to read. 452 | inline cursor_t read_with(function f, size_t n) const 453 | { 454 | return read_with([&f](const char *data, cursor_t cursor) { 455 | f(data); 456 | return cursor; 457 | }, n); 458 | } 459 | 460 | // Equivalent to 'read_with(f, sizeof (T))'. 461 | template 462 | inline cursor_t read_with(function f) const 463 | { 464 | return read_with([&f](const T *data, cursor_t cursor) { 465 | f(data); 466 | return cursor; 467 | }); 468 | } 469 | 470 | // Gives a pointer to write 'n' bytes of data to the function. 471 | // 472 | // Will directly reference the buffer's memory if it's possible 473 | // ('can_in_place()'), will gives a reference to a copy otherwise. 474 | // 475 | // State of the memory referenced by the pointer is undefined. 476 | // 477 | // Complexity: O(1) (best-case) or O(n) (worst-case) where 'n' is the number 478 | // of bytes to write. 479 | inline cursor_t write_with(function f, size_t n) 480 | { 481 | #ifdef MPIPE_CHAINED_BUFFERS 482 | if (can_in_place(n)) { 483 | char *p; 484 | cursor_t cursor = in_place(&p, n); 485 | f(p); 486 | return cursor; 487 | } else { 488 | assert(can(n)); 489 | char data[n]; 490 | f(data); 491 | return write(data, n); 492 | } 493 | #else 494 | char *p; 495 | cursor_t cursor = in_place(&p, n); 496 | f(p); 497 | return cursor; 498 | #endif /* MPIPE_CHAINED_BUFFERS */ 499 | } 500 | 501 | // Equivalent to 'write_with(f, sizeof (T))'. 502 | template 503 | inline cursor_t write_with(function f) 504 | { 505 | #ifdef MPIPE_CHAINED_BUFFERS 506 | if (can_in_place()) { 507 | T *p; 508 | cursor_t cursor = in_place(&p); 509 | f(p); 510 | return cursor; 511 | } else { 512 | assert(can()); 513 | T data; 514 | f(&data); 515 | return write(&data); 516 | } 517 | #else 518 | T *p; 519 | cursor_t cursor = in_place(&p); 520 | f(p); 521 | return cursor; 522 | #endif /* MPIPE_CHAINED_BUFFERS */ 523 | } 524 | 525 | // Executes the given function on each buffer, in order. 526 | // 527 | // Complexity: O(n). 
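    //
    // e.g. (as used in 'app/echo.cpp' to copy a received buffer into 'out'):
    //
    //     in.for_each([&out](const char *buffer, size_t buffer_size) {
    //         out = out.write(buffer, buffer_size);
    //     });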
528 | inline void for_each(function f) const 529 | { 530 | #ifdef MPIPE_CHAINED_BUFFERS 531 | cursor_t cursor = *this; 532 | 533 | while (!cursor.empty()) { 534 | f(cursor.current, cursor.current_size); 535 | cursor = *cursor._next; 536 | } 537 | #else 538 | if (!empty()) 539 | f(current, current_size); 540 | #endif /* MPIPE_CHAINED_BUFFERS */ 541 | } 542 | 543 | private: 544 | #ifdef MPIPE_CHAINED_BUFFERS 545 | cursor_t( 546 | shared_ptr<_buffer_desc_t> _desc, char *_current, 547 | size_t _current_size, 548 | shared_ptr _next, size_t _next_size 549 | ) : desc(_desc), current(_current), current_size(_current_size), 550 | next(_next), next_size(_next_size) 551 | #else 552 | cursor_t( 553 | shared_ptr<_buffer_desc_t> _desc, char *_current, 554 | size_t _current_size 555 | ) : desc(_desc), current(_current), current_size(_current_size) 556 | #endif /* MPIPE_CHAINED_BUFFERS */ 557 | { 558 | } 559 | 560 | // Complexity: O(n) where 'n' is the number of buffer descriptors in the 561 | // chain. 562 | // 563 | // The allocator is used to allocate the 'shared_ptr'. 564 | template > 565 | void _init_with_bdesc( 566 | gxio_mpipe_context_t *context, gxio_mpipe_bdesc_t *bdesc, 567 | size_t total_size, bool managed, alloc_t alloc = alloc_t() 568 | ); 569 | 570 | // Returns a new cursor which references 'n' bytes after the cursor. 571 | // 572 | // Does *not* handle the case when 'n' is exactly equal to 'current_size' 573 | // (i.e. when a new buffer must be loaded). 574 | // 575 | // Complexity: O(1). 576 | inline cursor_t _drop_in_buffer(size_t n) const 577 | { 578 | #ifdef MPIPE_CHAINED_BUFFERS 579 | assert(can_in_place(n) && (n < current_size || next == nullptr)); 580 | 581 | return { 582 | desc, current + n, current_size - n, 583 | next, next_size 584 | }; 585 | #else 586 | assert(can_in_place(n)); 587 | 588 | return { desc, current + n, current_size - n }; 589 | #endif 590 | } 591 | }; 592 | 593 | struct _buffer_desc_t { 594 | gxio_mpipe_context_t *context; 595 | gxio_mpipe_bdesc_t bdesc; 596 | bool is_managed; // If true, the buffer will be released 597 | // when this object will be destructed. 598 | 599 | _buffer_desc_t( 600 | gxio_mpipe_context_t *_context, gxio_mpipe_bdesc_t _bdesc, 601 | bool _is_managed 602 | ) : context(_context), bdesc(_bdesc), is_managed(_is_managed) 603 | { 604 | } 605 | 606 | ~_buffer_desc_t(void) 607 | { 608 | if (is_managed) 609 | gxio_mpipe_push_buffer_bdesc(context, bdesc); 610 | } 611 | }; 612 | 613 | template 614 | void cursor_t::_init_with_bdesc( 615 | gxio_mpipe_context_t *context, gxio_mpipe_bdesc_t *bdesc, size_t total_size, 616 | bool is_managed, alloc_t alloc 617 | ) 618 | { 619 | // The end of the buffer chain could be reached because: 620 | // 1) there is no buffer descriptor. 621 | // 2) there is another buffer descriptor but we limited the number of bytes 622 | // we can use (this is used by slice methods such as 'take()'). 623 | // 3) the descriptor is invalid (last buffer in a chain). 624 | 625 | if ( 626 | bdesc == nullptr || total_size == 0 627 | || bdesc->c == MPIPE_EDMA_DESC_WORD1__C_VAL_INVALID 628 | ) { 629 | assert(total_size == 0); 630 | *this = EMPTY; 631 | return; 632 | } 633 | 634 | // Allocates a manageable buffer descriptor. 
593 | struct _buffer_desc_t {
594 |     gxio_mpipe_context_t *context;
595 |     gxio_mpipe_bdesc_t bdesc;
596 |     bool is_managed; // If true, the buffer will be released
597 |                      // when this object is destructed.
598 | 
599 |     _buffer_desc_t(
600 |         gxio_mpipe_context_t *_context, gxio_mpipe_bdesc_t _bdesc,
601 |         bool _is_managed
602 |     ) : context(_context), bdesc(_bdesc), is_managed(_is_managed)
603 |     {
604 |     }
605 | 
606 |     ~_buffer_desc_t(void)
607 |     {
608 |         if (is_managed)
609 |             gxio_mpipe_push_buffer_bdesc(context, bdesc);
610 |     }
611 | };
612 | 
613 | template <typename alloc_t>
614 | void cursor_t::_init_with_bdesc(
615 |     gxio_mpipe_context_t *context, gxio_mpipe_bdesc_t *bdesc, size_t total_size,
616 |     bool is_managed, alloc_t alloc
617 | )
618 | {
619 |     // The end of the buffer chain could be reached because:
620 |     // 1) there is no buffer descriptor.
621 |     // 2) there is another buffer descriptor but we limited the number of bytes
622 |     //    we can use (this is used by slice methods such as 'take()').
623 |     // 3) the descriptor is invalid (last buffer in a chain).
624 | 
625 |     if (
626 |         bdesc == nullptr || total_size == 0
627 |         || bdesc->c == MPIPE_EDMA_DESC_WORD1__C_VAL_INVALID
628 |     ) {
629 |         assert(total_size == 0);
630 |         *this = EMPTY;
631 |         return;
632 |     }
633 | 
634 |     // Allocates a managed buffer descriptor.
635 |     // desc = make_shared<_buffer_desc_t>(context, *bdesc, is_managed);
636 |     desc = allocate_shared<_buffer_desc_t>(alloc, context, *bdesc, is_managed);
637 | 
638 |     // The last 42 bits of the buffer descriptor contain the virtual address of
639 |     // the buffer, with the lower 7 bits being the offset of packet data inside
640 |     // this buffer.
641 |     //
642 |     // When the buffer is chained with other buffers, the next buffer descriptor
643 |     // is written in the first 8 bytes of the buffer and the offset is at least
644 |     // 8 bytes.
645 | 
646 |     char *va = (char *) ((intptr_t) bdesc->va << 7);
647 |     size_t offset = bdesc->__reserved_0;
648 | 
649 |     current = va + offset;
650 | 
651 | #ifdef MPIPE_CHAINED_BUFFERS
652 |     size_t buffer_size = gxio_mpipe_buffer_size_enum_to_buffer_size(
653 |         (gxio_mpipe_buffer_size_enum_t) bdesc->size
654 |     );
655 | 
656 |     switch (bdesc->c) {
657 |     case MPIPE_EDMA_DESC_WORD1__C_VAL_UNCHAINED:
658 |         assert(total_size <= buffer_size - offset);
659 | 
660 |         current_size = total_size;
661 |         next = nullptr;
662 |         next_size = 0;
663 |         return;
664 |     case MPIPE_EDMA_DESC_WORD1__C_VAL_CHAINED:
665 |         current_size = min(total_size, buffer_size - offset);
666 |         next_size = total_size - current_size;
667 |         next = make_shared<cursor_t>(
668 |             context, (gxio_mpipe_bdesc_t *) va, next_size, is_managed
669 |         );
670 |         return;
671 |     default:
672 |         DRIVER_DIE("Invalid buffer descriptor");
673 |     };
674 | #else
675 |     assert(bdesc->c == MPIPE_EDMA_DESC_WORD1__C_VAL_UNCHAINED);
676 | 
677 |     current_size = total_size;
678 | #endif /* MPIPE_CHAINED_BUFFERS */
679 | }
680 | 
681 | } } } /* namespace rusty::driver::buffer */
682 | 
683 | #endif /* __RUSTY_DRIVERS_BUFFER_HPP__ */
--------------------------------------------------------------------------------
/driver/clock.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // Provides a way to compute time intervals between two events. The interface
3 | // does not provide a way to know the date/hour/minute of an event, but can be
4 | // used to know how much time passed between two events.
5 | //
6 | // Relies on the CPU cycle count instead of the system clock for efficiency.
7 | //
8 | // Copyright 2015 Raphael Javaux
9 | // University of Liege.
10 | //
11 | // This program is free software: you can redistribute it and/or modify
12 | // it under the terms of the GNU General Public License as published by
13 | // the Free Software Foundation, either version 3 of the License, or
14 | // (at your option) any later version.
15 | //
16 | // This program is distributed in the hope that it will be useful,
17 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
18 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
19 | // GNU General Public License for more details.
20 | //
21 | // You should have received a copy of the GNU General Public License
22 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
23 | //
24 | 
25 | #ifndef __RUSTY_DRIVER_CLOCK_HPP__
26 | #define __RUSTY_DRIVER_CLOCK_HPP__
27 | 
28 | #include <cassert>
29 | #include <cmath>          // round()
30 | #include <cstdint>
31 | 
32 | #include <arch/cycle.h>   // get_cycle_count()
33 | 
34 | #include "driver/cpu.hpp" // CYCLES_PER_SECOND, cycles_t
35 | 
36 | using namespace std;
37 | 
38 | using namespace rusty::driver::cpu;
39 | 
40 | namespace rusty {
41 | namespace driver {
42 | 
43 | struct cpu_clock_t {
44 |     // Interval between two dates.
45 |     struct interval_t {
46 |         cycles_t cycles;
47 | 
48 |         inline interval_t(void) : cycles(0)
49 |         {
50 |         }
51 | 
52 |         // Creates a time interval from a number of microseconds (10^-6).
53 |         inline interval_t(uint64_t microsec)
54 |             : cycles(CYCLES_PER_SECOND / 1000000 * microsec)
55 |         {
56 |         }
57 | 
58 |         // Returns the number of microseconds (10^-6) in the time interval.
59 |         inline uint64_t microsec(void)
60 |         {
61 |             return this->cycles * 1000000 / CYCLES_PER_SECOND;
62 |         }
63 | 
64 |         inline interval_t operator+(interval_t other) const
65 |         {
66 |             return (interval_t) { this->cycles + other.cycles };
67 |         }
68 | 
69 |         // If 'this' is smaller than 'other', the result is the same as
70 |         // 'other - this'.
71 |         inline interval_t operator-(interval_t other) const
72 |         {
73 |             return (interval_t) { this->cycles - other.cycles };
74 |         }
75 | 
76 |         inline interval_t operator*(double factor) const
77 |         {
78 |             return (interval_t) { (cycles_t) round(this->cycles * factor) };
79 |         }
80 | 
81 |         inline interval_t operator*=(double factor)
82 |         {
83 |             this->cycles = (cycles_t) round(this->cycles * factor);
84 |             return *this;
85 |         }
86 | 
87 |         inline bool operator<(interval_t other) const
88 |         {
89 |             return this->cycles < other.cycles;
90 |         }
91 |     };
92 | 
93 |     // A time on which an interval can be computed.
94 |     //
95 |     // Time is stored as a CPU cycle count.
96 |     struct time_t {
97 |         cycles_t cycles;
98 | 
99 |         // Returns the next time value in the domain (i.e. the next cycle count
100 |         // value).
101 |         inline time_t next(void) const
102 |         {
103 |             return (time_t) { this->cycles + 1 };
104 |         }
105 | 
106 |         // Returns the interval between two times.
107 |         inline interval_t operator-(time_t other) const
108 |         {
109 |             assert(this->cycles >= other.cycles);
110 |             return (interval_t) { this->cycles - other.cycles };
111 |         }
112 | 
113 |         inline time_t operator+(interval_t interval) const
114 |         {
115 |             return (time_t) { this->cycles + interval.cycles };
116 |         }
117 | 
118 |         // Returns a 'time_t' object representing the current time.
119 |         inline static time_t now(void)
120 |         {
121 |             return (time_t) { get_cycle_count() };
122 |         }
123 |     };
124 | };
125 | 
126 | } } /* namespace rusty::driver */
127 | 
128 | namespace std {
129 | 
130 | // 'std::less<>' instance required for 'time_t' to be used in ordered
131 | // containers.
132 | 
133 | using namespace rusty::driver;
134 | 
135 | template <>
136 | struct less<cpu_clock_t::time_t> {
137 |     inline bool operator()(cpu_clock_t::time_t a, cpu_clock_t::time_t b) const
138 |     {
139 |         return a.cycles < b.cycles;
140 |     }
141 | };
142 | 
143 | }
144 | 
145 | #endif /* __RUSTY_DRIVER_CLOCK_HPP__ */
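A short usage sketch of this clock (illustrative, not from the sources): 'time_t::now()' samples the cycle counter, and the difference of two samples is an 'interval_t' that converts to microseconds.

    #include <cstdio>           // printf()

    #include "driver/clock.hpp"

    using namespace rusty::driver;

    static void timed_section(void)
    {
        cpu_clock_t::time_t t0 = cpu_clock_t::time_t::now();

        // ... the code being measured ...

        cpu_clock_t::interval_t dt = cpu_clock_t::time_t::now() - t0;

        // 'interval_t(50)' is a 50 µs interval, i.e. 60,000 cycles at the
        // 1.2 GHz frequency declared in 'driver/cpu.hpp'.
        if (cpu_clock_t::interval_t(50) < dt)
            printf("took %llu µs\n", (unsigned long long) dt.microsec());
    }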
--------------------------------------------------------------------------------
/driver/cpu.cpp:
--------------------------------------------------------------------------------
1 | //
2 | // Provides functions to manage dataplane Tiles.
3 | //
4 | // Copyright 2015 Raphael Javaux
5 | // University of Liege.
6 | //
7 | // This program is free software: you can redistribute it and/or modify
8 | // it under the terms of the GNU General Public License as published by
9 | // the Free Software Foundation, either version 3 of the License, or
10 | // (at your option) any later version.
11 | //
12 | // This program is distributed in the hope that it will be useful,
13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 | // GNU General Public License for more details.
16 | //
17 | // You should have received a copy of the GNU General Public License
18 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
19 | //
20 | 
21 | #include <cerrno>
22 | #include <cstdio>
23 | #include <cstdlib>
24 | #include <cstring>
25 | 
26 | #include <tmc/cpus.h>       // tmc_cpus_*
27 | #include <sys/dataplane.h>  // set_dataplane
28 | 
29 | #include "driver/driver.hpp"
30 | 
31 | #include "driver/cpu.hpp"
32 | 
33 | using namespace std;
34 | 
35 | namespace rusty {
36 | namespace driver {
37 | namespace cpu {
38 | 
39 | void bind_to_dataplane(unsigned int n)
40 | {
41 |     int result;
42 | 
43 |     // Finds dataplane Tiles.
44 |     cpu_set_t dataplane_cpu_set;
45 |     result = tmc_cpus_get_dataplane_cpus(&dataplane_cpu_set);
46 |     VERIFY_ERRNO(result, "tmc_cpus_get_dataplane_cpus()");
47 | 
48 |     unsigned int count = tmc_cpus_count(&dataplane_cpu_set);
49 |     if (n + 1 > count) {
50 |         DRIVER_DIE(
51 |             "bind_to_dataplane(): not enough dataplane Tiles "
52 |             "(%u requested, having %u)", n + 1, count
53 |         );
54 |     }
55 | 
56 |     // Binds itself to the n-th dataplane Tile.
57 |     result = tmc_cpus_find_nth_cpu(&dataplane_cpu_set, n);
58 |     VERIFY_ERRNO(result, "tmc_cpus_find_nth_cpu()");
59 |     result = tmc_cpus_set_my_cpu(result);
60 |     VERIFY_ERRNO(result, "tmc_cpus_set_my_cpu()");
61 | 
62 | #ifdef DEBUG_DATAPLANE
63 |     // Puts dataplane tiles in "debug" mode. Interrupts other than page
64 |     // faults will generate a kernel stacktrace.
65 |     result = set_dataplane(DP_DEBUG);
66 |     VERIFY_ERRNO(result, "set_dataplane()");
67 | #endif
68 | }
69 | 
70 | } } } /* namespace rusty::driver::cpu */
--------------------------------------------------------------------------------
/driver/cpu.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // Provides functions to manage dataplane Tiles.
3 | //
4 | // Copyright 2015 Raphael Javaux
5 | // University of Liege.
6 | //
7 | // This program is free software: you can redistribute it and/or modify
8 | // it under the terms of the GNU General Public License as published by
9 | // the Free Software Foundation, either version 3 of the License, or
10 | // (at your option) any later version.
11 | //
12 | // This program is distributed in the hope that it will be useful,
13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 | // GNU General Public License for more details.
16 | //
17 | // You should have received a copy of the GNU General Public License
18 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
19 | //
20 | 
21 | #include <cstdint>
22 | 
23 | #ifndef __RUSTY_DRIVERS_CPU_HPP__
24 | #define __RUSTY_DRIVERS_CPU_HPP__
25 | 
26 | namespace rusty {
27 | namespace driver {
28 | namespace cpu {
29 | 
30 | // CPU cycle counter value.
31 | typedef uint64_t cycles_t;
32 | 
33 | // CPU Frequency in Hz.
34 | static constexpr cycles_t CYCLES_PER_SECOND = 1200000000;
35 | 
36 | // Binds the current task to the n-th available dataplane Tile (first CPU is 0).
37 | //
38 | // Fails if there are fewer than n + 1 dataplane Tiles.
39 | void bind_to_dataplane(unsigned int n);
40 | 
41 | } } } /* namespace rusty::driver::cpu */
42 | 
43 | #endif /* __RUSTY_DRIVERS_CPU_HPP__ */
44 | 
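A hedged sketch of how a worker thread could use 'bind_to_dataplane()' before entering its polling loop ('worker_main' is an illustrative name, not from the sources):

    #include "driver/cpu.hpp"

    static void worker_main(unsigned int worker_id)
    {
        // Pins the calling thread to the 'worker_id'-th dataplane Tile. Dies
        // with an error if fewer than 'worker_id + 1' dataplane Tiles were
        // reserved at boot (see the 'dataplane=' hypervisor option).
        rusty::driver::cpu::bind_to_dataplane(worker_id);

        // ... the polling loop now runs on a Tile without preemption ...
    }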
--------------------------------------------------------------------------------
/driver/driver.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // Various pre-processor macros used by the mPIPE driver.
3 | //
4 | // Copyright 2015 Raphael Javaux
5 | // University of Liege.
6 | //
7 | // This program is free software: you can redistribute it and/or modify
8 | // it under the terms of the GNU General Public License as published by
9 | // the Free Software Foundation, either version 3 of the License, or
10 | // (at your option) any later version.
11 | //
12 | // This program is distributed in the hope that it will be useful,
13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 | // GNU General Public License for more details.
16 | //
17 | // You should have received a copy of the GNU General Public License
18 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
19 | //
20 | 
21 | #ifndef __RUSTY_DRIVER_DRIVER_HPP__
22 | #define __RUSTY_DRIVER_DRIVER_HPP__
23 | 
24 | #include <gxio/common.h>  // gxio_strerror()
25 | 
26 | #include "util/macros.hpp"
27 | 
28 | #define DRIVER_COLOR     COLOR_YEL
29 | #define DRIVER_DEBUG(MSG, ...)                                                 \
30 |     RUSTY_DEBUG("DRIVER", DRIVER_COLOR, MSG, ##__VA_ARGS__)
31 | #define DRIVER_DIE(MSG, ...)                                                   \
32 |     RUSTY_DIE(  "DRIVER", DRIVER_COLOR, MSG, ##__VA_ARGS__)
33 | 
34 | // Checks for errors in functions which return -1 and set errno on failure.
35 | #define VERIFY_ERRNO(VAL, WHAT)                                                \
36 |     do {                                                                       \
37 |         if (VAL == -1)                                                         \
38 |             DRIVER_DIE("%s (errno: %d)", (WHAT), errno);                       \
39 |     } while (0)
40 | 
41 | // Checks for errors from the pthread_* calls, which return 0 on success.
42 | #define VERIFY_PTHREAD(VAL, WHAT)                                              \
43 |     do {                                                                       \
44 |         if (VAL != 0)                                                          \
45 |             DRIVER_DIE("%s: (error: %d)", (WHAT), VAL);                        \
46 |     } while (0)
47 | 
48 | // Checks for errors from the GXIO API, which returns negative error codes.
49 | #define VERIFY_GXIO(VAL, WHAT)                                                 \
50 |     do {                                                                       \
51 |         long __val = (long) (VAL);                                             \
52 |         if (__val < 0)                                                         \
53 |             DRIVER_DIE("%s: (%ld) %s", (WHAT), __val, gxio_strerror(__val));   \
54 |     } while (0)
55 | 
56 | #endif /* __RUSTY_DRIVER_DRIVER_HPP__ */
57 | 
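An illustrative sketch (not from the sources) of the error-handling convention these macros encode: GXIO calls return a negative error code, pthread calls return a non-zero error, and errno-style calls return -1.

    #include <pthread.h>      // pthread_create()

    #include <gxio/mpipe.h>   // gxio_mpipe_alloc_notif_rings()

    #include "driver/driver.hpp"

    static void example(gxio_mpipe_context_t *context)
    {
        int result = gxio_mpipe_alloc_notif_rings(context, 1, 0, 0);
        VERIFY_GXIO(result, "gxio_mpipe_alloc_notif_rings()");

        pthread_t thread;
        result = pthread_create(
            &thread, nullptr, [](void *) -> void * { return nullptr; }, nullptr
        );
        VERIFY_PTHREAD(result, "pthread_create()");
    }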
--------------------------------------------------------------------------------
/driver/mpipe.cpp:
--------------------------------------------------------------------------------
1 | //
2 | // Wrapper over mPIPE functions.
3 | //
4 | // Copyright 2015 Raphael Javaux
5 | // University of Liege.
6 | //
7 | // Makes initialization of the driver easier and provides an interface for the
8 | // Ethernet layer to use the mPIPE driver.
9 | //
10 | // This program is free software: you can redistribute it and/or modify
11 | // it under the terms of the GNU General Public License as published by
12 | // the Free Software Foundation, either version 3 of the License, or
13 | // (at your option) any later version.
14 | //
15 | // This program is distributed in the hope that it will be useful,
16 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
17 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18 | // GNU General Public License for more details.
19 | //
20 | // You should have received a copy of the GNU General Public License
21 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
22 | //
23 | 
24 | #include <algorithm>        // sort()
25 | #include <cassert>
26 | #include <cerrno>
27 | #include <cstdint>
28 | #include <cstdio>
29 | #include <cstring>
30 | #include <memory>           // allocator
31 | #include <algorithm>        // min()
32 | 
33 | #include <net/ethernet.h>   // struct ether_addr
34 | #include <pthread.h>        // pthread_*
35 | 
36 | #include <gxio/mpipe.h>     // gxio_mpipe_*
37 | #include <tmc/alloc.h>      // tmc_alloc_map(), tmc_alloc_set_home(),
38 |                             // tmc_alloc_set_pagesize().
39 | #include <tmc/mem.h>        // tmc_mem_prefetch()
40 | #include <tmc/cpus.h>       // tmc_cpus_*
41 | 
42 | #include "driver/allocator.hpp" // tile_allocator_t
43 | #include "driver/driver.hpp"    // VERIFY_ERRNO, VERIFY_GXIO
44 | #include "driver/buffer.hpp"    // cursor_t
45 | #include "net/endian.hpp"       // net_t
46 | #include "net/ethernet.hpp"     // ethernet_t
47 | 
48 | #include "driver/mpipe.hpp"
49 | 
50 | using namespace std;
51 | 
52 | using namespace rusty::net;
53 | 
54 | namespace rusty {
55 | namespace driver {
56 | 
57 | // Returns the hardware address of the link related to the given mPIPE
58 | // environment (in network byte order).
59 | static net_t<struct ether_addr> _ether_addr(gxio_mpipe_link_t *link);
60 | 
61 | mpipe_t::instance_t::instance_t(alloc_t _alloc)
62 |     : alloc(_alloc), ethernet(_alloc), timers(_alloc)
63 | {
64 | }
65 | 
66 | void mpipe_t::instance_t::run(void)
67 | {
68 |     int result;
69 | 
70 |     // Binds the instance to its dataplane CPU.
71 | 
72 |     result = tmc_cpus_set_my_cpu(this->cpu_id);
73 |     VERIFY_ERRNO(result, "tmc_cpus_set_my_cpu()");
74 | 
75 | #ifdef DEBUG_DATAPLANE
76 |     // Puts dataplane tiles in "debug" mode. Interrupts other than page
77 |     // faults will generate a kernel stacktrace.
78 |     result = set_dataplane(DP_DEBUG);
79 |     VERIFY_ERRNO(result, "set_dataplane()");
80 | #endif
81 | 
82 |     // Polling loop over the packet queue. Tries to execute timers between
83 |     // polling attempts.
84 | 
85 |     while (LIKELY(this->parent->is_running)) {
86 |         this->timers.tick();
87 | 
88 |         gxio_mpipe_idesc_t idesc;
89 | 
90 |         result = gxio_mpipe_iqueue_try_get(&this->iqueue, &idesc);
91 | 
92 |         if (UNLIKELY(result == GXIO_MPIPE_ERR_IQUEUE_EMPTY)) // Queue is empty. Retries.
93 |             continue;
94 | 
95 |         if (gxio_mpipe_iqueue_drop_if_bad(&this->iqueue, &idesc)) {
96 |             DRIVER_DEBUG("Invalid packet dropped");
97 |             continue;
98 |         }
99 | 
100 |         // Initializes a buffer cursor which starts at the Ethernet header and
101 |         // stops at the end of the packet.
102 |         //
103 |         // The buffer will be freed when the cursor is destructed.
104 |         cursor_t cursor(&this->parent->context, &idesc, true, this->alloc);
105 |         cursor = cursor.drop(gxio_mpipe_idesc_get_l2_offset(&idesc));
106 | 
107 |         tmc_mem_prefetch(cursor.current, cursor.current_size);
108 | 
109 |         DRIVER_DEBUG("Receives a %zu-byte packet", cursor.size());
110 | 
111 |         this->ethernet.receive_frame(cursor);
112 |     }
113 | }
114 | 
115 | void mpipe_t::instance_t::send_packet(
116 |     size_t packet_size, function<void(cursor_t)> packet_writer
117 | )
118 | {
119 |     assert(packet_size <= this->parent->max_packet_size);
120 | 
121 |     DRIVER_DEBUG("Sends a %zu-byte packet", packet_size);
122 | 
123 |     // Allocates a buffer and executes the 'packet_writer' on its memory.
124 |     gxio_mpipe_bdesc_t bdesc = this->parent->_alloc_buffer(packet_size);
125 | 
126 |     // Allocates an unmanaged cursor, which will not deallocate the buffer when
127 |     // destructed.
128 |     cursor_t cursor(
129 |         &this->parent->context, &bdesc, packet_size, false, this->alloc
130 |     );
131 |     packet_writer(cursor);
132 | 
133 |     // Creates the egress descriptor.
134 | 
135 |     gxio_mpipe_edesc_t edesc = { 0 };
136 |     edesc.bound = 1;        // Last and single descriptor for the frame.
137 |     edesc.hwb = 1;          // The buffer will be automatically freed.
138 |     edesc.xfer_size = packet_size;
139 | 
140 |     // Sets 'va', 'stack_idx', 'inst', 'hwb', 'size' and 'c'.
141 |     gxio_mpipe_edesc_set_bdesc(&edesc, bdesc);
142 | 
143 |     // NOTE: if multiple packets are to be sent, reserve() + put_at() with a
144 |     // single memory barrier should be more efficient.
145 |     gxio_mpipe_equeue_put(&this->parent->equeue, edesc);
146 | }
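As the NOTE in 'send_packet()' suggests, several descriptors can be posted with a single reservation. A hedged sketch of what batched egress could look like, assuming the standard 'gxio_mpipe_equeue_reserve()' / 'gxio_mpipe_equeue_put_at()' pair of the gxio API:

    static void send_batch(
        gxio_mpipe_equeue_t *equeue, const gxio_mpipe_edesc_t *edescs, size_t n
    )
    {
        // Reserves 'n' contiguous slots (one synchronization for the whole
        // batch), then fills them individually.
        int64_t slot = gxio_mpipe_equeue_reserve(equeue, n);
        if (slot < 0)
            DRIVER_DIE("gxio_mpipe_equeue_reserve()");

        for (size_t i = 0; i < n; i++)
            gxio_mpipe_equeue_put_at(equeue, edescs[i], slot + i);
    }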
147 | 
148 | // We use multiple NotifRings linked to the same NotifGroup to enable some
149 | // kind of load balancing: with multiple NotifRings, each related to a distinct
150 | // worker thread, the hardware load-balancer will dispatch packets to workers
151 | // by their flow (IP addresses, ports, ...).
152 | mpipe_t::mpipe_t(
153 |     const char *link_name, net_t<ipv4_t::addr_t> ipv4_addr, int n_workers,
154 |     int first_dataplane_cpu,
155 |     vector<arp_ipv4_t::static_entry_t> static_arp4_entries
156 | ) : instances(n_workers)
157 | {
158 |     assert(n_workers > 0);
159 |     assert((unsigned int) n_workers <= N_BUCKETS);
160 | 
161 |     int result;
162 | 
163 |     gxio_mpipe_context_t * const context = &this->context;
164 | 
165 |     //
166 |     // mPIPE driver.
167 |     //
168 |     // Tries to create a context for the mPIPE instance of the given link.
169 |     //
170 | 
171 |     {
172 |         gxio_mpipe_link_t * const link = &this->link;
173 | 
174 |         result = gxio_mpipe_link_instance(link_name);
175 |         VERIFY_GXIO(result, "gxio_mpipe_link_instance()");
176 |         int instance_id = result;
177 | 
178 |         result = gxio_mpipe_init(context, instance_id);
179 |         VERIFY_GXIO(result, "gxio_mpipe_init()");
180 | 
181 |         result = gxio_mpipe_link_open(link, context, link_name, 0);
182 |         VERIFY_GXIO(result, "gxio_mpipe_link_open()");
183 | 
184 | #ifdef MPIPE_JUMBO_FRAMES
185 |         // Enables jumbo Ethernet packets.
186 |         gxio_mpipe_link_set_attr(link, GXIO_MPIPE_LINK_RECEIVE_JUMBO, 1);
187 | #endif
188 |     }
189 | 
190 |     //
191 |     // Checks if there are enough dataplane Tiles for the requested number of
192 |     // workers.
193 |     //
194 |     // Allocates and constructs the stack instance for each worker.
195 |     //
196 | 
197 |     {
198 |         cpu_set_t dataplane_cpu_set;
199 |         result = tmc_cpus_get_dataplane_cpus(&dataplane_cpu_set);
200 |         VERIFY_ERRNO(result, "tmc_cpus_get_dataplane_cpus()");
201 | 
202 |         int count = tmc_cpus_count(&dataplane_cpu_set);
203 |         if (first_dataplane_cpu + n_workers > count) {
204 |             DRIVER_DIE(
205 |                 "There are not enough dataplane Tiles for the requested number "
206 |                 "of workers (%d requested, having %d)",
207 |                 first_dataplane_cpu + n_workers, count
208 |             );
209 |         }
210 | 
211 |         for (int i = 0; i < n_workers; i++) {
212 |             result = tmc_cpus_find_nth_cpu(
213 |                 &dataplane_cpu_set, i + first_dataplane_cpu
214 |             );
215 |             VERIFY_ERRNO(result, "tmc_cpus_find_nth_cpu()");
216 | 
217 |             int cpu_id = result;
218 | 
219 | #ifdef USE_TILE_ALLOCATOR
220 |             // Allocates the instance on its dedicated CPU.
221 |             tile_allocator_t<instance_t> alloc(cpu_id);
222 | #else
223 |             // Uses the standard allocator.
224 |             allocator<instance_t> alloc;
225 | #endif /* USE_TILE_ALLOCATOR */
226 | 
227 |             this->instances[i] = alloc.allocate(1);
228 | 
229 |             // Constructs the instance and gives the allocator.
230 |             new (this->instances[i]) instance_t(alloc);
231 | 
232 |             this->instances[i]->cpu_id = cpu_id;
233 |         }
234 |     }
235 | 
236 |     //
237 |     // Ingress queues.
238 |     //
239 |     // Creates an iqueue and a notification ring for each worker, and a single
240 |     // notification group with its buckets.
241 |     //
242 | 
243 |     {
244 |         //
245 |         // Creates a NotifRing and an iqueue wrapper for each worker.
246 |         //
247 | 
248 |         result = gxio_mpipe_alloc_notif_rings(context, n_workers, 0, 0);
249 |         VERIFY_GXIO(result, "gxio_mpipe_alloc_notif_rings()");
250 |         unsigned int first_ring_id = result;
251 | 
252 |         tmc_alloc_t alloc = TMC_ALLOC_INIT;
253 | 
254 |         size_t ring_size = IQUEUE_ENTRIES * sizeof (gxio_mpipe_idesc_t);
255 | 
256 |         // Sets page_size >= ring_size.
257 |         if (tmc_alloc_set_pagesize(&alloc, ring_size) == NULL)
258 |             DRIVER_DIE("tmc_alloc_set_pagesize()");
259 | 
260 |         assert(tmc_alloc_get_pagesize(&alloc) >= ring_size);
261 | 
262 |         for (int i = 0; i < n_workers; i++) {
263 |             instance_t *instance = this->instances[i];
264 | 
265 |             unsigned int ring_id = first_ring_id + i;
266 | 
267 |             // Allocates a NotifRing for each worker.
268 |             //
269 |             // The NotifRing must be 4 KB aligned and must reside in a single
270 |             // physically contiguous memory region, so we allocate a page
271 |             // sufficiently large to hold it. The page holding the notification
272 |             // descriptors will reside on the worker Tile's cache.
273 |             //
274 |             // Allocated pages are cache-homed on the worker's Tile.
275 | 
276 |             tmc_alloc_set_home(&alloc, instance->cpu_id);
277 | 
278 |             instance->notif_ring_mem = (char *) tmc_alloc_map(
279 |                 &alloc, ring_size
280 |             );
281 |             if (instance->notif_ring_mem == NULL)
282 |                 DRIVER_DIE("tmc_alloc_map()");
283 | 
284 |             // ring is 4 KB aligned.
285 |             assert(((intptr_t) instance->notif_ring_mem & 0xFFF) == 0);
286 | 
287 |             // Initializes an iqueue for the worker.
288 | 
289 |             result = gxio_mpipe_iqueue_init(
290 |                 &instance->iqueue, context, ring_id, instance->notif_ring_mem,
291 |                 ring_size, 0
292 |             );
293 |             VERIFY_GXIO(result, "gxio_mpipe_iqueue_init()");
294 |         }
295 | 
296 |         DRIVER_DEBUG(
297 |             "Allocated %d x %zu bytes for the NotifRings on %zu-byte pages",
298 |             n_workers, ring_size, tmc_alloc_get_pagesize(&alloc)
299 |         );
300 | 
301 |         //
302 |         // Creates a single NotifGroup and a set of buckets.
303 |         //
304 | 
305 |         result = gxio_mpipe_alloc_notif_groups(context, 1 /* count */, 0, 0);
306 |         VERIFY_GXIO(result, "gxio_mpipe_alloc_notif_groups()");
307 |         this->notif_group_id = result;
308 | 
309 |         result = gxio_mpipe_alloc_buckets(context, N_BUCKETS, 0, 0);
310 |         VERIFY_GXIO(result, "gxio_mpipe_alloc_buckets()");
311 |         this->first_bucket_id = result;
312 | 
313 |         // Initializes the NotifGroup and its buckets. Assigns the NotifRings
314 |         // to the group.
315 | 
316 |         result = gxio_mpipe_init_notif_group_and_buckets(
317 |             context, this->notif_group_id, first_ring_id,
318 |             n_workers /* ring count */, this->first_bucket_id, N_BUCKETS,
319 |             // Load-balancing mode: packets of a same flow go to the same
320 |             // bucket.
321 |             GXIO_MPIPE_BUCKET_STATIC_FLOW_AFFINITY
322 |         );
323 |         VERIFY_GXIO(result, "gxio_mpipe_init_notif_group_and_buckets()");
324 |     }
325 | 
326 |     //
327 |     // Egress queue.
328 |     //
329 |     // Initializes a single eDMA ring with its equeue wrapper.
330 |     //
331 | 
332 |     {
333 |         // Allocates a single eDMA ring ID. Multiple eDMA rings could be used
334 |         // concurrently on the same context/link.
335 |         result = gxio_mpipe_alloc_edma_rings(context, 1 /* count */, 0, 0);
336 |         VERIFY_GXIO(result, "gxio_mpipe_alloc_edma_rings()");
337 |         this->edma_ring_id = result;
338 | 
339 |         size_t ring_size = EQUEUE_ENTRIES * sizeof(gxio_mpipe_edesc_t);
340 | 
341 |         // The eDMA ring must be 1 KB aligned and must reside in a single
342 |         // physically contiguous memory region, so we allocate a page
343 |         // sufficiently large to hold it.
344 |         // As only the mPIPE hardware and no Tile will read from this memory,
345 |         // and as memory-writes are non-blocking in this case, we can benefit
346 |         // from a hash-for-home cache policy.
347 |         // NOTE: test the impact of this policy on performance.
348 |         tmc_alloc_t alloc = TMC_ALLOC_INIT;
349 |         tmc_alloc_set_home(&alloc, TMC_ALLOC_HOME_HASH);
350 | 
351 |         // Sets page_size >= ring_size.
352 |         if (tmc_alloc_set_pagesize(&alloc, ring_size) == NULL)
353 |             DRIVER_DIE("tmc_alloc_set_pagesize()");
354 | 
355 |         assert(tmc_alloc_get_pagesize(&alloc) >= ring_size);
356 | 
357 |         DRIVER_DEBUG(
358 |             "Allocating %zu bytes for the eDMA ring on a %zu-byte page",
359 |             ring_size, tmc_alloc_get_pagesize(&alloc)
360 |         );
361 | 
362 |         this->edma_ring_mem = (char *) tmc_alloc_map(&alloc, ring_size);
363 |         if (this->edma_ring_mem == NULL)
364 |             DRIVER_DIE("tmc_alloc_map()");
365 | 
366 |         // ring is 1 KB aligned.
367 |         assert(((intptr_t) this->edma_ring_mem & 0x3FF) == 0);
368 | 
369 |         // Initializes an equeue which uses the eDMA ring memory and the channel
370 |         // associated with the context's link.
371 | 
372 |         int channel = gxio_mpipe_link_channel(&this->link);
373 | 
374 |         result = gxio_mpipe_equeue_init(
375 |             &this->equeue, context, this->edma_ring_id,
376 |             channel, this->edma_ring_mem, ring_size, 0
377 |         );
378 |         VERIFY_GXIO(result, "gxio_mpipe_equeue_init()");
379 |     }
380 | 
381 |     //
382 |     // Buffer stacks and buffers
383 |     //
384 |     // Initializes a buffer stack and a set of buffers for each non-empty stack
385 |     // in BUFFERS_STACKS.
386 |     //
387 | 
388 |     {
389 |         // Counts the number of non-empty buffer stacks.
390 |         int n_stacks = 0;
391 |         for (const buffer_stack_info_t& stack_info : BUFFERS_STACKS) {
392 |             if (stack_info.count > 0)
393 |                 n_stacks++;
394 |         }
395 | 
396 |         result = gxio_mpipe_alloc_buffer_stacks(context, n_stacks, 0, 0);
397 |         VERIFY_GXIO(result, "gxio_mpipe_alloc_buffer_stacks()");
398 |         unsigned int stack_id = result;
399 | 
400 |         this->buffer_stacks.reserve(n_stacks);
401 | 
402 |         // Allocates, initializes and registers the memory for each stack.
403 |         for (const buffer_stack_info_t& stack_info : BUFFERS_STACKS) {
404 |             // Skips unused buffer types.
405 |             if (stack_info.count <= 0)
406 |                 continue;
407 | 
408 |             // First we need to compute the exact memory usage of the stack
409 |             // and its associated buffers, then we allocate a set of pages to
410 |             // hold them.
411 |             //
412 |             // Packet buffer memory is allocated after the buffer stack.
413 |             // The buffer stack is required to be 64 KB aligned on contiguous
414 |             // memory, so we allocate it at the beginning of a page of at least
415 |             // 64 KB. Buffer memory is required to be 128-byte aligned, so we
416 |             // add a padding after the stack.
417 | 
418 |             size_t stack_size = gxio_mpipe_calc_buffer_stack_bytes(
419 |                 stack_info.count
420 |             );
421 | 
422 |             // Adds a padding to have a 128 bytes aligned address for the packet
423 |             // buffer memory.
424 |             stack_size += -(long) stack_size & (128 - 1);
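            // Worked example of the round-up above: '-(long) stack_size & 127'
            // is the distance to the next multiple of 128. With
            // stack_size == 1000: -1000 & 127 == 24, and
            // 1000 + 24 == 1024 == 8 * 128. When 'stack_size' is already
            // 128-byte aligned, the added padding is 0.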
426 |             size_t buffer_size = gxio_mpipe_buffer_size_enum_to_buffer_size(
427 |                 stack_info.size
428 |             );
429 | 
430 |             size_t total_size = stack_size + stack_info.count * buffer_size;
431 | 
432 |             // Uses the distributed caching mechanism for packet data because
433 |             // it is too large to fit in a single Tile's local (L2) cache.
434 |             //
435 |             // tmc_mem_prefetch() could be used before accessing a buffer to
436 |             // fetch the buffer into the local cache.
437 |             tmc_alloc_t alloc = TMC_ALLOC_INIT;
438 |             tmc_alloc_set_home(&alloc, TMC_ALLOC_HOME_HASH);
439 | 
440 |             // Page size must be at least 64 KB, and must be able to store the
441 |             // entire stack. Moreover, we can have up to 16 TLB page entries
442 |             // per buffer stack.
443 |             //
444 |             // To minimize the memory space used, we will try to use as many
445 |             // TLB entries as possible, with pages larger than the stack and 64 KB.
446 |             size_t min_page_size = max({
447 |                 (total_size + 15) / 16, // == (int) ceil(total_size / 16)
448 |                 (size_t) 64 * 1024,     // == 64 KB
449 |                 stack_size
450 |             });
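            // Rough worked example for the 1664-byte stack of BUFFERS_STACKS
            // (2048 buffers, assuming gxio_mpipe_calc_buffer_stack_bytes()
            // needs on the order of 8 bytes per buffer): total_size is about
            // 16 KB + 2048 * 1664 B ~ 3.3 MB, so min_page_size =
            // max(3.3 MB / 16, 64 KB, 16 KB) ~ 208 KB. The call below rounds
            // this up to the next supported page size (e.g. 256 KB), which
            // covers the stack with ~14 of the 16 available TLB entries.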
452 |             if (tmc_alloc_set_pagesize(&alloc, min_page_size) == NULL)
453 |                 // NOTE: could fail if there is no page size >= 64 KB.
454 |                 DRIVER_DIE("tmc_alloc_set_pagesize()");
455 | 
456 |             DRIVER_DEBUG(
457 |                 "Allocating %lu x %zu-byte buffers (%zu bytes) and a %zu-byte "
458 |                 "stack on %zu x %zu-byte page(s)",
459 |                 stack_info.count, buffer_size, total_size, stack_size,
460 |                 (total_size + tmc_alloc_get_pagesize(&alloc) - 1) / tmc_alloc_get_pagesize(&alloc),
461 |                 tmc_alloc_get_pagesize(&alloc)
462 |             );
463 | 
464 |             char *mem = (char *) tmc_alloc_map(&alloc, total_size);
465 |             if (mem == NULL)
466 |                 DRIVER_DIE("tmc_alloc_map()");
467 | 
468 |             assert(((intptr_t) mem & 0xFFFF) == 0); // mem is 64 KB aligned.
469 | 
470 |             // Initializes the buffer stack.
471 | 
472 |             result = gxio_mpipe_init_buffer_stack(
473 |                 context, stack_id, stack_info.size, mem, stack_size, 0
474 |             );
475 |             VERIFY_GXIO(result, "gxio_mpipe_init_buffer_stack()");
476 | 
477 |             // Registers the buffer pages into the mPIPE's TLB.
478 | 
479 |             size_t page_size = tmc_alloc_get_pagesize(&alloc);
480 | 
481 |             for (char *p = mem; p < mem + total_size; p += page_size) {
482 |                 result = gxio_mpipe_register_page(
483 |                     context, stack_id, p, page_size, 0
484 |                 );
485 |                 VERIFY_GXIO(result, "gxio_mpipe_register_page()");
486 |             }
487 | 
488 |             // Writes buffer descriptors into the stack.
489 | 
490 |             for (
491 |                 char *p = mem + stack_size;
492 |                 p < mem + total_size;
493 |                 p += buffer_size
494 |             ) {
495 |                 // buffer is 128 bytes aligned.
496 |                 assert(((size_t) p & 0x7F) == 0);
497 | 
498 |                 gxio_mpipe_push_buffer(context, stack_id, p);
499 |             }
500 | 
501 |             // Registers the stack resources in the environment.
502 | 
503 |             buffer_stack_t buffer_stack = {
504 |                 &stack_info, stack_id, buffer_size,
505 |                 mem, mem + stack_size, total_size
506 |             };
507 | 
508 |             this->buffer_stacks.push_back(buffer_stack);
509 | 
510 |             stack_id++;
511 |         }
512 | 
513 |         // Sorts 'this->buffer_stacks' by increasing buffer sizes.
514 | 
515 |         sort(
516 |             this->buffer_stacks.begin(), this->buffer_stacks.end(),
517 |             [](const buffer_stack_t& a, const buffer_stack_t& b) {
518 |                 return a.info->size < b.info->size;
519 |             }
520 |         );
521 | 
522 |         max_packet_size = this->buffer_stacks.back().buffer_size;
523 | 
524 | #ifdef MPIPE_JUMBO_FRAMES
525 |         gxio_mpipe_link_set_attr(&link, GXIO_MPIPE_LINK_RECEIVE_JUMBO, 1);
526 | 
527 |         gxio_mpipe_equeue_set_snf_size(&this->equeue, max_packet_size);
528 | #else
529 |         max_packet_size = min((size_t) 1500, max_packet_size);
530 | #endif /* MPIPE_JUMBO_FRAMES */
531 | 
532 |         DRIVER_DEBUG("Maximum packet size: %zu bytes", max_packet_size);
533 |     }
534 | 
535 |     //
536 |     // Classifier rules
537 |     //
538 |     // Defines a single rule that matches every packet to the buckets we
539 |     // created.
540 |     //
541 |     // See UG527-Application-Libraries-Reference-Manual.pdf, page 215.
542 |     //
543 | 
544 |     {
545 |         gxio_mpipe_rules_t *rules = &this->rules;
546 |         gxio_mpipe_rules_init(rules, context);
547 | 
548 |         result = gxio_mpipe_rules_begin(
549 |             rules, this->first_bucket_id, N_BUCKETS, nullptr
550 |         );
551 |         VERIFY_GXIO(result, "gxio_mpipe_rules_begin()");
552 | 
553 |         result = gxio_mpipe_rules_commit(rules);
554 |         VERIFY_GXIO(result, "gxio_mpipe_rules_commit()");
555 |     }
556 | 
557 |     //
558 |     // Initializes the network protocol stacks.
559 |     //
560 | 
561 |     {
562 |         this->ether_addr = _ether_addr(&this->link);
563 | 
564 |         for (instance_t *instance : this->instances) {
565 |             instance->parent = this;
566 |             instance->ethernet.init(
567 |                 instance, &instance->timers, this->ether_addr, ipv4_addr,
568 |                 static_arp4_entries
569 |             );
570 |         }
571 |     }
572 | }
573 | 
574 | mpipe_t::~mpipe_t(void)
575 | {
576 |     int result;
577 | 
578 |     // Releases the mPIPE context.
579 | 
580 |     result = gxio_mpipe_link_close(&this->link);
581 |     VERIFY_GXIO(result, "gxio_mpipe_link_close()");
582 | 
583 |     result = gxio_mpipe_destroy(&this->context);
584 |     VERIFY_GXIO(result, "gxio_mpipe_destroy()");
585 | 
586 |     // Releases rings memory.
587 | 
588 |     for (instance_t *instance : this->instances) {
589 |         size_t notif_ring_size = IQUEUE_ENTRIES * sizeof(gxio_mpipe_idesc_t);
590 |         result = tmc_alloc_unmap(instance->notif_ring_mem, notif_ring_size);
591 |         VERIFY_ERRNO(result, "tmc_alloc_unmap()");
592 |     }
593 | 
594 |     size_t edma_ring_size = EQUEUE_ENTRIES * sizeof(gxio_mpipe_edesc_t);
595 |     result = tmc_alloc_unmap(this->edma_ring_mem, edma_ring_size);
596 |     VERIFY_ERRNO(result, "tmc_alloc_unmap()");
597 | 
598 |     // Releases buffers memory.
599 | 
600 |     for (const buffer_stack_t& buffer_stack : this->buffer_stacks) {
601 |         result = tmc_alloc_unmap(buffer_stack.mem, buffer_stack.mem_size);
602 |         VERIFY_ERRNO(result, "tmc_alloc_unmap()");
603 |     }
604 | }
605 | 
606 | // Wrapper over 'instance_t::run()' for 'pthread_create()'.
607 | void *worker_runner(void *);
608 | 
609 | void mpipe_t::run(void)
610 | {
611 |     this->is_running = true;
612 | 
613 |     // Starts the worker threads.
614 |     for (instance_t *instance : instances) {
615 |         int result = pthread_create(
616 |             &instance->thread, nullptr, worker_runner, instance
617 |         );
618 |         VERIFY_PTHREAD(result, "pthread_create()");
619 |     }
620 | 
621 | }
622 | 
623 | void *worker_runner(void *instance_void)
624 | {
625 |     ((mpipe_t::instance_t *) instance_void)->run();
626 |     return nullptr;
627 | }
628 | 
629 | void mpipe_t::stop(void)
630 | {
631 |     this->is_running = false;
632 | }
633 | 
634 | void mpipe_t::join(void)
635 | {
636 |     // Waits for all threads to exit.
637 |     for (instance_t *instance : instances)
638 |         pthread_join(instance->thread, nullptr);
639 | }
640 | 
641 | // Replicates the call to every worker TCP stack.
642 | //
643 | // FIXME: Not thread-safe; must not be called while the instances are
644 | // running.
645 | void mpipe_t::tcp_listen(
646 |     tcp_t::port_t port, tcp_t::new_conn_callback_t new_conn_callback
647 | )
648 | {
649 |     assert(!this->is_running); // FIXME: not thread-safe.
650 | 
651 |     for (instance_t *instance : this->instances)
652 |         instance->ethernet.ipv4.tcp.listen(port, new_conn_callback);
653 | }
654 | 
655 | 
656 | gxio_mpipe_bdesc_t mpipe_t::_alloc_buffer(size_t size)
657 | {
658 |     // Finds the first buffer size large enough to hold the requested buffer.
659 |     for (const buffer_stack_t &stack : this->buffer_stacks) {
660 |         if (stack.buffer_size >= size) {
661 |             gxio_mpipe_bdesc_t bdesc = gxio_mpipe_pop_buffer_bdesc(
662 |                 &this->context, stack.id
663 |             );
664 | 
665 |             if (UNLIKELY(bdesc.c == MPIPE_EDMA_DESC_WORD1__C_VAL_INVALID)) {
666 |                 DRIVER_DIE(
667 |                     "Invalid buffer descriptor (size %zu). "
668 |                     "Maybe you should allocate more buffers.", stack.buffer_size
669 |                 );
670 |             }
671 | 
672 |             return bdesc;
673 |         }
674 |     }
675 | 
676 |     // TODO: build a chained buffer if possible.
677 |     DRIVER_DIE("No buffer is sufficiently large to hold the requested size.");
678 | }
679 | 
680 | static net_t<struct ether_addr> _ether_addr(gxio_mpipe_link_t *link)
681 | {
682 |     int64_t addr64 = gxio_mpipe_link_get_attr(link, GXIO_MPIPE_LINK_MAC);
683 | 
684 |     // Address is in the 48 least-significant bits.
685 |     assert((addr64 & 0xFFFFFFFFFFFF) == addr64);
686 | 
687 |     // Immediately returns the address in network byte order.
688 |     net_t<struct ether_addr> addr;
689 |     addr.net = {
690 |         (uint8_t) (addr64 >> 40),
691 |         (uint8_t) (addr64 >> 32),
692 |         (uint8_t) (addr64 >> 24),
693 |         (uint8_t) (addr64 >> 16),
694 |         (uint8_t) (addr64 >> 8),
695 |         (uint8_t) addr64
696 |     };
697 |     return addr;
698 | }
699 | 
700 | } } /* namespace rusty::driver */
--------------------------------------------------------------------------------
/driver/mpipe.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // Wrapper over mPIPE functions.
3 | //
4 | // Copyright 2015 Raphael Javaux
5 | // University of Liege.
6 | //
7 | // Makes initialization of the driver easier and provides an interface for the
8 | // Ethernet layer to use the mPIPE driver.
9 | //
10 | // This program is free software: you can redistribute it and/or modify
11 | // it under the terms of the GNU General Public License as published by
12 | // the Free Software Foundation, either version 3 of the License, or
13 | // (at your option) any later version.
14 | //
15 | // This program is distributed in the hope that it will be useful,
16 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
17 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18 | // GNU General Public License for more details.
19 | //
20 | // You should have received a copy of the GNU General Public License
21 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
22 | //
23 | 
24 | #ifndef __RUSTY_DRIVER_MPIPE_HPP__
25 | #define __RUSTY_DRIVER_MPIPE_HPP__
26 | 
27 | #include <array>
28 | #include <vector>
29 | #include <memory>         // allocator
30 | 
31 | #include <arch/cycle.h>   // get_cycle_count()
32 | #include <gxio/mpipe.h>   // gxio_mpipe_*, GXIO_MPIPE_*
33 | 
34 | #include "driver/allocator.hpp" // tile_allocator_t
35 | #include "driver/clock.hpp"     // cpu_clock_t
36 | #include "driver/buffer.hpp"    // cursor_t
37 | #include "driver/timer.hpp"     // timer_manager_t
38 | #include "net/endian.hpp"       // net_t
39 | #include "net/ethernet.hpp"     // ethernet_t
40 | 
41 | using namespace std;
42 | 
43 | using namespace rusty::net;
44 | 
45 | namespace rusty {
46 | namespace driver {
47 | 
48 | // -----------------------------------------------------------------------------
49 | 
50 | //
51 | // Parameters.
52 | //
53 | 
54 | // Number of buckets that the load balancer uses.
55 | //
56 | // Must be a power of 2, and must be larger than or equal to the number of workers.
57 | static const unsigned int N_BUCKETS = 1024;
58 | 
59 | // Number of packet descriptors in the ingress queues. There will be as many
60 | // iqueues as workers.
61 | //
62 | // Could be 128, 512, 2K or 64K.
63 | static const unsigned int IQUEUE_ENTRIES = GXIO_MPIPE_IQUEUE_ENTRY_512;
64 | 
65 | // Number of packet descriptors in the egress queue.
66 | //
67 | // Could be 512, 2K, 8K or 64K.
68 | static const unsigned int EQUEUE_ENTRIES = GXIO_MPIPE_EQUEUE_ENTRY_2K;
69 | 
70 | // mPIPE buffer stacks.
71 | //
72 | // Gives the number of buffers and the buffer sizes for each buffer stack.
73 | //
74 | // mPIPE only allows 32 buffer stacks to be used at the same time.
75 | //
76 | // NOTE: Knowing the average and standard deviation of received/emitted packets
77 | // and the optimal cache usage, the most efficient buffer sizes could be
78 | // computed.
79 | 
80 | struct buffer_stack_info_t {
81 |     // Could be 128, 256, 512, 1024, 1664, 4096, 10368 or 16384 bytes.
82 |     // 4096, 10368 and 16384 are only relevant if jumbo frames are allowed.
83 |     gxio_mpipe_buffer_size_enum_t size;
84 | 
85 |     unsigned long count;
86 | 
87 |     buffer_stack_info_t(gxio_mpipe_buffer_size_enum_t size, unsigned long count)
88 |         : size(size), count(count) { }
89 | };
90 | 
91 | #ifdef MPIPE_JUMBO_FRAMES
92 | static const array<buffer_stack_info_t, 8> BUFFERS_STACKS {
93 | #else
94 | static const array<buffer_stack_info_t, 5> BUFFERS_STACKS {
95 | #endif /* MPIPE_JUMBO_FRAMES */
96 |     buffer_stack_info_t(GXIO_MPIPE_BUFFER_SIZE_128,   8192),  // ~ 1 MB
97 |     buffer_stack_info_t(GXIO_MPIPE_BUFFER_SIZE_256,   1024),  // ~ 256 KB
98 |     buffer_stack_info_t(GXIO_MPIPE_BUFFER_SIZE_512,   1024),  // ~ 512 KB
99 |     buffer_stack_info_t(GXIO_MPIPE_BUFFER_SIZE_1024,  512),   // ~ 512 KB
100 |     buffer_stack_info_t(GXIO_MPIPE_BUFFER_SIZE_1664,  2048),  // ~ 3 MB
101 | 
102 | #ifdef MPIPE_JUMBO_FRAMES
103 |     buffer_stack_info_t(GXIO_MPIPE_BUFFER_SIZE_4096,  256),   // ~ 1 MB
104 |     buffer_stack_info_t(GXIO_MPIPE_BUFFER_SIZE_10368, 1024),  // ~ 10 MB
105 |     buffer_stack_info_t(GXIO_MPIPE_BUFFER_SIZE_16384, 128)    // ~ 2 MB
106 | #endif /* MPIPE_JUMBO_FRAMES */
107 | };
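// With the default (non-jumbo) table above, the driver pre-allocates
// 8192 + 1024 + 1024 + 512 + 2048 = 12,800 buffers, i.e. roughly
// 1 MB + 256 KB + 512 KB + 512 KB + 3.25 MB ~ 5.5 MB of packet memory,
// in addition to the per-stack bookkeeping allocated in mpipe.cpp.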
108 | 
109 | // -----------------------------------------------------------------------------
110 | 
111 | //
112 | // mPIPE environment
113 | //
114 | 
115 | // Contains references to resources needed to use the mPIPE driver.
116 | //
117 | // Provides an interface for the Ethernet layer to interface with the mPIPE
118 | // driver.
119 | //
120 | // NOTE: should probably be allocated on the Tile's cache which uses the iqueue
121 | // and the equeue wrappers.
122 | struct mpipe_t {
123 |     //
124 |     // Member types
125 |     //
126 | 
127 | #ifdef USE_TILE_ALLOCATOR
128 |     typedef tile_allocator_t<char *> alloc_t;
129 | #else
130 |     // Uses the standard allocator.
131 |     typedef allocator<char *> alloc_t;
132 | #endif /* USE_TILE_ALLOCATOR */
133 | 
134 |     // Each worker thread will be given an mPIPE instance.
135 |     //
136 |     // Each instance contains its own ingress queue. A unique egress queue is
137 |     // however shared between all threads.
138 |     struct instance_t {
139 |         //
140 |         // Member types
141 |         //
142 | 
143 |         typedef cpu_clock_t clock_t;
144 | 
145 |         // Cursor which will abstract how the upper (Ethernet) layer will read
146 |         // from and write to memory in mPIPE buffers.
147 |         typedef buffer::cursor_t cursor_t;
148 | 
149 |         typedef cpu_timer_manager_t<alloc_t> timer_manager_t;
150 | 
151 |         //
152 |         // Fields
153 |         //
154 | 
155 |         mpipe_t *parent;
156 |         pthread_t thread;
157 | 
158 |         alloc_t alloc;
159 | 
160 |         // Dataplane Tile dedicated to the execution of this worker.
161 |         int cpu_id;
162 | 
163 |         // Ingress queue.
164 |         gxio_mpipe_iqueue_t iqueue;
165 |         char *notif_ring_mem;
166 | 
167 |         // Upper Ethernet data-link layer.
168 |         net::ethernet_t<instance_t, alloc_t> ethernet;
169 | 
170 |         timer_manager_t timers;
171 | 
172 |         //
173 |         // Methods
174 |         //
175 | 
176 |         instance_t(alloc_t _alloc);
177 | 
178 |         // Runs the worker's polling loop. Doesn't return until 'stop()' is
179 |         // called on the parent 'mpipe_t'.
180 |         // Forwards any received packet to the upper (Ethernet) data-link layer.
181 |         void run(void);
182 | 
183 |         // Sends a packet of the given size on the interface by calling the
184 |         // 'packet_writer' with a cursor corresponding to freshly allocated
185 |         // buffer memory.
186 |         void send_packet(
187 |             size_t packet_size, function<void(cursor_t)> packet_writer
188 |         );
189 | 
190 |         // Maximum packet size. Doesn't change after initialization.
191 |         inline size_t max_packet_size(void);
192 | 
193 |         //
194 |         // Static methods
195 |         //
196 | 
197 |         // Returns the current TCP sequence number.
198 |         static inline
199 |         net::ethernet_t<instance_t, alloc_t>::ipv4_ethernet_t::tcp_ipv4_t::seq_t
200 |         get_current_tcp_seq(void);
201 | 
202 |     private:
203 |         // Allocates a buffer from the smallest stack able to hold the requested
204 |         // size.
205 |         gxio_mpipe_bdesc_t _alloc_buffer(size_t size);
206 |     };
207 | 
208 |     typedef buffer::cursor_t cursor_t;
209 | 
210 |     // Aliases for upper network layer types.
211 |     //
212 |     // This permits the user to refer to network layer types easily (e.g.
213 |     // 'mpipe_t::ipv4_t::addr_t' to refer to an IPv4 address).
214 | 
215 |     typedef net::ethernet_t<instance_t, alloc_t>     ethernet_t;
216 |     typedef mpipe_t::ethernet_t::ipv4_ethernet_t     ipv4_t;
217 |     typedef mpipe_t::ethernet_t::arp_ethernet_ipv4_t arp_ipv4_t;
218 |     typedef mpipe_t::ipv4_t::tcp_ipv4_t              tcp_t;
219 | 
220 |     // Allocated resources for a buffer stack.
221 |     struct buffer_stack_t {
222 |         const buffer_stack_info_t *info;
223 |         unsigned int id;
224 | 
225 |         // Result of 'gxio_mpipe_buffer_size_enum_to_buffer_size(info->size)'.
226 |         size_t buffer_size;
227 | 
228 |         // First byte of the buffer stack.
229 |         char *mem;
230 |         // Packet buffer memory allocated just after the buffer stack.
231 |         char *buffer_mem;
232 | 
233 |         // Number of bytes allocated for the buffer stack and its buffers.
234 |         size_t mem_size;
235 |     };
236 | 
237 |     //
238 |     // Fields
239 |     //
240 | 
241 |     // Driver
242 |     gxio_mpipe_context_t context;
243 |     gxio_mpipe_link_t link;
244 | 
245 |     // Workers instances.
246 |     //
247 |     // Instances are not directly stored in the vector as they will be
248 |     // cache-homed on the Tile core that they run on.
249 |     vector<instance_t *> instances;
250 | 
251 |     // Ingress queues
252 |     unsigned int notif_group_id; // Load balancer group.
253 |     unsigned int first_bucket_id;
254 | 
255 |     // Egress queue
256 |     gxio_mpipe_equeue_t equeue;
257 |     unsigned int edma_ring_id;
258 |     char *edma_ring_mem;
259 | 
260 |     // Buffers and their stacks. Stacks are sorted by increasing buffer sizes.
261 |     vector<buffer_stack_t> buffer_stacks;
262 | 
263 |     // Rules
264 |     gxio_mpipe_rules_t rules;
265 | 
266 |     // Equal to 'true' while the workers are running.
267 |     //
268 |     // Setting this field to false will stop the execution of the 'run()'
269 |     // method.
270 |     bool is_running = false;
271 | 
272 |     net_t<struct ether_addr> ether_addr;
273 | 
274 |     // Maximum packet size. Doesn't change after initialization.
275 |     size_t max_packet_size;
276 | 
277 |     // -------------------------------------------------------------------------
278 | 
279 |     //
280 |     // Methods
281 |     //
282 | 
283 |     // Initializes the mPIPE driver for the given link.
284 |     //
285 |     // Starts the mPIPE driver, allocates NotifRings and their iqueue wrappers,
286 |     // an eDMA ring with its equeue wrapper and a set of buffer stacks with
287 |     // their buffers.
288 |     //
289 |     // 'first_dataplane_cpu' specifies the number of the first dataplane Tile
290 |     // that can be used. Useful when multiple 'mpipe_t' instances are created
291 |     // and you don't want them to share the same dataplane Tiles.
292 |     mpipe_t(
293 |         const char *link_name, net_t<ipv4_t::addr_t> ipv4_addr, int n_workers,
294 |         int first_dataplane_cpu = 0,
295 |         vector<arp_ipv4_t::static_entry_t> static_arp_entries
296 |             = vector<arp_ipv4_t::static_entry_t>()
297 |     );
298 | 
299 |     // Releases mPIPE resources referenced by current mPIPE environment.
300 |     ~mpipe_t(void);
301 | 
302 |     // Starts the worker threads, which process any received packets.
303 |     //
304 |     // Returns immediately; call 'stop()' and then 'join()' to wait for the workers.
305 |     void run(void);
306 | 
307 |     // Stops the execution of working threads.
308 |     //
309 |     // This method just sets 'is_running' to 'false'. You should make a call to
310 |     // 'join()' afterwards to wait for threads to finish.
311 |     void stop(void);
312 | 
313 |     // Waits for threads to finish.
314 |     void join(void);
315 | 
316 |     //
317 |     // TCP server sockets.
318 |     //
319 | 
320 |     // Starts listening for TCP connections on the given port.
321 |     //
322 |     // If the port was already in the listen state, replaces the previous
323 |     // callback function.
324 |     //
325 |     // FIXME: the function is not thread safe. DON'T call it when workers are
326 |     // concurrently running.
327 |     void tcp_listen(
328 |         tcp_t::port_t port, tcp_t::new_conn_callback_t new_conn_callback
329 |     );
330 | 
331 |     //
332 |     // TCP client/connected sockets.
333 |     //
334 | 
335 | 
336 | private:
337 |     // Allocates a buffer from the smallest stack able to hold the requested
338 |     // size.
339 |     gxio_mpipe_bdesc_t _alloc_buffer(size_t size);
340 | 
341 |     // Sends a packet of the given size on the interface by calling the
342 |     // 'packet_writer' with a cursor corresponding to a buffer allocated
343 |     // memory.
344 |     void send_packet(
345 |         size_t packet_size, function<void(cursor_t)> packet_writer
346 |     );
347 | };
348 | 
349 | inline size_t mpipe_t::instance_t::max_packet_size(void)
350 | {
351 |     return this->parent->max_packet_size;
352 | }
353 | 
354 | inline mpipe_t::tcp_t::seq_t mpipe_t::instance_t::get_current_tcp_seq(void)
355 | {
356 |     // Number of cycles between two increments of the sequence number
357 |     // (~ 4 µs).
358 |     static const cycles_t DELAY = CYCLES_PER_SECOND * 4 / 1000000;
359 | 
360 |     return mpipe_t::tcp_t::seq_t((uint32_t) (get_cycle_count() / DELAY));
361 | }
362 | 
363 | } } /* namespace rusty::driver */
364 | 
365 | #endif /* __RUSTY_DRIVER_MPIPE_HPP__ */
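A hedged end-to-end sketch of how an application could drive this interface; the link name, the way the address and the connection handler are built, and the port value are illustrative assumptions, not code from the '/app' directory:

    #include "driver/mpipe.hpp"

    using namespace rusty::driver;

    int main(void)
    {
        // How the address wrapper is filled is assumed here, not taken from
        // the sources.
        net_t<mpipe_t::ipv4_t::addr_t> addr;
        // ... fill 'addr' with the interface's IPv4 address ...

        mpipe_t mpipe("xgbe1" /* illustrative link name */, addr, 8);

        // Registers an application-defined handler for connections to port
        // 80 before the workers start (tcp_listen() is not thread-safe).
        mpipe_t::tcp_t::new_conn_callback_t on_connection; // app handler
        mpipe.tcp_listen(80, on_connection);

        mpipe.run();  // Starts the worker threads and returns immediately.
        mpipe.join(); // Waits until 'stop()' is called from elsewhere.
        return 0;
    }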
--------------------------------------------------------------------------------
/driver/timer.cpp:
--------------------------------------------------------------------------------
1 | //
2 | // Provides a timer manager which uses the CPU's cycle counter to trigger
3 | // timers.
4 | //
5 | // Copyright 2015 Raphael Javaux
6 | // University of Liege.
7 | //
8 | // This program is free software: you can redistribute it and/or modify
9 | // it under the terms of the GNU General Public License as published by
10 | // the Free Software Foundation, either version 3 of the License, or
11 | // (at your option) any later version.
12 | //
13 | // This program is distributed in the hope that it will be useful,
14 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
15 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
16 | // GNU General Public License for more details.
17 | //
18 | // You should have received a copy of the GNU General Public License
19 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
20 | //
21 | 
22 | #include <cassert>
23 | #include <cstdint>
24 | #include <cstdio>
25 | #include <functional>
26 | #include <map>
27 | #include <utility>        // move()
28 | 
29 | #include <arch/cycle.h>   // get_cycle_count()
30 | 
31 | #include "driver/cpu.hpp"    // cycles_t, CYCLES_PER_SECOND
32 | #include "driver/driver.hpp" // DRIVER_DEBUG()
33 | #include "util/macros.hpp"   // UNLIKELY()
34 | 
35 | #include "driver/timer.hpp"
36 | 
37 | using namespace rusty::driver::cpu;
38 | 
39 | namespace rusty {
40 | namespace driver {
41 | namespace timer {
42 | 
43 | } } } /* namespace rusty::driver::timer */
--------------------------------------------------------------------------------
/driver/timer.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // Provides a timer manager which uses the CPU's cycle counter to trigger
3 | // timers.
4 | //
5 | // Copyright 2015 Raphael Javaux
6 | // University of Liege.
7 | //
8 | // This program is free software: you can redistribute it and/or modify
9 | // it under the terms of the GNU General Public License as published by
10 | // the Free Software Foundation, either version 3 of the License, or
11 | // (at your option) any later version.
12 | //
13 | // This program is distributed in the hope that it will be useful,
14 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
15 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
16 | // GNU General Public License for more details.
17 | //
18 | // You should have received a copy of the GNU General Public License
19 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
20 | //
21 | 
22 | #ifndef __RUSTY_DRIVER_TIMER_HPP__
23 | #define __RUSTY_DRIVER_TIMER_HPP__
24 | 
25 | #include <cassert>
26 | #include <cstdint>
27 | #include <cinttypes>      // PRIu64
28 | #include <functional>     // less
29 | #include <memory>         // allocator
30 | #include <map>
31 | #include <utility>
32 | 
33 | #include "driver/clock.hpp" // cpu_clock_t
34 | 
35 | using namespace std;
36 | 
37 | namespace rusty {
38 | namespace driver {
39 | 
40 | // Manages timers using the CPU's cycle counter.
41 | //
42 | // The 'tick()' method should be called periodically to execute expired timers.
43 | //
44 | // The manager is *not* thread-safe. Users must avoid concurrent calls to
45 | // 'tick()', 'schedule()' and 'remove()'. Calling 'schedule()' or 'remove()'
46 | // within a timer should be safe.
47 | template <typename alloc_t = allocator<char *>>
48 | struct cpu_timer_manager_t {
49 |     //
50 |     // Member types
51 |     //
52 | 
53 |     // Timers are stored by the time they will expire.
54 |     //
55 |     // Only one function/timer can be mapped to an expiration date. In
56 |     // the very rare case where two timers map on the same expiration date, the
57 |     // second one will be inserted in the next free expiration date in the
58 |     // domain.
59 |     //
60 |     // As expiration dates are based on the CPU cycle counter, the next
61 |     // expiration date is only one cycle later. Using the next CPU cycle should
62 |     // be pretty safe because a CPU cycle is a very small time unit and
63 |     // the execution of the first timer will take more than one cycle. This
64 |     // simplifies the implementation and makes it more efficient than a vector
65 |     // of timers which requires numerous dynamic memory allocations.
66 |     typedef map<
67 |                 cpu_clock_t::time_t, function<void()>,
68 |                 less<cpu_clock_t::time_t>, alloc_t
69 |             > timers_t;
70 | 
71 |     // The 'remove()' method uses the timer expiration date to retrieve and
72 |     // remove a timer.
73 |     typedef cpu_clock_t::time_t timer_id_t;
74 | 
75 |     //
76 |     // Fields
77 |     //
78 | 
79 |     timers_t timers;
80 | 
81 |     //
82 |     // Methods
83 |     //
84 | 
85 |     cpu_timer_manager_t(alloc_t _alloc = alloc_t());
86 | 
87 |     // Executes expired timers. This method should be called periodically.
88 |     void tick(void);
89 | 
90 |     // Registers a timer. The timer will only be executed once.
91 |     timer_id_t schedule(cpu_clock_t::interval_t delay, function<void()> f);
92 | 
93 |     // Reschedules the given timer with a new delay. Returns the new 'timer_id'.
94 |     timer_id_t reschedule(
95 |         timer_id_t timer_id, cpu_clock_t::interval_t new_delay
96 |     );
97 | 
98 |     // Unschedules a timer by the identifier that has been returned by the
99 |     // 'schedule()' call.
100 |     //
101 |     // Returns 'true' if the timer has been removed, 'false' if it was not
102 |     // found.
103 |     bool remove(timer_id_t timer_id);
104 | 
105 | private:
106 |     // Same as 'schedule()' but doesn't produce a log message.
107 |     timer_id_t _insert(cpu_clock_t::interval_t delay, function<void()> f);
108 | };
109 | 
110 | template <typename alloc_t>
111 | cpu_timer_manager_t<alloc_t>::cpu_timer_manager_t(alloc_t _alloc)
112 |     : timers(less<cpu_clock_t::time_t>(), _alloc)
113 | {
114 | }
115 | 
116 | template <typename alloc_t>
117 | void cpu_timer_manager_t<alloc_t>::tick(void)
118 | {
119 |     typename timers_t::const_iterator it;
120 | 
121 |     while ((it = timers.begin()) != timers.end()) {
122 |         // Removes the timer before calling it as some callbacks could make
123 |         // calls to 'schedule()' or 'remove()' and change the 'timers' member
124 |         // field.
125 |         // Similarly, the loop calls 'timers.begin()' at each iteration as the
126 |         // iterator could be invalidated.
127 | 
128 |         cpu_clock_t::time_t now = cpu_clock_t::time_t::now();
129 |         if (less<cpu_clock_t::time_t>()(now, it->first))
130 |             break;
131 | 
132 |         DRIVER_DEBUG("Executes timer %" PRIu64, it->first.cycles);
133 | 
134 |         function<void()> f = move(it->second);
135 |         timers.erase(it);
136 | 
137 |         f(); // Could invalidate 'it'.
138 |     }
139 | }
140 | 
141 | template <typename alloc_t>
142 | typename cpu_timer_manager_t<alloc_t>::timer_id_t
143 | cpu_timer_manager_t<alloc_t>::schedule(
144 |     cpu_clock_t::interval_t delay, function<void()> f
145 | )
146 | {
147 |     timer_id_t timer_id = this->_insert(delay, f);
148 | 
149 |     DRIVER_DEBUG(
150 |         "Schedules timer %" PRIu64 " with a %" PRIu64 " µs delay",
151 |         timer_id.cycles, delay.microsec()
152 |     );
153 | 
154 |     return timer_id;
155 | }
156 | 
157 | template <typename alloc_t>
158 | typename cpu_timer_manager_t<alloc_t>::timer_id_t
159 | cpu_timer_manager_t<alloc_t>::reschedule(
160 |     timer_id_t timer_id, cpu_clock_t::interval_t new_delay
161 | )
162 | {
163 |     auto it = timers.find(timer_id);
164 |     assert(it != timers.end());
165 | 
166 |     timer_id_t new_timer_id = _insert(new_delay, it->second);
167 | 
168 |     timers.erase(timer_id);
169 | 
170 |     DRIVER_DEBUG(
171 |         "Reschedules timer %" PRIu64 " as %" PRIu64 " with a %" PRIu64
172 |         " µs delay", timer_id.cycles, new_timer_id.cycles, new_delay.microsec()
173 |     );
174 | 
175 |     return new_timer_id;
176 | }
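A usage sketch (illustrative only, not part of the sources): scheduling a one-shot timer and cancelling it, e.g. a retransmission timer cleared by an early acknowledgment. 'tick()' must keep being called for expired timers to fire; the mPIPE polling loop does this between two ingress reads.

    template <typename alloc_t>
    inline void timer_manager_example(cpu_timer_manager_t<alloc_t> &timers)
    {
        // 'interval_t' takes microseconds: fires once, ~200 ms from now.
        typename cpu_timer_manager_t<alloc_t>::timer_id_t id = timers.schedule(
            cpu_clock_t::interval_t(200000),
            []() { /* retransmit the segment */ }
        );

        // ... an acknowledgment arrives before the timer expires ...
        timers.remove(id);

        timers.tick(); // Executes whatever timers are due by now.
    }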
177 | 
178 | template <typename alloc_t>
179 | bool cpu_timer_manager_t<alloc_t>::remove(timer_id_t timer_id)
180 | {
181 |     DRIVER_DEBUG("Unschedules timer %" PRIu64, timer_id.cycles);
182 | 
183 |     return timers.erase(timer_id);
184 | }
185 | 
186 | template <typename alloc_t>
187 | typename cpu_timer_manager_t<alloc_t>::timer_id_t
188 | cpu_timer_manager_t<alloc_t>::_insert(
189 |     cpu_clock_t::interval_t delay, function<void()> f
190 | )
191 | {
192 |     cpu_clock_t::time_t expire = cpu_clock_t::time_t::now() + delay;
193 | 
194 |     // Uses the next time slot if the current one already exists.
195 |     while (!timers.emplace(expire, f).second)
196 |         expire = expire.next();
197 | 
198 |     return expire;
199 | }
200 | 
201 | } } /* namespace rusty::driver */
202 | 
203 | #endif /* __RUSTY_DRIVER_TIMER_HPP__ */
--------------------------------------------------------------------------------
/net/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | include_directories (../)
2 | 
3 | add_library (net checksum.cpp)
4 | 
5 | target_link_libraries (net util)
--------------------------------------------------------------------------------
/net/arp.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // Manages ARP requests and responses.
3 | //
4 | // Copyright 2015 Raphael Javaux
5 | // University of Liege.
6 | //
7 | // This program is free software: you can redistribute it and/or modify
8 | // it under the terms of the GNU General Public License as published by
9 | // the Free Software Foundation, either version 3 of the License, or
10 | // (at your option) any later version.
11 | //
12 | // This program is distributed in the hope that it will be useful,
13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 | // GNU General Public License for more details.
16 | //
17 | // You should have received a copy of the GNU General Public License
18 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
19 | //
20 | 
21 | #ifndef __RUSTY_NET_ARP_HPP__
22 | #define __RUSTY_NET_ARP_HPP__
23 | 
24 | #include <cstdint>
25 | #include <cstring>
26 | #include <functional>
27 | #include <unordered_map>
28 | #include <vector>
29 | #include <tuple>          // piecewise_construct, forward_as_tuple()
30 | #include <utility>        // move(), pair
31 | #include <vector>
32 | 
33 | #include <net/if_arp.h>   // ARPOP_REQUEST, ARPOP_REPLY
34 | 
35 | #include "net/endian.hpp"  // net_t
36 | #include "util/macros.hpp" // RUSTY_*, COLOR_*
37 | 
38 | using namespace std;
39 | 
40 | namespace rusty {
41 | namespace net {
42 | 
43 | #define ARP_COLOR     COLOR_BLU
44 | #define ARP_DEBUG(MSG, ...)                                                    \
45 |     RUSTY_DEBUG("ARP", ARP_COLOR, MSG, ##__VA_ARGS__)
46 | #define ARP_ERROR(MSG, ...)                                                    \
47 |     RUSTY_ERROR("ARP", ARP_COLOR, MSG, ##__VA_ARGS__)
48 | #define ARP_DIE(MSG, ...)                                                      \
49 |     RUSTY_DIE(  "ARP", ARP_COLOR, MSG, ##__VA_ARGS__)
50 | 
51 | // *_NET constants are network byte order constants.
52 | static const net_t ARPOP_REQUEST_NET = ARPOP_REQUEST; 53 | static const net_t ARPOP_REPLY_NET = ARPOP_REPLY; 54 | 55 | template > 57 | struct arp_t { 58 | // 59 | // Member types 60 | // 61 | 62 | typedef typename data_link_t::clock_t clock_t; 63 | typedef typename data_link_t::cursor_t cursor_t; 64 | typedef typename data_link_t::timer_manager_t timer_manager_t; 65 | typedef typename timer_manager_t::timer_id_t timer_id_t; 66 | 67 | typedef typename data_link_t::addr_t data_link_addr_t; 68 | typedef typename proto_t::addr_t proto_addr_t; 69 | 70 | struct message_t { 71 | struct header_t { // fixed-size header 72 | net_t hrd; // format of hardware address 73 | net_t pro; // format of protocol address 74 | uint8_t hln; // length of hardware address 75 | uint8_t pln; // length of protocol address 76 | net_t op; // ARP opcode 77 | } __attribute__((__packed__)) hdr; 78 | 79 | net_t sha; // sender hardware address 80 | net_t spa; // sender protocol address 81 | 82 | net_t tha; // target hardware address 83 | net_t tpa; // target protocol address 84 | } __attribute__((__packed__)); 85 | 86 | struct static_entry_t { 87 | net_t proto_addr; 88 | net_t data_link_addr; 89 | }; 90 | 91 | struct cache_entry_t { 92 | net_t addr; 93 | 94 | // Static entries can not be removed. 95 | bool is_static; 96 | 97 | // Timer which triggers the expiration of the entry. 98 | // 99 | // Undefined for static entries. 100 | timer_id_t timer; 101 | }; 102 | 103 | // Types related to the 'addrs_cache' hash table. 104 | typedef pair, cache_entry_t> 105 | addrs_cache_pair_t; 106 | typedef typename alloc_t::template rebind::other 107 | addrs_cache_alloc_t; 108 | typedef unordered_map< 109 | net_t, cache_entry_t, 110 | hash>, equal_to>, 111 | addrs_cache_alloc_t 112 | > addrs_cache_t; 113 | 114 | // Callback used in the call of 'with_data_link_addr()'. 115 | typedef function *)> callback_t; 116 | 117 | struct pending_entry_t { 118 | vector callbacks; 119 | 120 | // Timer which triggers the expiration of resolution. 121 | timer_id_t timer; 122 | 123 | pending_entry_t(alloc_t _alloc = alloc_t()) : callbacks(_alloc) 124 | { 125 | } 126 | }; 127 | 128 | // Types related to the 'pending_reqs' hash table. 129 | typedef pair, pending_entry_t> 130 | pending_reqs_pair_t; 131 | typedef typename alloc_t::template rebind::other 132 | pending_reqs_alloc_t; 133 | typedef unordered_map< 134 | net_t, pending_entry_t, 135 | hash>, equal_to>, 136 | pending_reqs_alloc_t 137 | > pending_reqs_t; 138 | 139 | // 140 | // Static fields 141 | // 142 | 143 | // Delay in microseconds (10^-6) before an ARP table entry will be removed. 144 | static const typename clock_t::interval_t ENTRY_TIMEOUT; 145 | 146 | // Delay in microseconds (10^-6) to wait for an ARP resolution response. 147 | static const typename clock_t::interval_t REQUEST_TIMEOUT; 148 | 149 | // 150 | // Fields 151 | // 152 | 153 | // Data-link layer instance. 154 | data_link_t *data_link; 155 | 156 | timer_manager_t *timers; 157 | 158 | // Protocol layer instance. 159 | proto_t *proto; 160 | 161 | 162 | const net_t DATA_LINK_TYPE_NET = data_link_t::ARP_TYPE; 163 | const net_t PROTO_TYPE_NET = proto_t::ARP_TYPE; 164 | 165 | // Contains mapping/cache of known protocol addresses to their data-link 166 | // addresses. 167 | // 168 | // The set of known protocol addresses is disjoint with the set of addresses 169 | // in 'pending_reqs'. 
170 |     addrs_cache_t addrs_cache;
171 | 
172 |     // Contains a mapping of protocol addresses for which an ARP request has
173 |     // been broadcasted but no response has been received yet.
174 |     // The value contains a vector of functions which must be called once the
175 |     // ARP reply is received.
176 |     //
177 |     // The set of pending protocol addresses is disjoint with the set of
178 |     // addresses in 'addrs_cache'.
179 |     pending_reqs_t pending_reqs;
180 | 
181 |     //
182 |     // Methods
183 |     //
184 | 
185 |     // Creates an ARP environment without initializing it.
186 |     //
187 |     // One must call 'init()' before using any other method.
188 |     arp_t(alloc_t _alloc = alloc_t())
189 |         : addrs_cache(
190 |             0, hash<net_t<proto_addr_t>>(), equal_to<net_t<proto_addr_t>>(),
191 |             _alloc
192 |         ),
193 |         pending_reqs(
194 |             0, hash<net_t<proto_addr_t>>(), equal_to<net_t<proto_addr_t>>(),
195 |             _alloc
196 |         )
197 |     {
198 |     }
199 | 
200 |     // Creates an ARP environment for the given data-link and protocol layer
201 |     // instances.
202 |     //
203 |     // Does the same thing as creating the environment with 'arp_t()' and then
204 |     // calling 'init()'.
205 |     arp_t(
206 |         data_link_t *_data_link, timer_manager_t *_timers, proto_t *_proto,
207 |         vector<static_entry_t> static_entries = vector<static_entry_t>(),
208 |         alloc_t _alloc = alloc_t()
209 |     ) : data_link(_data_link), timers(_timers), proto(_proto),
210 |         addrs_cache(
211 |             static_entries.size(),
212 |             hash<net_t<proto_addr_t>>(), equal_to<net_t<proto_addr_t>>(), _alloc
213 |         ),
214 |         pending_reqs(
215 |             0, hash<net_t<proto_addr_t>>(), equal_to<net_t<proto_addr_t>>(),
216 |             _alloc
217 |         )
218 |     {
219 |         _insert_static_entries(static_entries);
220 |     }
221 | 
222 |     // Initializes an ARP environment for the given data-link and protocol layer
223 |     // instances.
224 |     void init(
225 |         data_link_t *_data_link, timer_manager_t *_timers, proto_t *_proto,
226 |         vector<static_entry_t> static_entries = vector<static_entry_t>()
227 |     )
228 |     {
229 |         data_link = _data_link;
230 |         timers = _timers;
231 |         proto = _proto;
232 |         _insert_static_entries(static_entries);
233 |     }
234 | 
235 |     // Processes an ARP message which starts at the given cursor (data-link frame
236 |     // payload without data-link layer headers).
237 |     //
238 |     // This method is typically called by the data-link layer when it receives
239 |     // a frame.
240 |     void receive_message(cursor_t cursor)
241 |     {
242 | #define IGNORE_MSG(WHY, ...)                                                  \
243 |     do {                                                                      \
244 |         ARP_ERROR("Message ignored: " WHY, ##__VA_ARGS__);                    \
245 |         return;                                                               \
246 |     } while (0)
247 | 
248 |         size_t cursor_size = cursor.size();
249 | 
250 |         if (UNLIKELY(cursor_size < sizeof (typename message_t::header_t))) {
251 |             IGNORE_MSG("too small to hold an ARP message's fixed-size header");
252 |             return;
253 |         }
254 | 
255 |         cursor.template read_with(
256 |             [this, cursor_size](const message_t *msg) {
257 |                 //
258 |                 // Checks that the ARP message is for the given data-link and
259 |                 // protocol layers.
260 |                 // Ignores the message otherwise.
261 |                 //
262 | 
263 |                 if (UNLIKELY(msg->hdr.hrd != DATA_LINK_TYPE_NET)) {
264 |                     IGNORE_MSG(
265 |                         "invalid hardware type (received %hu, expected %hu)",
266 |                         msg->hdr.hrd.host(), data_link_t::ARP_TYPE
267 |                     );
268 |                 }
269 | 
270 |                 if (UNLIKELY(msg->hdr.pro != PROTO_TYPE_NET)) {
271 |                     IGNORE_MSG(
272 |                         "invalid protocol type (received %hu, expected %hu)",
273 |                         msg->hdr.pro.host(), proto_t::ARP_TYPE
274 |                     );
275 |                 }
276 | 
277 |                 if (UNLIKELY(msg->hdr.hln != data_link_t::ADDR_LEN)) {
278 |                     IGNORE_MSG(
279 |                         "invalid hardware address size "
280 |                         "(received %zu, expected %zu)",
281 |                         (size_t) msg->hdr.hln, (size_t) data_link_t::ADDR_LEN
282 |                     );
283 |                 }
284 | 
285 |                 if (UNLIKELY(msg->hdr.pln != proto_t::ADDR_LEN)) {
286 |                     IGNORE_MSG(
287 |                         "invalid protocol address size "
288 |                         "(received %zu, expected %zu)",
289 |                         (size_t) msg->hdr.pln, (size_t) proto_t::ADDR_LEN
290 |                     );
291 |                 }
292 | 
293 |                 if (UNLIKELY(cursor_size < sizeof (message_t)))
294 |                     IGNORE_MSG("too small to hold an ARP message");
295 | 
296 |                 //
297 |                 // Processes the ARP message.
298 |                 //
299 | 
300 |                 if (msg->hdr.op == ARPOP_REQUEST_NET) {
301 |                     ARP_DEBUG(
302 |                         "Receives an ARP request from %s (%s)",
303 |                         proto_t::addr_t::to_alpha(msg->spa),
304 |                         data_link_t::addr_t::to_alpha(msg->sha)
305 |                     );
306 | 
307 |                     _cache_update(msg->sha, msg->spa);
308 | 
309 |                     if (msg->tpa == this->proto->addr) {
310 |                         // Someone is asking for our Ethernet address.
311 |                         // Sends an ARP reply with our protocol address to the host
312 |                         // which sent the request.
313 | 
314 |                         send_message(ARPOP_REPLY_NET, msg->sha, msg->spa);
315 |                     }
316 |                 } else if (msg->hdr.op == ARPOP_REPLY_NET) {
317 |                     ARP_DEBUG(
318 |                         "Receives an ARP reply from %s (%s)",
319 |                         proto_t::addr_t::to_alpha(msg->spa),
320 |                         data_link_t::addr_t::to_alpha(msg->sha)
321 |                     );
322 | 
323 |                     _cache_update(msg->sha, msg->spa);
324 |                 } else
325 |                     IGNORE_MSG("unknown ARP opcode (%hu)", msg->hdr.op.host());
326 |             });
327 | 
328 | #undef IGNORE_MSG
329 |     }
330 | 
331 |     // Creates and pushes an ARP message to the data-link layer (L2).
332 |     void send_message(
333 |         net_t<uint16_t> op, net_t<data_link_addr_t> tha, net_t<proto_addr_t> tpa
334 |     )
335 |     {
336 |         #ifndef NDEBUG
337 |         if (op == ARPOP_REQUEST_NET) {
338 |             ARP_DEBUG(
339 |                 "Requests for %s at %s", proto_t::addr_t::to_alpha(tpa),
340 |                 data_link_t::addr_t::to_alpha(tha)
341 |             );
342 |         } else if (op == ARPOP_REPLY_NET) {
343 |             ARP_DEBUG(
344 |                 "Replies to %s (%s)", proto_t::addr_t::to_alpha(tpa),
345 |                 data_link_t::addr_t::to_alpha(tha)
346 |             );
347 |         } else {
348 |             ARP_DIE(
349 |                 "Trying to send an ARP message with an invalid operation "
350 |                 "code"
351 |             );
352 |         }
353 |         #endif
354 | 
355 |         this->data_link->send_arp_payload(
356 |             tha, sizeof (message_t), [this, op, tha, tpa](cursor_t cursor) {
357 |                 _write_message(cursor, op, tha, tpa);
358 |             }
359 |         );
360 |     }
361 | 
362 |     // Executes the given callback function by giving the data-link address
363 |     // corresponding to the given protocol address.
364 |     //
365 |     // The callback will receive a 'nullptr' as 'data_link_addr_t' if the
366 |     // address is unreachable.
367 |     //
368 |     // The callback will immediately be executed if the mapping is in the cache
369 |     // ('addrs_cache') but could be delayed if an ARP transaction is required.
370 |     //
371 |     // Returns 'true' if the address was in the ARP cache and the callback has
372 |     // been executed, or 'false' if the callback execution has been delayed
373 |     // because of an unknown protocol address.
374 | // 375 | // Example with ARP for IPv4 over Ethernet: 376 | // 377 | // arp.with_data_link_addr(ipv4_addr, [=](auto ether_addr) { 378 | // printf( 379 | // "%s hardware address is %s\n", inet_ntoa(ipv4_addr), 380 | // ether_ntoa(ether_addr) 381 | // ); 382 | // }); 383 | // 384 | bool with_data_link_addr( 385 | net_t proto_addr, callback_t callback 386 | ) 387 | { 388 | // NOTE: this procedure should require an exclusive lock for addrs_cache 389 | // and pending_reqs in case of multiple threads executing it. 390 | 391 | // lock 392 | 393 | auto it_cache = this->addrs_cache.find(proto_addr); 394 | 395 | if (it_cache != this->addrs_cache.end()) { 396 | // Hardware address is cached. 397 | 398 | // unlock 399 | callback(&it_cache->second.addr); 400 | return true; 401 | } else { 402 | // Hardware address is NOT cached. 403 | // 404 | // Checks if a pending request exists for this address. 405 | 406 | auto it_pending = this->pending_reqs.find(proto_addr); 407 | 408 | if (it_pending != this->pending_reqs.end()) { 409 | // The pending request entry already existed. A request has 410 | // already been broadcasted for this protocol address. 411 | // 412 | // Simply adds the callback to the vector. 413 | 414 | it_pending->second.callbacks.push_back(callback); 415 | 416 | // unlock 417 | } else { 418 | // No previous pending request entry. 419 | // 420 | // Creates the entry with a new timer and broadcasts an ARP 421 | // request for this protocol address. 422 | 423 | auto p = this->pending_reqs.emplace( 424 | piecewise_construct, 425 | forward_as_tuple(proto_addr), 426 | forward_as_tuple(this->pending_reqs.get_allocator()) 427 | ); 428 | assert(p.second); // Emplace succeed. 429 | pending_entry_t *entry = &p.first->second; 430 | 431 | entry->callbacks.push_back(callback); 432 | 433 | entry->timer = timers->schedule( 434 | REQUEST_TIMEOUT, [this, proto_addr]() { 435 | this->_remove_pending_request(proto_addr); 436 | } 437 | ); 438 | 439 | // unlock 440 | 441 | this->send_message( 442 | ARPOP_REQUEST_NET, data_link_t::BROADCAST_ADDR, proto_addr 443 | ); 444 | } 445 | 446 | return false; 447 | } 448 | } 449 | 450 | private: 451 | 452 | void _insert_static_entries(vector static_entries) 453 | { 454 | for (static_entry_t static_entry : static_entries) { 455 | cache_entry_t cache_entry; 456 | cache_entry.addr = static_entry.data_link_addr; 457 | cache_entry.is_static = true; 458 | 459 | addrs_cache.emplace(static_entry.proto_addr, cache_entry); 460 | } 461 | } 462 | 463 | // Removes a pending entry for the given protocol address. 464 | // 465 | // Doesn't unschedule the timer. 466 | void _remove_pending_request(net_t addr) 467 | { 468 | ARP_DEBUG( 469 | "Removes pending request for %s", proto_t::addr_t::to_alpha(addr) 470 | ); 471 | this->pending_reqs.erase(addr); 472 | } 473 | 474 | // Adds the given protocol to data-link layer address mapping in the cache 475 | // or updates cache entry if it already exists. 476 | // 477 | // In case of a new address, executes pending requests callbacks linked to 478 | // the protocol, if any. 479 | void _cache_update( 480 | net_t data_link_addr, net_t proto_addr 481 | ) 482 | { 483 | // NOTE: this procedure should require an exclusive lock for addrs_cache 484 | // and pending_reqs in case of multiple threads executing it. 485 | 486 | // lock 487 | 488 | // Schedules a timer to remove the entry after ENTRY_TIMEOUT. 
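        //
        // Note that a fresh timer is scheduled even when the entry turns out
        // to already exist; the stale timer is then cancelled below, so every
        // ARP message from a host effectively extends its entry's lifetime.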
489 | timer_id_t timer_id = timers->schedule( 490 | ENTRY_TIMEOUT, [this, proto_addr]() 491 | { 492 | this->_remove_cache_entry(proto_addr); 493 | } 494 | ); 495 | 496 | cache_entry_t entry { data_link_addr, false, timer_id }; 497 | auto inserted = this->addrs_cache.emplace(proto_addr, entry); 498 | 499 | if (!inserted.second) { 500 | // Address already in cache, replace the previous value if 501 | // different and not static. 502 | 503 | cache_entry_t *inserted_entry = &inserted.first->second; 504 | 505 | if (inserted_entry->is_static) 506 | return; 507 | 508 | if (UNLIKELY(inserted_entry->addr != data_link_addr)) { 509 | ARP_DEBUG( 510 | "Updates %s cache entry to %s (was %s)", 511 | proto_t::addr_t::to_alpha(proto_addr), 512 | data_link_t::addr_t::to_alpha(data_link_addr), 513 | data_link_t::addr_t::to_alpha(inserted_entry->addr) 514 | ); 515 | inserted_entry->addr = data_link_addr; 516 | } 517 | 518 | // Replaces the old timeout. 519 | 520 | timers->remove(inserted_entry->timer); 521 | inserted_entry->timer = timer_id; 522 | 523 | // unlock 524 | } else { 525 | // The address was not in cache. Checks for pending requests. 526 | 527 | ARP_DEBUG( 528 | "New cache entry (%s is %s)", 529 | proto_t::addr_t::to_alpha(proto_addr), 530 | data_link_t::addr_t::to_alpha(data_link_addr) 531 | ); 532 | 533 | auto it = this->pending_reqs.find(proto_addr); 534 | 535 | if (it != this->pending_reqs.end()) { 536 | // The address has pending requests. 537 | 538 | pending_entry_t *pending_entry = &it->second; 539 | 540 | // Removes the request timeout. 541 | timers->remove(pending_entry->timer); 542 | 543 | // As it's possible that one of these callbacks induce a new 544 | // lookup to the ARP cache for the same address, and thus a 545 | // deadlock, we must first remove the pending requests entry and 546 | // free the lock before calling any callback. 547 | 548 | vector callbacks = move( 549 | pending_entry->callbacks 550 | ); 551 | this->pending_reqs.erase(it); 552 | 553 | // unlock 554 | 555 | ARP_DEBUG( 556 | "Executes %d pending callbacks for %s", 557 | (int) callbacks.size(), 558 | proto_t::addr_t::to_alpha(proto_addr) 559 | ); 560 | 561 | // Executes the callbacks. 562 | 563 | for (callback_t& callback : callbacks) 564 | callback(&data_link_addr); 565 | } else { 566 | // No pending request. 567 | // 568 | // Occurs when the addres has not been requested. 569 | 570 | // unlock 571 | } 572 | } 573 | } 574 | 575 | // Removes the cache entry for the given protocol address. 576 | // 577 | // Doesn't unschedule the timer. 578 | void _remove_cache_entry(net_t addr) 579 | { 580 | // Does not remove a static entry. 581 | assert(!this->addrs_cache.find(addr)->second.is_static); 582 | 583 | ARP_DEBUG( 584 | "Removes cache entry for %s", proto_t::addr_t::to_alpha(addr) 585 | ); 586 | 587 | this->addrs_cache.erase(addr); 588 | } 589 | 590 | // Writes the ARP message after the given buffer cursor. 591 | // 592 | // NOTE: inline ? 
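    //
    // For the Ethernet/IPv4 instantiation, the message written below is the
    // standard 28 bytes ARP packet: 'hrd' = 1 (Ethernet), 'pro' = 0x0800
    // (IPv4), 'hln' = 6, 'pln' = 4, followed by the sender and target
    // hardware/protocol address pairs.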
593 |     cursor_t _write_message(
594 |         cursor_t cursor, net_t<uint16_t> op, net_t<data_link_addr_t> tha,
595 |         net_t<proto_addr_t> tpa
596 |     )
597 |     {
598 |         return cursor.template write_with(
599 |             [this, op, tha, tpa](message_t *msg) {
600 |                 msg->hdr.hrd = DATA_LINK_TYPE_NET;
601 |                 msg->hdr.pro = PROTO_TYPE_NET;
602 | 
603 |                 msg->hdr.hln = (uint8_t) data_link_t::ADDR_LEN;
604 |                 msg->hdr.pln = (uint8_t) proto_t::ADDR_LEN;
605 | 
606 |                 msg->hdr.op = op;
607 | 
608 |                 msg->sha = this->data_link->addr;
609 |                 msg->spa = this->proto->addr;
610 |                 msg->tha = tha;
611 |                 msg->tpa = tpa;
612 |             }
613 |         );
614 |     }
615 | };
616 | 
617 | #undef ARP_COLOR
618 | #undef ARP_DEBUG
619 | 
620 | template <typename data_link_t, typename proto_t, typename alloc_t>
621 | const typename arp_t<data_link_t, proto_t, alloc_t>::clock_t::interval_t
622 | arp_t<data_link_t, proto_t, alloc_t>::ENTRY_TIMEOUT(3600L * 1000000L);
623 | 
624 | template <typename data_link_t, typename proto_t, typename alloc_t>
625 | const typename arp_t<data_link_t, proto_t, alloc_t>::clock_t::interval_t
626 | arp_t<data_link_t, proto_t, alloc_t>::REQUEST_TIMEOUT(5L * 1000000L);
627 | 
628 | } } /* namespace rusty::net */
629 | 
630 | #endif /* __RUSTY_NET_ARP_HPP__ */
631 | 
--------------------------------------------------------------------------------
/net/checksum.cpp:
--------------------------------------------------------------------------------
1 | //
2 | // Computes a checksum required by IPv4 and TCP protocols.
3 | //
4 | // Copyright 2015 Raphael Javaux
5 | // University of Liege.
6 | //
7 | // This program is free software: you can redistribute it and/or modify
8 | // it under the terms of the GNU General Public License as published by
9 | // the Free Software Foundation, either version 3 of the License, or
10 | // (at your option) any later version.
11 | //
12 | // This program is distributed in the hope that it will be useful,
13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 | // GNU General Public License for more details.
16 | //
17 | // You should have received a copy of the GNU General Public License
18 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
19 | //
20 | 
21 | #include <cassert>  // assert()
22 | #include <cstdint>  // uint16_t, uint32_t, uint64_t, intptr_t
23 | #include <cstddef>  // size_t
24 | 
25 | #include <endian.h> // __BIG_ENDIAN, __BYTE_ORDER, __LITTLE_ENDIAN
26 | 
27 | #include "net/endian.hpp" // net_t, to_host(), to_network()
28 | 
29 | #include "net/checksum.hpp"
30 | 
31 | namespace rusty {
32 | namespace net {
33 | 
34 | const partial_sum_t partial_sum_t::ZERO = partial_sum_t();
35 | 
36 | const checksum_t checksum_t::ZERO = checksum_t();
37 | 
38 | #ifndef NDEBUG
39 | // Reference implementation of the ones' complement sum.
40 | //
41 | // Only used for debugging.
42 | static uint16_t _ones_complement_sum_naive(const void *data, size_t size);
43 | #endif /* NDEBUG */
44 | 
45 | uint16_t _ones_complement_sum(const void *data, size_t size)
46 | {
47 |     // The 16 bits ones' complement sum is the ones' complement addition of
48 |     // every pair of bytes. If there is an odd number of bytes, then a zero byte
49 |     // is virtually added to the buffer.
50 |     //
51 |     // e.g. the ones' complement sum of the bytes [a, b, c, d, e, f, g] is
52 |     // [a, b] +' [c, d] +' [e, f] +' [g, 0] where +' is the ones' complement
53 |     // addition.
54 |     //
55 |     // Ones' complement addition is standard addition but with the carry bit
56 |     // added to the result.
57 |     //
58 |     // As an example, here is the 4 bits ones' complement addition of 1111 and
59 |     // 1011:
60 |     //
61 |     //      1111
62 |     //    + 1011
63 |     //    ------
64 |     //    1 1010
65 |     //    \--------> The carry bit here must be added to the result (1010).
66 |     //      1011 --> 4 bits ones' complement addition of 1111 and 1011.
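    //
    // The same rule applies at 16 bits, e.g. 0xFFFF +' 0x0001 overflows to
    // 0x10000, and folding the carry bit back in gives 0x0000 + 1 = 0x0001.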
67 |     //
68 |     // The 16 bits ones' complement sum is thus equal to:
69 |     //
70 |     //     uint16_t *p = (uint16_t *) data;
71 |     //
72 |     //     // Computes the ones' complement sum.
73 |     //     uint32_t sum = 0;
74 |     //     for (size_t i = 0; i < size / 2; i++) {
75 |     //         sum += p[i];
76 |     //
77 |     //         if (sum >> 16) // if carry bit
78 |     //             sum += 1;
79 |     //     }
80 |     //
81 |     // Instead of only adding 16 bits at a time and checking for a carry bit
82 |     // at each addition (which produces a lot of unpredictable branches), we use
83 |     // a trick from [1].
84 |     //
85 |     // The trick is to use a 64 bits integer as the sum's accumulator and to add
86 |     // two pairs of bytes at a time (2 x 16 bits = 32 bits). The 32 most
87 |     // significant bits of the 64 bits accumulator will accumulate carry bits
88 |     // while the 32 least significant bits will accumulate two 16 bits sums:
89 |     //
90 |     // +-----------------------------------+-----------------+-----------------+
91 |     // |  32 bits carry bits accumulator   | 2nd 16 bits sum | 1st 16 bits sum |
92 |     // +-----------------------------------+-----------------+-----------------+
93 |     // \-----------------------------------------------------------------------/
94 |     //                           64 bits accumulator
95 |     //
96 |     // It's not a problem that the least significant 16 bits sum produces a
97 |     // carry bit, as ones' complement addition is commutative and as this carry
98 |     // bit will be added to the second sum, which will be summed with the first
99 |     // one later.
100 |     //
101 |     // The algorithm then adds the 32 bits carry bits accumulator to the sum of
102 |     // the two 16 bits sums to get the final sum.
103 |     //
104 |     // [1]: http://tools.ietf.org/html/rfc1071
105 | 
106 |     assert(data != nullptr);
107 | 
108 |     uint64_t sum = 0;
109 |     const uint32_t *data32 = (const uint32_t *) ((intptr_t) data & ~0x3);
110 |     size_t remaining = size;
111 | 
112 |     // Processes the first bytes not aligned on 32 bits.
113 |     intptr_t unaligned_offset = (intptr_t) data & 0x3;
114 |     if (unaligned_offset) {
115 |         size_t unaligned_bytes = sizeof (uint32_t) - unaligned_offset;
116 |         // Loads the entire first word but masks the bytes which are before the
117 |         // buffer. This should be safe as memory pages are word-aligned.
118 | 
119 |         #if __BYTE_ORDER == __LITTLE_ENDIAN
120 |             uint32_t word_mask = 0xFFFFFFFF << (unaligned_offset * 8);
121 |         #elif __BYTE_ORDER == __BIG_ENDIAN
122 |             uint32_t word_mask = 0xFFFFFFFF >> (unaligned_offset * 8);
123 |         #else
124 |             #error "Please set __BYTE_ORDER in <endian.h>"
125 |         #endif
126 | 
127 |         if (unaligned_bytes > remaining) {
128 |             // Masks the bytes that are after the buffer.
129 |             size_t mask_right = unaligned_bytes - remaining;
130 |             unaligned_bytes = remaining;
131 | 
132 |             #if __BYTE_ORDER == __LITTLE_ENDIAN
133 |                 word_mask &= 0xFFFFFFFF >> (mask_right * 8);
134 |             #elif __BYTE_ORDER == __BIG_ENDIAN
135 |                 word_mask &= 0xFFFFFFFF << (mask_right * 8);
136 |             #else
137 |                 #error "Please set __BYTE_ORDER in <endian.h>"
138 |             #endif
139 |         }
140 | 
141 |         sum += data32[0] & word_mask;
142 |         remaining -= unaligned_bytes;
143 |         data32++;
144 |     }
145 | 
146 |     // Sums 32 bits at a time.
147 |     while (remaining >= sizeof (uint32_t)) {
148 |         sum += *data32;
149 |         remaining -= sizeof (uint32_t);
150 |         data32++;
151 |     }
152 | 
153 |     // Sums the last bytes which could not fully fit a 32 bits integer.
154 |     if (remaining > 0) {
155 |         // Loads the last entire word but masks the bytes that are after the
156 |         // buffer. This should be safe as the load is word-aligned, and an
157 |         // aligned word never crosses a memory page boundary.
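        // e.g. with one remaining byte on a little-endian host, 'mask_right'
        // below is 3 and 'word_mask' is 0x000000FF, which keeps only the
        // buffer's last byte (the least significant byte of the aligned word).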
158 | 
159 |         size_t mask_right = sizeof (uint32_t) - remaining;
160 | 
161 |         #if __BYTE_ORDER == __LITTLE_ENDIAN
162 |             uint32_t word_mask = 0xFFFFFFFF >> (mask_right * 8);
163 |         #elif __BYTE_ORDER == __BIG_ENDIAN
164 |             uint32_t word_mask = 0xFFFFFFFF << (mask_right * 8);
165 |         #else
166 |             #error "Please set __BYTE_ORDER in <endian.h>"
167 |         #endif
168 | 
169 |         sum += data32[0] & word_mask;
170 |     }
171 | 
172 |     // 16 bits ones' complement sum of the two sub-sums and the carry bits.
173 |     do
174 |         sum = (sum >> 16) + (sum & 0xFFFF);
175 |     while (sum >> 16);
176 | 
177 |     // If data started on an odd address, we computed the wrong sum. We computed
178 |     // [0, a] +' [b, c] +' ... instead of [a, b] +' [c, d] +' ...
179 |     //
180 |     // The correct sum can be obtained by swapping bytes.
181 |     uint16_t ret;
182 |     if ((intptr_t) data & 0x1)
183 |         ret = _swap_bytes((uint16_t) sum);
184 |     else
185 |         ret = (uint16_t) sum;
186 | 
187 |     assert(ret == _ones_complement_sum_naive(data, size));
188 | 
189 |     return ret;
190 | }
191 | 
192 | partial_sum_t precomputed_sums_t::sum(size_t begin, size_t end) const
193 | {
194 |     assert(begin <= end);
195 |     assert(end <= this->size);
196 | 
197 |     // Word indices; the odd boundary bytes, if any, are handled below.
198 |     size_t begin_div2 = begin / 2,
199 |            end_div2   = end   / 2;
200 | 
201 |     // Subtraction in ones' complement arithmetic is adding the negation.
202 |     uint32_t sum = this->table[end_div2]
203 |                  + ((uint16_t) ~this->table[begin_div2]);
204 | 
205 |     // Adds the section's last byte when it doesn't fill a whole 16 bits word.
206 |     if (end & 0x1) {
207 |         #if __BYTE_ORDER == __LITTLE_ENDIAN
208 |             sum += ((const uint8_t *) this->data)[end - 1];
209 |         #elif __BYTE_ORDER == __BIG_ENDIAN
210 |             sum += ((const uint8_t *) this->data)[end - 1] << 8;
211 |         #else
212 |             #error "Please set __BYTE_ORDER in <endian.h>"
213 |         #endif
214 |     }
215 | 
216 |     // Removes the non-included byte which precedes the section ('uint8_t'
217 |     // avoids sign extension).
218 |     if (begin & 0x1) {
219 |         #if __BYTE_ORDER == __LITTLE_ENDIAN
220 |             sum += (uint16_t) ~((const uint8_t *) this->data)[begin - 1];
221 |         #elif __BYTE_ORDER == __BIG_ENDIAN
222 |             sum += (uint16_t) ~((const uint8_t *) this->data)[begin - 1] << 8;
223 |         #else
224 |             #error "Please set __BYTE_ORDER in <endian.h>"
225 |         #endif
226 | 
227 |         // Folds the carry bits before swapping bytes.
228 |         do
229 |             sum = (sum >> 16) + (sum & 0xFFFF);
230 |         while (sum >> 16);
231 | 
232 |         sum = (uint32_t) _swap_bytes((uint16_t) sum);
233 |     }
234 | 
235 |     do
236 |         sum = (sum >> 16) + (sum & 0xFFFF);
237 |     while (sum >> 16);
238 | 
239 |     size_t size = end - begin;
240 | 
241 |     partial_sum_t ret = partial_sum_t((uint16_t) sum, size & 0x1);
242 | 
243 |     assert(ret == partial_sum_t((const char *) this->data + begin, size));
244 | 
245 |     return ret;
246 | }
247 | 
248 | const uint16_t *
249 | precomputed_sums_t::_precompute_table(const void *_data, size_t _size)
250 | {
251 |     size_t size_table = _size / 2 + 1;
252 | 
253 |     uint16_t *table = new uint16_t[size_table];
254 | 
255 |     // Sums two bytes at a time.
256 |     const uint16_t *data16 = (const uint16_t *) _data;
257 | 
258 |     table[0] = 0;
259 |     for (size_t i = 1; i < size_table; i++) {
260 |         // Sums pairs of bytes in a 32 bits integer, so the carry bit will not
261 |         // be lost.
262 |         uint32_t sum = (uint32_t) table[i - 1] + (uint32_t) data16[i - 1];
263 | 
264 |         // Folds the carry bit back into the stored 16 bits sum.
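        // Invariant: after this store, 'table[i]' holds the ones' complement
        // sum of the buffer's first '2 * i' bytes, which is what 'sum()'
        // relies on to answer range queries in constant time.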
265 |         table[i] = (uint16_t) ((sum >> 16) + (sum & 0xFFFF));
266 |     }
267 | 
268 |     assert(
269 |         table[size_table - 1]
270 |         == _ones_complement_sum_naive(_data, _size & ~0x1) // Even prefix only.
271 |     );
272 | 
273 |     return table;
274 | }
275 | 
276 | #ifndef NDEBUG
277 | static uint16_t _ones_complement_sum_naive(const void *data, size_t size)
278 | {
279 |     uint64_t sum = 0;
280 | 
281 |     // Sums two bytes at a time.
282 |     const uint16_t *data16 = (const uint16_t *) data;
283 | 
284 |     while (size > 1) {
285 |         sum += *data16;
286 |         size -= 2;
287 |         data16++;
288 |     }
289 | 
290 |     // Adds the left-over byte, if any.
291 |     if (size > 0) {
292 |         #if __BYTE_ORDER == __LITTLE_ENDIAN
293 |             uint16_t mask = 0x00FF;
294 |         #elif __BYTE_ORDER == __BIG_ENDIAN
295 |             uint16_t mask = 0xFF00;
296 |         #else
297 |             #error "Please set __BYTE_ORDER in <endian.h>"
298 |         #endif
299 | 
300 |         sum += *data16 & mask;
301 |     }
302 | 
303 |     // Folds the 64-bit sum to 16 bits.
304 |     while (sum >> 16)
305 |         sum = (sum >> 16) + (sum & 0xFFFF);
306 | 
307 |     return (uint16_t) sum;
308 | }
309 | #endif /* NDEBUG */
310 | 
311 | } } /* namespace rusty::net */
--------------------------------------------------------------------------------
/net/checksum.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // Computes a checksum required by IPv4 and TCP protocols.
3 | //
4 | // Copyright 2015 Raphael Javaux
5 | // University of Liege.
6 | //
7 | // This program is free software: you can redistribute it and/or modify
8 | // it under the terms of the GNU General Public License as published by
9 | // the Free Software Foundation, either version 3 of the License, or
10 | // (at your option) any later version.
11 | //
12 | // This program is distributed in the hope that it will be useful,
13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 | // GNU General Public License for more details.
16 | //
17 | // You should have received a copy of the GNU General Public License
18 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
19 | //
20 | 
21 | #ifndef __RUSTY_NET_CHECKSUM_HPP__
22 | #define __RUSTY_NET_CHECKSUM_HPP__
23 | 
24 | #include <cstdint>        // uint16_t, uint32_t
25 | 
26 | #include "net/endian.hpp" // net_t
27 | 
28 | namespace rusty {
29 | namespace net {
30 | 
31 | // Computes the 16 bits ones' complement sum of the given buffer.
32 | uint16_t _ones_complement_sum(const void *data, size_t size);
33 | 
34 | // Swaps the two bytes of the integer ([a, b] -> [b, a]).
35 | static inline uint16_t _swap_bytes(uint16_t bytes);
36 | 
37 | // -----------------------------------------------------------------------------
38 | 
39 | // Partially computed checksums.
40 | //
41 | // Used to compute a checksum incrementally.
42 | //
43 | // Can be computed with 'partial_sum_t()' and combined with another partially
44 | // computed sum with 'append()'. The checksum can then be computed from this
45 | // partially computed sum with 'checksum_t()'.
46 | struct partial_sum_t {
47 |     uint16_t sum;
48 | 
49 |     // 'true' when the sum has been computed on an odd number of bytes.
50 |     bool odd;
51 | 
52 |     // Sum of an empty buffer.
53 |     static const partial_sum_t ZERO;
54 | 
55 |     // Initializes the partial sum to zero (sum of an empty buffer).
56 |     inline partial_sum_t(void)
57 |     {
58 |         this->sum = 0;
59 |         this->odd = false;
60 |     }
61 | 
62 |     // Initializes from an already computed sum.
63 |     inline partial_sum_t(uint16_t _sum, bool _odd)
64 |     {
65 |         this->sum = _sum;
66 |         this->odd = _odd;
67 |     }
68 | 
69 |     // Computes the sum of a buffer.
70 |     inline partial_sum_t(const void *data, size_t size)
71 |     {
72 |         this->sum = _ones_complement_sum(data, size);
73 |         this->odd = (bool) (size & 0x1);
74 |     }
75 | 
76 |     // Computes the sum of a buffer cursor.
77 |     template <typename cursor_t>
78 |     inline partial_sum_t(cursor_t cursor) : partial_sum_t()
79 |     {
80 |         cursor.for_each([this](const void *buffer, size_t size) {
81 |             *this = this->append(partial_sum_t(buffer, size));
82 |         });
83 |     }
84 | 
85 |     // Returns the partial sum that would have been obtained if the buffer of
86 |     // the current sum and the buffer of the given partial sum were concatenated.
87 |     inline partial_sum_t append(partial_sum_t second) const
88 |     {
89 |         uint32_t sum = this->sum;
90 | 
91 |         // When the first sum was computed on an odd number of bytes, we
92 |         // virtually added a zero byte to the buffer. We need to swap the bytes
93 |         // of the second sum to cancel this padding.
94 |         if (this->odd)
95 |             sum += _swap_bytes(second.sum);
96 |         else
97 |             sum += second.sum;
98 | 
99 |         sum += sum >> 16; // Carry bit.
100 | 
101 |         return { (uint16_t) sum, this->odd != second.odd };
102 |     }
103 | 
104 |     inline friend bool operator==(partial_sum_t a, partial_sum_t b)
105 |     {
106 |         return a.sum == b.sum && a.odd == b.odd;
107 |     }
108 | 
109 |     inline friend bool operator!=(partial_sum_t a, partial_sum_t b)
110 |     {
111 |         return !(a == b);
112 |     }
113 | };
114 | 
115 | // -----------------------------------------------------------------------------
116 | 
117 | // Precomputed partial sum table.
118 | //
119 | // Once computed from a data buffer, the precomputed table can give, in constant
120 | // time, the ones' complement sum of any subsection of the buffer.
121 | //
122 | // It uses internally a kind of summed area table [1] instead of computing the
123 | // sum for every possible sub-section, giving a O(n) memory space usage (with n
124 | // being the number of bytes in the original buffer).
125 | //
126 | // [1] https://en.wikipedia.org/wiki/Summed_area_table
127 | struct precomputed_sums_t {
128 |     const void   *data;
129 |     const size_t size;
130 | 
131 |     const uint16_t *table;
132 | 
133 |     // Given a data buffer and its size, precomputes the ones' complement sum
134 |     // table.
135 |     //
136 |     // '_data' must never be de-allocated before the precomputed sum table.
137 |     //
138 |     // Complexity: O(_size).
139 |     precomputed_sums_t(const void *_data, size_t _size)
140 |         : data(_data), size(_size), table(_precompute_table(_data, _size))
141 |     {
142 |     }
143 | 
144 |     // Returns the partial sum of the data in the buffer which starts at 'begin'
145 |     // (inclusive) and which stops at 'end' (excluded).
146 |     partial_sum_t sum(size_t begin, size_t end) const;
147 | 
148 |     inline void prefetch(size_t begin, size_t end) const
149 |     {
150 |         size_t begin_div2 = begin / 2,
151 |                end_div2   = end   / 2;
152 |         __builtin_prefetch(this->table + end_div2);
153 |         __builtin_prefetch(this->table + begin_div2);
154 |     }
155 | 
156 | private:
157 | 
158 |     // Allocates and computes the ones' complement sum table.
159 |     static const uint16_t *_precompute_table(const void *_data, size_t _size);
160 | };
161 | 
162 | 
163 | // -----------------------------------------------------------------------------
164 | 
165 | //
166 | // Checksum
167 | //
168 | 
169 | struct checksum_t {
170 |     net_t<uint16_t> value;
171 | 
172 |     // Checksum of an empty buffer.
173 |     static const checksum_t ZERO;
174 | 
175 |     // Initializes the checksum to zero (checksum of an empty buffer).
176 |     inline checksum_t(void)
177 |     {
178 |         this->value.net = 0;
179 |     }
180 | 
181 |     // Computes the Internet Checksum of the given buffer.
182 |     //
183 |     // The Internet Checksum is the 16 bits ones' complement of the ones'
184 |     // complement sum of all 16 bits words in the given buffer.
185 |     //
186 |     // See [1] for the complete Internet checksum specification.
187 |     //
188 |     // The buffer is expected to be given in network byte order.
189 |     // The returned 16 bits checksum will be in network byte order.
190 |     //
191 |     // [1]: http://tools.ietf.org/html/rfc1071
192 |     inline checksum_t(const void *data, size_t size)
193 |     {
194 |         // The checksum is the ones' complement (i.e. binary not) of the 16 bits
195 |         // ones' complement sum of every pair of bytes (16 bits).
196 | 
197 |         this->value.net = ~ _ones_complement_sum(data, size);
198 |     }
199 | 
200 |     // Computes the Internet Checksum of the already computed ones' complement
201 |     // sum.
202 |     inline checksum_t(partial_sum_t partial_sum)
203 |     {
204 |         this->value.net = ~ partial_sum.sum;
205 |     }
206 | 
207 |     // Returns 'true' if the value of the checksum is zero.
208 |     //
209 |     // The Internet Checksum of an IPv4 datagram or a TCP segment has a zero
210 |     // value when valid.
211 |     inline bool is_valid(void)
212 |     {
213 |         return this->value.net == 0;
214 |     }
215 | } __attribute__ ((__packed__));
216 | 
217 | // -----------------------------------------------------------------------------
218 | 
219 | static inline uint16_t _swap_bytes(uint16_t bytes)
220 | {
221 |     return (bytes << 8) | (bytes >> 8);
222 | }
223 | 
224 | } } /* namespace rusty::net */
225 | 
226 | #endif /* __RUSTY_NET_CHECKSUM_HPP__ */
--------------------------------------------------------------------------------
/net/endian.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // Provides a template for type-safe network byte order types.
3 | //
4 | // Copyright 2015 Raphael Javaux
5 | // University of Liege.
6 | //
7 | // This program is free software: you can redistribute it and/or modify
8 | // it under the terms of the GNU General Public License as published by
9 | // the Free Software Foundation, either version 3 of the License, or
10 | // (at your option) any later version.
11 | //
12 | // This program is distributed in the hope that it will be useful,
13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of
14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 | // GNU General Public License for more details.
16 | //
17 | // You should have received a copy of the GNU General Public License
18 | // along with this program. If not, see <http://www.gnu.org/licenses/>.
19 | //
20 | 
21 | #ifndef __RUSTY_NET_ENDIAN_HPP__
22 | #define __RUSTY_NET_ENDIAN_HPP__
23 | 
24 | #include <cstdint>     // uint8_t, uint16_t, uint32_t
25 | #include <cstddef>     // size_t
26 | #include <functional>  // equal_to, hash
27 | #include <utility>     // swap()
28 | 
29 | #include <arpa/inet.h> // htonl(), htons(), ntohl(), ntohs()
30 | #include <endian.h>    // __BIG_ENDIAN, __BYTE_ORDER, __LITTLE_ENDIAN
31 | 
32 | using namespace std;
33 | 
34 | namespace rusty {
35 | namespace net {
36 | 
37 | // Provides two generic functions to change the endianness of a data-type.
38 | //
39 | // The functions can be specialized for various types.
40 | 
41 | template <typename T>
42 | inline T to_network(T host);
43 | 
44 | template <typename T>
45 | inline T to_host(T net);
46 | 
47 | // Contains a value in network byte order.
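//
// A minimal usage sketch (port value chosen for illustration; see the
// constructors and accessors described below):
//
//     net_t<uint16_t> port = 8080;      // stored as the bytes 1F 90
//     uint16_t wire = port.net;         // network byte order, for headers
//     uint16_t host_val = port.host();  // back to host order: 8080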
48 | // 49 | // Use 'net_t::net_t()' to construct a 'net_t' value from a host byte order 50 | // value and 'net_t::from_net()' to construct a 'net_t' value from a network 51 | // byte order value. 52 | // 53 | // Use 'net_t::net' to get the network byte order value and 54 | // 'net_t::host()' to get the host byte order value. 55 | template 56 | struct net_t { 57 | // 58 | // Member types 59 | // 60 | 61 | typedef net_t this_t; 62 | 63 | // 64 | // Fields 65 | // 66 | 67 | // Value in network byte order. 68 | host_t net; 69 | 70 | // Initializes to an undefined value. 71 | inline net_t(void) 72 | { 73 | } 74 | 75 | // Initializes with an host byte order value. 76 | inline net_t(host_t host) 77 | { 78 | net = to_network(host); 79 | } 80 | 81 | inline host_t host(void) const 82 | { 83 | return to_host(net); 84 | } 85 | 86 | // Contructs a network byte order value from a value which is already in 87 | // network byte order. 88 | static inline net_t from_net(host_t _net) 89 | { 90 | net_t val; 91 | val.net = _net; 92 | return val; 93 | } 94 | 95 | inline net_t& operator=(this_t other) 96 | { 97 | net = other.net; 98 | return *this; 99 | } 100 | 101 | friend inline bool operator==(this_t a, this_t b) 102 | { 103 | return a.net == b.net; 104 | } 105 | 106 | friend inline bool operator!=(this_t a, this_t b) 107 | { 108 | return a.net != b.net; 109 | } 110 | 111 | friend inline bool operator==(this_t a, host_t b) 112 | { 113 | return a.host() == b; 114 | } 115 | 116 | friend inline bool operator!=(this_t a, host_t b) 117 | { 118 | return a.host() != b; 119 | } 120 | 121 | friend inline bool operator==(host_t a, this_t b) 122 | { 123 | return a == b.host(); 124 | } 125 | 126 | friend inline bool operator!=(host_t a, this_t b) 127 | { 128 | return a != b.host(); 129 | } 130 | 131 | friend inline this_t operator+(this_t a, this_t b) 132 | { 133 | return net_t(a.host() + b.host()); 134 | } 135 | 136 | friend inline this_t operator+(this_t a, host_t b) 137 | { 138 | return net_t(a.host() + b); 139 | } 140 | 141 | friend inline this_t operator+(host_t a, this_t b) 142 | { 143 | return net_t(a + b.host()); 144 | } 145 | 146 | friend inline this_t operator-(this_t a, this_t b) 147 | { 148 | return net_t(a.host() - b.host()); 149 | } 150 | 151 | friend inline this_t operator-(this_t a, host_t b) 152 | { 153 | return net_t(a.host() - b); 154 | } 155 | 156 | friend inline this_t operator-(host_t a, this_t b) 157 | { 158 | return net_t(a - b.host()); 159 | } 160 | } __attribute__ ((__packed__)); 161 | 162 | // 163 | // Specialized 'to_network()' and 'to_host()' instances for 'uint16_t' and 164 | // 'uint32_t'. 165 | // 166 | 167 | template <> 168 | inline uint16_t to_network(uint16_t host) 169 | { 170 | return htons(host); 171 | } 172 | 173 | 174 | template <> 175 | inline uint16_t to_host(uint16_t net) 176 | { 177 | return ntohs(net); 178 | } 179 | 180 | template <> 181 | inline uint32_t to_network(uint32_t host) 182 | { 183 | return htonl(host); 184 | } 185 | 186 | template <> 187 | inline uint32_t to_host(uint32_t net) 188 | { 189 | return ntohl(net); 190 | } 191 | 192 | // 193 | // Generic 'to_network()' and 'to_host()' instances. 194 | // 195 | 196 | // Reverse the bytes in the value. 
197 | template 198 | static inline T _change_endian(T value); 199 | 200 | template 201 | inline T to_network(T host) 202 | { 203 | return _change_endian(host); 204 | } 205 | 206 | template 207 | inline T to_host(T net) 208 | { 209 | return _change_endian(net); 210 | } 211 | 212 | template 213 | static inline T _change_endian(T value) 214 | { 215 | #if __BYTE_ORDER == __LITTLE_ENDIAN 216 | uint8_t *value_bytes = (uint8_t *) &value; 217 | 218 | for (int i = 0; i < sizeof (T) / 2; i++) { 219 | swap( 220 | value_bytes[i], value_bytes[sizeof (T) - 1 - i] 221 | ); 222 | } 223 | 224 | return value; 225 | #elif __BYTE_ORDER == __BIG_ENDIAN 226 | return value; 227 | #else 228 | #error "Please set __BYTE_ORDER in " 229 | #endif 230 | } 231 | 232 | } } /* namespace rusty::net */ 233 | 234 | // 235 | // 'std::equal_to<>' and 'std::hash<>' instances for 'net_t<>'. 236 | // 237 | 238 | namespace std { 239 | 240 | using namespace rusty::net; 241 | 242 | template <> 243 | template 244 | struct equal_to> { 245 | inline bool operator()(const net_t& a, const net_t& b) const 246 | { 247 | return a == b; 248 | } 249 | }; 250 | 251 | template <> 252 | template 253 | struct hash> { 254 | inline size_t operator()(const net_t &value) const 255 | { 256 | return hash()(value.net); 257 | } 258 | }; 259 | 260 | } /* namespace std */ 261 | 262 | #endif /* __RUSTY_NET_ENDIAN_HPP__ */ 263 | -------------------------------------------------------------------------------- /net/ethernet.hpp: -------------------------------------------------------------------------------- 1 | // 2 | // Receives, processes and sends Ethernet frames. 3 | // 4 | // Copyright 2015 Raphael Javaux 5 | // University of Liege. 6 | // 7 | // This program is free software: you can redistribute it and/or modify 8 | // it under the terms of the GNU General Public License as published by 9 | // the Free Software Foundation, either version 3 of the License, or 10 | // (at your option) any later version. 11 | // 12 | // This program is distributed in the hope that it will be useful, 13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | // GNU General Public License for more details. 16 | // 17 | // You should have received a copy of the GNU General Public License 18 | // along with this program. If not, see . 19 | // 20 | 21 | #ifndef __RUSTY_NET_ETHERNET_HPP__ 22 | #define __RUSTY_NET_ETHERNET_HPP__ 23 | 24 | #include 25 | #include 26 | #include 27 | 28 | #include // ether_addr, ETHERTYPE_* 29 | #include // ether_ntoa(), ether_addr 30 | 31 | #include "net/arp.hpp" // arp_t 32 | #include "net/endian.hpp" // net_t 33 | #include "net/ipv4.hpp" // ipv4_t 34 | #include "util/macros.hpp" // RUSTY_*, COLOR_* 35 | 36 | namespace rusty { 37 | namespace net { 38 | 39 | #define ETH_COLOR COLOR_RED 40 | #define ETH_DEBUG(MSG, ...) \ 41 | RUSTY_DEBUG("ETH", ETH_COLOR, MSG, ##__VA_ARGS__) 42 | #define ETH_ERROR(MSG, ...) \ 43 | RUSTY_ERROR("ETH", ETH_COLOR, MSG, ##__VA_ARGS__) 44 | 45 | // *_NET constants are network byte order constants. 46 | static const net_t ETHERTYPE_ARP_NET = ETHERTYPE_ARP; 47 | static const net_t ETHERTYPE_IP_NET = ETHERTYPE_IP; 48 | 49 | // Ethernet layer able to process frames from and to the specified physical 50 | // 'phys_var_t' layer. 51 | template > 52 | struct ethernet_t { 53 | // 54 | // Member types 55 | // 56 | 57 | // Redefines 'phys_var_t' as 'phys_t' so it can be accessible as a member 58 | // type. 
59 | typedef phys_var_t phys_t; 60 | 61 | typedef ethernet_t this_t; 62 | 63 | typedef typename phys_t::clock_t clock_t; 64 | typedef typename phys_t::cursor_t cursor_t; 65 | typedef typename phys_t::timer_manager_t timer_manager_t; 66 | 67 | // Ethernet address. 68 | struct addr_t { 69 | uint8_t value[ETH_ALEN]; 70 | 71 | inline addr_t &operator=(addr_t other) 72 | { 73 | memcpy(&value, &other.value, sizeof value); 74 | return *this; 75 | } 76 | 77 | friend inline bool operator==(addr_t a, addr_t b) 78 | { 79 | return !(a != b); 80 | } 81 | 82 | friend inline bool operator!=(addr_t a, addr_t b) 83 | { 84 | return memcmp(&a, &b, sizeof (addr_t)); 85 | } 86 | 87 | static net_t from_ether_addr(struct ether_addr *ether_addr) 88 | { 89 | net_t addr; 90 | memcpy(addr.net.value, ether_addr->ether_addr_octet, ETH_ALEN); 91 | return addr; 92 | } 93 | 94 | // Converts the Ethernet address to the standard hex-digits-and-colons 95 | // notation into a statically allocated buffer. 96 | // 97 | // This method is typically called for debugging messages. 98 | static char *to_alpha(net_t addr) 99 | { 100 | return ether_ntoa((struct ether_addr *) &addr); 101 | } 102 | } __attribute__ ((__packed__)); 103 | 104 | struct header_t { 105 | net_t dhost; // Destination Ethernet address. 106 | net_t shost; // Source Ethernet address. 107 | net_t type; // Ether-type. 108 | } __attribute__ ((__packed__)); 109 | 110 | // Upper network layers types. 111 | typedef ipv4_t ipv4_ethernet_t; 112 | typedef arp_t arp_ethernet_ipv4_t; 113 | 114 | // 115 | // Static fields 116 | // 117 | 118 | static constexpr size_t HEADER_SIZE = sizeof (header_t); 119 | 120 | // 'arp_t' requires the following static fields: 121 | static constexpr uint16_t ARP_TYPE = ARPHRD_ETHER; 122 | static constexpr size_t ADDR_LEN = ETH_ALEN; 123 | 124 | static const net_t BROADCAST_ADDR; 125 | 126 | // 127 | // Fields 128 | // 129 | 130 | net_t addr; 131 | 132 | // Physical layer instance. 133 | phys_t *phys; 134 | 135 | // Upper network layer instances. 136 | arp_ethernet_ipv4_t arp; 137 | ipv4_ethernet_t ipv4; 138 | 139 | // Maximum payload size. Doesn't change after intialization. 140 | size_t max_payload_size; 141 | 142 | // 143 | // Methods 144 | // 145 | 146 | // Creates an Ethernet environment without initializing it. 147 | // 148 | // One must call 'init()' before using any other method. 149 | ethernet_t(alloc_t _alloc = alloc_t()) : arp(_alloc), ipv4(_alloc) 150 | { 151 | } 152 | 153 | // Creates an Ethernet environment for the given physical layer 154 | // instance, Ethernet address and IPv4 address. 155 | // 156 | // Does the same thing as creating the environment with 'ethernet_t()' and 157 | // then calling 'init()'. 158 | ethernet_t( 159 | phys_t *_phys, timer_manager_t *_timers, net_t _addr, 160 | net_t ipv4_addr, 161 | vector static_arp_entries 162 | = vector(), 163 | alloc_t _alloc = alloc_t() 164 | ) : phys(_phys), addr(_addr), arp(_alloc), ipv4(_alloc) 165 | { 166 | max_payload_size = _max_payload_size(); 167 | arp.init(this, _timers, &ipv4, static_arp_entries); 168 | ipv4.init(this, &arp, ipv4_addr, _timers); 169 | } 170 | 171 | // Initializes an Ethernet environment for the given physical layer 172 | // instance, Ethernet address and IPv4 address. 
173 | void init( 174 | phys_t *_phys, timer_manager_t *_timers, net_t _addr, 175 | net_t ipv4_addr, 176 | vector static_arp_entries 177 | = vector() 178 | ) 179 | { 180 | phys = _phys; 181 | max_payload_size = _max_payload_size(); 182 | addr = _addr; 183 | arp.init(this, _timers, &ipv4, static_arp_entries); 184 | ipv4.init(this, &arp, ipv4_addr, _timers); 185 | } 186 | 187 | // Processes an Ethernet frame. The cursor must begin at the Ethernet layer 188 | // and must end at the end of the packet payload. 189 | // 190 | // This method is typically called by the physical layer when it receives 191 | // a packet. 192 | void receive_frame(cursor_t cursor) 193 | { 194 | if (UNLIKELY(cursor.size() < HEADER_SIZE)) { 195 | ETH_ERROR("Frame ignored: too small to hold an Ethernet header"); 196 | return; 197 | } 198 | 199 | cursor.template read_with( 200 | [this](const header_t *hdr, cursor_t payload) { 201 | #define IGNORE_FRAME(WHY, ...) \ 202 | do { \ 203 | ETH_ERROR( \ 204 | "Frame from %s ignored: " WHY, \ 205 | addr_t::to_alpha(hdr->shost), ##__VA_ARGS__ \ 206 | ); \ 207 | return; \ 208 | } while (0) 209 | 210 | if (UNLIKELY(hdr->dhost != addr && hdr->dhost != BROADCAST_ADDR)) { 211 | IGNORE_FRAME( 212 | "bad recipient (%s)", addr_t::to_alpha(hdr->dhost) 213 | ); 214 | } 215 | 216 | #define RECEIVE_FRAME() \ 217 | do { \ 218 | ETH_DEBUG( \ 219 | "Receives an Ethernet frame from %s", \ 220 | addr_t::to_alpha(hdr->shost) \ 221 | ); \ 222 | } while (0) 223 | 224 | if (hdr->type == ETHERTYPE_ARP_NET) { 225 | RECEIVE_FRAME(); 226 | arp.receive_message(payload); 227 | } else if (hdr->type == ETHERTYPE_IP_NET) { 228 | RECEIVE_FRAME(); 229 | ipv4.receive_datagram(payload); 230 | } else { 231 | IGNORE_FRAME( 232 | "unknown Ethernet type (%" PRIu16 ")", hdr->type.host() 233 | ); 234 | } 235 | 236 | #undef RECEIVE_FRAME 237 | #undef IGNORE_FRAME 238 | }); 239 | } 240 | 241 | // Creates an Ethernet frame with the given destination and Ethernet type, 242 | // and writes its payload with the given 'payload_writer'. The frame is then 243 | // transmitted to physical layer. 244 | void send_payload( 245 | net_t dst, net_t ether_type, 246 | size_t payload_size, function payload_writer 247 | ) 248 | { 249 | assert(payload_size >= 0 && payload_size <= max_payload_size); 250 | 251 | size_t frame_size = HEADER_SIZE + payload_size; 252 | 253 | ETH_DEBUG( 254 | "Sends a %zu bytes ethernet frame to %s with type 0x%x", 255 | frame_size, addr_t::to_alpha(dst), ether_type.host() 256 | ); 257 | 258 | this->phys->send_packet( 259 | frame_size, 260 | [this, dst, ether_type, &payload_writer](cursor_t cursor) { 261 | cursor = _write_header(cursor, dst, ether_type); 262 | payload_writer(cursor); 263 | }); 264 | } 265 | 266 | // Equivalent to 'send_payload()' with 'ether_type' equals to 267 | // 'ETHERTYPE_ARP_NET'. 268 | // 269 | // This method is typically called by the ARP instance when it wants to send 270 | // a message. 271 | inline void send_arp_payload( 272 | net_t dst, size_t payload_size, 273 | function payload_writer 274 | ) 275 | { 276 | send_payload(dst, ETHERTYPE_ARP_NET, payload_size, payload_writer); 277 | } 278 | 279 | // Equivalent to 'send_payload()' with 'ether_type' equals to 280 | // 'ETHERTYPE_IP_NET'. 281 | // 282 | // This method is typically called by the IPv4 instance when it wants to 283 | // send a packet. 
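    //
    // A call sketch (argument names borrowed from 'ipv4_t::send_payload()' in
    // 'net/ipv4.hpp' below):
    //
    //     data_link->send_ip_payload(*data_link_dst, datagram_size, writer);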
284 | inline void send_ip_payload( 285 | net_t dst, size_t payload_size, 286 | function payload_writer 287 | ) 288 | { 289 | send_payload(dst, ETHERTYPE_IP_NET, payload_size, payload_writer); 290 | } 291 | 292 | private: 293 | 294 | // Writes the Ethernet header starting at the given buffer cursor. 295 | cursor_t _write_header( 296 | cursor_t cursor, net_t dst, net_t ether_type 297 | ) 298 | { 299 | return cursor.template write_with( 300 | [this, dst, ether_type, cursor](header_t *hdr) { 301 | hdr->dhost = dst; 302 | hdr->shost = addr; 303 | hdr->type = ether_type; 304 | }); 305 | } 306 | 307 | size_t _max_payload_size(void) 308 | { 309 | return this->phys->max_packet_size() - HEADER_SIZE; 310 | } 311 | }; 312 | 313 | template 314 | const net_t::addr_t> 315 | ethernet_t::BROADCAST_ADDR = 316 | { { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF } }; 317 | 318 | #undef ETH_COLOR 319 | #undef ETH_DEBUG 320 | #undef ETH_ERROR 321 | 322 | } } /* namespace rusty::net */ 323 | 324 | #endif /* __RUSTY_NET_ETHERNET_HPP__ */ 325 | -------------------------------------------------------------------------------- /net/ipv4.hpp: -------------------------------------------------------------------------------- 1 | // 2 | // Receives, processes and sends IPv4 datagrams. 3 | // 4 | // Copyright 2015 Raphael Javaux 5 | // University of Liege. 6 | // 7 | // This program is free software: you can redistribute it and/or modify 8 | // it under the terms of the GNU General Public License as published by 9 | // the Free Software Foundation, either version 3 of the License, or 10 | // (at your option) any later version. 11 | // 12 | // This program is distributed in the hope that it will be useful, 13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | // GNU General Public License for more details. 16 | // 17 | // You should have received a copy of the GNU General Public License 18 | // along with this program. If not, see . 19 | // 20 | 21 | #ifndef __RUSTY_NET_IPV4_HPP__ 22 | #define __RUSTY_NET_IPV4_HPP__ 23 | 24 | #include // min() 25 | #include 26 | #include // equal_to, hash 27 | #include 28 | 29 | #include // inet_ntoa() 30 | #include // ETHERTYPE_IP 31 | #include // in_addr, IPPROTO_TCP 32 | #include // IPDEFTTL, IPVERSION, IP_MF, IPPROTO_TCP, 33 | // IPTOS_CLASS_DEFAULT 34 | 35 | #include "net/checksum.hpp" // checksum_t, partial_sum_t 36 | #include "net/tcp.hpp" // tcp_t 37 | #include "util/macros.hpp" // RUSTY_*, COLOR_* 38 | 39 | using namespace std; 40 | 41 | namespace rusty { 42 | namespace net { 43 | 44 | #define IPV4_COLOR COLOR_CYN 45 | #define IPV4_DEBUG(MSG, ...) \ 46 | RUSTY_DEBUG("IPV4", IPV4_COLOR, MSG, ##__VA_ARGS__) 47 | #define IPV4_ERROR(MSG, ...) \ 48 | RUSTY_ERROR("IPV4", IPV4_COLOR, MSG, ##__VA_ARGS__) 49 | 50 | struct ipv4_addr_t { 51 | uint32_t value; 52 | 53 | inline ipv4_addr_t &operator=(ipv4_addr_t other) 54 | { 55 | value = other.value; 56 | return *this; 57 | } 58 | 59 | friend inline bool operator==(ipv4_addr_t a, ipv4_addr_t b) 60 | { 61 | return a.value == b.value; 62 | } 63 | 64 | friend inline bool operator!=(ipv4_addr_t a, ipv4_addr_t b) 65 | { 66 | return a.value != b.value; 67 | } 68 | 69 | // Converts the IPv4 address to a string in IPv4 dotted-decimal notation 70 | // into a statically allocated buffer. 71 | // 72 | // This method is typically called for debugging messages. 
73 | static char *to_alpha(net_t addr) 74 | { 75 | return inet_ntoa(ipv4_addr_t::to_in_addr(addr)); 76 | } 77 | 78 | static net_t from_in_addr(struct in_addr in_addr) 79 | { 80 | net_t addr; 81 | addr.net.value = in_addr.s_addr; 82 | return addr; 83 | } 84 | 85 | static struct in_addr to_in_addr(net_t addr) 86 | { 87 | struct in_addr in_addr; 88 | in_addr.s_addr = addr.net.value; 89 | return in_addr; 90 | } 91 | } __attribute__ ((__packed__)); 92 | 93 | // IPv4 network layer able to process datagram from and to the specified 94 | // data-link 'data_link_var_t' layer. 95 | template > 96 | struct ipv4_t { 97 | // 98 | // Member types 99 | // 100 | 101 | // Redefines 'data_link_var_t' as 'data_link_t' so it can be accessible as a 102 | // member type. 103 | typedef data_link_var_t data_link_t; 104 | 105 | typedef ipv4_t this_t; 106 | 107 | typedef ipv4_addr_t addr_t; 108 | 109 | typedef typename data_link_t::clock_t clock_t; 110 | typedef typename data_link_t::cursor_t cursor_t; 111 | typedef typename data_link_t::timer_manager_t timer_manager_t; 112 | 113 | struct header_t { 114 | #if __BYTE_ORDER == __LITTLE_ENDIAN 115 | uint8_t ihl:4; 116 | uint8_t version:4; 117 | #elif __BYTE_ORDER == __BIG_ENDIAN 118 | uint8_t version:4; 119 | uint8_t ihl:4; 120 | #else 121 | #error "Please fix __BYTE_ORDER in " 122 | #endif 123 | 124 | uint8_t tos; 125 | net_t tot_len; 126 | uint16_t id; 127 | net_t frag_off; 128 | uint8_t ttl; 129 | uint8_t protocol; 130 | checksum_t check; 131 | net_t saddr; 132 | net_t daddr; 133 | } __attribute__ ((__packed__)); 134 | 135 | // Lower layer address type. 136 | typedef typename data_link_t::addr_t data_link_addr_t; 137 | 138 | // Upper layer protocol type. 139 | typedef tcp_t tcp_ipv4_t; 140 | 141 | // 142 | // Static fields 143 | // 144 | 145 | // 'arp_t' requires the following static fields: 146 | static constexpr uint16_t ARP_TYPE = ETHERTYPE_IP; 147 | static constexpr size_t ADDR_LEN = 4; 148 | 149 | static constexpr size_t HEADER_SIZE = sizeof (header_t); 150 | 151 | // Header size in 32 bit words. 152 | static constexpr size_t HEADER_LEN = HEADER_SIZE / sizeof (uint32_t); 153 | 154 | // 155 | // Fields 156 | // 157 | 158 | // Lower network layer instances. 159 | data_link_t *data_link; 160 | arp_t *arp; 161 | 162 | // Upper protocol instances 163 | tcp_ipv4_t tcp; 164 | 165 | // Instance's IPv4 address 166 | net_t addr; 167 | 168 | // Maximum payload size. Doesn't change after intialization. 169 | size_t max_payload_size; 170 | 171 | // The current identification number used to indentify egressed datagrams. 172 | // 173 | // This counter is incremented by one each time a datagram is sent. 174 | uint16_t current_datagram_id = 0; 175 | 176 | // 177 | // Methods 178 | // 179 | 180 | // Creates an IPv4 environment without initializing it. 181 | // 182 | // One must call 'init()' before using any other method. 183 | ipv4_t(alloc_t _alloc = alloc_t()) : tcp(_alloc) 184 | { 185 | } 186 | 187 | // Creates an IPv4 environment for the given data-link layer instance and 188 | // IPv4 address. 189 | // 190 | // Does the same thing as creating the environment with 'ipv4_t()' and then 191 | // calling 'init()'. 
192 |     ipv4_t(
193 |         data_link_t *_data_link, arp_t<data_link_t, this_t, alloc_t> *_arp,
194 |         net_t<addr_t> _addr, timer_manager_t *_timers,
195 |         alloc_t _alloc = alloc_t()
196 |     ) : data_link(_data_link), arp(_arp), addr(_addr), tcp(_alloc)
197 |     {
198 |         // TCP must be initialized after 'max_payload_size'.
199 |         max_payload_size = this->_max_payload_size();
200 |         tcp.init(this, _timers);
201 |     }
202 | 
203 |     // Initializes an IPv4 environment for the given data-link layer instance
204 |     // and IPv4 address.
205 |     void init(
206 |         data_link_t *_data_link, arp_t<data_link_t, this_t, alloc_t> *_arp,
207 |         net_t<addr_t> _addr, timer_manager_t *_timers
208 |     )
209 |     {
210 |         data_link = _data_link;
211 |         arp = _arp;
212 |         addr = _addr;
213 |         max_payload_size = this->_max_payload_size();
214 |         tcp.init(this, _timers);
215 |     }
216 | 
217 |     // Processes an IPv4 datagram which starts at the given cursor (data-link
218 |     // layer payload without headers).
219 |     void receive_datagram(cursor_t cursor)
220 |     {
221 |         size_t cursor_size = cursor.size();
222 | 
223 |         if (UNLIKELY(cursor_size < HEADER_SIZE)) {
224 |             IPV4_ERROR("Datagram ignored: too small to hold an IPv4 header");
225 |             return;
226 |         }
227 | 
228 |         cursor.template read_with(
229 |             [this, cursor_size](const header_t *hdr, cursor_t payload) {
230 | #define IGNORE_DATAGRAM(WHY, ...)                                             \
231 |     do {                                                                      \
232 |         IPV4_ERROR(                                                           \
233 |             "Datagram from %s ignored: " WHY,                                 \
234 |             addr_t::to_alpha(hdr->saddr), ##__VA_ARGS__                       \
235 |         );                                                                    \
236 |         return;                                                               \
237 |     } while (0)
238 | 
239 |                 //
240 |                 // Checks datagram validity.
241 |                 //
242 | 
243 |                 if (UNLIKELY(hdr->version != IPVERSION)) {
244 |                     IGNORE_DATAGRAM(
245 |                         "invalid IP version (received %u, expected %u)",
246 |                         (unsigned int) hdr->version, IPVERSION
247 |                     );
248 |                 }
249 | 
250 |                 if (hdr->ihl != HEADER_LEN)
251 |                     IGNORE_DATAGRAM("options are not supported");
252 | 
253 |                 size_t header_size = hdr->ihl * sizeof (uint32_t),
254 |                        total_size  = hdr->tot_len.host();
255 | 
256 |                 if (UNLIKELY(total_size < header_size)) {
257 |                     IGNORE_DATAGRAM(
258 |                         "total size (%zu) is less than header size (%zu)",
259 |                         total_size, header_size
260 |                     );
261 |                 }
262 | 
263 |                 if (UNLIKELY(cursor_size < total_size)) {
264 |                     IGNORE_DATAGRAM(
265 |                         "datagram size (%zu) is less than total size (%zu)",
266 |                         cursor_size, total_size
267 |                     );
268 |                 }
269 | 
270 |                 uint16_t frag_off_host = hdr->frag_off.host();
271 |                 if (UNLIKELY(
272 |                        frag_off_host & IP_MF               // More fragments.
273 |                     || (frag_off_host & IP_OFFMASK) > 0    // Not the first fragment.
274 |                 ))
275 |                     IGNORE_DATAGRAM("fragmented datagrams are not supported");
276 | 
277 |                 if (UNLIKELY(hdr->daddr != addr))
278 |                     IGNORE_DATAGRAM("bad recipient");
279 | 
280 |                 if (UNLIKELY(!checksum_t(hdr, HEADER_SIZE).is_valid()))
281 |                     IGNORE_DATAGRAM("invalid checksum");
282 | 
283 |                 //
284 |                 // Processes the datagram.
285 |                 //
286 | 
287 |                 // The Ethernet frame could contain a small padding at its end.
288 |                 payload = payload.take(total_size - header_size);
289 | 
290 |                 if (hdr->protocol == IPPROTO_TCP) {
291 |                     IPV4_DEBUG(
292 |                         "Receives an IPv4 datagram from %s",
293 |                         addr_t::to_alpha(hdr->saddr)
294 |                     );
295 |                     this->tcp.receive_segment(hdr->saddr, payload);
296 |                 } else {
297 |                     IGNORE_DATAGRAM(
298 |                         "unknown IPv4 protocol (%u)", (unsigned int) hdr->protocol
299 |                     );
300 |                 }
301 | 
302 | #undef IGNORE_DATAGRAM
303 |         });
304 |     }
305 | 
306 |     // Creates and pushes an IPv4 datagram with its payload to the data-link
307 |     // layer (L2).
308 | // 309 | // The 'payload_writer' execution could be delayed until after this function 310 | // returns, if an ARP transaction is required to translate the IPv4 address 311 | // into its corresponding data-link address. One should take care not to use 312 | // memory which could be deallocated before 'payload_writer' executes. 313 | // 314 | // Returns 'true' if the 'payload_writer' execution has not been delayed. 315 | bool send_payload( 316 | net_t<addr_t> dst, uint8_t protocol, 317 | size_t payload_size, function<void(cursor_t)> payload_writer 318 | ) 319 | { 320 | assert(payload_size <= max_payload_size); // 'payload_size' is unsigned. 321 | 322 | return this->arp->with_data_link_addr( 323 | dst, [this, dst, protocol, payload_size, payload_writer]( 324 | const net_t<data_link_addr_t> *data_link_dst 325 | ) { 326 | if (data_link_dst == nullptr) { 327 | IPV4_ERROR("Unreachable address: %s", addr_t::to_alpha(dst)); 328 | return; 329 | } 330 | 331 | size_t datagram_size = HEADER_SIZE + payload_size; 332 | 333 | IPV4_DEBUG( 334 | "Sends a %zu-byte IPv4 datagram to %s with protocol " 335 | "%" PRIu8, datagram_size, addr_t::to_alpha(dst), protocol 336 | ); 337 | 338 | // No locking is required: each 'ipv4_t' instance is owned by a 339 | // single core (share-nothing architecture). 340 | uint16_t datagram_id = current_datagram_id++; 341 | 342 | this->data_link->send_ip_payload( 343 | *data_link_dst, datagram_size, 344 | [this, dst, payload_writer, protocol, datagram_size, datagram_id] 345 | (cursor_t cursor) { 346 | cursor = _write_header( 347 | cursor, datagram_size, datagram_id, protocol, dst 348 | ); 349 | payload_writer(cursor); 350 | }); 351 | }); 352 | } 353 | 354 | // Equivalent to 'send_payload()' with 'protocol' equal to 'IPPROTO_TCP'. 355 | // 356 | // This method is typically called by the TCP instance when it wants to 357 | // send a TCP segment. 358 | inline void send_tcp_payload( 359 | net_t<addr_t> dst, size_t payload_size, 360 | function<void(cursor_t)> payload_writer 361 | ) 362 | { 363 | send_payload(dst, IPPROTO_TCP, payload_size, payload_writer); 364 | } 365 | 366 | // 367 | // Static methods 368 | // 369 | 370 | // Computes the partial (check)sum of the pseudo TCP header. 371 | // 372 | // The TCP segment checksum is computed on the TCP segment and on a pseudo 373 | // header. This pseudo header is only used to compute the checksum and 374 | // is never transmitted. 375 | // 376 | // The pseudo header of a TCP segment transmitted over IPv4 is the 377 | // following: 378 | // 379 | // +--------------------------------------------+ 380 | // | Source network address | 381 | // +--------------------------------------------+ 382 | // | Destination network address | 383 | // +----------+----------+----------------------+ 384 | // | zero | Protocol | TCP segment size | 385 | // +----------+----------+----------------------+ 386 | // 387 | // This method will be called by the TCP transport layer. Its implementation 388 | // varies depending on the network layer protocol, which explains why it's 389 | // not defined in 'tcp_t'. 390 | static partial_sum_t tcp_pseudo_header_sum( 391 | net_t<addr_t> saddr, net_t<addr_t> daddr, 392 | net_t<uint16_t> seg_size 393 | ) 394 | { 395 | static constexpr size_t PSEUDO_HEADER_SIZE = 12; 396 | char buffer[PSEUDO_HEADER_SIZE]; 397 | 398 | mempcpy(&buffer[0], &saddr, sizeof saddr); 399 | mempcpy(&buffer[4], &daddr, sizeof daddr); 400 | buffer[8] = 0; 401 | buffer[9] = IPPROTO_TCP; 402 | mempcpy(&buffer[10], &seg_size, sizeof seg_size); 403 | 404 | return partial_sum_t(buffer, PSEUDO_HEADER_SIZE); 405 | } 406 | 407 | private: 408 | 409 | // Writes the IPv4 header starting at the given buffer cursor.
410 | cursor_t _write_header( 411 | cursor_t cursor, size_t datagram_size, uint16_t datagram_id, 412 | uint8_t protocol, net_t<addr_t> dst 413 | ) 414 | { 415 | static const net_t<uint16_t> FRAG_OFF_NET = IP_DF; // Don't fragment. 416 | 417 | return cursor.template write_with<header_t>( 418 | [this, datagram_size, datagram_id, protocol, dst](header_t *hdr) { 419 | hdr->version = IPVERSION; 420 | hdr->ihl = HEADER_LEN; 421 | hdr->tos = IPTOS_CLASS_DEFAULT; 422 | hdr->tot_len = datagram_size; 423 | hdr->id = datagram_id; 424 | hdr->frag_off = FRAG_OFF_NET; 425 | hdr->ttl = IPDEFTTL; 426 | hdr->protocol = protocol; 427 | hdr->check = checksum_t::ZERO; 428 | hdr->saddr = this->addr; 429 | hdr->daddr = dst; 430 | 431 | hdr->check = checksum_t(hdr, HEADER_SIZE); 432 | }); 433 | } 434 | 435 | size_t _max_payload_size(void) 436 | { 437 | // IPv4 datagrams can't be larger than 65,535 bytes. 438 | return min(this->data_link->max_payload_size, (size_t) 65535) 439 | - HEADER_SIZE; 440 | } 441 | }; 442 | 443 | #undef IPV4_COLOR 444 | #undef IPV4_DEBUG 445 | #undef IPV4_ERROR 446 | 447 | } } /* namespace rusty::net */ 448 | 449 | namespace std { 450 | 451 | // 'std::hash<>' and 'std::equal_to<>' instances are required for IPv4 addresses 452 | // to be used in unordered containers. 453 | 454 | using namespace rusty::net; 455 | 456 | template <> 457 | struct hash<ipv4_addr_t> { 458 | inline size_t operator()(const ipv4_addr_t &addr) const 459 | { 460 | return hash<uint32_t>()(addr.value); 461 | } 462 | }; 463 | 464 | template <> 465 | struct equal_to<ipv4_addr_t> { 466 | inline bool operator()(const ipv4_addr_t &a, const ipv4_addr_t &b) const 467 | { 468 | return a == b; 469 | } 470 | }; 471 | 472 | } /* namespace std */ 473 | 474 | #endif /* __RUSTY_NET_IPV4_HPP__ */ 475 | -------------------------------------------------------------------------------- /tilera-toolchain.cmake: -------------------------------------------------------------------------------- 1 | SET(CMAKE_SYSTEM_NAME Linux) 2 | SET(CMAKE_SYSTEM_VERSION 1) 3 | 4 | SET(CMAKE_C_COMPILER $ENV{TILERA_ROOT}/bin/tile-gcc48) 5 | SET(CMAKE_CXX_COMPILER $ENV{TILERA_ROOT}/bin/tile-g++48) 6 | 7 | SET(CMAKE_FIND_ROOT_PATH $ENV{TILERA_ROOT}/tile) 8 | 9 | MESSAGE("Using ${CMAKE_FIND_ROOT_PATH} as the build root") 10 | 11 | # Doesn't search for programs in the target root 12 | SET(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER) 13 | 14 | # Only searches for libraries and includes in the target root 15 | SET(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY) 16 | SET(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY) 17 | 18 | SET(ENV{PKG_CONFIG_SYSROOT_DIR} $ENV{TILERA_ROOT}/tile) 19 | SET(ENV{PKG_CONFIG_LIBDIR} $ENV{TILERA_ROOT}/tile/usr/lib/pkgconfig:$ENV{TILERA_ROOT}/tile/lib/pkgconfig) 20 | -------------------------------------------------------------------------------- /util/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | include_directories (../) 2 | -------------------------------------------------------------------------------- /util/macros.hpp: -------------------------------------------------------------------------------- 1 | // 2 | // Various pre-processor macros. 3 | // 4 | // Copyright 2015 Raphael Javaux 5 | // University of Liege. 6 | // 7 | // This program is free software: you can redistribute it and/or modify 8 | // it under the terms of the GNU General Public License as published by 9 | // the Free Software Foundation, either version 3 of the License, or 10 | // (at your option) any later version.
11 | // 12 | // This program is distributed in the hope that it will be useful, 13 | // but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | // GNU General Public License for more details. 16 | // 17 | // You should have received a copy of the GNU General Public License 18 | // along with this program. If not, see <http://www.gnu.org/licenses/>. 19 | // 20 | 21 | #ifndef __RUSTY_UTILS_MACROS_HPP__ 22 | #define __RUSTY_UTILS_MACROS_HPP__ 23 | 24 | // 25 | // Branch prediction hints. 26 | // 27 | 28 | #ifdef BRANCH_PREDICT 29 | #define LIKELY(x) __builtin_expect(!!(x), 1) 30 | #define UNLIKELY(x) __builtin_expect(!!(x), 0) 31 | #else 32 | #define LIKELY(x) (x) 33 | #define UNLIKELY(x) (x) 34 | #endif /* BRANCH_PREDICT */ 35 | 36 | // 37 | // Terminal colors 38 | // 39 | 40 | #define COLOR_RED "\033[31;1m" 41 | #define COLOR_GRN "\033[32;1m" 42 | #define COLOR_YEL "\033[33;1m" 43 | #define COLOR_BLU "\033[34;1m" 44 | #define COLOR_MAG "\033[35;1m" 45 | #define COLOR_CYN "\033[36;1m" 46 | 47 | #define COLOR_BOLD "\033[1m" 48 | 49 | #define COLOR_RESET "\033[0m" 50 | 51 | // 52 | // Logging messages 53 | // 54 | 55 | // There are three levels of log messages: debug, error and die, each with its 56 | // own 'RUSTY_*' macro. 57 | // 58 | // * 'RUSTY_DEBUG()' should be used for informational messages during normal 59 | // operations, such as events. These messages are only displayed when 60 | // neither NDEBUG nor NDEBUGMSG is defined. 61 | // * 'RUSTY_ERROR()' should be used for unexpected but recoverable events, 62 | // such as the reception of an invalid packet. 63 | // * 'RUSTY_DIE()' should be used for unexpected and unrecoverable events, 64 | // such as a failed memory allocation. The macro immediately stops the 65 | // application after displaying the message, by calling 'exit()' with 66 | // 'EXIT_FAILURE' as status code. 67 | // 68 | // Each macro displays the message with the module name and the location of 69 | // the call. Each module has an associated color to make messages easier to 70 | // read. 71 | // 72 | // The message can be formatted as with 'printf', and an arbitrary number of 73 | // arguments can be passed to the macros. 74 | 75 | #ifdef NDEBUG 76 | #define RUSTY_DEBUG(MODULE, COLOR, MSG, ...) 77 | #elif defined(NDEBUGMSG) 78 | #define RUSTY_DEBUG(MODULE, COLOR, MSG, ...) 79 | #else 80 | #define RUSTY_DEBUG(MODULE, COLOR, MSG, ...) \ 81 | do { \ 82 | fprintf( \ 83 | stderr, "%-20s%-20s" MSG, \ 84 | "[" COLOR_GRN "DEBUG" COLOR_RESET "]", \ 85 | "[" COLOR MODULE COLOR_RESET "]", \ 86 | ##__VA_ARGS__ \ 87 | ); \ 88 | fprintf(stderr, " (" __FILE__ ":%d)\n", __LINE__); \ 89 | } while (0) 90 | #endif 91 | 92 | #define RUSTY_ERROR(MODULE, COLOR, MSG, ...) \ 93 | do { \ 94 | fprintf( \ 95 | stderr, "%-20s%-20s" COLOR_BOLD MSG, \ 96 | "[" COLOR_YEL "ERROR" COLOR_RESET "]", \ 97 | "[" COLOR MODULE COLOR_RESET "]", \ 98 | ##__VA_ARGS__ \ 99 | ); \ 100 | fprintf(stderr, " (" __FILE__ ":%d)" COLOR_RESET "\n", __LINE__); \ 101 | } while (0) 102 | 103 | #define RUSTY_DIE(MODULE, COLOR, MSG, ...)
\ 104 | do { \ 105 | fprintf( \ 106 | stderr, "%-20s%-20s" COLOR_BOLD MSG, \ 107 | "[" COLOR_RED "DIE" COLOR_RESET "]", \ 108 | "[" COLOR MODULE COLOR_RESET "]", \ 109 | ##__VA_ARGS__ \ 110 | ); \ 111 | fprintf(stderr, " (" __FILE__ ":%d)" COLOR_RESET "\n", __LINE__); \ 112 | exit(EXIT_FAILURE); \ 113 | } while (0) 114 | 115 | 116 | #endif /* __RUSTY_UTILS_MACROS_HPP__ */ 117 | --------------------------------------------------------------------------------
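Note on the module-level logging macros: 'net/ipv4.hpp' above uses 'IPV4_DEBUG' and 'IPV4_ERROR', and '#undef's them together with 'IPV4_COLOR' near its end, which suggests they are thin per-module wrappers around the 'RUSTY_*' macros of 'util/macros.hpp'. A minimal sketch of such wrappers follows; the module name string and the color choice are assumptions, not taken from the listing.

// Hypothetical per-module wrappers around the 'RUSTY_*' logging macros.
// 'COLOR_CYN' is an arbitrary choice; any 'COLOR_*' constant would work.
#include "util/macros.hpp"

#define IPV4_COLOR COLOR_CYN
#define IPV4_DEBUG(MSG, ...) \
    RUSTY_DEBUG("IPV4", IPV4_COLOR, MSG, ##__VA_ARGS__)
#define IPV4_ERROR(MSG, ...) \
    RUSTY_ERROR("IPV4", IPV4_COLOR, MSG, ##__VA_ARGS__)

// Usage, as in 'receive_datagram()':
//     IPV4_ERROR("Datagram ignored: too small to hold an IPv4 header");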
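For reference, the value computed by 'tcp_pseudo_header_sum()' is the standard Internet checksum building block (RFC 1071): a 16-bit one's-complement sum with end-around carry over the 12-byte pseudo header. The sketch below is self-contained standard C++ and deliberately avoids Rusty's 'partial_sum_t'/'checksum_t' types, which are assumed to implement the same folding internally; the addresses and segment size are illustrative only.

#include <cstdint>
#include <cstdio>
#include <cstring>
#include <arpa/inet.h>

// Sums 16-bit big-endian words with end-around carry (RFC 1071).
static uint16_t ones_complement_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t) ((data[i] << 8) | data[i + 1]);
    if (len & 1)                  // Odd trailing byte, zero-padded on the right.
        sum += (uint32_t) (data[len - 1] << 8);
    while (sum >> 16)             // Fold the carries back into the low 16 bits.
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t) sum;
}

int main(void)
{
    // Builds the 12-byte TCP/IPv4 pseudo header described in 'ipv4.hpp'.
    uint8_t  pseudo[12];
    uint32_t saddr    = htonl(0xC0A80001);   // 192.168.0.1 (example value).
    uint32_t daddr    = htonl(0xC0A80002);   // 192.168.0.2 (example value).
    uint16_t seg_size = htons(20);           // TCP header only, no payload.

    memcpy(&pseudo[0], &saddr, 4);
    memcpy(&pseudo[4], &daddr, 4);
    pseudo[8] = 0;
    pseudo[9] = 6;                           // IPPROTO_TCP.
    memcpy(&pseudo[10], &seg_size, 2);

    // The final TCP checksum is the one's complement of this partial sum
    // combined (with the same folding) with the sum over the TCP segment.
    printf("pseudo-header partial sum: 0x%04x\n",
           ones_complement_sum(pseudo, sizeof pseudo));
    return 0;
}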