├── .gitignore
├── .vscode
│   └── settings.json
├── Keymashed Presentation.pdf
├── README.md
├── bpf
│   ├── README.md
│   ├── bpf.c
│   ├── bpf.o
│   ├── bpf_api.h
│   ├── bpf_elf.h
│   └── setup-tc.sh
├── media
│   ├── Architecture.drawio.png
│   ├── Architecture.drawio.svg
│   ├── BURST 2024 SebMur-19-resized.jpg
│   ├── BURST 2024 SebMur-6-resized.jpg
│   ├── BURST 2024 SebMur-66-cropped.jpg
│   ├── BURST 2024 SebMur-82-cropped.jpg
│   ├── BURST 2024 suspiciously-optiplex shaped box.jpg
│   ├── JPEG Process.drawio.png
│   ├── Macroblock.drawio.svg
│   ├── MacroblockExpanded.drawio.svg
│   ├── YUV420.drawio.svg
│   ├── YUV422.drawio.svg
│   ├── YUV444.drawio.svg
│   ├── Zigzag.drawio.svg
│   ├── dct
│   │   ├── dct_of_block.svg
│   │   ├── gen.py
│   │   ├── jpeg-dct-transform.png
│   │   ├── original_8x8_block.svg
│   │   ├── quantization_matrix.svg
│   │   ├── quantized_dct_block.svg
│   │   └── reconstructed_block.svg
│   ├── keymash-all-effects.mp4
│   ├── rickroll-keymash-compressed-with-audio.mp4
│   └── rickroll-keymashed-packet-loss.mp4
└── rust-userspace
    ├── .gitignore
    ├── Cargo.lock
    ├── Cargo.toml
    ├── README.md
    ├── benches
    │   └── rtp.rs
    ├── first.gif
    ├── rust-toolchain.toml
    └── src
        ├── audio.rs
        ├── bin
        │   ├── dct-pipeline.rs
        │   ├── recv.rs
        │   └── send.rs
        ├── bpf.rs
        ├── lib.rs
        ├── rtp.rs
        ├── video
        │   ├── dct.rs
        │   └── mod.rs
        └── wpm.rs
/.gitignore:
--------------------------------------------------------------------------------
1 | .ccls-cache
2 |
--------------------------------------------------------------------------------
/.vscode/settings.json:
--------------------------------------------------------------------------------
1 | {
2 | "rust-analyzer.linkedProjects": [
3 | "rust-userspace/Cargo.toml"
4 | ],
5 | "files.associations": {
6 | "bpf_api.h": "c",
7 | "libbpf.h": "c"
8 | }
9 | }
--------------------------------------------------------------------------------
/Keymashed Presentation.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/Keymashed Presentation.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | _Ya know, how sometimes your computer's internet is slow?_
4 |
5 | _What if... you could motivate it. Make the internet itself flow a lil' quicker._
6 |
7 | # Keymashed
8 |
9 | An interactive art installation at Purdue Hackers' BURST fall 2024 show. Since making the internet faster is a hard research problem, keymashed instead worsens the internet and then eases up in proportion to how fast you mash the keyboard. Observe the fruits of your tactile encouragement through a custom live-streaming video protocol.
10 |
11 | ## Table of contents:
12 | - [✨the keymashed experience✨](#the-keymashed-experience)
13 | - [The Exhibit](#the-exhibit)
14 | - [Technical Details](#technical-details)
15 | - [eBPF Packet Filter](#ebpf-packet-filter)
16 | - [Real-time UDP Streaming](#real-time-udp-streaming)
17 | - [Video Codec](#video-codec)
18 | - [User-level Application](#user-level-application)
19 | - [Project Evolution](#project-evolution)
20 | - [Gallery](#gallery)
21 | - [About the author / hire me!](#about-the-author--hire-me)
22 | - [Credits](#credits)
23 |
24 | ## ✨the keymashed experience✨:
25 |
26 | https://github.com/user-attachments/assets/f13cbadf-bcb7-433d-a5de-5e4c0cf470ff
27 |
28 | _You walk up to the exhibit. There's a keyboard in front of you. The pedestal says, "Mash the keyboard". There are indistinct splotches of grey on the screen that may or may not be people standing around. As you start mashing, the image gains quality and smoothness. The edges of the screen glow a bright green to indicate you're close to the peak. The image resolves into a bird's-eye view of the pedestal. On the screen, you see yourself starting to approach the exhibit._
29 |
30 | ## The Exhibit
31 |
32 | Keymashed as an exhibit consisted of:
33 | - An IBM Model-M keyboard with exquisite mash-feel.
34 | - An old square monitor.
35 | - Two Dell Optiplexes (cheap desktop computers), connected to the monitor and the webcam respectively.
36 |
37 | There are two effects at play:
38 | - _Packet loss:_ UDP packets are dropped at the network-interface level on the livestream-playing computer. The more keys you mash, the fewer packets are lost. At the threshold, packet loss stops occurring.
39 | - _Lossy compression:_ Frames are encoded lossily on the livestream-sending computer. The more keys you mash, the less aggressive the lossy compression. At the threshold, the image becomes clear without any color banding.
40 |
41 | The webcam is mounted on top of a wall along with an Optiplex with a wireless dongle. This is the sender computer. The receiver computer sits under the pedestal that holds the monitor.
42 |
43 |
44 | The livestream is delayed by 30 seconds, since it's more interesting to see a bit into the past rather than just looking at your own back.
45 |
46 | ## Technical Details
47 |
48 | The repository consists of the following components:
49 | - an eBPF filter written in C that drops packets with some probability that it reads from a shared map. This eBPF filter is installed onto the network interface using the `tc` utility.
50 | - a video codec which uses a JPEG-like scheme to lossily compress blocks of frames which are then reassembled and decompressed on the receiver. The quality of the JPEG encoding can vary per block.
51 | - an RTP-like protocol for receiving packets over UDP.
52 |
53 | For supplementary diagrams and technical context, you can also have a look at [this presentation for `keymashed` I created when interviewing at Neuralink](Keymashed%20Presentation.pdf). Note that the presentation originally contained several videos and may be slightly confusing as a PDF. Feel free to contact me for the authoritative PPTX file.
54 |
55 | Explanations of each component follow.
56 |
57 | ### eBPF Packet Filter
58 |
59 | [eBPF](https://ebpf.io/) is a relatively recent Linux kernel feature which allows running sandboxed, user-provided code inside an in-kernel virtual machine. It is used in [many kernel subsystems which deal with security, tracing and networking](https://docs.ebpf.io/linux/program-type/).
60 |
61 | We create an eBPF filter in [bpf.c](bpf/bpf.c) which reads the drop probability from a BPF map pinned to a file (which userspace programs can update) and then decides whether to drop the current packet. This eBPF filter is installed on a network interface using the `tc` (traffic control) utility.
62 |
63 | ```c
64 | struct {
65 | // declare that the bpf map will be of type array, mapping uint32_t to uint32_t and have a maximum of one entry.
66 | __uint(type, BPF_MAP_TYPE_ARRAY);
67 | __uint(key_size, sizeof(uint32_t));
68 | __uint(value_size, sizeof(uint32_t));
69 | __uint(max_entries, 1);
70 | // PIN_BY_NAME ensures that the map is pinned in /sys/fs/bpf
71 | __uint(pinning, LIBBPF_PIN_BY_NAME);
72 | // synchronize the `map_keymash` name with the userspace program
73 | } map_keymash __section(".maps");
74 |
75 | __section("classifier")
76 | int scream_bpf(struct __sk_buff *skb)
77 | {
78 | uint32_t key = 0, *val = 0;
79 |
80 | val = map_lookup_elem(&map_keymash, &key);
81 | if (val && get_prandom_u32() < *val) {
82 | return TC_ACT_SHOT; // Drop packet
83 | }
84 | return TC_ACT_OK; // Pass packet
85 | }
86 | ```
87 |
88 | The [userspace code](rust-userspace/src/bpf.rs) interacts with the eBPF filter using the `bpf_obj_get` and `bpf_map_update_elem` functions from `libbpf`.
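
For illustration, a minimal sketch of that interaction using the `libbpf-sys` crate follows; the pin path and the probability-to-`u32` scaling are assumptions here, not necessarily what `bpf.rs` does.

```rust
// Hedged sketch: open the pinned map and write a new drop probability.
// The pin path and the u32 scaling are assumptions for illustration.
use std::ffi::CString;

fn set_drop_probability(prob: f64) -> Result<(), String> {
    // LIBBPF_PIN_BY_NAME pins the map under /sys/fs/bpf; tc typically uses
    // the tc/globals subdirectory (assumed path).
    let path = CString::new("/sys/fs/bpf/tc/globals/map_keymash").unwrap();
    let map_fd = unsafe { libbpf_sys::bpf_obj_get(path.as_ptr()) };
    if map_fd < 0 {
        return Err("failed to open pinned BPF map".into());
    }

    let key: u32 = 0;
    // The filter compares get_prandom_u32() against this value, so scale
    // the probability to the full u32 range.
    let value: u32 = (prob.clamp(0.0, 1.0) * u32::MAX as f64) as u32;

    let ret = unsafe {
        libbpf_sys::bpf_map_update_elem(
            map_fd,
            &key as *const u32 as *const std::ffi::c_void,
            &value as *const u32 as *const std::ffi::c_void,
            0, // BPF_ANY: create or update the entry
        )
    };
    if ret != 0 {
        return Err("bpf_map_update_elem failed".into());
    }
    Ok(())
}
```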
89 |
90 | ### Real-time UDP streaming
91 | [UDP is the User Datagram Protocol](https://en.wikipedia.org/wiki/User_Datagram_Protocol), commonly used for multimedia streaming applications due to its packet-oriented and unreliable nature. The Real-time Transport Protocol (RTP) is built on top of UDP.
92 |
93 | I decided to re-invent the Real-time Transport Protocol (RTP) from scratch, with a focus on reducing copies as much as possible. It makes heavy use of the [`zerocopy`](https://github.com/google/zerocopy) crate and const generics, and supports `?Sized` types. [Have a look at the rtp module](rust-userspace/src/rtp.rs) - the code is dense but well-commented. _It's probably the most ~~over~~-engineered part of the entire project._ High-level summary (a simplified sketch of the receive-side buffer follows the list):
94 | - each sent packet is assigned a sequence number.
95 | - on the receiver side, we maintain a circular buffer with slots for packets, putting incoming packets into slots as received.
96 | - the receiver consumes one slot at a time, which may or may not contain a packet (the packet may be lost or late). If a packet arrives after its slot has already been consumed (i.e. it is late), it is discarded.
97 | - if the sender lags behind in sending packets, the receiver can wait if the "early-latest" span is too low. The early-latest span measures the difference between the latest received packet number and the packet that will be consumed next.
98 | - if the receiver lags behind in consuming packets, earlier packets are overwritten with new ones. The receiver can jump ahead to start playing the new packets when it arrives at that section.
99 |
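To make the buffering concrete, here is a simplified, copy-heavy sketch of the receive-side bookkeeping; the real `rtp` module is zero-copy and generic over payload types, so treat this only as an illustration of the ring-buffer logic described above.

```rust
// Toy version of the receive-side ring buffer described above. The actual
// implementation in rust-userspace/src/rtp.rs is zero-copy and generic.
struct RingReceiver<const N: usize> {
    slots: Vec<Option<Vec<u8>>>, // slot i holds the packet with seq % N == i, if it has arrived
    next_to_consume: u64,        // sequence number the consumer expects next
    latest_seen: u64,            // highest sequence number received so far
}

impl<const N: usize> RingReceiver<N> {
    fn new() -> Self {
        Self { slots: vec![None; N], next_to_consume: 0, latest_seen: 0 }
    }

    fn store(&mut self, seq: u64, payload: Vec<u8>) {
        if seq < self.next_to_consume {
            return; // late packet: its slot was already consumed, so discard it
        }
        // If the receiver is lagging, this overwrites an older, unconsumed packet.
        self.slots[(seq % N as u64) as usize] = Some(payload);
        self.latest_seen = self.latest_seen.max(seq);
    }

    /// Gap between the newest packet seen and the one that will be consumed next.
    /// If this is small, the sender is lagging and the consumer may choose to wait.
    fn early_latest_span(&self) -> u64 {
        self.latest_seen.saturating_sub(self.next_to_consume)
    }

    /// Consume one slot, whether or not its packet arrived (it may be lost or late).
    fn consume_earliest(&mut self) -> Option<Vec<u8>> {
        let slot = (self.next_to_consume % N as u64) as usize;
        self.next_to_consume += 1;
        self.slots[slot].take()
    }
}
```
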
100 | An example of video playback with heavy packet loss (intensity of background ∝ packet loss):
101 |
102 | https://github.com/user-attachments/assets/766d756a-1409-4f98-a055-338dbd613f82
103 |
104 | Macroblocks carried by lost packets are not painted that frame, so newer frames end up partially painted over older ones. This causes the glitchy effect. The long strips are a consequence of the packetization strategy, which is explored towards the end of the next section.
105 |
106 | ### Video Codec
107 |
108 | The webcam transmits video in the `YUV422` format. The [`YUV`](https://en.wikipedia.org/wiki/YCbCr) format is an alternative to the more well-known `RGB` format; it encodes the luminance (`Y`), blue-difference chroma (`Cb`/`U`) and red-difference chroma (`Cr`/`V`).
109 |
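For reference, a common BT.601 / JFIF-style conversion from 8-bit RGB to full-range YUV is sketched below; the coefficients are the standard ones, though the project's own conversion code may differ in rounding details.

```rust
// Standard BT.601/JFIF full-range RGB -> YUV conversion (8 bits per component).
// Shown for reference; the conversion used elsewhere in this repo may differ slightly.
fn rgb_to_yuv(r: u8, g: u8, b: u8) -> (u8, u8, u8) {
    let (r, g, b) = (r as f32, g as f32, b as f32);
    let y = 0.299 * r + 0.587 * g + 0.114 * b;               // luminance
    let u = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b;   // blue-difference chroma (Cb)
    let v = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b;   // red-difference chroma (Cr)
    (
        y.round().clamp(0.0, 255.0) as u8,
        u.round().clamp(0.0, 255.0) as u8,
        v.round().clamp(0.0, 255.0) as u8,
    )
}
```
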
110 | 
111 |
112 | The `422` refers to the [chroma subsampling](https://en.wikipedia.org/wiki/Chroma_subsampling), explained below.
113 |
114 | > Chroma subsampling is the practice of encoding images by implementing less resolution for chroma information than for luma information, taking advantage of the human visual system's lower acuity for color differences than for luminance.
115 | >
116 | > -- Wikipedia
117 |
118 | In this case, instead of having independent `YUV` values for every pixel, we let two horizontally adjacent pixels share the `U` and `V` color values. This lets us pack two pixels into four bytes (assuming 8 bits per component) instead of the usual six, bringing us down to 2 bytes per pixel.
119 |
120 | 
121 |
122 | After receiving the video from the webcam, the video sender further subsamples the colors to 4:2:0. This brings us down further to 1.5 bytes per pixel.
123 |
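As a sanity check on the byte counts, the sketch below computes bytes per pixel for a 2 x 2 block of pixels under each subsampling scheme (assuming one byte per Y/U/V sample).

```rust
// Bytes per pixel for a 2 x 2 pixel block under different chroma subsampling
// schemes, assuming 8 bits (1 byte) per Y/U/V sample.
fn bytes_per_pixel(y_samples: usize, u_samples: usize, v_samples: usize, pixels: usize) -> f64 {
    (y_samples + u_samples + v_samples) as f64 / pixels as f64
}

fn main() {
    println!("4:4:4 -> {} bytes/pixel", bytes_per_pixel(4, 4, 4, 4)); // 3.0
    println!("4:2:2 -> {} bytes/pixel", bytes_per_pixel(4, 2, 2, 4)); // 2.0
    println!("4:2:0 -> {} bytes/pixel", bytes_per_pixel(4, 1, 1, 4)); // 1.5
}
```
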
124 | 
125 |
126 | The subsampled frame is then broken into _macroblocks_ of 16 x 16 pixels which contain six _blocks_ of 8 x 8 values: four for luminance, one for red-difference and one for blue-difference.
127 |
128 | 
129 |
130 | The macroblock.
131 |
132 | 
133 |
134 | The macroblock, decomposed into its six constituent blocks.
135 |
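In code, a macroblock is just six 8 x 8 arrays; the field names below mirror the `Macroblock` type used in the video module (the concrete element types there may differ).

```rust
// A 16 x 16 macroblock of a 4:2:0 frame: four 8 x 8 luma blocks plus one 8 x 8
// block each for the subsampled U and V planes. Field names mirror the video
// module's Macroblock; element types here are illustrative.
struct Macroblock {
    y0: [[u8; 8]; 8], // top-left luma block
    y1: [[u8; 8]; 8], // top-right luma block
    y2: [[u8; 8]; 8], // bottom-left luma block
    y3: [[u8; 8]; 8], // bottom-right luma block
    u:  [[u8; 8]; 8], // blue-difference chroma for the whole macroblock
    v:  [[u8; 8]; 8], // red-difference chroma for the whole macroblock
}
```
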
136 | Each block is converted to a frequency-domain representation using the [DCT transform](https://en.wikipedia.org/wiki/JPEG#Discrete_cosine_transform). The DCT-transformed output makes the high-frequency and low-frequency components of the block more apparent.
137 |
138 | After the transformation, the values are divided element-wise by the _quantization matrix_, which is specially chosen to minimize perceptual quality loss. The quantization matrix can be scaled to increase/decrease image quality - this is the main knob that we use to tune the lossy compression. Note how the bottom-right elements of the quantization matrix are larger than the ones in the top-left; this preserves the lower-frequency components more faithfully than the higher-frequency ones.
139 |
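A minimal sketch of this quantization step (and its inverse on the receiver) is shown below; the scale factor plays the role of the quality knob, and the project's own `quantize_macroblock`/`dequantize_macroblock` functions may differ in detail.

```rust
// Illustrative quantization of one 8 x 8 DCT block: divide element-wise by the
// (scaled) quantization matrix and round. A larger scale means coarser
// quantization and therefore lower quality.
fn quantize(dct: &[[f64; 8]; 8], q_matrix: &[[f64; 8]; 8], scale: f64) -> [[i32; 8]; 8] {
    let mut out = [[0i32; 8]; 8];
    for row in 0..8 {
        for col in 0..8 {
            out[row][col] = (dct[row][col] / (q_matrix[row][col] * scale)).round() as i32;
        }
    }
    out
}

// The receiver multiplies back by the same scaled matrix, recovering an
// approximation of the original DCT coefficients.
fn dequantize(quantized: &[[i32; 8]; 8], q_matrix: &[[f64; 8]; 8], scale: f64) -> [[f64; 8]; 8] {
    let mut out = [[0f64; 8]; 8];
    for row in 0..8 {
        for col in 0..8 {
            out[row][col] = quantized[row][col] as f64 * q_matrix[row][col] * scale;
        }
    }
    out
}
```
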
140 |
141 |
142 |
143 |
144 |
145 |
146 |
147 |
148 | Finally, the quantized block is read out in a zig-zag pattern and run-length encoded. The zig-zag ordering pushes the (mostly zero) high-frequency coefficients to the end of the stream, which makes even our naive encoding quite efficient on its own.
149 |
150 | 
151 |
152 | Run-length encoding encodes a stream of data as pairs of (value, count), where count is the number of times the value is repeated; a sketch of the encoder follows the example below.
153 | ```
154 | And so [54, 23, 23, 1, 1, 1, 0, 0, 0, ... 23 more times ...] gets encoded as [(54, 1), (23, 2), (1, 3), (0, 26), ...] which is quite efficient.
155 | ```
156 |
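A run-length encoder over the zig-zag-ordered coefficients can be as small as the sketch below, which reproduces the example above.

```rust
// Run-length encode a slice as (value, run length) pairs. Thanks to the
// zig-zag ordering, the long run of trailing zeros collapses into one pair.
fn rle_encode(data: &[i32]) -> Vec<(i32, u32)> {
    let mut out: Vec<(i32, u32)> = Vec::new();
    for &value in data {
        match out.last_mut() {
            Some((last, count)) if *last == value => *count += 1,
            _ => out.push((value, 1)),
        }
    }
    out
}

fn main() {
    let mut coeffs = vec![54, 23, 23, 1, 1, 1];
    coeffs.extend(std::iter::repeat(0).take(26)); // 3 zeros + 23 more, as in the example above
    assert_eq!(rle_encode(&coeffs), vec![(54, 1), (23, 2), (1, 3), (0, 26)]);
}
```
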
157 | I did some performance optimization and parallelization using [`rayon`](https://github.com/rayon-rs/rayon) to get this running as smoothly as it does - shoutout to [`cargo-flamegraph`](https://github.com/flamegraph-rs/flamegraph)!
158 |
159 | You can observe the final outcome of this lossy compression as a video:
160 |
161 | https://github.com/user-attachments/assets/489e3978-6acb-4a16-af49-40a0fb24831a
162 |
163 | | | left | right |
164 | |------|----------------|-------------------------------|
165 | | top | original video | DCT blocks |
166 | | bottom| quantized then dequantized DCT blocks | reconstructed video |
167 |
168 | > Recall that:
169 | > - a block is a 8x8 group of pixels.
170 | > - quantization refers to dividing the DCT block by the quantization matrix.
171 |
172 | Since the DCT blocks are displayed in the YUV color space, values close to zero appear as an intense green. Note how a larger amount of green in the dequantized DCT blocks corresponds to lower quality in the reconstructed video.
173 |
174 | The quality correlates with how fast you're mashing the keyboard. The background's intensity is a visual indicator for this. _You may need to increase the volume on your device to hear keymashing._
175 |
176 | 
177 |
178 | Encoded macroblocks are inserted into a packet with the following metadata and then sent over the network.
179 |
180 | ```
181 | |---------------|
182 | | Frame no. |
183 | |---------------|
184 | | Block 1 |
185 | | x, y, quality |
186 | | RLE data |
187 | |---------------|
188 | | Block 2 |
189 | | x, y, quality |
190 | | RLE data |
191 | |---------------|
192 | | ... |
193 | ```
194 |
195 | Note that these macroblocks are greedily packed into a single UDP packet. The packetizing logic tends to put adjacent macroblocks together, which means that a lost packet results in a long strip of macroblocks not being updated that frame.
196 |
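Conceptually, the payload of one such packet looks like the sketch below; the actual wire format is defined by the video module and serialized with `zerocopy`, so the types here are only illustrative.

```rust
// Conceptual shape of one video packet's payload; illustrative only. The real
// wire format is defined in the video module and serialized with zerocopy.
struct BlockEntry {
    x: u16,                    // macroblock position within the frame
    y: u16,
    quality: f32,              // quantization scale used for this block
    rle_data: Vec<(i32, u32)>, // run-length encoded, zig-zag ordered coefficients
}

struct VideoPacket {
    frame_number: u32,
    blocks: Vec<BlockEntry>,   // adjacent macroblocks packed greedily until the UDP payload is full
}
```
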
197 | ### User-level Application
198 | The application itself uses `SDL2` for handling key input and rendering the video.
199 |
200 | Putting both effects together, a demo of what the output looks like follows. _Turn up your volume to hear the keymashing._
201 |
202 | https://github.com/user-attachments/assets/cc3fd479-7786-4c24-bc81-64d4656eac57
203 |
204 | A diagram of the network setup (video processing steps elided):
205 |
206 | 
207 |
208 | ## Project Evolution
209 |
210 | "what if you could scream at your computer to make it run faster?" was the original question I asked. We ([@kartva](https://github.com/kartva/) and [@9p4](https://github.com/9p4)) wrote `run-louder`/`screamd` (we went through many names) which would spawn a child process, say Google Chrome, and intercept all syscalls made by it using `ptrace` (the same syscall that `gdb` uses). After intercepting a syscall, the parent would sleep for some time (proportional to scream intensity) before resuming the child.
211 |
212 | We demoed it and have a shaky video of:
213 | - trying to open Chrome but it's stuck loading
214 | - coming up to the laptop and yelling at it
215 | - Chrome immediately loads
216 |
217 | As an extension to this idea, I started working on affecting the network as well by dropping packets. At this point, I decided to present `run-louder`/`screamd` at BURST, which necessitated changing screaming to key-mashing (out of respect for the art gallery setting). Additionally, while `ping` works fine as a method of demoing packet loss, I wanted something more visual and thus ended up writing the video codec.
218 |
219 | If you have ideas around `screamd`, please contact me or create an issue!
220 |
221 | # Gallery
222 |
223 |
224 |
225 |
226 |
227 |
228 |
229 |
230 | # About the author / hire me!
231 | _I'm looking for Summer 2025 internships - and I'm particularly excited about working with startups._ Read more about my work at [my Github profile](https://github.com/kartva/).
232 |
233 | # Credits
234 |
235 | - [@9p4](https://github.com/9p4) helped a lot with initial ideation and prototyping, as well as creating the initial Zig version of `screamd`.
236 | - Poster design by Rebecca Pine and pixel art by Jadden Picardal.
237 | - Most photos by Sebastian Murariu.
238 | - [@ArhanChaudhary](https://github.com/ArhanChaudhary) for "ominously watching me code" and motivating me to do this from his [NAND computer project](https://github.com/ArhanChaudhary/NAND).
239 | - [@kdkasad](https://github.com/kdkasad) for interesting conversation and reviews of the README.
240 | - [@RhysU](https://github.com/RhysU) for feedback on the README.
241 |
--------------------------------------------------------------------------------
/bpf/README.md:
--------------------------------------------------------------------------------
1 | ```bash
2 | sudo apt install -y clang gcc-multilib
3 | clang -target bpf -O2 -g -o bpf.o -c bpf.c
4 |
5 | # replace "wlp3s0" with the network adaptor you want;
6 | # use ip a to look at available network adaptors
7 |
8 | # add the special ingress qdisc to mess with incoming packets
9 |
10 | sudo tc qdisc add dev wlp3s0 ingress
11 |
12 | # add the prio classful qdisc to the network adaptor
13 |
14 | sudo tc qdisc add dev wlp3s0 root handle 1: prio
15 |
16 | # add the filter to the ingress qdisc
17 | # da = direct-action
18 |
19 | sudo tc filter add dev wlp3s0 ingress bpf da obj bpf.o sec classifier
20 |
21 | # add the filter to the outbound prio qdisc
22 | # da = direct-action
23 | # protocol all = affect all packets
24 | # prio 1 = filter has highest priority
25 |
26 | sudo tc filter add dev wlp3s0 protocol all parent 1: prio 1 bpf da obj bpf.o sec classifier
27 | ```
28 |
29 | Without comments:
30 |
31 | ```bash
32 | sudo tc qdisc add dev lo ingress
33 | sudo tc qdisc add dev lo root handle 1: prio
34 | sudo tc filter add dev lo ingress bpf da obj bpf.o sec classifier
35 | sudo tc filter add dev lo protocol all parent 1: prio 1 bpf da obj bpf.o sec classifier
36 | ```
37 |
38 | To remove the filter:
39 |
40 | ```bash
41 | sudo tc qdisc del dev wlp3s0 root
42 | sudo tc qdisc del dev wlp3s0 ingress
43 | ```
44 |
45 | To observe installed filters:
46 | ```bash
47 | sudo tc filter show dev wlp3s0
48 | sudo tc qdisc show dev wlp3s0
49 | ```
--------------------------------------------------------------------------------
/bpf/bpf.c:
--------------------------------------------------------------------------------
1 | #include "bpf_api.h"
2 |
3 | /* Minimal, stand-alone toy map pinning example:
4 | *
5 | * clang -target bpf -O2 [...] -o bpf_shared.o -c bpf_shared.c
6 | * tc filter add dev foo parent 1: bpf obj bpf_shared.o sec egress
7 | * tc filter add dev foo parent ffff: bpf obj bpf_shared.o sec ingress
8 | *
9 | * Both classifiers will share the very same map instance in this example,
10 | * so map content can be accessed from ingress *and* egress side!
11 | *
12 | * This program pins its map with LIBBPF_PIN_BY_NAME, so the map is pinned
13 | * by name (under /sys/fs/bpf) and can be shared with userspace programs.
14 | *
15 | * A setting of PIN_GLOBAL_NS would place it into a global namespace,
16 | * so that it can be shared among different object files. A setting
17 | * of PIN_NONE (= 0) means no sharing, so each tc invocation creates a
18 | * new map instance.
19 | */
20 |
21 | struct {
22 | // declare that the bpf map will be of type array, mapping uint32_t to uint32_t and have a maximum of one entry.
23 | __uint(type, BPF_MAP_TYPE_ARRAY);
24 | __uint(key_size, sizeof(uint32_t));
25 | __uint(value_size, sizeof(uint32_t));
26 | __uint(max_entries, 1);
27 | // PIN_BY_NAME ensures that the map is pinned in /sys/fs/bpf
28 | __uint(pinning, LIBBPF_PIN_BY_NAME);
29 | // synchronize the `map_keymash` name with the userspace program
30 | } map_keymash __section(".maps");
31 |
32 | __section("classifier")
33 | int scream_bpf(struct __sk_buff *skb)
34 | {
35 | uint32_t key = 0, *val = 0;
36 |
37 | val = map_lookup_elem(&map_keymash, &key);
38 | if (val && get_prandom_u32() < *val) {
39 | return TC_ACT_SHOT; // Drop packet
40 | }
41 | return TC_ACT_OK; // Pass packet
42 | }
43 |
44 | BPF_LICENSE("GPL");
--------------------------------------------------------------------------------
/bpf/bpf.o:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/bpf/bpf.o
--------------------------------------------------------------------------------
/bpf/bpf_api.h:
--------------------------------------------------------------------------------
1 | /* SPDX-License-Identifier: GPL-2.0 or BSD-3-Clause */
2 | #ifndef __BPF_API__
3 | #define __BPF_API__
4 |
5 | /* Note:
6 | *
7 | * This file can be included into eBPF kernel programs. It contains
8 | * a couple of useful helper functions, map/section ABI (bpf_elf.h),
9 | * misc macros and some eBPF specific LLVM built-ins.
10 | */
11 |
12 | #include
13 |
14 | #include
15 | #include
16 | #include
17 |
18 | #include
19 |
20 | #include "bpf_elf.h"
21 |
22 | /** libbpf pin type. */
23 | enum libbpf_pin_type {
24 | LIBBPF_PIN_NONE,
25 | /* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
26 | LIBBPF_PIN_BY_NAME,
27 | };
28 |
29 | /** Type helper macros. */
30 |
31 | #define __uint(name, val) int (*name)[val] // int (*name)[4] -> 16 bytes
32 | #define __type(name, val) typeof(val) *name
33 | #define __array(name, val) typeof(val) *name[]
34 |
35 | /** Misc macros. */
36 |
37 | #ifndef __stringify
38 | # define __stringify(X) #X
39 | #endif
40 |
41 | #ifndef __maybe_unused
42 | # define __maybe_unused __attribute__((__unused__))
43 | #endif
44 |
45 | #ifndef offsetof
46 | # define offsetof(TYPE, MEMBER) __builtin_offsetof(TYPE, MEMBER)
47 | #endif
48 |
49 | #ifndef likely
50 | # define likely(X) __builtin_expect(!!(X), 1)
51 | #endif
52 |
53 | #ifndef unlikely
54 | # define unlikely(X) __builtin_expect(!!(X), 0)
55 | #endif
56 |
57 | #ifndef htons
58 | # define htons(X) __constant_htons((X))
59 | #endif
60 |
61 | #ifndef ntohs
62 | # define ntohs(X) __constant_ntohs((X))
63 | #endif
64 |
65 | #ifndef htonl
66 | # define htonl(X) __constant_htonl((X))
67 | #endif
68 |
69 | #ifndef ntohl
70 | # define ntohl(X) __constant_ntohl((X))
71 | #endif
72 |
73 | #ifndef __inline__
74 | # define __inline__ __attribute__((always_inline))
75 | #endif
76 |
77 | /** Section helper macros. */
78 |
79 | #ifndef __section
80 | # define __section(NAME) \
81 | __attribute__((section(NAME), used))
82 | #endif
83 |
84 | #ifndef __section_tail
85 | # define __section_tail(ID, KEY) \
86 | __section(__stringify(ID) "/" __stringify(KEY))
87 | #endif
88 |
89 | #ifndef __section_xdp_entry
90 | # define __section_xdp_entry \
91 | __section(ELF_SECTION_PROG)
92 | #endif
93 |
94 | #ifndef __section_cls_entry
95 | # define __section_cls_entry \
96 | __section(ELF_SECTION_CLASSIFIER)
97 | #endif
98 |
99 | #ifndef __section_act_entry
100 | # define __section_act_entry \
101 | __section(ELF_SECTION_ACTION)
102 | #endif
103 |
104 | #ifndef __section_lwt_entry
105 | # define __section_lwt_entry \
106 | __section(ELF_SECTION_PROG)
107 | #endif
108 |
109 | #ifndef __section_license
110 | # define __section_license \
111 | __section(ELF_SECTION_LICENSE)
112 | #endif
113 |
114 | #ifndef __section_maps
115 | # define __section_maps \
116 | __section(ELF_SECTION_MAPS)
117 | #endif
118 |
119 | /** Declaration helper macros. */
120 |
121 | #ifndef BPF_LICENSE
122 | # define BPF_LICENSE(NAME) \
123 | char ____license[] __section_license = NAME
124 | #endif
125 |
126 | /** Classifier helper */
127 |
128 | #ifndef BPF_H_DEFAULT
129 | # define BPF_H_DEFAULT -1
130 | #endif
131 |
132 | /** BPF helper functions for tc. Individual flags are in linux/bpf.h */
133 |
134 | #ifndef __BPF_FUNC
135 | # define __BPF_FUNC(NAME, ...) \
136 | (* NAME)(__VA_ARGS__) __maybe_unused
137 | #endif
138 |
139 | #ifndef BPF_FUNC
140 | # define BPF_FUNC(NAME, ...) \
141 | __BPF_FUNC(NAME, __VA_ARGS__) = (void *) BPF_FUNC_##NAME
142 | #endif
143 |
144 | /* BPF syscall */
145 | static int BPF_FUNC(sys_bpf, int cmd, union bpf_attr *attr, unsigned int size);
146 |
147 | /* Map access/manipulation */
148 | static void *BPF_FUNC(map_lookup_elem, void *map, const void *key);
149 | static int BPF_FUNC(map_update_elem, void *map, const void *key,
150 | const void *value, uint32_t flags);
151 | static int BPF_FUNC(map_delete_elem, void *map, const void *key);
152 |
153 | /* Time access */
154 | static uint64_t BPF_FUNC(ktime_get_ns);
155 |
156 | /* Debugging */
157 |
158 | /* FIXME: __attribute__ ((format(printf, 1, 3))) not possible unless
159 | * llvm bug https://llvm.org/bugs/show_bug.cgi?id=26243 gets resolved.
160 | * It would require ____fmt to be made const, which generates a reloc
161 | * entry (non-map).
162 | */
163 | static void BPF_FUNC(trace_printk, const char *fmt, int fmt_size, ...);
164 |
165 | #ifndef printt
166 | # define printt(fmt, ...) \
167 | ({ \
168 | char ____fmt[] = fmt; \
169 | trace_printk(____fmt, sizeof(____fmt), ##__VA_ARGS__); \
170 | })
171 | #endif
172 |
173 | /* Random numbers */
174 | static uint32_t BPF_FUNC(get_prandom_u32);
175 |
176 | /* Tail calls */
177 | static void BPF_FUNC(tail_call, struct __sk_buff *skb, void *map,
178 | uint32_t index);
179 |
180 | /* System helpers */
181 | static uint32_t BPF_FUNC(get_smp_processor_id);
182 | static uint32_t BPF_FUNC(get_numa_node_id);
183 |
184 | /* Packet misc meta data */
185 | static uint32_t BPF_FUNC(get_cgroup_classid, struct __sk_buff *skb);
186 | static int BPF_FUNC(skb_under_cgroup, void *map, uint32_t index);
187 |
188 | static uint32_t BPF_FUNC(get_route_realm, struct __sk_buff *skb);
189 | static uint32_t BPF_FUNC(get_hash_recalc, struct __sk_buff *skb);
190 | static uint32_t BPF_FUNC(set_hash_invalid, struct __sk_buff *skb);
191 |
192 | /* Packet redirection */
193 | static int BPF_FUNC(redirect, int ifindex, uint32_t flags);
194 | static int BPF_FUNC(clone_redirect, struct __sk_buff *skb, int ifindex,
195 | uint32_t flags);
196 |
197 | /* Packet manipulation */
198 | static int BPF_FUNC(skb_load_bytes, struct __sk_buff *skb, uint32_t off,
199 | void *to, uint32_t len);
200 | static int BPF_FUNC(skb_store_bytes, struct __sk_buff *skb, uint32_t off,
201 | const void *from, uint32_t len, uint32_t flags);
202 |
203 | static int BPF_FUNC(l3_csum_replace, struct __sk_buff *skb, uint32_t off,
204 | uint32_t from, uint32_t to, uint32_t flags);
205 | static int BPF_FUNC(l4_csum_replace, struct __sk_buff *skb, uint32_t off,
206 | uint32_t from, uint32_t to, uint32_t flags);
207 | static int BPF_FUNC(csum_diff, const void *from, uint32_t from_size,
208 | const void *to, uint32_t to_size, uint32_t seed);
209 | static int BPF_FUNC(csum_update, struct __sk_buff *skb, uint32_t wsum);
210 |
211 | static int BPF_FUNC(skb_change_type, struct __sk_buff *skb, uint32_t type);
212 | static int BPF_FUNC(skb_change_proto, struct __sk_buff *skb, uint32_t proto,
213 | uint32_t flags);
214 | static int BPF_FUNC(skb_change_tail, struct __sk_buff *skb, uint32_t nlen,
215 | uint32_t flags);
216 |
217 | static int BPF_FUNC(skb_pull_data, struct __sk_buff *skb, uint32_t len);
218 |
219 | /* Event notification */
220 | static int __BPF_FUNC(skb_event_output, struct __sk_buff *skb, void *map,
221 | uint64_t index, const void *data, uint32_t size) =
222 | (void *) BPF_FUNC_perf_event_output;
223 |
224 | /* Packet vlan encap/decap */
225 | static int BPF_FUNC(skb_vlan_push, struct __sk_buff *skb, uint16_t proto,
226 | uint16_t vlan_tci);
227 | static int BPF_FUNC(skb_vlan_pop, struct __sk_buff *skb);
228 |
229 | /* Packet tunnel encap/decap */
230 | static int BPF_FUNC(skb_get_tunnel_key, struct __sk_buff *skb,
231 | struct bpf_tunnel_key *to, uint32_t size, uint32_t flags);
232 | static int BPF_FUNC(skb_set_tunnel_key, struct __sk_buff *skb,
233 | const struct bpf_tunnel_key *from, uint32_t size,
234 | uint32_t flags);
235 |
236 | static int BPF_FUNC(skb_get_tunnel_opt, struct __sk_buff *skb,
237 | void *to, uint32_t size);
238 | static int BPF_FUNC(skb_set_tunnel_opt, struct __sk_buff *skb,
239 | const void *from, uint32_t size);
240 |
241 | /** LLVM built-ins, mem*() routines work for constant size */
242 |
243 | #ifndef lock_xadd
244 | # define lock_xadd(ptr, val) ((void) __sync_fetch_and_add(ptr, val))
245 | #endif
246 |
247 | #ifndef memset
248 | # define memset(s, c, n) __builtin_memset((s), (c), (n))
249 | #endif
250 |
251 | #ifndef memcpy
252 | # define memcpy(d, s, n) __builtin_memcpy((d), (s), (n))
253 | #endif
254 |
255 | #ifndef memmove
256 | # define memmove(d, s, n) __builtin_memmove((d), (s), (n))
257 | #endif
258 |
259 | /* FIXME: __builtin_memcmp() is not yet fully usable unless llvm bug
260 | * https://llvm.org/bugs/show_bug.cgi?id=26218 gets resolved. Also
261 | * this one would generate a reloc entry (non-map), otherwise.
262 | */
263 | #if 0
264 | #ifndef memcmp
265 | # define memcmp(a, b, n) __builtin_memcmp((a), (b), (n))
266 | #endif
267 | #endif
268 |
269 | unsigned long long load_byte(void *skb, unsigned long long off)
270 | asm ("llvm.bpf.load.byte");
271 |
272 | unsigned long long load_half(void *skb, unsigned long long off)
273 | asm ("llvm.bpf.load.half");
274 |
275 | unsigned long long load_word(void *skb, unsigned long long off)
276 | asm ("llvm.bpf.load.word");
277 |
278 | #endif /* __BPF_API__ */
--------------------------------------------------------------------------------
/bpf/bpf_elf.h:
--------------------------------------------------------------------------------
1 | /* SPDX-License-Identifier: GPL-2.0 or BSD-3-Clause */
2 | #ifndef __BPF_ELF__
3 | #define __BPF_ELF__
4 |
5 | #include
6 |
7 | /* Note:
8 | *
9 | * Below ELF section names and bpf_elf_map structure definition
10 | * are not (!) kernel ABI. It's rather a "contract" between the
11 | * application and the BPF loader in tc. For compatibility, the
12 | * section names should stay as-is. Introduction of aliases, if
13 | * needed, are a possibility, though.
14 | */
15 |
16 | /* ELF section names, etc */
17 | #define ELF_SECTION_LICENSE "license"
18 | #define ELF_SECTION_MAPS "maps"
19 | #define ELF_SECTION_PROG "prog"
20 | #define ELF_SECTION_CLASSIFIER "classifier"
21 | #define ELF_SECTION_ACTION "action"
22 |
23 | #define ELF_MAX_MAPS 64
24 | #define ELF_MAX_LICENSE_LEN 128
25 |
26 | /* Object pinning settings */
27 | #define PIN_NONE 0
28 | #define PIN_OBJECT_NS 1
29 | #define PIN_GLOBAL_NS 2
30 |
31 | /* ELF map definition */
32 | struct bpf_elf_map {
33 | __u32 type;
34 | __u32 size_key;
35 | __u32 size_value;
36 | __u32 max_elem;
37 | __u32 flags;
38 | __u32 id;
39 | __u32 pinning;
40 | __u32 inner_id;
41 | __u32 inner_idx;
42 | };
43 |
44 | #define BPF_ANNOTATE_KV_PAIR(name, type_key, type_val) \
45 | struct ____btf_map_##name { \
46 | type_key key; \
47 | type_val value; \
48 | }; \
49 | struct ____btf_map_##name \
50 | __attribute__ ((section(".maps." #name), used)) \
51 | ____btf_map_##name = { }
52 |
53 | #endif /* __BPF_ELF__ */
--------------------------------------------------------------------------------
/bpf/setup-tc.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | sudo tc qdisc add dev wlp3s0 ingress
4 | sudo tc qdisc add dev wlp3s0 root handle 1: prio
5 | sudo tc filter add dev wlp3s0 ingress bpf da obj bpf.o sec classifier
6 | sudo tc filter add dev wlp3s0 protocol all parent 1: prio 1 bpf da obj bpf.o sec classifier
--------------------------------------------------------------------------------
/media/Architecture.drawio.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/Architecture.drawio.png
--------------------------------------------------------------------------------
/media/BURST 2024 SebMur-19-resized.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/BURST 2024 SebMur-19-resized.jpg
--------------------------------------------------------------------------------
/media/BURST 2024 SebMur-6-resized.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/BURST 2024 SebMur-6-resized.jpg
--------------------------------------------------------------------------------
/media/BURST 2024 SebMur-66-cropped.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/BURST 2024 SebMur-66-cropped.jpg
--------------------------------------------------------------------------------
/media/BURST 2024 SebMur-82-cropped.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/BURST 2024 SebMur-82-cropped.jpg
--------------------------------------------------------------------------------
/media/BURST 2024 suspiciously-optiplex shaped box.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/BURST 2024 suspiciously-optiplex shaped box.jpg
--------------------------------------------------------------------------------
/media/JPEG Process.drawio.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/JPEG Process.drawio.png
--------------------------------------------------------------------------------
/media/dct/dct_of_block.svg:
--------------------------------------------------------------------------------
[SVG markup omitted: Matplotlib v3.9.2 figure "DCT of Block", generated 2024-12-24.]
--------------------------------------------------------------------------------
/media/dct/gen.py:
--------------------------------------------------------------------------------
1 | # LLM-generated file; only for README diagrams.
2 |
3 | import numpy as np
4 | import matplotlib.pyplot as plt
5 | from scipy.fft import dct, idct
6 |
7 | # Function to apply 2D DCT
8 | def dct2(block):
9 | return dct(dct(block.T, norm='ortho').T, norm='ortho')
10 |
11 | # Function to apply 2D inverse DCT
12 | def idct2(block):
13 | return idct(idct(block.T, norm='ortho').T, norm='ortho')
14 |
15 | # Example 8x8 pixel block (values ranging from 0 to 255)
16 | pixel_block = np.array([
17 | [52, 55, 61, 66, 70, 61, 64, 73],
18 | [63, 59, 55, 90, 109, 85, 69, 72],
19 | [62, 59, 68, 113, 144, 104, 66, 73],
20 | [63, 58, 71, 122, 154, 106, 70, 69],
21 | [67, 61, 68, 104, 126, 88, 68, 70],
22 | [79, 65, 60, 70, 77, 68, 58, 75],
23 | [85, 71, 64, 59, 55, 61, 65, 83],
24 | [87, 79, 69, 68, 65, 76, 78, 94]
25 | ])
26 |
27 | # Zero-center the pixel block by subtracting 128
28 | zero_centered_block = pixel_block - 128
29 |
30 | # Quantization matrix (example from JPEG standard)
31 | quantization_matrix = np.array([
32 | [16, 11, 10, 16, 24, 40, 51, 61],
33 | [12, 12, 14, 19, 26, 58, 60, 55],
34 | [14, 13, 16, 24, 40, 57, 69, 56],
35 | [14, 17, 22, 29, 51, 87, 80, 62],
36 | [18, 22, 37, 56, 68, 109, 103, 77],
37 | [24, 35, 55, 64, 81, 104, 113, 92],
38 | [49, 64, 78, 87, 103, 121, 120, 101],
39 | [72, 92, 95, 98, 112, 100, 103, 99]
40 | ])
41 |
42 | # Perform DCT
43 | dct_block = dct2(zero_centered_block)
44 |
45 | # Quantization step
46 | quantized_block = np.round(dct_block / quantization_matrix)
47 | quantized_block[np.isclose(quantized_block, 0)] = 0 # Replace near-zero values with 0
48 |
49 | # Dequantization step
50 | dequantized_block = quantized_block * quantization_matrix
51 |
52 | # Perform Inverse DCT
53 | reconstructed_block = idct2(dequantized_block)
54 |
55 | # Add 128 back to reverse zero-centering
56 | reconstructed_block += 128
57 |
58 | # Function to display a grid with numbers and save as SVG
59 | def save_grid(data, title, filename):
60 | fig, ax = plt.subplots(figsize=(6, 6))
61 | ax.imshow(data, cmap='gray')
62 | ax.set_title(title)
63 | ax.axis('off')
64 | for (i, j), val in np.ndenumerate(data):
65 | ax.text(j, i, f"{val:.0f}", ha='center', va='center', color='red')
66 | plt.savefig(filename, format='svg', bbox_inches='tight')
67 | plt.close(fig)
68 |
69 | # Save individual images
70 | save_grid(pixel_block, "Original 8x8 Block", "original_8x8_block.svg")
71 | save_grid(dct_block, "DCT of Block", "dct_of_block.svg")
72 | save_grid(quantization_matrix, "Quantization Matrix", "quantization_matrix.svg")
73 | save_grid(quantized_block, "Quantized DCT Block", "quantized_dct_block.svg")
74 | save_grid(np.round(reconstructed_block), "Reconstructed Block", "reconstructed_block.svg")
75 |
--------------------------------------------------------------------------------
/media/dct/jpeg-dct-transform.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/dct/jpeg-dct-transform.png
--------------------------------------------------------------------------------
/media/dct/quantized_dct_block.svg:
--------------------------------------------------------------------------------
[SVG markup omitted: Matplotlib v3.9.2 figure "Quantized DCT Block", generated 2024-12-24.]
--------------------------------------------------------------------------------
/media/keymash-all-effects.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/keymash-all-effects.mp4
--------------------------------------------------------------------------------
/media/rickroll-keymash-compressed-with-audio.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/rickroll-keymash-compressed-with-audio.mp4
--------------------------------------------------------------------------------
/media/rickroll-keymashed-packet-loss.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/media/rickroll-keymashed-packet-loss.mp4
--------------------------------------------------------------------------------
/rust-userspace/.gitignore:
--------------------------------------------------------------------------------
1 | target
2 |
--------------------------------------------------------------------------------
/rust-userspace/Cargo.toml:
--------------------------------------------------------------------------------
1 | [package]
2 | name = "rust_userspace"
3 | version = "0.1.0"
4 | edition = "2021"
5 |
6 | [profile.release]
7 | debug = true
8 |
9 | [[bench]]
10 | name = "rtp"
11 | harness = false
12 |
13 | [dependencies]
14 | bytes = "1.8.0"
15 | criterion = "0.5.1"
16 | crossterm = "0.28.1"
17 | fft2d = { version = "0.1.1", features = ["rustdct"] }
18 | libbpf-sys = "1.5.0"
19 | libc = "0.2.167"
20 | log = "0.4.22"
21 | memmap = "0.7.0"
22 | rand = "0.8.5"
23 | ratatui = "0.29.0"
24 | rayon = "1.10.0"
25 | rscam = "0.5.5"
26 | sdl2 = { version = "0.37.0", features = ["image"] }
27 | simplelog = "0.12.2"
28 | video-rs = "0.10.3"
29 | zerocopy = { version = "0.8.11", features = ["derive"] }
30 |
--------------------------------------------------------------------------------
/rust-userspace/README.md:
--------------------------------------------------------------------------------
1 | This Rust project requires nightly Rust to compile.
2 |
3 | Bump up the network receive buffer when running this program, otherwise UDP packets tend to get dropped.
4 |
5 | ```bash
6 | sudo sysctl -w net.core.rmem_default=8388608
7 | # bumping up the stack size limit may also be helpful if the program crashes
8 | ulimit -s 8388608
9 | ```
--------------------------------------------------------------------------------
/rust-userspace/benches/rtp.rs:
--------------------------------------------------------------------------------
1 | #![feature(generic_const_exprs)]
2 |
3 | use criterion::{black_box, criterion_group, criterion_main, Criterion};
4 | use rust_userspace::rtp::{RtpSizedPayloadReceiver, RtpSizedPayloadSender};
5 | use std::net::UdpSocket;
6 | use zerocopy::{FromBytes, IntoBytes, KnownLayout, Immutable};
7 |
8 |
9 | #[derive(FromBytes, Debug, IntoBytes, Immutable, KnownLayout)]
10 | #[repr(C)]
11 | struct TestPayload {
12 | data: [u8; 64]
13 | }
14 |
15 | fn setup_sockets() -> (UdpSocket, UdpSocket) {
16 | let sender = UdpSocket::bind("127.0.0.1:0").unwrap();
17 | let receiver = UdpSocket::bind("127.0.0.1:0").unwrap();
18 | sender.connect(receiver.local_addr().unwrap()).unwrap();
19 | receiver.connect(sender.local_addr().unwrap()).unwrap();
20 | (sender, receiver)
21 | }
22 |
23 | fn bench_rtp_send_receive(c: &mut Criterion) {
24 | let mut group = c.benchmark_group("rtp");
25 |
26 | group.bench_function("send_receive", |b| {
27 | let (sender_socket, receiver_socket) = setup_sockets();
28 | let mut sender = RtpSizedPayloadSender::<TestPayload>::new(sender_socket);
29 | let receiver = RtpSizedPayloadReceiver::<TestPayload, 64>::new(receiver_socket); // 64 = ring-buffer length (assumed value)
30 |
31 | b.iter(|| {
32 | // Send a packet
33 | sender.send(|payload: &mut TestPayload| {
34 | payload.data = black_box([42u8; 64]);
35 | });
36 |
37 | // Wait for packet while releasing lock between checks
38 | loop {
39 | let has_packet = {
40 | let receiver_lock = receiver.lock_receiver();
41 | receiver_lock.peek_earliest_packet().is_some()
42 | };
43 |
44 | if has_packet {
45 | break;
46 | }
47 | std::thread::yield_now();
48 | }
49 |
50 | // Now get and consume the packet
51 | let mut receiver_lock = receiver.lock_receiver();
52 | let packet = receiver_lock.consume_earliest_packet();
53 | black_box(packet.get_data().unwrap());
54 | });
55 | });
56 |
57 | group.finish();
58 | }
59 |
60 | criterion_group!(benches, bench_rtp_send_receive);
61 | criterion_main!(benches);
62 |
--------------------------------------------------------------------------------
/rust-userspace/first.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kartva/keymashed/1b35b13fd3781654b8eba5eb455b281916bf5f16/rust-userspace/first.gif
--------------------------------------------------------------------------------
/rust-userspace/rust-toolchain.toml:
--------------------------------------------------------------------------------
1 | [toolchain]
2 | channel = "nightly-2024-11-20"
--------------------------------------------------------------------------------
/rust-userspace/src/audio.rs:
--------------------------------------------------------------------------------
1 | use sdl2::audio::{AudioCallback, AudioDevice, AudioSpecDesired};
2 | use std::time::Duration;
3 | use std::net::Ipv4Addr;
4 | use crate::{rtp, udp_connect_retry, RECV_AUDIO_PORT, RECV_IP, SEND_IP, SEND_AUDIO_PORT};
5 |
6 | pub const AUDIO_SAMPLE_COUNT: usize = 1024;
7 | pub const AUDIO_FREQUENCY: i32 = 44100;
8 | pub const AUDIO_BUFFER_LENGTH: usize = 1024;
9 |
10 | pub struct AudioCallbackData {
11 | last: [f32; AUDIO_SAMPLE_COUNT],
12 | recv: rtp::RtpSizedPayloadReceiver<[f32; AUDIO_SAMPLE_COUNT], AUDIO_BUFFER_LENGTH>,
13 | }
14 |
15 | impl AudioCallback for AudioCallbackData {
16 | type Channel = f32;
17 |
18 | fn callback(&mut self, out: &mut [f32]) {
19 | let mut locked_receiver = self.recv.lock_receiver();
20 |
21 | // If the circular buffer hasn't seen enough future packets, wait for more to arrive
22 | // Handles the case: sender is falling behind in sending packets.
23 | while locked_receiver.early_latest_span() < 5 {
24 | log::debug!("Sleeping and waiting for more packets to arrive. Early-latest span {}", locked_receiver.early_latest_span());
25 | drop(locked_receiver);
26 | std::thread::sleep(Duration::from_millis(
27 | (1000 * AUDIO_SAMPLE_COUNT as u64) / (AUDIO_FREQUENCY as u64),
28 | ));
29 | locked_receiver = self.recv.lock_receiver();
30 | }
31 |
32 | let received_packet = locked_receiver.consume_earliest_packet();
33 |
34 | if let Some(packet) = received_packet.get_data() {
35 | log::info!("Playing packet with seq: {:?}", packet.header);
36 |
37 | out.copy_from_slice(&packet.data);
38 |
39 | self.last = packet.data;
40 | } else {
41 | log::info!("No packet to play. Playing last received packet again.");
out.copy_from_slice(&self.last); // actually replay the last received packet, as the log message says
42 | }
43 | }
44 | }
45 |
46 | /// Start playing audio from a UDP stream. Audio will play until returned device is dropped.
47 | /// Ensure that the frequency, sample count and bit depth of the sender and receiver match.
48 |
49 | pub fn play_audio(audio_subsystem: &sdl2::AudioSubsystem) -> AudioDevice<AudioCallbackData> {
50 | let sock = udp_connect_retry((Ipv4Addr::UNSPECIFIED, RECV_AUDIO_PORT));
51 | sock.connect((SEND_IP, SEND_AUDIO_PORT)).unwrap();
52 |
53 | let recv: rtp::RtpSizedPayloadReceiver<[f32; AUDIO_SAMPLE_COUNT], AUDIO_BUFFER_LENGTH> = rtp::RtpReceiver::new(sock);
54 |
55 | let desired_spec = AudioSpecDesired {
56 | freq: Some(AUDIO_FREQUENCY),
57 | // mono
58 | channels: Some(1),
59 | // number of samples
60 | // should be the same as the number of samples in a packet
61 | samples: Some(AUDIO_SAMPLE_COUNT as u16),
62 | };
63 |
64 | let device = audio_subsystem
65 | .open_playback(None, &desired_spec, |_spec| {
66 | // initialize the audio callback
67 | AudioCallbackData {
68 | last: [0.0; AUDIO_SAMPLE_COUNT],
69 | recv,
70 | }
71 | })
72 | .unwrap();
73 |
74 | log::info!("Starting to play audio; waiting for packets to queue!");
75 | // let packets queue up
76 | std::thread::sleep(Duration::from_secs(1));
77 |
78 | device.resume();
79 | device
80 | }
81 |
82 | /// Start sending audio over a UDP stream. Audio will be sent indefinitely.
83 | pub fn send_audio() -> ! {
84 | let sock = udp_connect_retry((Ipv4Addr::UNSPECIFIED, SEND_AUDIO_PORT));
85 | sock.connect((RECV_IP, RECV_AUDIO_PORT)).unwrap();
86 | let mut sender: rtp::RtpSizedPayloadSender<[f32; AUDIO_SAMPLE_COUNT]> = rtp::RtpSizedPayloadSender::new(sock);
87 |
88 | let mut time = 0.0;
89 | let mut audio_wav_reader = std::iter::from_fn(move || {
90 | time += 1.0 / AUDIO_FREQUENCY as f32;
91 | Some(0.5 * (2.0 * std::f32::consts::PI * 440.0 * time).sin())
92 | });
93 |
94 | log::info!("Starting to send audio!");
95 |
96 | loop {
97 | sender.send(|bytes: &mut [f32; AUDIO_SAMPLE_COUNT]| {
98 | for idx in 0..AUDIO_SAMPLE_COUNT {
99 | bytes[idx] = audio_wav_reader.next().unwrap();
100 | }
101 | });
102 | std::thread::sleep(Duration::from_millis(
103 | (1000 * AUDIO_SAMPLE_COUNT as u64) / (AUDIO_FREQUENCY as u64),
104 | ));
105 | log::trace!("Sent audio packet.");
106 | }
107 | }
--------------------------------------------------------------------------------
/rust-userspace/src/bin/dct-pipeline.rs:
--------------------------------------------------------------------------------
1 | // ----------------------------------------------------------------------------
2 | // WARNING:
3 | // Documentation for this code is poor. This code is meant to demonstrate the
4 | // various stages of the lossy compression pipeline.
5 | // ----------------------------------------------------------------------------
6 |
7 | use core::f64;
8 | use rayon::prelude::*;
9 | use rust_userspace::{video::{
10 | dct::dct2d, dequantize_macroblock, quantize_macroblock, Macroblock, MacroblockWithPosition,
11 | MutableYUVFrame, YUVFrame, YUVFrameMacroblockIterator, YUYV422Sample,
12 | }, wpm};
13 | use sdl2::{pixels::{Color, PixelFormatEnum}, rect::Rect};
14 | use std::{path::Path, sync::Mutex, time::Duration};
15 | use zerocopy::IntoBytes;
16 |
17 | const FILE_PATH: &str = "/home/kart/Downloads/Rick_Astley_Never_Gonna_Give_You_Up.mp4";
18 |
19 | const GRID_PADDING: u32 = 10;
20 |
21 | struct DisplayBuffer<'a> {
22 | texture: sdl2::render::Texture<'a>,
23 | rect: Rect,
24 | }
25 |
26 | impl<'a> DisplayBuffer<'a> {
27 | fn new(
28 | texture_creator: &'a sdl2::render::TextureCreator<sdl2::video::WindowContext>,
29 | x: i32,
30 | y: i32,
31 | width: u32,
32 | height: u32,
33 | ) -> Self {
34 | Self {
35 | texture: texture_creator
36 | .create_texture_streaming(PixelFormatEnum::YUY2, width, height)
37 | .unwrap(),
38 | rect: Rect::new(x, y, width, height),
39 | }
40 | }
41 | }
42 |
43 | fn calculate_quality(x: usize, _y: usize, video_width: usize, _video_height: usize) -> f64 {
44 | if x < video_width / 3 {
45 | 0.04
46 | } else {
47 | let x_scale = (2 * video_width / 3) as f64;
48 | let x = x as f64 - (video_width as f64 - x_scale);
49 | 0.04 + 0.3 * (x / x_scale)
50 | }
51 |
52 | // let width = width as f64;
53 | // let x = x as f64;
54 | // let scaling_factor = 1.0 / (1.0 + f64::consts::E.powf(((x - width / 2.0) / width) * 6.0));
55 | // 0.04 + 1.0 * scaling_factor
56 | }
57 |
58 | use video_rs;
59 |
60 | fn main() -> Result<(), Box<dyn std::error::Error>> {
61 | let mut decoder = video_rs::Decoder::new(Path::new(FILE_PATH)).unwrap();
62 | let (video_width, video_height) = decoder.size();
63 |
64 | let sdl_context = sdl2::init().unwrap();
65 | let video_subsystem = sdl_context.video().unwrap();
66 |
67 | let window = video_subsystem
68 | .window(
69 | "JPEG Compression Stages",
70 | GRID_PADDING * 3 + video_width * 2,
71 | GRID_PADDING * 3 + video_height * 2,
72 | )
73 | .position_centered()
74 | .build()?;
75 |
76 | let mut canvas = window.into_canvas().accelerated().build()?;
77 | let texture_creator = canvas.texture_creator();
78 |
79 | // Create display buffers for each visualization stage
80 | let mut buffers = vec![
81 | DisplayBuffer::new(
82 | &texture_creator,
83 | GRID_PADDING as i32,
84 | GRID_PADDING as i32,
85 | video_width,
86 | video_height,
87 | ), // Original
88 | DisplayBuffer::new(
89 | &texture_creator,
90 | (GRID_PADDING * 2 + video_width) as i32,
91 | GRID_PADDING as i32,
92 | video_width,
93 | video_height,
94 | ), // DCT
95 | // DisplayBuffer::new(&texture_creator, (GRID_PADDING * 3 + video_width * 2) as i32, GRID_PADDING as i32,
96 | // video_width, video_height), // Quantization Matrix
97 | DisplayBuffer::new(
98 | &texture_creator,
99 | GRID_PADDING as i32,
100 | (GRID_PADDING * 2 + video_height) as i32,
101 | video_width,
102 | video_height,
103 | ), // Quantized DCT
104 | DisplayBuffer::new(
105 | &texture_creator,
106 | (GRID_PADDING * 2 + video_width) as i32,
107 | (GRID_PADDING * 2 + video_height) as i32,
108 | video_width,
109 | video_height,
110 | ), // Reconstructed
111 | ];
112 |
113 | let mut event_pump = sdl_context.event_pump()?;
114 |
115 | let mut typing_metrics = wpm::TypingMetrics::new();
116 | let mut frame_buf = Vec::with_capacity(video_width as usize * video_height as usize * 2);
117 | 'running: loop {
118 | let start_time = std::time::Instant::now();
119 |
120 | for event in event_pump.poll_iter() {
121 | match event {
122 | sdl2::event::Event::Quit {..} => return Ok(()),
123 | sdl2::event::Event::KeyDown { keycode, repeat: false, timestamp: _, .. } => {
124 | match keycode {
125 | Some(k) => {
126 | let ik = k.into_i32();
127 | typing_metrics.receive_char_stroke(ik);
128 | },
129 | _ => {}
130 | }
131 | },
132 | _ => {}
133 | }
134 | }
135 |
136 | let wpm = typing_metrics.calc_wpm();
137 | log::info!("WPM: {}", wpm);
138 |
139 | let quality = wpm::wpm_to_jpeg_quality(wpm);
140 |
141 | canvas.set_draw_color(wpm::wpm_to_sdl_color(wpm, Color::RED));
142 | canvas.clear();
143 |
144 | let frame = match decoder.decode_raw() {
145 | Ok(f) => f,
146 | Err(video_rs::Error::DecodeExhausted) => break 'running,
147 | Err(e) => panic!("{:?}", e),
148 | };
149 |
150 | frame_buf.clear();
151 | // push YUYV422 samples from an RGB888 image
152 | for y in 0..video_height {
153 | for x in 0..(video_width / 2) {
154 | let start_index = (y * video_width * 3 + x * 3 * 2) as usize;
155 | let rgb = &frame.data(0)[start_index..start_index + 6];
156 | let yuyv = YUYV422Sample::from_rgb24(rgb.try_into().unwrap());
157 |
158 | frame_buf.extend_from_slice(&yuyv.as_bytes());
159 | }
160 | }
161 |
162 | // Capture frame
163 | let frame: &[u8] = frame_buf.as_ref();
164 | let yuv_frame = YUVFrame::new(video_width as usize, video_height as usize, &frame);
165 |
166 | // Original frame
167 | buffers[0]
168 | .texture
169 | .with_lock(None, |buffer: &mut [u8], _pitch: usize| {
170 | buffer.copy_from_slice(frame);
171 | })?;
172 |
173 | fn scale_down_to_u8(val: f64) -> u8 {
174 | (val * 255.0).clamp(0.0, 255.0) as u8
175 | }
176 |
177 | // DCT frame
178 | buffers[1]
179 | .texture
180 | .with_lock(None, |buffer: &mut [u8], _pitch: usize| {
181 | let output_yuv_frame = Mutex::new(MutableYUVFrame::new(
182 | video_width as usize,
183 | video_height as usize,
184 | buffer,
185 | ));
186 |
187 | YUVFrameMacroblockIterator::new(&yuv_frame)
188 | .par_bridge()
189 | .into_par_iter()
190 | .map(|MacroblockWithPosition { block, x, y }| {
191 | // Perform DCT; scale floating-point values to the 0-255 range
192 | let Macroblock {
193 | y0,
194 | y1,
195 | y2,
196 | y3,
197 | u,
198 | v,
199 | } = block;
200 |
201 | let dct_block = Macroblock {
202 | y0: dct2d(&y0).map(|row| row.map(scale_down_to_u8)),
203 | y1: dct2d(&y1).map(|row| row.map(scale_down_to_u8)),
204 | y2: dct2d(&y2).map(|row| row.map(scale_down_to_u8)),
205 | y3: dct2d(&y3).map(|row| row.map(scale_down_to_u8)),
206 | u: dct2d(&u).map(|row| row.map(scale_down_to_u8)),
207 | v: dct2d(&v).map(|row| row.map(scale_down_to_u8)),
208 | };
209 |
210 | MacroblockWithPosition {
211 | block: dct_block,
212 | x,
213 | y,
214 | }
215 | })
216 | .for_each(|MacroblockWithPosition { block, x, y }| {
217 | block.copy_to_yuv422_frame(&mut output_yuv_frame.lock().unwrap(), x, y);
218 | });
219 | })?;
220 |
221 | // Quantization matrix
222 | // buffers[2].texture.with_lock(None, |buffer: &mut [u8], _pitch: usize| {
223 | // let output_yuv_frame = Mutex::new(MutableYUVFrame::new(video_width as usize, video_height as usize, buffer));
224 |
225 | // YUVFrameMacroblockIterator::new(&yuv_frame).par_bridge().into_par_iter().map(|MacroblockWithPosition { block: _, x, y }| {
226 | // let quality = calculate_quality(x, y, video_width as usize, video_height as usize);
227 |
228 | // let quality_scaled_luminance_q_matrix =
229 | // quality_scaled_q_matrix(&LUMINANCE_QUANTIZATION_TABLE, quality);
230 | // let quality_scaled_chrominance_q_matrix =
231 | // quality_scaled_q_matrix(&CHROMINANCE_QUANTIZATION_TABLE, quality);
232 | // let block = Macroblock {
233 | // y0: quality_scaled_luminance_q_matrix.map(|row| row.map(scale_down_to_u8)),
234 | // y1: quality_scaled_luminance_q_matrix.map(|row| row.map(scale_down_to_u8)),
235 | // y2: quality_scaled_luminance_q_matrix.map(|row| row.map(scale_down_to_u8)),
236 | // y3: quality_scaled_luminance_q_matrix.map(|row| row.map(scale_down_to_u8)),
237 | // u: quality_scaled_chrominance_q_matrix.map(|row| row.map(scale_down_to_u8)),
238 | // v: quality_scaled_chrominance_q_matrix.map(|row| row.map(scale_down_to_u8)),
239 | // };
240 | // MacroblockWithPosition { block, x, y }
241 | // })
242 | // .for_each(|MacroblockWithPosition { block, x, y }| {
243 | // block.copy_to_yuv422_frame(&mut output_yuv_frame.lock().unwrap(), x, y);
244 | // });
245 | // })?;
246 |
247 | // Quantized DCT frame
248 | buffers[2]
249 | .texture
250 | .with_lock(None, |buffer: &mut [u8], _pitch: usize| {
251 | let output_yuv_frame = Mutex::new(MutableYUVFrame::new(
252 | video_width as usize,
253 | video_height as usize,
254 | buffer,
255 | ));
256 |
257 | YUVFrameMacroblockIterator::new(&yuv_frame)
258 | .par_bridge()
259 | .into_par_iter()
260 | .map(|MacroblockWithPosition { block, x, y }| {
261 | let quantized_block = quantize_macroblock(&block, quality);
262 | let output_macroblock = Macroblock {
263 | y0: quantized_block.y0.map(|row| row.map(|val| val as u8)),
264 | y1: quantized_block.y1.map(|row| row.map(|val| val as u8)),
265 | y2: quantized_block.y2.map(|row| row.map(|val| val as u8)),
266 | y3: quantized_block.y3.map(|row| row.map(|val| val as u8)),
267 | u: quantized_block.u.map(|row| row.map(|val| val as u8)),
268 | v: quantized_block.v.map(|row| row.map(|val| val as u8)),
269 | };
270 |
271 | MacroblockWithPosition {
272 | block: output_macroblock,
273 | x,
274 | y,
275 | }
276 | })
277 | .for_each(|MacroblockWithPosition { block, x, y }| {
278 | block.copy_to_yuv422_frame(&mut output_yuv_frame.lock().unwrap(), x, y);
279 | });
280 | })?;
281 |
282 | // Reconstructed
283 | buffers[3]
284 | .texture
285 | .with_lock(None, |buffer: &mut [u8], _pitch: usize| {
286 | let output_yuv_frame = Mutex::new(MutableYUVFrame::new(
287 | video_width as usize,
288 | video_height as usize,
289 | buffer,
290 | ));
291 |
292 | YUVFrameMacroblockIterator::new(&yuv_frame)
293 | .par_bridge()
294 | .into_par_iter()
295 | .map(|MacroblockWithPosition { block, x, y }| {
296 | // let quality =
297 | // calculate_quality(x, y, video_width as usize, video_height as usize);
298 |
299 | let quantized_block = quantize_macroblock(&block, quality);
300 | let dequantized_block = dequantize_macroblock(&quantized_block, quality);
301 | let output_macroblock = Macroblock {
302 | y0: dequantized_block.y0.map(|row| row.map(|val| val as u8)),
303 | y1: dequantized_block.y1.map(|row| row.map(|val| val as u8)),
304 | y2: dequantized_block.y2.map(|row| row.map(|val| val as u8)),
305 | y3: dequantized_block.y3.map(|row| row.map(|val| val as u8)),
306 | u: dequantized_block.u.map(|row| row.map(|val| val as u8)),
307 | v: dequantized_block.v.map(|row| row.map(|val| val as u8)),
308 | };
309 |
310 | MacroblockWithPosition {
311 | block: output_macroblock,
312 | x,
313 | y,
314 | }
315 | })
316 | .for_each(|MacroblockWithPosition { block, x, y }| {
317 | block.copy_to_yuv422_frame(&mut output_yuv_frame.lock().unwrap(), x, y);
318 | });
319 | })?;
320 |
321 | // Render all buffers
322 | canvas.clear();
323 | for buffer in &buffers {
324 | canvas.copy(&buffer.texture, None, buffer.rect)?;
325 | }
326 | canvas.present();
327 |
328 | // delay to hit target FPS
329 | let target_latency = Duration::from_secs_f64(1.0 / decoder.frame_rate() as f64);
330 | let elapsed = start_time.elapsed();
331 | if elapsed < target_latency {
332 | std::thread::sleep(target_latency - elapsed);
333 | } else {
334 | log::warn!(
335 | "Sender took too long sending; overshot frame deadline by {} ms",
336 | (elapsed - target_latency).as_millis()
337 | );
338 | }
339 | }
340 |
341 | Ok(())
342 | }
343 |
--------------------------------------------------------------------------------
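A note on the stages that `dct-pipeline.rs` above visualizes: each 8x8 block is DCT-transformed, divided by a quality-scaled quantization matrix and rounded (the lossy step), then multiplied back out for the reconstructed pane. The sketch below replays that round-trip on one synthetic coefficient block. It is self-contained and illustrative only: the table is the standard JPEG luminance quantization table, and dividing the table by `quality` is an assumed scaling rule, not necessarily the one used by `quality_scaled_q_matrix` in the crate.

```rust
// Illustrative quantize/dequantize round-trip on a single 8x8 DCT block.
// LUMA_Q is the standard JPEG luminance quantization table; the quality
// scaling rule below is an assumption for the sketch.
const LUMA_Q: [[f64; 8]; 8] = [
    [16.0, 11.0, 10.0, 16.0, 24.0, 40.0, 51.0, 61.0],
    [12.0, 12.0, 14.0, 19.0, 26.0, 58.0, 60.0, 55.0],
    [14.0, 13.0, 16.0, 24.0, 40.0, 57.0, 69.0, 56.0],
    [14.0, 17.0, 22.0, 29.0, 51.0, 87.0, 80.0, 62.0],
    [18.0, 22.0, 37.0, 56.0, 68.0, 109.0, 103.0, 77.0],
    [24.0, 35.0, 55.0, 64.0, 81.0, 104.0, 113.0, 92.0],
    [49.0, 64.0, 78.0, 87.0, 103.0, 121.0, 120.0, 101.0],
    [72.0, 92.0, 95.0, 98.0, 112.0, 100.0, 103.0, 99.0],
];

/// Quantize one block: divide each coefficient by the quality-scaled table
/// entry and round. Lower quality -> larger divisors -> more zeros.
fn quantize(coeffs: &[[f64; 8]; 8], quality: f64) -> [[i32; 8]; 8] {
    let mut out = [[0i32; 8]; 8];
    for i in 0..8 {
        for j in 0..8 {
            out[i][j] = (coeffs[i][j] / (LUMA_Q[i][j] / quality)).round() as i32;
        }
    }
    out
}

/// Invert quantization (up to rounding error): multiply back by the divisor.
fn dequantize(q: &[[i32; 8]; 8], quality: f64) -> [[f64; 8]; 8] {
    let mut out = [[0f64; 8]; 8];
    for i in 0..8 {
        for j in 0..8 {
            out[i][j] = q[i][j] as f64 * (LUMA_Q[i][j] / quality);
        }
    }
    out
}

fn main() {
    // A fake DCT block: strong DC term, decaying AC terms.
    let mut dct = [[0f64; 8]; 8];
    for i in 0..8 {
        for j in 0..8 {
            dct[i][j] = 400.0 / (1.0 + (i + j) as f64);
        }
    }
    for &quality in &[0.04, 0.3, 1.0] {
        let q = quantize(&dct, quality);
        let zeros = q.iter().flatten().filter(|&&v| v == 0).count();
        let err: f64 = dequantize(&q, quality)
            .iter()
            .flatten()
            .zip(dct.iter().flatten())
            .map(|(a, b)| (a - b).abs())
            .sum();
        println!("quality {quality:>4}: {zeros} zero coefficients, total abs error {err:.1}");
    }
}
```

Running it shows how a low quality factor zeroes out most high-frequency coefficients, which is exactly what the "Quantized DCT" pane makes visible.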
/rust-userspace/src/bin/recv.rs:
--------------------------------------------------------------------------------
1 | // ----------------------------------------------------------------------------
2 | // WARNING:
3 | // Documentation for this code is somewhat poor. This code receives livestream
4 | // data and displays it on the screen.
5 | // ----------------------------------------------------------------------------
6 |
7 | #![feature(generic_const_exprs)]
8 |
9 | use rust_userspace::*;
10 |
11 | use bytes::Buf;
12 | use sdl2::{self, pixels::{Color, PixelFormatEnum}, rect::Rect};
13 | use video::{decode_quantized_macroblock, dequantize_macroblock, MutableYUVFrame};
14 | use zerocopy::{FromBytes, Immutable, IntoBytes, KnownLayout};
15 | use std::{net::Ipv4Addr, time::Duration};
16 |
17 | #[derive(FromBytes, Debug, IntoBytes, Immutable, KnownLayout)]
18 | #[repr(C)]
19 | struct VideoPacket {
20 | data: [u8; 1504]
21 | }
22 |
23 | fn main() -> std::io::Result<()> {
24 | rust_userspace::init_logger(false);
25 |
26 | let (bpf_write_channel, bpf_receive_channel) = std::sync::mpsc::channel();
27 | std::thread::spawn(move || {
28 | log::info!("Starting BPF thread");
29 | let bpf_handle = unsafe { bpf::init().unwrap() };
30 | log::info!("BPF map found and opened");
31 | loop {
32 | match bpf_receive_channel.recv() {
33 | Ok(val) => bpf_handle.write_to_map(0, val).unwrap(),
34 | Err(_) => break,
35 | }
36 | }
37 | });
38 |
39 | // Remove packet loss when setting up network connections.
40 | bpf_write_channel.send(0).unwrap();
41 |
42 | let sdl_context = sdl2::init().unwrap();
43 |
44 | sdl2::hint::set_video_minimize_on_focus_loss(false);
45 | let video_subsystem = sdl_context.video().unwrap();
46 |
47 | // let audio_subsystem = sdl_context.audio().unwrap();
48 | // let _audio = audio::play_audio(&audio_subsystem);
49 |
50 | let display_mode = video_subsystem.desktop_display_mode(0).unwrap();
51 |
52 | let window = video_subsystem.window("rust-userspace", display_mode.w as u32, display_mode.h as u32)
53 | .position_centered().fullscreen_desktop()
54 | .build().unwrap();
55 | // we don't use vsync here because my monitor runs at 120hz and I don't want to stream at that rate
56 |
57 | let mut renderer = window.into_canvas().accelerated().build().unwrap();
58 |
59 | // Get the window's current size
60 | let (window_width, window_height) = renderer.output_size().unwrap();
61 |
62 | // Calculate the position to center the texture at its original resolution
63 | let x = (window_width - VIDEO_WIDTH) / 2;
64 | let y = (window_height - VIDEO_HEIGHT) / 2;
65 |
66 | // Create a destination rectangle for the texture at its original size
67 | let dest_rect = Rect::new(x as i32, y as i32, VIDEO_WIDTH, VIDEO_HEIGHT);
68 |
69 | let texture_creator = renderer.texture_creator();
70 | let mut texture = texture_creator.create_texture_streaming(PixelFormatEnum::YUY2, VIDEO_WIDTH, VIDEO_HEIGHT).unwrap();
71 |
72 | let video_recieving_socket = udp_connect_retry((Ipv4Addr::UNSPECIFIED, RECV_VIDEO_PORT));
73 | video_recieving_socket.connect((SEND_IP, SEND_VIDEO_PORT)).unwrap();
74 | let video_receiver = rtp::RtpSlicePayloadReceiver::::new(video_recieving_socket);
75 |
76 | let sender_communication_socket = udp_connect_retry((Ipv4Addr::UNSPECIFIED, RECV_CONTROL_PORT));
77 | sender_communication_socket.connect((SEND_IP, SEND_CONTROL_PORT)).unwrap();
78 |
79 | log::info!("Sender connected to control server from {:?}", sender_communication_socket.local_addr().unwrap());
80 |
81 | let mut frame_count = 0;
82 | let mut typing_metrics = wpm::TypingMetrics::new();
83 | loop {
84 | let start_time = std::time::Instant::now();
85 |
86 | // Handle input
87 |
88 | let mut event_pump = sdl_context.event_pump().unwrap();
89 | for event in event_pump.poll_iter() {
90 | match event {
91 | sdl2::event::Event::Quit {..} => return Ok(()),
92 | sdl2::event::Event::KeyDown { keycode, repeat: false, timestamp: _, .. } => {
93 | match keycode {
94 | Some(k) => {
95 | let ik = k.into_i32();
96 | typing_metrics.receive_char_stroke(ik);
97 | },
98 | _ => {}
99 | }
100 | },
101 | _ => {}
102 | }
103 | }
104 |
105 | let wpm = typing_metrics.calc_wpm();
106 | log::info!("WPM: {}", wpm);
107 |
108 | let bpf_drop_rate = wpm::wpm_to_drop_amt(wpm);
109 | log::info!("BPF drop rate: {} ({})", bpf_drop_rate, (bpf_drop_rate as f64 / u32::MAX as f64) * 100.0);
110 |
111 | match bpf_write_channel.send(bpf_drop_rate) {
112 | Ok(_) => {},
113 | Err(_) => {
114 | log::error!("Failed to send BPF drop rate to BPF thread");
115 | },
116 | }
117 |
118 | // send desired quality to sender
119 | let quality = wpm::wpm_to_jpeg_quality(wpm);
120 | let control_msg = ControlMessage { quality };
121 | udp_send(&sender_communication_socket, control_msg.as_bytes());
122 | log::debug!("Sent quality update: {}", quality);
123 |
124 | // Draw video
125 |
126 | renderer.set_draw_color(wpm::wpm_to_sdl_color(wpm, Color::GREEN));
127 | renderer.clear();
128 |
129 | texture.with_lock(None, |buffer: &mut [u8], _pitch: usize| {
130 | let mut locked_video_receiver = video_receiver.lock_receiver();
131 | let mut output_yuv_frame = MutableYUVFrame::new(VIDEO_WIDTH as usize, VIDEO_HEIGHT as usize, buffer);
132 |
133 | // If the circular buffer hasn't seen enough future packets, wait for more to arrive
134 | // Handles the case: sender is falling behind in sending packets.
135 | if locked_video_receiver.early_latest_span() < 20 {
136 | log::info!("Sleeping and waiting for more packets to arrive. Early-latest span {}", locked_video_receiver.early_latest_span());
137 | return;
138 | }
139 |
140 | log::info!("Playing frame {}", frame_count);
141 |
142 | const BLOCK_WRITTEN_WIDTH: usize = (VIDEO_WIDTH as usize) / MACROBLOCK_X_DIM;
143 | const BLOCK_WRITTEN_HEIGHT: usize = (VIDEO_HEIGHT as usize) / MACROBLOCK_Y_DIM;
144 |
145 | let mut block_written = [[false; BLOCK_WRITTEN_WIDTH]; BLOCK_WRITTEN_HEIGHT];
146 |
147 | let mut packet_index = 0usize;
148 | while (packet_index as u32) < (VIDEO_HEIGHT * VIDEO_WIDTH * PIXEL_WIDTH as u32 / MACROBLOCK_BYTE_SIZE as u32) {
149 | // if we have a packet with a higher frame number, earlier packets have been dropped from the circular buffer
150 | // so redraw the current frame with more up-to-date packets (and skip ahead to a later frame)
151 | // Handles the case: receiver is falling behind in consuming packets.
152 | log::trace!("Playing Frame {frame_count} packet index: {}", packet_index);
153 |
154 | if let Some(p) = locked_video_receiver.peek_earliest_packet() {
155 | let mut cursor = &p.data[..];
156 | let packet_frame_count = cursor.get_u32();
157 | if packet_frame_count > frame_count {
158 | log::warn!("Skipping ahead to frame {}", packet_frame_count);
159 | frame_count = packet_frame_count;
160 | packet_index = 0;
161 | }
162 | }
163 |
164 | let packet = locked_video_receiver.consume_earliest_packet();
165 | if let Some(packet) = packet.get_data() {
166 | // copy the packet data into the buffer
167 | let mut cursor = &packet.data[..];
168 | log::trace!("Packet slice has length {}", cursor.len());
169 |
170 | let _packet_frame_count = cursor.get_u32();
171 | loop {
172 | let x = cursor.get_u16() as usize;
173 | let y = cursor.get_u16() as usize;
174 |
175 | if (x == u16::MAX as usize) && (y == u16::MAX as usize) {
176 | break;
177 | }
178 | let quality = cursor.get_f64();
179 |
180 | block_written[y / MACROBLOCK_Y_DIM][x / MACROBLOCK_X_DIM] = true;
181 |
182 | let decoded_quantized_macroblock;
183 | (decoded_quantized_macroblock, cursor) = decode_quantized_macroblock(&cursor);
184 | let macroblock = dequantize_macroblock(&decoded_quantized_macroblock, quality);
185 | macroblock.copy_to_yuv422_frame(&mut output_yuv_frame, x, y);
186 | packet_index += 1;
187 | }
188 | }
189 | else {
190 | // TODO: fix this hack
191 | // roughly 40 macroblocks per packet are packed in
192 | packet_index += 40;
193 | }
194 | }
195 |
196 | frame_count += 1;
197 | }).unwrap();
198 |
199 | renderer.copy(&texture, None, dest_rect).unwrap();
200 | renderer.present();
201 |
202 | let elapsed = start_time.elapsed();
203 | log::info!("Recieved and drew frame {} in {} ms", frame_count, elapsed.as_millis());
204 | // delay to hit target FPS
205 | let target_latency = Duration::from_secs_f64(1.0 / VIDEO_FPS_TARGET);
206 | if elapsed < target_latency {
207 | std::thread::sleep(target_latency - elapsed);
208 | } else {
209 | log::warn!("Receiver took too long presenting; overshot frame deadline by {} ms", (elapsed - target_latency).as_millis());
210 | }
211 | }
212 | }
213 |
--------------------------------------------------------------------------------
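The inner loop of `recv.rs` above decodes a simple application-level layout inside each RTP payload: a `u32` frame number, then repeated records of `u16 x`, `u16 y`, `f64 quality`, and an encoded macroblock, terminated by `x = y = u16::MAX` (which `send.rs` appends before shipping a packet). Below is a minimal sketch of that framing with the `bytes` crate; the length-prefixed blob standing in for the macroblock body is an assumption made only to keep the sketch self-contained, since `encode_quantized_macroblock`'s wire format is defined elsewhere in the crate.

```rust
// Sketch of the per-packet layout parsed by recv.rs: u32 frame number, then
// repeated (u16 x, u16 y, f64 quality, macroblock bytes), terminated by
// x = y = u16::MAX. The real macroblock encoding comes from
// encode_quantized_macroblock; a length-prefixed blob stands in for it here.
use bytes::{Buf, BufMut};

fn build_packet(frame: u32, blocks: &[(u16, u16, f64, Vec<u8>)]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.put_u32(frame);
    for (x, y, quality, body) in blocks {
        buf.put_u16(*x);
        buf.put_u16(*y);
        buf.put_f64(*quality);
        buf.put_u16(body.len() as u16); // stand-in for the real encoding
        buf.put_slice(body);
    }
    buf.put_u16(u16::MAX); // terminator, as appended in send.rs
    buf.put_u16(u16::MAX);
    buf
}

fn parse_packet(mut cursor: &[u8]) {
    let frame = cursor.get_u32();
    println!("frame {frame}");
    loop {
        let (x, y) = (cursor.get_u16(), cursor.get_u16());
        if x == u16::MAX && y == u16::MAX {
            break;
        }
        let quality = cursor.get_f64();
        let len = cursor.get_u16() as usize;
        let _body = &cursor[..len];
        cursor.advance(len);
        println!("  macroblock at ({x}, {y}), quality {quality}");
    }
}

fn main() {
    let pkt = build_packet(7, &[(0, 0, 0.3, vec![1, 2, 3]), (16, 0, 0.3, vec![4, 5])]);
    parse_packet(&pkt);
}
```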
/rust-userspace/src/bin/send.rs:
--------------------------------------------------------------------------------
1 | // ----------------------------------------------------------------------------
2 | // WARNING:
3 | // Documentation for this code is somewhat poor. This code sends livestream
4 | // data.
5 | // ----------------------------------------------------------------------------
6 |
7 | #![feature(generic_const_exprs)]
8 |
9 | use rayon::iter::ParallelBridge;
10 | use rayon::iter::ParallelIterator;
11 | use rtp::RtpSlicePayloadSender;
12 | use rust_userspace::*;
13 |
14 | use bytes::BufMut;
15 | use std::convert::Infallible;
16 | use std::net::Ipv4Addr;
17 | use std::net::UdpSocket;
18 | use std::sync::Arc;
19 | use std::sync::Mutex;
20 | use std::sync::RwLock;
21 | use std::time::Duration;
22 | use video::{
23 | encode_quantized_macroblock, quantize_macroblock, MacroblockWithPosition, YUVFrame,
24 | YUVFrameMacroblockIterator,
25 | };
26 | use zerocopy::FromBytes;
27 |
28 | fn main() -> std::io::Result<()> {
29 | rust_userspace::init_logger(true);
30 | send_video();
31 |
32 | Ok(())
33 | }
34 |
35 | fn receive_control(quality: Arc<RwLock<f64>>, stream: UdpSocket) {
36 | let mut msg_buf = [0; size_of::<ControlMessage>()];
37 | log::info!("Listening for control server!");
38 | loop {
39 | stream.recv(&mut msg_buf).unwrap();
40 | let control_msg = ControlMessage::ref_from_bytes(&msg_buf).unwrap();
41 | log::debug!("Received quality update: {}", control_msg.quality);
42 | *quality.write().unwrap() = control_msg.quality;
43 | }
44 | }
45 |
46 | struct DummyWebcam {
47 | frame_count: u32,
48 | frame: Vec<u8>,
49 | }
50 |
51 | impl DummyWebcam {
52 | fn new(height: usize, width: usize) -> Self {
53 | Self {
54 | frame_count: 0,
55 | frame: Vec::with_capacity(height * width * 2),
56 | }
57 | }
58 |
59 | fn capture(&mut self) -> Result<&[u8], Infallible> {
60 | self.frame_count += 1;
61 | self.frame.clear();
62 | self.frame
63 | .resize(VIDEO_WIDTH as usize * VIDEO_HEIGHT as usize * 2, 0);
64 | let frame = self.frame.as_mut_slice();
65 |
66 | for y in 0..VIDEO_HEIGHT as usize {
67 | for x in 0..VIDEO_WIDTH as usize {
68 | let pixel = &mut frame[(y * VIDEO_WIDTH as usize + x) * PIXEL_WIDTH as usize
69 | ..(y * VIDEO_WIDTH as usize + x) * PIXEL_WIDTH as usize + 2];
70 | pixel[0] = ((x + (self.frame_count as usize)) % u8::MAX as usize) as u8;
71 | pixel[1] = ((y + (2 * self.frame_count as usize)) % u8::MAX as usize) as u8;
72 | }
73 | }
74 |
75 | Ok(self.frame.as_slice())
76 | }
77 | }
78 |
79 | const FRAME_CIRCULAR_BUFFER_SIZE: usize =
80 | VIDEO_FRAME_DELAY * VIDEO_HEIGHT as usize * VIDEO_WIDTH as usize * 2;
81 |
82 | struct FrameCircularBuffer {
83 | buffer: Box<[u8; FRAME_CIRCULAR_BUFFER_SIZE]>,
84 | start_frame_num: usize,
85 | end_frame_num: usize,
86 | }
87 |
88 | impl FrameCircularBuffer {
89 | pub fn new() -> Self {
90 | Self {
91 | buffer: Box::new([0; FRAME_CIRCULAR_BUFFER_SIZE]),
92 | start_frame_num: 0,
93 | end_frame_num: 0,
94 | }
95 | }
96 |
97 | pub fn push_frame(&mut self, frame: &[u8]) {
98 | if self.start_frame_num == (self.end_frame_num + 1) % VIDEO_FRAME_DELAY {
99 | log::error!("Frame buffer full; dropping frame");
100 | return;
101 | }
102 |
103 | // copy frame into buffer
104 | self.buffer[(self.end_frame_num * VIDEO_HEIGHT as usize * VIDEO_WIDTH as usize * 2)
105 | ..((self.end_frame_num + 1) * VIDEO_HEIGHT as usize * VIDEO_WIDTH as usize * 2)]
106 | .copy_from_slice(frame);
107 | self.end_frame_num = (self.end_frame_num + 1) % VIDEO_FRAME_DELAY;
108 | }
109 |
110 | pub fn pop_frame(&mut self) -> Option<&[u8]> {
111 | if self.start_frame_num != self.end_frame_num {
112 | let frame = &self.buffer[(self.start_frame_num
113 | * VIDEO_HEIGHT as usize
114 | * VIDEO_WIDTH as usize
115 | * 2)
116 | ..((self.start_frame_num + 1) * VIDEO_HEIGHT as usize * VIDEO_WIDTH as usize * 2)];
117 | self.start_frame_num = (self.start_frame_num + 1) % VIDEO_FRAME_DELAY;
118 | Some(frame)
119 | } else {
120 | None
121 | }
122 | }
123 | }
124 |
125 | pub fn send_video() {
126 | log::info!("Starting camera!");
127 |
128 | let mut camera = rscam::Camera::new("/dev/video1").unwrap();
129 |
130 | // dbg!(camera
131 | // .intervals(b"YUYV", (VIDEO_WIDTH as _, VIDEO_HEIGHT as _))
132 | // .expect("interval information is available"));
133 |
134 | camera
135 | .start(&rscam::Config {
136 | interval: (1, VIDEO_FPS_TARGET as _),
137 | resolution: (VIDEO_WIDTH as _, VIDEO_HEIGHT as _),
138 | format: b"YUYV",
139 | ..Default::default()
140 | })
141 | .unwrap();
142 |
143 | let sock = udp_connect_retry((Ipv4Addr::UNSPECIFIED, SEND_VIDEO_PORT));
144 | sock.connect((RECV_IP, RECV_VIDEO_PORT)).unwrap();
145 |
146 | let receiver_communication_socket =
147 | udp_connect_retry((Ipv4Addr::UNSPECIFIED, SEND_CONTROL_PORT));
148 | receiver_communication_socket
149 | .connect((RECV_IP, RECV_CONTROL_PORT))
150 | .unwrap();
151 |
152 | let quality = Arc::new(RwLock::new(0.3));
153 | let cloned_quality = quality.clone();
154 | std::thread::spawn(|| {
155 | receive_control(cloned_quality, receiver_communication_socket);
156 | });
157 |
158 | let mut sender: RtpSlicePayloadSender = rtp::RtpSender::new(sock);
159 | let sender = Arc::new(Mutex::new(&mut sender));
160 |
161 | let mut frame_delay_buffer = FrameCircularBuffer::new();
162 | let mut frame_count = 0;
163 |
164 | let mut dummy_camera = DummyWebcam::new(VIDEO_HEIGHT as usize, VIDEO_WIDTH as usize);
165 | for _ in 0..VIDEO_FRAME_DELAY {
166 | let frame = dummy_camera.capture().unwrap();
167 | frame_delay_buffer.push_frame(frame.as_ref());
168 | }
169 | drop(dummy_camera);
170 |
171 | loop {
172 | let start_time = std::time::Instant::now();
173 |
174 | let frame = camera.capture().unwrap();
175 | let frame: &[u8] = frame.as_ref();
176 | frame_delay_buffer.push_frame(frame);
177 |
178 | let frame = match frame_delay_buffer.pop_frame() {
179 | Some(frame) => frame,
180 | None => panic!("Frame buffer empty"),
181 | };
182 |
183 | assert!(frame.len() % (VIDEO_WIDTH * PIXEL_WIDTH as u32) as usize == 0);
184 | assert!(frame.len() / (VIDEO_WIDTH as usize * PIXEL_WIDTH) == VIDEO_HEIGHT as usize);
185 |
186 |
187 | let frame = YUVFrame::new(VIDEO_WIDTH as usize, VIDEO_HEIGHT as usize, frame);
188 |
189 | fn process_block(
190 | quality: Arc<RwLock<f64>>,
191 | frame: &YUVFrame<'_>,
192 | frame_count: u32,
193 | x: usize,
194 | y: usize,
195 | x_end: usize,
196 | y_end: usize,
197 | sender: Arc>>,
198 | packet_buf: Arc<Mutex<Vec<u8>>>,
199 | ) {
200 | let mut current_macroblock_buf = Vec::with_capacity(PACKET_PAYLOAD_SIZE_THRESHOLD);
201 |
202 | for MacroblockWithPosition { x, y, block } in
203 | YUVFrameMacroblockIterator::new_with_bounds(frame, x, y, x_end, y_end)
204 | {
205 | current_macroblock_buf.clear();
206 |
207 | // read the current quality value, which the control thread
208 | // updates from the receiver's ControlMessages
209 | let quality = quality.read().unwrap().clone();
210 |
211 | let quantized_macroblock = quantize_macroblock(&block, quality);
212 |
213 | current_macroblock_buf.put_u16(x as u16);
214 | current_macroblock_buf.put_u16(y as u16);
215 | current_macroblock_buf.put_f64(quality);
216 | encode_quantized_macroblock(&quantized_macroblock, &mut current_macroblock_buf);
217 |
218 | let mut packet_buf = packet_buf.lock().unwrap();
219 | if packet_buf.len() + current_macroblock_buf.len() + 2 * size_of::<u16>() >= PACKET_PAYLOAD_SIZE_THRESHOLD {
220 | // send the packet and start a new one
221 | packet_buf.put_u16(u16::MAX);
222 | packet_buf.put_u16(u16::MAX);
223 |
224 | sender.lock().unwrap().send_bytes(|mem| {
225 | mem[..packet_buf.len()].copy_from_slice(&packet_buf);
226 | packet_buf.len()
227 | });
228 | packet_buf.clear();
229 | packet_buf.put_u32(frame_count);
230 | }
231 |
232 | // The macroblock consists of x, y, and the encoded macroblock
233 | // log::trace!(
234 | // "Storing macroblock at ({}, {}, {}) at cursor position {}",
235 | // frame_count,
236 | // x,
237 | // y,
238 | // packet_buf.len()
239 | // );
240 | packet_buf.put_slice(¤t_macroblock_buf);
241 | }
242 | }
243 |
244 | const PAR_PACKET_SPAN: usize = 16;
245 | assert!(PAR_PACKET_SPAN % MACROBLOCK_X_DIM == 0);
246 | assert!(PAR_PACKET_SPAN % MACROBLOCK_Y_DIM == 0);
247 |
248 | let mut packet_buf = Vec::with_capacity(PACKET_PAYLOAD_SIZE_THRESHOLD);
249 | packet_buf.put_u32(frame_count);
250 |
251 | let packet_buf = Arc::new(Mutex::new(packet_buf));
252 |
253 | let start_seq = sender.lock().unwrap().seq_num();
254 |
255 | (0..VIDEO_WIDTH as u32)
256 | .step_by(PAR_PACKET_SPAN)
257 | .par_bridge()
258 | .for_each(|x| {
259 | (0..VIDEO_HEIGHT as u32)
260 | .step_by(PAR_PACKET_SPAN)
261 | .for_each(|y| {
262 | process_block(
263 | quality.clone(),
264 | &frame,
265 | frame_count,
266 | x as usize,
267 | y as usize,
268 | x as usize + PAR_PACKET_SPAN,
269 | y as usize + PAR_PACKET_SPAN,
270 | sender.clone(),
271 | packet_buf.clone(),
272 | );
273 | });
274 | });
275 |
276 | // send leftover packet, if any
277 | let mut packet_buf = packet_buf.lock().unwrap();
278 | if packet_buf.len() > 4 {
279 | packet_buf.put_u16(u16::MAX);
280 | packet_buf.put_u16(u16::MAX);
281 |
282 | sender.lock().unwrap().send_bytes(|mem| {
283 | mem[..packet_buf.len()].copy_from_slice(&packet_buf);
284 | packet_buf.len()
285 | });
286 | }
287 |
288 | let elapsed = start_time.elapsed();
289 | log::info!("Sent frame {} in seq {}-{} in {} ms", frame_count, start_seq, sender.lock().unwrap().seq_num(), elapsed.as_millis());
290 |
291 | // delay to hit target FPS
292 | let target_latency = Duration::from_secs_f64(1.0 / VIDEO_FPS_TARGET);
293 | if elapsed < target_latency {
294 | std::thread::sleep(target_latency - elapsed);
295 | } else {
296 | log::warn!(
297 | "Sender took too long sending; overshot frame deadline by {} ms",
298 | (elapsed - target_latency).as_millis()
299 | );
300 | }
301 | frame_count += 1;
302 | }
303 | }
304 |
--------------------------------------------------------------------------------
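`send.rs` above introduces the intentional delay by pushing every captured frame through `FrameCircularBuffer`, a flat byte array addressed with `start_frame_num`/`end_frame_num` modulo `VIDEO_FRAME_DELAY`. The sketch below restates the same index arithmetic generically (it is not the repo's type); like the original, it leaves one slot unused so that `start == end` can unambiguously mean "empty", which is also why pre-filling with `VIDEO_FRAME_DELAY` dummy frames logs one "buffer full" drop.

```rust
// Minimal restatement of the FrameCircularBuffer index arithmetic from send.rs,
// generic over the element type instead of raw frame bytes. One slot is kept
// empty so that start == end unambiguously means "empty".
struct Ring<T, const N: usize> {
    slots: [Option<T>; N],
    start: usize,
    end: usize,
}

impl<T, const N: usize> Ring<T, N> {
    fn new() -> Self {
        Self { slots: std::array::from_fn(|_| None), start: 0, end: 0 }
    }

    fn push(&mut self, value: T) -> bool {
        if self.start == (self.end + 1) % N {
            return false; // full: same check as push_frame's "dropping frame" branch
        }
        self.slots[self.end] = Some(value);
        self.end = (self.end + 1) % N;
        true
    }

    fn pop(&mut self) -> Option<T> {
        if self.start == self.end {
            return None; // empty: same check as pop_frame
        }
        let value = self.slots[self.start].take();
        self.start = (self.start + 1) % N;
        value
    }
}

fn main() {
    let mut ring: Ring<u32, 4> = Ring::new(); // holds at most 3 items
    for frame in 0..5 {
        println!("push {frame}: {}", ring.push(frame));
    }
    while let Some(frame) = ring.pop() {
        println!("pop {frame}");
    }
}
```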
/rust-userspace/src/bpf.rs:
--------------------------------------------------------------------------------
1 | use std::{
2 | ffi::CStr,
3 | os::raw::{c_int, c_void},
4 | };
5 |
6 | use libbpf_sys::{bpf_obj_get, BPF_ANY};
7 |
8 | const BPF_MAP_NAME: &CStr = c"/sys/fs/bpf/tc/globals/map_keymash";
9 |
10 | #[derive(Debug)]
11 | pub struct BpfHandle {
12 | map_fd: c_int,
13 | }
14 |
15 | #[derive(Debug, Clone, Copy)]
16 | pub enum BpfError {
17 | LoadMap(c_int),
18 | MapWrite(c_int),
19 | }
20 |
21 | /// Opens the eBPF map.
22 | pub unsafe fn init() -> Result<BpfHandle, BpfError> {
23 | let res = bpf_obj_get(BPF_MAP_NAME.as_ptr());
24 | if res < 0 {
25 | log::error!("Failed to load BPF map {BPF_MAP_NAME:?}: {}", res);
26 | return Err(BpfError::LoadMap(res));
27 | }
28 | Ok(BpfHandle { map_fd: res })
29 | }
30 |
31 | impl BpfHandle {
32 | /// Write a key-value pair to the eBPF map.
33 | pub fn write_to_map(&self, key: u32, value: u32) -> Result<(), BpfError> {
34 | unsafe {
35 | let res = libbpf_sys::bpf_map_update_elem(
36 | self.map_fd,
37 | &key as *const u32 as *const c_void,
38 | &value as *const u32 as *const c_void,
39 | BPF_ANY.into(),
40 | );
41 | if res != 0 {
42 | log::error!("Failed to write to BPF map: {}", res);
43 | return Err(BpfError::MapWrite(res));
44 | }
45 | }
46 | Ok(())
47 | }
48 | }
49 |
50 | impl Drop for BpfHandle {
51 | fn drop(&mut self) {
52 | unsafe {
53 | libc::close(self.map_fd);
54 | }
55 | }
56 | }
57 |
--------------------------------------------------------------------------------
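`bpf.rs` above is a thin wrapper over `libbpf-sys`: it opens the map pinned by `bpf/setup-tc.sh` and updates the value at key 0, which the tc-attached filter reads as a drop probability (`recv.rs` logs the value as a fraction of `u32::MAX`). A hedged usage sketch follows; the 25% figure is arbitrary, and in the exhibit the value comes from `wpm_to_drop_amt`.

```rust
// Sketch of how recv.rs drives the map from bpf.rs: open the pinned map once,
// then write a drop probability (scaled to u32::MAX) at key 0. Requires the map
// pinned by bpf/setup-tc.sh and enough privileges to open it.
use rust_userspace::bpf;

fn main() {
    // SAFETY: init only performs a bpf_obj_get on the pinned map path.
    let handle = unsafe { bpf::init().expect("pinned BPF map not found; run setup-tc.sh first") };

    // Drop roughly 25% of matched packets: value / u32::MAX is the drop
    // probability (recv.rs logs it the same way). The 25% here is arbitrary.
    let drop_rate = (u32::MAX as f64 * 0.25) as u32;
    handle.write_to_map(0, drop_rate).expect("failed to update BPF map");
}
```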
/rust-userspace/src/lib.rs:
--------------------------------------------------------------------------------
1 | #![feature(generic_const_exprs)]
2 |
3 | use std::{io::Write, net::UdpSocket, time::Duration};
4 |
5 | use simplelog::WriteLogger;
6 | use zerocopy::{FromBytes, Immutable, IntoBytes, KnownLayout};
7 |
8 | pub mod audio;
9 | pub mod bpf;
10 | pub mod rtp;
11 | pub mod video;
12 | pub mod wpm;
13 |
14 | pub const VIDEO_WIDTH: u32 = 640;
15 | pub const VIDEO_HEIGHT: u32 = 480;
16 | pub const VIDEO_FPS_TARGET: f64 = 30.0;
17 |
18 | pub const VIDEO_DELAY: Duration = Duration::from_secs(1);
19 | // calculate frames per second, multiply by number of seconds to delay
20 | pub const VIDEO_FRAME_DELAY: usize = (VIDEO_FPS_TARGET * VIDEO_DELAY.as_secs() as f64) as usize;
21 |
22 | pub const LOG_LEVEL: log::LevelFilter = log::LevelFilter::Warn;
23 | pub const BUFFER_LOGS: bool = false;
24 |
25 | /// Maximum size of packet payloads. (Tries to correspond to Ethernet MTU)
26 | pub const PACKET_PAYLOAD_SIZE_THRESHOLD: usize = 1400;
27 |
28 | /// IP address of the machine running the `recv` binary.
29 | pub const RECV_IP: &str = "127.0.0.1";
30 | /// IP address of the machine running the `send` binary.
31 | pub const SEND_IP: &str = "127.0.0.1";
32 |
33 | /// Port on recv for audio data.
34 | pub const RECV_AUDIO_PORT: u16 = 44403;
35 | /// Port on send for audio data.
36 | pub const SEND_AUDIO_PORT: u16 = 44406;
37 | /// Port on recv for video data.
38 | pub const RECV_VIDEO_PORT: u16 = 44002;
39 | /// Port on send for video data.
40 | pub const SEND_VIDEO_PORT: u16 = 44001;
41 | /// Port on recv for control messages.
42 | pub const RECV_CONTROL_PORT: u16 = 51902;
43 | /// Port on send for control messages.
44 | pub const SEND_CONTROL_PORT: u16 = 44601;
45 |
46 | pub const PIXEL_WIDTH: usize = 2;
47 | pub const MACROBLOCK_X_DIM: usize = 16;
48 | pub const MACROBLOCK_Y_DIM: usize = 16;
49 | pub const MACROBLOCK_BYTE_SIZE: usize = MACROBLOCK_X_DIM * MACROBLOCK_Y_DIM * PIXEL_WIDTH;
50 |
51 | #[derive(FromBytes, IntoBytes, KnownLayout, Immutable, Debug, Clone, Copy)]
52 | pub struct ControlMessage {
53 | pub quality: f64,
54 | }
55 |
56 | pub fn init_logger(_is_send: bool) {
57 | let log_file_name = if _is_send { "send.log" } else { "recv.log" };
58 |
59 | let log_file: Box<dyn Write + Send> = if BUFFER_LOGS {
60 | Box::new(std::io::BufWriter::with_capacity(
61 | 65536 /* 64 KiB */,
62 | std::fs::File::create(log_file_name).unwrap()
63 | ))
64 | } else {
65 | Box::new(std::fs::File::create(log_file_name).unwrap())
66 | };
67 | WriteLogger::init(
68 | LOG_LEVEL,
69 | simplelog::Config::default(),
70 | log_file,
71 | )
72 | .unwrap();
73 | }
74 |
75 | pub fn udp_send(sock: &UdpSocket, buf: &[u8]) {
76 | if let Err(e) = sock.send(buf) {
77 | log::error!("Error sending packet from {:?} -> {:?}: {}", sock.peer_addr(), sock.local_addr(), e);
78 | }
79 | }
80 |
81 | pub fn udp_connect_retry<A>(addr: A) -> std::net::UdpSocket
82 | where
83 | A: std::net::ToSocketAddrs + std::fmt::Debug,
84 | {
85 | loop {
86 | if let Ok(s) = std::net::UdpSocket::bind(&addr) {
87 | break s;
88 | } else {
89 | log::error!("Failed to bind to {addr:?}; retrying in 2 second");
90 | std::thread::sleep(Duration::from_secs(2));
91 | }
92 | }
93 | }
94 |
--------------------------------------------------------------------------------
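`lib.rs` above fixes the wire contract between the two binaries: ports, the `ControlMessage` carrying the requested quality, and the `udp_connect_retry`/`udp_send` helpers. Below is a loopback sketch of that control path, using the same zerocopy round-trip as the binaries; the ports are arbitrary stand-ins rather than the exhibit's constants, and `read_from_bytes` is used instead of `ref_from_bytes` so the example does not depend on buffer alignment.

```rust
// Loopback sketch of the control path defined in lib.rs: serialize a
// ControlMessage with zerocopy on one socket, read it back on the other.
// Ports are arbitrary for the example; run inside the rust-userspace crate.
use rust_userspace::{udp_connect_retry, udp_send, ControlMessage};
use zerocopy::{FromBytes, IntoBytes};

fn main() {
    let receiver = udp_connect_retry(("127.0.0.1", 40001));
    let sender = udp_connect_retry(("127.0.0.1", 40002));
    sender.connect(("127.0.0.1", 40001)).unwrap();

    // Sender side: the quality the receiver asked for (0.3 is just an example).
    let msg = ControlMessage { quality: 0.3 };
    udp_send(&sender, msg.as_bytes());

    // Receiver side: read one datagram and reinterpret the bytes as a ControlMessage.
    // read_from_bytes copies, so the stack buffer's alignment does not matter.
    let mut buf = [0u8; std::mem::size_of::<ControlMessage>()];
    receiver.recv(&mut buf).unwrap();
    let received = ControlMessage::read_from_bytes(&buf).unwrap();
    println!("requested quality: {}", received.quality);
}
```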
/rust-userspace/src/rtp.rs:
--------------------------------------------------------------------------------
1 | //! An implementation for an RTP-like protocol. Has a strong emphasis on zero-copy ser/de of packets using [`zerocopy`] (because figuring it out was fun).
2 | //!
3 | //! Nomenclature:
4 | //! - Payload: The type of the data that is being sent.
5 | //! - Packet: A packet of data that is sent over the network. It contains a header and the payload data.
6 | //!
7 | //! Payloads are allowed to be unsized. However, since we maintain a buffer of packets, we must still know an upper-bound on the size of the payload.
8 | //! The `SLOT_SIZE` parameter in the types in this module represents this upper-bound. (Note that it is exclusive of packet metadata)
9 | //!
10 | //! Because unsized types do not have a fixed alignment, the types in this module have a type parameter `AlignPayloadTo` that represents a type that has the correct alignment for the payload.
11 | //! - Slices of data `[T]` have alignment of `T` (`Slice` variants of the structs in this module encode this concept).
12 | //! - For `dyn Trait` objects, use the type that the dyn was derived from.
13 | //!
14 | //! Read more about alignment in Rust [here](https://doc.rust-lang.org/reference/type-layout.html).
15 |
16 | use std::{
17 | fmt::Debug,
18 | marker::PhantomData,
19 | mem::offset_of,
20 | net::UdpSocket,
21 | num::NonZero,
22 | ops::{Deref, DerefMut},
23 | sync::{Arc, Mutex, MutexGuard},
24 | };
25 |
26 | use zerocopy::{byteorder::network_endian::U32, FromBytes, Unaligned};
27 | use zerocopy::{Immutable, IntoBytes, KnownLayout, TryFromBytes};
28 |
29 | #[derive(Debug, FromBytes, IntoBytes, KnownLayout, Immutable, Unaligned)]
30 | #[repr(C)]
31 | pub struct PacketHeader {
32 | sequence_number: U32,
33 | }
34 |
35 | #[derive(Debug, TryFromBytes, IntoBytes, KnownLayout, Immutable)]
36 | #[repr(C)]
37 | /// Represents a packet of data that is sent over the network.
38 | /// Payload is the type of data that is being sent. It must implement [`TryFromBytes`], [`IntoBytes`], [`KnownLayout`], and [`Immutable`] for efficient zero-copy ser/de.
39 | pub struct Packet<Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized> {
40 | /// [`accept_thread`] relies on the presence and type of the sequence number field.
41 | pub header: PacketHeader,
42 | pub data: Payload,
43 | }
44 |
45 | /// Convenience function for the size of a packet with a given payload type.
46 | pub const fn size_of_packet<Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable>() -> usize {
47 | std::mem::size_of::<Packet<Payload>>()
48 | }
49 |
50 | /// A buffer of bytes that is the size of a packet.
51 | ///
52 | /// The buffer is aligned for a packet with a payload of `PayloadAlignTo`.
53 | /// The `SLOT_SIZE` is the size of the packet slot in bytes. This size **is not inclusive** of packet metadata.
54 | struct AlignedPacketBytes<
55 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
56 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
57 | const SLOT_SIZE: usize,
58 | > where
59 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
60 | {
61 | _phantom: PhantomData<Payload>,
62 | _align: [Packet<AlignPayloadTo>; 0], // align to the alignment of the packet
63 | // TODO: statically assert that the alignment is correct - this is quite hard!
64 | inner: [u8; size_of_packet::<[u8; SLOT_SIZE]>()],
65 | }
66 |
67 |
68 | // A custom [`fmt::Debug`] impl since the auto-impl requires `AlignPayloadTo` to implement it.
69 | impl<
70 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
71 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
72 | const SLOT_SIZE: usize,
73 | > Debug for AlignedPacketBytes<Payload, AlignPayloadTo, SLOT_SIZE>
74 | where
75 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
76 | {
77 | fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
78 | f.debug_struct("PacketBytes")
79 | .field("inner", &self.inner)
80 | .finish()
81 | }
82 | }
83 |
84 | // Implement deref and deref_mut for the aligned packet bytes.
85 | impl<
86 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
87 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
88 | const SLOT_SIZE: usize,
89 | > Deref for AlignedPacketBytes<Payload, AlignPayloadTo, SLOT_SIZE>
90 | where
91 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
92 | {
93 | type Target = [u8];
94 |
95 | fn deref(&self) -> &Self::Target {
96 | &self.inner
97 | }
98 | }
99 |
100 | impl<
101 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
102 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
103 | const SLOT_SIZE: usize,
104 | > DerefMut for AlignedPacketBytes<Payload, AlignPayloadTo, SLOT_SIZE>
105 | where
106 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
107 | {
108 | fn deref_mut(&mut self) -> &mut Self::Target {
109 | &mut self.inner
110 | }
111 | }
112 |
113 | /// A packet buffer slot. See [`RtpCircularBuffer`].
114 | /// The `PACKET_SLOT_SIZE` is the size of the packet slot in bytes. This size **is not inclusive** of packet metadata.
115 | pub struct MaybeInitPacket<
116 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
117 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
118 | const SLOT_SIZE: usize,
119 | > where
120 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
121 | {
122 | /// Size of the received packet. Is None if the packet is not initialized.
123 | recv_size: Option<NonZero<usize>>,
124 | // align to the alignment of the packet
125 | packet: AlignedPacketBytes<Payload, AlignPayloadTo, SLOT_SIZE>,
126 | }
127 |
128 | impl<
129 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
130 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
131 | const SLOT_SIZE: usize,
132 | > MaybeInitPacket<Payload, AlignPayloadTo, SLOT_SIZE>
133 | where
134 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
135 | {
136 | pub fn is_init(&self) -> bool {
137 | self.recv_size.is_some()
138 | }
139 |
140 | pub fn get_data(&self) -> Option<&Packet<Payload>> {
141 | if let Some(len) = self.recv_size {
142 | Some(Packet::<Payload>::try_ref_from_bytes(&self.packet[..len.into()]).unwrap())
143 | } else {
144 | None
145 | }
146 | }
147 | }
148 |
149 | /// A circular buffer of RTP packets.
150 | /// Index into this buffer with a sequence number to get a packet.
151 | /// `SLOT_SIZE` is the size of the payload data in bytes. This size is exclusive of packet metadata.
152 | pub struct RtpCircularBuffer<
153 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
154 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
155 | const SLOT_SIZE: usize,
156 | const BUFFER_LENGTH: usize,
157 | > where
158 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
159 | {
160 | /// The sequence number of the earliest packet in the buffer.
161 | /// External users can fetch this through [`RtpCircularBuffer::earliest_seq`].
162 | earliest_seq: u32,
163 | /// The span of the earliest sequence number and the latest sequence number of a received packet in the buffer.
164 | /// This can be relied on as a hint for how full the buffer is (i.e., how far ahead is the latest received packet?).
165 | /// External users can fetch this through [`RtpCircularBuffer::early_latest_span`].
166 | early_latest_span: u32,
167 | buf: Box<[MaybeInitPacket<Payload, AlignPayloadTo, SLOT_SIZE>; BUFFER_LENGTH]>,
168 | }
169 |
170 | /// A packet that has been received and is ready to be consumed.
171 | /// Holds a reference to the buffer it came from. When dropped,
172 | /// the packet is consumed and deleted from the circular buffer.
173 |
174 | // This is a wrapper around a mutable reference to the circular buffer.
175 | // It exists for the custom [`Drop`] impl that consumes the packet.
176 | pub struct ReceivedPacket<
177 | 'a,
178 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
179 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
180 | const SLOT_SIZE: usize,
181 | const BUFFER_LENGTH: usize,
182 | >(&'a mut RtpCircularBuffer<Payload, AlignPayloadTo, SLOT_SIZE, BUFFER_LENGTH>)
183 | where
184 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized;
185 |
186 | impl<
187 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
188 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
189 | const SLOT_SIZE: usize,
190 | const BUFFER_LENGTH: usize,
191 | > ReceivedPacket<'_, Payload, AlignPayloadTo, SLOT_SIZE, BUFFER_LENGTH>
192 | where
193 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
194 | {
195 | /// Get a reference to the packet data from the buffer.
196 | pub fn get_data(&self) -> Option<&Packet<Payload>> {
197 | let rtp_receiver = &self.0;
198 |
199 | if let Some(MaybeInitPacket {
200 | recv_size: Some(packet_len),
201 | packet: p,
202 | ..
203 | }) = rtp_receiver.get(rtp_receiver.earliest_seq)
204 | {
205 | log::trace!("Getting data from seq {} with len {}", rtp_receiver.earliest_seq, packet_len);
206 | Some(Packet::<Payload>::try_ref_from_bytes(&p[..((*packet_len).into())]).unwrap())
207 | } else {
208 | None
209 | }
210 | }
211 | }
212 |
213 | // [`Drop`] impl for [`ReceivedPacket`] that consumes the packet.
214 | impl<
215 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
216 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
217 | const SLOT_SIZE: usize,
218 | const BUFFER_LENGTH: usize,
219 | > Drop for ReceivedPacket<'_, Payload, AlignPayloadTo, SLOT_SIZE, BUFFER_LENGTH>
220 | where
221 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
222 | {
223 | fn drop(&mut self) {
224 | let rtp_receiver = &mut self.0;
225 |
226 | rtp_receiver
227 | .get_mut(rtp_receiver.earliest_seq)
228 | .unwrap()
229 | .recv_size = None;
230 | log::trace!("consumed seq {}", rtp_receiver.earliest_seq);
231 | rtp_receiver.earliest_seq = rtp_receiver.earliest_seq.wrapping_add(1);
232 | rtp_receiver.early_latest_span = rtp_receiver.early_latest_span.saturating_sub(1);
233 | }
234 | }
235 |
236 | impl<
237 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
238 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
239 | const SLOT_SIZE: usize,
240 | const BUFFER_LENGTH: usize,
241 | > RtpCircularBuffer<Payload, AlignPayloadTo, SLOT_SIZE, BUFFER_LENGTH>
242 | where
243 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
244 | {
245 | const fn generate_default_packet() -> MaybeInitPacket<Payload, AlignPayloadTo, SLOT_SIZE> {
246 | MaybeInitPacket {
247 | recv_size: None,
248 | packet: AlignedPacketBytes {
249 | _phantom: PhantomData,
250 | _align: [],
251 | inner: [0u8; size_of_packet::<[u8; SLOT_SIZE]>()],
252 | },
253 | }
254 | }
255 |
256 | fn new() -> Self {
257 | RtpCircularBuffer {
258 | earliest_seq: 0,
259 | early_latest_span: 0,
260 | buf: Box::new([const { Self::generate_default_packet() }; BUFFER_LENGTH]),
261 | }
262 | }
263 |
264 | /// Returns the slot with the earliest seq_num in the circular buffer.
265 | /// Note that this slot may or may not contain a packet.
266 | /// The slot will be consumed upon dropping the returned value.
267 | pub fn consume_earliest_packet(
268 | &mut self,
269 | ) -> ReceivedPacket<'_, Payload, AlignPayloadTo, SLOT_SIZE, BUFFER_LENGTH> {
270 | ReceivedPacket(self)
271 | }
272 |
273 | /// Returns a reference to the slot with the earliest seq_num in the buffer.
274 | /// Returns None if the slot is not inhabited by a packet.
275 | pub fn peek_earliest_packet(&self) -> Option<&Packet<Payload>> {
276 | self.get(self.earliest_seq).and_then(|p| p.get_data())
277 | }
278 |
279 | /// The sequence number of the earliest packet in the buffer.
280 | pub fn earliest_seq(&self) -> u32 {
281 | self.earliest_seq
282 | }
283 |
284 | /// The span of the earliest sequence number and the latest sequence number of a received packet in the buffer.
285 | /// This can be relied on as a hint for how full the buffer is (i.e., how far ahead is the latest received packet?).
286 | pub fn early_latest_span(&self) -> u32 {
287 | self.early_latest_span
288 | }
289 |
290 | /// Returns a reference to the [`MaybeInitPacket`] slot that corresponds to the given sequence number.
291 | /// Returns None if the corresponding packet is not present in the buffer.
292 | pub fn get(&self, seq_num: u32) -> Option<&MaybeInitPacket<Payload, AlignPayloadTo, SLOT_SIZE>> {
293 | if seq_num.wrapping_sub(self.earliest_seq) as usize >= self.buf.len() {
294 | None
295 | } else {
296 | let idx = (seq_num as usize) % self.buf.len();
297 | Some(&self.buf[idx])
298 | }
299 | }
300 |
301 | /// See [`RtpCircularBuffer::get`].
302 | fn get_mut(
303 | &mut self,
304 | seq_num: u32,
305 | ) -> Option<&mut MaybeInitPacket<Payload, AlignPayloadTo, SLOT_SIZE>> {
306 | if seq_num.wrapping_sub(self.earliest_seq) as usize >= self.buf.len() {
307 | None
308 | } else {
309 | let idx = (seq_num as usize) % self.buf.len();
310 | Some(&mut self.buf[idx])
311 | }
312 | }
313 | }
314 |
315 | /// An RTP sender that sends packets over the network, specialized for a `Sized` payload.
316 | pub type RtpSizedPayloadSender<Payload: TryFromBytes + IntoBytes + Immutable + KnownLayout> =
317 | RtpSender<Payload, Payload, { size_of::<Payload>() }>;
318 |
319 | /// An RTP sender that sends packets over the network, specialized for a `[T]` payload.
320 | /// Arguments:
321 | /// - `SlicedPayload`: The type of the data that is being sent.
322 | /// - `MAX_SLICE_LENGTH`: The maximum number of elements in the slice.
323 | pub type RtpSlicePayloadSender<
324 | SlicedPayload: TryFromBytes + IntoBytes + Immutable + KnownLayout,
325 | const MAX_SLICE_LENGTH: usize,
326 | > = RtpSender<[SlicedPayload], SlicedPayload, { size_of::<SlicedPayload>() * MAX_SLICE_LENGTH }>;
327 |
328 | /// An RTP sender that sends packets over the network.
329 | /// Arguments:
330 | /// - `Payload`: The type of the data that is being sent.
331 | /// - `AlignPayloadTo`: The type that has the correct alignment for the payload.
332 | /// - `SLOT_SIZE`: The size of the payload data in bytes. This size is exclusive of packet metadata.
333 | pub struct RtpSender<
334 | Payload: TryFromBytes + IntoBytes + Immutable + KnownLayout + ?Sized,
335 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
336 | const SLOT_SIZE: usize,
337 | > where
338 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
339 | {
340 | sock: UdpSocket,
341 | seq_num: u32,
342 | /// A correctly aligned scratch buffer for writing packet data to.
343 | scratch: AlignedPacketBytes<Payload, AlignPayloadTo, SLOT_SIZE>,
344 | }
345 |
346 | impl<
347 | Payload: TryFromBytes + IntoBytes + Immutable + KnownLayout + ?Sized,
348 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
349 | const SLOT_SIZE: usize,
350 | > RtpSender<Payload, AlignPayloadTo, SLOT_SIZE>
351 | where
352 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
353 | {
354 | /// Create a new RTP sender.
355 | /// The sender sends packets over the given (already bound) socket.
356 | /// An internal scratch buffer sized for one packet slot is used for packet serialization.
357 | pub fn new(sock: UdpSocket) -> Self {
358 | RtpSender {
359 | sock,
360 | seq_num: 0,
361 | scratch: AlignedPacketBytes {
362 | _phantom: PhantomData,
363 | _align: [],
364 | inner: [0u8; size_of_packet::<[u8; SLOT_SIZE]>()],
365 | },
366 | }
367 | }
368 |
369 | /// Get the seq num of the next packet to be sent.
370 | pub fn seq_num(&self) -> u32 {
371 | self.seq_num
372 | }
373 |
374 | /// Send a packet over the network by filling data in the mutable slice.
375 | /// The closure `fill` is called with a mutable slice of the packet data, and should return the number of bytes to be sent.
376 | pub fn send_bytes<'a>(&'a mut self, fill: impl FnOnce(&mut [u8]) -> usize) {
377 | // Note that the size of the packets we use is less than 10kb, for which
378 | // https://www.kernel.org/doc/html/v6.3/networking/msg_zerocopy.html
379 | // copying into the kernel is actually faster than MSG_ZEROCOPY.
380 |
381 | let packet = &mut self.scratch;
382 |
383 | let header =
384 | PacketHeader::mut_from_bytes(&mut packet[0..size_of::<PacketHeader>()]).unwrap();
385 |
386 | header.sequence_number = self.seq_num.into();
387 |
388 | // Note that this is only correct because the alignment of the packet is the same as the alignment of the payload.
389 | // Also #[repr(C)] on Packet should guarantee some amount of stability wrt. padding.
390 |
391 | let packet_start_offset = offset_of!(Packet<AlignPayloadTo>, data);
392 | let mem = &mut packet[packet_start_offset..];
393 | let payload_len = fill(mem);
394 |
395 | super::udp_send(&self.sock, &packet[..packet_start_offset + payload_len]);
396 | log::trace!("sent seq: {} ({} bytes)", self.seq_num, packet_start_offset + payload_len);
397 |
398 | self.seq_num = self.seq_num.wrapping_add(1);
399 | }
400 | }
401 |
402 | /// Implementation for Payloads that can be interpreted from the raw byte buffer without further validation.
403 | /// This enables creating a &mut Payload from the internal byte buffer.
404 |
405 | // TODO: figure out creating a MaybeUninit from the internal byte buffer to give to the closure.
406 | // MaybeUninit does not implement IntoBytes and thus creating a mutable reference to it from the internal byte buffer is not possible.
407 |
408 | impl<
409 | Payload: FromBytes + TryFromBytes + IntoBytes + Immutable + KnownLayout,
410 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
411 | const SLOT_SIZE: usize,
412 | > RtpSender<Payload, AlignPayloadTo, SLOT_SIZE>
413 | where
414 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
415 | {
416 | /// Send a packet over the network by filling data in the mutable slice.
417 | /// The closure `fill` is called with a mutable reference of the data.
418 | pub fn send<'a>(&'a mut self, fill: impl FnOnce(&mut Payload)) {
419 | self.send_bytes(|mem| {
420 | let mut data = Payload::mut_from_bytes(mem).unwrap();
421 | fill(&mut data);
422 | size_of_val(data)
423 | });
424 | }
425 | }
426 |
427 | /// An RTP receiver that receives packets over the network.
428 | pub struct RtpReceiver<
429 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + ?Sized,
430 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
431 | const SLOT_SIZE: usize,
432 | const BUFFER_LENGTH: usize,
433 | > where
434 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
435 | {
436 | rtp_circular_buffer: Arc<Mutex<RtpCircularBuffer<Payload, AlignPayloadTo, SLOT_SIZE, BUFFER_LENGTH>>>,
437 | }
438 |
439 | /// An RTP receiver that receives packets over the network, specialized for a `Sized` payload.
440 | /// Arguments:
441 | /// - `Payload`: The type of the data that is being sent.
442 | /// - `BUFFER_LENGTH`: The number of packets that can be stored in the buffer.
443 | pub type RtpSizedPayloadReceiver<
444 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable,
445 | const BUFFER_LENGTH: usize,
446 | > = RtpReceiver<Payload, Payload, { size_of::<Payload>() }, BUFFER_LENGTH>;
447 |
448 | /// An RTP receiver that receives packets over the network, specialized for a `[T]` payload.
449 | /// Arguments:
450 | /// - `SlicedPayload`: The type of the data that is being sent.
451 | /// - `MAX_SLICE_LENGTH`: The maximum number of elements in the slice.
452 | /// - `BUFFER_LENGTH`: The number of packets that can be stored in the buffer.
453 | pub type RtpSlicePayloadReceiver<
454 | SlicedPayload: TryFromBytes + IntoBytes + KnownLayout + Immutable,
455 | const MAX_SLICE_LENGTH: usize,
456 | const BUFFER_LENGTH: usize,
457 | > = RtpReceiver<
458 | [SlicedPayload],
459 | SlicedPayload,
460 | { size_of::<SlicedPayload>() * MAX_SLICE_LENGTH },
461 | BUFFER_LENGTH,
462 | >;
463 |
464 | impl<
465 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + Send + 'static + Debug + ?Sized,
466 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable + Send + 'static + Debug,
467 | const SLOT_SIZE: usize,
468 | const BUFFER_LENGTH: usize,
469 | > RtpReceiver<Payload, AlignPayloadTo, SLOT_SIZE, BUFFER_LENGTH>
470 | where
471 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
472 | {
473 | /// Launches a listener thread that receives packets and stores them in a buffer.
474 | pub fn new(sock: UdpSocket) -> Self {
475 | let rtp_circular_buffer = Arc::new(Mutex::new(RtpCircularBuffer::new()));
476 |
477 | let cloned_rtp_circular_buffer = rtp_circular_buffer.clone();
478 | std::thread::spawn(move || {
479 | accept_thread(sock, cloned_rtp_circular_buffer);
480 | });
481 |
482 | RtpReceiver {
483 | rtp_circular_buffer,
484 | }
485 | }
486 |
487 | /// Locks the buffer for interaction.
488 | pub fn lock_receiver(
489 | &self,
490 | ) -> MutexGuard<'_, RtpCircularBuffer<Payload, AlignPayloadTo, SLOT_SIZE, BUFFER_LENGTH>> {
491 | self.rtp_circular_buffer.lock().unwrap()
492 | }
493 | }
494 |
495 | fn accept_thread<
496 | Payload: TryFromBytes + IntoBytes + KnownLayout + Immutable + Debug + ?Sized,
497 | AlignPayloadTo: TryFromBytes + IntoBytes + KnownLayout + Immutable,
498 | const SLOT_SIZE: usize,
499 | const BUFFER_LENGTH: usize,
500 | >(
501 | sock: UdpSocket,
502 | recv: Arc<Mutex<RtpCircularBuffer<Payload, AlignPayloadTo, SLOT_SIZE, BUFFER_LENGTH>>>,
503 | ) where
504 | [(); size_of_packet::<[u8; SLOT_SIZE]>()]: Sized,
505 | {
506 | sock.set_nonblocking(false).unwrap();
507 | log::info!("Receiver started listening on {:?}.", sock.local_addr());
508 |
509 | loop {
510 | // wait until socket has a packet to read
511 | // an unaligned buffer here is safe since [`PacketHeader`] implements [`Unaligned`].
512 | let mut seq_num_buffer = [0u8; size_of::<PacketHeader>()];
513 | sock.peek(&mut seq_num_buffer).unwrap();
514 |
515 | // we have available data to read
516 | let mut state = recv.lock().unwrap();
517 |
518 | let seq_num: u32 = PacketHeader::ref_from_bytes(&seq_num_buffer).unwrap().sequence_number.into();
519 |
520 | // If the received packet has a place in the buffer, write the packet to the correct slot.
521 | // The received packet is allowed a place if its sequence number is ahead of the earliest packet
522 | // by less than u32::MAX / 2 (in wrapping arithmetic). Otherwise it is probably a late packet and we discard it.
523 |
524 | if (seq_num.wrapping_sub(state.earliest_seq)) < u32::MAX / 2 {
525 | // If this packet will need to overwrite old existing packets.
526 | if seq_num.wrapping_sub(state.earliest_seq) as usize >= state.buf.len() {
527 | log::debug!(
528 | "received an advanced packet with seq {}; dropping packets from {} to {}",
529 | seq_num,
530 | state.earliest_seq,
531 | seq_num.wrapping_sub(state.buf.len() as u32)
532 | );
533 | while seq_num.wrapping_sub(state.earliest_seq) as usize >= state.buf.len() {
534 | // Drop old packets until we can fit this new one.
535 | state.consume_earliest_packet();
536 | }
537 | }
538 |
539 | // check whether we can update early_latest_span
540 | state.early_latest_span = u32::max(
541 | state.early_latest_span,
542 | seq_num.wrapping_sub(state.earliest_seq),
543 | );
544 |
545 | // Receive packet into the correct slot.
546 | let MaybeInitPacket {
547 | recv_size: init,
548 | packet,
549 | ..
550 | } = state
551 | .get_mut(seq_num)
552 | .expect("Circular buffer should have space for packet.");
553 |
554 | let len = sock.recv(packet).unwrap();
555 | *init = Some(NonZero::new(len).expect("Packet should have non-zero length."));
556 |
557 | if len > 16 {
558 | log::trace!(
559 | "received seq_num {seq_num} and raw data: {:?}... (len {})",
560 | &packet[..16],
561 | len
562 | );
563 | } else {
564 | log::trace!("received seq_num {seq_num} and raw data: {:?}", &packet);
565 | }
566 | } else {
567 | // Otherwise, discard the packet.
568 |
569 | let _ = sock.recv(&mut seq_num_buffer);
570 | log::debug!(
571 | "dropping seq_num {} for being too early/late; accepted range is {}-{}",
572 | seq_num,
573 | state.earliest_seq,
574 | state.earliest_seq + state.buf.len() as u32
575 | );
576 | continue;
577 | }
578 | }
579 | }
580 |
--------------------------------------------------------------------------------
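The sender/receiver types in `rtp.rs` above are generic over the payload and lean on nightly `generic_const_exprs`, which is why the binaries carry `#![feature(...)]`. Below is a sketch of the `Sized`-payload aliases talking over loopback; the `Sample` payload struct, the port numbers, and the buffer length of 64 are arbitrary choices for the example (the real binaries use the slice-payload variants instead).

```rust
// Loopback sketch of the sized-payload RTP pair from rtp.rs. Needs the same
// nightly feature as the binaries; the Sample payload, ports, and the buffer
// length of 64 are arbitrary choices for the example.
#![feature(generic_const_exprs)]

use rust_userspace::rtp::{RtpSizedPayloadReceiver, RtpSizedPayloadSender};
use rust_userspace::udp_connect_retry;
use zerocopy::{FromBytes, Immutable, IntoBytes, KnownLayout};

#[derive(Debug, FromBytes, IntoBytes, KnownLayout, Immutable)]
#[repr(C)]
struct Sample {
    counter: u32,
}

fn main() {
    let recv_sock = udp_connect_retry(("127.0.0.1", 41001));
    let send_sock = udp_connect_retry(("127.0.0.1", 41002));
    send_sock.connect(("127.0.0.1", 41001)).unwrap();

    // The receiver spawns its accept thread internally.
    let receiver: RtpSizedPayloadReceiver<Sample, 64> = RtpSizedPayloadReceiver::new(recv_sock);
    let mut sender: RtpSizedPayloadSender<Sample> = RtpSizedPayloadSender::new(send_sock);

    for i in 0..10u32 {
        sender.send(|payload| payload.counter = i);
    }
    std::thread::sleep(std::time::Duration::from_millis(100));

    // Drain the circular buffer in sequence order; slots that never arrived are skipped.
    let mut buffer = receiver.lock_receiver();
    for _ in 0..10 {
        let seq = buffer.earliest_seq();
        let slot = buffer.consume_earliest_packet();
        if let Some(packet) = slot.get_data() {
            println!("seq {seq}: counter {}", packet.data.counter);
        }
    }
}
```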
/rust-userspace/src/video/dct.rs:
--------------------------------------------------------------------------------
1 | #![allow(dead_code)]
2 |
3 | mod fft2d_dct {
4 | use fft2d::slice::dcst::{dct_2d, idct_2d};
5 |
6 | // From https://en.wikipedia.org/wiki/JPEG#JPEG_codec_example
7 | pub fn dct2d(block: &[[u8; 8]; 8]) -> [[f64; 8]; 8] {
8 | let mut slice = [0.0; 64];
9 |
10 | for i in 0..8 {
11 | for j in 0..8 {
12 | // convert to [0.0, 1.0]
13 | slice[i * 8 + j] = (block[i][j] as f64) / 255.0;
14 | }
15 | }
16 |
17 | dct_2d(8, 8, &mut slice);
18 | let mut out = [[0.0; 8]; 8];
19 | for i in 0..8 {
20 | for j in 0..8 {
21 | out[i][j] = slice[i * 8 + j];
22 | }
23 | }
24 | out
25 | }
26 |
27 | pub fn inverse_dct2d(block: &[[f64; 8]; 8]) -> [[u8; 8]; 8] {
28 | let mut slice = [0.0; 64];
29 |
30 | for i in 0..8 {
31 | for j in 0..8 {
32 | slice[i * 8 + j] = block[i][j];
33 | }
34 | }
35 |
36 | idct_2d(8, 8, &mut slice);
37 | let mut out = [[0; 8]; 8];
38 | let fft_coeff = 4.0 / (8.0 * 8.0);
39 | for i in 0..8 {
40 | for j in 0..8 {
41 | out[i][j] = ((slice[i * 8 + j] * fft_coeff).max(0.0).min(1.0) * 255.0) as u8;
42 | }
43 | }
44 | out
45 | }
46 | }
47 |
48 | mod naive_dct {
49 | fn dct_alpha(u: usize) -> f64 {
50 | if u == 0 {
51 | 1.0 / (2.0f64).sqrt()
52 | } else {
53 | 1.0
54 | }
55 | }
56 |
57 | // From https://en.wikipedia.org/wiki/JPEG#JPEG_codec_example
58 | pub fn dct2d(block: &[[u8; 8]; 8]) -> [[f64; 8]; 8] {
59 | let mut out = [[0.0; 8]; 8];
60 | for u in 0..8 {
61 | for v in 0..8 {
62 | let mut sum = 0.0;
63 | for x in 0..8 {
64 | for y in 0..8 {
65 | sum += dct_alpha(u)
66 | * dct_alpha(v)
67 | * (block[x][y] as f64 - 128.0)
68 | * (std::f64::consts::PI * (2.0 * (x as f64) + 1.0) * (u as f64) / 16.0)
69 | .cos()
70 | * (std::f64::consts::PI * (2.0 * (y as f64) + 1.0) * (v as f64) / 16.0)
71 | .cos();
72 | }
73 | }
74 | out[u][v] = sum / 4.0;
75 | }
76 | }
77 | out
78 | }
79 | pub fn inverse_dct2d(block: &[[f64; 8]; 8]) -> [[u8; 8]; 8] {
80 | let mut out = [[0; 8]; 8];
81 | for x in 0..8 {
82 | for y in 0..8 {
83 | let mut sum = 0.0;
84 | for u in 0..8 {
85 | for v in 0..8 {
86 | sum += dct_alpha(u)
87 | * dct_alpha(v)
88 | * block[u][v]
89 | * (std::f64::consts::PI * (2.0 * (x as f64) + 1.0) * (u as f64) / 16.0)
90 | .cos()
91 | * (std::f64::consts::PI * (2.0 * (y as f64) + 1.0) * (v as f64) / 16.0)
92 | .cos();
93 | }
94 | }
95 | out[x][y] = (sum / 4.0 + 128.0) as u8;
96 | }
97 | }
98 | out
99 | }
100 | }
101 |
102 | mod fixed_fast_dct {
103 | /*
104 | * Fast discrete cosine transform algorithms (Rust)
105 | *
106 | * Copyright (c) 2020 Project Nayuki. (MIT License)
107 | * https://www.nayuki.io/page/fast-discrete-cosine-transform-algorithms
108 | *
109 | * Permission is hereby granted, free of charge, to any person obtaining a copy of
110 | * this software and associated documentation files (the "Software"), to deal in
111 | * the Software without restriction, including without limitation the rights to
112 | * use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
113 | * the Software, and to permit persons to whom the Software is furnished to do so,
114 | * subject to the following conditions:
115 | * - The above copyright notice and this permission notice shall be included in
116 | * all copies or substantial portions of the Software.
117 | * - The Software is provided "as is", without warranty of any kind, express or
118 | * implied, including but not limited to the warranties of merchantability,
119 | * fitness for a particular purpose and noninfringement. In no event shall the
120 | * authors or copyright holders be liable for any claim, damages or other
121 | * liability, whether in an action of contract, tort or otherwise, arising from,
122 | * out of or in connection with the Software or the use or other dealings in the
123 | * Software.
124 | */
125 |
126 | /*
127 | * Computes the scaled DCT type II on the given length-8 array in place.
128 | * The inverse of this function is inverse_transform(), except for rounding errors.
129 | */
130 | pub fn transform(vector: &mut [f64; 8]) {
131 | // Algorithm by Arai, Agui, Nakajima, 1988. For details, see:
132 | // https://web.stanford.edu/class/ee398a/handouts/lectures/07-TransformCoding.pdf#page=30
133 | let v0 = vector[0] + vector[7];
134 | let v1 = vector[1] + vector[6];
135 | let v2 = vector[2] + vector[5];
136 | let v3 = vector[3] + vector[4];
137 | let v4 = vector[3] - vector[4];
138 | let v5 = vector[2] - vector[5];
139 | let v6 = vector[1] - vector[6];
140 | let v7 = vector[0] - vector[7];
141 |
142 | let v8 = v0 + v3;
143 | let v9 = v1 + v2;
144 | let v10 = v1 - v2;
145 | let v11 = v0 - v3;
146 | let v12 = -v4 - v5;
147 | let v13 = (v5 + v6) * A[3];
148 | let v14 = v6 + v7;
149 |
150 | let v15 = v8 + v9;
151 | let v16 = v8 - v9;
152 | let v17 = (v10 + v11) * A[1];
153 | let v18 = (v12 + v14) * A[5];
154 |
155 | let v19 = -v12 * A[2] - v18;
156 | let v20 = v14 * A[4] - v18;
157 |
158 | let v21 = v17 + v11;
159 | let v22 = v11 - v17;
160 | let v23 = v13 + v7;
161 | let v24 = v7 - v13;
162 |
163 | let v25 = v19 + v24;
164 | let v26 = v23 + v20;
165 | let v27 = v23 - v20;
166 | let v28 = v24 - v19;
167 |
168 | vector[0] = (S[0] * v15) / 8.0f64.sqrt();
169 | vector[1] = (S[1] * v26) / 2.0;
170 | vector[2] = (S[2] * v21) / 2.0;
171 | vector[3] = (S[3] * v28) / 2.0;
172 | vector[4] = (S[4] * v16) / 2.0;
173 | vector[5] = (S[5] * v25) / 2.0;
174 | vector[6] = (S[6] * v22) / 2.0;
175 | vector[7] = (S[7] * v27) / 2.0;
176 | }
177 | /*
178 | * Computes the scaled DCT type III on the given length-8 array in place.
179 | * The inverse of this function is transform(), except for rounding errors.
180 | */
181 | pub fn inverse_transform(vector: &mut [f64; 8]) {
182 | vector[0] *= 8.0f64.sqrt();
183 | for i in 1..8 {
184 | vector[i] *= 2.0;
185 | }
186 | // A straightforward inverse of the forward algorithm
187 | let v15 = vector[0] / S[0];
188 | let v26 = vector[1] / S[1];
189 | let v21 = vector[2] / S[2];
190 | let v28 = vector[3] / S[3];
191 | let v16 = vector[4] / S[4];
192 | let v25 = vector[5] / S[5];
193 | let v22 = vector[6] / S[6];
194 | let v27 = vector[7] / S[7];
195 |
196 | let v19 = (v25 - v28) / 2.0;
197 | let v20 = (v26 - v27) / 2.0;
198 | let v23 = (v26 + v27) / 2.0;
199 | let v24 = (v25 + v28) / 2.0;
200 |
201 | let v7 = (v23 + v24) / 2.0;
202 | let v11 = (v21 + v22) / 2.0;
203 | let v13 = (v23 - v24) / 2.0;
204 | let v17 = (v21 - v22) / 2.0;
205 |
206 | let v8 = (v15 + v16) / 2.0;
207 | let v9 = (v15 - v16) / 2.0;
208 |
209 | let v18 = (v19 - v20) * A[5]; // Different from original
210 | let v12 = (v19 * A[4] - v18) / (A[2] * A[5] - A[2] * A[4] - A[4] * A[5]);
211 | let v14 = (v18 - v20 * A[2]) / (A[2] * A[5] - A[2] * A[4] - A[4] * A[5]);
212 |
213 | let v6 = v14 - v7;
214 | let v5 = v13 / A[3] - v6;
215 | let v4 = -v5 - v12;
216 | let v10 = v17 / A[1] - v11;
217 |
218 | let v0 = (v8 + v11) / 2.0;
219 | let v1 = (v9 + v10) / 2.0;
220 | let v2 = (v9 - v10) / 2.0;
221 | let v3 = (v8 - v11) / 2.0;
222 |
223 | vector[0] = (v0 + v7) / 2.0;
224 | vector[1] = (v1 + v6) / 2.0;
225 | vector[2] = (v2 + v5) / 2.0;
226 | vector[3] = (v3 + v4) / 2.0;
227 | vector[4] = (v3 - v4) / 2.0;
228 | vector[5] = (v2 - v5) / 2.0;
229 | vector[6] = (v1 - v6) / 2.0;
230 | vector[7] = (v0 - v7) / 2.0;
231 | }
232 | /*---- Tables of constants ----*/
233 | const S: [f64; 8] = [
234 | 0.353553390593273762200422,
235 | 0.254897789552079584470970,
236 | 0.270598050073098492199862,
237 | 0.300672443467522640271861,
238 | 0.353553390593273762200422,
239 | 0.449988111568207852319255,
240 | 0.653281482438188263928322,
241 | 1.281457723870753089398043,
242 | ];
243 | const A: [f64; 6] = [
244 | std::f64::NAN,
245 | 0.707106781186547524400844,
246 | 0.541196100146196984399723,
247 | 0.707106781186547524400844,
248 | 1.306562964876376527856643,
249 | 0.382683432365089771728460,
250 | ];
251 |
252 | /* --- MIT License code ends here --- */
253 |
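    | /// 8x8 2-D DCT computed separably: the 1-D AAN transform above is applied
    | /// to each row, then to each column of the intermediate result.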
254 | pub fn dct2d(block: &[[u8; 8]; 8]) -> [[f64; 8]; 8] {
255 | let mut out = [[0.0; 8]; 8];
256 | // DCT over rows
257 | for i in 0..8 {
258 | out[i] = block[i].map(|x| x as f64);
259 | transform(&mut out[i]);
260 | }
261 |
262 | // DCT over columns
263 | for i in 0..8 {
264 | let mut column = [0.0; 8];
265 | for j in 0..8 {
266 | column[j] = out[j][i];
267 | }
268 | transform(&mut column);
269 | for j in 0..8 {
270 | out[j][i] = column[j];
271 | }
272 | }
273 |
274 | out
275 | }
276 | pub fn inverse_dct2d(block: &[[f64; 8]; 8]) -> [[u8; 8]; 8] {
277 | let mut output = [[0.0; 8]; 8];
278 |
279 | // IDCT over columns
280 | for i in 0..8 {
281 | let mut column = [0.0; 8];
282 | for j in 0..8 {
283 | column[j] = block[j][i];
284 | }
285 | inverse_transform(&mut column);
286 | for j in 0..8 {
287 | output[j][i] = column[j].round();
288 | }
289 | }
290 |
291 | // IDCT over rows
292 | for i in 0..8 {
293 | inverse_transform(&mut output[i]);
294 | }
295 |
296 | let mut rounded_output = [[0; 8]; 8];
297 | for i in 0..8 {
298 | for j in 0..8 {
299 | rounded_output[i][j] = output[i][j].round() as u8;
300 | }
301 | }
302 |
303 | rounded_output
304 | }
305 | }
306 |
307 | mod test_dct {
308 |
309 | #[test]
310 | fn test_dct_invertibility() {
311 | use super::fixed_fast_dct::{dct2d, inverse_dct2d};
312 |
313 | let block = [
314 | [52, 55, 61, 66, 70, 61, 64, 73],
315 | [63, 59, 55, 90, 109, 85, 69, 72],
316 | [62, 59, 68, 113, 144, 104, 66, 73],
317 | [63, 58, 71, 122, 154, 106, 70, 69],
318 | [67, 61, 68, 104, 126, 88, 68, 70],
319 | [79, 65, 60, 70, 77, 68, 58, 75],
320 | [85, 71, 64, 59, 55, 61, 65, 83],
321 | [87, 79, 69, 68, 65, 76, 78, 94],
322 | ];
323 |
324 | let dct = dct2d(&block);
325 | let inverse = inverse_dct2d(&dct);
326 |
327 | for i in 0..8 {
328 | for j in 0..8 {
329 | assert!((inverse[i][j] as i32 - block[i][j] as i32).abs() < 10);
330 | }
331 | }
332 | }
333 | }
334 |
335 | pub use fft2d_dct::{dct2d, inverse_dct2d};
336 |
--------------------------------------------------------------------------------
/rust-userspace/src/video/mod.rs:
--------------------------------------------------------------------------------
1 | use std::ops::{Index, IndexMut};
2 |
3 | pub mod dct;
4 |
5 | use zerocopy::{FromBytes, Immutable, IntoBytes, KnownLayout, Unaligned};
6 |
7 | #[derive(FromBytes, Immutable, KnownLayout, Unaligned, IntoBytes)]
8 | #[repr(C)]
9 | /// Represents two horizontally adjacent pixels in a YUYV 4:2:2 image.
10 | pub struct YUYV422Sample {
11 | /// Luminance of first pixel
12 | y0: u8,
13 | /// Cb
14 | u: u8,
15 | /// Luminance of second pixel
16 | y1: u8,
17 | /// Cr
18 | v: u8,
19 | }
20 |
21 | impl YUYV422Sample {
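    | /// Build one YUYV sample from two horizontally adjacent RGB24 pixels using
    | /// the usual JPEG (BT.601 full-range) RGB -> YCbCr coefficients. The second
    | /// pixel's chroma is discarded, which is what produces the 4:2:2 subsampling.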
22 | pub fn from_rgb24(rgb: &[u8; 6]) -> Self {
23 | fn rgb_to_yuv(r: u8, g: u8, b: u8) -> (u8, u8, u8) {
24 | let y = (0.299 * r as f32 + 0.587 * g as f32 + 0.114 * b as f32) as u8;
25 | let u = (128.0 - 0.168736 * r as f32 - 0.331264 * g as f32 + 0.5 * b as f32) as u8;
26 | let v = (128.0 + 0.5 * r as f32 - 0.418688 * g as f32 - 0.081312 * b as f32) as u8;
27 | (y, u, v)
28 | }
29 |
30 | let (y0, u0, v0) = rgb_to_yuv(rgb[0], rgb[1], rgb[2]); // First pixel
31 | let (y1, _, _) = rgb_to_yuv(rgb[3], rgb[4], rgb[5]); // Second pixel
32 |
33 | Self {
34 | y0,
35 | u: u0,
36 | y1,
37 | v: v0,
38 | }
39 | }
40 | }
41 |
42 | pub struct YUVFrame<'a> {
43 | /// Width of the frame in pixels.
44 | width: usize,
45 | /// Height of the frame in pixels.
46 | height: usize,
47 | /// Data of the frame. Number of YUV422 samples will be (width / 2) * height,
48 | /// since each sample is two pixels.
49 | data: &'a [YUYV422Sample],
50 | }
51 |
52 | pub struct MutableYUVFrame<'a> {
53 | /// Width of the frame in pixels.
54 | width: usize,
55 | _height: usize,
56 | /// Data of the frame. Number of YUV422 samples will be (width / 2) * height,
57 | /// since each sample is two pixels.
58 | data: &'a mut [YUYV422Sample],
59 | }
60 |
61 | impl<'a> YUVFrame<'a> {
62 | pub fn new(width: usize, height: usize, data: &'a [u8]) -> Self {
63 | let data = <[YUYV422Sample]>::ref_from_bytes(data).unwrap();
64 | assert_eq!(data.len(), width * height / 2);
65 | Self {
66 | width,
67 | height,
68 | data,
69 | }
70 | }
71 |
72 | /// Get the luminance of a pixel at (x, y).
73 | pub fn get_luma(&self, x: usize, y: usize) -> u8 {
74 | let pixel = &self.data[y * self.width / 2 + x / 2];
75 | if x % 2 == 0 {
76 | pixel.y0
77 | } else {
78 | pixel.y1
79 | }
80 | }
81 |
82 | /// Get the chrominance of a pixel at (x, y).
83 | /// Returns (Cb, Cr).
84 | pub fn get_chroma(&self, x: usize, y: usize) -> (u8, u8) {
85 | let pixel = &self.data[y * self.width / 2 + x / 2];
86 | (pixel.u, pixel.v)
87 | }
88 | }
89 |
90 | impl<'a> MutableYUVFrame<'a> {
91 | pub fn new(width: usize, height: usize, data: &'a mut [u8]) -> Self {
92 | let data = <[YUYV422Sample]>::mut_from_bytes(data).unwrap();
93 | assert_eq!(data.len(), width * height / 2);
94 | Self {
95 | width,
96 | _height: height,
97 | data,
98 | }
99 | }
100 |
101 | /// Set the luminance of a pixel at (x, y).
102 | pub fn set_luma(&mut self, x: usize, y: usize, value: u8) {
103 | let pixel = &mut self.data[y * self.width / 2 + x / 2];
104 | if x % 2 == 0 {
105 | pixel.y0 = value;
106 | } else {
107 | pixel.y1 = value;
108 | }
109 | }
110 |
111 | /// Set the chrominance of a pixel at (x, y).
112 | pub fn set_chroma(&mut self, x: usize, y: usize, value: (u8, u8)) {
113 | let pixel = &mut self.data[y * self.width / 2 + x / 2];
114 | pixel.u = value.0;
115 | pixel.v = value.1;
116 | }
117 |
118 | /// Get the luminance of a pixel at (x, y).
119 | pub fn get_luma(&self, x: usize, y: usize) -> u8 {
120 | let pixel = &self.data[y * self.width / 2 + x / 2];
121 | if x % 2 == 0 {
122 | pixel.y0
123 | } else {
124 | pixel.y1
125 | }
126 | }
127 |
128 | /// Get the chrominance of a pixel at (x, y).
129 | /// Returns (Cb, Cr).
130 | pub fn get_chroma(&self, x: usize, y: usize) -> (u8, u8) {
131 | let pixel = &self.data[y * self.width / 2 + x / 2];
132 | (pixel.u, pixel.v)
133 | }
134 | }
135 |
136 | /// A macroblock. Spans a 16x16 block of pixels,
137 | /// with 4 8x8 blocks for Y and 1 8x8 block for U and V each.
138 | #[derive(Default, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
139 | pub struct Macroblock {
140 | pub y0: [[u8; 8]; 8],
141 | pub y1: [[u8; 8]; 8],
142 | pub y2: [[u8; 8]; 8],
143 | pub y3: [[u8; 8]; 8],
144 | pub u: [[u8; 8]; 8],
145 | pub v: [[u8; 8]; 8],
146 | }
147 |
148 | impl Macroblock {
149 | /// Copy macroblock into a YUV422 buffer at given x and y coordinates.
150 | pub fn copy_to_yuv422_frame<'a>(&self, frame: &mut MutableYUVFrame<'a>, x: usize, y: usize) {
151 | for (y_block, x_start, x_end, y_start, y_end) in [
152 | (&self.y0, 0, 8, 0, 8),
153 | (&self.y1, 8, 16, 0, 8),
154 | (&self.y2, 0, 8, 8, 16),
155 | (&self.y3, 8, 16, 8, 16),
156 | ] {
157 | for y_offset in y_start..y_end {
158 | for x_offset in x_start..x_end {
159 | frame.set_luma(x + x_offset, y + y_offset, y_block[x_offset - x_start][y_offset - y_start]);
160 | }
161 | }
162 | }
163 |
164 | for y_offset in (0..16).step_by(2) {
165 | for x_offset in (0..16).step_by(2) {
166 | // 4:2:0 chroma subsampling -> 4:2:2 chroma subsampling
167 | frame.set_chroma(x + x_offset, y + y_offset, (self.u[x_offset / 2][y_offset / 2], self.v[x_offset / 2][y_offset / 2]));
168 | frame.set_chroma(x + x_offset, y + y_offset + 1, (self.u[x_offset / 2][y_offset / 2], self.v[x_offset / 2][y_offset / 2]));
169 | }
170 | }
171 | }
172 | }
173 |
174 | #[derive(Default, Clone, Debug)]
175 | pub struct MacroblockWithPosition {
176 | pub block: Macroblock,
177 | pub x: usize,
178 | pub y: usize,
179 | }
180 |
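    | /// Iterates over a frame (or a rectangular subset of it) in 16x16 macroblocks,
    | /// left to right and top to bottom. The bounds are assumed to be multiples of
    | /// 16; partial blocks at the edges are not handled specially.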
181 | pub struct YUVFrameMacroblockIterator<'a> {
182 | frame: &'a YUVFrame<'a>,
183 | x_start: usize,
184 | /// While our current implementation doesn't need this,
185 | /// it's useful if we decide to change iteration order.
186 | _y_start: usize,
187 | x: usize,
188 | y: usize,
189 | x_end: usize,
190 | y_end: usize,
191 | }
192 |
193 | impl<'a> YUVFrameMacroblockIterator<'a> {
194 | pub fn new(frame: &'a YUVFrame<'a>) -> Self {
195 | Self { frame, x_start: 0, _y_start: 0, x: 0, y: 0, x_end: frame.width, y_end: frame.height }
196 | }
197 |
198 | /// Iterate over a subset of the frame defined by the rectangle (x, y) to (x_end, y_end).
199 | pub fn new_with_bounds(frame: &'a YUVFrame<'a>, x_start: usize, y_start: usize, x_end: usize, y_end: usize) -> Self {
200 | Self { frame, x_start, _y_start: y_start, x: x_start, y: y_start, x_end, y_end }
201 | }
202 | }
203 |
204 | impl<'a> Iterator for YUVFrameMacroblockIterator<'a> {
205 | type Item = MacroblockWithPosition;
206 |
207 | fn next(&mut self) -> Option<Self::Item> {
208 | if self.y >= self.y_end {
209 | return None;
210 | }
211 |
212 | let mut block = Macroblock::default();
213 |
214 | for (y_block, x_start, x_end, y_start, y_end) in [
215 | (&mut block.y0, 0, 8, 0, 8),
216 | (&mut block.y1, 8, 16, 0, 8),
217 | (&mut block.y2, 0, 8, 8, 16),
218 | (&mut block.y3, 8, 16, 8, 16),
219 | ] {
220 | for y in y_start..y_end {
221 | for x in x_start..x_end {
222 | y_block[x - x_start][y - y_start] = self.frame.get_luma(self.x + x, self.y + y);
223 | }
224 | }
225 | }
226 |
227 | for y in (0..16).step_by(2) {
228 | for x in (0..16).step_by(2) {
229 | // note that this ignores the chroma of the x, y + 1 pixel, i.e. making this 4:2:0
230 | block.u[x / 2][y / 2] = self.frame.get_chroma(self.x + x, self.y + y).0;
231 | block.v[x / 2][y / 2] = self.frame.get_chroma(self.x + x, self.y + y).1;
232 | }
233 | }
234 |
235 | let (x, y) = (self.x, self.y);
236 |
237 | self.x += 16;
238 | if self.x >= self.x_end {
239 | self.x = self.x_start;
240 | self.y += 16;
241 | }
242 |
243 | Some(MacroblockWithPosition { x, y, block })
244 | }
245 | }
246 |
247 | // ref: https://github.com/autergame/JpegView-Rust/blob/main/src/jpeg.rs
248 | /// Standard JPEG luminance quantization table
249 | #[rustfmt::skip]
250 | pub const LUMINANCE_QUANTIZATION_TABLE: [[f64; 8]; 8] = [
251 | [16.0f64, 11.0f64, 10.0f64, 16.0f64, 24.0f64, 40.0f64, 51.0f64, 61.0f64],
252 | [12.0f64, 12.0f64, 14.0f64, 19.0f64, 26.0f64, 58.0f64, 60.0f64, 55.0f64],
253 | [14.0f64, 13.0f64, 16.0f64, 24.0f64, 40.0f64, 57.0f64, 69.0f64, 56.0f64],
254 | [14.0f64, 17.0f64, 22.0f64, 29.0f64, 51.0f64, 87.0f64, 80.0f64, 62.0f64],
255 | [18.0f64, 22.0f64, 37.0f64, 56.0f64, 68.0f64, 109.0f64, 103.0f64, 77.0f64],
256 | [24.0f64, 35.0f64, 55.0f64, 64.0f64, 81.0f64, 104.0f64, 113.0f64, 92.0f64],
257 | [49.0f64, 64.0f64, 78.0f64, 87.0f64, 103.0f64, 121.0f64, 120.0f64, 101.0f64],
258 | [72.0f64, 92.0f64, 95.0f64, 98.0f64, 112.0f64, 100.0f64, 103.0f64, 99.0f64]
259 | ];
260 |
261 | /// Standard JPEG chrominance quantization table
262 | #[rustfmt::skip]
263 | pub const CHROMINANCE_QUANTIZATION_TABLE: [[f64; 8]; 8] = [
264 | [17.0f64, 18.0f64, 24.0f64, 47.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64],
265 | [18.0f64, 21.0f64, 26.0f64, 66.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64],
266 | [24.0f64, 26.0f64, 56.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64],
267 | [47.0f64, 66.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64],
268 | [99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64],
269 | [99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64],
270 | [99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64],
271 | [99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64, 99.0f64]
272 | ];
273 |
274 |
275 | /// Quantizes DCT block with flexible quantization. Returns a signed value.
276 | fn quantize_block(dct_block: &[[f64; 8]; 8], quantization_table: &[[f64; 8]; 8]) -> [[i8; 8]; 8] {
277 | let mut result = [[0; 8]; 8];
278 | for i in 0..8 {
279 | for j in 0..8 {
280 | result[i][j] = (dct_block[i][j] / quantization_table[i][j]).round() as i8;
281 | }
282 | }
283 | result
284 | }
285 |
286 | /// Entry-for-entry product of quantized block and quantization table.
287 | fn dequantize_block(
288 | quantized_block: &[[i8; 8]; 8],
289 | quantization_table: &[[f64; 8]; 8],
290 | ) -> [[f64; 8]; 8] {
291 | let mut result = [[0.0; 8]; 8];
292 | for i in 0..8 {
293 | for j in 0..8 {
294 | result[i][j] = quantized_block[i][j] as f64 * quantization_table[i][j];
295 | }
296 | }
297 | result
298 | }
299 |
300 | /// Scales the quantization matrix entrywise by `quality`, which ranges from roughly 0.3 down to 0.03. (Lower is better)
301 | pub fn quality_scaled_q_matrix(q_matrix: &[[f64; 8]; 8], quality: f64) -> [[f64; 8]; 8] {
302 | q_matrix.map(|row| row.map(|x| x * quality))
303 | }
304 |
305 | /// Process an entire YUV block for DCT and quantization
306 | pub fn quantize_macroblock(block: &Macroblock, quality: f64) -> QuantizedMacroblock {
307 | let quality_scaled_luminance_q_matrix =
308 | quality_scaled_q_matrix(&LUMINANCE_QUANTIZATION_TABLE, quality);
309 | let quality_scaled_chrominance_q_matrix =
310 | quality_scaled_q_matrix(&CHROMINANCE_QUANTIZATION_TABLE, quality);
311 |
312 | QuantizedMacroblock {
313 | y0: quantize_block(&dct::dct2d(&block.y0), &quality_scaled_luminance_q_matrix),
314 | y1: quantize_block(&dct::dct2d(&block.y1), &quality_scaled_luminance_q_matrix),
315 | y2: quantize_block(&dct::dct2d(&block.y2), &quality_scaled_luminance_q_matrix),
316 | y3: quantize_block(&dct::dct2d(&block.y3), &quality_scaled_luminance_q_matrix),
317 | u: quantize_block(&dct::dct2d(&block.u), &quality_scaled_chrominance_q_matrix),
318 | v: quantize_block(&dct::dct2d(&block.v), &quality_scaled_chrominance_q_matrix),
319 | }
320 | }
321 |
322 | pub fn dequantize_macroblock(block: &QuantizedMacroblock, quality: f64) -> Macroblock {
323 | let quality_scaled_luminance_q_matrix =
324 | quality_scaled_q_matrix(&LUMINANCE_QUANTIZATION_TABLE, quality);
325 | let quality_scaled_chrominance_q_matrix =
326 | quality_scaled_q_matrix(&CHROMINANCE_QUANTIZATION_TABLE, quality);
327 |
328 | Macroblock {
329 | y0: dct::inverse_dct2d(&dequantize_block(&block.y0, &quality_scaled_luminance_q_matrix)),
330 | y1: dct::inverse_dct2d(&dequantize_block(&block.y1, &quality_scaled_luminance_q_matrix)),
331 | y2: dct::inverse_dct2d(&dequantize_block(&block.y2, &quality_scaled_luminance_q_matrix)),
332 | y3: dct::inverse_dct2d(&dequantize_block(&block.y3, &quality_scaled_luminance_q_matrix)),
333 | u: dct::inverse_dct2d(&dequantize_block(&block.u, &quality_scaled_chrominance_q_matrix)),
334 | v: dct::inverse_dct2d(&dequantize_block(&block.v, &quality_scaled_chrominance_q_matrix)),
335 | }
336 | }
337 |
338 | /// A quantized macroblock. Spans a 16x16 block of pixels,
339 | /// with 4 8x8 blocks for Y and 1 8x8 block for U and V each.
340 | #[derive(Default, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
341 | pub struct QuantizedMacroblock {
342 | pub y0: [[i8; 8]; 8],
343 | pub y1: [[i8; 8]; 8],
344 | pub y2: [[i8; 8]; 8],
345 | pub y3: [[i8; 8]; 8],
346 | pub u: [[i8; 8]; 8],
347 | pub v: [[i8; 8]; 8],
348 | }
349 |
350 | #[derive(FromBytes, KnownLayout, IntoBytes, Immutable, Unaligned)]
351 | #[repr(transparent)]
352 | struct QuantizedZigZagBlock {
353 | data: [[i8; 8]; 8],
354 | }
355 |
356 | impl QuantizedZigZagBlock {
357 | // Implementation note: I'm quite happy with how the zero-copy cast works here.
358 | // Allows having nice zero-copy wrapper types that are generic over the mutability of the underlying data.
359 | // Otherwise, this would be split into QuantizedZigZagBlock<'a>(&'a ...) and QuantizedZigZagBlockMut<'a>(&'a mut ...).
360 |
361 | fn new_ref(data: &'_ [[i8; 8]; 8]) -> &'_ Self {
362 | Self::ref_from_bytes(data.as_bytes()).unwrap()
363 | }
364 |
365 | fn new_ref_mut(data: &'_ mut [[i8; 8]; 8]) -> &'_ mut Self {
366 | Self::mut_from_bytes(data.as_mut_bytes()).unwrap()
367 | }
368 |
369 | fn len(&self) -> usize {
370 | 64
371 | }
372 | }
373 |
374 | // A miraculous zig-zag scan implementation by AI
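    | // Maps a linear zig-zag index to the (row, column) coordinates of the standard
    | // JPEG scan order, which visits coefficients roughly from low to high spatial
    | // frequency so that runs of zeros cluster at the end.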
375 | #[rustfmt::skip]
376 | const ZIGZAG_ORDER: [(usize, usize); 64] = [
377 | (0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2),
378 | (2, 1), (3, 0), (4, 0), (3, 1), (2, 2), (1, 3), (0, 4), (0, 5),
379 | (1, 4), (2, 3), (3, 2), (4, 1), (5, 0), (6, 0), (5, 1), (4, 2),
380 | (3, 3), (2, 4), (1, 5), (0, 6), (0, 7), (1, 6), (2, 5), (3, 4),
381 | (4, 3), (5, 2), (6, 1), (7, 0), (7, 1), (6, 2), (5, 3), (4, 4),
382 | (3, 5), (2, 6), (1, 7), (2, 7), (3, 6), (4, 5), (5, 4), (6, 3),
383 | (7, 2), (7, 3), (6, 4), (5, 5), (4, 6), (3, 7), (4, 7), (5, 6),
384 | (6, 5), (7, 4), (7, 5), (6, 6), (5, 7), (6, 7), (7, 6), (7, 7),
385 | ];
386 |
387 | impl Index<usize> for QuantizedZigZagBlock {
388 | type Output = i8;
389 |
390 | fn index(&self, index: usize) -> &Self::Output {
391 | let (i, j) = ZIGZAG_ORDER[index];
392 | &self.data[i][j]
393 | }
394 | }
395 |
396 | impl IndexMut<usize> for QuantizedZigZagBlock {
397 | fn index_mut(&mut self, index: usize) -> &mut Self::Output {
398 | let (i, j) = ZIGZAG_ORDER[index];
399 | &mut self.data[i][j]
400 | }
401 | }
402 |
403 | /// Currently performs RLE encoding.
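    | /// The output is a flat sequence of (value, run_length) byte pairs taken in
    | /// zig-zag order; for example, an all-zero block encodes to just [0x00, 64].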
404 | fn encode_quantized_block(block: &[[i8; 8]; 8], buf: &mut Vec<u8>) {
405 | let zig_zag_block = QuantizedZigZagBlock::new_ref(block);
406 |
407 | let mut index = 0;
408 |
409 | while index < zig_zag_block.len() {
410 | let current_element = zig_zag_block[index];
411 | let mut run_length = 1u8;
412 | index += 1;
413 |
414 | while index < zig_zag_block.len() {
415 | if zig_zag_block[index] == current_element && run_length < u8::MAX {
416 | run_length += 1;
417 | index += 1;
418 | } else {
419 | break;
420 | }
421 | }
422 |
423 | buf.push(current_element as u8);
424 | buf.push(run_length);
425 | }
426 | }
427 |
428 | pub fn encode_quantized_macroblock(quantized_macroblock: &QuantizedMacroblock, buf: &mut Vec<u8>) {
429 | for plane in &[
430 | quantized_macroblock.y0,
431 | quantized_macroblock.y1,
432 | quantized_macroblock.y2,
433 | quantized_macroblock.y3,
434 | quantized_macroblock.u,
435 | quantized_macroblock.v,
436 | ] {
437 | encode_quantized_block(plane, buf);
438 | }
439 | }
440 |
441 | /// Decodes a quantized block from the stream, returning the block and the slice of remaining data.
442 | fn decode_quantized_block(data: &[u8]) -> ([[i8; 8]; 8], &[u8]) {
443 | let mut block = [[0; 8]; 8];
444 | let quantized_block = QuantizedZigZagBlock::new_ref_mut(&mut block);
445 |
446 | let mut encoded_data_index = 0;
447 | let mut zig_zag_index = 0;
448 |
449 | // let mut zig_zag_out = Vec::new();
450 | // let mut s = String::new();
451 |
452 | while zig_zag_index < quantized_block.len() {
453 | let value = data[encoded_data_index];
454 | let run_length = data[encoded_data_index + 1];
455 |
456 | // s.push_str(&format!("{:02x}x{} ", value, run_length));
457 |
458 | encoded_data_index += 2;
459 |
460 | for _ in 0..run_length {
461 | quantized_block[zig_zag_index] = value as i8;
462 | // zig_zag_out.push(value as i8);
463 | zig_zag_index += 1;
464 | }
465 | }
466 |
467 | // log::trace!("{} -> {zig_zag_out:?}", s);
468 |
469 | (block, &data[encoded_data_index..])
470 | }
471 |
472 | /// Decodes a quantized macroblock from the stream, returning the macroblock and the slice of remaining data.
473 | pub fn decode_quantized_macroblock(data: &[u8]) -> (QuantizedMacroblock, &[u8]) {
474 | let mut block = QuantizedMacroblock::default();
475 | let mut remaining = data;
476 |
477 | for plane in [
478 | &mut block.y0,
479 | &mut block.y1,
480 | &mut block.y2,
481 | &mut block.y3,
482 | &mut block.u,
483 | &mut block.v,
484 | ] {
485 | (*plane, remaining) = decode_quantized_block(remaining);
486 | }
487 |
488 | (block, remaining)
489 | }
490 |
491 | #[cfg(test)]
492 | mod test {
493 | use super::*;
494 |
495 | #[test]
496 | fn test_quantization() {
497 | let block = Macroblock {
498 | y0: [[128; 8]; 8],
499 | y1: [[128; 8]; 8],
500 | y2: [[128; 8]; 8],
501 | y3: [[128; 8]; 8],
502 | u: [[128; 8]; 8],
503 | v: [[128; 8]; 8],
504 | };
505 |
506 | let quantized_block = quantize_macroblock(&block, 0.03);
507 | let dequantized_block = dequantize_macroblock(&quantized_block, 0.03);
508 |
509 | assert_eq!(block.y0, dequantized_block.y0);
510 | assert_eq!(block.y1, dequantized_block.y1);
511 | assert_eq!(block.y2, dequantized_block.y2);
512 | assert_eq!(block.y3, dequantized_block.y3);
513 | assert_eq!(block.u, dequantized_block.u);
514 | assert_eq!(block.v, dequantized_block.v);
515 | }
516 |
517 | #[test]
518 | fn test_macroblock_compression() {
519 | simplelog::SimpleLogger::init(simplelog::LevelFilter::Trace, simplelog::Config::default())
520 | .unwrap();
521 |
522 | let macroblock = Macroblock {
523 | y0: [
524 | [157, 157, 157, 157, 157, 156, 157, 156],
525 | [156, 156, 156, 155, 153, 154, 154, 155],
526 | [157, 158, 158, 157, 156, 156, 156, 155],
527 | [159, 159, 159, 159, 158, 158, 157, 156],
528 | [159, 158, 158, 159, 159, 159, 157, 158],
529 | [158, 157, 157, 159, 159, 158, 158, 157],
530 | [158, 158, 159, 159, 158, 158, 158, 157],
531 | [159, 159, 159, 159, 158, 158, 158, 157],
532 | ],
533 | y1: [
534 | [159, 160, 159, 159, 158, 158, 158, 158],
535 | [159, 159, 159, 158, 158, 158, 158, 158],
536 | [158, 158, 159, 158, 159, 158, 158, 158],
537 | [158, 157, 158, 158, 158, 158, 158, 158],
538 | [158, 158, 157, 157, 158, 158, 157, 157],
539 | [158, 157, 157, 158, 158, 158, 157, 157],
540 | [157, 157, 157, 157, 158, 158, 157, 157],
541 | [157, 157, 156, 156, 157, 157, 157, 157],
542 | ],
543 | y2: [
544 | [156, 157, 157, 156, 156, 156, 156, 155],
545 | [155, 155, 155, 154, 154, 154, 154, 154],
546 | [155, 156, 155, 156, 155, 155, 155, 155],
547 | [156, 157, 157, 157, 157, 157, 156, 156],
548 | [157, 158, 158, 158, 157, 157, 156, 156],
549 | [157, 158, 158, 157, 157, 156, 157, 156],
550 | [157, 157, 157, 157, 157, 157, 157, 156],
551 | [157, 158, 157, 157, 157, 157, 157, 156],
552 | ],
553 | y3: [
554 | [159, 157, 157, 157, 157, 158, 158, 157],
555 | [158, 158, 157, 157, 157, 157, 157, 156],
556 | [158, 158, 158, 158, 157, 157, 157, 156],
557 | [157, 157, 158, 158, 158, 157, 157, 156],
558 | [157, 157, 158, 158, 157, 157, 157, 156],
559 | [157, 157, 158, 157, 156, 157, 157, 156],
560 | [157, 157, 157, 156, 156, 156, 156, 156],
561 | [157, 156, 156, 156, 156, 156, 156, 156],
562 | ],
563 | u: [
564 | [131, 131, 131, 131, 132, 131, 132, 131],
565 | [128, 128, 129, 129, 129, 128, 128, 129],
566 | [128, 130, 128, 128, 128, 128, 129, 129],
567 | [128, 129, 128, 128, 128, 128, 129, 128],
568 | [129, 128, 128, 129, 129, 128, 128, 128],
569 | [129, 128, 128, 128, 128, 128, 128, 128],
570 | [128, 129, 128, 129, 129, 128, 128, 128],
571 | [128, 128, 128, 128, 129, 128, 129, 128],
572 | ],
573 | v: [
574 | [130, 129, 129, 129, 129, 129, 129, 129],
575 | [131, 130, 131, 131, 131, 131, 131, 130],
576 | [130, 130, 130, 131, 131, 130, 130, 130],
577 | [130, 131, 130, 130, 131, 131, 131, 131],
578 | [130, 130, 130, 130, 130, 130, 131, 130],
579 | [131, 130, 129, 129, 130, 131, 130, 130],
580 | [131, 131, 130, 131, 131, 130, 130, 131],
581 | [131, 131, 130, 130, 130, 130, 131, 130],
582 | ],
583 | };
584 | let quantized_macroblock = quantize_macroblock(&macroblock, 0.3);
585 | log::info!("{:?}", quantized_macroblock);
586 | let mut rle_buf = Vec::new();
587 | encode_quantized_macroblock(&quantized_macroblock, &mut rle_buf);
588 | let (decoded_quantized_macroblock, remaining) = decode_quantized_macroblock(&rle_buf);
589 | assert!(remaining.is_empty());
590 | assert_eq!(quantized_macroblock, decoded_quantized_macroblock);
591 | let decoded_macroblock = dequantize_macroblock(&decoded_quantized_macroblock, 0.3);
592 |
593 | log::info!("{:?}", decoded_macroblock);
594 |
595 | // check that all values within the decoded macroblock are within epsilon of the original
596 | let epsilon = 20;
597 | for (original, decoded) in macroblock.y0.iter().flatten().zip(decoded_macroblock.y0.iter().flatten()) {
598 | assert!((*original as i32 - *decoded as i32).abs() < epsilon);
599 | }
600 | }
601 | }
--------------------------------------------------------------------------------
/rust-userspace/src/wpm.rs:
--------------------------------------------------------------------------------
1 | use std::{
2 | collections::{BTreeMap, VecDeque},
3 | time::{Duration, Instant},
4 | };
5 |
6 | use sdl2::pixels::Color;
7 |
8 | /*
9 | ⠀⠀⣄⠀⠀
10 | ⠠⢴⣿⡦⠄
11 | ⠉⣽⣿⣏⠉
12 | ⠀⠀⣿⠀⠀
13 | */
14 |
15 | const WPM_SATURATION: f64 = 45.0;
16 | const WORST_PACKET_DROP: u32 = 2 * (u32::MAX / 5); // 40% drop rate at 0 WPM
17 |
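    | /// Map typing speed to a packet-drop threshold: 0 WPM yields WORST_PACKET_DROP
    | /// (a 40% drop rate), falling off quadratically to 0 at or above WPM_SATURATION.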
18 | pub fn wpm_to_drop_amt(wpm: f64) -> u32 {
19 | let clipped_wpm = wpm.min(WPM_SATURATION);
20 |
21 | let drop_frac = (WPM_SATURATION - clipped_wpm) / WPM_SATURATION;
22 | // square drop_frac so the drop rate falls off more sharply as WPM increases
23 | ((WORST_PACKET_DROP as f64) * f64::powi(drop_frac, 2)) as u32
24 | }
25 |
26 | pub fn wpm_to_sdl_color(wpm: f64, base_color: Color) -> Color {
27 | let clipped_wpm = wpm.min(WPM_SATURATION);
28 |
29 | let wpm_frac = clipped_wpm / WPM_SATURATION;
30 | let wpm_color = Color::RGB(
31 | (base_color.r as f64 * wpm_frac) as u8,
32 | (base_color.g as f64 * wpm_frac) as u8,
33 | (base_color.b as f64 * wpm_frac) as u8,
34 | );
35 |
36 | wpm_color
37 | }
38 |
39 | const WORST_JPEG_QUALITY: f64 = 1.0;
40 | const BEST_JPEG_QUALITY: f64 = 0.03;
41 |
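    | /// Map typing speed to a quantization quality factor: 0 WPM gives
    | /// WORST_JPEG_QUALITY and anything at or above WPM_SATURATION gives
    | /// BEST_JPEG_QUALITY (lower is better; see `quality_scaled_q_matrix`).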
42 | pub fn wpm_to_jpeg_quality(wpm: f64) -> f64 {
43 | let clipped_wpm = wpm.min(WPM_SATURATION);
44 |
45 | let wpm_ratio = (WPM_SATURATION - clipped_wpm) / WPM_SATURATION;
46 | // flip wpm_ratio so that higher WPMs result in higher quality
47 | WORST_JPEG_QUALITY - (WORST_JPEG_QUALITY - BEST_JPEG_QUALITY) * f64::powi(1.0 - wpm_ratio, 1)
48 | }
49 |
50 | pub const CHART_DATA_LENGTH: usize = 1000;
51 |
52 | #[derive(Debug)]
53 | pub struct TypingMetrics {
54 | stroke_window: VecDeque<(i32, Instant)>,
55 | repeated_keys: BTreeMap<i32, u32>,
56 | }
57 |
58 | impl TypingMetrics {
59 | pub fn new() -> Self {
60 | Self {
61 | stroke_window: VecDeque::new(),
62 | repeated_keys: BTreeMap::new(),
63 | }
64 | }
65 |
66 | fn update_stroke_window(&mut self) {
67 | let now = Instant::now();
68 | let window_duration = Duration::from_secs(3);
69 | while self
70 | .stroke_window
71 | .front()
72 | .map_or(false, |(_, t)| now.duration_since(*t) > window_duration)
73 | {
74 | self.stroke_window.pop_front();
75 | }
76 | }
77 |
78 | /// Calculate the WPM based on the stored stroke window.
79 | fn calculate_wpm(&mut self) -> f64 {
80 | // penalize repeated keys
81 | let mut wpm = 0.0;
82 | self.repeated_keys.clear();
83 |
84 | for (c, _) in &self.stroke_window {
85 | wpm += 1.0;
86 | // repeated keys get higher penalties for each repetition
87 | if let Some(times) = self.repeated_keys.get(c) {
88 | wpm -= (0.04 * (*times as f64)).max(0.0);
89 | self.repeated_keys.insert(*c, times + 1);
90 | } else {
91 | self.repeated_keys.insert(*c, 1);
92 | }
93 | }
94 | wpm
95 | }
96 |
97 | /// Calculate the WPM in the current instant based on the current stroke window.
98 | pub fn calc_wpm(&mut self) -> f64 {
99 | self.update_stroke_window();
100 | self.calculate_wpm()
101 | }
102 |
103 | pub fn receive_char_stroke(&mut self, c: i32) {
104 | self.stroke_window.push_back((c, Instant::now()));
105 | }
106 |
107 | /// The given timestamp must be more recent than previously supplied timestamps.
108 | pub fn receive_char_stroke_with_timestamp(&mut self, c: i32, timestamp: Instant) {
109 | self.stroke_window.push_back((c, timestamp));
110 | }
111 | }
112 |
--------------------------------------------------------------------------------