├── Kconfig ├── example.png ├── .build.yml ├── Kbuild ├── .gitignore ├── Makefile ├── alpine-notes.md ├── virtio.md ├── virtio_vmmci.h ├── virtio_pci_common.h ├── virtio_pci_common.c ├── virtio_pci_openbsd.c ├── virtio_vmmci.c ├── README.md └── LICENSE /Kconfig: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /example.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/voutilad/virtio_vmmci/HEAD/example.png -------------------------------------------------------------------------------- /.build.yml: -------------------------------------------------------------------------------- 1 | image: alpine/3.12 2 | sources: 3 | - https://github.com/voutilad/virtio_vmmci 4 | packages: 5 | - gcc 6 | - make 7 | tasks: 8 | - setup: | 9 | sudo apk add linux-$(uname -r | awk -F '-' '{ print $3 }')-dev 10 | - build: | 11 | cd virtio_vmmci 12 | make 13 | -------------------------------------------------------------------------------- /Kbuild: -------------------------------------------------------------------------------- 1 | # SPDX-License-Identifier: GPL-2.0 2 | # 3 | # Copyright (C) 2020 Dave Voutila . All rights reserved. 
4 | 5 | ccflags-y := -O3 -Wall 6 | ccflags-y += -DCONFIG_HZ=$(CONFIG_HZ) 7 | ccflags-$(CONFIG_VMMCI_DEBUG) += -DDEBUG -g 8 | 9 | obj-m += virtio_vmmci.o virtio_pci_obsd.o 10 | virtio_pci_obsd-y := virtio_pci_openbsd.o virtio_pci_common.o 11 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Prerequisites 2 | *.d 3 | 4 | # Object files 5 | *.o 6 | *.ko 7 | *.obj 8 | *.elf 9 | 10 | # Linker output 11 | *.ilk 12 | *.map 13 | *.exp 14 | 15 | # Precompiled Headers 16 | *.gch 17 | *.pch 18 | 19 | # Libraries 20 | *.lib 21 | *.a 22 | *.la 23 | *.lo 24 | 25 | # Shared objects (inc. Windows DLLs) 26 | *.dll 27 | *.so 28 | *.so.* 29 | *.dylib 30 | 31 | # Executables 32 | *.exe 33 | *.out 34 | *.app 35 | *.i*86 36 | *.x86_64 37 | *.hex 38 | 39 | # Debug files 40 | *.dSYM/ 41 | *.su 42 | *.idb 43 | *.pdb 44 | 45 | # Kernel Module Compile Results 46 | *.mod* 47 | *.cmd 48 | .tmp_versions/ 49 | modules.order 50 | Module.symvers 51 | Mkfile.old 52 | dkms.conf 53 | .cache.mk 54 | *.swp 55 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | # SPDX-License-Identifier: GPL-2.0 2 | # 3 | # Copyright (C) 2020 Dave Voutila . All rights reserved. 
4 | 5 | KERNELRELEASE ?= $(shell uname -r) 6 | KERNELDIR ?= /lib/modules/$(KERNELRELEASE)/build 7 | DEPMOD ?= depmod 8 | PWD := $(shell pwd) 9 | 10 | all: module 11 | debug: module-debug 12 | 13 | module: 14 | @$(MAKE) -C $(KERNELDIR) M=$(PWD) modules 15 | 16 | module-debug: 17 | @$(MAKE) -C $(KERNELDIR) M=$(PWD) CONFIG_VMMCI_DEBUG=y modules 18 | 19 | clean: 20 | @$(MAKE) -C $(KERNELDIR) M=$(PWD) clean 21 | 22 | install: 23 | @$(MAKE) -C $(KERNELDIR) M=$(PWD) modules_install 24 | @$(DEPMOD) -A $(KERNELRELEASE) 25 | 26 | .PHONY: all debug module module-debug install clean 27 | -------------------------------------------------------------------------------- /alpine-notes.md: -------------------------------------------------------------------------------- 1 | # Building on Alpine 2 | 3 | ## About this guide 4 | This was tested using: 5 | - OpenBSD-current snapshot from 7 May 2020 6 | - Alpine Linux v3.11.6 (kernel version 5.4.34-virt) 7 | 8 | ## Dependencies 9 | Install the following packages: 10 | 11 | - gcc 12 | - make 13 | - linux-virt-dev 14 | 15 | # Building 16 | 17 | 1. `$ make` 18 | 2. `# make install` 19 | 20 | > You'll probably see some SSL errors [1] and complaints about missing keys...this is expected. If you'd like the module to be signed, feel free to follow [2] to generate a signing key and sign it yourself. 
21 | 22 | # Random notes about timekeeping 23 | 24 | https://wiki.postmarketos.org/wiki/Out-of-tree_kernel_modules 25 | 26 | https://kb.meinbergglobal.com/kb/time_sync/time_synchronization_in_virtual_machines 27 | 28 | [1] An example of the errors you'll probably see: https://github.com/andikleen/simple-pt/issues/8#issue-227415517 29 | 30 | [2] Official docs on generating the private key: https://www.kernel.org/doc/html/v4.15/admin-guide/module-signing.html#generating-signing-keys 31 | 32 | -------------------------------------------------------------------------------- /virtio.md: -------------------------------------------------------------------------------- 1 | # Learnings from VirtIO hacking... 2 | Here are some notes of things I've learned along the way. Maybe someone will 3 | find this interesting? 4 | 5 | ## Making Linux assign VMMCI to the virtio-pci driver 6 | OpenBSD "hides" some devices from non-OpenBSD guests by using non-standard PCI 7 | vendor, device, and subsystem identifiers that fall outside established virtio 8 | ranges that Linux explicitly uses to sanity check before "attaching" to a 9 | device. 10 | 11 | For instance, on Ubuntu 18.04, `lspci -v` shows (abbreviated): 12 | 13 | ``` 14 | ... 15 | 00:04.0 SCSI storage controller: Red Hat, Inc. Virtio SCSI 16 | Subsystem: Device 0b5d:0008 17 | Flags: bus master, fast devsel, latency 0, IRQ 7 18 | I/O ports at 4000 [size=4K] 19 | Kernel driver in use: virtio-pci 20 | 21 | 00:05.0 Communication controller: Device 0b5d:0777 22 | Subsystem: Device 0b5d:ffff 23 | Flags: bus master, fast devsel, latency 0, IRQ 9 24 | I/O ports at 5000 [size=4K] 25 | ``` 26 | Where the SCSI controller looks like a Red Hat device because the OpenBSD host 27 | uses Red Hat's "donated" (as they say) identifiers. However, the `vmmci(4)` 28 | device (the `00:05.0` one) uses `0b5d:0777` (`b5d == 'bsd'`...ha). 
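The ID check described above can be sketched in plain userland C. This is only an illustration under stated assumptions, not the kernel's code: every `sketch_*` / `SKETCH_*` name is made up for this example, the OpenBSD IDs are the ones in this repo's `virtio_vmmci.h`, the `0x1000`..`0x103f` device window is the one mentioned in `virtio_pci_obsd_match()`, and `0x1af4` is assumed to be the Red Hat/Qumranet vendor ID that the stock driver matches on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IDs used by OpenBSD's vmmci(4) device (from virtio_vmmci.h in this repo). */
#define SKETCH_VENDOR_OPENBSD     0x0b5d
#define SKETCH_DEVICE_VMMCI       0x0777

/* Assumptions: the Red Hat/Qumranet vendor id, and the legacy device-id
 * window a stock virtio-pci driver sanity-checks before attaching. */
#define SKETCH_VENDOR_QUMRANET    0x1af4
#define SKETCH_LEGACY_DEVICE_MIN  0x1000
#define SKETCH_LEGACY_DEVICE_MAX  0x103f

/* The vmmci device id sits below the legacy window, so it can never pass. */
static_assert(SKETCH_DEVICE_VMMCI < SKETCH_LEGACY_DEVICE_MIN,
              "vmmci device id is outside the legacy virtio window");

/* Hypothetical model of the stock check: right vendor, device id in window. */
static bool sketch_stock_virtio_accepts(uint16_t vendor, uint16_t device)
{
	if (vendor != SKETCH_VENDOR_QUMRANET)
		return false;
	return device >= SKETCH_LEGACY_DEVICE_MIN &&
	       device <= SKETCH_LEGACY_DEVICE_MAX;
}

/* Hypothetical model of this repo's PCI id table: only the OpenBSD pair. */
static bool sketch_obsd_accepts(uint16_t vendor, uint16_t device)
{
	return vendor == SKETCH_VENDOR_OPENBSD &&
	       device == SKETCH_DEVICE_VMMCI;
}
```

Feeding the `0b5d:0777` pair from the `lspci` output into `sketch_stock_virtio_accepts()` returns `false`, while `sketch_obsd_accepts()` returns `true`, which is why the device stays unclaimed until a driver with a matching ID table is loaded.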
29 | 30 | We can either re-implement all the `virtio_pci` code in our vmmci driver, or do 31 | what I did in the interim and hack the kernel's `virtio_pci` driver. 32 | 33 | ## Reading the Host clock 34 | OpenBSD gets a little cheeky and uses virtio configuration registers as a way 35 | to transfer the host clock details to the guest. Since the config space is 36 | mapped via the virtio pci driver, the vmmci virtio driver just needs to read 37 | the right config registers to get the host to return the clock. 38 | 39 | __IF IT WERE SO EASY!__ 40 | 41 | Apparently there are differing versions of virtio, and the OpenBSD devices show 42 | up as "legacy" devices. I honestly don't know the differences yet, but the main 43 | difference here is how Linux will try to read the config registers. 44 | 45 | OpenBSD currently assumes a non-legacy (?) approach where (forgive my lack of 46 | knowledge here) a single read is attempted against a register that then returns 47 | up to 32 bits of data (a 64-bit read is done via 2 reads of 32 bits at 2 48 | registers 4 bytes apart). 49 | 50 | The problem is Linux's legacy virtio pci implementation instead tries to read 51 | 1 byte from the register address, then continues down the line of registers 52 | until it has read enough data. This causes garbage data to be read by a legacy pci 53 | Linux virtio driver. Hence the `virtio_pci_obsd` customizations go beyond 54 | just matching PCI ids. 55 | -------------------------------------------------------------------------------- /virtio_vmmci.h: -------------------------------------------------------------------------------- 1 | /* 2 | * Implementation of an OpenBSD VMM control interface for Linux guests 3 | * running under an OpenBSD host. 
4 | * 5 | * Copyright 2019 Dave Voutila 6 | * 7 | * This program is free software; you can redistribute it and/or modify 8 | * it under the terms of the GNU General Public License as published by 9 | * the Free Software Foundation; either version 2 of the License, or 10 | * (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA 20 | */ 21 | #include 22 | #ifndef _VIRTIO_VMMCI_H 23 | #define _VIRTIO_VMMCI_H 24 | 25 | /* Find if we have an RTC device or not. We should, but Linux gives us two 26 | * possible ways to find it based on how the guest kernel was configured at 27 | * build time. 
28 | */ 29 | #if defined(CONFIG_RTC_HCTOSYS_DEVICE) 30 | #define VMMCI_RTC_DEVICE CONFIG_RTC_HCTOSYS_DEVICE 31 | #elif defined(CONFIG_RTC_SYSTOHC_DEVICE) 32 | #define VMMCI_RTC_DEVICE CONFIG_RTC_SYSTOHC_DEVICE 33 | #endif 34 | 35 | const char *QNAME_MONITOR = "vmmci-monitor"; 36 | 37 | /* This should be picked up from the kernel config */ 38 | #ifdef CONFIG_HZ 39 | #define HZ CONFIG_HZ 40 | #else 41 | /* fallback to 100, the default in alpine-virt */ 42 | #define HZ 100 43 | #endif 44 | 45 | /* HZ jiffies = 1 second...but that assumes our clock doesn't have issues :-( */ 46 | #define DELAY_1s HZ 47 | #define DELAY_20s (20 * HZ) 48 | 49 | #define VIRTIO_ID_VMMCI 0xffff /* matches OpenBSD's private id */ 50 | 51 | #define PCI_VENDOR_ID_OPENBSD 0x0b5d 52 | #define PCI_DEVICE_ID_OPENBSD_VMMCI 0x0777 53 | 54 | /* Configuration registers */ 55 | #define VMMCI_CONFIG_COMMAND 0 56 | #define VMMCI_CONFIG_TIME_SEC 4 57 | #define VMMCI_CONFIG_TIME_USEC 12 58 | 59 | /* Features...these get bit-shifted in the Linux virtio code */ 60 | #define VMMCI_F_TIMESYNC 0 61 | #define VMMCI_F_ACK 1 62 | #define VMMCI_F_SYNCRTC 2 63 | 64 | /* 65 | * Linux is in a 32/64 bit transition phase where v4.17 and below 66 | * seem to define timespec64 as just timespec...ugh. Also, this is 67 | * probably a really bad idea to handle it this way? 68 | */ 69 | #if LINUX_VERSION_CODE < KERNEL_VERSION(4,18,0) 70 | #define TIME_FMT "%ld.%09ld" 71 | #else 72 | #define TIME_FMT "%lld.%09ld" 73 | #endif 74 | 75 | #define debug(fmt, ...) \ 76 | do { if (debug) pr_info("vmmci: [%s] " fmt, __func__, ##__VA_ARGS__); \ 77 | } while (0) 78 | #define log(fmt, ...) 
pr_info("vmmci: " fmt, ##__VA_ARGS__) 79 | 80 | #endif // _VIRTIO_VMMCI_H 81 | -------------------------------------------------------------------------------- /virtio_pci_common.h: -------------------------------------------------------------------------------- 1 | #ifndef _DRIVERS_VIRTIO_VIRTIO_PCI_COMMON_H 2 | #define _DRIVERS_VIRTIO_VIRTIO_PCI_COMMON_H 3 | /* 4 | * Virtio PCI driver - APIs for common functionality for all device versions 5 | * 6 | * This module allows virtio devices to be used over a virtual PCI device. 7 | * This can be used with QEMU based VMMs like KVM or Xen. 8 | * 9 | * Copyright IBM Corp. 2007 10 | * Copyright Red Hat, Inc. 2014 11 | * 12 | * Authors: 13 | * Anthony Liguori 14 | * Rusty Russell 15 | * Michael S. Tsirkin 16 | * 17 | * This work is licensed under the terms of the GNU GPL, version 2 or later. 18 | * See the COPYING file in the top-level directory. 19 | * 20 | */ 21 | 22 | #include 23 | #include 24 | #include 25 | #include 26 | #include 27 | #include 28 | #include 29 | #include 30 | #include 31 | #include 32 | #include 33 | #include 34 | 35 | struct virtio_pci_vq_info { 36 | /* the actual virtqueue */ 37 | struct virtqueue *vq; 38 | 39 | /* the list node for the virtqueues list */ 40 | struct list_head node; 41 | 42 | /* MSI-X vector (or none) */ 43 | unsigned msix_vector; 44 | }; 45 | 46 | /* Our device structure */ 47 | struct virtio_pci_device { 48 | struct virtio_device vdev; 49 | struct pci_dev *pci_dev; 50 | 51 | /* In legacy mode, these two point to within ->legacy. */ 52 | /* Where to read and clear interrupt */ 53 | u8 __iomem *isr; 54 | 55 | /* Modern only fields */ 56 | /* The IO mapping for the PCI config space (non-legacy mode) */ 57 | struct virtio_pci_common_cfg __iomem *common; 58 | /* Device-specific data (non-legacy mode) */ 59 | void __iomem *device; 60 | /* Base of vq notifications (non-legacy mode). */ 61 | void __iomem *notify_base; 62 | 63 | /* So we can sanity-check accesses. 
*/ 64 | size_t notify_len; 65 | size_t device_len; 66 | 67 | /* Capability for when we need to map notifications per-vq. */ 68 | int notify_map_cap; 69 | 70 | /* Multiply queue_notify_off by this value. (non-legacy mode). */ 71 | u32 notify_offset_multiplier; 72 | 73 | int modern_bars; 74 | 75 | /* Legacy only field */ 76 | /* the IO mapping for the PCI config space */ 77 | void __iomem *ioaddr; 78 | 79 | /* a list of queues so we can dispatch IRQs */ 80 | spinlock_t lock; 81 | struct list_head virtqueues; 82 | 83 | /* array of all queues for house-keeping */ 84 | struct virtio_pci_vq_info **vqs; 85 | 86 | /* MSI-X support */ 87 | int msix_enabled; 88 | int intx_enabled; 89 | cpumask_var_t *msix_affinity_masks; 90 | /* Name strings for interrupts. This size should be enough, 91 | * and I'm too lazy to allocate each name separately. */ 92 | char (*msix_names)[256]; 93 | /* Number of available vectors */ 94 | unsigned msix_vectors; 95 | /* Vectors allocated, excluding per-vq vectors if any */ 96 | unsigned msix_used_vectors; 97 | 98 | /* Whether we have vector per vq */ 99 | bool per_vq_vectors; 100 | 101 | struct virtqueue *(*setup_vq)(struct virtio_pci_device *vp_dev, 102 | struct virtio_pci_vq_info *info, 103 | unsigned idx, 104 | void (*callback)(struct virtqueue *vq), 105 | const char *name, 106 | bool ctx, 107 | u16 msix_vec); 108 | void (*del_vq)(struct virtio_pci_vq_info *info); 109 | 110 | u16 (*config_vector)(struct virtio_pci_device *vp_dev, u16 vector); 111 | }; 112 | 113 | /* Constants for MSI-X */ 114 | /* Use first vector for configuration changes, second and the rest for 115 | * virtqueues Thus, we need at least 2 vectors for MSI. 
*/ 116 | enum { 117 | VP_MSIX_CONFIG_VECTOR = 0, 118 | VP_MSIX_VQ_VECTOR = 1, 119 | }; 120 | 121 | /* Convert a generic virtio device to our structure */ 122 | static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev) 123 | { 124 | return container_of(vdev, struct virtio_pci_device, vdev); 125 | } 126 | 127 | /* wait for pending irq handlers */ 128 | void vp_synchronize_vectors(struct virtio_device *vdev); 129 | /* the notify function used when creating a virt queue */ 130 | bool vp_notify(struct virtqueue *vq); 131 | /* the config->del_vqs() implementation */ 132 | void vp_del_vqs(struct virtio_device *vdev); 133 | 134 | /* the config->find_vqs() implementation */ 135 | #if LINUX_VERSION_CODE < KERNEL_VERSION(4,11,0) 136 | int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs, 137 | struct virtqueue *vqs[], vq_callback_t *callbacks[], 138 | const char * const names[]); 139 | #elif LINUX_VERSION_CODE >= KERNEL_VERSION(4,11,0) && LINUX_VERSION_CODE < KERNEL_VERSION(4,12,0) 140 | int vp_find_vqs(struct virtio_device *, unsigned nvqs, 141 | struct virtqueue *vqs[], vq_callback_t *callbacks[], 142 | const char * const names[], struct irq_affinity *desc); 143 | #elif LINUX_VERSION_CODE >= KERNEL_VERSION(4,12,0) && LINUX_VERSION_CODE < KERNEL_VERSION(6,11,0) 144 | int vp_find_vqs(struct virtio_device *, unsigned int nvqs, 145 | struct virtqueue *vqs[], vq_callback_t *callbacks[], 146 | const char * const names[], const bool *ctx, 147 | struct irq_affinity *desc); 148 | #elif LINUX_VERSION_CODE >= KERNEL_VERSION(6,11,0) 149 | int vp_find_vqs(struct virtio_device *, unsigned int, 150 | struct virtqueue *[], struct virtqueue_info [], 151 | struct irq_affinity *); 152 | #else 153 | #error missing kernel version check 154 | #endif 155 | 156 | const char *vp_bus_name(struct virtio_device *vdev); 157 | 158 | /* Setup the affinity for a virtqueue: 159 | * - force the affinity for per vq vector 160 | * - OR over all affinities for shared MSI 161 | * - ignore 
the affinity request if we're using INTX 162 | */ 163 | 164 | #if LINUX_VERSION_CODE < KERNEL_VERSION(4,19,0) 165 | int vp_set_vq_affinity(struct virtqueue *vq, int cpu); 166 | #else 167 | int vp_set_vq_affinity(struct virtqueue *vq, const struct cpumask *cpu_mask); 168 | #endif 169 | 170 | #if LINUX_VERSION_CODE >= KERNEL_VERSION(4,13,0) 171 | const struct cpumask *vp_get_vq_affinity(struct virtio_device *vdev, int index); 172 | #endif 173 | 174 | int virtio_pci_obsd_probe(struct virtio_pci_device *); 175 | void virtio_pci_obsd_remove(struct virtio_pci_device *); 176 | 177 | // not sure yet which Linux version introduced this... 178 | #define VIRTIO_F_SR_IOV 37 179 | 180 | #endif 181 | -------------------------------------------------------------------------------- /virtio_pci_common.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Virtio PCI driver - common functionality for all device versions 3 | * 4 | * This module allows virtio devices to be used over a virtual PCI device. 5 | * This can be used with QEMU based VMMs like KVM or Xen. 6 | * 7 | * Copyright IBM Corp. 2007 8 | * Copyright Red Hat, Inc. 2014 9 | * 10 | * Authors: 11 | * Anthony Liguori 12 | * Rusty Russell 13 | * Michael S. Tsirkin 14 | * 15 | * This work is licensed under the terms of the GNU GPL, version 2 or later. 16 | * See the COPYING file in the top-level directory. 17 | * 18 | */ 19 | /* 20 | * Virtio PCI OpenBSD driver - a hacked version of the standard Virtio PCI 21 | * driver to handle current OpenBSD quirks. 22 | * 23 | * Authors: 24 | * Dave Voutila 25 | */ 26 | 27 | #include "virtio_pci_common.h" 28 | #include "virtio_vmmci.h" 29 | 30 | /* Handle a configuration change: Tell driver if it wants to know. 
*/ 31 | static irqreturn_t vp_config_changed(int irq, void *opaque) 32 | { 33 | struct virtio_pci_device *vp_dev = opaque; 34 | 35 | // printk(KERN_INFO "vp_config_changed: %02x\n", irq); 36 | 37 | virtio_config_changed(&vp_dev->vdev); 38 | return IRQ_HANDLED; 39 | } 40 | 41 | /* A small wrapper to also acknowledge the interrupt when it's handled. 42 | * I really need an EIO hook for the vring so I can ack the interrupt once we 43 | * know that we'll be handling the IRQ but before we invoke the callback since 44 | * the callback may notify the host which results in the host attempting to 45 | * raise an interrupt that we would then mask once we acknowledged the 46 | * interrupt. */ 47 | static irqreturn_t vp_interrupt(int irq, void *opaque) 48 | { 49 | // printk(KERN_INFO "virtio_pci_common: vp_interrupt (%d)\n", irq); 50 | vp_config_changed(irq, opaque); 51 | 52 | return IRQ_HANDLED; 53 | } 54 | 55 | /* the config->del_vqs() implementation */ 56 | void vp_del_vqs(struct virtio_device *vdev) 57 | { 58 | // XXX: we don't use queues! 
59 | } 60 | 61 | #if LINUX_VERSION_CODE < KERNEL_VERSION(4,11,0) 62 | int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs, 63 | struct virtqueue *vqs[], vq_callback_t *callbacks[], 64 | const char * const names[]) 65 | { 66 | return 0; 67 | } 68 | #elif LINUX_VERSION_CODE >= KERNEL_VERSION(4,11,0) && LINUX_VERSION_CODE < KERNEL_VERSION(4,12,0) 69 | int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs, 70 | struct virtqueue *vqs[], vq_callback_t *callbacks[], 71 | const char * const names[], struct irq_affinity *desc) 72 | { 73 | return 0; 74 | } 75 | #elif LINUX_VERSION_CODE >= KERNEL_VERSION(4,12,0) && LINUX_VERSION_CODE < KERNEL_VERSION(6,11,0) 76 | int vp_find_vqs(struct virtio_device *vdev, unsigned int nvqs, 77 | struct virtqueue *vqs[], vq_callback_t *callbacks[], 78 | const char * const names[], const bool *ctx, 79 | struct irq_affinity *desc) 80 | { 81 | return 0; 82 | } 83 | #elif LINUX_VERSION_CODE >= KERNEL_VERSION(6,11,0) 84 | /* 6.11 changed the API...grrrrr. */ 85 | int vp_find_vqs(struct virtio_device *vdev, unsigned int nvqs, 86 | struct virtqueue *vqs[], struct virtqueue_info vqs_info[], 87 | struct irq_affinity *desc) 88 | { 89 | return 0; 90 | } 91 | #endif 92 | 93 | const char *vp_bus_name(struct virtio_device *vdev) 94 | { 95 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 96 | 97 | return pci_name(vp_dev->pci_dev); 98 | } 99 | 100 | #if LINUX_VERSION_CODE < KERNEL_VERSION(4,19,0) 101 | /* Setup the affinity for a virtqueue: 102 | * - force the affinity for per vq vector 103 | * - OR over all affinities for shared MSI 104 | * - ignore the affinity request if we're using INTX 105 | */ 106 | int vp_set_vq_affinity(struct virtqueue *vq, int cpu) 107 | { 108 | struct virtio_device *vdev = vq->vdev; 109 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 110 | struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index]; 111 | struct cpumask *mask; 112 | unsigned int irq; 113 | 114 | if (!vq->callback) 115 | return -EINVAL; 116 | 
117 | if (vp_dev->msix_enabled) { 118 | mask = vp_dev->msix_affinity_masks[info->msix_vector]; 119 | irq = pci_irq_vector(vp_dev->pci_dev, info->msix_vector); 120 | if (cpu == -1) 121 | irq_set_affinity_hint(irq, NULL); 122 | else { 123 | cpumask_clear(mask); 124 | cpumask_set_cpu(cpu, mask); 125 | irq_set_affinity_hint(irq, mask); 126 | } 127 | } 128 | return 0; 129 | } 130 | #else 131 | /* Setup the affinity for a virtqueue: 132 | * - force the affinity for per vq vector 133 | * - OR over all affinities for shared MSI 134 | * - ignore the affinity request if we're using INTX 135 | */ 136 | int vp_set_vq_affinity(struct virtqueue *vq, const struct cpumask *cpu_mask) 137 | { 138 | struct virtio_device *vdev = vq->vdev; 139 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 140 | struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index]; 141 | struct cpumask *mask; 142 | unsigned int irq; 143 | 144 | if (!vq->callback) 145 | return -EINVAL; 146 | 147 | if (vp_dev->msix_enabled) { 148 | mask = vp_dev->msix_affinity_masks[info->msix_vector]; 149 | irq = pci_irq_vector(vp_dev->pci_dev, info->msix_vector); 150 | if (!cpu_mask) 151 | irq_set_affinity_hint(irq, NULL); 152 | else { 153 | cpumask_copy(mask, cpu_mask); 154 | irq_set_affinity_hint(irq, mask); 155 | } 156 | } 157 | return 0; 158 | } 159 | #endif 160 | 161 | const struct cpumask *vp_get_vq_affinity(struct virtio_device *vdev, int index) 162 | { 163 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 164 | 165 | if (!vp_dev->per_vq_vectors || 166 | vp_dev->vqs[index]->msix_vector == VIRTIO_MSI_NO_VECTOR) 167 | return NULL; 168 | 169 | return pci_irq_get_affinity(vp_dev->pci_dev, 170 | vp_dev->vqs[index]->msix_vector); 171 | } 172 | 173 | /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. 
*/ 174 | static const struct pci_device_id virtio_pci_id_table[] = { 175 | { PCI_DEVICE(PCI_VENDOR_ID_OPENBSD, PCI_DEVICE_ID_OPENBSD_VMMCI) }, 176 | { 0 } 177 | }; 178 | 179 | MODULE_DEVICE_TABLE(pci, virtio_pci_id_table); 180 | 181 | static void virtio_pci_release_dev(struct device *_d) 182 | { 183 | struct virtio_device *vdev = dev_to_virtio(_d); 184 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 185 | 186 | /* As struct device is a kobject, it's not safe to 187 | * free the memory (including the reference counter itself) 188 | * until its release callback. */ 189 | kfree(vp_dev); 190 | } 191 | 192 | static int virtio_pci_probe(struct pci_dev *pci_dev, 193 | const struct pci_device_id *id) 194 | { 195 | struct virtio_pci_device *vp_dev, *reg_dev = NULL; 196 | int rc; 197 | 198 | /* allocate our structure and fill it out */ 199 | vp_dev = kzalloc(sizeof(struct virtio_pci_device), GFP_KERNEL); 200 | if (!vp_dev) 201 | return -ENOMEM; 202 | 203 | pci_set_drvdata(pci_dev, vp_dev); 204 | vp_dev->vdev.dev.parent = &pci_dev->dev; 205 | vp_dev->vdev.dev.release = virtio_pci_release_dev; 206 | vp_dev->pci_dev = pci_dev; 207 | INIT_LIST_HEAD(&vp_dev->virtqueues); 208 | spin_lock_init(&vp_dev->lock); 209 | 210 | /* enable the device */ 211 | rc = pci_enable_device(pci_dev); 212 | if (rc) 213 | goto err_enable_device; 214 | 215 | rc = virtio_pci_obsd_probe(vp_dev); 216 | if (rc) 217 | goto err_probe; 218 | 219 | 220 | pci_set_master(pci_dev); 221 | 222 | 223 | rc = request_irq(pci_dev->irq, vp_interrupt, IRQF_SHARED, 224 | dev_name(&vp_dev->vdev.dev), vp_dev); 225 | if (rc) 226 | goto err_probe; 227 | 228 | rc = register_virtio_device(&vp_dev->vdev); 229 | reg_dev = vp_dev; 230 | if (rc) 231 | goto err_register; 232 | 233 | return 0; 234 | 235 | err_register: 236 | virtio_pci_obsd_remove(vp_dev); 237 | err_probe: 238 | pci_disable_device(pci_dev); 239 | err_enable_device: 240 | if (reg_dev) 241 | put_device(&vp_dev->vdev.dev); 242 | else 243 | kfree(vp_dev); 244 | 
return rc; 245 | } 246 | 247 | static void virtio_pci_remove(struct pci_dev *pci_dev) 248 | { 249 | struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); 250 | struct device *dev = get_device(&vp_dev->vdev.dev); 251 | 252 | pci_disable_sriov(pci_dev); 253 | 254 | unregister_virtio_device(&vp_dev->vdev); 255 | 256 | virtio_pci_obsd_remove(vp_dev); 257 | 258 | // XXX: where should this go? 259 | free_irq(pci_dev->irq, vp_dev); 260 | 261 | pci_disable_device(pci_dev); 262 | put_device(dev); 263 | } 264 | 265 | static int virtio_pci_sriov_configure(struct pci_dev *pci_dev, int num_vfs) 266 | { 267 | // XXX: not needed? 268 | return 0; 269 | } 270 | 271 | #ifdef CONFIG_PM_SLEEP 272 | static int virtio_pci_freeze(struct device *dev) 273 | { 274 | // XXX: vmm(4) does not support power management 275 | return 0; 276 | } 277 | 278 | static int virtio_pci_restore(struct device *dev) 279 | { 280 | // XXX: vmm(4) does not support power management 281 | return 0; 282 | } 283 | 284 | static const struct dev_pm_ops virtio_pci_pm_ops = { 285 | SET_SYSTEM_SLEEP_PM_OPS(virtio_pci_freeze, virtio_pci_restore) 286 | }; 287 | #endif 288 | 289 | 290 | static struct pci_driver virtio_pci_driver = { 291 | .name = "virtio-pci-obsd", 292 | .id_table = virtio_pci_id_table, 293 | .probe = virtio_pci_probe, 294 | .remove = virtio_pci_remove, 295 | #ifdef CONFIG_PM_SLEEP 296 | .driver.pm = &virtio_pci_pm_ops, 297 | #endif 298 | .sriov_configure = virtio_pci_sriov_configure, 299 | }; 300 | 301 | module_pci_driver(virtio_pci_driver); 302 | 303 | MODULE_AUTHOR("Dave Voutila "); 304 | MODULE_DESCRIPTION("virtio-pci-obsd"); 305 | MODULE_LICENSE("GPL"); 306 | MODULE_VERSION("1"); 307 | -------------------------------------------------------------------------------- /virtio_pci_openbsd.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Virtio PCI driver - legacy device support 3 | * 4 | * This module allows virtio devices to be used over a virtual 
PCI device. 5 | * This can be used with QEMU based VMMs like KVM or Xen. 6 | * 7 | * Copyright IBM Corp. 2007 8 | * Copyright Red Hat, Inc. 2014 9 | * 10 | * Authors: 11 | * Anthony Liguori 12 | * Rusty Russell 13 | * Michael S. Tsirkin 14 | * 15 | * This work is licensed under the terms of the GNU GPL, version 2 or later. 16 | * See the COPYING file in the top-level directory. 17 | * 18 | */ 19 | /* 20 | * Virtio PCI driver - OpenBSD device support 21 | * 22 | * This module handles the quirks related to OpenBSD virtio control devices 23 | * like virtio_vmmci. 24 | * 25 | * Authors: 26 | * Dave Voutila 27 | */ 28 | #include "virtio_pci_common.h" 29 | 30 | /* virtio config->get_features() implementation */ 31 | static u64 vp_get_features(struct virtio_device *vdev) 32 | { 33 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 34 | 35 | /* When someone needs more than 32 feature bits, we'll need to 36 | * steal a bit to indicate that the rest are somewhere else. */ 37 | return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES); 38 | } 39 | 40 | /* virtio config->finalize_features() implementation */ 41 | static int vp_finalize_features(struct virtio_device *vdev) 42 | { 43 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 44 | 45 | /* Give virtio_ring a chance to accept features. */ 46 | // vring_transport_features(vdev); 47 | 48 | /* Make sure we don't have any features > 32 bits! */ 49 | BUG_ON((u32)vdev->features != vdev->features); 50 | 51 | /* We only support 32 feature bits. */ 52 | iowrite32(vdev->features, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES); 53 | 54 | return 0; 55 | } 56 | 57 | // Apparently older 3.x Linux Kernels don't have this? 58 | #ifndef VIRTIO_PCI_CONFIG_OFF 59 | #define VIRTIO_PCI_CONFIG_OFF(x) ((x) ? 24 : 20) 60 | #endif 61 | 62 | /* OpenBSD's vmmci does some funky stuff when reading registers, so the normal 63 | Linux legacy vp_get won't work since it reads a byte at a time iterating 64 | over the registers. 
65 | */ 66 | static void vp_get(struct virtio_device *vdev, unsigned offset, 67 | void *buf, unsigned len) 68 | { 69 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 70 | void __iomem *config_addr = vp_dev->ioaddr + 71 | VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled); 72 | u8 b; 73 | __le16 w; 74 | __le32 l; 75 | 76 | BUG_ON(NULL == config_addr); 77 | switch (len) { 78 | case 1: 79 | b = ioread8(config_addr + offset); 80 | memcpy(buf, &b, sizeof b); 81 | break; 82 | case 2: 83 | w = cpu_to_le16(ioread16(config_addr + offset)); 84 | memcpy(buf, &w, sizeof w); 85 | break; 86 | case 4: 87 | l = cpu_to_le32(ioread32(config_addr + offset)); 88 | memcpy(buf, &l, sizeof l); 89 | break; 90 | case 8: 91 | l = cpu_to_le32(ioread32(config_addr + offset)); 92 | memcpy(buf, &l, sizeof l); 93 | l = cpu_to_le32(ioread32(config_addr + offset + sizeof l)); 94 | memcpy(buf + sizeof l, &l, sizeof l); 95 | break; 96 | default: 97 | BUG(); 98 | } 99 | } 100 | 101 | /* Similar to vp_get, we need to use the logic from Linux's virtio_pci_modern 102 | to make sure we write to the device properly 103 | */ 104 | static void vp_set(struct virtio_device *vdev, unsigned offset, 105 | const void *buf, unsigned len) 106 | { 107 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 108 | void __iomem *config_addr = vp_dev->ioaddr + 109 | VIRTIO_PCI_CONFIG_OFF(vp_dev->msix_enabled); 110 | u8 b; 111 | __le16 w; 112 | __le32 l; 113 | 114 | // BUG_ON(offset + len > vp_dev->device_len); 115 | 116 | switch (len) { 117 | case 1: 118 | memcpy(&b, buf, sizeof b); 119 | iowrite8(b, config_addr + offset); 120 | break; 121 | case 2: 122 | memcpy(&w, buf, sizeof w); 123 | iowrite16(le16_to_cpu(w), config_addr + offset); 124 | break; 125 | case 4: 126 | memcpy(&l, buf, sizeof l); 127 | iowrite32(le32_to_cpu(l), config_addr + offset); 128 | break; 129 | case 8: 130 | memcpy(&l, buf, sizeof l); 131 | iowrite32(le32_to_cpu(l), config_addr + offset); 132 | memcpy(&l, buf + sizeof l, sizeof l); 133 | 
iowrite32(le32_to_cpu(l), config_addr + offset + sizeof l); 134 | break; 135 | default: 136 | BUG(); 137 | } 138 | } 139 | 140 | /* config->{get,set}_status() implementations */ 141 | static u8 vp_get_status(struct virtio_device *vdev) 142 | { 143 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 144 | return ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS); 145 | } 146 | 147 | static void vp_set_status(struct virtio_device *vdev, u8 status) 148 | { 149 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 150 | /* We should never be setting status to 0. */ 151 | BUG_ON(status == 0); 152 | iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS); 153 | } 154 | 155 | static void vp_reset(struct virtio_device *vdev) 156 | { 157 | struct virtio_pci_device *vp_dev = to_vp_device(vdev); 158 | /* 0 status means a reset. */ 159 | iowrite8(0, vp_dev->ioaddr + VIRTIO_PCI_STATUS); 160 | /* Flush out the status write, and flush in device writes, 161 | * including MSI-X interrupts, if any. */ 162 | ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS); 163 | } 164 | 165 | static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector) 166 | { 167 | /* Setup the vector used for configuration events */ 168 | iowrite16(vector, vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR); 169 | /* Verify we had enough resources to assign the vector */ 170 | /* Will also flush the write out to device */ 171 | return ioread16(vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR); 172 | } 173 | 174 | static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev, 175 | struct virtio_pci_vq_info *info, 176 | unsigned index, 177 | void (*callback)(struct virtqueue *vq), 178 | const char *name, 179 | bool ctx, 180 | u16 msix_vec) 181 | { 182 | // XXX: not needed 183 | return ERR_PTR(-ENOENT); 184 | } 185 | 186 | static void del_vq(struct virtio_pci_vq_info *info) 187 | { 188 | // XXX: we don't use virt queues 189 | } 190 | 191 | static const struct virtio_config_ops virtio_pci_config_ops = { 192 | .get = 
vp_get, 193 | .set = vp_set, 194 | .get_status = vp_get_status, 195 | .set_status = vp_set_status, 196 | .reset = vp_reset, 197 | .find_vqs = vp_find_vqs, 198 | .del_vqs = vp_del_vqs, 199 | .get_features = vp_get_features, 200 | .finalize_features = vp_finalize_features, 201 | .bus_name = vp_bus_name, 202 | .set_vq_affinity = vp_set_vq_affinity, 203 | #if LINUX_VERSION_CODE >= KERNEL_VERSION(4,13,0) 204 | .get_vq_affinity = vp_get_vq_affinity, 205 | #endif 206 | }; 207 | 208 | static int virtio_pci_obsd_match(struct pci_dev *pci_dev) 209 | { 210 | printk(KERN_INFO "virtio_pci_obsd_match: matching 0x%04x\n", pci_dev->device); 211 | /* Typically we'd only own devices >= 0x1000 and <= 0x103f... */ 212 | if (pci_dev->device < 0x1000 || pci_dev->device > 0x103f) { 213 | /* but we make an exception for OpenBSD */ 214 | if (pci_dev->device == 0x0777) { 215 | printk(KERN_INFO "virtio_pci_obsd_match: found OpenBSD device\n"); 216 | return 0; 217 | } 218 | printk(KERN_ERR "virtio_pci_obsd_match: unknown device\n"); 219 | return -ENODEV; 220 | } 221 | 222 | printk(KERN_INFO "virtio_pci_obsd_match: regular virtio match\n"); 223 | return 0; 224 | } 225 | 226 | /* the PCI probing function */ 227 | int virtio_pci_obsd_probe(struct virtio_pci_device *vp_dev) 228 | { 229 | struct pci_dev *pci_dev = vp_dev->pci_dev; 230 | int rc; 231 | 232 | rc = virtio_pci_obsd_match(pci_dev); 233 | if (rc) 234 | return rc; 235 | 236 | if (pci_dev->revision != VIRTIO_PCI_ABI_VERSION) { 237 | printk(KERN_ERR "virtio_pci_obsd: expected ABI version %d, got %d\n", 238 | VIRTIO_PCI_ABI_VERSION, pci_dev->revision); 239 | return -ENODEV; 240 | } 241 | 242 | rc = dma_set_mask(&pci_dev->dev, DMA_BIT_MASK(64)); 243 | 244 | // XXX: I think we can cut this stuff out since vmd(8) only 245 | // supports 64-bit guests 246 | if (rc) { 247 | rc = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32)); 248 | } else { 249 | /* 250 | * The virtio ring base address is expressed as a 32-bit PFN, 251 | * with a page 
size of 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT. 252 | */ 253 | dma_set_coherent_mask(&pci_dev->dev, 254 | DMA_BIT_MASK(32 + VIRTIO_PCI_QUEUE_ADDR_SHIFT)); 255 | } 256 | 257 | if (rc) 258 | dev_warn(&pci_dev->dev, "Failed to enable 64-bit or 32-bit DMA. Trying to continue, but this might not work.\n"); 259 | 260 | rc = pci_request_region(pci_dev, 0, "virtio-pci-obsd"); 261 | if (rc) 262 | return rc; 263 | 264 | rc = -ENOMEM; 265 | vp_dev->ioaddr = pci_iomap(pci_dev, 0, 0); 266 | if (!vp_dev->ioaddr) 267 | goto err_iomap; 268 | 269 | vp_dev->isr = vp_dev->ioaddr + VIRTIO_PCI_ISR; 270 | 271 | /* we use the subsystem vendor/device id as the virtio vendor/device 272 | * id. this allows us to use the same PCI vendor/device id for all 273 | * virtio devices and to identify the particular virtio driver by 274 | * the subsystem ids */ 275 | vp_dev->vdev.id.vendor = pci_dev->subsystem_vendor; 276 | vp_dev->vdev.id.device = pci_dev->subsystem_device; 277 | 278 | vp_dev->vdev.config = &virtio_pci_config_ops; 279 | 280 | vp_dev->config_vector = vp_config_vector; 281 | vp_dev->setup_vq = setup_vq; 282 | vp_dev->del_vq = del_vq; 283 | 284 | return 0; 285 | 286 | err_iomap: 287 | pci_release_region(pci_dev, 0); 288 | return rc; 289 | } 290 | 291 | void virtio_pci_obsd_remove(struct virtio_pci_device *vp_dev) 292 | { 293 | struct pci_dev *pci_dev = vp_dev->pci_dev; 294 | 295 | pci_iounmap(pci_dev, vp_dev->ioaddr); 296 | pci_release_region(pci_dev, 0); 297 | } 298 | -------------------------------------------------------------------------------- /virtio_vmmci.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Implementation of an OpenBSD VMM control interface for Linux guests 3 | * running under an OpenBSD host. 
4 | * 5 | * Copyright 2020 Dave Voutila 6 | * 7 | * This program is free software; you can redistribute it and/or modify 8 | * it under the terms of the GNU General Public License as published by 9 | * the Free Software Foundation; either version 2 of the License, or 10 | * (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA 20 | */ 21 | 22 | #include 23 | #include 24 | #include 25 | #include 26 | #include 27 | #include 28 | #include 29 | #include 30 | #include 31 | #include 32 | 33 | #include "virtio_vmmci.h" 34 | 35 | /* You can either change the global debug level here by changing the 36 | * initialization value for "debug" or configure it at runtime via 37 | * the kernel module parameter. See README.md for details. 38 | */ 39 | static int debug = 0; 40 | 41 | static int set_debug(const char *val, const struct kernel_param *kp) 42 | { 43 | int n = 0, rc; 44 | rc = kstrtoint(val, 10, &n); 45 | if (rc || n < 0) 46 | return -EINVAL; 47 | 48 | return param_set_int(val, kp); 49 | } 50 | 51 | static int get_debug(char *buffer, const struct kernel_param *kp) 52 | { 53 | int bytes; 54 | bytes = snprintf(buffer, 1024, "%d\n", debug); 55 | return bytes + 1; // account for NULL 56 | } 57 | 58 | static const struct kernel_param_ops debug_param_ops = { 59 | .set = set_debug, 60 | .get = get_debug, 61 | }; 62 | 63 | module_param_cb(debug, &debug_param_ops, &debug, 0664); 64 | 65 | /* Define our sysctl table entries for exposing our current clock 66 | * drift in seconds and nanoseconds. 
(Avoid using floating point vals 67 | * for now.) 68 | */ 69 | int drift_sec = 0; 70 | int drift_nsec = 0; 71 | 72 | static struct ctl_table_header *vmmci_table_header; 73 | 74 | static struct ctl_table drift_table[] = { 75 | { 76 | .procname = "drift_sec", 77 | .mode = 0444, 78 | .maxlen = sizeof(int), 79 | .data = &drift_sec, 80 | .proc_handler = &proc_dointvec, 81 | }, 82 | { 83 | .procname = "drift_nsec", 84 | .mode = 0444, 85 | .maxlen = sizeof(int), 86 | .data = &drift_nsec, 87 | .proc_handler = &proc_dointvec, 88 | }, 89 | { }, 90 | }; 91 | 92 | 93 | #if LINUX_VERSION_CODE < KERNEL_VERSION(6,6,0) 94 | /* 95 | * Removed in: 96 | * https://github.com/torvalds/linux/commit/2f2665c13af4895b26761107c2f637c2f112d8e9 97 | */ 98 | static struct ctl_table vmmci_table = { 99 | .procname = "vmmci", 100 | .child = drift_table, 101 | }; 102 | #endif 103 | 104 | /* Define our basic commands and structs for our device including the 105 | * virtio feature tables. 106 | */ 107 | enum vmmci_cmd { 108 | VMMCI_NONE = 0, 109 | VMMCI_SHUTDOWN, 110 | VMMCI_REBOOT, 111 | VMMCI_SYNCRTC, 112 | }; 113 | 114 | struct virtio_vmmci { 115 | struct virtio_device *vdev; 116 | 117 | /* Used for monitoring clock drift. Needs scheduling. */ 118 | struct workqueue_struct *monitor_wq; 119 | struct delayed_work monitor_work; 120 | 121 | /* Used for synchronizing clock. Work is put on from 122 | * the general purpose queue from the interrupt handler. 123 | */ 124 | struct work_struct sync_work; 125 | }; 126 | 127 | static struct virtio_device_id id_table[] = { 128 | { VIRTIO_ID_VMMCI, VIRTIO_DEV_ANY_ID }, 129 | { 0 }, 130 | }; 131 | 132 | static unsigned int features[] = { 133 | VMMCI_F_TIMESYNC, VMMCI_F_ACK, VMMCI_F_SYNCRTC, 134 | }; 135 | 136 | 137 | /* Synchronizes the system time to the hardware clock (rtc). Uses a process 138 | * similar to the one performed by the kernel at startup as defined in 139 | * the Linux kernel source file /drivers/rtc/hctosys.c. 
Minus the 32-bit 140 | * and non-amd64 specific stuff. 141 | */ 142 | #ifdef VMMCI_RTC_DEVICE 143 | static int sync_system_time(void) 144 | { 145 | int rc = -1; 146 | struct rtc_time hw_tm; 147 | 148 | #if LINUX_VERSION_CODE < KERNEL_VERSION(3,17,0) 149 | struct timespec time = { 150 | #else 151 | struct timespec64 time = { 152 | #endif 153 | .tv_nsec = NSEC_PER_SEC >> 1, 154 | }; 155 | 156 | // Try to open the hardware clock...which should be the emulated 157 | // mc146818 clock device. 158 | struct rtc_device *rtc = rtc_class_open(VMMCI_RTC_DEVICE); 159 | if (rtc == NULL) { 160 | printk(KERN_ERR "vmmci unable to open rtc device\n"); 161 | rc = -ENODEV; 162 | goto end; 163 | } 164 | 165 | // Reading the rtc device should be the same as getting the host 166 | // time via the vmmci config registers...just without all the 167 | // nastiness 168 | rc = rtc_read_time(rtc, &hw_tm); 169 | if (rc) { 170 | printk(KERN_ERR "vmmci failed to read the hardware clock\n"); 171 | goto close; 172 | } 173 | 174 | // Setting the system clock using do_settimeofday64 should be safe 175 | // as it is similar to OpenBSD's tc_setclock that steps the system 176 | // clock while triggering any alarms/timeouts that should fire 177 | #if LINUX_VERSION_CODE < KERNEL_VERSION(3,17,0) 178 | rtc_tm_to_time(&hw_tm, &time.tv_sec); 179 | rc = do_settimeofday(&time); 180 | #else 181 | time.tv_sec = rtc_tm_to_time64(&hw_tm); 182 | rc = do_settimeofday64(&time); 183 | #endif 184 | if (rc) { 185 | printk(KERN_ERR "vmmci failed to set system clock to rtc!\n"); 186 | goto close; 187 | } 188 | log("set system clock to %d-%02d-%02d %02d:%02d:%02d UTC\n", 189 | hw_tm.tm_year + 1900, hw_tm.tm_mon + 1, hw_tm.tm_mday, 190 | hw_tm.tm_hour, hw_tm.tm_min, hw_tm.tm_sec); 191 | 192 | close: 193 | // I assume this cleans up any references, if the kernel tracks them 194 | rtc_class_close(rtc); 195 | 196 | end: 197 | return rc; 198 | } 199 | #else 200 | static int sync_system_time(void) 201 | { 202 | debug("no known 
rtc device available"); 203 | return -1; 204 | } 205 | #endif 206 | 207 | static void sync_work_func(struct work_struct *work) 208 | { 209 | int rc = 0; 210 | 211 | debug("starting clock synchronization..."); 212 | rc = sync_system_time(); 213 | if (rc) 214 | debug("clock synchronization failed (%d)\n", rc); 215 | else 216 | debug("finished clock synchronization!\n"); 217 | 218 | } 219 | 220 | /* Runs our guest/host clock drift measurements and logs them to the syslog */ 221 | static void monitor_work_func(struct work_struct *work) 222 | { 223 | struct virtio_vmmci *vmmci; 224 | struct timespec64 host, guest, diff; 225 | s64 sec, usec; // should these be signed or unsigned? 226 | 227 | debug("measuring clock drift...\n"); 228 | 229 | // My god this container_of stuff seems...messy? Oh, Linux... 230 | vmmci = container_of((struct delayed_work *) work, struct virtio_vmmci, monitor_work); 231 | 232 | vmmci->vdev->config->get(vmmci->vdev, VMMCI_CONFIG_TIME_SEC, &sec, sizeof(sec)); 233 | vmmci->vdev->config->get(vmmci->vdev, VMMCI_CONFIG_TIME_USEC, &usec, sizeof(usec)); 234 | 235 | #if LINUX_VERSION_CODE < KERNEL_VERSION(5,0,0) 236 | getnstimeofday64(&guest); 237 | #else 238 | ktime_get_real_ts64(&guest); 239 | #endif 240 | debug("host clock: %lld.%09lld, guest clock: " TIME_FMT, 241 | sec, usec * NSEC_PER_USEC, guest.tv_sec, guest.tv_nsec); 242 | 243 | host.tv_sec = sec; 244 | host.tv_nsec = (long) usec * NSEC_PER_USEC; 245 | 246 | diff = timespec64_sub(host, guest); 247 | 248 | // XXX: our globals for tracking drift...since we're not SMP enabled let's 249 | // ignore locking/unlocking for now...also yes, we're blindly going from a 250 | // s64 to an int here. 
251 | drift_sec = diff.tv_sec; 252 | drift_nsec = diff.tv_nsec; 253 | 254 | debug("current clock drift: " TIME_FMT " seconds\n", diff.tv_sec, diff.tv_nsec); 255 | 256 | queue_delayed_work(vmmci->monitor_wq, &vmmci->monitor_work, DELAY_20s); 257 | debug("drift measurement routine finished\n"); 258 | } 259 | 260 | static int vmmci_probe(struct virtio_device *vdev) 261 | { 262 | struct virtio_vmmci *vmmci; 263 | 264 | debug("initializing vmmci device\n"); 265 | debug("HZ: %d", HZ); 266 | vdev->priv = vmmci = kzalloc(sizeof(*vmmci), GFP_KERNEL); 267 | if (!vmmci) { 268 | printk(KERN_ERR "vmmci_probe: failed to alloc vmmci struct\n"); 269 | return -ENOMEM; 270 | } 271 | vmmci->vdev = vdev; 272 | 273 | if (virtio_has_feature(vdev, VMMCI_F_TIMESYNC)) 274 | debug("...found feature TIMESYNC\n"); 275 | if (virtio_has_feature(vdev, VMMCI_F_ACK)) 276 | debug("...found feature ACK\n"); 277 | if (virtio_has_feature(vdev, VMMCI_F_SYNCRTC)) 278 | debug("...found feature SYNCRTC\n"); 279 | 280 | // wire up routine clock drift monitoring 281 | vmmci->monitor_wq = create_singlethread_workqueue(QNAME_MONITOR); 282 | if (vmmci->monitor_wq == NULL) { 283 | printk(KERN_ERR "vmmci_probe: failed to alloc monitoring workqueue\n"); 284 | return -ENOMEM; 285 | } 286 | 287 | INIT_DELAYED_WORK(&vmmci->monitor_work, monitor_work_func); 288 | queue_delayed_work(vmmci->monitor_wq, &vmmci->monitor_work, DELAY_1s); 289 | 290 | INIT_WORK(&vmmci->sync_work, sync_work_func); 291 | 292 | #if LINUX_VERSION_CODE < KERNEL_VERSION(6,6,0) 293 | vmmci_table_header = register_sysctl_table(&vmmci_table); 294 | #else 295 | vmmci_table_header = register_sysctl_sz("vmmci", drift_table, 2); 296 | #endif 297 | log("started VMM Control Interface driver\n"); 298 | return 0; 299 | } 300 | 301 | static void vmmci_remove(struct virtio_device *vdev) 302 | { 303 | struct virtio_vmmci *vmmci = vdev->priv; 304 | debug("removing device\n"); 305 | 306 | cancel_delayed_work(&vmmci->monitor_work); 307 | 
flush_workqueue(vmmci->monitor_wq); 308 | destroy_workqueue(vmmci->monitor_wq); 309 | cancel_work_sync(&vmmci->sync_work); 310 | debug("cancelled, flushed, and destroyed work queues\n"); 311 | 312 | vdev->config->reset(vdev); 313 | debug("reset device\n"); 314 | 315 | kfree(vmmci); 316 | 317 | unregister_sysctl_table(vmmci_table_header); 318 | 319 | log("removed device\n"); 320 | } 321 | 322 | static void vmmci_changed(struct virtio_device *vdev) 323 | { 324 | struct virtio_vmmci *vmmci = vdev->priv; 325 | s32 cmd = 0; 326 | debug("reading command register...\n"); 327 | 328 | vdev->config->get(vdev, VMMCI_CONFIG_COMMAND, &cmd, sizeof(cmd)); 329 | 330 | switch (cmd) { 331 | case VMMCI_NONE: 332 | debug("VMMCI_NONE received\n"); 333 | break; 334 | 335 | case VMMCI_SHUTDOWN: 336 | log("shutdown requested by host!\n"); 337 | orderly_poweroff(false); 338 | break; 339 | 340 | case VMMCI_REBOOT: 341 | log("reboot requested by host!\n"); 342 | orderly_reboot(); 343 | break; 344 | 345 | case VMMCI_SYNCRTC: 346 | log("clock sync requested by host\n"); 347 | schedule_work(&vmmci->sync_work); 348 | break; 349 | 350 | default: 351 | printk(KERN_ERR "invalid command received: 0x%04x\n", cmd); 352 | break; 353 | } 354 | 355 | if (cmd != VMMCI_NONE 356 | && (vdev->features & VMMCI_F_ACK)) { 357 | vdev->config->set(vdev, VMMCI_CONFIG_COMMAND, &cmd, sizeof(cmd)); 358 | debug("...acknowledged command %d\n", cmd); 359 | } 360 | } 361 | 362 | #if LINUX_VERSION_CODE >= KERNEL_VERSION(4,10,0) 363 | static int vmmci_validate(struct virtio_device *vdev) 364 | { 365 | debug("not implemented"); 366 | return 0; 367 | } 368 | #endif 369 | 370 | #ifdef CONFIG_PM_SLEEP 371 | static int vmmci_freeze(struct virtio_device *vdev) 372 | { 373 | debug("not implemented\n"); 374 | return 0; 375 | } 376 | 377 | static int vmmci_restore(struct virtio_device *vdev) 378 | { 379 | debug("not implemented\n"); 380 | return 0; 381 | } 382 | #endif 383 | 384 | static struct virtio_driver virtio_vmmci_driver = { 
385 | .feature_table = features, 386 | .feature_table_size = ARRAY_SIZE(features), 387 | .driver.name = KBUILD_MODNAME, 388 | .driver.owner = THIS_MODULE, 389 | .id_table = id_table, 390 | #if LINUX_VERSION_CODE >= KERNEL_VERSION(4,10,0) 391 | .validate = vmmci_validate, 392 | #endif 393 | .probe = vmmci_probe, 394 | .remove = vmmci_remove, 395 | .config_changed = vmmci_changed, 396 | #ifdef CONFIG_PM_SLEEP 397 | .freeze = vmmci_freeze, 398 | .restore = vmmci_restore, 399 | #endif 400 | }; 401 | 402 | module_virtio_driver(virtio_vmmci_driver); 403 | MODULE_LICENSE("GPL"); 404 | MODULE_DESCRIPTION("OpenBSD VMM Control Interface"); 405 | MODULE_AUTHOR("Dave Voutila "); 406 | MODULE_SOFTDEP("pre: virtio_pci_obsd"); 407 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # An OpenBSD VMM Control Interface (vmmci) for Linux 2 | _...or "How I learned to shut my x270 laptop and not worry about my VMs."_ 3 | 4 | [![builds.sr.ht status](https://builds.sr.ht/~voutilad/virtio_vmmci.svg)](https://builds.sr.ht/~voutilad/virtio_vmmci?) 5 | 6 | This is an implementation of [vmmci(4)](https://man.openbsd.org/vmmci) for 7 | Linux using a customized version of the `virtio_pci` driver from the 8 | mainline kernel. It currently supports the following: 9 | 10 | 1. **Clean Shutdowns on Request** 11 | When requested by `vmctl(8)`...you can safely use 12 | `vmctl stop ` and it'll nicely stop services and 13 | sync disks! 14 | 15 | 2. **System Time Synchronization** 16 | When the host `vmd(8)` emulation of the hardware clock detects a 17 | clock drift (most likely due to the host being suspended/resumed), 18 | it fires a `SYNCRTC` message that the Linux `vmmci` driver responds 19 | to by synchronizing system time to the hardware clock time. (This 20 | currently only happens during certain host events like resuming 21 | from a suspended state.) 22 | 23 | 3. 
**Tracking Clock Drift** 24 | At regular intervals (currently 20s), `vmmci` will measure current 25 | clock drift, recording the drift as separate seconds and 26 | nanoseconds values readable via `sysctl vmmci`. 27 | 28 | > **NOTE:** if you're here to deal with constant, excessive clock 29 | > drift, see the [FAQ](#wait-why-isnt-this-fixing-my-clock-drift-issues)! 30 | 31 | ## Example with Linux Guests 32 | ![vmd(8) and 3 Linux guests](/example.png?raw=true "VMD(8) and 3 Linux Guests") 33 | 34 | Above is a screenshot of the clock sync in practice. Tmux pane `0` is 35 | my instance of `vmd(8)` running in the foreground with verbose 36 | logging. The other panes: 37 | 38 | 1. **Alpine 3.8.4** (virt) with kernel 4.14.104-0-virt 39 | 2. **Debian Stretch** (9.8) with kernel 4.9.0-8-amd64 (yeah, something 40 | is jacked up with dmesg's time...but it IS correct in 41 | `journalctl(1)` and when checking `timedatectl(1)`) 42 | 3. **Ubuntu 18.04** with my custom kernel 4.20.13-obsd+ 43 | 44 | Take note of the `rtc_fire1` log events from `vmd(8)`. That's where my 45 | laptop comes out of hibernation and the virtual rtc detects a drift 46 | and sends sync requests to the guests. Each Linux guest receives the 47 | request, performs the clock step, and acks. 48 | 49 | ## Known Issues or Caveats 50 | Before you dive in, a few things to note: 51 | 52 | 1. I test and develop using OpenBSD snapshots, so relatively in sync 53 | with _-current_. (This should work with OpenBSD 6.7 and later.) 54 | 55 | 2. I lean heavily on the simplification that OpenBSD virtualization 56 | guests are single CPU currently. 57 | 58 | 3. This currently won't solve larger clock issues, such as major drift. 59 | 60 | 4. I primarily focus on supporting the newest long-term support 61 | kernels picked up by major distros, which means Linux 5.4 at the 62 | moment. 63 | 64 | 5. 
I focus my testing on **Alpine Linux** guests using their `-virt` 65 | releases since it's simple to install and manage without a lot of 66 | ancillary stuff. Plus, _I personally like Alpine_. 67 | 68 | ## Installation & Usage 69 | This Linux VMMCI currently comes in **two parts:** 70 | 71 | 1. `virtio_pci_obsd.ko` -- handles the quirks of getting Linux's 72 | virtio pci framework to properly work with the VMM Control 73 | Interface device from `vmd(8)` 74 | 2. `virtio_vmmci.ko` -- virtio device driver that replicates the 75 | behavior of OpenBSD's `vmmci(4)` driver 76 | 77 | _You will need both modules installed!_ 78 | 79 | Assuming you've got a recent Linux distro running as a guest already 80 | under OpenBSD, it shouldn't be more than a few minutes to get things 81 | up and running. 82 | 83 | ### 1. Prerequisites 84 | Install the tools required to build kernel modules using your package 85 | manager or whatever you normally use to install stuff. 86 | 87 | For Alpine systems running the `-virt` flavored kernel: 88 | 89 | ```sh 90 | # apk add gcc make linux-virt-dev 91 | ``` 92 | 93 | Basically you need your kernel headers and some GCC tooling. 94 | 95 | ### 2. Compiling 96 | This should be easy, and it will expose any missing prerequisites 97 | or an incompatibility with your kernel version: 98 | 99 | ```sh 100 | $ make 101 | ``` 102 | 103 | > A common source of compiler warnings is variation between kernel 104 | > versions. Please share your kernel version and the compiler output 105 | > if you have issues! 106 | 107 | ### 3. Installation 108 | As root, simply run: 109 | 110 | ```sh 111 | # make install 112 | ``` 113 | 114 | You'll probably see some SSL errors and complaints about missing key 115 | files. 
_This is expected as you're building an out-of-tree kernel 116 | module that isn't being signed._ If you'd like to sign the module, 117 | you're on your own at the moment, but maybe read the Linux kernel 118 | documentation on it (see footnote [1]). 119 | 120 | At this point, you'll have 2 new kernel modules. You should see them 121 | if you run: 122 | 123 | ```sh 124 | $ ls -l /lib/modules/$(uname -r)/extra 125 | total 36 126 | -rw-r--r-- 1 root root 15272 May 9 20:42 virtio_pci_obsd.ko 127 | -rw-r--r-- 1 root root 19872 May 9 20:42 virtio_vmmci.ko 128 | ``` 129 | 130 | ### 4. Loading the modules 131 | This also should be easy now since they're properly installed. Simply 132 | run: 133 | 134 | ```sh 135 | # modprobe virtio_vmmci 136 | ``` 137 | 138 | It should load both the `virtio_vmmci.ko` and `virtio_pci_obsd.ko` 139 | modules. They'll be visible when running `lsmod(8)`, but you won't see 140 | a "depends on" entry due to it being a "soft" dependency. 141 | 142 | ### 5. Checking it's Loaded 143 | After you load `virtio_pci_obsd.ko` you should see your system match 144 | and enable the vmmci PCI device. Check `dmesg(1)` and you should see 145 | something like: 146 | 147 | ``` 148 | [ 825.819945] virtio_pci_obsd: loading out-of-tree module taints kernel. 
149 | [ 825.819945] virtio_pci_obsd: module verification failed: signature and/or required key missing - tainting kernel 150 | [ 825.819945] virtio-pci-obsd 0000:00:05.0: runtime IRQ mapping not provided by arch 151 | [ 825.819945] virtio_pci_obsd_match: matching 0x0777 152 | [ 825.819945] virtio_pci_obsd_match: found OpenBSD device 153 | [ 825.819945] virtio-pci-obsd 0000:00:05.0: enabling bus mastering 154 | ``` 155 | 156 | If you check with `lspci(8)` in verbose mode (`lspci -v`) you should 157 | see the device and the fact it's using our `virtio_pci_obsd` driver: 158 | 159 | ``` 160 | 00:05.0 Communication controller: Device 0b5d:0777 161 | Subsystem: Device 0b5d:ffff 162 | Flags: bus master, fast devsel, latency 0, IRQ 9 163 | I/O ports at 5000 [size=4K] 164 | Kernel driver in use: virtio-pci-obsd 165 | ``` 166 | 167 | When you load `virtio_vmmci.ko`, you should see a confirmation the 168 | module is loaded: 169 | 170 | ``` 171 | [ 256.030878] virtio_vmmci: started VMM Control Interface driver 172 | ``` 173 | 174 | You can enable debug mode either by passing a `debug=1` argument when 175 | loading the `virtio_vmmci.ko` module or toggle it afterwards by 176 | writing either a `0` (off) or `1`/any positive integer (on) to 177 | `/sys/module/virtio_vmmci/parameters/debug` as the root user. 
When 178 | debug mode is on, you'll get extra dmesg noise like: 179 | 180 | ``` 181 | [17769.012388] virtio_vmmci: [vmmci_validate] not implemented 182 | [17769.012388] virtio_vmmci: [vmmci_probe] initializing vmmci device 183 | [17769.012388] virtio_vmmci: [vmmci_probe] ...found feature TIMESYNC 184 | [17769.012388] virtio_vmmci: [vmmci_probe] ...found feature ACK 185 | [17769.012388] virtio_vmmci: [vmmci_probe] ...found feature SYNCRTC 186 | [17769.012388] virtio_vmmci: started VMM Control Interface driver 187 | [17769.034540] virtio_vmmci: [clock_work_func] starting clock synchronization 188 | [17769.034864] virtio_vmmci: [clock_work_func] guest clock: 1550959642.629898000, host clock: 1550959642.638556712 189 | [17769.034867] virtio_vmmci: [clock_work_func] current time delta: -1.991341288 190 | [17769.034870] virtio_vmmci: [clock_work_func] clock synchronization routine finished 191 | ``` 192 | 193 | Lastly, check the sysctl tables. The driver registers 2 particular 194 | values that contain the seconds and nanoseconds portions of the last 195 | measured drift amount: 196 | 197 | ``` 198 | you@guest:~/virtio_vmmci$ sudo sysctl vmmci 199 | vmmci.drift_nsec = 199647574 200 | vmmci.drift_sec = 1 201 | ``` 202 | 203 | In the above example, the total drift is `1.199647574 seconds`. 204 | 205 | > In the future I may expose the last measured time as well 206 | 207 | ### 6. Configuring autoloading at boot time 208 | This is pretty simple in modern distros that use 209 | `/etc/modules-load.d`. As root, create a file 210 | `/etc/modules-load.d/virtio_vmmci.conf` with the contents: 211 | 212 | ``` 213 | virtio_vmmci 214 | ``` 215 | 216 | At boot, you should see the modules loaded automatically. 217 | 218 | ## Testing and Confirming Module Installation 219 | There are a few things you can do to validate your installation. 
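
One quick check is to read the two drift sysctl values and combine them into a single number. The following is only a minimal sketch, not part of the driver: `drift_total` is a hypothetical helper name, and the script assumes the values appear under `/proc/sys/vmmci/` (the usual procfs view of `sysctl vmmci`) and that the nanoseconds part is never negative, so the seconds part carries the sign.

```sh
#!/bin/sh
# Sketch: combine the driver's two drift sysctls into one signed value.
# Assumption: drift_nsec is always in [0, 1e9) and drift_sec carries the
# sign (the kernel's normalized timespec convention).

# drift_total SEC NSEC -> prints SEC + NSEC/1e9 with 9 decimal places.
# (drift_total is a hypothetical helper name, not something the driver ships.)
drift_total() {
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.9f\n", s + n / 1000000000 }'
}

# Only query the live values when the driver's sysctl entries exist.
if [ -r /proc/sys/vmmci/drift_sec ] && [ -r /proc/sys/vmmci/drift_nsec ]; then
    drift_total "$(cat /proc/sys/vmmci/drift_sec)" \
                "$(cat /proc/sys/vmmci/drift_nsec)"
else
    echo "vmmci sysctl entries not found; is virtio_vmmci loaded?" >&2
fi
```

With the sample values shown earlier (`drift_sec = 1`, `drift_nsec = 199647574`), `drift_total 1 199647574` prints `1.199647574`. A negative drift such as `drift_total -1 991341288` prints `-0.008658712`, which is why the two parts are added rather than concatenated.
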
220 | 221 | ### Clock Sync 222 | You can easily test the clock synchronization by suspending your 223 | OpenBSD host by triggering `zzz` manually or by something like closing 224 | your laptop lid. Wait at least 10 seconds or so and resume your 225 | OpenBSD system. In the Linux guest, your `dmesg(1)` output will tell 226 | you (in less than 30 seconds) that it's detected a clock drift and 227 | it's sync'ing the clock: 228 | 229 | ``` 230 | [15670.027879] virtio_vmmci: [clock_work_func] current time delta: 91.482370612 231 | [15670.027879] virtio_vmmci: detected drift greater than 5 seconds, synchronizing clock 232 | [15670.027879] virtio_vmmci: [clock_work_func] clock synchronization routine finished 233 | ``` 234 | 235 | If you check `date` or `timedatectl` on the Linux guest you should see 236 | the system time is very close to our host time. 237 | 238 | ### Clean Shutdown 239 | How can we test a clean shutdown? It's not too hard, but it might not 240 | work the same between distros and versions. Here's what I've done on 241 | Alpine 3.11.6. 242 | 243 | Assuming your vm is up and running: 244 | 245 | 1. Use `tmux(1)` or another means of getting 2 terminal sessions going 246 | at once. 247 | 2. In one session, `vmctl console ` to connect to the 248 | VM over the serial console. (This obviously assumes your guest is 249 | configured to work that way.) 250 | 3. In another session, issue `vmctl stop `. 251 | 4. Back in the serial console session, you should see your init 252 | system...probably `systemd`...start running through the shutdown 253 | process. 254 | 255 | There _may_ be some variations. The Linux vmmci driver calls a kernel 256 | helper function that handles orchestrating the shutdown via 257 | userspace. (The question of how to shutdown a Linux system from 258 | kernelspace is quite fascinating to explore.) 259 | 260 | # Seldomly Asked Questions 261 | Some questions that people...mainly myself...have had... 
262 | 263 | ## Wait, why isn't this fixing my clock drift issues? 264 | My initial release would constantly adjust the guest clock when 265 | detecting drift. I have since removed the functionality and will not re-add 266 | it no matter how much it's requested. 267 | 268 | Some reasons I removed it: 269 | 270 | - It's a bandaid on a bigger issue, not a real solution. 271 | - You can apply a bandaid already using something like `hwclock -us`, 272 | but since it uses `settimeofday(2)` it may not trigger pending 273 | timers properly! 274 | 275 | Constant, excessive drift shouldn't be the norm. Using the `refined-jiffies` 276 | clocksource will cause this. 277 | 278 | If you or a loved one experience excessive clock drift in your Linux 279 | guests under OpenBSD's vmm(4)/vmd(8) hypervisor framework, please try 280 | the following: 281 | 282 | - Build and install my other Linux kernel module: 283 | [vmm-clock](https://github.com/voutilad/vmm_clock) 284 | - Use OpenBSD-current as of 1 July 2020 or so when my vmd(8) patch[6] 285 | was merged into the tree 286 | 287 | > You will need BOTH...vmm-clock will crash your guest if you don't 288 | > have a vmd(8) instance with the stability improvements. 289 | 290 | ## _Isn't just using settimeofday(2) dangerous?_ 291 | This doesn't use the userland `settimeofday(2)` system call; it 292 | instead uses a particular kernel function (`do_settimeofday64`[3]) 293 | that appears to be pretty analogous to OpenBSD's kernel's 294 | `tc_setclock` function[4] in that it steps the system clock while 295 | triggering any alarms or timeouts that would fire. 296 | 297 | Looking at how VirtualBox handles this with their userland guest 298 | additions services, they look for large clock drifts where "large" 299 | is currently > 30 minutes. If it's large, it just uses 300 | `settimeofday(2)`. Otherwise, it tries to use something like 301 | `adjtimex(2)` to accelerate the clock up to the correct time. 
(This is 302 | something I may consider for vmmci after some more usage/testing.) 303 | 304 | See their source for `VBoxServiceTimeSync.cpp`[5]. 305 | 306 | ## _Can't you just use OpenNTPD or some other NTP daemon?_ 307 | Maybe for small clock disturbances/drifts, but it's not ideal for 308 | major stepping and only solves the clock problem. 309 | 310 | There are two reasons I'd consider using `virtio_vmmci` either in 311 | addition to or in place of relying on an NTP daemon: 312 | 313 | 1. **Not every guest has network access.** This precludes NTP as an 314 | option. Even if the guest has limited network access, it still 315 | needs access to an NTP server, ideally multiple. This isn't always 316 | the case. 317 | 318 | 2. **Large clock drifts like when you suspend your laptop for an 319 | evening make most NTP daemons sad.** I've never seen an NTP daemon 320 | that is cool with just jumping the system time ahead 321 | (i.e. _stepping_) like that. Some require special config to even 322 | do it. Yes, `ntpd(8)` supports a `-s` flag to do an actual set of the time 323 | and not just an adjustment, but, as even the man page says, it's for 324 | startup. (Useful for embedded, clock-less systems like a Raspberry 325 | Pi.) 326 | 327 | A lot of modern Linux distros install and enable an NTP daemon by 328 | default these days. That's fine. But don't forget vmmci gives you 329 | **clean shutdowns** as well as properly stepping the clock after a 330 | long suspend/hibernation! 331 | 332 | ## Why all the nasty Virtio PCI glue code? 333 | A few reasons, but for more background see my email to 334 | _misc@openbsd.org:_ https://marc.info/?t=155102953000002 335 | 336 | In short: 337 | 338 | 1. OpenBSD purposely uses self-assigned PCI and Virtio device 339 | identifiers to "hide" the VMM Control Interface device 340 | 2. 
Linux's virtio pci code is a LOT more complex and is trying to 341 | handle a variety of virtio devices...but can't handle a particular 342 | quirk with how the VMM Control Interface deals with config register i/o. 343 | 344 | # Future Work 345 | 346 | See the [issues](https://github.com/voutilad/virtio_vmmci/issues/) 347 | page for my ideas on future enhancements. Feel free to add some 348 | yourself, but keep in mind: 349 | 350 | 1. This is not my job...it's a hobby 351 | 2. It's for my personal use first and foremost 352 | 3. My current job is in software but has nothing to do with kernels, 353 | virtualization, etc. so this is truly an after-hours thing. 354 | 355 | # Acknowledgements! 356 | 1. Thanks to the OpenBSD `vmm(4)`/`vmd(8)` hackers...especially those that put 357 | together OpenBSD's `vmmci(4)` driver which acted as my reference point. 358 | 359 | 2. The [bootlin cross-referencer](https://elixir.bootlin.com/linux/latest/source) 360 | because holy hell is that thing 10x more useful than poking around Torvalds' 361 | mirror of the official Linux Git repo. 362 | 363 | 3. This page from "The kernel development community" was very helpful in 364 | figuring out how to schedule "deferred work" in the kernel: 365 | https://linux-kernel-labs.github.io/master/labs/deferred_work.html 366 | 367 | 4. The `virtio_balloon.c` driver in the Linux kernel tree is a relatively 368 | simple virtio example to understand Linux virtio drivers. 369 | 370 | 5. The wireguard kernel module source tree for showing how to properly 371 | build out-of-tree modules: 372 | https://git.zx2c4.com/wireguard-linux-compat/tree/src 373 | 374 | 6. 
Folks that have helped test on different distros with different 375 | kernel versions :-) 376 | 377 | # Footnotes 378 | GitHub might not render these...but believe me they're here :-) 379 | 380 | [1] Linux Kernel documentation on generating a private key for signing 381 | kernel modules: 382 | https://www.kernel.org/doc/html/v4.15/admin-guide/module-signing.html#generating-signing-keys 383 | 384 | [2] See this write-up on time-sync in vm's: 385 | http://archive.is/ndiy3 386 | 387 | [3] See the `time/timekeeping.c` source file: 388 | https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/time/timekeeping.c?h=v4.20#n1220 389 | 390 | [4] OpenBSD's `sys/kern/kern_tc.c`: 391 | https://github.com/openbsd/src/blob/e12a049bd4bbd1e8315c373a739e08972ed6dd1d/sys/kern/kern_tc.c#L382 392 | 393 | [5] VirtualBox's `VBoxServiceTimeSync.cpp`: 394 | https://www.virtualbox.org/browser/vbox/trunk/src/VBox/Additions/common/VBoxService/VBoxServiceTimeSync.cpp?rev=76553#L683 395 | 396 | [6] My vmd(8) stability fixes: 397 | https://github.com/openbsd/src/commit/08fd0ce3179b426bc00beaee67fffdfa71997830 398 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 2, June 1991 3 | 4 | Copyright (C) 1989, 1991 Free Software Foundation, Inc., 5 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA 6 | Everyone is permitted to copy and distribute verbatim copies 7 | of this license document, but changing it is not allowed. 8 | 9 | Preamble 10 | 11 | The licenses for most software are designed to take away your 12 | freedom to share and change it. By contrast, the GNU General Public 13 | License is intended to guarantee your freedom to share and change free 14 | software--to make sure the software is free for all its users. 
This 15 | General Public License applies to most of the Free Software 16 | Foundation's software and to any other program whose authors commit to 17 | using it. (Some other Free Software Foundation software is covered by 18 | the GNU Lesser General Public License instead.) You can apply it to 19 | your programs, too. 20 | 21 | When we speak of free software, we are referring to freedom, not 22 | price. Our General Public Licenses are designed to make sure that you 23 | have the freedom to distribute copies of free software (and charge for 24 | this service if you wish), that you receive source code or can get it 25 | if you want it, that you can change the software or use pieces of it 26 | in new free programs; and that you know you can do these things. 27 | 28 | To protect your rights, we need to make restrictions that forbid 29 | anyone to deny you these rights or to ask you to surrender the rights. 30 | These restrictions translate to certain responsibilities for you if you 31 | distribute copies of the software, or if you modify it. 32 | 33 | For example, if you distribute copies of such a program, whether 34 | gratis or for a fee, you must give the recipients all the rights that 35 | you have. You must make sure that they, too, receive or can get the 36 | source code. And you must show them these terms so they know their 37 | rights. 38 | 39 | We protect your rights with two steps: (1) copyright the software, and 40 | (2) offer you this license which gives you legal permission to copy, 41 | distribute and/or modify the software. 42 | 43 | Also, for each author's protection and ours, we want to make certain 44 | that everyone understands that there is no warranty for this free 45 | software. If the software is modified by someone else and passed on, we 46 | want its recipients to know that what they have is not the original, so 47 | that any problems introduced by others will not reflect on the original 48 | authors' reputations. 
49 | 50 | Finally, any free program is threatened constantly by software 51 | patents. We wish to avoid the danger that redistributors of a free 52 | program will individually obtain patent licenses, in effect making the 53 | program proprietary. To prevent this, we have made it clear that any 54 | patent must be licensed for everyone's free use or not licensed at all. 55 | 56 | The precise terms and conditions for copying, distribution and 57 | modification follow. 58 | 59 | GNU GENERAL PUBLIC LICENSE 60 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 61 | 62 | 0. This License applies to any program or other work which contains 63 | a notice placed by the copyright holder saying it may be distributed 64 | under the terms of this General Public License. The "Program", below, 65 | refers to any such program or work, and a "work based on the Program" 66 | means either the Program or any derivative work under copyright law: 67 | that is to say, a work containing the Program or a portion of it, 68 | either verbatim or with modifications and/or translated into another 69 | language. (Hereinafter, translation is included without limitation in 70 | the term "modification".) Each licensee is addressed as "you". 71 | 72 | Activities other than copying, distribution and modification are not 73 | covered by this License; they are outside its scope. The act of 74 | running the Program is not restricted, and the output from the Program 75 | is covered only if its contents constitute a work based on the 76 | Program (independent of having been made by running the Program). 77 | Whether that is true depends on what the Program does. 78 | 79 | 1. 
You may copy and distribute verbatim copies of the Program's 80 | source code as you receive it, in any medium, provided that you 81 | conspicuously and appropriately publish on each copy an appropriate 82 | copyright notice and disclaimer of warranty; keep intact all the 83 | notices that refer to this License and to the absence of any warranty; 84 | and give any other recipients of the Program a copy of this License 85 | along with the Program. 86 | 87 | You may charge a fee for the physical act of transferring a copy, and 88 | you may at your option offer warranty protection in exchange for a fee. 89 | 90 | 2. You may modify your copy or copies of the Program or any portion 91 | of it, thus forming a work based on the Program, and copy and 92 | distribute such modifications or work under the terms of Section 1 93 | above, provided that you also meet all of these conditions: 94 | 95 | a) You must cause the modified files to carry prominent notices 96 | stating that you changed the files and the date of any change. 97 | 98 | b) You must cause any work that you distribute or publish, that in 99 | whole or in part contains or is derived from the Program or any 100 | part thereof, to be licensed as a whole at no charge to all third 101 | parties under the terms of this License. 102 | 103 | c) If the modified program normally reads commands interactively 104 | when run, you must cause it, when started running for such 105 | interactive use in the most ordinary way, to print or display an 106 | announcement including an appropriate copyright notice and a 107 | notice that there is no warranty (or else, saying that you provide 108 | a warranty) and that users may redistribute the program under 109 | these conditions, and telling the user how to view a copy of this 110 | License. (Exception: if the Program itself is interactive but 111 | does not normally print such an announcement, your work based on 112 | the Program is not required to print an announcement.) 
113 | 114 | These requirements apply to the modified work as a whole. If 115 | identifiable sections of that work are not derived from the Program, 116 | and can be reasonably considered independent and separate works in 117 | themselves, then this License, and its terms, do not apply to those 118 | sections when you distribute them as separate works. But when you 119 | distribute the same sections as part of a whole which is a work based 120 | on the Program, the distribution of the whole must be on the terms of 121 | this License, whose permissions for other licensees extend to the 122 | entire whole, and thus to each and every part regardless of who wrote it. 123 | 124 | Thus, it is not the intent of this section to claim rights or contest 125 | your rights to work written entirely by you; rather, the intent is to 126 | exercise the right to control the distribution of derivative or 127 | collective works based on the Program. 128 | 129 | In addition, mere aggregation of another work not based on the Program 130 | with the Program (or with a work based on the Program) on a volume of 131 | a storage or distribution medium does not bring the other work under 132 | the scope of this License. 133 | 134 | 3. 
You may copy and distribute the Program (or a work based on it, 135 | under Section 2) in object code or executable form under the terms of 136 | Sections 1 and 2 above provided that you also do one of the following: 137 | 138 | a) Accompany it with the complete corresponding machine-readable 139 | source code, which must be distributed under the terms of Sections 140 | 1 and 2 above on a medium customarily used for software interchange; or, 141 | 142 | b) Accompany it with a written offer, valid for at least three 143 | years, to give any third party, for a charge no more than your 144 | cost of physically performing source distribution, a complete 145 | machine-readable copy of the corresponding source code, to be 146 | distributed under the terms of Sections 1 and 2 above on a medium 147 | customarily used for software interchange; or, 148 | 149 | c) Accompany it with the information you received as to the offer 150 | to distribute corresponding source code. (This alternative is 151 | allowed only for noncommercial distribution and only if you 152 | received the program in object code or executable form with such 153 | an offer, in accord with Subsection b above.) 154 | 155 | The source code for a work means the preferred form of the work for 156 | making modifications to it. For an executable work, complete source 157 | code means all the source code for all modules it contains, plus any 158 | associated interface definition files, plus the scripts used to 159 | control compilation and installation of the executable. However, as a 160 | special exception, the source code distributed need not include 161 | anything that is normally distributed (in either source or binary 162 | form) with the major components (compiler, kernel, and so on) of the 163 | operating system on which the executable runs, unless that component 164 | itself accompanies the executable. 
165 | 166 | If distribution of executable or object code is made by offering 167 | access to copy from a designated place, then offering equivalent 168 | access to copy the source code from the same place counts as 169 | distribution of the source code, even though third parties are not 170 | compelled to copy the source along with the object code. 171 | 172 | 4. You may not copy, modify, sublicense, or distribute the Program 173 | except as expressly provided under this License. Any attempt 174 | otherwise to copy, modify, sublicense or distribute the Program is 175 | void, and will automatically terminate your rights under this License. 176 | However, parties who have received copies, or rights, from you under 177 | this License will not have their licenses terminated so long as such 178 | parties remain in full compliance. 179 | 180 | 5. You are not required to accept this License, since you have not 181 | signed it. However, nothing else grants you permission to modify or 182 | distribute the Program or its derivative works. These actions are 183 | prohibited by law if you do not accept this License. Therefore, by 184 | modifying or distributing the Program (or any work based on the 185 | Program), you indicate your acceptance of this License to do so, and 186 | all its terms and conditions for copying, distributing or modifying 187 | the Program or works based on it. 188 | 189 | 6. Each time you redistribute the Program (or any work based on the 190 | Program), the recipient automatically receives a license from the 191 | original licensor to copy, distribute or modify the Program subject to 192 | these terms and conditions. You may not impose any further 193 | restrictions on the recipients' exercise of the rights granted herein. 194 | You are not responsible for enforcing compliance by third parties to 195 | this License. 196 | 197 | 7. 
If, as a consequence of a court judgment or allegation of patent 198 | infringement or for any other reason (not limited to patent issues), 199 | conditions are imposed on you (whether by court order, agreement or 200 | otherwise) that contradict the conditions of this License, they do not 201 | excuse you from the conditions of this License. If you cannot 202 | distribute so as to satisfy simultaneously your obligations under this 203 | License and any other pertinent obligations, then as a consequence you 204 | may not distribute the Program at all. For example, if a patent 205 | license would not permit royalty-free redistribution of the Program by 206 | all those who receive copies directly or indirectly through you, then 207 | the only way you could satisfy both it and this License would be to 208 | refrain entirely from distribution of the Program. 209 | 210 | If any portion of this section is held invalid or unenforceable under 211 | any particular circumstance, the balance of the section is intended to 212 | apply and the section as a whole is intended to apply in other 213 | circumstances. 214 | 215 | It is not the purpose of this section to induce you to infringe any 216 | patents or other property right claims or to contest validity of any 217 | such claims; this section has the sole purpose of protecting the 218 | integrity of the free software distribution system, which is 219 | implemented by public license practices. Many people have made 220 | generous contributions to the wide range of software distributed 221 | through that system in reliance on consistent application of that 222 | system; it is up to the author/donor to decide if he or she is willing 223 | to distribute software through any other system and a licensee cannot 224 | impose that choice. 225 | 226 | This section is intended to make thoroughly clear what is believed to 227 | be a consequence of the rest of this License. 228 | 229 | 8. 
If the distribution and/or use of the Program is restricted in 230 | certain countries either by patents or by copyrighted interfaces, the 231 | original copyright holder who places the Program under this License 232 | may add an explicit geographical distribution limitation excluding 233 | those countries, so that distribution is permitted only in or among 234 | countries not thus excluded. In such case, this License incorporates 235 | the limitation as if written in the body of this License. 236 | 237 | 9. The Free Software Foundation may publish revised and/or new versions 238 | of the General Public License from time to time. Such new versions will 239 | be similar in spirit to the present version, but may differ in detail to 240 | address new problems or concerns. 241 | 242 | Each version is given a distinguishing version number. If the Program 243 | specifies a version number of this License which applies to it and "any 244 | later version", you have the option of following the terms and conditions 245 | either of that version or of any later version published by the Free 246 | Software Foundation. If the Program does not specify a version number of 247 | this License, you may choose any version ever published by the Free Software 248 | Foundation. 249 | 250 | 10. If you wish to incorporate parts of the Program into other free 251 | programs whose distribution conditions are different, write to the author 252 | to ask for permission. For software which is copyrighted by the Free 253 | Software Foundation, write to the Free Software Foundation; we sometimes 254 | make exceptions for this. Our decision will be guided by the two goals 255 | of preserving the free status of all derivatives of our free software and 256 | of promoting the sharing and reuse of software generally. 257 | 258 | NO WARRANTY 259 | 260 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY 261 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN 262 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES 263 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED 264 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 265 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS 266 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE 267 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, 268 | REPAIR OR CORRECTION. 269 | 270 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 271 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR 272 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, 273 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING 274 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED 275 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY 276 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER 277 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE 278 | POSSIBILITY OF SUCH DAMAGES. 279 | 280 | END OF TERMS AND CONDITIONS 281 | 282 | How to Apply These Terms to Your New Programs 283 | 284 | If you develop a new program, and you want it to be of the greatest 285 | possible use to the public, the best way to achieve this is to make it 286 | free software which everyone can redistribute and change under these terms. 287 | 288 | To do so, attach the following notices to the program. It is safest 289 | to attach them to the start of each source file to most effectively 290 | convey the exclusion of warranty; and each file should have at least 291 | the "copyright" line and a pointer to where the full notice is found. 
292 | 293 | <one line to give the program's name and a brief idea of what it does.> 294 | Copyright (C) <year> <name of author> 295 | 296 | This program is free software; you can redistribute it and/or modify 297 | it under the terms of the GNU General Public License as published by 298 | the Free Software Foundation; either version 2 of the License, or 299 | (at your option) any later version. 300 | 301 | This program is distributed in the hope that it will be useful, 302 | but WITHOUT ANY WARRANTY; without even the implied warranty of 303 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 304 | GNU General Public License for more details. 305 | 306 | You should have received a copy of the GNU General Public License along 307 | with this program; if not, write to the Free Software Foundation, Inc., 308 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 309 | 310 | Also add information on how to contact you by electronic and paper mail. 311 | 312 | If the program is interactive, make it output a short notice like this 313 | when it starts in an interactive mode: 314 | 315 | Gnomovision version 69, Copyright (C) year name of author 316 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 317 | This is free software, and you are welcome to redistribute it 318 | under certain conditions; type `show c' for details. 319 | 320 | The hypothetical commands `show w' and `show c' should show the appropriate 321 | parts of the General Public License. Of course, the commands you use may 322 | be called something other than `show w' and `show c'; they could even be 323 | mouse-clicks or menu items--whatever suits your program. 324 | 325 | You should also get your employer (if you work as a programmer) or your 326 | school, if any, to sign a "copyright disclaimer" for the program, if 327 | necessary. Here is a sample; alter the names: 328 | 329 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program 330 | `Gnomovision' (which makes passes at compilers) written by James Hacker. 
331 | 332 | <signature of Ty Coon>, 1 April 1989 333 | Ty Coon, President of Vice 334 | 335 | This General Public License does not permit incorporating your program into 336 | proprietary programs. If your program is a subroutine library, you may 337 | consider it more useful to permit linking proprietary applications with the 338 | library. If this is what you want to do, use the GNU Lesser General 339 | Public License instead of this License. 340 | --------------------------------------------------------------------------------