psoftware/bpfhv
=== Structure of this repository ===
The driver/ directory contains the bpfhv guest driver for
Linux kernels (>= 4.18):
- bpfhv.c: driver source code
The proxy/ directory contains the implementation of an external
backend process associated with the QEMU bpfhv-proxy network backend
(-netdev bpfhv-proxy).
The external backend provides the queue processing functionality
for a single bpfhv device that belongs to a QEMU VM. In other words,
QEMU only implements the control functionality of the device, while
the RX and TX queues are processed by the external process. Similarly
to vhost-user, QEMU and the external backend process use a dedicated
control channel to exchange the information needed for the packet
processing tasks (e.g. guest memory map, addresses of each TX and RX
queue, file descriptors for notifications, etc.).
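The real control protocol is defined by the QEMU bpfhv-proxy sources; the sketch below only illustrates the kind of vhost-user-style control messages described above. All names (bpfhv_proxy_msg, the BPFHV_PROXY_REQ_* constants, msg_serialize) are invented for illustration:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical control message: a fixed header (request type plus
 * payload size) followed by a request-specific payload, as one would
 * exchange on a Unix control socket (fds travel via SCM_RIGHTS). */
enum bpfhv_proxy_req {
    BPFHV_PROXY_REQ_SET_MEM_TABLE = 1,  /* guest memory map */
    BPFHV_PROXY_REQ_SET_QUEUE_ADDR,     /* address of a TX/RX queue */
    BPFHV_PROXY_REQ_SET_QUEUE_KICKFD,   /* notification file descriptor */
};

struct bpfhv_proxy_msg {
    uint32_t reqtype;
    uint32_t size;              /* payload bytes that follow the header */
    union {
        struct {
            uint32_t queue_idx;
            uint64_t guest_physical_addr;
        } queue_addr;
        /* ... other payloads ... */
    } payload;
};

/* Flatten header + payload into a buffer, as one would before
 * write()-ing it on the control socket. Returns total bytes. */
static size_t msg_serialize(const struct bpfhv_proxy_msg *msg, uint8_t *buf)
{
    size_t hdr = 2 * sizeof(uint32_t);

    memcpy(buf, msg, hdr);
    memcpy(buf + hdr, &msg->payload, msg->size);
    return hdr + msg->size;
}
```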
Files:
- backend.c: main file, implements the control protocol and
two packet processing loops (the first one is
a poll() event loop, whereas the second one uses
busy-wait);
- sring.[ch]: hv implementation of a device which uses a minimal
descriptor format, with no support for offloads (and
reduced per-packet overhead);
- sring_progs.c: eBPF programs for the sring device;
- sring_gso.[ch]: hv implementation of a device which uses an
extended descriptor format, supporting checksum
offloads and TCP/UDP segmentation offloads;
- sring_gso_progs.c: eBPF programs for the sring_gso device;
- vring_packed.[ch]: hv implementation of the packed virtqueue
in the VirtIO 1.1 specification;
- vring_packed_progs.c: eBPF programs for the vring_packed device;
- start-qemu.sh: an example script to start a QEMU VM with a
bpfhv device peered with a bpfhv-proxy network
backend;
- start-proxy.sh: an example script to start the external backend
process and configure the backend network device
(e.g. a TAP interface or a netmap port);
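As a reference for the vring_packed files, the descriptor layout below is the one defined by the VirtIO 1.1 packed virtqueue specification; the availability check is a sketch of the spec's wrap-counter logic, not the repository's actual code:

```c
#include <stdint.h>

/* VirtIO 1.1 packed virtqueue descriptor (16 bytes). Driver and device
 * share one descriptor ring; ownership is signalled via the AVAIL and
 * USED flag bits instead of separate avail/used rings. */
struct vring_packed_desc {
    uint64_t addr;   /* guest physical address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t id;     /* buffer id, echoed back by the device */
    uint16_t flags;  /* AVAIL/USED wrap bits, plus NEXT, WRITE, ... */
};

#define VRING_PACKED_DESC_F_AVAIL (1 << 7)
#define VRING_PACKED_DESC_F_USED  (1 << 15)

/* A descriptor is available when its AVAIL bit matches the observer's
 * wrap counter and its USED bit does not; the wrap counter flips each
 * time the ring wraps around. */
static inline int desc_is_avail(uint16_t flags, int wrap_counter)
{
    int avail = !!(flags & VRING_PACKED_DESC_F_AVAIL);
    int used  = !!(flags & VRING_PACKED_DESC_F_USED);

    return avail == wrap_counter && used != wrap_counter;
}
```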
=== Some advantages of bpfhv ===
- Have doorbells on separate pages (configurable stride)
- Provider can evolve the metadata header (e.g., virtio-net)
to balance between the needs of FreeBSD and Linux
(virtio-net is good for Linux, but not for FreeBSD).
- VirtIO 1.1 vs 1.0 (while 0.95 is still around): a sign
that the interface needs to evolve, and that compatibility
problems arise.
- You can define a metadata format (e.g. virtio-net header)
that fits the specific hardware NIC features used by the
cloud provider.
- Let the provider inject code to encrypt/decrypt the payload,
together with the hardcoded key. The encrypt/decrypt routines
can be helper functions that take as argument the OS packet
pointer and the key.
- Simplification of device paravirtualization. A fixed datapath
ABI means that you need to be backward compatible. Look at the
virtio implementation in Linux 4.20: it needs to support
both the split and the packed ring --> complex, error prone, less
efficient.
- Change virtual switch and backend under the hood (tap,
netmap, other).
- Adapt to changing workloads.
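The doorbell layout mentioned in the first advantage can be sketched as follows. This is a hypothetical illustration (the function name and parameters are invented): with a configurable stride of one 4 KiB page per queue, each queue's doorbell register falls on its own page, so pages can be mapped to the guest or to different backend processes independently.

```c
#include <stdint.h>

/* Hypothetical doorbell address computation: doorbells are laid out at
 * a fixed stride from a base address. With stride == page size, each
 * queue's doorbell is page-aligned and can be mapped on its own. */
static inline uintptr_t doorbell_addr(uintptr_t base, unsigned int queue_idx,
                                      unsigned int stride)
{
    return base + (uintptr_t)queue_idx * stride;
}
```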
=== TODOs (driver) ===
- Let BPFHV_MAX_TX_BUFS and BPFHV_MAX_RX_BUFS be variable.
This of course requires reshaping the layout of the
context data structures.
- Try to replace dma_map_single() with dma_map_page() on
the RX datapath? Not sure this is relevant.
- What if the eBPF program needs to modify the SG layout,
e.g., for encapsulation or encryption? This would require
changing the paddr/vaddr/len in the buffer descriptors,
and DMA mapping and unmapping... So maybe we should ask
the eBPF program to do the DMA map/unmap, so that it can do
that after encapsulation or encryption (i.e. once
the SG layout is stable).
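The idea in the last TODO can be sketched as below. This is purely hypothetical: all names (bpfhv_buf, bpfhv_dma_map, publish_after_encap) are invented for illustration, and the map helper is a stub standing in for a driver-provided helper callable from the eBPF program.

```c
#include <stdint.h>

/* Hypothetical buffer descriptor, mirroring the paddr/vaddr/len fields
 * mentioned above. */
struct bpfhv_buf {
    uint64_t vaddr;   /* guest virtual address of the buffer */
    uint64_t paddr;   /* DMA address, filled in by the map helper */
    uint32_t len;
};

/* Stub for a driver-provided DMA-map helper; here it just
 * identity-maps the virtual address for the sake of the sketch. */
static uint64_t bpfhv_dma_map(uint64_t vaddr, uint32_t len)
{
    (void)len;
    return vaddr;
}

/* Prepend an encapsulation header, then DMA-map. The SG layout is only
 * modified before the helper call, so no remapping is ever needed. */
static void publish_after_encap(struct bpfhv_buf *buf, uint32_t hdr_len)
{
    buf->vaddr -= hdr_len;
    buf->len += hdr_len;
    buf->paddr = bpfhv_dma_map(buf->vaddr, buf->len);
}
```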
=== TODOs (qemu) ===
- Replace cpu_physical_memory_[un]map() with dma_memory_[un]map()
and the MemoryRegionCache library. This should only be necessary
if the guest platform has an IOMMU.
Code is in virtqueue_pop() and virtqueue_push().
- Let backend.ops.init fail (vring packed < 2^15).
- Move the generic vring_packed code to the top of the files.
=== About ===
Implementation and evaluation of hypervisor to guest offloading for
high throughput Virtual Machine traffic classification.