macOS support: vfkit as hypervisor backend #5

@CMGS

Summary

Evaluate and implement vfkit as the macOS hypervisor backend, replacing Cloud Hypervisor (CH) which is Linux-only. vfkit wraps Apple's Virtualization.framework and is used in production by Podman 5.0+, minikube 1.35+, and CRC.

Feasibility Analysis

Feature Mapping: CH → vfkit

Boot Methods — Fully Mappable

  • --kernel + --initramfs + --cmdline → --bootloader linux,kernel=...,initrd=...,cmdline=... (direct mapping)
  • --firmware CLOUDHV.fd → --bootloader efi,variable-store=...,create (uses Apple built-in EFI)
  • macOS guest: --bootloader macos,machineIdentifierPath=...,hardwareModelPath=...,auxImagePath=... (Apple Silicon only, requires IPSW restore image)

Note: On Apple Silicon, vfkit linux boot requires an uncompressed kernel (gzip/lz4 vmlinuz won't work). EFI boot has no such restriction.
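
The uncompressed-kernel restriction could be caught before launching vfkit. The sketch below is one possible pre-flight check, not part of vfkit: it reads the first bytes of the kernel image and looks for the gzip or LZ4 magic numbers. The function names are hypothetical, and the check only recognizes whole-file gzip/LZ4 images; other packaging (for example an EFI zboot wrapper) would pass through undetected.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// Magic numbers for the two compression formats named in the note.
var (
	gzipMagic      = []byte{0x1f, 0x8b}
	lz4FrameMagic  = []byte{0x04, 0x22, 0x4d, 0x18} // LZ4 frame format
	lz4LegacyMagic = []byte{0x02, 0x21, 0x4c, 0x18} // LZ4 legacy format
)

// detectCompression returns "gzip", "lz4", or "" for a file header.
func detectCompression(header []byte) string {
	switch {
	case bytes.HasPrefix(header, gzipMagic):
		return "gzip"
	case bytes.HasPrefix(header, lz4FrameMagic), bytes.HasPrefix(header, lz4LegacyMagic):
		return "lz4"
	default:
		return ""
	}
}

// checkKernel (hypothetical helper) rejects images whose header
// carries a known compression magic number.
func checkKernel(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	header := make([]byte, 4)
	n, err := io.ReadFull(f, header)
	if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
		return err
	}
	if c := detectCompression(header[:n]); c != "" {
		return fmt.Errorf("%s looks %s-compressed; decompress it or use EFI boot", path, c)
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: checkkernel <kernel-image>")
		os.Exit(2)
	}
	if err := checkKernel(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("kernel header shows no gzip/lz4 magic")
}
```

Failing early here gives a clearer error than a VM that silently never boots.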

REST API — Sufficient Coverage

  • PUT /api/v1/vm.shutdown → POST /vm/state {"state":"Stop"}
  • PUT /api/v1/vm.power-button → POST /vm/state {"state":"Stop"} (equivalent ACPI)
  • GET /api/v1/vm.info (query PTY path) → GET /vm/inspect
  • --api-socket /path → --restful-uri unix:///path
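
As a rough illustration of the stop mapping, a Go client can reach the vfkit REST endpoint over the Unix socket given to --restful-uri. This is a minimal sketch using only the standard library; the helper names and the placeholder host name "vfkit" in the URL are assumptions (the host is ignored because the dialer always connects to the socket).

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net"
	"net/http"
	"os"
)

// stateChange mirrors the {"state":"..."} body from the mapping above.
type stateChange struct {
	State string `json:"state"`
}

// stateBody encodes the request body for POST /vm/state.
func stateBody(state string) ([]byte, error) {
	return json.Marshal(stateChange{State: state})
}

// newUnixClient returns an HTTP client that dials the given Unix socket
// regardless of the host in the request URL.
func newUnixClient(socketPath string) *http.Client {
	return &http.Client{Transport: &http.Transport{
		DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
			var d net.Dialer
			return d.DialContext(ctx, "unix", socketPath)
		},
	}}
}

// setState (hypothetical helper) sends POST /vm/state with the given state.
func setState(ctx context.Context, c *http.Client, state string) error {
	body, err := stateBody(state)
	if err != nil {
		return err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		"http://vfkit/vm/state", bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := c.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("POST /vm/state: unexpected status %s", resp.Status)
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		body, _ := stateBody("Stop")
		fmt.Printf("would POST %s to /vm/state\n", body)
		return
	}
	if err := setState(context.Background(), newUnixClient(os.Args[1]), "Stop"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Run with the socket path as the only argument to stop a VM; without arguments it only prints the request body.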

CPU / Memory — Mostly Mappable

  • --cpus boot=N,max=M → --cpus N (no max_vcpus, no CPU hotplug)
  • --memory size=BYTES,hugepages=on → --memory N in MiB (no hugepages on macOS, unit conversion needed)
  • --balloon size=...,deflate_on_oom=... → --device virtio-balloon (exists but no fine-grained params)
  • --watchdog → N/A (can skip)
  • --rng src=/dev/urandom → --device virtio-rng (direct mapping)
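
The unit conversion is small but easy to get wrong. A minimal sketch of the CPU/memory translation follows; the function names are hypothetical, and rounding up to a whole MiB is an assumption (rounding down would give the guest less memory than was requested).

```go
package main

import (
	"fmt"
	"strconv"
)

const mib = 1 << 20

// bytesToMiB converts a CH-style byte count to MiB, rounding up.
func bytesToMiB(b uint64) uint64 {
	return (b + mib - 1) / mib
}

// cpuMemArgs builds the vfkit flags from CH's boot vCPU count and
// memory size in bytes. max_vcpus and hugepages have no counterpart
// and are dropped.
func cpuMemArgs(bootCPUs int, memBytes uint64) []string {
	return []string{
		"--cpus", strconv.Itoa(bootCPUs),
		"--memory", strconv.FormatUint(bytesToMiB(memBytes), 10),
	}
}

func main() {
	fmt.Println(cpuMemArgs(2, 1<<30)) // prints [--cpus 2 --memory 1024]
}
```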

Console — Needs Adjustment

  • --console pty (OCI boot) → --device virtio-serial,pty (direct mapping, PTY path via REST API)
  • --serial socket=console.sock (UEFI boot) → No socket mode (incompatible, must use PTY instead)

Blocking Issues

1. Storage — No qcow2 Support (Severe)

vfkit virtio-blk only supports raw format. No qcow2.

cocoonv2 cloudimg path currently depends on:

qemu-img create -f qcow2 -b base.qcow2 overlay.qcow2

Recommended: APFS clonefile

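APFS provides native copy-on-write file clones: creating one is instant and consumes no extra space until the clone diverges from its base. On the command line, cp -c base.raw overlay.raw creates such a clone. The result behaves much like a qcow2 overlay, but the copy-on-write is handled at the filesystem layer. Base images must be in raw format.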

OCI boot path (raw COW + EROFS layers) is unaffected. CH disk serial maps to vfkit deviceId.
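
A minimal sketch of what the overlay step in create.go might look like, assuming it shells out to cp -c as described above. The helper names are hypothetical, the -c flag is only meaningful with the macOS cp, and a direct clonefile(2) call would be an alternative to spawning a process.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// cloneCommand builds the `cp -c base overlay` invocation, which asks
// the macOS cp to clone the file instead of copying its data.
func cloneCommand(base, overlay string) *exec.Cmd {
	return exec.Command("cp", "-c", base, overlay)
}

// cloneDisk (hypothetical helper) creates overlay as a clone of base,
// refusing to overwrite an existing overlay.
func cloneDisk(base, overlay string) error {
	if _, err := os.Stat(overlay); err == nil {
		return fmt.Errorf("overlay %s already exists", overlay)
	}
	out, err := cloneCommand(base, overlay).CombinedOutput()
	if err != nil {
		return fmt.Errorf("cp -c %s %s: %w: %s", base, overlay, err, out)
	}
	return nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: clonedisk <base.raw> <overlay.raw>")
		os.Exit(2)
	}
	if err := cloneDisk(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```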

2. Networking — Completely Different Architecture (Severe)

macOS has no network namespaces, tap devices, or CNI. The entire network stack is inapplicable.

  • netns + CNI + bridge + IPAM → vmnet-helper (shared/bridged/host modes)
  • tap + TC redirect → vfkit --device virtio-net,fd=N via socketpair from vmnet-helper
  • Fixed IP (CNI host-local) → vmnet DHCP or --start-address / --end-address range control

Recommended: vmnet-helper

Supports shared (NAT), bridged, and host modes. Requires root on macOS 15 and below; no root needed on macOS 26+. Reported to be up to 10x faster than socket_vmnet.

Simplest path (Phase 1): --device virtio-net,nat — zero config but no fixed IP.
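
Both network variants reduce to a different --device argument. The sketch below shows how the fd=N value could be derived when vfkit is started from Go: os/exec hands ExtraFiles[i] to the child as file descriptor 3+i. The helper names are hypothetical, and the pipe in main is only a stand-in for the socket that vmnet-helper would provide.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// firstExtraFD is the descriptor number the child sees for
// cmd.ExtraFiles[0]; 0-2 are stdin, stdout, and stderr.
const firstExtraFD = 3

// natNetArgs is the zero-config Phase 1 variant.
func natNetArgs() []string {
	return []string{"--device", "virtio-net,nat"}
}

// netFDArg returns the device string for the file at the given
// index in cmd.ExtraFiles.
func netFDArg(extraFileIndex int) string {
	return fmt.Sprintf("virtio-net,fd=%d", firstExtraFD+extraFileIndex)
}

// attachNetFD registers sock for inheritance by the child and appends
// the matching --device argument to the command line.
func attachNetFD(cmd *exec.Cmd, sock *os.File) {
	cmd.ExtraFiles = append(cmd.ExtraFiles, sock)
	cmd.Args = append(cmd.Args, "--device", netFDArg(len(cmd.ExtraFiles)-1))
}

func main() {
	// Stand-in for one end of the vmnet-helper socketpair.
	r, w, err := os.Pipe()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer r.Close()
	defer w.Close()

	cmd := exec.Command("vfkit")
	attachNetFD(cmd, r)
	fmt.Println(cmd.Args) // prints [vfkit --device virtio-net,fd=3]
}
```

Computing the number from the ExtraFiles index avoids hard-coding a descriptor that would shift if another inherited file were added first.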

3. Disk I/O Tuning Unavailable (Low Impact)

vfkit does not expose num_queues, queue_size, direct, sparse, or network offload params. All controlled internally by Virtualization.framework. Acceptable for macOS dev/test scenarios.


macOS Guest VM Support

vfkit supports macOS guest VMs via --bootloader macos (Apple Silicon only, macOS 12+).

Setup Flow

  1. Download IPSW restore image (via VZMacOSRestoreImage.latestSupported API or manually)
  2. Create blank raw disk image
  3. Install macOS from IPSW into disk (generates MachineIdentifier, HardwareModel, AuxiliaryStorage)
  4. Boot with --bootloader macos,machineIdentifierPath=...,hardwareModelPath=...,auxImagePath=...
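
Step 2 needs no external tool. As a minimal sketch (helper names are hypothetical), a blank raw image can be created as a sparse file by truncating a new file to the target size; the IPSW installation in step 3 is not covered here.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// createRawDisk creates a new sparse file of sizeBytes at path.
// O_EXCL prevents clobbering an existing disk image.
func createRawDisk(path string, sizeBytes int64) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	if err := f.Truncate(sizeBytes); err != nil {
		f.Close()
		os.Remove(path)
		return err
	}
	return f.Close()
}

// probe creates a disk in a throwaway temp directory and returns the
// logical size reported by the filesystem.
func probe(sizeBytes int64) (int64, error) {
	dir, err := os.MkdirTemp("", "rawdisk")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "disk.raw")
	if err := createRawDisk(path, sizeBytes); err != nil {
		return 0, err
	}
	fi, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return fi.Size(), nil
}

func main() {
	size, err := probe(1 << 30)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("created sparse raw disk, logical size %d bytes\n", size)
}
```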

Headless Operation

Virtualization.framework requires a VZMacGraphicsDeviceConfiguration for macOS guests — this is a hard framework-level constraint. However, headless operation is achievable by creating the graphics device but not rendering any host window (Tart does this via NSApplication.setActivationPolicy(.prohibited)). The VM runs normally; access via SSH or VNC.

Concurrency Limits

Hard limit: 2 macOS VMs simultaneously, enforced at XNU kernel level (hv_apple_isa_vm_quota counter). This is NOT a Virtualization.framework limitation — it's in the kernel's hypervisor trap handler.

  • Linux VMs are not subject to this limit (unlimited)
  • The macOS EULA also permits only 2 additional VM instances per physical Mac
  • Workaround: Apple KDK development kernel + hv_apple_isa_vm_quota=0xFF (up to 255 VMs, but breaks system updates, requires SIP disabled — not practical for production)
  • Scaling beyond 2: requires multiple physical Macs with orchestration (e.g., Cirrus Labs' Orchard)
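
Because the limit applies only to macOS guests, the backend could reject a third macOS VM up front instead of letting the start fail inside the framework. A minimal sketch of such an admission check, with hypothetical names and an in-process counter (a real implementation would need to count VMs across processes):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// maxMacOSGuests is the per-host limit described above.
const maxMacOSGuests = 2

var errMacOSQuota = errors.New("macOS guest limit reached (2 per host)")

// admission tracks running macOS guests; Linux guests are not counted.
type admission struct {
	mu    sync.Mutex
	macOS int
}

// acquire reserves a slot before a VM starts.
func (a *admission) acquire(macOSGuest bool) error {
	if !macOSGuest {
		return nil
	}
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.macOS >= maxMacOSGuests {
		return errMacOSQuota
	}
	a.macOS++
	return nil
}

// release frees the slot after a VM stops.
func (a *admission) release(macOSGuest bool) {
	if !macOSGuest {
		return
	}
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.macOS > 0 {
		a.macOS--
	}
}

func main() {
	var a admission
	for i := 1; i <= 3; i++ {
		fmt.Printf("macOS guest %d: %v\n", i, a.acquire(true))
	}
	fmt.Printf("linux guest: %v\n", a.acquire(false))
}
```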

macOS Guest Capabilities

Works: Metal GPU (paravirtualized, compute perf = native), general macOS apps, networking, storage, iCloud (macOS 15+)

Does NOT work: App Store apps, FairPlay DRM, nested virtualization, Touch ID


GPU Access in VMs

Three approaches with very different capabilities:

macOS Guest — Metal Paravirtualized GPU

VZMacGraphicsDeviceConfiguration exposes a paravirtualized Metal GPU to the guest.

  • GPU compute performance: identical to native (100% active residency, same frequency/power)
  • Graphics rendering: works well (80-84% GPU utilization)
  • CoreML / MLX: theoretically functional (both use Metal compute)
  • Limitation: virtual GPU presents as unrecognized device — some apps doing hardware checks may refuse

Linux Guest (AVF) — No GPU Acceleration

VZVirtioGraphicsDeviceConfiguration is 2D framebuffer only. CPU renders, host displays. No 3D, no compute.

Linux Guest (libkrun/krunkit) — Vulkan via Venus + MoltenVK

Completely separate stack (Red Hat, uses Hypervisor.framework not Virtualization.framework):

Guest: App → Vulkan → Mesa Venus driver → virtio-gpu shared memory
Host:  virglrenderer → MoltenVK → Metal → Apple GPU
  • Requires macOS 14+, Apple Silicon
  • llama.cpp ggml-vulkan: 77% of native Metal (20.84 vs 27 tokens/sec)
  • Newer ggml-remoting (Sep 2025): bypasses Vulkan, forwards tensor ops directly to host ggml-metal — 95-100% native speed
  • No GPU passthrough exists on Apple Silicon (unified memory SoC)

Code Architecture

hypervisor/
├── hypervisor.go              # Interface (unchanged)
├── cloudhypervisor/           # Linux backend (unchanged)
└── vfkit/                     # macOS backend (new)
    ├── vfkit.go
    ├── conf.go                # CLI arg builder
    ├── start.go
    ├── stop.go
    ├── create.go              # APFS clone instead of qcow2
    └── helper.go              # REST API client

network/
├── network.go                 # Interface (unchanged)
├── cni/                       # Linux (unchanged)
└── vmnet/                     # macOS (new, vmnet-helper based)

Bonus: vfkit is written in Go and provides github.com/crc-org/vfkit/pkg/config — can be used as a Go library instead of exec.
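
With both backends behind unchanged interfaces, the only platform-specific decision left is which pair to instantiate. A minimal sketch of that selection keyed on the host OS; the function name and the returned package names are illustrative, taken from the directory layout above, and build tags would be an alternative to a runtime switch.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// backendFor maps a GOOS value to the hypervisor and network backend
// directories from the layout above.
func backendFor(goos string) (hypervisor, network string, err error) {
	switch goos {
	case "linux":
		return "cloudhypervisor", "cni", nil
	case "darwin":
		return "vfkit", "vmnet", nil
	default:
		return "", "", fmt.Errorf("unsupported host OS %q", goos)
	}
}

func main() {
	hv, nw, err := backendFor(runtime.GOOS)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("hypervisor=%s network=%s\n", hv, nw)
}
```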


Implementation Plan

| Phase | Scope | Estimate |
|---|---|---|
| Phase 1 | EFI boot + NAT networking + APFS clone storage | 1–2 weeks |
| Phase 2 | vmnet-helper networking (fixed IP, external reachability) | 1 week |
| Phase 3 | OCI direct boot (uncompressed kernel) | 3–5 days |
| Phase 4 | macOS guest support (IPSW install, headless, Metal GPU) | 1 week |
| Phase 5 | Edge cases + testing | 1 week |

Phase 1 delivers a working vm create + vm start + vm stop + vm console on macOS with cloud images.
