the existing file backend in block/file.rs gets us to pretty thrilling throughput (iops/bandwidth) figures (given enough worker threads), but is really pushing against the system to get that throughput. given files/devices/raw zvols that support async I/O, we can probably get much better throughput without needing several hundred threads by "just" doing async I/O from propolis. port_associate(3C) talks about exactly what we might want to do, under "Bind AIO transaction to a specific port".
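roughly, the pattern that man page section describes looks like this (a standalone C sketch of the AIO + event-port flow, not propolis code; the zvol path and the minimal error handling are just for illustration):

```c
/*
 * sketch: submit an aio_read(3C) whose completion is delivered to an
 * event port, per "Bind AIO transaction to a specific port" in
 * port_associate(3C). the device path is made up.
 */
#include <aio.h>
#include <port.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>

int
main(void)
{
    int port = port_create();
    int fd = open("/dev/zvol/rdsk/rpool/example", O_RDONLY);
    char buf[4096];

    if (port < 0 || fd < 0) {
        perror("setup");
        return (1);
    }

    /* deliver the completion to the port instead of raising a signal */
    port_notify_t pn = {
        .portnfy_port = port,
        .portnfy_user = NULL,   /* per-I/O cookie, e.g. a request pointer */
    };
    struct aiocb cb = { 0 };
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof (buf);
    cb.aio_offset = 0;
    cb.aio_sigevent.sigev_notify = SIGEV_PORT;
    cb.aio_sigevent.sigev_value.sival_ptr = &pn;

    if (aio_read(&cb) != 0) {
        perror("aio_read");
        return (1);
    }

    /* one thread can reap completions for many in-flight aiocbs */
    port_event_t pe;
    if (port_get(port, &pe, NULL) == 0 && pe.portev_source == PORT_SOURCE_AIO) {
        struct aiocb *done = (struct aiocb *)pe.portev_object;
        if (aio_error(done) == 0)
            printf("read returned %zd bytes\n", aio_return(done));
    }
    return (0);
}
```

the interesting part is the last step: one (or a few) reaper threads blocked in port_get() can service completions for an arbitrary number of outstanding I/Os, instead of one blocked thread per I/O.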
the math
we have a whole bunch of Propolis threads whose only purpose is to consume stack space, go into the kernel, wait, come back, and maybe sleep. the context switching from all this ends up on the order of 10% of total CPU time (or closer to 20% of non-idle CPU time). each I/O ends up at biowait() in the kernel (twice!), which is especially egregious for writes that complete in single-digit microseconds to the hardware. we're just offering up millions of opportunities to context switch too eagerly.
a different problem is that to plumb all the throughput hardware might support, we may need a truly astounding number of threads. with some relatively conservative figures, assume a disk can support 2M reads/sec at 100 microseconds per read. that's 200 thread-seconds of waiting every second, and if each thread has one I/O outstanding at a time, that implies at least 200 threads per disk. context switching gets even worse.
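spelling that arithmetic out:

$$
2{,}000{,}000\ \tfrac{\text{reads}}{\text{s}} \times 100\ \mu\text{s waiting per read} = 200\ \tfrac{\text{thread-seconds}}{\text{s}}
$$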
then there's the number of queues. we can tune the number of NVMe queues up, and that's great, but even at 64 queues that's ~4 threads constantly fighting over each SQ/CQ state lock. not great.
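for reference, that ~4 is roughly the thread count from above spread across the queues:

$$
\frac{\text{200+ threads}}{\text{64 queues}} \approx \text{3–4 threads contending per SQ/CQ pair}
$$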
so, the math doesn't look good for getting much better with synchronous I/O!