nvme: set completion queue EventIdx more carefully #1043

@iximeow

Description

#1026 took a big hammer to this by configuring EventIdx such that we always get a doorbell when a CQ is emptied, but we really only need to hold EventIdx back if an SQ might have gotten corked on that CQ. In that case the doorbell is load-bearing: it tells us to wake the SQ and finally actually do the I/O.

I think we want to let EventIdx go only so far forward that the distance from CQ head to CQ EventIdx is the CQ size minus the number of SQs that complete into that queue. There is some raciness here I haven't convinced myself about, though: if a CQ has one SQ pointed at it, CQ_EventIdx - CQ_head == 1, then the guest configures a second SQ to that first CQ, and both submission queues get I/Os, one of the SQs could end up corked while the other takes the remaining CQ space.
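The bound described above could be sketched roughly like this (a sketch only; the function name and parameters are invented for illustration and are not Propolis's actual types):

```rust
// Sketch of the proposed EventIdx bound: let EventIdx advance only far
// enough that (EventIdx - head) == cq_size - num_sqs, holding one CQ slot
// in reserve per submission queue that completes into this CQ.
// All names here are invented for illustration.
fn proposed_event_idx(cq_head: u32, cq_size: u32, num_sqs: u32) -> u32 {
    let max_advance = cq_size.saturating_sub(num_sqs);
    // Queue indices wrap at cq_size.
    (cq_head + max_advance) % cq_size
}

fn main() {
    // CQ of 16 entries with two SQs completing into it: EventIdx is held
    // back two slots short of a full queue's worth of advance.
    assert_eq!(proposed_event_idx(0, 16, 2), 14);
    // With wraparound: head at 10, advance of 14 wraps to index 8.
    assert_eq!(proposed_event_idx(10, 16, 2), 8);
}
```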

That might be fine if, when we read the CQ shadow doorbell and discover the CQ is no longer full, we kick min(CQ_space, corked_SQs) submission queues so that I/Os get processed.
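The recovery step might look roughly like the following (again a sketch; the function name and both parameters are invented, and the real bookkeeping of which SQs are corked is not shown):

```rust
// Sketch of the recovery step: when the CQ shadow doorbell shows the CQ is
// no longer full, wake at most min(free CQ slots, corked SQs) submission
// queues, so every woken SQ has at least one completion slot to land in.
// The name and parameters are invented for illustration.
fn sqs_to_kick(cq_space: usize, corked_sqs: usize) -> usize {
    cq_space.min(corked_sqs)
}

fn main() {
    // Two corked SQs but only one free CQ slot: kick just one SQ.
    assert_eq!(sqs_to_kick(1, 2), 1);
    // Plenty of CQ space: kick every corked SQ.
    assert_eq!(sqs_to_kick(64, 3), 3);
}
```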

On the other hand, by setting EventIdx to one before the CQ tail, we might have to wait for the guest to get through a deep queue of completions (Linux picks 1024) before it notifies us there's space again. At 200 nanoseconds per completion (number is fake and invented), that's about 200 microseconds of latency to resume I/Os if the queue is full. We might want to limit EventIdx to be at most 64 entries away from the CQ tail, which also avoids the SQs-stay-corked-forever issue above. Then, CQ doorbells that we didn't need are an indicator that we're not pushing EventIdx far enough out, and maybe we should let it go higher?
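Plugging in the (admittedly invented) 200 ns figure shows the shape of the tradeoff: draining a full Linux-depth queue costs roughly a fifth of a millisecond, while a 64-entry cap bounds the wait to low double-digit microseconds:

```rust
fn main() {
    // Invented per-completion cost from the discussion above: 200 ns.
    let ns_per_completion: u64 = 200;

    // Worst case with EventIdx one before the tail: the guest drains the
    // whole queue (Linux defaults to 1024 entries) before ringing again.
    let full_queue_ns = 1024 * ns_per_completion;
    assert_eq!(full_queue_ns, 204_800); // ~200 microseconds

    // Capping EventIdx at 64 entries from the tail bounds the wait.
    let capped_ns = 64 * ns_per_completion;
    assert_eq!(capped_ns, 12_800); // ~13 microseconds
}
```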

Metadata

Assignees: no one assigned
Labels: enhancement (New feature or request), storage (Related to storage devices/backends)
