The ebpf-go CI has been plagued by a non-deterministic hang of
unit tests. It affects all packages and manifests as a write to
stdout getting stuck, followed by the test timing out. This
triggers a goroutine dump, which in turn unblocks the stuck write
to stdout.
It's possible to reproduce this behaviour using the following
command line:
taskset -c 0 vimto -smp cpus=2 -kernel ghcr.io/cilium/ci-kernels:6.15.3 \
exec -- sh -c 'seq 1 1000000 | while read i; do echo "line $i"; done'
After a few seconds the output will freeze. Inspecting the stack of
the executing program shows something like the following:
[<0>] wait_port_writable+0x139/0x2d0
[<0>] port_fops_write+0x88/0x130
[<0>] vfs_write+0xf3/0x450
[<0>] ksys_write+0x6d/0xe0
[<0>] do_syscall_64+0x9e/0x1a0
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
1 0x1 0x7ffdf4878c80 0x9 0x0 0x0 0x0 0x7ffdf4878c20 0x7f592daed77e
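The last line of the output above has the shape of a /proc/<pid>/syscall record: syscall number, six arguments, stack pointer, program counter. That reading is an assumption, since the description does not say which file the line came from. Under that assumption, a minimal sketch for decoding it follows. The helper name `decode_syscall_line` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: decode the leading fields of a /proc/<pid>/syscall line.
# Assumed layout: <nr> <arg0> ... <arg5> <sp> <pc>
decode_syscall_line() {
    # Intentionally unquoted so the line is split into positional parameters.
    set -- $1
    # For write(2) the first three arguments are fd, buffer and length.
    printf 'syscall=%d fd=%d len=%d\n' "$1" "$2" "$4"
}

decode_syscall_line "1 0x1 0x7ffdf4878c80 0x9 0x0 0x0 0x0 0x7ffdf4878c20 0x7f592daed77e"
```

On x86-64, syscall number 1 is write, so this would be a 9-byte write to file descriptor 1 (stdout), which matches the ksys_write frame in the stack.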
As far as I can tell, it is critical that execution is restricted to
a single CPU on the host side, while QEMU presents two vCPUs to the VM.
Passing ioeventfd=off to the serial console device works around
this problem.
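For illustration, the workaround might look roughly like the following on a plain QEMU command line. This is a sketch rather than the arguments vimto actually generates: it assumes the console is attached through a virtio-serial-pci device, and the `con0` identifier is made up for the example. Kernel and root filesystem options are left out.

```shell
# Fragment only: shows where ioeventfd=off would go.
# Assumption: the console is backed by virtio-serial-pci; "con0" is a made-up id.
qemu-system-x86_64 \
    -smp cpus=2 \
    -chardev stdio,id=con0 \
    -device virtio-serial-pci,ioeventfd=off \
    -device virtconsole,chardev=con0
```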
See cilium/ebpf#1734 for more details.