Subject: drain_call_rcu() vs nested event loops
From: Stefan Hajnoczi
Date: 2023-07-13 19:42 UTC
To: pbonzini, kwolf
Cc: qemu-devel
Hi,
I've encountered a bug where two vcpu threads enter a device's MMIO
emulation callback at the same time. This is never supposed to happen
thanks to the Big QEMU Lock (BQL), but drain_call_rcu() and nested event
loops make it possible:
1. A device's MMIO emulation callback invokes AIO_WAIT_WHILE().
2. A device_add monitor command runs in AIO_WAIT_WHILE()'s aio_poll()
nested event loop.
3. qmp_device_add() -> drain_call_rcu() is called and the BQL is
temporarily dropped.
4. Another vcpu thread dispatches the same device's MMIO callback
because it is now able to acquire the BQL.
I've included the backtraces below if you want to see the details. They
are from a RHEL qemu-kvm 6.2.0-35 coredump but I haven't found anything
in qemu.git/master that would fix this.
One fix is to make qmp_device_add() a coroutine and schedule a BH in the
iohandler AioContext. That way the coroutine must wait until the nested
event loop finishes before its BH executes. drain_call_rcu() will never
be called from a nested event loop and the problem does not occur
anymore.
Another possibility is to remove the following in monitor_qmp_dispatcher_co():
    /*
     * Move the coroutine from iohandler_ctx to qemu_aio_context for
     * executing the command handler so that it can make progress if it
     * involves an AIO_WAIT_WHILE().
     */
    aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
    qemu_coroutine_yield();
By executing QMP commands in the iohandler AioContext by default, we can
prevent issues like this in the future. However, some QMP commands may
assume they are running in qemu_aio_context (e.g. coroutine commands that
yield) and would need to move themselves there explicitly.
What do you think?
Stefan
---
Thread 41 (Thread 0x7fdc3dffb700 (LWP 910296)):
#0 0x00007fde88ac99bd in syscall () from /lib64/libc.so.6
#1 0x000055bd7a2e066f in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /usr/src/debug/qemu-kvm-6.2.0-35.module+el8.9.0+19024+8193e2ac.x86_64/include/qemu/futex.h:29
#2 qemu_event_wait (ev=ev@entry=0x7fdc3dffa2d0) at ../util/qemu-thread-posix.c:510
#3 0x000055bd7a2e8e54 in drain_call_rcu () at ../util/rcu.c:347
#4 0x000055bd79f63d1e in qmp_device_add (qdict=<optimized out>, ret_data=<optimized out>, errp=<optimized out>) at ../softmmu/qdev-monitor.c:863
#5 0x000055bd7a2d420d in do_qmp_dispatch_bh (opaque=0x7fde8c22aee0) at ../qapi/qmp-dispatch.c:129
#6 0x000055bd7a2ef3bd in aio_bh_call (bh=0x7fdc6015cd50) at ../util/async.c:174
#7 aio_bh_poll (ctx=ctx@entry=0x55bd7c910f40) at ../util/async.c:174
#8 0x000055bd7a2dd3b2 in aio_poll (ctx=0x55bd7c910f40, blocking=blocking@entry=true) at ../util/aio-posix.c:659
#9 0x000055bd7a2effea in aio_wait_bh_oneshot (ctx=0x55bd7ca980e0, cb=cb@entry=0x55bd7a11a9c0 <virtio_blk_data_plane_stop_bh>, opaque=opaque@entry=0x55bd7e585c40) at ../util/aio-wait.c:85
#10 0x000055bd7a11b30b in virtio_blk_data_plane_stop (vdev=<optimized out>) at ../hw/block/dataplane/virtio-blk.c:333
#11 0x000055bd7a0591e0 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x55bd7cb57ba8) at ../hw/virtio/virtio-bus.c:258
#12 0x000055bd7a05995f in virtio_bus_stop_ioeventfd (bus=bus@entry=0x55bd7cb57ba8) at ../hw/virtio/virtio-bus.c:250
#13 0x000055bd7a05b238 in virtio_pci_stop_ioeventfd (proxy=0x55bd7cb4f9a0) at ../hw/virtio/virtio-pci.c:1289
#14 virtio_pci_common_write (opaque=0x55bd7cb4f9a0, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1289
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#15 0x000055bd7a0f6777 in memory_region_write_accessor (mr=0x55bd7cb50410, addr=<optimized out>, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...) at ../softmmu/memory.c:492
#16 0x000055bd7a0f320e in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7fdc3dffa5c8, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>,
access_fn=0x55bd7a0f6710 <memory_region_write_accessor>, mr=0x55bd7cb50410, attrs=...) at ../softmmu/memory.c:554
#17 0x000055bd7a0f62a3 in memory_region_dispatch_write (mr=mr@entry=0x55bd7cb50410, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ../softmmu/memory.c:1504
#18 0x000055bd7a0e7f2e in flatview_write_continue (fv=fv@entry=0x55bd7d17cad0, addr=addr@entry=4236247060, attrs=..., ptr=ptr@entry=0x7fde84003028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>,
mr=0x55bd7cb50410) at /usr/src/debug/qemu-kvm-6.2.0-35.module+el8.9.0+19024+8193e2ac.x86_64/include/qemu/host-utils.h:165
#19 0x000055bd7a0e8093 in flatview_write (fv=0x55bd7d17cad0, addr=4236247060, attrs=..., buf=0x7fde84003028, len=1) at ../softmmu/physmem.c:2856
#20 0x000055bd7a0ebc6f in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at ../softmmu/physmem.c:2952
#21 0x000055bd7a1a28b9 in kvm_cpu_exec (cpu=cpu@entry=0x55bd7cc32bf0) at ../accel/kvm/kvm-all.c:2995
#22 0x000055bd7a1a36e5 in kvm_vcpu_thread_fn (arg=0x55bd7cc32bf0) at ../accel/kvm/kvm-accel-ops.c:49
#23 0x000055bd7a2dfdd4 in qemu_thread_start (args=0x55bd7cc41f20) at ../util/qemu-thread-posix.c:585
#24 0x00007fde88e5d1ca in start_thread () from /lib64/libpthread.so.0
#25 0x00007fde88ac9e73 in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7fdc6f5fe700 (LWP 910286)):
#0 0x00007fde88adeacf in raise () from /lib64/libc.so.6
#1 0x00007fde88ab1ea5 in abort () from /lib64/libc.so.6
#2 0x00007fde88ab1d79 in __assert_fail_base.cold.0 () from /lib64/libc.so.6
#3 0x00007fde88ad7426 in __assert_fail () from /lib64/libc.so.6
#4 0x000055bd7a1175c8 in virtio_blk_set_status (vdev=0x55bd7cb57c30, status=<optimized out>) at ../hw/block/virtio-blk.c:1043
#5 0x000055bd7a1474e4 in virtio_set_status (vdev=vdev@entry=0x55bd7cb57c30, val=val@entry=0 '\000') at ../hw/virtio/virtio.c:1945
#6 0x000055bd7a05b243 in virtio_pci_common_write (opaque=0x55bd7cb4f9a0, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1292
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#7 0x000055bd7a0f6777 in memory_region_write_accessor (mr=0x55bd7cb50410, addr=<optimized out>, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...) at ../softmmu/memory.c:492
#8 0x000055bd7a0f320e in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7fdc6f5fd5c8, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>,
access_fn=0x55bd7a0f6710 <memory_region_write_accessor>, mr=0x55bd7cb50410, attrs=...) at ../softmmu/memory.c:554
#9 0x000055bd7a0f62a3 in memory_region_dispatch_write (mr=mr@entry=0x55bd7cb50410, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ../softmmu/memory.c:1504
#10 0x000055bd7a0e7f2e in flatview_write_continue (fv=fv@entry=0x7fdad69b4a90, addr=addr@entry=4236247060, attrs=..., ptr=ptr@entry=0x7fde8c05f028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>,
mr=0x55bd7cb50410) at /usr/src/debug/qemu-kvm-6.2.0-35.module+el8.9.0+19024+8193e2ac.x86_64/include/qemu/host-utils.h:165
#11 0x000055bd7a0e8093 in flatview_write (fv=0x7fdad69b4a90, addr=4236247060, attrs=..., buf=0x7fde8c05f028, len=1) at ../softmmu/physmem.c:2856
#12 0x000055bd7a0ebc6f in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at ../softmmu/physmem.c:2952
#13 0x000055bd7a1a28b9 in kvm_cpu_exec (cpu=cpu@entry=0x55bd7cb953f0) at ../accel/kvm/kvm-all.c:2995
#14 0x000055bd7a1a36e5 in kvm_vcpu_thread_fn (arg=0x55bd7cb953f0) at ../accel/kvm/kvm-accel-ops.c:49
#15 0x000055bd7a2dfdd4 in qemu_thread_start (args=0x55bd7cba4420) at ../util/qemu-thread-posix.c:585
#16 0x00007fde88e5d1ca in start_thread () from /lib64/libpthread.so.0
#17 0x00007fde88ac9e73 in clone () from /lib64/libc.so.6