From: Peter Xu <peterx@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Fabiano Rosas" <farosas@suse.de>,
"David Hildenbrand" <david@redhat.com>,
peterx@redhat.com, "Paolo Bonzini" <pbonzini@redhat.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Peixiu Hou" <phou@redhat.com>, "Kevin Wolf" <kwolf@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: [PULL 28/36] system/physmem: mark io_mem_unassigned lockless
Date: Mon, 3 Nov 2025 16:06:17 -0500
Message-ID: <20251103210625.3689448-29-peterx@redhat.com>
In-Reply-To: <20251103210625.3689448-1-peterx@redhat.com>
From: Stefan Hajnoczi <stefanha@redhat.com>
When the Bus Master bit is disabled in a PCI device's Command Register,
the device's DMA address space becomes unassigned memory (i.e. the
io_mem_unassigned MemoryRegion).

This can lead to deadlocks with IOThreads since io_mem_unassigned
accesses attempt to acquire the Big QEMU Lock (BQL). For example,
virtio-pci devices deadlock in virtio_write_config() ->
virtio_pci_stop_ioeventfd() when waiting for the IOThread while holding
the BQL. The IOThread is unable to acquire the BQL but the vcpu thread
won't release the BQL while waiting for the IOThread.

io_mem_unassigned is trivially thread-safe since it has no state: it
simply rejects all load/store accesses. Therefore it is safe to enable
lockless I/O on io_mem_unassigned to eliminate this deadlock.

Here is the backtrace described above:
Thread 9 (Thread 0x7fccfcdff6c0 (LWP 247832) "CPU 4/KVM"):
#0 0x00007fcd11529d46 in ppoll () from target:/lib64/libc.so.6
#1 0x000056468a1a9bad in ppoll (__fds=<optimized out>, __nfds=<optimized out>, __timeout=0x0, __ss=0x0) at /usr/include/bits/poll2.h:88
#2 0x000056468a18f9d9 in fdmon_poll_wait (ctx=0x5646c6a1dc30, ready_list=0x7fccfcdfb310, timeout=-1) at ../util/fdmon-poll.c:79
#3 0x000056468a18f14f in aio_poll (ctx=<optimized out>, blocking=blocking@entry=true) at ../util/aio-posix.c:730
#4 0x000056468a1ad842 in aio_wait_bh_oneshot (ctx=<optimized out>, cb=cb@entry=0x564689faa420 <virtio_blk_ioeventfd_stop_vq_bh>, opaque=<optimized out>) at ../util/aio-wait.c:85
#5 0x0000564689faaa89 in virtio_blk_stop_ioeventfd (vdev=0x5646c8fd7e90) at ../hw/block/virtio-blk.c:1644
#6 0x0000564689d77880 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x5646c8fd7e08) at ../hw/virtio/virtio-bus.c:264
#7 0x0000564689d780db in virtio_bus_stop_ioeventfd (bus=bus@entry=0x5646c8fd7e08) at ../hw/virtio/virtio-bus.c:256
#8 0x0000564689d7d98a in virtio_pci_stop_ioeventfd (proxy=0x5646c8fcf8e0) at ../hw/virtio/virtio-pci.c:413
#9 virtio_write_config (pci_dev=0x5646c8fcf8e0, address=4, val=<optimized out>, len=<optimized out>) at ../hw/virtio/virtio-pci.c:803
#10 0x0000564689dcb45a in memory_region_write_accessor (mr=mr@entry=0x5646c6dc2d30, addr=3145732, value=value@entry=0x7fccfcdfb528, size=size@entry=2, shift=<optimized out>, mask=mask@entry=65535, attrs=...) at ../system/memory.c:491
#11 0x0000564689dcaeb0 in access_with_adjusted_size (addr=addr@entry=3145732, value=value@entry=0x7fccfcdfb528, size=size@entry=2, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x564689dcb3f0 <memory_region_write_accessor>, mr=0x5646c6dc2d30, attrs=...) at ../system/memory.c:567
#12 0x0000564689dcb156 in memory_region_dispatch_write (mr=mr@entry=0x5646c6dc2d30, addr=addr@entry=3145732, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ../system/memory.c:1554
#13 0x0000564689dd389a in flatview_write_continue_step (attrs=..., attrs@entry=..., buf=buf@entry=0x7fcd05b87028 "", mr_addr=3145732, l=l@entry=0x7fccfcdfb5f0, mr=0x5646c6dc2d30, len=2) at ../system/physmem.c:3266
#14 0x0000564689dd3adb in flatview_write_continue (fv=0x7fcadc0d8930, addr=3761242116, attrs=..., ptr=0xe0300004, len=2, mr_addr=<optimized out>, l=<optimized out>, mr=<optimized out>) at ../system/physmem.c:3296
#15 flatview_write (fv=0x7fcadc0d8930, addr=addr@entry=3761242116, attrs=attrs@entry=..., buf=buf@entry=0x7fcd05b87028, len=len@entry=2) at ../system/physmem.c:3327
#16 0x0000564689dd7191 in address_space_write (as=0x56468b433600 <address_space_memory>, addr=3761242116, attrs=..., buf=0x7fcd05b87028, len=2) at ../system/physmem.c:3447
#17 address_space_rw (as=0x56468b433600 <address_space_memory>, addr=3761242116, attrs=attrs@entry=..., buf=buf@entry=0x7fcd05b87028, len=2, is_write=<optimized out>) at ../system/physmem.c:3457
#18 0x0000564689ff1ef6 in kvm_cpu_exec (cpu=cpu@entry=0x5646c6dab810) at ../accel/kvm/kvm-all.c:3248
#19 0x0000564689ff32f5 in kvm_vcpu_thread_fn (arg=arg@entry=0x5646c6dab810) at ../accel/kvm/kvm-accel-ops.c:53
#20 0x000056468a19225c in qemu_thread_start (args=0x5646c6db6190) at ../util/qemu-thread-posix.c:393
#21 0x00007fcd114c5b68 in start_thread () from target:/lib64/libc.so.6
#22 0x00007fcd115364e4 in clone () from target:/lib64/libc.so.6
Thread 3 (Thread 0x7fcd0503a6c0 (LWP 247825) "IO iothread1"):
#0 0x00007fcd114c2d30 in __lll_lock_wait () from target:/lib64/libc.so.6
#1 0x00007fcd114c8fe2 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6
#2 0x000056468a192538 in qemu_mutex_lock_impl (mutex=0x56468b432e60 <bql>, file=0x56468a1e26a5 "../system/physmem.c", line=3198) at ../util/qemu-thread-posix.c:94
#3 0x0000564689dc12e2 in bql_lock_impl (file=file@entry=0x56468a1e26a5 "../system/physmem.c", line=line@entry=3198) at ../system/cpus.c:566
#4 0x0000564689ddc151 in prepare_mmio_access (mr=0x56468b433800 <io_mem_unassigned>) at ../system/physmem.c:3198
#5 address_space_lduw_internal_cached_slow (cache=<optimized out>, addr=2, attrs=..., result=0x0, endian=DEVICE_LITTLE_ENDIAN) at ../system/memory_ldst.c.inc:211
#6 address_space_lduw_le_cached_slow (cache=<optimized out>, addr=addr@entry=2, attrs=attrs@entry=..., result=result@entry=0x0) at ../system/memory_ldst.c.inc:253
#7 0x0000564689fd692c in address_space_lduw_le_cached (result=0x0, cache=<optimized out>, addr=2, attrs=...) at /var/tmp/qemu/include/exec/memory_ldst_cached.h.inc:35
#8 lduw_le_phys_cached (cache=<optimized out>, addr=2) at /var/tmp/qemu/include/exec/memory_ldst_phys.h.inc:66
#9 virtio_lduw_phys_cached (vdev=<optimized out>, cache=<optimized out>, pa=2) at /var/tmp/qemu/include/hw/virtio/virtio-access.h:166
#10 vring_avail_idx (vq=0x5646c8fe2470) at ../hw/virtio/virtio.c:396
#11 virtio_queue_split_set_notification (vq=0x5646c8fe2470, enable=0) at ../hw/virtio/virtio.c:534
#12 virtio_queue_set_notification (vq=0x5646c8fe2470, enable=0) at ../hw/virtio/virtio.c:595
#13 0x000056468a18e7a8 in poll_set_started (ctx=ctx@entry=0x5646c6c74e30, ready_list=ready_list@entry=0x7fcd050366a0, started=started@entry=true) at ../util/aio-posix.c:247
#14 0x000056468a18f2bb in poll_set_started (ctx=0x5646c6c74e30, ready_list=0x7fcd050366a0, started=true) at ../util/aio-posix.c:226
#15 try_poll_mode (ctx=0x5646c6c74e30, ready_list=0x7fcd050366a0, timeout=<synthetic pointer>) at ../util/aio-posix.c:612
#16 aio_poll (ctx=0x5646c6c74e30, blocking=blocking@entry=true) at ../util/aio-posix.c:689
#17 0x000056468a032c26 in iothread_run (opaque=opaque@entry=0x5646c69f3380) at ../iothread.c:63
#18 0x000056468a19225c in qemu_thread_start (args=0x5646c6c75410) at ../util/qemu-thread-posix.c:393
#19 0x00007fcd114c5b68 in start_thread () from target:/lib64/libc.so.6
#20 0x00007fcd115364e4 in clone () from target:/lib64/libc.so.6
Buglink: https://issues.redhat.com/browse/RHEL-71933
Reported-by: Peixiu Hou <phou@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20251029185224.420261-1-stefanha@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
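Illustration only, not part of the patch: the commit message argues that a
stateless region which rejects every access can skip the BQL. A minimal toy
model of that idea might look like the sketch below. Every name in it
(ToyRegion, toy_prepare_access, toy_unassigned_read, toy_load) is
hypothetical and does not exist in QEMU, and the "lock" is just a counter,
so this shows the shape of the check rather than how physmem.c implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in QEMU. */
typedef struct ToyRegion {
    const char *name;
    bool lockless_io;   /* true: accesses need no global lock */
} ToyRegion;

/* Counts how often the pretend "big lock" is taken. */
static int toy_lock_acquisitions;

/* Take the toy global lock unless the region is lockless.
 * Returns true if the lock was taken. */
static bool toy_prepare_access(const ToyRegion *mr)
{
    if (mr->lockless_io) {
        return false;
    }
    toy_lock_acquisitions++;   /* a real lock could block here */
    return true;
}

/* A stateless region: every load is rejected. */
static bool toy_unassigned_read(const ToyRegion *mr, uint64_t addr,
                                uint64_t *val)
{
    (void)mr;
    (void)addr;
    *val = 0;
    return false;
}

/* Perform one load; return how many lock acquisitions it cost. */
static int toy_load(const ToyRegion *mr, uint64_t addr)
{
    int before = toy_lock_acquisitions;
    uint64_t val;
    bool locked = toy_prepare_access(mr);
    bool ok = toy_unassigned_read(mr, addr, &val);

    assert(!ok);      /* the region never accepts an access */
    (void)ok;
    (void)locked;     /* a real implementation would unlock here if locked */
    return toy_lock_acquisitions - before;
}
```

In this model, a load from a region with lockless_io set never touches the
lock counter, which is the property that lets the IOThread in the backtrace
make progress while the vcpu thread holds the BQL.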
system/physmem.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/system/physmem.c b/system/physmem.c
index a7e2a5d07f..c9869e4049 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -3011,6 +3011,9 @@ static void io_mem_init(void)
 {
     memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
                           NULL, UINT64_MAX);
+
+    /* Trivially thread-safe since memory accesses are rejected */
+    memory_region_enable_lockless_io(&io_mem_unassigned);
 }
 
 AddressSpaceDispatch *address_space_dispatch_new(FlatView *fv)
--
2.50.1