From: Stefan Hajnoczi <stefanha@redhat.com>
To: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
qemu-block@nongnu.org, pbonzini@redhat.com,
Hanna Reitz <hreitz@redhat.com>, Fam Zheng <fam@euphon.net>,
Fiona Ebner <f.ebner@proxmox.com>,
Stefan Hajnoczi <stefanha@redhat.com>
Subject: [RFC 0/3] aio-posix: call ->poll_end() when removing AioHandler
Date: Wed, 13 Dec 2023 16:15:41 -0500
Message-ID: <20231213211544.1601971-1-stefanha@redhat.com>
Hanna and Fiona encountered a bug in aio_set_fd_handler(): there is no matching
io_poll_end() call upon removing an AioHandler when io_poll_begin() was
previously called. The missing io_poll_end() call leaves virtqueue
notifications disabled, so the virtqueue's ioeventfd never becomes readable
again.
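To make the pairing problem concrete, here is a toy model in C. It is not
QEMU code: ToyQueue, ToyHandler, and every toy_* function are hypothetical
names invented for illustration, and the real AioHandler logic in
util/aio-posix.c is considerably more involved. The sketch only shows why
removing a handler between poll_begin and poll_end leaves notifications
switched off, and how calling poll_end on removal avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a virtqueue: notifications are off while polling. */
typedef struct {
    bool notifications_enabled;
} ToyQueue;

/* Toy stand-in for an AioHandler; field names are illustrative only. */
typedef struct {
    ToyQueue *queue;
    bool poll_started;  /* true between poll_begin and poll_end */
    bool registered;
} ToyHandler;

/* Entering polling mode: stop asking for notifications. */
void toy_poll_begin(ToyHandler *h)
{
    h->queue->notifications_enabled = false;
    h->poll_started = true;
}

/* Leaving polling mode: ask for notifications again. */
void toy_poll_end(ToyHandler *h)
{
    h->queue->notifications_enabled = true;
    h->poll_started = false;
}

/* Removal without a matching poll_end: notifications stay disabled. */
void toy_remove_buggy(ToyHandler *h)
{
    h->registered = false;
}

/* Removal that first completes any poll_begin/poll_end pair. */
void toy_remove_fixed(ToyHandler *h)
{
    if (h->poll_started) {
        toy_poll_end(h);
    }
    h->registered = false;
}
```

In this model, a queue whose handler is removed with toy_remove_buggy()
while polling is active stays silent for whoever registers the next handler,
which mirrors the hang described above.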
The details of how virtio-scsi devices using IOThreads can hang after
hotplug/unplug are covered here:
https://issues.redhat.com/browse/RHEL-3934
Hanna is currently away over the December holidays. I'm sending these RFC
patches in the meantime. They demonstrate running aio_set_fd_handler() in the
AioContext home thread and adding the missing io_poll_end() call.
The downside to my approach is that aio_set_fd_handler() becomes a
synchronization point that waits for the remote AioContext thread to finish
running a BH. Synchronization points are prone to deadlocks if the caller
invokes them while holding a lock that the remote AioContext needs to make
progress or if the remote AioContext cannot make progress before we make
progress in our own event loop. To minimize these concerns I have based this
patch series on my AioContext lock removal series and only allow the main loop
thread to call aio_set_fd_handler() on other threads (which I think is already
the convention today).
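The synchronization point described above can be sketched with plain
pthreads. This is a simplified illustration under stated assumptions, not
the code in these patches: ToyContext and the toy_* functions are
hypothetical, the "event loop" holds at most one queued callback, and only a
single caller thread is assumed. The point is that the caller blocks until
the home thread has run the callback, so if the caller held a lock the
callback needs, neither side could make progress.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef void ToyBHFunc(void *opaque);

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    ToyBHFunc *pending;  /* at most one queued callback in this toy */
    void *opaque;
    bool done;           /* set once the queued callback has run */
    bool stop;
} ToyContext;

void toy_context_init(ToyContext *ctx)
{
    pthread_mutex_init(&ctx->lock, NULL);
    pthread_cond_init(&ctx->cond, NULL);
    ctx->pending = NULL;
    ctx->opaque = NULL;
    ctx->done = false;
    ctx->stop = false;
}

/* Home thread: runs queued callbacks until asked to stop. */
void *toy_home_thread(void *arg)
{
    ToyContext *ctx = arg;

    pthread_mutex_lock(&ctx->lock);
    while (!ctx->stop) {
        if (ctx->pending) {
            ToyBHFunc *fn = ctx->pending;
            void *opaque = ctx->opaque;

            ctx->pending = NULL;
            pthread_mutex_unlock(&ctx->lock);
            fn(opaque);  /* run the callback without holding the lock */
            pthread_mutex_lock(&ctx->lock);
            ctx->done = true;
            pthread_cond_broadcast(&ctx->cond);
        } else {
            pthread_cond_wait(&ctx->cond, &ctx->lock);
        }
    }
    pthread_mutex_unlock(&ctx->lock);
    return NULL;
}

/* Caller side: queue fn and block until the home thread has run it. */
void toy_run_in_home_thread(ToyContext *ctx, ToyBHFunc *fn, void *opaque)
{
    pthread_mutex_lock(&ctx->lock);
    ctx->pending = fn;
    ctx->opaque = opaque;
    ctx->done = false;
    pthread_cond_broadcast(&ctx->cond);
    while (!ctx->done) {
        /* Synchronization point: the caller makes no progress here. */
        pthread_cond_wait(&ctx->cond, &ctx->lock);
    }
    pthread_mutex_unlock(&ctx->lock);
}

void toy_stop(ToyContext *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    ctx->stop = true;
    pthread_cond_broadcast(&ctx->cond);
    pthread_mutex_unlock(&ctx->lock);
}

/* Example callback: records that it ran. */
void toy_set_flag(void *opaque)
{
    *(int *)opaque = 1;
}
```

A caller would start toy_home_thread() with pthread_create(), call
toy_run_in_home_thread(&ctx, toy_set_flag, &value), and find value set on
return. Restricting such calls to one well-known thread, as proposed above
for the main loop thread, narrows the set of lock orderings that can
deadlock.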
Another concern is that aio_set_fd_handler() now invokes user-provided
io_poll_end(), io_poll(), and io_poll_ready() functions. The io_poll_ready()
callback might contain a nested aio_poll() call, so there is a new place where
nested event loops can occur and hence a new re-entrant code path that I
haven't thought about yet.
But there you have it. Please let me know what you think and try your
reproducers to see if this fixes the missing io_poll_end() issue. Thanks!
Alternatives welcome! (A cleaner version of this approach might be to forbid
cross-thread aio_set_fd_handler() calls and to refactor all
aio_set_fd_handler() callers so they come from the AioContext's home thread.
I'm starting to think that only the aio_notify() and aio_schedule_bh() APIs
should be thread-safe.)
Stefan Hajnoczi (3):
  aio-posix: run aio_set_fd_handler() in target AioContext
  aio: use counter instead of ctx->list_lock
  aio-posix: call ->poll_end() when removing AioHandler

 include/block/aio.h |  22 ++---
 util/aio-posix.c    | 197 ++++++++++++++++++++++++++++++++------------
 util/async.c        |   2 -
 util/fdmon-epoll.c  |   6 +-
 4 files changed, 152 insertions(+), 75 deletions(-)
--
2.43.0