From: Stefan Hajnoczi <stefanha@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org,
"Dr. David Alan Gilbert" <dave@treblig.org>,
"Eduardo Habkost" <eduardo@habkost.net>,
"Markus Armbruster" <armbru@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
kwolf@redhat.com, "Maxim Levitsky" <mlevitsk@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>
Subject: Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine
Date: Thu, 7 Sep 2023 10:00:06 -0400
Message-ID: <20230907140006.GA1363873@fedora>
In-Reply-To: <1fa3ad95-c335-7e97-42f0-00dca5c5ba48@redhat.com>
On Thu, Sep 07, 2023 at 01:28:55PM +0200, Paolo Bonzini wrote:
> On 9/6/23 21:01, Stefan Hajnoczi wrote:
> > It is not safe to call drain_call_rcu() from qmp_device_add() because
> > some call stacks are not prepared for drain_call_rcu() to drop the Big
> > QEMU Lock (BQL).
> >
> > For example, device emulation code is protected by the BQL but when it
> > calls aio_poll() -> ... -> qmp_device_add() -> drain_call_rcu() then the
> > BQL is dropped. See https://bugzilla.redhat.com/show_bug.cgi?id=2215192 for a
> > concrete bug of this type.
> >
> > Another limitation of drain_call_rcu() is that it cannot be invoked within an
> > RCU read-side critical section since the reclamation phase cannot complete
> > until the end of the critical section. Unfortunately, call stacks have been
> > seen where this happens (see
> > https://bugzilla.redhat.com/show_bug.cgi?id=2214985).
>
> I think the root cause here is that do_qmp_dispatch_bh is called on the
> wrong context, namely qemu_get_aio_context() instead of
> iohandler_get_aio_context(). This is what causes it to move to the vCPU
> thread.
>
> Auditing all subsystems that use iohandler_get_aio_context(), for example
> via qemu_set_fd_handler(), together with bottom halves, would be a bit
> daunting.
>
> I don't have any objection to this patch series actually, but I would like
> to see if using the right AioContext also fixes the bug---and then treat
> these changes as more of a cleanup. Coroutines are pretty pervasive in QEMU
> and are not going away which, as you say in the updated docs, makes
> drain_call_rcu_co() preferable to drain_call_rcu().
While I agree that the issue would not happen if monitor commands only
ran in the iohandler AioContext, I don't think we can change that.

When Kevin implemented coroutine commands in commit 9ce44e2ce267 ("qmp:
Move dispatcher to a coroutine"), he used qemu_get_aio_context()
deliberately so that AIO_WAIT_WHILE() can make progress.
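
As a rough illustration of why the choice of context matters for a nested
wait, here is a toy Python model. It is not QEMU code: ToyContext,
wait_while() and dispatch_and_wait() are made-up names, and the model only
assumes that a nested wait polls a single context, so a bottom half queued
in a different context never runs during that wait.

```python
from collections import deque


class ToyContext:
    """Toy stand-in for an AioContext: just a queue of bottom halves."""

    def __init__(self, name):
        self.name = name
        self.bhs = deque()

    def schedule_bh(self, fn):
        self.bhs.append(fn)

    def poll(self):
        """Run the bottom halves that were pending on entry."""
        for _ in range(len(self.bhs)):
            self.bhs.popleft()()


def wait_while(ctx, cond, max_polls=10):
    """Toy nested wait: poll only ctx until cond() becomes false.

    Returns True if the condition cleared, False if we gave up.
    """
    for _ in range(max_polls):
        if not cond():
            return True
        ctx.poll()
    return not cond()


def dispatch_and_wait(dispatch_ctx, wait_ctx):
    """Queue a 'dispatcher' BH in dispatch_ctx, then wait by polling wait_ctx."""
    state = {"done": False}
    dispatch_ctx.schedule_bh(lambda: state.update(done=True))
    return wait_while(wait_ctx, lambda: not state["done"])


main_ctx = ToyContext("main")
# BH queued in the same context that the nested wait polls: it runs.
same_context = dispatch_and_wait(main_ctx, main_ctx)
# BH queued in a separate context: the nested wait never reaches it.
other_context = dispatch_and_wait(ToyContext("iohandler"), main_ctx)
print(same_context, other_context)
```

In this model the wait completes only when the dispatcher BH lives in the
context being polled. Whether that matches the real scenario behind the
commit is exactly the open question.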

I'm not clear on the exact scenario though, because coroutines shouldn't
call AIO_WAIT_WHILE().

Kevin?

There is only one coroutine monitor command that calls the QEMU block
layer: qmp_block_resize(). If we're going to change how the AioContext
works then now is the time to do it before there are more commands that
need to be audited/refactored.

Stefan
>
> Paolo
>
>
> > This patch series introduces drain_call_rcu_co(), which does the same thing as
> > drain_call_rcu() but asynchronously. By yielding back to the event loop we can
> > wait until the caller drops the BQL and leaves its RCU read-side critical
> > section.
> >
> > Patch 1 changes HMP so that coroutine monitor commands yield back to the event
> > loop instead of running inside a nested event loop.
> >
> > Patch 2 introduces the new drain_call_rcu_co() API.
> >
> > Patch 3 converts qmp_device_add() into a coroutine monitor command and uses
> > drain_call_rcu_co().
> >
> > I'm sending this as an RFC because I don't have confirmation yet that the bugs
> > mentioned above are fixed by this patch series.
> >
> > Stefan Hajnoczi (3):
> > hmp: avoid the nested event loop in handle_hmp_command()
> > rcu: add drain_call_rcu_co() API
> > qmp: make qmp_device_add() a coroutine
> >
> > MAINTAINERS | 2 ++
> > docs/devel/rcu.txt | 21 ++++++++++++++++
> > qapi/qdev.json | 1 +
> > include/monitor/qdev.h | 3 ++-
> > include/qemu/rcu.h | 1 +
> > util/rcu-internal.h | 8 ++++++
> > monitor/hmp.c | 28 +++++++++++----------
> > monitor/qmp-cmds.c | 2 +-
> > softmmu/qdev-monitor.c | 34 +++++++++++++++++++++++---
> > util/rcu-co.c | 55 ++++++++++++++++++++++++++++++++++++++++++
> > util/rcu.c | 3 ++-
> > hmp-commands.hx | 1 +
> > util/meson.build | 2 +-
> > 13 files changed, 140 insertions(+), 21 deletions(-)
> > create mode 100644 util/rcu-internal.h
> > create mode 100644 util/rcu-co.c
> >
>
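
The hazard described in the quoted cover letter, and the effect of deferring
the wait to the event loop, can be sketched with a toy Python model. This is
not QEMU code: ToyBQL, other_thread_runs(), device_path() and the "resets"
counter are hypothetical names, and the other thread is simulated by a plain
function call that only acts while the lock is free.

```python
class ToyBQL:
    """Toy stand-in for the Big QEMU Lock: tracks only whether it is held."""

    def __init__(self):
        self.held = False


def other_thread_runs(bql, device):
    """Simulated other thread: touches the device only while the lock is free."""
    if not bql.held:
        device["resets"] += 1


def drain_sync(bql, device):
    """Blocking-style drain: drops the lock while waiting, then retakes it."""
    bql.held = False
    other_thread_runs(bql, device)  # window in which the caller's state can change
    bql.held = True


def device_path(bql, device, do_drain):
    """Caller that assumes the lock stays held for the whole function."""
    bql.held = True
    before = device["resets"]
    do_drain()
    changed_under_caller = device["resets"] != before
    bql.held = False
    return changed_under_caller


def run_sync():
    bql, device = ToyBQL(), {"resets": 0}
    return device_path(bql, device, lambda: drain_sync(bql, device))


def run_deferred():
    bql, device = ToyBQL(), {"resets": 0}
    event_loop = []
    # Instead of waiting in place, queue the wait for the event loop.
    changed = device_path(
        bql, device,
        lambda: event_loop.append(lambda: other_thread_runs(bql, device)))
    # Back in the event loop: the caller has already released the lock.
    for work in event_loop:
        work()
    return changed


print(run_sync(), run_deferred())
```

In the synchronous case the device changes underneath a caller that believes
it holds the lock. In the deferred case the same work happens only after the
caller has returned and released it.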
Thread overview: 20+ messages
2023-09-06 19:01 [RFC 0/3] qmp: make qmp_device_add() a coroutine Stefan Hajnoczi
2023-09-06 19:01 ` [RFC 1/3] hmp: avoid the nested event loop in handle_hmp_command() Stefan Hajnoczi
2023-09-07 1:06 ` Dr. David Alan Gilbert
2023-09-07 14:04 ` Stefan Hajnoczi
2023-09-07 14:07 ` Dr. David Alan Gilbert
2023-09-07 15:34 ` Stefan Hajnoczi
2023-09-07 20:53 ` Dr. David Alan Gilbert
2023-09-07 21:25 ` Stefan Hajnoczi
2023-09-06 19:01 ` [RFC 2/3] rcu: add drain_call_rcu_co() API Stefan Hajnoczi
2023-09-12 16:36 ` Kevin Wolf
2023-09-12 16:52 ` Stefan Hajnoczi
2023-09-06 19:01 ` [RFC 3/3] qmp: make qmp_device_add() a coroutine Stefan Hajnoczi
2023-09-12 16:47 ` Kevin Wolf
2023-09-12 17:04 ` Stefan Hajnoczi
2023-09-07 11:28 ` [RFC 0/3] " Paolo Bonzini
2023-09-07 14:00 ` Stefan Hajnoczi [this message]
2023-09-07 14:25 ` Paolo Bonzini
2023-09-07 15:29 ` Stefan Hajnoczi
2023-09-12 17:08 ` Kevin Wolf
2023-09-13 11:38 ` Paolo Bonzini