From: Fabiano Rosas <farosas@suse.de>
To: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Hanna Reitz <hreitz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Eric Blake <eblake@redhat.com>
Subject: Advice on block QMP command coroutines
Date: Mon, 01 Apr 2024 15:27:37 -0300
Message-ID: <87bk6trl9i.fsf@suse.de>

Hi, I'm working on a v3 of my query-block series [1] and I'm a bit
confused about how to convert a QMP command into a coroutine.

In case you're missing the context:

 In that series I'm turning query-block into a coroutine so we can avoid
 holding the BQL for too long in the case of a misbehaving (slow)
 syscall at the end of the call chain (get_allocated_file_size -> fstat
 in my case).

The issue:

After converting qmp_query_block into a coroutine, I'm hitting the
assert(false) in qcow2_get_specific_info(), a bug that was already fixed
for the non-coroutine case [2]. That bug was caused by qmp_query_block()
running during bdrv_activate_all():

bdrv_activate_all
  ...
  bdrv_invalidate_cache
    bdrv_poll_co
    |-> aio_co_enter
    |   ...
    |   qcow2_co_invalidate_cache
    |     memset(s, 0, ...)
    |     qcow2_do_open
    |       blk_co_pread
    |       ...
    |       qemu_coroutine_yield
    |-> AIO_WAIT_WHILE
    |   aio_poll
    |     reschedule of qmp_dispatch
    |     qmp_query_block
    |     ...
    |     qcow2_get_specific_info
    |       sees s->qcow_version == 0
    |       assert(false)
  
So my question is: how do we expect to convert a QMP command into a
coroutine if we're rescheduling all coroutines into qemu_aio_context
(in qmp_dispatch)? I don't see how to prevent an arbitrary aio_poll
from dispatching the coroutine in the middle of something else.

If I keep the QMP command in the iohandler context, then the bug never
happens. Rescheduling back into the iohandler would also work, were it
not for the HMP path which only polls on qemu_aio_context and causes a
deadlock.
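
To make the pattern concrete, here is a minimal standalone sketch of the
reentrancy described above. It is not QEMU code and uses no QEMU APIs:
every name in it (ToyContext, toy_poll, toy_invalidate_cache,
toy_run_scenario, the version value 3) is hypothetical. One toy context
stands in for qemu_aio_context and a second for the iohandler context;
the nested toy_poll() call stands in for AIO_WAIT_WHILE, under the
assumption that it only polls the first context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; none of these names exist in QEMU. */
#define TOY_MAX_PENDING 8

typedef void ToyCallback(void *opaque);

/* A context is just a queue of callbacks waiting to be dispatched. */
typedef struct ToyContext {
    ToyCallback *pending[TOY_MAX_PENDING];
    void *opaque[TOY_MAX_PENDING];
    size_t count;
} ToyContext;

/* Stands in for the driver state that gets zeroed and rebuilt. */
typedef struct ToyImageState {
    int version;
} ToyImageState;

typedef struct ToyQuery {
    ToyImageState *state;
    int seen_version;           /* -1 until the query has run */
} ToyQuery;

static void toy_schedule(ToyContext *ctx, ToyCallback *cb, void *opaque)
{
    assert(ctx->count < TOY_MAX_PENDING);
    ctx->pending[ctx->count] = cb;
    ctx->opaque[ctx->count] = opaque;
    ctx->count++;
}

/* Dispatch everything that was queued on ctx when the poll started. */
static void toy_poll(ToyContext *ctx)
{
    ToyContext batch = *ctx;

    ctx->count = 0;
    for (size_t i = 0; i < batch.count; i++) {
        batch.pending[i](batch.opaque[i]);
    }
}

/* The "query" command: it only records the version it observes. */
static void toy_query_cb(void *opaque)
{
    ToyQuery *q = opaque;

    q->seen_version = q->state->version;
}

/* Zero the state, wait with a nested poll, then rebuild the state. */
static void toy_invalidate_cache(ToyImageState *s, ToyContext *main_ctx)
{
    memset(s, 0, sizeof(*s));
    toy_poll(main_ctx);         /* nested poll while "I/O" is pending */
    s->version = 3;
}

/* Returns the version the query observed. */
int toy_run_scenario(bool query_in_main_ctx)
{
    ToyContext main_ctx = { .count = 0 };
    ToyContext io_ctx = { .count = 0 };
    ToyImageState state = { .version = 3 };
    ToyQuery query = { .state = &state, .seen_version = -1 };

    toy_schedule(query_in_main_ctx ? &main_ctx : &io_ctx,
                 toy_query_cb, &query);
    toy_invalidate_cache(&state, &main_ctx);

    /* Ordinary, non-nested dispatch after the invalidation finished. */
    toy_poll(&io_ctx);
    toy_poll(&main_ctx);
    return query.seen_version;
}
```

In this model, toy_run_scenario(true) has the query observe version 0,
because the nested poll dispatches it between the memset and the
rebuild, while toy_run_scenario(false) observes 3, because the nested
poll never touches the second context. The sketch deliberately leaves
out the HMP side, where only the first context is polled and a command
parked in the second one would never run.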

What's the recommended approach here?

Thank you

1- https://lore.kernel.org/r/20230609201910.12100-1-farosas@suse.de
2- https://gitlab.com/qemu-project/qemu/-/issues/1933