All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Hanna Czenczek <hreitz@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Richard W . M . Jones" <rjones@redhat.com>,
	"Ilya Dryomov" <idryomov@gmail.com>,
	"Peter Lieven" <pl@dlhnet.de>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Fam Zheng" <fam@euphon.net>,
	"Ronnie Sahlberg" <ronniesahlberg@gmail.com>
Subject: Re: [PATCH 08/16] nvme: Fix coroutine waking
Date: Wed, 29 Oct 2025 18:43:34 +0100	[thread overview]
Message-ID: <aQJSRicrD7NET-az@redhat.com> (raw)
In-Reply-To: <20251028163343.116249-9-hreitz@redhat.com>

Am 28.10.2025 um 17:33 hat Hanna Czenczek geschrieben:
> nvme wakes the request coroutine via qemu_coroutine_enter() from a BH
> scheduled in the BDS AioContext.  This may not be the same context as
> the one in which the request originally ran, which would be wrong:
> - It could mean we enter the coroutine before it yields,
> - We would move the coroutine in to a different context.
> 
> (Can be reproduced with multiqueue by adding a usleep(100000) before the
> `while (data.ret == -EINPROGRESS)` loop.)
> 
> To fix that, schedule nvme_rw_cb_bh() in the coroutine AioContext.
> (Just like in the preceding iscsi and nfs patches, we could drop the
> trivial nvme_rw_cb_bh() and just use aio_co_wake() directly, but don’t
> do that to keep replay_bh_schedule_oneshot_event().)
> 
> With this, we can remove NVMeCoData.ctx.
> 
> Note the check of data->co == NULL to bypass the BH/yield combination in
> case nvme_rw_cb() is called from nvme_submit_command(): We probably want
> to keep this fast path for performance reasons, but we have to be quite
> careful about it:
> - We cannot overload .ret for this, but have to use a dedicated
>   .skip_yield field.  Otherwise, if nvme_rw_cb() runs in a different
>   thread than the coroutine, it may see .ret set and skip the yield,
>   while nvme_rw_cb() will still schedule a BH for waking.   Therefore,
>   the signal to skip the yield can only be set in nvme_rw_cb() if waking
>   too is skipped, which is independent from communicating the return
>   value.
> - We can only skip the yield if nvme_rw_cb() actually runs in the
>   request coroutine.  Otherwise (specifically if they run in different
>   AioContexts), the order between this function’s execution and the
>   coroutine yielding (or not yielding) is not reliable.
> - There is no point to yielding in a loop; there are no spurious wakes,
>   so once we yield, we will only be re-entered once the command is done.
>   Replace `while` by `if`.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

As suggested in an earlier patch, I prefer to keep setting -EINPROGRESS,
even if we don't check for it elsewhere, for better debugability.

Kevin



  reply	other threads:[~2025-10-29 17:44 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-28 16:33 [PATCH 00/16] block: Some multi-threading fixes Hanna Czenczek
2025-10-28 16:33 ` [PATCH 01/16] block: Note on aio_co_wake use if not yet yielding Hanna Czenczek
2025-10-28 16:33 ` [PATCH 02/16] rbd: Run co BH CB in the coroutine’s AioContext Hanna Czenczek
2025-10-28 16:33 ` [PATCH 03/16] iscsi: " Hanna Czenczek
2025-10-29 14:27   ` Kevin Wolf
2025-10-31  9:07     ` Hanna Czenczek
2025-10-31 13:27       ` Kevin Wolf
2025-10-28 16:33 ` [PATCH 04/16] nfs: " Hanna Czenczek
2025-10-28 16:33 ` [PATCH 05/16] curl: Fix coroutine waking Hanna Czenczek
2025-10-29 16:57   ` Kevin Wolf
2025-10-31  9:15     ` Hanna Czenczek
2025-10-31 13:17       ` Kevin Wolf
2025-10-28 16:33 ` [PATCH 06/16] gluster: Do not move coroutine into BDS context Hanna Czenczek
2025-10-29 17:10   ` Kevin Wolf
2025-10-31  9:16     ` Hanna Czenczek
2025-10-28 16:33 ` [PATCH 07/16] nvme: Kick and check completions in " Hanna Czenczek
2025-10-29 17:23   ` Kevin Wolf
2025-10-29 17:39     ` Kevin Wolf
2025-10-31  9:18       ` Hanna Czenczek
2025-10-31  9:19     ` Hanna Czenczek
2025-10-28 16:33 ` [PATCH 08/16] nvme: Fix coroutine waking Hanna Czenczek
2025-10-29 17:43   ` Kevin Wolf [this message]
2025-10-28 16:33 ` [PATCH 09/16] block/io: Take reqs_lock for tracked_requests Hanna Czenczek
2025-10-28 16:33 ` [PATCH 10/16] qcow2: Fix cache_clean_timer Hanna Czenczek
2025-10-29 20:23   ` Kevin Wolf
2025-10-31  9:29     ` Hanna Czenczek
2025-10-31 13:03       ` Kevin Wolf
2025-11-06 16:08         ` Hanna Czenczek
2025-10-28 16:33 ` [PATCH 11/16] ssh: Run restart_coroutine in current AioContext Hanna Czenczek
2025-10-28 16:33 ` [PATCH 12/16] blkreplay: Run BH in coroutine’s AioContext Hanna Czenczek
2025-10-28 16:33 ` [PATCH 13/16] block: Note in which AioContext AIO CBs are called Hanna Czenczek
2025-10-28 16:33 ` [PATCH 14/16] iscsi: Create AIO BH in original AioContext Hanna Czenczek
2025-10-28 16:33 ` [PATCH 15/16] null-aio: Run CB " Hanna Czenczek
2025-10-28 16:33 ` [PATCH 16/16] win32-aio: Run CB in original context Hanna Czenczek
2025-10-30 14:12   ` Kevin Wolf
2025-10-31  9:31     ` Hanna Czenczek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aQJSRicrD7NET-az@redhat.com \
    --to=kwolf@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=idryomov@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=pl@dlhnet.de \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rjones@redhat.com \
    --cc=ronniesahlberg@gmail.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.