qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Hanna Czenczek <hreitz@redhat.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, "Hanna Czenczek" <hreitz@redhat.com>,
	"Kevin Wolf" <kwolf@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Richard W . M . Jones" <rjones@redhat.com>,
	"Ilya Dryomov" <idryomov@gmail.com>,
	"Peter Lieven" <pl@dlhnet.de>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Fam Zheng" <fam@euphon.net>,
	"Ronnie Sahlberg" <ronniesahlberg@gmail.com>
Subject: [PATCH 00/16] block: Some multi-threading fixes
Date: Tue, 28 Oct 2025 17:33:26 +0100	[thread overview]
Message-ID: <20251028163343.116249-1-hreitz@redhat.com> (raw)

Hi,

As noted in “the original patch”[1] (which, in slightly different form,
is still part of this series), with multi-queue, a BDS’s “main”
AioContext should have little meaning.  Instead, every request can be in
any AioContext.  This series fixes some of the places where that doesn’t
seem to be considered yet, some of which can cause user-visible bugs
(those patches are Cc-ed to stable, I hope).

[1] https://lists.nongnu.org/archive/html/qemu-block/2025-02/msg00123.html

A common problem pattern is that the request coroutine yields, awaiting
a completion function that runs in a different thread.  Waking the
coroutine is then often done in the BDS’s main AioContext (e.g. via a BH
scheduled there), i.e. when using multiqueue, still a different thread
from the original request.  This can cause races, particularly in case
the coroutine yields in a loop awaiting request completion, which can
cause it to see completion before yielding for the first time, even
though the completion function is still going to wake it.  (A wake
without a yield is bad.)

In other cases, there is no race (patch 1 tries to clarify when
aio_co_wake() is safe to call), but scheduling the completion wake-up in
the BDS main AioContext still doesn’t make too much sense, and it should
instead run in the request’s context, so it can directly enter the
request coroutine instead of just scheduling it yet again.

Patches 7, 9, and 10 are general concurrency fixes that have nothing to
do with this wake-up pattern, I just found those issues along the way.

As for the last four patches: The block layer currently doesn’t specify
the context in which AIO callbacks run.  Callers probably expect this to
be the same context in which they issued the request, but we don’t
explicitly say so.  Now, the only caller of these AIO-style methods is
block/io.c, which immediately “maps” them to coroutines in a non-racey
manner, i.e. it doesn’t actually care much about the context.

So while it makes sense to specify the AIOCB context (and then make the
implementations adhere to it), in practice, the only caller doesn’t
really care, and the block layer as a whole doesn’t really care about
the AIO context either.  So maybe we should just drop the last four
patches, or keep patch 13, but instead of stating that the CB is run in
the request context, explicitly say that it may be run in any
AioContext.


Hanna Czenczek (16):
  block: Note on aio_co_wake use if not yet yielding
  rbd: Run co BH CB in the coroutine’s AioContext
  iscsi: Run co BH CB in the coroutine’s AioContext
  nfs: Run co BH CB in the coroutine’s AioContext
  curl: Fix coroutine waking
  gluster: Do not move coroutine into BDS context
  nvme: Kick and check completions in BDS context
  nvme: Fix coroutine waking
  block/io: Take reqs_lock for tracked_requests
  qcow2: Fix cache_clean_timer
  ssh: Run restart_coroutine in current AioContext
  blkreplay: Run BH in coroutine’s AioContext
  block: Note in which AioContext AIO CBs are called
  iscsi: Create AIO BH in original AioContext
  null-aio: Run CB in original AioContext
  win32-aio: Run CB in original context

 block/qcow2.h                    |  1 +
 include/block/aio.h              | 15 ++++++
 include/block/block_int-common.h |  7 ++-
 block/blkreplay.c                |  3 +-
 block/curl.c                     | 55 ++++++++++++-------
 block/gluster.c                  | 17 +++---
 block/io.c                       |  3 ++
 block/iscsi.c                    | 46 +++++-----------
 block/nfs.c                      | 23 +++-----
 block/null.c                     |  7 ++-
 block/nvme.c                     | 70 +++++++++++++------------
 block/qcow2.c                    | 90 +++++++++++++++++++++++++++-----
 block/rbd.c                      | 12 ++---
 block/ssh.c                      | 22 ++++----
 block/win32-aio.c                | 31 ++++++++---
 15 files changed, 251 insertions(+), 151 deletions(-)

-- 
2.51.0



             reply	other threads:[~2025-10-28 16:41 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-28 16:33 Hanna Czenczek [this message]
2025-10-28 16:33 ` [PATCH 01/16] block: Note on aio_co_wake use if not yet yielding Hanna Czenczek
2025-10-28 16:33 ` [PATCH 02/16] rbd: Run co BH CB in the coroutine’s AioContext Hanna Czenczek
2025-10-28 16:33 ` [PATCH 03/16] iscsi: " Hanna Czenczek
2025-10-29 14:27   ` Kevin Wolf
2025-10-31  9:07     ` Hanna Czenczek
2025-10-31 13:27       ` Kevin Wolf
2025-10-28 16:33 ` [PATCH 04/16] nfs: " Hanna Czenczek
2025-10-28 16:33 ` [PATCH 05/16] curl: Fix coroutine waking Hanna Czenczek
2025-10-29 16:57   ` Kevin Wolf
2025-10-31  9:15     ` Hanna Czenczek
2025-10-31 13:17       ` Kevin Wolf
2025-10-28 16:33 ` [PATCH 06/16] gluster: Do not move coroutine into BDS context Hanna Czenczek
2025-10-29 17:10   ` Kevin Wolf
2025-10-31  9:16     ` Hanna Czenczek
2025-10-28 16:33 ` [PATCH 07/16] nvme: Kick and check completions in " Hanna Czenczek
2025-10-29 17:23   ` Kevin Wolf
2025-10-29 17:39     ` Kevin Wolf
2025-10-31  9:18       ` Hanna Czenczek
2025-10-31  9:19     ` Hanna Czenczek
2025-10-28 16:33 ` [PATCH 08/16] nvme: Fix coroutine waking Hanna Czenczek
2025-10-29 17:43   ` Kevin Wolf
2025-10-28 16:33 ` [PATCH 09/16] block/io: Take reqs_lock for tracked_requests Hanna Czenczek
2025-10-28 16:33 ` [PATCH 10/16] qcow2: Fix cache_clean_timer Hanna Czenczek
2025-10-29 20:23   ` Kevin Wolf
2025-10-31  9:29     ` Hanna Czenczek
2025-10-31 13:03       ` Kevin Wolf
2025-11-06 16:08         ` Hanna Czenczek
2025-10-28 16:33 ` [PATCH 11/16] ssh: Run restart_coroutine in current AioContext Hanna Czenczek
2025-10-28 16:33 ` [PATCH 12/16] blkreplay: Run BH in coroutine’s AioContext Hanna Czenczek
2025-10-28 16:33 ` [PATCH 13/16] block: Note in which AioContext AIO CBs are called Hanna Czenczek
2025-10-28 16:33 ` [PATCH 14/16] iscsi: Create AIO BH in original AioContext Hanna Czenczek
2025-10-28 16:33 ` [PATCH 15/16] null-aio: Run CB " Hanna Czenczek
2025-10-28 16:33 ` [PATCH 16/16] win32-aio: Run CB in original context Hanna Czenczek
2025-10-30 14:12   ` Kevin Wolf
2025-10-31  9:31     ` Hanna Czenczek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251028163343.116249-1-hreitz@redhat.com \
    --to=hreitz@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=fam@euphon.net \
    --cc=idryomov@gmail.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=pl@dlhnet.de \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rjones@redhat.com \
    --cc=ronniesahlberg@gmail.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).