From: Stefan Hajnoczi <stefanha@redhat.com>
To: Hanna Czenczek <hreitz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [PULL 11/33] scsi: only access SCSIDevice->requests from one thread
Date: Thu, 25 Jan 2024 08:25:07 -0500 [thread overview]
Message-ID: <20240125132507.GA640116@fedora> (raw)
In-Reply-To: <dce73220-50a8-4339-a430-dcf13f9170ba@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 4876 bytes --]
On Thu, Jan 25, 2024 at 10:06:51AM +0100, Hanna Czenczek wrote:
> On 24.01.24 22:53, Stefan Hajnoczi wrote:
> > On Wed, Jan 24, 2024 at 01:12:47PM +0100, Hanna Czenczek wrote:
> > > On 23.01.24 18:10, Kevin Wolf wrote:
> > > > Am 23.01.2024 um 17:40 hat Hanna Czenczek geschrieben:
> > > > > On 21.12.23 22:23, Kevin Wolf wrote:
> > > > > > From: Stefan Hajnoczi<stefanha@redhat.com>
> > > > > >
> > > > > > Stop depending on the AioContext lock and instead access
> > > > > > SCSIDevice->requests from only one thread at a time:
> > > > > > - When the VM is running only the BlockBackend's AioContext may access
> > > > > > the requests list.
> > > > > > - When the VM is stopped only the main loop may access the requests
> > > > > > list.
> > > > > >
> > > > > > These constraints protect the requests list without the need for locking
> > > > > > in the I/O code path.
> > > > > >
> > > > > > Note that multiple IOThreads are not supported yet because the code
> > > > > > assumes all SCSIRequests are executed from a single AioContext. Leave
> > > > > > that as future work.
> > > > > >
> > > > > > Signed-off-by: Stefan Hajnoczi<stefanha@redhat.com>
> > > > > > Reviewed-by: Eric Blake<eblake@redhat.com>
> > > > > > Message-ID:<20231204164259.1515217-2-stefanha@redhat.com>
> > > > > > Signed-off-by: Kevin Wolf<kwolf@redhat.com>
> > > > > > ---
> > > > > > include/hw/scsi/scsi.h | 7 +-
> > > > > > hw/scsi/scsi-bus.c | 181 ++++++++++++++++++++++++++++-------------
> > > > > > 2 files changed, 131 insertions(+), 57 deletions(-)
> > > > > My reproducer forhttps://issues.redhat.com/browse/RHEL-3934 now breaks more
> > > > > often because of this commit than because of the original bug, i.e. when
> > > > > repeatedly hot-plugging and unplugging a virtio-scsi and a scsi-hd device,
> > > > > this tends to happen when unplugging the scsi-hd:
> > > > >
> > > > > {"execute":"device_del","arguments":{"id":"stg0"}}
> > > > > {"return": {}}
> > > > > qemu-system-x86_64: ../block/block-backend.c:2429: blk_get_aio_context:
> > > > > Assertion `ctx == blk->ctx' failed.
> > > [...]
> > >
> > > > I don't know anything about the problem either, but since you already
> > > > speculated about the cause, let me speculate about the solution:
> > > > Can we hold the graph writer lock for the tran_commit() call in
> > > > bdrv_try_change_aio_context()? And of course take the reader lock for
> > > > blk_get_aio_context(), but that should be completely unproblematic.
> > > I tried this, and it’s not easy taking the lock just for tran_commit(),
> > > because some callers of bdrv_try_change_aio_context() already hold the write
> > > lock (specifically bdrv_attach_child_common(),
> > > bdrv_attach_child_common_abort(), and bdrv_root_unref_child()[1]), and
> > > qmp_x_blockdev_set_iothread() holds the read lock. Other callers don’t hold
> > > any lock[2].
> > >
> > > So I’m not sure whether we should mark all of bdrv_try_change_aio_context()
> > > as GRAPH_WRLOCK and then make all callers take the lock, or really only take
> > > it for tran_commit(), and have callers release the lock around
> > > bdrv_try_change_aio_context(). Former sounds better to naïve me.
> > >
> > > (In any case, FWIW, having blk_set_aio_context() take the write lock, and
> > > scsi_device_for_each_req_async_bh() take the read lock[3], does fix the
> > > assertion failure.)
> > I wonder if a simpler solution is blk_inc_in_flight() in
> > scsi_device_for_each_req_async() and blk_dec_in_flight() in
> > scsi_device_for_each_req_async_bh() so that drain
> > waits for the BH.
> >
> > There is a drain around the AioContext change, so as long as
> > scsi_device_for_each_req_async() was called before blk_set_aio_context()
> > and not _during_ aio_poll(), we would prevent inconsistent BB vs BDS
> > aio_contexts.
>
> Actually, Kevin has suggested on IRC that we drop the whole drain. :)
>
> Dropping the write lock in our outside of bdrv_try_change_aio_context() for
> callers that have already taken it seems unsafe, so the only option would be
> to make the whole function write-lock-able. The drained section can cause
> problems with that if it ends up wanting to reorganize the graph, so AFAIU
> drain should never be done while under a write lock. This is already a
> problem now because there are three callers that do call
> bdrv_try_change_aio_context() while under a write lock, so it seems like we
> shouldn’t keep the drain as-is.
>
> So, Kevin suggested just dropping that drain, because I/O requests are no
> longer supposed to care about a BDS’s native AioContext anymore anyway, so
> it seems like the need for the drain has gone away with multiqueue. Then we
> could make the whole function GRAPH_WRLOCK.
Okay.
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2024-01-25 13:25 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-21 21:23 [PULL 00/33] Block layer patches Kevin Wolf
2023-12-21 21:23 ` [PULL 01/33] nbd/server: avoid per-NBDRequest nbd_client_get/put() Kevin Wolf
2023-12-21 21:23 ` [PULL 02/33] nbd/server: only traverse NBDExport->clients from main loop thread Kevin Wolf
2023-12-21 21:23 ` [PULL 03/33] nbd/server: introduce NBDClient->lock to protect fields Kevin Wolf
2023-12-21 21:23 ` [PULL 04/33] block/file-posix: set up Linux AIO and io_uring in the current thread Kevin Wolf
2023-12-21 21:23 ` [PULL 05/33] virtio-blk: add lock to protect s->rq Kevin Wolf
2023-12-21 21:23 ` [PULL 06/33] virtio-blk: don't lock AioContext in the completion code path Kevin Wolf
2023-12-21 21:23 ` [PULL 07/33] virtio-blk: don't lock AioContext in the submission " Kevin Wolf
2023-12-21 21:23 ` [PULL 08/33] block: Fix crash when loading snapshot on inactive node Kevin Wolf
2023-12-21 21:23 ` [PULL 09/33] vl: Improve error message for conflicting -incoming and -loadvm Kevin Wolf
2023-12-21 21:23 ` [PULL 10/33] iotests: Basic tests for internal snapshots Kevin Wolf
2023-12-21 21:23 ` [PULL 11/33] scsi: only access SCSIDevice->requests from one thread Kevin Wolf
2024-01-23 16:40 ` Hanna Czenczek
2024-01-23 17:10 ` Kevin Wolf
2024-01-23 17:23 ` Hanna Czenczek
2024-01-24 12:12 ` Hanna Czenczek
2024-01-24 21:53 ` Stefan Hajnoczi
2024-01-25 9:06 ` Hanna Czenczek
2024-01-25 13:25 ` Stefan Hajnoczi [this message]
2024-01-25 17:32 ` Hanna Czenczek
2024-01-26 13:18 ` Kevin Wolf
2024-01-26 15:24 ` Hanna Czenczek
2024-01-31 20:35 ` Stefan Hajnoczi
2024-02-01 14:10 ` Hanna Czenczek
2024-02-01 14:28 ` Stefan Hajnoczi
2024-02-01 15:25 ` Hanna Czenczek
2024-02-01 15:49 ` Hanna Czenczek
2024-02-02 12:32 ` Hanna Czenczek
2024-02-06 19:32 ` Stefan Hajnoczi
2024-01-29 16:30 ` Hanna Czenczek
2024-01-31 10:17 ` Kevin Wolf
2024-02-01 9:43 ` Hanna Czenczek
2024-02-01 10:21 ` Kevin Wolf
2024-02-01 10:35 ` Hanna Czenczek
2024-01-23 17:21 ` Hanna Czenczek
2023-12-21 21:23 ` [PULL 12/33] virtio-scsi: don't lock AioContext around virtio_queue_aio_attach_host_notifier() Kevin Wolf
2023-12-21 21:23 ` [PULL 13/33] scsi: don't lock AioContext in I/O code path Kevin Wolf
2023-12-21 21:23 ` [PULL 14/33] dma-helpers: don't lock AioContext in dma_blk_cb() Kevin Wolf
2023-12-21 21:23 ` [PULL 15/33] virtio-scsi: replace AioContext lock with tmf_bh_lock Kevin Wolf
2023-12-21 21:23 ` [PULL 16/33] scsi: assert that callbacks run in the correct AioContext Kevin Wolf
2023-12-21 21:23 ` [PULL 17/33] tests: remove aio_context_acquire() tests Kevin Wolf
2023-12-21 21:23 ` [PULL 18/33] aio: make aio_context_acquire()/aio_context_release() a no-op Kevin Wolf
2023-12-21 21:23 ` [PULL 19/33] graph-lock: remove AioContext locking Kevin Wolf
2023-12-21 21:23 ` [PULL 20/33] block: " Kevin Wolf
2023-12-21 21:23 ` [PULL 21/33] block: remove bdrv_co_lock() Kevin Wolf
2023-12-21 21:23 ` [PULL 22/33] scsi: remove AioContext locking Kevin Wolf
2023-12-21 21:23 ` [PULL 23/33] aio-wait: draw equivalence between AIO_WAIT_WHILE() and AIO_WAIT_WHILE_UNLOCKED() Kevin Wolf
2023-12-21 21:23 ` [PULL 24/33] aio: remove aio_context_acquire()/aio_context_release() API Kevin Wolf
2023-12-21 21:23 ` [PULL 25/33] docs: remove AioContext lock from IOThread docs Kevin Wolf
2023-12-21 21:23 ` [PULL 26/33] scsi: remove outdated AioContext lock comment Kevin Wolf
2023-12-21 21:23 ` [PULL 27/33] job: remove outdated AioContext locking comments Kevin Wolf
2023-12-21 21:23 ` [PULL 28/33] block: " Kevin Wolf
2023-12-21 21:23 ` [PULL 29/33] block-coroutine-wrapper: use qemu_get_current_aio_context() Kevin Wolf
2023-12-21 21:23 ` [PULL 30/33] string-output-visitor: show structs as "<omitted>" Kevin Wolf
2023-12-21 21:23 ` [PULL 31/33] qdev-properties: alias all object class properties Kevin Wolf
2023-12-21 21:23 ` [PULL 32/33] qdev: add IOThreadVirtQueueMappingList property type Kevin Wolf
2023-12-21 21:23 ` [PULL 33/33] virtio-blk: add iothread-vq-mapping parameter Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240125132507.GA640116@fedora \
--to=stefanha@redhat.com \
--cc=hreitz@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.