qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Hanna Czenczek <hreitz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [PULL 11/33] scsi: only access SCSIDevice->requests from one thread
Date: Thu, 25 Jan 2024 08:25:07 -0500	[thread overview]
Message-ID: <20240125132507.GA640116@fedora> (raw)
In-Reply-To: <dce73220-50a8-4339-a430-dcf13f9170ba@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 4876 bytes --]

On Thu, Jan 25, 2024 at 10:06:51AM +0100, Hanna Czenczek wrote:
> On 24.01.24 22:53, Stefan Hajnoczi wrote:
> > On Wed, Jan 24, 2024 at 01:12:47PM +0100, Hanna Czenczek wrote:
> > > On 23.01.24 18:10, Kevin Wolf wrote:
> > > > Am 23.01.2024 um 17:40 hat Hanna Czenczek geschrieben:
> > > > > On 21.12.23 22:23, Kevin Wolf wrote:
> > > > > > From: Stefan Hajnoczi<stefanha@redhat.com>
> > > > > > 
> > > > > > Stop depending on the AioContext lock and instead access
> > > > > > SCSIDevice->requests from only one thread at a time:
> > > > > > - When the VM is running only the BlockBackend's AioContext may access
> > > > > >      the requests list.
> > > > > > - When the VM is stopped only the main loop may access the requests
> > > > > >      list.
> > > > > > 
> > > > > > These constraints protect the requests list without the need for locking
> > > > > > in the I/O code path.
> > > > > > 
> > > > > > Note that multiple IOThreads are not supported yet because the code
> > > > > > assumes all SCSIRequests are executed from a single AioContext. Leave
> > > > > > that as future work.
> > > > > > 
> > > > > > Signed-off-by: Stefan Hajnoczi<stefanha@redhat.com>
> > > > > > Reviewed-by: Eric Blake<eblake@redhat.com>
> > > > > > Message-ID:<20231204164259.1515217-2-stefanha@redhat.com>
> > > > > > Signed-off-by: Kevin Wolf<kwolf@redhat.com>
> > > > > > ---
> > > > > >     include/hw/scsi/scsi.h |   7 +-
> > > > > >     hw/scsi/scsi-bus.c     | 181 ++++++++++++++++++++++++++++-------------
> > > > > >     2 files changed, 131 insertions(+), 57 deletions(-)
> > > > > My reproducer forhttps://issues.redhat.com/browse/RHEL-3934  now breaks more
> > > > > often because of this commit than because of the original bug, i.e. when
> > > > > repeatedly hot-plugging and unplugging a virtio-scsi and a scsi-hd device,
> > > > > this tends to happen when unplugging the scsi-hd:
> > > > > 
> > > > > {"execute":"device_del","arguments":{"id":"stg0"}}
> > > > > {"return": {}}
> > > > > qemu-system-x86_64: ../block/block-backend.c:2429: blk_get_aio_context:
> > > > > Assertion `ctx == blk->ctx' failed.
> > > [...]
> > > 
> > > > I don't know anything about the problem either, but since you already
> > > > speculated about the cause, let me speculate about the solution:
> > > > Can we hold the graph writer lock for the tran_commit() call in
> > > > bdrv_try_change_aio_context()? And of course take the reader lock for
> > > > blk_get_aio_context(), but that should be completely unproblematic.
> > > I tried this, and it’s not easy taking the lock just for tran_commit(),
> > > because some callers of bdrv_try_change_aio_context() already hold the write
> > > lock (specifically bdrv_attach_child_common(),
> > > bdrv_attach_child_common_abort(), and bdrv_root_unref_child()[1]), and
> > > qmp_x_blockdev_set_iothread() holds the read lock.  Other callers don’t hold
> > > any lock[2].
> > > 
> > > So I’m not sure whether we should mark all of bdrv_try_change_aio_context()
> > > as GRAPH_WRLOCK and then make all callers take the lock, or really only take
> > > it for tran_commit(), and have callers release the lock around
> > > bdrv_try_change_aio_context(). Former sounds better to naïve me.
> > > 
> > > (In any case, FWIW, having blk_set_aio_context() take the write lock, and
> > > scsi_device_for_each_req_async_bh() take the read lock[3], does fix the
> > > assertion failure.)
> > I wonder if a simpler solution is blk_inc_in_flight() in
> > scsi_device_for_each_req_async() and blk_dec_in_flight() in
> > scsi_device_for_each_req_async_bh() so that drain
> > waits for the BH.
> > 
> > There is a drain around the AioContext change, so as long as
> > scsi_device_for_each_req_async() was called before blk_set_aio_context()
> > and not _during_ aio_poll(), we would prevent inconsistent BB vs BDS
> > aio_contexts.
> 
> Actually, Kevin has suggested on IRC that we drop the whole drain. :)
> 
> Dropping the write lock in our outside of bdrv_try_change_aio_context() for
> callers that have already taken it seems unsafe, so the only option would be
> to make the whole function write-lock-able.  The drained section can cause
> problems with that if it ends up wanting to reorganize the graph, so AFAIU
> drain should never be done while under a write lock.  This is already a
> problem now because there are three callers that do call
> bdrv_try_change_aio_context() while under a write lock, so it seems like we
> shouldn’t keep the drain as-is.
> 
> So, Kevin suggested just dropping that drain, because I/O requests are no
> longer supposed to care about a BDS’s native AioContext anymore anyway, so
> it seems like the need for the drain has gone away with multiqueue.  Then we
> could make the whole function GRAPH_WRLOCK.

Okay.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2024-01-25 13:25 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-21 21:23 [PULL 00/33] Block layer patches Kevin Wolf
2023-12-21 21:23 ` [PULL 01/33] nbd/server: avoid per-NBDRequest nbd_client_get/put() Kevin Wolf
2023-12-21 21:23 ` [PULL 02/33] nbd/server: only traverse NBDExport->clients from main loop thread Kevin Wolf
2023-12-21 21:23 ` [PULL 03/33] nbd/server: introduce NBDClient->lock to protect fields Kevin Wolf
2023-12-21 21:23 ` [PULL 04/33] block/file-posix: set up Linux AIO and io_uring in the current thread Kevin Wolf
2023-12-21 21:23 ` [PULL 05/33] virtio-blk: add lock to protect s->rq Kevin Wolf
2023-12-21 21:23 ` [PULL 06/33] virtio-blk: don't lock AioContext in the completion code path Kevin Wolf
2023-12-21 21:23 ` [PULL 07/33] virtio-blk: don't lock AioContext in the submission " Kevin Wolf
2023-12-21 21:23 ` [PULL 08/33] block: Fix crash when loading snapshot on inactive node Kevin Wolf
2023-12-21 21:23 ` [PULL 09/33] vl: Improve error message for conflicting -incoming and -loadvm Kevin Wolf
2023-12-21 21:23 ` [PULL 10/33] iotests: Basic tests for internal snapshots Kevin Wolf
2023-12-21 21:23 ` [PULL 11/33] scsi: only access SCSIDevice->requests from one thread Kevin Wolf
2024-01-23 16:40   ` Hanna Czenczek
2024-01-23 17:10     ` Kevin Wolf
2024-01-23 17:23       ` Hanna Czenczek
2024-01-24 12:12       ` Hanna Czenczek
2024-01-24 21:53         ` Stefan Hajnoczi
2024-01-25  9:06           ` Hanna Czenczek
2024-01-25 13:25             ` Stefan Hajnoczi [this message]
2024-01-25 17:32       ` Hanna Czenczek
2024-01-26 13:18         ` Kevin Wolf
2024-01-26 15:24           ` Hanna Czenczek
2024-01-31 20:35             ` Stefan Hajnoczi
2024-02-01 14:10               ` Hanna Czenczek
2024-02-01 14:28                 ` Stefan Hajnoczi
2024-02-01 15:25                   ` Hanna Czenczek
2024-02-01 15:49                     ` Hanna Czenczek
2024-02-02 12:32                     ` Hanna Czenczek
2024-02-06 19:32                       ` Stefan Hajnoczi
2024-01-29 16:30       ` Hanna Czenczek
2024-01-31 10:17         ` Kevin Wolf
2024-02-01  9:43           ` Hanna Czenczek
2024-02-01 10:21             ` Kevin Wolf
2024-02-01 10:35               ` Hanna Czenczek
2024-01-23 17:21     ` Hanna Czenczek
2023-12-21 21:23 ` [PULL 12/33] virtio-scsi: don't lock AioContext around virtio_queue_aio_attach_host_notifier() Kevin Wolf
2023-12-21 21:23 ` [PULL 13/33] scsi: don't lock AioContext in I/O code path Kevin Wolf
2023-12-21 21:23 ` [PULL 14/33] dma-helpers: don't lock AioContext in dma_blk_cb() Kevin Wolf
2023-12-21 21:23 ` [PULL 15/33] virtio-scsi: replace AioContext lock with tmf_bh_lock Kevin Wolf
2023-12-21 21:23 ` [PULL 16/33] scsi: assert that callbacks run in the correct AioContext Kevin Wolf
2023-12-21 21:23 ` [PULL 17/33] tests: remove aio_context_acquire() tests Kevin Wolf
2023-12-21 21:23 ` [PULL 18/33] aio: make aio_context_acquire()/aio_context_release() a no-op Kevin Wolf
2023-12-21 21:23 ` [PULL 19/33] graph-lock: remove AioContext locking Kevin Wolf
2023-12-21 21:23 ` [PULL 20/33] block: " Kevin Wolf
2023-12-21 21:23 ` [PULL 21/33] block: remove bdrv_co_lock() Kevin Wolf
2023-12-21 21:23 ` [PULL 22/33] scsi: remove AioContext locking Kevin Wolf
2023-12-21 21:23 ` [PULL 23/33] aio-wait: draw equivalence between AIO_WAIT_WHILE() and AIO_WAIT_WHILE_UNLOCKED() Kevin Wolf
2023-12-21 21:23 ` [PULL 24/33] aio: remove aio_context_acquire()/aio_context_release() API Kevin Wolf
2023-12-21 21:23 ` [PULL 25/33] docs: remove AioContext lock from IOThread docs Kevin Wolf
2023-12-21 21:23 ` [PULL 26/33] scsi: remove outdated AioContext lock comment Kevin Wolf
2023-12-21 21:23 ` [PULL 27/33] job: remove outdated AioContext locking comments Kevin Wolf
2023-12-21 21:23 ` [PULL 28/33] block: " Kevin Wolf
2023-12-21 21:23 ` [PULL 29/33] block-coroutine-wrapper: use qemu_get_current_aio_context() Kevin Wolf
2023-12-21 21:23 ` [PULL 30/33] string-output-visitor: show structs as "<omitted>" Kevin Wolf
2023-12-21 21:23 ` [PULL 31/33] qdev-properties: alias all object class properties Kevin Wolf
2023-12-21 21:23 ` [PULL 32/33] qdev: add IOThreadVirtQueueMappingList property type Kevin Wolf
2023-12-21 21:23 ` [PULL 33/33] virtio-blk: add iothread-vq-mapping parameter Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240125132507.GA640116@fedora \
    --to=stefanha@redhat.com \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).