qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Fam Zheng" <fam@euphon.net>, "Qing Wang" <qinwang@redhat.com>
Subject: Re: [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread
Date: Fri, 17 Feb 2023 11:22:02 +0100	[thread overview]
Message-ID: <Y+9VShQtrC1KeEWU@redhat.com> (raw)
In-Reply-To: <20230210143238.524357-4-stefanha@redhat.com>

Am 10.02.2023 um 15:32 hat Stefan Hajnoczi geschrieben:
> When an IOThread is configured, the ctrl virtqueue is processed in the
> IOThread. TMFs that reset SCSI devices are currently called directly
> from the IOThread and trigger an assertion failure in blk_drain():
> 
>   ../block/block-backend.c:1780: void blk_drain(BlockBackend *): Assertion `qemu_in_main_thread()' failed.
> 
> The blk_drain() function is not designed to be called from an IOThread
> because it needs the Big QEMU Lock (BQL).
> 
> This patch defers TMFs that reset SCSI devices to a Bottom Half (BH)
> that runs in the main loop thread under the BQL. This way it's safe to
> call blk_drain() and the assertion failure is avoided.

It's not entirely obvious what is the call path that leads to
blk_drain(). Do we somehow call into virtio_scsi_dataplane_stop()?

> Introduce s->tmf_bh_list for tracking TMF requests that have been
> deferred to the BH. When the BH runs it will grab the entire list and
> process all requests. Care must be taken to clear the list when the
> virtio-scsi device is reset or unrealized. Otherwise deferred TMF
> requests could execute later and lead to use-after-free or other
> undefined behavior.

Why don't we already need the same for other asynchronously processed
requests? Certainly having a read request write to guest memory after
the device has been reset or unplugged isn't what we want either.

I see that we assert(!s->dataplane_started) in virtio_scsi_reset(),
which may be part of the reason. If we're not processing any requests,
then we're safe. virtio_scsi_dataplane_stop() calls blk_drain_all()
(which is really a too big hammer) in order to make sure that in-flight
requests are completed before dataplane_started becomes false.

I was wondering if we couldn't just blk_inc_in_flight() while a TMF
request is in flight and then use the same draining logic to be covered.
You could use oneshot BHs then and do away with the list because you
don't need to cancel anything any more, you just wait until the BHs have
completed.

The practical problem may be that we don't have a blk here (which is
probably also why blk_drain_all() is used). We could have our own
AIO_WAIT_WHILE() instead. I feel waiting instead of cancelling BHs would
simplify the code.

In fact, I think technically, you may not need any of that because
blk_drain_all() also executes all BHs in the main loop before it
returns, but that might be a bit subtle...

> The s->resetting counter that's used by TMFs that reset SCSI devices is
> accessed from multiple threads. This patch makes that explicit by using
> atomic accessor functions. With this patch applied the counter is only
> modified by the main loop thread under the BQL but can be read by any
> thread.
> 
> Reported-by: Qing Wang <qinwang@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Kevin



  parent reply	other threads:[~2023-02-17 10:23 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-10 14:32 [PATCH v2 0/3] virtio-scsi: fix SCSIDevice hot unplug with IOThread Stefan Hajnoczi
2023-02-10 14:32 ` [PATCH v2 1/3] scsi: protect req->aiocb with AioContext lock Stefan Hajnoczi
2023-02-15 18:17   ` Eric Blake
2023-02-16 12:34   ` Kevin Wolf
2023-02-10 14:32 ` [PATCH v2 2/3] dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race Stefan Hajnoczi
2023-02-15 18:19   ` Eric Blake
2023-02-16 15:27   ` Kevin Wolf
2023-02-16 21:27     ` Stefan Hajnoczi
2023-02-10 14:32 ` [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread Stefan Hajnoczi
2023-02-15 18:27   ` Eric Blake
2023-02-17 10:22   ` Kevin Wolf [this message]
     [not found]     ` <Y/Tz+qw7thcwO+G3@fedora>
2023-02-23 15:03       ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y+9VShQtrC1KeEWU@redhat.com \
    --to=kwolf@redhat.com \
    --cc=david@redhat.com \
    --cc=fam@euphon.net \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qinwang@redhat.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).