From: Kevin Wolf <kwolf@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org,
"Michael S. Tsirkin" <mst@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"David Hildenbrand" <david@redhat.com>,
"Peter Xu" <peterx@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Fam Zheng" <fam@euphon.net>
Subject: Re: [PATCH v2 2/3] dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race
Date: Thu, 16 Feb 2023 16:27:42 +0100 [thread overview]
Message-ID: <Y+5LblriH/1jmHcB@redhat.com> (raw)
In-Reply-To: <20230210143238.524357-3-stefanha@redhat.com>
Am 10.02.2023 um 15:32 hat Stefan Hajnoczi geschrieben:
> dma_blk_cb() only takes the AioContext lock around ->io_func(). That
> means the rest of dma_blk_cb() is not protected. In particular, the
> DMAAIOCB field accesses happen outside the lock.
>
> There is a race when the main loop thread holds the AioContext lock and
> invokes scsi_device_purge_requests() -> bdrv_aio_cancel() ->
> dma_aio_cancel() while an IOThread executes dma_blk_cb(). The dbs->acb
> field determines how cancellation proceeds. If dma_aio_cancel() see
> dbs->acb == NULL while dma_blk_cb() is still running, the request can be
> completed twice (-ECANCELED and the actual return value).
>
> The following assertion can occur with virtio-scsi when an IOThread is
> used:
>
> ../hw/scsi/scsi-disk.c:368: scsi_dma_complete: Assertion `r->req.aiocb != NULL' failed.
>
> Fix the race by holding the AioContext across dma_blk_cb(). Now
> dma_aio_cancel() under the AioContext lock will not see
> inconsistent/intermediate states.
>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Two things that seem to need attention in the review:
> diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
> index 7820fec54c..2463964805 100644
> --- a/softmmu/dma-helpers.c
> +++ b/softmmu/dma-helpers.c
> @@ -113,17 +113,19 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
> static void dma_blk_cb(void *opaque, int ret)
> {
> DMAAIOCB *dbs = (DMAAIOCB *)opaque;
> + AioContext *ctx = dbs->ctx;
> dma_addr_t cur_addr, cur_len;
> void *mem;
>
> trace_dma_blk_cb(dbs, ret);
>
> + aio_context_acquire(ctx);
During the first iteration, the caller may already hold the AioContext
lock. In the case of scsi-disk, it does. Locking a second time is fine
in principle because it's a recursive lock, but we have to be careful
not to call AIO_WAIT_WHILE() indirectly then because it could deadlock.
Except for the dbs->common.cb (see below) I don't see any functions that
would be problematic in this respect. In fact, the one I would be most
worried about is dbs->io_func, but it was already locked before.
> dbs->acb = NULL;
> dbs->offset += dbs->iov.size;
>
> if (dbs->sg_cur_index == dbs->sg->nsg || ret < 0) {
> dma_complete(dbs, ret);
All request callbacks hold the AioContext lock now when they didn't
before. I wonder if it's worth documenting the locking rules for
dma_blk_io() in a comment. Could be a separate patch, though.
You remove the locking in scsi_dma_complete() to compensate. All the
other callers come from IDE and nvme, which don't take the lock
internally. Taking the main AioContext lock once is fine, so this looks
good.
If it is possible that we already complete on the first iteration, this
could however also be affected by the case above so that the AioContext
is locked twice. In this case, the invoked callback must not call
AIO_WAIT_WHILE() and we would need to audit IDE and nvme.
Is it possible? In other words, can dbs->sg->nsg be 0? If not, can we
assert and document it?
> - return;
> + goto out;
> }
> dma_blk_unmap(dbs);
>
> @@ -164,9 +166,9 @@ static void dma_blk_cb(void *opaque, int ret)
>
> if (dbs->iov.size == 0) {
> trace_dma_map_wait(dbs);
> - dbs->bh = aio_bh_new(dbs->ctx, reschedule_dma, dbs);
> + dbs->bh = aio_bh_new(ctx, reschedule_dma, dbs);
Does copying dbs->ctx to a local variable imply that it may change
during the function? I didn't think so, but if it may, then why is
calling aio_bh_new() with the old value right?
> cpu_register_map_client(dbs->bh);
> - return;
> + goto out;
> }
>
> if (!QEMU_IS_ALIGNED(dbs->iov.size, dbs->align)) {
> @@ -174,11 +176,11 @@ static void dma_blk_cb(void *opaque, int ret)
> QEMU_ALIGN_DOWN(dbs->iov.size, dbs->align));
> }
>
> - aio_context_acquire(dbs->ctx);
> dbs->acb = dbs->io_func(dbs->offset, &dbs->iov,
> dma_blk_cb, dbs, dbs->io_func_opaque);
> - aio_context_release(dbs->ctx);
> assert(dbs->acb);
> +out:
> + aio_context_release(ctx);
> }
Kevin
next prev parent reply other threads:[~2023-02-16 15:28 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-10 14:32 [PATCH v2 0/3] virtio-scsi: fix SCSIDevice hot unplug with IOThread Stefan Hajnoczi
2023-02-10 14:32 ` [PATCH v2 1/3] scsi: protect req->aiocb with AioContext lock Stefan Hajnoczi
2023-02-15 18:17 ` Eric Blake
2023-02-16 12:34 ` Kevin Wolf
2023-02-10 14:32 ` [PATCH v2 2/3] dma-helpers: prevent dma_blk_cb() vs dma_aio_cancel() race Stefan Hajnoczi
2023-02-15 18:19 ` Eric Blake
2023-02-16 15:27 ` Kevin Wolf [this message]
2023-02-16 21:27 ` Stefan Hajnoczi
2023-02-10 14:32 ` [PATCH v2 3/3] virtio-scsi: reset SCSI devices from main loop thread Stefan Hajnoczi
2023-02-15 18:27 ` Eric Blake
2023-02-17 10:22 ` Kevin Wolf
[not found] ` <Y/Tz+qw7thcwO+G3@fedora>
2023-02-23 15:03 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y+5LblriH/1jmHcB@redhat.com \
--to=kwolf@redhat.com \
--cc=david@redhat.com \
--cc=fam@euphon.net \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.