Re: [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Stefan Hajnoczi <stefanha@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fiona Ebner <f.ebner@proxmox.com>,
	qemu-devel@nongnu.org, fam@euphon.net, mst@redhat.com,
	kwolf@redhat.com, qemu-stable@nongnu.org
Subject: Re: [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO
Date: Mon, 27 Oct 2025 14:55:58 -0400	[thread overview]
Message-ID: <20251027185558.GB11774@fedora> (raw)
In-Reply-To: <2de05b13-f97a-480d-b07c-2a8ffc695c21@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3739 bytes --]

On Sat, Oct 18, 2025 at 10:14:49AM +0200, Paolo Bonzini wrote:
> On 10/17/25 19:54, Stefan Hajnoczi wrote:
> > On Fri, Oct 17, 2025 at 11:43:30AM +0200, Fiona Ebner wrote:
> > > Changes in v2:
> > > * Different approach, collect requests for cancelling in a list for a
> > >    localized solution rather than keeping track of the lock status via
> > >    function arguments.
> > > 
> > >   hw/scsi/virtio-scsi.c | 14 +++++++++++++-
> > >   1 file changed, 13 insertions(+), 1 deletion(-)
> > 
> > Thanks, applied to my block tree:
> > https://gitlab.com/stefanha/qemu/commits/block
> 
> Thanks Stefan; sorry for the delay in reviewing.  The fix
> of releasing the lock around virtio_scsi_tmf_cancel_req():
> 
> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
> index 9b12ee7f1c6..ac17c97f224 100644
> --- a/hw/scsi/scsi-bus.c
> +++ b/hw/scsi/scsi-bus.c
> @@ -1503,6 +1503,10 @@ SCSIRequest *scsi_req_ref(SCSIRequest *req)
>  void scsi_req_unref(SCSIRequest *req)
>  {
> +    if (!req) {
> +        return;
> +    }
> +
>      assert(req->refcount > 0);
>      if (--req->refcount == 0) {
>          BusState *qbus = req->dev->qdev.parent_bus;
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index d817fc42b4c..481e78e4771 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -364,7 +364,11 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
>      }
>      WITH_QEMU_LOCK_GUARD(&d->requests_lock) {
> +        SCSIRequest *prev = NULL;
>          QTAILQ_FOREACH(r, &d->requests, next) {
> +            scsi_req_unref(prev);
> +            prev = NULL;
> +
>              VirtIOSCSIReq *cmd_req = r->hba_private;
>              assert(cmd_req); /* request has hba_private while enqueued */
> @@ -374,8 +378,20 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
>              if (match_tag && cmd_req->req.cmd.tag != tmf->req.tmf.tag) {
>                  continue;
>              }
> +
> +            /*
> +             * Keep it alive while the lock is released, and also to be
> +             * able to read "next".
> +             */
> +            scsi_req_ref(r);
> +            prev = r;
> +
> +            qemu_mutex_unlock(&d->request_lock);
>              virtio_scsi_tmf_cancel_req(tmf, r);
> +            qemu_mutex_lock(&d->request_lock);
>          }
> +
> +        scsi_req_unref(prev);
>      }
>      /* Incremented by virtio_scsi_do_tmf() */
> 
> 
> would have a bug too, in that the loop is not using
> QTAILQ_FOREACH_SAFE and scsi_req_dequeue() removes the
> request from the list.
> 
> I think scsi_req_ref/unref should also be changed to use atomics.
> free_request is only implemented by hw/usb/dev-uas.c and all the
> others do not need a lock, so we're fine with that.

At the moment there is the assumption that a request executes in the
same AioContext for its entire lifetime. Most devices only have one
AioContext and don't worry about thread-safety at all (like the
hw/usb/dev-uas.c example you mentioned).

SCSIRequest->refcount does not need to be atomic today and any change to
the SCSI layer that actually touches a request from multiple threads
will need to do more than just making refcount atomic.

I worry making refcount atomic might give the impression that
SCSIRequest is thread-safe when it's not. I would only make it atomic
when there are multi-threaded users.

> 
> And QOM references held by the requests are not necessary, because
> anyway the requests won't survive scsi_qdev_unrealize (at which
> point the device is certainly alive).  I'll test this, add some
> comments and send a patch:

Avoiding QOM ref/unref would be nice.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

     prev parent reply	other threads:[~2025-10-27 18:58 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-17  9:43 [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO Fiona Ebner
2025-10-17 17:54 ` Stefan Hajnoczi
2025-10-18  8:14   ` Paolo Bonzini
2025-10-27 18:55     ` Stefan Hajnoczi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251027185558.GB11774@fedora \
    --to=stefanha@redhat.com \
    --cc=f.ebner@proxmox.com \
    --cc=fam@euphon.net \
    --cc=kwolf@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-stable@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.