* [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO
From: Fiona Ebner @ 2025-10-17 9:43 UTC
To: qemu-devel; +Cc: pbonzini, fam, mst, stefanha, kwolf, qemu-stable
When scsi_req_dequeue() is reached via
scsi_req_cancel_async()
virtio_scsi_tmf_cancel_req()
virtio_scsi_do_tmf_aio_context(),
there is a deadlock when trying to acquire the SCSI device's requests
lock, because it was already acquired in
virtio_scsi_do_tmf_aio_context().
In particular, the issue happens with FreeBSD guests (13, 14, 15,
maybe more) when they cancel SCSI requests because of a timeout.
This is a regression caused by commit da6eebb33b ("virtio-scsi:
perform TMFs in appropriate AioContexts") together with the earlier
introduction of the requests_lock.
To fix the issue, only cancel the requests after releasing the
requests_lock. For this, the SCSI device's requests are iterated while
holding the requests_lock and the requests to be cancelled are
collected in a list. Then, the collected requests are cancelled
one by one while not holding the requests_lock. This is safe, because
only requests from the current AioContext are collected and acted
upon.
Originally reported by Proxmox VE users:
https://bugzilla.proxmox.com/show_bug.cgi?id=6810
https://forum.proxmox.com/threads/173914/
Fixes: da6eebb33b ("virtio-scsi: perform TMFs in appropriate AioContexts")
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
Changes in v2:
* Different approach, collect requests for cancelling in a list for a
localized solution rather than keeping track of the lock status via
function arguments.
hw/scsi/virtio-scsi.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index d817fc42b4..2896b05808 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -339,6 +339,7 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
SCSIDevice *d = virtio_scsi_device_get(s, tmf->req.tmf.lun);
SCSIRequest *r;
bool match_tag;
+ g_autoptr(GList) reqs = NULL;
if (!d) {
tmf->resp.tmf.response = VIRTIO_SCSI_S_BAD_TARGET;
@@ -374,10 +375,21 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
if (match_tag && cmd_req->req.cmd.tag != tmf->req.tmf.tag) {
continue;
}
- virtio_scsi_tmf_cancel_req(tmf, r);
+ /*
+ * Cannot cancel directly, because scsi_req_dequeue() would deadlock
+ * when attempting to acquire the requests_lock a second time. Taking
+ * a reference here is paired with an unref after cancelling below.
+ */
+ scsi_req_ref(r);
+ reqs = g_list_append(reqs, r);
}
}
+ for (GList *elem = g_list_first(reqs); elem; elem = g_list_next(elem)) {
+ virtio_scsi_tmf_cancel_req(tmf, elem->data);
+ scsi_req_unref(elem->data);
+ }
+
/* Incremented by virtio_scsi_do_tmf() */
virtio_scsi_tmf_dec_remaining(tmf);
--
2.47.3
* Re: [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO
From: Stefan Hajnoczi @ 2025-10-17 17:54 UTC
To: Fiona Ebner; +Cc: qemu-devel, pbonzini, fam, mst, kwolf, qemu-stable
On Fri, Oct 17, 2025 at 11:43:30AM +0200, Fiona Ebner wrote:
> When scsi_req_dequeue() is reached via
> scsi_req_cancel_async()
> virtio_scsi_tmf_cancel_req()
> virtio_scsi_do_tmf_aio_context(),
> there is a deadlock when trying to acquire the SCSI device's requests
> lock, because it was already acquired in
> virtio_scsi_do_tmf_aio_context().
>
> In particular, the issue happens with FreeBSD guests (13, 14, 15,
> maybe more) when they cancel SCSI requests because of a timeout.
>
> This is a regression caused by commit da6eebb33b ("virtio-scsi:
> perform TMFs in appropriate AioContexts") together with the earlier
> introduction of the requests_lock.
>
> To fix the issue, only cancel the requests after releasing the
> requests_lock. For this, the SCSI device's requests are iterated while
> holding the requests_lock and the requests to be cancelled are
> collected in a list. Then, the collected requests are cancelled
> one by one while not holding the requests_lock. This is safe, because
> only requests from the current AioContext are collected and acted
> upon.
>
> Originally reported by Proxmox VE users:
> https://bugzilla.proxmox.com/show_bug.cgi?id=6810
> https://forum.proxmox.com/threads/173914/
>
> Fixes: da6eebb33b ("virtio-scsi: perform TMFs in appropriate AioContexts")
> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> Changes in v2:
> * Different approach, collect requests for cancelling in a list for a
> localized solution rather than keeping track of the lock status via
> function arguments.
>
> hw/scsi/virtio-scsi.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block
I replaced g_list_append() with g_list_prepend() like in
scsi_device_for_each_req_async_bh(). The GLib documentation says the
following (https://docs.gtk.org/glib/type_func.List.append.html):
g_list_append() has to traverse the entire list to find the end, which
is inefficient when adding multiple elements. A common idiom to avoid
the inefficiency is to use g_list_prepend() and reverse the list with
g_list_reverse() when all elements have been added.
We don't call g_list_reverse() in scsi_device_for_each_req_async_bh()
and I don't think it's necessary here either.
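For illustration only (not code from the patch; the helper and its
arguments are made up), the idiom from the GLib docs looks roughly
like this:

    #include <glib.h>

    /*
     * Sketch of the prepend idiom: each g_list_prepend() is O(1); the
     * final g_list_reverse() restores insertion order and can be
     * dropped when order does not matter, as for cancellation.
     */
    static GList *collect_items(gpointer *items, guint n)
    {
        GList *list = NULL;

        for (guint i = 0; i < n; i++) {
            list = g_list_prepend(list, items[i]);
        }
        return g_list_reverse(list);
    }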
Stefan
* Re: [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO
From: Paolo Bonzini @ 2025-10-18 8:14 UTC
To: Stefan Hajnoczi, Fiona Ebner; +Cc: qemu-devel, fam, mst, kwolf, qemu-stable
On 10/17/25 19:54, Stefan Hajnoczi wrote:
> On Fri, Oct 17, 2025 at 11:43:30AM +0200, Fiona Ebner wrote:
>> Changes in v2:
>> * Different approach, collect requests for cancelling in a list for a
>> localized solution rather than keeping track of the lock status via
>> function arguments.
>>
>> hw/scsi/virtio-scsi.c | 14 +++++++++++++-
>> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> Thanks, applied to my block tree:
> https://gitlab.com/stefanha/qemu/commits/block
Thanks Stefan; sorry for the delay in reviewing. The fix
of releasing the lock around virtio_scsi_tmf_cancel_req():
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 9b12ee7f1c6..ac17c97f224 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -1503,6 +1503,10 @@ SCSIRequest *scsi_req_ref(SCSIRequest *req)
void scsi_req_unref(SCSIRequest *req)
{
+ if (!req) {
+ return;
+ }
+
assert(req->refcount > 0);
if (--req->refcount == 0) {
BusState *qbus = req->dev->qdev.parent_bus;
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index d817fc42b4c..481e78e4771 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -364,7 +364,11 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
}
WITH_QEMU_LOCK_GUARD(&d->requests_lock) {
+ SCSIRequest *prev = NULL;
QTAILQ_FOREACH(r, &d->requests, next) {
+ scsi_req_unref(prev);
+ prev = NULL;
+
VirtIOSCSIReq *cmd_req = r->hba_private;
assert(cmd_req); /* request has hba_private while enqueued */
@@ -374,8 +378,20 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
if (match_tag && cmd_req->req.cmd.tag != tmf->req.tmf.tag) {
continue;
}
+
+ /*
+ * Keep it alive while the lock is released, and also to be
+ * able to read "next".
+ */
+ scsi_req_ref(r);
+ prev = r;
+
+ qemu_mutex_unlock(&d->requests_lock);
virtio_scsi_tmf_cancel_req(tmf, r);
+ qemu_mutex_lock(&d->requests_lock);
}
+
+ scsi_req_unref(prev);
}
/* Incremented by virtio_scsi_do_tmf() */
would have a bug too, in that the loop is not using
QTAILQ_FOREACH_SAFE and scsi_req_dequeue() removes the
request from the list.
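For reference, a rough sketch (not a submitted patch; "r_next" is a
name I made up) of what the safe iteration would look like:

    SCSIRequest *r, *r_next;

    WITH_QEMU_LOCK_GUARD(&d->requests_lock) {
        /*
         * QTAILQ_FOREACH_SAFE caches r->next in r_next before the loop
         * body runs, so dequeuing r inside the body does not break the
         * iteration.
         */
        QTAILQ_FOREACH_SAFE(r, &d->requests, next, r_next) {
            /* match and cancel as before */
        }
    }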
I think scsi_req_ref/unref should also be changed to use atomics.
free_request is only implemented by hw/usb/dev-uas.c and all the
others do not need a lock, so we're fine with that.
And the QOM references held by the requests are not necessary, because
the requests won't survive scsi_qdev_unrealize() anyway (up to which
point the device is certainly alive). I'll test this, add some
comments and send a patch:
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 9b12ee7f1c6..7fcacc178da 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -838,8 +838,6 @@ SCSIRequest *scsi_req_alloc(const SCSIReqOps *reqops, SCSIDevice *d,
req->status = -1;
req->host_status = -1;
req->ops = reqops;
- object_ref(OBJECT(d));
- object_ref(OBJECT(qbus->parent));
notifier_list_init(&req->cancel_notifiers);
if (reqops->init_req) {
@@ -1496,15 +1494,15 @@ void scsi_device_report_change(SCSIDevice *dev, SCSISense sense)
SCSIRequest *scsi_req_ref(SCSIRequest *req)
{
- assert(req->refcount > 0);
- req->refcount++;
+ assert(qatomic_read(&req->refcount) > 0);
+ qatomic_inc(&req->refcount);
return req;
}
void scsi_req_unref(SCSIRequest *req)
{
- assert(req->refcount > 0);
- if (--req->refcount == 0) {
+ assert(qatomic_read(&req->refcount) > 0);
+ if (qatomic_fetch_dec(&req->refcount) == 1) {
BusState *qbus = req->dev->qdev.parent_bus;
SCSIBus *bus = DO_UPCAST(SCSIBus, qbus, qbus);
@@ -1514,8 +1512,6 @@ void scsi_req_unref(SCSIRequest *req)
if (req->ops->free_req) {
req->ops->free_req(req);
}
- object_unref(OBJECT(req->dev));
- object_unref(OBJECT(qbus->parent));
g_free(req);
}
}
Paolo
* Re: [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO
From: Stefan Hajnoczi @ 2025-10-27 18:55 UTC
To: Paolo Bonzini; +Cc: Fiona Ebner, qemu-devel, fam, mst, kwolf, qemu-stable
On Sat, Oct 18, 2025 at 10:14:49AM +0200, Paolo Bonzini wrote:
> On 10/17/25 19:54, Stefan Hajnoczi wrote:
> > On Fri, Oct 17, 2025 at 11:43:30AM +0200, Fiona Ebner wrote:
> > > Changes in v2:
> > > * Different approach, collect requests for cancelling in a list for a
> > > localized solution rather than keeping track of the lock status via
> > > function arguments.
> > >
> > > hw/scsi/virtio-scsi.c | 14 +++++++++++++-
> > > 1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > Thanks, applied to my block tree:
> > https://gitlab.com/stefanha/qemu/commits/block
>
> Thanks Stefan; sorry for the delay in reviewing. The fix
> of releasing the lock around virtio_scsi_tmf_cancel_req():
>
> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
> index 9b12ee7f1c6..ac17c97f224 100644
> --- a/hw/scsi/scsi-bus.c
> +++ b/hw/scsi/scsi-bus.c
> @@ -1503,6 +1503,10 @@ SCSIRequest *scsi_req_ref(SCSIRequest *req)
> void scsi_req_unref(SCSIRequest *req)
> {
> + if (!req) {
> + return;
> + }
> +
> assert(req->refcount > 0);
> if (--req->refcount == 0) {
> BusState *qbus = req->dev->qdev.parent_bus;
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index d817fc42b4c..481e78e4771 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -364,7 +364,11 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
> }
> WITH_QEMU_LOCK_GUARD(&d->requests_lock) {
> + SCSIRequest *prev = NULL;
> QTAILQ_FOREACH(r, &d->requests, next) {
> + scsi_req_unref(prev);
> + prev = NULL;
> +
> VirtIOSCSIReq *cmd_req = r->hba_private;
> assert(cmd_req); /* request has hba_private while enqueued */
> @@ -374,8 +378,20 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
> if (match_tag && cmd_req->req.cmd.tag != tmf->req.tmf.tag) {
> continue;
> }
> +
> + /*
> + * Keep it alive while the lock is released, and also to be
> + * able to read "next".
> + */
> + scsi_req_ref(r);
> + prev = r;
> +
> + qemu_mutex_unlock(&d->requests_lock);
> virtio_scsi_tmf_cancel_req(tmf, r);
> + qemu_mutex_lock(&d->requests_lock);
> }
> +
> + scsi_req_unref(prev);
> }
> /* Incremented by virtio_scsi_do_tmf() */
>
>
> would have a bug too, in that the loop is not using
> QTAILQ_FOREACH_SAFE and scsi_req_dequeue() removes the
> request from the list.
>
> I think scsi_req_ref/unref should also be changed to use atomics.
> free_request is only implemented by hw/usb/dev-uas.c and all the
> others do not need a lock, so we're fine with that.
At the moment there is the assumption that a request executes in the
same AioContext for its entire lifetime. Most devices only have one
AioContext and don't worry about thread-safety at all (like the
hw/usb/dev-uas.c example you mentioned).
SCSIRequest->refcount does not need to be atomic today, and any change
to the SCSI layer that actually touches a request from multiple threads
will need to do more than just make refcount atomic.
I worry making refcount atomic might give the impression that
SCSIRequest is thread-safe when it's not. I would only make it atomic
when there are multi-threaded users.
>
> And the QOM references held by the requests are not necessary, because
> the requests won't survive scsi_qdev_unrealize() anyway (up to which
> point the device is certainly alive). I'll test this, add some
> comments and send a patch:
Avoiding QOM ref/unref would be nice.
Stefan