* [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO
From: Fiona Ebner @ 2025-10-17 9:43 UTC
To: qemu-devel; +Cc: pbonzini, fam, mst, stefanha, kwolf, qemu-stable
When scsi_req_dequeue() is reached via
scsi_req_cancel_async()
virtio_scsi_tmf_cancel_req()
virtio_scsi_do_tmf_aio_context(),
there is a deadlock when trying to acquire the SCSI device's requests
lock, because it was already acquired in
virtio_scsi_do_tmf_aio_context().
In particular, the issue happens with FreeBSD guests (13, 14, 15,
maybe more) when they cancel SCSI requests because of a timeout.
This is a regression caused by commit da6eebb33b ("virtio-scsi:
perform TMFs in appropriate AioContexts") together with the earlier
introduction of the requests_lock.
To fix the issue, only cancel the requests after releasing the
requests_lock. For this, the SCSI device's requests are iterated while
holding the requests_lock and the requests to be cancelled are
collected in a list. Then, the collected requests are cancelled
one by one while not holding the requests_lock. This is safe, because
only requests from the current AioContext are collected and acted
upon.
Originally reported by Proxmox VE users:
https://bugzilla.proxmox.com/show_bug.cgi?id=6810
https://forum.proxmox.com/threads/173914/
Fixes: da6eebb33b ("virtio-scsi: perform TMFs in appropriate AioContexts")
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
Changes in v2:
* Different approach, collect requests for cancelling in a list for a
localized solution rather than keeping track of the lock status via
function arguments.
hw/scsi/virtio-scsi.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index d817fc42b4..2896b05808 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -339,6 +339,7 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
SCSIDevice *d = virtio_scsi_device_get(s, tmf->req.tmf.lun);
SCSIRequest *r;
bool match_tag;
+ g_autoptr(GList) reqs = NULL;
if (!d) {
tmf->resp.tmf.response = VIRTIO_SCSI_S_BAD_TARGET;
@@ -374,10 +375,21 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
if (match_tag && cmd_req->req.cmd.tag != tmf->req.tmf.tag) {
continue;
}
- virtio_scsi_tmf_cancel_req(tmf, r);
+ /*
+ * Cannot cancel directly, because scsi_req_dequeue() would deadlock
+ * when attempting to acquire the requests_lock a second time. Taking
+ * a reference here is paired with an unref after cancelling below.
+ */
+ scsi_req_ref(r);
+ reqs = g_list_append(reqs, r);
}
}
+ for (GList *elem = g_list_first(reqs); elem; elem = g_list_next(elem)) {
+ virtio_scsi_tmf_cancel_req(tmf, elem->data);
+ scsi_req_unref(elem->data);
+ }
+
/* Incremented by virtio_scsi_do_tmf() */
virtio_scsi_tmf_dec_remaining(tmf);
--
2.47.3
* Re: [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO
From: Stefan Hajnoczi @ 2025-10-17 17:54 UTC
To: Fiona Ebner; +Cc: qemu-devel, pbonzini, fam, mst, kwolf, qemu-stable
On Fri, Oct 17, 2025 at 11:43:30AM +0200, Fiona Ebner wrote:
> When scsi_req_dequeue() is reached via
> scsi_req_cancel_async()
> virtio_scsi_tmf_cancel_req()
> virtio_scsi_do_tmf_aio_context(),
> there is a deadlock when trying to acquire the SCSI device's requests
> lock, because it was already acquired in
> virtio_scsi_do_tmf_aio_context().
>
> In particular, the issue happens with FreeBSD guests (13, 14, 15,
> maybe more) when they cancel SCSI requests because of a timeout.
>
> This is a regression caused by commit da6eebb33b ("virtio-scsi:
> perform TMFs in appropriate AioContexts") together with the earlier
> introduction of the requests_lock.
>
> To fix the issue, only cancel the requests after releasing the
> requests_lock. For this, the SCSI device's requests are iterated while
> holding the requests_lock and the requests to be cancelled are
> collected in a list. Then, the collected requests are cancelled
> one by one while not holding the requests_lock. This is safe, because
> only requests from the current AioContext are collected and acted
> upon.
>
> Originally reported by Proxmox VE users:
> https://bugzilla.proxmox.com/show_bug.cgi?id=6810
> https://forum.proxmox.com/threads/173914/
>
> Fixes: da6eebb33b ("virtio-scsi: perform TMFs in appropriate AioContexts")
> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> Changes in v2:
> * Different approach, collect requests for cancelling in a list for a
> localized solution rather than keeping track of the lock status via
> function arguments.
>
> hw/scsi/virtio-scsi.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block
I replaced g_list_append() with g_list_prepend() like in
scsi_device_for_each_req_async_bh(). The GLib documentation says the
following (https://docs.gtk.org/glib/type_func.List.append.html):
g_list_append() has to traverse the entire list to find the end, which
is inefficient when adding multiple elements. A common idiom to avoid
the inefficiency is to use g_list_prepend() and reverse the list with
g_list_reverse() when all elements have been added.
We don't call g_list_reverse() in scsi_device_for_each_req_async_bh()
and I don't think it's necessary here either.
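For illustration only (not code from the patch; the helper and its
arguments are made up), the idiom from the GLib docs looks roughly
like this:

    #include <glib.h>

    /*
     * Sketch of the prepend idiom: each g_list_prepend() is O(1); the
     * final g_list_reverse() restores insertion order and can be
     * dropped when order does not matter, as for cancellation.
     */
    static GList *collect_items(gpointer *items, guint n)
    {
        GList *list = NULL;

        for (guint i = 0; i < n; i++) {
            list = g_list_prepend(list, items[i]);
        }
        return g_list_reverse(list);
    }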
Stefan
* Re: [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO
From: Paolo Bonzini @ 2025-10-18 8:14 UTC
To: Stefan Hajnoczi, Fiona Ebner; +Cc: qemu-devel, fam, mst, kwolf, qemu-stable
On 10/17/25 19:54, Stefan Hajnoczi wrote:
> On Fri, Oct 17, 2025 at 11:43:30AM +0200, Fiona Ebner wrote:
>> Changes in v2:
>> * Different approach, collect requests for cancelling in a list for a
>> localized solution rather than keeping track of the lock status via
>> function arguments.
>>
>> hw/scsi/virtio-scsi.c | 14 +++++++++++++-
>> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> Thanks, applied to my block tree:
> https://gitlab.com/stefanha/qemu/commits/block
Thanks Stefan; sorry for the delay in reviewing. The fix
of releasing the lock around virtio_scsi_tmf_cancel_req():
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 9b12ee7f1c6..ac17c97f224 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -1503,6 +1503,10 @@ SCSIRequest *scsi_req_ref(SCSIRequest *req)
void scsi_req_unref(SCSIRequest *req)
{
+ if (!req) {
+ return;
+ }
+
assert(req->refcount > 0);
if (--req->refcount == 0) {
BusState *qbus = req->dev->qdev.parent_bus;
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index d817fc42b4c..481e78e4771 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -364,7 +364,11 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
}
WITH_QEMU_LOCK_GUARD(&d->requests_lock) {
+ SCSIRequest *prev = NULL;
QTAILQ_FOREACH(r, &d->requests, next) {
+ scsi_req_unref(prev);
+ prev = NULL;
+
VirtIOSCSIReq *cmd_req = r->hba_private;
assert(cmd_req); /* request has hba_private while enqueued */
@@ -374,8 +378,20 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
if (match_tag && cmd_req->req.cmd.tag != tmf->req.tmf.tag) {
continue;
}
+
+ /*
+ * Keep it alive while the lock is released, and also to be
+ * able to read "next".
+ */
+ scsi_req_ref(r);
+ prev = r;
+
+ qemu_mutex_unlock(&d->requests_lock);
virtio_scsi_tmf_cancel_req(tmf, r);
+ qemu_mutex_lock(&d->requests_lock);
}
+
+ scsi_req_unref(prev);
}
/* Incremented by virtio_scsi_do_tmf() */
would have a bug too, in that the loop is not using
QTAILQ_FOREACH_SAFE and scsi_req_dequeue() removes the
request from the list.
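For reference, a rough sketch (not a submitted patch; "r_next" is a
name I made up) of what the safe iteration would look like:

    SCSIRequest *r, *r_next;

    WITH_QEMU_LOCK_GUARD(&d->requests_lock) {
        /*
         * QTAILQ_FOREACH_SAFE caches r->next in r_next before the loop
         * body runs, so dequeuing r inside the body does not break the
         * iteration.
         */
        QTAILQ_FOREACH_SAFE(r, &d->requests, next, r_next) {
            /* match and cancel as before */
        }
    }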
I think scsi_req_ref/unref should also be changed to use atomics.
free_request is only implemented by hw/usb/dev-uas.c and all the
others do not need a lock, so we're fine with that.
And the QOM references held by the requests are not necessary, because
the requests won't survive scsi_qdev_unrealize() anyway (up to which
point the device is certainly alive). I'll test this, add some
comments and send a patch:
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 9b12ee7f1c6..7fcacc178da 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -838,8 +838,6 @@ SCSIRequest *scsi_req_alloc(const SCSIReqOps *reqops, SCSIDevice *d,
req->status = -1;
req->host_status = -1;
req->ops = reqops;
- object_ref(OBJECT(d));
- object_ref(OBJECT(qbus->parent));
notifier_list_init(&req->cancel_notifiers);
if (reqops->init_req) {
@@ -1496,15 +1494,15 @@ void scsi_device_report_change(SCSIDevice *dev, SCSISense sense)
SCSIRequest *scsi_req_ref(SCSIRequest *req)
{
- assert(req->refcount > 0);
- req->refcount++;
+ assert(qatomic_read(&req->refcount) > 0);
+ qatomic_inc(&req->refcount);
return req;
}
void scsi_req_unref(SCSIRequest *req)
{
- assert(req->refcount > 0);
- if (--req->refcount == 0) {
+ assert(qatomic_read(&req->refcount) > 0);
+ if (qatomic_fetch_dec(&req->refcount) == 1) {
BusState *qbus = req->dev->qdev.parent_bus;
SCSIBus *bus = DO_UPCAST(SCSIBus, qbus, qbus);
@@ -1514,8 +1512,6 @@ void scsi_req_unref(SCSIRequest *req)
if (req->ops->free_req) {
req->ops->free_req(req);
}
- object_unref(OBJECT(req->dev));
- object_unref(OBJECT(qbus->parent));
g_free(req);
}
}
Paolo
* Re: [PATCH v2] hw/scsi: avoid deadlock upon TMF request cancelling with VirtIO
From: Stefan Hajnoczi @ 2025-10-27 18:55 UTC
To: Paolo Bonzini; +Cc: Fiona Ebner, qemu-devel, fam, mst, kwolf, qemu-stable
On Sat, Oct 18, 2025 at 10:14:49AM +0200, Paolo Bonzini wrote:
> On 10/17/25 19:54, Stefan Hajnoczi wrote:
> > On Fri, Oct 17, 2025 at 11:43:30AM +0200, Fiona Ebner wrote:
> > > Changes in v2:
> > > * Different approach, collect requests for cancelling in a list for a
> > > localized solution rather than keeping track of the lock status via
> > > function arguments.
> > >
> > > hw/scsi/virtio-scsi.c | 14 +++++++++++++-
> > > 1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > Thanks, applied to my block tree:
> > https://gitlab.com/stefanha/qemu/commits/block
>
> Thanks Stefan; sorry for the delay in reviewing. The fix
> of releasing the lock around virtio_scsi_tmf_cancel_req():
>
> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
> index 9b12ee7f1c6..ac17c97f224 100644
> --- a/hw/scsi/scsi-bus.c
> +++ b/hw/scsi/scsi-bus.c
> @@ -1503,6 +1503,10 @@ SCSIRequest *scsi_req_ref(SCSIRequest *req)
> void scsi_req_unref(SCSIRequest *req)
> {
> + if (!req) {
> + return;
> + }
> +
> assert(req->refcount > 0);
> if (--req->refcount == 0) {
> BusState *qbus = req->dev->qdev.parent_bus;
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index d817fc42b4c..481e78e4771 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -364,7 +364,11 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
> }
> WITH_QEMU_LOCK_GUARD(&d->requests_lock) {
> + SCSIRequest *prev = NULL;
> QTAILQ_FOREACH(r, &d->requests, next) {
> + scsi_req_unref(prev);
> + prev = NULL;
> +
> VirtIOSCSIReq *cmd_req = r->hba_private;
> assert(cmd_req); /* request has hba_private while enqueued */
> @@ -374,8 +378,20 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
> if (match_tag && cmd_req->req.cmd.tag != tmf->req.tmf.tag) {
> continue;
> }
> +
> + /*
> + * Keep it alive while the lock is released, and also to be
> + * able to read "next".
> + */
> + scsi_req_ref(r);
> + prev = r;
> +
> + qemu_mutex_unlock(&d->requests_lock);
> virtio_scsi_tmf_cancel_req(tmf, r);
> + qemu_mutex_lock(&d->requests_lock);
> }
> +
> + scsi_req_unref(prev);
> }
> /* Incremented by virtio_scsi_do_tmf() */
>
>
> would have a bug too, in that the loop is not using
> QTAILQ_FOREACH_SAFE and scsi_req_dequeue() removes the
> request from the list.
>
> I think scsi_req_ref/unref should also be changed to use atomics.
> free_request is only implemented by hw/usb/dev-uas.c and all the
> others do not need a lock, so we're fine with that.
At the moment there is the assumption that a request executes in the
same AioContext for its entire lifetime. Most devices only have one
AioContext and don't worry about thread-safety at all (like the
hw/usb/dev-uas.c example you mentioned).
SCSIRequest->refcount does not need to be atomic today, and any change
to the SCSI layer that actually touches a request from multiple threads
will need to do more than just make refcount atomic.
I worry making refcount atomic might give the impression that
SCSIRequest is thread-safe when it's not. I would only make it atomic
when there are multi-threaded users.
>
> And the QOM references held by the requests are not necessary, because
> the requests won't survive scsi_qdev_unrealize() anyway (up to which
> point the device is certainly alive). I'll test this, add some
> comments and send a patch:
Avoiding QOM ref/unref would be nice.
Stefan