* [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info
@ 2026-04-07 21:25 Link Lin
[not found] ` <CALUx4KQKNYZm5AZzQXNqLRdGAT0nQOpmn_Lz6WHie73w1d9JQA@mail.gmail.com>
0 siblings, 1 reply; 8+ messages in thread
From: Link Lin @ 2026-04-07 21:25 UTC (permalink / raw)
To: mst, jasowang, xuanzhuo
Cc: eperezma, jiaqiyan, rientjes, weixugc, virtualization,
linux-kernel, Link Lin
In environments where free page hinting is disabled, a kernel
panic is triggered when tearing down the virtio_balloon module:
[12261.808190] Call trace:
[12261.808471] __list_del_entry_valid_or_report+0x18/0xe0
[12261.809064] vp_del_vqs+0x12c/0x270
[12261.809462] remove_common+0x80/0x98 [virtio_balloon]
[12261.810034] virtballoon_remove+0xfc/0x158 [virtio_balloon]
[12261.810663] virtio_dev_remove+0x68/0xf8
[12261.811108] device_release_driver_internal+0x17c/0x278
[12261.811701] driver_detach+0xd4/0x138
[12261.812117] bus_remove_driver+0x90/0xd0
[12261.812562] driver_unregister+0x40/0x70
[12261.813006] unregister_virtio_driver+0x20/0x38
[12261.813518] cleanup_module+0x20/0x7a8 [virtio_balloon]
[12261.814109] __arm64_sys_delete_module+0x278/0x3d0
[12261.814654] invoke_syscall+0x5c/0x120
[12261.815086] el0_svc_common+0x90/0xf8
[12261.815506] do_el0_svc+0x2c/0x48
[12261.815883] el0_svc+0x3c/0xa8
[12261.816235] el0t_64_sync_handler+0x8c/0x108
[12261.816724] el0t_64_sync+0x198/0x1a0
The issue originates in vp_find_vqs_intx(). It allocates the
vp_dev->vqs array with kzalloc_objs() based on the nvqs count
provided by the caller, virtio_balloon's init_vqs(). However, not
all nvqs virtio_pci_vq_info objects are necessarily populated.
For example, when VIRTIO_BALLOON_F_FREE_PAGE_HINT is absent, the
VIRTIO_BALLOON_VQ_FREE_PAGE-th item in the vp_dev->vqs array is
never populated and remains a zero-initialized
virtio_pci_vq_info object, which eventually triggers the
__list_del_entry_valid_or_report() crash above.
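To illustrate why a zero-initialized entry crashes, here is a
userspace sketch (a simplified model, not the kernel's actual list
implementation; all names below are made up for illustration). A
zeroed node has NULL next/prev pointers and fails the consistency
check that __list_del_entry_valid_or_report() performs before
unlinking:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified doubly linked list mirroring struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_node(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Models the validity check done before list_del(): a node that was
 * never linked (all-zero) has NULL pointers and fails immediately. */
static int list_del_entry_valid(const struct list_head *entry)
{
	return entry->next != NULL && entry->prev != NULL &&
	       entry->prev->next == entry && entry->next->prev == entry;
}
```

A properly linked node passes the check; a zero-initialized one does
not, which in the kernel (with list debugging enabled) is reported as
the crash in the trace above.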
Tested by applying this patch to a guest VM kernel with the
VIRTIO_BALLOON_F_REPORTING feature enabled and the
VIRTIO_BALLOON_F_FREE_PAGE_HINT feature disabled.
Without this patch, unloading the virtio_balloon module triggers a panic.
With this patch, no panic is observed.
The fix is to use queue_idx to handle the case where
vp_find_vqs_intx() skips vp_setup_vq() because the caller provided
a null vqs_info[i].name, i.e. the caller does not populate all nvqs
virtqueue_info objects. Invariantly, queue_idx is the correct index
at which to store a successfully created and populated
virtio_pci_vq_info object. As a result, a virtio_pci_device now
stores exactly queue_idx valid virtio_pci_vq_info objects in its
vqs array when the for-loop over nvqs finishes (barring the error
path through goto out_del_vqs).
vp_find_vqs_msix() has a similar issue, so fix it in the same way.
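The indexing change can be sketched as a simplified userspace model
(hypothetical names, not the driver code): slots with a NULL name
are skipped, and queue_idx, rather than the loop index i, addresses
the info array, so only successfully set-up entries are stored and
they end up densely packed at the front:

```c
#include <stddef.h>

#define NVQS 4

/* Stand-in for virtio_pci_vq_info, for illustration only. */
struct fake_vq_info {
	const char *name;
	int populated;
};

/* Models the patched loop: advance queue_idx only after a slot is
 * actually set up, and use it (not i) to index the info array. */
static int setup_vqs(const char *names[NVQS],
		     struct fake_vq_info infos[NVQS])
{
	int queue_idx = 0;
	int i;

	for (i = 0; i < NVQS; i++) {
		if (!names[i])
			continue;	/* caller left this slot unused */
		infos[queue_idx].name = names[i];
		infos[queue_idx].populated = 1;
		++queue_idx;		/* advance only on success */
	}
	return queue_idx;		/* number of valid entries */
}
```

With the original code, the skipped slot would leave a hole (a
zero-initialized entry) in the middle of the array; with the dense
indexing, teardown can stop at queue_idx entries and never touches
an unpopulated one.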
This patch is marked as RFC because we are uncertain whether any
virtio-pci code implicitly requires virtio_pci_device's vqs array
to always contain nvqs virtio_pci_vq_info objects, including
zero-initialized ones for unused queues. We have not observed
any issues in our testing, but insights or alternatives are welcome!
Signed-off-by: Link Lin <linkl@google.com>
Co-developed-by: Jiaqi Yan <jiaqiyan@google.com>
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
---
drivers/virtio/virtio_pci_common.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index da97b6a988de..9b32301529e5 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -423,14 +423,15 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs,
vqs[i] = NULL;
continue;
}
- vqs[i] = vp_find_one_vq_msix(vdev, queue_idx++, vqi->callback,
+ vqs[i] = vp_find_one_vq_msix(vdev, queue_idx, vqi->callback,
vqi->name, vqi->ctx, false,
&allocated_vectors, vector_policy,
- &vp_dev->vqs[i]);
+ &vp_dev->vqs[queue_idx]);
if (IS_ERR(vqs[i])) {
err = PTR_ERR(vqs[i]);
goto error_find;
}
+ ++queue_idx;
}
if (!avq_num)
@@ -485,13 +486,14 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs,
vqs[i] = NULL;
continue;
}
- vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback,
+ vqs[i] = vp_setup_vq(vdev, queue_idx, vqi->callback,
vqi->name, vqi->ctx,
- VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[i]);
+ VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[queue_idx]);
if (IS_ERR(vqs[i])) {
err = PTR_ERR(vqs[i]);
goto out_del_vqs;
}
+ ++queue_idx;
}
if (!avq_num)
--
2.53.0.1213.gd9a14994de-goog
^ permalink raw reply related	[flat|nested] 8+ messages in thread

[parent not found: <CALUx4KQKNYZm5AZzQXNqLRdGAT0nQOpmn_Lz6WHie73w1d9JQA@mail.gmail.com>]
* Re: [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info [not found] ` <CALUx4KQKNYZm5AZzQXNqLRdGAT0nQOpmn_Lz6WHie73w1d9JQA@mail.gmail.com> @ 2026-04-21 21:47 ` Link Lin 2026-04-21 21:51 ` Michael S. Tsirkin 2026-04-21 21:50 ` Michael S. Tsirkin 1 sibling, 1 reply; 8+ messages in thread From: Link Lin @ 2026-04-21 21:47 UTC (permalink / raw) To: mst, jasowang, xuanzhuo Cc: eperezma, jiaqiyan, rientjes, weixugc, virtualization, linux-kernel Hi everyone, Friendly ping. Apologies if you are getting this the second time - my last ping wasn't in plain text mode and got rejected by some mailing lists. Please let me know if anyone has had a chance to look at this RFC patch, or if any changes are needed. Thanks, Link On Tue, Apr 21, 2026 at 2:24 PM Link Lin <linkl@google.com> wrote: > > Hi everyone, > > Friendly ping on this RFC patch. Please let me know if anyone has had a chance to look at this, or if any changes are needed. > > Thanks, > Link > > On Tue, Apr 7, 2026 at 2:25 PM Link Lin <linkl@google.com> wrote: >> >> In environments where free page reporting is disabled, a kernel >> panic is triggered when tearing down the virtio_balloon module: >> >> [12261.808190] Call trace: >> [12261.808471] __list_del_entry_valid_or_report+0x18/0xe0 >> [12261.809064] vp_del_vqs+0x12c/0x270 >> [12261.809462] remove_common+0x80/0x98 [virtio_balloon] >> [12261.810034] virtballoon_remove+0xfc/0x158 [virtio_balloon] >> [12261.810663] virtio_dev_remove+0x68/0xf8 >> [12261.811108] device_release_driver_internal+0x17c/0x278 >> [12261.811701] driver_detach+0xd4/0x138 >> [12261.812117] bus_remove_driver+0x90/0xd0 >> [12261.812562] driver_unregister+0x40/0x70 >> [12261.813006] unregister_virtio_driver+0x20/0x38 >> [12261.813518] cleanup_module+0x20/0x7a8 [virtio_balloon] >> [12261.814109] __arm64_sys_delete_module+0x278/0x3d0 >> [12261.814654] invoke_syscall+0x5c/0x120 >> [12261.815086] el0_svc_common+0x90/0xf8 >> [12261.815506] do_el0_svc+0x2c/0x48 >> 
[12261.815883] el0_svc+0x3c/0xa8 >> [12261.816235] el0t_64_sync_handler+0x8c/0x108 >> [12261.816724] el0t_64_sync+0x198/0x1a0 >> >> The issue originates in vp_find_vqs_intx(). It kzalloc_objs() based >> on the nvqs count provided by the caller, virtio_balloon::init_vqs(). >> However, it is not always the case that all nvqs number of >> virtio_pci_vq_info objects will be properly populated. >> >> For example, when VIRTIO_BALLOON_F_FREE_PAGE_HINT is absent, the >> VIRTIO_BALLOON_VQ_FREE_PAGE-th item in the vp_dev->vqs array is >> actually never populated, and is still a zeroe-initialized >> virtio_pci_vq_info object, which is eventually going to trigger >> a __list_del_entry_valid_or_report() crash. >> >> Tested by applying this patch to a guest VM kernel with the >> VIRTIO_BALLOON_F_REPORTING feature enabled and the >> VIRTIO_BALLOON_F_FREE_PAGE_HINT feature disabled. >> Without this patch, unloading the virtio_balloon module triggers a panic. >> With this patch, no panic is observed. >> >> The fix is to use queue_idx to handle the case that vp_find_vqs_intx() >> skips vp_setup_vq() when caller provided null vqs_info[i].name, when >> the caller doesn't populate all nvqs number of virtqueue_info objects. >> Invariantly queue_idx is the correct index to store a successfully >> created and populated virtio_pci_vq_info object. As a result, now >> a virtio_pci_device object only stores queue_idx number of valid >> virtio_pci_vq_info objects in its vqs array when the for-loop over >> nvqs finishes (of course, without goto out_del_vqs). >> >> vp_find_vqs_msix() has similar issue, so fix it in the same way. >> >> This patch is marked as RFC because we are uncertain if any virtio-pci >> code implicitly requires virtio_pci_device's vqs array to always >> contain nvqs number of virtio_pci_vq_info objects, and to store >> zero-initialized virtio_pci_vq_info objects. We have not observed >> any issues in our testing, but insights or alternatives are welcome! 
>> >> Signed-off-by: Link Lin <linkl@google.com> >> Co-developed-by: Jiaqi Yan <jiaqiyan@google.com> >> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> >> --- >> drivers/virtio/virtio_pci_common.c | 10 ++++++---- >> 1 file changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c >> index da97b6a988de..9b32301529e5 100644 >> --- a/drivers/virtio/virtio_pci_common.c >> +++ b/drivers/virtio/virtio_pci_common.c >> @@ -423,14 +423,15 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs, >> vqs[i] = NULL; >> continue; >> } >> - vqs[i] = vp_find_one_vq_msix(vdev, queue_idx++, vqi->callback, >> + vqs[i] = vp_find_one_vq_msix(vdev, queue_idx, vqi->callback, >> vqi->name, vqi->ctx, false, >> &allocated_vectors, vector_policy, >> - &vp_dev->vqs[i]); >> + &vp_dev->vqs[queue_idx]); >> if (IS_ERR(vqs[i])) { >> err = PTR_ERR(vqs[i]); >> goto error_find; >> } >> + ++queue_idx; >> } >> >> if (!avq_num) >> @@ -485,13 +486,14 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs, >> vqs[i] = NULL; >> continue; >> } >> - vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback, >> + vqs[i] = vp_setup_vq(vdev, queue_idx, vqi->callback, >> vqi->name, vqi->ctx, >> - VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[i]); >> + VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[queue_idx]); >> if (IS_ERR(vqs[i])) { >> err = PTR_ERR(vqs[i]); >> goto out_del_vqs; >> } >> + ++queue_idx; >> } >> >> if (!avq_num) >> -- >> 2.53.0.1213.gd9a14994de-goog ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info 2026-04-21 21:47 ` Link Lin @ 2026-04-21 21:51 ` Michael S. Tsirkin 2026-04-21 22:15 ` Link Lin 0 siblings, 1 reply; 8+ messages in thread From: Michael S. Tsirkin @ 2026-04-21 21:51 UTC (permalink / raw) To: Link Lin Cc: jasowang, xuanzhuo, eperezma, jiaqiyan, rientjes, weixugc, virtualization, linux-kernel On Tue, Apr 21, 2026 at 02:47:32PM -0700, Link Lin wrote: > Hi everyone, > > Friendly ping. Apologies if you are getting this the second time - my > last ping wasn't in plain text mode and got rejected by some mailing > lists. > > Please let me know if anyone has had a chance to look at this RFC > patch, or if any changes are needed. > > Thanks, > Link > > On Tue, Apr 21, 2026 at 2:24 PM Link Lin <linkl@google.com> wrote: > > > > Hi everyone, > > > > Friendly ping on this RFC patch. Please let me know if anyone has had a chance to look at this, or if any changes are needed. > > > > Thanks, > > Link > > > > On Tue, Apr 7, 2026 at 2:25 PM Link Lin <linkl@google.com> wrote: > >> > >> In environments where free page reporting is disabled, a kernel > >> panic is triggered when tearing down the virtio_balloon module: > >> > >> [12261.808190] Call trace: > >> [12261.808471] __list_del_entry_valid_or_report+0x18/0xe0 > >> [12261.809064] vp_del_vqs+0x12c/0x270 > >> [12261.809462] remove_common+0x80/0x98 [virtio_balloon] > >> [12261.810034] virtballoon_remove+0xfc/0x158 [virtio_balloon] > >> [12261.810663] virtio_dev_remove+0x68/0xf8 > >> [12261.811108] device_release_driver_internal+0x17c/0x278 > >> [12261.811701] driver_detach+0xd4/0x138 > >> [12261.812117] bus_remove_driver+0x90/0xd0 > >> [12261.812562] driver_unregister+0x40/0x70 > >> [12261.813006] unregister_virtio_driver+0x20/0x38 > >> [12261.813518] cleanup_module+0x20/0x7a8 [virtio_balloon] > >> [12261.814109] __arm64_sys_delete_module+0x278/0x3d0 > >> [12261.814654] invoke_syscall+0x5c/0x120 > >> [12261.815086] 
el0_svc_common+0x90/0xf8 > >> [12261.815506] do_el0_svc+0x2c/0x48 > >> [12261.815883] el0_svc+0x3c/0xa8 > >> [12261.816235] el0t_64_sync_handler+0x8c/0x108 > >> [12261.816724] el0t_64_sync+0x198/0x1a0 > >> > >> The issue originates in vp_find_vqs_intx(). It kzalloc_objs() based > >> on the nvqs count provided by the caller, virtio_balloon::init_vqs(). > >> However, it is not always the case that all nvqs number of > >> virtio_pci_vq_info objects will be properly populated. > >> > >> For example, when VIRTIO_BALLOON_F_FREE_PAGE_HINT is absent, the > >> VIRTIO_BALLOON_VQ_FREE_PAGE-th item in the vp_dev->vqs array is > >> actually never populated, and is still a zeroe-initialized > >> virtio_pci_vq_info object, which is eventually going to trigger > >> a __list_del_entry_valid_or_report() crash. > >> > >> Tested by applying this patch to a guest VM kernel with the > >> VIRTIO_BALLOON_F_REPORTING feature enabled and the > >> VIRTIO_BALLOON_F_FREE_PAGE_HINT feature disabled. > >> Without this patch, unloading the virtio_balloon module triggers a panic. > >> With this patch, no panic is observed. > >> > >> The fix is to use queue_idx to handle the case that vp_find_vqs_intx() > >> skips vp_setup_vq() when caller provided null vqs_info[i].name, when > >> the caller doesn't populate all nvqs number of virtqueue_info objects. > >> Invariantly queue_idx is the correct index to store a successfully > >> created and populated virtio_pci_vq_info object. As a result, now > >> a virtio_pci_device object only stores queue_idx number of valid > >> virtio_pci_vq_info objects in its vqs array when the for-loop over > >> nvqs finishes (of course, without goto out_del_vqs). > >> > >> vp_find_vqs_msix() has similar issue, so fix it in the same way. 
> >> > >> This patch is marked as RFC because we are uncertain if any virtio-pci > >> code implicitly requires virtio_pci_device's vqs array to always > >> contain nvqs number of virtio_pci_vq_info objects, and to store > >> zero-initialized virtio_pci_vq_info objects. We have not observed > >> any issues in our testing, but insights or alternatives are welcome! > >> > >> Signed-off-by: Link Lin <linkl@google.com> > >> Co-developed-by: Jiaqi Yan <jiaqiyan@google.com> > >> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> > >> --- > >> drivers/virtio/virtio_pci_common.c | 10 ++++++---- > >> 1 file changed, 6 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c > >> index da97b6a988de..9b32301529e5 100644 > >> --- a/drivers/virtio/virtio_pci_common.c > >> +++ b/drivers/virtio/virtio_pci_common.c > >> @@ -423,14 +423,15 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs, > >> vqs[i] = NULL; > >> continue; > >> } > >> - vqs[i] = vp_find_one_vq_msix(vdev, queue_idx++, vqi->callback, > >> + vqs[i] = vp_find_one_vq_msix(vdev, queue_idx, vqi->callback, > >> vqi->name, vqi->ctx, false, > >> &allocated_vectors, vector_policy, > >> - &vp_dev->vqs[i]); > >> + &vp_dev->vqs[queue_idx]); > >> if (IS_ERR(vqs[i])) { > >> err = PTR_ERR(vqs[i]); > >> goto error_find; > >> } > >> + ++queue_idx; > >> } > >> > >> if (!avq_num) > >> @@ -485,13 +486,14 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs, > >> vqs[i] = NULL; > >> continue; > >> } > >> - vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback, > >> + vqs[i] = vp_setup_vq(vdev, queue_idx, vqi->callback, > >> vqi->name, vqi->ctx, > >> - VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[i]); > >> + VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[queue_idx]); > >> if (IS_ERR(vqs[i])) { > >> err = PTR_ERR(vqs[i]); > >> goto out_del_vqs; > >> } > >> + ++queue_idx; > >> } > >> > >> if (!avq_num) > >> -- > >> 2.53.0.1213.gd9a14994de-goog I 
have this in my tree: https://lore.kernel.org/all/20260315141808.547081-1-ammarfaizi2@openresty.com/ same? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info 2026-04-21 21:51 ` Michael S. Tsirkin @ 2026-04-21 22:15 ` Link Lin 2026-04-21 22:16 ` Michael S. Tsirkin 0 siblings, 1 reply; 8+ messages in thread From: Link Lin @ 2026-04-21 22:15 UTC (permalink / raw) To: Michael S. Tsirkin Cc: jasowang, xuanzhuo, eperezma, jiaqiyan, rientjes, weixugc, virtualization, linux-kernel, ammarfaizi2 Hi Michael, That's essentially the same fix as ours but one month earlier. That's great news! Quick follow-up question: Do you know if this will make it into v7.1? I also noticed the patch has the "Cc: stable@vger.kernel.org # v6.11+" tag, which is perfect since we actually hit this bug in v6.12. I wasn't aware of Ammar's patch as I wasn't subscribed to the mailing list. Looks like we tried to reinvent the wheel ^_^; Sincerely, Link On Tue, Apr 21, 2026 at 2:51 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > On Tue, Apr 21, 2026 at 02:47:32PM -0700, Link Lin wrote: > > Hi everyone, > > > > Friendly ping. Apologies if you are getting this the second time - my > > last ping wasn't in plain text mode and got rejected by some mailing > > lists. > > > > Please let me know if anyone has had a chance to look at this RFC > > patch, or if any changes are needed. > > > > Thanks, > > Link > > > > On Tue, Apr 21, 2026 at 2:24 PM Link Lin <linkl@google.com> wrote: > > > > > > Hi everyone, > > > > > > Friendly ping on this RFC patch. Please let me know if anyone has had a chance to look at this, or if any changes are needed. 
> > > > > > Thanks, > > > Link > > > > > > On Tue, Apr 7, 2026 at 2:25 PM Link Lin <linkl@google.com> wrote: > > >> > > >> In environments where free page reporting is disabled, a kernel > > >> panic is triggered when tearing down the virtio_balloon module: > > >> > > >> [12261.808190] Call trace: > > >> [12261.808471] __list_del_entry_valid_or_report+0x18/0xe0 > > >> [12261.809064] vp_del_vqs+0x12c/0x270 > > >> [12261.809462] remove_common+0x80/0x98 [virtio_balloon] > > >> [12261.810034] virtballoon_remove+0xfc/0x158 [virtio_balloon] > > >> [12261.810663] virtio_dev_remove+0x68/0xf8 > > >> [12261.811108] device_release_driver_internal+0x17c/0x278 > > >> [12261.811701] driver_detach+0xd4/0x138 > > >> [12261.812117] bus_remove_driver+0x90/0xd0 > > >> [12261.812562] driver_unregister+0x40/0x70 > > >> [12261.813006] unregister_virtio_driver+0x20/0x38 > > >> [12261.813518] cleanup_module+0x20/0x7a8 [virtio_balloon] > > >> [12261.814109] __arm64_sys_delete_module+0x278/0x3d0 > > >> [12261.814654] invoke_syscall+0x5c/0x120 > > >> [12261.815086] el0_svc_common+0x90/0xf8 > > >> [12261.815506] do_el0_svc+0x2c/0x48 > > >> [12261.815883] el0_svc+0x3c/0xa8 > > >> [12261.816235] el0t_64_sync_handler+0x8c/0x108 > > >> [12261.816724] el0t_64_sync+0x198/0x1a0 > > >> > > >> The issue originates in vp_find_vqs_intx(). It kzalloc_objs() based > > >> on the nvqs count provided by the caller, virtio_balloon::init_vqs(). > > >> However, it is not always the case that all nvqs number of > > >> virtio_pci_vq_info objects will be properly populated. > > >> > > >> For example, when VIRTIO_BALLOON_F_FREE_PAGE_HINT is absent, the > > >> VIRTIO_BALLOON_VQ_FREE_PAGE-th item in the vp_dev->vqs array is > > >> actually never populated, and is still a zeroe-initialized > > >> virtio_pci_vq_info object, which is eventually going to trigger > > >> a __list_del_entry_valid_or_report() crash. 
> > >> > > >> Tested by applying this patch to a guest VM kernel with the > > >> VIRTIO_BALLOON_F_REPORTING feature enabled and the > > >> VIRTIO_BALLOON_F_FREE_PAGE_HINT feature disabled. > > >> Without this patch, unloading the virtio_balloon module triggers a panic. > > >> With this patch, no panic is observed. > > >> > > >> The fix is to use queue_idx to handle the case that vp_find_vqs_intx() > > >> skips vp_setup_vq() when caller provided null vqs_info[i].name, when > > >> the caller doesn't populate all nvqs number of virtqueue_info objects. > > >> Invariantly queue_idx is the correct index to store a successfully > > >> created and populated virtio_pci_vq_info object. As a result, now > > >> a virtio_pci_device object only stores queue_idx number of valid > > >> virtio_pci_vq_info objects in its vqs array when the for-loop over > > >> nvqs finishes (of course, without goto out_del_vqs). > > >> > > >> vp_find_vqs_msix() has similar issue, so fix it in the same way. > > >> > > >> This patch is marked as RFC because we are uncertain if any virtio-pci > > >> code implicitly requires virtio_pci_device's vqs array to always > > >> contain nvqs number of virtio_pci_vq_info objects, and to store > > >> zero-initialized virtio_pci_vq_info objects. We have not observed > > >> any issues in our testing, but insights or alternatives are welcome! 
> > >> > > >> Signed-off-by: Link Lin <linkl@google.com> > > >> Co-developed-by: Jiaqi Yan <jiaqiyan@google.com> > > >> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> > > >> --- > > >> drivers/virtio/virtio_pci_common.c | 10 ++++++---- > > >> 1 file changed, 6 insertions(+), 4 deletions(-) > > >> > > >> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c > > >> index da97b6a988de..9b32301529e5 100644 > > >> --- a/drivers/virtio/virtio_pci_common.c > > >> +++ b/drivers/virtio/virtio_pci_common.c > > >> @@ -423,14 +423,15 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs, > > >> vqs[i] = NULL; > > >> continue; > > >> } > > >> - vqs[i] = vp_find_one_vq_msix(vdev, queue_idx++, vqi->callback, > > >> + vqs[i] = vp_find_one_vq_msix(vdev, queue_idx, vqi->callback, > > >> vqi->name, vqi->ctx, false, > > >> &allocated_vectors, vector_policy, > > >> - &vp_dev->vqs[i]); > > >> + &vp_dev->vqs[queue_idx]); > > >> if (IS_ERR(vqs[i])) { > > >> err = PTR_ERR(vqs[i]); > > >> goto error_find; > > >> } > > >> + ++queue_idx; > > >> } > > >> > > >> if (!avq_num) > > >> @@ -485,13 +486,14 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs, > > >> vqs[i] = NULL; > > >> continue; > > >> } > > >> - vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback, > > >> + vqs[i] = vp_setup_vq(vdev, queue_idx, vqi->callback, > > >> vqi->name, vqi->ctx, > > >> - VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[i]); > > >> + VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[queue_idx]); > > >> if (IS_ERR(vqs[i])) { > > >> err = PTR_ERR(vqs[i]); > > >> goto out_del_vqs; > > >> } > > >> + ++queue_idx; > > >> } > > >> > > >> if (!avq_num) > > >> -- > > >> 2.53.0.1213.gd9a14994de-goog > > > I have this in my tree: > > https://lore.kernel.org/all/20260315141808.547081-1-ammarfaizi2@openresty.com/ > > > same? > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info 2026-04-21 22:15 ` Link Lin @ 2026-04-21 22:16 ` Michael S. Tsirkin 2026-04-26 14:28 ` Ammar Faizi 0 siblings, 1 reply; 8+ messages in thread From: Michael S. Tsirkin @ 2026-04-21 22:16 UTC (permalink / raw) To: Link Lin Cc: jasowang, xuanzhuo, eperezma, jiaqiyan, rientjes, weixugc, virtualization, linux-kernel, ammarfaizi2 On Tue, Apr 21, 2026 at 03:15:55PM -0700, Link Lin wrote: > Hi Michael, > > That's essentially the same fix as ours but one month earlier. That's > great news! > > Quick follow-up question: Do you know if this will make it into v7.1? I'll try. > I also noticed the patch has the "Cc: stable@vger.kernel.org # v6.11+" > tag, which is perfect since we actually hit this bug in v6.12. > > I wasn't aware of Ammar's patch as I wasn't subscribed to the mailing > list. Looks like we tried to reinvent the wheel ^_^; > > Sincerely, > Link > > > On Tue, Apr 21, 2026 at 2:51 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > On Tue, Apr 21, 2026 at 02:47:32PM -0700, Link Lin wrote: > > > Hi everyone, > > > > > > Friendly ping. Apologies if you are getting this the second time - my > > > last ping wasn't in plain text mode and got rejected by some mailing > > > lists. > > > > > > Please let me know if anyone has had a chance to look at this RFC > > > patch, or if any changes are needed. > > > > > > Thanks, > > > Link > > > > > > On Tue, Apr 21, 2026 at 2:24 PM Link Lin <linkl@google.com> wrote: > > > > > > > > Hi everyone, > > > > > > > > Friendly ping on this RFC patch. Please let me know if anyone has had a chance to look at this, or if any changes are needed. 
> > > > > > > > Thanks, > > > > Link > > > > > > > > On Tue, Apr 7, 2026 at 2:25 PM Link Lin <linkl@google.com> wrote: > > > >> > > > >> In environments where free page reporting is disabled, a kernel > > > >> panic is triggered when tearing down the virtio_balloon module: > > > >> > > > >> [12261.808190] Call trace: > > > >> [12261.808471] __list_del_entry_valid_or_report+0x18/0xe0 > > > >> [12261.809064] vp_del_vqs+0x12c/0x270 > > > >> [12261.809462] remove_common+0x80/0x98 [virtio_balloon] > > > >> [12261.810034] virtballoon_remove+0xfc/0x158 [virtio_balloon] > > > >> [12261.810663] virtio_dev_remove+0x68/0xf8 > > > >> [12261.811108] device_release_driver_internal+0x17c/0x278 > > > >> [12261.811701] driver_detach+0xd4/0x138 > > > >> [12261.812117] bus_remove_driver+0x90/0xd0 > > > >> [12261.812562] driver_unregister+0x40/0x70 > > > >> [12261.813006] unregister_virtio_driver+0x20/0x38 > > > >> [12261.813518] cleanup_module+0x20/0x7a8 [virtio_balloon] > > > >> [12261.814109] __arm64_sys_delete_module+0x278/0x3d0 > > > >> [12261.814654] invoke_syscall+0x5c/0x120 > > > >> [12261.815086] el0_svc_common+0x90/0xf8 > > > >> [12261.815506] do_el0_svc+0x2c/0x48 > > > >> [12261.815883] el0_svc+0x3c/0xa8 > > > >> [12261.816235] el0t_64_sync_handler+0x8c/0x108 > > > >> [12261.816724] el0t_64_sync+0x198/0x1a0 > > > >> > > > >> The issue originates in vp_find_vqs_intx(). It kzalloc_objs() based > > > >> on the nvqs count provided by the caller, virtio_balloon::init_vqs(). > > > >> However, it is not always the case that all nvqs number of > > > >> virtio_pci_vq_info objects will be properly populated. > > > >> > > > >> For example, when VIRTIO_BALLOON_F_FREE_PAGE_HINT is absent, the > > > >> VIRTIO_BALLOON_VQ_FREE_PAGE-th item in the vp_dev->vqs array is > > > >> actually never populated, and is still a zeroe-initialized > > > >> virtio_pci_vq_info object, which is eventually going to trigger > > > >> a __list_del_entry_valid_or_report() crash. 
> > > >> > > > >> Tested by applying this patch to a guest VM kernel with the > > > >> VIRTIO_BALLOON_F_REPORTING feature enabled and the > > > >> VIRTIO_BALLOON_F_FREE_PAGE_HINT feature disabled. > > > >> Without this patch, unloading the virtio_balloon module triggers a panic. > > > >> With this patch, no panic is observed. > > > >> > > > >> The fix is to use queue_idx to handle the case that vp_find_vqs_intx() > > > >> skips vp_setup_vq() when caller provided null vqs_info[i].name, when > > > >> the caller doesn't populate all nvqs number of virtqueue_info objects. > > > >> Invariantly queue_idx is the correct index to store a successfully > > > >> created and populated virtio_pci_vq_info object. As a result, now > > > >> a virtio_pci_device object only stores queue_idx number of valid > > > >> virtio_pci_vq_info objects in its vqs array when the for-loop over > > > >> nvqs finishes (of course, without goto out_del_vqs). > > > >> > > > >> vp_find_vqs_msix() has similar issue, so fix it in the same way. > > > >> > > > >> This patch is marked as RFC because we are uncertain if any virtio-pci > > > >> code implicitly requires virtio_pci_device's vqs array to always > > > >> contain nvqs number of virtio_pci_vq_info objects, and to store > > > >> zero-initialized virtio_pci_vq_info objects. We have not observed > > > >> any issues in our testing, but insights or alternatives are welcome! 
> > > >> > > > >> Signed-off-by: Link Lin <linkl@google.com> > > > >> Co-developed-by: Jiaqi Yan <jiaqiyan@google.com> > > > >> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com> > > > >> --- > > > >> drivers/virtio/virtio_pci_common.c | 10 ++++++---- > > > >> 1 file changed, 6 insertions(+), 4 deletions(-) > > > >> > > > >> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c > > > >> index da97b6a988de..9b32301529e5 100644 > > > >> --- a/drivers/virtio/virtio_pci_common.c > > > >> +++ b/drivers/virtio/virtio_pci_common.c > > > >> @@ -423,14 +423,15 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs, > > > >> vqs[i] = NULL; > > > >> continue; > > > >> } > > > >> - vqs[i] = vp_find_one_vq_msix(vdev, queue_idx++, vqi->callback, > > > >> + vqs[i] = vp_find_one_vq_msix(vdev, queue_idx, vqi->callback, > > > >> vqi->name, vqi->ctx, false, > > > >> &allocated_vectors, vector_policy, > > > >> - &vp_dev->vqs[i]); > > > >> + &vp_dev->vqs[queue_idx]); > > > >> if (IS_ERR(vqs[i])) { > > > >> err = PTR_ERR(vqs[i]); > > > >> goto error_find; > > > >> } > > > >> + ++queue_idx; > > > >> } > > > >> > > > >> if (!avq_num) > > > >> @@ -485,13 +486,14 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs, > > > >> vqs[i] = NULL; > > > >> continue; > > > >> } > > > >> - vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback, > > > >> + vqs[i] = vp_setup_vq(vdev, queue_idx, vqi->callback, > > > >> vqi->name, vqi->ctx, > > > >> - VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[i]); > > > >> + VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[queue_idx]); > > > >> if (IS_ERR(vqs[i])) { > > > >> err = PTR_ERR(vqs[i]); > > > >> goto out_del_vqs; > > > >> } > > > >> + ++queue_idx; > > > >> } > > > >> > > > >> if (!avq_num) > > > >> -- > > > >> 2.53.0.1213.gd9a14994de-goog > > > > > > I have this in my tree: > > > > https://lore.kernel.org/all/20260315141808.547081-1-ammarfaizi2@openresty.com/ > > > > > > same? 
> > > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info 2026-04-21 22:16 ` Michael S. Tsirkin @ 2026-04-26 14:28 ` Ammar Faizi 2026-04-26 14:57 ` Michael S. Tsirkin 0 siblings, 1 reply; 8+ messages in thread From: Ammar Faizi @ 2026-04-26 14:28 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Link Lin, jasowang, xuanzhuo, eperezma, jiaqiyan, rientjes, weixugc, virtualization, linux-kernel On Wed, Apr 22, 2026 at 5:16 AM Michael S. Tsirkin wrote: > On Tue, Apr 21, 2026 at 03:15:55PM -0700, Link Lin wrote: > > Quick follow-up question: Do you know if this will make it into v7.1? > > I'll try. It seems it's already too late for the 7.1 merge window. Maybe 7.2? -- Ammar Faizi ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info 2026-04-26 14:28 ` Ammar Faizi @ 2026-04-26 14:57 ` Michael S. Tsirkin 0 siblings, 0 replies; 8+ messages in thread From: Michael S. Tsirkin @ 2026-04-26 14:57 UTC (permalink / raw) To: Ammar Faizi Cc: Link Lin, jasowang, xuanzhuo, eperezma, jiaqiyan, rientjes, weixugc, virtualization, linux-kernel On Sun, Apr 26, 2026 at 09:28:15PM +0700, Ammar Faizi wrote: > On Wed, Apr 22, 2026 at 5:16 AM Michael S. Tsirkin wrote: > > On Tue, Apr 21, 2026 at 03:15:55PM -0700, Link Lin wrote: > > > Quick follow-up question: Do you know if this will make it into v7.1? > > > > I'll try. > > It seems it's already too late for the 7.1 merge window. Maybe 7.2? > > -- > Ammar Faizi It's a bugfix, no? never late. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info
       [not found] ` <CALUx4KQKNYZm5AZzQXNqLRdGAT0nQOpmn_Lz6WHie73w1d9JQA@mail.gmail.com>
  2026-04-21 21:47   ` Link Lin
@ 2026-04-21 21:50   ` Michael S. Tsirkin
  1 sibling, 0 replies; 8+ messages in thread
From: Michael S. Tsirkin @ 2026-04-21 21:50 UTC (permalink / raw)
To: Link Lin
Cc: jasowang, xuanzhuo, eperezma, jiaqiyan, rientjes, weixugc,
    virtualization, linux-kernel

On Tue, Apr 21, 2026 at 02:24:16PM -0700, Link Lin wrote:
> Hi everyone,
>
> Friendly ping on this RFC patch. Please let me know if anyone has had a chance
> to look at this, or if any changes are needed.
>
> Thanks,
> Link
>
> On Tue, Apr 7, 2026 at 2:25 PM Link Lin <linkl@google.com> wrote:
>
> In environments where free page reporting is disabled, a kernel
> panic is triggered when tearing down the virtio_balloon module:
>
> [12261.808190] Call trace:
> [12261.808471]  __list_del_entry_valid_or_report+0x18/0xe0
> [12261.809064]  vp_del_vqs+0x12c/0x270
> [12261.809462]  remove_common+0x80/0x98 [virtio_balloon]
> [12261.810034]  virtballoon_remove+0xfc/0x158 [virtio_balloon]
> [12261.810663]  virtio_dev_remove+0x68/0xf8
> [12261.811108]  device_release_driver_internal+0x17c/0x278
> [12261.811701]  driver_detach+0xd4/0x138
> [12261.812117]  bus_remove_driver+0x90/0xd0
> [12261.812562]  driver_unregister+0x40/0x70
> [12261.813006]  unregister_virtio_driver+0x20/0x38
> [12261.813518]  cleanup_module+0x20/0x7a8 [virtio_balloon]
> [12261.814109]  __arm64_sys_delete_module+0x278/0x3d0
> [12261.814654]  invoke_syscall+0x5c/0x120
> [12261.815086]  el0_svc_common+0x90/0xf8
> [12261.815506]  do_el0_svc+0x2c/0x48
> [12261.815883]  el0_svc+0x3c/0xa8
> [12261.816235]  el0t_64_sync_handler+0x8c/0x108
> [12261.816724]  el0t_64_sync+0x198/0x1a0
>
> The issue originates in vp_find_vqs_intx(), which allocates the
> vp_dev->vqs array via kzalloc_objs() based on the nvqs count
> provided by the caller, virtio_balloon::init_vqs(). However, it is
> not guaranteed that all nvqs virtio_pci_vq_info objects will be
> populated.
>
> For example, when VIRTIO_BALLOON_F_FREE_PAGE_HINT is absent, the
> VIRTIO_BALLOON_VQ_FREE_PAGE-th item in the vp_dev->vqs array is
> never populated and remains a zero-initialized virtio_pci_vq_info
> object, which eventually triggers a crash in
> __list_del_entry_valid_or_report().
>
> Tested by applying this patch to a guest VM kernel with the
> VIRTIO_BALLOON_F_REPORTING feature enabled and the
> VIRTIO_BALLOON_F_FREE_PAGE_HINT feature disabled.
> Without this patch, unloading the virtio_balloon module triggers a panic.
> With this patch, no panic is observed.
>
> The fix is to use queue_idx as the index into vp_dev->vqs, which
> handles the case where vp_find_vqs_intx() skips vp_setup_vq()
> because the caller provided a NULL vqs_info[i].name, i.e. the caller
> does not populate all nvqs virtqueue_info objects. queue_idx is
> invariantly the correct index at which to store a successfully
> created and populated virtio_pci_vq_info object. As a result, a
> virtio_pci_device object now stores exactly queue_idx valid
> virtio_pci_vq_info objects in its vqs array when the for-loop over
> nvqs finishes (barring the goto out_del_vqs error path).
>
> vp_find_vqs_msix() has a similar issue, so fix it in the same way.
>
> This patch is marked as RFC because we are uncertain whether any
> virtio-pci code implicitly requires virtio_pci_device's vqs array to
> always contain nvqs virtio_pci_vq_info objects, including the
> zero-initialized ones. We have not observed any issues in our
> testing, but insights or alternatives are welcome!
>
> Signed-off-by: Link Lin <linkl@google.com>
> Co-developed-by: Jiaqi Yan <jiaqiyan@google.com>
> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
> ---
>  drivers/virtio/virtio_pci_common.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> index da97b6a988de..9b32301529e5 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -423,14 +423,15 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs,
>  			vqs[i] = NULL;
>  			continue;
>  		}
> -		vqs[i] = vp_find_one_vq_msix(vdev, queue_idx++, vqi->callback,
> +		vqs[i] = vp_find_one_vq_msix(vdev, queue_idx, vqi->callback,
>  					     vqi->name, vqi->ctx, false,
>  					     &allocated_vectors,
>  					     vector_policy,
> -					     &vp_dev->vqs[i]);
> +					     &vp_dev->vqs[queue_idx]);
>  		if (IS_ERR(vqs[i])) {
>  			err = PTR_ERR(vqs[i]);
>  			goto error_find;
>  		}
> +		++queue_idx;
>  	}
>
>  	if (!avq_num)
> @@ -485,13 +486,14 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs,
>  			vqs[i] = NULL;
>  			continue;
>  		}
> -		vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback,
> +		vqs[i] = vp_setup_vq(vdev, queue_idx, vqi->callback,
>  				     vqi->name, vqi->ctx,
> -				     VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[i]);
> +				     VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[queue_idx]);
>  		if (IS_ERR(vqs[i])) {
>  			err = PTR_ERR(vqs[i]);
>  			goto out_del_vqs;
>  		}
> +		++queue_idx;
>  	}
>
>  	if (!avq_num)
> --
> 2.53.0.1213.gd9a14994de-goog
>

I have this in my tree:
https://lore.kernel.org/all/20260315141808.547081-1-ammarfaizi2@openresty.com/
same?

--
MST

^ permalink raw reply	[flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-04-26 14:57 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-07 21:25 [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info Link Lin
[not found] ` <CALUx4KQKNYZm5AZzQXNqLRdGAT0nQOpmn_Lz6WHie73w1d9JQA@mail.gmail.com>
2026-04-21 21:47 ` Link Lin
2026-04-21 21:51 ` Michael S. Tsirkin
2026-04-21 22:15 ` Link Lin
2026-04-21 22:16 ` Michael S. Tsirkin
2026-04-26 14:28 ` Ammar Faizi
2026-04-26 14:57 ` Michael S. Tsirkin
2026-04-21 21:50 ` Michael S. Tsirkin