* Re: [RFC PATCH 0/6] Support virtio-mem memory hotplug in TDX guests
From: Kiryl Shutsemau @ 2026-06-12 12:16 UTC (permalink / raw)
To: Zhenzhong Duan
Cc: marcandre.lureau, david, rick.p.edgecombe, prsampat, pbonzini,
mst, peterx, chenyi.qiang, elena.reshetova, michaeluth,
ackerleytng, linux-kernel, linux-coco, virtualization, x86,
yilun.xu, xiaoyao.li, chao.p.peng
In-Reply-To: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
On Thu, Jun 04, 2026 at 05:35:45AM -0400, Zhenzhong Duan wrote:
> 2. Re-accepting already-accepted memory returns errors. Ignoring these errors
> can mislead the guest into believing re-accepted memory is zeroed when it
> contains stale data.
The re-accept concern is valid, but often overblown. Re-accepting memory
that was never allocated is fine.
> == About this series ==
>
> This series takes a different direction, supporting start-private memory
> and addressing the limitations of previous series [1] by implementing a
> callback-based infrastructure that integrates TDX memory acceptance and
> release operations with proper subblock granularity.
You are presenting these callbacks as a generic memory hotplug mechanism,
but they are only plugged into virtio-mem. ACPI hotplug won't accept/release
memory, unless I'm missing something. Are you expecting them to cover
non-virtio cases too?
And these callbacks feel like a very ad-hoc solution.
> See Rick and Paolo's
> discussion about using TDG.MEM.PAGE.RELEASE in [1].
Having RELEASE in the hotplug path without addressing private->shared
conversion first is odd. That's the most obvious path that has to be
covered first.
Hm?
> == Future work ==
> support lazy accept
It would be nice to have an outline of how we will get there, to
understand whether this patchset is a stepping stone or a dead end that
has to be thrown away later on.
Hot[un]plug is often used to manage an overcommitted host. Eager accept
might be counter-productive.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply
* Re: [PATCH] drm: Consistently define pci_device_ids using named initializers
From: Thomas Zimmermann @ 2026-06-12 12:10 UTC (permalink / raw)
To: Uwe Kleine-König (The Capable Hub), Maarten Lankhorst,
Maxime Ripard, David Airlie, Simona Vetter, Gerd Hoffmann
Cc: Markus Schneider-Pargmann, Patrik Jakobsson, Jianmin Lv,
Qianhai Wu, Huacai Chen, Mingcong Bai, Xi Ruoyao, Icenowy Zheng,
Dave Airlie, Jocelyn Falempe, dri-devel, linux-kernel,
virtualization, spice-devel
In-Reply-To: <20260504150537.2136760-2-u.kleine-koenig@baylibre.com>
Hi
On 04.05.26 at 17:05, Uwe Kleine-König (The Capable Hub) wrote:
> The .driver_data member of the various struct pci_device_id arrays were
> initialized by list expressions. This isn't easily readable if you're
> not into PCI. Using the PCI_DEVICE macro and named initializers is more
> explicit and thus easier to parse. Also skip explicit assignments of 0
> (which the compiler then takes care of).
>
> This change doesn't introduce changes to the compiled pci_device_id
> arrays. Tested on x86 and arm64.
>
> Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
I'll merge the patch into drm-misc-next.
Best regards
Thomas
> ---
> Hello,
>
> The secret plan is to make struct pci_device_id::driver_data an
> anonymous union (similar to
> https://lore.kernel.org/all/cover.1776579304.git.u.kleine-koenig@baylibre.com/)
> and that requires named initializers. But IMHO it's also a nice cleanup
> on its own.
>
> The anonymous union will allow changes like the following:
>
> - { PCI_DEVICE(0x8086, 0x8108), .driver_data = (long) &psb_chip_ops },
> + { PCI_DEVICE(0x8086, 0x8108), .driver_data_ptr = &psb_chip_ops },
>
> (together with the respective change in the code when the value is
> used). This gets rid of a bunch of casts and thus slightly improves
> type safety.
>
> Best regards
> Uwe
>
> drivers/gpu/drm/gma500/psb_drv.c | 56 +++++++++++++--------------
> drivers/gpu/drm/loongson/lsdc_drv.c | 4 +-
> drivers/gpu/drm/mgag200/mgag200_drv.c | 24 ++++++------
> drivers/gpu/drm/qxl/qxl_drv.c | 15 ++++---
> 4 files changed, 52 insertions(+), 47 deletions(-)
>
> diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
> index 005ab7f5355f..039da26ef24d 100644
> --- a/drivers/gpu/drm/gma500/psb_drv.c
> +++ b/drivers/gpu/drm/gma500/psb_drv.c
> @@ -56,36 +56,36 @@ static int psb_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent);
> */
> static const struct pci_device_id pciidlist[] = {
> /* Poulsbo */
> - { 0x8086, 0x8108, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &psb_chip_ops },
> - { 0x8086, 0x8109, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &psb_chip_ops },
> + { PCI_DEVICE(0x8086, 0x8108), .driver_data = (long) &psb_chip_ops },
> + { PCI_DEVICE(0x8086, 0x8109), .driver_data = (long) &psb_chip_ops },
> /* Oak Trail */
> - { 0x8086, 0x4100, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &oaktrail_chip_ops },
> - { 0x8086, 0x4101, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &oaktrail_chip_ops },
> - { 0x8086, 0x4102, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &oaktrail_chip_ops },
> - { 0x8086, 0x4103, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &oaktrail_chip_ops },
> - { 0x8086, 0x4104, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &oaktrail_chip_ops },
> - { 0x8086, 0x4105, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &oaktrail_chip_ops },
> - { 0x8086, 0x4106, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &oaktrail_chip_ops },
> - { 0x8086, 0x4107, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &oaktrail_chip_ops },
> - { 0x8086, 0x4108, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &oaktrail_chip_ops },
> + { PCI_DEVICE(0x8086, 0x4100), .driver_data = (long) &oaktrail_chip_ops },
> + { PCI_DEVICE(0x8086, 0x4101), .driver_data = (long) &oaktrail_chip_ops },
> + { PCI_DEVICE(0x8086, 0x4102), .driver_data = (long) &oaktrail_chip_ops },
> + { PCI_DEVICE(0x8086, 0x4103), .driver_data = (long) &oaktrail_chip_ops },
> + { PCI_DEVICE(0x8086, 0x4104), .driver_data = (long) &oaktrail_chip_ops },
> + { PCI_DEVICE(0x8086, 0x4105), .driver_data = (long) &oaktrail_chip_ops },
> + { PCI_DEVICE(0x8086, 0x4106), .driver_data = (long) &oaktrail_chip_ops },
> + { PCI_DEVICE(0x8086, 0x4107), .driver_data = (long) &oaktrail_chip_ops },
> + { PCI_DEVICE(0x8086, 0x4108), .driver_data = (long) &oaktrail_chip_ops },
> /* Cedar Trail */
> - { 0x8086, 0x0be0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0be1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0be2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0be3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0be4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0be5, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0be6, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0be7, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0be8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0be9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0bea, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0beb, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0bec, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0bed, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0bee, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0x8086, 0x0bef, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (long) &cdv_chip_ops },
> - { 0, }
> + { PCI_DEVICE(0x8086, 0x0be0), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0be1), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0be2), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0be3), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0be4), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0be5), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0be6), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0be7), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0be8), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0be9), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0bea), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0beb), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0bec), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0bed), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0bee), .driver_data = (long) &cdv_chip_ops },
> + { PCI_DEVICE(0x8086, 0x0bef), .driver_data = (long) &cdv_chip_ops },
> + { }
> };
> MODULE_DEVICE_TABLE(pci, pciidlist);
>
> diff --git a/drivers/gpu/drm/loongson/lsdc_drv.c b/drivers/gpu/drm/loongson/lsdc_drv.c
> index 1ece1ea42f78..f9f7271ddbff 100644
> --- a/drivers/gpu/drm/loongson/lsdc_drv.c
> +++ b/drivers/gpu/drm/loongson/lsdc_drv.c
> @@ -444,8 +444,8 @@ static const struct dev_pm_ops lsdc_pm_ops = {
> };
>
> static const struct pci_device_id lsdc_pciid_list[] = {
> - {PCI_VDEVICE(LOONGSON, 0x7a06), CHIP_LS7A1000},
> - {PCI_VDEVICE(LOONGSON, 0x7a36), CHIP_LS7A2000},
> + { PCI_VDEVICE(LOONGSON, 0x7a06), .driver_data = CHIP_LS7A1000 },
> + { PCI_VDEVICE(LOONGSON, 0x7a36), .driver_data = CHIP_LS7A2000 },
> { }
> };
>
> diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c b/drivers/gpu/drm/mgag200/mgag200_drv.c
> index a32be27c39e8..8ad4ddb60ee6 100644
> --- a/drivers/gpu/drm/mgag200/mgag200_drv.c
> +++ b/drivers/gpu/drm/mgag200/mgag200_drv.c
> @@ -205,18 +205,18 @@ int mgag200_device_init(struct mga_device *mdev,
> */
>
> static const struct pci_device_id mgag200_pciidlist[] = {
> - { PCI_VENDOR_ID_MATROX, 0x520, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_PCI },
> - { PCI_VENDOR_ID_MATROX, 0x521, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_AGP },
> - { PCI_VENDOR_ID_MATROX, 0x522, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_SE_A },
> - { PCI_VENDOR_ID_MATROX, 0x524, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_SE_B },
> - { PCI_VENDOR_ID_MATROX, 0x530, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_EV },
> - { PCI_VENDOR_ID_MATROX, 0x532, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_WB },
> - { PCI_VENDOR_ID_MATROX, 0x533, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_EH },
> - { PCI_VENDOR_ID_MATROX, 0x534, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_ER },
> - { PCI_VENDOR_ID_MATROX, 0x536, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_EW3 },
> - { PCI_VENDOR_ID_MATROX, 0x538, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_EH3 },
> - { PCI_VENDOR_ID_MATROX, 0x53a, PCI_ANY_ID, PCI_ANY_ID, 0, 0, G200_EH5 },
> - {0,}
> + { PCI_VDEVICE(MATROX, 0x0520), .driver_data = G200_PCI },
> + { PCI_VDEVICE(MATROX, 0x0521), .driver_data = G200_AGP },
> + { PCI_VDEVICE(MATROX, 0x0522), .driver_data = G200_SE_A },
> + { PCI_VDEVICE(MATROX, 0x0524), .driver_data = G200_SE_B },
> + { PCI_VDEVICE(MATROX, 0x0530), .driver_data = G200_EV },
> + { PCI_VDEVICE(MATROX, 0x0532), .driver_data = G200_WB },
> + { PCI_VDEVICE(MATROX, 0x0533), .driver_data = G200_EH },
> + { PCI_VDEVICE(MATROX, 0x0534), .driver_data = G200_ER },
> + { PCI_VDEVICE(MATROX, 0x0536), .driver_data = G200_EW3 },
> + { PCI_VDEVICE(MATROX, 0x0538), .driver_data = G200_EH3 },
> + { PCI_VDEVICE(MATROX, 0x053a), .driver_data = G200_EH5 },
> + { }
> };
>
> MODULE_DEVICE_TABLE(pci, mgag200_pciidlist);
> diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
> index 2bbb1168a3ff..6c3c309b8e4d 100644
> --- a/drivers/gpu/drm/qxl/qxl_drv.c
> +++ b/drivers/gpu/drm/qxl/qxl_drv.c
> @@ -50,11 +50,16 @@
> #include "qxl_object.h"
>
> static const struct pci_device_id pciidlist[] = {
> - { 0x1b36, 0x100, PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA << 8,
> - 0xffff00, 0 },
> - { 0x1b36, 0x100, PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_DISPLAY_OTHER << 8,
> - 0xffff00, 0 },
> - { 0, 0, 0 },
> + {
> + PCI_DEVICE(0x1b36, 0x0100),
> + .class = PCI_CLASS_DISPLAY_VGA << 8,
> + .class_mask = 0xffff00
> + }, {
> + PCI_DEVICE(0x1b36, 0x0100),
> + .class = PCI_CLASS_DISPLAY_OTHER << 8,
> + .class_mask = 0xffff00
> + },
> + { },
> };
> MODULE_DEVICE_TABLE(pci, pciidlist);
>
>
> base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
^ permalink raw reply
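The claim in the drm patch above, that named initializers leave the compiled pci_device_id arrays unchanged, can be illustrated outside the kernel. The following is a minimal user-space sketch, not kernel code: struct model_pci_device_id, MODEL_PCI_ANY_ID and MODEL_PCI_DEVICE() are hypothetical stand-ins that only mirror the shape of the kernel's struct pci_device_id, PCI_ANY_ID and PCI_DEVICE(), and the driver_data value 42 is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified user-space stand-in for the kernel's struct pci_device_id. */
struct model_pci_device_id {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
	uint32_t class, class_mask;
	unsigned long driver_data;
};

#define MODEL_PCI_ANY_ID (~0U)

/* Same shape as PCI_DEVICE(): name the IDs, wildcard the subsystem IDs. */
#define MODEL_PCI_DEVICE(vend, dev) \
	.vendor = (vend), .device = (dev), \
	.subvendor = MODEL_PCI_ANY_ID, .subdevice = MODEL_PCI_ANY_ID

/* Old style: every field given positionally, explicit zeros included. */
static const struct model_pci_device_id model_positional = {
	0x8086, 0x8108, MODEL_PCI_ANY_ID, MODEL_PCI_ANY_ID, 0, 0, 42
};

/* New style: only meaningful fields are named; omitted ones become 0. */
static const struct model_pci_device_id model_named = {
	MODEL_PCI_DEVICE(0x8086, 0x8108), .driver_data = 42
};

/* Returns 1 when both entries hold the same value in every field. */
int model_ids_equal(const struct model_pci_device_id *a,
		    const struct model_pci_device_id *b)
{
	return a->vendor == b->vendor && a->device == b->device &&
	       a->subvendor == b->subvendor && a->subdevice == b->subdevice &&
	       a->class == b->class && a->class_mask == b->class_mask &&
	       a->driver_data == b->driver_data;
}

/* Checks that the two initializer styles describe the same entry. */
void model_check_styles(void)
{
	assert(model_ids_equal(&model_positional, &model_named));
}
```

Calling model_check_styles() from a test program passes because C zero-initializes every member that a designated initializer omits, which is also why the "{ }" terminator is equivalent to "{ 0, }".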
* [PATCH] vdpa_sim: fix cleanup after worker creation failure
From: Linfeng Sun @ 2026-06-12 10:50 UTC (permalink / raw)
To: Michael S . Tsirkin, Jason Wang
Cc: Xuan Zhuo, Eugenio Pérez, virtualization, linux-kernel,
Linfeng Sun
vdpasim_create() leaves vdpasim->worker as an ERR_PTR when
kthread_run_worker() fails. The error path then drops the device
reference, which releases the partially initialized simulator.
vdpasim_free() unconditionally passes the worker pointer to
kthread_destroy_worker(), so the ERR_PTR is dereferenced and can
trigger a general protection fault.
Store the worker error, clear the pointer, and make the release path
only clean up resources that were successfully initialized before
the failure.
Signed-off-by: Linfeng Sun <slf@hdu.edu.cn>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 8cb1cc2ea139..6a4e28c49d2d 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -230,9 +230,12 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr,
kthread_init_work(&vdpasim->work, vdpasim_work_fn);
vdpasim->worker = kthread_run_worker(0, "vDPA sim worker: %s",
- dev_attr->name);
- if (IS_ERR(vdpasim->worker))
+ dev_attr->name);
+ if (IS_ERR(vdpasim->worker)) {
+ ret = PTR_ERR(vdpasim->worker);
+ vdpasim->worker = NULL;
goto err_iommu;
+ }
mutex_init(&vdpasim->mutex);
spin_lock_init(&vdpasim->iommu_lock);
@@ -742,18 +745,24 @@ static void vdpasim_free(struct vdpa_device *vdpa)
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
int i;
- kthread_cancel_work_sync(&vdpasim->work);
- kthread_destroy_worker(vdpasim->worker);
+ if (vdpasim->worker) {
+ kthread_cancel_work_sync(&vdpasim->work);
+ kthread_destroy_worker(vdpasim->worker);
+ }
- for (i = 0; i < vdpasim->dev_attr.nvqs; i++) {
- vringh_kiov_cleanup(&vdpasim->vqs[i].out_iov);
- vringh_kiov_cleanup(&vdpasim->vqs[i].in_iov);
+ if (vdpasim->vqs) {
+ for (i = 0; i < vdpasim->dev_attr.nvqs; i++) {
+ vringh_kiov_cleanup(&vdpasim->vqs[i].out_iov);
+ vringh_kiov_cleanup(&vdpasim->vqs[i].in_iov);
+ }
}
vdpasim->dev_attr.free(vdpasim);
- for (i = 0; i < vdpasim->dev_attr.nas; i++)
- vhost_iotlb_reset(&vdpasim->iommu[i]);
+ if (vdpasim->iommu && vdpasim->iommu_pt) {
+ for (i = 0; i < vdpasim->dev_attr.nas; i++)
+ vhost_iotlb_reset(&vdpasim->iommu[i]);
+ }
kfree(vdpasim->iommu);
kfree(vdpasim->iommu_pt);
kfree(vdpasim->vqs);
--
2.43.0
^ permalink raw reply related
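The failure mode described in the vdpa_sim patch above (an error pointer left in a field that the release path later treats as a valid object) can be sketched in user space. This is only a model under stated assumptions: model_err_ptr(), model_ptr_err() and model_is_err() are simplified stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), and struct model_sim and model_start_worker() are hypothetical names that do not exist in the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_ERRNO 4095

/* User-space stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *model_err_ptr(long err) { return (void *)(intptr_t)err; }
static long model_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct model_sim {
	void *worker;          /* NULL means "never successfully created" */
	int workers_destroyed; /* counts real teardown calls */
};

/* Hypothetical worker factory that can be told to fail. */
static void *model_start_worker(int fail)
{
	void *w = fail ? NULL : malloc(1);

	return w ? w : model_err_ptr(-ENOMEM);
}

/* Create path: record the error code, then clear the pointer. */
int model_create(struct model_sim *sim, int fail_worker)
{
	sim->worker = model_start_worker(fail_worker);
	if (model_is_err(sim->worker)) {
		int ret = (int)model_ptr_err(sim->worker);

		sim->worker = NULL; /* never leave an error pointer behind */
		return ret;
	}
	return 0;
}

/* Release path: only tear down what was actually set up. */
void model_free(struct model_sim *sim)
{
	assert(!model_is_err(sim->worker)); /* invariant from model_create() */
	if (sim->worker) {
		free(sim->worker);
		sim->worker = NULL;
		sim->workers_destroyed++;
	}
}
```

Clearing the pointer at the failure site keeps the release path to a plain NULL test, which matches the approach the patch takes for vdpasim->worker.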
* Re: [PATCH net-next v3 4/4] vsock: fold sk_acceptq_removed() into vsock_remove_pending()
From: Stefano Garzarella @ 2026-06-12 10:48 UTC (permalink / raw)
To: Raf Dickson
Cc: netdev, virtualization, pabeni, stefanha, bryan-bt.tan,
vishnu.dasa, bcm-kernel-feedback-list, bobbyeshleman, leonardi,
horms, edumazet, kuba
In-Reply-To: <20260612045216.105796-5-rafdog35@gmail.com>
On Fri, Jun 12, 2026 at 04:52:16AM +0000, Raf Dickson wrote:
>Callers of vsock_remove_pending() must also call sk_acceptq_removed()
>to keep sk_ack_backlog consistent. Move the call into
>vsock_remove_pending() itself to make it automatic and prevent future
>callers from forgetting it.
>
>Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Raf Dickson <rafdog35@gmail.com>
>---
> net/vmw_vsock/af_vsock.c | 2 +-
> net/vmw_vsock/vmci_transport.c | 4 +---
> 2 files changed, 2 insertions(+), 4 deletions(-)
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 24916dd4e9..4a7d6d247a 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -494,6 +494,7 @@ void vsock_remove_pending(struct sock *listener, struct sock *pending)
> list_del_init(&vpending->pending_links);
> sock_put(listener);
> sock_put(pending);
>+ sk_acceptq_removed(listener);
> }
> EXPORT_SYMBOL_GPL(vsock_remove_pending);
>
>@@ -773,7 +774,6 @@ static void vsock_pending_work(struct work_struct *work)
> if (vsock_is_pending(sk)) {
> vsock_remove_pending(listener, sk);
>
^^
There is an extra blank line that we can now remove here.
BTW, the code LGTM:
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
>- sk_acceptq_removed(listener);
> } else if (!vsk->rejected) {
> /* We are not on the pending list and accept() did not reject
> * us, so we must have been accepted by our user process. We
>diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
>index c2db016cca..3e6445f4e1 100644
>--- a/net/vmw_vsock/vmci_transport.c
>+++ b/net/vmw_vsock/vmci_transport.c
>@@ -980,10 +980,8 @@ static int vmci_transport_recv_listen(struct sock *sk,
> err = -EINVAL;
> }
>
>- if (err < 0) {
>+ if (err < 0)
> vsock_remove_pending(sk, pending);
>- sk_acceptq_removed(sk);
>- }
>
> release_sock(pending);
> vmci_transport_release_pending(pending);
>--
>2.54.0
>
^ permalink raw reply
* Re: [PATCH net-next v3 3/4] vsock: fold sk_acceptq_added() into vsock_enqueue_accept()
From: Stefano Garzarella @ 2026-06-12 10:46 UTC (permalink / raw)
To: Raf Dickson
Cc: netdev, virtualization, pabeni, stefanha, bryan-bt.tan,
vishnu.dasa, bcm-kernel-feedback-list, bobbyeshleman, leonardi,
horms, edumazet, kuba
In-Reply-To: <20260612045216.105796-4-rafdog35@gmail.com>
On Fri, Jun 12, 2026 at 04:52:15AM +0000, Raf Dickson wrote:
>virtio and hyperv call sk_acceptq_added() immediately before
>vsock_enqueue_accept(). Move the call into vsock_enqueue_accept()
>itself so callers cannot forget it and the accounting is consistent.
>
>Suggested-by: Paolo Abeni <pabeni@redhat.com>
>Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Raf Dickson <rafdog35@gmail.com>
>---
> net/vmw_vsock/af_vsock.c | 1 +
> net/vmw_vsock/hyperv_transport.c | 1 -
> net/vmw_vsock/virtio_transport_common.c | 1 -
> 3 files changed, 1 insertion(+), 2 deletions(-)
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
^ permalink raw reply
* Re: [PATCH net-next v3 2/4] vsock: fold sk_acceptq_added() into vsock_add_pending()
From: Stefano Garzarella @ 2026-06-12 10:42 UTC (permalink / raw)
To: Raf Dickson
Cc: netdev, virtualization, pabeni, stefanha, bryan-bt.tan,
vishnu.dasa, bcm-kernel-feedback-list, bobbyeshleman, leonardi,
horms, edumazet, kuba
In-Reply-To: <20260612045216.105796-3-rafdog35@gmail.com>
On Fri, Jun 12, 2026 at 04:52:14AM +0000, Raf Dickson wrote:
>Move sk_acceptq_added() into vsock_add_pending() so callers cannot
>forget it. vmci is the only transport using the pending list and
>is updated accordingly.
>
>Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Raf Dickson <rafdog35@gmail.com>
>---
> net/vmw_vsock/af_vsock.c | 1 +
> net/vmw_vsock/vmci_transport.c | 1 -
> 2 files changed, 1 insertion(+), 1 deletion(-)
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
^ permalink raw reply
* Re: [PATCH net-next v3 1/4] vsock: introduce vsock_pending_to_accept() helper
From: Stefano Garzarella @ 2026-06-12 10:41 UTC (permalink / raw)
To: Raf Dickson
Cc: netdev, virtualization, pabeni, stefanha, bryan-bt.tan,
vishnu.dasa, bcm-kernel-feedback-list, bobbyeshleman, leonardi,
horms, edumazet, kuba
In-Reply-To: <20260612045216.105796-2-rafdog35@gmail.com>
On Fri, Jun 12, 2026 at 04:52:13AM +0000, Raf Dickson wrote:
>Add vsock_pending_to_accept() to move a socket directly from the
>pending list to the accept queue in a single operation, avoiding
>the sock_put/sock_hold dance and the sk_acceptq_removed()/
>sk_acceptq_added() pair that would otherwise be needed when
>calling vsock_remove_pending() followed by vsock_enqueue_accept().
>
>Use it in vmci_transport_recv_connecting_server() where a completed
>handshake transitions the socket from pending to accept queue.
>
>Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Raf Dickson <rafdog35@gmail.com>
>---
> include/net/af_vsock.h | 1 +
> net/vmw_vsock/af_vsock.c | 10 ++++++++++
> net/vmw_vsock/vmci_transport.c | 3 +--
> 3 files changed, 12 insertions(+), 2 deletions(-)
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
^ permalink raw reply
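The commit message for patch 1/4 above says that a direct pending-to-accept move avoids a put/hold pair and a removed/added pair. The helper's body is not shown in this part of the thread, so the following user-space sketch only models the net effect it describes. Every name here (struct model_state, model_two_step(), model_single_step()) is hypothetical, and the two-step path assumes the backlog accounting already lives inside the remove and enqueue operations.

```c
#include <assert.h>

struct model_state {
	int listener_refs;        /* references held on the listener */
	int child_refs;           /* references held on the child socket */
	unsigned int ack_backlog; /* stands in for sk_ack_backlog */
	int on_pending;           /* 1 while the child is on the pending list */
	int on_accept;            /* 1 while the child is on the accept queue */
};

/* Two-step path: remove from pending, then enqueue for accept. */
void model_two_step(struct model_state *s)
{
	assert(s->on_pending);
	/* remove: unlink, drop both references, decrement the backlog */
	s->on_pending = 0;
	s->listener_refs--;
	s->child_refs--;
	s->ack_backlog--;
	/* enqueue: take both references again, link, increment the backlog */
	s->child_refs++;
	s->listener_refs++;
	s->on_accept = 1;
	s->ack_backlog++;
}

/* Single-step move: only the list membership changes. */
void model_single_step(struct model_state *s)
{
	assert(s->on_pending);
	s->on_pending = 0;
	s->on_accept = 1;
}

/* Returns 1 when both states are field-for-field identical. */
int model_states_equal(const struct model_state *a,
		       const struct model_state *b)
{
	return a->listener_refs == b->listener_refs &&
	       a->child_refs == b->child_refs &&
	       a->ack_backlog == b->ack_backlog &&
	       a->on_pending == b->on_pending &&
	       a->on_accept == b->on_accept;
}
```

Both paths end in the same state; the single-step version simply never passes through the intermediate values where the reference counts and the backlog are temporarily lower.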
* Re: [PATCH v2 net] virtio_net: do not allow tunnel csum offload for non GSO packets
From: Gabriel Goller @ 2026-06-12 10:11 UTC (permalink / raw)
To: Paolo Abeni
Cc: netdev, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, virtualization, Gabriel Goller, Fiona Ebner
In-Reply-To: <6c3b6c47fb05c100f384630dc48f3975cf37b67a.1781195144.git.pabeni@redhat.com>
> Fiona reports broken connectivity for a virtio-net setup using a UDP tunnel
> inside the guest and a NIC without UDP tunnel TSO support in the host.
>
> Currently the virtio_net driver exposes csum offload for UDP-tunneled,
> non-GSO TCP packets. Such packets reach the host as CSUM_PARTIAL ones
> with the 'encapsulation' flag cleared, as the virtio specification does
> not support this specific kind of offload.
>
> HW NICs with UDP tunnel TSO support - and those drivers directly
> accessing skb->csum_start/csum_offset - are still capable of computing
> the needed csum correctly, but otherwise the packets reach the wire with
> a bad csum on both the inner and outer transport headers.
>
> Address the issue by explicitly disabling csum offload for UDP-tunneled,
> non-GSO packets via the ndo_features_check op.
>
> Fixes: 56a06bd40fab ("virtio_net: enable gso over UDP tunnel support.")
> Reported-by: Fiona Ebner <f.ebner@proxmox.com>
> Closes: https://bugzilla.proxmox.com/show_bug.cgi?id=7627
> Tested-by: Fiona Ebner <f.ebner@proxmox.com>
> Tested-by: Gabriel Goller <g.goller@proxmox.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested it on our testbench, consider:
Reviewed-by: Gabriel Goller <g.goller@proxmox.com>
Tested-by: Gabriel Goller <g.goller@proxmox.com>
--
Gabriel Goller <g.goller@proxmox.com>
^ permalink raw reply
* Re: [PATCH net-next v2] vsock/vmci: use sk_acceptq_is_full() helper
From: Raf Dickson @ 2026-06-12 9:19 UTC (permalink / raw)
To: sgarzare
Cc: bcm-kernel-feedback-list, bryan-bt.tan, edumazet, horms, kuba,
leonardi, netdev, pabeni, stefanha, virtualization, vishnu.dasa
In-Reply-To: <aivKma8mRjTXV0BM@sgarzare-redhat>
On Fri, Jun 12, 2026 at 09:03AM +0200, Stefano Garzarella wrote:
> nit: title should be updated since now this is not just vmci
> Wait a bit and in case this is not queued, resend with the title fixed.
> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Thanks! Will wait a few days and resend with the corrected title if
it hasn't been queued by then.
Raf
^ permalink raw reply
* Re: [PATCH net-next v2] vsock/vmci: use sk_acceptq_is_full() helper
From: Stefano Garzarella @ 2026-06-12 9:03 UTC (permalink / raw)
To: Raf Dickson
Cc: netdev, virtualization, pabeni, stefanha, bryan-bt.tan,
vishnu.dasa, bcm-kernel-feedback-list, leonardi, horms, edumazet,
kuba
In-Reply-To: <20260612045842.122207-1-rafdog35@gmail.com>
On Fri, Jun 12, 2026 at 04:58:42AM +0000, Raf Dickson wrote:
nit: title should be updated since now this is not just vmci
(e.g. vsock: use sk_acceptq_is_full() helper in all transports)
Not sure if it can be fixed while applying by netdev maintainers.
Wait a bit and in case this is not queued, resend with the title fixed.
>Replace the open-coded backlog check with sk_acceptq_is_full().
>The helper uses > instead of >=, which is the correct comparison
>per commit 64a146513f8f ("[NET]: Revert incorrect accept queue
>backlog changes."), and adds READ_ONCE() for proper memory ordering.
>
>Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Raf Dickson <rafdog35@gmail.com>
>---
> net/vmw_vsock/hyperv_transport.c | 2 +-
> net/vmw_vsock/vmci_transport.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
That said, the patch LGTM:
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
>
>diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
>index b3394946b2..e6adbc4701 100644
>--- a/net/vmw_vsock/hyperv_transport.c
>+++ b/net/vmw_vsock/hyperv_transport.c
>@@ -323,7 +323,7 @@ static void hvs_open_connection(struct vmbus_channel *chan)
> goto out;
>
> if (conn_from_host) {
>- if (sk->sk_ack_backlog >= sk->sk_max_ack_backlog)
>+ if (sk_acceptq_is_full(sk))
> goto out;
>
> new = vsock_create_connected(sk);
>diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
>index 91516488a7..56503bee31 100644
>--- a/net/vmw_vsock/vmci_transport.c
>+++ b/net/vmw_vsock/vmci_transport.c
>@@ -1010,7 +1010,7 @@ static int vmci_transport_recv_listen(struct sock *sk,
> * reset. Otherwise we create and initialize a child socket and reply
> * with a connection negotiation.
> */
>- if (sk->sk_ack_backlog >= sk->sk_max_ack_backlog) {
>+ if (sk_acceptq_is_full(sk)) {
> vmci_transport_reply_reset(pkt);
> return -ECONNREFUSED;
> }
>--
>2.54.0
>
^ permalink raw reply
* [PATCH net-next v2] vsock/vmci: use sk_acceptq_is_full() helper
From: Raf Dickson @ 2026-06-12 4:58 UTC (permalink / raw)
To: netdev, virtualization
Cc: pabeni, sgarzare, stefanha, bryan-bt.tan, vishnu.dasa,
bcm-kernel-feedback-list, leonardi, horms, edumazet, kuba,
Raf Dickson
Replace the open-coded backlog check with sk_acceptq_is_full().
The helper uses > instead of >=, which is the correct comparison
per commit 64a146513f8f ("[NET]: Revert incorrect accept queue
backlog changes."), and adds READ_ONCE() for proper memory ordering.
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Raf Dickson <rafdog35@gmail.com>
---
net/vmw_vsock/hyperv_transport.c | 2 +-
net/vmw_vsock/vmci_transport.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index b3394946b2..e6adbc4701 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -323,7 +323,7 @@ static void hvs_open_connection(struct vmbus_channel *chan)
goto out;
if (conn_from_host) {
- if (sk->sk_ack_backlog >= sk->sk_max_ack_backlog)
+ if (sk_acceptq_is_full(sk))
goto out;
new = vsock_create_connected(sk);
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 91516488a7..56503bee31 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1010,7 +1010,7 @@ static int vmci_transport_recv_listen(struct sock *sk,
* reset. Otherwise we create and initialize a child socket and reply
* with a connection negotiation.
*/
- if (sk->sk_ack_backlog >= sk->sk_max_ack_backlog) {
+ if (sk_acceptq_is_full(sk)) {
vmci_transport_reply_reset(pkt);
return -ECONNREFUSED;
}
--
2.54.0
^ permalink raw reply related
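The behavioural difference mentioned in the sk_acceptq_is_full() patch above (">" in the helper versus ">=" in the open-coded check) amounts to one extra admitted connection. The sketch below is a single-threaded user-space model, so it leaves out READ_ONCE(); struct model_sock and the function names are hypothetical and only mirror the two comparisons shown in the diff.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

struct model_sock {
	unsigned int ack_backlog;     /* stands in for sk_ack_backlog */
	unsigned int max_ack_backlog; /* stands in for sk_max_ack_backlog */
};

/* The open-coded check removed by the patch: full once the limit is reached. */
bool model_open_coded_full(const struct model_sock *sk)
{
	return sk->ack_backlog >= sk->max_ack_backlog;
}

/* The comparison used by the helper: full only once the limit is exceeded. */
bool model_acceptq_is_full(const struct model_sock *sk)
{
	return sk->ack_backlog > sk->max_ack_backlog;
}

/* Counts how many connections are admitted before the queue reports full. */
unsigned int model_admitted(unsigned int limit, bool use_helper)
{
	struct model_sock sk = { .ack_backlog = 0, .max_ack_backlog = limit };

	assert(limit < UINT_MAX); /* keeps the loop below finite */
	for (;;) {
		bool full = use_helper ? model_acceptq_is_full(&sk)
				       : model_open_coded_full(&sk);
		if (full)
			break;
		sk.ack_backlog++;
	}
	return sk.ack_backlog;
}
```

With a limit of N, the open-coded check admits N connections and the helper-style check admits N + 1, so a backlog of 0 still lets one connection through.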
* [PATCH net-next v3 4/4] vsock: fold sk_acceptq_removed() into vsock_remove_pending()
From: Raf Dickson @ 2026-06-12 4:52 UTC (permalink / raw)
To: netdev, virtualization
Cc: pabeni, sgarzare, stefanha, bryan-bt.tan, vishnu.dasa,
bcm-kernel-feedback-list, bobbyeshleman, leonardi, horms,
edumazet, kuba, Raf Dickson
In-Reply-To: <20260612045216.105796-1-rafdog35@gmail.com>
Callers of vsock_remove_pending() must also call sk_acceptq_removed()
to keep sk_ack_backlog consistent. Move the call into
vsock_remove_pending() itself to make it automatic and prevent future
callers from forgetting it.
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Raf Dickson <rafdog35@gmail.com>
---
net/vmw_vsock/af_vsock.c | 2 +-
net/vmw_vsock/vmci_transport.c | 4 +---
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 24916dd4e9..4a7d6d247a 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -494,6 +494,7 @@ void vsock_remove_pending(struct sock *listener, struct sock *pending)
list_del_init(&vpending->pending_links);
sock_put(listener);
sock_put(pending);
+ sk_acceptq_removed(listener);
}
EXPORT_SYMBOL_GPL(vsock_remove_pending);
@@ -773,7 +774,6 @@ static void vsock_pending_work(struct work_struct *work)
if (vsock_is_pending(sk)) {
vsock_remove_pending(listener, sk);
- sk_acceptq_removed(listener);
} else if (!vsk->rejected) {
/* We are not on the pending list and accept() did not reject
* us, so we must have been accepted by our user process. We
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index c2db016cca..3e6445f4e1 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -980,10 +980,8 @@ static int vmci_transport_recv_listen(struct sock *sk,
err = -EINVAL;
}
- if (err < 0) {
+ if (err < 0)
vsock_remove_pending(sk, pending);
- sk_acceptq_removed(sk);
- }
release_sock(pending);
vmci_transport_release_pending(pending);
--
2.54.0
^ permalink raw reply related
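The pattern behind patch 4/4 above, and behind 2/4 and 3/4 in the same series, is to move the counter update into the list helper so that list membership and sk_ack_backlog cannot drift apart. The following user-space sketch models only the pending list; struct model_listener and its functions are hypothetical stand-ins, not the vsock implementation.

```c
#include <assert.h>

struct model_listener {
	unsigned int ack_backlog;   /* stands in for sk_ack_backlog */
	unsigned int pending_count; /* stands in for the pending list length */
};

/* Add to the pending list; the accounting is part of the helper. */
void model_add_pending(struct model_listener *l)
{
	l->pending_count++;
	l->ack_backlog++; /* folded in, like sk_acceptq_added() */
}

/* Remove from the pending list; the accounting is part of the helper. */
void model_remove_pending(struct model_listener *l)
{
	assert(l->pending_count > 0);
	l->pending_count--;
	l->ack_backlog--; /* folded in, like sk_acceptq_removed() */
}

/* Returns 1 while the counter matches the list length. */
int model_accounting_consistent(const struct model_listener *l)
{
	return l->ack_backlog == l->pending_count;
}
```

Because callers only ever use the two helpers, the invariant holds after every call, including on error paths such as the "err < 0" branch in vmci_transport_recv_listen() that the patch simplifies.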
* [PATCH net-next v3 3/4] vsock: fold sk_acceptq_added() into vsock_enqueue_accept()
From: Raf Dickson @ 2026-06-12 4:52 UTC (permalink / raw)
To: netdev, virtualization
Cc: pabeni, sgarzare, stefanha, bryan-bt.tan, vishnu.dasa,
bcm-kernel-feedback-list, bobbyeshleman, leonardi, horms,
edumazet, kuba, Raf Dickson
In-Reply-To: <20260612045216.105796-1-rafdog35@gmail.com>
virtio and hyperv call sk_acceptq_added() immediately before
vsock_enqueue_accept(). Move the call into vsock_enqueue_accept()
itself so callers cannot forget it and the accounting is consistent.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Raf Dickson <rafdog35@gmail.com>
---
net/vmw_vsock/af_vsock.c | 1 +
net/vmw_vsock/hyperv_transport.c | 1 -
net/vmw_vsock/virtio_transport_common.c | 1 -
3 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 6cfa89b6f3..24916dd4e9 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -518,6 +518,7 @@ void vsock_enqueue_accept(struct sock *listener, struct sock *connected)
sock_hold(connected);
sock_hold(listener);
list_add_tail(&vconnected->accept_queue, &vlistener->accept_queue);
+ sk_acceptq_added(listener);
}
EXPORT_SYMBOL_GPL(vsock_enqueue_accept);
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index b3394946b2..0de8148877 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -410,7 +410,6 @@ static void hvs_open_connection(struct vmbus_channel *chan)
if (conn_from_host) {
new->sk_state = TCP_ESTABLISHED;
- sk_acceptq_added(sk);
hvs_new->vm_srv_id = *if_type;
hvs_new->host_srv_id = *if_instance;
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index b10666937c..4a39d48db9 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -1582,7 +1582,6 @@ virtio_transport_recv_listen(struct sock *sk, struct sk_buff *skb,
return ret;
}
- sk_acceptq_added(sk);
if (virtio_transport_space_update(child, skb))
child->sk_write_space(child);
--
2.54.0
^ permalink raw reply related
* [PATCH net-next v3 2/4] vsock: fold sk_acceptq_added() into vsock_add_pending()
From: Raf Dickson @ 2026-06-12 4:52 UTC (permalink / raw)
To: netdev, virtualization
Cc: pabeni, sgarzare, stefanha, bryan-bt.tan, vishnu.dasa,
bcm-kernel-feedback-list, bobbyeshleman, leonardi, horms,
edumazet, kuba, Raf Dickson
In-Reply-To: <20260612045216.105796-1-rafdog35@gmail.com>
Move sk_acceptq_added() into vsock_add_pending() so callers cannot
forget it. vmci is the only transport using the pending list and
is updated accordingly.
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Raf Dickson <rafdog35@gmail.com>
---
net/vmw_vsock/af_vsock.c | 1 +
net/vmw_vsock/vmci_transport.c | 1 -
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 1f94f0d44c..6cfa89b6f3 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -483,6 +483,7 @@ void vsock_add_pending(struct sock *listener, struct sock *pending)
sock_hold(pending);
sock_hold(listener);
list_add_tail(&vpending->pending_links, &vlistener->pending_links);
+ sk_acceptq_added(listener);
}
EXPORT_SYMBOL_GPL(vsock_add_pending);
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 635ebf9da4..c2db016cca 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1109,7 +1109,6 @@ static int vmci_transport_recv_listen(struct sock *sk,
}
vsock_add_pending(sk, pending);
- sk_acceptq_added(sk);
pending->sk_state = TCP_SYN_SENT;
vmci_trans(vpending)->produce_size =
--
2.54.0
^ permalink raw reply related
* [PATCH net-next v3 1/4] vsock: introduce vsock_pending_to_accept() helper
From: Raf Dickson @ 2026-06-12 4:52 UTC (permalink / raw)
To: netdev, virtualization
Cc: pabeni, sgarzare, stefanha, bryan-bt.tan, vishnu.dasa,
bcm-kernel-feedback-list, bobbyeshleman, leonardi, horms,
edumazet, kuba, Raf Dickson
In-Reply-To: <20260612045216.105796-1-rafdog35@gmail.com>
Add vsock_pending_to_accept() to move a socket directly from the
pending list to the accept queue in a single operation, avoiding
the sock_put/sock_hold dance and the sk_acceptq_removed()/
sk_acceptq_added() pair that would otherwise be needed when
calling vsock_remove_pending() followed by vsock_enqueue_accept().
Use it in vmci_transport_recv_connecting_server() where a completed
handshake transitions the socket from pending to accept queue.
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Raf Dickson <rafdog35@gmail.com>
---
include/net/af_vsock.h | 1 +
net/vmw_vsock/af_vsock.c | 10 ++++++++++
net/vmw_vsock/vmci_transport.c | 3 +--
3 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 4e40063ada..30046a3c20 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -220,6 +220,7 @@ static inline bool __vsock_in_connected_table(struct vsock_sock *vsk)
void vsock_add_pending(struct sock *listener, struct sock *pending);
void vsock_remove_pending(struct sock *listener, struct sock *pending);
void vsock_enqueue_accept(struct sock *listener, struct sock *connected);
+void vsock_pending_to_accept(struct sock *listener, struct sock *pending);
void vsock_insert_connected(struct vsock_sock *vsk);
void vsock_remove_bound(struct vsock_sock *vsk);
void vsock_remove_connected(struct vsock_sock *vsk);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 2ce1063d4a..1f94f0d44c 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -496,6 +496,16 @@ void vsock_remove_pending(struct sock *listener, struct sock *pending)
}
EXPORT_SYMBOL_GPL(vsock_remove_pending);
+void vsock_pending_to_accept(struct sock *listener, struct sock *pending)
+{
+ struct vsock_sock *vpending = vsock_sk(pending);
+ struct vsock_sock *vlistener = vsock_sk(listener);
+
+ list_del_init(&vpending->pending_links);
+ list_add_tail(&vpending->accept_queue, &vlistener->accept_queue);
+}
+EXPORT_SYMBOL_GPL(vsock_pending_to_accept);
+
void vsock_enqueue_accept(struct sock *listener, struct sock *connected)
{
struct vsock_sock *vlistener;
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 91516488a7..635ebf9da4 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1258,8 +1258,7 @@ vmci_transport_recv_connecting_server(struct sock *listener,
* listener's pending list to the accept queue so callers of accept()
* can find it.
*/
- vsock_remove_pending(listener, pending);
- vsock_enqueue_accept(listener, pending);
+ vsock_pending_to_accept(listener, pending);
/* Callers of accept() will be waiting on the listening socket, not
* the pending socket.
--
2.54.0
^ permalink raw reply related
* [PATCH net-next v3 0/4] vsock: consolidate acceptq accounting into core helpers
From: Raf Dickson @ 2026-06-12 4:52 UTC (permalink / raw)
To: netdev, virtualization
Cc: pabeni, sgarzare, stefanha, bryan-bt.tan, vishnu.dasa,
bcm-kernel-feedback-list, bobbyeshleman, leonardi, horms,
edumazet, kuba, Raf Dickson
These patches follow up on commit c05fa14db43e
("vsock/vmci: fix sk_ack_backlog leak on failed handshake")
by consolidating sk_acceptq_added() and sk_acceptq_removed() into
the core vsock helpers so transports cannot forget them.
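The accounting that the series ends up with can be sketched as a small userspace model. This is only an illustration under stated assumptions: `listener_model` and the `model_*()` functions are hypothetical names, and `ack_backlog` merely stands in for the counter that sk_acceptq_added() and sk_acceptq_removed() adjust.

```c
#include <assert.h>

/* Hypothetical stand-in for a listening socket; not a kernel structure. */
struct listener_model {
	unsigned int ack_backlog; /* stands in for the acceptq counter */
	unsigned int pending;     /* sockets on the pending list */
	unsigned int accept;      /* sockets on the accept queue */
};

/* Adding to the pending list also bumps the counter (patch 2). */
void model_add_pending(struct listener_model *l)
{
	l->pending++;
	l->ack_backlog++;
}

/* Removing from the pending list also drops the counter (patch 4). */
void model_remove_pending(struct listener_model *l)
{
	assert(l->pending > 0);
	l->pending--;
	l->ack_backlog--;
}

/* Enqueueing directly on the accept queue bumps the counter (patch 3). */
void model_enqueue_accept(struct listener_model *l)
{
	l->accept++;
	l->ack_backlog++;
}

/* Moving pending -> accept leaves the counter untouched (patch 1). */
void model_pending_to_accept(struct listener_model *l)
{
	assert(l->pending > 0);
	l->pending--;
	l->accept++;
}
```

In this model a transport only picks which helper to call; the counter cannot drift because no caller touches it directly.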
Changes since v2:
- Add vsock_pending_to_accept() helper for the vmci pending->accept
transition, avoiding a double sk_acceptq_added() (Stefano Garzarella)
- Split into 4 patches for bisectability (Stefano Garzarella)
- Fold sk_acceptq_added() into vsock_add_pending() as a separate patch
Link: https://lore.kernel.org/netdev/20260611021317.69362-1-rafdog35@gmail.com/
Raf Dickson (4):
vsock: introduce vsock_pending_to_accept() helper
vsock: fold sk_acceptq_added() into vsock_add_pending()
vsock: fold sk_acceptq_added() into vsock_enqueue_accept()
vsock: fold sk_acceptq_removed() into vsock_remove_pending()
include/net/af_vsock.h | 1 +
net/vmw_vsock/af_vsock.c | 14 +++++++++++++-
net/vmw_vsock/hyperv_transport.c | 1 -
net/vmw_vsock/virtio_transport_common.c | 1 -
net/vmw_vsock/vmci_transport.c | 8 ++------
5 files changed, 16 insertions(+), 9 deletions(-)
--
2.54.0
^ permalink raw reply
* Re: [PATCH net-next 0/3] xsk: support tx napi busy_poll
From: Menglong Dong @ 2026-06-12 1:09 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: jasowang, mst, xuanzhuo, eperezma, andrew+netdev, davem, edumazet,
kuba, pabeni, magnus.karlsson, sdf, horms, ast, daniel, hawk,
john.fastabend, bjorn, kerneljasonxing, netdev, virtualization,
linux-kernel, bpf
In-Reply-To: <aisBEUmSJa1vkFYo@boxer>
On Fri, Jun 12, 2026 at 2:40 AM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Thu, Jun 11, 2026 at 03:12:39PM +0800, menglong8.dong@gmail.com wrote:
> > From: Menglong Dong <dongml2@chinatelecom.cn>
> >
> > For now, we use sk_busy_loop() in __xsk_sendmsg() to send the data in tx
> > ring. The sk_busy_loop() will poll on the target NAPI. However, for the
> > nic driver that support the tx napi, such as virtio-net, it can't schedule
> > the tx NAPI, but only the rx NAPI. If we enable the busy_poll for xsk and
> > use virtio-net, we can't send data, as the rx NAPI in virtio-net doesn't
> > handle the packet sending.
>
> Am I reading this right that you decided to break busy-poll support for
> zero-copy drivers that happen to handle transmit side from Rx NAPI context
> in favor of supporting virtio-net?
Oh, no. This series only affects virtio-net. For other drivers, the
logic doesn't change.
I added a "tx_napi" field to struct napi_struct. For now, only virtio-net
initializes it, as only virtio-net has the "tx napi".
In __napi_busy_loop(), we use napi->tx_napi instead of napi if it is not
NULL and we are in the tx code path (sk_tx_busy_loop).
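A minimal userspace sketch of the selection described above, assuming a simplified stand-in for napi_struct. Only the `tx_napi` field name comes from the series; `napi_model`, `pick_busy_poll_napi()` and the `tx_path` parameter are hypothetical names used for illustration, not the code in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct napi_struct; not the kernel definition. */
struct napi_model {
	int id;
	struct napi_model *tx_napi; /* NULL unless the driver sets it */
};

/*
 * Return the instance busy polling should run on: the tx NAPI when the
 * caller is on the tx path and one is registered, the given NAPI otherwise.
 */
struct napi_model *pick_busy_poll_napi(struct napi_model *napi, bool tx_path)
{
	assert(napi != NULL);
	if (tx_path && napi->tx_napi)
		return napi->tx_napi;
	return napi;
}
```

With this shape, a driver that never sets `tx_napi` gets the same instance back on both paths, which matches the claim that other drivers are unaffected.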
Thanks!
Menglong Dong
>
> >
> > Fix this by introduce the sk_tx_busy_loop(), which will poll on the tx
> > NAPI if available. To get the tx NAPI from the napi_id, we add the
> > "tx_napi" field to napi_struct, which is ugly :/
> >
> > Another choice is to call virtnet_xsk_xmit() in virtnet_poll() too. But
> > this a little contradict the design of tx NAPI.
> >
> > Menglong Dong (3):
> > net: busy-poll: introduce sk_tx_busy_loop()
> > virtio_net: initialize napi.tx_napi in virtnet_alloc_queues()
> > xsk: replace sk_busy_loop with sk_tx_busy_loop in __xsk_sendmsg()
> >
> > drivers/net/virtio_net.c | 1 +
> > include/linux/netdevice.h | 1 +
> > include/net/busy_poll.h | 41 ++++++++++++++++++++++++++++++++++++---
> > net/core/dev.c | 23 +++++-----------------
> > net/xdp/xsk.c | 2 +-
> > 5 files changed, 46 insertions(+), 22 deletions(-)
> >
> > --
> > 2.54.0
> >
^ permalink raw reply
* Re: [PATCH net-next] vsock/vmci: use sk_acceptq_is_full() helper
From: Jakub Kicinski @ 2026-06-11 20:26 UTC (permalink / raw)
To: Luigi Leonardi
Cc: Raf Dickson, netdev, virtualization, pabeni, sgarzare, stefanha,
bryan-bt.tan, vishnu.dasa, bcm-kernel-feedback-list
In-Reply-To: <aippcmlliXpO0BXM@leonardi-redhat>
On Thu, 11 Jun 2026 09:58:22 +0200 Luigi Leonardi wrote:
> note: according to patchwork [1] you forgot to CC some maintainers,
> please be more careful next time :)
Please focus your reviews on the code not patchwork checks.
^ permalink raw reply
* Re: [PATCH net-next 0/3] xsk: support tx napi busy_poll
From: Maciej Fijalkowski @ 2026-06-11 18:40 UTC (permalink / raw)
To: menglong8.dong
Cc: jasowang, mst, xuanzhuo, eperezma, andrew+netdev, davem, edumazet,
kuba, pabeni, magnus.karlsson, sdf, horms, ast, daniel, hawk,
john.fastabend, bjorn, kerneljasonxing, netdev, virtualization,
linux-kernel, bpf
In-Reply-To: <20260611071242.2485058-1-dongml2@chinatelecom.cn>
On Thu, Jun 11, 2026 at 03:12:39PM +0800, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <dongml2@chinatelecom.cn>
>
> For now, we use sk_busy_loop() in __xsk_sendmsg() to send the data in tx
> ring. The sk_busy_loop() will poll on the target NAPI. However, for the
> nic driver that support the tx napi, such as virtio-net, it can't schedule
> the tx NAPI, but only the rx NAPI. If we enable the busy_poll for xsk and
> use virtio-net, we can't send data, as the rx NAPI in virtio-net doesn't
> handle the packet sending.
Am I reading this right that you decided to break busy-poll support for
zero-copy drivers that happen to handle transmit side from Rx NAPI context
in favor of supporting virtio-net?
>
> Fix this by introduce the sk_tx_busy_loop(), which will poll on the tx
> NAPI if available. To get the tx NAPI from the napi_id, we add the
> "tx_napi" field to napi_struct, which is ugly :/
>
> Another choice is to call virtnet_xsk_xmit() in virtnet_poll() too. But
> this a little contradict the design of tx NAPI.
>
> Menglong Dong (3):
> net: busy-poll: introduce sk_tx_busy_loop()
> virtio_net: initialize napi.tx_napi in virtnet_alloc_queues()
> xsk: replace sk_busy_loop with sk_tx_busy_loop in __xsk_sendmsg()
>
> drivers/net/virtio_net.c | 1 +
> include/linux/netdevice.h | 1 +
> include/net/busy_poll.h | 41 ++++++++++++++++++++++++++++++++++++---
> net/core/dev.c | 23 +++++-----------------
> net/xdp/xsk.c | 2 +-
> 5 files changed, 46 insertions(+), 22 deletions(-)
>
> --
> 2.54.0
>
^ permalink raw reply
* Re: [PATCH v2 0/3] virtiofs: hiprio FORGET robustness and no-reply request completion
From: Stefan Hajnoczi @ 2026-06-11 17:25 UTC (permalink / raw)
To: German Maglione
Cc: Li Wang, Vivek Goyal, Stefan Hajnoczi, Eugenio Pérez,
virtualization, linux-fsdevel, linux-kernel, Miklos Szeredi
In-Reply-To: <CAJfpegs8v6a9D1zWsNY7pX_Zx1wgjGvoEFS6hkk+qE2iDQptZw@mail.gmail.com>
On Thu, Jun 11, 2026 at 5:34 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Thu, 2 Apr 2026 at 12:45, Li Wang <liwang@kylinos.cn> wrote:
> >
> > This series fixes virtiofs completion for FUSE requests that do not use a
> > reply descriptor, tightens hiprio FORGET handling when virtqueue submission
> > fails transiently, and prevents the hiprio worker from stalling when one
> > FORGET must be dropped after an unrecoverable error.
>
> Can someone from the virtiofs team please review this?
German: Are you available to review this? I can review patches as a
fallback but would prefer that you handle them most of the time,
since I haven't been actively involved in virtiofs for some time.
Thanks,
Stefan
^ permalink raw reply
* [PATCH v2 net] virtio_net: do not allow tunnel csum offload for non GSO packets
From: Paolo Abeni @ 2026-06-11 16:36 UTC (permalink / raw)
To: netdev
Cc: Michael S. Tsirkin, Jason Wang, Xuan Zhuo, Eugenio Pérez,
Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
virtualization, Gabriel Goller, Fiona Ebner
Fiona reports broken connectivity for virtio-net setups using a UDP tunnel
inside the guest and a NIC without UDP tunnel TSO support in the host.
Currently the virtio_net driver exposes csum offload for UDP-tunneled,
TCP non-GSO packets. Such packets reach the host as CSUM_PARTIAL ones
with the 'encapsulation' flag cleared, as the virtio specification does
not support this specific kind of offload.
HW NICs with UDP tunnel TSO support - and those drivers directly
accessing skb->csum_start/csum_offset - are still capable of computing
the needed csum correctly, but otherwise the packets reach the wire with
a bad csum on both the inner and outer transport headers.
Address the issue by explicitly disabling csum offload for UDP-tunneled,
non-GSO packets via the ndo_features_check op.
Fixes: 56a06bd40fab ("virtio_net: enable gso over UDP tunnel support.")
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Closes: https://bugzilla.proxmox.com/show_bug.cgi?id=7627
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Gabriel Goller <g.goller@proxmox.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v1 -> v2:
- deal with to-be-segmented skbs, too.
---
drivers/net/virtio_net.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f4adcfee7a80..7d2eeb9b1226 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -6222,6 +6222,19 @@ static void virtnet_free_irq_moder(struct virtnet_info *vi)
rtnl_unlock();
}
+static netdev_features_t virtnet_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ /* Inner csum offload is only available for GSO packets. */
+ if (skb->encapsulation &&
+ (!skb_is_gso(skb) || netif_needs_gso(skb, features)))
+ return features & ~NETIF_F_CSUM_MASK;
+
+ /* Passthru. */
+ return features;
+}
+
static const struct net_device_ops virtnet_netdev = {
.ndo_open = virtnet_open,
.ndo_stop = virtnet_close,
@@ -6235,7 +6248,7 @@ static const struct net_device_ops virtnet_netdev = {
.ndo_bpf = virtnet_xdp,
.ndo_xdp_xmit = virtnet_xdp_xmit,
.ndo_xsk_wakeup = virtnet_xsk_wakeup,
- .ndo_features_check = passthru_features_check,
+ .ndo_features_check = virtnet_features_check,
.ndo_get_phys_port_name = virtnet_get_phys_port_name,
.ndo_set_features = virtnet_set_features,
.ndo_tx_timeout = virtnet_tx_timeout,
--
2.54.0
^ permalink raw reply related
* Re: [PATCH 0/2] Fix lock errors in VDUSE suspend feature
From: Eugenio Perez Martin @ 2026-06-11 16:33 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Maxime Coquelin, Stefano Garzarella, Jason Wang, Xuan Zhuo,
Laurent Vivier, virtualization, linux-kernel, Cindy Lu,
Yongji Xie
In-Reply-To: <20260611105251-mutt-send-email-mst@kernel.org>
On Thu, Jun 11, 2026 at 4:53 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jun 11, 2026 at 03:38:04PM +0200, Eugenio Pérez wrote:
> > Fix wrong ordering at taking semaphore after spinlock and convert the spinlock
> > take and release into guards, so they are not lost after a return.
> >
> > This series goes on top of https://lore.kernel.org/lkml/20260610083452.477759-1-eperezma@redhat.com/
> > ---
> > It would be great if these can be squashed.
>
> ok i figured out the confusion. squashed
>
Yes, you're right; this is on top of Nathan's. Thanks for both merging
and figuring it out!
^ permalink raw reply
* Re: [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx wake up
From: Bui Quang Minh @ 2026-06-11 16:24 UTC (permalink / raw)
To: menglong8.dong, xuanzhuo, eperezma
Cc: mst, jasowang, andrew+netdev, davem, edumazet, kuba, pabeni,
kerneljasonxing, netdev, virtualization, linux-kernel
In-Reply-To: <20260611025644.2431148-2-dongml2@chinatelecom.cn>
On 6/11/26 09:56, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <dongml2@chinatelecom.cn>
>
> During packet receiving in virtio-net, the rq can be empty, which means
> "rq->vq->num_free == virtqueue_get_vring_size(rq->vq)", in
> virtnet_add_recvbuf_xsk(), if we are using xsk. Meanwhile, the fill ring
> can be empty too, which means we can't allocate anything from
> xsk_buff_alloc_batch(). Then, we will set the XDP_RING_NEED_WAKEUP flag.
>
> However, if the user clean all the data in rx ring and fill the
> "fill ring" and check the XDP_RING_NEED_WAKEUP flag after
> xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), then the rx
> napi will never be scheduled: the rx ring is empty, which means we will
> never receive a packet to trigger the further recv fill. The rx ring is
> empty now, so the user will not check the flag too.
>
> Fix this by set the XDP_RING_NEED_WAKEUP flag before
> xsk_buff_alloc_batch() if both rq->vq and fill ring are empty.
>
> Meanwhile, set the XDP_RING_NEED_WAKEUP flag if we have any free entry in
> rq->vq.
>
> Fixes: e3f8800aa243 ("virtio-net: xsk: Support wakeup on RX side")
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---
> drivers/net/virtio_net.c | 25 ++++++++++++++++++++++---
> 1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f4adcfee7a80..4b5b3fa62008 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1323,16 +1323,27 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
> struct xsk_buff_pool *pool, gfp_t gfp)
> {
> struct xdp_buff **xsk_buffs;
> + bool need_wakeup;
> dma_addr_t addr;
> int err = 0;
> u32 len, i;
> int num;
>
> + need_wakeup = xsk_uses_need_wakeup(pool);
> xsk_buffs = rq->xsk_buffs;
>
> + /* If both rq->vq and fill ring are empty, and then the user submit
> + * all the chunks to the fill ring and check the wake up flag
> + * after xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(),
> + * we will lose the chance to wake up the rx napi, so we have to
> + * set the need_wakeup flag here.
> + */
> + if (need_wakeup && virtqueue_get_vring_size(rq->vq) == rq->vq->num_free)
> + xsk_set_rx_need_wakeup(pool);
I think when polling the receive queue, the userspace program needs to
check the XDP_RING_NEED_WAKEUP flag if it does not see any packets. The
flag check is quite lightweight in my opinion. Here are some examples I found:
- https://github.com/xdp-project/xdp-tools/blob/e9469501622aa22a7e452a671000bec8685edcde/lib/util/xdpsock.c#L1206
- https://github.com/xdp-project/bpf-examples/blob/43e565901c4287efa863edca7f0e6cd6e35ed896/AF_XDP-forwarding/xsk_fwd.c#L540
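The userspace-side check described above can be sketched with a self-contained model. This is not the libxdp API: `xsk_model`, `rx_needs_kick()` and `MODEL_NEED_WAKEUP` are hypothetical names, the flag value is a placeholder, and keeping the flag next to the fill ring is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NEED_WAKEUP (1u << 0) /* placeholder for XDP_RING_NEED_WAKEUP */

/* Hypothetical view of the rings as seen by userspace. */
struct xsk_model {
	uint32_t rx_producer; /* advanced by the kernel */
	uint32_t rx_consumer; /* advanced by userspace */
	uint32_t fill_flags;  /* written by the kernel */
};

/*
 * True when userspace found no packets in the rx ring and the kernel has
 * asked to be woken up, i.e. when a wakeup syscall would be issued.
 */
bool rx_needs_kick(const struct xsk_model *s)
{
	assert(s != NULL);
	bool rx_empty = s->rx_producer == s->rx_consumer;

	return rx_empty && (s->fill_flags & MODEL_NEED_WAKEUP) != 0;
}
```

The check is two loads and a compare, which is the sense in which it is lightweight.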
Furthermore, the XDP_RING_NEED_WAKEUP flag related functions do not
provide any memory ordering. So even with your patch, I'm worried that
this case is possible:

kernel                                  userspace
xsk_buff_alloc_batch -> failed
                                        submit fill ring
                                        flag != XDP_RING_NEED_WAKEUP
// reordering due to lack of memory ordering
xsk_set_rx_need_wakeup

I'm not an expert here, so correct me if I'm wrong. I think the wake up
flag is designed with no ordering guarantees, so we cannot rely on it to
reason and skip further checks.
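For reference, the kind of ordering that would rule out the interleaving above is the classic store-then-fence-then-load pattern on both sides. The C11 sketch below is only a model of that pattern, not what the kernel helpers do: `wakeup_model`, `kernel_arm_and_recheck()` and `user_publish_and_check()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state between the two sides. */
struct wakeup_model {
	atomic_uint fill_producer; /* entries published by userspace */
	atomic_uint need_wakeup;   /* flag set by the kernel side */
};

/*
 * Kernel side: set the flag, then re-check the fill ring after a full
 * fence. Returns true if new fill entries appeared since seen_producer.
 */
bool kernel_arm_and_recheck(struct wakeup_model *m, unsigned int seen_producer)
{
	assert(m != NULL);
	atomic_store_explicit(&m->need_wakeup, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&m->fill_producer,
				    memory_order_relaxed) != seen_producer;
}

/*
 * User side: publish n fill entries, then check the flag after a full
 * fence. Returns true if a wakeup is needed.
 */
bool user_publish_and_check(struct wakeup_model *m, unsigned int n)
{
	assert(m != NULL);
	atomic_fetch_add_explicit(&m->fill_producer, n, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&m->need_wakeup,
				    memory_order_relaxed) != 0;
}
```

With a sequentially consistent fence on both sides, at least one side observes the other's store: either the kernel side sees the new fill entries or userspace sees the flag. Without the fences neither is guaranteed, which is the concern raised above.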
> +
> num = xsk_buff_alloc_batch(pool, xsk_buffs, rq->vq->num_free);
> if (!num) {
> - if (xsk_uses_need_wakeup(pool)) {
> + if (need_wakeup) {
> xsk_set_rx_need_wakeup(pool);
> /* Return 0 instead of -ENOMEM so that NAPI is
> * descheduled.
> @@ -1341,8 +1352,6 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
> }
>
> return -ENOMEM;
> - } else {
> - xsk_clear_rx_need_wakeup(pool);
> }
>
> len = xsk_pool_get_rx_frame_size(pool) + vi->hdr_len;
> @@ -1363,6 +1372,16 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
> goto err;
> }
>
> + if (need_wakeup) {
> + if (rq->vq->num_free)
> + /* We have free buffers, so we'd better wake up the
> + * rx napi as soon as possible.
> + */
> + xsk_set_rx_need_wakeup(pool);
> + else
> + xsk_clear_rx_need_wakeup(pool);
> + }
> +
Why do we need to set XDP_RING_NEED_WAKEUP even when
xsk_buff_alloc_batch succeeds?
> return num;
>
> err:
Thanks,
Quang Minh.
^ permalink raw reply
* Re: [PATCH 0/2] Fix lock errors in VDUSE suspend feature
From: Michael S. Tsirkin @ 2026-06-11 14:53 UTC (permalink / raw)
To: Eugenio Pérez
Cc: Maxime Coquelin, Stefano Garzarella, Jason Wang, Xuan Zhuo,
Laurent Vivier, virtualization, linux-kernel, Cindy Lu,
Yongji Xie
In-Reply-To: <20260611133806.198402-1-eperezma@redhat.com>
On Thu, Jun 11, 2026 at 03:38:04PM +0200, Eugenio Pérez wrote:
> Fix wrong ordering at taking semaphore after spinlock and convert the spinlock
> take and release into guards, so they are not lost after a return.
>
> This series goes on top of https://lore.kernel.org/lkml/20260610083452.477759-1-eperezma@redhat.com/
> ---
> It would be great if these can be squashed.
ok i figured out the confusion. squashed
>
> Eugenio Pérez (2):
> vduse: fix not releasing taken semaphore in vduse_dev_queue_irq_work
> vduse: not take the device semaphore while holding vq spinlock
>
> drivers/vdpa/vdpa_user/vduse_dev.c | 17 ++++++-----------
> 1 file changed, 6 insertions(+), 11 deletions(-)
>
> --
> 2.54.0
^ permalink raw reply
* Re: [PATCH 0/2] Fix lock errors in VDUSE suspend feature
From: Michael S. Tsirkin @ 2026-06-11 14:49 UTC (permalink / raw)
To: Eugenio Pérez
Cc: Maxime Coquelin, Stefano Garzarella, Jason Wang, Xuan Zhuo,
Laurent Vivier, virtualization, linux-kernel, Cindy Lu,
Yongji Xie
In-Reply-To: <20260611133806.198402-1-eperezma@redhat.com>
On Thu, Jun 11, 2026 at 03:38:04PM +0200, Eugenio Pérez wrote:
> Fix wrong ordering at taking semaphore after spinlock and convert the spinlock
> take and release into guards, so they are not lost after a return.
>
> This series goes on top of https://lore.kernel.org/lkml/20260610083452.477759-1-eperezma@redhat.com/
You mean
20260610-vduse_vq_kick-fix-guard-usage-v1-1-0ce02c08006e@kernel.org
actually
> ---
> It would be great if these can be squashed.
>
> Eugenio Pérez (2):
> vduse: fix not releasing taken semaphore in vduse_dev_queue_irq_work
> vduse: not take the device semaphore while holding vq spinlock
>
> drivers/vdpa/vdpa_user/vduse_dev.c | 17 ++++++-----------
> 1 file changed, 6 insertions(+), 11 deletions(-)
>
> --
> 2.54.0
^ permalink raw reply