From: "Michael S. Tsirkin" <mst@redhat.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: qemu-devel@nongnu.org, "Peter Maydell" <peter.maydell@linaro.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Lei Yang" <leiyang@redhat.com>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Yanan Wang" <wangyanan55@huawei.com>,
	"Zhao Liu" <zhao1.liu@intel.com>,
	"Gabriel Goller" <g.goller@proxmox.com>,
	"Stefan Hanreich" <s.hanreich@proxmox.com>,
	"Thomas Lamprecht" <t.lamprecht@proxmox.com>
Subject: Re: [PULL 12/14] virtio-net: Advertise UDP tunnel GSO support by default
Date: Fri, 5 Jun 2026 11:20:09 -0400
Message-ID: <20260605111823-mutt-send-email-mst@kernel.org>
In-Reply-To: <077647f6-d569-4918-9aea-c0597a6ddbc8@proxmox.com>

On Fri, Jun 05, 2026 at 04:02:39PM +0200, Fiona Ebner wrote:
> Dear maintainers,
> 
> On 09.11.25 at 4:10 PM, Michael S. Tsirkin wrote:
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index 17ed0ef919..3b85560f6f 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -4299,19 +4299,19 @@ static const Property virtio_net_properties[] = {
> >      VIRTIO_DEFINE_PROP_FEATURE("host_tunnel", VirtIONet,
> >                                 host_features_ex,
> >                                 VIRTIO_NET_F_HOST_UDP_TUNNEL_GSO,
> > -                               false),
> > +                               true),
> It seems that the host_tunnel setting can cause issues when VXLAN
> traffic originating in a guest goes over a physical NIC which does not
> support UDP tunnel segmentation offload. We received several reports
> about the issue [0][1][2][3] and were able to reproduce it. Turning off
> the 'host_tunnel' property on the command line for the VirtIO net device
> makes TCP traffic work. The network configuration from our reproducer
> setup is as follows:
> 
>       guest A (iperf3 -c)                   guest B (iperf3 -s)
>   vxlan using vNIC as underlay         vxlan using vNIC as underlay
> virtualized NIC exposed to guest     virtualized NIC exposed to guest
>     ---guest boundary---                  ---guest boundary---
>  tap device connected to bridge       tap device connected to bridge
> bridge with physical NIC as port     bridge with physical NIC as port
>         physical NIC   <---host boundary--->   physical NIC
> 
> Bridge configuration:
> iface vmbr0 inet static
> 	address 10.48.0.109/20
> 	gateway 10.48.0.1
> 	bridge-ports nic3
> 	bridge-stp off
> 	bridge-fd 0
> 	bridge-vlan-aware yes
> 	bridge-vids 2-4094
> 
> VXLAN created with:
> ip link add vxlan0 type vxlan id 100 remote X dstport 4789 dev eth1
> where eth1 is the virtualized NIC exposed to the guest
> 
> The physical NIC does not have the feature:
> Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme
> BCM5719 Gigabit Ethernet PCIe [14e4:1657] (rev 01)
> tx-udp_tnl-segmentation: off [fixed]
> tx-udp_tnl-csum-segmentation: off [fixed]
> 
> Using a physical NIC which does have the feature works:
> Ethernet controller [0200]: Broadcom Inc. and subsidiaries BCM57504
> NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb Ethernet [14e4:1751] (rev 11)
> tx-udp_tnl-segmentation: on
> tx-udp_tnl-csum-segmentation: on
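
To compare the two hosts quickly, the two flags quoted above can be filtered out of `ethtool -k` output. This is only a minimal sketch: the function name tnl_gso_state is hypothetical, and it only filters text on stdin, it does not query the NIC itself.

```shell
# tnl_gso_state (hypothetical helper): read `ethtool -k <dev>` output on
# stdin and print the state of the two UDP tunnel segmentation flags.
tnl_gso_state() {
    awk -F': ' '
        # ethtool indents some sub-features; strip leading whitespace first
        { sub(/^[ \t]+/, "", $1) }
        $1 ~ /^tx-udp_tnl-(csum-)?segmentation$/ {
            split($2, v, " ")      # drop a trailing "[fixed]" marker, if any
            print $1 "=" v[1]
        }'
}

# Sample input copied from the report (the BCM5719 without the feature)
printf '%s\n' \
    'tx-udp_tnl-segmentation: off [fixed]' \
    'tx-udp_tnl-csum-segmentation: off [fixed]' | tnl_gso_state
```

On a live host this would be fed from ethtool directly, e.g. `ethtool -k nic3 | tnl_gso_state`, where nic3 is the bridge port from the configuration above.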
> 
> Host kernel:
> Proxmox VE with 7.0.2-6-pve
> 
> Guest kernel:
> Alpine with 6.18.34-0-lts
> 
> QEMU command line for the vNIC:
> > -netdev 'type=tap,id=net2,ifname=tap103i2,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \
> > -device 'virtio-net-pci,mac=BC:24:11:78:C3:3B,netdev=net2,bus=pci.0,addr=0x14,id=net2,rx_queue_size=1024,tx_queue_size=256,host_mtu=1500' \
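
For reference, the workaround mentioned at the top of the report (turning off the 'host_tunnel' property) would amount to appending one property to the -device argument. A minimal sketch, assuming the usual on/off syntax for boolean device properties; all other values are copied from the command line above, and the variable name DEVICE_ARGS is only illustrative.

```shell
# Sketch: the -device argument from the report with host_tunnel disabled.
# Only ",host_tunnel=off" is new; everything else is taken from the report.
DEVICE_ARGS='virtio-net-pci,mac=BC:24:11:78:C3:3B,netdev=net2,bus=pci.0,addr=0x14,id=net2,rx_queue_size=1024,tx_queue_size=256,host_mtu=1500'
DEVICE_ARGS="${DEVICE_ARGS},host_tunnel=off"

# Print the resulting option so it can be pasted into the QEMU invocation
echo "-device '${DEVICE_ARGS}'"
```

As noted further down in the report, turning off host_tunnel_csum alone did not help, so the sketch touches host_tunnel only.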
> 
> We can see that QEMU sets the features for the tap interface via ioctl()
> and the host kernel allows it:
> tx-udp_tnl-segmentation: on
> tx-udp_tnl-csum-segmentation: on
> 
> As far as we understand, in the problematic scenario, nothing is ever
> filling in the checksums for the inner TCP packets, meaning the outer
> UDP checksum ends up being wrong on the target side. Is the host kernel
> responsible for doing that before passing the packet to the physical NIC
> (without the feature)? Or who would be?
> 
> Turning off host_tunnel_csum without turning off host_tunnel does not help.
> 
> Interestingly, turning off the features for the working physical NIC
> does not make it break:
> tx-udp_tnl-segmentation: off
> tx-udp_tnl-csum-segmentation: off
> Could it be that the NIC just always fills in the inner TCP checksums
> regardless of that setting?
> 
> On the other hand, running
> localhost:~# ethtool -K eth2 tx-checksum-ip-generic off
> Actual changes:
> tx-checksum-ip-generic: off
> tx-tcp-segmentation: off [not requested]
> tx-tcp-ecn-segmentation: off [not requested]
> tx-tcp6-segmentation: off [not requested]
> tx-udp-segmentation: off [not requested]
> inside the guests makes it work for the physical NIC without the
> tx-udp_tnl* features.
> 
> I wanted to ask whether this configuration is expected to be unsupported,
> and whether the management layer is expected to turn off the feature on
> the command line if the traffic might go over a physical NIC without the
> feature. Or could this be a kernel or NIC bug that should be
> investigated further? In the former case, should the option really be
> turned on by default with new machine versions?
> 
> I'll be happy to test and capture/provide additional information. Let me
> know what would be interesting.
> 
> Thanks to Stefan and Gabriel for looking into the issue with me!
> 
> Best Regards,
> Fiona
> 
> [0]: https://bugzilla.proxmox.com/show_bug.cgi?id=7627
> [1]: https://forum.proxmox.com/threads/183494/post-855144
> [2]: https://forum.proxmox.com/threads/182328/post-854627
> [3]: https://forum.proxmox.com/threads/183963/post-855737


Looks like a kernel or NIC bug to me.


-- 
MST


