netdev.vger.kernel.org archive mirror
* [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
@ 2025-04-02 21:12 Markus Fohrer
  2025-04-03 13:04 ` Michael S. Tsirkin
  2025-04-04  7:59 ` Torsten Krah
  0 siblings, 2 replies; 24+ messages in thread
From: Markus Fohrer @ 2025-04-02 21:12 UTC
  To: virtualization; +Cc: mst, jasowang, davem, edumazet, netdev, linux-kernel

Hi,

I'm observing a significant performance regression in KVM guest VMs using virtio-net with recent Linux kernels (6.8.1+ and 6.14).

When running on a host system equipped with a Broadcom NetXtreme-E (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the guest drops to 100–200 KB/s. The same guest configuration performs normally (~100 MB/s) when using kernel 6.8.0 or when the VM is moved to a host with Intel NICs.

Test environment:
- Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
- Guest: Linux with virtio-net interface
- NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host level)
- CPU: AMD EPYC
- Storage: virtio-scsi
- VM network: virtio-net, virtio-scsi (no CPU or IO bottlenecks)
- Traffic test: iperf3, scp, wget consistently slow in guest

This issue is not present:
- On 6.8.0 
- On hosts with Intel NICs (same VM config)

I have bisected the issue to the following upstream commit:

  49d14b54a527 ("virtio-net: Suppress tx timeout warning for small tx")
  https://git.kernel.org/linus/49d14b54a527

Reverting this commit restores normal network performance in affected guest VMs.

I’m happy to provide more data or assist with testing a potential fix.

Thanks,
Markus Fohrer



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-02 21:12 [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+ Markus Fohrer
@ 2025-04-03 13:04 ` Michael S. Tsirkin
  2025-04-03 13:51   ` Markus Fohrer
  2025-04-04  8:16   ` Markus Fohrer
  2025-04-04  7:59 ` Torsten Krah
  1 sibling, 2 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2025-04-03 13:04 UTC
  To: Markus Fohrer
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> Hi,
> 
> I'm observing a significant performance regression in KVM guest VMs using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> 
> When running on a host system equipped with a Broadcom NetXtreme-E (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the guest drops to 100–200 KB/s. The same guest configuration performs normally (~100 MB/s) when using kernel 6.8.0 or when the VM is moved to a host with Intel NICs.
> 
> Test environment:
> - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> - Guest: Linux with virtio-net interface
> - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host level)
> - CPU: AMD EPYC
> - Storage: virtio-scsi
> - VM network: virtio-net, virtio-scsi (no CPU or IO bottlenecks)
> - Traffic test: iperf3, scp, wget consistently slow in guest
> 
> This issue is not present:
> - On 6.8.0 
> - On hosts with Intel NICs (same VM config)
> 
> I have bisected the issue to the following upstream commit:
> 
>   49d14b54a527 ("virtio-net: Suppress tx timeout warning for small tx")
>   https://git.kernel.org/linus/49d14b54a527

Thanks a lot for the info!


both the link and commit point at:

commit 49d14b54a527289d09a9480f214b8c586322310a
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Sep 26 16:58:36 2024 +0000

    net: test for not too small csum_start in virtio_net_hdr_to_skb()
    

is this what you mean?

I don't know which commit is "virtio-net: Suppress tx timeout warning for small tx"



> Reverting this commit restores normal network performance in affected guest VMs.
> 
> I’m happy to provide more data or assist with testing a potential fix.
> 
> Thanks,
> Markus Fohrer


Thanks! First I think it's worth checking what is the setup, e.g.
which offloads are enabled.
Besides that, I'd start by seeing what's going on. Assuming I'm right about
Eric's patch:

diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 276ca543ef44d8..02a9f4dc594d02 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
 
 		if (!skb_partial_csum_set(skb, start, off))
 			return -EINVAL;
+		if (skb_transport_offset(skb) < nh_min_len)
+			return -EINVAL;
 
-		nh_min_len = max_t(u32, nh_min_len, skb_transport_offset(skb));
+		nh_min_len = skb_transport_offset(skb);
 		p_off = nh_min_len + thlen;
 		if (!pskb_may_pull(skb, p_off))
 			return -EINVAL;


sticking a printk before return -EINVAL to show the offset and nh_min_len
would be a good 1st step. Thanks!
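
To make the behaviour change in the diff above concrete, here is a minimal userspace sketch, not kernel code. It assumes `transport_off` stands in for `skb_transport_offset(skb)`; the function names `nh_len_before` and `nh_len_after` are hypothetical, and the `fprintf()` only plays the role of the suggested printk.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Userspace model of the check changed by the patch. Nothing here is
 * kernel code: transport_off stands in for skb_transport_offset(skb),
 * and the function names are hypothetical.
 */

/* Before the patch: a too-small offset was silently raised to nh_min_len. */
static unsigned int nh_len_before(unsigned int transport_off,
                                  unsigned int nh_min_len)
{
    return transport_off > nh_min_len ? transport_off : nh_min_len;
}

/*
 * After the patch: a too-small offset is rejected with -EINVAL.
 * The fprintf() plays the role of the suggested printk.
 */
static int nh_len_after(unsigned int transport_off, unsigned int nh_min_len,
                        unsigned int *nh_len)
{
    assert(nh_len != NULL);

    if (transport_off < nh_min_len) {
        fprintf(stderr, "drop: transport_offset=%u nh_min_len=%u\n",
                transport_off, nh_min_len);
        return -EINVAL;
    }

    *nh_len = transport_off;
    return 0;
}
```

With purely illustrative values, an offset of 34 against a minimum of 20 passes both versions unchanged, whereas an offset of 10 was raised to 20 before the patch and is dropped after it. If the real printk fires on normal traffic, the logged pair should show which side of that comparison is unexpected.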



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 13:04 ` Michael S. Tsirkin
@ 2025-04-03 13:51   ` Markus Fohrer
  2025-04-03 14:03     ` Michael S. Tsirkin
  2025-04-04  8:16   ` Markus Fohrer
  1 sibling, 1 reply; 24+ messages in thread
From: Markus Fohrer @ 2025-04-03 13:51 UTC
  To: Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Thursday, 2025-04-03 at 09:04 -0400, Michael S. Tsirkin wrote:
> On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > Hi,
> > 
> > I'm observing a significant performance regression in KVM guest VMs
> > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > 
> > When running on a host system equipped with a Broadcom NetXtreme-E
> > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the
> > guest drops to 100–200 KB/s. The same guest configuration performs
> > normally (~100 MB/s) when using kernel 6.8.0 or when the VM is
> > moved to a host with Intel NICs.
> > 
> > Test environment:
> > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > - Guest: Linux with virtio-net interface
> > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host level)
> > - CPU: AMD EPYC
> > - Storage: virtio-scsi
> > - VM network: virtio-net, virtio-scsi (no CPU or IO bottlenecks)
> > - Traffic test: iperf3, scp, wget consistently slow in guest
> > 
> > This issue is not present:
> > - On 6.8.0 
> > - On hosts with Intel NICs (same VM config)
> > 
> > I have bisected the issue to the following upstream commit:
> > 
> >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for small
> > tx")
> >   https://git.kernel.org/linus/49d14b54a527
> 
> Thanks a lot for the info!
> 
> 
> both the link and commit point at:
> 
> commit 49d14b54a527289d09a9480f214b8c586322310a
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Thu Sep 26 16:58:36 2024 +0000
> 
>     net: test for not too small csum_start in virtio_net_hdr_to_skb()
>     
> 
> is this what you mean?
> 
> I don't know which commit is "virtio-net: Suppress tx timeout warning
> for small tx"
> 
> 
> 
> > Reverting this commit restores normal network performance in
> > affected guest VMs.
> > 
> > I’m happy to provide more data or assist with testing a potential
> > fix.
> > 
> > Thanks,
> > Markus Fohrer
> 
> 
> Thanks! First I think it's worth checking what is the setup, e.g.
> which offloads are enabled.
> Besides that, I'd start by seeing what's doing on. Assuming I'm right
> about
> Eric's patch:
> 
> diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> index 276ca543ef44d8..02a9f4dc594d02 100644
> --- a/include/linux/virtio_net.h
> +++ b/include/linux/virtio_net.h
> @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct
> sk_buff *skb,
>  
>  		if (!skb_partial_csum_set(skb, start, off))
>  			return -EINVAL;
> +		if (skb_transport_offset(skb) < nh_min_len)
> +			return -EINVAL;
>  
> -		nh_min_len = max_t(u32, nh_min_len,
> skb_transport_offset(skb));
> +		nh_min_len = skb_transport_offset(skb);
>  		p_off = nh_min_len + thlen;
>  		if (!pskb_may_pull(skb, p_off))
>  			return -EINVAL;
> 
> 
> sticking a printk before return -EINVAL to show the offset and
> nh_min_len
> would be a good 1st step. Thanks!
> 


Hi Eric,

thanks a lot for the quick response — and yes, you're absolutely right.

Apologies for the confusion: I mistakenly wrote the wrong commit
description in my initial mail.

The correct commit is indeed:

commit 49d14b54a527289d09a9480f214b8c586322310a
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Sep 26 16:58:36 2024 +0000

    net: test for not too small csum_start in virtio_net_hdr_to_skb()

This is the one I bisected and which causes the performance regression
in my environment.

Thanks again,
Markus



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 13:51   ` Markus Fohrer
@ 2025-04-03 14:03     ` Michael S. Tsirkin
  2025-04-03 14:26       ` Willem de Bruijn
  2025-04-03 20:07       ` Markus Fohrer
  0 siblings, 2 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2025-04-03 14:03 UTC
  To: Markus Fohrer
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Thu, Apr 03, 2025 at 03:51:01PM +0200, Markus Fohrer wrote:
> On Thursday, 2025-04-03 at 09:04 -0400, Michael S. Tsirkin wrote:
> > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > Hi,
> > > 
> > > I'm observing a significant performance regression in KVM guest VMs
> > > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > 
> > > When running on a host system equipped with a Broadcom NetXtreme-E
> > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the
> > > guest drops to 100–200 KB/s. The same guest configuration performs
> > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM is
> > > moved to a host with Intel NICs.
> > > 
> > > Test environment:
> > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > - Guest: Linux with virtio-net interface
> > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host level)
> > > - CPU: AMD EPYC
> > > - Storage: virtio-scsi
> > > - VM network: virtio-net, virtio-scsi (no CPU or IO bottlenecks)
> > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > 
> > > This issue is not present:
> > > - On 6.8.0 
> > > - On hosts with Intel NICs (same VM config)
> > > 
> > > I have bisected the issue to the following upstream commit:
> > > 
> > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for small
> > > tx")
> > >   https://git.kernel.org/linus/49d14b54a527
> > 
> > Thanks a lot for the info!
> > 
> > 
> > both the link and commit point at:
> > 
> > commit 49d14b54a527289d09a9480f214b8c586322310a
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Thu Sep 26 16:58:36 2024 +0000
> > 
> >     net: test for not too small csum_start in virtio_net_hdr_to_skb()
> >     
> > 
> > is this what you mean?
> > 
> > I don't know which commit is "virtio-net: Suppress tx timeout warning
> > for small tx"
> > 
> > 
> > 
> > > Reverting this commit restores normal network performance in
> > > affected guest VMs.
> > > 
> > > I’m happy to provide more data or assist with testing a potential
> > > fix.
> > > 
> > > Thanks,
> > > Markus Fohrer
> > 
> > 
> > Thanks! First I think it's worth checking what is the setup, e.g.
> > which offloads are enabled.
> > Besides that, I'd start by seeing what's doing on. Assuming I'm right
> > about
> > Eric's patch:
> > 
> > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > index 276ca543ef44d8..02a9f4dc594d02 100644
> > --- a/include/linux/virtio_net.h
> > +++ b/include/linux/virtio_net.h
> > @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct
> > sk_buff *skb,
> >  
> >  		if (!skb_partial_csum_set(skb, start, off))
> >  			return -EINVAL;
> > +		if (skb_transport_offset(skb) < nh_min_len)
> > +			return -EINVAL;
> >  
> > -		nh_min_len = max_t(u32, nh_min_len,
> > skb_transport_offset(skb));
> > +		nh_min_len = skb_transport_offset(skb);
> >  		p_off = nh_min_len + thlen;
> >  		if (!pskb_may_pull(skb, p_off))
> >  			return -EINVAL;
> > 
> > 
> > sticking a printk before return -EINVAL to show the offset and
> > nh_min_len
> > would be a good 1st step. Thanks!
> > 
> 
> 
> Hi Eric,
> 
> thanks a lot for the quick response — and yes, you're absolutely right.
> 
> Apologies for the confusion: I mistakenly wrote the wrong commit
> description in my initial mail.
> 
> The correct commit is indeed:
> 
> commit 49d14b54a527289d09a9480f214b8c586322310a
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Thu Sep 26 16:58:36 2024 +0000
> 
>     net: test for not too small csum_start in virtio_net_hdr_to_skb()
> 
> This is the one I bisected and which causes the performance regression
> in my environment.
> 
> Thanks again,
> Markus


I'm not Eric but good to know.
Alright, so I would start with the two items: device features and
printk.

-- 
MST



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 14:03     ` Michael S. Tsirkin
@ 2025-04-03 14:26       ` Willem de Bruijn
  2025-04-03 20:00         ` Markus Fohrer
  2025-04-03 20:35         ` Markus Fohrer
  2025-04-03 20:07       ` Markus Fohrer
  1 sibling, 2 replies; 24+ messages in thread
From: Willem de Bruijn @ 2025-04-03 14:26 UTC
  To: Michael S. Tsirkin, Markus Fohrer
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

Michael S. Tsirkin wrote:
> On Thu, Apr 03, 2025 at 03:51:01PM +0200, Markus Fohrer wrote:
> > On Thursday, 2025-04-03 at 09:04 -0400, Michael S. Tsirkin wrote:
> > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > > Hi,
> > > > 
> > > > I'm observing a significant performance regression in KVM guest VMs
> > > > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > > 
> > > > When running on a host system equipped with a Broadcom NetXtreme-E
> > > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the
> > > > guest drops to 100–200 KB/s. The same guest configuration performs
> > > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM is
> > > > moved to a host with Intel NICs.
> > > > 
> > > > Test environment:
> > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > - Guest: Linux with virtio-net interface
> > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host level)
> > > > - CPU: AMD EPYC
> > > > - Storage: virtio-scsi
> > > > - VM network: virtio-net, virtio-scsi (no CPU or IO bottlenecks)
> > > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > > 
> > > > This issue is not present:
> > > > - On 6.8.0 
> > > > - On hosts with Intel NICs (same VM config)
> > > > 
> > > > I have bisected the issue to the following upstream commit:
> > > > 
> > > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for small
> > > > tx")
> > > >   https://git.kernel.org/linus/49d14b54a527
> > > 
> > > Thanks a lot for the info!
> > > 
> > > 
> > > both the link and commit point at:
> > > 
> > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > Author: Eric Dumazet <edumazet@google.com>
> > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > 
> > >     net: test for not too small csum_start in virtio_net_hdr_to_skb()
> > >     
> > > 
> > > is this what you mean?
> > > 
> > > I don't know which commit is "virtio-net: Suppress tx timeout warning
> > > for small tx"
> > > 
> > > 
> > > 
> > > > Reverting this commit restores normal network performance in
> > > > affected guest VMs.
> > > > 
> > > > I’m happy to provide more data or assist with testing a potential
> > > > fix.
> > > > 
> > > > Thanks,
> > > > Markus Fohrer
> > > 
> > > 
> > > Thanks! First I think it's worth checking what is the setup, e.g.
> > > which offloads are enabled.
> > > Besides that, I'd start by seeing what's doing on. Assuming I'm right
> > > about
> > > Eric's patch:
> > > 
> > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > --- a/include/linux/virtio_net.h
> > > +++ b/include/linux/virtio_net.h
> > > @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct
> > > sk_buff *skb,
> > >  
> > >  		if (!skb_partial_csum_set(skb, start, off))
> > >  			return -EINVAL;
> > > +		if (skb_transport_offset(skb) < nh_min_len)
> > > +			return -EINVAL;
> > >  
> > > -		nh_min_len = max_t(u32, nh_min_len,
> > > skb_transport_offset(skb));
> > > +		nh_min_len = skb_transport_offset(skb);
> > >  		p_off = nh_min_len + thlen;
> > >  		if (!pskb_may_pull(skb, p_off))
> > >  			return -EINVAL;
> > > 
> > > 
> > > sticking a printk before return -EINVAL to show the offset and
> > > nh_min_len
> > > would be a good 1st step. Thanks!
> > > 
> > 
> > 
> > Hi Eric,
> > 
> > thanks a lot for the quick response — and yes, you're absolutely right.
> > 
> > Apologies for the confusion: I mistakenly wrote the wrong commit
> > description in my initial mail.
> > 
> > The correct commit is indeed:
> > 
> > commit 49d14b54a527289d09a9480f214b8c586322310a
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Thu Sep 26 16:58:36 2024 +0000
> > 
> >     net: test for not too small csum_start in virtio_net_hdr_to_skb()
> > 
> > This is the one I bisected and which causes the performance regression
> > in my environment.

This commit was introduced in v6.12.

You say 6.8 is good, but 6.8.1 is bad. This commit is not in 6.8.1.
Nor any virtio-net related change:

$ git log --oneline linux/v6.8..linux/v6.8.1 -- include/linux/virtio_net.h drivers/net/virtio_net.c | wc -l
0

Is it perhaps a 6.8.1 derived distro kernel?

That patch detects silly packets created by a fuzzer. It should not
affect sane traffic. Not saying your analysis is wrong. We just need
more data to understand the regression better.


* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 14:26       ` Willem de Bruijn
@ 2025-04-03 20:00         ` Markus Fohrer
  2025-04-03 20:35         ` Markus Fohrer
  1 sibling, 0 replies; 24+ messages in thread
From: Markus Fohrer @ 2025-04-03 20:00 UTC
  To: Willem de Bruijn, Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Thursday, 2025-04-03 at 10:26 -0400, Willem de Bruijn wrote:
> Michael S. Tsirkin wrote:
> > On Thu, Apr 03, 2025 at 03:51:01PM +0200, Markus Fohrer wrote:
> > > On Thursday, 2025-04-03 at 09:04 -0400, Michael S. Tsirkin wrote:
> > > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > > > Hi,
> > > > > 
> > > > > I'm observing a significant performance regression in KVM
> > > > > guest VMs
> > > > > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > > > 
> > > > > When running on a host system equipped with a Broadcom
> > > > > NetXtreme-E
> > > > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in
> > > > > the
> > > > > guest drops to 100–200 KB/s. The same guest configuration
> > > > > performs
> > > > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM
> > > > > is
> > > > > moved to a host with Intel NICs.
> > > > > 
> > > > > Test environment:
> > > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > > - Guest: Linux with virtio-net interface
> > > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host
> > > > > level)
> > > > > - CPU: AMD EPYC
> > > > > - Storage: virtio-scsi
> > > > > - VM network: virtio-net, virtio-scsi (no CPU or IO
> > > > > bottlenecks)
> > > > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > > > 
> > > > > This issue is not present:
> > > > > - On 6.8.0 
> > > > > - On hosts with Intel NICs (same VM config)
> > > > > 
> > > > > I have bisected the issue to the following upstream commit:
> > > > > 
> > > > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for
> > > > > small
> > > > > tx")
> > > > >   https://git.kernel.org/linus/49d14b54a527
> > > > 
> > > > Thanks a lot for the info!
> > > > 
> > > > 
> > > > both the link and commit point at:
> > > > 
> > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > Author: Eric Dumazet <edumazet@google.com>
> > > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > > 
> > > >     net: test for not too small csum_start in
> > > > virtio_net_hdr_to_skb()
> > > >     
> > > > 
> > > > is this what you mean?
> > > > 
> > > > I don't know which commit is "virtio-net: Suppress tx timeout
> > > > warning
> > > > for small tx"
> > > > 
> > > > 
> > > > 
> > > > > Reverting this commit restores normal network performance in
> > > > > affected guest VMs.
> > > > > 
> > > > > I’m happy to provide more data or assist with testing a
> > > > > potential
> > > > > fix.
> > > > > 
> > > > > Thanks,
> > > > > Markus Fohrer
> > > > 
> > > > 
> > > > Thanks! First I think it's worth checking what is the setup,
> > > > e.g.
> > > > which offloads are enabled.
> > > > Besides that, I'd start by seeing what's doing on. Assuming I'm
> > > > right
> > > > about
> > > > Eric's patch:
> > > > 
> > > > diff --git a/include/linux/virtio_net.h
> > > > b/include/linux/virtio_net.h
> > > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > > --- a/include/linux/virtio_net.h
> > > > +++ b/include/linux/virtio_net.h
> > > > @@ -103,8 +103,10 @@ static inline int
> > > > virtio_net_hdr_to_skb(struct
> > > > sk_buff *skb,
> > > >  
> > > >  		if (!skb_partial_csum_set(skb, start, off))
> > > >  			return -EINVAL;
> > > > +		if (skb_transport_offset(skb) < nh_min_len)
> > > > +			return -EINVAL;
> > > >  
> > > > -		nh_min_len = max_t(u32, nh_min_len,
> > > > skb_transport_offset(skb));
> > > > +		nh_min_len = skb_transport_offset(skb);
> > > >  		p_off = nh_min_len + thlen;
> > > >  		if (!pskb_may_pull(skb, p_off))
> > > >  			return -EINVAL;
> > > > 
> > > > 
> > > > sticking a printk before return -EINVAL to show the offset and
> > > > nh_min_len
> > > > would be a good 1st step. Thanks!
> > > > 
> > > 
> > > 
> > > Hi Eric,
> > > 
> > > thanks a lot for the quick response — and yes, you're absolutely
> > > right.
> > > 
> > > Apologies for the confusion: I mistakenly wrote the wrong commit
> > > description in my initial mail.
> > > 
> > > The correct commit is indeed:
> > > 
> > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > Author: Eric Dumazet <edumazet@google.com>
> > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > 
> > >     net: test for not too small csum_start in
> > > virtio_net_hdr_to_skb()
> > > 
> > > This is the one I bisected and which causes the performance
> > > regression
> > > in my environment.
> 
> This commit is introduced in v6.12.
> 
> You say 6.8 is good, but 6.8.1 is bad. This commit is not in 6.8.1.
> Nor any virtio-net related change:
> 
> $ git log --oneline linux/v6.8..linux/v6.8.1 --
> include/linux/virtio_net.h drivers/net/virtio_net.c | wc -l
> 0
> 
> Is it perhaps a 6.8.1 derived distro kernel?
> 
> That patch detects silly packets created by a fuzzer. It should not
> affect sane traffic. Not saying your analysis is wrong. We just need
> more data to understand the regression better.

Thanks for the clarification. You're right: my initial `git bisect`
was performed on Ubuntu's 6.8-based kernels (e.g. 6.8.0-31 to
6.8.0-53), so it likely included backports not present in upstream 6.8.1.

This explains the confusion around commit 49d14b54a527 — sorry about
that.

To confirm: I can reproduce the regression using the mainline 6.14
kernel from kernel.org. So the issue still exists upstream, even though
the exact bisect result needs to be redone with mainline-only sources.

I’ll collect and share more information (device features, virtio state,
etc.) as you suggested to help narrow it down.

Thanks again,  
Markus


* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 14:03     ` Michael S. Tsirkin
  2025-04-03 14:26       ` Willem de Bruijn
@ 2025-04-03 20:07       ` Markus Fohrer
  2025-04-03 21:06         ` Michael S. Tsirkin
  1 sibling, 1 reply; 24+ messages in thread
From: Markus Fohrer @ 2025-04-03 20:07 UTC
  To: Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Thursday, 2025-04-03 at 10:03 -0400, Michael S. Tsirkin wrote:
> On Thu, Apr 03, 2025 at 03:51:01PM +0200, Markus Fohrer wrote:
> > On Thursday, 2025-04-03 at 09:04 -0400, Michael S. Tsirkin wrote:
> > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > > Hi,
> > > > 
> > > > I'm observing a significant performance regression in KVM guest
> > > > VMs
> > > > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > > 
> > > > When running on a host system equipped with a Broadcom
> > > > NetXtreme-E
> > > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the
> > > > guest drops to 100–200 KB/s. The same guest configuration
> > > > performs
> > > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM is
> > > > moved to a host with Intel NICs.
> > > > 
> > > > Test environment:
> > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > - Guest: Linux with virtio-net interface
> > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host
> > > > level)
> > > > - CPU: AMD EPYC
> > > > - Storage: virtio-scsi
> > > > - VM network: virtio-net, virtio-scsi (no CPU or IO
> > > > bottlenecks)
> > > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > > 
> > > > This issue is not present:
> > > > - On 6.8.0 
> > > > - On hosts with Intel NICs (same VM config)
> > > > 
> > > > I have bisected the issue to the following upstream commit:
> > > > 
> > > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for
> > > > small
> > > > tx")
> > > >   https://git.kernel.org/linus/49d14b54a527
> > > 
> > > Thanks a lot for the info!
> > > 
> > > 
> > > both the link and commit point at:
> > > 
> > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > Author: Eric Dumazet <edumazet@google.com>
> > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > 
> > >     net: test for not too small csum_start in
> > > virtio_net_hdr_to_skb()
> > >     
> > > 
> > > is this what you mean?
> > > 
> > > I don't know which commit is "virtio-net: Suppress tx timeout
> > > warning
> > > for small tx"
> > > 
> > > 
> > > 
> > > > Reverting this commit restores normal network performance in
> > > > affected guest VMs.
> > > > 
> > > > I’m happy to provide more data or assist with testing a
> > > > potential
> > > > fix.
> > > > 
> > > > Thanks,
> > > > Markus Fohrer
> > > 
> > > 
> > > Thanks! First I think it's worth checking what is the setup, e.g.
> > > which offloads are enabled.
> > > Besides that, I'd start by seeing what's doing on. Assuming I'm
> > > right
> > > about
> > > Eric's patch:
> > > 
> > > diff --git a/include/linux/virtio_net.h
> > > b/include/linux/virtio_net.h
> > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > --- a/include/linux/virtio_net.h
> > > +++ b/include/linux/virtio_net.h
> > > @@ -103,8 +103,10 @@ static inline int
> > > virtio_net_hdr_to_skb(struct
> > > sk_buff *skb,
> > >  
> > >  		if (!skb_partial_csum_set(skb, start, off))
> > >  			return -EINVAL;
> > > +		if (skb_transport_offset(skb) < nh_min_len)
> > > +			return -EINVAL;
> > >  
> > > -		nh_min_len = max_t(u32, nh_min_len,
> > > skb_transport_offset(skb));
> > > +		nh_min_len = skb_transport_offset(skb);
> > >  		p_off = nh_min_len + thlen;
> > >  		if (!pskb_may_pull(skb, p_off))
> > >  			return -EINVAL;
> > > 
> > > 
> > > sticking a printk before return -EINVAL to show the offset and
> > > nh_min_len
> > > would be a good 1st step. Thanks!
> > > 
> > 
> > 
> > Hi Eric,
> > 
> > thanks a lot for the quick response — and yes, you're absolutely
> > right.
> > 
> > Apologies for the confusion: I mistakenly wrote the wrong commit
> > description in my initial mail.
> > 
> > The correct commit is indeed:
> > 
> > commit 49d14b54a527289d09a9480f214b8c586322310a
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Thu Sep 26 16:58:36 2024 +0000
> > 
> >     net: test for not too small csum_start in
> > virtio_net_hdr_to_skb()
> > 
> > This is the one I bisected and which causes the performance
> > regression
> > in my environment.
> > 
> > Thanks again,
> > Markus
> 
> 
> I'm not Eric but good to know.
> Alright, so I would start with the two items: device features and
> printk.
> 

as requested, here’s the device/feature information from the guest
running kernel 6.14 (mainline):

Interface: ens18

ethtool -i ens18:
driver: virtio_net
version: 1.0.0
firmware-version: 
expansion-rom-version: 
bus-info: 0000:00:12.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no


ethtool -k ens18:
Features for ens18:
rx-checksumming: on [fixed]
tx-checksumming: on
	tx-checksum-ipv4: off [fixed]
	tx-checksum-ip-generic: on
	tx-checksum-ipv6: off [fixed]
	tx-checksum-fcoe-crc: off [fixed]
	tx-checksum-sctp: off [fixed]
scatter-gather: on
	tx-scatter-gather: on
	tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
	tx-tcp-segmentation: on
	tx-tcp-ecn-segmentation: on
	tx-tcp-mangleid-segmentation: off
	tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-gso-robust: on [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-tunnel-remcsum-segmentation: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off
tx-gso-list: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: on
tls-hw-record: off [fixed]
rx-gro-list: off
macsec-hw-offload: off [fixed]
rx-udp-gro-forwarding: off
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]

ethtool ens18:
Settings for ens18:
	Supported ports: [  ]
	Supported link modes:   Not reported
	Supported pause frame use: No
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  Not reported
	Advertised pause frame use: No
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: Unknown!
	Duplex: Unknown! (255)
	Auto-negotiation: off
	Port: Other
	PHYAD: 0
	Transceiver: internal
netlink error: Operation not permitted
	Link detected: yes


Kernel log (journalctl -k):
Apr 03 19:50:37 kb-test.allod.com kernel: virtio_scsi virtio2: 4/0/0 default/read/poll queues
Apr 03 19:50:37 kb-test.allod.com kernel: virtio_net virtio1 ens18: renamed from eth0

Let me know if you’d like comparison data from kernel 6.11 or any
additional tests.


* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 14:26       ` Willem de Bruijn
  2025-04-03 20:00         ` Markus Fohrer
@ 2025-04-03 20:35         ` Markus Fohrer
  1 sibling, 0 replies; 24+ messages in thread
From: Markus Fohrer @ 2025-04-03 20:35 UTC (permalink / raw)
  To: Willem de Bruijn, Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Thursday, 03 Apr 2025 at 10:26 -0400, Willem de Bruijn wrote:
> 
> This commit is introduced in v6.12.
> 
> You say 6.8 is good, but 6.8.1 is bad. This commit is not in 6.8.1.
> Nor any virtio-net related change:
> 
> $ git log --oneline linux/v6.8..linux/v6.8.1 -- include/linux/virtio_net.h drivers/net/virtio_net.c | wc -l
> 0
> 
> Is it perhaps a 6.8.1 derived distro kernel?
> 
> That patch detects silly packets created by a fuzzer. It should not
> affect sane traffic. Not saying your analysis is wrong. We just need
> more data to understand the regression better.


To clarify: my earlier tests were based on Ubuntu-patched kernels
(e.g., 6.8.0-31 to 6.8.0-53).

I've now repeated the tests using clean mainline kernels from
kernel.org.

Download speed was measured using:
  wget -O /dev/null http://speedtest.belwue.net/10G

Results:
- Kernel 6.11: ~85 MB/s
- Kernel 6.12 and 6.14: < 200 KB/s

This confirms that the regression was introduced between v6.11 and
v6.12 in upstream.



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 20:07       ` Markus Fohrer
@ 2025-04-03 21:06         ` Michael S. Tsirkin
  2025-04-03 21:24           ` Markus Fohrer
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2025-04-03 21:06 UTC (permalink / raw)
  To: Markus Fohrer
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Thu, Apr 03, 2025 at 10:07:12PM +0200, Markus Fohrer wrote:
> as requested, here’s the device/feature information from the guest
> running kernel 6.14 (mainline):
> 
> [...]
> 
> Let me know if you’d like comparison data from kernel 6.11 or any
> additional tests


I think we should redo the bisect first; then I will suggest which traces to add.


* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 21:06         ` Michael S. Tsirkin
@ 2025-04-03 21:24           ` Markus Fohrer
  2025-04-03 21:49             ` Willem de Bruijn
  2025-04-03 22:05             ` Michael S. Tsirkin
  0 siblings, 2 replies; 24+ messages in thread
From: Markus Fohrer @ 2025-04-03 21:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Thursday, 03 Apr 2025 at 17:06 -0400, Michael S. Tsirkin wrote:
> I think we should redo the bisect first; then I will suggest which
> traces to add.
> 

The build with the added printk is currently running. I’ll test it
shortly and report the results.

Should the bisect be done between v6.11 and v6.12, or v6.11 and v6.14?
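
If the bisect is redone, the clean-kernel results reported earlier in the thread (6.11 good, 6.12 bad) suggest that the narrower v6.11..v6.12 range should be sufficient. A minimal sketch, assuming a mainline git clone that has both tags; `start_bisect` is a hypothetical helper name, not something from the thread:

```shell
# Hypothetical helper: start a bisect between a known-good and a known-bad
# revision. Assumes the current directory is a git clone containing both.
start_bisect() {
    good=$1
    bad=$2
    # git bisect start takes the bad revision first, then the good one(s)
    git bisect start "$bad" "$good"
}

# Intended use in a mainline tree (not run here):
#   start_bisect v6.11 v6.12
#   ... build, boot the guest, rerun the wget test ...
#   git bisect good    # or: git bisect bad
#   git bisect reset   # when finished
```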


* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 21:24           ` Markus Fohrer
@ 2025-04-03 21:49             ` Willem de Bruijn
  2025-04-03 22:05             ` Michael S. Tsirkin
  1 sibling, 0 replies; 24+ messages in thread
From: Willem de Bruijn @ 2025-04-03 21:49 UTC (permalink / raw)
  To: Markus Fohrer, Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

Markus Fohrer wrote:
> The build with the added printk is currently running. I’ll test it
> shortly and report the results.
> 
> Should the bisect be done between v6.11 and v6.12, or v6.11 and v6.14?

If reverting one specific patch resolved it, that's a big smoking gun.
No need to bisect a huge stack of patches then again, imho.

Maybe check out that SHA1 and the one before it, and verify that the
results match your earlier experience?



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 21:24           ` Markus Fohrer
  2025-04-03 21:49             ` Willem de Bruijn
@ 2025-04-03 22:05             ` Michael S. Tsirkin
  2025-04-04 11:32               ` Markus Fohrer
  1 sibling, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2025-04-03 22:05 UTC (permalink / raw)
  To: Markus Fohrer
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Thu, Apr 03, 2025 at 11:24:43PM +0200, Markus Fohrer wrote:
> Am Donnerstag, dem 03.04.2025 um 17:06 -0400 schrieb Michael S.
> Tsirkin:
> > On Thu, Apr 03, 2025 at 10:07:12PM +0200, Markus Fohrer wrote:
> > > Am Donnerstag, dem 03.04.2025 um 10:03 -0400 schrieb Michael S.
> > > Tsirkin:
> > > > On Thu, Apr 03, 2025 at 03:51:01PM +0200, Markus Fohrer wrote:
> > > > > Am Donnerstag, dem 03.04.2025 um 09:04 -0400 schrieb Michael S.
> > > > > Tsirkin:
> > > > > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer
> > > > > > wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > I'm observing a significant performance regression in KVM
> > > > > > > guest
> > > > > > > VMs
> > > > > > > using virtio-net with recent Linux kernels (6.8.1+ and
> > > > > > > 6.14).
> > > > > > > 
> > > > > > > When running on a host system equipped with a Broadcom
> > > > > > > NetXtreme-E
> > > > > > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in
> > > > > > > the
> > > > > > > guest drops to 100–200 KB/s. The same guest configuration
> > > > > > > performs
> > > > > > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM
> > > > > > > is
> > > > > > > moved to a host with Intel NICs.
> > > > > > > 
> > > > > > > Test environment:
> > > > > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > > > > - Guest: Linux with virtio-net interface
> > > > > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host
> > > > > > > level)
> > > > > > > - CPU: AMD EPYC
> > > > > > > - Storage: virtio-scsi
> > > > > > > - VM network: virtio-net, virtio-scsi (no CPU or IO
> > > > > > > bottlenecks)
> > > > > > > - Traffic test: iperf3, scp, wget consistently slow in
> > > > > > > guest
> > > > > > > 
> > > > > > > This issue is not present:
> > > > > > > - On 6.8.0 
> > > > > > > - On hosts with Intel NICs (same VM config)
> > > > > > > 
> > > > > > > I have bisected the issue to the following upstream commit:
> > > > > > > 
> > > > > > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning
> > > > > > > for
> > > > > > > small
> > > > > > > tx")
> > > > > > >   https://git.kernel.org/linus/49d14b54a527
> > > > > > 
> > > > > > Thanks a lot for the info!
> > > > > > 
> > > > > > 
> > > > > > both the link and commit point at:
> > > > > > 
> > > > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > > > Author: Eric Dumazet <edumazet@google.com>
> > > > > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > > > > 
> > > > > >     net: test for not too small csum_start in
> > > > > > virtio_net_hdr_to_skb()
> > > > > >     
> > > > > > 
> > > > > > is this what you mean?
> > > > > > 
> > > > > > I don't know which commit is "virtio-net: Suppress tx timeout
> > > > > > warning
> > > > > > for small tx"
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > Reverting this commit restores normal network performance
> > > > > > > in
> > > > > > > affected guest VMs.
> > > > > > > 
> > > > > > > I’m happy to provide more data or assist with testing a
> > > > > > > potential
> > > > > > > fix.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Markus Fohrer
> > > > > > 
> > > > > > 
> > > > > > Thanks! First I think it's worth checking what is the setup,
> > > > > > e.g.
> > > > > > which offloads are enabled.
> > > > > > > Besides that, I'd start by seeing what's going on. Assuming
> > > > > > I'm
> > > > > > right
> > > > > > about
> > > > > > Eric's patch:
> > > > > > 
> > > > > > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > > > > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > > > > --- a/include/linux/virtio_net.h
> > > > > > +++ b/include/linux/virtio_net.h
> > > > > > @@ -103,8 +103,10 @@ static inline int
> > > > > > virtio_net_hdr_to_skb(struct
> > > > > > sk_buff *skb,
> > > > > >  
> > > > > >  		if (!skb_partial_csum_set(skb, start, off))
> > > > > >  			return -EINVAL;
> > > > > > +		if (skb_transport_offset(skb) < nh_min_len)
> > > > > > +			return -EINVAL;
> > > > > >  
> > > > > > -		nh_min_len = max_t(u32, nh_min_len,
> > > > > > skb_transport_offset(skb));
> > > > > > +		nh_min_len = skb_transport_offset(skb);
> > > > > >  		p_off = nh_min_len + thlen;
> > > > > >  		if (!pskb_may_pull(skb, p_off))
> > > > > >  			return -EINVAL;
> > > > > > 
> > > > > > 
> > > > > > sticking a printk before return -EINVAL to show the offset
> > > > > > and
> > > > > > nh_min_len
> > > > > > would be a good 1st step. Thanks!
> > > > > > 
> > > > > 
> > > > > 
> > > > > Hi Eric,
> > > > > 
> > > > > thanks a lot for the quick response — and yes, you're
> > > > > absolutely
> > > > > right.
> > > > > 
> > > > > Apologies for the confusion: I mistakenly wrote the wrong
> > > > > commit
> > > > > description in my initial mail.
> > > > > 
> > > > > The correct commit is indeed:
> > > > > 
> > > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > > Author: Eric Dumazet <edumazet@google.com>
> > > > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > > > 
> > > > >     net: test for not too small csum_start in
> > > > > virtio_net_hdr_to_skb()
> > > > > 
> > > > > This is the one I bisected and which causes the performance
> > > > > regression
> > > > > in my environment.
> > > > > 
> > > > > Thanks again,
> > > > > Markus
> > > > 
> > > > 
> > > > I'm not Eric but good to know.
> > > > Alright, so I would start with the two items: device features and
> > > > printk.
> > > > 
> > > 
> > > as requested, here’s the device/feature information from the guest
> > > running kernel 6.14 (mainline):
> > > 
> > > Interface: ens18
> > > 
> > > ethtool -i ens18:
> > > driver: virtio_net
> > > version: 1.0.0
> > > firmware-version: 
> > > expansion-rom-version: 
> > > bus-info: 0000:00:12.0
> > > supports-statistics: yes
> > > supports-test: no
> > > supports-eeprom-access: no
> > > supports-register-dump: no
> > > supports-priv-flags: no
> > > 
> > > 
> > > ethtool -k ens18:
> > > Features for ens18:
> > > rx-checksumming: on [fixed]
> > > tx-checksumming: on
> > > 	tx-checksum-ipv4: off [fixed]
> > > 	tx-checksum-ip-generic: on
> > > 	tx-checksum-ipv6: off [fixed]
> > > 	tx-checksum-fcoe-crc: off [fixed]
> > > 	tx-checksum-sctp: off [fixed]
> > > scatter-gather: on
> > > 	tx-scatter-gather: on
> > > 	tx-scatter-gather-fraglist: off [fixed]
> > > tcp-segmentation-offload: on
> > > 	tx-tcp-segmentation: on
> > > 	tx-tcp-ecn-segmentation: on
> > > 	tx-tcp-mangleid-segmentation: off
> > > 	tx-tcp6-segmentation: on
> > > generic-segmentation-offload: on
> > > generic-receive-offload: on
> > > large-receive-offload: off [fixed]
> > > rx-vlan-offload: off [fixed]
> > > tx-vlan-offload: off [fixed]
> > > ntuple-filters: off [fixed]
> > > receive-hashing: off [fixed]
> > > highdma: on [fixed]
> > > rx-vlan-filter: on [fixed]
> > > vlan-challenged: off [fixed]
> > > tx-gso-robust: on [fixed]
> > > tx-fcoe-segmentation: off [fixed]
> > > tx-gre-segmentation: off [fixed]
> > > tx-gre-csum-segmentation: off [fixed]
> > > tx-ipxip4-segmentation: off [fixed]
> > > tx-ipxip6-segmentation: off [fixed]
> > > tx-udp_tnl-segmentation: off [fixed]
> > > tx-udp_tnl-csum-segmentation: off [fixed]
> > > tx-gso-partial: off [fixed]
> > > tx-tunnel-remcsum-segmentation: off [fixed]
> > > tx-sctp-segmentation: off [fixed]
> > > tx-esp-segmentation: off [fixed]
> > > tx-udp-segmentation: off
> > > tx-gso-list: off [fixed]
> > > tx-nocache-copy: off
> > > loopback: off [fixed]
> > > rx-fcs: off [fixed]
> > > rx-all: off [fixed]
> > > tx-vlan-stag-hw-insert: off [fixed]
> > > rx-vlan-stag-hw-parse: off [fixed]
> > > rx-vlan-stag-filter: off [fixed]
> > > l2-fwd-offload: off [fixed]
> > > hw-tc-offload: off [fixed]
> > > esp-hw-offload: off [fixed]
> > > esp-tx-csum-hw-offload: off [fixed]
> > > rx-udp_tunnel-port-offload: off [fixed]
> > > tls-hw-tx-offload: off [fixed]
> > > tls-hw-rx-offload: off [fixed]
> > > rx-gro-hw: on
> > > tls-hw-record: off [fixed]
> > > rx-gro-list: off
> > > macsec-hw-offload: off [fixed]
> > > rx-udp-gro-forwarding: off
> > > hsr-tag-ins-offload: off [fixed]
> > > hsr-tag-rm-offload: off [fixed]
> > > hsr-fwd-offload: off [fixed]
> > > hsr-dup-offload: off [fixed]
> > > 
> > > ethtool ens18:
> > > Settings for ens18:
> > > 	Supported ports: [  ]
> > > 	Supported link modes:   Not reported
> > > 	Supported pause frame use: No
> > > 	Supports auto-negotiation: No
> > > 	Supported FEC modes: Not reported
> > > 	Advertised link modes:  Not reported
> > > 	Advertised pause frame use: No
> > > 	Advertised auto-negotiation: No
> > > 	Advertised FEC modes: Not reported
> > > 	Speed: Unknown!
> > > 	Duplex: Unknown! (255)
> > > 	Auto-negotiation: off
> > > 	Port: Other
> > > 	PHYAD: 0
> > > 	Transceiver: internal
> > > netlink error: Operation not permitted
> > > 	Link detected: yes
> > > 
> > > 
> > > Kernel log (journalctl -k):
> > > Apr 03 19:50:37 kb-test.allod.com kernel: virtio_scsi virtio2:
> > > 4/0/0
> > > default/read/poll queues  
> > > Apr 03 19:50:37 kb-test.allod.com kernel: virtio_net virtio1 ens18:
> > > renamed from eth0
> > > 
> > > Let me know if you’d like comparison data from kernel 6.11 or any
> > > additional tests
> > 
> > 
> > I think let's redo bisect first then I will suggest which traces to
> > add.
> > 
> 
> The build with the added printk is currently running. I’ll test it
> shortly and report the results.
> 
> Should the bisect be done between v6.11 and v6.12, or v6.11 and v6.14?

The commit you showed is between 6.11 and 6.12. Having said that,
you can manually check out 49d14b54a527289d09a9480f214b8c586322310a
and 49d14b54a527289d09a9480f214b8c586322310a~1, record
the results with git bisect bad/good, and if it works
then git bisect will stop immediately for you.



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-02 21:12 [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+ Markus Fohrer
  2025-04-03 13:04 ` Michael S. Tsirkin
@ 2025-04-04  7:59 ` Torsten Krah
  2025-04-04  8:26   ` Michael S. Tsirkin
  1 sibling, 1 reply; 24+ messages in thread
From: Torsten Krah @ 2025-04-04  7:59 UTC (permalink / raw)
  To: virtualization
  Cc: Markus Fohrer, mst, jasowang, davem, edumazet, netdev,
	linux-kernel

Am Mittwoch, dem 02.04.2025 um 23:12 +0200 schrieb Markus Fohrer:
> When running on a host system equipped with a Broadcom NetXtreme-E
> (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the guest
> drops to 100–200 KB/s. The same guest configuration performs normally
> (~100 MB/s) when using kernel 6.8.0 or when the VM is moved to a host
> with Intel NICs.

Hi,

as I am affected too, here is the link to the Ubuntu issue, just in
case someone wants to have a look:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2098961

We're seeing lots of those in dmesg output:

[  561.505323] net_ratelimit: 1396 callbacks suppressed
[  561.505339] ens18: bad gso: type: 4, size: 1448
[  561.505343] ens18: bad gso: type: 4, size: 1448
[  561.507270] ens18: bad gso: type: 4, size: 1448
[  561.508257] ens18: bad gso: type: 4, size: 1448
[  561.511432] ens18: bad gso: type: 4, size: 1448
[  561.511452] ens18: bad gso: type: 4, size: 1448
[  561.514719] ens18: bad gso: type: 4, size: 1448
[  561.514966] ens18: bad gso: type: 4, size: 1448
[  561.518553] ens18: bad gso: type: 4, size: 1448
[  561.518781] ens18: bad gso: type: 4, size: 1448
[  566.506044] net_ratelimit: 1363 callbacks suppressed


And another interesting thing we observed - at least in our environment
- that we can trigger that regression only with IPv4 traffic (bad
performance and lots of bad gso messages) - if we only use IPv6, it
does work (good performance and not one bad gso message).
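
One possible reading of the IPv4-only behaviour, offered as a rough sketch rather than a diagnosis: type 4 in the "bad gso" message corresponds to VIRTIO_NET_HDR_GSO_TCPV6 in the virtio-net UAPI header. With the Ethernet header still in front of the packet, TCP starts at byte 34 for IPv4 without options and at byte 54 for IPv6. The minimum of 40 used below is the nh_min_len value reported elsewhere in this thread and is assumed here to be the IPv6 header size. The helper names are hypothetical and this is plain userspace C, not kernel code.

```c
#include <stdio.h>

/* Header sizes in bytes; IP options and extension headers are ignored. */
enum {
	ETH_HDR_LEN  = 14,	/* Ethernet header without a VLAN tag */
	IPV4_HDR_LEN = 20,	/* IPv4 header without options */
	IPV6_HDR_LEN = 40	/* fixed IPv6 header */
};

/* Hypothetical helper: offset of the TCP header from the start of the frame. */
static unsigned int transport_offset(unsigned int ip_hdr_len)
{
	return ETH_HDR_LEN + ip_hdr_len;
}

/* Hypothetical helper: 1 if the offset meets the minimum, 0 if it would be dropped. */
static int passes_min_len_check(unsigned int offset, unsigned int nh_min_len)
{
	return offset >= nh_min_len;
}

/* Print the verdict for both address families against one assumed minimum. */
static void report(unsigned int nh_min_len)
{
	unsigned int v4 = transport_offset(IPV4_HDR_LEN);
	unsigned int v6 = transport_offset(IPV6_HDR_LEN);

	printf("IPv4: offset=%u min=%u pass=%d\n", v4, nh_min_len,
	       passes_min_len_check(v4, nh_min_len));
	printf("IPv6: offset=%u min=%u pass=%d\n", v6, nh_min_len,
	       passes_min_len_check(v6, nh_min_len));
}
```

Calling report(40) shows the IPv4 offset (34) falling below the minimum while the IPv6 offset (54) clears it, which would be consistent with only IPv4 traffic being dropped if IPv4 packets arrive labelled with GSO type 4.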

kind regards

Torsten



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 13:04 ` Michael S. Tsirkin
  2025-04-03 13:51   ` Markus Fohrer
@ 2025-04-04  8:16   ` Markus Fohrer
  2025-04-04  8:29     ` Michael S. Tsirkin
  1 sibling, 1 reply; 24+ messages in thread
From: Markus Fohrer @ 2025-04-04  8:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

Am Donnerstag, dem 03.04.2025 um 09:04 -0400 schrieb Michael S.
Tsirkin:
> On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > Hi,
> > 
> > I'm observing a significant performance regression in KVM guest VMs
> > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > 
> > When running on a host system equipped with a Broadcom NetXtreme-E
> > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the
> > guest drops to 100–200 KB/s. The same guest configuration performs
> > normally (~100 MB/s) when using kernel 6.8.0 or when the VM is
> > moved to a host with Intel NICs.
> > 
> > Test environment:
> > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > - Guest: Linux with virtio-net interface
> > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host level)
> > - CPU: AMD EPYC
> > - Storage: virtio-scsi
> > - VM network: virtio-net, virtio-scsi (no CPU or IO bottlenecks)
> > - Traffic test: iperf3, scp, wget consistently slow in guest
> > 
> > This issue is not present:
> > - On 6.8.0 
> > - On hosts with Intel NICs (same VM config)
> > 
> > I have bisected the issue to the following upstream commit:
> > 
> >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for small
> > tx")
> >   https://git.kernel.org/linus/49d14b54a527
> 
> Thanks a lot for the info!
> 
> 
> both the link and commit point at:
> 
> commit 49d14b54a527289d09a9480f214b8c586322310a
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Thu Sep 26 16:58:36 2024 +0000
> 
>     net: test for not too small csum_start in virtio_net_hdr_to_skb()
>     
> 
> is this what you mean?
> 
> I don't know which commit is "virtio-net: Suppress tx timeout warning
> for small tx"
> 
> 
> 
> > Reverting this commit restores normal network performance in
> > affected guest VMs.
> > 
> > I’m happy to provide more data or assist with testing a potential
> > fix.
> > 
> > Thanks,
> > Markus Fohrer
> 
> 
> Thanks! First I think it's worth checking what is the setup, e.g.
> which offloads are enabled.
> Besides that, I'd start by seeing what's going on. Assuming I'm right
> about
> Eric's patch:
> 
> diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> index 276ca543ef44d8..02a9f4dc594d02 100644
> --- a/include/linux/virtio_net.h
> +++ b/include/linux/virtio_net.h
> @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
>  
>  		if (!skb_partial_csum_set(skb, start, off))
>  			return -EINVAL;
> +		if (skb_transport_offset(skb) < nh_min_len)
> +			return -EINVAL;
>  
> -		nh_min_len = max_t(u32, nh_min_len, skb_transport_offset(skb));
> +		nh_min_len = skb_transport_offset(skb);
>  		p_off = nh_min_len + thlen;
>  		if (!pskb_may_pull(skb, p_off))
>  			return -EINVAL;
> 
> 
> sticking a printk before return -EINVAL to show the offset and
> nh_min_len
> would be a good 1st step. Thanks!
> 

I added the following printk inside virtio_net_hdr_to_skb():

    if (skb_transport_offset(skb) < nh_min_len) {
        printk(KERN_INFO
               "virtio_net: 3 drop, transport_offset=%u, nh_min_len=%u\n",
               skb_transport_offset(skb), nh_min_len);
        return -EINVAL;
    }

Built and installed the kernel, then triggered a large download via:

    wget http://speedtest.belwue.net/10G

Relevant output from `dmesg -w`:

[   57.327943] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.428942] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.428962] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.553068] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.553088] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.576678] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.618438] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.618453] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.703077] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.823072] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.891982] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   57.946190] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
[   58.218686] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  

I would now do the test with commit
49d14b54a527289d09a9480f214b8c586322310a and commit
49d14b54a527289d09a9480f214b8c586322310a~1





* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-04  7:59 ` Torsten Krah
@ 2025-04-04  8:26   ` Michael S. Tsirkin
  0 siblings, 0 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2025-04-04  8:26 UTC (permalink / raw)
  To: Torsten Krah
  Cc: virtualization, Markus Fohrer, jasowang, davem, edumazet, netdev,
	linux-kernel

On Fri, Apr 04, 2025 at 09:59:19AM +0200, Torsten Krah wrote:
> Am Mittwoch, dem 02.04.2025 um 23:12 +0200 schrieb Markus Fohrer:
> > When running on a host system equipped with a Broadcom NetXtreme-E
> > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the guest
> > drops to 100–200 KB/s. The same guest configuration performs normally
> > (~100 MB/s) when using kernel 6.8.0 or when the VM is moved to a host
> > with Intel NICs.
> 
> Hi,
> 
> as I am affected too, here is the link to the Ubuntu issue, just in
> case someone wants to have a look:
> 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2098961
> 
> We're seeing lots of those in dmesg output:
> 
> [  561.505323] net_ratelimit: 1396 callbacks suppressed
> [  561.505339] ens18: bad gso: type: 4, size: 1448
> [  561.505343] ens18: bad gso: type: 4, size: 1448
> [  561.507270] ens18: bad gso: type: 4, size: 1448
> [  561.508257] ens18: bad gso: type: 4, size: 1448
> [  561.511432] ens18: bad gso: type: 4, size: 1448
> [  561.511452] ens18: bad gso: type: 4, size: 1448
> [  561.514719] ens18: bad gso: type: 4, size: 1448
> [  561.514966] ens18: bad gso: type: 4, size: 1448
> [  561.518553] ens18: bad gso: type: 4, size: 1448
> [  561.518781] ens18: bad gso: type: 4, size: 1448
> [  566.506044] net_ratelimit: 1363 callbacks suppressed
> 
> 
> And another interesting thing we observed - at least in our environment
> - that we can trigger that regression only with IPv4 traffic (bad
> performance and lots of bad gso messages) - if we only use IPv6, it
> does work (good performance and not one bad gso message).
> 
> kind regards
> 
> Torsten


I suspect it's something weird on the Ubuntu hypervisor side,
supplying wrong checksum offsets.

Can you stick a printk here:
                if (skb_transport_offset(skb) < nh_min_len)
                        return -EINVAL;

printing, on error, all of: start, off, needed, nh_min_len.


Also, what kind of device is this? QEMU? vhost-user? vhost-net?
Thanks!

-- 
MST



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-04  8:16   ` Markus Fohrer
@ 2025-04-04  8:29     ` Michael S. Tsirkin
  2025-04-04  8:52       ` Markus Fohrer
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2025-04-04  8:29 UTC (permalink / raw)
  To: Markus Fohrer
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Fri, Apr 04, 2025 at 10:16:55AM +0200, Markus Fohrer wrote:
> Am Donnerstag, dem 03.04.2025 um 09:04 -0400 schrieb Michael S.
> Tsirkin:
> > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > Hi,
> > > 
> > > I'm observing a significant performance regression in KVM guest VMs
> > > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > 
> > > When running on a host system equipped with a Broadcom NetXtreme-E
> > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the
> > > guest drops to 100–200 KB/s. The same guest configuration performs
> > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM is
> > > moved to a host with Intel NICs.
> > > 
> > > Test environment:
> > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > - Guest: Linux with virtio-net interface
> > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host level)
> > > - CPU: AMD EPYC
> > > - Storage: virtio-scsi
> > > - VM network: virtio-net, virtio-scsi (no CPU or IO bottlenecks)
> > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > 
> > > This issue is not present:
> > > - On 6.8.0 
> > > - On hosts with Intel NICs (same VM config)
> > > 
> > > I have bisected the issue to the following upstream commit:
> > > 
> > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for small
> > > tx")
> > >   https://git.kernel.org/linus/49d14b54a527
> > 
> > Thanks a lot for the info!
> > 
> > 
> > both the link and commit point at:
> > 
> > commit 49d14b54a527289d09a9480f214b8c586322310a
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Thu Sep 26 16:58:36 2024 +0000
> > 
> >     net: test for not too small csum_start in virtio_net_hdr_to_skb()
> >     
> > 
> > is this what you mean?
> > 
> > I don't know which commit is "virtio-net: Suppress tx timeout warning
> > for small tx"
> > 
> > 
> > 
> > > Reverting this commit restores normal network performance in
> > > affected guest VMs.
> > > 
> > > I’m happy to provide more data or assist with testing a potential
> > > fix.
> > > 
> > > Thanks,
> > > Markus Fohrer
> > 
> > 
> > Thanks! First I think it's worth checking what is the setup, e.g.
> > which offloads are enabled.
> > Besides that, I'd start by seeing what's going on. Assuming I'm right
> > about
> > Eric's patch:
> > 
> > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > index 276ca543ef44d8..02a9f4dc594d02 100644
> > --- a/include/linux/virtio_net.h
> > +++ b/include/linux/virtio_net.h
> > @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
> >  
> >  		if (!skb_partial_csum_set(skb, start, off))
> >  			return -EINVAL;
> > +		if (skb_transport_offset(skb) < nh_min_len)
> > +			return -EINVAL;
> >  
> > -		nh_min_len = max_t(u32, nh_min_len, skb_transport_offset(skb));
> > +		nh_min_len = skb_transport_offset(skb);
> >  		p_off = nh_min_len + thlen;
> >  		if (!pskb_may_pull(skb, p_off))
> >  			return -EINVAL;
> > 
> > 
> > sticking a printk before return -EINVAL to show the offset and
> > nh_min_len
> > would be a good 1st step. Thanks!
> > 
> 
> I added the following printk inside virtio_net_hdr_to_skb():
> 
>     if (skb_transport_offset(skb) < nh_min_len) {
>         printk(KERN_INFO
>                "virtio_net: 3 drop, transport_offset=%u, nh_min_len=%u\n",
>                skb_transport_offset(skb), nh_min_len);
>         return -EINVAL;
>     }
> 
> Built and installed the kernel, then triggered a large download via:
> 
>     wget http://speedtest.belwue.net/10G
> 
> Relevant output from `dmesg -w`:
> 
> [   57.327943] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.428942] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.428962] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.553068] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.553088] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.576678] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.618438] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.618453] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.703077] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.823072] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.891982] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.946190] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   58.218686] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  

Hmm indeed. And what about these values?
                u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
                u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
                u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
print them too?



> I would now do the test with commit
> 49d14b54a527289d09a9480f214b8c586322310a and commit
> 49d14b54a527289d09a9480f214b8c586322310a~1
> 

Worth checking, though it seems likely now the hypervisor is doing weird
things. What kind of backend is it? qemu? tun? vhost-user? vhost-net?

-- 
MST



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-04  8:29     ` Michael S. Tsirkin
@ 2025-04-04  8:52       ` Markus Fohrer
  2025-04-04 11:40         ` Markus Fohrer
  0 siblings, 1 reply; 24+ messages in thread
From: Markus Fohrer @ 2025-04-04  8:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

Am Freitag, dem 04.04.2025 um 04:29 -0400 schrieb Michael S. Tsirkin:
> On Fri, Apr 04, 2025 at 10:16:55AM +0200, Markus Fohrer wrote:
> > Am Donnerstag, dem 03.04.2025 um 09:04 -0400 schrieb Michael S.
> > Tsirkin:
> > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > > Hi,
> > > > 
> > > > I'm observing a significant performance regression in KVM guest
> > > > VMs
> > > > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > > 
> > > > When running on a host system equipped with a Broadcom
> > > > NetXtreme-E
> > > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the
> > > > guest drops to 100–200 KB/s. The same guest configuration
> > > > performs
> > > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM is
> > > > moved to a host with Intel NICs.
> > > > 
> > > > Test environment:
> > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > - Guest: Linux with virtio-net interface
> > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host
> > > > level)
> > > > - CPU: AMD EPYC
> > > > - Storage: virtio-scsi
> > > > - VM network: virtio-net, virtio-scsi (no CPU or IO
> > > > bottlenecks)
> > > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > > 
> > > > This issue is not present:
> > > > - On 6.8.0 
> > > > - On hosts with Intel NICs (same VM config)
> > > > 
> > > > I have bisected the issue to the following upstream commit:
> > > > 
> > > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for
> > > > small
> > > > tx")
> > > >   https://git.kernel.org/linus/49d14b54a527
> > > 
> > > Thanks a lot for the info!
> > > 
> > > 
> > > both the link and commit point at:
> > > 
> > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > Author: Eric Dumazet <edumazet@google.com>
> > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > 
> > >     net: test for not too small csum_start in
> > > virtio_net_hdr_to_skb()
> > >     
> > > 
> > > is this what you mean?
> > > 
> > > I don't know which commit is "virtio-net: Suppress tx timeout
> > > warning
> > > for small tx"
> > > 
> > > 
> > > 
> > > > Reverting this commit restores normal network performance in
> > > > affected guest VMs.
> > > > 
> > > > I’m happy to provide more data or assist with testing a
> > > > potential
> > > > fix.
> > > > 
> > > > Thanks,
> > > > Markus Fohrer
> > > 
> > > 
> > > Thanks! First I think it's worth checking what is the setup, e.g.
> > > which offloads are enabled.
> > > Besides that, I'd start by seeing what's going on. Assuming I'm
> > > right
> > > about
> > > Eric's patch:
> > > 
> > > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > --- a/include/linux/virtio_net.h
> > > +++ b/include/linux/virtio_net.h
> > > @@ -103,8 +103,10 @@ static inline int
> > > virtio_net_hdr_to_skb(struct
> > > sk_buff *skb,
> > >  
> > >  		if (!skb_partial_csum_set(skb, start, off))
> > >  			return -EINVAL;
> > > +		if (skb_transport_offset(skb) < nh_min_len)
> > > +			return -EINVAL;
> > >  
> > > -		nh_min_len = max_t(u32, nh_min_len,
> > > skb_transport_offset(skb));
> > > +		nh_min_len = skb_transport_offset(skb);
> > >  		p_off = nh_min_len + thlen;
> > >  		if (!pskb_may_pull(skb, p_off))
> > >  			return -EINVAL;
> > > 
> > > 
> > > sticking a printk before return -EINVAL to show the offset and
> > > nh_min_len
> > > would be a good 1st step. Thanks!
> > > 
> > 
> > I added the following printk inside virtio_net_hdr_to_skb():
> > 
> >     if (skb_transport_offset(skb) < nh_min_len) {
> >         printk(KERN_INFO
> >                "virtio_net: 3 drop, transport_offset=%u, nh_min_len=%u\n",
> >                skb_transport_offset(skb), nh_min_len);
> >         return -EINVAL;
> >     }
> > 
> > Built and installed the kernel, then triggered a large download
> > via:
> > 
> >     wget http://speedtest.belwue.net/10G
> > 
> > Relevant output from `dmesg -w`:
> > 
> > [   57.327943] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.428942] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.428962] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.553068] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.553088] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.576678] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.618438] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.618453] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.703077] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.823072] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.891982] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   57.946190] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > [   58.218686] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> 
> Hmm indeed. And what about these values?
>                 u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
>                 u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
>                 u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
> print them too?
> 
> 
> 
> > I would now do the test with commit
> > 49d14b54a527289d09a9480f214b8c586322310a and commit
> > 49d14b54a527289d09a9480f214b8c586322310a~1
> > 
> 
> Worth checking, though it now seems likely that the hypervisor is doing
> weird things. What kind of backend is it? qemu? tun? vhost-user? vhost-net?
> 

Backend: QEMU/KVM hypervisor (Proxmox)


printk output:

[   58.641906] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
[   58.678048] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
[   58.952871] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
[   58.962157] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
[   59.071645] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
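
For readers following the numbers, the logged values can be reproduced
with a small userspace sketch. This is not the kernel code: needed_len()
and is_dropped() are hypothetical helper names, thlen is assumed to be 20
(a TCP header without options), and sizeof(uint16_t) stands in for
sizeof(__sum16).

```c
#include <stdint.h>

/* Hypothetical userspace helpers; they only mirror the arithmetic quoted
 * in this thread and are not the kernel implementation. */

/* needed = start + max(thlen, off + sizeof(__sum16)); __sum16 is 2 bytes. */
static uint32_t needed_len(uint32_t start, uint32_t off, uint32_t thlen)
{
	uint32_t csum_end = off + (uint32_t)sizeof(uint16_t);

	return start + (thlen > csum_end ? thlen : csum_end);
}

/* The check added by 49d14b54a527: reject the packet when the transport
 * header starts before the minimum network header length. */
static int is_dropped(uint32_t transport_offset, uint32_t nh_min_len)
{
	return transport_offset < nh_min_len;
}
```

With the logged values, needed_len(34, 16, 20) is 34 + max(20, 18) = 54,
and is_dropped(34, 40) is true, which matches the printk output above.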







* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-03 22:05             ` Michael S. Tsirkin
@ 2025-04-04 11:32               ` Markus Fohrer
  0 siblings, 0 replies; 24+ messages in thread
From: Markus Fohrer @ 2025-04-04 11:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Thursday, 3 April 2025 at 18:05 -0400, Michael S. Tsirkin wrote:
> On Thu, Apr 03, 2025 at 11:24:43PM +0200, Markus Fohrer wrote:
> > On Thursday, 3 April 2025 at 17:06 -0400, Michael S. Tsirkin wrote:
> > > On Thu, Apr 03, 2025 at 10:07:12PM +0200, Markus Fohrer wrote:
> > > > On Thursday, 3 April 2025 at 10:03 -0400, Michael S. Tsirkin wrote:
> > > > > On Thu, Apr 03, 2025 at 03:51:01PM +0200, Markus Fohrer
> > > > > wrote:
> > > > > > On Thursday, 3 April 2025 at 09:04 -0400, Michael S. Tsirkin
> > > > > > wrote:
> > > > > > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer
> > > > > > > wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > I'm observing a significant performance regression in
> > > > > > > > KVM
> > > > > > > > guest
> > > > > > > > VMs
> > > > > > > > using virtio-net with recent Linux kernels (6.8.1+ and
> > > > > > > > 6.14).
> > > > > > > > 
> > > > > > > > When running on a host system equipped with a Broadcom
> > > > > > > > NetXtreme-E
> > > > > > > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput
> > > > > > > > in
> > > > > > > > the
> > > > > > > > guest drops to 100–200 KB/s. The same guest
> > > > > > > > configuration
> > > > > > > > performs
> > > > > > > > normally (~100 MB/s) when using kernel 6.8.0 or when
> > > > > > > > the VM
> > > > > > > > is
> > > > > > > > moved to a host with Intel NICs.
> > > > > > > > 
> > > > > > > > Test environment:
> > > > > > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > > > > > - Guest: Linux with virtio-net interface
> > > > > > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at
> > > > > > > > host
> > > > > > > > level)
> > > > > > > > - CPU: AMD EPYC
> > > > > > > > - Storage: virtio-scsi
> > > > > > > > - VM network: virtio-net, virtio-scsi (no CPU or IO
> > > > > > > > bottlenecks)
> > > > > > > > - Traffic test: iperf3, scp, wget consistently slow in
> > > > > > > > guest
> > > > > > > > 
> > > > > > > > This issue is not present:
> > > > > > > > - On 6.8.0 
> > > > > > > > - On hosts with Intel NICs (same VM config)
> > > > > > > > 
> > > > > > > > I have bisected the issue to the following upstream
> > > > > > > > commit:
> > > > > > > > 
> > > > > > > >   49d14b54a527 ("virtio-net: Suppress tx timeout
> > > > > > > > warning
> > > > > > > > for
> > > > > > > > small
> > > > > > > > tx")
> > > > > > > >   https://git.kernel.org/linus/49d14b54a527
> > > > > > > 
> > > > > > > Thanks a lot for the info!
> > > > > > > 
> > > > > > > 
> > > > > > > both the link and commit point at:
> > > > > > > 
> > > > > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > > > > Author: Eric Dumazet <edumazet@google.com>
> > > > > > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > > > > > 
> > > > > > >     net: test for not too small csum_start in
> > > > > > > virtio_net_hdr_to_skb()
> > > > > > >     
> > > > > > > 
> > > > > > > is this what you mean?
> > > > > > > 
> > > > > > > I don't know which commit is "virtio-net: Suppress tx
> > > > > > > timeout
> > > > > > > warning
> > > > > > > for small tx"
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > > Reverting this commit restores normal network
> > > > > > > > performance
> > > > > > > > in
> > > > > > > > affected guest VMs.
> > > > > > > > 
> > > > > > > > I’m happy to provide more data or assist with testing a
> > > > > > > > potential
> > > > > > > > fix.
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > Markus Fohrer
> > > > > > > 
> > > > > > > 
> > > > > > > > Thanks! First, I think it's worth checking what the setup
> > > > > > > > is, e.g. which offloads are enabled.
> > > > > > > > Besides that, I'd start by seeing what's going on.
> > > > > > > > Assuming I'm right about Eric's patch:
> > > > > > > 
> > > > > > > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > > > > > > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > > > > > > --- a/include/linux/virtio_net.h
> > > > > > > > +++ b/include/linux/virtio_net.h
> > > > > > > > @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
> > > > > > > >  
> > > > > > > >  		if (!skb_partial_csum_set(skb, start, off))
> > > > > > > >  			return -EINVAL;
> > > > > > > > +		if (skb_transport_offset(skb) < nh_min_len)
> > > > > > > > +			return -EINVAL;
> > > > > > > >  
> > > > > > > > -		nh_min_len = max_t(u32, nh_min_len, skb_transport_offset(skb));
> > > > > > > > +		nh_min_len = skb_transport_offset(skb);
> > > > > > > >  		p_off = nh_min_len + thlen;
> > > > > > > >  		if (!pskb_may_pull(skb, p_off))
> > > > > > > >  			return -EINVAL;
> > > > > > > 
> > > > > > > 
> > > > > > > sticking a printk before return -EINVAL to show the
> > > > > > > offset
> > > > > > > and
> > > > > > > nh_min_len
> > > > > > > would be a good 1st step. Thanks!
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Hi Eric,
> > > > > > 
> > > > > > thanks a lot for the quick response — and yes, you're
> > > > > > absolutely
> > > > > > right.
> > > > > > 
> > > > > > Apologies for the confusion: I mistakenly wrote the wrong
> > > > > > commit
> > > > > > description in my initial mail.
> > > > > > 
> > > > > > The correct commit is indeed:
> > > > > > 
> > > > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > > > Author: Eric Dumazet <edumazet@google.com>
> > > > > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > > > > 
> > > > > >     net: test for not too small csum_start in
> > > > > > virtio_net_hdr_to_skb()
> > > > > > 
> > > > > > This is the one I bisected and which causes the performance
> > > > > > regression
> > > > > > in my environment.
> > > > > > 
> > > > > > Thanks again,
> > > > > > Markus
> > > > > 
> > > > > 
> > > > > I'm not Eric but good to know.
> > > > > Alright, so I would start with the two items: device features
> > > > > and
> > > > > printk.
> > > > > 
> > > > 
> > > > as requested, here’s the device/feature information from the
> > > > guest
> > > > running kernel 6.14 (mainline):
> > > > 
> > > > Interface: ens18
> > > > 
> > > > ethtool -i ens18:
> > > > driver: virtio_net
> > > > version: 1.0.0
> > > > firmware-version: 
> > > > expansion-rom-version: 
> > > > bus-info: 0000:00:12.0
> > > > supports-statistics: yes
> > > > supports-test: no
> > > > supports-eeprom-access: no
> > > > supports-register-dump: no
> > > > supports-priv-flags: no
> > > > 
> > > > 
> > > > ethtool -k ens18:
> > > > Features for ens18:
> > > > rx-checksumming: on [fixed]
> > > > tx-checksumming: on
> > > > 	tx-checksum-ipv4: off [fixed]
> > > > 	tx-checksum-ip-generic: on
> > > > 	tx-checksum-ipv6: off [fixed]
> > > > 	tx-checksum-fcoe-crc: off [fixed]
> > > > 	tx-checksum-sctp: off [fixed]
> > > > scatter-gather: on
> > > > 	tx-scatter-gather: on
> > > > 	tx-scatter-gather-fraglist: off [fixed]
> > > > tcp-segmentation-offload: on
> > > > 	tx-tcp-segmentation: on
> > > > 	tx-tcp-ecn-segmentation: on
> > > > 	tx-tcp-mangleid-segmentation: off
> > > > 	tx-tcp6-segmentation: on
> > > > generic-segmentation-offload: on
> > > > generic-receive-offload: on
> > > > large-receive-offload: off [fixed]
> > > > rx-vlan-offload: off [fixed]
> > > > tx-vlan-offload: off [fixed]
> > > > ntuple-filters: off [fixed]
> > > > receive-hashing: off [fixed]
> > > > highdma: on [fixed]
> > > > rx-vlan-filter: on [fixed]
> > > > vlan-challenged: off [fixed]
> > > > tx-gso-robust: on [fixed]
> > > > tx-fcoe-segmentation: off [fixed]
> > > > tx-gre-segmentation: off [fixed]
> > > > tx-gre-csum-segmentation: off [fixed]
> > > > tx-ipxip4-segmentation: off [fixed]
> > > > tx-ipxip6-segmentation: off [fixed]
> > > > tx-udp_tnl-segmentation: off [fixed]
> > > > tx-udp_tnl-csum-segmentation: off [fixed]
> > > > tx-gso-partial: off [fixed]
> > > > tx-tunnel-remcsum-segmentation: off [fixed]
> > > > tx-sctp-segmentation: off [fixed]
> > > > tx-esp-segmentation: off [fixed]
> > > > tx-udp-segmentation: off
> > > > tx-gso-list: off [fixed]
> > > > tx-nocache-copy: off
> > > > loopback: off [fixed]
> > > > rx-fcs: off [fixed]
> > > > rx-all: off [fixed]
> > > > tx-vlan-stag-hw-insert: off [fixed]
> > > > rx-vlan-stag-hw-parse: off [fixed]
> > > > rx-vlan-stag-filter: off [fixed]
> > > > l2-fwd-offload: off [fixed]
> > > > hw-tc-offload: off [fixed]
> > > > esp-hw-offload: off [fixed]
> > > > esp-tx-csum-hw-offload: off [fixed]
> > > > rx-udp_tunnel-port-offload: off [fixed]
> > > > tls-hw-tx-offload: off [fixed]
> > > > tls-hw-rx-offload: off [fixed]
> > > > rx-gro-hw: on
> > > > tls-hw-record: off [fixed]
> > > > rx-gro-list: off
> > > > macsec-hw-offload: off [fixed]
> > > > rx-udp-gro-forwarding: off
> > > > hsr-tag-ins-offload: off [fixed]
> > > > hsr-tag-rm-offload: off [fixed]
> > > > hsr-fwd-offload: off [fixed]
> > > > hsr-dup-offload: off [fixed]
> > > > 
> > > > ethtool ens18:
> > > > Settings for ens18:
> > > > 	Supported ports: [  ]
> > > > 	Supported link modes:   Not reported
> > > > 	Supported pause frame use: No
> > > > 	Supports auto-negotiation: No
> > > > 	Supported FEC modes: Not reported
> > > > 	Advertised link modes:  Not reported
> > > > 	Advertised pause frame use: No
> > > > 	Advertised auto-negotiation: No
> > > > 	Advertised FEC modes: Not reported
> > > > 	Speed: Unknown!
> > > > 	Duplex: Unknown! (255)
> > > > 	Auto-negotiation: off
> > > > 	Port: Other
> > > > 	PHYAD: 0
> > > > 	Transceiver: internal
> > > > netlink error: Operation not permitted
> > > > 	Link detected: yes
> > > > 
> > > > 
> > > > Kernel log (journalctl -k):
> > > > Apr 03 19:50:37 kb-test.allod.com kernel: virtio_scsi virtio2:
> > > > 4/0/0
> > > > default/read/poll queues  
> > > > Apr 03 19:50:37 kb-test.allod.com kernel: virtio_net virtio1
> > > > ens18:
> > > > renamed from eth0
> > > > 
> > > > Let me know if you’d like comparison data from kernel 6.11 or
> > > > any
> > > > additional tests
> > > 
> > > 
> > > I think let's redo bisect first then I will suggest which traces
> > > to
> > > add.
> > > 
> > 
> > The build with the added printk is currently running. I’ll test it
> > shortly and report the results.
> > 
> > Should the bisect be done between v6.11 and v6.12, or v6.11 and
> > v6.14?
> 
> The commit you showed is between 6.11 and 6.12. Having said that,
> you can manually checkout 49d14b54a527289d09a9480f214b8c586322310a
> and 49d14b54a527289d09a9480f214b8c586322310a~1 and record
> the results with git bisect bad/good and if it works
> then git bisect will stop immediately for you.
> 


I built and tested:
- 49d14b54a527289d09a9480f214b8c586322310a -> bad
- 49d14b54a527289d09a9480f214b8c586322310a~1 -> good

git bisect result:
49d14b54a527289d09a9480f214b8c586322310a is the first bad commit


Log:
git bisect start
# status: waiting for both good and bad commits
# bad: [49d14b54a527289d09a9480f214b8c586322310a] net: test for not too small csum_start in virtio_net_hdr_to_skb()
git bisect bad 49d14b54a527289d09a9480f214b8c586322310a
# status: waiting for good commit(s), bad commit known
# good: [17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8] net: gso: fix tcp fraglist segmentation after pull from frag_list
git bisect good 17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8
# first bad commit: [49d14b54a527289d09a9480f214b8c586322310a] net: test for not too small csum_start in virtio_net_hdr_to_skb()



* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-04  8:52       ` Markus Fohrer
@ 2025-04-04 11:40         ` Markus Fohrer
  2025-04-04 15:13           ` Willem de Bruijn
  0 siblings, 1 reply; 24+ messages in thread
From: Markus Fohrer @ 2025-04-04 11:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Friday, 4 April 2025 at 10:52 +0200, Markus Fohrer wrote:
> On Friday, 4 April 2025 at 04:29 -0400, Michael S. Tsirkin wrote:
> > On Fri, Apr 04, 2025 at 10:16:55AM +0200, Markus Fohrer wrote:
> > > On Thursday, 3 April 2025 at 09:04 -0400, Michael S. Tsirkin wrote:
> > > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > > > Hi,
> > > > > 
> > > > > I'm observing a significant performance regression in KVM
> > > > > guest
> > > > > VMs
> > > > > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > > > 
> > > > > When running on a host system equipped with a Broadcom
> > > > > NetXtreme-E
> > > > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in
> > > > > the
> > > > > guest drops to 100–200 KB/s. The same guest configuration
> > > > > performs
> > > > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM
> > > > > is
> > > > > moved to a host with Intel NICs.
> > > > > 
> > > > > Test environment:
> > > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > > - Guest: Linux with virtio-net interface
> > > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host
> > > > > level)
> > > > > - CPU: AMD EPYC
> > > > > - Storage: virtio-scsi
> > > > > - VM network: virtio-net, virtio-scsi (no CPU or IO
> > > > > bottlenecks)
> > > > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > > > 
> > > > > This issue is not present:
> > > > > - On 6.8.0 
> > > > > - On hosts with Intel NICs (same VM config)
> > > > > 
> > > > > I have bisected the issue to the following upstream commit:
> > > > > 
> > > > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for
> > > > > small
> > > > > tx")
> > > > >   https://git.kernel.org/linus/49d14b54a527
> > > > 
> > > > Thanks a lot for the info!
> > > > 
> > > > 
> > > > both the link and commit point at:
> > > > 
> > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > Author: Eric Dumazet <edumazet@google.com>
> > > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > > 
> > > >     net: test for not too small csum_start in
> > > > virtio_net_hdr_to_skb()
> > > >     
> > > > 
> > > > is this what you mean?
> > > > 
> > > > I don't know which commit is "virtio-net: Suppress tx timeout
> > > > warning
> > > > for small tx"
> > > > 
> > > > 
> > > > 
> > > > > Reverting this commit restores normal network performance in
> > > > > affected guest VMs.
> > > > > 
> > > > > I’m happy to provide more data or assist with testing a
> > > > > potential
> > > > > fix.
> > > > > 
> > > > > Thanks,
> > > > > Markus Fohrer
> > > > 
> > > > 
> > > > > Thanks! First, I think it's worth checking what the setup is,
> > > > > e.g. which offloads are enabled.
> > > > > Besides that, I'd start by seeing what's going on. Assuming I'm
> > > > > right about Eric's patch:
> > > > 
> > > > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > > > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > > > --- a/include/linux/virtio_net.h
> > > > > +++ b/include/linux/virtio_net.h
> > > > > @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
> > > > >  
> > > > >  		if (!skb_partial_csum_set(skb, start, off))
> > > > >  			return -EINVAL;
> > > > > +		if (skb_transport_offset(skb) < nh_min_len)
> > > > > +			return -EINVAL;
> > > > >  
> > > > > -		nh_min_len = max_t(u32, nh_min_len, skb_transport_offset(skb));
> > > > > +		nh_min_len = skb_transport_offset(skb);
> > > > >  		p_off = nh_min_len + thlen;
> > > > >  		if (!pskb_may_pull(skb, p_off))
> > > > >  			return -EINVAL;
> > > > 
> > > > 
> > > > sticking a printk before return -EINVAL to show the offset and
> > > > nh_min_len
> > > > would be a good 1st step. Thanks!
> > > > 
> > > 
> > > I added the following printk inside virtio_net_hdr_to_skb():
> > > 
> > >     if (skb_transport_offset(skb) < nh_min_len){
> > > >         printk(KERN_INFO "virtio_net: 3 drop, transport_offset=%u, nh_min_len=%u\n",
> > >                skb_transport_offset(skb), nh_min_len);
> > >         return -EINVAL;
> > >     }
> > > 
> > > Built and installed the kernel, then triggered a large download
> > > via:
> > > 
> > >     wget http://speedtest.belwue.net/10G
> > > 
> > > Relevant output from `dmesg -w`:
> > > 
> > > > [   57.327943] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.428942] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.428962] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.553068] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.553088] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.576678] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.618438] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.618453] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.703077] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.823072] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.891982] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   57.946190] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > [   58.218686] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > 
> > Hmm indeed. And what about these values?
> >                 u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
> >                 u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
> >                 u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
> > print them too?
> > 
> > 
> > 
> > > I would now do the test with commit
> > > 49d14b54a527289d09a9480f214b8c586322310a and commit
> > > 49d14b54a527289d09a9480f214b8c586322310a~1
> > > 
> > 
> > Worth checking, though it now seems likely that the hypervisor is doing
> > weird things. What kind of backend is it? qemu? tun? vhost-user?
> > vhost-net?
> > 
> 
> Backend: QEMU/KVM hypervisor (Proxmox)
> 
> 
> printk output:
> 
> [   58.641906] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> [   58.678048] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> [   58.952871] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> [   58.962157] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> [   59.071645] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> 
> 
> 
> 
> 

I just noticed that commit 17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8
(tcp_offload.c: gso fix) also touches checksum handling and may
affect how skb state is passed to virtio_net_hdr_to_skb().

Is it possible that the regression only appears due to the combination
of 17bd3bd8 and 49d14b54a5?




* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-04 11:40         ` Markus Fohrer
@ 2025-04-04 15:13           ` Willem de Bruijn
  2025-04-04 20:23             ` Markus Fohrer
  2025-04-04 22:05             ` Ilya Maximets
  0 siblings, 2 replies; 24+ messages in thread
From: Willem de Bruijn @ 2025-04-04 15:13 UTC (permalink / raw)
  To: Markus Fohrer, Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

Markus Fohrer wrote:
> On Friday, 4 April 2025 at 10:52 +0200, Markus Fohrer wrote:
> > On Friday, 4 April 2025 at 04:29 -0400, Michael S. Tsirkin wrote:
> > > On Fri, Apr 04, 2025 at 10:16:55AM +0200, Markus Fohrer wrote:
> > > > On Thursday, 3 April 2025 at 09:04 -0400, Michael S. Tsirkin wrote:
> > > > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > I'm observing a significant performance regression in KVM
> > > > > > guest
> > > > > > VMs
> > > > > > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > > > > 
> > > > > > When running on a host system equipped with a Broadcom
> > > > > > NetXtreme-E
> > > > > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in
> > > > > > the
> > > > > > guest drops to 100–200 KB/s. The same guest configuration
> > > > > > performs
> > > > > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM
> > > > > > is
> > > > > > moved to a host with Intel NICs.
> > > > > > 
> > > > > > Test environment:
> > > > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > > > - Guest: Linux with virtio-net interface
> > > > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host
> > > > > > level)
> > > > > > - CPU: AMD EPYC
> > > > > > - Storage: virtio-scsi
> > > > > > - VM network: virtio-net, virtio-scsi (no CPU or IO
> > > > > > bottlenecks)
> > > > > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > > > > 
> > > > > > This issue is not present:
> > > > > > - On 6.8.0 
> > > > > > - On hosts with Intel NICs (same VM config)
> > > > > > 
> > > > > > I have bisected the issue to the following upstream commit:
> > > > > > 
> > > > > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for
> > > > > > small
> > > > > > tx")
> > > > > >   https://git.kernel.org/linus/49d14b54a527
> > > > > 
> > > > > Thanks a lot for the info!
> > > > > 
> > > > > 
> > > > > both the link and commit point at:
> > > > > 
> > > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > > Author: Eric Dumazet <edumazet@google.com>
> > > > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > > > 
> > > > >     net: test for not too small csum_start in
> > > > > virtio_net_hdr_to_skb()
> > > > >     
> > > > > 
> > > > > is this what you mean?
> > > > > 
> > > > > I don't know which commit is "virtio-net: Suppress tx timeout
> > > > > warning
> > > > > for small tx"
> > > > > 
> > > > > 
> > > > > 
> > > > > > Reverting this commit restores normal network performance in
> > > > > > affected guest VMs.
> > > > > > 
> > > > > > I’m happy to provide more data or assist with testing a
> > > > > > potential
> > > > > > fix.
> > > > > > 
> > > > > > Thanks,
> > > > > > Markus Fohrer
> > > > > 
> > > > > 
> > > > > > Thanks! First, I think it's worth checking what the setup is,
> > > > > > e.g. which offloads are enabled.
> > > > > > Besides that, I'd start by seeing what's going on. Assuming I'm
> > > > > > right about Eric's patch:
> > > > > 
> > > > > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > > > > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > > > > --- a/include/linux/virtio_net.h
> > > > > > +++ b/include/linux/virtio_net.h
> > > > > > @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
> > > > > >  
> > > > > >  		if (!skb_partial_csum_set(skb, start, off))
> > > > > >  			return -EINVAL;
> > > > > > +		if (skb_transport_offset(skb) < nh_min_len)
> > > > > > +			return -EINVAL;
> > > > > >  
> > > > > > -		nh_min_len = max_t(u32, nh_min_len, skb_transport_offset(skb));
> > > > > > +		nh_min_len = skb_transport_offset(skb);
> > > > > >  		p_off = nh_min_len + thlen;
> > > > > >  		if (!pskb_may_pull(skb, p_off))
> > > > > >  			return -EINVAL;
> > > > > 
> > > > > 
> > > > > sticking a printk before return -EINVAL to show the offset and
> > > > > nh_min_len
> > > > > would be a good 1st step. Thanks!
> > > > > 
> > > > 
> > > > I added the following printk inside virtio_net_hdr_to_skb():
> > > > 
> > > >     if (skb_transport_offset(skb) < nh_min_len){
> > > > >         printk(KERN_INFO "virtio_net: 3 drop, transport_offset=%u, nh_min_len=%u\n",
> > > >                skb_transport_offset(skb), nh_min_len);
> > > >         return -EINVAL;
> > > >     }
> > > > 
> > > > Built and installed the kernel, then triggered a large download
> > > > via:
> > > > 
> > > >     wget http://speedtest.belwue.net/10G
> > > > 
> > > > Relevant output from `dmesg -w`:
> > > > 
> > > > > [   57.327943] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.428942] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.428962] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.553068] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.553088] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.576678] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.618438] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.618453] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.703077] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.823072] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.891982] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.946190] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   58.218686] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > 
> > > Hmm indeed. And what about these values?
> > >                 u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
> > >                 u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
> > >                 u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
> > > print them too?
> > > 
> > > 
> > > 
> > > > I would now do the test with commit
> > > > 49d14b54a527289d09a9480f214b8c586322310a and commit
> > > > 49d14b54a527289d09a9480f214b8c586322310a~1
> > > > 
> > > 
> > > Worth checking, though it now seems likely that the hypervisor is
> > > doing weird things. What kind of backend is it? qemu? tun?
> > > vhost-user? vhost-net?
> > > 
> > 
> > Backend: QEMU/KVM hypervisor (Proxmox)
> > 
> > 
> > printk output:
> > 
> > [   58.641906] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> > [   58.678048] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> > [   58.952871] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> > [   58.962157] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> > [   59.071645] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40

So likely a TCP/IPv4 packet, but with VIRTIO_NET_HDR_GSO_TCPV6.
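
A minimal sketch of that inference, assuming an untagged Ethernet frame
(14-byte header) and an IP header without options or extension headers.
The helper names are made up for illustration, and the mapping from GSO
type to minimum header length is an assumption read off the logged
nh_min_len=40, not a copy of the kernel code:

```c
#include <stdint.h>

/* Fixed header sizes: Ethernet without VLAN tag, IPv4 without options,
 * IPv6 without extension headers. */
enum { ETH_HDR_LEN = 14, IPV4_HDR_LEN = 20, IPV6_HDR_LEN = 40 };

/* Hypothetical helper: guess the IP version from where the transport
 * header starts. Returns 4, 6, or 0 when neither plain layout fits. */
static int guess_ip_version(uint32_t transport_offset)
{
	if (transport_offset == ETH_HDR_LEN + IPV4_HDR_LEN)
		return 4;
	if (transport_offset == ETH_HDR_LEN + IPV6_HDR_LEN)
		return 6;
	return 0;
}

/* Hypothetical helper: minimum network header length implied by the IP
 * version of the GSO type (assumption: 40 for IPv6, 20 otherwise). */
static uint32_t nh_min_len_for(int gso_ip_version)
{
	return gso_ip_version == 6 ? IPV6_HDR_LEN : IPV4_HDR_LEN;
}
```

Here guess_ip_version(34) returns 4, while the logged nh_min_len of 40
equals nh_min_len_for(6), so the GSO type in the header and the actual
packet appear to disagree.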

This is observed in the guest on the ingress path, right? In
virtnet_receive_done.

Is this using vhost-net in the host for pass-through? IOW, is
the host writing the virtio_net_hdr too?

> > 
> > 
> > 
> > 
> 
> I just noticed that commit 17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8
> (tcp_offload.c: gso fix) also touches checksum handling and may
> affect how skb state is passed to virtio_net_hdr_to_skb().
> 
> Is it possible that the regression only appears due to the combination
> of 17bd3bd8 and 49d14b54a5?

That patch only affects packets with SKB_GSO_FRAGLIST. Which is only
set on forwarding if NETIF_F_FRAGLIST is set. I don 


* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-04 15:13           ` Willem de Bruijn
@ 2025-04-04 20:23             ` Markus Fohrer
  2025-04-04 22:05             ` Ilya Maximets
  1 sibling, 0 replies; 24+ messages in thread
From: Markus Fohrer @ 2025-04-04 20:23 UTC (permalink / raw)
  To: Willem de Bruijn, Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Friday, 4 April 2025 at 11:13 -0400, Willem de Bruijn wrote:
> Markus Fohrer wrote:
> > On Friday, 4 April 2025 at 10:52 +0200, Markus Fohrer wrote:
> > > On Friday, 4 April 2025 at 04:29 -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Apr 04, 2025 at 10:16:55AM +0200, Markus Fohrer wrote:
> > > > > On Thursday, 3 April 2025 at 09:04 -0400, Michael S. Tsirkin
> > > > > wrote:
> > > > > > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer
> > > > > > wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > I'm observing a significant performance regression in KVM
> > > > > > > guest
> > > > > > > VMs
> > > > > > > using virtio-net with recent Linux kernels (6.8.1+ and
> > > > > > > 6.14).
> > > > > > > 
> > > > > > > When running on a host system equipped with a Broadcom
> > > > > > > NetXtreme-E
> > > > > > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput
> > > > > > > in
> > > > > > > the
> > > > > > > guest drops to 100–200 KB/s. The same guest configuration
> > > > > > > performs
> > > > > > > normally (~100 MB/s) when using kernel 6.8.0 or when the
> > > > > > > VM
> > > > > > > is
> > > > > > > moved to a host with Intel NICs.
> > > > > > > 
> > > > > > > Test environment:
> > > > > > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > > > > > - Guest: Linux with virtio-net interface
> > > > > > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at
> > > > > > > host
> > > > > > > level)
> > > > > > > - CPU: AMD EPYC
> > > > > > > - Storage: virtio-scsi
> > > > > > > - VM network: virtio-net, virtio-scsi (no CPU or IO
> > > > > > > bottlenecks)
> > > > > > > - Traffic test: iperf3, scp, wget consistently slow in
> > > > > > > guest
> > > > > > > 
> > > > > > > This issue is not present:
> > > > > > > - On 6.8.0 
> > > > > > > - On hosts with Intel NICs (same VM config)
> > > > > > > 
> > > > > > > I have bisected the issue to the following upstream
> > > > > > > commit:
> > > > > > > 
> > > > > > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning
> > > > > > > for
> > > > > > > small
> > > > > > > tx")
> > > > > > >   https://git.kernel.org/linus/49d14b54a527
> > > > > > 
> > > > > > Thanks a lot for the info!
> > > > > > 
> > > > > > 
> > > > > > both the link and commit point at:
> > > > > > 
> > > > > > commit 49d14b54a527289d09a9480f214b8c586322310a
> > > > > > Author: Eric Dumazet <edumazet@google.com>
> > > > > > Date:   Thu Sep 26 16:58:36 2024 +0000
> > > > > > 
> > > > > >     net: test for not too small csum_start in virtio_net_hdr_to_skb()
> > > > > >     
> > > > > > 
> > > > > > is this what you mean?
> > > > > > 
> > > > > > I don't know which commit is "virtio-net: Suppress tx timeout warning
> > > > > > for small tx"
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > Reverting this commit restores normal network performance in
> > > > > > > affected guest VMs.
> > > > > > > 
> > > > > > > I’m happy to provide more data or assist with testing a potential
> > > > > > > fix.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Markus Fohrer
> > > > > > 
> > > > > > 
> > > > > > Thanks! First I think it's worth checking what is the setup, e.g.
> > > > > > which offloads are enabled.
> > > > > > Besides that, I'd start by seeing what's going on. Assuming I'm right
> > > > > > about Eric's patch:
> > > > > > 
> > > > > > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > > > > > index 276ca543ef44d8..02a9f4dc594d02 100644
> > > > > > --- a/include/linux/virtio_net.h
> > > > > > +++ b/include/linux/virtio_net.h
> > > > > > @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
> > > > > >  
> > > > > >  		if (!skb_partial_csum_set(skb, start, off))
> > > > > >  			return -EINVAL;
> > > > > > +		if (skb_transport_offset(skb) < nh_min_len)
> > > > > > +			return -EINVAL;
> > > > > >  
> > > > > > -		nh_min_len = max_t(u32, nh_min_len, skb_transport_offset(skb));
> > > > > > +		nh_min_len = skb_transport_offset(skb);
> > > > > >  		p_off = nh_min_len + thlen;
> > > > > >  		if (!pskb_may_pull(skb, p_off))
> > > > > >  			return -EINVAL;
> > > > > > 
> > > > > > 
> > > > > > sticking a printk before return -EINVAL to show the offset and
> > > > > > nh_min_len would be a good 1st step. Thanks!
> > > > > > 
> > > > > 
> > > > > I added the following printk inside virtio_net_hdr_to_skb():
> > > > > 
> > > > >     if (skb_transport_offset(skb) < nh_min_len){
> > > > >         printk(KERN_INFO "virtio_net: 3 drop, transport_offset=%u, nh_min_len=%u\n",
> > > > >                skb_transport_offset(skb), nh_min_len);
> > > > >         return -EINVAL;
> > > > >     }
> > > > > 
> > > > > Built and installed the kernel, then triggered a large download via:
> > > > > 
> > > > >     wget http://speedtest.belwue.net/10G
> > > > > 
> > > > > Relevant output from `dmesg -w`:
> > > > > 
> > > > > [   57.327943] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.428942] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.428962] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.553068] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.553088] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.576678] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.618438] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.618453] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.703077] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.823072] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.891982] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   57.946190] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > > [   58.218686] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
> > > > 
> > > > Hmm indeed. And what about these values?
> > > >                 u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
> > > >                 u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
> > > >                 u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
> > > > print them too?
> > > > 
> > > > 
> > > > 
> > > > > I would now do the test with commit
> > > > > 49d14b54a527289d09a9480f214b8c586322310a and commit
> > > > > 49d14b54a527289d09a9480f214b8c586322310a~1
> > > > > 
> > > > 
> > > > Worth checking though it seems likely now the hypervisor is doing
> > > > weird things. what kind of backend is it? qemu? tun? vhost-user?
> > > > vhost-net?
> > > > 
> > > 
> > > Backend: QEMU/KVM hypervisor (Proxmox)
> > > 
> > > 
> > > printk output:
> > > 
> > > [   58.641906] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> > > [   58.678048] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> > > [   58.952871] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> > > [   58.962157] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> > > [   59.071645] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> 
> So likely a TCP/IPv4 packet, but with VIRTIO_NET_HDR_GSO_TCPV6.
> 
> This is observed in the guest on the ingress path, right? In
> virtnet_receive_done.

Yes, all tests are done inside the guest system. Packet drops are seen
when receiving traffic.

I hadn't tested upload before, so I did now:

- Download is slow (<200 KB/s)
- Upload works fine (~190 MB/s)
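
For reference, the logged values are consistent with a TCP/IPv4 frame being
checked against the IPv6 minimum network header length. The user-space C
sketch below replays that arithmetic. It is not kernel code: would_drop(),
csum_needed() and check_logged_values() are hypothetical names chosen for
illustration, and the header sizes assume an untagged Ethernet frame with no
IPv4 or TCP options (thlen is assumed to be the 20-byte TCP header, which
matches needed=54 in the log).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Header sizes in bytes. Assumptions: untagged Ethernet, no IP/TCP options. */
enum {
	ETH_HDR_LEN    = 14,
	IPV4_HDR_LEN   = 20,
	IPV6_HDR_LEN   = 40,	/* nh_min_len reported in the log */
	TCP_HDR_LEN    = 20,	/* assumed thlen */
	CSUM_FIELD_LEN = 2	/* size of a 16-bit checksum field */
};

/* Models the comparison from the quoted diff:
 * skb_transport_offset(skb) < nh_min_len  =>  -EINVAL (packet dropped). */
static bool would_drop(uint32_t transport_offset, uint32_t nh_min_len)
{
	return transport_offset < nh_min_len;
}

/* Models: needed = start + max_t(u32, thlen, off + sizeof(__sum16)) */
static uint32_t csum_needed(uint32_t start, uint32_t thlen, uint32_t off)
{
	uint32_t csum_end = off + CSUM_FIELD_LEN;

	return start + (thlen > csum_end ? thlen : csum_end);
}

/* Replays the values from the log (hypothetical helper, not a kernel API). */
void check_logged_values(void)
{
	uint32_t v4_off = ETH_HDR_LEN + IPV4_HDR_LEN;	/* 34, as logged */
	uint32_t v6_off = ETH_HDR_LEN + IPV6_HDR_LEN;	/* 54 */

	/* IPv4 frame checked against the IPv6 minimum: 34 < 40, dropped. */
	assert(would_drop(v4_off, IPV6_HDR_LEN));
	/* A real IPv6 frame would pass the same check. */
	assert(!would_drop(v6_off, IPV6_HDR_LEN));
	/* start=34, off=16 (TCP checksum field offset) gives needed=54. */
	assert(csum_needed(v4_off, TCP_HDR_LEN, 16) == 54);
}
```

Calling check_logged_values() from a small test program should return
without tripping an assertion. Under these assumptions the drop only needs a
wrong GSO type in the virtio_net_hdr, not a malformed packet.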

> 
> Is this using vhost-net in the host for pass-through? IOW, is
> the host writing the virtio_net_hdr too?

Yes, the guest runs on a Proxmox host using QEMU/KVM with vhost-net.

vhost_net module is loaded:

  # lsmod | grep vhost
  vhost_net              32768  30
  vhost                  61440  1 vhost_net
  vhost_iotlb            16384  1 vhost
  tap                    28672  1 vhost_net

QEMU is launched with:

  -netdev type=tap,...,vhost=on
  -device virtio-net-pci,...

> 
> > > 
> > > 
> > > 
> > > 
> > 
> > I just noticed that commit 17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8
> > (tcp_offload.c: gso fix) also touches checksum handling and may
> > affect how skb state is passed to virtio_net_hdr_to_skb().
> > 
> > Is it possible that the regression only appears due to the combination
> > of 17bd3bd8 and 49d14b54a5?
> 
> That patch only affects packets with SKB_GSO_FRAGLIST. Which is only
> set on forwarding if NETIF_F_FRAGLIST is set. I don 

Checked in the guest:

  # ethtool -k eth0 | grep frag
    tx-scatter-gather-fraglist: off [fixed]

So fraglist offload is disabled in the guest.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-04 15:13           ` Willem de Bruijn
  2025-04-04 20:23             ` Markus Fohrer
@ 2025-04-04 22:05             ` Ilya Maximets
  2025-04-05  6:15               ` Markus Fohrer
  1 sibling, 1 reply; 24+ messages in thread
From: Ilya Maximets @ 2025-04-04 22:05 UTC (permalink / raw)
  To: Willem de Bruijn, Markus Fohrer, Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel,
	i.maximets

On 4/4/25 5:13 PM, Willem de Bruijn wrote:
> Markus Fohrer wrote:
>> On Friday, 04.04.2025 at 10:52 +0200, Markus Fohrer wrote:
>>> Backend: QEMU/KVM hypervisor (Proxmox)
>>>
>>>
>>> printk output:
>>>
>>> [   58.641906] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
>>> [   58.678048] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
>>> [   58.952871] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
>>> [   58.962157] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
>>> [   59.071645] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
> 
> So likely a TCP/IPv4 packet, but with VIRTIO_NET_HDR_GSO_TCPV6.


Hi, Markus.

Given this and the fact that the issue depends on the bnxt_en NIC on the
host, I'd make an educated guess that the problem is the host NIC driver.

There are some known GRO issues in the bnxt_en driver fixed recently in

  commit de37faf41ac55619dd329229a9bd9698faeabc52
  Author: Michael Chan <michael.chan@broadcom.com>
  Date:   Wed Dec 4 13:59:17 2024 -0800

    bnxt_en: Fix GSO type for HW GRO packets on 5750X chips

It's not clear to me what's your host kernel version.  But the commit
above was introduced in 6.14 and may be in fairly recent stable kernels.
The oldest is v6.12.6 AFAICT.  Can you try one of these host kernels?

Also, to confirm and work around the problem, please try disabling HW GRO
on the bnxt_en NIC first:

  ethtool -K <BNXT_EN NIC IFACE> rx-gro-hw off

If that doesn't help, then the problem is likely something different.

Best regards, Ilya Maximets.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-04 22:05             ` Ilya Maximets
@ 2025-04-05  6:15               ` Markus Fohrer
  2025-04-05 12:18                 ` Ilya Maximets
  0 siblings, 1 reply; 24+ messages in thread
From: Markus Fohrer @ 2025-04-05  6:15 UTC (permalink / raw)
  To: Ilya Maximets, Willem de Bruijn, Michael S. Tsirkin
  Cc: virtualization, jasowang, davem, edumazet, netdev, linux-kernel

On Saturday, 05.04.2025 at 00:05 +0200, Ilya Maximets wrote:

> On 4/4/25 5:13 PM, Willem de Bruijn wrote:
> > 
> > So likely a TCP/IPv4 packet, but with VIRTIO_NET_HDR_GSO_TCPV6.
> 
> 
> 
> Hi, Markus.
> 
> Given this and the fact that the issue depends on the bnxt_en NIC on the
> host, I'd make an educated guess that the problem is the host NIC driver.
> 
> There are some known GRO issues in the bnxt_en driver fixed recently in
> 
>   commit de37faf41ac55619dd329229a9bd9698faeabc52
>   Author: Michael Chan <michael.chan@broadcom.com>
>   Date:   Wed Dec 4 13:59:17 2024 -0800
> 
>     bnxt_en: Fix GSO type for HW GRO packets on 5750X chips
> 
> It's not clear to me what's your host kernel version.  But the commit
> above was introduced in 6.14 and may be in fairly recent stable kernels.
> The oldest is v6.12.6 AFAICT.  Can you try one of these host kernels?
> 
> Also, to confirm and work around the problem, please try disabling HW GRO
> on the bnxt_en NIC first:
> 
>   ethtool -K <BNXT_EN NIC IFACE> rx-gro-hw off
> 
> If that doesn't help, then the problem is likely something different.
> 
> Best regards, Ilya Maximets.


Setting `rx-gro-hw off` on the Broadcom interfaces also resolves the issue:

  ethtool -K ens1f0np0 rx-gro-hw off
  ethtool -K ens1f1np1 rx-gro-hw off
  ethtool -K ens1f2np2 rx-gro-hw off
  ethtool -K ens1f3np3 rx-gro-hw off

With this setting applied, the guest receives traffic correctly even when GRO is enabled on the host.

The system is running the latest Proxmox kernel:

6.8.12-9-pve






^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
  2025-04-05  6:15               ` Markus Fohrer
@ 2025-04-05 12:18                 ` Ilya Maximets
  0 siblings, 0 replies; 24+ messages in thread
From: Ilya Maximets @ 2025-04-05 12:18 UTC (permalink / raw)
  To: Markus Fohrer, Willem de Bruijn, Michael S. Tsirkin
  Cc: i.maximets, virtualization, jasowang, davem, edumazet, netdev,
	linux-kernel

On 4/5/25 8:15 AM, Markus Fohrer wrote:
> On Saturday, 05.04.2025 at 00:05 +0200, Ilya Maximets wrote:
> 
>> On 4/4/25 5:13 PM, Willem de Bruijn wrote:
>>
>>> Markus Fohrer wrote:
>>>
>>>> On Friday, 04.04.2025 at 10:52 +0200, Markus Fohrer wrote:
>>>>
>>>>> Am Freitag, dem 04.04.2025 um 04:29 -0400 schrieb Michael S. Tsirkin:
>>>>>
>>>>>> On Fri, Apr 04, 2025 at 10:16:55AM +0200, Markus Fohrer wrote:
>>>>>>
>>>>>>> Am Donnerstag, dem 03.04.2025 um 09:04 -0400 schrieb Michael S.
>>>>>>> Tsirkin:
>>>>>>>
>>>>>>>> On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm observing a significant performance regression in KVM
>>>>>>>>> guest
>>>>>>>>> VMs
>>>>>>>>> using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
>>>>>>>>>
>>>>>>>>> When running on a host system equipped with a Broadcom
>>>>>>>>> NetXtreme-E
>>>>>>>>> (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in
>>>>>>>>> the
>>>>>>>>> guest drops to 100–200 KB/s. The same guest configuration
>>>>>>>>> performs
>>>>>>>>> normally (~100 MB/s) when using kernel 6.8.0 or when the VM
>>>>>>>>> is
>>>>>>>>> moved to a host with Intel NICs.
>>>>>>>>>
>>>>>>>>> Test environment:
>>>>>>>>> - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
>>>>>>>>> - Guest: Linux with virtio-net interface
>>>>>>>>> - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host
>>>>>>>>> level)
>>>>>>>>> - CPU: AMD EPYC
>>>>>>>>> - Storage: virtio-scsi
>>>>>>>>> - VM network: virtio-net, virtio-scsi (no CPU or IO
>>>>>>>>> bottlenecks)
>>>>>>>>> - Traffic test: iperf3, scp, wget consistently slow in guest
>>>>>>>>>
>>>>>>>>> This issue is not present:
>>>>>>>>> - On 6.8.0 
>>>>>>>>> - On hosts with Intel NICs (same VM config)
>>>>>>>>>
>>>>>>>>> I have bisected the issue to the following upstream commit:
>>>>>>>>>
>>>>>>>>>   49d14b54a527 ("virtio-net: Suppress tx timeout warning for
>>>>>>>>> small
>>>>>>>>> tx")
>>>>>>>>>   [https://git.kernel.org/linus/49d14b54a527](https://git.kernel.org/linus/49d14b54a527)
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks a lot for the info!
>>>>>>>>
>>>>>>>>
>>>>>>>> both the link and commit point at:
>>>>>>>>
>>>>>>>> commit 49d14b54a527289d09a9480f214b8c586322310a
>>>>>>>> Author: Eric Dumazet <[edumazet@google.com](mailto:edumazet@google.com)>
>>>>>>>> Date:   Thu Sep 26 16:58:36 2024 +0000
>>>>>>>>
>>>>>>>>     net: test for not too small csum_start in
>>>>>>>> virtio_net_hdr_to_skb()
>>>>>>>>     
>>>>>>>>
>>>>>>>> is this what you mean?
>>>>>>>>
>>>>>>>> I don't know which commit is "virtio-net: Suppress tx timeout
>>>>>>>> warning
>>>>>>>> for small tx"
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Reverting this commit restores normal network performance in
>>>>>>>>> affected guest VMs.
>>>>>>>>>
>>>>>>>>> I’m happy to provide more data or assist with testing a
>>>>>>>>> potential
>>>>>>>>> fix.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Markus Fohrer
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks! First I think it's worth checking what is the setup,
>>>>>>>> e.g.
>>>>>>>> which offloads are enabled.
>>>>>>>> Besides that, I'd start by seeing what's doing on. Assuming I'm
>>>>>>>> right
>>>>>>>> about
>>>>>>>> Eric's patch:
>>>>>>>>
>>>>>>>> diff --git a/include/linux/virtio_net.h
>>>>>>>> b/include/linux/virtio_net.h
>>>>>>>> index 276ca543ef44d8..02a9f4dc594d02 100644
>>>>>>>> --- a/include/linux/virtio_net.h
>>>>>>>> +++ b/include/linux/virtio_net.h
>>>>>>>> @@ -103,8 +103,10 @@ static inline int
>>>>>>>> virtio_net_hdr_to_skb(struct
>>>>>>>> sk_buff *skb,
>>>>>>>>  
>>>>>>>>  		if (!skb_partial_csum_set(skb, start, off))
>>>>>>>>  			return -EINVAL;
>>>>>>>> +		if (skb_transport_offset(skb) < nh_min_len)
>>>>>>>> +			return -EINVAL;
>>>>>>>>  
>>>>>>>> -		nh_min_len = max_t(u32, nh_min_len,
>>>>>>>> skb_transport_offset(skb));
>>>>>>>> +		nh_min_len = skb_transport_offset(skb);
>>>>>>>>  		p_off = nh_min_len + thlen;
>>>>>>>>  		if (!pskb_may_pull(skb, p_off))
>>>>>>>>  			return -EINVAL;
>>>>>>>>
>>>>>>>>
>>>>>>>> sticking a printk before return -EINVAL to show the offset and
>>>>>>>> nh_min_len
>>>>>>>> would be a good 1st step. Thanks!
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I added the following printk inside virtio_net_hdr_to_skb():
>>>>>>>
>>>>>>>     if (skb_transport_offset(skb) < nh_min_len){
>>>>>>>         printk(KERN_INFO "virtio_net: 3 drop,
>>>>>>> transport_offset=%u,
>>>>>>> nh_min_len=%u\n",
>>>>>>>                skb_transport_offset(skb), nh_min_len);
>>>>>>>         return -EINVAL;
>>>>>>>     }
>>>>>>>
>>>>>>> Built and installed the kernel, then triggered a large download
>>>>>>> via:
>>>>>>>
>>>>>>>     wget [http://speedtest.belwue.net/10G](http://speedtest.belwue.net/10G)
>>>>>>>
>>>>>>> Relevant output from `dmesg -w`:
>>>>>>>
>>>>>>> [   57.327943] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.428942] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.428962] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.553068] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.553088] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.576678] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.618438] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.618453] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.703077] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.823072] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.891982] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   57.946190] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>> [   58.218686] virtio_net: 3 drop, transport_offset=34, nh_min_len=40
>>>>>>
>>>>>>
>>>>>> Hmm indeed. And what about these values?
>>>>>>                 u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
>>>>>>                 u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
>>>>>>                 u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
>>>>>> Print them too?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> I would now do the test with commit
>>>>>>> 49d14b54a527289d09a9480f214b8c586322310a and commit
>>>>>>> 49d14b54a527289d09a9480f214b8c586322310a~1
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Worth checking, though it seems likely now that the hypervisor is doing
>>>>>> weird things. What kind of backend is it? qemu? tun? vhost-user? vhost-net?
>>>>>>
>>>>>
>>>>>
>>>>> Backend: QEMU/KVM hypervisor (Proxmox)
>>>>>
>>>>>
>>>>> printk output:
>>>>>
>>>>> [   58.641906] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
>>>>> [   58.678048] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
>>>>> [   58.952871] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
>>>>> [   58.962157] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
>>>>> [   59.071645] virtio_net: drop, transport_offset=34  start=34, off=16, needed=54, nh_min_len=40
>>>>
>>>
>>>
>>> So likely a TCP/IPv4 packet, but with VIRTIO_NET_HDR_GSO_TCPV6.
>>
>>
>>
>> Hi, Markus.
>>
>> Given this and the fact that the issue depends on the bnxt_en NIC on the
>> host, I'd make an educated guess that the problem is the host NIC driver.
>>
>> There are some known GRO issues in the bnxt_en driver fixed recently in
>>
>>   commit de37faf41ac55619dd329229a9bd9698faeabc52
>>   Author: Michael Chan <michael.chan@broadcom.com>
>>   Date:   Wed Dec 4 13:59:17 2024 -0800
>>
>>     bnxt_en: Fix GSO type for HW GRO packets on 5750X chips
>>
>> It's not clear to me what's your host kernel version.  But the commit
>> above was introduced in 6.14 and may be in fairly recent stable kernels.
>> The oldest is v6.12.6 AFAICT.  Can you try one of these host kernels?
>>
>> Also, to confirm and work around the problem, please try disabling HW GRO
>> on the bnxt_en NIC first:
>>
>>   ethtool -K <BNXT_EN NIC IFACE> rx-gro-hw off
>>
>> If that doesn't help, then the problem is likely something different.
>>
>> Best regards, Ilya Maximets.
> 
> 
> Setting `rx-gro-hw off` on the Broadcom interfaces also resolves the issue:
> 
> ethtool -K ens1f0np0 rx-gro-hw off  
> ethtool -K ens1f1np1 rx-gro-hw off  
> ethtool -K ens1f2np2 rx-gro-hw off  
> ethtool -K ens1f3np3 rx-gro-hw off
> 
> With this setting applied, the guest receives traffic correctly even when GRO is enabled on the host.

OK.  It's definitely a host bnxt_en driver bug then.
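
For reference, the numbers in the printk output quoted above are
consistent with that. Below is a minimal user-space sketch of the
arithmetic, not kernel code. It assumes an untagged Ethernet frame
(14 bytes), an IPv4 header without options (20 bytes), and that the guest
derives nh_min_len from the GSO type in the virtio-net header (40 bytes,
the fixed IPv6 header size, for VIRTIO_NET_HDR_GSO_TCPV6). The enum and
function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed header sizes in bytes (assumed: no VLAN tag, no IPv4/TCP options). */
enum {
	ETH_HDR_LEN  = 14,	/* Ethernet header */
	IPV4_HDR_LEN = 20,	/* minimal IPv4 header */
	IPV6_HDR_LEN = 40,	/* fixed IPv6 header */
	TCP_HDR_LEN  = 20,	/* minimal TCP header */
	CSUM_LEN     = 2,	/* size of a 16-bit checksum field */
};

/* Hypothetical stand-ins for the GSO types carried in the virtio-net header. */
enum gso_type { GSO_TCPV4, GSO_TCPV6 };

/* Minimum network header length implied by the advertised GSO type. */
static uint32_t nh_min_len_for(enum gso_type t)
{
	return t == GSO_TCPV6 ? IPV6_HDR_LEN : IPV4_HDR_LEN;
}

/*
 * Models the check from the diff above: the packet is rejected when the
 * transport header starts before the minimum network header length.
 */
static bool would_drop(uint32_t transport_offset, enum gso_type t)
{
	return transport_offset < nh_min_len_for(t);
}

/* Models "needed = start + max_t(u32, thlen, off + sizeof(__sum16))". */
static uint32_t needed_len(uint32_t start, uint32_t off, uint32_t thlen)
{
	uint32_t csum_end = off + CSUM_LEN;

	return start + (thlen > csum_end ? thlen : csum_end);
}
```

With transport_offset = 14 + 20 = 34 and a TCPv6 GSO type, would_drop()
returns true, which matches the "transport_offset=34, nh_min_len=40" log
lines. The same packet labelled as TCPv4 would pass, and
needed_len(34, 16, 20) gives 54, matching "needed=54".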

> 
> The system is running the latest Proxmox kernel:
> 
> 6.8.12-9-pve

6.8 is long EoL upstream, so you need to ask the distribution maintainers to
backport the aforementioned bnxt_en driver fix (de37faf41ac5), or move to
the latest 6.12+ stable kernels, which are supported upstream.

Since Proxmox mostly just rebuilds Ubuntu kernels, you probably need to
ask for fixes to be backported in the corresponding Ubuntu kernel first.

Meanwhile, you may run with rx-gro-hw off on those cards.

Best regards, Ilya Maximets.

> 
> 
> 
> 
>>> This is observed in the guest on the ingress path, right? In
>>> virtnet_receive_done.
>>>
>>> Is this using vhost-net in the host for pass-through? IOW, is
>>> the host writing the virtio_net_hdr too?
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> I just noticed that commit 17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8
>>>> (tcp_offload.c: gso fix) also touches checksum handling and may
>>>> affect how skb state is passed to virtio_net_hdr_to_skb().
>>>>
>>>> Is it possible that the regression only appears due to the combination
>>>> of 17bd3bd8 and 49d14b54a5?
>>>
>>>
>>> That patch only affects packets with SKB_GSO_FRAGLIST. Which is only
>>> set on forwarding if NETIF_F_FRAGLIST is set. I don 
>>
>>
> 



end of thread, other threads:[~2025-04-05 12:19 UTC | newest]

Thread overview: 24+ messages
2025-04-02 21:12 [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+ Markus Fohrer
2025-04-03 13:04 ` Michael S. Tsirkin
2025-04-03 13:51   ` Markus Fohrer
2025-04-03 14:03     ` Michael S. Tsirkin
2025-04-03 14:26       ` Willem de Bruijn
2025-04-03 20:00         ` Markus Fohrer
2025-04-03 20:35         ` Markus Fohrer
2025-04-03 20:07       ` Markus Fohrer
2025-04-03 21:06         ` Michael S. Tsirkin
2025-04-03 21:24           ` Markus Fohrer
2025-04-03 21:49             ` Willem de Bruijn
2025-04-03 22:05             ` Michael S. Tsirkin
2025-04-04 11:32               ` Markus Fohrer
2025-04-04  8:16   ` Markus Fohrer
2025-04-04  8:29     ` Michael S. Tsirkin
2025-04-04  8:52       ` Markus Fohrer
2025-04-04 11:40         ` Markus Fohrer
2025-04-04 15:13           ` Willem de Bruijn
2025-04-04 20:23             ` Markus Fohrer
2025-04-04 22:05             ` Ilya Maximets
2025-04-05  6:15               ` Markus Fohrer
2025-04-05 12:18                 ` Ilya Maximets
2025-04-04  7:59 ` Torsten Krah
2025-04-04  8:26   ` Michael S. Tsirkin
