From: "Michael S. Tsirkin" <mst@redhat.com>
To: Markus Fohrer <markus.fohrer@webked.de>
Cc: virtualization@lists.linux-foundation.org, jasowang@redhat.com,
	davem@davemloft.net, edumazet@google.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+
Date: Fri, 4 Apr 2025 04:29:31 -0400
Message-ID: <20250404042711-mutt-send-email-mst@kernel.org>
In-Reply-To: <11c5cb52d024a5158c5b8c5e69e2e4639a055a31.camel@webked.de>

On Fri, Apr 04, 2025 at 10:16:55AM +0200, Markus Fohrer wrote:
> On Thursday, 03.04.2025 at 09:04 -0400, Michael S. Tsirkin wrote:
> > On Wed, Apr 02, 2025 at 11:12:07PM +0200, Markus Fohrer wrote:
> > > Hi,
> > > 
> > > I'm observing a significant performance regression in KVM guest VMs
> > > using virtio-net with recent Linux kernels (6.8.1+ and 6.14).
> > > 
> > > When running on a host system equipped with a Broadcom NetXtreme-E
> > > (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the
> > > guest drops to 100–200 KB/s. The same guest configuration performs
> > > normally (~100 MB/s) when using kernel 6.8.0 or when the VM is
> > > moved to a host with Intel NICs.
> > > 
> > > Test environment:
> > > - Host: QEMU/KVM, Linux 6.8.1 and 6.14.0
> > > - Guest: Linux with virtio-net interface
> > > - NIC: Broadcom BCM57416 (bnxt_en driver, no issues at host level)
> > > - CPU: AMD EPYC
> > > - Storage: virtio-scsi
> > > - VM network: virtio-net, virtio-scsi (no CPU or IO bottlenecks)
> > > - Traffic test: iperf3, scp, wget consistently slow in guest
> > > 
> > > This issue is not present:
> > > - On 6.8.0 
> > > - On hosts with Intel NICs (same VM config)
> > > 
> > > I have bisected the issue to the following upstream commit:
> > > 
> > >   49d14b54a527 ("virtio-net: Suppress tx timeout warning for small tx")
> > >   https://git.kernel.org/linus/49d14b54a527
> > 
> > Thanks a lot for the info!
> > 
> > 
> > both the link and commit point at:
> > 
> > commit 49d14b54a527289d09a9480f214b8c586322310a
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Thu Sep 26 16:58:36 2024 +0000
> > 
> >     net: test for not too small csum_start in virtio_net_hdr_to_skb()
> >     
> > 
> > is this what you mean?
> > 
> > I don't know which commit is "virtio-net: Suppress tx timeout warning
> > for small tx"
> > 
> > 
> > 
> > > Reverting this commit restores normal network performance in
> > > affected guest VMs.
> > > 
> > > I’m happy to provide more data or assist with testing a potential
> > > fix.
> > > 
> > > Thanks,
> > > Markus Fohrer
> > 
> > 
> > Thanks! First, I think it's worth checking the setup, e.g.
> > which offloads are enabled.
> > Besides that, I'd start by seeing what's going on. Assuming I'm right
> > about Eric's patch:
> > 
> > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > index 276ca543ef44d8..02a9f4dc594d02 100644
> > --- a/include/linux/virtio_net.h
> > +++ b/include/linux/virtio_net.h
> > @@ -103,8 +103,10 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
> >  
> >  		if (!skb_partial_csum_set(skb, start, off))
> >  			return -EINVAL;
> > +		if (skb_transport_offset(skb) < nh_min_len)
> > +			return -EINVAL;
> >  
> > -		nh_min_len = max_t(u32, nh_min_len, skb_transport_offset(skb));
> > +		nh_min_len = skb_transport_offset(skb);
> >  		p_off = nh_min_len + thlen;
> >  		if (!pskb_may_pull(skb, p_off))
> >  			return -EINVAL;
> > 
> > 
> > Sticking a printk before the return -EINVAL to show the offset and
> > nh_min_len would be a good first step. Thanks!
> > 
> 
> I added the following printk inside virtio_net_hdr_to_skb():
> 
>     if (skb_transport_offset(skb) < nh_min_len) {
>         printk(KERN_INFO "virtio_net: 3 drop, transport_offset=%u, nh_min_len=%u\n",
>                skb_transport_offset(skb), nh_min_len);
>         return -EINVAL;
>     }
> 
> Built and installed the kernel, then triggered a large download via:
> 
>     wget http://speedtest.belwue.net/10G
> 
> Relevant output from `dmesg -w`:
> 
> [   57.327943] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.428942] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.428962] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.553068] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.553088] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.576678] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.618438] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.618453] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.703077] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.823072] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.891982] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   57.946190] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  
> [   58.218686] virtio_net: 3 drop, transport_offset=34, nh_min_len=40  

Hmm, indeed. And what about these values?
                u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
                u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
                u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
Could you print them too?
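
For reference, here is a rough userspace sketch (not kernel code) of how
those values relate to the check that is firing. The helper names
(max_u32, csum_needed, would_drop, print_csum_debug) are made up for
illustration. The sample inputs in the note after the code are assumptions
too: start=34 assumes the transport offset reported in the log equals
csum_start, and off=16 with thlen=20 assumes a plain TCP header.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's max_t(u32, a, b). */
static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/*
 * Mirrors the expression quoted above:
 *   needed = start + max_t(u32, thlen, off + sizeof(__sum16));
 * __sum16 is a 16-bit type, so sizeof(uint16_t) stands in for it here.
 */
static uint32_t csum_needed(uint32_t start, uint32_t off, uint32_t thlen)
{
	return start + max_u32(thlen, off + (uint32_t)sizeof(uint16_t));
}

/*
 * Mirrors the comparison added by the patch: returns 1 where the kernel
 * would return -EINVAL, 0 otherwise.
 */
static int would_drop(uint32_t transport_offset, uint32_t nh_min_len)
{
	return transport_offset < nh_min_len;
}

/* Hypothetical helper: prints one line per packet with all the values. */
static void print_csum_debug(uint32_t start, uint32_t off, uint32_t thlen,
			     uint32_t nh_min_len)
{
	printf("start=%" PRIu32 " off=%" PRIu32 " needed=%" PRIu32
	       " nh_min_len=%" PRIu32 " drop=%d\n",
	       start, off, csum_needed(start, off, thlen), nh_min_len,
	       would_drop(start, nh_min_len));
}
```

Under those assumptions, start=34, off=16, thlen=20 gives needed=54, and
34 < 40 trips the new check, which is what the dmesg lines show.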



> I would now do the test with commit
> 49d14b54a527289d09a9480f214b8c586322310a and commit
> 49d14b54a527289d09a9480f214b8c586322310a~1
> 

Worth checking, though it now seems likely that the hypervisor is doing
something odd. What kind of backend is it? qemu? tun? vhost-user? vhost-net?

-- 
MST

