qemu-devel.nongnu.org archive mirror
From: Stefan Hajnoczi <stefanha@gmail.com>
To: Russell King - ARM Linux <linux@armlinux.org.uk>,
	David Woodhouse <dwmw2@infradead.org>
Cc: jasowang@redhat.com, vyasevic@redhat.com, stefanha@redhat.com,
	netdev@vger.kernel.org, qemu-devel@nongnu.org,
	igor.v.kovalenko@gmail.com, dgilbert@redhat.com
Subject: Re: [Qemu-devel] TCP performance problems - GSO/TSO, MSS, 8139cp related
Date: Mon, 14 Nov 2016 16:25:05 +0000
Message-ID: <20161114162505.GD26664@stefanha-x1.localdomain>
In-Reply-To: <20161114092947.GB2031@work-vm>

On Mon, Nov 14, 2016 at 09:29:48AM +0000, Dr. David Alan Gilbert wrote:
> * Russell King - ARM Linux (linux@armlinux.org.uk) wrote:
> > On Fri, Nov 11, 2016 at 09:23:43PM +0000, David Woodhouse wrote:
> > > It's also *fairly* unlikely that the kernel in the guest has developed
> > > a bug and isn't setting gso_size sanely. I'm more inclined to suspect
> > > that qemu isn't properly emulating those bits. But at first glance at
> > > the code, it looks like *that's* been there for the last decade too...
> > 
> > I take issue with that, having looked at the qemu rtl8139 code:
> > 
> >                 if ((txdw0 & CP_TX_LGSEN) && ip_protocol == IP_PROTO_TCP)
> >                 {
> >                     int large_send_mss = (txdw0 >> 16) & CP_TC_LGSEN_MSS_MASK;
> > 
> >                     DPRINTF("+++ C+ mode offloaded task TSO MTU=%d IP data %d "
> >                         "frame data %d specified MSS=%d\n", ETH_MTU,
> >                         ip_data_len, saved_size - ETH_HLEN, large_send_mss);
> > 
> > That's the only reference to "large_send_mss" there, other than that,
> > the MSS value that gets stuck into the field by 8139cp.c is completely
> > unused.  Instead, qemu does this:
> > 
> >                 eth_payload_data = saved_buffer + ETH_HLEN;
> >                 eth_payload_len  = saved_size   - ETH_HLEN;
> > 
> >                 ip = (ip_header*)eth_payload_data;
> > 
> >                     hlen = IP_HEADER_LENGTH(ip);
> >                     ip_data_len = be16_to_cpu(ip->ip_len) - hlen;
> > 
> >                     tcp_header *p_tcp_hdr = (tcp_header*)(eth_payload_data + hlen);
> >                     int tcp_hlen = TCP_HEADER_DATA_OFFSET(p_tcp_hdr);
> > 
> >                     /* ETH_MTU = ip header len + tcp header len + payload */
> >                     int tcp_data_len = ip_data_len - tcp_hlen;
> >                     int tcp_chunk_size = ETH_MTU - hlen - tcp_hlen;
> > 
> >                     for (tcp_send_offset = 0; tcp_send_offset < tcp_data_len; tcp_send_offset += tcp_chunk_size)
> >                     {
> > 
> > It uses a fixed value of ETH_MTU to calculate the size of the TCP
> > data chunks, and this is not surprisingly the well known:
> > 
> > #define ETH_MTU     1500
> > 
> > Qemu seems to be buggy - it ignores the MSS value, and always tries to
> > send 1500 byte frames.
> 
> cc'ing in Stefan who last touched that code and Jason and Vlad who
> know the net code.

CCing Igor Kovalenko, who implemented "fixed for TCP segmentation
offloading - removed dependency on slirp.h" in 2006.  I don't actually
expect him to remember this from 10 years ago, though :).

Looking at the history, the large_send_mss variable was never used for
anything beyond the debug printf.
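
For reference, here is a minimal standalone sketch of how the MSS field
is pulled out of the first descriptor word, following the
"(txdw0 >> 16) & CP_TC_LGSEN_MSS_MASK" expression quoted above.  This is
not the QEMU code: the SKETCH_* names and the function are hypothetical,
and the bit position of the large-send flag and the width of the mask
are assumptions for illustration, so check them against hw/net/rtl8139.c
and the datasheet before relying on them.

```c
#include <stdint.h>

/* Assumed values for illustration only; not copied from hw/net/rtl8139.c. */
#define SKETCH_TX_LGSEN        (1u << 27)  /* assumed large-send enable bit */
#define SKETCH_LGSEN_MSS_SHIFT 16          /* shift taken from the quoted code */
#define SKETCH_LGSEN_MSS_MASK  0x7ffu      /* hypothetical field width */

/*
 * Return the MSS the guest driver placed in the descriptor word, or -1
 * when the large-send bit is clear and the field carries no MSS.
 */
static int sketch_large_send_mss(uint32_t txdw0)
{
    if (!(txdw0 & SKETCH_TX_LGSEN)) {
        return -1;
    }
    return (int)((txdw0 >> SKETCH_LGSEN_MSS_SHIFT) & SKETCH_LGSEN_MSS_MASK);
}
```

With these assumed constants, a descriptor word of
SKETCH_TX_LGSEN | (1000u << 16) decodes to an MSS of 1000, and a word
without the large-send bit yields -1.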

The datasheet for this NIC is here:
http://realtek.info/pdf/rtl8139cp.pdf (see section 9.2.1, Transmit).
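
To make the effect of the fixed ETH_MTU concrete, here is a rough
standalone sketch of the chunking arithmetic from the loop Russell
quoted.  It is not the QEMU code and the sketch_* names are
hypothetical.  The chunk size is passed in as a parameter on purpose:
whether the descriptor's MSS field counts the IP and TCP headers is for
the datasheet to settle, and the sketch does not assume either way.

```c
/* Well-known Ethernet MTU, as in the #define quoted earlier in the thread. */
#define SKETCH_ETH_MTU 1500

/* Per-frame TCP payload when the chunk is derived from a fixed MTU. */
static int sketch_chunk_from_mtu(int ip_hlen, int tcp_hlen)
{
    return SKETCH_ETH_MTU - ip_hlen - tcp_hlen;
}

/*
 * Number of frames needed to carry tcp_data_len bytes of TCP payload when
 * each frame carries at most chunk bytes.  Returns -1 for a chunk size
 * that cannot make progress, which is the case a "MSS too small" check
 * has to reject before entering the segmentation loop.
 */
static int sketch_segment_count(int tcp_data_len, int chunk)
{
    if (chunk <= 0 || tcp_data_len < 0) {
        return -1;
    }
    return (tcp_data_len + chunk - 1) / chunk;
}
```

With 20-byte IP and TCP headers the MTU-derived chunk is 1460 bytes, so
14600 bytes of payload go out as 10 frames.  If the guest actually asked
for 1000-byte chunks, the same payload should have gone out as 15
smaller frames.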

Does this untested patch work for you?

diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index f05e59c..a3f1af5 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -2167,9 +2167,13 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
                     goto skip_offload;
                 }
 
-                /* ETH_MTU = ip header len + tcp header len + payload */
+                /* MSS too small */
+                if (tcp_hlen + hlen >= large_send_mss) {
+                    goto skip_offload;
+                }
+
                 int tcp_data_len = ip_data_len - tcp_hlen;
-                int tcp_chunk_size = ETH_MTU - hlen - tcp_hlen;
+                int tcp_chunk_size = large_send_mss - hlen - tcp_hlen;
 
                 DPRINTF("+++ C+ mode TSO IP data len %d TCP hlen %d TCP "
                     "data len %d TCP chunk size %d\n", ip_data_len,

Thread overview: 6+ messages
     [not found] <20161111210500.GE1041@n2100.armlinux.org.uk>
2016-11-11 21:23 ` [Qemu-devel] TCP performance problems - GSO/TSO, MSS, 8139cp related David Woodhouse
2016-11-11 22:33   ` Russell King - ARM Linux
2016-11-12  2:52     ` David Miller
2016-11-11 22:44   ` Russell King - ARM Linux
2016-11-14  9:29     ` Dr. David Alan Gilbert
2016-11-14 16:25       ` Stefan Hajnoczi [this message]
