From: Simon Horman <horms@verge.net.au>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>, netdev@vger.kernel.org
Subject: Re: Bonding, GRO and tcp_reordering
Date: Fri, 3 Dec 2010 22:38:00 +0900
Message-ID: <20101203133800.GA26038@verge.net.au>
In-Reply-To: <20101201043445.GC3485@verge.net.au>
On Wed, Dec 01, 2010 at 01:34:45PM +0900, Simon Horman wrote:
> On Tue, Nov 30, 2010 at 05:04:33PM +0100, Eric Dumazet wrote:
> > Le mardi 30 novembre 2010 à 15:42 +0000, Ben Hutchings a écrit :
> > > On Tue, 2010-11-30 at 22:55 +0900, Simon Horman wrote:
To clarify my statement in a previous email that GSO had no effect: I
re-ran the tests and I still haven't observed any effect of GSO on my
results. However, I did notice that for GRO on the server to have an
effect I also need TSO enabled on the client. I thought that I had
previously checked that, but I was mistaken.
Enabling TSO on the client while leaving GSO disabled on the server
resulted in increased CPU utilisation on the client, from ~15% to ~20%.
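For reference, those offload toggles are ethtool -K settings; a minimal
sketch, assuming the NIC is eth0 (with bonding the toggles may need to be
applied on the slave devices rather than on the bond itself):

# ethtool -K eth0 tso on     # client: enable TSO
# ethtool -K eth0 gso off    # client: leave GSO disabled
# ethtool -K eth0 gro on     # server: enable GRO
# ethtool -k eth0            # either side: verify the offload state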
> > > > The only other parameter that seemed to have a significant effect was
> > > > to increase the MTU. In the case of MTU=9000, GRO seemed to have a
> > > > negative impact on throughput, though a significant positive effect on
> > > > CPU utilisation.
> > > [...]
> > >
> > > Increasing MTU also increases the interval between packets on a TCP flow
> > > using maximum segment size so that it is more likely to exceed the
> > > difference in delay.
> > >
> >
> > GRO really is operational _if_ we receive several packets for the
> > same flow in the same NAPI run.
> >
> > As soon as we exit NAPI mode, GRO packets are flushed.
> >
> > Big MTU --> bigger delays between packets, so a big chance that GRO
> > cannot trigger at all, since NAPI runs for one packet only.
> >
> > One possibility with big MTU is to tweak "ethtool -c eth0" params
> > rx-usecs: 20
> > rx-frames: 5
> > rx-usecs-irq: 0
> > rx-frames-irq: 5
> > so that "rx-usecs" is bigger than the delay between two full-sized
> > MTU packets.
> >
> > Gigabit speed means 1 nanosecond per bit, and MTU=9000 means a 72 us
> > delay between packets.
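
Spelling that arithmetic out: 9000 bytes * 8 = 72,000 bits, which at
1 ns/bit takes 72 us on the wire; the same calculation for MTU=1500
gives only ~12 us between back-to-back full-sized frames.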
> >
> > So try :
> >
> > ethtool -C eth0 rx-usecs 100
> >
> > to get a chance that several packets are delivered at once by the NIC.
> >
> > Unfortunately, this also adds some latency, so it helps bulk transfers
> > and slows down interactive traffic.
>
> Thanks Eric,
>
> I was tweaking those values recently for some latency tuning
> but I didn't think of them in relation to last night's tests.
>
> In terms of my measurements, it's just benchmarking at this stage.
> So a trade-off between throughput and latency is acceptable, so long
> as I remember to measure what it is.
Thanks, rx-usecs was set to 3, and changing it to 15 on the server did
seem to increase throughput with 1500-byte packets, although CPU
utilisation increased too, disproportionately so on the client.
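That is Eric's knob from above; concretely, something like the following
on the server (eth0 is an assumption here, substitute the actual slave
NIC):

# ethtool -C eth0 rx-usecs 15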
MTU=1500, client,server:tcp_reordering=3(default), client:GSO=off,
client:TSO=on, server:GRO=off, server:rx-usecs=3(default)
# netperf -c -4 -t TCP_STREAM -H 172.17.60.216
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1591.34   16.35    5.80     1.683   2.390
MTU=1500, client,server:tcp_reordering=3(default), client:GSO=off,
client:TSO=on, server:GRO=off, server:rx-usecs=15
# netperf -c -4 -t TCP_STREAM -H 172.17.60.216
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1774.38   23.75    7.58     2.193   2.801
I also saw an improvement with GRO enabled on the server and TSO enabled
on the client, although in this case I found rx-usecs=45 to be the best
value.
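The value 45 came out of a simple sweep, roughly along these lines (a
sketch only; "server" as an ssh alias and eth0 are assumptions):

for usecs in 3 15 30 45 60 100; do
        ssh server ethtool -C eth0 rx-usecs $usecs    # adjust receive coalescing
        netperf -c -4 -t TCP_STREAM -H 172.17.60.216  # measure from the client
done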
MTU=1500, client,server:tcp_reordering=3(default), client:GSO=off,
client:TSO=on, server:GRO=on, server:rx-usecs=3(default)
# netperf -c -4 -t TCP_STREAM -H 172.17.60.216
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      2553.27   13.31    3.35     0.854   0.860
MTU=1500, client,server:tcp_reordering=3(default), client:GSO=off,
client:TSO=on, server:GRO=on, server:rx-usecs=45
# netperf -c -4 -t TCP_STREAM -H 172.17.60.216
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      2727.53   29.45    9.48     1.769   2.278
I did not observe any improvement in throughput when increasing rx-usecs
from 3 when using MTU=9000, although there may have been a slight increase
in CPU utilisation (it is hard to say; there is quite a lot of noise in
the results).
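For completeness, the jumbo-frame runs only require raising the MTU on
both ends (and on any switch in between), and the coalescing state can be
inspected per NIC; a sketch, again assuming eth0:

# ip link set dev eth0 mtu 9000    # on both client and server
# ethtool -c eth0                  # show current coalescing parameters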