From: Simon Horman <horms@verge.net.au>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev@vger.kernel.org, Jay Vosburgh <fubar@us.ibm.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: bonding: flow control regression [was Re: bridging: flow control regression]
Date: Wed, 8 Dec 2010 22:22:17 +0900
Message-ID: <20101208132217.GA28040@verge.net.au>
In-Reply-To: <20101106092535.GD5128@verge.net.au>
On Sat, Nov 06, 2010 at 06:25:37PM +0900, Simon Horman wrote:
> On Tue, Nov 02, 2010 at 10:29:45AM +0100, Eric Dumazet wrote:
> > On Tuesday, 02 November 2010 at 17:46 +0900, Simon Horman wrote:
> >
> > > Thanks Eric, that seems to resolve the problem that I was seeing.
> > >
> > > With your patch I see:
> > >
> > > No bonding
> > >
> > > # netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
> > > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
> > > Socket  Message  Elapsed      Messages                   CPU      Service
> > > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > >
> > > 116736    1472   30.00       2438413      0      957.2     8.52     1.458
> > > 129024           30.00       2438413             957.2    -1.00    -1.000
> > >
> > > With bonding (one slave, the interface used in the test above)
> > >
> > > netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
> > > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
> > > Socket  Message  Elapsed      Messages                   CPU      Service
> > > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > >
> > > 116736    1472   30.00       2438390      0      957.1     8.97     1.535
> > > 129024           30.00       2438390             957.1    -1.00    -1.000
> > >
> >
> >
> > Sure, the patch helps when not too many flows are involved, but this
> > is a hack.
> >
> > Say the device queue is 1000 packets and you run a workload with 2000
> > sockets; it won't work.
> >
> > Or the device queue is 1000 packets, there is one flow, and the socket
> > send queue size allows more than 1000 packets to be 'in flight'
> > (echo 2000000 > /proc/sys/net/core/wmem_default); it won't work with
> > bonding either, only with devices where a qdisc sits on the first
> > device after the socket.
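
To make the second scenario above concrete, here is a rough sketch (the
interface name and the 1000-packet queue length are assumptions, not
values taken from the tests in this thread):

# A typical device TX queue holds on the order of 1000 packets:
ip link show eth0 | grep qlen              # e.g. "qlen 1000"
# Let a single UDP socket queue ~2 MB of unsent data by default:
echo 2000000 > /proc/sys/net/core/wmem_default
# 2,000,000 bytes of send buffer is roughly 2000000 / 1472 =~ 1350
# datagrams' worth of payload (less in practice due to per-skb
# overhead), i.e. more than the device queue can hold.  Without a qdisc
# on the first device after the socket to hold and charge the excess,
# the surplus is dropped and the socket limit alone does not throttle
# the flow.
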
>
> True, thanks for pointing that out.
>
> The scenario that I am actually interested in is virtualisation.
> And I believe that your patch helps the vhostnet case (I don't see
> flow control problems with bonding + virtio without vhostnet). However,
> I am unsure whether there are similarly easy ways to defeat flow
> control in the vhostnet case.

Hi Eric,

do you have any thoughts on this?

I measured the performance impact of your patch on 2.6.37-rc1
and I can see why early orphaning is a throughput win.

The tests were run over a bond with three slaves in balance-rr mode;
a sketch of this setup follows the parameter list below. Other
parameters of interest are:

MTU=1500
client,server: tcp_reordering=3 (default)
client: GSO=off
client: TSO=off
server: GRO=off
server: rx-usecs=3 (default)
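
A sketch of how such a setup might be configured (interface names, the
use of ifenslave, and the miimon value are assumptions; the actual
commands were not recorded in this thread):

# on the client (bonded sender):
modprobe bonding mode=balance-rr miimon=100
ip link set bond0 mtu 1500 up
ifenslave bond0 eth0 eth1 eth2             # three slaves, round-robin
ethtool -K eth0 gso off tso off            # likewise for eth1 and eth2

# on the server (receiver):
ethtool -K eth0 gro off
ethtool -C eth0 rx-usecs 3                 # the default coalescing value
sysctl net.ipv4.tcp_reordering             # left at the default of 3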

Without your no-early-orphan patch:

TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1621.03   16.31    6.48     1.648   2.621

With your no-early-orphan patch:

# netperf -C -c -4 -t TCP_STREAM -H 172.17.60.216
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1433.48    9.60    5.45     1.098   2.490

However, in the case of virtualisation I think it is a win to be able to
do flow control on UDP traffic from guests (using virtio). Am I missing
something, and can flow control be bypassed anyway? If not, perhaps making
the change that your patch makes configurable through proc or ethtool
would be an option?
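
For what it is worth, a rough way to see whether send-side flow control
is working from a guest (the device name is an assumption, and this is
only a sketch, not something measured above):

# Compare the send-side and receive-side lines of the UDP_STREAM output:
# with working flow control they stay close; without it the sender can
# report far more than the receiver actually sees.
netperf -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
# TX drops on the bond or its slaves also suggest a sender that is not
# being throttled:
ip -s link show bond0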