From: Simon Horman <horms@verge.net.au>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev@vger.kernel.org, Jay Vosburgh <fubar@us.ibm.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: bonding: flow control regression [was Re: bridging: flow control regression]
Date: Wed, 8 Dec 2010 22:22:17 +0900
Message-ID: <20101208132217.GA28040@verge.net.au>
In-Reply-To: <20101106092535.GD5128@verge.net.au>

On Sat, Nov 06, 2010 at 06:25:37PM +0900, Simon Horman wrote:
> On Tue, Nov 02, 2010 at 10:29:45AM +0100, Eric Dumazet wrote:
> > On Tuesday, 02 November 2010 at 17:46 +0900, Simon Horman wrote:
> > 
> > > Thanks Eric, that seems to resolve the problem that I was seeing.
> > > 
> > > With your patch I see:
> > > 
> > > No bonding
> > > 
> > > # netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
> > > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
> > > Socket  Message  Elapsed      Messages                   CPU      Service
> > > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > > 
> > > 116736    1472   30.00     2438413      0      957.2     8.52     1.458 
> > > 129024           30.00     2438413             957.2     -1.00    -1.000
> > > 
> > > With bonding (one slave, the interface used in the test above)
> > > 
> > > netperf -c -4 -t UDP_STREAM -H 172.17.60.216 -l 30 -- -m 1472
> > > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
> > > Socket  Message  Elapsed      Messages                   CPU      Service
> > > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > > 
> > > 116736    1472   30.00     2438390      0      957.1     8.97     1.535 
> > > 129024           30.00     2438390             957.1     -1.00    -1.000
> > > 
> > 
> > 
> > Sure the patch helps when not too many flows are involved, but this is a
> > hack.
> > 
> > Say the device queue is 1000 packets, and you run a workload with 2000
> > sockets, it won't work...
> > 
> > Or device queue is 1000 packets, one flow, and socket send queue size
> > allows for more than 1000 packets to be 'in flight'
> > (echo 2000000 > /proc/sys/net/core/wmem_default), it won't work either
> > with bonding, only with devices with a qdisc sitting in the first device
> > met after the socket.
> 
> True, thanks for pointing that out.
> 
> The scenario that I am actually interested in is virtualisation.
> And I believe that your patch helps the vhostnet case (I don't see
> flow control problems with bonding + virtio without vhostnet). However,
> I am unsure if there are also some easy work-arounds to degrade
> flow control in the vhostnet case too.

Hi Eric,

Do you have any thoughts on this?

I measured the performance impact of your patch on 2.6.37-rc1
and I can see why early orphaning is a win.

The tests are run over a bond with 3 slaves.
The bond is in balance-rr mode. Other parameters of interest are:
	MTU=1500
	client,server:tcp_reordering=3(default)
	client:GSO=off
	client:TSO=off
	server:GRO=off
	server:rx-usecs=3(default)

Without your no early-orphan patch
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
	172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1621.03   16.31    6.48     1.648   2.621

With your no early-orphan patch
# netperf -C -c -4 -t TCP_STREAM -H 172.17.60.216
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
	172.17.60.216 (172.17.60.216) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      1433.48   9.60     5.45     1.098   2.490
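
For reference, the two TCP_STREAM runs above can be compared directly. The
following is a minimal sketch in Python; the figures are copied from the
netperf output above, while the dictionary keys and the helper name
percent_change are made up for this illustration.

```python
# Figures copied from the two TCP_STREAM runs above.
# "early_orphan"    = without the no-early-orphan patch
# "no_early_orphan" = with the no-early-orphan patch
runs = {
    "early_orphan":    {"mbit_s": 1621.03, "send_cpu_pct": 16.31, "send_us_per_kb": 1.648},
    "no_early_orphan": {"mbit_s": 1433.48, "send_cpu_pct": 9.60,  "send_us_per_kb": 1.098},
}


def percent_change(before, after):
    """Relative change from before to after, in percent (hypothetical helper)."""
    return (after - before) / before * 100.0


# Change in each metric when going from early orphaning to no early orphaning.
changes = {
    key: percent_change(runs["early_orphan"][key], runs["no_early_orphan"][key])
    for key in runs["early_orphan"]
}

for key, value in changes.items():
    print(f"{key}: {value:+.1f}%")
```

With these inputs the throughput is about 11.6% lower with the patch applied,
while local send CPU utilization is about 41% lower and local service demand
about 33% lower. This is a single 10 second run per configuration, so the
percentages are indicative only.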


However, in the case of virtualisation I think it is a win to be able to do
flow control on UDP traffic from guests (using virtio). Am I missing
something, and can flow control be bypassed anyway? If not, perhaps making
the change that your patch makes configurable through proc or ethtool is an
option?


Thread overview: 11+ messages
2010-11-01 12:29 bridging: flow control regression Simon Horman
2010-11-01 12:59 ` Eric Dumazet
2010-11-02  2:06   ` bonding: flow control regression [was Re: bridging: flow control regression] Simon Horman
2010-11-02  4:53     ` Eric Dumazet
2010-11-02  7:03       ` Simon Horman
2010-11-02  7:30         ` Eric Dumazet
2010-11-02  8:46           ` Simon Horman
2010-11-02  9:29             ` Eric Dumazet
2010-11-06  9:25               ` Simon Horman
2010-12-08 13:22                 ` Simon Horman [this message]
2010-12-08 13:50                   ` Eric Dumazet
