netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Simon Horman <horms@verge.net.au>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev@vger.kernel.org, Jay Vosburgh <fubar@us.ibm.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: bonding: flow control regression [was Re: bridging: flow control regression]
Date: Tue, 2 Nov 2010 11:06:28 +0900	[thread overview]
Message-ID: <20101102020625.GA22724@verge.net.au> (raw)
In-Reply-To: <1288616372.2660.101.camel@edumazet-laptop>

On Mon, Nov 01, 2010 at 01:59:32PM +0100, Eric Dumazet wrote:
> Le lundi 01 novembre 2010 à 21:29 +0900, Simon Horman a écrit :
> > Hi,
> > 
> > I have observed what appears to be a regression between 2.6.34 and
> > 2.6.35-rc1. The behaviour described below is still present in Linus's
> > current tree (2.6.36+).
> > 
> > On 2.6.34 and earlier when sending a UDP stream to a bonded interface
> > the throughput is approximately equal to the available physical bandwidth.
> > 
> > # netperf -c -4 -t UDP_STREAM -H 172.17.50.253 -l 30 -- -m 1472
> > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> > 172.17.50.253 (172.17.50.253) port 0 AF_INET
> > Socket  Message  Elapsed      Messages                   CPU      Service
> > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > 
> > 114688    1472   30.00     2438265      0      957.1     18.09    3.159 
> > 109568           30.00     2389980             938.1     -1.00    -1.000
> > 
> > On 2.6.35-rc1 netpref sends~7Gbits/s.
> > Curiously it only consumes 50% CPU, I would expect this to be CPU bound.
> > 
> > # netperf -c -4 -t UDP_STREAM -H 172.17.50.253 -l 30 -- -m 1472
> > UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> > 172.17.50.253 (172.17.50.253) port 0 AF_INET
> > Socket  Message  Elapsed      Messages                   CPU      Service
> > Size    Size     Time         Okay Errors   Throughput   Util     Demand
> > bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> > 
> > 116736    1472   30.00     18064360      0     7090.8     50.62    8.665 
> > 109568           30.00     2438090             957.0     -1.00    -1.000
> > 
> > In this case the bonding device has a single gitabit slave device
> > and is running in balance-rr mode. I have observed similar results
> > with two and three slave devices.
> > 
> > I have bisected the problem and the offending commit appears to be
> > "net: Introduce skb_orphan_try()". My tired eyes tell me that change
> > frees skb's earlier than they otherwise would be unless tx timestamping
> > is in effect. That does seem to make sense in relation to this problem,
> > though I am yet to dig into specifically why bonding is adversely affected.
> > 
> 
> I assume you meant "bonding: flow control regression", ie this is not
> related to bridging ?

Yes, sorry about that. I meant bonding not bridging.

> One problem on bonding is that the xmit() method always returns
> NETDEV_TX_OK.
> 
> So a flooder cannot know some of its frames were lost.
> 
> So yes, the patch you mention has the effect of allowing UDP to flood
> bonding device, since we orphan skb before giving it to device (bond or
> ethX)
> 
> With a normal device (with a qdisc), we queue skb, and orphan it only
> when leaving queue. With a not too big socket send buffer, it slows down
> the sender enough to "send UDP frames at line rate only"

Thanks for the explanation.
I'm not entirely sure how much of a problem this is in practice.

  reply	other threads:[~2010-11-02  2:05 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-01 12:29 bridging: flow control regression Simon Horman
2010-11-01 12:59 ` Eric Dumazet
2010-11-02  2:06   ` Simon Horman [this message]
2010-11-02  4:53     ` bonding: flow control regression [was Re: bridging: flow control regression] Eric Dumazet
2010-11-02  7:03       ` Simon Horman
2010-11-02  7:30         ` Eric Dumazet
2010-11-02  8:46           ` Simon Horman
2010-11-02  9:29             ` Eric Dumazet
2010-11-06  9:25               ` Simon Horman
2010-12-08 13:22                 ` Simon Horman
2010-12-08 13:50                   ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101102020625.GA22724@verge.net.au \
    --to=horms@verge.net.au \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=fubar@us.ibm.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).