public inbox for netdev@vger.kernel.org
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Simon Horman <horms@verge.net.au>
Cc: netdev@vger.kernel.org, Jay Vosburgh <fubar@us.ibm.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: bridging: flow control regression
Date: Mon, 01 Nov 2010 13:59:32 +0100
Message-ID: <1288616372.2660.101.camel@edumazet-laptop>
In-Reply-To: <20101101122920.GB10052@verge.net.au>

On Monday, 01 November 2010 at 21:29 +0900, Simon Horman wrote:
> Hi,
> 
> I have observed what appears to be a regression between 2.6.34 and
> 2.6.35-rc1. The behaviour described below is still present in Linus's
> current tree (2.6.36+).
> 
> On 2.6.34 and earlier when sending a UDP stream to a bonded interface
> the throughput is approximately equal to the available physical bandwidth.
> 
> # netperf -c -4 -t UDP_STREAM -H 172.17.50.253 -l 30 -- -m 1472
> UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 172.17.50.253 (172.17.50.253) port 0 AF_INET
> Socket  Message  Elapsed      Messages                   CPU      Service
> Size    Size     Time         Okay Errors   Throughput   Util     Demand
> bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> 
> 114688    1472   30.00     2438265      0      957.1     18.09    3.159 
> 109568           30.00     2389980             938.1     -1.00    -1.000
> 
> On 2.6.35-rc1 netperf sends ~7Gbits/s.
> Curiously it only consumes 50% CPU; I would expect this to be CPU bound.
> 
> # netperf -c -4 -t UDP_STREAM -H 172.17.50.253 -l 30 -- -m 1472
> UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 172.17.50.253 (172.17.50.253) port 0 AF_INET
> Socket  Message  Elapsed      Messages                   CPU      Service
> Size    Size     Time         Okay Errors   Throughput   Util     Demand
> bytes   bytes    secs            #      #   10^6bits/sec % SU     us/KB
> 
> 116736    1472   30.00     18064360      0     7090.8     50.62    8.665 
> 109568           30.00     2438090             957.0     -1.00    -1.000
> 
> In this case the bonding device has a single gigabit slave device
> and is running in balance-rr mode. I have observed similar results
> with two and three slave devices.
> 
> I have bisected the problem and the offending commit appears to be
> "net: Introduce skb_orphan_try()". My tired eyes tell me that change
> frees skb's earlier than they otherwise would be unless tx timestamping
> is in effect. That does seem to make sense in relation to this problem,
> though I am yet to dig into specifically why bonding is adversely affected.
> 

I assume you meant "bonding: flow control regression", i.e. this is not
related to bridging?

One problem with bonding is that its xmit() method always returns
NETDEV_TX_OK.

So a flooder cannot know that some of its frames were lost.

So yes, the patch you mention has the effect of allowing UDP to flood
the bonding device, since we orphan the skb before giving it to the device
(bond or ethX).

With a normal device (with a qdisc), we queue the skb and orphan it only
when it leaves the queue. Until then the skb stays charged to the sending
socket, so with a not too big socket send buffer this slows down the
sender enough to "send UDP frames at line rate only".
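
To make the difference concrete, here is a rough user-space toy model. It
is not kernel code, and every name in it (simulate, orphan_mode,
sim_result, the per-tick rates) is made up for illustration. It only
assumes what is described above: a frame charged to the socket blocks the
sender once the send buffer is full, and a frame orphaned before xmit never
does.

```c
#include <assert.h>

/* Toy user-space model, not kernel code. All names are hypothetical. */
enum orphan_mode {
    ORPHAN_AT_DEQUEUE,   /* frame stays charged to the socket while queued */
    ORPHAN_BEFORE_XMIT   /* charge is released before the device sees it */
};

struct sim_result {
    long accepted;   /* frames the sender was allowed to submit */
    long delivered;  /* frames that actually left on the wire */
    long dropped;    /* frames lost because the transmit queue was full */
};

struct sim_result simulate(enum orphan_mode mode, int ticks,
                           int offered_per_tick, int line_per_tick,
                           int sndbuf_frames, int txq_frames)
{
    struct sim_result r = {0, 0, 0};
    long charged = 0;  /* frames currently charged to the send buffer */
    long queued = 0;   /* frames waiting in the transmit queue */

    assert(ticks >= 0 && offered_per_tick >= 0 && line_per_tick >= 0);
    assert(sndbuf_frames > 0 && txq_frames > 0);

    for (int t = 0; t < ticks; t++) {
        /* The wire drains the queue at line rate. */
        long n = queued < line_per_tick ? queued : line_per_tick;
        queued -= n;
        r.delivered += n;
        if (mode == ORPHAN_AT_DEQUEUE)
            charged -= n;  /* orphaned when leaving the queue */

        for (int i = 0; i < offered_per_tick; i++) {
            if (mode == ORPHAN_AT_DEQUEUE) {
                if (charged >= sndbuf_frames)
                    break;  /* send buffer full: the sender has to wait */
                charged++;
            }
            r.accepted++;
            if (queued < txq_frames) {
                queued++;
            } else {
                /* Queue full: the frame is lost. In the
                 * ORPHAN_BEFORE_XMIT case the sender is never told. */
                r.dropped++;
                if (mode == ORPHAN_AT_DEQUEUE)
                    charged--;
            }
        }
    }
    return r;
}
```

Calling simulate(ORPHAN_AT_DEQUEUE, 1000, 8, 1, 64, 1000) keeps the number
of accepted frames within one send buffer of what the wire carried, with no
drops. The same call with ORPHAN_BEFORE_XMIT accepts every offered frame
and loses most of them, which matches the shape of the netperf numbers
quoted above (about 7 Gbit/s sent, 957 Mbit/s received).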



