Netdev List
From: Eric Dumazet <eric.dumazet@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: jhautbois@gmail.com, netdev@vger.kernel.org
Subject: Re: Regression on TX throughput when using bonding
Date: Thu, 14 Jun 2012 12:07:51 +0200
Message-ID: <1339668471.22704.714.camel@edumazet-glaptop>
In-Reply-To: <20120614.030021.2291563831943273331.davem@davemloft.net>

On Thu, 2012-06-14 at 03:00 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 14 Jun 2012 11:50:17 +0200
> 
> > On Thu, 2012-06-14 at 11:22 +0200, Eric Dumazet wrote:
> > 
> >> So you are saying that if you make skb_orphan_try() doing nothing, it
> >> solves your problem ?
> > 
> > It probably does, if your application does a UDP flood, trying to send
> > more than the link bandwidth. I guess only benchmark workloads ever try
> > to do that.
> 
> Eric, I just want to point out that back when this early orphaning
> idea was being proposed I warned about this, and specifically I
> mentioned that, for datagram sockets, the socket send buffer limits
> are what provide proper rate control and fairness.

If I remember correctly, the argument was that if a workload was using
thousands of sockets, the per-socket limit on in-flight packets would not
save you anyway. We would drop packets.
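
To make the two positions concrete, here is a toy model of send-buffer
accounting, offered as a rough sketch rather than a description of the
kernel code. Everything in it is an assumption made for illustration: the
function simulate(), its parameters, counting in packets instead of bytes,
and the fixed order in which sockets get to send each round.

```python
from collections import deque

def simulate(attempts_per_round, sndbuf_pkts, txq_len, early_orphan,
             rounds=50, drain_per_round=4):
    """Toy model (hypothetical, not kernel code).

    attempts_per_round: one entry per socket, packets it tries to send
    each round. Returns (delivered, dropped), two lists indexed by socket.
    """
    n = len(attempts_per_round)
    charged = [0] * n    # packets currently charged to each send buffer
    delivered = [0] * n
    dropped = [0] * n
    txq = deque()        # device queue, holds the owning socket index

    for _round in range(rounds):
        for sock, attempts in enumerate(attempts_per_round):
            for _attempt in range(attempts):
                if charged[sock] >= sndbuf_pkts:
                    break                 # send buffer full: sender blocks
                if len(txq) >= txq_len:
                    dropped[sock] += 1    # device queue overflow
                    continue
                txq.append(sock)
                if not early_orphan:
                    charged[sock] += 1    # charged until the device frees it
                # with early orphaning the charge is released at once,
                # so the socket never sees any backpressure
        for _slot in range(min(drain_per_round, len(txq))):
            sock = txq.popleft()          # device transmits one packet
            delivered[sock] += 1
            if not early_orphan:
                charged[sock] -= 1
    return delivered, dropped

# One spammer (socket 0) and one polite sender (socket 1).
late = simulate([100, 1], sndbuf_pkts=8, txq_len=16, early_orphan=False)
early = simulate([100, 1], sndbuf_pkts=8, txq_len=16, early_orphan=True)
# A thousand sockets, accounting left intact.
many = simulate([1] * 1000, sndbuf_pkts=8, txq_len=16, early_orphan=False)
```

In this model, with the charge held until the device frees the packet, the
spammer blocks on its own send buffer, nothing is dropped, and the polite
socket still gets packets out. With early orphaning the spammer fills the
queue every round and the polite socket is starved. With a thousand
sockets the sum of the per-socket limits exceeds the device queue, so
packets are dropped even with the accounting intact.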


> It also, therefore, protects the system from one datagram spammer
> being able to essentially take over the network interface and block
> out all other users.
> 
> Early orphaning breaks this completely.
> 
> I guess we decided that moving an atomic operation earlier is worth
> all of this?

It was, but with BQL, we should have far fewer packets in TX rings, so it
might be different today (on BQL-enabled NICs only).

> 
> Now we are so addicted to the increased performance from early
> orphaning that I fear we'll never be allowed back into that sane
> state of affairs ever again.


bonding (or any other virtual device) is special in the sense that
dev_hard_start_xmit() is called twice: once for the virtual device and
once for the real device underneath it.

We should have a way to properly park packets in Qdiscs, and only do the
orphaning once the skb is given to the real device for 'immediate or so'
transmission.
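
A minimal sketch of that idea, as a toy Python model and not kernel code.
The classes Socket, Packet and Device, the functions hard_start_xmit() and
flush(), and the orphan_on_virtual switch are all hypothetical names
invented for this illustration.

```python
class Socket:
    def __init__(self):
        self.wmem = 0                # bytes charged to the send buffer

class Packet:
    def __init__(self, sk, size):
        self.sk, self.size = sk, size
        sk.wmem += size              # charge the owning socket

    def orphan(self):
        # Release the charge; safe to call more than once.
        if self.sk is not None:
            self.sk.wmem -= self.size
            self.sk = None

class Device:
    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower           # real device under a virtual one
        self.qdisc = []              # packets parked for transmission

def hard_start_xmit(dev, pkt, orphan_on_virtual):
    """Toy stand-in for the transmit path (hypothetical)."""
    if dev.lower is not None:        # first call: virtual device (bond)
        if orphan_on_virtual:
            pkt.orphan()             # charge is lost while the packet waits
        dev.lower.qdisc.append(pkt)  # park it in the real device's qdisc
        return
    pkt.orphan()                     # second call: real device, about to send

def flush(dev, orphan_on_virtual=False):
    # Hand every parked packet to the real device.
    while dev.qdisc:
        hard_start_xmit(dev, dev.qdisc.pop(0), orphan_on_virtual)

sk = Socket()
eth0 = Device("eth0")
bond0 = Device("bond0", lower=eth0)
hard_start_xmit(bond0, Packet(sk, 1500), orphan_on_virtual=False)
parked_charge = sk.wmem              # still charged while parked
flush(eth0)
final_charge = sk.wmem               # released once the real device has it
```

With orphan_on_virtual=True the socket is uncharged as soon as the bond
sees the packet, although the packet is still sitting in the qdisc of
eth0. Deferring the orphan to the second call keeps the sender throttled
for as long as the packet is parked.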

The pppoe thread is only another manifestation of the same problem.

Thread overview: 15+ messages
2012-06-14  8:58 Regression on TX throughput when using bonding Jean-Michel Hautbois
2012-06-14  9:21 ` Eric Dumazet
2012-06-14  9:40   ` Jean-Michel Hautbois
2012-06-14  9:50   ` Eric Dumazet
2012-06-14 10:00     ` David Miller
2012-06-14 10:07       ` Eric Dumazet [this message]
2012-06-14 10:31         ` David Miller
2012-06-14 16:42           ` [PATCH] net: remove skb_orphan_try() Eric Dumazet
2012-06-15  7:15             ` Oliver Hartkopp
2012-06-15 22:31             ` David Miller
2012-06-14 10:15     ` Regression on TX throughput when using bonding Jean-Michel Hautbois
2012-06-14 14:14       ` Jean-Michel Hautbois
2012-06-14 14:29         ` Eric Dumazet
2012-06-14 15:43           ` Jean-Michel Hautbois
2012-06-14 17:46             ` Rick Jones
