From: Andrew Morton <akpm@linux-foundation.org>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>,
Michael Chan <mchan@broadcom.com>,
Eilon Greenstein <eilong@broadcom.com>,
Christoph Hellwig <hch@lst.de>,
Christoph Lameter <cl@linux-foundation.org>
Subject: Re: [PATCH net-next] net: allocate skbs on local node
Date: Tue, 12 Oct 2010 00:24:35 -0700
Message-ID: <20101012002435.f51f2c0e.akpm@linux-foundation.org>
In-Reply-To: <1286866699.30423.234.camel@edumazet-laptop>

On Tue, 12 Oct 2010 08:58:19 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Monday, 11 October 2010 at 23:03 -0700, Andrew Morton wrote:
> > On Tue, 12 Oct 2010 07:05:25 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> > > [PATCH net-next] net: allocate skbs on local node
> > >
> > > commit b30973f877 (node-aware skb allocation) spread a wrong habit of
> > > allocating net drivers' skbs on a given memory node: the one closest to
> > > the NIC hardware. This is wrong because as soon as we try to scale the
> > > network stack, we need to use many cpus to handle traffic, and we hit
> > > slub/slab management costs on cross-node allocations/frees when these
> > > cpus have to alloc/free skbs bound to a central node.
> > >
> > > skbs allocated in the RX path are ephemeral; they have a very short
> > > lifetime, and the extra cost of maintaining NUMA affinity is too high.
> > > What appeared to be a nice idea four years ago is in fact a bad one.
> > >
> > > In 2010, NIC hardware is multiqueue, or we use RPS to spread the load,
> > > and two 10Gb NICs might deliver more than 28 million packets per second,
> > > needing all the available cpus.
> > >
> > > The cost of cross-node handling in the network and vm stacks outweighs
> > > the small benefit the hardware had when doing its DMA transfer to its
> > > 'local' memory node at RX time. Even trying to differentiate the two
> > > allocations done for one skb (the sk_buff on the local node, the data
> > > part on the NIC hardware node) is not enough to bring good performance.
> > >
> >
> > This is all conspicuously hand-wavy and unquantified. (IOW: prove it!)
> >
>
> I would say _you_ should prove that the original patch was good. It seems
> no network guy was really in the discussion?

Two wrongs and all that. The 2006 patch has nothing to do with it,
apart from demonstrating the importance of including performance
measurements in a performance patch.

> Just run a test on a bnx2x or ixgbe multiqueue 10Gb adapter, and see the
> difference. That's about a 40% slowdown at high packet rates, on a dual
> socket machine (dual X5570 @2.93GHz). You can expect higher values on
> four nodes (I don't have such hardware to do the test).

Like that. Please flesh it out and stick it in the changelog.
>
> > The mooted effects should be tested for on both slab and slub, I
> > suggest. They're pretty different beasts.
>
> SLAB is so slow on NUMA these days, you can forget it for good.

I'd love to forget it, but it's faster for some things (I forget
which). Which is why it's still around.

And the ghastly thing about this is that you're forced to care about it
too because some people are, apparently, still using it.