From: Ingo Molnar <mingo@elte.hu>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
akpm@linux-foundation.org, tj@kernel.org, hpa@zytor.com,
brgerst@gmail.com, ebiederm@xmission.com,
cl@linux-foundation.org, travis@sgi.com,
linux-kernel@vger.kernel.org, steiner@sgi.com, hugh@veritas.com,
"David S. Miller" <davem@davemloft.net>,
netdev@vger.kernel.org,
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Subject: Re: [PATCH] percpu: add optimized generic percpu accessors
Date: Fri, 16 Jan 2009 23:08:32 +0100
Message-ID: <20090116220832.GB20653@elte.hu>
In-Reply-To: <200901170827.33729.rusty@rustcorp.com.au>
* Rusty Russell <rusty@rustcorp.com.au> wrote:
> On Friday 16 January 2009 10:48:24 Herbert Xu wrote:
> > On Fri, Jan 16, 2009 at 01:15:44AM +0100, Ingo Molnar wrote:
> > >
> > > > So if you could design the API such that we have a variant of add/inc
> > > > that automatically disables/enables preemption then we can optimise that
> > > > away on x86.
> > >
> > > Yeah. percpu_add(var, 1) does exactly that on x86.
>
> <sigh>. No it doesn't.
What do you mean by "No it doesn't"? It does exactly what i claimed it
does.
> It's really nice that everyone's excited about this, but it's more
> complex than this. Unf. I'm too busy preparing for linux.conf.au to
> explain it all properly right now, but here's the highlights:
>
> 1) This only works on static per-cpu vars.
> - We are working on fixing this, but it's non-trivial for large allocs like
> those in networking. Small allocs, we have patches for.
How do difficulties of dynamic percpu-alloc make my above suggestion
unsuitable for SNMP stats in networking? Most of those stats are not
dynamically allocated - they are plain straightforward percpu variables.
Plus the majority of percpu usage is static - just like the majority of
local variables is static, not dynamic. So any percpu-alloc complication
is a non-issue.
> 2) The generic versions of these as posted by Tejun are unsuitable for
> networking; they need to bh_disable. That would make networking less
> efficient than it is now for non-x86, and to be generic it would have
> to be local_irq_save/restore anyway.
The generic versions will not be used on 95%+ of the active Linux systems
out there, as they run on x86. If you worry about the remaining 5%, those
can be optimized too.
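
To make the two shapes concrete, here is a minimal userspace sketch of
what a generic (non-x86) percpu_add() has to do. It is an illustration
only, not the code from Tejun's patch: sketch_percpu_add(),
sketch_preempt_disable()/sketch_preempt_enable(), current_cpu,
preempt_depth and pkt_count are all hypothetical stand-ins, and the
per-cpu area is modelled as a plain array. On x86 the
pin/locate/update/unpin sequence collapses into a single
segment-prefixed add instruction, which is why no explicit preemption
bracket is needed there.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for kernel state; not real kernel symbols. */
static int current_cpu;          /* CPU the simulated task runs on */
static int preempt_depth;        /* > 0 means the task cannot migrate */
static long pkt_count[NR_CPUS];  /* one counter slot per CPU */

static void sketch_preempt_disable(void)
{
	preempt_depth++;
}

static void sketch_preempt_enable(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

/*
 * Generic shape: pin the task, locate this CPU's slot, do the
 * read-modify-write, then unpin. Without the bracket the task could
 * migrate between computing the slot address and updating it.
 */
static void sketch_percpu_add(long *slots, long val)
{
	sketch_preempt_disable();
	assert(current_cpu >= 0 && current_cpu < NR_CPUS);
	slots[current_cpu] += val;
	sketch_preempt_enable();
}

/* Readers fold all per-CPU slots into one total. */
static long sketch_percpu_sum(const long *slots)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += slots[cpu];
	return sum;
}
```

The sketch only models preemption; as noted in the quoted point 2), a
networking user of the generic path would additionally have to care
about bottom halves or interrupts touching the same counter.
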
> 3) local_t was designed to do exactly this: a fast cpu-local counter
> implemented optimally for each arch. For sparc64, doing a trivalue version
> seems optimal, for s390 atomics, for x86 single-insn, for powerpc
> irq_save/restore, etc.
But local_t does not actually solve this problem at all - because one
still has to have per-cpu-ness.
> 4) Unfortunately, local_t has been extended beyond a simple counter, meaning
> it now has more complex requirements (eg. Mathieu wants nmi-safe, even
> though that's impossible on sparc and parisc, and percpu_counter wants
> local_add_return, which makes trival less desirable). These discussions
> are on the back burner at the moment, but ongoing.
In reality local_t has almost zero users in the kernel - despite being
with us at least since v2.6.12. That pretty much tells us all about its
utility.
The thing is, local_t without proper percpu integration is a toothless
tiger in the jungle. And our APIs do exactly that kind of integration and
i expect them to be more popular than local_t. There's already a dozen
usage sites of it in arch/x86.
Ingo