netdev.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@elte.hu>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	akpm@linux-foundation.org, tj@kernel.org, hpa@zytor.com,
	brgerst@gmail.com, ebiederm@xmission.com,
	cl@linux-foundation.org, travis@sgi.com,
	linux-kernel@vger.kernel.org, steiner@sgi.com, hugh@veritas.com,
	"David S. Miller" <davem@davemloft.net>,
	netdev@vger.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Subject: Re: [PATCH] percpu: add optimized generic percpu accessors
Date: Fri, 16 Jan 2009 23:08:32 +0100
Message-ID: <20090116220832.GB20653@elte.hu>
In-Reply-To: <200901170827.33729.rusty@rustcorp.com.au>


* Rusty Russell <rusty@rustcorp.com.au> wrote:

> On Friday 16 January 2009 10:48:24 Herbert Xu wrote:
> > On Fri, Jan 16, 2009 at 01:15:44AM +0100, Ingo Molnar wrote:
> > >
> > > > So if you could design the API such that we have a variant of add/inc 
> > > > that automatically disables/enables preemption then we can optimise that 
> > > > away on x86.
> > > 
> > > Yeah. percpu_add(var, 1) does exactly that on x86.
> 
> <sigh>.  No it doesn't.

What do you mean by "No it doesn't"? It does exactly what I claimed it 
does.
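
For readers following the thread, the semantics being argued about can be 
sketched roughly as below. This is a userspace model only, not the kernel 
implementation: NR_CPUS, current_cpu, preempt_count, snmp_stat and every 
model_* helper are hypothetical stand-ins introduced for illustration. It 
assumes a generic fallback that wraps the update in a preemption 
disable/enable pair, and an x86-style variant where the whole update is a 
single instruction and the pair can be dropped.

```c
/* Userspace model only: NR_CPUS, current_cpu, preempt_count, snmp_stat
 * and the model_* helpers are hypothetical, not kernel APIs. */
#include <assert.h>

#define NR_CPUS 4

static int current_cpu;          /* CPU the caller "runs on" in this model */
static int preempt_count;        /* > 0 means preemption is disabled */
static long snmp_stat[NR_CPUS];  /* one slot per CPU, like a static percpu var */

static void model_preempt_disable(void)
{
	preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Generic fallback: stay on one CPU for the whole read-modify-write. */
static void model_percpu_add_generic(long *var, long val)
{
	model_preempt_disable();
	assert(current_cpu >= 0 && current_cpu < NR_CPUS);
	var[current_cpu] += val;
	model_preempt_enable();
}

/* x86-style variant: a single segment-relative add cannot be preempted
 * halfway through, so the disable/enable pair is omitted. Modeled here
 * as a plain add on the current CPU's slot. */
static void model_percpu_add_single_insn(long *var, long val)
{
	assert(current_cpu >= 0 && current_cpu < NR_CPUS);
	var[current_cpu] += val;
}
```

Both helpers leave the counter in the same state; the only difference in 
this model is whether the preemption count is touched along the way.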

> It's really nice that everyone's excited about this, but it's more 
> complex than this.  Unf. I'm too busy preparing for linux.conf.au to 
> explain it all properly right now, but here's the highlights:
> 
> 1) This only works on static per-cpu vars.
>    - We are working on fixing this, but it's non-trivial for large allocs like
>      those in networking.  Small allocs, we have patches for.

How do difficulties of dynamic percpu-alloc make my above suggestion 
unsuitable for SNMP stats in networking? Most of those stats are not 
dynamically allocated - they are plain straightforward percpu variables.

Plus the majority of percpu usage is static - just like the majority of 
local variables are static, not dynamic. So any percpu-alloc complication 
is a non-issue.

> 2) The generic versions of these as posted by Tejun are unsuitable for
>    networking; they need to bh_disable.  That would make networking less
>    efficient than it is now for non-x86, and to be generic it would have
>    to be local_irq_save/restore anyway.

The generic versions will not be used on 95%+ of the active Linux systems 
out there, as they run on x86. If you worry about the remaining 5%, those 
can be optimized too.

> 3) local_t was designed to do exactly this: a fast cpu-local counter
>    implemented optimally for each arch.  For sparc64, doing a trivalue version
>    seems optimal, for s390 atomics, for x86 single-insn, for powerpc
>    irq_save/restore, etc.

But local_t does not actually solve this problem at all - because one 
still has to have per-cpu-ness.

> 4) Unfortunately, local_t has been extended beyond a simple counter, meaning
>    it now has more complex requirements (eg. Mathieu wants nmi-safe, even
>    though that's impossible on sparc and parisc, and percpu_counter wants
>    local_add_return, which makes trival less desirable).  These discussions
>    are on the back burner at the moment, but ongoing.

In reality local_t has almost zero users in the kernel - despite being 
with us at least since v2.6.12. That pretty much tells us all about its 
utility.

The thing is, local_t without proper percpu integration is a toothless 
tiger in the jungle. And our APIs do exactly that kind of integration and 
I expect them to be more popular than local_t. There are already a dozen 
usage sites of them in arch/x86.

	Ingo
