linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andi Kleen <andi@firstfloor.org>
To: Tejun Heo <tj@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>,
	Christoph Lameter <cl@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	linux-arch@vger.kernel.org, hpa@zytor.com, tglx@linutronix.de
Subject: Re: [PATCHSET] percpu: generalize first chunk allocators and improve lpage NUMA support
Date: Wed, 1 Jul 2009 14:23:12 +0200	[thread overview]
Message-ID: <20090701122312.GP6760@one.firstfloor.org> (raw)
In-Reply-To: <4A4B38C5.1070504@kernel.org>

On Wed, Jul 01, 2009 at 07:21:57PM +0900, Tejun Heo wrote:
> > using possible per cpu data I picked in current git: icmp.c
> 
> I was talking about percpu allocator proper.  Yeap, the major work
> would be in auditing and converting for_each_possible_cpu() users.

and testing. that's the hard part. cpu hotplug is normally not well
tested. Any code change that requires lots of new code for 
it will be a problem because that code will then likely bitrot.

> >         kfree(net->ipv4.icmp_sk);
> >         net->ipv4.icmp_sk = NULL;
> > }
> > 
> > You would need to convert that to use a CPU notifier and callbacks
> > setting up the sockets. Then make sure there are no races in all of
> > this. And get it somehow tested (where is the user base who
> > tests cpu hotplug?) 
> 
> Maybe it would be better to allocate percpu sockets as proper percpu
> variables.  

That would be like percpu dentries.

A socket is a very complex structure. Trying to pull it out 
of its normal allocators would be a bad idea.

I actually had a patch some time ago to move this one over
to callbacks, but it was already difficult, and I dropped
it.

> Initialization would still need callback mechanism tho.  I
> was thinking about adding @init callback to percpu_alloc(), which
> would be much simpler than doing full cpu hotplug callback.

I'm not sure it will be actually simpler. It's not just
initializing the structure, but all the other set up to make
it known to the world.

> 
> > And there is lots of similar code all over the tree
> 
> For static percpu variables, it'll be mostly about converting
> for_each_possible_cpu() to for_each_used_cpu() as both allocation and
> initialization can be handled by percpu proper.  For dynamic areas,

That's the easy part, but you would still need all the callbacks 
for the extension case.

> allocation can be handled by percpu proper but cpus coming online
> would need more work to convert.  It'll take some effort but there
> aren't too many alloc_percpu() users yet and I don't think it will be

alloc_percpu()? It affects all DEFINE_PER_CPU users. My current
tree has hundreds.

> too difficult.  I wouldn't know for sure before I actually try tho.

I think it's clear that you haven't tried yet :)

I wrote quite a few per cpu callback handlers over the years and
in my experience they are all nasty code with subtle races. The problem
is that instead of having a single subset init function which
is just single threaded and doesn't need to worry about races
you now have multi threaded init, which tends to be a can of worms.

I think a far saner strategy than rewriting every user of DEFINE_PER_CPU,
ending up with lots of badly tested code is to:

- Fix the few large size percpu pigs that are problematic today to
allocate in a callback.
- Then once the per cpu data in all configurations is <200k (better
<100 in the non debug builds) again just keep pre-allocating like we always did
- Possibly adjust the vmalloc area on 32bit based on the possible
CPU map at the cost of the direct mapping, to make sure there's always enough 
mapping space.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

  reply	other threads:[~2009-07-01 12:23 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-06-24 13:30 [PATCHSET] percpu: generalize first chunk allocators and improve lpage NUMA support Tejun Heo
2009-06-24 13:30 ` Tejun Heo
2009-06-24 13:30 ` [PATCH 01/10] x86: make pcpu_chunk_addr_search() matching stricter Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 13:30 ` [PATCH 02/10] percpu: drop @unit_size from embed first chunk allocator Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 13:30 ` [PATCH 03/10] x86,percpu: generalize 4k " Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 13:30 ` [PATCH 04/10] percpu: make 4k first chunk allocator map memory Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 13:30 ` [PATCH 05/10] x86,percpu: generalize lpage first chunk allocator Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 13:30 ` [PATCH 06/10] percpu: simplify pcpu_setup_first_chunk() Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 13:30 ` [PATCH 07/10] percpu: reorder a few functions in mm/percpu.c Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 13:30 ` [PATCH 08/10] percpu: drop pcpu_chunk->page[] Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 13:30 ` [PATCH 09/10] percpu: allow non-linear / sparse cpu -> unit mapping Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 13:30 ` [PATCH 10/10] percpu: teach large page allocator about NUMA Tejun Heo
2009-06-24 13:30   ` Tejun Heo
2009-06-24 23:55 ` [PATCHSET] percpu: generalize first chunk allocators and improve lpage NUMA support Andrew Morton
2009-06-25  0:02   ` Andi Kleen
2009-06-25  0:13     ` H. Peter Anvin
2009-06-25  9:19       ` Andi Kleen
2009-06-25 14:18         ` H. Peter Anvin
2009-06-25 19:54           ` Andi Kleen
2009-06-25 20:15             ` H. Peter Anvin
2009-06-25 20:26               ` Andi Kleen
2009-06-26  0:40                 ` Tejun Heo
2009-06-26  2:02                   ` H. Peter Anvin
2009-06-26  6:54                   ` Andi Kleen
2009-06-25  2:35   ` Tejun Heo
2009-06-25  9:20     ` Ingo Molnar
2009-06-29 23:20   ` Christoph Lameter
2009-06-29 23:39     ` Andrew Morton
2009-06-30 14:24       ` Christoph Lameter
2009-06-30 19:15       ` Ingo Molnar
2009-06-30 19:39         ` Christoph Lameter
2009-06-30 20:21           ` Scott Lurndal
2009-06-30 21:31           ` Ingo Molnar
2009-06-30 22:16             ` Christoph Lameter
2009-06-30 22:31               ` Ingo Molnar
2009-06-30 22:40                 ` Andi Kleen
2009-07-01  0:48                   ` Tejun Heo
2009-06-30 22:55                 ` Christoph Lameter
2009-06-30 22:55                   ` Christoph Lameter
2009-06-30 23:07                   ` Ingo Molnar
2009-06-30 23:18                     ` Christoph Lameter
2009-06-30 23:30                       ` Ingo Molnar
2009-07-01  6:34                     ` Andi Kleen
2009-06-30 23:20               ` Tejun Heo
2009-06-30 23:31                 ` Ingo Molnar
2009-06-30 23:34                   ` H. Peter Anvin
2009-07-01  6:42                 ` Andi Kleen
2009-07-01 10:21                   ` Tejun Heo
2009-07-01 12:23                     ` Andi Kleen [this message]
2009-07-01 12:53                       ` Tejun Heo
2009-07-01 13:11                         ` Andi Kleen
2009-07-01 17:33                           ` Christoph Lameter
2009-07-01 22:42                             ` Tejun Heo
2009-07-03 23:14 ` Tejun Heo
2009-07-03 23:14   ` Tejun Heo
2009-07-13 10:12 ` [PATCH 04/10] percpu: make 4k first chunk allocator map memory David Howells
2009-07-15  3:17   ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090701122312.GP6760@one.firstfloor.org \
    --to=andi@firstfloor.org \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux-foundation.org \
    --cc=hpa@zytor.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).