From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
akpm@linux-foundation.org, cl@linux-foundation.org,
kamezawa.hiroyu@jp.fujitsu.com, lizf@cn.fujitsu.com,
mingo@elte.hu, yinghai@kernel.org
Subject: Re: [GIT PULL v2] Early SLAB fixes for 2.6.31
Date: Mon, 15 Jun 2009 20:39:48 +1000 [thread overview]
Message-ID: <1245062388.12400.17.camel@pasglop> (raw)
In-Reply-To: <20090615101254.GB10294@wotan.suse.de>
> Why? The best reason to use slab allocator is that the allocations
> are much more efficient and also can be freed later.
I think you are making the mistake of reasoning too much in term of
implementation of the allocator itself, and not enough in term of the
consistency of the API exposed to the rest of the kernel.
I think the current approach is a good compromise. If you can make it
more optimal (by pushing the masking in slow path for example), then go
for it, but from an API standpoint, I don't like having anybody who
needs to allocate memory have to know about seemingly unrelated things
such as whether interrupts have been enabled globally yet, scheduler
have been initialized, or whatever else we might stumble upon.
> > I think the boot order is too likely to change to make it a sane thing
> > to have all call sites "know" at what point they are in the boot
> > process.
>
> I disagree.
How so ? IE. We are changing things in the boot order today, and I'm
expecting things to continue to move in that area. I believe it's going
to be endless headaches and breakage if we have to get the right set of
flags on every caller.
In addition, there's the constant issue of code that can be called both
at boot and non-boot time and shouldn't have to know where it has been
called from, while wanting to make allocations, such as get_vm_area().
I don't think it will make anybody's life better to push out the "boot
state" up into those APIs, duplicating them, etc...
> > In your example, what does GFP_BOOT would mean ? Before
> > scheduler is initialized ? before interrupts are on ?
>
> Before initcalls is probably easiest. But it really does not
> matter that much. Why? Because if we run out of memory before
> then, then there is not going to be anything to reclaim
> anyway.
Precisely. It -doesn't matter- (to the caller). So why make it matter in
term of API ? There's a whole bunch of things in arch code or subsystems
that really don't have any business knowing in what context or at what
time they have been called.
> > There's just too much stuff involved and we don't want random
> > allocations in various subsystem or arch code to be done with that
> > special knowledge of where specifically in that process they are done.
>
> If they're done that early, of course they have to know where
> they are because they only get to use a subset of kernel
> services depending exactly on what has already been done.
To a certain extent, yes. But not -that- much, expecially when it comes
to a very basic service such as allocating memory.
> > Especially since it may change.
>
> "it" meaning the ability to reclaim memory? Not really. Not a
> significant amount of memory may be reclaimed really until
> after init process starts running.
How much stuff allocated during boot needs to be reclaimed ?
>
> > Additionally, I believe the flag test/masking can be moved easily enough
> > out of the fast path... slub shouldn't need it there afaik and if it's
> > pushed down into the allocation of new slab's then it shouldn't be a big
> > deal.
>
> Given that things have been apparently coping fine so far, I
> think it will be a backward step to just give up now and say
> it is too hard simply because slab is available to use slightly
> earlier.
Things have been coping thanks to horrors such as
if (slab_is_available())
kmalloc
else
alloc_bootmem.
Now you are proposing to change that into
if (whatever_are_we_talking_about())
kmalloc(... GFP_KERNEL)
else
kmalloc(... GFP_BOOT)
Not a very big improvement in my book :-)
> It's not that the world is going to come to an end if we
> can't remove the masking, but just maybe the information
> can be used in future to avoid adding more overhead, or
> maybe some other debugging features can be added or something.
> I just think it is cleaner to go that way if possible, and
> claiming that callers can't be expected to know what context
> they clal the slab allocator from just sounds like a
> contradiction to me.
I agree with the general principle of pushing state information out to
the caller as much as possible. But like all principles, there are
meaningful exceptions and I believe this is a good example of one.
Cheers,
Ben.
next prev parent reply other threads:[~2009-06-15 10:46 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-06-12 13:25 [GIT PULL] Early SLAB fixes for 2.6.31 Pekka J Enberg
2009-06-12 13:38 ` Benjamin Herrenschmidt
2009-06-12 13:45 ` Pekka Enberg
2009-06-12 14:30 ` Christoph Lameter
2009-06-12 16:16 ` [GIT PULL v2] " Pekka J Enberg
2009-06-12 17:30 ` Christoph Lameter
2009-06-12 21:46 ` Benjamin Herrenschmidt
2009-06-15 6:46 ` Nick Piggin
2009-06-15 9:10 ` Pekka Enberg
2009-06-15 9:38 ` Nick Piggin
2009-06-15 14:43 ` Christoph Lameter
2009-06-14 7:12 ` Pekka J Enberg
2009-06-15 14:55 ` Christoph Lameter
2009-06-15 14:58 ` Pekka Enberg
2009-06-15 15:05 ` Christoph Lameter
2009-06-15 15:11 ` Pekka Enberg
2009-06-15 15:27 ` Pekka Enberg
2009-06-15 15:51 ` Christoph Lameter
2009-06-15 15:57 ` Pekka Enberg
2009-06-15 16:08 ` Christoph Lameter
2009-06-15 17:15 ` Linus Torvalds
2009-06-15 18:19 ` Pekka Enberg
2009-06-15 15:48 ` Christoph Lameter
2009-06-15 8:18 ` Heiko Carstens
2009-06-15 8:26 ` Nick Piggin
2009-06-15 8:32 ` Pekka Enberg
2009-06-15 8:52 ` Nick Piggin
2009-06-15 9:08 ` Pekka Enberg
2009-06-15 10:20 ` Heiko Carstens
2009-06-15 10:21 ` Pekka Enberg
2009-06-15 10:31 ` Nick Piggin
2009-06-15 10:36 ` Pekka Enberg
2009-06-15 9:10 ` Pekka Enberg
2009-06-15 9:41 ` Nick Piggin
2009-06-15 9:48 ` Pekka Enberg
2009-06-15 9:59 ` Nick Piggin
2009-06-15 9:51 ` Benjamin Herrenschmidt
2009-06-15 9:57 ` Pekka Enberg
2009-06-15 10:27 ` Nick Piggin
2009-06-15 10:45 ` Benjamin Herrenschmidt
2009-06-15 11:23 ` Nick Piggin
2009-06-15 12:38 ` Hugh Dickins
2009-06-15 13:07 ` Pekka Enberg
2009-06-16 4:57 ` Nick Piggin
2009-06-16 5:28 ` Benjamin Herrenschmidt
2009-06-16 5:36 ` Nick Piggin
2009-06-16 15:12 ` Christoph Lameter
2009-06-16 15:59 ` Nick Piggin
2009-06-15 21:31 ` Benjamin Herrenschmidt
2009-06-16 4:46 ` Nick Piggin
2009-06-16 5:18 ` Benjamin Herrenschmidt
2009-06-16 5:29 ` Nick Piggin
2009-06-16 18:45 ` Linus Torvalds
2009-06-17 7:47 ` Nick Piggin
2009-06-17 16:01 ` Linus Torvalds
2009-06-17 16:17 ` Nick Piggin
2009-06-17 21:39 ` Benjamin Herrenschmidt
2009-06-15 10:12 ` Nick Piggin
2009-06-15 10:39 ` Benjamin Herrenschmidt [this message]
2009-06-15 11:22 ` Nick Piggin
2009-06-15 11:28 ` Nick Piggin
2009-06-15 11:38 ` Nick Piggin
2009-06-15 21:37 ` Benjamin Herrenschmidt
2009-06-16 4:42 ` Nick Piggin
2009-06-15 21:32 ` Benjamin Herrenschmidt
2009-06-16 15:08 ` Christoph Lameter
2009-06-16 19:10 ` Linus Torvalds
2009-06-16 19:23 ` Christoph Lameter
2009-06-16 19:33 ` Linus Torvalds
2009-06-16 19:48 ` Christoph Lameter
2009-06-17 5:18 ` Pekka Enberg
2009-06-17 16:45 ` Linus Torvalds
2009-06-18 2:00 ` Benjamin Herrenschmidt
2009-06-18 3:24 ` Benjamin Herrenschmidt
2009-06-18 6:01 ` Pekka Enberg
2009-06-18 8:52 ` Benjamin Herrenschmidt
2009-06-16 21:58 ` Benjamin Herrenschmidt
2009-06-16 22:06 ` Linus Torvalds
2009-06-16 22:51 ` Benjamin Herrenschmidt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1245062388.12400.17.camel@pasglop \
--to=benh@kernel.crashing.org \
--cc=akpm@linux-foundation.org \
--cc=cl@linux-foundation.org \
--cc=heiko.carstens@de.ibm.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lizf@cn.fujitsu.com \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=penberg@cs.helsinki.fi \
--cc=torvalds@linux-foundation.org \
--cc=yinghai@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox