* New slab allocator SLUB
@ 2007-05-08 19:10 Christoph Lameter
2007-05-09 1:04 ` Paul Mundt
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Christoph Lameter @ 2007-05-08 19:10 UTC (permalink / raw)
To: linux-arch; +Cc: akpm
The new slab allocator SLUB was merged and it seems that we are heading
towards replacing SLAB completely by SLUB. This means that we would like
to be sure that SLUB runs reliably on all platforms. SLUB is first
available upstream with 2.6.21-git9.
One issue is that SLUB requires the use of the entire page struct for the
management of its objects. If arch code uses the page struct too then
disaster strikes. As a result SLUB has been disabled for several platforms
by setting ARCH_USES_SLAB_PAGE_STRUCT in the Kconfig. These are
i386:
Uses slab for pgd handling and modifies the page structs of those pages.
Fix has been in Andrew's tree for a while now.
PowerPC:
Uses slab allocator for pte allocation / freeing. The page structs of
ptes also are used for splitting the page table lock in large cpu
configurations (well more than 4) causing issues. There is a patch
by Hugh Dickins to address the issues but it seems that the arch
maintainers have now decided on a different course of action.
FRV
Like i386. Also uses slab for pgd handling and modifies page structs.
Fix sent to David Howells. Hopefully we can get this working soon.
I would appreciate it if you could test SLUB on your platform and make sure
that everything works the right way. There is a slabinfo tool that allows
monitoring of SLUB slabs in Documentation/vm/slabinfo.c. If there are
problems then please boot specifying "slub_debug" which should give you a
detailed analysis of the issues encountered.
There are a lot of kernel config files around that have CONFIG_SLAB=y.
This means that the kernel will be built with SLAB and not SLUB. In order
to build a kernel with SLUB you will need to have CONFIG_SLUB=y in there.
Differences in the treatment of power of two slabs:
SLUB has a higher packing density since the control fields are placed in
the page struct. There is no need for a control structure in the slab
itself or to have control structures in a separate slab (OFF_SLAB). This is
only possible since SLUB does not have to maintain a map of all objects
like SLAB. Instead we use a linked list.
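The linked-list idea can be illustrated with a small user-space sketch. This is not SLUB's code; the names toy_slab, toy_slab_init, toy_alloc and toy_free are made up for the example, and it assumes that a free object's first word may be overwritten.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy slab: one buffer carved into equal-sized objects. */
struct toy_slab {
    void *freelist;   /* first free object, or NULL if none is left */
    size_t inuse;     /* number of objects currently handed out */
};

/* Thread the free list through the objects themselves. */
static void toy_slab_init(struct toy_slab *s, void *buf,
                          size_t obj_size, size_t nr_objs)
{
    char *base = buf;
    size_t i;

    /* Each object must be able to hold one properly aligned pointer. */
    assert(obj_size >= sizeof(void *));
    assert(obj_size % sizeof(void *) == 0);

    s->freelist = NULL;
    s->inuse = 0;
    /* Walk backwards so the list ends up in address order. */
    for (i = nr_objs; i-- > 0; ) {
        void **obj = (void **)(base + i * obj_size);
        *obj = s->freelist;
        s->freelist = obj;
    }
}

static void *toy_alloc(struct toy_slab *s)
{
    void **obj = s->freelist;

    if (!obj)
        return NULL;
    s->freelist = *obj;   /* the next pointer lives inside the free object */
    s->inuse++;
    return obj;
}

static void toy_free(struct toy_slab *s, void *ptr)
{
    void **obj = ptr;

    *obj = s->freelist;   /* overwrites the first word of the freed object */
    s->freelist = obj;
    s->inuse--;
}
```

The only per-slab state is the list head and a counter, which is why no separate per-object map is required in this scheme.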
In order to manage objects with linked lists we need to have a pointer to
the next free object for each object. This is no problem for slab
configurations where the object state is irrelevant after kfree or before
kmalloc. However, if the object cannot be touched at all
(SLAB_DESTROY_BY_RCU or the use of constructors) then SLUB must place the
free list pointer after the object and therefore increase the object size.
This is particularly bad if the object is also aligned to the same
power-of-two because it means that the object size has just doubled.
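A rough worked example of the size doubling, as a user-space sketch. The helper names round_up_pow2 and slot_size are hypothetical, and the real size calculation in SLUB takes more factors into account than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of align (align must be a power of two). */
static size_t round_up_pow2(size_t x, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical per-object slot size.
 * ptr_inside != 0: the free pointer may overlay the object itself.
 * ptr_inside == 0: the object must stay untouched while free
 *                  (constructor or SLAB_DESTROY_BY_RCU), so the
 *                  pointer is appended after the object.
 */
static size_t slot_size(size_t obj_size, size_t align, int ptr_inside)
{
    size_t size = obj_size;

    if (!ptr_inside)
        size += sizeof(void *);
    return round_up_pow2(size, align);
}
```

With these assumptions, slot_size(256, 256, 1) stays at 256 bytes, while slot_size(256, 256, 0) becomes 512 bytes: the appended pointer pushes the object past its alignment boundary and the rounding doubles the slot.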
For page sized allocations quicklists are an alternative. Moving to
quicklists also allows continued use of page struct fields. The
solution for most of the three platforms above was to switch to quicklists
instead.
If you have allocations smaller than a page whose size is a power of
two and which are to be aligned to the same power of two then it may be
advisable to make sure that the slab does not have a constructor.
Otherwise there may be some memory wasted.
I would expect that the experimental status of SLUB will be removed
soon. SLUB will then become the default slab allocator. It may not be the
default when we release 2.6.22 but it is scheduled to be the default for
2.6.23.
* Re: New slab allocator SLUB
2007-05-08 19:10 New slab allocator SLUB Christoph Lameter
@ 2007-05-09 1:04 ` Paul Mundt
2007-05-09 13:36 ` Andi Kleen
2007-05-10 22:13 ` David Miller
2 siblings, 0 replies; 14+ messages in thread
From: Paul Mundt @ 2007-05-09 1:04 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-arch, akpm
On Tue, May 08, 2007 at 12:10:32PM -0700, Christoph Lameter wrote:
> I would appreciate it if you could test SLUB on your platform and make sure
> that everything works the right way. There is a slabinfo tool that allows
> monitoring of SLUB slabs in Documentation/vm/slabinfo.c. If there are
> problems then please boot specifying "slub_debug" which should give you a
> detailed analysis of the issues encountered.
>
It seems to hold up on SH as expected at least. I've been running
it for a while with varying workloads and nothing out of the ordinary has
popped up yet.
* Re: New slab allocator SLUB
2007-05-08 19:10 New slab allocator SLUB Christoph Lameter
2007-05-09 1:04 ` Paul Mundt
@ 2007-05-09 13:36 ` Andi Kleen
2007-05-09 15:52 ` Christoph Lameter
2007-05-10 22:13 ` David Miller
2 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2007-05-09 13:36 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-arch, akpm
>
> i386:
> Uses slab for pgd handling and modifies the page structs of those
> Fix has been in Andrew's tree for a while now.
Should be already upstream.
> I would expect that the experimental status of SLUB will be removed
> soon. SLUB will then become the default slab allocator. It may not be the
> default when we release 2.6.22 but it is scheduled to be the default for
> 2.6.23.
Assuming you fix the performance regressions first?
-Andi
* Re: New slab allocator SLUB
2007-05-09 13:36 ` Andi Kleen
@ 2007-05-09 15:52 ` Christoph Lameter
0 siblings, 0 replies; 14+ messages in thread
From: Christoph Lameter @ 2007-05-09 15:52 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-arch, akpm
On Wed, 9 May 2007, Andi Kleen wrote:
> Assuming you fix the performance regressions first?
There are no unfixed performance regressions. The netperf issue with
Clovertown has a fix in mm.
* Re: New slab allocator SLUB
2007-05-08 19:10 New slab allocator SLUB Christoph Lameter
2007-05-09 1:04 ` Paul Mundt
2007-05-09 13:36 ` Andi Kleen
@ 2007-05-10 22:13 ` David Miller
2007-05-10 22:21 ` Christoph Lameter
2 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2007-05-10 22:13 UTC (permalink / raw)
To: clameter; +Cc: linux-arch, akpm
From: Christoph Lameter <clameter@sgi.com>
Date: Tue, 8 May 2007 12:10:32 -0700 (PDT)
> The new slab allocator SLUB was merged and it seems that we are heading
> towards replacing SLAB completely by SLUB. This means that we would like
> to be sure that SLUB runs reliably on all platforms. SLUB is first
> available upstream with 2.6.21-git9.
I found a new difference in SLUB and it prevents sparc64 from
booting currently :-)
What SLAB allows you to do is define LARGE_ALLOCS but not necessarily
set MAX_ORDER large enough for the largest kmalloc SLAB. SLAB would
ignore the kmalloc cache creation failures for these largest ones that
are over MAX_ORDER.
SLUB instead panic()s, which isn't so nice that early in the boot.
There are a few platforms that will trigger this problem, in
fact pretty much every one that specifies LARGE_ALLOCS currently
based upon a casual scan of platform Kconfig files.
To be honest I don't think I even need LARGE_ALLOCS on sparc64 so I
think I'll just see if I can delete that, but I would suggest one of
two courses of action:
1) Make SLUB ignore kmalloc cache creation failures at least for
the higher order ones
or
2) Detect the (PAGE_SIZE << MAX_ORDER) < LARGEST_KMALLOC_SIZE
at compile time so that nobody gets such an early panic.
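Option 2 could look roughly like the following user-space sketch. It uses C11 static_assert rather than any kernel facility, and PAGE_SHIFT, MAX_ORDER and LARGEST_KMALLOC_SIZE are stand-in values chosen for the example, not values read from a real configuration. The comparison is taken as written in the text above; whether the bound should be MAX_ORDER or one less is left open here.

```c
#include <assert.h>   /* static_assert (C11) */

/* Stand-in values; in the kernel these come from arch headers and Kconfig. */
#define PAGE_SHIFT            13                 /* assumed: 8KB pages */
#define PAGE_SIZE             (1UL << PAGE_SHIFT)
#define MAX_ORDER             11
#define LARGEST_KMALLOC_SIZE  (1UL << 23)        /* hypothetical: 8MB */

/* The condition from option 2, turned into a build failure. */
static_assert((PAGE_SIZE << MAX_ORDER) >= LARGEST_KMALLOC_SIZE,
              "largest kmalloc cache is bigger than PAGE_SIZE << MAX_ORDER");

/* Helper so the same limit can also be inspected at run time. */
static unsigned long page_alloc_limit(void)
{
    return PAGE_SIZE << MAX_ORDER;
}
```

Raising LARGEST_KMALLOC_SIZE to (1UL << 25) in this sketch makes the build stop with the message above instead of producing a binary that fails at boot.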
Take care.
* Re: New slab allocator SLUB
2007-05-10 22:13 ` David Miller
@ 2007-05-10 22:21 ` Christoph Lameter
2007-05-10 22:30 ` David Miller
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2007-05-10 22:21 UTC (permalink / raw)
To: David Miller; +Cc: linux-arch, akpm
On Thu, 10 May 2007, David Miller wrote:
> What SLAB allows you to do is define LARGE_ALLOCS but not necessarily
> set MAX_ORDER large enough for the largest kmalloc SLAB. SLAB would
> ignore the kmalloc cache creation failures for these largest ones that
> are over MAX_ORDER.
Hmmm... How about limiting KMALLOC_SHIFT_HIGH to max order?
---
include/linux/slub_def.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: slub/include/linux/slub_def.h
===================================================================
--- slub.orig/include/linux/slub_def.h 2007-05-10 15:19:39.000000000 -0700
+++ slub/include/linux/slub_def.h 2007-05-10 15:20:39.000000000 -0700
@@ -59,7 +59,7 @@ struct kmem_cache {
#define KMALLOC_SHIFT_LOW 3
#ifdef CONFIG_LARGE_ALLOCS
-#define KMALLOC_SHIFT_HIGH 25
+#define KMALLOC_SHIFT_HIGH (min(25, MAX_ORDER + PAGE_SHIFT))
#else
#if !defined(CONFIG_MMU) || NR_CPUS > 512 || MAX_NUMNODES > 256
#define KMALLOC_SHIFT_HIGH 20
* Re: New slab allocator SLUB
2007-05-10 22:21 ` Christoph Lameter
@ 2007-05-10 22:30 ` David Miller
2007-05-10 22:33 ` Christoph Lameter
0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2007-05-10 22:30 UTC (permalink / raw)
To: clameter; +Cc: linux-arch, akpm
From: Christoph Lameter <clameter@sgi.com>
Date: Thu, 10 May 2007 15:21:52 -0700 (PDT)
> On Thu, 10 May 2007, David Miller wrote:
>
> > What SLAB allows you to do is define LARGE_ALLOCS but not necessarily
> > set MAX_ORDER large enough for the largest kmalloc SLAB. SLAB would
> > ignore the kmalloc cache creation failures for these largest ones that
> > are over MAX_ORDER.
>
> Hmmm... How about limiting KMALLOC_SHIFT_HIGH to max order?
That should definitely do the trick too:
Signed-off-by: David S. Miller <davem@davemloft.net>
I just confirmed that I don't actually need LARGE_ALLOCS on sparc64.
I think I needed them for some reason back when I used kmalloc() to
allocate the per-address-space TLB miss hash tables.
I think the issue was that for Niagara and later really huge TLB
hash table sizes are allowed, and I wanted to experiment with those
and the sizes were large enough to require LARGE_ALLOCS. But now
I use SLAB for this and I cap the size at the pre-Niagara limit
of 1MB because larger sizes showed no performance gains.
* Re: New slab allocator SLUB
2007-05-10 22:30 ` David Miller
@ 2007-05-10 22:33 ` Christoph Lameter
2007-05-10 22:35 ` David Miller
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2007-05-10 22:33 UTC (permalink / raw)
To: David Miller; +Cc: linux-arch, akpm
On Thu, 10 May 2007, David Miller wrote:
> From: Christoph Lameter <clameter@sgi.com>
> Date: Thu, 10 May 2007 15:21:52 -0700 (PDT)
>
> > On Thu, 10 May 2007, David Miller wrote:
> >
> > > What SLAB allows you to do is define LARGE_ALLOCS but not necessarily
> > > set MAX_ORDER large enough for the largest kmalloc SLAB. SLAB would
> > > ignore the kmalloc cache creation failures for these largest ones that
> > > are over MAX_ORDER.
> >
> > Hmmm... How about limiting KMALLOC_SHIFT_HIGH to max order?
>
> That should definitely do the trick too:
Could you verify that it indeed does the trick?
* Re: New slab allocator SLUB
2007-05-10 22:33 ` Christoph Lameter
@ 2007-05-10 22:35 ` David Miller
2007-05-10 22:38 ` David Miller
0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2007-05-10 22:35 UTC (permalink / raw)
To: clameter; +Cc: linux-arch, akpm
From: Christoph Lameter <clameter@sgi.com>
Date: Thu, 10 May 2007 15:33:30 -0700 (PDT)
> On Thu, 10 May 2007, David Miller wrote:
>
> > From: Christoph Lameter <clameter@sgi.com>
> > Date: Thu, 10 May 2007 15:21:52 -0700 (PDT)
> >
> > > On Thu, 10 May 2007, David Miller wrote:
> > >
> > > > What SLAB allows you to do is define LARGE_ALLOCS but not necessarily
> > > > set MAX_ORDER large enough for the largest kmalloc SLAB. SLAB would
> > > > ignore the kmalloc cache creation failures for these largest ones that
> > > > are over MAX_ORDER.
> > >
> > > Hmmm... How about limiting KMALLOC_SHIFT_HIGH to max order?
> >
> > That should definitely do the trick too:
>
> Could you verify that it indeed does the trick?
Sure... give me a few minutes.
* Re: New slab allocator SLUB
2007-05-10 22:35 ` David Miller
@ 2007-05-10 22:38 ` David Miller
2007-05-10 22:44 ` Christoph Lameter
0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2007-05-10 22:38 UTC (permalink / raw)
To: clameter; +Cc: linux-arch, akpm
From: David Miller <davem@davemloft.net>
Date: Thu, 10 May 2007 15:35:31 -0700 (PDT)
> From: Christoph Lameter <clameter@sgi.com>
> Date: Thu, 10 May 2007 15:33:30 -0700 (PDT)
>
> > On Thu, 10 May 2007, David Miller wrote:
> >
> > > From: Christoph Lameter <clameter@sgi.com>
> > > Date: Thu, 10 May 2007 15:21:52 -0700 (PDT)
> > >
> > > > On Thu, 10 May 2007, David Miller wrote:
> > > >
> > > > > What SLAB allows you to do is define LARGE_ALLOCS but not necessarily
> > > > > set MAX_ORDER large enough for the largest kmalloc SLAB. SLAB would
> > > > > ignore the kmalloc cache creation failures for these largest ones that
> > > > > are over MAX_ORDER.
> > > >
> > > > Hmmm... How about limiting KMALLOC_SHIFT_HIGH to max order?
> > >
> > > That should definitely do the trick too:
> >
> > Could you verify that it indeed does the trick?
>
> Sure... give me a few minutes.
Ugh, it won't build, you can't use min() because this is
evaluated at compile time to compute array sizes etc.
include/linux/slub_def.h:76: error: braced-group within expression allowed only inside a function
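The error comes from min() being built on a GCC statement expression, which is only valid inside a function body. A plain conditional expression is an integer constant expression and can be used for array sizes. A minimal user-space sketch, with CONST_MIN, SHIFT_HIGH and the constant values made up for the example:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Stand-in values, chosen only for illustration. */
#define PAGE_SHIFT 12
#define MAX_ORDER  11

/*
 * A statement-expression min(), similar in spirit to the kernel's:
 *   #define min(x, y) ({ typeof(x) _x = (x); typeof(y) _y = (y); \
 *                        _x < _y ? _x : _y; })
 * Using it at file scope, for example as an array size, triggers the
 * "braced-group within expression" error quoted above.
 */

/* A conditional expression works wherever a constant is required. */
#define CONST_MIN(a, b) ((a) < (b) ? (a) : (b))

#define SHIFT_HIGH CONST_MIN(25, MAX_ORDER + PAGE_SHIFT)

static_assert(SHIFT_HIGH <= 25, "SHIFT_HIGH must not exceed 25");

/* File-scope array sized by the macro: accepted because it is constant. */
static int caches[SHIFT_HIGH + 1];

static size_t nr_caches(void)
{
    return sizeof(caches) / sizeof(caches[0]);
}
```

The trade-off is that CONST_MIN evaluates its arguments more than once and does no type checking, which is acceptable for constants but is the reason a stricter min() exists for ordinary code.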
* Re: New slab allocator SLUB
2007-05-10 22:38 ` David Miller
@ 2007-05-10 22:44 ` Christoph Lameter
2007-05-10 23:01 ` David Miller
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2007-05-10 22:44 UTC (permalink / raw)
To: David Miller; +Cc: linux-arch, akpm
On Thu, 10 May 2007, David Miller wrote:
> Ugh, it won't build, you can't use min() because this is
> evaluated at compile time to compute array sizes etc.
>
> include/linux/slub_def.h:76: error: braced-group within expression allowed only inside a function
Rats. Then we have to do it by hand. This compiles here...
---
include/linux/slub_def.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
Index: slub/include/linux/slub_def.h
===================================================================
--- slub.orig/include/linux/slub_def.h 2007-05-10 15:19:39.000000000 -0700
+++ slub/include/linux/slub_def.h 2007-05-10 15:42:57.000000000 -0700
@@ -59,7 +59,8 @@ struct kmem_cache {
#define KMALLOC_SHIFT_LOW 3
#ifdef CONFIG_LARGE_ALLOCS
-#define KMALLOC_SHIFT_HIGH 25
+#define KMALLOC_SHIFT_HIGH ((MAX_ORDER + PAGE_SHIFT) < 25 ? \
+ MAX_ORDER + PAGE_SHIFT : 25)
#else
#if !defined(CONFIG_MMU) || NR_CPUS > 512 || MAX_NUMNODES > 256
#define KMALLOC_SHIFT_HIGH 20
@@ -86,6 +87,9 @@ static inline int kmalloc_index(int size
*/
WARN_ON_ONCE(size == 0);
+ if (size >= (1UL << KMALLOC_SHIFT_HIGH))
+ return -1;
+
if (size > 64 && size <= 96)
return 1;
if (size > 128 && size <= 192)
* Re: New slab allocator SLUB
2007-05-10 22:44 ` Christoph Lameter
@ 2007-05-10 23:01 ` David Miller
2007-05-10 23:07 ` Christoph Lameter
0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2007-05-10 23:01 UTC (permalink / raw)
To: clameter; +Cc: linux-arch, akpm
From: Christoph Lameter <clameter@sgi.com>
Date: Thu, 10 May 2007 15:44:12 -0700 (PDT)
> On Thu, 10 May 2007, David Miller wrote:
>
> > Ugh, it won't build, you can't use min() because this is
> > evaluated at compile time to compute array sizes etc.
> >
> > include/linux/slub_def.h:76: error: braced-group within expression allowed only inside a function
>
> Rats. Then we have to do it by hand. This compiles here...
Unfortunately, still no dice with LARGE_ALLOCS=y/SLUB=y on sparc64.
It still tries to create the 16MB kmalloc cache even though MAX_ORDER
is 11. :-)
I think this is an off-by-one error, the kmalloc cache builder
iterates to >= KMALLOC_SHIFT_HIGH but your min() on MAX_ORDER would
only work if it iterated to > KMALLOC_SHIFT_HIGH.
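The arithmetic behind the off-by-one can be sketched as follows. This assumes 8KB pages (PAGE_SHIFT of 13), which matches the 16MB figure above with MAX_ORDER of 11, and it assumes that the largest block the page allocator hands out is of order MAX_ORDER - 1. The function name oversized_caches is made up for the example.

```c
#include <assert.h>

/*
 * Count the kmalloc caches in shift_low..shift_high (inclusive) that are
 * too large for a page allocator whose biggest block has order
 * max_order - 1, i.e. a size of 1 << (page_shift + max_order - 1).
 */
static int oversized_caches(int shift_low, int shift_high,
                            int page_shift, int max_order)
{
    int largest_ok = page_shift + max_order - 1;
    int shift, bad = 0;

    assert(shift_low <= shift_high);
    /* Inclusive upper bound, like the cache builder described above. */
    for (shift = shift_low; shift <= shift_high; shift++)
        if (shift > largest_ok)
            bad++;
    return bad;
}
```

With shift_high set to MAX_ORDER + PAGE_SHIFT = 24 the loop still reaches the 16MB cache, so oversized_caches(3, 24, 13, 11) reports one cache that cannot be created; with shift_high of 23 it reports none.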
* Re: New slab allocator SLUB
2007-05-10 23:01 ` David Miller
@ 2007-05-10 23:07 ` Christoph Lameter
2007-05-11 0:05 ` David Miller
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2007-05-10 23:07 UTC (permalink / raw)
To: David Miller; +Cc: linux-arch, akpm
On Thu, 10 May 2007, David Miller wrote:
> Unfortunately, still no dice with LARGE_ALLOCS=y/SLUB=y on sparc64.
> It still tries to create the 16MB kmalloc cache even though MAX_ORDER
> is 11. :-)
>
> I think this is an off-by-one error, the kmalloc cache builder
> iterates to >= KMALLOC_SHIFT_HIGH but your min() on MAX_ORDER would
> only work if it iterated to > KMALLOC_SHIFT_HIGH.
Then subtract one?
---
include/linux/slub_def.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
Index: slub/include/linux/slub_def.h
===================================================================
--- slub.orig/include/linux/slub_def.h 2007-05-10 15:19:39.000000000 -0700
+++ slub/include/linux/slub_def.h 2007-05-10 16:06:15.000000000 -0700
@@ -59,7 +59,8 @@ struct kmem_cache {
#define KMALLOC_SHIFT_LOW 3
#ifdef CONFIG_LARGE_ALLOCS
-#define KMALLOC_SHIFT_HIGH 25
+#define KMALLOC_SHIFT_HIGH ((MAX_ORDER + PAGE_SHIFT) <= 25 ? \
+ MAX_ORDER + PAGE_SHIFT - 1 : 25)
#else
#if !defined(CONFIG_MMU) || NR_CPUS > 512 || MAX_NUMNODES > 256
#define KMALLOC_SHIFT_HIGH 20
@@ -86,6 +87,9 @@ static inline int kmalloc_index(int size
*/
WARN_ON_ONCE(size == 0);
+ if (size >= (1UL << KMALLOC_SHIFT_HIGH))
+ return -1;
+
if (size > 64 && size <= 96)
return 1;
if (size > 128 && size <= 192)
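For checking the numbers, the final macro and the new size guard can be restated as plain functions. This is a user-space sketch, not part of the patch; the function names kmalloc_shift_high and passes_size_guard are made up, and the configurations tried below are illustrative.

```c
#include <assert.h>

/* Function form of the final KMALLOC_SHIFT_HIGH expression in the patch. */
static int kmalloc_shift_high(int max_order, int page_shift)
{
    return (max_order + page_shift) <= 25 ? max_order + page_shift - 1 : 25;
}

/*
 * Mirror of the guard added to kmalloc_index(): sizes at or above
 * 1 << KMALLOC_SHIFT_HIGH are rejected (the patch returns -1 for them).
 */
static int passes_size_guard(unsigned long size, int shift_high)
{
    assert(shift_high > 0 && shift_high < 32);
    return size < (1UL << shift_high);
}
```

For MAX_ORDER of 11 and an assumed PAGE_SHIFT of 13 this gives a limit of 23, i.e. 8MB, so the 16MB cache is no longer requested.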
* Re: New slab allocator SLUB
2007-05-10 23:07 ` Christoph Lameter
@ 2007-05-11 0:05 ` David Miller
0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2007-05-11 0:05 UTC (permalink / raw)
To: clameter; +Cc: linux-arch, akpm
From: Christoph Lameter <clameter@sgi.com>
Date: Thu, 10 May 2007 16:07:16 -0700 (PDT)
> On Thu, 10 May 2007, David Miller wrote:
>
> > Unfortunately, still no dice with LARGE_ALLOCS=y/SLUB=y on sparc64.
> > It still tries to create the 16MB kmalloc cache even though MAX_ORDER
> > is 11. :-)
> >
> > I think this is an off-by-one error, the kmalloc cache builder
> > iterates to >= KMALLOC_SHIFT_HIGH but your min() on MAX_ORDER would
> > only work if it iterated to > KMALLOC_SHIFT_HIGH.
>
> Then subtract one?
Yep, that works.