* [PATCH 1/3] slub: correct to calculate num of acquired objects in get_partial_node()
From: Joonsoo Kim @ 2013-01-15 7:20 UTC
To: Pekka Enberg
Cc: Christoph Lameter, js1304, linux-mm, linux-kernel, Joonsoo Kim
There is a subtle bug in calculating the number of acquired objects.
After the first acquire_slab() is executed, page->inuse is equal to
page->objects, so 'available' is always 0 and we always go on to the
next iteration.
Correct this so the iteration stops once we have found sufficient
objects in the first iteration.
With that change, the return value of put_cpu_partial() is no longer
needed, so remove it.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
These patches are based on v3.8-rc3 and have no dependencies on each
other. If a rebase is needed, please notify me.
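To restate the bug in code: the pre-patch flow in get_partial_node()
is roughly the following (a condensed paraphrase of the pre-patch code
in the diff below, with illustrative comments added):

	t = acquire_slab(s, n, page, object == NULL);
	if (!t)
		break;

	if (!object) {
		c->page = page;
		object = t;
		/*
		 * acquire_slab() with mode != 0 has just set
		 * page->inuse = page->objects via the cmpxchg,
		 * so this difference is always 0 ...
		 */
		available = page->objects - page->inuse;
	} else {
		available = put_cpu_partial(s, page, 0);
	}
	/* ... so on the first pass this check never fires */
	if (kmem_cache_debug(s) || available > s->cpu_partial / 2)
		break;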
diff --git a/mm/slub.c b/mm/slub.c
index ba2ca53..abef30e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1493,7 +1493,7 @@ static inline void remove_partial(struct kmem_cache_node *n,
*/
static inline void *acquire_slab(struct kmem_cache *s,
struct kmem_cache_node *n, struct page *page,
- int mode)
+ int mode, int *objects)
{
void *freelist;
unsigned long counters;
@@ -1506,6 +1506,7 @@ static inline void *acquire_slab(struct kmem_cache *s,
*/
freelist = page->freelist;
counters = page->counters;
+ *objects = page->objects - page->inuse;
new.counters = counters;
if (mode) {
new.inuse = page->objects;
@@ -1528,7 +1529,7 @@ static inline void *acquire_slab(struct kmem_cache *s,
return freelist;
}
-static int put_cpu_partial(struct kmem_cache *s, struct page *page, int drain);
+static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain);
static inline bool pfmemalloc_match(struct page *page, gfp_t gfpflags);
/*
@@ -1539,6 +1540,8 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
{
struct page *page, *page2;
void *object = NULL;
+ int available = 0;
+ int objects;
/*
* Racy check. If we mistakenly see no partial slabs then we
@@ -1552,22 +1555,21 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
spin_lock(&n->list_lock);
list_for_each_entry_safe(page, page2, &n->partial, lru) {
void *t;
- int available;
if (!pfmemalloc_match(page, flags))
continue;
- t = acquire_slab(s, n, page, object == NULL);
+ t = acquire_slab(s, n, page, object == NULL, &objects);
if (!t)
break;
+ available += objects;
if (!object) {
c->page = page;
stat(s, ALLOC_FROM_PARTIAL);
object = t;
- available = page->objects - page->inuse;
} else {
- available = put_cpu_partial(s, page, 0);
+ put_cpu_partial(s, page, 0);
stat(s, CPU_PARTIAL_NODE);
}
if (kmem_cache_debug(s) || available > s->cpu_partial / 2)
@@ -1946,7 +1948,7 @@ static void unfreeze_partials(struct kmem_cache *s,
* If we did not find a slot then simply move all the partials to the
* per node partial list.
*/
-static int put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
+static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
{
struct page *oldpage;
int pages;
@@ -1984,7 +1986,6 @@ static int put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
page->next = oldpage;
} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
- return pobjects;
}
static inline void flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
--
1.7.9.5
* [PATCH 2/3] slub: correct bootstrap() for kmem_cache, kmem_cache_node
From: Joonsoo Kim @ 2013-01-15 7:20 UTC
To: Pekka Enberg
Cc: Christoph Lameter, js1304, linux-mm, linux-kernel, Joonsoo Kim
The current implementation of bootstrap() is not sufficient for
kmem_cache and kmem_cache_node.

First, for kmem_cache:
bootstrap() calls kmem_cache_zalloc() first. When kmem_cache_zalloc()
is called, kmem_cache's slab is moved to the cpu slab to satisfy the
kmem_cache allocation request. The current implementation only
considers n->partial slabs, so we miss this cpu slab for kmem_cache.

Second, for kmem_cache_node:
When slab_state == PARTIAL, create_boot_cache() is called, and then
kmem_cache_node's slab is moved to the cpu slab to satisfy the
kmem_cache_node allocation request. So we miss this slab as well.

These omissions caused no errors previously, because we normally do
not free objects that come from the first slab of kmem_cache or
kmem_cache_node.

The problem is solved if we also fix up the cpu slab in bootstrap().
This patch implements that.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
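For background (not part of the patch): the free path derives the
cache from the page, which is why a cpu slab still pointing at the
static boot cache would matter once its objects are freed. Very
roughly, kfree() in this era of mm/slub.c does:

	void kfree(const void *x)
	{
		struct page *page = virt_to_head_page(x);
		...
		/* a stale page->slab_cache here would free the object
		 * against the discarded static boot cache */
		slab_free(page->slab_cache, page, (void *)x, _RET_IP_);
	}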
diff --git a/mm/slub.c b/mm/slub.c
index abef30e..830348b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3613,11 +3613,22 @@ static int slab_memory_callback(struct notifier_block *self,
static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
{
+ int cpu;
int node;
struct kmem_cache *s = kmem_cache_zalloc(kmem_cache, GFP_NOWAIT);
memcpy(s, static_cache, kmem_cache->object_size);
+ for_each_possible_cpu(cpu) {
+ struct kmem_cache_cpu *c;
+ struct page *p;
+
+ c = per_cpu_ptr(s->cpu_slab, cpu);
+ p = c->page;
+ if (p)
+ p->slab_cache = s;
+ }
+
for_each_node_state(node, N_NORMAL_MEMORY) {
struct kmem_cache_node *n = get_node(s, node);
struct page *p;
--
1.7.9.5
* [PATCH 3/3] slub: add 'likely' macro to inc_slabs_node()
From: Joonsoo Kim @ 2013-01-15 7:20 UTC
To: Pekka Enberg
Cc: Christoph Lameter, js1304, linux-mm, linux-kernel, Joonsoo Kim
After the boot phase, 'n' always exists,
so add the 'likely' macro to help the compiler.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
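For reference, likely() is the standard branch-prediction hint
defined in include/linux/compiler.h:

	/* tell the compiler this condition is expected to be true */
	#define likely(x)	__builtin_expect(!!(x), 1)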
diff --git a/mm/slub.c b/mm/slub.c
index 830348b..6f82070 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1005,7 +1005,7 @@ static inline void inc_slabs_node(struct kmem_cache *s, int node, int objects)
* dilemma by deferring the increment of the count during
* bootstrap (see early_kmem_cache_node_alloc).
*/
- if (n) {
+ if (likely(n)) {
atomic_long_inc(&n->nr_slabs);
atomic_long_add(objects, &n->total_objects);
}
--
1.7.9.5
* Re: [PATCH 2/3] slub: correct bootstrap() for kmem_cache, kmem_cache_node
From: Christoph Lameter @ 2013-01-15 15:36 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, js1304, linux-mm, linux-kernel
On Tue, 15 Jan 2013, Joonsoo Kim wrote:
> These omissions caused no errors previously, because we normally do
> not free objects that come from the first slab of kmem_cache or
> kmem_cache_node.
And these slabs are on the partial list because the objects are typically
relatively small compared to page size. Do you have a system with a very
large kmem_cache size?
> The problem is solved if we also fix up the cpu slab in bootstrap().
> This patch implements that.
At boot time only one processor is up so you do not need the loop over all
processors.
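A minimal sketch of that simplification (hypothetical; the actual v2
may look different):

	/* only the boot cpu has a cpu slab this early */
	struct kmem_cache_cpu *c = this_cpu_ptr(s->cpu_slab);

	if (c->page)
		c->page->slab_cache = s;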
* Re: [PATCH 3/3] slub: add 'likely' macro to inc_slabs_node()
From: Christoph Lameter @ 2013-01-15 15:36 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, js1304, linux-mm, linux-kernel
On Tue, 15 Jan 2013, Joonsoo Kim wrote:
> After the boot phase, 'n' always exists,
> so add the 'likely' macro to help the compiler.
Acked-by: Christoph Lameter <cl@linux.com>
* Re: [PATCH 1/3] slub: correct to calculate num of acquired objects in get_partial_node()
From: Christoph Lameter @ 2013-01-15 15:46 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, js1304, linux-mm, linux-kernel
On Tue, 15 Jan 2013, Joonsoo Kim wrote:
> There is a subtle bug in calculating the number of acquired objects.
> After the first acquire_slab() is executed, page->inuse is equal to
> page->objects, so 'available' is always 0 and we always go on to the
> next iteration.
page->inuse is always < page->objects because the partial list is not used
for slabs that are fully allocated. page->inuse == page->objects means
that no objects are available on the slab and therefore the slab would
have been removed from the partial list.
> With that change, the return value of put_cpu_partial() is no longer
> needed, so remove it.
Hmmm... The code looks a bit easier to understand than what we have right now.
Could you try to explain it better?
* Re: [PATCH 1/3] slub: correct to calculate num of acquired objects in get_partial_node()
From: Joonsoo Kim @ 2013-01-16 8:41 UTC
To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, linux-kernel
On Tue, Jan 15, 2013 at 03:46:17PM +0000, Christoph Lameter wrote:
> On Tue, 15 Jan 2013, Joonsoo Kim wrote:
>
> > There is a subtle bug in calculating the number of acquired objects.
> > After the first acquire_slab() is executed, page->inuse is equal to
> > page->objects, so 'available' is always 0 and we always go on to the
> > next iteration.
>
> page->inuse is always < page->objects because the partial list is not used
> for slabs that are fully allocated. page->inuse == page->objects means
> that no objects are available on the slab and therefore the slab would
> have been removed from the partial list.
Currently we calculate "available = page->objects - page->inuse"
after acquire_slab() is called in get_partial_node().
In acquire_slab() with mode = 1, we always set new.inuse = page->objects.
So
acquire_slab(s, n, page, object == NULL);
if (!object) {
c->page = page;
stat(s, ALLOC_FROM_PARTIAL);
object = t;
available = page->objects - page->inuse;
!!!!!! available is always 0 !!!!!!
} else {
available = put_cpu_partial(s, page, 0);
stat(s, CPU_PARTIAL_NODE);
}
Therefore, "available > s->cpu_partial / 2" is always false and we
always go on to the second iteration.
This patch corrects the problem.
> > With that change, the return value of put_cpu_partial() is no longer
> > needed, so remove it.
>
> Hmmm... The code looks a bit easier to understand than what we have right now.
>
> Could you try to explain it better?
>
* Re: [PATCH 2/3] slub: correct bootstrap() for kmem_cache, kmem_cache_node
From: Joonsoo Kim @ 2013-01-16 8:45 UTC
To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, linux-kernel
On Tue, Jan 15, 2013 at 03:36:10PM +0000, Christoph Lameter wrote:
> On Tue, 15 Jan 2013, Joonsoo Kim wrote:
>
> > These omissions caused no errors previously, because we normally do
> > not free objects that come from the first slab of kmem_cache or
> > kmem_cache_node.
>
> And these slabs are on the partial list because the objects are typically
> relatively small compared to page size. Do you have a system with a very
> large kmem_cache size?
These slabs are not on the partial list, but on the cpu slab of the
boot cpu; the reason is described in the changelog.
Because these slabs are not on the partial list, we need to check the
kmem_cache_cpu's cpu slab. This patch implements that.
> > The problem is solved if we also fix up the cpu slab in bootstrap().
> > This patch implements that.
>
> At boot time only one processor is up so you do not need the loop over all
> processors.
Okay! I will fix it and submit v2 soon.
* Re: [PATCH 1/3] slub: correct to calculate num of acquired objects in get_partial_node()
From: Christoph Lameter @ 2013-01-16 15:04 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, linux-mm, linux-kernel
On Wed, 16 Jan 2013, Joonsoo Kim wrote:
> In acquire_slab() with mode = 1, we always set new.inuse = page->objects.
Yes with that we signal that we have extracted the objects from the slab.
> So
>
> acquire_slab(s, n, page, object == NULL);
>
> if (!object) {
> c->page = page;
> stat(s, ALLOC_FROM_PARTIAL);
> object = t;
> available = page->objects - page->inuse;
>
> !!!!!! available is always 0 !!!!!!
Correct. We should really count the objects that we extracted in
acquire_slab(). Please update the patch description and repost.
Also, it would be nice if we had some way to avoid passing a pointer
to an integer to acquire_slab(). If we cannot avoid that then ok, but
it would be nicer without it.
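One hypothetical way to avoid the out-parameter, sketched here only
to illustrate the suggestion (not something from this thread), is to
return both results by value in a small struct:

	struct acquired {
		void *freelist;	/* NULL if the acquire failed */
		int objects;	/* free objects taken from the slab */
	};

	static inline struct acquired acquire_slab(struct kmem_cache *s,
			struct kmem_cache_node *n, struct page *page,
			int mode)
	{
		/* ... as before, but filling in a struct acquired ... */
	}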
* Re: [PATCH 2/3] slub: correct bootstrap() for kmem_cache, kmem_cache_node
From: Christoph Lameter @ 2013-01-16 15:05 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, linux-mm, linux-kernel
On Wed, 16 Jan 2013, Joonsoo Kim wrote:
> These slabs are not on the partial list, but on the cpu slab of the
> boot cpu; the reason is described in the changelog.
> Because these slabs are not on the partial list, we need to check the
> kmem_cache_cpu's cpu slab. This patch implements that.
Ah. Ok.