* [PATCH 1/3] slub: correct to calculate num of acquired objects in get_partial_node()
From: Joonsoo Kim @ 2013-01-15 7:20 UTC
To: Pekka Enberg
Cc: Christoph Lameter, js1304, linux-mm, linux-kernel, Joonsoo Kim
There is a subtle bug in calculating the number of acquired objects.
After the first acquire_slab() is executed, page->inuse is equal to
page->objects, so 'available' is always 0 and we always go on to the
next iteration.
Correct this so the iteration stops once we have found sufficient
objects in the first iteration.
With that change, the return value of put_cpu_partial() is no longer
needed, so remove it.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
These patches are based on v3.8-rc3 and have no dependencies on each
other. If a rebase is needed, please notify me.
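To restate the bug in code: the pre-patch flow in get_partial_node()
is roughly the following (a condensed paraphrase of the pre-patch code
in the diff below, with illustrative comments added):

	t = acquire_slab(s, n, page, object == NULL);
	if (!t)
		break;

	if (!object) {
		c->page = page;
		object = t;
		/*
		 * acquire_slab() with mode != 0 has just set
		 * page->inuse = page->objects via the cmpxchg,
		 * so this difference is always 0 ...
		 */
		available = page->objects - page->inuse;
	} else {
		available = put_cpu_partial(s, page, 0);
	}
	/* ... so on the first pass this check never fires */
	if (kmem_cache_debug(s) || available > s->cpu_partial / 2)
		break;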
diff --git a/mm/slub.c b/mm/slub.c
index ba2ca53..abef30e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1493,7 +1493,7 @@ static inline void remove_partial(struct kmem_cache_node *n,
*/
static inline void *acquire_slab(struct kmem_cache *s,
struct kmem_cache_node *n, struct page *page,
- int mode)
+ int mode, int *objects)
{
void *freelist;
unsigned long counters;
@@ -1506,6 +1506,7 @@ static inline void *acquire_slab(struct kmem_cache *s,
*/
freelist = page->freelist;
counters = page->counters;
+ *objects = page->objects - page->inuse;
new.counters = counters;
if (mode) {
new.inuse = page->objects;
@@ -1528,7 +1529,7 @@ static inline void *acquire_slab(struct kmem_cache *s,
return freelist;
}
-static int put_cpu_partial(struct kmem_cache *s, struct page *page, int drain);
+static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain);
static inline bool pfmemalloc_match(struct page *page, gfp_t gfpflags);
/*
@@ -1539,6 +1540,8 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
{
struct page *page, *page2;
void *object = NULL;
+ int available = 0;
+ int objects;
/*
* Racy check. If we mistakenly see no partial slabs then we
@@ -1552,22 +1555,21 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
spin_lock(&n->list_lock);
list_for_each_entry_safe(page, page2, &n->partial, lru) {
void *t;
- int available;
if (!pfmemalloc_match(page, flags))
continue;
- t = acquire_slab(s, n, page, object == NULL);
+ t = acquire_slab(s, n, page, object == NULL, &objects);
if (!t)
break;
+ available += objects;
if (!object) {
c->page = page;
stat(s, ALLOC_FROM_PARTIAL);
object = t;
- available = page->objects - page->inuse;
} else {
- available = put_cpu_partial(s, page, 0);
+ put_cpu_partial(s, page, 0);
stat(s, CPU_PARTIAL_NODE);
}
if (kmem_cache_debug(s) || available > s->cpu_partial / 2)
@@ -1946,7 +1948,7 @@ static void unfreeze_partials(struct kmem_cache *s,
* If we did not find a slot then simply move all the partials to the
* per node partial list.
*/
-static int put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
+static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
{
struct page *oldpage;
int pages;
@@ -1984,7 +1986,6 @@ static int put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
page->next = oldpage;
} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
- return pobjects;
}
static inline void flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
--
1.7.9.5
* [PATCH 2/3] slub: correct bootstrap() for kmem_cache, kmem_cache_node
From: Joonsoo Kim @ 2013-01-15 7:20 UTC
To: Pekka Enberg
Cc: Christoph Lameter, js1304, linux-mm, linux-kernel, Joonsoo Kim
The current implementation of bootstrap() is not sufficient for
kmem_cache and kmem_cache_node.

First, for kmem_cache:
bootstrap() calls kmem_cache_zalloc() first. When kmem_cache_zalloc()
is called, kmem_cache's slab is moved to the cpu slab to satisfy the
kmem_cache allocation request. The current implementation only
considers n->partial slabs, so we miss this cpu slab for kmem_cache.

Second, for kmem_cache_node:
When slab_state == PARTIAL, create_boot_cache() is called, and then
kmem_cache_node's slab is moved to the cpu slab to satisfy the
kmem_cache_node allocation request. So we miss this slab as well.

These omissions caused no errors previously, because we normally do
not free objects that come from the first slab of kmem_cache or
kmem_cache_node.

The problem is solved if we also fix up the cpu slab in bootstrap().
This patch implements that.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
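For background (not part of the patch): the free path derives the
cache from the page, which is why a cpu slab still pointing at the
static boot cache would matter once its objects are freed. Very
roughly, kfree() in this era of mm/slub.c does:

	void kfree(const void *x)
	{
		struct page *page = virt_to_head_page(x);
		...
		/* a stale page->slab_cache here would free the object
		 * against the discarded static boot cache */
		slab_free(page->slab_cache, page, (void *)x, _RET_IP_);
	}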
diff --git a/mm/slub.c b/mm/slub.c
index abef30e..830348b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3613,11 +3613,22 @@ static int slab_memory_callback(struct notifier_block *self,
static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
{
+ int cpu;
int node;
struct kmem_cache *s = kmem_cache_zalloc(kmem_cache, GFP_NOWAIT);
memcpy(s, static_cache, kmem_cache->object_size);
+ for_each_possible_cpu(cpu) {
+ struct kmem_cache_cpu *c;
+ struct page *p;
+
+ c = per_cpu_ptr(s->cpu_slab, cpu);
+ p = c->page;
+ if (p)
+ p->slab_cache = s;
+ }
+
for_each_node_state(node, N_NORMAL_MEMORY) {
struct kmem_cache_node *n = get_node(s, node);
struct page *p;
--
1.7.9.5
* [PATCH 3/3] slub: add 'likely' macro to inc_slabs_node()
From: Joonsoo Kim @ 2013-01-15 7:20 UTC
To: Pekka Enberg
Cc: Christoph Lameter, js1304, linux-mm, linux-kernel, Joonsoo Kim
After the boot phase, 'n' always exists,
so add the 'likely' macro to help the compiler.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
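For reference, likely() is the standard branch-prediction hint
defined in include/linux/compiler.h:

	/* tell the compiler this condition is expected to be true */
	#define likely(x)	__builtin_expect(!!(x), 1)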
diff --git a/mm/slub.c b/mm/slub.c
index 830348b..6f82070 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1005,7 +1005,7 @@ static inline void inc_slabs_node(struct kmem_cache *s, int node, int objects)
* dilemma by deferring the increment of the count during
* bootstrap (see early_kmem_cache_node_alloc).
*/
- if (n) {
+ if (likely(n)) {
atomic_long_inc(&n->nr_slabs);
atomic_long_add(objects, &n->total_objects);
}
--
1.7.9.5
* Re: [PATCH 2/3] slub: correct bootstrap() for kmem_cache, kmem_cache_node
From: Christoph Lameter @ 2013-01-15 15:36 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, js1304, linux-mm, linux-kernel
On Tue, 15 Jan 2013, Joonsoo Kim wrote:
> These omissions caused no errors previously, because we normally do
> not free objects that come from the first slab of kmem_cache or
> kmem_cache_node.
And these slabs are on the partial list because the objects are typically
relatively small compared to page size. Do you have a system with a very
large kmem_cache size?
> The problem is solved if we also fix up the cpu slab in bootstrap().
> This patch implements that.
At boot time only one processor is up so you do not need the loop over all
processors.
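A minimal sketch of that simplification (hypothetical; the actual v2
may look different):

	/* only the boot cpu has a cpu slab this early */
	struct kmem_cache_cpu *c = this_cpu_ptr(s->cpu_slab);

	if (c->page)
		c->page->slab_cache = s;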
* Re: [PATCH 3/3] slub: add 'likely' macro to inc_slabs_node()
From: Christoph Lameter @ 2013-01-15 15:36 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, js1304, linux-mm, linux-kernel
On Tue, 15 Jan 2013, Joonsoo Kim wrote:
> After the boot phase, 'n' always exists,
> so add the 'likely' macro to help the compiler.
Acked-by: Christoph Lameter <cl@linux.com>
* Re: [PATCH 1/3] slub: correct to calculate num of acquired objects in get_partial_node()
From: Christoph Lameter @ 2013-01-15 15:46 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, js1304, linux-mm, linux-kernel
On Tue, 15 Jan 2013, Joonsoo Kim wrote:
> There is a subtle bug in calculating the number of acquired objects.
> After the first acquire_slab() is executed, page->inuse is equal to
> page->objects, so 'available' is always 0 and we always go on to the
> next iteration.
page->inuse is always < page->objects because the partial list is not used
for slabs that are fully allocated. page->inuse == page->objects means
that no objects are available on the slab and therefore the slab would
have been removed from the partial list.
> With that change, the return value of put_cpu_partial() is no longer
> needed, so remove it.
Hmmm... The code looks a bit easier to understand than what we have right now.
Could you try to explain it better?
* Re: [PATCH 1/3] slub: correct to calculate num of acquired objects in get_partial_node()
From: Joonsoo Kim @ 2013-01-16 8:41 UTC
To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, linux-kernel
On Tue, Jan 15, 2013 at 03:46:17PM +0000, Christoph Lameter wrote:
> On Tue, 15 Jan 2013, Joonsoo Kim wrote:
>
> > There is a subtle bug in calculating the number of acquired objects.
> > After the first acquire_slab() is executed, page->inuse is equal to
> > page->objects, so 'available' is always 0 and we always go on to the
> > next iteration.
>
> page->inuse is always < page->objects because the partial list is not used
> for slabs that are fully allocated. page->inuse == page->objects means
> that no objects are available on the slab and therefore the slab would
> have been removed from the partial list.
Currently we calculate "available = page->objects - page->inuse"
after acquire_slab() is called in get_partial_node().
In acquire_slab() with mode = 1, we always set new.inuse = page->objects.
So
acquire_slab(s, n, page, object == NULL);
if (!object) {
c->page = page;
stat(s, ALLOC_FROM_PARTIAL);
object = t;
available = page->objects - page->inuse;
!!!!!! available is always 0 !!!!!!
} else {
available = put_cpu_partial(s, page, 0);
stat(s, CPU_PARTIAL_NODE);
}
Therefore, "available > s->cpu_partial / 2" is always false and we
always go on to the second iteration.
This patch corrects the problem.
> > With that change, the return value of put_cpu_partial() is no longer
> > needed, so remove it.
>
> Hmmm... The code looks a bit easier to understand than what we have right now.
>
> Could you try to explain it better?
>
* Re: [PATCH 2/3] slub: correct bootstrap() for kmem_cache, kmem_cache_node
From: Joonsoo Kim @ 2013-01-16 8:45 UTC
To: Christoph Lameter; +Cc: Pekka Enberg, linux-mm, linux-kernel
On Tue, Jan 15, 2013 at 03:36:10PM +0000, Christoph Lameter wrote:
> On Tue, 15 Jan 2013, Joonsoo Kim wrote:
>
> > These omissions caused no errors previously, because we normally do
> > not free objects that come from the first slab of kmem_cache or
> > kmem_cache_node.
>
> And these slabs are on the partial list because the objects are typically
> relatively small compared to page size. Do you have a system with a very
> large kmem_cache size?
These slabs are not on the partial list, but on the cpu slab of the
boot cpu; the reason is described in the changelog.
Because these slabs are not on the partial list, we need to check the
kmem_cache_cpu's cpu slab. This patch implements that.
> > The problem is solved if we also fix up the cpu slab in bootstrap().
> > This patch implements that.
>
> At boot time only one processor is up so you do not need the loop over all
> processors.
Okay! I will fix it and submit v2 soon.
* Re: [PATCH 1/3] slub: correct to calculate num of acquired objects in get_partial_node()
From: Christoph Lameter @ 2013-01-16 15:04 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, linux-mm, linux-kernel
On Wed, 16 Jan 2013, Joonsoo Kim wrote:
> In acquire_slab() with mode = 1, we always set new.inuse = page->objects.
Yes with that we signal that we have extracted the objects from the slab.
> So
>
> acquire_slab(s, n, page, object == NULL);
>
> if (!object) {
> c->page = page;
> stat(s, ALLOC_FROM_PARTIAL);
> object = t;
> available = page->objects - page->inuse;
>
> !!!!!! available is always 0 !!!!!!
Correct. We should really count the objects that we extracted in
acquire_slab(). Please update the patch description and repost.
Also, it would be nice if we had some way to avoid passing a pointer
to an integer to acquire_slab(). If we cannot avoid that then ok, but
it would be nicer without it.
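One hypothetical way to avoid the out-parameter, sketched here only
to illustrate the suggestion (not something from this thread), is to
return both results by value in a small struct:

	struct acquired {
		void *freelist;	/* NULL if the acquire failed */
		int objects;	/* free objects taken from the slab */
	};

	static inline struct acquired acquire_slab(struct kmem_cache *s,
			struct kmem_cache_node *n, struct page *page,
			int mode)
	{
		/* ... as before, but filling in a struct acquired ... */
	}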
* Re: [PATCH 2/3] slub: correct bootstrap() for kmem_cache, kmem_cache_node
From: Christoph Lameter @ 2013-01-16 15:05 UTC
To: Joonsoo Kim; +Cc: Pekka Enberg, linux-mm, linux-kernel
On Wed, 16 Jan 2013, Joonsoo Kim wrote:
> These slabs are not on the partial list, but on the cpu slab of the
> boot cpu; the reason is described in the changelog.
> Because these slabs are not on the partial list, we need to check the
> kmem_cache_cpu's cpu slab. This patch implements that.
Ah. Ok.