* [PATCH 1/2] mempool: drop unnecessary and incorrect BUG_ON() from mempool_destroy()
From: Tejun Heo @ 2011-12-22 0:18 UTC
To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel
mempool_destroy() is a thin wrapper around free_pool().  The only
thing it adds is BUG_ON(pool->curr_nr != pool->min_nr).  The intention
seems to be to enforce that all allocated elements are freed; however,
the BUG_ON() can't achieve that (it doesn't know anything about
objects above min_nr) and is incorrect, as mempool_resize() is allowed
to leave the pool extended but not filled.  Furthermore, panicking is
way worse than any memory leak and there are better debug tools to
track memory leaks.

Drop the BUG_ON() from mempool_destroy() and, as that leaves the
function identical to free_pool(), replace free_pool() with it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
---
These patches are on top of the "mempool: fix and document
synchronization and memory barrier usage" patch[1].  Both are fixes
and it is probably a good idea to forward them to -stable.
[1] http://thread.gmane.org/gmane.linux.kernel/1231609/focus=1232127
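
To make the mempool_resize() interaction above concrete, here is a
minimal sketch (hypothetical slab-backed pool; error handling elided;
note that mempool_resize() took a gfp argument in kernels of this era)
of a sequence that could trip the old BUG_ON() without leaking anything:

#include <linux/mempool.h>
#include <linux/slab.h>

/* @cachep: an existing kmem_cache (hypothetical). */
static int resize_then_destroy(struct kmem_cache *cachep)
{
	mempool_t *pool;

	pool = mempool_create(4, mempool_alloc_slab, mempool_free_slab,
			      cachep);
	if (!pool)
		return -ENOMEM;

	/*
	 * Growing the pool raises min_nr first and then tries to allocate
	 * the extra reserves; under memory pressure the refill may stop
	 * short, legitimately leaving curr_nr < min_nr.
	 */
	mempool_resize(pool, 64, GFP_KERNEL);

	/* The old BUG_ON(pool->curr_nr != pool->min_nr) could fire here. */
	mempool_destroy(pool);
	return 0;
}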
mm/mempool.c | 30 +++++++++++-------------------
1 file changed, 11 insertions(+), 19 deletions(-)
Index: work/mm/mempool.c
===================================================================
--- work.orig/mm/mempool.c
+++ work/mm/mempool.c
@@ -27,7 +27,15 @@ static void *remove_element(mempool_t *p
 	return pool->elements[--pool->curr_nr];
 }
 
-static void free_pool(mempool_t *pool)
+/**
+ * mempool_destroy - deallocate a memory pool
+ * @pool: pointer to the memory pool which was allocated via
+ *        mempool_create().
+ *
+ * Free all reserved elements in @pool and @pool itself.  This function
+ * only sleeps if the free_fn() function sleeps.
+ */
+void mempool_destroy(mempool_t *pool)
 {
 	while (pool->curr_nr) {
 		void *element = remove_element(pool);
@@ -36,6 +44,7 @@ static void free_pool(mempool_t *pool)
 	kfree(pool->elements);
 	kfree(pool);
 }
+EXPORT_SYMBOL(mempool_destroy);
 
 /**
  * mempool_create - create a memory pool
@@ -86,7 +95,7 @@ mempool_t *mempool_create_node(int min_n
 		element = pool->alloc(GFP_KERNEL, pool->pool_data);
 		if (unlikely(!element)) {
-			free_pool(pool);
+			mempool_destroy(pool);
 			return NULL;
 		}
 		add_element(pool, element);
 	}
@@ -172,23 +181,6 @@ out:
 EXPORT_SYMBOL(mempool_resize);
 
 /**
- * mempool_destroy - deallocate a memory pool
- * @pool: pointer to the memory pool which was allocated via
- *        mempool_create().
- *
- * this function only sleeps if the free_fn() function sleeps. The caller
- * has to guarantee that all elements have been returned to the pool (ie:
- * freed) prior to calling mempool_destroy().
- */
-void mempool_destroy(mempool_t *pool)
-{
-	/* Check for outstanding elements */
-	BUG_ON(pool->curr_nr != pool->min_nr);
-	free_pool(pool);
-}
-EXPORT_SYMBOL(mempool_destroy);
-
-/**
  * mempool_alloc - allocate an element from a specific memory pool
  * @pool: pointer to the memory pool which was allocated via
  * mempool_create().
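
For readers coming to the thread cold, a minimal usage sketch of the
API being touched (the module, cache, and pool names are invented for
illustration; this is not code from the patch):

#include <linux/module.h>
#include <linux/mempool.h>
#include <linux/slab.h>

static struct kmem_cache *my_cache;	/* hypothetical backing cache */
static mempool_t *my_pool;

static int __init my_init(void)
{
	my_cache = kmem_cache_create("my_objs", 128, 0, 0, NULL);
	if (!my_cache)
		return -ENOMEM;

	/* Reserve 4 elements; alloc/free callbacks are the slab helpers. */
	my_pool = mempool_create(4, mempool_alloc_slab, mempool_free_slab,
				 my_cache);
	if (!my_pool) {
		kmem_cache_destroy(my_cache);
		return -ENOMEM;
	}
	return 0;
}

static void __exit my_exit(void)
{
	mempool_destroy(my_pool);	/* frees the reserves and the pool */
	kmem_cache_destroy(my_cache);
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");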

* [PATCH 2/2] mempool: fix first round failure behavior
From: Tejun Heo @ 2011-12-22 0:19 UTC
To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel

For the initial allocation, mempool passes a modified gfp mask to the
backing allocator so that it doesn't try too hard when there are
reserved elements waiting in the pool; however, when that allocation
fails and the pool is empty too, it either waits for the pool to be
replenished before retrying or fails if !__GFP_WAIT.

* If the caller was calling in with GFP_ATOMIC, it never gets to try
  the emergency reserve.  Allocations which would have succeeded
  without mempool may fail, which is just wrong.

* An allocation which could have succeeded after a bit of reclaim now
  has to wait on the reserved items, and it's not like mempool doesn't
  retry with the original gfp mask.  It just does that *after* someone
  returns an element, pointlessly delaying things.

Fix it by retrying immediately with the gfp mask requested by the
caller if the first round of allocation attempts fails with the
modified mask.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
---
 mm/mempool.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

Index: work/mm/mempool.c
===================================================================
--- work.orig/mm/mempool.c
+++ work/mm/mempool.c
@@ -221,14 +221,24 @@ repeat_alloc:
 		return element;
 	}
 
-	/* We must not sleep in the GFP_ATOMIC case */
+	/*
+	 * We use modified gfp mask for the first round.  If alloc failed
+	 * with that and @pool was empty too, immediately retry with the
+	 * original gfp mask.
+	 */
+	if (gfp_temp != gfp_mask) {
+		gfp_temp = gfp_mask;
+		spin_unlock_irqrestore(&pool->lock, flags);
+		goto repeat_alloc;
+	}
+
+	/* We must not sleep if !__GFP_WAIT */
 	if (!(gfp_mask & __GFP_WAIT)) {
 		spin_unlock_irqrestore(&pool->lock, flags);
 		return NULL;
 	}
 
 	/* Let's wait for someone else to return an element to @pool */
-	gfp_temp = gfp_mask;
 	init_wait(&wait);
 	prepare_to_wait(&pool->wait, &wait, TASK_UNINTERRUPTIBLE);
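
The "modified gfp mask" in the description is the first-round mask
mempool_alloc() builds before calling the backing allocator; roughly
(paraphrased from the mempool_alloc() of this era, not a verbatim copy
of the tree):

void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask)
{
	...
	gfp_mask |= __GFP_NOMEMALLOC;	/* don't allocate emergency reserves */
	gfp_mask |= __GFP_NORETRY;	/* don't loop in __alloc_pages() */
	gfp_mask |= __GFP_NOWARN;	/* failures are OK */

	/* first round: no reclaim, no IO */
	gfp_temp = gfp_mask & ~(__GFP_WAIT | __GFP_IO);

repeat_alloc:
	element = pool->alloc(gfp_temp, pool->pool_data);
	...
}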

* Re: [PATCH 2/2] mempool: fix first round failure behavior
From: Andrew Morton @ 2011-12-22 0:32 UTC
To: Tejun Heo; +Cc: Linus Torvalds, linux-kernel

On Wed, 21 Dec 2011 16:19:39 -0800 Tejun Heo <tj@kernel.org> wrote:

> For the initial allocation, mempool passes a modified gfp mask to the
> backing allocator so that it doesn't try too hard when there are
> reserved elements waiting in the pool; however, when that allocation
> fails and the pool is empty too, it either waits for the pool to be
> replenished before retrying or fails if !__GFP_WAIT.
>
> * If the caller was calling in with GFP_ATOMIC, it never gets to try
>   the emergency reserve.  Allocations which would have succeeded
>   without mempool may fail, which is just wrong.
>
> * An allocation which could have succeeded after a bit of reclaim now
>   has to wait on the reserved items, and it's not like mempool doesn't
>   retry with the original gfp mask.  It just does that *after* someone
>   returns an element, pointlessly delaying things.

This is a significant change in behaviour.  Previously the mempool code
would preserve emergency pools while waiting for someone to return an
item.  Now, it will permit many more items to be allocated, chewing
into the emergency pools.

We *know* that items will soon become available, so why not wait for
that to happen rather than consuming memory which less robust callers
could have utilised?

IOW, this change appears to make the kernel more vulnerable to memory
exhaustion failures?

* Re: [PATCH 2/2] mempool: fix first round failure behavior
From: Tejun Heo @ 2011-12-22 0:34 UTC
To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel

Hello, Andrew.

On Wed, Dec 21, 2011 at 4:32 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> This is a significant change in behaviour.  Previously the mempool code
> would preserve emergency pools while waiting for someone to return an
> item.  Now, it will permit many more items to be allocated, chewing
> into the emergency pools.
>
> We *know* that items will soon become available, so why not wait for
> that to happen rather than consuming memory which less robust callers
> could have utilised?
>
> IOW, this change appears to make the kernel more vulnerable to memory
> exhaustion failures?

Yeah, that's me misreading the code.  The mempool allocator doesn't
retry with the original gfp mask.  It retries with, umm... a less
modified one, so the patch description is wrong, but the behavior
w.r.t. the emergency pool doesn't change.  I'm rewriting the commit
message.

Thanks.

--
tejun

* [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Tejun Heo @ 2011-12-22 0:46 UTC
To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel

mempool modifies gfp_mask so that the backing allocator doesn't try
too hard or trigger a warning message when there's a pool to fall back
on.  In addition, for the first try, it removes __GFP_WAIT and IO, so
that it doesn't trigger reclaim or wait when the allocation can be
fulfilled from the pool; however, when that allocation fails and the
pool is empty too, it waits for the pool to be replenished before
retrying.

An allocation which could have succeeded after a bit of reclaim has to
wait on the reserved items, and it's not like mempool doesn't retry
with __GFP_WAIT and IO.  It just does that *after* someone returns an
element, pointlessly delaying things.

Fix it by retrying immediately if the first round of allocation
attempts w/o __GFP_WAIT and IO fails.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
The code hasn't changed.  Only the description and comment are
updated.  It doesn't affect anything regarding the emergency pool.  It
just changes when the first retry happens.

That said, I still find it a bit unsettling that a GFP_ATOMIC
allocation which would otherwise succeed may fail when issued through
mempool.  Maybe the RTTD is clearing __GFP_NOMEMALLOC on retry if the
gfp requested by the caller is !__GFP_WAIT && !__GFP_NOMEMALLOC?

Thanks.

 mm/mempool.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Index: work/mm/mempool.c
===================================================================
--- work.orig/mm/mempool.c
+++ work/mm/mempool.c
@@ -221,14 +221,23 @@ repeat_alloc:
 		return element;
 	}
 
-	/* We must not sleep in the GFP_ATOMIC case */
+	/*
+	 * We use gfp mask w/o __GFP_WAIT or IO for the first round.  If
+	 * alloc failed with that and @pool was empty, retry immediately.
+	 */
+	if (gfp_temp != gfp_mask) {
+		gfp_temp = gfp_mask;
+		spin_unlock_irqrestore(&pool->lock, flags);
+		goto repeat_alloc;
+	}
+
+	/* We must not sleep if !__GFP_WAIT */
 	if (!(gfp_mask & __GFP_WAIT)) {
 		spin_unlock_irqrestore(&pool->lock, flags);
 		return NULL;
 	}
 
 	/* Let's wait for someone else to return an element to @pool */
-	gfp_temp = gfp_mask;
 	init_wait(&wait);
 	prepare_to_wait(&pool->wait, &wait, TASK_UNINTERRUPTIBLE);
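
Condensed, the allocation path with this patch applied behaves as
follows (a control-flow sketch of the patched mempool_alloc(), with
declarations, locking details, and memory barriers trimmed):

	gfp_temp = gfp_mask & ~(__GFP_WAIT | __GFP_IO);

repeat_alloc:
	/* 1. try the backing allocator with the restricted mask */
	element = pool->alloc(gfp_temp, pool->pool_data);
	if (likely(element != NULL))
		return element;

	/* 2. fall back to a reserved element, if any */
	spin_lock_irqsave(&pool->lock, flags);
	if (likely(pool->curr_nr)) {
		element = remove_element(pool);
		spin_unlock_irqrestore(&pool->lock, flags);
		return element;
	}

	/* 3. new: retry right away with reclaim and IO allowed */
	if (gfp_temp != gfp_mask) {
		gfp_temp = gfp_mask;
		spin_unlock_irqrestore(&pool->lock, flags);
		goto repeat_alloc;
	}

	/* 4. can't sleep: give up */
	if (!(gfp_mask & __GFP_WAIT)) {
		spin_unlock_irqrestore(&pool->lock, flags);
		return NULL;
	}

	/* 5. wait for someone to return an element, then start over */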

* Re: [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Andrew Morton @ 2011-12-22 1:09 UTC
To: Tejun Heo; +Cc: Linus Torvalds, linux-kernel

On Wed, 21 Dec 2011 16:46:29 -0800 Tejun Heo <tj@kernel.org> wrote:

> mempool modifies gfp_mask so that the backing allocator doesn't try
> too hard or trigger a warning message when there's a pool to fall back
> on.  In addition, for the first try, it removes __GFP_WAIT and IO, so
> that it doesn't trigger reclaim or wait when the allocation can be
> fulfilled from the pool; however, when that allocation fails and the
> pool is empty too, it waits for the pool to be replenished before
> retrying.
>
> An allocation which could have succeeded after a bit of reclaim has to
> wait on the reserved items, and it's not like mempool doesn't retry
> with __GFP_WAIT and IO.  It just does that *after* someone returns an
> element, pointlessly delaying things.
>
> Fix it by retrying immediately if the first round of allocation
> attempts w/o __GFP_WAIT and IO fails.

If the pool is empty and the memory allocator is down into its
emergency reserves then we have:

Old behaviour:  Wait for someone to return an item, then retry.

New behaviour:  Enable page reclaim in gfp_mask, retry a single time,
                then wait for someone to return an item.

So what we can expect to see is that in this low-memory situation,
mempool_alloc() will perform a lot more page reclaim, and more mempool
items will be let loose into the kernel.

I'm not sure what the effects of this will be.  I can't immediately
point at any bad ones.  Probably not much, as the mempool_alloc()
caller will probably be doing other allocations, using the
reclaim-permitting gfp_mask.

But I have painful memories of us (me and Jens, iirc) churning this
code over and over again until it stopped causing problems.  Some were
subtle and nasty.  Much dumpster diving into the pre-git changelogs
should be done before changing it, lest we rediscover long-fixed
problems :(

I have a feeling that if we go for "New behaviour" then the
implementation could be made simpler than this, but laziness is
setting in.

> That said, I still find it a bit unsettling that a GFP_ATOMIC
> allocation which would otherwise succeed may fail when issued through
> mempool.

Spose so.  It would be strange to call mempool_alloc() with GFP_ATOMIC,
because "wait for an item to be returned" is the whole point of the
thing.

> Maybe the RTTD is clearing __GFP_NOMEMALLOC on retry if the
> gfp requested by the caller is !__GFP_WAIT && !__GFP_NOMEMALLOC?

What the heck is an RTTD?

> +	/*
> +	 * We use gfp mask w/o __GFP_WAIT or IO for the first round.  If
> +	 * alloc failed with that and @pool was empty, retry immediately.
> +	 */
> +	if (gfp_temp != gfp_mask) {
> +		gfp_temp = gfp_mask;
> +		spin_unlock_irqrestore(&pool->lock, flags);
> +		goto repeat_alloc;
> +	}
> +

Here, have a faster kernel ;)

--- a/mm/mempool.c~mempool-fix-first-round-failure-behavior-fix
+++ a/mm/mempool.c
@@ -227,8 +227,8 @@ repeat_alloc:
 	 * original gfp mask.
 	 */
 	if (gfp_temp != gfp_mask) {
-		gfp_temp = gfp_mask;
 		spin_unlock_irqrestore(&pool->lock, flags);
+		gfp_temp = gfp_mask;
 		goto repeat_alloc;
 	}
_
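
The "faster kernel" tweak is the usual rule of keeping work that does
not touch shared state out of the spinlock critical section; as a
general pattern (a sketch of the principle, not additional mempool
code):

	spin_lock_irqsave(&pool->lock, flags);
	/* ... only reads/writes of pool state belong here ... */
	spin_unlock_irqrestore(&pool->lock, flags);

	gfp_temp = gfp_mask;	/* local-variable work: no lock needed */
	goto repeat_alloc;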

* Re: [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Tejun Heo @ 2011-12-22 1:23 UTC
To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel

Hello, Andrew.

On Wed, Dec 21, 2011 at 05:09:19PM -0800, Andrew Morton wrote:
> If the pool is empty and the memory allocator is down into its
> emergency reserves then we have:
>
> Old behaviour:  Wait for someone to return an item, then retry.
>
> New behaviour:  Enable page reclaim in gfp_mask, retry a single time,
>                 then wait for someone to return an item.
>
> So what we can expect to see is that in this low-memory situation,
> mempool_alloc() will perform a lot more page reclaim, and more mempool
> items will be let loose into the kernel.
>
> I'm not sure what the effects of this will be.  I can't immediately
> point at any bad ones.  Probably not much, as the mempool_alloc()
> caller will probably be doing other allocations, using the
> reclaim-permitting gfp_mask.
>
> But I have painful memories of us (me and Jens, iirc) churning this
> code over and over again until it stopped causing problems.  Some were
> subtle and nasty.  Much dumpster diving into the pre-git changelogs
> should be done before changing it, lest we rediscover long-fixed
> problems :(

I see.  It just seemed like a weird behavior, and looking at the commit
log, there was originally code to kick reclaim there, so the sequence
made sense - first try w/o reclaim, look at the mempool, kick reclaim
and retry w/ GFP_WAIT, and then wait for someone else to free.  That
part was removed by 20a77776c24 "[PATCH] mempool: simplify alloc" back
in '05.  In the process, it also lost the retry w/ reclaim before
waiting for mempool reserves.

I was trying to add a percpu mempool and this bit me, as the percpu
allocator can't do NOIO and the above delayed retry logic ended up
adding a random 5s delay (or until the next free).

> > That said, I still find it a bit unsettling that a GFP_ATOMIC
> > allocation which would otherwise succeed may fail when issued through
> > mempool.
>
> Spose so.  It would be strange to call mempool_alloc() with GFP_ATOMIC,
> because "wait for an item to be returned" is the whole point of the
> thing.

Yeah, but the pool can be used from multiple code paths, and I think it
plausible to use it that way and expect at least the same or better
alloc behavior as not using mempool.  Eh... this doesn't really affect
correctness, so not such a big deal, but still weird.

> > Maybe the RTTD is clearing __GFP_NOMEMALLOC on retry if the
> > gfp requested by the caller is !__GFP_WAIT && !__GFP_NOMEMALLOC?
>
> What the heck is an RTTD?

Right thing to do?  Hmmm... I thought other people were using it too.
It's quite possible that I just dreamed it up tho.

> Here, have a faster kernel ;)

;)

Thanks.

--
tejun

* Re: [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Tejun Heo @ 2011-12-22 1:31 UTC
To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel

On Wed, Dec 21, 2011 at 05:23:12PM -0800, Tejun Heo wrote:
> I see.  It just seemed like a weird behavior, and looking at the commit
> log, there was originally code to kick reclaim there, so the sequence
> made sense - first try w/o reclaim, look at the mempool, kick reclaim
> and retry w/ GFP_WAIT, and then wait for someone else to free.  That
> part was removed by 20a77776c24 "[PATCH] mempool: simplify alloc" back
> in '05.  In the process, it also lost the retry w/ reclaim before
> waiting for mempool reserves.

Oops, it isn't that relevant, but the original code retried w/ reclaim
before looking at the mempool if less than half were left in the pool.

--
tejun

* Re: [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Vivek Goyal @ 2011-12-22 15:15 UTC
To: Tejun Heo; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

On Wed, Dec 21, 2011 at 05:23:12PM -0800, Tejun Heo wrote:
> I was trying to add a percpu mempool and this bit me, as the percpu
> allocator can't do NOIO and the above delayed retry logic ended up
> adding a random 5s delay (or until the next free).

Tejun,

For blkio to use these percpu mempools, I guess we shall have to change
the notion that an allocated object from a mempool is returned to the
pool soon.  We might not be returning these per-cpu stat objects to the
mempool for a very long time (these will be returned to the pool only
when somebody deletes the cgroup).

Thanks
Vivek

* Re: [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Vivek Goyal @ 2011-12-22 15:20 UTC
To: Tejun Heo; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

On Thu, Dec 22, 2011 at 10:15:29AM -0500, Vivek Goyal wrote:
> For blkio to use these percpu mempools, I guess we shall have to change
> the notion that an allocated object from a mempool is returned to the
> pool soon.  We might not be returning these per-cpu stat objects to the
> mempool for a very long time (these will be returned to the pool only
> when somebody deletes the cgroup).

Hi Tejun,

Also, mempool expects all the objects to be returned to the pool at the
time of destroy.  Given the fact that blkio groups are reference
counted, theoretically they can be freed later.

I was playing a bit with maintaining a pool of blkio group objects.  I
ran into the issue of the blkio group not having any pointer to the
request queue or cfqd, as that is required to retrieve the pool pointer
to which this object has to go.  Stashing a pointer in the blkio group
then brings back the issue of a reference on the queue or cfqd etc.  So
that makes it no simpler than my current implementation.  So I gave up.

Thanks
Vivek

* Re: [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Tejun Heo @ 2011-12-22 15:58 UTC
To: Vivek Goyal; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Hello,

On Thu, Dec 22, 2011 at 7:20 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> Also, mempool expects all the objects to be returned to the pool at the
> time of destroy.  Given the fact that blkio groups are reference
> counted, theoretically they can be freed later.

I removed that part, but no, you still can't free objects after mempool
destruction, refcnted or not.  mempool is destructed on module unload.
Which text would free those objects?  If that's theoretically possible,
the code is broken.

Thanks.

--
tejun
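
A sketch of the failure mode being pointed at here, with all names
hypothetical: if a refcounted object can outlive mempool_destroy(), its
eventual release touches a freed pool (and runs code from an unloaded
module):

#include <linux/kref.h>
#include <linux/mempool.h>

struct my_obj {
	struct kref ref;
	/* ... per-cpu stats, etc. ... */
};

static mempool_t *my_pool;		/* hypothetical */

static void my_exit(void)
{
	mempool_destroy(my_pool);	/* frees pool->elements and the pool */
}					/* ...and the module text unloads */

/* A still-referenced object released afterwards: */
static void my_obj_release(struct kref *kref)
{
	struct my_obj *obj = container_of(kref, struct my_obj, ref);

	mempool_free(obj, my_pool);	/* use-after-free: @my_pool is gone,
					 * and this callback itself may be
					 * in unloaded module text */
}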

* Re: [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Vivek Goyal @ 2011-12-22 16:04 UTC
To: Tejun Heo; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

On Thu, Dec 22, 2011 at 07:58:18AM -0800, Tejun Heo wrote:
> I removed that part, but no, you still can't free objects after mempool
> destruction, refcnted or not.  mempool is destructed on module unload.
> Which text would free those objects?  If that's theoretically possible,
> the code is broken.

That's a good point.  I did not think about module unload.  I think then
my current patch for allocating per-cpu objects from a worker thread is
buggy, as on IO scheduler exit I don't try to flush the possibly
in-progress work[s].

Thanks
Vivek

* Re: [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Tejun Heo @ 2011-12-22 16:15 UTC
To: Vivek Goyal; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Hello, Vivek.

On Thu, Dec 22, 2011 at 11:04:23AM -0500, Vivek Goyal wrote:
> That's a good point.  I did not think about module unload.  I think then
> my current patch for allocating per-cpu objects from a worker thread is
> buggy, as on IO scheduler exit I don't try to flush the possibly
> in-progress work[s].

Hmmm... I'm not sure whether backporting the mempool stuff would be a
viable approach for this merge window and -stable, and we might have to
come up with some kind of minimal solution.  I don't yet know how that
should look, but let's talk about that in a different thread.

Thanks.

--
tejun

* Re: [PATCH UPDATED 2/2] mempool: fix first round failure behavior
From: Tejun Heo @ 2011-12-22 15:21 UTC
To: Vivek Goyal; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

Hello,

On Thu, Dec 22, 2011 at 7:15 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> For blkio to use these percpu mempools, I guess we shall have to change
> the notion that an allocated object from a mempool is returned to the
> pool soon.  We might not be returning these per-cpu stat objects to the
> mempool for a very long time (these will be returned to the pool only
> when somebody deletes the cgroup).

I'm almost done with it.  It can behave as an allocation buffer which
is filled from a more permissible context and consumed in a more
constricted one, and async filling kicks the waiting allocator, so I
think we should be fine.

Thanks.

--
tejun

* Re: [PATCH 1/2] mempool: drop unnecessary and incorrect BUG_ON() from mempool_destroy()
From: Andrew Morton @ 2011-12-22 0:25 UTC
To: Tejun Heo; +Cc: Linus Torvalds, linux-kernel

On Wed, 21 Dec 2011 16:18:00 -0800 Tejun Heo <tj@kernel.org> wrote:

> mempool_destroy() is a thin wrapper around free_pool().  The only
> thing it adds is BUG_ON(pool->curr_nr != pool->min_nr).  The intention
> seems to be to enforce that all allocated elements are freed; however,
> the BUG_ON() can't achieve that (it doesn't know anything about
> objects above min_nr) and is incorrect, as mempool_resize() is allowed
> to leave the pool extended but not filled.  Furthermore, panicking is
> way worse than any memory leak and there are better debug tools to
> track memory leaks.
>
> Drop the BUG_ON() from mempool_destroy() and, as that leaves the
> function identical to free_pool(), replace free_pool() with it.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@kernel.org

(that's stable@vger.kernel.org)

> ---
> These patches are on top of the "mempool: fix and document
> synchronization and memory barrier usage" patch[1].  Both are fixes
> and it is probably a good idea to forward them to -stable.

I'm not sure that either of these is suitable for -stable.  There's no
demonstrated problem, nor even a likely theoretical one, is there?

If we do decide to backport, I don't think the -stable guys will want
the large-but-nice comment-adding patch, so both these patches would
need to be reworked for -stable usage.  The first patch does apply
successfully to mainline.  The second does not.

* Re: [PATCH 1/2] mempool: drop unnecessary and incorrect BUG_ON() from mempool_destroy()
From: Tejun Heo @ 2011-12-22 0:35 UTC
To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, gregkh

Hello,

On Wed, Dec 21, 2011 at 04:25:19PM -0800, Andrew Morton wrote:
> > Signed-off-by: Tejun Heo <tj@kernel.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: stable@kernel.org
>
> (that's stable@vger.kernel.org)

(cc'ing Greg)

It has been stable@kernel.org for quite a while and Greg scans for
that Cc.  Even MAINTAINERS has that as the official mail address.  I
heard that the mailing alias is broken at the moment, but wouldn't it
be better to fix that?

> > ---
> > These patches are on top of the "mempool: fix and document
> > synchronization and memory barrier usage" patch[1].  Both are fixes
> > and it is probably a good idea to forward them to -stable.
>
> I'm not sure that either of these is suitable for -stable.  There's no
> demonstrated problem, nor even a likely theoretical one, is there?
>
> If we do decide to backport, I don't think the -stable guys will want
> the large-but-nice comment-adding patch, so both these patches would
> need to be reworked for -stable usage.  The first patch does apply
> successfully to mainline.  The second does not.

Hmmm... I think it should be possible to trip the BUG_ON() removed by
the first patch with a targeted enough attack, but AFAICS that would
require root priv, so it might not be too bad.  The second one, while
not optimal, shouldn't be critical.

BTW, I missed sth in the second patch, will soon post an updated one.

Thanks.

--
tejun

* Re: [PATCH 1/2] mempool: drop unnecessary and incorrect BUG_ON() from mempool_destroy()
From: Greg KH @ 2011-12-22 0:40 UTC
To: Tejun Heo; +Cc: Andrew Morton, Linus Torvalds, linux-kernel

On Wed, Dec 21, 2011 at 04:35:07PM -0800, Tejun Heo wrote:
> It has been stable@kernel.org for quite a while and Greg scans for
> that Cc.  Even MAINTAINERS has that as the official mail address.  I
> heard that the mailing alias is broken at the moment, but wouldn't it
> be better to fix that?

I have a patch in one of my trees to update the MAINTAINERS file, which
will get backported to the older kernels as well; it's queued for 3.3,
as it's not really a big deal.

And the alias is being worked on; sometimes it works, others it
doesn't.  I'm not quite sure what is going on.

But either way, I will catch patches with cc: stable@vger.kernel.org or
stable@kernel.org when they hit Linus's tree, which is the important
thing, as that's the "real" path to get patches into stable releases.

thanks,

greg k-h