* [patch 1/6] md: remove dependency on __GFP_NOFAIL [not found] <alpine.DEB.2.00.1008161953430.17924@chino.kir.corp.google.com> @ 2010-08-17 2:57 ` David Rientjes 2010-08-23 19:26 ` Andrew Morton 0 siblings, 1 reply; 13+ messages in thread From: David Rientjes @ 2010-08-17 2:57 UTC (permalink / raw) To: Neil Brown, Alasdair G Kergon; +Cc: Andrew Morton, linux-raid, linux-kernel Removes the dependency on __GFP_NOFAIL by looping indefinitely in the caller. Signed-off-by: David Rientjes <rientjes@google.com> --- drivers/md/dm-region-hash.c | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm-region-hash.c b/drivers/md/dm-region-hash.c --- a/drivers/md/dm-region-hash.c +++ b/drivers/md/dm-region-hash.c @@ -289,8 +289,12 @@ static struct dm_region *__rh_alloc(struct dm_region_hash *rh, region_t region) struct dm_region *reg, *nreg; nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC); - if (unlikely(!nreg)) - nreg = kmalloc(sizeof(*nreg), GFP_NOIO | __GFP_NOFAIL); + if (unlikely(!nreg)) { + /* FIXME: this may potentially loop forever */ + do { + nreg = kmalloc(sizeof(*nreg), GFP_NOIO); + } while (!nreg); + } nreg->state = rh->log->type->in_sync(rh->log, region, 1) ? DM_RH_CLEAN : DM_RH_NOSYNC; ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-17 2:57 ` [patch 1/6] md: remove dependency on __GFP_NOFAIL David Rientjes @ 2010-08-23 19:26 ` Andrew Morton 2010-08-23 19:35 ` David Rientjes 0 siblings, 1 reply; 13+ messages in thread From: Andrew Morton @ 2010-08-23 19:26 UTC (permalink / raw) To: David Rientjes; +Cc: Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel On Mon, 16 Aug 2010 19:57:51 -0700 (PDT) David Rientjes <rientjes@google.com> wrote: > Removes the dependency on __GFP_NOFAIL by looping indefinitely in the > caller. > > Signed-off-by: David Rientjes <rientjes@google.com> > --- > drivers/md/dm-region-hash.c | 8 ++++++-- > 1 files changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/dm-region-hash.c b/drivers/md/dm-region-hash.c > --- a/drivers/md/dm-region-hash.c > +++ b/drivers/md/dm-region-hash.c > @@ -289,8 +289,12 @@ static struct dm_region *__rh_alloc(struct dm_region_hash *rh, region_t region) > struct dm_region *reg, *nreg; > > nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC); > - if (unlikely(!nreg)) > - nreg = kmalloc(sizeof(*nreg), GFP_NOIO | __GFP_NOFAIL); > + if (unlikely(!nreg)) { > + /* FIXME: this may potentially loop forever */ > + do { > + nreg = kmalloc(sizeof(*nreg), GFP_NOIO); > + } while (!nreg); > + } > > nreg->state = rh->log->type->in_sync(rh->log, region, 1) ? > DM_RH_CLEAN : DM_RH_NOSYNC; erm. The reason for adding GFP_NOFAIL in the first place was my observation that the kernel contained lots of open-coded retry-for-ever loops. All of these are wrong, bad, buggy and mustfix. So we consolidated the wrongbadbuggymustfix concept into the core MM so that miscreants could be easily identified and hopefully fixed. I think that simply undoing that change is a bad idea - it allows the wrongbadbuggymustfix code to hide from view. The correct way to remove __GFP_NOFAIL is to fix the wrongbadbuggymustfix code properly. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 19:26 ` Andrew Morton @ 2010-08-23 19:35 ` David Rientjes 2010-08-23 19:51 ` Andrew Morton 2010-08-23 20:01 ` Andrew Morton 0 siblings, 2 replies; 13+ messages in thread From: David Rientjes @ 2010-08-23 19:35 UTC (permalink / raw) To: Andrew Morton; +Cc: Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel On Mon, 23 Aug 2010, Andrew Morton wrote: > > diff --git a/drivers/md/dm-region-hash.c b/drivers/md/dm-region-hash.c > > --- a/drivers/md/dm-region-hash.c > > +++ b/drivers/md/dm-region-hash.c > > @@ -289,8 +289,12 @@ static struct dm_region *__rh_alloc(struct dm_region_hash *rh, region_t region) > > struct dm_region *reg, *nreg; > > > > nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC); > > - if (unlikely(!nreg)) > > - nreg = kmalloc(sizeof(*nreg), GFP_NOIO | __GFP_NOFAIL); > > + if (unlikely(!nreg)) { > > + /* FIXME: this may potentially loop forever */ > > + do { > > + nreg = kmalloc(sizeof(*nreg), GFP_NOIO); > > + } while (!nreg); > > + } > > > > nreg->state = rh->log->type->in_sync(rh->log, region, 1) ? > > DM_RH_CLEAN : DM_RH_NOSYNC; > > erm. > > The reason for adding GFP_NOFAIL in the first place was my observation > that the kernel contained lots of open-coded retry-for-ever loops. > > All of these are wrong, bad, buggy and mustfix. So we consolidated the > wrongbadbuggymustfix concept into the core MM so that miscreants could > be easily identified and hopefully fixed. > That consolidation would have been unnecessary, then, since all allocations with order < PAGE_ALLOC_COSTLY_ORDER automatically loop indefinitely in the page allocator. struct dm_region allocations would already do that. So this retry loop doesn't actually do anything that the page allocator already doesn't, with or without __GFP_NOFAIL. The difference here is that - it doesn't depend on the page allocator's implementation, which may change over time, and - it adds documentation so that the subsystems doing these loops can (hopefully) fix these problems later, although their appear to be geniune cases where little other options are available. > I think that simply undoing that change is a bad idea - it allows the > wrongbadbuggymustfix code to hide from view. > It removes several branches from the page allocator. > The correct way to remove __GFP_NOFAIL is to fix the > wrongbadbuggymustfix code properly. > If the prerequisite for removing __GFP_NOFAIL is that nobody must ever loop indefinitely looking for memory or smaller order allocations don't implicitly retry, then there's little chance it'll ever get removed since they've existed for years without anybody cleaning them up. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 19:35 ` David Rientjes @ 2010-08-23 19:51 ` Andrew Morton 2010-08-23 20:03 ` David Rientjes 2010-08-23 20:01 ` Andrew Morton 1 sibling, 1 reply; 13+ messages in thread From: Andrew Morton @ 2010-08-23 19:51 UTC (permalink / raw) To: David Rientjes; +Cc: Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel On Mon, 23 Aug 2010 12:35:22 -0700 (PDT) David Rientjes <rientjes@google.com> wrote: > On Mon, 23 Aug 2010, Andrew Morton wrote: > > > > diff --git a/drivers/md/dm-region-hash.c b/drivers/md/dm-region-hash.c > > > --- a/drivers/md/dm-region-hash.c > > > +++ b/drivers/md/dm-region-hash.c > > > @@ -289,8 +289,12 @@ static struct dm_region *__rh_alloc(struct dm_region_hash *rh, region_t region) > > > struct dm_region *reg, *nreg; > > > > > > nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC); > > > - if (unlikely(!nreg)) > > > - nreg = kmalloc(sizeof(*nreg), GFP_NOIO | __GFP_NOFAIL); > > > + if (unlikely(!nreg)) { > > > + /* FIXME: this may potentially loop forever */ > > > + do { > > > + nreg = kmalloc(sizeof(*nreg), GFP_NOIO); > > > + } while (!nreg); > > > + } > > > > > > nreg->state = rh->log->type->in_sync(rh->log, region, 1) ? > > > DM_RH_CLEAN : DM_RH_NOSYNC; > > > > erm. > > > > The reason for adding GFP_NOFAIL in the first place was my observation > > that the kernel contained lots of open-coded retry-for-ever loops. > > > > All of these are wrong, bad, buggy and mustfix. So we consolidated the > > wrongbadbuggymustfix concept into the core MM so that miscreants could > > be easily identified and hopefully fixed. > > > > That consolidation would have been unnecessary, then, since all > allocations with order < PAGE_ALLOC_COSTLY_ORDER automatically loop > indefinitely in the page allocator. The difference is that an order-0 !__GFP_NOFAIL allocation attempt can fail due to oom-killing. Unless someone broke that. > struct dm_region allocations would > already do that. > > So this retry loop doesn't actually do anything that the page allocator > already doesn't, with or without __GFP_NOFAIL. The difference here is > that > > - it doesn't depend on the page allocator's implementation, which may > change over time, and > > - it adds documentation so that the subsystems doing these loops can > (hopefully) fix these problems later, although their appear to be > geniune cases where little other options are available. > > > I think that simply undoing that change is a bad idea - it allows the > > wrongbadbuggymustfix code to hide from view. > > > > It removes several branches from the page allocator. None on the fast path. > > The correct way to remove __GFP_NOFAIL is to fix the > > wrongbadbuggymustfix code properly. > > > > If the prerequisite for removing __GFP_NOFAIL is that nobody must ever > loop indefinitely looking for memory or smaller order allocations don't > implicitly retry, then there's little chance it'll ever get removed since > they've existed for years without anybody cleaning them up. The JBD one is hard - I haven't looked at the others. We should fix them. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 19:51 ` Andrew Morton @ 2010-08-23 20:03 ` David Rientjes 0 siblings, 0 replies; 13+ messages in thread From: David Rientjes @ 2010-08-23 20:03 UTC (permalink / raw) To: Andrew Morton; +Cc: Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel On Mon, 23 Aug 2010, Andrew Morton wrote: > > > > diff --git a/drivers/md/dm-region-hash.c b/drivers/md/dm-region-hash.c > > > > --- a/drivers/md/dm-region-hash.c > > > > +++ b/drivers/md/dm-region-hash.c > > > > @@ -289,8 +289,12 @@ static struct dm_region *__rh_alloc(struct dm_region_hash *rh, region_t region) > > > > struct dm_region *reg, *nreg; > > > > > > > > nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC); > > > > - if (unlikely(!nreg)) > > > > - nreg = kmalloc(sizeof(*nreg), GFP_NOIO | __GFP_NOFAIL); > > > > + if (unlikely(!nreg)) { > > > > + /* FIXME: this may potentially loop forever */ > > > > + do { > > > > + nreg = kmalloc(sizeof(*nreg), GFP_NOIO); > > > > + } while (!nreg); > > > > + } > > > > > > > > nreg->state = rh->log->type->in_sync(rh->log, region, 1) ? > > > > DM_RH_CLEAN : DM_RH_NOSYNC; > > > > > > erm. > > > > > > The reason for adding GFP_NOFAIL in the first place was my observation > > > that the kernel contained lots of open-coded retry-for-ever loops. > > > > > > All of these are wrong, bad, buggy and mustfix. So we consolidated the > > > wrongbadbuggymustfix concept into the core MM so that miscreants could > > > be easily identified and hopefully fixed. > > > > > > > That consolidation would have been unnecessary, then, since all > > allocations with order < PAGE_ALLOC_COSTLY_ORDER automatically loop > > indefinitely in the page allocator. > > The difference is that an order-0 !__GFP_NOFAIL allocation attempt can > fail due to oom-killing. Unless someone broke that. > This is GFP_NOIO, which all the allocations in this patchset are (or GFP_NOFS), so the oom killer isn't involved. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 19:35 ` David Rientjes 2010-08-23 19:51 ` Andrew Morton @ 2010-08-23 20:01 ` Andrew Morton 2010-08-23 20:08 ` David Rientjes 2010-08-23 20:09 ` Pekka Enberg 1 sibling, 2 replies; 13+ messages in thread From: Andrew Morton @ 2010-08-23 20:01 UTC (permalink / raw) To: David Rientjes; +Cc: Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel Hows about you add a helper function void *[kmalloc|alloc_page]_retrying_forever_because_i_suck(lots of args) then convert the callsites to use that, then nuke __GFP_NOFAIL? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 20:01 ` Andrew Morton @ 2010-08-23 20:08 ` David Rientjes 2010-08-23 20:23 ` Andrew Morton 2010-08-23 20:09 ` Pekka Enberg 1 sibling, 1 reply; 13+ messages in thread From: David Rientjes @ 2010-08-23 20:08 UTC (permalink / raw) To: Andrew Morton; +Cc: Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel On Mon, 23 Aug 2010, Andrew Morton wrote: > Hows about you add a helper function > > void *[kmalloc|alloc_page]_retrying_forever_because_i_suck(lots of args) > > then convert the callsites to use that, then nuke __GFP_NOFAIL? > That would only serve as documentation of a caller that could potentially loop forever waiting for memory (which I did by adding "/* FIXME: this may potentially loop forever */") since all of the allocations in this patchset never loop in the code that was added, they already loop forever in the page allocator doing the same thing. The hope is that kswapd will eventually be able to free memory since direct reclaim will usually fail for GFP_NOFS and we simply need to wait long enough for there to be memory. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 20:08 ` David Rientjes @ 2010-08-23 20:23 ` Andrew Morton 2010-08-23 20:37 ` David Rientjes 0 siblings, 1 reply; 13+ messages in thread From: Andrew Morton @ 2010-08-23 20:23 UTC (permalink / raw) To: David Rientjes; +Cc: Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel On Mon, 23 Aug 2010 13:08:53 -0700 (PDT) David Rientjes <rientjes@google.com> wrote: > On Mon, 23 Aug 2010, Andrew Morton wrote: > > > Hows about you add a helper function > > > > void *[kmalloc|alloc_page]_retrying_forever_because_i_suck(lots of args) > > > > then convert the callsites to use that, then nuke __GFP_NOFAIL? > > > > That would only serve as documentation Is that bad? > of a caller that could potentially > loop forever waiting for memory (which I did by adding "/* FIXME: this may > potentially loop forever */") A helper function could check that appropriate gfp flags are being set. > since all of the allocations in this > patchset never loop in the code that was added, they already loop forever > in the page allocator doing the same thing. The hope is that kswapd will > eventually be able to free memory since direct reclaim will usually fail > for GFP_NOFS and we simply need to wait long enough for there to be > memory. While holding locks which will prevent kswapd from doing anything useful. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 20:23 ` Andrew Morton @ 2010-08-23 20:37 ` David Rientjes 0 siblings, 0 replies; 13+ messages in thread From: David Rientjes @ 2010-08-23 20:37 UTC (permalink / raw) To: Andrew Morton; +Cc: Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel On Mon, 23 Aug 2010, Andrew Morton wrote: > > > Hows about you add a helper function > > > > > > void *[kmalloc|alloc_page]_retrying_forever_because_i_suck(lots of args) > > > > > > then convert the callsites to use that, then nuke __GFP_NOFAIL? > > > > > > > That would only serve as documentation > > Is that bad? > It implies that these calls are special in some way, basically to mean we have no sane error handling for failures, when in reality this is only a function of the allocation order when the context does not have __GFP_FS. You may as well just add BUG_ON(!nreg) here instead. I thought my "/* FIXME: this may potentially loop forever */" was sufficient to mean that the allocation must succeed independent of the page allocator's implementation, but perhaps you find greping for [kmalloc|alloc_page.*]_nofail easier? I'm not sure what flags such a function would be checking for since the page allocator determines whether the oom killer is called or not. If current is killed, an order-0 allocation will succeed because of TIF_MEMDIE, and if another task is killed, we must still loop waiting for the free memory even with __GFP_FS. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 20:01 ` Andrew Morton 2010-08-23 20:08 ` David Rientjes @ 2010-08-23 20:09 ` Pekka Enberg 2010-08-23 20:13 ` David Rientjes 1 sibling, 1 reply; 13+ messages in thread From: Pekka Enberg @ 2010-08-23 20:09 UTC (permalink / raw) To: Andrew Morton Cc: David Rientjes, Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel On Mon, Aug 23, 2010 at 11:01 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > Hows about you add a helper function > > void *[kmalloc|alloc_page]_retrying_forever_because_i_suck(lots of args) > > then convert the callsites to use that, then nuke __GFP_NOFAIL? I'd prefer killing off __GFP_NOFAIL properly :-) -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 20:09 ` Pekka Enberg @ 2010-08-23 20:13 ` David Rientjes 2010-08-23 20:29 ` Pekka Enberg 0 siblings, 1 reply; 13+ messages in thread From: David Rientjes @ 2010-08-23 20:13 UTC (permalink / raw) To: Pekka Enberg Cc: Andrew Morton, Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel [-- Attachment #1: Type: TEXT/PLAIN, Size: 706 bytes --] On Mon, 23 Aug 2010, Pekka Enberg wrote: > > Hows about you add a helper function > > > > void *[kmalloc|alloc_page]_retrying_forever_because_i_suck(lots of args) > > > > then convert the callsites to use that, then nuke __GFP_NOFAIL? > > I'd prefer killing off __GFP_NOFAIL properly :-) > And how is this not done properly if it's not even needed for any of the allocations in this patchset since the page allocator loops forever for their orders and context? (This is why we don't need to add __GFP_NOWARN in its place.) This is a cleanup patchset to remove the unnecessary use of __GFP_NOFAIL, there _are_ GFP_KERNEL | __GFP_NOFAIL allocations that need to be addressed in phase three. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 20:13 ` David Rientjes @ 2010-08-23 20:29 ` Pekka Enberg 2010-08-23 20:40 ` David Rientjes 0 siblings, 1 reply; 13+ messages in thread From: Pekka Enberg @ 2010-08-23 20:29 UTC (permalink / raw) To: David Rientjes Cc: Andrew Morton, Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel Hi David, On Mon, 23 Aug 2010, Pekka Enberg wrote: >> > Hows about you add a helper function >> > >> > void *[kmalloc|alloc_page]_retrying_forever_because_i_suck(lots of args) >> > >> > then convert the callsites to use that, then nuke __GFP_NOFAIL? >> >> I'd prefer killing off __GFP_NOFAIL properly :-) On Mon, Aug 23, 2010 at 11:13 PM, David Rientjes <rientjes@google.com> wrote: > And how is this not done properly if it's not even needed for any of the > allocations in this patchset since the page allocator loops forever for > their orders and context? (This is why we don't need to add __GFP_NOWARN > in its place.) > > This is a cleanup patchset to remove the unnecessary use of __GFP_NOFAIL, > there _are_ GFP_KERNEL | __GFP_NOFAIL allocations that need to be > addressed in phase three. My point is that I don't think Andrew's helper will change all that much. Fixing the actual callers is the hard part and I don't see your patches helping that either. Hiding the looping in filesystem code is only going to make problematic callers harder to find (e.g kmem_alloc() in XFS code). FWIW, I'd just add a nasty WARN_ON in the page allocator and put a big fat "fix your shit" comment on top of it to motivate people to fix their code. Pekka ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL 2010-08-23 20:29 ` Pekka Enberg @ 2010-08-23 20:40 ` David Rientjes 0 siblings, 0 replies; 13+ messages in thread From: David Rientjes @ 2010-08-23 20:40 UTC (permalink / raw) To: Pekka Enberg Cc: Andrew Morton, Neil Brown, Alasdair G Kergon, linux-raid, linux-kernel On Mon, 23 Aug 2010, Pekka Enberg wrote: > My point is that I don't think Andrew's helper will change all that > much. Fixing the actual callers is the hard part and I don't see your > patches helping that either. Hiding the looping in filesystem code is > only going to make problematic callers harder to find (e.g > kmem_alloc() in XFS code). > Nothing is getting hidden in the callers here, all of the patches in this series are using __GFP_NOFAIL unnecessarily. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2010-08-23 20:40 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <alpine.DEB.2.00.1008161953430.17924@chino.kir.corp.google.com>
2010-08-17 2:57 ` [patch 1/6] md: remove dependency on __GFP_NOFAIL David Rientjes
2010-08-23 19:26 ` Andrew Morton
2010-08-23 19:35 ` David Rientjes
2010-08-23 19:51 ` Andrew Morton
2010-08-23 20:03 ` David Rientjes
2010-08-23 20:01 ` Andrew Morton
2010-08-23 20:08 ` David Rientjes
2010-08-23 20:23 ` Andrew Morton
2010-08-23 20:37 ` David Rientjes
2010-08-23 20:09 ` Pekka Enberg
2010-08-23 20:13 ` David Rientjes
2010-08-23 20:29 ` Pekka Enberg
2010-08-23 20:40 ` David Rientjes
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).