* [patch] mm, page_alloc: make __GFP_NOFAIL really not fail
@ 2013-12-09 21:56 ` David Rientjes
0 siblings, 0 replies; 18+ messages in thread
From: David Rientjes @ 2013-12-09 21:56 UTC (permalink / raw)
To: Andrew Morton; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm
__GFP_NOFAIL specifies that the page allocator cannot fail to return
memory. Allocators that call it may not even check for NULL upon
returning.
It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can
actually return NULL. More interestingly, processes that are doing
direct reclaim and have PF_MEMALLOC set may also return NULL for any
__GFP_NOFAIL allocation.
This patch fixes it so that the page allocator never actually returns
NULL as expected for __GFP_NOFAIL. It turns out that no code actually
does anything as crazy as GFP_ATOMIC | __GFP_NOFAIL currently, so this
is more for correctness than a bug fix for that issue.
Signed-off-by: David Rientjes <rientjes@google.com>
---
mm/page_alloc.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2535,17 +2535,19 @@ rebalance:
}
}
- /* Atomic allocations - we can't balance anything */
- if (!wait)
- goto nopage;
-
- /* Avoid recursion of direct reclaim */
- if (current->flags & PF_MEMALLOC)
- goto nopage;
-
- /* Avoid allocations with no watermarks from looping endlessly */
- if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
- goto nopage;
+ if (likely(!(gfp_mask & __GFP_NOFAIL))) {
+ /* Atomic allocations - we can't balance anything */
+ if (!wait)
+ goto nopage;
+
+ /* Avoid recursion of direct reclaim */
+ if (current->flags & PF_MEMALLOC)
+ goto nopage;
+
+ /* Avoid allocations with no watermarks from looping forever */
+ if (test_thread_flag(TIF_MEMDIE))
+ goto nopage;
+ }
/*
* Try direct compaction. The first pass is asynchronous. Subsequent
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 18+ messages in thread* [patch] mm, page_alloc: make __GFP_NOFAIL really not fail @ 2013-12-09 21:56 ` David Rientjes 0 siblings, 0 replies; 18+ messages in thread From: David Rientjes @ 2013-12-09 21:56 UTC (permalink / raw) To: Andrew Morton; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm __GFP_NOFAIL specifies that the page allocator cannot fail to return memory. Allocators that call it may not even check for NULL upon returning. It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can actually return NULL. More interestingly, processes that are doing direct reclaim and have PF_MEMALLOC set may also return NULL for any __GFP_NOFAIL allocation. This patch fixes it so that the page allocator never actually returns NULL as expected for __GFP_NOFAIL. It turns out that no code actually does anything as crazy as GFP_ATOMIC | __GFP_NOFAIL currently, so this is more for correctness than a bug fix for that issue. Signed-off-by: David Rientjes <rientjes@google.com> --- mm/page_alloc.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2535,17 +2535,19 @@ rebalance: } } - /* Atomic allocations - we can't balance anything */ - if (!wait) - goto nopage; - - /* Avoid recursion of direct reclaim */ - if (current->flags & PF_MEMALLOC) - goto nopage; - - /* Avoid allocations with no watermarks from looping endlessly */ - if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL)) - goto nopage; + if (likely(!(gfp_mask & __GFP_NOFAIL))) { + /* Atomic allocations - we can't balance anything */ + if (!wait) + goto nopage; + + /* Avoid recursion of direct reclaim */ + if (current->flags & PF_MEMALLOC) + goto nopage; + + /* Avoid allocations with no watermarks from looping forever */ + if (test_thread_flag(TIF_MEMDIE)) + goto nopage; + } /* * Try direct compaction. The first pass is asynchronous. Subsequent ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail 2013-12-09 21:56 ` David Rientjes @ 2013-12-09 23:22 ` Andrew Morton -1 siblings, 0 replies; 18+ messages in thread From: Andrew Morton @ 2013-12-09 23:22 UTC (permalink / raw) To: David Rientjes; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Mon, 9 Dec 2013 13:56:37 -0800 (PST) David Rientjes <rientjes@google.com> wrote: > __GFP_NOFAIL specifies that the page allocator cannot fail to return > memory. Allocators that call it may not even check for NULL upon > returning. > > It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can > actually return NULL. More interestingly, processes that are doing > direct reclaim and have PF_MEMALLOC set may also return NULL for any > __GFP_NOFAIL allocation. __GFP_NOFAIL is a nasty thing and making it pretend to work even better is heading in the wrong direction, surely? It would be saner to just disallow these even-sillier combinations. Can we fix up the current callers then stick a WARN_ON() in there? > This patch fixes it so that the page allocator never actually returns > NULL as expected for __GFP_NOFAIL. It turns out that no code actually > does anything as crazy as GFP_ATOMIC | __GFP_NOFAIL currently, so this > is more for correctness than a bug fix for that issue. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail @ 2013-12-09 23:22 ` Andrew Morton 0 siblings, 0 replies; 18+ messages in thread From: Andrew Morton @ 2013-12-09 23:22 UTC (permalink / raw) To: David Rientjes; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Mon, 9 Dec 2013 13:56:37 -0800 (PST) David Rientjes <rientjes@google.com> wrote: > __GFP_NOFAIL specifies that the page allocator cannot fail to return > memory. Allocators that call it may not even check for NULL upon > returning. > > It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can > actually return NULL. More interestingly, processes that are doing > direct reclaim and have PF_MEMALLOC set may also return NULL for any > __GFP_NOFAIL allocation. __GFP_NOFAIL is a nasty thing and making it pretend to work even better is heading in the wrong direction, surely? It would be saner to just disallow these even-sillier combinations. Can we fix up the current callers then stick a WARN_ON() in there? > This patch fixes it so that the page allocator never actually returns > NULL as expected for __GFP_NOFAIL. It turns out that no code actually > does anything as crazy as GFP_ATOMIC | __GFP_NOFAIL currently, so this > is more for correctness than a bug fix for that issue. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail 2013-12-09 23:22 ` Andrew Morton @ 2013-12-10 23:20 ` David Rientjes -1 siblings, 0 replies; 18+ messages in thread From: David Rientjes @ 2013-12-10 23:20 UTC (permalink / raw) To: Andrew Morton; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Mon, 9 Dec 2013, Andrew Morton wrote: > > __GFP_NOFAIL specifies that the page allocator cannot fail to return > > memory. Allocators that call it may not even check for NULL upon > > returning. > > > > It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can > > actually return NULL. More interestingly, processes that are doing > > direct reclaim and have PF_MEMALLOC set may also return NULL for any > > __GFP_NOFAIL allocation. > > __GFP_NOFAIL is a nasty thing and making it pretend to work even better > is heading in the wrong direction, surely? It would be saner to just > disallow these even-sillier combinations. Can we fix up the current > callers then stick a WARN_ON() in there? > Heh, it's difficult to remove __GFP_NOFAIL when new users get added: 84235de394d9 ("fs: buffer: move allocation failure loop into the allocator") added a new user and a bypass of memcg limits in oom conditions so __GFP_NOFAIL just essentially became __GFP_BYPASS_MEMCG_LIMIT_ON_OOM. We can probably ignore the PF_MEMALLOC behavior since it allows full access to memory reserves and the only time we would see a __GFP_NOFAIL allocation fail in such a context is if every zone's free memory was 0. We have bigger problems if memory reserves are completely depleted like that, so it's probably sufficient not to address it. I'd be concerned about new users of __GFP_NOFAIL that are added for GFP_NOWAIT or GFP_ATOMIC and never actually trigger such a warning because in testing they never trigger the slowpath, but the conditional is probably better placed outside of the fastpath: diff --git a/mm/page_alloc.c b/mm/page_alloc.c --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2536,8 +2536,15 @@ rebalance: } /* Atomic allocations - we can't balance anything */ - if (!wait) + if (!wait) { + /* + * All existing users of the deprecated __GFP_NOFAIL are + * blockable, so warn of any new users that actually allow this + * type of allocation to fail. + */ + WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL); goto nopage; + } /* Avoid recursion of direct reclaim */ if (current->flags & PF_MEMALLOC) But perhaps the best way to do this in a preventative way is to add a warning to checkpatch.pl that actually warns about adding new users. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail @ 2013-12-10 23:20 ` David Rientjes 0 siblings, 0 replies; 18+ messages in thread From: David Rientjes @ 2013-12-10 23:20 UTC (permalink / raw) To: Andrew Morton; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Mon, 9 Dec 2013, Andrew Morton wrote: > > __GFP_NOFAIL specifies that the page allocator cannot fail to return > > memory. Allocators that call it may not even check for NULL upon > > returning. > > > > It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can > > actually return NULL. More interestingly, processes that are doing > > direct reclaim and have PF_MEMALLOC set may also return NULL for any > > __GFP_NOFAIL allocation. > > __GFP_NOFAIL is a nasty thing and making it pretend to work even better > is heading in the wrong direction, surely? It would be saner to just > disallow these even-sillier combinations. Can we fix up the current > callers then stick a WARN_ON() in there? > Heh, it's difficult to remove __GFP_NOFAIL when new users get added: 84235de394d9 ("fs: buffer: move allocation failure loop into the allocator") added a new user and a bypass of memcg limits in oom conditions so __GFP_NOFAIL just essentially became __GFP_BYPASS_MEMCG_LIMIT_ON_OOM. We can probably ignore the PF_MEMALLOC behavior since it allows full access to memory reserves and the only time we would see a __GFP_NOFAIL allocation fail in such a context is if every zone's free memory was 0. We have bigger problems if memory reserves are completely depleted like that, so it's probably sufficient not to address it. I'd be concerned about new users of __GFP_NOFAIL that are added for GFP_NOWAIT or GFP_ATOMIC and never actually trigger such a warning because in testing they never trigger the slowpath, but the conditional is probably better placed outside of the fastpath: diff --git a/mm/page_alloc.c b/mm/page_alloc.c --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2536,8 +2536,15 @@ rebalance: } /* Atomic allocations - we can't balance anything */ - if (!wait) + if (!wait) { + /* + * All existing users of the deprecated __GFP_NOFAIL are + * blockable, so warn of any new users that actually allow this + * type of allocation to fail. + */ + WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL); goto nopage; + } /* Avoid recursion of direct reclaim */ if (current->flags & PF_MEMALLOC) But perhaps the best way to do this in a preventative way is to add a warning to checkpatch.pl that actually warns about adding new users. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail 2013-12-10 23:20 ` David Rientjes @ 2013-12-10 23:39 ` Andrew Morton -1 siblings, 0 replies; 18+ messages in thread From: Andrew Morton @ 2013-12-10 23:39 UTC (permalink / raw) To: David Rientjes; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Tue, 10 Dec 2013 15:20:17 -0800 (PST) David Rientjes <rientjes@google.com> wrote: > On Mon, 9 Dec 2013, Andrew Morton wrote: > > > > __GFP_NOFAIL specifies that the page allocator cannot fail to return > > > memory. Allocators that call it may not even check for NULL upon > > > returning. > > > > > > It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can > > > actually return NULL. More interestingly, processes that are doing > > > direct reclaim and have PF_MEMALLOC set may also return NULL for any > > > __GFP_NOFAIL allocation. > > > > __GFP_NOFAIL is a nasty thing and making it pretend to work even better > > is heading in the wrong direction, surely? It would be saner to just > > disallow these even-sillier combinations. Can we fix up the current > > callers then stick a WARN_ON() in there? > > > > Heh, it's difficult to remove __GFP_NOFAIL when new users get added: > 84235de394d9 ("fs: buffer: move allocation failure loop into the > allocator") added a new user That wasn't reeeeealy a new user - it was "convert an existing open-coded retry-for-ever loop". Which is what __GFP_NOFAIL is for. I don't think I've ever seen anyone actually fix one of these things (by teaching the caller to handle ENOMEM), so it obviously isn't working... > and a bypass of memcg limits in oom > conditions so __GFP_NOFAIL just essentially became > __GFP_BYPASS_MEMCG_LIMIT_ON_OOM. > > We can probably ignore the PF_MEMALLOC behavior since it allows full > access to memory reserves and the only time we would see a __GFP_NOFAIL > allocation fail in such a context is if every zone's free memory was 0. > We have bigger problems if memory reserves are completely depleted like > that, so it's probably sufficient not to address it. > > I'd be concerned about new users of __GFP_NOFAIL that are added for > GFP_NOWAIT or GFP_ATOMIC and never actually trigger such a warning because > in testing they never trigger the slowpath, but the conditional is > probably better placed outside of the fastpath: > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -2536,8 +2536,15 @@ rebalance: > } > > /* Atomic allocations - we can't balance anything */ > - if (!wait) > + if (!wait) { > + /* > + * All existing users of the deprecated __GFP_NOFAIL are > + * blockable, so warn of any new users that actually allow this > + * type of allocation to fail. > + */ > + WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL); > goto nopage; > + } Seems sensible. > But perhaps the best way to do this in a preventative way is to add a > warning to checkpatch.pl that actually warns about adding new users. yup. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail @ 2013-12-10 23:39 ` Andrew Morton 0 siblings, 0 replies; 18+ messages in thread From: Andrew Morton @ 2013-12-10 23:39 UTC (permalink / raw) To: David Rientjes; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Tue, 10 Dec 2013 15:20:17 -0800 (PST) David Rientjes <rientjes@google.com> wrote: > On Mon, 9 Dec 2013, Andrew Morton wrote: > > > > __GFP_NOFAIL specifies that the page allocator cannot fail to return > > > memory. Allocators that call it may not even check for NULL upon > > > returning. > > > > > > It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can > > > actually return NULL. More interestingly, processes that are doing > > > direct reclaim and have PF_MEMALLOC set may also return NULL for any > > > __GFP_NOFAIL allocation. > > > > __GFP_NOFAIL is a nasty thing and making it pretend to work even better > > is heading in the wrong direction, surely? It would be saner to just > > disallow these even-sillier combinations. Can we fix up the current > > callers then stick a WARN_ON() in there? > > > > Heh, it's difficult to remove __GFP_NOFAIL when new users get added: > 84235de394d9 ("fs: buffer: move allocation failure loop into the > allocator") added a new user That wasn't reeeeealy a new user - it was "convert an existing open-coded retry-for-ever loop". Which is what __GFP_NOFAIL is for. I don't think I've ever seen anyone actually fix one of these things (by teaching the caller to handle ENOMEM), so it obviously isn't working... > and a bypass of memcg limits in oom > conditions so __GFP_NOFAIL just essentially became > __GFP_BYPASS_MEMCG_LIMIT_ON_OOM. > > We can probably ignore the PF_MEMALLOC behavior since it allows full > access to memory reserves and the only time we would see a __GFP_NOFAIL > allocation fail in such a context is if every zone's free memory was 0. > We have bigger problems if memory reserves are completely depleted like > that, so it's probably sufficient not to address it. > > I'd be concerned about new users of __GFP_NOFAIL that are added for > GFP_NOWAIT or GFP_ATOMIC and never actually trigger such a warning because > in testing they never trigger the slowpath, but the conditional is > probably better placed outside of the fastpath: > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -2536,8 +2536,15 @@ rebalance: > } > > /* Atomic allocations - we can't balance anything */ > - if (!wait) > + if (!wait) { > + /* > + * All existing users of the deprecated __GFP_NOFAIL are > + * blockable, so warn of any new users that actually allow this > + * type of allocation to fail. > + */ > + WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL); > goto nopage; > + } Seems sensible. > But perhaps the best way to do this in a preventative way is to add a > warning to checkpatch.pl that actually warns about adding new users. yup. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail 2013-12-10 23:39 ` Andrew Morton @ 2013-12-11 0:11 ` David Rientjes -1 siblings, 0 replies; 18+ messages in thread From: David Rientjes @ 2013-12-11 0:11 UTC (permalink / raw) To: Andrew Morton; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Tue, 10 Dec 2013, Andrew Morton wrote: > > Heh, it's difficult to remove __GFP_NOFAIL when new users get added: > > 84235de394d9 ("fs: buffer: move allocation failure loop into the > > allocator") added a new user > > That wasn't reeeeealy a new user - it was "convert an existing > open-coded retry-for-ever loop". Which is what __GFP_NOFAIL is for. > No, it just looks like that's what it did. find_or_create_page() in that function does an order-0 allocation which always implicitly __GFP_NOFAIL because of the should_alloc_retry() behavior. So why does it need to add __GFP_NOFAIL there now? Because it is now allowed to bypass memcg limits to the root memcg, which is new behavior with the patch. It adds additional memcg powers that can't be duplicated in the caller, so now it's really become __GFP_BYPASS_MEMCG_LIMIT_ON_OOM for everything that was doing order-3 or smaller allocations, which should be all existing __GFP_NOFAIL users. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail @ 2013-12-11 0:11 ` David Rientjes 0 siblings, 0 replies; 18+ messages in thread From: David Rientjes @ 2013-12-11 0:11 UTC (permalink / raw) To: Andrew Morton; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Tue, 10 Dec 2013, Andrew Morton wrote: > > Heh, it's difficult to remove __GFP_NOFAIL when new users get added: > > 84235de394d9 ("fs: buffer: move allocation failure loop into the > > allocator") added a new user > > That wasn't reeeeealy a new user - it was "convert an existing > open-coded retry-for-ever loop". Which is what __GFP_NOFAIL is for. > No, it just looks like that's what it did. find_or_create_page() in that function does an order-0 allocation which always implicitly __GFP_NOFAIL because of the should_alloc_retry() behavior. So why does it need to add __GFP_NOFAIL there now? Because it is now allowed to bypass memcg limits to the root memcg, which is new behavior with the patch. It adds additional memcg powers that can't be duplicated in the caller, so now it's really become __GFP_BYPASS_MEMCG_LIMIT_ON_OOM for everything that was doing order-3 or smaller allocations, which should be all existing __GFP_NOFAIL users. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail 2013-12-10 23:39 ` Andrew Morton @ 2013-12-12 1:07 ` Dave Chinner -1 siblings, 0 replies; 18+ messages in thread From: Dave Chinner @ 2013-12-12 1:07 UTC (permalink / raw) To: Andrew Morton Cc: David Rientjes, Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Tue, Dec 10, 2013 at 03:39:09PM -0800, Andrew Morton wrote: > On Tue, 10 Dec 2013 15:20:17 -0800 (PST) David Rientjes <rientjes@google.com> wrote: > > > On Mon, 9 Dec 2013, Andrew Morton wrote: > > > > > > __GFP_NOFAIL specifies that the page allocator cannot fail to return > > > > memory. Allocators that call it may not even check for NULL upon > > > > returning. > > > > > > > > It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can > > > > actually return NULL. More interestingly, processes that are doing > > > > direct reclaim and have PF_MEMALLOC set may also return NULL for any > > > > __GFP_NOFAIL allocation. > > > > > > __GFP_NOFAIL is a nasty thing and making it pretend to work even better > > > is heading in the wrong direction, surely? It would be saner to just > > > disallow these even-sillier combinations. Can we fix up the current > > > callers then stick a WARN_ON() in there? > > > > > > > Heh, it's difficult to remove __GFP_NOFAIL when new users get added: > > 84235de394d9 ("fs: buffer: move allocation failure loop into the > > allocator") added a new user > > That wasn't reeeeealy a new user - it was "convert an existing > open-coded retry-for-ever loop". Which is what __GFP_NOFAIL is for. > > I don't think I've ever seen anyone actually fix one of these things > (by teaching the caller to handle ENOMEM), so it obviously isn't > working... Right, because most of the loops are deep within filesystem transaction code where the only thing to do with a memory allocation failure is to abort the transaction, shutdown the filesystem and deny user access (i.e. DOS the system) because the filesystem is inconsistent in memory and the only way it can be recovered is toosing everything in memory away and recovering the last valid on disk state from the journal. i.e. umount, mount. IOWs, the "fix" is far worse than current behaviour and so there is absolutely no motivation for the people who own these __GFP_NOFAIL allocations to fix them. Indeed, when you consider that the amount of work to fix the filesystems to robustly handle ENOMEM is a *massive* undertaking that adds significant overhead and complexity to each filesystem, the cost/benefit analysis comes down so far on the side of "just use __GFP_NOFAIL" that doing anything else is sheer lunacy. Cheers, Dave. -- Dave Chinner david@fromorbit.com -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] mm, page_alloc: make __GFP_NOFAIL really not fail @ 2013-12-12 1:07 ` Dave Chinner 0 siblings, 0 replies; 18+ messages in thread From: Dave Chinner @ 2013-12-12 1:07 UTC (permalink / raw) To: Andrew Morton Cc: David Rientjes, Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Tue, Dec 10, 2013 at 03:39:09PM -0800, Andrew Morton wrote: > On Tue, 10 Dec 2013 15:20:17 -0800 (PST) David Rientjes <rientjes@google.com> wrote: > > > On Mon, 9 Dec 2013, Andrew Morton wrote: > > > > > > __GFP_NOFAIL specifies that the page allocator cannot fail to return > > > > memory. Allocators that call it may not even check for NULL upon > > > > returning. > > > > > > > > It turns out GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL can > > > > actually return NULL. More interestingly, processes that are doing > > > > direct reclaim and have PF_MEMALLOC set may also return NULL for any > > > > __GFP_NOFAIL allocation. > > > > > > __GFP_NOFAIL is a nasty thing and making it pretend to work even better > > > is heading in the wrong direction, surely? It would be saner to just > > > disallow these even-sillier combinations. Can we fix up the current > > > callers then stick a WARN_ON() in there? > > > > > > > Heh, it's difficult to remove __GFP_NOFAIL when new users get added: > > 84235de394d9 ("fs: buffer: move allocation failure loop into the > > allocator") added a new user > > That wasn't reeeeealy a new user - it was "convert an existing > open-coded retry-for-ever loop". Which is what __GFP_NOFAIL is for. > > I don't think I've ever seen anyone actually fix one of these things > (by teaching the caller to handle ENOMEM), so it obviously isn't > working... Right, because most of the loops are deep within filesystem transaction code where the only thing to do with a memory allocation failure is to abort the transaction, shutdown the filesystem and deny user access (i.e. DOS the system) because the filesystem is inconsistent in memory and the only way it can be recovered is toosing everything in memory away and recovering the last valid on disk state from the journal. i.e. umount, mount. IOWs, the "fix" is far worse than current behaviour and so there is absolutely no motivation for the people who own these __GFP_NOFAIL allocations to fix them. Indeed, when you consider that the amount of work to fix the filesystems to robustly handle ENOMEM is a *massive* undertaking that adds significant overhead and complexity to each filesystem, the cost/benefit analysis comes down so far on the side of "just use __GFP_NOFAIL" that doing anything else is sheer lunacy. Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 18+ messages in thread
* [patch alternative] mm, page_alloc: warn for non-blockable __GFP_NOFAIL allocation failure 2013-12-10 23:20 ` David Rientjes @ 2013-12-11 0:19 ` David Rientjes -1 siblings, 0 replies; 18+ messages in thread From: David Rientjes @ 2013-12-11 0:19 UTC (permalink / raw) To: Andrew Morton; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm __GFP_NOFAIL may return NULL when coupled with GFP_NOWAIT or GFP_ATOMIC. Luckily, nothing currently does such craziness. So instead of causing such allocations to loop (potentially forever), we maintain the current behavior and also warn about the new users of the deprecated flag. Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> --- mm/page_alloc.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2536,8 +2536,15 @@ rebalance: } /* Atomic allocations - we can't balance anything */ - if (!wait) + if (!wait) { + /* + * All existing users of the deprecated __GFP_NOFAIL are + * blockable, so warn of any new users that actually allow this + * type of allocation to fail. + */ + WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL); goto nopage; + } /* Avoid recursion of direct reclaim */ if (current->flags & PF_MEMALLOC) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* [patch alternative] mm, page_alloc: warn for non-blockable __GFP_NOFAIL allocation failure @ 2013-12-11 0:19 ` David Rientjes 0 siblings, 0 replies; 18+ messages in thread From: David Rientjes @ 2013-12-11 0:19 UTC (permalink / raw) To: Andrew Morton; +Cc: Mel Gorman, Michal Hocko, linux-kernel, linux-mm __GFP_NOFAIL may return NULL when coupled with GFP_NOWAIT or GFP_ATOMIC. Luckily, nothing currently does such craziness. So instead of causing such allocations to loop (potentially forever), we maintain the current behavior and also warn about the new users of the deprecated flag. Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> --- mm/page_alloc.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2536,8 +2536,15 @@ rebalance: } /* Atomic allocations - we can't balance anything */ - if (!wait) + if (!wait) { + /* + * All existing users of the deprecated __GFP_NOFAIL are + * blockable, so warn of any new users that actually allow this + * type of allocation to fail. + */ + WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL); goto nopage; + } /* Avoid recursion of direct reclaim */ if (current->flags & PF_MEMALLOC) ^ permalink raw reply [flat|nested] 18+ messages in thread
* [patch] checkpatch: add warning of future __GFP_NOFAIL use 2013-12-11 0:19 ` David Rientjes @ 2013-12-11 0:26 ` David Rientjes -1 siblings, 0 replies; 18+ messages in thread From: David Rientjes @ 2013-12-11 0:26 UTC (permalink / raw) To: Andrew Morton Cc: Andy Whitcroft, Joe Perches, Mel Gorman, Michal Hocko, linux-kernel, linux-mm gfp.h and page_alloc.c already specify that __GFP_NOFAIL is deprecated and no new users should be added. Add a warning to checkpatch to catch this. Signed-off-by: David Rientjes <rientjes@google.com> --- scripts/checkpatch.pl | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 9c98100..6667689 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -4114,6 +4114,12 @@ sub process { "$1 uses number as first arg, sizeof is generally wrong\n" . $herecurr); } +# check for GFP_NOWAIT use + if ($line =~ /\b__GFP_NOFAIL\b/) { + WARN("__GFP_NOFAIL", + "Use of __GFP_NOFAIL is deprecated, no new users should be added\n" . $herecurr); + } + # check for multiple semicolons if ($line =~ /;\s*;\s*$/) { if (WARN("ONE_SEMICOLON", -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [patch] checkpatch: add warning of future __GFP_NOFAIL use @ 2013-12-11 0:26 ` David Rientjes 0 siblings, 0 replies; 18+ messages in thread From: David Rientjes @ 2013-12-11 0:26 UTC (permalink / raw) To: Andrew Morton Cc: Andy Whitcroft, Joe Perches, Mel Gorman, Michal Hocko, linux-kernel, linux-mm gfp.h and page_alloc.c already specify that __GFP_NOFAIL is deprecated and no new users should be added. Add a warning to checkpatch to catch this. Signed-off-by: David Rientjes <rientjes@google.com> --- scripts/checkpatch.pl | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 9c98100..6667689 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -4114,6 +4114,12 @@ sub process { "$1 uses number as first arg, sizeof is generally wrong\n" . $herecurr); } +# check for GFP_NOWAIT use + if ($line =~ /\b__GFP_NOFAIL\b/) { + WARN("__GFP_NOFAIL", + "Use of __GFP_NOFAIL is deprecated, no new users should be added\n" . $herecurr); + } + # check for multiple semicolons if ($line =~ /;\s*;\s*$/) { if (WARN("ONE_SEMICOLON", ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [patch] checkpatch: add warning of future __GFP_NOFAIL use 2013-12-11 0:26 ` David Rientjes @ 2013-12-11 1:35 ` Joe Perches -1 siblings, 0 replies; 18+ messages in thread From: Joe Perches @ 2013-12-11 1:35 UTC (permalink / raw) To: David Rientjes Cc: Andrew Morton, Andy Whitcroft, Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Tue, 2013-12-10 at 16:26 -0800, David Rientjes wrote: > gfp.h and page_alloc.c already specify that __GFP_NOFAIL is deprecated and > no new users should be added. > > Add a warning to checkpatch to catch this. Fine by me. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [patch] checkpatch: add warning of future __GFP_NOFAIL use @ 2013-12-11 1:35 ` Joe Perches 0 siblings, 0 replies; 18+ messages in thread From: Joe Perches @ 2013-12-11 1:35 UTC (permalink / raw) To: David Rientjes Cc: Andrew Morton, Andy Whitcroft, Mel Gorman, Michal Hocko, linux-kernel, linux-mm On Tue, 2013-12-10 at 16:26 -0800, David Rientjes wrote: > gfp.h and page_alloc.c already specify that __GFP_NOFAIL is deprecated and > no new users should be added. > > Add a warning to checkpatch to catch this. Fine by me. ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2013-12-12 1:08 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-12-09 21:56 [patch] mm, page_alloc: make __GFP_NOFAIL really not fail David Rientjes 2013-12-09 21:56 ` David Rientjes 2013-12-09 23:22 ` Andrew Morton 2013-12-09 23:22 ` Andrew Morton 2013-12-10 23:20 ` David Rientjes 2013-12-10 23:20 ` David Rientjes 2013-12-10 23:39 ` Andrew Morton 2013-12-10 23:39 ` Andrew Morton 2013-12-11 0:11 ` David Rientjes 2013-12-11 0:11 ` David Rientjes 2013-12-12 1:07 ` Dave Chinner 2013-12-12 1:07 ` Dave Chinner 2013-12-11 0:19 ` [patch alternative] mm, page_alloc: warn for non-blockable __GFP_NOFAIL allocation failure David Rientjes 2013-12-11 0:19 ` David Rientjes 2013-12-11 0:26 ` [patch] checkpatch: add warning of future __GFP_NOFAIL use David Rientjes 2013-12-11 0:26 ` David Rientjes 2013-12-11 1:35 ` Joe Perches 2013-12-11 1:35 ` Joe Perches
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.