* [PATCH] VM: kswapd should not do blocking memory allocations @ 2010-08-18 19:04 Trond Myklebust [not found] ` <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com> ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Trond Myklebust @ 2010-08-18 19:04 UTC (permalink / raw) To: linux-mm; +Cc: linux-kernel, linux-nfs From: Trond Myklebust <Trond.Myklebust@netapp.com> Allowing kswapd to do GFP_KERNEL memory allocations (or any blocking memory allocations) is wrong and can cause deadlocks in try_to_release_page(), as the filesystem believes it is safe to allocate new memory and block, whereas kswapd is there specifically to clear a low-memory situation... Set the gfp_mask to GFP_IOFS instead. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> --- mm/vmscan.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index ec5ddcc..716dd16 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2095,7 +2095,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order) unsigned long total_scanned; struct reclaim_state *reclaim_state = current->reclaim_state; struct scan_control sc = { - .gfp_mask = GFP_KERNEL, + .gfp_mask = GFP_IOFS, .may_unmap = 1, .may_swap = 1, /* -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 7+ messages in thread
[parent not found: <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com>]
* Re: [PATCH] VM: kswapd should not do blocking memory allocations [not found] ` <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com> @ 2010-08-18 19:31 ` Trond Myklebust 2010-08-20 5:45 ` Wu Fengguang 0 siblings, 1 reply; 7+ messages in thread From: Trond Myklebust @ 2010-08-18 19:31 UTC (permalink / raw) To: Ram Pai; +Cc: linux-mm, linux-kernel, linux-nfs On Wed, 2010-08-18 at 12:24 -0700, Ram Pai wrote: > > > On Wed, Aug 18, 2010 at 12:04 PM, Trond Myklebust > <Trond.Myklebust@netapp.com> wrote: > From: Trond Myklebust <Trond.Myklebust@netapp.com> > > Allowing kswapd to do GFP_KERNEL memory allocations (or any > blocking memory > allocations) is wrong and can cause deadlocks in > try_to_release_page(), as > the filesystem believes it is safe to allocate new memory and > block, > whereas kswapd is there specifically to clear a low-memory > situation... > > Set the gfp_mask to GFP_IOFS instead. > > Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> > --- > > mm/vmscan.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index ec5ddcc..716dd16 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -2095,7 +2095,7 @@ static unsigned long > balance_pgdat(pg_data_t *pgdat, int order) > unsigned long total_scanned; > struct reclaim_state *reclaim_state = > current->reclaim_state; > struct scan_control sc = { > - .gfp_mask = GFP_KERNEL, > + .gfp_mask = GFP_IOFS, > .may_unmap = 1, > .may_swap = 1, > /* > > Trond, > > Has anyone hit this issue? Or is this based on code > inspection? > > The reason I ask is we are seeing a problem, similar to > the symptom described, on RH based kernel but have not been able to > reproduce on 2.6.35. Hi Ram, I was seeing it on NFS until I put in the following kswapd-specific hack into nfs_release_page(): /* Only do I/O if gfp is a superset of GFP_KERNEL */ if (mapping && (gfp & GFP_KERNEL) == GFP_KERNEL) { int how = FLUSH_SYNC; /* Don't let kswapd deadlock waiting for OOM RPC calls */ if (current_is_kswapd()) how = 0; nfs_commit_inode(mapping->host, how); } Remove the 'if (current_is_kswapd())' line, and run an mmap() write intensive workload, and it should hang pretty much every time. Cheers Trond -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] VM: kswapd should not do blocking memory allocations 2010-08-18 19:31 ` Trond Myklebust @ 2010-08-20 5:45 ` Wu Fengguang 2010-08-20 12:17 ` Trond Myklebust 0 siblings, 1 reply; 7+ messages in thread From: Wu Fengguang @ 2010-08-20 5:45 UTC (permalink / raw) To: Trond Myklebust; +Cc: Ram Pai, linux-mm, linux-kernel, linux-nfs > Hi Ram, > > I was seeing it on NFS until I put in the following kswapd-specific hack > into nfs_release_page(): > > /* Only do I/O if gfp is a superset of GFP_KERNEL */ > if (mapping && (gfp & GFP_KERNEL) == GFP_KERNEL) { > int how = FLUSH_SYNC; > > /* Don't let kswapd deadlock waiting for OOM RPC calls */ > if (current_is_kswapd()) > how = 0; So the patch can remove the above workaround together, and add comment that NFS exploits the gfp mask to avoid complex operations involving recursive memory allocation and hence deadlock? Thanks, Fengguang > nfs_commit_inode(mapping->host, how); > } > > Remove the 'if (current_is_kswapd())' line, and run an mmap() write > intensive workload, and it should hang pretty much every time. > > Cheers > Trond > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] VM: kswapd should not do blocking memory allocations 2010-08-20 5:45 ` Wu Fengguang @ 2010-08-20 12:17 ` Trond Myklebust 0 siblings, 0 replies; 7+ messages in thread From: Trond Myklebust @ 2010-08-20 12:17 UTC (permalink / raw) To: Wu Fengguang; +Cc: Ram Pai, linux-mm, linux-kernel, linux-nfs On Fri, 2010-08-20 at 13:45 +0800, Wu Fengguang wrote: > > Hi Ram, > > > > I was seeing it on NFS until I put in the following kswapd-specific hack > > into nfs_release_page(): > > > > /* Only do I/O if gfp is a superset of GFP_KERNEL */ > > if (mapping && (gfp & GFP_KERNEL) == GFP_KERNEL) { > > int how = FLUSH_SYNC; > > > > /* Don't let kswapd deadlock waiting for OOM RPC calls */ > > if (current_is_kswapd()) > > how = 0; > > So the patch can remove the above workaround together, and add comment > that NFS exploits the gfp mask to avoid complex operations involving > recursive memory allocation and hence deadlock? I thought I'd send that as a separate patch, but yes, that is my intention next. Cheers Trond -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] VM: kswapd should not do blocking memory allocations 2010-08-18 19:04 [PATCH] VM: kswapd should not do blocking memory allocations Trond Myklebust [not found] ` <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com> @ 2010-08-18 19:34 ` Chris Mason 2010-08-18 20:10 ` Trond Myklebust 2010-08-20 5:40 ` Wu Fengguang 2 siblings, 1 reply; 7+ messages in thread From: Chris Mason @ 2010-08-18 19:34 UTC (permalink / raw) To: Trond Myklebust; +Cc: linux-mm, linux-kernel, linux-nfs On Wed, Aug 18, 2010 at 03:04:01PM -0400, Trond Myklebust wrote: > From: Trond Myklebust <Trond.Myklebust@netapp.com> > > Allowing kswapd to do GFP_KERNEL memory allocations (or any blocking memory > allocations) is wrong and can cause deadlocks in try_to_release_page(), as > the filesystem believes it is safe to allocate new memory and block, > whereas kswapd is there specifically to clear a low-memory situation... > > Set the gfp_mask to GFP_IOFS instead. I always thought releasepage was supposed to do almost zero work. It could release an instantly freeable page but it wasn't supposed to dive in and solve world hunger or anything. I thought the VM would be using writepage for that. -chris -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] VM: kswapd should not do blocking memory allocations 2010-08-18 19:34 ` Chris Mason @ 2010-08-18 20:10 ` Trond Myklebust 0 siblings, 0 replies; 7+ messages in thread From: Trond Myklebust @ 2010-08-18 20:10 UTC (permalink / raw) To: Chris Mason; +Cc: linux-mm, linux-kernel, linux-nfs On Wed, 2010-08-18 at 15:34 -0400, Chris Mason wrote: > On Wed, Aug 18, 2010 at 03:04:01PM -0400, Trond Myklebust wrote: > > From: Trond Myklebust <Trond.Myklebust@netapp.com> > > > > Allowing kswapd to do GFP_KERNEL memory allocations (or any blocking memory > > allocations) is wrong and can cause deadlocks in try_to_release_page(), as > > the filesystem believes it is safe to allocate new memory and block, > > whereas kswapd is there specifically to clear a low-memory situation... > > > > Set the gfp_mask to GFP_IOFS instead. > > I always thought releasepage was supposed to do almost zero work. It > could release an instantly freeable page but it wasn't supposed to dive > in and solve world hunger or anything. > > I thought the VM would be using writepage for that. writepage isn't sufficient for the NFS case: the page may be in the 'clean but unstable' state, in which case the NFS client needs to send a COMMIT rpc call before the page can finally be released. That is why we need the gfp_flag to tell us when it is safe to do this, and when it is not. The main case where it is safe and necessary for try_to_release_page() to initiate a COMMIT call is in the invalidate_inode_pages2(). We might want to do it in the kswapd case too, but in that case, we definitely should tell the filesystem that it is unsafe to block. Trond -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] VM: kswapd should not do blocking memory allocations 2010-08-18 19:04 [PATCH] VM: kswapd should not do blocking memory allocations Trond Myklebust [not found] ` <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com> 2010-08-18 19:34 ` Chris Mason @ 2010-08-20 5:40 ` Wu Fengguang 2 siblings, 0 replies; 7+ messages in thread From: Wu Fengguang @ 2010-08-20 5:40 UTC (permalink / raw) To: Trond Myklebust; +Cc: linux-mm, linux-kernel, linux-nfs On Wed, Aug 18, 2010 at 03:04:01PM -0400, Trond Myklebust wrote: > From: Trond Myklebust <Trond.Myklebust@netapp.com> > > Allowing kswapd to do GFP_KERNEL memory allocations (or any blocking memory > allocations) is wrong and can cause deadlocks in try_to_release_page(), as > the filesystem believes it is safe to allocate new memory and block, > whereas kswapd is there specifically to clear a low-memory situation... > > Set the gfp_mask to GFP_IOFS instead. It would be more descriptive to say "remove the __GFP_WAIT bit". The change looks reasonable _in itself_, since we always prefer to avoid unnecessary waits for kswapd. So Acked-by: Wu Fengguang <fengguang.wu@intel.com> > Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> > --- > > mm/vmscan.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index ec5ddcc..716dd16 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -2095,7 +2095,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order) > unsigned long total_scanned; > struct reclaim_state *reclaim_state = current->reclaim_state; > struct scan_control sc = { > - .gfp_mask = GFP_KERNEL, > + .gfp_mask = GFP_IOFS, > .may_unmap = 1, > .may_swap = 1, > /* > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2010-08-20 12:18 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-08-18 19:04 [PATCH] VM: kswapd should not do blocking memory allocations Trond Myklebust [not found] ` <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com> 2010-08-18 19:31 ` Trond Myklebust 2010-08-20 5:45 ` Wu Fengguang 2010-08-20 12:17 ` Trond Myklebust 2010-08-18 19:34 ` Chris Mason 2010-08-18 20:10 ` Trond Myklebust 2010-08-20 5:40 ` Wu Fengguang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).