linux-mm.kvack.org archive mirror
* [PATCH] VM: kswapd should not do blocking memory allocations
@ 2010-08-18 19:04 Trond Myklebust
       [not found] ` <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com>
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Trond Myklebust @ 2010-08-18 19:04 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, linux-nfs

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Allowing kswapd to do GFP_KERNEL memory allocations (or any blocking memory
allocations) is wrong and can cause deadlocks in try_to_release_page(), as
the filesystem believes it is safe to allocate new memory and block,
whereas kswapd is there specifically to clear a low-memory situation...

Set the gfp_mask to GFP_IOFS instead.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 mm/vmscan.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


diff --git a/mm/vmscan.c b/mm/vmscan.c
index ec5ddcc..716dd16 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2095,7 +2095,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
 	unsigned long total_scanned;
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct scan_control sc = {
-		.gfp_mask = GFP_KERNEL,
+		.gfp_mask = GFP_IOFS,
 		.may_unmap = 1,
 		.may_swap = 1,
 		/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] VM: kswapd should not do blocking memory allocations
       [not found] ` <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com>
@ 2010-08-18 19:31   ` Trond Myklebust
  2010-08-20  5:45     ` Wu Fengguang
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2010-08-18 19:31 UTC (permalink / raw)
  To: Ram Pai; +Cc: linux-mm, linux-kernel, linux-nfs

On Wed, 2010-08-18 at 12:24 -0700, Ram Pai wrote:
> 
> 
> On Wed, Aug 18, 2010 at 12:04 PM, Trond Myklebust
> <Trond.Myklebust@netapp.com> wrote:
>         From: Trond Myklebust <Trond.Myklebust@netapp.com>
>         
>         Allowing kswapd to do GFP_KERNEL memory allocations (or any
>         blocking memory
>         allocations) is wrong and can cause deadlocks in
>         try_to_release_page(), as
>         the filesystem believes it is safe to allocate new memory and
>         block,
>         whereas kswapd is there specifically to clear a low-memory
>         situation...
>         
>         Set the gfp_mask to GFP_IOFS instead.
>         
>         Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
>         ---
>         
>          mm/vmscan.c |    2 +-
>          1 files changed, 1 insertions(+), 1 deletions(-)
>         
>         
>         diff --git a/mm/vmscan.c b/mm/vmscan.c
>         index ec5ddcc..716dd16 100644
>         --- a/mm/vmscan.c
>         +++ b/mm/vmscan.c
>         @@ -2095,7 +2095,7 @@ static unsigned long
>         balance_pgdat(pg_data_t *pgdat, int order)
>                unsigned long total_scanned;
>                struct reclaim_state *reclaim_state =
>         current->reclaim_state;
>                struct scan_control sc = {
>         -               .gfp_mask = GFP_KERNEL,
>         +               .gfp_mask = GFP_IOFS,
>                        .may_unmap = 1,
>                        .may_swap = 1,
>                        /*
> 
> Trond,
> 
> Has anyone hit this issue? Or is this based on code inspection?
> 
> The reason I ask is that we are seeing a problem similar to the
> symptom described on an RH-based kernel, but have not been able to
> reproduce it on 2.6.35.

Hi Ram,

I was seeing it on NFS until I put in the following kswapd-specific hack
into nfs_release_page():

	/* Only do I/O if gfp is a superset of GFP_KERNEL */
	if (mapping && (gfp & GFP_KERNEL) == GFP_KERNEL) {
		int how = FLUSH_SYNC;

		/* Don't let kswapd deadlock waiting for OOM RPC calls */
		if (current_is_kswapd())
			how = 0;
		nfs_commit_inode(mapping->host, how);
	}

Remove the 'if (current_is_kswapd())' check, run an mmap()
write-intensive workload, and it should hang pretty much every time.
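
For illustration only, the superset test in the snippet above can be
exercised in userspace. This is a minimal sketch, not kernel code: the
numeric bit values are assumptions modeled on include/linux/gfp.h from
2.6.35-era kernels, and gfp_allows_commit() is a hypothetical helper
name that does not exist in the NFS client.

```c
#include <stdbool.h>

/* Assumed bit values (2.6.35-era include/linux/gfp.h); verify against
 * your own tree before relying on them. */
#define __GFP_WAIT 0x10u	/* caller may sleep */
#define __GFP_IO   0x40u	/* caller may start physical I/O */
#define __GFP_FS   0x80u	/* caller may call into the filesystem */

#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)
#define GFP_IOFS   (__GFP_IO | __GFP_FS)

/* Hypothetical helper mirroring the "superset of GFP_KERNEL" test:
 * true only when every GFP_KERNEL bit is also set in gfp. */
bool gfp_allows_commit(unsigned int gfp)
{
	return (gfp & GFP_KERNEL) == GFP_KERNEL;
}
```

Under these assumptions, gfp_allows_commit(GFP_KERNEL) is true and
gfp_allows_commit(GFP_IOFS) is false, so a caller that passes GFP_IOFS
never enters the commit branch at all.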

Cheers
  Trond


* Re: [PATCH] VM: kswapd should not do blocking memory allocations
  2010-08-18 19:04 [PATCH] VM: kswapd should not do blocking memory allocations Trond Myklebust
       [not found] ` <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com>
@ 2010-08-18 19:34 ` Chris Mason
  2010-08-18 20:10   ` Trond Myklebust
  2010-08-20  5:40 ` Wu Fengguang
  2 siblings, 1 reply; 7+ messages in thread
From: Chris Mason @ 2010-08-18 19:34 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-mm, linux-kernel, linux-nfs

On Wed, Aug 18, 2010 at 03:04:01PM -0400, Trond Myklebust wrote:
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> Allowing kswapd to do GFP_KERNEL memory allocations (or any blocking memory
> allocations) is wrong and can cause deadlocks in try_to_release_page(), as
> the filesystem believes it is safe to allocate new memory and block,
> whereas kswapd is there specifically to clear a low-memory situation...
> 
> Set the gfp_mask to GFP_IOFS instead.

I always thought releasepage was supposed to do almost zero work.  It
could release an instantly freeable page but it wasn't supposed to dive
in and solve world hunger or anything.

I thought the VM would be using writepage for that.

-chris


* Re: [PATCH] VM: kswapd should not do blocking memory allocations
  2010-08-18 19:34 ` Chris Mason
@ 2010-08-18 20:10   ` Trond Myklebust
  0 siblings, 0 replies; 7+ messages in thread
From: Trond Myklebust @ 2010-08-18 20:10 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-mm, linux-kernel, linux-nfs

On Wed, 2010-08-18 at 15:34 -0400, Chris Mason wrote:
> On Wed, Aug 18, 2010 at 03:04:01PM -0400, Trond Myklebust wrote:
> > From: Trond Myklebust <Trond.Myklebust@netapp.com>
> > 
> > Allowing kswapd to do GFP_KERNEL memory allocations (or any blocking memory
> > allocations) is wrong and can cause deadlocks in try_to_release_page(), as
> > the filesystem believes it is safe to allocate new memory and block,
> > whereas kswapd is there specifically to clear a low-memory situation...
> > 
> > Set the gfp_mask to GFP_IOFS instead.
> 
> I always thought releasepage was supposed to do almost zero work.  It
> could release an instantly freeable page but it wasn't supposed to dive
> in and solve world hunger or anything.
> 
> I thought the VM would be using writepage for that.

writepage isn't sufficient for the NFS case: the page may be in the
'clean but unstable' state, in which case the NFS client needs to send a
COMMIT RPC call before the page can finally be released.

That is why we need the gfp_flag to tell us when it is safe to do this,
and when it is not.
The main case where it is safe and necessary for try_to_release_page()
to initiate a COMMIT call is invalidate_inode_pages2(). We might
want to do it in the kswapd case too, but in that case, we definitely
should tell the filesystem that it is unsafe to block.

Trond


* Re: [PATCH] VM: kswapd should not do blocking memory allocations
  2010-08-18 19:04 [PATCH] VM: kswapd should not do blocking memory allocations Trond Myklebust
       [not found] ` <AANLkTi=WkoxjwZbt6Vd0VhbuA7_k2WM-NUXZnrmzOOPy@mail.gmail.com>
  2010-08-18 19:34 ` Chris Mason
@ 2010-08-20  5:40 ` Wu Fengguang
  2 siblings, 0 replies; 7+ messages in thread
From: Wu Fengguang @ 2010-08-20  5:40 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-mm, linux-kernel, linux-nfs

On Wed, Aug 18, 2010 at 03:04:01PM -0400, Trond Myklebust wrote:
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> Allowing kswapd to do GFP_KERNEL memory allocations (or any blocking memory
> allocations) is wrong and can cause deadlocks in try_to_release_page(), as
> the filesystem believes it is safe to allocate new memory and block,
> whereas kswapd is there specifically to clear a low-memory situation...
> 
> Set the gfp_mask to GFP_IOFS instead.

It would be more descriptive to say "remove the __GFP_WAIT bit".
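
As a rough userspace sketch of that relationship (not kernel code): the
bit values below are assumptions modeled on include/linux/gfp.h from
2.6.35-era kernels, and gfp_without_wait() and gfp_may_block() are
hypothetical helper names introduced only for this example.

```c
#include <stdbool.h>

/* Assumed bit values (2.6.35-era include/linux/gfp.h); verify against
 * your own tree before relying on them. */
#define __GFP_WAIT 0x10u	/* caller may sleep */
#define __GFP_IO   0x40u	/* caller may start physical I/O */
#define __GFP_FS   0x80u	/* caller may call into the filesystem */

#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)
#define GFP_IOFS   (__GFP_IO | __GFP_FS)

/* Hypothetical helper: clear the "may sleep" bit, keep everything else. */
unsigned int gfp_without_wait(unsigned int gfp)
{
	return gfp & ~__GFP_WAIT;
}

/* Hypothetical helper: does this mask permit a blocking allocation? */
bool gfp_may_block(unsigned int gfp)
{
	return (gfp & __GFP_WAIT) != 0;
}
```

With these assumed definitions, gfp_without_wait(GFP_KERNEL) equals
GFP_IOFS, and gfp_may_block() is true for GFP_KERNEL but false for
GFP_IOFS.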

The change looks reasonable _in itself_, since we always prefer to
avoid unnecessary waits for kswapd. So

Acked-by: Wu Fengguang <fengguang.wu@intel.com>

> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
> 
>  mm/vmscan.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index ec5ddcc..716dd16 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2095,7 +2095,7 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
>  	unsigned long total_scanned;
>  	struct reclaim_state *reclaim_state = current->reclaim_state;
>  	struct scan_control sc = {
> -		.gfp_mask = GFP_KERNEL,
> +		.gfp_mask = GFP_IOFS,
>  		.may_unmap = 1,
>  		.may_swap = 1,
>  		/*
> 


* Re: [PATCH] VM: kswapd should not do blocking memory allocations
  2010-08-18 19:31   ` Trond Myklebust
@ 2010-08-20  5:45     ` Wu Fengguang
  2010-08-20 12:17       ` Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Wu Fengguang @ 2010-08-20  5:45 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Ram Pai, linux-mm, linux-kernel, linux-nfs

> Hi Ram,
> 
> I was seeing it on NFS until I put in the following kswapd-specific hack
> into nfs_release_page():
> 
> 	/* Only do I/O if gfp is a superset of GFP_KERNEL */
> 	if (mapping && (gfp & GFP_KERNEL) == GFP_KERNEL) {
> 		int how = FLUSH_SYNC;
> 
> 		/* Don't let kswapd deadlock waiting for OOM RPC calls */
> 		if (current_is_kswapd())
> 			how = 0;

So the patch can also remove the above workaround, and add a comment
that NFS relies on the gfp mask to avoid complex operations involving
recursive memory allocation, and hence deadlock?

Thanks,
Fengguang

> 		nfs_commit_inode(mapping->host, how);
> 	}
> 
> Remove the 'if (current_is_kswapd())' check, run an mmap()
> write-intensive workload, and it should hang pretty much every time.
> 
> Cheers
>   Trond


* Re: [PATCH] VM: kswapd should not do blocking memory allocations
  2010-08-20  5:45     ` Wu Fengguang
@ 2010-08-20 12:17       ` Trond Myklebust
  0 siblings, 0 replies; 7+ messages in thread
From: Trond Myklebust @ 2010-08-20 12:17 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Ram Pai, linux-mm, linux-kernel, linux-nfs

On Fri, 2010-08-20 at 13:45 +0800, Wu Fengguang wrote:
> > Hi Ram,
> > 
> > I was seeing it on NFS until I put in the following kswapd-specific hack
> > into nfs_release_page():
> > 
> > 	/* Only do I/O if gfp is a superset of GFP_KERNEL */
> > 	if (mapping && (gfp & GFP_KERNEL) == GFP_KERNEL) {
> > 		int how = FLUSH_SYNC;
> > 
> > 		/* Don't let kswapd deadlock waiting for OOM RPC calls */
> > 		if (current_is_kswapd())
> > 			how = 0;
> 
> So the patch can also remove the above workaround, and add a comment
> that NFS relies on the gfp mask to avoid complex operations involving
> recursive memory allocation, and hence deadlock?

I thought I'd send that as a separate patch, but yes, that is my
intention next.

Cheers
  Trond

