linux-mm.kvack.org archive mirror
* [PATCH] mm: Avoid possible deadlock caused by too_many_isolated()
@ 2010-10-22  4:55 Wu Fengguang
  2010-10-22  4:55 ` [PATCH] vmscan: comment too_many_isolated() Wu Fengguang
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Wu Fengguang @ 2010-10-22  4:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Neil Brown, Rik van Riel, KOSAKI Motohiro, KAMEZAWA Hiroyuki,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, Li, Shaohua

Neil found that if too_many_isolated() returns true while performing
direct reclaim, we can end up waiting for other threads to complete
their direct reclaim.  If those threads are allowed to enter the FS or
IO layers to free memory, but this thread is not, then it is possible
that those threads will be waiting on this thread and so we get a
circular deadlock.

some task enters direct reclaim with GFP_KERNEL
  => too_many_isolated() false
    => vmscan and run into dirty pages
      => pageout()
        => take some FS lock
	  => fs/block code does GFP_NOIO allocation
	    => enter direct reclaim again
	      => too_many_isolated() true
		  => waiting for others to progress, however the other
		     tasks may be circular waiting for the FS lock..
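
Here "waiting" means the caller in shrink_inactive_list() spins on the
check, roughly like the snippet below (paraphrased from the vmscan code
of this era; details such as the signal check may differ slightly):

	while (unlikely(too_many_isolated(zone, file, sc))) {
		/* back off and let other reclaimers make progress */
		congestion_wait(BLK_RW_ASYNC, HZ/10);

		/* a task that is about to die should exit reclaim quickly */
		if (fatal_signal_pending(current))
			return SWAP_CLUSTER_MAX;
	}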

The fix is to give !__GFP_IO and !__GFP_FS direct reclaims higher
priority than normal ones, by lowering the throttle threshold for the
latter.

Allowing ~1/8 of the inactive LRU to be isolated by normal reclaims is
large enough. For example, for a 1GB LRU list, that's ~128MB of
isolated pages, or 1k blocked tasks (each isolates 32 4KB pages), or
64 blocked tasks per logical CPU (assuming 16 logical CPUs per NUMA
node). So it's unlikely that some CPU goes idle waiting (when it could
make progress) because of this limit: there are far more sleeping
reclaim tasks than CPUs, so such a task would likely be blocked on some
low-level queue/lock anyway.
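
Spelled out (assuming 4KB pages and SWAP_CLUSTER_MAX = 32 pages
isolated per blocked task):

	1GB inactive LRU / 8            = ~128MB isolated pages allowed
	32 pages * 4KB per blocked task = 128KB isolated per task
	128MB / 128KB                   = ~1k blocked tasks
	1k tasks / 16 logical CPUs      = 64 blocked tasks per CPU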

Now !GFP_IOFS reclaims won't wait for GFP_IOFS reclaims to progress.
They will be blocked only when there are too many concurrent !GFP_IOFS
reclaims, but that is very unlikely because IO-less direct reclaims can
make progress much faster and won't deadlock each other. The threshold
is raised high enough for them, so that there can be sufficient
parallel progress of !GFP_IOFS reclaims.
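
For context, the check as a whole reads roughly as follows after the
patch (paraphrased from linux-next of this era; the early returns for
kswapd and non-global reclaim come from the surrounding code and may
differ in detail):

static int too_many_isolated(struct zone *zone, int file,
		struct scan_control *sc)
{
	unsigned long inactive, isolated;

	/* kswapd and memcg-limit reclaim are not throttled here */
	if (current_is_kswapd())
		return 0;
	if (!scanning_global_lru(sc))
		return 0;

	if (file) {
		inactive = zone_page_state(zone, NR_INACTIVE_FILE);
		isolated = zone_page_state(zone, NR_ISOLATED_FILE);
	} else {
		inactive = zone_page_state(zone, NR_INACTIVE_ANON);
		isolated = zone_page_state(zone, NR_ISOLATED_ANON);
	}

	/* GFP_NOIO/GFP_NOFS callers may isolate 8x more pages */
	if ((sc->gfp_mask & GFP_IOFS) == GFP_IOFS)
		inactive >>= 3;

	return isolated > inactive;
}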

CC: Torsten Kaiser <just.for.lkml@googlemail.com>
CC: Minchan Kim <minchan.kim@gmail.com>
Tested-by: NeilBrown <neilb@suse.de>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/vmscan.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- linux-next.orig/mm/vmscan.c	2010-10-13 12:35:14.000000000 +0800
+++ linux-next/mm/vmscan.c	2010-10-19 00:13:04.000000000 +0800
@@ -1163,6 +1163,13 @@ static int too_many_isolated(struct zone
 		isolated = zone_page_state(zone, NR_ISOLATED_ANON);
 	}
 
+	/*
+	 * GFP_NOIO/GFP_NOFS callers are allowed to isolate more pages, so that
+	 * they won't get blocked by normal ones and form a circular deadlock.
+	 */
+	if ((sc->gfp_mask & GFP_IOFS) == GFP_IOFS)
+		inactive >>= 3;
+
 	return isolated > inactive;
 }
 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] vmscan: comment too_many_isolated()
  2010-10-22  4:55 [PATCH] mm: Avoid possible deadlock caused by too_many_isolated() Wu Fengguang
@ 2010-10-22  4:55 ` Wu Fengguang
  2010-10-22 12:00   ` Rik van Riel
  2010-10-22 12:00 ` [PATCH] mm: Avoid possible deadlock caused by too_many_isolated() Rik van Riel
  2010-10-24 22:55 ` Minchan Kim
  2 siblings, 1 reply; 5+ messages in thread
From: Wu Fengguang @ 2010-10-22  4:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Neil Brown, Rik van Riel, KOSAKI Motohiro, KAMEZAWA Hiroyuki,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, Li, Shaohua

Comment on "why it's doing so" rather than "what it does", as
proposed by Andrew Morton.

Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/vmscan.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- linux-next.orig/mm/vmscan.c	2010-10-19 09:29:44.000000000 +0800
+++ linux-next/mm/vmscan.c	2010-10-19 10:21:41.000000000 +0800
@@ -1142,7 +1142,11 @@ int isolate_lru_page(struct page *page)
 }
 
 /*
- * Are there way too many processes in the direct reclaim path already?
+ * A direct reclaimer may isolate SWAP_CLUSTER_MAX pages from the LRU list and
+ * then get rescheduled. When there are massive numbers of tasks doing page
+ * allocation, such sleeping direct reclaimers may keep piling up on each CPU;
+ * the LRU list will go small and be scanned faster than necessary, leading to
+ * unnecessary swapping, thrashing and OOM.
  */
 static int too_many_isolated(struct zone *zone, int file,
 		struct scan_control *sc)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] vmscan: comment too_many_isolated()
  2010-10-22  4:55 ` [PATCH] vmscan: comment too_many_isolated() Wu Fengguang
@ 2010-10-22 12:00   ` Rik van Riel
  0 siblings, 0 replies; 5+ messages in thread
From: Rik van Riel @ 2010-10-22 12:00 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, Neil Brown, KOSAKI Motohiro, KAMEZAWA Hiroyuki,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, Li, Shaohua

On 10/22/2010 12:55 AM, Wu Fengguang wrote:
> Comment on "why it's doing so" rather than "what it does", as
> proposed by Andrew Morton.
>
> Reviewed-by: KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com>
> Reviewed-by: Minchan Kim<minchan.kim@gmail.com>
> Signed-off-by: Wu Fengguang<fengguang.wu@intel.com>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm: Avoid possible deadlock caused by too_many_isolated()
  2010-10-22  4:55 [PATCH] mm: Avoid possible deadlock caused by too_many_isolated() Wu Fengguang
  2010-10-22  4:55 ` [PATCH] vmscan: comment too_many_isolated() Wu Fengguang
@ 2010-10-22 12:00 ` Rik van Riel
  2010-10-24 22:55 ` Minchan Kim
  2 siblings, 0 replies; 5+ messages in thread
From: Rik van Riel @ 2010-10-22 12:00 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, Neil Brown, KOSAKI Motohiro, KAMEZAWA Hiroyuki,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, Li, Shaohua

On 10/22/2010 12:55 AM, Wu Fengguang wrote:

> Now !GFP_IOFS reclaims won't wait for GFP_IOFS reclaims to progress.
> They will be blocked only when there are too many concurrent !GFP_IOFS
> reclaims, but that is very unlikely because IO-less direct reclaims can
> make progress much faster and won't deadlock each other. The threshold
> is raised high enough for them, so that there can be sufficient
> parallel progress of !GFP_IOFS reclaims.
>
> CC: Torsten Kaiser<just.for.lkml@googlemail.com>
> CC: Minchan Kim<minchan.kim@gmail.com>
> Tested-by: NeilBrown<neilb@suse.de>
> Acked-by: KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang<fengguang.wu@intel.com>

Acked-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm: Avoid possible deadlock caused by too_many_isolated()
  2010-10-22  4:55 [PATCH] mm: Avoid possible deadlock caused by too_many_isolated() Wu Fengguang
  2010-10-22  4:55 ` [PATCH] vmscan: comment too_many_isolated() Wu Fengguang
  2010-10-22 12:00 ` [PATCH] mm: Avoid possible deadlock caused by too_many_isolated() Rik van Riel
@ 2010-10-24 22:55 ` Minchan Kim
  2 siblings, 0 replies; 5+ messages in thread
From: Minchan Kim @ 2010-10-24 22:55 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, Neil Brown, Rik van Riel, KOSAKI Motohiro,
	KAMEZAWA Hiroyuki, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Li, Shaohua

On Fri, Oct 22, 2010 at 1:55 PM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> Neil found that if too_many_isolated() returns true while performing
> direct reclaim, we can end up waiting for other threads to complete
> their direct reclaim.  If those threads are allowed to enter the FS or
> IO layers to free memory, but this thread is not, then it is possible
> that those threads will be waiting on this thread and so we get a
> circular deadlock.
>
> some task enters direct reclaim with GFP_KERNEL
>  => too_many_isolated() false
>    => vmscan and run into dirty pages
>      => pageout()
>        => take some FS lock
>          => fs/block code does GFP_NOIO allocation
>            => enter direct reclaim again
>              => too_many_isolated() true
>                  => waiting for others to progress, however the other
>                     tasks may be circular waiting for the FS lock..
>
> The fix is to give !__GFP_IO and !__GFP_FS direct reclaims higher
> priority than normal ones, by lowering the throttle threshold for the
> latter.
>
> Allowing ~1/8 of the inactive LRU to be isolated by normal reclaims is
> large enough. For example, for a 1GB LRU list, that's ~128MB of
> isolated pages, or 1k blocked tasks (each isolates 32 4KB pages), or
> 64 blocked tasks per logical CPU (assuming 16 logical CPUs per NUMA
> node). So it's unlikely that some CPU goes idle waiting (when it could
> make progress) because of this limit: there are far more sleeping
> reclaim tasks than CPUs, so such a task would likely be blocked on some
> low-level queue/lock anyway.
>
> Now !GFP_IOFS reclaims won't wait for GFP_IOFS reclaims to progress.
> They will be blocked only when there are too many concurrent !GFP_IOFS
> reclaims, but that is very unlikely because IO-less direct reclaims can
> make progress much faster and won't deadlock each other. The threshold
> is raised high enough for them, so that there can be sufficient
> parallel progress of !GFP_IOFS reclaims.
>
> CC: Torsten Kaiser <just.for.lkml@googlemail.com>
> CC: Minchan Kim <minchan.kim@gmail.com>
> Tested-by: NeilBrown <neilb@suse.de>
> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>



-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-10-24 22:55 UTC | newest]

Thread overview: 5+ messages
-- links below jump to the message on this page --
2010-10-22  4:55 [PATCH] mm: Avoid possible deadlock caused by too_many_isolated() Wu Fengguang
2010-10-22  4:55 ` [PATCH] vmscan: comment too_many_isolated() Wu Fengguang
2010-10-22 12:00   ` Rik van Riel
2010-10-22 12:00 ` [PATCH] mm: Avoid possible deadlock caused by too_many_isolated() Rik van Riel
2010-10-24 22:55 ` Minchan Kim
