Re: [PATCH 3/3] mm: page allocator: Drain per-cpu lists after direct reclaim allocation fails

From: Mel Gorman <mel@csn.ul.ie>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel List <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Minchan Kim <minchan.kim@gmail.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH 3/3] mm: page allocator: Drain per-cpu lists after direct reclaim allocation fails
Date: Thu, 9 Sep 2010 13:41:38 +0100
Message-ID: <20100909124138.GQ29263@csn.ul.ie>
In-Reply-To: <20100908163956.C930.A69D9226@jp.fujitsu.com>

On Wed, Sep 08, 2010 at 04:43:03PM +0900, KOSAKI Motohiro wrote:
> > +	/*
> > +	 * If an allocation failed after direct reclaim, it could be because
> > +	 * pages are pinned on the per-cpu lists. Drain them and try again
> > +	 */
> > +	if (!page && !drained) {
> > +		drain_all_pages();
> > +		drained = true;
> > +		goto retry;
> > +	}
> 
> nit: with slub, get_page_from_freelist() fails more frequently than
> with slab because slub tries to allocate a high-order page first.
> So I guess we have to avoid drain_all_pages() if __GFP_NORETRY is passed.
> 

The old behaviour was to drain only for high-order allocations, which one
would assume did not have __GFP_NORETRY specified except in very rare cases.
Still, calling drain_all_pages() raises interrupt counts, and I worry that
large machines might exhibit some livelock-like problem. I'm considering the
following patch. What do you think?

==== CUT HERE ====
mm: page allocator: Reduce the instances where drain_all_pages() is called

When a page allocation fails after direct reclaim, the per-cpu lists are
drained and another allocation attempt is made. On larger systems, this
can cause IPI storms in low-memory situations, with latencies increasing
with the number of CPUs on the system. In extreme situations, it is
suspected that this could cause livelock-like behaviour.

This patch restores the older behaviour of calling drain_all_pages() after
direct reclaim fails only for high-order allocations. As lower orders are
expected to free naturally, the drain only occurs for order >
PAGE_ALLOC_COSTLY_ORDER. The reasoning is that such an allocation is already
expected to be very expensive and rare, so no IPI storm will result. Calls
to drain_all_pages() are not eliminated, as it is still the case that an
allocation can fail because the necessary pages are pinned on the per-cpu
lists. After this patch, for all other orders the lists are only drained as
a last resort before calling the OOM killer.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 mm/page_alloc.c |   23 ++++++++++++++++++++---
 1 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 750e1dc..16f516c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1737,6 +1737,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	int migratetype)
 {
 	struct page *page;
+	bool drained = false;
 
 	/* Acquire the OOM killer lock for the zones in zonelist */
 	if (!try_set_zonelist_oom(zonelist, gfp_mask)) {
@@ -1744,6 +1745,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 		return NULL;
 	}
 
+retry:
 	/*
 	 * Go through the zonelist yet one more time, keep very high watermark
 	 * here, this is only to catch a parallel oom killing, we must fail if
@@ -1773,6 +1775,18 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 		if (gfp_mask & __GFP_THISNODE)
 			goto out;
 	}
+
+	/*
+	 * If an allocation failed, it could be because pages are pinned on
+	 * the per-cpu lists. Before resorting to the OOM killer, try
+	 * draining them.
+	 */
+	if (!drained) {
+		drain_all_pages();
+		drained = true;
+		goto retry;
+	}
+
 	/* Exhausted what can be done so it's blamo time */
 	out_of_memory(zonelist, gfp_mask, order, nodemask);
 
@@ -1876,10 +1890,13 @@ retry:
 					migratetype);
 
 	/*
-	 * If an allocation failed after direct reclaim, it could be because
-	 * pages are pinned on the per-cpu lists. Drain them and try again
+	 * If a high-order allocation failed after direct reclaim, it could
+	 * be because pages are pinned on the per-cpu lists. However, only
+	 * drain for orders above PAGE_ALLOC_COSTLY_ORDER as the cost of the
+	 * IPIs needed to drain the pages is itself high. Assume that lower
+	 * orders will naturally free without draining.
 	 */
-	if (!page && !drained) {
+	if (!page && !drained && order > PAGE_ALLOC_COSTLY_ORDER) {
 		drain_all_pages();
 		drained = true;
 		goto retry;
