* [PATCH v2 -mm] limit direct reclaim for higher order allocations

From: Rik van Riel @ 2011-09-27 14:52 UTC (permalink / raw)
To: linux-mm; +Cc: linux-kernel, Mel Gorman, akpm, Johannes Weiner, aarcange

When suffering from memory fragmentation due to unfreeable pages,
THP page faults will repeatedly try to compact memory.  Due to
the unfreeable pages, compaction fails.

Needless to say, at that point page reclaim also fails to create
free contiguous 2MB areas.  However, that doesn't stop the current
code from trying, over and over again, and freeing a minimum of
4MB (2UL << sc->order pages) at every single invocation.

This resulted in my 12GB system having 2-3GB free memory, a
corresponding amount of used swap and very sluggish response times.

This can be avoided by having the direct reclaim code not reclaim
from zones that already have plenty of free memory available for
compaction.

If compaction still fails due to unmovable memory, doing additional
reclaim will only hurt the system, not help.

Signed-off-by: Rik van Riel <riel@redhat.com>

---
-v2: shrink_zones now uses the same thresholds as used by compaction itself;
     not only is this conceptually nicer, it also results in kswapd doing
     some actual work (before, all the page freeing work was done by THP
     allocators).  I seem to see fewer application stalls after this change.

 mm/vmscan.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b7719ec..117eb4d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2083,6 +2083,16 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
 			continue;
 		if (zone->all_unreclaimable && priority != DEF_PRIORITY)
 			continue;	/* Let kswapd poll it */
+		if (COMPACTION_BUILD) {
+			/*
+			 * If we already have plenty of memory free
+			 * for compaction, don't free any more.
+			 */
+			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
+			    (compaction_suitable(zone, sc->order) ||
+			     compaction_deferred(zone)))
+				continue;
+		}
 		/*
 		 * This steals pages from memory cgroups over softlimit
 		 * and returns the number of reclaimed pages and

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
majordomo@kvack.org.  For more info on Linux MM, see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: email@kvack.org
* Re: [PATCH v2 -mm] limit direct reclaim for higher order allocations

From: Johannes Weiner @ 2011-09-27 16:06 UTC (permalink / raw)
To: Rik van Riel
Cc: linux-mm, linux-kernel, Mel Gorman, akpm, Johannes Weiner, aarcange

On Tue, Sep 27, 2011 at 10:52:46AM -0400, Rik van Riel wrote:
> When suffering from memory fragmentation due to unfreeable pages,
> THP page faults will repeatedly try to compact memory.  Due to
> the unfreeable pages, compaction fails.
>
> Needless to say, at that point page reclaim also fails to create
> free contiguous 2MB areas.  However, that doesn't stop the current
> code from trying, over and over again, and freeing a minimum of
> 4MB (2UL << sc->order pages) at every single invocation.
>
> This resulted in my 12GB system having 2-3GB free memory, a
> corresponding amount of used swap and very sluggish response times.
>
> This can be avoided by having the direct reclaim code not reclaim
> from zones that already have plenty of free memory available for
> compaction.
>
> If compaction still fails due to unmovable memory, doing additional
> reclaim will only hurt the system, not help.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
>
> ---
> -v2: shrink_zones now uses the same thresholds as used by compaction itself;
>      not only is this conceptually nicer, it also results in kswapd doing
>      some actual work (before, all the page freeing work was done by THP
>      allocators).  I seem to see fewer application stalls after this change.
>
>  mm/vmscan.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b7719ec..117eb4d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2083,6 +2083,16 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
>  			continue;
>  		if (zone->all_unreclaimable && priority != DEF_PRIORITY)
>  			continue;	/* Let kswapd poll it */
> +		if (COMPACTION_BUILD) {
> +			/*
> +			 * If we already have plenty of memory free
> +			 * for compaction, don't free any more.
> +			 */
> +			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
> +			    (compaction_suitable(zone, sc->order) ||
> +			     compaction_deferred(zone)))
> +				continue;
> +		}

I don't think the comment is complete in combination with the check
for order > PAGE_ALLOC_COSTLY_ORDER, as compaction is invoked for all
non-zero orders.

But the traditional behaviour does less harm if the orders are small
and your problem was triggered by THP allocations, so I agree with the
code itself.

Acked-by: Johannes Weiner <jweiner@redhat.com>
* Re: [PATCH v2 -mm] limit direct reclaim for higher order allocations

From: Johannes Weiner @ 2011-10-07 9:07 UTC (permalink / raw)
To: Rik van Riel
Cc: linux-mm, linux-kernel, Mel Gorman, akpm, Johannes Weiner, aarcange

On Tue, Sep 27, 2011 at 06:06:48PM +0200, Johannes Weiner wrote:
> On Tue, Sep 27, 2011 at 10:52:46AM -0400, Rik van Riel wrote:
> > When suffering from memory fragmentation due to unfreeable pages,
> > THP page faults will repeatedly try to compact memory.  Due to
> > the unfreeable pages, compaction fails.
> >
> > Needless to say, at that point page reclaim also fails to create
> > free contiguous 2MB areas.  However, that doesn't stop the current
> > code from trying, over and over again, and freeing a minimum of
> > 4MB (2UL << sc->order pages) at every single invocation.
> >
> > This resulted in my 12GB system having 2-3GB free memory, a
> > corresponding amount of used swap and very sluggish response times.
> >
> > This can be avoided by having the direct reclaim code not reclaim
> > from zones that already have plenty of free memory available for
> > compaction.
> >
> > If compaction still fails due to unmovable memory, doing additional
> > reclaim will only hurt the system, not help.
> >
> > Signed-off-by: Rik van Riel <riel@redhat.com>
> >
> > ---
> > -v2: shrink_zones now uses the same thresholds as used by compaction itself;
> >      not only is this conceptually nicer, it also results in kswapd doing
> >      some actual work (before, all the page freeing work was done by THP
> >      allocators).  I seem to see fewer application stalls after this change.
> >
> >  mm/vmscan.c |   10 ++++++++++
> >  1 files changed, 10 insertions(+), 0 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index b7719ec..117eb4d 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2083,6 +2083,16 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
> >  			continue;
> >  		if (zone->all_unreclaimable && priority != DEF_PRIORITY)
> >  			continue;	/* Let kswapd poll it */
> > +		if (COMPACTION_BUILD) {
> > +			/*
> > +			 * If we already have plenty of memory free
> > +			 * for compaction, don't free any more.
> > +			 */
> > +			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
> > +			    (compaction_suitable(zone, sc->order) ||
> > +			     compaction_deferred(zone)))
> > +				continue;
> > +		}
>
> I don't think the comment is complete in combination with the check
> for order > PAGE_ALLOC_COSTLY_ORDER, as compaction is invoked for all
> non-zero orders.
>
> But the traditional behaviour does less harm if the orders are small
> and your problem was triggered by THP allocations, so I agree with the
> code itself.

FWIW, an incremental patch to explain the order check.  What do you
think?

Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
 mm/vmscan.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3817fa9..930085a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2068,8 +2068,14 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
 			continue;	/* Let kswapd poll it */
 		if (COMPACTION_BUILD) {
 			/*
-			 * If we already have plenty of memory free
-			 * for compaction, don't free any more.
+			 * If we already have plenty of memory
+			 * free for compaction, don't free any
+			 * more.  Even though compaction is
+			 * invoked for any non-zero order,
+			 * only frequent costly order
+			 * reclamation is disruptive enough to
+			 * become a noticeable problem, like
+			 * transparent huge page allocations.
 			 */
 			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
 			    (compaction_suitable(zone, sc->order) ||
--
1.7.6.4
* Re: [PATCH v2 -mm] limit direct reclaim for higher order allocations

From: Mel Gorman @ 2011-09-28 10:02 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm, linux-kernel, akpm, Johannes Weiner, aarcange

On Tue, Sep 27, 2011 at 10:52:46AM -0400, Rik van Riel wrote:
> When suffering from memory fragmentation due to unfreeable pages,
> THP page faults will repeatedly try to compact memory.  Due to
> the unfreeable pages, compaction fails.
>
> Needless to say, at that point page reclaim also fails to create
> free contiguous 2MB areas.  However, that doesn't stop the current
> code from trying, over and over again, and freeing a minimum of
> 4MB (2UL << sc->order pages) at every single invocation.
>
> This resulted in my 12GB system having 2-3GB free memory, a
> corresponding amount of used swap and very sluggish response times.
>
> This can be avoided by having the direct reclaim code not reclaim
> from zones that already have plenty of free memory available for
> compaction.
>
> If compaction still fails due to unmovable memory, doing additional
> reclaim will only hurt the system, not help.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
>

Because this patch improves things;

Acked-by: Mel Gorman <mgorman@suse.de>

That said, shrink_zones potentially returns having scanned and
reclaimed 0 pages.  We still fall through to shrink_slab and, because
we are reclaiming 0 pages, we loop over all priorities in
do_try_to_free_pages() and potentially even call wait_iff_congested.
I think this patch would be better if shrink_zones() returned true
when it skipped reclaim because compaction was ready, so that the
caller could return early.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 117eb4d..ead9c94 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2061,14 +2061,19 @@ restart:
  *
  * If a zone is deemed to be full of pinned pages then just give it a light
  * scan then give up on it.
+ *
+ * This function returns true if a zone is being reclaimed for a high-order
+ * allocation that will use compaction and compaction is ready to begin. This
+ * indicates to the caller that further reclaim is unnecessary.
  */
-static void shrink_zones(int priority, struct zonelist *zonelist,
+static bool shrink_zones(int priority, struct zonelist *zonelist,
 					struct scan_control *sc)
 {
 	struct zoneref *z;
 	struct zone *zone;
 	unsigned long nr_soft_reclaimed;
 	unsigned long nr_soft_scanned;
+	bool abort_reclaim_compaction = false;
 
 	for_each_zone_zonelist_nodemask(zone, z, zonelist,
 					gfp_zone(sc->gfp_mask), sc->nodemask) {
@@ -2090,8 +2095,10 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
 			 */
 			if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
 			    (compaction_suitable(zone, sc->order) ||
-			     compaction_deferred(zone)))
+			     compaction_deferred(zone))) {
+				abort_reclaim_compaction = true;
 				continue;
+			}
 		}
 		/*
 		 * This steals pages from memory cgroups over softlimit
@@ -2110,6 +2117,8 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
 
 		shrink_zone(priority, zone, sc);
 	}
+
+	return abort_reclaim_compaction;
 }
 
 static bool zone_reclaimable(struct zone *zone)
@@ -2174,7 +2183,8 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		sc->nr_scanned = 0;
 		if (!priority)
 			disable_swap_token(sc->mem_cgroup);
-		shrink_zones(priority, zonelist, sc);
+		if (shrink_zones(priority, zonelist, sc))
+			break;
 
 		/*
 		 * Don't shrink slabs when reclaiming memory from
 		 * over limit cgroups