All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hush Bensen <hush.bensen@gmail.com>
To: Andrea Arcangeli <aarcange@redhat.com>, linux-mm@kvack.org
Cc: Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	Richard Davies <richard@arachsys.com>,
	Shaohua Li <shli@kernel.org>, Rafael Aquini <aquini@redhat.com>
Subject: Re: [PATCH 7/7] mm: compaction: add compaction to zone_reclaim_mode
Date: Fri, 12 Jul 2013 18:57:25 -0500	[thread overview]
Message-ID: <51E097E5.7060308@gmail.com> (raw)
In-Reply-To: <1370445037-24144-8-git-send-email-aarcange@redhat.com>

Hi Andrea,
OU 2013/6/5 10:10, Andrea Arcangeli D'uA:
> This fixes zone_reclaim_mode by using the min watermark so it won't
> fail in presence of concurrent allocations. This greatly increases the
> reliability of zone_reclaim_mode > 0 also with cache shrinking and THP
> disabled.
>
> This also adds compaction to zone_reclaim so THP enabled won't
> decrease the NUMA locality with /proc/sys/vm/zone_reclaim_mode > 0.
>
> Some checks for __GFP_WAIT and numa_node_id() are moved from the
> zone_reclaim() to the caller so they also apply to the compaction
> logic.
>
> It is important to boot with numa_zonelist_order=n (n means nodes) to
> get more accurate NUMA locality if there are multiple zones per node.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>  mm/internal.h   |  1 -
>  mm/page_alloc.c | 99 +++++++++++++++++++++++++++++++++++++++++++--------------
>  mm/vmscan.c     | 17 ----------
>  3 files changed, 75 insertions(+), 42 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 8562de0..560a1ec 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -339,7 +339,6 @@ static inline void mminit_validate_memmodel_limits(unsigned long *start_pfn,
>  }
>  #endif /* CONFIG_SPARSEMEM */
>  
> -#define ZONE_RECLAIM_NOSCAN	-2
>  #define ZONE_RECLAIM_FULL	-1
>  #define ZONE_RECLAIM_SOME	0
>  #define ZONE_RECLAIM_SUCCESS	1
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c13e062..3ca905a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1902,7 +1902,9 @@ zonelist_scan:
>  		BUILD_BUG_ON(ALLOC_NO_WATERMARKS < NR_WMARK);
>  		if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
>  			unsigned long mark;
> -			int ret;
> +			int ret, node_id, c_ret;
> +			bool repeated_compaction, need_compaction;
> +			bool contended = false;
>  
>  			mark = zone->watermark[alloc_flags & ALLOC_WMARK_MASK];
>  			if (zone_watermark_ok(zone, order, mark,
> @@ -1933,35 +1935,84 @@ zonelist_scan:
>  				!zlc_zone_worth_trying(zonelist, z, allowednodes))
>  				continue;
>  
> -			ret = zone_reclaim(zone, gfp_mask, order);
> -			switch (ret) {
> -			case ZONE_RECLAIM_NOSCAN:
> -				/* did not scan */
> +			if (!(gfp_mask & __GFP_WAIT) ||
> +			    (current->flags & PF_MEMALLOC))
>  				continue;
> -			case ZONE_RECLAIM_FULL:
> -				/* scanned but unreclaimable */
> +
> +			/*
> +			 * Only reclaim the local zone or on zones
> +			 * that do not have associated
> +			 * processors. This will favor the local
> +			 * processor over remote processors and spread
> +			 * off node memory allocations as wide as
> +			 * possible.
> +			 */
> +			node_id = zone_to_nid(zone);
> +			if (node_state(node_id, N_CPU) &&
> +			    node_id != numa_node_id())
>  				continue;
> -			default:
> -				/* did we reclaim enough */
> +
> +			/*
> +			 * We're going to do reclaim so allow
> +			 * allocations up to the MIN watermark, so less
> +			 * concurrent allocation will fail.
> +			 */
> +			mark = min_wmark_pages(zone);

Target reclaim will be done once under low wmark in vanilla kernel,
however, your patch change it to min wmark, why this behavior change?

> +
> +			/* initialize to avoid warnings */
> +			c_ret = COMPACT_SKIPPED;
> +			ret = ZONE_RECLAIM_FULL;
> +
> +			repeated_compaction = false;
> +			need_compaction = false;
> +			if (!compaction_deferred(preferred_zone, order))
> +				need_compaction = order &&
> +					(gfp_mask & GFP_KERNEL) == GFP_KERNEL;
> +			if (need_compaction) {
> +			repeat_compaction:
> +				c_ret = compact_zone_order(zone, order,
> +							   gfp_mask,
> +							   repeated_compaction,
> +							   &contended);
> +				if (c_ret != COMPACT_SKIPPED &&
> +				    zone_watermark_ok(zone, order, mark,
> +						      classzone_idx,
> +						      alloc_flags)) {
> +#ifdef CONFIG_COMPACTION
> +					preferred_zone->compact_considered = 0;
> +					preferred_zone->compact_defer_shift = 0;
> +#endif
> +					goto try_this_zone;
> +				}
> +			}
> +			/*
> +			 * reclaim if compaction failed because not
> +			 * enough memory was available or if
> +			 * compaction didn't run (order 0) or didn't
> +			 * succeed.
> +			 */
> +			if (!repeated_compaction || c_ret == COMPACT_SKIPPED) {
> +				ret = zone_reclaim(zone, gfp_mask, order);
>  				if (zone_watermark_ok(zone, order, mark,
> -						classzone_idx, alloc_flags))
> +						      classzone_idx,
> +						      alloc_flags))
>  					goto try_this_zone;
> +			}
> +			if (need_compaction &&
> +			    (!repeated_compaction ||
> +			     (c_ret == COMPACT_SKIPPED &&
> +			      ret == ZONE_RECLAIM_SUCCESS))) {
> +				repeated_compaction = true;
> +				cond_resched();
> +				goto repeat_compaction;
> +			}
> +			if (need_compaction)
> +				defer_compaction(preferred_zone, order);
>  
> -				/*
> -				 * Failed to reclaim enough to meet watermark.
> -				 * Only mark the zone full if checking the min
> -				 * watermark or if we failed to reclaim just
> -				 * 1<<order pages or else the page allocator
> -				 * fastpath will prematurely mark zones full
> -				 * when the watermark is between the low and
> -				 * min watermarks.
> -				 */
> -				if (((alloc_flags & ALLOC_WMARK_MASK) == ALLOC_WMARK_MIN) ||
> -				    ret == ZONE_RECLAIM_SOME)
> -					goto this_zone_full;
> -
> +			if (!order)
> +				goto this_zone_full;
> +			else

You do the works should be done in slow path, is it worth?

>  				continue;
> -			}
>  		}
>  
>  try_this_zone:
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 825c631..6a65107 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3380,7 +3380,6 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  
>  int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  {
> -	int node_id;
>  	int ret;
>  
>  	/*
> @@ -3400,22 +3399,6 @@ int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
>  	if (zone->all_unreclaimable)
>  		return ZONE_RECLAIM_FULL;
>  
> -	/*
> -	 * Do not scan if the allocation should not be delayed.
> -	 */
> -	if (!(gfp_mask & __GFP_WAIT) || (current->flags & PF_MEMALLOC))
> -		return ZONE_RECLAIM_NOSCAN;
> -
> -	/*
> -	 * Only run zone reclaim on the local zone or on zones that do not
> -	 * have associated processors. This will favor the local processor
> -	 * over remote processors and spread off node memory allocations
> -	 * as wide as possible.
> -	 */
> -	node_id = zone_to_nid(zone);
> -	if (node_state(node_id, N_CPU) && node_id != numa_node_id())
> -		return ZONE_RECLAIM_NOSCAN;
> -
>  	ret = __zone_reclaim(zone, gfp_mask, order);
>  
>  	if (!ret)
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2013-07-12 23:57 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-05 15:10 [PATCH 0/7] RFC: adding compaction to zone_reclaim_mode > 0 Andrea Arcangeli
2013-06-05 15:10 ` [PATCH 1/7] mm: remove ZONE_RECLAIM_LOCKED Andrea Arcangeli
2013-06-05 19:23   ` Rik van Riel
2013-06-05 20:37   ` KOSAKI Motohiro
2013-06-05 20:51     ` Christoph Lameter
2013-06-05 21:03       ` KOSAKI Motohiro
2013-06-06 14:15         ` Christoph Lameter
2013-06-06 17:17           ` KOSAKI Motohiro
2013-06-06 18:16             ` Christoph Lameter
2013-06-05 21:33   ` Rafael Aquini
2013-06-06  9:04   ` Mel Gorman
2013-06-06 17:37     ` Rik van Riel
2013-06-14 16:16       ` Rik van Riel
2013-06-17  9:30         ` Mel Gorman
2013-06-17 18:12           ` Rik van Riel
2013-06-26 20:10             ` Andrea Arcangeli
2013-06-05 15:10 ` [PATCH 2/7] mm: compaction: scan all memory with /proc/sys/vm/compact_memory Andrea Arcangeli
2013-06-05 19:34   ` Rik van Riel
2013-06-05 21:39   ` Rafael Aquini
2013-06-06  9:05   ` Mel Gorman
2013-06-05 15:10 ` [PATCH 3/7] mm: compaction: don't depend on kswapd to invoke reset_isolation_suitable Andrea Arcangeli
2013-06-05 19:49   ` Rik van Riel
2013-06-26 20:38     ` Andrea Arcangeli
2013-06-06  9:11   ` Mel Gorman
2013-06-26 20:48     ` Andrea Arcangeli
2013-06-06 12:47   ` Rafael Aquini
2013-06-05 15:10 ` [PATCH 4/7] mm: compaction: reset before initializing the scan cursors Andrea Arcangeli
2013-06-05 20:04   ` Rik van Riel
2013-06-06  9:14   ` Mel Gorman
2013-06-06 12:49   ` Rafael Aquini
2013-06-05 15:10 ` [PATCH 5/7] mm: compaction: increase the high order pages in the watermarks Andrea Arcangeli
2013-06-05 20:18   ` Rik van Riel
2013-06-28 22:14     ` Andrea Arcangeli
2013-06-06  9:19   ` Mel Gorman
2013-06-05 15:10 ` [PATCH 6/7] mm: compaction: export compact_zone_order() Andrea Arcangeli
2013-06-05 20:24   ` Rik van Riel
2013-06-05 15:10 ` [PATCH 7/7] mm: compaction: add compaction to zone_reclaim_mode Andrea Arcangeli
2013-06-05 22:21   ` Rik van Riel
2013-06-06 10:05   ` Mel Gorman
2013-07-11 16:02     ` Andrea Arcangeli
2013-07-12 12:26       ` Hush Bensen
2013-07-12 16:01         ` Andrea Arcangeli
2013-07-12 23:23           ` Hush Bensen
2013-07-15  9:16             ` Andrea Arcangeli
2013-07-12 23:57   ` Hush Bensen [this message]
2013-07-15  9:25     ` Andrea Arcangeli
2013-06-06  8:53 ` [PATCH 0/7] RFC: adding compaction to zone_reclaim_mode > 0 Mel Gorman
2013-06-06 10:09 ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51E097E5.7060308@gmail.com \
    --to=hush.bensen@gmail.com \
    --cc=aarcange@redhat.com \
    --cc=aquini@redhat.com \
    --cc=hughd@google.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=richard@arachsys.com \
    --cc=riel@redhat.com \
    --cc=shli@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.