linux-mm.kvack.org archive mirror
From: Mel Gorman <mel@csn.ul.ie>
To: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: akpm@linux-foundation.org, clameter@sgi.com, apw@shadowen.org,
	kosaki.motohiro@jp.fujitsu.com, linux-mm@kvack.org
Subject: Re: [PATCH] Smarter retry of costly-order allocations
Date: Tue, 15 Apr 2008 09:51:55 +0100
Message-ID: <20080415085154.GA20316@csn.ul.ie>
In-Reply-To: <20080411233553.GB19078@us.ibm.com>

On (11/04/08 16:35), Nishanth Aravamudan didst pronounce:
> Because of page order checks in __alloc_pages(), hugepage (and similarly
> large order) allocations will not retry unless explicitly marked
> __GFP_REPEAT. However, the current retry logic is nearly an infinite
> loop (or until reclaim makes no progress whatsoever). For these costly
> allocations, that seems like overkill and could potentially never
> terminate.
> 
> Modify try_to_free_pages() to indicate how many pages were reclaimed.
> Use that information in __alloc_pages() to eventually fail a large
> __GFP_REPEAT allocation once we have reclaimed at least as many pages
> as the allocation needs (1 << order). This relies on lumpy reclaim
> functioning as advertised. Due to fragmentation, lumpy reclaim may not
> be able to free up the order needed in one invocation, so multiple
> iterations may be required. In other words, the more fragmented memory
> is, the more retry attempts __GFP_REPEAT will make (particularly for
> higher order allocations).
> 
> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

Changelog is a lot clearer now. Thanks.

Tested-by: Mel Gorman <mel@csn.ul.ie>
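
For readers skimming the thread, the retry rule described in the changelog
can be sketched in user space roughly as follows. This is only an
illustration, not the kernel code: should_retry() and the SKETCH_* names
are hypothetical stand-ins, the flag values are made up, and the earlier
"reclaim made no progress" check in the allocator is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define SKETCH_COSTLY_ORDER 3
#define SKETCH_GFP_NORETRY  0x1u
#define SKETCH_GFP_REPEAT   0x2u
#define SKETCH_GFP_NOFAIL   0x4u

/*
 * Decide whether a failed allocation should be retried, following the
 * order of checks in the quoted hunk:
 *  - NORETRY wins over everything else;
 *  - NOFAIL always retries;
 *  - cheap orders always retry;
 *  - costly orders retry only with REPEAT, and only while fewer than
 *    (1 << order) pages have been reclaimed in total.
 */
static bool should_retry(unsigned int flags, unsigned int order,
			 unsigned long pages_reclaimed)
{
	if (flags & SKETCH_GFP_NORETRY)
		return false;
	if (flags & SKETCH_GFP_NOFAIL)
		return true;
	if (order <= SKETCH_COSTLY_ORDER)
		return true;
	return (flags & SKETCH_GFP_REPEAT) &&
	       pages_reclaimed < (1UL << order);
}
```

With these assumptions, an order-9 request marked REPEAT keeps retrying
while the running total is below 512 pages and gives up once it reaches
512.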

> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1db36da..1a0cc4d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1541,7 +1541,8 @@ __alloc_pages_internal(gfp_t gfp_mask, unsigned int order,
>  	struct task_struct *p = current;
>  	int do_retry;
>  	int alloc_flags;
> -	int did_some_progress;
> +	unsigned long did_some_progress;
> +	unsigned long pages_reclaimed = 0;
>  
>  	might_sleep_if(wait);
>  
> @@ -1691,15 +1692,26 @@ nofail_alloc:
>  	 * Don't let big-order allocations loop unless the caller explicitly
>  	 * requests that.  Wait for some write requests to complete then retry.
>  	 *
> -	 * In this implementation, either order <= PAGE_ALLOC_COSTLY_ORDER or
> -	 * __GFP_REPEAT mean __GFP_NOFAIL, but that may not be true in other
> +	 * In this implementation, order <= PAGE_ALLOC_COSTLY_ORDER
> +	 * means __GFP_NOFAIL, but that may not be true in other
>  	 * implementations.
> +	 *
> +	 * For order > PAGE_ALLOC_COSTLY_ORDER, if __GFP_REPEAT is
> +	 * specified, then we retry until we no longer reclaim any pages
> +	 * (above), or we've reclaimed an order of pages at least as
> +	 * large as the allocation's order. In both cases, if the
> +	 * allocation still fails, we stop retrying.
>  	 */
> +	pages_reclaimed += did_some_progress;
>  	do_retry = 0;
>  	if (!(gfp_mask & __GFP_NORETRY)) {
> -		if ((order <= PAGE_ALLOC_COSTLY_ORDER) ||
> -						(gfp_mask & __GFP_REPEAT))
> +		if (order <= PAGE_ALLOC_COSTLY_ORDER) {
>  			do_retry = 1;
> +		} else {
> +			if (gfp_mask & __GFP_REPEAT &&
> +				pages_reclaimed < (1 << order))
> +					do_retry = 1;
> +		}
>  		if (gfp_mask & __GFP_NOFAIL)
>  			do_retry = 1;
>  	}
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 83f42c9..d106b2c 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1319,6 +1319,9 @@ static unsigned long shrink_zones(int priority, struct zonelist *zonelist,
>   * hope that some of these pages can be written.  But if the allocating task
>   * holds filesystem locks which prevent writeout this might not work, and the
>   * allocation attempt will fail.
> + *
> + * returns:	0, if no pages reclaimed
> + * 		else, the number of pages reclaimed
>   */
>  static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>  					struct scan_control *sc)
> @@ -1368,7 +1371,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>  		}
>  		total_scanned += sc->nr_scanned;
>  		if (nr_reclaimed >= sc->swap_cluster_max) {
> -			ret = 1;
> +			ret = nr_reclaimed;
>  			goto out;
>  		}
>  
> @@ -1391,7 +1394,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>  	}
>  	/* top priority shrink_caches still had more to do? don't OOM, then */
>  	if (!sc->all_unreclaimable && scan_global_lru(sc))
> -		ret = 1;
> +		ret = nr_reclaimed;
>  out:
>  	/*
>  	 * Now that we've scanned all the zones at this priority level, note
> 
> -- 
> Nishanth Aravamudan <nacc@us.ibm.com>
> IBM Linux Technology Center
> 
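
To illustrate why the vmscan.c hunks return a page count instead of 1, here
is a small user-space model of the worst case where the allocation keeps
failing after every reclaim pass. It is a sketch under stated assumptions:
count_retries() and the freed_per_pass array are hypothetical, and the real
allocator retries the allocation between passes and may succeed earlier.

```c
#include <stddef.h>

/*
 * Model of the retry loop for a costly-order __GFP_REPEAT allocation.
 * freed_per_pass[i] plays the role of the value returned by reclaim on
 * pass i. Returns how many reclaim passes made progress before the loop
 * stopped.
 */
static unsigned long count_retries(unsigned int order,
				   const unsigned long *freed_per_pass,
				   size_t npasses)
{
	unsigned long pages_reclaimed = 0;
	unsigned long attempts = 0;
	size_t i;

	for (i = 0; i < npasses; i++) {
		unsigned long did_some_progress = freed_per_pass[i];

		/* Reclaim freed nothing: stop, as the allocator would. */
		if (!did_some_progress)
			break;

		attempts++;
		pages_reclaimed += did_some_progress;

		/* Reclaimed enough pages to cover the request: stop. */
		if (pages_reclaimed >= (1UL << order))
			break;
	}
	return attempts;
}
```

If every pass reports 32 pages, an order-9 request stops after 16 passes.
If reclaim only ever reported 1, as the old boolean-style return did, the
same loop would need 512 passes to reach the same threshold.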

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

