From: Andrew Morton <akpm@linux-foundation.org>
To: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Suleiman Souhlal <suleiman@google.com>,
	Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	Rafael Aquini <aquini@redhat.com>, Michal Hocko <mhocko@suse.cz>,
	Yuanhan Liu <yuanhan.liu@linux.intel.com>,
	Seth Jennings <sjennings@variantweb.net>,
	Bob Liu <bob.liu@oracle.com>, Minchan Kim <minchan@kernel.org>,
	Luigi Semenzato <semenzato@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: Only force scan in reclaim when none of the LRUs are big enough.
Date: Tue, 1 Apr 2014 12:49:13 -0700
Message-ID: <20140401124913.c27f190e2342d6e5c2c29277@linux-foundation.org>
In-Reply-To: <alpine.LSU.2.11.1403151957160.21388@eggly.anvils>

On Sat, 15 Mar 2014 20:36:02 -0700 (PDT) Hugh Dickins <hughd@google.com> wrote:

> From: Suleiman Souhlal <suleiman@google.com>
> 
> Prior to this change, we would decide whether to force scan a LRU
> during reclaim if that LRU itself was too small for the current
> priority. However, this can lead to the file LRU getting force
> scanned even if there are a lot of anonymous pages we can reclaim,
> leading to hot file pages getting needlessly reclaimed.

Struggling a bit here.  You're referring to this code?

			size = get_lru_size(lruvec, lru);
			scan = size >> sc->priority;

			if (!scan && force_scan)
				scan = min(size, SWAP_CLUSTER_MAX);

So we're talking about the case where the LRU is so small that it
contains fewer than (1<<sc->priority) pages?

If so, then I'd expect that in normal operation this situation rarely
occurs?  Surely the LRUs normally contain many more pages than this.
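
To put numbers on that threshold, here is a small userspace sketch of
the two quoted lines.  It assumes DEF_PRIORITY is 12 and
SWAP_CLUSTER_MAX is 32, which I believe are the values in this tree;
scan_target() is a name made up for the example, not a kernel function.

```c
#include <assert.h>

/* Assumed values, believed to match the 3.14-era kernel. */
#define DEF_PRIORITY		12	/* priority at which reclaim starts */
#define SWAP_CLUSTER_MAX	32UL

/*
 * Hypothetical helper: scan target for a single LRU of `size` pages,
 * mirroring "scan = size >> sc->priority" plus the force_scan fallback.
 */
static unsigned long scan_target(unsigned long size, int priority,
				 int force_scan)
{
	unsigned long scan;

	/* Keep the shift within the range reclaim actually uses. */
	assert(priority >= 0 && priority <= DEF_PRIORITY);

	scan = size >> priority;

	/* The shift rounded down to nothing: fall back to a small batch. */
	if (!scan && force_scan)
		scan = size < SWAP_CLUSTER_MAX ? size : SWAP_CLUSTER_MAX;

	return scan;
}
```

So at the starting priority a list needs at least 4096 pages (16MB
with 4k pages, another assumption) before the shift yields anything.
Below that, only force_scan produces a nonzero target.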

> To address this, we instead only force scan when none of the
> reclaimable LRUs are big enough.
> 
> Gives huge improvements with zswap. For example, when doing -j20
> kernel build in a 500MB container with zswap enabled, runtime (in
> seconds) is greatly reduced:
> 
> x without this change
> + with this change
>     N           Min           Max        Median           Avg        Stddev
> x   5       700.997       790.076       763.928        754.05      39.59493
> +   5       141.634       197.899       155.706         161.9     21.270224
> Difference at 95.0% confidence
>         -592.15 +/- 46.3521
>         -78.5293% +/- 6.14709%
>         (Student's t, pooled s = 31.7819)

And yet the patch makes a large difference.  What am I missing here?

> --- 3.14-rc6/mm/vmscan.c	2014-02-02 18:49:07.949302116 -0800
> +++ linux/mm/vmscan.c	2014-03-15 19:31:44.948977032 -0700
> @@ -1971,39 +1973,49 @@ static void get_scan_count(struct lruvec
>  	fraction[1] = fp;
>  	denominator = ap + fp + 1;
>  out:
> -	for_each_evictable_lru(lru) {
> -		int file = is_file_lru(lru);
> -		unsigned long size;
> -		unsigned long scan;
> -
> -		size = get_lru_size(lruvec, lru);
> -		scan = size >> sc->priority;
> -
> -		if (!scan && force_scan)
> -			scan = min(size, SWAP_CLUSTER_MAX);
> -
> -		switch (scan_balance) {
> -		case SCAN_EQUAL:
> -			/* Scan lists relative to size */
> -			break;
> -		case SCAN_FRACT:
> +	some_scanned = false;
> +	/* Only use force_scan on second pass. */

That's a poor comment.

> +	for (pass = 0; !some_scanned && pass < 2; pass++) {
> +		for_each_evictable_lru(lru) {
> +			int file = is_file_lru(lru);
> +			unsigned long size;
> +			unsigned long scan;
> +
> +			size = get_lru_size(lruvec, lru);
> +			scan = size >> sc->priority;
> +
> +			if (!scan && pass && force_scan)
> +				scan = min(size, SWAP_CLUSTER_MAX);
> +
> +			switch (scan_balance) {
> +			case SCAN_EQUAL:
> +				/* Scan lists relative to size */
> +				break;
> +			case SCAN_FRACT:
> +				/*
> +				 * Scan types proportional to swappiness and
> +				 * their relative recent reclaim efficiency.
> +				 */
> +				scan = div64_u64(scan * fraction[file],
> +							denominator);
> +				break;
> +			case SCAN_FILE:
> +			case SCAN_ANON:
> +				/* Scan one type exclusively */
> +				if ((scan_balance == SCAN_FILE) != file)
> +					scan = 0;
> +				break;
> +			default:
> +				/* Look ma, no brain */
> +				BUG();
> +			}
> +			nr[lru] = scan;
>  			/*
> -			 * Scan types proportional to swappiness and
> -			 * their relative recent reclaim efficiency.
> +			 * Skip the second pass and don't force_scan,
> +			 * if we found something to scan.

And so is that.  Both comments explain *what* the code is doing (which
was fairly obvious from the code!) but they fail to explain *why* the
code is doing what it does.

>  			 */
> -			scan = div64_u64(scan * fraction[file], denominator);
> -			break;
> -		case SCAN_FILE:
> -		case SCAN_ANON:
> -			/* Scan one type exclusively */
> -			if ((scan_balance == SCAN_FILE) != file)
> -				scan = 0;
> -			break;
> -		default:
> -			/* Look ma, no brain */
> -			BUG();
> +			some_scanned |= !!scan;

Also the "and don't force_scan" part appears to be flatly untrue.  Either
the comment is wrong or the code should be along the lines of

	if (scan) {
		some_scanned = true;
		force_scan = false;
	}

Can we fix these things please?  And retest if necessary.
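
For what it's worth, here is how I read the posted loop, written as a
userspace model so the behaviour can be checked by hand.  It covers the
SCAN_EQUAL case only.  compute_scan(), NR_LRU and the flat size[] array
are stand-ins invented for the example, and SWAP_CLUSTER_MAX is assumed
to be 32.  It models the patch as posted, so force_scan is not cleared
when something is found.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_LRU			4	/* stand-in for the evictable LRUs */
#define SWAP_CLUSTER_MAX	32UL	/* assumed value */

/*
 * Hypothetical model of the two-pass loop, SCAN_EQUAL case only.
 * Fills nr[] with per-LRU scan targets and returns true if any
 * target is nonzero.
 */
static bool compute_scan(const unsigned long size[NR_LRU], int priority,
			 bool force_scan, unsigned long nr[NR_LRU])
{
	bool some_scanned = false;
	int pass, lru;

	assert(priority >= 0 && priority <= 12);

	/*
	 * Forcing a scan of a small LRU while another LRU is large
	 * enough to be scanned normally can reclaim hot pages from the
	 * small one for no good reason.  So pass 0 scans strictly in
	 * proportion to size, and force_scan is honoured in pass 1
	 * only when pass 0 found nothing to scan on any LRU.
	 */
	for (pass = 0; !some_scanned && pass < 2; pass++) {
		for (lru = 0; lru < NR_LRU; lru++) {
			unsigned long scan = size[lru] >> priority;

			if (!scan && pass && force_scan)
				scan = size[lru] < SWAP_CLUSTER_MAX ?
					size[lru] : SWAP_CLUSTER_MAX;

			nr[lru] = scan;
			if (scan)
				some_scanned = true;
		}
	}
	return some_scanned;
}
```

With sizes {100, 100, 100000, 50} at priority 12, only the third list
gets a target and the small ones are left alone.  Forcing happens only
when every list shifts down to zero.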

>  		}
> -		nr[lru] = scan;
>  	}
>  }
>  



Thread overview: 4+ messages
2014-03-16  3:36 [PATCH] mm: Only force scan in reclaim when none of the LRUs are big enough Hugh Dickins
2014-03-27 20:41 ` Rik van Riel
2014-03-28 18:10 ` Rafael Aquini
2014-04-01 19:49 ` Andrew Morton [this message]
