From: Johannes Weiner <hannes@cmpxchg.org>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Rik van Riel <riel@redhat.com>, Michal Hocko <mhocko@suse.cz>,
	Kent Overstreet <kent.overstreet@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: vmscan: Stop reclaim/compaction earlier due to insufficient progress if !__GFP_REPEAT
Date: Wed, 16 Feb 2011 13:38:57 +0100
Message-ID: <20110216123857.GE2380@cmpxchg.org>
In-Reply-To: <20110216095048.GA4473@csn.ul.ie>

On Wed, Feb 16, 2011 at 09:50:49AM +0000, Mel Gorman wrote:
> should_continue_reclaim() for reclaim/compaction allows scanning to continue
> even if pages are not being reclaimed until the full list is scanned. In
> terms of allocation success, this makes sense but potentially it introduces
> unwanted latency for high-order allocations such as transparent hugepages
> and network jumbo frames that would prefer to fail the allocation attempt
> and fall back to order-0 pages.  Worse, there is a potential that the full
> LRU scan will clear all the young bits, distort page aging information and
> potentially push pages into swap that would have otherwise remained resident.
> 
> This patch will stop reclaim/compaction if no pages were reclaimed in the
> last SWAP_CLUSTER_MAX pages that were considered. For allocations such as
> hugetlbfs that use __GFP_REPEAT and have fewer fallback options, the full LRU
> list may still be scanned.
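
For readers following along, the stopping rule described above can be
sketched in user-space C. This is only an illustration under stated
assumptions, not the patch itself: sketch_keep_reclaiming() and
SKETCH_GFP_REPEAT are hypothetical names, the flag value is arbitrary,
and the real should_continue_reclaim() may make further checks that are
not modelled here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's __GFP_REPEAT bit; value is arbitrary. */
#define SKETCH_GFP_REPEAT 0x1u

/*
 * Decide whether another reclaim/compaction pass looks worthwhile.
 * nr_reclaimed and nr_scanned describe the most recent batch only
 * (a batch being on the order of SWAP_CLUSTER_MAX pages).
 */
static bool sketch_keep_reclaiming(unsigned int gfp_mask,
				   unsigned long nr_reclaimed,
				   unsigned long nr_scanned)
{
	if (gfp_mask & SKETCH_GFP_REPEAT) {
		/*
		 * Few fallback options: keep going until a batch
		 * neither reclaims nor scans anything, i.e. the
		 * list has been exhausted.
		 */
		return nr_reclaimed != 0 || nr_scanned != 0;
	}

	/*
	 * The caller can fail and fall back to order-0 pages:
	 * stop as soon as a batch reclaims nothing.
	 */
	return nr_reclaimed != 0;
}
```

With these assumptions, a batch that scans 32 pages and reclaims none
ends reclaim for an ordinary high-order request, while a request
carrying the repeat flag keeps scanning.
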
> 
> To test this, a tool was developed based on ftrace that tracked the latency of
> high-order allocations while transparent hugepage support was enabled and three
> benchmarks were run. The "fix-infinite" figures are 2.6.38-rc4 with Johannes's
> patch "vmscan: fix zone shrinking exit when scan work is done" applied.
> 
> STREAM Highorder Allocation Latency Statistics
> 	       fix-infinite	break-early
> 1 :: Count            10298           10229
> 1 :: Min             0.4560          0.4640
> 1 :: Mean            1.0589          1.0183
> 1 :: Max            14.5990         11.7510
> 1 :: Stddev          0.5208          0.4719
> 2 :: Count                2               1
> 2 :: Min             1.8610          3.7240
> 2 :: Mean            3.4325          3.7240
> 2 :: Max             5.0040          3.7240
> 2 :: Stddev          1.5715          0.0000
> 9 :: Count           111696          111694
> 9 :: Min             0.5230          0.4110
> 9 :: Mean           10.5831         10.5718
> 9 :: Max            38.4480         43.2900
> 9 :: Stddev          1.1147          1.1325
> 
> Mean time for order-1 allocations is reduced. order-2 looks increased
> but with so few allocations, it's not particularly significant. THP mean
> allocation latency is also reduced. That said, allocation time varies so
> significantly that the reductions are within noise.
> 
> Max allocation time is reduced by a significant amount for low-order
> allocations but increased for THP allocations which presumably are now
> breaking before reclaim has done enough work.
> 
> SysBench Highorder Allocation Latency Statistics
> 	       fix-infinite	break-early
> 1 :: Count            15745           15677
> 1 :: Min             0.4250          0.4550
> 1 :: Mean            1.1023          1.0810
> 1 :: Max            14.4590         10.8220
> 1 :: Stddev          0.5117          0.5100
> 2 :: Count                1               1
> 2 :: Min             3.0040          2.1530
> 2 :: Mean            3.0040          2.1530
> 2 :: Max             3.0040          2.1530
> 2 :: Stddev          0.0000          0.0000
> 9 :: Count             2017            1931
> 9 :: Min             0.4980          0.7480
> 9 :: Mean           10.4717         10.3840
> 9 :: Max            24.9460         26.2500
> 9 :: Stddev          1.1726          1.1966
> 
> Again, mean time for order-1 allocations is reduced while order-2 allocations
> are too few to draw conclusions from. The mean time for THP allocations is
> also slightly reduced albeit the reductions are within variance.
> 
> Once again, our maximum allocation time is significantly reduced for
> low-order allocations and slightly increased for THP allocations.
> 
> Anon stream mmap reference Highorder Allocation Latency Statistics
> 	       fix-infinite	break-early
> 1 :: Count             1376            1790
> 1 :: Min             0.4940          0.5010
> 1 :: Mean            1.0289          0.9732
> 1 :: Max             6.2670          4.2540
> 1 :: Stddev          0.4142          0.2785
> 2 :: Count                1               -
> 2 :: Min             1.9060               -
> 2 :: Mean            1.9060               -
> 2 :: Max             1.9060               -
> 2 :: Stddev          0.0000               -
> 9 :: Count            11266           11257
> 9 :: Min             0.4990          0.4940
> 9 :: Mean        27250.4669      24256.1919
> 9 :: Max      11439211.0000    6008885.0000
> 9 :: Stddev     226427.4624     186298.1430
> 
> This benchmark creates one thread per CPU which references an amount of
> anonymous memory 1.5 times the size of physical RAM. This pounds swap quite
> heavily and is intended to exercise THP a bit.
> 
> Mean allocation time for order-1 is reduced as before. It's also reduced
> for THP allocations but the variations here are pretty massive due to swap.
> As before, maximum allocation times are significantly reduced.
> 
> Overall, the patch reduces the mean and maximum allocation latencies for
> the smaller high-order allocations. This was with Slab configured so it
> would be expected to be more significant with Slub which uses these size
> allocations more aggressively.
> 
> The mean allocation times for THP allocations are also slightly reduced.
> The maximum latency was slightly increased as predicted by the comments due
> to reclaim/compaction breaking early. However, workloads care more about the
> latency of lower-order allocations than THP so it's an acceptable trade-off.
> Please consider merging for 2.6.38.
> 
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
