* [patch] mm: vmscan: do not swap anon pages just because free+file is low
@ 2014-03-14 15:35 Johannes Weiner
2014-03-14 16:06 ` Rik van Riel
2014-03-14 17:06 ` Rafael Aquini
0 siblings, 2 replies; 7+ messages in thread
From: Johannes Weiner @ 2014-03-14 15:35 UTC (permalink / raw)
To: Andrew Morton; +Cc: Rik van Riel, Mel Gorman, linux-mm, linux-kernel
Page reclaim force-scans / swaps anonymous pages when file cache drops
below the high watermark of a zone in order to prevent what little
cache remains from thrashing.

However, on bigger machines the high watermark value can be quite
large and when the workload is dominated by a static anonymous/shmem
set, the file set might just be a small window of used-once cache. In
such situations, the VM starts swapping heavily when instead it should
be recycling the no longer used cache.

This is a longer-standing problem, but it's more likely to trigger
after 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy")
because file pages can no longer accumulate in a single zone and are
dispersed into smaller fractions among the available zones.

To resolve this, do not force scan anon when file pages are low but
instead rely on the scan/rotation ratios to make the right prediction.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@kernel.org> [3.12+]
---
mm/vmscan.c | 16 +---------------
1 file changed, 1 insertion(+), 15 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a9c74b409681..e58e9ad5b5d1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1848,7 +1848,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 	struct zone *zone = lruvec_zone(lruvec);
 	unsigned long anon_prio, file_prio;
 	enum scan_balance scan_balance;
-	unsigned long anon, file, free;
+	unsigned long anon, file;
 	bool force_scan = false;
 	unsigned long ap, fp;
 	enum lru_list lru;
@@ -1902,20 +1902,6 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 		get_lru_size(lruvec, LRU_INACTIVE_FILE);
 
 	/*
-	 * If it's foreseeable that reclaiming the file cache won't be
-	 * enough to get the zone back into a desirable shape, we have
-	 * to swap. Better start now and leave the - probably heavily
-	 * thrashing - remaining file pages alone.
-	 */
-	if (global_reclaim(sc)) {
-		free = zone_page_state(zone, NR_FREE_PAGES);
-		if (unlikely(file + free <= high_wmark_pages(zone))) {
-			scan_balance = SCAN_ANON;
-			goto out;
-		}
-	}
-
-	/*
 	 * There is enough inactive page cache, do not reclaim
 	 * anything from the anonymous working set right now.
 	 */
--
1.9.0
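
Editorial illustration (not part of the posted patch): a minimal
userspace sketch of the condition that the second hunk removes. It
assumes that "file" is the sum of the active and inactive file LRU
sizes, as the hunk context suggests. struct zone_model and
old_check_forces_anon() are hypothetical stand-ins for the kernel's
per-zone counters and are not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the counters the old check read. */
struct zone_model {
	unsigned long nr_free;    /* free pages in the zone */
	unsigned long nr_file;    /* active + inactive file LRU pages (assumed) */
	unsigned long high_wmark; /* high watermark of the zone, in pages */
};

/*
 * Models the removed test: file + free <= high_wmark_pages(zone).
 * When this returned true during global reclaim, the old code set
 * scan_balance = SCAN_ANON and skipped the file LRUs entirely.
 */
static bool old_check_forces_anon(const struct zone_model *z)
{
	return z->nr_file + z->nr_free <= z->high_wmark;
}
```

With made-up numbers: a zone holding 10000 file pages and 20000 free
pages against a high watermark of 32768 pages satisfies the condition
(30000 <= 32768), so the old code would have chosen SCAN_ANON even
though 10000 pages of cache were available to recycle. The same 10000
file pages with 2000 free pages against a watermark of 4096 would not
have triggered it.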
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [patch] mm: vmscan: do not swap anon pages just because free+file is low
2014-03-14 15:35 [patch] mm: vmscan: do not swap anon pages just because free+file is low Johannes Weiner
@ 2014-03-14 16:06 ` Rik van Riel
2014-03-14 17:08 ` Mel Gorman
2014-03-14 17:06 ` Rafael Aquini
1 sibling, 1 reply; 7+ messages in thread
From: Rik van Riel @ 2014-03-14 16:06 UTC (permalink / raw)
To: Johannes Weiner, Andrew Morton; +Cc: Mel Gorman, linux-mm, linux-kernel
On 03/14/2014 11:35 AM, Johannes Weiner wrote:
> Page reclaim force-scans / swaps anonymous pages when file cache drops
> below the high watermark of a zone in order to prevent what little
> cache remains from thrashing.
>
> However, on bigger machines the high watermark value can be quite
> large and when the workload is dominated by a static anonymous/shmem
> set, the file set might just be a small window of used-once cache. In
> such situations, the VM starts swapping heavily when instead it should
> be recycling the no longer used cache.
>
> This is a longer-standing problem, but it's more likely to trigger
> after 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy")
> because file pages can no longer accumulate in a single zone and are
> dispersed into smaller fractions among the available zones.
>
> To resolve this, do not force scan anon when file pages are low but
> instead rely on the scan/rotation ratios to make the right prediction.
I am not entirely sure that the scan/rotation ratio will be
meaningful when the page cache has been essentially depleted,
but on larger systems the distance between the low and high
watermark is gigantic, and I have no better idea on how to
fix the bug you encountered, so ...
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: <stable@kernel.org> [3.12+]
Reviewed-by: Rik van Riel <riel@redhat.com>
--
All rights reversed
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] mm: vmscan: do not swap anon pages just because free+file is low
2014-03-14 16:06 ` Rik van Riel
@ 2014-03-14 17:08 ` Mel Gorman
2014-03-16 4:20 ` Hugh Dickins
0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2014-03-14 17:08 UTC (permalink / raw)
To: Rik van Riel
Cc: Johannes Weiner, Andrew Morton, linux-mm, linux-kernel, mhocko
On Fri, Mar 14, 2014 at 12:06:25PM -0400, Rik van Riel wrote:
> On 03/14/2014 11:35 AM, Johannes Weiner wrote:
> > Page reclaim force-scans / swaps anonymous pages when file cache drops
> > below the high watermark of a zone in order to prevent what little
> > cache remains from thrashing.
> >
> > However, on bigger machines the high watermark value can be quite
> > large and when the workload is dominated by a static anonymous/shmem
> > set, the file set might just be a small window of used-once cache. In
> > such situations, the VM starts swapping heavily when instead it should
> > be recycling the no longer used cache.
> >
> > This is a longer-standing problem, but it's more likely to trigger
> > after 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy")
> > because file pages can no longer accumulate in a single zone and are
> > dispersed into smaller fractions among the available zones.
> >
> > To resolve this, do not force scan anon when file pages are low but
> > instead rely on the scan/rotation ratios to make the right prediction.
>
> I am not entirely sure that the scan/rotation ratio will be
> meaningful when the page cache has been essentially depleted,
> but on larger systems the distance between the low and high
> watermark is gigantic, and I have no better idea on how to
> fix the bug you encountered, so ...
>
I still agree with the direction in general even though I've not put
thought into this specific patch yet. We've observed a problem whereby force
reclaim was causing one or the other LRU list to be trashed. In one specific
case, the "inactive file is low" logic was causing problems because while
the relative size of inactive/active was taken into account, the absolute
size vs anon was not. It was not a mainline kernel and we do not have a
test configuration that properly illustrates the problem on mainline, but
it's on our radar as a potential problem. The scan/rotation ratio at the
moment does not take absolute sizes into account but we almost certainly
want to go in that direction at some stage. Hugh's patch on altering how
proportional shrinking works is also relevant.
--
Mel Gorman
SUSE Labs
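
Editorial illustration (not from the thread): a simplified userspace
model of a scan/rotation ratio, to make concrete the point that such a
ratio takes no absolute LRU sizes as input. It is loosely modeled on
the ap/fp arithmetic in get_scan_count() as best understood for
kernels of this era. The "200 - swappiness" weighting and the "+ 1"
terms are assumptions that should be checked against the tree, and
struct lru_stat, lru_pressure() and anon_share_percent() are
hypothetical names.

```c
/* Hypothetical model of recent reclaim statistics for one LRU type. */
struct lru_stat {
	unsigned long recent_scanned; /* pages recently scanned */
	unsigned long recent_rotated; /* scanned pages found in use and kept */
};

/*
 * Pressure on one LRU type: a priority weight scaled by scanned/rotated.
 * Many rotations (pages still in use) lower the pressure on that list.
 */
static unsigned long lru_pressure(unsigned long prio, const struct lru_stat *s)
{
	return prio * (s->recent_scanned + 1) / (s->recent_rotated + 1);
}

/*
 * Percentage of scan effort aimed at the anon lists. Note that no
 * list size appears anywhere in the inputs.
 */
static unsigned long anon_share_percent(unsigned long swappiness,
					const struct lru_stat *anon,
					const struct lru_stat *file)
{
	unsigned long ap = lru_pressure(swappiness, anon);
	unsigned long fp = lru_pressure(200 - swappiness, file);

	return 100 * ap / (ap + fp + 1);
}
```

In this model, a heavily rotated anon set next to used-once file cache
pushes almost all scan effort to the file lists, whatever the sizes of
the two sets happen to be.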
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] mm: vmscan: do not swap anon pages just because free+file is low
2014-03-14 17:08 ` Mel Gorman
@ 2014-03-16 4:20 ` Hugh Dickins
2014-03-17 15:15 ` Johannes Weiner
0 siblings, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2014-03-16 4:20 UTC (permalink / raw)
To: Johannes Weiner
Cc: Mel Gorman, Rik van Riel, Andrew Morton, Michal Hocko,
Rafael Aquini, Suleiman Souhlal, linux-mm, linux-kernel
On Fri, 14 Mar 2014, Mel Gorman wrote:
> On Fri, Mar 14, 2014 at 12:06:25PM -0400, Rik van Riel wrote:
> > On 03/14/2014 11:35 AM, Johannes Weiner wrote:
> > > Page reclaim force-scans / swaps anonymous pages when file cache drops
> > > below the high watermark of a zone in order to prevent what little
> > > cache remains from thrashing.
> > >
> > > However, on bigger machines the high watermark value can be quite
> > > large and when the workload is dominated by a static anonymous/shmem
> > > set, the file set might just be a small window of used-once cache. In
> > > such situations, the VM starts swapping heavily when instead it should
> > > be recycling the no longer used cache.
> > >
> > > This is a longer-standing problem, but it's more likely to trigger
> > > after 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy")
> > > because file pages can no longer accumulate in a single zone and are
> > > dispersed into smaller fractions among the available zones.
> > >
> > > To resolve this, do not force scan anon when file pages are low but
> > > instead rely on the scan/rotation ratios to make the right prediction.
> >
> > I am not entirely sure that the scan/rotation ratio will be
> > meaningful when the page cache has been essentially depleted,
> > but on larger systems the distance between the low and high
> > watermark is gigantic, and I have no better idea on how to
> > fix the bug you encountered, so ...
> >
>
> I still agree with the direction in general even though I've not put
> thought into this specific patch yet. We've observed a problem whereby force
> reclaim was causing one or the other LRU list to be trashed. In one specific
> case, the "inactive file is low" logic was causing problems because while
> the relative size of inactive/active was taken into account, the absolute
> size vs anon was not. It was not a mainline kernel and we do not have a
> test configuration that properly illustrates the problem on mainline, but
> it's on our radar as a potential problem. The scan/rotation ratio at the
> moment does not take absolute sizes into account but we almost certainly
> want to go in that direction at some stage. Hugh's patch on altering how
> proportional shrinking works is also relevant.
That https://lkml.org/lkml/2014/3/13/217
is relevant, yes, but I think rather more so is
Suleiman's in https://lkml.org/lkml/2014/3/15/168
Hannes, your patch looks reasonable to me, and as I read it would
be well complemented by Suleiman's and mine; but I do worry that
the "scan_balance = SCAN_ANON" block you're removing was inserted
for good reason, and its removal could bring complaint from some direction.
By the way, I notice you marked yours for stable [3.12+]:
if it's for stable at all, shouldn't it be for 3.9+?
(well, maybe nobody's doing a 3.9.N.M but 3.10.N is still alive).
Hugh
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] mm: vmscan: do not swap anon pages just because free+file is low
2014-03-16 4:20 ` Hugh Dickins
@ 2014-03-17 15:15 ` Johannes Weiner
2014-03-17 18:33 ` Hugh Dickins
0 siblings, 1 reply; 7+ messages in thread
From: Johannes Weiner @ 2014-03-17 15:15 UTC (permalink / raw)
To: Hugh Dickins
Cc: Mel Gorman, Rik van Riel, Andrew Morton, Michal Hocko,
Rafael Aquini, Suleiman Souhlal, linux-mm, linux-kernel
On Sat, Mar 15, 2014 at 09:20:16PM -0700, Hugh Dickins wrote:
> On Fri, 14 Mar 2014, Mel Gorman wrote:
> > On Fri, Mar 14, 2014 at 12:06:25PM -0400, Rik van Riel wrote:
> > > On 03/14/2014 11:35 AM, Johannes Weiner wrote:
> > > > Page reclaim force-scans / swaps anonymous pages when file cache drops
> > > > below the high watermark of a zone in order to prevent what little
> > > > cache remains from thrashing.
> > > >
> > > > However, on bigger machines the high watermark value can be quite
> > > > large and when the workload is dominated by a static anonymous/shmem
> > > > set, the file set might just be a small window of used-once cache. In
> > > > such situations, the VM starts swapping heavily when instead it should
> > > > be recycling the no longer used cache.
> > > >
> > > > This is a longer-standing problem, but it's more likely to trigger
> > > > after 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy")
> > > > because file pages can no longer accumulate in a single zone and are
> > > > dispersed into smaller fractions among the available zones.
> > > >
> > > > To resolve this, do not force scan anon when file pages are low but
> > > > instead rely on the scan/rotation ratios to make the right prediction.
> > >
> > > I am not entirely sure that the scan/rotation ratio will be
> > > meaningful when the page cache has been essentially depleted,
> > > but on larger systems the distance between the low and high
> > > watermark is gigantic, and I have no better idea on how to
> > > fix the bug you encountered, so ...
> > >
> >
> > I still agree with the direction in general even though I've not put
> > thought into this specific patch yet. We've observed a problem whereby force
> > reclaim was causing one or the other LRU list to be trashed. In one specific
> > case, the "inactive file is low" logic was causing problems because while
> > the relative size of inactive/active was taken into account, the absolute
> > size vs anon was not. It was not a mainline kernel and we do not have a
> > test configuration that properly illustrates the problem on mainline, but
> > it's on our radar as a potential problem. The scan/rotation ratio at the
> > moment does not take absolute sizes into account but we almost certainly
> > want to go in that direction at some stage. Hugh's patch on altering how
> > proportional shrinking works is also relevant.
>
> That https://lkml.org/lkml/2014/3/13/217
> is relevant, yes, but I think rather more so is
> Suleiman's in https://lkml.org/lkml/2014/3/15/168
>
> Hannes, your patch looks reasonable to me, and as I read it would
> be well complemented by Suleiman's and mine; but I do worry that
> the "scan_balance = SCAN_ANON" block you're removing was inserted
> for good reason, and its removal could bring complaint from some direction.
It was introduced with the original LRU split patch, but there is no
explanation why. Rik's concern now was that the scan/rotate numbers
might not be too meaningful with very little cache.
> By the way, I notice you marked yours for stable [3.12+]:
> if it's for stable at all, shouldn't it be for 3.9+?
> (well, maybe nobody's doing a 3.9.N.M but 3.10.N is still alive).
The code I'm removing is fairly old and it's only been reported to
create problems starting with the fair zone allocator in 3.12.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] mm: vmscan: do not swap anon pages just because free+file is low
2014-03-17 15:15 ` Johannes Weiner
@ 2014-03-17 18:33 ` Hugh Dickins
0 siblings, 0 replies; 7+ messages in thread
From: Hugh Dickins @ 2014-03-17 18:33 UTC (permalink / raw)
To: Johannes Weiner
Cc: Hugh Dickins, Mel Gorman, Rik van Riel, Andrew Morton,
Michal Hocko, Rafael Aquini, Suleiman Souhlal, linux-mm,
linux-kernel
On Mon, 17 Mar 2014, Johannes Weiner wrote:
> On Sat, Mar 15, 2014 at 09:20:16PM -0700, Hugh Dickins wrote:
> >
> > Hannes, your patch looks reasonable to me, and as I read it would
> > be well complemented by Suleiman's and mine; but I do worry that
> > the "scan_balance = SCAN_ANON" block you're removing was inserted
> > for good reason, and its removal could bring complaint from some direction.
>
> It was introduced with the original LRU split patch, but there is no
> explanation why. Rik's concern now was that the scan/rotate numbers
> might not be too meaningful with very little cache.
>
> > By the way, I notice you marked yours for stable [3.12+]:
> > if it's for stable at all, shouldn't it be for 3.9+?
> > (well, maybe nobody's doing a 3.9.N.M but 3.10.N is still alive).
>
> The code I'm removing is fairly old and it's only been reported to
> create problems starting with the fair zone allocator in 3.12.
Ah, you're right, thanks.
Hugh
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] mm: vmscan: do not swap anon pages just because free+file is low
2014-03-14 15:35 [patch] mm: vmscan: do not swap anon pages just because free+file is low Johannes Weiner
2014-03-14 16:06 ` Rik van Riel
@ 2014-03-14 17:06 ` Rafael Aquini
1 sibling, 0 replies; 7+ messages in thread
From: Rafael Aquini @ 2014-03-14 17:06 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Rik van Riel, Mel Gorman, linux-mm, linux-kernel
On Fri, Mar 14, 2014 at 11:35:02AM -0400, Johannes Weiner wrote:
> Page reclaim force-scans / swaps anonymous pages when file cache drops
> below the high watermark of a zone in order to prevent what little
> cache remains from thrashing.
>
> However, on bigger machines the high watermark value can be quite
> large and when the workload is dominated by a static anonymous/shmem
> set, the file set might just be a small window of used-once cache. In
> such situations, the VM starts swapping heavily when instead it should
> be recycling the no longer used cache.
>
> This is a longer-standing problem, but it's more likely to trigger
> after 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy")
> because file pages can no longer accumulate in a single zone and are
> dispersed into smaller fractions among the available zones.
>
> To resolve this, do not force scan anon when file pages are low but
> instead rely on the scan/rotation ratios to make the right prediction.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: <stable@kernel.org> [3.12+]
> ---
Acked-by: Rafael Aquini <aquini@redhat.com>
> mm/vmscan.c | 16 +---------------
> 1 file changed, 1 insertion(+), 15 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a9c74b409681..e58e9ad5b5d1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1848,7 +1848,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>  	struct zone *zone = lruvec_zone(lruvec);
>  	unsigned long anon_prio, file_prio;
>  	enum scan_balance scan_balance;
> -	unsigned long anon, file, free;
> +	unsigned long anon, file;
>  	bool force_scan = false;
>  	unsigned long ap, fp;
>  	enum lru_list lru;
> @@ -1902,20 +1902,6 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>  		get_lru_size(lruvec, LRU_INACTIVE_FILE);
>
>  	/*
> -	 * If it's foreseeable that reclaiming the file cache won't be
> -	 * enough to get the zone back into a desirable shape, we have
> -	 * to swap. Better start now and leave the - probably heavily
> -	 * thrashing - remaining file pages alone.
> -	 */
> -	if (global_reclaim(sc)) {
> -		free = zone_page_state(zone, NR_FREE_PAGES);
> -		if (unlikely(file + free <= high_wmark_pages(zone))) {
> -			scan_balance = SCAN_ANON;
> -			goto out;
> -		}
> -	}
> -
> -	/*
>  	 * There is enough inactive page cache, do not reclaim
>  	 * anything from the anonymous working set right now.
>  	 */
> --
> 1.9.0
>
^ permalink raw reply [flat|nested] 7+ messages in thread