* [PATCH 0/3] Reduce amount of time kswapd sleeps prematurely v2

From: Mel Gorman @ 2017-03-09  7:56 UTC
To: Andrew Morton
Cc: Shantanu Goel, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM, Mel Gorman

Changelog since v1
o Rebase to 4.11-rc1
o Add small clarifying comment based on review

The series is unusual in that the first patch fixes one problem and
introduces a number of other issues that are noted in the changelog.
Patch 2 makes a minor modification that is worth considering on its own
but leaves the kernel in a state where it behaves badly. It's not until
patch 3 that there is an improvement against baseline.

This was mostly motivated by examining Chris Mason's "simoop" benchmark
which puts the VM under pressure similar to HADOOP. It has been reported
that the benchmark has regressed severely over the last several releases.
While I cannot reproduce all the same problems Chris experienced due to
hardware limitations, there were a number of problems on a 2-socket
machine with a single disk.

simoop latencies
                                         4.11.0-rc1             4.11.0-rc1
                                            vanilla           keepawake-v2
Amean    p50-Read             21670074.18 (  0.00%) 22668332.52 ( -4.61%)
Amean    p95-Read             25456267.64 (  0.00%) 26738688.00 ( -5.04%)
Amean    p99-Read             29369064.73 (  0.00%) 30991404.52 ( -5.52%)
Amean    p50-Write                1390.30 (  0.00%)      924.91 ( 33.47%)
Amean    p95-Write              412901.57 (  0.00%)     1362.62 ( 99.67%)
Amean    p99-Write             6668722.09 (  0.00%)    16854.04 ( 99.75%)
Amean    p50-Allocation          78714.31 (  0.00%)    74729.74 (  5.06%)
Amean    p95-Allocation         175533.51 (  0.00%)   101609.74 ( 42.11%)
Amean    p99-Allocation         247003.02 (  0.00%)   125765.57 ( 49.08%)

These are latencies. Read/write are threads reading fixed-size random
blocks from a simulated database. The allocation latency is mmaping and
faulting regions of memory. The p50, p95 and p99 figures report the worst
latency experienced by 50%, 95% and 99% of the samples respectively. For
example, the report indicates that while the test was running, 99% of
writes completed 99.75% faster. It's worth noting that on a UMA machine
no difference in performance with simoop was observed, so mileage will
vary.

It's noted that there is a slight impact to read latencies but it's
mostly due to IO scheduler decisions and is offset by the large reduction
in other latencies.

 mm/memory_hotplug.c |   2 +-
 mm/vmscan.c         | 136 ++++++++++++++++++++++++++++++----------------------
 2 files changed, 79 insertions(+), 59 deletions(-)

-- 
2.11.0
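[To make the percentile reporting above concrete, here is a toy,
self-contained sketch; this is not simoop or mmtests code and the sample
values are invented, it only illustrates how a pXX figure is derived from
raw latency samples.]

	#include <stdio.h>
	#include <stdlib.h>

	static int cmp_u64(const void *a, const void *b)
	{
		unsigned long long x = *(const unsigned long long *)a;
		unsigned long long y = *(const unsigned long long *)b;

		return (x > y) - (x < y);
	}

	/* Latency that `pct` percent of the sorted samples fall at or below. */
	static unsigned long long percentile(unsigned long long *s, size_t n, int pct)
	{
		size_t idx = (n * pct) / 100;

		qsort(s, n, sizeof(*s), cmp_u64);
		return s[idx >= n ? n - 1 : idx];
	}

	int main(void)
	{
		unsigned long long lat[] = { 120, 95, 4000, 110, 130, 105, 90, 115, 100, 125 };
		size_t n = sizeof(lat) / sizeof(lat[0]);

		printf("p50=%llu p95=%llu p99=%llu\n",
		       percentile(lat, n, 50), percentile(lat, n, 95),
		       percentile(lat, n, 99));
		return 0;
	}

Note how one outlier sample dominates the p95/p99 figures while barely
moving p50; this is why the tables track all three.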
* [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep

From: Mel Gorman @ 2017-03-09  7:56 UTC
To: Andrew Morton
Cc: Shantanu Goel, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM, Mel Gorman

From: Shantanu Goel <sgoel01@yahoo.com>

The check in prepare_kswapd_sleep needs to match the one in balance_pgdat
since the latter will return as soon as any one of the zones in the
classzone is above the watermark. This is especially important for
higher-order allocations since balance_pgdat will typically reset the
order to zero, relying on compaction to create the higher-order pages.
Without this patch, prepare_kswapd_sleep fails to wake up kcompactd since
the zone balance check fails.

It was first reported against 4.9.7 that kswapd is failing to wake up
kcompactd due to a mismatch in the zone balance check between
balance_pgdat() and prepare_kswapd_sleep(). balance_pgdat() returns as
soon as a single zone satisfies the allocation but prepare_kswapd_sleep()
requires all zones to do the same. This causes prepare_kswapd_sleep() to
never succeed except in the order == 0 case and consequently,
wakeup_kcompactd() is never called. For the machine that originally
motivated this patch, the state of compaction from /proc/vmstat looked
this way after a day and a half of uptime:

compact_migrate_scanned 240496
compact_free_scanned 76238632
compact_isolated 123472
compact_stall 1791
compact_fail 29
compact_success 1762
compact_daemon_wake 0

After applying the patch and about 10 hours of uptime the state looks
like this:

compact_migrate_scanned 59927299
compact_free_scanned 2021075136
compact_isolated 640926
compact_stall 4
compact_fail 2
compact_success 2
compact_daemon_wake 5160

Further notes from Mel that motivated him to pick this patch up and
resend it:

It was observed for the simoop workload (pressures the VM similarly to
HADOOP) that kswapd was failing to keep ahead of direct reclaim. The
investigation noted that there was a need to rationalise kswapd decisions
to reclaim with kswapd decisions to sleep. With this patch on a 2-socket
box, there was a 49% reduction in direct reclaim scanning.

However, the impact otherwise is extremely negative. Kswapd reclaim
efficiency dropped from 98% to 76%. simoop has three latency-related
metrics for read, write and allocation (an anonymous mmap and fault).

                                         4.11.0-rc1             4.11.0-rc1
                                            vanilla            fixcheck-v2
Amean    p50-Read             21670074.18 (  0.00%) 20464344.18 (  5.56%)
Amean    p95-Read             25456267.64 (  0.00%) 25721423.64 ( -1.04%)
Amean    p99-Read             29369064.73 (  0.00%) 30174230.76 ( -2.74%)
Amean    p50-Write                1390.30 (  0.00%)     1395.28 ( -0.36%)
Amean    p95-Write              412901.57 (  0.00%)    37737.74 ( 90.86%)
Amean    p99-Write             6668722.09 (  0.00%)   666489.04 ( 90.01%)
Amean    p50-Allocation          78714.31 (  0.00%)    86286.22 ( -9.62%)
Amean    p95-Allocation         175533.51 (  0.00%)   351812.27 (-100.42%)
Amean    p99-Allocation         247003.02 (  0.00%)  6291171.56 (-2447.00%)

Of greater concern is that the patch causes swapping, and page writes
from kswapd context rose from 0 pages to 4189753 pages during the hour
the workload ran for.
By and large, the patch has very bad behaviour, but this is easily missed
as the impact on a UMA machine is negligible.

This patch is included with the data in case a bisection leads to this
area. This patch is also a pre-requisite for the rest of the series.

Signed-off-by: Shantanu Goel <sgoel01@yahoo.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
---
 mm/vmscan.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bc8031ef994d..4ea444142c2e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3134,11 +3134,11 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, int classzone_idx)
 		if (!managed_zone(zone))
 			continue;
 
-		if (!zone_balanced(zone, order, classzone_idx))
-			return false;
+		if (zone_balanced(zone, order, classzone_idx))
+			return true;
 	}
 
-	return true;
+	return false;
 }
 
 /*
@@ -3331,7 +3331,13 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 		prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
 
-		/* Try to sleep for a short interval */
+		/*
+		 * Try to sleep for a short interval. Note that kcompactd will only be
+		 * woken if it is possible to sleep for a short interval. This is
+		 * deliberate on the assumption that if reclaim cannot keep an
+		 * eligible zone balanced that it's also unlikely that compaction will
+		 * succeed.
+		 */
 		if (prepare_kswapd_sleep(pgdat, reclaim_order, classzone_idx)) {
 			/*
 			 * Compaction records what page blocks it recently failed to
-- 
2.11.0
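[To illustrate the logic change outside of kernel context, the following
toy program contrasts the old all-zones sleep check with the new any-zone
check that now matches balance_pgdat(). This is my own sketch; the zone
count and the balanced[] values are invented.]

	#include <stdbool.h>
	#include <stdio.h>

	#define NR_ZONES 3

	/* Before the patch: every eligible zone had to be balanced. */
	static bool old_check(const bool balanced[], int classzone_idx)
	{
		for (int i = 0; i <= classzone_idx; i++)
			if (!balanced[i])
				return false;
		return true;
	}

	/* After the patch: any one balanced zone suffices, as in balance_pgdat(). */
	static bool new_check(const bool balanced[], int classzone_idx)
	{
		for (int i = 0; i <= classzone_idx; i++)
			if (balanced[i])
				return true;
		return false;
	}

	int main(void)
	{
		/* e.g. a small lower zone unbalanced, a large zone balanced */
		bool balanced[NR_ZONES] = { false, true, false };

		printf("old: %d, new: %d\n",
		       old_check(balanced, NR_ZONES - 1),
		       new_check(balanced, NR_ZONES - 1));
		return 0;
	}

With any one zone unbalanced, the old check never passes, so the sleep
path that wakes kcompactd is never reached for order > 0.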
* Re: [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep

From: Vlastimil Babka @ 2017-03-10  9:06 UTC
To: Mel Gorman, Andrew Morton
Cc: Shantanu Goel, Johannes Weiner, LKML, Linux-MM

On 03/09/2017 08:56 AM, Mel Gorman wrote:
> From: Shantanu Goel <sgoel01@yahoo.com>
>
> The check in prepare_kswapd_sleep needs to match the one in balance_pgdat
> since the latter will return as soon as any one of the zones in the
> classzone is above the watermark. This is especially important for
> higher-order allocations since balance_pgdat will typically reset the
> order to zero, relying on compaction to create the higher-order pages.
> Without this patch, prepare_kswapd_sleep fails to wake up kcompactd since
> the zone balance check fails.
>
> [...]
> This patch is included with the data in case a bisection leads to this
> area. This patch is also a pre-requisite for the rest of the series.
>
> Signed-off-by: Shantanu Goel <sgoel01@yahoo.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>
* [PATCH 2/3] mm, vmscan: Only clear pgdat congested/dirty/writeback state when balanced

From: Mel Gorman @ 2017-03-09  7:56 UTC
To: Andrew Morton
Cc: Shantanu Goel, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM, Mel Gorman

A pgdat tracks if recent reclaim encountered too many dirty, writeback
or congested pages. The flags control whether kswapd writes pages back
from reclaim context, tags pages for immediate reclaim when IO completes,
whether processes block on wait_iff_congested and whether kswapd blocks
when too many pages marked for immediate reclaim are encountered.

The state is cleared in a check function with side-effects. With the patch
"mm, vmscan: fix zone balance check in prepare_kswapd_sleep", the timing
of when the bits get cleared changed. Due to the way the check works,
it'll clear the bits if ZONE_DMA is balanced for a GFP_DMA allocation
because it does not account for lowmem reserves properly.

For the simoop workload, kswapd is not stalling when it should due to the
premature clearing, is writing pages from reclaim context like crazy and
is generally being unhelpful.

This patch resets the pgdat bits related to page reclaim only when kswapd
is going to sleep. The comparison with simoop is then as follows:

                                         4.11.0-rc1             4.11.0-rc1             4.11.0-rc1
                                            vanilla            fixcheck-v2               clear-v2
Amean    p50-Read             21670074.18 (  0.00%) 20464344.18 (  5.56%) 19786774.76 (  8.69%)
Amean    p95-Read             25456267.64 (  0.00%) 25721423.64 ( -1.04%) 24101956.27 (  5.32%)
Amean    p99-Read             29369064.73 (  0.00%) 30174230.76 ( -2.74%) 27691872.71 (  5.71%)
Amean    p50-Write                1390.30 (  0.00%)     1395.28 ( -0.36%)     1011.91 ( 27.22%)
Amean    p95-Write              412901.57 (  0.00%)    37737.74 ( 90.86%)    34874.98 ( 91.55%)
Amean    p99-Write             6668722.09 (  0.00%)   666489.04 ( 90.01%)   575449.60 ( 91.37%)
Amean    p50-Allocation          78714.31 (  0.00%)    86286.22 ( -9.62%)    84246.26 ( -7.03%)
Amean    p95-Allocation         175533.51 (  0.00%)   351812.27 (-100.42%)  400058.43 (-127.91%)
Amean    p99-Allocation         247003.02 (  0.00%)  6291171.56 (-2447.00%) 10905600.00 (-4315.17%)

Read latency is improved, write latency is mostly improved but allocation
latency is regressed. kswapd is still reclaiming inefficiently, pages are
being written back from reclaim context and there are a host of other
issues. However, given the change, it needed to be spelled out why the
side-effect was moved.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/vmscan.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4ea444142c2e..17b1afbce88e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3091,17 +3091,17 @@ static bool zone_balanced(struct zone *zone, int order, int classzone_idx)
 	if (!zone_watermark_ok_safe(zone, order, mark, classzone_idx))
 		return false;
 
-	/*
-	 * If any eligible zone is balanced then the node is not considered
-	 * to be congested or dirty
-	 */
-	clear_bit(PGDAT_CONGESTED, &zone->zone_pgdat->flags);
-	clear_bit(PGDAT_DIRTY, &zone->zone_pgdat->flags);
-	clear_bit(PGDAT_WRITEBACK, &zone->zone_pgdat->flags);
-
 	return true;
 }
 
+/* Clear pgdat state for congested, dirty or under writeback. */
+static void clear_pgdat_congested(pg_data_t *pgdat)
+{
+	clear_bit(PGDAT_CONGESTED, &pgdat->flags);
+	clear_bit(PGDAT_DIRTY, &pgdat->flags);
+	clear_bit(PGDAT_WRITEBACK, &pgdat->flags);
+}
+
 /*
  * Prepare kswapd for sleeping. This verifies that there are no processes
  * waiting in throttle_direct_reclaim() and that watermarks have been met.
@@ -3134,8 +3134,10 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, int classzone_idx)
 		if (!managed_zone(zone))
 			continue;
 
-		if (zone_balanced(zone, order, classzone_idx))
+		if (zone_balanced(zone, order, classzone_idx)) {
+			clear_pgdat_congested(pgdat);
 			return true;
+		}
 	}
 
 	return false;
-- 
2.11.0
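[As a rough userspace model of what this patch moves, the difference is
purely about *when* the bits are cleared. The PGDAT_* values and the
event sequence below are invented for illustration; only the clearing
helper mirrors the patch.]

	#include <stdio.h>

	#define PGDAT_CONGESTED		(1u << 0)
	#define PGDAT_DIRTY		(1u << 1)
	#define PGDAT_WRITEBACK		(1u << 2)

	struct toy_pgdat {
		unsigned int flags;
	};

	/* After the patch: clearing happens here, only on the path to sleep. */
	static void clear_pgdat_congested(struct toy_pgdat *pgdat)
	{
		pgdat->flags &= ~(PGDAT_CONGESTED | PGDAT_DIRTY | PGDAT_WRITEBACK);
	}

	/*
	 * Before the patch, the equivalent clearing lived inside the balance
	 * check itself, so checking a small lower zone (e.g. ZONE_DMA for a
	 * GFP_DMA request) could wipe node-wide congestion state as a
	 * side-effect while the node was still under pressure.
	 */
	int main(void)
	{
		struct toy_pgdat pgdat = { .flags = PGDAT_CONGESTED | PGDAT_WRITEBACK };

		printf("flags while reclaiming: %#x\n", pgdat.flags);
		clear_pgdat_congested(&pgdat);	/* kswapd decided to sleep */
		printf("flags going to sleep:   %#x\n", pgdat.flags);
		return 0;
	}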
* Re: [PATCH 2/3] mm, vmscan: Only clear pgdat congested/dirty/writeback state when balanced

From: Vlastimil Babka @ 2017-03-10  9:06 UTC
To: Mel Gorman, Andrew Morton
Cc: Shantanu Goel, Johannes Weiner, LKML, Linux-MM

On 03/09/2017 08:56 AM, Mel Gorman wrote:
> A pgdat tracks if recent reclaim encountered too many dirty, writeback
> or congested pages. The flags control whether kswapd writes pages back
> from reclaim context, tags pages for immediate reclaim when IO completes,
> whether processes block on wait_iff_congested and whether kswapd blocks
> when too many pages marked for immediate reclaim are encountered.
>
> [...]
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Vlastimil Babka <vbabka@suse.cz>
* [PATCH 3/3] mm, vmscan: Prevent kswapd sleeping prematurely due to mismatched classzone_idx

From: Mel Gorman @ 2017-03-09  7:56 UTC
To: Andrew Morton
Cc: Shantanu Goel, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM, Mel Gorman

kswapd is woken to reclaim a node based on a failed allocation request
from any eligible zone. Once reclaiming in balance_pgdat(), it will
continue reclaiming until there is an eligible zone available for the
classzone it was woken for. kswapd tracks what zone it was recently woken
for in pgdat->kswapd_classzone_idx. If it has not been woken recently,
this will be 0.

However, the decision on whether to sleep is made on
kswapd_classzone_idx, which is 0 without a recent wakeup request, and
that classzone does not account for lowmem reserves. This allows kswapd
to sleep when a small lower zone such as ZONE_DMA is balanced for a
GFP_DMA request, even if a stream of allocations cannot use that zone.
While kswapd may be woken again in the near future, there are two
consequences -- the pgdat bits that control congestion are cleared
prematurely, and direct reclaim is more likely as kswapd slept
prematurely.

This patch flips kswapd_classzone_idx to default to MAX_NR_ZONES (an
invalid index) when there have been no recent wakeups. If there are no
wakeups, it'll decide whether to sleep based on the highest possible zone
available (MAX_NR_ZONES - 1). It then becomes critical that the "pgdat
balanced" decisions during reclaim and when deciding to sleep are the
same. If there is a mismatch, kswapd can stay awake continually trying to
balance tiny zones.

simoop was used to evaluate it again. Two of the preparation patches
regressed the workload, so they are included as the second set of
results. Otherwise this patch would look artificially excellent.

                                         4.11.0-rc1             4.11.0-rc1             4.11.0-rc1
                                            vanilla               clear-v2           keepawake-v2
Amean    p50-Read             21670074.18 (  0.00%) 19786774.76 (  8.69%) 22668332.52 ( -4.61%)
Amean    p95-Read             25456267.64 (  0.00%) 24101956.27 (  5.32%) 26738688.00 ( -5.04%)
Amean    p99-Read             29369064.73 (  0.00%) 27691872.71 (  5.71%) 30991404.52 ( -5.52%)
Amean    p50-Write                1390.30 (  0.00%)     1011.91 ( 27.22%)      924.91 ( 33.47%)
Amean    p95-Write              412901.57 (  0.00%)    34874.98 ( 91.55%)     1362.62 ( 99.67%)
Amean    p99-Write             6668722.09 (  0.00%)   575449.60 ( 91.37%)    16854.04 ( 99.75%)
Amean    p50-Allocation          78714.31 (  0.00%)    84246.26 ( -7.03%)    74729.74 (  5.06%)
Amean    p95-Allocation         175533.51 (  0.00%)   400058.43 (-127.91%)  101609.74 ( 42.11%)
Amean    p99-Allocation         247003.02 (  0.00%) 10905600.00 (-4315.17%)  125765.57 ( 49.08%)

With this patch on top, write and allocation latencies are massively
improved. The read latencies are slightly impaired, but it's worth noting
that this is mostly due to the IO scheduler and not directly related to
reclaim.
The vmstats are a bit of a mix but the relevant ones are as follows:

                            4.10.0-rc7     4.10.0-rc7     4.10.0-rc7
                        mmots-20170209    clear-v1r25 keepawake-v1r25
Swap Ins                             0              0              0
Swap Outs                            0            608              0
Direct pages scanned           6910672        3132699        6357298
Kswapd pages scanned          57036946       82488665       56986286
Kswapd pages reclaimed        55993488       63474329       55939113
Direct pages reclaimed         6905990        2964843        6352115
Kswapd efficiency                  98%            76%            98%
Kswapd velocity              12494.375      17597.507      12488.065
Direct efficiency                  99%            94%            99%
Direct velocity               1513.835        668.306       1393.148
Page writes by reclaim           0.000    4410243.000          0.000
Page writes file                     0        4409635              0
Page writes anon                     0            608              0
Page reclaim immediate         1036792       14175203        1042571

                            4.11.0-rc1     4.11.0-rc1     4.11.0-rc1
                               vanilla       clear-v2   keepawake-v2
Swap Ins                             0             12              0
Swap Outs                            0            838              0
Direct pages scanned           6579706        3237270        6256811
Kswapd pages scanned          61853702       79961486       54837791
Kswapd pages reclaimed        60768764       60755788       53849586
Direct pages reclaimed         6579055        2987453        6256151
Kswapd efficiency                  98%            75%            98%
Page writes by reclaim           0.000    4389496.000          0.000
Page writes file                     0        4388658              0
Page writes anon                     0            838              0
Page reclaim immediate         1073573       14473009         982507

Swap-outs are equivalent to baseline. Direct reclaim is reduced but not
eliminated. It's worth noting that there are two periods of direct
reclaim for this workload. The first is when the workload switches from
preparing the files to the actual test itself. That is a lot of file IO
followed by a lot of allocations, which reclaims heavily for a brief
window. While direct reclaim is lower with clear-v2, it is due to kswapd
scanning aggressively and trying to reclaim the world, which is not the
right thing to do. With the patches applied, there is still direct
reclaim, but it occurs during the phase change from "creating work files"
to starting multiple threads that allocate a lot of anonymous memory
faster than kswapd can reclaim it.

Scanning/reclaim efficiency is restored by this patch. Page writes from
reclaim context are back at 0, which is ideal. The number of pages
immediately reclaimed after IO completes is slightly improved, but it is
expected this will vary slightly.

On UMA, there is almost no change, so this is not expected to be a
universal win.
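[For reference, the "Kswapd efficiency" rows above appear to be derived
as pages reclaimed divided by pages scanned; that is my reading of the
report, shown here for the vanilla column of the first table.]

	#include <stdio.h>

	int main(void)
	{
		/* "Kswapd pages scanned" and "Kswapd pages reclaimed", vanilla */
		unsigned long long scanned = 57036946ULL;
		unsigned long long reclaimed = 55993488ULL;

		/* Prints 98%, matching the "Kswapd efficiency" row above */
		printf("kswapd efficiency: %.0f%%\n",
		       100.0 * (double)reclaimed / (double)scanned);
		return 0;
	}

The same arithmetic on the clear-v1r25 column (63474329 / 82488665) gives
the 76% figure, which is the efficiency drop the changelogs refer to.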
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/memory_hotplug.c |   2 +-
 mm/vmscan.c         | 118 +++++++++++++++++++++++++++++-----------------------
 2 files changed, 66 insertions(+), 54 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 295479b792ec..edff09061e32 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1207,7 +1207,7 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
 		/* Reset the nr_zones, order and classzone_idx before reuse */
 		pgdat->nr_zones = 0;
 		pgdat->kswapd_order = 0;
-		pgdat->kswapd_classzone_idx = 0;
+		pgdat->kswapd_classzone_idx = MAX_NR_ZONES;
 	}
 
 	/* we can use NODE_DATA(nid) from here */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 17b1afbce88e..6b09ed5e4bda 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3084,14 +3084,36 @@ static void age_active_anon(struct pglist_data *pgdat,
 	} while (memcg);
 }
 
-static bool zone_balanced(struct zone *zone, int order, int classzone_idx)
+/*
+ * Returns true if there is an eligible zone balanced for the request order
+ * and classzone_idx
+ */
+static bool pgdat_balanced(pg_data_t *pgdat, int order, int classzone_idx)
 {
-	unsigned long mark = high_wmark_pages(zone);
+	int i;
+	unsigned long mark = -1;
+	struct zone *zone;
 
-	if (!zone_watermark_ok_safe(zone, order, mark, classzone_idx))
-		return false;
+	for (i = 0; i <= classzone_idx; i++) {
+		zone = pgdat->node_zones + i;
 
-	return true;
+		if (!managed_zone(zone))
+			continue;
+
+		mark = high_wmark_pages(zone);
+		if (zone_watermark_ok_safe(zone, order, mark, classzone_idx))
+			return true;
+	}
+
+	/*
+	 * If a node has no populated zone within classzone_idx, it does not
+	 * need balancing by definition. This can happen if a zone-restricted
+	 * allocation tries to wake a remote kswapd.
+	 */
+	if (mark == -1)
+		return true;
+
+	return false;
 }
 
 /* Clear pgdat state for congested, dirty or under writeback. */
@@ -3110,8 +3132,6 @@ static void clear_pgdat_congested(pg_data_t *pgdat)
  */
 static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, int classzone_idx)
 {
-	int i;
-
 	/*
 	 * The throttled processes are normally woken up in balance_pgdat() as
 	 * soon as pfmemalloc_watermark_ok() is true. But there is a potential
@@ -3128,16 +3148,9 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, int classzone_idx)
 	if (waitqueue_active(&pgdat->pfmemalloc_wait))
 		wake_up_all(&pgdat->pfmemalloc_wait);
 
-	for (i = 0; i <= classzone_idx; i++) {
-		struct zone *zone = pgdat->node_zones + i;
-
-		if (!managed_zone(zone))
-			continue;
-
-		if (zone_balanced(zone, order, classzone_idx)) {
-			clear_pgdat_congested(pgdat);
-			return true;
-		}
+	if (pgdat_balanced(pgdat, order, classzone_idx)) {
+		clear_pgdat_congested(pgdat);
+		return true;
 	}
 
 	return false;
@@ -3243,23 +3256,12 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
 		}
 
 		/*
-		 * Only reclaim if there are no eligible zones. Check from
-		 * high to low zone as allocations prefer higher zones.
-		 * Scanning from low to high zone would allow congestion to be
-		 * cleared during a very small window when a small low
-		 * zone was balanced even under extreme pressure when the
-		 * overall node may be congested. Note that sc.reclaim_idx
-		 * is not used as buffer_heads_over_limit may have adjusted
-		 * it.
+		 * Only reclaim if there are no eligible zones. Note that
+		 * sc.reclaim_idx is not used as buffer_heads_over_limit may
+		 * have adjusted it.
		 */
-		for (i = classzone_idx; i >= 0; i--) {
-			zone = pgdat->node_zones + i;
-			if (!managed_zone(zone))
-				continue;
-
-			if (zone_balanced(zone, sc.order, classzone_idx))
-				goto out;
-		}
+		if (pgdat_balanced(pgdat, sc.order, classzone_idx))
+			goto out;
 
 		/*
 		 * Do some background aging of the anon list, to give
@@ -3322,6 +3324,22 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
 	return sc.order;
 }
 
+/*
+ * pgdat->kswapd_classzone_idx is the highest zone index that a recent
+ * allocation request woke kswapd for. When kswapd has not woken recently,
+ * the value is MAX_NR_ZONES which is not a valid index. This compares a
+ * given classzone and returns it or the highest classzone index kswapd
+ * was recently woken for.
+ */
+static enum zone_type kswapd_classzone_idx(pg_data_t *pgdat,
+					   enum zone_type classzone_idx)
+{
+	if (pgdat->kswapd_classzone_idx == MAX_NR_ZONES)
+		return classzone_idx;
+
+	return max(pgdat->kswapd_classzone_idx, classzone_idx);
+}
+
 static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order,
 				int reclaim_order, unsigned int classzone_idx)
 {
@@ -3363,7 +3381,7 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 	 * the previous request that slept prematurely.
 	 */
 	if (remaining) {
-		pgdat->kswapd_classzone_idx = max(pgdat->kswapd_classzone_idx, classzone_idx);
+		pgdat->kswapd_classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx);
 		pgdat->kswapd_order = max(pgdat->kswapd_order, reclaim_order);
 	}
 
@@ -3417,7 +3435,8 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
  */
 static int kswapd(void *p)
 {
-	unsigned int alloc_order, reclaim_order, classzone_idx;
+	unsigned int alloc_order, reclaim_order;
+	unsigned int classzone_idx = MAX_NR_ZONES - 1;
 	pg_data_t *pgdat = (pg_data_t*)p;
 	struct task_struct *tsk = current;
 
@@ -3447,20 +3466,23 @@ static int kswapd(void *p)
 	tsk->flags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
 	set_freezable();
 
-	pgdat->kswapd_order = alloc_order = reclaim_order = 0;
-	pgdat->kswapd_classzone_idx = classzone_idx = 0;
+	pgdat->kswapd_order = 0;
+	pgdat->kswapd_classzone_idx = MAX_NR_ZONES;
 	for ( ; ; ) {
 		bool ret;
 
+		alloc_order = reclaim_order = pgdat->kswapd_order;
+		classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx);
+
 kswapd_try_sleep:
 		kswapd_try_to_sleep(pgdat, alloc_order, reclaim_order,
 					classzone_idx);
 
 		/* Read the new order and classzone_idx */
 		alloc_order = reclaim_order = pgdat->kswapd_order;
-		classzone_idx = pgdat->kswapd_classzone_idx;
+		classzone_idx = kswapd_classzone_idx(pgdat, 0);
 		pgdat->kswapd_order = 0;
-		pgdat->kswapd_classzone_idx = 0;
+		pgdat->kswapd_classzone_idx = MAX_NR_ZONES;
 
 		ret = try_to_freeze();
 		if (kthread_should_stop())
@@ -3486,9 +3508,6 @@ static int kswapd(void *p)
 		reclaim_order = balance_pgdat(pgdat, alloc_order, classzone_idx);
 		if (reclaim_order < alloc_order)
 			goto kswapd_try_sleep;
-
-		alloc_order = reclaim_order = pgdat->kswapd_order;
-		classzone_idx = pgdat->kswapd_classzone_idx;
 	}
 
 	tsk->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD);
@@ -3504,7 +3523,6 @@ static int kswapd(void *p)
 void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
 {
 	pg_data_t *pgdat;
-	int z;
 
 	if (!managed_zone(zone))
 		return;
@@ -3512,22 +3530,16 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
 	if (!cpuset_zone_allowed(zone, GFP_KERNEL | __GFP_HARDWALL))
 		return;
 	pgdat = zone->zone_pgdat;
-	pgdat->kswapd_classzone_idx = max(pgdat->kswapd_classzone_idx, classzone_idx);
+	pgdat->kswapd_classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx);
 	pgdat->kswapd_order = max(pgdat->kswapd_order, order);
 	if (!waitqueue_active(&pgdat->kswapd_wait))
 		return;
 
 	/* Only wake kswapd if all zones are unbalanced */
-	for (z = 0; z <= classzone_idx; z++) {
-		zone = pgdat->node_zones + z;
-		if (!managed_zone(zone))
-			continue;
-
-		if (zone_balanced(zone, order, classzone_idx))
-			return;
-	}
+	if (pgdat_balanced(pgdat, order, classzone_idx))
+		return;
 
-	trace_mm_vmscan_wakeup_kswapd(pgdat->node_id, zone_idx(zone), order);
+	trace_mm_vmscan_wakeup_kswapd(pgdat->node_id, classzone_idx, order);
 	wake_up_interruptible(&pgdat->kswapd_wait);
 }
-- 
2.11.0
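[The sentinel scheme is easy to model in isolation. Below is a toy sketch
of how wakeup requests merge and how "no pending request" is now
distinguished from "woken for ZONE_DMA"; the constants mirror the patch
but nothing here is kernel code, and the zone count is invented.]

	#include <stdio.h>

	#define MAX_NR_ZONES	4	/* toy value; doubles as "no pending wakeup" */

	static int pending_classzone = MAX_NR_ZONES;	/* no request yet */

	/* Mirrors kswapd_classzone_idx(): prefer the highest requested zone. */
	static int kswapd_classzone_idx(int classzone_idx)
	{
		if (pending_classzone == MAX_NR_ZONES)
			return classzone_idx;
		return pending_classzone > classzone_idx ?
			pending_classzone : classzone_idx;
	}

	int main(void)
	{
		/* No wakeup yet: fall back to the highest possible zone index. */
		printf("idle default: %d\n", kswapd_classzone_idx(MAX_NR_ZONES - 1));

		/* A ZONE_DMA (index 0) wakeup no longer looks like "no request". */
		pending_classzone = 0;
		printf("after DMA wakeup: %d\n", kswapd_classzone_idx(0));
		return 0;
	}

Before the series, 0 meant both "no recent wakeup" and "woken for
ZONE_DMA", which is exactly the ambiguity that let kswapd sleep against
the wrong classzone.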
* [PATCH 0/3] Reduce amount of time kswapd sleeps prematurely

From: Mel Gorman @ 2017-02-15  9:22 UTC
To: Andrew Morton
Cc: Shantanu Goel, Chris Mason, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM, Mel Gorman

This patchset is based on mmots as of Feb 9th, 2017. The baseline is
important as there are a number of kswapd-related fixes in that tree and
a comparison against v4.10-rc7 would be almost meaningless as a result.

The series is unusual in that the first patch fixes one problem and
introduces a host of other issues and is incomplete. It was not developed
by me but it appears to have gotten lost, so I picked it up and added to
the changelog. Patch 2 makes a minor modification that is worth
considering on its own but leaves the kernel in a state where it behaves
badly. It's not until patch 3 that there is an improvement against
baseline.

This was mostly motivated by examining Chris Mason's "simoop" benchmark
which puts the VM under pressure similar to HADOOP. It has been reported
that the benchmark has regressed severely over the last several releases.
While I cannot reproduce all the same problems Chris experienced due to
hardware limitations, there were a number of problems on a 2-socket
machine with a single disk.

                                         4.10.0-rc7             4.10.0-rc7
                                     mmots-20170209        keepawake-v1r25
Amean    p50-Read             22325202.49 (  0.00%) 22092755.48 (  1.04%)
Amean    p95-Read             26102988.80 (  0.00%) 26101849.04 (  0.00%)
Amean    p99-Read             30935176.53 (  0.00%) 29746220.52 (  3.84%)
Amean    p50-Write                 976.44 (  0.00%)      952.73 (  2.43%)
Amean    p95-Write               15471.29 (  0.00%)     3140.27 ( 79.70%)
Amean    p99-Write               35108.62 (  0.00%)     8843.73 ( 74.81%)
Amean    p50-Allocation          76382.61 (  0.00%)    76349.22 (  0.04%)
Amean    p95-Allocation         127777.39 (  0.00%)   108630.26 ( 14.98%)
Amean    p99-Allocation         187937.39 (  0.00%)   139094.26 ( 25.99%)

These are latencies. Read/write are threads reading fixed-size random
blocks from a simulated database. The allocation latency is mmaping and
faulting regions of memory. The p50, p95 and p99 figures report the worst
latency experienced by 50%, 95% and 99% of the samples respectively. For
example, the report indicates that while the test was running, 99% of
writes completed 74.81% faster. It's worth noting that on a UMA machine
no difference in performance with simoop was observed, so mileage will
vary.

On UMA, there was a notable difference in the "stutter" benchmark, which
measures the latency of mmap while large files are being copied. This has
been used as a proxy measure for desktop jitter while large amounts of IO
are taking place.

                             4.10.0-rc7            4.10.0-rc7
                         mmots-20170209          keepawake-v1
Min         mmap      6.3847 (  0.00%)      5.9785 (  6.36%)
1st-qrtle   mmap      7.6310 (  0.00%)      7.4086 (  2.91%)
2nd-qrtle   mmap      9.9959 (  0.00%)      7.7052 ( 22.92%)
3rd-qrtle   mmap     14.8180 (  0.00%)      8.5895 ( 42.03%)
Max-90%     mmap     15.8397 (  0.00%)     13.6974 ( 13.52%)
Max-93%     mmap     16.4268 (  0.00%)     14.3175 ( 12.84%)
Max-95%     mmap     18.3295 (  0.00%)     16.9233 (  7.67%)
Max-99%     mmap     24.2042 (  0.00%)     20.6182 ( 14.82%)
Max         mmap    255.0688 (  0.00%)    265.5818 ( -4.12%)
Mean        mmap     11.2192 (  0.00%)      9.1811 ( 18.17%)

Latency is measured in milliseconds and indicates that 99% of mmap
operations completed 14.82% faster, and 18.17% faster on average, with
these patches applied.
Mel Gorman (2):
  mm, vmscan: Only clear pgdat congested/dirty/writeback state when
    balanced
  mm, vmscan: Prevent kswapd sleeping prematurely due to mismatched
    classzone_idx

Shantanu Goel (1):
  mm, vmscan: fix zone balance check in prepare_kswapd_sleep

 mm/memory_hotplug.c |   2 +-
 mm/vmscan.c         | 128 +++++++++++++++++++++++++++++-----------------------
 2 files changed, 72 insertions(+), 58 deletions(-)

-- 
2.11.0
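[As a rough userspace analogue of the stutter measurement described in
the cover letter -- this is not the mmtests stutter benchmark itself,
just a sketch of the idea -- one can time an anonymous mmap plus
first-touch faults; run it while a large file copy is in flight to see
the jitter the table above quantifies.]

	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <time.h>

	int main(void)
	{
		struct timespec a, b;
		size_t len = 4 << 20;	/* 4MB region */
		char *p;

		clock_gettime(CLOCK_MONOTONIC, &a);
		p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			return 1;
		memset(p, 1, len);	/* fault every page in */
		clock_gettime(CLOCK_MONOTONIC, &b);

		printf("mmap+fault: %.3f ms\n",
		       (b.tv_sec - a.tv_sec) * 1e3 +
		       (b.tv_nsec - a.tv_nsec) / 1e6);
		munmap(p, len);
		return 0;
	}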
* [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep

From: Mel Gorman @ 2017-02-15  9:22 UTC
To: Andrew Morton
Cc: Shantanu Goel, Chris Mason, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM, Mel Gorman

From: Shantanu Goel <sgoel01@yahoo.com>

The check in prepare_kswapd_sleep needs to match the one in balance_pgdat
since the latter will return as soon as any one of the zones in the
classzone is above the watermark. This is especially important for
higher-order allocations since balance_pgdat will typically reset the
order to zero, relying on compaction to create the higher-order pages.
Without this patch, prepare_kswapd_sleep fails to wake up kcompactd since
the zone balance check fails.

On 4.9.7 kswapd is failing to wake up kcompactd due to a mismatch in the
zone balance check between balance_pgdat() and prepare_kswapd_sleep().
balance_pgdat() returns as soon as a single zone satisfies the allocation
but prepare_kswapd_sleep() requires all zones to do the same. This causes
prepare_kswapd_sleep() to never succeed except in the order == 0 case
and consequently, wakeup_kcompactd() is never called. On my machine,
prior to applying this patch, the state of compaction from /proc/vmstat
looked this way after a day and a half of uptime:

compact_migrate_scanned 240496
compact_free_scanned 76238632
compact_isolated 123472
compact_stall 1791
compact_fail 29
compact_success 1762
compact_daemon_wake 0

After applying the patch and about 10 hours of uptime the state looks
like this:

compact_migrate_scanned 59927299
compact_free_scanned 2021075136
compact_isolated 640926
compact_stall 4
compact_fail 2
compact_success 2
compact_daemon_wake 5160

Further notes from Mel that motivated him to pick this patch up and
resend it:

It was observed for the simoop workload (pressures the VM similarly to
HADOOP) that kswapd was failing to keep ahead of direct reclaim. The
investigation noted that there was a need to rationalise kswapd decisions
to reclaim with kswapd decisions to sleep. With this patch on a 2-socket
box, there was a 43% reduction in direct reclaim scanning.

However, the impact otherwise is extremely negative. Kswapd reclaim
efficiency dropped from 98% to 76%. simoop has three latency-related
metrics for read, write and allocation (an anonymous mmap and fault).

                                         4.10.0-rc7             4.10.0-rc7
                                     mmots-20170209            fixcheck-v1
Amean    p50-Read             22325202.49 (  0.00%) 20026926.55 ( 10.29%)
Amean    p95-Read             26102988.80 (  0.00%) 27023360.00 ( -3.53%)
Amean    p99-Read             30935176.53 (  0.00%) 30994432.00 ( -0.19%)
Amean    p50-Write                 976.44 (  0.00%)     1905.28 ( -95.12%)
Amean    p95-Write               15471.29 (  0.00%)    36210.09 (-134.05%)
Amean    p99-Write               35108.62 (  0.00%)   479494.96 (-1265.75%)
Amean    p50-Allocation          76382.61 (  0.00%)    87603.20 ( -14.69%)
Amean    p95-Allocation         127777.39 (  0.00%)   244491.38 ( -91.34%)
Amean    p99-Allocation         187937.39 (  0.00%)  1745237.33 (-828.63%)

There are also more allocation stalls. One of the largest impacts was due
to pages written back from kswapd context rising from 0 pages to 4516642
pages during the hour the workload ran for. By and large, the patch has
very bad behaviour, but this is easily missed as the impact on a UMA
machine is negligible.

This patch is included with the data in case a bisection leads to this
area. This patch is also a pre-requisite for the rest of the series.
Signed-off-by: Shantanu Goel <sgoel01@yahoo.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/vmscan.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 26c3b405ef34..92fc66bd52bc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3140,11 +3140,11 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, int classzone_idx)
 		if (!managed_zone(zone))
 			continue;
 
-		if (!zone_balanced(zone, order, classzone_idx))
-			return false;
+		if (zone_balanced(zone, order, classzone_idx))
+			return true;
 	}
 
-	return true;
+	return false;
 }
 
 /*
-- 
2.11.0
* Re: [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep

From: Hillf Danton @ 2017-02-16  2:50 UTC
To: Mel Gorman, Andrew Morton
Cc: Shantanu Goel, Chris Mason, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM

On February 15, 2017 5:23 PM Mel Gorman wrote:
>
> From: Shantanu Goel <sgoel01@yahoo.com>
>
> The check in prepare_kswapd_sleep needs to match the one in balance_pgdat
> since the latter will return as soon as any one of the zones in the
> classzone is above the watermark. This is especially important for
> higher-order allocations since balance_pgdat will typically reset the
> order to zero, relying on compaction to create the higher-order pages.
> Without this patch, prepare_kswapd_sleep fails to wake up kcompactd since
> the zone balance check fails.
>
> [...]
> By and large, the patch has very bad behaviour, but this is easily missed
> as the impact on a UMA machine is negligible.
>
> This patch is included with the data in case a bisection leads to this
> area. This patch is also a pre-requisite for the rest of the series.
>
> Signed-off-by: Shantanu Goel <sgoel01@yahoo.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
* Re: [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep

From: Minchan Kim @ 2017-02-22  7:00 UTC
To: Mel Gorman
Cc: Andrew Morton, Shantanu Goel, Chris Mason, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM

Hi,

On Wed, Feb 15, 2017 at 09:22:45AM +0000, Mel Gorman wrote:
> From: Shantanu Goel <sgoel01@yahoo.com>
>
> The check in prepare_kswapd_sleep needs to match the one in balance_pgdat
> since the latter will return as soon as any one of the zones in the
> classzone is above the watermark. This is especially important for
> higher-order allocations since balance_pgdat will typically reset the
> order to zero, relying on compaction to create the higher-order pages.
> Without this patch, prepare_kswapd_sleep fails to wake up kcompactd since
> the zone balance check fails.
>
> [...]
> By and large, the patch has very bad behaviour, but this is easily missed
> as the impact on a UMA machine is negligible.
>
> This patch is included with the data in case a bisection leads to this
> area. This patch is also a pre-requisite for the rest of the series.
>
> Signed-off-by: Shantanu Goel <sgoel01@yahoo.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Hmm, I don't understand why we should bind wakeup_kcompactd to kswapd's
short sleep point where all eligible zones are balanced. What's the
correlation between them?

Can't we wake up kcompactd once we find a zone that has enough free pages
above the high watermark, like this?

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 26c3b405ef34..f4f0ad0e9ede 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3346,13 +3346,6 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 		 * that pages and compaction may succeed so reset the cache.
 		 */
 		reset_isolation_suitable(pgdat);
-
-		/*
-		 * We have freed the memory, now we should compact it to make
-		 * allocation of the requested order possible.
-		 */
-		wakeup_kcompactd(pgdat, alloc_order, classzone_idx);
-
 		remaining = schedule_timeout(HZ/10);
 
 		/*
@@ -3451,6 +3444,14 @@ static int kswapd(void *p)
 		bool ret;
 
 kswapd_try_sleep:
+		/*
+		 * We have freed the memory, now we should compact it to make
+		 * allocation of the requested order possible.
+		 */
+		if (alloc_order > 0 && zone_balanced(zone, reclaim_order,
+					classzone_idx))
+			wakeup_kcompactd(pgdat, alloc_order, classzone_idx);
+
 		kswapd_try_to_sleep(pgdat, alloc_order, reclaim_order,
 					classzone_idx);
-- 
2.7.4
* Re: [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep

From: Mel Gorman @ 2017-02-23 15:05 UTC
To: Minchan Kim
Cc: Andrew Morton, Shantanu Goel, Chris Mason, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM

On Wed, Feb 22, 2017 at 04:00:36PM +0900, Minchan Kim wrote:
> > There are also more allocation stalls. One of the largest impacts was due
> > to pages written back from kswapd context rising from 0 pages to 4516642
> > pages during the hour the workload ran for. By and large, the patch has
> > very bad behaviour, but this is easily missed as the impact on a UMA
> > machine is negligible.
> >
> > This patch is included with the data in case a bisection leads to this
> > area. This patch is also a pre-requisite for the rest of the series.
> >
> > Signed-off-by: Shantanu Goel <sgoel01@yahoo.com>
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
>
> Hmm, I don't understand why we should bind wakeup_kcompactd to kswapd's
> short sleep point where all eligible zones are balanced.
> What's the correlation between them?

If kswapd is ready for a short sleep, eligible zones are balanced for
order-0 but not necessarily the originally requested order if kswapd gave
up reclaiming because compaction was ready to start. As kswapd is ready
to sleep for a short period, it's a suitable time for kcompactd to decide
whether it should start working or not. There is no need for kswapd to be
aware of kcompactd's wakeup criteria.

> Can't we wake up kcompactd once we find a zone that has enough free pages
> above the high watermark, like this?
>
> [...]

That's functionally very similar to what happens already. wakeup_kcompactd
checks the order and does not wake for order-0. It also makes its own
decisions, including zone_balanced checks, on whether it is safe to wake
up. I doubt there would be any measurable difference from a patch like
this and, to my mind at least, it does not improve the readability or
flow of the code.

-- 
Mel Gorman
SUSE Labs
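[A toy paraphrase of the gating Mel describes, condensed into one
standalone function. This is assumed logic only; wakeup_kcompactd()'s
real checks in mm/compaction.c are more involved.]

	#include <stdbool.h>
	#include <stdio.h>

	/* Hypothetical condensation of the wakeup conditions described above. */
	static bool should_wake_kcompactd(int order, bool daemon_idle,
					  bool compaction_suitable)
	{
		if (order == 0)
			return false;		/* nothing to compact for order-0 */
		if (!daemon_idle)
			return false;		/* kcompactd is already running */
		return compaction_suitable;	/* kcompactd's own criteria */
	}

	int main(void)
	{
		printf("order-0: %d, order-3: %d\n",
		       should_wake_kcompactd(0, true, true),
		       should_wake_kcompactd(3, true, true));
		return 0;
	}

The point being made in the reply is that these checks already live
inside the callee, so hoisting an equivalent test into the caller changes
little.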
* Re: [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep

From: Minchan Kim @ 2017-02-24  1:17 UTC
To: Mel Gorman
Cc: Andrew Morton, Shantanu Goel, Chris Mason, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM

Hi Mel,

On Thu, Feb 23, 2017 at 03:05:34PM +0000, Mel Gorman wrote:
> On Wed, Feb 22, 2017 at 04:00:36PM +0900, Minchan Kim wrote:
> > Hmm, I don't understand why we should bind wakeup_kcompactd to kswapd's
> > short sleep point where all eligible zones are balanced.
> > What's the correlation between them?
>
> If kswapd is ready for a short sleep, eligible zones are balanced for
> order-0 but not necessarily the originally requested order if kswapd gave
> up reclaiming because compaction was ready to start. As kswapd is ready
> to sleep for a short period, it's a suitable time for kcompactd to decide
> whether it should start working or not. There is no need for kswapd to be
> aware of kcompactd's wakeup criteria.

If all eligible zones are balanced for order-0, I agree it's good timing
because the high-order allocation success ratio would be higher since
kcompactd can compact all eligible zones, not only the classzone.
However, this patch breaks that, as well as the long-standing kswapd
behaviour of continuing to balance all eligible zones for order-0.
Is it really okay now?

> > Can't we wake up kcompactd once we find a zone that has enough free pages
> > above the high watermark, like this?
> >
> > [...]
>
> That's functionally very similar to what happens already. wakeup_kcompactd
> checks the order and does not wake for order-0. It also makes its own
> decisions, including zone_balanced checks, on whether it is safe to wake
> up.

Agree.
> > I doubt there would be any measurable difference from a patch like this > and to my mind at least, it does not improve the readability or flow of > the code. However, my concern is premature kswapd sleep for order-0 which has been long time behavior so I hope it should be documented why it's okay now. Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep
  From: Mel Gorman @ 2017-02-24 9:11 UTC
  To: Minchan Kim
  Cc: Andrew Morton, Shantanu Goel, Chris Mason, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM

On Fri, Feb 24, 2017 at 10:17:06AM +0900, Minchan Kim wrote:
> If all eligible zones are balanced for order-0, I agree it is good timing
> because the high-order allocation success ratio would be higher, since
> kcompactd can compact all eligible zones, not only the classzone.
> However, this patch breaks that, as well as the long-standing kswapd
> behaviour of continuing to balance all eligible zones for order-0.
> Is it really okay now?

Reclaim stops in balance_pgdat() if any eligible zone for the requested
classzone is balanced. The initial sleep check for kswapd is very different
because it'll only sleep if all zones are balanced for order-0, which is a
bad disconnect. The way node balancing works means there is no guarantee at
all that all zones will be balanced, even when there is little or no memory
pressure, and one large zone in a node with multiple zones can be balanced
quickly.

The short-sleep logic that kswapd uses to decide whether to go to sleep is
short-circuited: it does not properly attempt the short sleep that checks
whether the high watermarks are quickly reached. Instead, it quickly fails
the first attempt at sleep, re-enters balance_pgdat(), finds nothing to do
and then rechecks sleeping based on order-0 and classzone-0, which it can
easily sleep for but is *not* what kswapd was woken for in the first place.

For many allocation requests that initially woke kswapd, the impact is
marginal. kswapd sleeps early and is woken in the near future if there is
a continual stream of allocations, with a risk that direct reclaim is
required. While the motivation for the patch was that kcompactd is not
woken up, the existing behaviour is simply wrong -- kswapd should be
deciding to sleep based on the classzone it was woken for and, if possible,
the order it was woken for, but the classzone is the more important in the
common case of order-0 allocations.

--
Mel Gorman
SUSE Labs
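The disconnect described above can be seen by contrasting the two forms of
the sleep check. Again, this is a simplified model with invented zone sizes
and watermarks, not kernel code: with one small zone just below its
watermark, the pre-fix all-zones form refuses to sleep even though
balance_pgdat() already considers the node balanced, while the post-fix
any-zone form agrees with balance_pgdat().

#include <stdbool.h>
#include <stdio.h>

struct zone {
	bool populated;
	long free_pages;
	long high_wmark;
};

/* Invented layout: a small zone just under its watermark and a large
 * zone comfortably above it. */
static struct zone node_zones[] = {
	{ true, 900,  1000 },
	{ true, 5000, 2000 },
};

static bool zone_balanced(const struct zone *z)
{
	return z->free_pages >= z->high_wmark;
}

/* Pre-fix check: every populated eligible zone must be balanced. */
static bool all_eligible_balanced(int classzone_idx)
{
	for (int i = 0; i <= classzone_idx; i++)
		if (node_zones[i].populated && !zone_balanced(&node_zones[i]))
			return false;
	return true;
}

/* Post-fix check: one balanced eligible zone is enough, matching
 * balance_pgdat()'s exit condition. */
static bool any_eligible_balanced(int classzone_idx)
{
	for (int i = 0; i <= classzone_idx; i++)
		if (node_zones[i].populated && zone_balanced(&node_zones[i]))
			return true;
	return false;
}

int main(void)
{
	int classzone_idx = 1;

	printf("pre-fix (all zones): %s\n",
	       all_eligible_balanced(classzone_idx) ? "sleep" : "stay awake");
	printf("post-fix (any zone): %s\n",
	       any_eligible_balanced(classzone_idx) ? "sleep" : "stay awake");
	return 0;
}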
* Re: [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep
  From: Minchan Kim @ 2017-02-27 6:16 UTC
  To: Mel Gorman
  Cc: Andrew Morton, Shantanu Goel, Chris Mason, Johannes Weiner, Vlastimil Babka, LKML, Linux-MM

Hi Mel,

On Fri, Feb 24, 2017 at 09:11:28AM +0000, Mel Gorman wrote:
> Reclaim stops in balance_pgdat() if any eligible zone for the requested
> classzone is balanced. The initial sleep check for kswapd is very different
> because it'll only sleep if all zones are balanced for order-0, which is a
> bad disconnect. The way node balancing works means there is no guarantee at
> all that all zones will be balanced, even when there is little or no memory
> pressure, and one large zone in a node with multiple zones can be balanced
> quickly.

Indeed, but it would tip the balance toward direct reclaim, which could
cause more failures for allocations that rely on kswapd, such as atomic
allocations. On the other hand, if the VM balanced all zones for order-0,
it would cause excessive reclaim with the node-based LRU, unlike the
zone-based one, which is also bad.

> The short-sleep logic that kswapd uses to decide whether to go to sleep is
> short-circuited: it does not properly attempt the short sleep that checks
> whether the high watermarks are quickly reached. Instead, it quickly fails
> the first attempt at sleep, re-enters balance_pgdat(), finds nothing to do
> and then rechecks sleeping based on order-0 and classzone-0, which it can
> easily sleep for but is *not* what kswapd was woken for in the first place.
>
> For many allocation requests that initially woke kswapd, the impact is
> marginal. kswapd sleeps early and is woken in the near future if there is
> a continual stream of allocations, with a risk that direct reclaim is
> required. While the motivation for the patch was that kcompactd is not
> woken up, the existing behaviour is simply wrong -- kswapd should be
> deciding to sleep based on the classzone it was woken for and, if possible,
> the order it was woken for, but the classzone is the more important in the
> common case of order-0 allocations.

I agree, but I think it is rather risky to paper over the order-0
zone-balancing problem with the fix for the missed kcompactd wakeups, so
at the very least the reason it is okay should be documented.

Thanks.
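The balance test the whole thread revolves around is, per zone, a
high-watermark check that also has to account for the allocation order. A
self-contained approximation follows; the real kernel uses
zone_watermark_ok_safe() against per-order free counts, and the numbers and
free_blocks array here are invented. It shows why a zone can be balanced
for order-0 yet unbalanced for a higher order, which is exactly the case
where kcompactd rather than more reclaim is the right tool.

#include <stdbool.h>
#include <stdio.h>

struct zone {
	long free_pages;
	long high_wmark;
	long free_blocks[4];	/* free blocks per order 0..3, invented */
};

/* Approximate watermark test: enough free pages above the high mark,
 * and at least one free block at the requested order or larger. */
static bool zone_watermark_ok(const struct zone *z, int order)
{
	if (z->free_pages < z->high_wmark)
		return false;
	for (int o = order; o < 4; o++)
		if (z->free_blocks[o] > 0)
			return true;
	return false;
}

int main(void)
{
	/* Plenty of free memory, but fragmented: no order-3 blocks. */
	struct zone z = {
		.free_pages = 5000,
		.high_wmark = 2000,
		.free_blocks = { 3000, 200, 10, 0 },
	};

	for (int order = 0; order <= 3; order++)
		printf("order-%d balanced: %s\n", order,
		       zone_watermark_ok(&z, order) ? "yes" : "no");
	return 0;
}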
Thread overview: 13+ messages (newest: 2017-03-10 9:06 UTC)
2017-03-09  7:56 [PATCH 0/3] Reduce amount of time kswapd sleeps prematurely v2 Mel Gorman
2017-03-09  7:56 ` [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep Mel Gorman
2017-03-10  9:06   ` Vlastimil Babka
2017-03-09  7:56 ` [PATCH 2/3] mm, vmscan: Only clear pgdat congested/dirty/writeback state when balanced Mel Gorman
2017-03-10  9:06   ` Vlastimil Babka
2017-03-09  7:56 ` [PATCH 3/3] mm, vmscan: Prevent kswapd sleeping prematurely due to mismatched classzone_idx Mel Gorman

Strict thread matches above, loose matches on Subject: below:
2017-02-15  9:22 [PATCH 0/3] Reduce amount of time kswapd sleeps prematurely Mel Gorman
2017-02-15  9:22 ` [PATCH 1/3] mm, vmscan: fix zone balance check in prepare_kswapd_sleep Mel Gorman
2017-02-16  2:50   ` Hillf Danton
2017-02-22  7:00   ` Minchan Kim
2017-02-23 15:05     ` Mel Gorman
2017-02-24  1:17       ` Minchan Kim
2017-02-24  9:11         ` Mel Gorman
2017-02-27  6:16           ` Minchan Kim