From: Mel Gorman <mgorman@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Slaby <jslaby@suse.cz>,
Valdis Kletnieks <Valdis.Kletnieks@vt.edu>,
Rik van Riel <riel@redhat.com>,
Zlatko Calusic <zcalusic@bitsync.net>,
Johannes Weiner <hannes@cmpxchg.org>,
dormando <dormando@rydia.net>,
Satoru Moriya <satoru.moriya@hds.com>,
Michal Hocko <mhocko@suse.cz>, Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>, Mel Gorman <mgorman@suse.de>
Subject: [PATCH 03/10] mm: vmscan: Flatten kswapd priority loop
Date: Tue, 9 Apr 2013 12:06:58 +0100 [thread overview]
Message-ID: <1365505625-9460-4-git-send-email-mgorman@suse.de> (raw)
In-Reply-To: <1365505625-9460-1-git-send-email-mgorman@suse.de>

kswapd stops raising the scanning priority when at least SWAP_CLUSTER_MAX
pages have been reclaimed or the pgdat is considered balanced. It then
rechecks whether it needs to restart at DEF_PRIORITY and whether high-order
reclaim needs to be reset. This is not wrong per se, but it is confusing to
follow, and forcing kswapd to stay at DEF_PRIORITY may require several
restarts before it has scanned enough pages to meet the high watermark even
at 100% efficiency. This patch irons out the logic by controlling when the
priority is raised and by removing the "goto loop_again".

This patch has kswapd raise the scanning priority until it is scanning
enough pages that it could meet the high watermark in one shrink of the
LRU lists if it were able to reclaim at 100% efficiency. It will not raise
the scanning priority further unless it is failing to reclaim any pages.
To avoid infinite looping for high-order allocation requests, kswapd will
not reclaim for high-order allocations once it has reclaimed at least twice
the number of pages requested; for example, an order-3 (8 page) request
stops high-order reclaim after 16 pages have been reclaimed.
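
For reference, the resulting control flow can be summarised with the
following condensed sketch. It is illustrative only, not the literal
hunks below: declarations, the buffer_heads_over_limit handling,
per-zone accounting and the pfmemalloc wakeup are omitted.

	do {
		bool raise_priority = true;

		sc.nr_reclaimed = 0;

		/* Shrink each zone that is not yet balanced, highmem -> dma */
		for (i = 0; i <= end_zone; i++) {
			struct zone *zone = pgdat->node_zones + i;

			if (!populated_zone(zone))
				continue;

			/*
			 * kswapd_shrink_zone() now returns true once the scan
			 * target is large enough to meet the high watermark at
			 * 100% reclaim efficiency, so the priority need not be
			 * raised further for this zone.
			 */
			if (!zone_balanced(zone, order, balance_gap, end_zone) &&
			    kswapd_shrink_zone(zone, &sc, lru_pages))
				raise_priority = false;
		}

		/*
		 * Stop high-order reclaim once twice the requested number of
		 * pages has been reclaimed; the allocator can fall back to
		 * direct reclaim/compaction.
		 */
		if (order && sc.nr_reclaimed >= 2UL << order)
			order = sc.order = 0;

		/* Scan more aggressively only if progress was insufficient */
		if (raise_priority || sc.nr_reclaimed == 0)
			sc.priority--;
	} while (sc.priority >= 0 &&
		 !pgdat_balanced(pgdat, order, *classzone_idx));
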
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/vmscan.c | 85 +++++++++++++++++++++++++++++--------------------------------
1 file changed, 40 insertions(+), 45 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0742c45..78268ca 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2633,8 +2633,12 @@ static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order, long remaining,
/*
* kswapd shrinks the zone by the number of pages required to reach
* the high watermark.
+ *
+ * Returns true if kswapd scanned at least the requested number of pages to
+ * reclaim. This is used to determine if the scanning priority needs to be
+ * raised.
*/
-static void kswapd_shrink_zone(struct zone *zone,
+static bool kswapd_shrink_zone(struct zone *zone,
struct scan_control *sc,
unsigned long lru_pages)
{
@@ -2654,6 +2658,8 @@ static void kswapd_shrink_zone(struct zone *zone,
if (nr_slab == 0 && !zone_reclaimable(zone))
zone->all_unreclaimable = 1;
+
+ return sc->nr_scanned >= sc->nr_to_reclaim;
}
/*
@@ -2680,26 +2686,25 @@ static void kswapd_shrink_zone(struct zone *zone,
static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
int *classzone_idx)
{
- bool pgdat_is_balanced = false;
int i;
int end_zone = 0; /* Inclusive. 0 = ZONE_DMA */
unsigned long nr_soft_reclaimed;
unsigned long nr_soft_scanned;
struct scan_control sc = {
.gfp_mask = GFP_KERNEL,
+ .priority = DEF_PRIORITY,
.may_unmap = 1,
.may_swap = 1,
+ .may_writepage = !laptop_mode,
.order = order,
.target_mem_cgroup = NULL,
};
-loop_again:
- sc.priority = DEF_PRIORITY;
- sc.nr_reclaimed = 0;
- sc.may_writepage = !laptop_mode;
count_vm_event(PAGEOUTRUN);
do {
unsigned long lru_pages = 0;
+ unsigned long nr_reclaimed = sc.nr_reclaimed = 0;
+ bool raise_priority = true;
/*
* Scan in the highmem->dma direction for the highest
@@ -2741,10 +2746,8 @@ loop_again:
}
}
- if (i < 0) {
- pgdat_is_balanced = true;
+ if (i < 0)
goto out;
- }
for (i = 0; i <= end_zone; i++) {
struct zone *zone = pgdat->node_zones + i;
@@ -2811,8 +2814,16 @@ loop_again:
if ((buffer_heads_over_limit && is_highmem_idx(i)) ||
!zone_balanced(zone, testorder,
- balance_gap, end_zone))
- kswapd_shrink_zone(zone, &sc, lru_pages);
+ balance_gap, end_zone)) {
+ /*
+ * There should be no need to raise the
+ * scanning priority if enough pages are
+ * already being scanned that high
+ * watermark would be met at 100% efficiency.
+ */
+ if (kswapd_shrink_zone(zone, &sc, lru_pages))
+ raise_priority = false;
+ }
/*
* If we're getting trouble reclaiming, start doing
@@ -2847,46 +2858,29 @@ loop_again:
pfmemalloc_watermark_ok(pgdat))
wake_up(&pgdat->pfmemalloc_wait);
- if (pgdat_balanced(pgdat, order, *classzone_idx)) {
- pgdat_is_balanced = true;
- break; /* kswapd: all done */
- }
-
/*
- * We do this so kswapd doesn't build up large priorities for
- * example when it is freeing in parallel with allocators. It
- * matches the direct reclaim path behaviour in terms of impact
- * on zone->*_priority.
+ * Fragmentation may mean that the system cannot be rebalanced
+ * for high-order allocations in all zones. If twice the
+ * allocation size has been reclaimed and the zones are still
+ * not balanced then recheck the watermarks at order-0 to
+	 * prevent kswapd reclaiming excessively. Assume that a process
+	 * requesting a high-order allocation can direct reclaim/compact.
*/
- if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX)
- break;
- } while (--sc.priority >= 0);
-
-out:
- if (!pgdat_is_balanced) {
- cond_resched();
+ if (order && sc.nr_reclaimed >= 2UL << order)
+ order = sc.order = 0;
- try_to_freeze();
+ /* Check if kswapd should be suspending */
+ if (try_to_freeze() || kthread_should_stop())
+ break;
/*
- * Fragmentation may mean that the system cannot be
- * rebalanced for high-order allocations in all zones.
- * At this point, if nr_reclaimed < SWAP_CLUSTER_MAX,
- * it means the zones have been fully scanned and are still
- * not balanced. For high-order allocations, there is
- * little point trying all over again as kswapd may
- * infinite loop.
- *
- * Instead, recheck all watermarks at order-0 as they
- * are the most important. If watermarks are ok, kswapd will go
- * back to sleep. High-order users can still perform direct
- * reclaim if they wish.
+ * Raise priority if scanning rate is too low or there was no
+ * progress in reclaiming pages
*/
- if (sc.nr_reclaimed < SWAP_CLUSTER_MAX)
- order = sc.order = 0;
-
- goto loop_again;
- }
+ if (raise_priority || sc.nr_reclaimed - nr_reclaimed == 0)
+ sc.priority--;
+ } while (sc.priority >= 0 &&
+ !pgdat_balanced(pgdat, order, *classzone_idx));
/*
* If kswapd was reclaiming at a higher order, it has the option of
@@ -2915,6 +2909,7 @@ out:
compact_pgdat(pgdat, order);
}
+out:
/*
* Return the order we were reclaiming at so prepare_kswapd_sleep()
* makes a decision on the order we were last reclaiming at. However,
--
1.8.1.4