From: Mel Gorman <mgorman@suse.com>
To: Linux-MM <linux-mm@kvack.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Rik van Riel <riel@redhat.com>, Vlastimil Babka <vbabka@suse.cz>,
Pintu Kumar <pintu.k@samsung.com>,
Xishi Qiu <qiuxishi@huawei.com>, Gioh Kim <gioh.kim@lge.com>,
LKML <linux-kernel@vger.kernel.org>,
Mel Gorman <mgorman@techsingularity.net>
Subject: [PATCH 10/10] mm, page_alloc: Only enforce watermarks for order-0 allocations
Date: Mon, 20 Jul 2015 09:00:19 +0100 [thread overview]
Message-ID: <1437379219-9160-11-git-send-email-mgorman@suse.com> (raw)
In-Reply-To: <1437379219-9160-1-git-send-email-mgorman@suse.com>
From: Mel Gorman <mgorman@suse.de>
The primary purpose of watermarks is to ensure that reclaim can always
make forward progress in PF_MEMALLOC context (kswapd and direct reclaim).
These assume that order-0 allocations are all that is necessary for
forward progress.
High-order watermarks serve a different purpose. Kswapd had no high-order
awareness before they were introduced (https://lkml.org/lkml/2004/9/5/9).
This was particularly important when there were high-order atomic requests.
The watermarks both gave kswapd awareness and made a reserve for those
atomic requests.
There are two important side-effects of this. The most important is that
a non-atomic high-order request can fail even though free pages are available
and the order-0 watermarks are ok. The second is that high-order watermark
checks are expensive as the free list counts up to the requested order must
be examined.
With the introduction of MIGRATE_HIGHATOMIC it is no longer necessary to
have high-order watermarks. Kswapd and compaction still need high-order
awareness which is handled by checking that at least one suitable high-order
page is free.
In kernel 4.2-rc1 running this workload on a single-node machine there
were 339574 allocation failures. With HighAtomic reserves, it drops to
28798 failures. With this patch applied, it drops to 9567 failures --
a 98% reduction compared to the vanilla kernel or 67% in comparison to
having high atomic reserves with watermark checking.
The one potential side-effect of this is that in a vanilla kernel, the
watermark checks may have kept a free page for an atomic allocation. Now,
we are 100% relying on the HighAtomic reserves and an early allocation to
have allocated them. If the first high-order atomic allocation is after
the system is already heavily fragmented then it'll fail.
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/page_alloc.c | 38 ++++++++++++++++++++++++--------------
1 file changed, 24 insertions(+), 14 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e5755390a5e5..e756df60dba6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2250,8 +2250,10 @@ static inline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
#endif /* CONFIG_FAIL_PAGE_ALLOC */
/*
- * Return true if free pages are above 'mark'. This takes into account the order
- * of the allocation.
+ * Return true if free base pages are above 'mark'. For high-order checks it
+ * will return true of the order-0 watermark is reached and there is at least
+ * one free page of a suitable size. Checking now avoids taking the zone lock
+ * to check in the allocation paths if no pages are free.
*/
static bool __zone_watermark_ok(struct zone *z, unsigned int order,
unsigned long mark, int classzone_idx, int alloc_flags,
@@ -2259,7 +2261,7 @@ static bool __zone_watermark_ok(struct zone *z, unsigned int order,
{
long min = mark;
int o;
- long free_cma = 0;
+ const bool atomic = (alloc_flags & ALLOC_HARDER);
/* free_pages may go negative - that's OK */
free_pages -= (1 << order) - 1;
@@ -2271,7 +2273,7 @@ static bool __zone_watermark_ok(struct zone *z, unsigned int order,
* If the caller is not atomic then discount the reserves. This will
* over-estimate how the atomic reserve but it avoids a search
*/
- if (likely(!(alloc_flags & ALLOC_HARDER)))
+ if (likely(!atomic))
free_pages -= z->nr_reserved_highatomic;
else
min -= min / 4;
@@ -2279,22 +2281,30 @@ static bool __zone_watermark_ok(struct zone *z, unsigned int order,
#ifdef CONFIG_CMA
/* If allocation can't use CMA areas don't use free CMA pages */
if (!(alloc_flags & ALLOC_CMA))
- free_cma = zone_page_state(z, NR_FREE_CMA_PAGES);
+ free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);
#endif
- if (free_pages - free_cma <= min + z->lowmem_reserve[classzone_idx])
+ if (free_pages <= min + z->lowmem_reserve[classzone_idx])
return false;
- for (o = 0; o < order; o++) {
- /* At the next order, this order's pages become unavailable */
- free_pages -= z->free_area[o].nr_free << o;
- /* Require fewer higher order pages to be free */
- min >>= 1;
+ /* order-0 watermarks are ok */
+ if (!order)
+ return true;
+
+ /* Check at least one high-order page is free */
+ for (o = order; o < MAX_ORDER; o++) {
+ struct free_area *area = &z->free_area[o];
+ int mt;
+
+ if (atomic && area->nr_free)
+ return true;
- if (free_pages <= min)
- return false;
+ for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
+ if (!list_empty(&area->free_list[mt]))
+ return true;
+ }
}
- return true;
+ return false;
}
bool zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
--
2.4.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-07-20 8:00 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-20 8:00 [RFC PATCH 00/10] Remove zonelist cache and high-order watermark checking Mel Gorman
2015-07-20 8:00 ` [PATCH 01/10] mm, page_alloc: Delete the zonelist_cache Mel Gorman
2015-07-21 23:47 ` David Rientjes
2015-07-23 10:58 ` Mel Gorman
2015-07-20 8:00 ` [PATCH 02/10] mm, page_alloc: Remove unnecessary parameter from zone_watermark_ok_safe Mel Gorman
2015-07-21 23:49 ` David Rientjes
2015-07-28 12:20 ` Vlastimil Babka
2015-07-20 8:00 ` [PATCH 03/10] mm, page_alloc: Remove unnecessary recalculations for dirty zone balancing Mel Gorman
2015-07-22 0:08 ` David Rientjes
2015-07-23 12:28 ` Mel Gorman
2015-07-28 12:25 ` Vlastimil Babka
2015-07-20 8:00 ` [PATCH 04/10] mm, page_alloc: Remove unnecessary taking of a seqlock when cpusets are disabled Mel Gorman
2015-07-22 0:11 ` David Rientjes
2015-07-28 12:32 ` Vlastimil Babka
2015-07-20 8:00 ` [PATCH 05/10] mm, page_alloc: Remove unnecessary updating of GFP flags during normal operation Mel Gorman
2015-07-28 13:36 ` Vlastimil Babka
2015-07-28 13:47 ` Peter Zijlstra
2015-07-28 15:48 ` Mel Gorman
2015-07-20 8:00 ` [PATCH 06/10] mm, page_alloc: Use jump label to check if page grouping by mobility is enabled Mel Gorman
2015-07-28 13:42 ` Vlastimil Babka
2015-07-20 8:00 ` [PATCH 07/10] mm, page_alloc: Use masks and shifts when converting GFP flags to migrate types Mel Gorman
2015-07-20 8:00 ` [PATCH 08/10] mm, page_alloc: Remove MIGRATE_RESERVE Mel Gorman
2015-07-29 9:59 ` Vlastimil Babka
2015-07-29 12:25 ` Mel Gorman
2015-07-20 8:00 ` [PATCH 09/10] mm, page_alloc: Reserve pageblocks for high-order atomic allocations on demand Mel Gorman
2015-07-29 11:35 ` Vlastimil Babka
2015-07-29 12:53 ` Mel Gorman
2015-07-31 8:28 ` Vlastimil Babka
2015-07-31 8:43 ` Mel Gorman
2015-07-31 5:54 ` Joonsoo Kim
2015-07-31 7:11 ` Mel Gorman
2015-07-31 7:25 ` Vlastimil Babka
2015-07-31 8:22 ` Mel Gorman
2015-07-31 8:30 ` Joonsoo Kim
2015-07-31 8:26 ` Joonsoo Kim
2015-07-31 8:41 ` Mel Gorman
2015-07-20 8:00 ` Mel Gorman [this message]
2015-07-29 12:25 ` [PATCH 10/10] mm, page_alloc: Only enforce watermarks for order-0 allocations Vlastimil Babka
2015-07-29 13:04 ` Mel Gorman
2015-07-31 6:08 ` Joonsoo Kim
2015-07-31 7:19 ` Mel Gorman
2015-07-31 8:40 ` Joonsoo Kim
2015-07-31 6:14 ` [RFC PATCH 00/10] Remove zonelist cache and high-order watermark checking Joonsoo Kim
2015-07-31 7:20 ` Mel Gorman
-- strict thread matches above, loose matches on Subject: below --
2015-08-12 10:45 [PATCH 00/10] Remove zonelist cache and high-order watermark checking v2 Mel Gorman
2015-08-12 10:45 ` [PATCH 10/10] mm, page_alloc: Only enforce watermarks for order-0 allocations Mel Gorman
2015-09-21 10:52 [PATCH 00/10] Remove zonelist cache and high-order watermark checking v4 Mel Gorman
2015-09-21 12:03 ` [PATCH 10/10] mm, page_alloc: Only enforce watermarks for order-0 allocations Mel Gorman
2015-09-25 19:32 ` Johannes Weiner
2015-09-29 21:05 ` Andrew Morton
2015-09-30 8:46 ` Mel Gorman
2015-09-30 14:17 ` Vlastimil Babka
2015-09-30 15:12 ` Mel Gorman
2015-09-30 20:37 ` Andrew Morton
2015-09-30 14:11 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1437379219-9160-11-git-send-email-mgorman@suse.com \
--to=mgorman@suse.com \
--cc=gioh.kim@lge.com \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=pintu.k@samsung.com \
--cc=qiuxishi@huawei.com \
--cc=riel@redhat.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).