linux-mm.kvack.org archive mirror
From: Mel Gorman <mgorman@techsingularity.net>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>, Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: [PATCH 05/22] mm, page_alloc: Inline the fast path of the zonelist iterator
Date: Mon, 11 Apr 2016 09:13:28 +0100
Message-ID: <1460362424-26369-6-git-send-email-mgorman@techsingularity.net>
In-Reply-To: <1460362424-26369-1-git-send-email-mgorman@techsingularity.net>

The page allocator iterates through a zonelist looking for zones that match
the addressing limitations and nodemask of the caller, but many allocations
are not restricted at all. Despite this, there is always function call
overhead, which builds up.

This patch inlines the optimistic basic case and only calls the
iterator function for the complex case. A hindrance was the fact that
cpuset_current_mems_allowed is used in the fastpath as the allowed nodemask
even though all nodes are allowed on most systems. The patch handles this
by only considering cpuset_current_mems_allowed if a cpuset exists. As well
as being faster in the fast-path, this removes some junk in the slowpath.
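
As a rough userspace illustration of the two ideas above (an inlined check
for the unrestricted case with an out-of-line fallback, and consulting the
cpuset mask only when cpusets are in use), here is a minimal sketch. It is
not the kernel code: every name ending in _sketch, the -1 list terminator,
the bitmask standing in for nodemask_t and the slow_path_calls counter are
assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct zoneref and nodemask_t. */
struct zoneref_sketch {
	int zone_idx;	/* zone type index; -1 marks the end of the list */
	int node;	/* NUMA node the zone belongs to */
};
typedef unsigned long nodemask_sketch_t;	/* bit n set => node n allowed */

static unsigned long slow_path_calls;	/* counts out-of-line calls */

/* Slow path: skip zones above highest_zoneidx or outside the nodemask. */
static struct zoneref_sketch *__next_zone_sketch(struct zoneref_sketch *z,
					int highest_zoneidx,
					const nodemask_sketch_t *nodes)
{
	slow_path_calls++;
	while (z->zone_idx != -1 &&
	       (z->zone_idx > highest_zoneidx ||
		(nodes && !(*nodes & (1UL << z->node)))))
		z++;
	return z;
}

/* Fast path: no nodemask and the cursor already points at a usable zone. */
static inline struct zoneref_sketch *next_zone_sketch(struct zoneref_sketch *z,
					int highest_zoneidx,
					const nodemask_sketch_t *nodes)
{
	if (!nodes && z->zone_idx <= highest_zoneidx)
		return z;
	return __next_zone_sketch(z, highest_zoneidx, nodes);
}

/* Only fall back to the cpuset mask when cpusets are actually in use. */
static const nodemask_sketch_t *effective_nodemask_sketch(
					const nodemask_sketch_t *requested,
					bool cpusets_in_use,
					const nodemask_sketch_t *mems_allowed)
{
	if (cpusets_in_use && !requested)
		return mems_allowed;
	return requested;
}

/* Walk a zonelist and count the zones the caller may allocate from. */
static int count_usable_zones_sketch(struct zoneref_sketch *zonelist,
					int highest_zoneidx,
					const nodemask_sketch_t *nodes)
{
	struct zoneref_sketch *z;
	int count = 0;

	assert(zonelist != NULL);
	for (z = next_zone_sketch(zonelist, highest_zoneidx, nodes);
	     z->zone_idx != -1;
	     z = next_zone_sketch(z + 1, highest_zoneidx, nodes))
		count++;
	return count;
}
```

In this sketch, walking a list with a NULL nodemask and a highest_zoneidx
that admits every zone never increments slow_path_calls, whereas passing any
non-NULL nodemask sends every step through the out-of-line function. That is
why the sketch leaves the nodemask NULL unless cpusets are in use.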

The performance difference on a page allocator microbenchmark is:

                                           4.6.0-rc2                  4.6.0-rc2
                                    statinline-v1r20              optiter-v1r20
Min      alloc-odr0-1               412.00 (  0.00%)           382.00 (  7.28%)
Min      alloc-odr0-2               301.00 (  0.00%)           282.00 (  6.31%)
Min      alloc-odr0-4               247.00 (  0.00%)           233.00 (  5.67%)
Min      alloc-odr0-8               215.00 (  0.00%)           203.00 (  5.58%)
Min      alloc-odr0-16              199.00 (  0.00%)           188.00 (  5.53%)
Min      alloc-odr0-32              191.00 (  0.00%)           182.00 (  4.71%)
Min      alloc-odr0-64              187.00 (  0.00%)           177.00 (  5.35%)
Min      alloc-odr0-128             185.00 (  0.00%)           175.00 (  5.41%)
Min      alloc-odr0-256             193.00 (  0.00%)           184.00 (  4.66%)
Min      alloc-odr0-512             207.00 (  0.00%)           197.00 (  4.83%)
Min      alloc-odr0-1024            213.00 (  0.00%)           203.00 (  4.69%)
Min      alloc-odr0-2048            220.00 (  0.00%)           209.00 (  5.00%)
Min      alloc-odr0-4096            226.00 (  0.00%)           214.00 (  5.31%)
Min      alloc-odr0-8192            229.00 (  0.00%)           218.00 (  4.80%)
Min      alloc-odr0-16384           229.00 (  0.00%)           219.00 (  4.37%)
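
The percentages in parentheses are consistent with the relative reduction
against the baseline column. A small sketch of that arithmetic follows; the
helper name gain_percent is an assumption for illustration, and the formula
is inferred from the figures in the table rather than stated in the report.

```c
/*
 * Relative gain of 'patched' over 'baseline', in percent.
 * Assumes lower values are better, as in the table above.
 */
static double gain_percent(double baseline, double patched)
{
	return (baseline - patched) / baseline * 100.0;
}
```

For the alloc-odr0-1 row, gain_percent(412.0, 382.0) evaluates to about 7.28.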

perf indicated that next_zones_zonelist disappeared from the profile and
__next_zones_zonelist did not appear. This is expected as the microbenchmark
would hit the inlined fast path every time.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/mmzone.h | 13 +++++++++++--
 mm/mmzone.c            |  2 +-
 mm/page_alloc.c        | 26 +++++++++-----------------
 3 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index c60df9257cc7..0c4d5ebb3849 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -922,6 +922,10 @@ static inline int zonelist_node_idx(struct zoneref *zoneref)
 #endif /* CONFIG_NUMA */
 }
 
+struct zoneref *__next_zones_zonelist(struct zoneref *z,
+					enum zone_type highest_zoneidx,
+					nodemask_t *nodes);
+
 /**
  * next_zones_zonelist - Returns the next zone at or below highest_zoneidx within the allowed nodemask using a cursor within a zonelist as a starting point
  * @z - The cursor used as a starting point for the search
@@ -934,9 +938,14 @@ static inline int zonelist_node_idx(struct zoneref *zoneref)
  * being examined. It should be advanced by one before calling
  * next_zones_zonelist again.
  */
-struct zoneref *next_zones_zonelist(struct zoneref *z,
+static __always_inline struct zoneref *next_zones_zonelist(struct zoneref *z,
 					enum zone_type highest_zoneidx,
-					nodemask_t *nodes);
+					nodemask_t *nodes)
+{
+	if (likely(!nodes && zonelist_zone_idx(z) <= highest_zoneidx))
+		return z;
+	return __next_zones_zonelist(z, highest_zoneidx, nodes);
+}
 
 /**
  * first_zones_zonelist - Returns the first zone at or below highest_zoneidx within the allowed nodemask in a zonelist
diff --git a/mm/mmzone.c b/mm/mmzone.c
index 52687fb4de6f..5652be858e5e 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -52,7 +52,7 @@ static inline int zref_in_nodemask(struct zoneref *zref, nodemask_t *nodes)
 }
 
 /* Returns the next zone at or below highest_zoneidx in a zonelist */
-struct zoneref *next_zones_zonelist(struct zoneref *z,
+struct zoneref *__next_zones_zonelist(struct zoneref *z,
 					enum zone_type highest_zoneidx,
 					nodemask_t *nodes)
 {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b56c2b2911a2..e9acc0b0f787 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3193,17 +3193,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 */
 	alloc_flags = gfp_to_alloc_flags(gfp_mask);
 
-	/*
-	 * Find the true preferred zone if the allocation is unconstrained by
-	 * cpusets.
-	 */
-	if (!(alloc_flags & ALLOC_CPUSET) && !ac->nodemask) {
-		struct zoneref *preferred_zoneref;
-		preferred_zoneref = first_zones_zonelist(ac->zonelist,
-				ac->high_zoneidx, NULL, &ac->preferred_zone);
-		ac->classzone_idx = zonelist_zone_idx(preferred_zoneref);
-	}
-
 	/* This is the last chance, in general, before the goto nopage. */
 	page = get_page_from_freelist(gfp_mask, order,
 				alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
@@ -3359,14 +3348,21 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
 	struct zoneref *preferred_zoneref;
 	struct page *page = NULL;
 	unsigned int cpuset_mems_cookie;
-	int alloc_flags = ALLOC_WMARK_LOW|ALLOC_CPUSET|ALLOC_FAIR;
+	int alloc_flags = ALLOC_WMARK_LOW|ALLOC_FAIR;
 	gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
 	struct alloc_context ac = {
 		.high_zoneidx = gfp_zone(gfp_mask),
+		.zonelist = zonelist,
 		.nodemask = nodemask,
 		.migratetype = gfpflags_to_migratetype(gfp_mask),
 	};
 
+	if (cpusets_enabled()) {
+		alloc_flags |= ALLOC_CPUSET;
+		if (!ac.nodemask)
+			ac.nodemask = &cpuset_current_mems_allowed;
+	}
+
 	gfp_mask &= gfp_allowed_mask;
 
 	lockdep_trace_alloc(gfp_mask);
@@ -3390,16 +3386,12 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
 retry_cpuset:
 	cpuset_mems_cookie = read_mems_allowed_begin();
 
-	/* We set it here, as __alloc_pages_slowpath might have changed it */
-	ac.zonelist = zonelist;
-
 	/* Dirty zone balancing only done in the fast path */
 	ac.spread_dirty_pages = (gfp_mask & __GFP_WRITE);
 
 	/* The preferred zone is used for statistics later */
 	preferred_zoneref = first_zones_zonelist(ac.zonelist, ac.high_zoneidx,
-				ac.nodemask ? : &cpuset_current_mems_allowed,
-				&ac.preferred_zone);
+				ac.nodemask, &ac.preferred_zone);
 	if (!ac.preferred_zone)
 		goto out;
 	ac.classzone_idx = zonelist_zone_idx(preferred_zoneref);
-- 
2.6.4


