[PATCH v2] mm/page_alloc.c: inline __rmqueue()

From: Aaron Lu <aaron.lu@intel.com>
To: linux-mm <linux-mm@kvack.org>, lkml <linux-kernel@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andi Kleen <ak@linux.intel.com>,
	Huang Ying <ying.huang@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Kemi Wang <kemi.wang@intel.com>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>
Subject: [PATCH v2] mm/page_alloc.c: inline __rmqueue()
Date: Tue, 10 Oct 2017 10:56:01 +0800
Message-ID: <20171010025601.GE1798@intel.com>
In-Reply-To: <20171010025151.GD1798@intel.com>

__rmqueue() is called by rmqueue_bulk() and rmqueue() under zone->lock
and the two __rmqueue() call sites are in very hot page allocator paths.

Since __rmqueue() is a small function, inlining it can save us some time.
With the will-it-scale/page_fault1/process benchmark, when using nr_cpu
processes to stress the buddy allocator, this patch improved the
benchmark by 6.3% on a 2-socket Intel Skylake system and 4.6% on a
4-socket Intel Skylake system. The benefit is smaller on the 4-socket
machine because the lock contention there
(perf-profile/native_queued_spin_lock_slowpath=81%) is less severe than
on the 2-socket machine (84%).

The benchmark forks nr_cpu processes, and each process then does the
following in a loop:
    1 mmap() 128M of anonymous space;
    2 write to each page there to trigger actual page allocation;
    3 munmap() it.
https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault1.c
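
For readers who do not want to open the link, the per-process loop
described above can be sketched roughly as follows. This is only an
illustration of the three steps, not the actual page_fault1.c: the
function names fault_region_once() and fault_loop() are made up here,
and the real test runs until the harness stops it rather than for a
fixed iteration count.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE	(128UL * 1024 * 1024)	/* 128M, as described above */

/*
 * One iteration (hypothetical helper): map an anonymous region, write to
 * every page so the kernel has to allocate it, then unmap the region.
 * Returns the number of pages touched, or -1 on failure.
 */
static long fault_region_once(size_t size)
{
	long page_size = sysconf(_SC_PAGESIZE);
	long touched = 0;
	char *region;
	size_t off;

	if (page_size <= 0)
		return -1;

	/* step 1: mmap() anonymous space */
	region = mmap(NULL, size, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (region == MAP_FAILED)
		return -1;

	/* step 2: one write per page triggers the actual page allocation */
	for (off = 0; off < size; off += (size_t)page_size) {
		region[off] = 1;
		touched++;
	}

	/* step 3: munmap() hands the pages back to the allocator */
	if (munmap(region, size) != 0)
		return -1;

	return touched;
}

/*
 * Repeat the iteration a fixed number of times (hypothetical helper; the
 * iteration count is an assumption made to keep the sketch bounded).
 * Returns the total number of pages touched, or -1 on failure.
 */
static long fault_loop(unsigned int iterations, size_t size)
{
	long total = 0;
	unsigned int i;

	for (i = 0; i < iterations; i++) {
		long touched = fault_region_once(size);

		if (touched < 0)
			return -1;
		total += touched;
	}
	return total;
}
```

Calling fault_loop(n, REGION_SIZE) from nr_cpu forked processes would
approximate the load pattern: every iteration faults in each page of the
region once and then frees them all, which is what keeps the page
allocator paths hot.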

This patch adds inline to __rmqueue(). According to size(1), the size
of vmlinux does not change with this patch.

without this patch:
   text    data     bss     dec     hex     filename
9968576 5793372 17715200  33477148  1fed21c vmlinux

with this patch:
   text    data     bss     dec     hex     filename
9968576 5793372 17715200  33477148  1fed21c vmlinux

Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Tested-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
---
v2: change commit message according to Dave Hansen's suggestion.

 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e309ce4a44a..c9605c7ebaf6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2291,7 +2291,7 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype)
  * Do the hard work of removing an element from the buddy allocator.
  * Call me with the zone->lock already held.
  */
-static struct page *__rmqueue(struct zone *zone, unsigned int order,
+static inline struct page *__rmqueue(struct zone *zone, unsigned int order,
 				int migratetype)
 {
 	struct page *page;
-- 
2.13.6

