From: Dave Hansen <dave@sr71.net>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Peter Zijlstra <peterz@infradead.org>,
Russell King - ARM Linux <linux@arm.linux.org.uk>,
Michal Simek <monstr@monstr.eu>,
Linus Torvalds <torvalds@linux-foundation.org>,
Will Deacon <will.deacon@arm.com>,
LKML <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>
Subject: post-3.18 performance regression in TLB flushing code
Date: Tue, 16 Dec 2014 13:36:56 -0800
Message-ID: <5490A5F8.6050504@sr71.net>
[-- Attachment #1: Type: text/plain, Size: 1438 bytes --]
I'm running the 'brk1' test from will-it-scale:
> https://github.com/antonblanchard/will-it-scale/blob/master/tests/brk1.c
on an 8-socket/160-thread system. It's seeing about a 6% drop in
performance (263M -> 247M ops/sec at 80 threads) from this commit:
commit fb7332a9fedfd62b1ba6530c86f39f0fa38afd49
Author: Will Deacon <will.deacon@arm.com>
Date:   Wed Oct 29 10:03:09 2014 +0000

    mmu_gather: move minimal range calculations into generic code

tlb_finish_mmu() goes up about 9x in the profiles (~0.4%->3.6%) and
tlb_flush_mmu_free() takes about 3.1% of CPU time with the patch
applied, but does not show up at all on the commit before.
This isn't a major regression, but it is rather unfortunate for a patch
that is apparently a code cleanup. It also _looks_ to show up even when
things are single-threaded, although I haven't looked at it in detail.
I suspect the tlb->need_flush logic was serving some role that the
modified code isn't capturing, as in this hunk:
>  void tlb_flush_mmu(struct mmu_gather *tlb)
>  {
> -	if (!tlb->need_flush)
> -		return;
>  	tlb_flush_mmu_tlbonly(tlb);
>  	tlb_flush_mmu_free(tlb);
>  }
tlb_flush_mmu_tlbonly() has a tlb->end check (which replaces the
->need_flush logic), but tlb_flush_mmu_free() does not.
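
To make the asymmetry concrete, here is a toy user-space model of that call path. It is only a sketch under loud assumptions: struct toy_gather, the toy_* functions, the guard parameter, and the two counters are all made up for illustration and are not the kernel's definitions. The only behavior taken from the code above is that the tlbonly helper bails out when ->end is zero, while the free helper has no such check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mmu_gather; only 'end' matters here. */
struct toy_gather {
	unsigned long start;
	unsigned long end;        /* 0 means no range was gathered */
	unsigned int tlb_flushes; /* how often real flush work was done */
	unsigned int free_calls;  /* how often the free path was entered */
};

/* Models tlb_flush_mmu_tlbonly(): skips all work when ->end is zero. */
static void toy_flush_mmu_tlbonly(struct toy_gather *tlb)
{
	if (!tlb->end)
		return;

	tlb->tlb_flushes++;
	/* Toy range reset so a second call sees an empty gather. */
	tlb->start = 0;
	tlb->end = 0;
}

/* Models tlb_flush_mmu_free(): entered unconditionally, no ->end check. */
static void toy_flush_mmu_free(struct toy_gather *tlb)
{
	tlb->free_calls++;
}

/*
 * Models tlb_flush_mmu(). 'guard' is an illustration-only switch:
 * 0 mimics the code without an early return, 1 mimics an early
 * return when nothing was gathered.
 */
static void toy_flush_mmu(struct toy_gather *tlb, int guard)
{
	assert(tlb != NULL);

	if (guard && !tlb->end)
		return;

	toy_flush_mmu_tlbonly(tlb);
	toy_flush_mmu_free(tlb);
}
```

Calling toy_flush_mmu() with guard == 0 on an empty gather leaves tlb_flushes alone but still bumps free_calls; with guard == 1 neither counter moves. The model says nothing about how expensive the free path is, only that it is reached.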
If we add a !tlb->end check (patch attached) to tlb_flush_mmu(), that gets us
back up to ~258M ops/sec, but that's still ~2% down from where we started.
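
For reference, the percentages quoted in this mail follow directly from the ops/sec figures. A quick way to check them is sketched below; the helper name pct_drop() is made up for this example, and the inputs are the rounded numbers from the text, in millions of ops/sec.

```c
/*
 * Relative drop from 'before' to 'after', in percent.
 * Hypothetical helper, not part of will-it-scale or the kernel.
 */
static double pct_drop(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

pct_drop(263.0, 247.0) comes out a little above 6, and pct_drop(263.0, 258.0) a little below 2, which matches the "about 6%" and "~2%" figures.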
[-- Attachment #2: fix-old-need_flush-logic.patch --]
[-- Type: text/x-patch, Size: 461 bytes --]
---
b/mm/memory.c | 3 +++
1 file changed, 3 insertions(+)
diff -puN mm/memory.c~fix-old-need_flush-logic mm/memory.c
--- a/mm/memory.c~fix-old-need_flush-logic 2014-12-16 13:24:27.338557014 -0800
+++ b/mm/memory.c 2014-12-16 13:24:50.412598019 -0800
@@ -258,6 +258,9 @@ static void tlb_flush_mmu_free(struct mm
 void tlb_flush_mmu(struct mmu_gather *tlb)
 {
+	if (!tlb->end)
+		return;
+
 	tlb_flush_mmu_tlbonly(tlb);
 	tlb_flush_mmu_free(tlb);
 }
_