From: Rik van Riel <riel@surriel.com>
To: Mike Galbraith <efault@gmx.de>, Andy Lutomirski <luto@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
songliubraving@fb.com, kernel-team <kernel-team@fb.com>,
Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>, X86 ML <x86@kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm
Date: Fri, 01 Jun 2018 15:43:27 -0400
Message-ID: <1527882207.7898.86.camel@surriel.com>
In-Reply-To: <1527878882.4448.11.camel@gmx.de>
On Fri, 2018-06-01 at 20:48 +0200, Mike Galbraith wrote:
> On Fri, 2018-06-01 at 14:22 -0400, Rik van Riel wrote:
> > On Fri, 2018-06-01 at 08:11 -0700, Andy Lutomirski wrote:
> > > > On Fri, Jun 1, 2018 at 5:28 AM Rik van Riel <riel@surriel.com> wrote:
> > > > >
> > > > > Song noticed switch_mm_irqs_off taking a lot of CPU time in
> > > > > recent kernels, using 2.4% of a 48 CPU system during a netperf
> > > > > to localhost run.
> > > > > Digging into the profile, we noticed that cpumask_clear_cpu and
> > > > > cpumask_set_cpu together take about half of the CPU time taken
> > > > > by switch_mm_irqs_off.
> > > >
> > > > > However, the CPUs running netperf end up switching back and
> > > > > forth between netperf and the idle task, which does not require
> > > > > changes to the mm_cpumask. Furthermore, the init_mm cpumask ends
> > > > > up being the most heavily contended one in the system.
> > > >
> > > > Skipping cpumask_clear_cpu and cpumask_set_cpu for init_mm
> > > > (mostly the idle task) reduced CPU use of switch_mm_irqs_off
> > > > from 2.4% of the CPU to 1.9% of the CPU, with the following
> > > > netperf commandline:
> > >
> > > > I'm conceptually fine with this change. Does mm_cpumask(&init_mm)
> > > > end up in a deterministic state?
> >
> > Given that we do not touch mm_cpumask(&init_mm)
> > any more, and that bitmask never appears to be
> > used for things like tlb shootdowns (kernel TLB
> > shootdowns simply go to everybody), I suspect
> > it ends up in whatever state it is initialized
> > to on startup.
> >
> > I had not looked into this much, because it does
> > not appear to be used for anything.
> >
> > > > Mike, depending on exactly what's going on with your benchmark,
> > > > this might help recover a bit of your performance, too.
> >
> > It will be interesting to know how this change
> > impacts others.
>
> previous pipe-test numbers
> 4.13.16          2.024978 usecs/loop -- avg 2.045250  977.9 KHz
> 4.14.47          2.234518 usecs/loop -- avg 2.227716  897.8 KHz
> 4.15.18          2.287815 usecs/loop -- avg 2.295858  871.1 KHz
> 4.16.13          2.286036 usecs/loop -- avg 2.279057  877.6 KHz
> 4.17.0.g88a8676  2.288231 usecs/loop -- avg 2.288917  873.8 KHz
>
> new numbers
> 4.17.0.g0512e01  2.268629 usecs/loop -- avg 2.269493  881.3 KHz
> 4.17.0.g0512e01  2.035401 usecs/loop -- avg 2.038341  981.2 KHz  +andy
> 4.17.0.g0512e01  2.238701 usecs/loop -- avg 2.231828  896.1 KHz  -andy+rik
>
> There might be something there with your change Rik, but it's small
> enough to be wary of variance. Andy's "invert the return of
> tlb_defer_switch_to_init_mm()" is OTOH pretty clear.
If inverting the return value of that function helps some systems,
chances are the other value might help other systems.

That makes you wonder whether it might make sense to always switch
to lazy TLB mode, and only call switch_mm at TLB flush time,
regardless of whether the CPU supports PCID...
--
All Rights Reversed.