From: Mel Gorman <mgorman@suse.de>
To: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Alex Shi <alex.shi@linaro.org>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
Fengguang Wu <fengguang.wu@intel.com>,
H Peter Anvin <hpa@zytor.com>, Linux-X86 <x86@kernel.org>,
Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB range flush v2
Date: Tue, 17 Dec 2013 14:32:53 +0000
Message-ID: <20131217143253.GB11295@suse.de>
In-Reply-To: <20131217110051.GA27701@gmail.com>
On Tue, Dec 17, 2013 at 12:00:51PM +0100, Ingo Molnar wrote:
>
> > sched: Assign correct scheduling domain to sd_llc
> >
> > Commit 42eb088e (sched: Avoid NULL dereference on sd_busy) corrected a NULL
> > dereference on sd_busy but the fix also altered what scheduling domain it
> > used for sd_llc. One impact of this is that a task selecting a runqueue may
> > consider idle CPUs that are not cache siblings as candidates for running.
> > Tasks are then running on CPUs that are not cache hot.
> >
> > <PATCH SNIPPED>
>
> Indeed that makes a lot of sense, thanks Mel for tracking down this
> part of the puzzle! Will get your fix to Linus ASAP.
>
> Does this fix also speed up Ebizzy's transaction performance, or is
> its main effect a reduction in workload variation noise?
>
Mixed results, some gains and some losses.
3.13.0-rc3 3.13.0-rc3 3.4.69 3.13.0-rc3
vanilla nowalk-v2r7 vanilla fixsd-v3r3
Mean 1 7295.77 ( 0.00%) 7835.63 ( 7.40%) 6713.32 ( -7.98%) 7757.03 ( 6.32%)
Mean 2 8252.58 ( 0.00%) 9554.63 ( 15.78%) 8334.43 ( 0.99%) 9457.34 ( 14.60%)
Mean 3 8179.74 ( 0.00%) 9032.46 ( 10.42%) 8134.42 ( -0.55%) 8928.25 ( 9.15%)
Mean 4 7862.45 ( 0.00%) 8688.01 ( 10.50%) 7966.27 ( 1.32%) 8560.87 ( 8.88%)
Mean 5 7170.24 ( 0.00%) 8216.15 ( 14.59%) 7820.63 ( 9.07%) 8270.72 ( 15.35%)
Mean 6 6835.10 ( 0.00%) 7866.95 ( 15.10%) 7773.30 ( 13.73%) 7998.50 ( 17.02%)
Mean 7 6740.99 ( 0.00%) 7586.36 ( 12.54%) 7712.45 ( 14.41%) 7519.46 ( 11.55%)
Mean 8 6494.01 ( 0.00%) 6849.82 ( 5.48%) 7705.62 ( 18.66%) 6842.44 ( 5.37%)
Mean 12 6567.37 ( 0.00%) 6973.66 ( 6.19%) 7554.82 ( 15.04%) 6471.83 ( -1.45%)
Mean 16 6630.26 ( 0.00%) 7042.52 ( 6.22%) 7331.04 ( 10.57%) 6380.16 ( -3.77%)
Range 1 767.00 ( 0.00%) 194.00 ( 74.71%) 661.00 ( 13.82%) 217.00 ( 71.71%)
Range 2 178.00 ( 0.00%) 185.00 ( -3.93%) 592.00 (-232.58%) 240.00 (-34.83%)
Range 3 175.00 ( 0.00%) 213.00 (-21.71%) 431.00 (-146.29%) 511.00 (-192.00%)
Range 4 806.00 ( 0.00%) 924.00 (-14.64%) 542.00 ( 32.75%) 723.00 ( 10.30%)
Range 5 544.00 ( 0.00%) 438.00 ( 19.49%) 444.00 ( 18.38%) 663.00 (-21.88%)
Range 6 399.00 ( 0.00%) 1111.00 (-178.45%) 528.00 (-32.33%) 1031.00 (-158.40%)
Range 7 629.00 ( 0.00%) 895.00 (-42.29%) 467.00 ( 25.76%) 877.00 (-39.43%)
Range 8 400.00 ( 0.00%) 255.00 ( 36.25%) 435.00 ( -8.75%) 656.00 (-64.00%)
Range 12 233.00 ( 0.00%) 108.00 ( 53.65%) 330.00 (-41.63%) 343.00 (-47.21%)
Range 16 141.00 ( 0.00%) 134.00 ( 4.96%) 496.00 (-251.77%) 291.00 (-106.38%)
Stddev 1 73.94 ( 0.00%) 52.33 ( 29.23%) 177.17 (-139.59%) 37.34 ( 49.51%)
Stddev 2 23.47 ( 0.00%) 42.08 (-79.24%) 88.91 (-278.74%) 38.16 (-62.58%)
Stddev 3 36.48 ( 0.00%) 29.02 ( 20.45%) 101.07 (-177.05%) 134.62 (-269.01%)
Stddev 4 158.37 ( 0.00%) 133.99 ( 15.40%) 130.52 ( 17.59%) 150.61 ( 4.90%)
Stddev 5 116.74 ( 0.00%) 76.76 ( 34.25%) 78.31 ( 32.92%) 116.67 ( 0.06%)
Stddev 6 66.34 ( 0.00%) 273.87 (-312.83%) 87.79 (-32.33%) 235.11 (-254.40%)
Stddev 7 145.62 ( 0.00%) 174.99 (-20.16%) 90.52 ( 37.84%) 156.08 ( -7.18%)
Stddev 8 68.51 ( 0.00%) 47.58 ( 30.54%) 81.11 (-18.39%) 96.00 (-40.13%)
Stddev 12 32.15 ( 0.00%) 20.18 ( 37.22%) 65.74 (-104.50%) 45.00 (-39.99%)
Stddev 16 21.59 ( 0.00%) 20.29 ( 6.01%) 86.42 (-300.25%) 38.20 (-76.93%)
fixsd-v3r3 is all the patches discussed so far applied. It wins at the lower
thread counts and loses at the higher ones, and at the higher thread counts
the results are still worse than 3.4.69.
To complicate matters further, additional testing indicated that the
tlbflush shift change *may* have made the variation worse. I was preparing
to bisect for the patches that increased the "thread performance spread"
in ebizzy and tested a number of potential bisect points:
Tue 17 Dec 11:11:08 GMT 2013 ivy ebizzyrange v3.12 mean-max:36 good
Tue 17 Dec 11:32:28 GMT 2013 ivy ebizzyrange v3.13-rc3 mean-max:80 bad
Tue 17 Dec 12:00:23 GMT 2013 ivy ebizzyrange v3.4 mean-max:0 good
Tue 17 Dec 12:21:58 GMT 2013 ivy ebizzyrange v3.10 mean-max:26 good
Tue 17 Dec 12:42:49 GMT 2013 ivy ebizzyrange v3.11 mean-max:7 good
Tue 17 Dec 13:32:14 GMT 2013 ivy ebizzyrange x86-tlb-range-flush-optimisation-v3r3 mean-max:110 bad
This is part of the log from an automated bisection script. mean-max is
the worst average spread recorded across all thread counts tested. It tells
me that the worst thread spread seen by v3.13-rc3 is 80 and the worst seen
by the patch series (tlbflush shift change, sd_llc fix, etc.) is 110. The
bisection runs very few iterations so it could just be coincidence, but it
makes sense: if the kernel is scheduling tasks on CPUs that are not cache
siblings then the cost of remote TLB flushes (range or otherwise) changes.
(A quick way to check which CPUs actually share a last-level cache is
sketched after the patch list below.) It's an important enough problem
that I feel compelled to retest with
x86: mm: Clean up inconsistencies when flushing TLB ranges
x86: mm: Account for TLB flushes only when debugging
x86: mm: Eliminate redundant page table walk during TLB range flushing
sched: Assign correct scheduling domain to sd_llc
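As an aside, the sketch mentioned above: a quick way to double-check which
CPUs actually share a last-level cache on the test machine is to read
shared_cpu_list from sysfs. This is only a rough sketch and it assumes the
LLC is the highest-numbered cache index exported for each CPU, which holds
on the Intel boxes here but is worth verifying.

#include <stdio.h>

int main(void)
{
	char path[128], line[256];
	int cpu;

	for (cpu = 0; ; cpu++) {
		int idx, llc = -1;
		FILE *f;

		/* Highest cache index exported for this CPU, assumed to be the LLC */
		for (idx = 0; idx < 10; idx++) {
			snprintf(path, sizeof(path),
				 "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list",
				 cpu, idx);
			f = fopen(path, "r");
			if (!f)
				break;
			fclose(f);
			llc = idx;
		}
		if (llc < 0)
			break;	/* ran out of CPUs (or no cache info exported) */

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list",
			 cpu, llc);
		f = fopen(path, "r");
		if (f) {
			if (fgets(line, sizeof(line), f))
				printf("cpu%d LLC siblings: %s", cpu, line);
			fclose(f);
		}
	}
	return 0;
}

If ebizzy threads end up spread across two different sibling lists then the
remote flushes are crossing the LLC boundary, which would be consistent
with the worse spread.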
I'll then re-evaluate the tlbflush shift patch based on what falls out of
that test. It may turn out that the tlbflush shift on its own simply cannot
optimise for both the tlbflush microbenchmark and ebizzy, as the former
deals with the average cost and the latter hits the worst case every time.
At that point it'll be time to look at profiles and see where we are
actually spending time, because the possibilities of finding things to fix
through bisection will have been exhausted.
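To make that trade-off concrete, here is a simplified model of how the
shift gets used. This is illustrative only and not the actual
flush_tlb_mm_range() logic (which also factors in the mm size and the
detected TLB entry counts): the shift sets a single cut-off between
flushing a range page by page and flushing the whole TLB, and one cut-off
tuned for the average case can still leave a workload on the expensive
side of it every time.

#include <stdbool.h>

#define ASSUMED_TLB_ENTRIES 512	/* stand-in for the detected dTLB size */

/*
 * Illustrative model only: decide whether to flush the whole TLB (true)
 * or flush the range page by page (false), based on how large the range
 * is relative to the TLB size shifted by tlb_flushall_shift.
 */
static bool flush_everything(unsigned long nr_pages, int tlb_flushall_shift)
{
	if (tlb_flushall_shift < 0)	/* -1 means range flushing is disabled */
		return true;

	return nr_pages > (ASSUMED_TLB_ENTRIES >> tlb_flushall_shift);
}

int main(void)
{
	/* With a shift of 2, any range over 512 >> 2 = 128 pages is a full flush */
	return flush_everything(200, 2) ? 0 : 1;
}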
> Also it appears the Ebizzy numbers ought to be stable enough now to
> make the range-TLB-flush measurements more precise?
>
Right now, the tlbflush microbenchmark figures look awful on the 8-core
machine when the tlbflush shift patch and the scheduling domain fix are
both applied.
--
Mel Gorman
SUSE Labs