From: Mel Gorman <mgorman@suse.de>
To: Julia Lawall <julia.lawall@inria.fr>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] sched/fair: check for idle core
Date: Fri, 23 Oct 2020 11:05:51 +0100 [thread overview]
Message-ID: <20201023100551.GR32041@suse.de> (raw)
In-Reply-To: <alpine.DEB.2.22.394.2010231121200.2707@hadrien>
On Fri, Oct 23, 2020 at 11:21:50AM +0200, Julia Lawall wrote:
>
>
> On Fri, 23 Oct 2020, Mel Gorman wrote:
>
> > On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > > In the case of a thread wakeup, wake_affine determines whether a core
> > > will be chosen for the thread on the socket where the thread ran
> > > previously or on the socket of the waker. This is done primarily by
> > > comparing the load of the core where th thread ran previously (prev)
> > > and the load of the waker (this).
> > >
> > > commit 11f10e5420f6 ("sched/fair: Use load instead of runnable load
> > > in wakeup path") changed the load computation from the runnable load
> > > to the load average, where the latter includes the load of threads
> > > that have already blocked on the core.
> > >
> > > When a short-running daemon processes happens to run on prev, this
> > > change raised the situation that prev could appear to have a greater
> > > load than this, even when prev is actually idle. When prev and this
> > > are on the same socket, the idle prev is detected later, in
> > > select_idle_sibling. But if that does not hold, prev is completely
> > > ignored, causing the waking thread to move to the socket of the waker.
> > > In the case of N mostly active threads on N cores, this triggers other
> > > migrations and hurts performance.
> > >
> > > In contrast, before commit 11f10e5420f6, the load on an idle core
> > > was 0, and in the case of a non-idle waker core, the effect of
> > > wake_affine was to select prev as the target for searching for a core
> > > for the waking thread.
> > >
> > > To avoid unnecessary migrations, extend wake_affine_idle to check
> > > whether the core where the thread previously ran is currently idle,
> > > and if so simply return that core as the target.
> > > target
> > > [1] commit 11f10e5420f6ce ("sched/fair: Use load instead of runnable
> > > load in wakeup path")
> > >
> > > This particularly has an impact when using the ondemand power manager,
> > > where kworkers run every 0.004 seconds on all cores, increasing the
> > > likelihood that an idle core will be considered to have a load.
> > >
> > > The following numbers were obtained with the benchmarking tool
> > > hyperfine (https://github.com/sharkdp/hyperfine) on the NAS parallel
> > > benchmarks (https://www.nas.nasa.gov/publications/npb.html). The
> > > tests were run on an 80-core Intel(R) Xeon(R) CPU E7-8870 v4 @
> > > 2.10GHz. Active (intel_pstate) and passive (intel_cpufreq) power
> > > management were used. Times are in seconds. All experiments use all
> > > 160 hardware threads.
> > >
> > > v5.9/intel-pstate v5.9+patch/intel-pstate
> > > bt.C.c 24.725724+-0.962340 23.349608+-1.607214
> > > lu.C.x 29.105952+-4.804203 25.249052+-5.561617
> > > sp.C.x 31.220696+-1.831335 30.227760+-2.429792
> > > ua.C.x 26.606118+-1.767384 25.778367+-1.263850
> > >
> > > v5.9/ondemand v5.9+patch/ondemand
> > > bt.C.c 25.330360+-1.028316 23.544036+-1.020189
> > > lu.C.x 35.872659+-4.872090 23.719295+-3.883848
> > > sp.C.x 32.141310+-2.289541 29.125363+-0.872300
> > > ua.C.x 29.024597+-1.667049 25.728888+-1.539772
> > >
> > > On the smaller data sets (A and B) and on the other NAS benchmarks
> > > there is no impact on performance.
> > >
> > > This also has a major impact on the splash2x.volrend benchmark of the
> > > parsec benchmark suite that goes from 1m25 without this patch to 0m45,
> > > in active (intel_pstate) mode.
> > >
> > > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> > >
> >
> > In principal, I think the patch is ok after the recent discussion. I'm
> > holding off an ack until a battery of tests on loads with different
> > levels of utilisation and wakeup patterns makes its way through a test
> > grid. It's based on Linus's tree mid-merge window that includes what is
> > in the scheduler pipeline
>
> OK, if it doesn't work out, it would be interesting to know what goes
> badly.
>
Yep, if something goes wrong, I'll make the full logs available, details
on reproducing it etc.
--
Mel Gorman
SUSE Labs
next prev parent reply other threads:[~2020-10-23 10:05 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-22 13:15 [PATCH v2] sched/fair: check for idle core Julia Lawall
2020-10-23 8:40 ` Mel Gorman
2020-10-23 9:21 ` Julia Lawall
2020-10-23 10:05 ` Mel Gorman [this message]
2020-10-23 16:28 ` Chen Yu
2020-10-23 16:52 ` Julia Lawall
2020-10-27 9:19 ` Mel Gorman
2020-10-27 10:00 ` Julia Lawall
2021-01-24 20:38 ` Julia Lawall
2021-01-25 9:12 ` Mel Gorman
2021-01-25 9:20 ` Julia Lawall
2021-01-25 9:25 ` Vincent Guittot
2021-01-25 13:38 ` Julia Lawall
2021-02-06 17:20 ` Julia Lawall
2020-10-29 10:51 ` [tip: sched/core] sched/fair: Check for idle core in wake_affine tip-bot2 for Julia Lawall
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201023100551.GR32041@suse.de \
--to=mgorman@suse.de \
--cc=bristot@redhat.com \
--cc=bsegall@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=julia.lawall@inria.fr \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.