From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Paul Turner" <pjt@google.com>,
"Tim Chen" <tim.c.chen@linux.intel.com>,
"Linux List Kernel Mailing" <linux-kernel@vger.kernel.org>,
subhra.mazumdar@oracle.com,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Kees Cook" <keescook@chromium.org>,
kerrnel@google.com
Subject: Re: [RFC][PATCH 00/16] sched: Core scheduling
Date: Tue, 19 Feb 2019 16:15:32 +0100
Message-ID: <20190219151532.GA40581@gmail.com>
In-Reply-To: <CAHk-=whVrNomWXRmCjnBJkosiwiGXz5pYb63aXy=nSPGjvc-1g@mail.gmail.com>
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Mon, Feb 18, 2019 at 12:40 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > If there were close to no VMEXITs, it beat smt=off, if there were lots
> > of VMEXITs it was far far worse. Supposedly hosting people try their
> > very bestest to have no VMEXITs so it mostly works for them (with the
> > obvious exception of single VCPU guests).
> >
> > It's just that people have been bugging me for this crap; and I figure
> > I'd post it now that it's not exploding anymore and let others have at.
>
> The patches didn't look disgusting to me, but I admittedly just
> scanned through them quickly.
>
> Are there downsides (maintenance and/or performance) when core
> scheduling _isn't_ enabled? I guess if it's not a maintenance or
> performance nightmare when off, it's ok to just give people the
> option.
So this bit is the main straight-line performance impact when the
CONFIG_SCHED_CORE Kconfig feature is present (which I expect distros to
enable broadly):
+static inline bool sched_core_enabled(struct rq *rq)
+{
+	return static_branch_unlikely(&__sched_core_enabled) && rq->core_enabled;
+}
 static inline raw_spinlock_t *rq_lockp(struct rq *rq)
 {
+	if (sched_core_enabled(rq))
+		return &rq->core->__lock;
+
 	return &rq->__lock;
 }
This should, at least in principle, keep the runtime overhead down to a few
more NOPs and a slightly bigger instruction cache footprint, modulo compiler
shenanigans.
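For anyone without the tree at hand, the shape of that check can be sketched in plain userspace C. This is only an illustration under assumptions: an ordinary global bool (sched_core_on, a made-up name) stands in for the __sched_core_enabled static key, unlikely() is defined locally via __builtin_expect, and the lock is a dummy struct rather than raw_spinlock_t. The kernel's static branch is patched at runtime, which a plain variable test does not reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's unlikely() hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Dummy lock type; the kernel uses raw_spinlock_t here. */
struct fake_lock { int dummy; };

struct rq {
	struct fake_lock lock;  /* per-runqueue lock (rq->__lock in the patch) */
	struct rq *core;        /* runqueue whose lock the whole core shares */
	bool core_enabled;      /* per-rq switch, only read if the global one is on */
};

/* Hypothetical global switch standing in for the static key. */
static bool sched_core_on;

static inline bool sched_core_enabled(const struct rq *rq)
{
	/* Short-circuit: rq->core_enabled is not loaded while the switch is off. */
	return unlikely(sched_core_on) && rq->core_enabled;
}

static inline struct fake_lock *rq_lockp(struct rq *rq)
{
	if (sched_core_enabled(rq)) {
		assert(rq->core != NULL);
		return &rq->core->lock; /* siblings serialize on one core-wide lock */
	}

	return &rq->lock;               /* default: private per-runqueue lock */
}
```

With the global switch off, every caller gets its own rq's lock and the per-rq flag is never consulted; only when both conditions hold does the lookup redirect to the shared core lock.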
Here's the code generation impact on x86-64 defconfig:
   text    data     bss     dec     hex filename
    228      48       0     276     114 sched.core.n/cpufreq.o (ex sched.core.n/built-in.a)
    228      48       0     276     114 sched.core.y/cpufreq.o (ex sched.core.y/built-in.a)
   4438      96       0    4534    11b6 sched.core.n/completion.o (ex sched.core.n/built-in.a)
   4438      96       0    4534    11b6 sched.core.y/completion.o (ex sched.core.y/built-in.a)
   2167    2428       0    4595    11f3 sched.core.n/cpuacct.o (ex sched.core.n/built-in.a)
   2167    2428       0    4595    11f3 sched.core.y/cpuacct.o (ex sched.core.y/built-in.a)
  61099   22114     488   83701   146f5 sched.core.n/core.o (ex sched.core.n/built-in.a)
  70541   25370     508   96419   178a3 sched.core.y/core.o (ex sched.core.y/built-in.a)
   3262    6272       0    9534    253e sched.core.n/wait_bit.o (ex sched.core.n/built-in.a)
   3262    6272       0    9534    253e sched.core.y/wait_bit.o (ex sched.core.y/built-in.a)
  12235     341      96   12672    3180 sched.core.n/rt.o (ex sched.core.n/built-in.a)
  13073     917      96   14086    3706 sched.core.y/rt.o (ex sched.core.y/built-in.a)
  10293     477    1928   12698    319a sched.core.n/topology.o (ex sched.core.n/built-in.a)
  10363     509    1928   12800    3200 sched.core.y/topology.o (ex sched.core.y/built-in.a)
    886      24       0     910     38e sched.core.n/cpupri.o (ex sched.core.n/built-in.a)
    886      24       0     910     38e sched.core.y/cpupri.o (ex sched.core.y/built-in.a)
   1061      64       0    1125     465 sched.core.n/stop_task.o (ex sched.core.n/built-in.a)
   1077     128       0    1205     4b5 sched.core.y/stop_task.o (ex sched.core.y/built-in.a)
  18443     365      24   18832    4990 sched.core.n/deadline.o (ex sched.core.n/built-in.a)
  20019    2189      24   22232    56d8 sched.core.y/deadline.o (ex sched.core.y/built-in.a)
   1123       8      64    1195     4ab sched.core.n/loadavg.o (ex sched.core.n/built-in.a)
   1123       8      64    1195     4ab sched.core.y/loadavg.o (ex sched.core.y/built-in.a)
   1323       8       0    1331     533 sched.core.n/stats.o (ex sched.core.n/built-in.a)
   1323       8       0    1331     533 sched.core.y/stats.o (ex sched.core.y/built-in.a)
   1282     164      32    1478     5c6 sched.core.n/isolation.o (ex sched.core.n/built-in.a)
   1282     164      32    1478     5c6 sched.core.y/isolation.o (ex sched.core.y/built-in.a)
   1564      36       0    1600     640 sched.core.n/cpudeadline.o (ex sched.core.n/built-in.a)
   1564      36       0    1600     640 sched.core.y/cpudeadline.o (ex sched.core.y/built-in.a)
   1640      56       0    1696     6a0 sched.core.n/swait.o (ex sched.core.n/built-in.a)
   1640      56       0    1696     6a0 sched.core.y/swait.o (ex sched.core.y/built-in.a)
   1859     244      32    2135     857 sched.core.n/clock.o (ex sched.core.n/built-in.a)
   1859     244      32    2135     857 sched.core.y/clock.o (ex sched.core.y/built-in.a)
   2339       8       0    2347     92b sched.core.n/cputime.o (ex sched.core.n/built-in.a)
   2339       8       0    2347     92b sched.core.y/cputime.o (ex sched.core.y/built-in.a)
   3014      32       0    3046     be6 sched.core.n/membarrier.o (ex sched.core.n/built-in.a)
   3014      32       0    3046     be6 sched.core.y/membarrier.o (ex sched.core.y/built-in.a)
  50027     964      96   51087    c78f sched.core.n/fair.o (ex sched.core.n/built-in.a)
  51537    2484      96   54117    d365 sched.core.y/fair.o (ex sched.core.y/built-in.a)
   3192     220       0    3412     d54 sched.core.n/idle.o (ex sched.core.n/built-in.a)
   3276     252       0    3528     dc8 sched.core.y/idle.o (ex sched.core.y/built-in.a)
   3633       0       0    3633     e31 sched.core.n/pelt.o (ex sched.core.n/built-in.a)
   3633       0       0    3633     e31 sched.core.y/pelt.o (ex sched.core.y/built-in.a)
   3794     160       0    3954     f72 sched.core.n/wait.o (ex sched.core.n/built-in.a)
   3794     160       0    3954     f72 sched.core.y/wait.o (ex sched.core.y/built-in.a)
I'd say this one is representative:
   text    data     bss     dec     hex filename
  12235     341      96   12672    3180 sched.core.n/rt.o (ex sched.core.n/built-in.a)
  13073     917      96   14086    3706 sched.core.y/rt.o (ex sched.core.y/built-in.a)
where the ~7% text bloat is primarily due to the higher rq-lock inlining
overhead, I believe.
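The percentage is just the relative growth of the text column. A small helper, sketched here with hypothetical names (struct obj_size, text_growth_pct), makes it easy to redo the arithmetic for any pair of rows; the numbers are copied from the size output above.

```c
#include <assert.h>

/* One row pair from size(1): text bytes without and with CONFIG_SCHED_CORE. */
struct obj_size {
	const char *name;
	unsigned long text_n;   /* sched.core.n build */
	unsigned long text_y;   /* sched.core.y build */
};

/* Relative text growth in percent; negative if the object shrank. */
static double text_growth_pct(const struct obj_size *o)
{
	assert(o->text_n > 0);  /* avoid dividing by zero on an empty object */
	return 100.0 * ((double)o->text_y - (double)o->text_n) / (double)o->text_n;
}

/* Rows that actually changed, copied from the table above. */
static const struct obj_size sched_objs[] = {
	{ "core.o",     61099, 70541 },
	{ "rt.o",       12235, 13073 },
	{ "deadline.o", 18443, 20019 },
	{ "fair.o",     50027, 51537 },
};
```

For rt.o that is 838 extra bytes of text, about 6.8%; core.o grows by roughly 15% and fair.o by about 3%, so rt.o sits somewhere in the middle.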
This is roughly what you'd expect from a change wrapping all 350+ inlined
uses of rq->lock, i.e. it might make sense to uninline the wrapper.
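Uninlining would mean emitting the check once, out of line, and paying a call at each site instead. A rough userspace sketch, assuming GCC or Clang for the noinline attribute, and using stand-in types plus a made-up sched_core_on flag rather than the real static key:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_lock { int dummy; };        /* stand-in for raw_spinlock_t */

struct rq {
	struct fake_lock lock;          /* rq->__lock in the patch */
	struct rq *core;
	bool core_enabled;
};

bool sched_core_on;                     /* hypothetical global switch */

/*
 * Out-of-line variant: the branch is emitted once here, and each of the
 * many callers shrinks to a plain function call.
 */
__attribute__((noinline))
struct fake_lock *rq_lockp(struct rq *rq)
{
	if (sched_core_on && rq->core_enabled) {
		assert(rq->core != NULL);
		return &rq->core->lock;
	}

	return &rq->lock;
}
```

The trade-off is a call and return on every lock operation in exchange for a smaller text footprint; whether that wins would have to be measured.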
In terms of long term maintenance overhead, ignoring the overhead of the
core-scheduling feature itself, the rq-lock wrappery is the biggest
ugliness, the rest is mostly isolated.
So if this actually *works* and improves the performance of some real
VMEXIT-poor SMT workloads and allows the enabling of HyperThreading with
untrusted VMs without inviting thousands of guest roots then I'm
cautiously in support of it.
> That all assumes that it works at all for the people who are clamoring
> for this feature, but I guess they can run some loads on it eventually.
> It's a holiday in the US right now ("Presidents' Day"), but maybe we
> can get some numbers this week?
Such numbers would be *very* helpful indeed.
Thanks,
Ingo