From: Mel Gorman <mgorman@techsingularity.net>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
"Li, Aubrey" <aubrey.li@linux.intel.com>,
Ingo Molnar <mingo@redhat.com>,
Juri Lelli <juri.lelli@redhat.com>,
Valentin Schneider <valentin.schneider@arm.com>,
Qais Yousef <qais.yousef@arm.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
Mel Gorman <mgorman@suse.de>, Jiang Biao <benbjiang@gmail.com>
Subject: Re: [RFC PATCH v7] sched/fair: select idle cpu from idle cpumask for task wakeup
Date: Mon, 14 Dec 2020 12:42:17 +0000 [thread overview]
Message-ID: <20201214124217.GA3371@techsingularity.net> (raw)
In-Reply-To: <CAKfTPtB1sRmZAf5MEZOquHg3z7TTriAbqpwVup5xwu42DN2yCA@mail.gmail.com>
On Mon, Dec 14, 2020 at 10:18:16AM +0100, Vincent Guittot wrote:
> On Fri, 11 Dec 2020 at 18:45, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Dec 10, 2020 at 12:58:33PM +0000, Mel Gorman wrote:
> > > The prequisite patch to make that approach work was rejected though
> > > as on its own, it's not very helpful and Vincent didn't like that the
> > > load_balance_mask was abused to make it effective.
> >
> > So last time I poked at all this, I found that using more masks was a
> > performance issue as well due to added cache misses.
> >
> > Anyway, I finally found some time to look at all this, and while reading
> > over the whole SiS crud to refresh the brain came up with the below...
> >
> > It's still naf, but at least it gets rid of a bunch of duplicate
> > scanning and LoC decreases, so it should be awesome. Ofcourse, as
> > always, benchmarks will totally ruin everything, we'll see, I started
> > some.
> >
> > It goes on top of Mel's first two patches (which is about as far as I
> > got)...
>
> We have several different things that Aubrey and Mel patches are
> trying to achieve:
>
> Aubrey wants to avoid scanning busy cpus
> - patch works well on busy system: I see a significant benefit with
> hackbench on a lot of group on my 2 nodes * 28 cores * 4 smt
> hackbench -l 2000 -g 128
> tip 3.334
> w/ patch 2.978 (+12%)
>
It's still the case that Aubrey's work does not conflict with what Peter
is doing. All that changes is what mask is applied.
Similarly, the p->recent_used_cpu changes I made initially (but got
rejected) reduces scanning. I intend to revisit that to use recent_used_cpu
as a search target because one side-effect of the patch was the siblings
can be used prematurely.
> - Aubey also tried to not scan the cpus which are idle for a short
> duration (when a tick not stopped) but there are problems because not
> stopping a tick doesn't mean a short idle. In fact , something similar
> to find_idlest_group_cpu() should be done in this case but then it's
> no more a fast path search
>
The crowd most likely to complain about this is Facebook as they have
workloads that prefer deep searches to find idle CPUs.
> Mel wants to minimize looping over the cpus
> -patch 4 is an obvious win on light loaded system because it avoids
> looping twice the cpus mask when some cpus are idle but no core
Peter's patch replaces that entirely which I'm fine with.
> -But patch 3 generates perf régression
> hackbench -l 2000 -g 1
> tip 12.293
> /w all Mel's patches 15.163 -14%
> /w Aubrey + Mel patches minus patch 3 : 12.788 +3.8% But I think that
> Aubreay's patch doesn't help here. Test without aubrey's patch are
> running
>
> -he also tried to used load_balance_mask to do something similar to the below
>
Peter's approach removes load_balance_mask abuse so I think it's a
better approach overall to scanning LLC domains in a single pass. It
needs refinement and a lot of testing but I think it's promising.
> > -static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
> > +static int __select_idle_core(int core, struct cpumask *cpus, int *idle_cpu)
> > {
> > - struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> > - int core, cpu;
> > -
> > - if (!static_branch_likely(&sched_smt_present))
> > - return -1;
> > -
> > - if (!test_idle_cores(target, false))
> > - return -1;
> > -
> > - cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> > -
> > - for_each_cpu_wrap(core, cpus, target) {
> > - bool idle = true;
> > + bool idle = true;
> > + int cpu;
> >
> > - for_each_cpu(cpu, cpu_smt_mask(core)) {
> > - if (!available_idle_cpu(cpu)) {
> > - idle = false;
> > - break;
> > - }
> > + for_each_cpu(cpu, cpu_smt_mask(core)) {
> > + if (!available_idle_cpu(cpu)) {
> > + idle = false;
> > + continue;
>
> By not breaking on the first not idle cpu of the core, you will
> largely increase the number of loops. On my system, it increases 4
> times from 28 up tu 112
>
But breaking early will mean that SMT siblings are used prematurely.
However, if SIS_PROP was partially enforced for the idle core scan, it
could limit the search for an idle core once an idle candidate was
discovered.
--
Mel Gorman
SUSE Labs
next prev parent reply other threads:[~2020-12-14 12:43 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-09 6:24 [RFC PATCH v7] sched/fair: select idle cpu from idle cpumask for task wakeup Aubrey Li
2020-12-09 8:15 ` Vincent Guittot
2020-12-09 10:58 ` Li, Aubrey
2020-12-09 13:09 ` Vincent Guittot
2020-12-09 14:53 ` Li, Aubrey
2020-12-09 14:36 ` Mel Gorman
2020-12-10 8:23 ` Li, Aubrey
2020-12-10 11:34 ` Mel Gorman
2020-12-10 12:21 ` Li, Aubrey
2020-12-10 12:58 ` Mel Gorman
2020-12-11 17:44 ` Peter Zijlstra
2020-12-11 20:43 ` Mel Gorman
2020-12-11 22:19 ` Peter Zijlstra
2020-12-11 22:50 ` Mel Gorman
2020-12-14 8:11 ` Vincent Guittot
2020-12-14 9:31 ` Peter Zijlstra
2020-12-14 12:36 ` Mel Gorman
2020-12-14 15:01 ` Peter Zijlstra
2020-12-14 9:32 ` Peter Zijlstra
2020-12-14 9:18 ` Vincent Guittot
2020-12-14 12:42 ` Mel Gorman [this message]
2020-12-14 7:53 ` Li, Aubrey
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201214124217.GA3371@techsingularity.net \
--to=mgorman@techsingularity.net \
--cc=aubrey.li@linux.intel.com \
--cc=benbjiang@gmail.com \
--cc=bsegall@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=qais.yousef@arm.com \
--cc=rostedt@goodmis.org \
--cc=tim.c.chen@linux.intel.com \
--cc=valentin.schneider@arm.com \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.