From: Phil Auld <pauld@redhat.com>
To: Aubrey Li <aubrey.intel@gmail.com>
Cc: "Vineeth Remanan Pillai" <vpillai@digitalocean.com>,
"Nishanth Aravamudan" <naravamudan@digitalocean.com>,
"Julien Desfossez" <jdesfossez@digitalocean.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Tim Chen" <tim.c.chen@linux.intel.com>,
"Ingo Molnar" <mingo@kernel.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Paul Turner" <pjt@google.com>,
"Linus Torvalds" <torvalds@linux-foundation.org>,
"Linux List Kernel Mailing" <linux-kernel@vger.kernel.org>,
"Subhra Mazumdar" <subhra.mazumdar@oracle.com>,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Kees Cook" <keescook@chromium.org>,
"Greg Kerr" <kerrnel@google.com>,
"Aaron Lu" <aaron.lwe@gmail.com>,
"Valentin Schneider" <valentin.schneider@arm.com>,
"Mel Gorman" <mgorman@techsingularity.net>,
"Pawan Gupta" <pawan.kumar.gupta@linux.intel.com>,
"Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [RFC PATCH v2 13/17] sched: Add core wide task selection and scheduling.
Date: Mon, 20 May 2019 09:04:54 -0400
Message-ID: <20190520130454.GA677@pauld.bos.csb>
In-Reply-To: <CAERHkrtZo0BQg_u9ZPNY_Rk2JY4YT8d5NDRKFQMWeYyAviVShA@mail.gmail.com>
On Sat, May 18, 2019 at 11:37:56PM +0800 Aubrey Li wrote:
> On Wed, Apr 24, 2019 at 12:18 AM Vineeth Remanan Pillai
> <vpillai@digitalocean.com> wrote:
> >
> > From: Peter Zijlstra (Intel) <peterz@infradead.org>
> >
> > Instead of only selecting a local task, select a task for all SMT
> > siblings for every reschedule on the core (irrespective which logical
> > CPU does the reschedule).
> >
> > NOTE: there is still potential for siblings rivalry.
> > NOTE: this is far too complicated; but thus far I've failed to
> > simplify it further.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> > kernel/sched/core.c | 222 ++++++++++++++++++++++++++++++++++++++++++-
> > kernel/sched/sched.h | 5 +-
> > 2 files changed, 224 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index e5bdc1c4d8d7..9e6e90c6f9b9 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3574,7 +3574,7 @@ static inline void schedule_debug(struct task_struct *prev)
> > * Pick up the highest-prio task:
> > */
> > static inline struct task_struct *
> > -pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> > +__pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> > {
> > const struct sched_class *class;
> > struct task_struct *p;
> > @@ -3619,6 +3619,220 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> > BUG();
> > }
> >
> > +#ifdef CONFIG_SCHED_CORE
> > +
> > +static inline bool cookie_match(struct task_struct *a, struct task_struct *b)
> > +{
> > + if (is_idle_task(a) || is_idle_task(b))
> > + return true;
> > +
> > + return a->core_cookie == b->core_cookie;
> > +}
> > +
> > +// XXX fairness/fwd progress conditions
> > +static struct task_struct *
> > +pick_task(struct rq *rq, const struct sched_class *class, struct task_struct *max)
> > +{
> > + struct task_struct *class_pick, *cookie_pick;
> > + unsigned long cookie = 0UL;
> > +
> > + /*
> > + * We must not rely on rq->core->core_cookie here, because we fail to reset
> > + * rq->core->core_cookie on new picks, such that we can detect if we need
> > + * to do single vs multi rq task selection.
> > + */
> > +
> > + if (max && max->core_cookie) {
> > + WARN_ON_ONCE(rq->core->core_cookie != max->core_cookie);
> > + cookie = max->core_cookie;
> > + }
> > +
> > + class_pick = class->pick_task(rq);
> > + if (!cookie)
> > + return class_pick;
> > +
> > + cookie_pick = sched_core_find(rq, cookie);
> > + if (!class_pick)
> > + return cookie_pick;
> > +
> > + /*
> > + * If class > max && class > cookie, it is the highest priority task on
> > + * the core (so far) and it must be selected, otherwise we must go with
> > + * the cookie pick in order to satisfy the constraint.
> > + */
> > + if (cpu_prio_less(cookie_pick, class_pick) && core_prio_less(max, class_pick))
> > + return class_pick;
> > +
> > + return cookie_pick;
> > +}
> > +
> > +static struct task_struct *
> > +pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> > +{
> > + struct task_struct *next, *max = NULL;
> > + const struct sched_class *class;
> > + const struct cpumask *smt_mask;
> > + int i, j, cpu;
> > +
> > + if (!sched_core_enabled(rq))
> > + return __pick_next_task(rq, prev, rf);
> > +
> > + /*
> > + * If there were no {en,de}queues since we picked (IOW, the task
> > + * pointers are all still valid), and we haven't scheduled the last
> > + * pick yet, do so now.
> > + */
> > + if (rq->core->core_pick_seq == rq->core->core_task_seq &&
> > + rq->core->core_pick_seq != rq->core_sched_seq) {
> > + WRITE_ONCE(rq->core_sched_seq, rq->core->core_pick_seq);
> > +
> > + next = rq->core_pick;
> > + if (next != prev) {
> > + put_prev_task(rq, prev);
> > + set_next_task(rq, next);
> > + }
> > + return next;
> > + }
> > +
>
> The following patch improved my test cases.
> Welcome any comments.
>
This is certainly better than violating the point of the core scheduler :)

If I'm understanding this right, what happens in this case is that, instead
of using the idle task selected by the sibling, we do the core scheduling
again. That may start with a newidle_balance(), which might bring over
something to run that matches what we want to put on the sibling. If that
works, then I can see this helping.
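
To make the condition under discussion concrete, here is a minimal user-space
sketch (not kernel code) of the fast-path test with and without the proposed
idle check. Only the three sequence-counter names come from the quoted patch;
the structs toy_core and toy_rq, the helper reuse_last_pick(), and the
pick_is_idle flag standing in for is_idle_task(rq->core_pick) are hypothetical
and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shared per-core state (rq->core). */
struct toy_core {
	unsigned int core_task_seq;	/* assumed to change on {en,de}queue */
	unsigned int core_pick_seq;	/* sequence of the last core-wide pick */
};

/* Hypothetical stand-in for one sibling's runqueue (rq). */
struct toy_rq {
	struct toy_core *core;
	unsigned int core_sched_seq;	/* last pick this CPU already scheduled */
	bool pick_is_idle;		/* models is_idle_task(rq->core_pick) */
};

/*
 * Return true when the sibling would take the fast path and simply run
 * the task already chosen for it.  With skip_idle set, an idle pick is
 * abandoned and the caller falls through to a fresh core-wide selection,
 * which is the behaviour the proposed change aims for.
 */
static bool reuse_last_pick(const struct toy_rq *rq, bool skip_idle)
{
	bool still_valid = rq->core->core_pick_seq == rq->core->core_task_seq;
	bool not_yet_run = rq->core->core_pick_seq != rq->core_sched_seq;

	if (!still_valid || !not_yet_run)
		return false;
	if (skip_idle && rq->pick_is_idle)
		return false;
	return true;
}
```

With a valid, not-yet-scheduled pick that happens to be idle, the original
condition (skip_idle false) reuses it, while the patched one (skip_idle true)
does not and goes on to select again. This sketch says nothing about what
that second selection returns.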

But I'd be a little concerned that we could end up thrashing. Once we do core
scheduling again here, we force the sibling to resched, and if that produces a
different result which "helped" it pick idle, we go around again.

I think some extra idle time on siblings is inherent in the concept of core
scheduling (barring a perfectly aligned set of jobs).

Cheers,
Phil
> Thanks,
> -Aubrey
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 3e3162f..86031f4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3685,10 +3685,12 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> /*
> * If there were no {en,de}queues since we picked (IOW, the task
> * pointers are all still valid), and we haven't scheduled the last
> - * pick yet, do so now.
> + * pick yet, do so now. If the last pick is idle task, we abandon
> + * last pick and try to pick up task this time.
> */
> if (rq->core->core_pick_seq == rq->core->core_task_seq &&
> - rq->core->core_pick_seq != rq->core_sched_seq) {
> + rq->core->core_pick_seq != rq->core_sched_seq &&
> + !is_idle_task(rq->core_pick)) {
> WRITE_ONCE(rq->core_sched_seq, rq->core->core_pick_seq);
>
> next = rq->core_pick;