All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: Vineeth Remanan Pillai <vpillai@digitalocean.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
	"Tim Chen" <tim.c.chen@linux.intel.com>,
	"Nishanth Aravamudan" <naravamudan@digitalocean.com>,
	"Julien Desfossez" <jdesfossez@digitalocean.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Paul Turner" <pjt@google.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Linux List Kernel Mailing" <linux-kernel@vger.kernel.org>,
	"Subhra Mazumdar" <subhra.mazumdar@oracle.com>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Greg Kerr" <kerrnel@google.com>, "Phil Auld" <pauld@redhat.com>,
	"Aaron Lu" <aaron.lwe@gmail.com>,
	"Aubrey Li" <aubrey.intel@gmail.com>,
	"Valentin Schneider" <valentin.schneider@arm.com>,
	"Mel Gorman" <mgorman@techsingularity.net>,
	"Pawan Gupta" <pawan.kumar.gupta@linux.intel.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Vineeth Pillai" <vineethrp@gmail.com>,
	"Chen Yu" <yu.c.chen@intel.com>,
	"Christian Brauner" <christian.brauner@ubuntu.com>,
	"Aaron Lu" <aaron.lu@linux.alibaba.com>,
	"Paul E . McKenney" <paulmck@kernel.org>
Subject: Re: [RFC PATCH 06/16] sched: Add core wide task selection and scheduling.
Date: Mon, 6 Jul 2020 13:37:09 -0400	[thread overview]
Message-ID: <20200706173709.GA190133@google.com> (raw)
In-Reply-To: <CANaguZAJk=yCGiiSh1y1gYYh_9ZfO94VT4vNjYjR_SY=_0ao-A@mail.gmail.com>

Hi Vineeth,

On Mon, Jul 06, 2020 at 10:38:27AM -0400, Vineeth Remanan Pillai wrote:
> On Mon, Jul 6, 2020 at 10:09 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> >
> > > I am not sure if this can happen. If the other sibling sets core_pick, it
> > > will be under the core wide lock and it should set the core_sched_seq also
> > > before releasing the lock. So when this cpu tries, it would see the core_pick
> > > before resetting it. Is this the same case you were mentioning? Sorry if I
> > > misunderstood the case you mentioned..
> >
> > If you have a case where you have 3 siblings all trying to enter the schedule
> > loop. Call them A, B and C.
> >
> > A picks something for B in core_pick. Now C comes and resets B's core_pick
> > before running the mega-loop, hoping to select something for it shortly.
> > However, C then does an unconstrained pick and forgets to set B's pick to
> > something.
> >
> > I don't know if this can really happen - but this is why I added the warning
> > in the end of the patch. I think we should make the code more robust and
> > handle these kind of cases.
> >
> I don't think this can happen. Each of the sibling takes the core wide
> lock before calling into pick_next _task. So this should not happen.

So my patch is correct but the warnings I added were probably overkill.

About the warnings, Vineeth explained to me on IRC that the design was
intially done to set ->core_pick to NULL if nothing is being picked for a
sibling rq, and the fact that we don't increment that rq's core_sched_seq
means it would the rq it is being set for would not go read core_pick.

And that resetting ->core_pick should be ok, since a sibling will go select a
task for itself if its core_pick was NULL anyway.

The only requirement is that the selection code definitely select something
for the current CPU, or idle. NULL is not an option,

So I guess we can drop the additional warnings I added, I was likely too
paranoid.

> > Again, it is about making the code more robust. Why should not set
> > rq->core_pick when we pick something? As we discussed in the private
> > discussion - we should make the code robust and consistent. Correctness is
> > not enough, the code has to be robust and maintainable.
> >
> > I think in our private discussion, you agreed with me that there is no harm
> > in setting core_pick in this case.
> >
> I agreed there was no harm, because we wanted to use that in the last
> check after 'done' label. But now I see that adding that check after
> done label cause the WARN_ON to fire even in valid case. Firing the
> WARN_ON in valid case is not good. So, if that WARN_ON check can be
> removed, adding this is not necessary IMHO.

Makes sense.

> > > cpumask_copy(&select_mask, cpu_smt_mask(cpu));
> > > if (unlikely(cpumask_test_cpu(cpu, &select_mask))) {
> > >     cpumask_set_cpu(cpu, &select_mask);
> > >     need_sync = false;
> > > }
> >
> > Nah, more lines of code for no good no reason, plus another branch right? I'd
> > like to leave my one liner alone than adding 4 more lines :-)
> >
> Remember, this is the fast path. Every schedule() except for our sync
> IPI reaches here. And we are sure that smt_cpumask will not have cpu
> only on hotplug cases which is very rare. I feel adding more code to
> make it clear that this setting is not needed always and also optimizing for
> the fast path is what I was looking for.

It occurs to us that may we want to optimize this a bit more, because we have
to copy cpumask every schedule() with my patch which may be unnecessarily
expensive for large CPU systems.  I think we can do better -- probably by
unconditionally running the selection code on the current CPU without first
preparing an intermediate mask..

> > As discussed above, > 2 SMT case, we don't really know if the warning will
> > fire or not. I would rather keep the warning just in case for the future.
> >
> I think I was not clear last time. This WARN_ON will fire on valid cases
> if you have this check here. As I mentioned unconstrained pick, picks only
> for that cpu and not to any other siblings. This is by design. So for
> unconstrained pick, core_pick of all siblings will be NULL. We jump to done
> label on unconstrained pick and this for loop goes through all the siblings
> and finds that its core_pick is not set. Then thei WARN_ON will fire. I have
> reproduced this. We do not want it to fire as it is the correct logic not to
> set core_pick for unconstrained pick. Please let me know if this is not clear.

Agreed, I think my patch can be used as a starting point and we optimize it
further.

Me/Vineeth will continue to work on this and come up with a final patch, thanks!

 - Joel


  reply	other threads:[~2020-07-06 17:37 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-30 21:32 [RFC PATCH 00/16] Core scheduling v6 Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 01/16] sched: Wrap rq::lock access Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 02/16] sched: Introduce sched_class::pick_task() Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 03/16] sched: Core-wide rq->lock Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 04/16] sched/fair: Add a few assertions Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 05/16] sched: Basic tracking of matching tasks Vineeth Remanan Pillai
2020-07-21 14:02   ` [RFC PATCH 05/16] sched: Basic tracking of matching tasks(Internet mail) benbjiang(蒋彪)
2020-06-30 21:32 ` [RFC PATCH 06/16] sched: Add core wide task selection and scheduling Vineeth Remanan Pillai
2020-07-01 23:28   ` Joel Fernandes
2020-07-02  0:54     ` Tim Chen
2020-07-02 12:57       ` Joel Fernandes
2020-07-02 13:23         ` Joel Fernandes
2020-07-05 23:44         ` Tim Chen
2020-07-03 20:21     ` Vineeth Remanan Pillai
2020-07-06 14:09       ` Joel Fernandes
2020-07-06 14:38         ` Vineeth Remanan Pillai
2020-07-06 17:37           ` Joel Fernandes [this message]
2020-06-30 21:32 ` [RFC PATCH 07/16] sched/fair: Fix forced idle sibling starvation corner case Vineeth Remanan Pillai
2020-07-21  7:35   ` [RFC PATCH 07/16] sched/fair: Fix forced idle sibling starvation corner case(Internet mail) benbjiang(蒋彪)
2020-07-22  7:20   ` benbjiang(蒋彪)
2020-06-30 21:32 ` [RFC PATCH 08/16] sched/fair: wrapper for cfs_rq->min_vruntime Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 09/16] sched/fair: core wide cfs task priority comparison Vineeth Remanan Pillai
2020-07-22  0:23   ` [RFC PATCH 09/16] sched/fair: core wide cfs task priority comparison(Internet mail) benbjiang(蒋彪)
2020-07-24  7:14     ` Aaron Lu
2020-07-24 12:08       ` Jiang Biao
2020-06-30 21:32 ` [RFC PATCH 10/16] sched: Trivial forced-newidle balancer Vineeth Remanan Pillai
2020-07-20  4:06   ` [RFC PATCH 10/16] sched: Trivial forced-newidle balancer(Internet mail) benbjiang(蒋彪)
2020-07-20  6:06     ` Li, Aubrey
     [not found]       ` <8082F052-2F52-42D3-B396-18A35A94F26F@tencent.com>
2020-07-20  8:03         ` Li, Aubrey
2020-07-20  8:22           ` benbjiang(蒋彪)
2020-07-20 14:34   ` benbjiang(蒋彪)
2020-06-30 21:32 ` [RFC PATCH 11/16] sched: migration changes for core scheduling Vineeth Remanan Pillai
2020-07-22  8:54   ` [RFC PATCH 11/16] sched: migration changes for core scheduling(Internet mail) benbjiang(蒋彪)
2020-07-22 12:13     ` Li, Aubrey
2020-07-22 14:32       ` benbjiang(蒋彪)
2020-07-23  1:57         ` Li, Aubrey
2020-07-23  2:42           ` benbjiang(蒋彪)
2020-07-23  3:35             ` Li, Aubrey
2020-07-23  4:23               ` benbjiang(蒋彪)
2020-07-23  5:39                 ` Li, Aubrey
2020-07-23  7:47                   ` benbjiang(蒋彪)
2020-07-23  8:06                     ` Li, Aubrey
2020-07-23  8:28                       ` benbjiang(蒋彪)
2020-07-23 23:43                         ` Aubrey Li
2020-07-24  1:26                           ` benbjiang(蒋彪)
2020-07-24  2:05                             ` Li, Aubrey
2020-07-24  2:29                               ` benbjiang(蒋彪)
2020-06-30 21:32 ` [RFC PATCH 12/16] sched: cgroup tagging interface for core scheduling Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 13/16] sched: Fix pick_next_task() race condition in " Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 14/16] irq: Add support for core-wide protection of IRQ and softirq Vineeth Remanan Pillai
2020-07-10 12:19   ` Li, Aubrey
2020-07-10 13:21     ` Joel Fernandes
2020-07-13  2:23       ` Li, Aubrey
2020-07-13 15:58         ` Joel Fernandes
2020-07-10 13:36     ` Vineeth Remanan Pillai
2020-07-11  1:33       ` Aubrey Li
2020-07-17 23:37     ` Thomas Gleixner
2020-07-18 17:05       ` Joel Fernandes
2020-07-17 23:36   ` Thomas Gleixner
2020-07-20  3:53     ` Joel Fernandes
2020-07-20  8:20       ` Thomas Gleixner
2020-07-20 11:09       ` Vineeth Pillai
2020-06-30 21:32 ` [RFC PATCH 15/16] Documentation: Add documentation on core scheduling Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 16/16] sched: Debug bits Vineeth Remanan Pillai
2020-07-31 16:41 ` [RFC PATCH 00/16] Core scheduling v6 Vineeth Pillai
2020-08-03  8:23 ` Li, Aubrey
2020-08-03 16:53   ` Joel Fernandes
2020-08-05  3:57     ` Li, Aubrey
2020-08-05  6:16       ` [RFC PATCH 00/16] Core scheduling v6(Internet mail) benbjiang(蒋彪)
2020-08-09 16:44       ` [RFC PATCH 00/16] Core scheduling v6 Joel Fernandes
2020-08-12  2:01         ` Li, Aubrey
2020-08-12 23:08           ` Joel Fernandes
2020-08-13  4:28             ` Li, Aubrey
2020-08-14  0:26               ` [RFC PATCH 00/16] Core scheduling v6(Internet mail) benbjiang(蒋彪)
2020-08-14  1:36                 ` Li, Aubrey
2020-08-14  4:04                   ` benbjiang(蒋彪)
2020-08-14  5:18                     ` Li, Aubrey
2020-08-14  7:54                       ` benbjiang(蒋彪)
2020-08-20 22:37               ` [RFC PATCH 00/16] Core scheduling v6 Joel Fernandes
2020-08-27  0:30 ` Alexander Graf
2020-08-27  1:20   ` Vineeth Pillai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200706173709.GA190133@google.com \
    --to=joel@joelfernandes.org \
    --cc=aaron.lu@linux.alibaba.com \
    --cc=aaron.lwe@gmail.com \
    --cc=aubrey.intel@gmail.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=fweisbec@gmail.com \
    --cc=jdesfossez@digitalocean.com \
    --cc=keescook@chromium.org \
    --cc=kerrnel@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@kernel.org \
    --cc=naravamudan@digitalocean.com \
    --cc=pauld@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=pawan.kumar.gupta@linux.intel.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=subhra.mazumdar@oracle.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=valentin.schneider@arm.com \
    --cc=vineethrp@gmail.com \
    --cc=vpillai@digitalocean.com \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.