From: Pavan Kondeti <pkondeti@codeaurora.org>
To: Qais Yousef <qais.yousef@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Steven Rostedt <rostedt@goodmis.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 5/6] sched/rt: Better manage pushing unfit tasks on wakeup
Date: Tue, 25 Feb 2020 09:25:05 +0530
Message-ID: <20200225035505.GI28029@codeaurora.org>
In-Reply-To: <20200224174138.n6pmoeffqg7eqiy2@e107158-lin.cambridge.arm.com>
On Mon, Feb 24, 2020 at 05:41:39PM +0000, Qais Yousef wrote:
> On 02/24/20 21:34, Pavan Kondeti wrote:
> > Hi Qais,
> >
> > On Mon, Feb 24, 2020 at 5:42 PM Qais Yousef <qais.yousef@arm.com> wrote:
> > [...]
> > > We could do, temporarily, to get these fixes into 5.6. But I do think
> > > select_task_rq_rt() doesn't do a good enough job into pushing unfit tasks to
> > > the right CPUs.
> > >
> > > I don't understand the reasons behind your objection. It seems you think that
> > > select_task_rq_rt() should be enough, but not AFAICS. Can you be a bit more
> > > detailed please?
> > >
> > > FWIW, here's a screenshot of what I see
> > >
> > > https://imgur.com/a/peV27nE
> > >
> > > After the first activation, select_task_rq_rt() fails to find the right CPU
> > > (due to the same move all tasks to the cpumask_fist()) - but when the task
> > > wakes up on 4, the logic I put causes it to migrate to CPU2, which is the 2nd
> > > big core. CPU1 and CPU2 are the big cores on Juno.
> > >
> > > Now maybe we should fix select_task_rq_rt() to better balance tasks, but not
> > > sure how easy is that.
> > >
> >
> > Thanks for the trace. Now things are clear to me. Two RT tasks woke up
> > simultaneously and the first task got its previous CPU i.e CPU#1. The next task
> > goes through find_lowest_rq() and got the same CPU#1. Since this task priority
> > is not more than the just queued task (already queued on CPU#1), it is sent
> > to its previous CPU i.e CPU#4 in your case.
> >
> > From task_woken_rt() path, CPU#4 attempts push_rt_tasks(). CPU#4 is not
> > overloaded, but we have rt_task_fits_capacity() check which forces the push.
> > Since the CPU is not overloaded, your has_unfit_tasks() comes to rescue and
> > push the task. Since the task has not scheduled in yet, it is eligible for
> > push. You added checks to skip resched_curr() in push_rt_tasks() otherwise the
> > push won't happen.
>
> Nice summary, that's exactly what it is :)
>
> > Finally, I understood your patch. Obviously this is not clear to me before.
> > I am not sure if this patch is the right approach to solve this race. I will
> > think a bit more.
>
> I haven't been staring at this code for as long as you, but since we have
> logic at wakeup to do a push, I think we need something here anyway for unfit
> tasks.
>
> Fixing select_task_rq_rt() to better balance tasks will help a lot in general,
> but if that was enough already then why do we need to consider a push at the
> wakeup at all then?
>
> AFAIU, in SMP the whole push-pull mechanism is racy and we introduce redundancy
> at taking the decision on various points to ensure we minimize this racy nature
> of SMP systems. Anything could have happened between the time we called
> select_task_rq_rt() and the wakeup, so we double check again before we finally
> go and run. That's how I interpret it.
>
> I am open to hear about other alternatives first anyway. Your help has been
> much appreciated so far.
>

The search inside find_lowest_rq() happens without any locks, so I believe
races like this are expected. In fact, there is a comment in
select_task_rq_rt() saying "This test is optimistic, if we get it wrong the
load-balancer will have to sort it out". However, the push logic as of today
works only for the overloaded case. In that sense, your patch fixes this race
for b.L systems. At the same time, tracking unfit tasks just to fix this race
seems like too much to me. I will leave it to Steve and others to decide.

I thought of suggesting that we remove the below check from select_task_rq_rt()

	p->prio < cpu_rq(target)->rt.highest_prio.curr

which would then make the target CPU overloaded, and the push logic would
spread the tasks. That works for a b.L system too. However, there seems to
be a very good reason for having this check; see

https://lore.kernel.org/patchwork/patch/539137/

The fact that a CPU is part of lowest_mask but is running a higher prio RT
task means there is a race. Should we retry one more time to see if we find
another CPU?

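To make the retry idea a bit more concrete, here is a rough user-space
sketch, not kernel code and not a proposed patch. Everything prefixed with
toy_ is made up for illustration: toy_find_lowest_rq() stands in for the
lockless search (it looks only at a possibly stale priority snapshot), and
toy_select_cpu() applies the priority check quoted above against the live
value, retrying with the rejected CPU masked out. The scenario assumes the
Juno layout from your trace (CPU1 and CPU2 big) and uses 100 to mean "no RT
task queued".

```c
#include <limits.h>
#include <stdbool.h>

#define NR_CPUS 6

/* Toy stand-in for a runqueue; lower prio value means higher priority. */
struct toy_rq {
	int stale_prio;    /* what the lockless search believes is queued */
	int highest_prio;  /* what is really queued right now */
	bool fits;         /* does the waking task fit this CPU's capacity? */
};

/* Pick the fitting CPU that looks least busy in the stale view, or -1. */
static int toy_find_lowest_rq(const struct toy_rq *rq, unsigned int skip_mask)
{
	int best = -1, best_prio = INT_MIN;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if ((skip_mask & (1u << cpu)) || !rq[cpu].fits)
			continue;
		if (rq[cpu].stale_prio > best_prio) {
			best_prio = rq[cpu].stale_prio;
			best = cpu;
		}
	}
	return best;
}

/* Try up to max_tries candidates; fall back to prev_cpu if all lose the race. */
static int toy_select_cpu(const struct toy_rq *rq, int prio, int prev_cpu,
			  int max_tries)
{
	unsigned int skip_mask = 0;

	for (int i = 0; i < max_tries; i++) {
		int target = toy_find_lowest_rq(rq, skip_mask);

		if (target < 0)
			break;
		/* Same shape as the check in select_task_rq_rt(). */
		if (prio < rq[target].highest_prio)
			return target;
		skip_mask |= 1u << target;  /* lost the race here, look elsewhere */
	}
	return prev_cpu;
}

/* Scenario from the trace: a prio-50 task was just queued on CPU1. */
static void toy_juno_race(struct toy_rq *rq)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		rq[cpu].stale_prio = 100;
		rq[cpu].highest_prio = 100;
		rq[cpu].fits = (cpu == 1 || cpu == 2);  /* big cores only */
	}
	rq[1].highest_prio = 50;  /* the search has not noticed this yet */
}
```

With the state set up by toy_juno_race(), a second prio-50 task whose
previous CPU is 4 ends up back on CPU4 when max_tries is 1, which is the
behaviour in the trace, and lands on CPU2 when max_tries is 2. Whether a
retry like this is cheap enough in the real wakeup path is a separate
question.
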
Thanks,
Pavan
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.