From: Juri Lelli <juri.lelli@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: peterz@infradead.org, mingo@redhat.com, mark.rutland@arm.com,
linux-kernel@vger.kernel.org, luca.abeni@santannapisa.it,
claudio@evidence.eu.com, bristot@redhat.com
Subject: Re: [PATCH] sched/deadline: Fix switched_from_dl
Date: Mon, 6 Aug 2018 12:17:07 +0200 [thread overview]
Message-ID: <20180806101707.GD26470@localhost.localdomain> (raw)
In-Reply-To: <20180801231953.151fdce6@vmware.local.home>
On 01/08/18 23:19, Steven Rostedt wrote:
> On Wed, 11 Jul 2018 09:29:48 +0200
> Juri Lelli <juri.lelli@redhat.com> wrote:
>
> > Mark noticed that syzkaller is able to reliably trigger the following
> >
> > dl_rq->running_bw > dl_rq->this_bw
> > WARNING: CPU: 1 PID: 153 at kernel/sched/deadline.c:124 switched_from_dl+0x454/0x608
> > Kernel panic - not syncing: panic_on_warn set ...
> >
> > CPU: 1 PID: 153 Comm: syz-executor253 Not tainted 4.18.0-rc3+ #29
> > Hardware name: linux,dummy-virt (DT)
> > Call trace:
> > dump_backtrace+0x0/0x458
> > show_stack+0x20/0x30
> > dump_stack+0x180/0x250
> > panic+0x2dc/0x4ec
> > __warn_printk+0x0/0x150
> > report_bug+0x228/0x2d8
> > bug_handler+0xa0/0x1a0
> > brk_handler+0x2f0/0x568
> > do_debug_exception+0x1bc/0x5d0
> > el1_dbg+0x18/0x78
> > switched_from_dl+0x454/0x608
> > __sched_setscheduler+0x8cc/0x2018
> > sys_sched_setattr+0x340/0x758
> > el0_svc_naked+0x30/0x34
> >
> > syzkaller reproducer runs a bunch of threads that constantly switch
> > between DEADLINE and NORMAL classes while interacting through futexes.
> >
> > The splat above is caused by the fact that if a DEADLINE task is setattr
> > back to NORMAL while in non_contending state (blocked on a futex -
> > inactive timer armed), its contribution to running_bw is not removed
> > before sub_rq_bw() gets called (!task_on_rq_queued() branch) and the
> > latter sees running_bw > this_bw.
> >
> > Fix it by removing a task contribution from running_bw if the task is
> > not queued and in non_contending state while switched to a different
> > class.
> >
> > Reported-by: Mark Rutland <mark.rutland@arm.com>
> > Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
> > ---
> > kernel/sched/deadline.c | 11 ++++++++++-
> > 1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index fbfc3f1d368a..10c7b51c0d1f 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -2290,8 +2290,17 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
> > if (task_on_rq_queued(p) && p->dl.dl_runtime)
> > task_non_contending(p);
> >
> > - if (!task_on_rq_queued(p))
> > + if (!task_on_rq_queued(p)) {
> > + /*
> > + * Inactive timer is armed. However, p is leaving DEADLINE and
> > + * might migrate away from this rq while continuing to run on
> > + * some other class. We need to remove its contribution from
> > + * this rq running_bw now, or sub_rq_bw (below) will complain.
> > + */
> > + if (p->dl.dl_non_contending)
> > + sub_running_bw(&p->dl, &rq->dl);
> > sub_rq_bw(&p->dl, &rq->dl);
> > + }
> >
> > /*
> > * We cannot use inactive_task_timer() to invoke sub_running_bw()
>
> Looking at this code:
>
> if (!task_on_rq_queued(p)) {
> /*
> * Inactive timer is armed. However, p is leaving DEADLINE and
> * might migrate away from this rq while continuing to run on
> * some other class. We need to remove its contribution from
> * this rq running_bw now, or sub_rq_bw (below) will complain.
> */
> if (p->dl.dl_non_contending)
> sub_running_bw(&p->dl, &rq->dl);
> sub_rq_bw(&p->dl, &rq->dl);
> }
>
> /*
> * We cannot use inactive_task_timer() to invoke sub_running_bw()
> * at the 0-lag time, because the task could have been migrated
> * while SCHED_OTHER in the meanwhile.
> */
> if (p->dl.dl_non_contending)
> p->dl.dl_non_contending = 0;
>
> Question. Is the "dl_non_contending" only able to be set
> if !task_on_rq_queued(p) is true? In that case, we could just clear it
> in the first if block.
Code right before the if block does
if (task_on_rq_queued(p) && p->dl.dl_runtime)
task_non_contending(p);
So we can end up with dl_non_contending being set even if task_on_rq_
queued(p) is true.
> If it's not true, I would think the subtraction
> is needed regardless.
And if we do sub_running_bw unconditionally we might end up subtracting
twice if inactive timer fired (resetting dl_non_contending) before we
end up here, no?
Thanks,
- Juri
prev parent reply other threads:[~2018-08-06 10:17 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-11 7:29 [PATCH] sched/deadline: Fix switched_from_dl Juri Lelli
2018-07-11 8:53 ` luca abeni
2018-07-12 14:40 ` Daniel Bristot de Oliveira
2018-07-15 23:24 ` [tip:sched/urgent] sched/deadline: Fix switched_from_dl() warning tip-bot for Juri Lelli
2018-08-02 3:19 ` [PATCH] sched/deadline: Fix switched_from_dl Steven Rostedt
2018-08-06 10:17 ` Juri Lelli [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180806101707.GD26470@localhost.localdomain \
--to=juri.lelli@redhat.com \
--cc=bristot@redhat.com \
--cc=claudio@evidence.eu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luca.abeni@santannapisa.it \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.