All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Juri Lelli <juri.lelli@redhat.com>
Cc: peterz@infradead.org, mingo@redhat.com, mark.rutland@arm.com,
	linux-kernel@vger.kernel.org, luca.abeni@santannapisa.it,
	claudio@evidence.eu.com, bristot@redhat.com
Subject: Re: [PATCH] sched/deadline: Fix switched_from_dl
Date: Wed, 1 Aug 2018 23:19:53 -0400	[thread overview]
Message-ID: <20180801231953.151fdce6@vmware.local.home> (raw)
In-Reply-To: <20180711072948.27061-1-juri.lelli@redhat.com>

On Wed, 11 Jul 2018 09:29:48 +0200
Juri Lelli <juri.lelli@redhat.com> wrote:

> Mark noticed that syzkaller is able to reliably trigger the following
> 
>   dl_rq->running_bw > dl_rq->this_bw
>   WARNING: CPU: 1 PID: 153 at kernel/sched/deadline.c:124 switched_from_dl+0x454/0x608
>   Kernel panic - not syncing: panic_on_warn set ...
> 
>   CPU: 1 PID: 153 Comm: syz-executor253 Not tainted 4.18.0-rc3+ #29
>   Hardware name: linux,dummy-virt (DT)
>   Call trace:
>    dump_backtrace+0x0/0x458
>    show_stack+0x20/0x30
>    dump_stack+0x180/0x250
>    panic+0x2dc/0x4ec
>    __warn_printk+0x0/0x150
>    report_bug+0x228/0x2d8
>    bug_handler+0xa0/0x1a0
>    brk_handler+0x2f0/0x568
>    do_debug_exception+0x1bc/0x5d0
>    el1_dbg+0x18/0x78
>    switched_from_dl+0x454/0x608
>    __sched_setscheduler+0x8cc/0x2018
>    sys_sched_setattr+0x340/0x758
>    el0_svc_naked+0x30/0x34
> 
> syzkaller reproducer runs a bunch of threads that constantly switch
> between DEADLINE and NORMAL classes while interacting through futexes.
> 
> The splat above is caused by the fact that if a DEADLINE task is setattr
> back to NORMAL while in non_contending state (blocked on a futex -
> inactive timer armed), its contribution to running_bw is not removed
> before sub_rq_bw() gets called (!task_on_rq_queued() branch) and the
> latter sees running_bw > this_bw.
> 
> Fix it by removing a task contribution from running_bw if the task is
> not queued and in non_contending state while switched to a different
> class.
> 
> Reported-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
> ---
>  kernel/sched/deadline.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index fbfc3f1d368a..10c7b51c0d1f 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2290,8 +2290,17 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
>  	if (task_on_rq_queued(p) && p->dl.dl_runtime)
>  		task_non_contending(p);
>  
> -	if (!task_on_rq_queued(p))
> +	if (!task_on_rq_queued(p)) {
> +		/*
> +		 * Inactive timer is armed. However, p is leaving DEADLINE and
> +		 * might migrate away from this rq while continuing to run on
> +		 * some other class. We need to remove its contribution from
> +		 * this rq running_bw now, or sub_rq_bw (below) will complain.
> +		 */
> +		if (p->dl.dl_non_contending)
> +			sub_running_bw(&p->dl, &rq->dl);
>  		sub_rq_bw(&p->dl, &rq->dl);
> +	}
>  
>  	/*
>  	 * We cannot use inactive_task_timer() to invoke sub_running_bw()

Looking at this code:

	if (!task_on_rq_queued(p)) {
		/*
		 * Inactive timer is armed. However, p is leaving DEADLINE and
		 * might migrate away from this rq while continuing to run on
		 * some other class. We need to remove its contribution from
		 * this rq running_bw now, or sub_rq_bw (below) will complain.
		 */
		if (p->dl.dl_non_contending)
			sub_running_bw(&p->dl, &rq->dl);
		sub_rq_bw(&p->dl, &rq->dl);
	}

	/*
	 * We cannot use inactive_task_timer() to invoke sub_running_bw()
	 * at the 0-lag time, because the task could have been migrated
	 * while SCHED_OTHER in the meanwhile.
	 */
	if (p->dl.dl_non_contending)
		p->dl.dl_non_contending = 0;

Question. Is the "dl_non_contending" only able to be set
if !task_on_rq_queued(p) is true? In that case, we could just clear it
in the first if block. If it's not true, I would think the subtraction
is needed regardless.

-- Steve

  parent reply	other threads:[~2018-08-02  3:19 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-11  7:29 [PATCH] sched/deadline: Fix switched_from_dl Juri Lelli
2018-07-11  8:53 ` luca abeni
2018-07-12 14:40 ` Daniel Bristot de Oliveira
2018-07-15 23:24 ` [tip:sched/urgent] sched/deadline: Fix switched_from_dl() warning tip-bot for Juri Lelli
2018-08-02  3:19 ` Steven Rostedt [this message]
2018-08-06 10:17   ` [PATCH] sched/deadline: Fix switched_from_dl Juri Lelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180801231953.151fdce6@vmware.local.home \
    --to=rostedt@goodmis.org \
    --cc=bristot@redhat.com \
    --cc=claudio@evidence.eu.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luca.abeni@santannapisa.it \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.