From: peterz@infradead.org
To: Oleg Nesterov <oleg@redhat.com>
Cc: Jiri Slaby <jirislaby@kernel.org>,
Christian Brauner <christian.brauner@ubuntu.com>,
christian@brauner.io, "Eric W. Biederman" <ebiederm@xmission.com>,
Linux kernel mailing list <linux-kernel@vger.kernel.org>,
Mel Gorman <mgorman@suse.de>,
Dave Jones <davej@codemonkey.org.uk>,
Paul Gortmaker <paul.gortmaker@windriver.com>,
Will Deacon <will@kernel.org>,
paulmck@kernel.org
Subject: Re: 5.8-rc*: kernel BUG at kernel/signal.c:1917
Date: Mon, 20 Jul 2020 16:05:41 +0200
Message-ID: <20200720140541.GG43129@hirez.programming.kicks-ass.net>
In-Reply-To: <20200720112623.GF43129@hirez.programming.kicks-ass.net>
On Mon, Jul 20, 2020 at 01:26:23PM +0200, peterz@infradead.org wrote:
> kernel/sched/core.c | 34 ++++++++++++++++++++++++++++------
> 1 file changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index e15543cb84812..b5973d7fa521c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4100,9 +4100,9 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> */
> static void __sched notrace __schedule(bool preempt)
> {
> + unsigned long prev_state, tmp_state;
> struct task_struct *prev, *next;
> unsigned long *switch_count;
> - unsigned long prev_state;
> struct rq_flags rf;
> struct rq *rq;
> int cpu;
> @@ -4140,16 +4140,38 @@ static void __sched notrace __schedule(bool preempt)
> rq_lock(rq, &rf);
> smp_mb__after_spinlock();
>
> + /*
> + * We must re-load prev->state in case ttwu_remote() changed it
> + * before we acquired rq->lock.
> + */
> + tmp_state = prev->state;
> + if (unlikely(prev_state != tmp_state)) {
> + /*
> + * ptrace_{,un}freeze_traced() think it is cool to change
> + * ->state around behind our backs between TASK_TRACED and
> + * __TASK_TRACED.
> + *
> + * This is safe because this, as well as any __TASK_TRACED
> + * wakeups are under siglock.
> + *
> + * For any other case, a changed prev_state must be to
> + * TASK_RUNNING, such that when it blocks, the load has
> + * happened before the smp_mb().
> + *
> + * Also see the comment with deactivate_task().
> + */
> + SCHED_WARN_ON(tmp_state && (prev_state & __TASK_TRACED &&
> + !(tmp_state & __TASK_TRACED)));
> +
> + prev_state = tmp_state;

While trying to write a changelog for this thing, I can't convince
myself we don't need:

	smp_mb();

here. Consider:

	CPU0				CPU1				CPU2

					schedule()
					  prev_state = prev->state;
					  spin_lock(rq->lock);
					  smp_mb__after_spinlock();

	ptrace_freeze_traced()
	  spin_lock(siglock)
	  task->state = __TASK_TRACED;
	  spin_unlock(siglock);

					  tmp_state = prev->state;
					  if (prev_state != tmp_state)
					    prev_state = tmp_state;

					  /* NO SMP_MB */

					  if (prev_state)
					    deactivate_task()
					      prev->on_rq = 0;

									spin_lock(siglock);
									ttwu()
									  if (p->on_rq && ...)
									    goto unlock;
									  smp_acquire__after_ctrl_dep();
									  p->state = TASK_WAKING;

This loses the ordering we previously relied upon. That is, CPU1's
prev->state load and prev->on_rq store can get reordered vs CPU2.

OTOH, we have a control dependency on CPU1 as well, which should provide
LOAD->STORE ordering; after all, we only do the ->on_rq = 0 store IFF we
see a non-zero prev_state.

So that is:

	if (p->state)			if (!p->on_rq)
	  p->on_rq = 0;			  p->state = TASK_WAKING

which matches a CTRL-DEP to a CTRL-DEP ...

But this then means we can simplify dbfb089d360 as well, but now my head
hurts.
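
For what it's worth, the CTRL-DEP to CTRL-DEP pairing above can be poked at with a small userspace model. This is only a sketch under loud assumptions: it is not kernel code, the `ctrl_*` names and the two constants are made up for illustration, and it models each CPU as "load, then dependent store" executed in program order (which is all a control dependency is meant to give for this pair of accesses), interleaved in every possible way. It is not a proof under the kernel memory model. It counts the interleavings in which both sides observe the other's store, i.e. CPU1 loads TASK_WAKING from ->state while CPU2 loads 0 from ->on_rq.

```c
#include <assert.h>

/* Illustrative values only; what matters is "non-zero" vs. "waking". */
#define CTRL_TASK_TRACED 0x0108UL
#define CTRL_TASK_WAKING 0x0200UL

struct ctrl_task {
	unsigned long state;
	int on_rq;
};

struct ctrl_regs {
	unsigned long cpu1_state;	/* value CPU1 loaded from ->state */
	int cpu2_on_rq;			/* value CPU2 loaded from ->on_rq */
};

/*
 * Run one interleaving of the four steps. Bit i of 'order' picks who
 * takes step i: 0 = CPU1 (schedule side), 1 = CPU2 (ttwu side).
 * Each CPU does its load first and its dependent store second.
 */
struct ctrl_regs ctrl_run(unsigned int order)
{
	struct ctrl_task p = { CTRL_TASK_TRACED, 1 };
	struct ctrl_regs r = { 0, 0 };
	int cpu1_step = 0, cpu2_step = 0;
	int i;

	for (i = 0; i < 4; i++) {
		if (!((order >> i) & 1)) {
			if (cpu1_step++ == 0)
				r.cpu1_state = p.state;		/* if (p->state)   */
			else if (r.cpu1_state)
				p.on_rq = 0;			/*   p->on_rq = 0; */
		} else {
			if (cpu2_step++ == 0)
				r.cpu2_on_rq = p.on_rq;		/* if (!p->on_rq)  */
			else if (!r.cpu2_on_rq)
				p.state = CTRL_TASK_WAKING;	/*   p->state = TASK_WAKING */
		}
	}
	return r;
}

/* Count interleavings where each CPU's load saw the other CPU's store. */
int ctrl_count_cyclic_outcomes(void)
{
	unsigned int order;
	int bad = 0;

	for (order = 0; order < 16; order++) {
		int bits = 0, i;
		struct ctrl_regs r;

		/* Keep only schedules with exactly two steps per CPU. */
		for (i = 0; i < 4; i++)
			bits += (order >> i) & 1;
		if (bits != 2)
			continue;

		r = ctrl_run(order);
		if (r.cpu1_state == CTRL_TASK_WAKING && r.cpu2_on_rq == 0)
			bad++;
	}
	return bad;
}
```

Calling `ctrl_count_cyclic_outcomes()` walks all six valid interleavings; `ctrl_run(0xC)` is the one from the diagram, where CPU1 finishes both steps before CPU2 starts.
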
> + }
> +
> /* Promote REQ to ACT */
> rq->clock_update_flags <<= 1;
> update_rq_clock(rq);
>
> switch_count = &prev->nivcsw;
> - /*
> - * We must re-load prev->state in case ttwu_remote() changed it
> - * before we acquired rq->lock.
> - */
> - if (!preempt && prev_state && prev_state == prev->state) {
> + if (!preempt && prev_state) {
> if (signal_pending_state(prev_state, prev)) {
> prev->state = TASK_RUNNING;
> } else {
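
As a reading aid for the SCHED_WARN_ON() condition in the hunk above, here is the same predicate as a standalone userspace function. This is a sketch, not the kernel's code: the `WARN_SKETCH_*` names and `warn_sketch_state_change()` are hypothetical, and the state values are what I recall from include/linux/sched.h around v5.8, so treat them as assumptions and check the tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed task state values (recalled from v5.8, not authoritative). */
#define WARN_SKETCH_TASK_RUNNING	0x0000UL
#define WARN_SKETCH_TASK_INTERRUPTIBLE	0x0001UL
#define WARN_SKETCH___TASK_TRACED	0x0008UL
#define WARN_SKETCH_TASK_WAKEKILL	0x0100UL
#define WARN_SKETCH_TASK_TRACED \
	(WARN_SKETCH_TASK_WAKEKILL | WARN_SKETCH___TASK_TRACED)

/*
 * Mirrors the condition passed to SCHED_WARN_ON() in the patch:
 * warn when the re-loaded state is not TASK_RUNNING, the old state
 * had __TASK_TRACED set, and the new state has lost that bit.
 */
bool warn_sketch_state_change(unsigned long prev_state,
			      unsigned long tmp_state)
{
	return tmp_state &&
	       (prev_state & WARN_SKETCH___TASK_TRACED) &&
	       !(tmp_state & WARN_SKETCH___TASK_TRACED);
}
```

With these values, the ptrace_{,un}freeze_traced() transitions between TASK_TRACED and __TASK_TRACED keep the __TASK_TRACED bit and so do not warn, and neither does a wakeup to TASK_RUNNING. Note that, as written, the predicate only fires when the old state had __TASK_TRACED set; a change between two other non-running states would not trip it.
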