All of lore.kernel.org
 help / color / mirror / Atom feed
From: peterz@infradead.org
To: Oleg Nesterov <oleg@redhat.com>
Cc: Jiri Slaby <jirislaby@kernel.org>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	christian@brauner.io, "Eric W. Biederman" <ebiederm@xmission.com>,
	Linux kernel mailing list <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@suse.de>,
	Dave Jones <davej@codemonkey.org.uk>,
	Paul Gortmaker <paul.gortmaker@windriver.com>
Subject: Re: 5.8-rc*: kernel BUG at kernel/signal.c:1917
Date: Mon, 20 Jul 2020 12:59:24 +0200	[thread overview]
Message-ID: <20200720105924.GE43129@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20200720084106.GJ10769@hirez.programming.kicks-ass.net>

On Mon, Jul 20, 2020 at 10:41:06AM +0200, Peter Zijlstra wrote:
> On Mon, Jul 20, 2020 at 10:26:58AM +0200, Oleg Nesterov wrote:
> > Peter,
> > 
> > Let me add another note. TASK_TRACED/TASK_STOPPED was always protected by
> > ->siglock. In particular, ttwu(__TASK_TRACED) must be always called with
> > ->siglock held. That is why ptrace_freeze_traced() assumes it can safely
> > do s/TASK_TRACED/__TASK_TRACED/ under spin_lock(siglock).
> > 
> > Can this change race with
> > 
> > 		if (signal_pending_state(prev->state, prev)) {
> > 			prev->state = TASK_RUNNING;
> > 		}
> > 
> > in __schedule() ? Hopefully not, signal-state is protected by siglock too.
> > 
> > So I think this logic was correct even if it doesn't look nice. But "doesn't
> > look nice" is true for the whole ptrace code ;)
> 
> *groan*... another bit of obscure magic :-(
> 
> let me go try and wake up and figure out how best to deal with this.

So clearly I'm still missing something, the below results in:

[   63.760863] ------------[ cut here ]------------
[   63.766019] !(tmp_state & __TASK_TRACED)
[   63.766030] WARNING: CPU: 33 PID: 33801 at kernel/sched/core.c:4158 __schedule+0x6bd/0x8e0

Also, is there any way to not have ptrace do this? How performance
critical is this ptrace path? Because I really hate having to add code
to __schedule() to deal with this horrible thing.


---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e15543cb84812..f65a801d268b6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4100,9 +4100,9 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
  */
 static void __sched notrace __schedule(bool preempt)
 {
+	unsigned long prev_state, tmp_state;
 	struct task_struct *prev, *next;
 	unsigned long *switch_count;
-	unsigned long prev_state;
 	struct rq_flags rf;
 	struct rq *rq;
 	int cpu;
@@ -4140,16 +4140,38 @@ static void __sched notrace __schedule(bool preempt)
 	rq_lock(rq, &rf);
 	smp_mb__after_spinlock();
 
+	/*
+	 * We must re-load prev->state in case ttwu_remote() changed it
+	 * before we acquired rq->lock.
+	 */
+	tmp_state = prev->state;
+	if (unlikely(prev_state != tmp_state)) {
+		if (prev_state & __TASK_TRACED) {
+			/*
+			 * ptrace_{,un}freeze_traced() think it is cool
+			 * to change ->state around behind our backs
+			 * between TASK_TRACED and __TASK_TRACED.
+			 *
+			 * Luckily this transformation doesn't affect
+			 * sched_contributes_to_load.
+			 */
+			SCHED_WARN_ON(!(tmp_state & __TASK_TRACED));
+		} else {
+			/*
+			 * For any other case, a changed prev_state
+			 * must be to TASK_RUNNING, such that...
+			 */
+			SCHED_WARN_ON(tmp_state != TASK_RUNNING);
+		}
+		prev_state = tmp_state;
+	}
+
 	/* Promote REQ to ACT */
 	rq->clock_update_flags <<= 1;
 	update_rq_clock(rq);
 
 	switch_count = &prev->nivcsw;
-	/*
-	 * We must re-load prev->state in case ttwu_remote() changed it
-	 * before we acquired rq->lock.
-	 */
-	if (!preempt && prev_state && prev_state == prev->state) {
+	if (!preempt && prev_state) {
 		if (signal_pending_state(prev_state, prev)) {
 			prev->state = TASK_RUNNING;
 		} else {

  reply	other threads:[~2020-07-20 10:59 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-17 10:45 5.8-rc*: kernel BUG at kernel/signal.c:1917 Jiri Slaby
2020-07-17 11:04 ` Jiri Slaby
2020-07-17 11:12   ` Christian Brauner
2020-07-18 13:05     ` Jiri Slaby
2020-07-17 12:26   ` Oleg Nesterov
2020-07-17 12:40     ` Oleg Nesterov
2020-07-18 12:28       ` Jiri Slaby
2020-07-18 17:14         ` Oleg Nesterov
2020-07-18 17:44           ` Christian Brauner
2020-07-20  5:44             ` Jiri Slaby
2020-07-20  6:43               ` Oleg Nesterov
2020-07-20  8:26                 ` Oleg Nesterov
2020-07-20  8:41                   ` Peter Zijlstra
2020-07-20 10:59                     ` peterz [this message]
2020-07-20 11:26                       ` peterz
2020-07-20 11:40                         ` Jiri Slaby
2020-07-20 12:20                         ` Valentin Schneider
2020-07-20 13:17                           ` peterz
2020-07-20 14:26                             ` Valentin Schneider
2020-07-20 12:57                         ` Christian Brauner
2020-07-20 14:05                         ` peterz
2020-07-20 14:02                       ` Oleg Nesterov
2020-07-20 14:21                         ` Peter Zijlstra
2020-07-20 14:39                           ` Oleg Nesterov
2020-07-20 15:35                             ` Oleg Nesterov
2020-07-20 15:38                               ` Peter Zijlstra
2020-07-21  4:52                           ` Paul Gortmaker
2020-07-21  8:37                             ` peterz
2020-07-21 12:13                               ` [PATCH] sched: Fix race against ptrace_freeze_trace() peterz
2020-07-21 14:29                                 ` Christian Brauner
2020-07-21 15:38                                 ` Oleg Nesterov
2020-07-21  9:14                           ` 5.8-rc*: kernel BUG at kernel/signal.c:1917 Valentin Schneider
     [not found]           ` <20200719072726.5892-1-hdanton@sina.com>
2020-07-19 18:23             ` Oleg Nesterov
2020-07-20  6:00           ` Jiri Slaby
2020-07-20  6:56             ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200720105924.GE43129@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=christian.brauner@ubuntu.com \
    --cc=christian@brauner.io \
    --cc=davej@codemonkey.org.uk \
    --cc=ebiederm@xmission.com \
    --cc=jirislaby@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=oleg@redhat.com \
    --cc=paul.gortmaker@windriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.