From: Peter Zijlstra <peterz@infradead.org>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, Rik van Riel <riel@surriel.com>
Subject: Re: sched: rq->nr_iowait transiently going negative after the recent p->on_cpu optimization
Date: Thu, 24 Sep 2020 13:50:42 +0200
Message-ID: <20200924115042.GG2628@hirez.programming.kicks-ass.net>
In-Reply-To: <20200918172759.GA4247@mtj.thefacebook.com>

On Fri, Sep 18, 2020 at 01:27:59PM -0400, Tejun Heo wrote:
> Hello,
> 
> Peter, I noticed /proc/stat::procs_blocked going U64_MAX transiently once in
> a blue moon without any other persistent issues. After looking at the code
> with Rik for a bit, the culprit seems to be c6e7bd7afaeb ("sched/core:
> Optimize ttwu() spinning on p->on_cpu") - it changed where ttwu dec's
> nr_iowait and it looks like that can happen before the target task gets to
> inc.

Hurmph.. I suppose you're right :/ And this is an actual problem?

I think the below should cure that, but blergh, not nice. If you could
confirm, I'll try and think of something nicer.


diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ebb90572e10d..259a4ae8ab8e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2505,7 +2505,12 @@ ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
 #ifdef CONFIG_SMP
 	if (wake_flags & WF_MIGRATED)
 		en_flags |= ENQUEUE_MIGRATED;
+	else
 #endif
+	if (p->in_iowait) {
+		delayacct_blkio_end(p);
+		atomic_dec(&task_rq(p)->nr_iowait);
+	}
 
 	activate_task(rq, p, en_flags);
 	ttwu_do_wakeup(rq, p, wake_flags, rf);
@@ -2892,11 +2897,6 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
 		goto unlock;
 
-	if (p->in_iowait) {
-		delayacct_blkio_end(p);
-		atomic_dec(&task_rq(p)->nr_iowait);
-	}
-
 #ifdef CONFIG_SMP
 	/*
 	 * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
@@ -2967,6 +2967,11 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 
 	cpu = select_task_rq(p, p->wake_cpu, SD_BALANCE_WAKE, wake_flags);
 	if (task_cpu(p) != cpu) {
+		if (p->in_iowait) {
+			delayacct_blkio_end(p);
+			atomic_dec(&task_rq(p)->nr_iowait);
+		}
+
 		wake_flags |= WF_MIGRATED;
 		psi_ttwu_dequeue(p);
 		set_task_cpu(p, cpu);
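
For illustration only, here is a minimal userspace sketch of the ordering problem described above: the wakeup side decrements a counter before the sleeping side has incremented it, and a reader that sums the counters into an unsigned value in between sees a huge number instead of -1. It assumes the aggregate is accumulated into an unsigned long, which is how the reported U64_MAX reading would come about on a 64-bit machine. NR_CPUS_MODEL, nr_iowait_model and all helper names are hypothetical and not taken from kernel/sched/core.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define NR_CPUS_MODEL 2	/* hypothetical CPU count for this model */

/* Stand-in for the per-runqueue rq->nr_iowait counters (model only). */
static atomic_int nr_iowait_model[NR_CPUS_MODEL];

/*
 * Sum the per-CPU counters into an unsigned total. A counter that is
 * momentarily -1 is sign-extended, so the total wraps to ULONG_MAX.
 */
static unsigned long nr_iowait_sum(void)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		sum += (unsigned long)(long)atomic_load(&nr_iowait_model[cpu]);
	return sum;
}

/* Wakeup side: drops the count for the task it is waking. */
static void waker_dec(int cpu)
{
	atomic_fetch_sub(&nr_iowait_model[cpu], 1);
}

/* Sleeping side: accounts itself as blocked on I/O. */
static void sleeper_inc(int cpu)
{
	atomic_fetch_add(&nr_iowait_model[cpu], 1);
}

/*
 * Replay the bad ordering on CPU 0: decrement first, sample the sum,
 * then let the increment land. Returns the value a reader would have
 * seen inside the window.
 */
static unsigned long model_transient_sum(void)
{
	unsigned long transient;

	waker_dec(0);
	transient = nr_iowait_sum();	/* reader samples inside the window */
	sleeper_inc(0);
	assert(nr_iowait_sum() == 0);	/* balanced again afterwards */
	return transient;
}
```

Calling model_transient_sum() returns ULONG_MAX, while the sum read after both operations is 0 again, which matches the "transient, no persistent issue" symptom. The sketch replays the interleaving sequentially rather than racing real threads, so it shows the arithmetic of the bad window and not how likely the window is to be hit.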
