public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: John Stultz <jstultz@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Qais Yousef <qyousef@layalina.io>, Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	Zimuzo Ezeozue <zezeozue@google.com>,
	Mel Gorman <mgorman@suse.de>, Will Deacon <will@kernel.org>,
	Waiman Long <longman@redhat.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Metin Kaya <Metin.Kaya@arm.com>,
	Xuewen Yan <xuewen.yan94@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Suleiman Souhlal <suleiman@google.com>,
	kuyo chang <kuyo.chang@mediatek.com>, hupu <hupu.gm@gmail.com>,
	kernel-team@android.com
Subject: Re: [PATCH v26 00/10] Simple Donor Migration for Proxy Execution
Date: Fri, 3 Apr 2026 16:38:19 +0200	[thread overview]
Message-ID: <20260403143819.GC2872@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <5e2337e7-24d8-4cc7-a86d-b8f5a19fb3ce@amd.com>

On Fri, Apr 03, 2026 at 07:13:29PM +0530, K Prateek Nayak wrote:
> Hello Peter,
> 
> On 4/3/2026 4:58 PM, Peter Zijlstra wrote:
> > On Fri, Apr 03, 2026 at 03:55:22PM +0530, K Prateek Nayak wrote:
> >>>> @@ -4256,6 +4277,15 @@ int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
> >>>>  		 */
> >>>>  		smp_cond_load_acquire(&p->on_cpu, !VAL);
> >>>>  
> >>>> +		/*
> >>>> +		 * We never clear the blocked_on relation on proxy_deactivate.
> >>>> +		 * If we don't clear it here, we have TASK_RUNNING + p->blocked_on
> >>>> +		 * when waking up. Since this is a fully blocked, off CPU task
> >>>> +		 * waking up, it should be safe to clear the blocked_on relation.
> >>>> +		 */
> >>>> +		if (task_is_blocked(p))
> >>>> +			clear_task_blocked_on(p, NULL);
> >>>> +
> >>>
> >>> Aah, yes! This is when find_proxy_task() hits deactivate() for us and we
> >>> skip ttwu_runnable(). We still need to clear ->blocked_on.
> > 
> > I wonder, should we have proxy_deactivate() do this instead?
> 
> That is one way to tackle that, yes!

OK, lets put it there. At that site we already know task_is_blocked()
and we get less noise in the wakeup path.

Or should we perhaps put it in block_task() itself? The moment you're
off the runqueue, ->blocked_on becomes meaningless.

> >>>
> >>>>  		cpu = select_task_rq(p, p->wake_cpu, &wake_flags);
> >>>>  		if (task_cpu(p) != cpu) {
> >>>>  			if (p->in_iowait) {
> >>>> @@ -6977,6 +7007,10 @@ static void __sched notrace __schedule(int sched_mode)
> >>>>  		switch_count = &prev->nvcsw;
> >>>>  	}
> >>>>  
> >>>> +	/* See: https://github.com/kudureranganath/linux/commit/0d6a01bb19db39f045d6f0f5fb4d196500091637 */
> >>>> +	if (!prev_state && task_is_blocked(prev))
> >>>> +		clear_task_blocked_on(prev, NULL);
> >>>> +
> >>>
> >>> This one confuses me, ttwu() should never results in ->blocked_on being
> >>> set.
> >>
> >> This is from the signal_pending_state() in try_to_block_task() putting
> >> prev to TASK_RUNNING while still having p->blocked_on.
> > 
> > Ooh, I misread that. I saw that set_task_blocked_on_waking(, NULL) in
> > there and though it would clear. Damn, sometimes reading is so very
> > hard...
> > 
> > Anyhow, with my changes on, try_to_block_task() is only ever called from
> > __schedule() on current. This means this could in fact be
> > clear_task_blocked_on(), right?
> 
> Yes, that should work too but there is also the case you mentioned on
> the parallel thread - if ttwu() doesn't see p->blocked_on, it won't
> clear it and if ttwu_runnable() wins we can simply go into
> __schedule() with TASK_RUNNING + p->blocked_on with no guarantee that
> same task will be picked again. Then the task gets preempted and stays
> in an illegal state.
> 
> Clearing of ->blocked_on in schedule also safeguards against that race:
> 
> o Scenario 1: ttwu_runnable() wins
> 
>   CPU0				CPU1
> 
>   LOCK				ACQUIRE
>   [W] ->blocked_on = lock       [R] ->__state
>   [W] ->__state = state;        RMB
>                                 [R] ->blocked_on
>                                 [W] ->__state = RUNNING
> 
>            /* Synchronized by rq_lock */
> 
>   MB
>   [R] if (!->__state &&
>   [R]     ->blocked_on)
>   [W]   ->blocked_on = NULL; /* Safeguard */
> 
> 
> o Scenario 2: __schedule() wins
> 
>   CPU0				CPU1
> 
>   LOCK				ACQUIRE
>   [W] ->blocked_on = lock       [R] ->__state
>   [W] ->__state = state;
> 
>            /* Synchronized by rq_lock */
>   MB
>   [W]  __block_task()
> 
>                                 MB
>                                 /* Full wakeup. */
>                                 [R] if (->blocked_on)
>                                 [W]   ->blocked_on = NULL
> 

Oh Boohoo :-( Yes, you're quite right.

Is find_proxy_task() that is affected, so this fixup should
perhaps live in the existing sched_proxy_exec() branch?

Something like so?


--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2222,6 +2222,13 @@ static bool block_task(struct rq *rq, st
 {
 	int flags = DEQUEUE_NOCLOCK;
 
+	/*
+	 * We're being taken off the runqueue, cannot still be blocked_on
+	 * anything. This also means that delay_dequeue can not have
+	 * blocked_on.
+	 */
+	clear_task_blocked_on(p, NULL);
+	
 	p->sched_contributes_to_load =
 		(task_state & TASK_UNINTERRUPTIBLE) &&
 		!(task_state & TASK_NOLOAD) &&
@@ -6614,7 +6621,7 @@ static bool try_to_block_task(struct rq
 	if (signal_pending_state(task_state, p)) {
 		WRITE_ONCE(p->__state, TASK_RUNNING);
 		*task_state_p = TASK_RUNNING;
-		set_task_blocked_on_waking(p, NULL);
+		clear_task_blocked_on(p, NULL);
 
 		return false;
 	}
@@ -7043,6 +7050,14 @@ static void __sched notrace __schedule(i
 	if (sched_proxy_exec()) {
 		struct task_struct *prev_donor = rq->donor;
 
+		/*
+		 * There is a race between ttwu() and __mutex_lock_common()
+		 * where it is possible for the mutex code to call into
+		 * schedule() with ->blocked_on still set.
+		 */
+		if (!prev_state && prev->blocked_on)
+			clear_task_blocked_on(prev, NULL);
+
 		rq_set_donor(rq, next);
 		if (unlikely(next->blocked_on)) {
 			next = find_proxy_task(rq, next, &rf);


  reply	other threads:[~2026-04-03 14:38 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-24 19:13 [PATCH v26 00/10] Simple Donor Migration for Proxy Execution John Stultz
2026-03-24 19:13 ` [PATCH v26 01/10] sched: Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr() John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 02/10] sched: Minimise repeated sched_proxy_exec() checking John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 03/10] sched: Fix potentially missing balancing with Proxy Exec John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 04/10] locking: Add task::blocked_lock to serialize blocked_on state John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 05/10] sched: Fix modifying donor->blocked on without proper locking John Stultz
2026-03-26 21:45   ` Steven Rostedt
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 06/10] sched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy return-migration John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 07/10] sched: Add assert_balance_callbacks_empty helper John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 08/10] sched: Add logic to zap balance callbacks if we pick again John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 09/10] sched: Move attach_one_task and attach_task helpers to sched.h John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 10/10] sched: Handle blocked-waiter migration (and return migration) John Stultz
2026-03-26 22:52   ` Steven Rostedt
2026-03-27  4:47     ` K Prateek Nayak
2026-03-27 12:47       ` Peter Zijlstra
2026-04-02 14:43   ` Peter Zijlstra
2026-04-02 15:08     ` Peter Zijlstra
2026-04-02 17:43       ` John Stultz
2026-04-02 17:34     ` John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-25 10:52 ` [PATCH v26 00/10] Simple Donor Migration for Proxy Execution K Prateek Nayak
2026-03-27 11:48   ` Peter Zijlstra
2026-03-27 13:33     ` K Prateek Nayak
2026-03-27 15:20       ` Peter Zijlstra
2026-03-27 15:41         ` Peter Zijlstra
2026-03-27 16:00       ` Peter Zijlstra
2026-03-27 16:57         ` K Prateek Nayak
2026-04-02 15:50           ` Peter Zijlstra
2026-04-02 18:31             ` John Stultz
2026-04-02 21:04               ` John Stultz
2026-04-03  6:09               ` K Prateek Nayak
2026-04-03  9:52                 ` Peter Zijlstra
2026-04-03 10:25                   ` K Prateek Nayak
2026-04-03 11:28                     ` Peter Zijlstra
2026-04-03 13:43                       ` K Prateek Nayak
2026-04-03 14:38                         ` Peter Zijlstra [this message]
2026-04-03 15:39                           ` K Prateek Nayak
2026-04-03 21:08                             ` Peter Zijlstra
2026-04-04  0:26                             ` John Stultz
2026-04-04  5:49                               ` K Prateek Nayak
2026-04-04  6:07                                 ` John Stultz
2026-04-06  2:40                                   ` K Prateek Nayak
2026-04-03 12:54                     ` Peter Zijlstra
2026-04-03  9:18               ` Peter Zijlstra
2026-03-27 19:15     ` John Stultz
2026-03-27 19:10   ` John Stultz
2026-03-28  4:53     ` K Prateek Nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260403143819.GC2872@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=Metin.Kaya@arm.com \
    --cc=boqun.feng@gmail.com \
    --cc=bsegall@google.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=dietmar.eggemann@arm.com \
    --cc=hupu.gm@gmail.com \
    --cc=joelagnelf@nvidia.com \
    --cc=jstultz@google.com \
    --cc=juri.lelli@redhat.com \
    --cc=kernel-team@android.com \
    --cc=kprateek.nayak@amd.com \
    --cc=kuyo.chang@mediatek.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=qyousef@layalina.io \
    --cc=rostedt@goodmis.org \
    --cc=suleiman@google.com \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=will@kernel.org \
    --cc=xuewen.yan94@gmail.com \
    --cc=zezeozue@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox