All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: John Stultz <jstultz@google.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Qais Yousef <qyousef@layalina.io>, Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	Zimuzo Ezeozue <zezeozue@google.com>,
	Mel Gorman <mgorman@suse.de>, Will Deacon <will@kernel.org>,
	Waiman Long <longman@redhat.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Metin Kaya <Metin.Kaya@arm.com>,
	Xuewen Yan <xuewen.yan94@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Suleiman Souhlal <suleiman@google.com>,
	kuyo chang <kuyo.chang@mediatek.com>, hupu <hupu.gm@gmail.com>,
	kernel-team@android.com
Subject: Re: [PATCH v26 00/10] Simple Donor Migration for Proxy Execution
Date: Fri, 3 Apr 2026 16:38:19 +0200	[thread overview]
Message-ID: <20260403143819.GC2872@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <5e2337e7-24d8-4cc7-a86d-b8f5a19fb3ce@amd.com>

On Fri, Apr 03, 2026 at 07:13:29PM +0530, K Prateek Nayak wrote:
> Hello Peter,
> 
> On 4/3/2026 4:58 PM, Peter Zijlstra wrote:
> > On Fri, Apr 03, 2026 at 03:55:22PM +0530, K Prateek Nayak wrote:
> >>>> @@ -4256,6 +4277,15 @@ int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
> >>>>  		 */
> >>>>  		smp_cond_load_acquire(&p->on_cpu, !VAL);
> >>>>  
> >>>> +		/*
> >>>> +		 * We never clear the blocked_on relation on proxy_deactivate.
> >>>> +		 * If we don't clear it here, we have TASK_RUNNING + p->blocked_on
> >>>> +		 * when waking up. Since this is a fully blocked, off CPU task
> >>>> +		 * waking up, it should be safe to clear the blocked_on relation.
> >>>> +		 */
> >>>> +		if (task_is_blocked(p))
> >>>> +			clear_task_blocked_on(p, NULL);
> >>>> +
> >>>
> >>> Aah, yes! This is when find_proxy_task() hits deactivate() for us and we
> >>> skip ttwu_runnable(). We still need to clear ->blocked_on.
> > 
> > I wonder, should we have proxy_deactivate() do this instead?
> 
> That is one way to tackle that, yes!

OK, lets put it there. At that site we already know task_is_blocked()
and we get less noise in the wakeup path.

Or should we perhaps put it in block_task() itself? The moment you're
off the runqueue, ->blocked_on becomes meaningless.

> >>>
> >>>>  		cpu = select_task_rq(p, p->wake_cpu, &wake_flags);
> >>>>  		if (task_cpu(p) != cpu) {
> >>>>  			if (p->in_iowait) {
> >>>> @@ -6977,6 +7007,10 @@ static void __sched notrace __schedule(int sched_mode)
> >>>>  		switch_count = &prev->nvcsw;
> >>>>  	}
> >>>>  
> >>>> +	/* See: https://github.com/kudureranganath/linux/commit/0d6a01bb19db39f045d6f0f5fb4d196500091637 */
> >>>> +	if (!prev_state && task_is_blocked(prev))
> >>>> +		clear_task_blocked_on(prev, NULL);
> >>>> +
> >>>
> >>> This one confuses me, ttwu() should never results in ->blocked_on being
> >>> set.
> >>
> >> This is from the signal_pending_state() in try_to_block_task() putting
> >> prev to TASK_RUNNING while still having p->blocked_on.
> > 
> > Ooh, I misread that. I saw that set_task_blocked_on_waking(, NULL) in
> > there and though it would clear. Damn, sometimes reading is so very
> > hard...
> > 
> > Anyhow, with my changes on, try_to_block_task() is only ever called from
> > __schedule() on current. This means this could in fact be
> > clear_task_blocked_on(), right?
> 
> Yes, that should work too but there is also the case you mentioned on
> the parallel thread - if ttwu() doesn't see p->blocked_on, it won't
> clear it and if ttwu_runnable() wins we can simply go into
> __schedule() with TASK_RUNNING + p->blocked_on with no guarantee that
> same task will be picked again. Then the task gets preempted and stays
> in an illegal state.
> 
> Clearing of ->blocked_on in schedule also safeguards against that race:
> 
> o Scenario 1: ttwu_runnable() wins
> 
>   CPU0				CPU1
> 
>   LOCK				ACQUIRE
>   [W] ->blocked_on = lock       [R] ->__state
>   [W] ->__state = state;        RMB
>                                 [R] ->blocked_on
>                                 [W] ->__state = RUNNING
> 
>            /* Synchronized by rq_lock */
> 
>   MB
>   [R] if (!->__state &&
>   [R]     ->blocked_on)
>   [W]   ->blocked_on = NULL; /* Safeguard */
> 
> 
> o Scenario 2: __schedule() wins
> 
>   CPU0				CPU1
> 
>   LOCK				ACQUIRE
>   [W] ->blocked_on = lock       [R] ->__state
>   [W] ->__state = state;
> 
>            /* Synchronized by rq_lock */
>   MB
>   [W]  __block_task()
> 
>                                 MB
>                                 /* Full wakeup. */
>                                 [R] if (->blocked_on)
>                                 [W]   ->blocked_on = NULL
> 

Oh Boohoo :-( Yes, you're quite right.

Is find_proxy_task() that is affected, so this fixup should
perhaps live in the existing sched_proxy_exec() branch?

Something like so?


--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2222,6 +2222,13 @@ static bool block_task(struct rq *rq, st
 {
 	int flags = DEQUEUE_NOCLOCK;
 
+	/*
+	 * We're being taken off the runqueue, cannot still be blocked_on
+	 * anything. This also means that delay_dequeue can not have
+	 * blocked_on.
+	 */
+	clear_task_blocked_on(p, NULL);
+	
 	p->sched_contributes_to_load =
 		(task_state & TASK_UNINTERRUPTIBLE) &&
 		!(task_state & TASK_NOLOAD) &&
@@ -6614,7 +6621,7 @@ static bool try_to_block_task(struct rq
 	if (signal_pending_state(task_state, p)) {
 		WRITE_ONCE(p->__state, TASK_RUNNING);
 		*task_state_p = TASK_RUNNING;
-		set_task_blocked_on_waking(p, NULL);
+		clear_task_blocked_on(p, NULL);
 
 		return false;
 	}
@@ -7043,6 +7050,14 @@ static void __sched notrace __schedule(i
 	if (sched_proxy_exec()) {
 		struct task_struct *prev_donor = rq->donor;
 
+		/*
+		 * There is a race between ttwu() and __mutex_lock_common()
+		 * where it is possible for the mutex code to call into
+		 * schedule() with ->blocked_on still set.
+		 */
+		if (!prev_state && prev->blocked_on)
+			clear_task_blocked_on(prev, NULL);
+
 		rq_set_donor(rq, next);
 		if (unlikely(next->blocked_on)) {
 			next = find_proxy_task(rq, next, &rf);


  reply	other threads:[~2026-04-03 14:38 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-24 19:13 [PATCH v26 00/10] Simple Donor Migration for Proxy Execution John Stultz
2026-03-24 19:13 ` [PATCH v26 01/10] sched: Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr() John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 02/10] sched: Minimise repeated sched_proxy_exec() checking John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 03/10] sched: Fix potentially missing balancing with Proxy Exec John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 04/10] locking: Add task::blocked_lock to serialize blocked_on state John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 05/10] sched: Fix modifying donor->blocked on without proper locking John Stultz
2026-03-26 21:45   ` Steven Rostedt
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 06/10] sched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy return-migration John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 07/10] sched: Add assert_balance_callbacks_empty helper John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 08/10] sched: Add logic to zap balance callbacks if we pick again John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 09/10] sched: Move attach_one_task and attach_task helpers to sched.h John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 10/10] sched: Handle blocked-waiter migration (and return migration) John Stultz
2026-03-26 22:52   ` Steven Rostedt
2026-03-27  4:47     ` K Prateek Nayak
2026-03-27 12:47       ` Peter Zijlstra
2026-04-02 14:43   ` Peter Zijlstra
2026-04-02 15:08     ` Peter Zijlstra
2026-04-02 17:43       ` John Stultz
2026-04-02 17:34     ` John Stultz
2026-04-03 12:30   ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-25 10:52 ` [PATCH v26 00/10] Simple Donor Migration for Proxy Execution K Prateek Nayak
2026-03-27 11:48   ` Peter Zijlstra
2026-03-27 13:33     ` K Prateek Nayak
2026-03-27 15:20       ` Peter Zijlstra
2026-03-27 15:41         ` Peter Zijlstra
2026-03-27 16:00       ` Peter Zijlstra
2026-03-27 16:57         ` K Prateek Nayak
2026-04-02 15:50           ` Peter Zijlstra
2026-04-02 18:31             ` John Stultz
2026-04-02 21:04               ` John Stultz
2026-04-03  6:09               ` K Prateek Nayak
2026-04-03  9:52                 ` Peter Zijlstra
2026-04-03 10:25                   ` K Prateek Nayak
2026-04-03 11:28                     ` Peter Zijlstra
2026-04-03 13:43                       ` K Prateek Nayak
2026-04-03 14:38                         ` Peter Zijlstra [this message]
2026-04-03 15:39                           ` K Prateek Nayak
2026-04-03 21:08                             ` Peter Zijlstra
2026-04-04  0:26                             ` John Stultz
2026-04-04  5:49                               ` K Prateek Nayak
2026-04-04  6:07                                 ` John Stultz
2026-04-06  2:40                                   ` K Prateek Nayak
2026-04-03 12:54                     ` Peter Zijlstra
2026-04-03  9:18               ` Peter Zijlstra
2026-03-27 19:15     ` John Stultz
2026-03-27 19:10   ` John Stultz
2026-03-28  4:53     ` K Prateek Nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260403143819.GC2872@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=Metin.Kaya@arm.com \
    --cc=boqun.feng@gmail.com \
    --cc=bsegall@google.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=dietmar.eggemann@arm.com \
    --cc=hupu.gm@gmail.com \
    --cc=joelagnelf@nvidia.com \
    --cc=jstultz@google.com \
    --cc=juri.lelli@redhat.com \
    --cc=kernel-team@android.com \
    --cc=kprateek.nayak@amd.com \
    --cc=kuyo.chang@mediatek.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=qyousef@layalina.io \
    --cc=rostedt@goodmis.org \
    --cc=suleiman@google.com \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=will@kernel.org \
    --cc=xuewen.yan94@gmail.com \
    --cc=zezeozue@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.