From: Peter Zijlstra <peterz@infradead.org>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: John Stultz <jstultz@google.com>,
LKML <linux-kernel@vger.kernel.org>,
Joel Fernandes <joelagnelf@nvidia.com>,
Qais Yousef <qyousef@layalina.io>, Ingo Molnar <mingo@redhat.com>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Valentin Schneider <vschneid@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>,
Zimuzo Ezeozue <zezeozue@google.com>,
Mel Gorman <mgorman@suse.de>, Will Deacon <will@kernel.org>,
Waiman Long <longman@redhat.com>,
Boqun Feng <boqun.feng@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
Metin Kaya <Metin.Kaya@arm.com>,
Xuewen Yan <xuewen.yan94@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Suleiman Souhlal <suleiman@google.com>,
kuyo chang <kuyo.chang@mediatek.com>, hupu <hupu.gm@gmail.com>,
kernel-team@android.com
Subject: Re: [PATCH v26 00/10] Simple Donor Migration for Proxy Execution
Date: Fri, 3 Apr 2026 16:38:19 +0200 [thread overview]
Message-ID: <20260403143819.GC2872@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <5e2337e7-24d8-4cc7-a86d-b8f5a19fb3ce@amd.com>
On Fri, Apr 03, 2026 at 07:13:29PM +0530, K Prateek Nayak wrote:
> Hello Peter,
>
> On 4/3/2026 4:58 PM, Peter Zijlstra wrote:
> > On Fri, Apr 03, 2026 at 03:55:22PM +0530, K Prateek Nayak wrote:
> >>>> @@ -4256,6 +4277,15 @@ int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
> >>>> */
> >>>> smp_cond_load_acquire(&p->on_cpu, !VAL);
> >>>>
> >>>> + /*
> >>>> + * We never clear the blocked_on relation on proxy_deactivate.
> >>>> + * If we don't clear it here, we have TASK_RUNNING + p->blocked_on
> >>>> + * when waking up. Since this is a fully blocked, off CPU task
> >>>> + * waking up, it should be safe to clear the blocked_on relation.
> >>>> + */
> >>>> + if (task_is_blocked(p))
> >>>> + clear_task_blocked_on(p, NULL);
> >>>> +
> >>>
> >>> Aah, yes! This is when find_proxy_task() hits deactivate() for us and we
> >>> skip ttwu_runnable(). We still need to clear ->blocked_on.
> >
> > I wonder, should we have proxy_deactivate() do this instead?
>
> That is one way to tackle that, yes!
OK, lets put it there. At that site we already know task_is_blocked()
and we get less noise in the wakeup path.
Or should we perhaps put it in block_task() itself? The moment you're
off the runqueue, ->blocked_on becomes meaningless.
> >>>
> >>>> cpu = select_task_rq(p, p->wake_cpu, &wake_flags);
> >>>> if (task_cpu(p) != cpu) {
> >>>> if (p->in_iowait) {
> >>>> @@ -6977,6 +7007,10 @@ static void __sched notrace __schedule(int sched_mode)
> >>>> switch_count = &prev->nvcsw;
> >>>> }
> >>>>
> >>>> + /* See: https://github.com/kudureranganath/linux/commit/0d6a01bb19db39f045d6f0f5fb4d196500091637 */
> >>>> + if (!prev_state && task_is_blocked(prev))
> >>>> + clear_task_blocked_on(prev, NULL);
> >>>> +
> >>>
> >>> This one confuses me, ttwu() should never results in ->blocked_on being
> >>> set.
> >>
> >> This is from the signal_pending_state() in try_to_block_task() putting
> >> prev to TASK_RUNNING while still having p->blocked_on.
> >
> > Ooh, I misread that. I saw that set_task_blocked_on_waking(, NULL) in
> > there and though it would clear. Damn, sometimes reading is so very
> > hard...
> >
> > Anyhow, with my changes on, try_to_block_task() is only ever called from
> > __schedule() on current. This means this could in fact be
> > clear_task_blocked_on(), right?
>
> Yes, that should work too but there is also the case you mentioned on
> the parallel thread - if ttwu() doesn't see p->blocked_on, it won't
> clear it and if ttwu_runnable() wins we can simply go into
> __schedule() with TASK_RUNNING + p->blocked_on with no guarantee that
> same task will be picked again. Then the task gets preempted and stays
> in an illegal state.
>
> Clearing of ->blocked_on in schedule also safeguards against that race:
>
> o Scenario 1: ttwu_runnable() wins
>
> CPU0 CPU1
>
> LOCK ACQUIRE
> [W] ->blocked_on = lock [R] ->__state
> [W] ->__state = state; RMB
> [R] ->blocked_on
> [W] ->__state = RUNNING
>
> /* Synchronized by rq_lock */
>
> MB
> [R] if (!->__state &&
> [R] ->blocked_on)
> [W] ->blocked_on = NULL; /* Safeguard */
>
>
> o Scenario 2: __schedule() wins
>
> CPU0 CPU1
>
> LOCK ACQUIRE
> [W] ->blocked_on = lock [R] ->__state
> [W] ->__state = state;
>
> /* Synchronized by rq_lock */
> MB
> [W] __block_task()
>
> MB
> /* Full wakeup. */
> [R] if (->blocked_on)
> [W] ->blocked_on = NULL
>
Oh Boohoo :-( Yes, you're quite right.
Is find_proxy_task() that is affected, so this fixup should
perhaps live in the existing sched_proxy_exec() branch?
Something like so?
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2222,6 +2222,13 @@ static bool block_task(struct rq *rq, st
{
int flags = DEQUEUE_NOCLOCK;
+ /*
+ * We're being taken off the runqueue, cannot still be blocked_on
+ * anything. This also means that delay_dequeue can not have
+ * blocked_on.
+ */
+ clear_task_blocked_on(p, NULL);
+
p->sched_contributes_to_load =
(task_state & TASK_UNINTERRUPTIBLE) &&
!(task_state & TASK_NOLOAD) &&
@@ -6614,7 +6621,7 @@ static bool try_to_block_task(struct rq
if (signal_pending_state(task_state, p)) {
WRITE_ONCE(p->__state, TASK_RUNNING);
*task_state_p = TASK_RUNNING;
- set_task_blocked_on_waking(p, NULL);
+ clear_task_blocked_on(p, NULL);
return false;
}
@@ -7043,6 +7050,14 @@ static void __sched notrace __schedule(i
if (sched_proxy_exec()) {
struct task_struct *prev_donor = rq->donor;
+ /*
+ * There is a race between ttwu() and __mutex_lock_common()
+ * where it is possible for the mutex code to call into
+ * schedule() with ->blocked_on still set.
+ */
+ if (!prev_state && prev->blocked_on)
+ clear_task_blocked_on(prev, NULL);
+
rq_set_donor(rq, next);
if (unlikely(next->blocked_on)) {
next = find_proxy_task(rq, next, &rf);
next prev parent reply other threads:[~2026-04-03 14:38 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-24 19:13 [PATCH v26 00/10] Simple Donor Migration for Proxy Execution John Stultz
2026-03-24 19:13 ` [PATCH v26 01/10] sched: Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr() John Stultz
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 02/10] sched: Minimise repeated sched_proxy_exec() checking John Stultz
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 03/10] sched: Fix potentially missing balancing with Proxy Exec John Stultz
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 04/10] locking: Add task::blocked_lock to serialize blocked_on state John Stultz
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 05/10] sched: Fix modifying donor->blocked on without proper locking John Stultz
2026-03-26 21:45 ` Steven Rostedt
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 06/10] sched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy return-migration John Stultz
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 07/10] sched: Add assert_balance_callbacks_empty helper John Stultz
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 08/10] sched: Add logic to zap balance callbacks if we pick again John Stultz
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 09/10] sched: Move attach_one_task and attach_task helpers to sched.h John Stultz
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-24 19:13 ` [PATCH v26 10/10] sched: Handle blocked-waiter migration (and return migration) John Stultz
2026-03-26 22:52 ` Steven Rostedt
2026-03-27 4:47 ` K Prateek Nayak
2026-03-27 12:47 ` Peter Zijlstra
2026-04-02 14:43 ` Peter Zijlstra
2026-04-02 15:08 ` Peter Zijlstra
2026-04-02 17:43 ` John Stultz
2026-04-02 17:34 ` John Stultz
2026-04-03 12:30 ` [tip: sched/core] " tip-bot2 for John Stultz
2026-03-25 10:52 ` [PATCH v26 00/10] Simple Donor Migration for Proxy Execution K Prateek Nayak
2026-03-27 11:48 ` Peter Zijlstra
2026-03-27 13:33 ` K Prateek Nayak
2026-03-27 15:20 ` Peter Zijlstra
2026-03-27 15:41 ` Peter Zijlstra
2026-03-27 16:00 ` Peter Zijlstra
2026-03-27 16:57 ` K Prateek Nayak
2026-04-02 15:50 ` Peter Zijlstra
2026-04-02 18:31 ` John Stultz
2026-04-02 21:04 ` John Stultz
2026-04-03 6:09 ` K Prateek Nayak
2026-04-03 9:52 ` Peter Zijlstra
2026-04-03 10:25 ` K Prateek Nayak
2026-04-03 11:28 ` Peter Zijlstra
2026-04-03 13:43 ` K Prateek Nayak
2026-04-03 14:38 ` Peter Zijlstra [this message]
2026-04-03 15:39 ` K Prateek Nayak
2026-04-03 21:08 ` Peter Zijlstra
2026-04-04 0:26 ` John Stultz
2026-04-04 5:49 ` K Prateek Nayak
2026-04-04 6:07 ` John Stultz
2026-04-06 2:40 ` K Prateek Nayak
2026-04-03 12:54 ` Peter Zijlstra
2026-04-03 9:18 ` Peter Zijlstra
2026-03-27 19:15 ` John Stultz
2026-03-27 19:10 ` John Stultz
2026-03-28 4:53 ` K Prateek Nayak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260403143819.GC2872@noisy.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=Metin.Kaya@arm.com \
--cc=boqun.feng@gmail.com \
--cc=bsegall@google.com \
--cc=daniel.lezcano@linaro.org \
--cc=dietmar.eggemann@arm.com \
--cc=hupu.gm@gmail.com \
--cc=joelagnelf@nvidia.com \
--cc=jstultz@google.com \
--cc=juri.lelli@redhat.com \
--cc=kernel-team@android.com \
--cc=kprateek.nayak@amd.com \
--cc=kuyo.chang@mediatek.com \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=paulmck@kernel.org \
--cc=qyousef@layalina.io \
--cc=rostedt@goodmis.org \
--cc=suleiman@google.com \
--cc=tglx@linutronix.de \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
--cc=will@kernel.org \
--cc=xuewen.yan94@gmail.com \
--cc=zezeozue@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox