From: John Stultz <jstultz@google.com>
To: LKML <linux-kernel@vger.kernel.org>
Cc: John Stultz <jstultz@google.com>,
Vineeth Pillai <vineethrp@google.com>,
Sonam Sanju <sonam.sanju@intel.com>,
Sean Christopherson <seanjc@google.com>,
Kunwu Chan <kunwu.chan@linux.dev>, Tejun Heo <tj@kernel.org>,
Joel Fernandes <joelagnelf@nvidia.com>,
Qais Yousef <qyousef@layalina.io>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Valentin Schneider <vschneid@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Will Deacon <will@kernel.org>, Waiman Long <longman@redhat.com>,
Boqun Feng <boqun.feng@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
Metin Kaya <Metin.Kaya@arm.com>,
Xuewen Yan <xuewen.yan94@gmail.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Thomas Gleixner <tglx@linutronix.de>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Suleiman Souhlal <suleiman@google.com>,
kuyo chang <kuyo.chang@mediatek.com>, hupu <hupu.gm@gmail.com>,
kernel-team@android.com
Subject: [PATCH 2/2] locking: mutex: Fix proxy-exec potentially deactivating tasks marked TASK_RUNNING
Date: Mon, 27 Apr 2026 18:38:41 +0000 [thread overview]
Message-ID: <20260427183848.698551-3-jstultz@google.com> (raw)
In-Reply-To: <20260427183848.698551-1-jstultz@google.com>
Vineeth found came up with a test driver that could trip up
workqueue stalls. After fixing one issue this test found,
Vineeth reported the test was still failing.
Greatly simplified, a task that tries to take a mutex already
owned by another task that is sleeping, can hit a edge case in
the mutex_lock_common() case.
If the task fails to get the lock, calls into schedule, but gets
a spurious wakeup, it will find that it is first waiter, and
go into the mutex_optimistic_spin() logic. Though before calling
mutex_optimistic_spin(), we clear task blocked_on state, since
mutex_optimistic_spin() may call schedule() if need_resched() is
set.
After mutex_optimistic_spin() fails, we set blocked_on again,
restart the main mutex loop, try to take the lock and call into
schedule_preempt_disabled().
From there, with proxy-execution, we'll see the task is
blocked_on, follow the chain, see the owner is sleeping and
dequeue the waiting task from the runqueue.
This all sounds fine and reasonable. But what I had missed is
that in mutex_optimistic_spin(), not only do we call schedule()
but we set TASK_RUNNABLE right before doing so.
This is ok for that invocation of schedule(). But when we come
back we re-set the blocked_on we had just cleared, but we do not
re-set the task state to TASK_INTERRUPTIBLE/UNINTERRUPTIBLE.
This means we have a task that is blocked_on & TASK_RUNNABLE,
so when the proxy execution code dequeues the task, we are
in trouble since future wakeups will be shortcut by the
ttwu_state_match() check.
Thus, to avoid this, after mutex_optimistic_spin(), set the task
state back when we set blocked_on.
Many many thanks again to Vineeth for his very useful testing
driver that uncovered this long hidden bug, that I hadn't
tripped in all my testing! Very impressed with the problems he's
uncovered!
Reported-by: Vineeth Pillai <vineethrp@google.com>
Tested-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: John Stultz <jstultz@google.com>
---
Cc: Vineeth Pillai <vineethrp@google.com>
Cc: Sonam Sanju <sonam.sanju@intel.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Kunwu Chan <kunwu.chan@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Joel Fernandes <joelagnelf@nvidia.com>
Cc: Qais Yousef <qyousef@layalina.io>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Metin Kaya <Metin.Kaya@arm.com>
Cc: Xuewen Yan <xuewen.yan94@gmail.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: kuyo chang <kuyo.chang@mediatek.com>
Cc: hupu <hupu.gm@gmail.com>
Cc: kernel-team@android.com
---
kernel/locking/mutex.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 09534628dc01a..a93d4c6bee1a3 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -763,6 +763,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
raw_spin_lock_irqsave(&lock->wait_lock, flags);
raw_spin_lock(¤t->blocked_lock);
__set_task_blocked_on(current, lock);
+ set_current_state(state);
if (opt_acquired)
break;
--
2.54.0.545.g6539524ca2-goog
next prev parent reply other threads:[~2026-04-27 18:38 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-27 18:38 [PATCH 0/2] Proxy Execution fixes for v7.1-rc John Stultz
2026-04-27 18:38 ` [PATCH 1/2] sched: proxy-exec: Close race causing workqueue work being delayed John Stultz
2026-04-28 8:06 ` K Prateek Nayak
2026-04-28 9:43 ` Peter Zijlstra
2026-04-28 11:18 ` Peter Zijlstra
2026-04-28 13:15 ` K Prateek Nayak
2026-04-28 14:12 ` K Prateek Nayak
2026-04-28 16:50 ` Peter Zijlstra
2026-04-29 2:27 ` John Stultz
2026-04-29 8:59 ` K Prateek Nayak
2026-04-30 5:44 ` John Stultz
2026-04-30 5:47 ` John Stultz
2026-04-30 7:25 ` K Prateek Nayak
2026-04-30 21:05 ` John Stultz
2026-04-30 20:40 ` John Stultz
2026-05-01 5:57 ` K Prateek Nayak
2026-04-27 18:38 ` John Stultz [this message]
2026-04-28 8:16 ` [PATCH 2/2] locking: mutex: Fix proxy-exec potentially deactivating tasks marked TASK_RUNNING K Prateek Nayak
2026-04-28 19:50 ` John Stultz
2026-04-30 9:56 ` [PATCH 0/2] Proxy Execution fixes for v7.1-rc Kunwu Chan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260427183848.698551-3-jstultz@google.com \
--to=jstultz@google.com \
--cc=Metin.Kaya@arm.com \
--cc=boqun.feng@gmail.com \
--cc=daniel.lezcano@linaro.org \
--cc=dietmar.eggemann@arm.com \
--cc=hupu.gm@gmail.com \
--cc=joelagnelf@nvidia.com \
--cc=juri.lelli@redhat.com \
--cc=kernel-team@android.com \
--cc=kprateek.nayak@amd.com \
--cc=kunwu.chan@linux.dev \
--cc=kuyo.chang@mediatek.com \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mingo@redhat.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=qyousef@layalina.io \
--cc=rostedt@goodmis.org \
--cc=seanjc@google.com \
--cc=sonam.sanju@intel.com \
--cc=suleiman@google.com \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=vincent.guittot@linaro.org \
--cc=vineethrp@google.com \
--cc=vschneid@redhat.com \
--cc=will@kernel.org \
--cc=xuewen.yan94@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.