From: John Stultz <jstultz@google.com>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Valentin Schneider <valentin.schneider@arm.com>,
"Connor O'Brien" <connoro@google.com>,
John Stultz <jstultz@google.com>,
Joel Fernandes <joelagnelf@nvidia.com>,
Qais Yousef <qyousef@layalina.io>,
Ingo Molnar <mingo@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Valentin Schneider <vschneid@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>,
Zimuzo Ezeozue <zezeozue@google.com>,
Will Deacon <will@kernel.org>, Waiman Long <longman@redhat.com>,
Boqun Feng <boqun.feng@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
Metin Kaya <Metin.Kaya@arm.com>,
Xuewen Yan <xuewen.yan94@gmail.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Thomas Gleixner <tglx@linutronix.de>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Suleiman Souhlal <suleiman@google.com>,
kuyo chang <kuyo.chang@mediatek.com>, hupu <hupu.gm@gmail.com>,
kernel-team@android.com
Subject: [PATCH v28 6/8] sched: Add blocked_donor link to task for smarter mutex handoffs
Date: Wed, 22 Apr 2026 23:06:47 +0000 [thread overview]
Message-ID: <20260422230659.903191-7-jstultz@google.com> (raw)
In-Reply-To: <20260422230659.903191-1-jstultz@google.com>
From: Peter Zijlstra <peterz@infradead.org>
Add link to the task this task is proxying for, and use it so
the mutex owner can do an intelligent hand-off of the mutex to
the task that the owner is running on behalf.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Connor O'Brien <connoro@google.com>
[jstultz: This patch was split out from larger proxy patch]
Signed-off-by: John Stultz <jstultz@google.com>
---
v5:
* Split out from larger proxy patch
v6:
* Moved proxied value from earlier patch to this one where it
is actually used
* Rework logic to check sched_proxy_exec() instead of using ifdefs
* Moved comment change to this patch where it makes sense
v7:
* Use more descriptive term then "us" in comments, as suggested
by Metin Kaya.
* Minor typo fixup from Metin Kaya
* Reworked proxied variable to prev_not_proxied to simplify usage
v8:
* Use helper for donor blocked_on_state transition
v9:
* Re-add mutex lock handoff in the unlock path, but only when we
have a blocked donor
* Slight reword of commit message suggested by Metin
v18:
* Add task_init initialization for blocked_donor, suggested by
Suleiman
v23:
* Reworks for PROXY_WAKING approach suggested by PeterZ
v25:
* Simplified some logic now we don't have proxy_tag_curr()
v28:
* Remove sched_proxy_exec() conditionalized blocked_lock usage
as it was hitting llvm's "'blocked_lock' is not held on every
path through here [-Wthread-safety-analysis]" errors.
Cc: Joel Fernandes <joelagnelf@nvidia.com>
Cc: Qais Yousef <qyousef@layalina.io>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Zimuzo Ezeozue <zezeozue@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Metin Kaya <Metin.Kaya@arm.com>
Cc: Xuewen Yan <xuewen.yan94@gmail.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: kuyo chang <kuyo.chang@mediatek.com>
Cc: hupu <hupu.gm@gmail.com>
Cc: kernel-team@android.com
Fixup for llvm 'blocked_lock' is not held on every path through here [-Wthread-safety-analysis]
Clang is fussy so lets not do conditional locking
---
include/linux/sched.h | 1 +
init/init_task.c | 1 +
kernel/fork.c | 1 +
kernel/locking/mutex.c | 43 +++++++++++++++++++++++++++++++++++++++---
kernel/sched/core.c | 14 +++++++++++++-
5 files changed, 56 insertions(+), 4 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5b68a1c9eedcf..65791d7359553 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1239,6 +1239,7 @@ struct task_struct {
#endif
struct mutex *blocked_on; /* lock we're blocked on */
+ struct task_struct *blocked_donor; /* task that is boosting this task */
raw_spinlock_t blocked_lock;
#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
diff --git a/init/init_task.c b/init/init_task.c
index b5f48ebdc2b6e..41c19670c8f6b 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -200,6 +200,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
.mems_allowed_seq = SEQCNT_SPINLOCK_ZERO(init_task.mems_allowed_seq,
&init_task.alloc_lock),
#endif
+ .blocked_donor = NULL,
#ifdef CONFIG_RT_MUTEXES
.pi_waiters = RB_ROOT_CACHED,
.pi_top_task = NULL,
diff --git a/kernel/fork.c b/kernel/fork.c
index f1ad69c6dc2d4..c1d65b86729d6 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2216,6 +2216,7 @@ __latent_entropy struct task_struct *copy_process(
lockdep_init_task(p);
p->blocked_on = NULL; /* not blocked yet */
+ p->blocked_donor = NULL; /* nobody is boosting p yet */
#ifdef CONFIG_BCACHE
p->sequential_io = 0;
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 6e6699d9fcba6..333f33d33ef06 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -976,7 +976,7 @@ EXPORT_SYMBOL_GPL(ww_mutex_lock_interruptible);
static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigned long ip)
__releases(lock)
{
- struct task_struct *next = NULL;
+ struct task_struct *donor, *next = NULL;
struct mutex_waiter *waiter;
DEFINE_WAKE_Q(wake_q);
unsigned long owner;
@@ -997,6 +997,12 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
MUTEX_WARN_ON(__owner_task(owner) != current);
MUTEX_WARN_ON(owner & MUTEX_FLAG_PICKUP);
+ if (sched_proxy_exec() && current->blocked_donor) {
+ /* force handoff if we have a blocked_donor */
+ owner = MUTEX_FLAG_HANDOFF;
+ break;
+ }
+
if (owner & MUTEX_FLAG_HANDOFF)
break;
@@ -1009,19 +1015,50 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
}
raw_spin_lock_irqsave(&lock->wait_lock, flags);
+ raw_spin_lock(¤t->blocked_lock);
debug_mutex_unlock(lock);
+
+ if (sched_proxy_exec()) {
+ /*
+ * If we have a task boosting current, and that task was boosting
+ * current through this lock, hand the lock to that task, as that
+ * is the highest waiter, as selected by the scheduling function.
+ */
+ donor = current->blocked_donor;
+ if (donor) {
+ struct mutex *next_lock;
+
+ raw_spin_lock_nested(&donor->blocked_lock, SINGLE_DEPTH_NESTING);
+ next_lock = __get_task_blocked_on(donor);
+ if (next_lock == lock) {
+ next = donor;
+ __set_task_blocked_on_waking(donor, next_lock);
+ wake_q_add(&wake_q, donor);
+ current->blocked_donor = NULL;
+ }
+ raw_spin_unlock(&donor->blocked_lock);
+ }
+ }
+
+ /*
+ * Failing that, pick first on the wait list.
+ */
waiter = lock->first_waiter;
- if (waiter) {
+ if (!next && waiter) {
next = waiter->task;
+ raw_spin_lock_nested(&next->blocked_lock, SINGLE_DEPTH_NESTING);
debug_mutex_wake_waiter(lock, waiter);
- set_task_blocked_on_waking(next, lock);
+ __set_task_blocked_on_waking(next, lock);
+ raw_spin_unlock(&next->blocked_lock);
wake_q_add(&wake_q, next);
+
}
if (owner & MUTEX_FLAG_HANDOFF)
__mutex_handoff(lock, next);
+ raw_spin_unlock(¤t->blocked_lock);
raw_spin_unlock_irqrestore_wake(&lock->wait_lock, flags, &wake_q);
}
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 17797e1f76f25..edd418dde1e47 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6809,7 +6809,17 @@ static void proxy_migrate_task(struct rq *rq, struct rq_flags *rf,
* Find runnable lock owner to proxy for mutex blocked donor
*
* Follow the blocked-on relation:
- * task->blocked_on -> mutex->owner -> task...
+ *
+ * ,-> task
+ * | | blocked-on
+ * | v
+ * blocked_donor | mutex
+ * | | owner
+ * | v
+ * `-- task
+ *
+ * and set the blocked_donor relation, this latter is used by the mutex
+ * code to find which (blocked) task to hand-off to.
*
* Lock order:
*
@@ -6949,6 +6959,7 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
* rq, therefore holding @rq->lock is sufficient to
* guarantee its existence, as per ttwu_remote().
*/
+ owner->blocked_donor = p;
}
WARN_ON_ONCE(owner && !owner->on_rq);
return owner;
@@ -7105,6 +7116,7 @@ static void __sched notrace __schedule(int sched_mode)
clear_task_blocked_on(prev, NULL);
rq_set_donor(rq, next);
+ next->blocked_donor = NULL;
if (unlikely(next->blocked_on)) {
next = find_proxy_task(rq, next, &rf);
if (!next) {
--
2.54.0.rc2.533.g4f5dca5207-goog
next prev parent reply other threads:[~2026-04-22 23:07 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-22 23:06 [PATCH v28 0/8] Optimized Donor Migration for Proxy Execution John Stultz
2026-04-22 23:06 ` [PATCH v28 1/8] sched: Rework pick_next_task() and prev_balance() to avoid stale prev references John Stultz
2026-04-22 23:06 ` [PATCH v28 2/8] sched: deadline: Add some helper variables to cleanup deadline logic John Stultz
2026-04-22 23:06 ` [PATCH v28 3/8] sched: deadline: Add dl_rq->curr pointer to address issues with Proxy Exec John Stultz
2026-04-22 23:06 ` [PATCH v28 4/8] sched: Rework block_task so it can be directly called John Stultz
2026-04-22 23:06 ` [PATCH v28 5/8] sched: Have try_to_wake_up() handle return-migration for PROXY_WAKING case John Stultz
2026-04-22 23:06 ` John Stultz [this message]
2026-04-22 23:06 ` [PATCH v28 7/8] sched: Break out core of attach_tasks() helper into sched.h John Stultz
2026-04-22 23:06 ` [PATCH v28 8/8] sched: Migrate whole chain in proxy_migrate_task() John Stultz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260422230659.903191-7-jstultz@google.com \
--to=jstultz@google.com \
--cc=Metin.Kaya@arm.com \
--cc=boqun.feng@gmail.com \
--cc=bsegall@google.com \
--cc=connoro@google.com \
--cc=daniel.lezcano@linaro.org \
--cc=dietmar.eggemann@arm.com \
--cc=hupu.gm@gmail.com \
--cc=joelagnelf@nvidia.com \
--cc=juri.lelli@redhat.com \
--cc=kernel-team@android.com \
--cc=kprateek.nayak@amd.com \
--cc=kuyo.chang@mediatek.com \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mingo@redhat.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=qyousef@layalina.io \
--cc=rostedt@goodmis.org \
--cc=suleiman@google.com \
--cc=tglx@linutronix.de \
--cc=valentin.schneider@arm.com \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
--cc=will@kernel.org \
--cc=xuewen.yan94@gmail.com \
--cc=zezeozue@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.