From: John Stultz <jstultz@google.com>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Valentin Schneider <valentin.schneider@arm.com>,
"Connor O'Brien" <connoro@google.com>,
John Stultz <jstultz@google.com>,
Joel Fernandes <joelagnelf@nvidia.com>,
Qais Yousef <qyousef@layalina.io>,
Ingo Molnar <mingo@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Valentin Schneider <vschneid@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>,
Zimuzo Ezeozue <zezeozue@google.com>,
Mel Gorman <mgorman@suse.de>, Will Deacon <will@kernel.org>,
Waiman Long <longman@redhat.com>,
Boqun Feng <boqun.feng@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
Metin Kaya <Metin.Kaya@arm.com>,
Xuewen Yan <xuewen.yan94@gmail.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Thomas Gleixner <tglx@linutronix.de>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Suleiman Souhlal <suleiman@google.com>,
kuyo chang <kuyo.chang@mediatek.com>, hupu <hupu.gm@gmail.com>,
kernel-team@android.com
Subject: [PATCH v27 08/10] sched: Add blocked_donor link to task for smarter mutex handoffs
Date: Sat, 4 Apr 2026 05:36:25 +0000
Message-ID: <20260404053632.1729280-9-jstultz@google.com>
In-Reply-To: <20260404053632.1729280-1-jstultz@google.com>
From: Peter Zijlstra <peterz@infradead.org>
Add a link to the task this task is proxying for, and use it so
the mutex owner can do an intelligent hand-off of the mutex to
the task on whose behalf the owner is running.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Connor O'Brien <connoro@google.com>
[jstultz: This patch was split out from larger proxy patch]
Signed-off-by: John Stultz <jstultz@google.com>
---
v5:
* Split out from larger proxy patch
v6:
* Moved proxied value from earlier patch to this one where it
is actually used
* Rework logic to check sched_proxy_exec() instead of using ifdefs
* Moved comment change to this patch where it makes sense
v7:
* Use a more descriptive term than "us" in comments, as suggested
by Metin Kaya.
* Minor typo fixup from Metin Kaya
* Reworked proxied variable to prev_not_proxied to simplify usage
v8:
* Use helper for donor blocked_on_state transition
v9:
* Re-add mutex lock handoff in the unlock path, but only when we
have a blocked donor
* Slight reword of commit message suggested by Metin
v18:
* Add task_init initialization for blocked_donor, suggested by
Suleiman
v23:
* Reworks for PROXY_WAKING approach suggested by PeterZ
v25:
* Simplified some logic now that we don't have proxy_tag_curr()
Cc: Joel Fernandes <joelagnelf@nvidia.com>
Cc: Qais Yousef <qyousef@layalina.io>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Zimuzo Ezeozue <zezeozue@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Metin Kaya <Metin.Kaya@arm.com>
Cc: Xuewen Yan <xuewen.yan94@gmail.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: kuyo chang <kuyo.chang@mediatek.com>
Cc: hupu <hupu.gm@gmail.com>
Cc: kernel-team@android.com
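For readers skimming the series, the hand-off choice made in
__mutex_unlock_slowpath() can be sketched as a small userspace model.
This is only an illustration under stated assumptions, not the kernel
code: struct task_model, struct mutex_model and pick_handoff() are
hypothetical names, the wait list is a fixed array, and all locking
(wait_lock, blocked_lock) and the PROXY_WAKING state transition are
left out.

```c
#include <assert.h>
#include <stddef.h>

struct mutex_model;

/* Hypothetical, simplified stand-in for task_struct. */
struct task_model {
	const char *name;
	struct mutex_model *blocked_on;   /* lock this task is waiting for */
	struct task_model *blocked_donor; /* task that is boosting this task */
};

/* Hypothetical stand-in for struct mutex with a tiny FIFO wait list. */
struct mutex_model {
	struct task_model *owner;
	struct task_model *waiters[4];
	size_t nr_waiters;
};

/*
 * Choose the task that @lock is handed to when @cur unlocks it.
 * Prefer cur->blocked_donor, but only if that donor is blocked on
 * this very lock; otherwise fall back to the first waiter.
 * Returns NULL when nobody is waiting.
 */
static struct task_model *pick_handoff(struct mutex_model *lock,
				       struct task_model *cur)
{
	struct task_model *next = NULL;
	struct task_model *donor = cur->blocked_donor;

	if (donor && donor->blocked_on == lock) {
		next = donor;
		/* The donor link is consumed once the hand-off is chosen. */
		cur->blocked_donor = NULL;
	}

	/* Failing that, pick the head of the wait list. */
	if (!next && lock->nr_waiters > 0)
		next = lock->waiters[0];

	return next;
}
```

In this model a donor blocked on some other lock is left in place in
cur->blocked_donor and the head of the wait list is chosen instead,
which mirrors the next_lock == lock check in the patch.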
---
include/linux/sched.h | 1 +
init/init_task.c | 1 +
kernel/fork.c | 1 +
kernel/locking/mutex.c | 44 +++++++++++++++++++++++++++++++++++++++---
kernel/sched/core.c | 14 +++++++++++++-
5 files changed, 57 insertions(+), 4 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 3ae1330801157..18665b4b973e2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1238,6 +1238,7 @@ struct task_struct {
#endif
struct mutex *blocked_on; /* lock we're blocked on */
+ struct task_struct *blocked_donor; /* task that is boosting this task */
raw_spinlock_t blocked_lock;
#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
diff --git a/init/init_task.c b/init/init_task.c
index b5f48ebdc2b6e..41c19670c8f6b 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -200,6 +200,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
.mems_allowed_seq = SEQCNT_SPINLOCK_ZERO(init_task.mems_allowed_seq,
&init_task.alloc_lock),
#endif
+ .blocked_donor = NULL,
#ifdef CONFIG_RT_MUTEXES
.pi_waiters = RB_ROOT_CACHED,
.pi_top_task = NULL,
diff --git a/kernel/fork.c b/kernel/fork.c
index 079802cb61002..a3d2cd4395791 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2178,6 +2178,7 @@ __latent_entropy struct task_struct *copy_process(
lockdep_init_task(p);
p->blocked_on = NULL; /* not blocked yet */
+ p->blocked_donor = NULL; /* nobody is boosting p yet */
#ifdef CONFIG_BCACHE
p->sequential_io = 0;
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 7d359647156df..65f0f35b88972 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -942,7 +942,7 @@ EXPORT_SYMBOL_GPL(ww_mutex_lock_interruptible);
*/
static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigned long ip)
{
- struct task_struct *next = NULL;
+ struct task_struct *donor, *next = NULL;
DEFINE_WAKE_Q(wake_q);
unsigned long owner;
unsigned long flags;
@@ -961,6 +961,12 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
MUTEX_WARN_ON(__owner_task(owner) != current);
MUTEX_WARN_ON(owner & MUTEX_FLAG_PICKUP);
+ if (sched_proxy_exec() && current->blocked_donor) {
+ /* force handoff if we have a blocked_donor */
+ owner = MUTEX_FLAG_HANDOFF;
+ break;
+ }
+
if (owner & MUTEX_FLAG_HANDOFF)
break;
@@ -974,7 +980,34 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
raw_spin_lock_irqsave(&lock->wait_lock, flags);
debug_mutex_unlock(lock);
- if (!list_empty(&lock->wait_list)) {
+
+ if (sched_proxy_exec()) {
+ raw_spin_lock(&current->blocked_lock);
+ /*
+ * If we have a task boosting current, and that task was boosting
+ * current through this lock, hand the lock to that task, as that
+ * is the highest waiter, as selected by the scheduling function.
+ */
+ donor = current->blocked_donor;
+ if (donor) {
+ struct mutex *next_lock;
+
+ raw_spin_lock_nested(&donor->blocked_lock, SINGLE_DEPTH_NESTING);
+ next_lock = __get_task_blocked_on(donor);
+ if (next_lock == lock) {
+ next = donor;
+ __set_task_blocked_on_waking(donor, next_lock);
+ wake_q_add(&wake_q, donor);
+ current->blocked_donor = NULL;
+ }
+ raw_spin_unlock(&donor->blocked_lock);
+ }
+ }
+
+ /*
+ * Failing that, pick any on the wait list.
+ */
+ if (!next && !list_empty(&lock->wait_list)) {
/* get the first entry from the wait-list: */
struct mutex_waiter *waiter =
list_first_entry(&lock->wait_list,
@@ -982,14 +1015,19 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
next = waiter->task;
+ raw_spin_lock_nested(&next->blocked_lock, SINGLE_DEPTH_NESTING);
debug_mutex_wake_waiter(lock, waiter);
- set_task_blocked_on_waking(next, lock);
+ __set_task_blocked_on_waking(next, lock);
+ raw_spin_unlock(&next->blocked_lock);
wake_q_add(&wake_q, next);
+
}
if (owner & MUTEX_FLAG_HANDOFF)
__mutex_handoff(lock, next);
+ if (sched_proxy_exec())
+ raw_spin_unlock(&current->blocked_lock);
raw_spin_unlock_irqrestore_wake(&lock->wait_lock, flags, &wake_q);
}
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a0d55225a62c3..9197b4274de8c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6753,7 +6753,17 @@ static void proxy_migrate_task(struct rq *rq, struct rq_flags *rf,
* Find runnable lock owner to proxy for mutex blocked donor
*
* Follow the blocked-on relation:
- * task->blocked_on -> mutex->owner -> task...
+ *
+ * ,-> task
+ * | | blocked-on
+ * | v
+ * blocked_donor | mutex
+ * | | owner
+ * | v
+ * `-- task
+ *
+ * and set the blocked_donor relation; the latter is used by the mutex
+ * code to find which (blocked) task to hand-off to.
*
* Lock order:
*
@@ -6893,6 +6903,7 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
* rq, therefore holding @rq->lock is sufficient to
* guarantee its existence, as per ttwu_remote().
*/
+ owner->blocked_donor = p;
}
WARN_ON_ONCE(owner && !owner->on_rq);
return owner;
@@ -7050,6 +7061,7 @@ static void __sched notrace __schedule(int sched_mode)
clear_task_blocked_on(prev, NULL);
rq_set_donor(rq, next);
+ next->blocked_donor = NULL;
if (unlikely(next->blocked_on)) {
next = find_proxy_task(rq, next, &rf);
if (!next) {
--
2.53.0.1213.gd9a14994de-goog
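As a reading aid for the find_proxy_task() comment, the blocked-on walk
and the blocked_donor back-link can be modelled in userspace roughly as
follows. This is a sketch under assumptions, not the scheduler code:
struct task_node, struct lock_model and find_runnable_owner() are
hypothetical names, the chain is assumed to be acyclic, and rq locking,
migration and the on_rq checks are all omitted.

```c
#include <assert.h>
#include <stddef.h>

struct lock_model;

/* Hypothetical stand-in for the two task_struct fields involved. */
struct task_node {
	struct lock_model *blocked_on;   /* lock this task is waiting for */
	struct task_node *blocked_donor; /* task that is boosting this task */
};

/* Hypothetical stand-in for a mutex: only the owner matters here. */
struct lock_model {
	struct task_node *owner;
};

/*
 * Follow task->blocked_on -> lock->owner -> task ... starting at
 * @donor until a task that is not blocked is reached. On the way,
 * point each owner's blocked_donor back at the task waiting on it.
 * Returns NULL if a lock in the chain has no owner (the model simply
 * gives up in that case). Assumes the chain has no cycles.
 */
static struct task_node *find_runnable_owner(struct task_node *donor)
{
	struct task_node *p = donor;

	while (p->blocked_on) {
		struct task_node *owner = p->blocked_on->owner;

		if (!owner)
			return NULL;
		owner->blocked_donor = p;
		p = owner;
	}
	return p;
}
```

With a chain A -> m1 -> B -> m2 -> C, the walk returns C and leaves
C's blocked_donor pointing at B, which is the link the unlock path
later consults when C releases m2.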
2026-04-04 5:36 [PATCH v27 00/10] Optimized Donor Migration for Proxy Execution John Stultz
2026-04-04 5:36 ` [PATCH v27 01/10] sched: Rework pick_next_task() and prev_balance() to avoid stale prev references John Stultz
2026-04-04 5:36 ` [PATCH v27 02/10] sched: Avoid donor->sched_class->yield_task() null traversal John Stultz
2026-04-04 5:57 ` K Prateek Nayak
2026-04-04 6:09 ` John Stultz
2026-04-04 5:36 ` [PATCH v27 03/10] sched: deadline: Add some helper variables to cleanup deadline logic John Stultz
2026-04-04 5:36 ` [PATCH v27 04/10] sched: deadline: Add dl_rq->curr pointer to address issues with Proxy Exec John Stultz
2026-04-04 5:36 ` [PATCH v27 05/10] sched: Rework block_task so it can be directly called John Stultz
2026-04-04 5:36 ` [PATCH v27 06/10] sched: Have try_to_wake_up() handle return-migration for PROXY_WAKING case John Stultz
2026-04-04 5:36 ` [PATCH v27 07/10] sched/core: Reset the donor to current task when donor is woken John Stultz
2026-04-04 5:36 ` John Stultz [this message]
2026-04-04 5:36 ` [PATCH v27 09/10] sched: Break out core of attach_tasks() helper into sched.h John Stultz
2026-04-04 5:36 ` [PATCH v27 10/10] sched: Migrate whole chain in proxy_migrate_task() John Stultz