* [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution
@ 2025-11-17 18:55 K Prateek Nayak
  2025-11-17 18:55 ` [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI K Prateek Nayak
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-17 18:55 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Suren Baghdasaryan, linux-kernel
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak

When booting into a kernel with CONFIG_SCHED_PROXY_EXEC and CONFIG_PSI,
an inconsistent task state warning was noticed soon after boot, similar
to:

    psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=0 set=4

On analysis, the following sequence of events was found to be the cause
of the splat:

o Blocked task is retained on the runqueue.
o psi_sched_switch() sees task_on_rq_queued() and retains the runnable
  signals for the task.
o Task blocks later via proxy_deactivate() but psi_dequeue() doesn't
  adjust the PSI flags since DEQUEUE_SLEEP is set, expecting
  psi_sched_switch() to fix the signals.
o The blocked task is woken up with the PSI state still reflecting that
  the task is runnable (TSK_RUNNING) leading to the splat.


Simply tracking proxy_deactivate() is not enough since the task's
blocked_on relationship can be cleared remotely without acquiring the
runqueue lock, which can force a blocked task to run before a wakeup:
pick_next_task() picks the blocked donor and, since the blocked_on
relationship was cleared remotely, task_is_blocked() returns false,
leading to the task being run on the CPU.

If the task blocks again before it is woken up, psi_sched_switch() will
try to clear the runnable signals (TSK_RUNNING) unconditionally leading
to a different splat similar to:

    psi: inconsistent task state! task=... cpu=... psi_flags=10 clear=14 set=0
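
Both warnings quoted above come from the same consistency check in
psi_flags_change(). The following is a minimal userspace sketch of that
check, not the kernel code: "struct psi_task_model" and the helper names
are hypothetical, and the flag values (TSK_RUNNING = 0x4,
TSK_ONCPU = 0x10) are inferred from the hexadecimal logs in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values inferred from the hexadecimal logs in this thread. */
#define TSK_RUNNING	0x04u
#define TSK_ONCPU	0x10u

/* Hypothetical stand-in for the PSI state carried by a task. */
struct psi_task_model {
	unsigned int psi_flags;
};

/*
 * Same condition that guards the "inconsistent task state!" warning:
 * a flag being set is already set, or a flag being cleared is not set.
 */
static inline bool psi_change_is_inconsistent(const struct psi_task_model *t,
					      unsigned int clear,
					      unsigned int set)
{
	return (t->psi_flags & set) || (t->psi_flags & clear) != clear;
}

/* Apply the change; return true if it would have triggered the splat. */
static inline bool psi_model_change(struct psi_task_model *t,
				    unsigned int clear, unsigned int set)
{
	bool bad = psi_change_is_inconsistent(t, clear, set);

	t->psi_flags &= ~clear;
	t->psi_flags |= set;
	return bad;
}
```

With psi_flags=4, set=4 the first term is true (TSK_RUNNING is set
twice). With psi_flags=10, clear=14 the second term is true (TSK_RUNNING
is cleared without having been set).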


To get around this, track the complete lifecycle of a blocked donor
right from delaying the deactivation to the wakeup. When in
blocked/donor state, PSI will consider these tasks similar to delayed
tasks - blocked but migratable.

When ttwu_runnable() finally wakes up the task, or if the donor is
deactivated via proxy_deactivate(), the proxy indicator is cleared to
show that the task is either fully blocked or fully runnable now.

Patches 1 and 2 are cleanups to make life slightly easier when auditing
the implementation and inspecting the debug logs. Patches 3 to 5
implement the tracking of donor states and a couple of fixes on top.

Series was tested on top of tip:sched/core for a while running
sched-messaging without observing any inconsistent task state warning
and should apply cleanly on top of:

    git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core

at commit 33cf66d88306 ("sched/fair: Proportional newidle balance").

---
K Prateek Nayak (5):
  sched/psi: Make psi stubs consistent for !CONFIG_PSI
  sched/psi: Prepend "0x" to format specifiers when printing PSI flags
  sched/core: Track blocked tasks retained on rq for proxy
  sched/core: Block proxy task on pick when blocked_on is cleared before
    wakeup
  sched/psi: Fix PSI signals of blocked tasks retained for proxy

 include/linux/sched.h |  4 +++
 kernel/sched/core.c   | 59 +++++++++++++++++++++++++++++++++++++++++--
 kernel/sched/psi.c    |  4 +--
 kernel/sched/sched.h  |  2 ++
 kernel/sched/stats.h  |  6 ++---
 5 files changed, 68 insertions(+), 7 deletions(-)


base-commit: 33cf66d88306663d16e4759e9d24766b0aaa2e17
-- 
2.34.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI
  2025-11-17 18:55 [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution K Prateek Nayak
@ 2025-11-17 18:55 ` K Prateek Nayak
  2025-11-18  1:06   ` John Stultz
                     ` (2 more replies)
  2025-11-17 18:55 ` [PATCH 2/5] sched/psi: Prepend "0x" to format specifiers when printing PSI flags K Prateek Nayak
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-17 18:55 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Suren Baghdasaryan, linux-kernel
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak

commit 1a6151017ee5 ("sched: psi: pass enqueue/dequeue flags to psi
callbacks directly") modified the psi_enqueue() and psi_dequeue()
functions to take the complete enqueue/dequeue flags but left the stubs
for !CONFIG_PSI unaltered.

Modify the stubs to also accept the flags argument to keep it consistent
with CONFIG_PSI.

Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/stats.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index cbf7206b3f9d..3323d773fec3 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -221,8 +221,8 @@ static inline void psi_sched_switch(struct task_struct *prev,
 }
 
 #else /* !CONFIG_PSI: */
-static inline void psi_enqueue(struct task_struct *p, bool migrate) {}
-static inline void psi_dequeue(struct task_struct *p, bool migrate) {}
+static inline void psi_enqueue(struct task_struct *p, int flags) {}
+static inline void psi_dequeue(struct task_struct *p, int flags) {}
 static inline void psi_ttwu_dequeue(struct task_struct *p) {}
 static inline void psi_sched_switch(struct task_struct *prev,
 				    struct task_struct *next,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/5] sched/psi: Prepend "0x" to format specifiers when printing PSI flags
  2025-11-17 18:55 [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution K Prateek Nayak
  2025-11-17 18:55 ` [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI K Prateek Nayak
@ 2025-11-17 18:55 ` K Prateek Nayak
  2025-11-18  1:08   ` John Stultz
  2025-12-02 14:33   ` Johannes Weiner
  2025-11-17 18:55 ` [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy K Prateek Nayak
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-17 18:55 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Suren Baghdasaryan, linux-kernel
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak

It is not immediately clear that the PSI flags, the set, and the clear
bits printed in PSI warnings are hexadecimal values. Prepend "0x" to
format specifiers to make it clear.

Since "kernel/sched" uses "0x%x" as opposed to "%#x" when printing
hexadecimal values, the same convention is followed here for consistency.
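
For readers comparing the three spellings, here is a small userspace
sketch using the C library's snprintf() as a stand-in for
printk_deferred(). The helper names are hypothetical. It only
illustrates standard C formatting and makes no claim about the kernel's
own vsnprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bare hexadecimal, as the PSI warnings printed before this patch. */
static inline int fmt_plain(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "%x", v);
}

/* Explicit "0x" prefix, the style this patch switches to. */
static inline int fmt_prefixed(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "0x%x", v);
}

/* Alternate form: standard C adds "0x" only for non-zero values. */
static inline int fmt_alt(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "%#x", v);
}
```

In standard C, "0x%x" and "%#x" give the same text for any non-zero
value and differ only for zero ("0x0" versus "0").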

Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/psi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 59fdb7ebbf22..b031608c02ce 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -830,7 +830,7 @@ static void psi_group_change(struct psi_group *group, int cpu,
 		if (groupc->tasks[t]) {
 			groupc->tasks[t]--;
 		} else if (!psi_bug) {
-			printk_deferred(KERN_ERR "psi: task underflow! cpu=%d t=%d tasks=[%u %u %u %u] clear=%x set=%x\n",
+			printk_deferred(KERN_ERR "psi: task underflow! cpu=%d t=%d tasks=[%u %u %u %u] clear=0x%x set=0x%x\n",
 					cpu, t, groupc->tasks[0],
 					groupc->tasks[1], groupc->tasks[2],
 					groupc->tasks[3], clear, set);
@@ -896,7 +896,7 @@ static void psi_flags_change(struct task_struct *task, int clear, int set)
 	if (((task->psi_flags & set) ||
 	     (task->psi_flags & clear) != clear) &&
 	    !psi_bug) {
-		printk_deferred(KERN_ERR "psi: inconsistent task state! task=%d:%s cpu=%d psi_flags=%x clear=%x set=%x\n",
+		printk_deferred(KERN_ERR "psi: inconsistent task state! task=%d:%s cpu=%d psi_flags=0x%x clear=0x%x set=0x%x\n",
 				task->pid, task->comm, task_cpu(task),
 				task->psi_flags, clear, set);
 		psi_bug = 1;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy
  2025-11-17 18:55 [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution K Prateek Nayak
  2025-11-17 18:55 ` [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI K Prateek Nayak
  2025-11-17 18:55 ` [PATCH 2/5] sched/psi: Prepend "0x" to format specifiers when printing PSI flags K Prateek Nayak
@ 2025-11-17 18:55 ` K Prateek Nayak
  2025-11-17 20:44   ` kernel test robot
                     ` (2 more replies)
  2025-11-17 18:55 ` [RFC PATCH 4/5] sched/core: Block proxy task on pick when blocked_on is cleared before wakeup K Prateek Nayak
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-17 18:55 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Suren Baghdasaryan, linux-kernel
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak

Track blocked tasks retained on the runqueue to act as proxy donors.
This is necessary since a task's "blocked_on" relationship can be
cleared without holding the task_rq lock and a blocked donor can be
forced to run on CPU before a wakeup since task_is_blocked() returns
false at the time of pick.

This is necessary to fix a PSI task state corruption observed with
CONFIG_SCHED_PROXY_EXEC=y but it also serves as a medium to track the
lifecycle of a proxy donor when it is retained on the rq for proxy.

Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 include/linux/sched.h |  4 ++++
 kernel/sched/core.c   | 32 +++++++++++++++++++++++++++++++-
 kernel/sched/sched.h  |  2 ++
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index bb436ee1942d..f96ad1ec680b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -978,6 +978,10 @@ struct task_struct {
 	unsigned			sched_contributes_to_load:1;
 	unsigned			sched_migrated:1;
 	unsigned			sched_task_hot:1;
+#ifdef CONFIG_SCHED_PROXY_EXEC
+	/* To indicate blocked task was retained on the rq for proxy. */
+	unsigned			sched_proxy:1;
+#endif
 
 	/* Force alignment to the next boundary: */
 	unsigned			:0;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9f10cfbdc228..52a744beeca9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3659,6 +3659,9 @@ ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
 	}
 }
 
+static inline void set_task_proxy(struct task_struct *p);
+static inline void clear_task_proxy(struct task_struct *p);
+
 /*
  * Consider @p being inside a wait loop:
  *
@@ -3693,6 +3696,8 @@ static int ttwu_runnable(struct task_struct *p, int wake_flags)
 	rq = __task_rq_lock(p, &rf);
 	if (task_on_rq_queued(p)) {
 		update_rq_clock(rq);
+		if (is_proxy_task(p))
+			clear_task_proxy(p);
 		if (p->se.sched_delayed)
 			enqueue_task(rq, p, ENQUEUE_NOCLOCK | ENQUEUE_DELAYED);
 		if (!task_on_cpu(rq, p)) {
@@ -6460,8 +6465,10 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
 	 * blocked on a mutex, and we want to keep it on the runqueue
 	 * to be selectable for proxy-execution.
 	 */
-	if (!should_block)
+	if (!should_block) {
+		set_task_proxy(p);
 		return false;
+	}
 
 	p->sched_contributes_to_load =
 		(task_state & TASK_UNINTERRUPTIBLE) &&
@@ -6487,6 +6494,23 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
 }
 
 #ifdef CONFIG_SCHED_PROXY_EXEC
+bool is_proxy_task(struct task_struct *p)
+{
+	return !!p->sched_proxy;
+}
+
+static inline void set_task_proxy(struct task_struct *p)
+{
+	WARN_ON_ONCE(p->sched_proxy);
+	p->sched_proxy = 1;
+}
+
+static inline void clear_task_proxy(struct task_struct *p)
+{
+	WARN_ON_ONCE(!p->sched_proxy);
+	p->sched_proxy = 0;
+}
+
 static inline struct task_struct *proxy_resched_idle(struct rq *rq)
 {
 	put_prev_set_next_task(rq, rq->donor, rq->idle);
@@ -6502,6 +6526,9 @@ static bool __proxy_deactivate(struct rq *rq, struct task_struct *donor)
 	/* Don't deactivate if the state has been changed to TASK_RUNNING */
 	if (state == TASK_RUNNING)
 		return false;
+
+	clear_task_proxy(donor);
+
 	/*
 	 * Because we got donor from pick_next_task(), it is *crucial*
 	 * that we call proxy_resched_idle() before we deactivate it.
@@ -6649,6 +6676,9 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 	return owner;
 }
 #else /* SCHED_PROXY_EXEC */
+bool is_proxy_task(struct task_struct *p) { return false; }
+static inline void set_task_proxy(struct task_struct *p) { }
+static inline void clear_task_proxy(p) { }
 static struct task_struct *
 find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 {
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b419a4d98461..fa2a2d5bff6d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1362,6 +1362,8 @@ static inline u32 sched_rng(void)
 #define cpu_curr(cpu)		(cpu_rq(cpu)->curr)
 #define raw_rq()		raw_cpu_ptr(&runqueues)
 
+bool is_proxy_task(struct task_struct *p);
+
 #ifdef CONFIG_SCHED_PROXY_EXEC
 static inline void rq_set_donor(struct rq *rq, struct task_struct *t)
 {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 4/5] sched/core: Block proxy task on pick when blocked_on is cleared before wakeup
  2025-11-17 18:55 [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution K Prateek Nayak
                   ` (2 preceding siblings ...)
  2025-11-17 18:55 ` [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy K Prateek Nayak
@ 2025-11-17 18:55 ` K Prateek Nayak
  2025-11-17 18:55 ` [RFC PATCH 5/5] sched/psi: Fix PSI signals of blocked tasks retained for proxy K Prateek Nayak
  2025-11-18  0:45 ` [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution John Stultz
  5 siblings, 0 replies; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-17 18:55 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Suren Baghdasaryan, linux-kernel
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak

A task's "blocked_on" relationship can be cleared remotely without
acquiring the task's runqueue lock, which can lead to a blocked donor
being picked to run without a wakeup.

Although this is not a problem in itself, since the task blocks again
if it wasn't woken up in time, it causes problems for stats accounting,
especially for PSI, since psi_enqueue() tries to reset the runnable
signals for a proxy task that wasn't fully blocked, and
psi_sched_switch() assumes a blocking task is always runnable before
being switched out.

When investigating PSI task state corruption, the following transitions
were observed with CONFIG_SCHED_PROXY_EXEC=y:

   ... [141] ...: psi_flags_change: task=4358:... cpu=141 psi_flags=14 clear=14 set=0 queued=1 delayed=0 blocked=1 psi_bug=0

   # Task is retained on the rq for proxy. All runnable signals are
   # cleared for the task.

   ... [141] ...: psi_flags_change: task=4358:... cpu=141 psi_flags=0 clear=0 set=10 queued=1 delayed=0 blocked=0 psi_bug=0

   # Task is picked and forced to run since task_is_blocked() returns
   # false.

   ... [141] ...: psi_dequeue: task=4358:... cpu=141 psi_flags=10

   # Task blocks again. Flag modifications are deferred to
   # psi_sched_switch() since DEQUEUE_SLEEP.

   ... [141] ...: psi_flags_change: task=4358:... cpu=141 psi_flags=10 clear=14 set=0 queued=1 delayed=1 blocked=0 psi_bug=(0 -> 1)

   # psi_sched_switch() tries to clear TSK_ONCPU and TSK_RUNNING.
   # TSK_RUNNING was never set since task was never woken up.
   # !! PSI inconsistent state warning triggered !!

   ... [014] ...: try_to_wake_up: wakeup: task=4358: psi_flags=0 queued=0 delayed=0 blocked=0

   # Task wakes up and is finally runnable.

To prevent any inconsistencies from running a task that was supposed to
be blocked, deactivate a potential donor task if it was picked without
having the "sched_proxy" indicator cleared. A pending wakeup would queue
the task back again when it turns runnable again.
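
The pick-time decision described above can be sketched as a small
userspace model. Everything here is hypothetical scaffolding
("struct task_model", "model_pick()", the enum names). It simplifies
__schedule() to three outcomes and only mirrors the rule stated in this
patch: a picked task that is no longer blocked_on anything but still
carries "sched_proxy" is deactivated instead of being run.

```c
#include <assert.h>
#include <stdbool.h>

enum task_state_model { MODEL_TASK_RUNNING, MODEL_TASK_SLEEPING };

/* Hypothetical, heavily reduced view of a task for this sketch. */
struct task_model {
	enum task_state_model state;
	bool on_rq;		/* still queued on the runqueue */
	bool blocked_on;	/* blocked_on relationship still set */
	bool sched_proxy;	/* retained on the rq as a proxy donor */
};

enum pick_outcome { PICK_RUN, PICK_FIND_PROXY, PICK_AGAIN };

/* Models __proxy_deactivate(): refuse if the state is already running. */
static inline bool model_proxy_deactivate(struct task_model *t)
{
	if (t->state == MODEL_TASK_RUNNING)
		return false;
	t->sched_proxy = false;	/* task becomes fully blocked */
	t->on_rq = false;
	return true;
}

/* Models the decision made on the picked task in this simplified view. */
static inline enum pick_outcome model_pick(struct task_model *next)
{
	if (next->blocked_on)
		return PICK_FIND_PROXY;	/* normal proxy-execution path */
	if (next->sched_proxy && model_proxy_deactivate(next))
		return PICK_AGAIN;	/* not woken yet: block it, repick */
	return PICK_RUN;
}
```

A donor whose blocked_on was cleared remotely (sleeping state,
sched_proxy still set) yields PICK_AGAIN in this model and leaves the
runqueue until a wakeup queues it again.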

Fixes: be41bde4c3a8 ("sched: Add an initial sketch of the find_proxy_task() function")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/core.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 52a744beeca9..d6265f38e93a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6679,6 +6679,7 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 bool is_proxy_task(struct task_struct *p) { return false; }
 static inline void set_task_proxy(struct task_struct *p) { }
 static inline void clear_task_proxy(p) { }
+static bool __proxy_deactivate(struct rq *rq, struct task_struct *donor) { return false; }
 static struct task_struct *
 find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 {
@@ -6839,6 +6840,25 @@ static void __sched notrace __schedule(int sched_mode)
 		if (next == rq->idle)
 			goto keep_resched;
 	}
+	if (unlikely(is_proxy_task(next))) {
+		/*
+		 * It is possible for a remote CPU to clear task
+		 * "blocked_on" without acquiring the task rq lock.
+		 *
+		 * This can lead to a blocked task retained for proxy to
+		 * be forced on CPU without the task being woken up
+		 * since task_is_blocked(next) above returns false.
+		 *
+		 * Since "sched_proxy" is only cleared on wakeup,
+		 * is_proxy_task() returning true indicates that the
+		 * task hasn't woken up yet.
+		 *
+		 * Block the task and wait for wakeup to queue it back
+		 * when it is runnable again.
+		 */
+		if (__proxy_deactivate(rq, next))
+			goto pick_again;
+	}
 picked:
 	clear_tsk_need_resched(prev);
 	clear_preempt_need_resched();
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 5/5] sched/psi: Fix PSI signals of blocked tasks retained for proxy
  2025-11-17 18:55 [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution K Prateek Nayak
                   ` (3 preceding siblings ...)
  2025-11-17 18:55 ` [RFC PATCH 4/5] sched/core: Block proxy task on pick when blocked_on is cleared before wakeup K Prateek Nayak
@ 2025-11-17 18:55 ` K Prateek Nayak
  2025-11-18  0:45 ` [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution John Stultz
  5 siblings, 0 replies; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-17 18:55 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Suren Baghdasaryan, linux-kernel
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak

Booting a machine with CONFIG_SCHED_PROXY_EXEC=y leads to a PSI warning
similar to the following early into the boot:

    psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=0 set=4

Investigating the set of events that led to the warning indicated that
psi_sched_switch() never dequeued the signals for a blocked task since
it was retained on the runqueue as a potential donor but the
psi_enqueue() that follows the wakeup tries to re-enqueue these signals
for the blocked tasks thus leading to the inconsistent state warning.

When fixing PSI state corruption for delayed dequeue in commit
c6508124193d ("sched/psi: Fix mistaken CPU pressure indication after
corrupted task state bug"), Johannes mentioned that delayed tasks are
in fact blocked and the PSI signals should treat them as such in [1].

With proxy execution, the same argument holds true for blocked tasks
that are kept around to donate the vruntime context - the tasks are
essentially blocked and their PSI signals should be treated as such
until they are woken up again.

Treat is_proxy_task() as blocked for psi_sched_switch() and call
psi_enqueue(ENQUEUE_WAKEUP) when the "sched_proxy" signal is cleared.
For all the transitions in-between, treat the task similar to a delayed
task and just move the block signals on migration of blocked donor.
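
As an illustration of the first part, the third argument passed to
psi_sched_switch() can be modelled as a plain predicate. This is a
userspace sketch with hypothetical names ("struct prev_model" and the
two helpers). It only compares the expression before and after this
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the outgoing task at context-switch time. */
struct prev_model {
	bool on_rq_queued;	/* task_on_rq_queued(prev) */
	bool sched_delayed;	/* prev->se.sched_delayed */
	bool sched_proxy;	/* is_proxy_task(prev) */
};

/* "sleep" argument as computed before this patch. */
static inline bool psi_switch_sleep_before(const struct prev_model *prev)
{
	return !prev->on_rq_queued || prev->sched_delayed;
}

/* "sleep" argument with retained proxy donors treated as blocked. */
static inline bool psi_switch_sleep_after(const struct prev_model *prev)
{
	return !prev->on_rq_queued || prev->sched_delayed ||
	       prev->sched_proxy;
}
```

For a blocked donor retained on the runqueue (queued, not delayed,
sched_proxy set) the old expression evaluates to false, so the runnable
signals are kept, while the new one evaluates to true.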

Fixes: be41bde4c3a8 ("sched: Add an initial sketch of the find_proxy_task() function")
Link: https://lore.kernel.org/all/20241010130316.GA181795@cmpxchg.org/ [1]
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/core.c  | 9 +++++++--
 kernel/sched/stats.h | 2 +-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d6265f38e93a..765365b81b12 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3696,8 +3696,12 @@ static int ttwu_runnable(struct task_struct *p, int wake_flags)
 	rq = __task_rq_lock(p, &rf);
 	if (task_on_rq_queued(p)) {
 		update_rq_clock(rq);
-		if (is_proxy_task(p))
+		if (is_proxy_task(p)) {
 			clear_task_proxy(p);
+			/* Task was never fully blocked to be delayed. */
+			WARN_ON_ONCE(p->se.sched_delayed);
+			psi_enqueue(p, ENQUEUE_WAKEUP);
+		}
 		if (p->se.sched_delayed)
 			enqueue_task(rq, p, ENQUEUE_NOCLOCK | ENQUEUE_DELAYED);
 		if (!task_on_cpu(rq, p)) {
@@ -6903,7 +6907,8 @@ static void __sched notrace __schedule(int sched_mode)
 
 		psi_account_irqtime(rq, prev, next);
 		psi_sched_switch(prev, next, !task_on_rq_queued(prev) ||
-					     prev->se.sched_delayed);
+					     prev->se.sched_delayed ||
+					     is_proxy_task(prev));
 
 		trace_sched_switch(preempt, prev, next, prev_state);
 
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 3323d773fec3..b2d7461c1ea9 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -142,7 +142,7 @@ static inline void psi_enqueue(struct task_struct *p, int flags)
 	if (task_on_cpu(task_rq(p), p))
 		return;
 
-	if (p->se.sched_delayed) {
+	if (p->se.sched_delayed || is_proxy_task(p)) {
 		/* CPU migration of "sleeping" task */
 		WARN_ON_ONCE(!(flags & ENQUEUE_MIGRATED));
 		if (p->in_memstall)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy
  2025-11-17 18:55 ` [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy K Prateek Nayak
@ 2025-11-17 20:44   ` kernel test robot
  2025-11-18  2:03     ` K Prateek Nayak
  2025-11-18  1:46   ` kernel test robot
  2025-11-18  4:38   ` K Prateek Nayak
  2 siblings, 1 reply; 21+ messages in thread
From: kernel test robot @ 2025-11-17 20:44 UTC (permalink / raw)
  To: K Prateek Nayak; +Cc: oe-kbuild-all

Hi Prateek,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on 33cf66d88306663d16e4759e9d24766b0aaa2e17]

url:    https://github.com/intel-lab-lkp/linux/commits/K-Prateek-Nayak/sched-psi-Make-psi-stubs-consistent-for-CONFIG_PSI/20251118-030832
base:   33cf66d88306663d16e4759e9d24766b0aaa2e17
patch link:    https://lore.kernel.org/r/20251117185550.365156-4-kprateek.nayak%40amd.com
patch subject: [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy
config: nios2-allnoconfig (https://download.01.org/0day-ci/archive/20251118/202511180452.4hbbjwkB-lkp@intel.com/config)
compiler: nios2-linux-gcc (GCC) 11.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251118/202511180452.4hbbjwkB-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511180452.4hbbjwkB-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/sched/core.c: In function 'clear_task_proxy':
>> kernel/sched/core.c:6681:20: error: type of 'p' defaults to 'int' [-Werror=implicit-int]
    6681 | static inline void clear_task_proxy(p) { }
         |                    ^~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors
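
The error comes from the !CONFIG_SCHED_PROXY_EXEC stub declaring its
parameter without a type. A likely correction, assuming the stub is
meant to mirror the set_task_proxy() stub on the line before it, would
look like the sketch below. The forward declaration is only there to
keep the fragment self-contained.

```c
/* Opaque forward declaration; a pointer parameter needs nothing more. */
struct task_struct;

/* Parameter carries an explicit type, so implicit-int no longer applies. */
static inline void clear_task_proxy(struct task_struct *p) { }
```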


vim +6681 kernel/sched/core.c

  6557	
  6558	/*
  6559	 * Find runnable lock owner to proxy for mutex blocked donor
  6560	 *
  6561	 * Follow the blocked-on relation:
  6562	 *   task->blocked_on -> mutex->owner -> task...
  6563	 *
  6564	 * Lock order:
  6565	 *
  6566	 *   p->pi_lock
  6567	 *     rq->lock
  6568	 *       mutex->wait_lock
  6569	 *
  6570	 * Returns the task that is going to be used as execution context (the one
  6571	 * that is actually going to be run on cpu_of(rq)).
  6572	 */
  6573	static struct task_struct *
  6574	find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
  6575	{
  6576		struct task_struct *owner = NULL;
  6577		int this_cpu = cpu_of(rq);
  6578		struct task_struct *p;
  6579		struct mutex *mutex;
  6580	
  6581		/* Follow blocked_on chain. */
  6582		for (p = donor; task_is_blocked(p); p = owner) {
  6583			mutex = p->blocked_on;
  6584			/* Something changed in the chain, so pick again */
  6585			if (!mutex)
  6586				return NULL;
  6587			/*
  6588			 * By taking mutex->wait_lock we hold off concurrent mutex_unlock()
  6589			 * and ensure @owner sticks around.
  6590			 */
  6591			guard(raw_spinlock)(&mutex->wait_lock);
  6592	
  6593			/* Check again that p is blocked with wait_lock held */
  6594			if (mutex != __get_task_blocked_on(p)) {
  6595				/*
  6596				 * Something changed in the blocked_on chain and
  6597				 * we don't know if only at this level. So, let's
  6598				 * just bail out completely and let __schedule()
  6599				 * figure things out (pick_again loop).
  6600				 */
  6601				return NULL;
  6602			}
  6603	
  6604			owner = __mutex_owner(mutex);
  6605			if (!owner) {
  6606				__clear_task_blocked_on(p, mutex);
  6607				return p;
  6608			}
  6609	
  6610			if (!READ_ONCE(owner->on_rq) || owner->se.sched_delayed) {
  6611				/* XXX Don't handle blocked owners/delayed dequeue yet */
  6612				return proxy_deactivate(rq, donor);
  6613			}
  6614	
  6615			if (task_cpu(owner) != this_cpu) {
  6616				/* XXX Don't handle migrations yet */
  6617				return proxy_deactivate(rq, donor);
  6618			}
  6619	
  6620			if (task_on_rq_migrating(owner)) {
  6621				/*
  6622				 * One of the chain of mutex owners is currently migrating to this
  6623				 * CPU, but has not yet been enqueued because we are holding the
  6624				 * rq lock. As a simple solution, just schedule rq->idle to give
  6625				 * the migration a chance to complete. Much like the migrate_task
  6626				 * case we should end up back in find_proxy_task(), this time
  6627				 * hopefully with all relevant tasks already enqueued.
  6628				 */
  6629				return proxy_resched_idle(rq);
  6630			}
  6631	
  6632			/*
  6633			 * Its possible to race where after we check owner->on_rq
  6634			 * but before we check (owner_cpu != this_cpu) that the
  6635			 * task on another cpu was migrated back to this cpu. In
  6636			 * that case it could slip by our  checks. So double check
  6637			 * we are still on this cpu and not migrating. If we get
  6638			 * inconsistent results, try again.
  6639			 */
  6640			if (!task_on_rq_queued(owner) || task_cpu(owner) != this_cpu)
  6641				return NULL;
  6642	
  6643			if (owner == p) {
  6644				/*
  6645				 * It's possible we interleave with mutex_unlock like:
  6646				 *
  6647				 *				lock(&rq->lock);
  6648				 *				  find_proxy_task()
  6649				 * mutex_unlock()
  6650				 *   lock(&wait_lock);
  6651				 *   donor(owner) = current->blocked_donor;
  6652				 *   unlock(&wait_lock);
  6653				 *
  6654				 *   wake_up_q();
  6655				 *     ...
  6656				 *       ttwu_runnable()
  6657				 *         __task_rq_lock()
  6658				 *				  lock(&wait_lock);
  6659				 *				  owner == p
  6660				 *
  6661				 * Which leaves us to finish the ttwu_runnable() and make it go.
  6662				 *
  6663				 * So schedule rq->idle so that ttwu_runnable() can get the rq
  6664				 * lock and mark owner as running.
  6665				 */
  6666				return proxy_resched_idle(rq);
  6667			}
  6668			/*
  6669			 * OK, now we're absolutely sure @owner is on this
  6670			 * rq, therefore holding @rq->lock is sufficient to
  6671			 * guarantee its existence, as per ttwu_remote().
  6672			 */
  6673		}
  6674	
  6675		WARN_ON_ONCE(owner && !owner->on_rq);
  6676		return owner;
  6677	}
  6678	#else /* SCHED_PROXY_EXEC */
  6679	bool is_proxy_task(struct task_struct *p) { return false; }
  6680	static inline void set_task_proxy(struct task_struct *p) { }
> 6681	static inline void clear_task_proxy(p) { }
  6682	static struct task_struct *
  6683	find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
  6684	{
  6685		WARN_ONCE(1, "This should never be called in the !SCHED_PROXY_EXEC case\n");
  6686		return donor;
  6687	}
  6688	#endif /* SCHED_PROXY_EXEC */
  6689	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution
  2025-11-17 18:55 [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution K Prateek Nayak
                   ` (4 preceding siblings ...)
  2025-11-17 18:55 ` [RFC PATCH 5/5] sched/psi: Fix PSI signals of blocked tasks retained for proxy K Prateek Nayak
@ 2025-11-18  0:45 ` John Stultz
  2025-11-18  1:38   ` K Prateek Nayak
  5 siblings, 1 reply; 21+ messages in thread
From: John Stultz @ 2025-11-18  0:45 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Johannes Weiner, Suren Baghdasaryan, linux-kernel,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider

On Mon, Nov 17, 2025 at 10:56 AM K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>
> When booting into a kernel with CONFIG_SCHED_PROXY_EXEC and CONFIG_PSI,
> a inconsistent task state warning was noticed soon after the boot
> similar to:
>
>     psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=0 set=4
>
> On analysis, the following sequence of event was found to be the cause
> of the splat:
>
> o Blocked task is retained on the runqueue.
> o psi_sched_switch() sees task_on_rq_queued() and retains the runnable
>   signals for the task.
> o Tasks blocks later via proxy_deactivate() but psi_dequeue() doesn't
>   adjust the PSI flags since DEQUEUE_SLEEP is set expecting
>   psi_sched_switch() to fix the signals.
> o The blocked task is woken up with the PSI state still reflecting that
>   the task is runnable (TSK_RUNNING) leading to the splat.

Hey, K Prateek!
  Thanks for chasing this down and sending this series out!

I'm still getting my head around the description above (it's been a
while since I last looked at the PSI code), but early on I often hit
PSI splats, and I thought I had addressed it with the patch here:
  https://github.com/johnstultz-work/linux-dev/commit/f60923a6176b3778a8fc9b9b0bbe4953153ce565

And with that I've not run across any warnings since.

Now, I hadn't tripped over the issue recently with the subset of the
full series I've been pushing upstream, and as I most easily ran into
it with the sleeping owner enqueuing feature I was holding the fix
back for those changes. But I realize unfortunately CONFIG_PSI at some
point got disabled in my test defconfig, so I've not had the
opportunity to trip it, and sure enough I can trivially see it booting
with the current upstream code.

Applying that fix does seem to avoid the warnings in my trivial
testing, but again I've not dug through the logic in a while, so you
may have a better sense of the inadequacies of that fix.

If it looks reasonable to you, I'll rework the commit message so it
isn't so focused on the sleeping-owner-enqueuing case and submit it.

I'll have to spend some time here looking more at your proposed
solution. At first glance, I do fret a little about the
task->sched_proxy bit overlapping somewhat in meaning with the
task->blocked_on value.

thanks
-john


* Re: [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI
  2025-11-17 18:55 ` [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI K Prateek Nayak
@ 2025-11-18  1:06   ` John Stultz
  2025-11-20  5:59   ` Madadi Vineeth Reddy
  2025-12-02 14:32   ` Johannes Weiner
  2 siblings, 0 replies; 21+ messages in thread
From: John Stultz @ 2025-11-18  1:06 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Johannes Weiner, Suren Baghdasaryan, linux-kernel,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider

On Mon, Nov 17, 2025 at 10:56 AM K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>
> commit 1a6151017ee5 ("sched: psi: pass enqueue/dequeue flags to psi
> callbacks directly") modified the psi_enqueue() and psi_dequeue()
> functions to take the complete enqueue/dequeue flags but left the stubs
> for !CONFIG_PSI unaltered.
>
> Modify the stubs to also accept the flags argument to keep it consistent
> with CONFIG_PSI.
>
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>

I'm not super deep in the PSI logic, but this looks obviously correct.
Thanks for cleaning this up.
Reviewed-by: John Stultz <jstultz@google.com>


* Re: [PATCH 2/5] sched/psi: Prepend "0x" to format specifiers when printing PSI flags
  2025-11-17 18:55 ` [PATCH 2/5] sched/psi: Prepend "0x" to format specifiers when printing PSI flags K Prateek Nayak
@ 2025-11-18  1:08   ` John Stultz
  2025-12-02 14:33   ` Johannes Weiner
  1 sibling, 0 replies; 21+ messages in thread
From: John Stultz @ 2025-11-18  1:08 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Johannes Weiner, Suren Baghdasaryan, linux-kernel,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider

On Mon, Nov 17, 2025 at 10:56 AM K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>
> It is not immediately clear that the PSI flags, the set, and the clear
> bits printed in PSI warnings are hexadecimal values. Prepend "0x" to
> format specifiers to make it clear.
>
> Since "kernel/sched" uses "0x%x" as opposed to "%#x" when printing
> hexadecimal values, the same was followed to keep consistency.
>
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>

Reviewed-by: John Stultz <jstultz@google.com>


* Re: [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution
  2025-11-18  0:45 ` [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution John Stultz
@ 2025-11-18  1:38   ` K Prateek Nayak
  2025-11-18  4:26     ` John Stultz
  0 siblings, 1 reply; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-18  1:38 UTC (permalink / raw)
  To: John Stultz
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Johannes Weiner, Suren Baghdasaryan, linux-kernel,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider

Hello John,

On 11/18/2025 6:15 AM, John Stultz wrote:
> On Mon, Nov 17, 2025 at 10:56 AM K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>>
>> When booting into a kernel with CONFIG_SCHED_PROXY_EXEC and CONFIG_PSI,
>> a inconsistent task state warning was noticed soon after the boot
>> similar to:
>>
>>     psi: inconsistent task state! task=... cpu=... psi_flags=4 clear=0 set=4
>>
>> On analysis, the following sequence of event was found to be the cause
>> of the splat:
>>
>> o Blocked task is retained on the runqueue.
>> o psi_sched_switch() sees task_on_rq_queued() and retains the runnable
>>   signals for the task.
>> o Tasks blocks later via proxy_deactivate() but psi_dequeue() doesn't
>>   adjust the PSI flags since DEQUEUE_SLEEP is set expecting
>>   psi_sched_switch() to fix the signals.
>> o The blocked task is woken up with the PSI state still reflecting that
>>   the task is runnable (TSK_RUNNING) leading to the splat.
> 
> Hey, K Prateek!
>   Thanks for chasing this down and sending this series out!
> 
> I'm still getting my head around the description above (its been
> awhile since I last looked at the PSI code), but early on I often hit
> PSI splats, and I thought I had addressed it with the patch here:
>   https://github.com/johnstultz-work/linux-dev/commit/f60923a6176b3778a8fc9b9b0bbe4953153ce565

Oooo! Let me go test that.

> 
> And with that I've not run across any warnings since.
> 
> Now, I hadn't tripped over the issue recently with the subset of the
> full series I've been pushing upstream, and as I most easily ran into
> it with the sleeping owner enqueuing feature I was holding the fix
> back for those changes. But I realize unfortunately CONFIG_PSI at some
> point got disabled in my test defconfig, so I've not had the
> opportunity to trip it, and sure enough I can trivially see it booting
> with the current upstream code.

I hit this on tip:sched/core when looking at the recent sched_yield()
changes. Maybe the "blocked_on" serialization with the proxy migration
will make this all go away :)

> 
> Applying that fix does seem to avoid the warnings in my trivial
> testing, but again I've not dug through the logic in awhile, so you
> may have a better sense of the inadequacies of that fix.
> 
> If it looks reasonable to you, I'll rework the commit message so it
> isn't so focused on the sleeping-owner-enquing case and submit it.

That would be great! And it seems to be a lot simpler than the
stuff I'm trying to do. I'll give it a spin and get back to you.
Thank you again for pointing to the fix.

> 
> I'll have to spend some time here looking more at your proposed
> solution. On the initial glance, I do fret a little with the
> task->sched_proxy bit overlapping a bit in meaning with the
> task->blocked_on value.

Ack! I'm pretty sure with the blocked_on locking we'll not have these
"interesting" situations, but I posted the RFC just in case we
needed something in the interim. Turns out it's a solved problem :)

One last thing: it'd be good to get some clarification on how to treat
the blocked tasks retained on the runqueue for PSI. A quick look at your
fix suggests we still consider them runnable (TSK_RUNNING) from a PSI
standpoint. Is this ideal, or should PSI consider these tasks blocked?

-- 
Thanks and Regards,
Prateek



* Re: [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy
  2025-11-17 18:55 ` [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy K Prateek Nayak
  2025-11-17 20:44   ` kernel test robot
@ 2025-11-18  1:46   ` kernel test robot
  2025-11-18  4:38   ` K Prateek Nayak
  2 siblings, 0 replies; 21+ messages in thread
From: kernel test robot @ 2025-11-18  1:46 UTC (permalink / raw)
  To: K Prateek Nayak; +Cc: llvm, oe-kbuild-all

Hi Prateek,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on 33cf66d88306663d16e4759e9d24766b0aaa2e17]

url:    https://github.com/intel-lab-lkp/linux/commits/K-Prateek-Nayak/sched-psi-Make-psi-stubs-consistent-for-CONFIG_PSI/20251118-030832
base:   33cf66d88306663d16e4759e9d24766b0aaa2e17
patch link:    https://lore.kernel.org/r/20251117185550.365156-4-kprateek.nayak%40amd.com
patch subject: [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy
config: x86_64-allnoconfig (https://download.01.org/0day-ci/archive/20251118/202511180828.6BVP2q32-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251118/202511180828.6BVP2q32-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511180828.6BVP2q32-lkp@intel.com/

All errors (new ones prefixed by >>):

>> kernel/sched/core.c:6681:37: error: parameter 'p' was not declared, defaults to 'int'; ISO C99 and later do not support implicit int [-Wimplicit-int]
    6681 | static inline void clear_task_proxy(p) { }
         |                                     ^
>> kernel/sched/core.c:6681:20: error: a function definition without a prototype is deprecated in all versions of C and is not supported in C23 [-Werror,-Wdeprecated-non-prototype]
    6681 | static inline void clear_task_proxy(p) { }
         |                    ^
>> kernel/sched/core.c:6681:20: error: conflicting types for 'clear_task_proxy'
   kernel/sched/core.c:3663:20: note: previous declaration is here
    3663 | static inline void clear_task_proxy(struct task_struct *p);
         |                    ^
>> kernel/sched/core.c:6681:20: error: a function definition without a prototype is deprecated in all versions of C and is not supported in C23 [-Werror,-Wdeprecated-non-prototype]
    6681 | static inline void clear_task_proxy(p) { }
         |                    ^
   kernel/sched/core.c:7764:12: warning: array index -1 is before the beginning of the array [-Warray-bounds]
    7764 |                                        preempt_modes[preempt_dynamic_mode] : "undef",
         |                                        ^             ~~~~~~~~~~~~~~~~~~~~
   kernel/sched/core.c:7739:1: note: array 'preempt_modes' declared here
    7739 | const char *preempt_modes[] = {
         | ^
   1 warning and 4 errors generated.


vim +6681 kernel/sched/core.c

  6557	
  6558	/*
  6559	 * Find runnable lock owner to proxy for mutex blocked donor
  6560	 *
  6561	 * Follow the blocked-on relation:
  6562	 *   task->blocked_on -> mutex->owner -> task...
  6563	 *
  6564	 * Lock order:
  6565	 *
  6566	 *   p->pi_lock
  6567	 *     rq->lock
  6568	 *       mutex->wait_lock
  6569	 *
  6570	 * Returns the task that is going to be used as execution context (the one
  6571	 * that is actually going to be run on cpu_of(rq)).
  6572	 */
  6573	static struct task_struct *
  6574	find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
  6575	{
  6576		struct task_struct *owner = NULL;
  6577		int this_cpu = cpu_of(rq);
  6578		struct task_struct *p;
  6579		struct mutex *mutex;
  6580	
  6581		/* Follow blocked_on chain. */
  6582		for (p = donor; task_is_blocked(p); p = owner) {
  6583			mutex = p->blocked_on;
  6584			/* Something changed in the chain, so pick again */
  6585			if (!mutex)
  6586				return NULL;
  6587			/*
  6588			 * By taking mutex->wait_lock we hold off concurrent mutex_unlock()
  6589			 * and ensure @owner sticks around.
  6590			 */
  6591			guard(raw_spinlock)(&mutex->wait_lock);
  6592	
  6593			/* Check again that p is blocked with wait_lock held */
  6594			if (mutex != __get_task_blocked_on(p)) {
  6595				/*
  6596				 * Something changed in the blocked_on chain and
  6597				 * we don't know if only at this level. So, let's
  6598				 * just bail out completely and let __schedule()
  6599				 * figure things out (pick_again loop).
  6600				 */
  6601				return NULL;
  6602			}
  6603	
  6604			owner = __mutex_owner(mutex);
  6605			if (!owner) {
  6606				__clear_task_blocked_on(p, mutex);
  6607				return p;
  6608			}
  6609	
  6610			if (!READ_ONCE(owner->on_rq) || owner->se.sched_delayed) {
  6611				/* XXX Don't handle blocked owners/delayed dequeue yet */
  6612				return proxy_deactivate(rq, donor);
  6613			}
  6614	
  6615			if (task_cpu(owner) != this_cpu) {
  6616				/* XXX Don't handle migrations yet */
  6617				return proxy_deactivate(rq, donor);
  6618			}
  6619	
  6620			if (task_on_rq_migrating(owner)) {
  6621				/*
  6622				 * One of the chain of mutex owners is currently migrating to this
  6623				 * CPU, but has not yet been enqueued because we are holding the
  6624				 * rq lock. As a simple solution, just schedule rq->idle to give
  6625				 * the migration a chance to complete. Much like the migrate_task
  6626				 * case we should end up back in find_proxy_task(), this time
  6627				 * hopefully with all relevant tasks already enqueued.
  6628				 */
  6629				return proxy_resched_idle(rq);
  6630			}
  6631	
  6632			/*
  6633			 * Its possible to race where after we check owner->on_rq
  6634			 * but before we check (owner_cpu != this_cpu) that the
  6635			 * task on another cpu was migrated back to this cpu. In
  6636			 * that case it could slip by our  checks. So double check
  6637			 * we are still on this cpu and not migrating. If we get
  6638			 * inconsistent results, try again.
  6639			 */
  6640			if (!task_on_rq_queued(owner) || task_cpu(owner) != this_cpu)
  6641				return NULL;
  6642	
  6643			if (owner == p) {
  6644				/*
  6645				 * It's possible we interleave with mutex_unlock like:
  6646				 *
  6647				 *				lock(&rq->lock);
  6648				 *				  find_proxy_task()
  6649				 * mutex_unlock()
  6650				 *   lock(&wait_lock);
  6651				 *   donor(owner) = current->blocked_donor;
  6652				 *   unlock(&wait_lock);
  6653				 *
  6654				 *   wake_up_q();
  6655				 *     ...
  6656				 *       ttwu_runnable()
  6657				 *         __task_rq_lock()
  6658				 *				  lock(&wait_lock);
  6659				 *				  owner == p
  6660				 *
  6661				 * Which leaves us to finish the ttwu_runnable() and make it go.
  6662				 *
  6663				 * So schedule rq->idle so that ttwu_runnable() can get the rq
  6664				 * lock and mark owner as running.
  6665				 */
  6666				return proxy_resched_idle(rq);
  6667			}
  6668			/*
  6669			 * OK, now we're absolutely sure @owner is on this
  6670			 * rq, therefore holding @rq->lock is sufficient to
  6671			 * guarantee its existence, as per ttwu_remote().
  6672			 */
  6673		}
  6674	
  6675		WARN_ON_ONCE(owner && !owner->on_rq);
  6676		return owner;
  6677	}
  6678	#else /* SCHED_PROXY_EXEC */
  6679	bool is_proxy_task(struct task_struct *p) { return false; }
  6680	static inline void set_task_proxy(struct task_struct *p) { }
> 6681	static inline void clear_task_proxy(p) { }
  6682	static struct task_struct *
  6683	find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
  6684	{
  6685		WARN_ONCE(1, "This should never be called in the !SCHED_PROXY_EXEC case\n");
  6686		return donor;
  6687	}
  6688	#endif /* SCHED_PROXY_EXEC */
  6689	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy
  2025-11-17 20:44   ` kernel test robot
@ 2025-11-18  2:03     ` K Prateek Nayak
  0 siblings, 0 replies; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-18  2:03 UTC (permalink / raw)
  To: kernel test robot; +Cc: oe-kbuild-all

On 11/18/2025 2:14 AM, kernel test robot wrote:
> All errors (new ones prefixed by >>):
> 
>    kernel/sched/core.c: In function 'clear_task_proxy':
>>> kernel/sched/core.c:6681:20: error: type of 'p' defaults to 'int' [-Werror=implicit-int]
>     6681 | static inline void clear_task_proxy(p) { }
>          |                    ^~~~~~~~~~~~~~~~
>    cc1: some warnings being treated as errors

Whoops! I clearly didn't test the !PROXY_EXEC case. Sorry about that.
The following diff on top fixes it:

(Build tested only)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 765365b81b12..d51c35e26d80 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6682,7 +6682,7 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 #else /* SCHED_PROXY_EXEC */
 bool is_proxy_task(struct task_struct *p) { return false; }
 static inline void set_task_proxy(struct task_struct *p) { }
-static inline void clear_task_proxy(p) { }
+static inline void clear_task_proxy(struct task_struct *p) { }
 static bool __proxy_deactivate(struct rq *rq, struct task_struct *donor) { return false; }
 static struct task_struct *
 find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
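
For context, the stub pattern and the reason the original line failed
can be shown in a small standalone sketch. TOY_PROXY_EXEC,
struct toy_task and the toy_* helpers are hypothetical names used only
for illustration; they are not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct task_struct. */
struct toy_task {
	bool sched_proxy;
};

#ifdef TOY_PROXY_EXEC	/* hypothetical switch standing in for the Kconfig option */
static inline bool toy_is_proxy_task(struct toy_task *p) { return p->sched_proxy; }
static inline void toy_set_task_proxy(struct toy_task *p) { p->sched_proxy = true; }
static inline void toy_clear_task_proxy(struct toy_task *p) { p->sched_proxy = false; }
#else
/*
 * The stubs must repeat the full parameter types. Writing
 * "toy_clear_task_proxy(p)" instead is an old-style definition in
 * which p defaults to int, which current compilers reject and which
 * also conflicts with an earlier prototype taking a pointer.
 */
static inline bool toy_is_proxy_task(struct toy_task *p) { (void)p; return false; }
static inline void toy_set_task_proxy(struct toy_task *p) { (void)p; }
static inline void toy_clear_task_proxy(struct toy_task *p) { (void)p; }
#endif
```

With the switch undefined, callers compile against the same signatures
and the helpers become no-ops.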

-- 
Thanks and Regards,
Prateek



* Re: [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution
  2025-11-18  1:38   ` K Prateek Nayak
@ 2025-11-18  4:26     ` John Stultz
  2025-11-18  5:08       ` K Prateek Nayak
  0 siblings, 1 reply; 21+ messages in thread
From: John Stultz @ 2025-11-18  4:26 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Johannes Weiner, Suren Baghdasaryan, linux-kernel,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider

On Mon, Nov 17, 2025 at 5:39 PM K Prateek Nayak <kprateek.nayak@amd.com> wrote:
> On 11/18/2025 6:15 AM, John Stultz wrote:
> > I'm still getting my head around the description above (its been
> > awhile since I last looked at the PSI code), but early on I often hit
> > PSI splats, and I thought I had addressed it with the patch here:
> >   https://github.com/johnstultz-work/linux-dev/commit/f60923a6176b3778a8fc9b9b0bbe4953153ce565
>
> Oooo! Let me go test that.
>
> >
> > And with that I've not run across any warnings since.
> >
> > Now, I hadn't tripped over the issue recently with the subset of the
> > full series I've been pushing upstream, and as I most easily ran into
> > it with the sleeping owner enqueuing feature I was holding the fix
> > back for those changes. But I realize unfortunately CONFIG_PSI at some
> > point got disabled in my test defconfig, so I've not had the
> > opportunity to trip it, and sure enough I can trivially see it booting
> > with the current upstream code.
>
> I hit this on tip:sched/core when looking at the recent sched_yield()
> changes. Maybe the "blocked_on" serialization with the proxy migration
> will make this all go away :)
>
> >
> > Applying that fix does seem to avoid the warnings in my trivial
> > testing, but again I've not dug through the logic in awhile, so you
> > may have a better sense of the inadequacies of that fix.
> >
> > If it looks reasonable to you, I'll rework the commit message so it
> > isn't so focused on the sleeping-owner-enquing case and submit it.
>
> That would be great! And it seems to be a lot more simpler than the
> the stuff I'm trying to do. I'll give it a spin and get back to you.
> Thank you again for pointing to the fix.
>
> >
> > I'll have to spend some time here looking more at your proposed
> > solution. On the initial glance, I do fret a little with the
> > task->sched_proxy bit overlapping a bit in meaning with the
> > task->blocked_on value.
>
> Ack! I'm pretty sure with the blocked_on locking we'll not have these
> "interesting" situations but I posted the RFC out just in case we
> needed something in the interim but turns out its a solved problem :)
>
> On last thing, it'll be good to get some clarification on how to treat
> the blocked tasks retained on the runqueue for PSI - quick look at your
> fix suggests we still consider them runnable (TSK_RUNNING) from PSI
> standpoint - is this ideal or should PSI consider these tasks blocked?

So my default way of thinking about mutex-blocked tasks with proxy is
that they are equivalent to runnable. They can be selected by
pick_next_task(), and they are charged for the time they donate to the
lock-owner that runs as the proxy.
To conceptualize things with ProxyExec, I often imagine the
mutex-blocked task as being in "optimistic spin" mode waiting for the
mutex, where we'd just run the task and let it spin, instead of
blocking the task (when the lock owner isn't already running). The
optimization is that, instead of just wasting time spinning, we run
the lock owner so it can release the lock.

So, I need to further refresh myself with more of the subtleties of
PSI, but to me considering it TSK_RUNNING seems intuitive.

There may be some transient cases, such as when the blocked task is
on one RQ and the lock holder is on another. Until the blocked task
is selected (and then proxy-migrated to boost the task on the other
CPU), it could, if it were very far back in the runqueue, be
contributing what could be seen as "false pressure" on that RQ. So
maybe I need to think a bit more about that. But it is still a task
that wants to run to boost the lock owner, so I'm not sure how
different it is in the PSI view compared to transient runqueue
imbalances.

thanks
-john


* Re: [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy
  2025-11-17 18:55 ` [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy K Prateek Nayak
  2025-11-17 20:44   ` kernel test robot
  2025-11-18  1:46   ` kernel test robot
@ 2025-11-18  4:38   ` K Prateek Nayak
  2 siblings, 0 replies; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-18  4:38 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Suren Baghdasaryan, linux-kernel
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider

On 11/18/2025 12:25 AM, K Prateek Nayak wrote:
> @@ -6649,6 +6676,9 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
>  	return owner;
>  }
>  #else /* SCHED_PROXY_EXEC */
> +bool is_proxy_task(struct task_struct *p) { return false; }
> +static inline void set_task_proxy(struct task_struct *p) { }
> +static inline void clear_task_proxy(p) { }

I clearly messed that up for !CONFIG_SCHED_PROXY_EXEC

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 765365b81b12..d51c35e26d80 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6682,7 +6682,7 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 #else /* SCHED_PROXY_EXEC */
 bool is_proxy_task(struct task_struct *p) { return false; }
 static inline void set_task_proxy(struct task_struct *p) { }
-static inline void clear_task_proxy(p) { }
+static inline void clear_task_proxy(struct task_struct *p) { }
 static bool __proxy_deactivate(struct rq *rq, struct task_struct *donor) { return false; }
 static struct task_struct *
 find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)


>  static struct task_struct *
>  find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
>  {
-- 
Thanks and Regards,
Prateek



* Re: [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution
  2025-11-18  4:26     ` John Stultz
@ 2025-11-18  5:08       ` K Prateek Nayak
  0 siblings, 0 replies; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-18  5:08 UTC (permalink / raw)
  To: John Stultz
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Johannes Weiner, Suren Baghdasaryan, linux-kernel,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider

Hello John,

On 11/18/2025 9:56 AM, John Stultz wrote:
> On Mon, Nov 17, 2025 at 5:39 PM K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>> On 11/18/2025 6:15 AM, John Stultz wrote:
>>> I'm still getting my head around the description above (its been
>>> awhile since I last looked at the PSI code), but early on I often hit
>>> PSI splats, and I thought I had addressed it with the patch here:
>>>   https://github.com/johnstultz-work/linux-dev/commit/f60923a6176b3778a8fc9b9b0bbe4953153ce565
>>
>> Oooo! Let me go test that.

Seems like that solution works too on top of current tip:sched/core.
I think you can send it out as a standalone patch for inclusion while
we hash out the donor migration bits (and blocked owner, and rwsem!).

>>
>>>
>>> And with that I've not run across any warnings since.
>>>
>>> Now, I hadn't tripped over the issue recently with the subset of the
>>> full series I've been pushing upstream, and as I most easily ran into
>>> it with the sleeping owner enqueuing feature I was holding the fix
>>> back for those changes. But I realize unfortunately CONFIG_PSI at some
>>> point got disabled in my test defconfig, so I've not had the
>>> opportunity to trip it, and sure enough I can trivially see it booting
>>> with the current upstream code.
>>
>> I hit this on tip:sched/core when looking at the recent sched_yield()
>> changes. Maybe the "blocked_on" serialization with the proxy migration
>> will make this all go away :)
>>
>>>
>>> Applying that fix does seem to avoid the warnings in my trivial
>>> testing, but again I've not dug through the logic in awhile, so you
>>> may have a better sense of the inadequacies of that fix.
>>>
>>> If it looks reasonable to you, I'll rework the commit message so it
>>> isn't so focused on the sleeping-owner-enquing case and submit it.
>>
>> That would be great! And it seems to be a lot more simpler than the
>> the stuff I'm trying to do. I'll give it a spin and get back to you.
>> Thank you again for pointing to the fix.
>>
>>>
>>> I'll have to spend some time here looking more at your proposed
>>> solution. On the initial glance, I do fret a little with the
>>> task->sched_proxy bit overlapping a bit in meaning with the
>>> task->blocked_on value.
>>
>> Ack! I'm pretty sure with the blocked_on locking we'll not have these
>> "interesting" situations but I posted the RFC out just in case we
>> needed something in the interim but turns out its a solved problem :)
>>
>> On last thing, it'll be good to get some clarification on how to treat
>> the blocked tasks retained on the runqueue for PSI - quick look at your
>> fix suggests we still consider them runnable (TSK_RUNNING) from PSI
>> standpoint - is this ideal or should PSI consider these tasks blocked?
> 
> So my default way of thinking about mutex-blocked tasks with proxy is
> that they are equivalent to runnable. They can be selected by
> pick_next_task(), and they are charged for the time they donate to the
> lock-owner that runs as the proxy.
> To conceptualize things with ProxyExec, I often imagine the
> mutex-blocked task as being in "optimistic spin" mode waiting for the
> mutex, where we'd just run the task and let it spin, instead of
> blocking the task (when the lock owner isn't already running). Then we
> just have the optimization of instead of just wasting time spinning,
> we run the lock owner to release the lock.

I think I can see it now. I generally considered them the other
way around, as blocked tasks retained just for the vruntime context.
I'll try changing my perspective to match yours when looking at
proxy :)

As for the fix in your tree, feel free to include:

Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>

> 
> So, I need to further refresh myself with more of the subtleties of
> PSI, but to me considering it TSK_RUNNING seems intuitive.
> 
> There are maybe some transient cases, like where the blocked task is
> on one RQ, and the lock holder is on another, and thus until the
> blocked task is selected (and then proxy-migrated to boost the task on
> the other cpu), where if it were very far back in the runqueue it
> could be contributing what could be seen as "false pressure" on that
> RQ.  So maybe I need to think a bit more about that. But it still is a
> task that wants to run to boost the lock owner, so I'm not sure how
> different it is in the PSI view compared to transient runqueue
> imbalances.

I think Johannes has a better understanding of how these signals are
used in the field so I'll defer to him.

-- 
Thanks and Regards,
Prateek



* Re: [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI
  2025-11-17 18:55 ` [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI K Prateek Nayak
  2025-11-18  1:06   ` John Stultz
@ 2025-11-20  5:59   ` Madadi Vineeth Reddy
  2025-11-20  6:10     ` K Prateek Nayak
  2025-12-02 14:32   ` Johannes Weiner
  2 siblings, 1 reply; 21+ messages in thread
From: Madadi Vineeth Reddy @ 2025-11-20  5:59 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Valentin Schneider, Suren Baghdasaryan,
	linux-kernel, Madadi Vineeth Reddy

On 18/11/25 00:25, K Prateek Nayak wrote:
> commit 1a6151017ee5 ("sched: psi: pass enqueue/dequeue flags to psi
> callbacks directly") modified the psi_enqueue() and psi_dequeue()
> functions to take the complete enqueue/dequeue flags but left the stubs
> for !CONFIG_PSI unaltered.
> 
> Modify the stubs to also accept the flags argument to keep it consistent
> with CONFIG_PSI.
> 
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> ---
>  kernel/sched/stats.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
> index cbf7206b3f9d..3323d773fec3 100644
> --- a/kernel/sched/stats.h
> +++ b/kernel/sched/stats.h
> @@ -221,8 +221,8 @@ static inline void psi_sched_switch(struct task_struct *prev,
>  }
>  
>  #else /* !CONFIG_PSI: */
> -static inline void psi_enqueue(struct task_struct *p, bool migrate) {}
> -static inline void psi_dequeue(struct task_struct *p, bool migrate) {}
> +static inline void psi_enqueue(struct task_struct *p, int flags) {}
> +static inline void psi_dequeue(struct task_struct *p, int flags) {}
>  static inline void psi_ttwu_dequeue(struct task_struct *p) {}
>  static inline void psi_sched_switch(struct task_struct *prev,
>  				    struct task_struct *next,

Right. The commit that updated the function signature did not update the
!CONFIG_PSI stubs accordingly. This patch corrects that.

Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>

Thanks,
Vineeth


* Re: [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI
  2025-11-20  5:59   ` Madadi Vineeth Reddy
@ 2025-11-20  6:10     ` K Prateek Nayak
  2025-11-20  6:22       ` Madadi Vineeth Reddy
  0 siblings, 1 reply; 21+ messages in thread
From: K Prateek Nayak @ 2025-11-20  6:10 UTC (permalink / raw)
  To: Madadi Vineeth Reddy
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Valentin Schneider, Suren Baghdasaryan,
	linux-kernel

Hello Vineeth,

On 11/20/2025 11:29 AM, Madadi Vineeth Reddy wrote:
> On 18/11/25 00:25, K Prateek Nayak wrote:
>> commit 1a6151017ee5 ("sched: psi: pass enqueue/dequeue flags to psi
>> callbacks directly") modified the psi_enqueue() and psi_dequeue()
>> functions to take the complete enqueue/dequeue flags but left the stubs
>> for !CONFIG_PSI unaltered.
>>
>> Modify the stubs to also accept the flags argument to keep it consistent
>> with CONFIG_PSI.
>>
>> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
>> ---
>>  kernel/sched/stats.h | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
>> index cbf7206b3f9d..3323d773fec3 100644
>> --- a/kernel/sched/stats.h
>> +++ b/kernel/sched/stats.h
>> @@ -221,8 +221,8 @@ static inline void psi_sched_switch(struct task_struct *prev,
>>  }
>>  
>>  #else /* !CONFIG_PSI: */
>> -static inline void psi_enqueue(struct task_struct *p, bool migrate) {}
>> -static inline void psi_dequeue(struct task_struct *p, bool migrate) {}
>> +static inline void psi_enqueue(struct task_struct *p, int flags) {}
>> +static inline void psi_dequeue(struct task_struct *p, int flags) {}
>>  static inline void psi_ttwu_dequeue(struct task_struct *p) {}
>>  static inline void psi_sched_switch(struct task_struct *prev,
>>  				    struct task_struct *next,
> 
> Right. The commit that updated the function signature did not update the
> !CONFIG_PSI stubs accordingly. This patch corrects that.
> 
> Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>

Thanks for the review.

P.S. John has an alternate fix for the same issue at
https://lore.kernel.org/lkml/20251118055242.4030849-1-jstultz@google.com/

His fix treats a blocked donor as a runnable entity from the PSI
standpoint.

This series takes the other approach: it clears the PSI runnable signals
when the blocked donor is retained on the rq (skipping block_task()) and
restores them when the donor is woken up again.
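
For illustration, the approach described above can be modelled in a few
lines of userspace C. This is only a sketch, not the code from this
series: the flag value, struct toy_task and all toy_* helpers are names
made up for the example. Only the shape of the consistency check follows
what the "inconsistent task state" warning reports, namely a bit to set
that is already set, or a bit to clear that is not set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value for this sketch; not taken from the series. */
#define TOY_TSK_RUNNING 0x04u

struct toy_task {
	unsigned int psi_flags;
};

/*
 * Apply a flag change. Returns false when the change is inconsistent:
 * a bit to set is already set, or a bit to clear is not set.
 */
static bool toy_flags_change(struct toy_task *t, unsigned int clear,
			     unsigned int set)
{
	bool ok = !(t->psi_flags & set) && (t->psi_flags & clear) == clear;

	t->psi_flags &= ~clear;
	t->psi_flags |= set;
	return ok;
}

/* Blocked donor stays on the rq: drop its runnable signal right away. */
static bool toy_retain_blocked_donor(struct toy_task *t)
{
	return toy_flags_change(t, TOY_TSK_RUNNING, 0);
}

/* Donor is woken up again: restore the runnable signal. */
static bool toy_wake_donor(struct toy_task *t)
{
	return toy_flags_change(t, 0, TOY_TSK_RUNNING);
}

/* Walk one block/wakeup cycle; both transitions should be consistent. */
static void toy_demo(void)
{
	struct toy_task t = { .psi_flags = TOY_TSK_RUNNING };

	assert(toy_retain_blocked_donor(&t));
	assert(toy_wake_donor(&t));
}
```

In this model a wakeup that finds the runnable bit still set is flagged
as inconsistent, which is why the signal is dropped at the point where
the blocked donor is retained.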

-- 
Thanks and Regards,
Prateek



* Re: [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI
  2025-11-20  6:10     ` K Prateek Nayak
@ 2025-11-20  6:22       ` Madadi Vineeth Reddy
  0 siblings, 0 replies; 21+ messages in thread
From: Madadi Vineeth Reddy @ 2025-11-20  6:22 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Johannes Weiner, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Valentin Schneider, Suren Baghdasaryan,
	linux-kernel, Madadi Vineeth Reddy

Hi Prateek,

On 20/11/25 11:40, K Prateek Nayak wrote:
> Hello Vineeth,
> 
> On 11/20/2025 11:29 AM, Madadi Vineeth Reddy wrote:
>> On 18/11/25 00:25, K Prateek Nayak wrote:
>>> commit 1a6151017ee5 ("sched: psi: pass enqueue/dequeue flags to psi
>>> callbacks directly") modified the psi_enqueue() and psi_dequeue()
>>> functions to take the complete enqueue/dequeue flags but left the stubs
>>> for !CONFIG_PSI unaltered.
>>>
>>> Modify the stubs to also accept the flags argument to keep it consistent
>>> with CONFIG_PSI.
>>>
>>> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
>>> ---
>>>  kernel/sched/stats.h | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
>>> index cbf7206b3f9d..3323d773fec3 100644
>>> --- a/kernel/sched/stats.h
>>> +++ b/kernel/sched/stats.h
>>> @@ -221,8 +221,8 @@ static inline void psi_sched_switch(struct task_struct *prev,
>>>  }
>>>  
>>>  #else /* !CONFIG_PSI: */
>>> -static inline void psi_enqueue(struct task_struct *p, bool migrate) {}
>>> -static inline void psi_dequeue(struct task_struct *p, bool migrate) {}
>>> +static inline void psi_enqueue(struct task_struct *p, int flags) {}
>>> +static inline void psi_dequeue(struct task_struct *p, int flags) {}
>>>  static inline void psi_ttwu_dequeue(struct task_struct *p) {}
>>>  static inline void psi_sched_switch(struct task_struct *prev,
>>>  				    struct task_struct *next,
>>
>> Right. The commit that updated the function signature did not update the
>> !CONFIG_PSI stubs accordingly. This patch corrects that.
>>
>> Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
> 
> Thanks for the review.
> 
> P.S. John has an alternate fix for the same at
> https://lore.kernel.org/lkml/20251118055242.4030849-1-jstultz@google.com/

Thanks for pointing this out. While going through the discussion I had seen it
in John's tree, but I hadn't realized it was already posted upstream.

I gave the Reviewed-by since the first two cleanup patches still seem valid
and can go in independently of that alternate fix.

Thanks,
Vineeth

> 
> His fix treats a blocked donor as a runnable entity from the PSI
> standpoint.
> 
> This series takes the other approach: it clears the PSI runnable signals
> when the blocked donor is retained on the rq (skipping block_task()) and
> restores them when the donor is woken up again.
> 




* Re: [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI
  2025-11-17 18:55 ` [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI K Prateek Nayak
  2025-11-18  1:06   ` John Stultz
  2025-11-20  5:59   ` Madadi Vineeth Reddy
@ 2025-12-02 14:32   ` Johannes Weiner
  2 siblings, 0 replies; 21+ messages in thread
From: Johannes Weiner @ 2025-12-02 14:32 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Suren Baghdasaryan, linux-kernel, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider

On Mon, Nov 17, 2025 at 06:55:46PM +0000, K Prateek Nayak wrote:
> commit 1a6151017ee5 ("sched: psi: pass enqueue/dequeue flags to psi
> callbacks directly") modified the psi_enqueue() and psi_dequeue()
> functions to take the complete enqueue/dequeue flags but left the stubs
> for !CONFIG_PSI unaltered.
> 
> Modify the stubs to also accept the flags argument to keep it consistent
> with CONFIG_PSI.
> 
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
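
The stub pattern discussed in this patch can be sketched in standalone C
as follows. All names here (TOY_PSI as a stand-in for CONFIG_PSI,
struct stub_task, the stub_* functions) are hypothetical and exist only
for the example; this is not the kernel code.

```c
#include <assert.h>

struct stub_task {
	int last_flags;
};

#ifdef TOY_PSI
/* Enabled variant: consumes the full flags word. */
static inline void stub_psi_enqueue(struct stub_task *p, int flags)
{
	p->last_flags = flags;
}
#else /* !TOY_PSI */
/* Disabled variant: same signature as above, does nothing. */
static inline void stub_psi_enqueue(struct stub_task *p, int flags)
{
	(void)p;
	(void)flags;
}
#endif

/* The call site is identical in both configurations. */
static void stub_enqueue_task(struct stub_task *p, int flags)
{
	stub_psi_enqueue(p, flags);
}

/* With TOY_PSI undefined the stub must leave the task untouched. */
static void stub_demo(void)
{
	struct stub_task t = { .last_flags = -1 };

	stub_enqueue_task(&t, 0x5);
#ifdef TOY_PSI
	assert(t.last_flags == 0x5);
#else
	assert(t.last_flags == -1);
#endif
}
```

A stub declared with a bool parameter would still compile here, since C
converts the int argument implicitly, so a mismatch of this kind does
not produce a build error on its own.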


* Re: [PATCH 2/5] sched/psi: Prepend "0x" to format specifiers when printing PSI flags
  2025-11-17 18:55 ` [PATCH 2/5] sched/psi: Prepend "0x" to format specifiers when printing PSI flags K Prateek Nayak
  2025-11-18  1:08   ` John Stultz
@ 2025-12-02 14:33   ` Johannes Weiner
  1 sibling, 0 replies; 21+ messages in thread
From: Johannes Weiner @ 2025-12-02 14:33 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	John Stultz, Suren Baghdasaryan, linux-kernel, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider

On Mon, Nov 17, 2025 at 06:55:47PM +0000, K Prateek Nayak wrote:
> It is not immediately clear that the PSI flags, the set, and the clear
> bits printed in PSI warnings are hexadecimal values. Prepend "0x" to
> format specifiers to make it clear.
> 
> Since "kernel/sched" uses "0x%x" as opposed to "%#x" when printing
> hexadecimal values, the same was followed to keep consistency.
> 
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
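
As a side note on the two spellings mentioned in the changelog, the
following userspace sketch compares "0x%x" and "%#x" using the C
library's snprintf(). The fmt_* helper names are made up for the
example, and the kernel's own vsnprintf() is a separate implementation
that is not exercised here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a flags value with an explicit "0x" prefix. */
static void fmt_explicit(char *buf, size_t len, unsigned int flags)
{
	snprintf(buf, len, "0x%x", flags);
}

/* Format the same value with the "#" flag instead. */
static void fmt_alternate(char *buf, size_t len, unsigned int flags)
{
	snprintf(buf, len, "%#x", flags);
}

/* Both spellings agree for a non-zero value. */
static void fmt_demo(void)
{
	char a[16], b[16];

	fmt_explicit(a, sizeof(a), 0x14);
	fmt_alternate(b, sizeof(b), 0x14);
	assert(strcmp(a, "0x14") == 0);
	assert(strcmp(a, b) == 0);
}
```

In standard C the "#" flag only adds the prefix to a non-zero result, so
in userspace the two spellings differ when the value is 0.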


end of thread, newest message: 2025-12-02 14:33 UTC

Thread overview: 21+ messages
2025-11-17 18:55 [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution K Prateek Nayak
2025-11-17 18:55 ` [PATCH 1/5] sched/psi: Make psi stubs consistent for !CONFIG_PSI K Prateek Nayak
2025-11-18  1:06   ` John Stultz
2025-11-20  5:59   ` Madadi Vineeth Reddy
2025-11-20  6:10     ` K Prateek Nayak
2025-11-20  6:22       ` Madadi Vineeth Reddy
2025-12-02 14:32   ` Johannes Weiner
2025-11-17 18:55 ` [PATCH 2/5] sched/psi: Prepend "0x" to format specifiers when printing PSI flags K Prateek Nayak
2025-11-18  1:08   ` John Stultz
2025-12-02 14:33   ` Johannes Weiner
2025-11-17 18:55 ` [RFC PATCH 3/5] sched/core: Track blocked tasks retained on rq for proxy K Prateek Nayak
2025-11-17 20:44   ` kernel test robot
2025-11-18  2:03     ` K Prateek Nayak
2025-11-18  1:46   ` kernel test robot
2025-11-18  4:38   ` K Prateek Nayak
2025-11-17 18:55 ` [RFC PATCH 4/5] sched/core: Block proxy task on pick when blocked_on is cleared before wakeup K Prateek Nayak
2025-11-17 18:55 ` [RFC PATCH 5/5] sched/psi: Fix PSI signals of blocked tasks retained for proxy K Prateek Nayak
2025-11-18  0:45 ` [RFC PATCH 0/5] sched/psi: Fix PSI accounting with proxy execution John Stultz
2025-11-18  1:38   ` K Prateek Nayak
2025-11-18  4:26     ` John Stultz
2025-11-18  5:08       ` K Prateek Nayak
