* [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
@ 2026-04-05 5:50 Mashiro Chen
2026-04-08 15:10 ` Johannes Weiner
0 siblings, 1 reply; 5+ messages in thread
From: Mashiro Chen @ 2026-04-05 5:50 UTC (permalink / raw)
To: hannes, surenb
Cc: peterz, mingo, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid, akpm, linux-kernel,
syzbot+4b1bd55fba6260160779, Mashiro Chen
When PSI is disabled, psi_memstall_enter() returns early without
writing to *flags, leaving the caller's local variable uninitialized.
psi_memstall_leave() also returns early when PSI is disabled and does
not read *flags, so the uninitialized value is never used functionally.
However, KMSAN tracks the shadow and origin metadata per physical
address. When a kernel stack page is subsequently reused, a new object
at the same address inherits the stale KMSAN shadow from the old
uninitialized pflags, causing spurious uninit-value reports in
unrelated code paths such as __flush_smp_call_function_queue().
Initialize *flags to 0 in the psi_disabled early-return path to
prevent the stale shadow from escaping the callers' stack frames.
Fixes: eb414681d5a0 ("psi: pressure stall information for CPU, memory, and IO")
Reported-by: syzbot+4b1bd55fba6260160779@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4b1bd55fba6260160779
Link: https://lore.kernel.org/all/6991885b.050a0220.340abe.02ef.GAE@google.com/
Signed-off-by: Mashiro Chen <mashiro.chen@mailbox.org>
---
kernel/sched/psi.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d9c9d9480a45b..32d3a180fc03b 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1058,8 +1058,10 @@ void psi_memstall_enter(unsigned long *flags)
struct rq_flags rf;
struct rq *rq;
- if (static_branch_likely(&psi_disabled))
+ if (static_branch_likely(&psi_disabled)) {
+ *flags = 0;
return;
+ }
*flags = current->in_memstall;
if (*flags)
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
2026-04-05 5:50 [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled Mashiro Chen
@ 2026-04-08 15:10 ` Johannes Weiner
2026-04-08 16:14 ` Mashiro Chen
0 siblings, 1 reply; 5+ messages in thread
From: Johannes Weiner @ 2026-04-08 15:10 UTC (permalink / raw)
To: Mashiro Chen
Cc: surenb, peterz, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, akpm,
linux-kernel, syzbot+4b1bd55fba6260160779
On Sun, Apr 05, 2026 at 01:50:44PM +0800, Mashiro Chen wrote:
> When PSI is disabled, psi_memstall_enter() returns early without
> writing to *flags, leaving the caller's local variable uninitialized.
> psi_memstall_leave() also returns early when PSI is disabled and does
> not read *flags, so the uninitialized value is never used functionally.
>
> However, KMSAN tracks the shadow and origin metadata per physical
> address. When a kernel stack page is subsequently reused, a new object
> at the same address inherits the stale KMSAN shadow from the old
> uninitialized pflags, causing spurious uninit-value reports in
> unrelated code paths such as __flush_smp_call_function_queue().
>
> Initialize *flags to 0 in the psi_disabled early-return path to
> prevent the stale shadow from escaping the callers' stack frames.
How is this not a kmsan bug?
I don't think putting an unexpected init into unrelated code with no
comment is an appropriate fix for what seems like a false positive in
this tool.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
2026-04-08 15:10 ` Johannes Weiner
@ 2026-04-08 16:14 ` Mashiro Chen
2026-04-08 16:40 ` Johannes Weiner
0 siblings, 1 reply; 5+ messages in thread
From: Mashiro Chen @ 2026-04-08 16:14 UTC (permalink / raw)
To: Johannes Weiner
Cc: surenb, peterz, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, akpm,
linux-kernel, syzbot+4b1bd55fba6260160779
Hi Johannes,
Good question. You're right that KMSAN's stack tracking persisting
across page reuse boundaries is arguably a tool limitation. That said,
I think fixing it on the PSI side is still reasonable:
psi_memstall_enter() takes a pointer parameter with an implicit contract:
if the caller passes &flags, they expect *flags to be initialized upon
return. The current early-return silently violates that contract by
leaving *flags uninitialized, even though the value is never actually used
functionally.
The fix is essentially free (we're already in the early-return path) and
makes the contract explicit. You're right that the original patch lacked
a comment explaining this, I should have added:
/* Initialize to 0 even in psi_disabled case to honor the
* implicit API contract that *flags is initialized on return.
* psi_memstall_leave() also returns early when psi_disabled
* and does not read *flags, so this is zero-cost. */
*flags = 0;
return;
That said, if you prefer this stays in KMSAN (e.g., treating stack
variables as out-of-scope once their frame returns), I'm happy to drop
the patch and redirect the effort there instead.
Thanks for the thoughtful review feedback.
Best,
Mashiro Chen
On 4/8/26 23:10, Johannes Weiner wrote:
> On Sun, Apr 05, 2026 at 01:50:44PM +0800, Mashiro Chen wrote:
>> When PSI is disabled, psi_memstall_enter() returns early without
>> writing to *flags, leaving the caller's local variable uninitialized.
>> psi_memstall_leave() also returns early when PSI is disabled and does
>> not read *flags, so the uninitialized value is never used functionally.
>>
>> However, KMSAN tracks the shadow and origin metadata per physical
>> address. When a kernel stack page is subsequently reused, a new object
>> at the same address inherits the stale KMSAN shadow from the old
>> uninitialized pflags, causing spurious uninit-value reports in
>> unrelated code paths such as __flush_smp_call_function_queue().
>>
>> Initialize *flags to 0 in the psi_disabled early-return path to
>> prevent the stale shadow from escaping the callers' stack frames.
> How is this not a kmsan bug?
>
> I don't think putting an unexpected init into unrelated code with no
> comment is an appropriate fix for what seems like a false positive in
> this tool.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
2026-04-08 16:14 ` Mashiro Chen
@ 2026-04-08 16:40 ` Johannes Weiner
2026-04-08 16:58 ` Mashiro Chen
0 siblings, 1 reply; 5+ messages in thread
From: Johannes Weiner @ 2026-04-08 16:40 UTC (permalink / raw)
To: Mashiro Chen
Cc: surenb, peterz, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, akpm,
linux-kernel, syzbot+4b1bd55fba6260160779
On Thu, Apr 09, 2026 at 12:14:50AM +0800, Mashiro Chen wrote:
> Hi Johannes,
>
> Good question. You're right that KMSAN's stack tracking persisting
> across page reuse boundaries is arguably a tool limitation. That said,
> I think fixing it on the PSI side is still reasonable:
>
> psi_memstall_enter() takes a pointer parameter with an implicit contract:
> if the caller passes &flags, they expect *flags to be initialized upon
> return. The current early-return silently violates that contract by
> leaving *flags uninitialized, even though the value is never actually used
> functionally.
The caller has no expectations towards the contents of *flags and no
business reading or manipulating them. It's an opaque channel that
lets _enter() communicate with _leave().
> The fix is essentially free (we're already in the early-return path) and
> makes the contract explicit. You're right that the original patch lacked
> a comment explaining this, I should have added:
>
> /* Initialize to 0 even in psi_disabled case to honor the
> * implicit API contract that *flags is initialized on return.
> * psi_memstall_leave() also returns early when psi_disabled
> * and does not read *flags, so this is zero-cost. */
> *flags = 0;
> return;
>
> That said, if you prefer this stays in KMSAN (e.g., treating stack
> variables as out-of-scope once their frame returns), I'm happy to drop
> the patch and redirect the effort there instead.
It sounds to me like this would be a good thing to fix regardless of
what psi is doing here. Even if psi initialized it to some value that
is meaningful to psi - that value is totally random, and for all
intents and purposes "uninitialized", from the view of a subsequent
user of that stack slot?
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled
2026-04-08 16:40 ` Johannes Weiner
@ 2026-04-08 16:58 ` Mashiro Chen
0 siblings, 0 replies; 5+ messages in thread
From: Mashiro Chen @ 2026-04-08 16:58 UTC (permalink / raw)
To: Johannes Weiner
Cc: surenb, peterz, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, akpm,
linux-kernel, syzbot+4b1bd55fba6260160779
Hi Johannes,
You're right on both counts. The 'opaque channel' framing makes it
clear there's no meaningful API contract being violated here -- the
caller is not supposed to interpret *flags at all.
And yes, your second point is exactly the real issue: once a stack
frame returns, its local variables should be considered dead. KMSAN
tracking that shadow across page reuse into an unrelated frame is the
actual bug.
I'll drop this patch. The correct fix is in KMSAN -- it should treat
stack slots as out-of-scope once their owning frame returns, rather
than letting stale shadow metadata escape into subsequent users of
the same physical address.
Thanks for the clear explanation.
Best,
Mashiro Chen
On 4/9/26 00:40, Johannes Weiner wrote:
> On Thu, Apr 09, 2026 at 12:14:50AM +0800, Mashiro Chen wrote:
>> Hi Johannes,
>>
>> Good question. You're right that KMSAN's stack tracking persisting
>> across page reuse boundaries is arguably a tool limitation. That said,
>> I think fixing it on the PSI side is still reasonable:
>>
>> psi_memstall_enter() takes a pointer parameter with an implicit contract:
>> if the caller passes &flags, they expect *flags to be initialized upon
>> return. The current early-return silently violates that contract by
>> leaving *flags uninitialized, even though the value is never actually used
>> functionally.
> The caller has no expectations towards the contents of *flags and no
> business reading or manipulating them. It's an opaque channel that
> lets _enter() communicate with _leave().
>
>> The fix is essentially free (we're already in the early-return path) and
>> makes the contract explicit. You're right that the original patch lacked
>> a comment explaining this, I should have added:
>>
>> /* Initialize to 0 even in psi_disabled case to honor the
>> * implicit API contract that *flags is initialized on return.
>> * psi_memstall_leave() also returns early when psi_disabled
>> * and does not read *flags, so this is zero-cost. */
>> *flags = 0;
>> return;
>>
>> That said, if you prefer this stays in KMSAN (e.g., treating stack
>> variables as out-of-scope once their frame returns), I'm happy to drop
>> the patch and redirect the effort there instead.
> It sounds to me like this would be a good thing to fix regardless of
> what psi is doing here. Even if psi initialized it to some value that
> is meaningful to psi - that value is totally random, and for all
> intents and purposes "uninitialized", from the view of a subsequent
> user of that stack slot?
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-04-08 16:58 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-05 5:50 [PATCH] sched/psi: initialize *flags in psi_memstall_enter when PSI is disabled Mashiro Chen
2026-04-08 15:10 ` Johannes Weiner
2026-04-08 16:14 ` Mashiro Chen
2026-04-08 16:40 ` Johannes Weiner
2026-04-08 16:58 ` Mashiro Chen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox