* [PATCH] sched: Fix race in rt_mutex_pre_schedule by removing non-atomic fetch_and_set
@ 2025-08-26 11:08 cuiguoqi
2025-08-26 13:56 ` Steven Rostedt
0 siblings, 1 reply; 4+ messages in thread
From: cuiguoqi @ 2025-08-26 11:08 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, Sebastian Andrzej Siewior, Clark Williams,
guoqi0226, linux-kernel, linux-rt-devel, cuiguoqi
During Wound/Wait testing on PREEMPT_RT, a WARNING was hit:
WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:7085 rt_mutex_pre_schedule+0xa8/0x108
Call trace:
rt_mutex_pre_schedule+0xa8/0x108
__ww_rt_mutex_lock+0x1d4/0x300
ww_mutex_lock+0x1c/0x30
The issue stems from the non-atomic `fetch_and_set` macro:
#define fetch_and_set(x, v) ({ int _x = (x); (x) = (v); _x; })
It lacks atomicity and memory ordering, leading to race conditions under
preemption or interrupts, where `current->sched_rt_mutex` may be corrupted.
Since this flag is only used for lockdep assertions and accessed per-task,
replace the unsafe macro with direct assignment and explicit state checks:
- In rt_mutex_pre_schedule(): assert is 0 before setting to 1.
- In rt_mutex_post_schedule(): assert is 1 before clearing to 0.
This fixes the false-positive warning without needing atomic operations.
Signed-off-by: cuiguoqi <cuiguoqi@kylinos.cn>
---
kernel/sched/core.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5e89a6eeadba..fb4c446e46f7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7078,11 +7078,11 @@ const struct sched_class *__setscheduler_class(struct task_struct *p, int prio)
* name such that if someone were to implement this function we get to compare
* notes.
*/
-#define fetch_and_set(x, v) ({ int _x = (x); (x) = (v); _x; })
void rt_mutex_pre_schedule(void)
{
- lockdep_assert(!fetch_and_set(current->sched_rt_mutex, 1));
+ lockdep_assert(!current->sched_rt_mutex);
+ current->sched_rt_mutex = 1;
sched_submit_work(current);
}
@@ -7095,7 +7095,9 @@ void rt_mutex_schedule(void)
void rt_mutex_post_schedule(void)
{
sched_update_worker(current);
- lockdep_assert(fetch_and_set(current->sched_rt_mutex, 0));
+ lockdep_assert(current->sched_rt_mutex);
+ current->sched_rt_mutex = 0;
+
}
/*
--
2.25.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] sched: Fix race in rt_mutex_pre_schedule by removing non-atomic fetch_and_set
2025-08-26 11:08 [PATCH] sched: Fix race in rt_mutex_pre_schedule by removing non-atomic fetch_and_set cuiguoqi
@ 2025-08-26 13:56 ` Steven Rostedt
2025-08-26 14:16 ` Sebastian Andrzej Siewior
2025-08-27 8:17 ` cuiguoqi
0 siblings, 2 replies; 4+ messages in thread
From: Steven Rostedt @ 2025-08-26 13:56 UTC (permalink / raw)
To: cuiguoqi
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Ben Segall, Mel Gorman, Valentin Schneider,
Sebastian Andrzej Siewior, Clark Williams, guoqi0226,
linux-kernel, linux-rt-devel
On Tue, 26 Aug 2025 19:08:33 +0800
cuiguoqi <cuiguoqi@kylinos.cn> wrote:
> @@ -7078,11 +7078,11 @@ const struct sched_class *__setscheduler_class(struct task_struct *p, int prio)
> * name such that if someone were to implement this function we get to compare
> * notes.
> */
> -#define fetch_and_set(x, v) ({ int _x = (x); (x) = (v); _x; })
>
> void rt_mutex_pre_schedule(void)
> {
> - lockdep_assert(!fetch_and_set(current->sched_rt_mutex, 1));
> + lockdep_assert(!current->sched_rt_mutex);
> + current->sched_rt_mutex = 1;
> sched_submit_work(current);
> }
>
> @@ -7095,7 +7095,9 @@ void rt_mutex_schedule(void)
> void rt_mutex_post_schedule(void)
> {
> sched_update_worker(current);
> - lockdep_assert(fetch_and_set(current->sched_rt_mutex, 0));
> + lockdep_assert(current->sched_rt_mutex);
> + current->sched_rt_mutex = 0;
> +
> }
Honestly, without adding a READ_ONCE() or barrier() I don't see how your
change would prevent the compiler from making the code any different?
It may have "fixed" your issue because the compiler may have done things
differently, but this change doesn't guarantee anything.
Also, could you show an example of how current->sched_rt_mutex could be
corrupted?
-- Steve
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] sched: Fix race in rt_mutex_pre_schedule by removing non-atomic fetch_and_set
2025-08-26 13:56 ` Steven Rostedt
@ 2025-08-26 14:16 ` Sebastian Andrzej Siewior
2025-08-27 8:17 ` cuiguoqi
1 sibling, 0 replies; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-08-26 14:16 UTC (permalink / raw)
To: Steven Rostedt
Cc: cuiguoqi, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Vincent Guittot, Dietmar Eggemann, Ben Segall, Mel Gorman,
Valentin Schneider, Clark Williams, guoqi0226, linux-kernel,
linux-rt-devel
On 2025-08-26 09:56:15 [-0400], Steven Rostedt wrote:
> Honestly, without adding a READ_ONCE() or barrier() I don't see how your
> change would prevent the compiler from making the code any different?
Other than that, that flag is supposed to be only set/ cleared by the
thread itself. There should be no need for it to be atomic.
> It may have "fixed" your issue because the compiler may have done things
> differently, but this change doesn't guarantee anything.
>
> Also, could you show an example of how current->sched_rt_mutex could be
> corrupted?
That would be important.
> -- Steve
Sebastian
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] sched: Fix race in rt_mutex_pre_schedule by removing non-atomic fetch_and_set
2025-08-26 13:56 ` Steven Rostedt
2025-08-26 14:16 ` Sebastian Andrzej Siewior
@ 2025-08-27 8:17 ` cuiguoqi
1 sibling, 0 replies; 4+ messages in thread
From: cuiguoqi @ 2025-08-27 8:17 UTC (permalink / raw)
To: rostedt
Cc: bigeasy, bsegall, clrkwllms, cuiguoqi, dietmar.eggemann,
guoqi0226, juri.lelli, linux-kernel, linux-rt-devel, mgorman,
mingo, peterz, vincent.guittot, vschneid
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 1937 bytes --]
The issue arises during EDEADLK testing in `lib/locking-selftest.c` when `is_wait_die=1`.
In this mode, the current thread's `debug_locks` flag is disabled via `__debug_locks_off` (which calls `xchg(&debug_locks, 0)`) during the blocking path of `rt_mutex_slowlock`, specifically in `rt_mutex_slowlock_block()`:
rt_mutex_slowlock()
rt_mutex_pre_schedule()
rt_mutex_slowlock_block()
DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock)
__debug_locks_off(); // xchg(&debug_locks, 0)
However, `rt_mutex_post_schedule()` still performs:
lockdep_assert(fetch_and_set(current->sched_rt_mutex, 0));
Which expands to:
do {
WARN_ON(debug_locks && !( ({ int _x = current->sched_rt_mutex; current->sched_rt_mutex = 0; _x; }) ));
} while (0)
The generated assembly shows that the entire assertion is conditional on `debug_locks`:
adrp x0, debug_locks
ldr w0, [x0]
cbz w0, .label_skip_warn // Skip WARN if debug_locks == 0
This means: if `debug_locks` was cleared earlier, the check on `current->sched_rt_mutex` is effectively skipped, and the flag may remain set.
Later, when the same task re-enters `rt_mutex_slowlock`, it calls `lockdep_reset()` to re-enable `debug_locks`, but the stale `current->sched_rt_mutex` state (left over from the previous lock attempt) causes a false-positive warning in `rt_mutex_pre_schedule()`:
WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:7085 rt_mutex_pre_schedule+0xa8/0x108
Because:
- `rt_mutex_pre_schedule()` asserts `!current->sched_rt_mutex`
- But the flag was never properly cleared due to the skipped post-schedule check.
This is not a data race on the flag itself, but a **state inconsistency caused by conditional debugging logic** — the `fetch_and_set` macro is not atomic, but more importantly, the assertion is bypassed when `debug_locks` is off, breaking the expected state transition.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-08-27 8:18 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-26 11:08 [PATCH] sched: Fix race in rt_mutex_pre_schedule by removing non-atomic fetch_and_set cuiguoqi
2025-08-26 13:56 ` Steven Rostedt
2025-08-26 14:16 ` Sebastian Andrzej Siewior
2025-08-27 8:17 ` cuiguoqi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).