* [PATCH] sched/deadline: Fix potential race in dl_add_task_root_domain()
From: Pingfan Liu @ 2025-11-24 3:34 UTC
To: Tejun Heo, cgroups, linux-kernel
Cc: Pingfan Liu, Juri Lelli, Waiman Long, Chen Ridong, Peter Zijlstra,
Pierre Gondois, Ingo Molnar, Vincent Guittot, Dietmar Eggemann,
Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider
The access rule for local_cpu_mask_dl requires it to be accessed on the
local CPU with preemption disabled. However, dl_add_task_root_domain()
currently violates this rule.

Without preemption disabled, the following race can occur:

1. ThreadA calls dl_add_task_root_domain() on CPU 0
2. Gets pointer to CPU 0's local_cpu_mask_dl
3. ThreadA is preempted and migrated to CPU 1
4. ThreadA continues using CPU 0's local_cpu_mask_dl
5. Meanwhile, the scheduler on CPU 0 calls find_later_rq() which also
   uses local_cpu_mask_dl (with preemption properly disabled)
6. Both contexts now corrupt the same per-CPU buffer concurrently

Fix this by moving the local_cpu_mask_dl access to the preemption
disabled section.
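
The ordering problem in the steps above can be modelled in userspace. The
following is a minimal single-threaded sketch, not kernel code: scratch_mask,
current_cpu, this_cpu_mask() and migrate_to() are hypothetical stand-ins for
the per-CPU local_cpu_mask_dl, the CPU the task runs on,
this_cpu_cpumask_var_ptr() and a preemption followed by migration. It only
shows that a pointer looked up before the task is pinned can refer to
another CPU's buffer, while a lookup made after pinning cannot.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-in for a per-CPU scratch mask such as local_cpu_mask_dl. */
static unsigned long scratch_mask[NR_CPUS];

/* Models the CPU the calling thread is currently running on. */
static int current_cpu;

/* Stand-in for this_cpu_cpumask_var_ptr(): the current CPU's buffer. */
static unsigned long *this_cpu_mask(void)
{
	return &scratch_mask[current_cpu];
}

/* Models the thread being preempted and migrated to another CPU. */
static void migrate_to(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	current_cpu = cpu;
}

/*
 * Old ordering: the pointer is taken while migration is still possible.
 * Returns true if the pointer belongs to the CPU we end up running on.
 */
static bool early_lookup_is_local(void)
{
	unsigned long *msk;

	current_cpu = 0;
	msk = this_cpu_mask();	/* looked up on CPU 0 */
	migrate_to(1);		/* migrated before preemption is disabled */
	*msk = 0x1;		/* writes CPU 0's buffer while running on CPU 1 */
	return msk == this_cpu_mask();
}

/* New ordering: the lookup happens only once migration is ruled out. */
static bool late_lookup_is_local(void)
{
	unsigned long *msk;

	current_cpu = 0;
	migrate_to(1);		/* any migration happens before this point */
	/* assumed pinned from here on (in the patch: pi_lock held, IRQs off) */
	msk = this_cpu_mask();
	*msk = 0x1;		/* writes the buffer of the CPU we are on */
	return msk == this_cpu_mask();
}
```

In this model early_lookup_is_local() returns false and
late_lookup_is_local() returns true. The real race additionally needs a
concurrent user of the buffer on CPU 0, which the sketch does not model.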
Closes: https://lore.kernel.org/lkml/aSBjm3mN_uIy64nz@jlelli-thinkpadt14gen4.remote.csb
Fixes: 318e18ed22e8 ("sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug")
Reported-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Pingfan Liu <piliu@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Chen Ridong <chenridong@huaweicloud.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Pierre Gondois <pierre.gondois@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Valentin Schneider <vschneid@redhat.com>
To: cgroups@vger.kernel.org
To: linux-kernel@vger.kernel.org
---
kernel/sched/deadline.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 194a341e85864..e9153e86de0a7 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2944,7 +2944,7 @@ void dl_add_task_root_domain(struct task_struct *p)
 	struct rq *rq;
 	struct dl_bw *dl_b;
 	unsigned int cpu;
-	struct cpumask *msk = this_cpu_cpumask_var_ptr(local_cpu_mask_dl);
+	struct cpumask *msk;
 
 	raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
 	if (!dl_task(p) || dl_entity_is_special(&p->dl)) {
@@ -2952,6 +2952,7 @@ void dl_add_task_root_domain(struct task_struct *p)
 		return;
 	}
 
+	msk = this_cpu_cpumask_var_ptr(local_cpu_mask_dl);
 	/*
 	 * Get an active rq, whose rq->rd traces the correct root
 	 * domain.
--
2.49.0
* Re: [PATCH] sched/deadline: Fix potential race in dl_add_task_root_domain()
From: Waiman Long @ 2025-11-24 4:31 UTC
To: Pingfan Liu, Tejun Heo, cgroups, linux-kernel
Cc: Juri Lelli, Chen Ridong, Peter Zijlstra, Pierre Gondois,
Ingo Molnar, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
Ben Segall, Mel Gorman, Valentin Schneider
On 11/23/25 10:34 PM, Pingfan Liu wrote:
> The access rule for local_cpu_mask_dl requires it to be accessed on the
> local CPU with preemption disabled. However, dl_add_task_root_domain()
> currently violates this rule.
>
> Without preemption disabled, the following race can occur:
>
> 1. ThreadA calls dl_add_task_root_domain() on CPU 0
> 2. Gets pointer to CPU 0's local_cpu_mask_dl
> 3. ThreadA is preempted and migrated to CPU 1
> 4. ThreadA continues using CPU 0's local_cpu_mask_dl
> 5. Meanwhile, the scheduler on CPU 0 calls find_later_rq() which also
> uses local_cpu_mask_dl (with preemption properly disabled)
> 6. Both contexts now corrupt the same per-CPU buffer concurrently
>
> Fix this by moving the local_cpu_mask_dl access to the preemption
> disabled section.
>
> Closes: https://lore.kernel.org/lkml/aSBjm3mN_uIy64nz@jlelli-thinkpadt14gen4.remote.csb
> Fixes: 318e18ed22e8 ("sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug")
> Reported-by: Juri Lelli <juri.lelli@redhat.com>
> Signed-off-by: Pingfan Liu <piliu@redhat.com>
> To: Tejun Heo <tj@kernel.org>
> Cc: Waiman Long <longman@redhat.com>
> Cc: Chen Ridong <chenridong@huaweicloud.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Pierre Gondois <pierre.gondois@arm.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ben Segall <bsegall@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Valentin Schneider <vschneid@redhat.com>
> To: cgroups@vger.kernel.org
> To: linux-kernel@vger.kernel.org
> ---
> kernel/sched/deadline.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 194a341e85864..e9153e86de0a7 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2944,7 +2944,7 @@ void dl_add_task_root_domain(struct task_struct *p)
>  	struct rq *rq;
>  	struct dl_bw *dl_b;
>  	unsigned int cpu;
> -	struct cpumask *msk = this_cpu_cpumask_var_ptr(local_cpu_mask_dl);
> +	struct cpumask *msk;
>
>  	raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
>  	if (!dl_task(p) || dl_entity_is_special(&p->dl)) {
> @@ -2952,6 +2952,7 @@ void dl_add_task_root_domain(struct task_struct *p)
>  		return;
>  	}
>
> +	msk = this_cpu_cpumask_var_ptr(local_cpu_mask_dl);
>  	/*
>  	 * Get an active rq, whose rq->rd traces the correct root
>  	 * domain.
It would be clearer to move the statement down to just before the
dl_get_task_effective_cpus() call that uses msk. Please also update the
comment as suggested by Juri.
Thanks,
Longman