From: Pingfan Liu <piliu@redhat.com>
To: Juri Lelli <juri.lelli@redhat.com>
Cc: linux-kernel@vger.kernel.org, Waiman Long <longman@redhat.com>,
Chen Ridong <chenridong@huaweicloud.com>,
Peter Zijlstra <peterz@infradead.org>,
Pierre Gondois <pierre.gondois@arm.com>,
Ingo Molnar <mingo@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Tejun Heo <tj@kernel.org>, Johannes Weiner <hannes@cmpxchg.org>,
mkoutny@suse.com
Subject: Re: [PATCHv7 2/2] sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug
Date: Mon, 24 Nov 2025 09:45:16 +0800 [thread overview]
Message-ID: <aSO4rMm59Z68n6EI@fedora> (raw)
In-Reply-To: <aSBjm3mN_uIy64nz@jlelli-thinkpadt14gen4.remote.csb>
On Fri, Nov 21, 2025 at 02:05:31PM +0100, Juri Lelli wrote:
> Hi!
>
> On 19/11/25 17:55, Pingfan Liu wrote:
>
> ...
>
> > +/* Access rule: must be called on local CPU with preemption disabled */
> > static DEFINE_PER_CPU(cpumask_var_t, local_cpu_mask_dl);
>
> ...
>
> > +/* The caller should hold cpuset_mutex */
>
> Maybe we can add a lockdep explicit check?
>
Currently, all cpuset locks are encapsulated in
kernel/cgroup/cpuset-internal.h. I'm not sure if it's appropriate to
expose them. If exposing them is acceptable,
cpuset_callback_lock_irq()/cpuset_callback_unlock_irq() would be
preferable to cpuset_mutex assertion.
@Waiman, @Ridong, could you kindly share your opinion?
> > void dl_add_task_root_domain(struct task_struct *p)
> > {
> > struct rq_flags rf;
> > struct rq *rq;
> > struct dl_bw *dl_b;
> > + unsigned int cpu;
> > + struct cpumask *msk = this_cpu_cpumask_var_ptr(local_cpu_mask_dl);
>
> Can this corrupt local_cpu_mask_dl?
>
> Without preemption being disabled, the following race can occur:
>
> 1. Thread calls dl_add_task_root_domain() on CPU 0
> 2. Gets pointer to CPU 0's local_cpu_mask_dl
> 3. Thread is preempted and migrated to CPU 1
> 4. Thread continues using CPU 0's local_cpu_mask_dl
> 5. Meanwhile, the scheduler on CPU 0 calls find_later_rq() which also
> uses local_cpu_mask_dl (with preemption properly disabled)
> 6. Both contexts now corrupt the same per-CPU buffer concurrently
>
Oh, that is definitely an issue. Thanks for pointing it out.
> >
> > raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
>
> It's safe to get the pointer after this point.
>
Yes.
> > if (!dl_task(p) || dl_entity_is_special(&p->dl)) {
> > @@ -2919,16 +2952,25 @@ void dl_add_task_root_domain(struct task_struct *p)
> > return;
> > }
> >
> > - rq = __task_rq_lock(p, &rf);
> > -
> > + /*
> > + * Get an active rq, whose rq->rd traces the correct root
> > + * domain.
> > + * Ideally this would be under cpuset reader lock until rq->rd is
> > + * fetched. However, sleepable locks cannot nest inside pi_lock, so we
> > + * rely on the caller of dl_add_task_root_domain() holds 'cpuset_mutex'
> > + * to guarantee the CPU stays in the cpuset.
> > + */
> > + dl_get_task_effective_cpus(p, msk);
> > + cpu = cpumask_first_and(cpu_active_mask, msk);
> > + BUG_ON(cpu >= nr_cpu_ids);
> > + rq = cpu_rq(cpu);
> > dl_b = &rq->rd->dl_bw;
> > - raw_spin_lock(&dl_b->lock);
> > + /* End of fetching rd */
>
> Not sure we need this comment above. :)
>
OK, I can remove them to keep the code neat.
Thanks,
Pingfan
next prev parent reply other threads:[~2025-11-24 1:45 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-19 9:55 [PATCHv7 0/2] sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug Pingfan Liu
2025-11-19 9:55 ` [PATCHv7 1/2] cgroup/cpuset: Introduce cpuset_cpus_allowed_locked() Pingfan Liu
2025-11-19 20:51 ` Waiman Long
2025-11-20 1:12 ` Chen Ridong
2025-11-19 9:55 ` [PATCHv7 2/2] sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug Pingfan Liu
2025-11-21 13:05 ` Juri Lelli
2025-11-24 1:45 ` Pingfan Liu [this message]
2025-11-24 2:24 ` Waiman Long
2025-11-24 3:56 ` Pingfan Liu
2025-11-24 4:24 ` Waiman Long
2025-11-25 3:26 ` [PATCHv2 0/2] sched/deadline: Fix potential race in dl_add_task_root_domain() Pingfan Liu
2025-11-25 3:26 ` [PATCHv2 1/2] sched/deadline: Remove unnecessary comment " Pingfan Liu
2026-01-13 10:43 ` [tip: sched/urgent] " tip-bot2 for Pingfan Liu
2025-11-25 3:26 ` [PATCHv2 2/2] sched/deadline: Fix potential race " Pingfan Liu
2026-01-13 10:43 ` [tip: sched/urgent] " tip-bot2 for Pingfan Liu
2025-11-25 8:14 ` [PATCHv2 0/2] " Juri Lelli
2026-01-07 19:57 ` Waiman Long
2026-01-08 11:42 ` Peter Zijlstra
2026-01-08 12:02 ` Pingfan Liu
2025-11-25 14:53 ` Waiman Long
2025-11-20 17:00 ` [PATCHv7 0/2] sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aSO4rMm59Z68n6EI@fedora \
--to=piliu@redhat.com \
--cc=bsegall@google.com \
--cc=chenridong@huaweicloud.com \
--cc=dietmar.eggemann@arm.com \
--cc=hannes@cmpxchg.org \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=mkoutny@suse.com \
--cc=peterz@infradead.org \
--cc=pierre.gondois@arm.com \
--cc=rostedt@goodmis.org \
--cc=tj@kernel.org \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.