* [PATCH v3 0/2] cgroup/cpuset: fix DL attach accounting
From: Guopeng Zhang @ 2026-05-09 10:20 UTC
To: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong
Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups, Guopeng Zhang

Hi,

This v3 series contains two cpuset fixes for SCHED_DEADLINE attach
accounting.

Patch 1 fixes an internal cpuset_can_attach() failure path where
temporary DL migration state can be left behind if a later per-task
check fails before cpuset marks attach_in_progress.

Patch 2 keeps cpuset DL bandwidth reservation aligned with the condition
used by set_cpus_allowed_dl() for source-side bandwidth removal. It keeps
counting all migrating DL tasks for cpuset task accounting, but reserves
destination DL bandwidth only for tasks that actually need a root-domain
bandwidth move.

Guopeng Zhang (2):
  cgroup/cpuset: reset DL migration state on can_attach() failure
  cgroup/cpuset: reserve DL bandwidth only for root-domain moves

 include/linux/sched/deadline.h  |  9 ++++++++
 kernel/cgroup/cpuset-internal.h |  1 +
 kernel/cgroup/cpuset.c          | 39 ++++++++++++++++++---------------
 kernel/sched/deadline.c         | 13 ++++++++---
 4 files changed, 41 insertions(+), 21 deletions(-)

---
Changes in v3:
- Patch 1: use common ret != 0 cleanup in cpuset_can_attach(), as
  suggested by Waiman Long and Chen Ridong.
- Patch 2: drop task_cpu_possible_mask() / attach-target-mask handling
  as suggested by Waiman Long.
- Patch 2: keep the change limited to reserving DL bandwidth only for
  tasks that need a root-domain bandwidth move.
- Leave the broader can_attach()/attach() transaction model unchanged.

Changes in v2:
- Split the original change into two patches.
- Add a separate fix for resetting pending DL migration state on
  cpuset_can_attach() failure.
- Clarify that nr_migrate_dl_tasks counts all migrating DL tasks for
  cpuset task accounting, while sum_migrate_dl_bw only tracks bandwidth
  needing destination root-domain reservation.

v2: https://lore.kernel.org/all/20260507103310.35849-1-zhangguopeng@kylinos.cn/
v1: https://lore.kernel.org/all/20260421083449.95750-1-zhangguopeng@kylinos.cn

-- 
2.43.0

* [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Guopeng Zhang @ 2026-05-09 10:20 UTC
To: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong
Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups, Guopeng Zhang

cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
state in the destination cpuset while walking the taskset.

If a later task_can_attach() or security_task_setscheduler() check
fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
and does not call cpuset_cancel_attach() for it. The partially
accumulated state is then left behind and can be consumed by a later
attach, corrupting cpuset DL task accounting and pending DL bandwidth
accounting.

Reset the pending DL migration state from the common error exit when
ret is non-zero. Successful can_attach() keeps the state for
cpuset_attach() or cpuset_cancel_attach().

Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
---
 kernel/cgroup/cpuset.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index e3a081a07c6d..b9c839538900 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3050,16 +3050,13 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 		int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
 
 		if (unlikely(cpu >= nr_cpu_ids)) {
-			reset_migrate_dl_data(cs);
 			ret = -EINVAL;
 			goto out_unlock;
 		}
 
 		ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
-		if (ret) {
-			reset_migrate_dl_data(cs);
+		if (ret)
 			goto out_unlock;
-		}
 
 		cs->dl_bw_cpu = cpu;
 	}
@@ -3070,7 +3067,10 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 	 * changes which zero cpus/mems_allowed.
 	 */
 	cs->attach_in_progress++;
+
 out_unlock:
+	if (ret)
+		reset_migrate_dl_data(cs);
 	mutex_unlock(&cpuset_mutex);
 	return ret;
 }
-- 
2.43.0
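
Editorial sketch, not part of the patch: the failure mode and the fix can be modelled in plain user-space C. Everything below is a hypothetical stand-in. struct cs_model, struct task_model, can_attach_model() and reset_migrate_dl_data_model() only imitate struct cpuset, the taskset walk, cpuset_can_attach() and reset_migrate_dl_data(), and the check_fails flag is an assumed way to model a failing task_can_attach() or security_task_setscheduler() check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the pending DL migration state in struct cpuset. */
struct cs_model {
	int nr_migrate_dl_tasks;
	uint64_t sum_migrate_dl_bw;
	int attach_in_progress;
};

/* Hypothetical stand-in for one task in the taskset being attached. */
struct task_model {
	bool is_dl;		/* models dl_task(task) */
	uint64_t dl_bw;		/* models task->dl.dl_bw */
	bool check_fails;	/* models a failing per-task check */
};

/* Drop whatever was accumulated for the attach that is being abandoned. */
static void reset_migrate_dl_data_model(struct cs_model *cs)
{
	cs->nr_migrate_dl_tasks = 0;
	cs->sum_migrate_dl_bw = 0;
}

/*
 * Walk the tasks, accumulate pending DL state, and reset it from the
 * single error exit whenever ret is non-zero. On success the state is
 * kept for the later attach or cancel step.
 */
static int can_attach_model(struct cs_model *cs,
			    const struct task_model *tasks, int n)
{
	int ret = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (tasks[i].check_fails) {
			ret = -EPERM;
			goto out;
		}
		if (tasks[i].is_dl) {
			cs->nr_migrate_dl_tasks++;
			cs->sum_migrate_dl_bw += tasks[i].dl_bw;
		}
	}

	cs->attach_in_progress++;

out:
	if (ret)
		reset_migrate_dl_data_model(cs);
	return ret;
}
```

With this shape, a taskset whose first task is a DL task and whose second task fails its check leaves both counters at zero, so a later attach starts from clean state. Without the reset at the out label, the first task's count and bandwidth would survive the failed call.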

* Re: [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Chen Ridong @ 2026-05-11 2:48 UTC
To: Guopeng Zhang, Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli
Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups

On 2026/5/9 18:20, Guopeng Zhang wrote:
> cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
> state in the destination cpuset while walking the taskset.
>
> If a later task_can_attach() or security_task_setscheduler() check
> fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
> and does not call cpuset_cancel_attach() for it. The partially
> accumulated state is then left behind and can be consumed by a later
> attach, corrupting cpuset DL task accounting and pending DL bandwidth
> accounting.
>
> Reset the pending DL migration state from the common error exit when
> ret is non-zero. Successful can_attach() keeps the state for
> cpuset_attach() or cpuset_cancel_attach().
>
> [...]

LGTM. Thanks.

Reviewed-by: Chen Ridong <chenridong@huaweicloud.com>

-- 
Best regards,
Ridong

* Re: [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Waiman Long @ 2026-05-11 5:04 UTC
To: Guopeng Zhang, Tejun Heo, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong
Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups

On 5/9/26 6:20 AM, Guopeng Zhang wrote:
> cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
> state in the destination cpuset while walking the taskset.
>
> If a later task_can_attach() or security_task_setscheduler() check
> fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
> and does not call cpuset_cancel_attach() for it. The partially
> accumulated state is then left behind and can be consumed by a later
> attach, corrupting cpuset DL task accounting and pending DL bandwidth
> accounting.
>
> Reset the pending DL migration state from the common error exit when
> ret is non-zero. Successful can_attach() keeps the state for
> cpuset_attach() or cpuset_cancel_attach().
>
> [...]

Reviewed-by: Waiman Long <longman@redhat.com>

* Re: [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Tejun Heo @ 2026-05-11 8:18 UTC
To: Guopeng Zhang, Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong
Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups

Hello,

Applied 1/2 to cgroup/for-7.1-fixes with Cc: stable@vger.kernel.org # v6.10+.

Thanks.

-- 
tejun

* Re: [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Juri Lelli @ 2026-05-11 9:17 UTC
To: Guopeng Zhang
Cc: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong, Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups

On Sat, 09 May 2026 18:20:30 +0800, Guopeng Zhang <zhangguopeng@kylinos.cn> wrote:
> cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
> state in the destination cpuset while walking the taskset.
>
> If a later task_can_attach() or security_task_setscheduler() check
> fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
> and does not call cpuset_cancel_attach() for it. The partially
> accumulated state is then left behind and can be consumed by a later
> attach, corrupting cpuset DL task accounting and pending DL bandwidth
> accounting.
>
> [...]

Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>

-- 
Juri Lelli <juri.lelli@redhat.com>

* [PATCH v3 2/2] cgroup/cpuset: reserve DL bandwidth only for root-domain moves
From: Guopeng Zhang @ 2026-05-09 10:20 UTC
To: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong
Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups, Guopeng Zhang

cpuset_can_attach() currently adds the bandwidth of all migrating
SCHED_DEADLINE tasks to sum_migrate_dl_bw. If the source and destination
cpuset effective CPU masks do not overlap, the whole sum is then
reserved in the destination root domain.

set_cpus_allowed_dl(), however, subtracts bandwidth from the source
root domain only when the affinity change really moves the task between
root domains. A DL task can move between cpusets that are still in the
same root domain, so including that task in sum_migrate_dl_bw can reserve
destination bandwidth without a matching source-side subtraction.

Share the root-domain move test with set_cpus_allowed_dl(). Keep
nr_migrate_dl_tasks counting all migrating deadline tasks for cpuset DL
task accounting, but add to sum_migrate_dl_bw only for tasks that need a
root-domain bandwidth move. Keep using the destination cpuset effective
CPU mask and leave the broader can_attach()/attach() transaction model
unchanged.

Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
---
 include/linux/sched/deadline.h  |  9 +++++++++
 kernel/cgroup/cpuset-internal.h |  1 +
 kernel/cgroup/cpuset.c          | 33 ++++++++++++++++++---------------
 kernel/sched/deadline.c         | 13 ++++++++++---
 4 files changed, 38 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
index 1198138cb839..273538200a44 100644
--- a/include/linux/sched/deadline.h
+++ b/include/linux/sched/deadline.h
@@ -33,6 +33,15 @@ struct root_domain;
 extern void dl_add_task_root_domain(struct task_struct *p);
 extern void dl_clear_root_domain(struct root_domain *rd);
 extern void dl_clear_root_domain_cpu(int cpu);
+/*
+ * Return whether moving DL task @p to @new_mask requires moving DL
+ * bandwidth accounting between root domains. This helper is specific to
+ * DL bandwidth move accounting semantics and is shared by
+ * cpuset_can_attach() and set_cpus_allowed_dl() so both paths use the
+ * same source root-domain test.
+ */
+extern bool dl_task_needs_bw_move(struct task_struct *p,
+				  const struct cpumask *new_mask);
 
 extern u64 dl_cookie;
 extern bool dl_bw_visited(int cpu, u64 cookie);
diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index bb4e692bea30..f7aaf01f7cd5 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -167,6 +167,7 @@ struct cpuset {
 	 */
 	int nr_deadline_tasks;
 	int nr_migrate_dl_tasks;
+	/* DL bandwidth that needs destination reservation for this attach. */
 	u64 sum_migrate_dl_bw;
 	/*
 	 * CPU used for temporary DL bandwidth allocation during attach;
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index b9c839538900..23abfbbb4686 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2993,7 +2993,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 	struct cpuset *cs, *oldcs;
 	struct task_struct *task;
 	bool setsched_check;
-	int ret;
+	int cpu, ret;
 
 	/* used later by cpuset_attach() */
 	cpuset_attach_old_cs = task_cs(cgroup_taskset_first(tset, &css));
@@ -3038,28 +3038,31 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 		}
 
 		if (dl_task(task)) {
+			/*
+			 * Count all migrating DL tasks for cpuset task accounting.
+			 * Only tasks that need a root-domain bandwidth move
+			 * contribute to sum_migrate_dl_bw.
+			 */
 			cs->nr_migrate_dl_tasks++;
-			cs->sum_migrate_dl_bw += task->dl.dl_bw;
+			if (dl_task_needs_bw_move(task, cs->effective_cpus))
+				cs->sum_migrate_dl_bw += task->dl.dl_bw;
 		}
 	}
 
-	if (!cs->nr_migrate_dl_tasks)
+	if (!cs->sum_migrate_dl_bw)
 		goto out_success;
 
-	if (!cpumask_intersects(oldcs->effective_cpus, cs->effective_cpus)) {
-		int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
-
-		if (unlikely(cpu >= nr_cpu_ids)) {
-			ret = -EINVAL;
-			goto out_unlock;
-		}
+	cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
+	if (unlikely(cpu >= nr_cpu_ids)) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
 
-		ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
-		if (ret)
-			goto out_unlock;
+	ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
+	if (ret)
+		goto out_unlock;
 
-		cs->dl_bw_cpu = cpu;
-	}
+	cs->dl_bw_cpu = cpu;
 
 out_success:
 	/*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index edca7849b165..7db4c87df83b 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -3107,20 +3107,18 @@ static void task_woken_dl(struct rq *rq, struct task_struct *p)
 static void set_cpus_allowed_dl(struct task_struct *p,
 				struct affinity_context *ctx)
 {
-	struct root_domain *src_rd;
 	struct rq *rq;
 
 	WARN_ON_ONCE(!dl_task(p));
 
 	rq = task_rq(p);
-	src_rd = rq->rd;
 	/*
 	 * Migrating a SCHED_DEADLINE task between exclusive
 	 * cpusets (different root_domains) entails a bandwidth
 	 * update. We already made space for us in the destination
 	 * domain (see cpuset_can_attach()).
 	 */
-	if (!cpumask_intersects(src_rd->span, ctx->new_mask)) {
+	if (dl_task_needs_bw_move(p, ctx->new_mask)) {
 		struct dl_bw *src_dl_b;
 
 		src_dl_b = dl_bw_of(cpu_of(rq));
@@ -3137,6 +3135,15 @@ static void set_cpus_allowed_dl(struct task_struct *p,
 	set_cpus_allowed_common(p, ctx);
 }
 
+bool dl_task_needs_bw_move(struct task_struct *p,
+			   const struct cpumask *new_mask)
+{
+	if (!dl_task(p))
+		return false;
+
+	return !cpumask_intersects(task_rq(p)->rd->span, new_mask);
+}
+
 /* Assumes rq->lock is held */
 static void rq_online_dl(struct rq *rq)
 {
-- 
2.43.0
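
Editorial sketch, not part of the patch: the split between counting tasks and reserving bandwidth can be modelled in user-space C with plain bitmasks. All names here are hypothetical. cpumask_model is an assumed 64-bit stand-in for struct cpumask, src_rd_span imitates task_rq(p)->rd->span, needs_bw_move_model() imitates dl_task_needs_bw_move(), and account_model() imitates only the per-task loop in cpuset_can_attach().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU mask: bit n set means CPU n is in the mask. */
typedef uint64_t cpumask_model;

/* Hypothetical stand-in for the fields of a task that matter here. */
struct dl_task_model {
	bool is_dl;			/* models dl_task(p) */
	uint64_t dl_bw;			/* models p->dl.dl_bw */
	cpumask_model src_rd_span;	/* span of the current root domain */
};

/* Totals accumulated while walking the taskset. */
struct attach_totals {
	int nr_migrate_dl_tasks;
	uint64_t sum_migrate_dl_bw;
};

/*
 * A bandwidth move is needed only when the new mask shares no CPU with
 * the span of the root domain the task currently runs in.
 */
static bool needs_bw_move_model(const struct dl_task_model *p,
				cpumask_model new_mask)
{
	if (!p->is_dl)
		return false;

	return (p->src_rd_span & new_mask) == 0;
}

/*
 * Count every migrating DL task, but add bandwidth only for tasks that
 * really change root domain.
 */
static struct attach_totals account_model(const struct dl_task_model *tasks,
					  int n, cpumask_model dst_mask)
{
	struct attach_totals t = { 0, 0 };
	int i;

	for (i = 0; i < n; i++) {
		if (!tasks[i].is_dl)
			continue;

		t.nr_migrate_dl_tasks++;
		if (needs_bw_move_model(&tasks[i], dst_mask))
			t.sum_migrate_dl_bw += tasks[i].dl_bw;
	}

	return t;
}
```

For a destination mask of 0xF0, a DL task whose root domain spans 0x0F changes root domain and contributes its bandwidth, while a DL task whose root domain spans 0xFF stays in the same root domain and is only counted. In this model, a zero sum_migrate_dl_bw with a non-zero nr_migrate_dl_tasks is the case where no destination reservation is made.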

* Re: [PATCH v3 2/2] cgroup/cpuset: reserve DL bandwidth only for root-domain moves
From: Juri Lelli @ 2026-05-11 9:17 UTC
To: Guopeng Zhang
Cc: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong, Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups

On Sat, 09 May 2026 18:20:31 +0800, Guopeng Zhang <zhangguopeng@kylinos.cn> wrote:
> cpuset_can_attach() currently adds the bandwidth of all migrating
> SCHED_DEADLINE tasks to sum_migrate_dl_bw. If the source and destination
> cpuset effective CPU masks do not overlap, the whole sum is then
> reserved in the destination root domain.
>
> set_cpus_allowed_dl(), however, subtracts bandwidth from the source
> root domain only when the affinity change really moves the task between
> root domains. A DL task can move between cpusets that are still in the
> same root domain, so including that task in sum_migrate_dl_bw can reserve
> destination bandwidth without a matching source-side subtraction.
>
> [...]

Acked-by: Juri Lelli <juri.lelli@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>

-- 
Juri Lelli <juri.lelli@redhat.com>

* Re: [PATCH v3 2/2] cgroup/cpuset: reserve DL bandwidth only for root-domain moves
From: Waiman Long @ 2026-05-11 20:00 UTC
To: Guopeng Zhang, Tejun Heo, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong
Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups

On 5/9/26 6:20 AM, Guopeng Zhang wrote:
> cpuset_can_attach() currently adds the bandwidth of all migrating
> SCHED_DEADLINE tasks to sum_migrate_dl_bw. If the source and destination
> cpuset effective CPU masks do not overlap, the whole sum is then
> reserved in the destination root domain.
>
> set_cpus_allowed_dl(), however, subtracts bandwidth from the source
> root domain only when the affinity change really moves the task between
> root domains. A DL task can move between cpusets that are still in the
> same root domain, so including that task in sum_migrate_dl_bw can reserve
> destination bandwidth without a matching source-side subtraction.
>
> Share the root-domain move test with set_cpus_allowed_dl(). Keep
> nr_migrate_dl_tasks counting all migrating deadline tasks for cpuset DL
> task accounting, but add to sum_migrate_dl_bw only for tasks that need a
> root-domain bandwidth move. Keep using the destination cpuset effective
> CPU mask and leave the broader can_attach()/attach() transaction model
> unchanged.
>
> [...]

LGTM

Reviewed-by: Waiman Long <longman@redhat.com>

* Re: [PATCH v3 2/2] cgroup/cpuset: reserve DL bandwidth only for root-domain moves
From: Tejun Heo @ 2026-05-11 20:29 UTC
To: Guopeng Zhang
Cc: Waiman Long, Michal Koutný, Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong, Johannes Weiner, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel, cgroups

Hello,

Applied to cgroup/for-7.1-fixes with Cc: stable@vger.kernel.org # v6.10+.

Thanks.

-- 
tejun