Linux cgroups development
* [PATCH v3 0/2] cgroup/cpuset: fix DL attach accounting
From: Guopeng Zhang @ 2026-05-09 10:20 UTC (permalink / raw)
  To: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Chen Ridong
  Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel,
	cgroups, Guopeng Zhang

Hi,

This v3 series contains two cpuset fixes for SCHED_DEADLINE attach
accounting.

Patch 1 fixes an internal cpuset_can_attach() failure path where
temporary DL migration state can be left behind if a later per-task
check fails before cpuset marks attach_in_progress.
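For illustration only, the patch 1 failure path can be modelled in a few
lines of userspace C. This is a minimal sketch, not kernel code: the
toy_* names, the fail_at parameter and the struct layout are hypothetical
stand-ins, and locking, dl_bw_alloc() and the security checks are left
out. It assumes the per-task checks jump to the same exit label as the
later bandwidth-allocation failures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the pending DL fields kept in struct cpuset. */
struct toy_cpuset {
	int nr_migrate_dl_tasks;
	unsigned long long sum_migrate_dl_bw;
	int attach_in_progress;
};

/* Models reset_migrate_dl_data(): drop all pending migration state. */
static void toy_reset_migrate_dl_data(struct toy_cpuset *cs)
{
	cs->nr_migrate_dl_tasks = 0;
	cs->sum_migrate_dl_bw = 0;
}

/*
 * Walk n DL tasks with bandwidths bw[]. fail_at is the index of the task
 * whose per-task check fails, or -1 if every check passes.
 */
static int toy_can_attach(struct toy_cpuset *cs, const unsigned long long *bw,
			  int n, int fail_at)
{
	int i, ret = 0;

	assert(cs != NULL && bw != NULL);

	for (i = 0; i < n; i++) {
		if (i == fail_at) {
			/* State from tasks 0..i-1 is already accumulated. */
			ret = -1;
			goto out_unlock;
		}
		cs->nr_migrate_dl_tasks++;
		cs->sum_migrate_dl_bw += bw[i];
	}

	cs->attach_in_progress++;

out_unlock:
	/* Common cleanup: any failure drops the partial state. */
	if (ret)
		toy_reset_migrate_dl_data(cs);
	return ret;
}
```

With three tasks and a failure on the third, the model returns an error
and leaves both pending fields at zero, so a later successful call starts
from a clean state. Without the check at the exit label, the first two
tasks would remain counted.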

Patch 2 keeps cpuset DL bandwidth reservation aligned with the condition
used by set_cpus_allowed_dl() for source-side bandwidth removal. It keeps
counting all migrating DL tasks for cpuset task accounting, but reserves
destination DL bandwidth only for tasks that actually need a root-domain
bandwidth move.
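The split between the two counters in patch 2 can be sketched the same
way. Again this is a userspace model under stated assumptions, not the
kernel implementation: CPU masks are plain 64-bit integers, and the
toy_* types and functions are hypothetical. Only the shape of the test
(no overlap between the source root-domain span and the new mask)
follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU. */
typedef uint64_t toy_mask;

struct toy_task {
	bool is_dl;		/* models dl_task(p) */
	uint64_t dl_bw;		/* models p->dl.dl_bw */
	toy_mask src_rd_span;	/* models task_rq(p)->rd->span */
};

struct toy_pending {
	int nr_migrate_dl_tasks;
	uint64_t sum_migrate_dl_bw;
};

/* Models dl_task_needs_bw_move(): true only when no CPU is shared. */
static bool toy_needs_bw_move(const struct toy_task *p, toy_mask new_mask)
{
	if (!p->is_dl)
		return false;
	return (p->src_rd_span & new_mask) == 0;
}

/* Models the per-task accounting loop in cpuset_can_attach(). */
static void toy_account(struct toy_pending *pend, const struct toy_task *tasks,
			int n, toy_mask dst_effective_cpus)
{
	int i;

	assert(pend != NULL && tasks != NULL);

	for (i = 0; i < n; i++) {
		if (!tasks[i].is_dl)
			continue;
		/* Every migrating DL task is counted ... */
		pend->nr_migrate_dl_tasks++;
		/* ... but only root-domain moves reserve bandwidth. */
		if (toy_needs_bw_move(&tasks[i], dst_effective_cpus))
			pend->sum_migrate_dl_bw += tasks[i].dl_bw;
	}
}
```

For example, with a destination mask of CPUs 4-5, a DL task whose source
root domain spans CPUs 0-3 contributes its bandwidth, while a DL task
whose source root domain spans CPUs 4-7 is counted as a task but adds no
bandwidth to the pending reservation.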

Guopeng Zhang (2):
  cgroup/cpuset: reset DL migration state on can_attach() failure
  cgroup/cpuset: reserve DL bandwidth only for root-domain moves

 include/linux/sched/deadline.h  |  9 ++++++++
 kernel/cgroup/cpuset-internal.h |  1 +
 kernel/cgroup/cpuset.c          | 39 ++++++++++++++++++---------------
 kernel/sched/deadline.c         | 13 ++++++++---
 4 files changed, 41 insertions(+), 21 deletions(-)

---
Changes in v3:
- Patch 1: use common ret != 0 cleanup in cpuset_can_attach(), as
  suggested by Waiman Long and Chen Ridong.
- Patch 2: drop task_cpu_possible_mask() / attach-target-mask handling
  as suggested by Waiman Long.
- Patch 2: keep the change limited to reserving DL bandwidth only for
  tasks that need a root-domain bandwidth move.
- Leave the broader can_attach()/attach() transaction model unchanged.

Changes in v2:
- Split the original change into two patches.
- Add a separate fix for resetting pending DL migration state on
  cpuset_can_attach() failure.
- Clarify that nr_migrate_dl_tasks counts all migrating DL tasks for
  cpuset task accounting, while sum_migrate_dl_bw only tracks bandwidth
  needing destination root-domain reservation.

v2:
  https://lore.kernel.org/all/20260507103310.35849-1-zhangguopeng@kylinos.cn/

v1:
  https://lore.kernel.org/all/20260421083449.95750-1-zhangguopeng@kylinos.cn

-- 
2.43.0


* [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Guopeng Zhang @ 2026-05-09 10:20 UTC (permalink / raw)
  To: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Chen Ridong
  Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel,
	cgroups, Guopeng Zhang

cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
state in the destination cpuset while walking the taskset.

If a later task_can_attach() or security_task_setscheduler() check
fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
and does not call cpuset_cancel_attach() for it. The partially
accumulated state is then left behind and can be consumed by a later
attach, corrupting cpuset DL task accounting and pending DL bandwidth
accounting.

Reset the pending DL migration state from the common error exit when
ret is non-zero. Successful can_attach() keeps the state for
cpuset_attach() or cpuset_cancel_attach().

Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
---
 kernel/cgroup/cpuset.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index e3a081a07c6d..b9c839538900 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3050,16 +3050,13 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 		int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
 
 		if (unlikely(cpu >= nr_cpu_ids)) {
-			reset_migrate_dl_data(cs);
 			ret = -EINVAL;
 			goto out_unlock;
 		}
 
 		ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
-		if (ret) {
-			reset_migrate_dl_data(cs);
+		if (ret)
 			goto out_unlock;
-		}
 
 		cs->dl_bw_cpu = cpu;
 	}
@@ -3070,7 +3067,10 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 	 * changes which zero cpus/mems_allowed.
 	 */
 	cs->attach_in_progress++;
+
 out_unlock:
+	if (ret)
+		reset_migrate_dl_data(cs);
 	mutex_unlock(&cpuset_mutex);
 	return ret;
 }
-- 
2.43.0



* [PATCH v3 2/2] cgroup/cpuset: reserve DL bandwidth only for root-domain moves
From: Guopeng Zhang @ 2026-05-09 10:20 UTC (permalink / raw)
  To: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Chen Ridong
  Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel,
	cgroups, Guopeng Zhang

cpuset_can_attach() currently adds the bandwidth of all migrating
SCHED_DEADLINE tasks to sum_migrate_dl_bw. If the source and destination
cpuset effective CPU masks do not overlap, the whole sum is then
reserved in the destination root domain.

set_cpus_allowed_dl(), however, subtracts bandwidth from the source
root domain only when the affinity change really moves the task between
root domains. A DL task can move between cpusets that are still in the
same root domain, so including that task in sum_migrate_dl_bw can reserve
destination bandwidth without a matching source-side subtraction.

Share the root-domain move test with set_cpus_allowed_dl(). Keep
nr_migrate_dl_tasks counting all migrating deadline tasks for cpuset DL
task accounting, but add to sum_migrate_dl_bw only for tasks that need a
root-domain bandwidth move. Keep using the destination cpuset effective
CPU mask and leave the broader can_attach()/attach() transaction model
unchanged.

Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
---
 include/linux/sched/deadline.h  |  9 +++++++++
 kernel/cgroup/cpuset-internal.h |  1 +
 kernel/cgroup/cpuset.c          | 33 ++++++++++++++++++---------------
 kernel/sched/deadline.c         | 13 ++++++++++---
 4 files changed, 38 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
index 1198138cb839..273538200a44 100644
--- a/include/linux/sched/deadline.h
+++ b/include/linux/sched/deadline.h
@@ -33,6 +33,15 @@ struct root_domain;
 extern void dl_add_task_root_domain(struct task_struct *p);
 extern void dl_clear_root_domain(struct root_domain *rd);
 extern void dl_clear_root_domain_cpu(int cpu);
+/*
+ * Return whether moving DL task @p to @new_mask requires moving DL
+ * bandwidth accounting between root domains. This helper is specific to
+ * DL bandwidth move accounting semantics and is shared by
+ * cpuset_can_attach() and set_cpus_allowed_dl() so both paths use the
+ * same source root-domain test.
+ */
+extern bool dl_task_needs_bw_move(struct task_struct *p,
+				  const struct cpumask *new_mask);
 
 extern u64 dl_cookie;
 extern bool dl_bw_visited(int cpu, u64 cookie);
diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index bb4e692bea30..f7aaf01f7cd5 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -167,6 +167,7 @@ struct cpuset {
 	 */
 	int nr_deadline_tasks;
 	int nr_migrate_dl_tasks;
+	/* DL bandwidth that needs destination reservation for this attach. */
 	u64 sum_migrate_dl_bw;
 	/*
 	 * CPU used for temporary DL bandwidth allocation during attach;
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index b9c839538900..23abfbbb4686 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2993,7 +2993,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 	struct cpuset *cs, *oldcs;
 	struct task_struct *task;
 	bool setsched_check;
-	int ret;
+	int cpu, ret;
 
 	/* used later by cpuset_attach() */
 	cpuset_attach_old_cs = task_cs(cgroup_taskset_first(tset, &css));
@@ -3038,28 +3038,31 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 		}
 
 		if (dl_task(task)) {
+			/*
+			 * Count all migrating DL tasks for cpuset task accounting.
+			 * Only tasks that need a root-domain bandwidth move
+			 * contribute to sum_migrate_dl_bw.
+			 */
 			cs->nr_migrate_dl_tasks++;
-			cs->sum_migrate_dl_bw += task->dl.dl_bw;
+			if (dl_task_needs_bw_move(task, cs->effective_cpus))
+				cs->sum_migrate_dl_bw += task->dl.dl_bw;
 		}
 	}
 
-	if (!cs->nr_migrate_dl_tasks)
+	if (!cs->sum_migrate_dl_bw)
 		goto out_success;
 
-	if (!cpumask_intersects(oldcs->effective_cpus, cs->effective_cpus)) {
-		int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
-
-		if (unlikely(cpu >= nr_cpu_ids)) {
-			ret = -EINVAL;
-			goto out_unlock;
-		}
+	cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
+	if (unlikely(cpu >= nr_cpu_ids)) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
 
-		ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
-		if (ret)
-			goto out_unlock;
+	ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
+	if (ret)
+		goto out_unlock;
 
-		cs->dl_bw_cpu = cpu;
-	}
+	cs->dl_bw_cpu = cpu;
 
 out_success:
 	/*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index edca7849b165..7db4c87df83b 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -3107,20 +3107,18 @@ static void task_woken_dl(struct rq *rq, struct task_struct *p)
 static void set_cpus_allowed_dl(struct task_struct *p,
 				struct affinity_context *ctx)
 {
-	struct root_domain *src_rd;
 	struct rq *rq;
 
 	WARN_ON_ONCE(!dl_task(p));
 
 	rq = task_rq(p);
-	src_rd = rq->rd;
 	/*
 	 * Migrating a SCHED_DEADLINE task between exclusive
 	 * cpusets (different root_domains) entails a bandwidth
 	 * update. We already made space for us in the destination
 	 * domain (see cpuset_can_attach()).
 	 */
-	if (!cpumask_intersects(src_rd->span, ctx->new_mask)) {
+	if (dl_task_needs_bw_move(p, ctx->new_mask)) {
 		struct dl_bw *src_dl_b;
 
 		src_dl_b = dl_bw_of(cpu_of(rq));
@@ -3137,6 +3135,15 @@ static void set_cpus_allowed_dl(struct task_struct *p,
 	set_cpus_allowed_common(p, ctx);
 }
 
+bool dl_task_needs_bw_move(struct task_struct *p,
+			   const struct cpumask *new_mask)
+{
+	if (!dl_task(p))
+		return false;
+
+	return !cpumask_intersects(task_rq(p)->rd->span, new_mask);
+}
+
 /* Assumes rq->lock is held */
 static void rq_online_dl(struct rq *rq)
 {
-- 
2.43.0



* Re: [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Chen Ridong @ 2026-05-11  2:48 UTC (permalink / raw)
  To: Guopeng Zhang, Waiman Long, Tejun Heo, Michal Koutný,
	Ingo Molnar, Peter Zijlstra, Juri Lelli
  Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel,
	cgroups



On 2026/5/9 18:20, Guopeng Zhang wrote:
> cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
> state in the destination cpuset while walking the taskset.
> 
> If a later task_can_attach() or security_task_setscheduler() check
> fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
> and does not call cpuset_cancel_attach() for it. The partially
> accumulated state is then left behind and can be consumed by a later
> attach, corrupting cpuset DL task accounting and pending DL bandwidth
> accounting.
> 
> Reset the pending DL migration state from the common error exit when
> ret is non-zero. Successful can_attach() keeps the state for
> cpuset_attach() or cpuset_cancel_attach().
> 
> Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
> Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
> ---
>  kernel/cgroup/cpuset.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index e3a081a07c6d..b9c839538900 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -3050,16 +3050,13 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>  		int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
>  
>  		if (unlikely(cpu >= nr_cpu_ids)) {
> -			reset_migrate_dl_data(cs);
>  			ret = -EINVAL;
>  			goto out_unlock;
>  		}
>  
>  		ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
> -		if (ret) {
> -			reset_migrate_dl_data(cs);
> +		if (ret)
>  			goto out_unlock;
> -		}
>  
>  		cs->dl_bw_cpu = cpu;
>  	}
> @@ -3070,7 +3067,10 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>  	 * changes which zero cpus/mems_allowed.
>  	 */
>  	cs->attach_in_progress++;
> +
>  out_unlock:
> +	if (ret)
> +		reset_migrate_dl_data(cs);
>  	mutex_unlock(&cpuset_mutex);
>  	return ret;
>  }

LGTM.
Thanks.

Reviewed-by: Chen Ridong <chenridong@huaweicloud.com>

-- 
Best regards,
Ridong



* Re: [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Waiman Long @ 2026-05-11  5:04 UTC (permalink / raw)
  To: Guopeng Zhang, Tejun Heo, Michal Koutný, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Chen Ridong
  Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel,
	cgroups

On 5/9/26 6:20 AM, Guopeng Zhang wrote:
> cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
> state in the destination cpuset while walking the taskset.
>
> If a later task_can_attach() or security_task_setscheduler() check
> fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
> and does not call cpuset_cancel_attach() for it. The partially
> accumulated state is then left behind and can be consumed by a later
> attach, corrupting cpuset DL task accounting and pending DL bandwidth
> accounting.
>
> Reset the pending DL migration state from the common error exit when
> ret is non-zero. Successful can_attach() keeps the state for
> cpuset_attach() or cpuset_cancel_attach().
>
> Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
> Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
> ---
>   kernel/cgroup/cpuset.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index e3a081a07c6d..b9c839538900 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -3050,16 +3050,13 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>   		int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
>   
>   		if (unlikely(cpu >= nr_cpu_ids)) {
> -			reset_migrate_dl_data(cs);
>   			ret = -EINVAL;
>   			goto out_unlock;
>   		}
>   
>   		ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
> -		if (ret) {
> -			reset_migrate_dl_data(cs);
> +		if (ret)
>   			goto out_unlock;
> -		}
>   
>   		cs->dl_bw_cpu = cpu;
>   	}
> @@ -3070,7 +3067,10 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>   	 * changes which zero cpus/mems_allowed.
>   	 */
>   	cs->attach_in_progress++;
> +
>   out_unlock:
> +	if (ret)
> +		reset_migrate_dl_data(cs);
>   	mutex_unlock(&cpuset_mutex);
>   	return ret;
>   }
Reviewed-by: Waiman Long <longman@redhat.com>



* Re: [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Tejun Heo @ 2026-05-11  8:18 UTC (permalink / raw)
  To: Guopeng Zhang, Waiman Long, Tejun Heo, Michal Koutný,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Chen Ridong
  Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel,
	cgroups

Hello,

Applied 1/2 to cgroup/for-7.1-fixes with Cc: stable@vger.kernel.org # v6.10+.

Thanks.

--
tejun


* Re: [PATCH v3 1/2] cgroup/cpuset: reset DL migration state on can_attach() failure
From: Juri Lelli @ 2026-05-11  9:17 UTC (permalink / raw)
  To: Guopeng Zhang
  Cc: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Chen Ridong, Johannes Weiner,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco,
	Will Deacon, linux-kernel, cgroups

On Sat, 09 May 2026 18:20:30 +0800, Guopeng Zhang <zhangguopeng@kylinos.cn> wrote:
> cpuset_can_attach() accumulates temporary SCHED_DEADLINE migration
> state in the destination cpuset while walking the taskset.
> 
> If a later task_can_attach() or security_task_setscheduler() check
> fails, cgroup_migrate_execute() treats cpuset as the failing subsystem
> and does not call cpuset_cancel_attach() for it. The partially
> accumulated state is then left behind and can be consumed by a later
> attach, corrupting cpuset DL task accounting and pending DL bandwidth
> accounting.
> 
> [...]

Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>

-- 
Juri Lelli <juri.lelli@redhat.com>



* Re: [PATCH v3 2/2] cgroup/cpuset: reserve DL bandwidth only for root-domain moves
From: Juri Lelli @ 2026-05-11  9:17 UTC (permalink / raw)
  To: Guopeng Zhang
  Cc: Waiman Long, Tejun Heo, Michal Koutný, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Chen Ridong, Johannes Weiner,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, K Prateek Nayak, Gabriele Monaco,
	Will Deacon, linux-kernel, cgroups

On Sat, 09 May 2026 18:20:31 +0800, Guopeng Zhang <zhangguopeng@kylinos.cn> wrote:
> cpuset_can_attach() currently adds the bandwidth of all migrating
> SCHED_DEADLINE tasks to sum_migrate_dl_bw. If the source and destination
> cpuset effective CPU masks do not overlap, the whole sum is then
> reserved in the destination root domain.
> 
> set_cpus_allowed_dl(), however, subtracts bandwidth from the source
> root domain only when the affinity change really moves the task between
> root domains. A DL task can move between cpusets that are still in the
> same root domain, so including that task in sum_migrate_dl_bw can reserve
> destination bandwidth without a matching source-side subtraction.
> 
> [...]

Acked-by: Juri Lelli <juri.lelli@redhat.com>
Tested-by: Juri Lelli <juri.lelli@redhat.com>

-- 
Juri Lelli <juri.lelli@redhat.com>



* Re: [PATCH v3 2/2] cgroup/cpuset: reserve DL bandwidth only for root-domain moves
From: Waiman Long @ 2026-05-11 20:00 UTC (permalink / raw)
  To: Guopeng Zhang, Tejun Heo, Michal Koutný, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Chen Ridong
  Cc: Johannes Weiner, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	K Prateek Nayak, Gabriele Monaco, Will Deacon, linux-kernel,
	cgroups

On 5/9/26 6:20 AM, Guopeng Zhang wrote:
> cpuset_can_attach() currently adds the bandwidth of all migrating
> SCHED_DEADLINE tasks to sum_migrate_dl_bw. If the source and destination
> cpuset effective CPU masks do not overlap, the whole sum is then
> reserved in the destination root domain.
>
> set_cpus_allowed_dl(), however, subtracts bandwidth from the source
> root domain only when the affinity change really moves the task between
> root domains. A DL task can move between cpusets that are still in the
> same root domain, so including that task in sum_migrate_dl_bw can reserve
> destination bandwidth without a matching source-side subtraction.
>
> Share the root-domain move test with set_cpus_allowed_dl(). Keep
> nr_migrate_dl_tasks counting all migrating deadline tasks for cpuset DL
> task accounting, but add to sum_migrate_dl_bw only for tasks that need a
> root-domain bandwidth move. Keep using the destination cpuset effective
> CPU mask and leave the broader can_attach()/attach() transaction model
> unchanged.
>
> Fixes: 2ef269ef1ac0 ("cgroup/cpuset: Free DL BW in case can_attach() fails")
> Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
> ---
>   include/linux/sched/deadline.h  |  9 +++++++++
>   kernel/cgroup/cpuset-internal.h |  1 +
>   kernel/cgroup/cpuset.c          | 33 ++++++++++++++++++---------------
>   kernel/sched/deadline.c         | 13 ++++++++++---
>   4 files changed, 38 insertions(+), 18 deletions(-)
>
> diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
> index 1198138cb839..273538200a44 100644
> --- a/include/linux/sched/deadline.h
> +++ b/include/linux/sched/deadline.h
> @@ -33,6 +33,15 @@ struct root_domain;
>   extern void dl_add_task_root_domain(struct task_struct *p);
>   extern void dl_clear_root_domain(struct root_domain *rd);
>   extern void dl_clear_root_domain_cpu(int cpu);
> +/*
> + * Return whether moving DL task @p to @new_mask requires moving DL
> + * bandwidth accounting between root domains. This helper is specific to
> + * DL bandwidth move accounting semantics and is shared by
> + * cpuset_can_attach() and set_cpus_allowed_dl() so both paths use the
> + * same source root-domain test.
> + */
> +extern bool dl_task_needs_bw_move(struct task_struct *p,
> +				  const struct cpumask *new_mask);
>   
>   extern u64 dl_cookie;
>   extern bool dl_bw_visited(int cpu, u64 cookie);
> diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
> index bb4e692bea30..f7aaf01f7cd5 100644
> --- a/kernel/cgroup/cpuset-internal.h
> +++ b/kernel/cgroup/cpuset-internal.h
> @@ -167,6 +167,7 @@ struct cpuset {
>   	 */
>   	int nr_deadline_tasks;
>   	int nr_migrate_dl_tasks;
> +	/* DL bandwidth that needs destination reservation for this attach. */
>   	u64 sum_migrate_dl_bw;
>   	/*
>   	 * CPU used for temporary DL bandwidth allocation during attach;
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index b9c839538900..23abfbbb4686 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -2993,7 +2993,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>   	struct cpuset *cs, *oldcs;
>   	struct task_struct *task;
>   	bool setsched_check;
> -	int ret;
> +	int cpu, ret;
>   
>   	/* used later by cpuset_attach() */
>   	cpuset_attach_old_cs = task_cs(cgroup_taskset_first(tset, &css));
> @@ -3038,28 +3038,31 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>   		}
>   
>   		if (dl_task(task)) {
> +			/*
> +			 * Count all migrating DL tasks for cpuset task accounting.
> +			 * Only tasks that need a root-domain bandwidth move
> +			 * contribute to sum_migrate_dl_bw.
> +			 */
>   			cs->nr_migrate_dl_tasks++;
> -			cs->sum_migrate_dl_bw += task->dl.dl_bw;
> +			if (dl_task_needs_bw_move(task, cs->effective_cpus))
> +				cs->sum_migrate_dl_bw += task->dl.dl_bw;
>   		}
>   	}
>   
> -	if (!cs->nr_migrate_dl_tasks)
> +	if (!cs->sum_migrate_dl_bw)
>   		goto out_success;
>   
> -	if (!cpumask_intersects(oldcs->effective_cpus, cs->effective_cpus)) {
> -		int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
> -
> -		if (unlikely(cpu >= nr_cpu_ids)) {
> -			ret = -EINVAL;
> -			goto out_unlock;
> -		}
> +	cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
> +	if (unlikely(cpu >= nr_cpu_ids)) {
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}
>   
> -		ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
> -		if (ret)
> -			goto out_unlock;
> +	ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
> +	if (ret)
> +		goto out_unlock;
>   
> -		cs->dl_bw_cpu = cpu;
> -	}
> +	cs->dl_bw_cpu = cpu;
>   
>   out_success:
>   	/*
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index edca7849b165..7db4c87df83b 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -3107,20 +3107,18 @@ static void task_woken_dl(struct rq *rq, struct task_struct *p)
>   static void set_cpus_allowed_dl(struct task_struct *p,
>   				struct affinity_context *ctx)
>   {
> -	struct root_domain *src_rd;
>   	struct rq *rq;
>   
>   	WARN_ON_ONCE(!dl_task(p));
>   
>   	rq = task_rq(p);
> -	src_rd = rq->rd;
>   	/*
>   	 * Migrating a SCHED_DEADLINE task between exclusive
>   	 * cpusets (different root_domains) entails a bandwidth
>   	 * update. We already made space for us in the destination
>   	 * domain (see cpuset_can_attach()).
>   	 */
> -	if (!cpumask_intersects(src_rd->span, ctx->new_mask)) {
> +	if (dl_task_needs_bw_move(p, ctx->new_mask)) {
>   		struct dl_bw *src_dl_b;
>   
>   		src_dl_b = dl_bw_of(cpu_of(rq));
> @@ -3137,6 +3135,15 @@ static void set_cpus_allowed_dl(struct task_struct *p,
>   	set_cpus_allowed_common(p, ctx);
>   }
>   
> +bool dl_task_needs_bw_move(struct task_struct *p,
> +			   const struct cpumask *new_mask)
> +{
> +	if (!dl_task(p))
> +		return false;
> +
> +	return !cpumask_intersects(task_rq(p)->rd->span, new_mask);
> +}
> +
>   /* Assumes rq->lock is held */
>   static void rq_online_dl(struct rq *rq)
>   {

LGTM

Reviewed-by: Waiman Long <longman@redhat.com>



* Re: [PATCH v3 2/2] cgroup/cpuset: reserve DL bandwidth only for root-domain moves
From: Tejun Heo @ 2026-05-11 20:29 UTC (permalink / raw)
  To: Guopeng Zhang
  Cc: Waiman Long, Michal Koutný, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Chen Ridong, Johannes Weiner, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak, Gabriele Monaco, Will Deacon,
	linux-kernel, cgroups

Hello,

Applied to cgroup/for-7.1-fixes with Cc: stable@vger.kernel.org # v6.10+.

Thanks.

--
tejun
