From: Ridong Chen <ridong.chen@linux.dev>
To: Waiman Long <longman@redhat.com>
Cc: cgroups@vger.kernel.org, Tejun Heo <tj@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] cgroup/cpuset: Support multiple source/destination cpusets using pids pattern
Date: Fri, 5 Jun 2026 15:35:01 +0800
Message-ID: <d708fb7a-d12f-40da-95ca-fbc6d0552f07@linux.dev>
In-Reply-To: <07bfe9cc-b8ab-4c4c-bfe0-b974abd3ff08@redhat.com>
On 6/4/2026 2:47 AM, Waiman Long wrote:
> On 6/3/26 6:26 AM, Ridong Chen wrote:
>> The current cpuset_can_attach() and cpuset_attach() functions assume task
>> migration is from one source cpuset to one destination cpuset. This can be
>> wrong in several scenarios:
>> - Moving a multi-threaded process with threads in different cpusets
>> - Disabling the cpuset controller (many children to one parent)
>> - Enabling the cpuset controller (one parent to many children)
>>
>> Fix this by adopting the pids subsystem's per-task accounting pattern.
>> In cpuset_can_attach(), use task_cs(task) to get the correct source cpuset
>> for each task (like pids_can_attach uses task_css), adjust nr_deadline_tasks
>> and reserve DL bandwidth per-task, and increment attach_in_progress per-task
>> on the destination cpuset. In cpuset_attach(), handle destination cpuset
>> changes within the task iteration loop.
>>
>> A shared helper cpuset_undo_attach() reverses the per-task operations for
>> both partial rollback in cpuset_can_attach() and full reversal in
>> cpuset_cancel_attach().
>>
>> When multiple source cpusets are detected in can_attach(), set
>> attach_many_sources so that cpuset_attach() forces cpus_updated and
>> mems_updated to true, ensuring all tasks get properly updated regardless
>> of which source cpuset cpuset_attach_old_cs points to.
>>
>> This eliminates the need for nr_migrate_dl_tasks, sum_migrate_dl_bw, and
>> dl_bw_cpu fields in struct cpuset.
>>
>> Fixes: 4ec22e9c5a90 ("cpuset: Enable cpuset controller in default hierarchy")
>> Signed-off-by: Ridong Chen <ridong.chen@linux.dev>
>
> It is not a problem doing per-task DL BW allocation and eliminating the
> *dl_bw* fields. However, updating nr_deadline_tasks before it is
> committed can be problematic.
>
Good to hear that.
> nr_deadline_tasks is used in dl_rebuild_rd_accounting() which is called
> by partition_sched_domains_locked(). After the release of cpuset_mutex
> at the end of cpuset_can_attach() and before cpuset_attach() or
> cpuset_cancel_attach() is called, it is possible that
> partition_sched_domains_locked() can be called and
> dl_rebuild_rd_accounting() is not getting the right DL BW accounting
> information. So unless there is a way to confirm that this situation
> cannot happen, we can't change nr_deadline_tasks before the attach is
> committed.
>
We can keep the nr_migrate_dl_tasks field and update nr_deadline_tasks
once migration is complete. I think this will be much simpler than
fixing the issue using lists.
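
To illustrate the staging idea, here is a minimal userspace sketch, not
kernel code. Only the nr_deadline_tasks and nr_migrate_dl_tasks names come
from the discussion above; struct cs_model and the model_*() helpers are
hypothetical, and the sketch assumes the staged count lives on the
destination and is committed one task at a time during the attach phase.

```c
#include <assert.h>

/* Hypothetical stand-in for the two counters discussed; not struct cpuset. */
struct cs_model {
	int nr_deadline_tasks;   /* committed count, the one accounting reads */
	int nr_migrate_dl_tasks; /* staged incoming DL tasks, not yet committed */
};

/* can_attach phase: record the pending move, leave committed counts alone. */
static void model_stage_dl_task(struct cs_model *dst)
{
	dst->nr_migrate_dl_tasks++;
}

/* attach phase: move one DL task from its own source to the destination. */
static void model_commit_dl_task(struct cs_model *src, struct cs_model *dst)
{
	assert(dst->nr_migrate_dl_tasks > 0);
	assert(src->nr_deadline_tasks > 0);

	src->nr_deadline_tasks--;
	dst->nr_deadline_tasks++;
	dst->nr_migrate_dl_tasks--;
}

/* cancel_attach phase: drop the staged count, committed counts never moved. */
static void model_cancel(struct cs_model *dst)
{
	dst->nr_migrate_dl_tasks = 0;
}
```

In this model, anything that reads nr_deadline_tasks between the two phases
still sees the pre-migration values, and each task can name its own source
at commit time, so several source cpusets are handled without a list.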
--
Best regards,
Ridong