From: Ridong Chen <ridong.chen@linux.dev>
To: Waiman Long <longman@redhat.com>
Cc: cgroups@vger.kernel.org, Tejun Heo <tj@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] cgroup/cpuset: Support multiple source/destination cpusets using pids pattern
Date: Fri, 5 Jun 2026 15:35:01 +0800
Message-ID: <d708fb7a-d12f-40da-95ca-fbc6d0552f07@linux.dev>
In-Reply-To: <07bfe9cc-b8ab-4c4c-bfe0-b974abd3ff08@redhat.com>

On 6/4/2026 2:47 AM, Waiman Long wrote:
> On 6/3/26 6:26 AM, Ridong Chen wrote:
>> The current cpuset_can_attach() and cpuset_attach() functions assume task
>> migration is from one source cpuset to one destination cpuset. This can be
>> wrong in several scenarios:
>> - Moving a multi-threaded process with threads in different cpusets
>> - Disabling the cpuset controller (many children to one parent)
>> - Enabling the cpuset controller (one parent to many children)
>>
>> Fix this by adopting the pids subsystem's per-task accounting pattern.
>> In cpuset_can_attach(), use task_cs(task) to get the correct source cpuset
>> for each task (like pids_can_attach uses task_css), adjust nr_deadline_tasks
>> and reserve DL bandwidth per-task, and increment attach_in_progress per-task
>> on the destination cpuset. In cpuset_attach(), handle destination cpuset
>> changes within the task iteration loop.
>>
>> A shared helper cpuset_undo_attach() reverses the per-task operations for
>> both partial rollback in cpuset_can_attach() and full reversal in
>> cpuset_cancel_attach().
>>
>> When multiple source cpusets are detected in can_attach(), set
>> attach_many_sources so that cpuset_attach() forces cpus_updated and
>> mems_updated to true, ensuring all tasks get properly updated regardless
>> of which source cpuset cpuset_attach_old_cs points to.
>>
>> This eliminates the need for nr_migrate_dl_tasks, sum_migrate_dl_bw, and
>> dl_bw_cpu fields in struct cpuset.
>>
>> Fixes: 4ec22e9c5a90 ("cpuset: Enable cpuset controller in default hierarchy")
>> Signed-off-by: Ridong Chen <ridong.chen@linux.dev>
>
> It is not a problem doing per-task DL BW allocation and eliminating the
> *dl_bw* fields. However, updating nr_deadline_tasks before it is
> committed can be problematic.
>
Good to hear that.
> nr_deadline_tasks is used in dl_rebuild_rd_accounting() which is called
> by partition_sched_domains_locked(). After the release of cpuset_mutex
> at the end of cpuset_can_attach() and before cpuset_attach() or
> cpuset_cancel_attach() is called, it is possible that
> partition_sched_domains_locked() is called and dl_rebuild_rd_accounting()
> does not get the right DL BW accounting information. So unless there is a
> way to confirm that this situation cannot happen, we can't change
> nr_deadline_tasks before the attach is committed.
>
We can keep the nr_migrate_dl_tasks field and update nr_deadline_tasks
once migration is complete. I think this will be much simpler than
fixing the issue using lists.
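
To make the idea concrete, here is a minimal userspace sketch, not kernel
code. It assumes the pending change can be kept as a signed per-cpuset delta;
struct cpuset_model, pending_dl_delta and the model_*() helpers are all
hypothetical names made up for illustration (pending_dl_delta stands in for
the role nr_migrate_dl_tasks would play). Locking, DL bandwidth reservation
and how the real code would find each task's source cpuset at attach time are
deliberately left out.

```c
#include <assert.h>

/* Userspace model only; this is not the kernel's struct cpuset. */
struct cpuset_model {
	int nr_deadline_tasks;	/* committed count, what a rebuild would read */
	int pending_dl_delta;	/* hypothetical: uncommitted change from can_attach */
};

/* can_attach phase: record the move of one DL task without committing it. */
static void model_can_attach_one(struct cpuset_model *src,
				 struct cpuset_model *dst)
{
	src->pending_dl_delta--;
	dst->pending_dl_delta++;
}

/* attach phase: fold the pending delta into the committed count. */
static void model_commit(struct cpuset_model *cs)
{
	cs->nr_deadline_tasks += cs->pending_dl_delta;
	cs->pending_dl_delta = 0;
}

/* cancel_attach phase: drop the pending delta, committed count untouched. */
static void model_cancel(struct cpuset_model *cs)
{
	cs->pending_dl_delta = 0;
}

/* Stand-in for a reader such as a DL accounting rebuild. */
static int model_rebuild_read(const struct cpuset_model *cs)
{
	return cs->nr_deadline_tasks;
}

/* Walk through commit and cancel with two sources; returns 0 on success. */
int model_demo(void)
{
	struct cpuset_model a = { 2, 0 }, b = { 1, 0 }, d = { 0, 0 };

	/* Two DL tasks from different source cpusets head for d. */
	model_can_attach_one(&a, &d);
	model_can_attach_one(&b, &d);

	/* A rebuild in the window before attach still sees the old counts. */
	if (model_rebuild_read(&a) != 2 || model_rebuild_read(&b) != 1 ||
	    model_rebuild_read(&d) != 0)
		return 1;

	/* Commit path: counts change only now. */
	model_commit(&a);
	model_commit(&b);
	model_commit(&d);
	if (a.nr_deadline_tasks != 1 || b.nr_deadline_tasks != 0 ||
	    d.nr_deadline_tasks != 2)
		return 2;

	/* Cancel path: a pending move back to a is dropped without effect. */
	model_can_attach_one(&d, &a);
	model_cancel(&d);
	model_cancel(&a);
	if (a.nr_deadline_tasks != 1 || d.nr_deadline_tasks != 2)
		return 3;

	return 0;
}
```

The point of the model is only that a reader running between can_attach and
attach sees the committed value, and that cancel needs no undo of the count.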
--
Best regards,
Ridong