From: Ridong Chen <ridong.chen@linux.dev>
To: "Waiman Long" <longman@redhat.com>,
"Chen Ridong" <chenridong@huaweicloud.com>,
"Tejun Heo" <tj@kernel.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Ingo Molnar" <mingo@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Juri Lelli" <juri.lelli@redhat.com>,
"Vincent Guittot" <vincent.guittot@linaro.org>,
"Dietmar Eggemann" <dietmar.eggemann@arm.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Ben Segall" <bsegall@google.com>, "Mel Gorman" <mgorman@suse.de>,
"Valentin Schneider" <vschneid@redhat.com>,
"K Prateek Nayak" <kprateek.nayak@amd.com>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
Aaron Tomlin <atomlin@atomlin.com>
Subject: Re: [PATCH cgroup/for-next v2 0/5] cgroup/cpuset: Support multiple source/destination cpusets for cpuset_*attach()
Date: Wed, 3 Jun 2026 18:05:04 +0800
Message-ID: <4e7f7515-8960-4677-bd42-b83cd5f863ad@linux.dev>
In-Reply-To: <b91080da-486c-4df0-9e6b-8eb3364cae45@redhat.com>

On 2026/5/27 4:12, Waiman Long wrote:
>
> On 5/20/26 4:29 AM, Ridong Chen wrote:
>>
>>
>> On 2026/5/16 12:24, Waiman Long wrote:
>>> Sashiko AI review of another cpuset patch had found that cpuset_attach()
>>> and cpuset_can_attach() can be passed a cgroup_taskset with tasks
>>> migrating from one source cpuset to multiple destination cpusets and
>>> vice versa. Further testing of the cpuset code indicates that this is
>>> indeed the case when the v2 cpuset controller is enabled or disabled.
>>>
>>> Unfortunately, cpuset_attach() and cpuset_can_attach() still assume that
>>> there will be one source and one destination cpuset, which may result in
>>> incorrect behavior.
>>>
>>
>> Hi Longman,
>>
>> I am wondering whether we can use the pids subsystem's approach to
>> solve this issue, which I think could be much simpler.
>>
>> For the DL task accounting, we can handle it the same way
>> pids_can_attach() does - just call task_cs(task) for each task
>> individually inside the can_attach() loop and do the nr_deadline_tasks
>> adjustment right there. This eliminates the need to pass per-task
>> source cpuset information to the attach() callback entirely for DL
>> accounting purposes.
> DL task accounting doesn't use the new oldcs stored in the task
> structure which is only used for mm migration. BTW, I believe
> task_cs(task) doesn't return the old cs in cpuset_attach().

Sorry for the late response.

If I understand correctly, for DL task accounting we need to know the
destination cpuset in order to allocate bandwidth. The destination cpuset
can be obtained in cpuset_can_attach().

You are right that task_cs(task) does not return the old cpuset in
cpuset_attach(). But do we really need the old cpuset in cpuset_attach()?
Is cpuset_attach_old_cs sufficient for mm migration?

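To make the per-task idea concrete, here is a minimal userspace sketch. It is
not kernel code: struct mock_cpuset, struct mock_task, struct mock_move and
both functions are hypothetical stand-ins, and the only assumption it encodes
is that, while the can_attach() loop runs, the task still points at its source
cpuset, so source and destination are both known per task. It leaves out
bandwidth reservation and locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cpuset; only the DL task counter is modeled. */
struct mock_cpuset {
	int nr_deadline_tasks;
};

/* Hypothetical stand-in for a task; cs plays the role of task_cs(task). */
struct mock_task {
	struct mock_cpuset *cs;
	bool is_deadline;
};

/* One taskset entry: a task and the cpuset it is migrating to. */
struct mock_move {
	struct mock_task *task;
	struct mock_cpuset *dst;
};

/*
 * Adjust the counters per task inside the can_attach() loop. The task has
 * not moved yet, so task->cs is still the source cpuset.
 */
static void mock_can_attach(struct mock_move *moves, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct mock_cpuset *src = moves[i].task->cs;
		struct mock_cpuset *dst = moves[i].dst;

		if (!moves[i].task->is_deadline || src == dst)
			continue;
		assert(src->nr_deadline_tasks > 0);
		src->nr_deadline_tasks--;
		dst->nr_deadline_tasks++;
	}
}

/* Undo the adjustment if the migration is cancelled before attach(). */
static void mock_cancel_attach(struct mock_move *moves, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct mock_cpuset *src = moves[i].task->cs;
		struct mock_cpuset *dst = moves[i].dst;

		if (!moves[i].task->is_deadline || src == dst)
			continue;
		dst->nr_deadline_tasks--;
		src->nr_deadline_tasks++;
	}
}
```

With this shape, nothing about the source cpuset has to be carried over to
attach() for the DL counters, whatever mix of sources and destinations the
taskset contains.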
>>
>> For cpuset_migrate_mm(), I don't think we need per-task oldcs storage
>> in task_struct either. The scenarios where multiple source cpusets are
>> involved are:
>>
>> enable cpuset controller: child cpusets inherit parent's
>> effective_mems, so attach_mems_updated is false and
>> cpuset_migrate_mm() is never called.
>>
>> disable cpuset controller: tasks move from children to parent. Since
>> children's effective_mems is always a subset of parent's
>> effective_mems, even if cpuset_migrate_mm() is triggered, it's
>> effectively a noop (no pages need to move from a subset to its superset).
>>
>> cgroup.procs write with threads in different cpusets: this is a
>> many-to-one migration with a single process, so there is only one
>> group_leader and one mm. We only need to record the leader's oldcs,
>> which a single static variable can handle.
>>
>> So in all cases, the migration path only needs one oldcs for the
>> leader. We don't need to add a field to task_struct.
>>
>> What do you think?
>
> Yes, that makes sense. I will rework the patch series.
>
> Thanks,
> Longman
>
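For illustration only, the "subset to superset is a noop" argument from the
controller-disable case can be modeled with plain bitmasks. This is a
simplified sketch, not the kernel's migration logic: node masks are reduced to
a uint64_t, both helper names are hypothetical, and it assumes that only pages
on nodes present in the old mask but absent from the new mask have to move. It
makes no claim about how nodes are remapped when the two masks merely overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit n set means memory node n is allowed (hypothetical simplified mask). */
typedef uint64_t mock_nodemask_t;

/* Nodes that are allowed in the old mask but no longer in the new one. */
static mock_nodemask_t mock_nodes_to_vacate(mock_nodemask_t from,
					    mock_nodemask_t to)
{
	return from & ~to;
}

/* True when no node has to be vacated, i.e. from is a subset of to. */
static bool mock_migration_is_noop(mock_nodemask_t from, mock_nodemask_t to)
{
	return mock_nodes_to_vacate(from, to) == 0;
}
```

For example, a child with nodes {1} moving to a parent with nodes {0,1} gives
an empty vacate set, while the reverse direction would have to vacate node 0.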
>>
>>
>>
>>> This patch series is created to fix this issue. The first 2 patches are
>>> just preparatory patches to make the remaining patches easier to review.
>>>
>>> Patch 3 adds a new attach_old_cs field into task_struct to track the
>>> old cpuset to be used when cpuset_migrate_mm() needs to be called in
>>> cpuset_attach().
>>>
>>> Patch 4 moves mpol_rebind_mm() and cpuset_migrate_mm() inside
>>> cpuset_attach_task() to make the CLONE_INTO_CGROUP flag of clone(2) work
>>> more like moving a task from one cpuset to another, while also making it
>>> easier to support multiple source and destination cpusets.
>>>
>>> Patch 5 makes the necessary changes to enable the support of multiple
>>> source and destination cpusets by keeping all the source and destination
>>> cpusets found during task iterations in two singly linked lists for
>>> source and destination cpusets respectively.
>>>
>>> Waiman Long (5):
>>>   cgroup/cpuset: Add a cpuset_reserve_dl_bw() helper
>>>   cgroup/cpuset: Expand the scope of cpuset_can_attach_check()
>>>   cgroup/cpuset: Replace cpuset_attach_old_cs by a new attach_old_cs
>>>     field in task_struct
>>>   cgroup/cpuset: Move mpol_rebind_mm/cpuset_migrate_mm() calls inside
>>>     cpuset_attach_task()
>>>   cgroup/cpuset: Support multiple source/destination cpusets for
>>>     cpuset_*attach()
>>>
>>>  include/linux/sched.h           |   3 +
>>>  kernel/cgroup/cpuset-internal.h |   6 +
>>>  kernel/cgroup/cpuset.c          | 358 +++++++++++++++++++++-----------
>>>  3 files changed, 249 insertions(+), 118 deletions(-)
>>>
>>
>
>
--
Best regards,
Ridong