* [PATCH 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks()
@ 2026-06-22 22:45 Waiman Long
2026-06-22 22:45 ` [PATCH 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask() Waiman Long
2026-06-23 1:14 ` [PATCH 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks() Ridong Chen
0 siblings, 2 replies; 4+ messages in thread
From: Waiman Long @ 2026-06-22 22:45 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný, Ridong Chen,
Jonathan Corbet, Shuah Khan
Cc: cgroups, linux-kernel, linux-doc, Waiman Long
As reported by sashiko [1], cpuset_hotplug_update_tasks() may perform
unnecessary task iteration and updating of tasks' CPU and node masks
when mems_allowed and/or cpus_allowed are not set in cpuset v2. It is
due to the fact that the temporary new_cpus and new_mems masks do not
inherit parent's effective_cpus/mems when they are empty which is the
expected behavior for cpuset v2 since commit 4ec22e9c5a90 ("cpuset:
Enable cpuset controller in default hierarchy").
Fix that and avoid unnecessay work by adding the empty mask checks and
inheriting the parent's versions if empty.
[1] https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com
Fixes: 4ec22e9c5a90 ("cpuset: Enable cpuset controller in default hierarchy")
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index aff86acea701..bc0207fd6e57 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3925,6 +3925,14 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
compute_effective_cpumask(&new_cpus, cs, parent);
nodes_and(new_mems, cs->mems_allowed, parent->effective_mems);
+ if (is_in_v2_mode()) {
+ /* Inherit parent's effective_cpus/mems if empty */
+ if (cpumask_empty(&new_cpus))
+ cpumask_copy(&new_cpus, parent->effective_cpus);
+ if (nodes_empty(new_mems))
+ new_mems = parent->effective_mems;
+ }
+
if (!tmp || !cs->partition_root_state)
goto update_tasks;
--
2.54.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask()
2026-06-22 22:45 [PATCH 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks() Waiman Long
@ 2026-06-22 22:45 ` Waiman Long
2026-06-23 1:22 ` Ridong Chen
2026-06-23 1:14 ` [PATCH 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks() Ridong Chen
1 sibling, 1 reply; 4+ messages in thread
From: Waiman Long @ 2026-06-22 22:45 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný, Ridong Chen,
Jonathan Corbet, Shuah Khan
Cc: cgroups, linux-kernel, linux-doc, Waiman Long
As reported by sashiko [1], cpuset_update_tasks_nodemask() will do
mpol_rebind_mm() and possibly cpuset_migrate_mm() for all threads of
a multithreaded process. Since commit 3df9ca0a2b8b ("cpuset: migrate
memory only for threadgroup leaders"), cpuset_attach() had been updated
to rebind and migrate memory only for threadgroup leaders to mark the
group leader as the owner of the mm_struct.
To be consistent and avoid unnecessary performance overhead for heavily
multithreaded processes, follow the cpuset_attach() example and perform
memory rebind and migration only for threadgroup leaders.
Also add a paragraph in cgroup-v2.rst under cpuset.mems that the
threadgroup leader is the memory owner of that threadgroup. Therefore
the non-leading threads shouldn't be in other cgroups whose "cpuset.mems"
doesn't fully overleap that of the group leader.
[1] https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
---
Documentation/admin-guide/cgroup-v2.rst | 7 +++++++
kernel/cgroup/cpuset.c | 4 ++++
2 files changed, 11 insertions(+)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 993446ab66d0..341037c7ec9d 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2527,6 +2527,13 @@ Cpuset Interface Files
a need to change "cpuset.mems" with active tasks, it shouldn't
be done frequently.
+ For a multithreaded process, the threadgroup leader is
+ considered the owner of the group's memory. Memory policy
+ rebinding and migration will only happen with respect to the
+ threadgroup leader. To avoid unexpected result, non-leading
+ threads shouldn't be put into another cgroup whose "cpuset.mems"
+ doesn't full overleap that of the threadgroup leader.
+
cpuset.mems.effective
A read-only multiple values file which exists on all
cpuset-enabled cgroups.
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index bc0207fd6e57..27bc7a466468 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2659,6 +2659,10 @@ void cpuset_update_tasks_nodemask(struct cpuset *cs)
cpuset_change_task_nodemask(task, &newmems);
+ /* Rebind and migrate mm only for task group leader */
+ if (task != task->group_leader)
+ continue;
+
mm = get_task_mm(task);
if (!mm)
continue;
--
2.54.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks()
2026-06-22 22:45 [PATCH 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks() Waiman Long
2026-06-22 22:45 ` [PATCH 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask() Waiman Long
@ 2026-06-23 1:14 ` Ridong Chen
1 sibling, 0 replies; 4+ messages in thread
From: Ridong Chen @ 2026-06-23 1:14 UTC (permalink / raw)
To: Waiman Long, Tejun Heo, Johannes Weiner, Michal Koutný,
Jonathan Corbet, Shuah Khan
Cc: cgroups, linux-kernel, linux-doc
On 6/23/2026 6:45 AM, Waiman Long wrote:
> As reported by sashiko [1], cpuset_hotplug_update_tasks() may perform
> unnecessary task iteration and updating of tasks' CPU and node masks
> when mems_allowed and/or cpus_allowed are not set in cpuset v2. It is
> due to the fact that the temporary new_cpus and new_mems masks do not
> inherit parent's effective_cpus/mems when they are empty which is the
> expected behavior for cpuset v2 since commit 4ec22e9c5a90 ("cpuset:
> Enable cpuset controller in default hierarchy").
>
> Fix that and avoid unnecessay work by adding the empty mask checks and
> inheriting the parent's versions if empty.
>
> [1] https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com
>
> Fixes: 4ec22e9c5a90 ("cpuset: Enable cpuset controller in default hierarchy")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/cgroup/cpuset.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index aff86acea701..bc0207fd6e57 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -3925,6 +3925,14 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
> compute_effective_cpumask(&new_cpus, cs, parent);
> nodes_and(new_mems, cs->mems_allowed, parent->effective_mems);
>
> + if (is_in_v2_mode()) {
> + /* Inherit parent's effective_cpus/mems if empty */
> + if (cpumask_empty(&new_cpus))
> + cpumask_copy(&new_cpus, parent->effective_cpus);
> + if (nodes_empty(new_mems))
> + new_mems = parent->effective_mems;
> + }
> +
> if (!tmp || !cs->partition_root_state)
> goto update_tasks;
>
I noticed that compute_effective_cpumask(...) is called in several
places, so I think the logic should be consolidated into that function.
```
static void compute_effective_cpumask(struct cpumask *new_cpus,
struct cpuset *cs, struct cpuset *parent)
{
cpumask_and(new_cpus, cs->cpus_allowed, parent->effective_cpus);
if (cpumask_empty(&new_cpus) && is_in_v2_mode())
cpumask_copy(&new_cpus, parent->effective_cpus);
}
```
Similarly, for new_mems, should we introduce a dedicated helper like
compute_effective_nodemask? The same fallback logic is needed in
update_nodemasks_hier:
```
static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
{
...
bool has_mems = nodes_and(*new_mems, cp->mems_allowed,
parent->effective_mems);
/*
* If it becomes empty, inherit the effective mask of the
* parent, which is guaranteed to have some MEMs.
*/
if (is_in_v2_mode() && !has_mems)
*new_mems = parent->effective_mems;
...
```
--
Best regards
Ridong
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask()
2026-06-22 22:45 ` [PATCH 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask() Waiman Long
@ 2026-06-23 1:22 ` Ridong Chen
0 siblings, 0 replies; 4+ messages in thread
From: Ridong Chen @ 2026-06-23 1:22 UTC (permalink / raw)
To: Waiman Long, Tejun Heo, Johannes Weiner, Michal Koutný,
Jonathan Corbet, Shuah Khan
Cc: cgroups, linux-kernel, linux-doc
On 6/23/2026 6:45 AM, Waiman Long wrote:
> As reported by sashiko [1], cpuset_update_tasks_nodemask() will do
> mpol_rebind_mm() and possibly cpuset_migrate_mm() for all threads of
> a multithreaded process. Since commit 3df9ca0a2b8b ("cpuset: migrate
> memory only for threadgroup leaders"), cpuset_attach() had been updated
> to rebind and migrate memory only for threadgroup leaders to mark the
> group leader as the owner of the mm_struct.
>
> To be consistent and avoid unnecessary performance overhead for heavily
> multithreaded processes, follow the cpuset_attach() example and perform
> memory rebind and migration only for threadgroup leaders.
>
> Also add a paragraph in cgroup-v2.rst under cpuset.mems that the
> threadgroup leader is the memory owner of that threadgroup. Therefore
> the non-leading threads shouldn't be in other cgroups whose "cpuset.mems"
> doesn't fully overleap that of the group leader.
>
> [1] https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> Documentation/admin-guide/cgroup-v2.rst | 7 +++++++
> kernel/cgroup/cpuset.c | 4 ++++
> 2 files changed, 11 insertions(+)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 993446ab66d0..341037c7ec9d 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -2527,6 +2527,13 @@ Cpuset Interface Files
> a need to change "cpuset.mems" with active tasks, it shouldn't
> be done frequently.
>
> + For a multithreaded process, the threadgroup leader is
> + considered the owner of the group's memory. Memory policy
> + rebinding and migration will only happen with respect to the
> + threadgroup leader. To avoid unexpected result, non-leading
> + threads shouldn't be put into another cgroup whose "cpuset.mems"
> + doesn't full overleap that of the threadgroup leader.
> +
> cpuset.mems.effective
> A read-only multiple values file which exists on all
> cpuset-enabled cgroups.
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index bc0207fd6e57..27bc7a466468 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -2659,6 +2659,10 @@ void cpuset_update_tasks_nodemask(struct cpuset *cs)
>
> cpuset_change_task_nodemask(task, &newmems);
>
> + /* Rebind and migrate mm only for task group leader */
> + if (task != task->group_leader)
> + continue;
> +
Nit.
if (!thread_group_leader(task))
continue;
> mm = get_task_mm(task);
> if (!mm)
> continue;
Reviewed-by: Ridong Chen <ridong.chen@linux.dev>
--
Best regards
Ridong
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-06-23 1:22 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-22 22:45 [PATCH 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks() Waiman Long
2026-06-22 22:45 ` [PATCH 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask() Waiman Long
2026-06-23 1:22 ` Ridong Chen
2026-06-23 1:14 ` [PATCH 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks() Ridong Chen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox