* [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup
@ 2025-08-06 17:24 Waiman Long
2025-08-06 17:24 ` [PATCH 1/3] cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key Waiman Long
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Waiman Long @ 2025-08-06 17:24 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Ingo Molnar, Juri Lelli,
Peter Zijlstra (Intel), Waiman Long
This patch series includes 2 bug fixes patches and 1 cleanup patch.
Waiman Long (3):
cgroup/cpuset: Use static_branch_enable_cpuslocked() on
cpusets_insane_config_key
cgroup/cpuset.c: Fix a partition error with CPU hotplug
cgroup/cpuset: Remove the unnecessary css_get/put() in
cpuset_partition_write()
kernel/cgroup/cpuset.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
--
2.50.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/3] cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key
2025-08-06 17:24 [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup Waiman Long
@ 2025-08-06 17:24 ` Waiman Long
2025-08-07 13:15 ` Juri Lelli
2025-08-06 17:24 ` [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug Waiman Long
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2025-08-06 17:24 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Ingo Molnar, Juri Lelli,
Peter Zijlstra (Intel), Waiman Long
The following lockdep splat was observed.
[ 812.359086] ============================================
[ 812.359089] WARNING: possible recursive locking detected
[ 812.359097] --------------------------------------------
[ 812.359100] runtest.sh/30042 is trying to acquire lock:
[ 812.359105] ffffffffa7f27420 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0xe/0x20
[ 812.359131]
[ 812.359131] but task is already holding lock:
[ 812.359134] ffffffffa7f27420 (cpu_hotplug_lock){++++}-{0:0}, at: cpuset_write_resmask+0x98/0xa70
:
[ 812.359267] Call Trace:
[ 812.359272] <TASK>
[ 812.359367] cpus_read_lock+0x3c/0xe0
[ 812.359382] static_key_enable+0xe/0x20
[ 812.359389] check_insane_mems_config.part.0+0x11/0x30
[ 812.359398] cpuset_write_resmask+0x9f2/0xa70
[ 812.359411] cgroup_file_write+0x1c7/0x660
[ 812.359467] kernfs_fop_write_iter+0x358/0x530
[ 812.359479] vfs_write+0xabe/0x1250
[ 812.359529] ksys_write+0xf9/0x1d0
[ 812.359558] do_syscall_64+0x5f/0xe0
Since commit d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem
and hotplug lock order"), the ordering of cpu hotplug lock
and cpuset_mutex had been reversed. That patch correctly
used the cpuslocked version of the static branch API to enable
cpusets_pre_enable_key and cpusets_enabled_key, but it didn't do the
same for cpusets_insane_config_key.
The cpusets_insane_config_key can be enabled in the
check_insane_mems_config() which is called from update_nodemask()
or cpuset_hotplug_update_tasks() with both cpu hotplug lock and
cpuset_mutex held. Deadlock can happen with a pending hotplug event that
tries to acquire the cpu hotplug write lock which will block further
cpus_read_lock() attempt from check_insane_mems_config(). Fix that by
switching to use static_branch_enable_cpuslocked().
Fixes: d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock order")
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index f74d04429a29..bf149246e001 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -280,7 +280,7 @@ static inline void check_insane_mems_config(nodemask_t *nodes)
{
if (!cpusets_insane_config() &&
movable_only_nodes(nodes)) {
- static_branch_enable(&cpusets_insane_config_key);
+ static_branch_enable_cpuslocked(&cpusets_insane_config_key);
pr_info("Unsupported (movable nodes only) cpuset configuration detected (nmask=%*pbl)!\n"
"Cpuset allocations might fail even with a lot of memory available.\n",
nodemask_pr_args(nodes));
--
2.50.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug
2025-08-06 17:24 [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup Waiman Long
2025-08-06 17:24 ` [PATCH 1/3] cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key Waiman Long
@ 2025-08-06 17:24 ` Waiman Long
2025-08-06 17:30 ` Waiman Long
2025-08-07 2:44 ` Chen Ridong
2025-08-06 17:24 ` [PATCH 3/3] cgroup/cpuset: Remove the unnecessary css_get/put() in cpuset_partition_write() Waiman Long
2025-08-09 18:44 ` [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup Tejun Heo
3 siblings, 2 replies; 11+ messages in thread
From: Waiman Long @ 2025-08-06 17:24 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Ingo Molnar, Juri Lelli,
Peter Zijlstra (Intel), Waiman Long
It was found during testing that an invalid leaf partition with an
empty effective exclusive CPU list can become a valid empty partition
with no CPU afer an offline/online operation of an unrelated CPU. An
empty partition root is allowed in the special case that it has no
task in its cgroup and has distributed out all its CPUs to its child
partitions. That is certainly not the case here.
The problem is in the cpumask_subsets() test in the hotplug case
(update with no new mask) of update_parent_effective_cpumask() as it
also returns true if the effective exclusive CPU list is empty. Fix that
by addding the cpumask_empty() test to root out this exception case.
Also add the cpumask_empty() test in cpuset_hotplug_update_tasks()
to avoid calling update_parent_effective_cpumask() for this special case.
Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index bf149246e001..d993e058a663 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1843,7 +1843,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
if (is_partition_valid(cs))
adding = cpumask_and(tmp->addmask,
xcpus, parent->effective_xcpus);
- } else if (is_partition_invalid(cs) &&
+ } else if (is_partition_invalid(cs) && !cpumask_empty(xcpus) &&
cpumask_subset(xcpus, parent->effective_xcpus)) {
struct cgroup_subsys_state *css;
struct cpuset *child;
@@ -3870,9 +3870,10 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
partcmd = partcmd_invalidate;
/*
* On the other hand, an invalid partition root may be transitioned
- * back to a regular one.
+ * back to a regular one with a non-empty effective xcpus.
*/
- else if (is_partition_valid(parent) && is_partition_invalid(cs))
+ else if (is_partition_valid(parent) && is_partition_invalid(cs) &&
+ !cpumask_empty(cs->effective_xcpus))
partcmd = partcmd_update;
if (partcmd >= 0) {
--
2.50.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/3] cgroup/cpuset: Remove the unnecessary css_get/put() in cpuset_partition_write()
2025-08-06 17:24 [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup Waiman Long
2025-08-06 17:24 ` [PATCH 1/3] cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key Waiman Long
2025-08-06 17:24 ` [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug Waiman Long
@ 2025-08-06 17:24 ` Waiman Long
2025-08-07 11:25 ` Michal Koutný
2025-08-09 18:44 ` [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup Tejun Heo
3 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2025-08-06 17:24 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Ingo Molnar, Juri Lelli,
Peter Zijlstra (Intel), Waiman Long
The css_get/put() calls in cpuset_partition_write() are unnecessary as
an active reference of the kernfs node will be taken which will prevent
its removal and guarantee the existence of the css. Only the online
check is needed.
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index d993e058a663..27adb04df675 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3358,14 +3358,12 @@ static ssize_t cpuset_partition_write(struct kernfs_open_file *of, char *buf,
else
return -EINVAL;
- css_get(&cs->css);
cpus_read_lock();
mutex_lock(&cpuset_mutex);
if (is_cpuset_online(cs))
retval = update_prstate(cs, val);
mutex_unlock(&cpuset_mutex);
cpus_read_unlock();
- css_put(&cs->css);
return retval ?: nbytes;
}
--
2.50.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug
2025-08-06 17:24 ` [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug Waiman Long
@ 2025-08-06 17:30 ` Waiman Long
2025-08-07 2:44 ` Chen Ridong
1 sibling, 0 replies; 11+ messages in thread
From: Waiman Long @ 2025-08-06 17:30 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Ingo Molnar, Juri Lelli,
Peter Zijlstra (Intel)
On 8/6/25 1:24 PM, Waiman Long wrote:
> It was found during testing that an invalid leaf partition with an
> empty effective exclusive CPU list can become a valid empty partition
> with no CPU afer an offline/online operation of an unrelated CPU. An
> empty partition root is allowed in the special case that it has no
> task in its cgroup and has distributed out all its CPUs to its child
> partitions. That is certainly not the case here.
>
> The problem is in the cpumask_subsets() test in the hotplug case
> (update with no new mask) of update_parent_effective_cpumask() as it
> also returns true if the effective exclusive CPU list is empty. Fix that
> by addding the cpumask_empty() test to root out this exception case.
> Also add the cpumask_empty() test in cpuset_hotplug_update_tasks()
> to avoid calling update_parent_effective_cpumask() for this special case.
>
> Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
> Signed-off-by: Waiman Long <longman@redhat.com>
Oh, forget to remove the unneeded ".c" from the patch title.
-Longman
> ---
> kernel/cgroup/cpuset.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index bf149246e001..d993e058a663 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1843,7 +1843,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
> if (is_partition_valid(cs))
> adding = cpumask_and(tmp->addmask,
> xcpus, parent->effective_xcpus);
> - } else if (is_partition_invalid(cs) &&
> + } else if (is_partition_invalid(cs) && !cpumask_empty(xcpus) &&
> cpumask_subset(xcpus, parent->effective_xcpus)) {
> struct cgroup_subsys_state *css;
> struct cpuset *child;
> @@ -3870,9 +3870,10 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
> partcmd = partcmd_invalidate;
> /*
> * On the other hand, an invalid partition root may be transitioned
> - * back to a regular one.
> + * back to a regular one with a non-empty effective xcpus.
> */
> - else if (is_partition_valid(parent) && is_partition_invalid(cs))
> + else if (is_partition_valid(parent) && is_partition_invalid(cs) &&
> + !cpumask_empty(cs->effective_xcpus))
> partcmd = partcmd_update;
>
> if (partcmd >= 0) {
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug
2025-08-06 17:24 ` [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug Waiman Long
2025-08-06 17:30 ` Waiman Long
@ 2025-08-07 2:44 ` Chen Ridong
2025-08-07 13:54 ` Waiman Long
1 sibling, 1 reply; 11+ messages in thread
From: Chen Ridong @ 2025-08-07 2:44 UTC (permalink / raw)
To: Waiman Long, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Ingo Molnar, Juri Lelli,
Peter Zijlstra (Intel)
On 2025/8/7 1:24, Waiman Long wrote:
> It was found during testing that an invalid leaf partition with an
> empty effective exclusive CPU list can become a valid empty partition
> with no CPU afer an offline/online operation of an unrelated CPU. An
> empty partition root is allowed in the special case that it has no
> task in its cgroup and has distributed out all its CPUs to its child
> partitions. That is certainly not the case here.
>
> The problem is in the cpumask_subsets() test in the hotplug case
> (update with no new mask) of update_parent_effective_cpumask() as it
> also returns true if the effective exclusive CPU list is empty. Fix that
> by addding the cpumask_empty() test to root out this exception case.
> Also add the cpumask_empty() test in cpuset_hotplug_update_tasks()
> to avoid calling update_parent_effective_cpumask() for this special case.
>
> Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/cgroup/cpuset.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index bf149246e001..d993e058a663 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1843,7 +1843,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
> if (is_partition_valid(cs))
> adding = cpumask_and(tmp->addmask,
> xcpus, parent->effective_xcpus);
> - } else if (is_partition_invalid(cs) &&
> + } else if (is_partition_invalid(cs) && !cpumask_empty(xcpus) &&
> cpumask_subset(xcpus, parent->effective_xcpus)) {
> struct cgroup_subsys_state *css;
> struct cpuset *child;
This path looks good to me.
However, I found the update_parent_effective_cpumask function a bit difficult to follow due to its
complexity.
To improve readability, could we refactor the partcmd_enable, partcmd_disable, partcmd_update and
partcmd_invalidate logic into separate, well-defined function blocks? I'd be happy to take
ownership of this refactoring work if you agree with the approach.
Best regards,
Ridong
> @@ -3870,9 +3870,10 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
> partcmd = partcmd_invalidate;
> /*
> * On the other hand, an invalid partition root may be transitioned
> - * back to a regular one.
> + * back to a regular one with a non-empty effective xcpus.
> */
> - else if (is_partition_valid(parent) && is_partition_invalid(cs))
> + else if (is_partition_valid(parent) && is_partition_invalid(cs) &&
> + !cpumask_empty(cs->effective_xcpus))
> partcmd = partcmd_update;
>
> if (partcmd >= 0) {
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] cgroup/cpuset: Remove the unnecessary css_get/put() in cpuset_partition_write()
2025-08-06 17:24 ` [PATCH 3/3] cgroup/cpuset: Remove the unnecessary css_get/put() in cpuset_partition_write() Waiman Long
@ 2025-08-07 11:25 ` Michal Koutný
0 siblings, 0 replies; 11+ messages in thread
From: Michal Koutný @ 2025-08-07 11:25 UTC (permalink / raw)
To: Waiman Long
Cc: Tejun Heo, Johannes Weiner, cgroups, linux-kernel, Ingo Molnar,
Juri Lelli, Peter Zijlstra (Intel)
[-- Attachment #1: Type: text/plain, Size: 503 bytes --]
On Wed, Aug 06, 2025 at 01:24:30PM -0400, Waiman Long <longman@redhat.com> wrote:
> The css_get/put() calls in cpuset_partition_write() are unnecessary as
> an active reference of the kernfs node will be taken which will prevent
> its removal and guarantee the existence of the css. Only the online
> check is needed.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/cgroup/cpuset.c | 2 --
> 1 file changed, 2 deletions(-)
Reviewed-by: Michal Koutný <mkoutny@suse.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key
2025-08-06 17:24 ` [PATCH 1/3] cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key Waiman Long
@ 2025-08-07 13:15 ` Juri Lelli
2025-08-07 13:47 ` Waiman Long
0 siblings, 1 reply; 11+ messages in thread
From: Juri Lelli @ 2025-08-07 13:15 UTC (permalink / raw)
To: Waiman Long
Cc: Tejun Heo, Johannes Weiner, Michal Koutný, cgroups,
linux-kernel, Ingo Molnar, Peter Zijlstra (Intel)
Hi,
On 06/08/25 13:24, Waiman Long wrote:
> The following lockdep splat was observed.
>
> [ 812.359086] ============================================
> [ 812.359089] WARNING: possible recursive locking detected
> [ 812.359097] --------------------------------------------
> [ 812.359100] runtest.sh/30042 is trying to acquire lock:
> [ 812.359105] ffffffffa7f27420 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0xe/0x20
> [ 812.359131]
> [ 812.359131] but task is already holding lock:
> [ 812.359134] ffffffffa7f27420 (cpu_hotplug_lock){++++}-{0:0}, at: cpuset_write_resmask+0x98/0xa70
> :
> [ 812.359267] Call Trace:
> [ 812.359272] <TASK>
> [ 812.359367] cpus_read_lock+0x3c/0xe0
> [ 812.359382] static_key_enable+0xe/0x20
> [ 812.359389] check_insane_mems_config.part.0+0x11/0x30
> [ 812.359398] cpuset_write_resmask+0x9f2/0xa70
> [ 812.359411] cgroup_file_write+0x1c7/0x660
> [ 812.359467] kernfs_fop_write_iter+0x358/0x530
> [ 812.359479] vfs_write+0xabe/0x1250
> [ 812.359529] ksys_write+0xf9/0x1d0
> [ 812.359558] do_syscall_64+0x5f/0xe0
>
> Since commit d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem
> and hotplug lock order"), the ordering of cpu hotplug lock
> and cpuset_mutex had been reversed. That patch correctly
> used the cpuslocked version of the static branch API to enable
> cpusets_pre_enable_key and cpusets_enabled_key, but it didn't do the
> same for cpusets_insane_config_key.
>
> The cpusets_insane_config_key can be enabled in the
> check_insane_mems_config() which is called from update_nodemask()
> or cpuset_hotplug_update_tasks() with both cpu hotplug lock and
> cpuset_mutex held. Deadlock can happen with a pending hotplug event that
> tries to acquire the cpu hotplug write lock which will block further
> cpus_read_lock() attempt from check_insane_mems_config(). Fix that by
> switching to use static_branch_enable_cpuslocked().
>
> Fixes: d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock order")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
Looks good to me. Thanks for spotting and fixing this.
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Best,
Juri
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key
2025-08-07 13:15 ` Juri Lelli
@ 2025-08-07 13:47 ` Waiman Long
0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2025-08-07 13:47 UTC (permalink / raw)
To: Juri Lelli
Cc: Tejun Heo, Johannes Weiner, Michal Koutný, cgroups,
linux-kernel, Ingo Molnar, Peter Zijlstra (Intel)
On 8/7/25 9:15 AM, Juri Lelli wrote:
> Hi,
>
> On 06/08/25 13:24, Waiman Long wrote:
>> The following lockdep splat was observed.
>>
>> [ 812.359086] ============================================
>> [ 812.359089] WARNING: possible recursive locking detected
>> [ 812.359097] --------------------------------------------
>> [ 812.359100] runtest.sh/30042 is trying to acquire lock:
>> [ 812.359105] ffffffffa7f27420 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0xe/0x20
>> [ 812.359131]
>> [ 812.359131] but task is already holding lock:
>> [ 812.359134] ffffffffa7f27420 (cpu_hotplug_lock){++++}-{0:0}, at: cpuset_write_resmask+0x98/0xa70
>> :
>> [ 812.359267] Call Trace:
>> [ 812.359272] <TASK>
>> [ 812.359367] cpus_read_lock+0x3c/0xe0
>> [ 812.359382] static_key_enable+0xe/0x20
>> [ 812.359389] check_insane_mems_config.part.0+0x11/0x30
>> [ 812.359398] cpuset_write_resmask+0x9f2/0xa70
>> [ 812.359411] cgroup_file_write+0x1c7/0x660
>> [ 812.359467] kernfs_fop_write_iter+0x358/0x530
>> [ 812.359479] vfs_write+0xabe/0x1250
>> [ 812.359529] ksys_write+0xf9/0x1d0
>> [ 812.359558] do_syscall_64+0x5f/0xe0
>>
>> Since commit d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem
>> and hotplug lock order"), the ordering of cpu hotplug lock
>> and cpuset_mutex had been reversed. That patch correctly
>> used the cpuslocked version of the static branch API to enable
>> cpusets_pre_enable_key and cpusets_enabled_key, but it didn't do the
>> same for cpusets_insane_config_key.
>>
>> The cpusets_insane_config_key can be enabled in the
>> check_insane_mems_config() which is called from update_nodemask()
>> or cpuset_hotplug_update_tasks() with both cpu hotplug lock and
>> cpuset_mutex held. Deadlock can happen with a pending hotplug event that
>> tries to acquire the cpu hotplug write lock which will block further
>> cpus_read_lock() attempt from check_insane_mems_config(). Fix that by
>> switching to use static_branch_enable_cpuslocked().
>>
>> Fixes: d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock order")
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
> Looks good to me. Thanks for spotting and fixing this.
>
> Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
It is really a corner case that is not easy to trigger. I would have
missed that myself.
Thanks,
Longman
>
> Best,
> Juri
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug
2025-08-07 2:44 ` Chen Ridong
@ 2025-08-07 13:54 ` Waiman Long
0 siblings, 0 replies; 11+ messages in thread
From: Waiman Long @ 2025-08-07 13:54 UTC (permalink / raw)
To: Chen Ridong, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Ingo Molnar, Juri Lelli,
Peter Zijlstra (Intel)
On 8/6/25 10:44 PM, Chen Ridong wrote:
>
> On 2025/8/7 1:24, Waiman Long wrote:
>> It was found during testing that an invalid leaf partition with an
>> empty effective exclusive CPU list can become a valid empty partition
>> with no CPU afer an offline/online operation of an unrelated CPU. An
>> empty partition root is allowed in the special case that it has no
>> task in its cgroup and has distributed out all its CPUs to its child
>> partitions. That is certainly not the case here.
>>
>> The problem is in the cpumask_subsets() test in the hotplug case
>> (update with no new mask) of update_parent_effective_cpumask() as it
>> also returns true if the effective exclusive CPU list is empty. Fix that
>> by addding the cpumask_empty() test to root out this exception case.
>> Also add the cpumask_empty() test in cpuset_hotplug_update_tasks()
>> to avoid calling update_parent_effective_cpumask() for this special case.
>>
>> Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>> kernel/cgroup/cpuset.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index bf149246e001..d993e058a663 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -1843,7 +1843,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
>> if (is_partition_valid(cs))
>> adding = cpumask_and(tmp->addmask,
>> xcpus, parent->effective_xcpus);
>> - } else if (is_partition_invalid(cs) &&
>> + } else if (is_partition_invalid(cs) && !cpumask_empty(xcpus) &&
>> cpumask_subset(xcpus, parent->effective_xcpus)) {
>> struct cgroup_subsys_state *css;
>> struct cpuset *child;
> This path looks good to me.
>
> However, I found the update_parent_effective_cpumask function a bit difficult to follow due to its
> complexity.
>
> To improve readability, could we refactor the partcmd_enable, partcmd_disable, partcmd_update and
> partcmd_invalidate logic into separate, well-defined function blocks? I'd be happy to take
> ownership of this refactoring work if you agree with the approach.
I agree that the code can be a bit hard to read. You are more than
welcome to improve the readability of the code if you have time.
Cheers,
Longman
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup
2025-08-06 17:24 [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup Waiman Long
` (2 preceding siblings ...)
2025-08-06 17:24 ` [PATCH 3/3] cgroup/cpuset: Remove the unnecessary css_get/put() in cpuset_partition_write() Waiman Long
@ 2025-08-09 18:44 ` Tejun Heo
3 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2025-08-09 18:44 UTC (permalink / raw)
To: Waiman Long
Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel,
Ingo Molnar, Juri Lelli, Peter Zijlstra (Intel)
On Wed, Aug 06, 2025 at 01:24:27PM -0400, Waiman Long wrote:
> This patch series includes 2 bug fixes patches and 1 cleanup patch.
>
> Waiman Long (3):
> cgroup/cpuset: Use static_branch_enable_cpuslocked() on
> cpusets_insane_config_key
> cgroup/cpuset.c: Fix a partition error with CPU hotplug
> cgroup/cpuset: Remove the unnecessary css_get/put() in
> cpuset_partition_write()
Applied 1-3 to cgroup/for-6.17-fixes.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-08-09 18:44 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-06 17:24 [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup Waiman Long
2025-08-06 17:24 ` [PATCH 1/3] cgroup/cpuset: Use static_branch_enable_cpuslocked() on cpusets_insane_config_key Waiman Long
2025-08-07 13:15 ` Juri Lelli
2025-08-07 13:47 ` Waiman Long
2025-08-06 17:24 ` [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug Waiman Long
2025-08-06 17:30 ` Waiman Long
2025-08-07 2:44 ` Chen Ridong
2025-08-07 13:54 ` Waiman Long
2025-08-06 17:24 ` [PATCH 3/3] cgroup/cpuset: Remove the unnecessary css_get/put() in cpuset_partition_write() Waiman Long
2025-08-07 11:25 ` Michal Koutný
2025-08-09 18:44 ` [PATCH 0/3] cgroup/cpuset: Miscellaneous fixes and cleanup Tejun Heo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).