* [cgroup/for-6.19 PATCH v2 0/3] cgroup/cpuset: Additional housekeeping check & cleanup
@ 2025-11-04 1:30 Waiman Long
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 1/3] cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() Waiman Long
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Waiman Long @ 2025-11-04 1:30 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Chen Ridong, Gabriele Monaco,
Frederic Weisbecker, Waiman Long
v2: Update patch 2 to correct some of the checking code and add
prstate_housekeeping_conflict() as suggested by Chen Ridong.
The first two patches are based on the two cpuset patches from the
"timers: Exclude isolated cpus from timer migration" patch series [1]
with some modifications. They add an additional nohz_full housekeeping
mask check to ensure that there is at least one housekeeping CPU that
is neither domain isolated nor nohz_full isolated.
The last patch is another cleanup patch to simplify the current
partition code.
[1] https://lore.kernel.org/lkml/20251020112802.102451-1-gmonaco@redhat.com/
Gabriele Monaco (1):
cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to
update_isolation_cpumasks()
Waiman Long (2):
cgroup/cpuset: Fail if isolated and nohz_full don't leave any
housekeeping
cgroup/cpuset: Globally track isolated_cpus update
kernel/cgroup/cpuset.c | 148 ++++++++++++++++++++++++++++++-----------
1 file changed, 109 insertions(+), 39 deletions(-)
--
2.51.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [cgroup/for-6.19 PATCH v2 1/3] cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks()
2025-11-04 1:30 [cgroup/for-6.19 PATCH v2 0/3] cgroup/cpuset: Additional housekeeping check & cleanup Waiman Long
@ 2025-11-04 1:30 ` Waiman Long
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 2/3] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping Waiman Long
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 3/3] cgroup/cpuset: Globally track isolated_cpus update Waiman Long
2 siblings, 0 replies; 8+ messages in thread
From: Waiman Long @ 2025-11-04 1:30 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Chen Ridong, Gabriele Monaco,
Frederic Weisbecker, Waiman Long, Chen Ridong
From: Gabriele Monaco <gmonaco@redhat.com>
update_unbound_workqueue_cpumask() updates unbound workqueue settings
when there's a change in isolated CPUs, but it can also serve other
subsystems requiring updates when isolated CPUs change.
Generalise the name to update_isolation_cpumasks() to prepare for other
functions unrelated to workqueues to be called in that spot.
[longman: Change the function name to update_isolation_cpumasks()]
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Chen Ridong <chenridong@huaweicloud.com>
---
kernel/cgroup/cpuset.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 7aef59ea9627..da770dac955e 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1393,7 +1393,7 @@ static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
return isolcpus_updated;
}
-static void update_unbound_workqueue_cpumask(bool isolcpus_updated)
+static void update_isolation_cpumasks(bool isolcpus_updated)
{
int ret;
@@ -1557,7 +1557,7 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
list_add(&cs->remote_sibling, &remote_children);
cpumask_copy(cs->effective_xcpus, tmp->new_cpus);
spin_unlock_irq(&callback_lock);
- update_unbound_workqueue_cpumask(isolcpus_updated);
+ update_isolation_cpumasks(isolcpus_updated);
cpuset_force_rebuild();
cs->prs_err = 0;
@@ -1598,7 +1598,7 @@ static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
compute_excpus(cs, cs->effective_xcpus);
reset_partition_data(cs);
spin_unlock_irq(&callback_lock);
- update_unbound_workqueue_cpumask(isolcpus_updated);
+ update_isolation_cpumasks(isolcpus_updated);
cpuset_force_rebuild();
/*
@@ -1667,7 +1667,7 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
if (xcpus)
cpumask_copy(cs->exclusive_cpus, xcpus);
spin_unlock_irq(&callback_lock);
- update_unbound_workqueue_cpumask(isolcpus_updated);
+ update_isolation_cpumasks(isolcpus_updated);
if (adding || deleting)
cpuset_force_rebuild();
@@ -2011,7 +2011,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
tmp->delmask);
spin_unlock_irq(&callback_lock);
- update_unbound_workqueue_cpumask(isolcpus_updated);
+ update_isolation_cpumasks(isolcpus_updated);
if ((old_prs != new_prs) && (cmd == partcmd_update))
update_partition_exclusive_flag(cs, new_prs);
@@ -3029,7 +3029,7 @@ static int update_prstate(struct cpuset *cs, int new_prs)
else if (isolcpus_updated)
isolated_cpus_update(old_prs, new_prs, cs->effective_xcpus);
spin_unlock_irq(&callback_lock);
- update_unbound_workqueue_cpumask(isolcpus_updated);
+ update_isolation_cpumasks(isolcpus_updated);
/* Force update if switching back to member & update effective_xcpus */
update_cpumasks_hier(cs, &tmpmask, !new_prs);
--
2.51.1
* [cgroup/for-6.19 PATCH v2 2/3] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping
2025-11-04 1:30 [cgroup/for-6.19 PATCH v2 0/3] cgroup/cpuset: Additional housekeeping check & cleanup Waiman Long
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 1/3] cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() Waiman Long
@ 2025-11-04 1:30 ` Waiman Long
2025-11-04 2:19 ` Chen Ridong
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 3/3] cgroup/cpuset: Globally track isolated_cpus update Waiman Long
2 siblings, 1 reply; 8+ messages in thread
From: Waiman Long @ 2025-11-04 1:30 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Chen Ridong, Gabriele Monaco,
Frederic Weisbecker, Waiman Long
Currently the user can set up isolated CPUs via cpuset and nohz_full in
such a way that leaves no housekeeping CPU (i.e. no CPU that is neither
domain isolated nor nohz_full). This can be a problem for other
subsystems (e.g. the timer wheel migration).
Prevent this configuration by blocking any assignment that would cause
the union of domain-isolated CPUs and nohz_full CPUs to cover all CPUs.
[longman: Remove isolated_cpus_should_update() and rewrite the checking
in update_prstate() and update_parent_effective_cpumask(), also add
prstate_housekeeping_conflict() check in update_prstate() as
suggested by Chen Ridong]
Originally-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset.c | 75 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 74 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index da770dac955e..0c49905df394 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1393,6 +1393,45 @@ static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
return isolcpus_updated;
}
+/*
+ * isolated_cpus_can_update - check for isolated & nohz_full conflicts
+ * @add_cpus: cpu mask for cpus that are going to be isolated
+ * @del_cpus: cpu mask for cpus that are no longer isolated, can be NULL
+ * Return: false if there is conflict, true otherwise
+ *
+ * If nohz_full is enabled and we have isolated CPUs, their combination must
+ * still leave housekeeping CPUs.
+ *
+ * TBD: Should consider merging this function into
+ * prstate_housekeeping_conflict().
+ */
+static bool isolated_cpus_can_update(struct cpumask *add_cpus,
+ struct cpumask *del_cpus)
+{
+ cpumask_var_t full_hk_cpus;
+ int res = true;
+
+ if (!housekeeping_enabled(HK_TYPE_KERNEL_NOISE))
+ return true;
+
+ if (del_cpus && cpumask_weight_and(del_cpus,
+ housekeeping_cpumask(HK_TYPE_KERNEL_NOISE)))
+ return true;
+
+ if (!alloc_cpumask_var(&full_hk_cpus, GFP_KERNEL))
+ return false;
+
+ cpumask_and(full_hk_cpus, housekeeping_cpumask(HK_TYPE_KERNEL_NOISE),
+ housekeeping_cpumask(HK_TYPE_DOMAIN));
+ cpumask_andnot(full_hk_cpus, full_hk_cpus, isolated_cpus);
+ cpumask_and(full_hk_cpus, full_hk_cpus, cpu_active_mask);
+ if (!cpumask_weight_andnot(full_hk_cpus, add_cpus))
+ res = false;
+
+ free_cpumask_var(full_hk_cpus);
+ return res;
+}
+
static void update_isolation_cpumasks(bool isolcpus_updated)
{
int ret;
@@ -1551,6 +1590,9 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
if (!cpumask_intersects(tmp->new_cpus, cpu_active_mask) ||
cpumask_subset(top_cpuset.effective_cpus, tmp->new_cpus))
return PERR_INVCPUS;
+ if ((new_prs == PRS_ISOLATED) &&
+ !isolated_cpus_can_update(tmp->new_cpus, NULL))
+ return PERR_HKEEPING;
spin_lock_irq(&callback_lock);
isolcpus_updated = partition_xcpus_add(new_prs, NULL, tmp->new_cpus);
@@ -1650,6 +1692,9 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
else if (cpumask_intersects(tmp->addmask, subpartitions_cpus) ||
cpumask_subset(top_cpuset.effective_cpus, tmp->addmask))
cs->prs_err = PERR_NOCPUS;
+ else if ((prs == PRS_ISOLATED) &&
+ !isolated_cpus_can_update(tmp->addmask, tmp->delmask))
+ cs->prs_err = PERR_HKEEPING;
if (cs->prs_err)
goto invalidate;
}
@@ -1750,6 +1795,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
int part_error = PERR_NONE; /* Partition error? */
int isolcpus_updated = 0;
struct cpumask *xcpus = user_xcpus(cs);
+ int parent_prs = parent->partition_root_state;
bool nocpu;
lockdep_assert_held(&cpuset_mutex);
@@ -1813,6 +1859,10 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
if (prstate_housekeeping_conflict(new_prs, xcpus))
return PERR_HKEEPING;
+ if ((new_prs == PRS_ISOLATED) && (new_prs != parent_prs) &&
+ !isolated_cpus_can_update(xcpus, NULL))
+ return PERR_HKEEPING;
+
if (tasks_nocpu_error(parent, cs, xcpus))
return PERR_NOCPUS;
@@ -1866,6 +1916,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
*
* For invalid partition:
* delmask = newmask & parent->effective_xcpus
+ * The partition may become valid soon.
*/
if (is_partition_invalid(cs)) {
adding = false;
@@ -1880,6 +1931,23 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
deleting = cpumask_and(tmp->delmask, tmp->delmask,
parent->effective_xcpus);
}
+
+ /*
+ * TBD: Invalidating a currently valid child root partition may
+ * still break the isolated_cpus_can_update() rule if the parent
+ * is an isolated partition.
+ */
+ if (is_partition_valid(cs) && (old_prs != parent_prs)) {
+ if ((parent_prs == PRS_ROOT) &&
+ /* Adding to parent means removing isolated CPUs */
+ !isolated_cpus_can_update(tmp->delmask, tmp->addmask))
+ part_error = PERR_HKEEPING;
+ if ((parent_prs == PRS_ISOLATED) &&
+ /* Adding to parent means adding isolated CPUs */
+ !isolated_cpus_can_update(tmp->addmask, tmp->delmask))
+ part_error = PERR_HKEEPING;
+ }
+
/*
* The new CPUs to be removed from parent's effective CPUs
* must be present.
@@ -2994,7 +3062,12 @@ static int update_prstate(struct cpuset *cs, int new_prs)
* A change in load balance state only, no change in cpumasks.
* Need to update isolated_cpus.
*/
- isolcpus_updated = true;
+ if (((new_prs == PRS_ISOLATED) &&
+ !isolated_cpus_can_update(cs->effective_xcpus, NULL)) ||
+ prstate_housekeeping_conflict(new_prs, cs->effective_xcpus))
+ err = PERR_HKEEPING;
+ else
+ isolcpus_updated = true;
} else {
/*
* Switching back to member is always allowed even if it
--
2.51.1
* [cgroup/for-6.19 PATCH v2 3/3] cgroup/cpuset: Globally track isolated_cpus update
2025-11-04 1:30 [cgroup/for-6.19 PATCH v2 0/3] cgroup/cpuset: Additional housekeeping check & cleanup Waiman Long
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 1/3] cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() Waiman Long
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 2/3] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping Waiman Long
@ 2025-11-04 1:30 ` Waiman Long
2025-11-04 2:21 ` Chen Ridong
2 siblings, 1 reply; 8+ messages in thread
From: Waiman Long @ 2025-11-04 1:30 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Chen Ridong, Gabriele Monaco,
Frederic Weisbecker, Waiman Long
The current cpuset code passes a local isolcpus_updated flag around in a
number of functions to determine if external isolation-related cpumasks
like wq_unbound_cpumask should be updated. This is a bit cumbersome and
makes the code more complex. Simplify the code by using a global boolean
flag "isolated_cpus_updating" to track this. This flag will be set in
isolated_cpus_update() and cleared in update_isolation_cpumasks().
No functional change is expected.
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset.c | 73 ++++++++++++++++++++----------------------
1 file changed, 35 insertions(+), 38 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 0c49905df394..854fe4cc1358 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -81,6 +81,13 @@ static cpumask_var_t subpartitions_cpus;
*/
static cpumask_var_t isolated_cpus;
+/*
+ * isolated_cpus updating flag (protected by cpuset_mutex)
+ * Set if isolated_cpus is going to be updated in the current
+ * cpuset_mutex critical section.
+ */
+static bool isolated_cpus_updating;
+
/*
* Housekeeping (HK_TYPE_DOMAIN) CPUs at boot
*/
@@ -1327,6 +1334,8 @@ static void isolated_cpus_update(int old_prs, int new_prs, struct cpumask *xcpus
cpumask_or(isolated_cpus, isolated_cpus, xcpus);
else
cpumask_andnot(isolated_cpus, isolated_cpus, xcpus);
+
+ isolated_cpus_updating = true;
}
/*
@@ -1334,15 +1343,12 @@ static void isolated_cpus_update(int old_prs, int new_prs, struct cpumask *xcpus
* @new_prs: new partition_root_state
* @parent: parent cpuset
* @xcpus: exclusive CPUs to be added
- * Return: true if isolated_cpus modified, false otherwise
*
* Remote partition if parent == NULL
*/
-static bool partition_xcpus_add(int new_prs, struct cpuset *parent,
+static void partition_xcpus_add(int new_prs, struct cpuset *parent,
struct cpumask *xcpus)
{
- bool isolcpus_updated;
-
WARN_ON_ONCE(new_prs < 0);
lockdep_assert_held(&callback_lock);
if (!parent)
@@ -1352,13 +1358,11 @@ static bool partition_xcpus_add(int new_prs, struct cpuset *parent,
if (parent == &top_cpuset)
cpumask_or(subpartitions_cpus, subpartitions_cpus, xcpus);
- isolcpus_updated = (new_prs != parent->partition_root_state);
- if (isolcpus_updated)
+ if (new_prs != parent->partition_root_state)
isolated_cpus_update(parent->partition_root_state, new_prs,
xcpus);
cpumask_andnot(parent->effective_cpus, parent->effective_cpus, xcpus);
- return isolcpus_updated;
}
/*
@@ -1366,15 +1370,12 @@ static bool partition_xcpus_add(int new_prs, struct cpuset *parent,
* @old_prs: old partition_root_state
* @parent: parent cpuset
* @xcpus: exclusive CPUs to be removed
- * Return: true if isolated_cpus modified, false otherwise
*
* Remote partition if parent == NULL
*/
-static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
+static void partition_xcpus_del(int old_prs, struct cpuset *parent,
struct cpumask *xcpus)
{
- bool isolcpus_updated;
-
WARN_ON_ONCE(old_prs < 0);
lockdep_assert_held(&callback_lock);
if (!parent)
@@ -1383,14 +1384,12 @@ static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
if (parent == &top_cpuset)
cpumask_andnot(subpartitions_cpus, subpartitions_cpus, xcpus);
- isolcpus_updated = (old_prs != parent->partition_root_state);
- if (isolcpus_updated)
+ if (old_prs != parent->partition_root_state)
isolated_cpus_update(old_prs, parent->partition_root_state,
xcpus);
cpumask_and(xcpus, xcpus, cpu_active_mask);
cpumask_or(parent->effective_cpus, parent->effective_cpus, xcpus);
- return isolcpus_updated;
}
/*
@@ -1432,17 +1431,24 @@ static bool isolated_cpus_can_update(struct cpumask *add_cpus,
return res;
}
-static void update_isolation_cpumasks(bool isolcpus_updated)
+/*
+ * update_isolation_cpumasks - Update external isolation related CPU masks
+ *
+ * The following external CPU masks will be updated if necessary:
+ * - workqueue unbound cpumask
+ */
+static void update_isolation_cpumasks(void)
{
int ret;
- lockdep_assert_cpus_held();
-
- if (!isolcpus_updated)
+ if (!isolated_cpus_updating)
return;
+ lockdep_assert_cpus_held();
+
ret = workqueue_unbound_exclude_cpumask(isolated_cpus);
WARN_ON_ONCE(ret < 0);
+ isolated_cpus_updating = false;
}
/**
@@ -1567,8 +1573,6 @@ static inline bool is_local_partition(struct cpuset *cs)
static int remote_partition_enable(struct cpuset *cs, int new_prs,
struct tmpmasks *tmp)
{
- bool isolcpus_updated;
-
/*
* The user must have sysadmin privilege.
*/
@@ -1595,11 +1599,11 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
return PERR_HKEEPING;
spin_lock_irq(&callback_lock);
- isolcpus_updated = partition_xcpus_add(new_prs, NULL, tmp->new_cpus);
+ partition_xcpus_add(new_prs, NULL, tmp->new_cpus);
list_add(&cs->remote_sibling, &remote_children);
cpumask_copy(cs->effective_xcpus, tmp->new_cpus);
spin_unlock_irq(&callback_lock);
- update_isolation_cpumasks(isolcpus_updated);
+ update_isolation_cpumasks();
cpuset_force_rebuild();
cs->prs_err = 0;
@@ -1622,15 +1626,12 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
*/
static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
{
- bool isolcpus_updated;
-
WARN_ON_ONCE(!is_remote_partition(cs));
WARN_ON_ONCE(!cpumask_subset(cs->effective_xcpus, subpartitions_cpus));
spin_lock_irq(&callback_lock);
list_del_init(&cs->remote_sibling);
- isolcpus_updated = partition_xcpus_del(cs->partition_root_state,
- NULL, cs->effective_xcpus);
+ partition_xcpus_del(cs->partition_root_state, NULL, cs->effective_xcpus);
if (cs->prs_err)
cs->partition_root_state = -cs->partition_root_state;
else
@@ -1640,7 +1641,7 @@ static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
compute_excpus(cs, cs->effective_xcpus);
reset_partition_data(cs);
spin_unlock_irq(&callback_lock);
- update_isolation_cpumasks(isolcpus_updated);
+ update_isolation_cpumasks();
cpuset_force_rebuild();
/*
@@ -1665,7 +1666,6 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
{
bool adding, deleting;
int prs = cs->partition_root_state;
- int isolcpus_updated = 0;
if (WARN_ON_ONCE(!is_remote_partition(cs)))
return;
@@ -1701,9 +1701,9 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
spin_lock_irq(&callback_lock);
if (adding)
- isolcpus_updated += partition_xcpus_add(prs, NULL, tmp->addmask);
+ partition_xcpus_add(prs, NULL, tmp->addmask);
if (deleting)
- isolcpus_updated += partition_xcpus_del(prs, NULL, tmp->delmask);
+ partition_xcpus_del(prs, NULL, tmp->delmask);
/*
* Need to update effective_xcpus and exclusive_cpus now as
* update_sibling_cpumasks() below may iterate back to the same cs.
@@ -1712,7 +1712,7 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
if (xcpus)
cpumask_copy(cs->exclusive_cpus, xcpus);
spin_unlock_irq(&callback_lock);
- update_isolation_cpumasks(isolcpus_updated);
+ update_isolation_cpumasks();
if (adding || deleting)
cpuset_force_rebuild();
@@ -1793,7 +1793,6 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
int deleting; /* Deleting cpus from parent's effective_cpus */
int old_prs, new_prs;
int part_error = PERR_NONE; /* Partition error? */
- int isolcpus_updated = 0;
struct cpumask *xcpus = user_xcpus(cs);
int parent_prs = parent->partition_root_state;
bool nocpu;
@@ -2072,14 +2071,12 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
* and vice versa.
*/
if (adding)
- isolcpus_updated += partition_xcpus_del(old_prs, parent,
- tmp->addmask);
+ partition_xcpus_del(old_prs, parent, tmp->addmask);
if (deleting)
- isolcpus_updated += partition_xcpus_add(new_prs, parent,
- tmp->delmask);
+ partition_xcpus_add(new_prs, parent, tmp->delmask);
spin_unlock_irq(&callback_lock);
- update_isolation_cpumasks(isolcpus_updated);
+ update_isolation_cpumasks();
if ((old_prs != new_prs) && (cmd == partcmd_update))
update_partition_exclusive_flag(cs, new_prs);
@@ -3102,7 +3099,7 @@ static int update_prstate(struct cpuset *cs, int new_prs)
else if (isolcpus_updated)
isolated_cpus_update(old_prs, new_prs, cs->effective_xcpus);
spin_unlock_irq(&callback_lock);
- update_isolation_cpumasks(isolcpus_updated);
+ update_isolation_cpumasks();
/* Force update if switching back to member & update effective_xcpus */
update_cpumasks_hier(cs, &tmpmask, !new_prs);
--
2.51.1
* Re: [cgroup/for-6.19 PATCH v2 2/3] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 2/3] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping Waiman Long
@ 2025-11-04 2:19 ` Chen Ridong
2025-11-04 4:10 ` Waiman Long
0 siblings, 1 reply; 8+ messages in thread
From: Chen Ridong @ 2025-11-04 2:19 UTC (permalink / raw)
To: Waiman Long, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Chen Ridong, Gabriele Monaco,
Frederic Weisbecker
On 2025/11/4 9:30, Waiman Long wrote:
> Currently the user can set up isolated CPUs via cpuset and nohz_full in
> such a way that leaves no housekeeping CPU (i.e. no CPU that is neither
> domain isolated nor nohz_full). This can be a problem for other
> subsystems (e.g. the timer wheel migration).
>
> Prevent this configuration by blocking any assignment that would cause
> the union of domain-isolated CPUs and nohz_full CPUs to cover all CPUs.
>
> [longman: Remove isolated_cpus_should_update() and rewrite the checking
> in update_prstate() and update_parent_effective_cpumask(), also add
> prstate_housekeeping_conflict() check in update_prstate() as
> suggested by Chen Ridong]
>
> Originally-by: Gabriele Monaco <gmonaco@redhat.com>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/cgroup/cpuset.c | 75 +++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 74 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index da770dac955e..0c49905df394 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1393,6 +1393,45 @@ static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
> return isolcpus_updated;
> }
>
> +/*
> + * isolated_cpus_can_update - check for isolated & nohz_full conflicts
> + * @add_cpus: cpu mask for cpus that are going to be isolated
> + * @del_cpus: cpu mask for cpus that are no longer isolated, can be NULL
> + * Return: false if there is conflict, true otherwise
> + *
> + * If nohz_full is enabled and we have isolated CPUs, their combination must
> + * still leave housekeeping CPUs.
> + *
> + * TBD: Should consider merging this function into
> + * prstate_housekeeping_conflict().
> + */
> +static bool isolated_cpus_can_update(struct cpumask *add_cpus,
> + struct cpumask *del_cpus)
> +{
> + cpumask_var_t full_hk_cpus;
> + int res = true;
> +
> + if (!housekeeping_enabled(HK_TYPE_KERNEL_NOISE))
> + return true;
> +
> + if (del_cpus && cpumask_weight_and(del_cpus,
> + housekeeping_cpumask(HK_TYPE_KERNEL_NOISE)))
> + return true;
> +
> + if (!alloc_cpumask_var(&full_hk_cpus, GFP_KERNEL))
> + return false;
> +
> + cpumask_and(full_hk_cpus, housekeeping_cpumask(HK_TYPE_KERNEL_NOISE),
> + housekeeping_cpumask(HK_TYPE_DOMAIN));
> + cpumask_andnot(full_hk_cpus, full_hk_cpus, isolated_cpus);
> + cpumask_and(full_hk_cpus, full_hk_cpus, cpu_active_mask);
> + if (!cpumask_weight_andnot(full_hk_cpus, add_cpus))
> + res = false;
> +
> + free_cpumask_var(full_hk_cpus);
> + return res;
> +}
> +
> static void update_isolation_cpumasks(bool isolcpus_updated)
> {
> int ret;
> @@ -1551,6 +1590,9 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
> if (!cpumask_intersects(tmp->new_cpus, cpu_active_mask) ||
> cpumask_subset(top_cpuset.effective_cpus, tmp->new_cpus))
> return PERR_INVCPUS;
> + if ((new_prs == PRS_ISOLATED) &&
> + !isolated_cpus_can_update(tmp->new_cpus, NULL))
> + return PERR_HKEEPING;
>
Do we also need to check prstate_housekeeping_conflict here?
> spin_lock_irq(&callback_lock);
> isolcpus_updated = partition_xcpus_add(new_prs, NULL, tmp->new_cpus);
> @@ -1650,6 +1692,9 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
> else if (cpumask_intersects(tmp->addmask, subpartitions_cpus) ||
> cpumask_subset(top_cpuset.effective_cpus, tmp->addmask))
> cs->prs_err = PERR_NOCPUS;
> + else if ((prs == PRS_ISOLATED) &&
> + !isolated_cpus_can_update(tmp->addmask, tmp->delmask))
> + cs->prs_err = PERR_HKEEPING;
> if (cs->prs_err)
> goto invalidate;
> }
Ditto.
> @@ -1750,6 +1795,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
> int part_error = PERR_NONE; /* Partition error? */
> int isolcpus_updated = 0;
> struct cpumask *xcpus = user_xcpus(cs);
> + int parent_prs = parent->partition_root_state;
> bool nocpu;
>
> lockdep_assert_held(&cpuset_mutex);
> @@ -1813,6 +1859,10 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
> if (prstate_housekeeping_conflict(new_prs, xcpus))
> return PERR_HKEEPING;
>
> + if ((new_prs == PRS_ISOLATED) && (new_prs != parent_prs) &&
> + !isolated_cpus_can_update(xcpus, NULL))
> + return PERR_HKEEPING;
> +
> if (tasks_nocpu_error(parent, cs, xcpus))
> return PERR_NOCPUS;
>
I think the isolated_cpus_can_update() check should also be added to the
validate_partition() function. If deemed necessary, you may consider
applying the patch below, which reuses validate_partition() to enable the
local partition, so validate_partition() can serve as a common block.
https://lore.kernel.org/cgroups/20251025064844.495525-4-chenridong@huaweicloud.com/
> @@ -1866,6 +1916,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
> *
> * For invalid partition:
> * delmask = newmask & parent->effective_xcpus
> + * The partition may become valid soon.
> */
> if (is_partition_invalid(cs)) {
> adding = false;
> @@ -1880,6 +1931,23 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
> deleting = cpumask_and(tmp->delmask, tmp->delmask,
> parent->effective_xcpus);
> }
> +
> + /*
> + * TBD: Invalidating a currently valid child root partition may
> + * still break the isolated_cpus_can_update() rule if the parent
> + * is an isolated partition.
> + */
> + if (is_partition_valid(cs) && (old_prs != parent_prs)) {
> + if ((parent_prs == PRS_ROOT) &&
> + /* Adding to parent means removing isolated CPUs */
> + !isolated_cpus_can_update(tmp->delmask, tmp->addmask))
> + part_error = PERR_HKEEPING;
> + if ((parent_prs == PRS_ISOLATED) &&
> + /* Adding to parent means adding isolated CPUs */
> + !isolated_cpus_can_update(tmp->addmask, tmp->delmask))
> + part_error = PERR_HKEEPING;
> + }
> +
> /*
> * The new CPUs to be removed from parent's effective CPUs
> * must be present.
> @@ -2994,7 +3062,12 @@ static int update_prstate(struct cpuset *cs, int new_prs)
> * A change in load balance state only, no change in cpumasks.
> * Need to update isolated_cpus.
> */
> - isolcpus_updated = true;
> + if (((new_prs == PRS_ISOLATED) &&
> + !isolated_cpus_can_update(cs->effective_xcpus, NULL)) ||
> + prstate_housekeeping_conflict(new_prs, cs->effective_xcpus))
> + err = PERR_HKEEPING;
> + else
> + isolcpus_updated = true;
> } else {
> /*
> * Switching back to member is always allowed even if it
--
Best regards,
Ridong
* Re: [cgroup/for-6.19 PATCH v2 3/3] cgroup/cpuset: Globally track isolated_cpus update
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 3/3] cgroup/cpuset: Globally track isolated_cpus update Waiman Long
@ 2025-11-04 2:21 ` Chen Ridong
0 siblings, 0 replies; 8+ messages in thread
From: Chen Ridong @ 2025-11-04 2:21 UTC (permalink / raw)
To: Waiman Long, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Chen Ridong, Gabriele Monaco,
Frederic Weisbecker
On 2025/11/4 9:30, Waiman Long wrote:
> The current cpuset code passes a local isolcpus_updated flag around in a
> number of functions to determine if external isolation related cpumasks
> like wq_unbound_cpumask should be updated. It is a bit cumbersome and
> makes the code more complex. Simplify the code by using a global boolean
> flag "isolated_cpus_updating" to track this. This flag will be set in
> isolated_cpus_update() and cleared in update_isolation_cpumasks().
>
> No functional change is expected.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/cgroup/cpuset.c | 73 ++++++++++++++++++++----------------------
> 1 file changed, 35 insertions(+), 38 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 0c49905df394..854fe4cc1358 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -81,6 +81,13 @@ static cpumask_var_t subpartitions_cpus;
> */
> static cpumask_var_t isolated_cpus;
>
> +/*
> + * isolated_cpus updating flag (protected by cpuset_mutex)
> + * Set if isolated_cpus is going to be updated in the current
> + * cpuset_mutex critical section.
> + */
> +static bool isolated_cpus_updating;
> +
> /*
> * Housekeeping (HK_TYPE_DOMAIN) CPUs at boot
> */
> @@ -1327,6 +1334,8 @@ static void isolated_cpus_update(int old_prs, int new_prs, struct cpumask *xcpus
> cpumask_or(isolated_cpus, isolated_cpus, xcpus);
> else
> cpumask_andnot(isolated_cpus, isolated_cpus, xcpus);
> +
> + isolated_cpus_updating = true;
> }
>
> /*
> @@ -1334,15 +1343,12 @@ static void isolated_cpus_update(int old_prs, int new_prs, struct cpumask *xcpus
> * @new_prs: new partition_root_state
> * @parent: parent cpuset
> * @xcpus: exclusive CPUs to be added
> - * Return: true if isolated_cpus modified, false otherwise
> *
> * Remote partition if parent == NULL
> */
> -static bool partition_xcpus_add(int new_prs, struct cpuset *parent,
> +static void partition_xcpus_add(int new_prs, struct cpuset *parent,
> struct cpumask *xcpus)
> {
> - bool isolcpus_updated;
> -
> WARN_ON_ONCE(new_prs < 0);
> lockdep_assert_held(&callback_lock);
> if (!parent)
> @@ -1352,13 +1358,11 @@ static bool partition_xcpus_add(int new_prs, struct cpuset *parent,
> if (parent == &top_cpuset)
> cpumask_or(subpartitions_cpus, subpartitions_cpus, xcpus);
>
> - isolcpus_updated = (new_prs != parent->partition_root_state);
> - if (isolcpus_updated)
> + if (new_prs != parent->partition_root_state)
> isolated_cpus_update(parent->partition_root_state, new_prs,
> xcpus);
>
> cpumask_andnot(parent->effective_cpus, parent->effective_cpus, xcpus);
> - return isolcpus_updated;
> }
>
> /*
> @@ -1366,15 +1370,12 @@ static bool partition_xcpus_add(int new_prs, struct cpuset *parent,
> * @old_prs: old partition_root_state
> * @parent: parent cpuset
> * @xcpus: exclusive CPUs to be removed
> - * Return: true if isolated_cpus modified, false otherwise
> *
> * Remote partition if parent == NULL
> */
> -static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
> +static void partition_xcpus_del(int old_prs, struct cpuset *parent,
> struct cpumask *xcpus)
> {
> - bool isolcpus_updated;
> -
> WARN_ON_ONCE(old_prs < 0);
> lockdep_assert_held(&callback_lock);
> if (!parent)
> @@ -1383,14 +1384,12 @@ static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
> if (parent == &top_cpuset)
> cpumask_andnot(subpartitions_cpus, subpartitions_cpus, xcpus);
>
> - isolcpus_updated = (old_prs != parent->partition_root_state);
> - if (isolcpus_updated)
> + if (old_prs != parent->partition_root_state)
> isolated_cpus_update(old_prs, parent->partition_root_state,
> xcpus);
>
> cpumask_and(xcpus, xcpus, cpu_active_mask);
> cpumask_or(parent->effective_cpus, parent->effective_cpus, xcpus);
> - return isolcpus_updated;
> }
>
> /*
> @@ -1432,17 +1431,24 @@ static bool isolated_cpus_can_update(struct cpumask *add_cpus,
> return res;
> }
>
> -static void update_isolation_cpumasks(bool isolcpus_updated)
> +/*
> + * update_isolation_cpumasks - Update external isolation related CPU masks
> + *
> + * The following external CPU masks will be updated if necessary:
> + * - workqueue unbound cpumask
> + */
> +static void update_isolation_cpumasks(void)
> {
> int ret;
>
> - lockdep_assert_cpus_held();
> -
> - if (!isolcpus_updated)
> + if (!isolated_cpus_updating)
> return;
>
> + lockdep_assert_cpus_held();
> +
> ret = workqueue_unbound_exclude_cpumask(isolated_cpus);
> WARN_ON_ONCE(ret < 0);
> + isolated_cpus_updating = false;
> }
>
> /**
> @@ -1567,8 +1573,6 @@ static inline bool is_local_partition(struct cpuset *cs)
> static int remote_partition_enable(struct cpuset *cs, int new_prs,
> struct tmpmasks *tmp)
> {
> - bool isolcpus_updated;
> -
> /*
> * The user must have sysadmin privilege.
> */
> @@ -1595,11 +1599,11 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
> return PERR_HKEEPING;
>
> spin_lock_irq(&callback_lock);
> - isolcpus_updated = partition_xcpus_add(new_prs, NULL, tmp->new_cpus);
> + partition_xcpus_add(new_prs, NULL, tmp->new_cpus);
> list_add(&cs->remote_sibling, &remote_children);
> cpumask_copy(cs->effective_xcpus, tmp->new_cpus);
> spin_unlock_irq(&callback_lock);
> - update_isolation_cpumasks(isolcpus_updated);
> + update_isolation_cpumasks();
> cpuset_force_rebuild();
> cs->prs_err = 0;
>
> @@ -1622,15 +1626,12 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
> */
> static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
> {
> - bool isolcpus_updated;
> -
> WARN_ON_ONCE(!is_remote_partition(cs));
> WARN_ON_ONCE(!cpumask_subset(cs->effective_xcpus, subpartitions_cpus));
>
> spin_lock_irq(&callback_lock);
> list_del_init(&cs->remote_sibling);
> - isolcpus_updated = partition_xcpus_del(cs->partition_root_state,
> - NULL, cs->effective_xcpus);
> + partition_xcpus_del(cs->partition_root_state, NULL, cs->effective_xcpus);
> if (cs->prs_err)
> cs->partition_root_state = -cs->partition_root_state;
> else
> @@ -1640,7 +1641,7 @@ static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
> compute_excpus(cs, cs->effective_xcpus);
> reset_partition_data(cs);
> spin_unlock_irq(&callback_lock);
> - update_isolation_cpumasks(isolcpus_updated);
> + update_isolation_cpumasks();
> cpuset_force_rebuild();
>
> /*
> @@ -1665,7 +1666,6 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
> {
> bool adding, deleting;
> int prs = cs->partition_root_state;
> - int isolcpus_updated = 0;
>
> if (WARN_ON_ONCE(!is_remote_partition(cs)))
> return;
> @@ -1701,9 +1701,9 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
>
> spin_lock_irq(&callback_lock);
> if (adding)
> - isolcpus_updated += partition_xcpus_add(prs, NULL, tmp->addmask);
> + partition_xcpus_add(prs, NULL, tmp->addmask);
> if (deleting)
> - isolcpus_updated += partition_xcpus_del(prs, NULL, tmp->delmask);
> + partition_xcpus_del(prs, NULL, tmp->delmask);
> /*
> * Need to update effective_xcpus and exclusive_cpus now as
> * update_sibling_cpumasks() below may iterate back to the same cs.
> @@ -1712,7 +1712,7 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
> if (xcpus)
> cpumask_copy(cs->exclusive_cpus, xcpus);
> spin_unlock_irq(&callback_lock);
> - update_isolation_cpumasks(isolcpus_updated);
> + update_isolation_cpumasks();
> if (adding || deleting)
> cpuset_force_rebuild();
>
> @@ -1793,7 +1793,6 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
> int deleting; /* Deleting cpus from parent's effective_cpus */
> int old_prs, new_prs;
> int part_error = PERR_NONE; /* Partition error? */
> - int isolcpus_updated = 0;
> struct cpumask *xcpus = user_xcpus(cs);
> int parent_prs = parent->partition_root_state;
> bool nocpu;
> @@ -2072,14 +2071,12 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
> * and vice versa.
> */
> if (adding)
> - isolcpus_updated += partition_xcpus_del(old_prs, parent,
> - tmp->addmask);
> + partition_xcpus_del(old_prs, parent, tmp->addmask);
> if (deleting)
> - isolcpus_updated += partition_xcpus_add(new_prs, parent,
> - tmp->delmask);
> + partition_xcpus_add(new_prs, parent, tmp->delmask);
>
> spin_unlock_irq(&callback_lock);
> - update_isolation_cpumasks(isolcpus_updated);
> + update_isolation_cpumasks();
>
> if ((old_prs != new_prs) && (cmd == partcmd_update))
> update_partition_exclusive_flag(cs, new_prs);
> @@ -3102,7 +3099,7 @@ static int update_prstate(struct cpuset *cs, int new_prs)
> else if (isolcpus_updated)
> isolated_cpus_update(old_prs, new_prs, cs->effective_xcpus);
> spin_unlock_irq(&callback_lock);
> - update_isolation_cpumasks(isolcpus_updated);
> + update_isolation_cpumasks();
>
> /* Force update if switching back to member & update effective_xcpus */
> update_cpumasks_hier(cs, &tmpmask, !new_prs);
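The diff above replaces the per-call-site `isolcpus_updated` bookkeeping with one global flag that is set wherever isolated_cpus is modified and consumed by update_isolation_cpumasks(). A minimal user-space sketch of that pattern (hypothetical names, a plain integer in place of the kernel's struct cpumask, no locking):

```c
/*
 * Sketch of the "set a global dirty flag at the point of mutation,
 * flush external state once afterwards" pattern from this patch.
 * In the kernel, isolated_cpus_updating would be set under
 * callback_lock; external_updates stands in for the
 * workqueue_unbound_exclude_cpumask() call.
 */
#include <stdbool.h>

static unsigned long isolated_cpus;     /* stand-in for the kernel mask */
static bool isolated_cpus_updating;     /* global "external masks are stale" flag */
static int external_updates;            /* counts actual flushes */

static void isolated_cpus_update(unsigned long add, unsigned long del)
{
	isolated_cpus |= add;
	isolated_cpus &= ~del;
	isolated_cpus_updating = true;  /* record the change for the flush */
}

static void update_isolation_cpumasks(void)
{
	if (!isolated_cpus_updating)
		return;                 /* nothing changed: skip the flush */
	external_updates++;             /* propagate to external subsystems */
	isolated_cpus_updating = false;
}
```

Callers no longer pass a boolean around: any path that mutated isolated_cpus is flushed exactly once, and paths that changed nothing skip the external update entirely.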
Reviewed-by: Chen Ridong <chenridong@huawei.com>
--
Best regards,
Ridong
* Re: [cgroup/for-6.19 PATCH v2 2/3] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping
2025-11-04 2:19 ` Chen Ridong
@ 2025-11-04 4:10 ` Waiman Long
2025-11-04 7:26 ` Chen Ridong
0 siblings, 1 reply; 8+ messages in thread
From: Waiman Long @ 2025-11-04 4:10 UTC (permalink / raw)
To: Chen Ridong, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Chen Ridong, Gabriele Monaco,
Frederic Weisbecker
On 11/3/25 9:19 PM, Chen Ridong wrote:
>
> On 2025/11/4 9:30, Waiman Long wrote:
>> Currently the user can set up isolated cpus via cpuset and nohz_full in
>> such a way that leaves no housekeeping CPU (i.e. no CPU that is neither
>> domain isolated nor nohz full). This can be a problem for other
>> subsystems (e.g. the timer wheel migration).
>>
>> Prevent this configuration by blocking any assignment that would cause
>> the union of domain isolated cpus and nohz_full to cover all CPUs.
>>
>> [longman: Remove isolated_cpus_should_update() and rewrite the checking
>> in update_prstate() and update_parent_effective_cpumask(), also add
>> prstate_housekeeping_conflict() check in update_prstate() as
>> suggested by Chen Ridong]
>>
>> Originally-by: Gabriele Monaco <gmonaco@redhat.com>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>> kernel/cgroup/cpuset.c | 75 +++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 74 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index da770dac955e..0c49905df394 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -1393,6 +1393,45 @@ static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
>> return isolcpus_updated;
>> }
>>
>> +/*
>> + * isolated_cpus_can_update - check for isolated & nohz_full conflicts
>> + * @add_cpus: cpu mask for cpus that are going to be isolated
>> + * @del_cpus: cpu mask for cpus that are no longer isolated, can be NULL
>> + * Return: false if there is conflict, true otherwise
>> + *
>> + * If nohz_full is enabled and we have isolated CPUs, their combination must
>> + * still leave housekeeping CPUs.
>> + *
>> + * TBD: Should consider merging this function into
>> + * prstate_housekeeping_conflict().
>> + */
>> +static bool isolated_cpus_can_update(struct cpumask *add_cpus,
>> + struct cpumask *del_cpus)
>> +{
>> + cpumask_var_t full_hk_cpus;
>> + int res = true;
>> +
>> + if (!housekeeping_enabled(HK_TYPE_KERNEL_NOISE))
>> + return true;
>> +
>> + if (del_cpus && cpumask_weight_and(del_cpus,
>> + housekeeping_cpumask(HK_TYPE_KERNEL_NOISE)))
>> + return true;
>> +
>> + if (!alloc_cpumask_var(&full_hk_cpus, GFP_KERNEL))
>> + return false;
>> +
>> + cpumask_and(full_hk_cpus, housekeeping_cpumask(HK_TYPE_KERNEL_NOISE),
>> + housekeeping_cpumask(HK_TYPE_DOMAIN));
>> + cpumask_andnot(full_hk_cpus, full_hk_cpus, isolated_cpus);
>> + cpumask_and(full_hk_cpus, full_hk_cpus, cpu_active_mask);
>> + if (!cpumask_weight_andnot(full_hk_cpus, add_cpus))
>> + res = false;
>> +
>> + free_cpumask_var(full_hk_cpus);
>> + return res;
>> +}
>> +
>> static void update_isolation_cpumasks(bool isolcpus_updated)
>> {
>> int ret;
>> @@ -1551,6 +1590,9 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
>> if (!cpumask_intersects(tmp->new_cpus, cpu_active_mask) ||
>> cpumask_subset(top_cpuset.effective_cpus, tmp->new_cpus))
>> return PERR_INVCPUS;
>> + if ((new_prs == PRS_ISOLATED) &&
>> + !isolated_cpus_can_update(tmp->new_cpus, NULL))
>> + return PERR_HKEEPING;
>>
> Do we also need to check prstate_housekeeping_conflict here?
Right. I missed that. Will add that in the next version.
>
>> spin_lock_irq(&callback_lock);
>> isolcpus_updated = partition_xcpus_add(new_prs, NULL, tmp->new_cpus);
>> @@ -1650,6 +1692,9 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
>> else if (cpumask_intersects(tmp->addmask, subpartitions_cpus) ||
>> cpumask_subset(top_cpuset.effective_cpus, tmp->addmask))
>> cs->prs_err = PERR_NOCPUS;
>> + else if ((prs == PRS_ISOLATED) &&
>> + !isolated_cpus_can_update(tmp->addmask, tmp->delmask))
>> + cs->prs_err = PERR_HKEEPING;
>> if (cs->prs_err)
>> goto invalidate;
>> }
> Ditto.
prstate_housekeeping_conflict() has been called earlier via
validate_partition() from partition_cpus_change(). We don't need one
more check here. However, I forgot that enabling a partition will not
call validate_partition().
>
>> @@ -1750,6 +1795,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
>> int part_error = PERR_NONE; /* Partition error? */
>> int isolcpus_updated = 0;
>> struct cpumask *xcpus = user_xcpus(cs);
>> + int parent_prs = parent->partition_root_state;
>> bool nocpu;
>>
>> lockdep_assert_held(&cpuset_mutex);
>> @@ -1813,6 +1859,10 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
>> if (prstate_housekeeping_conflict(new_prs, xcpus))
>> return PERR_HKEEPING;
>>
>> + if ((new_prs == PRS_ISOLATED) && (new_prs != parent_prs) &&
>> + !isolated_cpus_can_update(xcpus, NULL))
>> + return PERR_HKEEPING;
>> +
>> if (tasks_nocpu_error(parent, cs, xcpus))
>> return PERR_NOCPUS;
>>
> I think the isolated_cpus_can_update() check should also be added to the validate_partition() function.
>
> If deemed necessary, you may consider applying the patch below, which reuses validate_partition() to
> enable the local partition, so that validate_partition() can serve as the common checking block.
>
> https://lore.kernel.org/cgroups/20251025064844.495525-4-chenridong@huaweicloud.com/
I do think your patch series will make that simpler. You can certainly
update your patch series to include that additional check into
validate_partition().
Cheers,
Longman
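For reference, the constraint the patch enforces can be modeled outside the kernel. The sketch below (hypothetical user-space code, 8-bit masks instead of struct cpumask, and omitting the early `housekeeping_enabled(HK_TYPE_KERNEL_NOISE)` and allocation-failure paths of the real isolated_cpus_can_update()) shows the core check: after the update, at least one active CPU must remain that is housekeeping for both HK_TYPE_KERNEL_NOISE (not nohz_full) and HK_TYPE_DOMAIN (not domain isolated).

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t cpumask_t;                 /* one bit per CPU, 8 CPUs max */

/* Example configuration: 4 active CPUs, CPUs 2-3 are nohz_full. */
static cpumask_t hk_kernel_noise = 0x03;   /* CPUs 0-1 outside nohz_full */
static cpumask_t hk_domain       = 0x0f;   /* no boot-time isolcpus */
static cpumask_t isolated_cpus;            /* cpuset-isolated CPUs, initially none */
static cpumask_t cpu_active_mask = 0x0f;

static bool isolated_cpus_can_update(cpumask_t add_cpus, cpumask_t del_cpus)
{
	cpumask_t full_hk;

	/* A CPU leaving isolation that is nohz_full housekeeping can only
	 * grow the housekeeping set, so the update is always safe. */
	if (del_cpus & hk_kernel_noise)
		return true;

	full_hk = hk_kernel_noise & hk_domain; /* housekeeping for both types */
	full_hk &= ~isolated_cpus;             /* drop already-isolated CPUs */
	full_hk &= cpu_active_mask;            /* only active CPUs count */
	return (full_hk & ~add_cpus) != 0;     /* a housekeeper must survive */
}
```

With the masks above, isolating CPU 1 alone is allowed (CPU 0 remains fully housekeeping), while isolating CPUs 0-1 is rejected because the union of cpuset-isolated and nohz_full CPUs would then cover every active CPU.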
* Re: [cgroup/for-6.19 PATCH v2 2/3] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping
2025-11-04 4:10 ` Waiman Long
@ 2025-11-04 7:26 ` Chen Ridong
0 siblings, 0 replies; 8+ messages in thread
From: Chen Ridong @ 2025-11-04 7:26 UTC (permalink / raw)
To: Waiman Long, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups, linux-kernel, Chen Ridong, Gabriele Monaco,
Frederic Weisbecker
On 2025/11/4 12:10, Waiman Long wrote:
> I do think your patch series will make that simpler. You can certainly update your patch series to
> include that additional check into validate_partition().
Okay. After you have reviewed the patches, I will prepare a revision that incorporates the
additional check.
--
Best regards,
Ridong
Thread overview: 8+ messages
2025-11-04 1:30 [cgroup/for-6.19 PATCH v2 0/3] cgroup/cpuset: Additional housekeeping check & cleanup Waiman Long
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 1/3] cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() Waiman Long
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 2/3] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping Waiman Long
2025-11-04 2:19 ` Chen Ridong
2025-11-04 4:10 ` Waiman Long
2025-11-04 7:26 ` Chen Ridong
2025-11-04 1:30 ` [cgroup/for-6.19 PATCH v2 3/3] cgroup/cpuset: Globally track isolated_cpus update Waiman Long
2025-11-04 2:21 ` Chen Ridong