* [PATCH -next v5 0/3] some optimization for cpuset
@ 2025-08-25 3:23 Chen Ridong
2025-08-25 3:23 ` [PATCH -next v5 1/3] cpuset: decouple tmpmasks and cpumasks freeing in cgroup Chen Ridong
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Chen Ridong @ 2025-08-25 3:23 UTC (permalink / raw)
To: longman, tj, hannes, mkoutny; +Cc: cgroups, linux-kernel, lujialin4, chenridong
From: Chen Ridong <chenridong@huawei.com>
This patch series contains several cpuset improvements:
1. Decouple cpuset and tmpmasks allocation/freeing.
2. Add cpuset_full_[un]lock helpers.
---
v5:
- add return tag comment for patch 2.
v4:
- update the commit message for patch 3.
v3:
- fix typos and comment errors.
- rename cpus_read_cpuset_[un]lock to cpuset_full_[un]lock
v2:
- dropped guard helper approach, nusing new helper instead.
- added patches for decoupling cpuset/tmpmasks allocation
Chen Ridong (3):
cpuset: decouple tmpmasks and cpumasks freeing in cgroup
cpuset: separate tmpmasks and cpuset allocation logic
cpuset: add helpers for cpus read and cpuset_mutex locks
kernel/cgroup/cpuset-internal.h | 2 +
kernel/cgroup/cpuset-v1.c | 12 +-
kernel/cgroup/cpuset.c | 219 +++++++++++++++++---------------
3 files changed, 122 insertions(+), 111 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH -next v5 1/3] cpuset: decouple tmpmasks and cpumasks freeing in cgroup
2025-08-25 3:23 [PATCH -next v5 0/3] some optimization for cpuset Chen Ridong
@ 2025-08-25 3:23 ` Chen Ridong
2025-08-25 3:23 ` [PATCH -next v5 2/3] cpuset: separate tmpmasks and cpuset allocation logic Chen Ridong
` (2 subsequent siblings)
3 siblings, 0 replies; 10+ messages in thread
From: Chen Ridong @ 2025-08-25 3:23 UTC (permalink / raw)
To: longman, tj, hannes, mkoutny; +Cc: cgroups, linux-kernel, lujialin4, chenridong
From: Chen Ridong <chenridong@huawei.com>
Currently, free_cpumasks() can free both tmpmasks and cpumasks of a cpuset
(cs). However, these two operations are not logically coupled. To improve
code clarity:
1. Move cpumask freeing to free_cpuset()
2. Rename free_cpumasks() to free_tmpmasks()
This change enforces the single responsibility principle.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset.c | 32 +++++++++++++-------------------
1 file changed, 13 insertions(+), 19 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 3466ebbf1016..aebda14cc67f 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -459,23 +459,14 @@ static inline int alloc_cpumasks(struct cpuset *cs, struct tmpmasks *tmp)
}
/**
- * free_cpumasks - free cpumasks in a tmpmasks structure
- * @cs: the cpuset that have cpumasks to be free.
+ * free_tmpmasks - free cpumasks in a tmpmasks structure
* @tmp: the tmpmasks structure pointer
*/
-static inline void free_cpumasks(struct cpuset *cs, struct tmpmasks *tmp)
+static inline void free_tmpmasks(struct tmpmasks *tmp)
{
- if (cs) {
- free_cpumask_var(cs->cpus_allowed);
- free_cpumask_var(cs->effective_cpus);
- free_cpumask_var(cs->effective_xcpus);
- free_cpumask_var(cs->exclusive_cpus);
- }
- if (tmp) {
- free_cpumask_var(tmp->new_cpus);
- free_cpumask_var(tmp->addmask);
- free_cpumask_var(tmp->delmask);
- }
+ free_cpumask_var(tmp->new_cpus);
+ free_cpumask_var(tmp->addmask);
+ free_cpumask_var(tmp->delmask);
}
/**
@@ -508,7 +499,10 @@ static struct cpuset *alloc_trial_cpuset(struct cpuset *cs)
*/
static inline void free_cpuset(struct cpuset *cs)
{
- free_cpumasks(cs, NULL);
+ free_cpumask_var(cs->cpus_allowed);
+ free_cpumask_var(cs->effective_cpus);
+ free_cpumask_var(cs->effective_xcpus);
+ free_cpumask_var(cs->exclusive_cpus);
kfree(cs);
}
@@ -2427,7 +2421,7 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
if (cs->partition_root_state)
update_partition_sd_lb(cs, old_prs);
out_free:
- free_cpumasks(NULL, &tmp);
+ free_tmpmasks(&tmp);
return retval;
}
@@ -2530,7 +2524,7 @@ static int update_exclusive_cpumask(struct cpuset *cs, struct cpuset *trialcs,
if (cs->partition_root_state)
update_partition_sd_lb(cs, old_prs);
- free_cpumasks(NULL, &tmp);
+ free_tmpmasks(&tmp);
return 0;
}
@@ -2983,7 +2977,7 @@ static int update_prstate(struct cpuset *cs, int new_prs)
notify_partition_change(cs, old_prs);
if (force_sd_rebuild)
rebuild_sched_domains_locked();
- free_cpumasks(NULL, &tmpmask);
+ free_tmpmasks(&tmpmask);
return 0;
}
@@ -4006,7 +4000,7 @@ static void cpuset_handle_hotplug(void)
if (force_sd_rebuild)
rebuild_sched_domains_cpuslocked();
- free_cpumasks(NULL, ptmp);
+ free_tmpmasks(ptmp);
}
void cpuset_update_active_cpus(void)
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH -next v5 2/3] cpuset: separate tmpmasks and cpuset allocation logic
2025-08-25 3:23 [PATCH -next v5 0/3] some optimization for cpuset Chen Ridong
2025-08-25 3:23 ` [PATCH -next v5 1/3] cpuset: decouple tmpmasks and cpumasks freeing in cgroup Chen Ridong
@ 2025-08-25 3:23 ` Chen Ridong
2025-08-25 17:37 ` Waiman Long
2025-08-25 3:23 ` [PATCH -next v5 3/3] cpuset: add helpers for cpus read and cpuset_mutex locks Chen Ridong
2025-08-25 18:20 ` [PATCH -next v5 0/3] some optimization for cpuset Tejun Heo
3 siblings, 1 reply; 10+ messages in thread
From: Chen Ridong @ 2025-08-25 3:23 UTC (permalink / raw)
To: longman, tj, hannes, mkoutny; +Cc: cgroups, linux-kernel, lujialin4, chenridong
From: Chen Ridong <chenridong@huawei.com>
The original alloc_cpumasks() served dual purposes: allocating cpumasks
for both temporary masks (tmpmasks) and cpuset structures. This patch:
1. Decouples these allocation paths for better code clarity
2. Introduces dedicated alloc_tmpmasks() and dup_or_alloc_cpuset()
functions
3. Maintains symmetric pairing:
- alloc_tmpmasks() ↔ free_tmpmasks()
- dup_or_alloc_cpuset() ↔ free_cpuset()
Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
kernel/cgroup/cpuset.c | 127 ++++++++++++++++++++++-------------------
1 file changed, 69 insertions(+), 58 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index aebda14cc67f..7b0b81c835bf 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -411,51 +411,47 @@ static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
}
/**
- * alloc_cpumasks - allocate three cpumasks for cpuset
- * @cs: the cpuset that have cpumasks to be allocated.
- * @tmp: the tmpmasks structure pointer
+ * alloc_cpumasks - Allocate an array of cpumask variables
+ * @pmasks: Pointer to array of cpumask_var_t pointers
+ * @size: Number of cpumasks to allocate
* Return: 0 if successful, -ENOMEM otherwise.
*
- * Only one of the two input arguments should be non-NULL.
+ * Allocates @size cpumasks and initializes them to empty. Returns 0 on
+ * success, -ENOMEM on allocation failure. On failure, any previously
+ * allocated cpumasks are freed.
*/
-static inline int alloc_cpumasks(struct cpuset *cs, struct tmpmasks *tmp)
+static inline int alloc_cpumasks(cpumask_var_t *pmasks[], u32 size)
{
- cpumask_var_t *pmask1, *pmask2, *pmask3, *pmask4;
+ int i;
- if (cs) {
- pmask1 = &cs->cpus_allowed;
- pmask2 = &cs->effective_cpus;
- pmask3 = &cs->effective_xcpus;
- pmask4 = &cs->exclusive_cpus;
- } else {
- pmask1 = &tmp->new_cpus;
- pmask2 = &tmp->addmask;
- pmask3 = &tmp->delmask;
- pmask4 = NULL;
+ for (i = 0; i < size; i++) {
+ if (!zalloc_cpumask_var(pmasks[i], GFP_KERNEL)) {
+ while (--i >= 0)
+ free_cpumask_var(*pmasks[i]);
+ return -ENOMEM;
+ }
}
-
- if (!zalloc_cpumask_var(pmask1, GFP_KERNEL))
- return -ENOMEM;
-
- if (!zalloc_cpumask_var(pmask2, GFP_KERNEL))
- goto free_one;
-
- if (!zalloc_cpumask_var(pmask3, GFP_KERNEL))
- goto free_two;
-
- if (pmask4 && !zalloc_cpumask_var(pmask4, GFP_KERNEL))
- goto free_three;
-
-
return 0;
+}
-free_three:
- free_cpumask_var(*pmask3);
-free_two:
- free_cpumask_var(*pmask2);
-free_one:
- free_cpumask_var(*pmask1);
- return -ENOMEM;
+/**
+ * alloc_tmpmasks - Allocate temporary cpumasks for cpuset operations.
+ * @tmp: Pointer to tmpmasks structure to populate
+ * Return: 0 on success, -ENOMEM on allocation failure
+ */
+static inline int alloc_tmpmasks(struct tmpmasks *tmp)
+{
+ /*
+ * Array of pointers to the three cpumask_var_t fields in tmpmasks.
+ * Note: Array size must match actual number of masks (3)
+ */
+ cpumask_var_t *pmask[3] = {
+ &tmp->new_cpus,
+ &tmp->addmask,
+ &tmp->delmask
+ };
+
+ return alloc_cpumasks(pmask, ARRAY_SIZE(pmask));
}
/**
@@ -470,26 +466,46 @@ static inline void free_tmpmasks(struct tmpmasks *tmp)
}
/**
- * alloc_trial_cpuset - allocate a trial cpuset
- * @cs: the cpuset that the trial cpuset duplicates
+ * dup_or_alloc_cpuset - Duplicate or allocate a new cpuset
+ * @cs: Source cpuset to duplicate (NULL for a fresh allocation)
+ *
+ * Creates a new cpuset by either:
+ * 1. Duplicating an existing cpuset (if @cs is non-NULL), or
+ * 2. Allocating a fresh cpuset with zero-initialized masks (if @cs is NULL)
+ *
+ * Return: Pointer to newly allocated cpuset on success, NULL on failure
*/
-static struct cpuset *alloc_trial_cpuset(struct cpuset *cs)
+static struct cpuset *dup_or_alloc_cpuset(struct cpuset *cs)
{
struct cpuset *trial;
- trial = kmemdup(cs, sizeof(*cs), GFP_KERNEL);
+ /* Allocate base structure */
+ trial = cs ? kmemdup(cs, sizeof(*cs), GFP_KERNEL) :
+ kzalloc(sizeof(*cs), GFP_KERNEL);
if (!trial)
return NULL;
- if (alloc_cpumasks(trial, NULL)) {
+ /* Setup cpumask pointer array */
+ cpumask_var_t *pmask[4] = {
+ &trial->cpus_allowed,
+ &trial->effective_cpus,
+ &trial->effective_xcpus,
+ &trial->exclusive_cpus
+ };
+
+ if (alloc_cpumasks(pmask, ARRAY_SIZE(pmask))) {
kfree(trial);
return NULL;
}
- cpumask_copy(trial->cpus_allowed, cs->cpus_allowed);
- cpumask_copy(trial->effective_cpus, cs->effective_cpus);
- cpumask_copy(trial->effective_xcpus, cs->effective_xcpus);
- cpumask_copy(trial->exclusive_cpus, cs->exclusive_cpus);
+ /* Copy masks if duplicating */
+ if (cs) {
+ cpumask_copy(trial->cpus_allowed, cs->cpus_allowed);
+ cpumask_copy(trial->effective_cpus, cs->effective_cpus);
+ cpumask_copy(trial->effective_xcpus, cs->effective_xcpus);
+ cpumask_copy(trial->exclusive_cpus, cs->exclusive_cpus);
+ }
+
return trial;
}
@@ -2332,7 +2348,7 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
if (cpumask_equal(cs->cpus_allowed, trialcs->cpus_allowed))
return 0;
- if (alloc_cpumasks(NULL, &tmp))
+ if (alloc_tmpmasks(&tmp))
return -ENOMEM;
if (old_prs) {
@@ -2476,7 +2492,7 @@ static int update_exclusive_cpumask(struct cpuset *cs, struct cpuset *trialcs,
if (retval)
return retval;
- if (alloc_cpumasks(NULL, &tmp))
+ if (alloc_tmpmasks(&tmp))
return -ENOMEM;
if (old_prs) {
@@ -2820,7 +2836,7 @@ int cpuset_update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
int spread_flag_changed;
int err;
- trialcs = alloc_trial_cpuset(cs);
+ trialcs = dup_or_alloc_cpuset(cs);
if (!trialcs)
return -ENOMEM;
@@ -2881,7 +2897,7 @@ static int update_prstate(struct cpuset *cs, int new_prs)
if (new_prs && is_prs_invalid(old_prs))
old_prs = PRS_MEMBER;
- if (alloc_cpumasks(NULL, &tmpmask))
+ if (alloc_tmpmasks(&tmpmask))
return -ENOMEM;
err = update_partition_exclusive_flag(cs, new_prs);
@@ -3223,7 +3239,7 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
if (!is_cpuset_online(cs))
goto out_unlock;
- trialcs = alloc_trial_cpuset(cs);
+ trialcs = dup_or_alloc_cpuset(cs);
if (!trialcs) {
retval = -ENOMEM;
goto out_unlock;
@@ -3456,15 +3472,10 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
if (!parent_css)
return &top_cpuset.css;
- cs = kzalloc(sizeof(*cs), GFP_KERNEL);
+ cs = dup_or_alloc_cpuset(NULL);
if (!cs)
return ERR_PTR(-ENOMEM);
- if (alloc_cpumasks(cs, NULL)) {
- kfree(cs);
- return ERR_PTR(-ENOMEM);
- }
-
__set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
fmeter_init(&cs->fmeter);
cs->relax_domain_level = -1;
@@ -3920,7 +3931,7 @@ static void cpuset_handle_hotplug(void)
bool on_dfl = is_in_v2_mode();
struct tmpmasks tmp, *ptmp = NULL;
- if (on_dfl && !alloc_cpumasks(NULL, &tmp))
+ if (on_dfl && !alloc_tmpmasks(&tmp))
ptmp = &tmp;
lockdep_assert_cpus_held();
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH -next v5 3/3] cpuset: add helpers for cpus read and cpuset_mutex locks
2025-08-25 3:23 [PATCH -next v5 0/3] some optimization for cpuset Chen Ridong
2025-08-25 3:23 ` [PATCH -next v5 1/3] cpuset: decouple tmpmasks and cpumasks freeing in cgroup Chen Ridong
2025-08-25 3:23 ` [PATCH -next v5 2/3] cpuset: separate tmpmasks and cpuset allocation logic Chen Ridong
@ 2025-08-25 3:23 ` Chen Ridong
2025-08-26 14:23 ` Michal Koutný
2025-08-25 18:20 ` [PATCH -next v5 0/3] some optimization for cpuset Tejun Heo
3 siblings, 1 reply; 10+ messages in thread
From: Chen Ridong @ 2025-08-25 3:23 UTC (permalink / raw)
To: longman, tj, hannes, mkoutny; +Cc: cgroups, linux-kernel, lujialin4, chenridong
From: Chen Ridong <chenridong@huawei.com>
cpuset: add helpers for cpus_read_lock and cpuset_mutex locks.
Replace repetitive locking patterns with new helpers:
- cpuset_full_lock()
- cpuset_full_unlock()
This makes the code cleaner and ensures consistent lock ordering.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset-internal.h | 2 ++
kernel/cgroup/cpuset-v1.c | 12 +++----
kernel/cgroup/cpuset.c | 60 +++++++++++++++++++--------------
3 files changed, 40 insertions(+), 34 deletions(-)
diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index 75b3aef39231..337608f408ce 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -276,6 +276,8 @@ int cpuset_update_flag(cpuset_flagbits_t bit, struct cpuset *cs, int turning_on)
ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off);
int cpuset_common_seq_show(struct seq_file *sf, void *v);
+void cpuset_full_lock(void);
+void cpuset_full_unlock(void);
/*
* cpuset-v1.c
diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index b69a7db67090..12e76774c75b 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -169,8 +169,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
cpuset_filetype_t type = cft->private;
int retval = -ENODEV;
- cpus_read_lock();
- cpuset_lock();
+ cpuset_full_lock();
if (!is_cpuset_online(cs))
goto out_unlock;
@@ -184,8 +183,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
break;
}
out_unlock:
- cpuset_unlock();
- cpus_read_unlock();
+ cpuset_full_unlock();
return retval;
}
@@ -454,8 +452,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
cpuset_filetype_t type = cft->private;
int retval = 0;
- cpus_read_lock();
- cpuset_lock();
+ cpuset_full_lock();
if (!is_cpuset_online(cs)) {
retval = -ENODEV;
goto out_unlock;
@@ -498,8 +495,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
break;
}
out_unlock:
- cpuset_unlock();
- cpus_read_unlock();
+ cpuset_full_unlock();
return retval;
}
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 7b0b81c835bf..a78ccd11ce9b 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -250,6 +250,12 @@ static struct cpuset top_cpuset = {
static DEFINE_MUTEX(cpuset_mutex);
+/**
+ * cpuset_lock - Acquire the global cpuset mutex
+ *
+ * This locks the global cpuset mutex to prevent modifications to cpuset
+ * hierarchy and configurations. This helper is not enough to make modification.
+ */
void cpuset_lock(void)
{
mutex_lock(&cpuset_mutex);
@@ -260,6 +266,24 @@ void cpuset_unlock(void)
mutex_unlock(&cpuset_mutex);
}
+/**
+ * cpuset_full_lock - Acquire full protection for cpuset modification
+ *
+ * Takes both CPU hotplug read lock (cpus_read_lock()) and cpuset mutex
+ * to safely modify cpuset data.
+ */
+void cpuset_full_lock(void)
+{
+ cpus_read_lock();
+ mutex_lock(&cpuset_mutex);
+}
+
+void cpuset_full_unlock(void)
+{
+ mutex_unlock(&cpuset_mutex);
+ cpus_read_unlock();
+}
+
static DEFINE_SPINLOCK(callback_lock);
void cpuset_callback_lock_irq(void)
@@ -3234,8 +3258,7 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
int retval = -ENODEV;
buf = strstrip(buf);
- cpus_read_lock();
- mutex_lock(&cpuset_mutex);
+ cpuset_full_lock();
if (!is_cpuset_online(cs))
goto out_unlock;
@@ -3264,8 +3287,7 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
if (force_sd_rebuild)
rebuild_sched_domains_locked();
out_unlock:
- mutex_unlock(&cpuset_mutex);
- cpus_read_unlock();
+ cpuset_full_unlock();
flush_workqueue(cpuset_migrate_mm_wq);
return retval ?: nbytes;
}
@@ -3368,12 +3390,10 @@ static ssize_t cpuset_partition_write(struct kernfs_open_file *of, char *buf,
else
return -EINVAL;
- cpus_read_lock();
- mutex_lock(&cpuset_mutex);
+ cpuset_full_lock();
if (is_cpuset_online(cs))
retval = update_prstate(cs, val);
- mutex_unlock(&cpuset_mutex);
- cpus_read_unlock();
+ cpuset_full_unlock();
return retval ?: nbytes;
}
@@ -3498,9 +3518,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
if (!parent)
return 0;
- cpus_read_lock();
- mutex_lock(&cpuset_mutex);
-
+ cpuset_full_lock();
if (is_spread_page(parent))
set_bit(CS_SPREAD_PAGE, &cs->flags);
if (is_spread_slab(parent))
@@ -3552,8 +3570,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
cpumask_copy(cs->effective_cpus, parent->cpus_allowed);
spin_unlock_irq(&callback_lock);
out_unlock:
- mutex_unlock(&cpuset_mutex);
- cpus_read_unlock();
+ cpuset_full_unlock();
return 0;
}
@@ -3568,16 +3585,12 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
{
struct cpuset *cs = css_cs(css);
- cpus_read_lock();
- mutex_lock(&cpuset_mutex);
-
+ cpuset_full_lock();
if (!cpuset_v2() && is_sched_load_balance(cs))
cpuset_update_flag(CS_SCHED_LOAD_BALANCE, cs, 0);
cpuset_dec();
-
- mutex_unlock(&cpuset_mutex);
- cpus_read_unlock();
+ cpuset_full_unlock();
}
/*
@@ -3589,16 +3602,11 @@ static void cpuset_css_killed(struct cgroup_subsys_state *css)
{
struct cpuset *cs = css_cs(css);
- cpus_read_lock();
- mutex_lock(&cpuset_mutex);
-
+ cpuset_full_lock();
/* Reset valid partition back to member */
if (is_partition_valid(cs))
update_prstate(cs, PRS_MEMBER);
-
- mutex_unlock(&cpuset_mutex);
- cpus_read_unlock();
-
+ cpuset_full_unlock();
}
static void cpuset_css_free(struct cgroup_subsys_state *css)
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH -next v5 2/3] cpuset: separate tmpmasks and cpuset allocation logic
2025-08-25 3:23 ` [PATCH -next v5 2/3] cpuset: separate tmpmasks and cpuset allocation logic Chen Ridong
@ 2025-08-25 17:37 ` Waiman Long
0 siblings, 0 replies; 10+ messages in thread
From: Waiman Long @ 2025-08-25 17:37 UTC (permalink / raw)
To: Chen Ridong, tj, hannes, mkoutny
Cc: cgroups, linux-kernel, lujialin4, chenridong
On 8/24/25 11:23 PM, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> The original alloc_cpumasks() served dual purposes: allocating cpumasks
> for both temporary masks (tmpmasks) and cpuset structures. This patch:
>
> 1. Decouples these allocation paths for better code clarity
> 2. Introduces dedicated alloc_tmpmasks() and dup_or_alloc_cpuset()
> functions
> 3. Maintains symmetric pairing:
> - alloc_tmpmasks() ↔ free_tmpmasks()
> - dup_or_alloc_cpuset() ↔ free_cpuset()
>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> ---
> kernel/cgroup/cpuset.c | 127 ++++++++++++++++++++++-------------------
> 1 file changed, 69 insertions(+), 58 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index aebda14cc67f..7b0b81c835bf 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -411,51 +411,47 @@ static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
> }
>
> /**
> - * alloc_cpumasks - allocate three cpumasks for cpuset
> - * @cs: the cpuset that have cpumasks to be allocated.
> - * @tmp: the tmpmasks structure pointer
> + * alloc_cpumasks - Allocate an array of cpumask variables
> + * @pmasks: Pointer to array of cpumask_var_t pointers
> + * @size: Number of cpumasks to allocate
> * Return: 0 if successful, -ENOMEM otherwise.
> *
> - * Only one of the two input arguments should be non-NULL.
> + * Allocates @size cpumasks and initializes them to empty. Returns 0 on
> + * success, -ENOMEM on allocation failure. On failure, any previously
> + * allocated cpumasks are freed.
> */
> -static inline int alloc_cpumasks(struct cpuset *cs, struct tmpmasks *tmp)
> +static inline int alloc_cpumasks(cpumask_var_t *pmasks[], u32 size)
> {
> - cpumask_var_t *pmask1, *pmask2, *pmask3, *pmask4;
> + int i;
>
> - if (cs) {
> - pmask1 = &cs->cpus_allowed;
> - pmask2 = &cs->effective_cpus;
> - pmask3 = &cs->effective_xcpus;
> - pmask4 = &cs->exclusive_cpus;
> - } else {
> - pmask1 = &tmp->new_cpus;
> - pmask2 = &tmp->addmask;
> - pmask3 = &tmp->delmask;
> - pmask4 = NULL;
> + for (i = 0; i < size; i++) {
> + if (!zalloc_cpumask_var(pmasks[i], GFP_KERNEL)) {
> + while (--i >= 0)
> + free_cpumask_var(*pmasks[i]);
> + return -ENOMEM;
> + }
> }
> -
> - if (!zalloc_cpumask_var(pmask1, GFP_KERNEL))
> - return -ENOMEM;
> -
> - if (!zalloc_cpumask_var(pmask2, GFP_KERNEL))
> - goto free_one;
> -
> - if (!zalloc_cpumask_var(pmask3, GFP_KERNEL))
> - goto free_two;
> -
> - if (pmask4 && !zalloc_cpumask_var(pmask4, GFP_KERNEL))
> - goto free_three;
> -
> -
> return 0;
> +}
>
> -free_three:
> - free_cpumask_var(*pmask3);
> -free_two:
> - free_cpumask_var(*pmask2);
> -free_one:
> - free_cpumask_var(*pmask1);
> - return -ENOMEM;
> +/**
> + * alloc_tmpmasks - Allocate temporary cpumasks for cpuset operations.
> + * @tmp: Pointer to tmpmasks structure to populate
> + * Return: 0 on success, -ENOMEM on allocation failure
> + */
> +static inline int alloc_tmpmasks(struct tmpmasks *tmp)
> +{
> + /*
> + * Array of pointers to the three cpumask_var_t fields in tmpmasks.
> + * Note: Array size must match actual number of masks (3)
> + */
> + cpumask_var_t *pmask[3] = {
> + &tmp->new_cpus,
> + &tmp->addmask,
> + &tmp->delmask
> + };
> +
> + return alloc_cpumasks(pmask, ARRAY_SIZE(pmask));
> }
>
> /**
> @@ -470,26 +466,46 @@ static inline void free_tmpmasks(struct tmpmasks *tmp)
> }
>
> /**
> - * alloc_trial_cpuset - allocate a trial cpuset
> - * @cs: the cpuset that the trial cpuset duplicates
> + * dup_or_alloc_cpuset - Duplicate or allocate a new cpuset
> + * @cs: Source cpuset to duplicate (NULL for a fresh allocation)
> + *
> + * Creates a new cpuset by either:
> + * 1. Duplicating an existing cpuset (if @cs is non-NULL), or
> + * 2. Allocating a fresh cpuset with zero-initialized masks (if @cs is NULL)
> + *
> + * Return: Pointer to newly allocated cpuset on success, NULL on failure
> */
> -static struct cpuset *alloc_trial_cpuset(struct cpuset *cs)
> +static struct cpuset *dup_or_alloc_cpuset(struct cpuset *cs)
> {
> struct cpuset *trial;
>
> - trial = kmemdup(cs, sizeof(*cs), GFP_KERNEL);
> + /* Allocate base structure */
> + trial = cs ? kmemdup(cs, sizeof(*cs), GFP_KERNEL) :
> + kzalloc(sizeof(*cs), GFP_KERNEL);
> if (!trial)
> return NULL;
>
> - if (alloc_cpumasks(trial, NULL)) {
> + /* Setup cpumask pointer array */
> + cpumask_var_t *pmask[4] = {
> + &trial->cpus_allowed,
> + &trial->effective_cpus,
> + &trial->effective_xcpus,
> + &trial->exclusive_cpus
> + };
> +
> + if (alloc_cpumasks(pmask, ARRAY_SIZE(pmask))) {
> kfree(trial);
> return NULL;
> }
>
> - cpumask_copy(trial->cpus_allowed, cs->cpus_allowed);
> - cpumask_copy(trial->effective_cpus, cs->effective_cpus);
> - cpumask_copy(trial->effective_xcpus, cs->effective_xcpus);
> - cpumask_copy(trial->exclusive_cpus, cs->exclusive_cpus);
> + /* Copy masks if duplicating */
> + if (cs) {
> + cpumask_copy(trial->cpus_allowed, cs->cpus_allowed);
> + cpumask_copy(trial->effective_cpus, cs->effective_cpus);
> + cpumask_copy(trial->effective_xcpus, cs->effective_xcpus);
> + cpumask_copy(trial->exclusive_cpus, cs->exclusive_cpus);
> + }
> +
> return trial;
> }
>
> @@ -2332,7 +2348,7 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
> if (cpumask_equal(cs->cpus_allowed, trialcs->cpus_allowed))
> return 0;
>
> - if (alloc_cpumasks(NULL, &tmp))
> + if (alloc_tmpmasks(&tmp))
> return -ENOMEM;
>
> if (old_prs) {
> @@ -2476,7 +2492,7 @@ static int update_exclusive_cpumask(struct cpuset *cs, struct cpuset *trialcs,
> if (retval)
> return retval;
>
> - if (alloc_cpumasks(NULL, &tmp))
> + if (alloc_tmpmasks(&tmp))
> return -ENOMEM;
>
> if (old_prs) {
> @@ -2820,7 +2836,7 @@ int cpuset_update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
> int spread_flag_changed;
> int err;
>
> - trialcs = alloc_trial_cpuset(cs);
> + trialcs = dup_or_alloc_cpuset(cs);
> if (!trialcs)
> return -ENOMEM;
>
> @@ -2881,7 +2897,7 @@ static int update_prstate(struct cpuset *cs, int new_prs)
> if (new_prs && is_prs_invalid(old_prs))
> old_prs = PRS_MEMBER;
>
> - if (alloc_cpumasks(NULL, &tmpmask))
> + if (alloc_tmpmasks(&tmpmask))
> return -ENOMEM;
>
> err = update_partition_exclusive_flag(cs, new_prs);
> @@ -3223,7 +3239,7 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
> if (!is_cpuset_online(cs))
> goto out_unlock;
>
> - trialcs = alloc_trial_cpuset(cs);
> + trialcs = dup_or_alloc_cpuset(cs);
> if (!trialcs) {
> retval = -ENOMEM;
> goto out_unlock;
> @@ -3456,15 +3472,10 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
> if (!parent_css)
> return &top_cpuset.css;
>
> - cs = kzalloc(sizeof(*cs), GFP_KERNEL);
> + cs = dup_or_alloc_cpuset(NULL);
> if (!cs)
> return ERR_PTR(-ENOMEM);
>
> - if (alloc_cpumasks(cs, NULL)) {
> - kfree(cs);
> - return ERR_PTR(-ENOMEM);
> - }
> -
> __set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
> fmeter_init(&cs->fmeter);
> cs->relax_domain_level = -1;
> @@ -3920,7 +3931,7 @@ static void cpuset_handle_hotplug(void)
> bool on_dfl = is_in_v2_mode();
> struct tmpmasks tmp, *ptmp = NULL;
>
> - if (on_dfl && !alloc_cpumasks(NULL, &tmp))
> + if (on_dfl && !alloc_tmpmasks(&tmp))
> ptmp = &tmp;
>
> lockdep_assert_cpus_held();
Reviewed-by: Waiman Long <longman@redhat.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH -next v5 0/3] some optimization for cpuset
2025-08-25 3:23 [PATCH -next v5 0/3] some optimization for cpuset Chen Ridong
` (2 preceding siblings ...)
2025-08-25 3:23 ` [PATCH -next v5 3/3] cpuset: add helpers for cpus read and cpuset_mutex locks Chen Ridong
@ 2025-08-25 18:20 ` Tejun Heo
3 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2025-08-25 18:20 UTC (permalink / raw)
To: Chen Ridong
Cc: longman, hannes, mkoutny, cgroups, linux-kernel, lujialin4,
chenridong
On Mon, Aug 25, 2025 at 03:23:49AM +0000, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> This patch series contains several cpuset improvements:
>
> 1. Decouple cpuset and tmpmasks allocation/freeing.
> 2. Add cpuset_full_[un]lock helpers.
Applied 1-3 to cgroup/for-6.18.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH -next v5 3/3] cpuset: add helpers for cpus read and cpuset_mutex locks
2025-08-25 3:23 ` [PATCH -next v5 3/3] cpuset: add helpers for cpus read and cpuset_mutex locks Chen Ridong
@ 2025-08-26 14:23 ` Michal Koutný
2025-08-26 14:43 ` Waiman Long
0 siblings, 1 reply; 10+ messages in thread
From: Michal Koutný @ 2025-08-26 14:23 UTC (permalink / raw)
To: Chen Ridong
Cc: longman, tj, hannes, cgroups, linux-kernel, lujialin4, chenridong
[-- Attachment #1: Type: text/plain, Size: 1083 bytes --]
(I wrote this yesterday before merging but I'm still sending it to give
my opinion ;-))
On Mon, Aug 25, 2025 at 03:23:52AM +0000, Chen Ridong <chenridong@huaweicloud.com> wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> cpuset: add helpers for cpus_read_lock and cpuset_mutex locks.
>
> Replace repetitive locking patterns with new helpers:
> - cpuset_full_lock()
> - cpuset_full_unlock()
I don't see many precedents elsewhere in the kernel for such naming
(like _lock and _full_lock()). Wouldn't it be more illustrative to have
cpuset_read_lock() and cpuset_write_lock()? (As I'm looking at current
users and your accompanying comments which could be substituted with
the more conventional naming.)
(Also if you decide going this direction, please mention commit
111cd11bbc548 ("sched/cpuset: Bring back cpuset_mutex") in the message
so that it doesn't tempt to do further changes.)
> This makes the code cleaner and ensures consistent lock ordering.
Lock guards anyone? (When you're touching this and seeking clean code.)
Thanks,
Michal
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH -next v5 3/3] cpuset: add helpers for cpus read and cpuset_mutex locks
2025-08-26 14:23 ` Michal Koutný
@ 2025-08-26 14:43 ` Waiman Long
2025-08-27 6:23 ` Chen Ridong
0 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2025-08-26 14:43 UTC (permalink / raw)
To: Michal Koutný, Chen Ridong
Cc: tj, hannes, cgroups, linux-kernel, lujialin4, chenridong
On 8/26/25 10:23 AM, Michal Koutný wrote:
> (I wrote this yesterday before merging but I'm still sending it to give
> my opinion ;-))
>
> On Mon, Aug 25, 2025 at 03:23:52AM +0000, Chen Ridong <chenridong@huaweicloud.com> wrote:
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> cpuset: add helpers for cpus_read_lock and cpuset_mutex locks.
>>
>> Replace repetitive locking patterns with new helpers:
>> - cpuset_full_lock()
>> - cpuset_full_unlock()
> I don't see many precedents elsewhere in the kernel for such naming
> (like _lock and _full_lock()). Wouldn't it be more illustrative to have
> cpuset_read_lock() and cpuset_write_lock()? (As I'm looking at current
> users and your accompanying comments which could be substituted with
> the more conventional naming.)
Good naming is always an issue. Using cpuset_read_lock/cpuset_write_lock
will be more confusing as the current locking scheme is exclusive.
> (Also if you decide going this direction, please mention commit
> 111cd11bbc548 ("sched/cpuset: Bring back cpuset_mutex") in the message
> so that it doesn't tempt to do further changes.)
>
>
>> This makes the code cleaner and ensures consistent lock ordering.
> Lock guards anyone? (When you're touching this and seeking clean code.)
Yes, I guess we can use lock guards here. You are welcome to send a
patch to do that.
Cheers,
Longman
>
> Thanks,
> Michal
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH -next v5 3/3] cpuset: add helpers for cpus read and cpuset_mutex locks
2025-08-26 14:43 ` Waiman Long
@ 2025-08-27 6:23 ` Chen Ridong
2025-08-27 8:46 ` Michal Koutný
0 siblings, 1 reply; 10+ messages in thread
From: Chen Ridong @ 2025-08-27 6:23 UTC (permalink / raw)
To: Waiman Long, Michal Koutný
Cc: tj, hannes, cgroups, linux-kernel, lujialin4, chenridong
On 2025/8/26 22:43, Waiman Long wrote:
>
> On 8/26/25 10:23 AM, Michal Koutný wrote:
>> (I wrote this yesterday before merging but I'm still sending it to give
>> my opinion ;-))
>>
>> On Mon, Aug 25, 2025 at 03:23:52AM +0000, Chen Ridong <chenridong@huaweicloud.com> wrote:
>>> From: Chen Ridong <chenridong@huawei.com>
>>>
>>> cpuset: add helpers for cpus_read_lock and cpuset_mutex locks.
>>>
>>> Replace repetitive locking patterns with new helpers:
>>> - cpuset_full_lock()
>>> - cpuset_full_unlock()
>> I don't see many precedents elsewhere in the kernel for such naming
>> (like _lock and _full_lock()). Wouldn't it be more illustrative to have
>> cpuset_read_lock() and cpuset_write_lock()? (As I'm looking at current
>> users and your accompanying comments which could be substituted with
>> the more conventional naming.)
> Good naming is always an issue. Using cpuset_read_lock/cpuset_write_lock will be more confusing as
> the current locking scheme is exclusive.
>> (Also if you decide going this direction, please mention commit
>> 111cd11bbc548 ("sched/cpuset: Bring back cpuset_mutex") in the message
>> so that it doesn't tempt to do further changes.)
>>
>>
>>> This makes the code cleaner and ensures consistent lock ordering.
>> Lock guards anyone? (When you're touching this and seeking clean code.)
>
> Yes, I guess we can use lock guards here. You are welcome to send a patch to do that.
>
I attempted to define the cpuset_full_lock() macro, but the initial implementation was inconsistent
with our coding conventions.
Initial version:
#define cpuset_full_lock() \
guard(cpus_read_lock)(); \
guard(mutex)(&cpuset_mutex);
It was suggested to use a do-while construct for proper scoping. but it could not work if we define as:
#define cpuset_full_lock() \
do { \
guard(cpus_read_lock)(); \
guard(mutex)(&cpuset_mutex); \
} while(0)
So I sent this patch version.
--
Best regards,
Ridong
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH -next v5 3/3] cpuset: add helpers for cpus read and cpuset_mutex locks
2025-08-27 6:23 ` Chen Ridong
@ 2025-08-27 8:46 ` Michal Koutný
0 siblings, 0 replies; 10+ messages in thread
From: Michal Koutný @ 2025-08-27 8:46 UTC (permalink / raw)
To: Chen Ridong
Cc: Waiman Long, tj, hannes, cgroups, linux-kernel, lujialin4,
chenridong
[-- Attachment #1: Type: text/plain, Size: 364 bytes --]
On Wed, Aug 27, 2025 at 02:23:17PM +0800, Chen Ridong <chenridong@huaweicloud.com> wrote:
> It was suggested to use a do-while construct for proper scoping. but it could not work if we define as:
Perhaps like this:
DEFINE_LOCK_GUARD_0(cpuset_full, cpuset_full_lock(), cpuset_full_unlock())
> So I sent this patch version.
No probs, it's a minor issue.
Michal
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-08-27 8:46 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-25 3:23 [PATCH -next v5 0/3] some optimization for cpuset Chen Ridong
2025-08-25 3:23 ` [PATCH -next v5 1/3] cpuset: decouple tmpmasks and cpumasks freeing in cgroup Chen Ridong
2025-08-25 3:23 ` [PATCH -next v5 2/3] cpuset: separate tmpmasks and cpuset allocation logic Chen Ridong
2025-08-25 17:37 ` Waiman Long
2025-08-25 3:23 ` [PATCH -next v5 3/3] cpuset: add helpers for cpus read and cpuset_mutex locks Chen Ridong
2025-08-26 14:23 ` Michal Koutný
2025-08-26 14:43 ` Waiman Long
2025-08-27 6:23 ` Chen Ridong
2025-08-27 8:46 ` Michal Koutný
2025-08-25 18:20 ` [PATCH -next v5 0/3] some optimization for cpuset Tejun Heo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).