From: Waiman Long <longman@redhat.com>
To: "Tejun Heo" <tj@kernel.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Ridong Chen" <ridong.chen@linux.dev>,
"Jonathan Corbet" <corbet@lwn.net>,
"Shuah Khan" <skhan@linuxfoundation.org>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
Waiman Long <longman@redhat.com>
Subject: [PATCH v2 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks()
Date: Tue, 23 Jun 2026 19:04:12 -0400 [thread overview]
Message-ID: <20260623230413.1984188-2-longman@redhat.com> (raw)
In-Reply-To: <20260623230413.1984188-1-longman@redhat.com>
As reported by sashiko [1], cpuset_hotplug_update_tasks() may perform
unnecessary task iteration and updating of tasks' CPU and node masks
when mems_allowed and/or cpus_allowed are not set in cpuset v2. It is
due to the fact that the temporary new_cpus and new_mems masks do not
inherit parent's effective_cpus/mems when they are empty which is the
expected behavior for cpuset v2 since commit 4ec22e9c5a90 ("cpuset:
Enable cpuset controller in default hierarchy").
Fix that and avoid unnecessay work by enhancing
compute_effective_cpumask() to add the empty cpumask check
and inheriting the parent's versions if empty when in v2. A new
compute_effective_nodemask() helper is also added to perform similar
function for new effective_mems.
Add new test_cpuset_prs.sh test cases to confirm that effective_cpus
will inherit the parent's version if cpuset.cpus is empty.
[1] https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com
Suggested-by: Ridong Chen <ridong.chen@linux.dev>
Fixes: 4ec22e9c5a90 ("cpuset: Enable cpuset controller in default hierarchy")
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/cgroup/cpuset.c | 45 +++++++++++--------
.../selftests/cgroup/test_cpuset_prs.sh | 11 ++++-
2 files changed, 35 insertions(+), 21 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index aff86acea701..044ddbf66f8e 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1094,12 +1094,35 @@ void cpuset_update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
* @cs: the cpuset the need to recompute the new effective_cpus mask
* @parent: the parent cpuset
*
+ * For v2, the parent's effective_cpus is inherited if cpumask empty.
* The result is valid only if the given cpuset isn't a partition root.
*/
static void compute_effective_cpumask(struct cpumask *new_cpus,
struct cpuset *cs, struct cpuset *parent)
{
- cpumask_and(new_cpus, cs->cpus_allowed, parent->effective_cpus);
+ bool has_cpus;
+
+ has_cpus = cpumask_and(new_cpus, cs->cpus_allowed, parent->effective_cpus);
+ if (!has_cpus && is_in_v2_mode())
+ cpumask_copy(new_cpus, parent->effective_cpus);
+}
+
+/**
+ * compute_effective_nodemask - Compute the effective nodemask of the cpuset
+ * @new_cpus: the temp variable for the new effective_mems mask
+ * @cs: the cpuset the need to recompute the new effective_mems mask
+ * @parent: the parent cpuset
+ *
+ * For v2, the parent's effective_mems is inherited if nodemask empty.
+ */
+static void compute_effective_nodemask(nodemask_t *new_mems,
+ struct cpuset *cs, struct cpuset *parent)
+{
+ bool has_mems;
+
+ has_mems = nodes_and(*new_mems, cs->mems_allowed, parent->effective_mems);
+ if (!has_mems && is_in_v2_mode())
+ nodes_copy(*new_mems, parent->effective_mems);
}
/*
@@ -2148,15 +2171,6 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
goto update_parent_effective;
}
- /*
- * If it becomes empty, inherit the effective mask of the
- * parent, which is guaranteed to have some CPUs unless
- * it is a partition root that has explicitly distributed
- * out all its CPUs.
- */
- if (is_in_v2_mode() && !remote && cpumask_empty(tmp->new_cpus))
- cpumask_copy(tmp->new_cpus, parent->effective_cpus);
-
/*
* Skip the whole subtree if
* 1) the cpumask remains the same,
@@ -2704,14 +2718,7 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
cpuset_for_each_descendant_pre(cp, pos_css, cs) {
struct cpuset *parent = parent_cs(cp);
- bool has_mems = nodes_and(*new_mems, cp->mems_allowed, parent->effective_mems);
-
- /*
- * If it becomes empty, inherit the effective mask of the
- * parent, which is guaranteed to have some MEMs.
- */
- if (is_in_v2_mode() && !has_mems)
- *new_mems = parent->effective_mems;
+ compute_effective_nodemask(new_mems, cp, parent);
/* Skip the whole subtree if the nodemask remains the same. */
if (nodes_equal(*new_mems, cp->effective_mems)) {
@@ -3923,7 +3930,7 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
parent = parent_cs(cs);
compute_effective_cpumask(&new_cpus, cs, parent);
- nodes_and(new_mems, cs->mems_allowed, parent->effective_mems);
+ compute_effective_nodemask(&new_mems, cs, parent);
if (!tmp || !cs->partition_root_state)
goto update_tasks;
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 0d41aa0d343d..ca9bc38fdb95 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -495,13 +495,20 @@ REMOTE_TEST_MATRIX=(
# Narrowing cpuset.cpus to previously sibling-excluded CPUs should
# not return CPUs that were never actually owned.
" C1-4:P1 . C1-2:P1 C1-3:P2 . . \
- . . . C3 . . p1:4|c11:1-2|c12:3 \
+ . . . C3 . . p1:4|c11:1-2|c12:3 \
p1:P1|c11:P1|c12:P2 3"
# Expanding cpuset.cpus to include a previously sibling-excluded CPU
# after the sibling has become a member should correctly request it.
" C1-4:P1 . C1-2:P1 C1-3:P2 . . \
- . . P0 C2-3 . . p1:1,4|c11:1|c12:2-3 \
+ . . P0 C2-3 . . p1:1,4|c11:1|c12:2-3 \
p1:P1|c11:P0|c12:P2 2-3"
+ # Cpusets with empty cpuset.cpus should inherit parent's effective_cpus
+ " C1-4:P1 C5-6 C1-2 . C5 . \
+ . P1 P1 . . . p1:3-4|p2:5-6|c11:1-2|c12:3-4|c21:5|c22:5-6 \
+ p1:P1|p2:P1|c11:P1"
+ " C1-4:P1 C5-6 C1-2 . C5 . \
+ . P1 P1 . O5=0 . p1:3-4|p2:6|c11:1-2|c12:3-4|c21:6|c22:6 \
+ p1:P1|p2:P1|c11:P1"
)
#
--
2.54.0
next prev parent reply other threads:[~2026-06-23 23:05 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-23 23:04 [PATCH v2 0/2] cgroup/cpuset: Miscellaneous fixes and cleanups Waiman Long
2026-06-23 23:04 ` Waiman Long [this message]
2026-06-24 5:51 ` [PATCH v2 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks() Ridong Chen
2026-06-24 8:40 ` Manuel Ebner
2026-06-23 23:04 ` [PATCH v2 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask() Waiman Long
2026-06-24 8:27 ` Manuel Ebner
2026-06-24 19:47 ` [PATCH v2 0/2] cgroup/cpuset: Miscellaneous fixes and cleanups Tejun Heo
2026-06-24 20:55 ` Waiman Long
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260623230413.1984188-2-longman@redhat.com \
--to=longman@redhat.com \
--cc=cgroups@vger.kernel.org \
--cc=corbet@lwn.net \
--cc=hannes@cmpxchg.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=mkoutny@suse.com \
--cc=ridong.chen@linux.dev \
--cc=skhan@linuxfoundation.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.