Linux Kernel Selftest development
 help / color / mirror / Atom feed
From: Waiman Long <longman@redhat.com>
To: Tejun Heo <tj@kernel.org>, Zefan Li <lizefan.x@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Xavier <ghostxavier@sina.com>, Peter Hunt <pehunt@redhat.com>,
	Petr Malat <oss@malat.biz>, Waiman Long <longman@redhat.com>
Subject: [PATCH-cgroup v2 3/5] cgroup/cpuset: Delay setting of CS_CPU_EXCLUSIVE until valid partition
Date: Mon, 17 Jun 2024 10:39:43 -0400	[thread overview]
Message-ID: <20240617143945.454888-4-longman@redhat.com> (raw)
In-Reply-To: <20240617143945.454888-1-longman@redhat.com>

The CS_CPU_EXCLUSIVE flag is currently set whenever cpuset.cpus.exclusive
is set to make sure that the exclusivity test will be run to ensure its
exclusiveness. At the same time, this flag can be changed whenever the
partition root state is changed. For example, the CS_CPU_EXCLUSIVE flag
will be reset whenever a partition root becomes invalid. This makes
using CS_CPU_EXCLUSIVE to ensure exclusiveness a bit fragile.

The current scheme also makes setting up a cpuset.cpus.exclusive
hierarchy to enable remote partition harder as cpuset.cpus.exclusive
cannot overlap with any cpuset.cpus of sibling cpusets if their
cpuset.cpus.exclusive aren't set.

Solve these issues by deferring the setting of CS_CPU_EXCLUSIVE flag
until the cpuset become a valid partition root while adding new checks
in validate_change() to ensure that cpuset.cpus.exclusive of sibling
cpusets cannot overlap.

An additional check is also added to validate_change() to make sure that
cpuset.cpus of one cpuset cannot be a subset of cpuset.cpus.exclusive
of a sibling cpuset to avoid the problem that none of those CPUs will
be available when these exclusive CPUs are extracted out to a newly
enabled partition root. The Documentation/admin-guide/cgroup-v2.rst
file is updated to document the new constraints.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 Documentation/admin-guide/cgroup-v2.rst |  8 ++++--
 kernel/cgroup/cpuset.c                  | 36 ++++++++++++++++++++-----
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index f886c6ad704e..722e4762c4e0 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2359,8 +2359,12 @@ Cpuset Interface Files
 	is always a subset of it.
 
 	Users can manually set it to a value that is different from
-	"cpuset.cpus".	The only constraint in setting it is that the
-	list of CPUs must be exclusive with respect to its sibling.
+	"cpuset.cpus".	One constraint in setting it is that the list of
+	CPUs must be exclusive with respect to "cpuset.cpus.exclusive"
+	of its sibling.  If "cpuset.cpus.exclusive" of a sibling cgroup
+	isn't set, its "cpuset.cpus" value, if set, cannot be a subset
+	of it to leave at least one CPU available when the exclusive
+	CPUs are taken away.
 
 	For a parent cgroup, any one of its exclusive CPUs can only
 	be distributed to at most one of its child cgroups.  Having an
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index fb71d710a603..144bfc319809 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -825,17 +825,41 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 
 	/*
 	 * If either I or some sibling (!= me) is exclusive, we can't
-	 * overlap
+	 * overlap. exclusive_cpus cannot overlap with each other if set.
 	 */
 	ret = -EINVAL;
 	cpuset_for_each_child(c, css, par) {
-		if ((is_cpu_exclusive(trial) || is_cpu_exclusive(c)) &&
-		    c != cur) {
+		bool txset, cxset;	/* Are exclusive_cpus set? */
+
+		if (c == cur)
+			continue;
+
+		txset = !cpumask_empty(trial->exclusive_cpus);
+		cxset = !cpumask_empty(c->exclusive_cpus);
+		if (is_cpu_exclusive(trial) || is_cpu_exclusive(c) ||
+		    (txset && cxset)) {
 			if (!cpusets_are_exclusive(trial, c))
 				goto out;
+		} else if (txset || cxset) {
+			struct cpumask *xcpus, *acpus;
+
+			/*
+			 * When just one of the exclusive_cpus's is set,
+			 * cpus_allowed of the other cpuset, if set, cannot be
+			 * a subset of it or none of those CPUs will be
+			 * available if these exclusive CPUs are activated.
+			 */
+			if (txset) {
+				xcpus = trial->exclusive_cpus;
+				acpus = c->cpus_allowed;
+			} else {
+				xcpus = c->exclusive_cpus;
+				acpus = trial->cpus_allowed;
+			}
+			if (!cpumask_empty(acpus) && cpumask_subset(acpus, xcpus))
+				goto out;
 		}
 		if ((is_mem_exclusive(trial) || is_mem_exclusive(c)) &&
-		    c != cur &&
 		    nodes_intersects(trial->mems_allowed, c->mems_allowed))
 			goto out;
 	}
@@ -1375,7 +1399,7 @@ static void update_sibling_cpumasks(struct cpuset *parent, struct cpuset *cs,
  */
 static int update_partition_exclusive(struct cpuset *cs, int new_prs)
 {
-	bool exclusive = (new_prs > 0);
+	bool exclusive = (new_prs > PRS_MEMBER);
 
 	if (exclusive && !is_cpu_exclusive(cs)) {
 		if (update_flag(CS_CPU_EXCLUSIVE, cs, 1))
@@ -2620,8 +2644,6 @@ static int update_exclusive_cpumask(struct cpuset *cs, struct cpuset *trialcs,
 		retval = cpulist_parse(buf, trialcs->exclusive_cpus);
 		if (retval < 0)
 			return retval;
-		if (!is_cpu_exclusive(cs))
-			set_bit(CS_CPU_EXCLUSIVE, &trialcs->flags);
 	}
 
 	/* Nothing to do if the CPUs didn't change */
-- 
2.39.3


  parent reply	other threads:[~2024-06-17 14:41 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-17 14:39 [PATCH-cgroup v2 0/5] cgroup/cpuset: Fix miscellaneous issues Waiman Long
2024-06-17 14:39 ` [PATCH-cgroup v2 1/5] cgroup/cpuset: Fix remote root partition creation problem Waiman Long
2024-06-17 14:39 ` [PATCH-cgroup v2 2/5] selftest/cgroup: Fix test_cpuset_prs.sh problems reported by test robot Waiman Long
2024-06-17 14:39 ` Waiman Long [this message]
2024-06-17 14:39 ` [PATCH-cgroup v2 4/5] cgroup/cpuset: Make cpuset.cpus.exclusive independent of cpuset.cpus Waiman Long
2024-07-15 15:56   ` Petr Malat
2024-07-15 19:17     ` Waiman Long
2024-06-17 14:39 ` [PATCH-cgroup v2 5/5] selftest/cgroup: Update test_cpuset_prs.sh to match changes Waiman Long
2024-06-19 17:38 ` [PATCH-cgroup v2 0/5] cgroup/cpuset: Fix miscellaneous issues Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240617143945.454888-4-longman@redhat.com \
    --to=longman@redhat.com \
    --cc=cgroups@vger.kernel.org \
    --cc=corbet@lwn.net \
    --cc=ghostxavier@sina.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=lizefan.x@bytedance.com \
    --cc=oss@malat.biz \
    --cc=pehunt@redhat.com \
    --cc=shuah@kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox