From: Waiman Long <longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Juri Lelli <juri.lelli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Vincent Guittot
	<vincent.guittot-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>,
	Dietmar Eggemann <dietmar.eggemann-5wv7dgnIgG8@public.gmane.org>,
	Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
	Ben Segall <bsegall-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>,
	Daniel Bristot de Oliveira
	<bristot-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Valentin Schneider
	<vschneid-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Zefan Li <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Waiman Long <longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: [PATCH 1/2] cgroup/cpuset: Keep current cpus list if cpus affinity was explicitly set
Date: Wed, 27 Jul 2022 20:58:14 -0400
Message-ID: <20220728005815.1715522-1-longman@redhat.com>

It was found that any change to the current cpuset hierarchy may reset
the cpus_allowed list of the tasks in the affected cpusets to the
default cpuset value, even if those tasks had their CPU affinity
explicitly set by the user beforehand. This is especially easy to
trigger in a cgroup v2 environment, where writing "+cpuset" to the root
cgroup's cgroup.subtree_control file will reset the CPU affinity of all
the processes in the system.

This is particularly problematic in a nohz_full environment, where the
tasks running on the nohz_full CPUs usually have their CPU affinity
explicitly set and will behave incorrectly if that affinity changes.

Fix this problem by adding a flag to the task structure to indicate that
a task has had its CPU affinity explicitly set, and make the cpuset code
leave that task's cpus_allowed list unchanged unless the user-chosen CPU
list is no longer a subset of the cpus_allowed list of the cpuset itself.

With that change in place, it was verified that tasks that have their
CPU affinity explicitly set are not affected by changes made to the v2
cgroup.subtree_control files.

Signed-off-by: Waiman Long <longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 include/linux/sched.h  |  1 +
 kernel/cgroup/cpuset.c | 18 ++++++++++++++++--
 kernel/sched/core.c    |  1 +
 3 files changed, 18 insertions(+), 2 deletions(-)
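
Not part of the patch: the decision made by cpuset_set_cpus_allowed_ptr()
in the diff can be modelled in user space roughly as follows. This is
only a minimal sketch, assuming a 64-bit integer stands in for a cpumask.
struct model_task and the model_*() helpers are hypothetical names used
for illustration; they do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the task_struct fields the patch touches. */
struct model_task {
	uint64_t cpus_mask;		/* bit n set => CPU n is allowed */
	bool cpus_affinity_set;		/* user called sched_setaffinity() */
};

/* True if every CPU in a is also in b (models cpumask_subset()). */
static bool mask_subset(uint64_t a, uint64_t b)
{
	return (a & ~b) == 0;
}

/* Models the sched_setaffinity() path: remember the explicit choice. */
static void model_setaffinity(struct model_task *t, uint64_t mask)
{
	t->cpus_mask = mask;
	t->cpus_affinity_set = true;
}

/* Models cpuset_set_cpus_allowed_ptr() as added by the patch. */
static void model_cpuset_update(struct model_task *t, uint64_t new_mask)
{
	/* Keep the user's mask while the cpuset still covers it. */
	if (t->cpus_affinity_set && mask_subset(t->cpus_mask, new_mask))
		return;

	/* Otherwise the cpuset wins and the explicit setting is dropped. */
	t->cpus_affinity_set = false;
	t->cpus_mask = new_mask;
}

/* Walks through the two cases described in the commit message. */
static void model_demo(void)
{
	struct model_task t = { .cpus_mask = 0xff, .cpus_affinity_set = false };

	model_setaffinity(&t, 0x0c);	/* user pins the task to CPUs 2-3 */

	model_cpuset_update(&t, 0xff);	/* cpuset still covers CPUs 2-3 */
	assert(t.cpus_mask == 0x0c && t.cpus_affinity_set);

	model_cpuset_update(&t, 0x03);	/* cpuset shrinks to CPUs 0-1 */
	assert(t.cpus_mask == 0x03 && !t.cpus_affinity_set);
}
```

Calling model_demo() exercises both branches: an update that still
covers the user's CPUs leaves the mask alone, while one that no longer
covers them overrides it and clears the flag. The model deliberately
ignores the validation and retry logic in __sched_setaffinity().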

diff --git a/include/linux/sched.h b/include/linux/sched.h
index c46f3a63b758..60ae022fa842 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -815,6 +815,7 @@ struct task_struct {
 
 	unsigned int			policy;
 	int				nr_cpus_allowed;
+	int				cpus_affinity_set;
 	const cpumask_t			*cpus_ptr;
 	cpumask_t			*user_cpus_ptr;
 	cpumask_t			cpus_mask;
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 71a418858a5e..c47757c61f39 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -704,6 +704,20 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 	return ret;
 }
 
+/*
+ * Don't change the cpus_allowed list if cpus affinity has been explicitly
+ * set before unless the current cpu list is not a subset of the new cpu list.
+ */
+static int cpuset_set_cpus_allowed_ptr(struct task_struct *p,
+				       const struct cpumask *new_mask)
+{
+	if (p->cpus_affinity_set && cpumask_subset(p->cpus_ptr, new_mask))
+		return 0;
+
+	p->cpus_affinity_set = 0;
+	return set_cpus_allowed_ptr(p, new_mask);
+}
+
 #ifdef CONFIG_SMP
 /*
  * Helper routine for generate_sched_domains().
@@ -1130,7 +1144,7 @@ static void update_tasks_cpumask(struct cpuset *cs)
 
 	css_task_iter_start(&cs->css, 0, &it);
 	while ((task = css_task_iter_next(&it)))
-		set_cpus_allowed_ptr(task, cs->effective_cpus);
+		cpuset_set_cpus_allowed_ptr(task, cs->effective_cpus);
 	css_task_iter_end(&it);
 }
 
@@ -2303,7 +2317,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
 		 * can_attach beforehand should guarantee that this doesn't
 		 * fail.  TODO: have a better way to handle failure here
 		 */
-		WARN_ON_ONCE(set_cpus_allowed_ptr(task, cpus_attach));
+		WARN_ON_ONCE(cpuset_set_cpus_allowed_ptr(task, cpus_attach));
 
 		cpuset_change_task_nodemask(task, &cpuset_attach_nodemask_to);
 		cpuset_update_task_spread_flag(cs, task);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index da0bf6fe9ecd..ab8ea6fa92db 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8034,6 +8034,7 @@ __sched_setaffinity(struct task_struct *p, const struct cpumask *mask)
 	if (retval)
 		goto out_free_new_mask;
 
+	p->cpus_affinity_set = 1;
 	cpuset_cpus_allowed(p, cpus_allowed);
 	if (!cpumask_subset(new_mask, cpus_allowed)) {
 		/*
-- 
2.31.1


