From: Will Deacon <will@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com, Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Waiman Long <longman@redhat.com>,
	Zefan Li <lizefan.x@bytedance.com>, Tejun Heo <tj@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	cgroups@vger.kernel.org
Subject: [PATCH 1/2] cpuset: Fix cpuset_cpus_allowed() to not filter offline CPUs
Date: Tue, 31 Jan 2023 22:17:18 +0000
Message-ID: <20230131221719.3176-2-will@kernel.org>
In-Reply-To: <20230131221719.3176-1-will@kernel.org>

From: Peter Zijlstra <peterz@infradead.org>

There is a difference in behaviour between CPUSET={y,n} that is now
wreaking havoc with {relax,force}_compatible_cpus_allowed_ptr().

Specifically, since commit 8f9ea86fdf99 ("sched: Always preserve the
user requested cpumask") relax_compatible_cpus_allowed_ptr() is
calling __sched_setaffinity() unconditionally.

But the underlying problem goes back a lot further, possibly to
commit ae1c802382f7 ("cpuset: apply cs->effective_{cpus,mems}") which
switched cpuset_cpus_allowed() from cs->cpus_allowed to
cs->effective_cpus.

The problem is that for CPUSET=y cpuset_cpus_allowed() will filter out
all offline CPUs. For tasks that are part of a (!root) cpuset this is
then later fixed up by the cpuset hotplug notifiers that re-evaluate
and re-apply cs->effective_cpus, but for (normal) tasks in the root
cpuset this does not happen and they will forever after be excluded
from CPUs onlined later.

As such, rewrite cpuset_cpus_allowed() to return a wider mask,
including the offline CPUs.

Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask")
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230117160825.GA17756@willie-the-truck
Signed-off-by: Will Deacon <will@kernel.org>
---
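Not part of the patch: a minimal user-space sketch of the mask walk the
new cpuset_cpus_allowed() performs, in case it helps review. Everything
here is a hypothetical stand-in (mask_t, struct toy_cpuset and the toy_*
helpers are made-up names; one bit per CPU replaces struct cpumask), and
locking/RCU are left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, not the kernel's struct cpumask. */
typedef uint64_t mask_t;

struct toy_cpuset {
	mask_t cpus_allowed;		/* configured mask, ignored for the root */
	struct toy_cpuset *parent;	/* NULL for the root cpuset */
};

/* Models __cs_cpus_allowed(): the root cpuset counts as every possible CPU. */
static mask_t toy_cs_mask(const struct toy_cpuset *cs, mask_t possible)
{
	return cs->parent ? cs->cpus_allowed : possible;
}

/* Models cs_cpus_allowed(): AND in the mask of cs and of each ancestor. */
static mask_t toy_hier_mask(const struct toy_cpuset *cs, mask_t mask,
			    mask_t possible)
{
	for (; cs; cs = cs->parent)
		mask &= toy_cs_mask(cs, possible);
	return mask;
}

/*
 * Models the loop in cpuset_cpus_allowed(): start from the CPUs the task
 * can run on, narrow by the cpuset hierarchy, and climb towards the root
 * until the result contains at least one online CPU. Offline CPUs are
 * never filtered out of the returned mask.
 */
static mask_t toy_cpus_allowed(const struct toy_cpuset *cs,
			       mask_t task_possible, mask_t possible,
			       mask_t online)
{
	mask_t mask = 0;

	for (; cs; cs = cs->parent) {
		mask = toy_hier_mask(cs, task_possible, possible);
		if (mask & online)
			break;
	}
	return mask;
}
```

With four possible CPUs (0xf) of which only CPUs 0-1 are online (0x3), a
task in the root of this model gets 0xf back, so it is already allowed on
CPUs 2-3 when they come online later. A child restricted to CPUs 2-3
(0xc) has no online CPU and falls back to its parent's result.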
 kernel/cgroup/cpuset.c | 39 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index a29c0b13706b..8552cc2c586a 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3683,23 +3683,52 @@ void __init cpuset_init_smp(void)
 	BUG_ON(!cpuset_migrate_mm_wq);
 }
 
+static const struct cpumask *__cs_cpus_allowed(struct cpuset *cs)
+{
+	const struct cpumask *cs_mask = cs->cpus_allowed;
+	if (!parent_cs(cs))
+		cs_mask = cpu_possible_mask;
+	return cs_mask;
+}
+
+static void cs_cpus_allowed(struct cpuset *cs, struct cpumask *pmask)
+{
+	do {
+		cpumask_and(pmask, pmask, __cs_cpus_allowed(cs));
+		cs = parent_cs(cs);
+	} while (cs);
+}
+
 /**
  * cpuset_cpus_allowed - return cpus_allowed mask from a tasks cpuset.
  * @tsk: pointer to task_struct from which to obtain cpuset->cpus_allowed.
  * @pmask: pointer to struct cpumask variable to receive cpus_allowed set.
  *
- * Description: Returns the cpumask_var_t cpus_allowed of the cpuset
- * attached to the specified @tsk.  Guaranteed to return some non-empty
- * subset of cpu_online_mask, even if this means going outside the
- * tasks cpuset.
+ * Description: Returns the cpumask_var_t cpus_allowed of the cpuset attached
+ * to the specified @tsk.  Guaranteed to return some non-empty intersection
+ * with cpu_online_mask, even if this means going outside the tasks cpuset.
  **/
 
 void cpuset_cpus_allowed(struct task_struct *tsk, struct cpumask *pmask)
 {
 	unsigned long flags;
+	struct cpuset *cs;
 
 	spin_lock_irqsave(&callback_lock, flags);
-	guarantee_online_cpus(tsk, pmask);
+	rcu_read_lock();
+
+	cs = task_cs(tsk);
+	do {
+		cpumask_copy(pmask, task_cpu_possible_mask(tsk));
+		cs_cpus_allowed(cs, pmask);
+
+		if (cpumask_intersects(pmask, cpu_online_mask))
+			break;
+
+		cs = parent_cs(cs);
+	} while (cs);
+
+	rcu_read_unlock();
 	spin_unlock_irqrestore(&callback_lock, flags);
 }
--
2.39.1.456.gfc5497dd1b-goog