* [PATCH v4 0/2] sched: Fix dup_user_cpus_ptr() & do_set_cpus_allowed() bugs
@ 2022-12-22 22:49 Waiman Long
2022-12-22 22:49 ` [PATCH v4 1/2] sched: Fix use-after-free bug in dup_user_cpus_ptr() Waiman Long
2022-12-22 22:49 ` [PATCH v4 2/2] sched: Use kfree_rcu() in do_set_cpus_allowed() Waiman Long
0 siblings, 2 replies; 3+ messages in thread
From: Waiman Long @ 2022-12-22 22:49 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Daniel Bristot de Oliveira, Valentin Schneider
Cc: Phil Auld, Wenjie Li, David Wang 王标, Quentin Perret,
Will Deacon, linux-kernel, Waiman Long
v4:
- Make sure user_cpus_ptr allocation size is large enough for
rcu_head.
This series fixes a UAF bug in dup_user_cpus_ptr() and uses kfree_rcu()
in do_set_cpus_allowed to avoid lockdep splats.
Waiman Long (2):
sched: Fix use-after-free bug in dup_user_cpus_ptr()
sched: Use kfree_rcu() in do_set_cpus_allowed()
kernel/sched/core.c | 59 +++++++++++++++++++++++++++++++++++++++------
1 file changed, 51 insertions(+), 8 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH v4 1/2] sched: Fix use-after-free bug in dup_user_cpus_ptr()
2022-12-22 22:49 [PATCH v4 0/2] sched: Fix dup_user_cpus_ptr() & do_set_cpus_allowed() bugs Waiman Long
@ 2022-12-22 22:49 ` Waiman Long
2022-12-22 22:49 ` [PATCH v4 2/2] sched: Use kfree_rcu() in do_set_cpus_allowed() Waiman Long
1 sibling, 0 replies; 3+ messages in thread
From: Waiman Long @ 2022-12-22 22:49 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Daniel Bristot de Oliveira, Valentin Schneider
Cc: Phil Auld, Wenjie Li, David Wang 王标, Quentin Perret,
Will Deacon, linux-kernel, Waiman Long, stable
Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
restricted on asymmetric systems"), the setting and clearing of
user_cpus_ptr are done under pi_lock for arm64 architecture. However,
dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
protection. Since sched_setaffinity() can be invoked from another
process, the process being modified may be undergoing fork() at
the same time. When racing with the clearing of user_cpus_ptr in
__set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
possibly double-free in arm64 kernel.
Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
cpumask") fixes this problem as user_cpus_ptr, once set, will never
be cleared in a task's lifetime. However, this bug was re-introduced
in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
do_set_cpus_allowed(). This time, it will affect all arches.
Fix this bug by always clearing the user_cpus_ptr of the newly
cloned/forked task before the copying process starts and check the
user_cpus_ptr state of the source task under pi_lock.
Note to stable, this patch won't be applicable to stable releases.
Just copy the new dup_user_cpus_ptr() function over.
Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
CC: stable@vger.kernel.org
Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/sched/core.c | 34 +++++++++++++++++++++++++++++-----
1 file changed, 29 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 25b582b6ee5f..b93d030b9fd5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2612,19 +2612,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
int node)
{
+ cpumask_t *user_mask;
unsigned long flags;
- if (!src->user_cpus_ptr)
+ /*
+ * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
+ * may differ by now due to racing.
+ */
+ dst->user_cpus_ptr = NULL;
+
+ /*
+ * This check is racy and losing the race is a valid situation.
+ * It is not worth the extra overhead of taking the pi_lock on
+ * every fork/clone.
+ */
+ if (data_race(!src->user_cpus_ptr))
return 0;
- dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
- if (!dst->user_cpus_ptr)
+ user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+ if (!user_mask)
return -ENOMEM;
- /* Use pi_lock to protect content of user_cpus_ptr */
+ /*
+ * Use pi_lock to protect content of user_cpus_ptr
+ *
+ * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
+ * do_set_cpus_allowed().
+ */
raw_spin_lock_irqsave(&src->pi_lock, flags);
- cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+ if (src->user_cpus_ptr) {
+ swap(dst->user_cpus_ptr, user_mask);
+ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+ }
raw_spin_unlock_irqrestore(&src->pi_lock, flags);
+
+ if (unlikely(user_mask))
+ kfree(user_mask);
+
return 0;
}
--
2.31.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH v4 2/2] sched: Use kfree_rcu() in do_set_cpus_allowed()
2022-12-22 22:49 [PATCH v4 0/2] sched: Fix dup_user_cpus_ptr() & do_set_cpus_allowed() bugs Waiman Long
2022-12-22 22:49 ` [PATCH v4 1/2] sched: Fix use-after-free bug in dup_user_cpus_ptr() Waiman Long
@ 2022-12-22 22:49 ` Waiman Long
1 sibling, 0 replies; 3+ messages in thread
From: Waiman Long @ 2022-12-22 22:49 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Daniel Bristot de Oliveira, Valentin Schneider
Cc: Phil Auld, Wenjie Li, David Wang 王标, Quentin Perret,
Will Deacon, linux-kernel, Waiman Long
Commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
do_set_cpus_allowed()") may call kfree() if user_cpus_ptr was previously
set. Unfortunately, some of the callers of do_set_cpus_allowed()
may have pi_lock held when calling it. So the following splats may be
printed especially when running with a PREEMPT_RT kernel:
WARNING: possible circular locking dependency detected
BUG: sleeping function called from invalid context
To avoid these problems, kfree_rcu() is used instead. An internal
cpumask_rcuhead union is created for the sole purpose of facilitating
the use of kfree_rcu() to free the cpumask.
Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/sched/core.c | 27 +++++++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b93d030b9fd5..c2df236fd8c1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2604,9 +2604,19 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
.user_mask = NULL,
.flags = SCA_USER, /* clear the user requested mask */
};
+ union cpumask_rcuhead {
+ cpumask_t cpumask;
+ struct rcu_head rcu;
+ };
__do_set_cpus_allowed(p, &ac);
- kfree(ac.user_mask);
+
+ /*
+ * Because this is called with p->pi_lock held, it is not possible
+ * to use kfree() here (when PREEMPT_RT=y), therefore punt to using
+ * kfree_rcu().
+ */
+ kfree_rcu((union cpumask_rcuhead *)ac.user_mask, rcu);
}
int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
@@ -2614,6 +2624,7 @@ int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
{
cpumask_t *user_mask;
unsigned long flags;
+ int size;
/*
* Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
@@ -2629,7 +2640,11 @@ int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
if (data_race(!src->user_cpus_ptr))
return 0;
- user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+ /*
+ * See do_set_cpus_allowed() for the rcu_head usage.
+ */
+ size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
+ user_mask = kmalloc_node(size, GFP_KERNEL, node);
if (!user_mask)
return -ENOMEM;
@@ -8230,7 +8245,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
struct affinity_context ac;
struct cpumask *user_mask;
struct task_struct *p;
- int retval;
+ int retval, size;
rcu_read_lock();
@@ -8263,7 +8278,11 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
if (retval)
goto out_put_task;
- user_mask = kmalloc(cpumask_size(), GFP_KERNEL);
+ /*
+ * See do_set_cpus_allowed() for the rcu_head usage.
+ */
+ size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
+ user_mask = kmalloc(size, GFP_KERNEL);
if (!user_mask) {
retval = -ENOMEM;
goto out_put_task;
--
2.31.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-12-22 22:50 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-12-22 22:49 [PATCH v4 0/2] sched: Fix dup_user_cpus_ptr() & do_set_cpus_allowed() bugs Waiman Long
2022-12-22 22:49 ` [PATCH v4 1/2] sched: Fix use-after-free bug in dup_user_cpus_ptr() Waiman Long
2022-12-22 22:49 ` [PATCH v4 2/2] sched: Use kfree_rcu() in do_set_cpus_allowed() Waiman Long
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox