public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/2] sched: Fix dup_user_cpus_ptr() & do_set_cpus_allowed() bugs
@ 2022-12-22 22:49 Waiman Long
  2022-12-22 22:49 ` [PATCH v4 1/2] sched: Fix use-after-free bug in dup_user_cpus_ptr() Waiman Long
  2022-12-22 22:49 ` [PATCH v4 2/2] sched: Use kfree_rcu() in do_set_cpus_allowed() Waiman Long
  0 siblings, 2 replies; 3+ messages in thread
From: Waiman Long @ 2022-12-22 22:49 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider
  Cc: Phil Auld, Wenjie Li, David Wang 王标, Quentin Perret,
	Will Deacon, linux-kernel, Waiman Long

 v4:
  - Make sure user_cpus_ptr allocation size is large enough for
    rcu_head.

This series fixes a UAF bug in dup_user_cpus_ptr() and uses kfree_rcu()
in do_set_cpus_allowed to avoid lockdep splats.

Waiman Long (2):
  sched: Fix use-after-free bug in dup_user_cpus_ptr()
  sched: Use kfree_rcu() in do_set_cpus_allowed()

 kernel/sched/core.c | 59 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 51 insertions(+), 8 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v4 1/2] sched: Fix use-after-free bug in dup_user_cpus_ptr()
  2022-12-22 22:49 [PATCH v4 0/2] sched: Fix dup_user_cpus_ptr() & do_set_cpus_allowed() bugs Waiman Long
@ 2022-12-22 22:49 ` Waiman Long
  2022-12-22 22:49 ` [PATCH v4 2/2] sched: Use kfree_rcu() in do_set_cpus_allowed() Waiman Long
  1 sibling, 0 replies; 3+ messages in thread
From: Waiman Long @ 2022-12-22 22:49 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider
  Cc: Phil Auld, Wenjie Li, David Wang 王标, Quentin Perret,
	Will Deacon, linux-kernel, Waiman Long, stable

Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
restricted on asymmetric systems"), the setting and clearing of
user_cpus_ptr are done under pi_lock for arm64 architecture. However,
dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
protection. Since sched_setaffinity() can be invoked from another
process, the process being modified may be undergoing fork() at
the same time.  When racing with the clearing of user_cpus_ptr in
__set_cpus_allowed_ptr_locked(), it can lead to user-after-free and
possibly double-free in arm64 kernel.

Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
cpumask") fixes this problem as user_cpus_ptr, once set, will never
be cleared in a task's lifetime. However, this bug was re-introduced
in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
do_set_cpus_allowed(). This time, it will affect all arches.

Fix this bug by always clearing the user_cpus_ptr of the newly
cloned/forked task before the copying process starts and check the
user_cpus_ptr state of the source task under pi_lock.

Note to stable, this patch won't be applicable to stable releases.
Just copy the new dup_user_cpus_ptr() function over.

Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
CC: stable@vger.kernel.org
Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/sched/core.c | 34 +++++++++++++++++++++++++++++-----
 1 file changed, 29 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 25b582b6ee5f..b93d030b9fd5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2612,19 +2612,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
 		      int node)
 {
+	cpumask_t *user_mask;
 	unsigned long flags;
 
-	if (!src->user_cpus_ptr)
+	/*
+	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
+	 * may differ by now due to racing.
+	 */
+	dst->user_cpus_ptr = NULL;
+
+	/*
+	 * This check is racy and losing the race is a valid situation.
+	 * It is not worth the extra overhead of taking the pi_lock on
+	 * every fork/clone.
+	 */
+	if (data_race(!src->user_cpus_ptr))
 		return 0;
 
-	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
-	if (!dst->user_cpus_ptr)
+	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+	if (!user_mask)
 		return -ENOMEM;
 
-	/* Use pi_lock to protect content of user_cpus_ptr */
+	/*
+	 * Use pi_lock to protect content of user_cpus_ptr
+	 *
+	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
+	 * do_set_cpus_allowed().
+	 */
 	raw_spin_lock_irqsave(&src->pi_lock, flags);
-	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+	if (src->user_cpus_ptr) {
+		swap(dst->user_cpus_ptr, user_mask);
+		cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+	}
 	raw_spin_unlock_irqrestore(&src->pi_lock, flags);
+
+	if (unlikely(user_mask))
+		kfree(user_mask);
+
 	return 0;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH v4 2/2] sched: Use kfree_rcu() in do_set_cpus_allowed()
  2022-12-22 22:49 [PATCH v4 0/2] sched: Fix dup_user_cpus_ptr() & do_set_cpus_allowed() bugs Waiman Long
  2022-12-22 22:49 ` [PATCH v4 1/2] sched: Fix use-after-free bug in dup_user_cpus_ptr() Waiman Long
@ 2022-12-22 22:49 ` Waiman Long
  1 sibling, 0 replies; 3+ messages in thread
From: Waiman Long @ 2022-12-22 22:49 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider
  Cc: Phil Auld, Wenjie Li, David Wang 王标, Quentin Perret,
	Will Deacon, linux-kernel, Waiman Long

Commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
do_set_cpus_allowed()") may call kfree() if user_cpus_ptr was previously
set. Unfortunately, some of the callers of do_set_cpus_allowed()
may have pi_lock held when calling it. So the following splats may be
printed especially when running with a PREEMPT_RT kernel:

   WARNING: possible circular locking dependency detected
   BUG: sleeping function called from invalid context

To avoid these problems, kfree_rcu() is used instead. An internal
cpumask_rcuhead union is created for the sole purpose of facilitating
the use of kfree_rcu() to free the cpumask.

Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/sched/core.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b93d030b9fd5..c2df236fd8c1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2604,9 +2604,19 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 		.user_mask = NULL,
 		.flags     = SCA_USER,	/* clear the user requested mask */
 	};
+	union cpumask_rcuhead {
+		cpumask_t cpumask;
+		struct rcu_head rcu;
+	};
 
 	__do_set_cpus_allowed(p, &ac);
-	kfree(ac.user_mask);
+
+	/*
+	 * Because this is called with p->pi_lock held, it is not possible
+	 * to use kfree() here (when PREEMPT_RT=y), therefore punt to using
+	 * kfree_rcu().
+	 */
+	kfree_rcu((union cpumask_rcuhead *)ac.user_mask, rcu);
 }
 
 int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
@@ -2614,6 +2624,7 @@ int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
 {
 	cpumask_t *user_mask;
 	unsigned long flags;
+	int size;
 
 	/*
 	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
@@ -2629,7 +2640,11 @@ int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
 	if (data_race(!src->user_cpus_ptr))
 		return 0;
 
-	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+	/*
+	 * See do_set_cpus_allowed() for the rcu_head usage.
+	 */
+	size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
+	user_mask = kmalloc_node(size, GFP_KERNEL, node);
 	if (!user_mask)
 		return -ENOMEM;
 
@@ -8230,7 +8245,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 	struct affinity_context ac;
 	struct cpumask *user_mask;
 	struct task_struct *p;
-	int retval;
+	int retval, size;
 
 	rcu_read_lock();
 
@@ -8263,7 +8278,11 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 	if (retval)
 		goto out_put_task;
 
-	user_mask = kmalloc(cpumask_size(), GFP_KERNEL);
+	/*
+	 * See do_set_cpus_allowed() for the rcu_head usage.
+	 */
+	size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
+	user_mask = kmalloc(size, GFP_KERNEL);
 	if (!user_mask) {
 		retval = -ENOMEM;
 		goto out_put_task;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-12-22 22:50 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-12-22 22:49 [PATCH v4 0/2] sched: Fix dup_user_cpus_ptr() & do_set_cpus_allowed() bugs Waiman Long
2022-12-22 22:49 ` [PATCH v4 1/2] sched: Fix use-after-free bug in dup_user_cpus_ptr() Waiman Long
2022-12-22 22:49 ` [PATCH v4 2/2] sched: Use kfree_rcu() in do_set_cpus_allowed() Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox