* [PATCH 2/4] task_work: Introduce task_work_cancel() again
2024-03-29 23:58 [PATCH 0/4] perf: Fix leaked events when sigtrap = 1 Frederic Weisbecker
@ 2024-03-29 23:58 ` Frederic Weisbecker
2024-03-30 21:10 ` kernel test robot
0 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2024-03-29 23:58 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter
Re-introduce task_work_cancel(), this time to cancel an actual callback
and not *any* callback pointing to a given function. This is going to be
needed for perf events event freeing.
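To illustrate the distinction drawn above, here is a minimal userspace sketch; it is not part of the patch and not kernel code. The names `work_queue`, `queue_add()`, `queue_cancel_match()`, `queue_cancel_func()` and `queue_cancel()` are hypothetical stand-ins, and the list is a plain unlocked LIFO, whereas the real task_work list is manipulated with locking and cmpxchg. It only shows why matching on the function can remove a different callback than the one the caller owns, while matching on the pointer cannot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct callback_head. */
struct callback_head {
	struct callback_head *next;
	void (*func)(struct callback_head *);
};

typedef void (*work_fn)(struct callback_head *);

/* Hypothetical stand-in for a task's pending work list (LIFO). */
struct work_queue {
	struct callback_head *head;
};

static void queue_add(struct work_queue *q, struct callback_head *cb, work_fn func)
{
	cb->func = func;
	cb->next = q->head;
	q->head = cb;
}

/* Unlink and return the first entry accepted by match(), or NULL. */
static struct callback_head *
queue_cancel_match(struct work_queue *q,
		   bool (*match)(struct callback_head *, void *), void *data)
{
	struct callback_head **pprev = &q->head;
	struct callback_head *cb;

	while ((cb = *pprev) != NULL) {
		if (match(cb, data)) {
			*pprev = cb->next;
			cb->next = NULL;
			return cb;
		}
		pprev = &cb->next;
	}
	return NULL;
}

/* Matches any entry whose handler is the given function. */
static bool match_func(struct callback_head *cb, void *data)
{
	return cb->func == *(work_fn *)data;
}

/* Matches exactly one entry: the one at this address. */
static bool match_cb(struct callback_head *cb, void *data)
{
	return cb == data;
}

/* Analogue of task_work_cancel_func(): cancel by handler function. */
static struct callback_head *queue_cancel_func(struct work_queue *q, work_fn func)
{
	return queue_cancel_match(q, match_func, &func);
}

/* Analogue of the new task_work_cancel(): cancel one specific callback. */
static bool queue_cancel(struct work_queue *q, struct callback_head *cb)
{
	return queue_cancel_match(q, match_cb, cb) == cb;
}

/* Dummy handler shared by several queued callbacks. */
static void handler(struct callback_head *cb)
{
	(void)cb;
}
```

With two callbacks sharing `handler`, `queue_cancel_func()` removes whichever one was queued most recently, whereas `queue_cancel()` removes only the callback whose address was passed and reports whether it was queued at all.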
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
include/linux/task_work.h | 1 +
kernel/task_work.c | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 89ee2cbf044b..58e42ef59580 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -37,6 +37,7 @@ int task_work_add(struct task_struct *task, struct callback_head *twork,
struct callback_head *task_work_cancel_match(struct task_struct *task,
bool (*match)(struct callback_head *, void *data), void *data);
struct callback_head *task_work_cancel_func(struct task_struct *, task_work_func_t);
+bool task_work_cancel(struct task_struct *, struct callback_head *twork);
void task_work_run(void);
static inline void exit_task_work(struct task_struct *task)
diff --git a/kernel/task_work.c b/kernel/task_work.c
index c1b4d3ba2590..9e85ac7632ae 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -136,6 +136,30 @@ task_work_cancel_func(struct task_struct *task, task_work_func_t func)
return task_work_cancel_match(task, task_work_func_match, func);
}
+static bool task_work_match(struct callback_head *cb, void *data)
+{
+ return cb == data;
+}
+
+/**
+ * task_work_cancel - cancel a pending work added by task_work_add()
+ * @task: the task which should execute the work
+ * @func: the work to remove if queued
+ *
+ * Remove a callback from a task's queue if queued.
+ *
+ * RETURNS:
+ * True if the callback was queued and got cancelled, false otherwise.
+ */
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb)
+{
+ struct callback_head *ret;
+
+ ret = task_work_cancel_match(task, task_work_match, cb);
+
+ return ret == cb;
+}
+
/**
* task_work_run - execute the works added by task_work_add()
*
--
2.44.0
* Re: [PATCH 2/4] task_work: Introduce task_work_cancel() again
2024-03-29 23:58 ` [PATCH 2/4] task_work: Introduce task_work_cancel() again Frederic Weisbecker
@ 2024-03-30 21:10 ` kernel test robot
0 siblings, 0 replies; 18+ messages in thread
From: kernel test robot @ 2024-03-30 21:10 UTC (permalink / raw)
To: Frederic Weisbecker, LKML
Cc: oe-kbuild-all, Frederic Weisbecker, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter
Hi Frederic,
kernel test robot noticed the following build warnings:
[auto build test WARNING on perf-tools-next/perf-tools-next]
[also build test WARNING on tip/perf/core perf-tools/perf-tools linus/master acme/perf/core v6.9-rc1 next-20240328]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Frederic-Weisbecker/task_work-s-task_work_cancel-task_work_cancel_func/20240330-080207
base: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git perf-tools-next
patch link: https://lore.kernel.org/r/20240329235812.18917-3-frederic%40kernel.org
patch subject: [PATCH 2/4] task_work: Introduce task_work_cancel() again
config: nios2-randconfig-r071-20240330 (https://download.01.org/0day-ci/archive/20240331/202403310406.TPrIela8-lkp@intel.com/config)
compiler: nios2-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240331/202403310406.TPrIela8-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202403310406.TPrIela8-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> kernel/task_work.c:155: warning: Function parameter or struct member 'cb' not described in 'task_work_cancel'
>> kernel/task_work.c:155: warning: Excess function parameter 'func' description in 'task_work_cancel'
vim +155 kernel/task_work.c
143
144 /**
145 * task_work_cancel - cancel a pending work added by task_work_add()
146 * @task: the task which should execute the work
147 * @func: the work to remove if queued
148 *
149 * Remove a callback from a task's queue if queued.
150 *
151 * RETURNS:
152 * True if the callback was queued and got cancelled, false otherwise.
153 */
154 bool task_work_cancel(struct task_struct *task, struct callback_head *cb)
> 155 {
156 struct callback_head *ret;
157
158 ret = task_work_cancel_match(task, task_work_match, cb);
159
160 return ret == cb;
161 }
162
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* [PATCH 2/4] task_work: Introduce task_work_cancel() again
2024-05-15 14:43 [PATCH 0/4 v2] perf: Fix leaked sigtrap events Frederic Weisbecker
@ 2024-05-15 14:43 ` Frederic Weisbecker
0 siblings, 0 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2024-05-15 14:43 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Sebastian Andrzej Siewior
Re-introduce task_work_cancel(), this time to cancel an actual callback
and not *any* callback pointing to a given function. This is going to be
needed for perf events event freeing.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
include/linux/task_work.h | 1 +
kernel/task_work.c | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 23ab01ae185e..26b8a47f41fc 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -31,6 +31,7 @@ int task_work_add(struct task_struct *task, struct callback_head *twork,
struct callback_head *task_work_cancel_match(struct task_struct *task,
bool (*match)(struct callback_head *, void *data), void *data);
struct callback_head *task_work_cancel_func(struct task_struct *, task_work_func_t);
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb);
void task_work_run(void);
static inline void exit_task_work(struct task_struct *task)
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 54ac24059daa..2134ac8057a9 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -136,6 +136,30 @@ task_work_cancel_func(struct task_struct *task, task_work_func_t func)
return task_work_cancel_match(task, task_work_func_match, func);
}
+static bool task_work_match(struct callback_head *cb, void *data)
+{
+ return cb == data;
+}
+
+/**
+ * task_work_cancel - cancel a pending work added by task_work_add()
+ * @task: the task which should execute the work
+ * @cb: the callback to remove if queued
+ *
+ * Remove a callback from a task's queue if queued.
+ *
+ * RETURNS:
+ * True if the callback was queued and got cancelled, false otherwise.
+ */
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb)
+{
+ struct callback_head *ret;
+
+ ret = task_work_cancel_match(task, task_work_match, cb);
+
+ return ret == cb;
+}
+
/**
* task_work_run - execute the works added by task_work_add()
*
--
2.44.0
* [PATCH 2/4] task_work: Introduce task_work_cancel() again
2024-05-16 14:09 [PATCH 0/4 v3] " Frederic Weisbecker
@ 2024-05-16 14:09 ` Frederic Weisbecker
0 siblings, 0 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2024-05-16 14:09 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Sebastian Andrzej Siewior
Re-introduce task_work_cancel(), this time to cancel an actual callback
and not *any* callback pointing to a given function. This is going to be
needed for perf events event freeing.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
include/linux/task_work.h | 1 +
kernel/task_work.c | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 23ab01ae185e..26b8a47f41fc 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -31,6 +31,7 @@ int task_work_add(struct task_struct *task, struct callback_head *twork,
struct callback_head *task_work_cancel_match(struct task_struct *task,
bool (*match)(struct callback_head *, void *data), void *data);
struct callback_head *task_work_cancel_func(struct task_struct *, task_work_func_t);
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb);
void task_work_run(void);
static inline void exit_task_work(struct task_struct *task)
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 54ac24059daa..2134ac8057a9 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -136,6 +136,30 @@ task_work_cancel_func(struct task_struct *task, task_work_func_t func)
return task_work_cancel_match(task, task_work_func_match, func);
}
+static bool task_work_match(struct callback_head *cb, void *data)
+{
+ return cb == data;
+}
+
+/**
+ * task_work_cancel - cancel a pending work added by task_work_add()
+ * @task: the task which should execute the work
+ * @cb: the callback to remove if queued
+ *
+ * Remove a callback from a task's queue if queued.
+ *
+ * RETURNS:
+ * True if the callback was queued and got cancelled, false otherwise.
+ */
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb)
+{
+ struct callback_head *ret;
+
+ ret = task_work_cancel_match(task, task_work_match, cb);
+
+ return ret == cb;
+}
+
/**
* task_work_run - execute the works added by task_work_add()
*
--
2.34.1
* [PATCH 0/4 v4] perf: Fix leaked sigtrap events
@ 2024-06-21 9:15 Frederic Weisbecker
2024-06-21 9:15 ` [PATCH 1/4] task_work: s/task_work_cancel()/task_work_cancel_func()/ Frederic Weisbecker
` (4 more replies)
0 siblings, 5 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2024-06-21 9:15 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Sebastian Andrzej Siewior
Hi,
This is essentially a resend to remind the patchset on people's pile,
using a small typo fix on patch 3 (thanks Sebastian) as an excuse.
Thanks.
Frederic Weisbecker (4):
task_work: s/task_work_cancel()/task_work_cancel_func()/
task_work: Introduce task_work_cancel() again
perf: Fix event leak upon exit
perf: Fix event leak upon exec and file release
include/linux/perf_event.h | 1 +
include/linux/task_work.h | 3 ++-
kernel/events/core.c | 49 +++++++++++++++++++++++++++++---------
kernel/irq/manage.c | 2 +-
kernel/task_work.c | 34 ++++++++++++++++++++++----
security/keys/keyctl.c | 2 +-
6 files changed, 72 insertions(+), 19 deletions(-)
--
2.45.2
* [PATCH 1/4] task_work: s/task_work_cancel()/task_work_cancel_func()/
2024-06-21 9:15 [PATCH 0/4 v4] perf: Fix leaked sigtrap events Frederic Weisbecker
@ 2024-06-21 9:15 ` Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
2024-07-09 11:42 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
2024-06-21 9:15 ` [PATCH 2/4] task_work: Introduce task_work_cancel() again Frederic Weisbecker
` (3 subsequent siblings)
4 siblings, 2 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2024-06-21 9:15 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Sebastian Andrzej Siewior
A proper task_work_cancel() API that actually cancels a callback and not
*any* callback pointing to a given function is going to be needed for
perf events event freeing. Do the appropriate rename to prepare for
that.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
include/linux/task_work.h | 2 +-
kernel/irq/manage.c | 2 +-
kernel/task_work.c | 10 +++++-----
security/keys/keyctl.c | 2 +-
4 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 795ef5a68429..23ab01ae185e 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -30,7 +30,7 @@ int task_work_add(struct task_struct *task, struct callback_head *twork,
struct callback_head *task_work_cancel_match(struct task_struct *task,
bool (*match)(struct callback_head *, void *data), void *data);
-struct callback_head *task_work_cancel(struct task_struct *, task_work_func_t);
+struct callback_head *task_work_cancel_func(struct task_struct *, task_work_func_t);
void task_work_run(void);
static inline void exit_task_work(struct task_struct *task)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 71b0fc2d0aea..dd53298ef1a5 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1337,7 +1337,7 @@ static int irq_thread(void *data)
* synchronize_hardirq(). So neither IRQTF_RUNTHREAD nor the
* oneshot mask bit can be set.
*/
- task_work_cancel(current, irq_thread_dtor);
+ task_work_cancel_func(current, irq_thread_dtor);
return 0;
}
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 95a7e1b7f1da..54ac24059daa 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -120,9 +120,9 @@ static bool task_work_func_match(struct callback_head *cb, void *data)
}
/**
- * task_work_cancel - cancel a pending work added by task_work_add()
- * @task: the task which should execute the work
- * @func: identifies the work to remove
+ * task_work_cancel_func - cancel a pending work matching a function added by task_work_add()
+ * @task: the task which should execute the func's work
+ * @func: identifies the func to match with a work to remove
*
* Find the last queued pending work with ->func == @func and remove
* it from queue.
@@ -131,7 +131,7 @@ static bool task_work_func_match(struct callback_head *cb, void *data)
* The found work or NULL if not found.
*/
struct callback_head *
-task_work_cancel(struct task_struct *task, task_work_func_t func)
+task_work_cancel_func(struct task_struct *task, task_work_func_t func)
{
return task_work_cancel_match(task, task_work_func_match, func);
}
@@ -168,7 +168,7 @@ void task_work_run(void)
if (!work)
break;
/*
- * Synchronize with task_work_cancel(). It can not remove
+ * Synchronize with task_work_cancel_match(). It can not remove
* the first entry == work, cmpxchg(task_works) must fail.
* But it can remove another entry from the ->next list.
*/
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 4bc3e9398ee3..ab927a142f51 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1694,7 +1694,7 @@ long keyctl_session_to_parent(void)
goto unlock;
/* cancel an already pending keyring replacement */
- oldwork = task_work_cancel(parent, key_change_session_keyring);
+ oldwork = task_work_cancel_func(parent, key_change_session_keyring);
/* the replacement session keyring is applied just prior to userspace
* restarting */
--
2.45.2
* [PATCH 2/4] task_work: Introduce task_work_cancel() again
2024-06-21 9:15 [PATCH 0/4 v4] perf: Fix leaked sigtrap events Frederic Weisbecker
2024-06-21 9:15 ` [PATCH 1/4] task_work: s/task_work_cancel()/task_work_cancel_func()/ Frederic Weisbecker
@ 2024-06-21 9:15 ` Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
2024-07-09 11:42 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
2024-06-21 9:16 ` [PATCH 3/4] perf: Fix event leak upon exit Frederic Weisbecker
` (2 subsequent siblings)
4 siblings, 2 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2024-06-21 9:15 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Sebastian Andrzej Siewior
Re-introduce task_work_cancel(), this time to cancel an actual callback
and not *any* callback pointing to a given function. This is going to be
needed for perf events event freeing.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
include/linux/task_work.h | 1 +
kernel/task_work.c | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 23ab01ae185e..26b8a47f41fc 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -31,6 +31,7 @@ int task_work_add(struct task_struct *task, struct callback_head *twork,
struct callback_head *task_work_cancel_match(struct task_struct *task,
bool (*match)(struct callback_head *, void *data), void *data);
struct callback_head *task_work_cancel_func(struct task_struct *, task_work_func_t);
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb);
void task_work_run(void);
static inline void exit_task_work(struct task_struct *task)
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 54ac24059daa..2134ac8057a9 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -136,6 +136,30 @@ task_work_cancel_func(struct task_struct *task, task_work_func_t func)
return task_work_cancel_match(task, task_work_func_match, func);
}
+static bool task_work_match(struct callback_head *cb, void *data)
+{
+ return cb == data;
+}
+
+/**
+ * task_work_cancel - cancel a pending work added by task_work_add()
+ * @task: the task which should execute the work
+ * @cb: the callback to remove if queued
+ *
+ * Remove a callback from a task's queue if queued.
+ *
+ * RETURNS:
+ * True if the callback was queued and got cancelled, false otherwise.
+ */
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb)
+{
+ struct callback_head *ret;
+
+ ret = task_work_cancel_match(task, task_work_match, cb);
+
+ return ret == cb;
+}
+
/**
* task_work_run - execute the works added by task_work_add()
*
--
2.45.2
* [PATCH 3/4] perf: Fix event leak upon exit
2024-06-21 9:15 [PATCH 0/4 v4] perf: Fix leaked sigtrap events Frederic Weisbecker
2024-06-21 9:15 ` [PATCH 1/4] task_work: s/task_work_cancel()/task_work_cancel_func()/ Frederic Weisbecker
2024-06-21 9:15 ` [PATCH 2/4] task_work: Introduce task_work_cancel() again Frederic Weisbecker
@ 2024-06-21 9:16 ` Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
2024-07-09 11:42 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
2024-06-21 9:16 ` [PATCH 4/4] perf: Fix event leak upon exec and file release Frederic Weisbecker
2024-06-25 8:43 ` [PATCH 0/4 v4] perf: Fix leaked sigtrap events Peter Zijlstra
4 siblings, 2 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2024-06-21 9:16 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Sebastian Andrzej Siewior
When a task is scheduled out, pending sigtrap deliveries are deferred
to the target task upon resume to userspace via task_work.
However failures while adding an event's callback to the task_work
engine are ignored. And since the last call for events exit happens
after task work is eventually closed, there is a small window during
which pending sigtrap can be queued though ignored, leaking the event
refcount addition such as in the following scenario:
    TASK A
    -----
    do_exit()
       exit_task_work(tsk);
       <IRQ>
       perf_event_overflow()
          event->pending_sigtrap = pending_id;
          irq_work_queue(&event->pending_irq);
       </IRQ>
    =========> PREEMPTION: TASK A -> TASK B
       event_sched_out()
          event->pending_sigtrap = 0;
          atomic_long_inc_not_zero(&event->refcount)
          // FAILS: task work has exited
          task_work_add(&event->pending_task)
       [...]
       <IRQ WORK>
       perf_pending_irq()
          // early return: event->oncpu = -1
       </IRQ WORK>
       [...]
    =========> TASK B -> TASK A
       perf_event_exit_task(tsk)
          perf_event_exit_event()
             free_event()
                WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
                // leak event due to unexpected refcount == 2
As a result the event is never released while the task exits.
Fix this with appropriate task_work_add()'s error handling.
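The effect of the missing error handling can be modelled in userspace. The sketch below is an illustration only, under stated assumptions: `model_task`, `model_event`, `model_task_work_add()`, `sched_out_before()` and `sched_out_after()` are hypothetical names, the `state != PERF_EVENT_STATE_OFF` check is left out, and plain integers replace the kernel's atomic and local counters. It contrasts taking the reference unconditionally with taking it only once queueing has succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a task; not the kernel's task_struct. */
struct model_task {
	bool work_exited;	/* true once the task's work list is closed */
	int queued;		/* number of callbacks accepted so far */
};

/* Hypothetical model of the few perf_event fields involved. */
struct model_event {
	long refcount;
	int pending_sigtrap;
	int pending_work;
	long nr_pending;
};

/* Like task_work_add(): 0 on success, -ESRCH once work has exited. */
static int model_task_work_add(struct model_task *t)
{
	if (t->work_exited)
		return -ESRCH;
	t->queued++;
	return 0;
}

/* Behaviour before the fix: reference taken, queueing failure ignored. */
static void sched_out_before(struct model_event *e, struct model_task *t)
{
	if (!e->pending_sigtrap)
		return;
	e->pending_sigtrap = 0;
	if (!e->pending_work) {
		e->pending_work = 1;
		e->refcount++;
		(void)model_task_work_add(t);	/* may fail silently */
	} else {
		e->nr_pending--;
	}
}

/* Behaviour after the fix: only account for work that was queued. */
static void sched_out_after(struct model_event *e, struct model_task *t)
{
	if (!e->pending_sigtrap)
		return;
	e->pending_sigtrap = 0;
	if (!e->pending_work && !model_task_work_add(t)) {
		e->refcount++;
		e->pending_work = 1;
	} else {
		e->nr_pending--;
	}
}
```

When the task's work list is already closed, the first variant leaves the model event with an extra reference and `pending_work` set although nothing will ever run; the second leaves the refcount untouched and drops the pending count instead.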
Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
kernel/events/core.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8f908f077935..7c3218d31d1d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2284,18 +2284,15 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
}
if (event->pending_sigtrap) {
- bool dec = true;
-
event->pending_sigtrap = 0;
if (state != PERF_EVENT_STATE_OFF &&
- !event->pending_work) {
- event->pending_work = 1;
- dec = false;
+ !event->pending_work &&
+ !task_work_add(current, &event->pending_task, TWA_RESUME)) {
WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
- task_work_add(current, &event->pending_task, TWA_RESUME);
- }
- if (dec)
+ event->pending_work = 1;
+ } else {
local_dec(&event->ctx->nr_pending);
+ }
}
perf_event_set_state(event, state);
--
2.45.2
* [PATCH 4/4] perf: Fix event leak upon exec and file release
2024-06-21 9:15 [PATCH 0/4 v4] perf: Fix leaked sigtrap events Frederic Weisbecker
` (2 preceding siblings ...)
2024-06-21 9:16 ` [PATCH 3/4] perf: Fix event leak upon exit Frederic Weisbecker
@ 2024-06-21 9:16 ` Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
2024-07-09 11:41 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
2024-06-25 8:43 ` [PATCH 0/4 v4] perf: Fix leaked sigtrap events Peter Zijlstra
4 siblings, 2 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2024-06-21 9:16 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Sebastian Andrzej Siewior
The perf pending task work is never waited for when the matching event
is released. In the case of a child event, released via free_event()
directly, this can potentially result in a leaked event, such as in the
following scenario that doesn't even require a weak IRQ work
implementation to trigger:
    schedule()
       prepare_task_switch()
    =======> <NMI>
          perf_event_overflow()
             event->pending_sigtrap = ...
             irq_work_queue(&event->pending_irq)
    <======= </NMI>
          perf_event_task_sched_out()
             event_sched_out()
                event->pending_sigtrap = 0;
                atomic_long_inc_not_zero(&event->refcount)
                task_work_add(&event->pending_task)
       finish_lock_switch()
    =======> <IRQ>
       perf_pending_irq()
          //do nothing, rely on pending task work
    <======= </IRQ>
    begin_new_exec()
       perf_event_exit_task()
          perf_event_exit_event()
             // If is child event
             free_event()
                WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
                // event is leaked
Similar scenarios can also happen with perf_event_remove_on_exec() or
simply against concurrent perf_event_release().
Fix this by synchronizing against the possibly remaining pending task
work while freeing the event, just as is done with remaining pending
IRQ work. This means that the pending task callback neither needs to nor
should hold a reference to the event, which would prevent it from ever
being freed.
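The decision made while freeing the event can be sketched as a small single-threaded userspace model. This is an illustration under stated assumptions, not the patch itself: `cb`, `task_model`, `event_model`, `cancel_cb()` and `pending_task_sync()` are hypothetical names, and the blocking wait (done with rcuwait_wait_event() in the patch) is not modelled; the sketch merely reports that a wait would be required.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a queued callback. */
struct cb {
	struct cb *next;
};

/* Hypothetical model of a task's pending work list. */
struct task_model {
	struct cb *works;
};

/* Hypothetical model of the perf_event fields involved. */
struct event_model {
	struct cb pending_task;
	int pending_work;
	long nr_pending;
};

enum sync_result {
	SYNC_NOTHING_PENDING,	/* no task work outstanding */
	SYNC_CANCELLED,		/* work was on our own queue and got removed */
	SYNC_MUST_WAIT,		/* work sits elsewhere: caller has to wait */
};

/* Unlink target from t's list; true if it was queued there. */
static bool cancel_cb(struct task_model *t, struct cb *target)
{
	struct cb **pprev = &t->works;

	while (*pprev) {
		if (*pprev == target) {
			*pprev = target->next;
			target->next = NULL;
			return true;
		}
		pprev = &(*pprev)->next;
	}
	return false;
}

/* Models the choice between cancelling and waiting at free time. */
static enum sync_result pending_task_sync(struct event_model *e,
					   struct task_model *current_task)
{
	if (!e->pending_work)
		return SYNC_NOTHING_PENDING;

	/* Work queued on the current task can never run while we wait. */
	if (cancel_cb(current_task, &e->pending_task)) {
		e->pending_work = 0;
		e->nr_pending--;
		return SYNC_CANCELLED;
	}

	/* Otherwise the callback runs elsewhere and clears pending_work. */
	return SYNC_MUST_WAIT;
}
```

Cancelling by the exact callback address matters here: the handler function is shared by every event, so cancelling by function could remove another event's pending work.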
Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
include/linux/perf_event.h | 1 +
kernel/events/core.c | 38 ++++++++++++++++++++++++++++++++++----
2 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a5304ae8c654..393fb13733b0 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -786,6 +786,7 @@ struct perf_event {
struct irq_work pending_irq;
struct callback_head pending_task;
unsigned int pending_work;
+ struct rcuwait pending_work_wait;
atomic_t event_limit;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7c3218d31d1d..586d4f367624 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2288,7 +2288,6 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
if (state != PERF_EVENT_STATE_OFF &&
!event->pending_work &&
!task_work_add(current, &event->pending_task, TWA_RESUME)) {
- WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
event->pending_work = 1;
} else {
local_dec(&event->ctx->nr_pending);
@@ -5203,9 +5202,35 @@ static bool exclusive_event_installable(struct perf_event *event,
static void perf_addr_filters_splice(struct perf_event *event,
struct list_head *head);
+static void perf_pending_task_sync(struct perf_event *event)
+{
+ struct callback_head *head = &event->pending_task;
+
+ if (!event->pending_work)
+ return;
+ /*
+ * If the task is queued to the current task's queue, we
+ * obviously can't wait for it to complete. Simply cancel it.
+ */
+ if (task_work_cancel(current, head)) {
+ event->pending_work = 0;
+ local_dec(&event->ctx->nr_pending);
+ return;
+ }
+
+ /*
+ * All accesses related to the event are within the same
+ * non-preemptible section in perf_pending_task(). The RCU
+ * grace period before the event is freed will make sure all
+ * those accesses are complete by then.
+ */
+ rcuwait_wait_event(&event->pending_work_wait, !event->pending_work, TASK_UNINTERRUPTIBLE);
+}
+
static void _free_event(struct perf_event *event)
{
irq_work_sync(&event->pending_irq);
+ perf_pending_task_sync(event);
unaccount_event(event);
@@ -6828,24 +6853,28 @@ static void perf_pending_task(struct callback_head *head)
struct perf_event *event = container_of(head, struct perf_event, pending_task);
int rctx;
+ /*
+ * All accesses to the event must belong to the same implicit RCU read-side
+ * critical section as the ->pending_work reset. See comment in
+ * perf_pending_task_sync().
+ */
+ preempt_disable_notrace();
/*
* If we 'fail' here, that's OK, it means recursion is already disabled
* and we won't recurse 'further'.
*/
- preempt_disable_notrace();
rctx = perf_swevent_get_recursion_context();
if (event->pending_work) {
event->pending_work = 0;
perf_sigtrap(event);
local_dec(&event->ctx->nr_pending);
+ rcuwait_wake_up(&event->pending_work_wait);
}
if (rctx >= 0)
perf_swevent_put_recursion_context(rctx);
preempt_enable_notrace();
-
- put_event(event);
}
#ifdef CONFIG_GUEST_PERF_EVENTS
@@ -11959,6 +11988,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
init_waitqueue_head(&event->waitq);
init_irq_work(&event->pending_irq, perf_pending_irq);
init_task_work(&event->pending_task, perf_pending_task);
+ rcuwait_init(&event->pending_work_wait);
mutex_init(&event->mmap_mutex);
raw_spin_lock_init(&event->addr_filters.lock);
--
2.45.2
* Re: [PATCH 0/4 v4] perf: Fix leaked sigtrap events
2024-06-21 9:15 [PATCH 0/4 v4] perf: Fix leaked sigtrap events Frederic Weisbecker
` (3 preceding siblings ...)
2024-06-21 9:16 ` [PATCH 4/4] perf: Fix event leak upon exec and file release Frederic Weisbecker
@ 2024-06-25 8:43 ` Peter Zijlstra
4 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2024-06-25 8:43 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, Sebastian Andrzej Siewior
On Fri, Jun 21, 2024 at 11:15:57AM +0200, Frederic Weisbecker wrote:
> Hi,
>
> This is essentially a resend to remind the patchset on people's pile,
> using a small typo fix on patch 3 (thanks Sebastian) as an excuse.
Poke worked -- I remember going through this a while ago and only having
small niggles. So this one must be good :-)
I've queued the thing for perf/urgent.
* [tip: perf/urgent] perf: Fix event leak upon exit
2024-06-21 9:16 ` [PATCH 3/4] perf: Fix event leak upon exit Frederic Weisbecker
@ 2024-07-01 7:14 ` tip-bot2 for Frederic Weisbecker
2024-07-09 11:42 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2024-07-01 7:14 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: 73caafba7021a97c22bff58c3d123228e03cdc46
Gitweb: https://git.kernel.org/tip/73caafba7021a97c22bff58c3d123228e03cdc46
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Fri, 21 Jun 2024 11:16:00 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 25 Jun 2024 10:43:38 +02:00
perf: Fix event leak upon exit
When a task is scheduled out, pending sigtrap deliveries are deferred
to the target task upon resume to userspace via task_work.
However failures while adding an event's callback to the task_work
engine are ignored. And since the last call for events exit happens
after task work is eventually closed, there is a small window during
which pending sigtrap can be queued though ignored, leaking the event
refcount addition such as in the following scenario:
    TASK A
    -----
    do_exit()
       exit_task_work(tsk);
       <IRQ>
       perf_event_overflow()
          event->pending_sigtrap = pending_id;
          irq_work_queue(&event->pending_irq);
       </IRQ>
    =========> PREEMPTION: TASK A -> TASK B
       event_sched_out()
          event->pending_sigtrap = 0;
          atomic_long_inc_not_zero(&event->refcount)
          // FAILS: task work has exited
          task_work_add(&event->pending_task)
       [...]
       <IRQ WORK>
       perf_pending_irq()
          // early return: event->oncpu = -1
       </IRQ WORK>
       [...]
    =========> TASK B -> TASK A
       perf_event_exit_task(tsk)
          perf_event_exit_event()
             free_event()
                WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
                // leak event due to unexpected refcount == 2
As a result the event is never released while the task exits.
Fix this with appropriate task_work_add()'s error handling.
Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240621091601.18227-4-frederic@kernel.org
---
kernel/events/core.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8f908f0..7c3218d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2284,18 +2284,15 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
}
if (event->pending_sigtrap) {
- bool dec = true;
-
event->pending_sigtrap = 0;
if (state != PERF_EVENT_STATE_OFF &&
- !event->pending_work) {
- event->pending_work = 1;
- dec = false;
+ !event->pending_work &&
+ !task_work_add(current, &event->pending_task, TWA_RESUME)) {
WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
- task_work_add(current, &event->pending_task, TWA_RESUME);
- }
- if (dec)
+ event->pending_work = 1;
+ } else {
local_dec(&event->ctx->nr_pending);
+ }
}
perf_event_set_state(event, state);
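The error-handling pattern this patch introduces can be sketched in userspace C. This is only a model under stated assumptions, not the kernel code: `sched_event_model`, `model_task_work_add()` and `model_sched_out()` are hypothetical names, the `state != PERF_EVENT_STATE_OFF` check is left out, and the failure of task_work_add() on an exited task is represented by a plain `-ESRCH` return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the fields touched by event_sched_out(). */
struct sched_event_model {
	int pending_sigtrap;
	int pending_work;
	long refcount;
	int nr_pending;
};

/* Stand-in for task_work_add(): fails once task work has been closed. */
static int model_task_work_add(bool task_work_exited)
{
	return task_work_exited ? -ESRCH : 0;
}

/*
 * Mirrors the fixed logic: the reference is taken and pending_work is set
 * only when queueing succeeded; otherwise the pending count is dropped.
 */
static void model_sched_out(struct sched_event_model *e, bool task_work_exited)
{
	if (!e->pending_sigtrap)
		return;

	e->pending_sigtrap = 0;
	if (!e->pending_work && !model_task_work_add(task_work_exited)) {
		e->refcount++;
		e->pending_work = 1;
	} else {
		e->nr_pending--;
	}
}
```

In this model, a failed queueing attempt leaves `refcount` untouched, which is the property the leak scenario above depends on.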
* [tip: perf/urgent] task_work: Introduce task_work_cancel() again
2024-06-21 9:15 ` [PATCH 2/4] task_work: Introduce task_work_cancel() again Frederic Weisbecker
@ 2024-07-01 7:14 ` tip-bot2 for Frederic Weisbecker
2024-07-09 11:42 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2024-07-01 7:14 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: 74e45974c966fdecafb7149edb08a2f210e7ab60
Gitweb: https://git.kernel.org/tip/74e45974c966fdecafb7149edb08a2f210e7ab60
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Fri, 21 Jun 2024 11:15:59 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 25 Jun 2024 10:43:37 +02:00
task_work: Introduce task_work_cancel() again
Re-introduce task_work_cancel(), this time to cancel one specific
callback rather than *any* callback pointing to a given function. This
is going to be needed for freeing perf events.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240621091601.18227-3-frederic@kernel.org
---
include/linux/task_work.h | 1 +
kernel/task_work.c | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 23ab01a..26b8a47 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -31,6 +31,7 @@ int task_work_add(struct task_struct *task, struct callback_head *twork,
struct callback_head *task_work_cancel_match(struct task_struct *task,
bool (*match)(struct callback_head *, void *data), void *data);
struct callback_head *task_work_cancel_func(struct task_struct *, task_work_func_t);
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb);
void task_work_run(void);
static inline void exit_task_work(struct task_struct *task)
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 54ac240..2134ac8 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -136,6 +136,30 @@ task_work_cancel_func(struct task_struct *task, task_work_func_t func)
return task_work_cancel_match(task, task_work_func_match, func);
}
+static bool task_work_match(struct callback_head *cb, void *data)
+{
+ return cb == data;
+}
+
+/**
+ * task_work_cancel - cancel a pending work added by task_work_add()
+ * @task: the task which should execute the work
+ * @cb: the callback to remove if queued
+ *
+ * Remove a callback from a task's queue if queued.
+ *
+ * RETURNS:
+ * True if the callback was queued and got cancelled, false otherwise.
+ */
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb)
+{
+ struct callback_head *ret;
+
+ ret = task_work_cancel_match(task, task_work_match, cb);
+
+ return ret == cb;
+}
+
/**
* task_work_run - execute the works added by task_work_add()
*
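The difference between the two cancel flavours can be illustrated with a small userspace model. This is a sketch under assumptions, not the kernel implementation: a plain singly linked list stands in for the task's work list, there is no locking or cmpxchg, and `task_model`, `model_add()`, `model_cancel_func()` and `model_cancel()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct callback_head;
typedef void (*work_fn)(struct callback_head *);

/* Simplified stand-in for the kernel's struct callback_head. */
struct callback_head {
	struct callback_head *next;
	work_fn func;
};

/* Hypothetical model of a task's work list, most recent entry first. */
struct task_model {
	struct callback_head *works;
};

static void noop_work(struct callback_head *cb)
{
	(void)cb;
}

static void model_add(struct task_model *t, struct callback_head *cb)
{
	cb->next = t->works;
	t->works = cb;
}

/* Unlink the first entry matching either the given pointer or function. */
static struct callback_head *unlink_first(struct task_model *t,
					  struct callback_head *want_cb,
					  work_fn want_func)
{
	struct callback_head **pprev = &t->works;
	struct callback_head *cb;

	while ((cb = *pprev) != NULL) {
		if ((want_cb && cb == want_cb) ||
		    (want_func && cb->func == want_func)) {
			*pprev = cb->next;
			cb->next = NULL;
			return cb;
		}
		pprev = &cb->next;
	}
	return NULL;
}

/* Like task_work_cancel_func(): removes some entry using @func. */
static struct callback_head *model_cancel_func(struct task_model *t, work_fn func)
{
	return unlink_first(t, NULL, func);
}

/* Like the new task_work_cancel(): removes exactly @cb, if queued. */
static bool model_cancel(struct task_model *t, struct callback_head *cb)
{
	return unlink_first(t, cb, NULL) == cb;
}
```

With two entries sharing the same function, `model_cancel_func()` removes whichever was queued most recently, whereas `model_cancel()` removes only the entry it is given and reports whether it was queued at all.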
* [tip: perf/urgent] task_work: s/task_work_cancel()/task_work_cancel_func()/
2024-06-21 9:15 ` [PATCH 1/4] task_work: s/task_work_cancel()/task_work_cancel_func()/ Frederic Weisbecker
@ 2024-07-01 7:14 ` tip-bot2 for Frederic Weisbecker
2024-07-09 11:42 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2024-07-01 7:14 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: b3d9ad61dc099bfd1e289460cde199b1ca4a7415
Gitweb: https://git.kernel.org/tip/b3d9ad61dc099bfd1e289460cde199b1ca4a7415
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Fri, 21 Jun 2024 11:15:58 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 25 Jun 2024 10:43:36 +02:00
task_work: s/task_work_cancel()/task_work_cancel_func()/
A proper task_work_cancel() API that cancels one specific callback,
rather than *any* callback pointing to a given function, is going to be
needed for freeing perf events. Do the appropriate rename to prepare
for that.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240621091601.18227-2-frederic@kernel.org
---
include/linux/task_work.h | 2 +-
kernel/irq/manage.c | 2 +-
kernel/task_work.c | 10 +++++-----
security/keys/keyctl.c | 2 +-
4 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 795ef5a..23ab01a 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -30,7 +30,7 @@ int task_work_add(struct task_struct *task, struct callback_head *twork,
struct callback_head *task_work_cancel_match(struct task_struct *task,
bool (*match)(struct callback_head *, void *data), void *data);
-struct callback_head *task_work_cancel(struct task_struct *, task_work_func_t);
+struct callback_head *task_work_cancel_func(struct task_struct *, task_work_func_t);
void task_work_run(void);
static inline void exit_task_work(struct task_struct *task)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 71b0fc2..dd53298 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1337,7 +1337,7 @@ static int irq_thread(void *data)
* synchronize_hardirq(). So neither IRQTF_RUNTHREAD nor the
* oneshot mask bit can be set.
*/
- task_work_cancel(current, irq_thread_dtor);
+ task_work_cancel_func(current, irq_thread_dtor);
return 0;
}
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 95a7e1b..54ac240 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -120,9 +120,9 @@ static bool task_work_func_match(struct callback_head *cb, void *data)
}
/**
- * task_work_cancel - cancel a pending work added by task_work_add()
- * @task: the task which should execute the work
- * @func: identifies the work to remove
+ * task_work_cancel_func - cancel a pending work matching a function added by task_work_add()
+ * @task: the task which should execute the func's work
+ * @func: identifies the func to match with a work to remove
*
* Find the last queued pending work with ->func == @func and remove
* it from queue.
@@ -131,7 +131,7 @@ static bool task_work_func_match(struct callback_head *cb, void *data)
* The found work or NULL if not found.
*/
struct callback_head *
-task_work_cancel(struct task_struct *task, task_work_func_t func)
+task_work_cancel_func(struct task_struct *task, task_work_func_t func)
{
return task_work_cancel_match(task, task_work_func_match, func);
}
@@ -168,7 +168,7 @@ void task_work_run(void)
if (!work)
break;
/*
- * Synchronize with task_work_cancel(). It can not remove
+ * Synchronize with task_work_cancel_match(). It can not remove
* the first entry == work, cmpxchg(task_works) must fail.
* But it can remove another entry from the ->next list.
*/
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 4bc3e93..ab927a1 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1694,7 +1694,7 @@ long keyctl_session_to_parent(void)
goto unlock;
/* cancel an already pending keyring replacement */
- oldwork = task_work_cancel(parent, key_change_session_keyring);
+ oldwork = task_work_cancel_func(parent, key_change_session_keyring);
/* the replacement session keyring is applied just prior to userspace
* restarting */
* [tip: perf/urgent] perf: Fix event leak upon exec and file release
2024-06-21 9:16 ` [PATCH 4/4] perf: Fix event leak upon exec and file release Frederic Weisbecker
@ 2024-07-01 7:14 ` tip-bot2 for Frederic Weisbecker
2024-07-09 11:41 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2024-07-01 7:14 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Peter Zijlstra (Intel), x86, linux-kernel
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: b8accda880eaf60504446be8d5b81f9532b98b93
Gitweb: https://git.kernel.org/tip/b8accda880eaf60504446be8d5b81f9532b98b93
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Fri, 21 Jun 2024 11:16:01 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 25 Jun 2024 10:43:38 +02:00
perf: Fix event leak upon exec and file release
The perf pending task work is never waited for when the matching event
is released. In the case of a child event, released via free_event()
directly, this can potentially result in a leaked event, such as in the
following scenario that doesn't even require a weak IRQ work
implementation to trigger:
schedule()
prepare_task_switch()
=======> <NMI>
perf_event_overflow()
event->pending_sigtrap = ...
irq_work_queue(&event->pending_irq)
<======= </NMI>
perf_event_task_sched_out()
event_sched_out()
event->pending_sigtrap = 0;
atomic_long_inc_not_zero(&event->refcount)
task_work_add(&event->pending_task)
finish_lock_switch()
=======> <IRQ>
perf_pending_irq()
//do nothing, rely on pending task work
<======= </IRQ>
begin_new_exec()
perf_event_exit_task()
perf_event_exit_event()
// If is child event
free_event()
WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
// event is leaked
Similar scenarios can also happen with perf_event_remove_on_exec() or
simply against concurrent perf_event_release().
Fix this by synchronizing against the possibly remaining pending task
work while freeing the event, just as is done with remaining pending
IRQ work. This means that the pending task callback neither needs to
nor should hold a reference to the event, which would prevent it from
ever being freed.
Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240621091601.18227-5-frederic@kernel.org
---
include/linux/perf_event.h | 1 +-
kernel/events/core.c | 38 +++++++++++++++++++++++++++++++++----
2 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a5304ae..393fb13 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -786,6 +786,7 @@ struct perf_event {
struct irq_work pending_irq;
struct callback_head pending_task;
unsigned int pending_work;
+ struct rcuwait pending_work_wait;
atomic_t event_limit;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7c3218d..586d4f3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2288,7 +2288,6 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
if (state != PERF_EVENT_STATE_OFF &&
!event->pending_work &&
!task_work_add(current, &event->pending_task, TWA_RESUME)) {
- WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
event->pending_work = 1;
} else {
local_dec(&event->ctx->nr_pending);
@@ -5203,9 +5202,35 @@ static bool exclusive_event_installable(struct perf_event *event,
static void perf_addr_filters_splice(struct perf_event *event,
struct list_head *head);
+static void perf_pending_task_sync(struct perf_event *event)
+{
+ struct callback_head *head = &event->pending_task;
+
+ if (!event->pending_work)
+ return;
+ /*
+ * If the task is queued to the current task's queue, we
+ * obviously can't wait for it to complete. Simply cancel it.
+ */
+ if (task_work_cancel(current, head)) {
+ event->pending_work = 0;
+ local_dec(&event->ctx->nr_pending);
+ return;
+ }
+
+ /*
+ * All accesses related to the event are within the same
+ * non-preemptible section in perf_pending_task(). The RCU
+ * grace period before the event is freed will make sure all
+ * those accesses are complete by then.
+ */
+ rcuwait_wait_event(&event->pending_work_wait, !event->pending_work, TASK_UNINTERRUPTIBLE);
+}
+
static void _free_event(struct perf_event *event)
{
irq_work_sync(&event->pending_irq);
+ perf_pending_task_sync(event);
unaccount_event(event);
@@ -6829,23 +6854,27 @@ static void perf_pending_task(struct callback_head *head)
int rctx;
/*
+ * All accesses to the event must belong to the same implicit RCU read-side
+ * critical section as the ->pending_work reset. See comment in
+ * perf_pending_task_sync().
+ */
+ preempt_disable_notrace();
+ /*
* If we 'fail' here, that's OK, it means recursion is already disabled
* and we won't recurse 'further'.
*/
- preempt_disable_notrace();
rctx = perf_swevent_get_recursion_context();
if (event->pending_work) {
event->pending_work = 0;
perf_sigtrap(event);
local_dec(&event->ctx->nr_pending);
+ rcuwait_wake_up(&event->pending_work_wait);
}
if (rctx >= 0)
perf_swevent_put_recursion_context(rctx);
preempt_enable_notrace();
-
- put_event(event);
}
#ifdef CONFIG_GUEST_PERF_EVENTS
@@ -11959,6 +11988,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
init_waitqueue_head(&event->waitq);
init_irq_work(&event->pending_irq, perf_pending_irq);
init_task_work(&event->pending_task, perf_pending_task);
+ rcuwait_init(&event->pending_work_wait);
mutex_init(&event->mmap_mutex);
raw_spin_lock_init(&event->addr_filters.lock);
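The decision flow of perf_pending_task_sync() can be sketched as a single-threaded userspace model. It is only an illustration under assumptions: `sync_event_model` and the `model_*` helpers are hypothetical, the blocking rcuwait_wait_event() call is replaced by a `SYNC_MUST_WAIT` return value, and no real concurrency or RCU grace period is modelled.

```c
#include <assert.h>
#include <stdbool.h>

enum sync_result {
	SYNC_NOTHING_PENDING,	/* no task work outstanding */
	SYNC_CANCELLED,		/* work was on our own queue and got removed */
	SYNC_MUST_WAIT		/* work belongs to another task: wait for it */
};

/* Hypothetical model of the state perf_pending_task_sync() looks at. */
struct sync_event_model {
	int pending_work;
	int nr_pending;
	bool queued_on_current;
};

/* Stand-in for task_work_cancel(current, &event->pending_task). */
static bool model_cancel_on_current(struct sync_event_model *e)
{
	if (!e->queued_on_current)
		return false;
	e->queued_on_current = false;
	return true;
}

static enum sync_result model_pending_task_sync(struct sync_event_model *e)
{
	if (!e->pending_work)
		return SYNC_NOTHING_PENDING;

	/* Waiting for work queued on ourselves would never finish. */
	if (model_cancel_on_current(e)) {
		e->pending_work = 0;
		e->nr_pending--;
		return SYNC_CANCELLED;
	}

	/* The kernel sleeps here until the callback clears pending_work. */
	return SYNC_MUST_WAIT;
}

/* Stand-in for the callback side: clears the flag the waiter polls. */
static void model_pending_task_run(struct sync_event_model *e)
{
	if (e->pending_work) {
		e->pending_work = 0;
		e->nr_pending--;
	}
}
```

Once `model_pending_task_run()` has executed, a second call to `model_pending_task_sync()` reports `SYNC_NOTHING_PENDING`, which corresponds to the waiter's condition `!event->pending_work` becoming true.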
* [tip: perf/core] perf: Fix event leak upon exec and file release
2024-06-21 9:16 ` [PATCH 4/4] perf: Fix event leak upon exec and file release Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
@ 2024-07-09 11:41 ` tip-bot2 for Frederic Weisbecker
1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2024-07-09 11:41 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Peter Zijlstra (Intel), stable, x86,
linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 3a5465418f5fd970e86a86c7f4075be262682840
Gitweb: https://git.kernel.org/tip/3a5465418f5fd970e86a86c7f4075be262682840
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Fri, 21 Jun 2024 11:16:01 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 09 Jul 2024 13:26:33 +02:00
perf: Fix event leak upon exec and file release
The perf pending task work is never waited for when the matching event
is released. In the case of a child event, released via free_event()
directly, this can potentially result in a leaked event, such as in the
following scenario that doesn't even require a weak IRQ work
implementation to trigger:
schedule()
prepare_task_switch()
=======> <NMI>
perf_event_overflow()
event->pending_sigtrap = ...
irq_work_queue(&event->pending_irq)
<======= </NMI>
perf_event_task_sched_out()
event_sched_out()
event->pending_sigtrap = 0;
atomic_long_inc_not_zero(&event->refcount)
task_work_add(&event->pending_task)
finish_lock_switch()
=======> <IRQ>
perf_pending_irq()
//do nothing, rely on pending task work
<======= </IRQ>
begin_new_exec()
perf_event_exit_task()
perf_event_exit_event()
// If is child event
free_event()
WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
// event is leaked
Similar scenarios can also happen with perf_event_remove_on_exec() or
simply against concurrent perf_event_release().
Fix this by synchronizing against the possibly remaining pending task
work while freeing the event, just as is done with remaining pending
IRQ work. This means that the pending task callback neither needs to
nor should hold a reference to the event, which would prevent it from
ever being freed.
Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240621091601.18227-5-frederic@kernel.org
---
include/linux/perf_event.h | 1 +-
kernel/events/core.c | 38 +++++++++++++++++++++++++++++++++----
2 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a5304ae..393fb13 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -786,6 +786,7 @@ struct perf_event {
struct irq_work pending_irq;
struct callback_head pending_task;
unsigned int pending_work;
+ struct rcuwait pending_work_wait;
atomic_t event_limit;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 576400d..32c7996 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2288,7 +2288,6 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
if (state != PERF_EVENT_STATE_OFF &&
!event->pending_work &&
!task_work_add(current, &event->pending_task, TWA_RESUME)) {
- WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
event->pending_work = 1;
} else {
local_dec(&event->ctx->nr_pending);
@@ -5203,9 +5202,35 @@ static bool exclusive_event_installable(struct perf_event *event,
static void perf_addr_filters_splice(struct perf_event *event,
struct list_head *head);
+static void perf_pending_task_sync(struct perf_event *event)
+{
+ struct callback_head *head = &event->pending_task;
+
+ if (!event->pending_work)
+ return;
+ /*
+ * If the task is queued to the current task's queue, we
+ * obviously can't wait for it to complete. Simply cancel it.
+ */
+ if (task_work_cancel(current, head)) {
+ event->pending_work = 0;
+ local_dec(&event->ctx->nr_pending);
+ return;
+ }
+
+ /*
+ * All accesses related to the event are within the same
+ * non-preemptible section in perf_pending_task(). The RCU
+ * grace period before the event is freed will make sure all
+ * those accesses are complete by then.
+ */
+ rcuwait_wait_event(&event->pending_work_wait, !event->pending_work, TASK_UNINTERRUPTIBLE);
+}
+
static void _free_event(struct perf_event *event)
{
irq_work_sync(&event->pending_irq);
+ perf_pending_task_sync(event);
unaccount_event(event);
@@ -6818,23 +6843,27 @@ static void perf_pending_task(struct callback_head *head)
int rctx;
/*
+ * All accesses to the event must belong to the same implicit RCU read-side
+ * critical section as the ->pending_work reset. See comment in
+ * perf_pending_task_sync().
+ */
+ preempt_disable_notrace();
+ /*
* If we 'fail' here, that's OK, it means recursion is already disabled
* and we won't recurse 'further'.
*/
- preempt_disable_notrace();
rctx = perf_swevent_get_recursion_context();
if (event->pending_work) {
event->pending_work = 0;
perf_sigtrap(event);
local_dec(&event->ctx->nr_pending);
+ rcuwait_wake_up(&event->pending_work_wait);
}
if (rctx >= 0)
perf_swevent_put_recursion_context(rctx);
preempt_enable_notrace();
-
- put_event(event);
}
#ifdef CONFIG_GUEST_PERF_EVENTS
@@ -11948,6 +11977,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
init_waitqueue_head(&event->waitq);
init_irq_work(&event->pending_irq, perf_pending_irq);
init_task_work(&event->pending_task, perf_pending_task);
+ rcuwait_init(&event->pending_work_wait);
mutex_init(&event->mmap_mutex);
raw_spin_lock_init(&event->addr_filters.lock);
* [tip: perf/core] task_work: Introduce task_work_cancel() again
2024-06-21 9:15 ` [PATCH 2/4] task_work: Introduce task_work_cancel() again Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
@ 2024-07-09 11:42 ` tip-bot2 for Frederic Weisbecker
1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2024-07-09 11:42 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Peter Zijlstra (Intel), stable, x86,
linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: f409530e4db9dd11b88cb7703c97c8f326ff6566
Gitweb: https://git.kernel.org/tip/f409530e4db9dd11b88cb7703c97c8f326ff6566
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Fri, 21 Jun 2024 11:15:59 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 09 Jul 2024 13:26:32 +02:00
task_work: Introduce task_work_cancel() again
Re-introduce task_work_cancel(), this time to cancel one specific
callback rather than *any* callback pointing to a given function. This
is going to be needed for freeing perf events.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240621091601.18227-3-frederic@kernel.org
---
include/linux/task_work.h | 1 +
kernel/task_work.c | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 23ab01a..26b8a47 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -31,6 +31,7 @@ int task_work_add(struct task_struct *task, struct callback_head *twork,
struct callback_head *task_work_cancel_match(struct task_struct *task,
bool (*match)(struct callback_head *, void *data), void *data);
struct callback_head *task_work_cancel_func(struct task_struct *, task_work_func_t);
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb);
void task_work_run(void);
static inline void exit_task_work(struct task_struct *task)
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 54ac240..2134ac8 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -136,6 +136,30 @@ task_work_cancel_func(struct task_struct *task, task_work_func_t func)
return task_work_cancel_match(task, task_work_func_match, func);
}
+static bool task_work_match(struct callback_head *cb, void *data)
+{
+ return cb == data;
+}
+
+/**
+ * task_work_cancel - cancel a pending work added by task_work_add()
+ * @task: the task which should execute the work
+ * @cb: the callback to remove if queued
+ *
+ * Remove a callback from a task's queue if queued.
+ *
+ * RETURNS:
+ * True if the callback was queued and got cancelled, false otherwise.
+ */
+bool task_work_cancel(struct task_struct *task, struct callback_head *cb)
+{
+ struct callback_head *ret;
+
+ ret = task_work_cancel_match(task, task_work_match, cb);
+
+ return ret == cb;
+}
+
/**
* task_work_run - execute the works added by task_work_add()
*
* [tip: perf/core] perf: Fix event leak upon exit
2024-06-21 9:16 ` [PATCH 3/4] perf: Fix event leak upon exit Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
@ 2024-07-09 11:42 ` tip-bot2 for Frederic Weisbecker
1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2024-07-09 11:42 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Peter Zijlstra (Intel), stable, x86,
linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 2fd5ad3f310de22836cdacae919dd99d758a1f1b
Gitweb: https://git.kernel.org/tip/2fd5ad3f310de22836cdacae919dd99d758a1f1b
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Fri, 21 Jun 2024 11:16:00 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 09 Jul 2024 13:26:33 +02:00
perf: Fix event leak upon exit
When a task is scheduled out, pending sigtrap deliveries are deferred
to the target task upon resume to userspace via task_work.
However, failures while adding an event's callback to the task_work
engine are ignored. And since the last call for events exit happens
after task work is eventually closed, there is a small window during
which a pending sigtrap can be queued but then ignored, leaking the
event refcount increment, such as in the following scenario:
TASK A
-----
do_exit()
exit_task_work(tsk);
<IRQ>
perf_event_overflow()
event->pending_sigtrap = pending_id;
irq_work_queue(&event->pending_irq);
</IRQ>
=========> PREEMPTION: TASK A -> TASK B
event_sched_out()
event->pending_sigtrap = 0;
atomic_long_inc_not_zero(&event->refcount)
// FAILS: task work has exited
task_work_add(&event->pending_task)
[...]
<IRQ WORK>
perf_pending_irq()
// early return: event->oncpu = -1
</IRQ WORK>
[...]
=========> TASK B -> TASK A
perf_event_exit_task(tsk)
perf_event_exit_event()
free_event()
WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
// leak event due to unexpected refcount == 2
As a result the event is never released while the task exits.
Fix this by handling task_work_add() failures appropriately.
Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240621091601.18227-4-frederic@kernel.org
---
kernel/events/core.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 51ce436..576400d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2284,18 +2284,15 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
}
if (event->pending_sigtrap) {
- bool dec = true;
-
event->pending_sigtrap = 0;
if (state != PERF_EVENT_STATE_OFF &&
- !event->pending_work) {
- event->pending_work = 1;
- dec = false;
+ !event->pending_work &&
+ !task_work_add(current, &event->pending_task, TWA_RESUME)) {
WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
- task_work_add(current, &event->pending_task, TWA_RESUME);
- }
- if (dec)
+ event->pending_work = 1;
+ } else {
local_dec(&event->ctx->nr_pending);
+ }
}
perf_event_set_state(event, state);
* [tip: perf/core] task_work: s/task_work_cancel()/task_work_cancel_func()/
2024-06-21 9:15 ` [PATCH 1/4] task_work: s/task_work_cancel()/task_work_cancel_func()/ Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
@ 2024-07-09 11:42 ` tip-bot2 for Frederic Weisbecker
1 sibling, 0 replies; 18+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2024-07-09 11:42 UTC (permalink / raw)
To: linux-tip-commits
Cc: Frederic Weisbecker, Peter Zijlstra (Intel), stable, x86,
linux-kernel
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 68cbd415dd4b9c5b9df69f0f091879e56bf5907a
Gitweb: https://git.kernel.org/tip/68cbd415dd4b9c5b9df69f0f091879e56bf5907a
Author: Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Fri, 21 Jun 2024 11:15:58 +02:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 09 Jul 2024 13:26:31 +02:00
task_work: s/task_work_cancel()/task_work_cancel_func()/
A proper task_work_cancel() API that cancels one specific callback,
rather than *any* callback pointing to a given function, is going to be
needed for freeing perf events. Do the appropriate rename to prepare
for that.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240621091601.18227-2-frederic@kernel.org
---
include/linux/task_work.h | 2 +-
kernel/irq/manage.c | 2 +-
kernel/task_work.c | 10 +++++-----
security/keys/keyctl.c | 2 +-
4 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index 795ef5a..23ab01a 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -30,7 +30,7 @@ int task_work_add(struct task_struct *task, struct callback_head *twork,
struct callback_head *task_work_cancel_match(struct task_struct *task,
bool (*match)(struct callback_head *, void *data), void *data);
-struct callback_head *task_work_cancel(struct task_struct *, task_work_func_t);
+struct callback_head *task_work_cancel_func(struct task_struct *, task_work_func_t);
void task_work_run(void);
static inline void exit_task_work(struct task_struct *task)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 71b0fc2..dd53298 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1337,7 +1337,7 @@ static int irq_thread(void *data)
* synchronize_hardirq(). So neither IRQTF_RUNTHREAD nor the
* oneshot mask bit can be set.
*/
- task_work_cancel(current, irq_thread_dtor);
+ task_work_cancel_func(current, irq_thread_dtor);
return 0;
}
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 95a7e1b..54ac240 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -120,9 +120,9 @@ static bool task_work_func_match(struct callback_head *cb, void *data)
}
/**
- * task_work_cancel - cancel a pending work added by task_work_add()
- * @task: the task which should execute the work
- * @func: identifies the work to remove
+ * task_work_cancel_func - cancel a pending work matching a function added by task_work_add()
+ * @task: the task which should execute the func's work
+ * @func: identifies the func to match with a work to remove
*
* Find the last queued pending work with ->func == @func and remove
* it from queue.
@@ -131,7 +131,7 @@ static bool task_work_func_match(struct callback_head *cb, void *data)
* The found work or NULL if not found.
*/
struct callback_head *
-task_work_cancel(struct task_struct *task, task_work_func_t func)
+task_work_cancel_func(struct task_struct *task, task_work_func_t func)
{
return task_work_cancel_match(task, task_work_func_match, func);
}
@@ -168,7 +168,7 @@ void task_work_run(void)
if (!work)
break;
/*
- * Synchronize with task_work_cancel(). It can not remove
+ * Synchronize with task_work_cancel_match(). It can not remove
* the first entry == work, cmpxchg(task_works) must fail.
* But it can remove another entry from the ->next list.
*/
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 4bc3e93..ab927a1 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1694,7 +1694,7 @@ long keyctl_session_to_parent(void)
goto unlock;
/* cancel an already pending keyring replacement */
- oldwork = task_work_cancel(parent, key_change_session_keyring);
+ oldwork = task_work_cancel_func(parent, key_change_session_keyring);
/* the replacement session keyring is applied just prior to userspace
* restarting */
end of thread, other threads:[~2024-07-09 11:42 UTC | newest]
Thread overview: 18+ messages
2024-06-21 9:15 [PATCH 0/4 v4] perf: Fix leaked sigtrap events Frederic Weisbecker
2024-06-21 9:15 ` [PATCH 1/4] task_work: s/task_work_cancel()/task_work_cancel_func()/ Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
2024-07-09 11:42 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
2024-06-21 9:15 ` [PATCH 2/4] task_work: Introduce task_work_cancel() again Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
2024-07-09 11:42 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
2024-06-21 9:16 ` [PATCH 3/4] perf: Fix event leak upon exit Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
2024-07-09 11:42 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
2024-06-21 9:16 ` [PATCH 4/4] perf: Fix event leak upon exec and file release Frederic Weisbecker
2024-07-01 7:14 ` [tip: perf/urgent] " tip-bot2 for Frederic Weisbecker
2024-07-09 11:41 ` [tip: perf/core] " tip-bot2 for Frederic Weisbecker
2024-06-25 8:43 ` [PATCH 0/4 v4] perf: Fix leaked sigtrap events Peter Zijlstra
-- strict thread matches above, loose matches on Subject: below --
2024-05-16 14:09 [PATCH 0/4 v3] " Frederic Weisbecker
2024-05-16 14:09 ` [PATCH 2/4] task_work: Introduce task_work_cancel() again Frederic Weisbecker
2024-05-15 14:43 [PATCH 0/4 v2] perf: Fix leaked sigtrap events Frederic Weisbecker
2024-05-15 14:43 ` [PATCH 2/4] task_work: Introduce task_work_cancel() again Frederic Weisbecker
2024-03-29 23:58 [PATCH 0/4] perf: Fix leaked events when sigtrap = 1 Frederic Weisbecker
2024-03-29 23:58 ` [PATCH 2/4] task_work: Introduce task_work_cancel() again Frederic Weisbecker
2024-03-30 21:10 ` kernel test robot