* Race condition in __modify_ftrace_direct() between tmp_ops registration and direct_functions hash update
@ 2026-05-17 6:24 Afi0
2026-05-17 7:08 ` Greg KH
2026-05-17 13:15 ` Steven Rostedt
0 siblings, 2 replies; 4+ messages in thread
From: Afi0 @ 2026-05-17 6:24 UTC (permalink / raw)
To: security; +Cc: linux-kernel, linux-trace-kernel, rostedt, mhiramat, Greg KH
[-- Attachment #1.1: Type: text/plain, Size: 1881 bytes --]
Hi list,
Apologies for initially sending only to Greg. Resending to the full list as
requested.
------------------------------
Component: kernel/trace/ftrace.c
Function: __modify_ftrace_direct()
Affected versions: Linux kernel 5.15+
Type: TOCTOU / Race condition
CVSS 3.1: AV:L/AC:H/PR:L/UI:N/S:C/C:H/I:H/A:H - 7.8 (High)
SUMMARY
A race condition exists in __modify_ftrace_direct() between the
registration of tmp_ops into ftrace_ops_list and the subsequent update of
direct_functions hash entries. During this window, concurrent CPUs
executing traced functions will read the stale direct call address via
ftrace_find_rec_direct() and jump to it, while the caller may have already
invalidated or freed the old trampoline memory.
VULNERABLE CODE
    err = register_ftrace_function_nolock(&tmp_ops);
    [race window: ftrace_ops_list_func now active, direct_functions not yet updated]
    mutex_lock(&ftrace_lock);
    entry->direct = addr; /* update happens here, too late */
    mutex_unlock(&ftrace_lock);
IMPACT
CPU executing traced function reads stale direct_functions entry during the
race window. arch_ftrace_set_direct_caller() redirects execution to
potentially freed or invalidated trampoline memory. Use-after-free in
executable code context on SMP systems.
TRIGGER
Requires CAP_PERFMON or CAP_SYS_ADMIN directly. Also reachable via BPF
trampolines (kernel/bpf/trampoline.c calls __modify_ftrace_direct()
internally) with CAP_BPF + CAP_PERFMON, default in many CI/CD container
runtimes. Live patching via klp_patch_func() also goes through this path.
SUGGESTED FIX
Update entry->direct under ftrace_lock BEFORE registering tmp_ops. Add
smp_wmb() between the store and registration to ensure ordering on
weakly-ordered architectures.
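
The ordering idea in the suggested fix can be sketched in userspace with C11 atomics. This is only an analogy under stated assumptions, not kernel code: demo_entry, demo_published, demo_modify() and demo_lookup() are hypothetical names, and a release store / acquire load pair stands in for a write barrier plus its matching read-side barrier. Note that a write barrier on its own generally only helps if the reader side has a matching barrier or dependency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for one direct_functions hash entry. */
struct demo_entry {
	_Atomic unsigned long direct;
};

/* Hypothetical stand-in for "tmp_ops is now visible to other CPUs". */
static _Atomic(struct demo_entry *) demo_published;

/* Writer: store the new address first, then publish the entry. */
static void demo_modify(struct demo_entry *e, unsigned long new_addr)
{
	assert(e != NULL);
	atomic_store_explicit(&e->direct, new_addr, memory_order_relaxed);
	/* Release: the store above is ordered before the publication. */
	atomic_store_explicit(&demo_published, e, memory_order_release);
}

/* Reader: returns 0 if nothing is published yet. */
static unsigned long demo_lookup(void)
{
	/* Acquire: pairs with the release store in demo_modify(). */
	struct demo_entry *e =
		atomic_load_explicit(&demo_published, memory_order_acquire);

	if (!e)
		return 0;
	return atomic_load_explicit(&e->direct, memory_order_relaxed);
}
```

In this model, a reader that observes the published entry also observes the address stored before publication. Whether the same reasoning applies to ftrace_ops_list and its readers is an assumption of the report, not something this sketch demonstrates.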
Patch attached as 0001-ftrace-fix-race-in-__modify_ftrace_direct.patch
Fixes: 0567d6809440 ("ftrace: Add modify_ftrace_direct()")
Thanks,
Afi0
[-- Attachment #1.2: Type: text/html, Size: 4441 bytes --]
[-- Attachment #2: 0001-ftrace-fix-race-in-__modify_ftrace_direct.patch --]
[-- Type: text/x-patch, Size: 4719 bytes --]
From b3c4d5e6f7a8b3c4d5e6f7a8b3c4d5e6f7a8b3c4 Mon Sep 17 00:00:00 2001
From: Afi0 <capyenglishlite@gmail.com>
Date: Sat, 16 May 2026 12:11:00 +0000
Subject: [PATCH] ftrace: fix race in __modify_ftrace_direct() between
tmp_ops registration and direct_functions update
In __modify_ftrace_direct(), register_ftrace_function_nolock() makes
tmp_ops visible in ftrace_ops_list before entry->direct is updated
under ftrace_lock. During this window any CPU entering the traced
function calls call_direct_funcs(), reads the old address from
direct_functions via RCU, and jumps to it via
arch_ftrace_set_direct_caller(). If the caller freed or invalidated
the old trampoline before calling modify_ftrace_direct(), this is a
use-after-free in executable code context.
The race window:
  CPU 0 (__modify_ftrace_direct)        CPU 1 (executing traced func)
  ──────────────────────────────        ──────────────────────────────
  register_ftrace_function_nolock()
    -> tmp_ops visible in ops_list
                                        call_direct_funcs()
                                          ftrace_find_rec_direct() -> old_addr
                                          arch_ftrace_set_direct_caller(old_addr)
                                          jump to old_addr  <- UAF if freed
  mutex_lock(&ftrace_lock)
  entry->direct = addr  <- too late
  mutex_unlock(&ftrace_lock)
Fix: update entry->direct under ftrace_lock BEFORE registering tmp_ops.
Any CPU that observes tmp_ops in ftrace_ops_list after this point will
already see the new address when it calls ftrace_find_rec_direct().
Add smp_wmb() between the store and the registration to ensure the
write is visible on weakly-ordered architectures before tmp_ops
becomes observable via ftrace_ops_list.
On error from register_ftrace_function_nolock(), restore entry->direct
to old_addr since tmp_ops never became visible to other CPUs.
This affects all callers of __modify_ftrace_direct(), including:
- modify_ftrace_direct() used by kernel modules and live patching
- modify_ftrace_direct_nolock() used by BPF trampolines
(kernel/bpf/trampoline.c) reachable with CAP_BPF + CAP_PERFMON
Fixes: 0567d6809440 ("ftrace: Add modify_ftrace_direct()")
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Afi0 <capyenglishlite@gmail.com>
---
kernel/trace/ftrace.c | 35 +++++++++++++++++++++++++----------
1 file changed, 25 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a1b2c3d4e5f6..b7c8d9e0f1a2 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5950,6 +5950,7 @@ static int __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 	struct ftrace_func_entry *entry;
 	struct ftrace_ops tmp_ops;
+	unsigned long old_addr;
 	int err;
 	lockdep_assert_held(&direct_mutex);
@@ -5960,22 +5961,36 @@ static int __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 	if (!entry)
 		return -ENODEV;
-	/*
-	 * tmp_ops is registered into ftrace_ops_list here, making it
-	 * visible to all CPUs executing the traced function. However,
-	 * entry->direct is not updated until after this call returns,
-	 * leaving a window where CPUs read the stale (possibly freed)
-	 * direct call address via ftrace_find_rec_direct().
-	 */
-	err = register_ftrace_function_nolock(&tmp_ops);
-	if (err)
-		return err;
-
+	/* Save old address in case we need to roll back on error. */
+	old_addr = entry->direct;
+
+	/*
+	 * Update entry->direct BEFORE registering tmp_ops into
+	 * ftrace_ops_list. This closes the race window where a CPU
+	 * executing the traced function could read the old (potentially
+	 * freed) direct call address between tmp_ops becoming visible
+	 * and entry->direct being updated.
+	 *
+	 * Any CPU that observes tmp_ops in ftrace_ops_list after the
+	 * smp_wmb() below is guaranteed to see the new address when
+	 * it calls ftrace_find_rec_direct().
+	 */
 	mutex_lock(&ftrace_lock);
 	entry->direct = addr;
 	mutex_unlock(&ftrace_lock);
+	/*
+	 * Ensure entry->direct store is ordered before tmp_ops
+	 * becomes visible via ftrace_ops_list on weakly-ordered archs.
+	 */
+	smp_wmb();
+
+	err = register_ftrace_function_nolock(&tmp_ops);
+	if (err) {
+		/* tmp_ops never became visible; safe to restore old_addr. */
+		mutex_lock(&ftrace_lock);
+		entry->direct = old_addr;
+		mutex_unlock(&ftrace_lock);
+		return err;
+	}
+
 	/*
 	 * Now that tmp_ops is registered and entry->direct is updated,
 	 * unregister the original ops and clean up.
--
2.39.0
* Re: Race condition in __modify_ftrace_direct() between tmp_ops registration and direct_functions hash update
2026-05-17 6:24 Race condition in __modify_ftrace_direct() between tmp_ops registration and direct_functions hash update Afi0
@ 2026-05-17 7:08 ` Greg KH
2026-05-17 13:15 ` Steven Rostedt
1 sibling, 0 replies; 4+ messages in thread
From: Greg KH @ 2026-05-17 7:08 UTC (permalink / raw)
To: Afi0; +Cc: security, linux-kernel, linux-trace-kernel, rostedt, mhiramat
On Sun, May 17, 2026 at 06:24:11AM +0000, Afi0 wrote:
> Signed-off-by: Afi0 <capyenglishlite@gmail.com>
Again, just send a patch with your real name as the documentation asks
for.
thanks,
greg k-h
* Re: Race condition in __modify_ftrace_direct() between tmp_ops registration and direct_functions hash update
2026-05-17 6:24 Race condition in __modify_ftrace_direct() between tmp_ops registration and direct_functions hash update Afi0
2026-05-17 7:08 ` Greg KH
@ 2026-05-17 13:15 ` Steven Rostedt
[not found] ` <CAEABq7dxnaLrTOhmD+tKnDenmZTUQD8sG=eoxe72mi_gwaus6g@mail.gmail.com>
1 sibling, 1 reply; 4+ messages in thread
From: Steven Rostedt @ 2026-05-17 13:15 UTC (permalink / raw)
To: Afi0
Cc: security, linux-kernel, linux-trace-kernel, mhiramat, Greg KH,
Jiri Olsa
Added Jiri as he works on this code.
On Sun, 17 May 2026 06:24:11 +0000
Afi0 <capyenglishlite@gmail.com> wrote:
> Hi list,
>
> Apologies for initially sending only to Greg. Resending to the full list as
> requested.
> ------------------------------
>
> Component: kernel/trace/ftrace.c
> Function: __modify_ftrace_direct()
> Affected versions: Linux kernel 5.15+
> Type: TOCTOU / Race condition
> CVSS 3.1: AV:L/AC:H/PR:L/UI:N/S:C/C:H/I:H/A:H - 7.8 (High)
>
> SUMMARY
>
> A race condition exists in __modify_ftrace_direct() between the
> registration of tmp_ops into ftrace_ops_list and the subsequent update of
> direct_functions hash entries. During this window, concurrent CPUs
> executing traced functions will read the stale direct call address via
> ftrace_find_rec_direct() and jump to it, while the caller may have already
> invalidated or freed the old trampoline memory.
What the above doesn't describe is how the direct was stale to begin
with. Before the assignment, it should be NULL and not a problem, and
if it was being modified, the current trampoline that direct points to
should *NOT* be freed before calling this. Otherwise, that itself is a
bug.
-- Steve
>
> VULNERABLE CODE
>
>     err = register_ftrace_function_nolock(&tmp_ops);
>     [race window: ftrace_ops_list_func now active, direct_functions not yet updated]
>     mutex_lock(&ftrace_lock);
>     entry->direct = addr; /* update happens here, too late */
>     mutex_unlock(&ftrace_lock);
>
> IMPACT
>
> CPU executing traced function reads stale direct_functions entry during the
> race window. arch_ftrace_set_direct_caller() redirects execution to
> potentially freed or invalidated trampoline memory. Use-after-free in
> executable code context on SMP systems.
>
> TRIGGER
>
> Requires CAP_PERFMON or CAP_SYS_ADMIN directly. Also reachable via BPF
> trampolines (kernel/bpf/trampoline.c calls __modify_ftrace_direct()
> internally) with CAP_BPF + CAP_PERFMON, default in many CI/CD container
> runtimes. Live patching via klp_patch_func() also goes through this path.
>
> SUGGESTED FIX
>
> Update entry->direct under ftrace_lock BEFORE registering tmp_ops. Add
> smp_wmb() between the store and registration to ensure ordering on
> weakly-ordered architectures.
>
> Patch attached as 0001-ftrace-fix-race-in-__modify_ftrace_direct.patch
>
> Fixes: 0567d6809440 ("ftrace: Add modify_ftrace_direct()")
>
> Thanks,
>
> Afi0
* Re: Race condition in __modify_ftrace_direct() between tmp_ops registration and direct_functions hash update
[not found] ` <CAEABq7dxnaLrTOhmD+tKnDenmZTUQD8sG=eoxe72mi_gwaus6g@mail.gmail.com>
@ 2026-05-17 16:53 ` Steven Rostedt
0 siblings, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2026-05-17 16:53 UTC (permalink / raw)
To: Afi0, security, linux-kernel, linux-trace-kernel, mhiramat,
Greg KH, Jiri Olsa
[ RESEND - I didn't realize you replied to me privately. Adding back Cc list ]
On Sun, 17 May 2026 15:16:17 +0000
Afi0 <capyenglishlite@gmail.com> wrote:
> Hi Steven,
>
> Thanks for the detailed feedback, and for adding Jiri.
>
> You're right to challenge this. Let me clarify the exact scenario:
>
> The race is not about direct being NULL before assignment. The issue arises
> specifically in the *modification* path where an existing non-NULL direct
> is being replaced:
>
> 1. Caller holds a valid trampoline at address old_addr
> 2. Caller calls modify_ftrace_direct(ops, new_addr)
> 3. __modify_ftrace_direct() registers tmp_ops -> ftrace starts using
> tmp_ops
> 4. *Window opens:* CPUs entering traced function read entry->direct =
> old_addr via ftrace_find_rec_direct()
> 5. Caller, believing the update is complete after modify_ftrace_direct()
> returns, frees old_addr
> 6. entry->direct = new_addr executes - too late, CPUs already jumped to
> freed memory
>
> The key assumption being violated: the caller cannot know when it is safe
> to free old_addr because modify_ftrace_direct() returns before
> entry->direct is updated. The API implies atomicity that isn't guaranteed.
But __modify_ftrace_direct() calls unregister_ftrace_function(&tmp_ops).
Hmm, tmp_ops being static may be considered part of the core kernel, in
which case FTRACE_OPS_FL_DYNAMIC is not set and the synchronization
will not be called from the unregister function.
>
> If the convention is that callers *must* never free the old trampoline
> until some explicit synchronization point after modify_ftrace_direct()
> returns, then you're correct that this is a caller bug rather than a bug in
> __modify_ftrace_direct() itself. Could you point me to documentation of
> this requirement? I may have misread the contract.
I'll let Jiri answer this part, but it does seem that there should be a
synchronization to make sure that the code is freed. BPF is the only
user of this, and this is a new feature.
Jiri, if modify_ftrace_direct() is used to change the trampoline,
what synchronization is done to make sure it's not called before
being freed?
-- Steve
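
The kind of synchronization being asked about can be illustrated with a toy userspace model. This is a sketch under loud assumptions, not how ftrace or BPF actually do it: demo_tramp, demo_current, demo_readers and the demo_*() helpers are hypothetical, and a single global reader count stands in for whatever grace-period mechanism the kernel would use. The point it shows is only that swapping the pointer is not enough; the old trampoline may be freed only once no reader that could have fetched it is still running.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a trampoline. */
struct demo_tramp {
	int id;
};

/* Hypothetical stand-in for the address stored in entry->direct. */
static _Atomic(struct demo_tramp *) demo_current;

/* Number of readers currently between enter and exit. */
static atomic_int demo_readers;

/* Reader: announce itself first, then fetch the current trampoline. */
static struct demo_tramp *demo_reader_enter(void)
{
	atomic_fetch_add(&demo_readers, 1);
	return atomic_load(&demo_current);
}

/* Reader: done using whatever demo_reader_enter() returned. */
static void demo_reader_exit(void)
{
	atomic_fetch_sub(&demo_readers, 1);
}

/* Writer: install a new trampoline and return the old one (or NULL). */
static struct demo_tramp *demo_swap(struct demo_tramp *new_tramp)
{
	assert(new_tramp != NULL);
	return atomic_exchange(&demo_current, new_tramp);
}

/*
 * Writer: after demo_swap(), the old trampoline may be freed once this
 * has returned true at least once. Readers that enter later can only
 * see the new pointer.
 */
static bool demo_safe_to_free(void)
{
	return atomic_load(&demo_readers) == 0;
}
```

A caller in this model would call demo_swap(), wait until demo_safe_to_free() returns true, and only then release the old object. Which primitive plays that role for direct trampolines is exactly the open question in this thread.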