Linux Trace Kernel
 help / color / mirror / Atom feed
* [PATCH] ftrace: fix race in __modify_ftrace_direct() between tmp_ops registration and direct_functions update
@ 2026-05-17 11:01 Andrii Kuchmenko
  2026-05-18 16:19 ` Steven Rostedt
  0 siblings, 1 reply; 3+ messages in thread
From: Andrii Kuchmenko @ 2026-05-17 11:01 UTC (permalink / raw)
  To: linux-trace-kernel
  Cc: rostedt, mhiramat, linux-kernel, Andrii Kuchmenko, stable

In __modify_ftrace_direct(), register_ftrace_function_nolock() makes
tmp_ops visible in ftrace_ops_list before entry->direct is updated
under ftrace_lock. During this window any CPU entering the traced
function calls call_direct_funcs(), reads the old address from
direct_functions via RCU, and jumps to it via
arch_ftrace_set_direct_caller(). If the caller freed or invalidated
the old trampoline before calling modify_ftrace_direct(), this is a
use-after-free in executable code context.

The race window:

  CPU 0 (__modify_ftrace_direct)       CPU 1 (executing traced func)
  ──────────────────────────────       ──────────────────────────────
  register_ftrace_function_nolock()
    -> tmp_ops visible in ops_list
                                        call_direct_funcs()
                                          ftrace_find_rec_direct() -> old_addr
                                          arch_ftrace_set_direct_caller(old_addr)
                                          jump to old_addr  <- UAF if freed
  mutex_lock(&ftrace_lock)
  entry->direct = addr   <- too late
  mutex_unlock(&ftrace_lock)

Fix: update entry->direct under ftrace_lock BEFORE registering tmp_ops.
Any CPU that observes tmp_ops in ftrace_ops_list after this point will
already see the new address when it calls ftrace_find_rec_direct().
Add smp_wmb() between the store and the registration to ensure the
write is visible on weakly-ordered architectures before tmp_ops
becomes observable via ftrace_ops_list.

On error from register_ftrace_function_nolock(), restore entry->direct
to old_addr since tmp_ops never became visible to other CPUs.

This affects all callers of __modify_ftrace_direct(), including:
  - modify_ftrace_direct() used by kernel modules and live patching
  - modify_ftrace_direct_nolock() used by BPF trampolines
    (kernel/bpf/trampoline.c) reachable with CAP_BPF + CAP_PERFMON

Fixes: 0567d6809440 ("ftrace: Add modify_ftrace_direct()")
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Andrii Kuchmenko <capyenglishlite@gmail.com>
---
 kernel/trace/ftrace.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a1b2c3d4e5f6..b7c8d9e0f1a2 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5950,6 +5950,7 @@ static int __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 	struct ftrace_func_entry *entry;
 	struct ftrace_ops tmp_ops;
+	unsigned long old_addr;
 	int err;
 
 	lockdep_assert_held(&direct_mutex);
@@ -5960,22 +5961,36 @@ static int __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
 	if (!entry)
 		return -ENODEV;
 
-	/*
-	 * tmp_ops is registered into ftrace_ops_list here, making it
-	 * visible to all CPUs executing the traced function. However,
-	 * entry->direct is not updated until after this call returns,
-	 * leaving a window where CPUs read the stale (possibly freed)
-	 * direct call address via ftrace_find_rec_direct().
-	 */
-	err = register_ftrace_function_nolock(&tmp_ops);
-	if (err)
-		return err;
-
+	/* Save old address in case we need to roll back on error. */
+	old_addr = entry->direct;
+
+	/*
+	 * Update entry->direct BEFORE registering tmp_ops into
+	 * ftrace_ops_list. This closes the race window where a CPU
+	 * executing the traced function could read the old (potentially
+	 * freed) direct call address between tmp_ops becoming visible
+	 * and entry->direct being updated.
+	 *
+	 * Any CPU that observes tmp_ops in ftrace_ops_list after the
+	 * smp_wmb() below is guaranteed to see the new address when
+	 * it calls ftrace_find_rec_direct().
+	 */
 	mutex_lock(&ftrace_lock);
 	entry->direct = addr;
 	mutex_unlock(&ftrace_lock);
 
+	/*
+	 * Ensure entry->direct store is ordered before tmp_ops
+	 * becomes visible via ftrace_ops_list on weakly-ordered archs.
+	 */
+	smp_wmb();
+
+	err = register_ftrace_function_nolock(&tmp_ops);
+	if (err) {
+		/* tmp_ops never became visible; safe to restore old_addr. */
+		mutex_lock(&ftrace_lock);
+		entry->direct = old_addr;
+		mutex_unlock(&ftrace_lock);
+		return err;
+	}
+
 	/*
 	 * Now that tmp_ops is registered and entry->direct is updated,
 	 * unregister the original ops and clean up.
-- 
2.39.0

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] ftrace: fix race in __modify_ftrace_direct() between tmp_ops registration and direct_functions update
  2026-05-17 11:01 [PATCH] ftrace: fix race in __modify_ftrace_direct() between tmp_ops registration and direct_functions update Andrii Kuchmenko
@ 2026-05-18 16:19 ` Steven Rostedt
  2026-05-19 10:55   ` Jiri Olsa
  0 siblings, 1 reply; 3+ messages in thread
From: Steven Rostedt @ 2026-05-18 16:19 UTC (permalink / raw)
  To: Andrii Kuchmenko
  Cc: linux-trace-kernel, mhiramat, linux-kernel, stable, Jiri Olsa

On Sun, 17 May 2026 14:01:53 +0300
Andrii Kuchmenko <capyenglishlite@gmail.com> wrote:

> In __modify_ftrace_direct(), register_ftrace_function_nolock() makes
> tmp_ops visible in ftrace_ops_list before entry->direct is updated
> under ftrace_lock. During this window any CPU entering the traced
> function calls call_direct_funcs(), reads the old address from
> direct_functions via RCU, and jumps to it via
> arch_ftrace_set_direct_caller(). If the caller freed or invalidated
> the old trampoline before calling modify_ftrace_direct(), this is a
> use-after-free in executable code context.
> 
> The race window:
> 
>   CPU 0 (__modify_ftrace_direct)       CPU 1 (executing traced func)
>   ──────────────────────────────       ──────────────────────────────
>   register_ftrace_function_nolock()
>     -> tmp_ops visible in ops_list  
>                                         call_direct_funcs()
>                                           ftrace_find_rec_direct() -> old_addr
>                                           arch_ftrace_set_direct_caller(old_addr)
>                                           jump to old_addr  <- UAF if freed

You do not state where old_addr is freed.

>   mutex_lock(&ftrace_lock)
>   entry->direct = addr   <- too late
>   mutex_unlock(&ftrace_lock)
> 
> Fix: update entry->direct under ftrace_lock BEFORE registering tmp_ops.
> Any CPU that observes tmp_ops in ftrace_ops_list after this point will
> already see the new address when it calls ftrace_find_rec_direct().
> Add smp_wmb() between the store and the registration to ensure the
> write is visible on weakly-ordered architectures before tmp_ops
> becomes observable via ftrace_ops_list.
> 
> On error from register_ftrace_function_nolock(), restore entry->direct
> to old_addr since tmp_ops never became visible to other CPUs.

The above statement is incorrect. The tmp_ops hash entries are also
*shared* with the ops that is being updated. That is, by changing the entry->direct, you 

> 
> This affects all callers of __modify_ftrace_direct(), including:
>   - modify_ftrace_direct() used by kernel modules and live patching
>   - modify_ftrace_direct_nolock() used by BPF trampolines
>     (kernel/bpf/trampoline.c) reachable with CAP_BPF + CAP_PERFMON
> 
> Fixes: 0567d6809440 ("ftrace: Add modify_ftrace_direct()")
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: stable@vger.kernel.org
> Signed-off-by: Andrii Kuchmenko <capyenglishlite@gmail.com>
> ---
>  kernel/trace/ftrace.c | 35 +++++++++++++++++++++++++----------
>  1 file changed, 25 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index a1b2c3d4e5f6..b7c8d9e0f1a2 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -5950,6 +5950,7 @@ static int __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  	struct ftrace_func_entry *entry;
>  	struct ftrace_ops tmp_ops;
> +	unsigned long old_addr;
>  	int err;
>  
>  	lockdep_assert_held(&direct_mutex);
> @@ -5960,22 +5961,36 @@ static int __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
>  	if (!entry)
>  		return -ENODEV;
>  
> -	/*
> -	 * tmp_ops is registered into ftrace_ops_list here, making it
> -	 * visible to all CPUs executing the traced function. However,
> -	 * entry->direct is not updated until after this call returns,
> -	 * leaving a window where CPUs read the stale (possibly freed)
> -	 * direct call address via ftrace_find_rec_direct().
> -	 */

Are you posting patches on top of your own patches that are not public?

> -	err = register_ftrace_function_nolock(&tmp_ops);
> -	if (err)
> -		return err;
> -
> +	/* Save old address in case we need to roll back on error. */
> +	old_addr = entry->direct;
> +
> +	/*
> +	 * Update entry->direct BEFORE registering tmp_ops into
> +	 * ftrace_ops_list. This closes the race window where a CPU
> +	 * executing the traced function could read the old (potentially
> +	 * freed) direct call address between tmp_ops becoming visible
> +	 * and entry->direct being updated.
> +	 *
> +	 * Any CPU that observes tmp_ops in ftrace_ops_list after the
> +	 * smp_wmb() below is guaranteed to see the new address when
> +	 * it calls ftrace_find_rec_direct().
> +	 */
>  	mutex_lock(&ftrace_lock);
>  	entry->direct = addr;
>  	mutex_unlock(&ftrace_lock);
>  
> +	/*
> +	 * Ensure entry->direct store is ordered before tmp_ops
> +	 * becomes visible via ftrace_ops_list on weakly-ordered archs.
> +	 */
> +	smp_wmb();

You do realize that register_ftrace_function_nolock() is itself a full
memory barrier? It's doing code modification which requires lots of
barriers to work.

Still, the only bug I see that is possible is that the caller may need to
do some synchronize RCU calls before freeing an old trampoline.

Can you show a path that doesn't do that?

-- Steve


> +
> +	err = register_ftrace_function_nolock(&tmp_ops);
> +	if (err) {
> +		/* tmp_ops never became visible; safe to restore old_addr. */
> +		mutex_lock(&ftrace_lock);
> +		entry->direct = old_addr;
> +		mutex_unlock(&ftrace_lock);
> +		return err;
> +	}
> +
>  	/*
>  	 * Now that tmp_ops is registered and entry->direct is updated,
>  	 * unregister the original ops and clean up.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] ftrace: fix race in __modify_ftrace_direct() between tmp_ops registration and direct_functions update
  2026-05-18 16:19 ` Steven Rostedt
@ 2026-05-19 10:55   ` Jiri Olsa
  0 siblings, 0 replies; 3+ messages in thread
From: Jiri Olsa @ 2026-05-19 10:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrii Kuchmenko, linux-trace-kernel, mhiramat, linux-kernel,
	stable

On Mon, May 18, 2026 at 12:19:06PM -0400, Steven Rostedt wrote:
> On Sun, 17 May 2026 14:01:53 +0300
> Andrii Kuchmenko <capyenglishlite@gmail.com> wrote:
> 
> > In __modify_ftrace_direct(), register_ftrace_function_nolock() makes
> > tmp_ops visible in ftrace_ops_list before entry->direct is updated
> > under ftrace_lock. During this window any CPU entering the traced
> > function calls call_direct_funcs(), reads the old address from
> > direct_functions via RCU, and jumps to it via
> > arch_ftrace_set_direct_caller(). If the caller freed or invalidated
> > the old trampoline before calling modify_ftrace_direct(), this is a
> > use-after-free in executable code context.
> > 
> > The race window:
> > 
> >   CPU 0 (__modify_ftrace_direct)       CPU 1 (executing traced func)
> >   ──────────────────────────────       ──────────────────────────────
> >   register_ftrace_function_nolock()
> >     -> tmp_ops visible in ops_list  
> >                                         call_direct_funcs()
> >                                           ftrace_find_rec_direct() -> old_addr
> >                                           arch_ftrace_set_direct_caller(old_addr)
> >                                           jump to old_addr  <- UAF if freed
> 
> You do not state where old_addr is freed.
> 
> >   mutex_lock(&ftrace_lock)
> >   entry->direct = addr   <- too late
> >   mutex_unlock(&ftrace_lock)
> > 
> > Fix: update entry->direct under ftrace_lock BEFORE registering tmp_ops.
> > Any CPU that observes tmp_ops in ftrace_ops_list after this point will
> > already see the new address when it calls ftrace_find_rec_direct().
> > Add smp_wmb() between the store and the registration to ensure the
> > write is visible on weakly-ordered architectures before tmp_ops
> > becomes observable via ftrace_ops_list.
> > 
> > On error from register_ftrace_function_nolock(), restore entry->direct
> > to old_addr since tmp_ops never became visible to other CPUs.
> 
> The above statement is incorrect. The tmp_ops hash entries are also
> *shared* with the ops that is being updated. That is, by changing the entry->direct, you 
> 
> > 
> > This affects all callers of __modify_ftrace_direct(), including:
> >   - modify_ftrace_direct() used by kernel modules and live patching
> >   - modify_ftrace_direct_nolock() used by BPF trampolines
> >     (kernel/bpf/trampoline.c) reachable with CAP_BPF + CAP_PERFMON
> > 
> > Fixes: 0567d6809440 ("ftrace: Add modify_ftrace_direct()")
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Andrii Kuchmenko <capyenglishlite@gmail.com>
> > ---
> >  kernel/trace/ftrace.c | 35 +++++++++++++++++++++++++----------
> >  1 file changed, 25 insertions(+), 10 deletions(-)
> > 
> > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > index a1b2c3d4e5f6..b7c8d9e0f1a2 100644
> > --- a/kernel/trace/ftrace.c
> > +++ b/kernel/trace/ftrace.c
> > @@ -5950,6 +5950,7 @@ static int __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> >  	struct ftrace_func_entry *entry;
> >  	struct ftrace_ops tmp_ops;
> > +	unsigned long old_addr;
> >  	int err;
> >  
> >  	lockdep_assert_held(&direct_mutex);
> > @@ -5960,22 +5961,36 @@ static int __modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
> >  	if (!entry)
> >  		return -ENODEV;
> >  
> > -	/*
> > -	 * tmp_ops is registered into ftrace_ops_list here, making it
> > -	 * visible to all CPUs executing the traced function. However,
> > -	 * entry->direct is not updated until after this call returns,
> > -	 * leaving a window where CPUs read the stale (possibly freed)
> > -	 * direct call address via ftrace_find_rec_direct().
> > -	 */
> 
> Are you posting patches on top of your own patches that are not public?

hi,

right, the original email states 5.15 is affected, but I dont see
__modify_ftrace_direct in stable version v5.15.207 .. what kernel
version is the patch for?

> 
> > -	err = register_ftrace_function_nolock(&tmp_ops);
> > -	if (err)
> > -		return err;
> > -
> > +	/* Save old address in case we need to roll back on error. */
> > +	old_addr = entry->direct;
> > +
> > +	/*
> > +	 * Update entry->direct BEFORE registering tmp_ops into
> > +	 * ftrace_ops_list. This closes the race window where a CPU
> > +	 * executing the traced function could read the old (potentially
> > +	 * freed) direct call address between tmp_ops becoming visible
> > +	 * and entry->direct being updated.
> > +	 *
> > +	 * Any CPU that observes tmp_ops in ftrace_ops_list after the
> > +	 * smp_wmb() below is guaranteed to see the new address when
> > +	 * it calls ftrace_find_rec_direct().
> > +	 */
> >  	mutex_lock(&ftrace_lock);
> >  	entry->direct = addr;
> >  	mutex_unlock(&ftrace_lock);
> >  
> > +	/*
> > +	 * Ensure entry->direct store is ordered before tmp_ops
> > +	 * becomes visible via ftrace_ops_list on weakly-ordered archs.
> > +	 */
> > +	smp_wmb();
> 
> You do realize that register_ftrace_function_nolock() is itself a full
> memory barrier? It's doing code modification which requires lots of
> barriers to work.
> 
> Still, the only bug I see that is possible is that the caller may need to
> do some synchronize RCU calls before freeing an old trampoline.
> 
> Can you show a path that doesn't do that?

+1 

jirka

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-05-19 10:55 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-17 11:01 [PATCH] ftrace: fix race in __modify_ftrace_direct() between tmp_ops registration and direct_functions update Andrii Kuchmenko
2026-05-18 16:19 ` Steven Rostedt
2026-05-19 10:55   ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox