linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] fprobe call of rethook_try_get faults
@ 2023-06-07 16:42 Jiri Olsa
  2023-06-13 21:48 ` Steven Rostedt
  0 siblings, 1 reply; 3+ messages in thread
From: Jiri Olsa @ 2023-06-07 16:42 UTC (permalink / raw)
  To: Masami Hiramatsu (Google); +Cc: LKML, Linux Trace Kernel, Steven Rostedt, bpf

hi,
I occasionally get following fault:

  general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
  CPU: 3 PID: 28438 Comm: test_progs Tainted: G           OE      6.4.0-rc3+ #448 dad92bc91c459c664b308990ada0799837010e31
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
  RIP: 0010:rethook_try_get+0x34/0xf0
  Code: 48 8b 47 08 85 d2 74 0b 65 8b 15 af 26 eb 7e 85 d2 74 57 48 85 c0 74 73 e8 39 8e f0 ff 84 c0 74 6a 48 8b 53 10 48 85 d2 74 >
  RSP: 0018:ffffc90003ccfcf0 EFLAGS: 00010202
  RAX: 0000000000000001 RBX: ffff88810920db40 RCX: 0000000000000003
  RDX: 6b6b6b6b6b6b6b6b RSI: ffffffff82c0a371 RDI: ffffffff82bcbddb
  RBP: ffffffff81f5a5f0 R08: 0000000000000001 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000014000 R12: ffffffffa02ec3f2
  R13: fffffffffffffff7 R14: ffffc90003ccfd38 R15: 0000000000000000
  FS:  00007f2f8195eb80(0000) GS:ffff88846da00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f2f819c0140 CR3: 0000000189cb8006 CR4: 0000000000770ee0
  PKRU: 55555554
  Call Trace:
   <TASK>
   fprobe_handler+0xc1/0x270
   ? __pfx_bpf_testmod_init+0x10/0x10 [bpf_testmod b0bc3019aa6d6bdb2afc30cf6381f510d7e5abbe]
   ? __pfx_bpf_testmod_init+0x10/0x10 [bpf_testmod b0bc3019aa6d6bdb2afc30cf6381f510d7e5abbe]
   ? bpf_fentry_test1+0x5/0x10
   ? bpf_fentry_test1+0x5/0x10
   ? bpf_testmod_init+0x22/0x80 [bpf_testmod b0bc3019aa6d6bdb2afc30cf6381f510d7e5abbe]
   ? do_one_initcall+0x63/0x2e0
   ? rcu_is_watching+0xd/0x40
   ? kmalloc_trace+0xaf/0xc0
   ? do_init_module+0x60/0x250
   ? __do_sys_finit_module+0xac/0x120
   ? do_syscall_64+0x37/0x90
   ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
   </TASK>
  Modules linked in: bpf_testmod(OE+) loop bpf_preload intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel >


I can't really reliable reproduce this, but while checking the code, I wonder
we should call rethook_free only after we call unregister_ftrace_function like
in the patch below

jirka


---
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 18d36842faf5..0121e8c0d54e 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -364,19 +364,13 @@ int unregister_fprobe(struct fprobe *fp)
 		    fp->ops.saved_func != fprobe_kprobe_handler))
 		return -EINVAL;
 
-	/*
-	 * rethook_free() starts disabling the rethook, but the rethook handlers
-	 * may be running on other processors at this point. To make sure that all
-	 * current running handlers are finished, call unregister_ftrace_function()
-	 * after this.
-	 */
-	if (fp->rethook)
-		rethook_free(fp->rethook);
-
 	ret = unregister_ftrace_function(&fp->ops);
 	if (ret < 0)
 		return ret;
 
+	if (fp->rethook)
+		rethook_free(fp->rethook);
+
 	ftrace_free_filter(&fp->ops);
 
 	return ret;

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [RFC] fprobe call of rethook_try_get faults
  2023-06-07 16:42 [RFC] fprobe call of rethook_try_get faults Jiri Olsa
@ 2023-06-13 21:48 ` Steven Rostedt
  2023-06-14  6:42   ` Jiri Olsa
  0 siblings, 1 reply; 3+ messages in thread
From: Steven Rostedt @ 2023-06-13 21:48 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Masami Hiramatsu (Google), LKML, Linux Trace Kernel, bpf

On Wed, 7 Jun 2023 09:42:30 -0700
Jiri Olsa <olsajiri@gmail.com> wrote:

> I can't really reliable reproduce this, but while checking the code, I wonder
> we should call rethook_free only after we call unregister_ftrace_function like
> in the patch below

Yeah, I think you're right!

> 
> jirka
> 
> 
> ---
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 18d36842faf5..0121e8c0d54e 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -364,19 +364,13 @@ int unregister_fprobe(struct fprobe *fp)
>  		    fp->ops.saved_func != fprobe_kprobe_handler))
>  		return -EINVAL;
>  
> -	/*
> -	 * rethook_free() starts disabling the rethook, but the rethook handlers
> -	 * may be running on other processors at this point. To make sure that all
> -	 * current running handlers are finished, call unregister_ftrace_function()
> -	 * after this.
> -	 */
> -	if (fp->rethook)
> -		rethook_free(fp->rethook);

The above only waits for RCU to finish and then starts to free rethook.

This also means that something could be on the trampoline already and was
preempted. It could be that this code path gets preempted. Anyway, I don't
see how freeing rethook is safe before disabling all users.

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve


> -
>  	ret = unregister_ftrace_function(&fp->ops);
>  	if (ret < 0)
>  		return ret;
>  
> +	if (fp->rethook)
> +		rethook_free(fp->rethook);
> +
>  	ftrace_free_filter(&fp->ops);
>  
>  	return ret;


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFC] fprobe call of rethook_try_get faults
  2023-06-13 21:48 ` Steven Rostedt
@ 2023-06-14  6:42   ` Jiri Olsa
  0 siblings, 0 replies; 3+ messages in thread
From: Jiri Olsa @ 2023-06-14  6:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jiri Olsa, Masami Hiramatsu (Google), LKML, Linux Trace Kernel,
	bpf

On Tue, Jun 13, 2023 at 05:48:44PM -0400, Steven Rostedt wrote:
> On Wed, 7 Jun 2023 09:42:30 -0700
> Jiri Olsa <olsajiri@gmail.com> wrote:
> 
> > I can't really reliable reproduce this, but while checking the code, I wonder
> > we should call rethook_free only after we call unregister_ftrace_function like
> > in the patch below
> 
> Yeah, I think you're right!
> 
> > 
> > jirka
> > 
> > 
> > ---
> > diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> > index 18d36842faf5..0121e8c0d54e 100644
> > --- a/kernel/trace/fprobe.c
> > +++ b/kernel/trace/fprobe.c
> > @@ -364,19 +364,13 @@ int unregister_fprobe(struct fprobe *fp)
> >  		    fp->ops.saved_func != fprobe_kprobe_handler))
> >  		return -EINVAL;
> >  
> > -	/*
> > -	 * rethook_free() starts disabling the rethook, but the rethook handlers
> > -	 * may be running on other processors at this point. To make sure that all
> > -	 * current running handlers are finished, call unregister_ftrace_function()
> > -	 * after this.
> > -	 */
> > -	if (fp->rethook)
> > -		rethook_free(fp->rethook);
> 
> The above only waits for RCU to finish and then starts to free rethook.
> 
> This also means that something could be on the trampoline already and was
> preempted. It could be that this code path gets preempted. Anyway, I don't
> see how freeing rethook is safe before disabling all users.
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

thanks, I'll send formal patch

jirka

> 
> -- Steve
> 
> 
> > -
> >  	ret = unregister_ftrace_function(&fp->ops);
> >  	if (ret < 0)
> >  		return ret;
> >  
> > +	if (fp->rethook)
> > +		rethook_free(fp->rethook);
> > +
> >  	ftrace_free_filter(&fp->ops);
> >  
> >  	return ret;
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-06-14  6:42 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-07 16:42 [RFC] fprobe call of rethook_try_get faults Jiri Olsa
2023-06-13 21:48 ` Steven Rostedt
2023-06-14  6:42   ` Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).