public inbox for linux-trace-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: 【ftrace_events list】race between trace_event_eval_update and __register_event discuss
       [not found] <3c1259275c784bbd9e2f33e1b2d846c1@luxshare-ict.com>
@ 2025-07-17 14:50 ` Steven Rostedt
  2025-07-17 14:52   ` Steven Rostedt
  2025-07-17 14:59   ` Steven Rostedt
  0 siblings, 2 replies; 3+ messages in thread
From: Steven Rostedt @ 2025-07-17 14:50 UTC (permalink / raw)
  To: Fusheng Huang(黄富生)
  Cc: mhiramat@kernel.org, GuoJin Dai( 戴国金),
	linux-trace-kernel@vger.kernel.org

On Thu, 17 Jul 2025 12:37:37 +0000
Fusheng Huang(黄富生) <Fusheng.Huang@luxshare-ict.com> wrote:

> Dear rostedt, mhiramat:
> 
> We encountered a small problem of ftrace module, and need your help .
> 
> 【issue background】During the process of starting and loading ko in
> the Linux kernel, we encountered a crash of the CPU 7 null pointer
> caused by the synchronization operation of ftrace_events list, with a
> probability of less than 1/1000
> 
> 【call stack】
> 
>        CPU7
> 
>        load_module()
>        prepare_coming_module(inline)
>        blocking_notifier_call_chain_robust()
>        notifier_call_chain_robust(inline)
>        notifier_call_chain(inline)
>        trace_module_notify()
>        trace_module_add_evals(inline)
>        trace_insert_eval_map(inline)
>        trace_event_eval_update()
>                      |
>                      |...
>                      |---down_write(&trace_event_sem);
>                             list_for_each_entry_safe(call, p, &ftrace_events, list) {
>                             /* events are usually grouped together with systems */
>                             if (!last_system || call->class->system != last_system) {
>                                    first = true;
>                                    last_i = 0;
>                                    last_system = call->class->system;
>                             }
>                             ......
> 
> 
>             CPU4
> 
>         load_module()

Hmm, for some reason I thought module loading was synchronized and
couldn't run in parallel. It may have been that way long ago, but
apparently it isn't now.

>          prepare_coming_module(inline)
>          blocking_notifier_call_chain_robust()
>          notifier_call_chain_robust()
>          trace_module_notify()
>          trace_module_add_events(inline)
>          __register_event(inline)
>                      |...
>                      |--list_add(&call->list, &ftrace_events);
>                      |...
> 
> 
> 
> 
>        Function1: trace_event_eval_update --->traverse the ftrace_events list
> 
>        Function2: __register_event --->list_add ftrace_events ,but no write lock protect.
> 
>      Function1 and Function2 are running at the same time.
> 
> 
>       We want to know Is this a small bug or are there other considerations?

This looks like a small bug.

> 
>       Should we could add write lock protection to optimize this issue ?
> 
>       __register_event(inline)
>                 |...
>                 |+++down_write(&trace_event_sem);
>                 |
>                 |---list_add(&call->list, &ftrace_events);
>                 |
>                 |+++down_up(&trace_event_sem);
>               |...
> 

Can you see if this patch works for you?

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 120531268abf..f00d3901cd3b 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -3675,10 +3675,12 @@ static void trace_module_add_events(struct module *mod)
 	start = mod->trace_events;
 	end = mod->trace_events + mod->num_trace_events;
 
+	down_write(&trace_event_sem);
 	for_each_event(call, start, end) {
 		__register_event(*call, mod);
 		__add_event_to_tracers(*call);
 	}
+	up_write(&trace_event_sem);
 
 	update_cache_events(mod);
 }

The remove takes that semaphore, I don't know why the addition doesn't.
Looks to be a long standing bug. Goes back to 110bf2b764eb6 that was
added in 2009. Probably because I assumed load_module() was
synchronized :-p

Thanks for the report!

-- Steve

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: 【ftrace_events list】race between trace_event_eval_update and __register_event discuss
  2025-07-17 14:50 ` 【ftrace_events list】race between trace_event_eval_update and __register_event discuss Steven Rostedt
@ 2025-07-17 14:52   ` Steven Rostedt
  2025-07-17 14:59   ` Steven Rostedt
  1 sibling, 0 replies; 3+ messages in thread
From: Steven Rostedt @ 2025-07-17 14:52 UTC (permalink / raw)
  To: Fusheng Huang(黄富生)
  Cc: mhiramat@kernel.org, GuoJin Dai( 戴国金),
	linux-trace-kernel@vger.kernel.org

On Thu, 17 Jul 2025 10:50:07 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 120531268abf..f00d3901cd3b 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -3675,10 +3675,12 @@ static void trace_module_add_events(struct module *mod)
>  	start = mod->trace_events;
>  	end = mod->trace_events + mod->num_trace_events;
>  
> +	down_write(&trace_event_sem);
>  	for_each_event(call, start, end) {
>  		__register_event(*call, mod);
>  		__add_event_to_tracers(*call);
>  	}
> +	up_write(&trace_event_sem);

Bah, it appears that this semaphore is taken within these callers. Let
me fix this up properly.

-- Steve


>  
>  	update_cache_events(mod);
>  }

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: 【ftrace_events list】race between trace_event_eval_update and __register_event discuss
  2025-07-17 14:50 ` 【ftrace_events list】race between trace_event_eval_update and __register_event discuss Steven Rostedt
  2025-07-17 14:52   ` Steven Rostedt
@ 2025-07-17 14:59   ` Steven Rostedt
  1 sibling, 0 replies; 3+ messages in thread
From: Steven Rostedt @ 2025-07-17 14:59 UTC (permalink / raw)
  To: Fusheng Huang(黄富生)
  Cc: mhiramat@kernel.org, GuoJin Dai( 戴国金),
	linux-trace-kernel@vger.kernel.org

On Thu, 17 Jul 2025 10:50:07 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> >       __register_event(inline)
> >                 |...
> >                 |+++down_write(&trace_event_sem);
> >                 |
> >                 |---list_add(&call->list, &ftrace_events);
> >                 |
> >                 |+++down_up(&trace_event_sem);
> >               |...

Yeah, I think this may need to be the patch, as __register_event() also
calls event_init() that eventually takes the semaphore :-p

I think there may need to be other places that need that protection, so
I'll go and see if I can find and add them all.

-- Steve

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-07-17 15:00 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <3c1259275c784bbd9e2f33e1b2d846c1@luxshare-ict.com>
2025-07-17 14:50 ` 【ftrace_events list】race between trace_event_eval_update and __register_event discuss Steven Rostedt
2025-07-17 14:52   ` Steven Rostedt
2025-07-17 14:59   ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox