linux-trace-kernel.vger.kernel.org archive mirror
From: John Ogness <john.ogness@linutronix.de>
To: Steven Rostedt <rostedt@goodmis.org>, chenyuan_fl@163.com
Cc: mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	Yuan Chen <chenyuan@kylinos.cn>,
	Peter Zijlstra <peterz@infradead.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [PATCH v2] tracing: Fix race condition in kprobe initialization causing NULL pointer dereference
Date: Mon, 29 Sep 2025 11:38:08 +0206
Message-ID: <84seg5d2p3.fsf@jogness.linutronix.de>
In-Reply-To: <20250929044836.7169d5be@batman.local.home>

On 2025-09-29, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Mon, 29 Sep 2025 07:57:31 +0100
> chenyuan_fl@163.com wrote:
>
>> From: Yuan Chen <chenyuan@kylinos.cn>
>> 
>> There is a critical race condition in kprobe initialization that can lead to
>> NULL pointer dereference and kernel crash.
>> 
>> [1135630.084782] Unable to handle kernel paging request at virtual address 0000710a04630000
>> ...
>> [1135630.260314] pstate: 404003c9 (nZcv DAIF +PAN -UAO)
>> [1135630.269239] pc : kprobe_perf_func+0x30/0x260
>> [1135630.277643] lr : kprobe_dispatcher+0x44/0x60
>> [1135630.286041] sp : ffffaeff4977fa40
>> [1135630.293441] x29: ffffaeff4977fa40 x28: ffffaf015340e400
>> [1135630.302837] x27: 0000000000000000 x26: 0000000000000000
>> [1135630.312257] x25: ffffaf029ed108a8 x24: ffffaf015340e528
>> [1135630.321705] x23: ffffaeff4977fc50 x22: ffffaeff4977fc50
>> [1135630.331154] x21: 0000000000000000 x20: ffffaeff4977fc50
>> [1135630.340586] x19: ffffaf015340e400 x18: 0000000000000000
>> [1135630.349985] x17: 0000000000000000 x16: 0000000000000000
>> [1135630.359285] x15: 0000000000000000 x14: 0000000000000000
>> [1135630.368445] x13: 0000000000000000 x12: 0000000000000000
>> [1135630.377473] x11: 0000000000000000 x10: 0000000000000000
>> [1135630.386411] x9 : 0000000000000000 x8 : 0000000000000000
>> [1135630.395252] x7 : 0000000000000000 x6 : 0000000000000000
>> [1135630.403963] x5 : 0000000000000000 x4 : 0000000000000000
>> [1135630.412545] x3 : 0000710a04630000 x2 : 0000000000000006
>> [1135630.421021] x1 : ffffaeff4977fc50 x0 : 0000710a04630000
>> [1135630.429410] Call trace:
>> [1135630.434828]  kprobe_perf_func+0x30/0x260
>> [1135630.441661]  kprobe_dispatcher+0x44/0x60
>> [1135630.448396]  aggr_pre_handler+0x70/0xc8
>> [1135630.454959]  kprobe_breakpoint_handler+0x140/0x1e0
>> [1135630.462435]  brk_handler+0xbc/0xd8
>> [1135630.468437]  do_debug_exception+0x84/0x138
>> [1135630.475074]  el1_dbg+0x18/0x8c
>> [1135630.480582]  security_file_permission+0x0/0xd0
>> [1135630.487426]  vfs_write+0x70/0x1c0
>> [1135630.493059]  ksys_write+0x5c/0xc8
>> [1135630.498638]  __arm64_sys_write+0x24/0x30
>> [1135630.504821]  el0_svc_common+0x78/0x130
>> [1135630.510838]  el0_svc_handler+0x38/0x78
>> [1135630.516834]  el0_svc+0x8/0x1b0
>> 
>> kernel/trace/trace_kprobe.c: 1308
>> 0xffff3df8995039ec <kprobe_perf_func+0x2c>:     ldr     x21, [x24,#120]
>> include/linux/compiler.h: 294
>> 0xffff3df8995039f0 <kprobe_perf_func+0x30>:     ldr     x1, [x21,x0]
>> 
>> kernel/trace/trace_kprobe.c
>> 1308: head = this_cpu_ptr(call->perf_events);
>> 1309:   if (hlist_empty(head))
>> 1310:           return 0;
>> 
>> crash> struct trace_event_call -o  
>> struct trace_event_call {
>>   ...
>>   [120] struct hlist_head *perf_events;  //(call->perf_event)
>>   ...
>> }
>> 
>> crash> struct trace_event_call ffffaf015340e528  
>> struct trace_event_call {
>>   ...
>>   perf_events = 0xffff0ad5fa89f088, //this value is correct, but x21 = 0
>>   ...
>> }
>> 
>> Race Condition Analysis:
>> 
>> The race occurs between kprobe activation and perf_events initialization:
>> 
>>   CPU0                                    CPU1
>>   ====                                    ====
>>   perf_kprobe_init
>>     perf_trace_event_init
>>       tp_event->perf_events = list;(1)
>>       tp_event->class->reg (2)← KPROBE ACTIVE
>>                                           Debug exception triggers
>>                                           ...
>>                                           kprobe_dispatcher
>>                                             kprobe_perf_func (tk->tp.flags & TP_FLAG_PROFILE)
>>                                               head = this_cpu_ptr(call->perf_events)(3)
>>                                               (perf_events is still NULL)

I do not know anything about the kprobe and perf internals. Hopefully
this email can serve as a guide to where you need to place the memory
barrier _pair_. If I understand the problem description correctly, you
would need:

>> Problem:
>> 1. CPU0 executes (1) assigning tp_event->perf_events = list

smp_wmb()

>> 2. CPU0 executes (2) enabling kprobe functionality via class->reg()
>> 3. CPU1 triggers and reaches kprobe_dispatcher
>> 4. CPU1 checks TP_FLAG_PROFILE - condition passes (step 2 completed)

smp_rmb()

>> 5. CPU1 calls kprobe_perf_func() and crashes at (3) because
>>    call->perf_events is still NULL
>> 
>> The issue: Assignment in step 1 may not be visible to CPU1 due to
>> missing memory barriers before step 2 sets TP_FLAG_PROFILE flag.

A better explanation of the issue would be: CPU1 sees that kprobe
functionality is enabled but does not see that perf_events has been
assigned.

Add a matching pair of write and read memory barriers to guarantee that
if CPU1 sees that kprobe functionality is enabled, it also sees that
perf_events has been assigned.
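
As a rough userspace sketch of that pairing (not the actual
trace_kprobe code): C11 fences stand in for smp_wmb()/smp_rmb() here,
and they are slightly stronger than the kernel primitives. The struct,
the flag value, and the function names are all made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct trace_event_call; not the kernel layout. */
struct model_call {
	int *perf_events;	/* payload assigned in step 1 */
	atomic_uint flags;	/* "probe enabled" flag set in step 2 */
};

#define MODEL_FLAG_PROFILE 0x1u	/* illustrative, plays the role of TP_FLAG_PROFILE */

/* Writer side (CPU0): steps 1 and 2 with a write barrier in between. */
static void model_enable(struct model_call *call, int *list)
{
	assert(list != NULL);	/* publishing NULL would defeat the purpose */

	call->perf_events = list;			/* step 1 */
	atomic_thread_fence(memory_order_release);	/* roughly smp_wmb() */
	atomic_fetch_or_explicit(&call->flags, MODEL_FLAG_PROFILE,
				 memory_order_relaxed);	/* step 2 */
}

/* Reader side (CPU1): steps 4 and 5. Returns NULL if the flag is not set. */
static int *model_dispatch(struct model_call *call)
{
	unsigned int flags;

	flags = atomic_load_explicit(&call->flags, memory_order_relaxed);
	if (!(flags & MODEL_FLAG_PROFILE))		/* step 4 */
		return NULL;

	atomic_thread_fence(memory_order_acquire);	/* roughly smp_rmb() */
	return call->perf_events;			/* step 5: sees step 1 */
}
```

The point is that neither barrier is useful alone: the writer-side fence
orders the two stores, and the reader-side fence orders the two loads.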

Note that this could also be done more efficiently using a store_release
when setting the flag (in step 2) and a load_acquire when loading the
flag (in step 4).
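
A sketch of that variant, again as a userspace model with C11 atomics
rather than real kernel code. In the kernel the corresponding helpers
would be smp_store_release() and smp_load_acquire(); whether the flag
update in trace_kprobe can be expressed that way is an assumption on my
part, and all names below are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct trace_event_call; not the kernel layout. */
struct sketch_call {
	int *perf_events;	/* payload assigned in step 1 */
	atomic_uint flags;	/* "probe enabled" flag set in step 2 */
};

#define SKETCH_FLAG_PROFILE 0x1u	/* illustrative flag bit */

/* Writer side (CPU0): the release ordering is attached to the flag update. */
static void sketch_enable(struct sketch_call *call, int *list)
{
	assert(list != NULL);

	call->perf_events = list;			/* step 1 */
	atomic_fetch_or_explicit(&call->flags, SKETCH_FLAG_PROFILE,
				 memory_order_release);	/* step 2, release */
}

/* Reader side (CPU1): the acquire ordering is attached to the flag load. */
static int *sketch_dispatch(struct sketch_call *call)
{
	unsigned int flags;

	flags = atomic_load_explicit(&call->flags,
				     memory_order_acquire);	/* step 4, acquire */
	if (!(flags & SKETCH_FLAG_PROFILE))
		return NULL;

	return call->perf_events;			/* step 5: sees step 1 */
}
```

Here the ordering is tied to the flag access itself, so no standalone
barrier is needed on either side.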

John Ogness

