From: Wen Yang <wen.yang@linux.dev>
To: Gabriele Monaco <gmonaco@redhat.com>
Cc: linux-trace-kernel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/9] rv: add generic uprobe infrastructure for RV monitors
Date: Wed, 1 Jul 2026 02:44:45 +0800 [thread overview]
Message-ID: <d1e84072-7968-4a64-9497-09db83b38682@linux.dev> (raw)
In-Reply-To: <ebde84b7c65cd6b01a1e8d124de0b1eaf0793006.camel@redhat.com>
On 6/30/26 16:48, Gabriele Monaco wrote:
> On Mon, 2026-06-29 at 00:47 +0800, Wen Yang wrote:
>> On [2-1]~[2-5]: embedded consumer causes UAF on PREEMPT_RT
>> -----------------------------------------------------------
>>
>> The uprobe_bind selftest oopses on PREEMPT_RT(full):
>>
>> handler_chain+0xc9: mov rax, [r15+0x18] ; advance list iterator
>> RAX: 000015ec00001f28 ; garbage — &uprobe->consumers after kfree
>>
>> handler_chain() reads uc->cons_node.next after uc->handler() returns,
>> still inside rcu_read_lock_trace(). That pointer is &uprobe->consumers
>> (the list head embedded in struct uprobe), which gets freed through:
>>
>> put_uprobe()
>> -> schedule_work(uprobe_free_deferred) /* async */
>> -> call_srcu(&uretprobes_srcu, ...)
>> -> call_rcu_tasks_trace(kfree_uprobe)
>> -> kfree(uprobe)
>>
>> rv_uprobe_sync() calls uprobe_unregister_sync() which calls
>> synchronize_srcu(&uretprobes_srcu), but that only matters after the
>> kworker has submitted work to uretprobes_srcu. On a loaded PREEMPT_RT
>> box the kworker may not have run yet, so synchronize_srcu() returns
>> immediately and kfree(uprobe) races with the still-iterating
>> handler_chain():
>>
>> CPU A CPU B
>> consumer_del() → list_del_rcu rcu_read_lock_trace()
>> put_uprobe() → schedule_work uc->handler() returns
>> rv_uprobe_sync(): reading cons_node.next...
>> synchronize_srcu(&uretprobes_srcu)
>
> If CPU B is in an RCU trace read-side critical section this doesn't
> return immediately, it returns after readers are done, doesn't it?
> (well, what you really care here is the synchronize_rcu_tasks_trace()
> but we do both).
>
>> ← idle; returns immediately
>> [kworker fires later]
>> kfree(uprobe) ← frees &uprobe->consumers
>> cons_node.next = freed mem → CRASH
>>
>
> So I had a bit of a look and as far as I understand, we don't have any
> control over the uprobe allocation pattern (workqueues and whatnot) and
> we don't really care as long as we deregister it appropriately.
>
> What we do control is the uprobe_consumer, that must be freed only after
> the uprobe was synchronously deregistered, that should guarantee no
> reader is going to reference it, shouldn't it?
>
> uprobe_unregister_nosync() removes it from the cons_node list, so it's
> safe to assume any handler_chain() after the next RCU-trace grace period
> won't see it.
>
> In the sketch I sent this is happening (all kfree(b) are after
> rv_uprobe_unregister() which does the sync). What am I missing here?
>
>
> uprobe is also freed after both grace periods, so no reader should use
> that either.
>
> The situation you're seeing isn't fully clear to me, I applied my sketch
> and don't see any splat on a vng box.
>
>> With uc embedded in the binding (as [2-1] suggests), no amount of
>> delaying kfree(binding) helps: uprobe->consumers is freed by a chain we
>> don't control. The fix is to keep uc->cons_node in memory that outlives
>> the handler_chain() iteration, which means a separate allocation freed
>> only after rv_uprobe_sync():
>>
>> rv_uprobe_unregister_nosync() /* list_del_rcu + schedule_work */
>> rv_uprobe_sync() /* waits for any already-submitted
>> srcu work */
>
> We of course need to do any free after this sync, but I don't get why we
> need an additional allocation since the following are both plain
> synchronous frees, why isn't a single one (on b) enough?
>
Thank you for the patient explanation.
You are right, and v4 implements the embedded approach.
struct rv_uprobe directly embeds struct uprobe_consumer:
struct rv_uprobe {
struct uprobe_consumer uc;
struct uprobe *uprobe;
struct inode *inode;
void *priv;
int (*handler)(...);
int (*ret_handler)(...);
};
rv_uprobe_free() is gone — no allocation means no explicit free. After
rv_uprobe_unregister() (or rv_uprobe_unregister_nosync() +
rv_uprobe_sync()), the caller frees the containing struct directly. In
tlob:
rv_uprobe_unregister_nosync(&b->start_probe);
rv_uprobe_unregister_nosync(&b->stop_probe);
rv_uprobe_sync();
kfree(b); /* frees both embedded consumers */
This is the pattern from your sketch.
Regarding my earlier reply that argued for the separate allocation: I
was wrong. The key barrier is synchronize_rcu_tasks_trace() (called
first in uprobe_unregister_sync()), which waits for all
rcu_read_lock_trace() readers including handler_chain(). After it
returns, no cons_node.next read is in flight and embedding is safe.
We appreciate your thorough review. All of your comments have been
addressed in v4.
We'll run local tests for one or two days, and then it will be sent out
shortly.
--
Best wishes,
Wen
next prev parent reply other threads:[~2026-06-30 18:44 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-07 16:13 [PATCH v3 0/9] rv/tlob: Add task latency over budget RV monitor wen.yang
2026-06-07 16:13 ` [PATCH v3 1/9] rv/da: introduce DA_MON_ALLOCATION_STRATEGY wen.yang
2026-06-15 9:56 ` Gabriele Monaco
2026-06-07 16:13 ` [PATCH v3 2/9] rv: add generic uprobe infrastructure for RV monitors wen.yang
2026-06-16 9:49 ` Gabriele Monaco
2026-06-28 16:47 ` Wen Yang
2026-06-30 8:48 ` Gabriele Monaco
2026-06-30 18:44 ` Wen Yang [this message]
2026-06-07 16:13 ` [PATCH v3 3/9] rv/tlob: add tlob model DOT file wen.yang
2026-06-07 16:13 ` [PATCH v3 4/9] rv/ha: fix ha_invariant_passed_ns silent bypass of invariant check wen.yang
2026-06-15 10:12 ` Gabriele Monaco
2026-06-07 16:13 ` [PATCH v3 5/9] rv/ha: make da_monitor_reset_hook and EVENT_NONE_LBL overridable wen.yang
2026-06-15 10:16 ` Gabriele Monaco
2026-06-07 16:13 ` [PATCH v3 6/9] rv/tlob: add tlob hybrid automaton monitor wen.yang
2026-06-15 15:24 ` Gabriele Monaco
2026-06-07 16:13 ` [PATCH v3 7/9] rv/tlob: add KUnit tests for the tlob monitor wen.yang
2026-06-17 7:49 ` Gabriele Monaco
2026-06-07 16:13 ` [PATCH v3 8/9] selftests/verification: fix verificationtest-ktap for out-of-tree execution wen.yang
2026-06-16 11:14 ` Gabriele Monaco
2026-06-07 16:13 ` [PATCH v3 9/9] selftests/verification: add tlob selftests wen.yang
2026-06-16 14:58 ` Gabriele Monaco
2026-06-17 15:09 ` Gabriele Monaco
2026-06-22 9:26 ` Gabriele Monaco
2026-06-13 16:00 ` [PATCH v3 0/9] rv/tlob: Add task latency over budget RV monitor Wen Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d1e84072-7968-4a64-9497-09db83b38682@linux.dev \
--to=wen.yang@linux.dev \
--cc=gmonaco@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox