From: Leon Hwang <leon.hwang@linux.dev>
To: Steven Rostedt <rostedt@goodmis.org>,
LKML <linux-kernel@vger.kernel.org>,
Linux trace kernel <linux-trace-kernel@vger.kernel.org>,
bpf@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Mark Rutland <mark.rutland@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
Namhyung Kim <namhyung@kernel.org>,
Takaya Saeki <takayas@google.com>,
Douglas Raillard <douglas.raillard@arm.com>,
Tom Zanussi <zanussi@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ian Rogers <irogers@google.com>, Jiri Olsa <olsajiri@gmail.com>
Subject: Re: [PATCH v2] tracing/probes: Allow use of BTF names to dereference pointers
Date: Mon, 18 May 2026 18:45:11 +0800
Message-ID: <80621876-f151-4373-aab9-336a2c483d95@linux.dev>
In-Reply-To: <20260516173310.1dbad146@fedora>
On 17/5/26 05:33, Steven Rostedt wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Add syntax to the FETCHARGS parsing of probes to allow the use of
> structure and member names to get the offsets to dereference pointers.
>
> Currently, a dereference must be a number, where the user has to figure
> out manually the offset of a member of a structure that they want to
> reference. For example, to get the size of a kmem_cache that was passed to
> the function kmem_cache_alloc_noprof, one would need to do:
>
> # cd /sys/kernel/tracing
> # echo 'f:cache kmem_cache_alloc_noprof size=+0x18($arg1):u32' >> dynamic_events
>
> This requires knowing that the offset of size is 0x18, which can be found
> with gdb:
>
> (gdb) p &((struct kmem_cache *)0)->size
> $1 = (unsigned int *) 0x18
>
> If BTF is in the kernel, it can be used to find this with names, where the
> user doesn't need to find the actual offset:
>
> # echo 'f:cache kmem_cache_alloc_noprof size=+kmem_cache.size($arg1):u32' >> dynamic_events
>
> Instead of the "+0x18", it would have "+kmem_cache.size" where the format is:
>
> +STRUCT.MEMBER[.MEMBER[..]]
>
> The delimiter is '.' and the first item is the structure name. Then the
> member of the structure to get the offset of. If that member is an
> embedded structure, another '.MEMBER' may be added to get the offset of
> its members with respect to the original value.
>
> "+kmem_cache.size($arg1)" is equivalent to:
>
> (*(struct kmem_cache *)$arg1).size
>
> Anonymous structures are also handled:
>
> # echo 'e:xmit net.net_dev_xmit +net_device.name(+sk_buff.dev($skbaddr)):string' >> dynamic_events
>
> Where "+net_device.name(+sk_buff.dev($skbaddr))" is equivalent to:
>
> (*(struct net_device *)((*(struct sk_buff *)($skbaddr)).dev)->name)
>
> Note that "dev" of struct sk_buff is inside an anonymous structure:
>
> struct sk_buff {
> union {
> struct {
> /* These two members must be first to match sk_buff_head. */
> struct sk_buff *next;
> struct sk_buff *prev;
>
> union {
> struct net_device *dev;
> [..]
> };
> };
> [..]
> };
>
> This will allow up to three deep of anonymous structures before it will
> fail to find a member.
>
> The above produces:
>
> sshd-session-1080 [000] b..5. 1526.337161: xmit: (net.net_dev_xmit) arg1="enp7s0"
>
> And nested structures can be found by adding more members to the arg:
>
> # echo 'f:read filemap_readahead.isra.0 file=+0(+dentry.d_name.name(+file.f_path.dentry($arg2))):string' >> dynamic_events
>
> The above is equivalent to:
>
> *((*(struct dentry *)(*(struct file *)$arg2).f_path.dentry)->d_name.name)
>
> And produces:
>
> trace-cmd-1381 [002] ...1. 2082.676268: read: (filemap_readahead.isra.0+0x0/0x150) file="trace.dat"
>
Hi Steve,
It is great to see BTF being integrated into tracing.
I'm glad to share my BPF tool, bpfsnoop [1], which takes a similar
approach to inspecting argument data.
Read device name:
  bpfsnoop -t net_dev_xmit --output-arg 'str(skb->dev->name)' --limit-events 20
  - net_dev_xmit[tp] args=((struct sk_buff *)skb=0xffff88818821d4e8, (int)rc=0, (struct net_device *)dev=0xffff88984ba64000, (unsigned int)skb_len=0x1f2/498) cpu=2 process=(0:swapper/2) timestamp=18:06:17.309492697
  Arg attrs: (array(char[16]))'str(skb->dev->name)'="eth0"
Read dentry name:
  bpfsnoop -k 'vfs_read' \
      --output-arg 'str((file->f_path.dentry)->d_name.name)' --limit-events 20
  ← vfs_read args=((struct file *)file=0xffff888175e08400, (char *)buf=0x55c7a1168400(0x0/0), (size_t)count=0x10000/65536, (loff_t *)pos=0xffffc9000f707bb0(0)) retval=(long int)510 cpu=3 process=(339834:sudo) timestamp=18:24:16.22021166
  Arg attrs: (unsigned char *)'str((file->f_path.dentry)->d_name.name)'="ptmx"
bpfsnoop provides a friendly way to inspect argument data using C
expressions. Under the hood, it compiles the C expressions specified by
--filter-arg/--output-arg into BPF bytecode, resolving struct/union member
accesses with BTF. (I have been too lazy to document its internals, but
you can study them with AI assistance.)
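To give a rough idea of what resolving member accesses involves, here is a
minimal Python sketch. It is not bpfsnoop's actual code: the TYPES table is a
hypothetical stand-in for BTF, every offset in it is made up for illustration,
and resolve() and to_fetcharg() are names invented for this example. The
sketch handles only plain "->" and "." chains (no parentheses, casts or array
indexing) and turns an expression into the list of offsets at which a value
has to be loaded.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    offset: int               # byte offset inside the parent struct
    type_name: str            # type of the member (or of its pointee)
    is_pointer: bool = False  # True if the member stores a pointer


# Hypothetical layout table standing in for BTF; offsets are NOT real.
TYPES = {
    "sk_buff": {"dev": Member(16, "net_device", is_pointer=True)},
    "net_device": {"name": Member(0, "char[16]")},
    "file": {"f_path": Member(64, "path")},
    "path": {"dentry": Member(8, "dentry", is_pointer=True)},
    "dentry": {"d_name": Member(32, "qstr")},
    "qstr": {"name": Member(8, "unsigned char", is_pointer=True)},
}


def resolve(root_type, expr):
    """Return the offsets to load at, innermost first.

    The variable at the start of expr is assumed to be a pointer to
    struct root_type.
    """
    cur_type, cur_is_ptr = root_type, True
    pending = 0   # offset accumulated since the last pointer load
    loads = []
    first = True
    for op, name in re.findall(r"(->|\.)(\w+)", expr):
        if op == "->":
            if not cur_is_ptr:
                raise ValueError(f"'->{name}' applied to a non-pointer")
            if not first:
                # Load the pointer stored at the accumulated offset,
                # then start counting again from the pointee.
                loads.append(pending)
                pending = 0
        elif cur_is_ptr:
            raise ValueError(f"'.{name}' applied to a pointer")
        member = TYPES[cur_type][name]
        pending += member.offset
        cur_type, cur_is_ptr = member.type_name, member.is_pointer
        first = False
    loads.append(pending)
    return loads


def to_fetcharg(base, loads):
    """Render the offsets in the numeric +OFFSET(...) dereference form."""
    out = base
    for off in loads:
        out = f"+{off:#x}({out})"
    return out


print(to_fetcharg("$skbaddr", resolve("sk_buff", "skb->dev->name")))
print(to_fetcharg("$arg2", resolve("file", "file->f_path.dentry->d_name.name")))
```

Embedded structures (".") only add to the running offset, while each "->"
after the first forces a pointer load, which is why the nested members
f_path.dentry and d_name.name each collapse into a single offset.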
Crazy as it sounds, after developing this feature for bpfsnoop, I wondered
whether to embed a lightweight C compiler into the tracing tools: it would
compile a C expression into BPF bytecode and then load the BPF program to
filter/output arguments, so that users could filter/output arguments using
C expressions. The idea seemed too crazy to post to the trace mailing list
at the time, as I wasn't familiar with the tracing infrastructure.
[1] https://github.com/bpfsnoop/bpfsnoop/
Thanks,
Leon