* Re: [PATCH v2] tracing/probes: Allow use of BTF names to dereference pointers
2026-05-16 21:33 [PATCH v2] tracing/probes: Allow use of BTF names to dereference pointers Steven Rostedt
@ 2026-05-17 2:22 ` Masami Hiramatsu
2026-05-18 6:17 ` Masami Hiramatsu
2026-05-18 10:45 ` Leon Hwang
2 siblings, 0 replies; 4+ messages in thread
From: Masami Hiramatsu @ 2026-05-17 2:22 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, Linux trace kernel, bpf, Masami Hiramatsu,
Mathieu Desnoyers, Mark Rutland, Peter Zijlstra, Namhyung Kim,
Takaya Saeki, Douglas Raillard, Tom Zanussi, Andrew Morton,
Thomas Gleixner, Ian Rogers, Jiri Olsa
On Sat, 16 May 2026 17:33:09 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Add syntax to the FETCHARGS parsing of probes to allow the use of
> structure and member names to get the offsets to dereference pointers.
>
> Currently, a dereference must be a number, where the user has to figure
> out manually the offset of a member of a structure that they want to
> reference. For example, to get the size of a kmem_cache that was passed to
> the function kmem_cache_alloc_noprof, one would need to do:
>
> # cd /sys/kernel/tracing
> # echo 'f:cache kmem_cache_alloc_noprof size=+0x18($arg1):u32' >> dynamic_events
>
> This requires knowing that the offset of size is 0x18, which can be found
> with gdb:
>
> (gdb) p &((struct kmem_cache *)0)->size
> $1 = (unsigned int *) 0x18
>
> If BTF is in the kernel, it can be used to find this with names, where the
> user doesn't need to find the actual offset:
>
> # echo 'f:cache kmem_cache_alloc_noprof size=+kmem_cache.size($arg1):u32' >> dynamic_events
>
> Instead of the "+0x18", it would have "+kmem_cache.size" where the format is:
>
> +STRUCT.MEMBER[.MEMBER[..]]
>
> The delimiter is '.' and the first item is the structure name. Then the
> member of the structure to get the offset of. If that member is an
> embedded structure, another '.MEMBER' may be added to get the offset of
> its members with respect to the original value.
>
> "+kmem_cache.size($arg1)" is equivalent to:
>
> (*(struct kmem_cache *)$arg1).size
>
> Anonymous structures are also handled:
>
> # echo 'e:xmit net.net_dev_xmit +net_device.name(+sk_buff.dev($skbaddr)):string' >> dynamic_events
>
> Where "+net_device.name(+sk_buff.dev($skbaddr))" is equivalent to:
>
> (*(struct net_device *)((*(struct sk_buff *)($skbaddr)).dev)->name)
>
> Note that "dev" of struct sk_buff is inside an anonymous structure:
>
> struct sk_buff {
> union {
> struct {
> /* These two members must be first to match sk_buff_head. */
> struct sk_buff *next;
> struct sk_buff *prev;
>
> union {
> struct net_device *dev;
> [..]
> };
> };
> [..]
> };
>
> This will allow up to three deep of anonymous structures before it will
> fail to find a member.
>
> The above produces:
>
> sshd-session-1080 [000] b..5. 1526.337161: xmit: (net.net_dev_xmit) arg1="enp7s0"
>
> And nested structures can be found by adding more members to the arg:
>
> # echo 'f:read filemap_readahead.isra.0 file=+0(+dentry.d_name.name(+file.f_path.dentry($arg2))):string' >> dynamic_events
>
> The above is equivalent to:
>
> *((*(struct dentry *)(*(struct file *)$arg2).f_path.dentry)->d_name.name)
>
> And produces:
>
> trace-cmd-1381 [002] ...1. 2082.676268: read: (filemap_readahead.isra.0+0x0/0x150) file="trace.dat"
>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Thanks for updating it! I think this looks good to me.
Let me test it. I think we also need updating selftests.
Thank you,
> ---
> Changes since v1: https://patch.msgid.link/20250729113335.2e4f087d@batman.local.home
>
> - Pass error info back so that error_log can display what went wrong
> (Masami Hiramatsu)
>
> - Use __btf_member_bit_offset() to get nested struct member offsets
> (Douglas Raillard)
>
> - Add btf_put(btf) (Jiri Olsa)
>
> Documentation/trace/kprobetrace.rst | 3 +
> kernel/trace/trace_btf.c | 111 ++++++++++++++++++++++++++++
> kernel/trace/trace_btf.h | 10 +++
> kernel/trace/trace_probe.c | 19 ++++-
> kernel/trace/trace_probe.h | 4 +-
> 5 files changed, 144 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst
> index 3b6791c17e9b..00273157100c 100644
> --- a/Documentation/trace/kprobetrace.rst
> +++ b/Documentation/trace/kprobetrace.rst
> @@ -54,6 +54,8 @@ Synopsis of kprobe_events
> $retval : Fetch return value.(\*2)
> $comm : Fetch current task comm.
> +|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*3)(\*4)
> + +STRUCT.MEMBER[.MEMBER[..]](FETCHARG) : If BTF is supported, Fetch memory
> + at FETCHARG + the offset of MEMBER inside of STRUCT.(\*5)
> \IMM : Store an immediate value to the argument.
> NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
> FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
> @@ -70,6 +72,7 @@ Synopsis of kprobe_events
> accesses one register.
> (\*3) this is useful for fetching a field of data structures.
> (\*4) "u" means user-space dereference. See :ref:`user_mem_access`.
> + (\*5) +STRUCT.MEMBER(FETCHARG) is equivalent to (*(struct STRUCT *)(FETCHARG)).MEMBER
>
> Function arguments at kretprobe
> -------------------------------
> diff --git a/kernel/trace/trace_btf.c b/kernel/trace/trace_btf.c
> index 00172f301f25..be88cc4d97dd 100644
> --- a/kernel/trace/trace_btf.c
> +++ b/kernel/trace/trace_btf.c
> @@ -120,3 +120,114 @@ const struct btf_member *btf_find_struct_member(struct btf *btf,
> return member;
> }
>
> +#define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3)
> +
> +static int find_member(const char *ptr, struct btf *btf,
> + const struct btf_type **type, int level)
> +{
> + const struct btf_member *member;
> + const struct btf_type *t = *type;
> + int i;
> +
> + /* Max of 3 depth of anonymous structures */
> + if (level > 3)
> + return -E2BIG;
> +
> + for_each_member(i, t, member) {
> + const char *tname = btf_name_by_offset(btf, member->name_off);
> +
> + if (strcmp(ptr, tname) == 0) {
> + int offset = __btf_member_bit_offset(t, member);
> + *type = btf_type_by_id(btf, member->type);
> + return BITS_ROUNDDOWN_BYTES(offset);
> + }
> +
> + /* Handle anonymous structures */
> + if (strlen(tname))
> + continue;
> +
> + *type = btf_type_by_id(btf, member->type);
> + if (btf_type_is_struct(*type)) {
> + int offset = find_member(ptr, btf, type, level + 1);
> +
> + if (offset < 0)
> + continue;
> +
> + return offset + BITS_ROUNDDOWN_BYTES(member->offset);
> + }
> + }
> +
> + return -ENOENT;
> +}
> +
> +/**
> + * btf_find_offset - Find an offset of a member for a structure
> + * @arg: A structure name followed by one or more members
> + * @offset_p: A pointer to where to store the offset
> + *
> + * Will parse @arg with the expected format of: struct.member[[.member]..]
> + * It is delimited by '.'. The first item must be a structure type.
> + * The next are its members. If the member is also of a structure type it
> + * another member may follow ".member".
> + *
> + * Note, @arg is modified but will be put back to what it was on return.
> + *
> + * Returns: 0 on success and -EINVAL if no '.' is present
> + * or -ENXIO if the structure or member is not found.
> + * Returns -EINVAL if BTF is not defined.
> + * On success, @offset_p will contain the offset of the member specified
> + * by @arg.
> + */
> +int btf_find_offset(char *arg, long *offset_p)
> +{
> + const struct btf_type *t;
> + struct btf *btf;
> + long offset = 0;
> + char *ptr;
> + int ret;
> + s32 id;
> +
> + ptr = strchr(arg, '.');
> + if (!ptr)
> + return -EINVAL;
> +
> + *ptr = '\0';
> +
> + ret = -ENXIO;
> + id = bpf_find_btf_id(arg, BTF_KIND_STRUCT, &btf);
> + if (id < 0)
> + goto error;
> +
> + /* Get BTF_KIND_FUNC type */
> + t = btf_type_by_id(btf, id);
> +
> + /* May allow more than one member, as long as they are structures */
> + do {
> + ret = -ENXIO;
> + if (!t || !btf_type_is_struct(t))
> + goto error;
> +
> + *ptr++ = '.';
> + arg = ptr;
> + ptr = strchr(ptr, '.');
> + if (ptr)
> + *ptr = '\0';
> +
> + ret = find_member(arg, btf, &t, 0);
> + if (ret < 0)
> + goto error;
> +
> + offset += ret;
> +
> + } while (ptr);
> +
> + btf_put(btf);
> + *offset_p = offset;
> + return 0;
> +
> +error:
> + btf_put(btf);
> + if (ptr)
> + *ptr = '.';
> + return ret;
> +}
> diff --git a/kernel/trace/trace_btf.h b/kernel/trace/trace_btf.h
> index 4bc44bc261e6..7b0797a6050b 100644
> --- a/kernel/trace/trace_btf.h
> +++ b/kernel/trace/trace_btf.h
> @@ -9,3 +9,13 @@ const struct btf_member *btf_find_struct_member(struct btf *btf,
> const struct btf_type *type,
> const char *member_name,
> u32 *anon_offset);
> +
> +#ifdef CONFIG_PROBE_EVENTS_BTF_ARGS
> +/* Will modify arg, but will put it back before returning. */
> +int btf_find_offset(char *arg, long *offset);
> +#else
> +static inline int btf_find_offset(char *arg, long *offset)
> +{
> + return -EINVAL;
> +}
> +#endif
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index e0d3a0da26af..6fcede2de1a5 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -1165,7 +1165,7 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
>
> case '+': /* deref memory */
> case '-':
> - if (arg[1] == 'u') {
> + if (arg[1] == 'u' && isdigit(arg[2])) {
> deref = FETCH_OP_UDEREF;
> arg[1] = arg[0];
> arg++;
> @@ -1178,7 +1178,22 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
> return -EINVAL;
> }
> *tmp = '\0';
> - ret = kstrtol(arg, 0, &offset);
> + if (arg[0] != '-' && !isdigit(*arg)) {
> + int err = 0;
> + ret = btf_find_offset(arg, &offset);
> + switch (ret) {
> + case -ENODEV: err = TP_ERR_NOSUP_BTFARG; break;
> + case -E2BIG: err = TP_ERR_MEMBER_TOO_DEEP; break;
> + case -EINVAL: err = TP_ERR_BAD_STRUCT_FMT; break;
> + case -ENXIO: err = TP_ERR_BAD_BTF_TID; break;
> + }
> + if (err)
> + __trace_probe_log_err(ctx->offset, err);
> + if (ret < 0)
> + return ret;
> + } else {
> + ret = kstrtol(arg, 0, &offset);
> + }
> if (ret) {
> trace_probe_log_err(ctx->offset, BAD_DEREF_OFFS);
> break;
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index 262d8707a3df..d649bb9f5b7c 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -563,7 +563,9 @@ extern int traceprobe_define_arg_fields(struct trace_event_call *event_call,
> C(NEED_STRING_TYPE, "$comm and immediate-string only accepts string type"),\
> C(TOO_MANY_ARGS, "Too many arguments are specified"), \
> C(TOO_MANY_EARGS, "Too many entry arguments specified"), \
> - C(EVENT_TOO_BIG, "Event too big (too many fields?)"),
> + C(EVENT_TOO_BIG, "Event too big (too many fields?)"), \
> + C(MEMBER_TOO_DEEP, "Too many indirections of anonymous structure"), \
> + C(BAD_STRUCT_FMT, "Unknown BTF structure"),
>
> #undef C
> #define C(a, b) TP_ERR_##a
> --
> 2.53.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH v2] tracing/probes: Allow use of BTF names to dereference pointers
2026-05-16 21:33 [PATCH v2] tracing/probes: Allow use of BTF names to dereference pointers Steven Rostedt
2026-05-17 2:22 ` Masami Hiramatsu
@ 2026-05-18 6:17 ` Masami Hiramatsu
2026-05-18 10:45 ` Leon Hwang
2 siblings, 0 replies; 4+ messages in thread
From: Masami Hiramatsu @ 2026-05-18 6:17 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, Linux trace kernel, bpf, Masami Hiramatsu,
Mathieu Desnoyers, Mark Rutland, Peter Zijlstra, Namhyung Kim,
Takaya Saeki, Douglas Raillard, Tom Zanussi, Andrew Morton,
Thomas Gleixner, Ian Rogers, Jiri Olsa
On Sat, 16 May 2026 17:33:09 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> Function arguments at kretprobe
> -------------------------------
> diff --git a/kernel/trace/trace_btf.c b/kernel/trace/trace_btf.c
> index 00172f301f25..be88cc4d97dd 100644
> --- a/kernel/trace/trace_btf.c
> +++ b/kernel/trace/trace_btf.c
> @@ -120,3 +120,114 @@ const struct btf_member *btf_find_struct_member(struct btf *btf,
> return member;
> }
>
> +#define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3)
> +
> +static int find_member(const char *ptr, struct btf *btf,
> + const struct btf_type **type, int level)
We already have a similar function.
/*
* Find a member of data structure/union by name and return it.
* Return NULL if not found, or -EINVAL if parameter is invalid.
* If the member is an member of anonymous union/structure, the offset
* of that anonymous union/structure is stored into @anon_offset. Caller
* can calculate the correct offset from the root data structure by
* adding anon_offset to the member's offset.
*/
const struct btf_member *btf_find_struct_member(struct btf *btf,
const struct btf_type *type,
const char *member_name,
u32 *anon_offset)
And this seems useful for you,
> +{
> + const struct btf_member *member;
> + const struct btf_type *t = *type;
> + int i;
> +
> + /* Max of 3 depth of anonymous structures */
> + if (level > 3)
> + return -E2BIG;
> +
> + for_each_member(i, t, member) {
> + const char *tname = btf_name_by_offset(btf, member->name_off);
> +
> + if (strcmp(ptr, tname) == 0) {
> + int offset = __btf_member_bit_offset(t, member);
> + *type = btf_type_by_id(btf, member->type);
> + return BITS_ROUNDDOWN_BYTES(offset);
> + }
> +
> + /* Handle anonymous structures */
> + if (strlen(tname))
> + continue;
> +
> + *type = btf_type_by_id(btf, member->type);
> + if (btf_type_is_struct(*type)) {
> + int offset = find_member(ptr, btf, type, level + 1);
> +
> + if (offset < 0)
> + continue;
> +
> + return offset + BITS_ROUNDDOWN_BYTES(member->offset);
> + }
> + }
> +
> + return -ENOENT;
> +}
> +
> +/**
> + * btf_find_offset - Find an offset of a member for a structure
> + * @arg: A structure name followed by one or more members
> + * @offset_p: A pointer to where to store the offset
> + *
> + * Will parse @arg with the expected format of: struct.member[[.member]..]
> + * It is delimited by '.'. The first item must be a structure type.
> + * The next are its members. If the member is also of a structure type it
> + * another member may follow ".member".
> + *
> + * Note, @arg is modified but will be put back to what it was on return.
> + *
> + * Returns: 0 on success and -EINVAL if no '.' is present
> + * or -ENXIO if the structure or member is not found.
> + * Returns -EINVAL if BTF is not defined.
> + * On success, @offset_p will contain the offset of the member specified
> + * by @arg.
> + */
> +int btf_find_offset(char *arg, long *offset_p)
And this can share the inner loop of parse_btf_field().
(BTW, parse_btf_field() supports bitfield, but this does not.)
> +{
> + const struct btf_type *t;
> + struct btf *btf;
As Sashiko pointed, btf = NULL here.
But I think we would better to have DEFINE_FREE(btf_put).
> + long offset = 0;
> + char *ptr;
> + int ret;
> + s32 id;
> +
> + ptr = strchr(arg, '.');
> + if (!ptr)
> + return -EINVAL;
> +
> + *ptr = '\0';
> +
> + ret = -ENXIO;
> + id = bpf_find_btf_id(arg, BTF_KIND_STRUCT, &btf);
> + if (id < 0)
> + goto error;
> +
> + /* Get BTF_KIND_FUNC type */
> + t = btf_type_by_id(btf, id);
> +
> + /* May allow more than one member, as long as they are structures */
> + do {
> + ret = -ENXIO;
> + if (!t || !btf_type_is_struct(t))
> + goto error;
> +
> + *ptr++ = '.';
> + arg = ptr;
> + ptr = strchr(ptr, '.');
> + if (ptr)
> + *ptr = '\0';
> +
So, if you don't share the inner loop, you can just reuse
btf_field_struct_member() as below (this should be equivalent
to find_member()):
field = btf_find_struct_member(btf, t, arg, &anon_offs);
if (IS_ERR_OR_NULL(field)) {
ret = field ? PTR_ERR(field) : -ENOENT;
goto error;
}
offset += field->offset + anon_offs;
> +
> + } while (ptr);
> +
> + btf_put(btf);
> + *offset_p = offset;
> + return 0;
> +
> +error:
> + btf_put(btf);
> + if (ptr)
> + *ptr = '.';
> + return ret;
> +}
> diff --git a/kernel/trace/trace_btf.h b/kernel/trace/trace_btf.h
> index 4bc44bc261e6..7b0797a6050b 100644
> --- a/kernel/trace/trace_btf.h
> +++ b/kernel/trace/trace_btf.h
> @@ -9,3 +9,13 @@ const struct btf_member *btf_find_struct_member(struct btf *btf,
> const struct btf_type *type,
> const char *member_name,
> u32 *anon_offset);
> +
> +#ifdef CONFIG_PROBE_EVENTS_BTF_ARGS
> +/* Will modify arg, but will put it back before returning. */
> +int btf_find_offset(char *arg, long *offset);
> +#else
> +static inline int btf_find_offset(char *arg, long *offset)
> +{
> + return -EINVAL;
> +}
> +#endif
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index e0d3a0da26af..6fcede2de1a5 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -1165,7 +1165,7 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
>
> case '+': /* deref memory */
> case '-':
> - if (arg[1] == 'u') {
> + if (arg[1] == 'u' && isdigit(arg[2])) {
As Sashiko pointed, This can be broken if STRUCT name starts
with "u[0-9]". What about using "()" for this case?
e.g.
+u(user_unreg.disable_addr)(uarg)
> deref = FETCH_OP_UDEREF;
> arg[1] = arg[0];
> arg++;
> @@ -1178,7 +1178,22 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
> return -EINVAL;
> }
> *tmp = '\0';
> - ret = kstrtol(arg, 0, &offset);
> + if (arg[0] != '-' && !isdigit(*arg)) {
> + int err = 0;
> + ret = btf_find_offset(arg, &offset);
> + switch (ret) {
> + case -ENODEV: err = TP_ERR_NOSUP_BTFARG; break;
> + case -E2BIG: err = TP_ERR_MEMBER_TOO_DEEP; break;
> + case -EINVAL: err = TP_ERR_BAD_STRUCT_FMT; break;
> + case -ENXIO: err = TP_ERR_BAD_BTF_TID; break;
This should have -ENOENT and default case for unknown error.
(There is TP_ERR_NO_BTF_FIELD already)
> + }
> + if (err)
> + __trace_probe_log_err(ctx->offset, err);
> + if (ret < 0)
> + return ret;
> + } else {
> + ret = kstrtol(arg, 0, &offset);
> + }
> if (ret) {
> trace_probe_log_err(ctx->offset, BAD_DEREF_OFFS);
> break;
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index 262d8707a3df..d649bb9f5b7c 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -563,7 +563,9 @@ extern int traceprobe_define_arg_fields(struct trace_event_call *event_call,
> C(NEED_STRING_TYPE, "$comm and immediate-string only accepts string type"),\
> C(TOO_MANY_ARGS, "Too many arguments are specified"), \
> C(TOO_MANY_EARGS, "Too many entry arguments specified"), \
> - C(EVENT_TOO_BIG, "Event too big (too many fields?)"),
> + C(EVENT_TOO_BIG, "Event too big (too many fields?)"), \
> + C(MEMBER_TOO_DEEP, "Too many indirections of anonymous structure"), \
> + C(BAD_STRUCT_FMT, "Unknown BTF structure"),
>
> #undef C
> #define C(a, b) TP_ERR_##a
> --
> 2.53.0
>
Thanks,
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH v2] tracing/probes: Allow use of BTF names to dereference pointers
2026-05-16 21:33 [PATCH v2] tracing/probes: Allow use of BTF names to dereference pointers Steven Rostedt
2026-05-17 2:22 ` Masami Hiramatsu
2026-05-18 6:17 ` Masami Hiramatsu
@ 2026-05-18 10:45 ` Leon Hwang
2 siblings, 0 replies; 4+ messages in thread
From: Leon Hwang @ 2026-05-18 10:45 UTC (permalink / raw)
To: Steven Rostedt, LKML, Linux trace kernel, bpf
Cc: Masami Hiramatsu, Mathieu Desnoyers, Mark Rutland, Peter Zijlstra,
Namhyung Kim, Takaya Saeki, Douglas Raillard, Tom Zanussi,
Andrew Morton, Thomas Gleixner, Ian Rogers, Jiri Olsa
On 17/5/26 05:33, Steven Rostedt wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Add syntax to the FETCHARGS parsing of probes to allow the use of
> structure and member names to get the offsets to dereference pointers.
>
> Currently, a dereference must be a number, where the user has to figure
> out manually the offset of a member of a structure that they want to
> reference. For example, to get the size of a kmem_cache that was passed to
> the function kmem_cache_alloc_noprof, one would need to do:
>
> # cd /sys/kernel/tracing
> # echo 'f:cache kmem_cache_alloc_noprof size=+0x18($arg1):u32' >> dynamic_events
>
> This requires knowing that the offset of size is 0x18, which can be found
> with gdb:
>
> (gdb) p &((struct kmem_cache *)0)->size
> $1 = (unsigned int *) 0x18
>
> If BTF is in the kernel, it can be used to find this with names, where the
> user doesn't need to find the actual offset:
>
> # echo 'f:cache kmem_cache_alloc_noprof size=+kmem_cache.size($arg1):u32' >> dynamic_events
>
> Instead of the "+0x18", it would have "+kmem_cache.size" where the format is:
>
> +STRUCT.MEMBER[.MEMBER[..]]
>
> The delimiter is '.' and the first item is the structure name. Then the
> member of the structure to get the offset of. If that member is an
> embedded structure, another '.MEMBER' may be added to get the offset of
> its members with respect to the original value.
>
> "+kmem_cache.size($arg1)" is equivalent to:
>
> (*(struct kmem_cache *)$arg1).size
>
> Anonymous structures are also handled:
>
> # echo 'e:xmit net.net_dev_xmit +net_device.name(+sk_buff.dev($skbaddr)):string' >> dynamic_events
>
> Where "+net_device.name(+sk_buff.dev($skbaddr))" is equivalent to:
>
> (*(struct net_device *)((*(struct sk_buff *)($skbaddr)).dev)->name)
>
> Note that "dev" of struct sk_buff is inside an anonymous structure:
>
> struct sk_buff {
> union {
> struct {
> /* These two members must be first to match sk_buff_head. */
> struct sk_buff *next;
> struct sk_buff *prev;
>
> union {
> struct net_device *dev;
> [..]
> };
> };
> [..]
> };
>
> This will allow up to three deep of anonymous structures before it will
> fail to find a member.
>
> The above produces:
>
> sshd-session-1080 [000] b..5. 1526.337161: xmit: (net.net_dev_xmit) arg1="enp7s0"
>
> And nested structures can be found by adding more members to the arg:
>
> # echo 'f:read filemap_readahead.isra.0 file=+0(+dentry.d_name.name(+file.f_path.dentry($arg2))):string' >> dynamic_events
>
> The above is equivalent to:
>
> *((*(struct dentry *)(*(struct file *)$arg2).f_path.dentry)->d_name.name)
>
> And produces:
>
> trace-cmd-1381 [002] ...1. 2082.676268: read: (filemap_readahead.isra.0+0x0/0x150) file="trace.dat"
>
Hi Steve,
Great to see that BTF is going to be nested into trace.
I'm glad to share my BPF tool, bpfsnoop [1], that utilizes the similar
way to inspect argument's data.
Read device name:
bpfsnoop -t net_dev_xmit --output-arg 'str(skb->dev->name)'
--limit-events 20
- net_dev_xmit[tp] args=((struct sk_buff *)skb=0xffff88818821d4e8,
(int)rc=0, (struct net_device *)dev=0xffff88984ba64000, (unsigned
int)skb_len=0x1f2/498) cpu=2 process=(0:swapper/2)
timestamp=18:06:17.309492697
Arg attrs: (array(char[16]))'str(skb->dev->name)'="eth0"
Read dentry name:
bpfsnoop -k 'vfs_read' --output-arg
'str((file->f_path.dentry)->d_name.name)' --limit-events 20
← vfs_read args=((struct file *)file=0xffff888175e08400, (char
*)buf=0x55c7a1168400(0x0/0), (size_t)count=0x10000/65536, (loff_t
*)pos=0xffffc9000f707bb0(0)) retval=(long int)510 cpu=3
process=(339834:sudo) timestamp=18:24:16.22021166
Arg attrs: (unsigned char *)'str((file->f_path.dentry)->d_name.name)'="ptmx"
In bpfsnoop, it provides a friendly way to inspect argument's data using
C expressions. Under the hood, it compiles the C expressions, specified
by --filter-arg/--output-arg, into BPF byte code by parsing the
struct/union member access with BTF. (I'm too lazy to write documents to
explain its internal details. But you can study it with AI assistance.)
Insanely, after developing such feature for bpfsnoop, I wondered whether
to embed a light-weight C compiler into trace tool in order to compile C
expression into BPF byte code, and then load the BPF program to
filter/output argument. Finally, users are able to filter/output
arguments using C expressions. It seemed too crazy for me to post such
idea to trace mailing list at that time, as I wasn't familiar with trace
infrastructure.
[1] https://github.com/bpfsnoop/bpfsnoop/
Thanks,
Leon
^ permalink raw reply [flat|nested] 4+ messages in thread