From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
Linux Trace Kernel <linux-trace-kernel@vger.kernel.org>,
bpf@vger.kernel.org, Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Mark Rutland <mark.rutland@arm.com>,
Peter Zijlstra <peterz@infradead.org>,
Namhyung Kim <namhyung@kernel.org>,
Takaya Saeki <takayas@google.com>,
Douglas Raillard <douglas.raillard@arm.com>,
Tom Zanussi <zanussi@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ian Rogers <irogers@google.com>, Jiri Olsa <olsajiri@gmail.com>,
"Subject:[PATCH v2]"@web.codeaurora.org,
tracing/pr@web.codeaurora.org
Subject: Re: [PATCH v4] tracing/probes: Allow use of BTF names to dereference pointers
Date: Tue, 19 May 2026 18:53:02 +0900 [thread overview]
Message-ID: <20260519185302.5fb527085a64567a388f24f3@kernel.org> (raw)
In-Reply-To: <20260518232312.0c78f055@gandalf.local.home>
On Mon, 18 May 2026 23:23:12 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Add syntax to the FETCHARGS parsing of probes to be able to typecast a
> value to a pointer to a structure.
>
> Currently, a dereference must be a number, where the user has to figure
> out manually the offset of a member of a structure that they want to
> dereference, unless the member is a function parameter that BTF already has
> information about what structure the argument is pointing to.
>
> But for event probes, or generic kprobes that records a register that
> happens to be a pointer to a structure, they cannot dereference these
> values with BTF naming, but must use numerical offsets.
Thanks for updating!
>
> For example, to find out what device a sk_buff is pointing to in the
> net_dev_xmit trace event, one must first use gdb to find the offsets of the
> members of the structures:
>
> (gdb) p &((struct sk_buff *)0)->dev
> $1 = (struct net_device **) 0x10
> (gdb) p &((struct net_device *)0)->name
> $2 = (char (*)[16]) 0x118
>
> And then use the raw numbers to dereference:
>
> # echo 'e:xmit net.net_dev_xmit +0x118(+0x10($skbaddr)):string' >> dynamic_events
>
> If BTF is in the kernel, then instead, the $skbaddr can be typecast to
> sk_buff and use the normal dereference logic.
>
> # echo 'e:xmit net.net_dev_xmit (sk_buff*)$skbaddr->dev->name:string' >> dynamic_events
Ah, eprobes supports "$PARAM" to access its parameter by name.
That is a bit complicated. Should we allow user to access
parameter without '$' prefix for eprobes?
> # echo 1 > events/eprobes/xmit/enable
> # cat trace
> [..]
> sshd-session-1022 [000] b..2. 860.249343: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.250061: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.250142: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.263553: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.283820: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.302716: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.322905: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.342828: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.362268: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.382335: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.400856: xmit: (net.net_dev_xmit) arg1="enp7s0"
> sshd-session-1022 [000] b..2. 860.419893: xmit: (net.net_dev_xmit) arg1="enp7s0"
Looks very nice!
>
> The syntax is simply: ([STRUCT]*)(VAR)->FIELD[->FIELD..]
Is the STRUCT optional?? (because [] means optional.) I guess no.
I think we maybe possible to skip '*' (Or, make it optional)
because this is not C-like typecasting, we don't support "struct"
reserved word, and it does not support white-spaces in each
fetcharg. In this case, (STRUCT)VAR->FIELD should work.
BTW, I'm also considering to support new cast syntax, which allows
us to derefer a pointer with "container_of". This is typically
used in the kernel.
We usually see this pattern:
struct {
unsigned long data;
struct list_head list;
} foo;
void callback(struct list_head *foo_list)
{
unsigned long data = container_of(foo_list, struct foo, list)->data;
...
}
To access @data, simple casting does not work. Thus we need a
new syntax:
(STRUCT)(PTR,ASSIGN)->FIELD
So the above case, we can do:
data=(foo)(foo_list,list)->data
This is naturally extend the type casting to support container_of()
equivalent casting.
>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
> Changes since v3: https://patch.msgid.link/20260518095832.52659a3a@gandalf.local.home
>
> *** COMPLETE REWRITE FROM V3 ***
>
> - Rewrote it to use typecasting instead of simply replacing BTF names with
> offsets.
>
> Documentation/trace/kprobetrace.rst | 3 +
> kernel/trace/trace_probe.c | 110 ++++++++++++++++++++++++----
> kernel/trace/trace_probe.h | 3 +
> 3 files changed, 100 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst
> index 3b6791c17e9b..450ac646fe4c 100644
> --- a/Documentation/trace/kprobetrace.rst
> +++ b/Documentation/trace/kprobetrace.rst
> @@ -54,6 +54,9 @@ Synopsis of kprobe_events
> $retval : Fetch return value.(\*2)
> $comm : Fetch current task comm.
> +|-[u]OFFS(FETCHARG) : Fetch memory at FETCHARG +|- OFFS address.(\*3)(\*4)
> + (STRUCT*)FETCHARG->FIELD[->FIELD] : If BTF is supported, typecast FETCHARG to
> + a pointer to STRUCT and then derference the pointer defined by
> + ->FIELD.
> \IMM : Store an immediate value to the argument.
> NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
> FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index e0d3a0da26af..b0829eb1cb52 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -464,6 +464,26 @@ static const char *fetch_type_from_btf_type(struct btf *btf,
> return NULL;
> }
>
> +static int query_btf_struct(const char *sname, struct traceprobe_parse_context *ctx)
> +{
> + int id;
> +
> + if (!ctx->btf) {
> + struct btf *btf;
This needs an empty line here.
> + id = bpf_find_btf_id(sname, BTF_KIND_STRUCT, &btf);
> + if (id < 0)
> + return -EINVAL;
Why don't you return id (it has corresponding errno)?
> + ctx->btf = btf;
> + } else {
> + id = btf_find_by_name_kind(ctx->btf, sname, BTF_KIND_STRUCT);
> + if (id < 0)
> + return -EINVAL;
Ditto.
> + }
> +
> + ctx->last_struct = btf_type_by_id(ctx->btf, id);
> + return 0;
> +}
> +
> static int query_btf_context(struct traceprobe_parse_context *ctx)
> {
> const struct btf_param *param;
> @@ -471,12 +491,12 @@ static int query_btf_context(struct traceprobe_parse_context *ctx)
> struct btf *btf;
> s32 nr;
>
> - if (ctx->btf)
> - return 0;
> -
> if (!ctx->funcname)
> return -EINVAL;
>
> + if (ctx->btf)
> + return 0;
> +
Could you tell me why this order is changed?
I think this type casting will allow us to skip checking funcname
because btf context is already specified.
Ah, BTW, we may need to use a special struct btf* for type
casting. If the target function is in a module and the
casting type is defined in vmlinux, those are stored in
the different places...
for example,
p funcA (foo)$arg1->bar buz
In this case, buz needs to use BTF including funcA.
Maybe we need to introduce ctx->func_btf, which resets ctx->btf
in traceprobe_parse_probe_arg_body() where parse_probe_arg()
is calling, e.g.
ctx->last_type = NULL;
+ if (ctx->btf)
+ btf_put(ctx->btf);
+ ctx->btf = ctx->func_btf;
ret = parse_probe_arg(arg, parg->type, &code, &code[FETCH_INSN_MAX - 1],
ctx);
> type = btf_find_func_proto(ctx->funcname, &btf);
> if (!type)
> return -ENOENT;
> @@ -514,6 +534,7 @@ static void clear_btf_context(struct traceprobe_parse_context *ctx)
> ctx->proto = NULL;
> ctx->params = NULL;
> ctx->nr_params = 0;
> + ctx->last_struct = NULL;
> }
> }
>
> @@ -554,22 +575,28 @@ static int parse_btf_field(char *fieldname, const struct btf_type *type,
> struct fetch_insn *code = *pcode;
> const struct btf_member *field;
> u32 bitoffs, anon_offs;
> + bool is_struct = ctx->flags & TPARG_FL_STRUCT;
> char *next;
> int is_ptr;
> s32 tid;
>
> do {
> - /* Outer loop for solving arrow operator ('->') */
> - if (BTF_INFO_KIND(type->info) != BTF_KIND_PTR) {
> - trace_probe_log_err(ctx->offset, NO_PTR_STRCT);
> - return -EINVAL;
> - }
> - /* Convert a struct pointer type to a struct type */
> - type = btf_type_skip_modifiers(ctx->btf, type->type, &tid);
> - if (!type) {
> - trace_probe_log_err(ctx->offset, BAD_BTF_TID);
> - return -EINVAL;
> + if (!is_struct) {
> + /* Outer loop for solving arrow operator ('->') */
> + if (BTF_INFO_KIND(type->info) != BTF_KIND_PTR) {
> + trace_probe_log_err(ctx->offset, NO_PTR_STRCT);
> + return -EINVAL;
> + }
> +
> + /* Convert a struct pointer type to a struct type */
> + type = btf_type_skip_modifiers(ctx->btf, type->type, &tid);
> + if (!type) {
> + trace_probe_log_err(ctx->offset, BAD_BTF_TID);
> + return -EINVAL;
> + }
> }
> + /* Only the first type can skip being a pointer */
> + is_struct = false;
>
> bitoffs = 0;
> do {
> @@ -635,12 +662,12 @@ static int parse_btf_arg(char *varname,
> {
> struct fetch_insn *code = *pcode;
> const struct btf_param *params;
> - const struct btf_type *type;
> + const struct btf_type *type = NULL;
> char *field = NULL;
> int i, is_ptr, ret;
> u32 tid;
>
> - if (WARN_ON_ONCE(!ctx->funcname))
> + if (WARN_ON_ONCE(!ctx->funcname && !(ctx->flags & TPARG_FL_STRUCT)))
> return -EINVAL;
>
> is_ptr = split_next_field(varname, &field, ctx);
> @@ -704,11 +731,18 @@ static int parse_btf_arg(char *varname,
> goto found;
> }
> }
> +
> + if (ctx->flags & TPARG_FL_STRUCT) {
> + type = ctx->last_struct;
> + goto found;
I rather like to jump type_found: label instead of
checking !type. (Or, save tid instead of type)
> + }
> +
> trace_probe_log_err(ctx->offset, NO_BTFARG);
> return -ENOENT;
>
> found:
> - type = btf_type_skip_modifiers(ctx->btf, tid, &tid);
> + if (!type)
> + type = btf_type_skip_modifiers(ctx->btf, tid, &tid);
type_found:
> if (!type) {
> trace_probe_log_err(ctx->offset, BAD_BTF_TID);
> return -EINVAL;
> @@ -952,6 +986,12 @@ static int parse_probe_vars(char *orig_arg, const struct fetch_type *t,
> int ret = 0;
> int len;
>
> + if (ctx->flags & TPARG_FL_STRUCT) {
> + ret = parse_btf_arg(orig_arg, pcode, end, ctx);
> + if (ret < 0)
> + return ret;
> + }
> +
> if (ctx->flags & TPARG_FL_TEVENT) {
> if (code->data)
> return -EFAULT;
> @@ -1231,6 +1271,43 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
> code->op = FETCH_OP_IMM;
> }
> break;
> + case '(':
> + tmp = strrchr(arg, ')');
OK, in this step, we don't support nested cast etc. so this works.
> + if (!tmp) {
> + trace_probe_log_err(ctx->offset + strlen(arg),
> + DEREF_OPEN_BRACE);
> + return -EINVAL;
> + }
> +
> + tmp--;
> + if (*tmp != '*') {
> + trace_probe_log_err(ctx->offset + (tmp - arg),
> + NO_PTR_STRCT);
> + return -EINVAL;
> + }
So I think this can be optional, not an error.
> + *tmp = '\0';
> + ret = query_btf_struct(arg + 1, ctx);
> + *tmp = '*';
> +
> + if (ret < 0) {
> + trace_probe_log_err(ctx->offset + 1, NO_PTR_STRCT);
> + return -EINVAL;
> + }
> +
> + ctx->flags |= TPARG_FL_STRUCT;
> + tmp += 2;
> +
> + if (*tmp != '$') {
> + trace_probe_log_err(ctx->offset + (tmp - arg),
> + BAD_VAR);
> + return -EINVAL;
> + }
Ok, this limitation will be removed afterwards.
Thanks,
> +
> + ctx->offset += tmp - arg;
> + ret = parse_probe_vars(tmp, type, pcode, end, ctx);
> + ctx->flags &= ~TPARG_FL_STRUCT;
> + ctx->last_struct = NULL;
> + break;
> default:
> if (isalpha(arg[0]) || arg[0] == '_') { /* BTF variable */
> if (!tparg_is_function_entry(ctx->flags) &&
> @@ -1504,6 +1581,7 @@ static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size,
> code[FETCH_INSN_MAX - 1].op = FETCH_OP_END;
>
> ctx->last_type = NULL;
> + ctx->last_struct = NULL;
> ret = parse_probe_arg(arg, parg->type, &code, &code[FETCH_INSN_MAX - 1],
> ctx);
> if (ret < 0)
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index 262d8707a3df..88ab9f6da591 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -394,6 +394,7 @@ static inline int traceprobe_get_entry_data_size(struct trace_probe *tp)
> * TPARG_FL_KERNEL and TPARG_FL_USER are also mutually exclusive.
> * TPARG_FL_FPROBE and TPARG_FL_TPOINT are optional but it should be with
> * TPARG_FL_KERNEL.
> + * TPARG_FL_STRUCT is set if an argument was typecast to a structure.
> */
> #define TPARG_FL_RETURN BIT(0)
> #define TPARG_FL_KERNEL BIT(1)
> @@ -402,6 +403,7 @@ static inline int traceprobe_get_entry_data_size(struct trace_probe *tp)
> #define TPARG_FL_USER BIT(4)
> #define TPARG_FL_FPROBE BIT(5)
> #define TPARG_FL_TPOINT BIT(6)
> +#define TPARG_FL_STRUCT BIT(7)
> #define TPARG_FL_LOC_MASK GENMASK(4, 0)
>
> static inline bool tparg_is_function_entry(unsigned int flags)
> @@ -423,6 +425,7 @@ struct traceprobe_parse_context {
> s32 nr_params; /* The number of the parameters */
> struct btf *btf; /* The BTF to be used */
> const struct btf_type *last_type; /* Saved type */
> + const struct btf_type *last_struct; /* Saved structure */
> u32 last_bitoffs; /* Saved bitoffs */
> u32 last_bitsize; /* Saved bitsize */
> struct trace_probe *tp;
> --
> 2.53.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
next prev parent reply other threads:[~2026-05-19 9:53 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-19 3:23 [PATCH v4] tracing/probes: Allow use of BTF names to dereference pointers Steven Rostedt
2026-05-19 9:34 ` kernel test robot
2026-05-19 9:53 ` Masami Hiramatsu [this message]
2026-05-19 10:10 ` kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260519185302.5fb527085a64567a388f24f3@kernel.org \
--to=mhiramat@kernel.org \
--cc="Subject:[PATCH v2]"@web.codeaurora.org \
--cc=akpm@linux-foundation.org \
--cc=bpf@vger.kernel.org \
--cc=douglas.raillard@arm.com \
--cc=irogers@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=namhyung@kernel.org \
--cc=olsajiri@gmail.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=takayas@google.com \
--cc=tglx@linutronix.de \
--cc=tracing/pr@web.codeaurora.org \
--cc=zanussi@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox