BPF List
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Steven Rostedt" <rostedt@goodmis.org>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH v4] tracing/probes: Allow use of BTF names to dereference pointers
Date: Tue, 19 May 2026 04:09:27 +0000	[thread overview]
Message-ID: <20260519040928.458C4C2BCB3@smtp.kernel.org> (raw)
In-Reply-To: <20260518232312.0c78f055@gandalf.local.home>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] NULL pointer dereference in `parse_btf_arg()` when typecasting `$retval`
- [High] Instruction stream corruption when typecasting BTF variables with field dereferences
- [High] Parser state corruption breaks multi-argument probes and standard BTF parameter resolution
- [Medium] Hardcoded syntax check strictly prohibits typecasting hardware registers and standard BTF arguments
- [Medium] Using `strrchr` causes parsing failures for memory dereferences containing parentheses
--

commit 8ce6119a14206ed21706a773d9694dc24e53436e
Author: Steven Rostedt <rostedt@goodmis.org>

tracing/probes: Allow use of BTF names to dereference pointers

This commit adds syntax to FETCHARG parsing to allow typecasting a value
to a pointer to a structure. This allows users to rely on BTF to figure out
member offsets instead of manually finding and hardcoding numerical offsets.

> But for event probes, or generic kprobes that records a register that
> happens to be a pointer to a structure, they cannot dereference these
> values with BTF naming, but must use numerical offsets.

Does the code actually support dereferencing a register here? Hardware
registers (like %rdi) and standard BTF arguments (like skb) do not begin
with a $ character, but the parser explicitly enforces that the cast is
followed by a $.

> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index e1c73065dae51..29b5a99074d76 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
[ ... ]
> @@ -471,12 +491,12 @@ static int query_btf_context(struct traceprobe_parse_context *ctx)
>  	struct btf *btf;
>  	s32 nr;
>  
> -	if (ctx->btf)
> -		return 0;
> -
>  	if (!ctx->funcname)
>  		return -EINVAL;
>  
> +	if (ctx->btf)
> +		return 0;
> +

If a user creates a probe with a struct cast for the return value (like
(struct task_struct*)$retval), query_btf_struct() resolves the struct and
sets ctx->btf but leaves ctx->proto as NULL.

When parse_btf_arg() later handles $retval, it calls query_btf_context(),
which will return 0 here because ctx->btf is already set. Does this lead
to parse_btf_arg() unconditionally dereferencing a NULL ctx->proto?

Additionally, for multi-argument probes where the first argument is a cast
(like arg1=(struct foo*)$arg1 arg2=skb), setting ctx->btf for the first
argument causes this check to return early for the second argument. Could
this skip populating ctx->params and ctx->nr_params, causing valid BTF
arguments to be rejected?

[ ... ]
> @@ -952,6 +986,12 @@ static int parse_probe_vars(char *orig_arg, const struct fetch_type *t,
>  	int ret = 0;
>  	int len;
>  
> +	if (ctx->flags & TPARG_FL_STRUCT) {
> +		ret = parse_btf_arg(orig_arg, pcode, end, ctx);
> +		if (ret < 0)
> +			return ret;
> +	}
> +

When parse_btf_arg() successfully parses a typecast with field dereferences,
it emits a FETCH_OP_DEREF instruction and advances the *pcode pointer.

Since this block doesn't return upon success, execution continues and matches
the base variable name later in parse_probe_vars(). The local code pointer
is not updated to reflect the advanced *pcode, so it writes a new instruction
(like FETCH_OP_ARG) into the same slot.

Will this overwrite the FETCH_OP_DEREF instruction and corrupt the trace probe
bytecode?

[ ... ]
> @@ -1231,6 +1271,43 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
>  				code->op = FETCH_OP_IMM;
>  		}
>  		break;
> +	case '(':
> +		tmp = strrchr(arg, ')');

If a user attempts to cast a standard memory dereference that itself
contains parentheses (such as (struct foo*)+0($arg1)), strrchr() will
find the final parenthesis of $arg1).

Will this cause the pointer validation logic below to evaluate the wrong
segment of the string and incorrectly fail with NO_PTR_STRCT?

> +		if (!tmp) {
> +			trace_probe_log_err(ctx->offset + strlen(arg),
> +					    DEREF_OPEN_BRACE);
> +			return -EINVAL;
> +		}
> +
> +		tmp--;
> +		if (*tmp != '*') {
> +			trace_probe_log_err(ctx->offset + (tmp - arg),
> +					    NO_PTR_STRCT);
> +			return -EINVAL;
> +		}
> +		*tmp = '\0';
> +		ret = query_btf_struct(arg + 1, ctx);
> +		*tmp = '*';
> +
> +		if (ret < 0) {
> +			trace_probe_log_err(ctx->offset + 1, NO_PTR_STRCT);
> +			return -EINVAL;
> +		}
> +
> +		ctx->flags |= TPARG_FL_STRUCT;
> +		tmp += 2;
> +
> +		if (*tmp != '$') {

Is this strict check for '$' intentional?

As mentioned regarding the commit message, this prevents typecasting hardware
registers and standard BTF arguments since they do not start with a '$',
effectively restricting the feature to $retval, $stack, and $argN.

> +			trace_probe_log_err(ctx->offset + (tmp - arg),
> +					    BAD_VAR);
> +			return -EINVAL;
> +		}
> +
> +		ctx->offset += tmp - arg;
> +		ret = parse_probe_vars(tmp, type, pcode, end, ctx);
> +		ctx->flags &= ~TPARG_FL_STRUCT;
> +		ctx->last_struct = NULL;
> +		break;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260518232312.0c78f055@gandalf.local.home?part=1

  reply	other threads:[~2026-05-19  4:09 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-19  3:23 [PATCH v4] tracing/probes: Allow use of BTF names to dereference pointers Steven Rostedt
2026-05-19  4:09 ` sashiko-bot [this message]
2026-05-19 12:36   ` Steven Rostedt
2026-05-19  9:34 ` kernel test robot
2026-05-19  9:53 ` Masami Hiramatsu
2026-05-19 12:31   ` Steven Rostedt
2026-05-19 15:26     ` Masami Hiramatsu
2026-05-19 16:28       ` Steven Rostedt
2026-05-19 16:38         ` Steven Rostedt
2026-05-19 10:10 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260519040928.458C4C2BCB3@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox