All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>, bpf@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>, Dan Schatzberg <dschatzberg@meta.com>,
	Emil Tsalapatis <emil@etsalapatis.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	kkd@meta.com, kernel-team@meta.com
Subject: Re: [PATCH bpf-next v3 1/3] bpf: Support variable offsets for syscall PTR_TO_CTX
Date: Wed, 18 Mar 2026 14:06:09 +0000	[thread overview]
Message-ID: <87cy11kzam.fsf@gmail.com> (raw)
In-Reply-To: <20260318103526.2590079-2-memxor@gmail.com>

Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:

> Allow accessing PTR_TO_CTX with variable offsets in syscall programs.
> Fixed offsets are already enabled for all program types that do not
> convert their ctx accesses, since the changes we made in the commit
> de6c7d99f898 ("bpf: Relax fixed offset check for PTR_TO_CTX"). Note
> that we also lift the restriction on passing syscall context into
> helpers, which was not permitted before, and passing modified syscall
> context into kfuncs.
>
> The structure of check_mem_access can be mostly shared and preserved,
> but we must use check_mem_region_access to correctly verify access with
> variable offsets.
>
> The check made in check_helper_mem_access is hardened to only allow
> PTR_TO_CTX for syscall programs to be passed in as helper memory. This
> was the original intention of the existing code anyway, and it makes
> little sense for other program types' context to be utilized as a memory
> buffer. In case a convincing example presents itself in the future, this
> check can be relaxed further.
>
> We also no longer use the last-byte access to simulate helper memory
> access, but instead go through check_mem_region_access. Since this no
> longer updates our max_ctx_offset, we must do so manually, to keep track
> of the maximum offset at which the program ctx may be accessed.
>
> Take care to ensure that when arg_type is ARG_PTR_TO_CTX, we do not
> relax any fixed or variable offset constraints around PTR_TO_CTX even in
> syscall programs, and require them to be passed unmodified. There are
> several reasons why this is necessary. First, if we pass a modified ctx,
> then the global subprog's accesses will not update the max_ctx_offset to
> its true maximum offset, and can lead to out of bounds accesses. Second,
> tail called program (or extension program replacing global subprog) where
> their max_ctx_offset exceeds the program they are being called from can
> also cause issues. For the latter, unmodified PTR_TO_CTX is the first
> requirement for the fix, the second is ensuring max_ctx_offset >= the
> program they are being called from, which has to be a separate change
> not made in this commit.
>
> All in all, we can hint using arg_type when we expect ARG_PTR_TO_CTX and
> make our relaxation around offsets conditional on it.
>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Dan Schatzberg <dschatzberg@meta.com>
> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/verifier.c                         | 67 ++++++++++++-------
>  .../bpf/progs/verifier_global_subprogs.c      |  1 -
>  2 files changed, 43 insertions(+), 25 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 01c18f4268de..14bf64e0c600 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -7843,6 +7843,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
>  		 * Program types that don't rewrite ctx accesses can safely
>  		 * dereference ctx pointers with fixed offsets.
>  		 */
> +		bool var_off_ok = resolve_prog_type(env->prog) == BPF_PROG_TYPE_SYSCALL;
>  		bool fixed_off_ok = !env->ops->convert_ctx_access;
>  		struct bpf_retval_range range;
>  		struct bpf_insn_access_aux info = {
> @@ -7857,16 +7858,26 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
>  			return -EACCES;
>  		}
>  
> -		err = __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
> -		if (err < 0)
> -			return err;
> -
>  		/*
>  		 * Fold the register's constant offset into the insn offset so
> -		 * that is_valid_access() sees the true effective offset.
> +		 * that is_valid_access() sees the true effective offset. If the
> +		 * register's offset is allowed to be variable, then the maximum
> +		 * possible offset is simulated (which is equal to var_off.value
> +		 * when var_off is constant).
>  		 */
> -		if (fixed_off_ok)
> -			off += reg->var_off.value;
> +		if (var_off_ok) {
> +			err = check_mem_region_access(env, regno, off, size, U16_MAX, false);
> +			if (err)
> +				return err;
> +			off += reg->umax_value;
> +		} else {
> +			err = __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
> +			if (err < 0)
> +				return err;
> +			if (fixed_off_ok)
> +				off += reg->var_off.value;
> +		}
nit: this code looks a bit complex, I wonder if it makes sense to encode
the context offset mode into an enum:
enum bpf_ctx_allowed_off {
     CTX_OFF_VAR,
     CTX_OFF_FIXED,
     CTX_OFF_NONE
};

we can factor out a helper that returns allowed offset mode:
```
enum bpf_ctx_allowed_off get_context_allowed_offset(env)
{
        if (resolve_prog_type(env->prog) == BPF_PROG_TYPE_SYSCALL)
	        return CTX_OFF_VAR;
        else if (!env->ops->convert_ctx_access)
        	return CTX_OFF_FIXED;
        else
                return CTX_OFF_NONE;
}
```

The enum makes the three-way exclusive modes explicit, eliminates the
implicit priority and more self-documenting.

The enum can also be used below.

> +
>  		err = check_ctx_access(env, insn_idx, off, size, t, &info);
>  		if (err)
>  			verbose_linfo(env, insn_idx, "; ");
> @@ -8442,22 +8453,16 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
>  		return check_ptr_to_btf_access(env, regs, regno, 0,
>  					       access_size, BPF_READ, -1);
>  	case PTR_TO_CTX:
> -		/* in case the function doesn't know how to access the context,
> -		 * (because we are in a program of type SYSCALL for example), we
> -		 * can not statically check its size.
> -		 * Dynamically check it now.
> -		 */
> -		if (!env->ops->convert_ctx_access) {
Why did you remove this block here, it should correspond to fixed
offset, and is not processed in the resolve_prog_type(env->prog) ==
BPF_PROG_TYPE_SYSCALL) branch.
> -			int offset = access_size - 1;
> -
> -			/* Allow zero-byte read from PTR_TO_CTX */
> -			if (access_size == 0)
> -				return zero_size_allowed ? 0 : -EACCES;
> -
> -			return check_mem_access(env, env->insn_idx, regno, offset, BPF_B,
> -						access_type, -1, false, false);
> +		/* Only permit reading or writing syscall context using helper calls. */
> +		if (resolve_prog_type(env->prog) == BPF_PROG_TYPE_SYSCALL) {
If we introduce bpf_ctx_allowed_off enum, this check could be modified
to if (get_context_allowed_offset() == CTX_OFF_VAR) here and also in
other place as well, does it capture the logic better?
I'm not 100% sure if these use cases are worth adding a separate enum,
though, let me know what you think.
> +			int err = check_mem_region_access(env, regno, 0, access_size, U16_MAX,
> +							  zero_size_allowed);
> +			if (err)
> +				return err;
> +			if (env->prog->aux->max_ctx_offset < reg->umax_value + access_size)
> +				env->prog->aux->max_ctx_offset = reg->umax_value + access_size;
> +			return 0;
>  		}
> -
>  		fallthrough;
>  	default: /* scalar_value or invalid ptr */
>  		/* Allow zero-byte read from NULL, regardless of pointer type */
> @@ -9401,6 +9406,7 @@ static const struct bpf_reg_types mem_types = {
>  		PTR_TO_MEM | MEM_RINGBUF,
>  		PTR_TO_BUF,
>  		PTR_TO_BTF_ID | PTR_TRUSTED,
> +		PTR_TO_CTX,
>  	},
>  };
>  
> @@ -9710,6 +9716,17 @@ static int check_func_arg_reg_off(struct bpf_verifier_env *env,
>  		 * still need to do checks instead of returning.
>  		 */
>  		return __check_ptr_off_reg(env, reg, regno, true);
> +	case PTR_TO_CTX:
> +		/*
> +		 * Allow fixed and variable offsets for syscall context, but
> +		 * only when the argument is passed as memory, not ctx,
> +		 * otherwise we may get modified ctx in tail called programs and
> +		 * global subprogs (that may act as extension prog hooks).
> +		 */
> +		if (arg_type != ARG_PTR_TO_CTX &&
> +		    resolve_prog_type(env->prog) == BPF_PROG_TYPE_SYSCALL)
This looks like another instance where we check for prog_type==syscall,
but actually mean: is variable offset into ctx is allowed.
> +			return 0;
> +		fallthrough;
>  	default:
>  		return __check_ptr_off_reg(env, reg, regno, false);
>  	}
> @@ -10757,7 +10774,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
>  			 * invalid memory access.
>  			 */
>  		} else if (arg->arg_type == ARG_PTR_TO_CTX) {
> -			ret = check_func_arg_reg_off(env, reg, regno, ARG_DONTCARE);
> +			ret = check_func_arg_reg_off(env, reg, regno, ARG_PTR_TO_CTX);
>  			if (ret < 0)
>  				return ret;
>  			/* If function expects ctx type in BTF check that caller
> @@ -13565,7 +13582,6 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
>  				}
>  			}
>  			fallthrough;
> -		case KF_ARG_PTR_TO_CTX:
>  		case KF_ARG_PTR_TO_DYNPTR:
>  		case KF_ARG_PTR_TO_ITER:
>  		case KF_ARG_PTR_TO_LIST_HEAD:
> @@ -13583,6 +13599,9 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
>  		case KF_ARG_PTR_TO_IRQ_FLAG:
>  		case KF_ARG_PTR_TO_RES_SPIN_LOCK:
>  			break;
> +		case KF_ARG_PTR_TO_CTX:
> +			arg_type = ARG_PTR_TO_CTX;
> +			break;
>  		default:
>  			verifier_bug(env, "unknown kfunc arg type %d", kf_arg_type);
>  			return -EFAULT;
> diff --git a/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c b/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
> index f02012a2fbaa..2250fc31574d 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_global_subprogs.c
> @@ -134,7 +134,6 @@ __noinline __weak int subprog_user_anon_mem(user_struct_t *t)
>  
>  SEC("?tracepoint")
>  __failure __log_level(2)
> -__msg("invalid bpf_context access")
>  __msg("Caller passes invalid args into func#1 ('subprog_user_anon_mem')")
>  int anon_user_mem_invalid(void *ctx)
>  {
> -- 
> 2.52.0

  reply	other threads:[~2026-03-18 14:06 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-18 10:35 [PATCH bpf-next v3 0/3] Allow variable offsets for syscall PTR_TO_CTX Kumar Kartikeya Dwivedi
2026-03-18 10:35 ` [PATCH bpf-next v3 1/3] bpf: Support " Kumar Kartikeya Dwivedi
2026-03-18 14:06   ` Mykyta Yatsenko [this message]
2026-03-18 21:59     ` Kumar Kartikeya Dwivedi
2026-03-18 21:13   ` Eduard Zingerman
2026-03-18 21:57     ` Kumar Kartikeya Dwivedi
2026-03-18 22:37       ` Eduard Zingerman
2026-03-18 23:33         ` Kumar Kartikeya Dwivedi
2026-03-18 23:43         ` Kumar Kartikeya Dwivedi
2026-03-18 10:35 ` [PATCH bpf-next v3 2/3] selftests/bpf: Add syscall ctx variable offset tests Kumar Kartikeya Dwivedi
2026-03-18 14:45   ` Mykyta Yatsenko
2026-03-18 21:41     ` Eduard Zingerman
2026-03-18 21:58       ` Kumar Kartikeya Dwivedi
2026-03-18 22:01     ` Kumar Kartikeya Dwivedi
2026-03-18 10:35 ` [PATCH bpf-next v3 3/3] selftests/bpf: Test modified syscall ctx for ARG_PTR_TO_CTX Kumar Kartikeya Dwivedi
2026-03-18 15:13   ` Mykyta Yatsenko
2026-03-18 11:45 ` [PATCH bpf-next v3 0/3] Allow variable offsets for syscall PTR_TO_CTX Puranjay Mohan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87cy11kzam.fsf@gmail.com \
    --to=mykyta.yatsenko5@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dschatzberg@meta.com \
    --cc=eddyz87@gmail.com \
    --cc=emil@etsalapatis.com \
    --cc=kernel-team@meta.com \
    --cc=kkd@meta.com \
    --cc=martin.lau@kernel.org \
    --cc=memxor@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.