public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 bpf-next 0/2] Properly load values from insn_arays with non-zero offsets
@ 2026-04-02 18:46 Anton Protopopov
  2026-04-02 18:46 ` [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays Anton Protopopov
  2026-04-02 18:46 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Add more tests for loading insn arrays with offsets Anton Protopopov
  0 siblings, 2 replies; 11+ messages in thread
From: Anton Protopopov @ 2026-04-02 18:46 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko
  Cc: Anton Protopopov

The PTR_TO_INSN is always loaded via BPF_LDX_MEM instruction.
However, the verifier doesn't properly verify such loads when the
offset is not zero. Fix this and extend selftests with more scenarios.

v1 -> v2:
  * u32 -> int in selftests (Mykyta)
  * do not add offset if it equals to zero

v1: https://lore.kernel.org/bpf/20260401161529.681755-1-a.s.protopopov@gmail.com

Anton Protopopov (2):
  bpf: Do not ignore offsets for loads from insn_arrays
  selftests/bpf: Add more tests for loading insn arrays with offsets

 kernel/bpf/verifier.c                         |  20 +++
 .../selftests/bpf/prog_tests/bpf_gotox.c      | 114 +++++++++++-------
 2 files changed, 93 insertions(+), 41 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays
  2026-04-02 18:46 [PATCH v2 bpf-next 0/2] Properly load values from insn_arays with non-zero offsets Anton Protopopov
@ 2026-04-02 18:46 ` Anton Protopopov
  2026-04-02 19:00   ` Alexei Starovoitov
  2026-04-02 18:46 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Add more tests for loading insn arrays with offsets Anton Protopopov
  1 sibling, 1 reply; 11+ messages in thread
From: Anton Protopopov @ 2026-04-02 18:46 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko
  Cc: Anton Protopopov

When a pointer to PTR_TO_INSN is dereferenced it is possible to
specify an offset inside the load instruction. This is a bug,
because while the verifier ignores the field, JITs are not.
So, patch the verifier to not ignore this field.

Reported-by: Jiyong Yang <ksur673@gmail.com>
Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 kernel/bpf/verifier.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8c1cf2eb6cbb..ce29b3b7a4d9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -212,6 +212,8 @@ static int ref_set_non_owning(struct bpf_verifier_env *env,
 static bool is_trusted_reg(const struct bpf_reg_state *reg);
 static inline bool in_sleepable_context(struct bpf_verifier_env *env);
 static const char *non_sleepable_context_description(struct bpf_verifier_env *env);
+static void scalar32_min_max_add(struct bpf_reg_state *dst_reg, struct bpf_reg_state *src_reg);
+static void scalar_min_max_add(struct bpf_reg_state *dst_reg, struct bpf_reg_state *src_reg);
 
 static bool bpf_map_ptr_poisoned(const struct bpf_insn_aux_data *aux)
 {
@@ -7735,6 +7737,23 @@ static bool get_func_retval_range(struct bpf_prog *prog,
 	return false;
 }
 
+static void add_scalar_to_reg(struct bpf_reg_state *dst_reg, s64 val)
+{
+	struct bpf_reg_state fake_reg;
+
+	if (!val)
+		return;
+
+	fake_reg.type = SCALAR_VALUE;
+	__mark_reg_known(&fake_reg, val);
+
+	scalar32_min_max_add(dst_reg, &fake_reg);
+	scalar_min_max_add(dst_reg, &fake_reg);
+	dst_reg->var_off = tnum_add(dst_reg->var_off, fake_reg.var_off);
+
+	reg_bounds_sync(dst_reg);
+}
+
 /* check whether memory at (regno + off) is accessible for t = (read | write)
  * if t==write, value_regno is a register which value is stored into memory
  * if t==read, value_regno is a register which will receive the value from memory
@@ -7816,6 +7835,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 					return -EACCES;
 				}
 				copy_register_state(&regs[value_regno], reg);
+				add_scalar_to_reg(&regs[value_regno], off);
 				regs[value_regno].type = PTR_TO_INSN;
 			} else {
 				mark_reg_unknown(env, regs, value_regno);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 bpf-next 2/2] selftests/bpf: Add more tests for loading insn arrays with offsets
  2026-04-02 18:46 [PATCH v2 bpf-next 0/2] Properly load values from insn_arays with non-zero offsets Anton Protopopov
  2026-04-02 18:46 ` [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays Anton Protopopov
@ 2026-04-02 18:46 ` Anton Protopopov
  1 sibling, 0 replies; 11+ messages in thread
From: Anton Protopopov @ 2026-04-02 18:46 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko
  Cc: Anton Protopopov

A typical series of load instructions for a gotox looks like

   r1 = &map + offset1
   r1 += offset2
   r1 = *(r1 + offset3)
   gotox r1

Here offset3 is, normally, equal to zero; but this is not guaranteed.
Extend selftests with tests for non-zero offset3 and, while here, also
add tests for negative offsets (the offset1 and the sum of three offsets
still have to be non-negative).

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 .../selftests/bpf/prog_tests/bpf_gotox.c      | 114 +++++++++++-------
 1 file changed, 73 insertions(+), 41 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c b/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
index 75b0cf2467ab..f4f14b6f0d24 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
@@ -317,7 +317,7 @@ static void check_ldimm64_off_load(struct bpf_gotox *skel __always_unused)
 
 static int __check_ldimm64_gotox_prog_load(struct bpf_insn *insns,
 					   __u32 insn_cnt,
-					   __u32 off1, __u32 off2)
+					   int off1, int off2, int off3)
 {
 	const __u32 values[] = {5, 7, 9, 11, 13, 15};
 	const __u32 max_entries = ARRAY_SIZE(values);
@@ -349,16 +349,46 @@ static int __check_ldimm64_gotox_prog_load(struct bpf_insn *insns,
 	/* r1 += off2 */
 	insns[2].imm = off2;
 
+	/* r1 = *(r1 + off3) */
+	insns[3].off = off3;
+
 	ret = prog_load(insns, insn_cnt);
 	close(map_fd);
 	return ret;
 }
 
-static void reject_offsets(struct bpf_insn *insns, __u32 insn_cnt, __u32 off1, __u32 off2)
+static void
+allow_offsets(struct bpf_insn *insns, __u32 insn_cnt, int off1, int off2, int off3)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, topts);
+	int prog_fd, err;
+	char s[128] = "";
+
+	prog_fd = __check_ldimm64_gotox_prog_load(insns, insn_cnt, off1, off2, off3);
+	snprintf(s, sizeof(s), "__check_ldimm64_gotox_prog_load(%d,%d,%d)", off1, off2, off3);
+	if (!ASSERT_GE(prog_fd, 0, s))
+		return;
+
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	if (!ASSERT_OK(err, "test_run_opts err")) {
+		close(prog_fd);
+		return;
+	}
+
+	if (!ASSERT_EQ(topts.retval, (off1 + off2 + off3) / 8, "test_run_opts retval")) {
+		close(prog_fd);
+		return;
+	}
+
+	close(prog_fd);
+}
+
+static void
+reject_offsets(struct bpf_insn *insns, __u32 insn_cnt, int off1, int off2, int off3)
 {
 	int prog_fd;
 
-	prog_fd = __check_ldimm64_gotox_prog_load(insns, insn_cnt, off1, off2);
+	prog_fd = __check_ldimm64_gotox_prog_load(insns, insn_cnt, off1, off2, off3);
 	if (!ASSERT_EQ(prog_fd, -EACCES, "__check_ldimm64_gotox_prog_load"))
 		close(prog_fd);
 }
@@ -376,7 +406,7 @@ static void check_ldimm64_off_gotox(struct bpf_gotox *skel __always_unused)
 		 * The program rewrites the offsets in the instructions below:
 		 *     r1 = &map + offset1
 		 *     r1 += offset2
-		 *     r1 = *r1
+		 *     r1 = *(r1 + offset3)
 		 *     gotox r1
 		 */
 		BPF_LD_IMM64_RAW(BPF_REG_1, BPF_PSEUDO_MAP_VALUE, 0),
@@ -403,43 +433,45 @@ static void check_ldimm64_off_gotox(struct bpf_gotox *skel __always_unused)
 		BPF_MOV64_IMM(BPF_REG_0, 5),
 		BPF_EXIT_INSN(),
 	};
-	int prog_fd, err;
-	__u32 off1, off2;
-
-	/* allow all combinations off1 + off2 < 6 */
-	for (off1 = 0; off1 < 6; off1++) {
-		for (off2 = 0; off1 + off2 < 6; off2++) {
-			LIBBPF_OPTS(bpf_test_run_opts, topts);
-
-			prog_fd = __check_ldimm64_gotox_prog_load(insns, ARRAY_SIZE(insns),
-								  off1 * 8, off2 * 8);
-			if (!ASSERT_GE(prog_fd, 0, "__check_ldimm64_gotox_prog_load"))
-				return;
-
-			err = bpf_prog_test_run_opts(prog_fd, &topts);
-			if (!ASSERT_OK(err, "test_run_opts err")) {
-				close(prog_fd);
-				return;
-			}
-
-			if (!ASSERT_EQ(topts.retval, off1 + off2, "test_run_opts retval")) {
-				close(prog_fd);
-				return;
-			}
-
-			close(prog_fd);
-		}
-	}
-
-	/* reject off1 + off2 >= 6 */
-	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 3, 8 * 3);
-	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 7, 8 * 0);
-	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 0, 8 * 7);
-
-	/* reject (off1 + off2) % 8 != 0 */
-	reject_offsets(insns, ARRAY_SIZE(insns), 3, 3);
-	reject_offsets(insns, ARRAY_SIZE(insns), 7, 0);
-	reject_offsets(insns, ARRAY_SIZE(insns), 0, 7);
+	int off1, off2, off3;
+
+	/* allow all combinations off1 + off2 + off3 < 6 */
+	for (off1 = 0; off1 < 6; off1++)
+		for (off2 = 0; off1 + off2 < 6; off2++)
+			for (off3 = 0; off1 + off2 + off3 < 6; off3++)
+				allow_offsets(insns, ARRAY_SIZE(insns),
+					      off1 * 8, off2 * 8, off3 * 8);
+
+	/* allow for some offsets to be negative */
+	allow_offsets(insns, ARRAY_SIZE(insns), 8 * 3, 0, -(8 * 3));
+	allow_offsets(insns, ARRAY_SIZE(insns), 8 * 3, -(8 * 3), 0);
+	allow_offsets(insns, ARRAY_SIZE(insns), 0, 8 * 3, -(8 * 3));
+	allow_offsets(insns, ARRAY_SIZE(insns), 8 * 4, 0, -(8 * 2));
+	allow_offsets(insns, ARRAY_SIZE(insns), 8 * 4, -(8 * 2), 0);
+	allow_offsets(insns, ARRAY_SIZE(insns), 0, 8 * 4, -(8 * 2));
+
+	/* disallow negative sums of offsets */
+	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 3, 0, -(8 * 4));
+	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 3, -(8 * 4), 0);
+	reject_offsets(insns, ARRAY_SIZE(insns), 0, 8 * 3, -(8 * 4));
+
+	/* disallow the off1 to be negative in any case */
+	reject_offsets(insns, ARRAY_SIZE(insns), -8 * 1, 0, 0);
+	reject_offsets(insns, ARRAY_SIZE(insns), -8 * 1, 8 * 1, 0);
+	reject_offsets(insns, ARRAY_SIZE(insns), -8 * 1, 8 * 1, 8 * 1);
+
+	/* reject off1 + off2 + off3 >= 6 */
+	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 3, 8 * 3, 8 * 0);
+	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 7, 8 * 0, 8 * 0);
+	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 0, 8 * 7, 8 * 0);
+	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 3, 8 * 0, 8 * 3);
+	reject_offsets(insns, ARRAY_SIZE(insns), 8 * 0, 8 * 3, 8 * 3);
+
+	/* reject (off1 + off2) % 8 != 0, off3 % 8 != 0 */
+	reject_offsets(insns, ARRAY_SIZE(insns), 3, 3, 0);
+	reject_offsets(insns, ARRAY_SIZE(insns), 7, 0, 0);
+	reject_offsets(insns, ARRAY_SIZE(insns), 0, 7, 0);
+	reject_offsets(insns, ARRAY_SIZE(insns), 0, 0, 7);
 }
 
 void test_bpf_gotox(void)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays
  2026-04-02 18:46 ` [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays Anton Protopopov
@ 2026-04-02 19:00   ` Alexei Starovoitov
  2026-04-02 20:53     ` Anton Protopopov
  0 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2026-04-02 19:00 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko

On Thu, Apr 2, 2026 at 11:38 AM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
>
> When a pointer to PTR_TO_INSN is dereferenced it is possible to
> specify an offset inside the load instruction. This is a bug,
> because while the verifier ignores the field, JITs are not.
> So, patch the verifier to not ignore this field.

This is too terse to understand.
In 2nd patch you're saying:

   r1 = &map + offset1
   r1 += offset2
   r1 = *(r1 + offset3)
   gotox r1

Here offset3 is, normally, equal to zero; but this is not guaranteed.

What is this 'offset3'? Where did it come from?
What do you mean JIT handles it?

can llvm ever generate such code?
if not, reject it early ?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays
  2026-04-02 19:00   ` Alexei Starovoitov
@ 2026-04-02 20:53     ` Anton Protopopov
  2026-04-02 21:32       ` Alexei Starovoitov
  0 siblings, 1 reply; 11+ messages in thread
From: Anton Protopopov @ 2026-04-02 20:53 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko

On 26/04/02 12:00PM, Alexei Starovoitov wrote:
> On Thu, Apr 2, 2026 at 11:38 AM Anton Protopopov
> <a.s.protopopov@gmail.com> wrote:
> >
> > When a pointer to PTR_TO_INSN is dereferenced it is possible to
> > specify an offset inside the load instruction. This is a bug,
> > because while the verifier ignores the field, JITs are not.
> > So, patch the verifier to not ignore this field.
> 
> This is too terse to understand.
> In 2nd patch you're saying:
> 
>    r1 = &map + offset1
>    r1 += offset2
>    r1 = *(r1 + offset3)
>    gotox r1
> 
> Here offset3 is, normally, equal to zero; but this is not guaranteed.
> 
> What is this 'offset3'? Where did it come from?

The offset3 is the .off field of the BPF_LDX_MEM instruction.
The BPF assembler will correctly work with non-zero offsets:

    r1 = &map;
    r1 = *(r1 + offset)

> What do you mean JIT handles it?

JIT will issue a memory load with this offset,
but verifier will ignore it (before this patch).

> can llvm ever generate such code?
> if not, reject it early ?

LLVM can generate code with non-zero offset in BPF_LDX_MEM,
say, if one jumps to a structure field, like in `goto *p->j`
(see CurDAG->isBaseWithConstantOffset(Addr) check in
BPFDAGToDAGISel::SelectAddr). Realistically, for now
this doesn't make too much sense (looks like switches and goto *jt[i]
will not emit such code), but one future use-case might be
smth like a function pointer in a struct.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays
  2026-04-02 20:53     ` Anton Protopopov
@ 2026-04-02 21:32       ` Alexei Starovoitov
  2026-04-03  7:56         ` Anton Protopopov
  0 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2026-04-02 21:32 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko

On Thu, Apr 2, 2026 at 1:44 PM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
>
> On 26/04/02 12:00PM, Alexei Starovoitov wrote:
> > On Thu, Apr 2, 2026 at 11:38 AM Anton Protopopov
> > <a.s.protopopov@gmail.com> wrote:
> > >
> > > When a pointer to PTR_TO_INSN is dereferenced it is possible to
> > > specify an offset inside the load instruction. This is a bug,
> > > because while the verifier ignores the field, JITs are not.
> > > So, patch the verifier to not ignore this field.
> >
> > This is too terse to understand.
> > In 2nd patch you're saying:
> >
> >    r1 = &map + offset1
> >    r1 += offset2
> >    r1 = *(r1 + offset3)
> >    gotox r1
> >
> > Here offset3 is, normally, equal to zero; but this is not guaranteed.
> >
> > What is this 'offset3'? Where did it come from?
>
> The offset3 is the .off field of the BPF_LDX_MEM instruction.
> The BPF assembler will correctly work with non-zero offsets:
>
>     r1 = &map;
>     r1 = *(r1 + offset)
>
> > What do you mean JIT handles it?
>
> JIT will issue a memory load with this offset,
> but verifier will ignore it (before this patch).
>
> > can llvm ever generate such code?
> > if not, reject it early ?
>
> LLVM can generate code with non-zero offset in BPF_LDX_MEM,
> say, if one jumps to a structure field, like in `goto *p->j`

sorry, I still don't get it. What kind of syntax is that?
Could you share the godbolt link where llvm actually generates
such code? If the verifier will reject it anyway it's fine.
I just want to make sure we're fixing real problem.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays
  2026-04-02 21:32       ` Alexei Starovoitov
@ 2026-04-03  7:56         ` Anton Protopopov
  2026-04-03 15:10           ` Alexei Starovoitov
  0 siblings, 1 reply; 11+ messages in thread
From: Anton Protopopov @ 2026-04-03  7:56 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko

On 26/04/02 02:32PM, Alexei Starovoitov wrote:
> On Thu, Apr 2, 2026 at 1:44 PM Anton Protopopov
> <a.s.protopopov@gmail.com> wrote:
> >
> > On 26/04/02 12:00PM, Alexei Starovoitov wrote:
> > > On Thu, Apr 2, 2026 at 11:38 AM Anton Protopopov
> > > <a.s.protopopov@gmail.com> wrote:
> > > >
> > > > When a pointer to PTR_TO_INSN is dereferenced it is possible to
> > > > specify an offset inside the load instruction. This is a bug,
> > > > because while the verifier ignores the field, JITs are not.
> > > > So, patch the verifier to not ignore this field.
> > >
> > > This is too terse to understand.
> > > In 2nd patch you're saying:
> > >
> > >    r1 = &map + offset1
> > >    r1 += offset2
> > >    r1 = *(r1 + offset3)
> > >    gotox r1
> > >
> > > Here offset3 is, normally, equal to zero; but this is not guaranteed.
> > >
> > > What is this 'offset3'? Where did it come from?
> >
> > The offset3 is the .off field of the BPF_LDX_MEM instruction.
> > The BPF assembler will correctly work with non-zero offsets:
> >
> >     r1 = &map;
> >     r1 = *(r1 + offset)
> >
> > > What do you mean JIT handles it?
> >
> > JIT will issue a memory load with this offset,
> > but verifier will ignore it (before this patch).
> >
> > > can llvm ever generate such code?
> > > if not, reject it early ?
> >
> > LLVM can generate code with non-zero offset in BPF_LDX_MEM,
> > say, if one jumps to a structure field, like in `goto *p->j`
> 
> sorry, I still don't get it. What kind of syntax is that?
> Could you share the godbolt link where llvm actually generates
> such code? If the verifier will reject it anyway it's fine.
> I just want to make sure we're fixing real problem.

I am not aware of any real-life code, only can construct
some artificial examples:

        SEC("syscall")
        int test(unsigned int n)
        {
                struct {
                        int i;
                        void *j[3];
                } x = {
                        .i = n,
                        .j = { &&l1, &&l2, &&l3 },
                };

                if (n < 3)
                        goto *x.j[n];

                return 0;
        l1:
                return 1;
        l2:
                return 3;
        l3:
                return 5;
        }

It will compile into

        <test>:
        ;               .j = { &&l1, &&l2, &&l3 },
             160:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0x0 ll
                        0000000000000500:  R_BPF_64_64  BPF.JT.4.0
             162:       79 23 10 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x10)
                                                                   ^ offet != 0
             163:       7b 3a f8 ff 00 00 00 00 *(u64 *)(r10 - 0x8) = r3
             164:       79 23 08 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x8)
             165:       7b 3a f0 ff 00 00 00 00 *(u64 *)(r10 - 0x10) = r3
             ...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays
  2026-04-03  7:56         ` Anton Protopopov
@ 2026-04-03 15:10           ` Alexei Starovoitov
  2026-04-03 18:10             ` Anton Protopopov
  0 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2026-04-03 15:10 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko

On Fri, Apr 3, 2026 at 12:47 AM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
>
> On 26/04/02 02:32PM, Alexei Starovoitov wrote:
> > On Thu, Apr 2, 2026 at 1:44 PM Anton Protopopov
> > <a.s.protopopov@gmail.com> wrote:
> > >
> > > On 26/04/02 12:00PM, Alexei Starovoitov wrote:
> > > > On Thu, Apr 2, 2026 at 11:38 AM Anton Protopopov
> > > > <a.s.protopopov@gmail.com> wrote:
> > > > >
> > > > > When a pointer to PTR_TO_INSN is dereferenced it is possible to
> > > > > specify an offset inside the load instruction. This is a bug,
> > > > > because while the verifier ignores the field, JITs are not.
> > > > > So, patch the verifier to not ignore this field.
> > > >
> > > > This is too terse to understand.
> > > > In 2nd patch you're saying:
> > > >
> > > >    r1 = &map + offset1
> > > >    r1 += offset2
> > > >    r1 = *(r1 + offset3)
> > > >    gotox r1
> > > >
> > > > Here offset3 is, normally, equal to zero; but this is not guaranteed.
> > > >
> > > > What is this 'offset3'? Where did it come from?
> > >
> > > The offset3 is the .off field of the BPF_LDX_MEM instruction.
> > > The BPF assembler will correctly work with non-zero offsets:
> > >
> > >     r1 = &map;
> > >     r1 = *(r1 + offset)
> > >
> > > > What do you mean JIT handles it?
> > >
> > > JIT will issue a memory load with this offset,
> > > but verifier will ignore it (before this patch).
> > >
> > > > can llvm ever generate such code?
> > > > if not, reject it early ?
> > >
> > > LLVM can generate code with non-zero offset in BPF_LDX_MEM,
> > > say, if one jumps to a structure field, like in `goto *p->j`
> >
> > sorry, I still don't get it. What kind of syntax is that?
> > Could you share the godbolt link where llvm actually generates
> > such code? If the verifier will reject it anyway it's fine.
> > I just want to make sure we're fixing real problem.
>
> I am not aware of any real-life code, only can construct
> some artificial examples:
>
>         SEC("syscall")
>         int test(unsigned int n)
>         {
>                 struct {
>                         int i;
>                         void *j[3];
>                 } x = {
>                         .i = n,
>                         .j = { &&l1, &&l2, &&l3 },
>                 };
>
>                 if (n < 3)
>                         goto *x.j[n];
>
>                 return 0;
>         l1:
>                 return 1;
>         l2:
>                 return 3;
>         l3:
>                 return 5;
>         }
>
> It will compile into
>
>         <test>:
>         ;               .j = { &&l1, &&l2, &&l3 },
>              160:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0x0 ll
>                         0000000000000500:  R_BPF_64_64  BPF.JT.4.0
>              162:       79 23 10 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x10)
>                                                                    ^ offet != 0
>              163:       7b 3a f8 ff 00 00 00 00 *(u64 *)(r10 - 0x8) = r3
>              164:       79 23 08 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x8)
>              165:       7b 3a f0 ff 00 00 00 00 *(u64 *)(r10 - 0x10) = r3

Great, but where is an actual gotox that is using that r3?
Looks like it's spilled into stack at 163 ?
Does it pass the verifier?
Would be great to add it as C selftest.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays
  2026-04-03 15:10           ` Alexei Starovoitov
@ 2026-04-03 18:10             ` Anton Protopopov
  2026-04-03 18:22               ` Alexei Starovoitov
  0 siblings, 1 reply; 11+ messages in thread
From: Anton Protopopov @ 2026-04-03 18:10 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko

O 26/04/03 08:10AM, Alexei Starovoitov wrote:
> On Fri, Apr 3, 2026 at 12:47 AM Anton Protopopov
> <a.s.protopopov@gmail.com> wrote:
> >
> > On 26/04/02 02:32PM, Alexei Starovoitov wrote:
> > > On Thu, Apr 2, 2026 at 1:44 PM Anton Protopopov
> > > <a.s.protopopov@gmail.com> wrote:
> > > >
> > > > On 26/04/02 12:00PM, Alexei Starovoitov wrote:
> > > > > On Thu, Apr 2, 2026 at 11:38 AM Anton Protopopov
> > > > > <a.s.protopopov@gmail.com> wrote:
> > > > > >
> > > > > > When a pointer to PTR_TO_INSN is dereferenced it is possible to
> > > > > > specify an offset inside the load instruction. This is a bug,
> > > > > > because while the verifier ignores the field, JITs are not.
> > > > > > So, patch the verifier to not ignore this field.
> > > > >
> > > > > This is too terse to understand.
> > > > > In 2nd patch you're saying:
> > > > >
> > > > >    r1 = &map + offset1
> > > > >    r1 += offset2
> > > > >    r1 = *(r1 + offset3)
> > > > >    gotox r1
> > > > >
> > > > > Here offset3 is, normally, equal to zero; but this is not guaranteed.
> > > > >
> > > > > What is this 'offset3'? Where did it come from?
> > > >
> > > > The offset3 is the .off field of the BPF_LDX_MEM instruction.
> > > > The BPF assembler will correctly work with non-zero offsets:
> > > >
> > > >     r1 = &map;
> > > >     r1 = *(r1 + offset)
> > > >
> > > > > What do you mean JIT handles it?
> > > >
> > > > JIT will issue a memory load with this offset,
> > > > but verifier will ignore it (before this patch).
> > > >
> > > > > can llvm ever generate such code?
> > > > > if not, reject it early ?
> > > >
> > > > LLVM can generate code with non-zero offset in BPF_LDX_MEM,
> > > > say, if one jumps to a structure field, like in `goto *p->j`
> > >
> > > sorry, I still don't get it. What kind of syntax is that?
> > > Could you share the godbolt link where llvm actually generates
> > > such code? If the verifier will reject it anyway it's fine.
> > > I just want to make sure we're fixing real problem.
> >
> > I am not aware of any real-life code, only can construct
> > some artificial examples:
> >
> >         SEC("syscall")
> >         int test(unsigned int n)
> >         {
> >                 struct {
> >                         int i;
> >                         void *j[3];
> >                 } x = {
> >                         .i = n,
> >                         .j = { &&l1, &&l2, &&l3 },
> >                 };
> >
> >                 if (n < 3)
> >                         goto *x.j[n];
> >
> >                 return 0;
> >         l1:
> >                 return 1;
> >         l2:
> >                 return 3;
> >         l3:
> >                 return 5;
> >         }
> >
> > It will compile into
> >
> >         <test>:
> >         ;               .j = { &&l1, &&l2, &&l3 },
> >              160:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0x0 ll
> >                         0000000000000500:  R_BPF_64_64  BPF.JT.4.0
> >              162:       79 23 10 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x10)
> >                                                                    ^ offet != 0
> >              163:       7b 3a f8 ff 00 00 00 00 *(u64 *)(r10 - 0x8) = r3
> >              164:       79 23 08 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x8)
> >              165:       7b 3a f0 ff 00 00 00 00 *(u64 *)(r10 - 0x10) = r3
> 
> Great, but where is an actual gotox that is using that r3?
> Looks like it's spilled into stack at 163 ?
> Does it pass the verifier?
> Would be great to add it as C selftest.

No, this one doesn't pass the verifier (it is spilled to stack, and when
it is loaded back the error is "invalid unbounded variable-offset
read from stack R2", because it is loaded back from stack as
"scalar", not "insn"). All "normal" use cases from [C-level] selftests
(a switch or a 'goto *j[i]') compile into code which accumulate
offset into previous instructions.

Do you want me to work on this example above to pass the verifier?
Or, for now, the best thing is to just reject non-zero offsets?
(The latter has the benefit that the patch will be auto-backportable
to stable trees.)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays
  2026-04-03 18:10             ` Anton Protopopov
@ 2026-04-03 18:22               ` Alexei Starovoitov
  2026-04-05 18:24                 ` Anton Protopopov
  0 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2026-04-03 18:22 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko

On Fri, Apr 3, 2026 at 11:01 AM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
>
> O 26/04/03 08:10AM, Alexei Starovoitov wrote:
> > On Fri, Apr 3, 2026 at 12:47 AM Anton Protopopov
> > <a.s.protopopov@gmail.com> wrote:
> > >
> > > On 26/04/02 02:32PM, Alexei Starovoitov wrote:
> > > > On Thu, Apr 2, 2026 at 1:44 PM Anton Protopopov
> > > > <a.s.protopopov@gmail.com> wrote:
> > > > >
> > > > > On 26/04/02 12:00PM, Alexei Starovoitov wrote:
> > > > > > On Thu, Apr 2, 2026 at 11:38 AM Anton Protopopov
> > > > > > <a.s.protopopov@gmail.com> wrote:
> > > > > > >
> > > > > > > When a pointer to PTR_TO_INSN is dereferenced it is possible to
> > > > > > > specify an offset inside the load instruction. This is a bug,
> > > > > > > because while the verifier ignores the field, JITs are not.
> > > > > > > So, patch the verifier to not ignore this field.
> > > > > >
> > > > > > This is too terse to understand.
> > > > > > In 2nd patch you're saying:
> > > > > >
> > > > > >    r1 = &map + offset1
> > > > > >    r1 += offset2
> > > > > >    r1 = *(r1 + offset3)
> > > > > >    gotox r1
> > > > > >
> > > > > > Here offset3 is, normally, equal to zero; but this is not guaranteed.
> > > > > >
> > > > > > What is this 'offset3'? Where did it come from?
> > > > >
> > > > > The offset3 is the .off field of the BPF_LDX_MEM instruction.
> > > > > The BPF assembler will correctly work with non-zero offsets:
> > > > >
> > > > >     r1 = &map;
> > > > >     r1 = *(r1 + offset)
> > > > >
> > > > > > What do you mean JIT handles it?
> > > > >
> > > > > JIT will issue a memory load with this offset,
> > > > > but verifier will ignore it (before this patch).
> > > > >
> > > > > > can llvm ever generate such code?
> > > > > > if not, reject it early ?
> > > > >
> > > > > LLVM can generate code with non-zero offset in BPF_LDX_MEM,
> > > > > say, if one jumps to a structure field, like in `goto *p->j`
> > > >
> > > > sorry, I still don't get it. What kind of syntax is that?
> > > > Could you share the godbolt link where llvm actually generates
> > > > such code? If the verifier will reject it anyway it's fine.
> > > > I just want to make sure we're fixing real problem.
> > >
> > > I am not aware of any real-life code, only can construct
> > > some artificial examples:
> > >
> > >         SEC("syscall")
> > >         int test(unsigned int n)
> > >         {
> > >                 struct {
> > >                         int i;
> > >                         void *j[3];
> > >                 } x = {
> > >                         .i = n,
> > >                         .j = { &&l1, &&l2, &&l3 },
> > >                 };
> > >
> > >                 if (n < 3)
> > >                         goto *x.j[n];
> > >
> > >                 return 0;
> > >         l1:
> > >                 return 1;
> > >         l2:
> > >                 return 3;
> > >         l3:
> > >                 return 5;
> > >         }
> > >
> > > It will compile into
> > >
> > >         <test>:
> > >         ;               .j = { &&l1, &&l2, &&l3 },
> > >              160:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0x0 ll
> > >                         0000000000000500:  R_BPF_64_64  BPF.JT.4.0
> > >              162:       79 23 10 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x10)
> > >                                                                    ^ offet != 0
> > >              163:       7b 3a f8 ff 00 00 00 00 *(u64 *)(r10 - 0x8) = r3
> > >              164:       79 23 08 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x8)
> > >              165:       7b 3a f0 ff 00 00 00 00 *(u64 *)(r10 - 0x10) = r3
> >
> > Great, but where is an actual gotox that is using that r3?
> > Looks like it's spilled into stack at 163 ?
> > Does it pass the verifier?
> > Would be great to add it as C selftest.
>
> No, this one doesn't pass the verifier (it is spilled to stack, and when
> it is loaded back the error is "invalid unbounded variable-offset
> read from stack R2", because it is loaded back from stack as
> "scalar", not "insn"). All "normal" use cases from [C-level] selftests
> (a switch or a 'goto *j[i]') compile into code which accumulate
> offset into previous instructions.
>
> Do you want me to work on this example above to pass the verifier?
> Or, for now, the best thing is to just reject non-zero offsets?
> (The latter has the benefit that the patch will be auto-backportable
> to stable trees.)

I wouldn't worry about backports. Size of the patch doesn't matter much.

Is spill issue just a matter of adding one line to is_spillable_regtype ?
If so, let's do that and have a test in C (in addition to asm) that
actually using the offset.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays
  2026-04-03 18:22               ` Alexei Starovoitov
@ 2026-04-05 18:24                 ` Anton Protopopov
  0 siblings, 0 replies; 11+ messages in thread
From: Anton Protopopov @ 2026-04-05 18:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Jiyong Yang,
	Mykyta Yatsenko

On 26/04/03 11:22AM, Alexei Starovoitov wrote:
> On Fri, Apr 3, 2026 at 11:01 AM Anton Protopopov
> <a.s.protopopov@gmail.com> wrote:
> >
> > O 26/04/03 08:10AM, Alexei Starovoitov wrote:
> > > On Fri, Apr 3, 2026 at 12:47 AM Anton Protopopov
> > > <a.s.protopopov@gmail.com> wrote:
> > > >
> > > > On 26/04/02 02:32PM, Alexei Starovoitov wrote:
> > > > > On Thu, Apr 2, 2026 at 1:44 PM Anton Protopopov
> > > > > <a.s.protopopov@gmail.com> wrote:
> > > > > >
> > > > > > On 26/04/02 12:00PM, Alexei Starovoitov wrote:
> > > > > > > On Thu, Apr 2, 2026 at 11:38 AM Anton Protopopov
> > > > > > > <a.s.protopopov@gmail.com> wrote:
> > > > > > > >
> > > > > > > > When a pointer to PTR_TO_INSN is dereferenced it is possible to
> > > > > > > > specify an offset inside the load instruction. This is a bug,
> > > > > > > > because while the verifier ignores the field, JITs are not.
> > > > > > > > So, patch the verifier to not ignore this field.
> > > > > > >
> > > > > > > This is too terse to understand.
> > > > > > > In 2nd patch you're saying:
> > > > > > >
> > > > > > >    r1 = &map + offset1
> > > > > > >    r1 += offset2
> > > > > > >    r1 = *(r1 + offset3)
> > > > > > >    gotox r1
> > > > > > >
> > > > > > > Here offset3 is, normally, equal to zero; but this is not guaranteed.
> > > > > > >
> > > > > > > What is this 'offset3'? Where did it come from?
> > > > > >
> > > > > > The offset3 is the .off field of the BPF_LDX_MEM instruction.
> > > > > > The BPF assembler will correctly work with non-zero offsets:
> > > > > >
> > > > > >     r1 = &map;
> > > > > >     r1 = *(r1 + offset)
> > > > > >
> > > > > > > What do you mean JIT handles it?
> > > > > >
> > > > > > JIT will issue a memory load with this offset,
> > > > > > but verifier will ignore it (before this patch).
> > > > > >
> > > > > > > can llvm ever generate such code?
> > > > > > > if not, reject it early ?
> > > > > >
> > > > > > LLVM can generate code with non-zero offset in BPF_LDX_MEM,
> > > > > > say, if one jumps to a structure field, like in `goto *p->j`
> > > > >
> > > > > sorry, I still don't get it. What kind of syntax is that?
> > > > > Could you share the godbolt link where llvm actually generates
> > > > > such code? If the verifier will reject it anyway it's fine.
> > > > > I just want to make sure we're fixing real problem.
> > > >
> > > > I am not aware of any real-life code, only can construct
> > > > some artificial examples:
> > > >
> > > >         SEC("syscall")
> > > >         int test(unsigned int n)
> > > >         {
> > > >                 struct {
> > > >                         int i;
> > > >                         void *j[3];
> > > >                 } x = {
> > > >                         .i = n,
> > > >                         .j = { &&l1, &&l2, &&l3 },
> > > >                 };
> > > >
> > > >                 if (n < 3)
> > > >                         goto *x.j[n];
> > > >
> > > >                 return 0;
> > > >         l1:
> > > >                 return 1;
> > > >         l2:
> > > >                 return 3;
> > > >         l3:
> > > >                 return 5;
> > > >         }
> > > >
> > > > It will compile into
> > > >
> > > >         <test>:
> > > >         ;               .j = { &&l1, &&l2, &&l3 },
> > > >              160:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0x0 ll
> > > >                         0000000000000500:  R_BPF_64_64  BPF.JT.4.0
> > > >              162:       79 23 10 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x10)
> > > >                                                                    ^ offet != 0
> > > >              163:       7b 3a f8 ff 00 00 00 00 *(u64 *)(r10 - 0x8) = r3
> > > >              164:       79 23 08 00 00 00 00 00 r3 = *(u64 *)(r2 + 0x8)
> > > >              165:       7b 3a f0 ff 00 00 00 00 *(u64 *)(r10 - 0x10) = r3
> > >
> > > Great, but where is an actual gotox that is using that r3?
> > > Looks like it's spilled into stack at 163 ?
> > > Does it pass the verifier?
> > > Would be great to add it as C selftest.
> >
> > No, this one doesn't pass the verifier (it is spilled to stack, and when
> > it is loaded back the error is "invalid unbounded variable-offset
> > read from stack R2", because it is loaded back from stack as
> > "scalar", not "insn"). All "normal" use cases from [C-level] selftests
> > (a switch or a 'goto *j[i]') compile into code which accumulate
> > offset into previous instructions.
> >
> > Do you want me to work on this example above to pass the verifier?
> > Or, for now, the best thing is to just reject non-zero offsets?
> > (The latter has the benefit that the patch will be auto-backportable
> > to stable trees.)
> 
> I wouldn't worry about backports. Size of the patch doesn't matter much.
> 
> Is spill issue just a matter of adding one line to is_spillable_regtype ?
> If so, let's do that and have a test in C (in addition to asm) that
> actually using the offset.

Ok, looks I've found a simple selftest which compiles to
a non-zero offload from realistic C code (and triggers a NULL
deref without this patch). So I will send the next version with
that selftest added, and with a better commit description.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-04-05 18:15 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-02 18:46 [PATCH v2 bpf-next 0/2] Properly load values from insn_arays with non-zero offsets Anton Protopopov
2026-04-02 18:46 ` [PATCH v2 bpf-next 1/2] bpf: Do not ignore offsets for loads from insn_arrays Anton Protopopov
2026-04-02 19:00   ` Alexei Starovoitov
2026-04-02 20:53     ` Anton Protopopov
2026-04-02 21:32       ` Alexei Starovoitov
2026-04-03  7:56         ` Anton Protopopov
2026-04-03 15:10           ` Alexei Starovoitov
2026-04-03 18:10             ` Anton Protopopov
2026-04-03 18:22               ` Alexei Starovoitov
2026-04-05 18:24                 ` Anton Protopopov
2026-04-02 18:46 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Add more tests for loading insn arrays with offsets Anton Protopopov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox