BPF List
 help / color / mirror / Atom feed
* [PATCH bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22
@ 2025-10-14  5:16 Yonghong Song
  2025-10-15 16:45 ` Andrii Nakryiko
  2025-10-19  2:30 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 6+ messages in thread
From: Yonghong Song @ 2025-10-14  5:16 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau

With latest llvm22, I hit the verif_scale_strobemeta selftest failure
below:
  $ ./test_progs -n 618
  libbpf: prog 'on_event': BPF program load failed: -E2BIG
  libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
  BPF program is too large. Processed 1000001 insn
  verification time 7019091 usec
  stack depth 488
  processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 33927 peak_states 12813 mark_read 0
  -- END PROG LOAD LOG --
  libbpf: prog 'on_event': failed to load: -E2BIG
  libbpf: failed to load object 'strobemeta.bpf.o'
  scale_test:FAIL:expect_success unexpected error: -7 (errno 7)
  #618     verif_scale_strobemeta:FAIL

But if I increase the verificaiton insn limit from 1M to 10M, the above
test_progs run actually will succeed. The below is the result from veristat:
  $ ./veristat strobemeta.bpf.o
  Processing 'strobemeta.bpf.o'...
  File              Program   Verdict  Duration (us)    Insns  States  Program size  Jited size
  ----------------  --------  -------  -------------  -------  ------  ------------  ----------
  strobemeta.bpf.o  on_event  success       90250893  9777685  358230         15954       80794
  ----------------  --------  -------  -------------  -------  ------  ------------  ----------
  Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs.

Further debugging shows the llvm commit [1] is responsible for the verificaiton
failure as it tries to convert certain switch statement to if-condition. Such
change may cause different transformation compared to original switch statement.

In bpf program strobemeta.c case, the initial llvm ir for read_int_var() function is
  define internal void @read_int_var(ptr noundef %0, i64 noundef %1, ptr noundef %2,
      ptr noundef %3, ptr noundef %4) #2 !dbg !535 {
    %6 = alloca ptr, align 8
    %7 = alloca i64, align 8
    %8 = alloca ptr, align 8
    %9 = alloca ptr, align 8
    %10 = alloca ptr, align 8
    %11 = alloca ptr, align 8
    %12 = alloca i32, align 4
    ...
    %20 = icmp ne ptr %19, null, !dbg !561
    br i1 %20, label %22, label %21, !dbg !562

  21:                                               ; preds = %5
    store i32 1, ptr %12, align 4
    br label %48, !dbg !563

  22:
    %23 = load ptr, ptr %9, align 8, !dbg !564
    ...

  47:                                               ; preds = %38, %22
    store i32 0, ptr %12, align 4, !dbg !588
    br label %48, !dbg !588

  48:                                               ; preds = %47, %21
    call void @llvm.lifetime.end.p0(ptr %11) #4, !dbg !588
    %49 = load i32, ptr %12, align 4
    switch i32 %49, label %51 [
      i32 0, label %50
      i32 1, label %50
    ]

  50:                                               ; preds = %48, %48
    ret void, !dbg !589

  51:                                               ; preds = %48
    unreachable
  }

Note that the above 'switch' statement is added by clang frontend.
Without [1], the switch statement will survive until SelectionDag,
so the switch statement acts like a 'barrier' and prevents some
transformation involved with both 'before' and 'after' the switch statement.

But with [1], the switch statement will be removed during middle end
optimization and later middle end passes (esp. after inlining) have more
freedom to reorder the code.

The following is the related source code:

  static void *calc_location(struct strobe_value_loc *loc, void *tls_base):
        bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
        /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
        return tls_ptr && tls_ptr != (void *)-1
                ? tls_ptr + tls_index.offset
                : NULL;

  In read_int_var() func, we have:
        void *location = calc_location(&cfg->int_locs[idx], tls_base);
        if (!location)
                return;

        bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
        ...

The static func calc_location() is called inside read_int_var(). The asm code
without [1]:
     77: .123....89 (85) call bpf_probe_read_user#112
     78: ........89 (79) r1 = *(u64 *)(r10 -368)
     79: .1......89 (79) r2 = *(u64 *)(r10 -8)
     80: .12.....89 (bf) r3 = r2
     81: .123....89 (0f) r3 += r1
     82: ..23....89 (07) r2 += 1
     83: ..23....89 (79) r4 = *(u64 *)(r10 -464)
     84: ..234...89 (a5) if r2 < 0x2 goto pc+13
     85: ...34...89 (15) if r3 == 0x0 goto pc+12
     86: ...3....89 (bf) r1 = r10
     87: .1.3....89 (07) r1 += -400
     88: .1.3....89 (b4) w2 = 16
In this case, 'r2 < 0x2' and 'r3 == 0x0' go to null 'locaiton' place,
so the verifier actually prefers to do verification first at 'r1 = r10' etc.

The asm code with [1]:
    119: .123....89 (85) call bpf_probe_read_user#112
    120: ........89 (79) r1 = *(u64 *)(r10 -368)
    121: .1......89 (79) r2 = *(u64 *)(r10 -8)
    122: .12.....89 (bf) r3 = r2
    123: .123....89 (0f) r3 += r1
    124: ..23....89 (07) r2 += -1
    125: ..23....89 (a5) if r2 < 0xfffffffe goto pc+6
    126: ........89 (05) goto pc+17
    ...
    144: ........89 (b4) w1 = 0
    145: .1......89 (6b) *(u16 *)(r8 +80) = r1
In this case, if 'r2 < 0xfffffffe' is true, the control will go to
non-null 'location' branch, so 'goto pc+17' will actually go to
null 'location' branch. This seems causing tremendous amount of
verificaiton state.

To fix the issue, rewrite the following code
  return tls_ptr && tls_ptr != (void *)-1
                ? tls_ptr + tls_index.offset
                : NULL;
to if/then statement and hopefully these explicit if/then statements
are sticky during middle-end optimizations.

Test with llvm20 and llvm21 as well and all strobemeta related selftests
are passed.

  [1] https://github.com/llvm/llvm-project/pull/161000

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 tools/testing/selftests/bpf/progs/strobemeta.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

NOTE: I will also check whether we can make changes in llvm to automatically
 adjust branch statements to minimize verification insns/states. 

diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
index a5c74d31a244..6e1918deaf26 100644
--- a/tools/testing/selftests/bpf/progs/strobemeta.h
+++ b/tools/testing/selftests/bpf/progs/strobemeta.h
@@ -330,9 +330,9 @@ static void *calc_location(struct strobe_value_loc *loc, void *tls_base)
 	}
 	bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
 	/* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
-	return tls_ptr && tls_ptr != (void *)-1
-		? tls_ptr + tls_index.offset
-		: NULL;
+	if (!tls_ptr || tls_ptr == (void *)-1)
+		return NULL;
+	return tls_ptr + tls_index.offset;
 }
 
 #ifdef SUBPROGS
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22
  2025-10-14  5:16 [PATCH bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22 Yonghong Song
@ 2025-10-15 16:45 ` Andrii Nakryiko
  2025-10-15 19:56   ` Yonghong Song
  2025-10-19  2:30 ` patchwork-bot+netdevbpf
  1 sibling, 1 reply; 6+ messages in thread
From: Andrii Nakryiko @ 2025-10-15 16:45 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau

On Mon, Oct 13, 2025 at 10:16 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
> With latest llvm22, I hit the verif_scale_strobemeta selftest failure
> below:
>   $ ./test_progs -n 618
>   libbpf: prog 'on_event': BPF program load failed: -E2BIG
>   libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
>   BPF program is too large. Processed 1000001 insn
>   verification time 7019091 usec
>   stack depth 488
>   processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 33927 peak_states 12813 mark_read 0
>   -- END PROG LOAD LOG --
>   libbpf: prog 'on_event': failed to load: -E2BIG
>   libbpf: failed to load object 'strobemeta.bpf.o'
>   scale_test:FAIL:expect_success unexpected error: -7 (errno 7)
>   #618     verif_scale_strobemeta:FAIL
>
> But if I increase the verificaiton insn limit from 1M to 10M, the above
> test_progs run actually will succeed. The below is the result from veristat:
>   $ ./veristat strobemeta.bpf.o
>   Processing 'strobemeta.bpf.o'...
>   File              Program   Verdict  Duration (us)    Insns  States  Program size  Jited size
>   ----------------  --------  -------  -------------  -------  ------  ------------  ----------
>   strobemeta.bpf.o  on_event  success       90250893  9777685  358230         15954       80794
>   ----------------  --------  -------  -------------  -------  ------  ------------  ----------
>   Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs.
>
> Further debugging shows the llvm commit [1] is responsible for the verificaiton
> failure as it tries to convert certain switch statement to if-condition. Such
> change may cause different transformation compared to original switch statement.
>
> In bpf program strobemeta.c case, the initial llvm ir for read_int_var() function is
>   define internal void @read_int_var(ptr noundef %0, i64 noundef %1, ptr noundef %2,
>       ptr noundef %3, ptr noundef %4) #2 !dbg !535 {
>     %6 = alloca ptr, align 8
>     %7 = alloca i64, align 8
>     %8 = alloca ptr, align 8
>     %9 = alloca ptr, align 8
>     %10 = alloca ptr, align 8
>     %11 = alloca ptr, align 8
>     %12 = alloca i32, align 4
>     ...
>     %20 = icmp ne ptr %19, null, !dbg !561
>     br i1 %20, label %22, label %21, !dbg !562
>
>   21:                                               ; preds = %5
>     store i32 1, ptr %12, align 4
>     br label %48, !dbg !563
>
>   22:
>     %23 = load ptr, ptr %9, align 8, !dbg !564
>     ...
>
>   47:                                               ; preds = %38, %22
>     store i32 0, ptr %12, align 4, !dbg !588
>     br label %48, !dbg !588
>
>   48:                                               ; preds = %47, %21
>     call void @llvm.lifetime.end.p0(ptr %11) #4, !dbg !588
>     %49 = load i32, ptr %12, align 4
>     switch i32 %49, label %51 [
>       i32 0, label %50
>       i32 1, label %50
>     ]
>
>   50:                                               ; preds = %48, %48
>     ret void, !dbg !589
>
>   51:                                               ; preds = %48
>     unreachable
>   }
>
> Note that the above 'switch' statement is added by clang frontend.
> Without [1], the switch statement will survive until SelectionDag,
> so the switch statement acts like a 'barrier' and prevents some
> transformation involved with both 'before' and 'after' the switch statement.
>
> But with [1], the switch statement will be removed during middle end
> optimization and later middle end passes (esp. after inlining) have more
> freedom to reorder the code.
>
> The following is the related source code:
>
>   static void *calc_location(struct strobe_value_loc *loc, void *tls_base):
>         bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
>         /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
>         return tls_ptr && tls_ptr != (void *)-1
>                 ? tls_ptr + tls_index.offset
>                 : NULL;
>
>   In read_int_var() func, we have:
>         void *location = calc_location(&cfg->int_locs[idx], tls_base);
>         if (!location)
>                 return;
>
>         bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
>         ...
>
> The static func calc_location() is called inside read_int_var(). The asm code
> without [1]:
>      77: .123....89 (85) call bpf_probe_read_user#112
>      78: ........89 (79) r1 = *(u64 *)(r10 -368)
>      79: .1......89 (79) r2 = *(u64 *)(r10 -8)
>      80: .12.....89 (bf) r3 = r2
>      81: .123....89 (0f) r3 += r1
>      82: ..23....89 (07) r2 += 1
>      83: ..23....89 (79) r4 = *(u64 *)(r10 -464)
>      84: ..234...89 (a5) if r2 < 0x2 goto pc+13
>      85: ...34...89 (15) if r3 == 0x0 goto pc+12
>      86: ...3....89 (bf) r1 = r10
>      87: .1.3....89 (07) r1 += -400
>      88: .1.3....89 (b4) w2 = 16
> In this case, 'r2 < 0x2' and 'r3 == 0x0' go to null 'locaiton' place,
> so the verifier actually prefers to do verification first at 'r1 = r10' etc.
>
> The asm code with [1]:
>     119: .123....89 (85) call bpf_probe_read_user#112
>     120: ........89 (79) r1 = *(u64 *)(r10 -368)
>     121: .1......89 (79) r2 = *(u64 *)(r10 -8)
>     122: .12.....89 (bf) r3 = r2
>     123: .123....89 (0f) r3 += r1
>     124: ..23....89 (07) r2 += -1
>     125: ..23....89 (a5) if r2 < 0xfffffffe goto pc+6
>     126: ........89 (05) goto pc+17
>     ...
>     144: ........89 (b4) w1 = 0
>     145: .1......89 (6b) *(u16 *)(r8 +80) = r1
> In this case, if 'r2 < 0xfffffffe' is true, the control will go to
> non-null 'location' branch, so 'goto pc+17' will actually go to
> null 'location' branch. This seems causing tremendous amount of
> verificaiton state.
>
> To fix the issue, rewrite the following code
>   return tls_ptr && tls_ptr != (void *)-1
>                 ? tls_ptr + tls_index.offset
>                 : NULL;
> to if/then statement and hopefully these explicit if/then statements
> are sticky during middle-end optimizations.

this is so fragile and non-obvious... Just looking at the patch, it's
an equivalent transformation, so as a user I'd have no clue that doing
something like that can even matter...

Have you tried adding likely() around non-NULL case? Does it change
generated code, while leaving ternary as is?

>
> Test with llvm20 and llvm21 as well and all strobemeta related selftests
> are passed.
>
>   [1] https://github.com/llvm/llvm-project/pull/161000
>
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> ---
>  tools/testing/selftests/bpf/progs/strobemeta.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> NOTE: I will also check whether we can make changes in llvm to automatically
>  adjust branch statements to minimize verification insns/states.
>
> diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
> index a5c74d31a244..6e1918deaf26 100644
> --- a/tools/testing/selftests/bpf/progs/strobemeta.h
> +++ b/tools/testing/selftests/bpf/progs/strobemeta.h
> @@ -330,9 +330,9 @@ static void *calc_location(struct strobe_value_loc *loc, void *tls_base)
>         }
>         bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
>         /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
> -       return tls_ptr && tls_ptr != (void *)-1
> -               ? tls_ptr + tls_index.offset
> -               : NULL;
> +       if (!tls_ptr || tls_ptr == (void *)-1)
> +               return NULL;
> +       return tls_ptr + tls_index.offset;
>  }
>
>  #ifdef SUBPROGS
> --
> 2.47.3
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22
  2025-10-15 16:45 ` Andrii Nakryiko
@ 2025-10-15 19:56   ` Yonghong Song
  2025-10-15 20:31     ` Andrii Nakryiko
  0 siblings, 1 reply; 6+ messages in thread
From: Yonghong Song @ 2025-10-15 19:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau



On 10/15/25 9:45 AM, Andrii Nakryiko wrote:
> On Mon, Oct 13, 2025 at 10:16 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> With latest llvm22, I hit the verif_scale_strobemeta selftest failure
>> below:
>>    $ ./test_progs -n 618
>>    libbpf: prog 'on_event': BPF program load failed: -E2BIG
>>    libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
>>    BPF program is too large. Processed 1000001 insn
>>    verification time 7019091 usec
>>    stack depth 488
>>    processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 33927 peak_states 12813 mark_read 0
>>    -- END PROG LOAD LOG --
>>    libbpf: prog 'on_event': failed to load: -E2BIG
>>    libbpf: failed to load object 'strobemeta.bpf.o'
>>    scale_test:FAIL:expect_success unexpected error: -7 (errno 7)
>>    #618     verif_scale_strobemeta:FAIL
>>
>> But if I increase the verificaiton insn limit from 1M to 10M, the above
>> test_progs run actually will succeed. The below is the result from veristat:
>>    $ ./veristat strobemeta.bpf.o
>>    Processing 'strobemeta.bpf.o'...
>>    File              Program   Verdict  Duration (us)    Insns  States  Program size  Jited size
>>    ----------------  --------  -------  -------------  -------  ------  ------------  ----------
>>    strobemeta.bpf.o  on_event  success       90250893  9777685  358230         15954       80794
>>    ----------------  --------  -------  -------------  -------  ------  ------------  ----------
>>    Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs.
>>
>> Further debugging shows the llvm commit [1] is responsible for the verificaiton
>> failure as it tries to convert certain switch statement to if-condition. Such
>> change may cause different transformation compared to original switch statement.
>>
>> In bpf program strobemeta.c case, the initial llvm ir for read_int_var() function is
>>    define internal void @read_int_var(ptr noundef %0, i64 noundef %1, ptr noundef %2,
>>        ptr noundef %3, ptr noundef %4) #2 !dbg !535 {
>>      %6 = alloca ptr, align 8
>>      %7 = alloca i64, align 8
>>      %8 = alloca ptr, align 8
>>      %9 = alloca ptr, align 8
>>      %10 = alloca ptr, align 8
>>      %11 = alloca ptr, align 8
>>      %12 = alloca i32, align 4
>>      ...
>>      %20 = icmp ne ptr %19, null, !dbg !561
>>      br i1 %20, label %22, label %21, !dbg !562
>>
>>    21:                                               ; preds = %5
>>      store i32 1, ptr %12, align 4
>>      br label %48, !dbg !563
>>
>>    22:
>>      %23 = load ptr, ptr %9, align 8, !dbg !564
>>      ...
>>
>>    47:                                               ; preds = %38, %22
>>      store i32 0, ptr %12, align 4, !dbg !588
>>      br label %48, !dbg !588
>>
>>    48:                                               ; preds = %47, %21
>>      call void @llvm.lifetime.end.p0(ptr %11) #4, !dbg !588
>>      %49 = load i32, ptr %12, align 4
>>      switch i32 %49, label %51 [
>>        i32 0, label %50
>>        i32 1, label %50
>>      ]
>>
>>    50:                                               ; preds = %48, %48
>>      ret void, !dbg !589
>>
>>    51:                                               ; preds = %48
>>      unreachable
>>    }
>>
>> Note that the above 'switch' statement is added by clang frontend.
>> Without [1], the switch statement will survive until SelectionDag,
>> so the switch statement acts like a 'barrier' and prevents some
>> transformation involved with both 'before' and 'after' the switch statement.
>>
>> But with [1], the switch statement will be removed during middle end
>> optimization and later middle end passes (esp. after inlining) have more
>> freedom to reorder the code.
>>
>> The following is the related source code:
>>
>>    static void *calc_location(struct strobe_value_loc *loc, void *tls_base):
>>          bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
>>          /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
>>          return tls_ptr && tls_ptr != (void *)-1
>>                  ? tls_ptr + tls_index.offset
>>                  : NULL;
>>
>>    In read_int_var() func, we have:
>>          void *location = calc_location(&cfg->int_locs[idx], tls_base);
>>          if (!location)
>>                  return;
>>
>>          bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
>>          ...
>>
>> The static func calc_location() is called inside read_int_var(). The asm code
>> without [1]:
>>       77: .123....89 (85) call bpf_probe_read_user#112
>>       78: ........89 (79) r1 = *(u64 *)(r10 -368)
>>       79: .1......89 (79) r2 = *(u64 *)(r10 -8)
>>       80: .12.....89 (bf) r3 = r2
>>       81: .123....89 (0f) r3 += r1
>>       82: ..23....89 (07) r2 += 1
>>       83: ..23....89 (79) r4 = *(u64 *)(r10 -464)
>>       84: ..234...89 (a5) if r2 < 0x2 goto pc+13
>>       85: ...34...89 (15) if r3 == 0x0 goto pc+12
>>       86: ...3....89 (bf) r1 = r10
>>       87: .1.3....89 (07) r1 += -400
>>       88: .1.3....89 (b4) w2 = 16
>> In this case, 'r2 < 0x2' and 'r3 == 0x0' go to null 'locaiton' place,
>> so the verifier actually prefers to do verification first at 'r1 = r10' etc.
>>
>> The asm code with [1]:
>>      119: .123....89 (85) call bpf_probe_read_user#112
>>      120: ........89 (79) r1 = *(u64 *)(r10 -368)
>>      121: .1......89 (79) r2 = *(u64 *)(r10 -8)
>>      122: .12.....89 (bf) r3 = r2
>>      123: .123....89 (0f) r3 += r1
>>      124: ..23....89 (07) r2 += -1
>>      125: ..23....89 (a5) if r2 < 0xfffffffe goto pc+6
>>      126: ........89 (05) goto pc+17
>>      ...
>>      144: ........89 (b4) w1 = 0
>>      145: .1......89 (6b) *(u16 *)(r8 +80) = r1
>> In this case, if 'r2 < 0xfffffffe' is true, the control will go to
>> non-null 'location' branch, so 'goto pc+17' will actually go to
>> null 'location' branch. This seems causing tremendous amount of
>> verificaiton state.
>>
>> To fix the issue, rewrite the following code
>>    return tls_ptr && tls_ptr != (void *)-1
>>                  ? tls_ptr + tls_index.offset
>>                  : NULL;
>> to if/then statement and hopefully these explicit if/then statements
>> are sticky during middle-end optimizations.
> this is so fragile and non-obvious... Just looking at the patch, it's
> an equivalent transformation, so as a user I'd have no clue that doing
> something like that can even matter...

You are correct. The llvm generate different codes due to compiler internal
changes, and in this case the change caused the verification failure.

>
> Have you tried adding likely() around non-NULL case? Does it change
> generated code, while leaving ternary as is?

I tried the following:

diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
index a5c74d31a244..6c0ec8794d3e 100644
--- a/tools/testing/selftests/bpf/progs/strobemeta.h
+++ b/tools/testing/selftests/bpf/progs/strobemeta.h
@@ -346,13 +346,12 @@ static void read_int_var(struct strobemeta_cfg *cfg,
                          struct strobemeta_payload *data)
  {
         void *location = calc_location(&cfg->int_locs[idx], tls_base);
-       if (!location)
-               return;
-
-       bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
-       data->int_vals[idx] = value->val;
-       if (value->header.len)
-               data->int_vals_set_mask |= (1 << idx);
+       if (likely(location)) {
+               bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
+               data->int_vals[idx] = value->val;
+               if (value->header.len)
+                       data->int_vals_set_mask |= (1 << idx);
+       }
  }


and the verification still failed (exceeding 1000000 insns).

I think that we can leave patch for a while. I will do some investigation
in llvm side to see whether I can come up with some heuristics to benefit
verifier in terms of verified insns.

>
>> Test with llvm20 and llvm21 as well and all strobemeta related selftests
>> are passed.
>>
>>    [1] https://github.com/llvm/llvm-project/pull/161000
>>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>>   tools/testing/selftests/bpf/progs/strobemeta.h | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> NOTE: I will also check whether we can make changes in llvm to automatically
>>   adjust branch statements to minimize verification insns/states.
>>
>> diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
>> index a5c74d31a244..6e1918deaf26 100644
>> --- a/tools/testing/selftests/bpf/progs/strobemeta.h
>> +++ b/tools/testing/selftests/bpf/progs/strobemeta.h
>> @@ -330,9 +330,9 @@ static void *calc_location(struct strobe_value_loc *loc, void *tls_base)
>>          }
>>          bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
>>          /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
>> -       return tls_ptr && tls_ptr != (void *)-1
>> -               ? tls_ptr + tls_index.offset
>> -               : NULL;
>> +       if (!tls_ptr || tls_ptr == (void *)-1)
>> +               return NULL;
>> +       return tls_ptr + tls_index.offset;
>>   }
>>
>>   #ifdef SUBPROGS
>> --
>> 2.47.3
>>


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22
  2025-10-15 19:56   ` Yonghong Song
@ 2025-10-15 20:31     ` Andrii Nakryiko
  2025-10-15 20:54       ` Yonghong Song
  0 siblings, 1 reply; 6+ messages in thread
From: Andrii Nakryiko @ 2025-10-15 20:31 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau

On Wed, Oct 15, 2025 at 12:56 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
>
>
> On 10/15/25 9:45 AM, Andrii Nakryiko wrote:
> > On Mon, Oct 13, 2025 at 10:16 PM Yonghong Song <yonghong.song@linux.dev> wrote:
> >> With latest llvm22, I hit the verif_scale_strobemeta selftest failure
> >> below:
> >>    $ ./test_progs -n 618
> >>    libbpf: prog 'on_event': BPF program load failed: -E2BIG
> >>    libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
> >>    BPF program is too large. Processed 1000001 insn
> >>    verification time 7019091 usec
> >>    stack depth 488
> >>    processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 33927 peak_states 12813 mark_read 0
> >>    -- END PROG LOAD LOG --
> >>    libbpf: prog 'on_event': failed to load: -E2BIG
> >>    libbpf: failed to load object 'strobemeta.bpf.o'
> >>    scale_test:FAIL:expect_success unexpected error: -7 (errno 7)
> >>    #618     verif_scale_strobemeta:FAIL
> >>
> >> But if I increase the verificaiton insn limit from 1M to 10M, the above
> >> test_progs run actually will succeed. The below is the result from veristat:
> >>    $ ./veristat strobemeta.bpf.o
> >>    Processing 'strobemeta.bpf.o'...
> >>    File              Program   Verdict  Duration (us)    Insns  States  Program size  Jited size
> >>    ----------------  --------  -------  -------------  -------  ------  ------------  ----------
> >>    strobemeta.bpf.o  on_event  success       90250893  9777685  358230         15954       80794
> >>    ----------------  --------  -------  -------------  -------  ------  ------------  ----------
> >>    Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs.
> >>
> >> Further debugging shows the llvm commit [1] is responsible for the verificaiton
> >> failure as it tries to convert certain switch statement to if-condition. Such
> >> change may cause different transformation compared to original switch statement.
> >>
> >> In bpf program strobemeta.c case, the initial llvm ir for read_int_var() function is
> >>    define internal void @read_int_var(ptr noundef %0, i64 noundef %1, ptr noundef %2,
> >>        ptr noundef %3, ptr noundef %4) #2 !dbg !535 {
> >>      %6 = alloca ptr, align 8
> >>      %7 = alloca i64, align 8
> >>      %8 = alloca ptr, align 8
> >>      %9 = alloca ptr, align 8
> >>      %10 = alloca ptr, align 8
> >>      %11 = alloca ptr, align 8
> >>      %12 = alloca i32, align 4
> >>      ...
> >>      %20 = icmp ne ptr %19, null, !dbg !561
> >>      br i1 %20, label %22, label %21, !dbg !562
> >>
> >>    21:                                               ; preds = %5
> >>      store i32 1, ptr %12, align 4
> >>      br label %48, !dbg !563
> >>
> >>    22:
> >>      %23 = load ptr, ptr %9, align 8, !dbg !564
> >>      ...
> >>
> >>    47:                                               ; preds = %38, %22
> >>      store i32 0, ptr %12, align 4, !dbg !588
> >>      br label %48, !dbg !588
> >>
> >>    48:                                               ; preds = %47, %21
> >>      call void @llvm.lifetime.end.p0(ptr %11) #4, !dbg !588
> >>      %49 = load i32, ptr %12, align 4
> >>      switch i32 %49, label %51 [
> >>        i32 0, label %50
> >>        i32 1, label %50
> >>      ]
> >>
> >>    50:                                               ; preds = %48, %48
> >>      ret void, !dbg !589
> >>
> >>    51:                                               ; preds = %48
> >>      unreachable
> >>    }
> >>
> >> Note that the above 'switch' statement is added by clang frontend.
> >> Without [1], the switch statement will survive until SelectionDag,
> >> so the switch statement acts like a 'barrier' and prevents some
> >> transformation involved with both 'before' and 'after' the switch statement.
> >>
> >> But with [1], the switch statement will be removed during middle end
> >> optimization and later middle end passes (esp. after inlining) have more
> >> freedom to reorder the code.
> >>
> >> The following is the related source code:
> >>
> >>    static void *calc_location(struct strobe_value_loc *loc, void *tls_base):
> >>          bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
> >>          /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
> >>          return tls_ptr && tls_ptr != (void *)-1
> >>                  ? tls_ptr + tls_index.offset
> >>                  : NULL;
> >>
> >>    In read_int_var() func, we have:
> >>          void *location = calc_location(&cfg->int_locs[idx], tls_base);
> >>          if (!location)
> >>                  return;
> >>
> >>          bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
> >>          ...
> >>
> >> The static func calc_location() is called inside read_int_var(). The asm code
> >> without [1]:
> >>       77: .123....89 (85) call bpf_probe_read_user#112
> >>       78: ........89 (79) r1 = *(u64 *)(r10 -368)
> >>       79: .1......89 (79) r2 = *(u64 *)(r10 -8)
> >>       80: .12.....89 (bf) r3 = r2
> >>       81: .123....89 (0f) r3 += r1
> >>       82: ..23....89 (07) r2 += 1
> >>       83: ..23....89 (79) r4 = *(u64 *)(r10 -464)
> >>       84: ..234...89 (a5) if r2 < 0x2 goto pc+13
> >>       85: ...34...89 (15) if r3 == 0x0 goto pc+12
> >>       86: ...3....89 (bf) r1 = r10
> >>       87: .1.3....89 (07) r1 += -400
> >>       88: .1.3....89 (b4) w2 = 16
> >> In this case, 'r2 < 0x2' and 'r3 == 0x0' go to null 'locaiton' place,
> >> so the verifier actually prefers to do verification first at 'r1 = r10' etc.
> >>
> >> The asm code with [1]:
> >>      119: .123....89 (85) call bpf_probe_read_user#112
> >>      120: ........89 (79) r1 = *(u64 *)(r10 -368)
> >>      121: .1......89 (79) r2 = *(u64 *)(r10 -8)
> >>      122: .12.....89 (bf) r3 = r2
> >>      123: .123....89 (0f) r3 += r1
> >>      124: ..23....89 (07) r2 += -1
> >>      125: ..23....89 (a5) if r2 < 0xfffffffe goto pc+6
> >>      126: ........89 (05) goto pc+17
> >>      ...
> >>      144: ........89 (b4) w1 = 0
> >>      145: .1......89 (6b) *(u16 *)(r8 +80) = r1
> >> In this case, if 'r2 < 0xfffffffe' is true, the control will go to
> >> non-null 'location' branch, so 'goto pc+17' will actually go to
> >> null 'location' branch. This seems causing tremendous amount of
> >> verificaiton state.
> >>
> >> To fix the issue, rewrite the following code
> >>    return tls_ptr && tls_ptr != (void *)-1
> >>                  ? tls_ptr + tls_index.offset
> >>                  : NULL;
> >> to if/then statement and hopefully these explicit if/then statements
> >> are sticky during middle-end optimizations.
> > this is so fragile and non-obvious... Just looking at the patch, it's
> > an equivalent transformation, so as a user I'd have no clue that doing
> > something like that can even matter...
>
> You are correct. The llvm generate different codes due to compiler internal
> changes, and in this case the change caused the verification failure.
>
> >
> > Have you tried adding likely() around non-NULL case? Does it change
> > generated code, while leaving ternary as is?
>
> I tried the following:
>
> diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
> index a5c74d31a244..6c0ec8794d3e 100644
> --- a/tools/testing/selftests/bpf/progs/strobemeta.h
> +++ b/tools/testing/selftests/bpf/progs/strobemeta.h
> @@ -346,13 +346,12 @@ static void read_int_var(struct strobemeta_cfg *cfg,
>                           struct strobemeta_payload *data)
>   {
>          void *location = calc_location(&cfg->int_locs[idx], tls_base);
> -       if (!location)
> -               return;
> -
> -       bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
> -       data->int_vals[idx] = value->val;
> -       if (value->header.len)
> -               data->int_vals_set_mask |= (1 << idx);
> +       if (likely(location)) {
> +               bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
> +               data->int_vals[idx] = value->val;
> +               if (value->header.len)
> +                       data->int_vals_set_mask |= (1 << idx);
> +       }
>   }
>
>
> and the verification still failed (exceeding 1000000 insns).

I was thinking to add likely like so:

return likely(tls_ptr && tls_ptr != (void *)-1) ? tls_ptr +
tls_index.offset : NULL;


and then hope that Clang will prioritize leaving non-NULL code path as
linear as possible

>
> I think that we can leave patch for a while. I will do some investigation
> in llvm side to see whether I can come up with some heuristics to benefit
> verifier in terms of verified insns.
>
> >
> >> Test with llvm20 and llvm21 as well and all strobemeta related selftests
> >> are passed.
> >>
> >>    [1] https://github.com/llvm/llvm-project/pull/161000
> >>
> >> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> >> ---
> >>   tools/testing/selftests/bpf/progs/strobemeta.h | 6 +++---
> >>   1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> NOTE: I will also check whether we can make changes in llvm to automatically
> >>   adjust branch statements to minimize verification insns/states.
> >>
> >> diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
> >> index a5c74d31a244..6e1918deaf26 100644
> >> --- a/tools/testing/selftests/bpf/progs/strobemeta.h
> >> +++ b/tools/testing/selftests/bpf/progs/strobemeta.h
> >> @@ -330,9 +330,9 @@ static void *calc_location(struct strobe_value_loc *loc, void *tls_base)
> >>          }
> >>          bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
> >>          /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
> >> -       return tls_ptr && tls_ptr != (void *)-1
> >> -               ? tls_ptr + tls_index.offset
> >> -               : NULL;
> >> +       if (!tls_ptr || tls_ptr == (void *)-1)
> >> +               return NULL;
> >> +       return tls_ptr + tls_index.offset;
> >>   }
> >>
> >>   #ifdef SUBPROGS
> >> --
> >> 2.47.3
> >>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22
  2025-10-15 20:31     ` Andrii Nakryiko
@ 2025-10-15 20:54       ` Yonghong Song
  0 siblings, 0 replies; 6+ messages in thread
From: Yonghong Song @ 2025-10-15 20:54 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau



On 10/15/25 1:31 PM, Andrii Nakryiko wrote:
> On Wed, Oct 15, 2025 at 12:56 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>>
>>
>> On 10/15/25 9:45 AM, Andrii Nakryiko wrote:
>>> On Mon, Oct 13, 2025 at 10:16 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>>>> With latest llvm22, I hit the verif_scale_strobemeta selftest failure
>>>> below:
>>>>     $ ./test_progs -n 618
>>>>     libbpf: prog 'on_event': BPF program load failed: -E2BIG
>>>>     libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
>>>>     BPF program is too large. Processed 1000001 insn
>>>>     verification time 7019091 usec
>>>>     stack depth 488
>>>>     processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 33927 peak_states 12813 mark_read 0
>>>>     -- END PROG LOAD LOG --
>>>>     libbpf: prog 'on_event': failed to load: -E2BIG
>>>>     libbpf: failed to load object 'strobemeta.bpf.o'
>>>>     scale_test:FAIL:expect_success unexpected error: -7 (errno 7)
>>>>     #618     verif_scale_strobemeta:FAIL
>>>>
>>>> But if I increase the verificaiton insn limit from 1M to 10M, the above
>>>> test_progs run actually will succeed. The below is the result from veristat:
>>>>     $ ./veristat strobemeta.bpf.o
>>>>     Processing 'strobemeta.bpf.o'...
>>>>     File              Program   Verdict  Duration (us)    Insns  States  Program size  Jited size
>>>>     ----------------  --------  -------  -------------  -------  ------  ------------  ----------
>>>>     strobemeta.bpf.o  on_event  success       90250893  9777685  358230         15954       80794
>>>>     ----------------  --------  -------  -------------  -------  ------  ------------  ----------
>>>>     Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs.
>>>>
>>>> Further debugging shows the llvm commit [1] is responsible for the verificaiton
>>>> failure as it tries to convert certain switch statement to if-condition. Such
>>>> change may cause different transformation compared to original switch statement.
>>>>
>>>> In bpf program strobemeta.c case, the initial llvm ir for read_int_var() function is
>>>>     define internal void @read_int_var(ptr noundef %0, i64 noundef %1, ptr noundef %2,
>>>>         ptr noundef %3, ptr noundef %4) #2 !dbg !535 {
>>>>       %6 = alloca ptr, align 8
>>>>       %7 = alloca i64, align 8
>>>>       %8 = alloca ptr, align 8
>>>>       %9 = alloca ptr, align 8
>>>>       %10 = alloca ptr, align 8
>>>>       %11 = alloca ptr, align 8
>>>>       %12 = alloca i32, align 4
>>>>       ...
>>>>       %20 = icmp ne ptr %19, null, !dbg !561
>>>>       br i1 %20, label %22, label %21, !dbg !562
>>>>
>>>>     21:                                               ; preds = %5
>>>>       store i32 1, ptr %12, align 4
>>>>       br label %48, !dbg !563
>>>>
>>>>     22:
>>>>       %23 = load ptr, ptr %9, align 8, !dbg !564
>>>>       ...
>>>>
>>>>     47:                                               ; preds = %38, %22
>>>>       store i32 0, ptr %12, align 4, !dbg !588
>>>>       br label %48, !dbg !588
>>>>
>>>>     48:                                               ; preds = %47, %21
>>>>       call void @llvm.lifetime.end.p0(ptr %11) #4, !dbg !588
>>>>       %49 = load i32, ptr %12, align 4
>>>>       switch i32 %49, label %51 [
>>>>         i32 0, label %50
>>>>         i32 1, label %50
>>>>       ]
>>>>
>>>>     50:                                               ; preds = %48, %48
>>>>       ret void, !dbg !589
>>>>
>>>>     51:                                               ; preds = %48
>>>>       unreachable
>>>>     }
>>>>
>>>> Note that the above 'switch' statement is added by clang frontend.
>>>> Without [1], the switch statement will survive until SelectionDag,
>>>> so the switch statement acts like a 'barrier' and prevents some
>>>> transformation involved with both 'before' and 'after' the switch statement.
>>>>
>>>> But with [1], the switch statement will be removed during middle end
>>>> optimization and later middle end passes (esp. after inlining) have more
>>>> freedom to reorder the code.
>>>>
>>>> The following is the related source code:
>>>>
>>>>     static void *calc_location(struct strobe_value_loc *loc, void *tls_base):
>>>>           bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
>>>>           /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
>>>>           return tls_ptr && tls_ptr != (void *)-1
>>>>                   ? tls_ptr + tls_index.offset
>>>>                   : NULL;
>>>>
>>>>     In read_int_var() func, we have:
>>>>           void *location = calc_location(&cfg->int_locs[idx], tls_base);
>>>>           if (!location)
>>>>                   return;
>>>>
>>>>           bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
>>>>           ...
>>>>
>>>> The static func calc_location() is called inside read_int_var(). The asm code
>>>> without [1]:
>>>>        77: .123....89 (85) call bpf_probe_read_user#112
>>>>        78: ........89 (79) r1 = *(u64 *)(r10 -368)
>>>>        79: .1......89 (79) r2 = *(u64 *)(r10 -8)
>>>>        80: .12.....89 (bf) r3 = r2
>>>>        81: .123....89 (0f) r3 += r1
>>>>        82: ..23....89 (07) r2 += 1
>>>>        83: ..23....89 (79) r4 = *(u64 *)(r10 -464)
>>>>        84: ..234...89 (a5) if r2 < 0x2 goto pc+13
>>>>        85: ...34...89 (15) if r3 == 0x0 goto pc+12
>>>>        86: ...3....89 (bf) r1 = r10
>>>>        87: .1.3....89 (07) r1 += -400
>>>>        88: .1.3....89 (b4) w2 = 16
>>>> In this case, 'r2 < 0x2' and 'r3 == 0x0' go to null 'locaiton' place,
>>>> so the verifier actually prefers to do verification first at 'r1 = r10' etc.
>>>>
>>>> The asm code with [1]:
>>>>       119: .123....89 (85) call bpf_probe_read_user#112
>>>>       120: ........89 (79) r1 = *(u64 *)(r10 -368)
>>>>       121: .1......89 (79) r2 = *(u64 *)(r10 -8)
>>>>       122: .12.....89 (bf) r3 = r2
>>>>       123: .123....89 (0f) r3 += r1
>>>>       124: ..23....89 (07) r2 += -1
>>>>       125: ..23....89 (a5) if r2 < 0xfffffffe goto pc+6
>>>>       126: ........89 (05) goto pc+17
>>>>       ...
>>>>       144: ........89 (b4) w1 = 0
>>>>       145: .1......89 (6b) *(u16 *)(r8 +80) = r1
>>>> In this case, if 'r2 < 0xfffffffe' is true, the control will go to
>>>> non-null 'location' branch, so 'goto pc+17' will actually go to
>>>> null 'location' branch. This seems causing tremendous amount of
>>>> verificaiton state.
>>>>
>>>> To fix the issue, rewrite the following code
>>>>     return tls_ptr && tls_ptr != (void *)-1
>>>>                   ? tls_ptr + tls_index.offset
>>>>                   : NULL;
>>>> to if/then statement and hopefully these explicit if/then statements
>>>> are sticky during middle-end optimizations.
>>> this is so fragile and non-obvious... Just looking at the patch, it's
>>> an equivalent transformation, so as a user I'd have no clue that doing
>>> something like that can even matter...
>> You are correct. The llvm generate different codes due to compiler internal
>> changes, and in this case the change caused the verification failure.
>>
>>> Have you tried adding likely() around non-NULL case? Does it change
>>> generated code, while leaving ternary as is?
>> I tried the following:
>>
>> diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
>> index a5c74d31a244..6c0ec8794d3e 100644
>> --- a/tools/testing/selftests/bpf/progs/strobemeta.h
>> +++ b/tools/testing/selftests/bpf/progs/strobemeta.h
>> @@ -346,13 +346,12 @@ static void read_int_var(struct strobemeta_cfg *cfg,
>>                            struct strobemeta_payload *data)
>>    {
>>           void *location = calc_location(&cfg->int_locs[idx], tls_base);
>> -       if (!location)
>> -               return;
>> -
>> -       bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
>> -       data->int_vals[idx] = value->val;
>> -       if (value->header.len)
>> -               data->int_vals_set_mask |= (1 << idx);
>> +       if (likely(location)) {
>> +               bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
>> +               data->int_vals[idx] = value->val;
>> +               if (value->header.len)
>> +                       data->int_vals_set_mask |= (1 << idx);
>> +       }
>>    }
>>
>>
>> and the verification still failed (exceeding 1000000 insns).
> I was thinking to add likely like so:
>
> return likely(tls_ptr && tls_ptr != (void *)-1) ? tls_ptr +
> tls_index.offset : NULL;
>
>
> and then hope that Clang will prioritize leaving non-NULL code path as
> linear as possible

Sadly with the suggested change:

diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
index a5c74d31a244..d31d5fb1e96e 100644
--- a/tools/testing/selftests/bpf/progs/strobemeta.h
+++ b/tools/testing/selftests/bpf/progs/strobemeta.h
@@ -330,7 +330,7 @@ static void *calc_location(struct strobe_value_loc *loc, void *tls_base)
         }
         bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
         /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
-       return tls_ptr && tls_ptr != (void *)-1
+       return likely(tls_ptr && tls_ptr != (void *)-1)
                 ? tls_ptr + tls_index.offset
                 : NULL;
  }

the verification still fails. Stubborn clang :-(

>
>> I think that we can leave patch for a while. I will do some investigation
>> in llvm side to see whether I can come up with some heuristics to benefit
>> verifier in terms of verified insns.
>>
>>>> Test with llvm20 and llvm21 as well and all strobemeta related selftests
>>>> are passed.
>>>>
>>>>     [1] https://github.com/llvm/llvm-project/pull/161000
>>>>
>>>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>>>> ---
>>>>    tools/testing/selftests/bpf/progs/strobemeta.h | 6 +++---
>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> NOTE: I will also check whether we can make changes in llvm to automatically
>>>>    adjust branch statements to minimize verification insns/states.
>>>>
>>>> diff --git a/tools/testing/selftests/bpf/progs/strobemeta.h b/tools/testing/selftests/bpf/progs/strobemeta.h
>>>> index a5c74d31a244..6e1918deaf26 100644
>>>> --- a/tools/testing/selftests/bpf/progs/strobemeta.h
>>>> +++ b/tools/testing/selftests/bpf/progs/strobemeta.h
>>>> @@ -330,9 +330,9 @@ static void *calc_location(struct strobe_value_loc *loc, void *tls_base)
>>>>           }
>>>>           bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv);
>>>>           /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */
>>>> -       return tls_ptr && tls_ptr != (void *)-1
>>>> -               ? tls_ptr + tls_index.offset
>>>> -               : NULL;
>>>> +       if (!tls_ptr || tls_ptr == (void *)-1)
>>>> +               return NULL;
>>>> +       return tls_ptr + tls_index.offset;
>>>>    }
>>>>
>>>>    #ifdef SUBPROGS
>>>> --
>>>> 2.47.3
>>>>


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22
  2025-10-14  5:16 [PATCH bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22 Yonghong Song
  2025-10-15 16:45 ` Andrii Nakryiko
@ 2025-10-19  2:30 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-10-19  2:30 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, ast, andrii, daniel, kernel-team, martin.lau

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon, 13 Oct 2025 22:16:39 -0700 you wrote:
> With latest llvm22, I hit the verif_scale_strobemeta selftest failure
> below:
>   $ ./test_progs -n 618
>   libbpf: prog 'on_event': BPF program load failed: -E2BIG
>   libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
>   BPF program is too large. Processed 1000001 insn
>   verification time 7019091 usec
>   stack depth 488
>   processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 33927 peak_states 12813 mark_read 0
>   -- END PROG LOAD LOG --
>   libbpf: prog 'on_event': failed to load: -E2BIG
>   libbpf: failed to load object 'strobemeta.bpf.o'
>   scale_test:FAIL:expect_success unexpected error: -7 (errno 7)
>   #618     verif_scale_strobemeta:FAIL
> 
> [...]

Here is the summary with links:
  - [bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22
    https://git.kernel.org/bpf/bpf-next/c/4f8543b5f20f

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-10-19  2:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-14  5:16 [PATCH bpf-next] selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22 Yonghong Song
2025-10-15 16:45 ` Andrii Nakryiko
2025-10-15 19:56   ` Yonghong Song
2025-10-15 20:31     ` Andrii Nakryiko
2025-10-15 20:54       ` Yonghong Song
2025-10-19  2:30 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox