* [PATCH bpf-next v6 0/2] bpf, x86: inline bpf_get_current_task() for x86_64
@ 2026-01-20 7:05 Menglong Dong
2026-01-20 7:05 ` [PATCH bpf-next v6 1/2] " Menglong Dong
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Menglong Dong @ 2026-01-20 7:05 UTC (permalink / raw)
To: ast, eddyz87
Cc: davem, dsahern, daniel, andrii, martin.lau, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
Inline bpf_get_current_task() and bpf_get_current_task_btf() for x86_64
to obtain better performance, and add the testcase for it.
Changes since v5:
* remove unnecessary 'ifdef' and __description in the selftests
* v5: https://lore.kernel.org/bpf/20260119070246.249499-1-dongml2@chinatelecom.cn/
Changes since v4:
* don't support the !CONFIG_SMP case
* v4: https://lore.kernel.org/bpf/20260112104529.224645-1-dongml2@chinatelecom.cn/
Changes since v3:
* handle the !CONFIG_SMP case
* ignore the !CONFIG_SMP case in the testcase, as we enable CONFIG_SMP
for x86_64 in the selftests
Changes since v2:
* implement it in the verifier with BPF_MOV64_PERCPU_REG() instead of in
x86_64 JIT (Alexei).
Changes since v1:
* add the testcase
* remove the usage of const_current_task
Menglong Dong (2):
bpf, x86: inline bpf_get_current_task() for x86_64
selftests/bpf: test the jited inline of bpf_get_current_task
kernel/bpf/verifier.c | 22 +++++++++++++++++++
.../selftests/bpf/prog_tests/verifier.c | 2 ++
.../selftests/bpf/progs/verifier_jit_inline.c | 20 +++++++++++++++++
3 files changed, 44 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/verifier_jit_inline.c
--
2.52.0
^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v6 1/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-20  7:05 [PATCH bpf-next v6 0/2] bpf, x86: inline bpf_get_current_task() for x86_64 Menglong Dong
@ 2026-01-20  7:05 ` Menglong Dong
  2026-01-21  1:23   ` Andrii Nakryiko
  2026-01-20  7:05 ` [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task Menglong Dong
  2026-01-21  4:50 ` [PATCH bpf-next v6 0/2] bpf, x86: inline bpf_get_current_task() for x86_64 patchwork-bot+netdevbpf
  2 siblings, 1 reply; 17+ messages in thread

From: Menglong Dong @ 2026-01-20  7:05 UTC (permalink / raw)
To: ast, eddyz87
Cc: davem, dsahern, daniel, andrii, martin.lau, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, netdev, bpf, linux-kernel

Inline bpf_get_current_task() and bpf_get_current_task_btf() for x86_64
to obtain better performance.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
---
v5:
- don't support the !CONFIG_SMP case

v4:
- handle the !CONFIG_SMP case

v3:
- implement it in the verifier with BPF_MOV64_PERCPU_REG() instead of in
  x86_64 JIT.
---
 kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9de0ec0c3ed9..c4e2ffadfb1f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17739,6 +17739,10 @@ static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
 	switch (imm) {
 #ifdef CONFIG_X86_64
 	case BPF_FUNC_get_smp_processor_id:
+#ifdef CONFIG_SMP
+	case BPF_FUNC_get_current_task_btf:
+	case BPF_FUNC_get_current_task:
+#endif
 		return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
 #endif
 	default:
@@ -23319,6 +23323,24 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			insn = new_prog->insnsi + i + delta;
 			goto next_insn;
 		}
+
+		/* Implement bpf_get_current_task() and bpf_get_current_task_btf() inline. */
+		if ((insn->imm == BPF_FUNC_get_current_task || insn->imm == BPF_FUNC_get_current_task_btf) &&
+		    verifier_inlines_helper_call(env, insn->imm)) {
+			insn_buf[0] = BPF_MOV64_IMM(BPF_REG_0, (u32)(unsigned long)&current_task);
+			insn_buf[1] = BPF_MOV64_PERCPU_REG(BPF_REG_0, BPF_REG_0);
+			insn_buf[2] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0);
+			cnt = 3;
+
+			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta += cnt - 1;
+			env->prog = prog = new_prog;
+			insn = new_prog->insnsi + i + delta;
+			goto next_insn;
+		}
 #endif
 		/* Implement bpf_get_func_arg inline. */
 		if (prog_type == BPF_PROG_TYPE_TRACING &&
-- 
2.52.0
* Re: [PATCH bpf-next v6 1/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-20  7:05 ` [PATCH bpf-next v6 1/2] " Menglong Dong
@ 2026-01-21  1:23   ` Andrii Nakryiko
  2026-01-21  1:43     ` Alexei Starovoitov
  2026-01-21  1:58     ` Menglong Dong
  0 siblings, 2 replies; 17+ messages in thread

From: Andrii Nakryiko @ 2026-01-21  1:23 UTC (permalink / raw)
To: Menglong Dong
Cc: ast, eddyz87, davem, dsahern, daniel, andrii, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
	mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel

On Mon, Jan 19, 2026 at 11:06 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> Inline bpf_get_current_task() and bpf_get_current_task_btf() for x86_64
> to obtain better performance.
>
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
> v5:
> - don't support the !CONFIG_SMP case
>
> v4:
> - handle the !CONFIG_SMP case
>
> v3:
> - implement it in the verifier with BPF_MOV64_PERCPU_REG() instead of in
>   x86_64 JIT.
> ---
> kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 9de0ec0c3ed9..c4e2ffadfb1f 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -17739,6 +17739,10 @@ static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> switch (imm) {
> #ifdef CONFIG_X86_64
> case BPF_FUNC_get_smp_processor_id:
> +#ifdef CONFIG_SMP
> + case BPF_FUNC_get_current_task_btf:
> + case BPF_FUNC_get_current_task:
> +#endif

Does this have to be x86-64 specific inlining? With verifier inlining
and per_cpu instruction support it should theoretically work across
all architectures that do support per-cpu instruction, no?

Eduard pointed out [0] to me for why we have that x86-64 specific
check.
But looking at do_misc_fixups(), we have that early
bpf_jit_inlines_helper_call(insn->imm) check, so if some JIT has a more
performant inlining implementation, we will just do that.

So it seems like we can just drop all that x86-64 specific logic and
claim all three of these functions as inlinable, no?

And even more. We can drop the rather confusing
verifier_inlines_helper_call() that duplicates the decision of which
helpers can be inlined or not, and have:

if (env->prog->jit_requested && bpf_jit_supports_percpu_insn()) {
    switch (insn->imm) {
    case BPF_FUNC_get_smp_processor_id:
        ...
        break;
    case BPF_FUNC_get_current_task:
    case BPF_FUNC_get_current_task_btf:
        ...
        break;
    default:
    }

And the decision about inlining will live in one place.

Or am I missing some complications?

And with all that, should we mark get_current_task and
get_current_task_btf as __bpf_fastcall?

[0] https://lore.kernel.org/all/20240722233844.1406874-4-eddyz87@gmail.com/

> 	return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
> #endif
> default:
> @@ -23319,6 +23323,24 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> insn = new_prog->insnsi + i + delta;
> goto next_insn;
> }
> +
> + /* Implement bpf_get_current_task() and bpf_get_current_task_btf() inline. */
> + if ((insn->imm == BPF_FUNC_get_current_task || insn->imm == BPF_FUNC_get_current_task_btf) &&
> +     verifier_inlines_helper_call(env, insn->imm)) {
> + insn_buf[0] = BPF_MOV64_IMM(BPF_REG_0, (u32)(unsigned long)&current_task);
> + insn_buf[1] = BPF_MOV64_PERCPU_REG(BPF_REG_0, BPF_REG_0);
> + insn_buf[2] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0);
> + cnt = 3;
> +
> + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
> + if (!new_prog)
> + return -ENOMEM;
> +
> + delta += cnt - 1;
> + env->prog = prog = new_prog;
> + insn = new_prog->insnsi + i + delta;
> + goto next_insn;
> + }
> #endif
> /* Implement bpf_get_func_arg inline. */
> if (prog_type == BPF_PROG_TYPE_TRACING &&
> --
> 2.52.0
* Re: [PATCH bpf-next v6 1/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-21  1:23 ` Andrii Nakryiko
@ 2026-01-21  1:43   ` Alexei Starovoitov
  2026-01-21  1:58   ` Menglong Dong
  1 sibling, 0 replies; 17+ messages in thread

From: Alexei Starovoitov @ 2026-01-21  1:43 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Menglong Dong, Alexei Starovoitov, Eduard, David S. Miller,
	David Ahern, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	Network Development, bpf, LKML

On Tue, Jan 20, 2026 at 5:24 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Jan 19, 2026 at 11:06 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > Inline bpf_get_current_task() and bpf_get_current_task_btf() for x86_64
> > to obtain better performance.
> >
> > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> > v5:
> > - don't support the !CONFIG_SMP case
> >
> > v4:
> > - handle the !CONFIG_SMP case
> >
> > v3:
> > - implement it in the verifier with BPF_MOV64_PERCPU_REG() instead of in
> >   x86_64 JIT.
> > ---
> > kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 9de0ec0c3ed9..c4e2ffadfb1f 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -17739,6 +17739,10 @@ static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> > switch (imm) {
> > #ifdef CONFIG_X86_64
> > case BPF_FUNC_get_smp_processor_id:
> > +#ifdef CONFIG_SMP
> > + case BPF_FUNC_get_current_task_btf:
> > + case BPF_FUNC_get_current_task:
> > +#endif
>
> Does this have to be x86-64 specific inlining? With verifier inlining
> and per_cpu instruction support it should theoretically work across
> all architectures that do support per-cpu instruction, no?
>
> Eduard pointed out [0] to me for why we have that x86-64 specific
> check. But looking at do_misc_fixups(), we have that early
> bpf_jit_inlines_helper_call(insn->imm) check, so if some JIT has a more
> performant inlining implementation, we will just do that.
>
> So it seems like we can just drop all that x86-64 specific logic and
> claim all three of these functions as inlinable, no?
>
> And even more. We can drop the rather confusing
> verifier_inlines_helper_call() that duplicates the decision of which
> helpers can be inlined or not, and have:
>
> if (env->prog->jit_requested && bpf_jit_supports_percpu_insn()) {
>     switch (insn->imm) {
>     case BPF_FUNC_get_smp_processor_id:
>         ...
>         break;
>     case BPF_FUNC_get_current_task:
>     case BPF_FUNC_get_current_task_btf:
>         ...
>         break;
>     default:
>     }
>
> And the decision about inlining will live in one place.
>
> Or am I missing some complications?

I think it needs to be arch specific, since 'current' is arch specific.
x86 is different from arm64.
Though both JITs support the percpu pseudo insn, it doesn't help to make
get_current inlining generic. One has to analyze each arch individually.
* Re: [PATCH bpf-next v6 1/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-21  1:23 ` Andrii Nakryiko
  2026-01-21  1:43 ` Alexei Starovoitov
@ 2026-01-21  1:58   ` Menglong Dong
  2026-01-21  3:10     ` Alexei Starovoitov
  1 sibling, 1 reply; 17+ messages in thread

From: Menglong Dong @ 2026-01-21  1:58 UTC (permalink / raw)
To: Menglong Dong, Andrii Nakryiko
Cc: ast, eddyz87, davem, dsahern, daniel, andrii, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
	mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel

On 2026/1/21 09:23 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Mon, Jan 19, 2026 at 11:06 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > Inline bpf_get_current_task() and bpf_get_current_task_btf() for x86_64
> > to obtain better performance.
> >
> > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> > v5:
> > - don't support the !CONFIG_SMP case
> >
> > v4:
> > - handle the !CONFIG_SMP case
> >
> > v3:
> > - implement it in the verifier with BPF_MOV64_PERCPU_REG() instead of in
> >   x86_64 JIT.
> > ---
> > kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 9de0ec0c3ed9..c4e2ffadfb1f 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -17739,6 +17739,10 @@ static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> > switch (imm) {
> > #ifdef CONFIG_X86_64
> > case BPF_FUNC_get_smp_processor_id:
> > +#ifdef CONFIG_SMP
> > + case BPF_FUNC_get_current_task_btf:
> > + case BPF_FUNC_get_current_task:
> > +#endif
>
> Does this have to be x86-64 specific inlining? With verifier inlining
> and per_cpu instruction support it should theoretically work across
> all architectures that do support per-cpu instruction, no?
>
> Eduard pointed out [0] to me for why we have that x86-64 specific
> check. But looking at do_misc_fixups(), we have that early
> bpf_jit_inlines_helper_call(insn->imm) check, so if some JIT has a more
> performant inlining implementation, we will just do that.
>
> So it seems like we can just drop all that x86-64 specific logic and
> claim all three of these functions as inlinable, no?
>
> And even more. We can drop the rather confusing
> verifier_inlines_helper_call() that duplicates the decision of which
> helpers can be inlined or not, and have:

The verifier_inlines_helper_call() is confusing, but I think we can't
remove the x86-64 check. For example, some architectures don't support
BPF_FUNC_get_current_task in either bpf_jit_inlines_helper_call() or
verifier_inlines_helper_call(), which means it can't be inlined.

> if (env->prog->jit_requested && bpf_jit_supports_percpu_insn()) {
>     switch (insn->imm) {
>     case BPF_FUNC_get_smp_processor_id:
>         ...
>         break;
>     case BPF_FUNC_get_current_task:
>     case BPF_FUNC_get_current_task_btf:
>         ...
>         break;
>     default:
>     }
>
> And the decision about inlining will live in one place.
>
> Or am I missing some complications?

As Alexei said, the implementation of "current" is architecture specific,
and the per-cpu variable "current_task" only exists on x86_64.

> And with all that, should we mark get_current_task and
> get_current_task_btf as __bpf_fastcall?

I think it makes sense, and I saw that bpf_get_smp_processor_id does
such an operation:

const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
	[...]
	.allow_fastcall = true,
};

PS: I'm a little confused about the fastcall. We inline many helpers,
but it seems that bpf_get_smp_processor_id is the only one that
uses "allow_fastcall". Why? I'd better study harder.

Thanks!
Menglong Dong

> [0] https://lore.kernel.org/all/20240722233844.1406874-4-eddyz87@gmail.com/
>
> > 	return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
> > #endif
> > default:
> > @@ -23319,6 +23323,24 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > insn = new_prog->insnsi + i + delta;
> > goto next_insn;
> > }
> > +
> > + /* Implement bpf_get_current_task() and bpf_get_current_task_btf() inline. */
> > + if ((insn->imm == BPF_FUNC_get_current_task || insn->imm == BPF_FUNC_get_current_task_btf) &&
> > +     verifier_inlines_helper_call(env, insn->imm)) {
> > + insn_buf[0] = BPF_MOV64_IMM(BPF_REG_0, (u32)(unsigned long)&current_task);
> > + insn_buf[1] = BPF_MOV64_PERCPU_REG(BPF_REG_0, BPF_REG_0);
> > + insn_buf[2] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0);
> > + cnt = 3;
> > +
> > + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
> > + if (!new_prog)
> > + return -ENOMEM;
> > +
> > + delta += cnt - 1;
> > + env->prog = prog = new_prog;
> > + insn = new_prog->insnsi + i + delta;
> > + goto next_insn;
> > + }
> > #endif
> > /* Implement bpf_get_func_arg inline. */
> > if (prog_type == BPF_PROG_TYPE_TRACING &&
> > --
> > 2.52.0
> >
* Re: [PATCH bpf-next v6 1/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-21  1:58 ` Menglong Dong
@ 2026-01-21  3:10   ` Alexei Starovoitov
  2026-01-21  3:37     ` Menglong Dong
  2026-01-21  4:12     ` Andrii Nakryiko
  0 siblings, 2 replies; 17+ messages in thread

From: Alexei Starovoitov @ 2026-01-21  3:10 UTC (permalink / raw)
To: Menglong Dong
Cc: Menglong Dong, Andrii Nakryiko, Alexei Starovoitov, Eduard,
	David S. Miller, David Ahern, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	Network Development, bpf, LKML

On Tue, Jan 20, 2026 at 5:58 PM Menglong Dong <menglong.dong@linux.dev> wrote:
>
> On 2026/1/21 09:23 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > On Mon, Jan 19, 2026 at 11:06 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > >
> > > Inline bpf_get_current_task() and bpf_get_current_task_btf() for x86_64
> > > to obtain better performance.
> > >
> > > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > > Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> > > ---
> > > v5:
> > > - don't support the !CONFIG_SMP case
> > >
> > > v4:
> > > - handle the !CONFIG_SMP case
> > >
> > > v3:
> > > - implement it in the verifier with BPF_MOV64_PERCPU_REG() instead of in
> > >   x86_64 JIT.
> > > ---
> > > kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
> > > 1 file changed, 22 insertions(+)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 9de0ec0c3ed9..c4e2ffadfb1f 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -17739,6 +17739,10 @@ static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> > > switch (imm) {
> > > #ifdef CONFIG_X86_64
> > > case BPF_FUNC_get_smp_processor_id:
> > > +#ifdef CONFIG_SMP
> > > + case BPF_FUNC_get_current_task_btf:
> > > + case BPF_FUNC_get_current_task:
> > > +#endif
> >
> > Does this have to be x86-64 specific inlining? With verifier inlining
> > and per_cpu instruction support it should theoretically work across
> > all architectures that do support per-cpu instruction, no?
> >
> > Eduard pointed out [0] to me for why we have that x86-64 specific
> > check. But looking at do_misc_fixups(), we have that early
> > bpf_jit_inlines_helper_call(insn->imm) check, so if some JIT has a more
> > performant inlining implementation, we will just do that.
> >
> > So it seems like we can just drop all that x86-64 specific logic and
> > claim all three of these functions as inlinable, no?
> >
> > And even more. We can drop the rather confusing
> > verifier_inlines_helper_call() that duplicates the decision of which
> > helpers can be inlined or not, and have:
>
> The verifier_inlines_helper_call() is confusing, but I think we can't
> remove the x86-64 check. For example, some architectures don't support
> BPF_FUNC_get_current_task in either bpf_jit_inlines_helper_call() or
> verifier_inlines_helper_call(), which means it can't be inlined.
>
> > if (env->prog->jit_requested && bpf_jit_supports_percpu_insn()) {
> >     switch (insn->imm) {
> >     case BPF_FUNC_get_smp_processor_id:
> >         ...
> >         break;
> >     case BPF_FUNC_get_current_task:
> >     case BPF_FUNC_get_current_task_btf:
> >         ...
> >         break;
> >     default:
> >     }
> >
> > And the decision about inlining will live in one place.
> >
> > Or am I missing some complications?
>
> As Alexei said, the implementation of "current" is architecture specific,
> and the per-cpu variable "current_task" only exists on x86_64.
>
> > And with all that, should we mark get_current_task and
> > get_current_task_btf as __bpf_fastcall?
>
> I think it makes sense, and I saw that bpf_get_smp_processor_id does
> such an operation:
>
> const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> 	[...]
> 	.allow_fastcall = true,
> };
>
> PS: I'm a little confused about the fastcall. We inline many helpers,
> but it seems that bpf_get_smp_processor_id is the only one that
> uses "allow_fastcall". Why? I'd better study harder.

It's
static __bpf_fastcall __u32 (* const bpf_get_smp_processor_id)(void) =
(void *) 8;

and
#define __bpf_fastcall __attribute__((bpf_fastcall))

which makes LLVM use more registers at the callsite (less spill/fill).

Looking at the patch again. I think it's fine as-is.
fastcall can be a follow up.
* Re: [PATCH bpf-next v6 1/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-21  3:10 ` Alexei Starovoitov
@ 2026-01-21  3:37   ` Menglong Dong
  2026-01-21  4:12   ` Andrii Nakryiko
  1 sibling, 0 replies; 17+ messages in thread

From: Menglong Dong @ 2026-01-21  3:37 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Menglong Dong, Andrii Nakryiko, Alexei Starovoitov, Eduard,
	David S. Miller, David Ahern, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	Network Development, bpf, LKML

On 2026/1/21 11:10 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> On Tue, Jan 20, 2026 at 5:58 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> >
> > On 2026/1/21 09:23 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > On Mon, Jan 19, 2026 at 11:06 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > >
> > > > Inline bpf_get_current_task() and bpf_get_current_task_btf() for x86_64
> > > > to obtain better performance.
> > > >
> > > > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > > > Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> > > > ---
> > > > v5:
> > > > - don't support the !CONFIG_SMP case
> > > >
> > > > v4:
> > > > - handle the !CONFIG_SMP case
> > > >
> > > > v3:
> > > > - implement it in the verifier with BPF_MOV64_PERCPU_REG() instead of in
> > > >   x86_64 JIT.
> > > > ---
> > > > kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
> > > > 1 file changed, 22 insertions(+)
> > > >
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index 9de0ec0c3ed9..c4e2ffadfb1f 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -17739,6 +17739,10 @@ static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> > > > switch (imm) {
> > > > #ifdef CONFIG_X86_64
> > > > case BPF_FUNC_get_smp_processor_id:
> > > > +#ifdef CONFIG_SMP
> > > > + case BPF_FUNC_get_current_task_btf:
> > > > + case BPF_FUNC_get_current_task:
> > > > +#endif
> > >
> > > Does this have to be x86-64 specific inlining? With verifier inlining
> > > and per_cpu instruction support it should theoretically work across
> > > all architectures that do support per-cpu instruction, no?
> > >
> > > Eduard pointed out [0] to me for why we have that x86-64 specific
> > > check. But looking at do_misc_fixups(), we have that early
> > > bpf_jit_inlines_helper_call(insn->imm) check, so if some JIT has a more
> > > performant inlining implementation, we will just do that.
> > >
> > > So it seems like we can just drop all that x86-64 specific logic and
> > > claim all three of these functions as inlinable, no?
> > >
> > > And even more. We can drop the rather confusing
> > > verifier_inlines_helper_call() that duplicates the decision of which
> > > helpers can be inlined or not, and have:
> >
> > The verifier_inlines_helper_call() is confusing, but I think we can't
> > remove the x86-64 check. For example, some architectures don't support
> > BPF_FUNC_get_current_task in either bpf_jit_inlines_helper_call() or
> > verifier_inlines_helper_call(), which means it can't be inlined.
> >
> > > if (env->prog->jit_requested && bpf_jit_supports_percpu_insn()) {
> > >     switch (insn->imm) {
> > >     case BPF_FUNC_get_smp_processor_id:
> > >         ...
> > >         break;
> > >     case BPF_FUNC_get_current_task:
> > >     case BPF_FUNC_get_current_task_btf:
> > >         ...
> > >         break;
> > >     default:
> > >     }
> > >
> > > And the decision about inlining will live in one place.
> > >
> > > Or am I missing some complications?
> >
> > As Alexei said, the implementation of "current" is architecture specific,
> > and the per-cpu variable "current_task" only exists on x86_64.
> >
> > > And with all that, should we mark get_current_task and
> > > get_current_task_btf as __bpf_fastcall?
> >
> > I think it makes sense, and I saw that bpf_get_smp_processor_id does
> > such an operation:
> >
> > const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> > 	[...]
> > 	.allow_fastcall = true,
> > };
> >
> > PS: I'm a little confused about the fastcall. We inline many helpers,
> > but it seems that bpf_get_smp_processor_id is the only one that
> > uses "allow_fastcall". Why? I'd better study harder.
>
> It's
> static __bpf_fastcall __u32 (* const bpf_get_smp_processor_id)(void) =
> (void *) 8;
>
> and
> #define __bpf_fastcall __attribute__((bpf_fastcall))

Ah, I see. It seems that bpf_doc.py does the trick.

> which makes LLVM use more registers at the callsite (less spill/fill).
>
> Looking at the patch again. I think it's fine as-is.
> fastcall can be a follow up.

Okay! Thanks!
Menglong Dong
* Re: [PATCH bpf-next v6 1/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-21  3:10 ` Alexei Starovoitov
  2026-01-21  3:37 ` Menglong Dong
@ 2026-01-21  4:12   ` Andrii Nakryiko
  2026-01-21  4:46     ` Alexei Starovoitov
  1 sibling, 1 reply; 17+ messages in thread

From: Andrii Nakryiko @ 2026-01-21  4:12 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Menglong Dong, Menglong Dong, Alexei Starovoitov, Eduard,
	David S. Miller, David Ahern, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	Network Development, bpf, LKML

On Tue, Jan 20, 2026 at 7:10 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jan 20, 2026 at 5:58 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> >
> > On 2026/1/21 09:23 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > On Mon, Jan 19, 2026 at 11:06 PM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > >
> > > > Inline bpf_get_current_task() and bpf_get_current_task_btf() for x86_64
> > > > to obtain better performance.
> > > >
> > > > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > > > Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> > > > ---
> > > > v5:
> > > > - don't support the !CONFIG_SMP case
> > > >
> > > > v4:
> > > > - handle the !CONFIG_SMP case
> > > >
> > > > v3:
> > > > - implement it in the verifier with BPF_MOV64_PERCPU_REG() instead of in
> > > >   x86_64 JIT.
> > > > ---
> > > > kernel/bpf/verifier.c | 22 ++++++++++++++++++++++
> > > > 1 file changed, 22 insertions(+)
> > > >
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index 9de0ec0c3ed9..c4e2ffadfb1f 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -17739,6 +17739,10 @@ static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> > > > switch (imm) {
> > > > #ifdef CONFIG_X86_64
> > > > case BPF_FUNC_get_smp_processor_id:
> > > > +#ifdef CONFIG_SMP
> > > > + case BPF_FUNC_get_current_task_btf:
> > > > + case BPF_FUNC_get_current_task:
> > > > +#endif
> > >
> > > Does this have to be x86-64 specific inlining? With verifier inlining
> > > and per_cpu instruction support it should theoretically work across
> > > all architectures that do support per-cpu instruction, no?
> > >
> > > Eduard pointed out [0] to me for why we have that x86-64 specific
> > > check. But looking at do_misc_fixups(), we have that early
> > > bpf_jit_inlines_helper_call(insn->imm) check, so if some JIT has a more
> > > performant inlining implementation, we will just do that.
> > >
> > > So it seems like we can just drop all that x86-64 specific logic and
> > > claim all three of these functions as inlinable, no?
> > >
> > > And even more. We can drop the rather confusing
> > > verifier_inlines_helper_call() that duplicates the decision of which
> > > helpers can be inlined or not, and have:
> >
> > The verifier_inlines_helper_call() is confusing, but I think we can't
> > remove the x86-64 check. For example, some architectures don't support
> > BPF_FUNC_get_current_task in either bpf_jit_inlines_helper_call() or
> > verifier_inlines_helper_call(), which means it can't be inlined.
> >
> > > if (env->prog->jit_requested && bpf_jit_supports_percpu_insn()) {
> > >     switch (insn->imm) {
> > >     case BPF_FUNC_get_smp_processor_id:
> > >         ...
> > >         break;
> > >     case BPF_FUNC_get_current_task:
> > >     case BPF_FUNC_get_current_task_btf:
> > >         ...
> > >         break;
> > >     default:
> > >     }
> > >
> > > And the decision about inlining will live in one place.
> > >
> > > Or am I missing some complications?
> >
> > As Alexei said, the implementation of "current" is architecture specific,
> > and the per-cpu variable "current_task" only exists on x86_64.

Ah, ok, that's the complication :)

> >
> > > And with all that, should we mark get_current_task and
> > > get_current_task_btf as __bpf_fastcall?
> >
> > I think it makes sense, and I saw that bpf_get_smp_processor_id does
> > such an operation:
> >
> > const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> > 	[...]
> > 	.allow_fastcall = true,
> > };
> >
> > PS: I'm a little confused about the fastcall. We inline many helpers,
> > but it seems that bpf_get_smp_processor_id is the only one that
> > uses "allow_fastcall". Why? I'd better study harder.
>
> It's
> static __bpf_fastcall __u32 (* const bpf_get_smp_processor_id)(void) =
> (void *) 8;
>
> and
> #define __bpf_fastcall __attribute__((bpf_fastcall))
>
> which makes LLVM use more registers at the callsite (less spill/fill).
>
> Looking at the patch again. I think it's fine as-is.
> fastcall can be a follow up.

Yeah, it's fine as is. But it still seems like
verifier_inlines_helper_call() is an unnecessary extra hop we can
remove (even if it has to stay arch-specific).
* Re: [PATCH bpf-next v6 1/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-21  4:12 ` Andrii Nakryiko
@ 2026-01-21  4:46   ` Alexei Starovoitov
  2026-01-21  6:35     ` Andrii Nakryiko
  0 siblings, 1 reply; 17+ messages in thread

From: Alexei Starovoitov @ 2026-01-21  4:46 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Menglong Dong, Menglong Dong, Alexei Starovoitov, Eduard,
	David S. Miller, David Ahern, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	Network Development, bpf, LKML

On Tue, Jan 20, 2026 at 8:12 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> > Looking at the patch again. I think it's fine as-is.
> > fastcall can be a follow up.
>
> Yeah, it's fine as is. But it still seems like

Thanks!

> verifier_inlines_helper_call() is an unnecessary extra hop we can
> remove (even if it has to stay arch-specific).

I'm not sure that we can, since it's used in two places:
get_call_summary():
	cs->fastcall = fn->allow_fastcall &&
		       (verifier_inlines_helper_call(env, call->imm) ||
			bpf_jit_inlines_helper_call(call->imm));
* Re: [PATCH bpf-next v6 1/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-21  4:46               ` Alexei Starovoitov
@ 2026-01-21  6:35                 ` Andrii Nakryiko
  0 siblings, 0 replies; 17+ messages in thread
From: Andrii Nakryiko @ 2026-01-21 6:35 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Menglong Dong, Menglong Dong, Alexei Starovoitov, Eduard,
	David S. Miller, David Ahern, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	Network Development, bpf, LKML

On Tue, Jan 20, 2026 at 8:46 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jan 20, 2026 at 8:12 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > >
> > > Looking at the patch again. I think it's fine as-is.
> > > fastcall can be a follow up.
> >
> > Yeah, it's fine as is. But it still seems like
>
> Thanks!
>
> > verifier_inlines_helper_call() is an unnecessary extra hop we can
> > remove (even if it has to stay arch-specific).
>
> I'm not sure that we can, since it's used in two places:
> get_call_summary():
>   cs->fastcall = fn->allow_fastcall &&
>	(verifier_inlines_helper_call(env, call->imm) ||
>	 bpf_jit_inlines_helper_call(call->imm));

well then, just ignore me :)

^ permalink raw reply	[flat|nested] 17+ messages in thread
* [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task 2026-01-20 7:05 [PATCH bpf-next v6 0/2] bpf, x86: inline bpf_get_current_task() for x86_64 Menglong Dong 2026-01-20 7:05 ` [PATCH bpf-next v6 1/2] " Menglong Dong @ 2026-01-20 7:05 ` Menglong Dong 2026-01-20 17:52 ` Eduard Zingerman 2026-01-21 1:05 ` Andrii Nakryiko 2026-01-21 4:50 ` [PATCH bpf-next v6 0/2] bpf, x86: inline bpf_get_current_task() for x86_64 patchwork-bot+netdevbpf 2 siblings, 2 replies; 17+ messages in thread From: Menglong Dong @ 2026-01-20 7:05 UTC (permalink / raw) To: ast, eddyz87 Cc: davem, dsahern, daniel, andrii, martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel Add the testcase for the jited inline of bpf_get_current_task(). Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> --- v6: * remove unnecessary 'ifdef' and __description --- .../selftests/bpf/prog_tests/verifier.c | 2 ++ .../selftests/bpf/progs/verifier_jit_inline.c | 20 +++++++++++++++++++ 2 files changed, 22 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/verifier_jit_inline.c diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c index 38c5ba70100c..2ae7b096bd64 100644 --- a/tools/testing/selftests/bpf/prog_tests/verifier.c +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c @@ -111,6 +111,7 @@ #include "verifier_xdp_direct_packet_access.skel.h" #include "verifier_bits_iter.skel.h" #include "verifier_lsm.skel.h" +#include "verifier_jit_inline.skel.h" #include "irq.skel.h" #define MAX_ENTRIES 11 @@ -253,6 +254,7 @@ void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); } void test_verifier_lsm(void) { RUN(verifier_lsm); } void test_irq(void) { RUN(irq); } void test_verifier_mtu(void) { RUN(verifier_mtu); } +void test_verifier_jit_inline(void) { RUN(verifier_jit_inline); } static int init_test_val_map(struct 
bpf_object *obj, char *map_name) { diff --git a/tools/testing/selftests/bpf/progs/verifier_jit_inline.c b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c new file mode 100644 index 000000000000..4ea254063646 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c @@ -0,0 +1,20 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <vmlinux.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +SEC("fentry/bpf_fentry_test1") +__success __retval(0) +__arch_x86_64 +__jited(" addq %gs:{{.*}}, %rax") +__arch_arm64 +__jited(" mrs x7, SP_EL0") +int inline_bpf_get_current_task(void) +{ + bpf_get_current_task(); + + return 0; +} + +char _license[] SEC("license") = "GPL"; -- 2.52.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task
  2026-01-20  7:05 ` [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task Menglong Dong
@ 2026-01-20 17:52   ` Eduard Zingerman
  2026-01-21  1:05   ` Andrii Nakryiko
  1 sibling, 0 replies; 17+ messages in thread
From: Eduard Zingerman @ 2026-01-20 17:52 UTC (permalink / raw)
To: Menglong Dong, ast
Cc: davem, dsahern, daniel, andrii, martin.lau, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, netdev, bpf, linux-kernel

On Tue, 2026-01-20 at 15:05 +0800, Menglong Dong wrote:
> Add the testcase for the jited inline of bpf_get_current_task().
>
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task 2026-01-20 7:05 ` [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task Menglong Dong 2026-01-20 17:52 ` Eduard Zingerman @ 2026-01-21 1:05 ` Andrii Nakryiko 2026-01-21 1:28 ` Menglong Dong 1 sibling, 1 reply; 17+ messages in thread From: Andrii Nakryiko @ 2026-01-21 1:05 UTC (permalink / raw) To: Menglong Dong Cc: ast, eddyz87, davem, dsahern, daniel, andrii, martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel On Mon, Jan 19, 2026 at 11:06 PM Menglong Dong <menglong8.dong@gmail.com> wrote: > > Add the testcase for the jited inline of bpf_get_current_task(). > > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> > --- > v6: > * remove unnecessary 'ifdef' and __description > --- > .../selftests/bpf/prog_tests/verifier.c | 2 ++ > .../selftests/bpf/progs/verifier_jit_inline.c | 20 +++++++++++++++++++ > 2 files changed, 22 insertions(+) > create mode 100644 tools/testing/selftests/bpf/progs/verifier_jit_inline.c > > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c > index 38c5ba70100c..2ae7b096bd64 100644 > --- a/tools/testing/selftests/bpf/prog_tests/verifier.c > +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c > @@ -111,6 +111,7 @@ > #include "verifier_xdp_direct_packet_access.skel.h" > #include "verifier_bits_iter.skel.h" > #include "verifier_lsm.skel.h" > +#include "verifier_jit_inline.skel.h" > #include "irq.skel.h" > > #define MAX_ENTRIES 11 > @@ -253,6 +254,7 @@ void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); } > void test_verifier_lsm(void) { RUN(verifier_lsm); } > void test_irq(void) { RUN(irq); } > void test_verifier_mtu(void) { RUN(verifier_mtu); } > +void test_verifier_jit_inline(void) { RUN(verifier_jit_inline); } > > static int init_test_val_map(struct 
bpf_object *obj, char *map_name) > { > diff --git a/tools/testing/selftests/bpf/progs/verifier_jit_inline.c b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c > new file mode 100644 > index 000000000000..4ea254063646 > --- /dev/null > +++ b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c > @@ -0,0 +1,20 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +#include <vmlinux.h> > +#include <bpf/bpf_helpers.h> > +#include "bpf_misc.h" > + > +SEC("fentry/bpf_fentry_test1") > +__success __retval(0) > +__arch_x86_64 > +__jited(" addq %gs:{{.*}}, %rax") > +__arch_arm64 > +__jited(" mrs x7, SP_EL0") I was confused to see this, as your patch actually implements inlining only on x86-64. And then it turned out that on arm64 we inline this in JIT. But Eduard also noticed that we actually SKIP this test on arm64 because of missing LLVM dependency, so that's not great. So we should do something about silently skipped tests at least... > +int inline_bpf_get_current_task(void) > +{ > + bpf_get_current_task(); > + > + return 0; > +} > + > +char _license[] SEC("license") = "GPL"; > -- > 2.52.0 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task 2026-01-21 1:05 ` Andrii Nakryiko @ 2026-01-21 1:28 ` Menglong Dong 2026-01-21 1:32 ` Eduard Zingerman 0 siblings, 1 reply; 17+ messages in thread From: Menglong Dong @ 2026-01-21 1:28 UTC (permalink / raw) To: Menglong Dong, Andrii Nakryiko Cc: ast, eddyz87, davem, dsahern, daniel, andrii, martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel On 2026/1/21 09:05 Andrii Nakryiko <andrii.nakryiko@gmail.com> write: > On Mon, Jan 19, 2026 at 11:06 PM Menglong Dong <menglong8.dong@gmail.com> wrote: > > > > Add the testcase for the jited inline of bpf_get_current_task(). > > > > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> > > --- > > v6: > > * remove unnecessary 'ifdef' and __description > > --- > > .../selftests/bpf/prog_tests/verifier.c | 2 ++ > > .../selftests/bpf/progs/verifier_jit_inline.c | 20 +++++++++++++++++++ > > 2 files changed, 22 insertions(+) > > create mode 100644 tools/testing/selftests/bpf/progs/verifier_jit_inline.c > > > > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c > > index 38c5ba70100c..2ae7b096bd64 100644 > > --- a/tools/testing/selftests/bpf/prog_tests/verifier.c > > +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c > > @@ -111,6 +111,7 @@ > > #include "verifier_xdp_direct_packet_access.skel.h" > > #include "verifier_bits_iter.skel.h" > > #include "verifier_lsm.skel.h" > > +#include "verifier_jit_inline.skel.h" > > #include "irq.skel.h" > > > > #define MAX_ENTRIES 11 > > @@ -253,6 +254,7 @@ void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); } > > void test_verifier_lsm(void) { RUN(verifier_lsm); } > > void test_irq(void) { RUN(irq); } > > void test_verifier_mtu(void) { RUN(verifier_mtu); } > > +void test_verifier_jit_inline(void) { RUN(verifier_jit_inline); } > > > > 
static int init_test_val_map(struct bpf_object *obj, char *map_name) > > { > > diff --git a/tools/testing/selftests/bpf/progs/verifier_jit_inline.c b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c > > new file mode 100644 > > index 000000000000..4ea254063646 > > --- /dev/null > > +++ b/tools/testing/selftests/bpf/progs/verifier_jit_inline.c > > @@ -0,0 +1,20 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > + > > +#include <vmlinux.h> > > +#include <bpf/bpf_helpers.h> > > +#include "bpf_misc.h" > > + > > +SEC("fentry/bpf_fentry_test1") > > +__success __retval(0) > > +__arch_x86_64 > > +__jited(" addq %gs:{{.*}}, %rax") > > +__arch_arm64 > > +__jited(" mrs x7, SP_EL0") > > I was confused to see this, as your patch actually implements inlining > only on x86-64. And then it turned out that on arm64 we inline this in Yeah, the arm64 implemented it already. And I add the testing for it BTW. > JIT. But Eduard also noticed that we actually SKIP this test on arm64 > because of missing LLVM dependency, so that's not great. Do you mean that the CI of arm64 doesn't use LLVM for the selftests? I noted that. I found that there are other similar "__jited" testings for arm64, is there anything we can do? PS: I tested the arm64 locally, and it works fine. > > So we should do something about silently skipped tests at least... Like a warning? Thanks! Menglong Dong > > > +int inline_bpf_get_current_task(void) > > +{ > > + bpf_get_current_task(); > > + > > + return 0; > > +} > > + > > +char _license[] SEC("license") = "GPL"; > > -- > > 2.52.0 > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task
  2026-01-21  1:28           ` Menglong Dong
@ 2026-01-21  1:32             ` Eduard Zingerman
  2026-01-21  3:03               ` Menglong Dong
  0 siblings, 1 reply; 17+ messages in thread
From: Eduard Zingerman @ 2026-01-21 1:32 UTC (permalink / raw)
To: Menglong Dong, Menglong Dong, Andrii Nakryiko
Cc: ast, davem, dsahern, daniel, andrii, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
	mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel

On Wed, 2026-01-21 at 09:28 +0800, Menglong Dong wrote:

[...]

> Do you mean that the CI of arm64 doesn't use LLVM for the selftests?
> I noted that. I found that there are other similar "__jited" testings for
> arm64, is there anything we can do?
>
> PS: I tested the arm64 locally, and it works fine.
>
> >
> > So we should do something about silently skipped tests at least...
>
> Like a warning?

Yes, probably llvm-devel or libs dependency is missing,
hence jit related selftests are skipped. Same thing for x86.
Discussed with Andrii making llvm an opt-out dependency:
fail selftests compilation if libraries are not found and SKIP_LLVM is not set.
We plan to address CI config issue tomorrow.

[...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task
  2026-01-21  1:32             ` Eduard Zingerman
@ 2026-01-21  3:03               ` Menglong Dong
  0 siblings, 0 replies; 17+ messages in thread
From: Menglong Dong @ 2026-01-21 3:03 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Menglong Dong, Andrii Nakryiko, ast, davem, dsahern, daniel,
	andrii, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa,
	netdev, bpf, linux-kernel

On Wed, Jan 21, 2026 at 9:32 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2026-01-21 at 09:28 +0800, Menglong Dong wrote:
>
> [...]
>
> > Do you mean that the CI of arm64 doesn't use LLVM for the selftests?
> > I noted that. I found that there are other similar "__jited" testings for
> > arm64, is there anything we can do?
> >
> > PS: I tested the arm64 locally, and it works fine.
> >
> > >
> > > So we should do something about silently skipped tests at least...
> >
> > Like a warning?
>
> Yes, probably llvm-devel or libs dependency is missing,
> hence jit related selftests are skipped. Same thing for x86.
> Discussed with Andrii making llvm an opt-out dependency:
> fail selftests compilation if libraries are not found and SKIP_LLVM is not set.

Sounds nice. People may not be aware of the LLVM dependence sometimes.

So is there anything I can do in this series?

Thanks!
Menglong Dong

> We plan to address CI config issue tomorrow.
>
> [...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v6 0/2] bpf, x86: inline bpf_get_current_task() for x86_64
  2026-01-20  7:05 [PATCH bpf-next v6 0/2] bpf, x86: inline bpf_get_current_task() for x86_64 Menglong Dong
  2026-01-20  7:05 ` [PATCH bpf-next v6 1/2] " Menglong Dong
  2026-01-20  7:05 ` [PATCH bpf-next v6 2/2] selftests/bpf: test the jited inline of bpf_get_current_task Menglong Dong
@ 2026-01-21  4:50 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 17+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-01-21 4:50 UTC (permalink / raw)
To: Menglong Dong
Cc: ast, eddyz87, davem, dsahern, daniel, andrii, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
	mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Tue, 20 Jan 2026 15:05:53 +0800 you wrote:
> Inline bpf_get_current_task() and bpf_get_current_task_btf() for x86_64
> to obtain better performance, and add the testcase for it.
>
> Changes since v5:
> * remove unnecessary 'ifdef' and __description in the selftests
> * v5: https://lore.kernel.org/bpf/20260119070246.249499-1-dongml2@chinatelecom.cn/
>
> [...]

Here is the summary with links:
  - [bpf-next,v6,1/2] bpf, x86: inline bpf_get_current_task() for x86_64
    https://git.kernel.org/bpf/bpf-next/c/eaedea154eb9
  - [bpf-next,v6,2/2] selftests/bpf: test the jited inline of bpf_get_current_task
    https://git.kernel.org/bpf/bpf-next/c/4fca95095cdc

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

^ permalink raw reply	[flat|nested] 17+ messages in thread