* [bpf-next 1/6] bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store instructions
2025-08-05 6:27 [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Saket Kumar Bhaskar
@ 2025-08-05 6:27 ` Saket Kumar Bhaskar
2025-08-05 7:34 ` Christophe Leroy
2025-08-05 6:27 ` [bpf-next 2/6] bpf,powerpc: Implement PROBE_MEM32 pseudo instructions Saket Kumar Bhaskar
` (6 subsequent siblings)
7 siblings, 1 reply; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-05 6:27 UTC (permalink / raw)
To: bpf, linuxppc-dev, linux-kselftest, linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, christophe.leroy, naveen, maddy, mpe, npiggin,
memxor, iii, shuah
bpf_jit_emit_probe_mem_store() is introduced to emit instructions for
storing memory values depending on the size (byte, halfword,
word, doubleword).
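On ppc64, std/ld are DS-form instructions whose 16-bit displacement must be
a multiple of 4, which is why the BPF_DW case below falls back to the
indexed stdx form when the offset is unaligned; stb/sth/stw are D-form and
take any 16-bit offset. As a rough user-space model of the size dispatch
(illustrative only, not part of the patch; BPF_SIZE() and the size
constants come from the UAPI header linux/bpf_common.h):
#include <linux/bpf_common.h>	/* BPF_SIZE, BPF_B, BPF_H, BPF_W, BPF_DW */
/* Illustrative model: width in bytes of the store selected by 'code'. */
static int store_width(unsigned int code)
{
	switch (BPF_SIZE(code)) {
	case BPF_B:
		return 1;
	case BPF_H:
		return 2;
	case BPF_W:
		return 4;
	case BPF_DW:
		return 8;
	default:
		return -1;	/* unreachable: BPF_SIZE() masks to the above */
	}
}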
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
---
arch/powerpc/net/bpf_jit_comp64.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 025524378443..489de21fe3d6 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -409,6 +409,36 @@ asm (
" blr ;"
);
+static int bpf_jit_emit_probe_mem_store(struct codegen_context *ctx, u32 src_reg, s16 off,
+ u32 code, u32 *image)
+{
+ u32 tmp1_reg = bpf_to_ppc(TMP_REG_1);
+ u32 tmp2_reg = bpf_to_ppc(TMP_REG_2);
+
+ switch (BPF_SIZE(code)) {
+ case BPF_B:
+ EMIT(PPC_RAW_STB(src_reg, tmp1_reg, off));
+ break;
+ case BPF_H:
+ EMIT(PPC_RAW_STH(src_reg, tmp1_reg, off));
+ break;
+ case BPF_W:
+ EMIT(PPC_RAW_STW(src_reg, tmp1_reg, off));
+ break;
+ case BPF_DW:
+ if (off % 4) {
+ EMIT(PPC_RAW_LI(tmp2_reg, off));
+ EMIT(PPC_RAW_STDX(src_reg, tmp1_reg, tmp2_reg));
+ } else {
+ EMIT(PPC_RAW_STD(src_reg, tmp1_reg, off));
+ }
+ break;
+ default:
+ return -EINVAL;
+ }
+ return 0;
+}
+
static int emit_atomic_ld_st(const struct bpf_insn insn, struct codegen_context *ctx, u32 *image)
{
u32 code = insn.code;
--
2.43.5
* Re: [bpf-next 1/6] bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store instructions
2025-08-05 6:27 ` [bpf-next 1/6] bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store instructions Saket Kumar Bhaskar
@ 2025-08-05 7:34 ` Christophe Leroy
2025-08-05 11:59 ` Venkat Rao Bagalkote
0 siblings, 1 reply; 23+ messages in thread
From: Christophe Leroy @ 2025-08-05 7:34 UTC (permalink / raw)
To: Saket Kumar Bhaskar, bpf, linuxppc-dev, linux-kselftest,
linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii,
shuah
On 05/08/2025 at 08:27, Saket Kumar Bhaskar wrote:
> bpf_jit_emit_probe_mem_store() is introduced to emit instructions for
> storing memory values depending on the size (byte, halfword,
> word, doubleword).
Build breaks with this patch
CC arch/powerpc/net/bpf_jit_comp64.o
arch/powerpc/net/bpf_jit_comp64.c:395:12: error:
'bpf_jit_emit_probe_mem_store' defined but not used
[-Werror=unused-function]
static int bpf_jit_emit_probe_mem_store(struct codegen_context *ctx,
u32 src_reg, s16 off,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[4]: *** [scripts/Makefile.build:287:
arch/powerpc/net/bpf_jit_comp64.o] Error 1
>
> Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
> ---
> arch/powerpc/net/bpf_jit_comp64.c | 30 ++++++++++++++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 025524378443..489de21fe3d6 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -409,6 +409,36 @@ asm (
> " blr ;"
> );
>
> +static int bpf_jit_emit_probe_mem_store(struct codegen_context *ctx, u32 src_reg, s16 off,
> + u32 code, u32 *image)
> +{
> + u32 tmp1_reg = bpf_to_ppc(TMP_REG_1);
> + u32 tmp2_reg = bpf_to_ppc(TMP_REG_2);
> +
> + switch (BPF_SIZE(code)) {
> + case BPF_B:
> + EMIT(PPC_RAW_STB(src_reg, tmp1_reg, off));
> + break;
> + case BPF_H:
> + EMIT(PPC_RAW_STH(src_reg, tmp1_reg, off));
> + break;
> + case BPF_W:
> + EMIT(PPC_RAW_STW(src_reg, tmp1_reg, off));
> + break;
> + case BPF_DW:
> + if (off % 4) {
> + EMIT(PPC_RAW_LI(tmp2_reg, off));
> + EMIT(PPC_RAW_STDX(src_reg, tmp1_reg, tmp2_reg));
> + } else {
> + EMIT(PPC_RAW_STD(src_reg, tmp1_reg, off));
> + }
> + break;
> + default:
> + return -EINVAL;
> + }
> + return 0;
> +}
> +
> static int emit_atomic_ld_st(const struct bpf_insn insn, struct codegen_context *ctx, u32 *image)
> {
> u32 code = insn.code;
* Re: [bpf-next 1/6] bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store instructions
2025-08-05 7:34 ` Christophe Leroy
@ 2025-08-05 11:59 ` Venkat Rao Bagalkote
2025-08-06 6:59 ` Christophe Leroy
0 siblings, 1 reply; 23+ messages in thread
From: Venkat Rao Bagalkote @ 2025-08-05 11:59 UTC (permalink / raw)
To: Christophe Leroy, Saket Kumar Bhaskar, bpf, linuxppc-dev,
linux-kselftest, linux-kernel
Cc: hbathini, sachinpb, andrii, eddyz87, mykolal, ast, daniel,
martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii, shuah
On 05/08/25 1:04 pm, Christophe Leroy wrote:
>
>
> On 05/08/2025 at 08:27, Saket Kumar Bhaskar wrote:
>> bpf_jit_emit_probe_mem_store() is introduced to emit instructions for
>> storing memory values depending on the size (byte, halfword,
>> word, doubleword).
>
> Build breaks with this patch
>
> CC arch/powerpc/net/bpf_jit_comp64.o
> arch/powerpc/net/bpf_jit_comp64.c:395:12: error:
> 'bpf_jit_emit_probe_mem_store' defined but not used
> [-Werror=unused-function]
> static int bpf_jit_emit_probe_mem_store(struct codegen_context *ctx,
> u32 src_reg, s16 off,
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> cc1: all warnings being treated as errors
> make[4]: *** [scripts/Makefile.build:287:
> arch/powerpc/net/bpf_jit_comp64.o] Error 1
>
I tried this on top of bpf-next, and for me the build passed.
Note: I applied
https://lore.kernel.org/bpf/20250717202935.29018-2-puranjay@kernel.org/
before applying current patch.
gcc version 14.2.1 20250110
uname -r: 6.16.0-gf2844c7fdb07
bpf-next repo:
https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next
HEAD:
commit f3af62b6cee8af9f07012051874af2d2a451f0e5 (origin/master, origin/HEAD)
Author: Tao Chen <chen.dylane@linux.dev>
Date: Wed Jul 23 22:44:42 2025 +0800
bpftool: Add bash completion for token argument
Build Success logs:
TEST-OBJ [test_progs-cpuv4] xdp_vlan.test.o
TEST-OBJ [test_progs-cpuv4] xdpwall.test.o
TEST-OBJ [test_progs-cpuv4] xfrm_info.test.o
BINARY bench
BINARY test_maps
BINARY test_progs
BINARY test_progs-no_alu32
BINARY test_progs-cpuv4
Regards,
Venkat.
>
>>
>> Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
>> ---
>> arch/powerpc/net/bpf_jit_comp64.c | 30 ++++++++++++++++++++++++++++++
>> 1 file changed, 30 insertions(+)
>>
>> diff --git a/arch/powerpc/net/bpf_jit_comp64.c
>> b/arch/powerpc/net/bpf_jit_comp64.c
>> index 025524378443..489de21fe3d6 100644
>> --- a/arch/powerpc/net/bpf_jit_comp64.c
>> +++ b/arch/powerpc/net/bpf_jit_comp64.c
>> @@ -409,6 +409,36 @@ asm (
>> " blr ;"
>> );
>> +static int bpf_jit_emit_probe_mem_store(struct codegen_context
>> *ctx, u32 src_reg, s16 off,
>> + u32 code, u32 *image)
>> +{
>> + u32 tmp1_reg = bpf_to_ppc(TMP_REG_1);
>> + u32 tmp2_reg = bpf_to_ppc(TMP_REG_2);
>> +
>> + switch (BPF_SIZE(code)) {
>> + case BPF_B:
>> + EMIT(PPC_RAW_STB(src_reg, tmp1_reg, off));
>> + break;
>> + case BPF_H:
>> + EMIT(PPC_RAW_STH(src_reg, tmp1_reg, off));
>> + break;
>> + case BPF_W:
>> + EMIT(PPC_RAW_STW(src_reg, tmp1_reg, off));
>> + break;
>> + case BPF_DW:
>> + if (off % 4) {
>> + EMIT(PPC_RAW_LI(tmp2_reg, off));
>> + EMIT(PPC_RAW_STDX(src_reg, tmp1_reg, tmp2_reg));
>> + } else {
>> + EMIT(PPC_RAW_STD(src_reg, tmp1_reg, off));
>> + }
>> + break;
>> + default:
>> + return -EINVAL;
>> + }
>> + return 0;
>> +}
>> +
>> static int emit_atomic_ld_st(const struct bpf_insn insn, struct
>> codegen_context *ctx, u32 *image)
>> {
>> u32 code = insn.code;
>
* Re: [bpf-next 1/6] bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store instructions
2025-08-05 11:59 ` Venkat Rao Bagalkote
@ 2025-08-06 6:59 ` Christophe Leroy
2025-08-07 10:29 ` Saket Kumar Bhaskar
0 siblings, 1 reply; 23+ messages in thread
From: Christophe Leroy @ 2025-08-06 6:59 UTC (permalink / raw)
To: Venkat Rao Bagalkote, Saket Kumar Bhaskar, bpf, linuxppc-dev,
linux-kselftest, linux-kernel
Cc: hbathini, sachinpb, andrii, eddyz87, mykolal, ast, daniel,
martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii, shuah
On 05/08/2025 at 13:59, Venkat Rao Bagalkote wrote:
>
> On 05/08/25 1:04 pm, Christophe Leroy wrote:
>>
>>
>> On 05/08/2025 at 08:27, Saket Kumar Bhaskar wrote:
>>> bpf_jit_emit_probe_mem_store() is introduced to emit instructions for
>>> storing memory values depending on the size (byte, halfword,
>>> word, doubleword).
>>
>> Build breaks with this patch
>>
>> CC arch/powerpc/net/bpf_jit_comp64.o
>> arch/powerpc/net/bpf_jit_comp64.c:395:12: error:
>> 'bpf_jit_emit_probe_mem_store' defined but not used [-Werror=unused-
>> function]
>> static int bpf_jit_emit_probe_mem_store(struct codegen_context *ctx,
>> u32 src_reg, s16 off,
>> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> cc1: all warnings being treated as errors
>> make[4]: *** [scripts/Makefile.build:287: arch/powerpc/net/
>> bpf_jit_comp64.o] Error 1
>>
> I tried this on top of bpf-next, and for me the build passed.
Build of _this_ patch (alone) passed?
This patch defines a static function but doesn't use it, so the build
must break because of that, unless you have set CONFIG_PPC_DISABLE_WERROR.
The following patch starts using this function, so the build no longer
breaks. But until that next patch is applied, the build doesn't work.
Both patches have to be squashed together in order to not break
bisectability of the kernel.
Christophe
>
> Note: I applied
> https://lore.kernel.org/bpf/20250717202935.29018-2-puranjay@kernel.org/
> before applying current patch.
>
> gcc version 14.2.1 20250110
>
> uname -r: 6.16.0-gf2844c7fdb07
>
> bpf-next repo:
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next
>
> HEAD:
>
> commit f3af62b6cee8af9f07012051874af2d2a451f0e5 (origin/master, origin/
> HEAD)
> Author: Tao Chen <chen.dylane@linux.dev>
> Date: Wed Jul 23 22:44:42 2025 +0800
>
> bpftool: Add bash completion for token argument
>
>
> Build Success logs:
>
> TEST-OBJ [test_progs-cpuv4] xdp_vlan.test.o
> TEST-OBJ [test_progs-cpuv4] xdpwall.test.o
> TEST-OBJ [test_progs-cpuv4] xfrm_info.test.o
> BINARY bench
> BINARY test_maps
> BINARY test_progs
> BINARY test_progs-no_alu32
> BINARY test_progs-cpuv4
>
>
> Regards,
>
> Venkat.
>
>>
>>>
>>> Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
>>> ---
>>> arch/powerpc/net/bpf_jit_comp64.c | 30 ++++++++++++++++++++++++++++++
>>> 1 file changed, 30 insertions(+)
>>>
>>> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/
>>> bpf_jit_comp64.c
>>> index 025524378443..489de21fe3d6 100644
>>> --- a/arch/powerpc/net/bpf_jit_comp64.c
>>> +++ b/arch/powerpc/net/bpf_jit_comp64.c
>>> @@ -409,6 +409,36 @@ asm (
>>> " blr ;"
>>> );
>>> +static int bpf_jit_emit_probe_mem_store(struct codegen_context
>>> *ctx, u32 src_reg, s16 off,
>>> + u32 code, u32 *image)
>>> +{
>>> + u32 tmp1_reg = bpf_to_ppc(TMP_REG_1);
>>> + u32 tmp2_reg = bpf_to_ppc(TMP_REG_2);
>>> +
>>> + switch (BPF_SIZE(code)) {
>>> + case BPF_B:
>>> + EMIT(PPC_RAW_STB(src_reg, tmp1_reg, off));
>>> + break;
>>> + case BPF_H:
>>> + EMIT(PPC_RAW_STH(src_reg, tmp1_reg, off));
>>> + break;
>>> + case BPF_W:
>>> + EMIT(PPC_RAW_STW(src_reg, tmp1_reg, off));
>>> + break;
>>> + case BPF_DW:
>>> + if (off % 4) {
>>> + EMIT(PPC_RAW_LI(tmp2_reg, off));
>>> + EMIT(PPC_RAW_STDX(src_reg, tmp1_reg, tmp2_reg));
>>> + } else {
>>> + EMIT(PPC_RAW_STD(src_reg, tmp1_reg, off));
>>> + }
>>> + break;
>>> + default:
>>> + return -EINVAL;
>>> + }
>>> + return 0;
>>> +}
>>> +
>>> static int emit_atomic_ld_st(const struct bpf_insn insn, struct
>>> codegen_context *ctx, u32 *image)
>>> {
>>> u32 code = insn.code;
>>
* Re: [bpf-next 1/6] bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store instructions
2025-08-06 6:59 ` Christophe Leroy
@ 2025-08-07 10:29 ` Saket Kumar Bhaskar
0 siblings, 0 replies; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-07 10:29 UTC (permalink / raw)
To: Christophe Leroy
Cc: Venkat Rao Bagalkote, bpf, linuxppc-dev, linux-kselftest,
linux-kernel, hbathini, sachinpb, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii,
shuah
On Wed, Aug 06, 2025 at 08:59:59AM +0200, Christophe Leroy wrote:
>
>
> On 05/08/2025 at 13:59, Venkat Rao Bagalkote wrote:
> >
> > On 05/08/25 1:04 pm, Christophe Leroy wrote:
> > >
> > >
> > > On 05/08/2025 at 08:27, Saket Kumar Bhaskar wrote:
> > > > bpf_jit_emit_probe_mem_store() is introduced to emit instructions for
> > > > storing memory values depending on the size (byte, halfword,
> > > > word, doubleword).
> > >
> > > Build breaks with this patch
> > >
> > > CC arch/powerpc/net/bpf_jit_comp64.o
> > > arch/powerpc/net/bpf_jit_comp64.c:395:12: error:
> > > 'bpf_jit_emit_probe_mem_store' defined but not used [-Werror=unused-
> > > function]
> > > static int bpf_jit_emit_probe_mem_store(struct codegen_context
> > > *ctx, u32 src_reg, s16 off,
> > > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > cc1: all warnings being treated as errors
> > > make[4]: *** [scripts/Makefile.build:287: arch/powerpc/net/
> > > bpf_jit_comp64.o] Error 1
> > >
> > I tried this on top of bpf-next, and for me the build passed.
>
> Build of _this_ patch (alone) passed?
>
> This patch defines a static function but doesn't use it, so the build must
> break because of that, unless you have set CONFIG_PPC_DISABLE_WERROR.
>
> The following patch starts using this function, so the build no longer
> breaks. But until that next patch is applied, the build doesn't work. Both
> patches have to be squashed together in order to not break bisectability of
> the kernel.
>
> Christophe
>
Got it, Chris, will squash both patches together in v2.
> >
> > Note: I applied
> > https://lore.kernel.org/bpf/20250717202935.29018-2-puranjay@kernel.org/
> > before applying current patch.
> >
> > gcc version 14.2.1 20250110
> >
> > uname -r: 6.16.0-gf2844c7fdb07
> >
> > bpf-next repo: https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next
> >
> > HEAD:
> >
> > commit f3af62b6cee8af9f07012051874af2d2a451f0e5 (origin/master, origin/
> > HEAD)
> > Author: Tao Chen <chen.dylane@linux.dev>
> > Date: Wed Jul 23 22:44:42 2025 +0800
> >
> > bpftool: Add bash completion for token argument
> >
> >
> > Build Success logs:
> >
> > TEST-OBJ [test_progs-cpuv4] xdp_vlan.test.o
> > TEST-OBJ [test_progs-cpuv4] xdpwall.test.o
> > TEST-OBJ [test_progs-cpuv4] xfrm_info.test.o
> > BINARY bench
> > BINARY test_maps
> > BINARY test_progs
> > BINARY test_progs-no_alu32
> > BINARY test_progs-cpuv4
> >
> >
> > Regards,
> >
> > Venkat.
> >
> > >
> > > >
> > > > Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
> > > > ---
> > > > arch/powerpc/net/bpf_jit_comp64.c | 30 ++++++++++++++++++++++++++++++
> > > > 1 file changed, 30 insertions(+)
> > > >
> > > > diff --git a/arch/powerpc/net/bpf_jit_comp64.c
> > > > b/arch/powerpc/net/ bpf_jit_comp64.c
> > > > index 025524378443..489de21fe3d6 100644
> > > > --- a/arch/powerpc/net/bpf_jit_comp64.c
> > > > +++ b/arch/powerpc/net/bpf_jit_comp64.c
> > > > @@ -409,6 +409,36 @@ asm (
> > > > " blr ;"
> > > > );
> > > > +static int bpf_jit_emit_probe_mem_store(struct
> > > > codegen_context *ctx, u32 src_reg, s16 off,
> > > > + u32 code, u32 *image)
> > > > +{
> > > > + u32 tmp1_reg = bpf_to_ppc(TMP_REG_1);
> > > > + u32 tmp2_reg = bpf_to_ppc(TMP_REG_2);
> > > > +
> > > > + switch (BPF_SIZE(code)) {
> > > > + case BPF_B:
> > > > + EMIT(PPC_RAW_STB(src_reg, tmp1_reg, off));
> > > > + break;
> > > > + case BPF_H:
> > > > + EMIT(PPC_RAW_STH(src_reg, tmp1_reg, off));
> > > > + break;
> > > > + case BPF_W:
> > > > + EMIT(PPC_RAW_STW(src_reg, tmp1_reg, off));
> > > > + break;
> > > > + case BPF_DW:
> > > > + if (off % 4) {
> > > > + EMIT(PPC_RAW_LI(tmp2_reg, off));
> > > > + EMIT(PPC_RAW_STDX(src_reg, tmp1_reg, tmp2_reg));
> > > > + } else {
> > > > + EMIT(PPC_RAW_STD(src_reg, tmp1_reg, off));
> > > > + }
> > > > + break;
> > > > + default:
> > > > + return -EINVAL;
> > > > + }
> > > > + return 0;
> > > > +}
> > > > +
> > > > static int emit_atomic_ld_st(const struct bpf_insn insn,
> > > > struct codegen_context *ctx, u32 *image)
> > > > {
> > > > u32 code = insn.code;
> > >
>
* [bpf-next 2/6] bpf,powerpc: Implement PROBE_MEM32 pseudo instructions
2025-08-05 6:27 [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Saket Kumar Bhaskar
2025-08-05 6:27 ` [bpf-next 1/6] bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store instructions Saket Kumar Bhaskar
@ 2025-08-05 6:27 ` Saket Kumar Bhaskar
2025-08-05 7:41 ` Christophe Leroy
2025-08-14 8:54 ` Hari Bathini
2025-08-05 6:27 ` [bpf-next 3/6] bpf,powerpc: Implement bpf_addr_space_cast instruction Saket Kumar Bhaskar
` (5 subsequent siblings)
7 siblings, 2 replies; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-05 6:27 UTC (permalink / raw)
To: bpf, linuxppc-dev, linux-kselftest, linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, christophe.leroy, naveen, maddy, mpe, npiggin,
memxor, iii, shuah
Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW]
instructions. They are similar to PROBE_MEM instructions with the
following differences:
- PROBE_MEM32 supports store.
- PROBE_MEM32 relies on the verifier to clear upper 32-bit of the
src/dst register
- PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in _R26
in the prologue). Due to bpf_arena constructions such _R26 + reg +
off16 access is guaranteed to be within arena virtual range, so no
address check at run-time.
- PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When
LDX faults the destination register is zeroed.
To support these on powerpc, we do tmp1 = _R26 + src/dst reg and then use
tmp1 as the new src/dst register. This allows us to reuse most of the
code for normal [LDX | STX | ST].
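A minimal sketch of the effective address such a sequence computes
(illustrative user-space model, not the JIT code; kern_vm_start stands in
for the value preloaded into _R26):
/*
 * The verifier has already cleared the upper 32 bits of the BPF register,
 * so the arena access reduces to base + 32-bit register value +
 * sign-extended 16-bit displacement, matching "add tmp1, reg, r26"
 * followed by a load/store with displacement 'off'.
 */
static unsigned long long arena_ea(unsigned long long kern_vm_start,
				   unsigned int reg32, short off)
{
	return kern_vm_start + reg32 + (long long)off;
}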
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
---
arch/powerpc/net/bpf_jit.h | 5 +-
arch/powerpc/net/bpf_jit_comp.c | 10 ++-
arch/powerpc/net/bpf_jit_comp32.c | 2 +-
arch/powerpc/net/bpf_jit_comp64.c | 108 ++++++++++++++++++++++++++++--
4 files changed, 114 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 4c26912c2e3c..2d095a873305 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -161,9 +161,10 @@ struct codegen_context {
unsigned int seen;
unsigned int idx;
unsigned int stack_size;
- int b2p[MAX_BPF_JIT_REG + 2];
+ int b2p[MAX_BPF_JIT_REG + 3];
unsigned int exentry_idx;
unsigned int alt_exit_addr;
+ u64 arena_vm_start;
};
#define bpf_to_ppc(r) (ctx->b2p[r])
@@ -201,7 +202,7 @@ int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg,
int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass,
struct codegen_context *ctx, int insn_idx,
- int jmp_off, int dst_reg);
+ int jmp_off, int dst_reg, u32 code);
#endif
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index c0684733e9d6..35bfdf4d8785 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -204,6 +204,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
/* Make sure that the stack is quadword aligned. */
cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
+ cgctx.arena_vm_start = bpf_arena_get_kern_vm_start(fp->aux->arena);
/* Scouting faux-generate pass 0 */
if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false)) {
@@ -326,7 +327,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
*/
int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass,
struct codegen_context *ctx, int insn_idx, int jmp_off,
- int dst_reg)
+ int dst_reg, u32 code)
{
off_t offset;
unsigned long pc;
@@ -354,7 +355,12 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass
(fp->aux->num_exentries * BPF_FIXUP_LEN * 4) +
(ctx->exentry_idx * BPF_FIXUP_LEN * 4);
- fixup[0] = PPC_RAW_LI(dst_reg, 0);
+ if ((BPF_CLASS(code) == BPF_LDX && BPF_MODE(code) == BPF_PROBE_MEM32) ||
+ (BPF_CLASS(code) == BPF_LDX && BPF_MODE(code) == BPF_PROBE_MEM))
+ fixup[0] = PPC_RAW_LI(dst_reg, 0);
+ else if (BPF_CLASS(code) == BPF_ST || BPF_CLASS(code) == BPF_STX)
+ fixup[0] = PPC_RAW_NOP();
+
if (IS_ENABLED(CONFIG_PPC32))
fixup[1] = PPC_RAW_LI(dst_reg - 1, 0); /* clear higher 32-bit register too */
diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 0aace304dfe1..3087e744fb25 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -1087,7 +1087,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
}
ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx, insn_idx,
- jmp_off, dst_reg);
+ jmp_off, dst_reg, code);
if (ret)
return ret;
}
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 489de21fe3d6..16e62766c757 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -44,6 +44,7 @@
/* BPF register usage */
#define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
#define TMP_REG_2 (MAX_BPF_JIT_REG + 1)
+#define ARENA_VM_START (MAX_BPF_JIT_REG + 2)
/* BPF to ppc register mappings */
void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
@@ -61,6 +62,8 @@ void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
ctx->b2p[BPF_REG_7] = _R28;
ctx->b2p[BPF_REG_8] = _R29;
ctx->b2p[BPF_REG_9] = _R30;
+ /* non volatile register for kern_vm_start address */
+ ctx->b2p[ARENA_VM_START] = _R26;
/* frame pointer aka BPF_REG_10 */
ctx->b2p[BPF_REG_FP] = _R31;
/* eBPF jit internal registers */
@@ -69,8 +72,8 @@ void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
ctx->b2p[TMP_REG_2] = _R10;
}
-/* PPC NVR range -- update this if we ever use NVRs below r27 */
-#define BPF_PPC_NVR_MIN _R27
+/* PPC NVR range -- update this if we ever use NVRs below r26 */
+#define BPF_PPC_NVR_MIN _R26
static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
{
@@ -170,10 +173,17 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
if (bpf_is_seen_register(ctx, bpf_to_ppc(i)))
EMIT(PPC_RAW_STD(bpf_to_ppc(i), _R1, bpf_jit_stack_offsetof(ctx, bpf_to_ppc(i))));
+ if (ctx->arena_vm_start)
+ EMIT(PPC_RAW_STD(bpf_to_ppc(ARENA_VM_START), _R1,
+ bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ARENA_VM_START))));
+
/* Setup frame pointer to point to the bpf stack area */
if (bpf_is_seen_register(ctx, bpf_to_ppc(BPF_REG_FP)))
EMIT(PPC_RAW_ADDI(bpf_to_ppc(BPF_REG_FP), _R1,
STACK_FRAME_MIN_SIZE + ctx->stack_size));
+
+ if (ctx->arena_vm_start)
+ PPC_LI64(bpf_to_ppc(ARENA_VM_START), ctx->arena_vm_start);
}
static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx)
@@ -185,6 +195,10 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx
if (bpf_is_seen_register(ctx, bpf_to_ppc(i)))
EMIT(PPC_RAW_LD(bpf_to_ppc(i), _R1, bpf_jit_stack_offsetof(ctx, bpf_to_ppc(i))));
+ if (ctx->arena_vm_start)
+ EMIT(PPC_RAW_LD(bpf_to_ppc(ARENA_VM_START), _R1,
+ bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ARENA_VM_START))));
+
/* Tear down our stack frame */
if (bpf_has_stack_frame(ctx)) {
EMIT(PPC_RAW_ADDI(_R1, _R1, BPF_PPC_STACKFRAME + ctx->stack_size));
@@ -990,6 +1004,50 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
}
break;
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_B:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_H:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_W:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_DW:
+
+ EMIT(PPC_RAW_ADD(tmp1_reg, dst_reg, bpf_to_ppc(ARENA_VM_START)));
+
+ ret = bpf_jit_emit_probe_mem_store(ctx, src_reg, off, code, image);
+ if (ret)
+ return ret;
+
+ ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
+ ctx->idx - 1, 4, -1, code);
+ if (ret)
+ return ret;
+
+ break;
+
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_H:
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_W:
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_DW:
+
+ EMIT(PPC_RAW_ADD(tmp1_reg, dst_reg, bpf_to_ppc(ARENA_VM_START)));
+
+ if (BPF_SIZE(code) == BPF_W || BPF_SIZE(code) == BPF_DW) {
+ PPC_LI32(tmp2_reg, imm);
+ src_reg = tmp2_reg;
+ } else {
+ EMIT(PPC_RAW_LI(tmp2_reg, imm));
+ src_reg = tmp2_reg;
+ }
+
+ ret = bpf_jit_emit_probe_mem_store(ctx, src_reg, off, code, image);
+ if (ret)
+ return ret;
+
+ ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
+ ctx->idx - 1, 4, -1, code);
+ if (ret)
+ return ret;
+
+ break;
+
/*
* BPF_STX ATOMIC (atomic ops)
*/
@@ -1142,9 +1200,10 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
* Check if 'off' is word aligned for BPF_DW, because
* we might generate two instructions.
*/
- if ((BPF_SIZE(code) == BPF_DW ||
- (BPF_SIZE(code) == BPF_B && BPF_MODE(code) == BPF_PROBE_MEMSX)) &&
- (off & 3))
+ if ((BPF_SIZE(code) == BPF_DW && (off & 3)) ||
+ (BPF_SIZE(code) == BPF_B &&
+ BPF_MODE(code) == BPF_PROBE_MEMSX) ||
+ (BPF_SIZE(code) == BPF_B && BPF_MODE(code) == BPF_MEMSX))
PPC_JMP((ctx->idx + 3) * 4);
else
PPC_JMP((ctx->idx + 2) * 4);
@@ -1190,12 +1249,49 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
if (BPF_MODE(code) == BPF_PROBE_MEM) {
ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
- ctx->idx - 1, 4, dst_reg);
+ ctx->idx - 1, 4, dst_reg, code);
if (ret)
return ret;
}
break;
+ /* dst = *(u64 *)(ul) (src + ARENA_VM_START + off) */
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_B:
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_H:
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_W:
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW:
+
+ EMIT(PPC_RAW_ADD(tmp1_reg, src_reg, bpf_to_ppc(ARENA_VM_START)));
+
+ switch (size) {
+ case BPF_B:
+ EMIT(PPC_RAW_LBZ(dst_reg, tmp1_reg, off));
+ break;
+ case BPF_H:
+ EMIT(PPC_RAW_LHZ(dst_reg, tmp1_reg, off));
+ break;
+ case BPF_W:
+ EMIT(PPC_RAW_LWZ(dst_reg, tmp1_reg, off));
+ break;
+ case BPF_DW:
+ if (off % 4) {
+ EMIT(PPC_RAW_LI(tmp2_reg, off));
+ EMIT(PPC_RAW_LDX(dst_reg, tmp1_reg, tmp2_reg));
+ } else {
+ EMIT(PPC_RAW_LD(dst_reg, tmp1_reg, off));
+ }
+ break;
+ }
+
+ if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
+ addrs[++i] = ctx->idx * 4;
+
+ ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
+ ctx->idx - 1, 4, dst_reg, code);
+ if (ret)
+ return ret;
+ break;
+
/*
* Doubleword load
* 16 byte instruction that uses two 'struct bpf_insn'
--
2.43.5
* Re: [bpf-next 2/6] bpf,powerpc: Implement PROBE_MEM32 pseudo instructions
2025-08-05 6:27 ` [bpf-next 2/6] bpf,powerpc: Implement PROBE_MEM32 pseudo instructions Saket Kumar Bhaskar
@ 2025-08-05 7:41 ` Christophe Leroy
2025-08-07 13:25 ` Saket Kumar Bhaskar
2025-08-14 8:54 ` Hari Bathini
1 sibling, 1 reply; 23+ messages in thread
From: Christophe Leroy @ 2025-08-05 7:41 UTC (permalink / raw)
To: Saket Kumar Bhaskar, bpf, linuxppc-dev, linux-kselftest,
linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii,
shuah
On 05/08/2025 at 08:27, Saket Kumar Bhaskar wrote:
> Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW]
> instructions. They are similar to PROBE_MEM instructions with the
> following differences:
> - PROBE_MEM32 supports store.
> - PROBE_MEM32 relies on the verifier to clear upper 32-bit of the
> src/dst register
> - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in _R26
> in the prologue). Due to bpf_arena constructions such _R26 + reg +
> off16 access is guaranteed to be within arena virtual range, so no
> address check at run-time.
> - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When
> LDX faults the destination register is zeroed.
>
> To support these on powerpc, we do tmp1 = _R26 + src/dst reg and then use
> tmp1 as the new src/dst register. This allows us to reuse most of the
> code for normal [LDX | STX | ST].
>
> Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
> ---
> arch/powerpc/net/bpf_jit.h | 5 +-
> arch/powerpc/net/bpf_jit_comp.c | 10 ++-
> arch/powerpc/net/bpf_jit_comp32.c | 2 +-
> arch/powerpc/net/bpf_jit_comp64.c | 108 ++++++++++++++++++++++++++++--
> 4 files changed, 114 insertions(+), 11 deletions(-)
>
> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> index 4c26912c2e3c..2d095a873305 100644
> --- a/arch/powerpc/net/bpf_jit.h
> +++ b/arch/powerpc/net/bpf_jit.h
> @@ -161,9 +161,10 @@ struct codegen_context {
> unsigned int seen;
> unsigned int idx;
> unsigned int stack_size;
> - int b2p[MAX_BPF_JIT_REG + 2];
> + int b2p[MAX_BPF_JIT_REG + 3];
> unsigned int exentry_idx;
> unsigned int alt_exit_addr;
> + u64 arena_vm_start;
> };
>
> #define bpf_to_ppc(r) (ctx->b2p[r])
> @@ -201,7 +202,7 @@ int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg,
>
> int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass,
> struct codegen_context *ctx, int insn_idx,
> - int jmp_off, int dst_reg);
> + int jmp_off, int dst_reg, u32 code);
>
> #endif
>
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> index c0684733e9d6..35bfdf4d8785 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -204,6 +204,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
>
> /* Make sure that the stack is quadword aligned. */
> cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
> + cgctx.arena_vm_start = bpf_arena_get_kern_vm_start(fp->aux->arena);
>
> /* Scouting faux-generate pass 0 */
> if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false)) {
> @@ -326,7 +327,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
> */
> int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass,
> struct codegen_context *ctx, int insn_idx, int jmp_off,
> - int dst_reg)
> + int dst_reg, u32 code)
> {
> off_t offset;
> unsigned long pc;
> @@ -354,7 +355,12 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass
> (fp->aux->num_exentries * BPF_FIXUP_LEN * 4) +
> (ctx->exentry_idx * BPF_FIXUP_LEN * 4);
>
> - fixup[0] = PPC_RAW_LI(dst_reg, 0);
> + if ((BPF_CLASS(code) == BPF_LDX && BPF_MODE(code) == BPF_PROBE_MEM32) ||
> + (BPF_CLASS(code) == BPF_LDX && BPF_MODE(code) == BPF_PROBE_MEM))
> + fixup[0] = PPC_RAW_LI(dst_reg, 0);
> + else if (BPF_CLASS(code) == BPF_ST || BPF_CLASS(code) == BPF_STX)
> + fixup[0] = PPC_RAW_NOP();
> +
Is there also an 'else' case to consider? If not, why not just an 'else'
instead of an 'else if'?
> if (IS_ENABLED(CONFIG_PPC32))
> fixup[1] = PPC_RAW_LI(dst_reg - 1, 0); /* clear higher 32-bit register too */
>
> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> index 0aace304dfe1..3087e744fb25 100644
> --- a/arch/powerpc/net/bpf_jit_comp32.c
> +++ b/arch/powerpc/net/bpf_jit_comp32.c
> @@ -1087,7 +1087,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> }
>
> ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx, insn_idx,
> - jmp_off, dst_reg);
> + jmp_off, dst_reg, code);
> if (ret)
> return ret;
> }
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 489de21fe3d6..16e62766c757 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -44,6 +44,7 @@
> /* BPF register usage */
> #define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
> #define TMP_REG_2 (MAX_BPF_JIT_REG + 1)
> +#define ARENA_VM_START (MAX_BPF_JIT_REG + 2)
>
> /* BPF to ppc register mappings */
> void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
> @@ -61,6 +62,8 @@ void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
> ctx->b2p[BPF_REG_7] = _R28;
> ctx->b2p[BPF_REG_8] = _R29;
> ctx->b2p[BPF_REG_9] = _R30;
> + /* non volatile register for kern_vm_start address */
> + ctx->b2p[ARENA_VM_START] = _R26;
> /* frame pointer aka BPF_REG_10 */
> ctx->b2p[BPF_REG_FP] = _R31;
> /* eBPF jit internal registers */
> @@ -69,8 +72,8 @@ void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
> ctx->b2p[TMP_REG_2] = _R10;
> }
>
> -/* PPC NVR range -- update this if we ever use NVRs below r27 */
> -#define BPF_PPC_NVR_MIN _R27
> +/* PPC NVR range -- update this if we ever use NVRs below r26 */
> +#define BPF_PPC_NVR_MIN _R26
>
> static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
> {
> @@ -170,10 +173,17 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
> if (bpf_is_seen_register(ctx, bpf_to_ppc(i)))
> EMIT(PPC_RAW_STD(bpf_to_ppc(i), _R1, bpf_jit_stack_offsetof(ctx, bpf_to_ppc(i))));
>
> + if (ctx->arena_vm_start)
> + EMIT(PPC_RAW_STD(bpf_to_ppc(ARENA_VM_START), _R1,
> + bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ARENA_VM_START))));
> +
> /* Setup frame pointer to point to the bpf stack area */
> if (bpf_is_seen_register(ctx, bpf_to_ppc(BPF_REG_FP)))
> EMIT(PPC_RAW_ADDI(bpf_to_ppc(BPF_REG_FP), _R1,
> STACK_FRAME_MIN_SIZE + ctx->stack_size));
> +
> + if (ctx->arena_vm_start)
> + PPC_LI64(bpf_to_ppc(ARENA_VM_START), ctx->arena_vm_start);
> }
>
> static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx)
> @@ -185,6 +195,10 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx
> if (bpf_is_seen_register(ctx, bpf_to_ppc(i)))
> EMIT(PPC_RAW_LD(bpf_to_ppc(i), _R1, bpf_jit_stack_offsetof(ctx, bpf_to_ppc(i))));
>
> + if (ctx->arena_vm_start)
> + EMIT(PPC_RAW_LD(bpf_to_ppc(ARENA_VM_START), _R1,
> + bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ARENA_VM_START))));
> +
> /* Tear down our stack frame */
> if (bpf_has_stack_frame(ctx)) {
> EMIT(PPC_RAW_ADDI(_R1, _R1, BPF_PPC_STACKFRAME + ctx->stack_size));
> @@ -990,6 +1004,50 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> }
> break;
>
> + case BPF_STX | BPF_PROBE_MEM32 | BPF_B:
> + case BPF_STX | BPF_PROBE_MEM32 | BPF_H:
> + case BPF_STX | BPF_PROBE_MEM32 | BPF_W:
> + case BPF_STX | BPF_PROBE_MEM32 | BPF_DW:
> +
> + EMIT(PPC_RAW_ADD(tmp1_reg, dst_reg, bpf_to_ppc(ARENA_VM_START)));
> +
> + ret = bpf_jit_emit_probe_mem_store(ctx, src_reg, off, code, image);
> + if (ret)
> + return ret;
> +
> + ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
> + ctx->idx - 1, 4, -1, code);
> + if (ret)
> + return ret;
> +
> + break;
> +
> + case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
> + case BPF_ST | BPF_PROBE_MEM32 | BPF_H:
> + case BPF_ST | BPF_PROBE_MEM32 | BPF_W:
> + case BPF_ST | BPF_PROBE_MEM32 | BPF_DW:
> +
> + EMIT(PPC_RAW_ADD(tmp1_reg, dst_reg, bpf_to_ppc(ARENA_VM_START)));
> +
> + if (BPF_SIZE(code) == BPF_W || BPF_SIZE(code) == BPF_DW) {
> + PPC_LI32(tmp2_reg, imm);
> + src_reg = tmp2_reg;
> + } else {
> + EMIT(PPC_RAW_LI(tmp2_reg, imm));
> + src_reg = tmp2_reg;
> + }
> +
> + ret = bpf_jit_emit_probe_mem_store(ctx, src_reg, off, code, image);
> + if (ret)
> + return ret;
> +
> + ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
> + ctx->idx - 1, 4, -1, code);
> + if (ret)
> + return ret;
> +
> + break;
> +
> /*
> * BPF_STX ATOMIC (atomic ops)
> */
> @@ -1142,9 +1200,10 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> * Check if 'off' is word aligned for BPF_DW, because
> * we might generate two instructions.
> */
> - if ((BPF_SIZE(code) == BPF_DW ||
> - (BPF_SIZE(code) == BPF_B && BPF_MODE(code) == BPF_PROBE_MEMSX)) &&
> - (off & 3))
> + if ((BPF_SIZE(code) == BPF_DW && (off & 3)) ||
> + (BPF_SIZE(code) == BPF_B &&
> + BPF_MODE(code) == BPF_PROBE_MEMSX) ||
> + (BPF_SIZE(code) == BPF_B && BPF_MODE(code) == BPF_MEMSX))
> PPC_JMP((ctx->idx + 3) * 4);
> else
> PPC_JMP((ctx->idx + 2) * 4);
> @@ -1190,12 +1249,49 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
>
> if (BPF_MODE(code) == BPF_PROBE_MEM) {
> ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
> - ctx->idx - 1, 4, dst_reg);
> + ctx->idx - 1, 4, dst_reg, code);
> if (ret)
> return ret;
> }
> break;
>
> + /* dst = *(u64 *)(ul) (src + ARENA_VM_START + off) */
> + case BPF_LDX | BPF_PROBE_MEM32 | BPF_B:
> + case BPF_LDX | BPF_PROBE_MEM32 | BPF_H:
> + case BPF_LDX | BPF_PROBE_MEM32 | BPF_W:
> + case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW:
> +
> + EMIT(PPC_RAW_ADD(tmp1_reg, src_reg, bpf_to_ppc(ARENA_VM_START)));
> +
> + switch (size) {
> + case BPF_B:
> + EMIT(PPC_RAW_LBZ(dst_reg, tmp1_reg, off));
> + break;
> + case BPF_H:
> + EMIT(PPC_RAW_LHZ(dst_reg, tmp1_reg, off));
> + break;
> + case BPF_W:
> + EMIT(PPC_RAW_LWZ(dst_reg, tmp1_reg, off));
> + break;
> + case BPF_DW:
> + if (off % 4) {
> + EMIT(PPC_RAW_LI(tmp2_reg, off));
> + EMIT(PPC_RAW_LDX(dst_reg, tmp1_reg, tmp2_reg));
> + } else {
> + EMIT(PPC_RAW_LD(dst_reg, tmp1_reg, off));
> + }
> + break;
> + }
> +
> + if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
> + addrs[++i] = ctx->idx * 4;
> +
> + ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
> + ctx->idx - 1, 4, dst_reg, code);
> + if (ret)
> + return ret;
> + break;
> +
> /*
> * Doubleword load
> * 16 byte instruction that uses two 'struct bpf_insn'
* Re: [bpf-next 2/6] bpf,powerpc: Implement PROBE_MEM32 pseudo instructions
2025-08-05 7:41 ` Christophe Leroy
@ 2025-08-07 13:25 ` Saket Kumar Bhaskar
0 siblings, 0 replies; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-07 13:25 UTC (permalink / raw)
To: Christophe Leroy
Cc: bpf, linuxppc-dev, linux-kselftest, linux-kernel, hbathini,
sachinpb, venkat88, andrii, eddyz87, mykolal, ast, daniel,
martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii, shuah
On Tue, Aug 05, 2025 at 09:41:37AM +0200, Christophe Leroy wrote:
>
>
> On 05/08/2025 at 08:27, Saket Kumar Bhaskar wrote:
> > Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW]
> > instructions. They are similar to PROBE_MEM instructions with the
> > following differences:
> > - PROBE_MEM32 supports store.
> > - PROBE_MEM32 relies on the verifier to clear upper 32-bit of the
> > src/dst register
> > - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in _R26
> > in the prologue). Due to bpf_arena constructions such _R26 + reg +
> > off16 access is guaranteed to be within arena virtual range, so no
> > address check at run-time.
> > - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When
> > LDX faults the destination register is zeroed.
> >
> > To support these on powerpc, we do tmp1 = _R26 + src/dst reg and then use
> > tmp1 as the new src/dst register. This allows us to reuse most of the
> > code for normal [LDX | STX | ST].
> >
> > Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
> > ---
> > arch/powerpc/net/bpf_jit.h | 5 +-
> > arch/powerpc/net/bpf_jit_comp.c | 10 ++-
> > arch/powerpc/net/bpf_jit_comp32.c | 2 +-
> > arch/powerpc/net/bpf_jit_comp64.c | 108 ++++++++++++++++++++++++++++--
> > 4 files changed, 114 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> > index 4c26912c2e3c..2d095a873305 100644
> > --- a/arch/powerpc/net/bpf_jit.h
> > +++ b/arch/powerpc/net/bpf_jit.h
> > @@ -161,9 +161,10 @@ struct codegen_context {
> > unsigned int seen;
> > unsigned int idx;
> > unsigned int stack_size;
> > - int b2p[MAX_BPF_JIT_REG + 2];
> > + int b2p[MAX_BPF_JIT_REG + 3];
> > unsigned int exentry_idx;
> > unsigned int alt_exit_addr;
> > + u64 arena_vm_start;
> > };
> > #define bpf_to_ppc(r) (ctx->b2p[r])
> > @@ -201,7 +202,7 @@ int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg,
> > int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass,
> > struct codegen_context *ctx, int insn_idx,
> > - int jmp_off, int dst_reg);
> > + int jmp_off, int dst_reg, u32 code);
> > #endif
> > diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> > index c0684733e9d6..35bfdf4d8785 100644
> > --- a/arch/powerpc/net/bpf_jit_comp.c
> > +++ b/arch/powerpc/net/bpf_jit_comp.c
> > @@ -204,6 +204,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
> > /* Make sure that the stack is quadword aligned. */
> > cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
> > + cgctx.arena_vm_start = bpf_arena_get_kern_vm_start(fp->aux->arena);
> > /* Scouting faux-generate pass 0 */
> > if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false)) {
> > @@ -326,7 +327,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
> > */
> > int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass,
> > struct codegen_context *ctx, int insn_idx, int jmp_off,
> > - int dst_reg)
> > + int dst_reg, u32 code)
> > {
> > off_t offset;
> > unsigned long pc;
> > @@ -354,7 +355,12 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass
> > (fp->aux->num_exentries * BPF_FIXUP_LEN * 4) +
> > (ctx->exentry_idx * BPF_FIXUP_LEN * 4);
> > - fixup[0] = PPC_RAW_LI(dst_reg, 0);
> > + if ((BPF_CLASS(code) == BPF_LDX && BPF_MODE(code) == BPF_PROBE_MEM32) ||
> > + (BPF_CLASS(code) == BPF_LDX && BPF_MODE(code) == BPF_PROBE_MEM))
> > + fixup[0] = PPC_RAW_LI(dst_reg, 0);
> > + else if (BPF_CLASS(code) == BPF_ST || BPF_CLASS(code) == BPF_STX)
> > + fixup[0] = PPC_RAW_NOP();
> > +
>
> Is there also an 'else' case to consider? If not, why not just an 'else'
> instead of an 'else if'?
>
Thanks, Chris, for pointing it out. Will try to accommodate this in v2.
> > if (IS_ENABLED(CONFIG_PPC32))
> > fixup[1] = PPC_RAW_LI(dst_reg - 1, 0); /* clear higher 32-bit register too */
> > diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> > index 0aace304dfe1..3087e744fb25 100644
> > --- a/arch/powerpc/net/bpf_jit_comp32.c
> > +++ b/arch/powerpc/net/bpf_jit_comp32.c
> > @@ -1087,7 +1087,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> > }
> > ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx, insn_idx,
> > - jmp_off, dst_reg);
> > + jmp_off, dst_reg, code);
> > if (ret)
> > return ret;
> > }
> > diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> > index 489de21fe3d6..16e62766c757 100644
> > --- a/arch/powerpc/net/bpf_jit_comp64.c
> > +++ b/arch/powerpc/net/bpf_jit_comp64.c
> > @@ -44,6 +44,7 @@
> > /* BPF register usage */
> > #define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
> > #define TMP_REG_2 (MAX_BPF_JIT_REG + 1)
> > +#define ARENA_VM_START (MAX_BPF_JIT_REG + 2)
> > /* BPF to ppc register mappings */
> > void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
> > @@ -61,6 +62,8 @@ void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
> > ctx->b2p[BPF_REG_7] = _R28;
> > ctx->b2p[BPF_REG_8] = _R29;
> > ctx->b2p[BPF_REG_9] = _R30;
> > + /* non volatile register for kern_vm_start address */
> > + ctx->b2p[ARENA_VM_START] = _R26;
> > /* frame pointer aka BPF_REG_10 */
> > ctx->b2p[BPF_REG_FP] = _R31;
> > /* eBPF jit internal registers */
> > @@ -69,8 +72,8 @@ void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
> > ctx->b2p[TMP_REG_2] = _R10;
> > }
> > -/* PPC NVR range -- update this if we ever use NVRs below r27 */
> > -#define BPF_PPC_NVR_MIN _R27
> > +/* PPC NVR range -- update this if we ever use NVRs below r26 */
> > +#define BPF_PPC_NVR_MIN _R26
> > static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
> > {
> > @@ -170,10 +173,17 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
> > if (bpf_is_seen_register(ctx, bpf_to_ppc(i)))
> > EMIT(PPC_RAW_STD(bpf_to_ppc(i), _R1, bpf_jit_stack_offsetof(ctx, bpf_to_ppc(i))));
> > + if (ctx->arena_vm_start)
> > + EMIT(PPC_RAW_STD(bpf_to_ppc(ARENA_VM_START), _R1,
> > + bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ARENA_VM_START))));
> > +
> > /* Setup frame pointer to point to the bpf stack area */
> > if (bpf_is_seen_register(ctx, bpf_to_ppc(BPF_REG_FP)))
> > EMIT(PPC_RAW_ADDI(bpf_to_ppc(BPF_REG_FP), _R1,
> > STACK_FRAME_MIN_SIZE + ctx->stack_size));
> > +
> > + if (ctx->arena_vm_start)
> > + PPC_LI64(bpf_to_ppc(ARENA_VM_START), ctx->arena_vm_start);
> > }
> > static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx)
> > @@ -185,6 +195,10 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx
> > if (bpf_is_seen_register(ctx, bpf_to_ppc(i)))
> > EMIT(PPC_RAW_LD(bpf_to_ppc(i), _R1, bpf_jit_stack_offsetof(ctx, bpf_to_ppc(i))));
> > + if (ctx->arena_vm_start)
> > + EMIT(PPC_RAW_LD(bpf_to_ppc(ARENA_VM_START), _R1,
> > + bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ARENA_VM_START))));
> > +
> > /* Tear down our stack frame */
> > if (bpf_has_stack_frame(ctx)) {
> > EMIT(PPC_RAW_ADDI(_R1, _R1, BPF_PPC_STACKFRAME + ctx->stack_size));
> > @@ -990,6 +1004,50 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> > }
> > break;
> > + case BPF_STX | BPF_PROBE_MEM32 | BPF_B:
> > + case BPF_STX | BPF_PROBE_MEM32 | BPF_H:
> > + case BPF_STX | BPF_PROBE_MEM32 | BPF_W:
> > + case BPF_STX | BPF_PROBE_MEM32 | BPF_DW:
> > +
> > + EMIT(PPC_RAW_ADD(tmp1_reg, dst_reg, bpf_to_ppc(ARENA_VM_START)));
> > +
> > + ret = bpf_jit_emit_probe_mem_store(ctx, src_reg, off, code, image);
> > + if (ret)
> > + return ret;
> > +
> > + ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
> > + ctx->idx - 1, 4, -1, code);
> > + if (ret)
> > + return ret;
> > +
> > + break;
> > +
> > + case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
> > + case BPF_ST | BPF_PROBE_MEM32 | BPF_H:
> > + case BPF_ST | BPF_PROBE_MEM32 | BPF_W:
> > + case BPF_ST | BPF_PROBE_MEM32 | BPF_DW:
> > +
> > + EMIT(PPC_RAW_ADD(tmp1_reg, dst_reg, bpf_to_ppc(ARENA_VM_START)));
> > +
> > + if (BPF_SIZE(code) == BPF_W || BPF_SIZE(code) == BPF_DW) {
> > + PPC_LI32(tmp2_reg, imm);
> > + src_reg = tmp2_reg;
> > + } else {
> > + EMIT(PPC_RAW_LI(tmp2_reg, imm));
> > + src_reg = tmp2_reg;
> > + }
> > +
> > + ret = bpf_jit_emit_probe_mem_store(ctx, src_reg, off, code, image);
> > + if (ret)
> > + return ret;
> > +
> > + ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
> > + ctx->idx - 1, 4, -1, code);
> > + if (ret)
> > + return ret;
> > +
> > + break;
> > +
> > /*
> > * BPF_STX ATOMIC (atomic ops)
> > */
> > @@ -1142,9 +1200,10 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> > * Check if 'off' is word aligned for BPF_DW, because
> > * we might generate two instructions.
> > */
> > - if ((BPF_SIZE(code) == BPF_DW ||
> > - (BPF_SIZE(code) == BPF_B && BPF_MODE(code) == BPF_PROBE_MEMSX)) &&
> > - (off & 3))
> > + if ((BPF_SIZE(code) == BPF_DW && (off & 3)) ||
> > + (BPF_SIZE(code) == BPF_B &&
> > + BPF_MODE(code) == BPF_PROBE_MEMSX) ||
> > + (BPF_SIZE(code) == BPF_B && BPF_MODE(code) == BPF_MEMSX))
> > PPC_JMP((ctx->idx + 3) * 4);
> > else
> > PPC_JMP((ctx->idx + 2) * 4);
> > @@ -1190,12 +1249,49 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> > if (BPF_MODE(code) == BPF_PROBE_MEM) {
> > ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
> > - ctx->idx - 1, 4, dst_reg);
> > + ctx->idx - 1, 4, dst_reg, code);
> > if (ret)
> > return ret;
> > }
> > break;
> > + /* dst = *(u64 *)(ul) (src + ARENA_VM_START + off) */
> > + case BPF_LDX | BPF_PROBE_MEM32 | BPF_B:
> > + case BPF_LDX | BPF_PROBE_MEM32 | BPF_H:
> > + case BPF_LDX | BPF_PROBE_MEM32 | BPF_W:
> > + case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW:
> > +
> > + EMIT(PPC_RAW_ADD(tmp1_reg, src_reg, bpf_to_ppc(ARENA_VM_START)));
> > +
> > + switch (size) {
> > + case BPF_B:
> > + EMIT(PPC_RAW_LBZ(dst_reg, tmp1_reg, off));
> > + break;
> > + case BPF_H:
> > + EMIT(PPC_RAW_LHZ(dst_reg, tmp1_reg, off));
> > + break;
> > + case BPF_W:
> > + EMIT(PPC_RAW_LWZ(dst_reg, tmp1_reg, off));
> > + break;
> > + case BPF_DW:
> > + if (off % 4) {
> > + EMIT(PPC_RAW_LI(tmp2_reg, off));
> > + EMIT(PPC_RAW_LDX(dst_reg, tmp1_reg, tmp2_reg));
> > + } else {
> > + EMIT(PPC_RAW_LD(dst_reg, tmp1_reg, off));
> > + }
> > + break;
> > + }
> > +
> > + if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
> > + addrs[++i] = ctx->idx * 4;
> > +
> > + ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
> > + ctx->idx - 1, 4, dst_reg, code);
> > + if (ret)
> > + return ret;
> > + break;
> > +
> > /*
> > * Doubleword load
> > * 16 byte instruction that uses two 'struct bpf_insn'
>
Thanks for reviewing, Chris.
Regards,
Saket
* Re: [bpf-next 2/6] bpf,powerpc: Implement PROBE_MEM32 pseudo instructions
2025-08-05 6:27 ` [bpf-next 2/6] bpf,powerpc: Implement PROBE_MEM32 pseudo instructions Saket Kumar Bhaskar
2025-08-05 7:41 ` Christophe Leroy
@ 2025-08-14 8:54 ` Hari Bathini
1 sibling, 0 replies; 23+ messages in thread
From: Hari Bathini @ 2025-08-14 8:54 UTC (permalink / raw)
To: Saket Kumar Bhaskar, bpf, linuxppc-dev, linux-kselftest,
linux-kernel
Cc: sachinpb, venkat88, andrii, eddyz87, mykolal, ast, daniel,
martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, christophe.leroy, naveen, maddy, mpe, npiggin,
memxor, iii, shuah
On 05/08/25 11:57 am, Saket Kumar Bhaskar wrote:
> Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW]
> instructions. They are similar to PROBE_MEM instructions with the
> following differences:
> - PROBE_MEM32 supports store.
> - PROBE_MEM32 relies on the verifier to clear upper 32-bit of the
> src/dst register
> - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in _R26
> in the prologue). Due to bpf_arena constructions such _R26 + reg +
> off16 access is guaranteed to be within arena virtual range, so no
> address check at run-time.
> - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When
> LDX faults the destination register is zeroed.
>
> To support these on powerpc, we do tmp1 = _R26 + src/dst reg and then use
> tmp1 as the new src/dst register. This allows us to reuse most of the
> code for normal [LDX | STX | ST].
>
> Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
> ---
> arch/powerpc/net/bpf_jit.h | 5 +-
> arch/powerpc/net/bpf_jit_comp.c | 10 ++-
> arch/powerpc/net/bpf_jit_comp32.c | 2 +-
> arch/powerpc/net/bpf_jit_comp64.c | 108 ++++++++++++++++++++++++++++--
> 4 files changed, 114 insertions(+), 11 deletions(-)
>
> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> index 4c26912c2e3c..2d095a873305 100644
> --- a/arch/powerpc/net/bpf_jit.h
> +++ b/arch/powerpc/net/bpf_jit.h
> @@ -161,9 +161,10 @@ struct codegen_context {
> unsigned int seen;
> unsigned int idx;
> unsigned int stack_size;
> - int b2p[MAX_BPF_JIT_REG + 2];
> + int b2p[MAX_BPF_JIT_REG + 3];
> unsigned int exentry_idx;
> unsigned int alt_exit_addr;
> + u64 arena_vm_start;
> };
>
> #define bpf_to_ppc(r) (ctx->b2p[r])
> @@ -201,7 +202,7 @@ int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg,
>
> int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass,
> struct codegen_context *ctx, int insn_idx,
> - int jmp_off, int dst_reg);
> + int jmp_off, int dst_reg, u32 code);
>
> #endif
>
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> index c0684733e9d6..35bfdf4d8785 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -204,6 +204,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
>
> /* Make sure that the stack is quadword aligned. */
> cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
> + cgctx.arena_vm_start = bpf_arena_get_kern_vm_start(fp->aux->arena);
>
> /* Scouting faux-generate pass 0 */
> if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false)) {
> @@ -326,7 +327,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
> */
> int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass,
> struct codegen_context *ctx, int insn_idx, int jmp_off,
> - int dst_reg)
> + int dst_reg, u32 code)
> {
> off_t offset;
> unsigned long pc;
> @@ -354,7 +355,12 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, u32 *fimage, int pass
> (fp->aux->num_exentries * BPF_FIXUP_LEN * 4) +
> (ctx->exentry_idx * BPF_FIXUP_LEN * 4);
>
> - fixup[0] = PPC_RAW_LI(dst_reg, 0);
> + if ((BPF_CLASS(code) == BPF_LDX && BPF_MODE(code) == BPF_PROBE_MEM32) ||
> + (BPF_CLASS(code) == BPF_LDX && BPF_MODE(code) == BPF_PROBE_MEM))
> + fixup[0] = PPC_RAW_LI(dst_reg, 0);
> + else if (BPF_CLASS(code) == BPF_ST || BPF_CLASS(code) == BPF_STX)
> + fixup[0] = PPC_RAW_NOP();
> +
> if (IS_ENABLED(CONFIG_PPC32))
> fixup[1] = PPC_RAW_LI(dst_reg - 1, 0); /* clear higher 32-bit register too */
>
> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> index 0aace304dfe1..3087e744fb25 100644
> --- a/arch/powerpc/net/bpf_jit_comp32.c
> +++ b/arch/powerpc/net/bpf_jit_comp32.c
> @@ -1087,7 +1087,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> }
>
> ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx, insn_idx,
> - jmp_off, dst_reg);
> + jmp_off, dst_reg, code);
> if (ret)
> return ret;
> }
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 489de21fe3d6..16e62766c757 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -44,6 +44,7 @@
> /* BPF register usage */
> #define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
> #define TMP_REG_2 (MAX_BPF_JIT_REG + 1)
> +#define ARENA_VM_START (MAX_BPF_JIT_REG + 2)
>
> /* BPF to ppc register mappings */
> void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
> @@ -61,6 +62,8 @@ void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
> ctx->b2p[BPF_REG_7] = _R28;
> ctx->b2p[BPF_REG_8] = _R29;
> ctx->b2p[BPF_REG_9] = _R30;
> + /* non volatile register for kern_vm_start address */
> + ctx->b2p[ARENA_VM_START] = _R26;
> /* frame pointer aka BPF_REG_10 */
> ctx->b2p[BPF_REG_FP] = _R31;
> /* eBPF jit internal registers */
> @@ -69,8 +72,8 @@ void bpf_jit_init_reg_mapping(struct codegen_context *ctx)
> ctx->b2p[TMP_REG_2] = _R10;
> }
>
> -/* PPC NVR range -- update this if we ever use NVRs below r27 */
> -#define BPF_PPC_NVR_MIN _R27
> +/* PPC NVR range -- update this if we ever use NVRs below r26 */
> +#define BPF_PPC_NVR_MIN _R26
>
> static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
> {
> @@ -170,10 +173,17 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
> if (bpf_is_seen_register(ctx, bpf_to_ppc(i)))
> EMIT(PPC_RAW_STD(bpf_to_ppc(i), _R1, bpf_jit_stack_offsetof(ctx, bpf_to_ppc(i))));
>
> + if (ctx->arena_vm_start)
> + EMIT(PPC_RAW_STD(bpf_to_ppc(ARENA_VM_START), _R1,
> + bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ARENA_VM_START))));
> +
I don't see a selftest that tests both arena and tail calls
together, but the above change is going to clobber the tail call
count. That is because the current stack layout is impacted by this
new non-volatile register usage:
/*
 *              [    prev sp         ] <-------------
 *              [  nv gpr save area  ] 5*8           |
 *              [   tail_call_cnt    ] 8             |
 *              [   local_tmp_var    ] 16            |
 * fp (r31) --> [  ebpf stack space  ] upto 512      |
 *              [    frame header    ] 32/112        |
 * sp (r1) ---> [   stack pointer    ] --------------
 */
Please rework the above stack layout and the corresponding macros
accordingly.
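One possible rework (an untested sketch; assuming BPF_PPC_STACK_SAVE is
the macro that sizes the nv gpr save area) is to grow the save area by
one slot, so that r26 gets its own space above tail_call_cnt instead of
landing on top of it:

/*
 *              [    prev sp         ] <-------------
 *              [  nv gpr save area  ] 6*8           |
 *              [   tail_call_cnt    ] 8             |
 *              [   local_tmp_var    ] 16            |
 * fp (r31) --> [  ebpf stack space  ] upto 512      |
 *              [    frame header    ] 32/112        |
 * sp (r1) ---> [   stack pointer    ] --------------
 */

/* for nv gprs r26 to r31: BPF_REG_6 to 10, FP and arena vm start */
#define BPF_PPC_STACK_SAVE	(6*8)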
> /* Setup frame pointer to point to the bpf stack area */
> if (bpf_is_seen_register(ctx, bpf_to_ppc(BPF_REG_FP)))
> EMIT(PPC_RAW_ADDI(bpf_to_ppc(BPF_REG_FP), _R1,
> STACK_FRAME_MIN_SIZE + ctx->stack_size));
> +
> + if (ctx->arena_vm_start)
> + PPC_LI64(bpf_to_ppc(ARENA_VM_START), ctx->arena_vm_start);
> }
>
> static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx)
> @@ -185,6 +195,10 @@ static void bpf_jit_emit_common_epilogue(u32 *image, struct codegen_context *ctx
> if (bpf_is_seen_register(ctx, bpf_to_ppc(i)))
> EMIT(PPC_RAW_LD(bpf_to_ppc(i), _R1, bpf_jit_stack_offsetof(ctx, bpf_to_ppc(i))));
>
> + if (ctx->arena_vm_start)
> + EMIT(PPC_RAW_LD(bpf_to_ppc(ARENA_VM_START), _R1,
> + bpf_jit_stack_offsetof(ctx, bpf_to_ppc(ARENA_VM_START))));
> +
> /* Tear down our stack frame */
> if (bpf_has_stack_frame(ctx)) {
> EMIT(PPC_RAW_ADDI(_R1, _R1, BPF_PPC_STACKFRAME + ctx->stack_size));
- Hari
^ permalink raw reply [flat|nested] 23+ messages in thread
* [bpf-next 3/6] bpf,powerpc: Implement bpf_addr_space_cast instruction
2025-08-05 6:27 [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Saket Kumar Bhaskar
2025-08-05 6:27 ` [bpf-next 1/6] bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store instructions Saket Kumar Bhaskar
2025-08-05 6:27 ` [bpf-next 2/6] bpf,powerpc: Implement PROBE_MEM32 pseudo instructions Saket Kumar Bhaskar
@ 2025-08-05 6:27 ` Saket Kumar Bhaskar
2025-08-05 7:29 ` Christophe Leroy
2025-08-05 6:27 ` [bpf-next 4/6] bpf,powerpc: Introduce bpf_jit_emit_atomic_ops() to emit atomic instructions Saket Kumar Bhaskar
` (4 subsequent siblings)
7 siblings, 1 reply; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-05 6:27 UTC (permalink / raw)
To: bpf, linuxppc-dev, linux-kselftest, linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, christophe.leroy, naveen, maddy, mpe, npiggin,
memxor, iii, shuah
LLVM generates bpf_addr_space_cast instruction while translating
pointers between native (zero) address space and
__attribute__((address_space(N))). The addr_space=1 is reserved as
bpf_arena address space.
rY = addr_space_cast(rX, 0, 1) is processed by the verifier and
converted to normal 32-bit move: wX = wY.
rY = addr_space_cast(rX, 1, 0) : used to convert a bpf arena pointer to
a pointer in the userspace vma. This has to be converted by the JIT.
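In C terms, the emitted sequence implements roughly the following (a
sketch of the intended semantics; user_vm_start comes from
bpf_arena_get_user_vm_start()):

	dst = (u32)src;
	if (dst)
		dst |= user_vm_start & 0xffffffff00000000UL;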
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
---
arch/powerpc/net/bpf_jit.h | 1 +
arch/powerpc/net/bpf_jit_comp.c | 6 ++++++
arch/powerpc/net/bpf_jit_comp64.c | 11 +++++++++++
3 files changed, 18 insertions(+)
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 2d095a873305..748e30e8b5b4 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -165,6 +165,7 @@ struct codegen_context {
unsigned int exentry_idx;
unsigned int alt_exit_addr;
u64 arena_vm_start;
+ u64 user_vm_start;
};
#define bpf_to_ppc(r) (ctx->b2p[r])
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 35bfdf4d8785..2b3f90930c27 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -205,6 +205,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
/* Make sure that the stack is quadword aligned. */
cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
cgctx.arena_vm_start = bpf_arena_get_kern_vm_start(fp->aux->arena);
+ cgctx.user_vm_start = bpf_arena_get_user_vm_start(fp->aux->arena);
/* Scouting faux-generate pass 0 */
if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false)) {
@@ -441,6 +442,11 @@ bool bpf_jit_supports_kfunc_call(void)
return true;
}
+bool bpf_jit_supports_arena(void)
+{
+ return IS_ENABLED(CONFIG_PPC64);
+}
+
bool bpf_jit_supports_far_kfunc_call(void)
{
return IS_ENABLED(CONFIG_PPC64);
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 16e62766c757..d4fe4dacf2d6 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -812,6 +812,17 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
*/
case BPF_ALU | BPF_MOV | BPF_X: /* (u32) dst = src */
case BPF_ALU64 | BPF_MOV | BPF_X: /* dst = src */
+
+ if (insn_is_cast_user(&insn[i])) {
+ EMIT(PPC_RAW_RLDICL(tmp1_reg, src_reg, 0, 32));
+ PPC_LI64(dst_reg, (ctx->user_vm_start & 0xffffffff00000000UL));
+ EMIT(PPC_RAW_CMPDI(tmp1_reg, 0));
+ PPC_BCC_SHORT(COND_EQ, (ctx->idx + 2) * 4);
+ EMIT(PPC_RAW_OR(tmp1_reg, dst_reg, tmp1_reg));
+ EMIT(PPC_RAW_MR(dst_reg, tmp1_reg));
+ break;
+ }
+
if (imm == 1) {
/* special mov32 for zext */
EMIT(PPC_RAW_RLWINM(dst_reg, dst_reg, 0, 0, 31));
--
2.43.5
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [bpf-next 3/6] bpf,powerpc: Implement bpf_addr_space_cast instruction
2025-08-05 6:27 ` [bpf-next 3/6] bpf,powerpc: Implement bpf_addr_space_cast instruction Saket Kumar Bhaskar
@ 2025-08-05 7:29 ` Christophe Leroy
2025-08-07 10:24 ` Saket Kumar Bhaskar
0 siblings, 1 reply; 23+ messages in thread
From: Christophe Leroy @ 2025-08-05 7:29 UTC (permalink / raw)
To: Saket Kumar Bhaskar, bpf, linuxppc-dev, linux-kselftest,
linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii,
shuah
Le 05/08/2025 à 08:27, Saket Kumar Bhaskar a écrit :
> LLVM generates bpf_addr_space_cast instruction while translating
> pointers between native (zero) address space and
> __attribute__((address_space(N))). The addr_space=1 is reserved as
> bpf_arena address space.
>
> rY = addr_space_cast(rX, 0, 1) is processed by the verifier and
> converted to normal 32-bit move: wX = wY.
>
> rY = addr_space_cast(rX, 1, 0) : used to convert a bpf arena pointer to
> a pointer in the userspace vma. This has to be converted by the JIT.
>
> Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
> ---
> arch/powerpc/net/bpf_jit.h | 1 +
> arch/powerpc/net/bpf_jit_comp.c | 6 ++++++
> arch/powerpc/net/bpf_jit_comp64.c | 11 +++++++++++
> 3 files changed, 18 insertions(+)
>
> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> index 2d095a873305..748e30e8b5b4 100644
> --- a/arch/powerpc/net/bpf_jit.h
> +++ b/arch/powerpc/net/bpf_jit.h
> @@ -165,6 +165,7 @@ struct codegen_context {
> unsigned int exentry_idx;
> unsigned int alt_exit_addr;
> u64 arena_vm_start;
> + u64 user_vm_start;
> };
>
> #define bpf_to_ppc(r) (ctx->b2p[r])
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> index 35bfdf4d8785..2b3f90930c27 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -205,6 +205,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
> /* Make sure that the stack is quadword aligned. */
> cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
> cgctx.arena_vm_start = bpf_arena_get_kern_vm_start(fp->aux->arena);
> + cgctx.user_vm_start = bpf_arena_get_user_vm_start(fp->aux->arena);
>
> /* Scouting faux-generate pass 0 */
> if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false)) {
> @@ -441,6 +442,11 @@ bool bpf_jit_supports_kfunc_call(void)
> return true;
> }
>
> +bool bpf_jit_supports_arena(void)
> +{
> + return IS_ENABLED(CONFIG_PPC64);
> +}
> +
> bool bpf_jit_supports_far_kfunc_call(void)
> {
> return IS_ENABLED(CONFIG_PPC64);
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 16e62766c757..d4fe4dacf2d6 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -812,6 +812,17 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> */
> case BPF_ALU | BPF_MOV | BPF_X: /* (u32) dst = src */
> case BPF_ALU64 | BPF_MOV | BPF_X: /* dst = src */
> +
> + if (insn_is_cast_user(&insn[i])) {
> + EMIT(PPC_RAW_RLDICL(tmp1_reg, src_reg, 0, 32));
Define and use PPC_RAW_RLDICL_DOT to avoid the CMPDI below.
> + PPC_LI64(dst_reg, (ctx->user_vm_start & 0xffffffff00000000UL));
> + EMIT(PPC_RAW_CMPDI(tmp1_reg, 0));
> + PPC_BCC_SHORT(COND_EQ, (ctx->idx + 2) * 4);
> + EMIT(PPC_RAW_OR(tmp1_reg, dst_reg, tmp1_reg));
> + EMIT(PPC_RAW_MR(dst_reg, tmp1_reg));
> + break;
> + }
> +
> if (imm == 1) {
> /* special mov32 for zext */
> EMIT(PPC_RAW_RLWINM(dst_reg, dst_reg, 0, 0, 31));
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [bpf-next 3/6] bpf,powerpc: Implement bpf_addr_space_cast instruction
2025-08-05 7:29 ` Christophe Leroy
@ 2025-08-07 10:24 ` Saket Kumar Bhaskar
0 siblings, 0 replies; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-07 10:24 UTC (permalink / raw)
To: Christophe Leroy
Cc: bpf, linuxppc-dev, linux-kselftest, linux-kernel, hbathini,
sachinpb, venkat88, andrii, eddyz87, mykolal, ast, daniel,
martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii, shuah
On Tue, Aug 05, 2025 at 09:29:07AM +0200, Christophe Leroy wrote:
>
>
> Le 05/08/2025 à 08:27, Saket Kumar Bhaskar a écrit :
> > LLVM generates bpf_addr_space_cast instruction while translating
> > pointers between native (zero) address space and
> > __attribute__((address_space(N))). The addr_space=1 is reserved as
> > bpf_arena address space.
> >
> > rY = addr_space_cast(rX, 0, 1) is processed by the verifier and
> > converted to normal 32-bit move: wX = wY.
> >
> > rY = addr_space_cast(rX, 1, 0) : used to convert a bpf arena pointer to
> > a pointer in the userspace vma. This has to be converted by the JIT.
> >
> > Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
> > ---
> > arch/powerpc/net/bpf_jit.h | 1 +
> > arch/powerpc/net/bpf_jit_comp.c | 6 ++++++
> > arch/powerpc/net/bpf_jit_comp64.c | 11 +++++++++++
> > 3 files changed, 18 insertions(+)
> >
> > diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> > index 2d095a873305..748e30e8b5b4 100644
> > --- a/arch/powerpc/net/bpf_jit.h
> > +++ b/arch/powerpc/net/bpf_jit.h
> > @@ -165,6 +165,7 @@ struct codegen_context {
> > unsigned int exentry_idx;
> > unsigned int alt_exit_addr;
> > u64 arena_vm_start;
> > + u64 user_vm_start;
> > };
> > #define bpf_to_ppc(r) (ctx->b2p[r])
> > diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> > index 35bfdf4d8785..2b3f90930c27 100644
> > --- a/arch/powerpc/net/bpf_jit_comp.c
> > +++ b/arch/powerpc/net/bpf_jit_comp.c
> > @@ -205,6 +205,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
> > /* Make sure that the stack is quadword aligned. */
> > cgctx.stack_size = round_up(fp->aux->stack_depth, 16);
> > cgctx.arena_vm_start = bpf_arena_get_kern_vm_start(fp->aux->arena);
> > + cgctx.user_vm_start = bpf_arena_get_user_vm_start(fp->aux->arena);
> > /* Scouting faux-generate pass 0 */
> > if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false)) {
> > @@ -441,6 +442,11 @@ bool bpf_jit_supports_kfunc_call(void)
> > return true;
> > }
> > +bool bpf_jit_supports_arena(void)
> > +{
> > + return IS_ENABLED(CONFIG_PPC64);
> > +}
> > +
> > bool bpf_jit_supports_far_kfunc_call(void)
> > {
> > return IS_ENABLED(CONFIG_PPC64);
> > diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> > index 16e62766c757..d4fe4dacf2d6 100644
> > --- a/arch/powerpc/net/bpf_jit_comp64.c
> > +++ b/arch/powerpc/net/bpf_jit_comp64.c
> > @@ -812,6 +812,17 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> > */
> > case BPF_ALU | BPF_MOV | BPF_X: /* (u32) dst = src */
> > case BPF_ALU64 | BPF_MOV | BPF_X: /* dst = src */
> > +
> > + if (insn_is_cast_user(&insn[i])) {
> > + EMIT(PPC_RAW_RLDICL(tmp1_reg, src_reg, 0, 32));
>
> Define and use PPC_RAW_RLDICL_DOT to avoid the CMPDI below.
>
Alright Chris, will define and implement it here.
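Something along these lines (an untested sketch; assuming the record form
is obtained by setting the Rc bit, the LSB, as for other dot-form macros
in ppc-opcode.h):

#define PPC_RAW_RLDICL_DOT(d, a, i, mb)	(PPC_RAW_RLDICL(d, a, i, mb) | 0x1)

	EMIT(PPC_RAW_RLDICL_DOT(tmp1_reg, src_reg, 0, 32));
	PPC_LI64(dst_reg, (ctx->user_vm_start & 0xffffffff00000000UL));
	PPC_BCC_SHORT(COND_EQ, (ctx->idx + 2) * 4);
	EMIT(PPC_RAW_OR(tmp1_reg, dst_reg, tmp1_reg));
	EMIT(PPC_RAW_MR(dst_reg, tmp1_reg));

PPC_LI64 expands to non-record-form instructions, so CR0 set by rldicl.
should survive until the branch.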
> > + PPC_LI64(dst_reg, (ctx->user_vm_start & 0xffffffff00000000UL));
> > + EMIT(PPC_RAW_CMPDI(tmp1_reg, 0));
> > + PPC_BCC_SHORT(COND_EQ, (ctx->idx + 2) * 4);
> > + EMIT(PPC_RAW_OR(tmp1_reg, dst_reg, tmp1_reg));
> > + EMIT(PPC_RAW_MR(dst_reg, tmp1_reg));
> > + break;
> > + }
> > +
> > if (imm == 1) {
> > /* special mov32 for zext */
> > EMIT(PPC_RAW_RLWINM(dst_reg, dst_reg, 0, 0, 31));
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* [bpf-next 4/6] bpf,powerpc: Introduce bpf_jit_emit_atomic_ops() to emit atomic instructions
2025-08-05 6:27 [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Saket Kumar Bhaskar
` (2 preceding siblings ...)
2025-08-05 6:27 ` [bpf-next 3/6] bpf,powerpc: Implement bpf_addr_space_cast instruction Saket Kumar Bhaskar
@ 2025-08-05 6:27 ` Saket Kumar Bhaskar
2025-08-05 6:27 ` [bpf-next 5/6] bpf,powerpc: Implement PROBE_ATOMIC instructions Saket Kumar Bhaskar
` (3 subsequent siblings)
7 siblings, 0 replies; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-05 6:27 UTC (permalink / raw)
To: bpf, linuxppc-dev, linux-kselftest, linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, christophe.leroy, naveen, maddy, mpe, npiggin,
memxor, iii, shuah
The existing code for emitting bpf atomic instruction sequences for
atomic operations such as XCHG, CMPXCHG, ADD, AND, OR, and XOR has been
refactored into a reusable function, bpf_jit_emit_atomic_ops().
It also computes the jump offset and tracks the instruction index of the
JITed LDARX/LWARX, to be used in case it faults.
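The caller is expected to feed these back into the exception table
helper, roughly (as done for PROBE_ATOMIC later in this series):

	ret = bpf_jit_emit_atomic_ops(image, ctx, &insn[i],
				      &jmp_off, &tmp_idx, &addrs[i + 1]);
	...
	ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
				    tmp_idx, jmp_off, dst_reg, code);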
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
---
arch/powerpc/net/bpf_jit_comp64.c | 203 +++++++++++++++++-------------
1 file changed, 115 insertions(+), 88 deletions(-)
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index d4fe4dacf2d6..6a85cd847075 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -423,6 +423,111 @@ asm (
" blr ;"
);
+static int bpf_jit_emit_atomic_ops(u32 *image, struct codegen_context *ctx,
+ const struct bpf_insn *insn, u32 *jmp_off,
+ u32 *tmp_idx, u32 *addrp)
+{
+ u32 tmp1_reg = bpf_to_ppc(TMP_REG_1);
+ u32 tmp2_reg = bpf_to_ppc(TMP_REG_2);
+ u32 size = BPF_SIZE(insn->code);
+ u32 src_reg = bpf_to_ppc(insn->src_reg);
+ u32 dst_reg = bpf_to_ppc(insn->dst_reg);
+ s32 imm = insn->imm;
+
+ u32 save_reg = tmp2_reg;
+ u32 ret_reg = src_reg;
+ u32 fixup_idx;
+
+ /* Get offset into TMP_REG_1 */
+ EMIT(PPC_RAW_LI(tmp1_reg, insn->off));
+ /*
+ * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
+ * before and after the operation.
+ *
+ * This is a requirement in the Linux Kernel Memory Model.
+ * See __cmpxchg_u64() in asm/cmpxchg.h as an example.
+ */
+ if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
+ EMIT(PPC_RAW_SYNC());
+
+ *tmp_idx = ctx->idx;
+
+ /* load value from memory into TMP_REG_2 */
+ if (size == BPF_DW)
+ EMIT(PPC_RAW_LDARX(tmp2_reg, tmp1_reg, dst_reg, 0));
+ else
+ EMIT(PPC_RAW_LWARX(tmp2_reg, tmp1_reg, dst_reg, 0));
+ /* Save old value in _R0 */
+ if (imm & BPF_FETCH)
+ EMIT(PPC_RAW_MR(_R0, tmp2_reg));
+
+ switch (imm) {
+ case BPF_ADD:
+ case BPF_ADD | BPF_FETCH:
+ EMIT(PPC_RAW_ADD(tmp2_reg, tmp2_reg, src_reg));
+ break;
+ case BPF_AND:
+ case BPF_AND | BPF_FETCH:
+ EMIT(PPC_RAW_AND(tmp2_reg, tmp2_reg, src_reg));
+ break;
+ case BPF_OR:
+ case BPF_OR | BPF_FETCH:
+ EMIT(PPC_RAW_OR(tmp2_reg, tmp2_reg, src_reg));
+ break;
+ case BPF_XOR:
+ case BPF_XOR | BPF_FETCH:
+ EMIT(PPC_RAW_XOR(tmp2_reg, tmp2_reg, src_reg));
+ break;
+ case BPF_CMPXCHG:
+ /*
+ * Return old value in BPF_REG_0 for BPF_CMPXCHG &
+ * in src_reg for other cases.
+ */
+ ret_reg = bpf_to_ppc(BPF_REG_0);
+
+ /* Compare with old value in BPF_R0 */
+ if (size == BPF_DW)
+ EMIT(PPC_RAW_CMPD(bpf_to_ppc(BPF_REG_0), tmp2_reg));
+ else
+ EMIT(PPC_RAW_CMPW(bpf_to_ppc(BPF_REG_0), tmp2_reg));
+ /* Don't set if different from old value */
+ PPC_BCC_SHORT(COND_NE, (ctx->idx + 3) * 4);
+ fallthrough;
+ case BPF_XCHG:
+ save_reg = src_reg;
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ /* store new value */
+ if (size == BPF_DW)
+ EMIT(PPC_RAW_STDCX(save_reg, tmp1_reg, dst_reg));
+ else
+ EMIT(PPC_RAW_STWCX(save_reg, tmp1_reg, dst_reg));
+ /* we're done if this succeeded */
+ PPC_BCC_SHORT(COND_NE, *tmp_idx * 4);
+ fixup_idx = ctx->idx;
+
+ if (imm & BPF_FETCH) {
+ /* Emit 'sync' to enforce full ordering */
+ if (IS_ENABLED(CONFIG_SMP))
+ EMIT(PPC_RAW_SYNC());
+ EMIT(PPC_RAW_MR(ret_reg, _R0));
+ /*
+ * Skip unnecessary zero-extension for 32-bit cmpxchg.
+ * For context, see commit 39491867ace5.
+ */
+ if (size != BPF_DW && imm == BPF_CMPXCHG &&
+ insn_is_zext(insn + 1))
+ *addrp = ctx->idx * 4;
+ }
+
+ *jmp_off = (fixup_idx - *tmp_idx) * 4;
+
+ return 0;
+}
+
static int bpf_jit_emit_probe_mem_store(struct codegen_context *ctx, u32 src_reg, s16 off,
u32 code, u32 *image)
{
@@ -538,7 +643,6 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
u32 size = BPF_SIZE(code);
u32 tmp1_reg = bpf_to_ppc(TMP_REG_1);
u32 tmp2_reg = bpf_to_ppc(TMP_REG_2);
- u32 save_reg, ret_reg;
s16 off = insn[i].off;
s32 imm = insn[i].imm;
bool func_addr_fixed;
@@ -546,6 +650,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
u64 imm64;
u32 true_cond;
u32 tmp_idx;
+ u32 jmp_off;
/*
* addrs[] maps a BPF bytecode address into a real offset from
@@ -1081,93 +1186,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
return -EOPNOTSUPP;
}
- save_reg = tmp2_reg;
- ret_reg = src_reg;
-
- /* Get offset into TMP_REG_1 */
- EMIT(PPC_RAW_LI(tmp1_reg, off));
- /*
- * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
- * before and after the operation.
- *
- * This is a requirement in the Linux Kernel Memory Model.
- * See __cmpxchg_u64() in asm/cmpxchg.h as an example.
- */
- if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
- EMIT(PPC_RAW_SYNC());
- tmp_idx = ctx->idx * 4;
- /* load value from memory into TMP_REG_2 */
- if (size == BPF_DW)
- EMIT(PPC_RAW_LDARX(tmp2_reg, tmp1_reg, dst_reg, 0));
- else
- EMIT(PPC_RAW_LWARX(tmp2_reg, tmp1_reg, dst_reg, 0));
-
- /* Save old value in _R0 */
- if (imm & BPF_FETCH)
- EMIT(PPC_RAW_MR(_R0, tmp2_reg));
-
- switch (imm) {
- case BPF_ADD:
- case BPF_ADD | BPF_FETCH:
- EMIT(PPC_RAW_ADD(tmp2_reg, tmp2_reg, src_reg));
- break;
- case BPF_AND:
- case BPF_AND | BPF_FETCH:
- EMIT(PPC_RAW_AND(tmp2_reg, tmp2_reg, src_reg));
- break;
- case BPF_OR:
- case BPF_OR | BPF_FETCH:
- EMIT(PPC_RAW_OR(tmp2_reg, tmp2_reg, src_reg));
- break;
- case BPF_XOR:
- case BPF_XOR | BPF_FETCH:
- EMIT(PPC_RAW_XOR(tmp2_reg, tmp2_reg, src_reg));
- break;
- case BPF_CMPXCHG:
- /*
- * Return old value in BPF_REG_0 for BPF_CMPXCHG &
- * in src_reg for other cases.
- */
- ret_reg = bpf_to_ppc(BPF_REG_0);
-
- /* Compare with old value in BPF_R0 */
- if (size == BPF_DW)
- EMIT(PPC_RAW_CMPD(bpf_to_ppc(BPF_REG_0), tmp2_reg));
- else
- EMIT(PPC_RAW_CMPW(bpf_to_ppc(BPF_REG_0), tmp2_reg));
- /* Don't set if different from old value */
- PPC_BCC_SHORT(COND_NE, (ctx->idx + 3) * 4);
- fallthrough;
- case BPF_XCHG:
- save_reg = src_reg;
- break;
- default:
- pr_err_ratelimited(
- "eBPF filter atomic op code %02x (@%d) unsupported\n",
- code, i);
- return -EOPNOTSUPP;
- }
-
- /* store new value */
- if (size == BPF_DW)
- EMIT(PPC_RAW_STDCX(save_reg, tmp1_reg, dst_reg));
- else
- EMIT(PPC_RAW_STWCX(save_reg, tmp1_reg, dst_reg));
- /* we're done if this succeeded */
- PPC_BCC_SHORT(COND_NE, tmp_idx);
-
- if (imm & BPF_FETCH) {
- /* Emit 'sync' to enforce full ordering */
- if (IS_ENABLED(CONFIG_SMP))
- EMIT(PPC_RAW_SYNC());
- EMIT(PPC_RAW_MR(ret_reg, _R0));
- /*
- * Skip unnecessary zero-extension for 32-bit cmpxchg.
- * For context, see commit 39491867ace5.
- */
- if (size != BPF_DW && imm == BPF_CMPXCHG &&
- insn_is_zext(&insn[i + 1]))
- addrs[++i] = ctx->idx * 4;
+ ret = bpf_jit_emit_atomic_ops(image, ctx, &insn[i],
+ &jmp_off, &tmp_idx, &addrs[i + 1]);
+ if (ret) {
+ if (ret == -EOPNOTSUPP) {
+ pr_err_ratelimited(
+ "eBPF filter atomic op code %02x (@%d) unsupported\n",
+ code, i);
+ }
+ return ret;
}
break;
--
2.43.5
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [bpf-next 5/6] bpf,powerpc: Implement PROBE_ATOMIC instructions
2025-08-05 6:27 [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Saket Kumar Bhaskar
` (3 preceding siblings ...)
2025-08-05 6:27 ` [bpf-next 4/6] bpf,powerpc: Introduce bpf_jit_emit_atomic_ops() to emit atomic instructions Saket Kumar Bhaskar
@ 2025-08-05 6:27 ` Saket Kumar Bhaskar
2025-08-05 6:27 ` [bpf-next 6/6] selftests/bpf: Fix arena_spin_lock selftest failure Saket Kumar Bhaskar
` (2 subsequent siblings)
7 siblings, 0 replies; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-05 6:27 UTC (permalink / raw)
To: bpf, linuxppc-dev, linux-kselftest, linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, christophe.leroy, naveen, maddy, mpe, npiggin,
memxor, iii, shuah
powerpc supports BPF atomic operations using a loop around
Load-And-Reserve (LDARX/LWARX) and Store-Conditional (STDCX/STWCX)
instructions, gated by sync instructions to enforce full ordering.

To implement arena atomics, the arena VM start address is added to
dst_reg, which is then used for both the LDARX/LWARX and STDCX/STWCX
instructions. Further, an exception table entry is added for the
LDARX/LWARX instruction so that execution lands after the loop on a
fault. At the end of the sequence, dst_reg is restored by subtracting
the arena VM start address.
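For reference, the emitted sequence looks roughly like this (a sketch,
not the exact JIT output):

	add	dst_reg, dst_reg, arena_vm_start  # rebase into kernel VA
	...
loop:	ldarx/lwarx  tmp2, off, dst_reg           # extable entry lands past the loop on fault
	<atomic op on tmp2>
	stdcx./stwcx.  save, off, dst_reg
	bne-	loop
	...
	sub	dst_reg, dst_reg, arena_vm_start  # restore original dst_reg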
bpf_jit_supports_insn() is introduced to selectively enable instruction
support as in other architectures like x86 and arm64.
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
---
arch/powerpc/net/bpf_jit_comp.c | 16 ++++++++++++++++
arch/powerpc/net/bpf_jit_comp64.c | 26 ++++++++++++++++++++++++++
2 files changed, 42 insertions(+)
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 2b3f90930c27..69232ee56c6a 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -452,6 +452,22 @@ bool bpf_jit_supports_far_kfunc_call(void)
return IS_ENABLED(CONFIG_PPC64);
}
+bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena)
+{
+ if (!in_arena)
+ return true;
+ switch (insn->code) {
+ case BPF_STX | BPF_ATOMIC | BPF_H:
+ case BPF_STX | BPF_ATOMIC | BPF_B:
+ case BPF_STX | BPF_ATOMIC | BPF_W:
+ case BPF_STX | BPF_ATOMIC | BPF_DW:
+ if (bpf_atomic_is_load_store(insn))
+ return false;
+ return IS_ENABLED(CONFIG_PPC64);
+ }
+ return true;
+}
+
void *arch_alloc_bpf_trampoline(unsigned int size)
{
return bpf_prog_pack_alloc(size, bpf_jit_fill_ill_insns);
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 6a85cd847075..8931bded97f4 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -1164,6 +1164,32 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
break;
+ /*
+ * BPF_STX PROBE_ATOMIC (arena atomic ops)
+ */
+ case BPF_STX | BPF_PROBE_ATOMIC | BPF_W:
+ case BPF_STX | BPF_PROBE_ATOMIC | BPF_DW:
+ EMIT(PPC_RAW_ADD(dst_reg, dst_reg, bpf_to_ppc(ARENA_VM_START)));
+ ret = bpf_jit_emit_atomic_ops(image, ctx, &insn[i],
+ &jmp_off, &tmp_idx, &addrs[i + 1]);
+ if (ret) {
+ if (ret == -EOPNOTSUPP) {
+ pr_err_ratelimited(
+ "eBPF filter atomic op code %02x (@%d) unsupported\n",
+ code, i);
+ }
+ return ret;
+ }
+ /* LDARX/LWARX should land here on exception. */
+ ret = bpf_add_extable_entry(fp, image, fimage, pass, ctx,
+ tmp_idx, jmp_off, dst_reg, code);
+ if (ret)
+ return ret;
+
+ /* Retrieve the dst_reg */
+ EMIT(PPC_RAW_SUB(dst_reg, dst_reg, bpf_to_ppc(ARENA_VM_START)));
+ break;
+
/*
* BPF_STX ATOMIC (atomic ops)
*/
--
2.43.5
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [bpf-next 6/6] selftests/bpf: Fix arena_spin_lock selftest failure
2025-08-05 6:27 [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Saket Kumar Bhaskar
` (4 preceding siblings ...)
2025-08-05 6:27 ` [bpf-next 5/6] bpf,powerpc: Implement PROBE_ATOMIC instructions Saket Kumar Bhaskar
@ 2025-08-05 6:27 ` Saket Kumar Bhaskar
2025-08-07 22:21 ` Alexei Starovoitov
2025-08-05 7:45 ` [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Christophe Leroy
2025-08-05 12:07 ` Venkat Rao Bagalkote
7 siblings, 1 reply; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-05 6:27 UTC (permalink / raw)
To: bpf, linuxppc-dev, linux-kselftest, linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, christophe.leroy, naveen, maddy, mpe, npiggin,
memxor, iii, shuah
For systems with CONFIG_NR_CPUS set to > 1024 in the kernel config,
the selftest fails even when the number of online CPUs is lower. For
example, on powerpc the default value of CONFIG_NR_CPUS is 8192.

get_nprocs() is used in the test driver code to get the number of
available CPUs, which is passed to the BPF program via rodata. The
selftest is also skipped in case the BPF program returns EOPNOTSUPP,
with a descriptive message logged.
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
---
.../bpf/prog_tests/arena_spin_lock.c | 23 +++++++++++++++++--
.../selftests/bpf/progs/arena_spin_lock.c | 8 ++++++-
.../selftests/bpf/progs/bpf_arena_spin_lock.h | 4 +---
3 files changed, 29 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/arena_spin_lock.c b/tools/testing/selftests/bpf/prog_tests/arena_spin_lock.c
index 0223fce4db2b..fa0b4f0240a3 100644
--- a/tools/testing/selftests/bpf/prog_tests/arena_spin_lock.c
+++ b/tools/testing/selftests/bpf/prog_tests/arena_spin_lock.c
@@ -40,8 +40,13 @@ static void *spin_lock_thread(void *arg)
err = bpf_prog_test_run_opts(prog_fd, &topts);
ASSERT_OK(err, "test_run err");
+
+ if (topts.retval == -EOPNOTSUPP)
+ goto end;
+
ASSERT_EQ((int)topts.retval, 0, "test_run retval");
+end:
pthread_exit(arg);
}
@@ -60,9 +65,16 @@ static void test_arena_spin_lock_size(int size)
return;
}
- skel = arena_spin_lock__open_and_load();
- if (!ASSERT_OK_PTR(skel, "arena_spin_lock__open_and_load"))
+ skel = arena_spin_lock__open();
+ if (!ASSERT_OK_PTR(skel, "arena_spin_lock__open"))
return;
+
+ skel->rodata->nr_cpus = get_nprocs();
+
+ err = arena_spin_lock__load(skel);
+ if (!ASSERT_OK(err, "arena_spin_lock__load"))
+ goto end;
+
if (skel->data->test_skip == 2) {
test__skip();
goto end;
@@ -86,6 +98,13 @@ static void test_arena_spin_lock_size(int size)
goto end_barrier;
}
+ if (skel->data->test_skip == 2) {
+ printf("%s:SKIP: %d online CPUs exceed the maximum supported by arena spinlock\n",
+ __func__, get_nprocs());
+ test__skip();
+ goto end_barrier;
+ }
+
ASSERT_EQ(skel->bss->counter, repeat * nthreads, "check counter value");
end_barrier:
diff --git a/tools/testing/selftests/bpf/progs/arena_spin_lock.c b/tools/testing/selftests/bpf/progs/arena_spin_lock.c
index c4500c37f85e..9ed5a3281fd4 100644
--- a/tools/testing/selftests/bpf/progs/arena_spin_lock.c
+++ b/tools/testing/selftests/bpf/progs/arena_spin_lock.c
@@ -4,6 +4,9 @@
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
+
+const volatile int nr_cpus;
+
#include "bpf_arena_spin_lock.h"
struct {
@@ -37,8 +40,11 @@ int prog(void *ctx)
#if defined(ENABLE_ATOMICS_TESTS) && defined(__BPF_FEATURE_ADDR_SPACE_CAST)
unsigned long flags;
- if ((ret = arena_spin_lock_irqsave(&lock, flags)))
+ if ((ret = arena_spin_lock_irqsave(&lock, flags))) {
+ if (ret == -EOPNOTSUPP)
+ test_skip = 2;
return ret;
+ }
if (counter != limit)
counter++;
bpf_repeat(cs_count);
diff --git a/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h b/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h
index d67466c1ff77..752131161315 100644
--- a/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h
+++ b/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h
@@ -20,8 +20,6 @@
#define __arena __attribute__((address_space(1)))
#endif
-extern unsigned long CONFIG_NR_CPUS __kconfig;
-
/*
* Typically, we'd just rely on the definition in vmlinux.h for qspinlock, but
* PowerPC overrides the definition to define lock->val as u32 instead of
@@ -494,7 +492,7 @@ static __always_inline int arena_spin_lock(arena_spinlock_t __arena *lock)
{
int val = 0;
- if (CONFIG_NR_CPUS > 1024)
+ if (nr_cpus > 1024)
return -EOPNOTSUPP;
bpf_preempt_disable();
--
2.43.5
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [bpf-next 6/6] selftests/bpf: Fix arena_spin_lock selftest failure
2025-08-05 6:27 ` [bpf-next 6/6] selftests/bpf: Fix arena_spin_lock selftest failure Saket Kumar Bhaskar
@ 2025-08-07 22:21 ` Alexei Starovoitov
2025-08-08 15:28 ` Saket Kumar Bhaskar
0 siblings, 1 reply; 23+ messages in thread
From: Alexei Starovoitov @ 2025-08-07 22:21 UTC (permalink / raw)
To: Saket Kumar Bhaskar
Cc: bpf, ppc-dev, open list:KERNEL SELFTEST FRAMEWORK, LKML,
Hari Bathini, sachinpb, Venkat Rao Bagalkote, Andrii Nakryiko,
Eduard, Mykola Lysenko, Alexei Starovoitov, Daniel Borkmann,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Christophe Leroy, Naveen N. Rao, Madhavan Srinivasan,
Michael Ellerman, Nicholas Piggin, Kumar Kartikeya Dwivedi,
Ilya Leoshkevich, Shuah Khan
On Mon, Aug 4, 2025 at 11:29 PM Saket Kumar Bhaskar <skb99@linux.ibm.com> wrote:
>
> @@ -60,9 +65,16 @@ static void test_arena_spin_lock_size(int size)
> return;
> }
>
> - skel = arena_spin_lock__open_and_load();
> - if (!ASSERT_OK_PTR(skel, "arena_spin_lock__open_and_load"))
> + skel = arena_spin_lock__open();
> + if (!ASSERT_OK_PTR(skel, "arena_spin_lock__open"))
> return;
> +
> + skel->rodata->nr_cpus = get_nprocs();
...
> --- a/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h
> +++ b/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h
> @@ -20,8 +20,6 @@
> #define __arena __attribute__((address_space(1)))
> #endif
>
> -extern unsigned long CONFIG_NR_CPUS __kconfig;
> -
> /*
> * Typically, we'd just rely on the definition in vmlinux.h for qspinlock, but
> * PowerPC overrides the definition to define lock->val as u32 instead of
> @@ -494,7 +492,7 @@ static __always_inline int arena_spin_lock(arena_spinlock_t __arena *lock)
> {
> int val = 0;
>
> - if (CONFIG_NR_CPUS > 1024)
> + if (nr_cpus > 1024)
> return -EOPNOTSUPP;
We cannot do this. It will make arena_spin_lock much harder to use.
BPF CI doesn't run on powerpc anyway, but you can document that this
test is disabled by creating selftests/bpf/DENYLIST.powerpc.
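For example, a minimal DENYLIST.powerpc could look like this (a sketch,
assuming the same one-test-per-line format as the existing
DENYLIST.s390x):

arena_spin_lock # default CONFIG_NR_CPUS on powerpc exceeds arena spinlock's 1024-CPU limit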
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [bpf-next 6/6] selftests/bpf: Fix arena_spin_lock selftest failure
2025-08-07 22:21 ` Alexei Starovoitov
@ 2025-08-08 15:28 ` Saket Kumar Bhaskar
2025-08-08 16:27 ` Alexei Starovoitov
0 siblings, 1 reply; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-08 15:28 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: bpf, ppc-dev, open list:KERNEL SELFTEST FRAMEWORK, LKML,
Hari Bathini, sachinpb, Venkat Rao Bagalkote, Andrii Nakryiko,
Eduard, Mykola Lysenko, Alexei Starovoitov, Daniel Borkmann,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Christophe Leroy, Naveen N. Rao, Madhavan Srinivasan,
Michael Ellerman, Nicholas Piggin, Kumar Kartikeya Dwivedi,
Ilya Leoshkevich, Shuah Khan
On Thu, Aug 07, 2025 at 03:21:42PM -0700, Alexei Starovoitov wrote:
> On Mon, Aug 4, 2025 at 11:29 PM Saket Kumar Bhaskar <skb99@linux.ibm.com> wrote:
> >
> > @@ -60,9 +65,16 @@ static void test_arena_spin_lock_size(int size)
> > return;
> > }
> >
> > - skel = arena_spin_lock__open_and_load();
> > - if (!ASSERT_OK_PTR(skel, "arena_spin_lock__open_and_load"))
> > + skel = arena_spin_lock__open();
> > + if (!ASSERT_OK_PTR(skel, "arena_spin_lock__open"))
> > return;
> > +
> > + skel->rodata->nr_cpus = get_nprocs();
>
> ...
>
> > --- a/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h
> > +++ b/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h
> > @@ -20,8 +20,6 @@
> > #define __arena __attribute__((address_space(1)))
> > #endif
> >
> > -extern unsigned long CONFIG_NR_CPUS __kconfig;
> > -
> > /*
> > * Typically, we'd just rely on the definition in vmlinux.h for qspinlock, but
> > * PowerPC overrides the definition to define lock->val as u32 instead of
> > @@ -494,7 +492,7 @@ static __always_inline int arena_spin_lock(arena_spinlock_t __arena *lock)
> > {
> > int val = 0;
> >
> > - if (CONFIG_NR_CPUS > 1024)
> > + if (nr_cpus > 1024)
> > return -EOPNOTSUPP;
>
> We cannot do this. It will make arena_spin_lock much harder to use.
> BPF CI doesn't run on powerpc anyway, but you can document that this
> test is disabled by creating selftests/bpf/DENYLIST.powerpc.
Hi Alexei,
Sorry, I did not get it. Can you please help me understand why it
makes arena_spin_lock harder to use?
Thanks,
Saket
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [bpf-next 6/6] selftests/bpf: Fix arena_spin_lock selftest failure
2025-08-08 15:28 ` Saket Kumar Bhaskar
@ 2025-08-08 16:27 ` Alexei Starovoitov
0 siblings, 0 replies; 23+ messages in thread
From: Alexei Starovoitov @ 2025-08-08 16:27 UTC (permalink / raw)
To: Saket Kumar Bhaskar
Cc: bpf, ppc-dev, open list:KERNEL SELFTEST FRAMEWORK, LKML,
Hari Bathini, sachinpb, Venkat Rao Bagalkote, Andrii Nakryiko,
Eduard, Mykola Lysenko, Alexei Starovoitov, Daniel Borkmann,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Christophe Leroy, Naveen N. Rao, Madhavan Srinivasan,
Michael Ellerman, Nicholas Piggin, Kumar Kartikeya Dwivedi,
Ilya Leoshkevich, Shuah Khan
On Fri, Aug 8, 2025 at 8:29 AM Saket Kumar Bhaskar <skb99@linux.ibm.com> wrote:
>
> On Thu, Aug 07, 2025 at 03:21:42PM -0700, Alexei Starovoitov wrote:
> > On Mon, Aug 4, 2025 at 11:29 PM Saket Kumar Bhaskar <skb99@linux.ibm.com> wrote:
> > >
> > > @@ -60,9 +65,16 @@ static void test_arena_spin_lock_size(int size)
> > > return;
> > > }
> > >
> > > - skel = arena_spin_lock__open_and_load();
> > > - if (!ASSERT_OK_PTR(skel, "arena_spin_lock__open_and_load"))
> > > + skel = arena_spin_lock__open();
> > > + if (!ASSERT_OK_PTR(skel, "arena_spin_lock__open"))
> > > return;
> > > +
> > > + skel->rodata->nr_cpus = get_nprocs();
> >
> > ...
> >
> > > --- a/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h
> > > +++ b/tools/testing/selftests/bpf/progs/bpf_arena_spin_lock.h
> > > @@ -20,8 +20,6 @@
> > > #define __arena __attribute__((address_space(1)))
> > > #endif
> > >
> > > -extern unsigned long CONFIG_NR_CPUS __kconfig;
> > > -
> > > /*
> > > * Typically, we'd just rely on the definition in vmlinux.h for qspinlock, but
> > > * PowerPC overrides the definition to define lock->val as u32 instead of
> > > @@ -494,7 +492,7 @@ static __always_inline int arena_spin_lock(arena_spinlock_t __arena *lock)
> > > {
> > > int val = 0;
> > >
> > > - if (CONFIG_NR_CPUS > 1024)
> > > + if (nr_cpus > 1024)
> > > return -EOPNOTSUPP;
> >
> > We cannot do this. It will make arena_spin_lock much harder to use.
> > BPF CI doesn't run on powerpc anyway, but you can document that this
> > test is disabled by creating selftests/bpf/DENYLIST.powerpc.
> Hi Alexei,
> Sorry, I did not get it. Can you please help me understand why it
> makes arena_spin_lock harder to use?
because requiring user space to do
skel->rodata->nr_cpus = get_nprocs()
is a headache.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics
2025-08-05 6:27 [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Saket Kumar Bhaskar
` (5 preceding siblings ...)
2025-08-05 6:27 ` [bpf-next 6/6] selftests/bpf: Fix arena_spin_lock selftest failure Saket Kumar Bhaskar
@ 2025-08-05 7:45 ` Christophe Leroy
2025-08-07 10:26 ` Saket Kumar Bhaskar
2025-08-05 12:07 ` Venkat Rao Bagalkote
7 siblings, 1 reply; 23+ messages in thread
From: Christophe Leroy @ 2025-08-05 7:45 UTC (permalink / raw)
To: Saket Kumar Bhaskar, bpf, linuxppc-dev, linux-kselftest,
linux-kernel
Cc: hbathini, sachinpb, venkat88, andrii, eddyz87, mykolal, ast,
daniel, martin.lau, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii,
shuah
Le 05/08/2025 à 08:27, Saket Kumar Bhaskar a écrit :
> This patch series introduces support for the PROBE_MEM32,
> bpf_addr_space_cast and PROBE_ATOMIC instructions in the powerpc BPF JIT,
> facilitating the implementation of BPF arena and arena atomics.
This series seems to be limited to powerpc64. Please make it explicit in
all patch subjects; see the example below:
$ git log --oneline arch/powerpc/net/bpf_jit_comp64.c
cf2a6de32cabb (tag: powerpc-6.17-2, origin/next-test, origin/next)
powerpc64/bpf: Add jit support for load_acquire and store_release
59ba025948be2 powerpc/bpf: fix JIT code size calculation of bpf trampoline
d243b62b7bd3d powerpc64/bpf: Add support for bpf trampolines
9670f6d2097c4 powerpc64/bpf: Fold bpf_jit_emit_func_call_hlp() into
bpf_jit_emit_func_call_rel()
fde318326daa4 powerpc64/bpf: jit support for signed division and modulo
597b1710982d1 powerpc64/bpf: jit support for sign extended mov
717756c9c8dda powerpc64/bpf: jit support for sign extended load
a71c0b09a14db powerpc64/bpf: jit support for unconditional byte swap
3c086ce222cef powerpc64/bpf: jit support for 32bit offset jmp instruction
b1e7cee961274 powerpc/bpf: enforce full ordering for ATOMIC operations
with BPF_FETCH
61688a82e047a powerpc/bpf: enable kfunc call
>
> The last patch in the series has fix for arena spinlock selftest
> failure.
>
> This series is rebased on top of:
> https://lore.kernel.org/bpf/20250717202935.29018-2-puranjay@kernel.org/
>
> All selftests related to bpf_arena, bpf_arena_atomic(except
> load_acquire/store_release) enablement are passing:
>
> # ./test_progs -t arena_list
> #5/1 arena_list/arena_list_1:OK
> #5/2 arena_list/arena_list_1000:OK
> #5 arena_list:OK
> Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
>
> # ./test_progs -t arena_htab
> #4/1 arena_htab/arena_htab_llvm:OK
> #4/2 arena_htab/arena_htab_asm:OK
> #4 arena_htab:OK
> Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
>
> # ./test_progs -t verifier_arena
> #464/1 verifier_arena/basic_alloc1:OK
> #464/2 verifier_arena/basic_alloc2:OK
> #464/3 verifier_arena/basic_alloc3:OK
> #464/4 verifier_arena/iter_maps1:OK
> #464/5 verifier_arena/iter_maps2:OK
> #464/6 verifier_arena/iter_maps3:OK
> #464 verifier_arena:OK
> #465/1 verifier_arena_large/big_alloc1:OK
> #465/2 verifier_arena_large/big_alloc2:OK
> #465 verifier_arena_large:OK
> Summary: 2/8 PASSED, 0 SKIPPED, 0 FAILED
>
> # ./test_progs -t arena_atomics
> #3/1 arena_atomics/add:OK
> #3/2 arena_atomics/sub:OK
> #3/3 arena_atomics/and:OK
> #3/4 arena_atomics/or:OK
> #3/5 arena_atomics/xor:OK
> #3/6 arena_atomics/cmpxchg:OK
> #3/7 arena_atomics/xchg:OK
> #3/8 arena_atomics/uaf:OK
> #3/9 arena_atomics/load_acquire:SKIP
> #3/10 arena_atomics/store_release:SKIP
> #3 arena_atomics:OK (SKIP: 2/10)
> Summary: 1/8 PASSED, 2 SKIPPED, 0 FAILED
>
> All selftests related to arena_spin_lock are passing:
>
> # ./test_progs -t arena_spin_lock
> #6/1 arena_spin_lock/arena_spin_lock_1:OK
> #6/2 arena_spin_lock/arena_spin_lock_1000:OK
> #6/3 arena_spin_lock/arena_spin_lock_50000:OK
> #6 arena_spin_lock:OK
> Summary: 1/3 PASSED, 0 SKIPPED, 0 FAILED
>
> Saket Kumar Bhaskar (6):
> bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store
> instructions
> bpf,powerpc: Implement PROBE_MEM32 pseudo instructions
> bpf,powerpc: Implement bpf_addr_space_cast instruction
> bpf,powerpc: Introduce bpf_jit_emit_atomic_ops() to emit atomic
> instructions
> bpf,powerpc: Implement PROBE_ATOMIC instructions
> selftests/bpf: Fix arena_spin_lock selftest failure
>
> arch/powerpc/net/bpf_jit.h | 6 +-
> arch/powerpc/net/bpf_jit_comp.c | 32 +-
> arch/powerpc/net/bpf_jit_comp32.c | 2 +-
> arch/powerpc/net/bpf_jit_comp64.c | 378 +++++++++++++-----
> .../bpf/prog_tests/arena_spin_lock.c | 23 +-
> .../selftests/bpf/progs/arena_spin_lock.c | 8 +-
> .../selftests/bpf/progs/bpf_arena_spin_lock.h | 4 +-
> 7 files changed, 348 insertions(+), 105 deletions(-)
>
> base-commit: ea2aecdf7a954a8c0015e185cc870c4191d1d93f
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics
2025-08-05 7:45 ` [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Christophe Leroy
@ 2025-08-07 10:26 ` Saket Kumar Bhaskar
0 siblings, 0 replies; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-07 10:26 UTC (permalink / raw)
To: Christophe Leroy
Cc: bpf, linuxppc-dev, linux-kselftest, linux-kernel, hbathini,
sachinpb, venkat88, andrii, eddyz87, mykolal, ast, daniel,
martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, naveen, maddy, mpe, npiggin, memxor, iii, shuah
On Tue, Aug 05, 2025 at 09:45:39AM +0200, Christophe Leroy wrote:
>
>
> Le 05/08/2025 à 08:27, Saket Kumar Bhaskar a écrit :
> > This patch series introduces support for the PROBE_MEM32,
> > bpf_addr_space_cast and PROBE_ATOMIC instructions in the powerpc BPF JIT,
> > facilitating the implementation of BPF arena and arena atomics.
>
> This series seems to be limited to powerpc64. Please make it explicit in all
> patch subjects; see the example below:
>
> $ git log --oneline arch/powerpc/net/bpf_jit_comp64.c
> cf2a6de32cabb (tag: powerpc-6.17-2, origin/next-test, origin/next)
> powerpc64/bpf: Add jit support for load_acquire and store_release
> 59ba025948be2 powerpc/bpf: fix JIT code size calculation of bpf trampoline
> d243b62b7bd3d powerpc64/bpf: Add support for bpf trampolines
> 9670f6d2097c4 powerpc64/bpf: Fold bpf_jit_emit_func_call_hlp() into
> bpf_jit_emit_func_call_rel()
> fde318326daa4 powerpc64/bpf: jit support for signed division and modulo
> 597b1710982d1 powerpc64/bpf: jit support for sign extended mov
> 717756c9c8dda powerpc64/bpf: jit support for sign extended load
> a71c0b09a14db powerpc64/bpf: jit support for unconditional byte swap
> 3c086ce222cef powerpc64/bpf: jit support for 32bit offset jmp instruction
> b1e7cee961274 powerpc/bpf: enforce full ordering for ATOMIC operations with
> BPF_FETCH
> 61688a82e047a powerpc/bpf: enable kfunc call
>
Chris, will keep this in mind while sending v2.
>
> >
> > The last patch in the series has fix for arena spinlock selftest
> > failure.
> >
> > This series is rebased on top of:
> > https://lore.kernel.org/bpf/20250717202935.29018-2-puranjay@kernel.org/
> >
> > All selftests related to bpf_arena, bpf_arena_atomic(except
> > load_acquire/store_release) enablement are passing:
> >
> > # ./test_progs -t arena_list
> > #5/1 arena_list/arena_list_1:OK
> > #5/2 arena_list/arena_list_1000:OK
> > #5 arena_list:OK
> > Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
> >
> > # ./test_progs -t arena_htab
> > #4/1 arena_htab/arena_htab_llvm:OK
> > #4/2 arena_htab/arena_htab_asm:OK
> > #4 arena_htab:OK
> > Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
> >
> > # ./test_progs -t verifier_arena
> > #464/1 verifier_arena/basic_alloc1:OK
> > #464/2 verifier_arena/basic_alloc2:OK
> > #464/3 verifier_arena/basic_alloc3:OK
> > #464/4 verifier_arena/iter_maps1:OK
> > #464/5 verifier_arena/iter_maps2:OK
> > #464/6 verifier_arena/iter_maps3:OK
> > #464 verifier_arena:OK
> > #465/1 verifier_arena_large/big_alloc1:OK
> > #465/2 verifier_arena_large/big_alloc2:OK
> > #465 verifier_arena_large:OK
> > Summary: 2/8 PASSED, 0 SKIPPED, 0 FAILED
> >
> > # ./test_progs -t arena_atomics
> > #3/1 arena_atomics/add:OK
> > #3/2 arena_atomics/sub:OK
> > #3/3 arena_atomics/and:OK
> > #3/4 arena_atomics/or:OK
> > #3/5 arena_atomics/xor:OK
> > #3/6 arena_atomics/cmpxchg:OK
> > #3/7 arena_atomics/xchg:OK
> > #3/8 arena_atomics/uaf:OK
> > #3/9 arena_atomics/load_acquire:SKIP
> > #3/10 arena_atomics/store_release:SKIP
> > #3 arena_atomics:OK (SKIP: 2/10)
> > Summary: 1/8 PASSED, 2 SKIPPED, 0 FAILED
> >
> > All selftests related to arena_spin_lock are passing:
> >
> > # ./test_progs -t arena_spin_lock
> > #6/1 arena_spin_lock/arena_spin_lock_1:OK
> > #6/2 arena_spin_lock/arena_spin_lock_1000:OK
> > #6/3 arena_spin_lock/arena_spin_lock_50000:OK
> > #6 arena_spin_lock:OK
> > Summary: 1/3 PASSED, 0 SKIPPED, 0 FAILED
> >
> > Saket Kumar Bhaskar (6):
> > bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store
> > instructions
> > bpf,powerpc: Implement PROBE_MEM32 pseudo instructions
> > bpf,powerpc: Implement bpf_addr_space_cast instruction
> > bpf,powerpc: Introduce bpf_jit_emit_atomic_ops() to emit atomic
> > instructions
> > bpf,powerpc: Implement PROBE_ATOMIC instructions
> > selftests/bpf: Fix arena_spin_lock selftest failure
> >
> > arch/powerpc/net/bpf_jit.h | 6 +-
> > arch/powerpc/net/bpf_jit_comp.c | 32 +-
> > arch/powerpc/net/bpf_jit_comp32.c | 2 +-
> > arch/powerpc/net/bpf_jit_comp64.c | 378 +++++++++++++-----
> > .../bpf/prog_tests/arena_spin_lock.c | 23 +-
> > .../selftests/bpf/progs/arena_spin_lock.c | 8 +-
> > .../selftests/bpf/progs/bpf_arena_spin_lock.h | 4 +-
> > 7 files changed, 348 insertions(+), 105 deletions(-)
> >
> > base-commit: ea2aecdf7a954a8c0015e185cc870c4191d1d93f
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics
2025-08-05 6:27 [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Saket Kumar Bhaskar
` (6 preceding siblings ...)
2025-08-05 7:45 ` [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics Christophe Leroy
@ 2025-08-05 12:07 ` Venkat Rao Bagalkote
2025-08-07 13:17 ` Saket Kumar Bhaskar
7 siblings, 1 reply; 23+ messages in thread
From: Venkat Rao Bagalkote @ 2025-08-05 12:07 UTC (permalink / raw)
To: Saket Kumar Bhaskar, bpf, linuxppc-dev, linux-kselftest,
linux-kernel
Cc: hbathini, sachinpb, andrii, eddyz87, mykolal, ast, daniel,
martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, christophe.leroy, naveen, maddy, mpe, npiggin,
memxor, iii, shuah
On 05/08/25 11:57 am, Saket Kumar Bhaskar wrote:
> This patch series introduces support for the PROBE_MEM32,
> bpf_addr_space_cast and PROBE_ATOMIC instructions in the powerpc BPF JIT,
> facilitating the implementation of BPF arena and arena atomics.
>
> The last patch in the series has fix for arena spinlock selftest
> failure.
>
> This series is rebased on top of:
> https://lore.kernel.org/bpf/20250717202935.29018-2-puranjay@kernel.org/
>
> All selftests related to bpf_arena, bpf_arena_atomic(except
> load_acquire/store_release) enablement are passing:
Hello Saket,
I see a couple of selftests failing on my setup.
>
> # ./test_progs -t arena_list
> #5/1 arena_list/arena_list_1:OK
> #5/2 arena_list/arena_list_1000:OK
> #5 arena_list:OK
> Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
>
> # ./test_progs -t arena_htab
> #4/1 arena_htab/arena_htab_llvm:OK
> #4/2 arena_htab/arena_htab_asm:OK
> #4 arena_htab:OK
> Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
>
> # ./test_progs -t verifier_arena
> #464/1 verifier_arena/basic_alloc1:OK
> #464/2 verifier_arena/basic_alloc2:OK
> #464/3 verifier_arena/basic_alloc3:OK
> #464/4 verifier_arena/iter_maps1:OK
> #464/5 verifier_arena/iter_maps2:OK
> #464/6 verifier_arena/iter_maps3:OK
> #464 verifier_arena:OK
> #465/1 verifier_arena_large/big_alloc1:OK
> #465/2 verifier_arena_large/big_alloc2:OK
> #465 verifier_arena_large:OK
> Summary: 2/8 PASSED, 0 SKIPPED, 0 FAILED
All error logs:
tester_init:PASS:tester_log_buf 0 nsec
process_subtest:PASS:obj_open_mem 0 nsec
process_subtest:PASS:specs_alloc 0 nsec
run_subtest:PASS:obj_open_mem 0 nsec
run_subtest:PASS:unexpected_load_failure 0 nsec
do_prog_test_run:PASS:bpf_prog_test_run 0 nsec
run_subtest:FAIL:1103 Unexpected retval: 4 != 0
#513/7 verifier_arena/reserve_invalid_region:FAIL
#513 verifier_arena:FAIL
Summary: 1/14 PASSED, 0 SKIPPED, 1 FAILED
>
> # ./test_progs -t arena_atomics
> #3/1 arena_atomics/add:OK
> #3/2 arena_atomics/sub:OK
> #3/3 arena_atomics/and:OK
> #3/4 arena_atomics/or:OK
> #3/5 arena_atomics/xor:OK
> #3/6 arena_atomics/cmpxchg:OK
> #3/7 arena_atomics/xchg:OK
> #3/8 arena_atomics/uaf:OK
> #3/9 arena_atomics/load_acquire:SKIP
> #3/10 arena_atomics/store_release:SKIP
> #3 arena_atomics:OK (SKIP: 2/10)
> Summary: 1/8 PASSED, 2 SKIPPED, 0 FAILED
>
> All selftests related to arena_spin_lock are passing:
>
> # ./test_progs -t arena_spin_lock
> #6/1 arena_spin_lock/arena_spin_lock_1:OK
> #6/2 arena_spin_lock/arena_spin_lock_1000:OK
> #6/3 arena_spin_lock/arena_spin_lock_50000:OK
> #6 arena_spin_lock:OK
> Summary: 1/3 PASSED, 0 SKIPPED, 0 FAILED
test_arena_spin_lock_size:FAIL:check counter value unexpected check
counter value: actual 15999 != expected 16000
#6/1 arena_spin_lock/arena_spin_lock_1:FAIL
#6 arena_spin_lock:FAIL
Summary: 0/2 PASSED, 0 SKIPPED, 1 FAILED
> Saket Kumar Bhaskar (6):
> bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store
> instructions
> bpf,powerpc: Implement PROBE_MEM32 pseudo instructions
> bpf,powerpc: Implement bpf_addr_space_cast instruction
> bpf,powerpc: Introduce bpf_jit_emit_atomic_ops() to emit atomic
> instructions
> bpf,powerpc: Implement PROBE_ATOMIC instructions
> selftests/bpf: Fix arena_spin_lock selftest failure
>
> arch/powerpc/net/bpf_jit.h | 6 +-
> arch/powerpc/net/bpf_jit_comp.c | 32 +-
> arch/powerpc/net/bpf_jit_comp32.c | 2 +-
> arch/powerpc/net/bpf_jit_comp64.c | 378 +++++++++++++-----
> .../bpf/prog_tests/arena_spin_lock.c | 23 +-
> .../selftests/bpf/progs/arena_spin_lock.c | 8 +-
> .../selftests/bpf/progs/bpf_arena_spin_lock.h | 4 +-
> 7 files changed, 348 insertions(+), 105 deletions(-)
>
> base-commit: ea2aecdf7a954a8c0015e185cc870c4191d1d93f
Regards,
Venkat.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [bpf-next 0/6] bpf,powerpc: Add support for bpf arena and arena atomics
2025-08-05 12:07 ` Venkat Rao Bagalkote
@ 2025-08-07 13:17 ` Saket Kumar Bhaskar
0 siblings, 0 replies; 23+ messages in thread
From: Saket Kumar Bhaskar @ 2025-08-07 13:17 UTC (permalink / raw)
To: Venkat Rao Bagalkote
Cc: bpf, linuxppc-dev, linux-kselftest, linux-kernel, hbathini,
sachinpb, andrii, eddyz87, mykolal, ast, daniel, martin.lau, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
christophe.leroy, naveen, maddy, mpe, npiggin, memxor, iii, shuah
On Tue, Aug 05, 2025 at 05:37:00PM +0530, Venkat Rao Bagalkote wrote:
>
> On 05/08/25 11:57 am, Saket Kumar Bhaskar wrote:
> > This patch series introduces support for the PROBE_MEM32,
> > bpf_addr_space_cast and PROBE_ATOMIC instructions in the powerpc BPF JIT,
> > facilitating the implementation of BPF arena and arena atomics.
> >
> > The last patch in the series has fix for arena spinlock selftest
> > failure.
> >
> > This series is rebased on top of:
> > https://lore.kernel.org/bpf/20250717202935.29018-2-puranjay@kernel.org/
> >
> > All selftests related to bpf_arena, bpf_arena_atomic(except
> > load_acquire/store_release) enablement are passing:
>
>
> Hello Saket,
>
>
> I see couple of selftests are failing on my set up.
>
> >
> > # ./test_progs -t arena_list
> > #5/1 arena_list/arena_list_1:OK
> > #5/2 arena_list/arena_list_1000:OK
> > #5 arena_list:OK
> > Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
> >
> > # ./test_progs -t arena_htab
> > #4/1 arena_htab/arena_htab_llvm:OK
> > #4/2 arena_htab/arena_htab_asm:OK
> > #4 arena_htab:OK
> > Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
> >
> > # ./test_progs -t verifier_arena
> > #464/1 verifier_arena/basic_alloc1:OK
> > #464/2 verifier_arena/basic_alloc2:OK
> > #464/3 verifier_arena/basic_alloc3:OK
> > #464/4 verifier_arena/iter_maps1:OK
> > #464/5 verifier_arena/iter_maps2:OK
> > #464/6 verifier_arena/iter_maps3:OK
> > #464 verifier_arena:OK
> > #465/1 verifier_arena_large/big_alloc1:OK
> > #465/2 verifier_arena_large/big_alloc2:OK
> > #465 verifier_arena_large:OK
> > Summary: 2/8 PASSED, 0 SKIPPED, 0 FAILED
>
>
> All error logs:
> tester_init:PASS:tester_log_buf 0 nsec
> process_subtest:PASS:obj_open_mem 0 nsec
> process_subtest:PASS:specs_alloc 0 nsec
> run_subtest:PASS:obj_open_mem 0 nsec
> run_subtest:PASS:unexpected_load_failure 0 nsec
> do_prog_test_run:PASS:bpf_prog_test_run 0 nsec
> run_subtest:FAIL:1103 Unexpected retval: 4 != 0
> #513/7 verifier_arena/reserve_invalid_region:FAIL
> #513 verifier_arena:FAIL
> Summary: 1/14 PASSED, 0 SKIPPED, 1 FAILED
>
>
Hi Venkat,
It is a known failure. This selftest was added recently and we are
working on a fix, which will be posted separately.
> >
> > # ./test_progs -t arena_atomics
> > #3/1 arena_atomics/add:OK
> > #3/2 arena_atomics/sub:OK
> > #3/3 arena_atomics/and:OK
> > #3/4 arena_atomics/or:OK
> > #3/5 arena_atomics/xor:OK
> > #3/6 arena_atomics/cmpxchg:OK
> > #3/7 arena_atomics/xchg:OK
> > #3/8 arena_atomics/uaf:OK
> > #3/9 arena_atomics/load_acquire:SKIP
> > #3/10 arena_atomics/store_release:SKIP
> > #3 arena_atomics:OK (SKIP: 2/10)
> > Summary: 1/8 PASSED, 2 SKIPPED, 0 FAILED
> >
> > All selftests related to arena_spin_lock are passing:
> >
> > # ./test_progs -t arena_spin_lock
> > #6/1 arena_spin_lock/arena_spin_lock_1:OK
> > #6/2 arena_spin_lock/arena_spin_lock_1000:OK
> > #6/3 arena_spin_lock/arena_spin_lock_50000:OK
> > #6 arena_spin_lock:OK
> > Summary: 1/3 PASSED, 0 SKIPPED, 0 FAILED
> test_arena_spin_lock_size:FAIL:check counter value unexpected check counter
> value: actual 15999 != expected 16000
> #6/1 arena_spin_lock/arena_spin_lock_1:FAIL
> #6 arena_spin_lock:FAIL
> Summary: 0/2 PASSED, 0 SKIPPED, 1 FAILED
This one too is known to us: with llvm-19 the failure occurs because llvm
lacks support for the may_goto insn
(https://github.com/llvm/llvm-project/commit/0e0bfacff71859d1f9212205f8f873d47029d3fb).
There is an else branch that is invoked when llvm doesn't support may_goto,
which we are looking into.
Since llvm-20 supports may_goto, the failure is not seen there (the selftest
passes), so we plan to fix this in a separate patch for llvm-19 for now.
Regards,
Saket
> > Saket Kumar Bhaskar (6):
> > bpf,powerpc: Introduce bpf_jit_emit_probe_mem_store() to emit store
> > instructions
> > bpf,powerpc: Implement PROBE_MEM32 pseudo instructions
> > bpf,powerpc: Implement bpf_addr_space_cast instruction
> > bpf,powerpc: Introduce bpf_jit_emit_atomic_ops() to emit atomic
> > instructions
> > bpf,powerpc: Implement PROBE_ATOMIC instructions
> > selftests/bpf: Fix arena_spin_lock selftest failure
> >
> > arch/powerpc/net/bpf_jit.h | 6 +-
> > arch/powerpc/net/bpf_jit_comp.c | 32 +-
> > arch/powerpc/net/bpf_jit_comp32.c | 2 +-
> > arch/powerpc/net/bpf_jit_comp64.c | 378 +++++++++++++-----
> > .../bpf/prog_tests/arena_spin_lock.c | 23 +-
> > .../selftests/bpf/progs/arena_spin_lock.c | 8 +-
> > .../selftests/bpf/progs/bpf_arena_spin_lock.h | 4 +-
> > 7 files changed, 348 insertions(+), 105 deletions(-)
> >
> > base-commit: ea2aecdf7a954a8c0015e185cc870c4191d1d93f
>
>
> Regards,
>
> Venkat.
>
^ permalink raw reply [flat|nested] 23+ messages in thread