* [PATCH bpf v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH
@ 2024-05-13 10:02 Puranjay Mohan
2024-05-13 10:06 ` Christophe Leroy
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Puranjay Mohan @ 2024-05-13 10:02 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Naveen N. Rao, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Aneesh Kumar K.V, Hari Bathini, bpf,
linuxppc-dev, linux-kernel, paulmck
Cc: puranjay12
The Linux Kernel Memory Model [1][2] requires RMW operations that have a
return value to be fully ordered.
BPF atomic operations with BPF_FETCH (including BPF_XCHG and
BPF_CMPXCHG) return a value, so they need to be JITed to fully ordered
operations. The powerpc BPF JIT currently emits relaxed operations for
these.
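For illustration, such a fetching atomic can be constructed at the BPF
instruction level with the BPF_ATOMIC_OP() macro from
include/linux/filter.h (a rough sketch; the registers and offset below
are arbitrary):

  #include <linux/filter.h>

  /*
   * With BPF_FETCH set, the old value of *(u32 *)(dst_reg + off) is
   * returned in src_reg, so the JITed sequence must be fully ordered.
   */
  static const struct bpf_insn fetch_add_example =
          BPF_ATOMIC_OP(BPF_W, BPF_ADD | BPF_FETCH, BPF_REG_1, BPF_REG_2, 0);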
We can show this by running the following litmus-test:
PPC SB+atomic_add+fetch
{
0:r0=x; (* dst reg assuming offset is 0 *)
0:r1=2; (* src reg *)
0:r2=1;
0:r4=y; (* P0 writes to this, P1 reads this *)
0:r5=z; (* P1 writes to this, P0 reads this *)
0:r6=0;
1:r2=1;
1:r4=y;
1:r5=z;
}
P0                     | P1            ;
stw r2, 0(r4)          | stw r2,0(r5)  ;
                       |               ;
loop:lwarx r3, r6, r0  |               ;
mr r8, r3              |               ;
add r3, r3, r1         | sync          ;
stwcx. r3, r6, r0      |               ;
bne loop               |               ;
mr r1, r8              |               ;
                       |               ;
lwa r7, 0(r5)          | lwa r7,0(r4)  ;
~exists(0:r7=0 /\ 1:r7=0)
Witnesses
Positive: 9 Negative: 3
Condition ~exists (0:r7=0 /\ 1:r7=0)
Observation SB+atomic_add+fetch Sometimes 3 9
This test shows that the older store in P0 is reordered with a newer
load to a different address, even though there is an RMW operation
with fetch between them. Adding a sync before and after the RMW fixes
the issue:
Witnesses
Positive: 9 Negative: 0
Condition ~exists (0:r7=0 /\ 1:r7=0)
Observation SB+atomic_add+fetch Never 0 9
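The sync-before/sync-after pattern matches what the kernel itself uses
for fully ordered atomics on powerpc (see __cmpxchg_u64() in
asm/cmpxchg.h). A minimal sketch of that pattern, for illustration only
and not the kernel's actual implementation:

  /* Fully ordered 32-bit fetch-add: the ll/sc loop is bracketed by
   * 'sync', which is what the JIT now emits for BPF_FETCH on SMP.
   */
  static inline int fetch_add_fully_ordered(int *p, int v)
  {
          int old, tmp;

          asm volatile(
          "       sync\n"              /* full barrier before the RMW */
          "1:     lwarx   %0,0,%3\n"   /* old = *p (load-reserve) */
          "       add     %1,%0,%4\n"  /* tmp = old + v */
          "       stwcx.  %1,0,%3\n"   /* try to store tmp back to *p */
          "       bne-    1b\n"        /* reservation lost, retry */
          "       sync\n"              /* full barrier after the RMW */
          : "=&r" (old), "=&r" (tmp), "+m" (*p)
          : "r" (p), "r" (v)
          : "cc", "memory");

          return old;
  }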
[1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
[2] https://www.kernel.org/doc/Documentation/atomic_t.txt
Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
---
Changes in v2 -> v3:
v2: https://lore.kernel.org/all/20240508115404.74823-1-puranjay@kernel.org/
- Emit the sync outside the loop so it doesn't get executed every time.
- Minor coding style changes.
Changes in v1 -> v2:
v1: https://lore.kernel.org/all/20240507175439.119467-1-puranjay@kernel.org/
- Don't emit `sync` for non-SMP kernels as that adds unnecessary overhead.
---
arch/powerpc/net/bpf_jit_comp32.c | 12 ++++++++++++
arch/powerpc/net/bpf_jit_comp64.c | 12 ++++++++++++
2 files changed, 24 insertions(+)
diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 2f39c50ca729..35f64dcfa68e 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -852,6 +852,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
/* Get offset into TMP_REG */
EMIT(PPC_RAW_LI(tmp_reg, off));
+ /*
+ * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
+ * before and after the operation.
+ *
+ * This is a requirement in the Linux Kernel Memory Model.
+ * See __cmpxchg_u32() in asm/cmpxchg.h as an example.
+ */
+ if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
+ EMIT(PPC_RAW_SYNC());
tmp_idx = ctx->idx * 4;
/* load value from memory into r0 */
EMIT(PPC_RAW_LWARX(_R0, tmp_reg, dst_reg, 0));
@@ -905,6 +914,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
/* For the BPF_FETCH variant, get old data into src_reg */
if (imm & BPF_FETCH) {
+ /* Emit 'sync' to enforce full ordering */
+ if (IS_ENABLED(CONFIG_SMP))
+ EMIT(PPC_RAW_SYNC());
EMIT(PPC_RAW_MR(ret_reg, ax_reg));
if (!fp->aux->verifier_zext)
EMIT(PPC_RAW_LI(ret_reg - 1, 0)); /* higher 32-bit */
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 79f23974a320..884eef1b3973 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -803,6 +803,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
/* Get offset into TMP_REG_1 */
EMIT(PPC_RAW_LI(tmp1_reg, off));
+ /*
+ * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
+ * before and after the operation.
+ *
+ * This is a requirement in the Linux Kernel Memory Model.
+ * See __cmpxchg_u64() in asm/cmpxchg.h as an example.
+ */
+ if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
+ EMIT(PPC_RAW_SYNC());
tmp_idx = ctx->idx * 4;
/* load value from memory into TMP_REG_2 */
if (size == BPF_DW)
@@ -865,6 +874,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
PPC_BCC_SHORT(COND_NE, tmp_idx);
if (imm & BPF_FETCH) {
+ /* Emit 'sync' to enforce full ordering */
+ if (IS_ENABLED(CONFIG_SMP))
+ EMIT(PPC_RAW_SYNC());
EMIT(PPC_RAW_MR(ret_reg, _R0));
/*
* Skip unnecessary zero-extension for 32-bit cmpxchg.
--
2.40.1
* Re: [PATCH bpf v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH
2024-05-13 10:02 [PATCH bpf v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH Puranjay Mohan
@ 2024-05-13 10:06 ` Christophe Leroy
2024-05-13 13:44 ` Naveen N Rao
2024-06-03 2:45 ` Michael Ellerman
2 siblings, 0 replies; 6+ messages in thread
From: Christophe Leroy @ 2024-05-13 10:06 UTC (permalink / raw)
To: Puranjay Mohan, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
Hao Luo, Jiri Olsa, Naveen N. Rao, Michael Ellerman,
Nicholas Piggin, Aneesh Kumar K.V, Hari Bathini,
bpf@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-kernel@vger.kernel.org, paulmck@kernel.org
Cc: puranjay12@gmail.com
On 13/05/2024 at 12:02, Puranjay Mohan wrote:
> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
> return value to be fully ordered.
>
> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
> BPF_CMPXCHG) return a value, so they need to be JITed to fully ordered
> operations. The powerpc BPF JIT currently emits relaxed operations for
> these.
>
> We can show this by running the following litmus-test:
>
> PPC SB+atomic_add+fetch
>
> {
> 0:r0=x; (* dst reg assuming offset is 0 *)
> 0:r1=2; (* src reg *)
> 0:r2=1;
> 0:r4=y; (* P0 writes to this, P1 reads this *)
> 0:r5=z; (* P1 writes to this, P0 reads this *)
> 0:r6=0;
>
> 1:r2=1;
> 1:r4=y;
> 1:r5=z;
> }
>
> P0 | P1 ;
> stw r2, 0(r4) | stw r2,0(r5) ;
> | ;
> loop:lwarx r3, r6, r0 | ;
> mr r8, r3 | ;
> add r3, r3, r1 | sync ;
> stwcx. r3, r6, r0 | ;
> bne loop | ;
> mr r1, r8 | ;
> | ;
> lwa r7, 0(r5) | lwa r7,0(r4) ;
>
> ~exists(0:r7=0 /\ 1:r7=0)
>
> Witnesses
> Positive: 9 Negative: 3
> Condition ~exists (0:r7=0 /\ 1:r7=0)
> Observation SB+atomic_add+fetch Sometimes 3 9
>
> This test shows that the older store in P0 is reordered with a newer
> load to a different address, even though there is an RMW operation
> with fetch between them. Adding a sync before and after the RMW fixes
> the issue:
>
> Witnesses
> Positive: 9 Negative: 0
> Condition ~exists (0:r7=0 /\ 1:r7=0)
> Observation SB+atomic_add+fetch Never 0 9
>
> [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
> [2] https://www.kernel.org/doc/Documentation/atomic_t.txt
>
> Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
> Changes in v2 -> v3:
> v2: https://lore.kernel.org/all/20240508115404.74823-1-puranjay@kernel.org/
> - Emit the sync outside the loop so it doesn't get executed every time.
> - Minor coding style changes.
>
> Changes in v1 -> v2:
> v1: https://lore.kernel.org/all/20240507175439.119467-1-puranjay@kernel.org/
> - Don't emit `sync` for non-SMP kernels as that adds unnecessary overhead.
> ---
>
> arch/powerpc/net/bpf_jit_comp32.c | 12 ++++++++++++
> arch/powerpc/net/bpf_jit_comp64.c | 12 ++++++++++++
> 2 files changed, 24 insertions(+)
>
> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> index 2f39c50ca729..35f64dcfa68e 100644
> --- a/arch/powerpc/net/bpf_jit_comp32.c
> +++ b/arch/powerpc/net/bpf_jit_comp32.c
> @@ -852,6 +852,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
>
> /* Get offset into TMP_REG */
> EMIT(PPC_RAW_LI(tmp_reg, off));
> + /*
> + * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
> + * before and after the operation.
> + *
> + * This is a requirement in the Linux Kernel Memory Model.
> + * See __cmpxchg_u32() in asm/cmpxchg.h as an example.
> + */
> + if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
> + EMIT(PPC_RAW_SYNC());
> tmp_idx = ctx->idx * 4;
> /* load value from memory into r0 */
> EMIT(PPC_RAW_LWARX(_R0, tmp_reg, dst_reg, 0));
> @@ -905,6 +914,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
>
> /* For the BPF_FETCH variant, get old data into src_reg */
> if (imm & BPF_FETCH) {
> + /* Emit 'sync' to enforce full ordering */
> + if (IS_ENABLED(CONFIG_SMP))
> + EMIT(PPC_RAW_SYNC());
> EMIT(PPC_RAW_MR(ret_reg, ax_reg));
> if (!fp->aux->verifier_zext)
> EMIT(PPC_RAW_LI(ret_reg - 1, 0)); /* higher 32-bit */
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 79f23974a320..884eef1b3973 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -803,6 +803,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
>
> /* Get offset into TMP_REG_1 */
> EMIT(PPC_RAW_LI(tmp1_reg, off));
> + /*
> + * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
> + * before and after the operation.
> + *
> + * This is a requirement in the Linux Kernel Memory Model.
> + * See __cmpxchg_u64() in asm/cmpxchg.h as an example.
> + */
> + if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
> + EMIT(PPC_RAW_SYNC());
> tmp_idx = ctx->idx * 4;
> /* load value from memory into TMP_REG_2 */
> if (size == BPF_DW)
> @@ -865,6 +874,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
> PPC_BCC_SHORT(COND_NE, tmp_idx);
>
> if (imm & BPF_FETCH) {
> + /* Emit 'sync' to enforce full ordering */
> + if (IS_ENABLED(CONFIG_SMP))
> + EMIT(PPC_RAW_SYNC());
> EMIT(PPC_RAW_MR(ret_reg, _R0));
> /*
> * Skip unnecessary zero-extension for 32-bit cmpxchg.
* Re: [PATCH bpf v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH
2024-05-13 10:02 [PATCH bpf v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH Puranjay Mohan
2024-05-13 10:06 ` Christophe Leroy
@ 2024-05-13 13:44 ` Naveen N Rao
2024-05-13 15:59 ` Puranjay Mohan
2024-06-03 2:45 ` Michael Ellerman
2 siblings, 1 reply; 6+ messages in thread
From: Naveen N Rao @ 2024-05-13 13:44 UTC (permalink / raw)
To: Puranjay Mohan
Cc: Alexei Starovoitov, Song Liu, Stanislav Fomichev, Yonghong Song,
Daniel Borkmann, John Fastabend, Andrii Nakryiko,
Aneesh Kumar K.V, paulmck, Nicholas Piggin, KP Singh,
Hari Bathini, Hao Luo, puranjay12, Martin KaFai Lau, linux-kernel,
Eduard Zingerman, Jiri Olsa, bpf, linuxppc-dev
On Mon, May 13, 2024 at 10:02:48AM GMT, Puranjay Mohan wrote:
> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
> return value to be fully ordered.
>
> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
> BPF_CMPXCHG) return a value, so they need to be JITed to fully ordered
> operations. The powerpc BPF JIT currently emits relaxed operations for
> these.
>
> We can show this by running the following litmus-test:
>
> PPC SB+atomic_add+fetch
>
> {
> 0:r0=x; (* dst reg assuming offset is 0 *)
> 0:r1=2; (* src reg *)
> 0:r2=1;
> 0:r4=y; (* P0 writes to this, P1 reads this *)
> 0:r5=z; (* P1 writes to this, P0 reads this *)
> 0:r6=0;
>
> 1:r2=1;
> 1:r4=y;
> 1:r5=z;
> }
>
> P0 | P1 ;
> stw r2, 0(r4) | stw r2,0(r5) ;
> | ;
> loop:lwarx r3, r6, r0 | ;
> mr r8, r3 | ;
> add r3, r3, r1 | sync ;
> stwcx. r3, r6, r0 | ;
> bne loop | ;
> mr r1, r8 | ;
> | ;
> lwa r7, 0(r5) | lwa r7,0(r4) ;
>
> ~exists(0:r7=0 /\ 1:r7=0)
>
> Witnesses
> Positive: 9 Negative: 3
> Condition ~exists (0:r7=0 /\ 1:r7=0)
> Observation SB+atomic_add+fetch Sometimes 3 9
>
> This test shows that the older store in P0 is reordered with a newer
> load to a different address, even though there is an RMW operation
> with fetch between them. Adding a sync before and after the RMW fixes
> the issue:
>
> Witnesses
> Positive: 9 Negative: 0
> Condition ~exists (0:r7=0 /\ 1:r7=0)
> Observation SB+atomic_add+fetch Never 0 9
>
> [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
> [2] https://www.kernel.org/doc/Documentation/atomic_t.txt
>
> Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")
As I noted in v2, I think that is the wrong commit. This fixes the below
four commits in mainline:
Fixes: aea7ef8a82c0 ("powerpc/bpf/32: add support for BPF_ATOMIC bitwise operations")
Fixes: 2d9206b22743 ("powerpc/bpf/32: Add instructions for atomic_[cmp]xchg")
Fixes: dbe6e2456fb0 ("powerpc/bpf/64: add support for atomic fetch operations")
Fixes: 1e82dfaa7819 ("powerpc/bpf/64: Add instructions for atomic_[cmp]xchg")
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org # v6.0+
I have tested this with test_bpf and test_progs.
Reviewed-by: Naveen N Rao <naveen@kernel.org>
- Naveen
* Re: [PATCH bpf v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH
2024-05-13 13:44 ` Naveen N Rao
@ 2024-05-13 15:59 ` Puranjay Mohan
2024-05-14 1:09 ` Michael Ellerman
0 siblings, 1 reply; 6+ messages in thread
From: Puranjay Mohan @ 2024-05-13 15:59 UTC (permalink / raw)
To: Naveen N Rao
Cc: Hao Luo, linux-kernel, Jiri Olsa, Daniel Borkmann,
Eduard Zingerman, paulmck, linuxppc-dev, John Fastabend,
Alexei Starovoitov, Nicholas Piggin, Andrii Nakryiko,
Aneesh Kumar K.V, Song Liu, Stanislav Fomichev, Yonghong Song,
KP Singh, bpf, Martin KaFai Lau, Hari Bathini
Naveen N Rao <naveen@kernel.org> writes:
> On Mon, May 13, 2024 at 10:02:48AM GMT, Puranjay Mohan wrote:
>> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
>> return value to be fully ordered.
>>
>> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
>> BPF_CMPXCHG) return a value, so they need to be JITed to fully ordered
>> operations. The powerpc BPF JIT currently emits relaxed operations for
>> these.
>>
>> We can show this by running the following litmus-test:
>>
>> PPC SB+atomic_add+fetch
>>
>> {
>> 0:r0=x; (* dst reg assuming offset is 0 *)
>> 0:r1=2; (* src reg *)
>> 0:r2=1;
>> 0:r4=y; (* P0 writes to this, P1 reads this *)
>> 0:r5=z; (* P1 writes to this, P0 reads this *)
>> 0:r6=0;
>>
>> 1:r2=1;
>> 1:r4=y;
>> 1:r5=z;
>> }
>>
>> P0 | P1 ;
>> stw r2, 0(r4) | stw r2,0(r5) ;
>> | ;
>> loop:lwarx r3, r6, r0 | ;
>> mr r8, r3 | ;
>> add r3, r3, r1 | sync ;
>> stwcx. r3, r6, r0 | ;
>> bne loop | ;
>> mr r1, r8 | ;
>> | ;
>> lwa r7, 0(r5) | lwa r7,0(r4) ;
>>
>> ~exists(0:r7=0 /\ 1:r7=0)
>>
>> Witnesses
>> Positive: 9 Negative: 3
>> Condition ~exists (0:r7=0 /\ 1:r7=0)
>> Observation SB+atomic_add+fetch Sometimes 3 9
>>
>> This test shows that the older store in P0 is reordered with a newer
>> load to a different address, even though there is an RMW operation
>> with fetch between them. Adding a sync before and after the RMW fixes
>> the issue:
>>
>> Witnesses
>> Positive: 9 Negative: 0
>> Condition ~exists (0:r7=0 /\ 1:r7=0)
>> Observation SB+atomic_add+fetch Never 0 9
>>
>> [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
>> [2] https://www.kernel.org/doc/Documentation/atomic_t.txt
>>
>> Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")
>
> As I noted in v2, I think that is the wrong commit. This fixes the below
Sorry for missing this. Does this need another version, or will your
message below make it work with the stable process?
> four commits in mainline:
> Fixes: aea7ef8a82c0 ("powerpc/bpf/32: add support for BPF_ATOMIC bitwise operations")
> Fixes: 2d9206b22743 ("powerpc/bpf/32: Add instructions for atomic_[cmp]xchg")
> Fixes: dbe6e2456fb0 ("powerpc/bpf/64: add support for atomic fetch operations")
> Fixes: 1e82dfaa7819 ("powerpc/bpf/64: Add instructions for atomic_[cmp]xchg")
>
> Cc: stable@vger.kernel.org # v6.0+
Thanks,
Puranjay
* Re: [PATCH bpf v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH
2024-05-13 15:59 ` Puranjay Mohan
@ 2024-05-14 1:09 ` Michael Ellerman
0 siblings, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2024-05-14 1:09 UTC (permalink / raw)
To: Puranjay Mohan, Naveen N Rao
Cc: Hao Luo, linux-kernel, Jiri Olsa, Daniel Borkmann,
Eduard Zingerman, paulmck, linuxppc-dev, John Fastabend,
Alexei Starovoitov, Nicholas Piggin, Andrii Nakryiko,
Aneesh Kumar K.V, Song Liu, Stanislav Fomichev, Yonghong Song,
KP Singh, bpf, Martin KaFai Lau, Hari Bathini
Puranjay Mohan <puranjay@kernel.org> writes:
> Naveen N Rao <naveen@kernel.org> writes:
>> On Mon, May 13, 2024 at 10:02:48AM GMT, Puranjay Mohan wrote:
>>> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
>>> return value to be fully ordered.
>>>
>>> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
>>> BPF_CMPXCHG) return a value, so they need to be JITed to fully ordered
>>> operations. The powerpc BPF JIT currently emits relaxed operations for
>>> these.
>>>
>>> We can show this by running the following litmus-test:
>>>
>>> PPC SB+atomic_add+fetch
>>>
>>> {
>>> 0:r0=x; (* dst reg assuming offset is 0 *)
>>> 0:r1=2; (* src reg *)
>>> 0:r2=1;
>>> 0:r4=y; (* P0 writes to this, P1 reads this *)
>>> 0:r5=z; (* P1 writes to this, P0 reads this *)
>>> 0:r6=0;
>>>
>>> 1:r2=1;
>>> 1:r4=y;
>>> 1:r5=z;
>>> }
>>>
>>> P0 | P1 ;
>>> stw r2, 0(r4) | stw r2,0(r5) ;
>>> | ;
>>> loop:lwarx r3, r6, r0 | ;
>>> mr r8, r3 | ;
>>> add r3, r3, r1 | sync ;
>>> stwcx. r3, r6, r0 | ;
>>> bne loop | ;
>>> mr r1, r8 | ;
>>> | ;
>>> lwa r7, 0(r5) | lwa r7,0(r4) ;
>>>
>>> ~exists(0:r7=0 /\ 1:r7=0)
>>>
>>> Witnesses
>>> Positive: 9 Negative: 3
>>> Condition ~exists (0:r7=0 /\ 1:r7=0)
>>> Observation SB+atomic_add+fetch Sometimes 3 9
>>>
>>> This test shows that the older store in P0 is reordered with a newer
>>> load to a different address, even though there is an RMW operation
>>> with fetch between them. Adding a sync before and after the RMW fixes
>>> the issue:
>>>
>>> Witnesses
>>> Positive: 9 Negative: 0
>>> Condition ~exists (0:r7=0 /\ 1:r7=0)
>>> Observation SB+atomic_add+fetch Never 0 9
>>>
>>> [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
>>> [2] https://www.kernel.org/doc/Documentation/atomic_t.txt
>>>
>>> Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")
>>
>> As I noted in v2, I think that is the wrong commit. This fixes the below
>
> Sorry for missing this. Would this need another version or your message
> below will make it work with the stable process?
No need for another version. b4 should pick up those tags, or if not
I'll add them by hand.
cheers
* Re: [PATCH bpf v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH
2024-05-13 10:02 [PATCH bpf v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH Puranjay Mohan
2024-05-13 10:06 ` Christophe Leroy
2024-05-13 13:44 ` Naveen N Rao
@ 2024-06-03 2:45 ` Michael Ellerman
2 siblings, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2024-06-03 2:45 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Naveen N. Rao, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Aneesh Kumar K.V, Hari Bathini, bpf,
linuxppc-dev, linux-kernel, paulmck, Puranjay Mohan
Cc: Puranjay Mohan
On Mon, 13 May 2024 10:02:48 +0000, Puranjay Mohan wrote:
> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
> return value to be fully ordered.
>
> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
> BPF_CMPXCHG) return a value, so they need to be JITed to fully ordered
> operations. The powerpc BPF JIT currently emits relaxed operations for
> these.
>
> [...]
Applied to powerpc/fixes.
[1/1] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH
https://git.kernel.org/powerpc/c/b1e7cee96127468c2483cf10c2899c9b5cf79bf8
cheers