qemu-devel.nongnu.org archive mirror
* [PATCH 0/4] ppc: improve some memory ordering issues
@ 2022-05-19 13:59 Nicholas Piggin
  2022-05-19 13:59 ` [PATCH 1/4] target/ppc: Fix eieio memory ordering semantics Nicholas Piggin
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Nicholas Piggin @ 2022-05-19 13:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Richard Henderson

Since the RFC[*], this fixes a compile issue noticed by Richard
and has survived some basic stress testing with mttcg.

Thanks,
Nick

[*] https://lists.nongnu.org/archive/html/qemu-ppc/2022-05/msg00046.html

Nicholas Piggin (4):
  target/ppc: Fix eieio memory ordering semantics
  tcg/ppc: ST_ST memory ordering is not provided with eieio
  tcg/ppc: Optimize memory ordering generation with lwsync
  target/ppc: Implement lwsync with weaker memory ordering

 target/ppc/cpu.h         |  4 +++-
 target/ppc/cpu_init.c    | 13 +++++++------
 target/ppc/machine.c     |  3 ++-
 target/ppc/translate.c   | 35 +++++++++++++++++++++++++++++++++--
 tcg/ppc/tcg-target.c.inc | 11 ++++++-----
 5 files changed, 51 insertions(+), 15 deletions(-)

-- 
2.35.1




* [PATCH 1/4] target/ppc: Fix eieio memory ordering semantics
  2022-05-19 13:59 [PATCH 0/4] ppc: improve some memory ordering issues Nicholas Piggin
@ 2022-05-19 13:59 ` Nicholas Piggin
  2022-05-19 15:30   ` Richard Henderson
  2022-05-19 13:59 ` [PATCH 2/4] tcg/ppc: ST_ST memory ordering is not provided with eieio Nicholas Piggin
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Nicholas Piggin @ 2022-05-19 13:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Richard Henderson

The generated eieio memory ordering semantics do not match the
instruction definition in the architecture. Add a big comment to
explain this strange instruction and correct the memory ordering
behaviour.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
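Not part of the commit: a minimal sketch, in plain C, of the two ordering
sets described in the comment this patch adds. The types and function names
(Access, in_set1, in_set2, eieio_orders) are hypothetical and exist only for
this illustration. The sketch also ignores the serialization of cache and
TLB/SLB management instructions that the comment lists.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these exist in QEMU. */
typedef enum { ACC_LOAD, ACC_STORE } AccessKind;
typedef enum { MEM_CI, MEM_WT, MEM_CACHEABLE } MemAttr;

typedef struct {
    AccessKind kind;
    MemAttr attr;
} Access;

/* Set 1: loads from CI memory, stores to CI memory, stores to WT memory. */
static bool in_set1(Access a)
{
    return a.attr == MEM_CI || (a.attr == MEM_WT && a.kind == ACC_STORE);
}

/* Set 2: stores to cacheable memory. */
static bool in_set2(Access a)
{
    return a.attr == MEM_CACHEABLE && a.kind == ACC_STORE;
}

/*
 * True if, under this simplified model, an eieio between the two accesses
 * orders them. Ordering only applies when both accesses are in the same set.
 */
static bool eieio_orders(Access before, Access after)
{
    assert(before.attr <= MEM_CACHEABLE && after.attr <= MEM_CACHEABLE);
    return (in_set1(before) && in_set1(after)) ||
           (in_set2(before) && in_set2(after));
}
```

Under this model a store to CI memory followed by a store to cacheable
memory is not ordered by eieio, because the two accesses fall into
different sets.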
 target/ppc/translate.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index fa34f81c30..eb42f7e459 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -3513,7 +3513,32 @@ static void gen_stswx(DisasContext *ctx)
 /* eieio */
 static void gen_eieio(DisasContext *ctx)
 {
-    TCGBar bar = TCG_MO_LD_ST;
+    TCGBar bar = TCG_MO_ALL;
+
+    /*
+     * eieio has complex semantics. It provides memory ordering between
+     * operations in the set:
+     * - loads from CI memory.
+     * - stores to CI memory.
+     * - stores to WT memory.
+     *
+     * It separately also orders memory for operations in the set:
+     * - stores to cacheable memory.
+     *
+     * It also serializes instructions:
+     * - dcbt and dcbst.
+     *
+     * It separately serializes:
+     * - tlbie and tlbsync.
+     *
+     * And separately serializes:
+     * - slbieg, slbiag, and slbsync.
+     *
+     * The end result is that CI memory ordering requires TCG_MO_ALL
+     * and it is not possible to special-case more relaxed ordering for
+     * cacheable accesses. TCG_BAR_SC is required to provide this
+     * serialization.
+     */
 
     /*
      * POWER9 has a eieio instruction variant using bit 6 as a hint to
-- 
2.35.1




* [PATCH 2/4] tcg/ppc: ST_ST memory ordering is not provided with eieio
  2022-05-19 13:59 [PATCH 0/4] ppc: improve some memory ordering issues Nicholas Piggin
  2022-05-19 13:59 ` [PATCH 1/4] target/ppc: Fix eieio memory ordering semantics Nicholas Piggin
@ 2022-05-19 13:59 ` Nicholas Piggin
  2022-05-19 13:59 ` [PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync Nicholas Piggin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Nicholas Piggin @ 2022-05-19 13:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Richard Henderson

eieio does not provide ordering between stores to CI memory and stores
to cacheable memory so it can't be used as a general ST_ST barrier.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tcg/ppc/tcg-target.c.inc | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index cfcd121f9c..3ff845d063 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -1836,8 +1836,6 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
     a0 &= TCG_MO_ALL;
     if (a0 == TCG_MO_LD_LD) {
         insn = LWSYNC;
-    } else if (a0 == TCG_MO_ST_ST) {
-        insn = EIEIO;
     }
     tcg_out32(s, insn);
 }
-- 
2.35.1




* [PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync
  2022-05-19 13:59 [PATCH 0/4] ppc: improve some memory ordering issues Nicholas Piggin
  2022-05-19 13:59 ` [PATCH 1/4] target/ppc: Fix eieio memory ordering semantics Nicholas Piggin
  2022-05-19 13:59 ` [PATCH 2/4] tcg/ppc: ST_ST memory ordering is not provided with eieio Nicholas Piggin
@ 2022-05-19 13:59 ` Nicholas Piggin
  2022-05-19 15:30   ` Richard Henderson
  2022-05-19 13:59 ` [PATCH 4/4] target/ppc: Implement lwsync with weaker memory ordering Nicholas Piggin
  2022-05-23 19:24 ` [PATCH 0/4] ppc: improve some memory ordering issues Daniel Henrique Barboza
  4 siblings, 1 reply; 9+ messages in thread
From: Nicholas Piggin @ 2022-05-19 13:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Richard Henderson

lwsync orders more than just LD_LD: it provides every ordering except
ST_LD. Importantly, this matches the default memory ordering of x86
and s390.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
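Not part of the commit: a self-contained sketch of the selection rule the
diff below implements, for readers who want to try it outside QEMU. The
MO_* and BAR_SC constants are local stand-ins for the TCG_MO_* and
TCG_BAR_SC bits with illustrative values, and HostBarrier and
pick_host_barrier are hypothetical names. The assert is a sketch-only
precondition and is not in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the TCG_MO_* / TCG_BAR_* bits (illustrative values). */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
    BAR_SC   = 0x30,
};

/* Hypothetical result type standing in for the emitted instruction. */
typedef enum { HOST_LWSYNC, HOST_HWSYNC } HostBarrier;

static HostBarrier pick_host_barrier(uint32_t a0)
{
    /* Sketch-only precondition: no bits outside the known masks. */
    assert((a0 & ~(uint32_t)(MO_ALL | BAR_SC)) == 0);

    /* Only store-to-load ordering needs the heavyweight sync. */
    if (a0 & MO_ST_LD) {
        return HOST_HWSYNC;
    }
    /* Every other combination is covered by lwsync. */
    return HOST_LWSYNC;
}
```

With this rule a request for MO_LD_LD | MO_LD_ST | MO_ST_ST selects lwsync,
while any mask containing MO_ST_LD selects the full sync.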
 tcg/ppc/tcg-target.c.inc | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 3ff845d063..c0a5bca34f 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -1832,11 +1832,14 @@ static void tcg_out_brcond2 (TCGContext *s, const TCGArg *args,
 
 static void tcg_out_mb(TCGContext *s, TCGArg a0)
 {
-    uint32_t insn = HWSYNC;
-    a0 &= TCG_MO_ALL;
-    if (a0 == TCG_MO_LD_LD) {
+    uint32_t insn;
+
+    if (a0 & TCG_MO_ST_LD) {
+        insn = HWSYNC;
+    } else {
         insn = LWSYNC;
     }
+
     tcg_out32(s, insn);
 }
 
-- 
2.35.1




* [PATCH 4/4] target/ppc: Implement lwsync with weaker memory ordering
  2022-05-19 13:59 [PATCH 0/4] ppc: improve some memory ordering issues Nicholas Piggin
                   ` (2 preceding siblings ...)
  2022-05-19 13:59 ` [PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync Nicholas Piggin
@ 2022-05-19 13:59 ` Nicholas Piggin
  2022-05-19 15:34   ` Richard Henderson
  2022-05-23 19:24 ` [PATCH 0/4] ppc: improve some memory ordering issues Daniel Henrique Barboza
  4 siblings, 1 reply; 9+ messages in thread
From: Nicholas Piggin @ 2022-05-19 13:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Richard Henderson

This allows an x86 host to no-op lwsyncs, and a ppc host to use lwsync
rather than sync.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
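Not part of the commit: a standalone sketch of how the barrier mask is
chosen in gen_sync after this patch. The MO_* constants are local stand-ins
for the TCG_MO_* bits with illustrative values, and sync_l_field and
sync_barrier_mask are hypothetical helper names. The cpu_has_lwsync
parameter stands in for the PPC2_MEM_LWSYNC check on insns_flags2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the TCG_MO_* bits (illustrative values). */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
};

/* Extract the L field the same way gen_sync does. */
static uint32_t sync_l_field(uint32_t opcode)
{
    return (opcode >> 21) & 3;
}

/* Choose the ordering mask for a sync variant. */
static uint32_t sync_barrier_mask(uint32_t l, bool cpu_has_lwsync)
{
    assert(l <= 3); /* L is a two-bit field */

    if (l == 1 && cpu_has_lwsync) {
        /* lwsync: everything except store-to-load ordering. */
        return MO_LD_LD | MO_LD_ST | MO_ST_ST;
    }
    /* Any other variant, or a CPU without lwsync, gets full ordering. */
    return MO_ALL;
}
```

On a CPU model without the flag, L = 1 still yields the full mask, so
behaviour there is unchanged.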
 target/ppc/cpu.h       |  4 +++-
 target/ppc/cpu_init.c  | 13 +++++++------
 target/ppc/machine.c   |  3 ++-
 target/ppc/translate.c |  8 +++++++-
 4 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 48596cfb25..b9b2536394 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2271,6 +2271,8 @@ enum {
     PPC2_ISA300        = 0x0000000000080000ULL,
     /* POWER ISA 3.1                                                         */
     PPC2_ISA310        = 0x0000000000100000ULL,
+    /*   lwsync instruction                                                  */
+    PPC2_MEM_LWSYNC    = 0x0000000000200000ULL,
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
                         PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
@@ -2279,7 +2281,7 @@ enum {
                         PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \
                         PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \
                         PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206 | \
-                        PPC2_ISA300 | PPC2_ISA310)
+                        PPC2_ISA300 | PPC2_ISA310 | PPC2_MEM_LWSYNC)
 };
 
 /*****************************************************************************/
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 527ad40fcb..0f891afa04 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -5769,7 +5769,7 @@ POWERPC_FAMILY(970)(ObjectClass *oc, void *data)
                        PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
                        PPC_64B | PPC_ALTIVEC |
                        PPC_SEGMENT_64B | PPC_SLBI;
-    pcc->insns_flags2 = PPC2_FP_CVT_S64;
+    pcc->insns_flags2 = PPC2_FP_CVT_S64 | PPC2_MEM_LWSYNC;
     pcc->msr_mask = (1ull << MSR_SF) |
                     (1ull << MSR_VR) |
                     (1ull << MSR_POW) |
@@ -5846,7 +5846,7 @@ POWERPC_FAMILY(POWER5P)(ObjectClass *oc, void *data)
                        PPC_64B |
                        PPC_POPCNTB |
                        PPC_SEGMENT_64B | PPC_SLBI;
-    pcc->insns_flags2 = PPC2_FP_CVT_S64;
+    pcc->insns_flags2 = PPC2_FP_CVT_S64 | PPC2_MEM_LWSYNC;
     pcc->msr_mask = (1ull << MSR_SF) |
                     (1ull << MSR_VR) |
                     (1ull << MSR_POW) |
@@ -5985,7 +5985,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
                         PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
                         PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206 |
                         PPC2_FP_TST_ISA206 | PPC2_FP_CVT_S64 |
-                        PPC2_PM_ISA206;
+                        PPC2_PM_ISA206 | PPC2_MEM_LWSYNC;
     pcc->msr_mask = (1ull << MSR_SF) |
                     (1ull << MSR_VR) |
                     (1ull << MSR_VSX) |
@@ -6159,7 +6159,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
                         PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
                         PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
                         PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
-                        PPC2_TM | PPC2_PM_ISA206;
+                        PPC2_TM | PPC2_PM_ISA206 | PPC2_MEM_LWSYNC;
     pcc->msr_mask = (1ull << MSR_SF) |
                     (1ull << MSR_HV) |
                     (1ull << MSR_TM) |
@@ -6379,7 +6379,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
                         PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
                         PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
                         PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
-                        PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
+                        PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_MEM_LWSYNC;
     pcc->msr_mask = (1ull << MSR_SF) |
                     (1ull << MSR_HV) |
                     (1ull << MSR_TM) |
@@ -6596,7 +6596,8 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
                         PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
                         PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
                         PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
-                        PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310;
+                        PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310 |
+                        PPC2_MEM_LWSYNC;
     pcc->msr_mask = (1ull << MSR_SF) |
                     (1ull << MSR_HV) |
                     (1ull << MSR_TM) |
diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index 7104a5c67e..a7d9036c09 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -157,7 +157,8 @@ static int cpu_pre_save(void *opaque)
         | PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206
         | PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207
         | PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207
-        | PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | PPC2_TM;
+        | PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | PPC2_TM
+        | PPC2_MEM_LWSYNC;
 
     env->spr[SPR_LR] = env->lr;
     env->spr[SPR_CTR] = env->ctr;
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index eb42f7e459..1d6daa4608 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4041,8 +4041,13 @@ static void gen_stqcx_(DisasContext *ctx)
 /* sync */
 static void gen_sync(DisasContext *ctx)
 {
+    TCGBar bar = TCG_MO_ALL;
     uint32_t l = (ctx->opcode >> 21) & 3;
 
+    if ((l == 1) && (ctx->insns_flags2 & PPC2_MEM_LWSYNC)) {
+        bar = TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST;
+    }
+
     /*
      * We may need to check for a pending TLB flush.
      *
@@ -4054,7 +4059,8 @@ static void gen_sync(DisasContext *ctx)
     if (((l == 2) || !(ctx->insns_flags & PPC_64B)) && !ctx->pr) {
         gen_check_tlb_flush(ctx, true);
     }
-    tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
+
+    tcg_gen_mb(bar | TCG_BAR_SC);
 }
 
 /* wait */
-- 
2.35.1




* Re: [PATCH 1/4] target/ppc: Fix eieio memory ordering semantics
  2022-05-19 13:59 ` [PATCH 1/4] target/ppc: Fix eieio memory ordering semantics Nicholas Piggin
@ 2022-05-19 15:30   ` Richard Henderson
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2022-05-19 15:30 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel

On 5/19/22 06:59, Nicholas Piggin wrote:
> The generated eieio memory ordering semantics do not match the
> instruction definition in the architecture. Add a big comment to
> explain this strange instruction and correct the memory ordering
> behaviour.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   target/ppc/translate.c | 27 ++++++++++++++++++++++++++-
>   1 file changed, 26 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync
  2022-05-19 13:59 ` [PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync Nicholas Piggin
@ 2022-05-19 15:30   ` Richard Henderson
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2022-05-19 15:30 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel

On 5/19/22 06:59, Nicholas Piggin wrote:
> lwsync orders more than just LD_LD, importantly it matches x86 and
> s390 default memory ordering.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   tcg/ppc/tcg-target.c.inc | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH 4/4] target/ppc: Implement lwsync with weaker memory ordering
  2022-05-19 13:59 ` [PATCH 4/4] target/ppc: Implement lwsync with weaker memory ordering Nicholas Piggin
@ 2022-05-19 15:34   ` Richard Henderson
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2022-05-19 15:34 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel

On 5/19/22 06:59, Nicholas Piggin wrote:
> This allows an x86 host to no-op lwsyncs, and ppc host can use lwsync
> rather than sync.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   target/ppc/cpu.h       |  4 +++-
>   target/ppc/cpu_init.c  | 13 +++++++------
>   target/ppc/machine.c   |  3 ++-
>   target/ppc/translate.c |  8 +++++++-
>   4 files changed, 19 insertions(+), 9 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

to the translate part, and I'll trust you on the set of cpus adjusted.


r~



* Re: [PATCH 0/4] ppc: improve some memory ordering issues
  2022-05-19 13:59 [PATCH 0/4] ppc: improve some memory ordering issues Nicholas Piggin
                   ` (3 preceding siblings ...)
  2022-05-19 13:59 ` [PATCH 4/4] target/ppc: Implement lwsync with weaker memory ordering Nicholas Piggin
@ 2022-05-23 19:24 ` Daniel Henrique Barboza
  4 siblings, 0 replies; 9+ messages in thread
From: Daniel Henrique Barboza @ 2022-05-23 19:24 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel, Richard Henderson

Queued in gitlab.com/danielhb/qemu/tree/ppc-next. Thanks,


Daniel

On 5/19/22 10:59, Nicholas Piggin wrote:
> Since RFC[*], this fixes a compile issue noticed by Richard,
> and has survived some basic stressing with mttcg.
> 
> Thanks,
> Nick
> 
> [*] https://lists.nongnu.org/archive/html/qemu-ppc/2022-05/msg00046.html
> 
> Nicholas Piggin (4):
>    target/ppc: Fix eieio memory ordering semantics
>    tcg/ppc: ST_ST memory ordering is not provided with eieio
>    tcg/ppc: Optimize memory ordering generation with lwsync
>    target/ppc: Implement lwsync with weaker memory ordering
> 
>   target/ppc/cpu.h         |  4 +++-
>   target/ppc/cpu_init.c    | 13 +++++++------
>   target/ppc/machine.c     |  3 ++-
>   target/ppc/translate.c   | 35 +++++++++++++++++++++++++++++++++--
>   tcg/ppc/tcg-target.c.inc | 11 ++++++-----
>   5 files changed, 51 insertions(+), 15 deletions(-)
> 


