* [RFC PATCH 1/4] target/ppc: Fix eieio memory ordering semantics
@ 2022-05-03 10:33 Nicholas Piggin
2022-05-03 10:33 ` [RFC PATCH 2/4] tcg/ppc: ST_ST memory ordering is not provided with eieio Nicholas Piggin
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Nicholas Piggin @ 2022-05-03 10:33 UTC (permalink / raw)
To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel
The generated eieio memory ordering semantics do not match the
instruction definition in the architecture. Add a big comment to
explain this strange instruction and correct the memory ordering
behaviour.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
target/ppc/translate.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index fa34f81c30..abb8807180 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -3513,7 +3513,31 @@ static void gen_stswx(DisasContext *ctx)
/* eieio */
static void gen_eieio(DisasContext *ctx)
{
- TCGBar bar = TCG_MO_LD_ST;
+ TCGBar bar = TCG_MO_ALL;
+
+ /*
+ * eieio has complex semantics. It provides memory ordering between
+ * operations in the set:
+ * - loads from CI memory.
+ * - stores to CI memory.
+ * - stores to WT memory.
+ *
+ * It separately also orders memory for operations in the set:
+ * - stores to cacheable memory.
+ *
+ * It also serializes instructions:
+ * - dcbt and dcbst.
+ *
+ * It separately serializes:
+ * - tlbie and tlbsync.
+ *
+ * And separately serializes:
+ * - slbieg, slbiag, and slbsync.
+ *
+ * The end result is that CI memory ordering requires TCG_MO_ALL
+ * and it is not possible to special-case more relaxed ordering for
+ * cacheable accesses. TCG_BAR_SC is required to provide the serialization.
+ */
/*
* POWER9 has a eieio instruction variant using bit 6 as a hint to
--
2.35.1
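
The two ordering sets described in the comment above can be sketched as a small standalone model. This is only an illustration of the comment's wording, not QEMU code: the type names, the function names, and the treatment of any access not listed in the comment (for example cacheable loads) as "not ordered by eieio" are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, used only for this illustration. */
typedef enum { ACC_LOAD, ACC_STORE } AccessKind;
typedef enum { MEM_CI, MEM_WT, MEM_CACHEABLE } MemType;
typedef enum { SET_NONE, SET_IO, SET_CACHEABLE_ST } EieioSet;

/* Classify an access into one of the two sets listed in the comment. */
static EieioSet eieio_set(AccessKind kind, MemType mem)
{
    if (mem == MEM_CI) {
        return SET_IO;             /* loads from and stores to CI memory */
    }
    if (kind == ACC_STORE && mem == MEM_WT) {
        return SET_IO;             /* stores to WT memory */
    }
    if (kind == ACC_STORE && mem == MEM_CACHEABLE) {
        return SET_CACHEABLE_ST;   /* stores to cacheable memory */
    }
    return SET_NONE;               /* assumption: not ordered by eieio */
}

/* eieio orders two accesses only if both fall into the same set. */
static bool eieio_orders(AccessKind k1, MemType m1,
                         AccessKind k2, MemType m2)
{
    EieioSet a = eieio_set(k1, m1);
    EieioSet b = eieio_set(k2, m2);
    return a != SET_NONE && a == b;
}

/* Small self-check covering one case per set and one cross-set case. */
static void eieio_model_selftest(void)
{
    assert(eieio_orders(ACC_LOAD, MEM_CI, ACC_STORE, MEM_CI));
    assert(eieio_orders(ACC_STORE, MEM_CACHEABLE, ACC_STORE, MEM_CACHEABLE));
    /* The cross-set case: a CI store and a cacheable store are not ordered. */
    assert(!eieio_orders(ACC_STORE, MEM_CI, ACC_STORE, MEM_CACHEABLE));
}
```

The cross-set case is the one that matters for a translator: because a guest eieio may be ordering CI accesses, and the translation does not know the memory type at that point, the model suggests why the patch falls back to the strongest ordering rather than a store-store barrier.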
* [RFC PATCH 2/4] tcg/ppc: ST_ST memory ordering is not provided with eieio
2022-05-03 10:33 [RFC PATCH 1/4] target/ppc: Fix eieio memory ordering semantics Nicholas Piggin
@ 2022-05-03 10:33 ` Nicholas Piggin
2022-05-03 15:01 ` Richard Henderson
2022-05-03 10:33 ` [RFC PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync Nicholas Piggin
2022-05-03 10:33 ` [RFC PATCH 4/4] target/ppc: Implement lwsync with weaker memory ordering Nicholas Piggin
2 siblings, 1 reply; 6+ messages in thread
From: Nicholas Piggin @ 2022-05-03 10:33 UTC (permalink / raw)
To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel
eieio does not provide ordering between stores to CI memory and stores
to cacheable memory so it can't be used as a general ST_ST barrier.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
tcg/ppc/tcg-target.c.inc | 2 --
1 file changed, 2 deletions(-)
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index cfcd121f9c..3ff845d063 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -1836,8 +1836,6 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
a0 &= TCG_MO_ALL;
if (a0 == TCG_MO_LD_LD) {
insn = LWSYNC;
- } else if (a0 == TCG_MO_ST_ST) {
- insn = EIEIO;
}
tcg_out32(s, insn);
}
--
2.35.1
* [RFC PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync
2022-05-03 10:33 [RFC PATCH 1/4] target/ppc: Fix eieio memory ordering semantics Nicholas Piggin
2022-05-03 10:33 ` [RFC PATCH 2/4] tcg/ppc: ST_ST memory ordering is not provided with eieio Nicholas Piggin
@ 2022-05-03 10:33 ` Nicholas Piggin
2022-05-03 14:53 ` Richard Henderson
2022-05-03 10:33 ` [RFC PATCH 4/4] target/ppc: Implement lwsync with weaker memory ordering Nicholas Piggin
2 siblings, 1 reply; 6+ messages in thread
From: Nicholas Piggin @ 2022-05-03 10:33 UTC (permalink / raw)
To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel
lwsync orders more than just LD_LD; importantly, it matches the default
memory ordering of x86 and s390.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
target/ppc/cpu.h | 2 ++
tcg/ppc/tcg-target.c.inc | 2 +-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index c2b6c987c0..0b0e9761cd 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -28,6 +28,8 @@
#define TCG_GUEST_DEFAULT_MO 0
+#define PPC_LWSYNC_MO (TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST)
+
#define TARGET_PAGE_BITS_64K 16
#define TARGET_PAGE_BITS_16M 24
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 3ff845d063..b87fc2383e 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -1834,7 +1834,7 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
{
uint32_t insn = HWSYNC;
a0 &= TCG_MO_ALL;
- if (a0 == TCG_MO_LD_LD) {
+ if ((a0 & PPC_LWSYNC_MO) == a0) {
insn = LWSYNC;
}
tcg_out32(s, insn);
--
2.35.1
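
A rough way to read the commit message for this patch: PPC_LWSYNC_MO is every ordering except store-load, so a host whose default ordering already covers those three has nothing left to enforce for an lwsync-strength barrier. The sketch below is a simplified model, not QEMU's implementation. The enum values are local stand-ins assumed to mirror the TCG_MO_* bits, and residual_barrier() is a hypothetical helper introduced only for this illustration.

```c
#include <assert.h>

/* Local stand-ins for the TCG_MO_* ordering bits (values assumed). */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
};

/* Mirrors PPC_LWSYNC_MO as defined in the patch. */
#define LWSYNC_MO (MO_LD_LD | MO_LD_ST | MO_ST_ST)

/* A host that orders everything except store-load by default. */
#define STRONG_HOST_DEFAULT_MO (MO_ALL & ~MO_ST_LD)

/*
 * Hypothetical helper: the orderings a host must still enforce with an
 * explicit barrier, given what it already guarantees by default.
 */
static unsigned residual_barrier(unsigned requested, unsigned host_default)
{
    return requested & MO_ALL & ~host_default;
}

static void lwsync_mo_selftest(void)
{
    /* lwsync ordering is exactly "all but store-load". */
    assert(LWSYNC_MO == STRONG_HOST_DEFAULT_MO);
    /* Nothing is left to enforce on such a host. */
    assert(residual_barrier(LWSYNC_MO, STRONG_HOST_DEFAULT_MO) == 0);
    /* A full barrier still needs the store-load part. */
    assert(residual_barrier(MO_ALL, STRONG_HOST_DEFAULT_MO) == MO_ST_LD);
}
```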
* [RFC PATCH 4/4] target/ppc: Implement lwsync with weaker memory ordering
2022-05-03 10:33 [RFC PATCH 1/4] target/ppc: Fix eieio memory ordering semantics Nicholas Piggin
2022-05-03 10:33 ` [RFC PATCH 2/4] tcg/ppc: ST_ST memory ordering is not provided with eieio Nicholas Piggin
2022-05-03 10:33 ` [RFC PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync Nicholas Piggin
@ 2022-05-03 10:33 ` Nicholas Piggin
2 siblings, 0 replies; 6+ messages in thread
From: Nicholas Piggin @ 2022-05-03 10:33 UTC (permalink / raw)
To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel
This allows an x86 host to treat a guest lwsync as a no-op, and a ppc
host to emit lwsync rather than sync.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
target/ppc/cpu.h | 4 +++-
target/ppc/cpu_init.c | 13 +++++++------
target/ppc/machine.c | 3 ++-
target/ppc/translate.c | 8 +++++++-
4 files changed, 19 insertions(+), 9 deletions(-)
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 0b0e9761cd..bf5f226567 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2287,6 +2287,8 @@ enum {
PPC2_ISA300 = 0x0000000000080000ULL,
/* POWER ISA 3.1 */
PPC2_ISA310 = 0x0000000000100000ULL,
+ /* lwsync instruction */
+ PPC2_MEM_LWSYNC = 0x0000000000200000ULL,
#define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
@@ -2295,7 +2297,7 @@ enum {
PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \
PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \
PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206 | \
- PPC2_ISA300 | PPC2_ISA310)
+ PPC2_ISA300 | PPC2_ISA310 | PPC2_MEM_LWSYNC)
};
/*****************************************************************************/
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index d42e2ba8e0..26d9277ffb 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -5769,7 +5769,7 @@ POWERPC_FAMILY(970)(ObjectClass *oc, void *data)
PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
PPC_64B | PPC_ALTIVEC |
PPC_SEGMENT_64B | PPC_SLBI;
- pcc->insns_flags2 = PPC2_FP_CVT_S64;
+ pcc->insns_flags2 = PPC2_FP_CVT_S64 | PPC2_MEM_LWSYNC;
pcc->msr_mask = (1ull << MSR_SF) |
(1ull << MSR_VR) |
(1ull << MSR_POW) |
@@ -5846,7 +5846,7 @@ POWERPC_FAMILY(POWER5P)(ObjectClass *oc, void *data)
PPC_64B |
PPC_POPCNTB |
PPC_SEGMENT_64B | PPC_SLBI;
- pcc->insns_flags2 = PPC2_FP_CVT_S64;
+ pcc->insns_flags2 = PPC2_FP_CVT_S64 | PPC2_MEM_LWSYNC;
pcc->msr_mask = (1ull << MSR_SF) |
(1ull << MSR_VR) |
(1ull << MSR_POW) |
@@ -5984,7 +5984,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 |
PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206 |
PPC2_FP_TST_ISA206 | PPC2_FP_CVT_S64 |
- PPC2_PM_ISA206;
+ PPC2_PM_ISA206 | PPC2_MEM_LWSYNC;
pcc->msr_mask = (1ull << MSR_SF) |
(1ull << MSR_VR) |
(1ull << MSR_VSX) |
@@ -6157,7 +6157,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
- PPC2_TM | PPC2_PM_ISA206;
+ PPC2_TM | PPC2_PM_ISA206 | PPC2_MEM_LWSYNC;
pcc->msr_mask = (1ull << MSR_SF) |
(1ull << MSR_HV) |
(1ull << MSR_TM) |
@@ -6375,7 +6375,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
- PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
+ PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_MEM_LWSYNC;
pcc->msr_mask = (1ull << MSR_SF) |
(1ull << MSR_HV) |
(1ull << MSR_TM) |
@@ -6590,7 +6590,8 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
- PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310;
+ PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310 |
+ PPC2_MEM_LWSYNC;
pcc->msr_mask = (1ull << MSR_SF) |
(1ull << MSR_HV) |
(1ull << MSR_TM) |
diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index e673944597..33b3d6cf30 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -157,7 +157,8 @@ static int cpu_pre_save(void *opaque)
| PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206
| PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207
| PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207
- | PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | PPC2_TM;
+ | PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | PPC2_TM
+ | PPC2_MEM_LWSYNC;
env->spr[SPR_LR] = env->lr;
env->spr[SPR_CTR] = env->ctr;
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index abb8807180..76691cf082 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4040,8 +4040,13 @@ static void gen_stqcx_(DisasContext *ctx)
/* sync */
static void gen_sync(DisasContext *ctx)
{
+ TCGBar bar = TCG_MO_ALL;
uint32_t l = (ctx->opcode >> 21) & 3;
+ if ((l == 1) && (ctx->insns_flags2 & PPC2_MEM_LWSYNC)) {
+ bar = PPC_LWSYNC_MO;
+ }
+
/*
* We may need to check for a pending TLB flush.
*
@@ -4053,7 +4058,8 @@ static void gen_sync(DisasContext *ctx)
if (((l == 2) || !(ctx->insns_flags & PPC_64B)) && !ctx->pr) {
gen_check_tlb_flush(ctx, true);
}
- tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
+
+ tcg_gen_mb(bar | TCG_BAR_SC);
}
/* wait */
--
2.35.1
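
The barrier selection that this patch adds to gen_sync() can be modelled in isolation as follows. This is a sketch under stated assumptions rather than the patched function: the MO_* and BAR_SC values are local stand-ins assumed to mirror TCGBar, SYNC_OPCODE is the assumed encoding of sync with L=0, and sync_barrier() is a hypothetical name. Only the L-field extraction and the PPC2_MEM_LWSYNC value are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for TCGBar bits (values assumed). */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
    BAR_SC   = 0x30,
};

#define LWSYNC_MO        (MO_LD_LD | MO_LD_ST | MO_ST_ST)
#define FLAG2_MEM_LWSYNC 0x0000000000200000ULL  /* PPC2_MEM_LWSYNC in the patch */
#define SYNC_OPCODE      0x7c0004acu            /* assumed: sync with L=0 */
#define SYNC_L(l)        ((uint32_t)(l) << 21)  /* L field, as decoded in the diff */

/* Pick the barrier for a sync-family opcode, following the patch's logic. */
static unsigned sync_barrier(uint32_t opcode, uint64_t insns_flags2)
{
    unsigned bar = MO_ALL;
    uint32_t l = (opcode >> 21) & 3;

    /* Only L=1 on a CPU that implements lwsync gets the weaker ordering. */
    if (l == 1 && (insns_flags2 & FLAG2_MEM_LWSYNC)) {
        bar = LWSYNC_MO;
    }
    return bar | BAR_SC;
}

static void sync_barrier_selftest(void)
{
    /* lwsync on a CPU with the flag: weaker ordering. */
    assert(sync_barrier(SYNC_OPCODE | SYNC_L(1), FLAG2_MEM_LWSYNC)
           == (LWSYNC_MO | BAR_SC));
    /* lwsync encoding on a CPU without the flag: full barrier. */
    assert(sync_barrier(SYNC_OPCODE | SYNC_L(1), 0) == (MO_ALL | BAR_SC));
    /* Plain sync: always a full barrier. */
    assert(sync_barrier(SYNC_OPCODE, FLAG2_MEM_LWSYNC) == (MO_ALL | BAR_SC));
}
```

Keeping the full barrier as the default means that CPU families without the new flag behave as they did before the patch.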
* Re: [RFC PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync
2022-05-03 10:33 ` [RFC PATCH 3/4] tcg/ppc: Optimize memory ordering generation with lwsync Nicholas Piggin
@ 2022-05-03 14:53 ` Richard Henderson
0 siblings, 0 replies; 6+ messages in thread
From: Richard Henderson @ 2022-05-03 14:53 UTC (permalink / raw)
To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel
On 5/3/22 03:33, Nicholas Piggin wrote:
> lwsync orders more than just LD_LD; importantly, it matches the default
> memory ordering of x86 and s390.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> target/ppc/cpu.h | 2 ++
> tcg/ppc/tcg-target.c.inc | 2 +-
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index c2b6c987c0..0b0e9761cd 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -28,6 +28,8 @@
>
> #define TCG_GUEST_DEFAULT_MO 0
>
> +#define PPC_LWSYNC_MO (TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST)
You can't put this here...
> +
> #define TARGET_PAGE_BITS_64K 16
> #define TARGET_PAGE_BITS_16M 24
>
> diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
> index 3ff845d063..b87fc2383e 100644
> --- a/tcg/ppc/tcg-target.c.inc
> +++ b/tcg/ppc/tcg-target.c.inc
> @@ -1834,7 +1834,7 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
> {
> uint32_t insn = HWSYNC;
> a0 &= TCG_MO_ALL;
> - if (a0 == TCG_MO_LD_LD) {
> + if ((a0 & PPC_LWSYNC_MO) == a0) {
... and have it used here. You should have seen compilation failures for the missing
symbol. I can only assume you used a restricted --target-list in testing.
Anyway, it looks like a simpler test would be
insn = (a0 & TCG_MO_ST_LD ? HWSYNC : LWSYNC);
r~
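
The simpler test suggested above selects the same instruction as the subset test from the patch, because once a0 is masked with TCG_MO_ALL the only bit outside PPC_LWSYNC_MO is TCG_MO_ST_LD. A minimal standalone check of that equivalence is sketched below. The enum values are local stand-ins assumed to mirror the TCG_MO_* bits, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the TCG_MO_* ordering bits (values assumed). */
enum {
    MO_LD_LD = 0x01,
    MO_ST_LD = 0x02,
    MO_LD_ST = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
};

#define LWSYNC_MO (MO_LD_LD | MO_LD_ST | MO_ST_ST)

/* Test from the patch: every requested bit is provided by lwsync. */
static bool lwsync_ok_subset(unsigned a0)
{
    a0 &= MO_ALL;
    return (a0 & LWSYNC_MO) == a0;
}

/* Test from the review: lwsync suffices unless store-load is requested. */
static bool lwsync_ok_simple(unsigned a0)
{
    a0 &= MO_ALL;
    return !(a0 & MO_ST_LD);
}

/* Count inputs on which the two tests disagree, over all 8-bit values. */
static int count_mismatches(void)
{
    int n = 0;
    for (unsigned a0 = 0; a0 <= 0xff; a0++) {
        if (lwsync_ok_subset(a0) != lwsync_ok_simple(a0)) {
            n++;
        }
    }
    return n;
}
```

The loop runs past MO_ALL on purpose, so that bits above the ordering mask (such as the barrier-type bits) are shown not to affect either test after masking.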
* Re: [RFC PATCH 2/4] tcg/ppc: ST_ST memory ordering is not provided with eieio
2022-05-03 10:33 ` [RFC PATCH 2/4] tcg/ppc: ST_ST memory ordering is not provided with eieio Nicholas Piggin
@ 2022-05-03 15:01 ` Richard Henderson
0 siblings, 0 replies; 6+ messages in thread
From: Richard Henderson @ 2022-05-03 15:01 UTC (permalink / raw)
To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel
On 5/3/22 03:33, Nicholas Piggin wrote:
> eieio does not provide ordering between stores to CI memory and stores
> to cacheable memory so it can't be used as a general ST_ST barrier.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> tcg/ppc/tcg-target.c.inc | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
> index cfcd121f9c..3ff845d063 100644
> --- a/tcg/ppc/tcg-target.c.inc
> +++ b/tcg/ppc/tcg-target.c.inc
> @@ -1836,8 +1836,6 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
> a0 &= TCG_MO_ALL;
> if (a0 == TCG_MO_LD_LD) {
> insn = LWSYNC;
> - } else if (a0 == TCG_MO_ST_ST) {
> - insn = EIEIO;
> }
> tcg_out32(s, insn);
> }
Certainly matches the comment from patch 1.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~