* [PATCH 0/3] target/ppc: Fixes and updates for sync instructions
@ 2024-05-01 13:04 Nicholas Piggin
From: Nicholas Piggin @ 2024-05-01 13:04 UTC
To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Daniel Henrique Barboza,
	Chinmay Rath

I forgot I needed to do this. I tried adding the new POWER10 sync
instructions to the kernel, and the patch got nacked because it crashed
under TCG.

Unfortunately, I don't think our old decoder does a great job of
handling reserved bits like this, but decodetree makes this kind
of thing much easier.

I'll probably add at least patch 1 to -stable, so the Linux changes
can be upstreamed a bit sooner.

Thanks,
Nick

Nicholas Piggin (3):
  target/ppc: Move sync instructions to decodetree
  target/ppc: Fix embedded memory barriers
  target/ppc: Add ISA v3.1 variants of sync instruction

 target/ppc/insn32.decode             |   7 ++
 target/ppc/translate.c               | 102 +-----------------
 target/ppc/translate/misc-impl.c.inc | 152 +++++++++++++++++++++++++++
 3 files changed, 161 insertions(+), 100 deletions(-)
 create mode 100644 target/ppc/translate/misc-impl.c.inc

--
2.43.0

* [PATCH 1/3] target/ppc: Move sync instructions to decodetree
@ 2024-05-01 13:04 Nicholas Piggin
From: Nicholas Piggin @ 2024-05-01 13:04 UTC
To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Daniel Henrique Barboza,
	Chinmay Rath

This tries to faithfully reproduce the odd BookE logic.

It does change the handling of non-zero reserved bits outside the
defined fields from being illegal to being ignored, which the
architecture specifies to help with backward compatibility of new
fields. The existing behaviour causes illegal instruction exceptions
when using new POWER10 sync variants that add new fields; after this
the instructions are accepted and are implemented as supersets of
the new behaviour, as intended.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 target/ppc/insn32.decode             |   7 ++
 target/ppc/translate.c               | 102 +-------------------
 target/ppc/translate/misc-impl.c.inc | 135 +++++++++++++++++++++++++++
 3 files changed, 144 insertions(+), 100 deletions(-)
 create mode 100644 target/ppc/translate/misc-impl.c.inc

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index eada59f59f..6b89804b15 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -998,3 +998,10 @@ MSGSND 011111 ----- ----- ..... 0011001110 - @X_rb
 MSGCLRP 011111 ----- ----- ..... 0010101110 - @X_rb
 MSGSNDP 011111 ----- ----- ..... 0010001110 - @X_rb
 MSGSYNC 011111 ----- ----- ----- 1101110110 -
+
+# Memory Barrier Instructions
+
+&X_sync l
+@X_sync ...... ... l:2 ..... ..... .......... . &X_sync
+SYNC 011111 --- .. ----- ----- 1001010110 - @X_sync
+EIEIO 011111 ----- ----- ----- 1101010110 -
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 93ffec787c..bb2cabae10 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -3423,59 +3423,6 @@ static void gen_stswx(DisasContext *ctx)
     gen_helper_stsw(tcg_env, t0, t1, t2);
 }

-/*** Memory synchronisation ***/
-/* eieio */
-static void gen_eieio(DisasContext *ctx)
-{
-    TCGBar bar = TCG_MO_ALL;
-
-    /*
-     * eieio has complex semanitcs. It provides memory ordering between
-     * operations in the set:
-     * - loads from CI memory.
-     * - stores to CI memory.
-     * - stores to WT memory.
-     *
-     * It separately also orders memory for operations in the set:
-     * - stores to cacheble memory.
-     *
-     * It also serializes instructions:
-     * - dcbt and dcbst.
-     *
-     * It separately serializes:
-     * - tlbie and tlbsync.
-     *
-     * And separately serializes:
-     * - slbieg, slbiag, and slbsync.
-     *
-     * The end result is that CI memory ordering requires TCG_MO_ALL
-     * and it is not possible to special-case more relaxed ordering for
-     * cacheable accesses. TCG_BAR_SC is required to provide this
-     * serialization.
-     */
-
-    /*
-     * POWER9 has a eieio instruction variant using bit 6 as a hint to
-     * tell the CPU it is a store-forwarding barrier.
-     */
-    if (ctx->opcode & 0x2000000) {
-        /*
-         * ISA says that "Reserved fields in instructions are ignored
-         * by the processor". So ignore the bit 6 on non-POWER9 CPU but
-         * as this is not an instruction software should be using,
-         * complain to the user.
-         */
-        if (!(ctx->insns_flags2 & PPC2_ISA300)) {
-            qemu_log_mask(LOG_GUEST_ERROR, "invalid eieio using bit 6 at @"
-                          TARGET_FMT_lx "\n", ctx->cia);
-        } else {
-            bar = TCG_MO_ST_LD;
-        }
-    }
-
-    tcg_gen_mb(bar | TCG_BAR_SC);
-}
-
 #if !defined(CONFIG_USER_ONLY)
 static inline void gen_check_tlb_flush(DisasContext *ctx, bool global)
 {
@@ -3877,31 +3824,6 @@ static void gen_stqcx_(DisasContext *ctx)
 }
 #endif /* defined(TARGET_PPC64) */

-/* sync */
-static void gen_sync(DisasContext *ctx)
-{
-    TCGBar bar = TCG_MO_ALL;
-    uint32_t l = (ctx->opcode >> 21) & 3;
-
-    if ((l == 1) && (ctx->insns_flags2 & PPC2_MEM_LWSYNC)) {
-        bar = TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST;
-    }
-
-    /*
-     * We may need to check for a pending TLB flush.
-     *
-     * We do this on ptesync (l == 2) on ppc64 and any sync pn ppc32.
-     *
-     * Additionally, this can only happen in kernel mode however so
-     * check MSR_PR as well.
-     */
-    if (((l == 2) || !(ctx->insns_flags & PPC_64B)) && !ctx->pr) {
-        gen_check_tlb_flush(ctx, true);
-    }
-
-    tcg_gen_mb(bar | TCG_BAR_SC);
-}
-
 /* wait */
 static void gen_wait(DisasContext *ctx)
 {
@@ -6010,23 +5932,6 @@ static void gen_dlmzb(DisasContext *ctx)
                      cpu_gpr[rS(ctx->opcode)], cpu_gpr[rB(ctx->opcode)], t0);
 }

-/* mbar replaces eieio on 440 */
-static void gen_mbar(DisasContext *ctx)
-{
-    /* interpreted as no-op */
-}
-
-/* msync replaces sync on 440 */
-static void gen_msync_4xx(DisasContext *ctx)
-{
-    /* Only e500 seems to treat reserved bits as invalid */
-    if ((ctx->insns_flags2 & PPC2_BOOKE206) &&
-        (ctx->opcode & 0x03FFF801)) {
-        gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
-    }
-    /* otherwise interpreted as no-op */
-}
-
 /* icbt */
 static void gen_icbt_440(DisasContext *ctx)
 {
@@ -6364,6 +6269,8 @@ static bool resolve_PLS_D(DisasContext *ctx, arg_D *d, arg_PLS_D *a)

 #include "translate/storage-ctrl-impl.c.inc"

+#include "translate/misc-impl.c.inc"
+
 /* Handles lfdp */
 static void gen_dform39(DisasContext *ctx)
 {
@@ -6492,7 +6399,6 @@ GEN_HANDLER(lswi, 0x1F, 0x15, 0x12, 0x00000001, PPC_STRING),
 GEN_HANDLER(lswx, 0x1F, 0x15, 0x10, 0x00000001, PPC_STRING),
 GEN_HANDLER(stswi, 0x1F, 0x15, 0x16, 0x00000001, PPC_STRING),
 GEN_HANDLER(stswx, 0x1F, 0x15, 0x14, 0x00000001, PPC_STRING),
-GEN_HANDLER(eieio, 0x1F, 0x16, 0x1A, 0x01FFF801, PPC_MEM_EIEIO),
 GEN_HANDLER(isync, 0x13, 0x16, 0x04, 0x03FFF801, PPC_MEM),
 GEN_HANDLER_E(lbarx, 0x1F, 0x14, 0x01, 0, PPC_NONE, PPC2_ATOMIC_ISA206),
 GEN_HANDLER_E(lharx, 0x1F, 0x14, 0x03, 0, PPC_NONE, PPC2_ATOMIC_ISA206),
@@ -6510,7 +6416,6 @@ GEN_HANDLER_E(lqarx, 0x1F, 0x14, 0x08, 0, PPC_NONE, PPC2_LSQ_ISA207),
 GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, 0x00000000, PPC_64B),
 GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207),
 #endif
-GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC),
 /* ISA v3.0 changed the extended opcode from 62 to 30 */
 GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT),
 GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300),
@@ -6633,9 +6538,6 @@ GEN_HANDLER2_E(tlbilx_booke206, "tlbilx", 0x1F, 0x12, 0x00, 0x03800001,
 GEN_HANDLER(wrtee, 0x1F, 0x03, 0x04, 0x000FFC01, PPC_WRTEE),
 GEN_HANDLER(wrteei, 0x1F, 0x03, 0x05, 0x000E7C01, PPC_WRTEE),
 GEN_HANDLER(dlmzb, 0x1F, 0x0E, 0x02, 0x00000000, PPC_440_SPEC),
-GEN_HANDLER_E(mbar, 0x1F, 0x16, 0x1a, 0x001FF801,
-              PPC_BOOKE, PPC2_BOOKE206),
-GEN_HANDLER(msync_4xx, 0x1F, 0x16, 0x12, 0x039FF801, PPC_BOOKE),
 GEN_HANDLER2_E(icbt_440, "icbt", 0x1F, 0x16, 0x00, 0x03E00001,
                PPC_BOOKE, PPC2_BOOKE206),
 GEN_HANDLER2(icbt_440, "icbt", 0x1F, 0x06, 0x08, 0x03E00001,
diff --git a/target/ppc/translate/misc-impl.c.inc b/target/ppc/translate/misc-impl.c.inc
new file mode 100644
index 0000000000..f58bf8b848
--- /dev/null
+++ b/target/ppc/translate/misc-impl.c.inc
@@ -0,0 +1,135 @@
+/*
+ * Power ISA decode for misc instructions
+ *
+ * Copyright (c) 2024, IBM Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Memory Barrier Instructions
+ */
+
+static bool trans_SYNC(DisasContext *ctx, arg_X_sync *a)
+{
+    TCGBar bar = TCG_MO_ALL;
+    uint32_t l = a->l;
+
+    /*
+     * BookE uses the msync mnemonic. This means hwsync, except in the
+     * 440, where it is an execution serialisation point that requires all
+     * previous storage accesses to have been performed to memory (which
+     * doesn't matter for TCG).
+     */
+    if (!(ctx->insns_flags & PPC_MEM_SYNC)) {
+        if (ctx->insns_flags & PPC_BOOKE) {
+            /* msync replaces sync on 440, interpreted as nop */
+            /* XXX: this also catches e200 */
+            return true;
+        }
+
+        return false;
+    }
+
+    /* e500 family seems to treat reserved bits as invalid, this enforces l=0 */
+    if ((ctx->insns_flags2 & PPC2_BOOKE206) && (ctx->opcode & 0x03FFF801)) {
+        gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
+    }
+
+    if ((l == 1) && (ctx->insns_flags2 & PPC2_MEM_LWSYNC)) {
+        bar = TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST;
+    }
+
+    /*
+     * We may need to check for a pending TLB flush.
+     *
+     * We do this on ptesync (l == 2) on ppc64 and any sync on ppc32.
+     *
+     * Additionally, this can only happen in kernel mode however so
+     * check MSR_PR as well.
+     */
+    if (((l == 2) || !(ctx->insns_flags & PPC_64B)) && !ctx->pr) {
+        gen_check_tlb_flush(ctx, true);
+    }
+
+    tcg_gen_mb(bar | TCG_BAR_SC);
+
+    return true;
+}
+
+static bool trans_EIEIO(DisasContext *ctx, arg_EIEIO *a)
+{
+    TCGBar bar = TCG_MO_ALL;
+
+    /*
+     * BookE uses the mbar instruction instead of eieio, which is basically
+     * full hwsync memory barrier, but is not execution synchronising. For
+     * the purpose of TCG the distinction is not relevant.
+     */
+    if (!(ctx->insns_flags & PPC_MEM_EIEIO)) {
+        if ((ctx->insns_flags & PPC_BOOKE) ||
+            (ctx->insns_flags2 & PPC2_BOOKE206)) {
+            return true;
+        }
+        return false;
+    }
+
+    /*
+     * eieio has complex semantics. It provides memory ordering between
+     * operations in the set:
+     * - loads from CI memory.
+     * - stores to CI memory.
+     * - stores to WT memory.
+     *
+     * It separately also orders memory for operations in the set:
+     * - stores to cacheable memory.
+     *
+     * It also serializes instructions:
+     * - dcbt and dcbst.
+     *
+     * It separately serializes:
+     * - tlbie and tlbsync.
+     *
+     * And separately serializes:
+     * - slbieg, slbiag, and slbsync.
+     *
+     * The end result is that CI memory ordering requires TCG_MO_ALL
+     * and it is not possible to special-case more relaxed ordering for
+     * cacheable accesses. TCG_BAR_SC is required to provide this
+     * serialization.
+     */
+
+    /*
+     * POWER9 has a eieio instruction variant using bit 6 as a hint to
+     * tell the CPU it is a store-forwarding barrier.
+     */
+    if (ctx->opcode & 0x2000000) {
+        /*
+         * ISA says that "Reserved fields in instructions are ignored
+         * by the processor". So ignore the bit 6 on non-POWER9 CPU but
+         * as this is not an instruction software should be using,
+         * complain to the user.
+         */
+        if (!(ctx->insns_flags2 & PPC2_ISA300)) {
+            qemu_log_mask(LOG_GUEST_ERROR, "invalid eieio using bit 6 at @"
+                          TARGET_FMT_lx "\n", ctx->cia);
+        } else {
+            bar = TCG_MO_ST_LD;
+        }
+    }
+
+    tcg_gen_mb(bar | TCG_BAR_SC);
+
+    return true;
+}
--
2.43.0
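
To make the reserved-bit change in patch 1 concrete, here is a minimal standalone sketch (not QEMU code) of how the two decoders treat a sync encoding. The helper names are hypothetical. The mask is the one from the removed GEN_HANDLER(sync, ...) entry, and the sketch assumes the legacy decoder rejects any instruction with a bit set inside that mask. The L field position follows the @X_sync format added above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask taken from the removed GEN_HANDLER(sync, ...) table entry. */
#define OLD_SYNC_INVAL_MASK 0x039FF801u

/* Fixed bits of sync: primary opcode 31, extended opcode 598 (0b1001010110). */
static inline bool is_sync(uint32_t insn)
{
    return (insn >> 26) == 31 && ((insn >> 1) & 0x3ff) == 598;
}

/* 2-bit L field as declared by "l:2" in the patch 1 @X_sync format. */
static inline uint32_t sync_l_field(uint32_t insn)
{
    return (insn >> 21) & 0x3;
}

/*
 * Assumed model of the legacy decoder: any bit set under the invalid
 * mask turns the instruction into an illegal-instruction exception.
 */
static inline bool old_decoder_rejects(uint32_t insn)
{
    return (insn & OLD_SYNC_INVAL_MASK) != 0;
}
```

Under these assumptions, 0x7c2004ac (lwsync, L=1) passes the old mask, while 0x7ca004ac (L=5, which the patch 3 code treats as plwsync) sets a bit the old mask marks as invalid. With the patch 1 decodetree pattern that bit is a don't-care, so the same word decodes with l == 1.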

* Re: [PATCH 1/3] target/ppc: Move sync instructions to decodetree
@ 2024-05-07  6:41 Chinmay Rath
From: Chinmay Rath @ 2024-05-07 6:41 UTC
To: Nicholas Piggin, qemu-ppc
Cc: qemu-devel, Daniel Henrique Barboza, Chinmay Rath

On 5/1/24 18:34, Nicholas Piggin wrote:
> This tries to faithfully reproduce the odd BookE logic.
>
> It does change the handling of non-zero reserved bits outside the
> defined fields from being illegal to being ignored, which the
> architecture specifies to help with backward compatibility of new
> fields. The existing behaviour causes illegal instruction exceptions
> when using new POWER10 sync variants that add new fields, after this
> the instructions are accepted and are implemented as supersets of
> the new behaviour, as intended.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>

* [PATCH 2/3] target/ppc: Fix embedded memory barriers
@ 2024-05-01 13:04 Nicholas Piggin
From: Nicholas Piggin @ 2024-05-01 13:04 UTC
To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Daniel Henrique Barboza,
	Chinmay Rath

Memory barriers are supposed to do something on BookE systems. These
were probably just missed during MTTCG enablement; maybe no targets
support SMP. Either way, add proper BookE implementations.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 target/ppc/translate/misc-impl.c.inc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/ppc/translate/misc-impl.c.inc b/target/ppc/translate/misc-impl.c.inc
index f58bf8b848..9226467f81 100644
--- a/target/ppc/translate/misc-impl.c.inc
+++ b/target/ppc/translate/misc-impl.c.inc
@@ -34,8 +34,7 @@ static bool trans_SYNC(DisasContext *ctx, arg_X_sync *a)
      */
     if (!(ctx->insns_flags & PPC_MEM_SYNC)) {
         if (ctx->insns_flags & PPC_BOOKE) {
-            /* msync replaces sync on 440, interpreted as nop */
-            /* XXX: this also catches e200 */
+            tcg_gen_mb(bar | TCG_BAR_SC);
             return true;
         }

@@ -80,6 +79,7 @@ static bool trans_EIEIO(DisasContext *ctx, arg_EIEIO *a)
     if (!(ctx->insns_flags & PPC_MEM_EIEIO)) {
         if ((ctx->insns_flags & PPC_BOOKE) ||
             (ctx->insns_flags2 & PPC2_BOOKE206)) {
+            tcg_gen_mb(bar | TCG_BAR_SC);
             return true;
         }
         return false;
--
2.43.0
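
As a rough illustration of why a no-op barrier is a problem once vCPUs run on separate host threads, the following C11 sketch shows the message-passing pattern that guest code typically builds on a full barrier. It is an analogy, not QEMU or guest code: the names payload, ready, publish and try_consume are hypothetical, and atomic_thread_fence(memory_order_seq_cst) merely stands in for the full barrier that patch 2 now emits for BookE msync/mbar.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;          /* plain data written before the flag */
static atomic_bool ready;    /* flag that publishes the data */

/* Producer side: write the data, full barrier, then set the flag. */
static inline void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_seq_cst);  /* stands in for msync */
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer side: observe the flag, full barrier, then read the data. */
static inline bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed)) {
        return false;
    }
    atomic_thread_fence(memory_order_seq_cst);  /* stands in for msync */
    *out = payload;
    return true;
}
```

If either fence were dropped, nothing would order the payload access against the flag access across threads, which is the situation a guest is in when its barrier instruction is translated to nothing.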

* Re: [PATCH 2/3] target/ppc: Fix embedded memory barriers
@ 2024-05-07  7:24 Chinmay Rath
From: Chinmay Rath @ 2024-05-07 7:24 UTC
To: Nicholas Piggin, qemu-ppc
Cc: qemu-devel, Daniel Henrique Barboza, Chinmay Rath

On 5/1/24 18:34, Nicholas Piggin wrote:
> Memory barriers are supposed to do something on BookE systems, these
> were probably just missed during MTTCG enablement, maybe no targets
> support SMP. Either way, add proper BookE implementations.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>

* [PATCH 3/3] target/ppc: Add ISA v3.1 variants of sync instruction
@ 2024-05-01 13:04 Nicholas Piggin
From: Nicholas Piggin @ 2024-05-01 13:04 UTC
To: qemu-ppc; +Cc: Nicholas Piggin, qemu-devel, Daniel Henrique Barboza,
	Chinmay Rath

POWER10 adds a new field to sync for store-store syncs, and some
new variants of the existing syncs that include persistent memory.

Implement the store-store syncs and plwsync/phwsync.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 target/ppc/insn32.decode             |  6 ++--
 target/ppc/translate/misc-impl.c.inc | 41 ++++++++++++++++++++--------
 2 files changed, 32 insertions(+), 15 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 6b89804b15..a180380750 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -1001,7 +1001,7 @@ MSGSYNC 011111 ----- ----- ----- 1101110110 -

 # Memory Barrier Instructions

-&X_sync l
-@X_sync ...... ... l:2 ..... ..... .......... . &X_sync
-SYNC 011111 --- .. ----- ----- 1001010110 - @X_sync
+&X_sync l sc
+@X_sync ...... .. l:3 ... sc:2 ..... .......... . &X_sync
+SYNC 011111 -- ... --- .. ----- 1001010110 - @X_sync
 EIEIO 011111 ----- ----- ----- 1101010110 -
diff --git a/target/ppc/translate/misc-impl.c.inc b/target/ppc/translate/misc-impl.c.inc
index 9226467f81..3467b49d0d 100644
--- a/target/ppc/translate/misc-impl.c.inc
+++ b/target/ppc/translate/misc-impl.c.inc
@@ -25,6 +25,7 @@ static bool trans_SYNC(DisasContext *ctx, arg_X_sync *a)
 {
     TCGBar bar = TCG_MO_ALL;
     uint32_t l = a->l;
+    uint32_t sc = a->sc;

     /*
      * BookE uses the msync mnemonic. This means hwsync, except in the
@@ -46,20 +47,36 @@ static bool trans_SYNC(DisasContext *ctx, arg_X_sync *a)
         gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);
     }

-    if ((l == 1) && (ctx->insns_flags2 & PPC2_MEM_LWSYNC)) {
-        bar = TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST;
-    }
-
     /*
-     * We may need to check for a pending TLB flush.
-     *
-     * We do this on ptesync (l == 2) on ppc64 and any sync on ppc32.
-     *
-     * Additionally, this can only happen in kernel mode however so
-     * check MSR_PR as well.
+     * In ISA v3.1, the L field grew one bit. Mask that out to ignore it in
+     * older processors. It also added the SC field, zero this to ignore
+     * it too.
      */
-    if (((l == 2) || !(ctx->insns_flags & PPC_64B)) && !ctx->pr) {
-        gen_check_tlb_flush(ctx, true);
+    if (!(ctx->insns_flags2 & PPC2_ISA310)) {
+        l &= 0x3;
+        sc = 0;
+    }
+
+    if (sc) {
+        /* Store syncs [stsync, stcisync, stncisync]. These ignore L. */
+        bar = TCG_MO_ST_ST;
+    } else {
+        if (((l == 1) && (ctx->insns_flags2 & PPC2_MEM_LWSYNC)) || (l == 5)) {
+            /* lwsync, or plwsync on POWER10 and later */
+            bar = TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST;
+        }
+
+        /*
+         * We may need to check for a pending TLB flush.
+         *
+         * We do this on ptesync (l == 2) on ppc64 and any sync on ppc32.
+         *
+         * Additionally, this can only happen in kernel mode however so
+         * check MSR_PR as well.
+         */
+        if (((l == 2) || !(ctx->insns_flags & PPC_64B)) && !ctx->pr) {
+            gen_check_tlb_flush(ctx, true);
+        }
     }

     tcg_gen_mb(bar | TCG_BAR_SC);
--
2.43.0
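
The barrier selection in patch 3 can be modelled outside QEMU as a small pure function. This is a sketch under stated assumptions: the MO_* constants are illustrative stand-ins for the TCG_MO_* bits, the two boolean parameters stand in for the PPC2_ISA310 and PPC2_MEM_LWSYNC feature checks, and the function names are hypothetical. The field positions follow the updated @X_sync format (3-bit L, 2-bit SC). The TLB-flush check is left out because it does not affect which barrier is chosen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the TCG_MO_* ordering bits. */
enum {
    MO_LD_LD = 1,
    MO_ST_LD = 2,
    MO_LD_ST = 4,
    MO_ST_ST = 8,
    MO_ALL   = MO_LD_LD | MO_ST_LD | MO_LD_ST | MO_ST_ST,
};

/* 3-bit L field ("l:3") and 2-bit SC field ("sc:2") of the v3.1 format. */
static inline uint32_t sync_l3(uint32_t insn) { return (insn >> 21) & 0x7; }
static inline uint32_t sync_sc(uint32_t insn) { return (insn >> 16) & 0x3; }

/* Mirrors the if/else ladder in trans_SYNC after patch 3. */
static inline int sync_barrier(uint32_t insn, bool isa310, bool has_lwsync)
{
    uint32_t l = sync_l3(insn);
    uint32_t sc = sync_sc(insn);

    if (!isa310) {
        /* Pre-v3.1 CPUs ignore the extra L bit and the whole SC field. */
        l &= 0x3;
        sc = 0;
    }

    if (sc) {
        return MO_ST_ST;                        /* store syncs ignore L */
    }
    if ((l == 1 && has_lwsync) || l == 5) {
        return MO_LD_LD | MO_LD_ST | MO_ST_ST;  /* lwsync or plwsync */
    }
    return MO_ALL;                              /* everything else */
}
```

One consequence visible in the model: an encoding with L=5 selects the lwsync-strength barrier both with and without the v3.1 flag, because masking L to two bits on older CPUs turns 5 into 1.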

* Re: [PATCH 3/3] target/ppc: Add ISA v3.1 variants of sync instruction
@ 2024-05-07  7:09 Chinmay Rath
From: Chinmay Rath @ 2024-05-07 7:09 UTC
To: Nicholas Piggin, qemu-ppc
Cc: qemu-devel, Daniel Henrique Barboza, Chinmay Rath

On 5/1/24 18:34, Nicholas Piggin wrote:
> POWER10 adds a new field to sync for store-store syncs, and some
> new variants of the existing syncs that include persistent memory.
>
> Implement the store-store syncs and plwsync/phwsync.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Chinmay Rath <rathc@linux.ibm.com>