From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53633) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1b9lyr-0006Ut-G4 for qemu-devel@nongnu.org; Mon, 06 Jun 2016 00:17:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1b9lyn-0005nU-5q for qemu-devel@nongnu.org; Mon, 06 Jun 2016 00:17:36 -0400 Message-ID: <1465186627.4274.30.camel@kernel.crashing.org> From: Benjamin Herrenschmidt Date: Mon, 06 Jun 2016 14:17:07 +1000 In-Reply-To: <5754645E.8080504@kaod.org> References: <1464955880-10176-1-git-send-email-clg@kaod.org> <57518BA1.1030400@ilande.co.uk> <5754645E.8080504@kaod.org> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] ppc: complete the new HV mode List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: =?ISO-8859-1?Q?C=E9dric?= Le Goater , Mark Cave-Ayland , David Gibson Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org On Sun, 2016-06-05 at 19:41 +0200, C=C3=A9dric Le Goater wrote: >=C2=A0 > Here is a fix I think. Could you give it a try ?=C2=A0 This is somewhat wrong... > commit cd0c6f473532 ('ppc: Do some batching of TCG tlb flushes') > introduced an optimisation to flush TLBs only when a context > synchronizing event is reached (interrupt, rfi). This was done for > ppc64 but 32bit was forgotten on the way. No it didn't. That commit only delays flushes on ppc64. ppc32 is unaffected, unless I missed something. IE. It will delay flushes caused by slb instructions (which don't exist on 32-bit) and=C2=A0ppc_tlb_invalidate_one() only in the 64-bit cases. Also what your patch does in practice is not really change that, though you seem to try to somewhat extend the batching to 32-bit (but incompletely), you also introduce something which effectively reverts part of=C2=A09fb044911444fdd09f5f072ad0ca269d7f8b841d (split I/D mode). I think that's more what's "fixing" your problem, ie, the flush in IR/DR changes. However it shouldn't be needed. I suspect all of that is papering over another bug somewhere else which got exposed by the split I/D mode, since we no longer over-flush on transitions to/from real-mode. So we must be missing flushes elsewhere, possibly some G3 specific stuff, or there always was some kind of bug in the TLB flushing on 32-bit that got somewhat masked by the over- flushing we used to do. I need a repro-case. Cheers, Ben. > Tested on mac99 and g3beige with >=20 > =C2=A0=C2=A0=C2=A0=C2=A0qemu-system-ppc -cdrom darwinppc-602.cdr -boot = d >=20 > Signed-off-by: C=C3=A9dric Le Goater > --- >=20 > =C2=A0I think the hunk in powerpc_excp() is needed if we don't generate= a > =C2=A0context synchronizing event. what is best to do ? >=20 > =C2=A0target-ppc/cpu.h=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0|=C2=A0=C2=A0=C2=A0=C2=A02 +- > =C2=A0target-ppc/excp_helper.c |=C2=A0=C2=A0=C2=A010 ++++++++++ > =C2=A0target-ppc/helper_regs.h |=C2=A0=C2=A0=C2=A0=C2=A09 ++++++++- > =C2=A0target-ppc/translate.c=C2=A0=C2=A0=C2=A0|=C2=A0=C2=A0=C2=A0=C2=A0= 2 +- > =C2=A04 files changed, 20 insertions(+), 3 deletions(-) >=20 > Index: qemu-dgibson-for-2.7.git/target-ppc/translate.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- qemu-dgibson-for-2.7.git.orig/target-ppc/translate.c > +++ qemu-dgibson-for-2.7.git/target-ppc/translate.c > @@ -3290,7 +3290,7 @@ static void gen_eieio(DisasContext *ctx) > =C2=A0{ > =C2=A0} > =C2=A0 > -#if !defined(CONFIG_USER_ONLY) && defined(TARGET_PPC64) > +#if !defined(CONFIG_USER_ONLY) > =C2=A0static inline void gen_check_tlb_flush(DisasContext *ctx) > =C2=A0{ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0TCGv_i32 t =3D tcg_temp_new_i32(); > Index: qemu-dgibson-for-2.7.git/target-ppc/cpu.h > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- qemu-dgibson-for-2.7.git.orig/target-ppc/cpu.h > +++ qemu-dgibson-for-2.7.git/target-ppc/cpu.h > @@ -958,9 +958,9 @@ struct CPUPPCState { > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* PowerPC 64 SLB area */ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0ppc_slb_t slb[MAX_SLB_ENTRIES]; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0int32_t slb_nr; > +#endif > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* tcg TLB needs flush (deferred slb inva= l instruction > typically) */ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0uint32_t tlb_need_flush; > -#endif > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* segment registers */ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0hwaddr htab_base; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* mask used to normalize hash value to P= TEG index */ > Index: qemu-dgibson-for-2.7.git/target-ppc/helper_regs.h > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- qemu-dgibson-for-2.7.git.orig/target-ppc/helper_regs.h > +++ qemu-dgibson-for-2.7.git/target-ppc/helper_regs.h > @@ -121,6 +121,13 @@ static inline int hreg_store_msr(CPUPPCS > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (((value >> MSR_IR) & 1) !=3D msr_ir |= | > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0((value >> MSR_DR= ) & 1) !=3D msr_dr) { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* A change of the ins= truction relocation bit in the MSR can > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* cause an impli= cit branch in the address space. This > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* requires a tlb= flush. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (env->mmu_model & P= OWERPC_MMU_32B) { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= env->tlb_need_flush =3D 1; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0cs->interrupt_req= uest |=3D CPU_INTERRUPT_EXITTB; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ((env->mmu_model & POWERPC_MMU_BOOKE) = && > @@ -151,7 +158,7 @@ static inline int hreg_store_msr(CPUPPCS > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return excp; > =C2=A0} > =C2=A0 > -#if !defined(CONFIG_USER_ONLY) && defined(TARGET_PPC64) > +#if !defined(CONFIG_USER_ONLY) > =C2=A0static inline void check_tlb_flush(CPUPPCState *env) > =C2=A0{ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0CPUState *cs =3D CPU(ppc_env_get_cpu(env)= ); > Index: qemu-dgibson-for-2.7.git/target-ppc/excp_helper.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- qemu-dgibson-for-2.7.git.orig/target-ppc/excp_helper.c > +++ qemu-dgibson-for-2.7.git/target-ppc/excp_helper.c > @@ -709,6 +709,16 @@ static inline void powerpc_excp(PowerPCC > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0#endif > +=C2=A0=C2=A0=C2=A0=C2=A0if (((new_msr >> MSR_IR) & 1) !=3D msr_ir || > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0((new_msr >> MSR_DR) &= 1) !=3D msr_dr) { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* A change of the ins= truction relocation bit in the MSR can > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* cause an impli= cit branch in the address space. This > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* requires a tlb= flush. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (env->mmu_model & P= OWERPC_MMU_32B) { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= env->tlb_need_flush =3D 1; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > +=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* We don't use hreg_store_msr here as al= ready have treated > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* any special case that could occur= . Just store MSR and update > hflags > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* >=20