From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:39500)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1bSEsZ-0002k0-LH
	for qemu-devel@nongnu.org; Tue, 26 Jul 2016 22:47:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1bSEsV-0001nI-Bo
	for qemu-devel@nongnu.org; Tue, 26 Jul 2016 22:47:26 -0400
Date: Wed, 27 Jul 2016 12:36:54 +1000
From: David Gibson
Message-ID: <20160727023654.GZ17429@voom.fritz.box>
References: <1469571686-7284-1-git-send-email-benh@kernel.crashing.org>
	<1469571686-7284-26-git-send-email-benh@kernel.crashing.org>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="SqEPCaGsKgsesFh5"
Content-Disposition: inline
In-Reply-To: <1469571686-7284-26-git-send-email-benh@kernel.crashing.org>
Subject: Re: [Qemu-devel] [PATCH 26/32] ppc: Speed up dcbz
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Benjamin Herrenschmidt
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org

--SqEPCaGsKgsesFh5
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Jul 27, 2016 at 08:21:20AM +1000, Benjamin Herrenschmidt wrote:
> Use tlb_vaddr_to_host to do a fast path single translate for
> the whole cache line. Also make the reservation check match
> the entire range.
>
> Signed-off-by: Benjamin Herrenschmidt
> ---
>  target-ppc/mem_helper.c | 46 +++++++++++++++++++++++++---------------------
>  target-ppc/translate.c  | 11 ++++-------
>  2 files changed, 29 insertions(+), 28 deletions(-)
>
> diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c
> index 92a594c..6548715 100644
> --- a/target-ppc/mem_helper.c
> +++ b/target-ppc/mem_helper.c
> @@ -141,35 +141,39 @@ void helper_stsw(CPUPPCState *env, target_ulong addr, uint32_t nb,
>      }
>  }
>
> -static void do_dcbz(CPUPPCState *env, target_ulong addr, int dcache_line_size,
> -                    uintptr_t raddr)
> +void helper_dcbz(CPUPPCState *env, target_ulong addr, uint32_t opcode)
>  {
> -    int i;
> -
> -    addr &= ~(dcache_line_size - 1);
> -    for (i = 0; i < dcache_line_size; i += 4) {
> -        cpu_stl_data_ra(env, addr + i, 0, raddr);
> -    }
> -    if (env->reserve_addr == addr) {
> -        env->reserve_addr = (target_ulong)-1ULL;
> -    }
> -}
> -
> -void helper_dcbz(CPUPPCState *env, target_ulong addr, uint32_t is_dcbzl)
> -{
> -    int dcbz_size = env->dcache_line_size;
> +    target_ulong mask, dcbz_size = env->dcache_line_size;
> +    uint32_t i;
> +    void *haddr;
>
>  #if defined(TARGET_PPC64)
> -    if (!is_dcbzl &&
> -        (env->excp_model == POWERPC_EXCP_970) &&
> -        ((env->spr[SPR_970_HID5] >> 7) & 0x3) == 1) {
> +    /* Check for dcbz vs dcbzl on 970 */
> +    if (env->excp_model == POWERPC_EXCP_970 &&
> +        !(opcode & 0x00200000) && ((env->spr[SPR_970_HID5] >> 7) & 0x3) == 1) {
>          dcbz_size = 32;
>      }
>  #endif
>
> -    /* XXX add e500mc support */
> +    /* Align address */
> +    mask = ~(dcbz_size - 1);
> +    addr &= mask;
> +
> +    /* Check reservation */
> +    if ((env->reserve_addr & mask) == (addr & mask)) {
> +        env->reserve_addr = (target_ulong)-1ULL;
> +    }
>
> -    do_dcbz(env, addr, dcbz_size, GETPC());
> +    /* Try fast path translate */
> +    haddr = tlb_vaddr_to_host(env, addr, MMU_DATA_STORE, env->dmmu_idx);

It worries me slightly that this doesn't take any length to verify.
I guess it's ok in practice, because memory blocks will always be at
least cache line size aligned.

> +    if (haddr) {
> +        memset(haddr, 0, dcbz_size);
> +    } else {
> +        /* Slow path */
> +        for (i = 0; i < dcbz_size; i += 8) {
> +            cpu_stq_data_ra(env, addr + i, 0, GETPC());
> +        }
> +    }
>  }
>
>  void helper_icbi(CPUPPCState *env, target_ulong addr)
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 57a891b..5288e02 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -3851,18 +3851,15 @@ static void gen_dcbtls(DisasContext *ctx)
>  static void gen_dcbz(DisasContext *ctx)
>  {
>      TCGv tcgv_addr;
> -    TCGv_i32 tcgv_is_dcbzl;
> -    int is_dcbzl = ctx->opcode & 0x00200000 ? 1 : 0;
> +    TCGv_i32 tcgv_op;
>
>      gen_set_access_type(ctx, ACCESS_CACHE);
>      tcgv_addr = tcg_temp_new();
> -    tcgv_is_dcbzl = tcg_const_i32(is_dcbzl);
> -
> +    tcgv_op = tcg_const_i32(ctx->opcode & 0x03FF000);
>      gen_addr_reg_index(ctx, tcgv_addr);
> -    gen_helper_dcbz(cpu_env, tcgv_addr, tcgv_is_dcbzl);
> -
> +    gen_helper_dcbz(cpu_env, tcgv_addr, tcgv_op);
>      tcg_temp_free(tcgv_addr);
> -    tcg_temp_free_i32(tcgv_is_dcbzl);
> +    tcg_temp_free_i32(tcgv_op);
>  }
>
>  /* dst / dstt */

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--SqEPCaGsKgsesFh5
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJXmB5GAAoJEGw4ysog2bOS5rAQAKZjBjm68Zt4W2x2g1XAXYVZ
8GK5VPgljTy/iYKhDTyupfxfsSEVNm17hEn05oMRYkAObdwWACZtTnmQs/ghSplI
VZo7NcEPly1prsLSOzcKaCTtkvO2kGo9jHM0PgrFgATdKtUUiOrarJTeOj2xSxq3
UhVFJgrX7jjht5pjWxb1Iapq4BfJgPnTUs/DPbzNxDqNI5vKwbwo/muJVEi848Tr
D6cVcjf14ZfkwU+wPqDqLKQYvbZL0468I1mdLmP6tYDA6Anux9Q1F4b0dVwprmOQ
YMM9xM5Thvp806nTgi/MhbAZVy1n6CWZ4MAyKicCv3B+f2gubl5zbkweBfz+5uBb
OThbZA3mYXx2Ex7kQGywW7G1Uy3EMYcaa7XSYj4ZgANrdnbgRyFWXyh84T2nEekd
9mFCKWzRtr2k5IYiAtoKyB7/abhA+1fObyShSxmeSagFhW7lc4qaopoCBhfojHzC
GEcSfGG+GdvVkcKHUu23zPdQZfajdS0PmSpENS6jHtcxReDgtW8q7PSgc7G4Skzv
bha+c19TmY3/3Z8VagqupPRZS7mMtFu1iGpBYK2O5kAmg5BOXPt6ApSM2hS1C+rm
JKpMX1omnaTOO4TGafVrdkVEb8zOawOjzo2UuN0tSeHHphQp7UVV2xYKQ2/kiUXN
Kl3ItBHmtVPcXRfbOphL
=eH5g
-----END PGP SIGNATURE-----

--SqEPCaGsKgsesFh5--