From: David Gibson <david@gibson.dropbear.id.au>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 26/32] ppc: Speed up dcbz
Date: Wed, 27 Jul 2016 12:36:54 +1000
Message-ID: <20160727023654.GZ17429@voom.fritz.box>
In-Reply-To: <1469571686-7284-26-git-send-email-benh@kernel.crashing.org>
On Wed, Jul 27, 2016 at 08:21:20AM +1000, Benjamin Herrenschmidt wrote:
> Use tlb_vaddr_to_host to do a fast path single translate for
> the whole cache line. Also make the reservation check match
> the entire range.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
> target-ppc/mem_helper.c | 46 +++++++++++++++++++++++++---------------------
> target-ppc/translate.c | 11 ++++-------
> 2 files changed, 29 insertions(+), 28 deletions(-)
>
> diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c
> index 92a594c..6548715 100644
> --- a/target-ppc/mem_helper.c
> +++ b/target-ppc/mem_helper.c
> @@ -141,35 +141,39 @@ void helper_stsw(CPUPPCState *env, target_ulong addr, uint32_t nb,
> }
> }
>
> -static void do_dcbz(CPUPPCState *env, target_ulong addr, int dcache_line_size,
> - uintptr_t raddr)
> +void helper_dcbz(CPUPPCState *env, target_ulong addr, uint32_t opcode)
> {
> - int i;
> -
> - addr &= ~(dcache_line_size - 1);
> - for (i = 0; i < dcache_line_size; i += 4) {
> - cpu_stl_data_ra(env, addr + i, 0, raddr);
> - }
> - if (env->reserve_addr == addr) {
> - env->reserve_addr = (target_ulong)-1ULL;
> - }
> -}
> -
> -void helper_dcbz(CPUPPCState *env, target_ulong addr, uint32_t is_dcbzl)
> -{
> - int dcbz_size = env->dcache_line_size;
> + target_ulong mask, dcbz_size = env->dcache_line_size;
> + uint32_t i;
> + void *haddr;
>
> #if defined(TARGET_PPC64)
> - if (!is_dcbzl &&
> - (env->excp_model == POWERPC_EXCP_970) &&
> - ((env->spr[SPR_970_HID5] >> 7) & 0x3) == 1) {
> + /* Check for dcbz vs dcbzl on 970 */
> + if (env->excp_model == POWERPC_EXCP_970 &&
> + !(opcode & 0x00200000) && ((env->spr[SPR_970_HID5] >> 7) & 0x3) == 1) {
> dcbz_size = 32;
> }
> #endif
>
> - /* XXX add e500mc support */
> + /* Align address */
> + mask = ~(dcbz_size - 1);
> + addr &= mask;
> +
> + /* Check reservation */
> + if ((env->reserve_addr & mask) == (addr & mask)) {
> + env->reserve_addr = (target_ulong)-1ULL;
> + }
>
> - do_dcbz(env, addr, dcbz_size, GETPC());
> + /* Try fast path translate */
> + haddr = tlb_vaddr_to_host(env, addr, MMU_DATA_STORE, env->dmmu_idx);
It worries me slightly that tlb_vaddr_to_host() doesn't take any length
to verify. I guess it's OK in practice, because memory blocks will
always be aligned to at least the cache line size, so the aligned line
being zeroed can't straddle two of them.
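To illustrate the alignment argument: a naturally aligned power-of-two
block that is no larger than a page cannot cross a page boundary, so a
single translation covers the whole memset(). A minimal sketch, assuming
both sizes are non-zero powers of two; aligned_block_in_one_page() is a
made-up name for illustration, not a QEMU function:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): returns true if the naturally
 * aligned block of block_size bytes that contains addr lies entirely
 * within one page of page_size bytes.
 */
static bool aligned_block_in_one_page(uint64_t addr, uint64_t block_size,
                                      uint64_t page_size)
{
    /* Assumption: both sizes are non-zero powers of two. */
    assert(block_size && (block_size & (block_size - 1)) == 0);
    assert(page_size && (page_size & (page_size - 1)) == 0);

    /* Same alignment step as in the patch: round down to the block start. */
    uint64_t start = addr & ~(block_size - 1);
    /* Last byte that would be zeroed. */
    uint64_t last = start + block_size - 1;

    /* One page iff the first and last byte share the same page base. */
    return (start & ~(page_size - 1)) == (last & ~(page_size - 1));
}
```

With a 128-byte line and 4K pages this holds for every address; it only
fails if the block is larger than the page, which is the case the missing
length argument would otherwise have to catch.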
> + if (haddr) {
> + memset(haddr, 0, dcbz_size);
> + } else {
> + /* Slow path */
> + for (i = 0; i < dcbz_size; i += 8) {
> + cpu_stq_data_ra(env, addr + i, 0, GETPC());
> + }
> + }
> }
>
> void helper_icbi(CPUPPCState *env, target_ulong addr)
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 57a891b..5288e02 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -3851,18 +3851,15 @@ static void gen_dcbtls(DisasContext *ctx)
> static void gen_dcbz(DisasContext *ctx)
> {
> TCGv tcgv_addr;
> - TCGv_i32 tcgv_is_dcbzl;
> - int is_dcbzl = ctx->opcode & 0x00200000 ? 1 : 0;
> + TCGv_i32 tcgv_op;
>
> gen_set_access_type(ctx, ACCESS_CACHE);
> tcgv_addr = tcg_temp_new();
> - tcgv_is_dcbzl = tcg_const_i32(is_dcbzl);
> -
> + tcgv_op = tcg_const_i32(ctx->opcode & 0x03FF000);
> gen_addr_reg_index(ctx, tcgv_addr);
> - gen_helper_dcbz(cpu_env, tcgv_addr, tcgv_is_dcbzl);
> -
> + gen_helper_dcbz(cpu_env, tcgv_addr, tcgv_op);
> tcg_temp_free(tcgv_addr);
> - tcg_temp_free_i32(tcgv_is_dcbzl);
> + tcg_temp_free_i32(tcgv_op);
> }
>
> /* dst / dstt */
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson