qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 32/32] ppc: Speed up load/store multiple
Date: Wed, 27 Jul 2016 12:47:13 +1000	[thread overview]
Message-ID: <20160727024713.GA17429@voom.fritz.box> (raw)
In-Reply-To: <1469571686-7284-32-git-send-email-benh@kernel.crashing.org>

[-- Attachment #1: Type: text/plain, Size: 3563 bytes --]

On Wed, Jul 27, 2016 at 08:21:26AM +1000, Benjamin Herrenschmidt wrote:
> Use a single translate when not crossing a page boundary and avoid
> going through layers of helpers. MacOS uses those instructions
> a lot, so does OpenBIOS.
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>  target-ppc/mem_helper.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 69 insertions(+)
> 
> diff --git a/target-ppc/mem_helper.c b/target-ppc/mem_helper.c
> index da3f973..511079b 100644
> --- a/target-ppc/mem_helper.c
> +++ b/target-ppc/mem_helper.c
> @@ -53,8 +53,48 @@ static inline target_ulong addr_add(CPUPPCState *env, target_ulong addr,
>      }
>  }
>  
> +/* Reduce the length so that addr + len doesn't cross a page boundary.  */
> +static inline uint64_t adj_len_to_page(uint64_t len, uint64_t addr)
> +{
> +#ifndef CONFIG_USER_ONLY
> +    if ((addr & ~TARGET_PAGE_MASK) + len - 1 >= TARGET_PAGE_SIZE) {
> +        return -addr & ~TARGET_PAGE_MASK;
> +    }
> +#endif
> +    return len;
> +}
> +
>  void helper_lmw(CPUPPCState *env, target_ulong addr, uint32_t reg)
>  {
> +    uint32_t *src;
> +    uint64_t len, adjlen;
> +
> +    if ((addr & 3)) {
> +        goto fallback;
> +    }
> +    len = (32 - reg) << 2;
> +    while (len) {
> +        src = tlb_vaddr_to_host(env, addr, MMU_DATA_LOAD, env->dmmu_idx);
> +        if (!src) {
> +            goto fallback;
> +        }
> +        adjlen = adj_len_to_page(len, addr);
> +        len -= adjlen;
> +#if defined(HOST_WORDS_BIGENDIAN)
> +        memcpy(&env->gpr[reg], src, adjlen);
> +        reg += (adjlen >> 2);
> +        addr = addr_add(env, addr, adjlen);
> +#else
> +        while(adjlen) {
> +            env->gpr[reg++] = bswap32(*(src++));
> +            adjlen -= 4;
> +            addr = addr_add(env, addr, 4);
> +        }
> +#endif

Would it improve this any further to do the memcpy() unconditionally,
then byteswap the GPRs in-place for the LE host case?

> +    }
> +    return;
> +
> + fallback:
>      for (; reg < 32; reg++) {
>          if (needs_byteswap(env)) {
>              env->gpr[reg] = bswap32(cpu_ldl_data_ra(env, addr, GETPC()));
> @@ -67,6 +107,35 @@ void helper_lmw(CPUPPCState *env, target_ulong addr, uint32_t reg)
>  
>  void helper_stmw(CPUPPCState *env, target_ulong addr, uint32_t reg)
>  {
> +    uint32_t *dst;
> +    uint64_t len, adjlen;
> +
> +    if ((addr & 3)) {
> +        goto fallback;
> +    }
> +    len = (32 - reg) << 2;
> +    while (len) {
> +        dst = tlb_vaddr_to_host(env, addr, MMU_DATA_STORE, env->dmmu_idx);
> +        if (!dst) {
> +            goto fallback;
> +        }
> +        adjlen = adj_len_to_page(len, addr);
> +        len -= adjlen;
> +#if defined(HOST_WORDS_BIGENDIAN)
> +        memcpy(dst, &env->gpr[reg], adjlen);
> +        reg += (adjlen >> 2);
> +        addr = addr_add(env, addr, adjlen);
> +#else
> +        while(adjlen) {
> +            *(dst++) = bswap32(env->gpr[reg++]);
> +            adjlen -= 4;
> +            addr = addr_add(env, addr, 4);
> +        }
> +#endif
> +    }
> +    return;
> +
> + fallback:
>      for (; reg < 32; reg++) {
>          if (needs_byteswap(env)) {
>              cpu_stl_data_ra(env, addr, bswap32((uint32_t)env->gpr[reg]),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2016-07-27  2:47 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-26 22:20 [Qemu-devel] [PATCH 01/32] ppc: Fix fault PC reporting for lve*/stve* VMX instructions Benjamin Herrenschmidt
2016-07-26 22:20 ` [Qemu-devel] [PATCH 02/32] ppc: Provide basic raise_exception_* functions Benjamin Herrenschmidt
2016-07-27  1:50   ` David Gibson
2016-07-27  3:46     ` Benjamin Herrenschmidt
2016-07-26 22:20 ` [Qemu-devel] [PATCH 03/32] ppc: Move classic fp ops out of translate.c Benjamin Herrenschmidt
2016-07-28 16:02   ` Richard Henderson
2016-07-28 21:56     ` Benjamin Herrenschmidt
2016-07-26 22:20 ` [Qemu-devel] [PATCH 04/32] ppc: Move embedded spe " Benjamin Herrenschmidt
2016-07-26 22:20 ` [Qemu-devel] [PATCH 05/32] ppc: Move DFP " Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 06/32] ppc: Move VMX " Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 07/32] ppc: Move VSX " Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 08/32] ppc: Rename fload_invalid_op_excp to float_invalid_op_excp Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 09/32] ppc: Make float_invalid_op_excp() pass the return address Benjamin Herrenschmidt
2016-07-28 16:06   ` Richard Henderson
2016-07-28 21:57     ` Benjamin Herrenschmidt
2016-07-28 22:10       ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 10/32] ppc: Make float_check_status() " Benjamin Herrenschmidt
2016-07-27  1:57   ` David Gibson
2016-07-27  3:47     ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 11/32] ppc: Don't update the NIP in floating point generated code Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 12/32] ppc: FP exceptions are always precise Benjamin Herrenschmidt
2016-07-27  2:00   ` David Gibson
2016-07-27  3:50     ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 13/32] ppc: Don't update NIP in lswi/lswx/stswi/stswx Benjamin Herrenschmidt
2016-07-27  2:04   ` David Gibson
2016-07-27  3:51     ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 14/32] ppc: Don't update NIP in lmw/stmw/icbi Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 15/32] ppc: Make tlb_fill() use new exception helper Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 16/32] ppc: Rework NIP updates vs. exception generation Benjamin Herrenschmidt
2016-07-27  2:19   ` David Gibson
2016-07-27  3:54     ` Benjamin Herrenschmidt
2016-07-27  4:35     ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 17/32] ppc: Fix source NIP on SLB related interrupts Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 18/32] ppc: Don't update NIP in DCR access routines Benjamin Herrenschmidt
2016-07-27  2:21   ` David Gibson
2016-07-27  3:55     ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 19/32] ppc: Don't update NIP in facility unavailable interrupts Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 20/32] ppc: Don't update NIP BookE 2.06 tlbwe Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 21/32] ppc: Don't update NIP on conditional trap instructions Benjamin Herrenschmidt
2016-07-27  2:26   ` David Gibson
2016-07-27  3:56     ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 22/32] ppc: Don't update NIP if not taking alignment exceptions Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 23/32] ppc: Don't update NIP in dcbz and lscbx Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 24/32] ppc: Make alignment exceptions suck less Benjamin Herrenschmidt
2016-07-27  2:30   ` David Gibson
2016-07-27  3:59     ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 25/32] ppc: Handle unconditional (always/never) traps at translation time Benjamin Herrenschmidt
2016-07-27  2:33   ` David Gibson
2016-07-27  4:00     ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 26/32] ppc: Speed up dcbz Benjamin Herrenschmidt
2016-07-27  2:36   ` David Gibson
2016-07-27  4:02     ` Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 27/32] ppc: Fix CFAR updates Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 28/32] ppc: Avoid double translation for lvx/lvxl/stvx/stvxl Benjamin Herrenschmidt
2016-07-29  0:49   ` Richard Henderson
2016-07-29  2:13     ` Benjamin Herrenschmidt
2016-07-29  3:34       ` David Gibson
2016-07-29  4:40         ` Benjamin Herrenschmidt
2016-07-29  4:58           ` Benjamin Herrenschmidt
2016-07-29  5:42             ` David Gibson
2016-07-29  9:00     ` Benjamin Herrenschmidt
2016-07-29 12:43       ` Richard Henderson
2016-07-26 22:21 ` [Qemu-devel] [PATCH 29/32] ppc: Don't set access_type on all load/stores on hash64 Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 30/32] ppc: Use a helper to generate "LE unsupported" alignment interrupts Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 31/32] ppc: load/store multiple and string insns don't do LE Benjamin Herrenschmidt
2016-07-26 22:21 ` [Qemu-devel] [PATCH 32/32] ppc: Speed up load/store multiple Benjamin Herrenschmidt
2016-07-27  2:47   ` David Gibson [this message]
2016-07-27  4:04     ` Benjamin Herrenschmidt
2016-07-27  1:06 ` [Qemu-devel] [PATCH 01/32] ppc: Fix fault PC reporting for lve*/stve* VMX instructions David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160727024713.GA17429@voom.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=benh@kernel.crashing.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).