qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Richard Henderson <richard.henderson@linaro.org>
To: BALATON Zoltan <balaton@eik.bme.hu>,
	qemu-devel@nongnu.org, qemu-ppc@nongnu.org
Cc: Nicholas Piggin <npiggin@gmail.com>,
	Daniel Henrique Barboza <danielhb413@gmail.com>
Subject: Re: [PATCH] target/ppc/mem_helper.c: Remove a conditional from dcbz_common()
Date: Sun, 23 Jun 2024 10:52:49 -0700	[thread overview]
Message-ID: <6664471b-7223-4c6e-a106-ce272be72f28@linaro.org> (raw)
In-Reply-To: <20240622204833.5F7C74E6000@zero.eik.bme.hu>

On 6/22/24 13:48, BALATON Zoltan wrote:
> Instead of passing a bool and select a value within dcbz_common() let
> the callers pass in the right value to avoid this conditional
> statement. On PPC dcbz is often used to zero memory and some code uses
> it a lot. This change improves the run time of a test case that copies
> memory with a dcbz call in every iteration from 6.23 to 5.83 seconds.
> 
> Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
> ---
> This is just a small optimisation removing some of the overhead but
> dcbz still seems to be the biggest issue with this test. Removing the
> dcbz call it runs in 2 seconds. In a profile I see:
>    Children      Self  Command   Shared Object            Symbol
> -   55.01%    11.44%  qemu-ppc  qemu-ppc                 [.] dcbz_common.constprop.0
>                 - 43.57% dcbz_common.constprop.0
>                    - probe_access
>                       - page_get_flags
>                            interval_tree_iter_first
>                 - 11.44% helper_raise_exception_err
>                      cpu_loop_exit_restore
>                      cpu_loop
>                      cpu_exec
>                      cpu_exec_setjmp.isra.0
>                      cpu_exec_loop.constprop.0
>                      cpu_tb_exec
>                      0x7f262403636e
>                      helper_raise_exception_err
>                      cpu_loop_exit_restore
>                      cpu_loop
>                      cpu_exec
>                      cpu_exec_setjmp.isra.0
>                      cpu_exec_loop.constprop.0
>                      cpu_tb_exec
>                    - 0x7f26240386a4
>                         11.20% helper_dcbz
> +   43.81%    12.28%  qemu-ppc  qemu-ppc                 [.] probe_access
> +   39.31%     0.00%  qemu-ppc  [JIT] tid 9969           [.] 0x00007f2624000000
> +   32.45%     4.51%  qemu-ppc  qemu-ppc                 [.] page_get_flags
> +   25.50%     2.10%  qemu-ppc  qemu-ppc                 [.] interval_tree_iter_first
> +   24.67%    24.67%  qemu-ppc  qemu-ppc                 [.] interval_tree_subtree_search
> +   16.75%     1.19%  qemu-ppc  qemu-ppc                 [.] helper_dcbz
> +    4.78%     4.78%  qemu-ppc  [JIT] tid 9969           [.] 0x00007f26240386be
> +    3.46%     3.46%  qemu-ppc  libc-2.32.so             [.] __memset_avx2_unaligned_erms
> Any idea how this could be optimised further? (This is running with
> qemu-ppc user mode emulation but I think with system it might be even
> worse.) Could an inline implementation with TCG vector ops work to
> avoid the helper and let it compile to efficient host code? Even if
> that could work I don't know how to do that so I'd need some further
> advice on this.
> 
>   target/ppc/mem_helper.c | 7 +++----
>   1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/target/ppc/mem_helper.c b/target/ppc/mem_helper.c
> index f88155ad45..361fd72226 100644
> --- a/target/ppc/mem_helper.c
> +++ b/target/ppc/mem_helper.c
> @@ -271,12 +271,11 @@ void helper_stsw(CPUPPCState *env, target_ulong addr, uint32_t nb,
>   }
>   
>   static void dcbz_common(CPUPPCState *env, target_ulong addr,
> -                        uint32_t opcode, bool epid, uintptr_t retaddr)
> +                        uint32_t opcode, int mmu_idx, uintptr_t retaddr)
>   {
>       target_ulong mask, dcbz_size = env->dcache_line_size;
>       uint32_t i;
>       void *haddr;
> -    int mmu_idx = epid ? PPC_TLB_EPID_STORE : ppc_env_mmu_index(env, false);
>   
>   #if defined(TARGET_PPC64)
>       /* Check for dcbz vs dcbzl on 970 */
> @@ -309,12 +308,12 @@ static void dcbz_common(CPUPPCState *env, target_ulong addr,
>   
>   void helper_dcbz(CPUPPCState *env, target_ulong addr, uint32_t opcode)
>   {
> -    dcbz_common(env, addr, opcode, false, GETPC());
> +    dcbz_common(env, addr, opcode, ppc_env_mmu_index(env, false), GETPC());

This is already computed in the translator: DisasContext.mem_idx.
If you pass the mmu_idx as an argument, you can unify these two helpers.


r~

>   }
>   
>   void helper_dcbzep(CPUPPCState *env, target_ulong addr, uint32_t opcode)
>   {
> -    dcbz_common(env, addr, opcode, true, GETPC());
> +    dcbz_common(env, addr, opcode, PPC_TLB_EPID_STORE, GETPC());
>   }
>   
>   void helper_icbi(CPUPPCState *env, target_ulong addr)



  reply	other threads:[~2024-06-23 17:53 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-22 20:48 [PATCH] target/ppc/mem_helper.c: Remove a conditional from dcbz_common() BALATON Zoltan
2024-06-23 17:52 ` Richard Henderson [this message]
2024-06-23 22:24   ` BALATON Zoltan
2024-06-24  3:23     ` Richard Henderson
2024-06-24  9:19       ` BALATON Zoltan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6664471b-7223-4c6e-a106-ce272be72f28@linaro.org \
    --to=richard.henderson@linaro.org \
    --cc=balaton@eik.bme.hu \
    --cc=danielhb413@gmail.com \
    --cc=npiggin@gmail.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).