qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Maydell <peter.maydell@linaro.org>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-devel@nongnu.org, "Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: Re: [PULL v2 77/91] target/arm: Avoid tcg_const_ptr in handle_vec_simd_sqshrn
Date: Tue, 23 Jan 2024 15:09:18 +0000	[thread overview]
Message-ID: <CAFEAcA-VGXJCL5XVqYfr6MmvMYX1hL68fd2kMPUz4dfuH9p4WQ@mail.gmail.com> (raw)
In-Reply-To: <20230309200550.3878088-78-richard.henderson@linaro.org>

On Thu, 9 Mar 2023 at 20:10, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> It is easy enough to use mov instead of or-with-zero
> and relying on the optimizer to fold away the or.
>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/tcg/translate-a64.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
> index 2ad7c48901..082a8b82dd 100644
> --- a/target/arm/tcg/translate-a64.c
> +++ b/target/arm/tcg/translate-a64.c
> @@ -8459,7 +8459,7 @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
>      tcg_rn = tcg_temp_new_i64();
>      tcg_rd = tcg_temp_new_i64();
>      tcg_rd_narrowed = tcg_temp_new_i32();
> -    tcg_final = tcg_const_i64(0);
> +    tcg_final = tcg_temp_new_i64();
>
>      if (round) {
>          tcg_round = tcg_constant_i64(1ULL << (shift - 1));
> @@ -8473,7 +8473,11 @@ static void handle_vec_simd_sqshrn(DisasContext *s, bool is_scalar, bool is_q,
>                                  false, is_u_shift, size+1, shift);
>          narrowfn(tcg_rd_narrowed, cpu_env, tcg_rd);
>          tcg_gen_extu_i32_i64(tcg_rd, tcg_rd_narrowed);
> -        tcg_gen_deposit_i64(tcg_final, tcg_final, tcg_rd, esize * i, esize);
> +        if (i == 0) {
> +            tcg_gen_mov_i64(tcg_final, tcg_rd);
> +        } else {
> +            tcg_gen_deposit_i64(tcg_final, tcg_final, tcg_rd, esize * i, esize);
> +        }

So, it turns out that this causes a regression:
https://gitlab.com/qemu-project/qemu/-/issues/2089

The change here is fine for the vector case, because when
we loop round the subsequent deposit ops will overwrite
the bits of tcg_final above the initial element, whatever
they happen to be in tcg_rd. However, for the scalar case
we only execute this loop once, and so after this change
instead of the high bits of the result being 0, we leave
them as whatever they were in tcg_rd. If the narrow is a
signed version and the result was negative, those high bits
will now be 1 instead of the 0 they should be.

Using
 tcg_gen_extract_i64(tcg_final, tcg_rd, 0, esize);
instead of the tcg_gen_mov_i64() should fix this.

I'll send a patch later this afternoon.

thanks
-- PMM


  reply	other threads:[~2024-01-23 15:10 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-09 20:04 [PULL v2 00/91] tcg patch queue Richard Henderson
2023-03-09 20:04 ` [PULL v2 01/91] target/mips: Drop tcg_temp_free from micromips_translate.c.inc Richard Henderson
2023-03-09 20:04 ` [PULL v2 02/91] target/mips: Drop tcg_temp_free from msa_translate.c Richard Henderson
2023-03-09 20:04 ` [PULL v2 03/91] target/mips: Drop tcg_temp_free from mxu_translate.c Richard Henderson
2023-03-09 20:04 ` [PULL v2 04/91] target/mips: Drop tcg_temp_free from nanomips_translate.c.inc Richard Henderson
2023-03-09 20:04 ` [PULL v2 05/91] target/mips: Drop tcg_temp_free from octeon_translate.c Richard Henderson
2023-03-09 20:04 ` [PULL v2 06/91] target/mips: Drop tcg_temp_free from translate_addr_const.c Richard Henderson
2023-03-09 20:04 ` [PULL v2 07/91] target/mips: Drop tcg_temp_free from tx79_translate.c Richard Henderson
2023-03-09 20:04 ` [PULL v2 08/91] target/mips: Drop tcg_temp_free from vr54xx_translate.c Richard Henderson
2023-03-09 20:04 ` [PULL v2 09/91] target/mips: Drop tcg_temp_free from translate.c Richard Henderson
2023-03-09 20:04 ` [PULL v2 10/91] target/s390x: Drop free_compare Richard Henderson
2023-03-09 20:04 ` [PULL v2 11/91] target/s390x: Drop tcg_temp_free from translate_vx.c.inc Richard Henderson
2023-03-09 20:04 ` [PULL v2 12/91] target/s390x: Drop tcg_temp_free from translate.c Richard Henderson
2023-03-09 20:04 ` [PULL v2 13/91] target/s390x: Remove assert vs g_in2 Richard Henderson
2023-03-09 20:04 ` [PULL v2 14/91] target/s390x: Remove g_out, g_out2, g_in1, g_in2 from DisasContext Richard Henderson
2023-03-09 20:04 ` [PULL v2 15/91] tcg: Create tcg/tcg-temp-internal.h Richard Henderson
2023-03-09 20:04 ` [PULL v2 16/91] include/exec: Set default `NB_MMU_MODES` to 16 Richard Henderson
2023-03-09 20:04 ` [PULL v2 17/91] target/alpha: Remove `NB_MMU_MODES` define Richard Henderson
2023-03-09 20:04 ` [PULL v2 18/91] target/arm: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 19/91] target/avr: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 20/91] target/cris: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 21/91] target/hexagon: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 22/91] target/hppa: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 23/91] target/i386: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 24/91] target/loongarch: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 25/91] target/m68k: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 26/91] target/microblaze: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 27/91] target/mips: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 28/91] target/nios2: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 29/91] target/openrisc: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 30/91] target/ppc: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 31/91] target/riscv: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 32/91] target/rx: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 33/91] target/s390x: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 34/91] target/sh4: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 35/91] target/sparc: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 36/91] target/tricore: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 37/91] target/xtensa: " Richard Henderson
2023-03-09 20:04 ` [PULL v2 38/91] include/exec: Remove guards around `NB_MMU_MODES` Richard Henderson
2023-03-09 20:04 ` [PULL v2 39/91] target/avr: Avoid use of tcg_const_i32 in SBIC, SBIS Richard Henderson
2023-03-09 20:04 ` [PULL v2 40/91] target/avr: Avoid use of tcg_const_i32 throughout Richard Henderson
2023-03-09 20:05 ` [PULL v2 41/91] target/cris: " Richard Henderson
2023-03-09 20:05 ` [PULL v2 42/91] target/hppa: Avoid tcg_const_i64 in trans_fid_f Richard Henderson
2023-03-09 20:05 ` [PULL v2 43/91] target/hppa: Avoid use of tcg_const_i32 throughout Richard Henderson
2023-03-09 20:05 ` [PULL v2 44/91] target/i386: Avoid use of tcg_const_* throughout Richard Henderson
2023-03-09 20:05 ` [PULL v2 45/91] target/m68k: Avoid tcg_const_i32 when modified Richard Henderson
2023-03-09 20:05 ` [PULL v2 46/91] target/m68k: Avoid tcg_const_i32 in bfop_reg Richard Henderson
2023-03-09 20:05 ` [PULL v2 47/91] target/m68k: Avoid tcg_const_* throughout Richard Henderson
2023-03-09 20:05 ` [PULL v2 48/91] target/mips: Split out gen_lxl Richard Henderson
2023-03-09 20:05 ` [PULL v2 49/91] target/mips: Split out gen_lxr Richard Henderson
2023-03-09 20:05 ` [PULL v2 50/91] target/mips: Avoid tcg_const_tl in gen_r6_ld Richard Henderson
2023-03-09 20:05 ` [PULL v2 51/91] target/mips: Avoid tcg_const_* throughout Richard Henderson
2023-03-09 20:05 ` [PULL v2 52/91] target/ppc: Split out gen_vx_vmul10 Richard Henderson
2023-03-09 20:05 ` [PULL v2 53/91] target/ppc: Avoid tcg_const_i64 in do_vector_shift_quad Richard Henderson
2023-03-09 20:05 ` [PULL v2 54/91] target/rx: Use tcg_gen_abs_i32 Richard Henderson
2023-03-09 20:05 ` [PULL v2 55/91] target/rx: Use cpu_psw_z as temp in flags computation Richard Henderson
2023-03-09 20:05 ` [PULL v2 56/91] target/rx: Avoid tcg_const_i32 when new temp needed Richard Henderson
2023-03-09 20:05 ` [PULL v2 57/91] target/rx: Avoid tcg_const_i32 Richard Henderson
2023-03-09 20:05 ` [PULL v2 58/91] target/s390x: Avoid tcg_const_i64 Richard Henderson
2023-03-09 20:05 ` [PULL v2 59/91] target/sh4: Avoid tcg_const_i32 for TAS.B Richard Henderson
2023-03-09 20:05 ` [PULL v2 60/91] target/sh4: Avoid tcg_const_i32 Richard Henderson
2023-03-09 20:05 ` [PULL v2 61/91] tcg/sparc: Avoid tcg_const_tl in gen_edge Richard Henderson
2023-03-09 20:05 ` [PULL v2 62/91] target/tricore: Split t_n as constant from temp as variable Richard Henderson
2023-03-09 20:05 ` [PULL v2 63/91] target/tricore: Rename t_off10 and use tcg_constant_i32 Richard Henderson
2023-03-09 20:05 ` [PULL v2 64/91] target/tricore: Use setcondi instead of explicit allocation Richard Henderson
2023-03-09 20:05 ` [PULL v2 65/91] target/tricore: Drop some temp initialization Richard Henderson
2023-03-09 20:05 ` [PULL v2 66/91] target/tricore: Avoid tcg_const_i32 Richard Henderson
2023-03-09 20:05 ` [PULL v2 67/91] tcg: Replace tcg_const_i64 in tcg-op.c Richard Henderson
2023-03-09 20:05 ` [PULL v2 68/91] target/arm: Use rmode >= 0 for need_rmode Richard Henderson
2023-03-09 20:05 ` [PULL v2 69/91] target/arm: Handle FPROUNDING_ODD in arm_rmode_to_sf Richard Henderson
2023-03-09 20:05 ` [PULL v2 70/91] target/arm: Improve arm_rmode_to_sf Richard Henderson
2023-03-09 20:05 ` [PULL v2 71/91] target/arm: Consistently use ARMFPRounding during translation Richard Henderson
2023-03-09 20:05 ` [PULL v2 72/91] target/arm: Create gen_set_rmode, gen_restore_rmode Richard Henderson
2023-03-09 20:05 ` [PULL v2 73/91] target/arm: Improve trans_BFCI Richard Henderson
2023-03-09 20:05 ` [PULL v2 74/91] target/arm: Avoid tcg_const_ptr in gen_sve_{ldr,str} Richard Henderson
2023-03-09 20:05 ` [PULL v2 75/91] target/arm: Avoid tcg_const_* in translate-mve.c Richard Henderson
2023-03-09 20:05 ` [PULL v2 76/91] target/arm: Avoid tcg_const_ptr in disas_simd_zip_trn Richard Henderson
2023-03-09 20:05 ` [PULL v2 77/91] target/arm: Avoid tcg_const_ptr in handle_vec_simd_sqshrn Richard Henderson
2024-01-23 15:09   ` Peter Maydell [this message]
2023-03-09 20:05 ` [PULL v2 78/91] target/arm: Avoid tcg_const_ptr in handle_rev Richard Henderson
2023-03-09 20:05 ` [PULL v2 79/91] target/m68k: Use tcg_constant_i32 in gen_ea_mode Richard Henderson
2023-03-09 20:05 ` [PULL v2 80/91] target/ppc: Avoid tcg_const_i64 in do_vcntmb Richard Henderson
2023-03-09 20:05 ` [PULL v2 81/91] target/ppc: Avoid tcg_const_* in vmx-impl.c.inc Richard Henderson
2023-03-09 20:05 ` [PULL v2 82/91] target/ppc: Avoid tcg_const_* in xxeval Richard Henderson
2023-03-09 20:05 ` [PULL v2 83/91] target/ppc: Avoid tcg_const_* in vsx-impl.c.inc Richard Henderson
2023-03-09 20:05 ` [PULL v2 84/91] target/ppc: Avoid tcg_const_* in fp-impl.c.inc Richard Henderson
2023-03-09 20:05 ` [PULL v2 85/91] target/ppc: Avoid tcg_const_* in power8-pmu-regs.c.inc Richard Henderson
2023-03-09 20:05 ` [PULL v2 86/91] target/ppc: Rewrite trans_ADDG6S Richard Henderson
2023-03-09 20:05 ` [PULL v2 87/91] target/ppc: Fix gen_tlbsx_booke206 Richard Henderson
2023-03-09 20:05 ` [PULL v2 88/91] target/ppc: Avoid tcg_const_* in translate.c Richard Henderson
2023-03-09 20:05 ` [PULL v2 89/91] target/tricore: Use min/max for saturate Richard Henderson
2023-03-09 20:05 ` [PULL v2 90/91] tcg: Drop tcg_const_*_vec Richard Henderson
2023-03-09 20:05 ` [PULL v2 91/91] tcg: Drop tcg_const_* Richard Henderson
2023-03-11 17:16 ` [PULL v2 00/91] tcg patch queue Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFEAcA-VGXJCL5XVqYfr6MmvMYX1hL68fd2kMPUz4dfuH9p4WQ@mail.gmail.com \
    --to=peter.maydell@linaro.org \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).