All of lore.kernel.org
 help / color / mirror / Atom feed
From: Claudio Fontana <claudio.fontana@huawei.com>
To: Richard Henderson <rth@twiddle.net>, qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, claudio.fontana@gmail.com
Subject: Re: [Qemu-devel] [PATCH 04/26] tcg-aarch64: Use MOVN in tcg_out_movi
Date: Mon, 24 Mar 2014 15:06:11 +0100	[thread overview]
Message-ID: <53303BD3.2070109@huawei.com> (raw)
In-Reply-To: <1394851732-25692-5-git-send-email-rth@twiddle.net>

On 15.03.2014 03:48, Richard Henderson wrote:
> When profitable, initialize the register with MOVN instead of MOVZ,
> before setting the remaining lanes with MOVK.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/aarch64/tcg-target.c | 62 ++++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 49 insertions(+), 13 deletions(-)
> 
> diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c
> index 47f4708..a7b6796 100644
> --- a/tcg/aarch64/tcg-target.c
> +++ b/tcg/aarch64/tcg-target.c
> @@ -531,24 +531,60 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
>                           tcg_target_long value)
>  {
>      AArch64Insn insn;
> -
> -    if (type == TCG_TYPE_I32) {
> +    int i, wantinv, shift;
> +    tcg_target_long svalue = value;
> +    tcg_target_long ivalue, imask;
> +
> +    /* For 32-bit values, discard potential garbage in value.  For 64-bit
> +       values within [2**31, 2**32-1], we can create smaller sequences by
> +       interpreting this as a negative 32-bit number, while ensuring that
> +       the high 32 bits are cleared by setting SF=0.  */
> +    if (type == TCG_TYPE_I32 || (value & ~0xffffffffull) == 0) {
> +        svalue = (int32_t)value;
>          value = (uint32_t)value;
> +        type = TCG_TYPE_I32;
> +    }
> +    ivalue = ~svalue;
> +
> +    /* Would it take fewer insns to begin with MOVN?  For the value and its
> +       inverse, count the number of 16-bit lanes that are 0.  */
> +    for (i = wantinv = imask = 0; i < (32 << type); i += 16) {
> +        tcg_target_long mask = 0xffffull << i;
> +        if ((value & mask) == 0) {
> +            wantinv -= 1;
> +        }
> +        if ((ivalue & mask) == 0) {
> +            wantinv += 1;
> +            imask |= mask;
> +        }
>      }
>  
> -    /* count trailing zeros in 16 bit steps, mapping 64 to 0. Emit the
> -       first MOVZ with the half-word immediate skipping the zeros, with a shift
> -       (LSL) equal to this number. Then all next instructions use MOVKs.
> -       Zero the processed half-word in the value, continue until empty.
> -       We build the final result 16bits at a time with up to 4 instructions,
> -       but do not emit instructions for 16bit zero holes. */
> +    /* If we had more 0xffff than 0x0000, invert VALUE and use MOVN.  */
>      insn = I3405_MOVZ;
> -    do {
> -        unsigned shift = ctz64(value) & (63 & -16);
> -        tcg_out_insn_3405(s, insn, shift >= 32, rd, value >> shift, shift);
> +    if (wantinv > 0) {
> +        value = ivalue;
> +        insn = I3405_MOVN;
> +    }
> +
> +    /* Find the lowest lane that is not 0x0000.  */
> +    shift = ctz64(value) & (63 & -16);
> +    tcg_out_insn_3405(s, insn, type, rd, value >> shift, shift);
> +
> +    if (wantinv > 0) {
> +        /* Re-invert the value, so MOVK sees non-inverted bits.  */
> +        value = ~value;
> +        /* Clear out all the 0xffff lanes.  */
> +        value ^= imask;
> +    }
> +    /* Clear out the lane that we just set.  */
> +    value &= ~(0xffffUL << shift);
> +
> +    /* Iterate until all lanes have been set, and thus cleared from VALUE.  */
> +    while (value) {
> +        shift = ctz64(value) & (63 & -16);
> +        tcg_out_insn(s, 3405, MOVK, type, rd, value >> shift, shift);
>          value &= ~(0xffffUL << shift);
> -        insn = I3405_MOVK;
> -    } while (value);
> +    }
>  }
>  
>  static inline void tcg_out_ldst_r(TCGContext *s,
> 

Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>

  reply	other threads:[~2014-03-24 14:20 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-15  2:48 [Qemu-devel] [PATCH 00/26] tcg/aarch64 improvements, part 3 Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 01/26] tcg-aarch64: Properly detect SIGSEGV writes Richard Henderson
2014-03-24 11:05   ` Claudio Fontana
2014-03-24 15:33     ` Richard Henderson
2014-03-24 11:45   ` Claudio Fontana
2014-03-24 12:17   ` Peter Maydell
2014-03-24 12:41   ` Peter Maydell
2014-03-24 15:27     ` Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 02/26] tcg-aarch64: Use intptr_t apropriately Richard Henderson
2014-03-24 12:12   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 03/26] tcg-aarch64: Use TCGType and TCGMemOp constants Richard Henderson
2014-03-24 12:52   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 04/26] tcg-aarch64: Use MOVN in tcg_out_movi Richard Henderson
2014-03-24 14:06   ` Claudio Fontana [this message]
2014-03-15  2:48 ` [Qemu-devel] [PATCH 05/26] tcg-aarch64: Use ORRI " Richard Henderson
2014-03-24 14:06   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 06/26] tcg-aarch64: Special case small constants " Richard Henderson
2014-03-24 14:08   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 07/26] tcg-aarch64: Use adrp " Richard Henderson
2014-03-24 14:05   ` Claudio Fontana
2014-03-24 15:36     ` Richard Henderson
2014-03-26  9:34   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 08/26] tcg-aarch64: Use symbolic names for branches Richard Henderson
2014-03-24 15:31   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 09/26] tcg-aarch64: Create tcg_out_brcond Richard Henderson
2014-03-24 15:31   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 10/26] tcg-aarch64: Use CBZ and CBNZ Richard Henderson
2014-03-24 15:32   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 11/26] tcg-aarch64: Reuse FP and LR in translated code Richard Henderson
2014-03-28  9:48   ` Claudio Fontana
2014-03-28 13:23     ` Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 12/26] tcg-aarch64: Introduce tcg_out_insn_3314 Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 13/26] tcg-aarch64: Rearrange prologue insn order Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 14/26] tcg-aarch64: Implement tcg_register_jit Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 15/26] tcg-aarch64: Avoid add with zero in tlb load Richard Henderson
2014-03-26  9:36   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 16/26] tcg-aarch64: Use tcg_out_call for qemu_ld/st Richard Henderson
2014-03-26  9:37   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 17/26] tcg-aarch64: Use ADR to pass the return address to the ld/st helpers Richard Henderson
2014-03-26  9:38   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 18/26] tcg-aarch64: Use TCGMemOp in qemu_ld/st Richard Henderson
2014-03-26  9:39   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 19/26] tcg-aarch64: Implement TCG_TARGET_HAS_new_ldst Richard Henderson
2014-03-26  9:40   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 20/26] tcg-aarch64: Introduce tcg_out_insn_3507 Richard Henderson
2014-03-26  9:40   ` Claudio Fontana
2014-03-15  2:48 ` [Qemu-devel] [PATCH 21/26] tcg-aarch64: Merge aarch64_ldst_get_data/type into tcg_out_op Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 22/26] tcg-aarch64: Replace aarch64_ldst_op_data with TCGMemOp Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 23/26] tcg-aarch64: Replace aarch64_ldst_op_data with AArch64LdstType Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 24/26] tcg-aarch64: Prefer unsigned offsets before signed offsets for ldst Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 25/26] tcg-aarch64: Merge tcg_out_movr with tcg_out_mov Richard Henderson
2014-03-15  2:48 ` [Qemu-devel] [PATCH 26/26] tcg-aarch64: Support stores of zero Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53303BD3.2070109@huawei.com \
    --to=claudio.fontana@huawei.com \
    --cc=claudio.fontana@gmail.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.