Re: [PATCH 02/11] tcg/loongarch64: Lower basic tcg vec ops to LSX

From: Richard Henderson <richard.henderson@linaro.org>
To: Jiajie Chen <c@jia.je>, qemu-devel@nongnu.org
Cc: gaosong@loongson.cn, WANG Xuerui <git@xen0n.name>
Subject: Re: [PATCH 02/11] tcg/loongarch64: Lower basic tcg vec ops to LSX
Date: Mon, 28 Aug 2023 09:57:45 -0700
Message-ID: <692c49da-af4d-3913-cf82-726294a0d792@linaro.org>
In-Reply-To: <20230828152009.352048-3-c@jia.je>

On 8/28/23 08:19, Jiajie Chen wrote:
> +static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece,
> +                             TCGReg rd, int64_t v64)
> +{
> +    /* Try vldi if imm can fit */
> +    if (vece <= MO_32 && (-0x200 <= v64 && v64 <= 0x1FF)) {
> +        uint32_t imm = (vece << 10) | ((uint32_t)v64 & 0x3FF);
> +        tcg_out_opc_vldi(s, rd, imm);
> +        return;
> +    }

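v64 has the element value replicated across all 64 bits, so the range check
on v64 itself rejects nearly every value when the element is narrower than
64 bits.  In order to do the comparison above, you'll want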

     int64_t vale = sextract64(v64, 0, 8 << vece);
     if (-0x200 <= vale && vale <= 0x1ff)
         ...
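
To illustrate the point about replication, here is a minimal standalone sketch. It is not the backend code: sextract64_sketch() is a local stand-in for QEMU's sextract64(), and the two fits_* helpers are hypothetical names used only to contrast the check from the patch with the suggested one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Local stand-in for QEMU's sextract64(): sign-extend the `length`-bit
 * field starting at bit `start`.  Relies on arithmetic right shift of
 * signed values, as QEMU itself does.
 */
static int64_t sextract64_sketch(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (int64_t)(value << (64 - length - start)) >> (64 - length);
}

/* The range test from the patch, applied to the replicated value. */
static int fits_simm10_replicated(int64_t v64)
{
    return -0x200 <= v64 && v64 <= 0x1ff;
}

/*
 * The suggested test: look at one sign-extended element instead.
 * vece is log2 of the element size in bytes (MO_8 == 0 ... MO_64 == 3),
 * hence the element width of 8 << vece bits.
 */
static int fits_simm10_element(int64_t v64, unsigned vece)
{
    int64_t vale = sextract64_sketch((uint64_t)v64, 0, 8 << vece);
    return -0x200 <= vale && vale <= 0x1ff;
}
```

With a 16-bit element of 1, the replicated constant is 0x0001000100010001: the first helper rejects it, while the second sees vale == 1 and accepts it.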

Since the only documentation for LSX is QEMU's own translator code, why are
you testing vece <= MO_32?  Shouldn't MO_64 be available as well?  Or is
there a bug in trans_vldi()?

It might be nice to leave a TODO for vldi with imm bit 12 set, covering the
patterns expanded by vldi_get_value().  In particular, mode == 9 is likely to
be useful, and modes {1,2,3,5} are easy to test for.
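
As a rough idea of what a test for one of those patterns could look like, the sketch below recognises a per-byte mask. It assumes that mode 9 expands each of the low 8 immediate bits into a full 0x00 or 0xff byte, which is my reading of vldi_get_value() and should be checked against the translator. match_byte_mask() is a hypothetical helper, and the final vldi immediate encoding is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: if every byte of v64 is 0x00 or 0xff, store the
 * 8-bit selector (bit i set when byte i is 0xff) in *sel and return true.
 * ASSUMPTION: this matches the pattern produced by vldi mode 9; verify
 * against vldi_get_value() before relying on it.
 */
static bool match_byte_mask(uint64_t v64, uint8_t *sel)
{
    uint8_t bits = 0;

    for (int i = 0; i < 8; i++) {
        uint8_t byte = (v64 >> (i * 8)) & 0xff;

        if (byte == 0xff) {
            bits |= (uint8_t)(1u << i);
        } else if (byte != 0) {
            return false;   /* mixed byte: not a pure byte mask */
        }
    }
    *sel = bits;
    return true;
}
```

For example, 0x00ff00ff00ff00ff yields the selector 0x55, while 0x0000000000000001 is rejected.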

> +
> +    /* Fallback to vreplgr2vr */
> +    tcg_out_movi(s, type, TCG_REG_TMP0, v64);

type is a vector type; you can't use it here.
The correct type would be TCG_TYPE_I64.

Better still, load vale instead of v64, since the sign-extended element
will generally take fewer insns in tcg_out_movi.

> +static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
> +                           unsigned vecl, unsigned vece,
> +                           const TCGArg args[TCG_MAX_OP_ARGS],
> +                           const int const_args[TCG_MAX_OP_ARGS])
> +{
> +    TCGType type = vecl + TCG_TYPE_V64;
> +    TCGArg a0, a1, a2;
> +    TCGReg base;
> +    TCGReg temp = TCG_REG_TMP0;
> +    int32_t offset;
> +
> +    a0 = args[0];
> +    a1 = args[1];
> +    a2 = args[2];
> +
> +    /* Currently only supports V128 */
> +    tcg_debug_assert(type == TCG_TYPE_V128);
> +
> +    switch (opc) {
> +    case INDEX_op_st_vec:
> +        /* Try to fit vst imm */
> +        if (-0x800 <= a2 && a2 <= 0x7ff) {
> +            base = a1;
> +            offset = a2;
> +        } else {
> +            tcg_out_addi(s, TCG_TYPE_I64, temp, a1, a2);
> +            base = temp;
> +            offset = 0;
> +        }
> +        tcg_out_opc_vst(s, a0, base, offset);
> +        break;
> +    case INDEX_op_ld_vec:
> +        /* Try to fit vld imm */
> +        if (-0x800 <= a2 && a2 <= 0x7ff) {
> +            base = a1;
> +            offset = a2;
> +        } else {
> +            tcg_out_addi(s, TCG_TYPE_I64, temp, a1, a2);
> +            base = temp;
> +            offset = 0;
> +        }
> +        tcg_out_opc_vld(s, a0, base, offset);

tcg_out_addi has a hole in bits [15:12], and can take an extra insn if those
bits are set.  Better to load the offset with tcg_out_movi and then use
VLDX/VSTX instead of VLD/VST.
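
The hole can be made concrete with a small standalone sketch. It assumes tcg_out_addi splits the immediate into a sign-extended low 12-bit part and a sign-extended 16-bit part shifted left by 16, which is my reading of the backend and not something this message states. All names below are hypothetical, and offsets near the int64_t limits are not handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend the `len`-bit field of `val` starting at bit `pos`. */
static int64_t sext_field(int64_t val, int pos, int len)
{
    return (int64_t)((uint64_t)val << (64 - len - pos)) >> (64 - len);
}

/* True when the offset fits the signed 12-bit displacement of VLD/VST. */
static bool fits_simm12(int64_t off)
{
    return -0x800 <= off && off <= 0x7ff;
}

/*
 * True when imm == (hi16 << 16) + lo12, i.e. nothing is left over in
 * bits [15:12] once the low part has been subtracted.
 * ASSUMPTION: models the split used by tcg_out_addi.
 */
static bool addi_avoids_hole(int64_t imm)
{
    int64_t lo12 = sext_field(imm, 0, 12);
    int64_t hi16 = sext_field(imm - lo12, 16, 16);

    return imm == hi16 * 65536 + lo12;
}

/* Hypothetical addressing choice following the suggestion above. */
typedef enum { USE_VLD_IMM, USE_VLDX_REG } AddrForm;

static AddrForm pick_addr_form(int64_t off)
{
    /* In range: VLD/VST with an immediate; otherwise movi + VLDX/VSTX. */
    return fits_simm12(off) ? USE_VLD_IMM : USE_VLDX_REG;
}
```

An offset such as 0x800 just misses the 12-bit range and also lands in the hole, so the addi path would need extra work, whereas the indexed form only needs the constant in a register.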

> @@ -159,6 +170,30 @@ typedef enum {
>   #define TCG_TARGET_HAS_mulsh_i64        1
>   #define TCG_TARGET_HAS_qemu_ldst_i128   0
>   
> +#define TCG_TARGET_HAS_v64              0
> +#define TCG_TARGET_HAS_v128             use_lsx_instructions
> +#define TCG_TARGET_HAS_v256             0

Perhaps reserve this for a follow-up, but TCG_TARGET_HAS_v64 can easily be
supported using the same instructions.

The only difference is load/store, where you could use FLD.D/FST.D to load
the lower 64 bits of the fp/vector register, or VLDREPL.D to load and
initialize all bits and VSTELM.D to store the lower 64 bits.

I tend to think the float insns are more flexible, having a larger
displacement, and the availability of FLDX/FSTX as well.

r~

