Re: [PATCH v2 04/14] tcg/riscv: Add riscv vset{i}vli support

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Richard Henderson <richard.henderson@linaro.org>
To: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>, qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, palmer@dabbelt.com,
	alistair.francis@wdc.com, dbarboza@ventanamicro.com,
	liwei1518@gmail.com, bmeng.cn@gmail.com,
	TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Subject: Re: [PATCH v2 04/14] tcg/riscv: Add riscv vset{i}vli support
Date: Mon, 2 Sep 2024 11:06:59 +1000	[thread overview]
Message-ID: <5da4214a-0ee8-4b37-9d95-b92c511d7386@linaro.org> (raw)
In-Reply-To: <20240830061607.1940-5-zhiwei_liu@linux.alibaba.com>

On 8/30/24 16:15, LIU Zhiwei wrote:
> From: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
> 
> In RISC-V, vector operations require initial configuration using
> the vset{i}vl{i} instruction.
> 
> This instruction:
>    1. Sets the vector length (vl) in bytes
>    2. Configures the vtype register, which includes:
>      SEW (Single Element Width)
>      LMUL (vector register group multiplier)
>      Other vector operation parameters
> 
> This configuration is crucial for defining subsequent vector
> operation behavior. To optimize performance, the configuration
> process is managed dynamically:
>    1. Reconfiguration using vset{i}vl{i} is necessary when SEW
>       or vector register group width changes.
>    2. The vset instruction can be omitted when configuration
>       remains unchanged.
> 
> This optimization is only effective within a single TB.
> Each TB requires reconfiguration at its start, as the current
> state cannot be obtained from hardware.
> 
> Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
> Signed-off-by: Weiwei Li <liwei1518@gmail.com>
> Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
> ---
>   tcg/riscv/tcg-target.c.inc | 104 +++++++++++++++++++++++++++++++++++++
>   1 file changed, 104 insertions(+)
> 
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 5ef1538aed..49d01b8775 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -119,6 +119,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
>   #define GET_VREG_SET(vlen) (vlen == 64 ? ALL_QVECTOR_REG_GROUPS : \
>                                (vlen == 128 ? ALL_DVECTOR_REG_GROUPS : \
>                                 ALL_VECTOR_REGS))
> +#define riscv_vlenb (riscv_vlen / 8)
>   
>   #define sextreg  sextract64
>   
> @@ -168,6 +169,18 @@ static bool tcg_target_const_match(int64_t val, int ct,
>    * RISC-V Base ISA opcodes (IM)
>    */
>   
> +#define V_OPIVV (0x0 << 12)
> +#define V_OPFVV (0x1 << 12)
> +#define V_OPMVV (0x2 << 12)
> +#define V_OPIVI (0x3 << 12)
> +#define V_OPIVX (0x4 << 12)
> +#define V_OPFVF (0x5 << 12)
> +#define V_OPMVX (0x6 << 12)
> +#define V_OPCFG (0x7 << 12)
> +
> +#define V_SUMOP (0x0 << 20)
> +#define V_LUMOP (0x0 << 20)
> +
>   typedef enum {
>       OPC_ADD = 0x33,
>       OPC_ADDI = 0x13,
> @@ -263,6 +276,11 @@ typedef enum {
>       /* Zicond: integer conditional operations */
>       OPC_CZERO_EQZ = 0x0e005033,
>       OPC_CZERO_NEZ = 0x0e007033,
> +
> +    /* V: Vector extension 1.0 */
> +    OPC_VSETVLI  = 0x57 | V_OPCFG,
> +    OPC_VSETIVLI = 0xc0000057 | V_OPCFG,
> +    OPC_VSETVL   = 0x80000057 | V_OPCFG,
>   } RISCVInsn;
>   
>   /*
> @@ -355,6 +373,35 @@ static int32_t encode_uj(RISCVInsn opc, TCGReg rd, uint32_t imm)
>       return opc | (rd & 0x1f) << 7 | encode_ujimm20(imm);
>   }
>   
> +typedef enum {
> +    VTA_TU = 0,
> +    VTA_TA,
> +} RISCVVta;
> +
> +typedef enum {
> +    VMA_MU = 0,
> +    VMA_MA,
> +} RISCVVma;

Do these really need enumerators, or would 'bool' be sufficient?

> +static int32_t encode_vtypei(RISCVVta vta, RISCVVma vma,
> +                            unsigned vsew, RISCVVlmul vlmul)
> +{
> +    return (vma & 0x1) << 7 | (vta & 0x1) << 6 | (vsew & 0x7) << 3 |
> +           (vlmul & 0x7);
> +}

s/vtypei/vtype/g?  vtype is only immediate in specific contexts, and you'll match the 
manual better if you talk about vtype the CSR rather than the vset*vli argument.

Assert values in range rather than masking.

Use MemOp vsew, since you're using MO_64, etc.

> @@ -498,12 +551,62 @@ static void tcg_out_opc_reg_vec_i(TCGContext *s, RISCVInsn opc,
>   #define tcg_out_opc_vi(s, opc, vd, vs2, imm, vm) \
>       tcg_out_opc_reg_vec_i(s, opc, vd, imm, vs2, vm);
>   
> +#define tcg_out_opc_vconfig(s, opc, rd, avl, vtypei) \
> +    tcg_out_opc_vec_config(s, opc, rd, avl, vtypei);

Why the extra define?

> +
> +/*
> + * TODO: If the vtype value is not supported by the implementation,
> + * then the vill bit is set in vtype, the remaining bits in
> + * vtype are set to zero, and the vl register is also set to zero
> + */

Why is this a TODO?
Are you suggesting that we might need to probe *all* of the cases at startup?

> +static __thread int prev_vtypei;

I think we should put this into TCGContext.
We don't currently have any host-specific values there, but there's no reason we can't 
have any.

> +#define get_vlmax(vsew) (riscv_vlen / (8 << vsew) * (LMUL_MAX))

Given that we know that LMUL_MAX is 8, doesn't this cancel out?


> +#define get_vec_type_bytes(type)    (type >= TCG_TYPE_V64 ? \
> +                                    (8 << (type - TCG_TYPE_V64)) : 0)

Again, assert not produce nonsense results.  And this doesn't need hiding in a macro.

> +#define calc_vlmul(oprsz)    (ctzl(oprsz / riscv_vlenb))

I think it's clearer to do this inline, where we can see that oprsz > vlenb.

> +
> +static void tcg_target_set_vec_config(TCGContext *s, TCGType type,
> +                                      unsigned vece)
> +{
> +    unsigned vsew, oprsz, avl;
> +    int vtypei;
> +    RISCVVlmul vlmul;
> +
> +    vsew = vece;

You can just name the argument vsew...

> +    oprsz = get_vec_type_bytes(type);
> +    avl = oprsz / (1 << vece);
> +    vlmul = oprsz > riscv_vlenb ?
> +                      calc_vlmul(oprsz) : VLMUL_M1;

I guess it is always the case that full register operations are preferred over fractional?

> +    vtypei = encode_vtypei(VTA_TA, VMA_MA, vsew, vlmul);
> +
> +    tcg_debug_assert(avl <= get_vlmax(vsew));
> +    tcg_debug_assert(vlmul <= VLMUL_RESERVED);
> +    tcg_debug_assert(vsew <= MO_64);

These asserts should be moved higher, above their first uses.


r~

next prev parent reply	other threads:[~2024-09-02  1:08 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-30  6:15 [PATCH v2 00/14] tcg/riscv: Add support for vector LIU Zhiwei
2024-08-30  6:15 ` [PATCH v2 01/14] tcg/op-gvec: Fix iteration step in 32-bit operation LIU Zhiwei
2024-08-31 23:59   ` Richard Henderson
2024-08-30  6:15 ` [PATCH v2 02/14] util: Add RISC-V vector extension probe in cpuinfo LIU Zhiwei
2024-09-02  0:12   ` Richard Henderson
2024-08-30  6:15 ` [PATCH v2 03/14] tcg/riscv: Add basic support for vector LIU Zhiwei
2024-09-02  0:28   ` Richard Henderson
2024-08-30  6:15 ` [PATCH v2 04/14] tcg/riscv: Add riscv vset{i}vli support LIU Zhiwei
2024-09-02  1:06   ` Richard Henderson [this message]
2024-08-30  6:15 ` [PATCH v2 05/14] tcg/riscv: Implement vector load/store LIU Zhiwei
2024-09-02  1:31   ` Richard Henderson
2024-08-30  6:15 ` [PATCH v2 06/14] tcg/riscv: Implement vector mov/dup{m/i} LIU Zhiwei
2024-09-02  1:36   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 07/14] tcg/riscv: Add support for basic vector opcodes LIU Zhiwei
2024-09-02  1:39   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 08/14] tcg/riscv: Implement vector cmp ops LIU Zhiwei
2024-09-03  6:45   ` Richard Henderson
2024-09-03 14:51     ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 09/14] tcg/riscv: Implement vector neg ops LIU Zhiwei
2024-09-03 14:52   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 10/14] tcg/riscv: Implement vector sat/mul ops LIU Zhiwei
2024-09-03 14:52   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 11/14] tcg/riscv: Implement vector min/max ops LIU Zhiwei
2024-09-03 14:53   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 12/14] tcg/riscv: Implement vector shs/v ops LIU Zhiwei
2024-09-03 14:54   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 13/14] tcg/riscv: Implement vector roti/v/x shi ops LIU Zhiwei
2024-09-03 15:15   ` Richard Henderson
2024-09-04 15:25     ` LIU Zhiwei
2024-09-04 19:05       ` Richard Henderson
2024-09-05  1:40         ` LIU Zhiwei
2024-08-30  6:16 ` [PATCH v2 14/14] tcg/riscv: Enable native vector support for TCG host LIU Zhiwei
2024-09-03 15:02   ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5da4214a-0ee8-4b37-9d95-b92c511d7386@linaro.org \
    --to=richard.henderson@linaro.org \
    --cc=alistair.francis@wdc.com \
    --cc=bmeng.cn@gmail.com \
    --cc=dbarboza@ventanamicro.com \
    --cc=liwei1518@gmail.com \
    --cc=palmer@dabbelt.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-riscv@nongnu.org \
    --cc=tangtiancheng.ttc@alibaba-inc.com \
    --cc=zhiwei_liu@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).