Re: [PATCH v2 03/14] tcg/riscv: Add basic support for vector

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Richard Henderson <richard.henderson@linaro.org>
To: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>, qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, palmer@dabbelt.com,
	alistair.francis@wdc.com, dbarboza@ventanamicro.com,
	liwei1518@gmail.com, bmeng.cn@gmail.com,
	Swung0x48 <swung0x48@outlook.com>,
	TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Subject: Re: [PATCH v2 03/14] tcg/riscv: Add basic support for vector
Date: Mon, 2 Sep 2024 10:28:04 +1000	[thread overview]
Message-ID: <0b1df6e8-5fa3-469e-b1e0-1b0ac7d4dfd2@linaro.org> (raw)
In-Reply-To: <20240830061607.1940-4-zhiwei_liu@linux.alibaba.com>

On 8/30/24 16:15, LIU Zhiwei wrote:
> From: Swung0x48 <swung0x48@outlook.com>
> 
> The RISC-V vector instruction set utilizes the LMUL field to group
> multiple registers, enabling variable-length vector registers. This
> implementation uses only the first register number of each group while
> reserving the other register numbers within the group.
> 
> In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
> host runtime needs to adjust LMUL based on the type to use different
> register groups.
> 
> This presents challenges for TCG's register allocation. Currently, we
> avoid modifying the register allocation part of TCG and only expose the
> minimum number of vector registers.
> 
> For example, when the host vlen is 64 bits and type is TCG_TYPE_V256, with
> LMUL equal to 4, we use 4 vector registers as one register group. We can
> use a maximum of 8 register groups, but the V0 register number is reserved
> as a mask register, so we can effectively use at most 7 register groups.
> Moreover, when type is smaller than TCG_TYPE_V256, only 7 registers are
> forced to be used. This is because TCG cannot yet dynamically constrain
> registers with type; likewise, when the host vlen is 128 bits and
> TCG_TYPE_V256, we can use at most 15 registers.
> 
> There is not much pressure on vector register allocation in TCG now, so
> using 7 registers is feasible and will not have a major impact on code
> generation.
> 
> This patch:
> 1. Reserves vector register 0 for use as a mask register.
> 2. When using register groups, reserves the additional registers within
>     each group.
> 
> Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
> Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
> ---
>   tcg/riscv/tcg-target-con-str.h |   1 +
>   tcg/riscv/tcg-target.c.inc     | 131 +++++++++++++++++++++++++--------
>   tcg/riscv/tcg-target.h         |  78 +++++++++++---------
>   tcg/riscv/tcg-target.opc.h     |  12 +++
>   4 files changed, 157 insertions(+), 65 deletions(-)
>   create mode 100644 tcg/riscv/tcg-target.opc.h
> 
> diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
> index d5c419dff1..21c4a0a0e0 100644
> --- a/tcg/riscv/tcg-target-con-str.h
> +++ b/tcg/riscv/tcg-target-con-str.h
> @@ -9,6 +9,7 @@
>    * REGS(letter, register_mask)
>    */
>   REGS('r', ALL_GENERAL_REGS)
> +REGS('v', GET_VREG_SET(riscv_vlen))

Perhaps too complicated. Make this MAKE_64BIT_MASK(32, 32); everything else will be 
handled by tcg_target_available_regs[] and reserved_regs.

> @@ -127,6 +113,12 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
>   #define TCG_CT_CONST_J12  0x1000
>   
>   #define ALL_GENERAL_REGS   MAKE_64BIT_MASK(0, 32)
> +#define ALL_VECTOR_REGS    MAKE_64BIT_MASK(33, 31)
> +#define ALL_DVECTOR_REG_GROUPS 0x5555555400000000
> +#define ALL_QVECTOR_REG_GROUPS 0x1111111000000000

V0 is still a vector register, even if it is reserved.
I think you should include it in these masks.

> +#define GET_VREG_SET(vlen) (vlen == 64 ? ALL_QVECTOR_REG_GROUPS : \
> +                             (vlen == 128 ? ALL_DVECTOR_REG_GROUPS : \
> +                              ALL_VECTOR_REGS))

I think you will not need this macro.

> +/*
> + * RISC-V vector instruction emitters
> + */
> +
> +/* Vector registers uses the same 5 lower bits as GPR registers. */
> +static void tcg_out_opc_reg_vec(TCGContext *s, RISCVInsn opc,
> +                                TCGReg d, TCGReg s1, TCGReg s2, bool vm)
> +{
> +    tcg_out32(s, encode_r(opc, d, s1, s2) | (vm << 25));
> +}
> +
> +static void tcg_out_opc_reg_vec_i(TCGContext *s, RISCVInsn opc,
> +                                  TCGReg rd, TCGArg imm, TCGReg vs2, bool vm)
> +{
> +    tcg_out32(s, encode_r(opc, rd, (imm & 0x1f), vs2) | (vm << 25));

I think you want to create new encode_* functions, not abuse the integer ones.

> +}
> +
> +/* vm=0 (vm = false) means vector masking ENABLED. */
> +#define tcg_out_opc_vv(s, opc, vd, vs2, vs1, vm) \
> +    tcg_out_opc_reg_vec(s, opc, vd, vs1, vs2, vm);
> +
> +/*
> + * In RISC-V, vs2 is the first operand, while rs1/imm is the
> + * second operand.
> + */
> +#define tcg_out_opc_vx(s, opc, vd, vs2, rs1, vm) \
> +    tcg_out_opc_reg_vec(s, opc, vd, rs1, vs2, vm);
> +
> +#define tcg_out_opc_vi(s, opc, vd, vs2, imm, vm) \
> +    tcg_out_opc_reg_vec_i(s, opc, vd, imm, vs2, vm);
> +
> +/*
> + * Only unit-stride addressing implemented; may extend in future.
> + */
> +#define tcg_out_opc_ldst_vec(s, opc, vs3_vd, rs1, vm) \
> +    tcg_out_opc_reg_vec(s, opc, vs3_vd, rs1, 0, vm);

I don't understand the need for any of these #defines.
Why are we not simply creating functions of the correct name?

> @@ -2101,6 +2160,13 @@ static void tcg_target_init(TCGContext *s)
>       tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffff;
>       tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffff;
>   
> +    if (cpuinfo & CPUINFO_ZVE64X) {
> +        TCGRegSet vector_regs = GET_VREG_SET(riscv_vlen);

This ought to be the only usage of GET_VREG_SET, so I think you should inline that code 
with a switch on riscv_vlen/vlenb.


r~

next prev parent reply	other threads:[~2024-09-02  0:29 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-30  6:15 [PATCH v2 00/14] tcg/riscv: Add support for vector LIU Zhiwei
2024-08-30  6:15 ` [PATCH v2 01/14] tcg/op-gvec: Fix iteration step in 32-bit operation LIU Zhiwei
2024-08-31 23:59   ` Richard Henderson
2024-08-30  6:15 ` [PATCH v2 02/14] util: Add RISC-V vector extension probe in cpuinfo LIU Zhiwei
2024-09-02  0:12   ` Richard Henderson
2024-08-30  6:15 ` [PATCH v2 03/14] tcg/riscv: Add basic support for vector LIU Zhiwei
2024-09-02  0:28   ` Richard Henderson [this message]
2024-08-30  6:15 ` [PATCH v2 04/14] tcg/riscv: Add riscv vset{i}vli support LIU Zhiwei
2024-09-02  1:06   ` Richard Henderson
2024-08-30  6:15 ` [PATCH v2 05/14] tcg/riscv: Implement vector load/store LIU Zhiwei
2024-09-02  1:31   ` Richard Henderson
2024-08-30  6:15 ` [PATCH v2 06/14] tcg/riscv: Implement vector mov/dup{m/i} LIU Zhiwei
2024-09-02  1:36   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 07/14] tcg/riscv: Add support for basic vector opcodes LIU Zhiwei
2024-09-02  1:39   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 08/14] tcg/riscv: Implement vector cmp ops LIU Zhiwei
2024-09-03  6:45   ` Richard Henderson
2024-09-03 14:51     ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 09/14] tcg/riscv: Implement vector neg ops LIU Zhiwei
2024-09-03 14:52   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 10/14] tcg/riscv: Implement vector sat/mul ops LIU Zhiwei
2024-09-03 14:52   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 11/14] tcg/riscv: Implement vector min/max ops LIU Zhiwei
2024-09-03 14:53   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 12/14] tcg/riscv: Implement vector shs/v ops LIU Zhiwei
2024-09-03 14:54   ` Richard Henderson
2024-08-30  6:16 ` [PATCH v2 13/14] tcg/riscv: Implement vector roti/v/x shi ops LIU Zhiwei
2024-09-03 15:15   ` Richard Henderson
2024-09-04 15:25     ` LIU Zhiwei
2024-09-04 19:05       ` Richard Henderson
2024-09-05  1:40         ` LIU Zhiwei
2024-08-30  6:16 ` [PATCH v2 14/14] tcg/riscv: Enable native vector support for TCG host LIU Zhiwei
2024-09-03 15:02   ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0b1df6e8-5fa3-469e-b1e0-1b0ac7d4dfd2@linaro.org \
    --to=richard.henderson@linaro.org \
    --cc=alistair.francis@wdc.com \
    --cc=bmeng.cn@gmail.com \
    --cc=dbarboza@ventanamicro.com \
    --cc=liwei1518@gmail.com \
    --cc=palmer@dabbelt.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-riscv@nongnu.org \
    --cc=swung0x48@outlook.com \
    --cc=tangtiancheng.ttc@alibaba-inc.com \
    --cc=zhiwei_liu@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).