From: Richard Henderson <richard.henderson@linaro.org>
To: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>, qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, palmer@dabbelt.com,
alistair.francis@wdc.com, dbarboza@ventanamicro.com,
liwei1518@gmail.com, bmeng.cn@gmail.com,
TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Subject: Re: [PATCH v3 04/14] tcg/riscv: Add riscv vset{i}vli support
Date: Wed, 4 Sep 2024 23:03:11 -0700 [thread overview]
Message-ID: <efa2bcd4-3ed6-4943-8dee-f764ee5afe87@linaro.org> (raw)
In-Reply-To: <20240904142739.854-5-zhiwei_liu@linux.alibaba.com>
On 9/4/24 07:27, LIU Zhiwei wrote:
> From: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
>
> In RISC-V, vector operations require initial configuration using
> the vset{i}vl{i} instruction.
>
> This instruction:
> 1. Sets the vector length (vl) in bytes
> 2. Configures the vtype register, which includes:
> SEW (Single Element Width)
> LMUL (vector register group multiplier)
> Other vector operation parameters
>
> This configuration is crucial for defining subsequent vector
> operation behavior. To optimize performance, the configuration
> process is managed dynamically:
> 1. Reconfiguration using vset{i}vl{i} is necessary when SEW
> or vector register group width changes.
> 2. The vset instruction can be omitted when configuration
> remains unchanged.
>
> This optimization is only effective within a single TB.
> Each TB requires reconfiguration at its start, as the current
> state cannot be obtained from hardware.
>
> Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
> Signed-off-by: Weiwei Li <liwei1518@gmail.com>
> Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
> ---
> include/tcg/tcg.h | 3 +
> tcg/riscv/tcg-target.c.inc | 128 +++++++++++++++++++++++++++++++++++++
> 2 files changed, 131 insertions(+)
>
> diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
> index 21d5884741..267e6ff95c 100644
> --- a/include/tcg/tcg.h
> +++ b/include/tcg/tcg.h
> @@ -566,6 +566,9 @@ struct TCGContext {
>
> /* Exit to translator on overflow. */
> sigjmp_buf jmp_trans;
> +
> + /* For host-specific values. */
> + int riscv_host_vtype;
> };
(1) At minimum this needs #ifdef __riscv.
I planned to think of a cleaner way to do this,
but haven't gotten there yet.
I had also planned to place it higher in the structure, before
the large temp arrays, so that the structure offset would be smaller.
(2) I have determined through experimentation that vtype alone is insufficient.
While vtype + avl would be sufficient, it is inefficient.
Best to store the original inputs: TCGType and SEW, since that way
there's no effort required when querying the current SEW for use in
load/store/logicals.
The bug here appears as TCG swaps between TCGTypes for different
operations. E.g. if the vtype computed for (V64, E8) is the same
as the vtype computed for (V128, E8), with AVL differing, then we
will incorrectly omit the vsetvl instruction.
My test case was tcg/tests/aarch64-linux-user/sha1-vector
The naming of these functions is varied and inconsistent.
I suggest the following:
static void set_vtype(TCGContext *s, TCGType type, MemOp vsew)
{
unsigned vtype, insn, avl;
int lmul;
RISCVVlmul vlmul;
bool lmul_eq_avl;
s->riscv_cur_type = type;
s->riscv_cur_vsew = vsew;
/* Match riscv_lg2_vlenb to TCG_TYPE_V64. */
QEMU_BUILD_BUG_ON(TCG_TYPE_V64 != 3);
lmul = type - riscv_lg2_vlenb;
if (lmul < -3) {
/* Host VLEN >= 1024 bits. */
vlmul = VLMUL_M1;
lmul_eq_avl = false;
} else if (lmul < 3) {
/* 1/8 ... 1 ... 8 */
vlmul = lmul & 7;
lmul_eq_avl = true;
} else {
/* Guaranteed by Zve64x. */
g_assert_not_reached();
}
avl = tcg_type_size(type) >> vsew;
vtype = encode_vtype(true, true, vsew, vlmul);
if (avl < 32) {
insn = encode_i(OPC_VSETIVLI, TCG_REG_ZERO, avl, vtype);
} else if (lmul_eq_avl) {
/* rd != 0 and rs1 == 0 uses vlmax */
insn = encode_i(OPC_VSETVLI, TCG_REG_TMP0, TCG_REG_ZERO, vtype);
} else {
tcg_out_opc_imm(s, OPC_ADDI, TCG_REG_TMP0, TCG_REG_ZERO, avl);
insn = encode_i(OPC_VSETVLI, TCG_REG_ZERO, TCG_REG_TMP0, vtype);
}
tcg_out32(s, insn);
}
static MemOp set_vtype_len(TCGContext *s, TCGType type)
{
if (type != s->riscv_cur_type) {
set_type(s, type, MO_64);
}
return s->riscv_cur_vsew;
}
static void set_vtype_len_sew(TCGContext *s, TCGType type, MemOp vsew)
{
if (type != s->riscv_cur_type || vsew != s->riscv_cur_vsew) {
set_type(s, type, vsew);
}
}
(1) The storing of lg2(vlenb) means we can convert all of the division into subtraction.
(2) get_vec_type_bytes() already exists as tcg_type_size().
(3) Make use of the signed 3-bit encoding of vlmul.
(4) Make use of rd != 0, rs1 = 0 for the relatively common case of AVL = 32.
r~
next prev parent reply other threads:[~2024-09-05 6:03 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-04 14:27 [PATCH v3 00/14] Add support for vector LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 01/14] tcg/op-gvec: Fix iteration step in 32-bit operation LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 02/14] util: Add RISC-V vector extension probe in cpuinfo LIU Zhiwei
2024-09-05 3:34 ` Richard Henderson
2024-09-09 7:18 ` LIU Zhiwei
2024-09-09 15:45 ` Richard Henderson
2024-09-10 2:47 ` LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 03/14] tcg/riscv: Add basic support for vector LIU Zhiwei
2024-09-05 4:05 ` Richard Henderson
2024-09-10 2:49 ` LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 04/14] tcg/riscv: Add riscv vset{i}vli support LIU Zhiwei
2024-09-05 6:03 ` Richard Henderson [this message]
2024-09-10 2:46 ` LIU Zhiwei
2024-09-10 4:34 ` Richard Henderson
2024-09-10 7:03 ` LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 05/14] tcg/riscv: Implement vector load/store LIU Zhiwei
2024-09-05 6:39 ` Richard Henderson
2024-09-10 3:04 ` LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 06/14] tcg/riscv: Implement vector mov/dup{m/i} LIU Zhiwei
2024-09-05 6:56 ` Richard Henderson
2024-09-10 1:13 ` LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 07/14] tcg/riscv: Add support for basic vector opcodes LIU Zhiwei
2024-09-05 6:57 ` Richard Henderson
2024-09-04 14:27 ` [PATCH v3 08/14] tcg/riscv: Implement vector cmp ops LIU Zhiwei
2024-09-05 7:12 ` Richard Henderson
2024-09-10 1:17 ` LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 09/14] tcg/riscv: Implement vector neg ops LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 10/14] tcg/riscv: Implement vector sat/mul ops LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 11/14] tcg/riscv: Implement vector min/max ops LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 12/14] tcg/riscv: Implement vector shs/v ops LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 13/14] tcg/riscv: Implement vector roti/v/x shi ops LIU Zhiwei
2024-09-04 14:27 ` [PATCH v3 14/14] tcg/riscv: Enable native vector support for TCG host LIU Zhiwei
2024-09-05 23:46 ` [PATCH v3 00/14] Add support for vector Alistair Francis
2024-09-10 3:08 ` LIU Zhiwei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=efa2bcd4-3ed6-4943-8dee-f764ee5afe87@linaro.org \
--to=richard.henderson@linaro.org \
--cc=alistair.francis@wdc.com \
--cc=bmeng.cn@gmail.com \
--cc=dbarboza@ventanamicro.com \
--cc=liwei1518@gmail.com \
--cc=palmer@dabbelt.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-riscv@nongnu.org \
--cc=tangtiancheng.ttc@alibaba-inc.com \
--cc=zhiwei_liu@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).