From: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
To: Richard Henderson <richard.henderson@linaro.org>, qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, palmer@dabbelt.com,
alistair.francis@wdc.com, dbarboza@ventanamicro.com,
liwei1518@gmail.com, bmeng.cn@gmail.com,
TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Subject: Re: [PATCH v3 06/14] tcg/riscv: Implement vector mov/dup{m/i}
Date: Tue, 10 Sep 2024 09:13:52 +0800
Message-ID: <3d2f459a-8d73-48fa-9da1-c8aa7e5989f7@linux.alibaba.com>
In-Reply-To: <e8091bd4-95f3-4a7c-966e-521d75a33c79@linaro.org>
On 2024/9/5 14:56, Richard Henderson wrote:
> On 9/4/24 07:27, LIU Zhiwei wrote:
>> @@ -698,6 +704,21 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg)
>>       case TCG_TYPE_I64:
>>           tcg_out_opc_imm(s, OPC_ADDI, ret, arg, 0);
>>           break;
>> +    case TCG_TYPE_V64:
>> +    case TCG_TYPE_V128:
>> +    case TCG_TYPE_V256:
>> +        {
>> +            int nf = get_vec_type_bytes(type) / riscv_vlenb;
>> +
>> +            if (nf != 0) {
>> +                tcg_debug_assert(is_power_of_2(nf) && nf <= 8);
>> +                tcg_out_opc_vi(s, OPC_VMVNR_V, ret, arg, nf - 1, true);
>> +            } else {
>> +                riscv_set_vec_config_vl(s, type);
>> +                tcg_out_opc_vv(s, OPC_VMV_V_V, ret, TCG_REG_V0, arg, true);
>> +            }
>> +        }
>> +        break;
>
> Perhaps
>
>     int lmul = type - riscv_lg2_vlenb;
>     int nf = 1 << MIN(lmul, 0);
>     tcg_out_opc_vi(s, OPC_VMVNR_V, ret, arg, nf - 1);
>
> Is there a reason to prefer vmv.v.v over vmvnr.v?
I think it's a trade-off. On some CPUs, a vector instruction is split
internally into micro-ops, so the smaller the fractional LMUL, the fewer
micro-ops are executed.
That is the benefit of using vmv.v.v. But it also needs a vsetivli, which
some CPUs can fuse with the following instruction.
> Seems like we can always move one vector reg...
OK, I will do it this way.
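
For reference, the "always move whole registers" approach comes down to a
small piece of arithmetic. The following is only a minimal, self-contained
sketch, not the QEMU code: whole_reg_count() and vmvnr_imm() are
hypothetical helper names, and both arguments are assumed to be base-2
logarithms of sizes in bytes (the vector type and VLENB). It clamps
log2(LMUL) at zero, so a fractional LMUL still moves one whole register.

```c
#include <assert.h>

/*
 * Hypothetical helper (illustration only): number of whole vector
 * registers needed to hold a value of 2^lg2_type_bytes bytes when one
 * register holds 2^lg2_vlenb bytes.
 */
static int whole_reg_count(int lg2_type_bytes, int lg2_vlenb)
{
    /* log2(LMUL); negative when the type is smaller than one register. */
    int lmul = lg2_type_bytes - lg2_vlenb;
    /* Clamp at 0 so the shift count is never negative. */
    int nf = 1 << (lmul > 0 ? lmul : 0);

    /* Whole-register moves exist only for 1, 2, 4 or 8 registers. */
    assert(nf <= 8);
    return nf;
}

/* The immediate field of the whole-register move encodes nf - 1. */
static int vmvnr_imm(int lg2_type_bytes, int lg2_vlenb)
{
    return whole_reg_count(lg2_type_bytes, lg2_vlenb) - 1;
}
```

With a 128-bit VLEN (lg2_vlenb = 4), a 64-bit or 128-bit vector type maps to
one register and a 256-bit type maps to two, so no vsetivli is needed for
the move in any of these cases.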
>
>> +static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece,
>> +                             TCGReg dst, int64_t arg)
>> +{
>> +    if (arg < 16 && arg >= -16) {
>> +        riscv_set_vec_config_vl_vece(s, type, vece);
>> +        tcg_out_opc_vi(s, OPC_VMV_V_I, dst, TCG_REG_V0, arg, true);
>> +        return;
>> +    }
>> +    tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP0, arg);
>> +    tcg_out_dup_vec(s, type, vece, dst, TCG_REG_TMP0);
>> +}
>
> I'll note that 0 and -1 do not require SEW change. I don't know how
> often that will come up
In our test with OpenCV, 0 and -1 account for 99.7% of the cases, so we
will optimize this in the next version.
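
The reason 0 and -1 need no SEW change can be checked with plain integer
arithmetic. The sketch below is an illustration under stated assumptions,
not part of the patch: fits_simm5(), splat64() and
splat_is_sew_independent() are hypothetical names, and vece is assumed to
be log2 of the element size in bytes (0 for 8-bit up to 3 for 64-bit).
It replicates a constant across 64 bits at each element width and compares
the resulting bit patterns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v fits a 5-bit signed immediate, i.e. -16..15. */
static bool fits_simm5(int64_t v)
{
    return v >= -16 && v <= 15;
}

/* Replicate the low (8 << vece) bits of v across a 64-bit lane. */
static uint64_t splat64(int64_t v, unsigned vece)
{
    unsigned bits = 8u << vece;
    uint64_t mask, elem, out = 0;

    assert(vece <= 3);
    mask = bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
    elem = (uint64_t)v & mask;
    for (unsigned i = 0; i < 64; i += bits) {
        out |= elem << i;
    }
    return out;
}

/* True if the splat of v has the same bits at every element width. */
static bool splat_is_sew_independent(int64_t v)
{
    uint64_t ref = splat64(v, 0);

    for (unsigned vece = 1; vece <= 3; vece++) {
        if (splat64(v, vece) != ref) {
            return false;
        }
    }
    return true;
}
```

For 0 and -1 every element width yields the same all-zeros or all-ones
pattern, while a value such as 1 gives different patterns for 8-bit and
64-bit elements, so it still depends on the configured SEW.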
Thanks,
Zhiwei
> , since in my testing with aarch64, we usually needed to swap to
> TCG_TYPE_V256 anyway.
>
>
> r~