From: Richard Henderson <richard.henderson@linaro.org>
To: LIU Zhiwei <zhiwei_liu@c-sky.com>,
alistair23@gmail.com, chihmin.chao@sifive.com,
palmer@dabbelt.com
Cc: wenmeng_zhang@c-sky.com, qemu-riscv@nongnu.org,
qemu-devel@nongnu.org, wxy194768@alibaba-inc.com
Subject: Re: [PATCH v3 1/5] target/riscv: add vector unit stride load and store instructions
Date: Tue, 11 Feb 2020 22:38:14 -0800
Message-ID: <9054a6fb-adee-4dcc-d7c6-9a974a83668a@linaro.org>
In-Reply-To: <20200210074256.11412-2-zhiwei_liu@c-sky.com>
On 2/9/20 11:42 PM, LIU Zhiwei wrote:
> +/*
> + * As simd_desc supports at most 256 bytes, and in this implementation,
> + * the max vector group length is 2048 bytes. So split it into two parts.
> + *
> + * The first part is floor(maxsz, 64), encoded in maxsz of simd_desc.
> + * The second part is (maxsz % 64) >> 3, encoded in data of simd_desc.
> + */
> +static uint32_t maxsz_part1(uint32_t maxsz)
> +{
> + return ((maxsz & ~(0x3f)) >> 3) + 0x8; /* add offset 8 to avoid return 0 */
> +}
> +
> +static uint32_t maxsz_part2(uint32_t maxsz)
> +{
> + return (maxsz & 0x3f) >> 3;
> +}

I would much rather adjust simd_desc to support 2048 bytes.

I've just posted a patch set that removes an assert in target/arm that would
trigger if SIMD_DATA_SHIFT was increased to make room for a larger oprsz.

Or, since we're not going through tcg_gen_gvec_* for ldst, don't bother with
simd_desc at all, and just pass vlen, unencoded.

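For illustration only, a minimal sketch of a single descriptor word wide
enough for a 2048-byte vector group, so no part1/part2 split is needed.
The vdesc_* names and the bit layout are hypothetical; this is not the
simd_desc encoding, just one way a wider size field could look.

```c
#include <stdint.h>

/*
 * Hypothetical layout (not QEMU's simd_desc):
 *   bits [7:0]  hold (maxsz / 8) - 1, covering 8..2048 bytes
 *   bits [31:8] hold free-form per-instruction data
 */
#define VDESC_MAXSZ_BITS 8
#define VDESC_MAXSZ_MASK ((1u << VDESC_MAXSZ_BITS) - 1)

/* maxsz must be a multiple of 8 in [8, 2048]; data must fit in 24 bits. */
static inline uint32_t vdesc_encode(uint32_t maxsz, uint32_t data)
{
    return ((maxsz / 8 - 1) & VDESC_MAXSZ_MASK) | (data << VDESC_MAXSZ_BITS);
}

/* Recover the vector group size in bytes. */
static inline uint32_t vdesc_maxsz(uint32_t desc)
{
    return ((desc & VDESC_MAXSZ_MASK) + 1) * 8;
}

/* Recover the per-instruction data field. */
static inline uint32_t vdesc_data(uint32_t desc)
{
    return desc >> VDESC_MAXSZ_BITS;
}
```

Storing (maxsz / 8) - 1 rather than maxsz / 8 is what lets 2048 fit in
eight bits without a magic offset.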
> +/* define check conditions data structure */
> +struct vext_check_ctx {
> +
> + struct vext_reg {
> + uint8_t reg;
> + bool widen;
> + bool need_check;
> + } check_reg[6];
> +
> + struct vext_overlap_mask {
> + uint8_t reg;
> + uint8_t vm;
> + bool need_check;
> + } check_overlap_mask;
> +
> + struct vext_nf {
> + uint8_t nf;
> + bool need_check;
> + } check_nf;
> + target_ulong check_misa;
> +
> +} vchkctx;

You cannot use a global variable.  The data must be thread-safe.

If we're going to do the checks this way, with a structure, it needs to be on
the stack or within DisasContext.

> +#define GEN_VEXT_LD_US_TRANS(NAME, DO_OP, SEQ) \
> +static bool trans_##NAME(DisasContext *s, arg_r2nfvm* a) \
> +{ \
> + vchkctx.check_misa = RVV; \
> + vchkctx.check_overlap_mask.need_check = true; \
> + vchkctx.check_overlap_mask.reg = a->rd; \
> + vchkctx.check_overlap_mask.vm = a->vm; \
> + vchkctx.check_reg[0].need_check = true; \
> + vchkctx.check_reg[0].reg = a->rd; \
> + vchkctx.check_reg[0].widen = false; \
> + vchkctx.check_nf.need_check = true; \
> + vchkctx.check_nf.nf = a->nf; \
> + \
> + if (!vext_check(s)) { \
> + return false; \
> + } \
> + return DO_OP(s, a, SEQ); \
> +}

I don't see the improvement from a pointer.  Something like

    if (vext_check_isa_ill(s) &&
        vext_check_overlap(s, a->rd, a->rm) &&
        vext_check_reg(s, a->rd, false) &&
        vext_check_nf(s, a->nf)) {
        return DO_OP(s, a, SEQ);
    }
    return false;

seems just as clear without the extra data.

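A self-contained sketch of that shape, to show that every check can take
its inputs as plain arguments with no shared state.  DemoCtx and the
demo_* functions are hypothetical stand-ins for DisasContext and the
vext_check_* functions, and the rules inside them are placeholders, not
the actual constraints of the vector spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the DisasContext fields the checks read. */
typedef struct {
    bool has_rvv;   /* vector extension enabled */
    bool vill;      /* current vtype is illegal */
    uint8_t lmul;   /* register group size: 1, 2, 4 or 8 */
} DemoCtx;

static bool demo_check_isa_ill(const DemoCtx *s)
{
    return s->has_rvv && !s->vill;
}

/* Placeholder rule: a masked op with lmul > 1 may not write v0. */
static bool demo_check_overlap_mask(const DemoCtx *s, uint8_t vd, bool vm)
{
    return vm || s->lmul == 1 || vd != 0;
}

/* Placeholder rule: register number must be aligned to the group size. */
static bool demo_check_reg(const DemoCtx *s, uint8_t reg)
{
    return (reg % s->lmul) == 0;
}

/* Placeholder rule: nf fields of lmul registers each must fit in 8. */
static bool demo_check_nf(const DemoCtx *s, uint8_t nf)
{
    return s->lmul * nf <= 8;
}

/* All state arrives through arguments, so this is safe to run
 * concurrently on different translation threads. */
static bool demo_trans_load(const DemoCtx *s, uint8_t rd, bool vm, uint8_t nf)
{
    if (demo_check_isa_ill(s) &&
        demo_check_overlap_mask(s, rd, vm) &&
        demo_check_reg(s, rd) &&
        demo_check_nf(s, nf)) {
        return true;    /* a real trans function would emit the op here */
    }
    return false;
}
```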
> +#ifdef CONFIG_USER_ONLY
> +#define MO_SB 0
> +#define MO_LESW 0
> +#define MO_LESL 0
> +#define MO_LEQ 0
> +#define MO_UB 0
> +#define MO_LEUW 0
> +#define MO_LEUL 0
> +#endif
What is this for? We already define these unconditionally.
> +static inline int vext_elem_mask(void *v0, int mlen, int index)
> +{
> + int idx = (index * mlen) / 8;
> + int pos = (index * mlen) % 8;
> +
> + return (*((uint8_t *)v0 + idx) >> pos) & 0x1;
> +}

This is a little-endian indexing of the mask.  Just above we talk about using a
host-endian ordering of uint64_t.

Thus this must be based on uint64_t instead of uint8_t.

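A minimal sketch of the uint64_t-based variant, assuming the mask
register is stored as an array of host-endian uint64_t words and that
mlen divides 64.  The name demo_elem_mask is made up for the example.

```c
#include <stdint.h>

/*
 * Read mask bit for element 'index' from a register stored as
 * host-endian uint64_t words.  Shifting within a uint64_t gives the
 * same result on big- and little-endian hosts, which byte-wise
 * indexing does not.
 */
static inline int demo_elem_mask(const void *v0, int mlen, int index)
{
    const uint64_t *words = v0;
    int bit = index * mlen;     /* absolute bit position of the mask bit */

    return (words[bit / 64] >> (bit % 64)) & 1;
}
```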
> +/*
> + * This function checks watchpoint before really load operation.
> + *
> + * In softmmu mode, the TLB API probe_access is enough for watchpoint check.
> + * In user mode, there is no watchpoint support now.
> + *
> + * It will triggle an exception if there is no mapping in TLB
> + * and page table walk can't fill the TLB entry. Then the guest
> + * software can return here after process the exception or never return.
> + */
> +static void probe_read_access(CPURISCVState *env, target_ulong addr,
> + target_ulong len, uintptr_t ra)
> +{
> + while (len) {
> + const target_ulong pagelen = -(addr | TARGET_PAGE_MASK);
> + const target_ulong curlen = MIN(pagelen, len);
> +
> + probe_read(env, addr, curlen, cpu_mmu_index(env, false), ra);
The return value here is non-null when we can read directly from host memory.
It would be a shame to throw that work away.
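Roughly the pattern I have in mind, as a self-contained sketch with
mock functions: keep the pointer from the probe and copy straight from
host memory when it is non-null, falling back to per-element accesses
otherwise.  Everything named demo_* is hypothetical, including the fake
RAM array, and the range is assumed not to cross a page.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fake guest RAM for the sketch; a real target has no such array. */
static uint8_t demo_ram[16] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
};

typedef void *demo_probe_fn(uint64_t addr, size_t len);
typedef uint8_t demo_slow_ld_fn(uint64_t addr);

/* Mock probe that succeeds: returns a host pointer for addr. */
static void *demo_probe_ok(uint64_t addr, size_t len)
{
    (void)len;
    return &demo_ram[addr];
}

/* Mock probe for memory that cannot be accessed directly. */
static void *demo_probe_fail(uint64_t addr, size_t len)
{
    (void)addr;
    (void)len;
    return NULL;
}

/* Mock slow path: one access per byte. */
static uint8_t demo_slow_ld(uint64_t addr)
{
    return demo_ram[addr];
}

/* Load len bytes, reusing the host pointer from the probe if any. */
static void demo_load_bytes(uint8_t *dst, uint64_t addr, size_t len,
                            demo_probe_fn *probe, demo_slow_ld_fn *slow_ld)
{
    void *host = probe(addr, len);

    if (host) {
        memcpy(dst, host, len);             /* fast path */
    } else {
        for (size_t i = 0; i < len; i++) {  /* slow path */
            dst[i] = slow_ld(addr + i);
        }
    }
}
```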
> +/* data structure and common functions for load and store */
> +typedef void vext_ld_elem_fn(CPURISCVState *env, target_ulong addr,
> + uint32_t idx, void *vd, uintptr_t retaddr);
> +typedef void vext_st_elem_fn(CPURISCVState *env, target_ulong addr,
> + uint32_t idx, void *vd, uintptr_t retaddr);
> +typedef target_ulong vext_get_index_addr(target_ulong base,
> + uint32_t idx, void *vs2);
> +typedef void vext_ld_clear_elem(void *vd, uint32_t idx,
> + uint32_t cnt, uint32_t tot);
> +
> +struct vext_ldst_ctx {
> + struct vext_common_ctx vcc;
> + uint32_t nf;
> + target_ulong base;
> + target_ulong stride;
> + int mmuidx;
> +
> + vext_ld_elem_fn *ld_elem;
> + vext_st_elem_fn *st_elem;
> + vext_get_index_addr *get_index_addr;
> + vext_ld_clear_elem *clear_elem;
> +};

I think you should pass these elements directly, as needed, rather than putting
them all in a struct.

This would allow the main helper function to be inlined, which in turn allows
the mini helper functions to be inlined.

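A sketch of that structure, with hypothetical demo_* names and a plain
byte buffer standing in for guest memory.  The always_inline attribute
is the GCC/Clang spelling and is an assumption of the example.

```c
#include <stdint.h>

typedef void demo_ld_elem_fn(const uint8_t *mem, uint32_t idx, void *vd);

/* Mini helper: load one byte element. */
static inline void demo_ld_b(const uint8_t *mem, uint32_t idx, void *vd)
{
    ((uint8_t *)vd)[idx] = mem[idx];
}

/* Mini helper: load one little-endian 16-bit element. */
static inline void demo_ld_h(const uint8_t *mem, uint32_t idx, void *vd)
{
    ((uint16_t *)vd)[idx] = mem[2 * idx] | (mem[2 * idx + 1] << 8);
}

/*
 * Main helper.  ld_elem is a parameter, not a struct member, so once
 * this body is inlined into a caller that passes a constant function,
 * the compiler can inline the mini helper as well.
 */
static inline __attribute__((always_inline))
void demo_ld_us(void *vd, const uint8_t *mem, uint32_t vl,
                demo_ld_elem_fn *ld_elem)
{
    for (uint32_t i = 0; i < vl; i++) {
        ld_elem(mem, i, vd);
    }
}

/* One out-of-line entry point per element size. */
void demo_helper_vlb(void *vd, const uint8_t *mem, uint32_t vl)
{
    demo_ld_us(vd, mem, vl, demo_ld_b);
}

void demo_helper_vlh(void *vd, const uint8_t *mem, uint32_t vl)
{
    demo_ld_us(vd, mem, vl, demo_ld_h);
}
```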
r~