Re: [RFC v3 26/71] target/riscv: rvv-1.0: update vext_max_elems() for load/store insns

From: Frank Chang <frank.chang@sifive.com>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: "open list:RISC-V" <qemu-riscv@nongnu.org>,
	Sagar Karandikar <sagark@eecs.berkeley.edu>,
	Bastian Koppelmann <kbastian@mail.uni-paderborn.de>,
	"qemu-devel@nongnu.org Developers" <qemu-devel@nongnu.org>,
	Alistair Francis <Alistair.Francis@wdc.com>,
	Palmer Dabbelt <palmer@dabbelt.com>
Subject: Re: [RFC v3 26/71] target/riscv: rvv-1.0: update vext_max_elems() for load/store insns
Date: Fri, 14 Aug 2020 10:48:54 +0800
Message-ID: <CAE_xrPiJRRV3FYtfve6LMOF6LNEYGfhmi9CiabxqUBEew9igLg@mail.gmail.com>
In-Reply-To: <90f01984-54a4-2a56-c52f-d1f4332b39d4@linaro.org>

On Fri, Aug 7, 2020 at 8:04 AM Richard Henderson <richard.henderson@linaro.org> wrote:

> On 8/6/20 3:46 AM, frank.chang@sifive.com wrote:
> > +static inline uint32_t vext_max_elems(uint32_t desc, uint32_t esz, bool is_ldst)
> >  {
> > -    return simd_maxsz(desc) << vext_lmul(desc);
> > +    /*
> > +     * As simd_desc support at most 256 bytes, the max vlen is 256 bits.
> > +     * so vlen in bytes (vlenb) is encoded as maxsz.
> > +     */
> > +    uint32_t vlenb = simd_maxsz(desc);
> > +
> > +    if (is_ldst) {
> > +        /*
> > +         * Vector load/store instructions have the EEW encoded
> > +         * directly in the instructions. The maximum vector size is
> > +         * calculated with EMUL rather than LMUL.
> > +         */
> > +        uint32_t eew = ctzl(esz);
> > +        uint32_t sew = vext_sew(desc);
> > +        uint32_t lmul = vext_lmul(desc);
> > +        int32_t emul = eew - sew + lmul;
> > +        uint32_t emul_r = emul < 0 ? 0 : emul;
> > +        return 1 << (ctzl(vlenb) + emul_r - ctzl(esz));
>
> As I said before, the is_ldst instructions should put the EEW and EMUL values
> into the SEW and LMUL desc fields, so that this does not need to be
> special-cased at all.
>

I've added a vext_get_emul() helper function in trans_rvv.inc.c:

> static uint8_t vext_get_emul(DisasContext *s, uint8_t eew)
> {
>     int8_t lmul = sextract32(s->lmul, 0, 3);
>     int8_t emul = ctzl(eew) - (s->sew + 3) + lmul;  // may remove ctzl() if eew is already log2(eew)
>     return emul < 0 ? 0 : emul;
> }

and pass emul as the LMUL field in VDATA so that it can be
reused by vext_max_elems() in vector_helper.c:

> uint8_t emul = vext_get_emul(s, eew);
> data = FIELD_DP32(data, VDATA, LMUL, emul);

I also removed the code that passes the SEW field in VDATA, as I think SEW
might not be required in the updated vext_max_elems() (see below).
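
As a rough illustration of the arithmetic in vext_get_emul(), here is a
standalone sketch that does not depend on DisasContext. The function name
model_emul() and its parameters are hypothetical; all three arguments are
assumed to be log2 values already (log2 of EEW in bits, log2 of SEW in bits,
and the signed log2 of LMUL). The clamp to 0 simply mirrors the helper quoted
above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical standalone model of the EMUL calculation (not QEMU code).
 *   log2_eew_bits: log2 of EEW in bits (3 for 8-bit ... 6 for 64-bit)
 *   log2_sew_bits: log2 of SEW in bits (same range)
 *   log2_lmul:     log2 of LMUL, from -3 (1/8) to 3 (8)
 */
static int8_t model_emul(int log2_eew_bits, int log2_sew_bits, int log2_lmul)
{
    assert(log2_eew_bits >= 3 && log2_eew_bits <= 6);
    assert(log2_sew_bits >= 3 && log2_sew_bits <= 6);
    assert(log2_lmul >= -3 && log2_lmul <= 3);

    /* EMUL = (EEW / SEW) * LMUL, so in log2 form the terms add and subtract. */
    int emul = log2_eew_bits - log2_sew_bits + log2_lmul;

    /* Same clamp as the helper in this mail: negative results become 0. */
    return emul < 0 ? 0 : emul;
}
```

For example, EEW=32, SEW=8 and LMUL=1 give log2(EMUL) = 5 - 3 + 0 = 2,
i.e. EMUL=4.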


>
> > +        /* Return VLMAX */
> > +        return 1 << (ctzl(vlenb) + vext_lmul(desc) - ctzl(esz));
>
> This is overly complicated.
>
> (1) 1 << ctzl(vlenb) == vlenb.
> (2) I'm not sure why esz is not already a log2 number.
>

esz is passed in from, e.g., the GEN_VEXT_LD_STRIDE() macro:

> #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN)                        \
> void HELPER(NAME)(void *vd, void * v0, target_ulong base,               \
>                   target_ulong stride, CPURISCVState *env,              \
>                   uint32_t desc)                                        \
> {                                                                       \
>     uint32_t vm = vext_vm(desc);                                        \
>     vext_ldst_stride(vd, v0, base, stride, env, desc, vm, LOAD_FN,      \
>                      sizeof(ETYPE), GETPC(), MMU_DATA_LOAD);            \
> }
>
> GEN_VEXT_LD_STRIDE(vlse8_v,  int8_t,  lde_b)

It is computed as sizeof(ETYPE), so the possible values are 1, 2, 4 and 8.
vext_max_elems() is then called from, e.g., vext_ldst_stride():

> uint32_t max_elems = vext_max_elems(desc, esz);

I can add another parameter to the macro and pass a hard-coded log2(esz)
value if that is better than using ctzl().
Or is there a more elegant way to get log2(esz)?
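
To make the sizeof(ETYPE) to log2(esz) mapping concrete, here is a minimal
sketch in portable C. The helper name log2_esz() is hypothetical and is not
part of the patch; it only shows what ctzl() yields for the power-of-two
sizes 1, 2, 4 and 8.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: log2 of a power-of-two element size in bytes. */
static inline unsigned log2_esz(uint32_t esz)
{
    /* sizeof(ETYPE) is 1, 2, 4 or 8 here, so exactly one bit is set. */
    assert(esz != 0 && (esz & (esz - 1)) == 0);

    unsigned n = 0;
    while ((esz >>= 1) != 0) {
        n++;
    }
    return n;
}
```

With this, log2_esz(sizeof(int8_t)) is 0 and log2_esz(sizeof(int64_t)) is 3,
which are the values a hard-coded macro parameter would have to carry.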


>
> This ought to look more like
>
>   int scale = lmul - esz;
>   return (scale < 0
>           ? vlenb >> -scale
>           : vlenb << scale);
>
>
Thanks for the detailed pointers.
I've changed the code as you suggested:

> static inline uint32_t vext_max_elems(uint32_t desc, uint32_t esz)
> {
>     /*
>      * As simd_desc support at most 256 bytes, the max vlen is 256 bits.
>      * so vlen in bytes (vlenb) is encoded as maxsz.
>      */
>     uint32_t vlenb = simd_maxsz(desc);
>
>     /* Return VLMAX */
>     int scale = vext_lmul(desc) - ctzl(esz);  // may remove ctzl() if esz is already log2(esz)
>     return scale < 0 ? vlenb >> -scale : vlenb << scale;
> }
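
The scale-based calculation above can be checked outside QEMU with a small
model. This is only a sketch: model_max_elems() is a hypothetical name, vlenb
is passed in directly instead of being read from desc, and both log2_lmul
(signed, so fractional LMUL is negative) and log2_esz are assumed to be log2
values already.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical standalone model of the VLMAX calculation (not QEMU code).
 *   vlenb:     vector register length in bytes
 *   log2_lmul: log2 of LMUL, from -3 (1/8) to 3 (8)
 *   log2_esz:  log2 of the element size in bytes, 0..3
 */
static uint32_t model_max_elems(uint32_t vlenb, int log2_lmul, int log2_esz)
{
    assert(log2_lmul >= -3 && log2_lmul <= 3);
    assert(log2_esz >= 0 && log2_esz <= 3);

    /* VLMAX = vlenb * LMUL / esz; in log2 form this is a single shift. */
    int scale = log2_lmul - log2_esz;
    return scale < 0 ? vlenb >> -scale : vlenb << scale;
}
```

For vlenb = 16 (VLEN = 128 bits), LMUL = 1/2 and 4-byte elements give
scale = -1 - 2 = -3, so VLMAX = 16 >> 3 = 2, which matches
LMUL * VLEN / SEW = 64 / 32.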


>
> r~
>

Thanks for the review.
Frank Chang


