From: "Alex Bennée" <alex.bennee@linaro.org>
To: liuzhiwei <zhiwei_liu@c-sky.com>
Cc: qemu-devel@nongnu.org, qemu-riscv@nongnu.org,
aurelien@aurel32.net, peter.maydell@linaro.org,
riku.voipio@iki.fi, laurent@vivier.eu, palmer@sifive.com,
Alistair.Francis@wdc.com, sagark@eecs.berkeley.edu,
kbastian@mail.uni-paderborn.de
Subject: Re: [Qemu-riscv] [PATCH] RISCV: support riscv vector extension 0.7.1
Date: Wed, 28 Aug 2019 10:08:47 +0100 [thread overview]
Message-ID: <87k1ax64io.fsf@linaro.org> (raw)
In-Reply-To: <1566959818-38369-1-git-send-email-zhiwei_liu@c-sky.com>
liuzhiwei <zhiwei_liu@c-sky.com> writes:
> Change-Id: I3cf891bc400713b95f47ecca82b1bf773f3dcb25
> Signed-off-by: liuzhiwei <zhiwei_liu@c-sky.com>
> ---
> fpu/softfloat.c | 119 +
> include/fpu/softfloat.h | 4 +
Changes to softfloat should be in a separate patch, but see bellow.
> linux-user/riscv/cpu_loop.c | 8 +-
> target/riscv/Makefile.objs | 2 +-
> target/riscv/cpu.h | 30 +
> target/riscv/cpu_bits.h | 15 +
> target/riscv/cpu_helper.c | 7 +
> target/riscv/csr.c | 65 +-
> target/riscv/helper.h | 354 +
> target/riscv/insn32.decode | 374 +-
> target/riscv/insn_trans/trans_rvv.inc.c | 484 +
> target/riscv/translate.c | 1 +
> target/riscv/vector_helper.c | 26563 ++++++++++++++++++++++++++++++
This is likely too big to be reviewed. Is it possible to split the patch
up into more discrete chunks, for example support pieces and then maybe
a class at a time?
> 13 files changed, 28017 insertions(+), 9 deletions(-)
> create mode 100644 target/riscv/insn_trans/trans_rvv.inc.c
> create mode 100644 target/riscv/vector_helper.c
>
> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
> index 2ba36ec..da155ea 100644
> --- a/fpu/softfloat.c
> +++ b/fpu/softfloat.c
> @@ -433,6 +433,16 @@ static inline int extractFloat16Exp(float16 a)
> }
>
> /*----------------------------------------------------------------------------
> +| Returns the sign bit of the half-precision floating-point value `a'.
> +*----------------------------------------------------------------------------*/
> +
> +static inline flag extractFloat16Sign(float16 a)
> +{
> + return float16_val(a) >> 0xf;
> +}
> +
We are trying to avoid this sort of bit fiddling for new code when we
already have generic decompose functions that can extract all the parts
into a common format.
> +
> +/*----------------------------------------------------------------------------
> | Returns the fraction bits of the single-precision floating-point value `a'.
> *----------------------------------------------------------------------------*/
>
> @@ -4790,6 +4800,35 @@ int float32_eq(float32 a, float32 b, float_status *status)
> }
>
> /*----------------------------------------------------------------------------
> +| Returns 1 if the half-precision floating-point value `a' is less than
> +| or equal to the corresponding value `b', and 0 otherwise. The invalid
> +| exception is raised if either operand is a NaN. The comparison is performed
> +| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
> +*----------------------------------------------------------------------------*/
> +
> +int float16_le(float16 a, float16 b, float_status *status)
> +{
> + flag aSign, bSign;
> + uint16_t av, bv;
> + a = float16_squash_input_denormal(a, status);
> + b = float16_squash_input_denormal(b, status);
> +
> + if ( ( ( extractFloat16Exp( a ) == 0x1F ) && extractFloat16Frac( a ) )
> + || ( ( extractFloat16Exp( b ) == 0x1F ) && extractFloat16Frac( b ) )
> + ) {
> + float_raise(float_flag_invalid, status);
> + return 0;
> + }
> + aSign = extractFloat16Sign( a );
> + bSign = extractFloat16Sign( b );
> + av = float16_val(a);
> + bv = float16_val(b);
> + if ( aSign != bSign ) return aSign || ( (uint16_t) ( ( av | bv )<<1 ) == 0 );
> + return ( av == bv ) || ( aSign ^ ( av < bv ) );
> +
> +}
What does this provide that:
float16_compare(a, b, status) == float_relation_less;
doesn't?
> +
> +/*----------------------------------------------------------------------------
> | Returns 1 if the single-precision floating-point value `a' is less than
> | or equal to the corresponding value `b', and 0 otherwise. The invalid
> | exception is raised if either operand is a NaN. The comparison is performed
> @@ -4825,6 +4864,35 @@ int float32_le(float32 a, float32 b, float_status *status)
> | to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
> *----------------------------------------------------------------------------*/
>
> +int float16_lt(float16 a, float16 b, float_status *status)
> +{
> + flag aSign, bSign;
> + uint16_t av, bv;
> + a = float16_squash_input_denormal(a, status);
> + b = float16_squash_input_denormal(b, status);
> +
> + if ( ( ( extractFloat16Exp( a ) == 0x1F ) && extractFloat16Frac( a ) )
> + || ( ( extractFloat16Exp( b ) == 0x1F ) && extractFloat16Frac( b ) )
> + ) {
> + float_raise(float_flag_invalid, status);
> + return 0;
> + }
> + aSign = extractFloat16Sign( a );
> + bSign = extractFloat16Sign( b );
> + av = float16_val(a);
> + bv = float16_val(b);
> + if ( aSign != bSign ) return aSign && ( (uint16_t) ( ( av | bv )<<1 ) != 0 );
> + return ( av != bv ) && ( aSign ^ ( av < bv ) );
> +
> +}
> +
> +/*----------------------------------------------------------------------------
> +| Returns 1 if the single-precision floating-point value `a' is less than
> +| the corresponding value `b', and 0 otherwise. The invalid exception is
> +| raised if either operand is a NaN. The comparison is performed according
> +| to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
> +*----------------------------------------------------------------------------*/
> +
> int float32_lt(float32 a, float32 b, float_status *status)
> {
> flag aSign, bSign;
> @@ -4869,6 +4937,32 @@ int float32_unordered(float32 a, float32 b, float_status *status)
> }
>
> /*----------------------------------------------------------------------------
> +| Returns 1 if the half-precision floating-point value `a' is equal to
> +| the corresponding value `b', and 0 otherwise. Quiet NaNs do not cause an
> +| exception. The comparison is performed according to the IEC/IEEE Standard
> +| for Binary Floating-Point Arithmetic.
> +*----------------------------------------------------------------------------*/
> +
> +int float16_eq_quiet(float16 a, float16 b, float_status *status)
> +{
> + a = float16_squash_input_denormal(a, status);
> + b = float16_squash_input_denormal(b, status);
> +
> + if ( ( ( extractFloat16Exp( a ) == 0x1F ) && extractFloat16Frac( a ) )
> + || ( ( extractFloat16Exp( b ) == 0x1F ) && extractFloat16Frac( b ) )
> + ) {
> + if (float16_is_signaling_nan(a, status)
> + || float16_is_signaling_nan(b, status)) {
> + float_raise(float_flag_invalid, status);
> + }
> + return 0;
> + }
> + return ( float16_val(a) == float16_val(b) ) ||
> + ( (uint16_t) ( ( float16_val(a) | float16_val(b) )<<1 ) == 0 );
> +}
> +
See also float_16_compare_quiet
> +
> +/*----------------------------------------------------------------------------
> | Returns 1 if the single-precision floating-point value `a' is equal to
> | the corresponding value `b', and 0 otherwise. Quiet NaNs do not cause an
> | exception. The comparison is performed according to the IEC/IEEE Standard
> @@ -4958,6 +5052,31 @@ int float32_lt_quiet(float32 a, float32 b, float_status *status)
> }
>
> /*----------------------------------------------------------------------------
> +| Returns 1 if the half-precision floating-point values `a' and `b' cannot
> +| be compared, and 0 otherwise. Quiet NaNs do not cause an exception. The
> +| comparison is performed according to the IEC/IEEE Standard for Binary
> +| Floating-Point Arithmetic.
> +*----------------------------------------------------------------------------*/
> +
> +int float16_unordered_quiet(float16 a, float16 b, float_status *status)
> +{
> + a = float16_squash_input_denormal(a, status);
> + b = float16_squash_input_denormal(b, status);
> +
> + if ( ( ( extractFloat16Exp( a ) == 0x1F ) && extractFloat16Frac( a ) )
> + || ( ( extractFloat16Exp( b ) == 0x1F ) && extractFloat16Frac( b ) )
> + ) {
> + if (float16_is_signaling_nan(a, status)
> + || float16_is_signaling_nan(b, status)) {
> + float_raise(float_flag_invalid, status);
> + }
> + return 1;
> + }
> + return 0;
> +}
> +
> +
> +/*----------------------------------------------------------------------------
> | Returns 1 if the single-precision floating-point values `a' and `b' cannot
> | be compared, and 0 otherwise. Quiet NaNs do not cause an exception. The
> | comparison is performed according to the IEC/IEEE Standard for Binary
> diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
> index 3ff3fa5..3b0754c 100644
> --- a/include/fpu/softfloat.h
> +++ b/include/fpu/softfloat.h
> @@ -293,6 +293,10 @@ float16 float16_maxnummag(float16, float16, float_status *status);
> float16 float16_sqrt(float16, float_status *status);
> int float16_compare(float16, float16, float_status *status);
> int float16_compare_quiet(float16, float16, float_status *status);
> +int float16_unordered_quiet(float16, float16, float_status *status);
> +int float16_le(float16, float16, float_status *status);
> +int float16_lt(float16, float16, float_status *status);
> +int float16_eq_quiet(float16, float16, float_status *status);
>
> int float16_is_quiet_nan(float16, float_status *status);
> int float16_is_signaling_nan(float16, float_status *status);
> diff --git a/linux-user/riscv/cpu_loop.c b/linux-user/riscv/cpu_loop.c
> index 12aa3c0..b01548a 100644
> --- a/linux-user/riscv/cpu_loop.c
> +++ b/linux-user/riscv/cpu_loop.c
> @@ -40,7 +40,13 @@ void cpu_loop(CPURISCVState *env)
> signum = 0;
> sigcode = 0;
> sigaddr = 0;
> -
> + if (env->foflag) {
> + if (env->vfp.vl != 0) {
> + env->foflag = false;
> + env->pc += 4;
> + continue;
> + }
> + }
What is this trying to do?
> switch (trapnr) {
> case EXCP_INTERRUPT:
> /* just indicate that signals should be handled asap */
> diff --git a/target/riscv/Makefile.objs b/target/riscv/Makefile.objs
> index b1c79bc..d577cef 100644
> --- a/target/riscv/Makefile.objs
> +++ b/target/riscv/Makefile.objs
> @@ -1,4 +1,4 @@
> -obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o gdbstub.o pmp.o
> +obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o vector_helper.o gdbstub.o pmp.o
>
> DECODETREE = $(SRC_PATH)/scripts/decodetree.py
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 0adb307..5a93aa2 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -67,6 +67,7 @@
> #define RVC RV('C')
> #define RVS RV('S')
> #define RVU RV('U')
> +#define RVV RV('V')
>
> /* S extension denotes that Supervisor mode exists, however it is possible
> to have a core that support S mode but does not have an MMU and there
> @@ -93,9 +94,38 @@ typedef struct CPURISCVState CPURISCVState;
>
> #include "pmp.h"
>
> +#define VLEN 128
> +#define VUNIT(x) (VLEN / x)
> +
If you want to do vectors I suggest you look at the TCGvec types for
passing pointers to vector registers to helpers. In this case you will
want to ensure your vector registers are properly aligned.
> struct CPURISCVState {
> target_ulong gpr[32];
> uint64_t fpr[32]; /* assume both F and D extensions */
> +
> + /* vector coprocessor state. */
> + struct {
> + union VECTOR {
> + float64 f64[VUNIT(64)];
> + float32 f32[VUNIT(32)];
> + float16 f16[VUNIT(16)];
> + target_ulong ul[VUNIT(sizeof(target_ulong))];
> + uint64_t u64[VUNIT(64)];
> + int64_t s64[VUNIT(64)];
> + uint32_t u32[VUNIT(32)];
> + int32_t s32[VUNIT(32)];
> + uint16_t u16[VUNIT(16)];
> + int16_t s16[VUNIT(16)];
> + uint8_t u8[VUNIT(8)];
> + int8_t s8[VUNIT(8)];
> + } vreg[32];
> + target_ulong vxrm;
> + target_ulong vxsat;
> + target_ulong vl;
> + target_ulong vstart;
> + target_ulong vtype;
> + float_status fp_status;
> + } vfp;
> +
> + bool foflag;
Again I have no idea what foflag is here.
> target_ulong pc;
> target_ulong load_res;
> target_ulong load_val;
> diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
> index 11f971a..9eb43ec 100644
> --- a/target/riscv/cpu_bits.h
> +++ b/target/riscv/cpu_bits.h
> @@ -29,6 +29,14 @@
> #define FSR_NXA (FPEXC_NX << FSR_AEXC_SHIFT)
> #define FSR_AEXC (FSR_NVA | FSR_OFA | FSR_UFA | FSR_DZA | FSR_NXA)
>
> +/* Vector Fixed-Point round model */
> +#define FSR_VXRM_SHIFT 9
> +#define FSR_VXRM (0x3 << FSR_VXRM_SHIFT)
> +
> +/* Vector Fixed-Point saturation flag */
> +#define FSR_VXSAT_SHIFT 8
> +#define FSR_VXSAT (0x1 << FSR_VXSAT_SHIFT)
> +
> /* Control and Status Registers */
>
> /* User Trap Setup */
> @@ -48,6 +56,13 @@
> #define CSR_FRM 0x002
> #define CSR_FCSR 0x003
>
> +/* User Vector CSRs */
> +#define CSR_VSTART 0x008
> +#define CSR_VXSAT 0x009
> +#define CSR_VXRM 0x00a
> +#define CSR_VL 0xc20
> +#define CSR_VTYPE 0xc21
> +
> /* User Timers and Counters */
> #define CSR_CYCLE 0xc00
> #define CSR_TIME 0xc01
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index e32b612..405caf6 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -521,6 +521,13 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> [PRV_H] = RISCV_EXCP_H_ECALL,
> [PRV_M] = RISCV_EXCP_M_ECALL
> };
> + if (env->foflag) {
> + if (env->vfp.vl != 0) {
> + env->foflag = false;
> + env->pc += 4;
> + return;
> + }
> + }
>
> if (!async) {
> /* set tval to badaddr for traps with address information */
> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> index e0d4586..a6131ff 100644
> --- a/target/riscv/csr.c
> +++ b/target/riscv/csr.c
> @@ -87,12 +87,12 @@ static int ctr(CPURISCVState *env, int csrno)
> return 0;
> }
>
> -#if !defined(CONFIG_USER_ONLY)
> static int any(CPURISCVState *env, int csrno)
> {
> return 0;
> }
>
> +#if !defined(CONFIG_USER_ONLY)
> static int smode(CPURISCVState *env, int csrno)
> {
> return -!riscv_has_ext(env, RVS);
> @@ -158,8 +158,10 @@ static int read_fcsr(CPURISCVState *env, int csrno, target_ulong *val)
> return -1;
> }
> #endif
> - *val = (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
> - | (env->frm << FSR_RD_SHIFT);
> + *val = (env->vfp.vxrm << FSR_VXRM_SHIFT)
> + | (env->vfp.vxsat << FSR_VXSAT_SHIFT)
> + | (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
> + | (env->frm << FSR_RD_SHIFT);
> return 0;
> }
>
> @@ -172,10 +174,60 @@ static int write_fcsr(CPURISCVState *env, int csrno, target_ulong val)
> env->mstatus |= MSTATUS_FS;
> #endif
> env->frm = (val & FSR_RD) >> FSR_RD_SHIFT;
> + env->vfp.vxrm = (val & FSR_VXRM) >> FSR_VXRM_SHIFT;
> + env->vfp.vxsat = (val & FSR_VXSAT) >> FSR_VXSAT_SHIFT;
> riscv_cpu_set_fflags(env, (val & FSR_AEXC) >> FSR_AEXC_SHIFT);
> return 0;
> }
>
> +static int read_vtype(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vtype;
> + return 0;
> +}
> +
> +static int read_vl(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vl;
> + return 0;
> +}
> +
> +static int read_vxrm(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vxrm;
> + return 0;
> +}
> +
> +static int read_vxsat(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vxsat;
> + return 0;
> +}
> +
> +static int read_vstart(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vstart;
> + return 0;
> +}
> +
> +static int write_vxrm(CPURISCVState *env, int csrno, target_ulong val)
> +{
> + env->vfp.vxrm = val;
> + return 0;
> +}
> +
> +static int write_vxsat(CPURISCVState *env, int csrno, target_ulong val)
> +{
> + env->vfp.vxsat = val;
> + return 0;
> +}
> +
> +static int write_vstart(CPURISCVState *env, int csrno, target_ulong val)
> +{
> + env->vfp.vstart = val;
> + return 0;
> +}
A fixed return value makes me think these should be void functions.
> +
> /* User Timers and Counters */
> static int read_instret(CPURISCVState *env, int csrno, target_ulong *val)
> {
> @@ -873,7 +925,12 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
> [CSR_FFLAGS] = { fs, read_fflags, write_fflags },
> [CSR_FRM] = { fs, read_frm, write_frm },
> [CSR_FCSR] = { fs, read_fcsr, write_fcsr },
> -
> + /* Vector CSRs */
> + [CSR_VSTART] = { any, read_vstart, write_vstart },
> + [CSR_VXSAT] = { any, read_vxsat, write_vxsat },
> + [CSR_VXRM] = { any, read_vxrm, write_vxrm },
> + [CSR_VL] = { any, read_vl },
> + [CSR_VTYPE] = { any, read_vtype },
> /* User Timers and Counters */
> [CSR_CYCLE] = { ctr, read_instret },
> [CSR_INSTRET] = { ctr, read_instret },
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index debb22a..fee02c0 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -76,3 +76,357 @@ DEF_HELPER_2(mret, tl, env, tl)
> DEF_HELPER_1(wfi, void, env)
> DEF_HELPER_1(tlb_flush, void, env)
> #endif
> +/* Vector functions */
Think about how you could split this patch up to introduce a group of
instructions at a time. This will make it a lot easier review.
I'm going to leave review of the specifics to the RISCV maintainers but
I suspect they will want to wait until a v2 of the series. However it
looks like a good first pass at implementing vectors.
--
Alex Bennée
WARNING: multiple messages have this Message-ID (diff)
From: "Alex Bennée" <alex.bennee@linaro.org>
To: liuzhiwei <zhiwei_liu@c-sky.com>
Cc: peter.maydell@linaro.org, palmer@sifive.com,
qemu-riscv@nongnu.org, sagark@eecs.berkeley.edu,
kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi,
qemu-devel@nongnu.org, laurent@vivier.eu,
Alistair.Francis@wdc.com, aurelien@aurel32.net
Subject: Re: [Qemu-devel] [PATCH] RISCV: support riscv vector extension 0.7.1
Date: Wed, 28 Aug 2019 10:08:47 +0100 [thread overview]
Message-ID: <87k1ax64io.fsf@linaro.org> (raw)
In-Reply-To: <1566959818-38369-1-git-send-email-zhiwei_liu@c-sky.com>
liuzhiwei <zhiwei_liu@c-sky.com> writes:
> Change-Id: I3cf891bc400713b95f47ecca82b1bf773f3dcb25
> Signed-off-by: liuzhiwei <zhiwei_liu@c-sky.com>
> ---
> fpu/softfloat.c | 119 +
> include/fpu/softfloat.h | 4 +
Changes to softfloat should be in a separate patch, but see bellow.
> linux-user/riscv/cpu_loop.c | 8 +-
> target/riscv/Makefile.objs | 2 +-
> target/riscv/cpu.h | 30 +
> target/riscv/cpu_bits.h | 15 +
> target/riscv/cpu_helper.c | 7 +
> target/riscv/csr.c | 65 +-
> target/riscv/helper.h | 354 +
> target/riscv/insn32.decode | 374 +-
> target/riscv/insn_trans/trans_rvv.inc.c | 484 +
> target/riscv/translate.c | 1 +
> target/riscv/vector_helper.c | 26563 ++++++++++++++++++++++++++++++
This is likely too big to be reviewed. Is it possible to split the patch
up into more discrete chunks, for example support pieces and then maybe
a class at a time?
> 13 files changed, 28017 insertions(+), 9 deletions(-)
> create mode 100644 target/riscv/insn_trans/trans_rvv.inc.c
> create mode 100644 target/riscv/vector_helper.c
>
> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
> index 2ba36ec..da155ea 100644
> --- a/fpu/softfloat.c
> +++ b/fpu/softfloat.c
> @@ -433,6 +433,16 @@ static inline int extractFloat16Exp(float16 a)
> }
>
> /*----------------------------------------------------------------------------
> +| Returns the sign bit of the half-precision floating-point value `a'.
> +*----------------------------------------------------------------------------*/
> +
> +static inline flag extractFloat16Sign(float16 a)
> +{
> + return float16_val(a) >> 0xf;
> +}
> +
We are trying to avoid this sort of bit fiddling for new code when we
already have generic decompose functions that can extract all the parts
into a common format.
> +
> +/*----------------------------------------------------------------------------
> | Returns the fraction bits of the single-precision floating-point value `a'.
> *----------------------------------------------------------------------------*/
>
> @@ -4790,6 +4800,35 @@ int float32_eq(float32 a, float32 b, float_status *status)
> }
>
> /*----------------------------------------------------------------------------
> +| Returns 1 if the half-precision floating-point value `a' is less than
> +| or equal to the corresponding value `b', and 0 otherwise. The invalid
> +| exception is raised if either operand is a NaN. The comparison is performed
> +| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
> +*----------------------------------------------------------------------------*/
> +
> +int float16_le(float16 a, float16 b, float_status *status)
> +{
> + flag aSign, bSign;
> + uint16_t av, bv;
> + a = float16_squash_input_denormal(a, status);
> + b = float16_squash_input_denormal(b, status);
> +
> + if ( ( ( extractFloat16Exp( a ) == 0x1F ) && extractFloat16Frac( a ) )
> + || ( ( extractFloat16Exp( b ) == 0x1F ) && extractFloat16Frac( b ) )
> + ) {
> + float_raise(float_flag_invalid, status);
> + return 0;
> + }
> + aSign = extractFloat16Sign( a );
> + bSign = extractFloat16Sign( b );
> + av = float16_val(a);
> + bv = float16_val(b);
> + if ( aSign != bSign ) return aSign || ( (uint16_t) ( ( av | bv )<<1 ) == 0 );
> + return ( av == bv ) || ( aSign ^ ( av < bv ) );
> +
> +}
What does this provide that:
float16_compare(a, b, status) == float_relation_less;
doesn't?
> +
> +/*----------------------------------------------------------------------------
> | Returns 1 if the single-precision floating-point value `a' is less than
> | or equal to the corresponding value `b', and 0 otherwise. The invalid
> | exception is raised if either operand is a NaN. The comparison is performed
> @@ -4825,6 +4864,35 @@ int float32_le(float32 a, float32 b, float_status *status)
> | to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
> *----------------------------------------------------------------------------*/
>
> +int float16_lt(float16 a, float16 b, float_status *status)
> +{
> + flag aSign, bSign;
> + uint16_t av, bv;
> + a = float16_squash_input_denormal(a, status);
> + b = float16_squash_input_denormal(b, status);
> +
> + if ( ( ( extractFloat16Exp( a ) == 0x1F ) && extractFloat16Frac( a ) )
> + || ( ( extractFloat16Exp( b ) == 0x1F ) && extractFloat16Frac( b ) )
> + ) {
> + float_raise(float_flag_invalid, status);
> + return 0;
> + }
> + aSign = extractFloat16Sign( a );
> + bSign = extractFloat16Sign( b );
> + av = float16_val(a);
> + bv = float16_val(b);
> + if ( aSign != bSign ) return aSign && ( (uint16_t) ( ( av | bv )<<1 ) != 0 );
> + return ( av != bv ) && ( aSign ^ ( av < bv ) );
> +
> +}
> +
> +/*----------------------------------------------------------------------------
> +| Returns 1 if the single-precision floating-point value `a' is less than
> +| the corresponding value `b', and 0 otherwise. The invalid exception is
> +| raised if either operand is a NaN. The comparison is performed according
> +| to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
> +*----------------------------------------------------------------------------*/
> +
> int float32_lt(float32 a, float32 b, float_status *status)
> {
> flag aSign, bSign;
> @@ -4869,6 +4937,32 @@ int float32_unordered(float32 a, float32 b, float_status *status)
> }
>
> /*----------------------------------------------------------------------------
> +| Returns 1 if the half-precision floating-point value `a' is equal to
> +| the corresponding value `b', and 0 otherwise. Quiet NaNs do not cause an
> +| exception. The comparison is performed according to the IEC/IEEE Standard
> +| for Binary Floating-Point Arithmetic.
> +*----------------------------------------------------------------------------*/
> +
> +int float16_eq_quiet(float16 a, float16 b, float_status *status)
> +{
> + a = float16_squash_input_denormal(a, status);
> + b = float16_squash_input_denormal(b, status);
> +
> + if ( ( ( extractFloat16Exp( a ) == 0x1F ) && extractFloat16Frac( a ) )
> + || ( ( extractFloat16Exp( b ) == 0x1F ) && extractFloat16Frac( b ) )
> + ) {
> + if (float16_is_signaling_nan(a, status)
> + || float16_is_signaling_nan(b, status)) {
> + float_raise(float_flag_invalid, status);
> + }
> + return 0;
> + }
> + return ( float16_val(a) == float16_val(b) ) ||
> + ( (uint16_t) ( ( float16_val(a) | float16_val(b) )<<1 ) == 0 );
> +}
> +
See also float_16_compare_quiet
> +
> +/*----------------------------------------------------------------------------
> | Returns 1 if the single-precision floating-point value `a' is equal to
> | the corresponding value `b', and 0 otherwise. Quiet NaNs do not cause an
> | exception. The comparison is performed according to the IEC/IEEE Standard
> @@ -4958,6 +5052,31 @@ int float32_lt_quiet(float32 a, float32 b, float_status *status)
> }
>
> /*----------------------------------------------------------------------------
> +| Returns 1 if the half-precision floating-point values `a' and `b' cannot
> +| be compared, and 0 otherwise. Quiet NaNs do not cause an exception. The
> +| comparison is performed according to the IEC/IEEE Standard for Binary
> +| Floating-Point Arithmetic.
> +*----------------------------------------------------------------------------*/
> +
> +int float16_unordered_quiet(float16 a, float16 b, float_status *status)
> +{
> + a = float16_squash_input_denormal(a, status);
> + b = float16_squash_input_denormal(b, status);
> +
> + if ( ( ( extractFloat16Exp( a ) == 0x1F ) && extractFloat16Frac( a ) )
> + || ( ( extractFloat16Exp( b ) == 0x1F ) && extractFloat16Frac( b ) )
> + ) {
> + if (float16_is_signaling_nan(a, status)
> + || float16_is_signaling_nan(b, status)) {
> + float_raise(float_flag_invalid, status);
> + }
> + return 1;
> + }
> + return 0;
> +}
> +
> +
> +/*----------------------------------------------------------------------------
> | Returns 1 if the single-precision floating-point values `a' and `b' cannot
> | be compared, and 0 otherwise. Quiet NaNs do not cause an exception. The
> | comparison is performed according to the IEC/IEEE Standard for Binary
> diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
> index 3ff3fa5..3b0754c 100644
> --- a/include/fpu/softfloat.h
> +++ b/include/fpu/softfloat.h
> @@ -293,6 +293,10 @@ float16 float16_maxnummag(float16, float16, float_status *status);
> float16 float16_sqrt(float16, float_status *status);
> int float16_compare(float16, float16, float_status *status);
> int float16_compare_quiet(float16, float16, float_status *status);
> +int float16_unordered_quiet(float16, float16, float_status *status);
> +int float16_le(float16, float16, float_status *status);
> +int float16_lt(float16, float16, float_status *status);
> +int float16_eq_quiet(float16, float16, float_status *status);
>
> int float16_is_quiet_nan(float16, float_status *status);
> int float16_is_signaling_nan(float16, float_status *status);
> diff --git a/linux-user/riscv/cpu_loop.c b/linux-user/riscv/cpu_loop.c
> index 12aa3c0..b01548a 100644
> --- a/linux-user/riscv/cpu_loop.c
> +++ b/linux-user/riscv/cpu_loop.c
> @@ -40,7 +40,13 @@ void cpu_loop(CPURISCVState *env)
> signum = 0;
> sigcode = 0;
> sigaddr = 0;
> -
> + if (env->foflag) {
> + if (env->vfp.vl != 0) {
> + env->foflag = false;
> + env->pc += 4;
> + continue;
> + }
> + }
What is this trying to do?
> switch (trapnr) {
> case EXCP_INTERRUPT:
> /* just indicate that signals should be handled asap */
> diff --git a/target/riscv/Makefile.objs b/target/riscv/Makefile.objs
> index b1c79bc..d577cef 100644
> --- a/target/riscv/Makefile.objs
> +++ b/target/riscv/Makefile.objs
> @@ -1,4 +1,4 @@
> -obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o gdbstub.o pmp.o
> +obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o vector_helper.o gdbstub.o pmp.o
>
> DECODETREE = $(SRC_PATH)/scripts/decodetree.py
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 0adb307..5a93aa2 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -67,6 +67,7 @@
> #define RVC RV('C')
> #define RVS RV('S')
> #define RVU RV('U')
> +#define RVV RV('V')
>
> /* S extension denotes that Supervisor mode exists, however it is possible
> to have a core that support S mode but does not have an MMU and there
> @@ -93,9 +94,38 @@ typedef struct CPURISCVState CPURISCVState;
>
> #include "pmp.h"
>
> +#define VLEN 128
> +#define VUNIT(x) (VLEN / x)
> +
If you want to do vectors I suggest you look at the TCGvec types for
passing pointers to vector registers to helpers. In this case you will
want to ensure your vector registers are properly aligned.
> struct CPURISCVState {
> target_ulong gpr[32];
> uint64_t fpr[32]; /* assume both F and D extensions */
> +
> + /* vector coprocessor state. */
> + struct {
> + union VECTOR {
> + float64 f64[VUNIT(64)];
> + float32 f32[VUNIT(32)];
> + float16 f16[VUNIT(16)];
> + target_ulong ul[VUNIT(sizeof(target_ulong))];
> + uint64_t u64[VUNIT(64)];
> + int64_t s64[VUNIT(64)];
> + uint32_t u32[VUNIT(32)];
> + int32_t s32[VUNIT(32)];
> + uint16_t u16[VUNIT(16)];
> + int16_t s16[VUNIT(16)];
> + uint8_t u8[VUNIT(8)];
> + int8_t s8[VUNIT(8)];
> + } vreg[32];
> + target_ulong vxrm;
> + target_ulong vxsat;
> + target_ulong vl;
> + target_ulong vstart;
> + target_ulong vtype;
> + float_status fp_status;
> + } vfp;
> +
> + bool foflag;
Again I have no idea what foflag is here.
> target_ulong pc;
> target_ulong load_res;
> target_ulong load_val;
> diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
> index 11f971a..9eb43ec 100644
> --- a/target/riscv/cpu_bits.h
> +++ b/target/riscv/cpu_bits.h
> @@ -29,6 +29,14 @@
> #define FSR_NXA (FPEXC_NX << FSR_AEXC_SHIFT)
> #define FSR_AEXC (FSR_NVA | FSR_OFA | FSR_UFA | FSR_DZA | FSR_NXA)
>
> +/* Vector Fixed-Point round model */
> +#define FSR_VXRM_SHIFT 9
> +#define FSR_VXRM (0x3 << FSR_VXRM_SHIFT)
> +
> +/* Vector Fixed-Point saturation flag */
> +#define FSR_VXSAT_SHIFT 8
> +#define FSR_VXSAT (0x1 << FSR_VXSAT_SHIFT)
> +
> /* Control and Status Registers */
>
> /* User Trap Setup */
> @@ -48,6 +56,13 @@
> #define CSR_FRM 0x002
> #define CSR_FCSR 0x003
>
> +/* User Vector CSRs */
> +#define CSR_VSTART 0x008
> +#define CSR_VXSAT 0x009
> +#define CSR_VXRM 0x00a
> +#define CSR_VL 0xc20
> +#define CSR_VTYPE 0xc21
> +
> /* User Timers and Counters */
> #define CSR_CYCLE 0xc00
> #define CSR_TIME 0xc01
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index e32b612..405caf6 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -521,6 +521,13 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> [PRV_H] = RISCV_EXCP_H_ECALL,
> [PRV_M] = RISCV_EXCP_M_ECALL
> };
> + if (env->foflag) {
> + if (env->vfp.vl != 0) {
> + env->foflag = false;
> + env->pc += 4;
> + return;
> + }
> + }
>
> if (!async) {
> /* set tval to badaddr for traps with address information */
> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> index e0d4586..a6131ff 100644
> --- a/target/riscv/csr.c
> +++ b/target/riscv/csr.c
> @@ -87,12 +87,12 @@ static int ctr(CPURISCVState *env, int csrno)
> return 0;
> }
>
> -#if !defined(CONFIG_USER_ONLY)
> static int any(CPURISCVState *env, int csrno)
> {
> return 0;
> }
>
> +#if !defined(CONFIG_USER_ONLY)
> static int smode(CPURISCVState *env, int csrno)
> {
> return -!riscv_has_ext(env, RVS);
> @@ -158,8 +158,10 @@ static int read_fcsr(CPURISCVState *env, int csrno, target_ulong *val)
> return -1;
> }
> #endif
> - *val = (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
> - | (env->frm << FSR_RD_SHIFT);
> + *val = (env->vfp.vxrm << FSR_VXRM_SHIFT)
> + | (env->vfp.vxsat << FSR_VXSAT_SHIFT)
> + | (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
> + | (env->frm << FSR_RD_SHIFT);
> return 0;
> }
>
> @@ -172,10 +174,60 @@ static int write_fcsr(CPURISCVState *env, int csrno, target_ulong val)
> env->mstatus |= MSTATUS_FS;
> #endif
> env->frm = (val & FSR_RD) >> FSR_RD_SHIFT;
> + env->vfp.vxrm = (val & FSR_VXRM) >> FSR_VXRM_SHIFT;
> + env->vfp.vxsat = (val & FSR_VXSAT) >> FSR_VXSAT_SHIFT;
> riscv_cpu_set_fflags(env, (val & FSR_AEXC) >> FSR_AEXC_SHIFT);
> return 0;
> }
>
> +static int read_vtype(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vtype;
> + return 0;
> +}
> +
> +static int read_vl(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vl;
> + return 0;
> +}
> +
> +static int read_vxrm(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vxrm;
> + return 0;
> +}
> +
> +static int read_vxsat(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vxsat;
> + return 0;
> +}
> +
> +static int read_vstart(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vstart;
> + return 0;
> +}
> +
> +static int write_vxrm(CPURISCVState *env, int csrno, target_ulong val)
> +{
> + env->vfp.vxrm = val;
> + return 0;
> +}
> +
> +static int write_vxsat(CPURISCVState *env, int csrno, target_ulong val)
> +{
> + env->vfp.vxsat = val;
> + return 0;
> +}
> +
> +static int write_vstart(CPURISCVState *env, int csrno, target_ulong val)
> +{
> + env->vfp.vstart = val;
> + return 0;
> +}
A fixed return value makes me think these should be void functions.
> +
> /* User Timers and Counters */
> static int read_instret(CPURISCVState *env, int csrno, target_ulong *val)
> {
> @@ -873,7 +925,12 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
> [CSR_FFLAGS] = { fs, read_fflags, write_fflags },
> [CSR_FRM] = { fs, read_frm, write_frm },
> [CSR_FCSR] = { fs, read_fcsr, write_fcsr },
> -
> + /* Vector CSRs */
> + [CSR_VSTART] = { any, read_vstart, write_vstart },
> + [CSR_VXSAT] = { any, read_vxsat, write_vxsat },
> + [CSR_VXRM] = { any, read_vxrm, write_vxrm },
> + [CSR_VL] = { any, read_vl },
> + [CSR_VTYPE] = { any, read_vtype },
> /* User Timers and Counters */
> [CSR_CYCLE] = { ctr, read_instret },
> [CSR_INSTRET] = { ctr, read_instret },
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index debb22a..fee02c0 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -76,3 +76,357 @@ DEF_HELPER_2(mret, tl, env, tl)
> DEF_HELPER_1(wfi, void, env)
> DEF_HELPER_1(tlb_flush, void, env)
> #endif
> +/* Vector functions */
Think about how you could split this patch up to introduce a group of
instructions at a time. This will make it a lot easier review.
I'm going to leave review of the specifics to the RISCV maintainers but
I suspect they will want to wait until a v2 of the series. However it
looks like a good first pass at implementing vectors.
--
Alex Bennée
next prev parent reply other threads:[~2019-08-28 9:08 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-28 2:36 [Qemu-riscv] [PATCH] RISCV: support riscv vector extension 0.7.1 liuzhiwei
2019-08-28 9:08 ` Alex Bennée [this message]
2019-08-28 9:08 ` [Qemu-devel] " Alex Bennée
2019-08-28 16:39 ` [Qemu-riscv] " Richard Henderson
2019-08-28 16:39 ` Richard Henderson
2019-08-29 13:35 ` [Qemu-riscv] " liuzhiwei
2019-08-29 13:35 ` [Qemu-devel] " liuzhiwei
2019-08-28 18:54 ` [Qemu-riscv] " Richard Henderson
2019-08-28 18:54 ` Richard Henderson
2019-08-28 20:43 ` [Qemu-riscv] " Richard Henderson
2019-08-28 20:43 ` Richard Henderson
2019-08-29 12:45 ` [Qemu-riscv] " liuzhiwei
2019-08-29 12:45 ` liuzhiwei
2019-08-29 15:09 ` [Qemu-riscv] " Richard Henderson
2019-08-29 15:09 ` Richard Henderson
2019-09-02 7:45 ` [Qemu-riscv] " liuzhiwei
2019-09-02 7:45 ` liuzhiwei
2019-09-03 14:38 ` [Qemu-riscv] " Richard Henderson
2019-09-03 14:38 ` Richard Henderson
2019-09-02 9:43 ` [Qemu-riscv] " liuzhiwei
2019-09-02 9:43 ` liuzhiwei
2019-09-03 14:21 ` [Qemu-riscv] " Richard Henderson
2019-09-03 14:21 ` Richard Henderson
2019-12-19 9:11 ` LIU Zhiwei
2019-12-19 20:38 ` Richard Henderson
2019-12-25 9:36 ` LIU Zhiwei
2019-12-28 1:14 ` Richard Henderson
2019-12-30 8:11 ` LIU Zhiwei
2020-01-05 20:19 ` Richard Henderson
2019-08-28 19:20 ` [Qemu-riscv] " Aleksandar Markovic
2019-08-29 12:56 ` liuzhiwei
2019-08-29 18:32 ` Aleksandar Markovic
2019-08-29 18:32 ` Aleksandar Markovic
2019-08-28 21:34 ` [Qemu-riscv] " Alistair Francis
2019-08-28 21:34 ` Alistair Francis
2019-08-29 12:00 ` [Qemu-riscv] " liuzhiwei
2019-08-29 12:00 ` liuzhiwei
2019-08-29 15:14 ` [Qemu-riscv] " Richard Henderson
2019-08-29 15:14 ` Richard Henderson
2019-09-02 6:54 ` [Qemu-riscv] " liuzhiwei
2019-09-02 6:54 ` liuzhiwei
2019-08-29 21:50 ` [Qemu-riscv] " Alistair Francis
2019-08-29 21:50 ` Alistair Francis
2019-08-30 9:06 ` [Qemu-riscv] " Alex Bennée
2019-08-30 9:06 ` Alex Bennée
2019-08-30 18:39 ` [Qemu-riscv] " Alistair Francis
2019-08-30 18:39 ` Alistair Francis
2019-09-02 6:36 ` [Qemu-riscv] " liuzhiwei
2019-09-02 6:36 ` liuzhiwei
2019-08-29 14:06 ` [Qemu-riscv] " Chih-Min Chao
2019-09-02 8:17 ` liuzhiwei
2019-09-02 8:17 ` [Qemu-devel] " liuzhiwei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87k1ax64io.fsf@linaro.org \
--to=alex.bennee@linaro.org \
--cc=Alistair.Francis@wdc.com \
--cc=aurelien@aurel32.net \
--cc=kbastian@mail.uni-paderborn.de \
--cc=laurent@vivier.eu \
--cc=palmer@sifive.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=qemu-riscv@nongnu.org \
--cc=riku.voipio@iki.fi \
--cc=sagark@eecs.berkeley.edu \
--cc=zhiwei_liu@c-sky.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.