* [PATCH 0/2] target/riscv: RVV 1-fill tail element changes
@ 2023-04-27 20:57 Daniel Henrique Barboza
2023-04-27 20:57 ` [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero Daniel Henrique Barboza
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Daniel Henrique Barboza @ 2023-04-27 20:57 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-riscv, alistair.francis, bmeng, liweiwei, zhiwei_liu, palmer,
Daniel Henrique Barboza
Hi,
This series makes changes in vext_set_tail_elements_1s() to be a little
nicer to the emulation.
First patch makes the function a no-op when vta == 0. Aside from the
logic simplification we also have a little performance boost.
Second patch makes the function debug only. The logic is explained in
the commit message, but long story short: we don't have to implement any
tail-agnostic policy at all to be spec compliant, but this function has
its uses for debug purposes, so keeping it as a debug option allow users
to disable it on demand.
Patches are based on top of Alistair's riscv-to-apply.next.
Daniel Henrique Barboza (2):
target/riscv/vector_helper.c: skip set tail when vta is zero
target/riscv/vector_helper.c: make vext_set_tail_elems_1s() debug only
target/riscv/vector_helper.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
--
2.40.0
^ permalink raw reply [flat|nested] 10+ messages in thread* [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero 2023-04-27 20:57 [PATCH 0/2] target/riscv: RVV 1-fill tail element changes Daniel Henrique Barboza @ 2023-04-27 20:57 ` Daniel Henrique Barboza 2023-04-28 1:08 ` Weiwei Li ` (2 more replies) 2023-04-27 20:57 ` [PATCH 2/2] target/riscv/vector_helper.c: make vext_set_tail_elems_1s() debug only Daniel Henrique Barboza ` (2 subsequent siblings) 3 siblings, 3 replies; 10+ messages in thread From: Daniel Henrique Barboza @ 2023-04-27 20:57 UTC (permalink / raw) To: qemu-devel Cc: qemu-riscv, alistair.francis, bmeng, liweiwei, zhiwei_liu, palmer, Daniel Henrique Barboza The function is a no-op if 'vta' is zero but we're still doing a lot of stuff in this function regardless. vext_set_elems_1s() will ignore every single time (since vta is zero) and we just wasted time. Skip it altogether in this case. Aside from the code simplification there's a noticeable emulation performance gain by doing it. For a regular C binary that does a vectors operation like this: ======= #define SZ 10000000 int main () { int *a = malloc (SZ * sizeof (int)); int *b = malloc (SZ * sizeof (int)); int *c = malloc (SZ * sizeof (int)); for (int i = 0; i < SZ; i++) c[i] = a[i] + b[i]; return c[SZ - 1]; } ======= Emulating it with qemu-riscv64 and RVV takes ~0.3 sec: $ time ~/work/qemu/build/qemu-riscv64 \ -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out real 0m0.303s user 0m0.281s sys 0m0.023s With this skip we take ~0.275 sec: $ time ~/work/qemu/build/qemu-riscv64 \ -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out real 0m0.274s user 0m0.252s sys 0m0.019s This performance gain adds up fast when executing heavy benchmarks like SPEC. Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> --- target/riscv/vector_helper.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index f4d0438988..8e6c99e573 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -268,12 +268,17 @@ static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl, void *vd, uint32_t desc, uint32_t nf, uint32_t esz, uint32_t max_elems) { - uint32_t total_elems = vext_get_total_elems(env, desc, esz); - uint32_t vlenb = riscv_cpu_cfg(env)->vlen >> 3; + uint32_t total_elems, vlenb, registers_used; uint32_t vta = vext_vta(desc); - uint32_t registers_used; int k; + if (vta == 0) { + return; + } + + total_elems = vext_get_total_elems(env, desc, esz); + vlenb = riscv_cpu_cfg(env)->vlen >> 3; + for (k = 0; k < nf; ++k) { vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz, (k * max_elems + max_elems) * esz); -- 2.40.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero 2023-04-27 20:57 ` [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero Daniel Henrique Barboza @ 2023-04-28 1:08 ` Weiwei Li 2023-05-07 23:26 ` Alistair Francis 2023-05-07 23:30 ` Alistair Francis 2 siblings, 0 replies; 10+ messages in thread From: Weiwei Li @ 2023-04-28 1:08 UTC (permalink / raw) To: Daniel Henrique Barboza, qemu-devel Cc: liweiwei, qemu-riscv, alistair.francis, bmeng, zhiwei_liu, palmer On 2023/4/28 04:57, Daniel Henrique Barboza wrote: > The function is a no-op if 'vta' is zero but we're still doing a lot of > stuff in this function regardless. vext_set_elems_1s() will ignore every > single time (since vta is zero) and we just wasted time. > > Skip it altogether in this case. Aside from the code simplification > there's a noticeable emulation performance gain by doing it. For a > regular C binary that does a vectors operation like this: > > ======= > #define SZ 10000000 > > int main () > { > int *a = malloc (SZ * sizeof (int)); > int *b = malloc (SZ * sizeof (int)); > int *c = malloc (SZ * sizeof (int)); > > for (int i = 0; i < SZ; i++) > c[i] = a[i] + b[i]; > return c[SZ - 1]; > } > ======= > > Emulating it with qemu-riscv64 and RVV takes ~0.3 sec: > > $ time ~/work/qemu/build/qemu-riscv64 \ > -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out > > real 0m0.303s > user 0m0.281s > sys 0m0.023s > > With this skip we take ~0.275 sec: > > $ time ~/work/qemu/build/qemu-riscv64 \ > -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out > > real 0m0.274s > user 0m0.252s > sys 0m0.019s > > This performance gain adds up fast when executing heavy benchmarks like > SPEC. > > Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> > --- Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn> Weiwei Li > target/riscv/vector_helper.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c > index f4d0438988..8e6c99e573 100644 > --- a/target/riscv/vector_helper.c > +++ b/target/riscv/vector_helper.c > @@ -268,12 +268,17 @@ static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl, > void *vd, uint32_t desc, uint32_t nf, > uint32_t esz, uint32_t max_elems) > { > - uint32_t total_elems = vext_get_total_elems(env, desc, esz); > - uint32_t vlenb = riscv_cpu_cfg(env)->vlen >> 3; > + uint32_t total_elems, vlenb, registers_used; > uint32_t vta = vext_vta(desc); > - uint32_t registers_used; > int k; > > + if (vta == 0) { > + return; > + } > + > + total_elems = vext_get_total_elems(env, desc, esz); > + vlenb = riscv_cpu_cfg(env)->vlen >> 3; > + > for (k = 0; k < nf; ++k) { > vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz, > (k * max_elems + max_elems) * esz); ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero 2023-04-27 20:57 ` [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero Daniel Henrique Barboza 2023-04-28 1:08 ` Weiwei Li @ 2023-05-07 23:26 ` Alistair Francis 2023-05-07 23:30 ` Alistair Francis 2 siblings, 0 replies; 10+ messages in thread From: Alistair Francis @ 2023-05-07 23:26 UTC (permalink / raw) To: Daniel Henrique Barboza Cc: qemu-devel, qemu-riscv, alistair.francis, bmeng, liweiwei, zhiwei_liu, palmer On Fri, Apr 28, 2023 at 6:58 AM Daniel Henrique Barboza <dbarboza@ventanamicro.com> wrote: > > The function is a no-op if 'vta' is zero but we're still doing a lot of > stuff in this function regardless. vext_set_elems_1s() will ignore every > single time (since vta is zero) and we just wasted time. > > Skip it altogether in this case. Aside from the code simplification > there's a noticeable emulation performance gain by doing it. For a > regular C binary that does a vectors operation like this: > > ======= > #define SZ 10000000 > > int main () > { > int *a = malloc (SZ * sizeof (int)); > int *b = malloc (SZ * sizeof (int)); > int *c = malloc (SZ * sizeof (int)); > > for (int i = 0; i < SZ; i++) > c[i] = a[i] + b[i]; > return c[SZ - 1]; > } > ======= > > Emulating it with qemu-riscv64 and RVV takes ~0.3 sec: > > $ time ~/work/qemu/build/qemu-riscv64 \ > -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out > > real 0m0.303s > user 0m0.281s > sys 0m0.023s > > With this skip we take ~0.275 sec: > > $ time ~/work/qemu/build/qemu-riscv64 \ > -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out > > real 0m0.274s > user 0m0.252s > sys 0m0.019s > > This performance gain adds up fast when executing heavy benchmarks like > SPEC. > > Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Alistair > --- > target/riscv/vector_helper.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c > index f4d0438988..8e6c99e573 100644 > --- a/target/riscv/vector_helper.c > +++ b/target/riscv/vector_helper.c > @@ -268,12 +268,17 @@ static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl, > void *vd, uint32_t desc, uint32_t nf, > uint32_t esz, uint32_t max_elems) > { > - uint32_t total_elems = vext_get_total_elems(env, desc, esz); > - uint32_t vlenb = riscv_cpu_cfg(env)->vlen >> 3; > + uint32_t total_elems, vlenb, registers_used; > uint32_t vta = vext_vta(desc); > - uint32_t registers_used; > int k; > > + if (vta == 0) { > + return; > + } > + > + total_elems = vext_get_total_elems(env, desc, esz); > + vlenb = riscv_cpu_cfg(env)->vlen >> 3; > + > for (k = 0; k < nf; ++k) { > vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz, > (k * max_elems + max_elems) * esz); > -- > 2.40.0 > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero 2023-04-27 20:57 ` [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero Daniel Henrique Barboza 2023-04-28 1:08 ` Weiwei Li 2023-05-07 23:26 ` Alistair Francis @ 2023-05-07 23:30 ` Alistair Francis 2 siblings, 0 replies; 10+ messages in thread From: Alistair Francis @ 2023-05-07 23:30 UTC (permalink / raw) To: Daniel Henrique Barboza Cc: qemu-devel, qemu-riscv, alistair.francis, bmeng, liweiwei, zhiwei_liu, palmer On Fri, Apr 28, 2023 at 6:58 AM Daniel Henrique Barboza <dbarboza@ventanamicro.com> wrote: > > The function is a no-op if 'vta' is zero but we're still doing a lot of > stuff in this function regardless. vext_set_elems_1s() will ignore every > single time (since vta is zero) and we just wasted time. > > Skip it altogether in this case. Aside from the code simplification > there's a noticeable emulation performance gain by doing it. For a > regular C binary that does a vectors operation like this: > > ======= > #define SZ 10000000 > > int main () > { > int *a = malloc (SZ * sizeof (int)); > int *b = malloc (SZ * sizeof (int)); > int *c = malloc (SZ * sizeof (int)); > > for (int i = 0; i < SZ; i++) > c[i] = a[i] + b[i]; > return c[SZ - 1]; > } > ======= > > Emulating it with qemu-riscv64 and RVV takes ~0.3 sec: > > $ time ~/work/qemu/build/qemu-riscv64 \ > -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out > > real 0m0.303s > user 0m0.281s > sys 0m0.023s > > With this skip we take ~0.275 sec: > > $ time ~/work/qemu/build/qemu-riscv64 \ > -cpu rv64,debug=false,vext_spec=v1.0,v=true,vlen=128 ./foo.out > > real 0m0.274s > user 0m0.252s > sys 0m0.019s > > This performance gain adds up fast when executing heavy benchmarks like > SPEC. > > Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Thanks! Applied to riscv-to-apply.next Alistair > --- > target/riscv/vector_helper.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c > index f4d0438988..8e6c99e573 100644 > --- a/target/riscv/vector_helper.c > +++ b/target/riscv/vector_helper.c > @@ -268,12 +268,17 @@ static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl, > void *vd, uint32_t desc, uint32_t nf, > uint32_t esz, uint32_t max_elems) > { > - uint32_t total_elems = vext_get_total_elems(env, desc, esz); > - uint32_t vlenb = riscv_cpu_cfg(env)->vlen >> 3; > + uint32_t total_elems, vlenb, registers_used; > uint32_t vta = vext_vta(desc); > - uint32_t registers_used; > int k; > > + if (vta == 0) { > + return; > + } > + > + total_elems = vext_get_total_elems(env, desc, esz); > + vlenb = riscv_cpu_cfg(env)->vlen >> 3; > + > for (k = 0; k < nf; ++k) { > vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz, > (k * max_elems + max_elems) * esz); > -- > 2.40.0 > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2] target/riscv/vector_helper.c: make vext_set_tail_elems_1s() debug only 2023-04-27 20:57 [PATCH 0/2] target/riscv: RVV 1-fill tail element changes Daniel Henrique Barboza 2023-04-27 20:57 ` [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero Daniel Henrique Barboza @ 2023-04-27 20:57 ` Daniel Henrique Barboza 2023-04-28 1:22 ` Weiwei Li 2023-04-27 21:15 ` [PATCH 0/2] target/riscv: RVV 1-fill tail element changes Palmer Dabbelt 2023-04-28 9:30 ` Dickon Hood 3 siblings, 1 reply; 10+ messages in thread From: Daniel Henrique Barboza @ 2023-04-27 20:57 UTC (permalink / raw) To: qemu-devel Cc: qemu-riscv, alistair.francis, bmeng, liweiwei, zhiwei_liu, palmer, Daniel Henrique Barboza Commit 3479a814 ("target/riscv: rvv-1.0: add VMA and VTA") added vma and vta fields in the vtype register, while also defining that QEMU doesn't need to have a tail agnostic policy to be compliant with the RVV spec. It ended up removing all tail handling code as well. Later, commit 752614ca ("target/riscv: rvv: Add tail agnostic for vector load / store instructions") reintroduced the tail agnostic fill for vector load/store instructions only. This puts QEMU in a situation where some functions are 1-filling the tail elements and others don't. This is still a valid implementation, but the process of 1-filling the tail elements takes valuable emulation time that can be used doing anything else. If the spec doesn't demand a specific tail-agostic policy, a proper software wouldn't expect any policy to be in place. This means that, more often than not, the work we're doing by 1-filling tail elements is wasted. We would be better of if vext_set_tail_elems_1s() is removed entirely from the code. All this said, there's still a debug value associated with it. So, instead of removing it, let's gate it with cpu->cfg.debug. This way software can enable this code if desirable, but for the regular case we shouldn't waste time with it. Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> --- target/riscv/vector_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 8e6c99e573..e0a292ac24 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -272,7 +272,7 @@ static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl, uint32_t vta = vext_vta(desc); int k; - if (vta == 0) { + if (vta == 0 || !riscv_cpu_cfg(env)->debug) { return; } -- 2.40.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] target/riscv/vector_helper.c: make vext_set_tail_elems_1s() debug only 2023-04-27 20:57 ` [PATCH 2/2] target/riscv/vector_helper.c: make vext_set_tail_elems_1s() debug only Daniel Henrique Barboza @ 2023-04-28 1:22 ` Weiwei Li 2023-04-28 9:16 ` Daniel Henrique Barboza 0 siblings, 1 reply; 10+ messages in thread From: Weiwei Li @ 2023-04-28 1:22 UTC (permalink / raw) To: Daniel Henrique Barboza, qemu-devel Cc: liweiwei, qemu-riscv, alistair.francis, bmeng, zhiwei_liu, palmer On 2023/4/28 04:57, Daniel Henrique Barboza wrote: > Commit 3479a814 ("target/riscv: rvv-1.0: add VMA and VTA") added vma and > vta fields in the vtype register, while also defining that QEMU doesn't > need to have a tail agnostic policy to be compliant with the RVV spec. > It ended up removing all tail handling code as well. Later, commit > 752614ca ("target/riscv: rvv: Add tail agnostic for vector load / store > instructions") reintroduced the tail agnostic fill for vector load/store > instructions only. > > This puts QEMU in a situation where some functions are 1-filling the > tail elements and others don't. This is still a valid implementation, > but the process of 1-filling the tail elements takes valuable emulation > time that can be used doing anything else. If the spec doesn't demand a > specific tail-agostic policy, a proper software wouldn't expect any > policy to be in place. This means that, more often than not, the work > we're doing by 1-filling tail elements is wasted. We would be better of > if vext_set_tail_elems_1s() is removed entirely from the code. > > All this said, there's still a debug value associated with it. So, > instead of removing it, let's gate it with cpu->cfg.debug. This way > software can enable this code if desirable, but for the regular case we > shouldn't waste time with it. > > Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> > --- > target/riscv/vector_helper.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c > index 8e6c99e573..e0a292ac24 100644 > --- a/target/riscv/vector_helper.c > +++ b/target/riscv/vector_helper.c > @@ -272,7 +272,7 @@ static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl, > uint32_t vta = vext_vta(desc); > int k; > > - if (vta == 0) { > + if (vta == 0 || !riscv_cpu_cfg(env)->debug) { I think this is not correct. 'debug' property is used for debug spec. And this feature is controlled by another property 'rvv_ta_all_1s' . By the way, cfg.rvv_ta_all_1s have been ANDed intovta value. So additional check on it is also unnecessary here. Regards, Weiwei Li > return; > } > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] target/riscv/vector_helper.c: make vext_set_tail_elems_1s() debug only 2023-04-28 1:22 ` Weiwei Li @ 2023-04-28 9:16 ` Daniel Henrique Barboza 0 siblings, 0 replies; 10+ messages in thread From: Daniel Henrique Barboza @ 2023-04-28 9:16 UTC (permalink / raw) To: Weiwei Li, qemu-devel Cc: qemu-riscv, alistair.francis, bmeng, zhiwei_liu, palmer On 4/27/23 22:22, Weiwei Li wrote: > > On 2023/4/28 04:57, Daniel Henrique Barboza wrote: >> Commit 3479a814 ("target/riscv: rvv-1.0: add VMA and VTA") added vma and >> vta fields in the vtype register, while also defining that QEMU doesn't >> need to have a tail agnostic policy to be compliant with the RVV spec. >> It ended up removing all tail handling code as well. Later, commit >> 752614ca ("target/riscv: rvv: Add tail agnostic for vector load / store >> instructions") reintroduced the tail agnostic fill for vector load/store >> instructions only. >> >> This puts QEMU in a situation where some functions are 1-filling the >> tail elements and others don't. This is still a valid implementation, >> but the process of 1-filling the tail elements takes valuable emulation >> time that can be used doing anything else. If the spec doesn't demand a >> specific tail-agostic policy, a proper software wouldn't expect any >> policy to be in place. This means that, more often than not, the work >> we're doing by 1-filling tail elements is wasted. We would be better of >> if vext_set_tail_elems_1s() is removed entirely from the code. >> >> All this said, there's still a debug value associated with it. So, >> instead of removing it, let's gate it with cpu->cfg.debug. This way >> software can enable this code if desirable, but for the regular case we >> shouldn't waste time with it. >> >> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> >> --- >> target/riscv/vector_helper.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c >> index 8e6c99e573..e0a292ac24 100644 >> --- a/target/riscv/vector_helper.c >> +++ b/target/riscv/vector_helper.c >> @@ -272,7 +272,7 @@ static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl, >> uint32_t vta = vext_vta(desc); >> int k; >> - if (vta == 0) { >> + if (vta == 0 || !riscv_cpu_cfg(env)->debug) { > > I think this is not correct. 'debug' property is used for debug spec. And this feature is controlled by another property 'rvv_ta_all_1s' . You're right. I wasn't aware that this flag exists: $ git grep 'rvv_ta_all_1s' target/riscv/cpu.c: DEFINE_PROP_BOOL("rvv_ta_all_1s", RISCVCPU, cfg.rvv_ta_all_1s, false), target/riscv/cpu.h: bool rvv_ta_all_1s; target/riscv/translate.c: ctx->vta = FIELD_EX32(tb_flags, TB_FLAGS, VTA) && cpu->cfg.rvv_ta_all_1s; target/riscv/translate.c: ctx->cfg_vta_all_1s = cpu->cfg.rvv_ta_all_1s; > > By the way, cfg.rvv_ta_all_1s have been ANDed intovta value. So additional check on it is also unnecessary here. Yes. We can drop this patch then since 'vta' is already accounting for ta_all_1s. Thanks, Daniel > > Regards, > > Weiwei Li > >> return; >> } > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/2] target/riscv: RVV 1-fill tail element changes 2023-04-27 20:57 [PATCH 0/2] target/riscv: RVV 1-fill tail element changes Daniel Henrique Barboza 2023-04-27 20:57 ` [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero Daniel Henrique Barboza 2023-04-27 20:57 ` [PATCH 2/2] target/riscv/vector_helper.c: make vext_set_tail_elems_1s() debug only Daniel Henrique Barboza @ 2023-04-27 21:15 ` Palmer Dabbelt 2023-04-28 9:30 ` Dickon Hood 3 siblings, 0 replies; 10+ messages in thread From: Palmer Dabbelt @ 2023-04-27 21:15 UTC (permalink / raw) To: dbarboza Cc: qemu-devel, qemu-riscv, Alistair Francis, bmeng, liweiwei, zhiwei_liu, dbarboza On Thu, 27 Apr 2023 13:57:06 PDT (-0700), dbarboza@ventanamicro.com wrote: > Hi, > > This series makes changes in vext_set_tail_elements_1s() to be a little > nicer to the emulation. > > First patch makes the function a no-op when vta == 0. Aside from the > logic simplification we also have a little performance boost. > > Second patch makes the function debug only. The logic is explained in > the commit message, but long story short: we don't have to implement any > tail-agnostic policy at all to be spec compliant, but this function has > its uses for debug purposes, so keeping it as a debug option allow users > to disable it on demand. > > Patches are based on top of Alistair's riscv-to-apply.next. > > Daniel Henrique Barboza (2): > target/riscv/vector_helper.c: skip set tail when vta is zero > target/riscv/vector_helper.c: make vext_set_tail_elems_1s() debug only > > target/riscv/vector_helper.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Though this made me think: it'd be nice to have some sort of "aggressively do odd things for VTA/VMA" mode in QEMU, as that could help shake out bugs in software. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/2] target/riscv: RVV 1-fill tail element changes 2023-04-27 20:57 [PATCH 0/2] target/riscv: RVV 1-fill tail element changes Daniel Henrique Barboza ` (2 preceding siblings ...) 2023-04-27 21:15 ` [PATCH 0/2] target/riscv: RVV 1-fill tail element changes Palmer Dabbelt @ 2023-04-28 9:30 ` Dickon Hood 3 siblings, 0 replies; 10+ messages in thread From: Dickon Hood @ 2023-04-28 9:30 UTC (permalink / raw) To: Daniel Henrique Barboza Cc: qemu-devel, qemu-riscv, alistair.francis, bmeng, liweiwei, zhiwei_liu, palmer On Thu, Apr 27, 2023 at 17:57:06 -0300, Daniel Henrique Barboza wrote: : Second patch makes the function debug only. The logic is explained in : the commit message, but long story short: we don't have to implement any : tail-agnostic policy at all to be spec compliant, but this function has : its uses for debug purposes, so keeping it as a debug option allow users : to disable it on demand. Yes, please don't remove this functionality completely -- our internal stress test tool for the vector crypto patches relies on it. -- Dickon Hood Senior Software Engineer Codethink Ltd. 3rd Floor, Dale House, 35 Dale Street, https://www.codethink.co.uk/ Manchester, M1 2HF, United Kingdom. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-05-07 23:31 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-04-27 20:57 [PATCH 0/2] target/riscv: RVV 1-fill tail element changes Daniel Henrique Barboza 2023-04-27 20:57 ` [PATCH 1/2] target/riscv/vector_helper.c: skip set tail when vta is zero Daniel Henrique Barboza 2023-04-28 1:08 ` Weiwei Li 2023-05-07 23:26 ` Alistair Francis 2023-05-07 23:30 ` Alistair Francis 2023-04-27 20:57 ` [PATCH 2/2] target/riscv/vector_helper.c: make vext_set_tail_elems_1s() debug only Daniel Henrique Barboza 2023-04-28 1:22 ` Weiwei Li 2023-04-28 9:16 ` Daniel Henrique Barboza 2023-04-27 21:15 ` [PATCH 0/2] target/riscv: RVV 1-fill tail element changes Palmer Dabbelt 2023-04-28 9:30 ` Dickon Hood
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).