[PATCH v2 0/2] target/riscv: fix vector LD/ST instruction


* [PATCH v2 0/2] target/riscv: fix vector LD/ST instruction
@ 2025-08-16  0:29 ` Chao Liu
  0 siblings, 0 replies; 8+ messages in thread
From: Chao Liu @ 2025-08-15 19:55 UTC (permalink / raw)
  To: paolo.savini, dbarboza, ebiggers, palmer, alistair.francis,
	liwei1518, zhiwei_liu
  Cc: qemu-riscv, qemu-devel, Chao Liu

Hi Paolo, Eric, Daniel,

I have attempted to fix this issue. Thanks to Eric for providing the test case.

This patch fixes a critical bug in the RISC-V vector instruction translation
that caused incorrect data handling in strided load operations (e.g., vlsseg8e32).

Changes in v2:
- Fixed the commit message of patch v1.

#### Problem Description

The `get_log2` function in `trans_rvv.c.inc` returned a value 1 higher than the
actual log2 value. For example, get_log2(4) incorrectly returned 3 instead of 2.

This led to erroneous vector register offset calculations, resulting in data
overlap where bytes 32-47 were incorrectly copied to positions 16-31 in ChaCha20
encryption code.

rvv_test_func:
    vsetivli    zero, 1, e32, m1, ta, ma
    li          t0, 64

    vlsseg8e32.v v0, (a0), t0
    addi        a0, a0, 32
    vlsseg8e32.v v8, (a0), t0

    vssseg8e32.v v0, (a1), t0
    addi        a1, a1, 32
    vssseg8e32.v v8, (a1), t0
    ret

#### Root Cause Analysis

The original implementation counted the number of right shifts until zero,
including the final shift that reduced the value to zero:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    for (; a > 0;) {
        a >>= 1;
        i++;
    }
    return i; // Returns 3 for a=4 (0b100 → 0b10 → 0b1 → 0b0)
}

#### Fix Implementation

The corrected function stops shifting when only the highest bit remains and
handles the special case of a=0:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    if (a == 0) {
        return i; // Handle edge case
    }
    for (; a > 1; a >>= 1) {
        i++;
    }
    return i; // Now returns 2 for a=4
}
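
To make the off-by-one concrete, here is a minimal standalone sketch that places the two versions quoted above side by side. The names get_log2_old and get_log2_fixed are illustrative only and do not appear in QEMU; the bodies follow the snippets in this message.

```c
#include <stdint.h>

/* Pre-fix behaviour as quoted above: counts every right shift,
 * including the last one that turns the value into zero. */
static inline uint32_t get_log2_old(uint32_t a)
{
    uint32_t i = 0;
    while (a > 0) {
        a >>= 1;
        i++;
    }
    return i;
}

/* Post-fix behaviour: stops once only the highest set bit is left,
 * so the result is floor(log2(a)) for a > 0, and 0 for a == 0. */
static inline uint32_t get_log2_fixed(uint32_t a)
{
    uint32_t i = 0;
    if (a == 0) {
        return i;
    }
    while (a > 1) {
        a >>= 1;
        i++;
    }
    return i;
}
```

For every non-zero input the old version returns exactly one more than the fixed one, for example 3 versus 2 for a=4 and 1 versus 0 for a=1. Both return 0 for a=0.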

#### Testing

This fix has been verified with:
    1. The provided ChaCha20 vector optimization test case
    2. RVV strided load instruction tests in `test-vlsseg8e32.S`

All tests now pass with correct data handling and no memory overlap.

Test using the following commands:

    ./configure --target-list=riscv64-softmmu \
                --cross-prefix-riscv64=riscv64-unknown-elf-

    ninja -j$(nproc) -C build && make check-tcg

Expected result:

    BUILD   riscv64-softmmu guest-tests
    RUN     riscv64-softmmu guest-tests
    TEST    issue1060 on riscv64
    TEST    test-vlsseg8e32 on riscv64

Best regards,

Chao

Chao Liu (2):
  target/riscv: fix vector register address calculation in strided LD/ST
  tests/tcg/riscv64: Add test for vlsseg8e32 instruction

 target/riscv/insn_trans/trans_rvv.c.inc   |   5 +-
 tests/tcg/riscv64/Makefile.softmmu-target |   8 +-
 tests/tcg/riscv64/test-vlsseg8e32.S       | 108 ++++++++++++++++++++++
 3 files changed, 118 insertions(+), 3 deletions(-)
 create mode 100644 tests/tcg/riscv64/test-vlsseg8e32.S

-- 
2.50.1


* [PATCH v2 0/2] target/riscv: fix vector LD/ST instruction
@ 2025-08-16  0:29 ` Chao Liu
  0 siblings, 0 replies; 8+ messages in thread
From: Chao Liu @ 2025-08-16  0:29 UTC (permalink / raw)
  To: paolo.savini, dbarboza, ebiggers, palmer, alistair.francis,
	liwei1518, zhiwei_liu
  Cc: qemu-riscv, qemu-devel, Chao Liu


Hi Paolo, Eric, Daniel,

patch v3 changes:
- Fix the get_log2 function:
  https://lore.kernel.org/qemu-riscv/cover.1755287531.git.chao.liu@yeah.net/T/#t

#### Problem Description

The `get_log2` function in `trans_rvv.c.inc` returned a value 1 higher than the
actual log2 value. For example, get_log2(4) incorrectly returned 3 instead of 2.

This led to erroneous vector register offset calculations, resulting in data
overlap where bytes 32-47 were incorrectly copied to positions 16-31 in ChaCha20
encryption code.

rvv_test_func:
    vsetivli    zero, 1, e32, m1, ta, ma
    li          t0, 64

    vlsseg8e32.v v0, (a0), t0
    addi        a0, a0, 32
    vlsseg8e32.v v8, (a0), t0

    vssseg8e32.v v0, (a1), t0
    addi        a1, a1, 32
    vssseg8e32.v v8, (a1), t0
    ret

#### Root Cause Analysis

The original implementation counted the number of right shifts until zero,
including the final shift that reduced the value to zero:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    for (; a > 0;) {
        a >>= 1;
        i++;
    }
    return i; // Returns 3 for a=4 (0b100 → 0b10 → 0b1 → 0b0)
}

#### Fix Implementation

The corrected function stops shifting when only the highest bit remains and
handles the special case of a=0:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    if (a == 0) {
        return i; // Handle edge case
    }
    for (; a > 1; a >>= 1) {
        i++;
    }
    return i; // Now returns 2 for a=4
}

#### Testing

This fix has been verified with:
    1. The provided ChaCha20 vector optimization test case
    2. RVV strided load instruction tests in `test-vlsseg8e32.S`

All tests now pass with correct data handling and no memory overlap.


Test using the following commands:

    ./configure --target-list=riscv64-softmmu \
                --cross-prefix-riscv64=riscv64-unknown-elf-

    ninja -j$(nproc) -C build && make check-tcg

Expected result:

    BUILD   riscv64-softmmu guest-tests
    RUN     riscv64-softmmu guest-tests
    TEST    issue1060 on riscv64
    TEST    test-vlsseg8e32 on riscv64


Best regards,

Chao

Chao Liu (2):
  target/riscv: fix vector register address calculation in strided LD/ST
  tests/tcg/riscv64: Add test for vlsseg8e32 instruction

 target/riscv/insn_trans/trans_rvv.c.inc   |   5 +-
 tests/tcg/riscv64/Makefile.softmmu-target |   8 +-
 tests/tcg/riscv64/test-vlsseg8e32.S       | 108 ++++++++++++++++++++++
 3 files changed, 118 insertions(+), 3 deletions(-)
 create mode 100644 tests/tcg/riscv64/test-vlsseg8e32.S

-- 
2.50.1




* [PATCH v2 1/2] target/riscv: fix vector register address calculation in strided LD/ST
@ 2025-08-16  0:29   ` Chao Liu
  0 siblings, 0 replies; 8+ messages in thread
From: Chao Liu @ 2025-08-15 19:55 UTC (permalink / raw)
  To: paolo.savini, dbarboza, ebiggers, palmer, alistair.francis,
	liwei1518, zhiwei_liu
  Cc: qemu-riscv, qemu-devel, Chao Liu

This patch fixes a critical bug in the RISC-V vector instruction
translation that caused incorrect data handling in strided load
operations (e.g., vlsseg8e32).

Problem Description:

The `get_log2` function in `trans_rvv.c.inc` returned a value 1 higher
than the actual log2 value. For example, get_log2(4) incorrectly
returned 3 instead of 2.

This led to erroneous vector register offset calculations, resulting in
data overlap where bytes 32-47 were incorrectly copied to positions
16-31 in ChaCha20 encryption code.

rvv_test_func:
    vsetivli    zero, 1, e32, m1, ta, ma
    li          t0, 64

    vlsseg8e32.v v0, (a0), t0
    addi        a0, a0, 32
    vlsseg8e32.v v8, (a0), t0

    vssseg8e32.v v0, (a1), t0
    addi        a1, a1, 32
    vssseg8e32.v v8, (a1), t0
    ret

Analysis:

The original implementation counted the number of right shifts until
zero, including the final shift that reduced the value to zero:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    for (; a > 0;) {
        a >>= 1;
        i++;
    }
    return i; // Returns 3 for a=4 (0b100 → 0b10 → 0b1 → 0b0)
}

Fix:

The corrected function stops shifting when only the highest bit remains
and handles the special case of a=0:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    if (a == 0) {
        return i; // Handle edge case
    }
    for (; a > 1; a >>= 1) {
        i++;
    }
    return i; // Now returns 2 for a=4
}

Fixes: 28c12c1f2f ("Generate strided vector loads/stores with tcg nodes.")

Signed-off-by: Chao Liu <chao.liu@yeah.net>
---
 target/riscv/insn_trans/trans_rvv.c.inc | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index 2b6077ac06..f50b62b1d8 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -877,7 +877,10 @@ static inline uint32_t MAXSZ(DisasContext *s)
 static inline uint32_t get_log2(uint32_t a)
 {
     uint32_t i = 0;
-    for (; a > 0;) {
+    if (a == 0) {
+        return i;
+    }
+    for (; a > 1;) {
         a >>= 1;
         i++;
     }
-- 
2.50.1




* [PATCH v2 1/2] target/riscv: fix vector register address calculation in strided LD/ST
@ 2025-08-16  0:29   ` Chao Liu
  0 siblings, 0 replies; 8+ messages in thread
From: Chao Liu @ 2025-08-16  0:29 UTC (permalink / raw)
  To: paolo.savini, dbarboza, ebiggers, palmer, alistair.francis,
	liwei1518, zhiwei_liu
  Cc: qemu-riscv, qemu-devel, Chao Liu


This patch fixes a critical bug in the RISC-V vector instruction
translation that caused incorrect data handling in strided load
operations (e.g., vlsseg8e32).

Problem Description:

The `get_log2` function in `trans_rvv.c.inc` returned a value 1 higher
than the actual log2 value. For example, get_log2(4) incorrectly
returned 3 instead of 2.

This led to erroneous vector register offset calculations, resulting in
data overlap where bytes 32-47 were incorrectly copied to positions
16-31 in ChaCha20 encryption code.

rvv_test_func:
    vsetivli    zero, 1, e32, m1, ta, ma
    li          t0, 64

    vlsseg8e32.v v0, (a0), t0
    addi        a0, a0, 32
    vlsseg8e32.v v8, (a0), t0

    vssseg8e32.v v0, (a1), t0
    addi        a1, a1, 32
    vssseg8e32.v v8, (a1), t0
    ret

Analysis:

The original implementation counted the number of right shifts until
zero, including the final shift that reduced the value to zero:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    for (; a > 0;) {
        a >>= 1;
        i++;
    }
    return i; // Returns 3 for a=4 (0b100 → 0b10 → 0b1 → 0b0)
}

Fix:

The corrected function stops shifting when only the highest bit remains
and handles the special case of a=0:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    if (a == 0) {
        return i; // Handle edge case
    }
    for (; a > 1; a >>= 1) {
        i++;
    }
    return i; // Now returns 2 for a=4
}

Fixes: 28c12c1f2f ("Generate strided vector loads/stores with tcg nodes.")

Signed-off-by: Chao Liu <chao.liu@yeah.net>
---
 target/riscv/insn_trans/trans_rvv.c.inc | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index 2b6077ac06..f50b62b1d8 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -877,7 +877,10 @@ static inline uint32_t MAXSZ(DisasContext *s)
 static inline uint32_t get_log2(uint32_t a)
 {
     uint32_t i = 0;
-    for (; a > 0;) {
+    if (a == 0) {
+        return i;
+    }
+    for (; a > 1;) {
         a >>= 1;
         i++;
     }
-- 
2.50.1




* Re: [PATCH v2 1/2] target/riscv: fix vector register address calculation in strided LD/ST
  2025-08-16  0:29   ` Chao Liu
  (?)
@ 2025-08-15 22:07   ` Eric Biggers
  -1 siblings, 0 replies; 8+ messages in thread
From: Eric Biggers @ 2025-08-15 22:07 UTC (permalink / raw)
  To: Chao Liu
  Cc: paolo.savini, dbarboza, palmer, alistair.francis, liwei1518,
	zhiwei_liu, qemu-riscv, qemu-devel

On Sat, Aug 16, 2025 at 03:55:40AM +0800, Chao Liu wrote:
> This patch fixes a critical bug in the RISC-V vector instruction
> translation that caused incorrect data handling in strided load
> operations (e.g., vlsseg8e32).
> 
> Problem Description:
> 
> The `get_log2` function in `trans_rvv.c.inc` returned a value 1 higher
> than the actual log2 value. For example, get_log2(4) incorrectly
> returned 3 instead of 2.
> 
> This led to erroneous vector register offset calculations, resulting in
> data overlap where bytes 32-47 were incorrectly copied to positions
> 16-31 in ChaCha20 encryption code.
> 
> rvv_test_func:
>     vsetivli    zero, 1, e32, m1, ta, ma
>     li          t0, 64
> 
>     vlsseg8e32.v v0, (a0), t0
>     addi        a0, a0, 32
>     vlsseg8e32.v v8, (a0), t0
> 
>     vssseg8e32.v v0, (a1), t0
>     addi        a1, a1, 32
>     vssseg8e32.v v8, (a1), t0
>     ret
> 
> Analysis:
> 
> The original implementation counted the number of right shifts until
> zero, including the final shift that reduced the value to zero:
> 
> static inline uint32_t get_log2(uint32_t a)
> {
>     uint32_t i = 0;
>     for (; a > 0;) {
>         a >>= 1;
>         i++;
>     }
>     return i; // Returns 3 for a=4 (0b100 → 0b10 → 0b1 → 0b0)
> }
> 
> Fix:
> 
> The corrected function stops shifting when only the highest bit remains
> and handles the special case of a=0:
> 
> static inline uint32_t get_log2(uint32_t a)
> {
>     uint32_t i = 0;
>     if (a == 0) {
>         return i; // Handle edge case
>     }
>     for (; a > 1; a >>= 1) {
>         i++;
>     }
>     return i; // Now returns 2 for a=4
> }
> 
> Fixes: 28c12c1f2f ("Generate strided vector loads/stores with tcg nodes.")
> 
> Signed-off-by: Chao Liu <chao.liu@yeah.net>
> ---
>  target/riscv/insn_trans/trans_rvv.c.inc | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 

Tested-by: Eric Biggers <ebiggers@kernel.org>

But, to get this to apply I had to re-apply the fixed commit (which was
reverted), then resolve a conflict.  You'll need to send out a new
series which applies to the latest master branch.

- Eric



* Re: [PATCH v2 1/2] target/riscv: fix vector register address calculation in strided LD/ST
  2025-08-16  0:29   ` Chao Liu
  (?)
  (?)
@ 2025-08-16  6:44   ` Richard Henderson
  -1 siblings, 0 replies; 8+ messages in thread
From: Richard Henderson @ 2025-08-16  6:44 UTC (permalink / raw)
  To: qemu-devel

On 8/16/25 10:29, Chao Liu wrote:
> This patch fixes a critical bug in the RISC-V vector instruction
> translation that caused incorrect data handling in strided load
> operations (e.g., vlsseg8e32).
> 
> Problem Description:
> 
> The `get_log2` function in `trans_rvv.c.inc` returned a value 1 higher
> than the actual log2 value. For example, get_log2(4) incorrectly
> returned 3 instead of 2.
> 
> This led to erroneous vector register offset calculations, resulting in
> data overlap where bytes 32-47 were incorrectly copied to positions
> 16-31 in ChaCha20 encryption code.
> 
> rvv_test_func:
>      vsetivli    zero, 1, e32, m1, ta, ma
>      li          t0, 64
> 
>      vlsseg8e32.v v0, (a0), t0
>      addi        a0, a0, 32
>      vlsseg8e32.v v8, (a0), t0
> 
>      vssseg8e32.v v0, (a1), t0
>      addi        a1, a1, 32
>      vssseg8e32.v v8, (a1), t0
>      ret
> 
> Analysis:
> 
> The original implementation counted the number of right shifts until
> zero, including the final shift that reduced the value to zero:
> 
> static inline uint32_t get_log2(uint32_t a)
> {
>      uint32_t i = 0;
>      for (; a > 0;) {
>          a >>= 1;
>          i++;
>      }
>      return i; // Returns 3 for a=4 (0b100 → 0b10 → 0b1 → 0b0)
> }
> 
> Fix:
> 
> The corrected function stops shifting when only the highest bit remains
> and handles the special case of a=0:
> 
> static inline uint32_t get_log2(uint32_t a)
> {
>      uint32_t i = 0;
>      if (a == 0) {
>          return i; // Handle edge case
>      }
>      for (; a > 1; a >>= 1) {
>          i++;
>      }
>      return i; // Now returns 2 for a=4
> }
> 
> Fixes: 28c12c1f2f ("Generate strided vector loads/stores with tcg nodes.")
> 
> Signed-off-by: Chao Liu <chao.liu@yeah.net>
> ---
>   target/riscv/insn_trans/trans_rvv.c.inc | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
> index 2b6077ac06..f50b62b1d8 100644
> --- a/target/riscv/insn_trans/trans_rvv.c.inc
> +++ b/target/riscv/insn_trans/trans_rvv.c.inc
> @@ -877,7 +877,10 @@ static inline uint32_t MAXSZ(DisasContext *s)
>   static inline uint32_t get_log2(uint32_t a)
>   {
>       uint32_t i = 0;
> -    for (; a > 0;) {
> +    if (a == 0) {
> +        return i;
> +    }
> +    for (; a > 1;) {
>           a >>= 1;
>           i++;
>       }

I suggest

     assert(is_power_of_2(a));
     return ctz32(a);

I was surprised we don't have such a function in qemu/host-utils.h already.
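
As a rough illustration of the suggestion above, the following standalone sketch computes the log2 of an exact power of two by counting trailing zeros. It is an assumption-laden stand-in, not QEMU code: is_pow2_u32 and log2_pow2 are hypothetical names, and the GCC/Clang builtin __builtin_ctz is used in place of ctz32 so the snippet compiles outside the QEMU tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local helper: true when exactly one bit of a is set. */
static inline bool is_pow2_u32(uint32_t a)
{
    return a != 0 && (a & (a - 1)) == 0;
}

/* log2 of an exact power of two. For such a value the index of the
 * single set bit equals the number of trailing zero bits.
 * __builtin_ctz is a GCC/Clang builtin (undefined for 0, which the
 * assertion rules out). */
static inline uint32_t log2_pow2(uint32_t a)
{
    assert(is_pow2_u32(a));
    return (uint32_t)__builtin_ctz(a);
}
```

Unlike the loop-based version, this form rejects inputs that are not powers of two instead of silently rounding down.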


r~




* [PATCH v2 2/2] tests/tcg/riscv64: Add test for vlsseg8e32 instruction
@ 2025-08-16  0:29   ` Chao Liu
  0 siblings, 0 replies; 8+ messages in thread
From: Chao Liu @ 2025-08-15 19:55 UTC (permalink / raw)
  To: paolo.savini, dbarboza, ebiggers, palmer, alistair.francis,
	liwei1518, zhiwei_liu
  Cc: qemu-riscv, qemu-devel, Chao Liu

This test case copies 64 bytes from a0 to a1 using vlsseg8e32.
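
For readers unfamiliar with segment loads, the expected result of the test can be modelled in plain C. This is only a sketch of the architectural behaviour, not of QEMU's translation code: vregs_t, model_vlsseg8e32, model_vssseg8e32 and model_copy64 are hypothetical names, and VLMAX=4 assumes VLEN=128 with SEW=32 and LMUL=1.

```c
#include <stdint.h>
#include <string.h>

#define NF    8   /* fields per segment (the "seg8" in vlsseg8e32) */
#define VLMAX 4   /* assumption: VLEN=128, SEW=32, LMUL=1 */

/* Model register file: 32 vector registers of VLMAX 32-bit elements. */
typedef struct {
    uint32_t v[32][VLMAX];
} vregs_t;

/* Strided segment load: field k of segment i is read from
 * base + i * stride + k * 4 and written to element i of register vd + k. */
static void model_vlsseg8e32(vregs_t *r, unsigned vd, const uint8_t *base,
                             size_t stride, unsigned vl)
{
    for (unsigned i = 0; i < vl; i++) {
        for (unsigned k = 0; k < NF; k++) {
            memcpy(&r->v[vd + k][i], base + i * stride + k * 4, 4);
        }
    }
}

/* Strided segment store: the inverse mapping of the load above. */
static void model_vssseg8e32(const vregs_t *r, unsigned vd, uint8_t *base,
                             size_t stride, unsigned vl)
{
    for (unsigned i = 0; i < vl; i++) {
        for (unsigned k = 0; k < NF; k++) {
            memcpy(base + i * stride + k * 4, &r->v[vd + k][i], 4);
        }
    }
}

/* Mirrors rvv_test_func with vl = 1 and stride = 64: two 32-byte
 * segments go through v0..v7 and v8..v15. Returns 0 when dst == src. */
static int model_copy64(const uint8_t *src, uint8_t *dst)
{
    vregs_t r;
    memset(&r, 0, sizeof(r));
    model_vlsseg8e32(&r, 0, src, 64, 1);
    model_vlsseg8e32(&r, 8, src + 32, 64, 1);
    model_vssseg8e32(&r, 0, dst, 64, 1);
    model_vssseg8e32(&r, 8, dst + 32, 64, 1);
    return memcmp(src, dst, 64);
}
```

With vl = 1 each instruction touches only element 0 of eight consecutive registers, so the two loads together cover 64 contiguous bytes and the stored copy should match the source byte for byte, which is what the compare loop in the test checks.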

Signed-off-by: Chao Liu <chao.liu@yeah.net>
---
 tests/tcg/riscv64/Makefile.softmmu-target |   8 +-
 tests/tcg/riscv64/test-vlsseg8e32.S       | 108 ++++++++++++++++++++++
 2 files changed, 114 insertions(+), 2 deletions(-)
 create mode 100644 tests/tcg/riscv64/test-vlsseg8e32.S

diff --git a/tests/tcg/riscv64/Makefile.softmmu-target b/tests/tcg/riscv64/Makefile.softmmu-target
index 7c1d44d3f4..c3c5b66713 100644
--- a/tests/tcg/riscv64/Makefile.softmmu-target
+++ b/tests/tcg/riscv64/Makefile.softmmu-target
@@ -7,18 +7,22 @@ VPATH += $(TEST_SRC)
 
 LINK_SCRIPT = $(TEST_SRC)/semihost.ld
 LDFLAGS = -T $(LINK_SCRIPT)
-CFLAGS += -g -Og
+CFLAGS += -march=rv64gcv -mabi=lp64d -g -Og
 
 %.o: %.S
 	$(CC) $(CFLAGS) $< -Wa,--noexecstack -c -o $@
 %: %.o $(LINK_SCRIPT)
 	$(LD) $(LDFLAGS) $< -o $@
 
-QEMU_OPTS += -M virt -display none -semihosting -device loader,file=
+QEMU_OPTS += -M virt -cpu rv64,v=true -display none -semihosting -device loader,file=
 
 EXTRA_RUNS += run-issue1060
 run-issue1060: issue1060
 	$(call run-test, $<, $(QEMU) $(QEMU_OPTS)$<)
 
+EXTRA_RUNS += run-vlsseg8e32
+run-vlsseg8e32: test-vlsseg8e32
+	$(call run-test, $<, $(QEMU) $(QEMU_OPTS)$<)
+
 # We don't currently support the multiarch system tests
 undefine MULTIARCH_TESTS
diff --git a/tests/tcg/riscv64/test-vlsseg8e32.S b/tests/tcg/riscv64/test-vlsseg8e32.S
new file mode 100644
index 0000000000..2861ff3702
--- /dev/null
+++ b/tests/tcg/riscv64/test-vlsseg8e32.S
@@ -0,0 +1,108 @@
+#
+# QEMU RISC-V Vector Strided Load Instruction testcase
+#
+# Copyright (c) 2025 Chao Liu chao.liu@yeah.net
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+
+	.option	norvc
+
+	.section .data
+	.align 4
+source_data:
+	.asciz "Test the vlsseg8e32 insn by copy 64b and verifying correctness."
+	.equ source_len, 64
+
+	.text
+	.global _start
+_start:
+	lla	t0, trap
+	csrw	mtvec, t0
+
+enable_rvv:
+
+	li	x15, 0x800000000024112d
+	csrw	0x301, x15
+	li	x1, 0x2200
+	csrr	x2, mstatus
+	or	x2, x2, x1
+	csrw	mstatus, x2
+
+rvv_test_func:
+	la	a0, source_data
+	li	a1, 0x80020000
+	vsetivli	zero, 1, e32, m1, ta, ma
+	li	t0, 64
+
+	vlsseg8e32.v	v0, (a0), t0
+	addi	a0, a0, 32
+	vlsseg8e32.v	v8, (a0), t0
+
+	vssseg8e32.v	v0, (a1), t0
+	addi	a1, a1, 32
+	vssseg8e32.v	v8, (a1), t0
+
+compare_start:
+	la	a0, source_data
+	li	a1, 0x80020000
+	li	t0, 0
+	li	t1, source_len
+
+compare_loop:
+	# when t0 >= len, compare end
+	bge	 t0, t1, compare_done
+
+	lb	t2, 0(a0)
+	lb	t3, 0(a1)
+	bne	t2, t3, compare_fail
+
+	addi	a0, a0, 1
+	addi	a1, a1, 1
+	addi	t0, t0, 1
+	j	compare_loop
+
+compare_done:
+	# compare ok, return 0
+	li	a0, 0
+	j	_exit
+
+compare_fail:
+	# compare failed, return 2
+	li	a0, 2
+	j	_exit
+
+trap:
+	# When an instruction traps, compare it to the insn in memory.
+	csrr	t0, mepc
+	csrr	t1, mtval
+	lwu	t2, 0(t0)
+	bne	t1, t2, fail
+
+	# Skip the insn and continue.
+	addi	t0, t0, 4
+	csrw	mepc, t0
+	mret
+
+fail:
+	li	a0, 1
+
+# Exit code in a0
+_exit:
+	lla	a1, semiargs
+	li	t0, 0x20026	# ADP_Stopped_ApplicationExit
+	sd	t0, 0(a1)
+	sd	a0, 8(a1)
+	li	a0, 0x20	# TARGET_SYS_EXIT_EXTENDED
+
+	# Semihosting call sequence
+	.balign	16
+	slli	zero, zero, 0x1f
+	ebreak
+	srai	zero, zero, 0x7
+	j	.
+
+	.data
+	.balign	16
+semiargs:
+	.space	16
-- 
2.50.1




* [PATCH v2 2/2] tests/tcg/riscv64: Add test for vlsseg8e32 instruction
@ 2025-08-16  0:29   ` Chao Liu
  0 siblings, 0 replies; 8+ messages in thread
From: Chao Liu @ 2025-08-16  0:29 UTC (permalink / raw)
  To: paolo.savini, dbarboza, ebiggers, palmer, alistair.francis,
	liwei1518, zhiwei_liu
  Cc: qemu-riscv, qemu-devel, Chao Liu

This test case copies 64 bytes from a0 to a1 using vlsseg8e32.

Signed-off-by: Chao Liu <chao.liu@yeah.net>
---
 tests/tcg/riscv64/Makefile.softmmu-target |   8 +-
 tests/tcg/riscv64/test-vlsseg8e32.S       | 108 ++++++++++++++++++++++
 2 files changed, 114 insertions(+), 2 deletions(-)
 create mode 100644 tests/tcg/riscv64/test-vlsseg8e32.S

diff --git a/tests/tcg/riscv64/Makefile.softmmu-target b/tests/tcg/riscv64/Makefile.softmmu-target
index 7c1d44d3f4..c3c5b66713 100644
--- a/tests/tcg/riscv64/Makefile.softmmu-target
+++ b/tests/tcg/riscv64/Makefile.softmmu-target
@@ -7,18 +7,22 @@ VPATH += $(TEST_SRC)
 
 LINK_SCRIPT = $(TEST_SRC)/semihost.ld
 LDFLAGS = -T $(LINK_SCRIPT)
-CFLAGS += -g -Og
+CFLAGS += -march=rv64gcv -mabi=lp64d -g -Og
 
 %.o: %.S
 	$(CC) $(CFLAGS) $< -Wa,--noexecstack -c -o $@
 %: %.o $(LINK_SCRIPT)
 	$(LD) $(LDFLAGS) $< -o $@
 
-QEMU_OPTS += -M virt -display none -semihosting -device loader,file=
+QEMU_OPTS += -M virt -cpu rv64,v=true -display none -semihosting -device loader,file=
 
 EXTRA_RUNS += run-issue1060
 run-issue1060: issue1060
 	$(call run-test, $<, $(QEMU) $(QEMU_OPTS)$<)
 
+EXTRA_RUNS += run-vlsseg8e32
+run-vlsseg8e32: test-vlsseg8e32
+	$(call run-test, $<, $(QEMU) $(QEMU_OPTS)$<)
+
 # We don't currently support the multiarch system tests
 undefine MULTIARCH_TESTS
diff --git a/tests/tcg/riscv64/test-vlsseg8e32.S b/tests/tcg/riscv64/test-vlsseg8e32.S
new file mode 100644
index 0000000000..2861ff3702
--- /dev/null
+++ b/tests/tcg/riscv64/test-vlsseg8e32.S
@@ -0,0 +1,108 @@
+#
+# QEMU RISC-V Vector Strided Load Instruction testcase
+#
+# Copyright (c) 2025 Chao Liu chao.liu@yeah.net
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+
+	.option	norvc
+
+	.section .data
+	.align 4
+source_data:
+	.asciz "Test the vlsseg8e32 insn by copy 64b and verifying correctness."
+	.equ source_len, 64
+
+	.text
+	.global _start
+_start:
+	lla	t0, trap
+	csrw	mtvec, t0
+
+enable_rvv:
+
+	li	x15, 0x800000000024112d
+	csrw	0x301, x15
+	li	x1, 0x2200
+	csrr	x2, mstatus
+	or	x2, x2, x1
+	csrw	mstatus, x2
+
+rvv_test_func:
+	la	a0, source_data
+	li	a1, 0x80020000
+	vsetivli	zero, 1, e32, m1, ta, ma
+	li	t0, 64
+
+	vlsseg8e32.v	v0, (a0), t0
+	addi	a0, a0, 32
+	vlsseg8e32.v	v8, (a0), t0
+
+	vssseg8e32.v	v0, (a1), t0
+	addi	a1, a1, 32
+	vssseg8e32.v	v8, (a1), t0
+
+compare_start:
+	la	a0, source_data
+	li	a1, 0x80020000
+	li	t0, 0
+	li	t1, source_len
+
+compare_loop:
+	# when t0 >= len, compare end
+	bge	 t0, t1, compare_done
+
+	lb	t2, 0(a0)
+	lb	t3, 0(a1)
+	bne	t2, t3, compare_fail
+
+	addi	a0, a0, 1
+	addi	a1, a1, 1
+	addi	t0, t0, 1
+	j	compare_loop
+
+compare_done:
+	# compare ok, return 0
+	li	a0, 0
+	j	_exit
+
+compare_fail:
+	# compare failed, return 2
+	li	a0, 2
+	j	_exit
+
+trap:
+	# When an instruction traps, compare it to the insn in memory.
+	csrr	t0, mepc
+	csrr	t1, mtval
+	lwu	t2, 0(t0)
+	bne	t1, t2, fail
+
+	# Skip the insn and continue.
+	addi	t0, t0, 4
+	csrw	mepc, t0
+	mret
+
+fail:
+	li	a0, 1
+
+# Exit code in a0
+_exit:
+	lla	a1, semiargs
+	li	t0, 0x20026	# ADP_Stopped_ApplicationExit
+	sd	t0, 0(a1)
+	sd	a0, 8(a1)
+	li	a0, 0x20	# TARGET_SYS_EXIT_EXTENDED
+
+	# Semihosting call sequence
+	.balign	16
+	slli	zero, zero, 0x1f
+	ebreak
+	srai	zero, zero, 0x7
+	j	.
+
+	.data
+	.balign	16
+semiargs:
+	.space	16
-- 
2.50.1




end of thread, other threads:[~2025-08-16  6:45 UTC | newest]

Thread overview: 8+ messages
2025-08-15 19:55 [PATCH v2 0/2] target/riscv: fix vector LD/ST instruction Chao Liu
2025-08-16  0:29 ` Chao Liu
2025-08-15 19:55 ` [PATCH v2 1/2] target/riscv: fix vector register address calculation in strided LD/ST Chao Liu
2025-08-16  0:29   ` Chao Liu
2025-08-15 22:07   ` Eric Biggers
2025-08-16  6:44   ` Richard Henderson
2025-08-15 19:55 ` [PATCH v2 2/2] tests/tcg/riscv64: Add test for vlsseg8e32 instruction Chao Liu
2025-08-16  0:29   ` Chao Liu
