qemu-devel.nongnu.org archive mirror
* [PATCH v1 0/2] target/riscv: fix vector register address calculation in strided LD/ST
@ 2025-08-15 19:37 Chao Liu
  2025-08-15 19:37 ` [PATCH v1 1/2] " Chao Liu
  2025-08-15 19:37 ` [PATCH v1 2/2] tests/tcg/riscv64: Add test for vlsseg8e32 instruction Chao Liu
From: Chao Liu @ 2025-08-15 19:37 UTC
  To: paolo.savini, dbarboza, ebiggers, palmer, alistair.francis,
	liwei1518, zhiwei_liu
  Cc: qemu-riscv, qemu-devel, Chao Liu

Hi Paolo, Eric, Daniel,

I have attempted to fix this issue. Thanks to Eric for providing the test case.

This series fixes a critical bug in the RISC-V vector instruction translation
that caused incorrect data handling in strided load operations
(e.g., vlsseg8e32).

#### Problem Description

The `get_log2` function in `trans_rvv.c.inc` returned a value one higher than
the true base-2 logarithm. For example, `get_log2(4)` returned 3 instead of 2.

This led to wrong vector register offset calculations. In the ChaCha20
encryption code the result was overlapping data: bytes 32-47 were copied to
positions 16-31. The following sequence exercises the affected instructions:

rvv_test_func:
    vsetivli    zero, 1, e32, m1, ta, ma
    li          t0, 64

    vlsseg8e32.v v0, (a0), t0
    addi        a0, a0, 32
    vlsseg8e32.v v8, (a0), t0

    vssseg8e32.v v0, (a1), t0
    addi        a1, a1, 32
    vssseg8e32.v v8, (a1), t0
    ret
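
For reference, the result this sequence should produce can be sketched in
plain C. This is a rough model based on the RVV segment load/store semantics,
not on QEMU's translation code; `vregs_model` and the `model_*` functions are
hypothetical names. With vl = 1 only element 0 of each register is touched, so
the stride in t0 never comes into play and the function should copy 64 bytes
from a0 to a1 unchanged:

```c
#include <stdint.h>
#include <string.h>

enum { NF = 8, EEW_BYTES = 4, NREGS = 16 };

/* Hypothetical simplified register file: element 0 of v0..v15 (vl = 1). */
typedef struct {
    uint32_t elem0[NREGS];
} vregs_model;

/* vlsseg8e32.v vd, (base), stride with vl = 1: field f goes to v[vd + f]. */
static void model_vlsseg8e32(vregs_model *v, int vd, const uint8_t *base)
{
    for (int f = 0; f < NF; f++) {
        memcpy(&v->elem0[vd + f], base + f * EEW_BYTES, EEW_BYTES);
    }
}

/* vssseg8e32.v vs, (base), stride with vl = 1: field f comes from v[vs + f]. */
static void model_vssseg8e32(const vregs_model *v, int vs, uint8_t *base)
{
    for (int f = 0; f < NF; f++) {
        memcpy(base + f * EEW_BYTES, &v->elem0[vs + f], EEW_BYTES);
    }
}

/* Mirrors rvv_test_func: two segment loads, then two segment stores. */
static void model_rvv_test_func(const uint8_t *a0, uint8_t *a1)
{
    vregs_model v = {0};

    model_vlsseg8e32(&v, 0, a0);        /* v0..v7  <- bytes  0..31 */
    model_vlsseg8e32(&v, 8, a0 + 32);   /* v8..v15 <- bytes 32..63 */
    model_vssseg8e32(&v, 0, a1);
    model_vssseg8e32(&v, 8, a1 + 32);
}
```

Any difference between the source and destination buffers after the call
points at a register addressing error of the kind described above.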

#### Root Cause Analysis

The original implementation counted the number of right shifts until zero,
including the final shift that reduced the value to zero:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    for (; a > 0;) {
        a >>= 1;
        i++;
    }
    return i; // Returns 3 for a=4 (0b100 → 0b10 → 0b1 → 0b0)
}

#### Fix Implementation

The corrected function stops shifting when only the highest bit remains and
handles the special case of a=0:

static inline uint32_t get_log2(uint32_t a)
{
    uint32_t i = 0;
    if (a == 0) {
        return i; // Handle edge case
    }
    for (; a > 1; a >>= 1) {
        i++;
    }
    return i; // Now returns 2 for a=4
}
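
The difference between the two versions can be checked in isolation. The
sketch below copies both loops under the hypothetical names `get_log2_old` and
`get_log2_new` (the patch keeps the single name `get_log2`) and adds a
hypothetical helper, `check_get_log2`, that compares them on every 32-bit
power of two:

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch loop: also counts the shift that takes the value to zero. */
static inline uint32_t get_log2_old(uint32_t a)
{
    uint32_t i = 0;
    for (; a > 0;) {
        a >>= 1;
        i++;
    }
    return i;
}

/* Post-patch loop: stops once only the highest set bit is left. */
static inline uint32_t get_log2_new(uint32_t a)
{
    uint32_t i = 0;
    if (a == 0) {
        return i;
    }
    for (; a > 1; a >>= 1) {
        i++;
    }
    return i;
}

/* Hypothetical self-check, not part of the patch. */
static void check_get_log2(void)
{
    for (uint32_t k = 0; k < 32; k++) {
        uint32_t a = UINT32_C(1) << k;
        assert(get_log2_new(a) == k);       /* exact log2 */
        assert(get_log2_old(a) == k + 1);   /* off by one */
    }
    /* Both versions return 0 for an input of 0. */
    assert(get_log2_old(0) == 0);
    assert(get_log2_new(0) == 0);
}
```

For inputs that are not powers of two, the new loop returns the floor of the
base-2 logarithm (for example, 2 for a=5).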

#### Testing

This fix has been verified with:
    1. The provided ChaCha20 vector optimization test case
    2. RVV strided load instruction tests in `test-vlsseg8e32.S`

All tests now pass with correct data handling and no memory overlap.


To run the tests, use the following commands:

    ./configure --target-list=riscv64-softmmu \
                --cross-prefix-riscv64=riscv64-unknown-elf-

    ninja -j$(nproc) -C build && make check-tcg

Expected result:

    BUILD   riscv64-softmmu guest-tests
    RUN     riscv64-softmmu guest-tests
    TEST    issue1060 on riscv64
    TEST    test-vlsseg8e32 on riscv64


Best regards,

Chao

Chao Liu (2):
  target/riscv: fix vector register address calculation in strided LD/ST
  tests/tcg/riscv64: Add test for vlsseg8e32 instruction

 target/riscv/insn_trans/trans_rvv.c.inc   |   5 +-
 tests/tcg/riscv64/Makefile.softmmu-target |   8 +-
 tests/tcg/riscv64/test-vlsseg8e32.S       | 108 ++++++++++++++++++++++
 3 files changed, 118 insertions(+), 3 deletions(-)
 create mode 100644 tests/tcg/riscv64/test-vlsseg8e32.S

-- 
2.50.1


