qemu-devel.nongnu.org archive mirror
* [RFC PATCH v2 0/6] Improve the performance of RISC-V vector unit-stride/whole register ld/st instructions
@ 2024-05-31 17:44 Max Chou
From: Max Chou @ 2024-05-31 17:44 UTC
  To: qemu-devel, qemu-riscv; +Cc: dbarboza, Max Chou

Hi,

This RFC patch set attempts to fix the issue reported in
https://gitlab.com/qemu-project/qemu/-/issues/2137.

In this new version, we added patches that load/store more data at a
time for some of the vector contiguous load/store (unit-stride/whole
register) instructions, under some assumptions (e.g. no masking, no tail
agnostic, virtual address resolution performed once for the entire
vector), as suggested by Richard Henderson.
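
As a rough illustration of the idea above (not the code in this series),
the sketch below contrasts a per-element byte load with a bulk copy that
is only taken when no mask is supplied. All names (load_byte,
vle8_per_element, vle8_bulk, vle8) are hypothetical, and guest memory is
modeled as a flat host array, which sidesteps address translation, page
crossing and fault handling entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a guest-memory accessor: one byte per call. */
static uint8_t load_byte(const uint8_t *mem, size_t addr)
{
    return mem[addr];
}

/*
 * Slow path: one access per element. A NULL mask means "unmasked";
 * otherwise bit i of the mask decides whether element i is loaded.
 * Inactive elements are left undisturbed.
 */
static void vle8_per_element(uint8_t *vd, const uint8_t *mem, size_t base,
                             const uint8_t *mask, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        if (mask == NULL || ((mask[i / 8] >> (i % 8)) & 1)) {
            vd[i] = load_byte(mem, base + i);
        }
    }
}

/*
 * Fast path: copy all vl bytes at once. This assumes the caller has
 * already established that the access is unmasked and that the whole
 * range [base, base + vl) is valid and contiguous in host memory.
 */
static void vle8_bulk(uint8_t *vd, const uint8_t *mem, size_t base, size_t vl)
{
    assert(vd != NULL && mem != NULL);
    memcpy(vd, mem + base, vl);
}

/* Dispatcher: take the bulk path only when its assumptions hold. */
static void vle8(uint8_t *vd, const uint8_t *mem, size_t base,
                 const uint8_t *mask, size_t vl)
{
    if (mask == NULL) {
        vle8_bulk(vd, mem, base, vl);
    } else {
        vle8_per_element(vd, mem, base, mask, vl);
    }
}
```

In this model both paths produce the same destination bytes for an
unmasked access; the bulk path simply replaces vl accessor calls with a
single copy.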

This version reduces the run time of the test case provided in
https://gitlab.com/qemu-project/qemu/-/issues/2137#note_1757501369
from ~13.5 sec to ~1.5 sec in QEMU user mode.

PS: This RFC patch set only covers the vle8.v/vse8.v/vl8re8.v/vs8r.v
instructions. The next version will try to cover the other instructions.

This series is based on the riscv-to-apply.next branch (commit 1806da7).

Max Chou (6):
  target/riscv: Separate vector segment ld/st instructions
  accel/tcg: Avoid unnecessary call overhead from
    qemu_plugin_vcpu_mem_cb
  target/riscv: Inline vext_ldst_us and corresponding function for
    performance
  target/riscv: Add check_probe_[read|write] helper functions
  target/riscv: rvv: Optimize v[l|s]e8.v with limitations
  target/riscv: rvv: Optimize vl8re8.v/vs8r.v with limitations

 accel/tcg/ldst_common.c.inc             |   8 +-
 target/riscv/helper.h                   |   8 +
 target/riscv/insn32.decode              |  11 +-
 target/riscv/insn_trans/trans_rvv.c.inc | 454 +++++++++++++++++++++++-
 target/riscv/vector_helper.c            | 142 ++++++--
 5 files changed, 591 insertions(+), 32 deletions(-)

-- 
2.34.1


Thread overview: 10+ messages
2024-05-31 17:44 [RFC PATCH v2 0/6] Improve the performance of RISC-V vector unit-stride/whole register ld/st instructions Max Chou
2024-05-31 17:44 ` [RFC PATCH v2 1/6] target/riscv: Separate vector segment " Max Chou
2024-05-31 17:44 ` [RFC PATCH v2 2/6] accel/tcg: Avoid unnecessary call overhead from qemu_plugin_vcpu_mem_cb Max Chou
2024-05-31 17:44 ` [RFC PATCH v2 3/6] target/riscv: Inline vext_ldst_us and corresponding function for performance Max Chou
2024-05-31 17:44 ` [RFC PATCH v2 4/6] target/riscv: Add check_probe_[read|write] helper functions Max Chou
2024-05-31 17:44 ` [RFC PATCH v2 5/6] target/riscv: rvv: Optimize v[l|s]e8.v with limitations Max Chou
2024-06-02 17:45   ` Richard Henderson
2024-06-03 15:50     ` Max Chou
2024-06-04  0:58       ` Richard Henderson
2024-05-31 17:44 ` [RFC PATCH v2 6/6] target/riscv: rvv: Optimize vl8re8.v/vs8r.v " Max Chou