qemu-devel.nongnu.org archive mirror
* [PATCH v2 0/7] Add LoongArch v1.1 instructions
@ 2025-11-19 12:24 Jiajie Chen
From: Jiajie Chen @ 2025-11-19 12:24 UTC
  To: qemu-devel; +Cc: richard.henderson, gaosong, git, Jiajie Chen

The latest revision of the LoongArch ISA is available at
https://www.loongson.cn/uploads/images/2023102309132647981.%E9%BE%99%E8%8A%AF%E6%9E%B6%E6%9E%84%E5%8F%82%E8%80%83%E6%89%8B%E5%86%8C%E5%8D%B7%E4%B8%80_r1p10.pdf
(Chinese only). The revision includes the following updates:

- estimated fp reciprocal instructions: frecip -> frecipe, frsqrt ->
  frsqrte
- 128-bit store-conditional instruction: sc.q
- ll.w/d with acquire semantics: llacq.w/d; sc.w/d with release
  semantics: screl.w/d
- compare-and-swap instructions: amcas[_db].{b/h/w/d}
- byte- and halfword-wide amswap/add instructions:
  am{swap/add}[_db].{b/h}
- new definitions for dbar hints
- clarified behavior of 32-bit division instructions
- clarified load ordering when accessing the same address
- message-signaled interrupts
- hardware page table walker
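
As an illustration of the compare-and-swap item above, here is a minimal
portable sketch of the semantics as I understand them: memory is
compared with an expected value, replaced by a new value only on a
match, and the old memory value is returned either way. It uses C11
atomics so it runs on any host. The helper name model_amcas_d is
hypothetical; this is not the QEMU implementation and not the
instruction encoding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical model of a 64-bit compare-and-swap.
 * Returns the value that was in memory before the operation.
 */
static uint64_t model_amcas_d(_Atomic uint64_t *addr, uint64_t expected,
                              uint64_t desired)
{
    /*
     * On success, memory becomes `desired` and `expected` is unchanged
     * (it already equals the old value). On failure, the current memory
     * value is copied into `expected`. Either way `expected` ends up
     * holding the old memory value.
     */
    atomic_compare_exchange_strong(addr, &expected, desired);
    return expected;
}

/* Small self-check; call it from main() to try the model. */
void model_amcas_selfcheck(void)
{
    _Atomic uint64_t mem = 5;

    assert(model_amcas_d(&mem, 5, 9) == 5);   /* match: swap happens */
    assert(atomic_load(&mem) == 9);
    assert(model_amcas_d(&mem, 5, 1) == 9);   /* mismatch: no store */
    assert(atomic_load(&mem) == 9);
}
```

A guest test for the real instruction would compare its result against a
model like this one.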

The new revision is implemented in the Loongson 3A6000 processor.

This patch series implements all the new instructions. v1 of the series
can be found at
https://patchew.org/QEMU/20231023153029.269211-2-c@jia.je/.

A simple test case for the new fp and sc.q instructions:

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

void test_fp() {
  float a = 3.0;
  float b;

  asm volatile("frecip.s %0, %1" : "=f"(b) : "f"(a));
  printf("frecip: %f\n", b);
  asm volatile("frecipe.s %0, %1" : "=f"(b) : "f"(a));
  printf("frecipe: %f\n", b);

  asm volatile("frsqrt.s %0, %1" : "=f"(b) : "f"(a));
  printf("frsqrt: %f\n", b);
  asm volatile("frsqrte.s %0, %1" : "=f"(b) : "f"(a));
  printf("frsqrte: %f\n", b);
}

uint64_t rand64() { return ((uint64_t)rand() << 32) | rand(); }

void test_sc_q() {
  __int128 val = rand64();
  val = (val << 64) | rand64();
  __int128 *ptr = &val;
  uint64_t add_lo = rand64();
  uint64_t add_hi = rand64();
  __int128 add = add_hi;
  add = (add << 64) | add_lo;
  __int128 expect = val + add;
  int res = 0;

  asm volatile("ll.d $t1, %1, 0\nld.d $t2, %1, 8\nadd.d $t1, $t1, %2\nadd.d "
               "$t2, $t2, %3\nsc.q $t1, $t2, %1\nmove %0, $t1"
               : "=r"(res), "+r"(ptr)
               : "r"(add_lo), "r"(add_hi)
               : "$t1", "$t2", "memory");
  assert(res == 1);
  assert(val == expect);

  // Modify the low half between ll.d and sc.q; sc.q is expected to fail
  res = 1;
  asm volatile("ll.d $t1, %1, 0\nld.d $t2, %1, 8\naddi.d $t1, $t1, 1\nst.d "
               "$t1, %1, 0\nsc.q $t1, $t2, %1\nmove %0, $t1"
               : "=r"(res), "+r"(ptr)
               :
               : "$t1", "$t2", "memory");
  assert(res == 0);

  // Modify the high half between ll.d and sc.q; sc.q is expected to fail
  res = 1;
  asm volatile("ll.d $t1, %1, 0\nld.d $t2, %1, 8\naddi.d $t2, $t2, 1\nst.d "
               "$t2, %1, 8\nsc.q $t1, $t2, %1\nmove %0, $t1"
               : "=r"(res), "+r"(ptr)
               :
               : "$t1", "$t2", "memory");
  assert(res == 0);

  printf("SC.Q passed\n");
}

int main(int argc, char *argv[]) {
  test_fp();
  test_sc_q();
  return 0;
}
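
One subtlety in test_sc_q: the asm adds the low and high halves
independently (two add.d instructions, no carry), while the expected
value is computed with a full 128-bit addition. The two agree only when
the low-half addition does not overflow. With rand64() as written, and
assuming rand() returns at most 2^31 - 1 (as with glibc's RAND_MAX), each
64-bit value is below 2^63, so no carry can occur. The portable sketch
below models both additions to show the difference. The type and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word model of a 128-bit value. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_model;

/* What the test's asm computes: each half is added on its own. */
static u128_model add_halves(u128_model v, uint64_t add_lo, uint64_t add_hi)
{
    u128_model r = { v.lo + add_lo, v.hi + add_hi };
    return r;
}

/* Full 128-bit addition: the carry out of the low half is propagated. */
static u128_model add_full(u128_model v, uint64_t add_lo, uint64_t add_hi)
{
    u128_model r;
    r.lo = v.lo + add_lo;
    uint64_t carry = r.lo < v.lo;   /* unsigned wrap-around means carry */
    r.hi = v.hi + add_hi + carry;
    return r;
}

static int same(u128_model a, u128_model b)
{
    return a.lo == b.lo && a.hi == b.hi;
}

/* Small self-check; call it from main() to try the model. */
void add_model_selfcheck(void)
{
    /* Both low halves below 2^63: no carry, results agree. */
    u128_model v = { 0x7fffffff7fffffffULL, 1 };
    assert(same(add_halves(v, 0x7fffffff7fffffffULL, 2),
                add_full(v, 0x7fffffff7fffffffULL, 2)));

    /* Low half overflows: the results differ in the high half. */
    u128_model w = { UINT64_MAX, 1 };
    assert(!same(add_halves(w, 1, 0), add_full(w, 1, 0)));
}
```

If the random inputs were ever widened to the full 64-bit range, the asm
would need to propagate the carry, or the expected value would need to
be computed per half.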

Compile and run with:

loongarch64-linux-gnu-gcc test.c -o test -static && ./qemu-loongarch64 -cpu max test
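
test_fp only prints the results. Because frecipe and frsqrte are
estimates, their output may differ slightly from frecip and frsqrt on
hardware, while an emulator may well return the exact value. A check
with a relative tolerance accepts both. The sketch below is a portable
helper for such a check. The name rel_err_within is hypothetical, and
the 2^-14 bound used in the example is my reading of the manual, so
treat it as an assumption and adjust it if the specification says
otherwise.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns 1 if `approx` is within relative error
 * `tol` of `exact`, i.e. |approx - exact| <= tol * |exact|.
 */
static int rel_err_within(double approx, double exact, double tol)
{
    double diff = approx - exact;
    if (diff < 0) {
        diff = -diff;
    }
    double mag = exact < 0 ? -exact : exact;
    return diff <= tol * mag;
}

/* Small self-check; call it from main() to try the helper. */
void rel_err_selfcheck(void)
{
    const double tol = 1.0 / 16384;   /* 2^-14, assumed bound */

    assert(rel_err_within(0.33334, 1.0 / 3.0, tol));   /* close enough */
    assert(!rel_err_within(0.34, 1.0 / 3.0, tol));     /* too far off */
}
```

In the test case, the frecipe.s result b for input a could be checked
with rel_err_within(b, 1.0 / a, tol) in place of the printf.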

Jiajie Chen (7):
  target/loongarch: Require atomics to be aligned
  target/loongarch: Add am{swap/add}[_db].{b/h}
  target/loongarch: Add amcas[_db].{b/h/w/d}
  target/loongarch: Add estimated reciprocal instructions
  target/loongarch: Add llacq/screl instructions
  target/loongarch: Add sc.q instructions
  target/loongarch: Add LA v1.1 instructions to max cpu

 target/loongarch/cpu.c                        |  11 +-
 target/loongarch/cpu.h                        |   7 +
 target/loongarch/disas.c                      |  33 ++++
 target/loongarch/insns.decode                 |  34 ++++
 .../tcg/insn_trans/trans_atomic.c.inc         | 145 ++++++++++++++++--
 .../tcg/insn_trans/trans_farith.c.inc         |   4 +
 .../tcg/insn_trans/trans_memory.c.inc         |  22 +++
 .../loongarch/tcg/insn_trans/trans_vec.c.inc  |   8 +
 target/loongarch/tcg/translate.c              |   6 +-
 target/loongarch/translate.h                  |  30 ++--
 10 files changed, 280 insertions(+), 20 deletions(-)

-- 
2.51.0


