qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/16] make system memory API available for common code
@ 2025-03-10  4:58 Pierrick Bouvier
  2025-03-10  4:58 ` [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h Pierrick Bouvier
                   ` (16 more replies)
  0 siblings, 17 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

The main goal of this series is to be able to call any memory ld/st function
from code that is *not* target dependent. As a positive side effect, we can
turn related system compilation units into common code.

The first 6 patches remove dependency of memory API to cpu headers and remove
dependency to target specific code. This could be a series on its own, but it's
great to be able to turn system memory compilation units into common code to
make sure it can't regress, and prove it achieves the desired result.

The next patches remove more dependencies on cpu headers (exec-all,
memory-internal, ram_addr).
Then, we add access to a needed function from kvm, some xen stubs, and we
finally can turn our compilation units into common code.

Every commit was tested to build correctly for all targets (on windows, linux,
macos), and the series was fully tested by running all tests we have (linux,
x86_64 host).

Pierrick Bouvier (16):
  exec/memory_ldst: extract memory_ldst declarations from cpu-all.h
  exec/memory_ldst_phys: extract memory_ldst_phys declarations from
    cpu-all.h
  include: move target_words_bigendian() from tswap to bswap
  exec/memory.h: make devend_memop target agnostic
  qemu/bswap: implement {ld,st}.*_p as functions
  exec/cpu-all.h: we can now remove ld/st macros
  codebase: prepare to remove cpu.h from exec/exec-all.h
  exec/exec-all: remove dependency on cpu.h
  exec/memory-internal: remove dependency on cpu.h
  exec/ram_addr: remove dependency on cpu.h
  system/kvm: make kvm_flush_coalesced_mmio_buffer() accessible for
    common code
  exec/ram_addr: call xen_hvm_modified_memory only if xen is enabled
  hw/xen: add stubs for various functions
  system/physmem: compilation unit is now common to all targets
  system/memory: make compilation unit common
  system/ioport: make compilation unit common

 include/exec/cpu-all.h              | 52 ------------------
 include/exec/exec-all.h             |  1 -
 include/exec/memory-internal.h      |  2 -
 include/exec/memory.h               | 48 ++++++++++++++---
 include/exec/ram_addr.h             | 11 ++--
 include/exec/tswap.h                | 11 ----
 include/qemu/bswap.h                | 82 +++++++++++++++++++++++++++++
 include/system/kvm.h                |  6 +--
 include/tcg/tcg-op.h                |  1 +
 target/ppc/helper_regs.h            |  2 +
 include/exec/memory_ldst.h.inc      | 13 ++---
 include/exec/memory_ldst_phys.h.inc |  5 +-
 hw/ppc/spapr_nested.c               |  1 +
 hw/sh4/sh7750.c                     |  1 +
 hw/xen/xen_stubs.c                  | 56 ++++++++++++++++++++
 page-vary-target.c                  |  3 +-
 system/ioport.c                     |  1 -
 system/memory.c                     | 22 +++++---
 target/riscv/bitmanip_helper.c      |  1 +
 hw/xen/meson.build                  |  3 ++
 system/meson.build                  |  6 +--
 21 files changed, 225 insertions(+), 103 deletions(-)
 create mode 100644 hw/xen/xen_stubs.c

-- 
2.39.5



^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 15:17   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 02/16] exec/memory_ldst_phys: extract memory_ldst_phys " Pierrick Bouvier
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

They are now accessible through exec/memory.h instead, and we make sure
all variants are available for common or target dependent code.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/exec/cpu-all.h         | 12 ------------
 include/exec/memory_ldst.h.inc | 13 +++++--------
 2 files changed, 5 insertions(+), 20 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 8cd6c00cf89..17ea82518a0 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -69,18 +69,6 @@
 
 #include "exec/hwaddr.h"
 
-#define SUFFIX
-#define ARG1         as
-#define ARG1_DECL    AddressSpace *as
-#define TARGET_ENDIANNESS
-#include "exec/memory_ldst.h.inc"
-
-#define SUFFIX       _cached_slow
-#define ARG1         cache
-#define ARG1_DECL    MemoryRegionCache *cache
-#define TARGET_ENDIANNESS
-#include "exec/memory_ldst.h.inc"
-
 static inline void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val)
 {
     address_space_stl_notdirty(as, addr, val,
diff --git a/include/exec/memory_ldst.h.inc b/include/exec/memory_ldst.h.inc
index 92ad74e9560..74519a88de0 100644
--- a/include/exec/memory_ldst.h.inc
+++ b/include/exec/memory_ldst.h.inc
@@ -19,7 +19,8 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
 
-#ifdef TARGET_ENDIANNESS
+uint8_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
 uint16_t glue(address_space_lduw, SUFFIX)(ARG1_DECL,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
 uint32_t glue(address_space_ldl, SUFFIX)(ARG1_DECL,
@@ -28,15 +29,15 @@ uint64_t glue(address_space_ldq, SUFFIX)(ARG1_DECL,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
 void glue(address_space_stl_notdirty, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
+void glue(address_space_stb, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint8_t val, MemTxAttrs attrs, MemTxResult *result);
 void glue(address_space_stw, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint16_t val, MemTxAttrs attrs, MemTxResult *result);
 void glue(address_space_stl, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
 void glue(address_space_stq, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint64_t val, MemTxAttrs attrs, MemTxResult *result);
-#else
-uint8_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
-    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+
 uint16_t glue(address_space_lduw_le, SUFFIX)(ARG1_DECL,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
 uint16_t glue(address_space_lduw_be, SUFFIX)(ARG1_DECL,
@@ -49,8 +50,6 @@ uint64_t glue(address_space_ldq_le, SUFFIX)(ARG1_DECL,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
 uint64_t glue(address_space_ldq_be, SUFFIX)(ARG1_DECL,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
-void glue(address_space_stb, SUFFIX)(ARG1_DECL,
-    hwaddr addr, uint8_t val, MemTxAttrs attrs, MemTxResult *result);
 void glue(address_space_stw_le, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint16_t val, MemTxAttrs attrs, MemTxResult *result);
 void glue(address_space_stw_be, SUFFIX)(ARG1_DECL,
@@ -63,9 +62,7 @@ void glue(address_space_stq_le, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint64_t val, MemTxAttrs attrs, MemTxResult *result);
 void glue(address_space_stq_be, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint64_t val, MemTxAttrs attrs, MemTxResult *result);
-#endif
 
 #undef ARG1_DECL
 #undef ARG1
 #undef SUFFIX
-#undef TARGET_ENDIANNESS
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 02/16] exec/memory_ldst_phys: extract memory_ldst_phys declarations from cpu-all.h
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
  2025-03-10  4:58 ` [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-11  0:08   ` Pierrick Bouvier
  2025-03-10  4:58 ` [PATCH 03/16] include: move target_words_bigendian() from tswap to bswap Pierrick Bouvier
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

They are now accessible through exec/memory.h instead, and we make sure
all variants are available for common or target dependent code.

To allow this, we need to implement address_space_st{*}_cached, simply
forwarding the calls to _cached_slow variants.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/exec/cpu-all.h              | 15 ------------
 include/exec/memory.h               | 36 +++++++++++++++++++++++++++++
 include/exec/memory_ldst_phys.h.inc |  5 +---
 3 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 17ea82518a0..1c2e18f492b 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -75,21 +75,6 @@ static inline void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val
                                MEMTXATTRS_UNSPECIFIED, NULL);
 }
 
-#define SUFFIX
-#define ARG1         as
-#define ARG1_DECL    AddressSpace *as
-#define TARGET_ENDIANNESS
-#include "exec/memory_ldst_phys.h.inc"
-
-/* Inline fast path for direct RAM access.  */
-#define ENDIANNESS
-#include "exec/memory_ldst_cached.h.inc"
-
-#define SUFFIX       _cached
-#define ARG1         cache
-#define ARG1_DECL    MemoryRegionCache *cache
-#define TARGET_ENDIANNESS
-#include "exec/memory_ldst_phys.h.inc"
 #endif
 
 /* page related stuff */
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 78c4e0aec8d..7c20f36a312 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2798,6 +2798,42 @@ static inline void address_space_stb_cached(MemoryRegionCache *cache,
     }
 }
 
+static inline uint16_t address_space_lduw_cached(MemoryRegionCache *cache,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
+{
+    return address_space_lduw_cached_slow(cache, addr, attrs, result);
+}
+
+static inline void address_space_stw_cached(MemoryRegionCache *cache,
+    hwaddr addr, uint16_t val, MemTxAttrs attrs, MemTxResult *result)
+{
+    address_space_stw_cached_slow(cache, addr, val, attrs, result);
+}
+
+static inline uint32_t address_space_ldl_cached(MemoryRegionCache *cache,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
+{
+    return address_space_ldl_cached_slow(cache, addr, attrs, result);
+}
+
+static inline void address_space_stl_cached(MemoryRegionCache *cache,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
+{
+    address_space_stl_cached_slow(cache, addr, val, attrs, result);
+}
+
+static inline uint64_t address_space_ldq_cached(MemoryRegionCache *cache,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
+{
+    return address_space_ldq_cached_slow(cache, addr, attrs, result);
+}
+
+static inline void address_space_stq_cached(MemoryRegionCache *cache,
+    hwaddr addr, uint64_t val, MemTxAttrs attrs, MemTxResult *result)
+{
+    address_space_stq_cached_slow(cache, addr, val, attrs, result);
+}
+
 #define ENDIANNESS   _le
 #include "exec/memory_ldst_cached.h.inc"
 
diff --git a/include/exec/memory_ldst_phys.h.inc b/include/exec/memory_ldst_phys.h.inc
index ecd678610d1..db67de75251 100644
--- a/include/exec/memory_ldst_phys.h.inc
+++ b/include/exec/memory_ldst_phys.h.inc
@@ -19,7 +19,6 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
 
-#ifdef TARGET_ENDIANNESS
 static inline uint16_t glue(lduw_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
 {
     return glue(address_space_lduw, SUFFIX)(ARG1, addr,
@@ -55,7 +54,7 @@ static inline void glue(stq_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t val)
     glue(address_space_stq, SUFFIX)(ARG1, addr, val,
                                     MEMTXATTRS_UNSPECIFIED, NULL);
 }
-#else
+
 static inline uint8_t glue(ldub_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
 {
     return glue(address_space_ldub, SUFFIX)(ARG1, addr,
@@ -139,9 +138,7 @@ static inline void glue(stq_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t va
     glue(address_space_stq_be, SUFFIX)(ARG1, addr, val,
                                        MEMTXATTRS_UNSPECIFIED, NULL);
 }
-#endif
 
 #undef ARG1_DECL
 #undef ARG1
 #undef SUFFIX
-#undef TARGET_ENDIANNESS
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 03/16] include: move target_words_bigendian() from tswap to bswap
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
  2025-03-10  4:58 ` [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h Pierrick Bouvier
  2025-03-10  4:58 ` [PATCH 02/16] exec/memory_ldst_phys: extract memory_ldst_phys " Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 15:21   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 04/16] exec/memory.h: make devend_memop target agnostic Pierrick Bouvier
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

This is needed for next commits (especially when implementing st/ld
primitives which will use this function).
As well, remove reference to TARGET_BIG_ENDIAN, as we are about to
remove this dependency.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/exec/tswap.h | 11 -----------
 include/qemu/bswap.h | 12 ++++++++++++
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/exec/tswap.h b/include/exec/tswap.h
index ecd4faef015..cc662cd8b54 100644
--- a/include/exec/tswap.h
+++ b/include/exec/tswap.h
@@ -10,17 +10,6 @@
 
 #include "qemu/bswap.h"
 
-/**
- * target_words_bigendian:
- * Returns true if the (default) endianness of the target is big endian,
- * false otherwise. Note that in target-specific code, you can use
- * TARGET_BIG_ENDIAN directly instead. On the other hand, common
- * code should normally never need to know about the endianness of the
- * target, so please do *not* use this function unless you know very well
- * what you are doing!
- */
-bool target_words_bigendian(void);
-
 /*
  * If we're in target-specific code, we can hard-code the swapping
  * condition, otherwise we have to do (slower) run-time checks.
diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
index b915835bead..ebf6f9e5f5c 100644
--- a/include/qemu/bswap.h
+++ b/include/qemu/bswap.h
@@ -1,6 +1,8 @@
 #ifndef BSWAP_H
 #define BSWAP_H
 
+#include <stdbool.h>
+
 #undef  bswap16
 #define bswap16(_x) __builtin_bswap16(_x)
 #undef  bswap32
@@ -8,6 +10,16 @@
 #undef  bswap64
 #define bswap64(_x) __builtin_bswap64(_x)
 
+/**
+ * target_words_bigendian:
+ * Returns true if the (default) endianness of the target is big endian,
+ * false otherwise.
+ * Common code should normally never need to know about the endianness of the
+ * target, so please do *not* use this function unless you know very well
+ * what you are doing!
+ */
+bool target_words_bigendian(void);
+
 static inline uint32_t bswap24(uint32_t x)
 {
     return (((x & 0x000000ffU) << 16) |
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 04/16] exec/memory.h: make devend_memop target agnostic
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (2 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 03/16] include: move target_words_bigendian() from tswap to bswap Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 15:25   ` Richard Henderson
  2025-03-10 16:30   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions Pierrick Bouvier
                   ` (12 subsequent siblings)
  16 siblings, 2 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Will allow to make system/memory.c common later.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/exec/memory.h | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 7c20f36a312..698179b26d2 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -3164,25 +3164,23 @@ address_space_write_cached(MemoryRegionCache *cache, hwaddr addr,
 MemTxResult address_space_set(AddressSpace *as, hwaddr addr,
                               uint8_t c, hwaddr len, MemTxAttrs attrs);
 
-#ifdef COMPILING_PER_TARGET
 /* enum device_endian to MemOp.  */
 static inline MemOp devend_memop(enum device_endian end)
 {
     QEMU_BUILD_BUG_ON(DEVICE_HOST_ENDIAN != DEVICE_LITTLE_ENDIAN &&
                       DEVICE_HOST_ENDIAN != DEVICE_BIG_ENDIAN);
 
-#if HOST_BIG_ENDIAN != TARGET_BIG_ENDIAN
-    /* Swap if non-host endianness or native (target) endianness */
-    return (end == DEVICE_HOST_ENDIAN) ? 0 : MO_BSWAP;
-#else
+    if (HOST_BIG_ENDIAN != target_words_bigendian()) {
+        /* Swap if non-host endianness or native (target) endianness */
+        return (end == DEVICE_HOST_ENDIAN) ? 0 : MO_BSWAP;
+    }
+
     const int non_host_endianness =
         DEVICE_LITTLE_ENDIAN ^ DEVICE_BIG_ENDIAN ^ DEVICE_HOST_ENDIAN;
 
     /* In this case, native (target) endianness needs no swap.  */
     return (end == non_host_endianness) ? MO_BSWAP : 0;
-#endif
 }
-#endif /* COMPILING_PER_TARGET */
 
 /*
  * Inhibit technologies that require discarding of pages in RAM blocks, e.g.,
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (3 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 04/16] exec/memory.h: make devend_memop target agnostic Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 16:08   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 06/16] exec/cpu-all.h: we can now remove ld/st macros Pierrick Bouvier
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

For now, they are duplicate of the same macros in cpu-all.h that we
eliminate in next commit.

Keep code readable by not defining them with macros, but simply their
implementation.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/qemu/bswap.h | 70 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
index ebf6f9e5f5c..46ec62f716d 100644
--- a/include/qemu/bswap.h
+++ b/include/qemu/bswap.h
@@ -442,6 +442,76 @@ DO_STN_LDN_P(be)
 
 #undef DO_STN_LDN_P
 
+/* Return ld{word}_{le,be}_p following target endianness. */
+#define LOAD_IMPL(word, args...)                    \
+do {                                                \
+    if (target_words_bigendian()) {                 \
+        return glue(glue(ld, word), _be_p)(args);   \
+    } else {                                        \
+        return glue(glue(ld, word), _le_p)(args);   \
+    }                                               \
+} while (0)
+
+static inline int lduw_p(const void *ptr)
+{
+    LOAD_IMPL(uw, ptr);
+}
+
+static inline int ldsw_p(const void *ptr)
+{
+    LOAD_IMPL(sw, ptr);
+}
+
+static inline int ldl_p(const void *ptr)
+{
+    LOAD_IMPL(l, ptr);
+}
+
+static inline uint64_t ldq_p(const void *ptr)
+{
+    LOAD_IMPL(q, ptr);
+}
+
+static inline uint64_t ldn_p(const void *ptr, int sz)
+{
+    LOAD_IMPL(n, ptr, sz);
+}
+
+#undef LOAD_IMPL
+
+/* Call st{word}_{le,be}_p following target endianness. */
+#define STORE_IMPL(word, args...)           \
+do {                                        \
+    if (target_words_bigendian()) {         \
+        glue(glue(st, word), _be_p)(args);  \
+    } else {                                \
+        glue(glue(st, word), _le_p)(args);  \
+    }                                       \
+} while (0)
+
+
+static inline void stw_p(void *ptr, uint16_t v)
+{
+    STORE_IMPL(w, ptr, v);
+}
+
+static inline void stl_p(void *ptr, uint32_t v)
+{
+    STORE_IMPL(l, ptr, v);
+}
+
+static inline void stq_p(void *ptr, uint64_t v)
+{
+    STORE_IMPL(q, ptr, v);
+}
+
+static inline void stn_p(void *ptr, int sz, uint64_t v)
+{
+    STORE_IMPL(n, ptr, sz, v);
+}
+
+#undef STORE_IMPL
+
 #undef le_bswap
 #undef be_bswap
 #undef le_bswaps
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 06/16] exec/cpu-all.h: we can now remove ld/st macros
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (4 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 16:39   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 07/16] codebase: prepare to remove cpu.h from exec/exec-all.h Pierrick Bouvier
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Functions declared in bswap.h will be used instead.

At this point, we finished to extract memory API from cpu-all.h, and it
can be called from any common or target dependent code.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/exec/cpu-all.h | 25 -------------------------
 1 file changed, 25 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 1c2e18f492b..625b4c51f3c 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -38,31 +38,6 @@
 #define BSWAP_NEEDED
 #endif
 
-/* Target-endianness CPU memory access functions. These fit into the
- * {ld,st}{type}{sign}{size}{endian}_p naming scheme described in bswap.h.
- */
-#if TARGET_BIG_ENDIAN
-#define lduw_p(p) lduw_be_p(p)
-#define ldsw_p(p) ldsw_be_p(p)
-#define ldl_p(p) ldl_be_p(p)
-#define ldq_p(p) ldq_be_p(p)
-#define stw_p(p, v) stw_be_p(p, v)
-#define stl_p(p, v) stl_be_p(p, v)
-#define stq_p(p, v) stq_be_p(p, v)
-#define ldn_p(p, sz) ldn_be_p(p, sz)
-#define stn_p(p, sz, v) stn_be_p(p, sz, v)
-#else
-#define lduw_p(p) lduw_le_p(p)
-#define ldsw_p(p) ldsw_le_p(p)
-#define ldl_p(p) ldl_le_p(p)
-#define ldq_p(p) ldq_le_p(p)
-#define stw_p(p, v) stw_le_p(p, v)
-#define stl_p(p, v) stl_le_p(p, v)
-#define stq_p(p, v) stq_le_p(p, v)
-#define ldn_p(p, sz) ldn_le_p(p, sz)
-#define stn_p(p, sz, v) stn_le_p(p, sz, v)
-#endif
-
 /* MMU memory access macros */
 
 #if !defined(CONFIG_USER_ONLY)
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 07/16] codebase: prepare to remove cpu.h from exec/exec-all.h
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (5 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 06/16] exec/cpu-all.h: we can now remove ld/st macros Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:22   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 08/16] exec/exec-all: remove dependency on cpu.h Pierrick Bouvier
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/tcg/tcg-op.h           | 1 +
 target/ppc/helper_regs.h       | 2 ++
 hw/ppc/spapr_nested.c          | 1 +
 hw/sh4/sh7750.c                | 1 +
 page-vary-target.c             | 3 ++-
 target/riscv/bitmanip_helper.c | 1 +
 6 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index a02850583bd..bc46b5570c4 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -9,6 +9,7 @@
 #define TCG_TCG_OP_H
 
 #include "tcg/tcg-op-common.h"
+#include "exec/target_long.h"
 
 #ifndef TARGET_LONG_BITS
 #error must include QEMU headers
diff --git a/target/ppc/helper_regs.h b/target/ppc/helper_regs.h
index 8196c1346dc..b928c2c452d 100644
--- a/target/ppc/helper_regs.h
+++ b/target/ppc/helper_regs.h
@@ -20,6 +20,8 @@
 #ifndef HELPER_REGS_H
 #define HELPER_REGS_H
 
+#include "target/ppc/cpu.h"
+
 void hreg_swap_gpr_tgpr(CPUPPCState *env);
 void hreg_compute_hflags(CPUPPCState *env);
 void hreg_update_pmu_hflags(CPUPPCState *env);
diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index 23958c6383a..18bbb1403c6 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -2,6 +2,7 @@
 #include "qemu/cutils.h"
 #include "exec/exec-all.h"
 #include "exec/cputlb.h"
+#include "exec/target_long.h"
 #include "helper_regs.h"
 #include "hw/ppc/ppc.h"
 #include "hw/ppc/spapr.h"
diff --git a/hw/sh4/sh7750.c b/hw/sh4/sh7750.c
index 6faf0e3ca8b..41306fb6008 100644
--- a/hw/sh4/sh7750.c
+++ b/hw/sh4/sh7750.c
@@ -29,6 +29,7 @@
 #include "hw/irq.h"
 #include "hw/sh4/sh.h"
 #include "system/system.h"
+#include "target/sh4/cpu.h"
 #include "hw/qdev-properties.h"
 #include "hw/qdev-properties-system.h"
 #include "sh7750_regs.h"
diff --git a/page-vary-target.c b/page-vary-target.c
index 3f81144cda8..12fda410bb0 100644
--- a/page-vary-target.c
+++ b/page-vary-target.c
@@ -20,8 +20,9 @@
 #define IN_PAGE_VARY 1
 
 #include "qemu/osdep.h"
-#include "exec/page-vary.h"
 #include "exec/exec-all.h"
+#include "exec/page-vary.h"
+#include "exec/target_page.h"
 
 bool set_preferred_target_page_bits(int bits)
 {
diff --git a/target/riscv/bitmanip_helper.c b/target/riscv/bitmanip_helper.c
index b99c4a39a1f..d93312a811c 100644
--- a/target/riscv/bitmanip_helper.c
+++ b/target/riscv/bitmanip_helper.c
@@ -21,6 +21,7 @@
 #include "qemu/osdep.h"
 #include "qemu/host-utils.h"
 #include "exec/exec-all.h"
+#include "exec/target_long.h"
 #include "exec/helper-proto.h"
 #include "tcg/tcg.h"
 
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 08/16] exec/exec-all: remove dependency on cpu.h
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (6 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 07/16] codebase: prepare to remove cpu.h from exec/exec-all.h Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:29   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 09/16] exec/memory-internal: " Pierrick Bouvier
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/exec/exec-all.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index dd5c40f2233..19b0eda44a7 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -20,7 +20,6 @@
 #ifndef EXEC_ALL_H
 #define EXEC_ALL_H
 
-#include "cpu.h"
 #if defined(CONFIG_USER_ONLY)
 #include "exec/cpu_ldst.h"
 #endif
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 09/16] exec/memory-internal: remove dependency on cpu.h
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (7 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 08/16] exec/exec-all: remove dependency on cpu.h Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:29   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 10/16] exec/ram_addr: " Pierrick Bouvier
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/exec/memory-internal.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 100c1237ac2..b729f3b25ad 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -20,8 +20,6 @@
 #ifndef MEMORY_INTERNAL_H
 #define MEMORY_INTERNAL_H
 
-#include "cpu.h"
-
 #ifndef CONFIG_USER_ONLY
 static inline AddressSpaceDispatch *flatview_to_dispatch(FlatView *fv)
 {
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 10/16] exec/ram_addr: remove dependency on cpu.h
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (8 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 09/16] exec/memory-internal: " Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:29   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 11/16] system/kvm: make kvm_flush_coalesced_mmio_buffer() accessible for common code Pierrick Bouvier
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/exec/ram_addr.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 3d8df4edf15..7c011fadd11 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -20,13 +20,14 @@
 #define RAM_ADDR_H
 
 #ifndef CONFIG_USER_ONLY
-#include "cpu.h"
 #include "system/xen.h"
 #include "system/tcg.h"
 #include "exec/cputlb.h"
 #include "exec/ramlist.h"
 #include "exec/ramblock.h"
 #include "exec/exec-all.h"
+#include "exec/memory.h"
+#include "exec/target_page.h"
 #include "qemu/rcu.h"
 
 #include "exec/hwaddr.h"
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 11/16] system/kvm: make kvm_flush_coalesced_mmio_buffer() accessible for common code
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (9 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 10/16] exec/ram_addr: " Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:30   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 12/16] exec/ram_addr: call xen_hvm_modified_memory only if xen is enabled Pierrick Bouvier
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

This function is used by system/physmem.c will be turn into common code
in next commit.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/system/kvm.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/system/kvm.h b/include/system/kvm.h
index ab17c09a551..21da3b8b052 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -210,11 +210,11 @@ bool kvm_arm_supports_user_irq(void);
 int kvm_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
 int kvm_on_sigbus(int code, void *addr);
 
-#ifdef COMPILING_PER_TARGET
-#include "cpu.h"
-
 void kvm_flush_coalesced_mmio_buffer(void);
 
+#ifdef COMPILING_PER_TARGET
+#include "cpu.h"
+
 /**
  * kvm_update_guest_debug(): ensure KVM debug structures updated
  * @cs: the CPUState for this cpu
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 12/16] exec/ram_addr: call xen_hvm_modified_memory only if xen is enabled
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (10 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 11/16] system/kvm: make kvm_flush_coalesced_mmio_buffer() accessible for common code Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:30   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 13/16] hw/xen: add stubs for various functions Pierrick Bouvier
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/exec/ram_addr.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 7c011fadd11..098fccb5835 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -342,7 +342,9 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start,
         }
     }
 
-    xen_hvm_modified_memory(start, length);
+    if (xen_enabled()) {
+        xen_hvm_modified_memory(start, length);
+    }
 }
 
 #if !defined(_WIN32)
@@ -418,7 +420,9 @@ uint64_t cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
             }
         }
 
-        xen_hvm_modified_memory(start, pages << TARGET_PAGE_BITS);
+        if (xen_enabled()) {
+            xen_hvm_modified_memory(start, pages << TARGET_PAGE_BITS);
+        }
     } else {
         uint8_t clients = tcg_enabled() ? DIRTY_CLIENTS_ALL : DIRTY_CLIENTS_NOCODE;
 
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 13/16] hw/xen: add stubs for various functions
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (11 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 12/16] exec/ram_addr: call xen_hvm_modified_memory only if xen is enabled Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:32   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 14/16] system/physmem: compilation unit is now common to all targets Pierrick Bouvier
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Those functions are used by system/physmem.c, and are called only if
xen is enabled (which happens only if CONFIG_XEN is not set).

So we can crash in case those are called.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 hw/xen/xen_stubs.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++
 hw/xen/meson.build |  3 +++
 2 files changed, 59 insertions(+)
 create mode 100644 hw/xen/xen_stubs.c

diff --git a/hw/xen/xen_stubs.c b/hw/xen/xen_stubs.c
new file mode 100644
index 00000000000..19cee84bbb4
--- /dev/null
+++ b/hw/xen/xen_stubs.c
@@ -0,0 +1,56 @@
+/*
+ * Various stubs for xen functions
+ *
+ * Those functions are used only if xen_enabled(). This file is linked only if
+ * CONFIG_XEN is not set, so they should never be called.
+ *
+ * Copyright (c) 2025 Linaro, Ltd.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "system/xen.h"
+#include "system/xen-mapcache.h"
+
+void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length)
+{
+    g_assert_not_reached();
+}
+
+void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
+                   struct MemoryRegion *mr, Error **errp)
+{
+    g_assert_not_reached();
+}
+
+bool xen_mr_is_memory(MemoryRegion *mr)
+{
+    g_assert_not_reached();
+}
+
+void xen_invalidate_map_cache_entry(uint8_t *buffer)
+{
+    g_assert_not_reached();
+}
+
+void xen_invalidate_map_cache(void)
+{
+    g_assert_not_reached();
+}
+
+ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
+{
+    g_assert_not_reached();
+}
+
+uint8_t *xen_map_cache(MemoryRegion *mr,
+                       hwaddr phys_addr,
+                       hwaddr size,
+                       ram_addr_t ram_addr_offset,
+                       uint8_t lock,
+                       bool dma,
+                       bool is_write)
+{
+    g_assert_not_reached();
+}
diff --git a/hw/xen/meson.build b/hw/xen/meson.build
index 4a486e36738..a1850e76988 100644
--- a/hw/xen/meson.build
+++ b/hw/xen/meson.build
@@ -9,6 +9,9 @@ system_ss.add(when: ['CONFIG_XEN_BUS'], if_true: files(
 
 system_ss.add(when: ['CONFIG_XEN', xen], if_true: files(
   'xen-operations.c',
+),
+if_false: files(
+  'xen_stubs.c',
 ))
 
 xen_specific_ss = ss.source_set()
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 14/16] system/physmem: compilation unit is now common to all targets
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (12 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 13/16] hw/xen: add stubs for various functions Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:32   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 15/16] system/memory: make compilation unit common Pierrick Bouvier
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 system/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/system/meson.build b/system/meson.build
index c83d80fa248..9d0b0122e54 100644
--- a/system/meson.build
+++ b/system/meson.build
@@ -2,7 +2,6 @@ specific_ss.add(when: 'CONFIG_SYSTEM_ONLY', if_true: [files(
   'arch_init.c',
   'ioport.c',
   'memory.c',
-  'physmem.c',
 )])
 
 system_ss.add(files(
@@ -15,6 +14,7 @@ system_ss.add(files(
   'dma-helpers.c',
   'globals.c',
   'memory_mapping.c',
+  'physmem.c',
   'qdev-monitor.c',
   'qtest.c',
   'rtc.c',
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 15/16] system/memory: make compilation unit common
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (13 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 14/16] system/physmem: compilation unit is now common to all targets Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:43   ` Richard Henderson
  2025-03-10  4:58 ` [PATCH 16/16] system/ioport: " Pierrick Bouvier
  2025-03-10 13:23 ` [PATCH 00/16] make system memory API available for common code BALATON Zoltan
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 system/memory.c    | 22 +++++++++++++++-------
 system/meson.build |  2 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/system/memory.c b/system/memory.c
index 4c829793a0a..b401be8b5f1 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -355,11 +355,11 @@ static void flatview_simplify(FlatView *view)
 
 static bool memory_region_big_endian(MemoryRegion *mr)
 {
-#if TARGET_BIG_ENDIAN
-    return mr->ops->endianness != DEVICE_LITTLE_ENDIAN;
-#else
-    return mr->ops->endianness == DEVICE_BIG_ENDIAN;
-#endif
+    if (target_words_bigendian()) {
+        return mr->ops->endianness != DEVICE_LITTLE_ENDIAN;
+    } else {
+        return mr->ops->endianness == DEVICE_BIG_ENDIAN;
+    }
 }
 
 static void adjust_endianness(MemoryRegion *mr, uint64_t *data, MemOp op)
@@ -2584,7 +2584,11 @@ void memory_region_add_eventfd(MemoryRegion *mr,
     unsigned i;
 
     if (size) {
-        adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_TE);
+        if (target_words_bigendian()) {
+            adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_BE);
+        } else {
+            adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_LE);
+        }
     }
     memory_region_transaction_begin();
     for (i = 0; i < mr->ioeventfd_nb; ++i) {
@@ -2619,7 +2623,11 @@ void memory_region_del_eventfd(MemoryRegion *mr,
     unsigned i;
 
     if (size) {
-        adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_TE);
+        if (target_words_bigendian()) {
+            adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_BE);
+        } else {
+            adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_LE);
+        }
     }
     memory_region_transaction_begin();
     for (i = 0; i < mr->ioeventfd_nb; ++i) {
diff --git a/system/meson.build b/system/meson.build
index 9d0b0122e54..881cb2736fe 100644
--- a/system/meson.build
+++ b/system/meson.build
@@ -1,7 +1,6 @@
 specific_ss.add(when: 'CONFIG_SYSTEM_ONLY', if_true: [files(
   'arch_init.c',
   'ioport.c',
-  'memory.c',
 )])
 
 system_ss.add(files(
@@ -14,6 +13,7 @@ system_ss.add(files(
   'dma-helpers.c',
   'globals.c',
   'memory_mapping.c',
+  'memory.c',
   'physmem.c',
   'qdev-monitor.c',
   'qtest.c',
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 16/16] system/ioport: make compilation unit common
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (14 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 15/16] system/memory: make compilation unit common Pierrick Bouvier
@ 2025-03-10  4:58 ` Pierrick Bouvier
  2025-03-10 17:43   ` Richard Henderson
  2025-03-10 13:23 ` [PATCH 00/16] make system memory API available for common code BALATON Zoltan
  16 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10  4:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini, Pierrick Bouvier

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 system/ioport.c    | 1 -
 system/meson.build | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/system/ioport.c b/system/ioport.c
index 55c2a752396..89daae9d602 100644
--- a/system/ioport.c
+++ b/system/ioport.c
@@ -26,7 +26,6 @@
  */
 
 #include "qemu/osdep.h"
-#include "cpu.h"
 #include "exec/ioport.h"
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
diff --git a/system/meson.build b/system/meson.build
index 881cb2736fe..3faec7e4dfb 100644
--- a/system/meson.build
+++ b/system/meson.build
@@ -1,6 +1,5 @@
 specific_ss.add(when: 'CONFIG_SYSTEM_ONLY', if_true: [files(
   'arch_init.c',
-  'ioport.c',
 )])
 
 system_ss.add(files(
@@ -12,6 +11,7 @@ system_ss.add(files(
   'dirtylimit.c',
   'dma-helpers.c',
   'globals.c',
+  'ioport.c',
   'memory_mapping.c',
   'memory.c',
   'physmem.c',
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH 00/16] make system memory API available for common code
  2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
                   ` (15 preceding siblings ...)
  2025-03-10  4:58 ` [PATCH 16/16] system/ioport: " Pierrick Bouvier
@ 2025-03-10 13:23 ` BALATON Zoltan
  2025-03-10 16:28   ` Pierrick Bouvier
  16 siblings, 1 reply; 61+ messages in thread
From: BALATON Zoltan @ 2025-03-10 13:23 UTC (permalink / raw)
  To: Pierrick Bouvier
  Cc: qemu-devel, qemu-ppc, Alistair Francis, Richard Henderson,
	Harsh Prateek Bora, alex.bennee, Palmer Dabbelt,
	Daniel Henrique Barboza, kvm, Peter Xu, Nicholas Piggin,
	Liu Zhiwei, David Hildenbrand, Weiwei Li, Paul Durrant,
	Edgar E. Iglesias, Philippe Mathieu-Daudé, Anthony PERARD,
	Yoshinori Sato, manos.pitsidianakis, qemu-riscv, Paolo Bonzini,
	xen-devel, Stefano Stabellini

On Sun, 9 Mar 2025, Pierrick Bouvier wrote:
> The main goal of this series is to be able to call any memory ld/st function
> from code that is *not* target dependent.

Why is that needed?

> As a positive side effect, we can
> turn related system compilation units into common code.

Are there any negative side effects? In particular have you done any 
performance benchmarking to see if this causes a measurable slow down? 
Such as with the STREAM benchmark:
https://stackoverflow.com/questions/56086993/what-does-stream-memory-bandwidth-benchmark-really-measure

Maybe it would be good to have some performance tests similiar to 
functional tests that could be run like the CI tests to detect such 
performance changes. People report that QEMU is getting slower and slower 
with each release. Maybe it could be a GSoC project to make such tests but 
maybe we're too late for that.

Regards,
BALATON Zoltan


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h
  2025-03-10  4:58 ` [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h Pierrick Bouvier
@ 2025-03-10 15:17   ` Richard Henderson
  2025-03-10 16:03     ` Pierrick Bouvier
  2025-03-11  0:04     ` Pierrick Bouvier
  0 siblings, 2 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 15:17 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> They are now accessible through exec/memory.h instead, and we make sure
> all variants are available for common or target dependent code.
...
> diff --git a/include/exec/memory_ldst.h.inc b/include/exec/memory_ldst.h.inc
> index 92ad74e9560..74519a88de0 100644
> --- a/include/exec/memory_ldst.h.inc
> +++ b/include/exec/memory_ldst.h.inc
> @@ -19,7 +19,8 @@
>    * License along with this library; if not, see <http://www.gnu.org/licenses/>.
>    */
>   
> -#ifdef TARGET_ENDIANNESS
> +uint8_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
> +    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
>   uint16_t glue(address_space_lduw, SUFFIX)(ARG1_DECL,
>       hwaddr addr, MemTxAttrs attrs, MemTxResult *result);

You shouldn't be exposing

   address_space_lduw

to common code, only

   address_space_lduw_be
   address_space_lduw_le

etc.  I'm not sure what you're trying to do here.


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 03/16] include: move target_words_bigendian() from tswap to bswap
  2025-03-10  4:58 ` [PATCH 03/16] include: move target_words_bigendian() from tswap to bswap Pierrick Bouvier
@ 2025-03-10 15:21   ` Richard Henderson
  2025-03-10 16:05     ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 15:21 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> This is needed for next commits (especially when implementing st/ld
> primitives which will use this function).
> As well, remove reference to TARGET_BIG_ENDIAN, as we are about to
> remove this dependency.

Where is the TARGET_BIG_ENDIAN reference being removed?

> diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
> index b915835bead..ebf6f9e5f5c 100644
> --- a/include/qemu/bswap.h
> +++ b/include/qemu/bswap.h
> @@ -1,6 +1,8 @@
>   #ifndef BSWAP_H
>   #define BSWAP_H
>   
> +#include <stdbool.h>

You shouldn't need this.  It is included by osdep.h, for every file.


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/16] exec/memory.h: make devend_memop target agnostic
  2025-03-10  4:58 ` [PATCH 04/16] exec/memory.h: make devend_memop target agnostic Pierrick Bouvier
@ 2025-03-10 15:25   ` Richard Henderson
  2025-03-10 16:04     ` Pierrick Bouvier
  2025-03-10 16:30   ` Richard Henderson
  1 sibling, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 15:25 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Will allow to make system/memory.c common later.
> 
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>   include/exec/memory.h | 12 +++++-------
>   1 file changed, 5 insertions(+), 7 deletions(-)

I guess this was split from patch 3 without updating descriptions,
and this is the TARGET_BIG_ENDIAN reference removed.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h
  2025-03-10 15:17   ` Richard Henderson
@ 2025-03-10 16:03     ` Pierrick Bouvier
  2025-03-11  0:04     ` Pierrick Bouvier
  1 sibling, 0 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 16:03 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 08:17, Richard Henderson wrote:
> On 3/9/25 21:58, Pierrick Bouvier wrote:
>> They are now accessible through exec/memory.h instead, and we make sure
>> all variants are available for common or target dependent code.
> ...
>> diff --git a/include/exec/memory_ldst.h.inc b/include/exec/memory_ldst.h.inc
>> index 92ad74e9560..74519a88de0 100644
>> --- a/include/exec/memory_ldst.h.inc
>> +++ b/include/exec/memory_ldst.h.inc
>> @@ -19,7 +19,8 @@
>>     * License along with this library; if not, see <http://www.gnu.org/licenses/>.
>>     */
>>    
>> -#ifdef TARGET_ENDIANNESS
>> +uint8_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
>> +    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
>>    uint16_t glue(address_space_lduw, SUFFIX)(ARG1_DECL,
>>        hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
> 
> You shouldn't be exposing
> 
>     address_space_lduw
> 
> to common code, only
> 
>     address_space_lduw_be
>     address_space_lduw_le
> 
> etc.  I'm not sure what you're trying to do here.
> 
> 
> r~

As mentioned in the cover letter, the goal is to extract memory 
functions from cpu headers. The result is that we need to expose all 
functions for both common and target dependent code.

In case you don't see the point, you can try to remove memory_ldst 
include from cpu-all.h (without touching memory_ldst.h) and see the 
compilation errors. This patch is the minimal change to get something 
working.

If your point is that non be/le variants should be eliminated 
completely, yes, it can be done after this series.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/16] exec/memory.h: make devend_memop target agnostic
  2025-03-10 15:25   ` Richard Henderson
@ 2025-03-10 16:04     ` Pierrick Bouvier
  0 siblings, 0 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 16:04 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 08:25, Richard Henderson wrote:
> On 3/9/25 21:58, Pierrick Bouvier wrote:
>> Will allow to make system/memory.c common later.
>>
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>> ---
>>    include/exec/memory.h | 12 +++++-------
>>    1 file changed, 5 insertions(+), 7 deletions(-)
> 
> I guess this was split from patch 3 without updating descriptions,
> and this is the TARGET_BIG_ENDIAN reference removed.
> 

This was intentionally splitted, and the subject mentions the goal:
make devend_memop target agnostic.

As added in the description, it's needed to allow system/memory.c to be 
compiled as common code.

> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> 
> r~



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 03/16] include: move target_words_bigendian() from tswap to bswap
  2025-03-10 15:21   ` Richard Henderson
@ 2025-03-10 16:05     ` Pierrick Bouvier
  0 siblings, 0 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 16:05 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 08:21, Richard Henderson wrote:
> On 3/9/25 21:58, Pierrick Bouvier wrote:
>> This is needed for next commits (especially when implementing st/ld
>> primitives which will use this function).
>> As well, remove reference to TARGET_BIG_ENDIAN, as we are about to
>> remove this dependency.
> 
> Where is the TARGET_BIG_ENDIAN reference being removed?
> 

The comment part removed, during the move, is:
"Note that in target-specific code, you can use
  TARGET_BIG_ENDIAN directly instead."

>> diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
>> index b915835bead..ebf6f9e5f5c 100644
>> --- a/include/qemu/bswap.h
>> +++ b/include/qemu/bswap.h
>> @@ -1,6 +1,8 @@
>>    #ifndef BSWAP_H
>>    #define BSWAP_H
>>    
>> +#include <stdbool.h>
> 
> You shouldn't need this.  It is included by osdep.h, for every file.
> 

I'll take a look at that.

> 
> r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions
  2025-03-10  4:58 ` [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions Pierrick Bouvier
@ 2025-03-10 16:08   ` Richard Henderson
  2025-03-10 16:14     ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 16:08 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> For now, they are duplicate of the same macros in cpu-all.h that we
> eliminate in next commit.
> 
> Keep code readable by not defining them with macros, but simply their
> implementation.
> 
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

Why do you want these in bswap.h, rather than tswap.h?
They're target swaps, after all.


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions
  2025-03-10 16:08   ` Richard Henderson
@ 2025-03-10 16:14     ` Pierrick Bouvier
  2025-03-10 16:37       ` Richard Henderson
  0 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 16:14 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 09:08, Richard Henderson wrote:
> On 3/9/25 21:58, Pierrick Bouvier wrote:
>> For now, they are duplicate of the same macros in cpu-all.h that we
>> eliminate in next commit.
>>
>> Keep code readable by not defining them with macros, but simply their
>> implementation.
>>
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> 
> Why do you want these in bswap.h, rather than tswap.h?
> They're target swaps, after all.
> 
> 
> r~

No preference on that, I simply added them to the same file than their 
explicit endianness variant. Would you prefer the endianness agnostic 
variant to be in tswap.h instead?


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 00/16] make system memory API available for common code
  2025-03-10 13:23 ` [PATCH 00/16] make system memory API available for common code BALATON Zoltan
@ 2025-03-10 16:28   ` Pierrick Bouvier
  2025-03-10 16:56     ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 16:28 UTC (permalink / raw)
  To: BALATON Zoltan
  Cc: qemu-devel, qemu-ppc, Alistair Francis, Richard Henderson,
	Harsh Prateek Bora, alex.bennee, Palmer Dabbelt,
	Daniel Henrique Barboza, kvm, Peter Xu, Nicholas Piggin,
	Liu Zhiwei, David Hildenbrand, Weiwei Li, Paul Durrant,
	Edgar E. Iglesias, Philippe Mathieu-Daudé, Anthony PERARD,
	Yoshinori Sato, manos.pitsidianakis, qemu-riscv, Paolo Bonzini,
	xen-devel, Stefano Stabellini

Hi Zoltan,

On 3/10/25 06:23, BALATON Zoltan wrote:
> On Sun, 9 Mar 2025, Pierrick Bouvier wrote:
>> The main goal of this series is to be able to call any memory ld/st function
>> from code that is *not* target dependent.
> 
> Why is that needed?
> 

this series belongs to the "single binary" topic, where we are trying to 
build a single QEMU binary with all architectures embedded.

To achieve that, we need to have every single compilation unit compiled 
only once, to be able to link a binary without any symbol conflict.

A consequence of that is target specific code (in terms of code relying 
of target specific macros) needs to be converted to common code, 
checking at runtime properties of the target we run. We are tackling 
various places in QEMU codebase at the same time, which can be confusing 
for the community members.

This series take care of system memory related functions and associated 
compilation units in system/.

>> As a positive side effect, we can
>> turn related system compilation units into common code.
> 
> Are there any negative side effects? In particular have you done any
> performance benchmarking to see if this causes a measurable slow down?
> Such as with the STREAM benchmark:
> https://stackoverflow.com/questions/56086993/what-does-stream-memory-bandwidth-benchmark-really-measure
> 
> Maybe it would be good to have some performance tests similiar to
> functional tests that could be run like the CI tests to detect such
> performance changes. People report that QEMU is getting slower and slower
> with each release. Maybe it could be a GSoC project to make such tests but
> maybe we're too late for that.
> 

I agree with you, and it's something we have mentioned during our 
"internal" conversations. Testing performance with existing functional 
tests would already be a first good step. However, given the poor 
reliability we have on our CI runners, I think it's a bit doomed.

Ideally, every QEMU release cycle should have a performance measurement 
window to detect potential sources of regressions.

To answer to your specific question, I am trying first to get a review 
on the approach taken. We can always optimize in next series version, in 
case we identify it's a big deal to introduce a branch for every memory 
related function call.

In all cases, transforming code relying on compile time 
optimization/dead code elimination through defines to runtime checks 
will *always* have an impact, even though it should be minimal in most 
of cases. But the maintenance and compilation time benefits, as well as 
the perspectives it opens (single binary, heterogeneous emulation, use 
QEMU as a library) are worth it IMHO.

> Regards,
> BALATON Zoltan

Regards,
Pierrick



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/16] exec/memory.h: make devend_memop target agnostic
  2025-03-10  4:58 ` [PATCH 04/16] exec/memory.h: make devend_memop target agnostic Pierrick Bouvier
  2025-03-10 15:25   ` Richard Henderson
@ 2025-03-10 16:30   ` Richard Henderson
  2025-03-10 16:38     ` Pierrick Bouvier
  1 sibling, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 16:30 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Will allow to make system/memory.c common later.
> 
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>   include/exec/memory.h | 12 +++++-------
>   1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 7c20f36a312..698179b26d2 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -3164,25 +3164,23 @@ address_space_write_cached(MemoryRegionCache *cache, hwaddr addr,
>   MemTxResult address_space_set(AddressSpace *as, hwaddr addr,
>                                 uint8_t c, hwaddr len, MemTxAttrs attrs);
>   
> -#ifdef COMPILING_PER_TARGET
>   /* enum device_endian to MemOp.  */
>   static inline MemOp devend_memop(enum device_endian end)
>   {
>       QEMU_BUILD_BUG_ON(DEVICE_HOST_ENDIAN != DEVICE_LITTLE_ENDIAN &&
>                         DEVICE_HOST_ENDIAN != DEVICE_BIG_ENDIAN);
>   
> -#if HOST_BIG_ENDIAN != TARGET_BIG_ENDIAN
> -    /* Swap if non-host endianness or native (target) endianness */
> -    return (end == DEVICE_HOST_ENDIAN) ? 0 : MO_BSWAP;
> -#else
> +    if (HOST_BIG_ENDIAN != target_words_bigendian()) {
> +        /* Swap if non-host endianness or native (target) endianness */
> +        return (end == DEVICE_HOST_ENDIAN) ? 0 : MO_BSWAP;
> +    }
> +
>       const int non_host_endianness =
>           DEVICE_LITTLE_ENDIAN ^ DEVICE_BIG_ENDIAN ^ DEVICE_HOST_ENDIAN;
>   
>       /* In this case, native (target) endianness needs no swap.  */
>       return (end == non_host_endianness) ? MO_BSWAP : 0;
> -#endif
>   }
> -#endif /* COMPILING_PER_TARGET */

Someone (me?) was trying to be overly clever here.
We can simplify this function and conditionally avoid the function call:

     bool big_endian = (end == DEVICE_NATIVE_ENDIAN
                        ? target_words_bigendian()
                        : end == DEVICE_BIG_ENDIAN);
     return big_endian ? MO_BE : MO_LE;


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions
  2025-03-10 16:14     ` Pierrick Bouvier
@ 2025-03-10 16:37       ` Richard Henderson
  2025-03-10 16:43         ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 16:37 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/10/25 09:14, Pierrick Bouvier wrote:
> On 3/10/25 09:08, Richard Henderson wrote:
>> On 3/9/25 21:58, Pierrick Bouvier wrote:
>>> For now, they are duplicate of the same macros in cpu-all.h that we
>>> eliminate in next commit.
>>>
>>> Keep code readable by not defining them with macros, but simply their
>>> implementation.
>>>
>>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>>
>> Why do you want these in bswap.h, rather than tswap.h?
>> They're target swaps, after all.
>>
>>
>> r~
> 
> No preference on that, I simply added them to the same file than their explicit endianness 
> variant. Would you prefer the endianness agnostic variant to be in tswap.h instead?

I think I would.

In addition, I think we want

#ifdef COMPILING_PER_TARGET
#define target_words_bigendian()  TARGET_BIG_ENDIAN
#else
bool target_words_bigendian(void);
#endif

moving the conditional from around target_needs_bswap just below.

With that, we eliminate the extra branch that you're otherwise
adding to target-specific code with this patch.


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/16] exec/memory.h: make devend_memop target agnostic
  2025-03-10 16:30   ` Richard Henderson
@ 2025-03-10 16:38     ` Pierrick Bouvier
  0 siblings, 0 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 16:38 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 09:30, Richard Henderson wrote:
> On 3/9/25 21:58, Pierrick Bouvier wrote:
>> Will allow to make system/memory.c common later.
>>
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>> ---
>>    include/exec/memory.h | 12 +++++-------
>>    1 file changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index 7c20f36a312..698179b26d2 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -3164,25 +3164,23 @@ address_space_write_cached(MemoryRegionCache *cache, hwaddr addr,
>>    MemTxResult address_space_set(AddressSpace *as, hwaddr addr,
>>                                  uint8_t c, hwaddr len, MemTxAttrs attrs);
>>    
>> -#ifdef COMPILING_PER_TARGET
>>    /* enum device_endian to MemOp.  */
>>    static inline MemOp devend_memop(enum device_endian end)
>>    {
>>        QEMU_BUILD_BUG_ON(DEVICE_HOST_ENDIAN != DEVICE_LITTLE_ENDIAN &&
>>                          DEVICE_HOST_ENDIAN != DEVICE_BIG_ENDIAN);
>>    
>> -#if HOST_BIG_ENDIAN != TARGET_BIG_ENDIAN
>> -    /* Swap if non-host endianness or native (target) endianness */
>> -    return (end == DEVICE_HOST_ENDIAN) ? 0 : MO_BSWAP;
>> -#else
>> +    if (HOST_BIG_ENDIAN != target_words_bigendian()) {
>> +        /* Swap if non-host endianness or native (target) endianness */
>> +        return (end == DEVICE_HOST_ENDIAN) ? 0 : MO_BSWAP;
>> +    }
>> +
>>        const int non_host_endianness =
>>            DEVICE_LITTLE_ENDIAN ^ DEVICE_BIG_ENDIAN ^ DEVICE_HOST_ENDIAN;
>>    
>>        /* In this case, native (target) endianness needs no swap.  */
>>        return (end == non_host_endianness) ? MO_BSWAP : 0;
>> -#endif
>>    }
>> -#endif /* COMPILING_PER_TARGET */
> 
> Someone (me?) was trying to be overly clever here.

Tends to happen through the QEMU codebase :).

> We can simplify this function and conditionally avoid the function call:
> 
>       bool big_endian = (end == DEVICE_NATIVE_ENDIAN
>                          ? target_words_bigendian()
>                          : end == DEVICE_BIG_ENDIAN);
>       return big_endian ? MO_BE : MO_LE;
> 

That's fine for me (still requires a bit of cleverness to read it at first).

> 
> r~



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 06/16] exec/cpu-all.h: we can now remove ld/st macros
  2025-03-10  4:58 ` [PATCH 06/16] exec/cpu-all.h: we can now remove ld/st macros Pierrick Bouvier
@ 2025-03-10 16:39   ` Richard Henderson
  2025-03-10 16:45     ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 16:39 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Functions declared in bswap.h will be used instead.
> 
> At this point, we finished to extract memory API from cpu-all.h, and it
> can be called from any common or target dependent code.
> 
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>   include/exec/cpu-all.h | 25 -------------------------
>   1 file changed, 25 deletions(-)

I think this should be squashed with the previous patch.


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions
  2025-03-10 16:37       ` Richard Henderson
@ 2025-03-10 16:43         ` Pierrick Bouvier
  2025-03-10 16:53           ` Richard Henderson
  0 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 16:43 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 09:37, Richard Henderson wrote:
> On 3/10/25 09:14, Pierrick Bouvier wrote:
>> On 3/10/25 09:08, Richard Henderson wrote:
>>> On 3/9/25 21:58, Pierrick Bouvier wrote:
>>>> For now, they are duplicate of the same macros in cpu-all.h that we
>>>> eliminate in next commit.
>>>>
>>>> Keep code readable by not defining them with macros, but simply their
>>>> implementation.
>>>>
>>>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>>>
>>> Why do you want these in bswap.h, rather than tswap.h?
>>> They're target swaps, after all.
>>>
>>>
>>> r~
>>
>> No preference on that, I simply added them to the same file than their explicit endianness
>> variant. Would you prefer the endianness agnostic variant to be in tswap.h instead?
> 
> I think I would.

Ok, I will move it.

> 
> In addition, I think we want
> 
> #ifdef COMPILING_PER_TARGET
> #define target_words_bigendian()  TARGET_BIG_ENDIAN
> #else
> bool target_words_bigendian(void);
> #endif
> 
> moving the conditional from around target_needs_bswap just below.
> 
> With that, we eliminate the extra branch that you're otherwise
> adding to target-specific code with this patch.
> 

I understand the change requested, but should we really aim in that 
direction? In the end, if we pursue the compilation units deduplication, 
the branch will be present anyway.

I'm ok with your change, just asking if we really want to preserve 
target specific code until the "end".

> 
> r~



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 06/16] exec/cpu-all.h: we can now remove ld/st macros
  2025-03-10 16:39   ` Richard Henderson
@ 2025-03-10 16:45     ` Pierrick Bouvier
  0 siblings, 0 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 16:45 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 09:39, Richard Henderson wrote:
> On 3/9/25 21:58, Pierrick Bouvier wrote:
>> Functions declared in bswap.h will be used instead.
>>
>> At this point, we finished to extract memory API from cpu-all.h, and it
>> can be called from any common or target dependent code.
>>
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>> ---
>>    include/exec/cpu-all.h | 25 -------------------------
>>    1 file changed, 25 deletions(-)
> 
> I think this should be squashed with the previous patch.
> 

No specific opinion here.
I'm ok with it, just thought it was more easy to split this, considering 
previous state compiles cleanly as well.

> 
> r~



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions
  2025-03-10 16:43         ` Pierrick Bouvier
@ 2025-03-10 16:53           ` Richard Henderson
  2025-03-10 17:09             ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 16:53 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/10/25 09:43, Pierrick Bouvier wrote:
> On 3/10/25 09:37, Richard Henderson wrote:
>> On 3/10/25 09:14, Pierrick Bouvier wrote:
>>> On 3/10/25 09:08, Richard Henderson wrote:
>>>> On 3/9/25 21:58, Pierrick Bouvier wrote:
>>>>> For now, they are duplicate of the same macros in cpu-all.h that we
>>>>> eliminate in next commit.
>>>>>
>>>>> Keep code readable by not defining them with macros, but simply their
>>>>> implementation.
>>>>>
>>>>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>>>>
>>>> Why do you want these in bswap.h, rather than tswap.h?
>>>> They're target swaps, after all.
>>>>
>>>>
>>>> r~
>>>
>>> No preference on that, I simply added them to the same file than their explicit endianness
>>> variant. Would you prefer the endianness agnostic variant to be in tswap.h instead?
>>
>> I think I would.
> 
> Ok, I will move it.
> 
>>
>> In addition, I think we want
>>
>> #ifdef COMPILING_PER_TARGET
>> #define target_words_bigendian()  TARGET_BIG_ENDIAN
>> #else
>> bool target_words_bigendian(void);
>> #endif
>>
>> moving the conditional from around target_needs_bswap just below.
>>
>> With that, we eliminate the extra branch that you're otherwise
>> adding to target-specific code with this patch.
>>
> 
> I understand the change requested, but should we really aim in that direction? In the end, 
> if we pursue the compilation units deduplication, the branch will be present anyway.
> 
> I'm ok with your change, just asking if we really want to preserve target specific code 
> until the "end".

All of target/ is target specific.  De-duplication will not eliminate that.


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 00/16] make system memory API available for common code
  2025-03-10 16:28   ` Pierrick Bouvier
@ 2025-03-10 16:56     ` Pierrick Bouvier
  2025-03-10 19:40       ` BALATON Zoltan
  0 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 16:56 UTC (permalink / raw)
  To: BALATON Zoltan
  Cc: qemu-devel, qemu-ppc, Alistair Francis, Richard Henderson,
	Harsh Prateek Bora, alex.bennee, Palmer Dabbelt,
	Daniel Henrique Barboza, kvm, Peter Xu, Nicholas Piggin,
	Liu Zhiwei, David Hildenbrand, Weiwei Li, Paul Durrant,
	Edgar E. Iglesias, Philippe Mathieu-Daudé, Anthony PERARD,
	Yoshinori Sato, manos.pitsidianakis, qemu-riscv, Paolo Bonzini,
	xen-devel, Stefano Stabellini

On 3/10/25 09:28, Pierrick Bouvier wrote:
> Hi Zoltan,
> 
> On 3/10/25 06:23, BALATON Zoltan wrote:
>> On Sun, 9 Mar 2025, Pierrick Bouvier wrote:
>>> The main goal of this series is to be able to call any memory ld/st function
>>> from code that is *not* target dependent.
>>
>> Why is that needed?
>>
> 
> this series belongs to the "single binary" topic, where we are trying to
> build a single QEMU binary with all architectures embedded.
> 
> To achieve that, we need to have every single compilation unit compiled
> only once, to be able to link a binary without any symbol conflict.
> 
> A consequence of that is target specific code (in terms of code relying
> of target specific macros) needs to be converted to common code,
> checking at runtime properties of the target we run. We are tackling
> various places in QEMU codebase at the same time, which can be confusing
> for the community members.
> 
> This series take care of system memory related functions and associated
> compilation units in system/.
> 
>>> As a positive side effect, we can
>>> turn related system compilation units into common code.
>>
>> Are there any negative side effects? In particular have you done any
>> performance benchmarking to see if this causes a measurable slow down?
>> Such as with the STREAM benchmark:
>> https://stackoverflow.com/questions/56086993/what-does-stream-memory-bandwidth-benchmark-really-measure
>>
>> Maybe it would be good to have some performance tests similiar to
>> functional tests that could be run like the CI tests to detect such
>> performance changes. People report that QEMU is getting slower and slower
>> with each release. Maybe it could be a GSoC project to make such tests but
>> maybe we're too late for that.
>>
> 
> I agree with you, and it's something we have mentioned during our
> "internal" conversations. Testing performance with existing functional
> tests would already be a first good step. However, given the poor
> reliability we have on our CI runners, I think it's a bit doomed.
> 
> Ideally, every QEMU release cycle should have a performance measurement
> window to detect potential sources of regressions.
> 
> To answer to your specific question, I am trying first to get a review
> on the approach taken. We can always optimize in next series version, in
> case we identify it's a big deal to introduce a branch for every memory
> related function call.
> 
> In all cases, transforming code relying on compile time
> optimization/dead code elimination through defines to runtime checks
> will *always* have an impact, even though it should be minimal in most
> of cases. But the maintenance and compilation time benefits, as well as
> the perspectives it opens (single binary, heterogeneous emulation, use
> QEMU as a library) are worth it IMHO.
> 
>> Regards,
>> BALATON Zoltan
> 
> Regards,
> Pierrick
> 

As a side note, we recently did some work around performance analysis 
(for aarch64), as you can see here [1]. In the end, QEMU performance 
depends (roughly in this order) on:
1. quality of code generated by TCG
2. helper code to implement instructions
3. mmu emulation

Other state of the art translators that exist are faster (fex, box64) 
mainly by enhancing 1, and relying on various tricks to avoid 
translating some libraries calls. But those translators are host/target 
specific, and the ratio of instructions generated (vs target ones read) 
is much lower than QEMU. In the experimentation listed in the blog, I 
observed that for qemu-system-aarch64, we have an average expansion 
factor of around 18 (1 guest insn translates to 18 host ones).

For users seeing performance decreases, beyond the QEMU code changes, 
adding new target instructions may add new helpers, which may be called 
by the stack people use, and they can sometimes observe a slower behaviour.

There are probably some other low hanging fruits for other target 
architectures.

[1] https://www.linaro.org/blog/qemu-a-tale-of-performance-analysis/


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions
  2025-03-10 16:53           ` Richard Henderson
@ 2025-03-10 17:09             ` Pierrick Bouvier
  2025-03-10 20:17               ` BALATON Zoltan
  0 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 17:09 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 09:53, Richard Henderson wrote:
> On 3/10/25 09:43, Pierrick Bouvier wrote:
>> On 3/10/25 09:37, Richard Henderson wrote:
>>> On 3/10/25 09:14, Pierrick Bouvier wrote:
>>>> On 3/10/25 09:08, Richard Henderson wrote:
>>>>> On 3/9/25 21:58, Pierrick Bouvier wrote:
>>>>>> For now, they are duplicate of the same macros in cpu-all.h that we
>>>>>> eliminate in next commit.
>>>>>>
>>>>>> Keep code readable by not defining them with macros, but simply their
>>>>>> implementation.
>>>>>>
>>>>>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>>>>>
>>>>> Why do you want these in bswap.h, rather than tswap.h?
>>>>> They're target swaps, after all.
>>>>>
>>>>>
>>>>> r~
>>>>
>>>> No preference on that, I simply added them to the same file than their explicit endianness
>>>> variant. Would you prefer the endianness agnostic variant to be in tswap.h instead?
>>>
>>> I think I would.
>>
>> Ok, I will move it.
>>
>>>
>>> In addition, I think we want
>>>
>>> #ifdef COMPILING_PER_TARGET
>>> #define target_words_bigendian()  TARGET_BIG_ENDIAN
>>> #else
>>> bool target_words_bigendian(void);
>>> #endif
>>>
>>> moving the conditional from around target_needs_bswap just below.
>>>
>>> With that, we eliminate the extra branch that you're otherwise
>>> adding to target-specific code with this patch.
>>>
>>
>> I understand the change requested, but should we really aim in that direction? In the end,
>> if we pursue the compilation units deduplication, the branch will be present anyway.
>>
>> I'm ok with your change, just asking if we really want to preserve target specific code
>> until the "end".
> 
> All of target/ is target specific.  De-duplication will not eliminate that.
> 

My vocabulary was wrong here. I meant "if we want to preserve target 
specific macros" until the end.
Sure, there will always be compilation units (devices, cpus, helpers, 
...) specific to a target. I just wonder if sticking to ifdef paradigm 
for this kind of code is worth the "optimization" we are supposed to get.

I'll add the change requested.

> 
> r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 07/16] codebase: prepare to remove cpu.h from exec/exec-all.h
  2025-03-10  4:58 ` [PATCH 07/16] codebase: prepare to remove cpu.h from exec/exec-all.h Pierrick Bouvier
@ 2025-03-10 17:22   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:22 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> diff --git a/page-vary-target.c b/page-vary-target.c
> index 3f81144cda8..12fda410bb0 100644
> --- a/page-vary-target.c
> +++ b/page-vary-target.c
> @@ -20,8 +20,9 @@
>   #define IN_PAGE_VARY 1
>   
>   #include "qemu/osdep.h"
> -#include "exec/page-vary.h"
>   #include "exec/exec-all.h"
> +#include "exec/page-vary.h"
> +#include "exec/target_page.h"

You can remove exec-all.h here.

> diff --git a/target/riscv/bitmanip_helper.c b/target/riscv/bitmanip_helper.c
> index b99c4a39a1f..d93312a811c 100644
> --- a/target/riscv/bitmanip_helper.c
> +++ b/target/riscv/bitmanip_helper.c
> @@ -21,6 +21,7 @@
>   #include "qemu/osdep.h"
>   #include "qemu/host-utils.h"
>   #include "exec/exec-all.h"
> +#include "exec/target_long.h"
>   #include "exec/helper-proto.h"
>   #include "tcg/tcg.h"

Likewise.

With that,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 08/16] exec/exec-all: remove dependency on cpu.h
  2025-03-10  4:58 ` [PATCH 08/16] exec/exec-all: remove dependency on cpu.h Pierrick Bouvier
@ 2025-03-10 17:29   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:29 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>   include/exec/exec-all.h | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index dd5c40f2233..19b0eda44a7 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -20,7 +20,6 @@
>   #ifndef EXEC_ALL_H
>   #define EXEC_ALL_H
>   
> -#include "cpu.h"
>   #if defined(CONFIG_USER_ONLY)
>   #include "exec/cpu_ldst.h"
>   #endif

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 09/16] exec/memory-internal: remove dependency on cpu.h
  2025-03-10  4:58 ` [PATCH 09/16] exec/memory-internal: " Pierrick Bouvier
@ 2025-03-10 17:29   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:29 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>   include/exec/memory-internal.h | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
> index 100c1237ac2..b729f3b25ad 100644
> --- a/include/exec/memory-internal.h
> +++ b/include/exec/memory-internal.h
> @@ -20,8 +20,6 @@
>   #ifndef MEMORY_INTERNAL_H
>   #define MEMORY_INTERNAL_H
>   
> -#include "cpu.h"
> -
>   #ifndef CONFIG_USER_ONLY
>   static inline AddressSpaceDispatch *flatview_to_dispatch(FlatView *fv)
>   {

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 10/16] exec/ram_addr: remove dependency on cpu.h
  2025-03-10  4:58 ` [PATCH 10/16] exec/ram_addr: " Pierrick Bouvier
@ 2025-03-10 17:29   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:29 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Signed-off-by: Pierrick Bouvier<pierrick.bouvier@linaro.org>
> ---
>   include/exec/ram_addr.h | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 11/16] system/kvm: make kvm_flush_coalesced_mmio_buffer() accessible for common code
  2025-03-10  4:58 ` [PATCH 11/16] system/kvm: make kvm_flush_coalesced_mmio_buffer() accessible for common code Pierrick Bouvier
@ 2025-03-10 17:30   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:30 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> This function is used by system/physmem.c will be turn into common code
> in next commit.
> 
> Signed-off-by: Pierrick Bouvier<pierrick.bouvier@linaro.org>
> ---
>   include/system/kvm.h | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 12/16] exec/ram_addr: call xen_hvm_modified_memory only if xen is enabled
  2025-03-10  4:58 ` [PATCH 12/16] exec/ram_addr: call xen_hvm_modified_memory only if xen is enabled Pierrick Bouvier
@ 2025-03-10 17:30   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:30 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Signed-off-by: Pierrick Bouvier<pierrick.bouvier@linaro.org>
> ---
>   include/exec/ram_addr.h | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 13/16] hw/xen: add stubs for various functions
  2025-03-10  4:58 ` [PATCH 13/16] hw/xen: add stubs for various functions Pierrick Bouvier
@ 2025-03-10 17:32   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:32 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel; +Cc: open list:X86

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Those functions are used by system/physmem.c, and are called only if
> xen is enabled (which happens only if CONFIG_XEN is not set).
> 
> So we can crash in case those are called.
> 
> Signed-off-by: Pierrick Bouvier<pierrick.bouvier@linaro.org>
> ---
>   hw/xen/xen_stubs.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++
>   hw/xen/meson.build |  3 +++
>   2 files changed, 59 insertions(+)
>   create mode 100644 hw/xen/xen_stubs.c


Acked-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 14/16] system/physmem: compilation unit is now common to all targets
  2025-03-10  4:58 ` [PATCH 14/16] system/physmem: compilation unit is now common to all targets Pierrick Bouvier
@ 2025-03-10 17:32   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:32 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>   system/meson.build | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/system/meson.build b/system/meson.build
> index c83d80fa248..9d0b0122e54 100644
> --- a/system/meson.build
> +++ b/system/meson.build
> @@ -2,7 +2,6 @@ specific_ss.add(when: 'CONFIG_SYSTEM_ONLY', if_true: [files(
>     'arch_init.c',
>     'ioport.c',
>     'memory.c',
> -  'physmem.c',
>   )])
>   
>   system_ss.add(files(
> @@ -15,6 +14,7 @@ system_ss.add(files(
>     'dma-helpers.c',
>     'globals.c',
>     'memory_mapping.c',
> +  'physmem.c',
>     'qdev-monitor.c',
>     'qtest.c',
>     'rtc.c',

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 15/16] system/memory: make compilation unit common
  2025-03-10  4:58 ` [PATCH 15/16] system/memory: make compilation unit common Pierrick Bouvier
@ 2025-03-10 17:43   ` Richard Henderson
  2025-03-10 17:47     ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:43 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>   system/memory.c    | 22 +++++++++++++++-------
>   system/meson.build |  2 +-
>   2 files changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/system/memory.c b/system/memory.c
> index 4c829793a0a..b401be8b5f1 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -355,11 +355,11 @@ static void flatview_simplify(FlatView *view)
>   
>   static bool memory_region_big_endian(MemoryRegion *mr)
>   {
> -#if TARGET_BIG_ENDIAN
> -    return mr->ops->endianness != DEVICE_LITTLE_ENDIAN;
> -#else
> -    return mr->ops->endianness == DEVICE_BIG_ENDIAN;
> -#endif
> +    if (target_words_bigendian()) {
> +        return mr->ops->endianness != DEVICE_LITTLE_ENDIAN;
> +    } else {
> +        return mr->ops->endianness == DEVICE_BIG_ENDIAN;
> +    }
>   }

This should use the same expression as for patch 4:

     return (end == DEVICE_NATIVE_ENDIAN
             ? target_words_bigendian()
             : end == DEVICE_BIG_ENDIAN);

Which perhaps ought to be split out to it's own inline function?

>   
>   static void adjust_endianness(MemoryRegion *mr, uint64_t *data, MemOp op)
> @@ -2584,7 +2584,11 @@ void memory_region_add_eventfd(MemoryRegion *mr,
>       unsigned i;
>   
>       if (size) {
> -        adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_TE);
> +        if (target_words_bigendian()) {
> +            adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_BE);
> +        } else {
> +            adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_LE);
> +        }

Maybe better as

     MemOp mop = (target_words_bigendian() ? MO_BE : MO_LE) | size_memop(size);
     adjust_endianness(mr, &mrfd.data, size_memop(size), mop);


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 16/16] system/ioport: make compilation unit common
  2025-03-10  4:58 ` [PATCH 16/16] system/ioport: " Pierrick Bouvier
@ 2025-03-10 17:43   ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:43 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/9/25 21:58, Pierrick Bouvier wrote:
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>   system/ioport.c    | 1 -
>   system/meson.build | 2 +-
>   2 files changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 15/16] system/memory: make compilation unit common
  2025-03-10 17:43   ` Richard Henderson
@ 2025-03-10 17:47     ` Pierrick Bouvier
  2025-03-10 17:58       ` Richard Henderson
  0 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 17:47 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 10:43, Richard Henderson wrote:
> On 3/9/25 21:58, Pierrick Bouvier wrote:
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>> ---
>>    system/memory.c    | 22 +++++++++++++++-------
>>    system/meson.build |  2 +-
>>    2 files changed, 16 insertions(+), 8 deletions(-)
>>
>> diff --git a/system/memory.c b/system/memory.c
>> index 4c829793a0a..b401be8b5f1 100644
>> --- a/system/memory.c
>> +++ b/system/memory.c
>> @@ -355,11 +355,11 @@ static void flatview_simplify(FlatView *view)
>>    
>>    static bool memory_region_big_endian(MemoryRegion *mr)
>>    {
>> -#if TARGET_BIG_ENDIAN
>> -    return mr->ops->endianness != DEVICE_LITTLE_ENDIAN;
>> -#else
>> -    return mr->ops->endianness == DEVICE_BIG_ENDIAN;
>> -#endif
>> +    if (target_words_bigendian()) {
>> +        return mr->ops->endianness != DEVICE_LITTLE_ENDIAN;
>> +    } else {
>> +        return mr->ops->endianness == DEVICE_BIG_ENDIAN;
>> +    }
>>    }
> 
> This should use the same expression as for patch 4:
> 
>       return (end == DEVICE_NATIVE_ENDIAN
>               ? target_words_bigendian()
>               : end == DEVICE_BIG_ENDIAN);
> 
> Which perhaps ought to be split out to it's own inline function?
> 

Good point, I'll add this.

>>    
>>    static void adjust_endianness(MemoryRegion *mr, uint64_t *data, MemOp op)
>> @@ -2584,7 +2584,11 @@ void memory_region_add_eventfd(MemoryRegion *mr,
>>        unsigned i;
>>    
>>        if (size) {
>> -        adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_TE);
>> +        if (target_words_bigendian()) {
>> +            adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_BE);
>> +        } else {
>> +            adjust_endianness(mr, &mrfd.data, size_memop(size) | MO_LE);
>> +        }
> 
> Maybe better as
> 
>       MemOp mop = (target_words_bigendian() ? MO_BE : MO_LE) | size_memop(size);
>       adjust_endianness(mr, &mrfd.data, size_memop(size), mop);
> 

Do you think defining MO_TE as this expression is a good idea?

I'm afraid it's a bit too much implicit though, but it would save us 
from the hassle to change a lot of code using MO_BE, MO_LE (and all 
other variants defined in MemOp enum).

> 
> r~



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 15/16] system/memory: make compilation unit common
  2025-03-10 17:47     ` Pierrick Bouvier
@ 2025-03-10 17:58       ` Richard Henderson
  2025-03-10 18:04         ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 17:58 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/10/25 10:47, Pierrick Bouvier wrote:
>> Maybe better as
>>
>>       MemOp mop = (target_words_bigendian() ? MO_BE : MO_LE) | size_memop(size);
>>       adjust_endianness(mr, &mrfd.data, size_memop(size), mop);
>>
> 
> Do you think defining MO_TE as this expression is a good idea?

There are not so many references to MO_TE outside target/ or accel/tcg/.

Indeed, after this change, the only ones left are in hw/arm/armv7m.c,
which (because it's Arm) can be changed to MO_LE.


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 15/16] system/memory: make compilation unit common
  2025-03-10 17:58       ` Richard Henderson
@ 2025-03-10 18:04         ` Pierrick Bouvier
  2025-03-10 18:10           ` Richard Henderson
  2025-03-10 18:27           ` Philippe Mathieu-Daudé
  0 siblings, 2 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 18:04 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 10:58, Richard Henderson wrote:
> On 3/10/25 10:47, Pierrick Bouvier wrote:
>>> Maybe better as
>>>
>>>        MemOp mop = (target_words_bigendian() ? MO_BE : MO_LE) | size_memop(size);
>>>        adjust_endianness(mr, &mrfd.data, size_memop(size), mop);
>>>
>>
>> Do you think defining MO_TE as this expression is a good idea?
> 
> There are not so many references to MO_TE outside target/ or accel/tcg/.
> 
> Indeed, after this change, the only ones left are in hw/arm/armv7m.c,
> which (because it's Arm) can be changed to MO_LE.
> 

I see a bit more than that (17 files):
hw/arm/armv7m.c
include/exec/memop.h
target/arm/tcg/helper-a64.c
target/arm/tcg/translate.c
target/hexagon/idef-parser/parser-helpers.c
target/hppa/translate.c
target/i386/tcg/emit.c.inc
target/loongarch/tcg/insn_trans/trans_vec.c.inc
target/m68k/translate.c
target/mips/tcg/mips16e_translate.c.inc
target/riscv/translate.c
target/rx/translate.c
target/s390x/tcg/mem_helper.c
target/s390x/tcg/translate.c
target/s390x/tcg/translate_vx.c.inc
target/sparc/ldst_helper.c
target/sparc/translate.c

Plus more (22 files) who relies on:
MO_TE* variants (which relies on MO_TE transitively)

Thus my proposal to have a first change to MO_TE definition, and 
eventually do the change later.

What do you think?

> 
> r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 15/16] system/memory: make compilation unit common
  2025-03-10 18:04         ` Pierrick Bouvier
@ 2025-03-10 18:10           ` Richard Henderson
  2025-03-10 18:25             ` Pierrick Bouvier
  2025-03-10 18:27           ` Philippe Mathieu-Daudé
  1 sibling, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2025-03-10 18:10 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/10/25 11:04, Pierrick Bouvier wrote:
> On 3/10/25 10:58, Richard Henderson wrote:
>> On 3/10/25 10:47, Pierrick Bouvier wrote:
>>>> Maybe better as
>>>>
>>>>        MemOp mop = (target_words_bigendian() ? MO_BE : MO_LE) | size_memop(size);
>>>>        adjust_endianness(mr, &mrfd.data, size_memop(size), mop);
>>>>
>>>
>>> Do you think defining MO_TE as this expression is a good idea?
>>
>> There are not so many references to MO_TE outside target/ or accel/tcg/.
>>
>> Indeed, after this change, the only ones left are in hw/arm/armv7m.c,
>> which (because it's Arm) can be changed to MO_LE.
>>
> 
> I see a bit more than that (17 files):
> hw/arm/armv7m.c
> include/exec/memop.h
> target/arm/tcg/helper-a64.c
> target/arm/tcg/translate.c
> target/hexagon/idef-parser/parser-helpers.c
> target/hppa/translate.c
> target/i386/tcg/emit.c.inc
> target/loongarch/tcg/insn_trans/trans_vec.c.inc
> target/m68k/translate.c
> target/mips/tcg/mips16e_translate.c.inc
> target/riscv/translate.c
> target/rx/translate.c
> target/s390x/tcg/mem_helper.c
> target/s390x/tcg/translate.c
> target/s390x/tcg/translate_vx.c.inc
> target/sparc/ldst_helper.c
> target/sparc/translate.c
> 
> Plus more (22 files) who relies on:
> MO_TE* variants (which relies on MO_TE transitively)

As I said, *outside* target/ and accel/tcg/.

> Thus my proposal to have a first change to MO_TE definition, and eventually do the change 
> later.
> 
> What do you think?

I don't think a change to MO_TE is necessary.


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 15/16] system/memory: make compilation unit common
  2025-03-10 18:10           ` Richard Henderson
@ 2025-03-10 18:25             ` Pierrick Bouvier
  0 siblings, 0 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 18:25 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 11:10, Richard Henderson wrote:
> On 3/10/25 11:04, Pierrick Bouvier wrote:
>> On 3/10/25 10:58, Richard Henderson wrote:
>>> On 3/10/25 10:47, Pierrick Bouvier wrote:
>>>>> Maybe better as
>>>>>
>>>>>         MemOp mop = (target_words_bigendian() ? MO_BE : MO_LE) | size_memop(size);
>>>>>         adjust_endianness(mr, &mrfd.data, size_memop(size), mop);
>>>>>
>>>>
>>>> Do you think defining MO_TE as this expression is a good idea?
>>>
>>> There are not so many references to MO_TE outside target/ or accel/tcg/.
>>>
>>> Indeed, after this change, the only ones left are in hw/arm/armv7m.c,
>>> which (because it's Arm) can be changed to MO_LE.
>>>
>>
>> I see a bit more than that (17 files):
>> hw/arm/armv7m.c
>> include/exec/memop.h
>> target/arm/tcg/helper-a64.c
>> target/arm/tcg/translate.c
>> target/hexagon/idef-parser/parser-helpers.c
>> target/hppa/translate.c
>> target/i386/tcg/emit.c.inc
>> target/loongarch/tcg/insn_trans/trans_vec.c.inc
>> target/m68k/translate.c
>> target/mips/tcg/mips16e_translate.c.inc
>> target/riscv/translate.c
>> target/rx/translate.c
>> target/s390x/tcg/mem_helper.c
>> target/s390x/tcg/translate.c
>> target/s390x/tcg/translate_vx.c.inc
>> target/sparc/ldst_helper.c
>> target/sparc/translate.c
>>
>> Plus more (22 files) who relies on:
>> MO_TE* variants (which relies on MO_TE transitively)
> 
> As I said, *outside* target/ and accel/tcg/.
> 

Got it.
We can replace the hw/arm entry when tackling this part later.

>> Thus my proposal to have a first change to MO_TE definition, and eventually do the change
>> later.
>>
>> What do you think?
> 
> I don't think a change to MO_TE is necessary.
> 
> 
> r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 15/16] system/memory: make compilation unit common
  2025-03-10 18:04         ` Pierrick Bouvier
  2025-03-10 18:10           ` Richard Henderson
@ 2025-03-10 18:27           ` Philippe Mathieu-Daudé
  2025-03-10 18:38             ` Pierrick Bouvier
  1 sibling, 1 reply; 61+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-03-10 18:27 UTC (permalink / raw)
  To: Pierrick Bouvier, Richard Henderson, qemu-devel

On 10/3/25 19:04, Pierrick Bouvier wrote:
> On 3/10/25 10:58, Richard Henderson wrote:
>> On 3/10/25 10:47, Pierrick Bouvier wrote:
>>>> Maybe better as
>>>>
>>>>        MemOp mop = (target_words_bigendian() ? MO_BE : MO_LE) | 
>>>> size_memop(size);
>>>>        adjust_endianness(mr, &mrfd.data, size_memop(size), mop);
>>>>
>>>
>>> Do you think defining MO_TE as this expression is a good idea?
>>
>> There are not so many references to MO_TE outside target/ or accel/tcg/.
>>
>> Indeed, after this change, the only ones left are in hw/arm/armv7m.c,
>> which (because it's Arm) can be changed to MO_LE.
>>
> 
> I see a bit more than that (17 files):
> hw/arm/armv7m.c
> include/exec/memop.h
> target/arm/tcg/helper-a64.c
> target/arm/tcg/translate.c
> target/hexagon/idef-parser/parser-helpers.c
> target/hppa/translate.c
> target/i386/tcg/emit.c.inc
> target/loongarch/tcg/insn_trans/trans_vec.c.inc
> target/m68k/translate.c
> target/mips/tcg/mips16e_translate.c.inc
> target/riscv/translate.c
> target/rx/translate.c
> target/s390x/tcg/mem_helper.c
> target/s390x/tcg/translate.c
> target/s390x/tcg/translate_vx.c.inc
> target/sparc/ldst_helper.c
> target/sparc/translate.c

For targets tied to single endianness, we can replace using gsed,
but using a helper is clearer (see for example commit 415aae543ed
target/microblaze: Consider endianness while translating code").

> Plus more (22 files) who relies on:
> MO_TE* variants (which relies on MO_TE transitively)
> 
> Thus my proposal to have a first change to MO_TE definition, and 
> eventually do the change later.
> 
> What do you think?

Removing MO_TE is in my TODO list.

I started with Microblaze (now merged) to get familiar, then had
a look at ARM (see i.e.
https://lore.kernel.org/qemu-devel/20240924191932.49386-1-philmd@linaro.org/ 
and
https://lore.kernel.org/qemu-devel/d831600a-9a61-45c1-a535-f75bb64cdff4@linaro.org/).
I also took care of MIPS few years ago but I need to rebase,
however it isn't in the priority list.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 15/16] system/memory: make compilation unit common
  2025-03-10 18:27           ` Philippe Mathieu-Daudé
@ 2025-03-10 18:38             ` Pierrick Bouvier
  0 siblings, 0 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 18:38 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Richard Henderson, qemu-devel

On 3/10/25 11:27, Philippe Mathieu-Daudé wrote:
> On 10/3/25 19:04, Pierrick Bouvier wrote:
>> On 3/10/25 10:58, Richard Henderson wrote:
>>> On 3/10/25 10:47, Pierrick Bouvier wrote:
>>>>> Maybe better as
>>>>>
>>>>>         MemOp mop = (target_words_bigendian() ? MO_BE : MO_LE) |
>>>>> size_memop(size);
>>>>>         adjust_endianness(mr, &mrfd.data, size_memop(size), mop);
>>>>>
>>>>
>>>> Do you think defining MO_TE as this expression is a good idea?
>>>
>>> There are not so many references to MO_TE outside target/ or accel/tcg/.
>>>
>>> Indeed, after this change, the only ones left are in hw/arm/armv7m.c,
>>> which (because it's Arm) can be changed to MO_LE.
>>>
>>
>> I see a bit more than that (17 files):
>> hw/arm/armv7m.c
>> include/exec/memop.h
>> target/arm/tcg/helper-a64.c
>> target/arm/tcg/translate.c
>> target/hexagon/idef-parser/parser-helpers.c
>> target/hppa/translate.c
>> target/i386/tcg/emit.c.inc
>> target/loongarch/tcg/insn_trans/trans_vec.c.inc
>> target/m68k/translate.c
>> target/mips/tcg/mips16e_translate.c.inc
>> target/riscv/translate.c
>> target/rx/translate.c
>> target/s390x/tcg/mem_helper.c
>> target/s390x/tcg/translate.c
>> target/s390x/tcg/translate_vx.c.inc
>> target/sparc/ldst_helper.c
>> target/sparc/translate.c
> 
> For targets tied to single endianness, we can replace using gsed,
> but using a helper is clearer (see for example commit 415aae543ed
> target/microblaze: Consider endianness while translating code").
> 

That's good, I just want to keep this series focus on minimal changes to 
achieve the current goal, and not bring any additional refactoring here.

>> Plus more (22 files) who relies on:
>> MO_TE* variants (which relies on MO_TE transitively)
>>
>> Thus my proposal to have a first change to MO_TE definition, and
>> eventually do the change later.
>>
>> What do you think?
> 
> Removing MO_TE is in my TODO list.
> 
> I started with Microblaze (now merged) to get familiar, then had
> a look at ARM (see i.e.
> https://lore.kernel.org/qemu-devel/20240924191932.49386-1-philmd@linaro.org/
> and
> https://lore.kernel.org/qemu-devel/d831600a-9a61-45c1-a535-f75bb64cdff4@linaro.org/).
> I also took care of MIPS few years ago but I need to rebase,
> however it isn't in the priority list.

Instead of doing a full codebase refactoring/cleanup, we can adopt a "as 
needed basis" approach.

Basically architectures that can have varying endianness must be handled 
(since their compilation units are duplicated for variants).
For the rest, as Richard mentioned on this series, the code will stay 
target specific, compiling with a single set of defines, which is what 
we really aim for.

Same discussion will happen for architectures with files duplicated 
between 32/64 bits variants.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 00/16] make system memory API available for common code
  2025-03-10 16:56     ` Pierrick Bouvier
@ 2025-03-10 19:40       ` BALATON Zoltan
  2025-03-10 20:26         ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: BALATON Zoltan @ 2025-03-10 19:40 UTC (permalink / raw)
  To: Pierrick Bouvier
  Cc: qemu-devel, qemu-ppc, Alistair Francis, Richard Henderson,
	Harsh Prateek Bora, alex.bennee, Palmer Dabbelt,
	Daniel Henrique Barboza, kvm, Peter Xu, Nicholas Piggin,
	Liu Zhiwei, David Hildenbrand, Weiwei Li, Paul Durrant,
	Edgar E. Iglesias, Philippe Mathieu-Daudé, Anthony PERARD,
	Yoshinori Sato, manos.pitsidianakis, qemu-riscv, Paolo Bonzini,
	xen-devel, Stefano Stabellini

On Mon, 10 Mar 2025, Pierrick Bouvier wrote:
> On 3/10/25 09:28, Pierrick Bouvier wrote:
>> Hi Zoltan,
>> 
>> On 3/10/25 06:23, BALATON Zoltan wrote:
>>> On Sun, 9 Mar 2025, Pierrick Bouvier wrote:
>>>> The main goal of this series is to be able to call any memory ld/st 
>>>> function
>>>> from code that is *not* target dependent.
>>> 
>>> Why is that needed?
>>> 
>> 
>> this series belongs to the "single binary" topic, where we are trying to
>> build a single QEMU binary with all architectures embedded.

Yes I get it now, I just forgot as this wasn't mentioned so the goal 
wasn't obvious.

>> To achieve that, we need to have every single compilation unit compiled
>> only once, to be able to link a binary without any symbol conflict.
>> 
>> A consequence of that is target specific code (in terms of code relying
>> of target specific macros) needs to be converted to common code,
>> checking at runtime properties of the target we run. We are tackling
>> various places in QEMU codebase at the same time, which can be confusing
>> for the community members.

Mentioning this single binary in related series may help reminding readers 
about the context.

>> This series take care of system memory related functions and associated
>> compilation units in system/.
>> 
>>>> As a positive side effect, we can
>>>> turn related system compilation units into common code.
>>> 
>>> Are there any negative side effects? In particular have you done any
>>> performance benchmarking to see if this causes a measurable slow down?
>>> Such as with the STREAM benchmark:
>>> https://stackoverflow.com/questions/56086993/what-does-stream-memory-bandwidth-benchmark-really-measure
>>> 
>>> Maybe it would be good to have some performance tests similiar to
>>> functional tests that could be run like the CI tests to detect such
>>> performance changes. People report that QEMU is getting slower and slower
>>> with each release. Maybe it could be a GSoC project to make such tests but
>>> maybe we're too late for that.
>>> 
>> 
>> I agree with you, and it's something we have mentioned during our
>> "internal" conversations. Testing performance with existing functional
>> tests would already be a first good step. However, given the poor
>> reliability we have on our CI runners, I think it's a bit doomed.
>> 
>> Ideally, every QEMU release cycle should have a performance measurement
>> window to detect potential sources of regressions.

Maybe instead of aiming for full CI like performance testing something 
simpler like a few tests that excercise some apects each like STREAM that 
tests memory access, copying a file from network and/or disk that tests 
I/O and mp3 encode with lame for example that's supposed to test floating 
point and SIMD might be simpler to do. It could be made a bootable image 
that just runs the test and reports a number (I did that before for 
qemu-system-ppc when we wanted to test an issue that on some hosts it ran 
slower). Such test could be run by somebody making changes so they could 
call these before and after their patch to quickly check if there's 
anything to improve. This may be less through then full performance 
testing but still give some insight and better than not testing anything 
for performance.

I'm bringig this topic up to try to keep awareness on this so QEMU can 
remain true to its name. (Although I'm not sure if originally the Q in the 
name stood for the time it took to write or its performance but it's 
hopefully still a goal to keep it fast.)

>> To answer to your specific question, I am trying first to get a review
>> on the approach taken. We can always optimize in next series version, in
>> case we identify it's a big deal to introduce a branch for every memory
>> related function call.

I'm not sure we can always optimise after the fact so sometimes it can be 
necessary to take performance in consideration while designing changes.

>> In all cases, transforming code relying on compile time
>> optimization/dead code elimination through defines to runtime checks
>> will *always* have an impact,

Yes, that's why it would be good to know how much impact is that.

>> even though it should be minimal in most of cases.

Hopefully but how do we know if we don't even test for it?

>> But the maintenance and compilation time benefits, as well as
>> the perspectives it opens (single binary, heterogeneous emulation, use
>> QEMU as a library) are worth it IMHO.

I'm not so sure about that. Heterogeneous emulation sounds interesting but 
is it needed most of the time? Using QEMU as a library also may not be 
common and limited by licencing. The single binary would simplify packages 
but then this binary may get huge so it's slower to load, may take more 
resources to run and more time to compile and if somebody only needs one 
architecture why do I want to include all of the others and wait for it to 
compile using up a lot of space on my disk? So in other words, while these 
are interesting and good goals could it be achieved with keeping the 
current way of building single ARCH binary as opposed to single binary 
with multiple archs and not throwing out the optimisations a single arch 
binary can use? Which one is better may depend on the use case so if 
possible it would be better to allow both keeping what we have and adding 
multi arch binary on top not replacing the current way completely.

>>> Regards,
>>> BALATON Zoltan
>> 
>> Regards,
>> Pierrick
>> 
>
> As a side note, we recently did some work around performance analysis (for 
> aarch64), as you can see here [1]. In the end, QEMU performance depends

Thank you, very interesting read.

> (roughly in this order) on:
> 1. quality of code generated by TCG
> 2. helper code to implement instructions
> 3. mmu emulation
>
> Other state of the art translators that exist are faster (fex, box64) mainly 
> by enhancing 1, and relying on various tricks to avoid translating some 
> libraries calls. But those translators are host/target specific, and the 
> ratio of instructions generated (vs target ones read) is much lower than 
> QEMU. In the experimentation listed in the blog, I observed that for 
> qemu-system-aarch64, we have an average expansion factor of around 18 (1 
> guest insn translates to 18 host ones).
>
> For users seeing performance decreases, beyond the QEMU code changes, adding 
> new target instructions may add new helpers, which may be called by the stack 
> people use, and they can sometimes observe a slower behaviour.

I'm mostly interested in emulating PPC for older and obscure OSes running 
on older hardware so there new instructions isn't a problem. Most of the 
time MMU emulation, helpers and TCG code generation is mostly dominating 
there and on PPC particularly the lack of hard float usage. Apart from 
that maybe some device emulations but that's a different topic. This is 
already slow so any overhead introduced at lowest levels just adds to 
that and target specific optimisation may only get back what's lost 
elsewhere.

Regards,
BALATON Zoltan

> There are probably some other low hanging fruits for other target 
> architectures.
>
> [1] https://www.linaro.org/blog/qemu-a-tale-of-performance-analysis/
>
>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions
  2025-03-10 17:09             ` Pierrick Bouvier
@ 2025-03-10 20:17               ` BALATON Zoltan
  2025-03-10 20:31                 ` Pierrick Bouvier
  0 siblings, 1 reply; 61+ messages in thread
From: BALATON Zoltan @ 2025-03-10 20:17 UTC (permalink / raw)
  To: Pierrick Bouvier; +Cc: Richard Henderson, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3015 bytes --]

On Mon, 10 Mar 2025, Pierrick Bouvier wrote:
> On 3/10/25 09:53, Richard Henderson wrote:
>> On 3/10/25 09:43, Pierrick Bouvier wrote:
>>> On 3/10/25 09:37, Richard Henderson wrote:
>>>> On 3/10/25 09:14, Pierrick Bouvier wrote:
>>>>> On 3/10/25 09:08, Richard Henderson wrote:
>>>>>> On 3/9/25 21:58, Pierrick Bouvier wrote:
>>>>>>> For now, they are duplicate of the same macros in cpu-all.h that we
>>>>>>> eliminate in next commit.
>>>>>>> 
>>>>>>> Keep code readable by not defining them with macros, but simply their
>>>>>>> implementation.
>>>>>>> 
>>>>>>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>>>>>> 
>>>>>> Why do you want these in bswap.h, rather than tswap.h?
>>>>>> They're target swaps, after all.
>>>>>> 
>>>>>> 
>>>>>> r~
>>>>> 
>>>>> No preference on that, I simply added them to the same file than their 
>>>>> explicit endianness
>>>>> variant. Would you prefer the endianness agnostic variant to be in 
>>>>> tswap.h instead?
>>>> 
>>>> I think I would.
>>> 
>>> Ok, I will move it.
>>> 
>>>> 
>>>> In addition, I think we want
>>>> 
>>>> #ifdef COMPILING_PER_TARGET
>>>> #define target_words_bigendian()  TARGET_BIG_ENDIAN
>>>> #else
>>>> bool target_words_bigendian(void);
>>>> #endif
>>>> 
>>>> moving the conditional from around target_needs_bswap just below.
>>>> 
>>>> With that, we eliminate the extra branch that you're otherwise
>>>> adding to target-specific code with this patch.
>>>> 
>>> 
>>> I understand the change requested, but should we really aim in that 
>>> direction? In the end,
>>> if we pursue the compilation units deduplication, the branch will be 
>>> present anyway.
>>> 
>>> I'm ok with your change, just asking if we really want to preserve target 
>>> specific code
>>> until the "end".
>> 
>> All of target/ is target specific.  De-duplication will not eliminate that.
>> 
>
> My vocabulary was wrong here. I meant "if we want to preserve target specific 
> macros" until the end.
> Sure, there will always be compilation units (devices, cpus, helpers, ...) 
> specific to a target. I just wonder if sticking to ifdef paradigm for this 
> kind of code is worth the "optimization" we are supposed to get.

I've already tried to say that in the previous reply but maybe I can 
explain it better here. I think keeping per target binaries would be 
desired so single binary would not replace it just become an additional 
option. For example when I want to play with old stuff I compile with 
--target-list=ppc-softmmu and don't want to wait compiling all the other 
targets I don't use and not even interested in PPC64. A distro may want to 
ship a single qemu-system binary instead but other distros may prefer per 
target packages not one huge package so users can decide which ones to 
install. All of these are valid use cases, therefore this single binary 
should be an additional option not the only true way from now on 
replacing existing per target builds.

Regards,
BALATON Zoltan

> I'll add the change requested.
>
>> 
>> r~
>
>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 00/16] make system memory API available for common code
  2025-03-10 19:40       ` BALATON Zoltan
@ 2025-03-10 20:26         ` Pierrick Bouvier
  2025-03-10 21:02           ` BALATON Zoltan
  0 siblings, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 20:26 UTC (permalink / raw)
  To: BALATON Zoltan
  Cc: qemu-devel, qemu-ppc, Alistair Francis, Richard Henderson,
	Harsh Prateek Bora, alex.bennee, Palmer Dabbelt,
	Daniel Henrique Barboza, kvm, Peter Xu, Nicholas Piggin,
	Liu Zhiwei, David Hildenbrand, Weiwei Li, Paul Durrant,
	Edgar E. Iglesias, Philippe Mathieu-Daudé, Anthony PERARD,
	Yoshinori Sato, manos.pitsidianakis, qemu-riscv, Paolo Bonzini,
	xen-devel, Stefano Stabellini

On 3/10/25 12:40, BALATON Zoltan wrote:
> On Mon, 10 Mar 2025, Pierrick Bouvier wrote:
>> On 3/10/25 09:28, Pierrick Bouvier wrote:
>>> Hi Zoltan,
>>>
>>> On 3/10/25 06:23, BALATON Zoltan wrote:
>>>> On Sun, 9 Mar 2025, Pierrick Bouvier wrote:
>>>>> The main goal of this series is to be able to call any memory ld/st
>>>>> function
>>>>> from code that is *not* target dependent.
>>>>
>>>> Why is that needed?
>>>>
>>>
>>> this series belongs to the "single binary" topic, where we are trying to
>>> build a single QEMU binary with all architectures embedded.
> 
> Yes I get it now, I just forgot as this wasn't mentioned so the goal
> wasn't obvious.
>

The more I work on this topic, the more I realize we miss a clear and 
concise document (wiki page, or anything than can be edited easily - not 
email) explaining this to other developers, and that we could share as a 
link, and enhance based on the questions asked.

>>> To achieve that, we need to have every single compilation unit compiled
>>> only once, to be able to link a binary without any symbol conflict.
>>>
>>> A consequence of that is target specific code (in terms of code relying
>>> of target specific macros) needs to be converted to common code,
>>> checking at runtime properties of the target we run. We are tackling
>>> various places in QEMU codebase at the same time, which can be confusing
>>> for the community members.
> 
> Mentioning this single binary in related series may help reminding readers
> about the context.
> 

I'll make sure to mention this "name" in the title for next series, thanks!

>>> This series take care of system memory related functions and associated
>>> compilation units in system/.
>>>
>>>>> As a positive side effect, we can
>>>>> turn related system compilation units into common code.
>>>>
>>>> Are there any negative side effects? In particular have you done any
>>>> performance benchmarking to see if this causes a measurable slow down?
>>>> Such as with the STREAM benchmark:
>>>> https://stackoverflow.com/questions/56086993/what-does-stream-memory-bandwidth-benchmark-really-measure
>>>>
>>>> Maybe it would be good to have some performance tests similiar to
>>>> functional tests that could be run like the CI tests to detect such
>>>> performance changes. People report that QEMU is getting slower and slower
>>>> with each release. Maybe it could be a GSoC project to make such tests but
>>>> maybe we're too late for that.
>>>>
>>>
>>> I agree with you, and it's something we have mentioned during our
>>> "internal" conversations. Testing performance with existing functional
>>> tests would already be a first good step. However, given the poor
>>> reliability we have on our CI runners, I think it's a bit doomed.
>>>
>>> Ideally, every QEMU release cycle should have a performance measurement
>>> window to detect potential sources of regressions.
> 
> Maybe instead of aiming for full CI like performance testing something
> simpler like a few tests that excercise some apects each like STREAM that
> tests memory access, copying a file from network and/or disk that tests
> I/O and mp3 encode with lame for example that's supposed to test floating
> point and SIMD might be simpler to do. It could be made a bootable image
> that just runs the test and reports a number (I did that before for
> qemu-system-ppc when we wanted to test an issue that on some hosts it ran
> slower). Such test could be run by somebody making changes so they could
> call these before and after their patch to quickly check if there's
> anything to improve. This may be less through then full performance
> testing but still give some insight and better than not testing anything
> for performance.
> 
> I'm bringig this topic up to try to keep awareness on this so QEMU can
> remain true to its name. (Although I'm not sure if originally the Q in the
> name stood for the time it took to write or its performance but it's
> hopefully still a goal to keep it fast.)
> 

You do well to remind that, but as always, the problem is that "run by 
somebody" is not an enforceable process.

>>> To answer to your specific question, I am trying first to get a review
>>> on the approach taken. We can always optimize in next series version, in
>>> case we identify it's a big deal to introduce a branch for every memory
>>> related function call.
> 
> I'm not sure we can always optimise after the fact so sometimes it can be
> necessary to take performance in consideration while designing changes.
> 

In the context of single binary concerned series, we mostly introduce a 
few branches in various spots, to do a runtime check.
As Richard mentioned in this series, we can keep target code exactly as 
it is.

>>> In all cases, transforming code relying on compile time
>>> optimization/dead code elimination through defines to runtime checks
>>> will *always* have an impact,
> 
> Yes, that's why it would be good to know how much impact is that.
> 
>>> even though it should be minimal in most of cases.
> 
> Hopefully but how do we know if we don't even test for it?
> 

In the case of this series, I usually so a local test booting 
(automatically) an x64 debian stable vm, that poweroff itself as part of 
its init.

With and without this series, the variation is below the average one I 
have between two runs (<1 sec, for a total of 40 seconds), so the impact 
is litterally invisible.

>>> But the maintenance and compilation time benefits, as well as
>>> the perspectives it opens (single binary, heterogeneous emulation, use
>>> QEMU as a library) are worth it IMHO.
> 
> I'm not so sure about that. Heterogeneous emulation sounds interesting but
> is it needed most of the time? Using QEMU as a library also may not be
> common and limited by licencing. The single binary would simplify packages
> but then this binary may get huge so it's slower to load, may take more
> resources to run and more time to compile and if somebody only needs one
> architecture why do I want to include all of the others and wait for it to
> compile using up a lot of space on my disk? So in other words, while these
> are interesting and good goals could it be achieved with keeping the
> current way of building single ARCH binary as opposed to single binary
> with multiple archs and not throwing out the optimisations a single arch
> binary can use? Which one is better may depend on the use case so if
> possible it would be better to allow both keeping what we have and adding
> multi arch binary on top not replacing the current way completely.
> 

Thanks, it's definitely interesting to hear the concerns on this, so we 
can address them, and find the best and minimal solution to achive the 
desired goal.

I'll answer point by point.

QEMU as a library: that's what Unicorn is 
(https://www.unicorn-engine.org/docs/beyond_qemu.html), which is used by 
a lot of researchers. Talking frequently with some of them, they would 
be happy to have such a library directly with upstream QEMU, so it can 
benefit from all the enhancements done to TCG. It's mostly a use case 
for security researchers/engineers, but definitely a valid one. Just 
look at the list of QEMU downstream forks focused on that. Combining 
this with plugins would be amazing, and only grow our list of users.

For the heterogeneous scenario, yes it's not the most common case. But 
we *must*, in terms of QEMU binary, be able to have a single binary 
first. By that, I mean the need is to be able to link a binary with 
several arch present, without any symbol conflict.

The other approach possible is to rename many functions through QEMU 
codebase by adding a target_prefix everywhere, which would be ugly and 
endless. That's why we are currently using the "remove duplicated 
compilation units" pragmatic approach. As well, we can do a lot of 
headers cleanup on the way (removing useless dependencies), which is 
good for everyone.

For compilation times, it will only speed it up, because in case you 
have only specific targets, non-needed files won't be compiled/linked. 
For multi target setup, it's only a speed up (with all targets, it would 
be a drop from 9000+ CUs to around 4000+). Less disk space as well, most 
notable in debug.
As well, having files compiled only once allow to use reliably code 
indexation tools (clangd for instance), instead of picking a random CU 
setting based on one target.
Finally, having a single binary would mean it's easy to use LTO (or at 
least distros would use it easily), and get the same or better 
performance as what we have today.

The "current" way, with several binaries, can be kept forever if people 
wants. But it's not feasible to keep headers and cu compatible for both 
modes. It would be a lot of code duplication, and that is really not 
desirable IMHO. So we need to do those system wide changes and convince 
the community it's a good progress for everyone.

Kudos to Philippe who has been doing this long and tedious work for 
several years now, and I hope that with some fresh eyes/blood, it can be 
completed soon.

>>>> Regards,
>>>> BALATON Zoltan
>>>
>>> Regards,
>>> Pierrick
>>>
>>
>> As a side note, we recently did some work around performance analysis (for
>> aarch64), as you can see here [1]. In the end, QEMU performance depends
> 
> Thank you, very interesting read.
> 
>> (roughly in this order) on:
>> 1. quality of code generated by TCG
>> 2. helper code to implement instructions
>> 3. mmu emulation
>>
>> Other state of the art translators that exist are faster (fex, box64) mainly
>> by enhancing 1, and relying on various tricks to avoid translating some
>> libraries calls. But those translators are host/target specific, and the
>> ratio of instructions generated (vs target ones read) is much lower than
>> QEMU. In the experimentation listed in the blog, I observed that for
>> qemu-system-aarch64, we have an average expansion factor of around 18 (1
>> guest insn translates to 18 host ones).
>>
>> For users seeing performance decreases, beyond the QEMU code changes, adding
>> new target instructions may add new helpers, which may be called by the stack
>> people use, and they can sometimes observe a slower behaviour.
> 
> I'm mostly interested in emulating PPC for older and obscure OSes running
> on older hardware so there new instructions isn't a problem. Most of the
> time MMU emulation, helpers and TCG code generation is mostly dominating
> there and on PPC particularly the lack of hard float usage. Apart from
> that maybe some device emulations but that's a different topic. This is
> already slow so any overhead introduced at lowest levels just adds to
> that and target specific optimisation may only get back what's lost
> elsewhere.
> 

One think we really lack for now is how to measure generated code 
quality. I mean, to know how far we are from the optimal translation.
For mmu and helper code, it's easy, as this appear in any profiling.
But for the rest, it's kind of a blackbox.

Once again, having QEMU has a library (TCG more precisely) would be 
something very beneficial to work on that.

> Regards,
> BALATON Zoltan
> 
>> There are probably some other low hanging fruits for other target
>> architectures.
>>
>> [1] https://www.linaro.org/blog/qemu-a-tale-of-performance-analysis/
>>
>>



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions
  2025-03-10 20:17               ` BALATON Zoltan
@ 2025-03-10 20:31                 ` Pierrick Bouvier
  0 siblings, 0 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-10 20:31 UTC (permalink / raw)
  To: BALATON Zoltan; +Cc: Richard Henderson, qemu-devel

On 3/10/25 13:17, BALATON Zoltan wrote:
> On Mon, 10 Mar 2025, Pierrick Bouvier wrote:
>> On 3/10/25 09:53, Richard Henderson wrote:
>>> On 3/10/25 09:43, Pierrick Bouvier wrote:
>>>> On 3/10/25 09:37, Richard Henderson wrote:
>>>>> On 3/10/25 09:14, Pierrick Bouvier wrote:
>>>>>> On 3/10/25 09:08, Richard Henderson wrote:
>>>>>>> On 3/9/25 21:58, Pierrick Bouvier wrote:
>>>>>>>> For now, they are duplicate of the same macros in cpu-all.h that we
>>>>>>>> eliminate in next commit.
>>>>>>>>
>>>>>>>> Keep code readable by not defining them with macros, but simply their
>>>>>>>> implementation.
>>>>>>>>
>>>>>>>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>>>>>>>
>>>>>>> Why do you want these in bswap.h, rather than tswap.h?
>>>>>>> They're target swaps, after all.
>>>>>>>
>>>>>>>
>>>>>>> r~
>>>>>>
>>>>>> No preference on that, I simply added them to the same file than their
>>>>>> explicit endianness
>>>>>> variant. Would you prefer the endianness agnostic variant to be in
>>>>>> tswap.h instead?
>>>>>
>>>>> I think I would.
>>>>
>>>> Ok, I will move it.
>>>>
>>>>>
>>>>> In addition, I think we want
>>>>>
>>>>> #ifdef COMPILING_PER_TARGET
>>>>> #define target_words_bigendian()  TARGET_BIG_ENDIAN
>>>>> #else
>>>>> bool target_words_bigendian(void);
>>>>> #endif
>>>>>
>>>>> moving the conditional from around target_needs_bswap just below.
>>>>>
>>>>> With that, we eliminate the extra branch that you're otherwise
>>>>> adding to target-specific code with this patch.
>>>>>
>>>>
>>>> I understand the change requested, but should we really aim in that
>>>> direction? In the end,
>>>> if we pursue the compilation units deduplication, the branch will be
>>>> present anyway.
>>>>
>>>> I'm ok with your change, just asking if we really want to preserve target
>>>> specific code
>>>> until the "end".
>>>
>>> All of target/ is target specific.  De-duplication will not eliminate that.
>>>
>>
>> My vocabulary was wrong here. I meant "if we want to preserve target specific
>> macros" until the end.
>> Sure, there will always be compilation units (devices, cpus, helpers, ...)
>> specific to a target. I just wonder if sticking to ifdef paradigm for this
>> kind of code is worth the "optimization" we are supposed to get.
> 
> I've already tried to say that in the previous reply but maybe I can
> explain it better here. I think keeping per target binaries would be
> desired so single binary would not replace it just become an additional
> option. For example when I want to play with old stuff I compile with
> --target-list=ppc-softmmu and don't want to wait compiling all the other
> targets I don't use and not even interested in PPC64. A distro may want to
> ship a single qemu-system binary instead but other distros may prefer per
> target packages not one huge package so users can decide which ones to
> install. All of these are valid use cases, therefore this single binary
> should be an additional option not the only true way from now on
> replacing existing per target builds.
> 

This valid use case is and *will* stay the default. There is no secret 
evil plan to break people habits here :).

For now, we don't even introduce a "single binary" configure option, 
simply because the resulting code could not link, so there is no point.

If you're curious about this, it is the command we use to check the work 
left before being able to do it:
# configure with any target list, we use all in our case
$ ./configure ...
$ jq --raw-output < build/compile_commands.json '.[].file' | sort |
   uniq -c | sort -rn | grep -v '^\s*1 '

Once this list is empty, we'll be able to link and execute a single 
binary (and adding a additional new main() for that). But first things 
first.

> Regards,
> BALATON Zoltan
> 
>> I'll add the change requested.
>>
>>>
>>> r~
>>
>>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 00/16] make system memory API available for common code
  2025-03-10 20:26         ` Pierrick Bouvier
@ 2025-03-10 21:02           ` BALATON Zoltan
  0 siblings, 0 replies; 61+ messages in thread
From: BALATON Zoltan @ 2025-03-10 21:02 UTC (permalink / raw)
  To: Pierrick Bouvier
  Cc: qemu-devel, qemu-ppc, Alistair Francis, Richard Henderson,
	Harsh Prateek Bora, alex.bennee, Palmer Dabbelt,
	Daniel Henrique Barboza, kvm, Peter Xu, Nicholas Piggin,
	Liu Zhiwei, David Hildenbrand, Weiwei Li, Paul Durrant,
	Edgar E. Iglesias, Philippe Mathieu-Daudé, Anthony PERARD,
	Yoshinori Sato, manos.pitsidianakis, qemu-riscv, Paolo Bonzini,
	xen-devel, Stefano Stabellini

On Mon, 10 Mar 2025, Pierrick Bouvier wrote:
> On 3/10/25 12:40, BALATON Zoltan wrote:
>> On Mon, 10 Mar 2025, Pierrick Bouvier wrote:
>>> On 3/10/25 09:28, Pierrick Bouvier wrote:
>>>> Hi Zoltan,
>>>> 
>>>> On 3/10/25 06:23, BALATON Zoltan wrote:
>>>>> On Sun, 9 Mar 2025, Pierrick Bouvier wrote:
>>>>>> The main goal of this series is to be able to call any memory ld/st
>>>>>> function
>>>>>> from code that is *not* target dependent.
>>>>> 
>>>>> Why is that needed?
>>>>> 
>>>> 
>>>> this series belongs to the "single binary" topic, where we are trying to
>>>> build a single QEMU binary with all architectures embedded.
>> 
>> Yes I get it now, I just forgot as this wasn't mentioned so the goal
>> wasn't obvious.
>> 
>
> The more I work on this topic, the more I realize we miss a clear and concise 
> document (wiki page, or anything than can be edited easily - not email) 
> explaining this to other developers, and that we could share as a link, and 
> enhance based on the questions asked.

Maybe you can start collecting FAQ on a wiki page so you don't have to 
answer them multiple times. I think most people aware of this though just 
may not associate a series with it if not mentioned in the description.

>>>> To achieve that, we need to have every single compilation unit compiled
>>>> only once, to be able to link a binary without any symbol conflict.
>>>> 
>>>> A consequence of that is target specific code (in terms of code relying
>>>> of target specific macros) needs to be converted to common code,
>>>> checking at runtime properties of the target we run. We are tackling
>>>> various places in QEMU codebase at the same time, which can be confusing
>>>> for the community members.
>> 
>> Mentioning this single binary in related series may help reminding readers
>> about the context.
>> 
>
> I'll make sure to mention this "name" in the title for next series, thanks!
>
>>>> This series take care of system memory related functions and associated
>>>> compilation units in system/.
>>>> 
>>>>>> As a positive side effect, we can
>>>>>> turn related system compilation units into common code.
>>>>> 
>>>>> Are there any negative side effects? In particular have you done any
>>>>> performance benchmarking to see if this causes a measurable slow down?
>>>>> Such as with the STREAM benchmark:
>>>>> https://stackoverflow.com/questions/56086993/what-does-stream-memory-bandwidth-benchmark-really-measure
>>>>> 
>>>>> Maybe it would be good to have some performance tests similiar to
>>>>> functional tests that could be run like the CI tests to detect such
>>>>> performance changes. People report that QEMU is getting slower and 
>>>>> slower
>>>>> with each release. Maybe it could be a GSoC project to make such tests 
>>>>> but
>>>>> maybe we're too late for that.
>>>>> 
>>>> 
>>>> I agree with you, and it's something we have mentioned during our
>>>> "internal" conversations. Testing performance with existing functional
>>>> tests would already be a first good step. However, given the poor
>>>> reliability we have on our CI runners, I think it's a bit doomed.
>>>> 
>>>> Ideally, every QEMU release cycle should have a performance measurement
>>>> window to detect potential sources of regressions.
>> 
>> Maybe instead of aiming for full CI like performance testing something
>> simpler like a few tests that excercise some apects each like STREAM that
>> tests memory access, copying a file from network and/or disk that tests
>> I/O and mp3 encode with lame for example that's supposed to test floating
>> point and SIMD might be simpler to do. It could be made a bootable image
>> that just runs the test and reports a number (I did that before for
>> qemu-system-ppc when we wanted to test an issue that on some hosts it ran
>> slower). Such test could be run by somebody making changes so they could
>> call these before and after their patch to quickly check if there's
>> anything to improve. This may be less through then full performance
>> testing but still give some insight and better than not testing anything
>> for performance.
>> 
>> I'm bringig this topic up to try to keep awareness on this so QEMU can
>> remain true to its name. (Although I'm not sure if originally the Q in the
>> name stood for the time it took to write or its performance but it's
>> hopefully still a goal to keep it fast.)
>> 
>
> You do well to remind that, but as always, the problem is that "run by 
> somebody" is not an enforceable process.
>
>>>> To answer to your specific question, I am trying first to get a review
>>>> on the approach taken. We can always optimize in next series version, in
>>>> case we identify it's a big deal to introduce a branch for every memory
>>>> related function call.
>> 
>> I'm not sure we can always optimise after the fact so sometimes it can be
>> necessary to take performance in consideration while designing changes.
>> 
>
> In the context of single binary concerned series, we mostly introduce a few 
> branches in various spots, to do a runtime check.
> As Richard mentioned in this series, we can keep target code exactly as it 
> is.
>
>>>> In all cases, transforming code relying on compile time
>>>> optimization/dead code elimination through defines to runtime checks
>>>> will *always* have an impact,
>> 
>> Yes, that's why it would be good to know how much impact is that.
>> 
>>>> even though it should be minimal in most of cases.
>> 
>> Hopefully but how do we know if we don't even test for it?
>> 
>
> In the case of this series, I usually so a local test booting (automatically) 
> an x64 debian stable vm, that poweroff itself as part of its init.
>
> With and without this series, the variation is below the average one I have 
> between two runs (<1 sec, for a total of 40 seconds), so the impact is 
> litterally invisible.

That's good to hear. Some overhead which is unavoidable is OK I just hope 
we can avoid which is not unavoidable and try to do something about what 
would have noticable performance penalty. If you're already aware of that 
and do that then that's all I wanted to say, nothing new.

>>>> But the maintenance and compilation time benefits, as well as
>>>> the perspectives it opens (single binary, heterogeneous emulation, use
>>>> QEMU as a library) are worth it IMHO.
>> 
>> I'm not so sure about that. Heterogeneous emulation sounds interesting but
>> is it needed most of the time? Using QEMU as a library also may not be
>> common and limited by licencing. The single binary would simplify packages
>> but then this binary may get huge so it's slower to load, may take more
>> resources to run and more time to compile and if somebody only needs one
>> architecture why do I want to include all of the others and wait for it to
>> compile using up a lot of space on my disk? So in other words, while these
>> are interesting and good goals could it be achieved with keeping the
>> current way of building single ARCH binary as opposed to single binary
>> with multiple archs and not throwing out the optimisations a single arch
>> binary can use? Which one is better may depend on the use case so if
>> possible it would be better to allow both keeping what we have and adding
>> multi arch binary on top not replacing the current way completely.
>> 
>
> Thanks, it's definitely interesting to hear the concerns on this, so we can 
> address them, and find the best and minimal solution to achive the desired 
> goal.
>
> I'll answer point by point.
>
> QEMU as a library: that's what Unicorn is 
> (https://www.unicorn-engine.org/docs/beyond_qemu.html), which is used by a 
> lot of researchers. Talking frequently with some of them, they would be happy 
> to have such a library directly with upstream QEMU, so it can benefit from 
> all the enhancements done to TCG. It's mostly a use case for security 
> researchers/engineers, but definitely a valid one. Just look at the list of 
> QEMU downstream forks focused on that. Combining this with plugins would be 
> amazing, and only grow our list of users.
>
> For the heterogeneous scenario, yes it's not the most common case. But we 
> *must*, in terms of QEMU binary, be able to have a single binary first. By 
> that, I mean the need is to be able to link a binary with several arch 
> present, without any symbol conflict.

OK Unicorn engine explains it and it needs multiple targets in single 
library (which maybe is the real goal, not a single binary here and that 
only needs targets not all devices). By the way I think multiple-arch is 
what they really mean on that beyond_qemu.html page above under 
Thread-safety.

> The other approach possible is to rename many functions through QEMU codebase 
> by adding a target_prefix everywhere, which would be ugly and endless. That's 
> why we are currently using the "remove duplicated compilation units" 
> pragmatic approach. As well, we can do a lot of headers cleanup on the way 
> (removing useless dependencies), which is good for everyone.
>
> For compilation times, it will only speed it up, because in case you have 
> only specific targets, non-needed files won't be compiled/linked. For multi 
> target setup, it's only a speed up (with all targets, it would be a drop from 
> 9000+ CUs to around 4000+). Less disk space as well, most notable in debug.
> As well, having files compiled only once allow to use reliably code 
> indexation tools (clangd for instance), instead of picking a random CU 
> setting based on one target.
> Finally, having a single binary would mean it's easy to use LTO (or at least 
> distros would use it easily), and get the same or better performance as what 
> we have today.
>
> The "current" way, with several binaries, can be kept forever if people

As I said I think that would be needed as there are valid use cases for 
both.

> wants. But it's not feasible to keep headers and cu compatible for both 
> modes. It would be a lot of code duplication, and that is really not 
> desirable IMHO. So we need to do those system wide changes and convince the 
> community it's a good progress for everyone.

It would be nice to keep optimisations where possible and it seems it 
might be possible sometimes so just take that in consideration as well not 
just one goal.

> Kudos to Philippe who has been doing this long and tedious work for several 
> years now, and I hope that with some fresh eyes/blood, it can be completed 
> soon.

Absolutely and I did not mean to say not to do it just added another view 
point for consideration.

Regards,
BALATON Zoltan


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h
  2025-03-10 15:17   ` Richard Henderson
  2025-03-10 16:03     ` Pierrick Bouvier
@ 2025-03-11  0:04     ` Pierrick Bouvier
  2025-03-11 14:40       ` Richard Henderson
  1 sibling, 1 reply; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-11  0:04 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/10/25 08:17, Richard Henderson wrote:
> On 3/9/25 21:58, Pierrick Bouvier wrote:
>> They are now accessible through exec/memory.h instead, and we make sure
>> all variants are available for common or target dependent code.
> ...
>> diff --git a/include/exec/memory_ldst.h.inc b/include/exec/memory_ldst.h.inc
>> index 92ad74e9560..74519a88de0 100644
>> --- a/include/exec/memory_ldst.h.inc
>> +++ b/include/exec/memory_ldst.h.inc
>> @@ -19,7 +19,8 @@
>>     * License along with this library; if not, see <http://www.gnu.org/licenses/>.
>>     */
>>    
>> -#ifdef TARGET_ENDIANNESS
>> +uint8_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
>> +    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
>>    uint16_t glue(address_space_lduw, SUFFIX)(ARG1_DECL,
>>        hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
> 
> You shouldn't be exposing
> 
>     address_space_lduw
> 
> to common code, only
> 
>     address_space_lduw_be
>     address_space_lduw_le
> 
> etc.  I'm not sure what you're trying to do here.
> 
> 
> r~

Taking another look at this, I don't see how we can avoid to expose 
those functions to common code.

The goal of those first two patches is to break the dependency to cpu.h, 
which can't be included from common code, and thus prevent any common 
code wanting to do memory op to access it.

All the calls to ld*_phys() and address_space_*() are indeed done from 
code related to a target (some of it in hw/). However, as we now move 
those compilation units to common, they still need to be accessed. The 
semantic stays exactly the same.

 From what I understand, non endian versions are simply passing 
DEVICE_NATIVE_ENDIAN as a parameter for address_space_ldl_internal, 
which will thus match the target endianness.

So what is the risk for common code to call this?


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 02/16] exec/memory_ldst_phys: extract memory_ldst_phys declarations from cpu-all.h
  2025-03-10  4:58 ` [PATCH 02/16] exec/memory_ldst_phys: extract memory_ldst_phys " Pierrick Bouvier
@ 2025-03-11  0:08   ` Pierrick Bouvier
  0 siblings, 0 replies; 61+ messages in thread
From: Pierrick Bouvier @ 2025-03-11  0:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-ppc, Alistair Francis, Richard Henderson, Harsh Prateek Bora,
	alex.bennee, Palmer Dabbelt, Daniel Henrique Barboza, kvm,
	Peter Xu, Nicholas Piggin, Liu Zhiwei, David Hildenbrand,
	Weiwei Li, Paul Durrant, Edgar E. Iglesias,
	Philippe Mathieu-Daudé, Anthony PERARD, Yoshinori Sato,
	manos.pitsidianakis, qemu-riscv, Paolo Bonzini, xen-devel,
	Stefano Stabellini

On 3/9/25 21:58, Pierrick Bouvier wrote:
> They are now accessible through exec/memory.h instead, and we make sure
> all variants are available for common or target dependent code.
> 
> To allow this, we need to implement address_space_st{*}_cached, simply
> forwarding the calls to _cached_slow variants.
> 

It's not needed, following inclusion will do it.

#define ENDIANNESS
+#include "exec/memory_ldst_cached.h.inc"

> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
>   include/exec/cpu-all.h              | 15 ------------
>   include/exec/memory.h               | 36 +++++++++++++++++++++++++++++
>   include/exec/memory_ldst_phys.h.inc |  5 +---
>   3 files changed, 37 insertions(+), 19 deletions(-)
> 
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index 17ea82518a0..1c2e18f492b 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -75,21 +75,6 @@ static inline void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val
>                                  MEMTXATTRS_UNSPECIFIED, NULL);
>   }
>   
> -#define SUFFIX
> -#define ARG1         as
> -#define ARG1_DECL    AddressSpace *as
> -#define TARGET_ENDIANNESS
> -#include "exec/memory_ldst_phys.h.inc"
> -
> -/* Inline fast path for direct RAM access.  */
> -#define ENDIANNESS
> -#include "exec/memory_ldst_cached.h.inc"
> -
> -#define SUFFIX       _cached
> -#define ARG1         cache
> -#define ARG1_DECL    MemoryRegionCache *cache
> -#define TARGET_ENDIANNESS
> -#include "exec/memory_ldst_phys.h.inc"
>   #endif
>   
>   /* page related stuff */
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 78c4e0aec8d..7c20f36a312 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -2798,6 +2798,42 @@ static inline void address_space_stb_cached(MemoryRegionCache *cache,
>       }
>   }
>   
> +static inline uint16_t address_space_lduw_cached(MemoryRegionCache *cache,
> +    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
> +{
> +    return address_space_lduw_cached_slow(cache, addr, attrs, result);
> +}
> +
> +static inline void address_space_stw_cached(MemoryRegionCache *cache,
> +    hwaddr addr, uint16_t val, MemTxAttrs attrs, MemTxResult *result)
> +{
> +    address_space_stw_cached_slow(cache, addr, val, attrs, result);
> +}
> +
> +static inline uint32_t address_space_ldl_cached(MemoryRegionCache *cache,
> +    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
> +{
> +    return address_space_ldl_cached_slow(cache, addr, attrs, result);
> +}
> +
> +static inline void address_space_stl_cached(MemoryRegionCache *cache,
> +    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
> +{
> +    address_space_stl_cached_slow(cache, addr, val, attrs, result);
> +}
> +
> +static inline uint64_t address_space_ldq_cached(MemoryRegionCache *cache,
> +    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
> +{
> +    return address_space_ldq_cached_slow(cache, addr, attrs, result);
> +}
> +
> +static inline void address_space_stq_cached(MemoryRegionCache *cache,
> +    hwaddr addr, uint64_t val, MemTxAttrs attrs, MemTxResult *result)
> +{
> +    address_space_stq_cached_slow(cache, addr, val, attrs, result);
> +}
> +
>   #define ENDIANNESS   _le
>   #include "exec/memory_ldst_cached.h.inc"
>   
> diff --git a/include/exec/memory_ldst_phys.h.inc b/include/exec/memory_ldst_phys.h.inc
> index ecd678610d1..db67de75251 100644
> --- a/include/exec/memory_ldst_phys.h.inc
> +++ b/include/exec/memory_ldst_phys.h.inc
> @@ -19,7 +19,6 @@
>    * License along with this library; if not, see <http://www.gnu.org/licenses/>.
>    */
>   
> -#ifdef TARGET_ENDIANNESS
>   static inline uint16_t glue(lduw_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
>   {
>       return glue(address_space_lduw, SUFFIX)(ARG1, addr,
> @@ -55,7 +54,7 @@ static inline void glue(stq_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t val)
>       glue(address_space_stq, SUFFIX)(ARG1, addr, val,
>                                       MEMTXATTRS_UNSPECIFIED, NULL);
>   }
> -#else
> +
>   static inline uint8_t glue(ldub_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
>   {
>       return glue(address_space_ldub, SUFFIX)(ARG1, addr,
> @@ -139,9 +138,7 @@ static inline void glue(stq_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t va
>       glue(address_space_stq_be, SUFFIX)(ARG1, addr, val,
>                                          MEMTXATTRS_UNSPECIFIED, NULL);
>   }
> -#endif
>   
>   #undef ARG1_DECL
>   #undef ARG1
>   #undef SUFFIX
> -#undef TARGET_ENDIANNESS



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h
  2025-03-11  0:04     ` Pierrick Bouvier
@ 2025-03-11 14:40       ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2025-03-11 14:40 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 3/10/25 17:04, Pierrick Bouvier wrote:
>  From what I understand, non endian versions are simply passing DEVICE_NATIVE_ENDIAN as a 
> parameter for address_space_ldl_internal, which will thus match the target endianness.
> 
> So what is the risk for common code to call this?

You're right.  I failed to look at the current implementation
to see that it would already work.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2025-03-11 14:41 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-10  4:58 [PATCH 00/16] make system memory API available for common code Pierrick Bouvier
2025-03-10  4:58 ` [PATCH 01/16] exec/memory_ldst: extract memory_ldst declarations from cpu-all.h Pierrick Bouvier
2025-03-10 15:17   ` Richard Henderson
2025-03-10 16:03     ` Pierrick Bouvier
2025-03-11  0:04     ` Pierrick Bouvier
2025-03-11 14:40       ` Richard Henderson
2025-03-10  4:58 ` [PATCH 02/16] exec/memory_ldst_phys: extract memory_ldst_phys " Pierrick Bouvier
2025-03-11  0:08   ` Pierrick Bouvier
2025-03-10  4:58 ` [PATCH 03/16] include: move target_words_bigendian() from tswap to bswap Pierrick Bouvier
2025-03-10 15:21   ` Richard Henderson
2025-03-10 16:05     ` Pierrick Bouvier
2025-03-10  4:58 ` [PATCH 04/16] exec/memory.h: make devend_memop target agnostic Pierrick Bouvier
2025-03-10 15:25   ` Richard Henderson
2025-03-10 16:04     ` Pierrick Bouvier
2025-03-10 16:30   ` Richard Henderson
2025-03-10 16:38     ` Pierrick Bouvier
2025-03-10  4:58 ` [PATCH 05/16] qemu/bswap: implement {ld,st}.*_p as functions Pierrick Bouvier
2025-03-10 16:08   ` Richard Henderson
2025-03-10 16:14     ` Pierrick Bouvier
2025-03-10 16:37       ` Richard Henderson
2025-03-10 16:43         ` Pierrick Bouvier
2025-03-10 16:53           ` Richard Henderson
2025-03-10 17:09             ` Pierrick Bouvier
2025-03-10 20:17               ` BALATON Zoltan
2025-03-10 20:31                 ` Pierrick Bouvier
2025-03-10  4:58 ` [PATCH 06/16] exec/cpu-all.h: we can now remove ld/st macros Pierrick Bouvier
2025-03-10 16:39   ` Richard Henderson
2025-03-10 16:45     ` Pierrick Bouvier
2025-03-10  4:58 ` [PATCH 07/16] codebase: prepare to remove cpu.h from exec/exec-all.h Pierrick Bouvier
2025-03-10 17:22   ` Richard Henderson
2025-03-10  4:58 ` [PATCH 08/16] exec/exec-all: remove dependency on cpu.h Pierrick Bouvier
2025-03-10 17:29   ` Richard Henderson
2025-03-10  4:58 ` [PATCH 09/16] exec/memory-internal: " Pierrick Bouvier
2025-03-10 17:29   ` Richard Henderson
2025-03-10  4:58 ` [PATCH 10/16] exec/ram_addr: " Pierrick Bouvier
2025-03-10 17:29   ` Richard Henderson
2025-03-10  4:58 ` [PATCH 11/16] system/kvm: make kvm_flush_coalesced_mmio_buffer() accessible for common code Pierrick Bouvier
2025-03-10 17:30   ` Richard Henderson
2025-03-10  4:58 ` [PATCH 12/16] exec/ram_addr: call xen_hvm_modified_memory only if xen is enabled Pierrick Bouvier
2025-03-10 17:30   ` Richard Henderson
2025-03-10  4:58 ` [PATCH 13/16] hw/xen: add stubs for various functions Pierrick Bouvier
2025-03-10 17:32   ` Richard Henderson
2025-03-10  4:58 ` [PATCH 14/16] system/physmem: compilation unit is now common to all targets Pierrick Bouvier
2025-03-10 17:32   ` Richard Henderson
2025-03-10  4:58 ` [PATCH 15/16] system/memory: make compilation unit common Pierrick Bouvier
2025-03-10 17:43   ` Richard Henderson
2025-03-10 17:47     ` Pierrick Bouvier
2025-03-10 17:58       ` Richard Henderson
2025-03-10 18:04         ` Pierrick Bouvier
2025-03-10 18:10           ` Richard Henderson
2025-03-10 18:25             ` Pierrick Bouvier
2025-03-10 18:27           ` Philippe Mathieu-Daudé
2025-03-10 18:38             ` Pierrick Bouvier
2025-03-10  4:58 ` [PATCH 16/16] system/ioport: " Pierrick Bouvier
2025-03-10 17:43   ` Richard Henderson
2025-03-10 13:23 ` [PATCH 00/16] make system memory API available for common code BALATON Zoltan
2025-03-10 16:28   ` Pierrick Bouvier
2025-03-10 16:56     ` Pierrick Bouvier
2025-03-10 19:40       ` BALATON Zoltan
2025-03-10 20:26         ` Pierrick Bouvier
2025-03-10 21:02           ` BALATON Zoltan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).