[Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu

* [Qemu-devel] [PATCH  v1 0/4] de-macrofy softmmu
@ 2018-12-17 15:01 Alex Bennée
  2018-12-17 15:01 ` [Qemu-devel] [PATCH v1 1/4] accel/tcg: export some cputlb functions Alex Bennée
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Alex Bennée @ 2018-12-17 15:01 UTC
  To: qemu-devel; +Cc: cota, Alex Bennée

Hi,

This is a rebased re-spin of my RFC series, with a few minor updates
to track recent changes in the softmmu code. The series attempts to
achieve the same result we got with softfloat by replacing our nest of
macro expansions with some common helpers. By using __flatten__ to force
the common function to be inlined into each helper, the compiler can
then dead-code away the parts of the common function that are not used
by each particular instantiated helper.
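
As a rough illustration of the idea (not code from this series), the
sketch below has one common function taking the size and endianness as
arguments, plus thin flattened entry points that pass constants. The
names load_bytes, demo_le_lduw and demo_be_ldul are hypothetical and
exist only for this example; the byte loop stands in for the real
helper body.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical common helper: one body handles every size and byte order. */
static inline uint64_t load_bytes(const uint8_t *p, size_t size,
                                  bool big_endian)
{
    uint64_t val = 0;

    for (size_t i = 0; i < size; i++) {
        /* Choose where byte i lands according to the requested order. */
        size_t shift = big_endian ? (size - 1 - i) * 8 : i * 8;
        val |= (uint64_t)p[i] << shift;
    }
    return val;
}

/*
 * Thin entry points. flatten asks the compiler to inline every call in
 * the function body, so size and big_endian become compile-time
 * constants and the branches that cannot be taken can be dropped.
 */
uint64_t __attribute__((flatten)) demo_le_lduw(const uint8_t *p)
{
    return load_bytes(p, 2, false);
}

uint64_t __attribute__((flatten)) demo_be_ldul(const uint8_t *p)
{
    return load_bytes(p, 4, true);
}
```

Each entry point remains a real symbol that a debugger or a grep can
find, while the shared logic lives in exactly one place.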

This is a win from a debugging point of view - no more "where is
helper_le_lduw_mmu defined" questions. I think it will also make
eventual instrumentation easier because there is exactly one slow path
load and store function to instrument.

Although the individual functions are now bigger (__flatten__ stops the
compiler from declining to inline when it otherwise would), the
resulting binary is about 8.4% smaller:

  original - 73027224 bytes
  demacro  - 66913848 bytes

Unfortunately, in my simple boot test I see a slight performance
degradation (about 7.5% on the average time):

  original: 10 times (100.00%), avg time 5.358 (0.02 variance/0.13 deviation)
  demacro:  10 times (100.00%), avg time 5.760 (0.08 variance/0.29 deviation)

Emilio,

Any chance you could run this through your more comprehensive benchmark
suite?

Alex Bennée (4):
  accel/tcg: export some cputlb functions
  accel/tcg: introduce softmmu.c
  accel/tcg: use TLB helpers from softmmu.o
  accel/tcg: remove softmmu_template.h

 accel/tcg/Makefile.objs      |   1 +
 accel/tcg/cputlb.c           |  63 +----
 accel/tcg/cputlb.h           |  21 ++
 accel/tcg/softmmu.c          | 452 +++++++++++++++++++++++++++++++++++
 accel/tcg/softmmu_template.h | 446 ----------------------------------
 5 files changed, 485 insertions(+), 498 deletions(-)
 create mode 100644 accel/tcg/cputlb.h
 create mode 100644 accel/tcg/softmmu.c
 delete mode 100644 accel/tcg/softmmu_template.h

-- 
2.17.1

* [Qemu-devel] [PATCH v1 1/4] accel/tcg: export some cputlb functions
  2018-12-17 15:01 [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
@ 2018-12-17 15:01 ` Alex Bennée
  2018-12-17 15:01 ` [Qemu-devel] [PATCH v1 2/4] accel/tcg: introduce softmmu.c Alex Bennée
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Alex Bennée @ 2018-12-17 15:01 UTC
  To: qemu-devel
  Cc: cota, Alex Bennée, Peter Crosthwaite, Richard Henderson,
	Paolo Bonzini

In preparation for having the softmmu helpers in their own file, rather
than generated as part of softmmu_template.h, we need to make three
helper functions (io_readx, io_writex and victim_tlb_hit) public
outside of cputlb.c.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 accel/tcg/cputlb.c | 21 +++++++++++----------
 accel/tcg/cputlb.h | 21 +++++++++++++++++++++
 2 files changed, 32 insertions(+), 10 deletions(-)
 create mode 100644 accel/tcg/cputlb.h

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index af6bd8ccf9..3cae7335d0 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -20,6 +20,7 @@
 #include "qemu/osdep.h"
 #include "qemu/main-loop.h"
 #include "cpu.h"
+#include "cputlb.h"
 #include "exec/exec-all.h"
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
@@ -675,10 +676,10 @@ static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
     return ram_addr;
 }
 
-static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
-                         int mmu_idx,
-                         target_ulong addr, uintptr_t retaddr,
-                         bool recheck, MMUAccessType access_type, int size)
+uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
+                  int mmu_idx,
+                  target_ulong addr, uintptr_t retaddr,
+                  bool recheck, MMUAccessType access_type, int size)
 {
     CPUState *cpu = ENV_GET_CPU(env);
     hwaddr mr_offset;
@@ -743,10 +744,10 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
     return val;
 }
 
-static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
-                      int mmu_idx,
-                      uint64_t val, target_ulong addr,
-                      uintptr_t retaddr, bool recheck, int size)
+void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
+               int mmu_idx,
+               uint64_t val, target_ulong addr,
+               uintptr_t retaddr, bool recheck, int size)
 {
     CPUState *cpu = ENV_GET_CPU(env);
     hwaddr mr_offset;
@@ -809,8 +810,8 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
 
 /* Return true if ADDR is present in the victim tlb, and has been copied
    back to the main tlb.  */
-static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
-                           size_t elt_ofs, target_ulong page)
+bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
+                    size_t elt_ofs, target_ulong page)
 {
     size_t vidx;
 
diff --git a/accel/tcg/cputlb.h b/accel/tcg/cputlb.h
new file mode 100644
index 0000000000..da09f45b86
--- /dev/null
+++ b/accel/tcg/cputlb.h
@@ -0,0 +1,21 @@
+/*
+ * CPU TLB Helpers
+ */
+
+#ifndef CPUTLB_H
+#define CPUTLB_H
+
+uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
+                  int mmu_idx,
+                  target_ulong addr, uintptr_t retaddr,
+                  bool recheck, MMUAccessType access_type, int size);
+
+void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
+               int mmu_idx,
+               uint64_t val, target_ulong addr,
+               uintptr_t retaddr, bool recheck, int size);
+
+bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
+                    size_t elt_ofs, target_ulong page);
+
+#endif
-- 
2.17.1

* [Qemu-devel] [PATCH  v1 2/4] accel/tcg: introduce softmmu.c
  2018-12-17 15:01 [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
  2018-12-17 15:01 ` [Qemu-devel] [PATCH v1 1/4] accel/tcg: export some cputlb functions Alex Bennée
@ 2018-12-17 15:01 ` Alex Bennée
  2018-12-17 15:01 ` [Qemu-devel] [PATCH v1 3/4] accel/tcg: use TLB helpers from softmmu.o Alex Bennée
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Alex Bennée @ 2018-12-17 15:01 UTC
  To: qemu-devel
  Cc: cota, Alex Bennée, Peter Crosthwaite, Richard Henderson,
	Paolo Bonzini

Instead of expanding a series of macros to generate the load/store
helpers, we move the logic into common load_helper() and store_helper()
functions and rely on the compiler to eliminate the dead code for each
variant.
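
For readers following the slow unaligned path in the common load
function, here is a standalone sketch of the "two aligned loads, then
shift and combine" step. It is only an illustration under stated
assumptions: a flat byte array stands in for guest memory, the names
aligned_load and unaligned_load are hypothetical, and the final mask to
the access width is an addition of this sketch rather than something
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an aligned guest load of `size` bytes. */
static uint64_t aligned_load(const uint8_t *mem, size_t addr, size_t size,
                             bool big_endian)
{
    uint64_t val = 0;

    for (size_t i = 0; i < size; i++) {
        size_t shift = big_endian ? (size - 1 - i) * 8 : i * 8;
        val |= (uint64_t)mem[addr + i] << shift;
    }
    return val;
}

/* Combine the two aligned words that straddle an unaligned address. */
static uint64_t unaligned_load(const uint8_t *mem, size_t addr, size_t size,
                               bool big_endian)
{
    size_t addr1 = addr & ~(size - 1);          /* aligned word below */
    size_t addr2 = addr1 + size;                /* aligned word above */
    uint64_t r1 = aligned_load(mem, addr1, size, big_endian);
    uint64_t r2 = aligned_load(mem, addr2, size, big_endian);
    unsigned shift = (unsigned)(addr & (size - 1)) * 8;
    unsigned bits = (unsigned)size * 8;
    uint64_t res;

    if (shift == 0) {
        return r1;      /* already aligned; avoids a shift by `bits` */
    }
    if (big_endian) {
        res = (r1 << shift) | (r2 >> (bits - shift));
    } else {
        res = (r1 >> shift) | (r2 << (bits - shift));
    }
    /* Assumption of this sketch: truncate to the access width. */
    return bits < 64 ? res & ((UINT64_C(1) << bits) - 1) : res;
}
```

With mem = {0x00, 0x11, ..., 0x77}, a 4-byte load at offset 1 combines
the words at offsets 0 and 4 and yields the bytes 11 22 33 44 in the
requested order.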

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 accel/tcg/softmmu.c | 452 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 452 insertions(+)
 create mode 100644 accel/tcg/softmmu.c

diff --git a/accel/tcg/softmmu.c b/accel/tcg/softmmu.c
new file mode 100644
index 0000000000..e08730736f
--- /dev/null
+++ b/accel/tcg/softmmu.c
@@ -0,0 +1,452 @@
+/*
+ * Software MMU support
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "cputlb.h"
+#include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
+#include "tcg/tcg.h"
+
+#ifdef TARGET_WORDS_BIGENDIAN
+#define NEED_BE_BSWAP 0
+#define NEED_LE_BSWAP 1
+#else
+#define NEED_BE_BSWAP 1
+#define NEED_LE_BSWAP 0
+#endif
+
+/*
+ * Byte Swap Helper
+ *
+ * This should all be eliminated as dead code, depending on the target
+ * endianness and the access type.
+ */
+
+static inline uint64_t handle_bswap(uint64_t val, int size, bool big_endian)
+{
+    if ((big_endian && NEED_BE_BSWAP) || (!big_endian && NEED_LE_BSWAP)) {
+        switch (size) {
+        case 1: return val;
+        case 2: return bswap16(val);
+        case 4: return bswap32(val);
+        case 8: return bswap64(val);
+        default:
+            g_assert_not_reached();
+        }
+    } else {
+        return val;
+    }
+}
+
+/* Macro to call victim_tlb_hit(), with local variables from the use context.  */
+#define VICTIM_TLB_HIT(TY, ADDR) \
+  victim_tlb_hit(env, mmu_idx, index, offsetof(CPUTLBEntry, TY), \
+                 (ADDR) & TARGET_PAGE_MASK)
+
+/*
+ * Load Helpers
+ *
+ * We support two different access types. A code_read access is
+ * specifically for reading instructions from system memory. It is
+ * called by the translation loop and in some helpers where the code
+ * is disassembled. It shouldn't be called directly by guest code.
+ */
+
+static tcg_target_ulong load_helper(CPUArchState *env, target_ulong addr,
+                                    TCGMemOpIdx oi, uintptr_t retaddr,
+                                    size_t size, bool big_endian,
+                                    bool code_read)
+{
+    uintptr_t mmu_idx = get_mmuidx(oi);
+    uintptr_t index = tlb_index(env, mmu_idx, addr);
+    CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
+    target_ulong tlb_addr = code_read ? entry->addr_code : entry->addr_read;
+    unsigned a_bits = get_alignment_bits(get_memop(oi));
+    uintptr_t haddr;
+    tcg_target_ulong res;
+
+    /* Handle unaligned */
+    if (addr & ((1 << a_bits) - 1)) {
+        cpu_unaligned_access(ENV_GET_CPU(env), addr,
+                             code_read ? MMU_INST_FETCH : MMU_DATA_LOAD,
+                             mmu_idx, retaddr);
+    }
+
+    /* If the TLB entry is for a different page, reload and try again.  */
+    if (!tlb_hit(tlb_addr, addr)) {
+        if (!(code_read ? VICTIM_TLB_HIT(addr_code, addr)
+                        : VICTIM_TLB_HIT(addr_read, addr))) {
+            tlb_fill(ENV_GET_CPU(env), addr, size, code_read ?
+                     MMU_INST_FETCH : MMU_DATA_LOAD, mmu_idx, retaddr);
+        }
+        tlb_addr = code_read ? entry->addr_code : entry->addr_read;
+    }
+
+    /* Handle an IO access.  */
+    if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
+        CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
+        uint64_t tmp;
+
+        if ((addr & (size - 1)) != 0) {
+            goto do_unaligned_access;
+        }
+
+        tmp = io_readx(env, iotlbentry, mmu_idx, addr, retaddr,
+                       tlb_addr & TLB_RECHECK,
+                       code_read ? MMU_INST_FETCH : MMU_DATA_LOAD, size);
+        return handle_bswap(tmp, size, big_endian);
+    }
+
+    /* Handle slow unaligned access (it spans two pages or IO).  */
+    if (size > 1
+        && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1
+                    >= TARGET_PAGE_SIZE)) {
+        target_ulong addr1, addr2;
+        tcg_target_ulong r1, r2;
+        unsigned shift;
+    do_unaligned_access:
+        addr1 = addr & ~(size - 1);
+        addr2 = addr1 + size;
+        r1 = load_helper(env, addr1, oi, retaddr, size, big_endian, code_read);
+        r2 = load_helper(env, addr2, oi, retaddr, size, big_endian, code_read);
+        shift = (addr & (size - 1)) * 8;
+
+        if (big_endian) {
+            /* Big-endian combine.  */
+            res = (r1 << shift) | (r2 >> ((size * 8) - shift));
+        } else {
+            /* Little-endian combine.  */
+            res = (r1 >> shift) | (r2 << ((size * 8) - shift));
+        }
+        return res;
+    }
+
+    haddr = addr + entry->addend;
+
+    switch (size) {
+    case 1:
+        res = ldub_p((uint8_t *)haddr);
+        break;
+    case 2:
+        if (big_endian) {
+            res = lduw_be_p((uint8_t *)haddr);
+        } else {
+            res = lduw_le_p((uint8_t *)haddr);
+        }
+        break;
+    case 4:
+        if (big_endian) {
+            res = ldl_be_p((uint8_t *)haddr);
+        } else {
+            res = ldl_le_p((uint8_t *)haddr);
+        }
+        break;
+    case 8:
+        if (big_endian) {
+            res = ldq_be_p((uint8_t *)haddr);
+        } else {
+            res = ldq_le_p((uint8_t *)haddr);
+        }
+        break;
+    default:
+        g_assert_not_reached();
+        break;
+    }
+
+    return res;
+}
+
+/*
+ * For the benefit of TCG generated code, we want to avoid the
+ * complication of ABI-specific return type promotion and always
+ * return a value extended to the register size of the host. This is
+ * tcg_target_long, except in the case of a 32-bit host and 64-bit
+ * data, and for that we always have uint64_t.
+ *
+ * We don't bother with this widened value for code accesses.
+ */
+
+tcg_target_ulong __attribute__((flatten))
+helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                    uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 1, false, false);
+}
+
+tcg_target_ulong __attribute__((flatten))
+helper_le_lduw_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 2, false, false);
+}
+
+tcg_target_ulong __attribute__((flatten))
+helper_be_lduw_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 2, true, false);
+}
+
+tcg_target_ulong __attribute__((flatten))
+helper_le_ldul_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 4, false, false);
+}
+
+tcg_target_ulong __attribute__((flatten))
+helper_be_ldul_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 4, true, false);
+}
+
+tcg_target_ulong __attribute__((flatten))
+helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                  uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 8, false, false);
+}
+
+tcg_target_ulong __attribute__((flatten))
+helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                  uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 8, true, false);
+}
+
+/*
+ * Code Access
+ */
+
+uint8_t __attribute__((flatten))
+helper_ret_ldb_cmmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                    uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 1, false, true);
+}
+
+uint16_t __attribute__((flatten))
+helper_le_ldw_cmmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 2, false, true);
+}
+
+uint16_t __attribute__((flatten))
+helper_be_ldw_cmmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 2, true, true);
+}
+
+uint32_t __attribute__((flatten))
+helper_le_ldl_cmmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 4, false, true);
+}
+
+uint32_t __attribute__((flatten))
+helper_be_ldl_cmmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 4, true, true);
+}
+
+uint64_t __attribute__((flatten))
+helper_le_ldq_cmmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 8, false, true);
+}
+
+uint64_t __attribute__((flatten))
+helper_be_ldq_cmmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return load_helper(env, addr, oi, retaddr, 8, true, true);
+}
+
+/* Provide signed versions of the load routines as well.  We can of course
+   avoid this for 64-bit data, or for 32-bit data on 32-bit host.  */
+
+tcg_target_ulong __attribute__((flatten))
+helper_le_ldsw_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return (int16_t)helper_le_lduw_mmu(env, addr, oi, retaddr);
+}
+
+tcg_target_ulong __attribute__((flatten))
+helper_be_ldsw_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
+                   uintptr_t retaddr)
+{
+    return (int16_t)helper_be_lduw_mmu(env, addr, oi, retaddr);
+}
+
+/*
+ * Store Helpers
+ */
+
+static void store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
+                         TCGMemOpIdx oi, uintptr_t retaddr, size_t size,
+                         bool big_endian)
+{
+    uintptr_t mmu_idx = get_mmuidx(oi);
+    uintptr_t index = tlb_index(env, mmu_idx, addr);
+    CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
+    target_ulong tlb_addr = tlb_addr_write(entry);
+    unsigned a_bits = get_alignment_bits(get_memop(oi));
+    uintptr_t haddr;
+
+    if (addr & ((1 << a_bits) - 1)) {
+        cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
+                             mmu_idx, retaddr);
+    }
+
+    /* If the TLB entry is for a different page, reload and try again.  */
+    if (!tlb_hit(tlb_addr, addr)) {
+        if (!VICTIM_TLB_HIT(addr_write, addr)) {
+            tlb_fill(ENV_GET_CPU(env), addr, size, MMU_DATA_STORE,
+                     mmu_idx, retaddr);
+        }
+        tlb_addr = tlb_addr_write(entry) & ~TLB_INVALID_MASK;
+    }
+
+    /* Handle an IO access.  */
+    if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
+        CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
+
+        if ((addr & (size - 1)) != 0) {
+            goto do_unaligned_access;
+        }
+
+        io_writex(env, iotlbentry, mmu_idx,
+                  handle_bswap(val, size, big_endian),
+                  addr, retaddr, tlb_addr & TLB_RECHECK, size);
+        return;
+    }
+
+    /* Handle slow unaligned access (it spans two pages or IO).  */
+    if (size > 1
+        && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1
+                     >= TARGET_PAGE_SIZE)) {
+        int i;
+        uintptr_t index2;
+        CPUTLBEntry *entry2;
+        target_ulong page2, tlb_addr2;
+    do_unaligned_access:
+        /* Ensure the second page is in the TLB.  Note that the first page
+           is already guaranteed to be filled, and that the second page
+           cannot evict the first.  */
+        page2 = (addr + size) & TARGET_PAGE_MASK;
+        index2 = tlb_index(env, mmu_idx, page2);
+        entry2 = tlb_entry(env, mmu_idx, page2);
+        tlb_addr2 = tlb_addr_write(entry2);
+        if (!tlb_hit_page(tlb_addr2, page2)
+            && !VICTIM_TLB_HIT(addr_write, page2)) {
+            tlb_fill(ENV_GET_CPU(env), page2, size, MMU_DATA_STORE,
+                     mmu_idx, retaddr);
+        }
+
+        /* XXX: not efficient, but simple.  */
+        /* This loop must go in the forward direction to avoid issues
+           with self-modifying code in Windows 64-bit.  */
+        for (i = 0; i < size; ++i) {
+            uint8_t val8;
+            if (big_endian) {
+                /* Big-endian extract.  */
+                val8 = val >> (((size - 1) * 8) - (i * 8));
+            } else {
+                /* Little-endian extract.  */
+                val8 = val >> (i * 8);
+            }
+            store_helper(env, addr + i, val8, oi, retaddr, 1, big_endian);
+        }
+        return;
+    }
+
+    haddr = addr + entry->addend;
+
+    switch (size) {
+    case 1:
+        stb_p((uint8_t *)haddr, val);
+        break;
+    case 2:
+        if (big_endian) {
+            stw_be_p((uint8_t *)haddr, val);
+        } else {
+            stw_le_p((uint8_t *)haddr, val);
+        }
+        break;
+    case 4:
+        if (big_endian) {
+            stl_be_p((uint8_t *)haddr, val);
+        } else {
+            stl_le_p((uint8_t *)haddr, val);
+        }
+        break;
+    case 8:
+        if (big_endian) {
+            stq_be_p((uint8_t *)haddr, val);
+        } else {
+            stq_le_p((uint8_t *)haddr, val);
+        }
+        break;
+    default:
+        g_assert_not_reached();
+        break;
+    }
+}
+
+void __attribute__((flatten))
+helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val,
+                   TCGMemOpIdx oi, uintptr_t retaddr)
+{
+    store_helper(env, addr, val, oi, retaddr, 1, false);
+}
+
+void __attribute__((flatten))
+helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+                  TCGMemOpIdx oi, uintptr_t retaddr)
+{
+    store_helper(env, addr, val, oi, retaddr, 2, false);
+}
+
+void __attribute__((flatten))
+helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+                  TCGMemOpIdx oi, uintptr_t retaddr)
+{
+    store_helper(env, addr, val, oi, retaddr, 2, true);
+}
+
+void __attribute__((flatten))
+helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
+                  TCGMemOpIdx oi, uintptr_t retaddr)
+{
+    store_helper(env, addr, val, oi, retaddr, 4, false);
+}
+
+void __attribute__((flatten))
+helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
+                  TCGMemOpIdx oi, uintptr_t retaddr)
+{
+    store_helper(env, addr, val, oi, retaddr, 4, true);
+}
+
+void __attribute__((flatten))
+helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
+                  TCGMemOpIdx oi, uintptr_t retaddr)
+{
+    store_helper(env, addr, val, oi, retaddr, 8, false);
+}
+
+void __attribute__((flatten))
+helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
+                  TCGMemOpIdx oi, uintptr_t retaddr)
+{
+    store_helper(env, addr, val, oi, retaddr, 8, true);
+}
-- 
2.17.1

* [Qemu-devel] [PATCH v1 3/4] accel/tcg: use TLB helpers from softmmu.o
  2018-12-17 15:01 [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
  2018-12-17 15:01 ` [Qemu-devel] [PATCH v1 1/4] accel/tcg: export some cputlb functions Alex Bennée
  2018-12-17 15:01 ` [Qemu-devel] [PATCH v1 2/4] accel/tcg: introduce softmmu.c Alex Bennée
@ 2018-12-17 15:01 ` Alex Bennée
  2018-12-17 15:01 ` [Qemu-devel] [PATCH v1 4/4] accel/tcg: remove softmmu_template.h Alex Bennée
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Alex Bennée @ 2018-12-17 15:01 UTC
  To: qemu-devel
  Cc: cota, Alex Bennée, Peter Crosthwaite, Richard Henderson,
	Paolo Bonzini

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 accel/tcg/Makefile.objs |  1 +
 accel/tcg/cputlb.c      | 42 -----------------------------------------
 2 files changed, 1 insertion(+), 42 deletions(-)

diff --git a/accel/tcg/Makefile.objs b/accel/tcg/Makefile.objs
index d381a02f34..6b0b96633d 100644
--- a/accel/tcg/Makefile.objs
+++ b/accel/tcg/Makefile.objs
@@ -1,5 +1,6 @@
 obj-$(CONFIG_SOFTMMU) += tcg-all.o
 obj-$(CONFIG_SOFTMMU) += cputlb.o
+obj-$(CONFIG_SOFTMMU) += softmmu.o
 obj-y += tcg-runtime.o tcg-runtime-gvec.o
 obj-y += cpu-exec.o cpu-exec-common.o translate-all.o
 obj-y += translator.o
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 3cae7335d0..ab07689d0e 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -983,28 +983,6 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
     cpu_loop_exit_atomic(ENV_GET_CPU(env), retaddr);
 }
 
-#ifdef TARGET_WORDS_BIGENDIAN
-# define TGT_BE(X)  (X)
-# define TGT_LE(X)  BSWAP(X)
-#else
-# define TGT_BE(X)  BSWAP(X)
-# define TGT_LE(X)  (X)
-#endif
-
-#define MMUSUFFIX _mmu
-
-#define DATA_SIZE 1
-#include "softmmu_template.h"
-
-#define DATA_SIZE 2
-#include "softmmu_template.h"
-
-#define DATA_SIZE 4
-#include "softmmu_template.h"
-
-#define DATA_SIZE 8
-#include "softmmu_template.h"
-
 /* First set of helpers allows passing in of OI and RETADDR.  This makes
    them callable from other helpers.  */
 
@@ -1061,23 +1039,3 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
 #define DATA_SIZE 8
 #include "atomic_template.h"
 #endif
-
-/* Code access functions.  */
-
-#undef MMUSUFFIX
-#define MMUSUFFIX _cmmu
-#undef GETPC
-#define GETPC() ((uintptr_t)0)
-#define SOFTMMU_CODE_ACCESS
-
-#define DATA_SIZE 1
-#include "softmmu_template.h"
-
-#define DATA_SIZE 2
-#include "softmmu_template.h"
-
-#define DATA_SIZE 4
-#include "softmmu_template.h"
-
-#define DATA_SIZE 8
-#include "softmmu_template.h"
-- 
2.17.1

* [Qemu-devel] [PATCH  v1 4/4] accel/tcg: remove softmmu_template.h
  2018-12-17 15:01 [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
                   ` (2 preceding siblings ...)
  2018-12-17 15:01 ` [Qemu-devel] [PATCH v1 3/4] accel/tcg: use TLB helpers from softmmu.o Alex Bennée
@ 2018-12-17 15:01 ` Alex Bennée
  2018-12-17 16:15 ` [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
  2018-12-24  5:21 ` no-reply
  5 siblings, 0 replies; 9+ messages in thread
From: Alex Bennée @ 2018-12-17 15:01 UTC
  To: qemu-devel; +Cc: cota, Alex Bennée

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 accel/tcg/softmmu_template.h | 446 -----------------------------------
 1 file changed, 446 deletions(-)
 delete mode 100644 accel/tcg/softmmu_template.h

diff --git a/accel/tcg/softmmu_template.h b/accel/tcg/softmmu_template.h
deleted file mode 100644
index b0adea045e..0000000000
--- a/accel/tcg/softmmu_template.h
+++ /dev/null
@@ -1,446 +0,0 @@
-/*
- *  Software MMU support
- *
- * Generate helpers used by TCG for qemu_ld/st ops and code load
- * functions.
- *
- * Included from target op helpers and exec.c.
- *
- *  Copyright (c) 2003 Fabrice Bellard
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2 of the License, or (at your option) any later version.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; if not, see <http://www.gnu.org/licenses/>.
- */
-#if DATA_SIZE == 8
-#define SUFFIX q
-#define LSUFFIX q
-#define SDATA_TYPE  int64_t
-#define DATA_TYPE  uint64_t
-#elif DATA_SIZE == 4
-#define SUFFIX l
-#define LSUFFIX l
-#define SDATA_TYPE  int32_t
-#define DATA_TYPE  uint32_t
-#elif DATA_SIZE == 2
-#define SUFFIX w
-#define LSUFFIX uw
-#define SDATA_TYPE  int16_t
-#define DATA_TYPE  uint16_t
-#elif DATA_SIZE == 1
-#define SUFFIX b
-#define LSUFFIX ub
-#define SDATA_TYPE  int8_t
-#define DATA_TYPE  uint8_t
-#else
-#error unsupported data size
-#endif
-
-
-/* For the benefit of TCG generated code, we want to avoid the complication
-   of ABI-specific return type promotion and always return a value extended
-   to the register size of the host.  This is tcg_target_long, except in the
-   case of a 32-bit host and 64-bit data, and for that we always have
-   uint64_t.  Don't bother with this widened value for SOFTMMU_CODE_ACCESS.  */
-#if defined(SOFTMMU_CODE_ACCESS) || DATA_SIZE == 8
-# define WORD_TYPE  DATA_TYPE
-# define USUFFIX    SUFFIX
-#else
-# define WORD_TYPE  tcg_target_ulong
-# define USUFFIX    glue(u, SUFFIX)
-# define SSUFFIX    glue(s, SUFFIX)
-#endif
-
-#ifdef SOFTMMU_CODE_ACCESS
-#define READ_ACCESS_TYPE MMU_INST_FETCH
-#define ADDR_READ addr_code
-#else
-#define READ_ACCESS_TYPE MMU_DATA_LOAD
-#define ADDR_READ addr_read
-#endif
-
-#if DATA_SIZE == 8
-# define BSWAP(X)  bswap64(X)
-#elif DATA_SIZE == 4
-# define BSWAP(X)  bswap32(X)
-#elif DATA_SIZE == 2
-# define BSWAP(X)  bswap16(X)
-#else
-# define BSWAP(X)  (X)
-#endif
-
-#if DATA_SIZE == 1
-# define helper_le_ld_name  glue(glue(helper_ret_ld, USUFFIX), MMUSUFFIX)
-# define helper_be_ld_name  helper_le_ld_name
-# define helper_le_lds_name glue(glue(helper_ret_ld, SSUFFIX), MMUSUFFIX)
-# define helper_be_lds_name helper_le_lds_name
-# define helper_le_st_name  glue(glue(helper_ret_st, SUFFIX), MMUSUFFIX)
-# define helper_be_st_name  helper_le_st_name
-#else
-# define helper_le_ld_name  glue(glue(helper_le_ld, USUFFIX), MMUSUFFIX)
-# define helper_be_ld_name  glue(glue(helper_be_ld, USUFFIX), MMUSUFFIX)
-# define helper_le_lds_name glue(glue(helper_le_ld, SSUFFIX), MMUSUFFIX)
-# define helper_be_lds_name glue(glue(helper_be_ld, SSUFFIX), MMUSUFFIX)
-# define helper_le_st_name  glue(glue(helper_le_st, SUFFIX), MMUSUFFIX)
-# define helper_be_st_name  glue(glue(helper_be_st, SUFFIX), MMUSUFFIX)
-#endif
-
-#ifndef SOFTMMU_CODE_ACCESS
-static inline DATA_TYPE glue(io_read, SUFFIX)(CPUArchState *env,
-                                              size_t mmu_idx, size_t index,
-                                              target_ulong addr,
-                                              uintptr_t retaddr,
-                                              bool recheck,
-                                              MMUAccessType access_type)
-{
-    CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
-    return io_readx(env, iotlbentry, mmu_idx, addr, retaddr, recheck,
-                    access_type, DATA_SIZE);
-}
-#endif
-
-WORD_TYPE helper_le_ld_name(CPUArchState *env, target_ulong addr,
-                            TCGMemOpIdx oi, uintptr_t retaddr)
-{
-    uintptr_t mmu_idx = get_mmuidx(oi);
-    uintptr_t index = tlb_index(env, mmu_idx, addr);
-    CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
-    target_ulong tlb_addr = entry->ADDR_READ;
-    unsigned a_bits = get_alignment_bits(get_memop(oi));
-    uintptr_t haddr;
-    DATA_TYPE res;
-
-    if (addr & ((1 << a_bits) - 1)) {
-        cpu_unaligned_access(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE,
-                             mmu_idx, retaddr);
-    }
-
-    /* If the TLB entry is for a different page, reload and try again.  */
-    if (!tlb_hit(tlb_addr, addr)) {
-        if (!VICTIM_TLB_HIT(ADDR_READ, addr)) {
-            tlb_fill(ENV_GET_CPU(env), addr, DATA_SIZE, READ_ACCESS_TYPE,
-                     mmu_idx, retaddr);
-        }
-        tlb_addr = entry->ADDR_READ;
-    }
-
-    /* Handle an IO access.  */
-    if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
-        if ((addr & (DATA_SIZE - 1)) != 0) {
-            goto do_unaligned_access;
-        }
-
-        /* ??? Note that the io helpers always read data in the target
-           byte ordering.  We should push the LE/BE request down into io.  */
-        res = glue(io_read, SUFFIX)(env, mmu_idx, index, addr, retaddr,
-                                    tlb_addr & TLB_RECHECK,
-                                    READ_ACCESS_TYPE);
-        res = TGT_LE(res);
-        return res;
-    }
-
-    /* Handle slow unaligned access (it spans two pages or IO).  */
-    if (DATA_SIZE > 1
-        && unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
-                    >= TARGET_PAGE_SIZE)) {
-        target_ulong addr1, addr2;
-        DATA_TYPE res1, res2;
-        unsigned shift;
-    do_unaligned_access:
-        addr1 = addr & ~(DATA_SIZE - 1);
-        addr2 = addr1 + DATA_SIZE;
-        res1 = helper_le_ld_name(env, addr1, oi, retaddr);
-        res2 = helper_le_ld_name(env, addr2, oi, retaddr);
-        shift = (addr & (DATA_SIZE - 1)) * 8;
-
-        /* Little-endian combine.  */
-        res = (res1 >> shift) | (res2 << ((DATA_SIZE * 8) - shift));
-        return res;
-    }
-
-    haddr = addr + entry->addend;
-#if DATA_SIZE == 1
-    res = glue(glue(ld, LSUFFIX), _p)((uint8_t *)haddr);
-#else
-    res = glue(glue(ld, LSUFFIX), _le_p)((uint8_t *)haddr);
-#endif
-    return res;
-}
-
-#if DATA_SIZE > 1
-WORD_TYPE helper_be_ld_name(CPUArchState *env, target_ulong addr,
-                            TCGMemOpIdx oi, uintptr_t retaddr)
-{
-    uintptr_t mmu_idx = get_mmuidx(oi);
-    uintptr_t index = tlb_index(env, mmu_idx, addr);
-    CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
-    target_ulong tlb_addr = entry->ADDR_READ;
-    unsigned a_bits = get_alignment_bits(get_memop(oi));
-    uintptr_t haddr;
-    DATA_TYPE res;
-
-    if (addr & ((1 << a_bits) - 1)) {
-        cpu_unaligned_access(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE,
-                             mmu_idx, retaddr);
-    }
-
-    /* If the TLB entry is for a different page, reload and try again.  */
-    if (!tlb_hit(tlb_addr, addr)) {
-        if (!VICTIM_TLB_HIT(ADDR_READ, addr)) {
-            tlb_fill(ENV_GET_CPU(env), addr, DATA_SIZE, READ_ACCESS_TYPE,
-                     mmu_idx, retaddr);
-        }
-        tlb_addr = entry->ADDR_READ;
-    }
-
-    /* Handle an IO access.  */
-    if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
-        if ((addr & (DATA_SIZE - 1)) != 0) {
-            goto do_unaligned_access;
-        }
-
-        /* ??? Note that the io helpers always read data in the target
-           byte ordering.  We should push the LE/BE request down into io.  */
-        res = glue(io_read, SUFFIX)(env, mmu_idx, index, addr, retaddr,
-                                    tlb_addr & TLB_RECHECK,
-                                    READ_ACCESS_TYPE);
-        res = TGT_BE(res);
-        return res;
-    }
-
-    /* Handle slow unaligned access (it spans two pages or IO).  */
-    if (DATA_SIZE > 1
-        && unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
-                    >= TARGET_PAGE_SIZE)) {
-        target_ulong addr1, addr2;
-        DATA_TYPE res1, res2;
-        unsigned shift;
-    do_unaligned_access:
-        addr1 = addr & ~(DATA_SIZE - 1);
-        addr2 = addr1 + DATA_SIZE;
-        res1 = helper_be_ld_name(env, addr1, oi, retaddr);
-        res2 = helper_be_ld_name(env, addr2, oi, retaddr);
-        shift = (addr & (DATA_SIZE - 1)) * 8;
-
-        /* Big-endian combine.  */
-        res = (res1 << shift) | (res2 >> ((DATA_SIZE * 8) - shift));
-        return res;
-    }
-
-    haddr = addr + entry->addend;
-    res = glue(glue(ld, LSUFFIX), _be_p)((uint8_t *)haddr);
-    return res;
-}
-#endif /* DATA_SIZE > 1 */
-
-#ifndef SOFTMMU_CODE_ACCESS
-
-/* Provide signed versions of the load routines as well.  We can of course
-   avoid this for 64-bit data, or for 32-bit data on 32-bit host.  */
-#if DATA_SIZE * 8 < TCG_TARGET_REG_BITS
-WORD_TYPE helper_le_lds_name(CPUArchState *env, target_ulong addr,
-                             TCGMemOpIdx oi, uintptr_t retaddr)
-{
-    return (SDATA_TYPE)helper_le_ld_name(env, addr, oi, retaddr);
-}
-
-# if DATA_SIZE > 1
-WORD_TYPE helper_be_lds_name(CPUArchState *env, target_ulong addr,
-                             TCGMemOpIdx oi, uintptr_t retaddr)
-{
-    return (SDATA_TYPE)helper_be_ld_name(env, addr, oi, retaddr);
-}
-# endif
-#endif
-
-static inline void glue(io_write, SUFFIX)(CPUArchState *env,
-                                          size_t mmu_idx, size_t index,
-                                          DATA_TYPE val,
-                                          target_ulong addr,
-                                          uintptr_t retaddr,
-                                          bool recheck)
-{
-    CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
-    return io_writex(env, iotlbentry, mmu_idx, val, addr, retaddr,
-                     recheck, DATA_SIZE);
-}
-
-void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
-                       TCGMemOpIdx oi, uintptr_t retaddr)
-{
-    uintptr_t mmu_idx = get_mmuidx(oi);
-    uintptr_t index = tlb_index(env, mmu_idx, addr);
-    CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
-    target_ulong tlb_addr = tlb_addr_write(entry);
-    unsigned a_bits = get_alignment_bits(get_memop(oi));
-    uintptr_t haddr;
-
-    if (addr & ((1 << a_bits) - 1)) {
-        cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
-                             mmu_idx, retaddr);
-    }
-
-    /* If the TLB entry is for a different page, reload and try again.  */
-    if (!tlb_hit(tlb_addr, addr)) {
-        if (!VICTIM_TLB_HIT(addr_write, addr)) {
-            tlb_fill(ENV_GET_CPU(env), addr, DATA_SIZE, MMU_DATA_STORE,
-                     mmu_idx, retaddr);
-        }
-        tlb_addr = tlb_addr_write(entry) & ~TLB_INVALID_MASK;
-    }
-
-    /* Handle an IO access.  */
-    if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
-        if ((addr & (DATA_SIZE - 1)) != 0) {
-            goto do_unaligned_access;
-        }
-
-        /* ??? Note that the io helpers always read data in the target
-           byte ordering.  We should push the LE/BE request down into io.  */
-        val = TGT_LE(val);
-        glue(io_write, SUFFIX)(env, mmu_idx, index, val, addr,
-                               retaddr, tlb_addr & TLB_RECHECK);
-        return;
-    }
-
-    /* Handle slow unaligned access (it spans two pages or IO).  */
-    if (DATA_SIZE > 1
-        && unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
-                     >= TARGET_PAGE_SIZE)) {
-        int i;
-        target_ulong page2;
-        CPUTLBEntry *entry2;
-    do_unaligned_access:
-        /* Ensure the second page is in the TLB.  Note that the first page
-           is already guaranteed to be filled, and that the second page
-           cannot evict the first.  */
-        page2 = (addr + DATA_SIZE) & TARGET_PAGE_MASK;
-        entry2 = tlb_entry(env, mmu_idx, page2);
-        if (!tlb_hit_page(tlb_addr_write(entry2), page2)
-            && !VICTIM_TLB_HIT(addr_write, page2)) {
-            tlb_fill(ENV_GET_CPU(env), page2, DATA_SIZE, MMU_DATA_STORE,
-                     mmu_idx, retaddr);
-        }
-
-        /* XXX: not efficient, but simple.  */
-        /* This loop must go in the forward direction to avoid issues
-           with self-modifying code in Windows 64-bit.  */
-        for (i = 0; i < DATA_SIZE; ++i) {
-            /* Little-endian extract.  */
-            uint8_t val8 = val >> (i * 8);
-            glue(helper_ret_stb, MMUSUFFIX)(env, addr + i, val8,
-                                            oi, retaddr);
-        }
-        return;
-    }
-
-    haddr = addr + entry->addend;
-#if DATA_SIZE == 1
-    glue(glue(st, SUFFIX), _p)((uint8_t *)haddr, val);
-#else
-    glue(glue(st, SUFFIX), _le_p)((uint8_t *)haddr, val);
-#endif
-}
-
-#if DATA_SIZE > 1
-void helper_be_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
-                       TCGMemOpIdx oi, uintptr_t retaddr)
-{
-    uintptr_t mmu_idx = get_mmuidx(oi);
-    uintptr_t index = tlb_index(env, mmu_idx, addr);
-    CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
-    target_ulong tlb_addr = tlb_addr_write(entry);
-    unsigned a_bits = get_alignment_bits(get_memop(oi));
-    uintptr_t haddr;
-
-    if (addr & ((1 << a_bits) - 1)) {
-        cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
-                             mmu_idx, retaddr);
-    }
-
-    /* If the TLB entry is for a different page, reload and try again.  */
-    if (!tlb_hit(tlb_addr, addr)) {
-        if (!VICTIM_TLB_HIT(addr_write, addr)) {
-            tlb_fill(ENV_GET_CPU(env), addr, DATA_SIZE, MMU_DATA_STORE,
-                     mmu_idx, retaddr);
-        }
-        tlb_addr = tlb_addr_write(entry) & ~TLB_INVALID_MASK;
-    }
-
-    /* Handle an IO access.  */
-    if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
-        if ((addr & (DATA_SIZE - 1)) != 0) {
-            goto do_unaligned_access;
-        }
-
-        /* ??? Note that the io helpers always read data in the target
-           byte ordering.  We should push the LE/BE request down into io.  */
-        val = TGT_BE(val);
-        glue(io_write, SUFFIX)(env, mmu_idx, index, val, addr, retaddr,
-                               tlb_addr & TLB_RECHECK);
-        return;
-    }
-
-    /* Handle slow unaligned access (it spans two pages or IO).  */
-    if (DATA_SIZE > 1
-        && unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
-                     >= TARGET_PAGE_SIZE)) {
-        int i;
-        target_ulong page2;
-        CPUTLBEntry *entry2;
-    do_unaligned_access:
-        /* Ensure the second page is in the TLB.  Note that the first page
-           is already guaranteed to be filled, and that the second page
-           cannot evict the first.  */
-        page2 = (addr + DATA_SIZE) & TARGET_PAGE_MASK;
-        entry2 = tlb_entry(env, mmu_idx, page2);
-        if (!tlb_hit_page(tlb_addr_write(entry2), page2)
-            && !VICTIM_TLB_HIT(addr_write, page2)) {
-            tlb_fill(ENV_GET_CPU(env), page2, DATA_SIZE, MMU_DATA_STORE,
-                     mmu_idx, retaddr);
-        }
-
-        /* XXX: not efficient, but simple */
-        /* This loop must go in the forward direction to avoid issues
-           with self-modifying code.  */
-        for (i = 0; i < DATA_SIZE; ++i) {
-            /* Big-endian extract.  */
-            uint8_t val8 = val >> (((DATA_SIZE - 1) * 8) - (i * 8));
-            glue(helper_ret_stb, MMUSUFFIX)(env, addr + i, val8,
-                                            oi, retaddr);
-        }
-        return;
-    }
-
-    haddr = addr + entry->addend;
-    glue(glue(st, SUFFIX), _be_p)((uint8_t *)haddr, val);
-}
-#endif /* DATA_SIZE > 1 */
-#endif /* !defined(SOFTMMU_CODE_ACCESS) */
-
-#undef READ_ACCESS_TYPE
-#undef DATA_TYPE
-#undef SUFFIX
-#undef LSUFFIX
-#undef DATA_SIZE
-#undef ADDR_READ
-#undef WORD_TYPE
-#undef SDATA_TYPE
-#undef USUFFIX
-#undef SSUFFIX
-#undef BSWAP
-#undef helper_le_ld_name
-#undef helper_be_ld_name
-#undef helper_le_lds_name
-#undef helper_be_lds_name
-#undef helper_le_st_name
-#undef helper_be_st_name
-- 
2.17.1
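
For readers following the unaligned path in the template removed above, the little-endian and big-endian combine steps can be sketched in isolation. This is only an illustration under stated assumptions, not QEMU code: demo_mem and the demo_ldl_* functions are hypothetical names, a flat byte array stands in for guest memory, and the access size is fixed at 32 bits instead of DATA_SIZE.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for guest memory; not a QEMU structure. */
static const uint8_t demo_mem[8] = {
    0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77
};

/* Aligned 32-bit loads, built byte by byte so the result does not
   depend on host endianness. */
static uint32_t demo_ldl_le_aligned(const uint8_t *mem, size_t addr)
{
    return (uint32_t)mem[addr]
         | (uint32_t)mem[addr + 1] << 8
         | (uint32_t)mem[addr + 2] << 16
         | (uint32_t)mem[addr + 3] << 24;
}

static uint32_t demo_ldl_be_aligned(const uint8_t *mem, size_t addr)
{
    return (uint32_t)mem[addr] << 24
         | (uint32_t)mem[addr + 1] << 16
         | (uint32_t)mem[addr + 2] << 8
         | (uint32_t)mem[addr + 3];
}

/* Unaligned loads: read the two aligned words that cover the access,
   then shift and merge.  addr must not be a multiple of 4, so shift
   is 8, 16 or 24 and neither shift count reaches the full width. */
static uint32_t demo_ldl_le_unaligned(const uint8_t *mem, size_t addr)
{
    size_t addr1 = addr & ~(size_t)3;
    size_t addr2 = addr1 + 4;
    uint32_t res1 = demo_ldl_le_aligned(mem, addr1);
    uint32_t res2 = demo_ldl_le_aligned(mem, addr2);
    unsigned shift = (unsigned)(addr & 3) * 8;

    /* Little-endian combine. */
    return (res1 >> shift) | (res2 << (32 - shift));
}

static uint32_t demo_ldl_be_unaligned(const uint8_t *mem, size_t addr)
{
    size_t addr1 = addr & ~(size_t)3;
    size_t addr2 = addr1 + 4;
    uint32_t res1 = demo_ldl_be_aligned(mem, addr1);
    uint32_t res2 = demo_ldl_be_aligned(mem, addr2);
    unsigned shift = (unsigned)(addr & 3) * 8;

    /* Big-endian combine. */
    return (res1 << shift) | (res2 >> (32 - shift));
}
```

Calling demo_ldl_le_unaligned(demo_mem, 1) reads bytes 0x11, 0x22, 0x33, 0x44 and assembles them least-significant byte first. The big-endian variant reads the same four bytes most-significant byte first.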


* Re: [Qemu-devel] [PATCH  v1 0/4] de-macrofy softmmu
  2018-12-17 15:01 [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
                   ` (3 preceding siblings ...)
  2018-12-17 15:01 ` [Qemu-devel] [PATCH v1 4/4] accel/tcg: remove softmmu_template.h Alex Bennée
@ 2018-12-17 16:15 ` Alex Bennée
  2018-12-17 17:29   ` Alex Bennée
  2018-12-17 17:33   ` Emilio G. Cota
  2018-12-24  5:21 ` no-reply
  5 siblings, 2 replies; 9+ messages in thread
From: Alex Bennée @ 2018-12-17 16:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: cota


Alex Bennée <alex.bennee@linaro.org> writes:

> Hi,
>
<snip>
>
> Unfortunately in my simple boot test I see a slight performance
> degradation:
>
> original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation)
> demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation)

Moving the helpers back into cputlb.c seems to help, recovering part of
the regression:

 10 times (100.00%), avg time 5.583 (0.03 varience/0.17 deviation)

>
> Emilio,
>
> Any chance you could run this through your more comprehensive benchmark
> suite?
>
> Alex Bennée (4):
>   accel/tcg: export some cputlb functions
>   accel/tcg: introduce softmmu.c
>   accel/tcg: use TLB helpers from softmmu.o
>   accel/tcg: remove softmmu_template.h
>
>  accel/tcg/Makefile.objs      |   1 +
>  accel/tcg/cputlb.c           |  63 +----
>  accel/tcg/cputlb.h           |  21 ++
>  accel/tcg/softmmu.c          | 452 +++++++++++++++++++++++++++++++++++
>  accel/tcg/softmmu_template.h | 446 ----------------------------------
>  5 files changed, 485 insertions(+), 498 deletions(-)
>  create mode 100644 accel/tcg/cputlb.h
>  create mode 100644 accel/tcg/softmmu.c
>  delete mode 100644 accel/tcg/softmmu_template.h


--
Alex Bennée


* Re: [Qemu-devel] [PATCH  v1 0/4] de-macrofy softmmu
  2018-12-17 16:15 ` [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
@ 2018-12-17 17:29   ` Alex Bennée
  2018-12-17 17:33   ` Emilio G. Cota
  1 sibling, 0 replies; 9+ messages in thread
From: Alex Bennée @ 2018-12-17 17:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: cota


Alex Bennée <alex.bennee@linaro.org> writes:

> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> Hi,
>>
> <snip>
>>
>> Unfortunately in my simple boot test I see a slight performance
>> degradation:
>>
>> original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation)
>> demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation)
>
> Moving stuff back into cputlb seems to help:
>
>  10 times (100.00%), avg time 5.583 (0.03 varience/0.17 deviation)

See the v2 branch:

  https://github.com/stsquad/qemu/tree/ldst/demacrofy-v2

That branch:

  - keeps everything in cputlb.c (dropping the externs)
  - factors out the unaligned access handling
  - uses __always_inline__ instead of __flatten__
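
The idea behind the last point can be sketched as follows. This is a minimal illustration of the general technique, not the code on the branch: demo_load_common, demo_le_ldul and demo_be_lduw are hypothetical names, and the sketch assumes a GCC-compatible compiler for the attribute. One common routine takes the access size and endianness as arguments. Each thin helper passes compile-time constants, so once the common routine is inlined the compiler can discard the branches that helper never takes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One common load routine.  size and big_endian are constants at every
   call site below, so after forced inlining the compiler is free to
   fold the endianness test and drop the unused arm. */
static inline __attribute__((always_inline)) uint64_t
demo_load_common(const uint8_t *p, size_t size, bool big_endian)
{
    uint64_t res = 0;

    for (size_t i = 0; i < size; i++) {
        unsigned shift = (unsigned)(big_endian ? (size - 1 - i) : i) * 8;
        res |= (uint64_t)p[i] << shift;
    }
    return res;
}

/* Thin, individually named helpers: these are the symbols a debugger
   or grep would find, unlike names glued together by macros. */
uint32_t demo_le_ldul(const uint8_t *p)
{
    return (uint32_t)demo_load_common(p, 4, false);
}

uint32_t demo_be_lduw(const uint8_t *p)
{
    return (uint32_t)demo_load_common(p, 2, true);
}
```

Unlike __flatten__, which asks the compiler to inline every call made inside the annotated function, __always_inline__ is placed on the callee, so only the routines chosen for specialisation are forced inline.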

>
>>
>> Emilio,
>>
>> Any chance you could run this through your more comprehensive benchmark
>> suite?
>>
>> Alex Bennée (4):
>>   accel/tcg: export some cputlb functions
>>   accel/tcg: introduce softmmu.c
>>   accel/tcg: use TLB helpers from softmmu.o
>>   accel/tcg: remove softmmu_template.h
>>
>>  accel/tcg/Makefile.objs      |   1 +
>>  accel/tcg/cputlb.c           |  63 +----
>>  accel/tcg/cputlb.h           |  21 ++
>>  accel/tcg/softmmu.c          | 452 +++++++++++++++++++++++++++++++++++
>>  accel/tcg/softmmu_template.h | 446 ----------------------------------
>>  5 files changed, 485 insertions(+), 498 deletions(-)
>>  create mode 100644 accel/tcg/cputlb.h
>>  create mode 100644 accel/tcg/softmmu.c
>>  delete mode 100644 accel/tcg/softmmu_template.h


--
Alex Bennée


* Re: [Qemu-devel] [PATCH  v1 0/4] de-macrofy softmmu
  2018-12-17 16:15 ` [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
  2018-12-17 17:29   ` Alex Bennée
@ 2018-12-17 17:33   ` Emilio G. Cota
  1 sibling, 0 replies; 9+ messages in thread
From: Emilio G. Cota @ 2018-12-17 17:33 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel

On Mon, Dec 17, 2018 at 16:15:10 +0000, Alex Bennée wrote:
> 
> Alex Bennée <alex.bennee@linaro.org> writes:
> 
> > Hi,
> >
> <snip>
> >
> > Unfortunately in my simple boot test I see a slight performance
> > degradation:
> >
> > original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation)
> > demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation)
> 
> Moving stuff back into cputlb seems to help:
> 
>  10 times (100.00%), avg time 5.583 (0.03 varience/0.17 deviation)

Yes, I'd move it there. Marking the slow paths with the __noinline__
attribute, as we did for hardfloat, should also help.
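
A rough sketch of that split, assuming a GCC-compatible compiler: the rarely taken path lives in its own noinline function so that the inlined fast path stays small. All names here (DEMO_PAGE_SIZE, demo_ldl, demo_ldl_slow, demo_slow_calls) are hypothetical and only illustrate the pattern; they are not taken from QEMU or from the hardfloat code.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 16u   /* tiny page size, for illustration only */

/* Counts slow-path entries so the split can be observed. */
static unsigned demo_slow_calls;

/* Rare case: the access crosses a page boundary.  Kept out of line so
   it does not bloat every caller of the fast path. */
static __attribute__((noinline)) uint32_t
demo_ldl_slow(const uint8_t *mem, size_t addr)
{
    uint32_t res = 0;

    demo_slow_calls++;
    for (unsigned i = 0; i < 4; i++) {
        res |= (uint32_t)mem[addr + i] << (i * 8);
    }
    return res;
}

/* Common case: the access stays inside one page. */
static inline uint32_t demo_ldl(const uint8_t *mem, size_t addr)
{
    if (__builtin_expect((addr & (DEMO_PAGE_SIZE - 1)) + 4
                         > DEMO_PAGE_SIZE, 0)) {
        return demo_ldl_slow(mem, addr);
    }
    return (uint32_t)mem[addr]
         | (uint32_t)mem[addr + 1] << 8
         | (uint32_t)mem[addr + 2] << 16
         | (uint32_t)mem[addr + 3] << 24;
}
```

With a 16-byte page, a load at offset 12 stays on the fast path, while a load at offset 14 spans two pages and goes through demo_ldl_slow.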

> > Emilio,
> >
> > Any chance you could run this through your more comprehensive benchmark
> > suite?

Sure, please give me a few days (got a paper submission deadline
on Wed).

Do you have a branch I can pull from?

Thanks,

		Emilio


* Re: [Qemu-devel] [PATCH  v1 0/4] de-macrofy softmmu
  2018-12-17 15:01 [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
                   ` (4 preceding siblings ...)
  2018-12-17 16:15 ` [Qemu-devel] [PATCH v1 0/4] de-macrofy softmmu Alex Bennée
@ 2018-12-24  5:21 ` no-reply
  5 siblings, 0 replies; 9+ messages in thread
From: no-reply @ 2018-12-24  5:21 UTC (permalink / raw)
  To: alex.bennee; +Cc: fam, qemu-devel, cota

Patchew URL: https://patchew.org/QEMU/20181217150116.10446-1-alex.bennee@linaro.org/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-mingw@fedora SHOW_ENV=1 J=8
=== TEST SCRIPT END ===

  CC      x86_64-softmmu/target/i386/cpu.o
  CC      x86_64-softmmu/target/i386/gdbstub.o
  CC      aarch64-softmmu/hw/block/virtio-blk.o
/tmp/qemu-test/src/accel/tcg/softmmu.c:207:1: error: conflicting types for 'helper_le_ldq_mmu'
 helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 ^~~~~~~~~~~~~~~~~
In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0,
---
/tmp/qemu-test/src/tcg/tcg.h:1307:10: note: previous declaration of 'helper_le_ldq_mmu' was here
 uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
          ^~~~~~~~~~~~~~~~~
/tmp/qemu-test/src/accel/tcg/softmmu.c:214:1: error: conflicting types for 'helper_be_ldq_mmu'
 helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 ^~~~~~~~~~~~~~~~~
In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0,
---
  CC      aarch64-softmmu/hw/misc/omap_tap.o
  CC      aarch64-softmmu/hw/misc/bcm2835_mbox.o
  CC      aarch64-softmmu/hw/misc/bcm2835_property.o
/tmp/qemu-test/src/accel/tcg/softmmu.c:207:1: error: conflicting types for 'helper_le_ldq_mmu'
 helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 ^~~~~~~~~~~~~~~~~
In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0,
---
/tmp/qemu-test/src/tcg/tcg.h:1307:10: note: previous declaration of 'helper_le_ldq_mmu' was here
 uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
          ^~~~~~~~~~~~~~~~~
/tmp/qemu-test/src/accel/tcg/softmmu.c:214:1: error: conflicting types for 'helper_be_ldq_mmu'
 helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 ^~~~~~~~~~~~~~~~~
In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0,


The full log is available at
http://patchew.org/logs/20181217150116.10446-1-alex.bennee@linaro.org/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

