qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching
@ 2018-04-17 14:07 Paolo Bonzini
  2018-04-17 14:07 ` [Qemu-devel] [PATCH 1/4] exec: move memory access declarations to a common header, inline *_phys functions Paolo Bonzini
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Paolo Bonzini @ 2018-04-17 14:07 UTC (permalink / raw)
  To: qemu-devel

MemoryRegionCache was reverted to "normal" address_space_* operations
for 2.9, due to lack of support for IOMMUs.  This series reinstates
optimizations, caching only the IOMMU translation but not the IOMMU
lookup and target AddressSpace translation.

Patches 1 to 3 prepare by introducing a new function
address_space_translate_iommu (split out of address_space_translate)
and new header files for declarations shared by regular and "cached"
address_space_* functions.  Patch 4 uses them to introduce both
the slow path for IOMMU and MMIO cached regions, and the fast path
for RAM cached regions.

Paolo

Paolo Bonzini (4):
  exec: move memory access declarations to a common header, inline
    *_phys functions
  exec: small changes to flatview_do_translate
  exec: extract address_space_translate_iommu, fix page_mask corner case
  exec: reintroduce MemoryRegion caching

 exec.c                                | 245 ++++++++++++++++++++------
 include/exec/cpu-all.h                |  79 ++++-----
 include/exec/memory-internal.h        |   3 +
 include/exec/memory.h                 | 209 ++++++++++------------
 include/exec/memory_ldst.inc.h        |  71 ++++++++
 include/exec/memory_ldst_cached.inc.h | 108 ++++++++++++
 include/exec/memory_ldst_phys.inc.h   | 147 ++++++++++++++++
 memory.c                              |   4 +-
 memory_ldst.inc.c                     | 126 -------------
 9 files changed, 651 insertions(+), 341 deletions(-)
 create mode 100644 include/exec/memory_ldst.inc.h
 create mode 100644 include/exec/memory_ldst_cached.inc.h
 create mode 100644 include/exec/memory_ldst_phys.inc.h

-- 
2.17.0

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH 1/4] exec: move memory access declarations to a common header, inline *_phys functions
  2018-04-17 14:07 [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching Paolo Bonzini
@ 2018-04-17 14:07 ` Paolo Bonzini
  2018-04-17 14:08 ` [Qemu-devel] [PATCH 2/4] exec: small changes to flatview_do_translate Paolo Bonzini
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2018-04-17 14:07 UTC (permalink / raw)
  To: qemu-devel

For now, this reduces the text size very slightly due to the newly-added
inlining:

   text size before: 9301965
   text size after: 9300645

Later, however, the declarations in include/exec/memory_ldst.inc.h will be
reused for the MemoryRegionCache slow path functions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/exec/cpu-all.h              |  75 ++++++--------
 include/exec/memory.h               | 153 ++++++++--------------------
 include/exec/memory_ldst.inc.h      |  71 +++++++++++++
 include/exec/memory_ldst_phys.inc.h | 147 ++++++++++++++++++++++++++
 memory_ldst.inc.c                   | 126 -----------------------
 5 files changed, 292 insertions(+), 280 deletions(-)
 create mode 100644 include/exec/memory_ldst.inc.h
 create mode 100644 include/exec/memory_ldst_phys.inc.h

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index f4fa94e966..173edd1fb4 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -168,51 +168,36 @@ extern unsigned long reserved_va;
 #else
 
 #include "exec/hwaddr.h"
-uint32_t lduw_phys(AddressSpace *as, hwaddr addr);
-uint32_t ldl_phys(AddressSpace *as, hwaddr addr);
-uint64_t ldq_phys(AddressSpace *as, hwaddr addr);
-void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val);
-void stw_phys(AddressSpace *as, hwaddr addr, uint32_t val);
-void stl_phys(AddressSpace *as, hwaddr addr, uint32_t val);
-void stq_phys(AddressSpace *as, hwaddr addr, uint64_t val);
-
-uint32_t address_space_lduw(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_ldl(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint64_t address_space_ldq(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stl_notdirty(AddressSpace *as, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stw(AddressSpace *as, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stl(AddressSpace *as, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stq(AddressSpace *as, hwaddr addr, uint64_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-
-uint32_t lduw_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-uint32_t ldl_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-uint64_t ldq_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-void stl_phys_notdirty_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val);
-void stw_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val);
-void stl_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val);
-void stq_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint64_t val);
-
-uint32_t address_space_lduw_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_ldl_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint64_t address_space_ldq_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stl_notdirty_cached(MemoryRegionCache *cache, hwaddr addr,
-                            uint32_t val, MemTxAttrs attrs, MemTxResult *result);
-void address_space_stw_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stl_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stq_cached(MemoryRegionCache *cache, hwaddr addr, uint64_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
+
+#define SUFFIX
+#define ARG1         as
+#define ARG1_DECL    AddressSpace *as
+#define TARGET_ENDIANNESS
+#include "exec/memory_ldst.inc.h"
+
+#define SUFFIX       _cached
+#define ARG1         cache
+#define ARG1_DECL    MemoryRegionCache *cache
+#define TARGET_ENDIANNESS
+#include "exec/memory_ldst.inc.h"
+
+static inline void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val)
+{
+    address_space_stl_notdirty(as, addr, val,
+                               MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+#define SUFFIX
+#define ARG1         as
+#define ARG1_DECL    AddressSpace *as
+#define TARGET_ENDIANNESS
+#include "exec/memory_ldst_phys.inc.h"
+
+#define SUFFIX       _cached
+#define ARG1         cache
+#define ARG1_DECL    MemoryRegionCache *cache
+#define TARGET_ENDIANNESS
+#include "exec/memory_ldst_phys.inc.h"
 #endif
 
 /* page related stuff */
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 31eae0a640..ca361bc409 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1676,49 +1676,16 @@ MemTxResult address_space_write(AddressSpace *as, hwaddr addr,
  * @result: location to write the success/failure of the transaction;
  *   if NULL, this information is discarded
  */
-uint32_t address_space_ldub(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_lduw_le(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_lduw_be(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_ldl_le(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_ldl_be(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint64_t address_space_ldq_le(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint64_t address_space_ldq_be(AddressSpace *as, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stb(AddressSpace *as, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stw_le(AddressSpace *as, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stw_be(AddressSpace *as, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stl_le(AddressSpace *as, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stl_be(AddressSpace *as, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stq_le(AddressSpace *as, hwaddr addr, uint64_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stq_be(AddressSpace *as, hwaddr addr, uint64_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-
-uint32_t ldub_phys(AddressSpace *as, hwaddr addr);
-uint32_t lduw_le_phys(AddressSpace *as, hwaddr addr);
-uint32_t lduw_be_phys(AddressSpace *as, hwaddr addr);
-uint32_t ldl_le_phys(AddressSpace *as, hwaddr addr);
-uint32_t ldl_be_phys(AddressSpace *as, hwaddr addr);
-uint64_t ldq_le_phys(AddressSpace *as, hwaddr addr);
-uint64_t ldq_be_phys(AddressSpace *as, hwaddr addr);
-void stb_phys(AddressSpace *as, hwaddr addr, uint32_t val);
-void stw_le_phys(AddressSpace *as, hwaddr addr, uint32_t val);
-void stw_be_phys(AddressSpace *as, hwaddr addr, uint32_t val);
-void stl_le_phys(AddressSpace *as, hwaddr addr, uint32_t val);
-void stl_be_phys(AddressSpace *as, hwaddr addr, uint32_t val);
-void stq_le_phys(AddressSpace *as, hwaddr addr, uint64_t val);
-void stq_be_phys(AddressSpace *as, hwaddr addr, uint64_t val);
+
+#define SUFFIX
+#define ARG1         as
+#define ARG1_DECL    AddressSpace *as
+#include "exec/memory_ldst.inc.h"
+
+#define SUFFIX
+#define ARG1         as
+#define ARG1_DECL    AddressSpace *as
+#include "exec/memory_ldst_phys.inc.h"
 
 struct MemoryRegionCache {
     hwaddr xlat;
@@ -1728,6 +1695,40 @@ struct MemoryRegionCache {
 
 #define MEMORY_REGION_CACHE_INVALID ((MemoryRegionCache) { .as = NULL })
 
+/* address_space_ld*_cached: load from a cached #MemoryRegion
+ * address_space_st*_cached: store into a cached #MemoryRegion
+ *
+ * These functions perform a load or store of the byte, word,
+ * longword or quad to the specified address.  The address is
+ * a physical address in the AddressSpace, but it must lie within
+ * a #MemoryRegion that was mapped with address_space_cache_init.
+ *
+ * The _le suffixed functions treat the data as little endian;
+ * _be indicates big endian; no suffix indicates "same endianness
+ * as guest CPU".
+ *
+ * The "guest CPU endianness" accessors are deprecated for use outside
+ * target-* code; devices should be CPU-agnostic and use either the LE
+ * or the BE accessors.
+ *
+ * @cache: previously initialized #MemoryRegionCache to be accessed
+ * @addr: address within the address space
+ * @val: data value, for stores
+ * @attrs: memory transaction attributes
+ * @result: location to write the success/failure of the transaction;
+ *   if NULL, this information is discarded
+ */
+
+#define SUFFIX       _cached
+#define ARG1         cache
+#define ARG1_DECL    MemoryRegionCache *cache
+#include "exec/memory_ldst.inc.h"
+
+#define SUFFIX       _cached
+#define ARG1         cache
+#define ARG1_DECL    MemoryRegionCache *cache
+#include "exec/memory_ldst_phys.inc.h"
+
 /* address_space_cache_init: prepare for repeated access to a physical
  * memory region
  *
@@ -1772,72 +1773,6 @@ void address_space_cache_invalidate(MemoryRegionCache *cache,
  */
 void address_space_cache_destroy(MemoryRegionCache *cache);
 
-/* address_space_ld*_cached: load from a cached #MemoryRegion
- * address_space_st*_cached: store into a cached #MemoryRegion
- *
- * These functions perform a load or store of the byte, word,
- * longword or quad to the specified address.  The address is
- * a physical address in the AddressSpace, but it must lie within
- * a #MemoryRegion that was mapped with address_space_cache_init.
- *
- * The _le suffixed functions treat the data as little endian;
- * _be indicates big endian; no suffix indicates "same endianness
- * as guest CPU".
- *
- * The "guest CPU endianness" accessors are deprecated for use outside
- * target-* code; devices should be CPU-agnostic and use either the LE
- * or the BE accessors.
- *
- * @cache: previously initialized #MemoryRegionCache to be accessed
- * @addr: address within the address space
- * @val: data value, for stores
- * @attrs: memory transaction attributes
- * @result: location to write the success/failure of the transaction;
- *   if NULL, this information is discarded
- */
-uint32_t address_space_ldub_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_lduw_le_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_lduw_be_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_ldl_le_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint32_t address_space_ldl_be_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint64_t address_space_ldq_le_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-uint64_t address_space_ldq_be_cached(MemoryRegionCache *cache, hwaddr addr,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stb_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stw_le_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stw_be_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stl_le_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stl_be_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stq_le_cached(MemoryRegionCache *cache, hwaddr addr, uint64_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-void address_space_stq_be_cached(MemoryRegionCache *cache, hwaddr addr, uint64_t val,
-                            MemTxAttrs attrs, MemTxResult *result);
-
-uint32_t ldub_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-uint32_t lduw_le_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-uint32_t lduw_be_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-uint32_t ldl_le_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-uint32_t ldl_be_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-uint64_t ldq_le_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-uint64_t ldq_be_phys_cached(MemoryRegionCache *cache, hwaddr addr);
-void stb_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val);
-void stw_le_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val);
-void stw_be_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val);
-void stl_le_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val);
-void stl_be_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint32_t val);
-void stq_le_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint64_t val);
-void stq_be_phys_cached(MemoryRegionCache *cache, hwaddr addr, uint64_t val);
 /* address_space_get_iotlb_entry: translate an address into an IOTLB
  * entry. Should be called from an RCU critical section.
  */
diff --git a/include/exec/memory_ldst.inc.h b/include/exec/memory_ldst.inc.h
new file mode 100644
index 0000000000..272c20f02e
--- /dev/null
+++ b/include/exec/memory_ldst.inc.h
@@ -0,0 +1,71 @@
+/*
+ *  Physical memory access templates
+ *
+ *  Copyright (c) 2003 Fabrice Bellard
+ *  Copyright (c) 2015 Linaro, Inc.
+ *  Copyright (c) 2018 Red Hat, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef TARGET_ENDIANNESS
+extern uint32_t glue(address_space_lduw, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern uint32_t glue(address_space_ldl, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern uint64_t glue(address_space_ldq, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stl_notdirty, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stw, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stl, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stq, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint64_t val, MemTxAttrs attrs, MemTxResult *result);
+#else
+extern uint32_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern uint32_t glue(address_space_lduw_le, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern uint32_t glue(address_space_lduw_be, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern uint32_t glue(address_space_ldl_le, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern uint32_t glue(address_space_ldl_be, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern uint64_t glue(address_space_ldq_le, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern uint64_t glue(address_space_ldq_be, SUFFIX)(ARG1_DECL,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stb, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stw_le, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stw_be, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stl_le, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stl_be, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stq_le, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint64_t val, MemTxAttrs attrs, MemTxResult *result);
+extern void glue(address_space_stq_be, SUFFIX)(ARG1_DECL,
+    hwaddr addr, uint64_t val, MemTxAttrs attrs, MemTxResult *result);
+#endif
+
+#undef ARG1_DECL
+#undef ARG1
+#undef SUFFIX
+#undef TARGET_ENDIANNESS
diff --git a/include/exec/memory_ldst_phys.inc.h b/include/exec/memory_ldst_phys.inc.h
new file mode 100644
index 0000000000..91f72973cb
--- /dev/null
+++ b/include/exec/memory_ldst_phys.inc.h
@@ -0,0 +1,147 @@
+/*
+ *  Physical memory access templates
+ *
+ *  Copyright (c) 2003 Fabrice Bellard
+ *  Copyright (c) 2015 Linaro, Inc.
+ *  Copyright (c) 2018 Red Hat, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifdef TARGET_ENDIANNESS
+static inline uint32_t glue(ldl_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_ldl, SUFFIX)(ARG1, addr,
+                                           MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline uint64_t glue(ldq_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_ldq, SUFFIX)(ARG1, addr,
+                                           MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline uint32_t glue(lduw_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_lduw, SUFFIX)(ARG1, addr,
+                                            MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stl_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
+{
+    glue(address_space_stl, SUFFIX)(ARG1, addr, val,
+                                    MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stw_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
+{
+    glue(address_space_stw, SUFFIX)(ARG1, addr, val,
+                                    MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stq_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t val)
+{
+    glue(address_space_stq, SUFFIX)(ARG1, addr, val,
+                                    MEMTXATTRS_UNSPECIFIED, NULL);
+}
+#else
+static inline uint32_t glue(ldl_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_ldl_le, SUFFIX)(ARG1, addr,
+                                              MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline uint32_t glue(ldl_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_ldl_be, SUFFIX)(ARG1, addr,
+                                              MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline uint64_t glue(ldq_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_ldq_le, SUFFIX)(ARG1, addr,
+                                              MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline uint64_t glue(ldq_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_ldq_be, SUFFIX)(ARG1, addr,
+                                              MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline uint32_t glue(ldub_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_ldub, SUFFIX)(ARG1, addr,
+                                            MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline uint32_t glue(lduw_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_lduw_le, SUFFIX)(ARG1, addr,
+                                               MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline uint32_t glue(lduw_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
+{
+    return glue(address_space_lduw_be, SUFFIX)(ARG1, addr,
+                                               MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stl_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
+{
+    glue(address_space_stl_le, SUFFIX)(ARG1, addr, val,
+                                       MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stl_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
+{
+    glue(address_space_stl_be, SUFFIX)(ARG1, addr, val,
+                                       MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stb_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
+{
+    glue(address_space_stb, SUFFIX)(ARG1, addr, val,
+                                    MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stw_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
+{
+    glue(address_space_stw_le, SUFFIX)(ARG1, addr, val,
+                                       MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stw_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
+{
+    glue(address_space_stw_be, SUFFIX)(ARG1, addr, val,
+                                       MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stq_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t val)
+{
+    glue(address_space_stq_le, SUFFIX)(ARG1, addr, val,
+                                       MEMTXATTRS_UNSPECIFIED, NULL);
+}
+
+static inline void glue(stq_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t val)
+{
+    glue(address_space_stq_be, SUFFIX)(ARG1, addr, val,
+                                       MEMTXATTRS_UNSPECIFIED, NULL);
+}
+#endif
+
+#undef ARG1_DECL
+#undef ARG1
+#undef SUFFIX
+#undef TARGET_ENDIANNESS
diff --git a/memory_ldst.inc.c b/memory_ldst.inc.c
index 5dbff9cef8..25d6125747 100644
--- a/memory_ldst.inc.c
+++ b/memory_ldst.inc.c
@@ -95,24 +95,6 @@ uint32_t glue(address_space_ldl_be, SUFFIX)(ARG1_DECL,
                                                     DEVICE_BIG_ENDIAN);
 }
 
-uint32_t glue(ldl_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_ldl, SUFFIX)(ARG1, addr,
-                                           MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-uint32_t glue(ldl_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_ldl_le, SUFFIX)(ARG1, addr,
-                                              MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-uint32_t glue(ldl_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_ldl_be, SUFFIX)(ARG1, addr,
-                                              MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
 /* warning: addr must be aligned */
 static inline uint64_t glue(address_space_ldq_internal, SUFFIX)(ARG1_DECL,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result,
@@ -189,24 +171,6 @@ uint64_t glue(address_space_ldq_be, SUFFIX)(ARG1_DECL,
                                                     DEVICE_BIG_ENDIAN);
 }
 
-uint64_t glue(ldq_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_ldq, SUFFIX)(ARG1, addr,
-                                           MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-uint64_t glue(ldq_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_ldq_le, SUFFIX)(ARG1, addr,
-                                              MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-uint64_t glue(ldq_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_ldq_be, SUFFIX)(ARG1, addr,
-                                              MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
 uint32_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
 {
@@ -241,12 +205,6 @@ uint32_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
     return val;
 }
 
-uint32_t glue(ldub_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_ldub, SUFFIX)(ARG1, addr,
-                                            MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
 /* warning: addr must be aligned */
 static inline uint32_t glue(address_space_lduw_internal, SUFFIX)(ARG1_DECL,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result,
@@ -323,24 +281,6 @@ uint32_t glue(address_space_lduw_be, SUFFIX)(ARG1_DECL,
                                        DEVICE_BIG_ENDIAN);
 }
 
-uint32_t glue(lduw_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_lduw, SUFFIX)(ARG1, addr,
-                                            MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-uint32_t glue(lduw_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_lduw_le, SUFFIX)(ARG1, addr,
-                                               MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-uint32_t glue(lduw_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr)
-{
-    return glue(address_space_lduw_be, SUFFIX)(ARG1, addr,
-                                               MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
 /* warning: addr must be aligned. The ram page is not masked as dirty
    and the code inside is not invalidated. It is useful if the dirty
    bits are used to track modified PTEs */
@@ -380,12 +320,6 @@ void glue(address_space_stl_notdirty, SUFFIX)(ARG1_DECL,
     RCU_READ_UNLOCK();
 }
 
-void glue(stl_phys_notdirty, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
-{
-    glue(address_space_stl_notdirty, SUFFIX)(ARG1, addr, val,
-                                             MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
 /* warning: addr must be aligned */
 static inline void glue(address_space_stl_internal, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint32_t val, MemTxAttrs attrs,
@@ -460,24 +394,6 @@ void glue(address_space_stl_be, SUFFIX)(ARG1_DECL,
                                              result, DEVICE_BIG_ENDIAN);
 }
 
-void glue(stl_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
-{
-    glue(address_space_stl, SUFFIX)(ARG1, addr, val,
-                                    MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-void glue(stl_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
-{
-    glue(address_space_stl_le, SUFFIX)(ARG1, addr, val,
-                                       MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-void glue(stl_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
-{
-    glue(address_space_stl_be, SUFFIX)(ARG1, addr, val,
-                                       MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
 void glue(address_space_stb, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
 {
@@ -509,12 +425,6 @@ void glue(address_space_stb, SUFFIX)(ARG1_DECL,
     RCU_READ_UNLOCK();
 }
 
-void glue(stb_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
-{
-    glue(address_space_stb, SUFFIX)(ARG1, addr, val,
-                                    MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
 /* warning: addr must be aligned */
 static inline void glue(address_space_stw_internal, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint32_t val, MemTxAttrs attrs,
@@ -589,24 +499,6 @@ void glue(address_space_stw_be, SUFFIX)(ARG1_DECL,
                                DEVICE_BIG_ENDIAN);
 }
 
-void glue(stw_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
-{
-    glue(address_space_stw, SUFFIX)(ARG1, addr, val,
-                                    MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-void glue(stw_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
-{
-    glue(address_space_stw_le, SUFFIX)(ARG1, addr, val,
-                                       MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-void glue(stw_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint32_t val)
-{
-    glue(address_space_stw_be, SUFFIX)(ARG1, addr, val,
-                                       MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
 static void glue(address_space_stq_internal, SUFFIX)(ARG1_DECL,
     hwaddr addr, uint64_t val, MemTxAttrs attrs,
     MemTxResult *result, enum device_endian endian)
@@ -680,24 +572,6 @@ void glue(address_space_stq_be, SUFFIX)(ARG1_DECL,
                                              DEVICE_BIG_ENDIAN);
 }
 
-void glue(stq_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t val)
-{
-    glue(address_space_stq, SUFFIX)(ARG1, addr, val,
-                                    MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-void glue(stq_le_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t val)
-{
-    glue(address_space_stq_le, SUFFIX)(ARG1, addr, val,
-                                       MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
-void glue(stq_be_phys, SUFFIX)(ARG1_DECL, hwaddr addr, uint64_t val)
-{
-    glue(address_space_stq_be, SUFFIX)(ARG1, addr, val,
-                                       MEMTXATTRS_UNSPECIFIED, NULL);
-}
-
 #undef ARG1_DECL
 #undef ARG1
 #undef SUFFIX
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH 2/4] exec: small changes to flatview_do_translate
  2018-04-17 14:07 [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching Paolo Bonzini
  2018-04-17 14:07 ` [Qemu-devel] [PATCH 1/4] exec: move memory access declarations to a common header, inline *_phys functions Paolo Bonzini
@ 2018-04-17 14:08 ` Paolo Bonzini
  2018-05-04  4:21   ` Peter Xu
  2018-04-17 14:08 ` [Qemu-devel] [PATCH 3/4] exec: extract address_space_translate_iommu, fix page_mask corner case Paolo Bonzini
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2018-04-17 14:08 UTC (permalink / raw)
  To: qemu-devel

Prepare for extracting the IOMMU part to a separate function.  Mostly
cosmetic; the only semantic change is that, if there is more than one
cascaded IOMMU and the second one fails to translate, *plen_out is now
adjusted according to the page mask of the first IOMMU.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 33 ++++++++++++++-------------------
 1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/exec.c b/exec.c
index 02b1efebb7..1dea3e9c95 100644
--- a/exec.c
+++ b/exec.c
@@ -476,6 +476,7 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x
  *            would tell. It can be @NULL if we don't care about it.
  * @is_write: whether the translation operation is for write
  * @is_mmio: whether this can be MMIO, set true if it can
+ * @target_as: the address space targeted by the IOMMU
  *
  * This function is called from RCU critical section
  */
@@ -495,14 +496,14 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
     hwaddr page_mask = (hwaddr)(-1);
     hwaddr plen = (hwaddr)(-1);
 
-    if (plen_out) {
-        plen = *plen_out;
+    if (!plen_out) {
+        plen_out = &plen;
     }
 
     for (;;) {
         section = address_space_translate_internal(
-                flatview_to_dispatch(fv), addr, &addr,
-                &plen, is_mmio);
+                flatview_to_dispatch(fv), addr, xlat,
+                plen_out, is_mmio);
 
         iommu_mr = memory_region_get_iommu(section->mr);
         if (!iommu_mr) {
@@ -510,35 +511,29 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
         }
         imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
 
+        addr = *xlat;
         iotlb = imrc->translate(iommu_mr, addr, is_write ?
                                 IOMMU_WO : IOMMU_RO);
-        addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
-                | (addr & iotlb.addr_mask));
-        page_mask &= iotlb.addr_mask;
-        plen = MIN(plen, (addr | iotlb.addr_mask) - addr + 1);
         if (!(iotlb.perm & (1 << is_write))) {
             goto translate_fail;
         }
 
+        addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
+                | (addr & iotlb.addr_mask));
+        page_mask &= iotlb.addr_mask;
+        *plen_out = MIN(*plen_out, (addr | iotlb.addr_mask) - addr + 1);
         fv = address_space_to_flatview(iotlb.target_as);
         *target_as = iotlb.target_as;
     }
 
-    *xlat = addr;
-
-    if (page_mask == (hwaddr)(-1)) {
-        /* Not behind an IOMMU, use default page size. */
-        page_mask = ~TARGET_PAGE_MASK;
-    }
-
     if (page_mask_out) {
+        if (page_mask == (hwaddr)(-1)) {
+            /* Not behind an IOMMU, use default page size. */
+            page_mask = ~TARGET_PAGE_MASK;
+        }
         *page_mask_out = page_mask;
     }
 
-    if (plen_out) {
-        *plen_out = plen;
-    }
-
     return *section;
 
 translate_fail:
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH 3/4] exec: extract address_space_translate_iommu, fix page_mask corner case
  2018-04-17 14:07 [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching Paolo Bonzini
  2018-04-17 14:07 ` [Qemu-devel] [PATCH 1/4] exec: move memory access declarations to a common header, inline *_phys functions Paolo Bonzini
  2018-04-17 14:08 ` [Qemu-devel] [PATCH 2/4] exec: small changes to flatview_do_translate Paolo Bonzini
@ 2018-04-17 14:08 ` Paolo Bonzini
  2018-05-04  4:22   ` Peter Xu
  2018-04-17 14:08 ` [Qemu-devel] [PATCH 4/4] exec: reintroduce MemoryRegion caching Paolo Bonzini
  2018-05-16 10:41 ` [Qemu-devel] [PATCH 0/4] " Auger Eric
  4 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2018-04-17 14:08 UTC (permalink / raw)
  To: qemu-devel

This will be used to process IOMMUs in a MemoryRegionCache.  This
includes a small bugfix, in that the returned page_mask is now
correctly -1 if the IOMMU memory region maps the entire address
space directly.  Previously, address_space_get_iotlb_entry would
return ~TARGET_PAGE_MASK.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 113 +++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 78 insertions(+), 35 deletions(-)

diff --git a/exec.c b/exec.c
index 1dea3e9c95..868f89c345 100644
--- a/exec.c
+++ b/exec.c
@@ -461,6 +461,72 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x
     return section;
 }
 
+/**
+ * address_space_translate_iommu - translate an address through an IOMMU
+ * memory region and then through the target address space.
+ *
+ * @iommu_mr: the IOMMU memory region that we start the translation from
+ * @addr: the address to be translated through the MMU
+ * @xlat: the translated address offset within the destination memory region.
+ *        It cannot be %NULL.
+ * @plen_out: valid read/write length of the translated address. It
+ *            cannot be %NULL.
+ * @page_mask_out: page mask for the translated address. This
+ *            should only be meaningful for IOMMU translated
+ *            addresses, since there may be huge pages that this bit
+ *            would tell. It can be %NULL if we don't care about it.
+ * @is_write: whether the translation operation is for write
+ * @is_mmio: whether this can be MMIO, set true if it can
+ * @target_as: the address space targeted by the IOMMU
+ *
+ * This function is called from RCU critical section.
+ */
+static MemoryRegionSection address_space_translate_iommu(IOMMUMemoryRegion *iommu_mr,
+                                                         hwaddr *xlat,
+                                                         hwaddr *plen_out,
+                                                         hwaddr *page_mask_out,
+                                                         bool is_write,
+                                                         bool is_mmio,
+                                                         AddressSpace **target_as)
+{
+    MemoryRegionSection *section;
+    hwaddr page_mask = (hwaddr)-1;
+
+    do {
+        hwaddr addr = *xlat;
+        IOMMUMemoryRegionClass *imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
+        IOMMUTLBEntry iotlb = imrc->translate(iommu_mr, addr, is_write ?
+                                              IOMMU_WO : IOMMU_RO);
+
+        if (!(iotlb.perm & (1 << is_write))) {
+            goto unassigned;
+        }
+
+        addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
+                | (addr & iotlb.addr_mask));
+        page_mask &= iotlb.addr_mask;
+        *plen_out = MIN(*plen_out, (addr | iotlb.addr_mask) - addr + 1);
+        *target_as = iotlb.target_as;
+
+        section = address_space_translate_internal(
+                address_space_to_dispatch(iotlb.target_as), addr, xlat,
+                plen_out, is_mmio);
+        if (!section) {
+            goto unassigned;
+        }
+
+        iommu_mr = memory_region_get_iommu(section->mr);
+    } while (iommu_mr);
+
+    if (page_mask_out) {
+        *page_mask_out = page_mask;
+    }
+    return *section;
+
+unassigned:
+    return (MemoryRegionSection) { .mr = &io_mem_unassigned };
+}
+
 /**
  * flatview_do_translate - translate an address in FlatView
  *
@@ -489,55 +556,31 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
                                                  bool is_mmio,
                                                  AddressSpace **target_as)
 {
-    IOMMUTLBEntry iotlb;
     MemoryRegionSection *section;
     IOMMUMemoryRegion *iommu_mr;
-    IOMMUMemoryRegionClass *imrc;
-    hwaddr page_mask = (hwaddr)(-1);
     hwaddr plen = (hwaddr)(-1);
 
     if (!plen_out) {
         plen_out = &plen;
     }
 
-    for (;;) {
-        section = address_space_translate_internal(
-                flatview_to_dispatch(fv), addr, xlat,
-                plen_out, is_mmio);
+    section = address_space_translate_internal(
+            flatview_to_dispatch(fv), addr, xlat,
+            plen_out, is_mmio);
 
-        iommu_mr = memory_region_get_iommu(section->mr);
-        if (!iommu_mr) {
-            break;
-        }
-        imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
-
-        addr = *xlat;
-        iotlb = imrc->translate(iommu_mr, addr, is_write ?
-                                IOMMU_WO : IOMMU_RO);
-        if (!(iotlb.perm & (1 << is_write))) {
-            goto translate_fail;
-        }
-
-        addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
-                | (addr & iotlb.addr_mask));
-        page_mask &= iotlb.addr_mask;
-        *plen_out = MIN(*plen_out, (addr | iotlb.addr_mask) - addr + 1);
-        fv = address_space_to_flatview(iotlb.target_as);
-        *target_as = iotlb.target_as;
+    iommu_mr = memory_region_get_iommu(section->mr);
+    if (unlikely(iommu_mr)) {
+        return address_space_translate_iommu(iommu_mr, xlat,
+                                             plen_out, page_mask_out,
+                                             is_write, is_mmio,
+                                             target_as);
     }
-
     if (page_mask_out) {
-        if (page_mask == (hwaddr)(-1)) {
-            /* Not behind an IOMMU, use default page size. */
-            page_mask = ~TARGET_PAGE_MASK;
-        }
-        *page_mask_out = page_mask;
+        /* Not behind an IOMMU, use default page size. */
+        *page_mask_out = ~TARGET_PAGE_MASK;
     }
 
     return *section;
-
-translate_fail:
-    return (MemoryRegionSection) { .mr = &io_mem_unassigned };
 }
 
 /* Called from RCU critical section */
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH 4/4] exec: reintroduce MemoryRegion caching
  2018-04-17 14:07 [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching Paolo Bonzini
                   ` (2 preceding siblings ...)
  2018-04-17 14:08 ` [Qemu-devel] [PATCH 3/4] exec: extract address_space_translate_iommu, fix page_mask corner case Paolo Bonzini
@ 2018-04-17 14:08 ` Paolo Bonzini
  2018-05-16 10:41 ` [Qemu-devel] [PATCH 0/4] " Auger Eric
  4 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2018-04-17 14:08 UTC (permalink / raw)
  To: qemu-devel

MemoryRegionCache was reverted to "normal" address_space_* operations
for 2.9, due to lack of support for IOMMUs.  Reinstate the
optimizations, caching only the IOMMU translation at address_cache_init
but not the IOMMU lookup and target AddressSpace translation are not
cached; now that MemoryRegionCache supports IOMMUs, it becomes more widely
applicable too.

The inlined fast path is defined in memory_ldst_cached.inc.h, while the
slow path uses memory_ldst.inc.c as before.  The smaller fast path causes
a little code size reduction in MemoryRegionCache users:

    hw/virtio/virtio.o text size before: 32373
    hw/virtio/virtio.o text size after: 31941

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                                | 121 +++++++++++++++++++++++---
 include/exec/cpu-all.h                |   6 +-
 include/exec/memory-internal.h        |   3 +
 include/exec/memory.h                 |  58 ++++++++++--
 include/exec/memory_ldst_cached.inc.h | 108 +++++++++++++++++++++++
 memory.c                              |   4 +-
 6 files changed, 280 insertions(+), 20 deletions(-)
 create mode 100644 include/exec/memory_ldst_cached.inc.h

diff --git a/exec.c b/exec.c
index 868f89c345..8c01f01e98 100644
--- a/exec.c
+++ b/exec.c
@@ -3654,33 +3654,130 @@ int64_t address_space_cache_init(MemoryRegionCache *cache,
                                  hwaddr len,
                                  bool is_write)
 {
-    cache->len = len;
-    cache->as = as;
-    cache->xlat = addr;
-    return len;
+    AddressSpaceDispatch *d;
+    hwaddr l;
+    MemoryRegion *mr;
+
+    assert(len > 0);
+
+    l = len;
+    cache->fv = address_space_get_flatview(as);
+    d = flatview_to_dispatch(cache->fv);
+    cache->mrs = *address_space_translate_internal(d, addr, &cache->xlat, &l, true);
+
+    mr = cache->mrs.mr;
+    memory_region_ref(mr);
+    if (memory_access_is_direct(mr, is_write)) {
+        l = flatview_extend_translation(cache->fv, addr, len, mr,
+                                        cache->xlat, l, is_write);
+        cache->ptr = qemu_ram_ptr_length(mr->ram_block, cache->xlat, &l, true);
+    } else {
+        cache->ptr = NULL;
+    }
+
+    cache->len = l;
+    cache->is_write = is_write;
+    return l;
 }
 
 void address_space_cache_invalidate(MemoryRegionCache *cache,
                                     hwaddr addr,
                                     hwaddr access_len)
 {
+    assert(cache->is_write);
+    if (likely(cache->ptr)) {
+        invalidate_and_set_dirty(cache->mrs.mr, addr + cache->xlat, access_len);
+    }
 }
 
 void address_space_cache_destroy(MemoryRegionCache *cache)
 {
-    cache->as = NULL;
+    if (!cache->mrs.mr) {
+        return;
+    }
+
+    if (xen_enabled()) {
+        xen_invalidate_map_cache_entry(cache->ptr);
+    }
+    memory_region_unref(cache->mrs.mr);
+    flatview_unref(cache->fv);
+    cache->mrs.mr = NULL;
+    cache->fv = NULL;
+}
+
+/* Called from RCU critical section.  This function has the same
+ * semantics as address_space_translate, but it only works on a
+ * predefined range of a MemoryRegion that was mapped with
+ * address_space_cache_init.
+ */
+static inline MemoryRegion *address_space_translate_cached(
+    MemoryRegionCache *cache, hwaddr addr, hwaddr *xlat,
+    hwaddr *plen, bool is_write)
+{
+    MemoryRegionSection section;
+    MemoryRegion *mr;
+    IOMMUMemoryRegion *iommu_mr;
+    AddressSpace *target_as;
+
+    assert(!cache->ptr);
+    *xlat = addr + cache->xlat;
+
+    mr = cache->mrs.mr;
+    iommu_mr = memory_region_get_iommu(mr);
+    if (!iommu_mr) {
+        /* MMIO region.  */
+        return mr;
+    }
+
+    section = address_space_translate_iommu(iommu_mr, xlat, plen,
+                                            NULL, is_write, true,
+                                            &target_as);
+    return section.mr;
+}
+
+/* Called from RCU critical section. address_space_read_cached uses this
+ * out of line function when the target is an MMIO or IOMMU region.
+ */
+void
+address_space_read_cached_slow(MemoryRegionCache *cache, hwaddr addr,
+                                   void *buf, int len)
+{
+    hwaddr addr1, l;
+    MemoryRegion *mr;
+
+    l = len;
+    mr = address_space_translate_cached(cache, addr, &addr1, &l, false);
+    flatview_read_continue(cache->fv,
+                           addr, MEMTXATTRS_UNSPECIFIED, buf, len,
+                           addr1, l, mr);
+}
+
+/* Called from RCU critical section. address_space_write_cached uses this
+ * out of line function when the target is an MMIO or IOMMU region.
+ */
+void
+address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr,
+                                    const void *buf, int len)
+{
+    hwaddr addr1, l;
+    MemoryRegion *mr;
+
+    l = len;
+    mr = address_space_translate_cached(cache, addr, &addr1, &l, true);
+    flatview_write_continue(cache->fv,
+                            addr, MEMTXATTRS_UNSPECIFIED, buf, len,
+                            addr1, l, mr);
 }
 
 #define ARG1_DECL                MemoryRegionCache *cache
 #define ARG1                     cache
-#define SUFFIX                   _cached
-#define TRANSLATE(addr, ...)     \
-    address_space_translate(cache->as, cache->xlat + (addr), __VA_ARGS__)
-#define IS_DIRECT(mr, is_write)  true
-#define MAP_RAM(mr, ofs)         qemu_map_ram_ptr((mr)->ram_block, ofs)
+#define SUFFIX                   _cached_slow
+#define TRANSLATE(...)           address_space_translate_cached(cache, __VA_ARGS__)
+#define IS_DIRECT(mr, is_write)  memory_access_is_direct(mr, is_write)
+#define MAP_RAM(mr, ofs)         (cache->ptr + (ofs - cache->xlat))
 #define INVALIDATE(mr, ofs, len) invalidate_and_set_dirty(mr, ofs, len)
-#define RCU_READ_LOCK()          rcu_read_lock()
-#define RCU_READ_UNLOCK()        rcu_read_unlock()
+#define RCU_READ_LOCK()          ((void)0)
+#define RCU_READ_UNLOCK()        ((void)0)
 #include "memory_ldst.inc.c"
 
 /* virtual memory access for debug (includes writing to ROM) */
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 173edd1fb4..a635f532f9 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -175,7 +175,7 @@ extern unsigned long reserved_va;
 #define TARGET_ENDIANNESS
 #include "exec/memory_ldst.inc.h"
 
-#define SUFFIX       _cached
+#define SUFFIX       _cached_slow
 #define ARG1         cache
 #define ARG1_DECL    MemoryRegionCache *cache
 #define TARGET_ENDIANNESS
@@ -193,6 +193,10 @@ static inline void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val
 #define TARGET_ENDIANNESS
 #include "exec/memory_ldst_phys.inc.h"
 
+/* Inline fast path for direct RAM access.  */
+#define ENDIANNESS
+#include "exec/memory_ldst_cached.inc.h"
+
 #define SUFFIX       _cached
 #define ARG1         cache
 #define ARG1_DECL    MemoryRegionCache *cache
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 6a5ee42d36..58399b9318 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -31,6 +31,9 @@ static inline AddressSpaceDispatch *address_space_to_dispatch(AddressSpace *as)
     return flatview_to_dispatch(address_space_to_flatview(as));
 }
 
+FlatView *address_space_get_flatview(AddressSpace *as);
+void flatview_unref(FlatView *view);
+
 extern const MemoryRegionOps unassigned_mem_ops;
 
 bool memory_region_access_valid(MemoryRegion *mr, hwaddr addr,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index ca361bc409..525619a5f4 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1688,12 +1688,16 @@ MemTxResult address_space_write(AddressSpace *as, hwaddr addr,
 #include "exec/memory_ldst_phys.inc.h"
 
 struct MemoryRegionCache {
+    void *ptr;
     hwaddr xlat;
     hwaddr len;
-    AddressSpace *as;
+    FlatView *fv;
+    MemoryRegionSection mrs;
+    bool is_write;
 };
 
-#define MEMORY_REGION_CACHE_INVALID ((MemoryRegionCache) { .as = NULL })
+#define MEMORY_REGION_CACHE_INVALID ((MemoryRegionCache) { .mrs.mr = NULL })
+
 
 /* address_space_ld*_cached: load from a cached #MemoryRegion
  * address_space_st*_cached: store into a cached #MemoryRegion
@@ -1719,11 +1723,40 @@ struct MemoryRegionCache {
  *   if NULL, this information is discarded
  */
 
-#define SUFFIX       _cached
+#define SUFFIX       _cached_slow
 #define ARG1         cache
 #define ARG1_DECL    MemoryRegionCache *cache
 #include "exec/memory_ldst.inc.h"
 
+/* Inline fast path for direct RAM access.  */
+static inline uint8_t address_space_ldub_cached(MemoryRegionCache *cache,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
+{
+    assert(addr < cache->len);
+    if (likely(cache->ptr)) {
+        return ldub_p(cache->ptr + addr);
+    } else {
+        return address_space_ldub_cached_slow(cache, addr, attrs, result);
+    }
+}
+
+static inline void address_space_stb_cached(MemoryRegionCache *cache,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
+{
+    assert(addr < cache->len);
+    if (likely(cache->ptr)) {
+        stb_p(cache->ptr + addr, val);
+    } else {
+        address_space_stb_cached_slow(cache, addr, val, attrs, result);
+    }
+}
+
+#define ENDIANNESS   _le
+#include "exec/memory_ldst_cached.inc.h"
+
+#define ENDIANNESS   _be
+#include "exec/memory_ldst_cached.inc.h"
+
 #define SUFFIX       _cached
 #define ARG1         cache
 #define ARG1_DECL    MemoryRegionCache *cache
@@ -1860,6 +1893,13 @@ MemTxResult flatview_read_continue(FlatView *fv, hwaddr addr,
                                    MemoryRegion *mr);
 void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t addr);
 
+/* Internal functions, part of the implementation of address_space_read_cached
+ * and address_space_write_cached.  */
+void address_space_read_cached_slow(MemoryRegionCache *cache,
+                                    hwaddr addr, void *buf, int len);
+void address_space_write_cached_slow(MemoryRegionCache *cache,
+                                     hwaddr addr, const void *buf, int len);
+
 static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
 {
     if (is_write) {
@@ -1928,7 +1968,11 @@ address_space_read_cached(MemoryRegionCache *cache, hwaddr addr,
                           void *buf, int len)
 {
     assert(addr < cache->len && len <= cache->len - addr);
-    address_space_read(cache->as, cache->xlat + addr, MEMTXATTRS_UNSPECIFIED, buf, len);
+    if (likely(cache->ptr)) {
+        memcpy(buf, cache->ptr + addr, len);
+    } else {
+        address_space_read_cached_slow(cache, addr, buf, len);
+    }
 }
 
 /**
@@ -1944,7 +1988,11 @@ address_space_write_cached(MemoryRegionCache *cache, hwaddr addr,
                            void *buf, int len)
 {
     assert(addr < cache->len && len <= cache->len - addr);
-    address_space_write(cache->as, cache->xlat + addr, MEMTXATTRS_UNSPECIFIED, buf, len);
+    if (likely(cache->ptr)) {
+        memcpy(cache->ptr + addr, buf, len);
+    } else {
+        address_space_write_cached_slow(cache, addr, buf, len);
+    }
 }
 
 #endif
diff --git a/include/exec/memory_ldst_cached.inc.h b/include/exec/memory_ldst_cached.inc.h
new file mode 100644
index 0000000000..fd4bbb40e7
--- /dev/null
+++ b/include/exec/memory_ldst_cached.inc.h
@@ -0,0 +1,108 @@
+/*
+ *  Memory access templates for MemoryRegionCache
+ *
+ *  Copyright (c) 2018 Red Hat, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define ADDRESS_SPACE_LD_CACHED(size) \
+    glue(glue(address_space_ld, size), glue(ENDIANNESS, _cached))
+#define ADDRESS_SPACE_LD_CACHED_SLOW(size) \
+    glue(glue(address_space_ld, size), glue(ENDIANNESS, _cached_slow))
+#define LD_P(size) \
+    glue(glue(ld, size), glue(ENDIANNESS, _p))
+
+static inline uint32_t ADDRESS_SPACE_LD_CACHED(l)(MemoryRegionCache *cache,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
+{
+    assert(addr < cache->len && 4 <= cache->len - addr);
+    if (likely(cache->ptr)) {
+        return LD_P(l)(cache->ptr + addr);
+    } else {
+        return ADDRESS_SPACE_LD_CACHED_SLOW(l)(cache, addr, attrs, result);
+    }
+}
+
+static inline uint64_t ADDRESS_SPACE_LD_CACHED(q)(MemoryRegionCache *cache,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
+{
+    assert(addr < cache->len && 8 <= cache->len - addr);
+    if (likely(cache->ptr)) {
+        return LD_P(q)(cache->ptr + addr);
+    } else {
+        return ADDRESS_SPACE_LD_CACHED_SLOW(q)(cache, addr, attrs, result);
+    }
+}
+
+static inline uint32_t ADDRESS_SPACE_LD_CACHED(uw)(MemoryRegionCache *cache,
+    hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
+{
+    assert(addr < cache->len && 2 <= cache->len - addr);
+    if (likely(cache->ptr)) {
+        return LD_P(uw)(cache->ptr + addr);
+    } else {
+        return ADDRESS_SPACE_LD_CACHED_SLOW(uw)(cache, addr, attrs, result);
+    }
+}
+
+#undef ADDRESS_SPACE_LD_CACHED
+#undef ADDRESS_SPACE_LD_CACHED_SLOW
+#undef LD_P
+
+#define ADDRESS_SPACE_ST_CACHED(size) \
+    glue(glue(address_space_st, size), glue(ENDIANNESS, _cached))
+#define ADDRESS_SPACE_ST_CACHED_SLOW(size) \
+    glue(glue(address_space_st, size), glue(ENDIANNESS, _cached_slow))
+#define ST_P(size) \
+    glue(glue(st, size), glue(ENDIANNESS, _p))
+
+static inline void ADDRESS_SPACE_ST_CACHED(l)(MemoryRegionCache *cache,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
+{
+    assert(addr < cache->len && 4 <= cache->len - addr);
+    if (likely(cache->ptr)) {
+        ST_P(l)(cache->ptr + addr, val);
+    } else {
+        ADDRESS_SPACE_ST_CACHED_SLOW(l)(cache, addr, val, attrs, result);
+    }
+}
+
+static inline void ADDRESS_SPACE_ST_CACHED(w)(MemoryRegionCache *cache,
+    hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
+{
+    assert(addr < cache->len && 2 <= cache->len - addr);
+    if (likely(cache->ptr)) {
+        ST_P(w)(cache->ptr + addr, val);
+    } else {
+        ADDRESS_SPACE_ST_CACHED_SLOW(w)(cache, addr, val, attrs, result);
+    }
+}
+
+static inline void ADDRESS_SPACE_ST_CACHED(q)(MemoryRegionCache *cache,
+    hwaddr addr, uint64_t val, MemTxAttrs attrs, MemTxResult *result)
+{
+    assert(addr < cache->len && 8 <= cache->len - addr);
+    if (likely(cache->ptr)) {
+        ST_P(q)(cache->ptr + addr, val);
+    } else {
+        ADDRESS_SPACE_ST_CACHED_SLOW(q)(cache, addr, val, attrs, result);
+    }
+}
+
+#undef ADDRESS_SPACE_ST_CACHED
+#undef ADDRESS_SPACE_ST_CACHED_SLOW
+#undef ST_P
+
+#undef ENDIANNESS
diff --git a/memory.c b/memory.c
index e70b64b8b9..fc7f9b782b 100644
--- a/memory.c
+++ b/memory.c
@@ -298,7 +298,7 @@ static bool flatview_ref(FlatView *view)
     return atomic_fetch_inc_nonzero(&view->ref) > 0;
 }
 
-static void flatview_unref(FlatView *view)
+void flatview_unref(FlatView *view)
 {
     if (atomic_fetch_dec(&view->ref) == 1) {
         trace_flatview_destroy_rcu(view, view->root);
@@ -822,7 +822,7 @@ static void address_space_add_del_ioeventfds(AddressSpace *as,
     }
 }
 
-static FlatView *address_space_get_flatview(AddressSpace *as)
+FlatView *address_space_get_flatview(AddressSpace *as)
 {
     FlatView *view;
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] exec: small changes to flatview_do_translate
  2018-04-17 14:08 ` [Qemu-devel] [PATCH 2/4] exec: small changes to flatview_do_translate Paolo Bonzini
@ 2018-05-04  4:21   ` Peter Xu
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Xu @ 2018-05-04  4:21 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On Tue, Apr 17, 2018 at 04:08:00PM +0200, Paolo Bonzini wrote:
> Prepare for extracting the IOMMU part to a separate function.  Mostly
> cosmetic; the only semantic change is that, if there is more than one
> cascaded IOMMU and the second one fails to translate, *plen_out is now
> adjusted according to the page mask of the first IOMMU.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] exec: extract address_space_translate_iommu, fix page_mask corner case
  2018-04-17 14:08 ` [Qemu-devel] [PATCH 3/4] exec: extract address_space_translate_iommu, fix page_mask corner case Paolo Bonzini
@ 2018-05-04  4:22   ` Peter Xu
  2018-05-04  7:33     ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Xu @ 2018-05-04  4:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On Tue, Apr 17, 2018 at 04:08:01PM +0200, Paolo Bonzini wrote:

[...]

> +static MemoryRegionSection address_space_translate_iommu(IOMMUMemoryRegion *iommu_mr,
> +                                                         hwaddr *xlat,
> +                                                         hwaddr *plen_out,
> +                                                         hwaddr *page_mask_out,
> +                                                         bool is_write,
> +                                                         bool is_mmio,
> +                                                         AddressSpace **target_as)
> +{
> +    MemoryRegionSection *section;
> +    hwaddr page_mask = (hwaddr)-1;
> +
> +    do {
> +        hwaddr addr = *xlat;
> +        IOMMUMemoryRegionClass *imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
> +        IOMMUTLBEntry iotlb = imrc->translate(iommu_mr, addr, is_write ?
> +                                              IOMMU_WO : IOMMU_RO);
> +
> +        if (!(iotlb.perm & (1 << is_write))) {
> +            goto unassigned;
> +        }
> +
> +        addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
> +                | (addr & iotlb.addr_mask));
> +        page_mask &= iotlb.addr_mask;
> +        *plen_out = MIN(*plen_out, (addr | iotlb.addr_mask) - addr + 1);
> +        *target_as = iotlb.target_as;
> +
> +        section = address_space_translate_internal(
> +                address_space_to_dispatch(iotlb.target_as), addr, xlat,
> +                plen_out, is_mmio);
> +        if (!section) {
> +            goto unassigned;

(we won't reach here, will we?)

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] exec: extract address_space_translate_iommu, fix page_mask corner case
  2018-05-04  4:22   ` Peter Xu
@ 2018-05-04  7:33     ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2018-05-04  7:33 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel



----- Original Message -----
> From: "Peter Xu" <peterx@redhat.com>
> To: "Paolo Bonzini" <pbonzini@redhat.com>
> Cc: qemu-devel@nongnu.org
> Sent: Friday, May 4, 2018 6:22:31 AM
> Subject: Re: [Qemu-devel] [PATCH 3/4] exec: extract address_space_translate_iommu, fix page_mask corner case
> 
> On Tue, Apr 17, 2018 at 04:08:01PM +0200, Paolo Bonzini wrote:
> 
> [...]
> 
> > +static MemoryRegionSection address_space_translate_iommu(IOMMUMemoryRegion
> > *iommu_mr,
> > +                                                         hwaddr *xlat,
> > +                                                         hwaddr *plen_out,
> > +                                                         hwaddr
> > *page_mask_out,
> > +                                                         bool is_write,
> > +                                                         bool is_mmio,
> > +                                                         AddressSpace
> > **target_as)
> > +{
> > +    MemoryRegionSection *section;
> > +    hwaddr page_mask = (hwaddr)-1;
> > +
> > +    do {
> > +        hwaddr addr = *xlat;
> > +        IOMMUMemoryRegionClass *imrc =
> > memory_region_get_iommu_class_nocheck(iommu_mr);
> > +        IOMMUTLBEntry iotlb = imrc->translate(iommu_mr, addr, is_write ?
> > +                                              IOMMU_WO : IOMMU_RO);
> > +
> > +        if (!(iotlb.perm & (1 << is_write))) {
> > +            goto unassigned;
> > +        }
> > +
> > +        addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
> > +                | (addr & iotlb.addr_mask));
> > +        page_mask &= iotlb.addr_mask;
> > +        *plen_out = MIN(*plen_out, (addr | iotlb.addr_mask) - addr + 1);
> > +        *target_as = iotlb.target_as;
> > +
> > +        section = address_space_translate_internal(
> > +                address_space_to_dispatch(iotlb.target_as), addr, xlat,
> > +                plen_out, is_mmio);
> > +        if (!section) {
> > +            goto unassigned;
> 
> (we won't reach here, will we?)

Good point, I'll fix that.

Paolo

> 
> Reviewed-by: Peter Xu <peterx@redhat.com>
> 
> --
> Peter Xu
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching
  2018-04-17 14:07 [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching Paolo Bonzini
                   ` (3 preceding siblings ...)
  2018-04-17 14:08 ` [Qemu-devel] [PATCH 4/4] exec: reintroduce MemoryRegion caching Paolo Bonzini
@ 2018-05-16 10:41 ` Auger Eric
  2018-05-16 10:42   ` Paolo Bonzini
  4 siblings, 1 reply; 12+ messages in thread
From: Auger Eric @ 2018-05-16 10:41 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

Hi Paolo,

On 04/17/2018 04:07 PM, Paolo Bonzini wrote:
> MemoryRegionCache was reverted to "normal" address_space_* operations
> for 2.9, due to lack of support for IOMMUs.  This series reinstates
> optimizations, caching only the IOMMU translation but not the IOMMU
> lookup and target AddressSpace translation.
> 
> Patches 1 to 3 prepare by introducing a new function
> address_space_translate_iommu (split out of address_space_translate)
> and new header files for declarations shared by regular and "cached"
> address_space_* functions.  Patch 4 uses them to introduce both
> the slow path for IOMMU and MMIO cached regions, and the fast path
> for RAM cached regions.

This patch seems to cause a regression with ARM vsmmu + virtio-blk-pci. Reverting it looks to fix the issue. Otherwise I get:

Program received signal SIGSEGV, Segmentation fault.
address_space_lduw_internal_cached_slow (endian=DEVICE_LITTLE_ENDIAN, result=0x0, attrs=..., addr=2, cache=0xffffa81110a0) at /home/augere/UPSTREAM/qemu/memory_ldst.inc.c:242
242	            val = lduw_le_p(ptr);
(gdb) where
#0  address_space_lduw_internal_cached_slow (endian=DEVICE_LITTLE_ENDIAN, result=0x0, attrs=..., addr=2, cache=0xffffa81110a0) at /home/augere/UPSTREAM/qemu/memory_ldst.inc.c:242
#1  address_space_lduw_le_cached_slow (cache=0xffffa81110a0, addr=addr@entry=2, attrs=..., attrs@entry=..., result=0x0) at /home/augere/UPSTREAM/qemu/memory_ldst.inc.c:273
#2  0x0000000000511c74 in address_space_lduw_le_cached (result=0x0, attrs=..., addr=2, cache=<optimized out>) at /home/augere/UPSTREAM/qemu/include/exec/memory_ldst_cached.inc.h:56
#3  lduw_le_phys_cached (addr=2, cache=<optimized out>) at /home/augere/UPSTREAM/qemu/include/exec/memory_ldst_phys.inc.h:91
#4  virtio_lduw_phys_cached (pa=2, cache=<optimized out>, vdev=<optimized out>) at /home/augere/UPSTREAM/qemu/include/hw/virtio/virtio-access.h:166
#5  vring_avail_idx (vq=0x1cfe2d0) at /home/augere/UPSTREAM/qemu/hw/virtio/virtio.c:227
#6  virtio_queue_set_notification (vq=0x1cfe2d0, enable=0) at /home/augere/UPSTREAM/qemu/hw/virtio/virtio.c:324
#7  0x0000000000511d2c in virtio_queue_set_notification (vq=<optimized out>, enable=<optimized out>) at /home/augere/UPSTREAM/qemu/hw/virtio/virtio.c:318
#8  0x00000000004aa158 in virtio_blk_handle_vq (s=0x1cf32b0, vq=0x1cfe2d0) at /home/augere/UPSTREAM/qemu/hw/block/virtio-blk.c:605
#9  0x00000000005113f0 in virtio_queue_notify_aio_vq (vq=0x1cfe2d0) at /home/augere/UPSTREAM/qemu/hw/virtio/virtio.c:1515
#10 0x000000000087cbf0 in aio_dispatch_handlers (ctx=ctx@entry=0x15300e0) at util/aio-posix.c:406
#11 0x000000000087d3f8 in aio_dispatch (ctx=0x15300e0) at util/aio-posix.c:437
#12 0x0000000000879f30 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
#13 0x0000ffffbeee97a0 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#14 0x000000000087c678 in glib_pollfds_poll () at util/main-loop.c:215
#15 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:263
#16 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:522
#17 0x0000000000423dd8 in main_loop () at vl.c:1943
#18 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4675

Investigating further ...

Thanks

Eric
> 
> Paolo
> 
> Paolo Bonzini (4):
>   exec: move memory access declarations to a common header, inline
>     *_phys functions
>   exec: small changes to flatview_do_translate
>   exec: extract address_space_translate_iommu, fix page_mask corner case
>   exec: reintroduce MemoryRegion caching
> 
>  exec.c                                | 245 ++++++++++++++++++++------
>  include/exec/cpu-all.h                |  79 ++++-----
>  include/exec/memory-internal.h        |   3 +
>  include/exec/memory.h                 | 209 ++++++++++------------
>  include/exec/memory_ldst.inc.h        |  71 ++++++++
>  include/exec/memory_ldst_cached.inc.h | 108 ++++++++++++
>  include/exec/memory_ldst_phys.inc.h   | 147 ++++++++++++++++
>  memory.c                              |   4 +-
>  memory_ldst.inc.c                     | 126 -------------
>  9 files changed, 651 insertions(+), 341 deletions(-)
>  create mode 100644 include/exec/memory_ldst.inc.h
>  create mode 100644 include/exec/memory_ldst_cached.inc.h
>  create mode 100644 include/exec/memory_ldst_phys.inc.h
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching
  2018-05-16 10:41 ` [Qemu-devel] [PATCH 0/4] " Auger Eric
@ 2018-05-16 10:42   ` Paolo Bonzini
  2018-05-16 13:38     ` Auger Eric
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2018-05-16 10:42 UTC (permalink / raw)
  To: Auger Eric, qemu-devel

On 16/05/2018 12:41, Auger Eric wrote:
> This patch seems to cause a regression with ARM vsmmu +
> virtio-blk-pci. Reverting it looks to fix the issue. Otherwise I
> get:

What's the command line for a reproducer?

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching
  2018-05-16 10:42   ` Paolo Bonzini
@ 2018-05-16 13:38     ` Auger Eric
  2018-05-16 13:49       ` Auger Eric
  0 siblings, 1 reply; 12+ messages in thread
From: Auger Eric @ 2018-05-16 13:38 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

Hi Paolo

On 05/16/2018 12:42 PM, Paolo Bonzini wrote:
> On 16/05/2018 12:41, Auger Eric wrote:
>> This patch seems to cause a regression with ARM vsmmu +
>> virtio-blk-pci. Reverting it looks to fix the issue. Otherwise I
>> get:
> 
> What's the command line for a reproducer?
> 
> Thanks,
> 
> Paolo
> 

here is my command line:

TCG:

sudo /home/augere/UPSTREAM/qemu/aarch64-softmmu/qemu-system-aarch64 -M virt-2.12,gic-version=3,iommu=smmuv3 \
-cpu cortex-a57 -smp 8 -m 4096 -display none -machine accel=tcg \
-serial tcp:localhost:4444,server -trace events=/home/augere/TEST/QEMU/hw-arm-smmu \
-qmp unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait \
-device virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=drv0,bootindex=1,iommu_platform,disable-modern=off,disable-legacy=on,werror=stop,rerror=stop \
-drive file=/home/augere/VM/IMAGES/vm0.qcow2,format=qcow2,if=none,id=drv0 \
-device virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \
-netdev tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown \
-net none

or with acceleration:

sudo /home/augere/UPSTREAM/qemu/aarch64-softmmu/qemu-system-aarch64 -M virt-2.12,gic-version=3,iommu=smmuv3 -cpu host -smp 8 -m 4096 -display none --enable-kvm ../..

I reproduce with both, 100% of the cases.

Trying to reproduce with intel iommu as well.

Thanks

Eric

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching
  2018-05-16 13:38     ` Auger Eric
@ 2018-05-16 13:49       ` Auger Eric
  0 siblings, 0 replies; 12+ messages in thread
From: Auger Eric @ 2018-05-16 13:49 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

Hi Paolo,

On 05/16/2018 03:38 PM, Auger Eric wrote:
> Hi Paolo
> 
> On 05/16/2018 12:42 PM, Paolo Bonzini wrote:
>> On 16/05/2018 12:41, Auger Eric wrote:
>>> This patch seems to cause a regression with ARM vsmmu +
>>> virtio-blk-pci. Reverting it looks to fix the issue. Otherwise I
>>> get:
>>
>> What's the command line for a reproducer?
>>
>> Thanks,
>>
>> Paolo
>>
> 
> here is my command line:
> 
> TCG:
> 
> sudo /home/augere/UPSTREAM/qemu/aarch64-softmmu/qemu-system-aarch64 -M virt-2.12,gic-version=3,iommu=smmuv3 \
> -cpu cortex-a57 -smp 8 -m 4096 -display none -machine accel=tcg \
> -serial tcp:localhost:4444,server -trace events=/home/augere/TEST/QEMU/hw-arm-smmu \
> -qmp unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait \
> -device virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=drv0,bootindex=1,iommu_platform,disable-modern=off,disable-legacy=on,werror=stop,rerror=stop \
> -drive file=/home/augere/VM/IMAGES/vm0.qcow2,format=qcow2,if=none,id=drv0 \
> -device virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \
> -netdev tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown \
> -net none
> 
> or with acceleration:
> 
> sudo /home/augere/UPSTREAM/qemu/aarch64-softmmu/qemu-system-aarch64 -M virt-2.12,gic-version=3,iommu=smmuv3 -cpu host -smp 8 -m 4096 -display none --enable-kvm ../..
> 
> I reproduce with both, 100% of the cases.
> 
> Trying to reproduce with intel iommu as well.

Yep I just reproduced on x86 as well. Here is the used cmd line.

/home/augere/UPSTREAM/qemu/x86_64-softmmu/qemu-system-x86_64 \
-M q35,accel=kvm,usb=off,dump-guest-core=off -cpu Haswell,-hle,-rtm -smp 4,sockets=4,cores=1,threads=1 -m 8192 \
-display none --enable-kvm -serial tcp:localhost:4444,server -qmp unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay -realtime mlock=off -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 \
-boot strict=on -machine kernel_irqchip=split -device ioh3420,chassis=1,addr=4,id=pcie.1 -device ioh3420,chassis=2,addr=5,id=pcie.2 \
-device intel-iommu,intremap=on \
-device virtio-blk-pci,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,iommu_platform,disable-modern=off,disable-legacy=on \
-drive file=/home/augere/VM/IMAGES/vm0.qcow2,format=qcow2,if=none,id=drv0 \
-device virtio-net-pci,netdev=nic0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \
-netdev tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown \
-net none

Program received signal SIGSEGV, Segmentation fault.
address_space_lduw_internal_cached_slow (endian=DEVICE_LITTLE_ENDIAN, result=<optimized out>, attrs=...,
    addr=2, cache=<optimized out>) at /home/augere/UPSTREAM/qemu/memory_ldst.inc.c:242
242	            val = lduw_le_p(ptr);
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 celt051-0.5.1.3-8.el7.x86_64 cyrus-sasl-lib-2.1.26-23.el7.x86_64 elfutils-libelf-0.170-4.el7.x86_64
elfutils-libs-0.170-4.el7.x86_64 glibc-2.17-222.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-18.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libblkid-2.23.2-52.el7.x86_64
libcacard-2.5.2-2.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-11.el7.x86_64 libffi-3.0.13-18.el7.x86_64 libgcc-4.8.5-28.el7.x86_64 libjpeg-turbo-1.2.90-5.el7.x86_64
libmount-2.23.2-52.el7.x86_64 libselinux-2.5-12.el7.x86_64 libstdc++-4.8.5-28.el7.x86_64 libusbx-1.0.21-1.el7.x86_64 libuuid-2.23.2-52.el7.x86_64 lz4-1.7.5-2.el7.x86_64
ncurses-libs-5.9-14.20130511.el7_4.x86_64 nspr-4.17.0-1.el7.x86_64 nss-3.34.0-4.el7.x86_64 nss-softokn-freebl-3.34.0-2.el7.x86_64 nss-util-3.34.0-2.el7.x86_64 numactl-libs-2.0.9-7.el7.x86_64
openssl-libs-1.0.2k-12.el7.x86_64 opus-1.0.2-6.el7.x86_64 pcre-8.32-17.el7.x86_64 pixman-0.34.0-1.el7.x86_64 spice-server-0.14.0-2.el7.x86_64 systemd-libs-219-57.el7.x86_64 usbredir-0.7.1-3.el7.x86_64
xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) where
#0  0x000055555580e773 in address_space_lduw_le_cached_slow (endian=
    DEVICE_LITTLE_ENDIAN, result=<optimized out>, attrs=..., addr=2, cache=<optimized out>)
    at /home/augere/UPSTREAM/qemu/memory_ldst.inc.c:242
#1  0x000055555580e773 in address_space_lduw_le_cached_slow (cache=<optimized out>, addr=addr@entry=2, attrs=..., attrs@entry=..., result=result@entry=0x0) at
/home/augere/UPSTREAM/qemu/memory_ldst.inc.c:273
#2  0x00005555558b0a9b in virtio_queue_set_notification (result=0x0, attrs=..., addr=2, cache=<optimized out>) at /home/augere/UPSTREAM/qemu/include/exec/memory_ldst_cached.inc.h:56
#3  0x00005555558b0a9b in virtio_queue_set_notification (addr=2, cache=<optimized out>)
    at /home/augere/UPSTREAM/qemu/include/exec/memory_ldst_phys.inc.h:91
#4  0x00005555558b0a9b in virtio_queue_set_notification (vdev=<optimized out>, pa=2, cache=<optimized out>) at /home/augere/UPSTREAM/qemu/include/hw/virtio/virtio-access.h:166
#5  0x00005555558b0a9b in virtio_queue_set_notification (vq=0x7fffec260010)
    at /home/augere/UPSTREAM/qemu/hw/virtio/virtio.c:227
#6  0x00005555558b0a9b in virtio_queue_set_notification (vq=vq@entry=0x7fffec260010, enable=enable@entry=0) at /home/augere/UPSTREAM/qemu/hw/virtio/virtio.c:324
#7  0x00005555558b0b12 in virtio_queue_set_notification (vq=vq@entry=0x7fffec260010, enable=enable@entry=0) at /home/augere/UPSTREAM/qemu/hw/virtio/virtio.c:318
#8  0x0000555555880fea in virtio_blk_handle_vq (s=0x555557dc9660, vq=0x7fffec260010)
    at /home/augere/UPSTREAM/qemu/hw/block/virtio-blk.c:605
#9  0x00005555558b0680 in virtio_queue_notify_aio_vq (vq=0x7fffec260010)
    at /home/augere/UPSTREAM/qemu/hw/virtio/virtio.c:1515
#10 0x0000555555bef7c8 in aio_dispatch_handlers (ctx=ctx@entry=0x555556a35ce0) at util/aio-posix.c:406
#11 0x0000555555bf0068 in aio_dispatch (ctx=0x555556a35ce0) at util/aio-posix.c:437
#12 0x0000555555becbce in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
#13 0x00007ffff79058f9 in g_main_context_dispatch (context=0x555556a360b0) at gmain.c:3146
#14 0x00007ffff79058f9 in g_main_context_dispatch (context=context@entry=0x555556a360b0) at gmain.c:3811
#15 0x0000555555bef326 in main_loop_wait () at util/main-loop.c:215
#16 0x0000555555bef326 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:263
#17 0x0000555555bef326 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:522
#18 0x00005555557fdb1f in main () at vl.c:1943
#19 0x00005555557fdb1f in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4675
(gdb) quit


Thanks

Eric

> 
> Thanks
> 
> Eric
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-05-16 13:49 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-04-17 14:07 [Qemu-devel] [PATCH 0/4] exec: reintroduce MemoryRegion caching Paolo Bonzini
2018-04-17 14:07 ` [Qemu-devel] [PATCH 1/4] exec: move memory access declarations to a common header, inline *_phys functions Paolo Bonzini
2018-04-17 14:08 ` [Qemu-devel] [PATCH 2/4] exec: small changes to flatview_do_translate Paolo Bonzini
2018-05-04  4:21   ` Peter Xu
2018-04-17 14:08 ` [Qemu-devel] [PATCH 3/4] exec: extract address_space_translate_iommu, fix page_mask corner case Paolo Bonzini
2018-05-04  4:22   ` Peter Xu
2018-05-04  7:33     ` Paolo Bonzini
2018-04-17 14:08 ` [Qemu-devel] [PATCH 4/4] exec: reintroduce MemoryRegion caching Paolo Bonzini
2018-05-16 10:41 ` [Qemu-devel] [PATCH 0/4] " Auger Eric
2018-05-16 10:42   ` Paolo Bonzini
2018-05-16 13:38     ` Auger Eric
2018-05-16 13:49       ` Auger Eric

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).